The Internet Archive — Class Notes

Why Do We Need the Internet Archive?

About the IA:

The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library. Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format.

Working to prevent a Digital Dark Age

What are the Challenges?

  • Getting the actual digital data
  • Reading the data, no matter what format
  • “Preserving” the data — in what format?
  • Associating metadata with the data
  • Building a solution to house ALL THAT DATA
  • Building an interface so that we can find/use ALL THAT DATA
  • Copyright, licensing, etc.


By placing a simple robots.txt file on your Web server, you can ask crawlers to exclude your site from being crawled.
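To make the robots.txt idea concrete, here is a minimal sketch in Python that checks what a given robots.txt file tells a crawler. It uses the standard library's urllib.robotparser module. The site example.com, the crawler name "example-crawler", and the helper is_crawl_allowed are made up for illustration; they are not anything the Internet Archive itself uses. Keep in mind that robots.txt is a request, and it only works if the crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler ("*") to stay away from the whole site ("/").
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse the rules from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-crawler" and example.com are placeholder names.
    print(is_crawl_allowed(ROBOTS_TXT, "example-crawler", "https://example.com/page.html"))
    # An empty robots.txt places no restrictions on crawlers.
    print(is_crawl_allowed("", "example-crawler", "https://example.com/page.html"))
```

A well-behaved crawler would run a check like this before requesting each page and skip any URL the file disallows.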

Buckets of Stuff:

Strong emphasis on collecting work that is in the public domain or available under a Creative Commons license.

More Exploration:

  • Do the Wayback Time Machine Assignment from the ds106 assignment repository. Share your work on your blog.
  • Search the moving images, audio, text, or software repository for one source that relates to a class you’re taking right now, a topic you’re interested in, or a project/class you’ve worked on in the past. Try to find something you couldn’t find anywhere else (in other words, don’t just set out to find something you’ve already encountered or explored). Share what you found on your blog: What did you set out to look for? What did you find? Was it hard or easy? Were you surprised by what you found?
