Preserving History

Roy Rosenzweig’s essay, “Scarcity or Abundance? Preserving the Past in a Digital Era,” sheds light on how history on the Internet may not always be there.  There are countless ways for information to get lost or deleted on the web, yet people tend to believe that what is put on the Internet will be there forever.  Rosenzweig says, “Government archives similarly continue to rely on the unwarranted assumption that records can be appraised and accessioned many years after their creation.”  This statement suggests that even the highest institutions, e.g. the Library of Congress, are not archiving important information on the web carefully enough.  Although they are most likely using the LOCKSS (Lots of Copies Keep Stuff Safe) principle, there is still no guarantee that documentation on the Internet will survive.

I myself am guilty of using Facebook to archive many aspects of my life.  Sometimes I resort to storing photos on Facebook instead of keeping them somewhere more secure, such as a folder on my computer or a flash drive.  If Zuckerberg were to break down and delete Facebook, what would I be left with?

On the other hand, the April 16 Archive shows the good the Internet can do by collecting and preserving memories of the Virginia Tech tragedy.  Since memories fade, this archive is able to keep an authentic record of the tragedy alive.  The stories of victims and family members provided on the site really help people carry on the slogan that came with the tragedy, “NeVer ForgeT.”

I think the best way to preserve history is to have hard copies of everything.  To many people these days this seems very old-fashioned, but to me it is the only dependable means of preservation.  Pictures and family documents need to be printed, and copies of everything should be saved on hard drives.  If these things aren’t done, no one should be surprised when they lose everything.

Digital Research

After reading “From Babel to Knowledge: Data Mining Large Digital Collections” by Dan Cohen, the idea of creating narrow search engines, rather than using broad ones such as Google Search, seems brilliant to me.  It was interesting to see how Cohen, with the help of Google’s web search API, was able to build a search engine that displays classroom course material.  He called it the Syllabus Finder.  Cohen said, “I thought a search engine that could locate syllabi on any topic would be useful for professors planning their next course, for discovering the kinds of books and assignments being commonly assigned, and for understanding the state of instruction more broadly.”  Assuming that professors would prefer to have their own search engine for course material rather than searching the whole web for it, I couldn’t agree more with Dr. Cohen.

There are other specialized search engines similar to the Syllabus Finder.  One of them is the TIME Magazine Corpus, which has a record of all issues of TIME from 1923 to 2006 and allows users to search for words or phrases within the full text.  Although I do not find the TIME Magazine Corpus very user-friendly, it still gets the job done.  Rather than searching the entire web for certain issues of TIME, you can use this site to look through all of them in one place.

Another notable site is the Google Books Ngram Viewer, which lets you search for the use of certain words in books over a chosen period of time.  The site shows a line graph of how popular each word has been and even lets you compare more than one word on the same graph.  When you enter words such as ‘flavour’ and ‘flavor,’ you can really see the comparison.  The graph shows that the British spelling (flavour) has declined over time while the American spelling (flavor) has gone up.
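To get a feel for what a graph like this is measuring, here is a minimal Python sketch of the basic idea: count how often a word appears in the texts from each year and divide by the total number of words from that year.  This is only my own illustration, not how Google actually builds the Ngram Viewer.  The tiny corpus and the function name relative_frequency are made up for the example.

```python
from collections import Counter

# Hypothetical toy corpus: (year, text) pairs invented for illustration only.
corpus = [
    (1900, "the flavour of the tea and the flavour of the cake"),
    (1950, "a new flavor and an old flavour"),
    (2000, "the flavor of the soda and the flavor of the candy"),
]

def relative_frequency(corpus, word):
    """Return {year: share of all words from that year that equal `word`}."""
    totals = Counter()  # total number of words seen per year
    hits = Counter()    # occurrences of `word` per year
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    return {year: hits[year] / totals[year] for year in sorted(totals)}

# Compare the two spellings year by year, as the line graph does.
for word in ("flavour", "flavor"):
    print(word, relative_frequency(corpus, word))
```

In this made-up data the share of ‘flavour’ shrinks from one year to the next while the share of ‘flavor’ grows, which is the same kind of pattern the real graph draws from a far larger collection of books.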

I find these sites very useful because they let people search within a narrow, focused collection rather than across something as broad as the entire web.