So finally I was able to defend my M.S. thesis, “Visualizing Digital Collections at Archive-It”.
Archive-It, a subscription service from the Internet Archive, allows users to create, maintain, and view digital collections of web resources. The current interface of Archive-It is largely text-based, supporting drill-down navigation using lists of URIs. While this interface provides good searching capabilities, it is not very efficient for browsing. In the absence of keywords, a user has to spend a large amount of time trying to locate a webpage of interest. To provide a better visual experience to the user, we have studied the underlying characteristics of Archive-It collections and implemented six different visualizations (treemap, time cloud, bubble chart, image plot, timeline, and wordle), each highlighting one or more of the underlying characteristics of the collection. Archive-It supports grouping of webpages into categories; however, it does not enforce their use. As a result, there are many collections with missing or improper grouping. For such collections, we present a method of grouping webpages based on a set of pre-defined rules.
Here are the slides from my defense.
The following are direct links to the videos in the presentation: