Main Page

From WebMETS

This is a wiki designed to keep track of ideas, proposed standard modifications, and implementation information for web implementations of METS. This site is not affiliated with the METS project.

Here is a one-minute overview of METS and possible adaptations to METS.

While X/HTML (or plain XML) on the web allows file resources such as stylesheets and script files to be linked together, and allows flexible linking to any number of other files, there is no formal way to indicate how a set of files is grouped. Standard Navigation is one functional attempt to do this for sitemaps, but the scope here covers more uses of file groups than sitemaps alone. METS meets this need well, but it has not (to our knowledge) been made available on the web in a way that browsers could directly process.

Thus, our goal is to use this standard (or one like it) to allow groups of files to be represented in a standard fashion for use on the web.

Such a format, together with implementations that support it, would have a number of potential uses:

  1. Sitemaps (HTML's link rel=next/prev does not provide a picture of a whole website at once, while Google's sitemaps.org is not hierarchical or web-like). An XHTML file linking to a METS file (or potentially a browser directly visiting a METS file) could lead the user agent to display, e.g., in a sidebar, a means of navigating to other pages in the site, with the files represented hierarchically (or as a web).
  2. "Gopher 2.0" - a potential replacement for the Gopher protocol, where files can be easily navigated on the web and drilled down into, much like navigating files on one's file system (with the user agent potentially allowing use of the same views as on the file system, such as tree views, column views, icon views, etc.). While the Web proved far more popular because of its ability to surround links with text, images, etc., the simplicity of the Gopher model can be efficient for drilling down to information in a large hierarchy, as well as for leading to other sites (unlike plain sitemaps, which usually confine themselves to a particular site). METS supports <structMap/> and <structLink/>, which could allow such relationships to be described, whether the METS file describes files on a single site only or relationships between sites.
  3. Linkbases. This vision in XLink has not been realized. A linkbase would allow one to describe link relationships even among sites not under one's control (as METS would potentially support within its <structLink/> section). Such information could potentially be used by user agents which have encountered the linkbases (e.g., prompting users whether they wish to store the suggested interlinks for display the next time they visit the pages referenced by the linkbase), or it might be harvested by web crawlers/search engines, which could in turn make it available to users visiting their sites (or to web clients which could obtain this information automatically) so that users could find out about sites suggesting outbound or inbound links.
  4. Clear specification of a set of files that can be downloaded together, automatically decompressed (if compressed) and reassembled for offline use (whether SQL, or XML (or XHTML)+CSS/XSL/XBL/XQuery/DTD(entities)/JavaScript/etc.), all without the user needing to care about how and where the files are stored, as they do now when downloading a compressed file (though the files could be bookmarked and categorized by the browser). This may be especially useful for a site which makes available a set of related books, games, etc. The user would be free to decide whether to obtain all of the files at once or just navigate them live. Even files downloaded for offline use could rely on a polling and/or versioning mechanism (an item which may need special consideration in our adaptation of METS) to periodically check for modifications and optionally stay updated with the latest files.
  5. Limiting the number of HTTP requests, a key factor in slowing down websites (though the number of requests would necessarily increase in cases where the data sources are independent and live).
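
To make the first few uses more concrete, here is a minimal sketch of how a user agent might read a METS file: it collects the downloadable files from the file section, renders the <structMap/> as an indented sitemap, and lists the <structLink/> relationships. The sample document, the example.org URLs, the IDs and labels, and the helper names (file_urls, outline, links) are all assumptions made for illustration; only the METS and XLink element and attribute names come from the standard. Nothing here is an existing browser feature.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}
XLINK = "{http://www.w3.org/1999/xlink}"

# Hypothetical METS file for a small site; URLs, IDs and labels are made up.
METS_DOC = """
<mets xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec>
    <fileGrp USE="pages">
      <file ID="F1"><FLocat LOCTYPE="URL" xlink:href="https://example.org/index.xhtml"/></file>
      <file ID="F2"><FLocat LOCTYPE="URL" xlink:href="https://example.org/ch1.xhtml"/></file>
      <file ID="F3"><FLocat LOCTYPE="URL" xlink:href="https://example.org/ch2.xhtml"/></file>
    </fileGrp>
  </fileSec>
  <structMap TYPE="logical">
    <div ID="D1" LABEL="Home">
      <fptr FILEID="F1"/>
      <div ID="D2" LABEL="Chapter 1"><fptr FILEID="F2"/></div>
      <div ID="D3" LABEL="Chapter 2"><fptr FILEID="F3"/></div>
    </div>
  </structMap>
  <structLink>
    <smLink xlink:from="D2" xlink:to="D3"/>
  </structLink>
</mets>
"""

def file_urls(root):
    """Map each file ID in the file section to the URL in its FLocat."""
    urls = {}
    for f in root.iterfind(".//m:file", NS):
        loc = f.find("m:FLocat", NS)
        if loc is not None:
            urls[f.get("ID")] = loc.get(XLINK + "href")
    return urls

def outline(div, urls, depth=0):
    """Render a structMap div and its nested divs as indented sitemap lines."""
    fptr = div.find("m:fptr", NS)
    url = urls.get(fptr.get("FILEID")) if fptr is not None else None
    lines = ["  " * depth + f"{div.get('LABEL')} -> {url}"]
    for child in div.findall("m:div", NS):
        lines.extend(outline(child, urls, depth + 1))
    return lines

def links(root):
    """Return (from, to) pairs of div IDs declared in the structLink section."""
    return [(l.get(XLINK + "from"), l.get(XLINK + "to"))
            for l in root.iterfind("m:structLink/m:smLink", NS)]

root = ET.fromstring(METS_DOC)
urls = file_urls(root)
sitemap = outline(root.find("m:structMap/m:div", NS), urls)
print("\n".join(sitemap))
print(links(root))
```

The same file-ID-to-URL table could serve as the download list for the offline-use case, while the nested divs give the tree a sidebar or "Gopher 2.0" view would need.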

The fourth possibility in particular opens up a number of other interesting possibilities and consequences:

  1. Data can be stored locally for greater privacy and offline manipulability (while still potentially allowing uploading/versioning/server-side synchronization)
  2. Data-providers (e.g., scientists, TEI document creators, etc.) do not have to spend time designing a user-friendly interface or API, nor worry about the server-side security of their interface or about Denial-of-Service attacks resulting from overly powerful interface or API options. With the data pushed to the client, users can query however they wish, and at worst they may need to restart the browser if they run too demanding a query (e.g., Firefox has SQLite built in, and an extension could be created to support this). "Freeing your data" will be as simple as linking to a manifest file which points to your data files and potentially describes how they should or could be viewed or assembled (e.g., for use with a default viewer).
  3. Shareable data among apps. With data downloaded to the client, applications can be designed which allow viewing of the data in a generic way and which do not only work with a given site. One could have any number of Firefox extensions, for example (e.g., SQLite Manager, or a version more targeted for the average user). Applications could view, edit, or act on the data (if installed with trust by the user). The files (whether SQL or XML) would be stored in a location accessible to any add-on.
  4. Perhaps most compelling of all, the format could be used for combining or altering different data sources, whether from sites under one's control or not, and in a fashion which is bookmarkable (the link itself could contain some if not all of the data, perhaps using a special protocol, or it could simply point to the (METS) manifest file which contains this information):
    1. Mash-ups (combining sources, e.g., for side-by-side or interlinear display, such as parallel translations, display of independent wiki pages alongside a book's paragraphs)
    2. Mash-downs (narrowing down which portion of the data source to display). XPointer might be employed for this, though we would want this to work in combination with other mashing, such as mash-ups (e.g., show only paragraphs 5-10 of a given book, but with a mash-up combining the book along with collaborative information sources, automated translations, etc.)
    3. Mash-overs (a comment layer). This concept is similar to a mash-up, but displays the information as a layer above the original content/data source(s). This may be particularly relevant for commentary, which might appear as a hidable bubble over a portion of a webpage. If browsers (e.g., Firefox) allow editing of webpages (as some extensions allow now), any user could make a mash-over and then save their commentary, making it something that can be shared in a standard way on any website and associated with the original document/data source.
    4. Mash-unders (adding meta-data, restyling or restructuring to a document). This concept could allow data to be restyled (e.g., to display data from an existing site but in a different font, or highlighted in color, etc.) or restructured (one could propose content or markup changes to an existing site and let the browser indicate these changes to others who follow one's mash-under, whether one submits it back to the original site owner or merely shares the suggestion with one's site visitors, in a chat discussion with someone, etc.). Finally, a mash-under could add semantic meta-data (which could be used in searches, etc.) in a stand-off way to a document not necessarily under one's control. This could be similar to commentary, but would be less obtrusive (e.g., to propose that an item be turned into a heading, or to formally indicate that it is a date, quotation, etc.).
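
As a rough illustration of how a mash-down and a mash-up might combine, the sketch below narrows two sources to paragraphs 5-10 and then pairs them up for side-by-side display. The function names (mash_down, mash_up) and the placeholder paragraph lists are hypothetical; a real implementation would presumably resolve something like XPointer expressions against the actual documents named in the manifest.

```python
from itertools import zip_longest

def mash_down(paragraphs, first, last):
    """Keep only paragraphs first..last (1-based, inclusive)."""
    return paragraphs[first - 1:last]

def mash_up(*sources):
    """Pair up aligned paragraphs from several sources for parallel display.

    Shorter sources are padded with empty strings so no paragraph is lost.
    """
    return [tuple(group) for group in zip_longest(*sources, fillvalue="")]

# Placeholder data standing in for a book and an independent translation.
book = [f"Book paragraph {n}" for n in range(1, 13)]
translation = [f"Translated paragraph {n}" for n in range(1, 13)]

# Show only paragraphs 5-10 of the book, alongside the matching translation.
selected = mash_up(mash_down(book, 5, 10), mash_down(translation, 5, 10))
for original, translated in selected:
    print(f"{original} | {translated}")
```

Because each step takes and returns plain lists, the narrowing and combining operations can be applied in either order, which is the kind of composability we would want the manifest to be able to describe.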

My hope is also that URNs (ISBNs, book names, etc.), if not protocols, could be associated with METS files or packages, and that links to URNs could be encouraged (possibly via browser extensions which add potential links to any page visited, either automatically or at the user's option). Users would then have the choice of substituting one version of an abstract data source for another: e.g., if a METS file specified a Bible version and optionally supplied a default such as the New International Version, one could still substitute the King James Version, or vice versa (if not aggregating both). Even names could be made into URN links which could optionally be opened in Wikipedia, a specialized wiki, a search engine query, etc.

The METS file might be visited directly or discovered via the <link/> tag of an HTML document, and the file's usage could also vary depending on external information such as content-type headers, or parameters specified in the link itself.
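
Discovery through the <link/> tag could look something like the following sketch, which scans a page for manifest links and resolves them against the page URL. The rel value "mets", the class name MetsLinkFinder, the sample page and the example.org URLs are all assumptions for illustration; no such link relation is currently defined.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class MetsLinkFinder(HTMLParser):
    """Collect hrefs of <link> tags whose rel includes the hypothetical 'mets'."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.manifests = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "mets" in rels and href:
            # Resolve relative references against the page's own URL.
            self.manifests.append(urljoin(self.base_url, href))

# Made-up page; only the second <link> points at a METS manifest.
PAGE = """<html><head>
<link rel="stylesheet" href="style.css" />
<link rel="mets" type="application/xml" href="/site.mets.xml" />
</head><body></body></html>"""

finder = MetsLinkFinder("https://example.org/books/index.html")
finder.feed(PAGE)
print(finder.manifests)
```

A user agent could then fetch each discovered manifest and decide how to use it based on the response's content-type header or on parameters in the link.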