Resources for education in digital libraries
Tefko Saracevic, Ph.D.

School of Communication



The World Wide Web is the fastest growing and the most rapidly and widely deployed technology in the history of technology. Its explosive growth has made information, and access to information resources, ubiquitous. But it has also brought information anarchy and chaos. A number of metaphors depict the Web as a vast ocean of information, and many a Web surfer as lost in that ocean.

The Web is not well organized for searching and retrieving information. A prerequisite for more effective organization and searching is knowledge of the structure of the data and databases. But the big problem is that Web data and Web databases are notoriously fuzzy and (dis)organized in every which way. The structures vary. They constantly evolve over time. Their consistency is low.

It has long been recognized that some standardized description or language is needed to increase the functionality of the Web. In other words, what is needed is a mechanism for a more precise description of things on the Web, moving from machine-readable to machine-understandable. This was missing from the original Web architecture.

Enter a solution: METADATA! By now tiresomely, metadata is defined as data about data, information about information. Metadata refers to a standardized description of what a text or any object is all about. It labels parts of the text (or object) with standardized, agreed-upon labels or tags.
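To make the idea of standardized labels concrete, here is a minimal sketch in Python that attaches Dublin Core element names to a description of a page and renders them as HTML meta tags. The `DC.` prefix follows a common convention for embedding Dublin Core in HTML. The function name `dc_meta_tags`, the sample record, and the language value are assumptions made for illustration, not part of any project listed on this page.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dc_meta_tags(record):
    """Render a dict of Dublin Core element -> value as HTML <meta> tags.

    Hypothetical helper for illustration; it rejects any label that is
    not one of the agreed-upon Dublin Core element names.
    """
    tags = []
    for element, value in record.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element!r}")
        content = escape(value, quote=True)  # keep the attribute value safe
        tags.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(tags)


# Illustrative record; the language value is an assumption.
page = {
    "title": "Resources for education in digital libraries",
    "creator": "Tefko Saracevic",
    "date": "2002-03-20",
    "language": "en",
}

print(dc_meta_tags(page))
```

The point of the fixed set of names is that any program reading the page can find the creator or the date without guessing how this particular author chose to label them.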

But what standards? Who will develop them? How? How to implement them? These and a host of similar problems are the gist of a great many metadata efforts, activities, projects, and discussions worldwide.

Libraries and librarians have been involved with metadata for a long time. Centuries. But they did not call it metadata. They called it cataloging rules, controlled vocabulary, indexing formats, and the like. For machines they developed Machine Readable Cataloging (MARC) - a set of conventions to enable machine exchange of cataloging records. With the development of digital libraries, librarians have joined the other Web efforts related to metadata.
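As a rough illustration of what such conventions look like, the Python sketch below models a heavily simplified, MARC-style record: each field carries a three-digit tag (100 for the main personal name, 245 for the title statement, 650 for a topical subject) and lettered subfields. Real MARC records also have a leader, indicators, and a directory, all omitted here. The helper names and the sample record are assumptions made for this example.

```python
# A simplified, MARC-style record: a list of (tag, subfields) pairs.
# The record content is illustrative, not taken from a real catalog.
record = [
    ("100", {"a": "Saracevic, Tefko"}),
    ("245", {"a": "Resources for education in digital libraries"}),
    ("650", {"a": "Metadata"}),
]


def format_field(tag, subfields):
    """Show one field in the common 'TAG $code value' display style."""
    parts = " ".join(f"${code} {value}" for code, value in subfields.items())
    return f"{tag} {parts}"


def find(record, tag, code):
    """Return every value stored under the given tag and subfield code."""
    return [sf[code] for t, sf in record if t == tag and code in sf]


for tag, subfields in record:
    print(format_field(tag, subfields))

# Because tags are agreed upon, a receiving system can ask for the title
# without knowing anything else about the sending catalog.
print(find(record, "245", "a"))
```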

The selection of metadata sites below is but a sample of the variety of metadata projects, standards, and sources. It is a good starting point for organized surfing in the vast ocean of metadata information.

Standards and projects

Berkeley Digital Library Sun Site. Z39.50 Information

"Z39.50 is a computer-to-computer communications protocol designed to support searching and retrieval of information -- full-text documents, bibliographic data, images, multimedia -- in a distributed network environment. Based on client/server architecture and operating over the Internet, the Z39.50 protocol is supporting an increasing number of applications. And like the dynamic network environment in which it is used, the standard is evolving to meet the changing needs of information creators, providers, and users." Applied in many libraries worldwide. The Berkeley site has links to standards, applications, and articles.

Dublin Core Metadata Initiative

"The Dublin Core Metadata Initiative is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI's activities include consensus-driven working groups, global workshops, conferences, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices." Includes descriptions of and references to many activities. Also provides the DC metadata set.

Text Encoding Initiative Consortium (TEI)

"The TEI is an international project to develop guidelines for the encoding of textual material in electronic form for research purposes." Started in 1988 - an early project geared toward the humanities. Now hosted by four universities and sponsored by a number of organizations. Provides guidelines for electronic text encoding and interchange. Also includes the popular TEI Lite.

The UK Office for Library and Information Networking, UKOLN

"UKOLN is a national focus of expertise in digital information management. It provides policy, research and awareness services to the UK library, information and cultural heritage communities." Includes a number of projects related to metadata and to mapping between metadata formats.

U.S. Federal Geographic Data Committee.

An example of a government initiative. This one involves the Content Standard for Digital Geospatial Metadata.

U.S. Library of Congress. Standards.

"...key standards used in the information community that are maintained by the Library of Congress. Their Web pages supply information on their maintenance and use. Other links below connect to information on the Library's collection of standards and key standards-settings organizations." Includes: MARC Formats - Digital Library Standards; Z39.50 Retrieval Protocol; Encoded Archival Description; ISO Language Codes; International Standard Serial Number; Standards Collections; Related Standards Organizations.

U.S. Library of Congress. Encoded Archival Description Official Web site

"The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using the Standard Generalized Markup Language (SGML). The standard is maintained in the Network Development and MARC Standards Office of the Library of Congress (LC) in partnership with the Society of American Archivists."

World Wide Web Consortium (W3C)

"The World Wide Web Consortium (W3C) develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential as a forum for information, commerce, communication, and collective understanding." THE AUTHORITY on tools for Web access. A rich site for a number of metalanguage standards, including HTML and XML. Also includes the Web Content Accessibility Guidelines and tutorials on building accessibility into your own Web sites.

General information


Ariadne is a leading library online publication. Metadata Corner is a regular feature about metadata issues, projects, and descriptions - an excellent source of current information. The same column in earlier issues can also be accessed.

International Federation of Library Associations (IFLA). Digital libraries: Metadata resources

One of the richest resources for links to many national and international metadata efforts and projects - a whole alphabet soup of them. Also links to documents in a variety of metadata areas. A good place to start.

Koehler, W. (2000). Tutorial on author tools.

Descriptions, examples and links to a variety of markup languages, metatags and initiatives, including SGML/XML, Dublin Core and W3C initiatives in metadata. Many examples, e.g. 'Page 3 MetaTags' gives an example of applying HTML meta tags to your own HTML pages.

Memorial University of Newfoundland Libraries. Metadata standards, crosswalks, and standard organizations.

An extensive set of links to anything on metadata.

Rutgers University Libraries. Center for Electronic Texts in the Humanities (CETH)

Lists a number of projects, some extensive and complex. Includes guidelines, workshops and presentations on XML, SGML, HTML, Cold Fusion and others, and as such is also a good educational site on metadata.

Schwartz, C. (2000). Simmons College. Metadata Resources

An eclectic resource about metadata. Includes readings, national and international efforts, examples of projects, MARC cataloging, SGML, EAD, TEI, W3C efforts, and much more. Also, a great bibliography with links to original papers. Part of a larger set of electronic resources maintained by Candy Schwartz at Simmons College, including a course on digital libraries.

last update 20 March 2002