The Open Archives Initiative (OAI) develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The technological framework and standards being developed to support this work are independent of both the type of content offered and the economic mechanisms surrounding that content. Of particular interest is the OAI's primary product, the OAI Metadata Harvesting Protocol. This protocol enables data resources to exchange metadata with services such as indexers and proxies with a minimum of overhead, and it can accommodate virtually any metadata format. Emerge is adopting the OAI protocol for its own harvesting effort.
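As a rough illustration of how lightweight a harvesting request can be, the sketch below builds a request URL from a verb and a few parameters. The `verb` and `metadataPrefix` parameters and the `ListRecords` and `oai_dc` values follow the published OAI harvesting protocol specification; the base URL and the helper name `build_harvest_url` are placeholders invented for this example, and this is not a description of Emerge's harvester.

```python
from urllib.parse import urlencode


def build_harvest_url(base_url, verb, **params):
    """Return a harvesting request URL: base URL plus a query string.

    The helper name and the base URL used below are hypothetical.
    """
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Ask a (placeholder) repository for all records in Dublin Core form.
url = build_harvest_url(
    "http://example.org/oai",  # assumed repository address
    "ListRecords",
    metadataPrefix="oai_dc",
)
print(url)
```

A harvester would fetch this URL over HTTP and parse the XML response; that step is omitted here because it depends on the repository being harvested.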
RDF (the Resource Description Framework) is a platform-neutral, extensible framework for describing attributes of objects and relationships between objects. It can accommodate virtually all kinds of metadata and can be used to build ontologies and other knowledge structures that allow automated inference and other rule-based processing. It is related to technologies such as KIF and Topic Maps.
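The core idea can be sketched without any RDF library: statements are (subject, predicate, object) triples, and a rule derives new triples from existing ones. The example below is a toy in-memory model under that assumption, not a real RDF toolkit; all `ex:` identifiers are hypothetical, and the single rule shown (type membership propagating through `rdfs:subClassOf`) stands in for rule-based processing in general.

```python
# A toy triple store: each statement is (subject, predicate, object).
# All "ex:" names are made up for illustration.
triples = {
    ("ex:dataset1", "dc:title", "Ocean temperature grid"),
    ("ex:dataset1", "dc:creator", "ex:alice"),
    ("ex:dataset1", "rdf:type", "ex:GridDataset"),
    ("ex:GridDataset", "rdfs:subClassOf", "ex:Dataset"),
}


def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def infer_types(store):
    """Apply one rule until nothing changes:
    (x rdf:type A) and (A rdfs:subClassOf B) implies (x rdf:type B)."""
    inferred = set(store)
    changed = True
    while changed:
        changed = False
        for x, p1, a in list(inferred):
            if p1 != "rdf:type":
                continue
            for a2, p2, b in list(inferred):
                if p2 == "rdfs:subClassOf" and a2 == a:
                    new = (x, "rdf:type", b)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


closure = infer_types(triples)
print(match(closure, s="ex:dataset1", p="rdf:type"))
```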
Z39.50 is an ISO/NISO standard network protocol for querying data collections. It models data collections as unordered sets of records with fields that can be matched by terms in a search query. Queries can be nested using boolean operators, and leaf terms can carry combinations of attributes. These attributes are defined in attribute sets, which are independent of the protocol specification itself. Attribute sets for bibliographic data and geospatial data, among others, have been defined for Z39.50. Because Z39.50's model of data is so abstract and simple, it can be used to make many different kinds of data network-searchable, including large-scale scientific data. A disadvantage of Z39.50 is that it is somewhat difficult to implement, in part because it relies on ASN.1/BER as its encoding rather than a text encoding such as XML. However, numerous Z39.50 toolkits are available, including our own Gazelle and the YAZ toolkit.
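The query model described above (records with fields, leaf terms qualified by attributes, boolean nesting) can be sketched as a small tree evaluated against in-memory records. This is only an illustration of the data model: it does not speak the Z39.50 wire protocol or ASN.1/BER, and the `use` attribute with plain field names is a hypothetical stand-in for a real attribute set.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Term:
    use: str    # hypothetical attribute: which record field to match
    value: str  # the search term


@dataclass(frozen=True)
class Bool:
    op: str     # "and", "or", or "and-not"
    left: "Query"
    right: "Query"


Query = Union[Term, Bool]


def matches(query, record):
    """Evaluate a nested query tree against one record (a dict of fields)."""
    if isinstance(query, Term):
        return query.value.lower() in record.get(query.use, "").lower()
    left = matches(query.left, record)
    right = matches(query.right, record)
    if query.op == "and":
        return left and right
    if query.op == "or":
        return left or right
    if query.op == "and-not":
        return left and not right
    raise ValueError(f"unknown operator: {query.op}")


def search(query, records):
    """Return the records that satisfy the query."""
    return [r for r in records if matches(query, r)]


records = [
    {"title": "Ocean temperature grids", "author": "Smith"},
    {"title": "Ocean currents", "author": "Jones"},
    {"title": "Desert soils", "author": "Smith"},
]

# title contains "ocean" AND (author contains "smith" AND-NOT title contains "currents")
query = Bool(
    "and",
    Term("title", "ocean"),
    Bool("and-not", Term("author", "smith"), Term("title", "currents")),
)
hits = search(query, records)
print(hits)
```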
Emerge provides both a Z39.50 server toolkit, Gazelle, and a multi-threaded search gateway, Gazebo, which can search multiple Z39.50 services simultaneously and can be accessed through an object-oriented Java client interface.
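The general pattern of a gateway that queries several services at once can be sketched with a thread pool. This is not Gazebo's code or API (Gazebo exposes a Java client interface); the targets here are in-memory stand-ins, and `make_target` and `federated_search` are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def make_target(name, records):
    """Build a stand-in search service over an in-memory list of strings."""
    def search(term):
        return [(name, r) for r in records if term.lower() in r.lower()]
    return search


def federated_search(targets, term):
    """Send the same term to every target in parallel and merge the hits.

    Results are merged in the order the targets were given, so the
    output is deterministic even though the searches run concurrently.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = [pool.submit(target, term) for target in targets]
        results = []
        for future in futures:
            results.extend(future.result())
    return results


targets = [
    make_target("library", ["Ocean currents", "Desert soils"]),
    make_target("geodata", ["Pacific Ocean bathymetry"]),
]
merged = federated_search(targets, "ocean")
print(merged)
```

A real gateway would also need per-target timeouts and error handling so that one slow or failing service does not block the merged result.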
XML is a rapidly emerging standard designed to make it easy to define new document types that express application-specific semantics independently of low-level syntax and user-interface information. Based on SGML, XML is a completely general-purpose meta-syntax for a variety of text-based formats. We use it to encode Gazebo's configuration syntax and for the client-to-Gazebo protocol.
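To show what an XML-encoded configuration might look like in practice, the sketch below parses a small document listing search targets. The element and attribute names (`gateway`, `target`, `name`, `host`, `port`, `database`) and the host names are invented for this example; Gazebo's actual configuration schema is not described here and may differ entirely.

```python
import xml.etree.ElementTree as ET

# A hypothetical configuration document; not Gazebo's real schema.
CONFIG = """\
<gateway>
  <target name="library" host="z3950.example.org" port="210" database="books"/>
  <target name="geodata" host="geo.example.org" port="2100" database="maps"/>
</gateway>
"""


def load_targets(xml_text):
    """Parse the configuration and return one dict per <target> element."""
    root = ET.fromstring(xml_text)
    return [
        {
            "name": t.get("name"),
            "host": t.get("host"),
            "port": int(t.get("port")),
            "database": t.get("database"),
        }
        for t in root.findall("target")
    ]


config_targets = load_targets(CONFIG)
for entry in config_targets:
    print(entry["name"], entry["host"], entry["port"], entry["database"])
```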