A brief guide for Content Providers
Why publish data in AGRIS
Since the beginning of 2008, Google Scholar has indexed all the references in the AGRIS repository, boosting the worldwide visibility and accessibility of the institutional repositories participating in the AGRIS Network. Partly because of Google indexing, visits to the AGRIS portal have doubled in the last two years, and the portal currently receives approximately 2,000,000 visitors a year.
The OAI Protocol for Metadata Harvesting (OAI-PMH)
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is widely used by content providers and is integrated in several content management systems (CMS) and other systems that expose their data to make it accessible and harvestable by a wider community.
Content providers whose repositories use OAI-PMH to disseminate their data should be aware that an additional layer must be developed in the repository code so that the data can be exported in the AGRIS AP XML metadata format.
Institutional repositories, publishing companies and national networks currently harvested by AGRIS using OAI-PMH include Scielo in Brazil, Viikki Science Library in Finland, BIBSYS in Norway, the National Library of Portugal, Prod-Inra in France and Wageningen UR Library in the Netherlands. AGRIS collects this data via a harvester that is based on the OAIHarvester2 Open Source Software (OSS) and that includes the AGRIS AP metadata format as an additional prefix. An OAI-PMH tutorial, accessible at http://www.oaforum.org/tutorial/, describes in detail the steps necessary to implement the protocol in an IR (Institutional Repository). An additional benefit of implementing OAI-PMH is that you can register your IR at the official registration site for Data Providers.
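As an illustration of how a harvester addresses a repository, the sketch below builds OAI-PMH request URLs. The verbs (ListMetadataFormats, ListRecords) and the parameters (metadataPrefix, set, from) come from the OAI-PMH specification. The endpoint address and the prefix name "agris_ap" are assumptions made for this example only, because the prefix is whatever name a repository chooses when it adds the AGRIS AP export layer.

```python
from urllib.parse import urlencode


def build_list_metadata_formats_url(base_url):
    """URL asking a repository which metadata prefixes it can export."""
    return base_url + "?" + urlencode({"verb": "ListMetadataFormats"})


def build_list_records_url(base_url, metadata_prefix, set_spec=None, from_date=None):
    """URL asking a repository for its records in one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict to one OAI set
    if from_date:
        params["from"] = from_date    # optional: only records changed since this date
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and prefix name, for illustration only.
BASE_URL = "http://repository.example.org/oai"
print(build_list_metadata_formats_url(BASE_URL))
print(build_list_records_url(BASE_URL, "agris_ap", from_date="2008-01-01"))
```

A repository that has added the export layer would list its AGRIS AP prefix in the ListMetadataFormats response, next to the mandatory oai_dc format.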
AGRIS AP XML generation
In recent years, AGRIS has created a metadata framework (the AGRIS Application Profile, or AGRIS AP) that minimizes the importance of applications and systems and leverages the idea of a common metadata format for data exchange. Content providers are not asked to use "AGRIS tools", such as the ISIS family of systems, but are free to use any system when they want to submit data to the AGRIS database.
FAO/AGRIS has a long and widespread involvement in the development, deployment and maintenance of content management systems, with tools such as WebAGRIS, but has recently adopted a neutral position on the content management system chosen by its "data partners". In parallel, the team is working with other International Institutions and UN Organizations to provide customizations of two important CMSs (DSpace and Drupal). These are simple plug-ins that will enable the systems both to use AGROVOC to index resources and to export to the AGRIS AP XML format. Please contact us if you would like more information in this regard or if you want to join the community and collaborate on the above projects.
Important international communities and networks, such as the family of CGIAR Centres, use different systems but have successfully adopted the AGRIS AP format as the common metadata format for data exchange, and their data is now regularly published in the AGRIS database.
The following four steps briefly describe one of the simplest processes for generating valid AGRIS AP XML records from proprietary XML-enabled databases:
- Identification of the fields in the catalogue of the local database that will match the AGRIS AP XML DTD elements and schemes. The resulting mapping document links the fields of the local database to the elements and qualifiers of the schemas.
- An XSL stylesheet (or other scripts) encodes the mapping document produced by the cataloguers. The template will link and match the nodes from each field of the local database to the appropriate elements and schemes.
- Applying the XSLT to the well-formed XML documents results in their transformation to AGRIS AP XML resources.
- The resulting XML documents are validated against the AGRIS AP XML DTD by means of an XML parser.
It is evident from the above steps that providing metadata in an XML format is an essential prerequisite that content providers should keep in mind. The first step (crosswalk or mapping) is the key step of the whole process: to achieve full interoperability at the level of content and schema, both the originating and the output metadata formats must have a clear specification of their elements and semantics.
The AGRIS AP format, whose structure is described in the AGRIS AP XML DTD, uses qualified Dublin Core (DC) XML metadata. The provider's metadata can be mapped and then converted to the AGRIS AP XML format with XSLTs or other scripts. On this page there is a comprehensive description, with the relevant encoding, of each AGRIS AP element and scheme required to generate valid AGRIS AP XML data.
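The mapping and conversion steps can be sketched with a short script in place of an XSL stylesheet. The example below is a minimal illustration under stated assumptions, not a reference implementation: the local field names (TITLE, AUTHOR, KEYWORD, YEAR) and the "record" element are hypothetical, the "ags" namespace URI is assumed and should be checked against the AGRIS AP XML DTD, and the output is deliberately flat. A real AGRIS AP record needs the nesting and qualifiers defined in the DTD, so this output should not be expected to validate as it stands.

```python
import xml.etree.ElementTree as ET

# The "dc" URI is the standard Dublin Core element namespace.
# The "ags" URI is an assumption; confirm it against the AGRIS AP XML DTD.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ags": "http://purl.org/agmes/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

# Step 1, the mapping document: hypothetical local fields -> target elements.
FIELD_MAP = {
    "TITLE": "dc:title",
    "AUTHOR": "dc:creator",
    "KEYWORD": "dc:subject",
    "YEAR": "dc:date",
}


def qname(prefixed):
    """Turn 'dc:title' into the '{namespace}title' form ElementTree expects."""
    prefix, local = prefixed.split(":")
    return "{%s}%s" % (NS[prefix], local)


def convert(local_xml):
    """Steps 2 and 3: apply the mapping to every <record> in the local export."""
    source = ET.fromstring(local_xml)
    resources = ET.Element(qname("ags:resources"))
    for record in source.iter("record"):
        resource = ET.SubElement(resources, qname("ags:resource"))
        for field in record:
            target = FIELD_MAP.get(field.tag)
            text = (field.text or "").strip()
            if target is None or not text:
                continue  # skip unmapped or empty fields
            ET.SubElement(resource, qname(target)).text = text
    return ET.tostring(resources, encoding="unicode")


local_export = """<export>
  <record><TITLE>Rice yields in Asia</TITLE><AUTHOR>Doe, J.</AUTHOR><YEAR>2007</YEAR></record>
</export>"""
print(convert(local_export))
```

Keeping the mapping in one table (FIELD_MAP here) mirrors the mapping document produced by the cataloguers, so it can be reviewed without reading the conversion code. The final validation step is not shown, because the parsers in Python's standard library do not validate against a DTD; a validating XML parser is needed for that.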
If your Institution or company has a system that can export to XML and you wish to publish your citations related to agriculture, forestry, animal husbandry, aquatic sciences and fisheries, and human nutrition, send a request to firstname.lastname@example.org, specifying:
- The name and place of your Institution or Company
- The name and URL (if available) of the Journal or of the repository/database that you wish to include in AGRIS
- The name of the ILMS (Integrated Library Management System) that is used to catalogue and search data
- The type and class of the documents that you intend to publish in AGRIS
- The type of format and the schema (if available) that is used for exporting data from your system
If the database storing the data to be published in AGRIS has no option for exporting to XML, other formats can be used to generate XML data. For example, MS Access has an "Export to XML" feature (available in Access 2002 or later) that makes it easy to create data dumps and export them to a proprietary XML format. You can do almost the same with MS Excel. Another option is CSV (comma-separated values), a simple, structured text format for a database table. An online tool, CSV to XML Converter, can generate XML from a CSV file.
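The CSV route can also be scripted. The following is a minimal sketch, assuming a CSV file whose first row holds the column names. The element names "export" and "record" and the sample columns are hypothetical, and the result is a generic XML dump that would still need to be mapped to AGRIS AP.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="export", row_tag="record"):
    """Turn CSV text with a header row into a simple XML document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element(root_tag)
    for row in reader:
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # DictReader uses a None key for surplus cells and None for missing ones.
            if column is None or value is None or not value.strip():
                continue
            # Column names become element names; only spaces are handled here,
            # so other characters that are illegal in XML names need extra care.
            tag = column.strip().replace(" ", "_")
            ET.SubElement(record, tag).text = value.strip()
    return ET.tostring(root, encoding="unicode")


sample = 'title,author,year\nRice yields in Asia,"Doe, J.",2007\n'
print(csv_to_xml(sample))
```

Empty cells are skipped instead of being written as empty elements, which keeps the later mapping step simpler.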