Converting into RDF, and full-text indexing

I’d like to provide a bit more detail on two important elements of MESA: the resources available to help projects convert their metadata into the MESA-specified RDF, and how to have full text indexed so that it can be searched in MESA.

Converting into RDF

In addition to the XSLT example files on the Collex wiki, there is now a Ruby script for automating RDF creation. Jeffrey Witt, editor of the Petrus Plaoul Electronic Critical Edition (one of the MESA partner projects), wrote it to generate his project’s RDF files. It is tailored to the Petrus Plaoul project, but it can be customized for other locations and other types of data (particularly TEI). The script is available at https://bitbucket.org/jeffreycwitt/rdfbuilder/, and Dr. Witt has generously offered to help others customize it; you can contact him at jeffreycwitt@gmail.com.

If you are developing scripts or tools for generating MESA-specified RDF and you would like to share them with other partner projects, please let us know and we will post them on the MESA page of the Collex Wiki.

Indexing Full Text

If you would like to include the full text of any item, so that it can be searched in the MESA interface, you will need to use <collex:text> in your RDF. Either 1) the @rdf:resource attribute may point to the URL of a web-accessible, plain-text transcription of the object, or 2) the plain text may be included as the content of <collex:text>. The plain text is included for indexing purposes only. When a record is found via a full-text search, the user will need to follow the link to the source website in order to view the full text.
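To make the two forms of <collex:text> concrete, here is a minimal Python sketch that attaches the element to a record in either form. It is an illustration under stated assumptions, not the MESA tooling: the collex namespace URI, the use of rdf:Description as the record element, the example.org addresses, and the helper names new_record and add_full_text are all hypothetical, so check them against the MESA documentation on the Collex wiki before relying on them. The sketch treats one record as one RDF file.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
COLLEX_NS = "http://www.collex.org/schema#"  # assumed URI, verify before use

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("collex", COLLEX_NS)


def new_record(uri):
    """Create an empty record for the object identified by uri.

    rdf:Description is a generic stand-in; a project may use its own element.
    """
    return ET.Element(f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": uri})


def add_full_text(record, url=None, text=None):
    """Attach a single collex:text element to record.

    Pass url to point at a plain-text transcription (option 1),
    or text to embed the plain text directly (option 2).
    """
    if (url is None) == (text is None):
        raise ValueError("provide exactly one of url or text")
    if record.find(f"{{{COLLEX_NS}}}text") is not None:
        raise ValueError("record already has a collex:text element")
    element = ET.SubElement(record, f"{{{COLLEX_NS}}}text")
    if url is not None:
        element.set(f"{{{RDF_NS}}}resource", url)
    else:
        element.text = text
    return element


# Option 1: point at a web-accessible plain-text file.
by_url = new_record("http://example.org/ms/1")
add_full_text(by_url, url="http://example.org/ms/1/transcription.txt")

# Option 2: embed the plain text itself.
inline = new_record("http://example.org/ms/2")
add_full_text(inline, text="Plain-text transcription goes here.")

print(ET.tostring(by_url, encoding="unicode"))
print(ET.tostring(inline, encoding="unicode"))
```

Calling add_full_text a second time on the same record raises an error, which keeps the record to a single <collex:text> element.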

Note that <collex:text> may appear only once in an RDF file. We expect that in many cases a single object will have several texts attached to it: for example, a manuscript with a diplomatic transcription, a normalized transcription, and a description. For all three texts to be searchable, you would need to create a separate RDF file for each one. The fields <collex:source_xml> and <collex:source_html> would then point to the encoded source for each text. Although they are not required, we recommend that projects use <dcterms:hasPart> and <dcterms:isPartOf> to link together the RDF records that describe different pieces of a single object.
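One way to picture the one-file-per-text arrangement is the Python sketch below. It builds a separate RDF document for each text, each with a single <collex:text> and a <dcterms:isPartOf> link, plus a record for the object itself that lists its parts with <dcterms:hasPart>. Several things here are assumptions made for illustration: the collex namespace URI, the rdf:Description wrapper, the existence of a separate record for the object, and the function name build_linked_records. The source fields <collex:source_xml> and <collex:source_html> are left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
COLLEX_NS = "http://www.collex.org/schema#"  # assumed URI, verify before use
DCTERMS_NS = "http://purl.org/dc/terms/"

for prefix, uri in (("rdf", RDF_NS), ("collex", COLLEX_NS), ("dcterms", DCTERMS_NS)):
    ET.register_namespace(prefix, uri)


def q(ns, name):
    """Return a namespace-qualified name in ElementTree's {uri}name form."""
    return f"{{{ns}}}{name}"


def build_linked_records(object_uri, texts):
    """Build one RDF document per text, plus one for the object itself.

    texts maps each record URI to the URL of its plain-text file.
    Returns a dict of record URI -> rdf:RDF root element.
    """
    documents = {}

    # Hypothetical record for the object as a whole.
    parent_root = ET.Element(q(RDF_NS, "RDF"))
    parent = ET.SubElement(parent_root, q(RDF_NS, "Description"),
                           {q(RDF_NS, "about"): object_uri})
    documents[object_uri] = parent_root

    for record_uri, text_url in texts.items():
        # The object record points down to each part.
        ET.SubElement(parent, q(DCTERMS_NS, "hasPart"),
                      {q(RDF_NS, "resource"): record_uri})

        root = ET.Element(q(RDF_NS, "RDF"))
        record = ET.SubElement(root, q(RDF_NS, "Description"),
                               {q(RDF_NS, "about"): record_uri})
        # Each part points back up to the object record.
        ET.SubElement(record, q(DCTERMS_NS, "isPartOf"),
                      {q(RDF_NS, "resource"): object_uri})
        # Exactly one collex:text per document.
        ET.SubElement(record, q(COLLEX_NS, "text"),
                      {q(RDF_NS, "resource"): text_url})
        documents[record_uri] = root

    return documents


docs = build_linked_records(
    "http://example.org/ms/1",
    {
        "http://example.org/ms/1/diplomatic": "http://example.org/ms/1/diplomatic.txt",
        "http://example.org/ms/1/normalized": "http://example.org/ms/1/normalized.txt",
        "http://example.org/ms/1/description": "http://example.org/ms/1/description.txt",
    },
)
for uri, root in docs.items():
    print(uri)
    print(ET.tostring(root, encoding="unicode"))
```

Each root element could then be written to its own file, for instance with ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True).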

Please comment here, or contact us, if you have questions relating to RDF generation, full-text indexing, or anything else relating to MESA.