Monday 23 April 2007

Blank nodes for specimens without URIs


Some specimens in GenBank can be easily linked to an external record via a URI (albeit one I've constructed), but for many GenBank sequences the specimen is either too poorly described, or has no digital representation at all, so simply linking to a URI is not possible. After playing with generating my own URIs for records in a local MySQL database of specimens, it occurred to me (eventually) that blank nodes might be a useful way to handle these. A blank node is a node in the RDF graph that has no URI, but to which all the information about that specimen is linked. The diagram on the right shows the model. In RDF/XML, it would look something like this:

<bioguid:voucher rdf:parseType="Resource">
  <rdf:type rdf:resource="http://bioguid.info/schema/0.1/Specimen"/>
  <darwin:Country>Nicaragua</darwin:Country>
  <darwin:Locality>Rio San Juan, 10deg56'N 84deg18'W</darwin:Locality>
  <geo:lat>10.93</geo:lat>
  <geo:long>-84.3</geo:long>
  <dc:title>OMNH 33325</dc:title>
</bioguid:voucher>

The original GenBank record is DQ502492.
In the absence of a URI, we can only refer to the specimen by description, making statements such as "the specimen with the title 'OMNH 33325'".
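
To make the idea of referring to a specimen by description concrete, here is a minimal Python sketch that models the same data as plain triples. It is only an illustration, not how any particular RDF library works: the helper names (new_blank_node, describe_voucher, find_subjects) are hypothetical, the identifier "genbank:DQ502492" is a made-up stand-in for whatever URI the sequence record actually has, and predicates are kept as prefixed names rather than expanded URIs.

```python
from itertools import count

# Counter used to mint document-local blank node labels (_:b0, _:b1, ...).
_blank_ids = count()


def new_blank_node():
    """Return a fresh blank node label; it is local to this graph, not a URI."""
    return f"_:b{next(_blank_ids)}"


def describe_voucher(sequence, properties):
    """Return triples linking `sequence` to a new blank node carrying `properties`."""
    node = new_blank_node()
    triples = [(sequence, "bioguid:voucher", node)]
    triples += [(node, predicate, value) for predicate, value in properties.items()]
    return triples


def find_subjects(triples, predicate, value):
    """Find nodes by description: every subject with the given predicate and value."""
    return [s for s, p, o in triples if p == predicate and o == value]


triples = describe_voucher(
    "genbank:DQ502492",  # hypothetical identifier for the sequence record
    {
        "rdf:type": "http://bioguid.info/schema/0.1/Specimen",
        "darwin:Country": "Nicaragua",
        "darwin:Locality": "Rio San Juan, 10deg56'N 84deg18'W",
        "geo:lat": "10.93",
        "geo:long": "-84.3",
        "dc:title": "OMNH 33325",
    },
)

# "The specimen with the title 'OMNH 33325'": look the node up by its title.
specimens = find_subjects(triples, "dc:title", "OMNH 33325")

# Which sequences have that specimen as their voucher?
sequences = [s for s, p, o in triples if p == "bioguid:voucher" and o in specimens]
```

The blank node label only has meaning inside this one set of triples, so the specimen can be reached from outside only through its properties, which is exactly the trade-off of not having a URI.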
