Computer and Information Sciences, Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.

Hot on the heels of Geoffrey Nunberg's essay about the train wreck that is Google Books metadata (see my earlier post) comes "Google Scholar’s Ghost Authors, Lost Authors, and Other Problems" by Péter Jacsó. It's a fairly scathing look at some of the problems with the quality of Google Scholar's metadata. Now, Google Scholar isn't perfect, but it has come to play a key role in a variety of bibliographic tools, such as Mendeley and Papers.


I've been playing recently with the Biodiversity Heritage Library (BHL), and am starting to get a sense of the complexities (and limitations) of the metadata BHL stores about publications. The more I look at BHL, the more I think the resource is (a) wonderfully useful and (b) hampered by some dodgy metadata.


Continuing on this theme of embedded metadata, this is one reason why DNA barcoding is so appealing. A DNA barcode is rather like embedded metadata -- once we extract it we can look up the sequence and determine the organism's identity (or, at least, whether we've seen it before). It's much like identifying a CD based on a hash computed from the track lengths.
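To make the CD analogy concrete, here is a minimal sketch of that kind of lookup. It is not the algorithm any real CD database uses: the SHA-1 fingerprint, the function names, and the one-entry reference table are all assumptions made up for illustration.

```python
import hashlib


def disc_fingerprint(track_lengths_s):
    """Hash a disc's track lengths (whole seconds, in play order).

    Illustrative only: real disc-ID schemes differ in detail.
    """
    payload = ",".join(str(int(n)) for n in track_lengths_s)
    return hashlib.sha1(payload.encode("ascii")).hexdigest()


# Hypothetical reference database: fingerprint -> title.
known_discs = {
    disc_fingerprint([215, 187, 240, 301]): "Hypothetical Album A",
}


def identify(track_lengths_s, database):
    """Return the known title, or say we have not seen this disc before."""
    return database.get(disc_fingerprint(track_lengths_s), "not seen before")


print(identify([215, 187, 240, 301], known_discs))
print(identify([215, 187, 240, 300], known_discs))
```

The analogy only goes so far: a hash lookup like this needs an exact match, whereas matching a barcode sequence to a reference library generally has to tolerate some variation between individuals.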


Following on from the previous post, as Howison and Goodrum note, Adobe provides XMP as a way to store metadata in files, such as PDFs. XMP supports RDF and namespaces, which means widely used bibliographic standards such as Dublin Core and PRISM can be embedded in a PDF, so the article doesn't become separated from its metadata. Adobe provides a developer's kit under a BSD license.
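As a rough idea of what such a packet looks like, the sketch below builds a small XMP/RDF fragment carrying Dublin Core and PRISM fields, using only Python's standard library. The `build_xmp` helper and the sample article details are made up for illustration, and the PRISM namespace URI is my assumption about the 2.0 basic namespace. Writing the packet into a PDF is a separate step (for example, with Adobe's toolkit) and isn't shown.

```python
import xml.etree.ElementTree as ET

# Namespace URIs; the PRISM one is assumed to be the 2.0 "basic" namespace.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, local):
    """Return a namespace-qualified tag name in ElementTree's {uri}local form."""
    return "{%s}%s" % (NS[prefix], local)


def build_xmp(title, creators, journal):
    """Build an XMP packet body (hypothetical helper, not Adobe's API)."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # dc:title is a language alternative: an rdf:Alt of language-tagged items.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = title

    # dc:creator is an ordered list of names: an rdf:Seq.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for name in creators:
        ET.SubElement(seq, q("rdf", "li")).text = name

    ET.SubElement(desc, q("prism", "publicationName")).text = journal
    return ET.tostring(xmpmeta, encoding="unicode")


packet = build_xmp("An example article", ["A. Author", "B. Author"], "Journal of Examples")
print(packet)
```

Because every field sits in its own namespace, a reader that understands only Dublin Core can still pull out the title and authors and ignore the PRISM terms.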