Computer and Information SciencesBlogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed.ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
Home PageAtom FeedMastodonISSN 2051-8188
language
Published

Following on from exploring links between GBIF and GenBank here I'm going to look at links between GBIF and the primary literature, in this case articles scanned by the Biodiversity Heritage Library (BHL). The OCR text in BHL can be mined for a variety of entities. BHL itself has used uBio's tools to identity taxonomic names in the OCR text, and in my BioStor project I've extracted article-level metadata and geographic co-ordinates.

Published

This message appeared on the TAXACOM mailing list: Given that most specimens lack resolvable digital identifiers (a theme I've harped on about before, most recently in the context of DNA barcoding), answering this kind of query ends up being a case of searching publications for text strings that contain the acronym of the collection.

Published

Following on from my earlier post Linking taxonomic names to literature: beyond digitised 5×3 index cards I've been slowly updating my latest toy: http://iphylo.org/~rpage/itaxon This site displays a database mapping over 200,000 animal names to the primary literature, using a mix of identifiers (DOIs, Handles, PubMed, URLs) as well as links to freely available PDFs where they are available.

Published

Geoffery Bilder's comments about the unsuitability of URLs as long term identifiers (as opposed, say, to DOIs) came to mind when I discovered that the domain phthiraptera.org is up for sale: This domain used to be home to a wealth of resources on lice (order Phthiraptera). I discovered that ownership of the domain had expired when a bunch of links to PDFs returned by an iSpecies search for Collodennyus all bounced to the holding page

Published

I've been playing recently with the Biodiversity Heritage Library (BHL), and am starting to get a sense for the complexities (and limitations) of the metadata BHL stores about publications. The more I look at BHL the more I think the resource is (a) wonderfully useful and (b) hampered by some dodgy metadata.

Published

Continuing with RSS feeds, I've now added wrappers around IPNI that will return for each plant family a list of names added to the IPNI database in the last 30 days. You can see the list at here. One thing which is a constant source of frustration for me is the disconnect between nomenclators (lists of published names for species) and scientific publishing.

Published

I've been using ISSN's (International Standard Serial Number) to uniquely identify journals, both to generate article identifiers, and as a parameter to send to CrossRef's OpenURL resolver. Recently I've come across journals that change their ISSN, which has fairly catastrophic effects on my lookup tools.