Computer and Information Sciences, Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
BHL, BioStor, CouchDB, PubMed Central, Replication, Computer and Information Sciences
Published

Last December I released a web site called Australian Faunal Directory on CouchDB, which was part of my ongoing exploration of how to build a simple yet useful database of taxonomic names. In particular, I want to link names directly to the primary taxonomic literature.

BioStor, BMC Bioinformatics, Google Scholar, Published, Computer and Information Sciences
Published

My article describing BioStor, "Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library", has finally seen the light of day in BMC Bioinformatics (doi:10.1186/1471-2105-12-187; the DOI is not working at the moment, so give it a little while to go live. In the meantime you can access the article here). Getting this article published was more work than I expected.

Background, BHL, BioStor, DjVu, RTFM, Computer and Information Sciences
Published

One of the biggest challenges I've faced with the BioStor project, apart from dealing with messy metadata, has been handling page images. At present I get these from the Biodiversity Heritage Library. They are big (typically 1 MB in size), and have the caramel colour of old paper.

Citation, Data, Dryad, Computer and Information Sciences
Published

Interest in archiving data and data publication is growing, as evidenced by projects such as Dryad, and earlier tools such as TreeBASE. But I can't help wondering whether this is a little misguided. I think the issues are granularity and reuse. Taking the second issue first, how much reuse do data sets get? I suspect the answer is "not much". I think there are two clear use cases: repeatability of a study, and benchmarks.

Mendeley, Web Hooks, Computer and Information Sciences
Published

Quick, poorly thought out idea. I've argued before that Mendeley seems the obvious tool to build a "bibliography of life." It has pretty much all the features we need: nice editing tools, support for DOIs, PubMed identifiers, social networking, etc. But there's one thing it lacks: there is no easy way to transmit updates from Mendeley to another database.

NCBI, PLoS, PLoS Currents, Published, Wikipedia, Computer and Information Sciences
Published

My paper describing the mapping between NCBI and Wikipedia has been published in PLoS Currents: Tree of Life. You can see the paper here. It's only just gone live, so it's yet to get a PubMed Central number (one of the nice features of PLoS Currents is that the articles get archived in PMC). Publishing in PLoS Currents: Tree of Life was a pleasant experience. The Google Knol editing environment was easy to use, and the reviewing process quick.

Computer and Information Sciences
Published

A few weeks ago I spent some time mapping pages from the BBC Wildlife Finder to the equivalent taxa in the NCBI taxonomy. This seemed a useful exercise because the Wildlife Finder pages have some wonderful picture, video, and audio content, as well as other nice features, such as reusing Wikipedia page titles as "slugs" in the BBC page URLs.