
The first release of the Encyclopedia of Life is officially live today. I have promised to be very good...
My short note on the LSID Tester tool has been published in the Open Access journal Source Code for Biology and Medicine. The article has just come out, so the DOI (doi:10.1186/1751-0473-3-2) isn't live yet; the direct link is http://www.scfbm.org/content/3/1/2/. Source code for the tester is available from Google Code.
In the absence of a proper bug reporting system, I'm going to use this post to collect errors in the TBMap project, which maps taxonomic names in TreeBASE onto names in other databases.

TaxonID | TaxonName | Notes
T57654 | Lycorideae | Erroneously matched by agrep to the spider family Lycosidae; this is a plant tribe.
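To see why an approximate matcher would confuse these two names, here is a minimal sketch that computes the plain Levenshtein edit distance between them. This is an illustration only: agrep uses its own approximate-matching algorithm, and the error threshold used in TBMap is not stated here, so the function `edit_distance` below is a stand-in, not the TBMap method.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard dynamic-programming recurrence."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


# The plant tribe and the spider family differ by one substitution (r -> s)
# and one deleted "e".
distance = edit_distance("Lycorideae", "Lycosidae")
print(distance)
```

Any matcher that tolerates two errors in a ten-letter name will therefore pair the two, which is why a check on the higher taxon (plant versus animal) is a useful guard.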
CrossRef have released a tool for bloggers to look up DOIs and insert them into blog posts. So far the tool is only available for WordPress blogs. The idea is that bloggers can use DOIs to uniquely identify papers that they are discussing, while at the same time providing readers with an easy way to go to the site hosting the article, and aggregators such as postgenomic.com can cluster posts about the same paper.
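The clustering idea can be sketched in a few lines: normalise each DOI to a canonical form, then group posts by it. This is not the CrossRef tool or postgenomic.com's actual method; the function names, the list of prefixes, and the example.org post URLs are all assumptions made up for illustration. The one fact relied on is that DOI names are case-insensitive.

```python
from collections import defaultdict

# Common ways a DOI is written in a blog post (an assumed, non-exhaustive list).
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip a known prefix and lowercase the DOI so variants compare equal."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def cluster_posts(posts):
    """Group (post_url, doi_as_written) pairs by normalised DOI."""
    clusters = defaultdict(list)
    for post_url, raw_doi in posts:
        clusters[normalize_doi(raw_doi)].append(post_url)
    return dict(clusters)


# Hypothetical posts citing the same paper in different ways.
posts = [
    ("http://example.org/blog-a/post-1", "doi:10.1186/1751-0473-3-2"),
    ("http://example.org/blog-b/post-7", "http://dx.doi.org/10.1186/1751-0473-3-2"),
    ("http://example.org/blog-c/post-3", "10.1000/EXAMPLE"),
]
clusters = cluster_posts(posts)
for doi, urls in clusters.items():
    print(doi, len(urls))
```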
Wired 16.01 has an article entitled The Data Wars by Josh McHugh. A quote from the printed version: It's a sobering read for those of us who advocate harvesting data from as many sources as possible, more so in light of Microsoft's bid to buy Yahoo. Yahoo provides free access to many of its tools via an API (such as the image search I use in iSpecies), and in this sense is much more open than Google. Might this change under Microsoft...?
Dave Lunt has a nice post on How to visualize a phylogeny with thousands of tips?. Dave lists 12 things that his ideal phylogenetic tree viewing tool should do, and invites comments. It will be interesting to see what comes of this...
Came across the paper "Using incomplete citation data for MEDLINE results ranking" (pmid:16779053, full text available in PMC). The authors applied PageRank (the algorithm Google use to rank search results) to papers in MEDLINE and found that PageRank is robust to information loss. In other words, even if a citation database is incomplete it will do a good job of ranking results.
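For readers who haven't seen PageRank applied to citations, here is a minimal sketch using power iteration on a toy citation graph, run once on the full graph and once with a citation removed. The graph, the paper labels, and the damping factor of 0.85 are my own assumptions, not data or settings from the paper, and a five-node example says nothing about the paper's actual result; it only shows the mechanics.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """Power-iteration PageRank; citations maps a paper to the papers it cites."""
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper receives the same "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            cited = citations.get(p, [])
            if cited:
                share = damping * rank[p] / len(cited)
                for q in cited:
                    new_rank[q] += share
            else:
                # A paper that cites nothing spreads its rank over all papers.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Toy graph: "C" is cited by everyone, "D" by two papers.
full = {"A": ["C"], "B": ["C", "D"], "D": ["C"], "E": ["C", "D"]}
# Same graph with the B -> D citation missing from the database.
partial = {"A": ["C"], "B": ["C"], "D": ["C"], "E": ["C", "D"]}

full_rank = pagerank(full)
partial_rank = pagerank(partial)
for label, rank in (("full", full_rank), ("partial", partial_rank)):
    ordering = sorted(rank, key=lambda p: (-rank[p], p))
    print(label, ordering)
```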
Nothing like a little hubris first thing Monday morning... After various experiments, such as a triple store for ants (documented on the Semant blog) and bioGUID (documented on the bioGUID blog), I'm starting from scratch and working on a "database of everything". Put another way, I'm working on a database that aggregates metadata about specimens, sequences, literature, images, taxonomic names, etc.
Leigh Dodds has a nice post How Shall I Integrate Thee? Let Me Count the Ways... about different ways to integrate data.
Spent the week in Portugal at the EDIT Future Trends of Taxonomy meeting, held at the Hotel Tivoli Almansor, Carvoeiro. (Photo caption: view from a cave at the beach in front of the hotel.)