Back in 2006, in a short post entitled "Building the encyclopedia of life", I wrote that GenBank is a potentially rich source of information on host-parasite relationships.
Following on from adding specimens to my OpenURL resolver, I've added support for GenBank records. Either an OpenURL request such as http://bioguid.info/openurl?id=genbank:DQ502033, or the short URL http://bioguid.info/genbank/DQ502033 will resolve the GenBank record for accession number DQ502033.
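As a rough illustration of the two URL forms mentioned above, here is a minimal Python sketch that builds both from an accession number. The function names are hypothetical, and the code only constructs the URLs shown in the post; it makes no assumptions about what the resolver returns.

```python
from urllib.parse import urlencode

# Base address taken from the URLs quoted in the post.
BASE = "http://bioguid.info"


def openurl_for_genbank(accession: str) -> str:
    """Build the OpenURL-style request, e.g. ...?id=genbank:DQ502033."""
    # safe=":" keeps the colon in "genbank:ACCESSION" unescaped,
    # matching the form used in the post.
    query = urlencode({"id": f"genbank:{accession}"}, safe=":")
    return f"{BASE}/openurl?{query}"


def short_url_for_genbank(accession: str) -> str:
    """Build the short URL form, e.g. .../genbank/DQ502033."""
    return f"{BASE}/genbank/{accession}"


if __name__ == "__main__":
    print(openurl_for_genbank("DQ502033"))
    print(short_url_for_genbank("DQ502033"))
```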
OMG. Playing with extracting identifiers from text, I have a regular expression for GenBank accession numbers that looks something like this: (A[A-Z])[0-9]{6} | (U)[0-9]{5} | (D[A-Z])[0-9]{6} | (E[A-Z])[0-9]{6} | (NC_)[0-9]{6}. OK, it won't get everything, but more worrying are the things it will pick up that aren't GenBank accession numbers.
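To make the false-positive worry concrete, here is a small Python sketch of a pattern along the same lines. It is not the exact expression used on the site: the word boundaries (`\b`) and the reading of the `U` branch as one letter plus five digits are assumptions added for illustration, and the function name is hypothetical.

```python
import re

# Same prefixes as the expression above, written as one alternation.
# The \b anchors are an addition, so matches cannot start or end
# in the middle of a longer token.
ACCESSION_RE = re.compile(
    r"\b(?:A[A-Z][0-9]{6}"
    r"|U[0-9]{5}"
    r"|D[A-Z][0-9]{6}"
    r"|E[A-Z][0-9]{6}"
    r"|NC_[0-9]{6})\b"
)


def find_accession_candidates(text: str) -> list[str]:
    """Return every substring that has the shape of an accession number."""
    return ACCESSION_RE.findall(text)


if __name__ == "__main__":
    # Strings with the right shape are found...
    print(find_accession_candidates("Sequences DQ502033 and U12345; see also NC_001234."))
    # ...but so is anything else that happens to look the same.
    print(find_accession_candidates("Order part AB123456 today"))
```

The second call shows the problem: the pattern only checks shape, so a part number or catalogue code with two letters and six digits is indistinguishable from an accession. Confirming a candidate would need a lookup against GenBank itself.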
Time for some fun. In between some tedious text mining, I've been meaning to explore some visualisations of NCBI. Here's the first, inspired by Jörn Clausen's wonderful Live Earthquake Mashup (thanks to Donat Agosti for telling me about this). What I've done is take all the frog sequences in GenBank that are georeferenced, add the date those GenBank records were created, generate a KML file, and use Nick Rabinowitz's timemap to plot the KML.
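The KML-generation step might look roughly like the Python sketch below. It is not the script used for the frog map: the record fields (`accession`, `created`, `lat`, `lon`), the function name, and the KML 2.2 namespace are all assumptions, and the sample record is invented purely to show the output shape. It assumes the timeline reads each placemark's `TimeStamp` element.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; the original file may have used a different KML version.
KML_NS = "http://www.opengis.net/kml/2.2"


def records_to_kml(records) -> str:
    """Turn georeferenced, dated records into a KML document string."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for rec in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = rec["accession"]
        # The record's creation date becomes the placemark's timestamp.
        stamp = ET.SubElement(placemark, f"{{{KML_NS}}}TimeStamp")
        ET.SubElement(stamp, f"{{{KML_NS}}}when").text = rec["created"]
        # KML coordinates are written longitude first, then latitude.
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{rec['lon']},{rec['lat']}"
        )
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Invented placeholder record, not real GenBank data.
    sample = [{"accession": "EXAMPLE1", "created": "2006-05-01",
               "lat": -12.5, "lon": 130.9}]
    print(records_to_kml(sample))
```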