Rogue Scholar

Published February 23, 2012

Author Roderic Page

Duplicate records are the bane of any project that aggregates data from multiple sources.

Frogs, GBIF, Genbank, Geophylogeny, KML, Computer and Information Sciences

Linking GBIF and Genbank

https://doi.org/10.59350/hj161-hh554

Published February 21, 2012

Author Roderic Page

As part of my mantra that it's not about the data, it's all about the links between the data, I've started exploring matching GenBank sequences to GBIF occurrences using the specimen_voucher codes recorded in GenBank sequences. It's quickly becoming apparent that this is not going to be easy.
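To make the matching problem concrete, here is a minimal sketch of the first step: splitting a GenBank specimen_voucher value into the three parts of a Darwin Core triplet. It assumes the INSDC qualifier layout `[institution-code:[collection-code:]]specimen_id`. The `Voucher` type, the function name, and the example values are made up for illustration and are not taken from the post.

```python
from typing import NamedTuple, Optional


class Voucher(NamedTuple):
    """Hypothetical container for the parts of a specimen_voucher value."""
    institution_code: Optional[str]
    collection_code: Optional[str]
    specimen_id: str


def parse_specimen_voucher(value: str) -> Voucher:
    """Split a voucher string on colons (assumed INSDC layout).

    Values without any colon are returned as a bare specimen_id,
    because the institution cannot be recovered reliably from them.
    """
    parts = [p.strip() for p in value.strip().split(":", 2)]
    if len(parts) == 3:
        return Voucher(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return Voucher(parts[0], None, parts[1])
    return Voucher(None, None, parts[0])


# Illustrative (made-up) voucher strings
for raw in ("MVZ:Herp:12345", "USNM:12345", "KU 12345"):
    print(raw, "->", parse_specimen_voucher(raw))
```

Many real voucher strings carry no colons at all, so the third case (a free-text code such as "KU 12345") is where most of the matching difficulty would sit.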

Darwin Core Triplet, DNA Barcoding, DOI, GBIF, Identifiers, Computer and Information Sciences

DNA Barcoding, the Darwin Core Triplet, and failing to learn from past mistakes

https://doi.org/10.59350/aq4wb-dt356

Published December 11, 2011

Author Roderic Page

Given various discussions about identifiers, dark taxa, and DNA barcoding that have been swirling around the last few weeks, there's one notion that is starting to bug me more and more.

C-squares, GBIF, Georeferencing, ISpecies, RDF, Computer and Information Sciences

Referring to a one-degree square in RDF using c-squares

https://doi.org/10.59350/9szbj-fya39

Published May 10, 2010

Author Roderic Page

I'm in the midst of rebuilding iSpecies (my mash-up of Wikipedia, NCBI, GBIF, Yahoo, and Google search results) with the aim of outputting the results in RDF. The goal is to convert iSpecies from a pretty crude "on-the-fly" mash-up to a triple store where results are cached and can be queried in interesting ways. Why?
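For readers unfamiliar with the c-squares codes named in the post title, the sketch below computes the code of the one-degree square containing a point. It follows the c-squares notation as I understand it: a global quadrant digit (1 = NE, 3 = SE, 5 = SW, 7 = NW), the tens of latitude and longitude, then an intermediate quadrant digit followed by the units of latitude and longitude. The function name and the handling of points exactly on the equator, the prime meridian, or the poles are assumptions, and the post itself may build its identifiers differently.

```python
def csquare_one_degree(lat: float, lon: float) -> str:
    """Return the c-squares code for the 1-degree square containing (lat, lon)."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("latitude/longitude out of range")

    # Global quadrant digit; zero is treated as north/east (an assumption).
    if lat >= 0:
        quadrant = 1 if lon >= 0 else 7
    else:
        quadrant = 3 if lon >= 0 else 5

    # Clamp so that exactly 90 or 180 falls in the outermost cell (an assumption).
    abs_lat = min(abs(lat), 89.999999)
    abs_lon = min(abs(lon), 179.999999)

    lat_tens, lat_units = divmod(int(abs_lat), 10)
    lon_tens, lon_units = divmod(int(abs_lon), 10)

    # Intermediate quadrant: 1 = low lat/low lon, 2 = low lat/high lon,
    # 3 = high lat/low lon, 4 = high lat/high lon ("high" means units >= 5).
    intermediate = 1 + (2 if lat_units >= 5 else 0) + (1 if lon_units >= 5 else 0)

    return f"{quadrant}{lat_tens}{lon_tens:02d}:{intermediate}{lat_units}{lon_units}"


# A point near Hobart, Tasmania (42.88 S, 147.33 E)
print(csquare_one_degree(-42.88, 147.33))
```

A code like this is a plain string, so it can be appended to a base URI of your choosing to name the square in RDF.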

GBIF, GUIDs, Linked Data, Computer and Information Sciences

GBIF and Linked Data

https://doi.org/10.59350/d5c5f-hby28

Published August 12, 2009

Author Roderic Page

At the end of day two of the GBIF LSID-GUID Task Group I put together this crude diagram to summarise some of the possible links between biodiversity data and the larger linked data cloud, which I, among others, have argued is where biodiversity informatics should be heading.

Bit.ly, EOL, GBIF, LSID, Open Calais, Computer and Information Sciences

LSIDs, disaster or opportunity

https://doi.org/10.59350/hhamr-8ns77

Published April 16, 2009

Author Roderic Page

OK, I really must stop avoiding what I'm supposed to be doing (writing a paper, already missed the deadline), but continuing the theme of LSIDs and short URLs, it occurs to me that LSIDs can be seen as a disaster (they don't work in web browsers, nobody else uses them, they're hard to implement, etc.) or an opportunity.

EOL, GBIF, ISpecies, Red Lionfish, Sucks, Computer and Information Sciences

EOL hyperbole

https://doi.org/10.59350/yh7cw-shy15

Published December 13, 2008

Author Roderic Page

The latest post on the EOL blog (Biodiversity in a rapidly changing world) really, really annoys me. It claims that [...] Nope, I suggest it demonstrates just how limited EOL is. If I view the page for the red lionfish I get an out-of-date map from GBIF that shows a very limited distribution, and doesn't show the introductions in Florida and the Bahamas (I have to wade through text to find reference to the Florida introduction, and the page doesn't

Data Quality, EOL, FishBase, GBIF, Georeferencing, Computer and Information Sciences

Global biogeographical data bases on marine fishes: caveat emptor

https://doi.org/10.59350/s2c6a-d6n36

Published October 6, 2008

Author Roderic Page

D. Ross Robertson has published a paper entitled "Global biogeographical data bases on marine fishes: caveat emptor" (doi:10.1111/j.1472-4642.2008.00519.x - DOI is broken, you can get the article here). The paper concludes: [...] As I've noted elsewhere on this blog, and as demonstrated by Yesson et al.'s paper on legume records in GBIF (doi:10.1371/journal.pone.0001124) (not cited by Robertson), there are major problems with geographical information

GBIF, Prize, Vince Smith, Computer and Information Sciences

Vince Smith wins 2008 Ebbe Nielsen Prize

https://doi.org/10.59350/y95fm-qnr73

Published August 26, 2008

Author Roderic Page

As spotted by dechronization, GBIF has made public that Vince Smith has won the 2008 Ebbe Nielsen Prize.

Distribution, Error, GBIF, ISpecies, Computer and Information Sciences

More GBIF errors, courtesy of FishBase

https://doi.org/10.59350/rdec3-a0b17

Published June 11, 2008

Author Roderic Page

Resurrecting iSpecies after moving it to a new folder on one of my servers, and browsing popular searches, I keep coming across clearly erroneous distributions. FishBase seems a major culprit. For example, the common pandora Pagellus erythrinus is a marine fish, yet GBIF displays numerous occurrences in mainland Africa (dots with black centre on map below). What gives?

iPhylo

How many specimens does GBIF really have?
