The Plazi project has become one of the major contributors to GBIF with some 36,000 datasets yielding some 500,000 occurrences (see Plazi's GBIF page for details). These occurrences are extracted from taxonomic publications using automated methods.
The following is a guest post by Bob Mesibov. No winner yet in the second Darwin Core Million for 2020, but there are another two and a half weeks to go (to 30 September). For details of the contest see this iPhylo blog post. And please don’t submit a million RECORDS, just (roughly) a million DATA ITEMS. That’s about 20,000 records with 50 fields in the table, or about 50,000 records with 20 fields, or something arithmetically similar.
The following is a guest post by Bob Mesibov. The Atlas of Living Australia (ALA) adds "assertions" to Darwin Core occurrence records.
The following is a guest post by Bob Mesibov. There's still time (to 31 March) to enter a dataset in the 2020 Darwin Core Million, and by way of encouragement I'll celebrate here the best and worst Darwin Core datasets I've seen. The two best are real stand-outs because both are collections of IPT resources rather than one-off wonders. The first is published by the Peabody Museum of Natural History at Yale University.
The following is a guest post by Bob Mesibov. You're feeling pretty good about your institution's collections data.
Yes, this is a clickbait headline, and yes, it may seem like shooting fish in a barrel to complain about crappy data in GBIF, but my point here is to raise concerns about the impact of metagenomic data on GBIF, and how difficult it may be to track down the causes of errors.
The following is a guest post by Bob Mesibov. Nico Franz and Beckett Sterner created a stir last year with a preprint in bioRxiv about expert validation (or the lack of it) in the "backbone" classifications used by aggregators.
The following is a guest post by Bob Mesibov. Do you know the party game "Telephone", also known as "Chinese Whispers"? The first player whispers a message in the ear of the next player, who passes the message in the same way to a third player, and so on. When the last player has heard the whispered message, the starting and finishing versions of the message are spoken out loud. The two versions are rarely the same.
Update: Angelique Hjarding and her co-authors have responded in a guest post on iPhylo. The quality and fitness for use of GBIF-mobilised data is a topic of interest to anyone who uses GBIF data.
There is a great post by Jeni Tennison on the Open Data Institute blog entitled Five Stages of Data Grief. It resonates so much with my experience working with biodiversity data (such as building BioNames, or exploring data errors in GBIF) that I've decided to reproduce it here.