Rogue Scholar

Published November 14, 2016

Willi Egloff, Donat Agosti, Puneet Kishor, David Patterson, and Jeremy A. Miller have published an interesting preprint entitled “Copyright and the Use of Images as Biodiversity Data” (DOI:10.1101/087015), in which they argue that taxonomic images aren't copyrightable. I'm not convinced, and have commented on the bioRxiv site.

EOL, JSON-LD, Platform, TraitBank, Computer and Information Sciences

EOL Traitbank JSON-LD is broken

https://doi.org/10.59350/ba6y2-7yn42

Published November 12, 2016

Author Roderic Page

One of the most interesting aspects of EOL is "TraitBank", which has been described in a recent paper. TraitBank is available in JSON-LD, and so is potentially part of the Semantic Web.

Bob Mesibov, Character Encoding, Guest Post, UFT-8, Computer and Information Sciences

Guest post: It's 2016 and your data aren't UTF-8 encoded?

https://doi.org/10.59350/t8j1g-8h227

Published September 30, 2016

Author Roderic Page

The following is a guest post by Bob Mesibov. According to w3techs, seven out of every eight websites in the Alexa top 10 million are UTF-8 encoded. This is good news for us screenscrapers, because it means that when we scrape data into a UTF-8 encoded document, the chances are good that all the characters will be correctly encoded and displayed. It's not quite good news for two reasons.
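As a rough illustration of the scraping scenario described above, the sketch below tries to decode scraped bytes as UTF-8 and falls back to a legacy encoding when that fails. The function name `decode_scraped` and the choice of windows-1252 as the fallback are assumptions made for this example; they are not taken from the guest post.

```python
from typing import Tuple


def decode_scraped(raw: bytes, fallback: str = "windows-1252") -> Tuple[str, str]:
    """Return (text, encoding_used) for a scraped byte string.

    Hypothetical helper: the fallback encoding is an assumption and
    should be replaced with whatever the source site actually uses.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on invalid UTF-8 bytes.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: decode with the assumed legacy encoding,
        # substituting a replacement character for undecodable bytes.
        return raw.decode(fallback, errors="replace"), fallback


if __name__ == "__main__":
    for sample in ("Müller".encode("utf-8"), "Müller".encode("windows-1252")):
        text, encoding = decode_scraped(sample)
        print(f"{sample!r} decoded as {encoding}: {text}")
```

Valid UTF-8 input passes straight through, while bytes written in another encoding are caught by the strict decode rather than silently producing garbled characters.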

Challenge, GBIF, Computer and Information Sciences

GBIF 2016 Ebbe Nielsen Challenge entries

https://doi.org/10.59350/44ber-48595

Published September 30, 2016

Author Roderic Page

The GBIF 2016 Ebbe Nielsen Challenge has received 15 submissions, which can be viewed online. Unlike last year, when the topic was completely open, for the second challenge we've narrowed the focus to "Analysing and addressing gaps and biases in primary biodiversity data". As with last year, judging is limited to the jury (of which I'm a member); however, anyone interested in biodiversity informatics can browse the submissions.

Guest Post, IRMNG, Tony Rees, Computer and Information Sciences

Guest post: Absorbing task or deranged quest: an attempt to track all genus names ever published

https://doi.org/10.59350/sx242-7wv17

Published September 7, 2016

Author Roderic Page

This guest post by Tony Rees describes his quest to track all genus names ever published (plus a subset of the species…). A “holy grail” for biodiversity informatics is a suitably quality controlled, human- and machine-queryable list of all the world’s species, preferably arranged in a suitable taxonomic hierarchy such as kingdom-phylum-class-order-family-genus or other.

Community, Curation, GrBio, Wikidata, Computer and Information Sciences

GRBio: A Call for Community Curation - what community?

https://doi.org/10.59350/q72m8-z7859

Published August 30, 2016

Author Roderic Page

David Schindel and colleagues recently published a paper in the Biodiversity Data Journal. The paper is a call for the community to help grow a database (GRBio) on biodiversity repositories, a database that "will require community input and curation". Reading this, I'm struck by the lack of a clear sense of what that community might be. In particular: who is this database for, and who is most likely to build it? I suspect that

BioNames, Treatments, Computer and Information Sciences

Displaying original species descriptions in BioNames

https://doi.org/10.59350/m1ksn-g8p68

Published August 26, 2016

Author Roderic Page

The goal of my BioNames project is to link every taxonomic name to its original description (initially focussing on animal names). The rationale is that taxonomy is based on evidence, and yet most of this evidence is buried in a non-digitised and/or hard to find literature. Surfacing this information not only makes taxonomic evidence accessible (see Surfacing the deep data of taxonomy), it also surfaces a lot of basic biological information.

Challenge, Gaps, GBIF, Hurlbert's Index, OBIS, Computer and Information Sciences

GBIF Challenge: €31,000 in prizes for analysing and addressing gaps and biases in primary biodiversity data

https://doi.org/10.59350/p0zrm-f5494

Published August 18, 2016

Author Roderic Page

In a classic paper, Boggs (1949) appealed for an “atlas of ignorance”, an honest assessment of what we know we don’t know. This is the theme of this year's GBIF Challenge: Analysing and addressing gaps and biases in primary biodiversity data. "Gaps" can be gaps in geographic coverage, taxonomic group, or types of data. GBIF is looking for ways to assess the nature of the gaps in the data it is aggregating from its network of contributors.

BioStor, Leaflet, OpenRefine, Stamen, Computer and Information Sciences

BioStor updates: nicer map, reference matching service

https://doi.org/10.59350/97a6s-ymc08

Published August 18, 2016

Author Roderic Page

BioStor now has 150,000 articles. When I wrote a paper describing how BioStor worked it had 26,784 articles, so things have progressed somewhat! I continue to tweak the interface to BioStor, trying different ways to explore the articles. Spatial search: I've tweaked spatial search in BioStor.

Containers, Docker, Microservices, Computer and Information Sciences

Containers, microservices, and data

https://doi.org/10.59350/52qwe-6yh44

Published August 17, 2016

Author Roderic Page

Some notes on containers, microservices, and data. The idea of packaging software into portable containers and running them either locally or in the cloud is very attractive (see Docker). Here are some use cases I'm interested in exploring. Microservices: in “Towards a biodiversity knowledge graph” (doi:10.3897/rio.2.e8767) I listed a number of services that are essentially self-contained, such as name parsers, reconciliation tools, resolvers, etc.

iPhylo
