InformatikEnglischBlogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed.ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
StartseiteAtom-FeedMastodonISSN 2051-8188
language
ClassificationD3jsRDFSPARQLWikidataInformatikEnglisch
Veröffentlicht

Following on from previous posts The Semantic Web made fun: d3sparql and The Biodiversity Heritage Library meets Wikidata via Wikispecies: adding author identifiers to BioStor I've put together an example query that can be used to extract a taxonomic classification from Wikidata.

BHLBioStorNgramTimelineInformatikEnglisch
Veröffentlicht

Given a big corpus of literature one of the fun things to do is look at how the use of a term has changed over time. When did people first use a particular word? When did one word start to replace another, etc.? Google's Ngram Viewer is perhaps the best known tool for exploring these questions.

DNA BarcodingGBIFIBOLInformatikEnglisch
Veröffentlicht

I've uploaded all the COI barcodes in the iBOL public data dumps into GBIF. This is an update of an earlier project that uploaded a small subset. Now that dataset doi:10.15468/inygc6 has been expanded to include some 2.7 million barcodes.

Bob MesibovCharacter EncodingGuest PostUFT-8InformatikEnglisch
Veröffentlicht

The following is a guest post by Bob Mesibov. According to w3techs, seven out of every eight websites in the Alexa top 10 million are UTF-8 encoded. This is good news for us screenscrapers, because it means that when we scrape data into a UTF-8 encoded document, the chances are good that all the characters will be correctly encoded and displayed. It's not quite good news for two reasons.

ChallengeGBIFInformatikEnglisch
Veröffentlicht

The GBIF 2016 Ebbe Nielsen Challenge has received 15 submissions. You can view them here: Unlike last year where the topic was completely open, for the second challenge we've narrowed the focus to "Analysing and addressing gaps and biases in primary biodiversity data". As with last year, judging is limited to the jury (of which I'm a member), however anyone interested in biodiversity informatics can browse the submissions.

Guest PostIRMNGTony ReesInformatikEnglisch
Veröffentlicht

This guest post by Tony Rees describes his quest to track all genus names ever published (plus a subset of the species…). A “holy grail” for biodiversity informatics is a suitably quality controlled, human- and machine-queryable list of all the world’s species, preferably arranged in a suitable taxonomic hierarchy such as kingdom-phylum-class-order-family-genus or other.