Rogue Scholar

Computer and Information Sciences

Why do museum and gallery displays ignore the web?

Published August 13, 2024

How to cite: Page, R. (2024). Why do museum and gallery displays ignore the web? https://doi.org/10.59350/a83tn-c6t14 This post is inspired by the Pharaoh exhibition at the NGV in Melbourne, Australia. This is a beautifully displayed exhibition of objects from the British Museum, London. It has all the trappings of a modern exhibition: beautiful lighting, a custom soundtrack, and lots of social media coverage.

Computer and Information Sciences

A future for the Biodiversity Heritage Library

https://doi.org/10.59350/n3dkt-6xd05

Published July 2, 2024

Author Roderic Page

How to cite: Page, R. (2024). A future for the Biodiversity Heritage Library https://doi.org/10.59350/n3dkt-6xd05 Following the 2024 BHL meeting, the departure of Martin Kalfatovic, and the uncertainty that the departure of such a pivotal person brings, perhaps it’s time to think about the future of BHL. Below I sketch some thoughts, which are hazy at best. I should say at the outset that I think BHL is an extraordinary project.

Computer and Information Sciences

Visualising big trees: a talk at the Systematics Association 2024

https://doi.org/10.59350/cf6n4-ch767

Published June 19, 2024

Author Roderic Page

How to cite: Page, R. (2024). Visualising big trees: a talk at the Systematics Association 2024 https://doi.org/10.59350/cf6n4-ch767 This blog post has some notes in support of a talk given to the Systematics Association meeting in Reading on June 20th, 2024. Slides: I will post a link to the slides here once I have given the talk. Page, Roderic (2024). Visualising big trees. figshare. Presentation.

FAIR, Identifiers, Nanopublication, Pensoft, RDF, Computer and Information Sciences

Nanopubs, a way to create even more silos

https://doi.org/10.59350/6nj85-7te92

Published June 18, 2024

Author Roderic Page

How to cite: Page, R. (2024). Nanopubs, a way to create even more silos https://doi.org/10.59350/6nj85-7te92 Pensoft have recently introduced “nanopubs”, small structured publications that can be thought of as containing the minimum possible statement that could be published. Nanopubs are promoted as FAIR, that is, findable, accessible, interoperable, and reusable.

Computer and Information Sciences

Notes on transforming BHL images

https://doi.org/10.59350/2gpbb-98a53

Published April 19, 2024

Author Roderic Page

How to cite: Page, R. (2024). Notes on transforming BHL images https://doi.org/10.59350/2gpbb-98a53 I’ve been down this road before, e.g. BHL, DjVu, and reading the f*cking manual and Demo of full-text indexing of BHL using CouchDB hosted by Cloudant, but I’m revisiting converting BHL page scans to black and white images, partly to clean them up, to make them closer to what a modern reader might expect, and partly to reduce the

Computer and Information Sciences

Hugging Face Autotrain

https://doi.org/10.59350/7p1n4-wdv84

Published March 27, 2024

Author Roderic Page

How to cite: Page, R. (2024). Hugging Face Autotrain https://doi.org/10.59350/7p1n4-wdv84 These are notes to myself on using Hugging Face AutoTrain. The first version of this had a very nice interface where you could simply upload a folder of images and train a model. It was limited in the range of tasks and models, but made up for that in ease of use.

Computer and Information Sciences

Problems with the DataCite Data Citation Corpus

https://doi.org/10.59350/t80g1-xys37

Published February 20, 2024

Author Roderic Page

How to cite: Page, R. (2024). Problems with the DataCite Data Citation Corpus https://doi.org/10.59350/t80g1-xys37 DataCite have released the Data Citation Corpus, together with a dashboard that summarises the corpus. This is billed as: The goal is to build a citation database between scholarly articles and data, such as datasets in repositories, sequences in GenBank, protein structures in PDB, etc.

Computer and Information Sciences

It's 2023 - why are we still not sharing phylogenies?

https://doi.org/10.59350/n681n-syx67

Published November 29, 2023

Author Roderic Page

How to cite: Page, R. (2023). It’s 2023 - why are we still not sharing phylogenies? https://doi.org/10.59350/n681n-syx67 A quick note to support a recent Twitter thread https://twitter.com/rdmpage/status/1729816558866718796?s=61&t=nM4XCRsGtE7RLYW3MyIpMA The article “Diversification of flowering plants in space and time” by Dimitrov et al. describes a genus-level phylogeny for 14,244 flowering plant genera.

Computer and Information Sciences

Where are the plant type specimens? Mapping JSTOR Global Plants to GBIF

https://doi.org/10.59350/m59qn-22v52

Published October 26, 2023

Author Roderic Page

How to cite: Page, R. (2023). Where are the plant type specimens? Mapping JSTOR Global Plants to GBIF. https://doi.org/10.59350/m59qn-22v52 This blog post documents my attempts to create links between two major resources for plant taxonomy: JSTOR’s Global Plants and GBIF, specifically between type specimens in JSTOR and the corresponding occurrence in GBIF.

ABBYY, CRF, DjVu, Document Layout, HOCR, Computer and Information Sciences

Document layout analysis

https://doi.org/10.59350/z574z-dcw92

Published August 31, 2023

Author Roderic Page

How to cite: Page, R. (2023). Document layout analysis. https://doi.org/10.59350/z574z-dcw92 Some notes to self on document layout analysis. I’m revisiting the problem of taking a PDF or a scanned document and determining its structure (for example, where is the title, abstract, bibliography, where are the figures and their captions, etc.). There are lots of papers on this topic, and lots of tools.

Computer and Information Sciences

The problem with GBIF's Phylogeny Explorer

https://doi.org/10.59350/v0bt3-zp114

Published August 3, 2023

Author Roderic Page

How to cite: Page, R. (2023). The problem with GBIF’s Phylogeny Explorer. https://doi.org/10.59350/v0bt3-zp114 GBIF recently released the Phylogeny Explorer, using legumes as an example dataset. The goal is to enable users to “view occurrence data from the GBIF network aligned to legume phylogeny.” The screenshot below shows the legume phylogeny side-by-side with GBIF data.

iPhylo
