
Biopragmatics

Unraveling complex biology with biological knowledge graphs. Content licensed under CC BY 4.0.
Python, Packaging, Cookiecutter, Documentation, Natural Sciences
Published
Author Charles Tapley Hoyt

PEP 735 introduced dependency groups in packaging metadata. They complement optional dependencies: rather than corresponding to features of the package, they cover things like development or release dependencies. I am slowly working towards updating my cookiecutter template cookiecutter-snekpack to use PEP 735. So far, uv and tox have released support; all that’s left is ReadTheDocs.

Python, GraphViz, Environments, Natural Sciences
Published
Author Charles Tapley Hoyt

Graphviz is software for graph visualization written in C. PyGraphviz provides a nice Python wrapper for it. The issue is that the way to point Python at the C headers during installation changes every few months. I’ll try to keep this blog post updated whenever that happens.

Haskell, JATS, Publishing, Pandoc, Natural Sciences
Published
Author Charles Tapley Hoyt

I’m working on a contribution to pandoc that adds first-class support for author role annotations using the Contribution Role Taxonomy (CRediT) and also outputs compliant Journal Publishing Tag Set (JATS) XML. This has led me down a (losing) journey of learning the Haskell programming language, so I thought I would post a short note on a function I tried to understand.

Wordpress, Python, PHP, Natural Sciences
Published
Author Charles Tapley Hoyt

The International Society of Biocuration (ISB) partners with the journal Database to get discounts for its members when they publish there. This means the ISB’s executive committee needs to send a member list to the journal’s editor.

ORCID, Bibliometrics, Natural Sciences
Published
Author Charles Tapley Hoyt

The Open Researcher and Contributor Identifier (ORCID) database is an invaluable resource that supports the unambiguous identification of researchers. However, its first-party data dump is too complex, verbose, and unstandardized for many use cases. This post describes open source software I wrote that automates downloading, processing, and exporting ORCID into a more usable form. I put the results on Zenodo under the CC0 license.

Bioregistry, Semantic Web, Pydantic, Fastapi, Python, Natural Sciences
Published
Author Charles Tapley Hoyt

Using Pydantic for encoding data models and FastAPI for implementing APIs on top of them has become a staple for many Python programmers.

Books, Natural Sciences
Published
Author Charles Tapley Hoyt

I finally got back into reading! Over winter break 2022, I started the Stormlight Archive, then followed up in 2023 by reading the entirety of Brandon Sanderson’s Cosmere, as well as some other fantasy, science fiction, and literary fiction. Here’s the list.

Umls, Reproducibility, Natural Sciences
Published
Author Charles Tapley Hoyt

The Unified Medical Language System (UMLS) is a widely used biomedical and clinical vocabulary maintained by the United States National Library of Medicine. However, it is notoriously difficult to access and work with due to licensing restrictions and its complex download system. In the same vein as my previous posts about DrugBank and ChEMBL, this post describes open source software I’ve developed for downloading and working with this data.

Cheminformatics, Reproducibility, Natural Sciences
Published
Author Charles Tapley Hoyt

I’ve been working on improving reproducibility in the field of cheminformatics for some time now. For example, I’ve written posts about making data from DrugBank and ChEMBL more actionable. Over the last year, I’ve been preparing a concept with the editors of the Journal of Cheminformatics on how to include an assessment of reproducibility in reviews of manuscripts submitted to the journal.

Wikidata, Bibliometrics, Natural Sciences
Published
Author Charles Tapley Hoyt

Today’s short post is about three SPARQL queries I wrote to get bibliometric information about journals and publishers out of Wikidata. Each of the following queries can be readily copy-pasted into the Wikidata Query Service and run in the browser.
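For readers who would rather script a query than paste it into the browser, here is a rough sketch of building a request URL for the Wikidata Query Service. The SPARQL below is an illustrative stand-in and not one of the three queries from the post; it assumes that `wd:Q5633421` is "scientific journal", `wdt:P31` is "instance of", and `wdt:P123` is "publisher". The `build_query_url` helper is a hypothetical name.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query (an assumption, not from the post):
# list a few scientific journals together with their publishers
QUERY = """
SELECT ?journal ?journalLabel ?publisher ?publisherLabel WHERE {
  ?journal wdt:P31 wd:Q5633421 ;
           wdt:P123 ?publisher .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""


def build_query_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = urlencode({"query": query.strip(), "format": "json"})
    return f"{WIKIDATA_SPARQL_ENDPOINT}?{params}"


url = build_query_url(QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client; the sketch stops short of sending the request so that it runs without network access.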