
Jeremy Monat, PhD

Scientific software developer
Published

While exploring cheminformatics toolkits beyond the RDKit, I wanted to try the EPAM Indigo Toolkit. The Indigo Toolkit is free and open-source under the Apache License 2.0, so it can be used in proprietary software. I was unable to find simple examples of drawing molecules in a Python Jupyter Notebook, so here’s how to do that. This post also demonstrates how to save molecular images to a file.

Published

As the YouTubers would say, “A lot of you have been asking me about how to write cheminformatics blog posts.” Well, not a lot, but at least a couple! I realized that there’s a pattern to how I write cheminformatics blog posts (16 so far), so I’m sharing that here. My blog posts are intended to be tutorials that explore a topic, usually with existing tools. I figure out how to accomplish a cheminformatics objective, then share that.

Published

Molecules have a color if their electronic energy levels are close enough together to absorb visible rather than ultraviolet light. For organic molecules, that’s often because of an extensive chain of conjugated bonds. Can we use cheminformatics to find evidence that increasing conjugated bond chain length increases absorption wavelength, shifting absorption from the ultraviolet into the visible range and making a molecule colored?

Published

Tautomers are chemical structures that readily interconvert under given conditions. For example, an amino acid has a neutral form and a zwitterionic form with separated positive and negative charges. Cheminformatics packages have algorithms to enumerate tautomers based on rules. Which algorithms produce the most tautomers? And how successful is InChI at assigning a single identifier to all tautomers of a given structure?

Published

This blog post presents a more computationally efficient way to determine the abundance of the molecular isotopes of a molecule. In part 1, we created a molecule for each possible placement of each isotope in a molecule. While that worked, it was computationally expensive because it required creating each permutation. In this blog post, we’ll create each combination only once and calculate its abundance using the binomial distribution.
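To make the binomial idea concrete, here is a minimal sketch, not the code from the post. It assumes a single element with two isotopes; the function name `isotopologue_abundances` is hypothetical, and the chlorine-37 abundance used in the example is an approximate natural-abundance value.

```python
from math import comb


def isotopologue_abundances(n_atoms: int, p_heavy: float) -> dict[int, float]:
    """Abundance of each combination with k heavy atoms out of n_atoms.

    Assumes one element with exactly two isotopes, where p_heavy is the
    natural abundance of the heavier one. Each combination is computed once;
    comb(n_atoms, k) counts the equivalent placements (permutations).
    """
    p_light = 1.0 - p_heavy
    return {
        k: comb(n_atoms, k) * p_heavy**k * p_light ** (n_atoms - k)
        for k in range(n_atoms + 1)
    }


# Example: Cl2, taking the natural abundance of chlorine-37 as roughly 0.2424
cl2 = isotopologue_abundances(n_atoms=2, p_heavy=0.2424)
for n_heavy, abundance in cl2.items():
    print(f"{n_heavy} x 37Cl: {abundance:.4f}")
```

For n atoms this evaluates n + 1 combinations instead of 2^n placements, which is where the savings come from. Molecules with several multi-isotope elements would need one such distribution per element, combined by multiplication.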

Published

I contributed MolsMatrixToGridImage to the RDKit 2023.09.1 release because I found myself writing similar code over and over to draw row-and-column grids of molecules. For projects where each row represented something, such as a molecule and the fragments off a common core, my mental model corresponded to a two-dimensional (nested) data structure, whereas the pre-existing function MolsToGridImage supported only linear (flat) data structures.
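The gap between the nested and flat data structures can be sketched in plain Python. This is only an illustration of the idea, not RDKit's implementation; `flatten_matrix` is a hypothetical helper, and the SMILES strings stand in for molecule objects.

```python
def flatten_matrix(matrix, pad=None):
    """Flatten a nested (row-and-column) list into a flat list plus a row width.

    Shorter rows are padded with `pad` so that every row occupies the same
    number of cells and each one starts on a new line of the grid.
    """
    n_cols = max((len(row) for row in matrix), default=0)
    flat = []
    for row in matrix:
        flat.extend(row)
        flat.extend([pad] * (n_cols - len(row)))  # fill the rest of the row
    return flat, n_cols


# Each row is a molecule followed by its fragments (SMILES used as stand-ins)
matrix = [
    ["CCO", "CC", "O"],
    ["c1ccccc1O", "c1ccccc1"],
]
flat, per_row = flatten_matrix(matrix)
print(flat)
print(per_row)
```

The flat list and the row width are the kind of inputs a linear grid function expects, for example the molecules and `molsPerRow` arguments of MolsToGridImage. Working with the nested form directly removes this bookkeeping from the calling code.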

Published

Here’s how to display formatted molecular formulas in tables and graphs. In addition to formatted molecular formulas, these techniques should work for any Markdown or LaTeX. In the last blog post, we generated molecular formulas from SMILES strings or RDKit molecules. Once we have those molecular formulas, formatted as Markdown or LaTeX, we might want to display them in tables or graphs.
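As a rough sketch of what "formatted" means here, the snippet below turns a plain formula string into LaTeX and into HTML subscripts, which most Markdown renderers accept inline. Both function names are hypothetical, and the sketch assumes simple formulas with no charges or isotope labels.

```python
import re


def formula_to_latex(formula: str) -> str:
    """Wrap each run of digits in a LaTeX subscript, e.g. C6H12O6 -> C_{6}H_{12}O_{6}."""
    body = re.sub(r"(\d+)", r"_{\1}", formula)
    return rf"$\mathrm{{{body}}}$"


def formula_to_html(formula: str) -> str:
    """Wrap each run of digits in <sub> tags for Markdown or HTML output."""
    return re.sub(r"(\d+)", r"<sub>\1</sub>", formula)


print(formula_to_latex("C6H12O6"))
print(formula_to_html("C6H12O6"))
```

Strings like these can then be used as table cells or axis labels, provided the table or plotting tool renders LaTeX or HTML.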

Published

In a previous post, I revisited Wiener’s paper predicting alkanes’ boiling points using modern cheminformatics tools. This follow-up post refits the data with modern mathematical tools to check how well the literature parameters, and the current parameters optimized here, fit the data. Wiener and Egloff’s works are impressive for using cheminformatics parameters that model physical data with simple relationships.
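One of the parameters in Wiener's work is the path number, now usually called the Wiener index: the sum of the bond-count distances between every pair of carbon atoms. The sketch below computes it with a breadth-first search over a carbon skeleton. The adjacency-dictionary representation and the function name are assumptions for illustration; the posts themselves use cheminformatics tools for this.

```python
from collections import deque


def wiener_index(adjacency: dict[int, list[int]]) -> int:
    """Sum of shortest-path distances (in bonds) over all pairs of atoms."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives the bond distance from start to every atom
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
    return total // 2  # each pair was counted once from each end


# Carbon skeletons as adjacency lists (hydrogens omitted)
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The branched isomer has the smaller index, which fits the general trend that branched alkanes boil at lower temperatures than their straight-chain isomers.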