
Depth-First

Recent content on Depth-First
Published

Over the last couple of years, I’ve noticed several examples of C&EN readers writing in to complain about the way the American Chemical Society (ACS) deals with its considerable intellectual property claims. Last week, Jiri Janata wrote about the unfairness he sees in a scholarly organization taking free services from its members while charging for its own services at the same rate the general public pays.

Published

A molecular fingerprint is a special kind of hash function that can reproducibly place any molecule, known or unknown, into one of a large but finite set of groups. Each molecule will be associated with exactly one fingerprint, but each fingerprint can be associated with multiple molecules. In other words, there exists a one-to-many relationship between fingerprints and molecules.
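The hash-function view can be made concrete with a toy sketch. Everything here is an assumption made for illustration: the feature strings are hypothetical labels rather than output from a real cheminformatics toolkit, and `toy_fingerprint` is not the fingerprinting method used in this series. It only shows the two properties described above: the same input always gives the same fingerprint, and a finite bit width means different molecules can share one.

```python
import hashlib


def toy_fingerprint(features, n_bits=64):
    """Fold a set of feature strings into an integer bitmask of n_bits bits."""
    fp = 0
    for feature in features:
        # A stable hash (unlike Python's built-in hash()) keeps results
        # reproducible across runs.
        digest = hashlib.sha1(feature.encode("utf-8")).digest()
        bit = int.from_bytes(digest[:4], "big") % n_bits
        fp |= 1 << bit
    return fp


# Hypothetical feature sets; a real system would derive these from the structure.
benzene = {"c:c", "c:c:c", "c:c:c:c"}
toluene = benzene | {"C-c", "C-c:c"}

benzene_fp = toy_fingerprint(benzene)
toluene_fp = toy_fingerprint(toluene)

# Every bit set for benzene is also set for toluene, because toluene's
# feature set contains benzene's.
print(toluene_fp & benzene_fp == benzene_fp)
```

Shrinking `n_bits` makes collisions more frequent. At the extreme of `n_bits=1`, every molecule with at least one feature lands in the same group.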

Published

Wiraj Bibile recently posed a question on the use of fingerprints to pre-screen candidate structures in database substructure searches: Let’s say our database consists of 50,000 compounds screened in a central nervous system (CNS) program. One of the most common substructure features in CNS medicinal chemistry is the benzene ring. Not surprisingly, about 75% of the structures in our database contain a benzene ring.
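The scenario in the question can be sketched in a few lines. A candidate passes the pre-screen when it has every bit the query has. The four-bit masks and compound names below are invented for illustration; only the bitwise test itself is the standard screening technique.

```python
def passes_screen(candidate_fp, query_fp):
    """True if every bit set in the query is also set in the candidate."""
    return candidate_fp & query_fp == query_fp


def screen(database, query_fp):
    """Return the names of all compounds that survive the pre-screen."""
    return [name for name, fp in database.items() if passes_screen(fp, query_fp)]


# Hypothetical database: bits 0 and 1 stand in for "benzene ring" features.
database = {
    "compound_a": 0b1011,
    "compound_b": 0b0011,
    "compound_c": 0b0100,
    "compound_d": 0b1111,
}

benzene_query = 0b0011
hits = screen(database, benzene_query)
print(hits)  # ['compound_a', 'compound_b', 'compound_d']
print(len(hits) / len(database))  # 0.75
```

In this toy database the benzene query lets three of four compounds through, so the screen removes little work when the query feature is as common as it is in the CNS example.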

Published

The previous articles in this series have detailed the steps needed to build a working fingerprint screening system using nothing more than the open source tools MySQL, Ruby, and ActiveRecord. With this system we can create, read, update, and destroy fingerprints in persistent storage. Although the system meets all of the requirements of a fingerprint screening system, it isn’t a substructure search system - yet.

Published

In preparation for the first beta release of ChemPhoto, the chemical structure imaging application, I’ve been performing a lot of tests with PubChem SD files. It turns out that having a tool that can be used to quickly browse through tens of thousands of PubChem molecules turns up some very strange beasts, including the one depicted above. If you’re still curious as to what this PubChem record is actually referring to, this tool is quite useful.

Published

The previous article in this series showed how to perform fingerprint screens for substructure searches using nothing more than SQL. Although this is significant progress, working at the level of SQL queries to perform create, read, update, and delete (CRUD) operations on our fingerprint table is more work than it needs to be. We’d really prefer to use an API written in a high-level programming language.

Published

The previous article in this series discussed the configuration of a MySQL database for fast substructure search with binary fingerprints. This article first shows how to populate this database with real fingerprint data for two molecules. Then it shows how to formulate standard SQL queries to screen the database for substructures.
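As a rough idea of what such a screening query looks like, here is a minimal sketch. It makes several assumptions: Python's built-in `sqlite3` module stands in for MySQL so the example is self-contained, the `fingerprints` table with a single integer `bits` column is a hypothetical schema rather than the one configured in this series, and the bit values are invented rather than real fingerprint data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table holding two made-up fingerprints."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fingerprints (name TEXT PRIMARY KEY, bits INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO fingerprints (name, bits) VALUES (?, ?)",
        [("benzene", 0b0011), ("phenol", 0b0111)],
    )
    return conn


def screen(conn, query_bits):
    """Return names whose fingerprint contains every bit of the query."""
    rows = conn.execute(
        "SELECT name FROM fingerprints WHERE (bits & ?) = ? ORDER BY name",
        (query_bits, query_bits),
    )
    return [name for (name,) in rows]


conn = build_demo_db()
print(screen(conn, 0b0011))  # ['benzene', 'phenol']
print(screen(conn, 0b0100))  # ['phenol']
```

The `(bits & query) = query` condition is the whole screen: a row survives only if it has every bit the query has. A fingerprint wider than one integer would need several columns, each tested the same way and joined with `AND`.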

Published

For anyone working in a chemistry-related job, chemical databases are ubiquitous. A printed list of IUPAC names, a spreadsheet containing CAS numbers, and a set of hand-drawn structures on index cards are all primitive chemical databases. They aren’t nearly as useful as they could be to either the creator or his/her collaborators, but they are databases nevertheless.