What are cheminformatics resources?
May 21, 2022
Books:
- Computational Approaches in Cheminformatics and Bioinformatics — Includes insights from public (NIH), academic, and industrial sources.
- Chemoinformatics for Drug Discovery — Materials about how to use Chemoinformatics strategies to improve drug discovery results.
- Molecular Descriptors for Chemoinformatics — More than 3300 descriptors and related terms for chemoinformatics analysis of chemical compound properties.
Courses:
- Learncheminformatics.com — “Cheminformatics: Navigating the world of chemical data” course at Indiana University.
- Python for chemoinformatics
- TeachOpenCADD — A teaching platform for computer-aided drug design (CADD) using open source packages and data.
- Cheminformatics OLCC — The cheminformatics offering of the Collaborative Intercollegiate Online Chemistry Course (OLCC), from the University of Arkansas at Little Rock, by Robert Belford.
- BigChem — All lectures of BigChem, a Horizon 2020 MSC ITN EID project that provides innovative education in large chemical data analysis.
- Molecular modeling course — by Dr. Jay Ponder, a professor at WashU in St. Louis.
- Simulation in Chemistry and Biochemistry — by Dr. Jay Ponder, a professor at WashU in St. Louis.
Visualization:
- PyMOL — Python-enhanced molecular graphics tool.
- Jmol — Browser-based HTML5 viewer and stand-alone Java viewer for chemical structures in 3D.
- VMD — Molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.
- Chimera — Highly extensible program for interactive molecular visualization and analysis. Source is available.
- ChimeraX — The next-generation molecular visualization program, following UCSF Chimera. Source is available.
- DataWarrior — A program for data visualization and analysis that combines dynamic graphical views and interactive row filtering with chemical intelligence.
Libraries:
General purpose:
- RDKit — Collection of cheminformatics and machine-learning software written in C++ and Python.
- Indigo — Universal molecular toolkit for molecular fingerprinting, substructure search, and molecular visualization. Written in C++, with Java, C#, and Python wrappers.
- CDK (Chemistry Development Kit) — Algorithms for structural chemo- and bioinformatics, implemented in Java.
- ChemmineR — Cheminformatics package for analyzing drug-like small molecule data in R.
- ChemPy — A Python package useful for chemistry (mainly physical/inorganic/analytical chemistry)
- MolecularGraph.jl — A graph-based molecule modeling and chemoinformatics analysis toolkit fully implemented in Julia
- datamol — Molecular manipulation made easy. A light wrapper built on top of RDKit.
- CGRtools — Toolkit for processing molecules, reactions, and condensed graphs of reactions. Can be used for chemical standardization, MCS search, and tautomer generation, with backward compatibility to RDKit and NetworkX.
Format Checking:
- ChEMBL_Structure_Pipeline (formerly standardiser) — Tool designed to provide a simple way of standardising molecules as a prelude to e.g. molecular modelling exercises.
- MolVS — Molecule validation and standardization based on RDKit.
- rd_filters — A script to run structural alerts using the RDKit and ChEMBL
- pdb-tools — A swiss army knife for manipulating and editing PDB files.
Visualization:
- Kekule.js — Front-end JavaScript library for providing the ability to represent, draw, edit, compare and search molecule structures on web browsers.
- 3Dmol.js — An object-oriented, WebGL-based JavaScript library for online molecular visualization.
- JChemPaint — Chemical 2D structure editor application/applet based on the Chemistry Development Kit.
- rdeditor — Simple RDKit molecule editor GUI using PySide.
- nglviewer — Interactive molecular graphics for Jupyter notebooks.
- RDKit.js — Official JavaScript distribution of cheminformatics functionality from the RDKit — a C++ library for cheminformatics.
Molecular Descriptors:
- mordred — Molecular descriptor calculator based on RDKit.
- DescriptaStorus — Descriptor computation (chemistry) and (optional) storage for machine learning.
- mol2vec — Vector representations of molecular substructures.
- Align-it — Align molecules according to their pharmacophores.
- Rcpi — R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions.
Machine Learning:
- DeepChem — Deep learning library for chemistry based on TensorFlow.
- ChemML — A machine learning and informatics program suite for the analysis, mining, and modeling of chemical and materials data (based on TensorFlow).
- OpenChem — OpenChem is a deep learning toolkit for Computational Chemistry with PyTorch backend.
- chainer-chemistry — A Library for Deep Learning in Biology and Chemistry.
- pytorch-geometric — A PyTorch library that provides implementations of many graph convolution algorithms.
- chemmodlab — A Cheminformatics Modeling Laboratory for Fitting and Assessing Machine Learning Models in R.
- Summit — A Python package for optimizing chemical reactions using machine learning (contains 10 algorithms and several benchmarks).
Web APIs:
- webchem — Chemical Information from the Web.
- PubChemPy — Python wrapper for the PubChem PUG REST API.
- ChemSpiPy — Python wrapper for the ChemSpider API.
- CIRpy — Python wrapper for the NCI Chemical Identifier Resolver (CIR).
- Beaker — RDKit and OSRA in the Bottle on Tornado.
- chemminetools — Open source web framework for small molecule analysis based on Django.
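To give a feel for what wrappers such as PubChemPy hide, here is a minimal sketch that only builds a PUG REST request URL for looking up a compound by name. It assumes the publicly documented PUG REST path layout (`compound/name/<name>/property/<list>/<format>`); the helper name `build_property_url` is hypothetical and is not part of any of the libraries listed above. No network request is made.

```python
from urllib.parse import quote

# Base address of the PubChem PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_property_url(name, properties, output="JSON"):
    """Build a PUG REST URL that requests properties for a compound name.

    `build_property_url` is a hypothetical helper for illustration only.
    """
    if not properties:
        raise ValueError("at least one property is required")
    # Percent-encode the name so spaces and slashes are safe in the path.
    encoded_name = quote(name, safe="")
    # PUG REST expects the requested properties as a comma-separated list.
    prop_list = ",".join(properties)
    return (
        f"{PUG_REST_BASE}/compound/name/{encoded_name}"
        f"/property/{prop_list}/{output}"
    )


if __name__ == "__main__":
    print(build_property_url("aspirin", ["MolecularFormula", "MolecularWeight"]))
```

The resulting URL can be opened in a browser or fetched with any HTTP client. In practice a wrapper like PubChemPy is usually the more convenient choice, since it also handles response parsing and errors.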
Blogs:
- Open Source Molecular Modeling — Updateable catalog of open source molecular modeling software.
- PubChem Blog — News, updates and tutorials about PubChem.
- The ChEMBL-og blog — Stories and news from Computational Chemical Biology Group at EMBL-EBI.
- ChEMBL blog — ChEMBL on GitHub.
- SteinBlog — Blog of Christoph Steinbeck, who is the head of cheminformatics and metabolism at the EMBL-EBI.
- Practical Cheminformatics — Blog with in-depth examples of practical application of cheminformatics.
- So much to do, so little time — Trying to squeeze sense out of chemical data — Blog of Rajarshi Guha, who is a research scientist at the NIH National Center for Advancing Translational Sciences. Some older blogs are also available.
- Noel O’Blog — Blog of Noel O’Boyle, who is a Senior Software Engineer at NextMove Software.
- chem-bla-ics — Blog of Egon Willighagen, who is an assistant professor at Maastricht University.
- steeveslab-blog — Some examples using RDKit.
- Macs in Chemistry — A resource for chemists using Apple Macintosh computers.
- DrugDiscovery.NET — Blog of Andreas Bender, who is a Reader for Molecular Informatics at University of Cambridge.
- Is life worth living? — Some examples for cheminformatics libraries.
- Cheminformatics 2.0 — Blog of Alex M. Clark, a research scientist at Collaborative Drug Discovery.
- Depth-First — Blog of Richard L. Apodaca, a chemist living in La Jolla, California.
- Cheminformania — Blog of Esben Jannik Bjerrum, Ph.D., who is a Principal Scientist and a machine learning and AI specialist at AstraZeneca.
I hope these resources are useful to you. Please follow our Medium account and keep in touch with me (Melanee) on GitHub.
Resource:
https://github.com/hsiaoyi0504/awesome-cheminformatics
Writer: Melanee
Contact Melanee: https://github.com/Melanee-Melanee