Complat software training 101

RDKit - rxn

Chemical Reactions

Reactions are described in a SMARTS-based language similar to Daylight’s Reaction SMILES.

From a reaction SMARTS template

from rdkit import Chem
from rdkit.Chem import AllChem

rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])-[OD1].[N!H0:3]>>[C:1](=[O:2])[N:3]')
rxn.GetNumProductTemplates() # 1
ps = rxn.RunReactants((Chem.MolFromSmiles('CC(=O)O'),Chem.MolFromSmiles('NC')))
len(ps) # 1 -> one entry for each possible set of products
len(ps[0]) # 1 -> each entry contains one molecule for each product
Chem.MolToSmiles(ps[0][0]) # 'CNC(C)=O'
ps = rxn.RunReactants((Chem.MolFromSmiles('C(COC(=O)O)C(=O)O'),Chem.MolFromSmiles('NC')))
len(ps) # 2
Chem.MolToSmiles(ps[0][0]) # 'CNC(=O)OCCC(=O)O'
Chem.MolToSmiles(ps[1][0]) # 'CNC(=O)CCOC(=O)O'

From an MDL rxn file

rxn = AllChem.ReactionFromRxnFile('data/AmideBond.rxn')
rxn.GetNumReactantTemplates() # 2
rxn.GetNumProductTemplates() # 1
ps = rxn.RunReactants((Chem.MolFromSmiles('CC(=O)O'), Chem.MolFromSmiles('NC')))
len(ps) # 1
Chem.MolToSmiles(ps[0][0]) # 'CNC(C)=O'

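Use canonical SMILES to find the unique set of products. The sample output below comes from the Diels-Alder example in the RDKit documentation, not from the amide reaction above.

rxn = AllChem.ReactionFromSmarts('[C:1]=[C:2].[C:3]=[*:4][*:5]=[C:6]>>[C:1]1[C:2][C:3][*:4]=[*:5][C:6]1')
ps = rxn.RunReactants((Chem.MolFromSmiles('OC=C'), Chem.MolFromSmiles('C=CC(N)=C')))
uniqps = {}
for p in ps:
    smi = Chem.MolToSmiles(p[0])  # the canonical SMILES is the dictionary key
    uniqps[smi] = p[0]

sorted(uniqps.keys()) # ['NC1=CCC(O)CC1', 'NC1=CCCC(O)C1']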

Molecules produced by the chemical reaction processing code are not sanitized.

In the example below the unsanitized product is written in Kekulé form; after Chem.SanitizeMol it is perceived as aromatic.

rxn = AllChem.ReactionFromSmarts('[C:1]=[C:2][C:3]=[C:4].[C:5]=[C:6]>>[C:1]1=[C:2][C:3]=[C:4][C:5]=[C:6]1')
ps = rxn.RunReactants((Chem.MolFromSmiles('C=CC=C'), Chem.MolFromSmiles('C=C')))
Chem.MolToSmiles(ps[0][0]) # 'C1=CC=CC=C1'
p0 = ps[0][0]
Chem.SanitizeMol(p0)
Chem.MolToSmiles(p0) # 'c1ccccc1'
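
The two notes above (deduplicating by canonical SMILES and sanitizing products) can be combined in one helper. This is a minimal sketch: `unique_products` is a hypothetical name, not part of RDKit, and it assumes that products failing sanitization should simply be skipped.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def unique_products(rxn, reactants):
    """Run rxn and return {canonical SMILES: sanitized product Mol}.

    Hypothetical helper: products that fail sanitization are skipped.
    """
    unique = {}
    for product_set in rxn.RunReactants(reactants):
        for mol in product_set:
            # catchErrors=True returns the failing step instead of raising
            failed = Chem.SanitizeMol(mol, catchErrors=True)
            if failed != Chem.SanitizeFlags.SANITIZE_NONE:
                continue
            unique[Chem.MolToSmiles(mol)] = mol
    return unique


amide_rxn = AllChem.ReactionFromSmarts(
    '[C:1](=[O:2])-[OD1].[N!H0:3]>>[C:1](=[O:2])[N:3]')
reactants = (Chem.MolFromSmiles('CC(=O)O'), Chem.MolFromSmiles('NC'))
products = unique_products(amide_rxn, reactants)
print(sorted(products))  # ['CNC(C)=O']
```

Keying the dictionary by SMILES keeps one molecule per distinct product, so symmetric matches of the same template collapse to a single entry.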

Protecting Atoms from Reacting

# Before
rxn = AllChem.ReactionFromRxnFile('data/AmideBond.rxn')
acid = Chem.MolFromSmiles('CC(=O)O')
base = Chem.MolFromSmiles('CC(=O)NCCN')
ps = rxn.RunReactants((acid,base))
len(ps) # 2
Chem.MolToSmiles(ps[0][0]) # 'CC(=O)N(CCN)C(C)=O'
Chem.MolToSmiles(ps[1][0]) # 'CC(=O)NCCNC(C)=O'
# Protect the amide nitrogen so it cannot take part in the reaction
amidep = Chem.MolFromSmarts('[N;$(NC=[O,S])]')
for match in base.GetSubstructMatches(amidep):
    base.GetAtomWithIdx(match[0]).SetProp('_protected','1')
# After
ps = rxn.RunReactants((acid,base))
len(ps) # 1
Chem.MolToSmiles(ps[0][0]) # 'CC(=O)NCCNC(C)=O'
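
The protection step can be wrapped in a small function that leaves the input molecule untouched. This is a sketch under two assumptions: `protect_atoms` is a hypothetical helper name, and the reaction SMARTS from the first section stands in for `data/AmideBond.rxn`, so the example runs without the data file.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def protect_atoms(mol, smarts):
    """Return a copy of mol with the first atom of each SMARTS match protected."""
    protected = Chem.Mol(mol)  # copy so the input molecule is left unchanged
    pattern = Chem.MolFromSmarts(smarts)
    for match in protected.GetSubstructMatches(pattern):
        protected.GetAtomWithIdx(match[0]).SetProp('_protected', '1')
    return protected


# Assumption: this SMARTS stands in for data/AmideBond.rxn
rxn = AllChem.ReactionFromSmarts(
    '[C:1](=[O:2])-[OD1].[N!H0:3]>>[C:1](=[O:2])[N:3]')
acid = Chem.MolFromSmiles('CC(=O)O')
base = Chem.MolFromSmiles('CC(=O)NCCN')

before = rxn.RunReactants((acid, base))
after = rxn.RunReactants((acid, protect_atoms(base, '[N;$(NC=[O,S])]')))
print(len(before), len(after))
```

Working on a copy means the same `base` molecule can still be used unprotected elsewhere.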

Molecule fragmentation

  1. Recap

  2. BRICS
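
Both fragmentation schemes listed above are available under `rdkit.Chem`. The following sketch runs each on one molecule; the test molecule is chosen for illustration and is an assumption, not part of the original notes.

```python
from rdkit import Chem
from rdkit.Chem import BRICS, Recap

mol = Chem.MolFromSmiles('c1ccccc1OCCOC(=O)CC')

# Recap builds a hierarchy of fragments; children are keyed by fragment SMILES
recap_root = Recap.RecapDecompose(mol)
recap_fragments = sorted(recap_root.children.keys())

# BRICS returns a set of fragment SMILES with labelled dummy atoms
brics_fragments = sorted(BRICS.BRICSDecompose(mol))

print(recap_fragments)
print(brics_fragments)
```

In both outputs the `*` dummy atoms mark where bonds were broken.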

Chemical Features and Pharmacophores

Chemical Features

from rdkit import Chem
from rdkit.Chem import ChemicalFeatures
from rdkit import RDConfig
import os
fdefName = os.path.join(RDConfig.RDDataDir,'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdefName)

m = Chem.MolFromSmiles('OCc1ccccc1CN')
feats = factory.GetFeaturesForMol(m)
len(feats) # 8

feats[0].GetFamily() # 'Donor'
feats[0].GetType() # 'SingleAtomDonor'
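
To inspect every feature rather than only the first, the features can be collected into a list. This is a minimal sketch; `summarize_features` is a hypothetical helper name, and it uses the same `BaseFeatures.fdef` definitions as above.

```python
import os

from rdkit import Chem, RDConfig
from rdkit.Chem import ChemicalFeatures

fdef_path = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdef_path)


def summarize_features(mol):
    """Return (family, type, atom indices) for every feature found in mol."""
    return [(f.GetFamily(), f.GetType(), tuple(f.GetAtomIds()))
            for f in factory.GetFeaturesForMol(mol)]


summary = summarize_features(Chem.MolFromSmiles('OCc1ccccc1CN'))
for family, feat_type, atom_ids in summary:
    print(family, feat_type, atom_ids)
```

The atom indices show which atoms of the molecule each feature covers.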

2D Pharmacophore Fingerprints

Combining a set of chemical features with the 2D (topological) distances between them gives a 2D pharmacophore.

Last updated 5 years ago

https://www.rdkit.org/docs/GettingStartedInPython.html#d-pharmacophore-fingerprints
https://www.rdkit.org/docs/RDKit_Book.html