Complat software training 101

RDKit - transform

Deleting

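from rdkit import Chem
from rdkit.Chem import AllChem

# Remove every match of the carboxylic acid pattern from acetic acid
m = Chem.MolFromSmiles('CC(=O)O')
patt = Chem.MolFromSmarts('C(=O)[OH]')
rm = AllChem.DeleteSubstructs(m, patt)
Chem.MolToSmiles(rm)  # 'C'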

Replacing

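# Replace the amide nitrogen of acetamide with a methoxy group
repl = Chem.MolFromSmiles('OC')
patt = Chem.MolFromSmarts('[$(NC(=O))]')  # matches only the N atom attached to C=O
m = Chem.MolFromSmiles('CC(=O)N')
rms = AllChem.ReplaceSubstructs(m, patt, repl)
rms  # (<rdkit.Chem.rdchem.Mol object at 0x...>,) a tuple of result molecules
Chem.MolToSmiles(rms[0])  # 'COC(C)=O'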

Maximum common substructure (MCS)

from rdkit.Chem import rdFMCS
mol1 = Chem.MolFromSmiles("O=C(NCc1cc(OC)c(O)cc1)CCCC/C=C/C(C)C")
mol2 = Chem.MolFromSmiles("CC(C)CCCCCC(=O)NCC1=CC(=C(C=C1)O)OC")
mol3 = Chem.MolFromSmiles("c1(C=O)cc(OC)c(O)cc1")
mols = [mol1, mol2, mol3]
# Find the largest substructure shared by all three molecules
res = rdFMCS.FindMCS(mols)
res  # <rdkit.Chem.rdFMCS.MCSResult object at 0x...>
res.numAtoms  # 10
res.smartsString  # '[#6]1(-[#6]):[#6]:[#6](-[#8]-[#6]):[#6](:[#6]:[#6]:1)-[#8]'

Fingerprinting and Molecular Similarity

from rdkit import DataStructs
from rdkit.Chem.Fingerprints import FingerprintMols
ms = [Chem.MolFromSmiles('CCOC'), Chem.MolFromSmiles('CCO'), Chem.MolFromSmiles('COC')]
# Topological (path-based) RDKit fingerprints, one per molecule
fps = [FingerprintMols.FingerprintMol(x) for x in ms]

DataStructs.FingerprintSimilarity(fps[0],fps[1]) # 0.6
DataStructs.FingerprintSimilarity(fps[0],fps[2]) # 0.4
DataStructs.FingerprintSimilarity(fps[1],fps[2]) # 0.25
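
FingerprintSimilarity uses the Tanimoto coefficient by default: the number of bits set in both fingerprints divided by the number of bits set in either. The formula can be sketched in plain Python, assuming each fingerprint is represented as a set of on-bit indices. The helper name `tanimoto` and the example bit sets are illustrative only and are not RDKit output.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


# Hypothetical on-bit sets, chosen only to illustrate the arithmetic.
fp_a = {1, 4, 7, 9, 12}
fp_b = {1, 4, 7}
print(tanimoto(fp_a, fp_b))  # 3 shared bits / 5 bits in total = 0.6
```

Other metrics (for example Dice) can be passed to FingerprintSimilarity through its metric argument.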

Morgan / Circular FP

from rdkit.Chem import AllChem
m1 = Chem.MolFromSmiles('Cc1ccccc1')
fp1 = AllChem.GetMorganFingerprint(m1,2)
m2 = Chem.MolFromSmiles('Cc1ncccc1')
fp2 = AllChem.GetMorganFingerprint(m2,2)
DataStructs.DiceSimilarity(fp1,fp2) # 0.55
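
GetMorganFingerprint returns sparse feature counts rather than a bit vector, so Dice similarity is computed on counts: twice the overlap divided by the total count of both fingerprints. A rough pure-Python sketch of that arithmetic follows, assuming each fingerprint is a mapping from feature id to count. The helper name `dice_counts` and the feature ids are hypothetical and not taken from RDKit.

```python
from collections import Counter


def dice_counts(counts_a, counts_b):
    """Dice similarity of two mappings of feature id -> count."""
    a, b = Counter(counts_a), Counter(counts_b)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    # The overlap of a feature is the smaller of its two counts.
    shared = sum(min(a[k], b[k]) for k in a.keys() & b.keys())
    return 2.0 * shared / total


# Hypothetical feature counts, chosen only to illustrate the arithmetic.
fp_a = {101: 2, 202: 1, 303: 1}
fp_b = {101: 1, 202: 1, 404: 2}
print(dice_counts(fp_a, fp_b))  # 2 * (1 + 1) / (4 + 4) = 0.5
```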

# Fixed-length bit vector instead of sparse counts; nBits sets its length
fp1 = AllChem.GetMorganFingerprintAsBitVect(m1,2,nBits=1024)
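
A fixed-length fingerprint has to squeeze large integer feature ids into nBits positions, so different features can end up on the same bit. The sketch below illustrates the idea with plain modulo folding. This is a simplifying assumption made for illustration and may not match RDKit's internals bit for bit; `fold_features` and the second and third ids are made up.

```python
def fold_features(feature_ids, n_bits=1024):
    """Fold arbitrary integer feature ids into a fixed-length bit list."""
    bits = [0] * n_bits
    for fid in feature_ids:
        bits[fid % n_bits] = 1  # distinct ids can collide on the same bit
    return bits


# A large example id plus two made-up ids.
ids = [4048591891, 19, 2048]
bits = fold_features(ids, n_bits=1024)
print(sum(bits))  # 2: the ids 4048591891 and 19 both land on bit 19, 2048 on bit 0
```

A smaller nBits saves memory but raises the chance of such collisions, which is the trade-off to keep in mind when choosing the length.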

Explain FP

m = Chem.MolFromSmiles('c1cccnc1C')
info = {}
# bitInfo records, for each feature id, the (atom index, radius) pairs that set it
fp = AllChem.GetMorganFingerprint(m, 2, bitInfo=info)
info[4048591891]  # ((5, 2),) set by atom 5 at radius 2

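# Bonds within radius 2 of atom 5, extracted as a standalone sub-molecule
env = Chem.FindAtomEnvironmentOfRadiusN(m, 2, 5)
amap = {}
submol = Chem.PathToSubmol(m, env, atomMap=amap)
# amap maps atom indices in m to atom indices in submol (values may differ between RDKit versions)
amap  # {0: 3, 1: 5, 3: 4, 4: 0, 5: 1, 6: 2}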

Chem.MolToSmiles(submol,rootedAtAtom=amap[5],canonical=False) # 'c(nc)(C)cc'

Similarity maps visualize the atomic contributions to the similarity between a molecule and a reference molecule.

Descriptor

Commonly used descriptors include ExactMolWt, HeavyAtomCount, and MolLogP.

from rdkit.Chem import Descriptors
m = Chem.MolFromSmiles('c1ccccc1C(=O)O')
Descriptors.MolLogP(m) # 1.3848

Per-atom Crippen contributions can be drawn as a similarity map:

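from rdkit.Chem import rdMolDescriptors
from rdkit.Chem.Draw import SimilarityMaps
# Each entry is a (logP, molar refractivity) contribution pair for one atom
contribs = rdMolDescriptors._CalcCrippenContribs(m)
fig = SimilarityMaps.GetSimilarityMapFromWeights(m, [x for x, y in contribs], colorMap='jet', contourLines=10)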

Last updated 5 years ago
