Complat software training 101
RDKit - process

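from rdkit import Chem
from rdkit.Chem import AllChem

# Example molecule (an assumption: the outputs below are consistent with oxirane)
m = Chem.MolFromSmiles('C1OC1')

for atom in m.GetAtoms():
    print(atom.GetAtomicNum())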

m.GetBonds()[0].GetBondType() # SINGLE
m.GetAtomWithIdx(0).GetSymbol() # 'C'
m.GetAtomWithIdx(0).GetExplicitValence() # 2

m.GetBondWithIdx(0).GetBeginAtomIdx() # 0
m.GetBondWithIdx(0).GetEndAtomIdx() # 1

m.GetBondBetweenAtoms(0,1).GetBondType() # rdkit.Chem.rdchem.BondType.SINGLE
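
The per-atom and per-bond accessors above combine naturally into a small summary loop. The following is a minimal sketch, assuming oxirane ('C1OC1') as the example molecule; the helper name bond_table is hypothetical and not part of RDKit.

```python
from rdkit import Chem


def bond_table(mol):
    """Return (begin symbol, end symbol, bond type) for every bond in mol."""
    rows = []
    for bond in mol.GetBonds():
        begin = bond.GetBeginAtom().GetSymbol()
        end = bond.GetEndAtom().GetSymbol()
        rows.append((begin, end, bond.GetBondType()))
    return rows


# Assumed example molecule: oxirane (a three-membered C-O-C ring)
mol = Chem.MolFromSmiles('C1OC1')
table = bond_table(mol)
for row in table:
    print(row)
```

Returning the bond type as the RDKit enum value (rather than a string) keeps later comparisons such as `== Chem.BondType.SINGLE` straightforward.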

Ring

m = Chem.MolFromSmiles('OC1C2C1CC2')
m.GetAtomWithIdx(0).IsInRing() # False
m.GetAtomWithIdx(1).IsInRing() # True
m.GetBondWithIdx(1).IsInRing() # True
m.GetAtomWithIdx(1).IsInRingSize(3) # True
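# Smallest set of smallest rings (SSSR)
Chem.GetSymmSSSR(m) # symmetrized SSSR: one sequence of atom indices per ring
Chem.GetSSSR(m) # return type depends on the RDKit version (ring count in older releases, ring list in newer ones)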
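
Ring membership can also be read from the molecule's ring information object. This is a small sketch using the same SMILES as above; it lists each ring and counts how many rings every atom belongs to.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('OC1C2C1CC2')
ring_info = mol.GetRingInfo()

# Number of rings found during sanitization
num_rings = ring_info.NumRings()
print(num_rings)

# Each ring is a tuple of atom indices
ring_sizes = sorted(len(ring) for ring in ring_info.AtomRings())
print(ring_sizes)

# How many rings each atom is a member of, in atom-index order
membership = [ring_info.NumAtomRings(atom.GetIdx()) for atom in mol.GetAtoms()]
print(membership)
```

The two bridging carbons (indices 2 and 3) sit in both the three-membered and the four-membered ring, which is why their membership count is 2.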

Kekulize

m = Chem.MolFromSmiles('c1ccccc1')

m.GetBondWithIdx(0).GetBondType() # rdkit.Chem.rdchem.BondType.AROMATIC
Chem.Kekulize(m)
m.GetBondWithIdx(0).GetBondType() # rdkit.Chem.rdchem.BondType.DOUBLE
m.GetBondWithIdx(1).GetBondType() # rdkit.Chem.rdchem.BondType.SINGLE
m.GetBondWithIdx(1).GetIsAromatic() # True

The bonds are still flagged as aromatic after kekulization unless the flags are cleared:

Chem.Kekulize(m, clearAromaticFlags=True)
m.GetBondWithIdx(0).GetIsAromatic() # False

Sanitizing the molecule restores the aromatic bond types and flags:

Chem.SanitizeMol(m)
m.GetBondWithIdx(0).GetBondType() # rdkit.Chem.rdchem.BondType.AROMATIC

sanitize <-> kekulize
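
Kekulization and sanitization can be treated as a round trip: one writes out explicit single and double bonds, the other perceives aromaticity again. The sketch below illustrates this on benzene; the helper count_bonds is hypothetical and only there to make the bond counts easy to inspect.

```python
from rdkit import Chem


def count_bonds(mol, bond_type):
    """Count the bonds in mol that have the given bond type."""
    return sum(1 for bond in mol.GetBonds() if bond.GetBondType() == bond_type)


mol = Chem.MolFromSmiles('c1ccccc1')

# Aromatic -> Kekule form, dropping the aromatic flags as well
Chem.Kekulize(mol, clearAromaticFlags=True)
n_double = count_bonds(mol, Chem.BondType.DOUBLE)
n_single = count_bonds(mol, Chem.BondType.SINGLE)
print(n_double, n_single)

# Kekule form -> aromatic again
Chem.SanitizeMol(mol)
n_aromatic = count_bonds(mol, Chem.BondType.AROMATIC)
print(n_aromatic)
```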

2D conformation

The 2D coordinates are generated to maximize the clarity of the drawing.

To keep a common core in a consistent orientation, align the depiction with a template:

m = Chem.MolFromSmiles('c1nccc2n1ccc2')
AllChem.Compute2DCoords(m)

template = Chem.MolFromSmiles('c1nccc2n1ccc2')
AllChem.Compute2DCoords(template)
AllChem.GenerateDepictionMatching2DStructure(m,template)
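
To see what the template alignment produces, the generated coordinates can be read back from the conformer. This is a sketch under one assumption: the methyl-substituted molecule 'Cc1nccc2n1ccc2' is an illustrative choice that contains the template as a substructure, not an example from the original notes.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

template = Chem.MolFromSmiles('c1nccc2n1ccc2')
AllChem.Compute2DCoords(template)

# Assumed example: the template core with one extra methyl group
mol = Chem.MolFromSmiles('Cc1nccc2n1ccc2')
AllChem.GenerateDepictionMatching2DStructure(mol, template)

# Read the 2D coordinates back from the single conformer
conf = mol.GetConformer()
coords = []
for i in range(mol.GetNumAtoms()):
    pos = conf.GetAtomPosition(i)
    coords.append((pos.x, pos.y, pos.z))
    print(i, mol.GetAtomWithIdx(i).GetSymbol(), round(pos.x, 2), round(pos.y, 2))
```

GenerateDepictionMatching2DStructure raises an error when the molecule does not contain the template, so a HasSubstructMatch check beforehand is a reasonable guard when processing many molecules.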

3D conformation

m = Chem.AddHs(m)
AllChem.EmbedMolecule(m)
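
Embedding is stochastic and can fail, so it is worth checking the return value. The sketch below assumes a fixed random seed (42 is an arbitrary choice) for reproducibility and adds an optional MMFF refinement step that the original notes do not include.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hydrogens are added first so that the 3D geometry is realistic
mol = Chem.AddHs(Chem.MolFromSmiles('c1nccc2n1ccc2'))

# EmbedMolecule returns the new conformer id, or -1 on failure
conf_id = AllChem.EmbedMolecule(mol, randomSeed=42)
if conf_id < 0:
    raise RuntimeError('3D embedding failed')

# Optional refinement (an addition to the notes):
# 0 = converged, 1 = needs more iterations, -1 = force field could not be set up
status = AllChem.MMFFOptimizeMolecule(mol)
print(conf_id, status)
```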

Pickling

import pickle

m = Chem.MolFromSmiles('c1ccncc1')
pkl = pickle.dumps(m)
m2 = pickle.loads(pkl)

Building a molecule from a pickle is faster than parsing a Mol file or SMILES string. The raw binary data used in pickles can also be obtained directly from the molecule:

binStr = m.ToBinary()
m2 = Chem.Mol(binStr)
Chem.MolToSmiles(m2)
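
The speed claim can be checked locally with a rough timing comparison. This is only a sketch: the repetition count is an arbitrary assumption, and absolute numbers will vary by machine and RDKit version, so no expected timings are shown.

```python
import pickle
import timeit

from rdkit import Chem

smiles = 'c1ccncc1'
mol = Chem.MolFromSmiles(smiles)
binary = mol.ToBinary()
pickled = pickle.dumps(mol)

# Both round trips should give back the same molecule
from_binary = Chem.Mol(binary)
from_pickle = pickle.loads(pickled)

n = 2000  # arbitrary repetition count
t_smiles = timeit.timeit(lambda: Chem.MolFromSmiles(smiles), number=n)
t_binary = timeit.timeit(lambda: Chem.Mol(binary), number=n)
t_pickle = timeit.timeit(lambda: pickle.loads(pickled), number=n)
print(f'SMILES: {t_smiles:.4f}s  binary: {t_binary:.4f}s  pickle: {t_pickle:.4f}s')
```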

Drawing

suppl = Chem.SDMolSupplier('data/cdk2.sdf')
ms = [x for x in suppl if x is not None]
for m in ms:
    AllChem.Compute2DCoords(m)

from rdkit.Chem import Draw
Draw.MolToFile(ms[0],'images/cdk2_mol1.o.png')
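
When the SD file or a PNG backend is not at hand, a depiction can also be produced as an SVG string. This sketch swaps in the rdMolDraw2D SVG drawer instead of Draw.MolToFile and uses pyridine as an assumed stand-in molecule.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import rdMolDraw2D

# Assumed stand-in molecule (the notes draw entries from data/cdk2.sdf)
mol = Chem.MolFromSmiles('c1ccncc1')
AllChem.Compute2DCoords(mol)

# Render to an in-memory SVG document, 300 x 300 pixels
drawer = rdMolDraw2D.MolDraw2DSVG(300, 300)
drawer.DrawMolecule(mol)
drawer.FinishDrawing()
svg = drawer.GetDrawingText()

print(len(svg))
```

The resulting string can be written to a .svg file or embedded in a web page as-is.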

Substructure searching

m = Chem.MolFromSmiles('C1=CC=CC=C1OC')
m.HasSubstructMatch(Chem.MolFromSmiles('COC')) # True
m.HasSubstructMatch(Chem.MolFromSmarts('COC')) # False: in SMARTS, 'C' only matches an aliphatic carbon
m.HasSubstructMatch(Chem.MolFromSmarts('COc')) # True: the ring carbon is aromatic, so the query needs 'c'
m = Chem.MolFromSmiles('c1ccccc1O')
patt = Chem.MolFromSmarts('ccO')
m.HasSubstructMatch(patt) # True
m.GetSubstructMatches(patt) # ((0, 5, 6), (4, 5, 6))
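
A common use of substructure matching is filtering a list of molecules by a SMARTS pattern. This is a minimal sketch; the helper filter_by_smarts and the candidate list are illustrative assumptions.

```python
from rdkit import Chem


def filter_by_smarts(smiles_list, smarts):
    """Return the SMILES strings whose molecules contain the SMARTS pattern."""
    patt = Chem.MolFromSmarts(smarts)
    if patt is None:
        raise ValueError(f'invalid SMARTS: {smarts!r}')
    hits = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        # MolFromSmiles returns None for unparsable input; skip those
        if mol is not None and mol.HasSubstructMatch(patt):
            hits.append(smi)
    return hits


# Assumed candidates: phenol, ethanol, anisole
candidates = ['c1ccccc1O', 'CCO', 'COc1ccccc1']
hits = filter_by_smarts(candidates, 'ccO')
print(hits)  # ['c1ccccc1O', 'COc1ccccc1']
```

Ethanol is rejected because 'ccO' requires the oxygen to be attached to an aromatic carbon.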

By default, stereochemistry is not used in substructure searches.

m = Chem.MolFromSmiles('CC[C@H](F)Cl')
m.HasSubstructMatch(Chem.MolFromSmiles('C[C@@H](F)Cl')) # True
m.HasSubstructMatch(Chem.MolFromSmiles('C[C@@H](F)Cl'),useChirality=True) # False
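
The effect of useChirality is easiest to see by comparing queries with the same, the opposite, and no specified stereochemistry. This sketch reuses the molecule above; the dictionary labels are illustrative only.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CC[C@H](F)Cl')

queries = {
    'same': 'C[C@H](F)Cl',
    'opposite': 'C[C@@H](F)Cl',
    'unspecified': 'CC(F)Cl',
}

results = {
    label: mol.HasSubstructMatch(Chem.MolFromSmiles(smi), useChirality=True)
    for label, smi in queries.items()
}
print(results)  # {'same': True, 'opposite': False, 'unspecified': True}
```

A query without stereo information still matches a chiral molecule; only a query with conflicting stereochemistry is rejected.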

Atom map

qmol = Chem.MolFromSmarts( '[cH0:1][c:2]([cH0])!@[CX3!r:3]=[NX2!r:4]' )
ind_map = {}
for atom in qmol.GetAtoms() :
    map_num = atom.GetAtomMapNum()
    if map_num:
        ind_map[map_num-1] = atom.GetIdx()
ind_map # {0: 0, 1: 1, 2: 3, 3: 4}
map_list = [ind_map[x] for x in sorted(ind_map)]
map_list # [0, 1, 3, 4]

# substructure match
mol = Chem.MolFromSmiles('Cc1cccc(C)c1C(C)=NC')
for match in mol.GetSubstructMatches( qmol ) :
    mas = [match[x] for x in map_list]
    print(mas) # [1, 7, 8, 10]
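
The two steps above (building the map-number ordering, then reordering each match) can be wrapped in one function. This is a sketch; the name mapped_matches is hypothetical.

```python
from rdkit import Chem


def mapped_matches(mol, query):
    """Return each match of query in mol, keeping only the mapped query atoms,
    ordered by atom map number."""
    # map number -> index of the query atom carrying it
    idx_by_map = {
        atom.GetAtomMapNum(): atom.GetIdx()
        for atom in query.GetAtoms()
        if atom.GetAtomMapNum()
    }
    order = [idx_by_map[num] for num in sorted(idx_by_map)]
    return [[match[i] for i in order] for match in mol.GetSubstructMatches(query)]


qmol = Chem.MolFromSmarts('[cH0:1][c:2]([cH0])!@[CX3!r:3]=[NX2!r:4]')
mol = Chem.MolFromSmiles('Cc1cccc(C)c1C(C)=NC')
matches = mapped_matches(mol, qmol)
print(matches)
```

Unmapped query atoms (here the second [cH0]) still take part in the match but are dropped from the returned index lists.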

Last updated 5 years ago

https://www.slideshare.net/baoilleach/we-need-to-talk-about-kekulization-aromaticity-and-smiles