Molecules
Reading RDKit molecules: prolif.rdkitmol
- class prolif.rdkitmol.BaseRDKitMol
Bases: Mol
Base molecular class that behaves like an RDKit Mol with extra attributes (see below). The sole purpose of this class is to define the common API between the Molecule and Residue classes. This class should not be instantiated by users.
- Parameters
mol (rdkit.Chem.rdchem.Mol) – A molecule (protein, ligand, or residue) with a single conformer
- centroid
XYZ coordinates of the centroid of the molecule
- Type
- xyz
XYZ coordinates of all atoms in the molecule
- Type
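To illustrate how the two attributes relate, the sketch below computes a centroid as the unweighted mean of a set of XYZ coordinates. The absence of mass weighting is an assumption, and the code is plain Python rather than anything taken from ProLIF; the `centroid_of` helper and the sample coordinates are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def centroid_of(xyz: Sequence[Point]) -> Point:
    """Return the unweighted mean position of a set of XYZ coordinates."""
    if not xyz:
        raise ValueError("at least one atom position is required")
    n = len(xyz)
    # Average each axis independently (assumption: no mass weighting).
    return (
        sum(p[0] for p in xyz) / n,
        sum(p[1] for p in xyz) / n,
        sum(p[2] for p in xyz) / n,
    )


# Hypothetical coordinates for a three-atom fragment.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
print(centroid_of(coords))
```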
Reading proteins and ligands: prolif.molecule
- class prolif.molecule.Molecule(mol)
Bases: BaseRDKitMol
Main molecule class that behaves like an RDKit Mol with extra attributes (see examples below). The main purpose of this class is to access residues as fragments of the molecule.
- Parameters
mol (rdkit.Chem.rdchem.Mol) – A ligand or protein with a single conformer
Examples
In [1]: import MDAnalysis as mda
In [2]: import prolif
In [3]: u = mda.Universe(prolif.datafiles.TOP, prolif.datafiles.TRAJ)
In [4]: mol = u.select_atoms("protein").convert_to("RDKIT")
In [5]: mol = prolif.Molecule(mol)
In [6]: mol
Out[6]: <prolif.molecule.Molecule with 302 residues and 4988 atoms at 0x7f3685dc76f0>
You can also create a Molecule directly from a Universe:
In [7]: mol = prolif.Molecule.from_mda(u, "protein")
In [8]: mol
Out[8]: <prolif.molecule.Molecule with 302 residues and 4988 atoms at 0x7f3685528e00>
Notes
Residues can be accessed easily in different ways:
In [9]: mol["TYR38.A"]  # by resid string (residue name + number + chain)
Out[9]: <prolif.residue.Residue TYR38.A at 0x7f3685518540>
In [10]: mol[42]  # by index (from 0 to n_residues-1)
Out[10]: <prolif.residue.Residue LEU80.A at 0x7f36854fc5e0>
In [11]: mol[prolif.ResidueId("TYR", 38, "A")]  # by ResidueId
Out[11]: <prolif.residue.Residue TYR38.A at 0x7f3685518540>
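The resid string format (residue name + number + chain) can be pictured with the small parser below. It is a minimal sketch in plain Python, not ProLIF's own parsing code: the `parse_resid` function, the `ParsedResid` tuple, and the regular expression are all hypothetical, and the pattern assumes residue names made of letters only.

```python
import re
from typing import NamedTuple, Optional


class ParsedResid(NamedTuple):
    name: str
    number: int
    chain: Optional[str]


# Letters for the name, digits for the number, optional ".chain" suffix.
# Assumption: residue names containing digits are not handled here.
_RESID_PATTERN = re.compile(r"^([A-Za-z]+)(\d+)(?:\.(\w+))?$")


def parse_resid(resid: str) -> ParsedResid:
    """Split a string such as 'TYR38.A' into name, number, and chain."""
    match = _RESID_PATTERN.match(resid)
    if match is None:
        raise ValueError(f"unrecognised resid string: {resid!r}")
    name, number, chain = match.groups()
    return ParsedResid(name, int(number), chain)


print(parse_resid("TYR38.A"))
```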
See prolif.residue for more information on residues.
- classmethod from_mda(obj, selection=None, **kwargs)
Creates a Molecule from an MDAnalysis object
- Parameters
obj (MDAnalysis.core.universe.Universe or MDAnalysis.core.groups.AtomGroup) – The MDAnalysis object to convert
selection (None or str) – Apply a selection to obj to create an AtomGroup. Uses all atoms in obj if selection=None
**kwargs (object) – Other arguments passed to the RDKitConverter of MDAnalysis
Example
In [1]: mol = prolif.Molecule.from_mda(u, "protein")
In [2]: mol
Out[2]: <prolif.molecule.Molecule with 302 residues and 4988 atoms at 0x7f368552a2f0>
Which is equivalent to:
In [3]: protein = u.select_atoms("protein")
In [4]: mol = prolif.Molecule.from_mda(protein)
In [5]: mol
Out[5]: <prolif.molecule.Molecule with 302 residues and 4988 atoms at 0x7f36855a0590>
- classmethod from_rdkit(mol, resname='UNL', resnumber=1, chain='')
Creates a Molecule from an RDKit molecule
While directly instantiating a molecule with prolif.Molecule(mol) would also work, this method ensures that every atom is linked to an AtomPDBResidueInfo, which is required by ProLIF.
- Parameters
mol (rdkit.Chem.rdchem.Mol) – The input RDKit molecule
resname (str) – The default residue name that is used if none was found
resnumber (int) – The default residue number that is used if none was found
chain (str) – The default chain ID that is used if none was found
Notes
This method only checks the first atom for an existing AtomPDBResidueInfo. If none is found, it patches all atoms with one created from the method's arguments (resname, resnumber, chain).
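One consequence of checking only the first atom is that a partially labelled input is left untouched. The sketch below mimics that behaviour with plain Python stand-ins; `Atom`, `ResidueInfo`, and `patch_if_missing` are hypothetical names used for illustration and are not RDKit or ProLIF classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResidueInfo:
    """Stand-in for the residue metadata attached to an atom."""
    resname: str
    resnumber: int
    chain: str


@dataclass
class Atom:
    """Stand-in for an atom; `info` is None when no metadata is attached."""
    symbol: str
    info: Optional[ResidueInfo] = None


def patch_if_missing(atoms: List[Atom], resname: str = "UNL",
                     resnumber: int = 1, chain: str = "") -> bool:
    """Patch every atom when the first atom has no residue info.

    Returns True if the atoms were patched, False if they were left alone.
    """
    if not atoms or atoms[0].info is not None:
        return False
    for atom in atoms:
        atom.info = ResidueInfo(resname, resnumber, chain)
    return True


# No atom carries metadata: every atom receives the defaults.
unlabelled = [Atom("C"), Atom("O")]
print(patch_if_missing(unlabelled), unlabelled[1].info)

# Only the first atom is inspected, so the second atom stays unlabelled.
partial = [Atom("C", ResidueInfo("LIG", 1, "A")), Atom("O")]
print(patch_if_missing(partial), partial[1].info)
```

If an input may be partially labelled, checking every atom before building a Molecule would be a safer precaution than relying on the first atom alone.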
- class prolif.molecule.mol2_supplier(path, **kwargs)
Bases: Sequence
Supplies molecules, given a path to a MOL2 file
- Parameters
- Returns
suppl – A sequence that provides Molecule objects
- Return type
Sequence
Example
The supplier is typically used like this:
>>> from prolif.molecule import mol2_supplier
>>> lig_suppl = mol2_supplier("docking/output.mol2")
>>> for lig in lig_suppl:
...     # do something with each ligand
...     pass
Changed in version 1.0.0: Molecule suppliers are now sequences that can be reused, indexed, and can return their length, instead of single-use generators.
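The practical difference between a sequence and a single-use generator can be sketched as follows. This is a generic illustration built on `collections.abc.Sequence`, not ProLIF's supplier code; the `ListBackedSupplier` class and its string records are hypothetical stand-ins for parsed ligands.

```python
from collections.abc import Sequence


class ListBackedSupplier(Sequence):
    """Hypothetical supplier that builds items on demand from stored records."""

    def __init__(self, records):
        self._records = list(records)

    def __len__(self):
        return len(self._records)

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self._build(r) for r in self._records[index]]
        return self._build(self._records[index])

    @staticmethod
    def _build(record):
        # Placeholder for the expensive step (e.g. parsing one ligand).
        return record.upper()


names = ["lig_a", "lig_b", "lig_c"]

suppl = ListBackedSupplier(names)
first_pass = list(suppl)
second_pass = list(suppl)  # a Sequence can be iterated again

gen = (name.upper() for name in names)
gen_first = list(gen)
gen_second = list(gen)  # a generator is exhausted after one pass

print(len(suppl), suppl[1], first_pass == second_pass, gen_second)
```

Subclassing `Sequence` only requires `__len__` and `__getitem__`; iteration, membership tests, and `reversed()` come from the abstract base class.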
- class prolif.molecule.pdbqt_supplier(paths, template, converter_kwargs=None, **kwargs)
Bases: Sequence
Supplies molecules, given paths to PDBQT files
- Parameters
paths (list) – A list (or any iterable) of PDBQT files
template (rdkit.Chem.rdchem.Mol) – A template molecule with the correct bond orders and charges. It must match exactly the molecule inside the PDBQT file.
converter_kwargs (dict) – Keyword arguments passed to the RDKitConverter of MDAnalysis
resname (str) – Residue name for every ligand
resnumber (int) – Residue number for every ligand
chain (str) – Chain ID for every ligand
- Returns
suppl – A sequence that provides Molecule objects
- Return type
Sequence
Example
The supplier is typically used like this:
>>> import glob
>>> from prolif.molecule import pdbqt_supplier
>>> pdbqts = glob.glob("docking/ligand1/*.pdbqt")
>>> lig_suppl = pdbqt_supplier(pdbqts, template)
>>> for lig in lig_suppl:
...     # do something with each ligand
...     pass
Changed in version 1.0.0: Molecule suppliers are now sequences that can be reused, indexed, and can return their length, instead of single-use generators.
Changed in version 1.1.0: Because the PDBQT supplier needs to strip hydrogen atoms before assigning bond orders from the template, it used to replace them entirely with hydrogens containing new coordinates. It now directly uses the hydrogen atoms present in the file and won’t add explicit ones anymore, to prevent the fingerprint from detecting hydrogen bonds with “random” hydrogen atoms. A lot of irrelevant warnings and logs have been disabled as well.
- class prolif.molecule.sdf_supplier(path, **kwargs)
Bases: Sequence
Supplies molecules, given a path to an SDFile
- Parameters
- Returns
suppl – A sequence that provides Molecule objects. Can be indexed
- Return type
Sequence
Example
The supplier is typically used like this:
>>> from prolif.molecule import sdf_supplier
>>> lig_suppl = sdf_supplier("docking/output.sdf")
>>> for lig in lig_suppl:
...     # do something with each ligand
...     pass
Changed in version 1.0.0: Molecule suppliers are now sequences that can be reused, indexed, and can return their length, instead of single-use generators.
- class prolif.residue.Residue(mol)
Bases: BaseRDKitMol
A class for residues as RDKit molecules
- Parameters
mol (rdkit.Chem.rdchem.Mol) – The residue as an RDKit molecule
- resid
The residue identifier
Notes
The name of the residue can be converted to a string by using str(Residue)