Interaction fingerprint
Calculate a Protein-Ligand Interaction Fingerprint — prolif.fingerprint
In [1]: import MDAnalysis as mda
In [2]: from rdkit.DataStructs import TanimotoSimilarity
In [3]: import prolif
In [4]: u = mda.Universe(prolif.datafiles.TOP, prolif.datafiles.TRAJ)
In [5]: prot = u.select_atoms("protein")
In [6]: lig = u.select_atoms("resname LIG")
In [7]: fp = prolif.Fingerprint(["HBDonor", "HBAcceptor", "PiStacking", "CationPi",
...: "Cationic"])
...:
In [8]: fp.run(u.trajectory[::10], lig, prot)
Out[8]: <prolif.fingerprint.Fingerprint: 5 interactions: ['HBDonor', 'HBAcceptor', 'Cationic', 'CationPi', 'PiStacking'] at 0x7f96d7471b50>
In [9]: df = fp.to_dataframe()
In [10]: df
Out[10]:
ligand LIG1.G ...
protein ASP129.A THR134.A ... PHE330.B PHE331.B PHE351.B
interaction HBDonor Cationic HBDonor ... PiStacking PiStacking PiStacking
Frame ...
0 True True False ... False True False
10 True True False ... True True True
20 True True False ... False True False
30 True True False ... False False True
40 True True False ... False True True
50 True True False ... False True True
60 True True False ... False True False
70 True True False ... False True False
80 True True False ... True True False
90 True True False ... False False True
100 True True False ... False False False
110 True True False ... True True False
120 True True False ... True True True
130 True True True ... False False False
140 True True False ... False False False
150 True True False ... False False False
160 True True False ... False True False
170 True True False ... False True False
180 True True False ... False True False
190 True True False ... False True False
200 True True False ... True False True
210 True True False ... True True True
220 True True False ... False True False
230 True True False ... False True True
240 True True False ... True True True
[25 rows x 9 columns]
In [11]: bv = fp.to_bitvectors()
In [12]: TanimotoSimilarity(bv[0], bv[1])
Out[12]: 0.7142857142857143
- class prolif.fingerprint.Fingerprint(interactions=['Hydrophobic', 'HBDonor', 'HBAcceptor', 'PiStacking', 'Anionic', 'Cationic', 'CationPi', 'PiCation'])[source]
Class that generates an interaction fingerprint between two molecules
While in most cases the fingerprint will be generated between a ligand and a protein, it is also possible to use this class for protein-protein interactions, or host-guest systems.
- Parameters
interactions (list) – List of names (str) of interaction classes as found in the
prolif.interactions
module
- interactions
Dictionnary of interaction functions indexed by class name. For more details, see
prolif.interactions
- Type
Notes
You can use the fingerprint generator in multiple ways:
On a trajectory directly from MDAnalysis objects:
In [1]: prot = u.select_atoms("protein") In [2]: lig = u.select_atoms("resname LIG") In [3]: fp = prolif.Fingerprint(["HBDonor", "HBAcceptor", "PiStacking", ...: "Hydrophobic"]) ...: In [4]: fp.run(u.trajectory[:5], lig, prot) Out[4]: <prolif.fingerprint.Fingerprint: 4 interactions: ['Hydrophobic', 'HBDonor', 'HBAcceptor', 'PiStacking'] at 0x7f96d7045580> In [5]: fp.to_dataframe() Out[5]: ligand LIG1.G ... protein TYR38.A TYR109.A ... THR355.B TYR359.B interaction Hydrophobic Hydrophobic ... Hydrophobic Hydrophobic Frame ... 0 False True ... True True 1 False False ... True True 2 False True ... True True 3 True False ... True True 4 True True ... True True [5 rows x 31 columns]
On two single structures (from RDKit or MDAnalysis):
In [6]: u.trajectory[0] # use coordinates of the first frame Out[6]: < Timestep 0 with unit cell dimensions [ 89.6909 87.95726 102.02041 90. 90. 90. ] > In [7]: prot = prolif.Molecule.from_mda(prot) In [8]: lig = prolif.Molecule.from_mda(lig) In [9]: ifp = fp.generate(lig, prot) In [10]: ifp["Frame"] = 0 In [11]: prolif.to_dataframe([ifp], fp.interactions.keys()) Out[11]: ligand LIG1.G ... protein TYR109.A TRP125.A ... THR355.B TYR359.B interaction Hydrophobic Hydrophobic ... Hydrophobic Hydrophobic Frame ... 0 True True ... True True [1 rows x 27 columns]
On a specific pair of residues for a specific interaction:
# ligand-protein In [12]: fp.hbdonor(lig, prot["ASP129.A"]) Out[12]: True # protein-protein (alpha helix) In [13]: fp.hbacceptor(prot["ASP129.A"], prot["CYS133.A"]) Out[13]: True
You can also obtain the indices of atoms responsible for the interaction:
In [1]: fp.bitvector_atoms(lig, prot["ASP129.A"]) Out[1]: (array([ True, True, False, False]), [8, 52, None, None], [7, 8, None, None]) In [2]: fp.hbdonor.__wrapped__(lig, prot["ASP129.A"]) Out[2]: (True, 52, 8)
- bitvector(res1, res2)[source]
Generates the complete bitvector for the interactions between two residues. To get the indices of atoms responsible for each interaction, see
bitvector_atoms()
- Parameters
res1 (prolif.residue.Residue or prolif.molecule.Molecule) – A residue, usually from a ligand
res2 (prolif.residue.Residue or prolif.molecule.Molecule) – A residue, usually from a protein
- Returns
bitvector – An array storing the encoded interactions between res1 and res2
- Return type
- bitvector_atoms(res1, res2)[source]
Generates the complete bitvector for the interactions between two residues, and returns the indices of atoms responsible for these interactions
- Parameters
res1 (prolif.residue.Residue or prolif.molecule.Molecule) – A residue, usually from a ligand
res2 (prolif.residue.Residue or prolif.molecule.Molecule) – A residue, usually from a protein
- Returns
bitvector (numpy.ndarray) – An array storing the encoded interactions between res1 and res2
lig_atoms (list) – A list containing indices for the ligand atoms responsible for each interaction
pro_atoms (list) – A list containing indices for the protein atoms responsible for each interaction
Changed in version 0.3.2: Atom indices are returned as two separate lists instead of a single list of tuples
- generate(lig, prot, residues=None, return_atoms=False)[source]
Generates the interaction fingerprint between 2 molecules
- Parameters
lig (prolif.molecule.Molecule) – Molecule for the ligand
prot (prolif.molecule.Molecule) – Molecule for the protein
residues (list or "all" or None) – A list of protein residues (
str
,int
orResidueId
) to take into account for the fingerprint extraction. If"all"
, all residues will be used. IfNone
, at each frame theget_residues_near_ligand()
function is used to automatically use protein residues that are distant of 6.0 Å or less from each ligand residue.return_atoms (bool) – For each residue pair and interaction, return indices of atoms responsible for the interaction instead of bits.
- Returns
ifp – A dictionnary indexed by
(ligand, protein)
residue pairs. The format for values will depend onreturn_atoms
:A single bitvector if
return_atoms=False
A tuple of bitvector, ligand atom indices and protein atom indices if
return_atoms=True
- Return type
Example
>>> u = mda.Universe("complex.pdb") >>> lig = prolif.Molecule.from_mda(u, "resname LIG") >>> prot = prolif.Molecule.from_mda(u, "protein") >>> fp = prolif.Fingerprint() >>> ifp = fp.generate(lig, prot)
New in version 0.3.2.
- static list_available(show_hidden=False)[source]
List interactions available to the Fingerprint class.
- Parameters
show_hidden (bool) – Show hidden classes (usually base classes whose name starts with an underscore
_
. Those are not supposed to be called directly)
- run(traj, lig, prot, residues=None, progress=True)[source]
Generates the fingerprint on a trajectory for a ligand and a protein
- Parameters
traj (MDAnalysis.coordinates.base.ProtoReader or MDAnalysis.coordinates.base.FrameIteratorSliced) – Iterate over this Universe trajectory or sliced trajectory object to extract the frames used for the fingerprint extraction
lig (MDAnalysis.core.groups.AtomGroup) – An MDAnalysis AtomGroup for the ligand
prot (MDAnalysis.core.groups.AtomGroup) – An MDAnalysis AtomGroup for the protein (with multiple residues)
residues (list or "all" or None) – A list of protein residues (
str
,int
orResidueId
) to take into account for the fingerprint extraction. If"all"
, all residues will be used. IfNone
, at each frame theget_residues_near_ligand()
function is used to automatically use protein residues that are distant of 6.0 Å or less from each ligand residue.progress (bool) – Use the tqdm package to display a progressbar while running the calculation
- Returns
The Fingerprint instance that generated the fingerprint
- Return type
Example
>>> u = mda.Universe("top.pdb", "traj.nc") >>> lig = u.select_atoms("resname LIG") >>> prot = u.select_atoms("protein") >>> fp = prolif.Fingerprint().run(u.trajectory[:10], lig, prot)
See also
Fingerprint.generate()
to generate the fingerprint between
two single structures. -
Fingerprint.run_from_iterable()
to generate the fingerprint between a protein and a collection of ligands.Changed in version 0.3.2: Moved the
return_atoms
parameter from therun
method to the dataframe conversion code
- run_from_iterable(lig_iterable, prot_mol, residues=None, progress=True)[source]
Generates the fingerprint between a list of ligands and a protein
- Parameters
lig_iterable (list or generator) – An iterable yielding ligands as
Molecule
objectsprot_mol (prolif.molecule.Molecule) – The protein
residues (list or "all" or None) – A list of protein residues (
str
,int
orResidueId
) to take into account for the fingerprint extraction. If"all"
, all residues will be used. IfNone
, at each frame theget_residues_near_ligand()
function is used to automatically use protein residues that are distant of 6.0 Å or less from each ligand residue.progress (bool) –
Use the tqdm package to display a progressbar while running the calculation
- Returns
The Fingerprint instance that generated the fingerprint
- Return type
Example
>>> prot = mda.Universe("protein.pdb") >>> prot = prolif.Molecule.from_mda(prot) >>> lig_iter = prolif.mol2_supplier("docking_output.mol2") >>> fp = prolif.Fingerprint() >>> fp.run_from_iterable(lig_iter, prot)
See also
Fingerprint.generate()
to generate the fingerprint between two single structuresChanged in version 0.3.2: Moved the
return_atoms
parameter from therun_from_iterable
method to the dataframe conversion code
- to_bitvectors()[source]
Converts fingerprints to a list of RDKit ExplicitBitVector
- Returns
bvs – A list of
ExplicitBitVect
for each frame- Return type
- Raises
AttributeError – If the
run()
method hasn’t been used
Example
>>> from rdkit.DataStructs import TanimotoSimilarity >>> bv = fp.to_bitvectors() >>> TanimotoSimilarity(bv[0], bv[1]) 0.42
- to_dataframe(**kwargs)[source]
Converts fingerprints to a pandas DataFrame
- Parameters
dtype (object or None) – Cast the input of each bit in the bitvector to this type. If None, keep the data as is
drop_empty (bool) – Drop columns with only empty values
return_atoms (bool) – For each residue pair and interaction, return indices of atoms responsible for the interaction instead of bits
- Returns
df – A DataFrame storing the results frame by frame and residue by residue. See
to_dataframe()
for more information- Return type
- Raises
AttributeError – If the
run()
method hasn’t been used
Example
>>> df = fp.to_dataframe(dtype=np.uint8) >>> print(df) ligand LIG1.G protein ILE59 ILE55 TYR93 interaction Hydrophobic HBAcceptor Hydrophobic Hydrophobic PiStacking Frame 0 0 1 0 0 0 ...
Detecting interactions between residues — prolif.interactions
You can declare your own interaction class like this:
from scipy.spatial import distance_matrix
class CloseContact(prolif.Interaction):
def detect(self, res1, res2, threshold=2.0):
dist_matrix = distance_matrix(res1.xyz, res2.xyz)
if (dist_matrix <= threshold).any():
return True
Warning
Your custom class must inherit from prolif.interactions.Interaction
The new “CloseContact” class is then automatically added to the list of interactions available to the fingerprint generator:
>>> u = mda.Universe(prolif.datafiles.TOP, prolif.datafiles.TRAJ)
>>> prot = u.select_atoms("protein")
>>> lig = u.select_atoms("resname LIG")
>>> fp = prolif.Fingerprint(interactions="all")
>>> df = fp.run(u.trajectory[0:1], lig, prot).to_dataframe()
>>> df.xs("CloseContact", level=1, axis=1)
ASP129.A VAL201.A
0 1 1
>>> lmol = prolif.Molecule.from_mda(lig)
>>> pmol = prolif.Molecule.from_mda(prot)
>>> fp.closecontact(lmol, pmol["ASP129.A"])
True
- class prolif.interactions.Anionic(cation='[+{1-}]', anion='[-{1-}]', distance=4.5)[source]
Bases:
prolif.interactions._BaseIonic
Ionic interaction between a ligand (anion) and a residue (cation)
- class prolif.interactions.CationPi(cation='[+{1-}]', pi_ring=('a1:a:a:a:a:a:1', 'a1:a:a:a:a:1'), distance=4.5, angles=(0, 30))[source]
Bases:
prolif.interactions._BaseCationPi
Cation-Pi interaction between a ligand (cation) and a residue (aromatic ring)
- class prolif.interactions.Cationic(cation='[+{1-}]', anion='[-{1-}]', distance=4.5)[source]
Bases:
prolif.interactions._BaseIonic
Ionic interaction between a ligand (cation) and a residue (anion)
- class prolif.interactions.EdgeToFace(centroid_distance=6.0, plane_angles=(50, 90), **kwargs)[source]
Bases:
prolif.interactions.PiStacking
Edge-to-face Pi-Stacking interaction between a ligand and a residue
- class prolif.interactions.FaceToFace(centroid_distance=4.5, plane_angles=(0, 40), **kwargs)[source]
Bases:
prolif.interactions.PiStacking
Face-to-face Pi-Stacking interaction between a ligand and a residue
- class prolif.interactions.HBAcceptor(donor='[#7,#8,#16][H]', acceptor='[N,O,F,-{1-};!+{1-}]', distance=3.5, angles=(130, 180))[source]
Bases:
prolif.interactions._BaseHBond
Hbond interaction between a ligand (acceptor) and a residue (donor)
- class prolif.interactions.HBDonor(donor='[#7,#8,#16][H]', acceptor='[N,O,F,-{1-};!+{1-}]', distance=3.5, angles=(130, 180))[source]
Bases:
prolif.interactions._BaseHBond
Hbond interaction between a ligand (donor) and a residue (acceptor)
- class prolif.interactions.Hydrophobic(hydrophobic='[#6,#16,F,Cl,Br,I,At;+0]', distance=4.5)[source]
Bases:
prolif.interactions._Distance
Hydrophobic interaction
- class prolif.interactions.Interaction[source]
Bases:
abc.ABC
Abstract class for interactions
All interaction classes must inherit this class and define a
detect()
method
- class prolif.interactions.MetalAcceptor(metal='[Ca,Cd,Co,Cu,Fe,Mg,Mn,Ni,Zn]', ligand='[O,N,-{1-};!+{1-}]', distance=2.8)[source]
Bases:
prolif.interactions._BaseMetallic
Metallic interaction between a ligand (chelated) and a metal residue
- class prolif.interactions.MetalDonor(metal='[Ca,Cd,Co,Cu,Fe,Mg,Mn,Ni,Zn]', ligand='[O,N,-{1-};!+{1-}]', distance=2.8)[source]
Bases:
prolif.interactions._BaseMetallic
Metallic interaction between a metal and a residue (chelated)
- class prolif.interactions.PiCation(cation='[+{1-}]', pi_ring=('a1:a:a:a:a:a:1', 'a1:a:a:a:a:1'), distance=4.5, angles=(0, 30))[source]
Bases:
prolif.interactions._BaseCationPi
Cation-Pi interaction between a ligand (aromatic ring) and a residue (cation)
- class prolif.interactions.PiStacking(centroid_distance=6.0, shortest_distance=3.8, plane_angles=(0, 90), pi_ring=('a1:a:a:a:a:a:1', 'a1:a:a:a:a:1'))[source]
Bases:
prolif.interactions.Interaction
Pi-Stacking interaction between a ligand and a residue
- Parameters
- class prolif.interactions.XBAcceptor(donor='[#6,#7,Si,F,Cl,Br,I]-[Cl,Br,I,At]', acceptor='[#7,#8,P,S,Se,Te,a;!+{1-}][*]', distance=3.5, axd_angles=(130, 180), xar_angles=(80, 140))[source]
Bases:
prolif.interactions._BaseXBond
Halogen bonding between a ligand (acceptor) and a residue (donor)
- class prolif.interactions.XBDonor(donor='[#6,#7,Si,F,Cl,Br,I]-[Cl,Br,I,At]', acceptor='[#7,#8,P,S,Se,Te,a;!+{1-}][*]', distance=3.5, axd_angles=(130, 180), xar_angles=(80, 140))[source]
Bases:
prolif.interactions._BaseXBond
Halogen bonding between a ligand (donor) and a residue (acceptor)
- class prolif.interactions._BaseCationPi(cation='[+{1-}]', pi_ring=('a1:a:a:a:a:a:1', 'a1:a:a:a:a:1'), distance=4.5, angles=(0, 30))[source]
Bases:
prolif.interactions.Interaction
Base class for cation-pi interactions
- Parameters
cation (str) – SMARTS for cation
pi_ring (tuple) – SMARTS for aromatic rings (5 and 6 membered rings only)
distance (float) – Cutoff distance between the centroid and the cation
angles (tuple) – Min and max values for the angle between the vector normal to the ring plane and the vector going from the centroid to the cation
- class prolif.interactions._BaseHBond(donor='[#7,#8,#16][H]', acceptor='[N,O,F,-{1-};!+{1-}]', distance=3.5, angles=(130, 180))[source]
Bases:
prolif.interactions.Interaction
Base class for Hydrogen bond interactions
- class prolif.interactions._BaseIonic(cation='[+{1-}]', anion='[-{1-}]', distance=4.5)[source]
Bases:
prolif.interactions._Distance
Base class for ionic interactions
- class prolif.interactions._BaseMetallic(metal='[Ca,Cd,Co,Cu,Fe,Mg,Mn,Ni,Zn]', ligand='[O,N,-{1-};!+{1-}]', distance=2.8)[source]
Bases:
prolif.interactions._Distance
Base class for metal complexation
- class prolif.interactions._BaseXBond(donor='[#6,#7,Si,F,Cl,Br,I]-[Cl,Br,I,At]', acceptor='[#7,#8,P,S,Se,Te,a;!+{1-}][*]', distance=3.5, axd_angles=(130, 180), xar_angles=(80, 140))[source]
Bases:
prolif.interactions.Interaction
Base class for Halogen bond interactions
- Parameters
donor (str) – SMARTS for
[Donor]-[Halogen]
acceptor (str) – SMARTS for
[Acceptor]-[R]
distance (float) – Cutoff distance between the halogen and acceptor atoms
axd_angles (tuple) – Min and max values for the
[Acceptor]...[Halogen]-[Donor]
anglexar_angles (tuple) – Min and max values for the
[R]-[Acceptor]...[Halogen]
angle
Notes
Distance and angle adapted from Auffinger et al. PNAS 2004
- class prolif.interactions._Distance(lig_pattern, prot_pattern, distance)[source]
Bases:
prolif.interactions.Interaction
Generic class for distance-based interactions
- class prolif.interactions._InteractionMeta(name, bases, namespace, **kwargs)[source]
Bases:
abc.ABCMeta
Metaclass to register interactions automatically
- prolif.interactions.get_mapindex(res, index)[source]
Get the index of the atom in the original molecule
- Parameters
res (prolif.residue.Residue) – The residue in the protein or ligand
- Returns
mapindex – The index of the atom in the
Molecule
- Return type