Interaction fingerprint#
Calculate a Protein-Ligand Interaction Fingerprint — prolif.fingerprint#
In [1]: import MDAnalysis as mda
In [2]: from rdkit.DataStructs import TanimotoSimilarity
In [3]: import prolif
In [4]: u = mda.Universe(prolif.datafiles.TOP, prolif.datafiles.TRAJ)
In [5]: prot = u.select_atoms("protein")
In [6]: lig = u.select_atoms("resname LIG")
In [7]: fp = prolif.Fingerprint(["HBDonor", "HBAcceptor", "PiStacking", "CationPi",
...: "Cationic"])
...:
In [8]: fp.run(u.trajectory[:5], lig, prot)
Out[8]: <prolif.fingerprint.Fingerprint: 5 interactions: ['HBAcceptor', 'HBDonor', 'Cationic', 'CationPi', 'PiStacking'] at 0x7f3685597450>
In [9]: df = fp.to_dataframe()
In [10]: df
Out[10]:
ligand LIG1.G
protein ASP129.A VAL201.A SER212.A PHE331.B
interaction HBDonor Cationic HBAcceptor HBDonor PiStacking
Frame
0 True True True True True
1 True True True False True
2 True True True False False
3 True True True False True
4 True True True False False
In [11]: bv = fp.to_bitvectors()
In [12]: TanimotoSimilarity(bv[0], bv[1])
Out[12]: 0.8
# save results
In [13]: fp.to_pickle("fingerprint.pkl")
# load
In [14]: fp = prolif.Fingerprint.from_pickle("fingerprint.pkl")
- class prolif.fingerprint.Fingerprint(interactions=['Hydrophobic', 'HBDonor', 'HBAcceptor', 'PiStacking', 'Anionic', 'Cationic', 'CationPi', 'PiCation', 'VdWContact'], parameters=None, count=False, vicinity_cutoff=6.0)[source]#
Class that generates an interaction fingerprint between two molecules
While in most cases the fingerprint will be generated between a ligand and a protein, it is also possible to use this class for protein-protein interactions, or host-guest systems.
- Parameters
interactions (list) – List of names (str) of interaction classes as found in the
prolif.interactionsmodule.parameters (dict, optional) – New parameters for the interactions. Mapping between an interaction name and a dict of parameters as they appear in the interaction class.
count (bool) –
For a given interaction class and pair of residues, there might be multiple combinations of atoms that satisfy the interaction constraints. This parameter controls how the fingerprint treats these different combinations:
False: returns the first combination that satisfy the constraints (fast)True: returns all the combinations (slow)
vicinity_cutoff (float) – Automatically restrict the analysis to residues within this range of the ligand. This parameter is ignored if the
residuesparameter of therunmethods is set to anything other thanNone.
- interactions#
Dictionary of interaction classes indexed by class name. For more details, see
prolif.interactions- Type
- vicinity_cutoff#
Used when calling
prolif.utils.get_residues_near_ligand().- Type
- ifp#
Dict of interaction fingerprints in a sparse format for the given trajectory or docking poses:
{<frame number>: <IFP>}. See theIFPclass for more information.- Type
dict, optional
- Raises
NameError – Unknown interaction in the
interactionsorparametersparameters.
Notes
You can use the fingerprint generator in multiple ways:
On a trajectory directly from MDAnalysis objects:
In [1]: prot = u.select_atoms("protein") In [2]: lig = u.select_atoms("resname LIG") In [3]: fp = prolif.Fingerprint(["HBDonor", "HBAcceptor", "PiStacking", ...: "Hydrophobic"]) ...: In [4]: fp.run(u.trajectory[:5], lig, prot) Out[4]: <prolif.fingerprint.Fingerprint: 4 interactions: ['Hydrophobic', 'HBAcceptor', 'HBDonor', 'PiStacking'] at 0x7f368507e390> In [5]: fp.to_dataframe() Out[5]: ligand LIG1.G ... protein TYR109.A TRP125.A ... MET337.B PHE351.B interaction Hydrophobic Hydrophobic ... Hydrophobic Hydrophobic Frame ... 0 True True ... True True 1 False True ... True True 2 True True ... False True 3 False True ... True True 4 True True ... True True [5 rows x 19 columns]
On two single structures (from RDKit or MDAnalysis):
In [6]: u.trajectory[0] # use coordinates of the first frame Out[6]: < Timestep 0 with unit cell dimensions [ 89.6909 87.95726 102.02041 90. 90. 90. ] > In [7]: prot = prolif.Molecule.from_mda(prot) In [8]: lig = prolif.Molecule.from_mda(lig) In [9]: ifp = fp.generate(lig, prot, metadata=True) In [10]: prolif.to_dataframe({0: ifp}, fp.interactions) Out[10]: ligand LIG1.G ... protein TYR109.A TRP125.A ... MET337.B PHE351.B interaction Hydrophobic Hydrophobic ... Hydrophobic Hydrophobic Frame ... 0 True True ... True True [1 rows x 18 columns]
On a specific pair of residues for a specific interaction:
# ligand-protein In [11]: next(fp.hbdonor(lig, prot["ASP129.A"])) Out[11]: True # protein-protein In [12]: next(fp.hbacceptor(prot["ASP129.A"], prot["CYS133.A"])) Out[12]: True
You can also obtain the indices of atoms responsible for the interaction:
In [13]: fp.metadata(lig, prot["ASP129.A"]) Out[13]: {'HBDonor': ({'indices': {'ligand': (13, 52), 'protein': (8,)}, 'parent_indices': {'ligand': (13, 52), 'protein': (1489,)}, 'distance': 2.7534162113079534, 'DHA_angle': 158.09471525304645},)} In [14]: next(fp.hbdonor(lig, prot["ASP129.A"], metadata=True), None) Out[14]: {'indices': {'ligand': (13, 52), 'protein': (8,)}, 'parent_indices': {'ligand': (13, 52), 'protein': (1489,)}, 'distance': 2.7534162113079534, 'DHA_angle': 158.09471525304645}
Changed in version 1.0.0: Added pickle support
Changed in version 2.0.0: Changed the format of the
ifpattribute to be a dictionary containing more complete interaction metadata instead of just atom indices. Removed thereturn_atomsargument into_dataframe(). Users should directly useifpinstead. Added theplot_lignetwork()method to generate theLigNetworkplot. Replaced theFingerprint.bitvector_atomsmethod withFingerprint.metadata(). Added avicinity_cutoffparameter controlling the distance used to automatically restrict the IFP calculation to residues within the specified range of the ligand. Removed the__wrapped__attribute on interaction methods that are available from the fingerprint object. These methods now accept ametadataparameter instead. Added theparametersargument to set the interaction classes parameters without having to create a new class. Added thecountargument to control for a given pair of residues whether to return all occurences of an interaction or only the first one.- bitvector(res1, res2)[source]#
Generates the complete bitvector for the interactions between two residues. To access metadata for each interaction, see
metadata().- Parameters
res1 (prolif.residue.Residue) – A residue, usually from a ligand
res2 (prolif.residue.Residue) – A residue, usually from a protein
- Returns
bitvector – An array storing the encoded interactions between res1 and res2. Depending on
Fingerprint.count, the dtype of the array will be bool or uint8.- Return type
- classmethod from_pickle(path_or_bytes)[source]#
Creates a fingerprint object from a pickle dump.
- Parameters
path_or_bytes (str, pathlib.Path or bytes) – The path to the pickle file, or bytes corresponding to a pickle dump.
See also
to_pickle()for creating the pickle dump.New in version 1.0.0.
Changed in version 2.0.0: Switched to dill instead of pickle
- generate(lig, prot, residues=None, metadata=False)[source]#
Generates the interaction fingerprint between 2 molecules
- Parameters
lig (prolif.molecule.Molecule) – Molecule for the ligand
prot (prolif.molecule.Molecule) – Molecule for the protein
residues (list or "all" or None) – A list of protein residues (
str,intorResidueId) to take into account for the fingerprint extraction. If"all", all residues will be used. IfNone, at each frame theget_residues_near_ligand()function is used to automatically use protein residues that are distant of 6.0 Å or less from each ligand residue (seevicinity_cutoff)metadata (bool) – For each residue pair and interaction, return an interaction metadata dictionary instead of bits.
- Returns
ifp – A dictionary indexed by
(ligand, protein)residue pairs. The format for values will depend onmetadata:A single numpy array if
metadata=FalseA sparse dictionary of metadata tuples indexed by interaction name if
metadata=True
- Return type
Example
>>> u = mda.Universe("complex.pdb") >>> lig = prolif.Molecule.from_mda(u, "resname LIG") >>> prot = prolif.Molecule.from_mda(u, "protein") >>> fp = prolif.Fingerprint() >>> ifp = fp.generate(lig, prot)
New in version 0.3.2.
Changed in version 2.0.0:
return_atomsreplaced bymetadata, and it now returns a sparse dictionary of metadata tuples indexed by interaction name instead of a tuple of arrays.
- static list_available(show_hidden=False)[source]#
List interactions available to the Fingerprint class.
- Parameters
show_hidden (bool) – Show hidden classes (base classes meant to be inherited from to create custom interactions).
- metadata(res1, res2)[source]#
Generates a metadata dictionary for the interactions between two residues.
- Parameters
res1 (prolif.residue.Residue) – A residue, usually from a ligand
res2 (prolif.residue.Residue) – A residue, usually from a protein
- Returns
metadata – Dict containing tuples of metadata dictionaries indexed by interaction name. If a specific interaction is not present between residues, it is filtered out of the dictionary.
- Return type
Changed in version 0.3.2: Atom indices are returned as two separate lists instead of a single list of tuples
Changed in version 2.0.0: Returns a dictionnary with all available metadata for each interaction, rather than just atom indices.
- plot_3d(ligand_mol: Molecule, protein_mol: Molecule, *, frame: int, size: Tuple[int, int] = (650, 600), display_all: bool = False)[source]#
Generate and display the complex in 3D with py3Dmol from a fingerprint object that has been used to run an analysis.
- Parameters
ligand_mol (Molecule) – The ligand molecule to display.
protein_mol (Molecule) – The protein molecule to display.
frame (int) – The frame number chosen to select which interactions are going to be displayed.
size (Tuple[int, int] = (650, 600)) – The size of the py3Dmol widget view.
display_all (bool) – Display all occurences for a given pair of residues and interaction, or only the shortest one. Not relevant if
count=Falsein theFingerprintobject.
See also
- plot_barcode(*, figsize: Tuple[int, int] = (8, 10), dpi: int = 100, interactive: bool = False, n_frame_ticks: int = 10, residues_tick_location: Literal['top', 'bottom'] = 'top', xlabel: str = 'Frame', subplots_kwargs: Optional[dict] = None, tight_layout_kwargs: Optional[dict] = None)[source]#
Generate and display a
Barcodeplot from a fingerprint object that has been used to run an analysis.- Parameters
figsize (Tuple[int, int] = (8, 10)) – Size of the matplotlib figure.
dpi (int = 100) – DPI used for the matplotlib figure.
interactive (bool) – Add hover interactivity to the plot (only relevant for notebooks). You may need to add
%matplotlib notebookor%matplotlib ipymplfor it to work as expected.n_frame_ticks (int = 10) – Number of ticks on the X axis. May use ±1 tick to have them evenly spaced.
residues_tick_location (Literal["top", "bottom"] = "top") – Whether the Y ticks appear at the top or at the bottom of the series of interactions of each residue.
xlabel (str = "Frame") – Label displayed for the X axis.
subplots_kwargs (Optional[dict] = None) – Other parameters passed to
matplotlib.pyplot.subplots().tight_layout_kwargs (Optional[dict] = None) – Other parameters passed to
matplotlib.figure.Figure.tight_layout().
See also
- plot_lignetwork(ligand_mol, *, kind: Literal['aggregate', 'frame'] = 'aggregate', frame: int = 0, display_all: bool = False, threshold: float = 0.3, use_coordinates: bool = False, flatten_coordinates: bool = True, kekulize: bool = False, molsize: int = 35, rotation: float = 0, carbon: float = 0.16, width: str = '100%', height: str = '500px')[source]#
Generate and display a
LigNetworkplot from a fingerprint object that has been used to run an analysis.- Parameters
ligand_mol (rdkit.Chem.rdChem.Mol) – Ligand molecule.
kind (str) – One of
"aggregate"or"frame".frame (int) – Frame number (see
ifp). Only applicable forkind="frame".display_all (bool) – Display all occurences for a given pair of residues and interaction, or only the shortest one. Only applicable for
kind="frame". Not relevant ifcount=Falsein theFingerprintobject.threshold (float) – Frequency threshold, between 0 and 1. Only applicable for
kind="aggregate".use_coordinates (bool) – If
True, uses the coordinates of the molecule directly, otherwise generates 2D coordinates from scratch. See alsoflatten_coordinates.flatten_coordinates (bool) – If this is
Trueanduse_coordinates=True, generates 2D coordinates that are constrained to fit the 3D conformation of the ligand as best as possible.kekulize (bool) – Kekulize the ligand.
molsize (int) – Multiply the coordinates by this number to create a bigger and more readable depiction.
rotation (int) – Rotate the structure on the XY plane.
carbon (float) – Size of the carbon atom dots on the depiction. Use 0 to hide the carbon dots.
width (str) – Width of the IFrame window.
height (str) – Height of the IFrame window.
See also
- run(traj, lig, prot, *, residues=None, converter_kwargs=None, progress=True, n_jobs=None)[source]#
Generates the fingerprint on a trajectory for a ligand and a protein
- Parameters
traj (MDAnalysis.coordinates.base.ProtoReader or MDAnalysis.coordinates.base.FrameIteratorSliced) – Iterate over this Universe trajectory or sliced trajectory object to extract the frames used for the fingerprint extraction
lig (MDAnalysis.core.groups.AtomGroup) – An MDAnalysis AtomGroup for the ligand
prot (MDAnalysis.core.groups.AtomGroup) – An MDAnalysis AtomGroup for the protein (with multiple residues)
residues (list or "all" or None) – A list of protein residues (
str,intorResidueId) to take into account for the fingerprint extraction. If"all", all residues will be used. IfNone, at each frame theget_residues_near_ligand()function is used to automatically use protein residues that are distant of 6.0 Å or less from each ligand residue.converter_kwargs (tuple[dict, dict], optional) – Tuple of kwargs passed to the underlying
RDKitConverterfrom MDAnalysis: the first for the ligand, and the second for the proteinprogress (bool) – Display a
tqdmprogressbar while running the calculationn_jobs (int or None) – Number of processes to run in parallel. If
n_jobs=None, the analysis will use all available CPU threads, while ifn_jobs=1, the analysis will run in serial.
- Raises
ValueError – If
n_jobs <= 0- Returns
The Fingerprint instance that generated the fingerprint
- Return type
Example
>>> u = mda.Universe("top.pdb", "traj.nc") >>> lig = u.select_atoms("resname LIG") >>> prot = u.select_atoms("protein") >>> fp = prolif.Fingerprint().run(u.trajectory[:10], lig, prot)
See also
Fingerprint.generate()to generate the fingerprint between two single structures.Fingerprint.run_from_iterable()to generate the fingerprint between a protein and a collection of ligands.
Changed in version 0.3.2: Moved the
return_atomsparameter from therunmethod to the dataframe conversion codeChanged in version 1.0.0: Added support for multiprocessing
Changed in version 1.1.0: Added support for passing kwargs to the RDKitConverter through the
converter_kwargsparameterChanged in version 2.0.0: Changed the format of the
ifpattribute to be a dictionary containing more complete interaction metadata instead of just atom indices.
- run_from_iterable(lig_iterable, prot_mol, *, residues=None, progress=True, n_jobs=None)[source]#
Generates the fingerprint between a list of ligands and a protein
- Parameters
lig_iterable (list or generator) – An iterable yielding ligands as
Moleculeobjectsprot_mol (prolif.molecule.Molecule) – The protein
residues (list or "all" or None) – A list of protein residues (
str,intorResidueId) to take into account for the fingerprint extraction. If"all", all residues will be used. IfNone, at each frame theget_residues_near_ligand()function is used to automatically use protein residues that are distant of 6.0 Å or less from each ligand residue.progress (bool) – Display a
tqdmprogressbar while running the calculationn_jobs (int or None) – Number of processes to run in parallel. If
n_jobs=None, the analysis will use all available CPU threads, while ifn_jobs=1, the analysis will run in serial.
- Raises
ValueError – If
n_jobs <= 0- Returns
The Fingerprint instance that generated the fingerprint
- Return type
Example
>>> prot = mda.Universe("protein.pdb") >>> prot = prolif.Molecule.from_mda(prot) >>> lig_iter = prolif.mol2_supplier("docking_output.mol2") >>> fp = prolif.Fingerprint() >>> fp.run_from_iterable(lig_iter, prot)
See also
Fingerprint.generate()to generate the fingerprint between two single structuresChanged in version 0.3.2: Moved the
return_atomsparameter from therun_from_iterablemethod to the dataframe conversion codeChanged in version 1.0.0: Added support for multiprocessing
Changed in version 2.0.0: Changed the format of the
ifpattribute to be a dictionary containing more complete interaction metadata instead of just atom indices.
- to_bitvectors()[source]#
Converts fingerprints to a list of RDKit ExplicitBitVector
- Returns
bvs – A list of
ExplicitBitVectfor each frame- Return type
- Raises
AttributeError – If the
run()method hasn’t been used
Example
>>> from rdkit.DataStructs import TanimotoSimilarity >>> bv = fp.to_bitvectors() >>> TanimotoSimilarity(bv[0], bv[1]) 0.42
- to_countvectors()[source]#
Converts fingerprints to a list of RDKit UIntSparseIntVect
- Returns
cvs – A list of
UIntSparseIntVectfor each frame- Return type
- Raises
AttributeError – If the
run()method hasn’t been used
Example
>>> from rdkit.DataStructs import TanimotoSimilarity >>> cv = fp.to_countvectors() >>> TanimotoSimilarity(cv[0], cv[1]) 0.42
- to_dataframe(*, count=None, dtype=None, drop_empty=True, index_col='Frame')[source]#
Converts fingerprints to a pandas DataFrame
- Parameters
- Returns
df – A DataFrame storing the results frame by frame and residue by residue. See
to_dataframe()for more information- Return type
- Raises
AttributeError – If the
run()method hasn’t been used
Example
>>> df = fp.to_dataframe(dtype=np.uint8) >>> print(df) ligand LIG1.G protein ILE59 ILE55 TYR93 interaction Hydrophobic HBAcceptor Hydrophobic Hydrophobic PiStacking Frame 0 0 1 0 0 0 ...
Changed in version 2.0.0: Removed the
return_atomsparameter. You can access more metadata information directly throughifp. Added thecountparameter.
- to_pickle(path=None)[source]#
Dumps the fingerprint object as a pickle.
- Parameters
path (str, pathlib.Path or None) – Output path. If
None, the method returns the pickle as bytes.- Returns
obj –
Noneifpathis set, else the bytes corresponding to the pickle- Return type
None or bytes
Example
>>> dump = fp.to_pickle() >>> saved_fp = Fingerprint.from_pickle(dump) >>> fp.to_pickle("data/fp.pkl") >>> saved_fp = Fingerprint.from_pickle("data/fp.pkl")
See also
from_pickle()for loading the pickle dump.New in version 1.0.0.
Changed in version 2.0.0: Switched to dill instead of pickle
Storing interactions — prolif.ifp#
- class prolif.ifp.IFP(dict=None, /, **kwargs)[source]#
Mapping between residue pairs and interaction fingerprint.
Notes
This class provides an easy way to access interaction data from the
ifpdictionary. This class is a dictionary formatted as:{ tuple[<residue_id>, <residue_id>]: { <interaction name>: tuple[{ "indices": { "ligand": tuple[int, ...], "protein": tuple[int, ...] }, "parent_indices": { "ligand": tuple[int, ...], "protein": tuple[int, ...] }, <other metadata>: <value> }, ...] } }Here
<residue_id>corresponds to aResidueIdobject. For convenience, one can directly use strings rather thanResidueIdobjects when indexing the IFP, e.g.ifp[("LIG1.G", "ASP129.A")]. You can also use a singleResidueIdor string to return a filtered IFP that only contains interactions with the specified residue, e.g.ifp["ASP129.A"].
Detecting interactions between residues — prolif.interactions.interactions#
Note that some of the SMARTS patterns used in the interaction classes are inspired from Pharmit and RDKit.
- class prolif.interactions.interactions.Anionic(cation='[+{1-},$([NX3&!$([NX3]-O)]-[C]=[NX3+])]', anion='[-{1-},$(O=[C,S,P]-[O-])]', distance=4.5)#
Bases:
CationicIonic interaction between a ligand (anion) and a residue (cation)
Changed in version 1.1.0: Handles resonance forms for common acids, amidine and guanidine.
- class prolif.interactions.interactions.CationPi(cation='[+{1-},$([NX3&!$([NX3]-O)]-[C]=[NX3+])]', pi_ring=('[a;r6]1:[a;r6]:[a;r6]:[a;r6]:[a;r6]:[a;r6]:1', '[a;r5]1:[a;r5]:[a;r5]:[a;r5]:[a;r5]:1'), distance=4.5, angle=(0, 30))[source]#
Bases:
InteractionCation-Pi interaction between a ligand (cation) and a residue (aromatic ring)
- Parameters
cation (str) – SMARTS for cation
pi_ring (tuple) – SMARTS for aromatic rings (5 and 6 membered rings only)
distance (float) – Cutoff distance between the centroid and the cation
angle (tuple) – Min and max values for the angle between the vector normal to the ring plane and the vector going from the centroid to the cation
Changed in version 1.1.0: Handles resonance forms for amidine and guanidine as cations.
Changed in version 2.0.0:
anglesparameter renamed toangle.
- class prolif.interactions.interactions.Cationic(cation='[+{1-},$([NX3&!$([NX3]-O)]-[C]=[NX3+])]', anion='[-{1-},$(O=[C,S,P]-[O-])]', distance=4.5)[source]#
Bases:
DistanceIonic interaction between a ligand (cation) and a residue (anion)
Changed in version 1.1.0: Handles resonance forms for common acids, amidine and guanidine.
- class prolif.interactions.interactions.EdgeToFace(distance=6.5, plane_angle=(50, 90), normal_to_centroid_angle=(0, 30), pi_ring=('[a;r6]1:[a;r6]:[a;r6]:[a;r6]:[a;r6]:[a;r6]:1', '[a;r5]1:[a;r5]:[a;r5]:[a;r5]:[a;r5]:1'), intersect_radius=1.5)[source]#
Bases:
BasePiStackingEdge-to-face Pi-Stacking interaction between a ligand and a residue
- Parameters
distance (float) – Cutoff distance between each rings centroid
plane_angles (tuple) – Min and max values for the angle between the ring planes
normal_centroids_angles (tuple) – Min and max angles allowed between the vector normal to a ring’s plane, and the vector between the centroid of both rings.
pi_ring (list) – List of SMARTS for aromatic rings
intersect_radius (float) – Used to check whether the intersect point between ring planes falls within
intersect_radiusof the opposite ring’s centroid.
Changed in version 1.1.0: In addition to the changes made to the base pi-stacking interaction, this implementation makes sure that the intersection between the perpendicular ring’s plane and the other’s plane falls inside the ring.
Changed in version 2.0.0: Renamed
centroid_distanceto distance, addedintersect_radiusparameter.
- class prolif.interactions.interactions.FaceToFace(distance=5.5, plane_angle=(0, 35), normal_to_centroid_angle=(0, 33), pi_ring=('[a;r6]1:[a;r6]:[a;r6]:[a;r6]:[a;r6]:[a;r6]:1', '[a;r5]1:[a;r5]:[a;r5]:[a;r5]:[a;r5]:1'))[source]#
Bases:
BasePiStackingFace-to-face Pi-Stacking interaction between a ligand and a residue
- Parameters
distance (float) – Cutoff distance between each rings centroid
plane_angles (tuple) – Min and max values for the angle between the ring planes
normal_centroids_angles (tuple) – Min and max angles allowed between the vector normal to a ring’s plane, and the vector between the centroid of both rings.
pi_ring (list) – List of SMARTS for aromatic rings
Changed in version 2.0.0: Renamed
centroid_distanceto distance.
- class prolif.interactions.interactions.HBAcceptor(acceptor='[#7&!$([nX3])&!$([NX3]-*=[O,N,P,S])&!$([NX3]-[a])&!$([Nv4&+1]),O&!$([OX2](C)C=O)&!$(O(~a)~a)&!$(O=N-*)&!$([O-]-N=O),o+0,F&$(F-[#6])&!$(F-[#6][F,Cl,Br,I])]', donor='[$([O,S;+0]),$([N;v3,v4&+1]),n+0]-[H]', distance=3.5, DHA_angle=(130, 180))[source]#
Bases:
SingleAngleHbond interaction between a ligand (acceptor) and a residue (donor)
- Parameters
Changed in version 1.1.0: The initial SMARTS pattern was too broad.
Changed in version 2.0.0:
anglesparameter renamed toDHA_angle.
- class prolif.interactions.interactions.HBDonor(acceptor='[#7&!$([nX3])&!$([NX3]-*=[O,N,P,S])&!$([NX3]-[a])&!$([Nv4&+1]),O&!$([OX2](C)C=O)&!$(O(~a)~a)&!$(O=N-*)&!$([O-]-N=O),o+0,F&$(F-[#6])&!$(F-[#6][F,Cl,Br,I])]', donor='[$([O,S;+0]),$([N;v3,v4&+1]),n+0]-[H]', distance=3.5, DHA_angle=(130, 180))#
Bases:
HBAcceptorHbond interaction between a ligand (donor) and a residue (acceptor)
- Parameters
Changed in version 1.1.0: The initial SMARTS pattern was too broad.
Changed in version 2.0.0:
anglesparameter renamed toDHA_angle.
- class prolif.interactions.interactions.Hydrophobic(hydrophobic='[c,s,Br,I,S&H0&v2,$([D3,D4;#6])&!$([#6]~[#7,#8,#9])&!$([#6X4H0]);+0]', distance=4.5)[source]#
Bases:
DistanceHydrophobic interaction
- Parameters
Changed in version 1.1.0: The initial SMARTS pattern was too broad.
- class prolif.interactions.interactions.MetalAcceptor(metal='[Ca,Cd,Co,Cu,Fe,Mg,Mn,Ni,Zn]', ligand='[O,#7&!$([nX3])&!$([NX3]-*=[!#6])&!$([NX3]-[a])&!$([NX4]),-{1-};!+{1-}]', distance=2.8)#
Bases:
MetalDonorMetal complexation interaction between a ligand (chelated) and a residue (metal)
- Parameters
Changed in version 1.1.0: The initial SMARTS pattern was too broad.
- class prolif.interactions.interactions.MetalDonor(metal='[Ca,Cd,Co,Cu,Fe,Mg,Mn,Ni,Zn]', ligand='[O,#7&!$([nX3])&!$([NX3]-*=[!#6])&!$([NX3]-[a])&!$([NX4]),-{1-};!+{1-}]', distance=2.8)[source]#
Bases:
DistanceMetal complexation interaction between a ligand (metal) and a residue (chelated)
- Parameters
Changed in version 1.1.0: The initial SMARTS pattern was too broad.
- class prolif.interactions.interactions.PiCation(cation='[+{1-},$([NX3&!$([NX3]-O)]-[C]=[NX3+])]', pi_ring=('[a;r6]1:[a;r6]:[a;r6]:[a;r6]:[a;r6]:[a;r6]:1', '[a;r5]1:[a;r5]:[a;r5]:[a;r5]:[a;r5]:1'), distance=4.5, angle=(0, 30))#
Bases:
CationPiCation-Pi interaction between a ligand (aromatic ring) and a residue (cation)
- Parameters
cation (str) – SMARTS for cation
pi_ring (tuple) – SMARTS for aromatic rings (5 and 6 membered rings only)
distance (float) – Cutoff distance between the centroid and the cation
angle (tuple) – Min and max values for the angle between the vector normal to the ring plane and the vector going from the centroid to the cation
Changed in version 1.1.0: Handles resonance forms for amidine and guanidine as cations.
Changed in version 2.0.0:
anglesparameter renamed toangle.
- class prolif.interactions.interactions.PiStacking(ftf_kwargs=None, etf_kwargs=None)[source]#
Bases:
InteractionPi-Stacking interaction between a ligand and a residue
- Parameters
Changed in version 0.3.4: shortest_distance has been replaced by angle_normal_centroid
Changed in version 1.1.0: The implementation now directly calls
EdgeToFaceandFaceToFaceinstead of overwriting the default parameters with more generic ones.
- class prolif.interactions.interactions.VdWContact(tolerance=0.0, vdwradii=None)[source]#
Bases:
InteractionInteraction based on the van der Waals radii of interacting atoms.
- Parameters
tolerance (float) – Tolerance added to the sum of vdW radii of atoms before comparing to the interatomic distance. If
distance <= sum_vdw + tolerancethe atoms are identified as a contactvdwradii (dict, optional) – Updates to the vdW radii dictionary, with elements (first letter uppercase) as a key and the radius as a value.
- Raises
ValueError –
toleranceparameter cannot be negative
Changed in version 2.0.0: Added the
vdwradiiparameter.
- class prolif.interactions.interactions.XBAcceptor(acceptor='[#7,#8,P,S,Se,Te,a;!+{1-}][*]', donor='[#6,#7,Si,F,Cl,Br,I]-[Cl,Br,I,At]', distance=3.5, AXD_angle=(130, 180), XAR_angle=(80, 140))[source]#
Bases:
DoubleAngleHalogen bonding between a ligand (acceptor) and a residue (donor).
- Parameters
acceptor (str) – SMARTS for
[Acceptor]-[R]donor (str) – SMARTS for
[Donor]-[Halogen]distance (float) – Cutoff distance between the acceptor and halogen atoms
AXD_angle (tuple) – Min and max values for the
[Acceptor]...[Halogen]-[Donor]angleXAR_angle (tuple) – Min and max values for the
[Halogen]...[Acceptor]-[R]angle
Notes
Distance and angle adapted from Auffinger et al. PNAS 2004
Changed in version 2.0.0:
axd_anglesandxar_anglesparameters renamed toAXD_angleandXAR_angle.
- class prolif.interactions.interactions.XBDonor(acceptor='[#7,#8,P,S,Se,Te,a;!+{1-}][*]', donor='[#6,#7,Si,F,Cl,Br,I]-[Cl,Br,I,At]', distance=3.5, AXD_angle=(130, 180), XAR_angle=(80, 140))#
Bases:
XBAcceptorHalogen bonding between a ligand (donor) and a residue (acceptor)
- Parameters
acceptor (str) – SMARTS for
[Acceptor]-[R]donor (str) – SMARTS for
[Donor]-[Halogen]distance (float) – Cutoff distance between the acceptor and halogen atoms
AXD_angle (tuple) – Min and max values for the
[Acceptor]...[Halogen]-[Donor]angleXAR_angle (tuple) – Min and max values for the
[Halogen]...[Acceptor]-[R]angle
Notes
Distance and angle adapted from Auffinger et al. PNAS 2004
Changed in version 2.0.0:
axd_anglesandxar_anglesparameters renamed toAXD_angleandXAR_angle.
Base interaction classes — prolif.interactions.base#
This module contains the base classes used to build most of the interactions.
- class prolif.interactions.base.BasePiStacking(distance, plane_angle, normal_to_centroid_angle, pi_ring=('[a;r6]1:[a;r6]:[a;r6]:[a;r6]:[a;r6]:[a;r6]:1', '[a;r5]1:[a;r5]:[a;r5]:[a;r5]:[a;r5]:1'), intersect=False, intersect_radius=1.5)[source]#
Bases:
InteractionBase class for Pi-Stacking interactions
- Parameters
distance (float) – Cutoff distance between each rings centroid
plane_angles (tuple) – Min and max values for the angle between the ring planes
normal_centroids_angles (tuple) – Min and max angles allowed between the vector normal to a ring’s plane, and the vector between the centroid of both rings.
pi_ring (list) – List of SMARTS for aromatic rings
intersect (bool) – Whether to look for the point of intersection between both rings (for T-shaped interaction).
intersect_radius (float) – If
intersect=True, used to check whether the intersect point falls withinintersect_radiusof the opposite ring’s centroid.
Changed in version 1.1.0: The implementation now relies on the angle between the vector normal to a ring’s plane and the vector between centroids (
normal_centroids_angles) instead of theshortest_distanceparameter.Changed in version 2.0.0: Renamed
centroid_distanceto distance. Addedintersectandring_radiusparameters.
- class prolif.interactions.base.Distance(lig_pattern, prot_pattern, distance)[source]#
Bases:
InteractionGeneric class for distance-based interactions
- class prolif.interactions.base.DoubleAngle(lig_pattern, prot_pattern, distance, L1P2P1_angle, L2L1P2_angle, distance_atoms=('L1', 'P2'), metadata_mapping=None)[source]#
Bases:
InteractionGeneric class for interactions using constraints on a distance and two angles.
- Parameters
lig_pattern (str) – SMARTS for
[L1]-[L2]prot_pattern (str) – SMARTS for
[P1]-[P2]distance (float) – Distance threshold
L1P2P1_angle (tuple) – Min and max values for the
[L1]...[P2]-[P1]angleL2L1P2_angle (tuple) – Min and max values for the
[L2]-[L1]...[P2]angledistance_atoms (tuple[str, str]) – Which atoms to use for the distance calculation: L1 or L2, and P1 or P2
metadata_mapping (dict, optional) – Mapping for names used in the metadata dict for the distance and angle variables
New in version 2.0.0.
- class prolif.interactions.base.Interaction[source]#
Bases:
objectBase class for interactions
All interaction classes must inherit this class and define a
detect()method.Changed in version 2.0.0: Changed the return type of interactions. Added some helper methods to easily update/derive interaction classes.
- class prolif.interactions.base.SingleAngle(lig_pattern, prot_pattern, distance, angle, distance_atom, metadata_mapping=None)[source]#
Bases:
InteractionGeneric class for interactions using constraints on a distance and an angle.
- Parameters
lig_pattern (str) – SMARTS pattern for atoms in ligand residues.
prot_pattern (str) – SMARTS pattern for atoms in protein residues, should include (at least) two atoms, the first two atoms will be used for the angle calculation.
distance (float) – Distance threshold.
angle (tuple) – Min and max values for the
[P1]-[P2]...[L1]angle, where L1 is the first atom in thelig_patternSMARTS, and P1 and P2 are the first two atoms in theprot_patternSMARTS.distance_atom (str) – Which atom from the protein (P1 or P2) to use for the distance calculation with L1.
metadata_mapping (dict, optional) – Mapping for names used in the metadata dict for the distance and angle variables
New in version 2.0.0.
Generating an IFP in parallel — prolif.parallel#
This module provides two classes, TrajectoryPool and MolIterablePool
to execute the analysis in parallel. These are used in the parallel implementation used
in run() and
run_from_iterable() respectively.
- class prolif.parallel.MolIterablePool(n_processes, fingerprint, prot_mol, residues, tqdm_kwargs)[source]#
Process pool for a parallelized IFP analysis on an iterable of ligands. Must be used in a
withstatement.- Parameters
n_processes (int) – Max number of processes
fingerprint (prolif.fingerprint.Fingerprint) – Fingerprint instance used to generate the IFP
prot_mol (prolif.molecule.Molecule) – Protein molecule
residues (list, optional) – List of protein residues considered for the IFP
tqdm_kwargs (dict) – Parameters for the
tqdmprogress bar
- pool#
The underlying pool instance.
- classmethod executor(mol)[source]#
Classmethod executed by each child process on a single ligand molecule from the input iterable.
- Returns
ifp_data – A dictionary indexed by
(ligand, protein)residue pairs, and each value is a sparse dictionary of metadata indexed by interaction name.- Return type
- classmethod initializer(fingerprint, prot_mol, residues)[source]#
Initializer classmethod passed to the pool so that each child process can access these objects without copying them.
- process(args_iterable)[source]#
Maps the input iterable of molecules to the executor function.
- Parameters
args_iterable (Iterable[prolif.molecule.Molecule]) – An iterable yielding ligand molecules
- Returns
ifp – An iterable of
IFPdictionaries.- Return type
- class prolif.parallel.Progress(killswitch, tracker, **kwargs)[source]#
Helper class to track the number of frames processed by the
TrajectoryPool.- Parameters
killswitch (threading.Event) – A threading.Event instance created and controlled by the
TrajectoryPoolto kill the thread updating thetqdmprogress bar.tracker (multiprocess.Value) – Value holding a
ctypes.c_uint32ctype updated by theTrajectoryPool, storing how many frames were processed since the last progress bar update.**kwargs (object) – Used to create the
tqdmprogress bar.
- delay#
Delay between progress bar updates. This requires locking access to the
trackerobject which theTrajectoryPoolneeds access to, so too small values might cause delays in the analysis.- Type
float, default = 0.5
- tqdm_pbar#
The progress bar displayed
- Type
tqdm.std.tqdm
- close()[source]#
Cleanup after the
TrajectoryPoolcomputation is done
- class prolif.parallel.TrajectoryPool(n_processes, fingerprint, residues, tqdm_kwargs, rdkitconverter_kwargs)[source]#
Process pool for a parallelized IFP analysis on an MD trajectory. Must be used in a
withstatement.- Parameters
n_processes (int) – Max number of processes
fingerprint (prolif.fingerprint.Fingerprint) – Fingerprint instance used to generate the IFP
residues (list, optional) – List of protein residues considered for the IFP
tqdm_kwargs (dict) – Parameters for the
tqdmprogress barrdkitconverter_kwargs (tuple[dict, dict]) – Parameters for the
RDKitConverterfrom MDAnalysis: the first for the ligand, and the second for the protein
- tracker#
Value holding a
ctypes.c_uint32ctype storing how many frames were processed since the last progress bar update.- Type
multiprocess.Value
- pool#
The underlying pool instance.
- classmethod executor(args)[source]#
Classmethod executed by each child process on a chunk of the trajectory
- Returns
ifp_chunk – A dictionary of
IFPindexed by frame number- Return type
- classmethod initializer(tracker, fingerprint, residues, rdkitconverter_kwargs)[source]#
Initializer classmethod passed to the pool so that each child process can access these objects without copying them.