Helper functions#
Helper functions — prolif.utils#
- prolif.utils.get_residues_near_ligand(lig: BaseRDKitMol, prot: BaseRDKitMol, cutoff: float = 6.0, use_segid: bool = False) list[prolif.residue.ResidueId][source]#
Detects residues close to a reference ligand
- Parameters:
lig (prolif.molecule.Molecule or prolif.residue.Residue) – Select residues that are near this ligand/residue
prot (prolif.molecule.Molecule) – Protein containing the residues
cutoff (float) – If any interatomic distance between the ligand reference points and a residue is below or equal to this cutoff, the residue will be selected
use_segid (bool, default = False) – Use the segment number rather than the chain identifier as a chain.
- Returns:
residues (list) – A list of unique
ResidueIdthat are close to the ligand.. versionchanged:: 2.1.0 – Added use_segid.
- prolif.utils.select_over_trajectory(u: Universe, trajectory: Trajectory, selections: str, **kwargs: Any) AtomGroup[source]#
- prolif.utils.select_over_trajectory(u: Universe, trajectory: Trajectory, *selections: str, **kwargs: Any) list[MDAnalysis.core.groups.AtomGroup]
Returns AtomGroup(s) that satisfy each distance-based
selectionover the entire trajectory rather than only the first frame. Only useful for distance-based selections. Usegroup {0}in any additional selection to refer to the AtomGroup generated by the first selection, orgroup {N-1}for the Nth selection.
- prolif.utils.to_bitvectors(df: DataFrame) list[rdkit.DataStructs.cDataStructs.ExplicitBitVect][source]#
Converts an interaction DataFrame to a list of RDKit ExplicitBitVector
- Parameters:
df (pandas.DataFrame) – A DataFrame where each column corresponds to an interaction between two residues
- Returns:
bv – A list of
ExplicitBitVectfor each frame- Return type:
Example
>>> from rdkit.DataStructs import TanimotoSimilarity >>> bv = prolif.to_bitvectors(df) >>> TanimotoSimilarity(bv[0], bv[1]) 0.42
- prolif.utils.to_dataframe(ifp: IFPResults, interactions: Collection[str], count: bool = False, dtype: type | None = None, drop_empty: bool = True, index_col: str = 'Frame') DataFrame[source]#
Converts IFPs to a pandas DataFrame
- Parameters:
ifp (dict) – A dict in the format
{<frame number>: {(<residue_id>, <residue_id>): <interactions>}}.<interactions>is either anumpy.ndarraybitvector, or a tuple of dict in the format{<interaction name>: <metadata dict>}.interactions (list) – A list of interactions, in the same order as used to detect the interactions.
count (bool) – Whether to output a count fingerprint or not.
dtype (object or None) – Cast the dataframe values to this type. If
None, usesnp.uint8ifcount=True, elsebool.drop_empty (bool) – Drop columns with only empty values
index_col (str) – Name of the index column in the DataFrame
- Returns:
df – A 3-levels DataFrame where each ligand residue, protein residue, and interaction type are in separate columns
- Return type:
Example
>>> df = prolif.to_dataframe(results, fp.interactions.keys(), dtype=int) >>> print(df) ligand LIG1.G protein ILE59 ILE55 TYR93 interaction Hydrophobic HBAcceptor Hydrophobic Hydrophobic PiStacking Frame 0 0 1 0 0 0 ...
Changed in version 0.3.2: Moved the
return_atomsparameter from therunmethods to the dataframe conversion codeChanged in version 2.0.0: Removed the
return_atomsparameter. Added thecountparameter. Removed support forifpcontainingnp.ndarraybitvectors.