Helper functions#

Helper functions — prolif.utils#

prolif.utils.get_residues_near_ligand(lig: BaseRDKitMol, prot: BaseRDKitMol, cutoff: float = 6.0, use_segid: bool = False) list[prolif.residue.ResidueId][source]#

Detects residues close to a reference ligand

Parameters:
  • lig (prolif.molecule.Molecule or prolif.residue.Residue) – Select residues that are near this ligand/residue

  • prot (prolif.molecule.Molecule) – Protein containing the residues

  • cutoff (float) – If any interatomic distance between the ligand reference points and a residue is below or equal to this cutoff, the residue will be selected

  • use_segid (bool, default = False) – Use the segment number rather than the chain identifier as a chain.

Returns:

  • residues (list) – A list of unique ResidueId that are close to the ligand

  • .. versionchanged:: 2.1.0 – Added use_segid.

prolif.utils.select_over_trajectory(u: Universe, trajectory: Trajectory, selections: str, **kwargs: Any) AtomGroup[source]#
prolif.utils.select_over_trajectory(u: Universe, trajectory: Trajectory, *selections: str, **kwargs: Any) list[MDAnalysis.core.groups.AtomGroup]

Returns AtomGroup(s) that satisfy each distance-based selection over the entire trajectory rather than only the first frame. Only useful for distance-based selections. Use group {0} in any additional selection to refer to the AtomGroup generated by the first selection, or group {N-1} for the Nth selection.

prolif.utils.to_bitvectors(df: DataFrame) list[rdkit.DataStructs.cDataStructs.ExplicitBitVect][source]#

Converts an interaction DataFrame to a list of RDKit ExplicitBitVector

Parameters:

df (pandas.DataFrame) – A DataFrame where each column corresponds to an interaction between two residues

Returns:

bv – A list of ExplicitBitVect for each frame

Return type:

list

Example

>>> from rdkit.DataStructs import TanimotoSimilarity
>>> bv = prolif.to_bitvectors(df)
>>> TanimotoSimilarity(bv[0], bv[1])
0.42
prolif.utils.to_dataframe(ifp: IFPResults, interactions: Collection[str], count: bool = False, dtype: type | None = None, drop_empty: bool = True, index_col: str = 'Frame') DataFrame[source]#

Converts IFPs to a pandas DataFrame

Parameters:
  • ifp (dict) – A dict in the format {<frame number>: {(<residue_id>, <residue_id>): <interactions>}}. <interactions> is either a numpy.ndarray bitvector, or a tuple of dict in the format {<interaction name>: <metadata dict>}.

  • interactions (list) – A list of interactions, in the same order as used to detect the interactions.

  • count (bool) – Whether to output a count fingerprint or not.

  • dtype (object or None) – Cast the dataframe values to this type. If None, uses np.uint8 if count=True, else bool.

  • drop_empty (bool) – Drop columns with only empty values

  • index_col (str) – Name of the index column in the DataFrame

Returns:

df – A 3-levels DataFrame where each ligand residue, protein residue, and interaction type are in separate columns

Return type:

pandas.DataFrame

Example

>>> df = prolif.to_dataframe(results, fp.interactions.keys(), dtype=int)
>>> print(df)
ligand             LIG1.G
protein             ILE59                  ILE55       TYR93
interaction   Hydrophobic HBAcceptor Hydrophobic Hydrophobic PiStacking
Frame
0                       0          1           0           0          0
...

Changed in version 0.3.2: Moved the return_atoms parameter from the run methods to the dataframe conversion code

Changed in version 2.0.0: Removed the return_atoms parameter. Added the count parameter. Removed support for ifp containing np.ndarray bitvectors.