Residues

ResidueId

class prolif.residue.ResidueId(name: str = 'UNK', number: int = 0, chain: Optional[str] = None)

A unique residue identifier

Parameters:
  • name (str) – 3-letter residue name

  • number (int) – residue number

  • chain (str or None, optional) – 1-letter protein chain

classmethod from_atom(atom)

Creates a ResidueId from an RDKit atom

Parameters:

atom (rdkit.Chem.rdchem.Atom) – An atom that contains an RDKit AtomMonomerInfo

classmethod from_string(resid_str)

Creates a ResidueId from a string

Parameters:

resid_str (str) – A string in the format <3-letter code><residue number>.<chain>. Every part is optional, and the dot should be present only if the chain identifier is also present.

Examples

string       Corresponding ResidueId
-----------  --------------------------
"ALA10.A"    ResidueId("ALA", 10, "A")
"GLU33"      ResidueId("GLU", 33, None)
"LYS.B"      ResidueId("LYS", 0, "B")
"ARG"        ResidueId("ARG", 0, None)
"5.C"        ResidueId("UNK", 5, "C")
"42"         ResidueId("UNK", 42, None)
".D"         ResidueId("UNK", 0, "D")
""           ResidueId("UNK", 0, None)
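The string format described above can be illustrated with a small standalone parser. This is only a sketch under stated assumptions: `ResidueKey`, `_RESID_PATTERN`, and `parse_resid` are hypothetical names invented for this example, and the regular expression is not necessarily the one ProLIF uses internally.

```python
import re
from typing import NamedTuple, Optional


class ResidueKey(NamedTuple):
    """Hypothetical stand-in for ResidueId, with the same defaults."""
    name: str = "UNK"
    number: int = 0
    chain: Optional[str] = None


# <3-letter code><residue number>.<chain>, every part optional.
# The dot is only accepted when a chain character follows it.
_RESID_PATTERN = re.compile(r"([A-Za-z]{3})?(\d+)?(?:\.(\w))?")


def parse_resid(resid_str: str) -> ResidueKey:
    """Parse a residue string, falling back to the defaults for missing parts."""
    match = _RESID_PATTERN.fullmatch(resid_str)
    if match is None:
        raise ValueError(f"Unrecognized residue string: {resid_str!r}")
    name, number, chain = match.groups()
    return ResidueKey(name or "UNK", int(number) if number else 0, chain)


print(parse_resid("ALA10.A"))  # ResidueKey(name='ALA', number=10, chain='A')
print(parse_resid("5.C"))      # ResidueKey(name='UNK', number=5, chain='C')
```

In this sketch, a missing name falls back to "UNK", a missing number to 0, and a missing chain to None, which mirrors the defaults in the ResidueId signature.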

ResidueGroup

class prolif.residue.ResidueGroup(residues: List[Residue])

A container to store and retrieve Residue instances easily

Parameters:

residues (list) – A list of Residue

n_residues

Number of residues in the ResidueGroup

Type:

int

Notes

Residues in the group can be accessed by ResidueId, string, or index. See the Molecule class for an example. You can also use the select() method to access a subset of a ResidueGroup.
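To make the three access styles concrete, here is a minimal sketch of a container that dispatches on the type of the lookup key. `Key` and `Lookup` are hypothetical classes written for this example only; they are not ProLIF's ResidueId or ResidueGroup, and the real implementation may work differently.

```python
from typing import Any, Dict, List, NamedTuple, Optional, Tuple, Union


class Key(NamedTuple):
    """Hypothetical stand-in for ResidueId (name, number, chain)."""
    name: str = "UNK"
    number: int = 0
    chain: Optional[str] = None

    def label(self) -> str:
        # Render as <name><number>[.<chain>], e.g. "ALA10.A"
        suffix = f".{self.chain}" if self.chain else ""
        return f"{self.name}{self.number}{suffix}"


class Lookup:
    """Hypothetical container; not ProLIF's ResidueGroup implementation."""

    def __init__(self, items: List[Tuple[Key, Any]]) -> None:
        self._values = [value for _, value in items]
        self._by_key: Dict[Key, Any] = dict(items)
        self._by_label: Dict[str, Any] = {key.label(): value for key, value in items}

    def __getitem__(self, item: Union[Key, str, int]) -> Any:
        if isinstance(item, int):
            return self._values[item]    # positional access
        if isinstance(item, str):
            return self._by_label[item]  # access by string label
        return self._by_key[item]        # access by identifier


group = Lookup([(Key("ALA", 10, "A"), "first"), (Key("GLU", 33), "second")])
print(group[0], group["ALA10.A"], group[Key("GLU", 33)])  # first first second
```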

select(mask)

Locate a subset of a ResidueGroup based on a boolean mask

Parameters:

mask (numpy.ndarray) – A 1D array of dtype=bool with the same length as the number of residues in the ResidueGroup. Build the mask from conditions on the “name”, “number”, and “chain” residue attributes, as defined in the ResidueId class.

Returns:

rg – A subset of the original ResidueGroup

Return type:

prolif.residue.ResidueGroup

Examples

>>> import numpy as np
>>> rg
<prolif.residue.ResidueGroup with 200 residues at 0x7f9a68719ac0>
>>> rg.select(rg.chain == "A")
<prolif.residue.ResidueGroup with 42 residues at 0x7fe3fdb86ca0>
>>> rg.select((10 <= rg.number) & (rg.number < 30))
<prolif.residue.ResidueGroup with 20 residues at 0x7f5f3c69aaf0>
>>> rg.select((rg.chain == "B") & (np.isin(rg.name, ["ASP", "GLU"])))
<prolif.residue.ResidueGroup with 3 residues at 0x7f5f3c510c70>

As these examples show, you can combine masks with element-wise operators, as with NumPy boolean indexing or the pandas .loc indexer:

  • AND -> &

  • OR -> |

  • XOR -> ^

  • NOT -> ~
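The operators listed above can be tried on plain NumPy arrays without a ResidueGroup. In the sketch below, the `name`, `number`, and `chain` arrays are made-up data standing in for the corresponding residue attributes; only the way the masks are built and combined is the point.

```python
import numpy as np

# Made-up per-residue attributes standing in for rg.name, rg.number, rg.chain
name = np.array(["ASP", "GLU", "ALA", "ASP", "LYS"])
number = np.array([10, 25, 30, 12, 40])
chain = np.array(["A", "A", "B", "B", "B"])

# Each condition yields a 1D boolean array with one entry per residue
in_chain_b = chain == "B"
acidic = np.isin(name, ["ASP", "GLU"])
in_range = (10 <= number) & (number < 30)

both = in_chain_b & acidic         # AND: chain B and acidic
either = in_chain_b | acidic       # OR: chain B or acidic
exactly_one = in_chain_b ^ acidic  # XOR: one condition but not both
not_b = ~in_chain_b                # NOT: every chain except B

print(both.tolist())  # [False, False, False, True, False]
```

Each comparison is wrapped in parentheses because `&`, `|`, and `^` bind more tightly than comparison operators such as `==` and `<` in Python.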