3. Protein-protein interactions
This notebook shows how to compute a fingerprint for protein-protein interactions.
Here we investigate the interactions within a G-protein coupled receptor (GPCR) between one helix (called TM3) and the rest of the protein.
The same approach applies to proteins that don’t belong to the same chain or segment, as long as you can write an appropriate MDAnalysis selection for each group.
[1]:
import MDAnalysis as mda
import prolif as plf
[2]:
# load the trajectory, then select TM3 and the rest of the protein
u = mda.Universe(plf.datafiles.TOP, plf.datafiles.TRAJ)
tm3 = u.select_atoms("resid 119:152")
prot = u.select_atoms("protein and not group tm3", tm3=tm3)
tm3, prot
[2]:
(<AtomGroup with 531 atoms>, <AtomGroup with 4457 atoms>)
[3]:
# compute protein-protein interactions on every 10th frame
fp = plf.Fingerprint(["HBDonor", "HBAcceptor", "PiStacking", "PiCation", "CationPi", "Anionic", "Cationic"])
fp.run(u.trajectory[::10], tm3, prot)
[3]:
<prolif.fingerprint.Fingerprint: 7 interactions: ['HBDonor', 'HBAcceptor', 'Cationic', 'Anionic', 'PiCation', 'CationPi', 'PiStacking'] at 0x7f36e6a61d90>
[4]:
df = fp.to_dataframe()
df.head()
[4]:
ligand | GLN119.A | ... | ALA150.A | ILE151.A | THR152.A | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
protein | GLN189.A | ALA190.A | ALA192.A | GLU194.A | VAL196.A | SER197.A | ... | ALA154.A | ARG238.A | ALA154.A | ARG238.A | ASN245.A | ASP153.A | GLU234.A | ARG238.A | LYS241.A | |||||
interaction | HBAcceptor | HBDonor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | ... | HBAcceptor | HBAcceptor | HBAcceptor | HBAcceptor | HBAcceptor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | HBAcceptor |
Frame | |||||||||||||||||||||
0 | False | False | True | False | False | True | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
10 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | True | False | False | False | False | False | False |
20 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | True | False | False | True | False | True | False |
30 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
40 | False | False | False | False | True | False | False | True | False | False | ... | False | False | True | True | False | False | False | False | False | False |
5 rows × 55 columns
[5]:
# show interactions for a specific ligand residue
df.xs("ARG147.A", level="ligand", axis=1).head(5)
[5]:
protein | ALA84.A | GLU309.B | GLU309.B | THR313.B |
---|---|---|---|---|
interaction | HBDonor | HBDonor | Cationic | HBDonor |
Frame | ||||
0 | False | False | False | False |
10 | False | False | False | False |
20 | True | False | False | False |
30 | False | False | False | False |
40 | False | True | True | False |
[6]:
# same for a protein residue
df.xs("GLU309.B", level="protein", axis=1).head(5)
[6]:
ligand | ARG147.A | ARG147.A |
---|---|---|
interaction | HBDonor | Cationic |
Frame | ||
0 | False | False |
10 | False | False |
20 | False | False |
30 | False | False |
40 | True | True |
[7]:
# display a specific type of interaction
df.xs("Cationic", level="interaction", axis=1).head(5)
[7]:
ligand | ARG147.A |
---|---|
protein | GLU309.B |
Frame | |
0 | False |
10 | False |
20 | False |
30 | False |
40 | True |
[8]:
# calculate the occurrence of each interaction over the trajectory
occ = df.mean()
# restrict to the frequent ones
occ.loc[occ > 0.3]
[8]:
ligand protein interaction
CYS122.A GLY118.A HBDonor 0.84
ASP123.A LYS191.A Anionic 0.32
TRP125.A VAL102.A HBDonor 0.60
TYR109.A PiStacking 0.48
TRP115.A PiStacking 0.80
SER127.A SER181.A HBDonor 0.32
ASP129.A TYR359.B HBAcceptor 0.96
HSD139.A TRP174.A PiStacking 0.36
ASP146.A TYR157.A HBAcceptor 0.88
ARG161.A HBAcceptor 0.48
Anionic 0.36
ARG147.A GLU309.B HBDonor 0.44
Cationic 0.44
TRP149.A ASP153.A HBAcceptor 0.36
dtype: float64
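Because every column holds one boolean per analysed frame, taking the mean gives the fraction of frames in which that interaction is present. The plain-Python sketch below spells out that arithmetic on made-up values. The names `toy`, `occurrence`, `occ_toy` and `frequent` are illustrative only and are not part of ProLIF.

```python
# Hypothetical toy data: one boolean per analysed frame for each
# (ligand residue, protein residue, interaction) triple.
toy = {
    ("ARG147.A", "GLU309.B", "HBDonor"): [False, True, True, False],
    ("ARG147.A", "GLU309.B", "Cationic"): [False, False, True, False],
    ("ASP129.A", "TYR359.B", "HBAcceptor"): [True, True, True, False],
}


def occurrence(bits):
    """Fraction of frames in which the interaction is present."""
    # True counts as 1 and False as 0, so sum/len is the mean
    return sum(bits) / len(bits)


# Equivalent of df.mean(): one occurrence value per column
occ_toy = {key: occurrence(bits) for key, bits in toy.items()}

# Equivalent of occ.loc[occ > 0.3]: keep only the frequent interactions
frequent = {key: value for key, value in occ_toy.items() if value > 0.3}

for key, value in frequent.items():
    print(key, value)
```

Note that the occurrence is relative to the frames that were actually analysed (every 10th frame here), not to the full trajectory.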
[9]:
# group all interaction types together for each residue pair and do the same
g = (
    df.groupby(level=["ligand", "protein"], axis=1)
    .sum()
    .astype(bool)
    .mean()
)
g.loc[g > 0.3]
[9]:
ligand protein
ARG147.A GLU309.B 0.44
ASP123.A LYS191.A 0.32
ASP129.A TYR359.B 0.96
ASP146.A ARG161.A 0.48
TYR157.A 0.88
CYS122.A GLY118.A 0.84
HSD139.A TRP174.A 0.36
SER127.A SER181.A 0.32
TRP125.A TRP115.A 0.80
TYR109.A 0.48
VAL102.A 0.60
TRP149.A ASP153.A 0.36
dtype: float64
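The grouping step counts a residue pair as interacting in a frame if at least one interaction type is present, then averages over frames. As far as we are aware, recent pandas releases (2.1 onward) emit a deprecation warning for `groupby(..., axis=1)`. A possible workaround, sketched below on made-up data, is to transpose, group on the index, and transpose back. The name `toy_df` and its values are hypothetical, and `.any()` is used here as an equivalent of `.sum().astype(bool)`.

```python
import pandas as pd

# Hypothetical toy data with the same column layout as fp.to_dataframe()
columns = pd.MultiIndex.from_tuples(
    [
        ("ARG147.A", "GLU309.B", "HBDonor"),
        ("ARG147.A", "GLU309.B", "Cationic"),
        ("ASP129.A", "TYR359.B", "HBAcceptor"),
    ],
    names=["ligand", "protein", "interaction"],
)
toy_df = pd.DataFrame(
    [
        [True, False, True],
        [False, True, True],
        [False, False, True],
        [False, False, False],
    ],
    index=pd.Index([0, 10, 20, 30], name="Frame"),
    columns=columns,
)

# Occurrence of each individual interaction
per_interaction = toy_df.mean()

# Occurrence of each residue pair, whatever the interaction type:
# transpose so the residue levels are in the index, collapse the
# interaction level with any(), transpose back, then average over frames
pair_occ = (
    toy_df.T
    .groupby(level=["ligand", "protein"])
    .any()
    .T
    .mean()
)

print(per_interaction)
print(pair_occ)
```

In this toy case the ARG147.A/GLU309.B pair reaches a higher occurrence than either of its individual interactions, because the two interaction types occur in different frames.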