3. Protein-protein interactions
This notebooks shows how to compute a fingerprint for protein-protein interactions.
Here we will investigate the interactions in a G-protein coupled receptor (GPCR) between a particular helix (called TM3) and the rest of the protein.
This can obviously be applied to proteins that don’t belong to the same chain/segment, as long as you can figure out an appropriate MDAnalysis selection
There is also an example at the end of this tutorial for generating an IFP of PPI without considering the backbone.
[1]:
import MDAnalysis as mda
import prolif as plf
[2]:
# load traj
u = mda.Universe(plf.datafiles.TOP, plf.datafiles.TRAJ)
tm3 = u.select_atoms("resid 119:152")
prot = u.select_atoms("protein and not group tm3", tm3=tm3)
tm3, prot
[2]:
(<AtomGroup with 531 atoms>, <AtomGroup with 4457 atoms>)
[3]:
# prot-prot interactions
fp = plf.Fingerprint(["HBDonor", "HBAcceptor", "PiStacking", "PiCation", "CationPi", "Anionic", "Cationic"])
fp.run(u.trajectory[::10], tm3, prot)
[3]:
<prolif.fingerprint.Fingerprint: 7 interactions: ['HBDonor', 'HBAcceptor', 'Cationic', 'Anionic', 'PiCation', 'CationPi', 'PiStacking'] at 0x7f79207b0ca0>
[4]:
df = fp.to_dataframe()
df.head()
[4]:
ligand | GLN119.A | ... | ALA150.A | ILE151.A | THR152.A | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
protein | GLN189.A | ALA190.A | ALA192.A | GLU194.A | VAL196.A | SER197.A | ... | ALA154.A | ARG238.A | ALA154.A | ARG238.A | ASN245.A | ASP153.A | GLU234.A | ARG238.A | LYS241.A | |||||
interaction | HBAcceptor | HBDonor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | ... | HBAcceptor | HBAcceptor | HBAcceptor | HBAcceptor | HBAcceptor | HBDonor | HBAcceptor | HBDonor | HBAcceptor | HBAcceptor |
Frame | |||||||||||||||||||||
0 | False | False | True | False | False | True | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
10 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | True | False | False | False | False | False | False |
20 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | True | False | False | True | False | True | False |
30 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
40 | False | False | False | False | True | False | False | True | False | False | ... | False | False | True | True | False | False | False | False | False | False |
5 rows × 55 columns
[5]:
# show interactions for a specific ligand residue
df.xs("ARG147.A", level="ligand", axis=1).head(5)
[5]:
protein | ALA84.A | GLU309.B | THR313.B | |
---|---|---|---|---|
interaction | HBDonor | HBDonor | Cationic | HBDonor |
Frame | ||||
0 | False | False | False | False |
10 | False | False | False | False |
20 | True | False | False | False |
30 | False | False | False | False |
40 | False | True | True | False |
[6]:
# same for a protein residue
df.xs("GLU309.B", level="protein", axis=1).head(5)
[6]:
ligand | ARG147.A | |
---|---|---|
interaction | HBDonor | Cationic |
Frame | ||
0 | False | False |
10 | False | False |
20 | False | False |
30 | False | False |
40 | True | True |
[7]:
# display a specific type of interaction
df.xs("Cationic", level="interaction", axis=1).head(5)
[7]:
ligand | ARG147.A |
---|---|
protein | GLU309.B |
Frame | |
0 | False |
10 | False |
20 | False |
30 | False |
40 | True |
[8]:
# calculate the occurence of each interaction on the trajectory
occ = df.mean()
# restrict to the frequent ones
occ.loc[occ > 0.3]
[8]:
ligand protein interaction
CYS122.A GLY118.A HBDonor 0.84
ASP123.A LYS191.A Anionic 0.32
TRP125.A VAL102.A HBDonor 0.60
TYR109.A PiStacking 0.48
TRP115.A PiStacking 0.80
SER127.A SER181.A HBDonor 0.32
ASP129.A TYR359.B HBAcceptor 0.96
HSD139.A TRP174.A PiStacking 0.36
ASP146.A TYR157.A HBAcceptor 0.88
ARG161.A HBAcceptor 0.48
Anionic 0.36
ARG147.A GLU309.B HBDonor 0.44
Cationic 0.44
TRP149.A ASP153.A HBAcceptor 0.36
dtype: float64
[9]:
# regroup all interactions together and do the same
g = (df.groupby(level=["ligand", "protein"], axis=1)
.sum()
.astype(bool)
.mean())
g.loc[g > 0.3]
[9]:
ligand protein
ARG147.A GLU309.B 0.44
ASP123.A LYS191.A 0.32
ASP129.A TYR359.B 0.96
ASP146.A ARG161.A 0.48
TYR157.A 0.88
CYS122.A GLY118.A 0.84
HSD139.A TRP174.A 0.36
SER127.A SER181.A 0.32
TRP125.A TRP115.A 0.80
TYR109.A 0.48
VAL102.A 0.60
TRP149.A ASP153.A 0.36
dtype: float64
3.1. Ignoring backbone interactions
In some cases, you might want to dismiss backbone interactions. While it might be tempting to just modify the MDAnalysis selection with
"protein and not backbone"
, this won’t work as expected and will lead to adding a charges where the backbone was bonding with the sidechain.However there is a temporary workaround (which will be directly included in the code in the near future):
[10]:
from rdkit import Chem
from rdkit.Chem import AllChem
from tqdm.auto import tqdm
# remove backbone
backbone = Chem.MolFromSmarts("[C^2](=O)-[C;X4](-[H])-[N;+0]")
fix_h = Chem.MolFromSmarts("[H&D0]")
def remove_backbone(atomgroup):
mol = plf.Molecule.from_mda(atomgroup)
mol = AllChem.DeleteSubstructs(mol, backbone)
mol = AllChem.DeleteSubstructs(mol, fix_h)
return plf.Molecule(mol)
# generate IFP
ifp = []
for ts in tqdm(u.trajectory[::10]):
tm3_mol = remove_backbone(tm3)
prot_mol = remove_backbone(prot)
data = fp.generate(tm3_mol, prot_mol)
data["Frame"] = ts.frame
ifp.append(data)
df = plf.to_dataframe(ifp, fp.interactions.keys())
df.head()
[10]:
ligand | GLN119.A | CYS122.A | ASP123.A | PHE124.A | TRP125.A | ... | ASP146.A | ARG147.A | TYR148.A | TRP149.A | THR152.A | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
protein | GLN189.A | SER197.A | LYS191.A | ARG188.A | GLN189.A | LYS191.A | PHE185.A | SER106.A | ... | TYR157.A | ARG161.A | GLU309.B | THR313.B | ARG230.A | TYR157.A | ASP153.A | GLU234.A | ||||
interaction | HBAcceptor | HBAcceptor | Anionic | HBAcceptor | Anionic | HBAcceptor | HBAcceptor | Anionic | PiStacking | HBDonor | ... | HBAcceptor | HBAcceptor | Anionic | HBDonor | Cationic | HBDonor | HBAcceptor | PiStacking | HBDonor | HBDonor |
Frame | |||||||||||||||||||||
0 | False | False | False | False | False | False | False | False | False | True | ... | True | False | False | False | False | False | False | False | False | False |
10 | False | False | False | False | False | False | False | False | False | False | ... | True | False | False | False | False | False | False | True | False | False |
20 | False | True | False | False | False | False | True | True | False | False | ... | True | False | False | False | False | False | False | True | False | False |
30 | False | False | False | True | True | False | False | False | False | False | ... | True | True | True | False | False | False | False | False | False | False |
40 | False | False | False | False | False | False | True | True | False | True | ... | True | False | False | True | True | False | False | False | False | False |
5 rows × 28 columns