4. Visualisation¶
This notebook showcases different ways of visualizing lig-prot and prot-prot interactions, either with atomistic details or simply at the residue level.
[1]:
import MDAnalysis as mda
import prolif as plf
import numpy as np
# load topology
u = mda.Universe(plf.datafiles.TOP, plf.datafiles.TRAJ)
lig = u.select_atoms("resname LIG")
prot = u.select_atoms("protein")
[2]:
# create RDKit-like molecules for visualisation
lmol = plf.Molecule.from_mda(lig)
pmol = plf.Molecule.from_mda(prot)
[3]:
# get lig-prot interactions with atom info
fp = plf.Fingerprint(["HBDonor", "HBAcceptor", "Cationic", "PiStacking"])
fp.run(u.trajectory[0:1], lig, prot)
df = fp.to_dataframe(return_atoms=True)
df.T
[3]:
Frame | 0 | ||
---|---|---|---|
ligand | protein | interaction | |
LIG1.G | ASP129.A | Cationic | (13, 8) |
HBDonor | (52, 8) | ||
PHE331.B | PiStacking | (3, 7) | |
SER212.A | HBDonor | (44, 10) | |
VAL201.A | HBAcceptor | (28, 1) |
4.1. py3Dmol (3Dmol.js)¶
With py3Dmol we can easily display the interactions.
For interactions involving a ring (pi-cation, pi-stacking…etc.) ProLIF returns the index of one of the ring atoms, but for visualisation having the centroid of the ring looks nicer. We’ll start by writing a function to find the centroid, given the index of one of the ring atoms.
[4]:
from rdkit import Chem
from rdkit import Geometry
def get_ring_centroid(mol, index):
# find ring using the atom index
Chem.SanitizeMol(mol, Chem.SanitizeFlags.SANITIZE_SETAROMATICITY)
ri = mol.GetRingInfo()
for r in ri.AtomRings():
if index in r:
break
else:
raise ValueError("No ring containing this atom index was found in the given molecule")
# get centroid
coords = mol.xyz[list(r)]
ctd = plf.utils.get_centroid(coords)
return Geometry.Point3D(*ctd)
Finally, the actual visualisation code. The API of py3Dmol is exactly the same as the GLViewer class of 3Dmol.js, for which the documentation can be found here.
[5]:
import py3Dmol
colors = {
"HBAcceptor": "blue",
"HBDonor": "red",
"Cationic": "green",
"PiStacking": "purple",
}
# JavaScript functions
resid_hover = """function(atom,viewer) {{
if(!atom.label) {{
atom.label = viewer.addLabel('{0}:'+atom.atom+atom.serial,
{{position: atom, backgroundColor: 'mintcream', fontColor:'black'}});
}}
}}"""
hover_func = """
function(atom,viewer) {
if(!atom.label) {
atom.label = viewer.addLabel(atom.interaction,
{position: atom, backgroundColor: 'black', fontColor:'white'});
}
}"""
unhover_func = """
function(atom,viewer) {
if(atom.label) {
viewer.removeLabel(atom.label);
delete atom.label;
}
}"""
v = py3Dmol.view(650, 600)
v.removeAllModels()
models = {}
mid = -1
for i, row in df.T.iterrows():
lresid, presid, interaction = i
lindex, pindex = row[0]
lres = lmol[lresid]
pres = pmol[presid]
# set model ids for reusing later
for resid, res, style in [(lresid, lres, {"colorscheme": "cyanCarbon"}),
(presid, pres, {})]:
if resid not in models.keys():
mid += 1
v.addModel(Chem.MolToMolBlock(res), "sdf")
model = v.getModel()
model.setStyle({}, {"stick": style})
# add residue label
model.setHoverable({}, True, resid_hover.format(resid), unhover_func)
models[resid] = mid
# get coordinates for both points of the interaction
if interaction in ["PiStacking", "EdgeToFace", "FaceToFace", "PiCation"]:
p1 = get_ring_centroid(lres, lindex)
else:
p1 = lres.GetConformer().GetAtomPosition(lindex)
if interaction in ["PiStacking", "EdgeToFace", "FaceToFace", "CationPi"]:
p2 = get_ring_centroid(pres, pindex)
else:
p2 = pres.GetConformer().GetAtomPosition(pindex)
# add interaction line
v.addCylinder({"start": dict(x=p1.x, y=p1.y, z=p1.z),
"end": dict(x=p2.x, y=p2.y, z=p2.z),
"color": colors[interaction],
"radius": .15,
"dashed": True,
"fromCap": 1,
"toCap": 1,
})
# add label when hovering the middle of the dashed line by adding a dummy atom
c = Geometry.Point3D(*plf.utils.get_centroid([p1, p2]))
modelID = models[lresid]
model = v.getModel(modelID)
model.addAtoms([{"elem": 'Z',
"x": c.x, "y": c.y, "z": c.z,
"interaction": interaction}])
model.setStyle({"interaction": interaction}, {"clicksphere": {"radius": .5}})
model.setHoverable(
{"interaction": interaction}, True,
hover_func, unhover_func)
# show protein:
# first we need to reorder atoms as in the original MDAnalysis file.
# needed because the RDKitConverter reorders them when infering bond order
# and 3Dmol.js doesn't like when atoms from the same residue are spread accross the whole file
order = np.argsort([atom.GetIntProp("_MDAnalysis_index") for atom in pmol.GetAtoms()])
mol = Chem.RenumberAtoms(pmol, order.astype(int).tolist())
mol = Chem.RemoveAllHs(mol)
pdb = Chem.MolToPDBBlock(mol, flavor=0x20 | 0x10)
v.addModel(pdb, "pdb")
model = v.getModel()
model.setStyle({}, {"cartoon": {"style":"edged"}})
v.zoomTo({"model": list(models.values())})
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
[5]:
<py3Dmol.view at 0x7f862f3f6bb0>
4.2. Ligand Interaction Network (LigPlot)¶
Protein-ligand interactions are typically represented with the ligand in atomic details, residues as nodes, and interactions as edges. Such diagram can be easily displayed by calling ProLIF’s builtin class prolif.plotting.network.LigNetwork
. This diagram is interactive and allows moving around the residues, as well as clicking the legend to toggle the display of specific residues types or interactions. LigNetwork can generate two kinds of depictions:
Based on a single specific frame
By aggregating results from several frames
In the latter case, the frequency with which an interaction is seen will control the width of the corresponding edge. You can hide the least frequent interactions by using a threshold, i.e. threshold=0.3
will hide interactions that occur in less than 30% of frames.
[6]:
from prolif.plotting.network import LigNetwork
fp = plf.Fingerprint()
fp.run(u.trajectory[::10], lig, prot)
df = fp.to_dataframe(return_atoms=True)
net = LigNetwork.from_ifp(df, lmol,
# replace with `kind="frame", frame=0` for the other depiction
kind="aggregate", threshold=.3,
rotation=270)
net.display()
[6]:
You can further customize the diagram by changing the colors in LigNetwork.COLORS
or the residues types in LigNetwork.RESIDUE_TYPES
. Type help(LigNetwork)
for more details.
4.3. NetworkX and pyvis¶
NetworkX is a great library for working with graphs, but the drawing options are quickly limited so we will use networkx to create a graph, and pyvis to create interactive plots. The following code snippet will calculate the IFP, each residue (ligand or protein) is converted to a node, each interaction to an edge, and the occurence of each interaction between residues will be used to control the weight and thickness of each edge.
[7]:
import networkx as nx
from pyvis.network import Network
from tqdm.auto import tqdm
from matplotlib import cm, colors
from IPython.display import IFrame
[8]:
# get lig-prot interactions and distance between residues
fp = plf.Fingerprint()
fp.run(u.trajectory[::10], lig, prot)
df = fp.to_dataframe()
df.head()
[8]:
ligand | LIG1.G | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
protein | TYR38.A | TYR109.A | THR110.A | TRP125.A | LEU126.A | ASP129.A | ILE130.A | CYS133.A | ... | MET337.B | PRO338.B | ALA343.B | CYS344.B | LEU348.B | PHE351.B | ASP352.B | THR355.B | TYR359.B | |||
interaction | Hydrophobic | Hydrophobic | Hydrophobic | Hydrophobic | Hydrophobic | Hydrophobic | HBDonor | Cationic | Hydrophobic | Hydrophobic | ... | Hydrophobic | Hydrophobic | Hydrophobic | Hydrophobic | Hydrophobic | Hydrophobic | PiStacking | Hydrophobic | Hydrophobic | Hydrophobic |
Frame | |||||||||||||||||||||
0 | False | True | False | True | True | True | True | True | True | True | ... | True | True | False | False | False | True | False | True | True | True |
10 | False | True | False | True | True | True | True | True | True | True | ... | True | False | True | False | False | True | True | False | True | True |
20 | False | True | True | True | True | True | True | True | True | True | ... | True | False | True | False | False | True | False | False | False | True |
30 | True | True | True | True | True | True | True | True | True | True | ... | True | False | True | True | False | True | True | False | True | True |
40 | False | True | True | True | True | True | True | True | True | True | ... | False | False | False | False | False | True | True | True | False | True |
5 rows × 40 columns
[9]:
def make_graph(values, df=None,
node_color=["#FFB2AC", "#ACD0FF"], node_shape="dot",
edge_color="#a9a9a9", width_multiplier=1):
"""Convert a pandas DataFrame to a NetworkX object
Parameters
----------
values : pandas.Series
Series with 'ligand' and 'protein' levels, and a unique value for
each lig-prot residue pair that will be used to set the width and weigth
of each edge. For example:
ligand protein
LIG1.G ALA216.A 0.66
ALA343.B 0.10
df : pandas.DataFrame
DataFrame obtained from the fp.to_dataframe() method
Used to label each edge with the type of interaction
node_color : list
Colors for the ligand and protein residues, respectively
node_shape : str
One of ellipse, circle, database, box, text or image, circularImage,
diamond, dot, star, triangle, triangleDown, square, icon.
edge_color : str
Color of the edge between nodes
width_multiplier : int or float
Each edge's width is defined as `width_multiplier * value`
"""
lig_res = values.index.get_level_values("ligand").unique().tolist()
prot_res = values.index.get_level_values("protein").unique().tolist()
G = nx.Graph()
# add nodes
# https://pyvis.readthedocs.io/en/latest/documentation.html#pyvis.network.Network.add_node
for res in lig_res:
G.add_node(res, title=res, shape=node_shape,
color=node_color[0], dtype="ligand")
for res in prot_res:
G.add_node(res, title=res, shape=node_shape,
color=node_color[1], dtype="protein")
for resids, value in values.items():
label = "{} - {}<br>{}".format(*resids, "<br>".join([f"{k}: {v}"
for k, v in (df.xs(resids,
level=["ligand", "protein"],
axis=1)
.sum()
.to_dict()
.items())]))
# https://pyvis.readthedocs.io/en/latest/documentation.html#pyvis.network.Network.add_edge
G.add_edge(*resids, title=label, color=edge_color,
weight=value, width=value*width_multiplier)
return G
4.3.1. Regrouping all interactions¶
We will regroup all interactions as if they were equivalent.
[10]:
data = (df.groupby(level=["ligand", "protein"], axis=1)
.sum()
.astype(bool)
.mean())
G = make_graph(data, df, width_multiplier=3)
# display graph
net = Network(width=600, height=500, notebook=True, heading="")
net.from_nx(G)
net.write_html("lig-prot_graph.html")
IFrame("lig-prot_graph.html", width=610, height=510)
[10]:
4.3.2. Only plotting a specific interaction¶
We can also plot a specific type of interaction.
[11]:
data = (df.xs("Hydrophobic", level="interaction", axis=1)
.mean())
G = make_graph(data, df, width_multiplier=3)
# display graph
net = Network(width=600, height=500, notebook=True, heading="")
net.from_nx(G)
net.write_html("lig-prot_hydrophobic_graph.html")
IFrame("lig-prot_hydrophobic_graph.html", width=610, height=510)
[11]:
4.3.3. Protein-protein interaction¶
This kind of “residue-level” visualisation is especially suitable for protein-protein interactions. Here we’ll show the interactions between one helix of our G-Protein coupled receptor (transmembrane helix 3, or TM3) in red and the rest of the protein in blue.
[12]:
tm3 = u.select_atoms("resid 119:152")
prot = u.select_atoms("protein and not group tm3", tm3=tm3)
fp = plf.Fingerprint()
fp.run(u.trajectory[::10], tm3, prot)
df = fp.to_dataframe()
df.head()
[12]:
ligand | GLN119.A | ... | THR152.A | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
protein | GLY118.A | GLN189.A | ALA190.A | LYS191.A | ALA192.A | GLU193.A | GLU194.A | ... | ASP153.A | ALA154.A | GLU234.A | ARG238.A | LYS241.A | ||||||||
interaction | Hydrophobic | Hydrophobic | HBAcceptor | HBDonor | Hydrophobic | Hydrophobic | HBDonor | HBAcceptor | Hydrophobic | Hydrophobic | ... | Hydrophobic | HBDonor | HBAcceptor | Hydrophobic | Hydrophobic | HBDonor | Hydrophobic | HBAcceptor | Hydrophobic | HBAcceptor |
Frame | |||||||||||||||||||||
0 | True | False | False | False | False | False | True | False | True | True | ... | True | False | False | False | True | False | False | False | False | False |
10 | True | False | False | False | False | False | False | False | False | False | ... | True | False | False | True | True | False | False | False | False | False |
20 | True | False | False | False | False | False | False | False | True | False | ... | True | False | True | True | True | False | False | True | False | False |
30 | True | False | False | False | False | True | False | False | False | False | ... | True | False | False | False | False | False | True | False | False | False |
40 | True | False | False | False | True | False | False | False | False | False | ... | True | False | False | False | False | False | False | False | False | False |
5 rows × 232 columns
[13]:
data = (df.groupby(level=["ligand", "protein"], axis=1, sort=False)
.sum()
.astype(bool)
.mean())
G = make_graph(data, df, width_multiplier=8)
# color each node based on its degree
max_nbr = len(max(G.adj.values(), key=lambda x: len(x)))
blues = cm.get_cmap('Blues', max_nbr)
reds = cm.get_cmap('Reds', max_nbr)
for n, d in G.nodes(data=True):
n_neighbors = len(G.adj[n])
# show TM3 in red and the rest of the protein in blue
palette = reds if d["dtype"] == "ligand" else blues
d["color"] = colors.to_hex( palette(n_neighbors / max_nbr) )
# convert to pyvis network
net = Network(width=640, height=500, notebook=True, heading="")
net.from_nx(G)
net.write_html("prot-prot_graph.html")
IFrame("prot-prot_graph.html", width=650, height=510)
[13]:
4.3.4. Residue interaction network¶
Another possible application is the visualisation of the residue interaction network of the whole protein. Since this protein is a GPCR, the graph will mostly display the HBond interactions reponsible for the secondary structure of the protein (7 alpha-helices). It would also show hydrophobic interactions between neighbor residues, so I’m simply going to disable it in the Fingerprint.
[14]:
prot = u.select_atoms("protein")
fp = plf.Fingerprint(['HBDonor', 'HBAcceptor', 'PiStacking', 'Anionic', 'Cationic', 'CationPi', 'PiCation'])
fp.run(u.trajectory[::10], prot, prot)
df = fp.to_dataframe()
df.head()
[14]:
ligand | TYR38.A | ILE39.A | ... | ARG385.B | PHE386.B | LYS387.B | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
protein | TYR38.A | TYR40.A | GLN41.A | ASP42.A | ARG114.A | TRP345.B | PHE346.B | HSE347.B | ASP352.B | SER43.A | ... | LEU383.B | LYS387.B | PHE380.B | HSD381.B | ILE384.B | PHE386.B | HSD381.B | ARG385.B | ||
interaction | PiStacking | PiStacking | HBAcceptor | HBAcceptor | PiCation | PiStacking | HBDonor | PiStacking | HBDonor | HBAcceptor | ... | HBDonor | HBAcceptor | PiStacking | HBDonor | HBAcceptor | PiStacking | HBDonor | PiStacking | HBAcceptor | HBDonor |
Frame | |||||||||||||||||||||
0 | True | False | False | False | False | False | False | False | False | False | ... | True | False | False | True | False | True | False | True | False | False |
10 | True | False | False | False | False | False | True | False | False | False | ... | True | False | False | True | False | True | False | True | True | False |
20 | True | False | False | False | False | False | False | False | False | False | ... | True | False | False | True | False | False | False | True | False | False |
30 | True | False | False | False | False | False | True | True | False | False | ... | False | False | False | True | True | False | False | True | False | False |
40 | True | False | False | False | False | False | True | True | False | False | ... | False | False | False | True | False | False | False | True | False | False |
5 rows × 1337 columns
To hide most of the HBond interactions responsible for the alpha-helix structuration, I will show how to do it on the pandas DataFrame for simplicity, but ideally you should copy-paste the source code inside the fp.run
method and add the condition shown below before calculating the bitvector for a residue pair, then use the custom function instead of fp.run
. This would make the analysis faster and more memory efficient.
[15]:
# remove interactions between residues i and i±4 or less
mask = []
for l, p, interaction in df.columns:
lr = plf.ResidueId.from_string(l)
pr = plf.ResidueId.from_string(p)
if (pr == lr) or (abs(pr.number - lr.number) <= 4
and interaction in ["HBDonor", "HBAcceptor", "Hydrophobic"]):
mask.append(False)
else:
mask.append(True)
df = df[df.columns[mask]]
df.head()
[15]:
ligand | TYR38.A | TYR40.A | GLN41.A | ... | HSD381.B | PHE386.B | LYS387.B | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
protein | TYR40.A | ARG114.A | TRP345.B | PHE346.B | HSE347.B | ASP352.B | TYR38.A | HSE347.B | ASP352.B | LYS49.A | ... | PHE380.B | PHE386.B | LYS387.B | PHE380.B | HSD381.B | HSD381.B | ||||
interaction | PiStacking | PiCation | PiStacking | HBDonor | PiStacking | HBDonor | PiStacking | PiStacking | HBDonor | HBAcceptor | ... | PiStacking | HBDonor | HBAcceptor | PiStacking | HBDonor | PiStacking | HBDonor | HBAcceptor | PiStacking | HBAcceptor |
Frame | |||||||||||||||||||||
0 | False | False | False | False | False | False | False | False | True | False | ... | True | False | True | True | False | False | True | False | True | False |
10 | False | False | False | True | False | False | False | False | True | False | ... | False | False | True | True | True | False | True | False | True | True |
20 | False | False | False | False | False | False | False | False | True | False | ... | False | False | True | False | False | False | True | False | False | False |
30 | False | False | False | True | True | False | False | False | True | False | ... | True | True | True | False | False | False | True | True | False | False |
40 | False | False | False | True | True | False | False | False | True | False | ... | False | False | True | False | False | False | True | False | False | False |
5 rows × 416 columns
[16]:
data = (df.groupby(level=["ligand", "protein"], axis=1, sort=False)
.sum()
.astype(bool)
.mean())
G = make_graph(data, df, width_multiplier=5)
# color each node based on its degree
max_nbr = len(max(G.adj.values(), key=lambda x: len(x)))
palette = cm.get_cmap('YlGnBu', max_nbr)
for n, d in G.nodes(data=True):
n_neighbors = len(G.adj[n])
d["color"] = colors.to_hex( palette(n_neighbors / max_nbr) )
# convert to pyvis network
net = Network(width=640, height=500, notebook=True, heading="")
net.from_nx(G)
# use specific layout
layout = nx.circular_layout(G)
for node in net.nodes:
node["x"] = layout[node["id"]][0] * 1000
node["y"] = layout[node["id"]][1] * 1000
net.toggle_physics(False)
net.write_html("residue-network_graph.html")
IFrame("residue-network_graph.html", width=650, height=510)
[16]:
End of this notebook. If you have other suggestions for displaying interaction fingerprints, please create a new Discussion on GitHub 👍
List of files to be automatically copied to the docs: