4. Visualisation

This notebook showcases different ways of visualizing lig-prot and prot-prot interactions, either with atomistic details or simply at the residue level.

[1]:
import MDAnalysis as mda
import prolif as plf
import numpy as np
# load topology
u = mda.Universe(plf.datafiles.TOP, plf.datafiles.TRAJ)
lig = u.select_atoms("resname LIG")
prot = u.select_atoms("protein")
[2]:
# create RDKit-like molecules for visualisation
lmol = plf.Molecule.from_mda(lig)
pmol = plf.Molecule.from_mda(prot)
[3]:
# get lig-prot interactions with atom info
fp = plf.Fingerprint(["HBDonor", "HBAcceptor", "Cationic", "PiStacking"])
fp.run(u.trajectory[0:1], lig, prot)
df = fp.to_dataframe(return_atoms=True)
df.T
[3]:
Frame 0
ligand protein interaction
LIG1.G ASP129.A Cationic (13, 8)
HBDonor (52, 8)
PHE331.B PiStacking (3, 7)
SER212.A HBDonor (44, 10)
VAL201.A HBAcceptor (28, 1)

4.1. py3Dmol (3Dmol.js)

With py3Dmol we can easily display the interactions.

For interactions involving a ring (pi-cation, pi-stacking…etc.) ProLIF returns the index of one of the ring atoms, but for visualisation having the centroid of the ring looks nicer. We’ll start by writing a function to find the centroid, given the index of one of the ring atoms.

[4]:
from rdkit import Chem
from rdkit import Geometry

def get_ring_centroid(mol, index):
    # find ring using the atom index
    Chem.SanitizeMol(mol, Chem.SanitizeFlags.SANITIZE_SETAROMATICITY)
    ri = mol.GetRingInfo()
    for r in ri.AtomRings():
        if index in r:
            break
    else:
        raise ValueError("No ring containing this atom index was found in the given molecule")
    # get centroid
    coords = mol.xyz[list(r)]
    ctd = plf.utils.get_centroid(coords)
    return Geometry.Point3D(*ctd)

Finally, the actual visualisation code. The API of py3Dmol is exactly the same as the GLViewer class of 3Dmol.js, for which the documentation can be found here.

[5]:
import py3Dmol

colors = {
    "HBAcceptor": "blue",
    "HBDonor": "red",
    "Cationic": "green",
    "PiStacking": "purple",
}

# JavaScript functions
resid_hover = """function(atom,viewer) {{
    if(!atom.label) {{
        atom.label = viewer.addLabel('{0}:'+atom.atom+atom.serial,
            {{position: atom, backgroundColor: 'mintcream', fontColor:'black'}});
    }}
}}"""
hover_func = """
function(atom,viewer) {
    if(!atom.label) {
        atom.label = viewer.addLabel(atom.interaction,
            {position: atom, backgroundColor: 'black', fontColor:'white'});
    }
}"""
unhover_func = """
function(atom,viewer) {
    if(atom.label) {
        viewer.removeLabel(atom.label);
        delete atom.label;
    }
}"""

v = py3Dmol.view(650, 600)
v.removeAllModels()

models = {}
mid = -1
for i, row in df.T.iterrows():
    lresid, presid, interaction = i
    lindex, pindex = row[0]
    lres = lmol[lresid]
    pres = pmol[presid]
    # set model ids for reusing later
    for resid, res, style in [(lresid, lres, {"colorscheme": "cyanCarbon"}),
                              (presid, pres, {})]:
        if resid not in models.keys():
            mid += 1
            v.addModel(Chem.MolToMolBlock(res), "sdf")
            model = v.getModel()
            model.setStyle({}, {"stick": style})
            # add residue label
            model.setHoverable({}, True, resid_hover.format(resid), unhover_func)
            models[resid] = mid
    # get coordinates for both points of the interaction
    if interaction in ["PiStacking", "EdgeToFace", "FaceToFace", "PiCation"]:
        p1 = get_ring_centroid(lres, lindex)
    else:
        p1 = lres.GetConformer().GetAtomPosition(lindex)
    if interaction in ["PiStacking", "EdgeToFace", "FaceToFace", "CationPi"]:
        p2 = get_ring_centroid(pres, pindex)
    else:
        p2 = pres.GetConformer().GetAtomPosition(pindex)
    # add interaction line
    v.addCylinder({"start": dict(x=p1.x, y=p1.y, z=p1.z),
                   "end":   dict(x=p2.x, y=p2.y, z=p2.z),
                   "color": colors[interaction],
                   "radius": .15,
                   "dashed": True,
                   "fromCap": 1,
                   "toCap": 1,
                  })
    # add label when hovering the middle of the dashed line by adding a dummy atom
    c = Geometry.Point3D(*plf.utils.get_centroid([p1, p2]))
    modelID = models[lresid]
    model = v.getModel(modelID)
    model.addAtoms([{"elem": 'Z',
                     "x": c.x, "y": c.y, "z": c.z,
                     "interaction": interaction}])
    model.setStyle({"interaction": interaction}, {"clicksphere": {"radius": .5}})
    model.setHoverable(
        {"interaction": interaction}, True,
        hover_func, unhover_func)

# show protein:
# first we need to reorder atoms as in the original MDAnalysis file.
# needed because the RDKitConverter reorders them when infering bond order
# and 3Dmol.js doesn't like when atoms from the same residue are spread accross the whole file
order = np.argsort([atom.GetIntProp("_MDAnalysis_index") for atom in pmol.GetAtoms()])
mol = Chem.RenumberAtoms(pmol, order.astype(int).tolist())
mol = Chem.RemoveAllHs(mol)
pdb = Chem.MolToPDBBlock(mol, flavor=0x20 | 0x10)
v.addModel(pdb, "pdb")
model = v.getModel()
model.setStyle({}, {"cartoon": {"style":"edged"}})

v.zoomTo({"model": list(models.values())})

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

[5]:
<py3Dmol.view at 0x7f4314aaebe0>

4.2. Ligand Interaction Network (LigPlot)

Protein-ligand interactions are typically represented with the ligand in atomic details, residues as nodes, and interactions as edges. Such diagram can be easily displayed by calling ProLIF’s builtin class prolif.plotting.network.LigNetwork. This diagram is interactive and allows moving around the residues, as well as clicking the legend to toggle the display of specific residues types or interactions. LigNetwork can generate two kinds of depictions:

  • Based on a single specific frame

  • By aggregating results from several frames

In the latter case, the frequency with which an interaction is seen will control the width of the corresponding edge. You can hide the least frequent interactions by using a threshold, i.e. threshold=0.3 will hide interactions that occur in less than 30% of frames.

[6]:
from prolif.plotting.network import LigNetwork

fp = plf.Fingerprint()
fp.run(u.trajectory[::10], lig, prot)
df = fp.to_dataframe(return_atoms=True)

net = LigNetwork.from_ifp(df, lmol,
                          # replace with `kind="frame", frame=0` for the other depiction
                          kind="aggregate", threshold=.3,
                          rotation=270)
net.display()
[6]:
You can further customize the diagram by changing the colors in LigNetwork.COLORS or the residues types in LigNetwork.RESIDUE_TYPES. Type help(LigNetwork) for more details.
The diagram can be saved as an HTML file by calling net.save("output.html"). It is not currently possible to export it as an image, so please make a screenshot instead.
You can combine both saving and displaying the diagram with net.show("output.html").

4.3. NetworkX and pyvis

NetworkX is a great library for working with graphs, but the drawing options are quickly limited so we will use networkx to create a graph, and pyvis to create interactive plots. The following code snippet will calculate the IFP, each residue (ligand or protein) is converted to a node, each interaction to an edge, and the occurence of each interaction between residues will be used to control the weight and thickness of each edge.

[7]:
import networkx as nx
from pyvis.network import Network
from tqdm.auto import tqdm
from matplotlib import cm, colors
from IPython.display import IFrame
[8]:
# get lig-prot interactions and distance between residues

fp = plf.Fingerprint()
fp.run(u.trajectory[::10], lig, prot)
df = fp.to_dataframe()
df.head()
[8]:
ligand LIG1.G
protein TYR38.A TYR109.A THR110.A TRP125.A LEU126.A ASP129.A ILE130.A CYS133.A ... MET337.B PRO338.B ALA343.B CYS344.B LEU348.B PHE351.B ASP352.B THR355.B TYR359.B
interaction Hydrophobic Hydrophobic Hydrophobic Hydrophobic Hydrophobic Hydrophobic HBDonor Cationic Hydrophobic Hydrophobic ... Hydrophobic Hydrophobic Hydrophobic Hydrophobic Hydrophobic Hydrophobic PiStacking Hydrophobic Hydrophobic Hydrophobic
Frame
0 False True False True True True True True True True ... True True False False False True False True True True
10 False True False True True True True True True True ... True False True False False True True False True True
20 False True True True True True True True True True ... True False True False False True False False False True
30 True True True True True True True True True True ... True False True True False True True False True True
40 False True True True True True True True True True ... False False False False False True True True False True

5 rows × 40 columns

[9]:
def make_graph(values, df=None,
               node_color=["#FFB2AC", "#ACD0FF"], node_shape="dot",
               edge_color="#a9a9a9", width_multiplier=1):
    """Convert a pandas DataFrame to a NetworkX object

    Parameters
    ----------
    values : pandas.Series
        Series with 'ligand' and 'protein' levels, and a unique value for
        each lig-prot residue pair that will be used to set the width and weigth
        of each edge. For example:

            ligand  protein
            LIG1.G  ALA216.A    0.66
                    ALA343.B    0.10

    df : pandas.DataFrame
        DataFrame obtained from the fp.to_dataframe() method
        Used to label each edge with the type of interaction

    node_color : list
        Colors for the ligand and protein residues, respectively

    node_shape : str
        One of ellipse, circle, database, box, text or image, circularImage,
        diamond, dot, star, triangle, triangleDown, square, icon.

    edge_color : str
        Color of the edge between nodes

    width_multiplier : int or float
        Each edge's width is defined as `width_multiplier * value`
    """
    lig_res = values.index.get_level_values("ligand").unique().tolist()
    prot_res = values.index.get_level_values("protein").unique().tolist()

    G = nx.Graph()
    # add nodes
    # https://pyvis.readthedocs.io/en/latest/documentation.html#pyvis.network.Network.add_node
    for res in lig_res:
        G.add_node(res, title=res, shape=node_shape,
                   color=node_color[0], dtype="ligand")
    for res in prot_res:
        G.add_node(res, title=res, shape=node_shape,
                   color=node_color[1], dtype="protein")

    for resids, value in values.items():
        label = "{} - {}<br>{}".format(*resids, "<br>".join([f"{k}: {v}"
                                       for k, v in (df.xs(resids,
                                                          level=["ligand", "protein"],
                                                          axis=1)
                                                      .sum()
                                                      .to_dict()
                                                      .items())]))
        # https://pyvis.readthedocs.io/en/latest/documentation.html#pyvis.network.Network.add_edge
        G.add_edge(*resids, title=label, color=edge_color,
                   weight=value, width=value*width_multiplier)

    return G

4.3.1. Regrouping all interactions

We will regroup all interactions as if they were equivalent.

[10]:
data = (df.groupby(level=["ligand", "protein"], axis=1)
          .sum()
          .astype(bool)
          .mean())

G = make_graph(data, df, width_multiplier=3)

# display graph
net = Network(width=600, height=500, notebook=True, heading="")
net.from_nx(G)
net.write_html("lig-prot_graph.html")
IFrame("lig-prot_graph.html", width=610, height=510)
[10]:

4.3.2. Only plotting a specific interaction

We can also plot a specific type of interaction.

[11]:
data = (df.xs("Hydrophobic", level="interaction", axis=1)
          .mean())

G = make_graph(data, df, width_multiplier=3)

# display graph
net = Network(width=600, height=500, notebook=True, heading="")
net.from_nx(G)
net.write_html("lig-prot_hydrophobic_graph.html")
IFrame("lig-prot_hydrophobic_graph.html", width=610, height=510)
[11]:

4.3.3. Protein-protein interaction

This kind of “residue-level” visualisation is especially suitable for protein-protein interactions. Here we’ll show the interactions between one helix of our G-Protein coupled receptor (transmembrane helix 3, or TM3) in red and the rest of the protein in blue.

[12]:
tm3 = u.select_atoms("resid 119:152")
prot = u.select_atoms("protein and not group tm3", tm3=tm3)
fp = plf.Fingerprint()
fp.run(u.trajectory[::10], tm3, prot)
df = fp.to_dataframe()
df.head()
[12]:
ligand GLN119.A ... THR152.A
protein GLY118.A GLN189.A ALA190.A LYS191.A ALA192.A GLU193.A GLU194.A ... ASP153.A ALA154.A GLU234.A ARG238.A LYS241.A
interaction Hydrophobic Hydrophobic HBAcceptor HBDonor Hydrophobic Hydrophobic HBDonor HBAcceptor Hydrophobic Hydrophobic ... Hydrophobic HBDonor HBAcceptor Hydrophobic Hydrophobic HBDonor Hydrophobic HBAcceptor Hydrophobic HBAcceptor
Frame
0 True False False False False False True False True True ... True False False False True False False False False False
10 True False False False False False False False False False ... True False False True True False False False False False
20 True False False False False False False False True False ... True False True True True False False True False False
30 True False False False False True False False False False ... True False False False False False True False False False
40 True False False False True False False False False False ... True False False False False False False False False False

5 rows × 232 columns

[13]:
data = (df.groupby(level=["ligand", "protein"], axis=1, sort=False)
          .sum()
          .astype(bool)
          .mean())

G = make_graph(data, df, width_multiplier=8)

# color each node based on its degree
max_nbr = len(max(G.adj.values(), key=lambda x: len(x)))
blues = cm.get_cmap('Blues', max_nbr)
reds = cm.get_cmap('Reds', max_nbr)
for n, d in G.nodes(data=True):
    n_neighbors = len(G.adj[n])
    # show TM3 in red and the rest of the protein in blue
    palette = reds if d["dtype"] == "ligand" else blues
    d["color"] = colors.to_hex( palette(n_neighbors / max_nbr) )

# convert to pyvis network
net = Network(width=640, height=500, notebook=True, heading="")
net.from_nx(G)
net.write_html("prot-prot_graph.html")
IFrame("prot-prot_graph.html", width=650, height=510)
[13]:

4.3.4. Residue interaction network

Another possible application is the visualisation of the residue interaction network of the whole protein. Since this protein is a GPCR, the graph will mostly display the HBond interactions reponsible for the secondary structure of the protein (7 alpha-helices). It would also show hydrophobic interactions between neighbor residues, so I’m simply going to disable it in the Fingerprint.

[14]:
prot = u.select_atoms("protein")
fp = plf.Fingerprint(['HBDonor', 'HBAcceptor', 'PiStacking', 'Anionic', 'Cationic', 'CationPi', 'PiCation'])
fp.run(u.trajectory[::10], prot, prot)
df = fp.to_dataframe()
df.head()
[14]:
ligand TYR38.A ILE39.A ... ARG385.B PHE386.B LYS387.B
protein TYR38.A TYR40.A GLN41.A ASP42.A ARG114.A TRP345.B PHE346.B HSE347.B ASP352.B SER43.A ... LEU383.B LYS387.B PHE380.B HSD381.B ILE384.B PHE386.B HSD381.B ARG385.B
interaction PiStacking PiStacking HBAcceptor HBAcceptor PiCation PiStacking HBDonor PiStacking HBDonor HBAcceptor ... HBDonor HBAcceptor PiStacking HBDonor HBAcceptor PiStacking HBDonor PiStacking HBAcceptor HBDonor
Frame
0 True False False False False False False False False False ... True False False True False True False True False False
10 True False False False False False True False False False ... True False False True False True False True True False
20 True False False False False False False False False False ... True False False True False False False True False False
30 True False False False False False True True False False ... False False False True True False False True False False
40 True False False False False False True True False False ... False False False True False False False True False False

5 rows × 1337 columns

To hide most of the HBond interactions responsible for the alpha-helix structuration, I will show how to do it on the pandas DataFrame for simplicity, but ideally you should copy-paste the source code inside the fp.run method and add the condition shown below before calculating the bitvector for a residue pair, then use the custom function instead of fp.run. This would make the analysis faster and more memory efficient.

[15]:
# remove interactions between residues i and i±4 or less
mask = []
for l, p, interaction in df.columns:
    lr = plf.ResidueId.from_string(l)
    pr = plf.ResidueId.from_string(p)
    if (pr == lr) or (abs(pr.number - lr.number) <= 4
                      and interaction in ["HBDonor", "HBAcceptor", "Hydrophobic"]):
        mask.append(False)
    else:
        mask.append(True)
df = df[df.columns[mask]]
df.head()
[15]:
ligand TYR38.A TYR40.A GLN41.A ... HSD381.B PHE386.B LYS387.B
protein TYR40.A ARG114.A TRP345.B PHE346.B HSE347.B ASP352.B TYR38.A HSE347.B ASP352.B LYS49.A ... PHE380.B PHE386.B LYS387.B PHE380.B HSD381.B HSD381.B
interaction PiStacking PiCation PiStacking HBDonor PiStacking HBDonor PiStacking PiStacking HBDonor HBAcceptor ... PiStacking HBDonor HBAcceptor PiStacking HBDonor PiStacking HBDonor HBAcceptor PiStacking HBAcceptor
Frame
0 False False False False False False False False True False ... True False True True False False True False True False
10 False False False True False False False False True False ... False False True True True False True False True True
20 False False False False False False False False True False ... False False True False False False True False False False
30 False False False True True False False False True False ... True True True False False False True True False False
40 False False False True True False False False True False ... False False True False False False True False False False

5 rows × 416 columns

[16]:
data = (df.groupby(level=["ligand", "protein"], axis=1, sort=False)
          .sum()
          .astype(bool)
          .mean())

G = make_graph(data, df, width_multiplier=5)

# color each node based on its degree
max_nbr = len(max(G.adj.values(), key=lambda x: len(x)))
palette = cm.get_cmap('YlGnBu', max_nbr)
for n, d in G.nodes(data=True):
    n_neighbors = len(G.adj[n])
    d["color"] = colors.to_hex( palette(n_neighbors / max_nbr) )

# convert to pyvis network
net = Network(width=640, height=500,  notebook=True, heading="")
net.from_nx(G)

# use specific layout
layout = nx.circular_layout(G)
for node in net.nodes:
    node["x"] = layout[node["id"]][0] * 1000
    node["y"] = layout[node["id"]][1] * 1000
net.toggle_physics(False)

net.write_html("residue-network_graph.html")
IFrame("residue-network_graph.html", width=650, height=510)
[16]:

End of this notebook. If you have other suggestions for displaying interaction fingerprints, please create a new Discussion on GitHub 👍

List of files to be automatically copied to the docs: