Protein-Small Ligand Interactions

Docking small molecules to a protein is a fundamental step in structure-based drug design. The main approaches are (A) docking potential ligands from a compound database, and (B) mapping the protein for the binding sites of molecular probes (small molecules and functional groups) and using the favorable positions to construct larger ligands. We develop and apply algorithms for both approaches.

Our basic docking methodology is very similar to the one we have used for protein-protein docking, and consists of the following steps: (1) rigid-body search to generate a large number of conformations with good shape complementarity, and possibly favorable electrostatics and desolvation; (2) refinement, rescoring, and possibly filtering using a more accurate free energy function; and (3) clustering of the retained structures and ranking of the clusters on the basis of the average free energy. This algorithm has been implemented for the mapping of proteins using organic solvents as probes, and is being extended to more mainstream docking applications.
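Step (3) above can be illustrated with a minimal sketch. It assumes each retained conformation is reduced to a probe center and a free energy estimate, and it uses a simple greedy distance-based clustering; the `Pose` class, the function names, and the 3.0 cutoff are hypothetical choices made for illustration, not the clustering scheme actually used in our software.

```python
from dataclasses import dataclass
from math import dist
from statistics import mean
from typing import List, Tuple


@dataclass
class Pose:
    """One retained conformation (hypothetical, simplified representation)."""
    center: Tuple[float, float, float]  # probe center of mass
    energy: float                       # free energy estimate, lower is better


def cluster_poses(poses: List[Pose], radius: float = 3.0) -> List[List[Pose]]:
    """Greedy clustering: the lowest-energy unassigned pose seeds a cluster
    and absorbs every unassigned pose whose center is within `radius`."""
    remaining = sorted(poses, key=lambda p: p.energy)
    clusters = []
    while remaining:
        seed = remaining[0]
        members = [p for p in remaining if dist(p.center, seed.center) <= radius]
        clusters.append(members)
        remaining = [p for p in remaining if dist(p.center, seed.center) > radius]
    return clusters


def average_energy(cluster: List[Pose]) -> float:
    """Average free energy of the poses in one cluster."""
    return mean(p.energy for p in cluster)


def rank_clusters(clusters: List[List[Pose]], min_size: int = 1) -> List[List[Pose]]:
    """Drop clusters smaller than `min_size`, then sort by average free energy."""
    kept = [c for c in clusters if len(c) >= min_size]
    return sorted(kept, key=average_energy)
```

Ranking by the cluster average rather than by the single best pose means that an isolated low-energy conformation does not outrank a well-populated site; the `min_size` filter is one simple way to discard such singletons.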

Computational Solvent Mapping

Computational solvent mapping of proteins employs molecular probes - small molecules or functional groups - to identify the most favorable binding positions. While X-ray crystallography and NMR reveal that organic solvents bind to a limited number of sites on a protein, current mapping methods result in hundreds of energy minima. We have developed mapping algorithms that move the molecular probes around the protein surface, find favorable positions using empirical free energy functions, cluster the conformations, and rank the clusters on the basis of the average free energy. The mapping procedures reproduce the available experimental solvent mapping results, eliminating the problem of spurious local minima associated with previous computational methods. Mapping both the bound and apo forms of several proteins shows that the approach is generally much less sensitive to variations in the structure of the protein than docking methods, and it is also remarkably robust against changes in the algorithm and energy parameters[1][2][3].

Identification of Enzyme Active Sites by Computational Solvent Mapping

Computational solvent mapping is a tool for the identification and characterization of binding sites on proteins. A very important result is that, when at least six different solvents are used as probes, the consensus sites found by the mapping are always in the major subsites of the functional site; as a result, the amino acid residues that interact with the probes also bind the specific substrates of the enzyme. Thus, computational mapping provides detailed and reliable information on the functional sites of proteins.
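The idea of a consensus site can be sketched as follows. The sketch assumes that the lowest free energy clusters of each probe have already been reduced to their centers, and it groups centers from different probes that fall within a distance cutoff; the function names, the 4.0 cutoff, and the input layout are hypothetical and serve only to illustrate the principle.

```python
from math import dist
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def consensus_sites(top_clusters: Dict[str, List[Point]], radius: float = 4.0) -> List[dict]:
    """Group cluster centers from different probes into consensus sites.

    `top_clusters` maps a probe name to its cluster centers, ordered from the
    lowest to the highest average free energy. A center joins the first site
    whose reference center lies within `radius`; otherwise it opens a new site.
    """
    sites: List[dict] = []
    for probe, centers in top_clusters.items():
        for rank, center in enumerate(centers, start=1):
            for site in sites:
                if dist(site["center"], center) <= radius:
                    site["members"].append((probe, rank))
                    break
            else:
                sites.append({"center": center, "members": [(probe, rank)]})
    # Sites that attract the largest number of distinct probes come first.
    sites.sort(key=lambda s: len(probes_in(s)), reverse=True)
    return sites


def probes_in(site: dict) -> set:
    """Distinct probe names that have a cluster in the given site."""
    return {probe for probe, _ in site["members"]}
```

Each site records which probes bind there and the rank of the corresponding cluster, which is the kind of information tabulated for thermolysin below.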

Here we describe the computational mapping of thermolysin, for which experimental mapping results are available, and the application of the algorithm to six other enzymes that have no experimental mapping data, but whose binding sites are well characterized.

Thermolysin (2tlx)

Thermolysin was mapped both experimentally and computationally, using isopropanol, acetone, acetonitrile, and phenol as probes.

Ranking of probe clusters within the consensus sites for thermolysin, obtained by superimposing the five lowest free energy clusters for each probe

Consensus Site    Probe
                  Isopropanol    Phenol       Acetone      Acetonitrile
1 (S1')*          2 (0.69)       2 (0.69)     4 (0.75)     3 (0.32)
2 (S1)*           -              1 (3.52)     2 (3.74)     1 (3.88)
3                 -              4 (17.28)    1 (18.10)    2 (17.42)

The distance of each cluster center from the isopropanol in experimental mapping is shown in parentheses. The two consensus sites in the substrate binding region are marked with an asterisk.

Thermolysin Experimental Structure: Thermolysin structure co-crystallized with the V-K dipeptide (2tlx), with the results of experimental mapping superimposed, i.e., the ligand positions in structures solved in 10% isopropanol, 50% acetone, 50% acetonitrile, and 50 mM phenol. The color scheme used for the ligands is ochre, V-K dipeptide; red, isopropanol (IPA); blue, acetone (ACN); black, acetonitrile (CCN); and purple, phenol (IPH). For the protein side chains we use the standard atomic colors, i.e., carbon, grey; oxygen, red; nitrogen, blue; and hydrogen, white. All solvents bind in the S1’ pocket (IPA1, ACN1, CCN1, and IPH1), and isopropanol also binds at the S1 site (IPA5). The S1’ pocket is found to be the only location that binds all four probes (Site 1 in Table above), with additional clustering of isopropanol, phenol, and acetone close to the S1 subsite (Site 2 in Table above).

Thermolysin Computational Mapping: results of computational mapping for thermolysin. The main consensus site is in the S1’ pocket that binds all four solvents (Site 1 in Table above), and the second consensus site is close to S1, which binds three solvents (Site 2 in Table above).

SAS provides the nonbonded receptor-ligand interactions and hydrogen bonds found in multiple thermolysin structures in the PDB that have been co-crystallized with different ligands. These contacts can be compared to the contacts from computational mapping (see the SAS results for 2tlx).

Thermolysin Nonbonded Contacts: Distribution of intermolecular nonbonded interactions among thermolysin residues. The interactions were determined from three sources: computational mapping, 23 complexes of thermolysin with different ligands in the PDB, and experimental mapping. Computational mapping results are based on the interactions found between various thermolysin residues and the probes in the main consensus site.

Thermolysin Hydrogen Bonds: The same comparison for the distribution of hydrogen bonds.

Predicted and Consensus Interaction Sites in Enzymes (PRECISE)

PRECISE will provide query and visualization tools for the comparative analysis of the interactions extracted from the structures of enzymes and their complexes with ligands (substrate and transition state analogues, cofactors, inhibitors, and products). For each enzyme, we derive a consensus binding site by aligning all homologous sequences, identifying the residue positions that are important for the binding of any ligand, and assessing the roles of the amino acids at these positions. The identification of consensus residues is based on three sources: (1) relevant enzyme-ligand complex structures in the PDB, (2) computational solvent mapping, and (3) interactions submitted to the database.
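The mapping of ligand contacts onto alignment positions can be sketched as below. This is only an illustration under simplifying assumptions: sequences are already aligned with "-" as the gap symbol, each source of interactions is reduced to a set of 1-based residue indices, and a position counts as consensus when at least a given fraction of the sequences contact a ligand there. The function names and the `min_fraction` parameter are hypothetical and do not describe the PRECISE implementation.

```python
from collections import Counter
from typing import Dict, List, Set


def residue_to_column(aligned_seq: str) -> Dict[int, int]:
    """Map 1-based residue indices (gaps excluded) to 0-based alignment columns."""
    mapping = {}
    residue_index = 0
    for column, symbol in enumerate(aligned_seq):
        if symbol != "-":
            residue_index += 1
            mapping[residue_index] = column
    return mapping


def consensus_columns(
    alignment: Dict[str, str],
    contacts: Dict[str, Set[int]],
    min_fraction: float = 0.5,
) -> List[int]:
    """Return the alignment columns contacted in at least `min_fraction`
    of the sequences that have contact data."""
    counts: Counter = Counter()
    for name, residues in contacts.items():
        to_column = residue_to_column(alignment[name])
        # A set ensures each sequence votes at most once per column.
        counts.update({to_column[r] for r in residues})
    threshold = min_fraction * len(contacts)
    return sorted(col for col, n in counts.items() if n >= threshold)
```

Working in alignment columns rather than raw residue numbers lets contacts from homologous enzymes with insertions or deletions vote for the same position.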

Docking and Scoring: Applications to PPAR-alpha and PPAR-gamma Receptors

Peroxisome proliferators form a diverse group of substances, including many environmental chemicals, which induce a massive accumulation of peroxisomes in hepatocytes and strongly induce enzymes of the peroxisomal and microsomal fatty-acid oxidation systems. Peroxisome proliferator-activated receptors (PPARs) belong to a family of nuclear hormone receptors which, depending on the bound ligand, can directly regulate gene transcription via interaction with response elements. Our goals are to study the binding of environmental chemicals, primarily industrial plasticizers such as phthalates, to PPARs, and to identify further possible binders.

While the work has important public health implications (phthalates are major environmental contaminants in water, food, and soil), it also addresses interesting algorithmic and computational problems. PPARs have a large binding site, and the different classes of ligands (i.e., full agonists versus partial agonists and antagonists) bind in different regions. Since binding induces conformational changes that directly affect receptor activation and gene expression patterns, it is important to be able to accurately identify the binding modes of potential proliferators. This work is in collaboration with Professors David Waxman (BU Department of Biology) and Scott Mohr (BU Department of Chemistry).


References

  1. Dennis S, Kortvelyesi T, Vajda S. 2002. Computational mapping identifies the binding sites of organic solvents on proteins. Proceedings of the National Academy of Sciences. 99(7):4290-4295.
  2. Kortvelyesi T, Dennis S, Silberstein M, Brown L, Vajda S. 2003. Algorithms for computational solvent mapping of proteins. Proteins: Structure, Function, and Genetics. 51(3):340-351.
  3. Kortvelyesi T, Silberstein M, Dennis S, Vajda S. 2003. Improved mapping of protein binding sites. Journal of Computer-Aided Molecular Design. 17(2/4):173-186.