WO1996034347A1 - Method for identifying structurally active compounds using conformational memories - Google Patents
Method for identifying structurally active compounds using conformational memories
- Publication number
- WO1996034347A1 (application PCT/US1996/006110)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- conformational
- gnrh
- dihedral
- sampling
- memories
- Prior art date
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
Definitions
- The present invention relates to a method for predicting the conformation and functionality of a molecule, comprising the steps of, first, performing multiple simulated annealing runs in order to reveal populated and unpopulated regions of multidimensional conformation space, and, second, performing a simulation at a fixed temperature, with sampling only from populated regions found in the first step.
- Regions that are more "important" are sampled with greater frequency in the full expectation that the speed of convergence will be enhanced.
- A simulation employing biased sampling is done in two steps: some initial procedure is employed to reveal (or guess) the important gross features, then an extensive search utilizing this information is performed.
- The present invention relates to a method for identifying structurally active molecules comprising the steps of, first, performing multiple simulated annealing runs in order to reveal populated and unpopulated regions of multidimensional conformation space, and, second, performing a simulation at a fixed temperature, with sampling only from populated regions found in the first step. It is based, at least in part, on the discovery that the method of the invention could be used to sort a large family of analogs of gonadotrophin releasing hormone ("GnRH analogs") into groups having low or high affinity for the GnRH receptor.
- GnRH analogs: analogs of gonadotrophin releasing hormone
- the method of the present invention offers the advantage that, since the simulated annealing runs quickly reveal unpopulated regions of the conformation space, the volume of conformation space that needs to be sampled in the second phase of the algorithm is reduced by many orders of magnitude. Additionally, since no energy minimization is used, these populations represent a canonical ensemble which may be used to estimate conformational free energies.
- Figure 2. A Flex-Map or "Conformational Memory" of dihedral 1 from LTB4.
- Figure 3. Graphical representation of the mapping of random numbers onto the dihedral distribution. Random numbers between 0 and 1 determine dihedral values (using the line). For example, 0.6 maps to +65°.
- Figure 4. Plot of the average LTB4 conformer energy vs. temperature for ten normal SA runs and ten Smart-SA runs. Very rapid energy lowering is possible using Smart-SA, although the ultimate energy is similar. Each 10-run set required ~16 hrs of CPU time on a Vax 8600 and represents 100,000 conformers each.
- Figure 5. Molecular structure of the gonadotropin-releasing hormone (GnRH). The 35 rotatable torsional angles are indicated by arrows.
- Figure 7. Conformational Memory difference maps of dihedral angle 20 in GnRH (see Fig. 5*).
- The difference maps were created by subtracting the Conformational Memories from A. 25 and 10 runs, B. 50 and 25 runs, C. 75 and 50 runs, D. 100 and 75 runs. Note that in almost all regions differences are within about 1%. Conformational Memory difference maps of the other dihedrals are very similar.
- Figure 8. A sequence of 310K temperature slices from the Conformational Memory of bond 17, calculated with a. 25 runs; b. 50 runs; c. 75 runs; d. 100 runs. Note the symmetrically equivalent population distributions centered about -90 and 90.
- Figure 9. The choice of dihedral angle values in the biased sampling of the populated region of the Conformational Memories.
- The illustration is for dihedral angle 19.
- Panel (a) shows a histogram representation of the probability distribution for the dihedral angle; panel (b) shows the cumulative probability distribution for dihedral angle 19. Because a uniform random number between 0 and 1 corresponds to a value of the cumulative probability distribution, biased sampling is done from the histogram in part (b). If the random number 0.2 is generated, which corresponds to the second block of the histogram in part (b), the new trial dihedral will be chosen from the interval -170 to -160 degrees, with the actual value obtained from a linear interpolation within this interval.
- Figure 12. Backbone trace of a representative of the two conformational families of Lys8-GnRH obtained from Conformational Memories.
- The structure with the beta-type turn comes from a family with an approximately 3% population.
- The structure with the straight backbone comes from a family with an approximately 70% population.
- Figure 13. Superimposition of 70 structures that make up the major conformational family of Lys8-GnRH obtained from Conformational Memories. While there is a large amount of fluctuation in the backbone, and an even greater amount of fluctuation in the side chains (especially Lys8), the backbone is clearly extended.
- Figure 14. Backbone trace of a representative of the two conformational families of Lys8-GnRH obtained from Conformational Memories.
- The structure with the beta-type turn comes from a family with an approximately 3% population.
- The structure with the straight backbone comes from a family with an approximately 70% population.
- The present invention is described by way of two examples.
- First, the method of the invention is applied to determining the structure of leukotrienes.
- Second, the method of the invention is used to identify GnRH analogs which have a high binding affinity for the GnRH receptor.
- Leukotrienes, for example, are an important class of natural antiinflammatory agents (Samuelsson et al., Prostaglandins, 17:785 (1979); "The Leukotrienes: Their Biological Significance," P.J. Piper, Ed., Raven Press, N.Y., (1986)). Understanding the bioactive conformations of a key member of this class, such as LTB4 (Figure 1), involves conformational analysis of 14 flexible dihedrals.
- The method is a 2-stage process made up of a learning phase and an implementation phase.
- The learning phase starts by randomly sampling the dihedral space of all flexible bonds using the simulated annealing algorithm.
- Our algorithm requires information on the number of states, and on the interval and population of each state. (In the example, the number of states is actually four instead of three because dihedral space goes from -180 to +180.)
- The shaded regions are the "dead zones" and thus are never sampled. Since the first region has a 1/3 probability of being surveyed, if the generated random number is between zero and 1/3, the new dihedral is selected from this first region.
- The identity of the rotated dihedrals, the extent of the rotation, the energy of the trial configuration, and either the new dihedral values (if the trial conformation is accepted) or the old dihedral values (if it is rejected) are recorded to the log file in temperature blocks.
- One log file is created for each run.
- A utility program inputs all of this raw data and combines it according to temperature blocks. This data is output in comma-delimited format so that it can be imported into Deltagraph (Deltagraph TM version 1.0, Copyright Deltapoint, Inc., 200 Heritage Harbor, Suite G, Monterey, CA 93940), which is used to plot the conformational memories.
- A temperature slice from the conformational memory is extracted.
- The sampling is done from a subroutine that performs the calculation shown in Figure 3 instead of just using the standard random number generator.
- the learning phase of the simulation reveals that about 60% of the entire conformational space is unpopulated "dead zones" at 200K. Going into the implementation phase of the simulation, we were able to reduce the volume of the conformation space that needed to be sampled by many orders of magnitude.
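As a rough illustration of why excluding dead zones shrinks the search volume so sharply, assume (this is an assumption made here for illustration, not a figure stated in the text) that the roughly 60% unpopulated fraction applies independently to each of the 14 flexible dihedrals of LTB4. The fraction of the full dihedral space left to sample would then be about:

```latex
f = (1 - 0.6)^{14} = 0.4^{14} \approx 2.7 \times 10^{-6}
```

Under that assumption the restricted space is between five and six orders of magnitude smaller than the unrestricted one, which is in line with the "many orders of magnitude" quoted above.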
- The key physiological role of the gonadotropin-releasing hormone ([pGlu1-His2-Trp3-Ser4-Tyr5-Gly6-Leu7-Arg8-Pro9-Gly10-NH2]; GnRH) as a mediator of neuroendocrine regulation in the mammalian reproductive system has made it the object of intense study for several decades.
- The ability of GnRH and its analogs to modulate the pituitary-gonadal axis has made them essential therapeutic agents in the treatment of a variety of disorders ranging from infertility to prostatic carcinoma (Casper, Can Med Assoc J. 144:153-160 (1991); Barbieri, Trends Endocrinol Metab 2:30-34).
- For GnRH, we have used the recently developed technique of conformational memories (Wilson and Guarnieri, Tetrahedron Lett., 22:3601 (1991)). Here we show that application of this technique can yield converged dihedral populations of all 35 rotatable bonds of the peptide GnRH with no approximations. Samples from the conformational memories using the conformational memory biased sampling technique were used to characterize the conformational families of GnRH and several of its analogs, in an aqueous environment modeled with the generalized Born/surface area (GB/SA) method.
- GB/SA: generalized Born/surface area
- The simulation technique of conformational memories is a two-stage process consisting of an exploratory phase and a biased sampling phase.
- In the exploratory phase, repeated runs of Monte Carlo simulated annealing (MC/SA) (Kirkpatrick et al., Science, 220:671 (1983)) are carried out in order to map out the entire conformational space of the flexible molecule.
- MC/SA: Monte Carlo simulated annealing
- Trial conformations in the MC/SA routine were generated by randomly picking 2 rotatable bonds from among the 35, rotating each bond by a random value within +/-180 degrees, and accepting or rejecting the trial conformation according to the standard Metropolis (Metropolis et al., J. Chem. Phys., 21:1087 (1953)) criteria with a Boltzmann probability function defined at the given temperature. After each step, whether the conformation was accepted or rejected, the data for the rotated bonds, the extent of rotation, the energy, and the value of the dihedral angles were recorded to a "log file". An example of the output to the log file is given in Table 1.
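The trial-move procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's program: `toy_energy` is a hypothetical stand-in for the real force field and solvent model, the function names and the log-record layout are invented for the example, and the Boltzmann constant is given in kcal/(mol K).

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

def metropolis_accept(delta_e, temperature, rng):
    """Standard Metropolis test for an energy change delta_e (kcal/mol)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / (K_B * temperature))

def toy_energy(dihedrals):
    # Hypothetical energy: a threefold torsional term per bond (NOT a real force field).
    return sum(1.0 + math.cos(math.radians(3.0 * phi)) for phi in dihedrals)

def wrap(angle):
    # Map an angle in degrees back onto the interval -180..180.
    return (angle + 180.0) % 360.0 - 180.0

def mc_step(dihedrals, energy, temperature, rng, energy_fn=toy_energy):
    """One trial move: rotate 2 randomly chosen bonds by random amounts within +/-180."""
    bonds = rng.sample(range(len(dihedrals)), 2)
    rotations = [rng.uniform(-180.0, 180.0) for _ in bonds]
    trial = list(dihedrals)
    for bond, rotation in zip(bonds, rotations):
        trial[bond] = wrap(trial[bond] + rotation)
    trial_energy = energy_fn(trial)
    accepted = metropolis_accept(trial_energy - energy, temperature, rng)
    if accepted:
        dihedrals, energy = trial, trial_energy
    # One record per rotated bond, written whether or not the move was accepted.
    log = [(int(accepted), bond, rotation, energy, dihedrals[bond])
           for bond, rotation in zip(bonds, rotations)]
    return dihedrals, energy, log
```

Calling `mc_step` repeatedly while lowering `temperature` through a fixed schedule gives a simulated annealing run; each call yields two log records, one per rotated bond.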
- the first group of entries corresponding to the first two lines, is the result of a rejected step as indicated by the zeros in the first column.
- The second and third columns identify the atom numbers of the bonds that were rotated to create the trial move (in this example atom40-atom41 and atom47-atom48).
- the fourth column lists the extent of rotation of the torsion angle in degrees.
- the fifth column lists the total energy of the structure.
- the sixth column holds the current dihedral value of the bond.
- the current dihedral values given in the last column are the new values of the newly accepted conformation.
- Each run of MC/SA consists of a random walk of 190,000 steps (19 temperatures, 10,000 steps per temperature). Because two lines of data are added to the log file for each Monte Carlo step, a single run creates a file of 380,000 lines.
- A 157-run MC/SA simulation requires about 12 days of computation on an SGI Challenge 200 MHz workstation.
- The log files are used as input to a program (called Flex) that sorts, merges, and compacts the data in several ways. Since the simulations were done at 19 temperatures for each peptide, application of Flex first sorts and merges the data from all log files into 19 temperature blocks. Subsequently, within each temperature block, the data are partitioned into 35 bond blocks, one for each rotatable bond. For each rotatable bond, the dihedral angle space is partitioned into 36 ten-degree intervals. From each line of data for a given bond at a given temperature, the program records the number of times that the bond dihedral angle value falls in each of the ten-degree buckets; the resulting population table is a "Conformational Memory".
- The Flex program produces a 19x36 spreadsheet (19 temperatures by 36 ten-degree dihedral intervals, with normalized populations) for each of the 35 rotatable bonds of the GnRH peptide.
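The bucketing step attributed to Flex can be illustrated with a short Python sketch. It is an assumption-laden reconstruction for one rotatable bond: the function names, the `(temperature, angle)` record layout, and the choice to fold +180 into the first bucket are all hypothetical and not taken from the Flex program itself.

```python
def bucket_index(angle, width=10.0):
    """Index 0..35 of the ten-degree bucket; bucket 0 covers -180 to -170 degrees."""
    n_buckets = int(360.0 / width)
    return int(((angle + 180.0) % 360.0) // width) % n_buckets

def build_memory(records, temperatures, n_buckets=36):
    """Build one Conformational Memory (temperatures x buckets) for a single bond.

    records: iterable of (temperature, dihedral_angle_in_degrees) pairs.
    Returns one row of normalized populations per entry in `temperatures`.
    """
    counts = {t: [0] * n_buckets for t in temperatures}
    for temperature, angle in records:
        counts[temperature][bucket_index(angle)] += 1
    memory = []
    for t in temperatures:
        total = sum(counts[t])
        memory.append([c / total if total else 0.0 for c in counts[t]])
    return memory
```

Running `build_memory` once per rotatable bond, with 19 temperatures, would give the 35 tables of 19x36 normalized populations described above.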
- An excerpt of one of these spreadsheets is given in Table 2.
- The spreadsheets are imported into Deltagraph (TM Version 1.0, Copyright Deltapoint, Inc., 200 Heritage Harbor, Suite G, Monterey, CA 93940 (1987)) for plotting; graphical representations of the data in the spreadsheets are given in Figures 6A-D.
- Across the top of the spreadsheet are the dihedral angle values from -170° to 180°, which label the y-axes of Figures 6A-D (note that the spreadsheet fragment is cut off at -100).
- The thirty-five conformational memories provide a complete mapping of the conformational space of GnRH with no approximations, as long as the calculated populations are converged.
- Population convergence was identified as the difficult and crucial aspect of forming conformational memories (Wilson and Guarnieri, Tetrahedron Lett., 22:3601 (1991)).
- The biased sampling explores only the parts of the conformational space identified as populated regions.
- Population convergence ensures that regions that could be thermally accessible are not erroneously labeled as being unpopulated.
- the correct identification of the populated regions is essential for the second phase of the simulation, because the biased sampling only explores populated regions of the conformational space.
- Population convergence for GnRH was confirmed in three different ways: by creating conformational memory difference maps for simulations of different length; by analyzing intrinsic symmetry; and by showing that there is no significant difference in the populations of actual structures of GnRH created from
- FIG. 3 shows the Conformational Memory difference maps for dihedral angle 1, comparing simulation lengths of 10, 25, 50, 75 and 100 runs.
- the difference map in Figure 3a is created by subtracting the Conformational Memory obtained from a 25 run MC/SA simulation from a 10 run MC/SA simulation, in Figure 3b the difference is between 50 and 25 runs, Figure 3c shows the difference between 75 and 50 runs, and Figure 3d is the difference map between 100 and 75 runs.
- the progression clearly shows the convergence.
- the other dihedral angles have very similar difference maps for this sequence of comparisons.
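A difference map of this kind is an element-wise subtraction of two Conformational Memories of the same shape. The sketch below assumes each memory is a list of rows (one per temperature) of normalized bucket populations; the function names are hypothetical, and the summary statistic is just one convenient way to put a number on convergence.

```python
def difference_map(memory_a, memory_b):
    """Element-wise difference between two memories (temperatures x buckets)."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(memory_a, memory_b)]

def max_abs_difference(diff):
    """Largest absolute population difference anywhere in the map."""
    return max(abs(x) for row in diff for x in row)

# Made-up 2x3 memories standing in for shorter and longer simulations.
shorter = [[0.50, 0.30, 0.20], [0.10, 0.60, 0.30]]
longer = [[0.48, 0.31, 0.21], [0.10, 0.61, 0.29]]
diff = difference_map(longer, shorter)
largest = max_abs_difference(diff)
```

As runs are accumulated, `largest` would be expected to shrink toward zero if the populations are converging.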
- A second measure of convergence is symmetry.
- Because dihedral angle 17 has a 2-fold axis of symmetry, it is expected that the dihedral space of this bond will have symmetric population distributions centered at -90 and 90 degrees.
- A temperature slice at 310K of this dihedral for 25, 50, 75 and 100 run MC/SA simulations is shown in Figure 8. The population distributions clearly conform to the symmetry considerations.
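The symmetry test can be made quantitative. For a bond with a 2-fold axis, a converged population at angle θ should match the population at θ + 180°, which for 36 ten-degree buckets means comparing bucket i with bucket i + 18. The sketch below is an illustrative check under that assumption; the function name and the made-up slices are not from the source.

```python
def twofold_asymmetry(populations):
    """Largest mismatch between each bucket and the bucket 180 degrees away."""
    n = len(populations)
    half = n // 2
    return max(abs(populations[i] - populations[(i + half) % n]) for i in range(n))

# Made-up 310K slices: bucket 9 covers -90..-80 and bucket 27 covers 90..100.
symmetric = [0.0] * 36
symmetric[9] = 0.5
symmetric[27] = 0.5

lopsided = [0.0] * 36
lopsided[9] = 0.7
lopsided[27] = 0.3
```

Here `twofold_asymmetry(symmetric)` is zero, while the lopsided slice gives a mismatch of about 0.4, which would suggest the two equivalent wells have not been sampled evenly.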
- The third indication of convergence is the finding (see below) that biased sampling from Conformational Memories created from 25, 50, 75, 100 and 157 MC/SA runs yields very similar profiles of GnRH.
- Table 3 is an excerpt of the probability matrix for GnRH at 310K.
- The dimension of this probability matrix is 35x36 for the 35 rotatable bonds partitioned into 36 buckets over the 360 degree dihedral space (note that only 11 of the 36 dihedral buckets and only 16 of the 35 rotatable torsional angles are shown in Table 3).
- the first line indicates that at 310K bond 1 is found in the -180 to -170 dihedral interval 10.1% of the time.
- the seventh column of the first row indicates that bond 1 is never found in the dihedral interval -120 to -110 at 310K.
- the two stage process of developing Conformational Memories and then performing the biased sampling from these distributions is necessary in order to sample the entire conformation space of the molecule.
- An obviously simpler alternative would be to limit the conformational exploration to standard Metropolis Monte Carlo at 310K and monitor the development of the random walk over torsional space.
- this simulation constitutes the last step in the development of the Conformational Memories for the temperature of 310K; it is clearly inadequate, as indicated by the acceptance rate.
- The acceptance rate is about 28% at 2070K, with a step size chosen randomly within the interval of +/-180 degrees and two randomly selected dihedrals rotated at each step.
- The acceptance rate falls below 2%. Therefore, the sampling of the 35-dimensional dihedral space would be incomplete if these parameters were used for the Monte Carlo random walk procedure at 310K.
- Sampling would still be insufficient because the majority of new conformations would be in the local area of the previous conformation.
- The +/-180 degree step size was deliberately chosen so that new conformations can be created by jumping between wells without having to climb over barriers. A single simulated annealing run cannot be expected to cover such a vast space, but the accumulation of multiple runs, each performing a different random walk, can be shown to converge, as illustrated in Figures 7 and 8.
- The restriction of the sampling to the populated regions identified in the previous step is achieved by partitioning the 0-1 interval of the random number generator into the 36 parts which correspond to the 36 separate 10-degree intervals for each rotatable dihedral angle.
- the partitioning of the random number generator is proportional to the population of the 10-degree bucket.
- New biased trial conformations are generated by randomly choosing two rotatable bonds, generating a new random number for each bond, determining to which of the 36 intervals each new random number for each bond belongs, and driving the dihedrals to the appropriate intervals.
- the exact value of the new dihedral is determined by a linear interpolation. This procedure is illustrated in Figure 9.
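The partitioning and interpolation just described amount to inverse-transform sampling from a histogram. The following Python sketch shows one way to do it for a single bond; the function names and the example populations are hypothetical, and the code is an illustration of the idea rather than the patent's subroutine.

```python
import bisect
import itertools
import random

def cumulative(populations):
    """Running sum of the normalized bucket populations."""
    total = sum(populations)
    return list(itertools.accumulate(p / total for p in populations))

def sample_dihedral(populations, u, width=10.0, start=-180.0):
    """Map a uniform random number u in [0, 1) to a dihedral angle in degrees.

    Buckets with zero population occupy no part of the 0-1 interval,
    so they are never selected.
    """
    cdf = cumulative(populations)
    i = bisect.bisect_right(cdf, u)  # first bucket whose cumulative value exceeds u
    if i >= len(cdf):  # guard against rounding when u is extremely close to 1
        i = max(k for k, p in enumerate(populations) if p > 0)
    lower = cdf[i - 1] if i > 0 else 0.0
    frac = min(max((u - lower) / (cdf[i] - lower), 0.0), 1.0)
    return start + width * (i + frac)  # linear interpolation inside the bucket

# Made-up populations: bucket 2 (-160..-150) is a dead zone.
populations = [0.10, 0.15, 0.0, 0.75] + [0.0] * 32
angle = sample_dihedral(populations, 0.2)
```

With these illustrative populations the random number 0.2 lands in the second bucket, so `angle` falls between -170 and -160 degrees, mirroring the Figure 9 example.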
- A major advantage of the Conformational Memory biased sampling method is that partitioning the random number generator among the populated intervals results in a sampling technique that eliminates the barrier-crossing problem.
- A new trial configuration is sampled from the Conformational Memory, which can be any part of the populated dihedral space, and then the trial conformation is created by driving the current structure to the appropriate configuration.
- The notion of a barrier restricting access to any part of the conformational space is eliminated in this procedure.
- Because Conformational Memories are mean-field population distributions, the correlations among the different flexible torsional angles have been submerged in the averaging process.
- the Conformational Memory biased sampling technique does preferentially bring together the higher probability regions of the different dihedrals.
- the method introduces average correlations among the different dihedral angles during the selection process, while accessing all populated regions. It is important to note that the original formulation of the Conformational Memories biased sampling technique (Guarnieri and Wilson, J. Comput. Chem 16:648-653 (1995)) violates detailed balance.
- The first run was a 10,000 step MC random walk using the Conformational Memory biased sampling technique with uniform sampling of 100 structures (1 sample every 100 steps).
- the second run was a 50,000 step MC random walk using the Conformational
- Xcluster then produces a graphical representation of the RMS deviations between every pair of conformers. Since the conformations have been rearranged so that the RMS deviation between nearest neighbors is minimized, any large jump in RMS deviation between nearest neighbors is indicative of a large structural change and hence identifies a new conformational family. As described below, we settled on 500,000 steps for the subsequent biased sampling runs. We then performed these biased sampling runs using Conformational Memories created from 25, 50, 75, 100 and 157 run MC/SA simulations.
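The idea of reading family boundaries off jumps in nearest-neighbor RMS deviation can be sketched as follows. This is not Xcluster's algorithm or interface: it assumes the conformers have already been ordered, and both the function name and the jump threshold are hypothetical choices made for illustration.

```python
def split_into_families(neighbor_rms, jump_threshold):
    """Split an ordered list of conformers into families at large RMS jumps.

    neighbor_rms[i] is the RMS deviation between ordered conformers i and i+1.
    Returns (first_index, last_index) ranges, one per family.
    """
    families = []
    start = 0
    for i, rms in enumerate(neighbor_rms):
        if rms > jump_threshold:
            families.append((start, i))
            start = i + 1
    families.append((start, len(neighbor_rms)))
    return families

# Made-up deviations (in angstroms) between six ordered conformers.
families = split_into_families([0.3, 0.4, 2.5, 0.2, 0.3], jump_threshold=1.0)
```

In this made-up case the single large jump separates conformers 0-2 from conformers 3-5, giving two families.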
- The Lys8 analog of GnRH had been constructed to explore the role of Arg8 in molecular recognition of GnRH by its receptor (Schwn and Rivier, Endo. Rev. 1:44-66 (1986); Millar et al., J. Biolog. Chem. 264:21007-21013 (1989)). Mutation studies of GnRH receptors from various species have implicated Arg8 as being important for mammalian hormone-receptor recognition (Flanagan et al., J. Biolog. Chem. 269:22636-22641 (1994)).
- The Lys8-GnRH family that has a beta-type turn conformation of the backbone (Figure 12b), which is virtually identical to the major conformational family of GnRH (Figure 10a), has a probability of only about 3%.
- A distribution of the members of the predominant Lys8-GnRH family superimposed upon each other is shown in Figure 13, with the entire molecule shown in red, except for Lys8, which is colored green. Because Lys8-GnRH has a low affinity for the GnRH receptor, but elicits the same response once it interacts with the receptor, this is believed to suggest that adoption of a large population of beta-type turn conformations is a key requirement for hormone-receptor recognition.
- Conformational regions that exhibit 0% population in the calculation of the isolated peptide in water at 310K may still be of biological importance, if some of these conformations can be induced by the interaction energies of the peptide with the receptor.
- the finding that regions unpopulated at 310K are in fact populated at temperatures higher by only 100K (corresponding to an energy difference of only a fraction of a kcal/mol) indicates the feasibility of such "receptor-induced" conformations.
- the method was shown to be capable of achieving complete sampling of the conformational space, to converge in a very practical number of steps, and to be capable of overcoming energy barriers efficiently.
- the second column lists the pair of atom numbers identifying the dihedral angles that were rotated to produce the trial structure.
- the third column lists the extent to which the dihedral was rotated in order to create the trial structure.
- the fourth column lists the energy of the current conformation (the energy of the original structure if rejected or the new structure if accepted).
- the fifth column lists the current dihedral values of the conformation (the dihedral angle of the original structure if rejected or the new structure if accepted).
- a sample of a conformational memory spreadsheet. The first row labels the dihedral circle across the y-axis. The first column labels the temperatures across the x-axis. Each cell contains the population corresponding to a given temperature and a given 10 degree dihedral bucket, which is plotted on the z-axis. Note that the columns of the spreadsheet are cut off after -40 degrees. Table 3
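The spreadsheet layout and the biased sampling idea described in the excerpts above can be illustrated with a minimal Python sketch. It is not the implementation used in the patent: the function names (`bin_index`, `build_memory`, `populated_bins`, `sample_angle`), the dictionary layout, and the choice to weight each 10 degree bucket by its raw population are all assumptions made for illustration. Only the 10 degree bucket width, the per-temperature populations, and the restriction of sampling to populated regions come from the text.

```python
import random

BIN_WIDTH = 10               # degrees; the 10 degree dihedral bucket from the text
N_BINS = 360 // BIN_WIDTH    # 36 buckets covering [-180, 180)

def bin_index(angle_deg):
    """Map a dihedral angle in degrees to a bucket index over [-180, 180)."""
    wrapped = (angle_deg + 180.0) % 360.0
    # Guard against floating-point rounding that could yield index N_BINS.
    return min(int(wrapped // BIN_WIDTH), N_BINS - 1)

def build_memory(samples):
    """Accumulate (temperature, angle) pairs into per-temperature bucket counts.

    Hypothetical layout: {temperature: [population of bucket 0, ..., bucket 35]}.
    """
    memory = {}
    for temperature, angle in samples:
        row = memory.setdefault(temperature, [0] * N_BINS)
        row[bin_index(angle)] += 1
    return memory

def populated_bins(memory, temperature):
    """Return the indices of buckets with non-zero population."""
    return [i for i, count in enumerate(memory[temperature]) if count > 0]

def sample_angle(memory, temperature, rng):
    """Draw a trial dihedral value from the populated buckets only.

    Assumption: buckets are weighted by population and the angle is drawn
    uniformly inside the chosen bucket. Buckets with zero population have
    zero weight and are therefore never selected.
    """
    row = memory[temperature]
    i = rng.choices(list(range(N_BINS)), weights=row, k=1)[0]
    low = -180 + i * BIN_WIDTH
    return low + rng.random() * BIN_WIDTH

if __name__ == "__main__":
    # Made-up observations for one dihedral, purely for demonstration.
    observations = [(310, -175.0), (310, -171.0), (310, 65.0), (410, 5.0)]
    memory = build_memory(observations)
    rng = random.Random(0)
    print(populated_bins(memory, 310))
    print(sample_angle(memory, 310, rng))
```

In this sketch each row of `memory` plays the role of one temperature row of the spreadsheet, and one such table would be kept per flexible dihedral. Because each dihedral is sampled from its own table, the sketch also reflects the mean field character noted above: correlations between dihedrals are not stored.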
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Analytical Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Genetics & Genomics (AREA)
- Medicinal Chemistry (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Crystallography & Structural Chemistry (AREA)
- Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
- Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)
Abstract
The present invention relates to a method for predicting the conformation and functionality of a molecule. The first step of the method consists of performing several simulated annealing runs in order to reveal the populated and unpopulated regions of a multidimensional conformational space. The second step consists of performing a fixed-temperature simulation that samples only from the populated regions revealed in the first step.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU57215/96A AU5721596A (en) | 1995-04-27 | 1996-04-26 | Method for identifying structurally active compounds using conformational memories |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US43115595A | 1995-04-27 | 1995-04-27 | |
| US08/431,155 | 1995-04-27 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO1996034347A1 (fr) | 1996-10-31 |
Family
ID=23710722
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US1996/006110 WO1996034347A1 (fr) | Method for identifying structurally active compounds using conformational memories | 1995-04-27 | 1996-04-26 |
Country Status (2)
| Country | Link |
|---|---|
| AU (1) | AU5721596A (fr) |
| WO (1) | WO1996034347A1 (fr) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6735530B1 (en) | 1998-09-23 | 2004-05-11 | Sarnoff Corporation | Computational protein probing to identify binding sites |
| US7415361B2 (en) | 2003-12-09 | 2008-08-19 | Locus Pharmaceuticals, Inc. | Methods and systems for analyzing and determining ligand-residue interaction |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109447369B (zh) * | 2018-11-09 | 2022-05-17 | 浙江大学 | Multi-factor power allocation method for the energy production side based on a simulated annealing algorithm |
1996
- 1996-04-26 AU AU57215/96A patent/AU5721596A/en not_active Abandoned
- 1996-04-26 WO PCT/US1996/006110 patent/WO1996034347A1/fr active Application Filing
Non-Patent Citations (2)
| Title |
|---|
| IN: ADAPTION OF SIMULATED ANNEALING TO CHEMICAL OPTIMIZATION PROBLEMS, Edited by KALIVAS, ELSEVIER SCIENCE B.V., 1995, WILSON et al., "Conformational Analysis of Flexible Molecules", pages 351-367. * |
| JOURNAL OF COMPUTATIONAL CHEMISTRY, 1991, Vol. 12, No. 3, WILSON et al., "Applications of Simulated Annealing to the Conformational Analysis of Flexible Molecules", pages 342-349. * |
Also Published As
| Publication number | Publication date |
|---|---|
| AU5721596A (en) | 1996-11-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP1021780B1 (fr) | System and method for structure-based rational drug design involving accurate prediction of binding free energy | |
| Srinivasan et al. | Application of a pairwise generalized Born model to proteins and nucleic acids: inclusion of salt effects | |
| EP1078333B1 (fr) | System, method, and computer program product for representing proximity data in a multidimensional space | |
| Oprea et al. | Chemography: the art of navigating in chemical space | |
| Judson | Genetic algorithms and their use in chemistry | |
| Wilson et al. | Applications of simulated annealing to the conformational analysis of flexible molecules | |
| US7415359B2 (en) | Methods and systems for the identification of components of mammalian biochemical networks as targets for therapeutic agents | |
| EP0943131B1 (fr) | Method, system, and program for simulation based on the synthesis of chemicals having biological functions | |
| US20040162852A1 (en) | Multidimensional biodata integration and relationship inference | |
| Leach et al. | Automated conformational analysis: directed conformational search using the A* algorithm | |
| US20130013279A1 (en) | Apparatus and method for structure-based prediction of amino acid sequences | |
| US20030220716A1 (en) | Method and apparatus for automated design of chemical synthesis routes | |
| JP2002530727A (ja) | Pharmacophore fingerprinting in quantitative structure-activity relationships and construction of primary libraries | |
| Guarnieri et al. | Conformational memories and a simulated annealing program that learns: application to LTB4 | |
| Lobanov et al. | Stochastic similarity selections from large combinatorial libraries | |
| Chen et al. | Protein Retrieval via Integrative Molecular Ensembles (PRIME) through extended similarity indices | |
| EP1266337A2 (fr) | System and method for searching a combinatorial space | |
| Sohn et al. | Hidden Markov Dirichlet process: Modeling genetic inference in open ancestral space | |
| Stahura et al. | Partitioning methods for the identification of active molecules | |
| WO1996034347A1 (fr) | Method for identifying structurally active compounds using conformational memories | |
| Das et al. | Solvation parameters for predicting the structure of surface loops in proteins: transferability and entropic effects | |
| Tchagang et al. | Biclustering of DNA microarray data: theory, evaluation, and applications | |
| Lewis et al. | Quantification of molecular similarity and its application to combinatorial chemistry | |
| Feuilleaubois et al. | Implementation of the three-dimensional-pattern search problem on Hopfield-like neural networks | |
| Heffelfinger et al. | Carbon sequestration in Synechococcus Sp.: from molecular machines to hierarchical modeling |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AK | Designated states |
Kind code of ref document: A1 Designated state(s): AU CA JP US |
|
| AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
|
| ENP | Entry into the national phase |
Ref country code: CA Ref document number: 2190116 Kind code of ref document: A Format of ref document f/p: F |
|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
| 122 | Ep: pct application non-entry in european phase |