WO1997015588A9 - Protective protein/cathepsin A and precursor: crystallization, X-ray diffraction, three-dimensional structure determination and rational drug design
- Publication number
- WO1997015588A9 (PCT/US1996/017325)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- ppca
- pppca
- data
- atomic
- model
Links
- 238000002441 X-ray diffraction Methods 0.000 title claims abstract description 47
- 239000002243 precursor Substances 0.000 title claims abstract description 32
- 230000001681 protective effect Effects 0.000 title claims abstract description 14
- 238000002425 crystallisation Methods 0.000 title claims description 16
- 230000008025 crystallization Effects 0.000 title claims description 14
- 108090000623 proteins and genes Proteins 0.000 title abstract description 104
- 102000004169 proteins and genes Human genes 0.000 title abstract description 98
- 238000009510 drug design Methods 0.000 title abstract description 21
- 102000005600 Cathepsins Human genes 0.000 title description 2
- 108010084457 Cathepsins Proteins 0.000 title description 2
- 102100028524 Lysosomal protective protein Human genes 0.000 claims abstract description 268
- 238000000034 method Methods 0.000 claims abstract description 94
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract description 26
- 102000005572 Cathepsin A Human genes 0.000 claims abstract description 17
- 108010059081 Cathepsin A Proteins 0.000 claims abstract description 17
- 238000002424 x-ray crystallography Methods 0.000 claims abstract description 8
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 77
- 239000003446 ligand Substances 0.000 claims description 55
- 101710162021 Lysosomal protective protein Proteins 0.000 claims description 45
- 230000000694 effects Effects 0.000 claims description 44
- 241000282414 Homo sapiens Species 0.000 claims description 32
- 102000004190 Enzymes Human genes 0.000 claims description 29
- 108090000790 Enzymes Proteins 0.000 claims description 29
- 230000027455 binding Effects 0.000 claims description 18
- PXIPVTKHYLBLMZ-UHFFFAOYSA-N Sodium azide Chemical compound [Na+].[N-]=[N+]=[N-] PXIPVTKHYLBLMZ-UHFFFAOYSA-N 0.000 claims description 16
- 230000004071 biological effect Effects 0.000 claims description 16
- 101001122938 Homo sapiens Lysosomal protective protein Proteins 0.000 claims description 15
- 239000000872 buffer Substances 0.000 claims description 13
- 238000004364 calculation method Methods 0.000 claims description 11
- 238000012545 processing Methods 0.000 claims description 11
- 102000050098 human CTSA Human genes 0.000 claims description 9
- 230000004048 modification Effects 0.000 claims description 9
- 238000012986 modification Methods 0.000 claims description 9
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 8
- 238000006243 chemical reaction Methods 0.000 claims description 8
- 150000007523 nucleic acids Chemical group 0.000 claims description 7
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 claims description 7
- 150000003839 salts Chemical class 0.000 claims description 6
- 230000009467 reduction Effects 0.000 claims description 5
- 238000012800 visualization Methods 0.000 claims description 5
- 229920002594 Polyethylene Glycol 8000 Polymers 0.000 claims description 4
- 230000002633 protecting effect Effects 0.000 claims description 4
- 238000012835 hanging drop method Methods 0.000 claims description 3
- 230000001376 precipitating effect Effects 0.000 claims description 3
- 238000009792 diffusion process Methods 0.000 claims description 2
- 230000003301 hydrolyzing effect Effects 0.000 claims 1
- 229960000281 trometamol Drugs 0.000 claims 1
- 235000018102 proteins Nutrition 0.000 description 91
- 239000013078 crystal Substances 0.000 description 88
- 239000000178 monomer Substances 0.000 description 80
- 210000004027 cell Anatomy 0.000 description 47
- 125000004429 atom Chemical group 0.000 description 33
- 239000000203 mixture Substances 0.000 description 30
- 102000004196 processed proteins & peptides Human genes 0.000 description 29
- 229920001184 polypeptide Polymers 0.000 description 28
- 239000000539 dimer Substances 0.000 description 27
- 230000035800 maturation Effects 0.000 description 26
- 230000007170 pathology Effects 0.000 description 23
- 230000006870 function Effects 0.000 description 22
- 239000002609 medium Substances 0.000 description 22
- 230000003197 catalytic effect Effects 0.000 description 21
- 108010059841 serine carboxypeptidase Proteins 0.000 description 20
- 239000000243 solution Substances 0.000 description 20
- 230000000903 blocking effect Effects 0.000 description 19
- 239000013598 vector Substances 0.000 description 19
- 230000014509 gene expression Effects 0.000 description 18
- 239000002253 acid Substances 0.000 description 17
- 229910001385 heavy metal Inorganic materials 0.000 description 17
- 150000001875 compounds Chemical class 0.000 description 16
- 239000002904 solvent Substances 0.000 description 16
- 238000003860 storage Methods 0.000 description 16
- 239000000758 substrate Substances 0.000 description 16
- 239000012634 fragment Substances 0.000 description 14
- 238000012935 Averaging Methods 0.000 description 13
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 12
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 12
- 238000004458 analytical method Methods 0.000 description 12
- 238000003556 assay Methods 0.000 description 12
- 208000024891 symptom Diseases 0.000 description 12
- 210000004962 mammalian cell Anatomy 0.000 description 11
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 10
- UUQMNUMQCIQDMZ-UHFFFAOYSA-N betahistine Chemical compound CNCCC1=CC=CC=N1 UUQMNUMQCIQDMZ-UHFFFAOYSA-N 0.000 description 10
- 239000003814 drug Substances 0.000 description 10
- 230000007246 mechanism Effects 0.000 description 10
- 241000238631 Hexapoda Species 0.000 description 9
- 102000005348 Neuraminidase Human genes 0.000 description 9
- 108010006232 Neuraminidase Proteins 0.000 description 9
- 150000007942 carboxylates Chemical group 0.000 description 9
- 238000002447 crystallographic data Methods 0.000 description 9
- 238000002474 experimental method Methods 0.000 description 9
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 9
- 239000000523 sample Substances 0.000 description 9
- 238000006467 substitution reaction Methods 0.000 description 9
- 208000017462 Galactosialidosis Diseases 0.000 description 8
- 238000013500 data storage Methods 0.000 description 8
- 238000001727 in vivo Methods 0.000 description 8
- 230000001105 regulatory effect Effects 0.000 description 8
- 230000001225 therapeutic effect Effects 0.000 description 8
- 102100025698 Cytosolic carboxypeptidase 4 Human genes 0.000 description 7
- 102000035195 Peptidases Human genes 0.000 description 7
- 108091005804 Peptidases Proteins 0.000 description 7
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 7
- 101100149312 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) SER1 gene Proteins 0.000 description 7
- 230000004913 activation Effects 0.000 description 7
- 238000007792 addition Methods 0.000 description 7
- 235000001014 amino acid Nutrition 0.000 description 7
- 229940024606 amino acid Drugs 0.000 description 7
- 150000001413 amino acids Chemical class 0.000 description 7
- 102000005936 beta-Galactosidase Human genes 0.000 description 7
- 108010005774 beta-Galactosidase Proteins 0.000 description 7
- 238000012217 deletion Methods 0.000 description 7
- 230000037430 deletion Effects 0.000 description 7
- 238000002347 injection Methods 0.000 description 7
- 239000007924 injection Substances 0.000 description 7
- 210000004185 liver Anatomy 0.000 description 7
- 239000011159 matrix material Substances 0.000 description 7
- 210000001519 tissue Anatomy 0.000 description 7
- 238000011282 treatment Methods 0.000 description 7
- 241000283690 Bos taurus Species 0.000 description 6
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 6
- 241000124008 Mammalia Species 0.000 description 6
- 239000002202 Polyethylene glycol Substances 0.000 description 6
- 239000004365 Protease Substances 0.000 description 6
- 241000700605 Viruses Species 0.000 description 6
- 238000004590 computer program Methods 0.000 description 6
- 230000002255 enzymatic effect Effects 0.000 description 6
- 210000002950 fibroblast Anatomy 0.000 description 6
- 229910052739 hydrogen Inorganic materials 0.000 description 6
- 230000002401 inhibitory effect Effects 0.000 description 6
- 238000004519 manufacturing process Methods 0.000 description 6
- 229920001223 polyethylene glycol Polymers 0.000 description 6
- 239000000047 product Substances 0.000 description 6
- 241000701447 unidentified baculovirus Species 0.000 description 6
- 108020004414 DNA Proteins 0.000 description 5
- 108010062466 Enzyme Precursors Proteins 0.000 description 5
- 102000010911 Enzyme Precursors Human genes 0.000 description 5
- 108090000371 Esterases Proteins 0.000 description 5
- 101000932590 Homo sapiens Cytosolic carboxypeptidase 4 Proteins 0.000 description 5
- 102000004157 Hydrolases Human genes 0.000 description 5
- 108090000604 Hydrolases Proteins 0.000 description 5
- 208000015439 Lysosomal storage disease Diseases 0.000 description 5
- 101001033003 Mus musculus Granzyme F Proteins 0.000 description 5
- 230000002378 acidificating effect Effects 0.000 description 5
- 238000010171 animal model Methods 0.000 description 5
- 238000013480 data collection Methods 0.000 description 5
- 238000010586 diagram Methods 0.000 description 5
- 229940079593 drug Drugs 0.000 description 5
- 239000000499 gel Substances 0.000 description 5
- 239000001257 hydrogen Substances 0.000 description 5
- 210000003712 lysosome Anatomy 0.000 description 5
- 229910052751 metal Inorganic materials 0.000 description 5
- 239000002184 metal Substances 0.000 description 5
- 239000002773 nucleotide Substances 0.000 description 5
- 125000003729 nucleotide group Chemical group 0.000 description 5
- 239000013612 plasmid Substances 0.000 description 5
- 230000008707 rearrangement Effects 0.000 description 5
- 230000014616 translation Effects 0.000 description 5
- 241000894006 Bacteria Species 0.000 description 4
- 101100269850 Caenorhabditis elegans mask-1 gene Proteins 0.000 description 4
- 102000005367 Carboxypeptidases Human genes 0.000 description 4
- 108010006303 Carboxypeptidases Proteins 0.000 description 4
- 241000282412 Homo Species 0.000 description 4
- 241001465754 Metazoa Species 0.000 description 4
- 241000699666 Mus <mouse, genus> Species 0.000 description 4
- 241000283973 Oryctolagus cuniculus Species 0.000 description 4
- 241000700159 Rattus Species 0.000 description 4
- 108020004511 Recombinant DNA Proteins 0.000 description 4
- 241000209140 Triticum Species 0.000 description 4
- 235000021307 Triticum Nutrition 0.000 description 4
- 150000001408 amides Chemical class 0.000 description 4
- 238000006555 catalytic reaction Methods 0.000 description 4
- 238000003776 cleavage reaction Methods 0.000 description 4
- 230000001419 dependent effect Effects 0.000 description 4
- 238000013461 design Methods 0.000 description 4
- 125000000291 glutamic acid group Chemical group N[C@@H](CCC(O)=O)C(=O)* 0.000 description 4
- 230000002779 inactivation Effects 0.000 description 4
- 230000005764 inhibitory process Effects 0.000 description 4
- 230000000670 limiting effect Effects 0.000 description 4
- 230000002132 lysosomal effect Effects 0.000 description 4
- 230000001868 lysosomic effect Effects 0.000 description 4
- 108020004707 nucleic acids Proteins 0.000 description 4
- 102000039446 nucleic acids Human genes 0.000 description 4
- HEGSGKPQLMEBJL-RKQHYHRCSA-N octyl beta-D-glucopyranoside Chemical compound CCCCCCCCO[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O HEGSGKPQLMEBJL-RKQHYHRCSA-N 0.000 description 4
- 238000012856 packing Methods 0.000 description 4
- 230000036961 partial effect Effects 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- 230000005855 radiation Effects 0.000 description 4
- 230000002829 reductive effect Effects 0.000 description 4
- 230000007017 scission Effects 0.000 description 4
- 241000894007 species Species 0.000 description 4
- 238000013519 translation Methods 0.000 description 4
- LWIHDJKSTIGBAC-UHFFFAOYSA-K tripotassium phosphate Chemical compound [K+].[K+].[K+].[O-]P([O-])([O-])=O LWIHDJKSTIGBAC-UHFFFAOYSA-K 0.000 description 4
- RNAMYOYQYRYFQY-UHFFFAOYSA-N 2-(4,4-difluoropiperidin-1-yl)-6-methoxy-n-(1-propan-2-ylpiperidin-4-yl)-7-(3-pyrrolidin-1-ylpropoxy)quinazolin-4-amine Chemical compound N1=C(N2CCC(F)(F)CC2)N=C2C=C(OCCCN3CCCC3)C(OC)=CC2=C1NC1CCN(C(C)C)CC1 RNAMYOYQYRYFQY-UHFFFAOYSA-N 0.000 description 3
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 3
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 3
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 3
- 241000699800 Cricetinae Species 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- 206010020772 Hypertension Diseases 0.000 description 3
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 3
- 108090001060 Lipase Proteins 0.000 description 3
- 102000004882 Lipase Human genes 0.000 description 3
- 239000004367 Lipase Substances 0.000 description 3
- 241001446467 Mama Species 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- 241000282898 Sus scrofa Species 0.000 description 3
- OIPILFWXSMYKGL-UHFFFAOYSA-N acetylcholine Chemical compound CC(=O)OCC[N+](C)(C)C OIPILFWXSMYKGL-UHFFFAOYSA-N 0.000 description 3
- 229960004373 acetylcholine Drugs 0.000 description 3
- 238000001042 affinity chromatography Methods 0.000 description 3
- 235000009582 asparagine Nutrition 0.000 description 3
- 229960001230 asparagine Drugs 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 239000011575 calcium Substances 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 238000005094 computer simulation Methods 0.000 description 3
- 230000002950 deficient Effects 0.000 description 3
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 3
- 239000003937 drug carrier Substances 0.000 description 3
- 210000001163 endosome Anatomy 0.000 description 3
- 125000000524 functional group Chemical group 0.000 description 3
- 230000013595 glycosylation Effects 0.000 description 3
- 238000006206 glycosylation reaction Methods 0.000 description 3
- 230000002209 hydrophobic effect Effects 0.000 description 3
- 208000015181 infectious disease Diseases 0.000 description 3
- 239000002054 inoculum Substances 0.000 description 3
- 230000003993 interaction Effects 0.000 description 3
- 235000019421 lipase Nutrition 0.000 description 3
- 230000003287 optical effect Effects 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 239000012460 protein solution Substances 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 238000002864 sequence alignment Methods 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 239000003656 tris buffered saline Substances 0.000 description 3
- 239000003643 water by type Substances 0.000 description 3
- QDZOEBFLNHCSSF-PFFBOGFISA-N (2S)-2-[[(2R)-2-[[(2S)-1-[(2S)-6-amino-2-[[(2S)-1-[(2R)-2-amino-5-carbamimidamidopentanoyl]pyrrolidine-2-carbonyl]amino]hexanoyl]pyrrolidine-2-carbonyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-N-[(2R)-1-[[(2S)-1-[[(2R)-1-[[(2S)-1-[[(2S)-1-amino-4-methyl-1-oxopentan-2-yl]amino]-4-methyl-1-oxopentan-2-yl]amino]-3-(1H-indol-3-yl)-1-oxopropan-2-yl]amino]-1-oxo-3-phenylpropan-2-yl]amino]-3-(1H-indol-3-yl)-1-oxopropan-2-yl]pentanediamide Chemical compound C([C@@H](C(=O)N[C@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(N)=O)NC(=O)[C@@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CCCCN)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](N)CCCNC(N)=N)C1=CC=CC=C1 QDZOEBFLNHCSSF-PFFBOGFISA-N 0.000 description 2
- XHWDVRRNQHMAPE-UHFFFAOYSA-N 2-[[2-[[2-[[2-[[2-[[5-amino-2-[[5-amino-2-[[1-[6-amino-2-[[1-[2-amino-5-(diaminomethylideneamino)pentanoyl]pyrrolidine-2-carbonyl]amino]hexanoyl]pyrrolidine-2-carbonyl]amino]-5-oxopentanoyl]amino]-5-oxopentanoyl]amino]-3-phenylpropanoyl]amino]-3-phenylpro Chemical compound C=1C=CC=CC=1CC(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(N)=O)NC(=O)C1N(CCC1)C(=O)C(CCCCN)NC(=O)C1N(CCC1)C(=O)C(N)CCCN=C(N)N)C(=O)NC(C(=O)NCC(=O)NC(CC(C)C)C(=O)NC(CCSC)C(O)=O)CC1=CC=CC=C1 XHWDVRRNQHMAPE-UHFFFAOYSA-N 0.000 description 2
- 102400000344 Angiotensin-1 Human genes 0.000 description 2
- 101800000734 Angiotensin-1 Proteins 0.000 description 2
- 101800004538 Bradykinin Proteins 0.000 description 2
- 102400000967 Bradykinin Human genes 0.000 description 2
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 2
- 241000282472 Canis lupus familiaris Species 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- 108091035707 Consensus sequence Proteins 0.000 description 2
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 2
- 241000282326 Felis catus Species 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- 102000003886 Glycoproteins Human genes 0.000 description 2
- 108090000288 Glycoproteins Proteins 0.000 description 2
- QXZGBUJJYSLZLT-UHFFFAOYSA-N H-Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-OH Natural products NC(N)=NCCCC(N)C(=O)N1CCCC1C(=O)N1C(C(=O)NCC(=O)NC(CC=2C=CC=CC=2)C(=O)NC(CO)C(=O)N2C(CCC2)C(=O)NC(CC=2C=CC=CC=2)C(=O)NC(CCCN=C(N)N)C(O)=O)CCC1 QXZGBUJJYSLZLT-UHFFFAOYSA-N 0.000 description 2
- DGAQECJNVWCQMB-PUAWFVPOSA-M Ilexoside XXIX Chemical compound C[C@@H]1CC[C@@]2(CC[C@@]3(C(=CC[C@H]4[C@]3(CC[C@@H]5[C@@]4(CC[C@@H](C5(C)C)OS(=O)(=O)[O-])C)C)[C@@H]2[C@]1(C)O)C)C(=O)O[C@H]6[C@@H]([C@H]([C@@H]([C@H](O6)CO)O)O)O.[Na+] DGAQECJNVWCQMB-PUAWFVPOSA-M 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- 208000008955 Mucolipidoses Diseases 0.000 description 2
- 206010072927 Mucolipidosis type I Diseases 0.000 description 2
- 102000002568 Multienzyme Complexes Human genes 0.000 description 2
- 108010093369 Multienzyme Complexes Proteins 0.000 description 2
- 102400000097 Neurokinin A Human genes 0.000 description 2
- HEAUFJZALFKPBA-YRVBCFNBSA-N Neurokinin A Chemical compound C([C@@H](C(=O)N[C@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(N)=O)C(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC=1NC=NC=1)C(C)O)C1=CC=CC=C1 HEAUFJZALFKPBA-YRVBCFNBSA-N 0.000 description 2
- 101800000399 Neurokinin A Proteins 0.000 description 2
- 102400000050 Oxytocin Human genes 0.000 description 2
- XNOPRXBHLZRZKH-UHFFFAOYSA-N Oxytocin Natural products N1C(=O)C(N)CSSCC(C(=O)N2C(CCC2)C(=O)NC(CC(C)C)C(=O)NCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(CCC(N)=O)NC(=O)C(C(C)CC)NC(=O)C1CC1=CC=C(O)C=C1 XNOPRXBHLZRZKH-UHFFFAOYSA-N 0.000 description 2
- 101800000989 Oxytocin Proteins 0.000 description 2
- 229910019142 PO4 Inorganic materials 0.000 description 2
- 241001494479 Pecora Species 0.000 description 2
- 229920001213 Polysorbate 20 Polymers 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 102400000096 Substance P Human genes 0.000 description 2
- 101800003906 Substance P Proteins 0.000 description 2
- 102000003141 Tachykinin Human genes 0.000 description 2
- 150000007513 acids Chemical class 0.000 description 2
- 239000000654 additive Substances 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 125000001931 aliphatic group Chemical group 0.000 description 2
- 125000000539 amino acid group Chemical group 0.000 description 2
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 2
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 2
- 235000011130 ammonium sulphate Nutrition 0.000 description 2
- 239000012491 analyte Substances 0.000 description 2
- ORWYRWWVDCYOMK-HBZPZAIKSA-N angiotensin I Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O)C(C)C)C1=CC=C(O)C=C1 ORWYRWWVDCYOMK-HBZPZAIKSA-N 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 230000001580 bacterial effect Effects 0.000 description 2
- QXZGBUJJYSLZLT-FDISYFBBSA-N bradykinin Chemical compound NC(=N)NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(=O)NCC(=O)N[C@@H](CC=2C=CC=CC=2)C(=O)N[C@@H](CO)C(=O)N2[C@@H](CCC2)C(=O)N[C@@H](CC=2C=CC=CC=2)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)CCC1 QXZGBUJJYSLZLT-FDISYFBBSA-N 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 229910052791 calcium Inorganic materials 0.000 description 2
- 150000001720 carbohydrates Chemical class 0.000 description 2
- 125000004432 carbon atom Chemical group C* 0.000 description 2
- 238000005119 centrifugation Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 230000021615 conjugation Effects 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 2
- 230000007423 decrease Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 238000003745 diagnosis Methods 0.000 description 2
- 229940039227 diagnostic agent Drugs 0.000 description 2
- 239000000032 diagnostic agent Substances 0.000 description 2
- 238000002050 diffraction method Methods 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 208000035475 disorder Diseases 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 230000009881 electrostatic interaction Effects 0.000 description 2
- 238000011049 filling Methods 0.000 description 2
- 238000009472 formulation Methods 0.000 description 2
- 238000001415 gene therapy Methods 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- 201000008977 glycoproteinosis Diseases 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 210000005260 human cell Anatomy 0.000 description 2
- 238000001114 immunoprecipitation Methods 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 208000015978 inherited metabolic disease Diseases 0.000 description 2
- 239000003112 inhibitor Substances 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 210000003734 kidney Anatomy 0.000 description 2
- 239000002502 liposome Substances 0.000 description 2
- 150000002739 metals Chemical class 0.000 description 2
- 239000011859 microparticle Substances 0.000 description 2
- 238000002156 mixing Methods 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 239000012452 mother liquor Substances 0.000 description 2
- 238000002703 mutagenesis Methods 0.000 description 2
- 231100000350 mutagenesis Toxicity 0.000 description 2
- XNOPRXBHLZRZKH-DSZYJQQASA-N oxytocin Chemical compound C([C@H]1C(=O)N[C@H](C(N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CSSC[C@H](N)C(=O)N1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(N)=O)=O)[C@@H](C)CC)C1=CC=C(O)C=C1 XNOPRXBHLZRZKH-DSZYJQQASA-N 0.000 description 2
- 229960001723 oxytocin Drugs 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 239000008194 pharmaceutical composition Substances 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 2
- 239000010452 phosphate Substances 0.000 description 2
- 210000002826 placenta Anatomy 0.000 description 2
- 239000013600 plasmid vector Substances 0.000 description 2
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 2
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 2
- 229910000160 potassium phosphate Inorganic materials 0.000 description 2
- 235000011009 potassium phosphates Nutrition 0.000 description 2
- 238000001556 precipitation Methods 0.000 description 2
- 235000019419 proteases Nutrition 0.000 description 2
- 230000009993 protective function Effects 0.000 description 2
- 238000001742 protein purification Methods 0.000 description 2
- 230000006337 proteolytic cleavage Effects 0.000 description 2
- 238000009877 rendering Methods 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 238000007423 screening assay Methods 0.000 description 2
- 208000011985 sialidosis Diseases 0.000 description 2
- 229910052708 sodium Inorganic materials 0.000 description 2
- 239000011734 sodium Substances 0.000 description 2
- 230000004936 stimulating effect Effects 0.000 description 2
- 230000000638 stimulation Effects 0.000 description 2
- 238000000547 structure data Methods 0.000 description 2
- 238000007920 subcutaneous administration Methods 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 108060008037 tachykinin Proteins 0.000 description 2
- 210000001550 testis Anatomy 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- 239000013603 viral vector Substances 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- 238000001262 western blot Methods 0.000 description 2
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- HBOMLICNUCNMMY-KJFJCRTCSA-N 1-[(4s,5s)-4-azido-5-(hydroxymethyl)oxolan-2-yl]-5-methylpyrimidine-2,4-dione Chemical compound O=C1NC(=O)C(C)=CN1C1O[C@H](CO)[C@@H](N=[N+]=[N-])C1 HBOMLICNUCNMMY-KJFJCRTCSA-N 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- USFZMSVCRYTOJT-UHFFFAOYSA-N Ammonium acetate Chemical compound N.CC(O)=O USFZMSVCRYTOJT-UHFFFAOYSA-N 0.000 description 1
- 239000005695 Ammonium acetate Substances 0.000 description 1
- 241000024188 Andala Species 0.000 description 1
- 102000004580 Aspartic Acid Proteases Human genes 0.000 description 1
- 108010017640 Aspartic Acid Proteases Proteins 0.000 description 1
- 102000035101 Aspartic proteases Human genes 0.000 description 1
- 108091005502 Aspartic proteases Proteins 0.000 description 1
- 206010003591 Ataxia Diseases 0.000 description 1
- 241000201370 Autographa californica nucleopolyhedrovirus Species 0.000 description 1
- 241000255789 Bombyx mori Species 0.000 description 1
- 241000701822 Bovine papillomavirus Species 0.000 description 1
- 241000589638 Burkholderia glumae Species 0.000 description 1
- 101100256223 Caenorhabditis elegans cho-1 gene Proteins 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 241000282461 Canis lupus Species 0.000 description 1
- 101710132601 Capsid protein Proteins 0.000 description 1
- 208000024172 Cardiovascular disease Diseases 0.000 description 1
- 241001466804 Carnivora Species 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 108010078791 Carrier Proteins Proteins 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 102000008186 Collagen Human genes 0.000 description 1
- 108010035532 Collagen Proteins 0.000 description 1
- 206010010947 Coordination abnormal Diseases 0.000 description 1
- GUBGYTABKSRVRQ-WFVLMXAXSA-N DEAE-cellulose Chemical compound OC1C(O)C(O)C(CO)O[C@H]1O[C@@H]1C(CO)OC(O)C(O)C1O GUBGYTABKSRVRQ-WFVLMXAXSA-N 0.000 description 1
- FEWJPZIEWOKRBE-JCYAYHJZSA-N Dextrotartaric acid Chemical compound OC(=O)[C@H](O)[C@@H](O)C(O)=O FEWJPZIEWOKRBE-JCYAYHJZSA-N 0.000 description 1
- 102100021210 Double homeobox protein B Human genes 0.000 description 1
- 102000002045 Endothelin Human genes 0.000 description 1
- 108050009340 Endothelin Proteins 0.000 description 1
- 241000283086 Equidae Species 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 102000005744 Glycoside Hydrolases Human genes 0.000 description 1
- 108010031186 Glycoside Hydrolases Proteins 0.000 description 1
- 206010019842 Hepatomegaly Diseases 0.000 description 1
- 208000028782 Hereditary disease Diseases 0.000 description 1
- 241001272567 Hominoidea Species 0.000 description 1
- 101000968521 Homo sapiens Double homeobox protein B Proteins 0.000 description 1
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 108090001030 Lipoproteins Proteins 0.000 description 1
- 102000004895 Lipoproteins Human genes 0.000 description 1
- 208000024556 Mendelian disease Diseases 0.000 description 1
- 102000005741 Metalloproteases Human genes 0.000 description 1
- 108010006035 Metalloproteases Proteins 0.000 description 1
- 108010063312 Metalloproteins Proteins 0.000 description 1
- 102000010750 Metalloproteins Human genes 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 102000036675 Myoglobin Human genes 0.000 description 1
- 108010062374 Myoglobin Proteins 0.000 description 1
- 108060008487 Myosin Proteins 0.000 description 1
- 102000003505 Myosin Human genes 0.000 description 1
- 206010029260 Neuroblastoma Diseases 0.000 description 1
- 108090000189 Neuropeptides Proteins 0.000 description 1
- 108010047320 Pepsinogen A Proteins 0.000 description 1
- 206010035226 Plasma cell myeloma Diseases 0.000 description 1
- 241000276498 Pollachius virens Species 0.000 description 1
- 101710182846 Polyhedrin Proteins 0.000 description 1
- 241000206607 Porphyra umbilicalis Species 0.000 description 1
- ZLMJMSJWJFRBEC-UHFFFAOYSA-N Potassium Chemical compound [K] ZLMJMSJWJFRBEC-UHFFFAOYSA-N 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 230000006819 RNA synthesis Effects 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 229910052772 Samarium Inorganic materials 0.000 description 1
- 229920002684 Sepharose Polymers 0.000 description 1
- 102000012479 Serine Proteases Human genes 0.000 description 1
- 108010022999 Serine Proteases Proteins 0.000 description 1
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 1
- BQCADISMDOOEFD-UHFFFAOYSA-N Silver Chemical compound [Ag] BQCADISMDOOEFD-UHFFFAOYSA-N 0.000 description 1
- VMHLLURERBWHNL-UHFFFAOYSA-M Sodium acetate Chemical compound [Na+].CC([O-])=O VMHLLURERBWHNL-UHFFFAOYSA-M 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 206010041660 Splenomegaly Diseases 0.000 description 1
- 241000256251 Spodoptera frugiperda Species 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-L Sulfate Chemical compound [O-]S([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 1
- 241000906446 Theraps Species 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 102000004142 Trypsin Human genes 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 241000269370 Xenopus <genus> Species 0.000 description 1
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 1
- QPMSXSBEVQLBIL-CZRHPSIPSA-N ac1mix0p Chemical compound C1=CC=C2N(C[C@H](C)CN(C)C)C3=CC(OC)=CC=C3SC2=C1.O([C@H]1[C@]2(OC)C=CC34C[C@@H]2[C@](C)(O)CCC)C2=C5[C@]41CCN(C)[C@@H]3CC5=CC=C2O QPMSXSBEVQLBIL-CZRHPSIPSA-N 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- YDYLCYJNKRJKHU-UHFFFAOYSA-N acetic acid;dimethylarsinic acid Chemical class CC(O)=O.C[As](C)(O)=O YDYLCYJNKRJKHU-UHFFFAOYSA-N 0.000 description 1
- ZOIORXHNWRGPMV-UHFFFAOYSA-N acetic acid;zinc Chemical compound [Zn].CC(O)=O.CC(O)=O ZOIORXHNWRGPMV-UHFFFAOYSA-N 0.000 description 1
- 230000000996 additive effect Effects 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 239000000556 agonist Substances 0.000 description 1
- 229940043376 ammonium acetate Drugs 0.000 description 1
- 235000019257 ammonium acetate Nutrition 0.000 description 1
- 230000002547 anomalous effect Effects 0.000 description 1
- 239000005557 antagonist Substances 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 239000002246 antineoplastic agent Substances 0.000 description 1
- 229940041181 antineoplastic drug Drugs 0.000 description 1
- 125000003118 aryl group Chemical group 0.000 description 1
- 230000002238 attenuated effect Effects 0.000 description 1
- 238000005844 autocatalytic reaction Methods 0.000 description 1
- 230000003542 behavioural effect Effects 0.000 description 1
- YFXPPSKYMBTNAV-UHFFFAOYSA-N bensultap Chemical compound C=1C=CC=CC=1S(=O)(=O)SCC(N(C)C)CSS(=O)(=O)C1=CC=CC=C1 YFXPPSKYMBTNAV-UHFFFAOYSA-N 0.000 description 1
- 230000000975 bioactive effect Effects 0.000 description 1
- 238000004166 bioassay Methods 0.000 description 1
- 238000010256 biochemical assay Methods 0.000 description 1
- 238000002306 biochemical method Methods 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 239000012620 biological material Substances 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 229920001222 biopolymer Polymers 0.000 description 1
- 230000036772 blood pressure Effects 0.000 description 1
- 230000037396 body weight Effects 0.000 description 1
- VSGNNIFQASZAOI-UHFFFAOYSA-L calcium acetate Chemical compound [Ca+2].CC([O-])=O.CC([O-])=O VSGNNIFQASZAOI-UHFFFAOYSA-L 0.000 description 1
- 239000001639 calcium acetate Substances 0.000 description 1
- 235000011092 calcium acetate Nutrition 0.000 description 1
- 229960005147 calcium acetate Drugs 0.000 description 1
- 239000001110 calcium chloride Substances 0.000 description 1
- 229910001628 calcium chloride Inorganic materials 0.000 description 1
- 235000011148 calcium chloride Nutrition 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 239000013553 cell monolayer Substances 0.000 description 1
- 230000004663 cell proliferation Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000005465 channeling Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 229920001436 collagen Polymers 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 230000006957 competitive inhibition Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- 238000001816 cooling Methods 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 238000005336 cracking Methods 0.000 description 1
- 210000002858 crystal cell Anatomy 0.000 description 1
- 238000012866 crystallographic experiment Methods 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000007812 deficiency Effects 0.000 description 1
- 230000001687 destabilization Effects 0.000 description 1
- 230000001066 destructive effect Effects 0.000 description 1
- 239000012954 diazonium Substances 0.000 description 1
- IJGRMHOSHXDMSA-UHFFFAOYSA-O diazynium Chemical compound [NH+]#N IJGRMHOSHXDMSA-UHFFFAOYSA-O 0.000 description 1
- 235000005911 diet Nutrition 0.000 description 1
- 230000037213 diet Effects 0.000 description 1
- 230000008034 disappearance Effects 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- KAKKHKRHCKCAGH-UHFFFAOYSA-L disodium;(4-nitrophenyl) phosphate;hexahydrate Chemical compound O.O.O.O.O.O.[Na+].[Na+].[O-][N+](=O)C1=CC=C(OP([O-])([O-])=O)C=C1 KAKKHKRHCKCAGH-UHFFFAOYSA-L 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 230000005670 electromagnetic radiation Effects 0.000 description 1
- 238000001493 electron microscopy Methods 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 238000005421 electrostatic potential Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000007824 enzymatic assay Methods 0.000 description 1
- 230000007247 enzymatic mechanism Effects 0.000 description 1
- 238000001976 enzyme digestion Methods 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 239000013604 expression vector Substances 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000037406 food intake Effects 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000002523 gelfiltration Methods 0.000 description 1
- 230000009395 genetic defect Effects 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 102000034238 globular proteins Human genes 0.000 description 1
- 108091005896 globular proteins Proteins 0.000 description 1
- 150000002306 glutamic acid derivatives Chemical class 0.000 description 1
- 238000003875 gradient-accelerated spectroscopy Methods 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 210000004408 hybridoma Anatomy 0.000 description 1
- 125000001165 hydrophobic group Chemical group 0.000 description 1
- 239000012216 imaging agent Substances 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 238000003018 immunoassay Methods 0.000 description 1
- 238000003119 immunoblot Methods 0.000 description 1
- 238000002513 implantation Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 230000006698 induction Effects 0.000 description 1
- 238000001802 infusion Methods 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000007689 inspection Methods 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000007912 intraperitoneal administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- 238000010253 intravenous injection Methods 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000005304 joining Methods 0.000 description 1
- 108010045069 keyhole-limpet hemocyanin Proteins 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 208000028756 lack of coordination Diseases 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- -1 lithium sulfate sodium formate sodium citrate Chemical compound 0.000 description 1
- 108010045758 lysosomal proteins Proteins 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- UEGPKNKPLBYCNK-UHFFFAOYSA-L magnesium acetate Chemical compound [Mg+2].CC([O-])=O.CC([O-])=O UEGPKNKPLBYCNK-UHFFFAOYSA-L 0.000 description 1
- 239000011654 magnesium acetate Substances 0.000 description 1
- 235000011285 magnesium acetate Nutrition 0.000 description 1
- 229940069446 magnesium acetate Drugs 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 235000011147 magnesium chloride Nutrition 0.000 description 1
- GMDNUWQNDQDBNQ-UHFFFAOYSA-L magnesium;diformate Chemical compound [Mg+2].[O-]C=O.[O-]C=O GMDNUWQNDQDBNQ-UHFFFAOYSA-L 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- QSHDDOUJBYECFT-UHFFFAOYSA-N mercury Chemical compound [Hg] QSHDDOUJBYECFT-UHFFFAOYSA-N 0.000 description 1
- 229910052753 mercury Inorganic materials 0.000 description 1
- 239000013081 microcrystal Substances 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- 208000031225 myocardial ischemia Diseases 0.000 description 1
- 210000000653 nervous system Anatomy 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 210000000287 oocyte Anatomy 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 238000007911 parenteral administration Methods 0.000 description 1
- 238000010647 peptide synthesis reaction Methods 0.000 description 1
- 230000010412 perfusion Effects 0.000 description 1
- 230000004526 pharmaceutical effect Effects 0.000 description 1
- 230000000144 pharmacologic effect Effects 0.000 description 1
- 238000005191 phase separation Methods 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 230000001766 physiological effect Effects 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000001323 posttranslational effect Effects 0.000 description 1
- 229910052700 potassium Inorganic materials 0.000 description 1
- 239000011591 potassium Substances 0.000 description 1
- 230000003389 potentiating effect Effects 0.000 description 1
- 150000003141 primary amines Chemical class 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 238000011321 prophylaxis Methods 0.000 description 1
- 235000019833 protease Nutrition 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 230000005588 protonation Effects 0.000 description 1
- 210000001938 protoplast Anatomy 0.000 description 1
- 238000010791 quenching Methods 0.000 description 1
- 230000000171 quenching effect Effects 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 108700038606 rat Smooth muscle Proteins 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000007670 refining Methods 0.000 description 1
- 238000000611 regression analysis Methods 0.000 description 1
- 210000005084 renal tissue Anatomy 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000012827 research and development Methods 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- KZUNJOHGWZRPMI-UHFFFAOYSA-N samarium atom Chemical compound [Sm] KZUNJOHGWZRPMI-UHFFFAOYSA-N 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 239000006152 selective media Substances 0.000 description 1
- 229910052710 silicon Inorganic materials 0.000 description 1
- 239000010703 silicon Substances 0.000 description 1
- 229910052709 silver Inorganic materials 0.000 description 1
- 239000004332 silver Substances 0.000 description 1
- 210000001626 skin fibroblast Anatomy 0.000 description 1
- 210000000329 smooth muscle myocyte Anatomy 0.000 description 1
- RDZTWEVXRGYCFV-UHFFFAOYSA-M sodium 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonate Chemical compound [Na+].OCCN1CCN(CCS([O-])(=O)=O)CC1 RDZTWEVXRGYCFV-UHFFFAOYSA-M 0.000 description 1
- 239000001632 sodium acetate Substances 0.000 description 1
- 235000017281 sodium acetate Nutrition 0.000 description 1
- 239000001509 sodium citrate Substances 0.000 description 1
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 1
- 235000011083 sodium citrates Nutrition 0.000 description 1
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 1
- 239000001488 sodium phosphate Substances 0.000 description 1
- 229910000162 sodium phosphate Inorganic materials 0.000 description 1
- 210000000952 spleen Anatomy 0.000 description 1
- 230000006641 stabilisation Effects 0.000 description 1
- 238000011105 stabilization Methods 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 235000000346 sugar Nutrition 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 230000008961 swelling Effects 0.000 description 1
- 230000005469 synchrotron radiation Effects 0.000 description 1
- 229940095064 tartrate Drugs 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- 230000036962 time dependent Effects 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- RYFMWSXOAZQYPI-UHFFFAOYSA-K trisodium phosphate Chemical compound [Na+].[Na+].[Na+].[O-]P([O-])([O-])=O RYFMWSXOAZQYPI-UHFFFAOYSA-K 0.000 description 1
- 239000012588 trypsin Substances 0.000 description 1
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 210000003934 vacuole Anatomy 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 239000005526 vasoconstrictor agent Substances 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 230000004304 visual acuity Effects 0.000 description 1
- 229910052725 zinc Inorganic materials 0.000 description 1
- 239000011701 zinc Substances 0.000 description 1
- 239000004246 zinc acetate Substances 0.000 description 1
- 235000013904 zinc acetate Nutrition 0.000 description 1
Definitions
- the present invention is in the fields of molecular biology, protein purification, protein crystallization, x-ray diffraction analysis, three-dimensional structure determination and rational drug design (RDD)
- the present invention provides crystallized protective protein/cathepsin A (PPCA) and its precursor (pPPCA)
- PPCA crystallized protective protein/cathepsin A
- pPPCA precursor
- the crystallized PPCA or pPPCA is analyzed by x-ray diffraction techniques
- the resulting x-ray diffraction patterns are of sufficiently high resolution to be useful for determining the three-dimensional structure of the PPCA or pPPCA protein, and for RDD
Related Background Art
- the human protective protein/cathepsin A (PPCA, also known as human protective protein or HPP) has been identified as the primary genetic defect underlying galactosialidosis (d'Azzo et al., Proc. Natl. Acad. Sci. USA 79:4535-4539 (1982)), a lysosomal storage disease inherited as an autosomal recessive trait. Patients with this disorder are diagnosed as having drastically reduced β-galactosidase and neuraminidase activities in their cell lysosomes. Examples of lysosomal storage diseases are presented in Table 316-1 of Braunwald et al.
- ET-2 and ET-3 are potent vasoconstrictors and elevate blood pressure in mammals. They also influence cell proliferation and hormone production and have been implicated in cardiovascular disorders, ranging from hypertension to stroke to ischemic heart disease (Rubanyi and Polokoff, Pharmacol. Rev. 46:325-415 (1994))
- the present invention provides methods of expressing, purifying and crystallizing a human protective protein/cathepsin A (PPCA) and its precursor, precursor protective protein/cathepsin A (pPPCA)
- PPCA human protective protein/cathepsin A
- pPPCA precursor protective protein/cathepsin A
- the present invention also provides methods for obtaining crystallized PPCA or pPPCA that can be analyzed to obtain x-ray diffraction patterns of sufficiently high resolution to be useful for three-dimensional structure determination of the protein
- the x-ray diffraction patterns can either be analyzed directly to provide the three-dimensional structure (if of sufficiently high resolution), or atomic coordinates for the crystallized PPCA or pPPCA, as provided herein, can be used for structure determination
- the x-ray pattern/diffraction patterns obtained by methods of the present invention, and provided on computer readable media, are used to provide electron density maps
- the amino acid sequence is also useful for three-dimensional structure determination
- the data is then used in combination with phase determination (e.g., using multiple isomorphous replacement (MIR) or molecular replacement techniques) to generate electron density maps of a PPCA or a pPPCA, using a suitable computer system
- the electron density maps provided by analysis of either the x-ray diffraction patterns or working backwards from the atomic coordinates, provided herein, are then fitted using suitable computer algorithms to generate secondary, tertiary and/or quaternary domains of a PPCA or a pPPCA, which domains are then used to provide an overall three-dimensional structure, as well as expected binding and active sites of the PPCA or pPPCA. pPPCA has some of the active and binding sites of PPCA, except for changes in structure due to the presence of the portion of the pPPCA which is deleted during maturation to PPCA (e.g., residues 285-298 of Figure 13)
- RDD rational drug design
- PPCA-specific structural feature or biological activity, preferably as associated with a PPCA- or pPPCA-related pathology, e.g., protective activity (e.g., modulation of β-galactosidase activity and neuraminidase (NA) activity), and peptide or enzyme modulating activity (e.g., of endothelin I (serine carboxypeptidase), neuropeptides, cathepsin A, and the like), according to known assays
- the resulting ligands provided by methods of the present invention are synthesized and are useful for treating, inhibiting or preventing at least one
- Figure 1 is a schematic ribbon diagram of the PPCA monomer (monomer 1), where secondary structure assignments are according to DSSP (Kabsch and Sander, Biopolymers 22:2577-2637 (1983))
- the 'core' domain is shown in yellow
- the 'cap' domain consists of a 'helical' subdomain, in red, and a 'maturation' subdomain, in orange
- Figure 2 is a stereo diagram of the Cα trace of the PPCA monomer 1 with numbering of selected residues
- the residues forming the α-helices and β-strands are as follows according to DSSP
- Figure 7A-F presents a topological comparison of 6 members of the hydrolase fold family
- the arrangement of structural elements in the central core domain (in green and yellow) of the different proteins is generally similar
- the cap domains (in red) vary greatly
- Figure 7A shows the PPCA precursor cap domain, which consists of two subdomains, one α-helical and the other mainly β-sheet
- Figure 7B shows CPW (3SC2, Liao et al. (1992), infra), cap domain helical
- Figure 7C shows CPY (1YSC, Endrizzi et al. (1994), infra), cap domain helical
- Figure 7D shows dehalogenase (2HAD, Franken et al .
- Figure 7E shows lipase from Pseudomonas glumae (1TAH, Noble et al., FEBS Lett. 331:123-128 (1993)), cap domain mixed α-helical and β-strands
- Figure 7F shows acetylcholine esterase (1ACE, Sussman et al., Science 253:872-879 (1991)), cap domain large and predominantly α-helical
- the secondary structure assignments were generated with the computer program O using structures provided and/or available from the Brookhaven Protein Data Bank (this Figure was generated using MOLSCRIPT (Kraulis (1991), infra))
- Figure 8A-B shows the superposition of the Cα traces from the PPCA and CPW monomers, showing that the major differences between the two enzymes are localized in the cap domain. PPCA has a large 'maturation' subdomain, and the 'helical' subdomain is rotated with respect to the CPW counterpart (Figure drawn with the O program (Jones (1991), infra))
- Figure 8B shows the Cα traces from the PPCA and CPW dimers after the core domains from the subunits (shown on the right hand side of the two dimers) have been superimposed. Notice the remarkable difference in mutual orientation (of 15°) of the two subunits on the left hand side of the two dimers, which has been accentuated by an arrow (Figure drawn with the O computer program (Jones (1991), supra))
- Figure 9 is a stereo view of the Cα trace of PPCA monomer 1 highlighting regions involved in the maturation event. The color scheme for the trace is as follows: core domain in light blue, helical subdomain in red, maturation subdomain in orange, with the exception of the excision peptide (residues 285-298), which is shown in blue. Orange spheres mark residues 272 and 277, the beginning and end of the blocking peptide. The catalytic triad Ser 150, His 429 and Asp 372 is shown as light blue spheres. Two cysteines, Cys 253 and Cys 303, referred to in the discussion are colored green (this Figure was generated using MOLSCRIPT (Kraulis (1991), infra))
- Figure 10 is a close-up representation of the 'blocking' peptide (residues 272-277) bound in the active site, rendering the catalytic triad solvent inaccessible. Residues from the maturation subdomain are shown in orange, residues from the helical domain in magenta and residues from the core domain in cyan. The excision peptide is shown in blue. Side chains are shown for residues making extensive contacts with the blocking peptide or if mentioned in the text. The catalytic triad is shown in white. (Figure drawn with O (Jones (1991), infra)).
- Figure 11 is a representation of elements proposed to be involved in the activation mechanism of the precursor form of PPCA as discussed in the text.
- the Cα trace of the core domain is shown in cyan, the helical subdomain in red, the maturation subdomain in orange, and the excision peptide is shown in blue. Relevant side chains are depicted and labeled. Rearrangement of residues 254-302, bounded by the disulfide between Cys 253 and Cys 303, would free up the active site cleft.
- a charge cluster Arg 262, Glu 264, Arg 298 and Asp 300 occupies a strategic position within the maturation subdomain, possibly involved in pH dependent regulation of conformational changes.
- BIOGRAF BIOGRAF Construct Users Guide Version 3.2.1, June 1993
- Figure 12 is a schematic representation of the proposed activation of PPCA.
- the active site cleft is formed by the core domain (indicated as 'core' in the above scheme) and the helical subdomain (indicated as 'o').
- the maturation subdomain (indicated as 'm') contains the residues that block the active site cleft rendering the precursor enzymatically inactive, shown in structure 1.
- the precursor undergoes activation.
- conformational rearrangements induced by low pH might render the excision peptide more accessible to proteases as a first step, followed by cleavage of the polypeptide chain removing the excision peptide.
- FIG. 13 shows the amino acid sequence of a human pPPCA.
- the underlined portion shows an excision peptide for conversion to the mature form, PPCA.
- Figure 14 shows the amino acid sequence of a human PPCA.
- Figure 15 shows a sequence alignment between pPPCA, CPW and CPY (top three sequences shown). Identical residues among all three sequences are boxed. Residue numbering is included for the pPPCA amino acid sequence.
- the alignment was made using the GCG program PILEUP (GCG version 8), then manually adjusted using 3D-structural knowledge from the superposition of the CPW (Liao et al., 1992) and CPY (Endrizzi et al., 1994) atomic coordinates. The alignment was later used to design a multi-Ala search probe for molecular replacement calculations shown in the fourth sequence shown as 'model'.
- pPPCA protein can be divided in two domains: a 'core' domain (residues 1-182 and 303-452) and 'cap' domain (residues 183-302).
- the secondary structure elements for the PPCA precursor are depicted with shaded bars (for details on the assignment and nomenclature, see Rudenko et al., Structure 3:1249-1259 (1995)).
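The domain boundaries given above for pPPCA (core: residues 1-182 and 303-452; cap: residues 183-302) can be expressed as a small lookup. The following is a minimal sketch only; the function and constant names are illustrative and do not appear in the source.

```python
# Domain boundaries as stated in the text (pPPCA numbering, 1-452).
# All identifiers below are illustrative and not part of the source.
CORE_RANGES = ((1, 182), (303, 452))
CAP_RANGES = ((183, 302),)


def domain_of(residue: int) -> str:
    """Return 'core' or 'cap' for a pPPCA residue number."""
    if any(lo <= residue <= hi for lo, hi in CORE_RANGES):
        return "core"
    if any(lo <= residue <= hi for lo, hi in CAP_RANGES):
        return "cap"
    raise ValueError(f"residue {residue} is outside 1-452")


if __name__ == "__main__":
    # Catalytic triad residues named in the Figure 9 description.
    for name, number in (("Ser", 150), ("Asp", 372), ("His", 429)):
        print(name, number, domain_of(number))
    # Excision peptide (residues 285-298).
    print({domain_of(r) for r in range(285, 299)})
```

Under these boundaries the catalytic triad residues fall in the core domain, while the blocking peptide (272-277) and the excision peptide (285-298) fall in the cap domain, which is consistent with the figure descriptions.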
- Figure 16 shows a schematic representation of a 'bootstrapping' cycle as described in Example 2.
- Figure 17 is a representation of an initial molecular mask enlarged to accommodate missing areas in the model.
- the program MAMA Karlinsky, 1994
- O Jones et al., 1991
- Figure 18 is a representation of an enlargement of the model during the bootstrapping procedure plotted as a function of the expansion step.
- the number of Cα atoms incorporated in the model per monomer is given ( — ° — ) as well as the number of correct side chains (- « -). Note that after the first round of building in the molecular replacement map (expansion step 'r'), 37 residues from the molecular replacement search probes had to be deleted from the model, reducing the number of Cα atoms to 294. Subsequent cycles allowed for the model to be expanded by small increments.
- Figure 19 is a representation of a comparison of the Cα trace from a monomer core model (shown in magenta) and the complete PPCA monomer (shown in yellow).
- the core model contained only 294 Cα atoms.
- the 452 residue PPCA monomer consists of a core domain and a cap domain.
- the helical subdomain and the maturation subdomain forming the cap domain have been shown in the figure above.
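The two figures quoted above (a 294-Cα core model and a 452-residue monomer) imply how much of the chain still had to be built during bootstrapping. The arithmetic can be sketched as follows; the helper names are hypothetical and only the two numbers come from the text.

```python
# Numbers taken from the text; helper names are illustrative only.
TOTAL_RESIDUES = 452   # residues in one PPCA monomer
CORE_MODEL_CA = 294    # C-alpha atoms in the initial core model


def completeness(n_built: int, n_total: int = TOTAL_RESIDUES) -> float:
    """Fraction of the monomer's residues represented by a C-alpha atom."""
    return n_built / n_total


def residues_remaining(n_built: int, n_total: int = TOTAL_RESIDUES) -> int:
    """Number of residues not yet present in the model."""
    return n_total - n_built


if __name__ == "__main__":
    print(f"{100 * completeness(CORE_MODEL_CA):.1f}% of residues traced")
    print(residues_remaining(CORE_MODEL_CA), "residues still to be built")
```

On these numbers the starting model covered roughly two thirds of the monomer, with 158 residues left to be added in later expansion cycles.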
- Figure 20A-D is a representation of the resolving power of the bootstrapping procedure showing three different stages in map quality
- the atomic coordinates of the refined model are visualized with the electron density in Figures 20B.
- Figures 20A and 20B show the initial 2m
- the electron density is essentially uninterpretable Fig.
- Figure 21 shows a Ramachandran plot calculated for one monomer from a refined model of a pPPCA. Both monomers in the asymmetric unit give essentially equivalent plots
- Figure 22 shows a schematic of a computer system for PPCA or pPPCA structure determination and/or rational drug design
- Figure 23.1-52 lists the atomic coordinates for the active site of a pPPCA dimer having the amino acid sequence presented as portions of at least one of 50-76, 144-155, 173-197, 226-253, 226-288, 294-310, 327-344, 338-350, 366-381 and 423-436 of (Figure 23.1-23.26) 452 amino acids (designated 1-452) of monomer 1, as well as corresponding portions of (Figure 23.26-23.52) 452 amino acids (designated 1001-1452) of monomer 2
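The numbering convention described for Figure 23 (monomer 1 as 1-452, monomer 2 as 1001-1452) and the listed residue segments can be handled with a short helper. This is a sketch under the assumption that monomer 2 numbers are simply monomer 1 numbers plus 1000; the function names are hypothetical.

```python
# Segments and numbering taken from the Figure 23 description.
# Function and constant names are illustrative, not from the source.
SEGMENTS = (
    (50, 76), (144, 155), (173, 197), (226, 253), (226, 288),
    (294, 310), (327, 344), (338, 350), (366, 381), (423, 436),
)
CHAIN_LENGTH = 452
MONOMER_2_OFFSET = 1000  # assumption: monomer 2 number = monomer 1 number + 1000


def split_numbering(number: int) -> tuple[int, int]:
    """Map a dimer residue number to (monomer index, residue number 1-452)."""
    if 1 <= number <= CHAIN_LENGTH:
        return 1, number
    if MONOMER_2_OFFSET + 1 <= number <= MONOMER_2_OFFSET + CHAIN_LENGTH:
        return 2, number - MONOMER_2_OFFSET
    raise ValueError(f"residue number {number} is not in either monomer")


def in_listed_segment(number: int) -> bool:
    """True if the residue lies in one of the listed active-site segments."""
    _, local = split_numbering(number)
    return any(lo <= local <= hi for lo, hi in SEGMENTS)


if __name__ == "__main__":
    for number in (150, 372, 429, 1150, 100):
        print(number, split_numbering(number), in_listed_segment(number))
```

With these segments, the catalytic triad residues Ser 150, Asp 372 and His 429 all fall inside listed ranges in either monomer.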
- the present invention provides methods for expressing, purifying and crystallizing a protective protein/cathepsin A (PPCA) or a precursor protective protein/cathepsin A (pPPCA), where the crystals diffract x-rays with sufficiently high resolution to allow determination of the three-dimensional structure of the PPCA or pPPCA, or a portion or subdomain thereof
- the three-dimensional structure, e.g., as provided on computer readable media of the present invention
- Such ligands can be synthesized or recombinantly produced and are useful as diagnostic agents or drugs for diagnosing, treating, inhibiting or preventing at least one PPCA- or pPPCA-related pathology
- the determined structure is made using the PPCA or pPPCA amino acid sequences and/or atomic coordinate/x-ray diffraction data, which are analyzed to provide atomic model output data corresponding to the three-dimensional structure, e
- Structure determination methods are also provided by the present invention for rational drug design (RDD) of PPCA or pPPCA ligands
- RDD rational drug design
- Such drug design uses computer modeling programs that calculate different molecules expected to interact with the determined active sites, binding sites, or other structural or functional domains or subdomains of a PPCA or a pPPCA
- These ligands can then be produced and screened for activity in modulating or binding to a PPCA or pPPCA, according to methods and compositions of the present invention
- the actual PPCA or pPPCA-ligand complexes can optionally be crystallized and analyzed using x-ray diffraction techniques
- the diffraction patterns obtained are similarly used to calculate the three-dimensional interaction of the ligand and the PPCA or pPPCA, to confirm that the ligand binds to, or changes the conformation of, particular domain(s) or subdomain(s) of the PPCA or pPPCA
- screening methods are selected from assays for at least one biological activity of a PPCA or a pPPCA
- the resulting ligands provided by methods of the present invention modulate or bind at least one PPCA or pPPCA and are useful for diagnosing, treating or preventing PPCA- or pPPCA-related pathologies in animals, such as humans
- Ligands of a particular PPCA or pPPCA can similarly modulate other PPCAs or pPPCAs from other sources, such as other e
- the x-ray diffraction patterns obtained by the x-ray analysis are of moderate, to moderately high, to high resolution, e.g., 30-10, 10-3.5 or 1.5-3.5 Å, respectively, with the higher resolutions included. These diffraction patterns are suitable and useful for three-dimensional structure determination of a PPCA or a pPPCA, domain or subdomain thereof. The determination of the three-dimensional structure of a PPCA or pPPCA has a broad-based utility.
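The three resolution bands named above can be written as a simple classifier. This is a minimal sketch: the band edges come from the text, but how a value exactly on a shared edge (10 or 3.5 Å) is assigned is an assumption made here (it goes to the higher-resolution band), and the function name is hypothetical. Note that a smaller number in ångströms means a higher resolution.

```python
def resolution_class(d_angstrom: float) -> str:
    """Classify a diffraction limit (in angstroms) into the bands from the text.

    Assumption: values on a shared boundary (3.5 or 10.0) are assigned to the
    higher-resolution band; the source does not specify this.
    """
    if 1.5 <= d_angstrom <= 3.5:
        return "high"
    if 3.5 < d_angstrom <= 10.0:
        return "moderately high"
    if 10.0 < d_angstrom <= 30.0:
        return "moderate"
    raise ValueError(f"{d_angstrom} A is outside the 1.5-30 A range in the text")


if __name__ == "__main__":
    for d in (2.2, 3.5, 5.0, 20.0):
        print(d, resolution_class(d))
```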
- the three-dimensional structure from one or a few PPCAs or pPPCAs can be used to identify ligands that have diagnostic or therapeutic value for at least one PPCA- or pPPCA-related pathology that may involve PPCAs or pPPCAs having different amino acid sequences.
Determination of Protein Structures
- the primary structure is obtained by biochemical methods, either by direct determination of the amino acid sequence from the protein, or from the nucleotide sequence of the corresponding gene or cDNA.
- the quaternary structure of large proteins or aggregates can also be determined by electron microscopy.
- x-ray crystallography is preferred. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra.
- the first prerequisite for solving the three-dimensional structure of a protein by x-ray crystallography is a well-ordered crystal that will diffract x-rays strongly.
- the crystallographic method directs a beam of x-rays onto a regular, repeating array of many identical molecules so that the x-rays are diffracted from it in a pattern from which the structure of an individual molecule can be retrieved.
- Globular protein molecules are large, spherical, or ellipsoidal objects with irregular surfaces, and crystals thereof contain large holes or channels that are formed between the individual molecules. These channels, which usually occupy more than half the volume of the crystal, are filled with disordered solvent molecules.
- the protein molecules are in contact with each other at only a few small regions. This is one reason why structures of proteins determined by x-ray crystallography are generally the same as those for the proteins in solution.
- Crystallization robots can automate and speed up the work of reproducibly setting up large numbers of crystallization experiments.
- a pure and homogeneous protein sample is important for successful crystallization. Proteins obtained from cloned genes in efficient expression vectors can be purified quickly to homogeneity in large quantities in a few purification steps.
- a protein to be crystallized is preferably at least 93-99% pure according to standard criteria of homogeneity. Crystals form when molecules are precipitated very slowly from supersaturated solutions. The most frequently used procedure for making protein crystals is the hanging-drop method, in which a drop of protein solution is brought very gradually to supersaturation by loss of water from the droplet to the larger reservoir that contains salt or polyethylene glycol solution.
- Different crystal forms can be more or less well-ordered and hence give diffraction patterns of different quality.
- X-rays are electromagnetic radiation at short wavelengths, emitted when electrons jump from a higher to a lower energy state
- x-rays are produced by high-voltage tubes in which a metal plate, the anode, is bombarded with accelerated electrons and thereby caused to emit x-rays of a specific wavelength, so-called monochromatic x-rays.
- the high voltage rapidly heats up the metal plate, which therefore has to be cooled
- Efficient cooling is achieved by so-called rotating anode x-ray generators, where the metal plate revolves during the experiment so that different parts are heated up
- More powerful x-ray beams can be produced in synchrotron storage rings where electrons (or positrons) travel close to the speed of light. These particles emit very strong radiation at all wavelengths from short gamma rays to visible light
- x-ray source only radiation within a window of suitable wavelengths is channeled from the storage ring
- Polychromatic x-ray beams are produced by having a broad window that allows through x-ray radiation with wavelengths of 0.2-3.5 Å
- the diffracted spots are recorded either on a film, the classical method, or by an electronic detector
- the exposed film has to be measured and digitized by a scanning device, whereas electronic detectors feed the signals they detect directly in a digitized form into a computer
- Electronic area detectors (an electronic film) significantly reduce the time required to collect and measure diffraction data
- the diffraction pattern obtained in an x-ray experiment is related to the crystal that caused the diffraction. X-rays that are reflected from adjacent planes travel different distances, and diffraction only occurs when the difference in distance is equal to the wavelength of the x-ray beam. This distance is dependent on the reflection angle, which is equal to the angle between the primary beam and the planes
- Each atom in a crystal scatters x-rays in all directions, and only those that positively interfere with one another, according to Bragg's law, give rise to diffracted beams that can be recorded as a distinct diffraction spot above background
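As an illustration of Bragg's law, n·λ = 2·d·sin θ, the following minimal Python sketch computes the angle θ at which planes of spacing d reflect x-rays of wavelength λ. The function name and the example values are hypothetical and are not taken from the present disclosure.

```python
import math


def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Return the Bragg angle theta (degrees) for planes of the given spacing.

    Bragg's law: order * wavelength = 2 * d_spacing * sin(theta).
    Returns None when no real angle satisfies the law.
    """
    if d_spacing <= 0 or wavelength <= 0 or order < 1:
        raise ValueError("spacing, wavelength and order must be positive")
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if sin_theta > 1.0:
        return None  # spacing too small to diffract at this wavelength
    return math.degrees(math.asin(sin_theta))


# Illustrative values only: 1.0 angstrom planes and 1.0 angstrom x-rays
# give sin(theta) = 0.5.
theta = bragg_angle_deg(1.0, 1.0)
```

Spacing and wavelength must be given in the same unit. A return value of None indicates that the chosen wavelength cannot resolve planes that closely spaced.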
- Each diffraction spot is the result of interference of all x-rays with the same diffraction angle emerging from all atoms
- each of the about 20,000 diffracted beams that have been measured contains scattered x-rays from each of the around 1500 atoms in the molecule
- the mathematical tool that is used to handle such problems is called the Fourier transform
- Each diffracted beam, which is recorded as a spot on the film, is defined by three properties: the amplitude, which we can measure from the intensity of the spot; the wavelength, which is set by the x-ray source; and the phase, which is lost in x-ray experiments. All three properties are needed for all of the diffracted beams in order to determine the position of the atoms giving rise to the diffracted beams
- MIR: multiple isomorphous replacement
- Isomorphous replacement is usually done by diffusing different heavy-metal complexes into the channels of the preformed protein crystals.
- the protein molecules expose side chains (such as SH groups) into these solvent channels that are able to bind heavy metals. It is also possible to replace endogenous light metals in metalloproteins with heavier ones, e.g., zinc by mercury, or calcium by samarium.
- Phase differences between diffracted spots can be determined from intensity changes following heavy-metal substitution.
- the intensity differences are used to deduce the positions of the heavy atoms in the crystal unit cell. Fourier summations of these intensity differences give maps of the vectors between the heavy atoms, the so-called Patterson maps. From these vector maps the atomic arrangement of the heavy atoms is deduced. From the positions of the heavy metals in the unit cell, one can calculate the amplitudes and phases of their contribution to the diffracted beams of protein crystals containing heavy metals.
- Because the phase and amplitude of the heavy metals and the amplitude of the protein alone are known, as well as the amplitude of the protein plus heavy metals (i.e., protein heavy-metal complex), one phase and three amplitudes are known. From this, the interference of the x-rays scattered by the heavy metals and protein can be calculated to see if it is constructive or destructive. The extent of positive or negative interference, with knowledge of the phase of the heavy metal, gives an estimate of the phase of the protein. Because two different phase angles are determined and are equally good solutions, a second heavy-metal complex can be used which also gives two possible phase angles.
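The two-fold phase ambiguity described above can be sketched numerically. The following Python example is a simplified illustration, assuming error-free data, non-zero amplitudes and the vector relation F_PH = F_P + F_H for each reflection; the function names and all numerical values are hypothetical.

```python
import cmath
import math


def candidate_protein_phases(amp_p, amp_ph, amp_h, phase_h):
    """Return the two protein phases (radians) consistent with one derivative.

    From F_PH = F_P + F_H it follows that
    |F_PH|^2 = |F_P|^2 + |F_H|^2 + 2 |F_P| |F_H| cos(phase_P - phase_H).
    Assumes amp_p and amp_h are non-zero.
    """
    cos_delta = (amp_ph ** 2 - amp_p ** 2 - amp_h ** 2) / (2.0 * amp_p * amp_h)
    cos_delta = max(-1.0, min(1.0, cos_delta))  # guard against rounding and noise
    delta = math.acos(cos_delta)
    return (phase_h + delta, phase_h - delta)


def resolve_phase(candidates_1, candidates_2):
    """Pick the candidate from the first derivative that best agrees with the second."""
    def angular_distance(a, b):
        # Wrap the difference into (-pi, pi] before taking its magnitude.
        return abs(cmath.phase(cmath.exp(1j * (a - b))))

    best = min(
        ((angular_distance(a, b), a) for a in candidates_1 for b in candidates_2),
        key=lambda pair: pair[0],
    )
    return best[1]


# Hypothetical reflection: true protein phase 1.0 rad, two heavy-atom derivatives.
f_p = cmath.rect(10.0, 1.0)
f_h1 = cmath.rect(3.0, 0.3)
f_h2 = cmath.rect(2.0, 2.0)
first = candidate_protein_phases(10.0, abs(f_p + f_h1), 3.0, 0.3)
second = candidate_protein_phases(10.0, abs(f_p + f_h2), 2.0, 2.0)
protein_phase = resolve_phase(first, second)
```

Each derivative alone yields two equally valid phases; only the phase common to both derivatives is retained. With real, noisy data the agreement is not exact, and phase probabilities are typically used instead of a single choice.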
- the map itself contains errors, mainly due to errors in the phase angles.
- the quality of the map depends on the resolution of the diffraction data, which in turn depends on how well-ordered the crystals are. This directly influences the image that can be produced.
- the resolution is measured in Å units; the smaller this number is, the higher the resolution and therefore the greater the amount of detail that can be seen.
- the initial model will contain some errors. Provided the protein crystals diffract to high enough resolution (e.g., better than 3.5 Å), most or substantially all of the errors can be removed by crystallographic refinement of the model using computer algorithms. In this process, the model is changed to minimize the difference between the experimentally observed diffraction amplitudes and those calculated for a hypothetical crystal containing the model (instead of the real molecule). This difference is expressed as an R factor (residual disagreement) which is 0.0 for exact agreement and about 0.59 for total disagreement
- the R factor is preferably between 0.15 and 0.35 (such as less than about 0.24-0.28) for a well-determined protein structure
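The R factor described above is conventionally computed as the sum of the absolute differences between observed and calculated amplitudes, divided by the sum of the observed amplitudes. The following Python sketch illustrates that calculation; the function name and the sample amplitudes are hypothetical, and the calculated amplitudes are assumed to be already scaled to the observed ones.

```python
def r_factor(observed_amplitudes, calculated_amplitudes):
    """Return R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over matching reflections."""
    if len(observed_amplitudes) != len(calculated_amplitudes):
        raise ValueError("both lists must describe the same reflections")
    denominator = sum(observed_amplitudes)
    if denominator <= 0:
        raise ValueError("observed amplitudes must have a positive sum")
    numerator = sum(
        abs(f_obs - f_calc)
        for f_obs, f_calc in zip(observed_amplitudes, calculated_amplitudes)
    )
    return numerator / denominator


# Hypothetical amplitudes for two reflections: |10 - 8| + |20 - 22| = 4, sum(Fo) = 30.
example_r = r_factor([10.0, 20.0], [8.0, 22.0])
```

A model whose calculated amplitudes match the observations exactly gives R = 0.0, consistent with the definition above.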
- the residual difference is a consequence of errors and imperfections in the data. These derive from various sources, including slight variations in the conformation of the protein molecules, as well as inaccurate corrections both for the presence of solvent and for differences in the orientation of the microcrystals from which the crystal is built. This means that the final model represents an average of molecules that are slightly different both in conformation and orientation
- Electron-density maps with this resolution range are preferably interpreted by fitting the known amino acid sequences into regions of electron density in which individual atoms are not resolved
- a PPCA or pPPCA polypeptide can refer to any subset of a PPCA or pPPCA as a domain, subdomain, fragment, consensus sequence or repeating unit thereof
- a PPCA or pPPCA polypeptide of the present invention can be prepared by, e.g., (a) recombinant DNA methods,
- a biological activity of PPCA or pPPCA can be screened according to known screening assays
- the minimum peptide sequence to have activity is based on the smallest unit containing or comprising a particular domain, subdomain, fragment, region, consensus sequence, or repeating unit thereof, having at least one biological activity of a PPCA or pPPCA, such as protecting activity, inhibiting activity or enzyme activity
- Non-limiting examples of such activities are protecting activity for β-galactosidase or neuraminidase (NA), modulating activity (inhibition, stimulation or activation) as an for endothelin I (serine carboxypeptidase) or cathepsin A and peptide hydrolyzing activity (e.g., substance P and substance P-free acid, oxytocin and oxytocin-free acid,
- a PPCA or pPPCA includes an association of two or more polypeptide subdomains, such as at least one 4 amino acid portion of a core or cap domain of a PPCA or pPPCA. This can include 1-14 subdomains of the cap domain and/or 1-44 subdomains of the core domain (as monomers or dimers), or any range, value or combination thereof. Preferably 1-4 sets of each of at least one core or cap domains or subdomains are included
- the structure of a monomer or domain of at least one PPCA includes at least one subdomain of a PPCA of a pPPCA of the present invention can include one or more of the following subdomains, as described herein. Generally, a PPCA or pPPCA consists of a dimer of a core domain and a cap domain having the following subdomains having the specified residues, e.g.,
- a PPCA or pPPCA polypeptide of the invention can have at least 80% homology, such as 80-100% overall homology or identity, with one or more corresponding PPCA or pPPCA subdomains or fragments as described herein, such as a 4-542 amino acid fragment or portion of the amino acid sequence of Figures 13, 14 or 15.
- the above configurations of subdomains are provided as part of a PPCA or pPPCA polypeptide of the invention, when expressed in a suitable host cell, or otherwise synthesized, to provide at least one structural or functional feature of a native PPCA or pPPCA, such as at least one PPCA-related biological activity.
- Such activities can be assayed using a suitable assay, to establish at least one PPCA biological activity of one or more PPCAs or pPPCAs of the invention.
- a PPCA or pPPCA polypeptide of the invention is not naturally occurring or is naturally occurring but is in a purified or isolated form which does not occur in nature.
- suitable PPCA activity assays include, e.g., cathepsin A activity (Galjart et al., J. Biol. Chem. 266:14754-14762 (1991)); Endothelin I deamidase activity (Jackman, et al., J. Biol. Chem.
- Percent homology or identity can be determined, for example, by comparing sequence information using the
- the GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) which are similar, divided by the total number of symbols in the shorter of the two sequences.
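The similarity measure described above can be illustrated with a short Python sketch. It is not the GAP program itself: the alignment is assumed to have been computed elsewhere and supplied as two equal-length strings with "-" marking gaps, and "similar" is taken to mean identical, as with a unitary comparison matrix. The function name and sequences are hypothetical.

```python
def gap_style_similarity(aligned_a, aligned_b, gap="-"):
    """Return matching aligned symbols divided by the length of the shorter sequence.

    Both inputs must be rows of one pairwise alignment (equal length, gaps as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Ungapped lengths of the two original sequences.
    length_a = len(aligned_a) - aligned_a.count(gap)
    length_b = len(aligned_b) - aligned_b.count(gap)
    shorter = min(length_a, length_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return matches / shorter


# Hypothetical aligned peptides: 3 identical positions over a 4-residue shorter sequence.
percent_identity = 100.0 * gap_style_similarity("ACDE", "ACGE")
```

Under this definition a fragment that aligns perfectly within a longer sequence scores 100%, because the denominator is the length of the shorter sequence.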
- the preferred default parameters for the GAP program include: (1) a unitary comparison matrix (containing a value of 1 for identities and 0 for non-identities) and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745 (1986), as described by Schwartz and Dayhoff, eds., ATLAS OF PROTEIN
- Non-limiting examples of substitutions of a PPCA or pPPCA domains or polypeptide of the invention are those in which at least one amino acid residue in the protein molecule has been removed and a different residue added in its place according to the following Table 2.
- the types of substitutions which can be made in the protein or peptide molecule of the invention can be based on analysis of the frequencies of amino acid changes between a homologous protein of different species, such as those presented in Figure 15. Based on such an analysis, alternative substitutions are defined herein as exchanges within one of the following five groups.
- deletions and additions, and substitutions according to the invention are those which do not produce radical changes in the characteristics of the protein or peptide molecule. "Characteristics" is defined in a non-inclusive manner to define both changes in secondary structure, e.g., α-helix or β-sheet, as well as changes in physiological activity, e.g., in biological activity assays
- PPCA or pPPCA screening assay such as, but not limited to, immunoassays or bioassays, to confirm at least one PPCA or pPPCA biological activity
- a PPCA and/or a pPPCA is now discovered to have serine carboxypeptidase activity and corresponding structural features, although having only about 30% sequence identity to wheat and yeast serine carboxypeptidases
- carboxypeptidases are members of the hydrolase fold family (Liao et al., Biochemistry 31:9796-9812 (1992); Endrizzi et al., Biochemistry 33:11106-11120 (1994); Ollis et al., Protein Eng. 5:197-211 (1992))
- the serine carboxypeptidases have peptidase activity at acidic pH (pH 4.5-5.5) as well as deamidase and esterase activities at pH 7 (reviewed in Breddam et al., Carlsberg Res. Commun. 51:83-128 (1986); Rawlings & Barrett, Methods in Enzymology 244:19-61 (1994)). Mutagenesis studies and enzymatic assays have revealed that only the mature form of PPCA possesses a serine carboxypeptidas
- a nucleic acid sequence encoding a PPCA or a pPPCA can be recombined with vector DNA in accordance with conventional techniques, including blunt-ended or staggered-ended termini for ligation, restriction enzyme digestion to provide appropriate termini, filling in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and ligation with appropriate ligases. Techniques for such manipulations are disclosed, e.g., in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second edition, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989), and Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience, N.Y. (1988-1995), and are well known in the art
- a nucleic acid molecule, such as DNA, is said to be "capable of expressing" a polypeptide if it contains nucleotide sequences which contain transcriptional and translational regulatory information and such sequences are “operably linked” to nucleotide sequences which encode the polypeptide
- An operable linkage is a linkage in which the regulatory DNA sequences and the DNA sequence sought to be expressed are connected in such a way as to permit gene expression as a PPCA, pPPCA or fragment thereof, in recoverable amounts
- the precise nature of the regulatory regions needed for gene expression can vary from organism to organism, as is well known in the analogous art. See, e.g., Sambrook, infra and Ausubel, infra
- the invention accordingly encompasses the expression of a PPCA or a pPPCA, in either prokaryotic or eukaryotic cells, although eukaryotic expression is preferred
- Preferred hosts are bacterial or eukaryotic hosts including bacteria, yeast, insects, fungi, bird and mammalian cells either in vivo, or in situ, or host cells of mammalian, insect, bird or yeast origin. It is preferred that the mammalian cell or tissue is of human, primate, hamster, rabbit, rodent, cow, pig, sheep, horse, goat, dog or cat origin, but any other mammalian cell can be used
- Eukaryotic hosts can include yeast, insects, fungi, and mammalian cells either in vivo, or in tissue culture. Preferred eukaryotic hosts can also include, but are not limited to, insect cells, mammalian cells either in vivo, or in tissue culture. Preferred mammalian cells include Xenopus oocytes, HeLa cells, cells of fibroblast origin such as VERO or CHO-1, or cells of lymphoid origin and their derivatives
- Mammalian cells provide post-translational modifications to protein molecules including correct folding or glycosylation at correct sites
- Mammalian cells which can be useful as hosts include cells of fibroblast origin such as, but not limited to, NIH 3T3, VERO or CHO, or cells of lymphoid origin, such as, but not limited to, the hybridoma SP2/0-Ag14 or the murine myeloma P3-X63Ag8, hamster cell lines (e.g., CHO-K1 and progenitors, e.g., CHO-DUXB11) and their derivatives
- One preferred type of mammalian cells are cells which are intended to replace the function of the genetically deficient cells in vivo. Neuronally derived cells are preferred for gene therapy of disorders of the nervous system
- For a mammalian cell host, many possible vector systems are available for the expression of at least one PPCA or pPPCA. A wide variety of transcriptional and translational regulatory sequences can be employed, depending upon the nature
- PPCA or pPPCA production can be achieved, for example, by infecting the insect host with a baculovirus engineered to express transmembrane polypeptide by methods known to those skilled in the related arts. See Ausubel, infra, §§ 16.8-16.11
- the introduced nucleotide sequence will be incorporated into a plasmid or viral vector capable of autonomous replication in the recipient host
- Any of a wide variety of vectors can be employed for this purpose. See, e.g., Ausubel et al., infra, §§ 1.5, 1.10, 7.1, 7.3, 8.1, 9.6, 9.7, 13.4, 16.2, 16.6, and 16.8-16.11
- Factors of importance in selecting a particular plasmid or viral vector include the ease with which recipient cells that contain the vector can be recognized and selected from those recipient cells which do not contain the vector, the number of copies of the vector which are desired in a particular host, and whether it is desirable to be able to "shuttle" the vector between host cells of different species
- Different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e g , glycosylation, cleavage) of proteins
- the DNA construct(s) can be introduced into an appropriate host cell by any of a variety of suitable means, i.e., transformation, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate-precipitation, direct microinjection, and the like
- recipient cells are grown in a selective medium, which selects for the growth of vector-containing cells
- Expression of the cloned gene molecule(s) results in the production of a PPCA or pPPCA. This can take place in the transformed cells as such, or following the induction of these cells
- a PPCA or pPPCA, or fragments thereof, of this invention can be obtained by expression from recombinant DNA according to known methods. Alternatively, a PPCA or pPPCA can be purified from biological material. A PPCA or a pPPCA can be purified from different mammalian tissues (e.g., human placenta, rat liver, mouse liver, pig kidney, bovine testes, bovine liver, and the like) of various genus and species
- the PPCA or pPPCA can be isolated and purified in accordance with conventional method steps, such as extraction, precipitation, chromatography, affinity chromatography, electrophoresis, or the like
- cells expressing at least one PPCA or pPPCA in suitable levels can be collected by centrifugation, or with suitable buffers, lysed, and the protein isolated by column chromatography, for example on DEAE-cellulose, phosphocellulose, polyribocytidylic acid-agarose, hydroxyapatite, or by electrophoresis or immunoprecipitation
- a pPPCA or PPCA can be isolated by the use of antibodies, such as, but not limited to, a PPCA- or pPPCA-specific antibody
- antibodies can be obtained by known method steps (see, e.g., Harlow and Lane, ANTIBODIES: A LABORATORY MANUAL, Cold Spring Harbor Laboratory (1988); Colligan et al., eds., Current Protocols in Immunology, Greene Publishing As
- a PPCA or a pPPCA can be purified from different mammalian tissues (e.g., human placenta, rat liver, mouse liver, pig kidney, bovine testes, bovine liver, and the like) of various genus and species, using known techniques such as gel filtration, phase separation and affinity chromatography, e.g., using polyclonal or monoclonal antibodies specific for a PPCA or pPPCA, according to known methods. See, e.g., Oxender et al., Protein Engineering, Liss, New York (1986)
- a PPCA or pPPCA is isolated in soluble form in sufficient purity and concentration (e.g., a monomer or dimer) for crystallization
- the PPCA or pPPCA is then isolated and assayed for biological activity (e.g., cathepsin
- the purified PPCA or pPPCA preferably runs as a single band for each monomer under reducing or nonreducing polyacrylamide gel electrophoresis (PAGE)
- the purified PPCA or pPPCA is preferably crystallized under varying conditions of at least one of the following: pH, buffer type, buffer concentration, salt type, polymer type, polymer concentration, other precipitating ligands and concentration of purified PPCA or pPPCA
- See, e.g., known methods (Blundell et al., Protein Crystallography, Academic Press, London (1976); Oxender, infra; McPherson, The Preparation and Analysis of Protein Crystals, Wiley Interscience, N.Y. (1982)) or methods provided in a commercial kit, such as CRYSTAL SCREEN (Hampton Research, Riverside, CA)
- the crystallized PPCA protein can optionally be tested for at least one PPCA activity and differently sized and shaped crystals are further tested for suitability for x-ray diffraction
- the hanging drop method is preferably used to crystallize the purified protein. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra; Taylor et al., J. Mol. Biol. 226:1287-1290 (1992); Takimoto et al. (1992), infra; CRYSTAL SCREEN, Hampton Research
- a mixture of the purified protein and precipitant can include the following
- buffer type: e.g., tromethamine (TRIZMA), sodium azide (NaN3), phosphate, sodium, or cacodylate acetates, imidazole, Tris HCl, sodium hepes
- buffer concentration: e.g., 1-100 M
- salt type: e.g., sodium azide, calcium chloride, sodium citrate, magnesium chloride, ammonium acetate, ammonium sulfate, potassium phosphate, magnesium acetate, zinc acetate, calcium acetate
- polymer type and concentration: e.g., polyethylene glycol (PEG) 1-50%, type 400-10,000
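Screening such mixtures amounts to enumerating combinations of the listed variables. The following Python sketch builds a small grid of conditions; the buffer, salt and PEG entries are drawn from the ranges listed above, while the specific pH values, the particular selections and the function name are assumptions made only for illustration.

```python
from itertools import product


def build_screen(ph_values, buffers, salts, peg_types, peg_percents):
    """Return one dictionary per crystallization condition in the full grid."""
    return [
        {"pH": ph, "buffer": buffer, "salt": salt, "PEG type": peg, "PEG %": percent}
        for ph, buffer, salt, peg, percent in product(
            ph_values, buffers, salts, peg_types, peg_percents
        )
    ]


screen = build_screen(
    ph_values=[5.0, 6.0, 7.0],                      # assumed values, not from the source
    buffers=["imidazole", "sodium hepes"],          # two of the listed buffer types
    salts=["ammonium sulfate", "sodium citrate"],   # two of the listed salt types
    peg_types=[400, 4000, 10000],                   # within the 400-10,000 range
    peg_percents=[10, 20, 30],                      # within the 1-50% range
)
```

A full grid grows quickly (here 3 x 2 x 2 x 3 x 3 = 108 conditions), which is one reason sparse or commercial screens are often used for initial trials.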
- a non-limiting example of such crystallization conditions is the following: • purified PPCA or pPPCA protein (e.g., 5 mg/ml), • (2) solutions in serial mixtures
- the above mixtures are used and screened by varying at least one of pH, buffer type, buffer concentration, precipitating salt type or additive or their concentrations, PEG type, PEG concentration, and protein concentration. Crystals ranging in size from 0.1-0.9 mm are formed in 1-14 days. These crystals diffract x-rays to at least 10 Å resolution, such as 0.15-100 Å, or any range of value therein, such as 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4 or 3.5, with 3.5 Å or higher being preferred for the highest resolution. In addition to diffraction patterns having this highest resolution, lower resolution, such as 25-3.5 Å, can also be used. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra. Protein Crystals. Crystals appear after 1-14 days and continue to grow on subsequent days. Some of the crystals can
- Crystals so produced for a PPCA or pPPCA are x-ray analyzed using a suitable x-ray source. Diffraction patterns are obtained. Crystals are preferably stable for at least 10 hrs in the x-ray beam. Frozen crystals (e.g., -220 to -50°C) are optionally used for longer x-ray exposures (e.g., 5-72 hrs), the crystals being relatively more stable to the x-rays in the frozen state. To collect the maximum number of useful reflections, multiple frames are optionally collected as the crystal is rotated in the x-ray beam, e.g., for 5-72 hrs. Larger crystals (>0.2 mm) are preferred, to increase the resolution of the x-ray diffraction patterns obtained. Crystals are preferably analyzed using a synchrotron high energy x-ray source. Using frozen crystals, x-ray diffraction data is collected on crystals that diffract to at least a relatively high resolution of 10-1.5 Å, with lower resolutions also useful, such as 25-
- the diffraction pattern can be visualized using, e.g., an image plate or film, resulting in an image with spots corresponding to the diffracted x-rays
- the positions of the spots in the diffraction pattern are used to determine parameters intrinsic to the crystal (such as unit cell parameters) and to gain information on the packing of the molecules in the crystal
- the intensity of the spots contains the Fourier transformation of the molecules in the crystal, i.e., information on each atom in the crystal and hence of the crystallized molecule
- the data is processed. This includes measuring the spots on each diffraction pattern in terms of position and intensity
- This information is processed (i.e., mathematical operations are performed on the data (such as scaling, merging and converting the data from intensity of diffracted beams to amplitudes)) to yield a set of data which is in a form as can be used for the further structure determination of the molecule crystallized
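The scaling, merging and conversion steps mentioned above can be sketched in Python as follows. This is a deliberately simplified illustration: it applies a single overall scale factor, averages repeated measurements of the same reflection, and takes the amplitude as the square root of the merged intensity. The function name, the data layout and the sample values are hypothetical, and actual data reduction programs apply additional corrections.

```python
import math
from collections import defaultdict


def merge_and_convert(measurements, scale=1.0):
    """Merge repeated intensity measurements and convert them to amplitudes.

    measurements: iterable of ((h, k, l), intensity) pairs.
    Returns a dict mapping (h, k, l) to the amplitude sqrt(mean scaled intensity).
    """
    grouped = defaultdict(list)
    for hkl, intensity in measurements:
        grouped[hkl].append(scale * intensity)

    amplitudes = {}
    for hkl, values in grouped.items():
        mean_intensity = sum(values) / len(values)
        # Weak reflections can merge to slightly negative values; clamp at zero here.
        amplitudes[hkl] = math.sqrt(max(mean_intensity, 0.0))
    return amplitudes


# Hypothetical spot measurements: reflection (1, 0, 0) was recorded twice.
spots = [((1, 0, 0), 90.0), ((1, 0, 0), 110.0), ((0, 1, 0), 25.0)]
amplitude_table = merge_and_convert(spots)
```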
- the amplitudes of the diffracted x-rays are then combined with calculated phases to produce an electron density map of the contents of the crystal. In this electron density map, the structure of the molecules (as present in the crystal) is built.
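How amplitudes and phases combine into a density map can be shown with a one-dimensional toy model in Python. The sketch assumes a hypothetical unit cell containing two point atoms, computes their structure factors, and then sums the Fourier series to recover the density; a real map is three-dimensional and uses measured amplitudes, so all names and values here are illustrative only.

```python
import cmath
import math


def structure_factors(atoms, max_index):
    """Return F(h) for h = -max_index..max_index.

    atoms: list of (fractional_position, scattering_weight) pairs.
    """
    return {
        h: sum(w * cmath.exp(2j * math.pi * h * x) for x, w in atoms)
        for h in range(-max_index, max_index + 1)
    }


def electron_density(amplitudes, phases, grid_points):
    """Sum the Fourier series |F(h)| exp(i phase(h)) exp(-2 pi i h x) on a grid."""
    density = []
    for i in range(grid_points):
        x = i / grid_points
        total = sum(
            amplitudes[h] * cmath.exp(1j * phases[h]) * cmath.exp(-2j * math.pi * h * x)
            for h in amplitudes
        )
        density.append(total.real)
    return density


# Hypothetical cell: a light atom at x = 0.25 and a heavier atom at x = 0.60.
atoms = [(0.25, 1.0), (0.60, 2.0)]
factors = structure_factors(atoms, max_index=10)
amplitudes = {h: abs(f) for h, f in factors.items()}
phases = {h: cmath.phase(f) for h, f in factors.items()}
density = electron_density(amplitudes, phases, grid_points=100)
```

The highest peak of the computed density falls at the grid point for x = 0.60, the position of the heavier atom. Replacing the phases with incorrect values while keeping the amplitudes would move or destroy the peaks, which is why phase determination is essential.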
- the phases can be determined with various known techniques, one being molecular replacement.
- the phases can be further optimized using a technique called density modification, which allows electron density maps of better quality to be produced facilitating interpretation and model building therein.
- the atomic model is then refined by allowing the atoms in the model to move in order to match the diffraction data as well as possible while continuing to satisfy stereochemical constraints (sensible bond lengths, bond angles and the like). See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra.
Computer Related Embodiments
- An amino acid sequence of a PPCA or pPPCA and/or atomic coordinate/x-ray diffraction data useful for computer structure determination of a PPCA, pPPCA or a portion thereof, can be "provided” in a variety of mediums to facilitate use thereof.
- provided refers to a manufacture, which contains a PPCA or pPPCA amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention, e.g., the amino sequence provided in Figures 13-15, a representative fragment thereof, or an amino acid sequence having at least 80-100% overall identity to a 5-542 amino acid fragment of an amino acid sequence of Figures 13-15.
- Such a method provides the amino acid sequence and/or atomic coordinate/x-ray diffraction data in a form which allows a skilled artisan to analyze and determine the three-dimensional structure of a PPCA, a pPPCA or a subdomain thereof.
- PPCA, pPPCA, or at least one subdomain thereof, amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention is recorded on computer readable media.
- computer readable media refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
- any of the presently known computer readable media can be used to create a manufacture comprising computer readable medium having recorded thereon an amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention.
- “recorded” refers to a process for storing information on computer readable medium.
- a skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising an amino acid sequence and/or atomic coordinate/x-ray diffraction data information of the present invention.
- a variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon an amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention.
- the choice of the data storage structure will generally be based on the means chosen to access the stored information.
- a variety of data processor programs and formats can be used to store the sequence and x-ray data information of the present invention on computer readable medium.
- the sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and MICROSOFT Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like.
- a skilled artisan can readily adapt any number of dataprocessor structuring formats (e.g. text file or database) in order to obtain computer readable medium having recorded thereon the information of the present invention.
- a skilled artisan can routinely access the sequence and atomic coordinate or x-ray diffraction data to model a PPCA, pPPCA, a subdomain thereof, or a ligand thereof.
- Computer algorithms are publicly and commercially available which allow a skilled artisan to access this data provided on a computer readable medium and analyze it for structure determination and/or RDD. See, e.g., Biotechnology Software Directory, Mary Ann Liebert Publ., New York (1995)
- the present invention further provides systems, particularly computer-based systems, which contain the sequence and/or diffraction data described herein
- Such systems are designed to do structure determination and RDD for a PPCA, pPPCA or at least one subdomain thereof
- Non-limiting examples are microcomputer workstations available from Silicon Graphics Incorporated and Sun Microsystems running Unix-based, Windows NT or IBM OS/2 operating systems
- a computer-based system refers to the hardware means, software means, and data storage means used to analyze the sequence and/or atomic coordinate/x-ray diffraction data of the present invention
- the minimum hardware means of the computer-based systems of the present invention comprises a central processing unit
- the computer-based systems of the present invention comprise a data storage means having stored therein a PPCA, pPPCA or fragment sequence and/or atomic coordinate/x-ray diffraction data of the present invention and the necessary hardware means and software means for supporting and implementing an analysis means
- data storage means refers to memory which can store sequence or atomic coordinate/x-ray diffraction data of the present invention, or a memory access means which can access manufactures having recorded thereon the sequence or x-ray data of the present invention
- search means or “analysis means” refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence or x-ray data stored within the data storage means. Search means are used to identify fragments or regions of a PPCA or pPPCA which match a particular target sequence or target motif
- a target structural motif refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration or electron density map which is formed upon the folding of the target motif
- target motifs include, but are not limited to, enzymic active sites, structural subdomains, epitopes, functional domains and signal sequences
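The idea of a search means that compares a target motif with a stored sequence can be sketched as a simple pattern scan. This is an illustration only: the function name `find_motif`, the toy sequence (which is not the PPCA sequence) and the Gly-x-Ser-x-Gly target pattern are assumptions, and real search programs use far more elaborate comparison methods.

```python
import re


def find_motif(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched fragment) for every hit.

    A lookahead is used so that overlapping hits are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Made-up sequence for illustration only; it is NOT the PPCA sequence.
toy_sequence = "MKTAYGESYGGLLVGDSGGPLV"
# Hypothetical target motif: Gly-x-Ser-x-Gly, where x is any residue.
hits = find_motif(toy_sequence, "G.S.G")
```

Here `hits` lists each region of the stored sequence that matches the target motif, together with its position.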
- a variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention
- a variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify structural motifs or interpret electron density maps derived in part from the atomic coordinate/x-ray diffraction data
- any one of the publicly available computer modeling programs can be used as the search means for the computer-based systems of the present invention
- Figure 22 provides a block diagram of a computer system 102 that can be used to implement the present invention
- the computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage memory 110, such as a hard drive 112 and a removable medium storage device 114, and a monitor 120
- the removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc.
- a removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114
- the computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116 once inserted in the removable medium storage device 114
- Amino acid, encoding nucleotide or other sequence and/or atomic coordinate/x-ray diffraction data of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116
- Software for accessing and processing the amino acid sequence and/or atomic coordinate/x-ray diffraction data resides in main memory 108 during execution
- the monitor 120 is optionally used to visualize the structure data
- Structure Determination
- One or more computational steps, computer programs and/or computer algorithms are used to build a molecular
- x-ray diffraction data and phases are combined to produce electron density maps in which the three-dimensional structure of a PPCA or pPPCA is then built or modeled. This structure can then be used for RDD of modulators of at least one PPCA- or pPPCA-related activity that is relevant to at least one PPCA- or pPPCA-related pathology
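How amplitudes and phases combine into an electron density map can be illustrated with a one-dimensional Fourier synthesis. This is a toy sketch under stated assumptions: the one-dimensional cell, the point "atoms" with invented positions and weights, and the function names are hypothetical, and the 1/V scale factor and all experimental corrections are omitted. Each complex structure factor carries both an amplitude and a phase.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for a one-dimensional toy cell."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def electron_density(factors, n_grid):
    """rho(x) = sum_h F(h) * exp(-2*pi*i*h*x), sampled on n_grid points."""
    density = []
    for i in range(n_grid):
        x = i / n_grid
        rho = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
        density.append(rho.real)  # imaginary parts cancel between F(h) and F(-h)
    return density


# Hypothetical toy cell: (fractional position, scattering weight) pairs.
toy_atoms = [(0.25, 8.0), (0.70, 6.0)]
factors = structure_factors(toy_atoms, h_max=10)
density = electron_density(factors, n_grid=100)
peak_index = max(range(len(density)), key=density.__getitem__)
```

The highest density peak falls on the grid point of the heavier toy atom, which is the sense in which a model is "built" into the map.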
- Electron density maps can be calculated using such programs as those from the CCP4 computing package (SERC (UK) Collaborative Computing Project 4, Daresbury Laboratory,
- Cycles of two-fold averaging can further be used, such as with the program RAVE (Kleywegt & Jones, in Bailey et al., eds., First Map to Final Model, SERC Daresbury Laboratory, UK, pp. 59-66 (1994)), and gradual model expansion
- For map visualization and model building, a program such as "O" (Jones (1991), infra) can be used
- Rigid body and positional refinement can be carried out using a program such as X-PLOR (Brünger (1992), infra), e.g., with the stereochemical parameters of Engh and Huber (Acta Cryst
- the model at this stage in the averaged maps still misses residues (e.g., at least 5-10 per subunit)
- some or all of the missing residues can be incorporated in the model during additional cycles of positional refinement and model building
- the refinement procedure can start using data from lower resolution (e.g., 25-10 Å to
- a program such as ARP (Lamzin and Wilson, Acta Cryst. D49:129-147 (1993)) can be used to add crystallographic waters and as a tool to check for bad areas in the model. Programs such as PROCHECK
- Cyclical two-fold density averaging can also be done to improve the electron density maps using a suitable program
- model expansion can also be used to add missing residues for each monomer, resulting in a model with
- the model can be refined in a program such as X-PLOR (Brünger (1992), supra) to a suitable crystallographic R factor.
- the model data is then saved on computer readable media for use in further analysis such as rational drug design
- Rational Design of Drugs that Interact with the PPCA or pPPCA
- the determination of the three-dimensional structure of a PPCA or pPPCA, as described herein, provides a basis for the design of new and specific ligands for the diagnosis and/or treatment of at least one PPCA- or pPPCA-related pathology
- Several approaches can be taken for the use of the crystal structure of a PPCA or pPPCA in the rational design of ligands of this protein
- a computer-assisted, manual examination of the active site structure is optionally done
- software such as GRID (Goodford, J. Med. Chem. 28:849-857 (1985)), a program that determines probable interaction sites between probes with various functional group characteristics and the enzyme surface, is used to analyze the active site to determine structures of inhibiting compounds
- the program calculations, with suitable inhibiting groups on molecules (e.g., protonated primary amines) as the probe, are used to identify potential hotspots around accessible positions at suitable energy contour levels. Suitable ligands, as inhibiting or stimulating modulating compounds or compositions,
- a diagnostic or therapeutic PPCA or pPPCA modulating ligand of the present invention can be, but is not limited to, at least one selected from a nucleic acid, a compound, a protein, an element, a lipid, an antibody, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a glycoprotein, an enzyme, a detectable probe, an antibody or fragment thereof, or any combination thereof, which can be detectably labeled as for labeling antibodies
- labels include, but are not limited to, enzymatic labels, radioisotope or radioactive compounds or elements, fluorescent compounds or metals, chemiluminescent compounds and bioluminescent compounds
- any other known diagnostic or therapeutic agent can be used in a method of the invention. After preliminary experiments are done to determine the K_m of the substrate with each enzyme activity of a
- 0 K,((A 1 + K,) / K.) using PROC NLIN from SAS (SAS Institute Inc., Cary, North Carolina, USA), which performs nonlinear regression using least-squares techniques
- the iterative method used is optionally the multivariate secant method, similar to the Gauss-Newton method except that the derivatives in the Taylor series are estimated from the history of iterations rather than supplied analytically
- a suitable convergence criterion is optionally used, e.g., where there is a change in loss function of less than 10^-8
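The kind of iterative least-squares fit described above can be sketched as follows. This is not the SAS procedure and not the multivariate secant method itself: it is a plain Gauss-Newton iteration in which the derivatives are estimated numerically by forward differences instead of being supplied analytically. The Michaelis-Menten rate law, the synthetic noise-free data, the starting values and all function names are assumptions made for illustration. Iteration stops when the change in the loss function falls below 1e-8.

```python
def velocity(s, vmax, km):
    """Michaelis-Menten rate law (assumed model for this sketch)."""
    return vmax * s / (km + s)


def sum_of_squares(data, vmax, km):
    return sum((v - velocity(s, vmax, km)) ** 2 for s, v in data)


def fit_michaelis_menten(data, start, tol=1e-8, max_iter=100):
    """Gauss-Newton with forward-difference derivatives and step halving."""
    vmax, km = start
    loss = sum_of_squares(data, vmax, km)
    for _ in range(max_iter):
        # Residuals and numerically estimated partial derivatives.
        h_v = 1e-6 * max(abs(vmax), 1.0)
        h_k = 1e-6 * max(abs(km), 1.0)
        rows = []
        for s, v in data:
            base = velocity(s, vmax, km)
            d_v = (velocity(s, vmax + h_v, km) - base) / h_v
            d_k = (velocity(s, vmax, km + h_k) - base) / h_k
            rows.append((d_v, d_k, v - base))
        # Solve the 2x2 normal equations (J^T J) delta = J^T r by Cramer's rule.
        a = sum(dv * dv for dv, _, _ in rows)
        b = sum(dv * dk for dv, dk, _ in rows)
        c = sum(dk * dk for _, dk, _ in rows)
        g_v = sum(dv * r for dv, _, r in rows)
        g_k = sum(dk * r for _, dk, r in rows)
        det = a * c - b * b
        step_v = (c * g_v - b * g_k) / det
        step_k = (a * g_k - b * g_v) / det
        # Halve the step until the loss no longer increases.
        scale = 1.0
        new_loss = sum_of_squares(data, vmax + step_v, km + step_k)
        while new_loss > loss and scale > 1e-6:
            scale /= 2.0
            new_loss = sum_of_squares(data, vmax + scale * step_v, km + scale * step_k)
        vmax += scale * step_v
        km += scale * step_k
        converged = abs(loss - new_loss) < tol  # change-in-loss criterion
        loss = new_loss
        if converged:
            break
    return vmax, km


# Synthetic, noise-free data generated from hypothetical Vmax = 10 and Km = 2.
substrate = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
data = [(s, velocity(s, 10.0, 2.0)) for s in substrate]
fitted_vmax, fitted_km = fit_michaelis_menten(data, start=(5.0, 1.0))
```

With real, noisy measurements the fitted values would carry uncertainty, and a production fit would also report standard errors, which this sketch leaves out.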
- crystallographic studies of the compounds complexed to a PPCA or pPPCA can be performed. As a non-limiting example, PPCA or pPPCA crystals are soaked for 2 days in 0.01-100 mM ligand and x-ray diffraction data are collected on an area detector and/or an image plate detector (e.g., a Mar image plate detector) using a rotating anode
- a PPCA or pPPCA ligand is any molecule, compound or composition that is capable of associating with a PPCA or pPPCA and optionally modulating at least one function or structural feature of a PPCA or pPPCA
- a PPCA or pPPCA ligand modulates at least one biological activity of a PPCA or pPPCA. Demonstration of clinically useful levels, e.g., in vivo activity, is also important
- PPCA or pPPCA inhibitors for biological activity in animal models, e.g., rat, mouse, rabbit
- the present invention also provides methods for identifying diagnostic or therapeutic ligands of PPCA or pPPCA via computer RDD, to treat a PPCA-related pathology
- a method for determining the therapeutic or diagnostic use of a PPCA or pPPCA modulating ligand, to treat a PPCA related pathology, comprises the steps of administering a known dose of at least one ligand-containing composition to an animal model having a phenotype corresponding to a PPCA-related pathology, monitoring the appropriate biological or biochemical parameters, and comparing the results from treated animals to those of untreated animals.
- Results indicating the onset or presence of a PPCA related pathology are generally referred to herein as "symptoms" of the disease. See, e.g., U.S. Appl. No. 08/397,693, filed March 2, 1995, which is entirely incorporated herein by reference
- Appropriate biological and biochemical parameters that reflect the onset and progression of a PPCA related pathology include, but are not limited to, (1) gross biological parameters, e.g., physical appearance (i.e., flattening of the face, rough haircoat and/or subcutaneous swelling in affected animals) or growth (reduced weight gain), (2) gross behavioral parameters, e.g., lack of coordination, (3) biochemical assays, e.g., assays of cathepsin A, N-acetyl-α-neuraminidase or β-galactosidase activities in primary cultures of skin fibroblasts or tissue homogenates, (4) histopathological studies (visceromegaly, i.e., enlarged liver and spleen, accumulation of secondary vacuoles in kidney tissues, etc.)
- a first method of evaluating the therapeutic potential of a composition using the transgenic non-human animals of the invention comprises the steps of
- a second method of evaluating the therapeutic potential of a composition using the non-human animals of the invention comprises the steps of
- the composition being tested may comprise a chemical compound administered by circulatory injection or oral ingestion
- the composition being evaluated may alternatively comprise a polypeptide administered by circulatory injection of an isolated or recombinant bacterium or virus that is live or attenuated wherein the polypeptide is present on the surface of the bacterium or virus prior to injection, or a polypeptide administered by circulatory injection of an isolated or recombinant bacterium or virus capable of reproduction within a non-human animal, and the polypeptide is produced within a non-human animal by genetic expression of a DNA sequence encoding the polypeptide
- the composition being evaluated may comprise one or more nucleic acids, including a gene from the human genome or a processed RNA transcript thereof
- the composition being evaluated may comprise cells removed from a mammal and genetically engineered to overexpress a lysosomal protein or some other therapeutic polypeptide
- Once the PPCA modulating ligand has been shown to be effective in an animal model, it can then be tested in human clinical trials, according to known method steps. In the above methods, delivery of the composition being tested to non-human animals is achieved via means appropriate for the composition being tested, e.g., by diet, by intermittent or continuous intravenous injection of one or more of the compositions or of a liposome (Rahman and Schein, in Liposomes as Drug Carriers, Gregoriadis, ed., John Wiley, New York (1988), pages 381-400; Gabizon, A .
- the present invention further provides a method for modulating the activity of a PPCA or pPPCA protein in a cell
- ligands antagonists or agonists
- ligands which have been identified to inhibit or enhance the activity of at least one PPCA or pPPCA ligand can be formulated so that the ligand can be contacted with a cell expressing at least one PPCA or pPPCA protein in vivo
- the contacting of such a cell with such a ligand results in the in vivo modulation of at least one biological activity of a PPCA or pPPCA
- At least one PPCA or pPPCA modulating compound or composition of the invention can be administered by any means that achieve the intended purpose, using a suitable pharmaceutical composition or formulation
- administration can be by various parenteral routes such as subcutaneous, intravenous, intradermal, intramuscular, intraperitoneal, intranasal, intracranial, transdermal, or buccal routes
- Parenteral administration can be by bolus injection or by gradual perfusion over time
- a typical regimen for treatment or prophylaxis comprises administration of an effective amount over a period of one or several days, up to and including between one week and about six months
- dosage of a diagnostic/pharmaceutical compound or composition of the invention administered in vivo or in vitro will be dependent upon the age, sex, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the diagnostic/pharmaceutical effect desired
- the ranges of effective doses provided herein are not intended to be limiting and represent preferred dose ranges. However, the most preferred dosage will be tailored to the individual subject, as is understood and determinable by one skilled in the relevant arts. See, e.g., Berkow et al., eds., The Merck Manual, 16th edition, Merck and Co., Rahway, N.J., 1992; Goodman et al., eds., Goodman and
- the total dose required for each treatment can be administered by multiple doses or in a single dose
- the diagnostic/pharmaceutical compound or composition can be administered alone or in conjunction with other diagnostics and/or pharmaceuticals directed to the pathology, or directed to other symptoms of the pathology
- Effective amounts of a diagnostic/pharmaceutical compound or composition of the invention are from about 0.1 µg to about 100 mg/kg body weight, administered at intervals of 4-72 hours, for a period of 2 hours to 1 year, and/or any range or value therein
- the recipients of administration of compounds and/or compositions of the invention can be any mammals
- the preferred recipients are mammals of the Orders Primata (including humans, apes and monkeys),
- Artiodactyla (including horses, goats, cows, sheep, pigs),
- Rodentia (including mice, rats, rabbits, and hamsters),
- and Carnivora (including cats and dogs)
- the present invention provides, in one aspect, the determination of the three-dimensional structure of the human protective protein/cathepsin A (PPCA) in the precursor form (pPPCA) by a combination of molecular replacement and twofold density averaging
- the structure presented here is the first of an enzyme associated with a human PPCA related pathology and the third human lysosomal enzyme structure determined
- the structure gives us insight into the zymogen activation mechanism of pPPCA as well as the expected 3-D structure of PPCA and its specific and new enzymatic activities
- PPCA and pPPCA Expression and Purification
- pBC3 had the polylinker situated in a similar position as pJR2, but instead of the 33-nt deletion this plasmid featured an ATG codon mutated to ACG. Full-length human PPCA cDNA, PPCA54 (Galjart et al., 1988), and the two deletion cDNA mutants, 32(Δ20) and 20(Δ32) (Galjart et al., 1991), were subcloned either in pJR2 or pBC3 as EcoRI fragments, using standard procedures (Sambrook et al., 1989) (Figure 1). The 20(Δ32) deletion mutant was tagged with the human PPCA signal sequence, as reported earlier (Galjart et al., 1991). All cDNA fragments were engineered to have short 3' and 5' untranslated regions ( ⁇ 10 bp).
- IPLB-SF21 Spodoptera frugiperda insect cells
- Recombinant constructs AcPPCA54, AcPPCA32 and AcPPCA20 were generated by cotransfecting Sf21 cells with 1 µg wt-AcMNPV DNA and 10 µg plasmid DNA, using the calcium phosphate method, modified for insect cells (Graham et al., 1973; Carstens et al., 1980; Summers et al., 1987). Polyhedrin-negative recombinant baculoviruses were then selected and purified by sequential plaque assays, and verified by dot blot and Southern blot analysis (Summers et al., 1987).
- Large quantities of inoculum were produced by infection of insect cells at 25-50% confluency with recombinant virus at a multiplicity of infection (MOI) of ⁇ 1 pfu/cell. After 3 to 6 days at 27 °C, when all cells appeared infected, the medium was harvested and centrifuged for 5 min at 1000 rpm to remove detached cells. The titre of the inoculum was determined by plaque assay analysis.
- Sf21 cells were cultured in either 175 cm² or 500 cm² flasks (triple flask, Nunc) to near confluency, and infected with recombinant baculoviruses at a MOI of 5-10 pfu/cell. After 1.5 h incubation at 27 °C, the inoculum was replaced with complete medium for an additional 8 to 10 h. Cell monolayers were then rinsed with PBS and cultured further for 38 h in unsupplemented Grace's medium. After infection the medium was collected, centrifuged for 5 min at 1500 g, and for 1 h at 100,000 g (Beckman SW-28 rotor) to remove virus particles.
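The volume of inoculum needed for a chosen MOI follows from simple arithmetic: volume = MOI x number of cells / titre. The sketch below shows that calculation; the function name, the cell count and the titre are hypothetical values chosen for illustration and are not given in the source.

```python
def inoculum_volume_ml(n_cells, moi, titre_pfu_per_ml):
    """Volume of virus stock needed so that each cell receives `moi` pfu."""
    if titre_pfu_per_ml <= 0:
        raise ValueError("titre must be positive")
    return moi * n_cells / titre_pfu_per_ml


# Hypothetical flask: 2e7 cells, stock titre 1e8 pfu/ml, target MOI of 5 pfu/cell.
volume = inoculum_volume_ml(n_cells=2e7, moi=5, titre_pfu_per_ml=1e8)
```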
- the antibodies, designated anti-pep, were tested on immunoblots and by immunoprecipitations of baculovirus-produced PPCA.
- Blots were incubated for at least 12 h in blocking buffer (0.01 M tris-buffered saline pH 8.0 (TBS), 0.05% Tween 20 and ⁇ % (w/v) BSA), and subsequently probed for 2 h with polyclonal PPCA antibodies, anti-54, diluted 1:200 in fresh blocking buffer. They were then washed for 1 h in TBS, 0.05% Tween 20, and incubated for 2 h with alkaline phosphatase conjugate anti-rabbit IgG (Sigma, 1:1000 in blocking buffer). Proteins were visualized using alkaline phosphatase substrate (Sigma, 4-aminodiphenylamine diazonium sulfate, naphthol AS-MX phosphate).
- Crystals were grown using the hanging drop vapor diffusion technique. Crystals suitable for data collection were grown using a reservoir solution containing 2-10% PEG 8000, pH 8.0-8.3, 50 mM TRIZMA, 1 mM NaN3, 0.25% β-octyl glucoside at 4-12 °C. Mixing non-equal volumes of protein solution (in the range 5-10 µl) and reservoir solution (in the range 2-6 µl) enhanced the occurrence of single large crystals per drop under these crystallization conditions. The concentration of the protein solution before mixing was 5 mg/ml. Crystal growth was enhanced by macrocrystallization techniques (anything that promotes growth of big crystals) and in some cases by micro- and macroseeding techniques
- Example 2 Structure Determination of a pPPCA Crystallized from Human Cells
- Data Collection, Data Processing and Reduction
- the crystals were cryoprotected by adding glycerol in 5%-10% steps to a solution of about 12% PEG 8000, 50 mM TRIZMA, pH 8.0, 1 mM NaN3, 0.25% β-octyl glucoside, which served as an artificial mother liquor
- the crystals were incubated for half an hour at 40°C after each addition of glycerol
- the final mother liquor contained 30% glycerol. Gradually increasing the glycerol was needed to help keep the crystals from cracking
- Diffraction data was collected at the Stanford Synchrotron Radiation Laboratories (SSRL) to 2.0 Å at -178 °C on a MAR imaging plate at a wavelength of 1.08 Å on beam-line 7-1
- the diffraction coordinate data (corresponding to atomic coordinates of monomer 1; the other monomer coordinates are provided by matrix conversion of these coordinates, as presented herein) was processed and reduced using MOSFLM version 5.2 from the CCP4 program package (SERC (UK) Collaborative Computing Project 4, Daresbury Laboratory, UK, 1979)
- the 'multi-Ala core' search model was constructed from the atomic coordinates of the CPW monomer (Liao et al., 1992), based on the sequence alignment as presented in Figure 15. Regions expected to deviate in structure between PPCA and
- Residues were only incorporated in the model where the electron density was visible for the complete side chain. Residues from the search model for which no density was visible were removed. An alanine was built in the model at places where electron density for a side chain was partial. In this manner 294 residues, i.e. 65% of the Cα atoms, were built in the 'best monomer' core. The second monomer was generated from the 'best monomer' model using the NCS operator relating the two monomers in the asymmetric unit. At this point the data set was partitioned in a working set and a test set consisting of 5% of the reflections between 8 and 2.2 Å to monitor the R_free (Brünger et al., 1992b).
- the working data set was used for rigid body and positional refinement.
- the unpartitioned data set was used for averaging and map calculations. Twenty-five cycles of refinement using the two 'best monomer' cores positioned in the asymmetric unit as rigid bodies and data from 8.0 to 3.0 Å resulted in an R-factor of 53.5% for this resolution range.
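The partitioning into a working set and a test set, and the R-factor monitored on each, can be sketched as below. This is a simplified illustration, not the X-PLOR implementation: the function names, the random seed and the toy reflection list are assumptions, and observed and calculated amplitudes are taken to be on the same scale already.

```python
import random


def split_reflections(reflections, test_fraction=0.05, seed=0):
    """Randomly partition reflections into a working set and a test set."""
    rng = random.Random(seed)
    shuffled = list(reflections)
    rng.shuffle(shuffled)
    n_test = max(1, round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]


def r_factor(reflections):
    """R = sum| |Fobs| - |Fcalc| | / sum|Fobs| over (Fobs, Fcalc) pairs."""
    numerator = sum(abs(abs(f_obs) - abs(f_calc)) for f_obs, f_calc in reflections)
    denominator = sum(abs(f_obs) for f_obs, _ in reflections)
    return numerator / denominator


# Invented amplitudes: every calculated value is 80% of the observed one,
# so any subset of these reflections has an R-factor of exactly 20%.
toy_reflections = [(100.0 + i, 0.8 * (100.0 + i)) for i in range(200)]
working_set, test_set = split_reflections(toy_reflections)
r_work = r_factor(working_set)
r_free = r_factor(test_set)
```

Because the test reflections are never used in refinement, a free R-factor that falls together with the working R-factor indicates genuine model improvement and not over-fitting.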
- the atomic coordinates of this partial model were used to calculate a new 2m
- Model Building
- A conservative model building strategy was adopted. Initially only side chains were mutated in the core region to fit the PPCA amino acid sequence and, where the density was clear, poly-alanine fragments were built in the insertion areas (loops and the cap domain). Newly included atoms were given a B-factor of 20 Å². Only once models bmc5 and bmc ⁇ were obtained was the electron density of sufficient quality to allow side chains to be incorporated confidently in the cap domain (residues 190-303). At this stage the Cα trace was virtually complete for the whole dimer and the sequence could be fit unambiguously
- Positional refinement was postponed until after 3 cycles of bootstrapping, resulting in a final model containing 91% of the Cα atoms. Forty steps of positional refinement were then carried out to improve the geometry of the model. Subsequently only one of the refined monomers was taken and the other generated using NCS operators. The rationale for delaying the positional refinement is addressed in the discussion
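Generating one monomer from the other with an NCS operator amounts to applying a rotation matrix and a translation vector to every atom. A minimal sketch follows; the two-fold operator, the shift and the coordinates are invented for illustration and are not the operator determined for the pPPCA crystal.

```python
def apply_operator(rotation, translation, point):
    """Return R * point + t for a 3x3 rotation matrix and a translation vector."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def generate_copy(coords, rotation, translation):
    """Apply the same operator to every atom of a monomer."""
    return [apply_operator(rotation, translation, p) for p in coords]


# Hypothetical two-fold operator: 180 degree rotation about z plus a shift in x.
TWO_FOLD = ((-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 1.0))
SHIFT = (10.0, 0.0, 0.0)

monomer_a = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]  # invented coordinates
monomer_b = generate_copy(monomer_a, TWO_FOLD, SHIFT)
```

For a proper two-fold with a shift perpendicular to the axis, applying the operator a second time maps the copy back onto the original monomer, which is a convenient sanity check.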
- the program ARP was used to check our model, in particular the region at the dimer interface (Lamzin & Wilson, 1993). Prior to the final round of positional refinement, an |F_obs|/σ cutoff was applied to reject 10% of the weakest data, as well as an anisotropic scale factor to offset the decreased resolution along the crystallographic a axis.
- the final model is of good geometry with a final R-factor of 21.3% (R_free of 26.8%) for data between 8.0 and 2.2 Å (see Table 3).
- a Ramachandran plot is given in Figure 21.
- the r.m.s. coordinate error is 0.282 as calculated by SigmaA (Read, 1986).
- the average phase difference between the initial molecular replacement model and the currently refined model is calculated to be 71° for data between 10 and 2.2 Å.
- the structure determination of PPCA is special in that two-fold averaging could be applied to refine very poor molecular replacement phases, enabling us to retrieve electron density for 148 residues and 185 side chains per monomer. In total 314 complete residues were added per asymmetric unit, equivalent to about 35 kDa of protein. In retrospect we feel that a number of factors contributed to a successful structure determination.
- Crystal Packing
- Each monomer in the crystal is interacting with four non-crystallographically related monomers. By far the most extensive contact is with a non-crystallographically related monomer generating the physiological dimer. Three additional contacts are extensive crystal contacts ranging from 200-800 Å² averaged per monomer.
- pPPCA and the Hydrolase Family
- The fold of pPPCA belongs to the large hydrolase fold family containing enzymes such as the serine carboxypeptidases, dehalogenase, various lipases and acetylcholine esterase (Ollis et al. (1992), infra), having various different catalytic functions.
- pPPCA has one of the largest cap domains, comprising 121 residues forming the three helical bundle of the helical subdomain and a three stranded β-sheet of the maturation subdomain.
- the overall fold of the pPPCA monomer is similar to that of the wheat and yeast serine carboxypeptidases (Endrizzi et al. (1994), infra; Ollis et al. (1992), infra).
- the complete core domains of pPPCA and CPW superimpose with an r.m.s. deviation of 1.7 Å for 302 Cα atoms and 38% sequence identity. Deleting major deviating loops from the core domain allows for pPPCA to superimpose with an r.m.s. deviation of 1.2 Å onto CPW and CPY (293 equivalent Cα's with 40% sequence identity for CPW/pPPCA and 271 equivalent Cα's for CPY/pPPCA with 42.2% identity).
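The r.m.s. deviation quoted for superimposed structures is the square root of the mean squared distance between equivalent atoms. The sketch below computes it for two coordinate lists that are assumed to be already superimposed and paired atom by atom; the optimal superposition itself is not shown, and the coordinates are invented.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Invented coordinates: each atom pair is exactly 1 Å apart.
set_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (3.0, 0.0, -1.0)]
deviation = rmsd(set_a, set_b)
```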
- the cap domain in pPPCA differs significantly from the CPW and CPY counterparts.
- the Nδ1 of His 429 is 2.7 Å removed from the Oδ2 and 3.3 Å from the Oδ1 of Asp 372. Further, two backbone amides appear to orient the carboxylate group of Asp 372.
- the N of Ala 374 is at a distance of 3.0 Å to the Oδ1 of Asp 372 and the N of Cys 375 is at a distance of 2.9 Å to the Oδ2 of Asp 372.
- the oxyanion hole proposed to stabilize the negatively charged tetrahedral intermediate in serine carboxypeptidases is formed by the backbone amides of Gly 57 and Tyr 151 in PPCA.
- the 32 atoms of the catalytic triad residues plus the oxyanion hole amides from PPCA, CPY and CPW superimpose with an r.m.s. deviation of 0.4 Å, indicating the very high degree of structural similarity of the active site in the PPCA precursor with those in the fully active enzymes CPY and CPW (see Table 4).
- the carboxylate of Asp 372 and the imidazole of His 429 in PPCA are non-planar, making an angle of approximately 60° between the imidazole and the carboxylate.
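An angle between two planar groups, such as the imidazole and carboxylate described above, can be computed from the normals of the two planes. In this sketch each plane is defined by three atoms; the function names and coordinates are hypothetical, with the second plane deliberately tilted by 60° about the x axis.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def plane_normal(p1, p2, p3):
    """Normal vector of the plane through three non-collinear points."""
    return _cross(_sub(p2, p1), _sub(p3, p1))


def interplanar_angle(plane_a, plane_b):
    """Angle in degrees (0 to 90) between two planes, each given as three points."""
    n1 = plane_normal(*plane_a)
    n2 = plane_normal(*plane_b)
    dot = sum(u * v for u, v in zip(n1, n2))
    norm = math.sqrt(sum(u * u for u in n1)) * math.sqrt(sum(v * v for v in n2))
    return math.degrees(math.acos(min(1.0, abs(dot) / norm)))


# Invented planes: the xy plane and a plane tilted 60 degrees about the x axis.
tilt = math.radians(60.0)
plane_1 = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
plane_2 = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, math.cos(tilt), math.sin(tilt)))
angle = interplanar_angle(plane_1, plane_2)
```

Taking the absolute value of the dot product folds the result into the 0 to 90° range, so the answer does not depend on the order in which the atoms of each plane are listed.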
- a similar non-planarity has been observed in CPW and CPY, in contrast to the planar orientation found in subtilisin- and trypsin-type serine proteases (McPhalen et al., Biochemistry 27:6582-6598 (1988)).
- In pPPCA, a pair of glutamic acid residues (Glu 69 and Glu 149) is positioned near the catalytic triad, with their carboxylate groups interacting with each other.
- the carboxylate groups are located at approximately 8 Å from the Oγ of Ser 150, and lie at the bottom of the active site.
- An asparagine (Asn 55) is orientated such that it forms a hydrogen bond to each of the two carboxylate groups of the glutamic acid pair, at an Nδ2 (Asn) to Oε1/Oε2 (Glu) distance of 3.0 and 3.6 Å, respectively.
- the two carboxylates interact with each other via hydrogen bonds.
- PPCA has a substrate preference for hydrophobic residues in the P1 and/or P1' binding pockets (Jackman et al., Hypertension 21:925-928 (1993)).
- the P1' pocket was identified to consist of two tyrosine residues (Tyr 60 and Tyr 239) which form a long channel, capped by two acidic residues (Glu 272 and Glu 398) at the end (Liao et al. (1992), infra). This explains the highest preference of this enzyme for Arg and Lys as the leaving group (Breddam et al., Carlsberg Res. Commun. 52:297-311 (1987)).
- a similarly shaped pocket is formed by the residues Thr 60, Tyr 256, Leu 272 and Met 398 (Endrizzi et al. (1994), infra).
- the analogous residues are Tyr 247 and Asp 64, forming the sides of the pocket with at the far end Met 430 and Thr 304. This is reasonably consistent with an overall preference of PPCA for a hydrophobic leaving group.
- Arg 284 and Arg 292 are particularly well exposed.
- the main chain atoms of Arg 298 are less accessible, being sandwiched between the strand Mβ2 and a loop N-terminal to helix Cα6, while a salt bridge with Glu 264 renders the side chain atoms of Arg 298 partially solvent inaccessible.
- the active site cleft is blocked by numerous residues from the maturation subdomain in the precursor form of PPCA.
- the catalytic triad is rendered solvent inaccessible by residues Asn 275, Ile 276 and Phe 277. These residues are part of the polypeptide Asp 272-Phe 277 which we call the 'blocking' peptide.
- This peptide is held down predominantly by hydrophobic contacts of Leu 273, Ile 276, and Phe 277 to the core domain residues Gly 57, Cys 60, Leu 180, Leu 190, Val 191, Leu 232, Val 235, Ile 246, Leu 280, Leu 282, Met 299 and Ala 373 (Fig. 10).
- the P1' binding pocket seems to be beautifully filled by Pro 301 interacting with Thr 304, Tyr 247, Cys 60 and Cys 334.
- Arrival in the endosome/lysosome is expected to result in protonation of either the Asp or the Glu residue or both, resulting in unfavorable electrostatic interactions and destabilization of this charge cluster. This in turn is expected to promote partial unfolding of the maturation subdomain, allowing easier access to additional potential cleavage sites, and stimulating removal of the 'blocking' peptide which fills the active site in the precursor.
- the maturation mechanism for pPPCA appears to be novel among proteases for which the three-dimensional structure of the zymogen is known.
- the catalytic triad in the precursor form is in a catalytically competent conformation. Enzymatic activity is prevented by a 'blocking' peptide.
- the blocking peptide is however different from the excision peptide and does not get excised from the mature enzyme. This leads to the distinct difference with the other known maturation mechanisms in that, after disappearance of the excision peptide, up to 35 residues filling the active site cleft in the PPCA precursor must rearrange to render the catalytic triad solvent accessible (see Figure 12), but do not get cleaved off.
- the catalytic triad is housed in the core domain and the various cap domains attenuate the biological function by influencing entirely different properties such as: (i) enzyme kinetics, exemplified by the interfacial activation of lipases (Smith et al., Curr. Opinion in Structural Biology 2:490-496 (1992)); (ii) substrate channeling, as is proposed for acetylcholine esterase (Sussman et al. (1991), infra); (iii) substrate recognition, proposed for dehalogenase by (Franken et al. (1991), infra) and for CPY and CPW by (Endrizzi et al.
- PPCA protective protein/cathepsin A
- CPW wheat serine carboxypeptidase
- CPY yeast serine carboxypeptidase
- the precursor structure reveals an inactivation mechanism that has not been seen before in any of the other known zymogen structures of proteases (available for the serine-, metallo- and aspartic protease classes).
- the catalytic triad seems to have an arrangement poised for catalysis. However, the triad is rendered solvent and substrate inaccessible by a strand from the maturation subdomain binding in the active site cleft. Surprisingly, this strand, called the 'blocking' peptide, does not overlap with the 2 kDa 'excision' peptide. Hence, after removal of the excision peptide, up to 35 additional residues must rearrange in order to unblock the active site cleft.
- a strategically positioned pair of salt bridges comprising Arg 262, Arg 298, Glu 264, and Asp 300 at the base of the excision peptide, are expected to optionally become destabilized at low pH, unraveling this region of the structure, allowing easier access to cleavage sites and/or promoting the rearrangement event.
- a number of research groups are currently involved in designing enzyme and gene therapy procedures for several lysosomal storage diseases. Insight into the three-dimensional structure, protein functioning and stability of PPCA, the first enzyme of known structure associated with a lysosomal storage disease and the third human lysosomal structure to be determined, may prove useful in future designs of an adequate therapy procedure for galactosialidosis. Information from the three-dimensional structure of PPCA, might also aid in designing an engineered form of PPCA with increased stability and a longer half-life.
- Model C's chains (io 4 A 3 ) ⁇ statistics using data between 8.0 and 3. ⁇ A ⁇ mol. repl. mrl rigid body ref. (rmr) 331 125 - 54.2 55.3 0.243 0.244 calculate NCS matrix 52.6 52.9 0.287 0.318 bes monomer (bm) rigid body ref. 294 228 - 55.9 57.4 0.228 0216 update NCS matrix 53.5 55.0 0.320 0328 bmcl (mask 1) 373 258 10.8 49.9 51.3 0.403 0.424 bmc2 (mask 1) 405 277 10.8 48.6 48.4 0.443 0.478 bmc3 (mask 2) rigid body ref.
Abstract
The present invention provides crystallized protective protein/cathepsin A (PPCA), a precursor thereof (pPPCA) or at least one subdomain thereof; methods for x-ray diffraction analysis to provide x-ray diffraction patterns of sufficiently high resolution for three-dimensional structure determination of the protein, as well as methods for rational drug design (RDD), based on using amino acid sequence data and/or x-ray crystallography data provided on computer readable media, as analyzed on a computer system having suitable computer algorithms.
Description
Protective Protein/Cathepsin A and Precursor: Crystallization, X-Ray Diffraction, Three- Dimensional Structure Determination and Rational Drug Design
Background of the Invention
Statement as to Rights to Inventions Made Under Federally-Sponsored Research and Development
Part of the work performed during development of this invention utilized U.S. Government funds. The U.S. Government has certain rights in this invention.
Field of the Invention
The present invention is in the fields of molecular biology, protein purification, protein crystallization, x-ray diffraction analysis, three-dimensional structure determination and rational drug design (RDD). The present invention provides crystallized protective protein/cathepsin A (PPCA) and its precursor (pPPCA). The crystallized PPCA or pPPCA is analyzed by x-ray diffraction techniques. The resulting x-ray diffraction patterns are of sufficiently high resolution to be useful for determining the three-dimensional structure of the PPCA or pPPCA protein, and for RDD.
Related Background Art
Deficiency of the human protective protein/cathepsin A (PPCA, also known as human protective protein or HPP) has been identified as the primary genetic defect underlying galactosialidosis (d'Azzo et al., Proc. Natl. Acad. Sci. USA 79:4535-4539 (1982)), a lysosomal storage disease inherited as an autosomal recessive trait. Patients with this disorder are diagnosed as having drastically reduced β-galactosidase and neuraminidase activities in their cell lysosomes. Examples of lysosomal storage diseases are presented in Table 316-1 of Braunwald et al., eds., Harrison's Principles of Internal Medicine, 11th Ed., pp. 1661-1671, McGraw Hill Book Co., New York (1987), as well as Wenger et al., Biochem. Biophys. Res. Commun. 82:589-595 (1978), Tettamanti et al., eds., Sialidases and Sialidosis, Perspectives in Inherited Metabolic Diseases, Vol. 4, Edi. Ermes, Milano (1981), pp. 261-279 and 379-395, and van Diggelen et al., Lancet 2:804 (1987), which references are entirely incorporated herein by reference. Researchers have proposed that one of PPCA's functions is to stabilize β-galactosidase and neuraminidase in a multi-enzyme complex, which complex is deficient in galactosialidosis patients (d'Azzo et al. (1982), infra; Hoogeveen et al. (1983), infra). Evidence for this protective function comes from studies showing that PPCA is taken up from the culture medium by galactosialidosis fibroblasts and that PPCA restores both β-galactosidase and neuraminidase activities to these fibroblasts (d'Azzo et al. (1982), infra).
The cDNA for PPCA directs the synthesis of a 452 amino acid precursor PPCA (pPPCA) (Figure 13) with a molecular weight of 54 kDa (Galjart et al., Cell 54:755-764 (1988)). The amino acid sequences of PPCA (Figure 14) and pPPCA (Figure 13) contain two glycosylation sites (Asn 117 and Asn 305), both of which are glycosylated in cultured fibroblasts and cells over-expressing PPCA or pPPCA. pPPCA dimerizes soon after synthesis in the endoplasmic reticulum (ER) (Zhou et al., EMBO J. 10:4041-4048 (1991)). Lysosomal PPCA has cathepsin A/deamidase/esterase activities which are exerted in vitro on a specific subset of bioactive peptides. Non-limiting examples of those hydrolyzed by PPCA are substance P and substance P-free acid, oxytocin and oxytocin-free acid, neurokinin A, angiotensin I, and bradykinin (Jackman (1990), infra). Furthermore, the enzyme inactivates endothelin I activity in rat smooth muscle cells and normal human tissues. This activity was deficient in liver from a galactosialidosis patient (Itoh (1995), infra; Jackman et al., J. Biol. Chem. 267:2872-2875 (1992)). Endothelins (ET-1, ET-2 and ET-3) are potent vasoconstrictors and elevate blood pressure in mammals. They also influence cell proliferation and hormone production and have been implicated in cardiovascular disorders, ranging from hypertension to stroke to ischemic heart disease (Rubanyi and Polokoff, Pharmacol. Rev. 46:325-415 (1994)).
The three-dimensional structure of a PPCA or a pPPCA has not previously been published, which structure could delineate specific biological activities and ligands as therapeutics for PPCA-related pathologies. Accordingly, there is a need to provide three-dimensional structures of at least one PPCA, pPPCA or ligands for diagnosis or therapy of PPCA-related pathologies.
Summary of the Invention
The present invention provides methods of expressing, purifying and crystallizing a human protective protein/cathepsin A (PPCA) and its precursor, precursor protective protein/cathepsin A (pPPCA). The present invention also provides methods for obtaining crystallized PPCA or pPPCA that can be analyzed to obtain x-ray diffraction patterns of sufficiently high resolution to be useful for three-dimensional structure determination of the protein.
The x-ray diffraction patterns can be either analyzed directly to provide the three-dimensional structure (if of sufficiently high resolution), or atomic coordinates for the crystallized PPCA or pPPCA, as provided herein, can be used for structure determination. The x-ray diffraction patterns obtained by methods of the present invention, and provided on computer readable media, are used to provide electron density maps. The amino acid sequence is also useful for three-dimensional structure determination. The data is then used in combination with phase determination (e.g., using multiple isomorphous replacement (MIR) or molecular replacement techniques) to generate electron density maps of a PPCA or a pPPCA, using a suitable computer system.
The electron density maps, provided by analysis of either the x-ray diffraction patterns or working backwards from the atomic coordinates provided herein, are then fitted using suitable computer algorithms to generate secondary, tertiary and/or quaternary domains of a PPCA or a pPPCA, which domains are then used to provide an overall three-dimensional structure, as well as expected binding and active sites of the PPCA or pPPCA. pPPCA has some of the active and binding sites of PPCA, except for changes in structure due to the presence of the portion of the pPPCA which is deleted during maturation to PPCA (e.g., residues 285-298 of Figure 13).
Structure determination methods and computer systems are also provided by the present invention for rational drug design (RDD). These RDD methods use computer modeling programs to find potential ligands that are calculated to associate with, or bind to, sites or domains of a PPCA or a pPPCA. Potential ligands are then screened for modulating or binding activity. Such screening methods can be selected from assays for at least one PPCA-specific structural feature or biological activity, preferably as associated with a PPCA- or pPPCA-related pathology, e.g., protective activity (e.g., modulation of β-galactosidase activity and neuraminidase (NA) activity), and peptide or enzyme modulating activity (e.g., of endothelin I (serine carboxypeptidase), neuropeptides, cathepsin A, and the like), according to known assays. The resulting ligands provided by methods of the present invention are synthesized and are useful for treating, inhibiting or preventing at least one PPCA-related pathology in a mammal.
Other objects of the invention will be apparent to one of ordinary skill in the art from the following detailed description and examples relating to the present invention.
Brief Description of the Figures
Figure 1 is a schematic ribbon diagram of the PPCA monomer (monomer 1), where secondary structure assignments are according to DSSP (Kabsch and Sander, Biopolymers 22:2577-2637 (1983)). The 'core' domain is shown in yellow. The 'cap' domain consists of a 'helical' subdomain, in red, and a 'maturation' subdomain, in orange. The catalytic triad Ser 150, His 429 and Asp 372 (from right to left) is shown by small green spheres. (Figure generated using MOLSCRIPT (Kraulis, J. Appl. Cryst. 24:946-950 (1991))).
Figure 2 is a stereo diagram of the Cα trace of the PPCA monomer 1 with numbering of selected residues. The residues forming the α-helices and β-strands are as follows according to DSSP:
Core domain: Cβ1(21-27), Cβ2(32-39), Cβ3(50-54), Cα1(63-67), Cβ4(73-75), Cβ5(82-84), Cβ6(94-98), Cα2(118-135), Cβ7(144-149), Cα3(152-163), Cβ8(171-177), Cα4(307-313), Cα5(316-321), Cα6(336-341), Cα7(350-359), Cβ9(363-369), Cα8(377-386), Cβ10(391-401), Cβ11(407-416), Cβ12(419-424), Cα9(431-434), Cα10(436-447).
Cap domain: Hα1(183-196), Hα2(202-212), Hα3(226-240), Mβ1(261-264), Mβ2(267-270), Mα1(290-293), Mβ3(296-299).
Note that for monomer 2 the secondary structure assignments in the cap domain are slightly different than in monomer 1. Residues in Hβ1 are in a region of poor density and Mα1 is an extended coil. (Figure generated using MOLSCRIPT (Kraulis (1991), infra)).
Figure 3 shows the density for the disulfide bridges Cys 212-Cys 228 and Cys 213-Cys 218 as revealed in the SigmaA weighted 2mFo-DFc electron density map (Read, Acta Crystallogr. A42:140-149 (1986)) calculated from the model refined to 2.2 Å; the map has been contoured at 1σ. (Figure drawn with the O computer program (Jones, Acta Crystallogr. A47:110-119 (1991))).
Figure 4 is a stereo diagram of the superimposed Cα traces from the two crystallographically independent PPCA monomers forming the dimer. Monomer 1 is in blue, monomer 2 is in red. Residues referred to in the text are labeled. Residues 259 and 260 have not been incorporated in the model of monomer 2, since no electron density was observed for them. Note the tremendous difference in conformation of the excision peptide located in the upper right corner of the proteins. (Figure generated by MOLSCRIPT (Kraulis (1991), infra)).
Figure 5 is a schematic ribbon diagram of the PPCA dimer viewed approximately along the two-fold axis. For monomer 1, the core domain is yellow while the cap domain consists of a helical subdomain in red and a maturation subdomain in orange. For monomer 2, the core domain is green, while the cap domain consists of a blue helical subdomain and a light blue maturation subdomain. (Figure generated using MOLSCRIPT (Kraulis (1991), infra)).
Figure 6A-B is a representation of the molecular surface of the PPCA dimer. The surface was calculated with GRASP (Nicholls, A., et al., Proteins 11:281-296 (1991)) and colored according to the electrostatic potential. Dark blue corresponds to a positive potential > +15.0 kT/e and dark red to a negative potential < -15.0 kT/e. Figure 6A: standard view, along the diad with the dimer oriented as in Figure 4. Figure 6B: side view of the dimer, rotated ninety degrees with respect to 6A.
Figure 7A-F presents a topological comparison of 6 members of the hydrolase fold family. The arrangement of structural elements in the central core domain (in green and yellow) of the different proteins is generally similar. The cap domains (in red) vary greatly. The following structures are shown starting from the top left hand corner (references and PDB entry codes are given in between brackets): Figure 7A shows the PPCA precursor, cap domain that consists of two subdomains, one α-helical and the other mainly β-sheet; Figure 7B shows CPW (3SC2, Liao et al. (1992), infra), cap domain helical; Figure 7C shows CPY (1YSC, Endrizzi et al. (1994), infra), cap domain helical; Figure 7D shows dehalogenase (2HAD, Franken et al., EMBO J. 10:1297-1302 (1991)), cap domain helical but quite different from the serine carboxypeptidases; Figure 7E shows lipase from Pseudomonas glumae (1TAH, Noble et al., FEBS Lett. 331:123-128 (1993)), cap domain mixed α-helical and β-strands; and Figure 7F shows acetylcholine esterase (1ACE, Sussman et al., Science 253:872-879 (1991)), cap domain large and predominantly α-helical. The secondary structure assignments were generated with the computer program O using structures provided and/or available from the Brookhaven Protein Data Bank. (This Figure was generated using MOLSCRIPT (Kraulis (1991), infra)).
Figure 8A shows the superposition of the Cα traces from the PPCA and CPW monomers, showing that the major differences between the two enzymes are localized in the cap domain. PPCA has a large 'maturation' subdomain and the 'helical' subdomain is rotated with respect to the CPW counterpart. (Figure drawn with the O program (Jones (1991), infra)). Figure 8B shows the Cα traces from the PPCA and CPW dimers after the core domains from the subunits (shown on the right hand side of the two dimers) have been superimposed. Notice the remarkable difference in mutual orientation (of 15°) of the two subunits on the left hand side of the two dimers, which has been accentuated by an arrow. (Figure drawn with the O computer program (Jones (1991), supra)).
Figure 9 is a stereo view of the Cα trace of PPCA monomer 1 highlighting regions involved in the maturation event. The color scheme for the trace is as follows: core domain in light blue, helical subdomain in red, maturation subdomain in orange with the exception of the excision peptide (residues 285-298), which is shown in blue. Orange spheres mark the residues 272 and 277, marking the beginning and end of the blocking peptide. The catalytic triad Ser 150, His 429 and Asp 372 is shown as light blue spheres. Two cysteines, Cys 253 and Cys 303, referred to in the discussion are colored green. (This Figure generated using MOLSCRIPT (Kraulis (1991), infra)).
Figure 10 is a close-up representation of the 'blocking' peptide (residues 272-277) bound in the active site, rendering the catalytic triad solvent inaccessible. Residues from the maturation subdomain are shown in orange, residues from the helical domain in magenta and residues from the core domain in cyan. The excision peptide is shown in blue. Side chains are shown for residues making extensive contacts with the blocking peptide or if mentioned in the text. The catalytic triad is shown in white. (Figure drawn with O (Jones (1991), infra)).
Figure 11 is a representation of elements proposed to be involved in the activation mechanism of the precursor form of PPCA as discussed in the text. The Cα trace of the core domain is shown in cyan, the helical subdomain in red, the maturation subdomain in orange, and the excision peptide is shown in blue. Relevant side chains are depicted and labeled. Rearrangement of the residues 254-302 limited by the disulfide Cys 253 and Cys 303 would free up the active site cleft. A charge cluster Arg 262, Glu 264, Arg 298 and Asp 300 occupies a strategic position within the maturation subdomain, possibly involved in pH dependent regulation of conformational changes. The solvent accessible surface was calculated and visualized with the atomic coordinates by BIOGRAF (BIOGRAF Construct Users Guide Version 3.2.1, June 1993).
Figure 12 is a schematic representation of the proposed activation of PPCA. The active site cleft is formed by the core domain (indicated as 'core' in the above scheme) and the helical subdomain (indicated as 'o'). The maturation subdomain (indicated as 'm') contains the residues that block the active site cleft rendering the precursor enzymatically inactive, shown in structure 1. In the acidic endosome/lysosome, the precursor undergoes activation. In activation pathway 2a, conformational rearrangements induced by low pH might render the excision peptide more accessible to proteases as a first step, followed by cleavage of the polypeptide chain removing the excision peptide. Alternatively, in pathway 2b, proteolytic cleavage of the excision peptide might form the trigger for the total rearrangement, removing the blocking peptide from the active site and thus generating the fully active enzyme as shown in structure 3.
Figure 13 shows the amino acid sequence of a human pPPCA. The underlined portion (residues 285-298) shows an excision peptide for conversion to the mature form, PPCA.
Figure 14 shows the amino acid sequence of a human PPCA.
Figure 15 shows a sequence alignment between pPPCA, CPW and CPY (top three sequences shown). Identical residues among all three sequences are boxed. Residue numbering is included for the pPPCA amino acid sequence. The alignment was made using the GCG program PILEUP (GCG version 8), then manually adjusted using 3D-structural knowledge from the superposition of the CPW (Liao et al., 1992) and CPY (Endrizzi et al., 1994) atomic coordinates. The alignment was later used to design a multi-Ala search probe for molecular replacement calculations, shown as the fourth sequence, labeled 'model'. The structure determination of pPPCA subsequently revealed that the protein can be divided in two domains: a 'core' domain (residues 1-182 and 303-452) and a 'cap' domain (residues 183-302). The secondary structure elements for the PPCA precursor are depicted with shaded bars (for details on the assignment and nomenclature, see Rudenko et al., Structure 3:1249-1259 (1995)).
Figure 16 shows a schematic representation of a 'bootstrapping' cycle as described in Example 2.
Figure 17 is a representation of an initial molecular mask enlarged to accommodate missing areas in the model. The program MAMA (Kleywegt & Jones, 1994) was used to calculate the mask and mask editing options in O (Jones et al., 1991) were used to extend the mask.
Figure 18 is a representation of an enlargement of the model during the bootstrapping procedure plotted as a function of the expansion step. The number of Cα atoms incorporated in the model per monomer is given, as well as the number of correct side chains. Note that after the first round of building in the molecular replacement map (expansion step 'r'), 37 residues from the molecular replacement search probes had to be deleted from the model, reducing the number of Cα atoms to 294. Subsequent cycles allowed for the model to be expanded by small increments.
Figure 19 is a representation of a comparison of the Cα trace from a monomer core model (shown in magenta) and the complete PPCA monomer (shown in yellow). The core model contained only 294 Cα atoms. The 452 residue PPCA monomer consists of a core domain and a cap domain. The helical subdomain and the maturation subdomain forming the cap domain have been shown in the figure above.
Figure 20A-D is a representation of the resolving power of the bootstrapping procedure showing three different stages in map quality. The atomic coordinates of the refined model are visualized with the electron density in Figures 20B, 20C and 20D. Figures 20A and 20B show the initial 2m|Fobs|-D|Fcalc| SigmaA weighted map calculated using phases from the molecular replacement solution. The electron density is essentially uninterpretable. Fig. 20C shows the twofold averaged 2|Fo|-|Finv| electron density map calculated using inverted phases from cycle bmc6. The density for β-strand Mβ2 (residues 266-271) has become clearly visible. Fig. 20D shows the unaveraged 2m|Fobs|-D|Fcalc| SigmaA weighted map calculated using phases from the refined model. The quality of the density is very good. Density for the helix Mα1 (residues 287-293), which assumes a different conformation in the two monomers, is now also apparent.
Figure 21 shows a Ramachandran plot calculated for one monomer from a refined model of a pPPCA. Both monomers in the asymmetric unit give essentially equivalent plots.
Figure 22 shows a schematic of a computer system for PPCA or pPPCA structure determination and/or rational drug design.
Figure 23.1-52 lists the atomic coordinates for the active site of a pPPCA dimer having the amino acid sequence presented as portions of at least one of 50-76, 144-155, 173-197, 226-253, 226-288, 294-310, 327-344, 338-350, 366-381 and 423-436 of (Figure 23.1-23.26) 452 amino acids (designated 1-452) of monomer 1, as well as corresponding portions of (Figure 23.26-23.52) 452 amino acids (designated 1001-1452) of monomer 2.
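The residue ranges quoted in the figure descriptions above (core and cap domains, excision peptide, blocking peptide, catalytic triad, and the 1001-1452 numbering of monomer 2) can be collected into a small lookup. The following Python sketch is illustrative only: the function name `annotate_residue` and the constant names are hypothetical and are not part of the disclosed computer system; only the numeric ranges are taken from the text.

```python
# Residue ranges as quoted in the figure descriptions (Figures 9, 10, 13, 15, 23).
# All identifiers below are hypothetical names chosen for this sketch.
CORE_RANGES = ((1, 182), (303, 452))      # 'core' domain
CAP_RANGE = (183, 302)                    # 'cap' domain
EXCISION_PEPTIDE = (285, 298)             # removed on maturation
BLOCKING_PEPTIDE = (272, 277)             # fills the active site cleft
CATALYTIC_TRIAD = {150: "Ser", 372: "Asp", 429: "His"}
SEQUENCE_LENGTH = 452
MONOMER_2_OFFSET = 1000                   # monomer 2 is designated 1001-1452


def annotate_residue(number):
    """Map a coordinate-file residue number to monomer, domain and peptide flags."""
    if 1 <= number <= SEQUENCE_LENGTH:
        monomer, local = 1, number
    elif MONOMER_2_OFFSET + 1 <= number <= MONOMER_2_OFFSET + SEQUENCE_LENGTH:
        monomer, local = 2, number - MONOMER_2_OFFSET
    else:
        raise ValueError(f"residue number {number} is outside both monomers")

    in_core = any(lo <= local <= hi for lo, hi in CORE_RANGES)
    return {
        "monomer": monomer,
        "residue": local,
        # Core and cap ranges together cover 1-452, so non-core means cap.
        "domain": "core" if in_core else "cap",
        "excision_peptide": EXCISION_PEPTIDE[0] <= local <= EXCISION_PEPTIDE[1],
        "blocking_peptide": BLOCKING_PEPTIDE[0] <= local <= BLOCKING_PEPTIDE[1],
        "catalytic": CATALYTIC_TRIAD.get(local),  # residue type or None
    }


if __name__ == "__main__":
    for n in (150, 275, 1290):
        print(n, annotate_residue(n))
```

With these ranges, all three catalytic residues fall in the core domain, while the blocking and excision peptides both lie in the cap domain.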
Detailed Description of the Preferred Embodiments
The present invention provides methods for expressing, purifying and crystallizing a protective protein/cathepsin A (PPCA) or a precursor protective protein/cathepsin A (pPPCA), where the crystals diffract x-rays with sufficiently high resolution to allow determination of the three-dimensional structure of the PPCA or pPPCA, or a portion or subdomain thereof. The three-dimensional structure (e.g., as provided on computer readable media of the present invention) is useful for rational drug design of ligands of a PPCA or a pPPCA. Such ligands can be synthesized or recombinantly produced and are useful as diagnostic agents or drugs for diagnosing, treating, inhibiting or preventing at least one PPCA- or pPPCA-related pathology. The determined structure is made using the PPCA or pPPCA amino acid sequences and/or atomic coordinate/x-ray diffraction data, which are analyzed to provide atomic model output data corresponding to the three-dimensional structure, e.g., as provided on computer readable media. The computer analysis of the atomic coordinate/x-ray diffraction data and/or the amino acid sequence allows the calculation of the secondary, tertiary and/or quaternary structures, domains, and/or subdomains of the protein. These domains are combined and refined by additional calculations using suitable computer subroutines to determine the most probable or actual three-dimensional structure of the PPCA or pPPCA, including potential or actual active sites, binding sites or other structural or functional domains or subdomains of the protein.
Structure determination methods are also provided by the present invention for rational drug design (RDD) of PPCA or pPPCA ligands. Such drug design uses computer modeling programs that calculate different molecules expected to interact with the determined active sites, binding sites, or other structural or functional domains or subdomains of a PPCA or a pPPCA. These ligands can then be produced and screened for activity in modulating or binding to a PPCA or pPPCA, according to methods and compositions of the present invention.
The actual PPCA or pPPCA-ligand complexes can optionally be crystallized and analyzed using x-ray diffraction techniques. The diffraction patterns obtained are similarly used to calculate the three-dimensional interaction of the ligand and the PPCA or pPPCA, to confirm that the ligand binds to, or changes the conformation of, particular domain(s) or subdomain(s) of the PPCA or pPPCA. Such screening methods are selected from assays for at least one biological activity of a PPCA or a pPPCA. The resulting ligands, provided by methods of the present invention, modulate or bind at least one PPCA or pPPCA and are useful for diagnosing, treating or preventing PPCA- or pPPCA-related pathologies in animals, such as humans. Ligands of a particular PPCA or pPPCA can similarly modulate other PPCAs or pPPCAs from other sources, such as other eukaryotes.
A PPCA or pPPCA is also provided as a crystallized protein suitable for x-ray diffraction analysis. The x-ray diffraction patterns obtained by the x-ray analysis are of moderate, to moderately high, to high resolution, e.g., 30-10, 10-3.5 or 1.5-3.5 Å, respectively, with the higher resolutions included. These diffraction patterns are suitable and useful for three-dimensional structure determination of a PPCA or a pPPCA, domain or subdomain thereof. The determination of the three-dimensional structure of a PPCA or pPPCA has a broad-based utility.
Significant sequence identity and conservation of important structural elements are expected to exist among different PPCAs or pPPCAs. Therefore, the three-dimensional structure from one or few PPCAs or pPPCAs can be used to identify ligands that have diagnostic or therapeutic value for at least one PPCA- or pPPCA-related pathology that may involve PPCAs or pPPCAs having different amino acid sequences.
Determination of Protein Structures
Different techniques give different and complementary information about protein structure. The primary structure is obtained by biochemical methods, either by direct determination of the amino acid sequence from the protein, or from the nucleotide sequence of the corresponding gene or cDNA. The quaternary structure of large proteins or aggregates can also be determined by electron microscopy. To obtain the secondary and tertiary structure, which requires detailed information about the arrangement of atoms within a protein, x-ray crystallography is preferred. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra.
The first prerequisite for solving the three-dimensional structure of a protein by x-ray crystallography is a well-ordered crystal that will diffract x-rays strongly. The crystallographic method directs a beam of x-rays onto a regular, repeating array of many identical molecules so that the x-rays are diffracted from it in a pattern from which the structure of an individual molecule can be retrieved. Globular protein molecules are large, spherical, or ellipsoidal objects with irregular surfaces, and crystals thereof contain large holes or channels that are formed between the individual molecules. These channels, which usually occupy more than half the volume of the crystal, are filled with disordered solvent molecules. The protein molecules are in contact with each other at only a few small regions. This is one reason why structures of proteins determined by x-ray crystallography are generally the same as those for the proteins in solution.
The formation of crystals is dependent on a number of different parameters, including pH, temperature, protein concentration, the nature of the solvent and precipitant, as well as the presence of added ions or ligands to the protein. Many routine crystallization experiments may be needed to screen all these parameters for the few combinations that might give crystals suitable for x-ray diffraction analysis. Crystallization robots can automate and speed up the work of reproducibly setting up large numbers of crystallization experiments.
A pure and homogeneous protein sample is important for successful crystallization. Proteins obtained from cloned genes in efficient expression vectors can be purified quickly to homogeneity in large quantities in a few purification steps. A protein to be crystallized is preferably at least 93-99% pure according to standard criteria of homogeneity. Crystals form when molecules are precipitated very slowly from supersaturated solutions. The most frequently used procedure for making protein crystals is the hanging-drop method, in which a drop of protein solution is brought very gradually to supersaturation by loss of water from the droplet to the larger reservoir that contains salt or polyethylene glycol solution.
Different crystal forms can be more or less well-ordered and hence give diffraction patterns of different quality. As a general rule, the more closely the protein molecules pack, and consequently the less water the crystals contain, the better is the diffraction pattern because the molecules are better ordered in the crystal.
X-rays are electromagnetic radiation at short wavelengths, emitted when electrons jump from a higher to a lower energy state. In conventional sources in the laboratory, x-rays are produced by high-voltage tubes in which a metal plate, the anode, is bombarded with accelerated electrons and thereby caused to emit x-rays of a specific wavelength, so-called monochromatic x-rays. The high voltage rapidly heats up the metal plate, which therefore has to be cooled. Efficient cooling is achieved by so-called rotating anode x-ray generators, where the metal plate revolves during the experiment so that different parts are heated up.
More powerful x-ray beams can be produced in synchrotron storage rings where electrons (or positrons) travel close to the speed of light. These particles emit very strong radiation at all wavelengths from short gamma rays to visible light. When used as an x-ray source, only radiation within a window of suitable wavelengths is channeled from the storage ring. Polychromatic x-ray beams are produced by having a broad window that allows through x-ray radiation with wavelengths of 0.2-3.5 Å.
In diffraction experiments a narrow and parallel beam of x-rays is taken out from the x-ray source and directed onto the crystal to produce diffracted beams. The incident primary beam causes damage to both protein and solvent molecules. The crystal is, therefore, usually cooled to prolong its lifetime (e.g., -220 to -50°C). The primary beam must strike the crystal from many different directions to produce all possible diffraction spots, and so the crystal is rotated in the beam during the experiment.
The diffracted spots are recorded either on a film, the classical method, or by an electronic detector. The exposed film has to be measured and digitized by a scanning device, whereas electronic detectors feed the signals they detect directly in a digitized form into a computer. Electronic area detectors (an electronic film) significantly reduce the time required to collect and measure diffraction data.
When the primary beam from an x-ray source strikes the crystal, some of the x-rays interact with the electrons on each atom and cause them to oscillate. The oscillating electrons serve as a new source of x-rays, which are emitted in almost all directions, referred to as scattering. When atoms (and hence their electrons) are arranged in a regular three-dimensional array, as in a crystal, the x-rays emitted from the oscillating electrons interfere with one another. In most cases, these x-rays, colliding from different directions, cancel each other out; those from certain directions, however, will add together to produce diffracted beams of radiation that can be recorded as a pattern on a photographic plate or detector.
The diffraction pattern obtained in an x-ray experiment is related to the crystal that caused the diffraction. X-rays that are reflected from adjacent planes travel different distances, and diffraction only occurs when the difference in distance is equal to the wavelength of the x-ray beam. This distance is dependent on the reflection angle, which is equal to the angle between the primary beam and the planes.
The relationship between the reflection angle (θ), the distance between the planes (d), and the wavelength (λ) is given by Bragg's law: 2d sin θ = λ. This relation can be used to determine the size of the unit cell in the crystal. Briefly, the position on the film of the diffraction data relates each spot to a specific set of planes through the crystal. By using Bragg's law, these positions can be used to determine the size of the unit cell.
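As a worked illustration of Bragg's law as stated above, the following Python sketch converts between reflection angle and interplanar spacing. The function names and the optional `order` argument (the general form nλ = 2d sin θ, with n = 1 matching the relation in the text) are assumptions made for this example, and the wavelength of 1.5418 Å is simply a typical laboratory copper Kα value used as sample input, not a value taken from this disclosure.

```python
import math


def d_spacing(theta_deg, wavelength, order=1):
    """Interplanar spacing d from n*lambda = 2*d*sin(theta); d has the units of wavelength."""
    sin_theta = math.sin(math.radians(theta_deg))
    if sin_theta <= 0.0:
        raise ValueError("reflection angle must lie between 0 and 180 degrees")
    return order * wavelength / (2.0 * sin_theta)


def bragg_angle(d, wavelength, order=1):
    """Reflection angle theta in degrees for planes of spacing d."""
    ratio = order * wavelength / (2.0 * d)
    if ratio > 1.0:
        # sin(theta) cannot exceed 1: these planes cannot diffract at this wavelength.
        raise ValueError("spacing too small to diffract at this wavelength")
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    wavelength = 1.5418  # Angstrom; illustrative copper K-alpha value (assumption)
    for d in (10.0, 3.5, 2.2):
        theta = bragg_angle(d, wavelength)
        print(f"d = {d:4.1f} A  ->  theta = {theta:5.2f} deg")
```

Smaller spacings diffract at larger angles, which is why higher-resolution data appear further from the primary beam on the film or detector.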
Each atom in a crystal scatters x-rays in all directions, and only those that positively interfere with one another, according to Bragg's law, give rise to diffracted beams that can be recorded as a distinct diffraction spot above background. Each diffraction spot is the result of interference of all x-rays with the same diffraction angle emerging from all atoms. For example, for the protein crystal of myoglobin, each of the about 20,000 diffracted beams that have been measured contains scattered x-rays from each of the around 1500 atoms in the molecule. To extract information about individual atoms from such a system requires considerable computation. The mathematical tool that is used to handle such problems is called the Fourier transform.
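The statement that every diffracted beam contains a contribution from every atom can be illustrated with a one-dimensional toy calculation. The sketch below sums one scattered wave per atom, F(h) = Σ f_j exp(2πi h x_j), which is the standard structure-factor sum; the three "atoms", their scattering strengths and their fractional coordinates are hypothetical values chosen only for illustration.

```python
import cmath
import math


def structure_factor(atoms, h):
    """Sum the scattered waves of all atoms for reflection index h (1-D toy model).

    atoms: list of (scattering_strength, fractional_coordinate) pairs.
    """
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)


# Hypothetical 1-D "crystal" (assumed values, not from the specification).
atoms = [(6.0, 0.10), (7.0, 0.35), (8.0, 0.80)]

for h in range(1, 4):
    F = structure_factor(atoms, h)
    amplitude = abs(F)                        # measurable from the spot intensity
    phase_deg = math.degrees(cmath.phase(F))  # not measurable directly
    print(h, round(amplitude, 3), round(phase_deg, 1))
```

Each value of F depends on all three atoms at once, which is why recovering individual atomic positions requires the inverse (Fourier) calculation over many reflections.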
Each diffracted beam, which is recorded as a spot on the film, is defined by three properties: the amplitude, which we can measure from the intensity of the spot; the wavelength, which is set by the x-ray source; and the phase, which is lost in x-ray experiments. All three properties are needed for all of the diffracted beams in order to determine the position of the atoms giving rise to the diffracted beams.
For larger molecules, protein crystallographers have determined the phases in many cases using a method called multiple isomorphous replacement (MIR) (including heavy metal scattering), which requires the introduction of new x-ray scatterers into the unit cell of the crystal. These additions are usually heavy atoms (so that they make a significant contribution to the diffraction pattern); there should not be too many of them (so that their positions can be located); and they should not change the structure of the molecule or of the crystal cell, i.e., the crystals should be isomorphous. Isomorphous replacement is usually done by diffusing different heavy-metal complexes into the channels of the preformed protein crystals. The protein molecules expose side chains (such as SH groups) into these solvent channels that are able to bind heavy metals. It is also possible to replace endogenous light metals in metalloproteins with heavier ones, e.g., zinc by mercury, or calcium by samarium.
Since such heavy metals contain many more electrons than the light atoms (H, N, C, O, and S) of the protein, they scatter x-rays more strongly. All diffracted beams would therefore increase in intensity after heavy-metal substitution if all interference were positive. In fact, however, some interference is negative; consequently, following heavy-metal substitution, some spots measurably increase in intensity, others decrease, and many show no detectable difference.
Phase differences between diffracted spots can be determined from intensity changes following heavy-metal substitution. First, the intensity differences are used to deduce the positions of the heavy atoms in the crystal unit cell. Fourier summations of these intensity differences give maps of the vectors between the heavy atoms, the so-called Patterson maps. From these vector maps the atomic arrangement of the heavy atoms is deduced. From the positions of the heavy metals in the unit cell, one can calculate the amplitudes and phases of their contribution to the diffracted beams of protein crystals containing heavy metals.
This knowledge is then used to find the phase of the contribution from the protein in the absence of the heavy-metal atoms. As both the phase and amplitude of the heavy metals and the amplitude of the protein alone are known, as well as the amplitude of the protein plus heavy metals (i.e., protein heavy-metal complex), one phase and three amplitudes are known. From this, the interference of the x-rays scattered by the heavy metals and protein can be calculated to see if it is constructive or destructive. The extent of positive or negative interference, with knowledge of the phase of the heavy metal, gives an estimate of the phase of the protein. Because two different phase angles are determined and are equally good solutions, a second heavy-metal complex can be used which also gives two possible phase angles. Only one of these will have the same value as one of the two previous phase angles; it therefore represents the correct phase angle. In practice, more than two different heavy-metal complexes are usually made in order to give a reasonably good phase determination for all reflections. Each individual phase estimate contains experimental errors arising from errors in the measured amplitudes. Furthermore, for many reflections, the intensity differences are too small to measure after one particular isomorphous replacement, and others can be tried.
The amplitudes and the phases of the diffraction data from the protein crystals are used to calculate an electron-density map of the repeating unit of the crystal. This map then has to be interpreted as a polypeptide chain with a particular amino acid sequence. The interpretation of the electron-density map is made more complex by several limitations of the data. First of all, the map itself contains errors, mainly due to errors in the phase angles. In addition, the quality of the map depends on the resolution of the diffraction data, which in turn depends on how well-ordered the crystals are. This directly influences the image that can be produced. The resolution is measured in Å units; the smaller this number is, the higher the resolution and therefore the greater the amount of detail that can be seen.
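Returning to the isomorphous replacement step described above, the two-fold phase ambiguity and its resolution by a second heavy-metal derivative can be sketched numerically. The sketch assumes the usual vector relation F_PH = F_P + F_H, from which the law of cosines gives two candidate protein phases per derivative. All function names and the synthetic amplitudes and phases are illustrative assumptions; experimental error is ignored.

```python
import cmath
import math


def candidate_phases(fp_amp, fph_amp, fh):
    """Return the two protein phases (radians) consistent with one derivative.

    fp_amp:  amplitude of the native protein reflection
    fph_amp: amplitude of the protein heavy-metal complex reflection
    fh:      complex heavy-atom contribution (amplitude and phase known)
    """
    fh_amp, fh_phase = abs(fh), cmath.phase(fh)
    cos_delta = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    cos_delta = max(-1.0, min(1.0, cos_delta))  # guard against rounding error
    delta = math.acos(cos_delta)
    return (fh_phase + delta, fh_phase - delta)


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))


def resolve_phase(cands1, cands2):
    """Pick the pair of candidates (one per derivative) that agree best."""
    _, a, b = min((angular_distance(a, b), a, b) for a in cands1 for b in cands2)
    return cmath.phase(cmath.exp(1j * a) + cmath.exp(1j * b))  # circular mean


# Synthetic, error-free example (assumed values for illustration only).
true_fp = cmath.rect(100.0, math.radians(40.0))
fh1 = cmath.rect(20.0, math.radians(110.0))
fh2 = cmath.rect(15.0, math.radians(-30.0))

c1 = candidate_phases(abs(true_fp), abs(true_fp + fh1), fh1)
c2 = candidate_phases(abs(true_fp), abs(true_fp + fh2), fh2)
phase = resolve_phase(c1, c2)
print(round(math.degrees(phase), 1))
```

With real data the two derivatives never agree exactly, which is why more than two derivatives are usually collected and the phase is treated as an estimate with an associated error.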
Building the initial model is a trial-and-error process. First, one has to decide how the polypeptide chain weaves its way through the electron-density map. The resulting chain trace constitutes a hypothesis, by which one tries to match the density of the side chains to the known sequence of the polypeptide. When a reasonable chain trace has finally been obtained, an initial model is built to give the best fit of the atoms to the electron density. Computer graphics are used both for chain tracing and for model building to present the data and manipulate the models.
The initial model will contain some errors. Provided the protein crystals diffract to high enough resolution (e.g., better than 3.5 Å), most or substantially all of the errors can be removed by crystallographic refinement of the model using computer algorithms. In this process, the model is changed to minimize the difference between the experimentally observed diffraction amplitudes and those calculated for a hypothetical crystal containing the model (instead of the real molecule). This difference is expressed as an R factor (residual disagreement), which is 0.0 for exact agreement and about 0.59 for total disagreement.
In general, the R factor is preferably between 0.15 and 0.35 (such as less than about 0.24-0.28) for a well-determined protein structure. The residual difference is a consequence of errors and imperfections in the data. These derive from various sources, including slight variations in the conformation of the protein molecules, as well as inaccurate corrections both for the presence of solvent and for differences in the orientation of the microcrystals from which the crystal is built. This means that the final model represents an average of molecules that are slightly different both in conformation and orientation.
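A minimal sketch of the R factor calculation follows, using the conventional definition R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|. The amplitudes below are hypothetical, and the sketch assumes the calculated amplitudes have already been placed on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R factor for two equally long amplitude lists."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Hypothetical amplitudes (assumed values, already on a common scale).
observed = [100.0, 80.0, 60.0, 40.0]
calculated = [90.0, 85.0, 66.0, 35.0]

print(round(r_factor(observed, calculated), 3))
```

An R factor of 0.0 is returned only when every calculated amplitude matches the observed one, consistent with the description of exact agreement above.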
In refined structures at high resolution, there are usually no major errors in the orientation of individual residues, and the estimated errors in atomic positions are usually around 0.1-0.2 Å, provided the amino acid sequence is known. Hydrogen bonds, both within the protein and to bound ligands, can be identified with a high degree of confidence.
Most x-ray structures are determined to a resolution between 1.7 Å and 3.5 Å. Electron-density maps with this resolution range are preferably interpreted by fitting the known amino acid sequences into regions of electron density in which individual atoms are not resolved.
An amino acid sequence is preferred for accurate x-ray structure determination. Thus, recombinant DNA techniques have had a double impact on x-ray structural work. When a protein is cloned and overexpressed for structural studies, the amino acid sequence, necessary for the x-ray work, is also quickly obtained via the nucleotide sequence. Recombinant DNA techniques give us not only abundant supplies of rare proteins, but also their amino acid sequence as a bonus. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra.
Isolated PPCA and pPPCA Polypeptides
A PPCA or pPPCA polypeptide can refer to any subset of a PPCA or pPPCA as a domain, subdomain, fragment, consensus sequence or repeating unit thereof. A PPCA or pPPCA polypeptide of the present invention can be prepared by, e.g.:
(a) recombinant DNA methods;
(b) proteolytic digestion of the intact molecule or a domain, subdomain or fragment thereof;
(c) chemical peptide synthesis methods well-known in the art; and/or
(d) any other method capable of producing a PPCA or pPPCA polypeptide and having a conformation similar to a structural or functional subdomain of a PPCA or a pPPCA.
A biological activity of PPCA or pPPCA can be screened according to known screening assays. The minimum peptide sequence to have activity is based on the smallest unit containing or comprising a particular domain, subdomain, fragment, region, consensus sequence, or repeating unit thereof, having at least one biological activity of a PPCA or pPPCA, such as protecting activity, inhibiting activity or enzyme activity. Non-limiting examples of such activities are protecting activity for β-galactosidase or neuraminidase (NA); modulating activity (inhibition, stimulation or activation) for endothelin I (serine carboxypeptidase) or cathepsin A; and peptide hydrolyzing activity (e.g., substance P and substance P-free acid, oxytocin and oxytocin-free acid, neurokinin A, angiotensin I, and bradykinin).
According to the present invention, a PPCA or pPPCA includes an association of two or more polypeptide subdomains, such as at least one 4 amino acid portion of a core or cap domain of a PPCA or pPPCA. This can include 1-14 subdomains of the cap domain and/or 1-44 subdomains of the core domain (as monomers or dimers), or any range, value or combination thereof. Preferably, 1-4 sets of each of at least one core or cap domains or subdomains are included.
The structure of a monomer or domain of at least one PPCA or pPPCA of the present invention can include one or more of the following subdomains, as described herein. Generally, a PPCA or pPPCA consists of a dimer of a core domain and a cap domain having the following subdomains with the specified residues, e.g., as presented in Figure 13 (pPPCA) or Figure 14 (PPCA).
Core domain subdomains: Cβ1, 21-27; Cβ2, 32-39; Cβ3, 50-54; Cα1, 63-67; Cβ4, 73-75; Cβ5, 82-84; Cβ6, 94-98; Cα2, 118-135; Cβ7, 144-149; Cα3, 152-163; Cβ8, 171-177; Cα4, 307-313; Cα5, 316-321; Cα6, 336-341; Cα7, 350-359; Cβ9, 363-369; Cα8, 377-386; Cβ10, 391-401; Cβ11, 407-416; Cβ12, 419-424; Cα9, 431-434; Cα10, 436-447; and
Cap domain subdomains: Hα1, 183-196; Hα2, 202-212; Hα3, 226-240; Mβ1, 261-264; Mβ2, 267-270; Mα1, 290-293; Mβ3, 296-299.
Note that for monomer 2 the secondary structure assignments in the cap domain are slightly different than in monomer 1.
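For computational work, the subdomain list above can be held in a simple lookup table. The sketch below is only an illustration: the dictionary layout and function name are assumptions, the labels follow the α/β naming as read from the list above, and the residue ranges are those given for monomer 1.

```python
# Residue ranges transcribed from the subdomain list above (monomer 1).
SUBDOMAINS = {
    "core": [
        ("Cβ1", 21, 27), ("Cβ2", 32, 39), ("Cβ3", 50, 54), ("Cα1", 63, 67),
        ("Cβ4", 73, 75), ("Cβ5", 82, 84), ("Cβ6", 94, 98), ("Cα2", 118, 135),
        ("Cβ7", 144, 149), ("Cα3", 152, 163), ("Cβ8", 171, 177),
        ("Cα4", 307, 313), ("Cα5", 316, 321), ("Cα6", 336, 341),
        ("Cα7", 350, 359), ("Cβ9", 363, 369), ("Cα8", 377, 386),
        ("Cβ10", 391, 401), ("Cβ11", 407, 416), ("Cβ12", 419, 424),
        ("Cα9", 431, 434), ("Cα10", 436, 447),
    ],
    "cap": [
        ("Hα1", 183, 196), ("Hα2", 202, 212), ("Hα3", 226, 240),
        ("Mβ1", 261, 264), ("Mβ2", 267, 270), ("Mα1", 290, 293),
        ("Mβ3", 296, 299),
    ],
}


def find_subdomain(residue_number):
    """Return (domain, subdomain) for a residue, or None if it lies between elements."""
    for domain, elements in SUBDOMAINS.items():
        for name, start, end in elements:
            if start <= residue_number <= end:
                return domain, name
    return None


print(find_subdomain(25))
print(find_subdomain(190))
print(find_subdomain(150))
```

Residues that fall between the listed secondary structure elements return None, since the list covers only the named strands and helices and not the connecting segments.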
A PPCA or pPPCA polypeptide of the invention can have at least 80% homology, such as 80-100% overall homology or identity, with one or more corresponding PPCA or pPPCA subdomains or fragments as described herein, such as a 4-542 amino acid fragment or portion of the amino acid sequence of Figures 13, 14 or 15. As would be understood by one of ordinary skill in the art, the above configurations of subdomains are provided as part of a PPCA or pPPCA polypeptide of the invention, when expressed in a suitable host cell, or otherwise synthesized, to provide at least one structural or functional feature of a native PPCA or pPPCA, such as at least one PPCA-related biological activity. Such activities can be assayed using a suitable assay, to establish at least one PPCA biological activity of one or more PPCAs or pPPCAs of the invention. A PPCA or pPPCA polypeptide of the invention is not naturally occurring or is naturally occurring but is in a purified or isolated form which does not occur in nature. Examples of suitable PPCA activity assays include, e.g., cathepsin A activity (Galjart et al., J. Biol. Chem. 266:14754-14762 (1991)); endothelin I deamidase activity (Jackman et al., J. Biol. Chem. 267:2872-2875 (1992)); and tachykinin deamidase activity (Jackman et al., J. Biol. Chem. 265:11265-11272 (1990)). Percent homology or identity can be determined, for example, by comparing sequence information using the GAP computer program, version 6.0, available from the University of Wisconsin Genetics Computer Group (UWGCG).
The GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) which are similar, divided by the total number of symbols in the shorter of the two sequences. The preferred default parameters for the GAP program include: (1) a unitary comparison matrix (containing a value of 1 for identities and 0 for non-identities) and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745 (1986), as described by Schwartz and Dayhoff, eds., ATLAS OF PROTEIN SEQUENCE AND STRUCTURE, National Biomedical Research Foundation, pp. 353-358 (1979); (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps. Thus, one of ordinary skill in the art, given the teachings and guidance presented in the present specification, will know how to add, delete or substitute other amino acid residues in other positions of a PPCA or pPPCA to obtain substituted, deletional or additional variants thereof.
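The similarity measure and gap penalty described above can be sketched for two sequences that have already been aligned. This is not the GAP program itself; it is a minimal illustration that assumes a unitary comparison matrix (1 for identities, 0 otherwise), uses "-" as a hypothetical gap symbol, and takes two made-up aligned peptide strings as input.

```python
def percent_identity(aligned_a, aligned_b):
    """Identical aligned symbols divided by the length of the shorter sequence, in percent."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return 100.0 * matches / shorter


def gap_penalty(gap_length):
    """Penalty of 3.0 per gap plus 0.10 per symbol in the gap (internal gaps only)."""
    return 3.0 + 0.10 * gap_length if gap_length > 0 else 0.0


# Hypothetical aligned sequences (assumed for illustration only).
seq_a = "ACDEFGHIK"
seq_b = "ACD-FGHLK"

print(percent_identity(seq_a, seq_b))
print(gap_penalty(1))
```

In this example seven symbols are identical and the shorter sequence has eight residues, so a polypeptide with this score would satisfy an 80% identity threshold.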
Non-limiting examples of substitutions of a PPCA or pPPCA domain or polypeptide of the invention are those in which at least one amino acid residue in the protein molecule has been removed and a different residue added in its place according to the following Table 2. The types of substitutions which can be made in the protein or peptide molecule of the invention can be based on analysis of the frequencies of amino acid changes between a homologous protein of different species, such as those presented in Figure 15. Based on such an analysis, alternative substitutions are defined herein as exchanges within one of the following five groups:
1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr (Pro, Gly);
2. Polar, negatively charged residues and their amides: Asp, Asn, Glu, Gln;
4. Large aliphatic, nonpolar residues: Met, Leu, Ile, Val (Cys); and
5. Large aromatic residues: Phe, Tyr, Trp.
Most deletions and additions, and substitutions according to the invention are those which do not produce radical changes in the characteristics of the protein or peptide molecule. "Characteristics" is defined in a non-inclusive manner to define both changes in secondary structure, e.g., α-helix or β-sheet, as well as changes in physiological activity, e.g., in biological activity assays. However, when the exact effect of the substitution, deletion, or addition is to be confirmed, one skilled in the art will appreciate that the effect of at least one substitution, addition or deletion will be evaluated by at least one PPCA or pPPCA screening assay, such as, but not limited to, immunoassays or bioassays, to confirm at least one PPCA or pPPCA biological activity.
Surprisingly, a PPCA and/or a pPPCA is now discovered to have serine carboxypeptidase activity and corresponding structural features, although having only about 30% sequence identity to wheat and yeast serine carboxypeptidases. These carboxypeptidases are members of the hydrolase fold family (Liao et al., Biochemistry 31:9796-9812 (1992); Endrizzi et al., Biochemistry 33:11106-11120 (1994); Ollis et al., Protein Eng. 5:197-211 (1992)). The serine carboxypeptidases have peptidase activity at acidic pH (pH 4.5-5.5) as well as deamidase and esterase activities at pH 7 (reviewed in Breddam et al., Carlsberg Res. Commun. 51:83-128 (1986); Rawlings & Barrett, Methods in Enzymology 244:19-61 (1994)). Mutagenesis studies and enzymatic assays have revealed that only the mature form of PPCA possesses a serine carboxypeptidase activity, which is similar to that of lysosomal cathepsin A, and has a preference for hydrophobic substrates such as the dipeptide Phe-Ala (Galjart et al., J. Biol. Chem. 266:14754-14762 (1991)). On the basis of sequence alignments with members of the serine carboxypeptidase family, mutagenesis studies and the structure determination of pPPCA, the catalytic triad in PPCA has now been determined to be formed by the residues Ser 150, His 429 and Asp 372.
PPCA and pPPCA Expression for Isolation and Purification
A nucleic acid sequence encoding a PPCA or a pPPCA (Galjart et al., Cell 54:755-764 (1988)) can be recombined with vector DNA in accordance with conventional techniques, including blunt-ended or staggered-ended termini for ligation, restriction enzyme digestion to provide appropriate termini, filling in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and ligation with appropriate ligases. Techniques for such manipulations are disclosed, e.g., in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second edition, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989), and Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience, N.Y. (1988-1995), and are well known in the art.
A nucleic acid molecule, such as DNA, is said to be "capable of expressing" a polypeptide if it contains nucleotide sequences which contain transcriptional and translational regulatory information and such sequences are "operably linked" to nucleotide sequences which encode the polypeptide. An operable linkage is a linkage in which the regulatory DNA sequences and the DNA sequence sought to be expressed are connected in such a way as to permit gene expression as a PPCA, pPPCA or fragment thereof, in recoverable amounts. The precise nature of the regulatory regions needed for gene expression can vary from organism to organism, as is well known in the analogous art. See, e.g., Sambrook, infra and Ausubel, infra.
The invention accordingly encompasses the expression of a PPCA or a pPPCA, in either prokaryotic or eukaryotic cells, although eukaryotic expression is preferred. Preferred hosts are bacterial or eukaryotic hosts including bacteria, yeast, insects, fungi, bird and mammalian cells either in vivo, or in situ, or host cells of mammalian, insect, bird or yeast origin. It is preferred that the mammalian cell or tissue is of human, primate, hamster, rabbit, rodent, cow, pig, sheep, horse, goat, dog or cat origin, but any other mammalian cell can be used.
Eukaryotic hosts can include yeast, insects, fungi, and mammalian cells either in vivo, or in tissue culture. Preferred eukaryotic hosts can also include, but are not limited to, insect cells and mammalian cells either in vivo, or in tissue culture. Preferred mammalian cells include Xenopus oocytes, HeLa cells, cells of fibroblast origin such as VERO or CHO-1, or cells of lymphoid origin and their derivatives.
Mammalian cells provide post-translational modifications to protein molecules, including correct folding or glycosylation at correct sites. Mammalian cells which can be useful as hosts include cells of fibroblast origin such as, but not limited to, NIH 3T3, VERO or CHO, or cells of lymphoid origin, such as, but not limited to, the hybridoma SP2/0-Ag14 or the murine myeloma P3-X63Ag8, hamster cell lines (e.g., CHO-K1 and progenitors, e.g., CHO-DUXB11) and their derivatives. One preferred type of mammalian cells are cells which are intended to replace the function of the genetically deficient cells in vivo. Neuronally derived cells are preferred for gene therapy of disorders of the nervous system.
For a mammalian cell host, many possible vector systems are available for the expression of at least one PPCA or pPPCA. A wide variety of transcriptional and translational regulatory sequences can be employed, depending upon the nature of the host. The transcriptional and translational regulatory signals can be derived from viral sources, such as, but not limited to, adenovirus, bovine papilloma virus, Simian virus, or the like, where the regulatory signals are associated with a particular gene which has a high level of expression. Alternatively, promoters from mammalian expression products, such as, but not limited to, actin, collagen, or myosin, can be employed for protein production.
When live insects are to be used, silk moth caterpillars and baculoviral vectors are presently preferred hosts for large scale PPCA or pPPCA production according to the invention. Production of PPCA or pPPCA in insects can be achieved, for example, by infecting the insect host with a baculovirus engineered to express transmembrane polypeptide by methods known to those skilled in the related arts. See Ausubel, infra, §§ 16.8-16.11.
In a preferred embodiment, the introduced nucleotide sequence will be incorporated into a plasmid or viral vector capable of autonomous replication in the recipient host. Any of a wide variety of vectors can be employed for this purpose. See, e.g., Ausubel et al., infra, §§ 1.5, 1.10, 7.1, 7.3, 8.1, 9.6, 9.7, 13.4, 16.2, 16.6, and 16.8-16.11. Factors of importance in selecting a particular plasmid or viral vector include the ease with which recipient cells that contain the vector can be recognized and selected from those recipient cells which do not contain the vector, the number of copies of the vector which are desired in a particular host, and whether it is desirable to be able to "shuttle" the vector between host cells of different species.
Different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e.g., glycosylation, cleavage) of proteins. Appropriate cell lines or host systems can be chosen to ensure the desired modification and processing of the foreign protein expressed. For example, expression in a bacterial system can be used to produce an unglycosylated core protein product. Expression in yeast will produce a glycosylated product. Expression in mammalian cells can be used to ensure "native" glycosylation of the heterologous PPCA or pPPCA. Furthermore, different vector/host expression systems can effect processing reactions such as proteolytic cleavages to different extents.
As discussed above, expression of PPCA or pPPCA in eukaryotic hosts requires the use of eukaryotic regulatory regions. Such regions will, in general, include a promoter region sufficient to direct the initiation of RNA synthesis. See, e.g., Ausubel, infra; Sambrook, infra. Once the vector or nucleic acid molecule containing the construct(s) has been prepared for expression, the DNA construct(s) can be introduced into an appropriate host cell by any of a variety of suitable means, i.e., transformation, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate-precipitation, direct microinjection, and the like. After the introduction of the vector, recipient cells are grown in a selective medium, which selects for the growth of vector-containing cells. Expression of the cloned gene molecule(s) results in the production of a PPCA or pPPCA. This can take place in the transformed cells as such, or following the induction of these cells to differentiate (for example, by administration of bromodeoxyuracil to neuroblastoma cells or the like).
A PPCA or pPPCA, or fragments thereof, of this invention can be obtained by expression from recombinant DNA according to known methods. Alternatively, a PPCA or pPPCA can be purified from biological material. A PPCA or a pPPCA can be purified from different mammalian tissues (e.g., human placenta, rat liver, mouse liver, pig kidney, bovine testes, bovine liver, and the like) of various genus and species.
The PPCA or pPPCA can be isolated and purified in accordance with conventional method steps, such as extraction, precipitation, chromatography, affinity chromatography, electrophoresis, or the like. For example, cells expressing at least one PPCA or pPPCA in suitable levels can be collected by centrifugation, or lysed with suitable buffers, and the protein isolated by column chromatography, for example, on DEAE-cellulose, phosphocellulose, polyribocytidylic acid-agarose, or hydroxyapatite, or by electrophoresis or immunoprecipitation. Alternatively, a pPPCA or PPCA can be isolated by the use of antibodies, such as, but not limited to, a PPCA- or pPPCA-specific antibody. Such antibodies can be obtained by known method steps (see, e.g., Harlow and Lane, ANTIBODIES: A LABORATORY MANUAL, Cold Spring Harbor Laboratory (1988); Colligan et al., eds., Current Protocols in Immunology, Greene Publishing Assoc. and Wiley Interscience, N.Y. (1992, 1993), the contents of which references are entirely incorporated herein by reference).
A PPCA or a pPPCA can be purified from different mammalian tissues (e.g., human placenta, rat liver, mouse liver, pig kidney, bovine testes, bovine liver, and the like) of various genus and species, using known techniques such as gel filtration, phase separation and affinity chromatography, e.g., using polyclonal or monoclonal antibodies specific for a PPCA or pPPCA, according to known methods. See, e.g., Oxender et al., Protein Engineering, Liss, New York (1986).
Overview of PPCA or pPPCA Purification and Crystallization Methods
In general, a PPCA or pPPCA is isolated in soluble form in sufficient purity and concentration (e.g., a monomer or dimer) for crystallization. The PPCA or pPPCA is then isolated and assayed for biological activity (e.g., cathepsin A) and for lack of aggregation (which interferes with crystallization). The purified PPCA or pPPCA preferably runs as a single band for each monomer under reducing or nonreducing polyacrylamide gel electrophoresis (PAGE) (nonreducing is used to evaluate the presence of cysteine bridges).
The purified PPCA or pPPCA is preferably crystallized under varying conditions of at least one of the following: pH, buffer type, buffer concentration, salt type, polymer type, polymer concentration, other precipitating ligands and concentration of purified PPCA or pPPCA. See, e.g., known methods (Blundell et al., Protein Crystallography, Academic Press, London (1976); Oxender, infra; McPherson, The Preparation and Analysis of Protein Crystals, Wiley Interscience, N.Y. (1982)) or methods provided in a commercial kit, such as CRYSTAL SCREEN (Hampton Research, Riverside, CA). The crystallized PPCA protein can optionally be tested for at least one PPCA activity, and differently sized and shaped crystals are further tested for suitability for x-ray diffraction. Generally, larger crystals provide better crystallographic data than smaller crystals, and thicker crystals provide better crystallographic data than thinner crystals. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff et al., Diffraction Methods for Biological Macromolecules, vols. 114-115, Methods in Enzymology, Academic Press, Orlando, FL (1985).
Protein Crystallization Methods
The hanging drop method is preferably used to crystallize the purified protein. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra; Taylor et al., J. Mol. Biol. 226:1287-1290 (1992); Takimoto et al. (1992), infra; CRYSTAL SCREEN, Hampton Research.
A mixture of the purified protein and precipitant can include the following:
• pH (e.g., 7-9),
• buffer type (e.g., tromethamine (TRIZMA), sodium azide (NaN3), phosphate, sodium, or cacodylate acetates, imidazole, Tris HCl, sodium hepes),
• buffer concentration (e.g., 1-100 mM),
• salt type (e.g., sodium azide, calcium chloride, sodium citrate, magnesium chloride, ammonium acetate, ammonium sulfate, potassium phosphate, magnesium acetate, zinc acetate, calcium acetate),
• polymer type and concentration (e.g., polyethylene glycol (PEG) 1-50%, type 400-10,000),
• other additives (salts: potassium, sodium, tartrate, ammonium sulfate, sodium acetate, lithium sulfate, sodium formate, sodium citrate, magnesium formate, sodium phosphate, potassium phosphate; organics: 2-propanol; non-volatile: 2-methyl-2,4-pentanediol), β-octyl glucoside, and
• concentration of purified PPCA or pPPCA (e.g., 1.0-100 mg/ml). See, e.g., CRYSTAL SCREEN, Hampton Research.
A non-limiting example of such crystallization conditions is the following:
• purified PPCA or pPPCA protein (e.g., 5 mg/ml),
• two (2) solutions in serial mixtures:
(1) 40-80 mM TRIZMA, 0.05-2.0 mM NaN3,
(2) 2-30% polyethylene glycol (PEG) 8000 buffered with 40-80 mM TRIZMA and 0.05-2.0 mM NaN3,
• 0.05-0.5% β-octyl glucoside,
• at an overall pH of about 8.0-8.3.
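Screening around the example conditions above amounts to enumerating combinations of the varied parameters. The following sketch builds such a grid; the specific grid points are hypothetical values chosen from within the stated ranges (pH about 8.0-8.3, 2-30% PEG 8000, 40-80 mM TRIZMA), and the dictionary keys are illustrative names only.

```python
from itertools import product

# Hypothetical grid points chosen within the ranges given above.
ph_values = [8.0, 8.1, 8.2, 8.3]
peg8000_percent = [2, 10, 20, 30]
trizma_mM = [40, 60, 80]

# One dictionary per drop condition; every combination is listed once.
conditions = [
    {"pH": ph, "peg8000_percent": peg, "trizma_mM": buffer_mM}
    for ph, peg, buffer_mM in product(ph_values, peg8000_percent, trizma_mM)
]

print(len(conditions))
print(conditions[0])
```

Additional parameters, such as protein concentration or β-octyl glucoside percentage, can be added as further lists passed to `product`, at the cost of a correspondingly larger number of drops.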
The above mixtures are used and screened by varying at least one of pH, buffer type, buffer concentration, precipitating salt type or additive or their concentrations, PEG type, PEG concentration, and protein concentration. Crystals ranging in size from 0.1-0.9 mm are formed in 1-14 days. These crystals diffract x-rays to at least 10 Å resolution, such as 0.15-10.0 Å, or any range of value therein, such as 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4 or 3.5, with 3.5 Å or higher being preferred for the highest resolution. In addition to diffraction patterns having this highest resolution, lower resolution, such as 2.5-3.5 Å, can also be used. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra.
Protein Crystals
Crystals appear after 1-14 days and continue to grow on subsequent days. Some of the crystals can be optionally removed, washed, and assayed for biological activity (e.g., PPCA), which activity is preferred for using in further characterizations. Other washed crystals are preferably run on a gel and stained, and those that migrate in the same position as the purified PPCA or pPPCA are preferably used. From two to one hundred crystals are observed in one drop, and crystal forms can occur, such as, but not limited to, orthorhombic, bipyramidal, rhomboid, and cubic. Initial x-ray analyses indicate that such crystals diffract at moderately high to high resolution. When fewer crystals are produced in a drop, they can be of much larger size, e.g., 0.4-0.9 mm. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra.
PPCA and pPPCA X-ray Crystallography Methods
The crystals so produced for a PPCA or pPPCA are x-ray analyzed using a suitable x-ray source. Diffraction patterns are obtained. Crystals are preferably stable for at least 10 hrs in the x-ray beam. Frozen crystals (e.g., -220 to -50°C) are optionally used for longer x-ray exposures (e.g., 5-72 hrs), the crystals being relatively more stable to the x-rays in the frozen state. To collect the maximum number of useful reflections, multiple frames are optionally collected as the crystal is rotated in the x-ray beam, e.g., for 5-72 hrs. Larger crystals (>0.2 mm) are preferred, to increase the resolution of the x-ray diffraction patterns obtained. Crystals are preferably analyzed using a synchrotron high energy x-ray source. Using frozen crystals, x-ray diffraction data is collected on crystals that diffract to at least a relatively high resolution of 1.0-1.5 Å, with lower resolutions also useful, such as 2.5-10 Å, sufficient to solve the three-dimensional structure of a PPCA or pPPCA in considerable detail, as presented herein.
Passing an x-ray beam through a crystal produces a diffraction pattern as a result of the x-rays interacting with and being scattered by the contents of the crystal. The diffraction pattern can be visualized using, e.g., an image plate or film, resulting in an image with spots corresponding to the diffracted x-rays. The positions of the spots in the diffraction pattern are used to determine parameters intrinsic to the crystal (such as unit cell parameters) and to gain information on the packing of the molecules in the crystal. The intensity of the spots contains the Fourier transformation of the molecules in the crystal, i.e., information on each atom in the crystal and hence of the crystallized molecule.
After the diffraction patterns are collected, the data is processed. This includes measuring the spots on each diffraction pattern in terms of position and intensity. This information is processed (i.e., mathematical operations are performed on the data, such as scaling, merging and converting the data from intensities of diffracted beams to amplitudes) to yield a set of data in a form that can be used for the further structure determination of the crystallized molecule. The amplitudes of the diffracted x-rays are then combined with calculated phases to produce an electron density map of the contents of the crystal. In this electron density map, the structure of the molecules (as present in the crystal) is built. The phases can be determined with various known techniques, one being molecular replacement.
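The combination of amplitudes and phases into a density map can be illustrated with a minimal one-dimensional sketch. It is not the processing pipeline of the invention: the two "atoms", their positions and weights, and the function names are hypothetical, and a real crystal requires a three-dimensional synthesis with measured intensities. The sketch converts intensities to amplitudes (|F| = sqrt(I)) and shows that the density peaks are recovered only when phases are supplied.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for a 1-D unit cell of length 1."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def density(amplitudes, phases, x):
    """rho(x) = sum_h |F(h)| * exp(i*phi(h)) * exp(-2*pi*i*h*x)."""
    total = sum(
        amplitudes[h] * cmath.exp(1j * phases[h]) * cmath.exp(-2j * math.pi * h * x)
        for h in amplitudes
    )
    return total.real

# Hypothetical 1-D "crystal": (fractional position, scattering weight).
atoms = [(0.20, 6.0), (0.55, 8.0)]
F = structure_factors(atoms, h_max=15)

# What the experiment measures: intensities only. Amplitudes follow as sqrt(I).
intensities = {h: abs(f) ** 2 for h, f in F.items()}
amplitudes = {h: math.sqrt(i) for h, i in intensities.items()}

# Phases are not measured; here they come from the known model, standing in
# for phases calculated by, e.g., molecular replacement.
phases = {h: cmath.phase(f) for h, f in F.items()}
zero_phases = {h: 0.0 for h in F}

grid = [i / 100 for i in range(100)]
rho = [density(amplitudes, phases, x) for x in grid]
rho_no_phase = [density(amplitudes, zero_phases, x) for x in grid]

peak = grid[max(range(len(grid)), key=lambda i: rho[i])]
peak_no_phase = grid[max(range(len(grid)), key=lambda i: rho_no_phase[i])]
print(peak, peak_no_phase)
```

With the model phases the highest density falls on the heavier atom; with all phases set to zero the map peaks at the origin instead, which is why a phasing technique is needed before a model can be built.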
In the molecular replacement technique, a known three-dimensional structure thought to share structural homology with the structure to be determined is used to calculate a first set of initial phases. These phases are then combined with the diffraction data of the molecule whose structure is to be solved. The result is an electron density map of the molecules in the crystal from which the diffraction patterns originate.
The phases can be further optimized using a technique called density modification, which allows electron density maps of better quality to be produced, facilitating interpretation and model building therein. The atomic model is then refined by allowing the atoms in the model to move in order to match the diffraction data as well as possible while continuing to satisfy stereochemical constraints (sensible bond lengths, bond angles and the like). See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff, infra.
Computer Related Embodiments
An amino acid sequence of a PPCA or pPPCA and/or atomic coordinate/x-ray diffraction data, useful for computer structure determination of a PPCA, pPPCA or a portion thereof, can be "provided" in a variety of mediums to facilitate use thereof. As used herein, "provided" refers to a manufacture which contains a PPCA or pPPCA amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention, e.g., the amino acid sequence provided in Figures 13-15, a representative fragment thereof, or an amino acid sequence having at least 80-100% overall identity to a 5-542 amino acid fragment of an amino acid sequence of Figures 13-15. Such a manufacture provides the amino acid sequence and/or atomic coordinate/x-ray diffraction data in a form which allows a skilled artisan to analyze and determine the three-dimensional structure of a PPCA, a pPPCA or a subdomain thereof.
In one application of this embodiment, PPCA, pPPCA, or at least one subdomain thereof, amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention is recorded on computer readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising computer readable medium having recorded thereon an amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention. As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising an amino acid sequence and/or atomic coordinate/x-ray diffraction data information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon an amino acid sequence and/or atomic coordinate/x-ray diffraction data of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the sequence and x-ray data information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially available software such as WordPerfect and MICROSOFT Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the information of the present invention.
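As one illustration of an ASCII representation of atomic coordinate data, the following minimal sketch writes and reads a single fixed-column record laid out like a Protein Data Bank ATOM line. The choice of that layout is an assumption made for illustration (the text above does not prescribe a file format), and the atom, residue number and coordinate values are hypothetical.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float

def format_atom(atom: Atom) -> str:
    """Write one fixed-column record (PDB-style ATOM layout, columns 1-66)."""
    return (
        "ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    "
        "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
    ).format(atom.serial, atom.name, atom.res_name, atom.chain, atom.res_seq,
             atom.x, atom.y, atom.z, atom.occupancy, atom.b_factor)

def parse_atom(line: str) -> Atom:
    """Read the record back by slicing the same fixed columns."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )

# Hypothetical atom; the values are illustrative, not coordinates of PPCA.
atom = Atom(1, "CA", "SER", "A", 150, 12.345, -6.789, 0.5, 1.0, 23.45)
line = format_atom(atom)
print(line)
print(parse_atom(line) == atom)
```

A fixed-column text record of this kind can be stored in a flat file or loaded into a database table, which corresponds to the choice between text file and database structuring described above.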
By providing computer readable media having stored therein a PPCA or pPPCA sequence and/or atomic coordinates based on x-ray diffraction data, a skilled artisan can routinely access the sequence and atomic coordinate or x-ray diffraction data to model a PPCA, pPPCA, a subdomain thereof, or a ligand thereof. Computer algorithms are publicly and commercially available which allow a skilled artisan to access this data provided on a computer readable medium and analyze it for structure determination and/or RDD. See, e.g., Biotechnology Software Directory, Mary Ann Liebert Publ., New York (1995).
The present invention further provides systems, particularly computer-based systems, which contain the sequence and/or diffraction data described herein. Such systems are designed to do structure determination and RDD for a PPCA, pPPCA or at least one subdomain thereof. Non-limiting examples are microcomputer workstations available from Silicon Graphics Incorporated and Sun Microsystems running Unix-based, Windows NT or IBM OS/2 operating systems.
As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the sequence and/or atomic coordinate/x-ray diffraction data of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate which of the currently available computer-based systems are suitable for use in the present invention. A monitor is optionally provided to visualize structure data. As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a PPCA, pPPCA or fragment sequence and/or atomic coordinate/x-ray diffraction data of the present invention and the necessary hardware means and software means for supporting and implementing an analysis means.
As used herein, "data storage means" refers to memory which can store sequence or atomic coordinate/x-ray diffraction data of the present invention, or a memory access means which can access manufactures having recorded thereon the sequence or x-ray data of the present invention.
As used herein, "search means" or "analysis means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence or x-ray data stored within the data storage means. Search means are used to identify fragments or regions of a PPCA or pPPCA which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting computer analyses can be adapted for use in the present computer-based systems.
As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration or electron density map which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymic active sites, structural subdomains, epitopes, functional domains and signal sequences. A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify structural motifs or interpret electron density maps derived in part from the atomic coordinate/x-ray diffraction data. A skilled artisan can readily recognize that any one of the publicly available computer modeling programs can be used as the search means for the computer-based systems of the present invention.
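A search means that compares a target sequence motif with a stored sequence can be sketched in a few lines. This is only a minimal illustration using a regular expression; the stored sequence and the motif pattern below are hypothetical placeholders, not the sequence of Figures 13-15 or a motif identified in a PPCA.

```python
import re

def find_motif(sequence: str, pattern: str):
    """Return (1-based start, matched fragment) for every match.

    The pattern is wrapped in a lookahead so overlapping matches are reported.
    """
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# Hypothetical stored sequence (one-letter amino acid codes) and target motif:
# Gly, then Glu or Asp, then Ser, any residue, then Gly.
stored_sequence = "MKTAGESGGHYLVLAGESAGK"
target_motif = "G[ED]S.G"

hits = find_motif(stored_sequence, target_motif)
print(hits)  # [(5, 'GESGG'), (16, 'GESAG')]
```

Searching for a structural motif in coordinate data would use a geometric comparison instead of a text pattern, but the division into stored data, target and comparing step is the same.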
One application of this embodiment is provided in Figure 22. Figure 22 provides a block diagram of a computer system 102 that can be used to implement the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM), a variety of secondary storage memory 110, such as a hard drive 112 and a removable medium storage device 114, and a monitor 120. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116 once inserted in the removable medium storage device 114.
Amino acid, encoding nucleotide or other sequence and/or atomic coordinate/x-ray diffraction data of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. Software for accessing and processing the amino acid sequence and/or atomic coordinate/x-ray diffraction data (such as search tools, comparing tools, etc.) resides in main memory 108 during execution. The monitor 120 is optionally used to visualize the structure data.
Structure Determination
One or more computational steps, computer programs and/or computer algorithms are used to build a molecular 3-D model of a PPCA or pPPCA, using amino acid sequence data from Figures 13-15 (or variants thereof) and/or atomic coordinate/x-ray diffraction data, as presented herein.
In x-ray crystallography, x-ray diffraction data and phases are combined to produce electron density maps in which the three-dimensional structure of a PPCA or pPPCA is then built or modeled. This structure can then be used for RDD of modulators of at least one PPCA- or pPPCA-related activity that is relevant to at least one PPCA- or pPPCA-related pathology.
Density Modification and Map Interpretation
Electron density maps can be calculated using such programs as those from the CCP4 computing package (SERC (UK) Collaborative Computing Project 4, Daresbury Laboratory, UK, 1979). Cycles of two-fold averaging can further be used, such as with the program RAVE (Kleywegt & Jones, in Bailey et al., eds., First Map to Final Model, SERC Daresbury Laboratory, UK, pp. 59-66 (1994)), together with gradual model expansion. For map visualization and model building, a program such as "O" (Jones (1991), infra) can be used.
Refinement and Model Validation
Rigid body and positional refinement can be carried out using a program such as X-PLOR (Brünger (1992), infra), e.g., with the stereochemical parameters of Engh and Huber (Acta Cryst. A47:392-400 (1991)). If the model at this stage in the averaged maps still misses residues (e.g., at least 5-10 per subunit), then some or all of the missing residues can be incorporated in the model during additional cycles of positional refinement and model building. The refinement procedure can start using data from lower resolution (e.g., 25-10 Å to 10-3.0 Å) and then be gradually extended to include data from 12-6 Å to 3.0-1.5 Å. B-values (also termed temperature factors) for individual atoms can be refined once data of 2.8 Å or higher (e.g., up to 1.5 Å) has been added. Subsequently, waters can be gradually added. A program such as ARP (Lamzin and Wilson, Acta Cryst. D49:129-147 (1993)) can be used to add crystallographic waters and as a tool to check for bad areas in the model. Programs such as PROCHECK (Laskowski et al., J. Appl. Cryst. 25:283-291 (1993)), WHATIF (Vriend, J. Mol. Graph. 8:52-56 (1990)) and PROFILE 3D (Luthy et al., Nature 356:83-85 (1992)), as well as the geometrical analysis generated by X-PLOR, can be used to check the structure for errors. A program such as DSSP can be used to assign the secondary structure elements (Kabsch and Sander (1983), infra).
The structure of a PPCA or pPPCA can thus be solved with the molecular replacement procedure, such as by using X-PLOR (Brünger (1992), infra). A partial search model for the monomer can be constructed using a related protein, such as the wheat serine carboxypeptidase structure (Liao et al. (1992), infra). The rotation and translation functions can be solved to yield orientations and positions for the subunits in the crystallographic asymmetric unit. This allows phases to be determined that, when combined with information from the x-ray diffraction patterns, allow electron density maps of a PPCA or pPPCA to be calculated. The atomic model is then built using these electron density maps.
Cyclical two-fold density averaging can also be done to improve the electron density maps using a suitable program (e.g., RAVE), and model expansion can also be used to add missing residues for each monomer, resulting in a model with 95-99.9% of the total number of residues. The model can be refined in a program such as X-PLOR (Brünger (1992), supra) to a suitable crystallographic R-factor. The model data is then saved on computer readable media for use in further analysis, such as rational drug design.
Rational Design of Drugs that Interact with the PPCA or pPPCA
The determination of the three-dimensional structure of a PPCA or pPPCA, as described herein, provides a basis for the design of new and specific ligands for the diagnosis and/or treatment of at least one PPCA- or pPPCA-related pathology. Several approaches can be taken for the use of the crystal structure of a PPCA or pPPCA in the rational design of ligands of this protein. A computer-assisted, manual examination of the active site structure is optionally done. Software such as GRID (Goodford, J. Med. Chem. 28:849-857 (1985)), a program that determines probable interaction sites between probes with various functional group characteristics and the enzyme surface, is used to analyze the active site to determine structures of inhibiting compounds. The program calculations, with suitable inhibiting groups on molecules (e.g., protonated primary amines) as the probe, are used to identify potential hotspots around accessible positions at suitable energy contour levels. Suitable ligands, as inhibiting or stimulating modulating compounds or compositions, are then tested for modulating activities of at least one PPCA or pPPCA.
A diagnostic or therapeutic PPCA or pPPCA modulating ligand of the present invention can be, but is not limited to, at least one selected from a nucleic acid, a compound, a protein, an element, a lipid, an antibody, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a glycoprotein, an enzyme, a detectable probe, an antibody or fragment thereof, or any combination thereof, which can be detectably labeled as for labeling antibodies. Such labels include, but are not limited to, enzymatic labels, radioisotope or radioactive compounds or elements, fluorescent compounds or metals, chemiluminescent compounds and bioluminescent compounds. Alternatively, any other known diagnostic or therapeutic agent can be used in a method of the invention.
After preliminary experiments are done to determine the K_m of the substrate with each enzyme activity of a PPCA or pPPCA, the time-dependent nature of modulation and the ligand K_i values are determined (e.g., by the method of Henderson (Biochem. J. 127:321-333 (1972))). For example, the substrate (or blank where appropriate) and enzyme are pre-incubated in buffer. Reactions are initiated by the addition of substrate. Aliquots are removed over a suitable time course and each is quenched by addition to a suitable quenching solution (e.g., sodium hydroxide in aqueous ethanol). The concentration of product is determined, e.g., fluorometrically, using a spectrometer. Plots of fluorescence against time can be close to linear over the assay period, and are used to obtain values for the initial velocity in the presence (V_i) or absence (V_0) of ligand. Error is present in both axes in a Henderson plot, making it inappropriate for standard regression analysis (Leatherbarrow, Trends Biochem. Sci. 15:455-458 (1990)). Therefore, K_i values are obtained from the data by fitting to a modified version of the Henderson equation for competitive inhibition:
\[ Q r^{2} + (E_t - Q - I_t)\, r - E_t = 0 \]
where (using the notation of Henderson (Biochem. J. 127:321-333 (1972)))
\[ r = \frac{V_0}{V_i} \]
This equation is solved for the positive root with the constraint that
\[ Q = K_i \left( \frac{A_t + K_a}{K_a} \right) \]
using PROC NLIN from SAS (SAS Institute Inc., Cary, North Carolina, USA), which performs nonlinear regression using least-squares techniques. The iterative method used is optionally the multivariate secant method, similar to the Gauss-Newton method except that the derivatives in the Taylor series are estimated from the history of iterations rather than supplied analytically. A suitable convergence criterion is optionally used, e.g., where there is a change in loss function of less than 10^-8.
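The fitting step can be illustrated with a minimal sketch. It assumes the modified Henderson equation has the quadratic form Q r^2 + (E_t - Q - I_t) r - E_t = 0 with r = V_0/V_i and Q = K_i((A_t + K_a)/K_a), where E_t, I_t and A_t are total enzyme, ligand and substrate concentrations and K_a is the Michaelis constant of the substrate. It does not reproduce SAS PROC NLIN or the secant method: a simple golden-section search over log10(K_i) is substituted, and all concentrations are hypothetical values in molar units.

```python
import math

def velocity_ratio(k_i, e_t, i_t, a_t, k_a):
    """Positive root r = V_0/V_i of Q*r**2 + (E_t - Q - I_t)*r - E_t = 0."""
    q = k_i * (a_t + k_a) / k_a
    b = e_t - q - i_t
    # Q > 0 and -E_t <= 0, so the discriminant is non-negative and exactly
    # one root is non-negative.
    return (-b + math.sqrt(b * b + 4.0 * q * e_t)) / (2.0 * q)

def fit_k_i(data, e_t, a_t, k_a, lo=1e-12, hi=1e-3, iters=200):
    """Least-squares estimate of K_i from (I_t, observed V_0/V_i) pairs.

    Golden-section search on log10(K_i); an illustrative substitute for the
    nonlinear regression routine named in the text.
    """
    def sse(log_k_i):
        k_i = 10.0 ** log_k_i
        return sum((velocity_ratio(k_i, e_t, i_t, a_t, k_a) - r_obs) ** 2
                   for i_t, r_obs in data)

    a, b = math.log10(lo), math.log10(hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    return 10.0 ** ((a + b) / 2.0)

# Hypothetical, noise-free data generated from a known K_i of 2 nM.
E_T, A_T, K_A, TRUE_K_I = 1e-9, 1e-4, 5e-5, 2e-9
ligand_concs = [1e-9, 2e-9, 5e-9, 1e-8, 2e-8]
data = [(i_t, velocity_ratio(TRUE_K_I, E_T, i_t, A_T, K_A)) for i_t in ligand_concs]

k_i_fit = fit_k_i(data, E_T, A_T, K_A)
print(k_i_fit)
```

When E_t is negligible the positive root reduces to r = 1 + I_t/Q, the familiar expression for classical competitive inhibition, which gives a quick sanity check on the quadratic.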
Once modulating ligands are found and isolated or synthesized, crystallographic studies of the compounds complexed to a PPCA or pPPCA can be performed. As a non-limiting example, PPCA or pPPCA crystals are soaked for 2 days in 0.01-100 mM ligand, and x-ray diffraction data are collected on an area detector and/or an image plate detector (e.g., a Mar image plate detector) using a rotating anode x-ray source. Data are collected to as high a resolution as possible, e.g., an inner limit of diffraction of 1.5-3.5 Å. An atomic model of the inhibitor is built into the difference Fourier map (F_inhibitor complex - F_native). The model can be refined to adjust the atomic positions to improve the fit with the electron density maps, while maintaining correct stereochemical constraints. The model will preferably have low r.m.s. deviations from the ideal bond lengths and bond angles, as well as a low R-factor (preferably less than about 25-35%, such as less than about 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, or 25%). Direct measurements of enzyme inhibition provide further confirmation that the modeled ligands are modulators of at least one biological activity of a PPCA or a pPPCA. As a non-limiting example, a modification (Chong et al., Biochim. Biophys. Acta 1077:65-71 (1991)) of the fluorometric assay of Potier et al. (Analyt. Biochem. 94:287-296 (1979)) is optionally used to measure neuraminidase inhibition or stimulation, optionally including determination of inhibition constants (K_i). Other suitable PPCA activity assays include, e.g., cathepsin A activity (Galjart et al., J. Biol. Chem. 266:14754-14762 (1991)), endothelin I deamidase activity (Jackman et al., J. Biol. Chem. 267:2872-2875 (1992)), and tachykinin deamidase activity (Jackman et al., J. Biol. Chem. 265:11265-11272 (1990)).
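The R-factor referred to above is conventionally the sum of the absolute differences between observed and calculated structure factor amplitudes divided by the sum of the observed amplitudes. A minimal sketch follows; the amplitude values are hypothetical, and the observed and calculated sets are assumed to be already on a common scale.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |F_obs| - |F_calc| |) / sum(|F_obs|), amplitudes on a common scale."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Hypothetical amplitudes for four reflections.
observed = [100.0, 80.0, 60.0, 40.0]
calculated = [90.0, 85.0, 50.0, 45.0]

r = r_factor(observed, calculated)
print(f"R = {r:.3f} ({100 * r:.1f}%)")  # R = 0.107 (10.7%)
```

A value expressed this way can be compared directly with the preferred upper bounds of about 25-35% given above.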
Ligands of a PPCA or pPPCA, based on the crystal structure of this enzyme, are thus also provided by the present invention. A PPCA or pPPCA ligand is any molecule, compound or composition that is capable of associating with a PPCA or pPPCA and optionally modulating at least one function or structural feature of a PPCA or pPPCA. Preferably, a PPCA or pPPCA ligand modulates at least one biological activity of a PPCA or pPPCA. Demonstration of clinically useful levels of activity, e.g., in vivo activity, is also important. In evaluating PPCA or pPPCA inhibitors for biological activity in animal models (e.g., rat, mouse, rabbit), various oral and parenteral routes of administration are evaluated. Using this approach, it is expected that modulation of a PPCA or pPPCA occurs in suitable animal models, using the ligands discovered by structure determination and x-ray crystallography.
Evaluation of Therapeutic Potentials of Compositions via a PPCA Animal Model
The present invention also provides methods for identifying diagnostic or therapeutic ligands of PPCA or pPPCA via computer RDD, to treat a PPCA-related pathology. Generally, a method for determining the therapeutic or diagnostic use of a PPCA or pPPCA modulating ligand, to treat a PPCA related pathology, comprises the steps of administering a known dose of at least one ligand-containing composition to an animal model having a phenotype corresponding to a PPCA-related pathology, monitoring the appropriate biological or biochemical parameters, and comparing the results for treated animals with those for untreated animals. Results indicating the onset or presence of a PPCA related pathology are generally referred to herein as "symptoms" of the disease. See, e.g., U.S. Appl. No. 08/397,693, filed March 2, 1995, which is entirely incorporated herein by reference.
Appropriate biological and biochemical parameters that reflect the onset and progression of a PPCA related pathology include, but are not limited to: (1) gross biological parameters, e.g., physical appearance (i.e., flattening of the face, rough haircoat and/or subcutaneous swelling in affected animals) or growth (reduced weight gain); (2) gross behavioral parameters, e.g., lack of coordination; (3) biochemical assays, e.g., assays of cathepsin A, N-acetyl-α-neuraminidase or β-galactosidase activities in primary cultures of skin fibroblasts or tissue homogenates; and (4) histopathological studies (visceromegaly, i.e., enlarged liver and spleen, accumulation of secondary vacuoles in kidney tissues, etc.).
A first method of evaluating the therapeutic potential of a composition using the transgenic non-human animals of the invention comprises the steps of:
(1) Administering a known dose of the composition to a first non-human animal having a phenotype corresponding to a human PPCA related pathology;
(2) Detecting the time of onset of symptoms in the first non-human animal; and
(3) Comparing the time of onset of symptoms in the first non-human animal to the time of onset of symptoms in a second non-human animal having a phenotype corresponding to a human PPCA related pathology, which has not been exposed to the composition, wherein a statistically significant delay in the time of onset of symptoms in the first non-human animal relative to the time of onset of the symptoms in the second non-human animal indicates the potential of the composition for treating a PPCA related pathology.
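The comparison in step (3) calls for a statistical test, which the method leaves open. As a minimal sketch, assuming groups of treated and untreated animals and a one-sided exact permutation test on the difference in mean onset time (one of several tests that could be used), the significance of a delay can be estimated as follows. The onset times are hypothetical.

```python
from itertools import combinations
from statistics import mean

def onset_delay_p_value(treated, untreated):
    """One-sided exact permutation p-value for a delay in mean onset time.

    Counts the fraction of all relabelings of the pooled animals whose mean
    difference (treated minus untreated) is at least the observed difference.
    """
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_untreated = len(pooled) - n_treated
    observed = mean(treated) - mean(untreated)
    total = sum(pooled)
    at_least_as_extreme = 0
    n_relabelings = 0
    for idx in combinations(range(len(pooled)), n_treated):
        t_sum = sum(pooled[i] for i in idx)
        diff = t_sum / n_treated - (total - t_sum) / n_untreated
        if diff >= observed - 1e-12:  # tolerance for floating-point rounding
            at_least_as_extreme += 1
        n_relabelings += 1
    return at_least_as_extreme / n_relabelings

# Hypothetical times of symptom onset, in days after birth.
treated_onset = [30, 32, 35, 31, 34]
untreated_onset = [20, 22, 21, 23, 19]

p_value = onset_delay_p_value(treated_onset, untreated_onset)
print(f"p = {p_value:.4f}")  # p = 0.0040
```

The exact enumeration is practical only for small groups; for larger groups a random sample of relabelings or a parametric test would be the usual alternative.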
A second method of evaluating the therapeutic potential of a composition using the non-human animals of the invention comprises the steps of:
(1) Administering a known dose of the composition to a first non-human animal having a phenotype corresponding to a human PPCA related pathology at an initial time, t_0;
(2) Determining the extent of symptoms in the first non-human animal at a later time, t_1; and
(3) Comparing, at t_1, the extent of symptoms in the first non-human animal to the extent of symptoms in a second non-human animal having a phenotype corresponding to a human PPCA related pathology, which has not been exposed to the composition at t_0, wherein a statistically significant decrease in the extent of symptoms at t_1 in the first non-human animal relative to the extent of the symptoms at t_1 in the second non-human animal indicates the potential of the composition for treating a PPCA related pathology.
In the above methods, the composition being tested may comprise a chemical compound administered by circulatory injection or oral ingestion. The composition being evaluated may alternatively comprise a polypeptide administered by circulatory injection of an isolated or recombinant bacterium or virus that is live or attenuated, wherein the polypeptide is present on the surface of the bacterium or virus prior to injection, or a polypeptide administered by circulatory injection of an isolated or recombinant bacterium or virus capable of reproduction within a non-human animal, wherein the polypeptide is produced within the non-human animal by genetic expression of a DNA sequence encoding the polypeptide. Alternatively, the composition being evaluated may comprise one or more nucleic acids, including a gene from the human genome or a processed RNA transcript thereof. Similarly, the composition being evaluated may comprise cells removed from a mammal and genetically engineered to overexpress a lysosomal protein or some other therapeutic polypeptide.
Once the PPCA modulating ligand has been shown to be effective in an animal model, it can then be tested in human clinical trials, according to known method steps. In the above methods, delivery of the composition being tested to non-human animals is achieved via means appropriate for the composition being tested, e.g., by diet; by intermittent or continuous intravenous injection of one or more of the compositions or of a liposome (Rahman and Schein, in Liposomes as Drug Carriers, Gregoriadis, ed., John Wiley, New York (1988), pages 381-400; Gabizon, A., in Drug Carrier Systems, Vol. 9, Roerdink et al., eds., John Wiley, New York (1989), pages 185-212) or microparticle (Tice et al., U.S. Patent 4,542,025 (Sep. 17, 1985)) formulation comprising one or more of the compositions; via subdermal implantation of drug-polymer conjugates (Duncan, R., Anti-Cancer Drugs 3:175-210 (1992)); via microparticle bombardment (Sanford et al., U.S. Patent 4,945,050 (Jul. 31, 1990)); via infusion pumps (Blackshear and Rohde, in Drug Carrier Systems, Vol. 9, Roerdink et al., eds., John Wiley, New York (1989), pages 293-310); or by other appropriate means known in the art (see, generally, Remington's Pharmaceutical Sciences, 18th Ed., Gennaro, ed., Mack Publishing Co., Easton, PA (1990)).
Pharmaceutical/Diagnostic Administration
Using compounds or compositions comprising at least one PPCA or pPPCA modulating ligand, the present invention further provides a method for modulating the activity of a PPCA or pPPCA protein in a cell. In general, ligands (antagonists or agonists) which have been identified to inhibit or enhance the activity of at least one PPCA or pPPCA can be formulated so that the ligand can be contacted with a cell expressing at least one PPCA or pPPCA protein in vivo. The contacting of such a cell with such a ligand results in the in vivo modulation of at least one biological activity of a PPCA or pPPCA.
At least one PPCA or pPPCA modulating compound or composition of the invention can be administered by any means that achieve the intended purpose, using a suitable pharmaceutical composition or formulation. For example, administration can be by various parenteral routes such as subcutaneous, intravenous, intradermal, intramuscular, intraperitoneal, intranasal, intracranial, transdermal, or buccal routes. Alternatively, or concurrently, administration can be by the oral route. Parenteral administration can be by bolus injection or by gradual perfusion over time.
A typical regimen for treatment or prophylaxis comprises administration of an effective amount over a period of one or several days, up to and including between one week and about six months. It is understood that the dosage of a diagnostic/pharmaceutical compound or composition of the invention administered in vivo or in vitro will be dependent upon the age, sex, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the diagnostic/pharmaceutical effect desired. The ranges of effective doses provided herein are not intended to be limiting and represent preferred dose ranges. However, the most preferred dosage will be tailored to the individual subject, as is understood and determinable by one skilled in the relevant arts. See, e.g., Berkow et al., eds., The Merck Manual, 16th edition, Merck and Co., Rahway, N.J., 1992; Goodman et al., eds., Goodman and Gilman's The Pharmacological Basis of Therapeutics, 8th edition, Pergamon Press, Inc., Elmsford, N.Y. (1990); Avery's Drug Treatment: Principles and Practice of Clinical Pharmacology and Therapeutics, 3rd edition, ADIS Press, LTD., Williams and Wilkins, Baltimore, MD (1987); Ebadi, Pharmacology, Little, Brown and Co., Boston (1985); Osol et al., eds., Remington's Pharmaceutical Sciences, 18th edition, Mack Publishing Co., Easton, PA (1990); Katzung, Basic and Clinical Pharmacology, Appleton and Lange, Norwalk, CT (1992), which references are entirely incorporated herein by reference.
The total dose required for each treatment can be administered by multiple doses or in a single dose. The diagnostic/pharmaceutical compound or composition can be administered alone or in conjunction with other diagnostics and/or pharmaceuticals directed to the pathology, or directed to other symptoms of the pathology. Effective amounts of a diagnostic/pharmaceutical compound or composition of the invention are from about 0.1 μg to about 100 mg/kg body weight, administered at intervals of 4-72 hours, for a period of 2 hours to 1 year, and/or any range or value therein.
The recipients of administration of compounds and/or compositions of the invention can be any mammals. Among mammals, the preferred recipients are mammals of the Orders Primata (including humans, apes and monkeys), Arteriodactyla (including horses, goats, cows, sheep, pigs), Rodenta (including mice, rats, rabbits, and hamsters), and Carnivora (including cats and dogs). The most preferred recipients are humans.
Having now generally described the invention, the same will be more readily understood through reference to the following example which is provided by way of illustration, and is not intended to be limiting of the present invention
Example I: Preparation, Purification and Crystallization of PPCA or pPPCA from Human Cells
The present invention provides, in one aspect, the determination of the three-dimensional structure of the human protective protein/cathepsin A (PPCA) in the precursor form (pPPCA) by a combination of molecular replacement and twofold density averaging. The structure presented here is the first of an enzyme associated with a human PPCA related pathology and the third human lysosomal enzyme structure determined. The structure gives us insight into the zymogen activation mechanism of pPPCA as well as the expected 3-D structure of PPCA and its specific and new enzymatic activities.
PPCA and pPPCA Expression and Purification
Plasmid Constructs. AcMNPV transfer plasmids pJR2 and pBC3 (Figure 1) were derivatives of plasmid pAc373 carrying the entire polyhedrin gene (Smith et al., 1985). In pJR2 a polylinker with a number of multiple cloning sites (MCS) was inserted directly 3' of the polyhedrin promoter and substituted a 33-nucleotide deletion of the polyhedrin gene, starting with the ATG. pBC3 had the polylinker situated in a similar position as pJR2, but instead of the 33-nt deletion this plasmid featured an ATG codon mutated to ACG. Full-length human PPCA cDNA, PPCA54 (Galjart et al., 1988), and the two deletion cDNA mutants, 32(Δ20) and 20(Δ32) (Galjart et al., 1991), were subcloned either in pJR2 or pBC3 as EcoRI fragments, using standard procedures (Sambrook et al., 1989) (Figure 1). The 20(Δ32) deletion mutant was tagged with the human PPCA signal sequence, as reported earlier (Galjart et al., 1991). All cDNA fragments were engineered to have short 3' and 5' untranslated regions (< 10 bp).
Transfection and Selection of Recombinant Baculovirus. Spodoptera frugiperda insect cells (IPLB-SF21) were cultured in monolayers at 27°C in TNM-FH medium (Hink, 1970), supplemented with 10% FBS and antibiotics (complete medium). Wild-type (wt) AcMNPV virus strain E2 (Smith and Summers, 1978) and recombinant baculoviruses were propagated on confluent monolayers of Sf21 cells. Recombinant constructs AcPPCA54, AcPPCA32 and AcPPCA20 were generated by cotransfecting Sf21 cells with 1 μg wt-AcMNPV DNA and 10 μg plasmid DNA, using the calcium phosphate method, modified for insect cells (Graham et al., 1973; Carstens et al., 1980; Summers et al., 1987). Polyhedrin-negative recombinant baculoviruses were then selected and purified by sequential plaque assays, and verified by dot blot and Southern blot analysis (Summers et al., 1987). Large quantities of inoculum were produced by infection of insect cells at 25-50% confluency with recombinant virus at a multiplicity of infection (MOI) of < 1 pfu/cell. After 3 to 6 days at 27°C, when all cells appeared infected, the medium was harvested and centrifuged for 5 min at 1000 rpm to remove detached cells. The titre of the inoculum was determined by plaque assay analysis.
Protein purification and western blotting. Sf21 cells were cultured in either 175 cm² or 500 cm² flasks (triple flask, Nunc) to near confluency, and infected with recombinant baculoviruses at an MOI of 5-10 pfu/cell. After 1.5 h incubation at 27°C, the inoculum was replaced with complete medium for an additional 8 to 10 h. Cell monolayers were then rinsed with PBS and cultured further for 38 h in unsupplemented Grace's medium. After infection the medium was collected and centrifuged for 5 min at 1500 g, and then for 1 h at 100,000 g (Beckman SW-28 rotor) to remove virus particles.
After centrifugation the supernatant was concentrated 20-fold in an Amicon stirred cell. Glycoproteins were purified to ~60% using a concanavalin A-SEPHAROSE affinity chromatography column, as described earlier (Verheijen et al., 1982). Total protein concentration was measured using the method of Smith et al. (1985). Aliquots of the purified preparation were resolved on 12.5% SDS-polyacrylamide gels under reducing and non-reducing conditions. Gels were stained with either Coomassie brilliant blue or silver (Sambrook et al., 1989). For western blotting, proteins were transferred from gels to IMMOBILON PVDF membranes (Millipore Corp.), using a semidry blotter (The W.E.P. company).
Development and Use of pPPCA antibodies. A 15 amino acid peptide (NH2-Cys-Met-Trp-His-Gln-Ala-Leu-Leu-Arg-Ser-Glu-Asp-Lys-Ala-Arg-COOH) (Figure 5), based on the C-terminal sequence of the 34-kDa PPCA subunit (amino acid 285-298, Galjart et al., 1988), was synthesized on a peptide synthesizer (Applied Biosystems), and covalently linked to the carrier protein Keyhole Limpet Hemocyanin, using the IMJECT ACTIVATED IMMUNOGEN CONJUGATION KIT (Pierce). Polyclonal antibodies against the conjugated product were raised in rabbit, by multiple subdermal injections of the protein (40-125 μg) mixed with incomplete Freund's adjuvant (Pierce). Rabbits were bled 34 days after the first injection. The antibodies, designated anti-pep, were tested on immunoblots and by immunoprecipitations of baculovirus-produced PPCA.
Blots were incubated for at least 12 h in blocking buffer (0.01 M tris-buffered saline pH 8.0 (TBS), 0.05% Tween 20, and 1% (w/v) BSA), and subsequently probed for 2 h with polyclonal PPCA antibodies, anti-54, diluted 1:200 in fresh blocking buffer. They were then washed for 1 h in TBS, 0.05% Tween 20, and incubated for 2 h with alkaline phosphatase-conjugated anti-rabbit IgG (Sigma, 1:1000 in blocking buffer). Proteins were visualized using alkaline phosphatase substrate (Sigma, 4-aminodiphenylamine diazonium sulfate, naphthol AS-MX phosphate).
Crystallization of PPCA. Fractions containing the precursor form of the protein, as assayed on an SDS-PAGE gel, were pooled. Subsequently the protein was concentrated to 5 mg/ml and the buffer exchanged to 50 mM NaAc pH 5.2 or 50 mM MES pH 6.5 using a CENTRICON-10. Crystals were grown using the hanging drop vapor diffusion technique. Crystals suitable for data collection were grown using a reservoir solution containing 2-10% PEG 8000, pH 8.0-8.3, 50 mM TRIZMA, 1 mM NaN3, 0.25% β-octyl glucoside at 4-12°C. Mixing non-equal volumes of protein solution (in the range 5-10 μl) and reservoir solution (in the range 2-6 μl) enhanced the occurrence of single large crystals per drop under these crystallization conditions. The concentration of the protein solution before mixing was 5 mg/ml. Crystal growth was enhanced by macrocrystallization techniques (anything that promotes growth of big crystals) and in some cases by micro- and macroseeding techniques.
Example 2: Structure Determination of a pPPCA Crystallized from Human Cells
Data Collection, Data Processing and Reduction
To allow for data collection at cryotemperatures, the crystals were cryoprotected by adding glycerol in 5%-10% steps to a solution of about 12% PEG 8000, 50 mM TRIZMA, pH 8.0, 1 mM NaN3, 0.25% β-octyl glucoside, which served as an artificial mother liquor. The crystals were incubated for half an hour at 40°C after each addition of glycerol. The final mother liquor contained 30% glycerol. Gradually increasing the glycerol was needed to help keep the crystals from cracking.
Diffraction data were collected at the Stanford Synchrotron Radiation Laboratory (SSRL) to 2.0 Å at -178°C on a MAR imaging plate at a wavelength of 1.08 Å on beamline 7-1. The diffraction coordinate data (corresponding to the atomic coordinates of monomer 1; the coordinates of the other monomer are provided by matrix conversion of these coordinates, as presented herein) were processed and reduced using MOSFLM version 5.2 from the CCP4 program package (SERC (UK) Collaborative Computing Project 4, Daresbury Laboratory UK, 1979). The program REFIX (Kabsch (1993), infra) was used for auto-indexing. Using the CCP4 program suite (SERC (UK) Collaborative Computing Project 4, Daresbury Laboratory UK, 1979), the intensities were scaled (ROTAVATA), merged (AGROVATA), then converted to amplitudes and truncated with the program TRUNCATE. Statistics of the data collected are given in Table 1. The Vm (Matthews, B. W., J. Mol. Biol. 33:491-497 (1968)) is 3.2 Å³/Da for 2 monomers in the asymmetric unit, corresponding to a solvent content of 62%.
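The relation between the Matthews coefficient and the solvent content quoted above can be checked with a short calculation. The sketch below assumes the usual approximation in which the protein volume fraction is 1.23/Vm (a partial specific volume of about 0.74 cm³/g); the function names are illustrative, and the unit-cell numbers used for `matthews_coefficient` in the test are hypothetical, since the cell dimensions are not given in this section.

```python
def matthews_coefficient(cell_volume_A3, n_asym_units, mass_per_asu_Da):
    """Vm in cubic angstroms per dalton: unit-cell volume over total protein mass in the cell."""
    return cell_volume_A3 / (n_asym_units * mass_per_asu_Da)


def solvent_fraction(vm, protein_term=1.23):
    """Estimated solvent fraction from Vm, assuming protein fraction = 1.23 / Vm."""
    return 1.0 - protein_term / vm


if __name__ == "__main__":
    vm = 3.2  # value reported in the text for two monomers per asymmetric unit
    print(f"solvent content: {solvent_fraction(vm):.0%}")
```

With Vm = 3.2 Å³/Da this approximation gives a solvent fraction of about 0.62, in line with the 62% stated in the text.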
Molecular Replacement
Search Model: The best molecular replacement results were obtained using a multi-Ala core as a search probe.
The 'multi-Ala core' search model was constructed from the atomic coordinates of the CPW monomer (Liao et al., 1992), based on the sequence alignment as presented in Figure 15. Regions expected to deviate in structure between PPCA and CPW were deleted from the model (i.e., those with low sequence identity or located in loops). The 125 residues identical in PPCA and CPW were left in the model; 112 residues were truncated to alanine. The remaining 94 residues, though differing between CPW and PPCA, were considered sufficiently similar in size, and the CPW residue was left as such in the model. The resulting 'multi-Ala core' monomer consisted of 331 residues, constituting a large portion of the core domain and little atomic information for the 'cap' domain (see Figure I). The model contained 30% of the expected protein scattering mass, given the fact that there are two monomers in the asymmetric unit. The sequence identity between this search model and the true PPCA structure was 37.7%.
Rotation Function, PC Refinement and Translation Function: Native data of 8-4 Å were used in the molecular replacement calculations. The rotational searches utilized a real space Patterson search method, as implemented in X-PLOR (Steigemann, 1974; Huber, 1985; Brunger, 1992a), with a Patterson vector cutoff of 21 Å. The self-rotation function failed to reveal any non-crystallographic two-fold symmetry relating two monomers in the asymmetric unit. In addition, the native self Pattersons did not reveal the presence of a non-crystallographic two-fold axis parallel to a crystallographic axis. These results indicated that the two monomers in the asymmetric unit might not form a dimer together. The cross-rotation function was carried out to find the orientation of the two monomers in the asymmetric unit as follows. Patterson vector sets were calculated for the search model and the native data, and the 8000 strongest Patterson vectors were used in the rotation function. The rotational space, restricted to the asymmetric unit of the rotation function according to Rao et al., 1980, was sampled by rotating the Patterson vectors from the search model around Eulerian angles θ1, θ2, and θ3, while sampling θ2 in angular grid intervals of 2.5°. The 5000 highest rotation function grid points resulting from the product function of the two Patterson vector sets were selected. The grid points (differing less than 8° around any given axis) were then clustered. The result was a list of 169 possible solutions for the rotation function, each corresponding to a set of three angles describing an orientation. The two top solutions were 3.9 and 3.8 sigma above the mean. PC-refinement (Brunger, 1990) was carried out to optimize each of the 169 possible solutions using the complete search model as a single rigid body. This yielded two orientations with a PC-index of 0.043 and 0.051, respectively. The orientations of these solutions were (θ1 = 261.4, θ2 = 36.22, θ3 = 147.28) and (θ1 = 18.52, θ2 = 47.40, θ3 = 23.22), respectively. In contrast, the rest of the possible solutions yielded an average PC-index of 0.022.
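Peak heights such as "3.9 sigma above the mean" express a score relative to the distribution of all scores in the search. The following is a minimal sketch of that normalization, not the X-PLOR implementation; the function name and the example scores are hypothetical.

```python
from statistics import mean, pstdev


def sigma_above_mean(scores):
    """Return each score as a number of standard deviations above the mean of all scores."""
    mu = mean(scores)
    sd = pstdev(scores)  # population standard deviation over the whole search
    if sd == 0:
        raise ValueError("all scores are identical; sigma levels are undefined")
    return [(s - mu) / sd for s in scores]


if __name__ == "__main__":
    # Hypothetical rotation-function values: four background points and one peak.
    demo = [1.0, 1.0, 1.0, 1.0, 6.0]
    print(sigma_above_mean(demo))
```

A solution is usually considered promising when its sigma level stands clearly apart from the rest of the list, which is why the text follows the rotation search with PC-refinement to discriminate between closely ranked candidates.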
Individual translation function calculations were performed on a 1 Å grid. A translational solution was found for each orientation at positions (x=33.30, y=51.97, and z=12.79) and (x=25.23, y=28.58, and z=22.02), with respect to the crystallographic center, at 7.7σ and 8.8σ, respectively, above the mean. The R-factor for the individual solutions was 55.6% and 54.8% in the resolution range 8.0 to 4.0 Å, with a correlation coefficient (CC) of 0.095 and 0.114. A combined translation function was calculated to place each solution relative to the same crystallographic origin, resulting in an R-factor of 52.8% for data between 8.0 and 4.0 Å, bringing the R-factor down to 51.3% and increasing the CC to 0.22. The molecular packing was assessed on a graphics workstation, which revealed no clashes between the placed search probes. However, a very large amount of empty space was present. The packing showed that the asymmetric unit contained two half dimers, each forming a dimer with another monomer in a neighboring unit cell. The two cores in the asymmetric unit were related by κ=73° around an axis tilted 15.5° off the crystallographic a axis lying in the a,c plane.
Iterative Model Building and Two-fold Averaging
Initial Electron Density Map: A 2m|Fobs| - D|Fcalc| SigmaA weighted map (Read, 1986) was calculated using |Fcalc|'s and phases from the molecular replacement solution. The map was contoured at 1σ and showed good density for most of the core. Density emerged for many side chains where the input model residue had been an Ala, indicating that the molecular replacement solution was correct.
First Model Built: The two rotated and translated search probes formed the starting point for model building of the PPCA precursor. The non-crystallographic symmetry (NCS) matrix was determined between the two cores using the "Lsq_explicit" option in the computer program O (Jones et al., 1991). Subsequently a 'best monomer' was built by superimposing the electron densities from each monomer core, and adjusting the model accordingly. Residues were only incorporated in the model where the electron density was visible for the complete side chain. Residues from the search model for which no density was visible were removed. An alanine was built in the model at places where electron density for a side chain was partial. In this manner 294 residues, i.e., 65% of the Cα atoms, were built in the 'best monomer' core. The second monomer was generated from the 'best monomer' model using the NCS operator relating the two monomers in the asymmetric unit. At this point the data set was partitioned into a working set and a test set consisting of 5% of the reflections between 8-2.2 Å to monitor the Rfree (Brunger et al., 1992b). The working data set was used for rigid body and positional refinement. For averaging and map calculations the unpartitioned data set was used. Twenty-five cycles of refinement using the two 'best monomer cores' positioned in the asymmetric unit as rigid bodies and data from 8.0-3.0 Å resulted in an R-factor of 53.5% for this resolution range. The atomic coordinates of this partial model were used to calculate a new 2m|Fobs| - D|Fcalc| SigmaA weighted map, which we called the 'best monomer map'.
Averaging: Search for Missing Density: The phasing power from the rigid body refined 'best monomer cores', consisting of 294 residues per core, was insufficient to bring back interpretable electron density for the missing part of the model, 158 residues per monomer. To overcome this a 'bootstrapping' procedure was applied, entailing density averaging using RAVE (Kleywegt & Jones, 1994a) and model expansion. The 'best monomer map' and the rigid body refined 'best monomer cores' served as the starting point for this procedure.
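Generating the second monomer from the 'best monomer' amounts to applying the NCS operator, a rotation matrix plus a translation vector, to every atom. The sketch below illustrates this with plain Python; it is not the procedure used in O or RAVE, and the rotation axis, angle and translation shown are hypothetical placeholders rather than the operator determined for pPPCA.

```python
import math


def rotation_about_z(kappa_deg):
    """3x3 rotation matrix for a rotation of kappa degrees about the z axis."""
    k = math.radians(kappa_deg)
    c, s = math.cos(k), math.sin(k)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_ncs(coords, rotation, translation):
    """Map each (x, y, z) coordinate to rotation * coordinate + translation."""
    moved = []
    for point in coords:
        moved.append(tuple(
            sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return moved


if __name__ == "__main__":
    best_monomer = [(1.0, 2.0, 3.0), (4.0, 0.0, -1.0)]  # hypothetical atom positions
    operator_r = rotation_about_z(180.0)                 # hypothetical operator
    operator_t = (10.0, 0.0, 0.0)
    print(apply_ncs(best_monomer, operator_r, operator_t))
```

The same operator, applied to map grid points instead of atoms, is what relates the two copies of the density that are averaged in each bootstrapping cycle.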
Six bootstrapping cycles were carried out, called bmc1 through bmc6, allowing for the model to be extended in stepwise increments. Figure 16 shows a scheme of the steps incorporated in one bootstrapping cycle. After a cycle in which the model had undergone major expansion, a new molecular mask was calculated with MAMA (Kleywegt & Jones, 1994b) for use in the subsequent bootstrapping cycle. No phase recombination was applied between bootstrapping cycles. At the end of each cycle the inverted phases αinv and inverted amplitudes Finv were discarded. The NCS operator was re-optimized after cycle bmc3. The resolution range of the data included in the bootstrapping cycle started with 15-3.0 Å for bmc1 and was gradually extended to 15-2.7 Å in bmc6. The bootstrapping procedure is summarized in Table 2. To optimize the bootstrapping procedure, consideration was given to the molecular mask used in the averaging, the model building strategy and the refinement procedure.
Molecular masks: Four different masks were constructed in total. The atomic radius of all atoms was set to 4 Å to calculate each mask. The masks were then manually modified using mask editing options in O (Jones et al., 1991). Mask 1 was constructed around the 'best monomer core'. Subsequently it was greatly enlarged by multiple blocks of 10-15 Å³ in the regions where the model was incomplete (Figure 17). This was crucial to prevent the density in the insertion areas from being flattened during the averaging step. Approximately one half of the dimer interface was estimated to be formed by regions from the missing cap domain. Major expansions of the mask in this area were made to accommodate this. This resulted in a serious overlap problem when the mask was duplicated to cover a complete dimer. The mask was reduced where overlap occurred with the "overlap_trim" option of MAMA. After several bootstrapping cycles, newly incorporated polypeptide fragments were carefully assigned to one of the two monomers forming the dimer, and the mask at the dimer interface areas was manually adjusted accordingly. Essentially the masks were kept far too large in regions where the model was missing in order to avoid erroneous flattening of electron density. In contrast, the masks were tightened around the areas of the molecule where the model was complete.
Model Building: A conservative model building strategy was adopted. Initially only side chains were mutated in the core region to fit the PPCA amino acid sequence and, where the density was clear, poly-alanine fragments were built in the insertion areas (loops and the cap domain). Newly included atoms were given a B-factor of 20 Å². Only once models bmc5 and bmc6 were obtained was the electron density of sufficient quality to allow side chains to be incorporated confidently in the cap domain (residues 190-303). At this stage the Cα trace was virtually complete for the whole dimer and the sequence could be fit unambiguously.
Refinement: Positional refinement was postponed until after 3 cycles of bootstrapping, resulting in a final model containing 91% of the Cα atoms. Forty steps of positional refinement were then carried out to improve the geometry of the model. Subsequently only one of the refined monomers was taken and the other generated using NCS operators. The rationale for delaying the positional refinement is addressed in the discussion.
Completing the model: deviations from two-fold symmetry. It was possible to add 148 residues and 185 side chains per monomer after a total of 6 bootstrapping cycles. At this stage, each subunit contained 442 residues and 413 side chains, i.e., 98% of the Cα and 91% of the side chain atoms. The gradual model expansion as a function of the bootstrapping cycle is shown in Figure 18.
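The completeness figures can be cross-checked from the residue counts given in the text. The sketch below assumes that the monomer total is the 294 residues of the 'best monomer core' plus the 158 residues reported as missing at that stage (452 residues); the function name is illustrative.

```python
# Residue counts per monomer, taken from the text.
CORE_RESIDUES = 294      # built in the 'best monomer' core
MISSING_AT_START = 158   # missing before bootstrapping
ADDED_BY_BOOTSTRAP = 148 # added over six bootstrapping cycles

TOTAL_RESIDUES = CORE_RESIDUES + MISSING_AT_START  # assumed monomer total: 452


def completeness(n_built, n_total=TOTAL_RESIDUES):
    """Fraction of residues (C-alpha atoms) present in the model."""
    return n_built / n_total


if __name__ == "__main__":
    built_after = CORE_RESIDUES + ADDED_BY_BOOTSTRAP  # 442 residues per subunit
    print(f"initial core: {completeness(CORE_RESIDUES):.0%}")
    print(f"after bootstrapping: {completeness(built_after):.0%}")
    print(f"still missing per asymmetric unit: {2 * (TOTAL_RESIDUES - built_after)}")
```

Under this assumption the numbers are self-consistent: 294 of 452 residues is about 65%, 442 of 452 is about 98%, and the remaining 10 residues per monomer correspond to twenty residues per asymmetric unit.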
Twenty residues were still missing in the asymmetric unit at this stage. These were localized to two stretches per monomer (260-262 and 287-292). With most of the scattering mass incorporated, the monomers from model bmc6 were refined individually with X-PLOR (Brunger, 1992a) in an attempt to retrieve electron density for the still missing residues. After 40 steps of positional refinement using data from 8.0-2.6 Å, the R-factor dropped significantly from 40.2% to 33.2%. The model was further positionally refined using a full weight WA on the crystallographic term. The data included in the refinement were gradually extended to 2.2 Å. At 2.4 Å resolution individual B-factors were refined and the distribution checked as a function of atom location (i.e., low B-factors in the core and high B-factors on the surface). Cycles of refinement and refining allowed for 18 missing residues to be added. Essentially almost the complete cap domain was retrieved using the bootstrapping procedure, as shown in Figure 19. It became apparent from the refined maps that the two stretches of missing amino acids adopted a very different conformation in the two monomers (with as much as an average r.m.s.d. of 7.9 Å for the Cα's of residues 287-292). For this reason electron density for these regions had not been retrieved in the two-fold averaging process. The stepwise improvement of the electron density maps along with averaging, model expansion and refinement is shown in Figure 6.
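The r.m.s.d. quoted for the Cα atoms of the two monomers is the root-mean-square distance between equivalent atoms after the monomers have been superimposed. The sketch below computes that quantity for two coordinate lists that are assumed to be already superimposed and listed in the same atom order; the coordinates used are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    monomer_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]  # hypothetical C-alpha positions
    monomer_2 = [(3.0, 4.0, 0.0), (1.5, 0.0, 0.0)]
    print(f"r.m.s.d. = {rmsd(monomer_1, monomer_2):.2f}")
```

Because the squared distances are averaged, a few strongly displaced residues dominate the value, which is why a short loop that differs between monomers can give a large r.m.s.d. even when the cores agree closely.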
The program ARP was used to check our model, in particular the region at the dimer interface (Lamzin & Wilson, 1993). Prior to the final round of positional refinement, an |Fobs|/σ cutoff was applied to reject the weakest 10% of the data, and an anisotropic scale factor was applied to offset the decreased resolution along the crystallographic a axis. The final model is of good geometry with a final R-factor of 21.3% (Rfree of 26.8%) for data between 8.0 and 2.2 Å (see Table 3). A Ramachandran plot is given in Figure 21. The r.m.s. coordinate error is 0.282 Å as calculated by SigmaA (Read, 1986). The average phase difference between the initial molecular replacement model and the currently refined model is calculated to be 71° for data between 10-2.2 Å.
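The R-factor and Rfree values reported above are the same statistic computed on different reflections: the working set used in refinement and a test set held out from it. The sketch below shows the conventional definition, R = Σ| |Fobs| - k|Fcalc| | / Σ|Fobs|. The deterministic every-20th split is an assumption made here for reproducibility (a test set is normally chosen at random), and all amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Crystallographic R-factor between observed and calculated amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def split_working_test(n_reflections, test_every=20):
    """Return (working, test) index lists; every 20th reflection gives a 5% test set."""
    test = [i for i in range(n_reflections) if i % test_every == 0]
    working = [i for i in range(n_reflections) if i % test_every != 0]
    return working, test


if __name__ == "__main__":
    f_obs = [100.0, 200.0, 300.0]   # hypothetical observed amplitudes
    f_calc = [110.0, 190.0, 300.0]  # hypothetical calculated amplitudes
    print(f"R = {r_factor(f_obs, f_calc):.3f}")
```

Since the test reflections never enter refinement, a gap between the two values, as in the 21.3% versus 26.8% reported here, gives an indication of how much the model has been fitted to noise in the working set.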
The structure determination of PPCA is special in that two-fold averaging could be applied to refine very poor molecular replacement phases, enabling us to retrieve electron density for 148 residues and 185 side chains per monomer. In total 314 complete residues were added per asymmetric unit, equivalent to about 35 kDa of protein. In retrospect we feel that a number of factors contributed to a successful structure determination.
Crystal Packing. Each monomer in the crystal is interacting with four non-crystallographically related monomers. By far the most extensive contact is with a non-crystallographically related monomer generating the physiological dimer. Three additional contacts are extensive crystal contacts ranging from 200-800 Å² averaged per monomer. The largest non-dimer crystal contact involves the precursor loops from two crystallographically independent monomers (regions 265-267 and 281-295 from monomer 1 with residues 281-293 from monomer 2) making intimate contact with each other. Summed together these loops create an intermolecular buried surface of 1680 Å². We believe that this stabilizes an otherwise very flexible area, possibly explaining the good diffraction qualities of the P2₁2₁2 crystals.
It is also in this crystal contact that we find deviating spatial conformation and secondary structure between the two monomers, as mentioned before. The electron density in this region is of very good quality, with average temperature factors of 16.6 Å² for main chain and 18.3 Å² for side chains.
pPPCA and the Hydrolase Family. The fold of pPPCA belongs to the large hydrolase fold family containing enzymes such as the serine carboxypeptidases, dehalogenase, various lipases and acetylcholine esterase (Ollis et al. (1992), infra), having various different catalytic functions. Though the central core is the same (a central β-sheet flanked by α-helices on both sides), the proteins in this family all seem to have different 'cap' domains, with respect to both fold and size (Figure 7A-F). pPPCA has one of the largest cap domains, comprising 121 residues forming the three-helical bundle of the helical subdomain and a three-stranded β-sheet of the maturation subdomain.
Major Differences and Comparison With the Serine Carboxypeptidases. The overall fold of the pPPCA monomer is similar to that of the wheat and yeast serine carboxypeptidases (Endrizzi et al. (1994), infra; Ollis et al. (1992), infra). The complete core domains of pPPCA and CPW superimpose with an r.m.s. deviation of 1.7 Å for 302 Cα atoms and 38% sequence identity. Deleting major deviating loops from the core domain allows for pPPCA to superimpose with an r.m.s. deviation of 1.2 Å onto CPW and CPY (293 equivalent Cα's with 40% sequence identity for CPW/pPPCA and 271 equivalent Cα's for CPY/pPPCA with 42.2% identity).
The cap domain in pPPCA differs significantly from the CPW and CPY counterparts. The pPPCA structure reveals a large maturation subdomain not present in the structures of CPW and CPY, for which the structures of the enzymatically active forms are known. All three enzymes contain a three-helical bundle in the cap domain. The sequence identity between the three proteins in this region is very low (ca. 12%). In contrast, PPCA shows a much greater deviation. Hα1 superimposes reasonably well with the CPW counterpart, maintaining the same general orientation with respect to the core domain (requiring a rotation of only 7.4°). But helices Hα2 and Hα3 have undergone major rotations with respect to Hα1 and the core domains, by κ = 28.5° and κ = 93.4°, respectively (Figure 8A).
Due to the integral role of the cap domain in forming the dimer interface, the dimers of PPCA and CPW were compared. In the pPPCA and CPW dimers the monomers are oriented differently with respect to each other.
Superposition of the core domain of one monomer from each dimer shows that the second pair of monomers (forming the respective dimers) differ by a remarkable 15° in orientation (Figure 8B). Thus, it appears that the extensive differences in the cap domains lead to a different arrangement of the subunits in the dimers of PPCA and CPW.
Catalytic Triad and Enzymatic Mechanism. Our structure shows that the precursor PPCA has all the elements proposed for the enzymatic machinery of the serine carboxypeptidase family (Liao et al. (1992), infra; Endrizzi et al. (1994), infra), and is the third structure elucidated belonging to this family of enzymes, after CPW and CPY. The catalytic triad in the active site of pPPCA is formed by residues Ser 150, His 429 and Asp 372. The Oγ of Ser 150 forms a good hydrogen bond with the Nε2 of His 429, with an N to O distance of 2.8 Å. The Nδ1 of His 429 is 2.7 Å removed from the Oδ2 and 3.3 Å from the Oδ1 of Asp 372. Further, two backbone amides appear to orient the carboxylate group of Asp 372. The N of Ala 374 is at a distance of 3.0 Å to the Oδ1 of Asp 372 and the N of Cys 375 is at a distance of 2.9 Å to the Oδ2 of Asp 372.
The oxyanion hole proposed to stabilize the negatively charged tetrahedral intermediate in serine carboxypeptidases is formed by the backbone amides of Gly 57 and Tyr 151 in PPCA. The 32 atoms of the catalytic triad residues plus the oxyanion hole amides from PPCA, CPY and CPW superimpose with an r.m.s. deviation of 0.4 Å, indicating the very high degree of structural similarity of the active site in the PPCA precursor with those in the fully active enzymes CPY and CPW (see Table 4). The carboxylate of Asp 372 and the imidazole of His 429 in PPCA are non-planar, making an angle of approximately 60° between the imidazole and the carboxylate. A similar non-planarity has been observed in CPW and CPY, in contrast to the planar orientation found in subtilisin- and trypsin-type serine proteases (McPhalen et al., Biochemistry 27:6582-6598 (1988)). In pPPCA, a pair of glutamic acid residues (Glu 69 and Glu 149) is positioned near the catalytic triad, with their carboxylate groups interacting with each other. The carboxylate groups are located at approximately 8 Å from the Oγ of Ser 150, and lie at the bottom of the active site. An asparagine (Asn 55) is oriented such that it forms a hydrogen bond to each of the two carboxylate groups of the glutamic acid pair, at Nδ2 (Asn) to Oε1/Oε2 (Glu) distances of 3.0 and 3.6 Å, respectively. In addition the two carboxylates interact with each other via hydrogen bonds. This configuration of two glutamic acid residues and an asparagine is conserved between pPPCA, CPW and CPY (see Table 4), and has been implicated in regulating the low pH optimum for the carboxypeptidase activity found in the serine carboxypeptidases (Liao et al. (1992), infra).
Biochemical data have suggested that a functional group with an apparent pKa value of 5.5 functions to bind the C-terminal carboxylate group of peptide substrates and is responsible for the observed pH optimum of 5.5 (reviewed in Breddam et al. (1986), infra; Rawlings & Barrett (1994), infra). Together with their structural data, Liao and colleagues (Liao et al. (1992), infra) have suggested that at pH 5.5 or below, one or both glutamates must be uncharged, while at a pH higher than 5.5 one or both of the carboxylates, which are oriented opposite to each other, may become deprotonated, resulting in unfavorable electrostatic interactions. This would disturb the hydrogen bonding pattern or result in structural perturbations, causing the observed increase in Km for peptide substrates at high pH. In pPPCA the orientation of this pair of glutamic acids as well as that of the asparagine is essentially identical in structure to the equivalent residues in CPW and CPY (see Table 4), even though the structure has been determined at pH 8. The CPW and CPY structures have been determined at pH 5.7 and at pH 6.5-7.0. Thus, our structure appears to rule out large pH-induced conformational changes of these three residues at least up to a pH value 2.5 units above that optimal for carboxypeptidase activity. However, the high degree of conservation of these residues does indicate some role in a characteristic shared by all three enzymes. From our comparison it is clear that the enzymatic machinery in the PPCA precursor form is in a conformation virtually identical to that found in the fully active CPW and CPY enzymes. On this basis, the conformation of the enzymatic machinery found in pPPCA is expected to faithfully represent the conformation that will be found in the active PPCA.
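The pH argument above can be made quantitative with the Henderson-Hasselbalch relation. The sketch below treats the group with apparent pKa 5.5 as a single, independent titrating site, which is a simplifying assumption: two interacting carboxylates, as in the Glu pair, would not titrate independently. The function name is illustrative.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of an acidic group in the deprotonated (charged) form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    pka = 5.5  # apparent pKa quoted in the text
    for ph in (5.5, 5.7, 6.5, 8.0):
        print(f"pH {ph}: {fraction_deprotonated(ph, pka):.3f} deprotonated")
```

Under this simplified model the group is half deprotonated at pH 5.5 and more than 99% deprotonated at pH 8, the pH at which the pPPCA structure was determined, which is why the unchanged Glu/Asn geometry at pH 8 is informative.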
Active Site, Substrate Specificity. PPCA has a substrate preference for hydrophobic residues in the P1 and/or P1' binding pockets (Jackman et al., Hypertension 21:925-928 (1993)). In CPW the P1' pocket was identified to consist of two tyrosine residues (Tyr 60 and Tyr 239) which form a long channel, capped by two acidic residues (Glu 272 and Glu 398) at the end (Liao et al. (1992), infra). This explains the highest preference of this enzyme for Arg and Lys as the leaving group (Breddam et al., Carlsberg Res. Commun. 52:297-311 (1987)). In CPY a similarly shaped pocket is formed by the residues Thr 60, Tyr 256, Leu 272 and Met 398 (Endrizzi et al. (1994), infra). In PPCA the analogous residues are Tyr 247 and Asp 64, forming the sides of the pocket, with Met 430 and Thr 304 at the far end. This is reasonably consistent with an overall preference of PPCA for a hydrophobic leaving group.
Inactivation Mechanism of the Precursor Form. During the maturation step of the PPCA precursor form, at most residues 285-298, forming the 'excision' peptide, are removed by one or more as yet unidentified proteases. In vitro, the maturation event can be mimicked by digestion with trypsin, probably utilizing positions Arg 284, as well as Arg 292 and/or Arg 298. The residues forming the 'excision' peptide adopt distinctly different conformations in the two crystallographically distinct monomers forming the PPCA dimer in our crystal structure. Yet in both monomers this polypeptide region extends out from the protein surface and is virtually completely solvent and protease accessible (Figure 9). Arg 284 and Arg 292 are particularly well exposed. The main chain atoms of Arg 298 are less accessible, being sandwiched between the strand Mβ2 and a loop N-terminal to helix Cα6, while a salt bridge with Glu 264 renders the side chain atoms of Arg 298 partially solvent inaccessible.
The active site cleft is blocked by numerous residues from the maturation subdomain in the precursor form of PPCA. The catalytic triad is rendered solvent inaccessible by residues Asn 275, Ile 276 and Phe 277. These residues are part of the polypeptide Asp 272-Phe 277, which we call the 'blocking' peptide. This peptide is held down predominantly by hydrophobic contacts of Leu 273, Ile 276, and Phe 277 to the core domain residues Gly 57, Cys 60, Leu 180, Leu 190, Val 191, Leu 232, Val 235, Ile 246, Leu 280, Leu 282, Met 299 and Ala 373 (Fig 10). In addition, residue Asn 275 of the blocking peptide appears to fill what might be part of the P1 binding pocket in the mature form. Further inspection of the blocking peptide suggests that Gly 274, with Ramachandran angles φ = 66° and ψ = 28°, might play a central role in the strand blocking the active site. A glycine at this position appears critical to allow the polypeptide chain to adopt a conformation with its main chain at a safe distance from the catalytic triad. This might aid in allowing the blocking peptide to assume a conformation resistant to autocatalysis. The P1' binding pocket seems to be beautifully filled by Pro 301 interacting with Thr 304, Tyr 247, Cys 60 and Cys 334. Thus substrate binding is not possible in the precursor form due to the inaccessibility of the substrate binding pockets. We conclude that the inactivation mechanism of PPCA is based on blocking of the active site, and not upon changes in the position of functional groups involved in catalysis/transition state stabilization. The P1, P2 and P1' binding pockets are all rendered solvent inaccessible. The function of the blocking peptide seems to be to render the catalytic triad as well as the region around the P1 and P2 binding pockets solvent inaccessible. The blocking peptide, however, does not assume a conformation that a peptide substrate would adopt.
It is carefully positioned in a manner which is different from that of a productive substrate, thereby avoiding being cleaved by the nearby catalytic residues, which are correctly poised for catalysis. A crucial observation is that the excision peptide itself does not bind in the active site cleft. Hence, removal of the excision peptide alone is not sufficient to allow solvent or substrate access to the active site.
Proposed Maturation Event and Extent of Conformational Rearrangement. The active site of the precursor of PPCA appears to be fully blocked by 49 residues of the maturation subdomain, as shown in Figure 11. Based on the precursor structure and the comparison with CPW and CPY, it is proposed that a region comprising approximately residues 254-284 rearranges to free the P1 and P2 binding sites, while residues 299-302 rearrange to free the P1' binding pocket. The linker connecting these two segments of polypeptide chain is the 14 amino acid excision peptide Met 285-Arg 298. The extent of the rearrangement is likely to be limited by a disulfide bridge between Cys 253 and Cys 303, which is conserved in the serine carboxypeptidase family. This critical disulfide serves to keep the secondary structure elements together at the far end of the P1' pocket.
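The residue counts quoted for the maturation subdomain are mutually consistent, as a quick tally shows. The sketch below uses only the residue ranges given in the text (the first of which the text itself describes as approximate); the variable and function names are illustrative.

```python
# Inclusive residue ranges quoted in the text.
SEGMENTS = {
    "P1/P2 rearranging region": (254, 284),  # "approximately residues 254-284"
    "excision peptide": (285, 298),          # Met 285-Arg 298
    "P1' rearranging region": (299, 302),
}

def segment_length(first, last):
    """Number of residues in the inclusive range first..last."""
    return last - first + 1

lengths = {name: segment_length(*bounds) for name, bounds in SEGMENTS.items()}
total_blocking = sum(lengths.values())
rearranging = total_blocking - lengths["excision peptide"]

print(lengths)
print(total_blocking, rearranging)
```

With these ranges the tally is 31 + 14 + 4 = 49 residues, matching the 49 residues said to block the active site, and 49 - 14 = 35 residues outside the excision peptide, matching the "up to 35 residues" that must rearrange after excision.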
An interesting pair of salt bridges is observed between Arg 262, Asp 300, Glu 264 and Arg 298, four residues located on strands Mβ1 and Mβ3 of the mixed β-sheet found in the maturation subdomain. This cluster of residues is strategically positioned at the base of the excision peptide, close to the core domain and 'shielding' the mixed β-sheet via side chain interactions (see Figure 11). These residues are strictly conserved among the human, mouse and chicken PPCAs (Galjart et al. (1991), infra). This charge cluster may be affected by a shift from neutral to acidic pH. Arrival in the endosome/lysosome is expected to result in protonation of either the Asp or the Glu residue or both, resulting in unfavorable electrostatic interactions and destabilization of this charge cluster. This in turn is expected to promote partial unfolding of the maturation subdomain, allowing easier access to additional potential cleavage sites and stimulating removal of the 'blocking' peptide which fills the active site in the precursor.
A similar double salt bridge has been observed in the aspartic proteinase zymogen pepsinogen between the proenzyme segment (Arg 8P) and the enzyme (Arg 308, Glu 13, Asp 304).
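The pH dependence proposed for the Arg 262/Asp 300/Glu 264/Arg 298 charge cluster can be made semi-quantitative with the Henderson-Hasselbalch relation, which gives the protonated fraction of an acidic group as 1/(1 + 10^(pH - pKa)). The sketch below is an illustration only: the pKa values are generic textbook values for free Asp and Glu side chains, and the pH values stand in for a neutral compartment and an acidic lysosome. None of these numbers comes from the source, and the actual pKa of a carboxylate engaged in a salt bridge can differ substantially.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of an acidic group in its protonated (neutral) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Assumed values for illustration, not taken from the source.
ASSUMED_PKA = {"Asp": 3.9, "Glu": 4.3}
ASSUMED_PH = {"neutral": 7.0, "lysosomal": 4.5}

for residue, pka in ASSUMED_PKA.items():
    for label, ph in ASSUMED_PH.items():
        frac = fraction_protonated(ph, pka)
        print(f"{residue} at {label} pH {ph}: {frac:.3f} protonated")
```

Under these assumptions the protonated fraction of each carboxylate rises by roughly two orders of magnitude on going from neutral to lysosomal pH, which is the qualitative trend the destabilization argument relies on.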
The maturation mechanism for pPPCA appears to be novel among proteases for which the three-dimensional structure of the zymogen is known. The catalytic triad in the precursor form is in a catalytically competent conformation. Enzymatic activity is prevented by a 'blocking' peptide. The blocking peptide is, however, different from the excision peptide and is not excised from the mature enzyme. This leads to a distinct difference from the other known maturation mechanisms: after removal of the excision peptide, up to 35 residues filling the active site cleft in the PPCA precursor must rearrange to render the catalytic triad solvent accessible (see Figure 12), but are not cleaved off. Removal of the excision peptide, and possibly a shift to lower pH in the endosome/lysosome, appears to be a trigger for this event. The mechanism does not appear to be autocatalytic, as uptake experiments with cultured galactosialidosis fibroblasts have shown that a mutant PPCA with the catalytic Ser 150 mutated to Ala is properly targeted and processed. It retains its protective function and, except for the loss of catalytic activity, is biochemically indistinguishable from the wild type enzyme (Galjart et al. (1991), infra). Surprisingly, the maturation mechanisms of the serine carboxypeptidases PPCA, CPW and CPY may all differ from each other as well. This is clearest for CPY, in which a 91 residue polypeptide is cleaved off N-terminally to convert the zymogen to an active enzyme (Winther and Sorensen, Proc. Natl. Acad. Sci. USA 88:9330-9334 (1991)), as opposed to the excision of a peptide from within the zymogen generating a two chain active form, as is the case for PPCA and CPW.
Looking at the hydrolase fold family, the catalytic triad is housed in the core domain and the various cap domains attenuate the biological function by influencing entirely different properties such as: (i) enzyme kinetics, exemplified by the interfacial activation of lipases (Smith et al., Curr. Opinion in Structural Biology 2:490-496 (1992)); (ii) substrate channeling, as is proposed for acetylcholine esterase (Sussman et al. (1991), infra); (iii) substrate recognition, proposed for dehalogenase by Franken et al. ((1991), infra) and for CPY and CPW by Endrizzi et al. ((1994), infra); and (iv) enzyme inactivation in the case of PPCA.
Biological Implications. Deficiency of the protective protein/cathepsin A (PPCA) in humans results in the lysosomal storage disease galactosialidosis. PPCA is thought to form a multi-enzyme complex with β-galactosidase and neuraminidase in the lysosomes, protecting the latter glycosidases in their harsh acidic and protease-rich environment. PPCA has 30% sequence identity to the wheat serine carboxypeptidase (CPW) and yeast serine carboxypeptidase (CPY). It has been shown that PPCA in the precursor form is inactive, but upon maturation, entailing excision of a 2 kDa peptide, carboxypeptidase activity is released.
The precursor structure reveals an inactivation mechanism that has not been seen before in any of the other known zymogen structures of proteases (available for the serine, metallo- and aspartic protease classes). The catalytic triad seems to have an arrangement poised for catalysis. However, the triad is rendered solvent and substrate inaccessible by a strand from the maturation subdomain binding in the active site cleft. Surprisingly, this strand, called the 'blocking' peptide, does not overlap with the 2 kDa 'excision' peptide. Hence, after removal of the excision peptide, up to 35 additional residues must rearrange in order to unblock the active site cleft. A strategically positioned pair of salt bridges, comprising Arg 262, Arg 298, Glu 264 and Asp 300 at the base of the excision peptide, is expected to optionally become destabilized at low pH, unraveling this region of the structure, allowing easier access to cleavage sites and/or promoting the rearrangement event.
A number of research groups are currently involved in designing enzyme and gene therapy procedures for several lysosomal storage diseases. Insight into the three-dimensional structure, protein functioning and stability of PPCA, the first enzyme of known structure associated with a lysosomal storage disease and the third human lysosomal structure to be determined, may prove useful in future designs of an adequate therapy procedure for galactosialidosis. Information from the three-dimensional structure of PPCA might also aid in designing an engineered form of PPCA with increased stability and a longer half-life.
Table 1: X-ray Data Collection Statistics
Table 2: Course of Model Building
(statistics using data between 8.0 and 3.0 Å)
| Model | nr. of Cα's | nr. of side chains | V mask (10⁴ Å³) | Rfactor (%) | Rfree (%) | CC | CCfree |
|---|---|---|---|---|---|---|---|
| mol. repl. mr1, rigid body ref. (rmr) | 331 | 125 | - | 54.2 | 55.3 | 0.243 | 0.244 |
| calculate NCS matrix | | | | 52.6 | 52.9 | 0.287 | 0.318 |
| best monomer (bm), rigid body ref. | 294 | 228 | - | 55.9 | 57.4 | 0.228 | 0.216 |
| update NCS matrix | | | | 53.5 | 55.0 | 0.320 | 0.328 |
| bmc1 (mask 1) | 373 | 258 | 10.8 | 49.9 | 51.3 | 0.403 | 0.424 |
| bmc2 (mask 1) | 405 | 277 | 10.8 | 48.6 | 48.4 | 0.443 | 0.478 |
| bmc3 (mask 2), rigid body ref. | 411 | 307 | 9.99 | 47.1 | 48.6 | 0.471 | 0.491 |
| positional ref. (pbmc3) | | | | 46.9 | 48.4 | 0.476 | 0.492 |
| update NCS matrix | | | | 39.4 | 44.7 | 0.622 | 0.562 |
| bmc4 (mask 1) | 412 | 327 | 10.8 | 41.7 | 43.1 | 0.584 | 0.585 |
| bmc5 (mask 3) | 435 | 387 | 8.88 | 39.8 | 40.6 | 0.621 | 0.623 |
| bmc6 (mask 4) | 442 | 413 | 9.11 | 38.4 | 40.2 | 0.647 | 0.637 |
Summary of the bootstrapping procedure. The resulting models are listed chronologically, starting with the molecular replacement solution, i.e. mr (molecular replacement), bm (best monomer core), and the bootstrapping cycles bmc1 through bmc6. The following statistics are given for the various models: the number of Cα atoms built per monomer; the number of correct side chains incorporated per monomer; and the volume of the molecular mask used during the averaging, if applicable. The quality of each model is assessed using the Rfactor, Rfree, CC and CCfree calculated by X-PLOR for data between 8.0 and 3.0 Å. After positional refinement of model bmc3, both monomers were made equivalent by taking one monomer and generating the non-crystallographically related one.
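The R-factor and correlation coefficient used to track model quality have simple standard definitions. The sketch below illustrates them for lists of observed and calculated structure factor amplitudes. It is a minimal illustration, not X-PLOR's implementation: it assumes the amplitudes are already on a common scale and that the correlation is computed on amplitudes. The free statistics (Rfree, CCfree) use the same formulas restricted to a test set of reflections excluded from refinement. All function names and the toy numbers are hypothetical.

```python
import math

def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|), as a fraction (multiply by 100 for percent)."""
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)

def correlation(f_obs, f_calc):
    """Pearson correlation coefficient between observed and calculated amplitudes."""
    n = len(f_obs)
    mean_o = sum(f_obs) / n
    mean_c = sum(f_calc) / n
    cov = sum((fo - mean_o) * (fc - mean_c) for fo, fc in zip(f_obs, f_calc))
    var_o = sum((fo - mean_o) ** 2 for fo in f_obs)
    var_c = sum((fc - mean_c) ** 2 for fc in f_calc)
    return cov / math.sqrt(var_o * var_c)

def split_by_flag(values, free_flags):
    """Separate values into (working set, free set) using boolean test-set flags."""
    work = [v for v, is_free in zip(values, free_flags) if not is_free]
    free = [v for v, is_free in zip(values, free_flags) if is_free]
    return work, free

# Toy amplitudes, invented for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 10.0]
f_calc = [90.0, 85.0, 50.0, 45.0, 25.0, 12.0]
free_flags = [False, False, True, False, False, True]

obs_work, obs_free = split_by_flag(f_obs, free_flags)
calc_work, calc_free = split_by_flag(f_calc, free_flags)

r_work = r_factor(obs_work, calc_work)
r_free = r_factor(obs_free, calc_free)
cc_work = correlation(obs_work, calc_work)
```

A falling R-factor together with a rising correlation coefficient, on both the working and the free reflections, is the pattern the bootstrapping cycles in Table 2 are meant to show.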
Table 3: Current Status of the Model
Statistics for the data used in refinement:
| Resolution (Å) | Rfactor (%) | Completeness (%) |
|---|---|---|
| 8.0 - 4.3 | 22.4 | 85.7 |
| 4.3 - 3.5 | 19.0 | 89.1 |
| 3.5 - 3.0 | 20.6 | 89.1 |
| 3.0 - 2.8 | 21.3 | 87.9 |
| 2.8 - 2.6 | 22.3 | 86.1 |
| 2.6 - 2.4 | 22.2 | 84.0 |
| 2.4 - 2.3 | 22.7 | 81.3 |
| 2.3 - 2.2 | 24.0 | 78.3 |
| 8.0 - 2.2 | 21.3 | |
Model:
| Parameter | Value |
|---|---|
| molecules in the asymmetric unit | 2 |
| residues (out of 904 possible) | 902 |
| sugars | 6 |
| waters | 296 |
| r.m.s.d. bond length (Å) | 0.012 |
| r.m.s.d. bond angles (°) | 1.72 |
| average B-value for main chain atoms (Å²) | 16.6 |
| average B-value for side chain atoms (Å²) | 18.3 |
Table 4
Superposition of the proposed catalytic machinery of the serine carboxypeptidases with known
Ausubel et al., eds., Current Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley Interscience, N.Y. (1987, 1992, 1993, 1994)
Aymard-Henry et al., Bulletin of the World Health Organization 48:199-202 (1973)
Bailey et al., Improving Protein Phases, Proceedings of the CCP4 Study Weekend (1988)
Baldwin et al., Proc. Natl. Acad. Sci. USA 90:6796-6800 (1993)
Bousse et al., Virology 204:506-514 (1994)
Breddam et al., Carlsberg Res. Commun. 57:83-128 (1986)
Breddam et al., Carlsberg Res. Commun. 52:297-311 (1987)
Brünger et al., J. Mol. Biol. 203:803-816 (1987)
Brünger, A.T., Acta Cryst. A46:46-57 (1990)
Brünger, X-PLOR: Version 3.1, "A system for X-ray crystallography and NMR," Yale University Press, New Haven, CT (1992)
Chong et al., Biochim. Biophys. Acta 7077:65-71 (1991)
Chong et al., Eur. J. Biochem. 207:335-343 (1992)
Colligan et al., eds., Current Protocols in Immunology, Greene Publishing Assoc. and Wiley Interscience, N.Y. (1992, 1993)
Cowtan and Main, Acta Crystallogr. D49:148-157 (1993)
Creighton, Catalysis in Proteins Structures and Molecular Principles, W.H. Freeman and Company (1984), pp. 439-443
Crennell et al., Structure 2:535-544 (Jun. 1994)
d'Azzo et al., Proc. Natl. Acad. Sci. U.S.A. 79:4535-4539 (1982)
d'Azzo et al., "Galactosialidosis," in The Metabolic and Molecular Bases of Inherited Disease, Scriver et al., eds., McGraw Hill Inc., New York (1994), pp. 2785-2832
Endrizzi et al., Biochemistry 33:11106-11120 (1994)
Engh and Huber, Acta Cryst. A47:392-400 (1991)
Franken, S.M., et al., J. EMBO 70:1297-1302 (1991)
Fujinaga and Read, J. Appl. Cryst. 20:517-512 (1987)
Galjart et al., J. Biol. Chem. 266:14754-14762 (1991)
Goodford, J. Med. Chem. 25:849-857 (1985)
Guasch, A., et al., J. Mol. Biol. 224:141-157 (1992)
Hanna et al., J. Immunol. 153:4663-4672 (1994)
Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory (1988)
Henderson, Biochem. J. 727:321-333 (1972)
Hoogeveen et al., J. Biol. Chem. 255:12143-12146 (1983)
Hubbes et al., J. Biochem. 285:827-831 (1992)
Itoh et al., J. Biol. Chem. 270:515-518 (1995)
Jackman et al., J. Biol. Chem. 265:11265-11272 (1990)
Jackman et al., J. Biol. Chem. 267:2872-2875 (1992)
Jackman et al., Hypertension 27:925-928 (1993)
James and Sielecki, Nature 319:33-38 (1986)
Jones et al., Acta Crystallogr. A47:110-119 (1991)
Jones et al., Acta Crystallogr. A47- 53-770 (1991)
Kabsch and Sander, Biopolymers 22:2577-2637 (1983)
Kabsch, J. Appl. Crystallogr. 26:795-800 (1993)
Kase et al., Biochem. Biophys. Res. Commun. 172:1175-1179 (1990)
Kaufman et al., eds., Handbook of Molecular and Cellular Methods in Biology and Medicine, CRC Press, Inc., Boca Raton (1995)
Kleywegt & Jones, in Bailey et al., eds., First Map to Final Model, SERC Daresbury Laboratory, UK, pp. 59-66 (1994)
Kohler and Milstein, Nature 256:495-497 (1975)
Kraulis, J. Appl. Cryst. 24:946-950 (1991)
Laskowski et al., J. Appl. Cryst. 26:283-291 (1993)
Lamzin and Wilson, Acta Cryst. D49:129-147 (1993)
Laver, Virology 56:78-87 (1978)
Leatherbarrow, Trends Biochem. Sci. 75:455-458 (1990)
Liao et al., Biochemistry 37:9796-9812 (1992)
Lüthy et al., Nature 356:83-85 (1992)
Matthews, B.W., J. Mol. Biol. 33:491-497 (1968)
McPhalen and James, Biochemistry 27:6582-6598 (1988)
Metcalf & Fusek, EMBO 72:1293-1302 (1993)
Molecular Replacement, "Proceedings of the CCP4 Study Weekend," Machin (1985)
Molecular Replacement, "Proceedings of the CCP4 Study Weekend," Dodson et al. (1992)
Murti et al., Proc. Natl. Acad. Sci. USA 90:1523-1525 (1993)
Musil et al., EMBO 70:2321-2330 (1991)
Nicholls, A., et al., Proteins 77:281-296 (1991)
Noble et al., FEBS Lett. 337:123-128 (1993)
Oho, B-H, Acta Cryst. D51:140-144 (1995)
Okamura-Oho, Y., et al., Biochim. Biophys. Acta 7225:244-254 (1994)
Ollis et al., Protein Eng. 5:197-211 (1992)
Potier et al., Analyt. Biochem. 94:287-296 (1979)
Potier et al., J. Biochem. 267:197-202 (1990)
Pshezhetsky et al., Biochem. Biophys. Acta 1122:154-160 (1992)
Pshezhetsky et al., Biochemistry 34:2431-2440 (1995)
Rao et al., Acta Cryst. A36:878-884 (1980)
Rawlings & Barrett, Methods in Enzymology 244:19-61 (1994)
Read, R.J., Acta Crystallogr. A42:140-149 (1986)
Rossmann, "Improving Protein Phases," Proceedings of the CCP4 Study Weekend (Feb. 5-6, 1988)
Rubanyi and Polokoff, Pharmacol. Rev. 46:325-415 (1994)
Rudenko et al., Structure 3:1249-1259 (1995)
Sambrook et al., Molecular Cloning: A Laboratory Manual, Second edition, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989)
Sawyer et al., eds., Proceedings of CCP4 Study Weekend, pp. 56-62, SERC Daresbury Lab., UK (1993)
Scheibe et al., Biomed. Biochim. Acta 49:547-556 (1990)
Scriver, C.R., et al., eds., The Metabolic and Molecular Bases of Inherited Disease, Vol 11, 7th Ed., McGraw Hill Inc., pp. 2825-2837 (1994)
Smith et al., Curr. Opinion in Structural Biology 2:490-496 (1992)
Steigeman, W., PhD Thesis, Technical University, Munich, Germany
Sussman et al., Science 253:872-879 (1991)
Takimoto et al., J. Virol. 66:7597-7600 (1992)
Taylor et al., J. Mol. Biol. 226:1287-1290 (1992)
Tete-Favier et al., Acta Cryst. D49:246-256 (1993)
Tettamanti et al., eds., Sialidases and Sialidosis, Perspectives in Inherited Metabolic Diseases, Vol. 4, Edi. Ermes, Milano, pp. 261-279 and 379-395 (1981)
Thompson et al., J. Virol. 62:4653-4660 (1988)
Tronrud et al., Acta Crystallogr. A43:489-501 (1987)
Varghese et al., Proteins 74:327-332 (1992)
Verheijen et al., Biochem. Biophys. Res. Commun. 705:868-875 (1982)
Vriend, J. Mol. Graph. 5:52-56 (1990)
van Diggelen et al., J. Biochem. 200:143-151 (1981)
van Diggelen et al., Biochem. Biophys. Acta 703:69-76 (1982)
van Diggelen et al., Lancet 2:804 (1987)
Wahl et al., J. Nucl. Med. 24:316-325 (1983)
Wenger et al., Biochem. Biophys. Res. Commun. 52:589-595 (1978)
Winther and Sorensen, Proc. Natl. Acad. Sci. USA 88:9330-9334 (1991)
Wolf et al., eds., Isomorphous Replacement and Anomalous Scattering: Proceedings of CCP4 Study Weekend, pp. 80-86, SERC Daresbury Lab., UK (1991)
Yamamoto & Nishimura, J. Biochem. 79:435-442 (1987)
Yamamoto et al., J. Biochem. 92:13-21 (1982)
Zhou et al., J. EMBO 70:404-4048 (1991)
Claims
1. A method for crystallizing a human protective protein cathepsin A (PPCA) or precursor human protective/cathepsin A protein (pPPCA), comprising (a) providing a purified PPCA or pPPCA; (b) crystallizing the purified PPCA or pPPCA using a hanging drop or diffusion method, to provide crystallized PPCA or pPPCA having biological activity, wherein the crystallized PPCA or pPPCA is resolvable using x-ray crystallography to obtain x-ray diffraction patterns suitable for three-dimensional structure determination of the PPCA or pPPCA.
2. A method according to claim 1, wherein said PPCA or pPPCA has at least one biological activity selected from the group consisting of enzyme protecting activity, enzyme modulating activity and peptide hydrolyzing activity.
3. A method according to claim 1, wherein said crystallization step is done under conditions of purified PPCA or pPPCA; 2-30% PEG400-10,000; precipitating salt; buffers, and pH 7-9.
4. A method according to claim 3, wherein the crystallization conditions are PPCA or pPPCA; 5-14% PEG8000, 40-80 mM tromethamine, 0.05-2.0 mM NaN3 and pH 8.0-8.3.
5. A crystallized PPCA or pPPCA, or at least one subdomain thereof, provided by a method according to claim 1.
6. A method for providing an atomic model of a PPCA or pPPCA, comprising
(a) providing a computer readable medium having stored thereon atomic coordinate/x-ray diffraction data of said PPCA or pPPCA in crystalline form, said data sufficient to model the three-dimensional structure of said PPCA, said pPPCA, or at least one subdomain thereof;
(b) analyzing, on a computer using at least one subroutine executed in said computer, the atomic coordinate/x-ray diffraction data from (a) to provide data output defining an atomic model of said PPCA or said pPPCA, said analyzing utilizing at least one computing algorithm selected from the group consisting of data processing and reduction, auto-indexing, intensity scaling, intensity merging, amplitude conversion, truncation, molecular replacement, molecular alignment, molecular refinement, electron density map calculation, electron density modification, electron map visualization, model building, rigid body refinement, positional refinement; and
(c) obtaining atomic model output data defining the three-dimensional structure of said PPCA, pPPCA or at least one subdomain thereof.
7. A method according to claim 6, wherein said computer readable medium further has stored thereon data corresponding to a nucleic acid sequence or an amino acid sequence data comprising at least one structural domain or a functional domain of a PPCA or pPPCA corresponding to a portion of the amino acid sequences of Figures 13 or 14, and wherein said analyzing step further comprises analyzing said sequence data.
8. A computer readable medium having stored thereon atomic model data of said PPCA or pPPCA as the model output data produced by a method according to claim 6.
9. A computer-based system for providing atomic model data of the three dimensional structure of a PPCA or a pPPCA, comprising the following elements; (a) a computer readable medium having stored thereon atomic coordinate/x-ray diffraction data of said PPCA or pPPCA or at least one subdomain thereof;
(b) at least one computing subroutine, that when executed in a computer, causes the computer to analyze the atomic coordinate/x-ray diffraction data from (a) to provide data output defining an atomic model of said PPCA or pPPCA, said analyzing utilizing at least one computing subroutine selected from the group consisting of data processing and reduction, auto-indexing, intensity scaling, intensity merging, amplitude conversion, truncation, molecular replacement, molecular alignment, molecular refinement, electron density map calculation, electron density modification, electron map visualization, model building, rigid body refinement, positional refinement; and (c) retrieval means for obtaining atomic model output data defining the three- dimensional structure of said PPCA, pPPCA or at least one subdomain thereof.
10. A computer-based system according to claim 9, wherein said computer readable medium further has stored thereon data corresponding to a nucleic acid sequence or an amino acid sequence data comprising at least one structural domain or a functional domain of a PPCA or pPPCA corresponding to a portion of the amino acid sequences of Figures 13 or 14, and wherein said at least one subroutine further includes analyzing said sequence data.
11. A computer readable medium, having stored thereon atomic model data of a PPCA, pPPCA, or at least one subdomain thereof, produced by a computer system according to claim 9.
12. A method for providing a computer atomic model of a ligand of a PPCA or pPPCA, comprising
(a) providing a computer readable medium according to claim 11, having stored thereon atomic model data of a PPCA, a pPPCA or at least one subdomain thereof;
(b) providing a computer readable medium having stored thereon atomic model data sufficient to generate atomic models of potential ligands of PPCA or pPPCA; (c) analyzing on a computer, using at least one subroutine executed in said computer, the atomic model data from (a) and the ligand data from (b), to determine binding sites of PPCA or pPPCA and to provide data output defining an atomic model of a ligand of said PPCA, pPPCA, or at least one subdomain thereof, said analyzing utilizing computing subroutines selected from the group consisting of data processing and reduction, auto-indexing, intensity scaling, intensity merging, amplitude conversion, truncation, molecular replacement, molecular alignment, molecular refinement, electron density map calculation, electron density modification, electron map visualization, model building, rigid body refinement, positional refinement; and (d) obtaining atomic model output data defining the three-dimensional structure of a ligand of said PPCA, pPPCA or at least one subdomain thereof.
13. A computer readable medium having stored thereon the model output data produced by a method according to claim 12.
14. An isolated PPCA or pPPCA ligand, corresponding to the physical molecule of the atomic model of the ligand model produced by a method according to claim 12.
15. A computer-based system for providing an atomic model of a ligand of a PPCA or pPPCA, comprising the following elements;
(a) a computer readable medium having stored thereon atomic model data of a PPCA or pPPCA;
(b) a computer readable medium having stored thereon atomic model data sufficient to generate atomic models of potential ligands of PPCA or pPPCA;
(c) at least one computing subroutine for analyzing on a computer the atomic model data of PPCA or pPPCA from (a) and the ligand data from (b), to determine binding sites of PPCA or pPPCA and to provide data output defining atomic models of potential ligands of PPCA or pPPCA, said analyzing utilizing at least one computing subroutine selected from the group consisting of data processing and reduction, auto-indexing, intensity scaling, intensity merging, amplitude conversion, truncation, molecular replacement, molecular alignment, molecular refinement, electron density map calculation, electron density modification, electron map visualization, model building, rigid body refinement, positional refinement; and
(d) retrieval means for obtaining atomic model output data defining the atomic models of potential ligands of PPCA or pPPCA.
16. A computer readable medium, comprising atomic model output data of a potential ligand of PPCA or pPPCA, said data produced by a method according to claim 15.
17. An isolated PPCA or pPPCA ligand, corresponding to the physical molecule of the atomic model of a ligand produced by a computer system according to claim 15.
18. A crystallized pPPCA, having the atomic coordinates presented in Figures 23.1-23.41.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU11157/97A AU1115797A (en) | 1995-10-26 | 1996-10-25 | Protective protein/cathepsin a and precursor: crystallization, x-ray diffraction, three-dimensional structure determination and rational drug design |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US597695P | 1995-10-26 | 1995-10-26 | |
| US60/005,976 | 1995-10-26 | ||
| US680295P | 1995-11-15 | 1995-11-15 | |
| US60/006,802 | 1995-11-15 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO1997015588A1 (en) | 1997-05-01 |
| WO1997015588A9 (en) | 1997-09-18 |
Family
ID=26674999
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US1996/017325 WO1997015588A1 (en) | 1995-10-26 | 1996-10-25 | Protective protein/cathepsin a and precursor: crystallization, x-ray diffraction, three-dimensional structure determination and rational drug design |
Country Status (2)
| Country | Link |
|---|---|
| AU (1) | AU1115797A (en) |
| WO (1) | WO1997015588A1 (en) |
Families Citing this family (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6183121B1 (en) * | 1997-08-14 | 2001-02-06 | Vertex Pharmaceuticals Inc. | Hepatitis C virus helicase crystals and coordinates that define helicase binding pockets |
| AU1187599A (en) * | 1997-10-14 | 1999-05-03 | E.I. Du Pont De Nemours And Company | Trihydroxynaphthalene reductase: methods for three-dimensional structure determinations and rational inhibitor design |
| DE69936445T2 (en) * | 1998-02-04 | 2008-04-10 | Immunex Corp., Seattle | CRYSTALLINE TNF-ALPHA CONVERTING ENZYME AND USE THEREOF |
| US6842704B2 (en) | 1998-02-04 | 2005-01-11 | Immunex Corporation | Crystalline TNF-α-converting enzyme and uses thereof |
| US7383135B1 (en) | 1998-05-04 | 2008-06-03 | Vertex Pharmaceuticals Incorporated | Methods of designing inhibitors for JNK kinases |
| US6988041B2 (en) | 2000-01-31 | 2006-01-17 | Pharmacia & Upjohn Company | Crystallization and structure determination of Staphylococcus aureus NAD synthetase |
| US7736875B2 (en) | 2000-09-08 | 2010-06-15 | Prozymex A/S | Dipeptidyl peptidase I crystal structure and its uses |
| WO2002020804A1 (en) * | 2000-09-08 | 2002-03-14 | Prozymex A/S | Rat cathepsin, dipeptidyl peptidase i (dppi): crystal structure, inhibitors and its uses |
| US6869792B2 (en) | 2001-03-16 | 2005-03-22 | Irm, Llc | Method and apparatus for performing multiple processing steps on a sample in a single vessel |
| JP2007508844A (en) | 2003-10-24 | 2007-04-12 | ギリアード サイエンシーズ, インコーポレイテッド | Methods and compositions for the identification of therapeutic compounds |
| LT2751279T (en) | 2011-08-31 | 2017-12-11 | St. Jude Children`S Research Hospital | Methods and compositions to detect the level of lysosomal exocytosis activity and methods of use |
-
1996
- 1996-10-25 WO PCT/US1996/017325 patent/WO1997015588A1/en active Application Filing
- 1996-10-25 AU AU11157/97A patent/AU1115797A/en not_active Abandoned