
WO1997033900A1 - E2a-binding protein - Google Patents

E2a-binding protein

Info

Publication number
WO1997033900A1
Authority
WO
WIPO (PCT)
Prior art keywords
glu
thr
pro
leu
gly
Prior art date
Application number
PCT/US1997/004117
Other languages
French (fr)
Inventor
Mu-En Lee
Edgar Haber
Wilson O. Endege
Matthew D. Layne
Original Assignee
President And Fellows Of Harvard College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by President And Fellows Of Harvard College
Publication of WO1997033900A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides

Definitions

  • This invention relates to regulation of gene expression and differentiation in vascular smooth muscle cells.
  • E12 and E47, which are alternatively spliced products of the E2A gene (Sun et al., Mol. Cell. Biol. 11:5603-11, 1991), belong to a growing family of transcription factors characterized by a highly conserved helix-loop-helix (HLH) motif (Kadesch, Cell. Growth Differ. 4:49-55, 1993; Kadesch, Immunol. Today 13:31-6, 1992).
  • E2A proteins are widely expressed.
  • E47 forms homodimers, while both E12 and E47 heterodimerize with tissue-specific HLH factors (Hsu et al., Proc. Natl. Acad. Sci. USA 91:3181-5, 1994; Johnson et al., Proc. Natl. Acad. Sci. USA 89:3596-600, 1992; Shirakata, Genes & Dev. 7:2456-70, 1993; Sun et al., Mol. Cell. Biol. 11:5603-11, 1991; Shirakata, EMBO J. 14:1766-72, 1995).
  • Immediately adjacent to the HLH domain, most HLH proteins contain a basic region that allows homodimers or heterodimers to bind the CANNTG consensus sequence, which is known as the E-box.
  • Members of the Id family (Id1, Id2, Id3, and Id4) and the Drosophila extramacrochaetae (emc) protein do not contain the basic DNA-binding region. Heterodimers of Id and other HLH proteins do not bind DNA, and thus Id proteins function as inhibitors of DNA binding. In vitro, Id1 and Id2 have higher affinity for E2A proteins than MyoD does, and they may dissociate functional MyoD-E2A heterodimers (Benezra et al., Cell 61:49-59, 1990; Sun et al., Mol. Cell. Biol. 11:5603-11, 1991).
  • E2A proteins are involved in regulating growth and differentiation of many different cell types.
  • E2A proteins have important roles in transcriptional activation of pancreatic and immunoglobulin genes (Cordle et al., Mol. Cell. Biol. 11:2881-6, 1991; Henthorn et al., Science 247:467-70, 1990; Murre et al., Mol. Cell. Biol. 11:1156-60, 1991; Nelson et al., Genes & Dev. 4:1035-43, 1990). They are essential for B cell development, as deletion of the E2A gene by homologous recombination prevents B cell differentiation in mice (Bain et al.
  • E2A proteins may be important in the differentiation of hematopoietic and neural tissue, respectively (Hsu et al., Proc. Natl. Acad. Sci. USA 91:3181-5, 1994; Johnson et al., Proc. Natl. Acad. Sci. USA 89:3596-600, 1992;
  • induction of myogenesis can be achieved by overexpression of myogenic factors in C3H10T1/2 embryonic fibroblasts or simply by removal of growth factors in the culture medium of C2C12 myoblasts (Olson and Klein, Genes & Dev. 8:1-8, 1994; Tapscott et al., Science 242:405-11, 1988; Weintraub, Cell 75:1241-4, 1993).
  • E2A proteins, in conjunction with skeletal muscle-specific HLH proteins, are required for terminal differentiation and cell cycle withdrawal of skeletal muscle cells. Although in most instances the effect of E2A proteins on differentiation and growth inhibition appears to require heterodimerization with tissue-specific HLH proteins (Jan and Jan, Cell 75:827-30, 1993; Olson and Klein, Genes & Dev.
  • the vascular smooth muscle cell in mature animals is a highly specialized cell type whose principal function is contraction (Owens, Physiol. Rev. 75:487-517, 1995; Schwartz et al., Physiol. Rev. 70:1177-1209, 1990).
  • the vascular smooth muscle cell is not terminally differentiated.
  • Although the adult vascular smooth muscle cell proliferates at an extremely low rate, it can undergo rapid changes to a highly proliferative phenotype in response to stimuli such as oxidized low density lipoprotein, homocysteine, and mechanical and immunological injuries (Libby and Hansson, Lab. Invest. 64:5-15, 1991; Munro and Cotran, Lab.
  • E2A-BP is herein defined as a protein which (1) has at least 70% sequence identity with SEQ ID NO:16, (2) binds to both E12 and E47 under physiological conditions, and (3) inhibits binding of E47 homodimer to an E-box probe consisting of: 5'-GATCTACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGAGCT-3' (SEQ ID NO:3).
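As an illustration of the CANNTG (E-box) consensus mentioned above, the following minimal Python sketch scans the SEQ ID NO:3 probe for sites matching that pattern. The function name `find_e_boxes` and the use of a regular expression are illustrative assumptions, not part of the patent; the probe string is the sequence printed above with the line wrap removed.

```python
import re

# SEQ ID NO:3 as printed in the text, joined across the line wrap.
E_BOX_PROBE = (
    "GATCTACACCTGCTGCCTCCC"
    "AACACCTGCTGCCTCCC"
    "AACACCTGCTGCCTCCC"
    "AACACCTGCTGAGCT"
)

# CANNTG consensus, where N is any nucleotide.
# The lookahead allows overlapping sites to be reported as well.
E_BOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return (0-based start, matched site) for each CANNTG occurrence.

    Hypothetical helper for illustration only.
    """
    return [(m.start(), m.group(1)) for m in E_BOX_PATTERN.finditer(seq.upper())]


sites = find_e_boxes(E_BOX_PROBE)
for start, site in sites:
    print(start, site)
```

On this probe the scan reports four CACCTG sites, one per repeat unit of the sequence.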
  • an E2A-BP polypeptide may be introduced into a vascular smooth muscle cell, in order to promote growth of the cell.
  • the sequence of the E2A-BP of the invention is preferably at least 80% identical, more preferably at least 90% identical, and most preferably at least 95% identical to SEQ ID NO:16. It can have the sequence of a naturally occurring protein from, e.g., a human, mouse (SEQ ID NO:18), rat, hamster, chicken, goat, horse, pig, cow, monkey, or ape, or can have one or more amino acid deletions, additions, or substitutions. It may be identical to SEQ ID NO:16; a truncated form of E2A-BP, e.g., the sequence of
  • the polypeptide of the invention may, for example, decrease the level of expression of myogenin and/or myosin heavy chain, and decrease the formation of multinuclear myotubes by these cells.
  • the polypeptide of the invention inhibits binding of E47-MyoD heterodimer to the E-box probe, but binds poorly or not at all to MyoD or to the E-box probe itself.
  • the invention also features an antibody (monoclonal or polyclonal), which specifically binds to E2A-BP and can be made using standard methods.
  • By "substantially pure polypeptide" is meant a polypeptide which is separated from those components (e.g., proteins and other naturally-occurring organic molecules) which accompany it in its natural state.
  • substantially pure polypeptides include recombinant polypeptides derived from a eukaryote, but produced in E. coli or another prokaryote, or in a eukaryote other than that from which the polypeptide was originally derived.
  • the polypeptide is substantially pure when it constitutes at least 60%, by weight, of the protein in the preparation.
  • Preferably, the protein in the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight, E2A-BP polypeptide.
  • a substantially pure E2A-BP polypeptide may be obtained by, for example, extraction from a natural source (e.g., a vascular smooth muscle); expression of a recombinant nucleic acid encoding an E2A-BP polypeptide; in vitro translation; or chemical synthesis of the protein. Purity of the polypeptide can be measured by any appropriate method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.
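The purity thresholds above (60%, 75%, 90%, and 99% by weight) can be expressed as a small classification helper. This is only a sketch: the function name `purity_tier`, its inputs, and the tier labels are illustrative assumptions, and the masses would in practice come from one of the measurement methods named above.

```python
def purity_tier(e2a_bp_mass, total_protein_mass):
    """Classify a preparation by the weight percent of E2A-BP polypeptide.

    Hypothetical helper; thresholds are the ones stated in the text.
    Both masses must be in the same unit.
    """
    if total_protein_mass <= 0:
        raise ValueError("total protein mass must be positive")
    percent = 100.0 * e2a_bp_mass / total_protein_mass
    if percent >= 99:
        tier = "most preferred (>= 99%)"
    elif percent >= 90:
        tier = "more preferred (>= 90%)"
    elif percent >= 75:
        tier = "preferred (>= 75%)"
    elif percent >= 60:
        tier = "substantially pure (>= 60%)"
    else:
        tier = "not substantially pure (< 60%)"
    return percent, tier


print(purity_tier(6.0, 10.0))
print(purity_tier(9.2, 10.0))
```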
  • the invention also features isolated nucleic acids (DNA, RNA, or combinations or modifications thereof) that encode E2A-BP, as is defined above.
  • the isolated nucleic acid may contain the nucleotide sequence of SEQ ID NO:15, or a portion thereof, e.g., SEQ ID NO:2, or may contain the nucleotide sequence of SEQ ID NO:17.
  • the isolated nucleic acid may hybridize under high stringency conditions to a probe having a nucleotide sequence complementary to SEQ ID NO:15 or SEQ ID NO:17, or a portion thereof.
  • the invention also includes all degenerate variants of sequences which encode E2A-BP.
  • an E2A-BP-encoding nucleic acid molecule of the invention may be introduced into a vascular smooth muscle cell, using gene therapy methods, in order to promote growth of the cell.
  • By "isolated nucleic acid" is meant a nucleic acid that is free of the genes which, in the naturally-occurring genome of the organism from which the DNA is derived, flank the gene of interest.
  • the term therefore includes, for example, a recombinant DNA which is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote at a site other than its natural site, or which exists as a separate molecule (e.g., a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences.
  • It also includes a recombinant DNA which is part of a hybrid gene encoding an additional polypeptide sequence, such as a polypeptide sequence which facilitates purification (e.g., glutathione-S-transferase (GST)).
  • the invention also features an isolated nucleic acid (DNA, RNA, or combinations or modifications thereof) having at least 50% sequence identity (preferably at least 70%, more preferably at least 80%, more preferably 90%, and most preferably at least 95%) to SEQ ID NO:15, and encoding E2A-BP, as defined above.
  • the percent sequence identity of one DNA to another is determined by standard means, e.g., by the Sequence Analysis Software Package developed by the Genetics Computer Group (University of Wisconsin Biotechnology Center, Madison, WI) (or an equivalent program; see, e.g., Ausubel et al., supra), employing the default parameters thereof.
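To make the notion of percent sequence identity concrete, here is a minimal Python sketch that computes it for two sequences that have already been aligned. It is not the Genetics Computer Group package named above and does not perform alignment; the function name `percent_identity`, the gap symbol, and the choice of denominator (all aligned columns except those that are gaps in both sequences) are assumptions made for illustration, and a real program may count differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Hypothetical helper: columns that are gaps in both sequences are skipped;
    a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences, not taken from any SEQ ID in the document.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity)          # 9 of 10 positions match
print(identity >= 70.0)  # compare against a stated threshold, e.g. 70%
```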
  • Hybridization is carried out using standard techniques, such as those described in Ausubel et al. (Current Protocols in Molecular Biology, John Wiley & Sons, 1989).
  • "High stringency" refers to nucleic acid hybridization and wash conditions characterized by high temperature and low salt concentration, e.g., wash conditions of 65°C at a salt concentration of approximately 0.1X SSC.
  • "Low" to "moderate" stringency refers to DNA hybridization and wash conditions characterized by low temperature and high salt concentration, e.g., wash conditions of less than 60°C at a salt concentration of at least 1.0X SSC.
  • high stringency conditions may include hybridization at about 42°C, and about 50% formamide; a first wash at about 65°C, about 2X SSC, and 1% SDS; followed by a second wash at about 65°C and about 0.1X SSC.
  • Lower stringency conditions suitable for detecting DNA sequences having about 50% sequence identity to an E2A-BP gene include, for example, hybridization at about 42°C in the absence of formamide; a first wash at about 42°C, about 6X SSC, and about 1% SDS; and a second wash at about 50°C, about 6X SSC, and about 1% SDS.
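The stringency definitions above can be summarized as a simple rule on wash temperature and salt concentration. The sketch below is an assumption-laden illustration: the `Wash` type, the function `classify_wash`, and the treatment of the example values (65°C, 0.1X SSC; below 60°C, at least 1.0X SSC) as hard cutoffs are all choices made here, not rules stated by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wash:
    """A wash step (hypothetical type for illustration)."""
    temp_c: float  # wash temperature in degrees Celsius
    ssc_x: float   # salt concentration as a multiple of SSC, e.g. 0.1 for 0.1X


def classify_wash(wash):
    """Label a wash using the example conditions in the text as cutoffs."""
    if wash.temp_c >= 65 and wash.ssc_x <= 0.1:
        return "high"
    if wash.temp_c < 60 and wash.ssc_x >= 1.0:
        return "low-to-moderate"
    return "unclassified"


print(classify_wash(Wash(temp_c=65, ssc_x=0.1)))  # final high stringency wash
print(classify_wash(Wash(temp_c=50, ssc_x=6.0)))  # final lower stringency wash
print(classify_wash(Wash(temp_c=65, ssc_x=2.0)))  # intermediate first wash
```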
  • an isolated nucleic acid such as a DNA, containing an E2A-BP promoter, e.g., a human E2A-BP promoter.
  • By "promoter" is meant a DNA sequence sufficient to direct transcription of a coding sequence to which it is linked. Promoters may be constitutive or inducible, and may be coupled to other regulatory sequences or "elements" which render promoter-dependent gene expression cell-type specific, tissue-specific or inducible by external signals or agents; such elements may be located in the 5' or 3' region of the native gene, or within an intron.
  • an E2A-BP promoter is capable of directing gene expression in vascular smooth muscle cells.
  • An E2A-BP promoter of the invention may be operably linked to the coding sequence of E2A-BP, or may be operably linked to a sequence which encodes a heterologous polypeptide, e.g., a growth inhibitor, such as retinoblastoma, an inhibitor of cyclins (e.g., p21, p57, p18, or p17), or a vasodilator, such as cNOS or a prostacyclin.
  • the E2A-BP promoter may also be operably linked to a segment of DNA which is transcribed into an RNA that is antisense to an mRNA naturally produced in a vascular smooth muscle cell.
  • RNAs that are antisense to mRNAs encoding proteins such as E2A-BP, heparin-binding epidermal growth factor (HBEGF), and platelet-derived growth factor (PDGF)
  • By "operably linked" is meant that a coding sequence and a regulatory sequence(s) (i.e., a promoter) are connected in such a way as to permit gene expression when the appropriate molecules (e.g., transcriptional activator proteins) are bound to the regulatory sequence(s).
  • the invention also features an isolated, single-stranded nucleic acid consisting of a nucleotide sequence which is antisense to at least a portion of a naturally occurring E2A-BP sense strand, e.g., E2A-BP mRNA (an "antisense oligonucleotide").
  • the antisense oligonucleotide may consist of DNA, RNA, or combinations or modifications thereof. For example, stabilized analogues of deoxyribonucleotides (see below) may be used.
  • the antisense oligonucleotide may be produced by standard in vitro methods, e.g., by standard chemical synthesis (see, e.g., Ausubel et al.
  • the antisense oligonucleotides described above may be introduced into a cell, such as a vascular smooth muscle cell, in order to inhibit expression of E2A-BP in the cell.
  • antisense RNA oligonucleotides may also be produced in vivo by transcription of a nucleic acid introduced into a cell.
  • a nucleic acid may have a sequence containing (a) an expression control sequence (i.e., a promoter, such as the E2A-BP promoter) which permits expression in a vascular smooth muscle cell (e.g., a human vascular smooth muscle cell), operably linked to (b) a sequence which is transcribed into an RNA antisense to at least a 10, preferably at least a 14, or more preferably at least a 20 (e.g., at least a 30) nucleotide sequence of a target mRNA.
  • an antisense RNA may also be formulated to be complementary to all of a specific mRNA sequence, e.g., an E2A-BP mRNA sequence.
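An antisense RNA is the reverse complement of its target mRNA segment. The following minimal Python sketch shows that relationship and enforces the minimum target length of 10 nucleotides mentioned above. The function name `antisense_rna`, the `min_length` parameter, and the example target are hypothetical; the target is a toy sequence, not part of any E2A-BP sequence in the document.

```python
# Watson-Crick complement table for RNA.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_mrna, min_length=10):
    """Return the antisense RNA (reverse complement, 5'->3') of an mRNA segment.

    Hypothetical helper. DNA-style input is accepted and T is read as U.
    """
    seq = target_mrna.upper().replace("T", "U")
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    if len(seq) < min_length:
        raise ValueError(f"target must be at least {min_length} nucleotides")
    return seq.translate(_RNA_COMPLEMENT)[::-1]


target = "AUGGCUUACG"  # toy 10-nucleotide target, for illustration only
oligo = antisense_rna(target)
print(oligo)
```

Applying the function twice returns the original target, which is a quick check that the oligonucleotide is complementary along its whole length.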
  • the invention also features a method of inhibiting proliferation of a vascular smooth muscle cell by introducing into the cell a compound (e.g., a small organic compound, a peptide having a sequence corresponding to the E2A-BP binding site on E2A, a peptide having a sequence corresponding to the E2A binding site on E2A-BP, or an antibody specific for either E2A-BP or E2A), which inhibits binding between E2A-BP and an E2A transcription factor, such as E12 or E47.
  • Such compounds may be identified using any of several screening methods. For example, in one screening method, E2A can be contacted with E2A-BP in the presence of a candidate compound, and the level of E2A-BP binding to E2A can then be determined. A decrease in the level of binding in the presence of the compound, compared to the level of binding in the absence of compound, may be used as an indication of the ability of the candidate compound to inhibit E2A-BP/E2A binding. In another screening method, a complex containing E2A and E2A-BP can be contacted with a candidate compound. Whether the candidate compound decreases the binding of E2A-BP to E2A in the complex can be determined as an indication of the ability of the candidate compound to inhibit E2A-BP/E2A binding.
  • a cell that expresses E2A-BP and E2A can be cultured in the presence of a candidate compound.
  • the level of expression of an E2A-regulated gene in the cell can then be determined.
  • An increase in the level of expression of the gene in the presence of the compound, compared to the level of expression in the absence of the compound, is an indication of the ability of the candidate compound to inhibit E2A-BP/E2A binding. Additional variations of these methods are described below.
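The screening readouts described above both come down to comparing a signal measured with and without the candidate compound. A minimal Python sketch of that comparison follows. The function names and the `min_change_pct` parameter are assumptions made for illustration; the text does not specify a numerical cutoff, so the default only asks for any change in the expected direction.

```python
def percent_change(with_compound, without_compound):
    """Signed percent change relative to the no-compound control."""
    if without_compound <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (with_compound - without_compound) / without_compound


def inhibits_binding(binding_with, binding_without, min_change_pct=0.0):
    """Binding assay: a decrease in E2A-BP/E2A binding suggests inhibition."""
    return percent_change(binding_with, binding_without) < -min_change_pct


def raises_reporter(expr_with, expr_without, min_change_pct=0.0):
    """Cell-based assay: increased expression of an E2A-regulated gene
    suggests inhibition of E2A-BP/E2A binding."""
    return percent_change(expr_with, expr_without) > min_change_pct


# Hypothetical readings in arbitrary units.
print(percent_change(40.0, 100.0))
print(inhibits_binding(40.0, 100.0))
print(raises_reporter(250.0, 100.0))
```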
  • the invention also features a genetically altered non-human mammal, e.g., a mouse, whose muscle cells produce altered levels or forms of a functional E2A-BP gene product.
  • the levels of the E2A-BP gene product in the genetically altered mammal can be increased or decreased.
  • By "genetically altered mammal" is meant a mammal in which the genomic DNA sequence has been manipulated in some way.
  • the genetically altered mammal may be a knockout in which the endogenous E2A-BP sequences have been deleted or otherwise altered to decrease or change the pattern of expression.
  • the genetically altered mammal may be transgenic, retaining endogenous E2A-BP coding sequences and having exogenous E2A-BP sequences as well.
  • the transgenic mammal may express E2A-BP sequences from another species, may overexpress E2A-BP gene product, or may express E2A-BP in tissues and at developmental stages other than those in which E2A-BP is normally expressed.
  • the nucleated cells of a genetically altered mammal not producing a functional endogenous E2A-BP gene product may be engineered to encode human E2A-BP polypeptide, and to express functional human E2A-BP polypeptide, or, alternatively, E2A-BP from another heterologous species.
  • Fig. 1A is a schematic representation of a partial amino acid sequence of human E2A-BP (SEQ ID NO:1) containing 768 amino acids.
  • the sequences homologous to the two carboxypeptidase signature domains 1 and 2 are underlined.
  • the nuclear localization signal, KRIR, is in bold.
  • An acidic domain, rich in glutamic acid residues, is in italics.
  • R404, the first amino acid of the original cDNA clone isolated by interaction cloning, is marked by a dagger.
  • Fig. 1B is a schematic representation of a comparison of two E2A-BP sequences (SEQ ID NO:4 and SEQ ID NO:5), which are the E2A-BP sequences that are homologous to the sequences of signature domains 1 (SEQ ID NO:6) and 2 (SEQ ID NO:7) in carboxypeptidase E (CPE).
  • the putative zinc binding amino acids are in bold.
  • the histidine and glutamic acid residues in signature 1 are conserved in E2A-BP, but the histidine in signature 2 is replaced by asparagine.
  • Fig. 2 is a schematic representation of a partial nucleotide sequence of a human cDNA (SEQ ID NO:2), which encodes E2A-BP, as well as the corresponding amino acid sequence (SEQ ID NO:1), in single letter code.
  • Figs. 3A and 3B are schematic representations of the nucleotide sequences of portions of a rat E2A-BP cDNA.
  • Fig. 3A shows a portion of the sequence of the sense strand (SEQ ID NO:8) and
  • Fig. 3B shows a portion of the sequence of the antisense strand (SEQ ID NO:9).
  • Fig. 4 is a schematic representation of the nucleotide sequence of a full-length human E2A-BP cDNA (SEQ ID NO:15).
  • Fig. 5 is a schematic representation of the amino acid sequence (SEQ ID NO:16) encoded by the full-length human cDNA (SEQ ID NO:15).
  • the open reading frame of the full-length human E2A-BP contains 845 amino acids and has a predicted molecular weight of 96 kDa.
  • Fig. 6 is a schematic representation of the nucleotide sequence of the full-length murine cDNA (SEQ ID NO:17).
  • Fig. 7 is a schematic representation of the amino acid sequence (SEQ ID NO:18) encoded by the full-length murine cDNA (SEQ ID NO:17).
  • the open reading frame of the full-length murine E2A-BP contains 1128 amino acids and has a predicted molecular weight of 128 kDa.
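As a quick arithmetic check on the figures above, the predicted molecular weights can be divided by the stated open reading frame lengths to give the implied average mass per residue. The sketch below only performs that division; the function name is hypothetical, and the comparison to a rough average of about 110 Da per amino acid residue is a common rule of thumb, not a value given in the document.

```python
def implied_da_per_residue(mw_kda, n_residues):
    """Average residue mass (Da) implied by a molecular weight and a length."""
    return mw_kda * 1000.0 / n_residues


human = implied_da_per_residue(96, 845)    # human E2A-BP: 845 aa, 96 kDa
mouse = implied_da_per_residue(128, 1128)  # murine E2A-BP: 1128 aa, 128 kDa

print(round(human, 1))
print(round(mouse, 1))
```

Both values come out near 113.5 Da per residue, so the two predicted weights are consistent with each other and with the stated lengths.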
  • Naturally occurring E2A-BP is a nuclear protein that is expressed in vascular smooth muscle cells. It binds to, and modulates the activities of, E2A transcription factors.
  • the invention features nucleic acids that encode E2A-BP, as well as the E2A-BP polypeptides themselves. Also within the invention are the E2A-BP promoter, therapeutic methods employing E2A-BP nucleic acids or polypeptides, and methods for identifying compounds which modulate E2A-BP activity.
  • E2A-BP Nucleic Acids
  • The nucleotide sequences of a full length cDNA encoding human E2A-BP (SEQ ID NO:15) and of a partial human cDNA encoding human E2A-BP (SEQ ID NO:2) are shown in Figs. 4 and 2, respectively.
  • the invention includes all degenerate variants of the coding sequence of SEQ ID NO:15 or SEQ ID NO:2, as well as any isolated nucleic acid having a nucleotide sequence that encodes any other E2A-BP (as defined above).
  • the nucleotide sequence of a full-length mouse E2A-BP cDNA (SEQ ID NO:17) is shown in Fig. 6.
  • the invention includes all degenerate variants of the coding sequence of SEQ ID NO:17.
  • The nucleotide sequence of a cDNA encoding a portion of rat E2A-BP (SEQ ID NO:8 and SEQ ID NO:9; see description of Figs. 3A and 3B, above) is shown in Figs. 3A and 3B.
  • the rat cDNA was isolated by PCR using primers derived from the human E2A-BP cDNA sequence.
  • Nucleic acids encoding naturally occurring E2A-BP polypeptides from species in addition to human and rat are included in the invention, and may be obtained using standard methods, such as the PCR method used to isolate the rat clone.
  • the invention also includes nucleic acids which encode E2A-BP polypeptides having substitutions and/or deletions of single and/or multiple amino acids.
  • nucleic acids can be generated using standard methods, and the polypeptides that they encode can easily be screened for E2A-BP activity, as described below.
  • Plasmids encoding full-length human E2A-BP or mouse E2A-BP cDNAs, as well as E. coli strains containing the above plasmids, were deposited with the American Type Culture Collection (ATCC, Rockville, Maryland), under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purpose of Patent Procedure on March 12, 1997.
  • the E. coli strains are INVαF' (Invitrogen)
  • the cDNAs are cloned in the vector pCR2.1 (Invitrogen).
  • E2A-BP nucleic acids may be used in gene therapy and antisense methods for treating vascular diseases.
  • E2A-BP nucleic acids may be used in methods for producing E2A-BP polypeptides.
  • E2A-BP Polypeptides
  • the amino acid sequences of full-length human E2A-BP (SEQ ID NO:16) and a partial human E2A-BP amino acid sequence (SEQ ID NO:1) are shown in Figs. 5 and 1A, respectively.
  • the amino acid sequence of full-length mouse E2A-BP (SEQ ID NO:18) is shown in Fig. 7.
  • polypeptides which (1) have at least 70% sequence identity with SEQ ID NO:1, SEQ ID NO:16, or SEQ ID NO:18; (2) bind to both E12 and E47, under physiological conditions; and (3) inhibit binding of E47 homodimer to an E-box probe consisting of: 5'-GATCTACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGAGCT-3' (SEQ ID NO:3), are included in the invention.
  • Polypeptides according to the invention may be produced by transformation of a suitable host cell with all or part of an E2A-BP-encoding cDNA fragment (e.g., the cDNA described above) in a suitable expression vehicle.
  • E2A-BP polypeptides may be produced in a prokaryotic host (e.g., E. coli) or in a eukaryotic host (e.g., yeast, such as Saccharomyces cerevisiae; insect cells, such as Sf-9 cells; or mammalian cells, such as COS-1, NIH-3T3, and JEG3 cells).
  • expression vehicles may be chosen from, e.g., those described in Cloning Vectors: A Laboratory Manual (P.H. Pouwels et al., 1985, Supp. 1987) and by Ausubel et al., supra.
  • an expression system which may be used is a mouse 3T3 fibroblast host cell transfected with a pMAMneo expression vector (Clontech, Palo Alto, CA).
  • pMAMneo provides: an RSV-LTR enhancer linked to a dexamethasone-inducible MMTV-LTR promoter, an SV40 origin of replication, which allows replication in mammalian systems, a selectable neomycin gene, and SV40 splicing and polyadenylation sites.
  • DNA encoding an E2A-BP polypeptide can be inserted into the pMAMneo vector in an orientation designed to allow expression. The recombinant E2A-BP could then be isolated as described below.
  • Other host cells which may be used in conjunction with pMAMneo, or similar expression systems, include COS cells and CHO cells (ATCC Accession Nos. CRL 1650 and CCL 61, respectively).
  • E2A-BP polypeptides may also be produced in stably-transfected mammalian cell lines.
  • a number of vectors suitable for stable transfection of mammalian cells are available to the public, e.g., see Pouwels et al. (supra); methods for constructing such cell lines are well known in the art (see, e.g., Ausubel et al., supra).
  • cDNA encoding E2A-BP is cloned into an expression vector which includes the dihydrofolate reductase (DHFR) gene.
  • Integration of the plasmid and, therefore, the E2A-BP-encoding gene into the host cell chromosome is selected for by inclusion of 0.01-300 μM methotrexate in the cell culture medium (see, e.g.,
  • a DHFR-deficient CHO cell line (e.g., CHO DHFR⁻ cells, ATCC Accession No. CRL 9096)
  • Once an E2A-BP polypeptide is expressed, it may be isolated using standard methods, such as affinity chromatography. For example, E2A or an antibody against E2A-BP may be attached to a column and used to isolate the E2A-BP polypeptide. Lysis and fractionation of E2A-BP-harboring cells prior to affinity chromatography may be performed by standard methods (see, e.g., Ausubel et al., supra). Once isolated, the recombinant protein can, if desired, be further purified, e.g.
  • E2A-BP polypeptides can also be produced by chemical synthesis (e.g., by the methods described in Solid Phase Peptide Synthesis, 2nd ed., 1984, The Pierce Chemical Co., Rockford, IL).
  • E2A-BP polypeptides may be used in therapeutic methods to promote vascular smooth muscle cell growth in, e.g., wound healing (see below).
  • E2A-BP polypeptides, or E2A-BP polypeptide fragments (e.g., ΔE2A-BP), may also be used in methods for generating antibodies to E2A-BP, which may be used, e.g., in E2A-BP purification methods.
  • the E2A-BP promoter, which directs gene expression in vascular smooth muscle cells, is also included in the invention.
  • the E2A-BP promoter can be identified using standard methods of molecular biology (see, e.g., Ausubel et al., supra).
  • an E2A-BP cDNA probe can be used to isolate a genomic clone containing the E2A-BP gene from a genomic library.
  • the promoter region can be identified in the genomic clone using standard methods, such as primer extension and/or S1 nuclease mapping. Further characterization of the E2A-BP promoter may be achieved by comparing the sequence located 5' to the E2A-BP coding sequence with known promoter element sequences.
  • a construct that includes E2A-BP promoter sequences which confer vascular smooth muscle cell-specific expression to a reporter gene to which the sequences are operably linked can be progressively deleted, by 5', 3', and/or nested deletions, until the ability of the promoter to induce transcription of the reporter gene in transfected cells is reduced.
  • other sequences, such as 3' untranslated and intronic sequences, may be analyzed for effects on promoter activity.
  • the E2A-BP promoter may be used in gene therapy methods to direct vascular smooth muscle cell-specific expression of the E2A-BP gene, genes encoding heterologous polypeptides, or DNA sequences encoding antisense transcripts (e.g., transcripts antisense to E2A-BP RNA, or RNAs encoding growth promoting proteins, such as heparin-binding epidermal growth factor (HBEGF) and platelet-derived growth factor (PDGF)).
  • antisense inhibition of E2A-BP expression may be achieved by introduction of antisense oligonucleotides directly into vascular smooth muscle cells.
  • the nucleic acids of the invention can be used in gene therapy methods for treatment of vascular diseases, such as arteriosclerosis.
  • a vector containing the E2A-BP gene can be administered to a patient for use in expressing E2A-BP in a vascular smooth muscle cell.
  • the E2A-BP promoter operably linked to the coding sequence of a heterologous gene, i.e., a gene which encodes a protein other than E2A-BP, can be used to express the heterologous gene in vascular smooth muscle cells.
  • An E2A-BP promoter sequence directs transcription of DNA to which it is linked, preferentially in vascular smooth muscle cells compared to non-vascular smooth muscle cells.
  • Heterologous genes the expression of which is regulated by E2A-BP promoter sequences in vascular smooth muscle cells include, e.g., sequences encoding t-PA (Pennica et al., 1982, Nature 301:214); cyclin inhibitors, such as p21, p57, p18, and p17 (El-Deiry et al., 1993, Cell
  • thrombolytic agents may be expressed under the control of the E2A-BP promoter sequences for expression by vascular smooth muscle cells in blood vessels, e.g., vessels occluded by aberrant blood clots.
  • heterologous proteins, e.g., proteins which inhibit smooth muscle cell proliferation, e.g., interferon-γ and atrial natriuretic polypeptide, may be specifically expressed in vascular smooth muscle cells to ensure the delivery of these therapeutic peptides to an arteriosclerotic lesion or an area at risk of developing an arteriosclerotic lesion, e.g., an injured blood vessel.
  • the E2A-BP promoter sequences of the invention may also be used in gene therapy to promote angiogenesis to treat diseases such as peripheral vascular disease or coronary artery disease (Isner et al., 1995, Circulation 91:2687-2692).
  • the DNA of the invention can be operably linked to sequences encoding cellular growth factors which promote angiogenesis, e.g., VEGF, acidic fibroblast growth factor, or basic fibroblast growth factor.
  • Antisense Therapy
  • the E2A-BP nucleic acids of the invention may also be used in methods for antisense treatment.
  • Antisense treatment may be carried out by administering to a mammal, such as a human, DNA containing the E2A-BP promoter operably linked to a DNA sequence (an antisense template), which is transcribed into an antisense RNA.
  • antisense oligonucleotides may be introduced directly into vascular smooth muscle cells.
  • the antisense oligonucleotide may be a short nucleotide sequence (generally at least 10, preferably at least 14, more preferably at least 20 (e.g., at least 30), and up to 100 or more nucleotides) formulated to be complementary to a portion or all of a specific mRNA sequence. Standard methods relating to antisense technology have been described (see, e.g., Melani et al., Cancer Res. 51:2897-2901, 1991). Following transcription of a DNA sequence into an antisense RNA, the antisense RNA binds to its target nucleic acid molecule, such as an mRNA molecule, thereby inhibiting expression of the target nucleic acid molecule.
  • an antisense sequence complementary to a portion or all of the E2A-BP mRNA could be used to inhibit the expression of E2A-BP, thereby promoting differentiation.
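The complementarity requirement described above can be illustrated with a short sketch. It is not taken from the specification: the function names and the 15-nucleotide example string are hypothetical, and the only rule encoded is Watson-Crick base pairing together with the stated minimum length of 10 nucleotides.

```python
# Illustrative sketch only: derives an antisense sequence for a stretch of mRNA.
# Function names and the example sequence are assumptions, not part of the patent.

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (written 5'->3') for an mRNA segment (5'->3')."""
    rna = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    if set(rna) - set("ACGU"):
        raise ValueError("sequence contains a base other than A, C, G, U/T")
    # Complement every base, then reverse so the result reads 5'->3'.
    return rna.translate(_COMPLEMENT)[::-1]


def antisense_dna_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide antisense to mrna[start:start + length]."""
    if length < 10:
        raise ValueError("the text describes oligonucleotides of at least 10 nucleotides")
    target = mrna[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return antisense_rna(target).replace("U", "T")


if __name__ == "__main__":
    toy_mrna = "ATGGAGTCACACCGT"  # arbitrary 15-nt string used only for illustration
    print(antisense_dna_oligo(toy_mrna, start=0, length=10))
```

Choosing which portion of the E2A-BP mRNA to target, and which backbone chemistry to use, is outside the scope of this sketch.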
  • Such antisense therapy may be used to treat conditions characterized by proliferation of vascular smooth muscle cells, such as arteriosclerosis, e.g., restenosis in response to angioplasty.
  • the antisense therapy of the invention may also be used to treat cancer by inhibiting angiogenesis at the site of a solid tumor, as well as other pathogenic conditions which are caused by or exacerbated by angiogenesis, e.g., inflammatory diseases such as rheumatoid arthritis and diabetic retinopathy.
  • the promoter of the invention can be operably linked to antisense templates which are transcribed into antisense RNA capable of inhibiting the expression of growth promoting proteins, such as HBEGF and PDGF.
  • the antisense oligonucleotides of the invention may be provided exogenously to a target vascular smooth muscle cell.
  • the antisense oligonucleotide may be produced within the cell by transcription of a nucleic acid molecule including a promoter sequence operably linked to a sequence encoding the antisense oligonucleotide.
  • the nucleic acid molecule is contained within a non-replicating linear or circular DNA or RNA molecule, is contained within an autonomously replicating plasmid or viral vector, or is integrated into the host genome. Any vector that can transfect a vascular smooth muscle cell may be used in this method of the invention.
  • Preferred vectors are viral vectors, including those derived from replication-defective hepatitis viruses (e.g., HBV and HCV), retroviruses (see, e.g., WO 89/07136; Rosenberg et al., N. Eng. J. Med. 323(9):570-578, 1990), adenovirus (see, e.g., Morsey et al., J. Cell. Biochem., Supp. 17E, 1993), adeno-associated virus (Kotin et al., Proc. Natl. Acad. Sci.
  • HSV herpes simplex viruses
  • Methods for constructing expression vectors are well known in the art (see, e.g., Sambrook et al., supra) .
  • Additional suitable gene delivery systems include liposomes, receptor-mediated delivery systems, and naked DNA.
  • the invention also includes any other methods which accomplish in vivo transfer of nucleic acids into eukaryotic cells.
  • the nucleic acids may be packaged into liposomes, receptor-mediated delivery systems, non-viral nucleic acid-based vectors, erythrocyte ghosts, or microspheres (e.g., microparticles; see, e.g., U.S. Patent No. 4,789,734; U.S. Patent No. 4,925,673; U.S. Patent No. 3,625,214; Gregoriadis, Drug Carriers in Biology and Medicine, pp. 287-341 (Academic Press, 1979)).
  • naked DNA may be administered. Delivery of nucleic acids to a specific site in the body for gene therapy or antisense therapy may also be accomplished using a biolistic delivery system, such as that described by Williams et al .
  • delivery of antisense oligonucleotides may be accomplished by direct injection of the oligonucleotides into target tissues, for example, in a calcium phosphate precipitate or coupled with lipids.
  • the antisense oligonucleotides of the invention may consist of DNA, RNA, or any modifications or combinations thereof.
  • As examples of the modifications that the oligonucleotides may contain, inter-nucleotide linkages other than phosphodiester bonds, such as phosphorothioate, methylphosphonate, methylphosphodiester, phosphorodithioate, phosphoramidate, phosphotriester, or phosphate ester linkages (Uhlman et al., Chem. Rev. 90(4):544-584, 1990; Anticancer Research 10:1169, 1990), may be present in the oligonucleotides, resulting in their increased stability.
  • Oligonucleotide stability may also be increased by incorporating 3'-deoxythymidine or 2'-substituted nucleotides (substituted with, e.g., alkyl groups) into the oligonucleotides during synthesis, by providing the oligonucleotides as phenylisourea derivatives, or by having other molecules, such as aminoacridine or poly-lysine, linked to the 3' ends of the oligonucleotides (see, e.g., Anticancer Research 10:1169-1182, 1990).
  • RNA and/or DNA nucleotides which make up the oligonucleotides of the invention may be present throughout the oligonucleotide, or in selected regions of the oligonucleotide, e.g., in the 5' and/or 3' ends.
  • the antisense oligonucleotides may also be modified so as to increase their ability to penetrate the target tissue by, e.g., coupling the oligonucleotides to lipophilic compounds.
  • the antisense oligonucleotides of the invention can be made by any method known in the art, including standard chemical synthesis, ligation of constituent oligonucleotides, and transcription of DNA encoding the oligonucleotides, as is mentioned above.
  • E2A-BP is naturally expressed in vascular smooth muscle cells, which are, therefore, the preferred cellular targets for the antisense oligonucleotides of the invention.
  • Targeting of antisense oligonucleotides to vascular smooth muscle cells may be achieved by coupling the oligonucleotides to ligands of vascular smooth muscle cell receptors.
  • oligonucleotides may be targeted to vascular smooth muscle cells by being conjugated to monoclonal antibodies that specifically bind to vascular smooth muscle-specific cell surface proteins.
  • the antisense oligonucleotides of the invention may be used in therapeutic compositions for treating, e.g., vascular diseases.
  • the therapeutic applications of antisense oligonucleotides in general are described, e.g., in the following review articles: Le Doan et al., Bull. Cancer 76:849-852, 1989; Dolnick, Biochem. Pharmacol. 40:671-675, 1990; Crooke, Annu. Rev. Pharmacol. Toxicol. 32:329-376, 1992.
  • the therapeutic compositions of the invention may be used alone or in admixture, or in chemical combination, with one or more materials, including other antisense oligonucleotides or recombinant vectors, materials that increase the biological stability of the oligonucleotides or the recombinant vectors, or materials that increase the ability of the therapeutic compositions to penetrate vascular smooth muscle cells selectively.
  • the therapeutic compositions of the invention may be administered in pharmaceutically acceptable carriers (e.g., physiological saline), which are selected on the basis of the mode and route of administration, and standard pharmaceutical practice. Suitable pharmaceutical carriers, as well as pharmaceutical necessities for use in pharmaceutical formulations, are described in Remington's Pharmaceutical Sciences, a standard reference text in this field, and in the USP/NF.
  • a therapeutically effective amount is an amount of the antisense molecule of the invention which is capable of producing a medically desirable result in a treated animal.
  • dosage for any one patient depends upon many factors, including the patient's size, body surface area, age, the particular compound to be administered, sex, time and route of administration, general health, and other drugs being administered concurrently. Dosages will vary, but a preferred dosage for intravenous administration of DNA is approximately 10^6 to 10^22 copies of the DNA molecule.
  • the compositions of the invention may be administered locally or systemically. Administration will generally be parenteral, e.g., intravenous.
  • DNA may also be administered directly to the target site, e.g.
  • Modulation of the growth of vascular smooth muscle cells can be achieved by contacting the cells with a compound that blocks or enhances E2A-BP/E2A binding.
  • a compound that blocks or enhances E2A-BP/E2A binding can be identified by methods ranging from rational drug design to screening of random compounds. The latter method is preferable, as simple and rapid assays for testing such compounds are available. Small organic molecules are desirable candidate compounds for this analysis, as frequently these molecules are capable of passing through the plasma membrane so that they can potentially modulate E2A-BP/E2A binding within the cell.
  • Candidate blocking compounds include antibodies specific for either E2A-BP or E2A, or alternatively peptides (or peptide mimetics) (a) derived from the binding site on E2A-BP, which would block by occupying the binding site on E2A; or (b) derived from the binding site on E2A, which would block by occupying the binding site on E2A-BP.
  • the screening of compounds for the ability to modulate E2A-BP/E2A binding may be carried out using in vitro biochemical assays, cell culture assays, or animal model systems.
  • labeled E2A-BP (e.g., E2A-BP labeled with a fluorochrome or a radioisotope) is applied to a column containing immobilized E2A.
  • a candidate compound is applied to the column before, after, or simultaneously with the labeled E2A-BP, and the amount of labeled protein bound to the column in the presence of the compound is determined by conventional methods.
  • a compound tests positive for inhibiting E2A-BP/E2A binding if the amount of labeled protein bound in the presence of the compound is lower than the amount bound in its absence.
  • a compound tests positive for enhancing E2A-BP/E2A binding if the amount of labeled protein bound in the presence of the compound is greater than the amount bound in its absence.
  • binding of labeled E2A to immobilized E2A-BP is measured. In all of these methods, large numbers of compounds can be screened very rapidly and easily.
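The decision rule of the column assay above (less labeled protein bound with the compound indicates inhibition, more indicates enhancement) can be sketched as follows. The class name, field names, and the optional tolerance for measurement noise are assumptions added for illustration; the text itself only compares the two bound amounts.

```python
# Illustrative sketch of the screening read-out; names and tolerance are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnReading:
    compound: str
    bound_with_compound: float     # labeled protein retained with the compound present
    bound_without_compound: float  # labeled protein retained in the compound's absence


def classify(reading: ColumnReading, tolerance: float = 0.0) -> str:
    """Classify a candidate compound from a single pair of column measurements."""
    difference = reading.bound_with_compound - reading.bound_without_compound
    if difference < -tolerance:
        return "inhibits E2A-BP/E2A binding"
    if difference > tolerance:
        return "enhances E2A-BP/E2A binding"
    return "no effect detected"


if __name__ == "__main__":
    readings = [
        ColumnReading("compound A", 400.0, 1000.0),
        ColumnReading("compound B", 1500.0, 1000.0),
        ColumnReading("compound C", 1000.0, 1000.0),
    ]
    for reading in readings:
        print(reading.compound, "->", classify(reading))
```

In practice a screen would use replicate measurements and a statistically justified threshold; the single `tolerance` parameter here only stands in for that step.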
  • candidate compounds may also be screened using cell culture assays.
  • Cells expressing E2A-BP and E2A either naturally or after introduction into the cells of genes encoding E2A and/or E2A-BP (e.g., C2C12 cells transfected with E2A-BP, see below) , are cultured in the presence of the candidate compound.
  • the level of E2A-BP/E2A binding in the cell may be inferred using any of several assays.
  • levels of expression of E2A-regulated genes (e.g., genes encoding myogenin, myosin heavy chain, or myosin light chain) may be determined using, e.g., Northern blot analysis, RNAse protection analysis, immunohistochemistry, or other standard methods (see below).
  • the ability of a candidate compound to modulate E2A-BP/E2A binding may be evaluated by determining the effect of the candidate compound on cell differentiation (see below) or cell growth, which may be measured by, e.g., monitoring uptake of [3H]thymidine.
  • Compounds found to inhibit E2A-BP/E2A binding may be used in methods for inhibiting growth of vascular smooth muscle cells in order to, e.g., prevent or treat arteriosclerosis or angiogenesis.
  • Compounds found to enhance E2A-BP/E2A binding may be used in methods to promote proliferation of vascular smooth muscle cells in order to, e.g., promote angiogenesis in wound healing (e.g., healing of broken bones, burns, diabetic ulcers, or traumatic or surgical wounds) and organ transplantation.
  • such compounds may be used to treat peripheral vascular disease, cerebral vascular disease, hypoxic tissue damage (e.g., hypoxic damage to heart tissue) , or coronary vascular disease.
  • Compounds identified using the above-described methods may also be used to treat patients who have, or have had, transient ischemic attacks, vascular graft surgery, balloon angioplasty, frostbite, gangrene, or poor circulation.
  • the therapeutic compounds identified using the methods of the invention may be administered to a patient by any appropriate method for the particular compound, e.g., orally, intravenously, parenterally, transdermally, transmucosally, by inhalation, or by surgery or implantation at or near the site where the effect of the compound is desired (e.g., with the compound being in the form of a solid or semi-solid biologically compatible and resorbable matrix).
  • a salve or transdermal patch that can be directly applied to the skin so that a sufficient quantity of the compound is absorbed to increase vascularization locally may be used.
  • This method would apply most generally to wounds on the skin.
  • Salves containing the compound can be applied topically to induce new blood vessel formation locally, thereby improving oxygenation of the area and hastening wound healing.
  • Therapeutic doses are determined specifically for each compound, most being administered within the range of 0.001 to 100.0 mg/kg body weight, or within a range that is clinically determined to be appropriate by one skilled in the art.
  • Genetically Altered E2A-BP Mammals
  • Genetically altered mammals can be created which have cells that express altered levels of the endogenous functional E2A-BP gene product.
  • the genetically altered mammals may in addition express heterologous E2A-BP gene product derived from a second animal.
  • knock-out mice which do not express the mouse homologue of E2A-BP (SEQ ID NO:17) can be generated.
  • Mice from this line can be manipulated further to express the human homologue of E2A-BP (SEQ ID NO:2 and SEQ ID NO:15), e.g., by introduction of a transgene encoding the human sequence.
  • part or all of the endogenous mouse genomic sequences can be replaced with the corresponding human E2A-BP sequence by homologous recombination.
  • Expression of the human homologue may be directed to particular tissues or cell types, e.g., skeletal muscle cells and vascular smooth muscle cells, through the use of tissue- or cell type-specific regulatory elements. Many such elements are known to skilled artisans.
  • Such transgenic mammals represent model systems for the study of conditions or diseases that are caused, exacerbated, or ameliorated by the E2A-BP protein.
  • the cells of a genetically altered mammal may bear genetic information received, directly or indirectly, by deliberate genetic manipulation at the subcellular level, such as DNA received by microinjection or by infection with recombinant virus.
  • mammals of the invention are those with one or more cells that contain a recombinant DNA molecule.
  • this molecule becomes stably integrated into the mammal's chromosomes, but the use of DNA sequences that replicate extrachromosomally, such as might be engineered into yeast artificial chromosomes, is also contemplated.
  • the mammal is one in which heterologous genetic information has been taken up and integrated into a germ line cell.
  • transgenic mammals typically have the ability to transfer the genetic information to their offspring. If the offspring in fact possess some or all of the genetic information delivered to the parent animal, then they, too, are transgenic mammals.
  • a genetically altered mammal may be any mammal except Homo sapiens. Farm animals (pigs, goats, sheep, cows, horses, rabbits, and the like), rodents (such as rats, guinea pigs, and mice) , and domestic animals (for example, dogs and cats) are within the scope of the present invention.
  • the genetically altered mammals of the present invention are produced by introducing DNA encoding E2A-BP of the invention into single-celled embryos so that the DNA is stably integrated into the DNA of germ-line cells in the mature mammal, and inherited in a Mendelian fashion. It has been possible for many years to introduce heterologous DNA into fertilized mammalian ova.
  • totipotent or pluripotent stem cells can be transfected by microinjection, calcium phosphate-mediated precipitation, liposome fusion, retroviral infection, or other means.
  • the transfected cells are then introduced into an embryo (for example, into the cavity of a blastula) and implanted into a pseudo-pregnant female that is capable of carrying the embryos to term.
  • the transfected, fertilized ova can be implanted directly into the pseudopregnant female.
  • the appropriate DNA is injected into the pronucleus of embryos, at the single cell stage, and the embryos allowed to complete their development within a pseudopregnant female.
  • a recombinant E47 fusion protein (N3-SH[ALA]) , containing the bHLH domain of hamster shPan-1 (amino acids 509-646, with mutations R551A, V552L, and R553A) with a heart muscle kinase recognition sequence and the FLAG epitope, was expressed and purified as described (Blanar et al . , Proc. Natl. Acad. Sci. USA 92:5870-4, 1995; Blanar and Rutter, Science 256:1014-8, 1992).
  • N3-SH[ALA] was phosphorylated by heart muscle kinase in the presence of γ-32P-ATP and used to screen a human aorta λgt11 cDNA expression library (Clontech) by interaction cloning (Blanar et al., Proc. Natl. Acad. Sci. USA 92:5870-4, 1995; Blanar and Rutter, Science 256:1014-8, 1992).
  • a 1450-bp cDNA clone (ΔE2A-BP) obtained from interaction cloning was radiolabeled by random priming and used to isolate a 2786 bp cDNA clone (E2A-BP) from the same human aorta λgt11 cDNA library. After restriction mapping, the appropriate fragments were subcloned into plasmids (pBluescript SK, Stratagene). DNA sequencing was performed by using the dideoxy chain termination method and T7 DNA polymerase.
  • Sequencing templates used were (1) alkaline-denatured double-stranded DNA or (2) single-stranded DNA generated by in vitro excision by helper phage virus (Stratagene) (Sambrook et al., supra). Both strands of the 2786 bp human E2A-BP cDNA were sequenced at least once. Using reverse transcription PCR and primers encoding human E2A-BP sequences (forward,
  • rat E2A-BP cDNA fragment was amplified from rat aortic smooth muscle RNA as described (Lee et al., J. Biol. Chem. 266:16188-92, 1991). This rat E2A-BP cDNA was subcloned into plasmid pCR™II (Invitrogen) and used as a probe for in situ hybridization (see below). The authenticity of the rat E2A-BP was confirmed by sequencing.
  • Fig. 3A (SEQ ID NO:8): a portion of the sequence of the sense strand
  • Fig. 3B (SEQ ID NO:9): a portion of the sequence of the antisense strand
  • E2A-BP cDNA corresponding to the 4 kb E2A-BP transcript (see below) was isolated by subjecting human aortic RNA to three successive rounds of amplification using 5' RACE. 5' RACE reagents were obtained from Gibco. The cDNA product was subcloned into pCR2.1 (Invitrogen).
  • the sequence of the full-length human cDNA fragment is shown in Fig. 4 (SEQ ID NO:15).
  • the cDNA has an open-reading frame encoding 845 amino acids beginning at the ATG highlighted in bold and terminating at the TGA highlighted in bold.
  • the predicted amino acid sequence is shown in Fig. 5 (SEQ ID NO:16).
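How a predicted amino acid sequence follows from an open reading frame that begins at an ATG and ends at a stop codon such as TGA can be sketched with the standard genetic code. The helper names and the toy cDNA below are hypothetical and are not SEQ ID NO:15; the sketch simply translates from the first ATG to the first in-frame stop.

```python
# Illustrative sketch: translate the first ATG-initiated open reading frame of a cDNA.
# The toy sequence is an assumption for demonstration, not the E2A-BP cDNA.

_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons beginning with T
    "LLLLPPPPHHQQRRRR"  # codons beginning with C
    "IIIMTTTTNNKKSSRR"  # codons beginning with A
    "VVVVAAAADDEEGGGG"  # codons beginning with G
)
CODON_TABLE = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate_first_orf(cdna: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon (one-letter code)."""
    seq = cdna.upper()
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # TAA, TAG, or TGA
            return "".join(protein)
        protein.append(residue)
    raise ValueError("no in-frame stop codon found")


def orf_length_nt(n_amino_acids: int) -> int:
    """Nucleotides spanned by an ORF of n residues, including the stop codon."""
    return 3 * n_amino_acids + 3


if __name__ == "__main__":
    print(translate_first_orf("GGGATGGAGTCACACCGTTGAAAA"))
    print(orf_length_nt(845))
```

By this arithmetic, an 845-residue reading frame occupies 2538 nucleotides of the cDNA when the stop codon is counted.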
  • a full-length mouse cDNA corresponding to the mouse E2A-BP transcript was isolated using 5' RACE with primers designed from the mouse AEBP-cDNA sequence (EMBL nucleotide sequence accession number X80478) .
  • RT-PCR was used to generate a cDNA which was subcloned into pCR2.1 (Invitrogen) and sequenced.
  • the sequence of the full-length mouse cDNA fragment is shown in Fig. 6 (SEQ ID NO:17).
  • the cDNA has an open-reading frame encoding 1128 amino acids beginning at the ATG highlighted in bold and terminating at the TGA highlighted in bold, and generating the amino acid sequence shown in Fig. 7 (SEQ ID NO:17).
  • As a control, a sense RNA probe was synthesized using T7 RNA polymerase to transcribe HindIII-linearized rat E2A-BP cDNA in pCR™II. The RNA probes were hydrolyzed for 20 minutes at 60°C to generate probes approximately 100 nucleotides long. Each tissue section was hybridized with 20 million counts per minute (cpm) of probe at 50°C overnight. After the hybridization procedure, the sections were washed at 50°C under stringent conditions and dried. The tissue sections were subsequently dipped into emulsion solution (Kodak NTB2) and exposed for 2 to 4 days at 4°C. The sections were counter-stained with hematoxylin-eosin.
  • Cell Lines, Cell Culture, and Reagents
  • C2C12 and COS-7 cells were obtained from the American Type Culture Collection (Rockville, Maryland) . All cells were cultured in Dulbecco's modified Eagle medium (DMEM, JRH) supplemented with 10% fetal calf serum (Hyclone) or Serum Plus (JRH) before transfection. C2C12 cells were cultured in either 2% (differentiation medium) or 20% (growth medium) fetal calf serum after transfection.
  • the monoclonal antibody for myosin heavy chain was obtained from Sigma and the anti-c-myc 9E10 peptide antibody was obtained from either Oncogene or Santa Cruz.
  • cDNA probes for rat myogenin, mouse myosin light chain, and mouse myosin heavy chain were provided by A.B. Lassar (Harvard Medical School) .
  • Human MyoD, E12, and E47 plasmid constructs were provided by D. Baltimore (Whitehead Institute, MIT) and F.A. Peverali (EMBL, Heidelberg, Germany) .
  • the hybridized filters were washed in 30 mM NaCl, 3 mM sodium citrate, and 0.1% sodium dodecyl sulfate at 55°C, and then were used to expose film or stored on PhosphorImager screens for 68 hours. To correct for differences in RNA loading, the filters were washed in a 50% formamide solution at 80°C and rehybridized with a radiolabeled 28S oligonucleotide probe. The filters were scanned and radioactivity was measured on a PhosphorImager running the ImageQuant software (Molecular Dynamics, Sunnyvale, CA).
  • Cellular Localisation of E2A-BP
  • the expression plasmid Myc-E2A-BP/pCR3 was constructed for cellular localization of E2A-BP.
  • the c-Myc peptide tag (EQKLISEED; SEQ ID NO:12) was added in-frame with the E2A-BP open reading frame encoded by SEQ ID NO:2 at the N-terminus using PCR techniques, and cloned into the expression vector pCR3 (Invitrogen) .
  • COS-7 cells were transiently transfected with Myc-E2A-BP/pCR3 plasmids using the DEAE-dextran method (Sambrook et al., supra). Immunostaining was performed 48 hours after transfection.
  • the transfected cells, grown on chamber slides, were fixed with 4% paraformaldehyde in phosphate-buffered saline (PBS) and stained with an anti-c-myc monoclonal antibody (9E10, Oncogene), followed by a rhodamine-conjugated goat anti-mouse IgG as the secondary antibody. Counter-staining for the nucleus with Hoechst 33258 was performed as recommended by the manufacturer.
  • a DNA fragment containing the partial E2A-BP human open reading frame corresponding to SEQ ID NO:2 was cloned into expression plasmids pcDNA3 and pCR3 (Invitrogen) in sense or antisense orientations.
  • Stable transformants of C2C12 cells were generated by electroporation, as described previously (Sambrook et al., supra). Briefly, 2.5 × 10^6 cells were harvested at 60% confluence and resuspended in 0.8 ml of PBS. The cells were transferred to electroporation cuvettes (0.4 mm, Biorad) and mixed with 20 µg of plasmid DNA.
  • Stable transfectants were selected in DMEM media supplemented with 0.5 mg/ml of G418 (GIBCO).
  • Mutagenesis
  • Mutations were introduced into the human E2A-BP cDNA in pBluescript vector using the Clontech site-directed mutagenesis kit. Two conserved residues implicated in metal binding, H-236 and E-239 (as numbered in SEQ ID NO:1), were mutated to glutamine. Two oligonucleotides, mtXbal, 5'-GGCGGCCGCTGTAGAACTAGT-3' (SEQ ID NO:13) and mtSgnl
  • 35S-labeled proteins were prepared by in vitro transcription and translation (Promega TNT kit) using the cDNA plasmid encoding ΔE47 (amino acid 561 to 651), E47, E12, MyoD, Id-3, ΔE2A-BP (amino acid 404 to 768), mtE2A-BP, and wild type E2A-BP.
  • glutathione-S-transferase (GST) fusion proteins GST-ΔE47, GST-ΔE12 (amino acids 477 to 654), GST-MyoD, GST-Id3, and GST-ΔE2A-BP, were prepared as previously described (Shrivastava et al .
  • a typical binding reaction mixture contained DNA probe at 50,000 cpm, 1 µg of poly(dI-dC)·poly(dI-dC), 10 mM Tris (pH 7.5), 50 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol, 5% glycerol, and 3-5 µl of in vitro translated protein products, in a final volume of 25 µl.
  • the reaction mixture was incubated at room temperature for 30 minutes and analyzed by 5% native polyacrylamide gel electrophoresis in a 0.25 X TBE buffer (22 mM Tris base, 22 mM boric acid, and 0.5 mM EDTA).
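The buffer concentrations quoted for 0.25 X TBE follow from simple dilution arithmetic. The sketch below assumes the commonly cited 1 X TBE recipe (89 mM Tris base, 89 mM boric acid, 2 mM EDTA), which is not stated in the text; the function names and the 5 X stock in the usage example are likewise illustrative assumptions.

```python
# Illustrative dilution arithmetic; the 1 X recipe below is an assumed standard
# formulation, not a value given in the specification.

ONE_X_TBE_MM = {"Tris base": 89.0, "boric acid": 89.0, "EDTA": 2.0}


def diluted_composition(strength: float, one_x: dict = ONE_X_TBE_MM) -> dict:
    """Component concentrations (mM) of a buffer at the given fold strength."""
    return {name: conc * strength for name, conc in one_x.items()}


def stock_volume_ml(final_strength: float, final_volume_ml: float, stock_strength: float) -> float:
    """Volume of concentrated stock needed (C1 * V1 = C2 * V2)."""
    if stock_strength <= 0 or final_strength > stock_strength:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_strength * final_volume_ml / stock_strength


if __name__ == "__main__":
    for name, conc in diluted_composition(0.25).items():
        print(f"{name}: {conc:g} mM")
    print(stock_volume_ml(0.25, 1000.0, 5.0), "ml of 5 X stock per liter")
```

Under that assumption the calculated values are 22.25 mM Tris base, 22.25 mM boric acid, and 0.5 mM EDTA, which agrees with the rounded figures given for the gel buffer.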
  • Cells were harvested in low ionic strength lysis buffer (10 mM Tris-HCl, pH 8.0, 10 mM NaCl, 3 mM MgCl2, 300 mM sucrose, and 0.1% NP40) as described (Jen et al., Genes Dev. 6:1466, 1992). Protein lysates were cleared by preincubation with normal rabbit IgG and protein A-agarose (Oncogene), followed by an overnight incubation with c-Myc-antibody agarose (Santa Cruz), Talon metal affinity resin (Clontech), or protein A-agarose.
  • Using the bHLH domain of E47 as a probe to screen a human aortic expression library, ten interacting clones were isolated; eight clones encoded Id and two clones encoded novel sequences.
  • One of the two novel clones, designated E2A-BP, was characterized first.
  • a 1450-bp E2A-BP cDNA identified from the interaction cloning was used to isolate a 2795-bp cDNA clone.
  • E2A-BP lacks an HLH domain, despite its isolation by interaction cloning using the bHLH domain of E47 as a probe.
  • Carboxypeptidases contain two signature domains that are important for binding one atom of zinc (Manser et al., Biochem. J. 267:517-25, 1990; Reynolds et al., J. Biol. Chem. 264:20094-9, 1989; Tan et al., J. Biol. Chem. 264:13165-70, 1989).
  • a histidine and a glutamic acid in signature 1 and an additional histidine in signature 2 are implicated in zinc binding (Fig. 1B).
  • the histidine and glutamic acid in signature 1 are present in both carboxypeptidase E and E2A-BP.
  • the histidine in signature 2 is present in carboxypeptidase E, but not in E2A-BP.
  • E2A-BP mRNA is Preferentially Expressed in the Aorta
  • The expression pattern of E2A-BP mRNA was determined in a variety of rat and human organs by Northern blot analysis using a human E2A-BP cDNA probe. For rat organs, a single 4 kb transcript was detected by the E2A-BP probe. The highest expression of E2A-BP was observed in the aorta with its adventitia removed and containing mainly smooth muscle cells. E2A-BP was undetectable in other rat organs, except that a low level was detected in the adventitia and esophagus. E2A-BP was also expressed at high levels in human aorta, compared to heart, lung, and skeletal muscle, which have 40-, 80-, and 100-fold less expression, respectively.
  • E2A-BP is expressed in aortic smooth muscle cells.
  • in situ hybridization was performed, using a rat E2A-BP cDNA to generate both sense and antisense cRNA probes.
  • An intense concentration of autoradiographic grains was present after hybridization with the antisense, but not the sense probe, indicating a high level expression of E2A-BP transcript in aortic smooth muscle cells.
  • autoradiographic grains were not detected in skeletal muscle cells hybridized with the E2A-BP antisense probe. Hybridization of the antisense probe to small vessels in skeletal muscle was also detected. This observation is consistent with the low level of E2A-BP expression in skeletal muscle detected by RNA blot analysis.
  • Downregulation of E2A-BP mRNA in Human Adult Skeletal Muscle
  • E2A-BP expression in human fetal and adult skeletal muscle was examined.
  • a high level of E2A-BP mRNA was detected in fetal skeletal muscle cells.
  • E2A-BP mRNA was downregulated markedly in adult skeletal muscle cells that had differentiated terminally.
  • an RNA blot previously hybridized with an E2A-BP probe was rehybridized with a cyclin A probe. Cyclin A mRNA was present at a high level in fetal skeletal muscle but was undetectable in adult skeletal muscle.
  • E2A-BP is a nuclear protein
  • a fusion plasmid was generated which expresses a fusion protein containing a c-myc peptide tag on the N-terminus side of a portion of the E2A-BP cDNA open reading frame corresponding to SEQ ID NO:2.
  • the construct was then transfected into C2C12 and COS-7 cells.
  • the fusion protein was detected by a specific monoclonal antibody (9E10) to the c-myc tag.
  • DNA staining by Hoechst 33258 was used to localize the nucleus.
  • C-myc tagged E2A-BP protein was expressed in the nucleus in both C2C12 and COS-7 cells. This result is consistent with the presence of a nuclear localization signal (Boulikas, J. Cell. Biochem. 55:32-58, 1994), KRIR, at amino acid 599 to 602 of E2A-BP (Fig. 1A).
  • E2A-BP Binds E12 and E47, but not MyoD and Id3
  • E2A-BP Suppresses Binding of E47 Homodimer and E47-MyoD Heterodimer to the E-box
  • Gel mobility shift analysis was performed using an E-box probe consisting of four repeats of the consensus CANNTG sequences in the enhancer of muscle creatine kinase (Lassar et al., Cell 58:823-31, 1989; Murre et al., Cell 56:777-83, 1989; Murre et al., Cell 58:537-44, 1989) and HLH or E2A-BP proteins synthesized by in vitro transcription and translation. The proteins were translated alone or co-translated with other proteins. The effect of E2A-BP on binding of ΔE47 homodimers was analyzed. Incubation of the probe with ΔE47 resulted in formation of a specific ΔE47-E-box complex, which was abolished by incubation with a
  • The binding of E47/MyoD heterodimer to the E-box probe in the presence and absence of E2A-BP was also assessed. Incubation of the E-box probe with full length E47 and MyoD resulted in formation of a specific E47/MyoD-E-box complex, which was abolished by incubation with a 100-fold molar excess of identical nonradiolabeled DNA, but not by incubation with nonidentical DNA. E2A-BP decreased the binding of E47/MyoD heterodimer to the E-box by more than 75%. To determine whether the conserved histidine and glutamic acid residues of signature 1 (Fig. 1B) are important in inhibiting the binding of E47/MyoD to the E-box, the histidine and glutamic acid residues were mutated to glutamine. Mutation of these two amino acids prevented the inhibition of E47/MyoD-E-box complex formation by E2A-BP.
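The E-box consensus CANNTG used for the gel-shift probe can be located in a DNA sequence with a simple pattern search. The sketch below is illustrative only: the function name and the toy probe (four arbitrary CANNTG motifs joined by two-base spacers) are assumptions and do not reproduce the muscle creatine kinase enhancer sequence.

```python
# Illustrative sketch: find every occurrence of the E-box consensus CANNTG.
# The toy probe is hypothetical and is not the probe used in the experiments.
import re

# A lookahead is used so that overlapping motifs, if any, are all reported.
_EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(sequence: str) -> list:
    """Return (0-based position, motif) pairs for each CANNTG site on the given strand."""
    return [(match.start(), match.group(1)) for match in _EBOX.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_probe = "GG" + "CACCTG" + "AA" + "CAGGTG" + "TT" + "CACGTG" + "CC" + "CAGCTG"
    for position, motif in find_eboxes(toy_probe):
        print(position, motif)
```

Because the reverse complement of any CANNTG motif is itself a CANNTG motif, scanning a single strand is sufficient to find every site in a double-stranded probe.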
  • Because E2A proteins have an important role in regulating differentiation and inhibiting growth of many cell types (Kadesch, Cell. Growth Differ. 4:49-55, 1993; Lassar et al., Cell 58:823-31, 1989; Olson and Klein, Genes & Dev. 8:1-8, 1994; Peverali et al., EMBO J. 13:4291-301, 1994), the effect of E2A-BP, which attenuates binding of E2A to DNA, was tested. C2C12 myoblasts were used for these studies because differentiation in these cells is well characterized (Guo et al., Mol. Cell. Biol. 15:3823-9, 1995; Jen et al .
  • C2C12 myoblasts express skeletal muscle specific genes, such as myogenin, myosin heavy chain (MHC) , and myosin light chain (MLC) and differentiate into multinucleated myotubes (Jen et al . , Genes & Dev. 6:1466-79, 1992).
  • C2C12 cells were stably transfected with vectors containing no insert or a full length E2A-BP cDNA, in either sense or antisense orientations. Expression of sense and antisense transcripts was confirmed by Northern blot analysis. Two clones expressing the sense E2A-BP and three clones expressing antisense E2A-BP were selected. Since the responses of these clones were similar, the results of one representative clone from each group are presented. To determine the effect of E2A-BP on the mRNA levels of muscle specific genes, C2C12 clones were cultured in either growth medium for two days or in differentiation medium for one or two days, and total RNA was harvested for Northern blot analysis.
  • C2C12 cells in growth medium expressed MyoD, but not myogenin, MHC, and MLC.
  • differentiation medium markedly increased the mRNA levels of myogenin, MHC, and MLC, but did not affect MyoD mRNA levels.
  • differentiation medium failed to induce expression of myogenin, MHC, and MLC mRNA in C2C12 cells transfected with sense E2A-BP.
  • E2A-BP E2A-BP would inhibit expression of MHC
  • cells were immunostained with an anti-MHC primary antibody and a rhodamine-conjugated secondary antibody. The nuclei were labeled with Hoechst 33258. Expression of MHC protein was readily detected in C2C12 cells transfected with vector alone or antisense E2A-BP, but not in cells transfected with sense E2A-BP.
  • the transfected C2C12 clones were treated with differentiation medium for 4 days. Formation of multinuclear myotubes was detected in C2C12 cells transfected with vector alone, but not in cells transfected with sense E2A-BP. Since C2C12 cells expressed low levels of E2A-BP mRNA, the effect of antisense E2A-BP on myotube expression was also tested. Compared with C2C12 cells transfected with vector alone, transfection of C2C12 cells with antisense E2A-BP accelerated formation of myotubes.
  • E2A-BP Interacts with E2A Proteins in Vivo
  • MOLECULE TYPE Genomic DNA
  • GGAGTGGACC CTACGAAA GTCAAGTTCC CCCCATTGGG ATG GAG TCA CAC CGT 75
  • GAG ATC TCA GAC AAC CCT GGG GAG CAT GAA CTG GGG GAG CCC GAG TTC 847 Glu Ile Ser Asp Asn Pro Gly Glu His Glu Leu Gly Glu Pro Glu Phe 215 220 225
  • CAG CAG CGA CGC CTA CAA CAC CGC CTG CGG CTT CGG GCA CAG ATG CGG 2095 Gln Gln Arg Arg Leu Gln His Arg Leu Arg Leu Arg Ala Gln Met Arg 630 635 640 645
  • GAG AAA GAG GAG GAG ATA GCC ACT GGC CAG GCA TTC CCC TTC ACA ACA 2431 Glu Lys Glu Glu Glu Ile Ala Thr Gly Gln Ala Phe Pro Phe Thr Thr 745 750 755 GTA GAG ACC TAC ACA GTG AAC TTT GGG GAC TTC TGAGATCAGC GTCCTACCAA 2484
  • ATTCCCTCGC TCACCCCATC CTCTCTCCCG CCCCTTCCTG GATTCCCTCA CCCGTCTCGA 60
  • TCCCCTCTCC GCCCTTTCCC AGAGACCCAG AGCCCCTGAC CCCCCGCGCC CCTGCTCAGC 120 TGCCTCCTGG CGTTGCTGGC CCTGTGCCCT GGAGGGCGCC CGCAGACGGT GCTGACCGAC 180
  • GCC CGC ACG CCT ACC CAG GAG CAG CTG CTG GCC GCA GCC ATG GCA GCA 2460 Ala Arg Thr Pro Thr Gln Glu Gln Leu Leu Ala Ala Ala Met Ala Ala
  • GAC CAC GCC ATC TTC CGG TGG CTT GCC ATC TCC TTC GCC TCC GCA CAC 2556 Asp His Ala Ile Phe Arg Trp Leu Ala Ile Ser Phe Ala Ser Ala His 495 500 505 510 CTC ACC TTG ACC GAG CCC TAC CGC GGA GGC TGC CAA GCC CAG GAC TAC 2604 Leu Thr Leu Thr Glu Pro Tyr Arg Gly Gly Cys Gln Ala Gln Asp Tyr 515 520 525
  • MOLECULE TYPE protein
  • FRAGMENT TYPE internal
  • MOLECULE TYPE Genomic DNA
  • FEATURE
  • GAG GCC AAG CAG CCC CGG CCA GAG CCA GAG GAG GAG ACT GAG ATG CCC 729 Glu Ala Lys Gln Pro Arg Pro Glu Pro Glu Glu Glu Thr Glu Met Pro
  • AAA ATC AAG TGC CCA CCT ATT GGG ATG GAG TCA CAC CGC ATT GAG GAC 1209 Lys Ile Lys Cys Pro Pro Ile Gly Met Glu Ser His Arg Ile Glu Asp 375 380 385
  • MOLECULE TYPE protein
  • FRAGMENT TYPE internal

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Toxicology (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention features nucleic acids that encode E2A-BP, as well as the E2A-BP polypeptides themselves. Also within the invention are the E2A-BP promoter, therapeutic methods employing E2A-BP nucleic acids or polypeptides, and methods for identifying compounds which modulate E2A-BP activity.

Description

E2A-Binding Protein
Statement of Government Support
This invention was funded in part by the U.S. Government under grant number R01 GM53249 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Background of the Invention
This invention relates to regulation of gene expression and differentiation in vascular smooth muscle cells.
E12 and E47, which are alternatively spliced products of the E2A gene (Sun et al., Mol. Cell. Biol. 11:5603-11, 1991), belong to a growing family of transcription factors characterized by a highly conserved helix-loop-helix (HLH) motif (Kadesch, Cell. Growth Differ. 4:49-55, 1993; Kadesch, Immunol. Today 13:31-6, 1992). In contrast to other tissue-specific HLH factors, such as members of the MyoD family (e.g., SCL/TAL and MASH), E2A proteins are widely expressed. Using the HLH motif, E47, but not E12, forms homodimers, while both E12 and E47 heterodimerize with tissue-specific HLH factors (Hsu et al., Proc. Natl. Acad. Sci. USA 91:3181-5, 1994; Johnson et al., Proc. Natl. Acad. Sci. USA 89:3596-600, 1992; Shirakata, Genes & Dev. 7:2456-70, 1993; Sun et al., Mol. Cell. Biol. 11:5603-11, 1991; Shirakata, EMBO J. 14:1766-72, 1995). Immediately adjacent to the HLH domain, most HLH proteins contain a basic region that allows homodimers or heterodimers to bind the CANNTG consensus sequence, which is known as the E-box (Blackwell and Weintraub, Science 250:1104-10, 1990; Ephrussi et al., Science 227:134-40, 1985). In contrast to most HLH proteins, members of the Id family (Id1, Id2, Id3, and Id4) and the Drosophila extramacrochaetae (emc) protein do not contain the basic DNA-binding region. Heterodimers of Id and other HLH proteins do not bind DNA, and thus Id proteins function as inhibitors of DNA binding. In vitro, Id1 and Id2 have higher affinity for E2A proteins than MyoD does, and they may dissociate functional MyoD-E2A heterodimers (Benezra et al., Cell 61:49-59, 1990; Sun et al., Mol. Cell. Biol. 11:5603-11, 1991).
E2A proteins are involved in regulating growth and differentiation of many different cell types. For example, E2A proteins have important roles in transcriptional activation of pancreatic and immunoglobulin genes (Cordle et al., Mol. Cell. Biol. 11:2881-6, 1991; Henthorn et al., Science 247:467-70, 1990; Murre et al., Mol. Cell. Biol. 11:1156-60, 1991; Nelson et al., Genes & Dev. 4:1035-43, 1990). They are essential for B cell development, as deletion of the E2A gene by homologous recombination prevents B cell differentiation in mice (Bain et al., Cell 79:885-92, 1994; Zhuang et al., Cell 79:875-84, 1994). Heterodimerization of E2A proteins with the tissue-specific factors SCL/TAL and MASH indicates that E2A proteins may be important in the differentiation of hematopoietic and neural tissue, respectively (Hsu et al., Proc. Natl. Acad. Sci. USA 91:3181-5, 1994; Johnson et al., Proc. Natl. Acad. Sci. USA 89:3596-600, 1992; Shivdasani et al., Nature 373:432-4, 1995).
The function of E2A proteins and members of the MyoD family in growth and differentiation is most clearly demonstrated in skeletal muscle cells. In this cell type, growth and differentiation are mutually exclusive (Nadal-Ginard, Cell 15:855-64, 1978). The expression of members of the MyoD family, including MyoD, Myf5, myogenin, and MRF4, is essential for the differentiation of skeletal muscle both in vitro and in vivo (Buckingham, Cell 78:15-21, 1994; Buckingham, Curr. Opin. Genet. Dev. 4:745-51, 1994; Lassar et al., Cell 66:305-15, 1991; Olson, Genes & Dev. 4:1454-61, 1990; Olson and Klein, Genes & Dev. 8:1-8, 1994; Weintraub, Cell 75:1241-4, 1993). In vitro, induction of myogenesis can be achieved by overexpression of myogenic factors in C3H10T1/2 embryonic fibroblasts or simply by removal of growth factors in the culture medium of C2C12 myoblasts (Olson and Klein, Genes & Dev. 8:1-8, 1994; Tapscott et al., Science 242:405-11, 1988; Weintraub, Cell 75:1241-4, 1993). Expression of antisense E2A transcripts in C3H10T1/2 cells suppresses terminal muscle differentiation induced by MyoD, indicating that functional activity of myogenic HLH proteins of the MyoD family requires heterodimerization with E2A proteins (Lassar et al., Cell 66:305-15, 1991). Thus, E2A proteins, in conjunction with skeletal muscle-specific HLH proteins, are required for terminal differentiation and cell cycle withdrawal of skeletal muscle cells. Although in most instances the effect of E2A proteins on differentiation and growth inhibition appears to require heterodimerization with tissue-specific HLH proteins (Jan and Jan, Cell 75:827-30, 1993; Olson and Klein, Genes & Dev. 8:1-8, 1994), this effect has also been demonstrated in fibroblasts, a cell type in which MyoD, or any other tissue-specific HLH protein, has not been identified (Peverali et al., EMBO J. 13:4291-301, 1994).
Similar to the skeletal muscle cell, the vascular smooth muscle cell in mature animals is a highly specialized cell type whose principal function is contraction (Owens, Physiol. Rev. 75:487-517, 1995; Schwartz et al., Physiol. Rev. 70:1177-1209, 1990). In contrast to the skeletal muscle cell, the vascular smooth muscle cell is not terminally differentiated. Although the adult vascular smooth muscle cell proliferates at an extremely low rate, it can undergo rapid changes to a highly proliferative phenotype in response to stimuli such as oxidized low density lipoprotein, homocysteine, and mechanical and immunological injuries (Libby and Hansson, Lab. Invest. 64:5-15, 1991; Munro and Cotran, Lab. Invest. 58:249-261, 1988; Ross, Nature 362:801-9, 1993; Tsai et al., Proc. Natl. Acad. Sci. USA 91:6369-6373, 1994). This proliferation of vascular smooth muscle cells is a critical event in the development of arteriosclerotic lesions (Ross, Nature 362:801-9, 1993). Despite the fact that arteriosclerosis and its complications, such as heart attack, stroke, and peripheral vascular diseases, are the main causes of death in developed countries (Ross, Nature 362:801-9, 1993), little is known about the genes that regulate differentiation and phenotypic changes of vascular smooth muscle cells. This is mainly due to a lack of vascular smooth muscle cell-specific markers and precursor cells that can differentiate into vascular smooth muscle cells in vitro (Owens, Physiol. Rev. 75:487-517, 1995). Although E2A and Id proteins are expressed in vascular smooth muscle cells, MyoD and other tissue-specific HLH proteins have not been identified in these cells (Kemp et al., FEBS Lett. 368:81-6, 1995).
Summary of the Invention
The invention features substantially pure E2A-BP. "E2A-BP" is herein defined as a protein which (1) has at least 70% sequence identity with SEQ ID NO:16, (2) binds to both E12 and E47, under physiological conditions, and (3) inhibits binding of E47 homodimer to an E-box probe consisting of: 5'-GATCTACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGAGCT-3' (SEQ ID NO:3). As discussed further below, an E2A-BP polypeptide may be introduced into a vascular smooth muscle cell, in order to promote growth of the cell. The sequence of the E2A-BP of the invention is preferably at least 80% identical, more preferably at least 90% identical, and most preferably at least 95% identical to SEQ ID NO:16. It can have the sequence of a naturally occurring protein from, e.g., a human, mouse (SEQ ID NO:18), rat, hamster, chicken, goat, horse, pig, cow, monkey, or ape, or can have one or more amino acid deletions, additions, or substitutions. It may be identical to SEQ ID NO:16; a truncated form of E2A-BP, e.g., the sequence of SEQ ID NO:1; or a splice variant thereof. When recombinantly expressed in C2C12 myoblasts cultured in differentiation medium (see below), the polypeptide of the invention may, for example, decrease the level of expression of myogenin and/or myosin heavy chain, and decrease the formation of multinuclear myotubes by these cells. Typically, the polypeptide of the invention inhibits binding of E47-MyoD heterodimer to the E-box probe, but binds poorly or not at all to MyoD or to the E-box probe itself.
The invention also features an antibody (monoclonal or polyclonal), which specifically binds to E2A-BP and can be made using standard methods.
By a "substantially pure polypeptide" is meant a polypeptide which is separated from those components (e.g., proteins and other naturally-occurring organic molecules), which accompany it in its natural state. Thus, a protein which is chemically synthesized or produced in a cellular system different from the cell in which it naturally originates will be substantially free from its naturally associated components. Accordingly, substantially pure polypeptides include recombinant polypeptides derived from a eukaryote, but produced in E. coli or another prokaryote, or in a eukaryote other than that from which the polypeptide was originally derived. Typically, the polypeptide is substantially pure when it constitutes at least 60%, by weight, of the protein in the preparation. Preferably, the protein in the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight, E2A-BP polypeptide. A substantially pure E2A-BP polypeptide may be obtained by, for example, extraction from a natural source (e.g., vascular smooth muscle); expression of a recombinant nucleic acid encoding an E2A-BP polypeptide; in vitro translation; or chemical synthesis of the protein. Purity of the polypeptide can be measured by any appropriate method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.
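The E-box probe of SEQ ID NO:3 can be checked against the CANNTG consensus with a short script. The following is a minimal sketch, not part of the disclosed methods; the function name `find_eboxes` is an illustrative assumption, and the probe string is the sequence as printed above, joined across the line break.

```python
import re

# SEQ ID NO:3 as printed in the text, joined across the line break.
EBOX_PROBE = (
    "GATCTACACCTGCTGCCTCCCAACACCTGCTGCCTCCC"
    "AACACCTGCTGCCTCCCAACACCTGCTGAGCT"
)


def find_eboxes(seq):
    """Return (0-based position, motif) for every CANNTG match, overlaps included."""
    # The lookahead makes the regex report overlapping matches as well.
    pattern = re.compile(r"(?=(CA[ACGT]{2}TG))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    for position, motif in find_eboxes(EBOX_PROBE):
        print(position, motif)
```

On the probe as printed, the scan reports four CACCTG sites, i.e., four copies of the E-box.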
The invention also features isolated nucleic acids (DNA, RNA, or combinations or modifications thereof) that encode E2A-BP, as is defined above. For example, the isolated nucleic acid may contain the nucleotide sequence of SEQ ID NO:15, or a portion thereof, e.g., SEQ ID NO:2, or may contain the nucleotide sequence of SEQ ID NO:17. The isolated nucleic acid may hybridize under high stringency conditions to a probe having a nucleotide sequence complementary to SEQ ID NO:15 or SEQ ID NO:17, or a portion thereof. The invention also includes all degenerate variants of sequences which encode E2A-BP. As is discussed below, an E2A-BP-encoding nucleic acid molecule of the invention may be introduced into a vascular smooth muscle cell, using gene therapy methods, in order to promote growth of the cell.
By an "isolated nucleic acid" is meant a nucleic acid that is free of the genes which, in the naturally occurring genome of the organism from which the DNA is derived, flank the gene of interest. The term therefore includes, for example, a recombinant DNA which is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote at a site other than its natural site, or which exists as a separate molecule (e.g., a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences. It also includes a recombinant DNA which is part of a hybrid gene encoding an additional polypeptide sequence, such as a polypeptide sequence which facilitates purification (e.g., glutathione-S-transferase (GST)).
The invention also features an isolated nucleic acid (DNA, RNA, or combinations or modifications thereof) having at least 50% sequence identity (preferably at least 70%, more preferably at least 80%, more preferably at least 90%, and most preferably at least 95%) to SEQ ID NO:15, and encoding E2A-BP, as defined above. The percent sequence identity of one DNA to another is determined by standard means, e.g., by the Sequence Analysis Software Package developed by the Genetics Computer Group (University of Wisconsin Biotechnology Center, Madison, WI) (or an equivalent program; see, e.g., Ausubel et al., supra), employing the default parameters thereof.
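The identity thresholds recited above can be illustrated with a small calculation. The sketch below is not the Genetics Computer Group algorithm; it assumes the two sequences have already been aligned by some other program (gaps written as "-") and that identity is counted over all aligned columns, which is only one of several conventions. The function names are hypothetical.

```python
# Thresholds recited in the text, highest first.
THRESHOLDS = (95, 90, 80, 70, 50)


def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns carrying the same residue in both sequences.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity):
    """Return the highest recited threshold that the identity reaches, or None."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    identity = percent_identity("ACGTACGT", "ACGTACGA")
    print(identity, highest_threshold_met(identity))
```

A different denominator (for example, the length of the shorter sequence) would give different percentages, so the convention should be fixed before comparing values to the thresholds.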
Hybridization is carried out using standard techniques, such as those described in Ausubel et al. (Current Protocols in Molecular Biology, John Wiley & Sons, 1989). "High stringency" refers to nucleic acid hybridization and wash conditions characterized by high temperature and low salt concentration, e.g., wash conditions of 65°C at a salt concentration of approximately 0.1X SSC. "Low" to "moderate" stringency refers to DNA hybridization and wash conditions characterized by low temperature and high salt concentration, e.g., wash conditions of less than 60°C at a salt concentration of at least 1.0X SSC. For example, high stringency conditions may include hybridization at about 42°C, and about 50% formamide; a first wash at about 65°C, about 2X SSC, and 1% SDS; followed by a second wash at about 65°C and about 0.1X SSC. Lower stringency conditions suitable for detecting DNA sequences having about 50% sequence identity to an E2A-BP gene include, for example, hybridization at about 42°C in the absence of formamide; a first wash at about 42°C, about 6X SSC, and about 1% SDS; and a second wash at about 50°C, about 6X SSC, and about 1% SDS.
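The SSC strengths named in these wash conditions can be converted to salt concentrations. The sketch below assumes the conventional recipe in which 1X SSC is 150 mM NaCl and 15 mM trisodium citrate; that recipe is not stated in the text, and the function name is illustrative.

```python
# Assumed composition of 1X SSC (conventional recipe, not stated in the text).
NACL_MM_PER_X = 150.0
CITRATE_MM_PER_X = 15.0


def ssc_composition(strength_x):
    """Return NaCl, citrate, and total sodium (mM) for a given SSC strength."""
    nacl = NACL_MM_PER_X * strength_x
    citrate = CITRATE_MM_PER_X * strength_x
    # Trisodium citrate contributes three sodium ions per citrate.
    return {
        "NaCl_mM": nacl,
        "citrate_mM": citrate,
        "Na_total_mM": nacl + 3.0 * citrate,
    }


if __name__ == "__main__":
    for strength in (0.1, 1.0, 2.0, 6.0):
        print(strength, ssc_composition(strength))
```

Under this assumption the high stringency wash (0.1X SSC) contains sixty-fold less salt than the 6X SSC washes of the lower stringency protocol.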
Also featured in the invention is an isolated nucleic acid, such as a DNA, containing an E2A-BP promoter, e.g., a human E2A-BP promoter. By "promoter" is meant a DNA sequence sufficient to direct transcription of a coding sequence to which it is linked. Promoters may be constitutive or inducible, and may be coupled to other regulatory sequences or "elements" which render promoter-dependent gene expression cell-type specific, tissue-specific or inducible by external signals or agents; such elements may be located in the 5' or 3' region of the native gene, or within an intron. As is discussed further below, an E2A-BP promoter is capable of directing gene expression in vascular smooth muscle cells.
An E2A-BP promoter of the invention may be operably linked to the coding sequence of E2A-BP, or may be operably linked to a sequence which encodes a heterologous polypeptide, e.g., a growth inhibitor, such as retinoblastoma, an inhibitor of cyclins (e.g., p21, p57, p18, or p17), or a vasodilator, such as cNOS or a prostacyclin. The E2A-BP promoter may also be operably linked to a segment of DNA which is transcribed into an RNA that is antisense to an mRNA naturally produced in a vascular smooth muscle cell. For example, RNAs that are antisense to mRNAs encoding proteins such as E2A-BP, heparin-binding epidermal growth factor (HBEGF), and platelet-derived growth factor (PDGF), may be produced using the E2A-BP promoter. By "operably linked" is meant that a coding sequence and a regulatory sequence(s) (i.e., a promoter) are connected in such a way as to permit gene expression when the appropriate molecules (e.g., transcriptional activator proteins) are bound to the regulatory sequence(s).
The invention also features an isolated, single-stranded nucleic acid consisting of a nucleotide sequence which is antisense to at least a portion of a naturally occurring E2A-BP sense strand, e.g., E2A-BP mRNA (an "antisense oligonucleotide"). The antisense oligonucleotide may consist of DNA, RNA, or combinations or modifications thereof. For example, stabilized analogues of deoxyribonucleotides (see below) may be used. The antisense oligonucleotide may be produced by standard in vitro methods, e.g., by standard chemical synthesis (see, e.g., Ausubel et al., supra), and in general will be at least 10, preferably at least 14, more preferably at least 20 (e.g., at least 30), but usually no more than 50 nucleotides in length. The antisense oligonucleotides described above may be introduced into a cell, such as a vascular smooth muscle cell, in order to inhibit expression of E2A-BP in the cell.
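An antisense sequence is the reverse complement of the targeted stretch of the sense strand. The following minimal sketch derives a DNA antisense oligonucleotide and checks it against the length range given above (at least 10, usually no more than 50 nucleotides). The function names and the 15-nucleotide example target are hypothetical and chosen only for illustration.

```python
# Complement table for DNA or RNA input; output is written as DNA.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(sense):
    """Return the reverse complement (5'->3') of a sense-strand stretch."""
    sense = sense.upper()
    unexpected = set(sense) - set("ACGTU")
    if unexpected:
        raise ValueError("unexpected characters: %s" % sorted(unexpected))
    return sense.translate(_COMPLEMENT)[::-1]


def length_in_range(oligo, minimum=10, maximum=50):
    """Check the oligonucleotide length against the range recited in the text."""
    return minimum <= len(oligo) <= maximum


if __name__ == "__main__":
    target = "ATGGAGTCACACCGT"  # hypothetical 15-nt target, for illustration only
    oligo = antisense_dna(target)
    print(oligo, length_in_range(oligo))
```

Stabilized analogues, such as those mentioned above, change the chemistry of the backbone but not the base sequence computed here.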
As is mentioned above, antisense RNA oligonucleotides may also be produced in vivo by transcription of a nucleic acid introduced into a cell. Such a nucleic acid may have a sequence containing (a) an expression control sequence (i.e., a promoter, such as the E2A-BP promoter) which permits expression in a vascular smooth muscle cell (e.g., a human vascular smooth muscle cell) , operably linked to (b) a sequence which is transcribed into an RNA antisense to at least a 10, preferably at least a 14, or more preferably at least a 20 (e.g., at least a 30) nucleotide sequence of a target mRNA. Such an antisense RNA may also be formulated to be complementary to all of a specific mRNA sequence, e.g., an E2A-BP mRNA sequence.
The invention also features a method of inhibiting proliferation of a vascular smooth muscle cell by introducing into the cell a compound (e.g., a small organic compound, a peptide having a sequence corresponding to the E2A-BP binding site on E2A, a peptide having a sequence corresponding to the E2A binding site on E2A-BP, or an antibody specific for either E2A-BP or E2A) , which inhibits binding between E2A-BP and an E2A transcription factor, such as E12 or E47.
Such compounds may be identified using any of several screening methods. For example, in one screening method, E2A can be contacted with E2A-BP in the presence of a candidate compound, and the level of E2A-BP binding to E2A can then be determined. A decrease in the level of binding in the presence of the compound, compared to the level of binding in the absence of compound, may be used as an indication of the ability of the candidate compound to inhibit E2A-BP/E2A binding. In another screening method, a complex containing E2A and E2A-BP can be contacted with a candidate compound. Whether the candidate compound decreases the binding of E2A-BP to E2A in the complex can be determined as an indication of the ability of the candidate compound to inhibit E2A-BP/E2A binding. In yet another screening method, a cell that expresses E2A-BP and E2A can be cultured in the presence of a candidate compound. The level of expression of an E2A-regulated gene in the cell can then be determined. An increase in the level of expression of the gene in the presence of the compound, compared to the level of expression in the absence of the compound, is an indication of the ability of the candidate compound to inhibit E2A-BP/E2A binding. Additional variations of these methods are described below.
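The read-out of the first screening method, a decrease in E2A-BP/E2A binding in the presence of a candidate compound, can be expressed as a percent inhibition. The sketch below is illustrative only: the function names are hypothetical, the binding values are in arbitrary units from whatever assay is used, and the 50% cut-off is an assumed example, not a value taken from the text.

```python
def percent_inhibition(binding_without, binding_with):
    """Percent decrease in binding caused by the candidate compound."""
    if binding_without <= 0:
        raise ValueError("binding without compound must be positive")
    return 100.0 * (1.0 - binding_with / binding_without)


def is_candidate_inhibitor(binding_without, binding_with, min_inhibition=50.0):
    """Flag a compound whose inhibition reaches an (assumed) cut-off."""
    return percent_inhibition(binding_without, binding_with) >= min_inhibition


if __name__ == "__main__":
    # Hypothetical signals in arbitrary units.
    print(percent_inhibition(100.0, 25.0))
    print(is_candidate_inhibitor(100.0, 25.0))
```

For the cell-based method the comparison runs in the opposite direction: an increase in expression of the E2A-regulated gene in the presence of the compound is the indication of inhibition.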
The invention also features a genetically altered non-human mammal, e.g., a mouse, whose muscle cells produce altered levels or forms of a functional E2A-BP gene product. The levels of the E2A-BP gene product in the genetically altered mammal can be increased or decreased. By "genetically altered mammal" is meant a mammal in which the genomic DNA sequence has been manipulated in some way. The genetically altered mammal may be a knockout in which the endogenous E2A-BP sequences have been deleted or otherwise altered to decrease or change the pattern of expression. Alternatively, the genetically altered mammal may be transgenic, retaining endogenous E2A-BP coding sequences and having exogenous E2A-BP sequences as well. The transgenic mammal may express E2A-BP sequences from another species, may overexpress E2A-BP gene product, or may express E2A-BP in tissues and at developmental stages other than those in which E2A-BP is normally expressed.
The nucleated cells of a genetically altered mammal not producing a functional endogenous E2A-BP gene product may be engineered to encode human E2A-BP polypeptide, and to express functional human E2A-BP polypeptide, or, alternatively, E2A-BP from another heterologous species.
Other features and advantages of the invention will be apparent from the following detailed description, the drawings, and the claims.
Brief Description of the Drawings
Fig. 1A is a schematic representation of a partial amino acid sequence of human E2A-BP (SEQ ID NO:1) containing 768 amino acids. The sequences homologous to the two carboxypeptidase signature domains 1 and 2 are underlined. The nuclear localization signal, KRIR, is in bold. An acidic domain, rich in glutamic acid residues, is in italics. R404, the first amino acid of the original cDNA clone isolated by interaction cloning, is marked by a dagger.
Fig. 1B is a schematic representation of a comparison of two E2A-BP sequences (SEQ ID NO:4 and SEQ ID NO:5), which are the E2A-BP sequences that are homologous to the sequences of signature domains 1 (SEQ ID NO:6) and 2 (SEQ ID NO:7) in carboxypeptidase E (CPE). The putative zinc binding amino acids are in bold. The histidine and glutamic acid residues in signature 1 are conserved in E2A-BP, but the histidine in signature 2 is replaced by asparagine.
Fig. 2 is a schematic representation of a partial nucleotide sequence of a human cDNA (SEQ ID NO:2), which encodes E2A-BP, as well as the corresponding amino acid sequence (SEQ ID NO:1), in single letter code.
Figs. 3A and 3B are schematic representations of the nucleotide sequences of portions of a rat E2A-BP cDNA. Fig. 3A shows a portion of the sequence of the sense strand (SEQ ID NO:8) and Fig. 3B shows a portion of the sequence of the antisense strand (SEQ ID NO:9).
Fig. 4 is a schematic representation of the nucleotide sequence of a full-length human E2A-BP cDNA (SEQ ID NO:15).
Fig. 5 is a schematic representation of the amino acid sequence (SEQ ID NO:16) encoded by the full-length human cDNA (SEQ ID NO:15). The open reading frame of the full-length human E2A-BP contains 845 amino acids and has a predicted molecular weight of 96 kDa.
Fig. 6 is a schematic representation of the nucleotide sequence of the full-length murine cDNA (SEQ ID NO:17).
Fig. 7 is a schematic representation of the amino acid sequence (SEQ ID NO:18) encoded by the full-length murine cDNA (SEQ ID NO:17). The open reading frame of the full-length murine E2A-BP contains 1128 amino acids and has a predicted molecular weight of 128 kDa.
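The open reading frame lengths and predicted molecular weights given for Figs. 5 and 7 can be cross-checked with simple arithmetic. The sketch below computes the average mass per residue implied by each pair of figures; the function name is illustrative, and the calculation is a consistency check only, not the method by which the predicted weights were obtained.

```python
def implied_residue_mass_da(molecular_weight_kda, n_residues):
    """Average mass per amino acid residue (Da) implied by a stated weight."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return molecular_weight_kda * 1000.0 / n_residues


if __name__ == "__main__":
    # Values as stated in the figure descriptions.
    print(round(implied_residue_mass_da(96, 845), 1))    # human E2A-BP
    print(round(implied_residue_mass_da(128, 1128), 1))  # murine E2A-BP
```

Both pairs imply roughly 113 to 114 Da per residue, so the human and murine figures are consistent with each other.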
Detailed Description
Naturally occurring E2A-BP is a nuclear protein that is expressed in vascular smooth muscle cells. It binds to, and modulates the activities of, E2A transcription factors. The invention features nucleic acids that encode E2A-BP, as well as the E2A-BP polypeptides themselves. Also within the invention are the E2A-BP promoter, therapeutic methods employing E2A-BP nucleic acids or polypeptides, and methods for identifying compounds which modulate E2A-BP activity.
E2A-BP Nucleic Acids
The nucleotide sequences of a full length cDNA encoding human E2A-BP (SEQ ID NO:15) and of a partial human cDNA encoding human E2A-BP (SEQ ID NO:2) are shown in Figs. 4 and 2, respectively. In addition to nucleic acids having the sequence of SEQ ID NO:15 or SEQ ID NO:2, particularly the coding sequences, the invention includes all degenerate variants of the coding sequence of SEQ ID NO:15 or SEQ ID NO:2, as well as any isolated nucleic acid having a nucleotide sequence that encodes any other E2A-BP (as defined above). The nucleotide sequence of a full-length mouse E2A-BP cDNA (SEQ ID NO:17) is shown in Fig. 6. In addition to nucleic acids having the sequence of SEQ ID NO:17, particularly the coding sequence, the invention includes all degenerate variants of the coding sequence of SEQ ID NO:17.
The nucleotide sequence of a cDNA encoding a portion of rat E2A-BP (SEQ ID NO:8 and SEQ ID NO:9; see description of Figs. 3A and 3B, above) is shown in Figs. 3A and 3B. As is discussed below, the rat cDNA was isolated by PCR using primers derived from the human E2A-BP cDNA sequence. Nucleic acids encoding naturally occurring E2A-BP polypeptides from species in addition to human and rat are included in the invention, and may be obtained using standard methods, such as the PCR method used to isolate the rat clone.
The invention also includes nucleic acids which encode E2A-BP polypeptides having substitutions and/or deletions of single and/or multiple amino acids. Such nucleic acids can be generated using standard methods, and the polypeptides that they encode can easily be screened for E2A-BP activity, as described below.
Plasmids encoding full-length human E2A-BP or mouse E2A-BP cDNAs, as well as E. coli strains containing the above plasmids, were deposited with the American Type Culture Collection (ATCC, Rockville, Maryland), under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purpose of Patent Procedure on March 12, 1997. The E. coli strains are INVαF' (Invitrogen), and the cDNAs are cloned in the vector pCR2.1 (Invitrogen). The Applicants' assignee, The President and Fellows of Harvard College, acknowledges its duty to replace the deposit, should the depository be unable to furnish a sample when requested, due to the condition of the deposit, before the end of the term of a patent issued hereon, and its responsibility to notify the ATCC of the issuance of such a patent, at which time the deposit will be made available to the public. Prior to that time, the deposit will be made available to the Commissioner of Patents and Trademarks under the terms of 37 C.F.R. §1.14 and 35 U.S.C. §112.
As is discussed below, E2A-BP nucleic acids may be used in gene therapy and antisense methods for treating vascular diseases. In addition, E2A-BP nucleic acids may be used in methods for producing E2A-BP polypeptides.
E2A-BP Polypeptides
The amino acid sequences of full-length human E2A-BP (SEQ ID NO:16) and a partial human E2A-BP amino acid sequence (SEQ ID NO:1) are shown in Figs. 5 and 1A, respectively. The amino acid sequence of full-length mouse E2A-BP (SEQ ID NO:18) is shown in Fig. 7. In addition to E2A-BP polypeptides having the sequence of SEQ ID NO:1, SEQ ID NO:16, or SEQ ID NO:18, as discussed above, polypeptides which (1) have at least 70% sequence identity with SEQ ID NO:1, SEQ ID NO:16, or SEQ ID NO:18; (2) bind to both E12 and E47, under physiological conditions; and (3) inhibit binding of E47 homodimer to an E-box probe consisting of: 5'-GATCTACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGCCTCCCAACACCTGCTGAGCT-3' (SEQ ID NO:3), are included in the invention.
Polypeptides according to the invention may be produced by transformation of a suitable host cell with all or part of an E2A-BP-encoding cDNA fragment (e.g., the cDNA described above) in a suitable expression vehicle. Those skilled in the field of molecular biology will understand that any of a wide variety of expression systems may be used to produce the recombinant E2A-BP polypeptide. The precise host cell used is not critical to the invention. The E2A-BP polypeptide may be produced in a prokaryotic host (e.g., E. coli) or in a eukaryotic host (e.g., yeast, such as Saccharomyces cerevisiae; insect cells, such as Sf-9 cells; or mammalian cells, such as COS-1, NIH-3T3, and JEG3 cells). Such cells are available from a wide range of sources, e.g., the ATCC. (Also see, e.g., Ausubel et al., supra.) The method of transfection and the choice of expression vehicle will depend on the host system selected. Standard transformation and transfection methods are described, e.g., in Ausubel et al. (supra); expression vehicles may be chosen from, e.g., those described in Cloning Vectors: A Laboratory Manual (P.H. Pouwels et al., 1985, Supp. 1987) and by Ausubel et al., supra.
One example of an expression system which may be used is a mouse 3T3 fibroblast host cell transfected with a pMAMneo expression vector (Clontech, Palo Alto, CA). pMAMneo provides: an RSV-LTR enhancer linked to a dexamethasone-inducible MMTV-LTR promoter, an SV40 origin of replication, which allows replication in mammalian systems, a selectable neomycin gene, and SV40 splicing and polyadenylation sites. DNA encoding an E2A-BP polypeptide can be inserted into the pMAMneo vector in an orientation designed to allow expression. The recombinant E2A-BP could then be isolated as described below. Other host cells which may be used in conjunction with pMAMneo, or similar expression systems, include COS cells and CHO cells (ATCC Accession Nos. CRL 1650 and CCL 61, respectively).
E2A-BP polypeptides may also be produced in stably-transfected mammalian cell lines. A number of vectors suitable for stable transfection of mammalian cells are available to the public, e.g., see Pouwels et al. (supra); methods for constructing such cell lines are well known in the art (see, e.g., Ausubel et al., supra). In one example, cDNA encoding E2A-BP is cloned into an expression vector which includes the dihydrofolate reductase (DHFR) gene. Integration of the plasmid and, therefore, the E2A-BP-encoding gene into the host cell chromosome is selected for by inclusion of 0.01-300 μM methotrexate in the cell culture medium (see, e.g., Ausubel et al., supra). This dominant selection can be accomplished in most cell types. Recombinant protein expression can be increased by DHFR-mediated amplification of the transfected gene. Methods for selecting cell lines bearing gene amplifications are described in Ausubel et al. (supra); such methods generally involve extended culture in medium containing gradually increasing levels of methotrexate. DHFR-containing expression vectors commonly used for this purpose include pCVSEII-DHFR and pAdD26SV(A), which are described in Ausubel et al. (supra). Any of the host cells described above or a DHFR-deficient CHO cell line (e.g., CHO DHFR⁻ cells, ATCC Accession No. CRL 9096) are among the host cells that may be used for DHFR selection of a stably-transfected cell line or DHFR-mediated gene amplification.
Once an E2A-BP polypeptide is expressed, it may be isolated using standard methods, such as affinity chromatography. For example, E2A or an antibody against E2A-BP may be attached to a column and used to isolate the E2A-BP polypeptide. Lysis and fractionation of E2A-BP-harboring cells prior to affinity chromatography may be performed by standard methods (see, e.g., Ausubel et al., supra). Once isolated, the recombinant protein can, if desired, be further purified, e.g., by high performance liquid chromatography (see, e.g., Fisher, Laboratory Techniques In Biochemistry And Molecular Biology, eds., Work and Burdon, Elsevier, 1980). Fragments of E2A-BP polypeptides can also be produced by chemical synthesis (e.g., by the methods described in Solid Phase Peptide Synthesis, 2nd ed., 1984, The Pierce Chemical Co., Rockford, IL).
E2A-BP polypeptides may be used in therapeutic methods to promote vascular smooth muscle cell growth in, e.g., wound healing (see below). E2A-BP polypeptides, or E2A-BP polypeptide fragments (e.g., ΔE2A-BP), may also be used in methods for generating antibodies to E2A-BP, which may be used, e.g., in E2A-BP purification methods.
The E2A-BP Promoter
The E2A-BP promoter, which directs gene expression in vascular smooth muscle cells, is also included in the invention. The E2A-BP promoter can be identified using standard methods of molecular biology (see, e.g., Ausubel et al., supra). For example, an E2A-BP cDNA probe can be used to isolate a genomic clone containing the E2A-BP gene from a genomic library. The promoter region can be identified in the genomic clone using standard methods, such as primer extension and/or S1 nuclease mapping. Further characterization of the E2A-BP promoter may be achieved by comparing the sequence located 5' to the E2A-BP coding sequence with known promoter element sequences. In addition, standard promoter deletion analysis may be carried out. A construct that includes E2A-BP promoter sequences which confer vascular smooth muscle cell-specific expression on a reporter gene to which the sequences are operably linked can be progressively deleted, by 5', 3', and/or nested deletions, until the ability of the promoter to induce transcription of the reporter gene in transfected cells is reduced. In addition to the 5' region of the E2A-BP gene, other sequences, such as 3' untranslated and intronic sequences, may be analyzed for effects on promoter activity.
As is discussed in further detail below, the E2A-BP promoter may be used in gene therapy methods to direct vascular smooth muscle cell-specific expression of the E2A-BP gene, genes encoding heterologous polypeptides, or DNA sequences encoding antisense transcripts (e.g., transcripts antisense to E2A-BP RNA, or to RNAs encoding growth promoting proteins, such as heparin-binding epidermal growth factor (HBEGF) and platelet-derived growth factor (PDGF)). Also, as is discussed below, antisense inhibition of E2A-BP expression may be achieved by introduction of antisense oligonucleotides directly into vascular smooth muscle cells.
Gene Therapy
The nucleic acids of the invention can be used in gene therapy methods for treatment of vascular diseases, such as arteriosclerosis. A vector containing the E2A-BP gene, including the promoter and coding sequence, can be administered to a patient for use in expressing E2A-BP in a vascular smooth muscle cell. Alternatively, the E2A-BP promoter, operably linked to the coding sequence of a heterologous gene, i.e., a gene which encodes a protein other than E2A-BP, can be used to express the heterologous gene in vascular smooth muscle cells. An E2A-BP promoter sequence directs transcription of DNA to which it is linked, preferentially in vascular smooth muscle cells compared to non-vascular smooth muscle cells.
Heterologous genes whose expression may be regulated by E2A-BP promoter sequences in vascular smooth muscle cells include, e.g., sequences encoding t-PA (Pennica et al., 1982, Nature 301:214); cyclin inhibitors, such as p21, p57, p18, and p17 (El-Deiry et al., 1993, Cell 75:817-823); nitric oxide synthase (Bredt et al., 1990, Nature 347:768-770); prostacyclins; and retinoblastoma. In addition, thrombolytic agents may be expressed under the control of the E2A-BP promoter sequences for expression by vascular smooth muscle cells in blood vessels, e.g., vessels occluded by aberrant blood clots. Other heterologous proteins, e.g., proteins which inhibit smooth muscle cell proliferation, such as interferon-γ and atrial natriuretic polypeptide, may be specifically expressed in vascular smooth muscle cells to ensure the delivery of these therapeutic peptides to an arteriosclerotic lesion or an area at risk of developing an arteriosclerotic lesion, e.g., an injured blood vessel. The E2A-BP promoter sequences of the invention may also be used in gene therapy to promote angiogenesis to treat diseases such as peripheral vascular disease or coronary artery disease (Isner et al., 1995, Circulation 91:2687-2692). For example, the DNA of the invention can be operably linked to sequences encoding cellular growth factors which promote angiogenesis, e.g., VEGF, acidic fibroblast growth factor, or basic fibroblast growth factor.
Antisense Therapy
The E2A-BP nucleic acids of the invention may also be used in methods for antisense treatment. Antisense treatment may be carried out by administering to a mammal, such as a human, DNA containing the E2A-BP promoter operably linked to a DNA sequence (an antisense template), which is transcribed into an antisense RNA. Alternatively, as mentioned above, antisense oligonucleotides may be introduced directly into vascular smooth muscle cells. The antisense oligonucleotide may be a short nucleotide sequence (generally at least 10, preferably at least 14, more preferably at least 20 (e.g., at least 30), and up to 100 or more nucleotides) formulated to be complementary to a portion or all of a specific mRNA sequence. Standard methods relating to antisense technology have been described (see, e.g., Melani et al., Cancer Res. 51:2897-2901, 1991). Following transcription of a DNA sequence into an antisense RNA, the antisense RNA binds to its target nucleic acid molecule, such as an mRNA molecule, thereby inhibiting expression of the target nucleic acid molecule. For example, an antisense sequence complementary to a portion or all of the E2A-BP mRNA could be used to inhibit the expression of E2A-BP, thereby promoting differentiation. Such antisense therapy may be used to treat conditions characterized by proliferation of vascular smooth muscle cells, such as arteriosclerosis, e.g., restenosis in response to angioplasty. The antisense therapy of the invention may also be used to treat cancer by inhibiting angiogenesis at the site of a solid tumor, as well as other pathogenic conditions which are caused by or exacerbated by angiogenesis, e.g., inflammatory diseases such as rheumatoid arthritis and diabetic retinopathy.
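The complementarity requirement described above can be illustrated with a minimal Python sketch that derives an antisense DNA oligonucleotide as the reverse complement of a chosen window of a target mRNA. The mRNA fragment, the function name, and the window coordinates are hypothetical and are not taken from any sequence of the invention; the length check merely mirrors the general lower bound of 10 nucleotides stated above.

```python
# Illustrative sketch only: the sequence and all names below are hypothetical.
# Maps each RNA base of the target to the complementary DNA base of the oligo.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return a DNA oligo (written 5'->3') complementary to mrna[start:start+length]."""
    if length < 10:
        raise ValueError("antisense oligonucleotides are generally at least 10 nucleotides")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    # Reverse the target and complement each base so that the oligo
    # pairs with the mRNA in antiparallel orientation.
    return "".join(COMPLEMENT[base] for base in reversed(target))


example_mrna = "AUGGCUGAACCUUGGACGUACGGAUCC"  # hypothetical 27-nt fragment
oligo = antisense_oligo(example_mrna, start=0, length=14)
print(oligo)
```

In practice the target window would be chosen from the actual mRNA of interest, and the oligonucleotide could carry the stabilizing modifications described elsewhere in this specification; neither aspect is modeled here.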
The expression of other vascular smooth muscle cell proteins may also be inhibited in a similar manner. For example, the promoter of the invention can be operably linked to antisense templates which are transcribed into antisense RNA capable of inhibiting the expression of growth promoting proteins, such as HBEGF and PDGF.
The antisense oligonucleotides of the invention may be provided exogenously to a target vascular smooth muscle cell. Alternatively, the antisense oligonucleotide may be produced within the cell by transcription of a nucleic acid molecule including a promoter sequence operably linked to a sequence encoding the antisense oligonucleotide. In this method, the nucleic acid molecule is contained within a non-replicating linear or circular DNA or RNA molecule, is contained within an autonomously replicating plasmid or viral vector, or is integrated into the host genome. Any vector that can transfect a vascular smooth muscle cell may be used in this method of the invention. Preferred vectors are viral vectors, including those derived from replication-defective hepatitis viruses (e.g., HBV and HCV), retroviruses (see, e.g., WO 89/07136; Rosenberg et al., N. Eng. J. Med. 323(9):570-578, 1990), adenovirus (see, e.g., Morsey et al., J. Cell. Biochem., Supp. 17E, 1993), adeno-associated virus (Kotin et al., Proc. Natl. Acad. Sci. USA 87:2211-2215, 1990), replication-defective herpes simplex viruses (HSV; Lu et al., Abstract, page 66, Abstracts of the Meeting on Gene Therapy, Sept. 22-26, 1992, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York), and any modified versions of these vectors. Methods for constructing expression vectors are well known in the art (see, e.g., Sambrook et al., supra). Additional suitable gene delivery systems include liposomes, receptor-mediated delivery systems, and naked DNA. The invention also includes any other methods which accomplish in vivo transfer of nucleic acids into eukaryotic cells. For example, the nucleic acids may be packaged into liposomes, receptor-mediated delivery systems, non-viral nucleic acid-based vectors, erythrocyte ghosts, or microspheres (e.g., microparticles; see, e.g., U.S. Patent No. 4,789,734; U.S. Patent No. 4,925,673; U.S. Patent No. 3,625,214; Gregoriadis, Drug Carriers in Biology and Medicine, pp. 287-341 (Academic Press, 1979)). Alternatively, naked DNA may be administered. Delivery of nucleic acids to a specific site in the body for gene therapy or antisense therapy may also be accomplished using a biolistic delivery system, such as that described by Williams et al., 1991, Proc. Natl. Acad. Sci. USA 88:2726-2729. Further, as is mentioned above, delivery of antisense oligonucleotides may be accomplished by direct injection of the oligonucleotides into target tissues, for example, in a calcium phosphate precipitate or coupled with lipids.
The antisense oligonucleotides of the invention may consist of DNA, RNA, or any modifications or combinations thereof. As an example of the modifications that the oligonucleotides may contain, inter-nucleotide linkages other than phosphodiester bonds, such as phosphorothioate, methylphosphonate, methylphosphodiester, phosphorodithioate, phosphoramidate, phosphotriester, or phosphate ester linkages (Uhlman et al., Chem. Rev. 90(4):544-584, 1990; Anticancer Research 10:1169, 1990), may be present in the oligonucleotides, resulting in their increased stability. Oligonucleotide stability may also be increased by incorporating 3'-deoxythymidine or 2'-substituted nucleotides (substituted with, e.g., alkyl groups) into the oligonucleotides during synthesis, by providing the oligonucleotides as phenylisourea derivatives, or by having other molecules, such as aminoacridine or poly-lysine, linked to the 3' ends of the oligonucleotides (see, e.g., Anticancer Research 10:1169-1182, 1990). Modifications of the RNA and/or DNA nucleotides which make up the oligonucleotides of the invention may be present throughout the oligonucleotide, or in selected regions of the oligonucleotide, e.g., in the 5' and/or 3' ends. The antisense oligonucleotides may also be modified so as to increase their ability to penetrate the target tissue by, e.g., coupling the oligonucleotides to lipophilic compounds. The antisense oligonucleotides of the invention can be made by any method known in the art, including standard chemical synthesis, ligation of constituent oligonucleotides, and transcription of DNA encoding the oligonucleotides, as is mentioned above.
E2A-BP is naturally expressed in vascular smooth muscle cells, which are, therefore, the preferred cellular targets for the antisense oligonucleotides of the invention. Targeting of antisense oligonucleotides to vascular smooth muscle cells may be achieved by coupling the oligonucleotides to ligands of vascular smooth muscle cell receptors. Similarly, oligonucleotides may be targeted to vascular smooth muscle cells by being conjugated to monoclonal antibodies that specifically bind to vascular smooth muscle-specific cell surface proteins.
The antisense oligonucleotides of the invention, and the recombinant vectors containing nucleic acid sequences which are transcribed into such antisense oligonucleotides, may be used in therapeutic compositions for treating, e.g., vascular diseases. The therapeutic applications of antisense oligonucleotides in general are described, e.g., in the following review articles: Le Doan et al., Bull. Cancer 76:849-852, 1989; Dolnick, Biochem. Pharmacol. 40:671-675, 1990; Crooke, Annu. Rev. Pharmacol. Toxicol. 32:329-376, 1992. The therapeutic compositions of the invention may be used alone or in admixture, or in chemical combination, with one or more materials, including other antisense oligonucleotides or recombinant vectors, materials that increase the biological stability of the oligonucleotides or the recombinant vectors, or materials that increase the ability of the therapeutic compositions to penetrate vascular smooth muscle cells selectively. The therapeutic compositions of the invention may be administered in pharmaceutically acceptable carriers (e.g., physiological saline), which are selected on the basis of the mode and route of administration, and standard pharmaceutical practice. Suitable pharmaceutical carriers, as well as pharmaceutical necessities for use in pharmaceutical formulations, are described in Remington's Pharmaceutical Sciences, a standard reference text in this field, and in the USP/NF.
A therapeutically effective amount is an amount of the antisense molecule of the invention which is capable of producing a medically desirable result in a treated animal. As is well known in the medical arts, dosage for any one patient depends upon many factors, including the patient's size, body surface area, age, the particular compound to be administered, sex, time and route of administration, general health, and other drugs being administered concurrently. Dosages will vary, but a preferred dosage for intravenous administration of DNA is approximately 10⁶ to 10²² copies of the DNA molecule. The compositions of the invention may be administered locally or systemically. Administration will generally be parenteral, e.g., intravenous. As mentioned above, DNA may also be administered directly to the target site, e.g., by biolistic delivery to an internal or external target site or by catheter to a site in an artery.
Identification and Use of Compounds Which Modulate E2A-BP/E2A Binding
Modulation of the growth of vascular smooth muscle cells can be achieved by contacting the cells with a compound that blocks or enhances E2A-BP/E2A binding. Such a compound can be identified by methods ranging from rational drug design to screening of random compounds. The latter method is preferable, as simple and rapid assays for testing such compounds are available. Small organic molecules are desirable candidate compounds for this analysis, as frequently these molecules are capable of passing through the plasma membrane so that they can potentially modulate E2A-BP/E2A binding within the cell. However, one could also use antibodies specific for either E2A-BP or E2A, or alternatively peptides (or peptide mimetics) (a) derived from the binding site on E2A-BP, which would block by occupying the binding site on E2A; or (b) derived from the binding site on E2A, which would block by occupying the binding site on E2A-BP.
The screening of compounds for the ability to modulate E2A-BP/E2A binding may be carried out using in vitro biochemical assays, cell culture assays, or animal model systems.
For example, in a biochemical assay, labeled E2A-BP (e.g., E2A-BP labeled with a fluorochrome or a radioisotope) is applied to a column containing immobilized E2A. A candidate compound is applied to the column before, after, or simultaneously with the labeled E2A-BP, and the amount of labeled protein bound to the column in the presence of the compound is determined by conventional methods. A compound tests positive for inhibiting E2A-BP/E2A binding if the amount of labeled protein bound in the presence of the compound is lower than the amount bound in its absence. Conversely, a compound tests positive for enhancing E2A-BP/E2A binding if the amount of labeled protein bound in the presence of the compound is greater than the amount bound in its absence. In a variation of the above-described biochemical assay, binding of labeled E2A to immobilized E2A-BP is measured. In all of these methods, large numbers of compounds can be screened very rapidly and easily.
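The comparison that defines a positive result in the biochemical assay can be sketched in Python as follows. This is an illustration under stated assumptions: the function name, the example counts, and in particular the 10% tolerance band used to absorb measurement noise are hypothetical and are not specified in the assay description, which states only that bound label is lower or greater than in the absence of the compound.

```python
# Illustrative sketch only: names, example values, and the tolerance band are assumptions.
def classify_compound(bound_with: float, bound_without: float,
                      tolerance: float = 0.10) -> str:
    """Compare labeled protein bound with and without a candidate compound.

    bound_with    -- label retained on the column in the presence of the compound
    bound_without -- label retained on the column in its absence (control)
    tolerance     -- hypothetical fractional band treated as "no effect"
    """
    if bound_without <= 0:
        raise ValueError("control binding must be positive")
    ratio = bound_with / bound_without
    if ratio < 1.0 - tolerance:
        return "inhibits"   # less label bound than in the control
    if ratio > 1.0 + tolerance:
        return "enhances"   # more label bound than in the control
    return "no effect"


# Hypothetical counts (e.g., cpm of radiolabeled protein retained on the column).
print(classify_compound(400.0, 1000.0))
print(classify_compound(1500.0, 1000.0))
print(classify_compound(1020.0, 1000.0))
```

The same comparison applies unchanged to the variant assay in which labeled E2A is bound to immobilized E2A-BP.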
As mentioned above, candidate compounds may also be screened using cell culture assays. Cells expressing E2A-BP and E2A, either naturally or after introduction into the cells of genes encoding E2A and/or E2A-BP (e.g., C2C12 cells transfected with E2A-BP, see below), are cultured in the presence of the candidate compound. The level of E2A-BP/E2A binding in the cell may be inferred using any of several assays. For example, levels of expression of E2A-regulated genes (e.g., genes encoding myogenin, myosin heavy chain, or myosin light chain) in the cell may be determined using, e.g., Northern blot analysis, RNase protection analysis, immunohistochemistry, or other standard methods (see below). In addition, the ability of a candidate compound to modulate E2A-BP/E2A binding may be evaluated by determining the effect of the candidate compound on cell differentiation (see below) or cell growth, which may be measured by, e.g., monitoring uptake of [3H]thymidine. A decrease in the level of expression of a gene that is normally upregulated by E2A, and/or detection of increased cell proliferation, indicates that the compound enhances E2A-BP/E2A binding. Conversely, an increase in expression of a gene that is normally upregulated by E2A, and/or detection of a decrease in cell proliferation, indicates that the compound blocks E2A-BP/E2A binding.
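The inference rules stated in the preceding paragraph can be written out as a short Python sketch. The function name and the use of signed fractional changes relative to an untreated control are assumptions made for illustration; the handling of conflicting or absent signals as "inconclusive" is likewise an assumption, since the text does not address that case.

```python
# Illustrative sketch only: names and the "inconclusive" category are assumptions.
def infer_binding_effect(expression_change: float, proliferation_change: float) -> str:
    """Infer the effect of a compound on E2A-BP/E2A binding from cell culture readouts.

    expression_change    -- signed change in expression of a gene normally
                            upregulated by E2A (e.g., myogenin), versus control
    proliferation_change -- signed change in proliferation (e.g., thymidine
                            uptake), versus control
    """
    # Decreased E2A-dependent expression and/or increased proliferation
    # point toward enhanced binding; the opposite changes point toward blocking.
    suggests_enhance = expression_change < 0 or proliferation_change > 0
    suggests_block = expression_change > 0 or proliferation_change < 0
    if suggests_enhance and not suggests_block:
        return "enhances E2A-BP/E2A binding"
    if suggests_block and not suggests_enhance:
        return "blocks E2A-BP/E2A binding"
    return "inconclusive"


print(infer_binding_effect(-0.5, 0.3))
print(infer_binding_effect(0.4, -0.2))
```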
Compounds identified as having the desired effect, either enhancing or inhibiting E2A-BP/E2A binding, can be tested further in appropriate animal models of vascular smooth muscle cell growth, which are known to those skilled in the art. One example of such an animal model would be an animal subjected to vascular trauma, such as balloon angioplasty.
Compounds found to inhibit E2A-BP/E2A binding may be used in methods for inhibiting growth of vascular smooth muscle cells in order to, e.g., prevent or treat arteriosclerosis or angiogenesis. Compounds found to enhance E2A-BP/E2A binding may be used in methods to promote proliferation of vascular smooth muscle cells in order to, e.g., promote angiogenesis in wound healing (e.g., healing of broken bones, burns, diabetic ulcers, or traumatic or surgical wounds) and organ transplantation. In addition, such compounds may be used to treat peripheral vascular disease, cerebral vascular disease, hypoxic tissue damage (e.g., hypoxic damage to heart tissue), or coronary vascular disease. Compounds identified using the above-described methods may also be used to treat patients who have, or have had, transient ischemic attacks, vascular graft surgery, balloon angioplasty, frostbite, gangrene, or poor circulation. The therapeutic compounds identified using the methods of the invention may be administered to a patient by any appropriate method for the particular compound, e.g., orally, intravenously, parenterally, transdermally, transmucosally, by inhalation, or by surgery or implantation at or near the site where the effect of the compound is desired (e.g., with the compound being in the form of a solid or semi-solid biologically compatible and resorbable matrix). For example, a salve or transdermal patch that can be directly applied to the skin so that a sufficient quantity of the compound is absorbed to increase vascularization locally may be used. This method would apply most generally to wounds on the skin. Salves containing the compound can be applied topically to induce new blood vessel formation locally, thereby improving oxygenation of the area and hastening wound healing.
Therapeutic doses are determined specifically for each compound, most being administered within the range of 0.001 to 100.0 mg/kg body weight, or within a range that is clinically determined to be appropriate by one skilled in the art.
Genetically Altered E2A-BP Mammals
Genetically altered mammals can be created which have cells that express altered levels of the endogenous functional E2A-BP gene product. The genetically altered mammals may in addition express heterologous E2A-BP gene product derived from a second animal. For example, knock-out mice which do not express the mouse homologue of E2A-BP (SEQ ID NO:17) can be generated. Mice from this line can be manipulated further to express the human homologue of E2A-BP (SEQ ID NO:2 and SEQ ID NO:15), e.g., by introduction of a transgene encoding the human sequence. Alternatively, part or all of the endogenous mouse genomic sequences can be replaced with the corresponding human E2A-BP sequence by homologous recombination. Expression of the human homologue may be directed to particular tissues or cell types, e.g., skeletal muscle cells and vascular smooth muscle cells, through the use of tissue- or cell type-specific regulatory elements. Many such elements are known to skilled artisans. Such transgenic mammals represent model systems for the study of conditions or diseases that are caused, exacerbated, or ameliorated by the E2A-BP protein. The cells of a genetically altered mammal may bear genetic information received, directly or indirectly, by deliberate genetic manipulation at the subcellular level, such as DNA received by microinjection or by infection with recombinant virus. Thus, mammals of the invention are those with one or more cells that contain a recombinant DNA molecule. It is highly preferred that this molecule becomes stably integrated into the mammal's chromosomes, but the use of DNA sequences that replicate extrachromosomally, such as might be engineered into yeast artificial chromosomes, is also contemplated. Preferably, the mammal is one in which heterologous genetic information has been taken up and integrated into a germ line cell. These mammals, i.e., transgenic mammals, typically have the ability to transfer the genetic information to their offspring.
If the offspring in fact possess some or all of the genetic information delivered to the parent animal, then they, too, are transgenic mammals.
A genetically altered mammal may be any mammal except Homo sapiens. Farm animals (pigs, goats, sheep, cows, horses, rabbits, and the like), rodents (such as rats, guinea pigs, and mice), and domestic animals (for example, dogs and cats) are within the scope of the present invention. Preferably, the genetically altered mammals of the present invention are produced by introducing DNA encoding E2A-BP of the invention into single-celled embryos so that the DNA is stably integrated into the DNA of germ-line cells in the mature mammal, and inherited in a Mendelian fashion. It has been possible for many years to introduce heterologous DNA into fertilized mammalian ova. For instance, totipotent or pluripotent stem cells can be transfected by microinjection, calcium phosphate-mediated precipitation, liposome fusion, retroviral infection, or other means. The transfected cells are then introduced into an embryo (for example, into the cavity of a blastula) and implanted into a pseudopregnant female that is capable of carrying the embryos to term. Alternatively, the transfected, fertilized ova can be implanted directly into the pseudopregnant female. In a preferred method, the appropriate DNA is injected into the pronucleus of embryos, at the single cell stage, and the embryos are allowed to complete their development within a pseudopregnant female. These techniques are well known. Reviews of standard laboratory procedures for microinjection of heterologous DNAs into fertilized mammalian ova include Hogan et al., "Manipulating the Mouse Embryo" (Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1986); Krimpenfort et al., Bio/Technology 9:86, 1991; Palmiter et al., Cell 41:343, 1985; Kraemer et al., "Genetic Manipulation of the Early Mammalian Embryo" (Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1985); Hammer et al., Nature 315:680, 1985; Purcel et al., Science 244:1281, 1986; Wagner et al., U.S. Patent 5,175,385; and Krimpenfort et al., U.S. Patent No. 5,175,384 (all of these publications are hereby incorporated by reference).
Experimental Data
Experimental Procedures
Cloning and Sequencing of E2A-BP
A recombinant E47 fusion protein (N3-SH[ALA]), containing the bHLH domain of hamster shPan-1 (amino acids 509-646, with mutations R551A, V552L, and R553A) with a heart muscle kinase recognition sequence and the FLAG epitope, was expressed and purified as described (Blanar et al., Proc. Natl. Acad. Sci. USA 92:5870-4, 1995; Blanar and Rutter, Science 256:1014-8, 1992).
N3-SH[ALA] was phosphorylated by heart muscle kinase in the presence of γ-32P-ATP and used to screen a human aorta λgt11 cDNA expression library (Clontech) by interaction cloning (Blanar et al., Proc. Natl. Acad. Sci. USA 92:5870-4, 1995; Blanar and Rutter, Science 256:1014-8, 1992). A 1450-bp cDNA clone (ΔE2A-BP) obtained from interaction cloning was radiolabeled by random priming and used to isolate a 2786-bp cDNA clone (E2A-BP) from the same human aorta λgt11 cDNA library. After restriction mapping, the appropriate fragments were subcloned into plasmids (pBluescript SK, Stratagene). DNA sequencing was performed using the dideoxy chain termination method and T7 DNA polymerase. Sequencing templates used were (1) alkaline-denatured double-stranded DNA or (2) single-stranded DNA generated by in vitro excision by helper phage virus (Stratagene) (Sambrook et al., supra). Both strands of the 2786-bp human E2A-BP cDNA were sequenced at least once. Using reverse transcription PCR and primers encoding human E2A-BP sequences (forward, 5'-CCTGGATGGAGAAGAACCCCTT-3' (SEQ ID NO:10); reverse, 5'-CGAGCCAGGATGAAGTTGTTGACATGA-3' (SEQ ID NO:11)), a 705-bp rat E2A-BP cDNA fragment was amplified from rat aortic smooth muscle RNA as described (Lee et al., J. Biol. Chem. 266:16188-92, 1991). This rat E2A-BP cDNA was subcloned into plasmid pCR™II (Invitrogen) and used as a probe for in situ hybridization (see below). The authenticity of the rat E2A-BP was confirmed by sequencing. The sequence of a portion of a rat E2A-BP cDNA is shown in Figs. 3A (SEQ ID NO:8; a portion of the sequence of the sense strand) and 3B (SEQ ID NO:9; a portion of the sequence of the antisense strand).
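As a simple check on the two RT-PCR primers quoted above (SEQ ID NO:10 and SEQ ID NO:11), the following Python sketch computes each primer's length and G+C content. The primer sequences are taken from the text; the function and variable names are hypothetical, and no melting temperature is estimated because the text gives no reaction conditions.

```python
# Primer sequences as quoted in the text; helper names are hypothetical.
FORWARD_PRIMER = "CCTGGATGGAGAAGAACCCCTT"        # SEQ ID NO:10
REVERSE_PRIMER = "CGAGCCAGGATGAAGTTGTTGACATGA"   # SEQ ID NO:11


def primer_stats(primer: str) -> tuple[int, int, float]:
    """Return (length, number of G or C bases, G+C fraction) for a primer."""
    seq = primer.upper()
    gc_count = sum(1 for base in seq if base in "GC")
    return len(seq), gc_count, gc_count / len(seq)


for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
    length, gc_count, gc_fraction = primer_stats(primer)
    print(f"{name}: {length} nt, {gc_count} G/C bases, GC fraction {gc_fraction:.2f}")
```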
A full-length human E2A-BP cDNA, corresponding to the 4 kb E2A-BP transcript (see below) was isolated by subjecting human aortic RNA to three successive rounds of amplification using 5' RACE. 5' RACE reagents were obtained from Gibco. The cDNA product was subcloned into pCR2.1 (Invitrogen).
The sequence of the full-length human cDNA fragment is shown in Fig. 4 (SEQ ID NO:15). The cDNA has an open-reading frame encoding 845 amino acids beginning at the ATG highlighted in bold and terminating at the TGA highlighted in bold. The predicted amino acid sequence is shown in Fig. 5 (SEQ ID NO:16).
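The relationship between an open reading frame and its encoded length can be sketched in Python as follows. The scanner reads in frame from the first ATG to the first stop codon, and a second helper gives the nucleotide span implied by a stated number of amino acids (three nucleotides per codon plus one stop codon). The toy sequence and all function names are hypothetical; the sequence of Fig. 4 is not reproduced here.

```python
# Illustrative sketch only: the toy cDNA and all names are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(cdna: str, start: int = 0):
    """Return (atg_index, stop_index, n_codons) for the first ATG at or after
    `start`, read in frame up to the first stop codon; None if no ORF is found."""
    seq = cdna.upper()
    atg = seq.find("ATG", start)
    if atg == -1:
        return None
    # Step through the sequence one codon at a time, staying in frame with the ATG.
    for i in range(atg, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return atg, i, (i - atg) // 3
    return None


def orf_length_nt(n_amino_acids: int) -> int:
    """Nucleotides spanned by an ORF, including the stop codon."""
    return 3 * n_amino_acids + 3


# Toy cDNA: 5' leader, ATG, three further codons, TGA stop, 3' trailer.
toy_cdna = "GGCACC" + "ATG" + "GCTGAACCT" + "TGA" + "GGTT"
print(find_orf(toy_cdna))
print(orf_length_nt(845))
```

Under this arithmetic, an open reading frame encoding 845 amino acids spans 2538 nucleotides from the first base of the ATG through the last base of the stop codon.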
A full-length mouse cDNA corresponding to the mouse E2A-BP transcript was isolated using 5' RACE with primers designed from the mouse AEBP-cDNA sequence (EMBL nucleotide sequence accession number X80478). RT-PCR was used to generate a cDNA which was subcloned into pCR2.1 (Invitrogen) and sequenced. The sequence of the full-length mouse cDNA fragment is shown in Fig. 6 (SEQ ID NO:17). The cDNA has an open-reading frame encoding 1128 amino acids beginning at the ATG highlighted in bold and terminating at the TGA highlighted in bold, and generating the amino acid sequence shown in Fig. 7 (SEQ ID NO:18).
In situ Hybridisation
Adult male Sprague-Dawley rats were perfused with 4% paraformaldehyde and organs were removed and sectioned (Lee et al., Endocrinology 132:2136-2140, 1993). Probe preparation and in situ hybridization were conducted by methods described previously (Lee et al., J. Biol. Chem. 269:12032-12039, 1994; Lee et al., Endocrinology 132:2136-2140, 1993). E2A-BP mRNA was detected with an 35S-UTP-labeled antisense RNA probe made by using SP6 RNA polymerase to transcribe XbaI-linearized rat E2A-BP cDNA in pCR™II. As a control, a sense RNA probe was synthesized using T7 RNA polymerase to transcribe HindIII-linearized rat E2A-BP cDNA in pCR™II. The RNA probes were hydrolyzed for 20 minutes at 60°C to generate probes approximately 100 nucleotides long. Each tissue section was hybridized with 20 million counts-per-minute (cpm) of probe at 50°C overnight. After the hybridization procedure, the sections were washed at 50°C under stringent conditions and dried. The tissue sections were subsequently dipped into emulsion solution (Kodak NTB2) and exposed for 2 to 4 days at 4°C. The sections were counter-stained with hematoxylin-eosin.
Cell Lines, Cell Culture, and Reagents
C2C12 and COS-7 cells were obtained from the American Type Culture Collection (Rockville, Maryland) . All cells were cultured in Dulbecco's modified Eagle medium (DMEM, JRH) supplemented with 10% fetal calf serum (Hyclone) or Serum Plus (JRH) before transfection. C2C12 cells were cultured in either 2% (differentiation medium) or 20% (growth medium) fetal calf serum after transfection.
The monoclonal antibody for myosin heavy chain (MY 32) was obtained from Sigma and the anti-c-myc 9E10 peptide antibody was obtained from either Oncogene or Santa Cruz. cDNA probes for rat myogenin, mouse myosin light chain, and mouse myosin heavy chain were provided by A.B. Lassar (Harvard Medical School). Human MyoD, E12, and E47 plasmid constructs were provided by D. Baltimore (Whitehead Institute, MIT) and F.A. Peverali (EMBL, Heidelberg, Germany).
Northern Blot Analysis
Total RNA was obtained from cultured cells or rat organs by guanidinium isothiocyanate extraction and centrifugation through cesium chloride (Sambrook et al., supra). The RNA was fractionated on a 1.3% formaldehyde agarose gel and transferred onto nitrocellulose membrane filters. The filters were hybridized with the appropriate 32P-random-primed cDNA probes, as described previously (Perrella et al., J. Biol. Chem. 269:14595-600, 1994; Sambrook et al., supra; Yoshizumi et al., Mol. Cell. Biol. 15:3266-3272, 1995). The hybridized filters were washed in 30 mM NaCl, 3 mM sodium citrate, and 0.1% sodium dodecyl sulfate at 55°C, and then were used to expose film or stored on PhosphorImager screens for 68 hours. To correct for differences in RNA loading, the filters were washed in a 50% formamide solution at 80°C and rehybridized with a radiolabeled 28S oligonucleotide probe. The filters were scanned and radioactivity was measured on a PhosphorImager running the ImageQuant software (Molecular Dynamics, Sunnyvale, CA).
Cellular Localisation of E2A-BP
The expression plasmid Myc-E2A-BP/pCR3 was constructed for cellular localization of E2A-BP. In this plasmid, the c-Myc peptide tag (EQKLISEED; SEQ ID NO:12) was added in-frame with the E2A-BP open reading frame encoded by SEQ ID NO:2 at the N-terminus using PCR techniques, and cloned into the expression vector pCR3 (Invitrogen). COS-7 cells were transiently transfected with Myc-E2A-BP/pCR3 plasmids using the DEAE-dextran method (Sambrook et al., supra). Immunostaining was performed 48 hours after transfection. The transfected cells, grown on chamber slides, were fixed with 4% paraformaldehyde in phosphate-buffered saline (PBS) and stained with an anti-c-myc monoclonal antibody (9E10, Oncogene), followed by a rhodamine-conjugated goat anti-mouse IgG as the secondary antibody. Counter-staining for the nucleus with Hoechst 33258 was performed as recommended by the manufacturer.
Generation of C2C12 Clones Expressing E2A-BP
A DNA fragment containing the partial E2A-BP human open reading frame corresponding to SEQ ID NO:2 was cloned into expression plasmids pcDNA3 and pCR3 (Invitrogen) in sense or antisense orientations. Stable transformants of C2C12 cells were generated by electroporation, as described previously (Sambrook et al., supra). Briefly, 2.5 × 10⁶ cells were harvested at 60% confluence and resuspended in 0.8 ml of PBS. The cells were transferred to electroporation cuvettes (0.4 mm, Biorad) and mixed with 20 μg of plasmid DNA. Stable transfectants were selected in DMEM supplemented with 0.5 mg/ml of G418 (GIBCO).
Mutagenesis
Mutations were introduced into the human E2A-BP cDNA in p-Bluescript vector using the Clontech site-directed mutagenesis kit. Two conserved residues implicated in metal binding, H-236 and E-239 (as numbered in SEQ ID NO:1), were mutated to glutamine. Two oligonucleotides, mtXbaI, 5'-GGCGGCCGCTGTAGAACTAGT-3' (SEQ ID NO:13), and mtSgnl, 5'-GCTGGGATCCAAGGCAACCAAGTGCTGGGC-3' (SEQ ID NO:14), were used to introduce the desired mutations into the XbaI site within the p-Bluescript polylinker and into the conserved histidine and glutamic acid of E2A-BP, respectively. The mutated E2A-BP was termed mtE2A-BP. Introduction of the mutation was confirmed by sequencing.
In Vitro Binding Assays
35S-labeled proteins were prepared by in vitro transcription and translation (Promega TNT kit) using the cDNA plasmids encoding ΔE47 (amino acids 561 to 651), E47, E12, MyoD, Id-3, ΔE2A-BP (amino acids 404 to 768), mtE2A-BP, and wild type E2A-BP. In addition, glutathione-S-transferase (GST) fusion proteins, GST-ΔE47, GST-ΔE12 (amino acids 477 to 654), GST-MyoD, GST-Id3, and GST-ΔE2A-BP, were prepared as previously described (Shrivastava et al., Science 262:1889-1892, 1993). The 35S-labeled proteins were incubated at 4°C for 1 hour with GST-fusion proteins immobilized on glutathione-4B sepharose in 50 mM NaCl and 1 mg/ml bovine serum albumin. The incubation mixtures were washed four times on ice with 0.1% NP40 in PBS. Bound proteins were eluted in 1X protein sample loading buffer and resolved on a 10% SDS-PAGE Tricine gel (Schagger and von Jagow, Analytical Biochemistry 166:368-379, 1987).
Gel Mobility Shift Assay
Gel mobility shift analysis was performed as described previously (Lassar et al., Cell 66:305-15, 1991; Yoshizumi et al., Mol. Cell. Biol. 15:3266-3272, 1995). Probes were made from four E-box sequences derived from the MEF-1 site of the muscle creatine kinase enhancer (Lassar et al., Cell 66:305-15, 1991). The probes were radiolabeled as described (Sambrook et al., supra; Yoshizumi et al., Mol. Cell. Biol. 15:3266-3272, 1995). A typical binding reaction mixture contained DNA probe at 50,000 cpm, 1 μg of poly(dI-dC)·poly(dI-dC), 10 mM Tris (pH 7.5), 50 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol, 5% glycerol, and 3-5 μl of in vitro translated protein products, in a final volume of 25 μl. The reaction mixture was incubated at room temperature for 30 minutes and analyzed by 5% native polyacrylamide gel electrophoresis in 0.25X TBE buffer (22 mM Tris base, 22 mM boric acid, and 0.5 mM EDTA). To determine the effect of E2A-BP on the binding of E47 homodimers or E47/MyoD heterodimers to the E-box probe, these HLH transcription factors were co-translated in the presence or absence of ΔE2A-BP, full length E2A-BP, or mtE2A-BP.
Immunoprecipitation
Cells were harvested in low ionic strength lysis buffer (10 mM Tris-HCl, pH 8.0, 10 mM NaCl, 3 mM MgCl2, 300 mM sucrose, and 0.1% NP40) as described (Jen et al., Genes Dev. 6:1466, 1992). Protein lysates were cleared by preincubation with normal rabbit IgG and protein A-agarose (Oncogene), followed by an overnight incubation with c-Myc-antibody agarose (Santa Cruz), talon metal affinity resin (Clontech), or protein A-agarose. Agarose or Sepharose beads in complex with proteins were washed three times in lysis buffer and then eluted with SDS-Laemmli protein sample loading buffer. Eluted proteins were resolved on 10% SDS-PAGE tricine gels, blotted on Immobilon-P™ membranes, and incubated overnight with antibodies. Antibody binding was detected with an ECL western blotting kit (Amersham).
Results
E2A-BP Shares Homology with Carboxypeptidase E
Using the bHLH domain of E47 as a probe to screen a human aortic expression library, ten interacting clones were isolated; eight clones encoded Id and two clones encoded novel sequences. One of the two novel clones, designated E2A-BP, was characterized first. A 1450-bp E2A-BP cDNA identified from the interaction cloning was used to isolate a 2795-bp cDNA clone. The open reading frame coded for 768 amino acids, which corresponds to a predicted molecular mass of 87 kDa and a calculated isoelectric point of 4.6 (Fig. 1A). Nucleotides flanking the putative initiating methionine comply with the Kozak consensus sequence for translation initiation (Kozak, Annu. Rev. Cell Biol. 8:197-255, 1992). A GenBank search revealed that E2A-BP shares 45% homology with human carboxypeptidase E at the amino acid level (Fricker, Annu. Rev. Physiol. 50:309-21, 1988; Manser et al., Biochem. J. 267:517-25, 1990; Skidgel, Trends Pharmacol. Sci. 9:299-304, 1988). Domain search of the E2A-BP open reading frame revealed a putative nuclear localization signal (Boulikas, J. Cell. Biochem. 55:32-58, 1994), KRIR, at amino acids 599 to 602 and an acidic domain at amino acids 689 to 746. E2A-BP lacks an HLH domain, despite its isolation by interaction cloning using the bHLH domain of E47 as a probe. Carboxypeptidases contain two signature domains that are important for binding one atom of zinc (Manser et al., Biochem. J. 267:517-25, 1990; Reynolds et al., J. Biol. Chem. 264:20094-9, 1989; Tan et al., J. Biol. Chem. 264:13165-70, 1989). Three conserved residues, a histidine and a glutamic acid in signature 1 and an additional histidine in signature 2, are implicated in zinc binding (Fig. 1B). The histidine and glutamic acid in signature 1 are present in both carboxypeptidase E and E2A-BP. In contrast, the histidine in signature 2 is present in carboxypeptidase E, but not in E2A-BP.
E2A-BP mRNA is Preferentially Expressed in the Aorta
The expression pattern of E2A-BP mRNA was determined in a variety of rat and human organs by Northern blot analysis using a human E2A-BP cDNA probe. For rat organs, a single 4 kb transcript was detected by the E2A-BP probe. The highest expression of E2A-BP was observed in the aorta with its adventitia removed, which contains mainly smooth muscle cells. E2A-BP was undetectable in other rat organs, except that a low level was detected in the adventitia and esophagus. E2A-BP was also expressed at high levels in human aorta, compared to heart, lung, and skeletal muscle, which have 40-, 80-, and 100-fold less expression, respectively.
To further determine whether E2A-BP is expressed in aortic smooth muscle cells, in situ hybridization was performed. A rat E2A-BP cDNA was used to generate both sense and antisense cRNA probes. An intense concentration of autoradiographic grains was present after hybridization with the antisense, but not the sense probe, indicating a high level expression of E2A-BP transcript in aortic smooth muscle cells. In contrast, autoradiographic grains were not detected in skeletal muscle cells hybridized with the E2A-BP antisense probe. Hybridization of the antisense probe to small vessels in skeletal muscle was also detected. This observation is consistent with the low level of E2A-BP expression in skeletal muscle detected by RNA blot analysis.
Downregulation of E2A-BP mRNA in Human Adult Skeletal Muscle
E2A-BP expression in human fetal and adult skeletal muscle was examined. A high level of E2A-BP mRNA was detected in fetal skeletal muscle cells. In contrast, E2A-BP mRNA was downregulated markedly in adult skeletal muscle cells that had differentiated terminally. To confirm that the adult but not the fetal skeletal muscle cells had exited the cell cycle, an RNA blot previously hybridized with an E2A-BP probe was rehybridized with a cyclin A probe. Cyclin A mRNA was present at a high level in fetal skeletal muscle but was undetectable in adult skeletal muscle.
E2A-BP is a Nuclear Protein
To determine the cellular localization of E2A-BP, a fusion plasmid was generated which expresses a fusion protein containing a c-myc peptide tag on the N-terminal side of a portion of the E2A-BP cDNA open reading frame corresponding to SEQ ID NO:2. The construct was then transfected into C2C12 and COS-7 cells. The fusion protein was detected by a specific monoclonal antibody (9E10) to the c-myc tag. DNA staining by Hoechst 33258 was used to localize the nucleus. C-myc tagged E2A-BP protein was expressed in the nucleus in both C2C12 and COS-7 cells. This result is consistent with the presence of a nuclear localization signal (Boulikas, J. Cell. Biochem. 55:32-58, 1994), KRIR, at amino acids 599 to 602 of E2A-BP (Fig. 1A).
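The stated position of the KRIR motif can be checked against the sequence listing. The following is a minimal sketch, assuming residues 593 to 608 of SEQ ID NO:1 read Leu Ala Arg Ser Asn Trp Lys Arg Ile Arg Glu Ile Met Ala Met Asn; the function name `find_motif` and the choice of window are illustrative and do not come from the disclosure.

```python
import re

# Residues 593-608 of SEQ ID NO:1, written in one-letter code (assumed window).
WINDOW_START = 593
WINDOW = "LARSNWKRIREIMAMN"


def find_motif(window: str, window_start: int, motif: str):
    """Return 1-based (start, end) positions of every occurrence of motif."""
    hits = []
    # A lookahead pattern reports overlapping occurrences as well.
    for match in re.finditer(f"(?={re.escape(motif)})", window):
        start = window_start + match.start()
        hits.append((start, start + len(motif) - 1))
    return hits


print(find_motif(WINDOW, WINDOW_START, "KRIR"))  # [(599, 602)]
```

Under these assumptions the motif spans residues 599 to 602, in agreement with the text.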
E2A-BP Binds E12 and E47, but not MyoD and Id3
To determine whether E2A-BP could bind HLH proteins in an in vitro assay (Shrivastava et al., Science 262:1889-1892, 1993), GST fusion proteins of ΔE2A-BP, ΔE12, ΔE47, MyoD, and Id3 (Δ indicates a truncated protein; see above for details) were bound to glutathione-agarose matrices and incubated with 35S-methionine labeled ΔE47 or ΔE2A-BP, the latter containing the C-terminal 365 amino acids of E2A-BP encoded by the cDNA clone originally isolated by interaction cloning. The incubation mixture was washed, eluted, and analyzed by gel electrophoresis. The binding of 35S-ΔE47 to GST-ΔE12 served as a positive control. Compared with the background binding to GST, 35S-ΔE2A-BP bound to GST-ΔE2A-BP (5-fold of background), but bound even better to both GST-ΔE47 (47-fold of background) and GST-ΔE12 (18-fold of background). In contrast, 35S-ΔE2A-BP did not bind GST-MyoD and GST-Id3. The reverse experiment indicated that 35S-ΔE47 and 35S-ΔE12 bound very well to GST-ΔE2A-BP. These results show that the C-terminal 365 amino acids of E2A-BP are capable of associating with E2A proteins. In addition, this association is not universal for all HLH proteins, since 35S-ΔE2A-BP bound poorly with GST-MyoD and GST-Id3.
E2A-BP Suppresses Binding of E47 Homodimer and E47-MyoD Heterodimer to the E-box
To determine whether the association of E2A-BP with E2A proteins affects the ability of E2A proteins to bind DNA, gel mobility shift analysis was performed using an E-box probe consisting of four repeats of the consensus CANNTG sequences in the enhancer of muscle creatine kinase (Lassar et al . , Cell 58:823-31, 1989; Murre et al . , Cell 56:777-83, 1989; Murre et al . , Cell 58:537-44, 1989) and HLH or E2A-BP proteins synthesized by in vitro transcription and translation. The proteins were translated alone or co-translated with other proteins. The effect of E2A-BP on binding of ΔE47 homodimers was analyzed. Incubation of the probe with ΔE47 resulted in formation of a specific ΔE47-E-box complex, which was abolished by incubation with a
100-fold molar excess of identical nonradiolabeled DNA, but not by incubation with nonidentical DNA. Although E2A-BP did not bind the E-box probe, it decreased the density of ΔE47-E-box complexes by more than 75%, despite the presence of equal amounts of E47. Although ΔE2A-BP bound E47, it failed to inhibit the formation of ΔE47-E-box complexes. These results show that, although the C-terminal region of E2A-BP is sufficient to bind E47, the entire protein is needed to inhibit the binding of E47 to DNA.
The binding of E47/MyoD heterodimer to the E-box probe in the presence and absence of E2A-BP was also assessed. Incubation of the E-box probe with full length E47 and MyoD resulted in formation of a specific E47/MyoD-E-box complex, which was abolished by incubation with a 100-fold molar excess of identical nonradiolabeled DNA, but not by incubation with nonidentical DNA. E2A-BP decreased the binding of E47/MyoD heterodimer to the E-box by more than 75%. To determine whether the conserved histidine and glutamic acid residues of signature 1 (Fig. 1B) are important in inhibiting the binding of E47/MyoD to the E-box, the histidine and glutamic acid residues were mutated to glutamine. Mutation of these two amino acids prevented the inhibition of E47/MyoD-E-box complex formation by E2A-BP.
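The two substitutions can be read directly from the mutagenic oligonucleotide of SEQ ID NO:14 by translating it alongside the corresponding wild-type stretch. The sketch below is an illustration under stated assumptions: the wild-type 30-mer is taken to be the codons for residues 233 to 242 in SEQ ID NO:2 (GCT GGG ATC CAT GGC AAC GAG GTG CTG GGC), and the names `translate` and `substitutions` are introduced here only for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitutions(wild_type: str, mutant: str, first_residue: int):
    """List amino acid changes as e.g. 'H236Q' (wild type, position, mutant)."""
    pairs = zip(translate(wild_type), translate(mutant))
    return [
        f"{wt}{first_residue + i}{mt}"
        for i, (wt, mt) in enumerate(pairs)
        if wt != mt
    ]


# Assumed wild-type stretch: codons for residues 233-242 of SEQ ID NO:2.
WILD_TYPE = "GCTGGGATCCATGGCAACGAGGTGCTGGGC"
# Mutagenic oligonucleotide, SEQ ID NO:14.
MUTANT = "GCTGGGATCCAAGGCAACCAAGTGCTGGGC"

print(substitutions(WILD_TYPE, MUTANT, first_residue=233))  # ['H236Q', 'E239Q']
```

Under these assumptions the oligonucleotide changes CAT to CAA and GAG to CAA, which corresponds to the His-236 and Glu-239 to glutamine substitutions described for mtE2A-BP.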
E2A-BP Inhibits Differentiation of C2C12 Myoblasts
Since E2A proteins have an important role in regulating differentiation and inhibiting growth of many cell types (Kadesch, Cell. Growth Differ. 4:49-55, 1993; Lassar et al., Cell 58:823-31, 1989; Olson and Klein, Genes & Dev. 8:1-8, 1994; Peverali et al., EMBO J. 13:4291-301, 1994), the effect of E2A-BP, which attenuates binding of E2A to DNA, was tested. C2C12 myoblasts were used for these studies because differentiation in these cells is well characterized (Guo et al., Mol. Cell. Biol. 15:3823-9, 1995; Jen et al., Genes & Dev. 6:1466-79, 1992; Kim et al., J. Biol. Chem. 267:15140-5, 1992; Lassar et al., Cell 58:823-31, 1989). In response to culturing in differentiation medium containing 2% fetal calf serum, C2C12 myoblasts express skeletal muscle specific genes, such as myogenin, myosin heavy chain (MHC), and myosin light chain (MLC), and differentiate into multinucleated myotubes (Jen et al., Genes & Dev. 6:1466-79, 1992). C2C12 cells were stably transfected with vectors containing no insert or a full length E2A-BP cDNA, in either sense or antisense orientations. Expression of sense and antisense transcripts was confirmed by Northern blot analysis. Two clones expressing the sense E2A-BP and three clones expressing antisense E2A-BP were selected. Since the responses of these clones were similar, the results of one representative clone from each group are presented. To determine the effect of E2A-BP on the mRNA levels of muscle specific genes, C2C12 clones were cultured in either growth medium for two days or in differentiation medium for one or two days, and total RNA was harvested for Northern blot analysis. C2C12 cells in growth medium expressed MyoD, but not myogenin, MHC, or MLC. In C2C12 cells transfected with vector alone, differentiation medium markedly increased the mRNA levels of myogenin, MHC, and MLC, but did not affect MyoD mRNA levels.
In contrast, differentiation medium failed to induce expression of myogenin, MHC, and MLC mRNA in C2C12 cells transfected with sense E2A-BP. To determine whether E2A-BP would inhibit expression of MHC protein, cells were immunostained with an anti-MHC primary antibody and a rhodamine-conjugated secondary antibody. The nuclei were labeled by Hoechst 33258. Expression of MHC protein was readily detected in C2C12 cells transfected with vector alone or antisense E2A-BP, but not in cells transfected with sense E2A-BP.
To determine whether suppression of muscle-specific gene expression by E2A-BP leads to inhibition of myotube formation, the transfected C2C12 clones were treated with differentiation medium for 4 days. Formation of multinuclear myotubes was detected in C2C12 cells transfected with vector alone, but not in cells transfected with sense E2A-BP. Since C2C12 cells expressed low levels of E2A-BP mRNA, the effect of antisense E2A-BP on myotube formation was also tested. Compared with C2C12 cells transfected with vector alone, transfection of C2C12 cells with antisense E2A-BP accelerated formation of myotubes.
E2A-BP Interacts with E2A Proteins in Vivo
In C2C12 cells stably expressing full-length E2A-BP tagged with c-Myc (c-Myc-E2A-BP), an antibody to the c-Myc tag co-immunoprecipitated both c-Myc-E2A-BP and E47. The presence of E47 was confirmed by its binding to an anti-E47 antibody but not to a negative control antibody, rabbit IgG. Moreover, in extracts of COS-7 cells that had been co-transfected with c-Myc-E2A-BP and E47 tagged with a histidine epitope, the His-tagged E47 and c-Myc-E2A-BP proteins were co-immunoprecipitated with both c-Myc antibody-agarose and talon metal affinity resin, which binds the histidine tag. These proteins were not precipitated with a negative control, protein A-agarose. Taken together, these data indicate that E2A-BP interacts with E2A proteins in vivo.
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: Lee, Mu-En
Haber, Edgar
Endege, Wilson O.
Layne, Matthew D.
(ii) TITLE OF THE INVENTION: E2A-BINDING PROTEIN
(iii) NUMBER OF SEQUENCES: 18
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Fish & Richardson, P.C.
(B) STREET: 225 Franklin Street
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: US
(F) ZIP: 02110-2804
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: Windows95
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 14-MAR-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/013,439
(B) FILING DATE: 15-MAR-1996
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Fraser, Janis K.
(B) REGISTRATION NUMBER: 34,819
(C) REFERENCE/DOCKET NUMBER: 05433/031001
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-542-5070
(B) TELEFAX: 617-542-8906
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Glu Ser His Arg Ile Glu Asp Asn Gln Ile Arg Ala Ser Ser Met 16
Pro Ala Pro Arg Pro Gly Gly Thr Ala Gly Arg Leu Asn Met Gln Thr 32
Gly Ala Thr Glu Asp Asp Tyr Tyr Asp Gly Ala Trp Cys Ala Glu Asp 48
Asp Ala Arg Thr Gln Trp Ile Glu Val Asp Thr Arg Arg Thr Thr Arg 64
Phe Thr Gly Val Ile Thr Gln Gly Arg Asp Ser Ser Ile His Asp Asp 80
Phe Val Thr Thr Phe Phe Val Gly Phe Ser Asn Asp Ser Gln Thr Trp 96
Val Met Tyr Thr Asn Gly Tyr Glu Glu Met Thr Phe His Gly Asn Val 112
Asp Lys Asp Thr Pro Val Leu Ser Glu Leu Pro Glu Pro Val Val Ala 128
Arg Phe Ile Arg Ile Tyr Pro Leu Thr Trp Asn Gly Ser Leu Cys Met 144
Arg Leu Glu Val Leu Gly Cys Ser Val Ala Pro Val Tyr Ser Tyr Tyr 160
Thr Gln Asn Glu Val Val Ala Thr Asp Asp Leu Asp Phe Arg His His 176
Ser Tyr Lys Asp Met Arg Gln Leu Met Lys Val Val Asn Glu Glu Cys 192
Pro Thr Ile Thr Arg Thr Tyr Ser Leu Gly Lys Ser Ser Arg Gly Leu 208
Lys Ile Tyr Ala Met Glu Ile Ser Asp Asn Pro Gly Glu His Glu Leu 224
Gly Glu Pro Glu Phe Arg Tyr Thr Ala Gly Ile His Gly Asn Glu Val 240
Leu Gly Arg Glu Leu Leu Leu Leu Leu Met Gln Tyr Leu Cys Arg Glu 256
Tyr Arg Asp Gly Asn Pro Arg Val Arg Ser Leu Val Gln Asp Thr Arg 272
Ile His Leu Val Pro Ser Leu Asn Pro Asp Gly Tyr Glu Val Ala Ala 288
Gln Met Gly Ser Glu Phe Gly Asn Trp Ala Leu Gly Leu Trp Thr Glu 304
Glu Gly Phe Asp Ile Phe Glu Asp Phe Pro Asp Leu Asn Ser Val Leu 320
Trp Gly Ala Glu Glu Arg Lys Trp Val Pro Tyr Arg Val Pro Asn Asn 336
Asn Leu Pro Ile Pro Glu Arg Tyr Leu Ser Pro Asp Ala Thr Val Ser 352
Thr Glu Val Arg Ala Ile Ile Ala Trp Met Glu Lys Asn Pro Phe Val 368
Leu Gly Ala Asn Leu Asn Gly Gly Glu Arg Leu Val Ser Tyr Pro Tyr 384
Asp Met Ala Arg Thr Pro Thr Gln Glu Gln Leu Leu Ala Ala Ala Met 400
Ala Ala Ala Arg Gly Glu Asp Glu Asp Glu Val Ser Glu Ala Gln Glu 416
Thr Pro Asp His Ala Ile Phe Arg Trp Leu Ala Ile Ser Phe Ala Ser 432
Ala His Leu Thr Leu Thr Glu Pro Thr Arg Gly Gly Cys Gln Ala Gln 448
Asp Tyr Thr Gly Gly Met Gly Ile Val Asn Gly Ala Lys Trp Asn Pro 464
Arg Thr Gly Thr Ile Asn Asp Phe Ser Tyr Leu His Thr Asn Cys Leu 480
Glu Leu Ser Phe Tyr Leu Gly Cys Asp Lys Phe Pro His Glu Ser Glu 496
Leu Pro Arg Glu Trp Glu Asn Asn Lys Glu Ala Leu Leu Thr Phe Met 512
Glu Gln Val His Arg Gly Ile Lys Gly Val Val Thr Asp Glu Gln Gly 528
Ile Pro Ile Ala Asn Ala Thr Ile Ser Val Ser Gly Ile Asn His Gly 544
Val Lys Thr Ala Ser Gly Gly Asp Tyr Trp Arg Ile Leu Asn Pro Gly 560
Glu Tyr Arg Val Thr Ala His Ala Glu Gly Tyr Thr Pro Ser Ala Lys 576
Thr Cys Asn Val Asp Tyr Asp Ile Gly Ala Thr Gln Cys Asn Phe Ile 592
Leu Ala Arg Ser Asn Trp Lys Arg Ile Arg Glu Ile Met Ala Met Asn 608
Gly Asn Arg Pro Ile Pro His Ile Asp Pro Ser Arg Pro Met Thr Pro 624
Gln Gln Arg Arg Leu Gln Gln Arg Arg Leu Gln His Arg Leu Arg Leu 640
Arg Ala Gln Met Arg Leu Arg Arg Leu Asn Ala Thr Thr Thr Leu Gly 656
Pro His Thr Val Pro Pro Thr Leu Pro Pro Ala Pro Ala Thr Thr Leu 672
Ser Thr Thr Ile Glu Pro Trp Gly Leu Ile Pro Pro Thr Thr Ala Gly 688
Trp Glu Glu Ser Glu Thr Glu Thr Tyr Thr Glu Val Val Thr Glu Phe 704
Gly Thr Glu Val Glu Pro Glu Phe Gly Thr Lys Val Glu Pro Glu Phe 720
Glu Thr Gln Leu Glu Pro Glu Phe Glu Thr Gln Leu Glu Pro Glu Phe 736
Glu Glu Glu Glu Glu Glu Lys Glu Glu Glu Ile Ala Thr Gly Gln Ala 752
Phe Pro Phe Thr Thr Val Glu Thr Tyr Thr Val Asn Phe Gly Asp Phe 768
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2795 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 161...2464
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GAATTCCGGG GAAGCTGAGC CTGCTAGACT GAGTGACTGC AGTTAGGAGG GATCCGACAA 0 GTGGGCAGTG GAGAAGGGCA AGGACCACAA AGAGCCCCGA AGGGGCGAGG AGTTGGAGGA 20
GGAGTGGACC CTACGAGAAA GTCAAGTTCC CCCCATTGGG ATG GAG TCA CAC CGT 75
Met Glu Ser His Arg 1 5
ATT GAG GAC AAC CAG ATC CGA GCC TCC TCC ATG CCT GCG CCA CGG CCT 23 He Glu Asp Aβn Gin He Arg Ala Ser Ser Met Pro Ala Pro Arg Pro
10 15 20 GGG GGC ACA GCC GGC CGG CTC AAC ATG CAG ACC GGT GCC ACT GAG GAC 271 Gly Gly Thr Ala Gly Arg Leu Asn Met Gin Thr Gly Ala Thr Glu Aβp 25 30 35 GAC TAC TAT GAT GGT GCG TGG TGT GCC GAG GAC GAT GCC AGG ACC CAG 319 Aβp Tyr Tyr Asp Gly Ala Trp Cyβ Ala Glu Aβp Aβp Ala Arg Thr Gin 40 45 50
TGG ATA GAG GTG GAC ACC AGG AGG ACT ACC CGG TTC ACA GGC GTC ATC 367
Trp He Glu Val Aβp Thr Arg Arg Thr Thr Arg Phe Thr Gly Val He 55 60 65
ACC CAG GGC AGA GAC TCC AGC ATC CAT GAC GAT TTT GTG ACC ACC TTC 415 Thr Gin Gly Arg Aβp Ser Ser He His Asp Asp Phe Val Thr Thr Phe 70 75 80 85
TTC GTG GGC TTC AGC AAT GAC AGC CAG ACA TGG GTG ATG TAC ACC AAC 463 Phe Val Gly Phe Ser Aβn Aβp Ser Gin Thr Trp Val Met Tyr Thr Aβn 90 95 100
GGC TAT GAG GAA ATG ACC TTT CAT GGG AAC GTG GAC AAG GAC ACA CCC 511 Gly Tyr Glu Glu Met Thr Phe Hie Gly Asn Val Asp Lye Asp Thr Pro 105 110 115 GTG CTG AGT GAG CTC CCA GAG CCG GTG GTG GCT CGT TTC ATC CGC ATC 559 Val Leu Ser Glu Leu Pro Glu Pro Val Val Ala Arg Phe He Arg He 120 125 130
TAC CCA CTC ACC TGG AAT GGC AGC CTG TGC ATG CGC CTG GAG GTG CTG 607
Tyr Pro Leu Thr Trp Asn Gly Ser Leu Cys Met Arg Leu Glu Val Leu 135 140 145
GGG TGC TCT GTG GCC CCT GTC TAC AGC TAC TAC ACA CAG AAT GAG GTG 655 Gly Cys Ser Val Ala Pro Val Tyr Ser Tyr Tyr Thr Gin Aβn Glu Val
150 155 160 165
GTG GCC ACC GAT GAC CTG GAT TTC CGG CAC CAC AGC TAC AAG GAC ATG 703
Val Ala Thr Asp Asp Leu Aβp Phe Arg Hie Hie Ser Tyr Lys Aβp Met 170 175 180
CGC CAG CTC ATG AAG GTG GTG AAC GAG GAG TGC CCC ACC ATC ACC CGC 751 Arg Gin Leu Met Lye Val Val Asn Glu Glu Cyβ Pro Thr He Thr Arg 185 190 195 ACT TAC AGC CTG GGC AAG AGC TCA CGA GGC CTC AAG ATC TAT GCC ATG 799 Thr Tyr Ser Leu Gly Lys Ser Ser Arg Gly Leu Lys He Tyr Ala Met 200 205 210
GAG ATC TCA GAC AAC CCT GGG GAG CAT GAA CTG GGG GAG CCC GAG TTC 847 Glu He Ser Aβp Aβn Pro Gly Glu His Glu Leu Gly Glu Pro Glu Phe 215 220 225
CGC TAC ACT GCT GGG ATC CAT GGC AAC GAG GTG CTG GGC CGA GAG CTG 895 Arg Tyr Thr Ala Gly He Hie Gly Aβn Glu Val Leu Gly Arg Glu Leu 230 235 240 245
TTG CTG CTG CTC ATG CAG TAC CTG TGC CGA GAG TAC CGC GAT GGG AAC 943
Leu Leu Leu Leu Met Gin Tyr Leu Cyβ Arg Glu Tyr Arg Asp Gly Asn 250 255 260
CCA CGT GTG CGC AGC CTG GTG CAG GAC ACA CGC ATC CAC CTG GTG CCC 991 Pro Arg Val Arg Ser Leu Val Gin Aβp Thr Arg He Hie Leu Val Pro 265 270 275 TCA CTG AAC CCT GAT GGC TAC GAG GTG GCA GCG CAG ATG GGC TCA GAG 1039
Ser Leu Aβn Pro Asp Gly Tyr Glu Val Ala Ala Gin Met Gly Ser Glu 280 285 290
TTT GGG AAC TGG GCG CTG GGA CTG TGG ACT GAG GAG GGC TTT GAC ATC 1087
Phe Gly Asn Trp Ala Leu Gly Leu Trp Thr Glu Glu Gly Phe Asp He 295 300 305
TTT GAA GAT TTC CCG GAT CTC AAC TCT GTG CTC TGG GGA GCT GAG GAG 1135 Phe Glu Asp Phe Pro Asp Leu Aβn Ser Val Leu Trp Gly Ala Glu Glu 310 315 320 325
AGG AAA TGG GTC CCC TAC CGG GTC CCC AAC AAT AAC TTG CCC ATC CCT 1183 Arg Lye Trp Val Pro Tyr Arg Val Pro Aβn Aβn Aβn Leu Pro He Pro 330 335 340
GAA CGC TAC CTT TCG CCA GAT GCC ACG GTA TCC ACG GAG GTC CGG GCC 1231
Glu Arg Tyr Leu Ser Pro Aβp Ala Thr Val Ser Thr Glu Val Arg Ala 345 350 355 ATC ATT GCC TGG ATG GAG AAG AAC CCC TTC GTG CTG GGA GCA AAT CTG 1279 He He Ala Trp Met Glu Lys Asn Pro Phe Val Leu Gly Ala Asn Leu 360 365 370
AAC GGC GGC GAG CGG CTA GTA TCC TAC CCC TAC GAT ATG GCC CGC ACG 1327
Asn Gly Gly Glu Arg Lau Val Ser Tyr Pro Tyr Asp Met Ala Arg Thr 375 380 385
C'^ ACC CAG GAG CAG CTG CTG GCC GCA GCC ATG GCA GCA GCC CGG GGG 1375 Pro Thr Gin Glu Gin Leu Leu Ala Ala Ala Met Ala Ala Ala Arg Gly 390 395 400 405
GAG GAT GAG GAC GAG GTC TCC GAG GCC CAG GAG ACT CCA GAC CAC GCC 1423 Glu Aβp Glu Aβp Glu Val Ser Glu Ala Gin Glu Thr Pro Aβp Hie Ala 410 415 420 ATC TTC CGG TGG CTT GCC ATC TCC TTC GCC TCC GCA CAC CTC ACC TTG 1471 He Phe Arg Trp Leu Ala He Ser Phe Ala Ser Ala Hie Leu Thr Leu 425 430 435 ACC GAG CCT ACC CGC GGA GGC TGC CAA GCC CAG GAC TAC ACC GGC GGC 1519 Thr Glu Pro Thr Arg Gly Gly Cys Gin Ala Gin Asp Tyr Thr Gly Gly 440 445 450
ATG GGC ATC GTC AAC GGG GCC AAG TGG AAC CCC CGG ACC GGG ACT ATC 1567
Met Gly He Val Aβn Gly Ala Lys Trp Aβn Pro Arg Thr Gly Thr He 455 460 465
AAT GAC TTC AGT TAC CTG CAT ACC AAC TGC CTG GAG CTC TCC TTC TAC 1615 Aβn Aβp Phe Ser Tyr Leu Hie Thr Aβn Cyβ Leu Glu Leu Ser Phe Tyr 470 475 480 485
CTG GGC TGT GAC AAG TTC CCT CAT GAG AGT GAG CTG CCC CGC GAG TGG 1663 Leu Gly Cyβ Aβp Lye Phe Pro Hie Glu Ser Glu Leu Pro Arg Glu Trp 490 495 500
GAG AAC AAC AAG GAG GCG CTG CTC ACC TTC ATG GAG CAG GTG CAC CGC 1711 Glu Asn Aβn Lys Glu Ala Leu Leu Thr Phe Met Glu Gin Val Hie Arg 505 510 515 GGC ATT AAG GGG GTG GTG ACG GAC GAG CAA GGC ATC CCC ATT GCC AAC 1759 Gly He Lys Gly Val Val Thr Aβp Glu Gin Gly He Pro He Ala Aβn 520 525 530
GCC ACC ATC TCT GTG AGT GGC ATT AAT CAC GGC GTG AAG ACA GCC AGT 1807
Ala Thr He Ser Val Ser Gly He Asn His Gly Val Lys Thr Ala Ser 535 540 545
GGT GGT GAT TAC TGG CGA ATC TTG AAC CCG GGT GAG TAC CGG GTG ACA 1855 Gly Gly Aβp Tyr Trp Arg He Leu Aβn Pro Gly Glu Tyr Arg Val Thr 550 555 560 565
GCC CAC GCG GAG GGC TAC ACC CCG AGC GCC AAG ACC TGC AAT GTT GAC 1903 Ala His Ala Glu Gly Tyr Thr Pro Ser Ala Lye Thr Cyβ Aβn Val Aβp 570 575 580
TAT GAC ATC GGG GCC ACT CAG TGC AAC TTC ATC CTG GCT CGC TCC AAC 1951
Tyr Aβp He Gly Ala Thr Gin Cyβ Aβn Phe He Leu Ala Arg Ser Aβn 585 590 595 TGG AAG CGC ATC CGG GAG ATC ATG GCC ATG AAC GGG AAC CGG CCT ATC 1999 Trp Lys Arg He Arg Glu He Met Ala Met Aβn Gly Asn Arg Pro He 600 605 610
CCA CAC ATA GAC CCA TCG CGC CCT ATG ACC CCC CAA CAG CGA CGC CTG 2047 Pro His He Aβp Pro Ser Arg Pro Met Thr Pro Gin Gin Arg Arg Leu 615 620 625
CAG CAG CGA CGC CTA CAA CAC CGC CTG CGG CTT CGG GCA CAG ATG CGG 2095 5 Gin Gin Arg Arg Leu Gin His Arg Leu Arg Leu Arg Ala Gin Met Arg 630 635 640 645
CTG CGG CGC CTC AAC GCC ACC ACC ACC CTA GGC CCC CAC ACT GTG CCT 2143
Leu Arg Arg Leu Aβn Ala Thr Thr Thr Leu Gly Pro His Thr Val Pro 0 650 655 660
CCC ACG CTG CCC CCT GCC CCT GCC ACC ACC CTG AGC ACT ACC ATA GAG 2191 Pro Thr Leu Pro Pro Ala Pro Ala Thr Thr Leu Ser Thr Thr He Glu 665 670 675 5 CCC TGG GGC CTC ATA CCG CCA ACC ACC GCT GGC TGG GAG GAG TCG GAG 2239 Pro Trp Gly Leu He Pro Pro Thr Thr Ala Gly Trp Glu Glu Ser Glu 680 685 690
ACT GAG ACC TAC ACA GAG GTG GTG ACA GAG TTT GGG ACC GAG GTG GAG 0 2287
Thr Glu Thr Tyr Thr Glu Val Val Thr Glu Phe Gly Thr Glu Val Glu 695 700 705
CCC GAG TTT GGG ACC AAG GTG GAG CCC GAG TTT GAG ACC CAG TTG GAG 2335 5 Pro Glu Phe Gly Thr Lys Val Glu Pro Glu Phe Glu Thr Gin Leu Glu 710 715 720 725
CCT GAG TTT GAG ACC CAG CTG GAA CCC GAG TTT GAG GAA GAG GAG GAG 2383 Pro Glu Phe Glu Thr Gin Leu Glu Pro Glu Phe Glu Glu Glu Glu Glu 30 730 735 740
GAG AAA GAG GAG GAG ATA GCC ACT GGC CAG GCA TTC CCC TTC ACA ACA 2431 Glu Lye Glu Glu Glu He Ala Thr Gly Gin Ala Phe Pro Phe Thr Thr 745 750 755 5 GTA GAG ACC TAC ACA GTG AAC TTT GGG GAC TTC TGAGATCAGC GTCCTACCAA 2484
Val Glu Thr Tyr Thr Val Aβn Phe Gly Aβp Phe
760 765
GACCCCAGCC CAACTCAAGC TACAGCAGCA GCACTTCCCA AGCCTGCTGA CCACAGTCAC 40 2544
ATCACCCATC AGCACATGGA AGGCCCCTGG TATGGACACT GAAAGGAAGG GCTGGTCCTG 2604
CCCTTTGAGG GGGTGCAAAC ATGAGCTGGG ACCTAAGAGC CAGAGGCTGT GTAGAGGCTC 2664 45 CTGCTCCACC TGCCAGTCTC GTAAGAGATG GGGTTGCTGC AGTGTTGGAG TAGGGGCAGA 2724
GGGAGGGAGC CAAGGTCACT CCAATAAAAC AAGCTCATGG CAAAAAAAAA AAAAAAAAAA 2784 AACCGGAATT C 50 2795
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 70 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GATCTACACC TGCTGCCTCC CAACACCTGC TGCCTCCCAA CACCTGCTGC CTCCCAACAC 60
CTGCTGAGCT 70
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Pro Glu Phe Arg Tyr Thr Ala Gly Ile His Gly Asn Glu Val Leu Gly 16
Arg Glu Leu Leu Leu Leu 22
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Asn Gly Gly Glu Arg Leu Val Ser Tyr Pro Tyr 11
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Pro Glu Phe Lys Tyr Ile Gly Asn Met His Gly Asn Glu Ala Val Gly 16
Arg Glu Leu Leu Ile Phe 22
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
His Gly Gly Asp Leu Val Ala Asn Tyr Pro Tyr 11
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 226 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GAATTCGGCT TCCTGGATGG AGAAGAACCC CTTGGTGCTG GGTGCAAATC TGAATGGTGG 60
CGAGCGGCTT GTGTCTTACC CTATGACATG GCCCGGACAC CTAGCCAGGA ACAGCTGTAG 120
GCCGCGGCAC TGGCAGCTGC CGTGGAGAAG ACGAGGATGA GGTGTCTGAG GCCCAGGAGA 180
CTCAGATCAC GCCATTTCCG CTGGCTGGCT ACTCTTTGCT CTGCCA 226
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GAATTCGGCT TCGAGCCAGG ATGAAGTTGC ACTGGGTGGC CCCGATATCG TAGTCCACGT 60
TGCAAATCTT GGCACTCGAG GTGTAGCCCT CTGCGTGAGC TGTCACACGG TACTCACCCG 120
GGTTTAGAAT GCGCCAGTAG TCACCTCCAC TCGCTGTCTT TACTCCGTGG TTGAGTCCAC 180
TCACAGAGAT GGTAGCATTG GCAATGGGAT CCTTGACTCA TCCGTACCAC ACCTTAGTCA 240
CGTGACGCTC T 251
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CCTGGATGGA GAAGAACCCC TT 22
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CGAGCCAGGA TGAAGTTGTT GACATGA 27
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Glu Gln Lys Leu Ile Ser Glu Glu Asp 1 5
(2) INFORMATION FOR SEQ ID NO:13: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
GGCGGCCGCT GTAGAACTAG T 21
(2) INFORMATION FOR SEQ ID NO:14: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
GCTGGGATCC AAGGCAACCA AGTGCTGGGC 30
(2) INFORMATION FOR SEQ ID NO:15: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3854 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1027...3561 (D) OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
ATTCCCTCGC TCACCCCATC CTCTCTCCCG CCCCTTCCTG GATTCCCTCA CCCGTCTCGA 60
TCCCCTCTCC GCCCTTTCCC AGAGACCCAG AGCCCCTGAC CCCCCGCGCC CCTGCTCAGC 120 TGCCTCCTGG CGTTGCTGGC CCTGTGCCCT GGAGGGCGCC CGCAGACGGT GCTGACCGAC 180
GACGAGATCG AGGAGTTCCT CGAGGGCTTC CTGTCAGAGC TAGAACCTGA GCCCCGGGAG 240 GACGACGTGG AGGCCCCGCC GCCTCCCGAG CCCACCCCGC GGGTCCGAAA AGCCCAGGCG 300
GGGGGCAAGC CAGGGAAGCG GCCAGGGACG GCCGCAGAAG TGCCTCCGGA AAAGACCAAA 360
GACAAAGGGA AGAAAGGCAA GAAAGACAAA GGCCCCAAGG TGCCCAAGGA GTCCTTGGAG 420 GGGTCCCCCA GGCCGCCCAA GAAGGGGAAG GAGAAGCCAC CCAAGGCCAC CAAGAAGCCC 480
AAGGAGAAGC CACCTAAGGC CACCAAGAAG CCCAAGGAGA AGCCACCCAA GGCCACCAAG 540 AAGCCCAAAG AGAAGCCACC CAAGGCCACC AAGAAGCCCC CGTCAGGGAA GAGGCCCCCC 600
ATTCTGGCTC CCTCAGAAAC CCTGGAGTGG CCACTGCCCC CACCCCCCAG CCCTGGCCCC 660
GAGGAGCTAC CCCAGGAGGG AGGGGCGCCC CTCTCAAATA ACTGGCAGAA TCCAGGAGAG 720 GAGACCCATG TGGAOGCACA GGAGCACCAG CCTGAGCCGG AGGAGGAGAC CGAGCAACCC 780
GCACTGGACT ACAATGACCA GATCGAGAGG GAGGACTATG AGGACTTTGA GTACATTCGG 840 CGCCAGAAGC AACCCAGGCC ACCCCCAAGC AGAAGGAGGA GGCCCGAGCG GGTCTGGCCA 900
GAGCCCCCTG AGGAGAAGGC CCCGGCCCCA GCCCCGGAGG AGAGGATTGA GCCTCCTGTG 960
AAGCCTCTGC TGCCCCCGCT GCCCCCTGAC TATGGTGATG GTTACGTGAT CCCCAACTAC 1020 GATGAC ATG GAC TAT TAC TTT GGG CCT CCT CCG CCC CAG AAG CCC GAT 1068
Met Aβp Tyr Tyr Phe Gly Pro Pro Pro Pro Gin Lye Pro Aβp 1 5 10 GCT GAG CGC CAG ACG GAC GAA GAG AAG GAG GAG CTG AAG AAA CCC AAA 1116 Ala Glu Arg Gin Thr Asp Glu Glu Lye Glu Glu Leu Lye Lys Pro Lys 15 20 25 30 AAG GAG GAC AGC AGC CCC AAG GAG GAG ACC GAC AAG TGG GCA GTG GAG 1164 Lys Glu Aβp Ser Ser Pro Lye Glu Glu Thr Aβp Lye Trp Ala Val Glu 35 40 45
AAG GGC AAG GAC CAC AAA GAG CCC CGA AAG GGC GAG GAG TTG GAG GAG 1212
Lye Gly Lys Aβp Hie Lys Glu Pro Arg Lye Gly Glu Glu Leu Glu Glu 50 55 60
GAG TGG ACG CCT ACG GAG AAA GTC AAG TGT CCC CCC ATT GGG ATG GAG 1260 Glu Trp Thr Pro Thr Glu Lye Val Lys Cyβ Pro Pro He Gly Met Glu 65 70 75
TCA CAC CGT ATT GAG GAC AAC CAG ATC CGA GCC TCC TCC ATG CTG CGC 1308 Ser Hie Arg He Glu Asp Asn Gin He Arg Ala Ser Ser Met Leu Arg 80 85 90
CAC GGC CTG GGG GCA CAG CGC GGC CGG CTC AAC ATG CAG ACC GGT GCC 1356 His Gly Leu Gly Ala Gin Arg Gly Arg Leu Asn Met Gin Thr Gly Ala 95 100 105 110 ACT GAG GAC GAC TAC TAT GAT GGT GCG TGG TGT GCC GAG GAC GAT GCC 1404
Thr Glu Aβp Aβp Tyr Tyr Aβp Gly Ala Trp Cyβ Ala Glu Aβp Asp Ala 115 120 125
AGG ACC CAG TGG ATA GAG GTG GAC ACC AGG AGG ACT ACC CGG TTC ACA 1452
Arg Thr Gin Trp He Glu Val Asp Thr Arg Arg Thr Thr Arg Phe Thr 130 135 140
GGC GTC ATC ACC CAG GGC AGA GAC TCC AGC ATC CAT GAC GAT TTT GTG 1500 Gly Val He Thr Gin Gly Arg Asp Ser Ser He Hie Asp Asp Phe Val 145 150 155
ACC ACC TTC TTC GTG GGC TTC AGC AAT GAC AGC CAG ACA TGG GTG ATG 1548 Thr Thr Phe Phe Val Gly Phe Ser Aβn Aβp Ser Gin Thr Trp Val Met 160 165 170
TAC ACC AAC GGC TAT GAG GAA ATG ACC TTT CAT GGG AAC GTG GAC AAG 1596 Tyr Thr Aβn Gly Tyr Glu Glu Met Thr Phe Hie Gly Asn Val Aβp Lye 175 180 185 190 GAC ACA CCC GTG CTG AGT GAG CTC CCA GAG CCG GTG GTG GCT CGT TTC 1644 Aβp Thr Pro Val Leu Ser Glu Leu Pro Glu Pro Val Val Ala Arg Phe 195 200 205
ATC CGC ATC TAC CCA CTC ACC TGG AAT GGC AGC CTG TGC ATG CGC CTG 1692 Ile Arg He Tyr Pro Leu Thr Trp Asn Gly Ser Leu Cys Met Arg Leu 210 215 220
GAG GTG CTG GGG TGC TCT GTG GCC CCT GTC TAC AGC TAC TAC GCA CAG 1740 Glu Val Leu Gly Cys Ser Val Ala Pro Val Tyr Ser Tyr Tyr Ala Gin 225 230 235
AAT GAG GTG GTG GCC ACC GAT GAC CTG GAT TTC CGG CAC CAC AGC TAC 1788 Asn Glu Val Val Ala Thr Aβp Asp Leu Aβp Phe Arg His His Ser Tyr 240 245 250
AAG GAC ATG CGC CAG CTC ATG AAG GTG GTG AAC GAG GAG TGC CCC ACC 1836 Lys Aβp Met Arg Gin Leu Met Lys Val Val Aβn Glu Glu Cyβ Pro Thr 255 260 265 270 ATC ACC CGC ACT TAC AGC CTG GGC AAG AGC TCA CGA GGC CTC AAG ATC 1884 He Thr Arg Thr Tyr Ser Leu Gly Lye Ser Ser Arg Gly Leu Lye He 275 280 285
TAT GCC ATG GAG ATC TCA GAC AAC CCT GGG GAG CAT GAA CTG GGG GAG 1932
Tyr Ala Met Glu He Ser Aβp Aβn Pro Gly Glu His Glu Leu Gly Glu
290 295 300
CCC GAG TTC CGC TAC ACT GCT GGG ATC CAT GGC AAC GAG GTG CTG GGC 1980 Pro Glu Phe Arg Tyr Thr Ala Gly He His Gly Asn Glu Val Leu Gly 305 310 315
CGA GAG CTG TTG CTG CTG CTC ATG CAG TAC CTG TGC CGA GAG TAC CGC 2028 Arg Glu Leu Leu Leu Leu Leu Met Gin Tyr Leu Cyβ Arg Glu Tyr Arg 320 325 330
GAT GGG AAC CCA CGT GTG CGC AGC CTG GTG CAG GAC ACA CGC ATC CAC 2076
Aβp Gly Aβn Pro Arg Val Arg Ser Leu Val Gin Aβp Thr Arg He His
335 340 345 350 CTG GTG CCC TCA CTG AAC CCT GAT GGC TAC GAG GTG GCA GCG CAG ATG 2124
Leu Val Pro Ser Leu Asn Pro Aβp Gly Tyr Glu Val Ala Ala Gin Met 355 360 365
GGC TCA GAG TTT GGG AAC TGG GCG CTG GGA CTG TGG ACT GAG GAG GGC 2172
Gly Ser Glu Phe Gly Aβn Trp Ala Leu Gly Leu Trp Thr Glu Glu Gly 370 375 380
TTT GAC ATC TTT GAA GAT TTC CCG GAT CTC AAC TCT GTG CTC TGG GGA 2220 Phe Aβp He Phe Glu Aβp Phe Pro Aβp Leu Aβn Ser Val Leu Trp Gly 385 390 395
GCT GAG GAG AGG AAA TGG GTC CCC TAC CGG GTC CCC AAC AAT AAC TTG 2268 Ala Glu Glu Arg Lye Trp Val Pro Tyr Arg Val Pro Aβn Aβn Aβn Leu 400 405 410 CCC ATC CCT GAA CGC TAC CTT TCG CCA GAT GCC ACG GTA TCC ACG GAG 2316 Pro He Pro Glu Arg Tyr Leu Ser Pro Aβp Ala Thr Val Ser Thr Glu 415 420 425 430 GTC CGG GCC ATC ATT GCC TGG ATG GAG AAG AAC CCC TTC GTG CTG GGA 2364 Val Arg Ala He He Ala Trp Met Glu Lye Aβn Pro Phe Val Leu Gly 435 440 445
GCA AAT CTG AAC GGC GGC GAG CGG CTA GTA TCC TAC CCC TAC GAT ATG 2412
Ala Aβn Leu Aβn Gly Gly Glu Arg Leu Val Ser Tyr Pro Tyr Aβp Met 450 455 460
GCC CGC ACG CCT ACC CAG GAG CAG CTG CTG GCC GCA GCC ATG GCA GCA 2460 Ala Arg Thr Pro Thr Gin Glu Gin Leu Leu Ala Ala Ala Met Ala Ala
465 470 475
GCC CGG GGG GAG GAT GAG GAC GAG GTC TCC GAG GCC CAG GAG ACT CCA 2508
Ala Arg Gly Glu Aβp Glu Aβp Glu Val Ser Glu Ala Gin Glu Thr Pro 480 485 490
GAC CAC GCC ATC TTC CGG TGG CTT GCC ATC TCC TTC GCC TCC GCA CAC 2556 Aβp Hie Ala He Phe Arg Trp Leu Ala He Ser Phe Ala Ser Ala Hie 495 500 505 510 CTC ACC TTG ACC GAG CCC TAC CGC GGA GGC TGC CAA GCC CAG GAC TAC 2604 Leu Thr Leu Thr Glu Pro Tyr Arg Gly Gly Cyβ Gin Ala Gin Asp Tyr 515 520 525
ACC GGC GGC ATG GGC ATC GTC AAC GGG GCC AAG TGG AAC CCC CGG ACC 2652
Thr Gly Gly Met Gly He Val Asn Gly Ala Lys Trp Aβn Pro Arg Thr 530 535 540
GGG ACT ATC AAT GAC TTC AGT TAC CTG CAT ACC AAC TGC CTG GAG CTC 2700 Gly Thr He Asn Asp Phe Ser Tyr Leu Hie Thr Aβn Cys Leu Glu Leu 545 550 555
TCC TTC TAC CTG GGC TGT GAC AAG TTC CCT CAT GAG AGT GAG CTG CCC 2748 Ser Phe Tyr Leu Gly Cyβ Aβp Lys Phe Pro His Glu Ser Glu Leu Pro 560 565 570
CGC GAG TGG GAG AAC AAC AAG GAG GCG CTG CTC ACC TTC ATG GAG CAG 2796
Arg Glu Trp Glu Aβn Aβn Lye Glu Ala Leu Leu Thr Phe Met Glu Gin
575 580 585 590 GTG CAC CGC GGC ATT AAG GGG GTG GTG ACG GAC GAG CAA GGC ATC CCC 2844
Val His Arg Gly He Lys Gly Val Val Thr Asp Glu Gin Gly He Pro
595 600 605
ATT GCC AAC GCC ACC ATC TCT GTG AGT GGC ATT AAT CAC GGC GTG AAG 2892 Ile Ala Aβn Ala Thr He Ser Val Ser Gly He Asn His Gly Val Lye 610 615 620
ACA GCC AGT GGT GGT GAT TAC TGG CGA ATC TTG AAC CCG GGT GAG TAC 2940 Thr Ala Ser Gly Gly Aβp Tyr Trp Arg He Leu Asn Pro Gly Glu Tyr 625 630 635
CGC GTG ACA GCC CAC GCG GAG GGC TAC ACC CCG AGC GCC AAG ACC TGC 2988
Arg Val Thr Ala His Ala Glu Gly Tyr Thr Pro Ser Ala Lys Thr Cys 640 645 650
AAT GTT GAC TAT GAC ATC GGG GCC ACT CAG TGC AAC TTC ATC CTG GCT 3036 Asn Val Asp Tyr Asp He Gly Ala Thr Gin Cys Asn Phe He Leu Ala 655 660 665 670 CGC TCC AAC TGG AAG CGC ATC CGG GAG ATC ATG GCC ATG AAC GGG AAC 3084 Arg Ser Asn Trp Lys Arg He Arg Glu He Met Ala Met Asn Gly Aβn 675 680 685
CGG CCT ATC CCA CAC ATA GAC CCA TCG CGC CCT ATG ACC CCC CAA CAG 3132
Arg Pro He Pro Hie He Asp Pro Ser Arg Pro Met Thr Pro Gin Gin
690 695 700
CGA CGC CTG CAG CAG CGA CGC CTA CAA CAC CGC CTG CGG CTT CGG GCA 3180 Arg Arg Leu Gin Gin Arg Arg Leu Gin His Arg Leu Arg Leu Arg Ala 705 710 715
CAG ATG CGG CTG CGG CGC CTC AAC GCC ACC ACC ACC CTA GGC CCC CAC 3228 Gin Met Arg Leu Arg Arg Leu Aβn Ala Thr Thr Thr Leu Gly Pro Hie 720 725 730
ACT GTG CCT CCC ACG CTG CCC CCT GCC CCT GCC ACC ACC CTG AGC ACT 3276 Thr Val Pro Pro Thr Leu Pro Pro Ala Pro Ala Thr Thr Leu Ser Thr 735 740 745 750 ACC ATA GAG CCC TGG GGC CTC ATA CCG CCA ACC ACC GCT GGC TGG GAG 3324 Thr He Glu Pro Trp Gly Leu He Pro Pro Thr Thr Ala Gly Trp Glu 755 760 765
GAG TCG GAG ACT GAG ACC TAC ACA GAG GTG GTG ACA GAG TTT GGG ACC 3372
Glu Ser Glu Thr Glu Thr Tyr Thr Glu Val Val Thr Glu Phe Gly Thr 770 775 780
GAG GTG GAG CCC GAG TTT GGG ACC AAG GTG GAG CCC GAG TTT GAG ACC 3420 Glu Val Glu Pro Glu Phe Gly Thr Lye Val Glu Pro Glu Phe Glu Thr 785 790 795
CAG TTG GAG CCT GAG TTC GAG ACC CAG CTG GAA CCC GAG TTT GAG GAA 3468 Gin Leu Glu Pro Glu Phe Glu Thr Gin Leu Glu Pro Glu Phe Glu Glu 800 805 810 GAG GAG GAG GAG GAG AAA GAG GAG GAG ATA GCC ACT GGC CAG GCA TTC 3516 Glu Glu Glu Glu Glu Lye Glu Glu Glu He Ala Thr Gly Gin Ala Phe 815 820 825 830 CCC TTC ACA ACA GTA GAG ACC TAC ACA GTG AAC TTT GGG GAC TTC TGAGA 3566 Pro Phe Thr Thr Val Glu Thr Tyr Thr Val Aβn Phe Gly Aβp Phe 835 840 845
TCAGCGTCCT ACCAAGACCC CAGCCCAACT CAAGCTACAG CAGCAGCACT TCCCAAGCCT 3626
GCTGACCACA GTCACATCAC CCATCAGCAC ATGGAAGGCC CCTGGTATGG ACACTGAAAG 3686
GAAGGGCTGG TCCTGCCCCT TTGAGGGGGT GCAAACATGA CTGGGACCTA AGAGCCAGAG 3746 GCTGTGTAGA GGCTCCTGCT CCACCTGCCA GTCTCGTAAG AGATGGGGTT GCTGCAGTGT 3806
TGGAGTAGGG GCAGAGGGAG GGAGCCAAGG TCACTCCAAT AAAACAAG 3854
(2) INFORMATION FOR SEQ ID NO:16: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 845 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Met Asp Tyr Tyr Phe Gly Pro Pro Pro Pro Gln Lys Pro Asp Ala Glu
1 5 10 15
Arg Gln Thr Asp Glu Glu Lys Glu Glu Leu Lys Lys Pro Lys Lys Glu
20 25 30
Asp Ser Ser Pro Lys Glu Glu Thr Asp Lys Trp Ala Val Glu Lys Gly
35 40 45
Lys Asp His Lys Glu Pro Arg Lys Gly Glu Glu Leu Glu Glu Glu Trp
50 55 60
Thr Pro Thr Glu Lys Val Lys Cys Pro Pro Ile Gly Met Glu Ser His
65 70 75 80
Arg Ile Glu Asp Asn Gln Ile Arg Ala Ser Ser Met Leu Arg His Gly
85 90 95
Leu Gly Ala Gln Arg Gly Arg Leu Asn Met Gln Thr Gly Ala Thr Glu
100 105 110
Asp Asp Tyr Tyr Asp Gly Ala Trp Cys Ala Glu Asp Asp Ala Arg Thr
115 120 125
Gln Trp Ile Glu Val Asp Thr Arg Arg Thr Thr Arg Phe Thr Gly Val
130 135 140
Ile Thr Gln Gly Arg Asp Ser Ser Ile His Asp Asp Phe Val Thr Thr
145 150 155 160
Phe Phe Val Gly Phe Ser Asn Asp Ser Gln Thr Trp Val Met Tyr Thr
165 170 175
Asn Gly Tyr Glu Glu Met Thr Phe His Gly Asn Val Asp Lys Asp Thr
180 185 190
Pro Val Leu Ser Glu Leu Pro Glu Pro Val Val Ala Arg Phe Ile Arg
195 200 205
Ile Tyr Pro Leu Thr Trp Asn Gly Ser Leu Cys Met Arg Leu Glu Val
210 215 220
Leu Gly Cys Ser Val Ala Pro Val Tyr Ser Tyr Tyr Ala Gln Asn Glu
225 230 235 240
Val Val Ala Thr Asp Asp Leu Asp Phe Arg His His Ser Tyr Lys Asp
245 250 255
Met Arg Gln Leu Met Lys Val Val Asn Glu Glu Cys Pro Thr Ile Thr
260 265 270
Arg Thr Tyr Ser Leu Gly Lys Ser Ser Arg Gly Leu Lys Ile Tyr Ala
275 280 285
Met Glu Ile Ser Asp Asn Pro Gly Glu His Glu Leu Gly Glu Pro Glu
290 295 300
Phe Arg Tyr Thr Ala Gly Ile His Gly Asn Glu Val Leu Gly Arg Glu
305 310 315 320
Leu Leu Leu Leu Leu Met Gln Tyr Leu Cys Arg Glu Tyr Arg Asp Gly
325 330 335
Asn Pro Arg Val Arg Ser Leu Val Gln Asp Thr Arg Ile His Leu Val
340 345 350
Pro Ser Leu Asn Pro Asp Gly Tyr Glu Val Ala Ala Gln Met Gly Ser
355 360 365
Glu Phe Gly Asn Trp Ala Leu Gly Leu Trp Thr Glu Glu Gly Phe Asp
370 375 380
Ile Phe Glu Asp Phe Pro Asp Leu Asn Ser Val Leu Trp Gly Ala Glu
385 390 395 400
Glu Arg Lys Trp Val Pro Tyr Arg Val Pro Asn Asn Asn Leu Pro Ile
405 410 415
Pro Glu Arg Tyr Leu Ser Pro Asp Ala Thr Val Ser Thr Glu Val Arg
420 425 430
Ala Ile Ile Ala Trp Met Glu Lys Asn Pro Phe Val Leu Gly Ala Asn
435 440 445
Leu Asn Gly Gly Glu Arg Leu Val Ser Tyr Pro Tyr Asp Met Ala Arg
450 455 460
Thr Pro Thr Gln Glu Gln Leu Leu Ala Ala Ala Met Ala Ala Ala Arg
465 470 475 480
Gly Glu Asp Glu Asp Glu Val Ser Glu Ala Gln Glu Thr Pro Asp His
485 490 495
Ala Ile Phe Arg Trp Leu Ala Ile Ser Phe Ala Ser Ala His Leu Thr
500 505 510
Leu Thr Glu Pro Tyr Arg Gly Gly Cys Gln Ala Gln Asp Tyr Thr Gly
515 520 525
Gly Met Gly Ile Val Asn Gly Ala Lys Trp Asn Pro Arg Thr Gly Thr
530 535 540
Ile Asn Asp Phe Ser Tyr Leu His Thr Asn Cys Leu Glu Leu Ser Phe
545 550 555 560
Tyr Leu Gly Cys Asp Lys Phe Pro His Glu Ser Glu Leu Pro Arg Glu
565 570 575
Trp Glu Asn Asn Lys Glu Ala Leu Leu Thr Phe Met Glu Gln Val His
580 585 590
Arg Gly Ile Lys Gly Val Val Thr Asp Glu Gln Gly Ile Pro Ile Ala
595 600 605
Asn Ala Thr Ile Ser Val Ser Gly Ile Asn His Gly Val Lys Thr Ala
610 615 620
Ser Gly Gly Asp Tyr Trp Arg Ile Leu Asn Pro Gly Glu Tyr Arg Val
625 630 635 640
Thr Ala His Ala Glu Gly Tyr Thr Pro Ser Ala Lys Thr Cys Asn Val
645 650 655
Asp Tyr Asp Ile Gly Ala Thr Gln Cys Asn Phe Ile Leu Ala Arg Ser
660 665 670
Asn Trp Lys Arg Ile Arg Glu Ile Met Ala Met Asn Gly Asn Arg Pro
675 680 685
Ile Pro His Ile Asp Pro Ser Arg Pro Met Thr Pro Gln Gln Arg Arg
690 695 700
Leu Gln Gln Arg Arg Leu Gln His Arg Leu Arg Leu Arg Ala Gln Met
705 710 715 720
Arg Leu Arg Arg Leu Asn Ala Thr Thr Thr Leu Gly Pro His Thr Val
725 730 735
Pro Pro Thr Leu Pro Pro Ala Pro Ala Thr Thr Leu Ser Thr Thr Ile
740 745 750
Glu Pro Trp Gly Leu Ile Pro Pro Thr Thr Ala Gly Trp Glu Glu Ser
755 760 765
Glu Thr Glu Thr Tyr Thr Glu Val Val Thr Glu Phe Gly Thr Glu Val
770 775 780
Glu Pro Glu Phe Gly Thr Lys Val Glu Pro Glu Phe Glu Thr Gln Leu
785 790 795 800
Glu Pro Glu Phe Glu Thr Gln Leu Glu Pro Glu Phe Glu Glu Glu Glu
805 810 815
Glu Glu Glu Lys Glu Glu Glu Ile Ala Thr Gly Gln Ala Phe Pro Phe
820 825 830
Thr Thr Val Glu Thr Tyr Thr Val Asn Phe Gly Asp Phe
835 840 845
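The correspondence between the coding region of SEQ ID NO:15 and the protein of SEQ ID NO:16 can be spot-checked by translating the first codons of the coding region with the standard genetic code. The Python sketch below is illustrative only and is not part of the original disclosure. The helper name `translate` is hypothetical, and the input is assumed to be the first 42 nucleotides of the coding region (positions 1027 to 1068 of SEQ ID NO:15) as printed in the listing.

```python
# Illustrative sketch only; `translate` is a hypothetical helper name,
# not something defined in the specification.

BASES = "TCAG"
# Standard genetic code, ordered by first, second, third base over T, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = "".join(dna.split()).upper()  # drop the spacing used in the listing
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 14 codons of the coding region of SEQ ID NO:15 (positions 1027-1068).
cds_start = "ATG GAC TAT TAC TTT GGG CCT CCT CCG CCC CAG AAG CCC GAT"
print(translate(cds_start))  # MDYYFGPPPPQKPD
```

The one-letter output corresponds to Met Asp Tyr Tyr Phe Gly Pro Pro Pro Pro Gln Lys Pro Asp, that is, residues 1 to 14 of SEQ ID NO:16.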
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3633 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 46...3429 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
AAGTCCCTGC TCAAGCCCGC CCGGCTCCCG CGCGTGCCCA GAGCC ATG GCT CCA GTG 57
Met Ala Pro Val 1
CGC ACC GCA TCC CTG CTC TGC GGC CTC CTG GCA CTG CTG ACG CTG TGC 105
Arg Thr Ala Ser Leu Leu Cys Gly Leu Leu Ala Leu Leu Thr Leu Cys 5 10 15 20 CCT GAG GGG AAC CCA CAG ACG GTG CTG ACG GAC GAC GAG ATC GAG GAG 153 Pro Glu Gly Asn Pro Gin Thr Val Leu Thr Aβp Asp Glu He Glu Glu 25 30 35
TTC CTC GAA GGC TTC CTT TCG GAG TTG GAG ACC CAG TCC CCG CCC CGG 201
Phe Leu Glu Gly Phe Leu Ser Glu Leu Glu Thr Gin Ser Pro Pro Arg 40 45 50
GAA GAC GAC GTG GAA GTC CAG CCG CTT CCC GAA CCC ACC CAG CGT CCC 249 Glu Asp Aβp Val Glu Val Gin Pro Leu Pro Glu Pro Thr Gin Arg Pro 55 60 65
CGC AAA TCC AAG GCA GGG GGC AAG CAG CGG GCA GAT GTA GAA GTC CCT 297 Arg Lye Ser Lys Ala Gly Gly Lys Gin Arg Ala Aβp Val Glu Val Pro 70 75 80 CCA GAA AAA AAC AAA GAC AAA GAG AAG AAA GGA AAG AAG GAC AAA GGC 345
Pro Glu Lys Asn Lys Asp Lys Glu Lys Lys Gly Lys Lys Asp Lys Gly
85 90 95 100 CCC AAA GCC ACA AAA CCC CTG GAG GGC TCT ACC AGG CCC ACC AAG AAA 393
Pro Lys Ala Thr Lys Pro Leu Glu Gly Ser Thr Arg Pro Thr Lys Lys 105 110 115
CCA AAG GAG AAG CCA CCC AAG GCC ACC AAG AAG CCC AAG GAG AAA CCA 441
Pro Lys Glu Lys Pro Pro Lys Ala Thr Lys Lys Pro Lys Glu Lys Pro 120 125 130
CCC AAG GCC ACC AAG AAG CCC AAG GAG AAG CCA CCC AAG GCC ACC AAG 489 Pro Lye Ala Thr Lys Lys Pro Lys Glu Lys Pro Pro Lys Ala Thr Lys 135 140 145
AAG CCT AAG GAG AAG CCA CCC AAG GCC ACT AAG AGG CCC TCG GCA GGA 537
Lys Pro Lys Glu Lye Pro Pro Lys Ala Thr Lys Arg Pro Ser Ala Gly 150 155 160
AAG AAG TTC TCA ACT GTG GCC CCC TTG GAA ACG CTG GAT CGG TTA CTC 585
Lys Lys Phe Ser Thr Val Ala Pro Leu Glu Thr Leu Aβp Arg Leu Leu
165 170 175 180 CCC TCA CCC TCC AAC CCC AGC GCC CAG GAG CTA CCG CAG AAG AGA GAC 633 Pro Ser Pro Ser Aβn Pro Ser Ala Gin Glu Leu Pro Gin Lys Arg Aβp 185 190 195
ACA CCC TTC CCA AAT GCC TGG CAA GGT CAA GGA GAA GAG ACC CAG GTG 681
Thr Pro Phe Pro Aβn Ala Trp Gin Gly Gin Gly Glu Glu Thr Gin Val 200 205 210
GAG GCC AAG CAG CCC CGG CCA GAG CCA GAG GAG GAG ACT GAG ATG CCC 729 Glu Ala Lys Gin Pro Arg Pro Glu Pro Glu Glu Glu Thr Glu Met Pro
215 220 225
ACA CTG GAC TAC AAT GAC CAG ATA GAG AAG GAG GAT TAC GAG GAT TTT 777 Thr Leu Asp Tyr Aβn Aβp Gin He Glu Lys Glu Asp Tyr Glu Asp Phe 230 235 240
GAG TAC ATC CGT CGC CAG AAG CAG CCC AGG CCA ACA CCC AGC AGG AGG 825 Glu Tyr He Arg Arg Gin Lye Gin Pro Arg Pro Thr Pro Ser Arg Arg 245 250 255 260 AGG CTC TGG CCA GAG CGC CCT GAG GAG AAG ACT GAA GAG CCA GAG GAA 873 Arg Leu Trp Pro Glu Arg Pro Glu Glu Lys Thr Glu Glu Pro Glu Glu 265 270 275
AGG AAG GAA GTC GAG CCA CCT CTG AAG CCC CTG CTG CCT CCG GAC TAT 921 Arg Lys Glu Val Glu Pro Pro Leu Lys Pro Leu Leu Pro Pro Aβp Tyr 280 285 290
GGG GAT AGC TAC GTG ATC CCC AAC TAT GAT GAC TTG GAC TAT TAT TTC 969 Gly Aβp Ser Tyr Val He Pro Aβn Tyr Aβp Aβp Leu Aβp Tyr Tyr Phe 295 300 305
CCC CAC CCT CCA CCG CAG AAG CCT GAT GTT GGA CAA GAG GTG GAT GAG 1017
Pro His Pro Pro Pro Gin Lys Pro Asp Val Gly Gin Glu Val Asp Glu 310 315 320
GAA AAG GAA GAG ATG AAG AAG CCC AAA AAG GAG GGT AGT AGC CCC AAG 1065 Glu Lys Glu Glu Met Lys Lys Pro Lys Lye Glu Gly Ser Ser Pro Lye 325 330 335 340 GAG GAC ACA GAG GAC AAG TGG ACC GTG GAG AAA AAC AAG GAC CAC AAA 1113 Glu Aβp Thr Glu Aβp Lye Trp Thr Val Glu Lye Aβn Lye Aβp His Lys 345 350 355
GGG CCC CGG AAG GGT GAG GAG CTG GAG GAG GAG TGG GCG CCA GTG GAG 1161
Gly Pro Arg Lys Gly Glu Glu Leu Glu Glu Glu Trp Ala Pro Val Glu 360 365 370
AAA ATC AAG TGC CCA CCT ATT GGG ATG GAG TCA CAC CGC ATT GAG GAC 1209 Lys He Lys Cys Pro Pro He Gly Met Glu Ser His Arg He Glu Asp 375 380 385
AAC CAG ATC CGT GCC TCC TCC ATG CTG CGC CAC GGC CTC GGA GCC CAG 1257 Aβn Gin He Arg Ala Ser Ser Met Leu Arg Hie Gly Leu Gly Ala Gin 390 395 400
CGG GGC CGG CTC AAC ATG CAG GCT GGT GCC AAT GAA GAT GAC TAC TAT 1305 Arg Gly Arg Leu Aβn Met Gin Ala Gly Ala Aβn Glu Aβp Aβp Tyr Tyr 405 410 415 420 GAC GGG GCA TGG TGT GCT GAG GAC GAG TCG CAG ACC CAG TGG ATC GAG 1353
Aβp Gly Ala Trp Cyβ Ala Glu Aβp Glu Ser Gin Thr Gin Trp He Glu 425 430 435
GTG GAC ACC CGA AGG ACA ACT CGG TTC ACG GGC GTC ATC ACT CAG GGC 1401
Val Aβp Thr Arg Arg Thr Thr Arg Phe Thr Gly Val He Thr Gin Gly 440 445 450
CT GAC TCC AGC ATC CAT GAC GAC TTC GTG ACT ACC TTC TTT GTG GGC 1449 Arg Aβp Ser Ser He Hie Asp Aβp Phe Val Thr Thr Phe Phe Val Gly 455 460 465
TTC AGC AAT GAC AGC CAG ACC TGG GTG ATG TAC ACC AAT GGC TAC GAG 1497 Phe Ser Asn Aβp Ser Gin Thr Trp Val Met Tyr Thr Asn Gly Tyr Glu 470 475 480 GAA ATG ACC TTC TAT GGA AAT GTG GAC AAG GAC ACA CCT GTG CTG AGC 1545
Glu Met Thr Phe Tyr Gly Aβn Val Aβp Lys Asp Thr Pro Val Leu Ser
485 490 495 500 GAG CTC CCT GAG CCA GTT GTG GCC CGT TTC ATC CGC ATC TAT CCA CTC 1593 Glu Leu Pro Glu Pro Val Val Ala Arg Phe He Arg He Tyr Pro Leu 505 510 515
ACC TGG AAC GGT AGC CTG TGC ATG CGC CTG GAG GTG CTA GGC TGC CCC 1641
Thr Trp Asn Gly Ser Leu Cys Met Arg Leu Glu Val Leu Gly Cys Pro 520 525 530
GTG ACC CCT GTC TAC AGC TAC TAC GCA CAG AAT GAG GTG GTA ACT ACT 1689 Val Thr Pro Val Tyr Ser Tyr Tyr Ala Gin Asn Glu Val Val Thr Thr 535 540 545
GAC AGC CTG GAC TTC CGG CAC CAC AGC TAC AAG GAC ATG CGC CAG CTG 1737 Aβp Ser Leu Asp Phe Arg His His Ser Tyr Lye Aβp Met Arg Gin Leu 550 555 560
ATG AAG GCT GTC AAT GAG GAG TGC CCC ACA ATC ACT CGC ACA TAC AGC 1785 Met Lys Ala Val Asn Glu Glu Cys Pro Thr He Thr Arg Thr Tyr Ser 565 570 575 580 CTG GGC AAG AGT TCA CGA GGG CTC AAG ATC TAC GCA ATG GAA ATC TCA 1833 Leu Gly Lys Ser Ser Arg Gly Leu Lys He Tyr Ala Met Glu He Ser 585 590 595
GAC AAC CCT GGG GAT CAT GAA CTG GGG GAG CCC GAG TTC CGC TAC ACA 1881
Aβp Aβn Pro Gly Aβp Hie Glu Leu Gly Glu Pro Glu Phe Arg Tyr Thr 600 605 610
GCC GGG ATC CAC GGC AAT GAG GTG CTA GGC CGA GAG CTC CTG CTC CTG 1929 Ala Gly He Hie Gly Aβn Glu Val Leu Gly Arg Glu Leu Leu Leu Leu 615 620 625
CTC ATG CAA TAC CTA TGC CAG GAG TAC CGC GAT GGG AAC CCG AGA GTG 1977
Leu Met Gln Tyr Leu Cys Gln Glu Tyr Arg Asp Gly Asn Pro Arg Val
630 635 640
CGC AAC CTG GTG CAG GAC ACA CGC ATC CAC CTG GTG CCC TCG CTG AAC 2025
Arg Aβn Leu Val Gin Aβp Thr Arg He Hie Leu Val Pro Ser Leu Aβn
645 650 655 660 CCT GAT GGC TAT GAG GTG GCA GCG CAG ATG GGC TCA GAG TTT GGG AAC 2073 Pro Aβp Gly Tyr Glu Val Ala Ala Gin Met Gly Ser Glu Phe Gly Aβn 665 670 675
TGG GCA CTG GGG CTG TGG ACT GAG GAG GGC TTT GAC ATC TTC GAG GAC 2121 Trp Ala Leu Gly Leu Trp Thr Glu Glu Gly Phe Aβp He Phe Glu Aβp 680 685 690
TTC CCA GAT CTC AAC TCT GTG CTC TGG GCA GCT GAG GAG AAG AAA TGG 2169 Phe Pro Aβp Leu Aβn Ser Val Leu Trp Ala Ala Glu Glu Lys Lys Trp 695 700 705
GTC CCC TAC AGG GTC CCA AAC AAT AAC TTG CCA ATC CCT GAA CGT TAC 2217 Val Pro Tyr Arg Val Pro Aβn Asn Asn Leu Pro He Pro Glu Arg Tyr 710 715 720
CTG TCC CCA GAT GCC ACG GTC TCC ACA GAA GTC CGG GCC ATT ATT TCC 2265 Leu Ser Pro Asp Ala Thr Val Ser Thr Glu Val Arg Ala He He Ser 725 730 735 740 TGG ATG GAG AAG AAC CCC TTT GTG CTG GGT GCA AAT CTG AAC GGT GGT 2313 Trp Met Glu Lye Asn Pro Phe Val Leu Gly Ala Asn Leu Asn Gly Gly 745 750 755
GAG CGG CTT GTG TCT TAT CCC TAT GAC ATG GCC CGG ACA CCT AGC CAG 2361
Glu Arg Leu Val Ser Tyr Pro Tyr Asp Met Ala Arg Thr Pro Ser Gin 760 765 770
GAG CAG CTG TTG GCC GAG GCA CTG GCA GCT GCC CGC GGA GAA GAT GAT 2409 Glu Gin Leu Leu Ala Glu Ala Leu Ala Ala Ala Arg Gly Glu Aβp Asp 775 780 785
GAC GGG GTG TCT GAG GCC CAG GAG ACT CCA GAT CAC GCT ATT TTC CGC 2457 Asp Gly Val Ser Glu Ala Gin Glu Thr Pro Aβp Hie Ala He Phe Arg 790 795 800
TGG CTG GCC ATC TCA TTT GCC TCC GCC CAT CTC ACC ATG ACG GAG CCC 2505
Trp Leu Ala He Ser Phe Ala Ser Ala Hie Leu Thr Met Thr Glu Pro 805 810 815 820 TAC CGG GGA GGG TGC CAG GCC CAG GAC TAC ACC AGC GGC ATG GGC ATT 2553
Tyr Arg Gly Gly Cyβ Gin Ala Gin Aβp Tyr Thr Ser Gly Met Gly He
825 830 835
GTC AAC GGG GCC AAG TGG AAT CCT CGC TCT GGG ACT TTC AAT GAC TTT 2601
Val Asn Gly Ala Lys Trp Asn Pro Arg Ser Gly Thr Phe Aβn Asp Phe 840 845 850
AGC TAC CTG CAC ACA AAC TGT CTG GAG CTC TCC GTA TAC CTG GGC TGT 2649 Ser Tyr Leu His Thr Asn Cys Leu Glu Leu Ser Val Tyr Leu Gly Cyβ 855 860 865
GAC AAG TTC CCC CAC GAG AGT GAG CTA CCC CGA GAA TGG GAG AAC AAC 2697 Aβp Lys Phe Pro His Glu Ser Glu Leu Pro Arg Glu Trp Glu Aβn Aβn 870 875 880 AAA GAA GCG CTG CTC ACC TTC ATG GAG CAG GTG CAC CGT GGC ATT AAG 2745
Lye Glu Ala Leu Leu Thr Phe Met Glu Gin Val His Arg Gly He Lys 885 890 895 900 GGT GTG GTG ACA GAT GAG CAA GGC ATC CCC ATT GCC AAT GCC ACC ATC 2793 Gly Val Val Thr Asp Glu Gin Gly He Pro He Ala Aβn Ala Thr He 905 910 915
TCT GTG AGT GGC ATC AAC CAT GGT GTG AAG ACA GCA AGT GGA GGT GAC 2841
Ser Val Ser Gly He Aβn Hie Gly Val Lye Thr Ala Ser Gly Gly Aβp 920 925 930
TAC TGG CGC ATT CTG AAC CCG GGT GAG TAC CGT GTG ACA GCT CAC GCA 2889 Tyr Trp Arg He Leu Aβn Pro Gly Glu Tyr Arg Val Thr Ala Hie Ala 935 940 945
GAG GGC TAC ACC TCA AGT GCC AAG ATC TGC AAT GTG GAC TAC GAT ATT 2937
Glu Gly Tyr Thr Ser Ser Ala Lye He Cys Aβn Val Asp Tyr Aβp He 950 955 960
GGG GCC ACT CAG TGC AAC TTC ATC CTG GCT CGA TCC AAC TGG AAG CGC 2985
Gly Ala Thr Gin Cys Asn Phe He Leu Ala Arg Ser Asn Trp Lys Arg
965 970 975 980 ATT CGG GAG ATC TTG GCT ATG AAC GGG AAC CGT CCC ATT CTC GGA GTT 3033 He Arg Glu He Leu Ala Met Asn Gly Asn Arg Pro He Leu Gly Val 985 990 995
GAC CCC TCA CGA CCC ATG ACC CCC CAG CAG CGG CGC ATG CAG CAG CGC 3081
Aβp Pro Ser Arg Pro Met Thr Pro Gin Gin Arg Arg Met Gin Gin Arg 1000 1005 1010
CGT CTA CAG TAC CGG CTC CGC ATG AGG GAA CAG ATG CGA CTG CGT CGC 3129 Arg Leu Gin Tyr Arg Leu Arg Met Arg Glu Gin Met Arg Leu Arg Arg 1015 1020 1025
CTC AAT TCT ACC GCA GGC CCT GCC ACA AGC CCC ACT CCT GCC CTT ATG 3177 Leu Asn Ser Thr Ala Gly Pro Ala Thr Ser Pro Thr Pro Ala Leu Met 1030 1035 1040
CCT CCC CCT TCC CCT ACA CCA GCC ATT ACC TTG AGG CCC TGG GAA GTT 3225
Pro Pro Pro Ser Pro Thr Pro Ala He Thr Leu Arg Pro Trp Glu Val
1045 1050 1055 1060 CTA CCC ACT ACC ACT GCA GGC TGG GAG GAG TCA GAG ACT GAG ACC TAT 3273 Leu Pro Thr Thr Thr Ala Gly Trp Glu Glu Ser Glu Thr Glu Thr Tyr 1065 1070 1075
ACA GAA GTA GTG ACA GAG TTT GAG ACA GAG TAT GGG ACT GAC CTA GAG 3321 Thr Glu Val Val Thr Glu Phe Glu Thr Glu Tyr Gly Thr Asp Leu Glu 1080 1085 1090
GTG GAA GAG ATA GAG GAG GAG GAG GAG GAG GAG GAG GAA GAG ATG GAC 3369 Val Glu Glu He Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Met Asp 1095 1100 1105
ACA GGC CTT ACA TTT CCA CTC ACA ACA GTG GAG ACC TAC ACA GTG AAC 3417 Thr Gly Leu Thr Phe Pro Leu Thr Thr Val Glu Thr Tyr Thr Val Asn 1110 1115 1120
TTT GGG GAC TTC TGAGACTGGG ATCTCAAAGC CCTGCCCAAT TCAAACTAAG GCAGC 3474 Phe Gly Asp Phe
1125 ACCTCCCAAG CCTGTGCCAG CAGACACATA GCCATCAGAT GTCCCTGGGT GGACCCCACT 3534
CCCCCAGTGT GGGACATCAA AGCTACCGGG ACTCTGCATA GACTCTGGTC TACCCGCCCC 3594 AGCTCTACCT GCCAGCCTTT GGGGAGGGGC AGGCAAGGA 3633
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1128 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Met Ala Pro Val Arg Thr Ala Ser Leu Leu Cys Gly Leu Leu Ala Leu
1 5 10 15
Leu Thr Leu Cys Pro Glu Gly Asn Pro Gln Thr Val Leu Thr Asp Asp
20 25 30
Glu Ile Glu Glu Phe Leu Glu Gly Phe Leu Ser Glu Leu Glu Thr Gln
35 40 45
Ser Pro Pro Arg Glu Asp Asp Val Glu Val Gln Pro Leu Pro Glu Pro
50 55 60
Thr Gln Arg Pro Arg Lys Ser Lys Ala Gly Gly Lys Gln Arg Ala Asp
65 70 75 80
Val Glu Val Pro Pro Glu Lys Asn Lys Asp Lys Glu Lys Lys Gly Lys
85 90 95
Lys Asp Lys Gly Pro Lys Ala Thr Lys Pro Leu Glu Gly Ser Thr Arg
100 105 110
Pro Thr Lys Lys Pro Lys Glu Lys Pro Pro Lys Ala Thr Lys Lys Pro
115 120 125
Lys Glu Lys Pro Pro Lys Ala Thr Lys Lys Pro Lys Glu Lys Pro Pro
130 135 140
Lys Ala Thr Lys Lys Pro Lys Glu Lys Pro Pro Lys Ala Thr Lys Arg
145 150 155 160
Pro Ser Ala Gly Lys Lys Phe Ser Thr Val Ala Pro Leu Glu Thr Leu
165 170 175
Asp Arg Leu Leu Pro Ser Pro Ser Asn Pro Ser Ala Gln Glu Leu Pro
180 185 190
Gln Lys Arg Asp Thr Pro Phe Pro Asn Ala Trp Gln Gly Gln Gly Glu
195 200 205
Glu Thr Gln Val Glu Ala Lys Gln Pro Arg Pro Glu Pro Glu Glu Glu
210 215 220
Thr Glu Met Pro Thr Leu Asp Tyr Asn Asp Gln Ile Glu Lys Glu Asp
225 230 235 240
Tyr Glu Asp Phe Glu Tyr Ile Arg Arg Gln Lys Gln Pro Arg Pro Thr
245 250 255
Pro Ser Arg Arg Arg Leu Trp Pro Glu Arg Pro Glu Glu Lys Thr Glu
260 265 270
Glu Pro Glu Glu Arg Lys Glu Val Glu Pro Pro Leu Lys Pro Leu Leu
275 280 285
Pro Pro Asp Tyr Gly Asp Ser Tyr Val Ile Pro Asn Tyr Asp Asp Leu
290 295 300
Asp Tyr Tyr Phe Pro His Pro Pro Pro Gln Lys Pro Asp Val Gly Gln
305 310 315 320
Glu Val Asp Glu Glu Lys Glu Glu Met Lys Lys Pro Lys Lys Glu Gly
325 330 335
Ser Ser Pro Lys Glu Asp Thr Glu Asp Lys Trp Thr Val Glu Lys Asn
340 345 350
Lys Asp His Lys Gly Pro Arg Lys Gly Glu Glu Leu Glu Glu Glu Trp
355 360 365
Ala Pro Val Glu Lys Ile Lys Cys Pro Pro Ile Gly Met Glu Ser His
370 375 380
Arg Ile Glu Asp Asn Gln Ile Arg Ala Ser Ser Met Leu Arg His Gly
385 390 395 400
Leu Gly Ala Gln Arg Gly Arg Leu Asn Met Gln Ala Gly Ala Asn Glu
405 410 415
Asp Asp Tyr Tyr Asp Gly Ala Trp Cys Ala Glu Asp Glu Ser Gln Thr
420 425 430
Gln Trp Ile Glu Val Asp Thr Arg Arg Thr Thr Arg Phe Thr Gly Val
435 440 445
Ile Thr Gln Gly Arg Asp Ser Ser Ile His Asp Asp Phe Val Thr Thr
450 455 460
Phe Phe Val Gly Phe Ser Asn Asp Ser Gln Thr Trp Val Met Tyr Thr
465 470 475 480
Asn Gly Tyr Glu Glu Met Thr Phe Tyr Gly Asn Val Asp Lys Asp Thr
485 490 495
Pro Val Leu Ser Glu Leu Pro Glu Pro Val Val Ala Arg Phe Ile Arg
500 505 510
Ile Tyr Pro Leu Thr Trp Asn Gly Ser Leu Cys Met Arg Leu Glu Val
515 520 525
Leu Gly Cys Pro Val Thr Pro Val Tyr Ser Tyr Tyr Ala Gln Asn Glu
530 535 540
Val Val Thr Thr Asp Ser Leu Asp Phe Arg His His Ser Tyr Lys Asp
545 550 555 560
Met Arg Gln Leu Met Lys Ala Val Asn Glu Glu Cys Pro Thr Ile Thr
565 570 575
Arg Thr Tyr Ser Leu Gly Lys Ser Ser Arg Gly Leu Lys Ile Tyr Ala
580 585 590
Met Glu Ile Ser Asp Asn Pro Gly Asp His Glu Leu Gly Glu Pro Glu
595 600 605
Phe Arg Tyr Thr Ala Gly Ile His Gly Asn Glu Val Leu Gly Arg Glu
610 615 620
Leu Leu Leu Leu Leu Met Gln Tyr Leu Cys Gln Glu Tyr Arg Asp Gly
625 630 635 640
Asn Pro Arg Val Arg Asn Leu Val Gln Asp Thr Arg Ile His Leu Val
645 650 655
Pro Ser Leu Asn Pro Asp Gly Tyr Glu Val Ala Ala Gln Met Gly Ser
660 665 670
Glu Phe Gly Asn Trp Ala Leu Gly Leu Trp Thr Glu Glu Gly Phe Asp
675 680 685
Ile Phe Glu Asp Phe Pro Asp Leu Asn Ser Val Leu Trp Ala Ala Glu
690 695 700
Glu Lys Lys Trp Val Pro Tyr Arg Val Pro Asn Asn Asn Leu Pro Ile
705 710 715 720
Pro Glu Arg Tyr Leu Ser Pro Asp Ala Thr Val Ser Thr Glu Val Arg
725 730 735
Ala Ile Ile Ser Trp Met Glu Lys Asn Pro Phe Val Leu Gly Ala Asn
740 745 750
Leu Asn Gly Gly Glu Arg Leu Val Ser Tyr Pro Tyr Asp Met Ala Arg
755 760 765
Thr Pro Ser Gln Glu Gln Leu Leu Ala Glu Ala Leu Ala Ala Ala Arg
770 775 780
Gly Glu Asp Asp Asp Gly Val Ser Glu Ala Gln Glu Thr Pro Asp His
785 790 795 800
Ala Ile Phe Arg Trp Leu Ala Ile Ser Phe Ala Ser Ala His Leu Thr
805 810 815
Met Thr Glu Pro Tyr Arg Gly Gly Cys Gln Ala Gln Asp Tyr Thr Ser
820 825 830
Gly Met Gly Ile Val Asn Gly Ala Lys Trp Asn Pro Arg Ser Gly Thr
835 840 845
Phe Asn Asp Phe Ser Tyr Leu His Thr Asn Cys Leu Glu Leu Ser Val
850 855 860
Tyr Leu Gly Cys Asp Lys Phe Pro His Glu Ser Glu Leu Pro Arg Glu
865 870 875 880
Trp Glu Asn Asn Lys Glu Ala Leu Leu Thr Phe Met Glu Gln Val His
885 890 895
Arg Gly Ile Lys Gly Val Val Thr Asp Glu Gln Gly Ile Pro Ile Ala
900 905 910
Asn Ala Thr Ile Ser Val Ser Gly Ile Asn His Gly Val Lys Thr Ala
915 920 925
Ser Gly Gly Asp Tyr Trp Arg Ile Leu Asn Pro Gly Glu Tyr Arg Val
930 935 940
Thr Ala His Ala Glu Gly Tyr Thr Ser Ser Ala Lys Ile Cys Asn Val
945 950 955 960
Asp Tyr Asp Ile Gly Ala Thr Gln Cys Asn Phe Ile Leu Ala Arg Ser
965 970 975
Asn Trp Lys Arg Ile Arg Glu Ile Leu Ala Met Asn Gly Asn Arg Pro
980 985 990
Ile Leu Gly Val Asp Pro Ser Arg Pro Met Thr Pro Gln Gln Arg Arg
995 1000 1005
Met Gln Gln Arg Arg Leu Gln Tyr Arg Leu Arg Met Arg Glu Gln Met
1010 1015 1020
Arg Leu Arg Arg Leu Asn Ser Thr Ala Gly Pro Ala Thr Ser Pro Thr
1025 1030 1035 1040
Pro Ala Leu Met Pro Pro Pro Ser Pro Thr Pro Ala Ile Thr Leu Arg
1045 1050 1055
Pro Trp Glu Val Leu Pro Thr Thr Thr Ala Gly Trp Glu Glu Ser Glu
1060 1065 1070
Thr Glu Thr Tyr Thr Glu Val Val Thr Glu Phe Glu Thr Glu Tyr Gly
1075 1080 1085
Thr Asp Leu Glu Val Glu Glu Ile Glu Glu Glu Glu Glu Glu Glu Glu
1090 1095 1100
Glu Glu Met Asp Thr Gly Leu Thr Phe Pro Leu Thr Thr Val Glu Thr
1105 1110 1115 1120
Tyr Thr Val Asn Phe Gly Asp Phe
1125
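The coding sequence locations stated for SEQ ID NO:15 (1027...3561) and SEQ ID NO:17 (46...3429) can be cross-checked arithmetically against the stated lengths of SEQ ID NO:16 (845 amino acids) and SEQ ID NO:18 (1128 amino acids). The Python sketch below is illustrative only and is not part of the original disclosure. The helper name `codon_count` is hypothetical, and the locations are assumed to be 1-based and inclusive, as is conventional for such feature tables.

```python
# Illustrative sketch only; `codon_count` is a hypothetical helper name.
# Assumes 1-based, inclusive feature coordinates.

def codon_count(start: int, end: int) -> int:
    """Return the number of codons spanned by an inclusive coding-sequence location."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("location does not span a whole number of codons")
    return length // 3


print(codon_count(1027, 3561))  # 845, matching the length of SEQ ID NO:16
print(codon_count(46, 3429))    # 1128, matching the length of SEQ ID NO:18
```

In both nucleotide listings the stated location ends at the last sense codon (TTC, Phe); the TGA that follows lies outside the stated range.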
What is claimed is:

Claims

1. A substantially pure E2A-BP polypeptide, wherein the amino acid sequence of said polypeptide is at least 80% identical to SEQ ID NO:16.
2. The polypeptide of claim 1, wherein the amino acid sequence of said polypeptide is at least 95% identical to SEQ ID NO:16.
3. The polypeptide of claim 1, wherein the amino acid sequence of said polypeptide comprises SEQ ID NO:16.
4. A substantially pure E2A-BP polypeptide, wherein the amino acid sequence of said polypeptide is at least 80% identical to SEQ ID NO:18.
5. The polypeptide of claim 4, wherein the amino acid sequence of said polypeptide is at least 95% identical to SEQ ID NO:18.
6. The polypeptide of claim 4, wherein the amino acid sequence of said polypeptide comprises SEQ ID NO:18.
7. Isolated DNA encoding the polypeptide of claim 1.
8. The isolated DNA of claim 7, wherein the amino acid sequence of the polypeptide comprises the coding sequence of SEQ ID NO:16.
9. The DNA of claim 8, wherein the nucleotide sequence of said DNA comprises the coding sequence of SEQ ID NO:15.
10. The DNA of claim 7, wherein said DNA hybridizes under high stringency conditions to a probe having a nucleotide sequence complementary to the entire coding sequence of SEQ ID NO:15.
11. Isolated DNA encoding the polypeptide of claim 6.
12. The DNA of claim 11, wherein the nucleotide sequence of said DNA comprises the coding sequence of SEQ ID NO:17.
13. The DNA of claim 11, wherein said DNA hybridizes under high stringency conditions to a probe having a nucleotide sequence complementary to the entire coding sequence of SEQ ID NO:17.
14. The DNA of claim 7, wherein the nucleotide sequence of said DNA comprises SEQ ID NO:2.
15. The DNA of claim 7, wherein said DNA hybridizes under high stringency conditions to a probe having a nucleotide sequence complementary to the entire coding sequence of SEQ ID NO:2.
16. Isolated DNA comprising the human E2A-BP promoter, but not encoding human E2A-BP polypeptide.
17. The DNA of claim 16, further comprising a sequence which encodes a heterologous polypeptide and is operably linked to said promoter.
18. The DNA of claim 16, further comprising a segment of DNA which is operably linked to said promoter and transcribed into an RNA that is antisense to an mRNA naturally produced in a vascular smooth muscle cell.
19. An isolated, single-stranded nucleic acid consisting of a nucleotide sequence which is antisense to at least a 17 nucleotide portion of an E2A-BP mRNA.
20. The nucleic acid of claim 19, wherein said nucleic acid consists of less than 50 nucleotides, said nucleotides being deoxyribonucleotides or stabilized analogues thereof.
21. A nucleic acid having a sequence comprising
(a) an expression control sequence which permits expression in a human vascular smooth muscle cell, operably linked to
(b) a sequence which is transcribed into an RNA antisense to at least a 17 nucleotide portion of a naturally occurring E2A-BP mRNA.
22. A method of inhibiting expression of E2A-BP in a cell, said method comprising introducing into said cell the nucleic acid of claim 19.
23. A method of inhibiting expression of E2A-BP in a cell, said method comprising introducing into said cell the nucleic acid of claim 21.
24. A method of inhibiting proliferation of a vascular smooth muscle cell, said method comprising introducing into said cell a compound which inhibits binding between E2A-BP and transcription factor E2A.
25. A method for determining the ability of a candidate compound to inhibit binding of E2A-BP to E2A, said method comprising the steps of:
(a) contacting E2A with E2A-BP in the presence of said candidate compound; and
(b) determining the level of E2A-BP binding to said E2A, wherein a lower level of binding in the presence of said compound, compared to the level of binding in the absence of said compound, is an indication of the ability of said candidate compound to inhibit E2A-BP/E2A binding.
26. A method for determining the ability of a candidate compound to inhibit binding of E2A-BP to E2A, said method comprising the steps of:
(a) providing E2A with E2A-BP bound thereto to form a complex;
(b) contacting said complex with said candidate compound; and
(c) determining whether said candidate compound decreases the binding of E2A-BP to E2A in said complex, as an indication of the ability of said candidate compound to inhibit E2A-BP/E2A binding.
27. A method for determining the ability of a candidate compound to inhibit binding of E2A-BP to E2A, said method comprising the steps of:
(a) providing a cell that expresses E2A-BP and E2A;
(b) culturing said cell in the presence of said candidate compound; and
(c) determining the level of expression of an E2A-regulated gene in said cell, wherein an increase in said level of expression in the presence of said compound, compared to the level of expression in the absence of said compound, is an indication of the ability of said candidate compound to inhibit E2A-BP/E2A binding.
28. A method of promoting growth of a vascular smooth muscle cell, said method comprising introducing into said cell the polypeptide of claim 1.
29. A method of promoting growth of a vascular smooth muscle cell, said method comprising introducing into said cell the nucleic acid of claim 7.
30. A genetically altered mouse bearing a mutation in at least one endogenous E2A-BP gene which prevents that gene from expressing functional mouse E2A-BP polypeptide.
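The length limits recited in claims 19 and 20 (antisense to at least a 17 nucleotide portion of an mRNA, and fewer than 50 nucleotides in total) can be illustrated with a short sketch. It assumes simple Watson-Crick reverse complementation and writes the oligonucleotide as deoxyribonucleotides; the helper name `antisense_dna` and the toy mRNA string are hypothetical placeholders and are not taken from any SEQ ID NO in the application.

```python
# DNA base that pairs with each RNA base (oligo written as deoxyribonucleotides).
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

MIN_LEN = 17  # "at least a 17 nucleotide portion" (claim 19)
MAX_LEN = 49  # "less than 50 nucleotides" (claim 20)


def antisense_dna(mrna: str, start: int, length: int) -> str:
    """Hypothetical helper: DNA oligo antisense to mrna[start:start+length].

    The result is the reverse complement of the chosen window, read 5'->3'.
    """
    if not MIN_LEN <= length <= MAX_LEN:
        raise ValueError(f"length must be between {MIN_LEN} and {MAX_LEN}")
    if start < 0 or start + length > len(mrna):
        raise ValueError("window falls outside the mRNA")
    window = mrna[start:start + length].upper()
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(window))


# Placeholder sequence for illustration only; NOT an E2A-BP sequence.
toy_mrna = "AUGGCUAGCUUAGGCAUCGAUCGGAUCCAUG"
oligo = antisense_dna(toy_mrna, 0, 17)
print(oligo, len(oligo))
```

The sketch only checks length and complementarity; it says nothing about hybridization efficiency or about the stabilized analogues mentioned in claim 20.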
PCT/US1997/004117 1996-03-15 1997-03-14 E2a-binding protein WO1997033900A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US1343996P 1996-03-15 1996-03-15
US60/013,439 1996-03-15

Publications (1)

Publication Number Publication Date
WO1997033900A1 (en) 1997-09-18

Family

ID=21759978

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/004117 WO1997033900A1 (en) 1996-03-15 1997-03-14 E2a-binding protein

Country Status (1)

Country Link
WO (1) WO1997033900A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0588118A2 (en) * 1992-08-28 1994-03-23 Hoechst Japan Limited Bone-related carboxypeptidase-like protein and process for its production
US5460951A (en) * 1992-08-28 1995-10-24 Hoechst Japan Limited Bone-related carboxypeptidase-like protein and process for its production

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NATURE, 02 November 1995, Vol. 378, HE et al., "A Eukaryotic Transcriptional Repressor with Carboxypeptidase Activity", pages 92-96. *

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA JP MX

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 97532886

Format of ref document f/p: F

NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase