AU735512B2

AU735512B2 - Modulators of BRCA1 activity

Info

Publication number: AU735512B2
Application number: AU38293/97A
Authority: AU
Inventors: Carol Ligenfelter; Paul Polakis; Bonnee Rubinfeld; Terilyn T. Vuong
Original assignee: Onyx Pharmaceuticals Inc
Current assignee: Onyx Pharmaceuticals Inc
Priority date: 1996-09-04
Filing date: 1997-08-06
Publication date: 2001-07-12
Anticipated expiration: 2017-08-06
Also published as: JP2001502893A; CA2259959A1; CN1228811A; EP0929668A1; AU3829397A; WO1998010066A1

Description

WO 98/10066 PCT/US97/13944

MODULATORS OF BRCA1 ACTIVITY

Field of the Invention

The invention described herein relates generally to the field of human disease, and more specifically to treating and diagnosing disease based on the presence of modulators of BRCA1 activity.

Background

Breast cancer is one of the leading causes of cancer deaths of women in the United States, and approximately 170,000 women are affected by the disease each year.

About 5% of these reported cases are thought to result from a patient's genetic predisposition to the disease. Breast cancer is generally considered to be classifiable as early-age onset and late-age onset, the latter being defined as occurring at about age Approximately 25% of cases diagnosed before the age of 40 are thought to be familial, and thus to have an underlying genetic component. Late-age onset breast cancer is also often familial, although the risk of a family member developing the disease when relatives have presented with it is lower than for early-age onset.

As a result of studies involving families with inherited early-onset breast and ovarian cancers, a gene thought to be involved in these diseases has been mapped to the long arm of chromosome 17 and termed BRCA1, or breast cancer one gene. See, Hitoyuki, et al., Cancer Res. vol. 55: 2998-3002. Additional studies on sporadic cases of breast cancer have also established a genetic link between this disease and BRCA1, which was more precisely localized to the chromosomal region 17q21. See, Hall, J. et al. Science, vol. 250: 1684-1689 (1990).

Recently, the BRCA1 gene has been cloned, and shown to encode a protein having the properties of a tumor suppressor protein. See, Miki, et al. Science, vol. 266: 66-71; and WO96/05306. It has been known for some time that a variety of cancers are caused, at least in part, by mutations to certain normal genes, termed "proto-oncogenes." Proto-oncogenes are involved in regulating normal cell growth in ways that are only now beginning to be appreciated at the molecular level. The mutated proto-oncogenes, or cancer-causing genes termed "oncogenes," disrupt normal cell growth, which ultimately causes the death of the organism if the cancer is not detected and treated in time. During normal or cancer cell growth, proto-oncogenes or oncogenes are counterbalanced by growth-regulating proteins which regulate or try to regulate the growth of normal or cancer cells, respectively. Such proteins are termed "tumor suppressor proteins," and include BRCA1, p53, retinoblastoma protein (Rb), adenomatous polyposis coli protein (APC), Wilms' tumor 1 protein (WT1), neurofibromatosis type 1 protein (NF1), and neurofibromatosis type 2 protein (NF2).

BRCA1 cDNA encodes a 1863 amino acid protein with a predicted molecular weight of approximately 207,000. See, Miki, et al. (1994) Science vol. 266, pages 66-71. The cloning and characterization of BRCA1 has facilitated establishing it as a tumor suppressor protein. For example, recent work by several investigators has shown that transfection and expression of the BRCA1 gene sequence in MCF-7 tumor cells retards tumor growth in vivo, and extends the survival time of tumor-bearing animals. See, Holt, J. et al, (1996) Nat. Genet. vol. 12, pages 298-302. Similar results were obtained using a retroviral vector expressing wild-type BRCA1 against an established MCF-7 peritoneal tumor.

Considerable work has been done to identify those regions of BRCA1 that affect its tumor suppressor activity. It appears that different regions of the molecule may affect its tumor suppressor activity differently. For instance, near full length truncated BRCA1 proteins do not inhibit breast cancer cell growth, but do inhibit ovarian cancer cell growth. See, Holt, J. et al, (1996) Nat. Genet. vol. 12, pages 298-302. These observations strongly suggest that different host cell factors, presumably proteins, are interacting with different regions of BRCA1 to affect cell growth.

Over the past several years, the interactions of certain tumor suppressor proteins with host cell proteins have begun to be elucidated. See, Levin, Annu. Rev. Biochem. 1993, vol. 62: pages 623-651. The identification of proteins involved in these interactions will facilitate the development of novel diagnostic methods, as well as novel therapeutics, for identifying and treating cancer. For example, the retinoblastoma tumor suppressor protein is phosphorylated at serine residues adjacent to a proline. The level of phosphorylation is high through S, G2, and M-phase of the cell cycle. The kinase that effects this reaction is, in turn, activated by a cyclin that regulates events in the cell cycle. Subsequently, in late mitosis, a phosphatase removes the phosphate groups from the protein, and returns the retinoblastoma tumor suppressor protein to an unphosphorylated state in G0-G1. Clearly, the identification of drugs that can affect these interactions can be expected to play a critical role in regulating cell growth and thus be useful in the treatment of cancer.

To date, however, there have been few, if any, studies on the interaction of proteins with the tumor suppressor protein BRCA1. In order to better develop methods to diagnose and treat both breast and ovarian cancer, the identification and isolation of such proteins is critical.

Summary of the Invention

In one aspect the present invention provides an isolated nucleic acid sequence that encodes a BRCA1 Modulator Protein wherein said sequence is selected from the group consisting of 091-21A31, Sequence ID No. 1, 091-1F84, Sequence ID No. 3 and 091-132Q20, Sequence ID No. 5.

In a further aspect the present invention provides an isolated nucleic acid sequence that encodes a protein encoded by the cDNA on deposit with the ATCC with accession no. 98141 (091-1F84, Sequence ID No. 3).

In an even further aspect the present invention provides an isolated nucleic acid sequence that encodes a protein encoded by the cDNA on deposit with the ATCC with accession no. 98142 (091-21A31, Sequence ID No. 1).

In yet an even further aspect the present invention provides an isolated nucleic acid sequence that encodes a protein encoded by the cDNA on deposit with the ATCC with accession no. 98143 (091-132Q20, Sequence ID No. 5).

The discussion of documents, acts, materials, devices, articles and the like is included in this specification solely for the purpose of providing a context for the present invention. It is not suggested or represented that any or all of these matters formed part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed in Australia before the priority date of each claim of this application.

Brief Description of the Drawings

Figure 1 shows the cDNA and amino acid sequence of the BRCA1 Modulator Protein, depicted in Sequence ID No. 1, 091-21A31.

Figure 2 shows the cDNA and amino acid sequence of the BRCA1 Modulator Protein, depicted in Sequence ID No. 3, 091-1F84.

Figure 3 shows the cDNA and amino acid sequence of the BRCA1 Modulator Protein, depicted in Sequence ID No. 5, 091-132Q20.

Figure 4 shows the format of an assay to identify compounds that increase the intracellular levels of BRCA1.

Table 1 shows the regions of BRCA1 that interact with the BRCA1 Modulator Proteins 091-1F84, Sequence ID No. 3, 091-21A31, Sequence ID No. 1 and 091-132Q20, Sequence ID No. 5. The experiment was conducted using the two-hybrid assay as described in U.S. Patent No. 5,283,173, or Chien et al., 1991, Proc. Natl. Acad. Sci. USA, 88:9578-9582. The cDNAs that encode 091-1F84, Sequence ID No. 3, 091-21A31, Sequence ID No. 1 and 091-132Q20, Sequence ID No. 5 were fused to the GAL4 activation domain, and those regions of BRCA1 shown in the table were fused to the binding domain of GAL4. The sign is a subjective measure of the amount of β-galactosidase activity, with one being the lowest and three being the highest activity.

Table 2 shows regions of the BRCA1 Modulator Protein 091-21A31, Sequence ID No. 1 that interact with regions of BRCA1.

Detailed Description of the Invention

All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

Definitions

At the outset it is worth noting that unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

Generally, the nomenclature used herein and the laboratory procedures described below are those well known and commonly employed in the art.

Standard techniques are used for recombinant nucleic acid methods, polynucleotide synthesis, and microbial culture and transformation (e.g., electroporation, lipofection). Generally enzymatic reactions and purification steps are performed according to the manufacturer's specifications. The techniques and procedures are generally performed according to conventional methods in the art and various general references (see generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd. edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, which is incorporated herein by reference) which are provided throughout this document. The nomenclature used herein and the laboratory procedures in analytical chemistry, organic synthetic chemistry, and pharmaceutical formulation described below are those well known and commonly employed in the art.

Standard techniques are used for chemical syntheses, chemical analyses, pharmaceutical formulation and delivery, and treatment of patients.

In the formulas representing selected specific embodiments of BRCA1 or BRCA1 Modulator Proteins of the present invention, the amino- and carboxy-terminal groups, although often not specifically shown, will be understood to be in the form they would assume at physiological pH values, unless otherwise specified. Thus, the N-terminal H₂⁺ and C-terminal O⁻ at physiological pH are understood to be present though not necessarily specified and shown, either in specific examples or in generic formulas. In the polypeptide notation used herein, the left-hand end of the molecule is the amino-terminal end and the right-hand end is the carboxy-terminal end, in accordance with standard usage and convention. Of course, the basic and acid addition salts, including those which are formed at nonphysiological pH values, are also included in the compounds of the invention. The amino acid residues described herein are preferably in the "L" isomeric form. Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids, unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, lactic acid, and other unconventional amino acids may also be suitable components for polypeptides of the present invention, as long as the desired functional property is retained by the polypeptide. For the peptides shown, each encoded residue where appropriate is represented by a three letter designation, corresponding to the trivial name of the conventional amino acid, in keeping with standard polypeptide nomenclature (described in J. Biol. Chem., 243:3552-59 (1969) and adopted at 37 CFR). Free functional groups, including those at the carboxy- or amino-terminus, referred to as noninterfering substituents, can also be modified by amidation, acylation or other substitution, which can, for example, change the solubility of the compounds without affecting their activity.
This may be particularly useful in those instances where BRCA1 Modulator Proteins are known to have certain regions that bind to BRCA1, and it is desirable to make soluble peptides from these regions.

As employed throughout the disclosure, the following terms, unless otherwise indicated, shall be understood to have the following meanings:

Throughout the description and claims of the specification the word "comprise" and variations of the word, such as "comprising" and "comprises", are not intended to exclude other additives, components, integers or steps.

The term "isolated protein" referred to herein means a protein of cDNA, recombinant RNA, or synthetic origin or some combination thereof, which by virtue of its origin the "isolated protein" is not substantially associated with proteins found in nature, is substantially free of other proteins from the same source, e.g. free of human proteins, may be expressed by a cell from a different species, or does not occur in nature.

The term "naturally-occurring" as used herein as applied to an object refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring.

The term "polynucleotide" as referred to herein means a polymeric form of nucleotides of at least 10 bases in length, either ribonucleotides or deoxynucleotides or a modified form of either type of nucleotide. The term includes single and double stranded forms of DNA.

The term "oligonucleotide" referred to herein includes naturally occurring, and modified nucleotides linked together by naturally occurring, and non-naturally occurring oligonucleotide linkages. Oligonucleotides are a polynucleotide subset with 200 bases or fewer in length. Preferably oligonucleotides are 10 to 60 bases in length and most preferably 12, 13, 14, 15, 16, 17, 18, 19, or 20 to 40 bases in length.

Oligonucleotides are usually single stranded, e.g. for probes; although oligonucleotides may be double stranded, e.g. for use in the construction of a gene mutant.

Oligonucleotides of the invention can be either sense or antisense oligonucleotides. The term "naturally occurring nucleotides" referred to herein includes deoxyribonucleotides and ribonucleotides. The term "modified nucleotides" referred to herein includes nucleotides with modified or substituted sugar groups and the like. The term "oligonucleotide linkages" referred to herein includes oligonucleotide linkages such as phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoroaniladate, phosphoroamidate, and the like. An oligonucleotide can include a label for detection, if desired.

The term "sequence homology" referred to herein describes the proportion of base matches between two nucleic acid sequences or the proportion of amino acid matches between two amino acid sequences. When sequence homology is expressed as a percentage, the percentage denotes the proportion of matches over the length of sequence from BRCA1 that is compared to some other sequence. Gaps (in either of the two sequences) are permitted to maximize matching; gap lengths of 15 bases or less are usually used, 6 bases or less are preferred, with 2 bases or less more preferred.

When using oligonucleotides as probes or treatments, the sequence homology between the target nucleic acid and the oligonucleotide sequence is generally not less than 17 target base matches out of 20 possible oligonucleotide base pair matches, preferably not less than 9 matches out of 10 possible base pair matches, and most preferably not less than 19 matches out of 20 possible base pair matches.

Two amino acid sequences are homologous if there is a partial or complete identity between their sequences. For example, 85% homology means that 85% of the amino acids are identical when the two sequences are aligned for maximum matching.

Gaps (in either of the two sequences being matched) are allowed in maximizing matching; gap lengths of 5 or less are preferred with 2 or less being more preferred.
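As a rough illustration of the percentage homology described above, the following Python sketch counts identical positions between two sequences that have already been aligned, with gaps written as "-". The function names, the gap symbol, and the choice of the ungapped length of the first sequence as the denominator are assumptions made for this example; the sketch does not perform the alignment itself.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of residues in sequence A that are identical in sequence B.

    Both inputs must already be aligned to the same length.
    The denominator is the ungapped length of sequence A (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    length_a = sum(1 for a in aligned_a if a != gap)
    if length_a == 0:
        raise ValueError("sequence A contains no residues")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / length_a


def longest_gap(aligned, gap="-"):
    """Length of the longest run of gap characters in one aligned sequence."""
    longest = run = 0
    for ch in aligned:
        run = run + 1 if ch == gap else 0
        longest = max(longest, run)
    return longest
```

A caller could combine the two helpers, for example accepting an alignment only when `longest_gap` is 5 or less for both sequences and `percent_identity` meets the chosen threshold.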

Alternatively and preferably, two protein sequences (or polypeptide sequences derived from them of at least 30 amino acids in length) are homologous, as this term is used herein, if they have an alignment score of more than 5 (in standard deviation units) using the program ALIGN with the mutation data matrix and a gap penalty of 6 or greater. See Dayhoff, in Atlas of Protein Sequence and Structure, 1972, volume National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this volume, pp. 1-10. The two sequences or parts thereof are more preferably homologous if their amino acids are greater than or equal to 50% identical when optimally aligned using the ALIGN program.

One of the properties of a BRCA1 Modulator Protein is the presence of a leucine zipper domain. The latter is defined as a stretch of amino acids rich in leucine residues, generally every seventh residue, which provide a means whereby a protein may dimerize to form either homodimers or heterodimers. Examples of proteins with leucine zippers include Jun and Fos.
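The heptad spacing of leucines described above can be screened for with a simple pattern search. The sketch below is only illustrative: the requirement of four leucines, each seven residues after the previous one (written L-x(6)-L-x(6)-L-x(6)-L), and the function name are assumptions for this example, and a match indicates a candidate region rather than a demonstrated dimerization domain.

```python
import re

# Four leucines, each seven residues apart: L-x(6)-L-x(6)-L-x(6)-L.
# A lookahead is used so that overlapping candidate repeats are all reported.
HEPTAD_LEUCINE_REPEAT = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_heptads(protein):
    """Return a list of (0-based start, matched segment) for each candidate."""
    seq = protein.upper()
    return [(m.start(), m.group(1)) for m in HEPTAD_LEUCINE_REPEAT.finditer(seq)]
```

Each reported segment is 22 residues long, so a protein with a longer run of heptad leucines yields several overlapping hits.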

An optional property of a BRCA1 Modulator Protein is the presence of a zinc finger domain, preferably of the type C3H2C3, C3HC4, or CX2CX,,27,,CXHX2H or CX2CX6,7CX2C; where C, X, and H denote cysteine, an amino acid, and histidine, respectively.

The domain binds zinc ions, and is often associated with proteins that bind DNA. Such domains are readily identified using an appropriate data base known to a skilled practitioner of this art, particularly the Prosite Protein Database.

As used herein, "substantially pure" means an object species is the predominant species present (i.e., on a molar basis it is more abundant than any other individual macromolecular species in the composition), and preferably a substantially purified fraction is a composition wherein the object species comprises at least about percent (on a molar basis) of all macromolecular species present. Generally, a substantially pure composition will comprise more than about 80 percent of all macromolecular species present in the composition, more preferably more than about 90%, 95%, and 99%. Most preferably, the object species is purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods) wherein the composition consists essentially of a single macromolecular species.
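The molar criteria in this definition can be expressed as a short calculation. The following Python sketch is a minimal illustration, assuming the molar amounts of each macromolecular species are already known; the function names, the species labels, and the default 0.80 cutoff (taken from the "more than about 80 percent" figure above) are choices made for this example.

```python
def molar_fractions(moles_by_species):
    """Convert molar amounts per species into fractions of the total."""
    total = sum(moles_by_species.values())
    if total <= 0:
        raise ValueError("total molar amount must be positive")
    return {name: amount / total for name, amount in moles_by_species.items()}


def is_substantially_pure(moles_by_species, target, threshold=0.80):
    """True if `target` is the predominant species and exceeds `threshold`."""
    fractions = molar_fractions(moles_by_species)
    target_fraction = fractions.get(target, 0.0)
    predominant = all(
        target_fraction > fraction
        for name, fraction in fractions.items()
        if name != target
    )
    return predominant and target_fraction > threshold
```

Raising `threshold` to 0.90, 0.95, or 0.99 corresponds to the more preferred levels of purity named in the definition.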

The phrases "Modulator Protein," "Modulator Peptide," or "Modulator Polypeptide" refer to proteins or peptides that affect the activity of the BRCA1 gene or the protein encoded by the gene. Each of these definitions is meant to encompass one or more such entities.

Chemistry terms herein are used according to conventional usage in the art, as exemplified by The McGraw-Hill Dictionary of Chemical Terms (ed. Parker, 1985), McGraw-Hill, San Francisco, incorporated herein by reference.

The production of proteins from cloned genes by genetic engineering is well known. See, e.g. U.S. Patent Number 4,761,371 to Bell et al. at column 6, line 3 to column 9, line 65. (The disclosure of all patent references cited herein is to be incorporated herein by reference.) The discussion which follows is accordingly intended as an overview of this field, and is not intended to reflect the full state of the art.

DNA regions are operably linked when they are functionally related to each other. For example: a promoter is operably linked to a coding sequence if it controls the transcription of the sequence; a ribosome binding site is operably linked to a coding sequence if it is positioned so as to permit translation. Generally, operably linked means contiguous and, in the case of leader sequences, contiguous and in reading frame.

Suitable host cells include prokaryotes, yeast cells, or higher eukaryotic cells.

Prokaryotes include gram negative or gram positive organisms, for example Escherichia coli (E. coli) or Bacilli. Higher eukaryotic cells include established cell lines of mammalian origin as described below. Exemplary host cells are DH5α, E. coli W3110 (ATCC 27,325), E. coli B, E. coli X1776 (ATCC 31,537) and E. coli 294 (ATCC 31,446). Pseudomonas species, Bacillus species, and Serratia marcescens are also suitable.

In an insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) may be used as a vector to express foreign genes (see Smith et al., 1983, J. Virol. 46: 584; Smith, U.S. Patent No. 4,215,051). In a specific embodiment described below, Sf9 insect cells are infected with a baculovirus vector expressing a glu-glu epitope tagged BRCA1 Modulator construct. See, Rubinfeld, et al., J. Biol. Chem. vol. 270, no. pp 5549-5555 (1995). Other epitope tags known in the art may be employed, including a 6x histidine tag, myc, or an EE-tag (i.e., a Glu-Glu-tag). "E" refers to the amino acid glutamic acid.

A broad variety of suitable microbial vectors are available. Generally, a microbial vector will contain an origin of replication recognized by the intended host, a promoter which will function in the host and a phenotypic selection gene such as a gene encoding proteins conferring antibiotic resistance or supplying an auxotrophic requirement. Similar constructs will be manufactured for other hosts. E. coli is typically transformed using pBR322. See Bolivar et al., Gene 2, 95 (1977). pBR322 contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells. Expression vectors should contain a promoter which is recognized by the host organism. This generally means a promoter obtained from the intended host. Promoters most commonly used in recombinant microbial expression vectors include the beta-lactamase (penicillinase) and lactose promoter systems (Chang et al., Nature 275, 615 (1978); and Goeddel et al., Nucleic Acids Res. 8, 4057 (1980) and EPO Application Publication Number 36,776) and the tac promoter (De Boer et al., Proc. Natl. Acad. Sci. USA 80, 21 (1983)). While these are commonly used, other microbial promoters are suitable. Details concerning nucleotide sequences of many promoters have been published, enabling a skilled worker to operably ligate them to DNA encoding BRCA1 in plasmid or viral vectors (Siebenlist et al., Cell 20, 269 (1980)). The promoter and Shine-Dalgarno (SD) sequence (for prokaryotic host expression) are operably linked to the DNA encoding BRCA1, i.e. they are positioned so as to promote transcription of the BRCA1 messenger RNA from the DNA. The SD sequence is thought to promote binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the 3' end of E. coli 16S rRNA (Steitz et al. (1979). In Biological Regulation and Development: Gene Expression (ed. R.F. Goldberger)). To express eukaryotic genes and prokaryotic genes with a weak ribosome-binding site, see Sambrook et al. (1989) "Expression of cloned genes in Escherichia coli." In Molecular Cloning: A Laboratory Manual. Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is an example of a coupled promoter system (Studier et al. (1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074). In addition, a hybrid promoter can also be composed of a bacteriophage promoter and an E. coli operator region (EPO Pub. No. 267,851).
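The base pairing between the SD sequence and the 3' end of 16S rRNA mentioned above suggests a simple way to look for a candidate ribosome-binding site upstream of a start codon. The Python sketch below is an illustration only: the consensus string AGGAGG, the scoring by number of identical positions, and the function name are assumptions for this example, not a method taken from this specification.

```python
# Assumed Shine-Dalgarno consensus (DNA alphabet); it is complementary to a
# pyrimidine-rich stretch near the 3' end of E. coli 16S rRNA.
SD_CONSENSUS = "AGGAGG"


def best_sd_site(upstream, consensus=SD_CONSENSUS):
    """Return (matches, 0-based offset) of the window best matching the consensus.

    `upstream` is the sequence immediately 5' of the start codon.
    Returns (0, -1) if no window shares any position with the consensus.
    """
    seq = upstream.upper().replace("U", "T")
    best = (0, -1)
    for i in range(len(seq) - len(consensus) + 1):
        window = seq[i:i + len(consensus)]
        score = sum(1 for a, b in zip(window, consensus) if a == b)
        if score > best[0]:
            best = (score, i)
    return best
```

A low best score for the region upstream of the ATG would be one hint that a construct has a weak ribosome-binding site.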

BRCA1 Modulators can be expressed intracellularly. A promoter sequence can be directly linked with a BRCA1 Modulator gene or a fragment thereof, in which case the first amino acid at the N-terminus will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus can be cleaved from the protein by in vitro incubation with cyanogen bromide or by either in vivo or in vitro incubation with a bacterial methionine N-terminal peptidase (EPO Pub. No. 219,237).

Eukaryotic microbes such as yeast cultures may be transformed with suitable BRCA1 Modulator vectors. See, e.g. U.S. Patent Number 4,745,057. Saccharomyces cerevisiae is the most commonly used among lower eukaryotic host microorganisms, although a number of other strains are commonly available. Yeast vectors may contain an origin of replication from the 2 micron yeast plasmid or an autonomously replicating sequence (ARS), a promoter, DNA encoding BRCA1 Modulator, sequences for polyadenylation and transcription termination, and a selection gene.

Suitable promoting sequences in yeast vectors include the promoters for metallothionein, 3-phosphoglycerate kinase (Hitzeman et al., J. Biol. Chem. 255, 2073 (1980)) or other glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 7, 149 (1968); and Holland et al., Biochemistry 17, 4900 (1978)), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. Suitable vectors and promoters for use in yeast expression are further described in R. Hitzman et al., EPO Publication Number 73,657.

Cultures of cells derived from multicellular organisms are a desirable host for recombinant BRCA1 Modulator synthesis. In principle, any higher eukaryotic cell culture is workable, whether from vertebrate or invertebrate culture. However, mammalian cells are preferred, as illustrated in the Examples. Propagation of such cells in cell culture has become a routine procedure. See Tissue Culture, Academic Press, Kruse and Paterson, editors (1973).

The transcriptional and translational control sequences in expression vectors to be used in transforming vertebrate cells are often provided by viral sources. For example, commonly used promoters are derived from CMV, polyoma, Adenovirus 2, and Simian Virus 40 (SV40). See, U.S. Patent Number 4,599,308.

An origin of replication may be provided either by construction of the vector to include an exogenous origin, such as may be derived from SV40 or other viral source (e.g., Polyoma, Adenovirus, VSV, or BPV), or may be provided by the host cell chromosomal replication mechanism. If the vector is integrated into the host cell chromosome, the latter may be sufficient.

Identification of BRCA1 Modulators

BRCA1 Modulators can be identified using several different techniques for detecting protein-protein interactions. Among the traditional methods which may be employed are co-immunoprecipitation, crosslinking and co-purification through gradients or chromatographic columns of cell lysates, or proteins obtained from cell lysates using BRCA1 to identify proteins in the lysate that interact with BRCA1. Such assays may employ full length BRCA1 or a BRCA1 peptide. Once isolated, such an intracellular protein can be identified and can, in turn, be used, in conjunction with standard techniques, to identify proteins with which it interacts. For example, at least a portion of the amino acid sequence of an intracellular protein which interacts with BRCA1 can be ascertained using techniques well known to those of skill in the art, such as the Edman degradation technique. (See, Creighton, 1983, "Proteins: Structures and Molecular Principles", W.H. Freeman Co., pp. 34-49). The amino acid sequence obtained may be used as a guide for the generation of oligonucleotide mixtures that can be used to screen for gene sequences encoding such intracellular proteins. Screening may be accomplished, for example, by standard hybridization or PCR techniques. Techniques for the generation of oligonucleotide mixtures and the screening are well-known. (See, Ausubel, supra., and PCR Protocols: A Guide to Methods and Applications, 1990, Innis, M. et al., eds. Academic Press, Inc., New York).

Additionally, methods may be employed which result in the simultaneous identification of genes which encode the intracellular proteins interacting with BRCA1. These methods include, for example, probing expression libraries, in a manner similar to the well known technique of antibody probing of λgt11 libraries, using labeled BRCA1 protein, or fusion protein, e.g., BRCA1 fused to a marker (e.g., an enzyme, fluor, luminescent protein, or dye), or an Ig-Fc domain.

One method which detects protein interactions in vivo, the two-hybrid system, is described in detail for illustration only and not by way of limitation. This system has been described (U.S. Patent No. 5,283,173; Chien et al., 1991, Proc. Natl. Acad. Sci. USA, 88:9578-9582) and is commercially available from Clontech (Palo Alto, CA).

Briefly, utilizing such a system, plasmids are constructed that encode two hybrid proteins: one plasmid consists of nucleotides encoding the DNA-binding domain of a transcription activator protein fused to a BRCA1 nucleotide sequence encoding BRCA1, or BRCA1 peptide or fusion protein, and the other plasmid consists of nucleotides encoding the transcription activator protein's activation domain fused to a cDNA encoding an unknown protein which has been recombined into this plasmid as a part of the cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter gene (e.g., HIS3 or lacZ) whose regulatory region contains the transcription activator's binding site. Neither hybrid protein alone can activate transcription of the reporter gene: the DNA-binding domain hybrid cannot because it does not provide activation function, and the activation domain hybrid cannot because it cannot localize to the activator's binding sites. Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in transcriptional activation of the reporter gene, which is detected by an assay for the reporter gene product.
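The logic of the two-hybrid readout described above can be summarized as a small truth-table model. The Python sketch below is a conceptual illustration only; the labels "BD" and "AD", the protein names, and the function name are hypothetical and chosen for this example, and the model ignores effects such as a bait that activates transcription on its own.

```python
def reporter_is_activated(hybrids, interactions):
    """Model whether the reporter gene is transcribed.

    hybrids: list of (domain, protein) pairs, where domain is "BD" for a
        DNA-binding domain fusion or "AD" for an activation domain fusion.
    interactions: set of frozensets, each holding two interacting proteins.
    The reporter is on only if some BD fusion partner binds some AD fusion
    partner, which brings the activation domain to the binding site.
    """
    bd_partners = [protein for domain, protein in hybrids if domain == "BD"]
    ad_partners = [protein for domain, protein in hybrids if domain == "AD"]
    return any(
        frozenset((bait, prey)) in interactions
        for bait in bd_partners
        for prey in ad_partners
    )
```

For instance, with `interactions = {frozenset(("bait_fragment", "prey_X"))}`, the reporter is modeled as active only when both a "BD" fusion to `bait_fragment` and an "AD" fusion to `prey_X` are present in the same cell.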

The two-hybrid system or related methodology may be used to screen activation domain libraries for proteins that interact with the "bait" gene product. By way of example, and not by way of limitation, preferably BRCA1 peptides or fusion proteins are used as the bait gene product. Full length BRCA1 alone can act as a transcriptional activator protein and thus cannot serve as "bait." Total genomic or cDNA sequences are fused to the DNA encoding an activation domain. This library and a plasmid encoding a hybrid of a bait BRCA1 gene product fused to the DNA-binding domain are cotransformed into a yeast reporter strain, and the resulting transformants are screened for those that have transcriptionally activated the reporter gene. For example, and not by way of limitation, a bait BRCA1 gene sequence, such as the open reading frame of BRCA1 (or a domain of BRCA1), can be cloned into a vector such that it is translationally fused to the DNA encoding the DNA-binding domain of the GAL4 protein. These colonies are purified and the library plasmids responsible for reporter gene transcription are isolated. DNA sequencing is then used to determine the nucleotide sequence of the clones which, in turn, reveals the identity of the protein sequences encoded by the library plasmids.

A cDNA library of the cell line from which proteins that interact with bait BRCA1 gene product are to be detected can be made using methods routinely practiced in the art. According to the particular system described herein, for example, the cDNA fragments can be inserted into a vector such that they are translationally fused to the transcriptional activation domain of GAL4. This library can be co-transformed along with the bait BRCA1 gene-GAL4 fusion plasmid into a yeast strain which contains a lacZ gene driven by a promoter which contains GAL4 activation sequence. A cDNA encoded protein, fused to GAL4 transcriptional activation domain, that interacts with bait BRCA1 gene product will reconstitute an active GAL4 protein and thereby drive expression of the HIS3 gene. Colonies which express HIS3 can be detected by their growth on petri dishes containing semi-solid agar based media lacking histidine. The cDNA can then be purified from these strains, and used to produce and isolate the bait BRCA1 gene-interacting protein using techniques routinely practiced in the art.

Using the above described two-hybrid technique several BRCA1 modulators were identified, and shown to share certain properties including a leucine zipper domain.

BRCA1 Modulator cDNA The cDNA, and deduced amino acid sequences, of three representative BRCA1 Modulator Proteins are shown in Figures 1-3. The cDNAs or the proteins that they encode are hereinafter referred to as 091-21A31, Sequence ID No. 1, 091-1F84, Sequence ID No. 3, and 091-132Q20, Sequence ID No. 5. The cDNAs encode proteins that have calculated molecular weights in the range of about 45-97kd. Particularly noteworthy is the presence of at least one leucine zipper motif, and optionally a zinc finger domain.

The BRCA1 Modulator Protein nucleotide sequences of the invention include: (a) the DNA sequences shown in Figures 1-3 or contained in the cDNA clones as deposited with the American Type Culture Collection (ATCC) on August 14, 1996 under accession numbers 98141 (091-1F84, Sequence ID No. 3), 98142 (091-21A31, Sequence ID No. 1), and 98143 (091-132Q20, Sequence ID No. 5); (b) any nucleotide sequence that hybridizes to the complement of the DNA sequence shown in Figures 1-3 or contained in the cDNA clones as deposited with the ATCC under highly stringent conditions, e.g., hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65°C, and washing in 0.1xSSC/0.1% SDS at 68°C (Ausubel F.M. et al., eds., 1989, Current Protocols in Molecular Biology, Vol. I, Green Publishing Associates, Inc., and John Wiley & Sons, Inc., New York, at p. 2.10.3) and encodes a functionally equivalent gene product; and (c) any nucleotide sequence that hybridizes to the complement of the DNA sequences that encode the amino acid sequence shown in Figures 1-3 or contained in the cDNA clones as deposited with the ATCC, as described above, under less stringent conditions, such as moderately stringent conditions, e.g., washing in 0.2xSSC/0.1% SDS at 42°C (Ausubel et al., 1989, supra), yet which still encodes a functionally equivalent BRCA1 Modulator Protein gene product. Functional equivalents include naturally occurring BRCA1 Modulator Protein genes present in other species, and mutant BRCA1 Modulator Protein genes, whether naturally occurring or engineered, which retain at least some of the functional activities of a BRCA1 Modulator Protein (e.g., binding to BRCA1). The invention also includes degenerate variants of sequences (a) through (c).

The invention also includes nucleic acid molecules, preferably DNA molecules, that hybridize to, and are therefore the complements of, the nucleotide sequences (a) through (c) in the preceding paragraph. Such hybridization conditions may be highly stringent or less highly stringent, as described above. In instances wherein the nucleic acid molecules are deoxyoligonucleotides ("oligos"), highly stringent conditions may refer, e.g., to washing in 6xSSC/0.05% sodium pyrophosphate at 37°C (for 14-base oligos), 48°C (for 17-base oligos), 55°C (for 20-base oligos), and 60°C (for 23-base oligos).

These nucleic acid molecules may encode or act as BRCA1 Modulator gene antisense molecules, useful, for example, in gene regulation and/or as antisense primers in amplification reactions of BRCA1 Modulator gene nucleic acid sequences. Such sequences may be used as part of ribozyme and/or triple helix sequences, also useful for BRCA1 Modulator gene regulation. Still further, such molecules may be used as components of diagnostic methods whereby, for example, the presence of a particular BRCA1 Modulator allele associated with uncontrolled cell growth (e.g., cancer) may be detected.

Further, it will be appreciated by one skilled in the art that a BRCA1 Modulator gene homolog may be isolated from nucleic acid of an organism of interest by performing PCR using two degenerate oligonucleotide primer pools designed on the basis of amino acid sequences within the BRCA1 Modulator gene product disclosed herein. The template for the reaction may be cDNA obtained by reverse transcription of mRNA prepared from, for example, human or non-human cell lines or cell types, such as breast or ovarian cells, known or suspected to express a BRCA1 Modulator gene allele.
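As a rough illustration of why degenerate primer pools are needed, the number of distinct oligonucleotides required to cover every possible coding of a short peptide can be counted from the standard genetic code. This is a minimal sketch; the function name `pool_degeneracy` and the example peptides are hypothetical and are not taken from the disclosed sequences.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def pool_degeneracy(peptide):
    """Count the distinct nucleotide sequences that can encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Hypothetical peptides: one rich in low-degeneracy residues, one not.
print(pool_degeneracy("MWKDE"))
print(pool_degeneracy("LRS"))
```

Peptide stretches rich in methionine and tryptophan give small pools, which is one reason such regions are commonly favored when choosing primer sites.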

The PCR product may be subcloned and sequenced to ensure that the amplified sequences represent the sequences of a BRCA1 Modulator gene. The PCR fragment may then be used to isolate a full length cDNA clone by a variety of methods. For example, the amplified fragment may be labeled and used to screen a cDNA library, such as a bacteriophage cDNA library. Alternatively, the labeled fragment may be used to isolate genomic clones via the screening of a genomic library.

PCR technology may also be utilized to isolate full length cDNA sequences. For example, RNA may be isolated, following standard procedures, from an appropriate cellular source (i.e., one known, or suspected, to express a BRCA1 Modulator gene, such as, for example, breast or ovarian cells). A reverse transcription reaction may be performed on the RNA using an oligonucleotide primer specific for the most 5' end of the amplified fragment for the priming of first strand synthesis. The resulting RNA/DNA hybrid may then be "tailed" with guanines using a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second strand synthesis may then be primed with a poly-C primer. Thus, cDNA sequences upstream of the amplified fragment may easily be isolated. For a review of cloning strategies which may be used, see Sambrook et al., 1989, supra.

A cDNA of a mutant BRCA1 Modulator gene may also be isolated, for example, by using PCR. In this case, the first cDNA strand may be synthesized by hybridizing an oligo-dT oligonucleotide to mRNA isolated from cells known or suspected to express the gene in an individual putatively carrying the mutant BRCA1 Modulator allele, and by extending the new strand with reverse transcriptase. The second strand of the cDNA is then synthesized using an oligonucleotide that hybridizes specifically to the 5' end of the normal gene. Using these two primers, the product is then amplified via PCR, cloned into a suitable vector, and subjected to DNA sequence analysis through methods well known to those of skill in the art. By comparing the DNA sequence of the mutant BRCA1 Modulator allele to that of the normal BRCA1 Modulator allele, the mutation(s) responsible for the loss or alteration of function of the mutant BRCA1 Modulator gene product can be ascertained.
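The comparison step can be sketched for the simplest case, in which the normal and mutant sequences are already aligned and differ only by base substitutions. The function name `point_differences` and the short sequences are hypothetical; insertions and deletions would require a proper alignment, which this sketch does not attempt.

```python
def point_differences(normal, mutant):
    """List (position, normal_base, mutant_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and of
    equal length; only substitutions are reported.
    """
    if len(normal) != len(mutant):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = zip(normal.upper(), mutant.upper())
    return [(i, n, m) for i, (n, m) in enumerate(pairs, start=1) if n != m]


# Hypothetical six-base example with a single C-to-T substitution.
print(point_differences("ATGGCC", "ATGGTC"))
```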

A genomic library can be constructed using DNA obtained from an individual suspected of or known to carry the mutant BRCA1 Modulator allele, or a cDNA library can be constructed using RNA from a cell type known, or suspected, to express the mutant BRCA1 Modulator allele. The normal BRCA1 Modulator gene or any suitable fragment thereof may then be labeled and used as a probe to identify the corresponding mutant BRCA1 Modulator allele in such libraries. Clones containing the mutant BRCA1 Modulator gene sequences may then be purified and subjected to sequence analysis according to methods well known to those of skill in the art.

Additionally, an expression library can be constructed utilizing cDNA synthesized from, for example, RNA isolated from a cell type known, or suspected, to express a mutant BRCA1 Modulator allele in an individual suspected of or known to carry such a mutant allele. In this manner, gene products made by the putatively mutant cell type may be expressed and screened using standard antibody screening techniques in conjunction with antibodies raised against the normal BRCA1 Modulator gene product, as described below. (For screening techniques, see, for example, Harlow, E. and Lane, eds., 1988, "Antibodies: A Laboratory Manual", Cold Spring Harbor Press, Cold Spring Harbor.) Additionally, screening can be accomplished by screening with labeled fusion proteins. In cases where a BRCA1 Modulator mutation results in an expressed gene product with altered function (e.g., as a result of a missense or a frameshift mutation), a polyclonal set of antibodies to a BRCA1 Modulator is likely to cross-react with the BRCA1 Modulator mutant. Such BRCA1 Modulator mutants detected via their reaction with labeled antibodies can be purified and subjected to sequence analysis according to methods well known to those of skill in the art.

The invention also encompasses nucleotide sequences that encode peptide fragments of a BRCA1 Modulator, truncated BRCA1 Modulators, and fusion proteins of a BRCA1 Modulator. Nucleotides encoding fusion proteins may include but are not limited to full length BRCA1 Modulators, truncated BRCA1 Modulators or peptide fragments fused to an unrelated protein or peptide, such as, for example, an epitope tag which aids in purification or detection of the resulting fusion protein; or an enzyme, fluorescent protein, or luminescent protein which can be used as a marker. The preferred epitope tag is glu-glu as described by Rubinfeld, et al., J. Biol. Chem. vol. 270, no. pp 5549-5555 (1995), and Grussenmyer, et al., Proc. Natl. Acad. Sci. U. S. A. vol. 82, pp. 7952-7954 (1985).

The invention also encompasses DNA vectors that contain any of the foregoing BRCA1 Modulator coding sequences and/or their complements (i.e., antisense); DNA expression vectors that contain any of the foregoing BRCA1 Modulator coding sequences operatively associated with a regulatory element that directs the expression of the coding sequences; and genetically engineered host cells that contain any of the foregoing BRCA1 Modulator coding sequences operatively associated with a regulatory element that directs the expression of the coding sequences in the host cell. As used herein, regulatory elements include but are not limited to inducible and non-inducible promoters, enhancers, operators and other elements known to those skilled in the art that drive and regulate expression. Such regulatory elements include but are not limited to the baculovirus promoter, cytomegalovirus hCMV immediate early gene, the early or late promoters of SV40 adenovirus, the lac system, the trp system, the TAC system, the TRC system, the major operator and promoter regions of phage λ, the control regions of fd coat protein, the promoter for 3-phosphoglycerate kinase, the promoters of acid phosphatase, and the promoters of the yeast-mating factors.

BRCA1 Modulator Proteins As mentioned above, Figures 1-3 show the cDNA, and deduced amino acid sequences, of three representative BRCA1 Modulator Proteins: 091-21A31, Sequence ID No. 1, 091-1F84, Sequence ID No. 3, and 091-132Q20, Sequence ID No. 5. 091-132Q20, Sequence ID No. 5 is not a full length sequence. The proteins have calculated molecular weights in the range of about 45-97kd. Particularly noteworthy is the presence of at least one leucine zipper motif, and optionally a zinc finger domain. For instance, 091-1F84, Sequence ID No. 3 has two leucine zippers, 091-132Q20, Sequence ID No. 5 has a single leucine zipper, while 091-21A31, Sequence ID No. 1 has a single leucine zipper and a zinc finger domain. Such domains are readily identified using the Prosite Protein Database.
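A motif scan of this kind can be approximated with a regular expression. The sketch below assumes the commonly cited PROSITE-style leucine zipper pattern, L-x(6)-L-x(6)-L-x(6)-L; the function name `find_leucine_zippers` and the test sequence are hypothetical and are not drawn from Figures 1-3.

```python
import re

# Assumed pattern: four leucines, each separated by six arbitrary residues.
# The lookahead lets overlapping candidate motifs be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=L.{6}L.{6}L.{6}L)")


def find_leucine_zippers(protein):
    """Return 0-based start positions of candidate leucine zipper motifs."""
    return [m.start() for m in LEUCINE_ZIPPER.finditer(protein.upper())]


# Hypothetical sequence: a two-residue prefix followed by one ideal motif.
example = "MK" + "LAAAAAA" * 3 + "L"
print(find_leucine_zippers(example))
```

A pattern this short matches by chance fairly often, so hits from such a scan are best treated as candidates to be checked against the full database annotation.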

The BRCA1 Modulator Proteins of the invention, peptide fragments, mutated, truncated or deleted forms thereof, and fusion proteins of these can be prepared for a variety of uses, including but not limited to the generation of antibodies, as reagents in diagnostic assays, the identification of and/or interaction with other cellular gene products involved in cell growth, as reagents in assays for screening for compounds that can be used in the treatment of unwanted cell growth disorders, including but not limited to cancer, and as pharmaceutical reagents useful in the treatment of such diseases.

By way of example, the 091-21A31, Sequence ID No. 1 BRCA1 Modulator Protein sequence begins with a methionine in a DNA sequence context consistent with a translation initiation site. The predicted molecular mass of this BRCA1 Modulator Protein is 53.3 kD.

The BRCA1 Modulator Protein amino acid sequences of the invention include the amino acid sequence shown in FIG. 1, or the amino acid sequence encoded by the cDNA clone, as deposited with the ATCC, as described above. Further, BRCA1 Modulator Proteins of other species are encompassed by the invention. In fact, any BRCA1 Modulator Protein encoded by the cDNAs described above is within the scope of the invention.

The invention also encompasses proteins that are functionally equivalent to the BRCA1 Modulator Protein encoded by the nucleotide sequences described above, as judged by any of a number of criteria, including but not limited to the ability to bind BRCA1, the binding affinity for BRCA1, and a change in cellular metabolism or change in phenotype when the BRCA1 Modulator Protein equivalent is present in an appropriate cell type (such as ovarian or breast cells). Such functionally equivalent BRCA1 Modulator Proteins include but are not limited to additions or substitutions of amino acid residues within the amino acid sequence encoded by the BRCA1 Modulator nucleotide sequences described above, but which result in a silent change, thus producing a functionally equivalent gene product. Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
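The four residue classes listed above can be written down directly, which makes it easy to check whether a proposed substitution stays within a class. This is a minimal sketch; the names `RESIDUE_CLASSES` and `is_conservative` are hypothetical, and treating "same class" as "conservative" is a simplifying assumption rather than a rule stated in the text.

```python
# One-letter codes grouped exactly as in the classes listed above.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def residue_class(residue):
    """Return the class name for a one-letter amino acid code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues fall in the same class (assumed criterion)."""
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"))  # leucine -> isoleucine
print(is_conservative("K", "E"))  # lysine -> glutamic acid
```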

While random mutations can be made to BRCA1 Modulator DNA (using random mutagenesis techniques well known to those skilled in the art) and the resulting mutant BRCA1 Modulator Proteins tested for activity, site-directed mutations of the BRCA1 Modulator coding sequence can be engineered (using site-directed mutagenesis techniques well known to those skilled in the art) to generate mutant BRCA1 Modulator Proteins with increased function, e.g., altered binding affinity for BRCA1.

For example, mutant BRCA1 Modulator Proteins can be engineered so that regions of interspecies identity are maintained, whereas the variable residues are altered, e.g., by deletion or insertion of an amino acid residue(s) or by substitution of one or more different amino acid residues. Conservative alterations at the variable positions can be engineered in order to produce a mutant BRCA1 Modulator Protein that retains function. Non-conservative changes can be engineered at these variable positions to alter function. Alternatively, where alteration of function is desired, deletion or non-conservative alterations of the conserved regions can be engineered.

One of skill in the art may easily test such mutant or deleted BRCA1 Modulator Proteins for these alterations in function using the teachings presented herein.

Other mutations to a BRCA1 Modulator coding sequence can be made to generate BRCA1 Modulator Proteins that are better suited for expression, scale up, etc. in the host cells chosen. For example, the triplet code for each amino acid can be modified to conform more closely to the preferential codon usage of the host cell's translational machinery.
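Codon adjustment of this kind can be sketched as a synonymous recoding pass over an in-frame coding sequence. The standard genetic code is used; the function name `recode`, the preferred-codon table, and the example sequence are hypothetical, and real host preference tables would have to be supplied by the user.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def recode(coding_dna, preferred):
    """Swap each codon for the host-preferred synonymous codon, if given.

    `preferred` maps a one-letter amino acid (or "*") to a codon. Codons
    whose amino acid has no entry are left unchanged.
    """
    dna = coding_dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TABLE.get(codon) != aa:
            raise ValueError(f"{codon!r} does not encode {aa!r}")
    return "".join(
        preferred.get(CODON_TABLE[dna[i:i + 3]], dna[i:i + 3])
        for i in range(0, len(dna), 3)
    )


# Hypothetical preference: use CTG for leucine.
print(recode("ATGCTATAA", {"L": "CTG"}))
```

Because only synonymous codons are substituted, the encoded amino acid sequence is unchanged by construction.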

Peptides corresponding to one or more domains (or a portion of a domain) of a BRCA1 Modulator Protein (e.g., leucine zippers, zinc fingers), truncated or deleted BRCA1 Modulator Proteins (e.g., BRCA1 Modulator Proteins in which portions of one or more of the above domains are deleted), as well as fusion proteins in which the full length of a BRCA1 Modulator Protein, a BRCA1 Modulator Protein peptide or truncated BRCA1 Modulator Protein is fused to an unrelated protein, are also within the scope of the invention and can be designed on the basis of the BRCA1 Modulator nucleotide and BRCA1 Modulator Protein amino acid sequences disclosed in this Section and above. Such fusion proteins include but are not limited to fusions to an epitope tag (such as is exemplified herein); or fusions to an enzyme, fluorescent protein, or luminescent protein which provide a marker function.

While the BRCA1 Modulator Proteins and peptides can be chemically synthesized (e.g., see Creighton, 1983, Proteins: Structures and Molecular Principles, W.H. Freeman & Co.), large polypeptides derived from the BRCA1 Modulator Protein and the full length BRCA1 Modulator Protein itself may advantageously be produced by recombinant DNA technology using techniques well known in the art for expressing nucleic acid containing BRCA1 Modulator gene sequences and/or coding sequences. Such methods can be used to construct expression vectors containing the BRCA1 Modulator nucleotide sequences described above and appropriate transcriptional and translational control signals. These methods include, for example, in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. See, for example, the techniques described in Sambrook et al., 1989, supra, and Ausubel et al., 1989, supra. Alternatively, RNA capable of encoding BRCA1 Modulator nucleotide sequences may be chemically synthesized using, for example, synthesizers. See, for example, the techniques described in "Oligonucleotide Synthesis", 1984, Gait, M.J. ed., IRL Press, Oxford, which is incorporated by reference herein in its entirety.

A variety of host-expression vector systems may be utilized to express the BRCA1 Modulator nucleotide sequences of the invention. Where a BRCA1 Modulator Protein peptide or polypeptide is a soluble secreted derivative, the peptide or polypeptide can be recovered from the culture medium. If the BRCA1 Modulator Protein peptide or polypeptide is not secreted, it may be isolated from the host cells. However, such engineered host cells themselves may be used in situations where it is important not only to retain the structural and functional characteristics of a BRCA1 Modulator Protein, but to assess biological activity, e.g., in drug screening assays.

The expression systems that may be used for purposes of the invention include but are not limited to microorganisms such as bacteria (e.g., E. coli, B. subtilis) transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing BRCA1 Modulator nucleotide sequences; yeast (e.g., Saccharomyces, Pichia) transformed with recombinant yeast expression vectors containing the BRCA1 Modulator nucleotide sequences; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing the BRCA1 Modulator sequences; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing BRCA1 Modulator nucleotide sequences; or mammalian cell systems (e.g., COS, CHO, BHK, 293, 3T3, U937) harboring recombinant expression constructs containing promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter).

In bacterial systems, a number of expression vectors may be advantageously selected depending upon the use intended for the BRCA1 Modulator gene product being expressed. For example, when a large quantity of such a protein is to be produced, for the generation of pharmaceutical compositions of BRCA1 Modulator Protein or for raising antibodies to the BRCA1 Modulator Protein, for example, vectors which direct the expression of high levels of fusion protein products that are readily purified may be desirable. Such vectors include, but are not limited to, the E. coli expression vector pUR278 (Ruther et al., 1983, EMBO J. 2:1791), in which the BRCA1 Modulator coding sequence may be ligated individually into the vector in frame with the lacZ coding region so that a fusion protein is produced; pIN vectors (Inouye & Inouye, 1985, Nucleic Acids Res. 13:3101-3109; Van Heeke & Schuster, 1989, J. Biol. Chem. 264:5503-5509); and the like. pGEX vectors may also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). If the inserted sequence encodes a relatively small polypeptide (less than 25 kD), such fusion proteins are generally soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned target gene product can be released from the GST moiety. Alternatively, if the resulting fusion protein is insoluble and forms inclusion bodies in the host cell, the inclusion bodies may be purified and the recombinant protein solubilized using techniques well known to one of skill in the art.

46: 584; Smith, U.S. Patent No. 4,215,051). In a specific embodiment described below, Sf9 insect cells are infected with baculovirus vectors expressing either a 6xHis-tagged construct or an (EE)-tagged BRCA1 Modulator construct.

In mammalian host cells, a number of viral-based expression systems may be utilized. Specific embodiments described more fully below express tagged BRCA1 Modulator cDNA sequences using a CMV promoter to transiently express recombinant protein in U937 cells or in Cos-7 cells. Alternatively, retroviral vector systems well known in the art may be used to insert the recombinant expression construct into host cells. For example, retroviral vector systems for transducing hematopoietic cells are described in published PCT applications WO 96/09400 and WO 94/29438.

In cases where an adenovirus is used as an expression vector, the BRCA1 Modulator nucleotide sequence of interest may be ligated to an adenovirus transcription/translation control complex, e.g., the late promoter and tripartite leader sequence. This chimeric gene may then be inserted in the adenovirus genome by in vitro or in vivo recombination. Insertion in a non-essential region of the viral genome (e.g., region E1 or E3) will result in a recombinant virus that is viable and capable of expressing the BRCA1 Modulator gene product in infected hosts. (See Logan & Shenk, 1984, Proc. Natl. Acad. Sci. USA 81:3655-3659). Specific initiation signals may also be required for efficient translation of inserted BRCA1 Modulator nucleotide sequences. These signals include the ATG initiation codon and adjacent sequences. In cases where an entire BRCA1 Modulator gene or cDNA, including its own initiation codon and adjacent sequences, is inserted into the appropriate expression vector, no additional translational control signals may be needed. However, in cases where only a portion of the BRCA1 Modulator coding sequence is inserted, exogenous translational control signals, including, perhaps, the ATG initiation codon, must be provided.

Furthermore, the initiation codon must be in phase with the reading frame of the desired coding sequence to ensure translation of the entire insert. These exogenous translational control signals and initiation codons can be of a variety of origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate transcription enhancer elements, transcription terminators, etc. (See Bittner et al., 1987, Methods in Enzymol. 153:516-544).
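The requirement that the initiation codon be in phase with the insert reduces to a modulo-3 check on the distance between the ATG and the first base of the insert's first codon. The following is a minimal sketch; the function name `atg_in_frame`, the index conventions (0-based), and the toy constructs are assumptions made for illustration.

```python
def atg_in_frame(construct, atg_index, insert_start):
    """Check that an exogenous ATG is in phase with the inserted coding region.

    `atg_index` is the 0-based position of the A of the ATG, and
    `insert_start` is the 0-based position of the first base of the first
    complete codon of the insert.
    """
    seq = construct.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    return (insert_start - atg_index) % 3 == 0


# Hypothetical constructs: ATG, a linker, then the insert.
print(atg_in_frame("ATGGGAGCCAAA", 0, 6))    # three-base linker
print(atg_in_frame("ATGGGAAGCCAAA", 0, 7))   # four-base linker
```

A linker whose length is not a multiple of three shifts the reading frame, so the insert would be translated out of phase.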

In addition, a host cell strain may be chosen which modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be important for the function of the protein. Different host cells have characteristic and specific mechanisms for the post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells which possess the cellular machinery for proper processing of the primary transcript may be used. Such mammalian host cells include but are not limited to CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, and U937 cells.

For long-term, high-yield production of recombinant proteins, stable expression is preferred. For example, cell lines which stably express the BRCA1 Modulator sequences described above may be engineered. Rather than using expression vectors which contain viral origins of replication, host cells can be transformed with DNA controlled by appropriate expression control elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.), and a selectable marker. Following the introduction of the foreign DNA, engineered cells may be allowed to grow for 1-2 days in an enriched media, and then are switched to a selective media. The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form colonies which in turn can be cloned and expanded into cell lines. This method may advantageously be used to engineer cell lines which express the BRCA1 Modulator gene product. Such engineered cell lines may be particularly useful in screening and evaluation of compounds that affect the endogenous activity of the BRCA1 Modulator gene product.

A number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler et al., 1977, Cell 11:223), hypoxanthine-guanine phosphoribosyltransferase (Szybalska & Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026), and adenine phosphoribosyltransferase (Lowy et al., 1980, Cell 22:817) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for the following genes: dhfr, which confers resistance to methotrexate (Wigler et al., 1980, Proc. Natl. Acad. Sci. USA 77:3567; O'Hare et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt, which confers resistance to mycophenolic acid (Mulligan & Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin et al., 1981, J. Mol. Biol. 150:1); and hygro, which confers resistance to hygromycin (Santerre et al., 1984, Gene 30:147).

The BRCA1 Modulator gene products can also be expressed in transgenic animals. Animals of any species, including, but not limited to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs, goats, and non-human primates, e.g., baboons, monkeys, and chimpanzees, may be used to generate BRCA1 Modulator transgenic animals.

Any technique known in the art may be used to introduce the BRCA1 Modulator transgene into animals to produce the founder lines of transgenic animals. Such techniques include, but are not limited to pronuclear microinjection (Hoppe, P.C. and Wagner, 1989, U.S. Pat. No. 4,873,191); retrovirus mediated gene transfer into germ lines (Van der Putten et al., 1985, Proc. Natl. Acad. Sci., USA 82:6148-6152); gene targeting in embryonic stem cells (Thompson et al., 1989, Cell 56:313-321); electroporation of embryos (Lo, 1983, Mol Cell. Biol. 3:1803-1814); and sperm-mediated gene transfer (Lavitrano et al., 1989, Cell 57:717-723); etc. For a review of such techniques, see Gordon, 1989, Transgenic Animals, Intl. Rev. Cytol. 115:171-229, which is incorporated by reference herein in its entirety.

The present invention provides for transgenic animals that carry the BRCA1 Modulator transgene in all their cells, as well as animals which carry the transgene in some, but not all their cells, i.e., mosaic animals. The transgene may be integrated as a single transgene or in concatamers, e.g., head-to-head tandems or head-to-tail tandems. The transgene may also be selectively introduced into and activated in a particular cell type by following, for example, the teaching of Lasko et al. (Lasko, M. et al., 1992, Proc. Natl. Acad. Sci. USA 89: 6232-6236). The regulatory sequences required for such a cell-type specific activation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art. When it is desired that the BRCA1 Modulator transgene be integrated into the chromosomal site of the endogenous BRCA1 Modulator gene, gene targeting is preferred. Briefly, when such a technique is to be utilized, vectors containing some nucleotide sequences homologous to the endogenous BRCA1 Modulator gene are designed for the purpose of integrating, via homologous recombination with chromosomal sequences, into and disrupting the function of the nucleotide sequence of the endogenous BRCA1 Modulator gene. In this way, the expression of the endogenous BRCA1 Modulator gene may also be eliminated by inserting non-functional sequences into the endogenous gene. The transgene may also be selectively introduced into a particular cell type, thus inactivating the endogenous BRCA1 Modulator gene in only that cell type, by following, for example, the teaching of Gu et al. (Gu et al., 1994, Science 265: 103-106). The regulatory sequences required for such a cell-type specific inactivation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art.

Once transgenic animals have been generated, the expression of the recombinant BRCA1 Modulator gene may be assayed utilizing standard techniques. Initial screening may be accomplished by Southern blot analysis or PCR techniques to analyze animal tissues to assay whether integration of the transgene has taken place. The level of mRNA expression of the transgene in the tissues of the transgenic animals may also be assessed using techniques which include but are not limited to Northern blot analysis of cell type samples obtained from the animal, in situ hybridization analysis, and RT-PCR. Samples of BRCA1 Modulator gene-expressing tissue may also be evaluated immunocytochemically using antibodies specific for the BRCA1 Modulator transgene product, as described below.

Antibodies to BRCA1 Modulator Proteins Antibodies that specifically recognize one or more epitopes of a BRCA1 Modulator Protein, or epitopes of conserved variants, or peptide fragments are also encompassed by the invention. Such antibodies include but are not limited to polyclonal antibodies, monoclonal antibodies (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above.

The antibodies of the invention may be used, for example, in the detection of the BRCA1 Modulator Protein in a biological sample and may, therefore, be utilized as part of a diagnostic or prognostic technique whereby patients may be tested for abnormal amounts of these proteins. Such antibodies may also be utilized in conjunction with, for example, compound screening schemes, as described herein for the evaluation of the effect of test compounds on expression and/or activity of the BRCA1 Modulator Protein. Additionally, such antibodies can be used in conjunction with the gene therapy techniques described herein, to, for example, evaluate the normal and/or engineered BRCA1 Modulator Protein expressing cells prior to their introduction into the patient.

Such antibodies may additionally be used as a method for the inhibition of abnormal BRCA1 Modulator Protein activity.

For the production of antibodies, various host animals may be immunized by injection with the BRCA1 Modulator Protein, a BRCA1 Modulator Protein peptide, truncated BRCA1 Modulator Protein polypeptides, functional equivalents of the BRCA1 Modulator Protein or mutants of the BRCA1 Modulator Protein. Such host animals may include but are not limited to rabbits, mice, and rats, to name but a few.

Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum. Polyclonal antibodies are heterogeneous populations of antibody molecules derived from the sera of the immunized animals.

Monoclonal antibodies, which are homogeneous populations of antibodies to a particular antigen, may be obtained by any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique of Kohler and Milstein (1975, Nature 256:495-497; and U.S. Patent No. 4,376,110), the human B-cell hybridoma technique (Kosbor et al., 1983, Immunology Today 4:72; Cole et al., 1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and the EBV-hybridoma technique (Cole et al., 1985, Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Such antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass thereof. The hybridoma producing the mAb of this invention may be cultivated in vitro or in vivo. Production of high titers of mAbs in vivo makes this the presently preferred method of production.

In addition, techniques developed for the production of "chimeric antibodies" (Morrison et al., 1984, Proc. Natl. Acad. Sci. USA, 81:6851-6855; Neuberger et al., 1984, Nature, 312:604-608; Takeda et al., 1985, Nature, 314:452-454) by splicing the genes from a mouse antibody molecule of appropriate antigen specificity together with genes from a human antibody molecule of appropriate biological activity can be used. A chimeric antibody is a molecule in which different portions are derived from different animal species, such as those having a variable region derived from a mAb and a human immunoglobulin constant region.

Alternatively, techniques described for the production of single chain antibodies (U.S. Patent 4,946,778; Bird, 1988, Science 242:423-426; Huston et al., 1988, Proc. Natl. Acad. Sci. USA 85:5879-5883; and Ward et al., 1989, Nature 334:544-546) can be adapted to produce single chain antibodies against BRCA1 Modulator Protein gene products.

Single chain antibodies are formed by linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting in a single chain polypeptide.

Antibody fragments which recognize specific epitopes may be generated by known techniques. For example, such fragments include but are not limited to: the F(ab')2 fragments, which can be produced by pepsin digestion of the antibody molecule, and the Fab fragments, which can be generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed (Huse et al., 1989, Science, 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity.

Antibodies to the BRCA1 Modulator Protein can, in turn, be utilized to generate anti-idiotype antibodies that "mimic" the BRCA1 Modulator Protein using techniques well known to those skilled in the art. (See, Greenspan & Bona, 1993, FASEB J 7(5):437-444; and Nisonoff, 1991, J. Immunol. 147(8):2429-2438).

Identification of Compounds that Increase BRCA1 Levels using BRCA1 Modulators

The BRCA1 gene encodes a protein that has been shown to have tumor suppressor activity. See, Holt, J. et al., (1996) Nat. Genet. vol. 12, pages 298-302. Such studies have shown that certain cancer cells have low levels of BRCA1, and that increasing the levels causes a reversion to the normal cell phenotype. Thus, compounds that increase BRCA1 levels will have significant therapeutic use for the treatment of cancer.

An aspect of the instant invention is the description of an assay using BRCA1 and BRCA1 Modulators that facilitates the identification of compounds that increase intracellular levels of BRCA1. One format of the assay is shown in schematic form in Figure 4. Briefly, the assay makes use of two events: firstly, BRCA1 is known to be a general transcriptional activator, and secondly, BRCA1 Modulators bind to BRCA1.

The assay makes use of certain features of the two-hybrid assay described above. Two plasmids are constructed and transfected into a suitable cell line, preferably a breast or ovarian cell line. A preferred breast cell line would be MCF-7. One plasmid contains the nucleotide sequence recognized by GAL4 operably linked to an activator sequence, and a reporter gene downstream of this sequence. An example of a preferred reporter gene is the gene that encodes luciferase. The second plasmid encodes and expresses the GAL4 DNA binding domain fused to a BRCA1 Modulator. The preferred Modulator is 091-21A31, Sequence ID No. 1.

The GAL4 DNA binding domain-BRCA1 Modulator fusion protein binds to the GAL4 recognition sequence on the first plasmid and, in turn, recruits any BRCA1 present to form a complex consisting of the GAL4 DNA binding domain-BRCA1 Modulator fusion and BRCA1. As part of the complex, BRCA1 is in proximity to the activator sequence, which in turn initiates transcription of the reporter gene. Thus, compounds can be tested for their capacity to stimulate the production of BRCA1. Those that do will cause an increase in the reporter gene product. The above assay is schematically presented in Figure 4.
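As a minimal sketch of how readings from such a reporter assay might be compared, the following assumes luciferase light output has been measured for compound-treated and untreated cells, together with a background reading from cells lacking the reporter. The function name, the background correction, and the example readings are illustrative assumptions and are not taken from the specification.

```python
def fold_induction(treated, untreated, background=0.0):
    """Reporter signal of treated cells relative to untreated cells.

    All three values are raw reporter readings in the same arbitrary units.
    The background reading is subtracted from both before taking the ratio.
    """
    treated_net = treated - background
    untreated_net = untreated - background
    if untreated_net <= 0:
        raise ValueError("untreated signal must exceed background")
    return treated_net / untreated_net


# Hypothetical readings (arbitrary light units), for illustration only.
ratio = fold_induction(treated=5200.0, untreated=1200.0, background=200.0)
print(f"fold induction: {ratio:.1f}")
```

A ratio above 1 would be consistent with a compound that raises reporter output; what ratio counts as a meaningful increase is a choice the experimenter would have to make.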

Identification of Compounds that alter BRCA1 Interaction with BRCA1 Modulators

As mentioned above, BRCA1 is a known tumor suppressor. See, Holt, J. et al., (1996) Nat. Genet. vol. 12, pages 298-302. Thus compounds that affect the normal interaction of BRCA1 with BRCA1 Modulator Proteins may affect the tumor suppressor activity of BRCA1. The extent of the effect will, in large part, depend on the chemical properties of the compounds tested. Some may strongly disrupt the interaction of BRCA1 with BRCA1 Modulator Proteins, while others would have a minimal effect. The former would be reflected in a biological assay for altered tumorigenicity, while the latter would not. The converse is also true: certain compounds may strengthen the interaction of BRCA1 with BRCA1 Modulator Proteins, in which case the opposite biological effect would be anticipated. Thus, it is highly desirable to assay for compounds that affect BRCA1 interactions with BRCA1 Modulator Protein.

The basic principle of the assay systems used to identify compounds that affect BRCA1 interactions with BRCA1 Modulator Proteins involves preparing a reaction mixture containing BRCA1 protein, polypeptide, peptide or fusion protein as described above, and a BRCA1 Modulator Protein under conditions and for a time sufficient to allow the two to interact and bind, thus forming a complex. In order to test a compound for inhibitory activity, the reaction mixture is prepared in the presence and absence of the test compound. The test compound may be initially included in the reaction mixture, or may be added at a time subsequent to the addition of the BRCA1 moiety and its BRCA1 Modulator Protein. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the BRCA1 moiety and the BRCA1 Modulator Protein is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the BRCA1 and the interactive BRCA1 Modulator Protein. Additionally, complex formation within reaction mixtures containing the test compound and normal BRCA1 protein may also be compared to complex formation within reaction mixtures containing the test compound and a mutant BRCA1. This comparison may be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal BRCA1.

The assay for compounds that interfere with the interaction of the BRCA1 and BRCA1 Modulator Proteins can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the BRCA1 moiety or the BRCA1 Modulator Protein onto a solid phase and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction by competition can be identified by conducting the reaction in the presence of the test substance, i.e., by adding the test substance to the reaction mixture prior to or simultaneously with the BRCA1 moiety and interactive BRCA1 Modulator Protein. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed.

Representative formats are described briefly below.

In a heterogeneous assay system, either the BRCA1 moiety or the interactive BRCA1 Modulator Protein is anchored onto a solid surface, while the non-anchored species is labeled, either directly or indirectly. In practice, microtiter plates are conveniently utilized. The anchored species may be immobilized by non-covalent or covalent attachments. Non-covalent attachment may be accomplished simply by coating the solid surface with a solution of BRCA1 or BRCA1 Modulator Protein and drying. Alternatively, an immobilized antibody specific for the species to be anchored may be used to anchor the species to the solid surface. The surfaces may be prepared in advance and stored.

In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface, e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, may be directly labeled or indirectly labeled with a labeled anti-Ig antibody).

Depending upon the order of addition of reaction components, test compounds which inhibit complex formation or which disrupt preformed complexes can be detected.

Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected, e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds which inhibit complex formation or which disrupt preformed complexes can be identified.

In an alternate embodiment of the invention, a homogeneous assay can be used. In this approach, a preformed complex of the BRCA1 moiety and the interactive BRCA1 Modulator Protein is prepared in which either the BRCA1 or its BRCA1 Modulator Protein is labeled, but the signal generated by the label is quenched due to formation of the complex (see, e.g., U.S. Patent No. 4,109,496 by Rubenstein, which utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances which disrupt BRCA1/intracellular BRCA1 Modulator Protein interaction can be identified.

In a particular embodiment, a BRCA1 fusion protein can be prepared for immobilization. For example, BRCA1 or a peptide fragment can be fused to a glutathione-S-transferase (GST) gene using a fusion vector, such as pGEX-5X-1, in such a manner that its binding activity is maintained in the resulting fusion protein. The interactive BRCA1 Modulator Protein can be purified and used to raise a monoclonal antibody, using methods routinely practiced in the art and described above. This antibody can be labeled with the radioactive isotope 125I, for example, by methods routinely practiced in the art. In a heterogeneous assay, the GST-BRCA1 fusion protein can be anchored to glutathione-agarose beads. The interactive BRCA1 Modulator Protein can then be added in the presence or absence of the test compound in a manner that allows interaction and binding to occur. At the end of the reaction period, unbound material can be washed away, and the labeled monoclonal antibody can be added to the system and allowed to bind to the complexed components. The interaction between the BRCA1 protein and the interactive BRCA1 Modulator Protein can be detected by measuring the amount of radioactivity that remains associated with the glutathione-agarose beads. A successful inhibition of the interaction by the test compound will result in a decrease in measured radioactivity.
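The decrease in bead-associated radioactivity described above can be expressed as a percent inhibition relative to the control reaction. The following is a minimal sketch under stated assumptions: the counts are given in counts per minute, a background count from beads without BRCA1 Modulator Protein is subtracted, and all names and numbers are hypothetical rather than drawn from the specification.

```python
def percent_inhibition(cpm_with_compound, cpm_control, cpm_background=0.0):
    """Percent reduction in bead-associated counts caused by a test compound.

    cpm_control is the reaction without test compound; cpm_background is a
    reading with no specific complex present (an assumed extra control).
    """
    signal = cpm_with_compound - cpm_background
    control = cpm_control - cpm_background
    if control <= 0:
        raise ValueError("control counts must exceed background")
    return 100.0 * (1.0 - signal / control)


# Hypothetical counts per minute, for illustration only.
print(percent_inhibition(cpm_with_compound=5500.0,
                         cpm_control=10000.0,
                         cpm_background=1000.0))
```

A negative result would indicate more bound label in the presence of the compound than in the control, which may correspond to a compound that strengthens the interaction.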

Alternatively, the GST-BRCA1 fusion protein and the interactive BRCA1 Modulator Protein can be mixed together in liquid in the absence of the solid glutathione-agarose beads. The test compound can be added either during or after the species are allowed to interact. This mixture can then be added to the glutathione-agarose beads and unbound material is washed away. Again the extent of inhibition of the BRCA1/BRCA1 Modulator Protein interaction can be detected by adding the labeled antibody and measuring the radioactivity associated with the beads.

In another embodiment of the invention, these same techniques can be employed using peptide fragments that correspond to the binding domains of the BRCA1 and/or the interactive BRCA1 Modulator in place of one or both of the full length proteins.

Any number of methods routinely practiced in the art can be used to identify and isolate the binding domains. Such domains are discussed more fully in the examples, below. These methods include, but are not limited to, mutagenesis of the gene encoding one of the proteins and screening for disruption of binding in a co-immunoprecipitation assay. Compensating mutations in the gene encoding the second species in the complex can then be selected. Sequence analysis of the genes encoding the respective proteins will reveal the mutations that correspond to the region of the protein involved in interactive binding. The two hybrid assay may also be used, as discussed more fully in the examples below. For instance, once the gene coding for the intracellular BRCA1 Modulator Protein is obtained, short gene segments can be engineered to express peptide fragments of the protein, which can then be tested for binding activity and purified or synthesized.

Effective Dose

Toxicity and therapeutic efficacy of compounds identified above that affect the interaction of BRCA1 with BRCA1 Modulator Proteins, and thus affect cell growth, can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). Numerous model systems are known to the skilled practitioner of the art that can be employed to test the cell growth properties of the instant compounds, including growth of cells in soft agar, and effect on tumors in vivo. Such experiments can be conducted on cells cotransfected with BRCA1 and BRCA1 Modulators. The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit large therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to unaffected cells and, thereby, reduce side effects.
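The therapeutic index defined above is a simple ratio, sketched below. The function name and the example doses are hypothetical; the only assumption is that both doses are expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Ratio of the dose lethal to 50% of the population (LD50) to the dose
    therapeutically effective in 50% of the population (ED50).

    Both doses must be positive and in the same units (e.g. mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
print(therapeutic_index(ld50=200.0, ed50=10.0))
```

A larger value means a wider margin between the effective and the lethal dose, which is why compounds with large indices are preferred.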

The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
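One simple way an IC50 might be estimated from cell culture data is by interpolating, on a logarithmic concentration scale, between the two tested concentrations that bracket 50% inhibition. This is a sketch of that one approach only; the specification does not prescribe a method, curve fitting is a common alternative, and the data shown are hypothetical.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving 50% inhibition.

    concentrations: ascending, positive test concentrations (same units).
    inhibitions: percent inhibition (0-100) observed at each concentration.
    Returns None if no adjacent pair of points brackets 50%.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            # Linear interpolation in log10(concentration).
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response data (concentration in micromolar).
print(estimate_ic50([0.1, 1.0, 10.0, 100.0], [5.0, 20.0, 80.0, 95.0]))
```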

The Examples which follow are illustrative of specific embodiments of the invention, and various uses thereof. They are set forth for explanatory purposes only, and are not to be taken as limiting the invention.

Example 1 Identification of cDNAs that Encode BRCA1 Modulator Proteins

BRCA1 modulators were identified initially using the yeast two hybrid assay system described in U.S. Patent No. 5,283,173, or Chien et al., 1991, Proc. Natl. Acad. Sci. USA, 88:9578-9582. The assay components are also commercially available from Clontech (Palo Alto, CA).

The cDNA encoding human BRCA1 (See, Miki et al., Science, vol. 266: 66-71; and PCT/US95/10202) was digested with MvnI-NheI and the fragment representing BRCA1 amino acids 8-1293 was fused to the GAL4 binding domain in the SmaI-NheI sites of the pGBT8 plasmid, which is the pMA424 plasmid of Chien et al. as described in Proc. Natl. Acad. Sci. vol. 88: pages 9578-9582 (1991), modified by the insertion of the sequence 5'-CCGGGGATCCCCATGGCTAGCCATATG-3' between the EcoRI and SalI unique sites.
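The cloning sites carried by the inserted linker can be checked by searching it for restriction enzyme recognition sequences. The sketch below uses the linker sequence given above and the standard recognition sequences for BamHI, NcoI, NheI, NdeI and SmaI; the choice of enzymes to test and the function name are assumptions made for illustration. Because all five recognition sequences are palindromic, searching one strand is sufficient.

```python
LINKER = "CCGGGGATCCCCATGGCTAGCCATATG"

# Standard recognition sequences (5' to 3'); all are palindromic.
SITES = {
    "BamHI": "GGATCC",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "NdeI": "CATATG",
    "SmaI": "CCCGGG",
}


def find_sites(seq, sites):
    """Map each enzyme name to the 0-based start positions of its site in seq."""
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        hits[name] = positions
    return hits


for enzyme, positions in find_sites(LINKER, SITES).items():
    print(enzyme, positions)
```

The linker as written contains BamHI, NcoI, NheI and NdeI sites but not a complete SmaI site. A SmaI site could be completed by the vector sequence adjoining the 5' end of the linker, but that cannot be confirmed from the text alone.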

This was transformed into the yeast strain YGH1, and the YGH1 strain carrying the plasmid GAL4-BRCA1 (8-1293) was evaluated for its intrinsic ability to activate the two reporters: growth in histidine minus media and β-galactosidase activity. The YGH1 strain carrying the plasmid GAL4-BRCA1 (8-1293) was able to grow on minus histidine plates, but this was controlled by the addition of 7.5 mM 3-amino-1,2,4-triazole (3AT) to the minus histidine plates, and the strain had no detectable β-galactosidase activity. The YGH1 strain carrying the plasmid GAL4-BRCA1 (8-1293) was subsequently transformed with a HeLa cell cDNA library fused to the GAL4 activation domain in the pGAD plasmid (Chien et al., Proc. Natl. Acad. Sci. vol. 88: pages 9578-9582 (1991)). When a cDNA encodes a protein that interacts with the BRCA1 protein (amino acids 8-1293), the YGH1 strain is expected to grow on histidine-free medium supplemented with 7.5 mM 3AT and to produce β-galactosidase.

Four of the 2.5 × 10^6 transformants screened grew in the absence of histidine supplemented with 7.5 mM 3AT and had β-galactosidase activity. The plasmids recovered from these 4 yeast strains were used to re-transform the original YGH1 GAL4-BRCA1 (8-1293) strain. All the plasmids conferred the ability to grow in the absence of histidine supplemented with 7.5 mM 3AT and to produce β-galactosidase.
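The screen's yield can be put in perspective with a short calculation using the two numbers reported above, four positives among 2.5 million transformants. The function name is hypothetical.

```python
def hit_frequency(positives, screened):
    """Fraction of screened transformants that scored positive."""
    if screened <= 0:
        raise ValueError("number screened must be positive")
    return positives / screened


freq = hit_frequency(positives=4, screened=2.5e6)
print(f"{freq:.1e} positives per transformant "
      f"(about {freq * 1e6:.1f} per million)")
```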

Upon subsequent screening, three of the four were found to have cDNAs that encode Modulator Proteins that clearly bound to BRCA1. One of the plasmids contained the novel cDNA encoding the BRCA1 Modulator Protein hereinafter termed 091-21A31, Sequence ID No. 1. The nucleotide and protein sequences are shown in Figure 1. The calculated molecular weight is about 53 kD, and it has an estimated pI of 9.05. Particularly noteworthy is the presence of a zinc finger domain and a leucine zipper motif.
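The specification does not state how the leucine zipper motif was identified. One commonly used heuristic is to scan a protein sequence for four leucines spaced exactly seven residues apart, i.e. the pattern L-x(6)-L-x(6)-L-x(6)-L. The sketch below applies that heuristic to a made-up test sequence; it is not the Figure 1 sequence, and a match is only suggestive of a zipper.

```python
import re

# Four leucines at seven-residue spacing. The lookahead lets overlapping
# occurrences be reported as separate start positions.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(protein):
    """Return 0-based start positions of L-x(6)-L-x(6)-L-x(6)-L matches."""
    return [match.start() for match in ZIPPER.finditer(protein.upper())]


# Hypothetical sequence, for illustration only.
example = "MA" + "LAAAAAA" * 3 + "LGG"
print(find_leucine_zippers(example))
```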

The nucleotide sequence of the second cDNA and the amino acid sequence that it encodes, hereinafter termed 091-1F84, Sequence ID No. 3, are shown in Figure 2. Note that this clone displays two leucine zipper domains. The protein has a calculated molecular weight of 96,443.3 and an estimated pI of 4.95.

The nucleotide sequence of the third cDNA and the amino acid sequence that it encodes, hereinafter termed 091-132Q20, Sequence ID No. 5, are shown in Figure 3. Note that this clone also displays a leucine zipper domain. The protein has a calculated molecular weight of 45,904.9 and an estimated pI of 6.73.
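A calculated molecular weight of the kind quoted for each clone can be obtained by summing average residue masses over the amino acid sequence and adding one water molecule. The sketch below shows that calculation; the method actually used for the figures above is not stated in the specification. Because the full protein sequences appear only in the Figures, the example uses the six-residue Glu-Glu peptide EYMPME that the description names in Example 4.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the free N- and C-termini


def molecular_weight(protein):
    """Average molecular weight (Da) of an unmodified linear peptide."""
    total = WATER
    for residue in protein.upper():
        if residue not in RESIDUE_MASS:
            raise ValueError(f"unknown residue: {residue!r}")
        total += RESIDUE_MASS[residue]
    return total


print(f"{molecular_weight('EYMPME'):.2f} Da")
```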

Example 2 Binding of BRCA1 Domains to BRCA1 Modulators

Experiments were conducted to ascertain which regions of BRCA1 interact with the three BRCA1 Modulators described in Example 1. The experiment was conducted using the two-hybrid assay as described in U.S. Patent No. 5,283,173, or Chien et al., 1991, Proc. Natl. Acad. Sci. USA, 88:9578-9582. The cDNAs that encode 091-1F84, Sequence ID No. 3, 091-21A31, Sequence ID No. 1, and 091-132Q20, Sequence ID No. 5, were fused to the GAL4 activation domain, and those regions of BRCA1 shown in Table 1, which contain BRCA1 amino acids 1-300, 1-600, or 8-1293, were fused to the binding domain of GAL4. Controls consisted of the vector, or bcl-2 fused to the GAL4 binding domain (See, U.S. Patent 5,539,085).

TABLE 1
INTERACTION OF BRCA1 WITH TWO HYBRID HITS

GAL4BD \ GAL4AD      091-1F84      091-21A31      091-132Q20
BRCA1 (1-300)
BRCA1 (1-600)
BRCA1 (8-1293)
vector or BCL2

[Table entries are not legible in the source text.]

The BRCA1 constructs employed in the above studies were generated using restriction fragments of BRCA1, and cloning them into the plasmid pGBT8, which is a derivative of the plasmid pMA424, as described by Chien et al. in Proc. Natl. Acad. Sci. vol. 88: pages 9578-9582 (1991), modified by the insertion of the sequence 5'-CCGGGGATCCCCATGGCTAGCCATATG-3' between the EcoRI and SalI unique sites.

Briefly, the construct containing the first 300 amino acids of BRCA1 was generated by subcloning the NcoI-EcoRI blunted BRCA1 fragment into the blunted EcoRI site of pGBT8. The BRCA1 construct containing amino acids 8-1293 was generated as described above. Lastly, the BRCA1 construct containing amino acids 1-600 was generated by subcloning the NcoI-SpeI BRCA1 fragment into the NcoI-NheI site of pGBT8.

Table 1 shows those regions of BRCA1 that interact with the proteins encoded by 091-1F84, Sequence ID No. 3, 091-21A31, Sequence ID No. 1, and 091-132Q20, Sequence ID No. 5. The number of "+" signs is a subjective measure of the amount of β-galactosidase activity, one being the lowest and three being the highest activity. It is apparent from Table 1 that the first 300 amino acids of BRCA1 do not bind to any of the three BRCA1 Modulators, but that all three BRCA1 Modulators bind to the BRCA1 construct containing the first 600 amino acids of BRCA1. None of the BRCA1 Modulators bind to the vector or bcl-2 controls, while all the BRCA1 Modulators bound to the near full length BRCA1 construct which has amino acids 8-1293.

The results show that the three BRCA1 Modulators preferentially bind to the first 600 amino acids of BRCA1.

Example 3 Identification of Interacting Domains of 091-21A31, Sequence ID No. 1 and BRCA1

Two hybrid experiments were conducted to ascertain the regions of the BRCA1 Modulator 091-21A31, Sequence ID No. 1 that interact with BRCA1. The assay was run essentially as described in Example 1. Transformation and growth of yeast cultures were performed essentially as described in U.S. Patent No. 5,283,173; Chien et al., 1991, Proc. Natl. Acad. Sci. USA, 88:9578-9582; or Spaargaren et al., (1994) Biochem. J. 300, 303-307.

Briefly, the YGH1 yeast strain was co-transformed with cDNA encoding 091-21A31, Sequence ID No. 1, or cDNA encoding 091-21A31, Sequence ID No. 1 fragments containing amino acids 78-469, 1-300, or 300-469 fused to the GAL4 activation domain. As a control, bcl-2 cDNA (See, U.S. Patent 5,539,085) was fused to the GAL4 activation domain. cDNAs encoding BRCA1 fragments having amino acids 1-300, 1-600, or 8-1293 were fused to the GAL4 binding domain as described in Example 2.

The 091-21A31, Sequence ID No. 1 constructs were generated using the plasmids pGADGH or pGAD424; both are available from Clontech.

The 091-21A31, Sequence ID No. 1 construct containing amino acids 78-469 was generated by subcloning the EcoRI-XhoI 091-21A31, Sequence ID No. 1 fragment into the EcoRI-SalI site of pGAD424.

The 091-21A31, Sequence ID No. 1 construct containing amino acids 1-300 was generated by subcloning the BamHI-SalI 091-21A31, Sequence ID No. 1 fragment into the BamHI-SalI site of pGADGH.

The 091-21A31, Sequence ID No. 1 construct containing amino acids 300-469 was generated by subcloning the BamHI blunted-SalI 091-21A31, Sequence ID No. 1 fragment into the SalI blunted-XhoI site of pGADGH.

Table 2 shows the results of the co-transformation studies. It is apparent that the first 300 amino acids of BRCA1 do not bind to any of the three 091-21A31, Sequence ID No. 1 fragment constructs, nor to 091-21A31, Sequence ID No. 1 itself. The BRCA1 construct containing amino acids 1-600 does bind to 091-21A31, Sequence ID No. 1, and to the construct containing 091-21A31, Sequence ID No. 1 amino acids 78-469, but not to the 091-21A31, Sequence ID No. 1 constructs containing amino acids 1-300 or 300-469. The BRCA1 construct having amino acids 8-1293 also binds 091-21A31, Sequence ID No. 1, and the 78-469 and 1-300 amino acid constructs, but not the 091-21A31, Sequence ID No. 1 construct having amino acids 300-469.
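The qualitative outcome described in this paragraph can be tabulated as a small lookup structure, which makes it easy to list the constructs that bind each BRCA1 fragment. The sketch below encodes only the binds / does not bind statements made in the paragraph above; it does not reproduce the graded entries of Table 2, and the variable and function names are illustrative.

```python
# Rows: BRCA1 fragment fused to the GAL4 binding domain.
# Columns: 091-21A31 construct fused to the GAL4 activation domain.
# True means binding was reported in the text, False means it was not.
BINDING = {
    "BRCA1 (1-300)": {
        "full length": False, "78-469": False, "1-300": False, "300-469": False,
    },
    "BRCA1 (1-600)": {
        "full length": True, "78-469": True, "1-300": False, "300-469": False,
    },
    "BRCA1 (8-1293)": {
        "full length": True, "78-469": True, "1-300": True, "300-469": False,
    },
}


def binders(brca1_fragment):
    """List the 091-21A31 constructs reported to bind a BRCA1 fragment."""
    return [name for name, bound in BINDING[brca1_fragment].items() if bound]


for fragment in BINDING:
    print(fragment, "->", binders(fragment) or "none")
```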

TABLE 2
INTERACTION OF BRCA1 WITH 091-21A31

GAL4BD \ GAL4AD      091-21A31      091-21A31 (78-469)      091-21A31 (1-300)      091-21A31 (300-469)
BRCA1 (1-300)
BRCA1 (1-600)
BRCA1 (8-1293)
vector or BCL2

[Table entries are not legible in the source text.]

Example 4 Expression and Purification of BRCA1 Modulators

The BRCA1 Modulators were expressed in and purified from baculovirus-infected Sf9 cells. Methods for producing baculovirus, as well as growing Sf9 cells, are well known in the art, and detailed procedures can be found in M. Summers and G. Smith in "A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures," Texas Agricultural Experiment Station, Bulletin No. 1555 (May, 1987), or in EPS 127,839 to G. E. Smith and M. D. Summers.

The following constructs were generated using pAcC13 (See, Rubinfeld et al., Cell 65, 1033-1042 (1991)) or pAcOG, a derivative of pAcC13 in which the polylinker was replaced with a synthetic linker engineered to encode an initiating methionine, the Glu-Glu (See, Grussenmeyer et al., Proc. Natl. Acad. Sci. U.S.A. 82, 7952 (1985)) epitope tag, and a multiple cloning site containing several stop codons (See, Rubinfeld et al., J. Biol. Chem. 270, 5549-5555 (1995)).

The construct containing 091-21A31, Sequence ID No. 1 was generated by subcloning the KpnI-XbaI 091-21A31, Sequence ID No. 1 fragment into pAcC13 at the KpnI-XbaI site.

The construct containing 091-1F84, Sequence ID No. 3 was generated by subcloning the NcoI-XbaI 091-1F84, Sequence ID No. 3 fragment into pAcOG1 at the NcoI-XbaI sites.

The construct containing 091-132Q20, Sequence ID No. 5 was generated by subcloning the KpnI-XbaI fragment of 091-132Q20, Sequence ID No. 5 into pAcC13 at the KpnI-XbaI site.

Baculovirus containing the appropriate BRCA1 Modulator was produced by transfecting the above described plasmids into Sf9 cells, and isolating the corresponding baculovirus using essentially the methods described in Pharmingen's cat. no. 21100D, BaculoGold™ Baculovirus DNA. Virus was isolated from individual plaques, and used to infect Sf9 cells. The cells were grown for 4 days, isolated by centrifugation, and cell extracts made by solubilizing the cell pellet. Briefly, recombinant Sf9 cells were pelleted, lysed in 5 volumes of [20 mM Tris (pH 8.0), 1 mM EDTA, 10 µg/ml each of leupeptin, pepstatin, and pefabloc, 1 mM aprotinin and 1 mM DTT] and incubated on ice for 10 minutes. NaCl was then added to a final concentration of 150 mM, and the lysate was incubated at room temperature for 10 minutes and centrifuged. The resulting supernatant was loaded onto a 1-ml affinity column containing a mouse Glu-Glu monoclonal antibody covalently cross-linked to protein G-Sepharose. See, Grussenmeyer et al., Proc. Natl. Acad. Sci. U.S.A. vol. 82, pp. 7952-7954 (1985).
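Bringing a lysate to a final NaCl concentration of 150 mM requires adding a calculated volume of concentrated stock. The sketch below solves the mixing equation for that volume. The 5 M stock concentration and the 10 ml lysate volume are assumptions chosen for illustration, as the specification gives neither; the lysate is taken to contain no NaCl initially because none is listed in the lysis buffer.

```python
def stock_volume_to_add(sample_volume_ml, target_mM, stock_mM, current_mM=0.0):
    """Volume of stock (ml) to add so the mixture reaches target_mM.

    Solves (V * current + v * stock) / (V + v) = target for v, which
    accounts for the dilution caused by the added stock itself.
    """
    if not (current_mM <= target_mM < stock_mM):
        raise ValueError("require current <= target < stock concentration")
    return sample_volume_ml * (target_mM - current_mM) / (stock_mM - target_mM)


# Hypothetical: 10 ml of NaCl-free lysate, 5 M (5000 mM) NaCl stock.
volume = stock_volume_to_add(sample_volume_ml=10.0, target_mM=150.0, stock_mM=5000.0)
print(f"add {volume:.3f} ml of stock")
```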

The column was washed with 10-15 ml of lysis buffer and eluted with 100 µg of Glu-Glu peptide (EYMPME) per ml in the same buffer. Fractions were collected and analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE); the peak fractions were pooled and, based on purity, subjected to further purification on HPLC columns which include Resource Q, Resource S and Resource Eth (Pharmacia). For purification of insoluble proteins, in particular 091-21A31, Sequence ID No. 1, recombinant Sf9 cells were pelleted, lysed in 5 volumes of [20 mM Tris (pH 8.0), 137 mM NaCl, 1 mM EGTA, 1.5 mM MgCl2, SDS [concentration illegible in source], 10 µg/ml each of leupeptin, pepstatin, and pefabloc, 1 mM aprotinin and 1 mM DTT], incubated at room temperature for 20-30 minutes and ultracentrifuged. The upper phase was removed, NaCl was adjusted to 400 mM and the sample was recentrifuged. The clarified supernatant was then diluted 1:10 in 1 X TG buffer [20 mM Tris (pH 8.0), 137 mM NaCl, 1 mM EGTA, 1.5 mM MgCl2, 1% Triton X-100, glycerol, 10 µg/ml each of leupeptin, pepstatin, and pefabloc, 1 mM aprotinin and 1 mM DTT], filtered through a 3 µm Gelman Versapore filter and loaded onto a 1-ml anti-Glu-Glu affinity column. See, Rubinfeld et al., Mol. Cell. Biol. 12, 4634-4642 (1992). The column was washed with 10-15 ml of 1 X TG buffer with 400 mM NaCl and eluted in 1 X TG buffer with 1% SDS and 100 µg/ml Glu-Glu peptide. Fractions were analyzed by SDS-PAGE.

Example 5 Confirmation of BRCA1 Modulator Protein Binding to BRCA1

To confirm the results of the two-hybrid assays described in Example 1 and further establish the binding of each of the BRCA1 Modulators to BRCA1, two BRCA1 constructs were generated and tested for BRCA1 Modulator binding. The BRCA1 constructs were Glu-Glu tagged BRCA1 5' (1-1293), and BRCA1 3' (1293-1863). The Glu-Glu epitope tag facilitated immunoaffinity purification as described in the above examples. A control construct consisted of rapGAP. This construct was made as described by Rubinfeld, B. and Polakis, P., "Purification of Baculovirus-Produced Rap1 GTPase-activating Proteins", in: Methods in Enzymology, W.E. Balch, Channing J. Der and Alan Hall, Eds., California: Academic Press, Inc., 255, 31-38. The BRCA1 constructs were generated as follows: pAcO BRCA1 5' (1-1293) was generated by subcloning the NcoI-NheI BRCA1 fragment into pAcO G1S NcoI-NheI sites. pAcO BRCA1 3' (1293-1863) was generated by subcloning the NheI blunted-NotI BRCA1 fragment into pAcO G2 StuI-NotI sites. Using standard methods, the constructs were transfected into Sf9 cells. The BRCA1 constructs were purified using the immunoaffinity purification methods essentially as described in the preceding Examples.

For in vitro transcription/translation of the BRCA1 Modulators, the following constructs were subcloned into PCANmyc, a derivative of pCDNA3 (Invitrogen) in which the polylinker was replaced with a synthetic linker engineered to encode an initiating methionine, the Myc epitope tag (See, Evan, et al. Mol. Cell. Biol. 5, 3610 (1985)), and a multiple cloning site (See, Rubinfeld, et al. Science, 272, 1023-1026 (1996)).

The plasmid containing the BRCA1 Modulator 091-1F84, Sequence ID No. 3, PCAN myc 091-1F84, Sequence ID No. 3, was generated by subcloning the SpeI blunted-XhoI 091-1F84, Sequence ID No. 3 fragment into PCAN myc3 EcoRV-XhoI sites. The plasmid containing the BRCA1 Modulator 091-21A31, Sequence ID No. 1, PCAN myc 091-21A31, Sequence ID No. 1, was generated by subcloning the BamHI-XhoI 091-21A31, Sequence ID No. 1 fragment into PCAN myc3 BamHI-XhoI sites. Lastly, the plasmid containing the BRCA1 Modulator 091-132Q20, Sequence ID No. 5, PCAN myc 091-132Q20, Sequence ID No. 5, was generated by subcloning the EcoRI-XhoI 091-132Q20, Sequence ID No. 5 fragment into PCAN myc3 EcoRI-XhoI sites.

For in vitro binding analysis, the BRCA1 Modulator cDNAs (091-1F84, Sequence ID No. 3, 091-21A31, Sequence ID No. 1, 091-132Q20, Sequence ID No. 5) were transcribed and translated in vitro in the presence of [35S]methionine using the TNT coupled wheat germ cell lysate system (Promega). Next, one to two µg of purified recombinant BRCA1 protein, either Glu-Glu tagged BRCA1 5' (1-1293) or BRCA1 3' (1293-1863), was added to 25µl of precleared lysate along with 10µl of anti-Glu-Glu coupled protein G-Sepharose beads. Following a 2 hour incubation with rocking at 4°C, the beads were washed three times with 1 ml each of ice cold buffer B (20mM Tris pH, 150mM NaCl, 0.5% Nonidet P-40), eluted with 20µl of SDS-PAGE sample buffer and subjected to SDS-PAGE and fluorography.

SDS-PAGE fluorography revealed that all three of the BRCA1 Modulators were affinity precipitated with the construct BRCA1 5' (1-1293) but not with BRCA1 3' (1293-1863). The rapGAP control also did not affinity precipitate any of the three BRCA1 Modulators. Taken together, these results confirm and extend the results of the two-hybrid assay, and establish that the BRCA1 Modulator proteins interact with BRCA1.
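The outcome of the affinity precipitation experiment can be tabulated as a small lookup structure. The sketch below only restates the qualitative result reported above (precipitated or not precipitated) and adds no data. The names `MODULATORS`, `BAITS`, `precipitated` and `binding_baits` are hypothetical and chosen for illustration.

```python
# Modulators and bait proteins named in the binding experiment.
MODULATORS = ("091-1F84", "091-21A31", "091-132Q20")
BAITS = ("BRCA1 5' (1-1293)", "BRCA1 3' (1293-1863)", "rapGAP control")

# Reported result: every modulator co-precipitated only with BRCA1 5' (1-1293).
precipitated = {
    (modulator, bait): bait == "BRCA1 5' (1-1293)"
    for modulator in MODULATORS
    for bait in BAITS
}


def binding_baits(modulator):
    """Return the baits with which the given modulator was affinity precipitated."""
    return [bait for bait in BAITS if precipitated[(modulator, bait)]]


for modulator in MODULATORS:
    print(modulator, "->", binding_baits(modulator))
```

Laid out this way, the table shows that the two negative baits (the BRCA1 3' fragment and rapGAP) serve as specificity controls for the 5' fragment.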

Example 6 Preparation of Antibody to BRCA1 Modulators For antibody production, immunoaffinity purification of BRCA1 Modulators from baculovirus infected Sf9 insect cells was performed with immobilized anti-Glu-Glu antibody specific for the Glu-Glu epitope tag expressed on the recombinant soluble proteins (See, Rubinfeld, et al., Mol. Cell. Bio. 12, 4634-4642 (1992)). Briefly, recombinant Sf9 cells were pelleted, lysed in 5 volumes of [20mM Tris (pH8.0), 1mM EDTA, 10µg/ml each of leupeptin, pepstatin, pefabloc, 1mM aprotinin and 1mM DTT] and incubated on ice for 10 minutes. NaCl was then added to a final concentration of 150mM, and the lysate was incubated at room temperature for 10 minutes and centrifuged. The resulting supernatant was loaded onto a 1-ml affinity column containing the Glu-Glu antibody covalently cross-linked to protein G-Sepharose. The column was washed with 10-15ml of lysis buffer and eluted with 100µg of Glu-Glu peptide (EYMPME) per ml in the same buffer. Fractions were collected and analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE); the peak fractions were pooled and, based on purity, subjected to further purification on HPLC columns, which included Resource Q, Resource S and Resource Eth (Pharmacia). For purification of insoluble proteins, in particular 091-21A31, recombinant Sf9 cells were pelleted, lysed in 5 volumes of [20mM Tris (pH8.0), 137mM NaCl, 1mM EGTA, 1.5mM MgCl2, 2% SDS, 10µg/ml each of leupeptin, pepstatin, pefabloc, 1mM aprotinin and 1mM DTT], incubated at room temperature for 20-30 minutes and ultracentrifuged. The upper phase was removed, NaCl was adjusted to 400mM and the sample was recentrifuged. The clarified supernatant was then diluted 1:10 in 1 X TG buffer [20mM Tris (pH8.0), 137mM NaCl, 1mM EGTA, 1.5mM MgCl2, 1% Triton X100, 10% glycerol, 10µg/ml each of leupeptin, pepstatin, pefabloc, 1mM aprotinin and 1mM DTT], filtered through a 3µm Gelman Versapore filter and loaded onto a 1-ml anti-Glu-Glu affinity column. The column was washed with 10-15ml of 1 X TG buffer with 400mM NaCl and eluted in 1 X TG buffer with 1% SDS and 100µg/ml Glu-Glu peptide. Fractions were analyzed by SDS-PAGE, pooled and used to immunize rabbits.

To produce antisera containing antibodies directed against the BRCA1 Modulators, the latter are used to immunize rabbits as follows. For the BRCA1 Modulator 091-21A31, Sequence ID No. 1, the immunization protocol generally consisted of two immunizations; the first was a subcutaneous injection of 0.500mg in CFA, followed by a second intramuscular injection of 0.250 mg about four weeks later in ICFA. The rabbits were bled, antisera collected and antibody purified as set forth below.

BRCA1 Modulator antibodies are affinity purified using BRCA1 Modulator immunogens which have been coupled to a support matrix. Briefly, the BRCA1 Modulator 091-21A31, Sequence ID No. 1 is coupled to CNBr activated Sepharose 6MB (Pharmacia) as follows. One ml of matrix was activated according to manufacturer's instructions (i.e., resuspended in 1mM HCl, washed for 15 min. in 1mM HCl on a sintered glass filter). One mg of 091-21A31, Sequence ID No. 1 was dialyzed against coupling buffer [0.1M NaHCO3, pH 8.3, 0.5M NaCl] overnight at 4°C with two changes of buffer. The dialyzed protein was then incubated with the CNBr activated Sepharose 6MB with rocking overnight at 4°C. The excess ligand was washed away with coupling buffer and any remaining active groups were blocked with 1M ethanolamine at room temperature for two hours. This material was then washed with three cycles of alternating pH; each cycle consisted of a wash with 0.1M acetate buffer, pH 4.0, 0.5M NaCl followed by a wash with 0.1M Tris, pH 8.0, 0.5M NaCl. The protein coupled gel matrix was then resuspended in PBS and incubated with 5ml of antibody serum with rocking overnight at 4°C. The mixture was poured into a column, allowed to drip through and washed 3 times with 15ml PBS per wash. Seven elutions with 800µl of 0.2M glycine, pH 2.5, were collected and each elution was neutralized immediately with 200µl of 1M K2HPO4. Peak fractions were combined and dialyzed into PBS Azide for storage.
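The composition of each neutralized elution fraction follows from the stated volumes. The sketch below is illustrative only and assumes additive volumes and the nominal concentrations given above (800µl of 0.2M glycine neutralized with 200µl of 1M K2HPO4). The function name `neutralized_fraction` is hypothetical.

```python
def neutralized_fraction(eluate_ul=800.0, base_ul=200.0,
                         glycine_molar=0.2, phosphate_molar=1.0):
    """Volume and nominal solute concentrations of one neutralized elution.

    Assumes the two volumes are simply additive.
    """
    total_ul = eluate_ul + base_ul
    return {
        "volume_ul": total_ul,
        "glycine_M": glycine_molar * eluate_ul / total_ul,
        "K2HPO4_M": phosphate_molar * base_ul / total_ul,
    }


fraction = neutralized_fraction()
total_eluted_ul = 7 * fraction["volume_ul"]  # seven elutions were collected

print(fraction, total_eluted_ul)
```

Each fraction is therefore nominally 1ml, and the seven fractions total about 7ml before the peak fractions are pooled and dialyzed.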

American Type Culture Collection Deposits The cDNA clones that encode 091-1F84, Sequence ID No. 3, 091-21A31, Sequence ID No. 1, and 091-132Q20, Sequence ID No. 5 were deposited with the American Type Culture Collection (ATCC) on August 14, 1996 under accession numbers 98141 (091-1F84, Sequence ID No. 3), 98142 (091-21A31, Sequence ID No. 1) and 98143 (091-132Q20, Sequence ID No. 5). The deposits were made under the Budapest Treaty and shall be maintained at least 30 years after the date of deposit and 5 years after the date of the most recent request for the deposit.

The invention now being fully described, it will be apparent to one of ordinary skill in the art that many changes and modifications can be made thereto without departing from the spirit or scope of the appended claims.
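The sequence listing that follows gives a CDS location for each cDNA and a length for each encoded protein, and the two can be cross-checked by counting codons. The sketch below is a consistency check under stated assumptions, not part of the listing. It assumes the CDS coordinates are 1-based and inclusive and exclude the stop codon, and that the listing whose identifier is partly illegible (1191 base pairs, CDS 34..1191) is SEQ ID NO:5. The names `CDS_RECORDS` and `codon_count` are hypothetical.

```python
# (CDS start, CDS end, stated protein length) as given in the sequence listing.
CDS_RECORDS = {
    "SEQ ID NO:1": (103, 1512, 470),  # protein listed as SEQ ID NO:2
    "SEQ ID NO:3": (34, 2541, 836),   # protein listed as SEQ ID NO:4
    "SEQ ID NO:5": (34, 1191, 386),   # protein listed as SEQ ID NO:6
}


def codon_count(start, end):
    """Number of whole codons in a 1-based, inclusive CDS interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


for name, (start, end, protein_length) in CDS_RECORDS.items():
    codons = codon_count(start, end)
    status = "consistent" if codons == protein_length else "MISMATCH"
    print(f"{name}: {codons} codons vs {protein_length} residues ({status})")
```

A check of this kind is useful when a listing has passed through OCR, since the header fields can be verified even where the sequence text itself is damaged.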

SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT: Rubinfeld, Bonnee; Polakis, Paul G.; Ligenfelter, Carol; Vuong, Terilyn T.
(ii) TITLE OF INVENTION: MODULATORS OF BRCA1 ACTIVITY
(iii) NUMBER OF SEQUENCES: 6
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: ONYX Pharmaceuticals, Inc.
STREET: 3031 Research Drive
CITY: Richmond
STATE: CA
COUNTRY: USA
ZIP: 94806
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER: US Unknown
FILING DATE:
CLASSIFICATION: Utility
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Giotta, Gregory
REGISTRATION NUMBER: 32,028
REFERENCE/DOCKET NUMBER: ONYX1024 GG
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: (510) 262-8710
TELEFAX: (510) 222-9758

INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
LENGTH: 2065 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 103..1512 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: WO 98/10066 PCT/US97/13944 GTGGATCCCC CGGGCTGCAG GAATTCGGCA CGAGCGGCAC AGCAGTTTCT TTGGCTGCCT GGGCCCCTTG AGTCCAGCCA GCT CTG TGC ACT Ala Leu Cys Thr GCC GCC ATC CAC Ala Ala Ile His TGG TTT GAG ACA Trp Phe Giu Thr GTT GGC AAA AGA Val Gly Lys Arg ATC TGC Ile Cys 10 TGC GGC Cys Gly TCC GAC TTC TTC Ser Asp Phe Phe CAC ACC TTC CAC His Thr Phe His AGT CGG ACC TGC Ser Arg Thr Cys GAGTACGAAG CCCGACCTGT TC ATG CCT ATC CGT Met Pro Ile Arg 1 CAC TCC CGC GAC GTG His Ser Arg Asp Vai CAG TGC CTA ATT CAG Gin Cys Leu Ile Gin 114

CCA

Pro CCA CAG Pro Gin ACC ATT ATC Thr Ile Ile

GAG

Glu

AAT

Asn 60

GCA

Ala CTC TTC TTT Leu Phe Phe TTC TTA AAG Phe Leu Lvs GAG GAG Glu Glu 70 AAT GTC Asn Val AAT GTC TTG Asn Val Leu

CAT

Asp 75

TCC

Ser

CAA

Glu

CAG

Gin AGA CCC CAG CTT Arg Ala Gin Leu 90 ATC ATC GAC ACT Ile Ile Asp Thr CAG AAA GAC Gin Lys Asp

GAG

Glu

GAA

Glu TGC CGA ATC CAC Cys Arg Ile Gin GAT CTT GCC CAG Asp Leu Ala Gin AAT GAA CTG GAC Asn Giu Leu Asp AAA CGA GAC AGC Lys Arg Asp Ser 100 GAA CGC AAT CCT Glu Arg Asn Ala 115 402

GTC

Val ACT GTG GTA TCT Thr Val Val Ser 120 105

CTG

Leu CTG CGG GAT ACG CTG Leu Arg Asp Thr Leu 110 CAG GCC TTC GGC AAG Gin Aia Leu Gly Lys

CAG

Gin TCC ACA CTC Ser Thr Leu 135 ACC AAA CAA Thr Lys Gin 150

AAA

Lys

GCA

Ala AAG CAG ATC AAG Lys Gin Met Lys 140 CAA GAG GAG GCC Gin Giu Giu Ala 125

TAC

Tyr GCC GAG ATG CTG TGC Ala Giu Met Leu Cys 130 CAC CAG CAG CAT GAG Gin Gin Gin Asp Glu TTA GAG Leu Glu

ACC

Thr 165

GAG

Glu ATG GAG CAG ATT GAG Met Giu Gin Ile Glu 170 GAG ATG ATC CGA GAC Glu Met Ile Arg Asp 185 GCT CTG TAC TGT GTG Ala Val Tyr Cys Val 200 155

CTT

Leu

ATG

Met CGC CGG CTC AGG Arg Arg Leu Arg 160 145

AGC

Ser AAG ATG AAC Lys Met Lys CTA CTC CAG AGC Leu Leu Gin Ser 175 GGT GTC GGA CAC Gly Val Gly Gin

CAG

Gin

TCA

Ser CGC CCT GAG GTG Arg Pro Giu Val 180 GCG GTG GAA CAG Ala Val Giu Gin

CTC

Leu TCT CTC AAG Ser Leu Lys 205 190

AAA

Lys GAG TAC GAG Glu Tyr Glu 195 CTA AAA Leu Lys

AAT

Asn 210 WO 98/10066 WO 9810066PCTIUS97/1 3944 GAG GCA COG AAG Glu Ala Arg Lys 215 TTG TTT TCC TCC Leu Phe Ser Ser GCC TCA GGG GAG GTG GCT GAC AAG CTG AGG AAG GAT Ala Ser Gly Glu 220

TTG

Leu Val Ala Asp Lys Leu 225

TCT

S er Arg Lys Asp GAA TTG OAT Glu Leu Asp AGA AOC Arg Ser 230

CAG

Gin 245 10 GAC Asp 0CC Al a AAG TTA GAA CTG Lys Leu Olu Leu 250

AAG

Lys 235

AAG

Lys

CTO

Leu CAG ACA OTC Gin Thr Val

TAC

Tyr 240

GAC

Asp 834 882 TCA 0CC CAG AAG Ser Ala Gin Lys 255 AAA AAO AAG CTA Lys Lys Lys Leu TTA CAG AOT Leu Gin Ser

OCT

Ala 260 AAG OAA ATC ATO Lys Olu Ile Met 265 TTO AAC CTG CCA Leu Asn Leu Pro

AOC

Ser ACO ATO CTG Thr Met Leu CAG GAA Gin Glu 275

ACC

Thr CCA OTO 0CC Pro Val Ala

AGT

Ser 285

OTO

Val1 ACT GTC GAC COC CTG OTT Thr Val Asp Arg Leu Val 290 CTG AAO CTC COC COG OCA Leu Lys Leu Arg Arg Pro TTA GAG AGC CCA Leu Olu Ser Pro 295 TCC TTC COT OAT Ser Phe Arg Asp 0CC CCT OTO Ala Pro Val OAT ATT OAT Asp Ile Asp 315 CCC TCC AOC Pro Ser Ser

GAG

Oiu 300

CTC

Leu AAT OCT ACC Asn Ala Thr

AAT

Asn 305

OAT

Asp OTO OAT ACT Val Asp Thr 310

CCC

Pro 325

TOO

Cys

CCA

Pro 0CC COO Ala Arg TCC CAG CAT Ser Gin His CTA GAG AAO TCA Leu Giu Lys Ser 345 AAA GOC CCC AGO Lys Gly Pro Arg 330

CAC

His

GT

Gly 335

OAT

Asp TAC OAA AAA CTT Tyr Giu Lys Leu 340 CCC AAO AAG ATA Pro Lys Lys Ile 1026 1074 1122 1170 1218 TCC CCA ATT Ser Pro Ile

OTC

Val

TOC

Cys AAO GAG TCC CAG Lys Giu Ser Gin 365 TCA CTG GOT Ser Leu Gly

GOC

Gly 370 355 CAG AOC Gin Ser 360 TOT OCA OGA Cys Ala Gly 375 GAO CCA OAT GAG Glu Pro Asp Oiu GTC COO Val Arg 390 GAG TCC Oiu Ser

AAT

Asn 0CC ATC CTA Ala Ile Leu

GOC

Gly 395

OAT

Asp

GAA

Glu 380

CAG

Gin

OTO

Val1 CTO OTT GOT 0CC Leu Val Gly Ala AAA CAG CCC AAO Lys Gln Pro Lys 400 OTA AGO ACA GOC Val Arg Thr Oly TTC CCT ATT TTT Phe Pro Ile Phe 385 AGO CCC AGO TCA Arg Pro Arg Ser 1266 1314 1362 1410 405

GOT

Gly TCT TOC AGC AAA Ser Cys Ser Lys 410 COG ACA AAA TTC Arg Thr Lys Phe TTC OAT 000 Phe Asp Gly

CTC

Leu 420

COC

Arg

GOC

Gly ATC CAG CCT Ile Gin Pro ACA GTC ATO Thr Val Met 425 WO 98/10066 PCTIUS97/13944 CCA TTG CCT GTT AAG CCC AAG ACC AAG GTT AAG CAG AGG GTG AGG GTG Pro Leu Pro Val Lys Pro Lys Thr Lys Val Lys Gin Arg Val Arg Val 440 445 450 AAG ACA GTG CCT TCT CTC TTC CAG GCC AAG CTG GAC ACC TTC CTG TGG Lys Thr Val Pro Ser Leu Phe Gin Ala Lys Leu Asp Thr Phe Leu Trp 455 460 465 TCG TGA GAACAGTGAG TCTGACCAAT GGCCAGACAC ATGCCTGCAA~

CTTGTAGGTC

Ser 1458 1506 1562 470

AAGGACTGTC

GTAAGGGCAG

CACCCTGCCC

GGTCCTGCTC

TCTGGGCCTG

ATCTCAGGCA

GGGCCAAGCA

TCATGTAAAA

AA~AAAAAAA

CAGGCAGGGG

ACAAACAGGT

CACTCCTACG

CTGTTGCCAG

GAGACCACGG

GCCTCAGCCC

GGGTGGGGAA

TAAAATTAAA

AAAAAAACTC

TTTTGTGGAC

GAGGGTGAGT

ACTGGGAGCT

GCTCCTGTTT

TCACTTGTTG

AAGCTTCTAC

TGGAGGATAG

GAG

AGAGCCCCAC

GTGACACCCA

GACATGACCA

ATAGCCATGA

ACTGTCTCTG

CTGCCTTTGA

CATOGGATGT

CAAAAAAAAA

TTTCGGGACC

GAGACTGCTC

GCCCACTGAT

TCAGATGTGG

TGGACCAGAG

CTTGCTTCTA

ATGGAGAGGA

AAAAZAAAA

AGCCTGAGGT

TTCCTGCCCT

CCTGTCAGCA

TCAGACTCTT

TGCTTGAGGC

GGCATAGCCT

TGGAAGATTT

1622 1682 1742 1802 1862 1922 1982 2042 2065 INFORMATION FOR SEQ ID NO:2: SEQUENCE CHARACTERISTICS: LENGTH: 470 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE (xi) SEQUENCE TYPE: protein DESCRIPTION: SEQ ID NO:2: Met 1 Ser Pro Ile Arg Ala Leu Cys Thr Ile 5 Ser Asp Phe Phe Asp His Arg Asp Val Leu Ile Gin Ala Ala Ile His Trp Phe Glu Thr Cys 25 Ala His Thr Phe Cys Pro Ser Arg 40 Arg Thr Lys His Leu Gin Cys Pro Gin Leu Phe Phe Cys Arg 50 Asp Leu Ile Gin Val Gly Lys Glu Thr Ile Ile Ala Gin Glu Asn Val Leu Asn Giu Phe Leu Lys Glu Leu Asp Lys Arg Asp Glu Arg Asn Ser 100 Ala Asn Gin Thr Val Arg Ala Gin Val Ile Ile Asp 105 Val Val Ser Leu 120 Leu 90 Thr Gin Lys Asp Lys Giu Leu Arg Asp Thr Leu Glu 110 Giy Lys Ala Gin Gin Ala Leu 125 WO 98/10066 PCT[US97/13944 Glu Met Leu Cys Ser Thr Leu Lys Lys Gin Met Lys Tyr Leu Glu Gin 130 135 140 Gin Gin Asp Glu Thr Lys Gin Ala Gin Glu Glu Ala Arg Arg Leu Arg 145 150 155 160 Ser Lys Met Lys Thr Met Glu Gin Ile Glu Leu Leu Leu Gin Ser Gin 165 170 175 Arg Pro Glu Val Glu Glu Met Ile Arg Asp Met Gly Val Gly Gin Ser 180 185 190 Ala Val Glu Gin Leu Ala Val Tyr Cys Val Ser Leu Lys Lys Glu Tyr 195 200 205 Glu Asn Leu Lys Glu Ala Arg Lys Ala Ser Gly Glu Val Ala Asp Lys 210 215 220 Leu Arg Lys Asp Leu Phe Ser Ser Arg Ser Lys Leu Gin Thr Val Tyr 225 230 235 240 Ser Glu Leu Asp Gin Ala Lys Leu Glu Leu Lys Ser Ala Gin Lys Asp 245 250 255 Leu Gin Ser Ala Asp Lys Glu Ile Met Ser Leu Lys Lys Lys Leu Thr 260 265 270 Met Leu Gin Glu Thr Leu Asn Leu Pro Pro Val Ala Ser Glu Thr Val 275 280 285 Asp Arg Leu Val Leu Glu Ser Pro Ala Pro Val Glu Val Asn Leu Lys 290 295 300 Leu Arg Arg Pro Ser Phe Arg Asp Asp Ile Asp Leu Asn Ala Thr Phe 305 310 315 320 Asp Val Asp Thr Pro Pro Ala Arg Pro Ser Ser Ser Gin His Gly Tyr 325 330 335 Tyr Glu Lys Leu Cys Leu Glu Lys Ser His Ser Pro Ile Gin Asp Val 340 345 350 Pro Lys Lys Ile Cys Lys Gly Pro Arg Lys Glu Ser Gin Leu Ser Leu 355 360 365 Gly Gly Gin Ser Cys Ala Gly Glu Pro Asp Glu Glu Leu Val Gly Ala 370 375 
380 Phe Pro Ile Phe Val Arg Asn Ala Ile Leu Gly Gin Lys Gin Pro Lys 385 390 395 400 Arg Pro Arg Ser Glu Ser Ser Cys Ser Lys Asp Val Val Arg Thr Gly 405 410 415 Phe Asp Gly Leu Gly Gly Arg Thr Lys Phe Ile Gin Pro Thr Asp Thr 420 425 430 Val Met Ile Arg Pro Leu Pro Val Lys Pro Lys Thr Lys Val Lys Gin 435 440 445 Arg Val Arg Val Lys Thr Val Pro Ser Leu Phe Gin Ala Lys Leu Asp 450 455 460 Thr Phe Leu Trp Ser 465 470 WO 98/10066 PCT/US97/13944 INFORMATION FOR SEQ ID NO:3: SEQUENCE CHARACTERISTICS: LENGTH: 3256 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 34..2541 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: GAACTAGTGG ATCCCCCGGG CTGCAGGAAT TCG GCA CGA GAA AGC TTA TCC CTT Ala Arg Glu Ser Leu Ser Leu CCC TCG ATG Pro Ser Met CTT CGG GAT GCT Leu Arg Asp Ala

GCA

Ala 15

ACT

Thr ATT GGC ACT ACC Ile Gly Thr Thr CCT TCA GCA CCA Pro Ser Ala Pro TGC TCG Cys Ser ACA AAC Thr Asn

GTG

Val GGG ACT TGG Gly Thr Trp

TTT

Phe 30

GGC

Gly CCT TTC TCT ACT Pro Phe Ser Thr CAG GAA AAG AGT Gin Glu Lys Ser CAC AGT ACT TCT His Ser Thr Ser CTG ACT GCC TTG Leu Thr Ala Leu 102 150 198 ACA TCC CAG Thr Ser Gin 40

GAG

Glu ACA GAG CAG Thr Glu Gin

CTC

Leu

TTG

Leu

ACA

Thr 45

CTG

Leu

GAA

Glu CTG GTT GGC Leu Val Gly

AAG

Lys

GAT

Asp TGT GGC CGG Cys Gly Arg GAT AAC CTG Asp Asn Leu

CCT

Pro 65 TCT CGA CAT Ser Arg His GAG GTT CTC Glu Val Leu CCT CAC CCA Pro His Pro

GAC

Asp

TCC

Ser

GAA

Glu CGC CAG CTT CGG Arg Gin Leu Arg 95 ACC CAG GAC AGT Thr Gin Asp Ser 80

GAC

Asp

AGC

Ser CTG AGC TCT CTT GTC ATT CTG Leu Ser Ser Leu Val Ile Leu TGG AAG AGC CAG CTG GCT GTC Trp Lys Ser Gin Leu Ala Val 100 ACA CAG ACT GAC ACA TCT CAC Thr Gln Thr Asp Thr Ser His 105 AGT GGG Ser Gly ATA ACT AAT Ile Thr Asn 120

GGA

Gly

AAA

Lys 125

CAG

Gin 115 CAG CAT CTT AAG GAG Gin His Leu Lys Glu AGC CAT GAG Ser His Glu

ATG

Met 135

CTT

Leu CAG GCC CTA CAG Gln Ala Leu Gin 140 GCC AGA AAT GTC ATG CAA TCA TGG Ala Arg Asn Val Met Gin Ser Trp WO 98/10066 PCT/US97/13944 ATC TCT AAA GAG CTG ATA TCC TTG CTT CAC CTA TCC CTG TTG CAT TTA Ile Ser Lys Glu Leu Ile Ser Leu GAA GAA GAT Glu Glu Asp 170 TTG GTC TGT Leu Val Cys 155

AAG

Lys ACT ACT GTG Thr Thr Val

AGT

Ser 175

TTG

Leu Leu 160

CAG

Gin

CTG

Leu His Leu Ser Leu Leu His Leu 165 GAG TCT CGG CGT GCA GAA ACA Glu Ser Arg Arg Ala Glu Thr 180 AAG AAA TTG AGG GCA AAG CTC Lys Lys Leu Arg Ala Lys Leu 582

CAG

Gin 200

GCT

Ala 185

AGC

Ser TGC TGT TTT Cys Cys Phe AAA GCA GAA Lys Ala Glu

GAT

Asp 190

CTC

Leu CTC AGA GGC Leu Arg Gly

AAG

Lys 220

CAG

Gin 205

GAT

Asp

CGC

Arg AGG GAG GAG GCA Arg Glu Glu Ala GCG GCA GAG ATA Ala Ala Glu Ile 225 ATC AGC CAG CTG Ile Ser Gin Leu 195

CAC

His

TTG

Leu AGA GAG GAA ATG Arg Glu Glu Met 215 GAG GCT TTC TGT Glu Ala Phe Cys 230 GCA CAC GCC Ala His Ala 240 GAA CAG GAC CTA GCA TCC Glu Gin Asp Leu Ala Ser 245 GCC CAG ACC CAA CTG GTA Ala Gin Thr Gin Leu Val ATG CGG Met Arg GGG CTT Gly Leu 265 ACT TCT Thr Ser

GAA

Glu 250

CAT

His

ACC

Thr AGA GGC CTT CTG Arg Gly Leu Leu 255

AAG

Lys

CTG

Leu

GAT

Asp GCC AAG CAA GAA Ala Lys Gin Glu 270 TTG CAA CAA GAC Leu Gin Gin Asp

GAG

Glu 260 GTT CAG CAG ACA Val Gin Gin Thr GTG AGT CTT Val Ser Leu 275

CAA

Gin 280

ACA

Thr

ACA

Thr TGG ACA GCT Trp Thr Ala GTC AAG AGC Val Lys Ser

TTG

Leu 300 285

CTG

Leu TGG AGG TCC ATG Trp Arg Ser Met 290 CGG TCC CGA CAA Arg Ser Arg Gin CTG GAT TAT Leu Asp Tyr

ACA

Thr 295

AGT

Ser CTC ACA GAG Leu Thr Glu GAA AAG Glu Lys GAG GAG Glu Glu 345 CTA GCA Leu Ala 360 315

GAG

Glu

AAA

Lys CAG CAA GCC CTG Gin Gin Ala Leu GTT TCT AGG GTG Val Ser Arg Val 335 GGC CAA ACA GAA Gly Gin Thr Glu

CAG

Gin 320

CTG

Leu 305

GAA

Glu

GAA

Glu CGT GAT GTG GCA ATT GAG Arg Asp Val Ala Ile Glu 325 CAA GTC TCT GCC CAG TTA Gin Val Ser Ala Gin Leu AAA CTC Lys Leu 310 870 918 966 1014 1062 1110 1158

ACA

Thr 350 GAT CTC CGG GCT Asp Leu Arg Ala CAA CTG GAG TTG Gin Leu Glu Leu 355 TTG CAG ATT CTG Leu Gin Ile Leu AAC AGT CGT Asn Ser Arg

CAG

Gin GCC AAC ATG Ala Asn Met 365 WO 98/10066 PCT/US97/13944 AGC CAG CTA AAA Ser Gin Leu Lys CTG GCT ATG AAG Leu Ala Met Lys 395 GAG CAG GCT GCT Glu Gin Ala Ala GAG CTA CAG AGT CAG CAT ACC CAT TGT GCC CAG GAC Glu 380

GAT

Asp Leu Gin Ser Gin Thr His Cys Ala Gin Asp 390 GAG TTA TTC Glu Leu Phe

TGC

Cys 400

GAA

Glu CTT ACC CAG Leu Thr Gin AGC AAT GAG Ser Asn Glu 405 CAA TGG CAA Gin Trp Gin CAG GCA Gin Ala 425 GAC CTG Asp Leu 410

GAA

Glu

AAG

Lys 415

CAA

Gin GAG ATG GCA Glu Met Ala CTA AAA CAC ATG Leu Lys His Met 420 AAA GAG GTG CGG Lys Glu Val Arg CTG CAG CAG Leu Gin Gin

CAA

Gin 430

GAG

Glu GCT GTC CTG Ala Val Leu

GCC

Ala 435

GAG

Glu 440

CAC

His

GTG

Val AAA GAG ACC TTG Lys Glu Thr Leu 445 TTT GCA GAC Phe Ala Asp CTG GAG CTG GGT Leu Glu Leu Gly 460 CTC CGG GAG CGC Leu Arg Glu Arg CAG GTT GAG TGT Gin Val Glu Cys AGC TTG CAG TGT Ser Leu Gin Cys 480 AAA CTG GCC AGC Lys Leu Ala Ser

CAA

Gin 465

GAG

Glu

CAG

Gin 450

TTG

Leu

AAC

Asn AAT CAG GTT Asn Gin Val

GCT

Ala 455 AAA ACC ACA CTG GAA Lys Thr Thr Leu Glu 470 CTC AAG GAC ACT GTA Leu Lys Asp Thr Val 485 1206 1254 1302 1350 1398 1446 1494 1542 1590 1638 1686 1734 1782 1830 GAG AAC Glu Asn CAA GAT Gin Asp 505 ACT GAG Thr Glu

CTA

Leu 490

CTG

Leu

CAA

Gin

GCT

Ala ACC ATA GCA GAT AAC CAG GAG Thr Ile Ala Asp Asn Gin Glu 500 TCT CAA AAG CTA AGG CTG CTG Ser Gin Lys Leu Arg Leu Leu GAG AAA ACA CGG Glu Lys Thr Arg 510 CTA CAG AGC CTG Leu Gin Ser Leu

TAC

Tyr 515 520

GAG

Glu ACT CTC TTT CTA Thr Leu Phe Leu 530 CTT CTG CTG AGT Leu Leu Leu Ser

CAG

Gin ACA AAA CTA Thr Lys Leu

AAG

Lys 535 AAG ACT GAA Lys Thr Glu ACC CAG GAA Thr Gin Glu TTG ACA GCA Leu Thr Ala 570 CTT GGA AGT Leu Gly Ser 585

CAC

His 555

GTG

Val

CAA

Gin 540

CCT

Pro

GCA

Ala

ACC

Thr ACA GCC TGT CCT CCC Thr Ala Cys Pro Pro 550 TTC CTG GGA AGC ATC Phe Leu Gly Ser Ile CTG CCT AAT Leu Pro Asn GAT GAA GAG Asp Glu Glu

GAC

Asp 560

ACC

Thr 565

GTG

Val CCA GAA TCA ACT Pro Glu Ser Thr ACC CGA GTA GCA Thr Arg Val Ala 595

CCT

Pro 580

TCA

Ser CCC TTG Pro Leu GAC AAG AGT GCT Asp Lys Ser Ala 590 ATG GTT TCC Met Val Ser WO 98/10066 PCT/US97/13944 CTT CAG CCC Leu Gin Pro 600 AGT ATT ATG Ser Ile Met GCA GAG ACC CCA GGC ATG GAG GAG AGC CTG GCA GAA ATG Ala Glu Thr 605 ACT ACT GAG Thr Thr Glu Pro Gly Met Glu Ser Leu Ala Glu Met 615 CTT CAG AGT Leu Gin Ser 620

GCC

Ala

CTT

Leu 625

CAG

Gin TCC.CTG CTA Ser Leu Leu TCT AAA GAA Ser Lys Glu ATC AGG ACT Ile Arg Thr CAA GTT AGG CTG Gin Val Arg Leu 650 GCA AAA GAA GCA Ala Lys Glu Ala CAG GCC CAG Gin Ala Gin GAC ATA GAG Asp Ile Glu 670

GAA

Glu 655

AAG

Lys

CTG

Leu 640

GAA

Glu

CTG

Leu CGA AAA ATT TGT Arg Lys Ile Cys 645 CAA GAG Gin Glu 630 GAG CTG Glu Leu CAG AAG Gin Lys CAG CAT CAG GAA Gin His Gin Glu 660 AAC CAG GCC TTG Asn Gin Ala Leu

GTC

Val TGC TTG CGC Cys Leu Arg 665 TAC AAG Tyr Lys AAT GAA Asn Glu AAG GAG Lys Glu 680

AAG

Lys ATC CTA GAA CAG Ile Leu Glu Gin 700 GAG GTG ACC CAC Glu Val Thr His 685

ATA

Ile

CTT

Leu CTC CAG GAA GTG Leu Gin Glu Val GAC AAG AGT GGC Asp Lys Ser Gly 705 ACC CGC TCA CTT Thr Arg Ser Leu

ATA

Ile 690 675

CAG

Gin CAG CAG AAT Gin Gin Asn

GAG

Glu GAG CTC ATA AGC CTT AGA Glu Leu Ile Ser Leu Arg 710 CGG CGT GCG GAG ACA GAG Arg Arg Ala Glu Thr Glu 725 CAG CTG GAC TCC AAC TGC Gin Leu Asp Ser Asn Cvs 1878 1926 1974 2022 2070 2118 2166 2214 2262 2310 2358 2406 2454 2502 ACC AAA GTG Thr Lys Val 730 CAG CCT ATG Gin Pro Met 745 715

CTC

Leu

CAG

Gin 720 GAG GCC CTG GCA Glu Ala Leu Ala

GGC

Gly GCC ACC AAT Ala Thr Asn 735

ATC

Ile

CAG

Gin GAG AAA GTG Glu Lys Val

GAG

Glu 760

GAA

Glu GTG GAC AAA CTG AGA Val Asp Lys Leu Arg 765 AAA CTC ATG ATC AAG Lys Leu Met Ile Lys 780 CTT CGG CGC TCT GAC Leu Arg Arg Ser Asp ATG TTC CTG Met Phe Leu

GAG

Glu 770

AGA

Arg 755

ATG

Met

AAT

Asn 740 TGG CTC TCT CAG Trp Leu Ser Gin AAA AAT GAG AAG Lys Asn Glu Lys 775 ATC CTA GAG GAG Ile Leu Glu Glu TTC CAG AGC Phe Gin Ser AAG GAG TTA Lys Glu Leu

CAT

His 785

GAA

Glu

AAC

Asn

AAA

Lys 790 CTA GAT GAC ATT GTT Leu Asp Asp Ile Val CAG CAT ATT Gin His Ile 810 800 AAG ACC CTG CTC TCT Lys Thr Leu Leu Ser 815 ATT CCA GAG Ile Pro Glu

GTG

Val 820 805 GTG AGG GGA Val Arg Gly WO 98/10066 WO 9810066PCTIUS97/13944 TGC AGA GAA CTA CAG GGA TTG CTG GAA TTT CTG AGC TAA GAAACTGAAA Cys Arg Glu Leu Gin Gly Leu Leu Glu Phe Leu Ser 2551

GCCAGAATCT

ACTGGCATAG

GTGCCGAATT

CCTGAAAAGT

ATAGCCTTTG

AGGAALAGCAG

CGTGTGATCA

AGACAGA.AGG

AAGCTCCGTG

TTCTGGGGCC

GTGGGTGTGT

ACCAAAAAAA.

GCTTCACCTC

AGCCAACTGA

CGGCACGAGC

TCCAGCATAT

CCATCACTGC

ACATTGACCT

CCATTATGCA

ATGTAAAGGA

AAGACCTGGA

TTCGTGTCCG

CCAAGAAGAA

TTTTTACCTG

GATAAATGCT

GGCACGAGCG

TTTGCGAGTA

CATTAAGGGT

CACCAAGAGG

GAATCCACGC

TGGAAAATAC

GCGACTGAAG

AGGCCAGCAC

ATAA.GTCTGT

ACTCGAGCAT

CAATACCCCC

ATTTAAATAA

GCACGAGCTG

CTCAACACCA

GTGGGCCGAA

GCGGGAGAAC

CAGTACAAGA

AGCCAGGTCC

AAGATTCGGG

ACCAAGACCA

AGGCCTTGTC

GCATCTAGAG

TTACCCCAAT

AGTGTATTTA

CAGCCATGTC

ACATCGATGG

GATATGCTCA

TCACTGAGGA

TCCCAGACTG

TAGCCAATGG

CCCATAGAGG

CTGGCCGCCG

TGTTAATAAA

GGCCC

ACCAAGACCA

ATGAAAACTC

TCTAGTGATC

GCGGCGGAAA

TGTGGTGTTG

TGAGGTGGA\

GTTCTTGAAC

TCTGGACAAC

GCTGCGTCAC

TGGCCGCACC

TAGTTTATAT

2611 2671 2731 2791 2851 2911 2971 3031 3091 3151 3211 3256 INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 836 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Ala 1 Gly Arg Glu Ser Thr Thr Pro Ala Pro Gin Leu 5 Phe Ser Leu Pro Ser Leu Arg Asp Ala Ala Ile Ser Thr Cys Ser 25 Asn Gly Thr Trp Phe Thr Pro Gly Leu Val Ser Glu Lys Ser Thr Ser Gin Gly Thr Lys His Ser Thr Ser 55 Leu Thr Giu Gin Leu Leu Cys Gly Arg Pro Leu Pro Asp Leu Thr Ala 70 Ile Ser Arg His Asp Ser Glu Asp Asn Ser Ser Leu Leu Glu Val Arg Gin Leu Arg Asp Ser Ser Trp Lys Ser Thr Gin Thr 115 Leu Lys Giu Gin 100 Asp Ala Val Pro His 105 Gly Giu Thr Gin Asp Thr Ser His Ile Thr Ser His Glu 130 Val Met Gin Ser Trp Val Met 135 Leu Gin Ala Leu Asn Lys 125 Gin Gin 140 Leu Ile 110 Leu Gin His Ala Arg Asn Ser Leu Leu 160 Ile Ser Lys Giu 155 150 WO 98/10066 WO 9810066PCTIUS97/13944 His Leu Ser Leu Leu His Leu Giu Giu Asp Lys Thr Thr Val Ser Gin 170 175 Giu Lys Ala Ile 225 Leu Asp Vai Ser Arg 305 Giu Giu Leu Gin His 385 Gin Giu Val Asp Gin 465 Glu Ser Lys Arg 210 Val1 Giu Ala Gin Met 290 Gin Arg Gin Giu Ile 370 Thr Leu Met Leu Gin 450 Leu Asn Arg Leu 195 His Leu Gin Gin Gin 275 Gin Leu Asp Val1 Leu 355 Leu His Thr Ala Ala 435 Giu Lys Leu Arg 180 Arg Arg Giu Asp Thr 260 Thr Leu Thr Val1 Ser 340 Giu Ala Cys Gin Leu 420 Lys Asn Thr Lys Ala Ala Glu Ala Leu 245 Gin Val1 Asp Glu Ala 325 Ala Asn Asn Ala Ser 405 Lys Giu Gin Thr Asp 485 Giu Lys Glu Phe 230 Ala Leu Ser Tyr Lys 310 Ile Gin Ser Met Gin 390 Asn His Val1 Val Leu 470 Thr Thr Leu Met 215 Cys Ser Val1 Leu Thr 295 Leu Giu Leu Arg Asp 375 Asp Giu Met Arg Ala 455 Giu Val1 Leu Gin 200 Ala Ala Met Gly Thr 280 Thr Thr Glu Giu Leu 360 Ser Leu Glu Gin Asp 440 His Val1 Glu Val 185 Ser Leu His Arg Leu 265 S er Trp Val Lys Giu 345 Ala Gin Ala Gin Ala 425 Leu Leu Leu Asn Cys Leu Arg Ala Giu 250 His Thr Thr Lys Gin 330 Cys Thr Leu Met Ala 410 Glu Lys Giu Arg Leu 490 Cys Lys Gly Ser 235 Phe Ala Leu Ala 
Ser 315 Giu Lys Asp Lys Lys 395 Ala Leu Glu Leu Giu 475 Thr Glu Cys Ala Lys 220 Gin Arg Lys Gin Leu 300 Gin Val Gly Leu Glu 380 Asp Gin Gin Thr Gly 460 Arg Ala Lys Phe Glu 205 Asp Arg Gly Gin Gin 285 Leu Gin Ser Gin Arg 365 Leu Giu Trp Gin Leu 445 Gin Ser Lys Thr Asp 190 Arg Ala Ile Leu Giu 270 Asp Ser Al a Arg Thr 350 Al a Gin Leu Gin Gin 430 Giu Val1 Leu Leu Arg L eu Glu Ala Ser Leu 255 Glu Trp Arg Leu Val 335 Glu Gin Ser Phe Lys 415 Gin Phe Giu Gin Ala 495 Gin Leu Giu Giu Gin 240 Lys Leu Arg Ser Gin 320 Leu Gin Leu Gin Cys 400 Giu Ala Ala Cys Cys 480 Ser Tyr Thr Ile Ala Asp Asn Gin Glu Gin Asp Leu 500 505 WO 98/10066 Ser Gin Lys 515 Phe Leu Gin 530 Leu Ser Thr 545 Arg Thr Phe Glu Ser Thr Arg Val Ala 595 Giu Glu Ser 610 Leu Cys Ser 625 Gin Arg Lys Gin His Gin Asn Gin Ala 675 Val Ile Gin 690 Giy Glu Leu 705 Leu Arg Arg Giy Gin Leu Giu Lys Vai 755 Leu Giu Met 770 His Arg Asn 785 Glu Lys Leu Ile Pro Giu Phe Leu Ser 835 PCT/US97/13944 Leu Leu Arg Leu Leu Thr Glu Gin Leu Gin Ser Leu Thr 525 Thr Ala Leu Pro 580 Ser L eu Leu Ile Giu 660 Leu Gin Ile Ala Asp 740 Trp Lys Ile Asp Val1 Lys Cys Gly 565 Val1 Met Al a Leu Cys 645 Val1 Cys Gin Ser Glu 725 Ser Leu Asn Leu Asp 805 Val1 Leu Pro 550 Ser Pro Val1 Giu Gin 630 Giu Gin Leu Asn Leu 710 Thr Asn S er Giu Giu 790 Ile Arg Lys 535 Pro Ile Leu Ser Met 615 Giu Leu Lys Arg Giu 695 Arg Giu Cys Gin Lys 775 Giu Val1 Gly Glu Thr Leu Leu Leu 600 Ser Ser Gin Ala Tyr 680 Lys Glu Thr Gin Glu 760 Giu Asn Gin Cys Lys Gin Thr Gly 585 Gin Ile Lys Vali Lys 665 Lys Ile Giu Lys Pro 745 Val1 Lys Leu His Arg Thr Giu Aia 570 Ser Pro Met Giu Arg 650 Glu Asn Leu Val Val1 730 Met Asp Leu Arg Ile 810 Glu Glu His 555 Val1 Asp Ala Thr Glu 635 Leu Aia Glu Glu Thr 715 Leu Al a Lys Met Arg 795 Tyr Leu Gin 540 Pro Ala Lys Glu Thr 620 Ala Gin Asp Lys Gin 700 His Gin Thr Leu Ile 780 Ser Lys Gin Giu Leu Asp Ser Thr 605 Giu Ile Aia Ile Giu 685 Ile Leu Glu Asn Arg 765 Lys Asp Thr Gly Thr Pro Glu Ala 590 Pro Leu Arg Gin Giu 670 Leu Asp Thr Ala Trp 750 Val1 Phe 
Lys Leu Leu Leu Asn Giu 575 Phe Gly Gin Thr Giu 655 Lys Gin Lys Arg Leu 735 Ile Met Gin Giu Leu 815 Leu Leu Asp 560 Pro Thr Met Ser Leu 640 Glu Leu Giu Ser Ser 720 Ala Gin Phe Ser Leu 800 Ser Glu WO 98/10066 PCT/US97/13944 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1191 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 34..1191 (xi) SEQUENCE DESCRIPTION: SEQ ID GAACTAGTGG ATCCCCCGGG CTGCAGGAAT TCG GCA CGA GGC GGC GCC GAA GAG Ala Arg Gly Gly Ala Glu Glu GCG ACT GAG Ala Thr Glu TTT GAA ATT Phe Glu Ile GCC GGA CGG GGC Ala Gly Arg Gly GGC ACA ATG GAA Gly Thr Met Glu 30 ATG TTG TGT AAC Met Leu Cys Asn CGG CGA Arg Arg CGC AGC CCG Arg Ser Pro CGG CAG AAG Arg Gln Lys CTA GGG GTG Leu Gly Val AAA GCA Lys Ala GCT GGA ATT TGT Ala Gly Ile Cys CAA TCA AAT GAT Gln Ser Asn Asp

GGG

Gly

GAT

Asp

TCT

Ser

CAA

Gln 45 ATT CTT CAA CAT Ile Leu Gln His TCA TTG GAA GAG Ser Leu Glu Glu GGC TCA AAT TGT Gly Ser Asn Cys GAA GGC AGT GAC Glu Gly Ser Asp

GGT

Gly

TTT

Phe GGC ACA AGT AAC Gly Thr Ser Asn 65 ATA ACA GAG AAC Ile Thr Glu Asn

CAT

His

GAT

Asp GCA TAC TGC Ala Tyr Cys CGA ACA GAT Arg Thr Asp CAA GAA TCA AGA Gln Glu Ser Arg 95 CCT GAT GGT CAG Pro Asp Gly Gln

GAG

Glu AGG AAT TTG GTG Arg Asn Leu Val ATC CCT GGG GGA Ile Pro Gly Gly AGC CCA Ser Pro GAA GCT Glu Ala

GAA

Glu

CCC

Pro 105

AAA

Lys 120

AAC

Asn 105 GAA AAA ACT TTA GGA Glu Lys Thr Leu Gly 125 ACC CTT TCA ACC CCA Thr Leu Ser Thr Pro 110

AAA

Lys CAA GAT TCA Gln Asp Ser GTT TTA TTA Val Leu Leu

GAG

Glu 115 100

TGC

Cys AAC AGG AAC Asn Arg Asn

GAA

Glu CTG ATG CAA GCC CTA Leu Met Gln Ala Leu 135 GCT CTC TGT AAG AAA Ala Leu Cys Lys Lys GAG GAG AAG CTG Glu Glu Lys Leu WO 98/10066 PCT/US97/13944 TAT GCT GAT Tyr Ala Asp ATC CTG CAG Ile Leu Gin 170

CTT

Leu 155

AAG

Lys CTG GAG GAG AGC Leu Glu Glu Ser AGT GTT CAG AAG Ser Val Gin Lys

CAA

Gin 165

GTT

Val ATG AAG Met Lys CAC TTG His Leu AAG CAA GCC Lys Gin Ala

CAG

Gin 175

ATC

Ile GTG AAA GAG Val Lys Glu

AAA

Lys 180

AAG

Lys CAG AGT Gin Ser

CTT

Leu 200

ATG

Met GAA CAT AGC AAG Glu His Ser Lys AGA GAA CTT CAG Arg Glu Leu Gin 205 CAG GCA CGA GAG Gin Ala Arg Glu

GCT

Ala 190

CGT

Arg TTG GCA AGA Leu Ala Arg CAC AAT AAG His Asn Lys

ACG

Thr 210

AGC

Ser 195

TTA

Leu

ATA

Ile CTA GAA TCT Leu Glu Ser

CAG

Gin 220

ACC

Thr GAA GAA GAA CGA CGT Glu Glu Glu Arg Arg 225 AAT GAA ATT CAA GCC Asn Glu Ile Gin Ala AAG GAG GAA AAT Lys Glu Glu Asn 215 GAA GCA ACT GCA Glu Ala Thr Ala 230 726 CAT TTC CAG His Phe Gin GAC ATC CAC Asp Ile His 250 AAG CTA AAG Lys Leu Lys

ATT

Ile 235

AAC

Asn

TTA

Leu GCC AAA CTC CGA Ala Lys Leu Arg 255 240

CAG

Gin CAG CTG GAG CAG CAT Gin Leu Glu Gin His 245 ATT GAG CTG GGG GAG Ile Glu Leu Gly Glu GAA AAC Glu Asn 822

GAT

Asp 280

AAA

Lys 265

AAG

Lys

CTG

Leu AAG CTC ATC GAA Lys Leu Ile Glu 270 TTC AAA CAT AAG Phe Lys His Lys

CAG

Gln

GAA

Glu TAC GCA CTG AGG Tyr Ala Leu Arg 275 CTG CAA CAG CAG Leu Gln Gln Gln 260

GAA

Glu GAG CAC ATT Glu His Ile

GTG

Val 285 CAG CAA ACG ACA Gln Gln Thr Thr CAA CTG ATA Gln Leu Ile 290

GAA

Glu CTC GTG GAT GCC Leu Val Asp Ala 295 GAT GAA AAA CAT Asp Glu Lys His

GCT

Ala 310 CAG AGA GAG Gln Arg Glu AAA TAC GAA Lys Tyr Glu 330 TCT CTT TAT Ser Leu Tyr

AGA

Arg 315

CAA

Gln

ATG

Met TTT TTA TTA Phe Leu Leu

AAA

Lys 320

GAA

Glu GCG ACA GAA TCG AGG CAC Ala Thr Glu Ser Arg His 325 CAA CTA AAA CAG CAG CTT Gln Leu Lys Gln Gln Leu 870 918 966 1014 1062 1110 1158 ATG AAA CAG CAA Met Lys Gln Gln 335 GAT AAG TTT GAA Asp Lys Phe Glu

GTA

Val GAA TTC CAG ACT Glu Phe Gln Thr 355 AGA CAG GAA ATG Arg Gln Glu Met 370 340

ACC

Thr ATG GCA AAA Met Ala Lys

AGC

Ser 360 GAA CTG TTT Glu Leu Phe

ACA

Thr 365

TTC

Phe GAA AAG ATG Glu Lys Met AAG AAA ATT AAA AAA AAA AAA AAA AAA CTC GAG Lys Lys Ile Lys Lys Lys Lys Lys Lys Leu Glu 380 385 1191

INFORMATION FOR SEQ ID NO:6:
SEQUENCE CHARACTERISTICS:
LENGTH: 386 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Ala Arg Gly Gly Ala Glu Glu Ala Thr Glu 1 Arg Gly Ser Asn Asn Glu Asp Leu Leu 145 Ser Val Ala Lys Arg 225 Gln Arg Ile Asn 50 Lys Arg Ile Ser Leu 130 Ala Val Lys Arg Thr 210 Arg Ala Ser Cys Asp His Asn Pro Glu 115 Leu Ala Gln Glu Ser 195 Leu Ile Gln Pro Gly Ile Ser Leu Gly 100 Cys Met Leu Lys Lys 180 Lys Lys Glu Leu 5 Arg Leu Leu Leu Val Gly Asn Gln Cys Gln 165 Val Leu Glu Ala Glu 245 Gln Gly Gln Glu 70 Ser Glu Arg Ala Lys 150 Met His Glu Glu Thr 230 Gln Lys Val His 55 Glu Pro Ala Asn Leu 135 Lys Lys Leu Ser Asn 215 Ala His Phe Lys 40 Gln Asp Ala Arg Lys 120 Asn Tyr Ile Gln Leu 200 Met His Asp Glu 25 Ala Gly Glu Tyr Thr 105 Glu Thr Ala Leu Ser 185 Cys Gln Phe Ile 10 Ile Asp Ser Gly Cys 90 Asp Lys Leu Asp Gln 170 Glu Arg Gln Gln His Ala Gly Met Asn Ser 75 Thr Pro Thr Ser Leu 155 Lys His Glu Ala Ile 235 Gly Thr Leu Cys Asp Gln Pro Leu Thr 140 Leu Lys Ser Leu Arg 220 Thr Arg Met Cys Gly Phe Glu Asp Gly 125 Pro Glu Gln Lys Gln 205 Glu Leu Gly Glu Asn Gly Ile Ser Gly 110 Lys Glu Glu Ala Ala 190 Arg Glu Asn Gly Glu Ser Thr Thr Arg Gln Glu Glu Ser Gln 175 Ile His Glu Glu Arg Ala Gln Ser Glu Glu Gln Val Lys Arg 160 Ile Leu Asn Glu Ile 240 Asn Ala Lys Leu Arg Gln Glu Asn Ile Glu 260 Ala Leu Arg Glu 275 Gln Gln Gln Leu 290 Lys Glu Ala Asp 305 Glu Ala Thr Glu Val Gln Leu Lys 340 Phe Gln Thr Thr 355 Gln Glu Met Glu 370 Leu Glu 385 Leu Glu Val Glu Ser 325 Gln Met Lys Gly His Asp Lys 310 Arg Gln Ala Met Glu Ile Ala 295 His His Leu Lys Thr 375 Lys Asp 280 Lys Gln Lys Ser Ser 360 Lys Leu 265 Lys Leu Arg Tyr Leu 345 Asn Lys Lys Val Gln Glu Glu 330 Tyr Glu Ile Lys Phe Gln Arg 315 Gln Met Leu Lys Leu Lys Thr 300 Glu Met Asp Phe Lys 380 Ile His 285 Thr Phe Lys Lys Thr 365 Lys Glu 270 Lys Gln Leu Leu Gln 335 Glu Phe Lys Tyr Leu Ile Lys 320 Glu Glu Arg Lys
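
The coordinates quoted in the listing can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the specification: it assumes the CDS location 34..1191 is 1-based and inclusive, uses only the first 54 bases printed in the 1191-base-pair listing, and relies on a deliberately partial codon table. The function names `cds_codon_count` and `translate_prefix` are hypothetical and introduced only for illustration.

```python
# Illustrative sketch only: consistency check of the figures quoted in the
# sequence listing. Names below are hypothetical, not from the specification.

# Values quoted in the listing.
CDS_START, CDS_END = 34, 1191   # LOCATION: 34..1191 (assumed 1-based, inclusive)
PROTEIN_LENGTH = 386            # LENGTH: 386 amino acids (SEQ ID NO:6)

# First bases of the 1191 bp cDNA as printed in the listing, spaces removed.
SEQ_START = (
    "GAACTAGTGG ATCCCCCGGG CTGCAGGAAT TCG "
    "GCA CGA GGC GGC GCC GAA GAG"
).replace(" ", "")

# Partial codon table (standard genetic code), covering only the codons used here.
CODONS = {
    "GCA": "Ala", "GCC": "Ala",
    "CGA": "Arg",
    "GGC": "Gly",
    "GAA": "Glu", "GAG": "Glu",
}


def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a 1-based, inclusive CDS interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate_prefix(seq: str, cds_start: int, table: dict) -> list:
    """Translate the complete codons of seq starting at the 1-based cds_start."""
    coding = seq[cds_start - 1:]
    usable = len(coding) - len(coding) % 3
    return [table[coding[i:i + 3]] for i in range(0, usable, 3)]


if __name__ == "__main__":
    print(cds_codon_count(CDS_START, CDS_END) == PROTEIN_LENGTH)
    print(" ".join(translate_prefix(SEQ_START, CDS_START, CODONS)))
```

Under these assumptions the interval 34..1191 spans 1158 bases, or 386 codons, which agrees with the 386-residue length stated for SEQ ID NO:6, and the first seven codons after the 33-base leader read Ala Arg Gly Gly Ala Glu Glu, as printed in the listing.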

Claims

1. An isolated nucleic acid sequence that encodes a BRCA1 Modulator Protein wherein said sequence is selected from the group consisting of 091-21A31, Sequence ID No. 1, 091-1F84, Sequence ID No. 3 and 091-132Q20, Sequence ID No.

2. Isolated host cells comprising an isolated nucleic acid sequence of claim 1 that encodes a BRCA1 Modulator Protein.

3. Vectors that comprise an isolated nucleic acid sequence of claim 1 that encodes a BRCA1 Modulator Protein.

4. An isolated nucleic acid sequence that encodes a protein encoded by the cDNA on deposit with the ATCC with accession no. 98141 (091-1F84, Sequence ID No. 3).

5. An isolated nucleic acid sequence that encodes a protein encoded by the cDNA on deposit with the ATCC with accession no. 98142 (091-21A31, Sequence ID No. 1).

6. An isolated nucleic acid sequence that encodes a protein encoded by the cDNA on deposit with the ATCC with accession no. 98143 (091-132Q20, Sequence ID No.

7. An isolated process for producing a BRCA1 Modulator Protein comprising culturing a cell of claim 2 in a suitable culture medium and isolating said protein from said cell or said medium.

DATED: 10 May, 2001
PHILLIPS ORMONDE FITZPATRICK
Attorneys for: ONYX PHARMACEUTICALS, INC