WO1998039480A1 - Methods and compositions for identifying expressed genes
- Publication number
- WO1998039480A1 (PCT/US1998/004094)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- samples
- primer
- primers
- nucleic acid
- Prior art date
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- the present invention relates to the identification of expressed genes, and in particular, methods and compositions for distinguishing between the expression of genes in two or more biological samples.
- HGP: Human Genome Project
- a subtracted cDNA library contains cDNA clones corresponding to mRNAs present in one sample and not present in another (e.g. present in a particular species, tissue or cell and not present in another species, tissue or cell). See generally, Current Protocols in Molecular Biology, Section 5.8.9 (1990). In the protocol, cDNA containing the gene(s) of interest ("+cDNA") is prepared with EcoRI ends and the cDNA not containing the gene(s) of interest ("-cDNA") is prepared with blunt ends.
- the +cDNA is mixed with a 50-fold excess of -cDNA inserts and the mixture is heated to make the DNA single-stranded. Thereafter, the mixture is cooled to allow for hybridization. Annealed cDNA inserts are ligated to a vector and transfected.
- the only +cDNA likely to be double-stranded with an EcoRI site at each end are those not hybridized to something in the -cDNA preparation; in other words, where a complementary sequence is in the -cDNA preparation, the sequence will not be transfected.
- sequences unique to the +cDNA preparation will be cloned and amplified.
- DDRT-PCR: differential display of mRNAs using arbitrarily primed polymerase chain reaction
- the polymerase chain reaction is described by Mullis, et al., in U.S. Patent Nos. 4,683,195, 4,683,202 and 4,965,188, hereby incorporated by reference.
- the PCR process consists of introducing a molar excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence. The two primers are complementary to their respective strands of the double-stranded sequence. The mixture is denatured and then allowed to hybridize.
- the primers are extended with a thermostable DNA polymerase so as to form complementary strands.
- the steps of denaturation, hybridization, and polymerase extension can be repeated as often as needed to obtain a relatively high concentration of a segment of the desired target sequence.
- the target is mRNA; the mRNA is, however, treated with reverse transcriptase in the presence of oligo(dT) primers to make cDNA prior to the PCR process.
- the PCR is carried out with random primers in combination with the oligo(dT) primer used for cDNA synthesis.
- the amplified products are placed in side-by-side lanes of a gel; following electrophoresis, the products can be compared or "differentially displayed."
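The side-by-side comparison of lanes amounts to asking which band sizes appear in one lane but not the other. Below is a minimal sketch, assuming each lane has already been reduced to a list of estimated band sizes in nucleotides; the function name `differential_bands` and the `tolerance` parameter are hypothetical.

```python
def differential_bands(lane_a, lane_b, tolerance=2):
    """Split band sizes into (only in lane A, only in lane B, shared).

    Two bands are treated as the same product when their sizes differ
    by no more than `tolerance` nucleotides (an assumed sizing error).
    Shared bands are reported using the sizes measured in lane A.
    """
    def has_match(size, lane):
        return any(abs(size - other) <= tolerance for other in lane)

    only_a = sorted(s for s in set(lane_a) if not has_match(s, lane_b))
    only_b = sorted(s for s in set(lane_b) if not has_match(s, lane_a))
    shared = sorted(s for s in set(lane_a) if has_match(s, lane_b))
    return only_a, only_b, shared


if __name__ == "__main__":
    print(differential_bands([150, 300, 452], [151, 300, 610]))
```

Bands reported as unique to one lane are the candidates for differentially expressed transcripts.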
- the present invention relates to the identification of expressed genes, and in particular, methods and compositions for distinguishing between the expression of genes in two or more biological samples.
- the present invention employs oligonucleotide primers targeting conserved motifs within each expressed gene.
- the present invention contemplates first and second oligonucleotide primers, said first oligonucleotide primer specific for the highly conserved Kozak sequence present before the translation initiating first methionine codon and said second oligonucleotide primer containing sequence complementary to a specific restriction endonuclease recognition site.
- the specificity of the oligonucleotide primers can be enhanced by the presence of degenerate bases 5' and 3' of the target sequence, thus allowing PCR to be performed at a higher annealing temperature, which in turn provides sufficient specificity to generate reproducible patterns of bands on a sequencing gel. This reproducibility enables the method of the present invention
- the present invention contemplates applying the method for the study of functional genomics and for analyzing the differentially expressed genes in various cell types. It is not intended that the present invention be limited by the nature of the sample.
- "sample" and "specimen" in the present specification and claims are used in their broadest sense. On the one hand they are meant to include a specimen or culture. On the other hand, they are meant to include both biological and environmental samples. These terms encompass all types of samples obtained from humans and other animals, including but not limited to, body fluids such as urine, blood, fecal matter, cerebrospinal fluid (CSF), semen, and saliva, cells as well as solid tissue (including both normal and diseased tissue). These terms also refer to swabs and other sampling devices which are commonly used to obtain samples for culture of microorganisms.
- it may be desirable to differentiate between normal and cancerous tissue.
- the present invention may be used to differentiate between cancer tissue that is metastatic and cancer tissue that is non-metastatic.
- the present invention may be used to detect drug resistance.
- it may be desirable to simply detect the presence or absence of specific pathogens (or pathogenic variants) in a clinical sample.
- it may be desirable to distinguish one species or strain from another.
- the present invention contemplates comparing the expressed genes of two samples suspected to be different species.
- a species that is suspected to have changed or diverged from the parent species is compared with the parent species.
- a species or strain of bacteria may develop a different susceptibility to a drug (e.g., antibiotics) as compared to the parent species; rapid identification of the specific species or subspecies aids diagnosis and allows initiation of appropriate treatment.
- the present invention contemplates a method of analyzing nucleic acid in a sample, comprising: a) providing: i) a sample containing nucleic acid, ii) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on a portion of said nucleic acid of said sample, iii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said nucleic acid of said sample, and iv) a polymerase and PCR reagents; b) preparing said nucleic acid from said sample under conditions so as to produce amplifiable nucleic acid; c) amplifying said nucleic acid with said first and second primers, said polymerase and said
- said sample comprises eukaryotic cells and said natural common sequence is the Kozak sequence.
- said sample comprises prokaryotic cells and said natural common sequence is the Shine-Dalgarno sequence.
- said detecting comprises gel electrophoresis.
- the present invention can be used with particular success when comparing samples.
- the present invention contemplates a method of analyzing expressed genes in biological samples, comprising: a) providing: i) two samples containing mRNA, ii) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on at least a portion of said mRNA of said two samples, iii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said mRNA of said two samples, and iv) a polymerase and PCR reagents; b) treating said mRNA of each of said two samples under conditions so as to produce amplifiable DNA from each sample; c) amplifying said DNA from each sample with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated from each of said two samples; d) detecting said amplified product.
- each of said two samples comprises eukaryotic cells and said natural common sequence is the Kozak sequence.
- dissimilar samples can be usefully compared.
- said two samples comprise prokaryotic cells and said natural common sequence is the Shine-Dalgarno sequence, and said two samples comprise bacterial cells of different species. It is not intended that the present invention be limited by the number of samples compared.
- the present invention contemplates a method of analyzing expressed genes in multiple samples, comprising: a) providing: i) at least two samples containing mRNA, ii) random primers, iii) reverse transcriptase, iv) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on a portion of said mRNA of said samples, v) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said mRNA of said samples, and vi) a polymerase and PCR reagents; b) extracting mRNA from each of said samples and reverse transcribing said mRNA with said reverse transcriptase and said random primers under conditions such that cDNA is produced; c) amplifying said cDNA from each sample with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated
- kits containing these novel compositions.
- the kit comprises: i) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence, and ii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence.
- said natural common sequence is the Kozak sequence.
- said natural common sequence is the Shine-Dalgarno sequence.
- said restriction enzyme recognition sequence is selected from the group consisting of the sequences set forth in Table 1.
- primers are contemplated.
- said first primer is of the general formula:
- the present invention also contemplates said second primer is of the general formula: 5 X N M0 -X-N
- the recognition sequences can be selected from a variety of sources, including but not limited to those in Table 1.
- "Nucleic acid sequence" and "nucleotide sequence" as used herein refer to an oligonucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded, and represent the sense or antisense strand.
- the term "recombinant DNA molecule" as used herein refers to a DNA molecule which is comprised of segments of DNA joined together by means of molecular biological techniques.
- "recombinant protein" or "recombinant polypeptide" as used herein refers to a protein molecule which is expressed using a recombinant DNA molecule.
- "vector" and "vehicle" are used interchangeably in reference to nucleic acid molecules that transfer DNA segment(s) from one cell to another.
- "expression vector" or "expression cassette" as used herein refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism.
- Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences.
- Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.
- in operable combination refers to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced.
- the term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.
- transfection refers to the introduction of foreign DNA into cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, biolistics (i.e., particle bombardment) and the like.
- the terms “complementary” or “complementarity” are used in reference to “polynucleotides” and “oligonucleotides” (which are interchangeable terms that refer to a sequence of nucleotides) related by the base-pairing rules.
- the sequence “C-A-G-T” is complementary to the sequence “G-T-C-A.”
- Complementarity can be “partial” or “total.”
- Partial complementarity is where one or more nucleic acid bases is not matched according to the base pairing rules.
- “Total” or “complete” complementarity between nucleic acids is where each and every nucleic acid base is matched with another base under the base pairing rules.
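The base-pairing rules and the distinction between partial and total complementarity can be illustrated with a short sketch. The helper names `complement` and `complementarity` are illustrative only. As in the example in the text, the two sequences are compared position by position, which corresponds to writing the second strand in the opposite (antiparallel) orientation.

```python
# Watson-Crick base-pairing rules for DNA.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIR[base] for base in seq.upper())


def complementarity(seq_a: str, seq_b: str) -> float:
    """Fraction of positions at which seq_b pairs with seq_a.

    1.0 corresponds to total complementarity; any value below 1.0
    but above 0.0 is partial complementarity.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matched = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if PAIR.get(a) == b
    )
    return matched / len(seq_a)


if __name__ == "__main__":
    print(complement("CAGT"))               # the example used in the text
    print(complementarity("CAGT", "GTCA"))  # total
    print(complementarity("CAGT", "GTCC"))  # partial
```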
- the degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods which depend upon binding between nucleic acids.
- homology, when used in relation to nucleotide sequences, refers to a degree of complementarity with other nucleotide sequences. There may be partial homology or complete homology (i.e., identity).
- a nucleotide sequence which is partially complementary, i.e., “substantially homologous,” to a nucleic acid sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency.
- a substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target sequence under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction.
- the absence of non-specific binding may be tested by the use of a second target sequence which lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of nonspecific binding the probe will not hybridize to the second non-complementary target.
- Low stringency conditions comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5X Denhardt's reagent [50X Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 µg/ml denatured salmon sperm DNA, followed by washing in a solution comprising 5X SSPE, 0.1% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
- for low stringency conditions, factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol), as well as components of the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions.
- conditions which promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).
- substantially homologous refers to any probe which can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.
- substantially homologous refers to any probe which can hybridize (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.
- hybridization is used in reference to the pairing of complementary nucleic acids using any process by which a strand of nucleic acid joins with a complementary strand through base pairing to form a hybridization complex.
- Hybridization and the strength of hybridization are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the T_m of the formed hybrid, and the G:C ratio within the nucleic acids.
- hybridization complex refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary G and C bases and between complementary A and T bases; these hydrogen bonds may be further stabilized by base stacking interactions.
- the two complementary nucleic acid sequences hydrogen bond in an antiparallel configuration.
- a hybridization complex may be formed in solution (e.g., C0t or R0t analysis) or between one nucleic acid __.
- T_m is used in reference to the "melting temperature."
- the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
- the equation for calculating the T_m of nucleic acids is well known in the art.
- T_m = 81.5 + 0.41(% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl [see e.g., Anderson and Young,
- stringency is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. "Stringency" typically occurs in a range from about T_m - 5°C (5°C below the T_m of the probe) to about 20°C to 25°C below T_m. As will be understood by those of skill in the art, a stringent hybridization can be used to identify or detect identical polynucleotide sequences or to identify or detect similar or related polynucleotide sequences.
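The melting-temperature estimate and the stringency range stated above translate directly into a few lines of arithmetic. The sketch below applies the formula as given (81.5 + 0.41 × %G+C, for a nucleic acid in aqueous solution at 1 M NaCl); the function names are illustrative assumptions, and the estimate is not meant for other salt conditions or for short oligonucleotides.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleic acid sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def melting_temperature(seq: str) -> float:
    """Estimated melting temperature in degrees C: 81.5 + 0.41 * (%G+C)."""
    return 81.5 + 0.41 * gc_percent(seq)


def stringency_window(tm: float) -> tuple:
    """Temperature range from 25 degrees C below the melting temperature
    up to 5 degrees C below it, following the range given in the text."""
    return (tm - 25.0, tm - 5.0)


if __name__ == "__main__":
    tm = melting_temperature("GGCCAATT")  # 50% G+C
    print(tm, stringency_window(tm))
```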
- amplifiable nucleic acid is used in reference to nucleic acids which may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid” will usually comprise "sample template.”
- sample template refers to nucleic acid originating from a sample which is analyzed for the presence of a target sequence of interest.
- In contrast, background template is used in reference to nucleic acid other than sample template which may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
- Amplification is defined as the production of additional copies of a nucleic acid sequence and is generally carried out using polymerase chain reaction technologies well known in the art [Dieffenbach CW and GS Dveksler (1995) PCR Primer, a Laboratory .__
- With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment).
- any oligonucleotide sequence can be amplified with the appropriate set of primer molecules.
- the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
- "PCR reagents" or "PCR materials", which herein are defined as all reagents necessary to carry out amplification except the polymerase, primers and template.
- PCR reagents normally include nucleic acid precursors (dCTP, dTTP, etc.) and buffer.
- the term "primer" refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
- the primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
- the primer is an oligodeoxyribonucleotide.
- the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
- probe refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to another oligonucleotide of interest.
- a probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences.
- any probe used in the present invention will be labelled with any "reporter molecule,” so that it is detectable using any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
- "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
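Locating the specific nucleotide sequence recognized by a restriction enzyme is a plain substring search. As a minimal sketch, the function below reports every position of a recognition sequence on one strand; the default `GAATTC` is the well-known EcoRI recognition site, used here only as an example, and the function name `find_sites` is hypothetical. The recognition sequences of Table 1 are not reproduced here.

```python
def find_sites(dna: str, recognition_site: str = "GAATTC") -> list:
    """Return the 0-based start positions of a recognition sequence.

    Overlapping occurrences are reported. Only the given strand is
    searched, which is sufficient for palindromic sites such as GAATTC.
    """
    dna = dna.upper()
    site = recognition_site.upper()
    if not site:
        raise ValueError("recognition_site must not be empty")
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(find_sites("AAGAATTCGGGAATTCTT"))
```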
- DNA molecules are said to have "5' ends" and "3' ends" because mononucleotides are reacted to make oligonucleotides in a manner such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring.
- an end of an oligonucleotide is referred to as the "3' end" if its 3' oxygen is not linked to a 5' phosphate of another mononucleotide pentose ring.
- a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends.
- discrete elements are referred to as being "upstream" or 5' of the "downstream" or 3' elements. This terminology reflects the fact that transcription proceeds in a 5' to 3' fashion along the DNA strand.
- the promoter and enhancer elements which direct transcription of a linked gene are generally located 5' or upstream of the coding region. However, enhancer elements can exert their effect even when located 3' of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3' or downstream of the coding region.
- an oligonucleotide having a nucleotide sequence encoding a gene means a nucleic acid sequence comprising the coding region of a gene, i.e. the nucleic acid sequence which encodes a gene product.
- the coding region may be present in either a cDNA, genomic DNA or RNA form.
- the oligonucleotide may be single-stranded (i.e., the sense strand) or double-stranded.
- Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc.
- the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.
- regulatory element refers to a genetic element which controls some aspect of the expression of nucleic acid sequences.
- a promoter is a regulatory element which facilitates the initiation of transcription of an operably linked coding region.
- Other regulatory elements are splicing signals, polyadenylation signals, termination signals, etc.
- Transcriptional control signals in eukaryotes comprise "promoter" and "enhancer" elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription [Maniatis, T. et al., Science 236:1237 (1987)]. Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in plant, yeast, insect and mammalian cells and viruses (analogous control elements, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on what cell type is to be used to express the protein of interest.
- Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site [Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor
- a commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.
- Efficient expression of recombinant DNA sequences in eukaryotic cells requires expression of signals directing the efficient termination and polyadenylation of the resulting transcript. Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length.
- the term "poly A site" or "poly A sequence" as used herein denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded.
- the poly A signal utilized in an expression vector may be "heterologous" or "endogenous.”
- An endogenous poly A signal is one that is found naturally at the 3' end of the coding region of a given gene in the genome.
- a heterologous poly A signal is one which is isolated from one gene and placed 3' of another gene.
- transfection or "transfected” refers to the introduction of foreign DNA into a cell.
- As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.
- antisense is used in reference to RNA sequences which are complementary to a specific RNA sequence (e.g., mRNA).
- Antisense RNA may be produced by any method, including synthesis by splicing the gene(s) of interest in a reverse orientation to a viral promoter which permits the synthesis of a coding strand. Once introduced into a cell, this transcribed strand combines with natural mRNA produced by the cell to form duplexes. These duplexes then block either the further transcription of the mRNA or its translation. In this manner, mutant phenotypes may be generated.
- the term “antisense strand” is used in reference to a nucleic acid strand that is complementary to the "sense” strand.
- the designation (-) (i.e., "negative") is sometimes used in reference to the antisense strand, with the designation (+) sometimes used in reference to the sense (i.e., "positive") strand.
- Southern blot refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size, followed by transfer and immobilization of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane.
- the immobilized DNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect DNA species complementary to the probe used.
- the DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support.
- Southern blots are a standard tool of molecular biologists [J. Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58].
- Northern blot refers to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect RNA species complementary to the probe used.
- Northern blots are a standard tool of molecular biologists [J. Sambrook et al. (1989) supra, pp 7.39-7.52].
- reverse Northern blot refers to the analysis of DNA by electrophoresis of DNA on agarose gels to fractionate the DNA on the basis of size followed by transfer of the fractionated DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane.
- the immobilized DNA is then probed with a labeled oligo-ribonucleotide probe or RNA probe to detect DNA species complementary to the ribo probe used.
- isolated when used in relation to a nucleic acid, as in “an isolated oligonucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is nucleic acid present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA which are found in the state they exist in nature.
- the term “purified” or “to purify” refers to the removal of undesired components from a sample.
- substantially purified refers to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% free, and most preferably 90% free from other components with which they are naturally associated.
- An "isolated polynucleotide” is therefore a substantially purified polynucleotide.
- the term “coding region” when used in reference to a structural gene refers to the nucleotide sequences which encode the amino acids found in the nascent polypeptide as a result of translation of a mRNA molecule.
- the coding region is bounded, in eukaryotes, on the 5' side by the nucleotide triplet "ATG" which encodes the initiator methionine and on the 3' side by one of the three triplets which specify stop codons (i.e., TAA, TAG, TGA).
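The boundaries just described (an initiator ATG on the 5' side and an in-frame TAA, TAG or TGA on the 3' side) can be located with a simple scan. This is a simplified sketch that works on the sense-strand DNA sequence only and ignores introns; the function name `first_coding_region` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_coding_region(dna: str):
    """Return the first ATG-to-stop region found, both codons included.

    The sequence is scanned 5' to 3' for an ATG; triplets are then read
    in frame until a stop codon is reached. If that ATG has no in-frame
    stop, the search continues from the next ATG. Returns None when no
    bounded coding region exists.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(first_coding_region("CCATGGCTTGATAA"))
```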
- structural gene refers to a DNA sequence coding for RNA or a protein.
- regulatory genes are structural genes which encode products which control the expression of other genes (e.g., transcription factors).
- gene means the deoxyribonucleotide sequences comprising the coding region of a structural gene and including sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA.
- sequences which are located 5' of the coding region and which are present on the mRNA are referred to as 5' non-translated sequences.
- the sequences which are located 3' or downstream of the coding region and which are present on the mRNA are referred to as 3' non-translated sequences.
- the term "gene” encompasses both cDNA and genomic forms of a gene.
- a genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed "introns” or “intervening regions” or “intervening sequences.”
- Introns are segments of a gene which are transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or "spliced out" from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript.
- genomic forms of a gene may also include sequences located on both the 5' and 3' end of the sequences which are present on the RNA transcript. These sequences are referred to as "flanking" sequences or regions (these flanking sequences are located 5' or 3' to the non-translated sequences present on the mRNA transcript).
- the 5' flanking region may contain regulatory sequences such as promoters and enhancers which control or influence the transcription of the gene.
- the 3' flanking region may contain sequences which direct the termination of transcription, posttranscriptional cleavage and polyadenylation.
- sample as used herein is used in its broadest sense and includes environmental and biological samples.
- Environmental samples include material from the environment such as soil and water.
- Biological samples may be animal, including human, fluid (e.g., blood, plasma and serum), solid (e.g., stool), tissue, liquid foods (e.g., milk), and solid foods (e.g., vegetables).
- "bacteria" and "bacterium" refer to all prokaryotic organisms, including those within all of the phyla in the Kingdom Procaryotae. It is intended that the term encompass all microorganisms considered to be bacteria including Mycoplasma, Chlamydia, Actinomyces, Streptomyces, and Rickettsia. All forms of bacteria are included within this definition including cocci, bacilli, spirochetes, spheroplasts, protoplasts, etc. Also included within this term are prokaryotic organisms which are gram negative or gram positive.
- Gram negative and gram positive refer to staining patterns with the Gram-staining process which is well known in the art [Finegold and Martin, Diagnostic Microbiology, 6th Ed. (1982), CV Mosby St. Louis, pp 13-15].
- Gram positive bacteria are bacteria which retain the primary dye used in the Gram stain, causing the stained cells to appear dark blue to purple under the microscope.
- Gram negative bacteria do not retain the primary dye used in the Gram stain, but are stained by the counterstain. Thus, gram negative bacteria appear red.
- K primer partially hybridized to one strand of a denatured double-stranded template.
- Figure 2 schematically shows one embodiment of the primers of the present invention (an "RE Primer") partially hybridized to the other strand of denatured double-stranded target DNA.
- Figure 3 is an autoradiograph of PAGE showing differential expression in a variety of human cell types.
- Figure 4 is an autoradiograph of PAGE showing differential expression in a variety of species of bacteria.
- Figure 5 is an autoradiograph of PAGE showing differential expression in a variety of human cell types where differentially expressed bands have been obtained and cloned.
- Figure 6 shows the nucleic acid sequence of one of the cloned transcripts encoding a human mitochondrial hinge protein.
- Figure 7 shows the sequence of one of the cloned transcripts corresponding to a coactivator gene.
- Figure 8 is an autoradiograph of PAGE showing differential expression in normal and malignant tissue.
- the present invention relates to the identification of expressed genes, and in particular, methods and compositions for distinguishing between the expression of genes in two or more biological samples.
- the description of the invention involves the I) Design of the Primers; II) Preparation of RNA from Samples; and III) Comparing of Biological Samples.
- differentially expressed genes ideally one must be able to identify nearly all of the expressed genes (or at least a significant majority of them) in a cell type; only then can a meaningful comparison be made with a related cell or tissue sample.
- the present invention contemplates the use of specific primers able to anneal with sequences which are conserved in expressed genes.
- the present invention contemplates primers directed at the Kozak sequence, a string of non-random nucleotides present before the translation-initiating first ATG in the majority of the mRNAs which are transcribed and translated in eukaryotic cells. See M. Kozak, Cell 44:283-292 (1986).
- an oligonucleotide primer specific for the Kozak sequence (consensus sequence 5'-GCC(A/G)CCATGG-3') with degenerate bases at its 5' and 3' ends will provide sufficient specificity to be used in a PCR amplification reaction as an upstream primer.
- a second primer (an "RE primer") can be designed. Again, the presence of degenerate bases at the 5' and 3' end of these primers would provide length sufficient to give specificity in a PCR amplification reaction. Since the ability of a primer pair to amplify a transcript is a function of transcript abundance and the specificity of primer-template interactions, the use of K and RE-primers is likely to significantly improve the detection rate of rare mRNAs, an outcome not possible with standard or modified differential display methods because of the use of random primers.
- M. Kozak performed an analysis of nearly 700 vertebrate mRNAs. See M. Kozak, "An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs," Nucleic Acids
- the present invention therefore contemplates primers which can specifically hybridize with the nucleotide sequences present around the initiating codon. Collectively, these primers would hybridize with all of the expressed mRNAs although the hybridization of individual primers within an expressed gene pool may vary. This would help in reducing the complexity of the target transcripts by effectively dividing the transcript pool into subsets based on the presence of the nucleotides with reference to the ATG in the mRNA sequence.
- degenerate bases can be used i) before the consensus Kozak sequence at the 5' end, ii) inside the Kozak sequence (e.g. at position -5) and/or iii) after the ATG at the 3' end.
- the primers are selected from the group consisting of the primers: NNN-X-GCC(A or G)CCATGGNN; NNN-X-GCC(A or G)CCATGANN; NNN-X-GCC(A or G)CCATGCNN; and NNN-X-GCC(A or G)CCATGTNN (wherein X is either a recognition sequence or nothing, and wherein N is either A, T, G, C or nothing).
- This embodiment contains primers that vary at the +4 position.
- by "recognition sequence" it is meant that the sequence is a known sequence that can be targeted by a) nucleic acid hybridization (e.g. poly(dT) or poly(dA)), b) an enzyme (e.g. a restriction enzyme), or c) a ligand (e.g. biotin or avidin).
- Preferred primers are those where X is the recognition sequence for a restriction enzyme; introducing this sequence into expressed genes facilitates subsequent manipulation (e.g. cloning).
- preferred primers are those where X is the recognition sequence for the restriction enzyme BamHI; these primers are selected from the group consisting of NNNGGATCCGCC(A or G)CCATGGNN; NNNGGATCCGCC(A or G)CCATGANN; NNNGGATCCGCC(A or G)CCATGCNN; and NNNGGATCCGCC(A or G)CCATGTNN (wherein N is either A, T, G, C or nothing).
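As a rough illustration of how primers that differ only at the +4 position could divide a transcript pool into subsets, the following minimal Python sketch groups sequences by the base found immediately after a Kozak-like ATG. It is not part of the described method: the function name `classify_by_plus4` and the toy sequences are hypothetical, and the search uses only the fixed core GCC(A or G)CCATG, ignoring the optional degenerate N positions and any recognition sequence X.

```python
import re

# Core shared by the K primers in the text: GCC(A or G)CCATG, followed by
# one base at the +4 position. Degenerate N positions are ignored here.
KOZAK_CORE = "GCC[AG]CCATG"


def classify_by_plus4(transcripts):
    """Group transcript names by the base at +4 after a Kozak-like ATG.

    transcripts: dict mapping a name to a sequence written in the DNA
    alphabet (sense strand). Names without a match are grouped under None.
    """
    pattern = re.compile(KOZAK_CORE + "([ACGT])")
    groups = {}
    for name, seq in transcripts.items():
        match = pattern.search(seq.upper())
        base = match.group(1) if match else None
        groups.setdefault(base, []).append(name)
    return groups


# Hypothetical toy sequences, invented purely for illustration.
toy = {
    "t1": "TTGCCACCATGGCTAAA",
    "t2": "AAGCCGCCATGAGTCCC",
    "t3": "CCCCCCCCATGTTT",
}
groups = classify_by_plus4(toy)
print(groups)
```

An exact regular-expression match is much stricter than primer annealing, which the text says tolerates partial hybridization, so the grouping should be read only as a picture of the subset idea.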
- Table 1 sets forth, for illustrative purposes, a number of restriction enzyme recognition sequences.
- "X" can be selected from this list depending on design considerations.
- Other restriction enzymes from commercially available sources have recognition sequences that can also be employed with success.
- Primers containing facilitating moieties such as recognition sequences allow for the introduction of such sequences into the product of the amplification reaction. That is to say, amplification in PCR involves primer extension to make the so-called "long products." These long products are the template for subsequent cycles of amplification. While it is not intended that the present invention be limited by any understanding of the mechanism whereby the primers of the present invention successfully operate, it is believed that a primer such as NNN-X-GCC(A or G)CCATGGNN will only partially hybridize to one strand of the denatured double-stranded target nucleic acid in the first round as set forth in Figure 1.
- the present invention contemplates using a lower annealing temperature (discussed more below).
- the present invention also contemplates isolating the long products via the recognition sequence prior to subsequent cycles.
- the long products are isolated using an oligo (dT) resin; the long products containing the corresponding recognition sequence bind to the resin, while the background template nucleic acid does not. In this manner, the background template can be removed and subsequent rounds of hybridization are carried out on the long products [with the same primers or with the primers that lack the recognition sequence (but that are otherwise the same)].
- the primers are selected from the group consisting of the primers: NNN-X-GCC(A or G)CCATGG(C or A)GNN; NNN-X-GCC(A or G)CCATGG(C or A)TNN; NNN-X-GCC(A or G)CCATGG(C or A)ANN; and NNN-X-GCC(A or G)CCATGG(C or A)CNN (wherein X is either a recognition sequence or nothing, and wherein N is either
- This embodiment contains primers with the consensus sequence extending to the +5 position, but that vary at the +6 position.
- the present invention contemplates primers where there are many degenerate bases after the ATG at the 3' end (e.g. between three and ten, more preferably between three and five) as well as where there is only one degenerate base after the ATG at the 3' end.
- the primers are selected from the group consisting of the primers: GCC(A or G)CCATGN (wherein N is either A,T,G or C). These primers can be linked to a recognition sequence ("X") in the manner described above, if desired.
- the present invention also contemplates primers where there are a number of degenerate bases at the 5' end (i.e. prior to the Kozak sequence).
- the primers are selected from the group consisting of the primers: N₁₋₁₀GCC(A or G)CCATGGNN; N₁₋₁₀GCC(A or G)CCATGANN; N₁₋₁₀GCC(A or G)CCATGCNN; and N₁₋₁₀GCC(A or G)CCATGTNN (wherein N is either A, T, G or C).
- the primers are selected from the group consisting of the primers: CGGGATCCGCC(A or G)CNATGG (hereinafter "K1" when N is C); CGGGATCCGCC(A or G)CNATGA (hereinafter "K2" when N is C); CGGGATCCGCC(A or G)CNATGC (hereinafter "K3" when N is C); and CGGGATCCGCC(A or G)CNATGT (hereinafter "K4" when N is C).
- the primers are selected from the group consisting of the primers: CGGGATCCGCC(A or G)(C or G)NATGG (hereinafter "K-2-1" when N is C); CGGGATCCGCC(A or G)(C or G)NATGC (hereinafter "K-2-2" when N is C); CGGGATCCGCC(A or G)(C or G)NATGT (hereinafter "K-2-3" when N is C); and CGGGATCCGCC(A or G)CNATGA (hereinafter "K-2-4" when N is C).
- the primers are selected from the group consisting of the primers: CGGGATCCGCC(A or G)(C or G)NATGGN (hereinafter "K-3-1" when N is C); CGGGATCCGCC(A or G)(C or G)NATGCN (hereinafter "K-3-2"); CGGGATCCGCC(A or
- N can be A, C, G or T.
- the primer of the present invention can be only partially complementary to this natural common non-coding sequence.
- the present invention contemplates linking the ATG triplet to degenerate bases on either side (or both sides).
- a recognition sequence ("X") can be linked to such a primer on the 5' end.
- the primers are of the general formula: 5'-N₁₋₁₀-X-N₁₋₁₀-ATG-N₁₋₁₀-3' (wherein N is A, T, G, C or nothing).
- X is the recognition sequence for a restriction enzyme; again, introducing this sequence into expressed genes facilitates subsequent manipulation (e.g. cloning).
- preferred primers are those where X is the recognition sequence for the restriction enzyme BamHI; these primers are selected from the group consisting of NGGATCCNNNATGA; NGGATCCNNNATGC; NGGATCCNNNATGT and NGGATCCNNNATGG (wherein N is either A, T, G, C or nothing).
- primer extension or PCR of DNA using K primers also contemplates hybridization of the K primers to the corresponding mRNA Kozak sequence: 5'-ACCAUGG.
- primers can be made having the ACCAUGG sequence that can be used to hybridize to DNA.
- the present invention contemplates downstream primers designed with recognition sequences for common restriction enzymes (hereafter "RE" primers).
- the RE primers are designed with degenerate bases on either side (or both sides) of the recognition sequence.
- the RE primer is designed with 3 degenerate bases at the 5' end and 2 degenerate bases at the 3' end (5'-N₃-specific recognition sequence-N₂-3').
- the downstream primers of the present invention are primers selected from the group consisting of the primers: 5'-X-NNNGATC-3' (i.e. having the recognition sequence for MboI); 5'-X-NNNCTAG-3' (i.e. having the recognition sequence for BfaI); 5'-X-NNNCCGC-3' (i.e. having the recognition sequence for AciI); 5'-X-NNNCCGG-3' (i.e. having the recognition sequence for HpaII); and 5'-X-NNNAATT-3' (i.e. having the recognition sequence for Tsp509I), wherein X is a recognition sequence on the 5' end that is different from the recognition sequence of the 3' end (or X is nothing).
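To make the role of the four-base recognition sequences concrete, the short Python sketch below scans a sequence for the five sites named above, which is where the 3' end of the corresponding RE primer could anneal. It is only an illustration: the helper `site_positions` and the toy cDNA are hypothetical, and only the strand given is scanned (this matters for the non-palindromic AciI site).

```python
# Recognition sequences as listed in the text for the RE primers.
RE_SITES = {
    "MboI": "GATC",
    "BfaI": "CTAG",
    "AciI": "CCGC",
    "HpaII": "CCGG",
    "Tsp509I": "AATT",
}


def site_positions(dna, site):
    """Return every 0-based start index of `site` in `dna`, overlaps included."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


# Hypothetical toy cDNA, invented purely for illustration.
toy_cdna = "ATGGCCGGATCAATTCCGCTAG"
hits = {name: site_positions(toy_cdna, site) for name, site in RE_SITES.items()}
print(hits)
```

Because a given four-base site is expected roughly once every 256 bases in random sequence, most transcripts of ordinary length would contain at least one such site, which is consistent with the text's choice of common restriction enzyme sequences for the downstream primer.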
- the recognition sequence on the 5' end of the downstream primers of the present invention is for EcoRI.
- primers are selected from the group consisting of the primers: GAATTCNNNGATC; GAATTCNNNCTAG; GAATTCNNNCCGC; GAATTCNNNCCGG; GAATTCNNNAATT; GAATTCNNNTTAA; and GAATTCNNNGCGC.
- the recognition sequence on the 5' end of the downstream primers of the present invention is for BamHI.
- Such primers are selected from the group consisting of the primers: GGATTCCNNNGATC (hereinafter "MboI primer"); GGATTCCNNNCTAG (hereinafter "BfaI primer"); GGATTCCNNNCCGC (hereinafter "AciI primer"); GGATTCCNNNCCGG (hereinafter "HpaII primer"); and GGATTCCNNNAATT (hereinafter "Tsp509I primer").
- Primers containing facilitating moieties such as 5' recognition sequences of the RE primers of the present invention allow for the introduction of such sequences into the product of the amplification reaction.
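Each RE primer written with NNN is really a pool of oligonucleotides. The sketch below, offered only as an illustration, enumerates that pool for one primer from the list above; the helper name `expand_degenerate` is hypothetical, and N is taken strictly as A, C, G or T.

```python
from itertools import product


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a primer in which N = A, C, G or T."""
    # Each position contributes either its fixed base or the four choices for N.
    choices = [("ACGT" if base == "N" else base) for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("GGATTCCNNNGATC")
print(len(variants))   # 4 ** 3 = 64 members in the pool
print(variants[0])     # GGATTCCAAAGATC
```

With 10 pmoles of a primer carrying three N positions, each individual sequence therefore makes up on average 1/64 of the pool.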
- amplification in PCR involves primer extension to make the so-called “long products.” These long products are the template for subsequent cycles of amplification. While it is not intended that the present invention be limited by any understanding of the mechanism whereby the primers of the present invention successfully operate, it is believed that a primer such as X-NNNGATC will only partially hybridize to one strand of the denatured double-stranded target nucleic acid in the first round as set forth in Figure 2.
- primers of the present invention be limited by the precise sequence of a restriction recognition sequence. Indeed, it is specifically contemplated that the primers of the present invention can be only partially complementary to the recognition sequence.
- the prokaryotic mRNA ribosome binding site usually contains part or all of a polypurine domain UAAGGAGGU known as the Shine-Dalgarno (SD) sequence found just 5' to the translation initiation codon: mRNA 5'-UAAGGAGGU-N₅₋₁₀-AUG
- the present invention therefore contemplates primers containing this motif (in a manner similar to the Kozak motif discussed above).
- An oligonucleotide primer specific for the SD sequence (with or without degenerate bases at its 5' and 3' end) will provide sufficient specificity to be used in a PCR amplification reaction as an upstream primer.
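The SD arrangement can be pictured with a small pattern search. This Python sketch is an illustration under stated assumptions rather than part of the described method: it assumes a spacer of 5 to 10 nucleotides between the motif and the start codon, works in the DNA alphabet (T in place of U), requires the full TAAGGAGG motif used in the SD primer, and uses a hypothetical helper name and toy sequence.

```python
import re

# TAAGGAGG, then an assumed spacer of 5-10 bases, then the ATG start codon.
# The lazy quantifier reports the shortest spacer that reaches an ATG.
SD_SITE = re.compile(r"TAAGGAGG([ACGT]{5,10}?)ATG")


def find_sd_sites(dna):
    """Return (start_index, spacer_length) for each SD motif followed by ATG."""
    return [(m.start(), len(m.group(1))) for m in SD_SITE.finditer(dna.upper())]


# Hypothetical toy sequence, invented purely for illustration.
toy_gene = "CCTAAGGAGGTACGTTATGAAACGC"
sites = find_sd_sites(toy_gene)
print(sites)
```

Real ribosome binding sites often contain only part of the motif, as the text notes, so an exact-match search like this one would miss many genuine sites.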
- Taq DNA polymerase adds an A to the 3' end of such PCR products and this can be used to clone by virtue of commercially available ligation kits (e.g. from Promega).
- a second primer (an "RE primer") can be designed for use with the SD primer. Again, the presence of degenerate bases at the 5' and 3' end of these primers would provide length sufficient to give specificity in a PCR amplification reaction.
- the SD primers of the present invention are of the general formula: 5'-N₁₋₁₀-X-N₁₋₁₀-TAAGGAGG-N₁₋₁₀-3' (where X is a recognition sequence or nothing, and where N is A, T, G, C or nothing).
- the recognition sequence (X) is a restriction enzyme recognition sequence; such sequences can be selected from Table 1 or other known lists of such sequences.
- the recognition sequence can be a region of nucleic acid that can be targeted by hybridization or by a ligand. Such recognition sequences can be used to separate the products of the first cycles of PCR (as discussed above).
- the recognition sequence is a restriction enzyme recognition sequence
- a preferred sequence is that for the enzyme EcoRI.
- the SD primers are selected from the group of the general formula:
- the present invention contemplates linking a portion of the SD sequence (e.g. AGGAGG) to degenerate bases on either side (or both sides) to create a useful primer. It is also contemplated that the SD primers of the present invention need not hybridize completely to the target nucleic acid. In the manner set forth in Figure 1 for K primers, it is contemplated that the primer can be extended even though portions of the primer are not hybridized.
- the nucleic acid content of cells consists of deoxyribonucleic acid (DNA) and ribonucleic acid (RNA).
- DNA contains the genetic blueprint of the cell.
- RNA is involved as an intermediary in the production of proteins based on the DNA sequence. RNA exists in three forms within cells: structural RNA (i.e., ribosomal RNA, "rRNA"), transfer RNA ("tRNA"), which is involved in translation, and messenger RNA ("mRNA").
- the cell's mRNA component at any given time is representative of the physiological state of the cell. In order to study and utilize the molecular biology of the cell, it is therefore important to be able to purify mRNA, including purifying mRNA from the total nucleic acid of a sample.
- RNA is complicated by the presence of ribonucleases that degrade RNA (e.g., T. Maniatis et al., Molecular Cloning, pp. 188-190, Cold Spring Harbor Laboratory [1982]). Furthermore, the preparation of amplifiable RNA is made difficult by the presence of ribonucleoproteins in association with RNA. (See R. J. Slater, In:
- the steps involved in purification of nucleic acid from cells include 1) cell lysis; 2) inactivation of cellular nucleases; and 3) separation of the desired nucleic acid from the cellular debris and other nucleic acid.
- Cell lysis may be achieved through various methods, including enzymatic, detergent or chaotropic agent treatment.
- Inactivation of cellular nucleases may be achieved by the use of proteases and/or the use of strong denaturing agents.
- separation of the desired nucleic acid is typically achieved by extraction of the nucleic acid with phenol or phenol-chloroform; this method partitions the sample into an aqueous phase (which contains the nucleic acids) and an organic phase (which contains other cellular components, including proteins).
- Commonly used protocols require the use of salts in conjunction with phenol (P. Chomczynski and N. Sacchi, Anal. Biochem. 162:156 [1987]).
- the structure of the mRNA molecule may be used to assist in the purification of mRNA from DNA and other RNA molecules. Because the mRNA of higher organisms is usually polyadenylated on its 3' end
- poly-A tail or "poly-A track”
- poly-A track one means of isolating RNA from cells has been based on binding the poly-A tail with its complementary sequence (i.e., oligo-dT), that has been linked to a support such as cellulose.
- oligo-dT its complementary sequence
- the hybridized mRNA/oligo-dT is separated from the other components present in the sample through centrifugation or, in the case of magnetic formats, exposure to a magnetic field.
- the mRNA is usually removed from the oligo-dT. However, for some applications, the mRNA may remain bound to the oligo-dT that is linked to a solid support.
- RNA Ribonucleic acid
- mammalian e.g. liver tissue
- the present invention contemplates the isolation of PolyA+ RNA from extracts, including direct isolation from crude extracts.
- the present invention may be used to compare normal tissue with cancer tissue, as well as to differentiate between cancer tissue that is metastatic and cancer tissue that is non-metastatic.
- the present invention may be used to detect drug resistance.
- metastatic disease it is believed that cancer cells proteolytically alter basement membranes underlying epithelia or the endothelial linings of blood and lymphatic vessels, invade through the defects created by proteolysis, and enter the circulatory or lymphatic systems to colonize distant sites. During this process, the secretion of proteolytic enzymes is coupled with increased cellular motility and altered adhesion. After their colonization of distant sites, metastasizing tumor cells proliferate to establish metastatic nodules.
- the present invention can be used to compare metastatic cancer tissue with non- metastatic cancer tissue to identify differentially expressed genes as markers of metastatic potential. Thereafter, the present invention can be used to determine the presence or absence of these markers in various clinical cancer isolates.
- the present invention also contemplates "phenotyping" cancer cells adapted to tissue culture.
- differentially expressed genes as markers of drug resistance. Thereafter, the present invention can be used to determine the presence or absence of these markers in various clinical cancer isolates.
- microorganisms recovered from clinical specimens or environmental sources is an important aspect of clinical microbiology, as this information is important to physicians in making decisions related to methods of treatment.
- reproducible systems for identifying microorganisms are critical.
- Finegold "The primary purpose of nomenclature of microorganisms is to permit us to know as exactly as possible what another clinician, microbiologist, epidemiologist, or author is referring to when describing an organism responsible for infection of an individual or outbreak" (S. Finegold. "Introduction to summary of current nomenclature, taxonomy, and classification of various microbial agents," Clin. Infect. Dis., 16:597 [1993]).
- Classification, nomenclature, and identification are three separate, but interrelated aspects of taxonomy. Classification is the arranging of organisms into taxonomic groups (i.e., taxa) on the basis of similarities or relationships. A multitude of prokaryotic organisms has been identified, with great diversity in their types, and many more organisms are being characterized and classified on a regular basis. It is a matter of convenience to classify the organisms into groups based upon their similarities. Classification has been used to organize the seemingly chaotic array of individual bacteria into an orderly framework. Through use of a classification framework, a new isolate can more easily be characterized by comparison with known organisms. The choice of criteria for placement into groups is somewhat arbitrary, although most classifications are based on phylogenetic relationships.
- rRNA Ribosomal RNA sequence analysis
- molecular probes and amplification methods e.g., PCR
- the test DNA is denatured and exposed to denatured DNA of known sequence from a particular organism.
- the amount of hybridization between the test DNA and known DNA provides an indication of the degree of relatedness between the test and known organisms.
- An important drawback to this approach is that hybridization between two single DNA strands can occur even when 15% of the sequences are not complementary.
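The tolerance for non-complementary positions can be made concrete with a small calculation. The Python sketch below is a simplification offered for illustration only: it compares two equal-length strands position by position with no gaps, which is far cruder than real hybridization behavior, and the function names and example strands are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatch_fraction(strand_a, strand_b):
    """Fraction of positions at which strand_b fails to pair with strand_a.

    Both strands are written 5' to 3' and must have the same non-zero length.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    # strand_b pairs antiparallel to strand_a, so compare against its reverse complement.
    partner = reverse_complement(strand_b)
    mismatches = sum(a != b for a, b in zip(strand_a.upper(), partner))
    return mismatches / len(strand_a)


perfect = mismatch_fraction("ACGTACGTAC", "GTACGTACGT")
one_off = mismatch_fraction("ACGTACGTAC", "GTACGTACGA")
print(perfect, one_off)
```

By this measure, two 100-base strands differing at up to 15 positions would still fall within the 15% figure quoted above, which is why hybridization alone gives only a coarse indication of relatedness.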
- Ribosomal RNA analysis is another method by which the relatedness of organisms has been determined. Because ribosomes are critical to cellular function and interact with many other molecules (e.g., mRNA and tRNAs), the core rRNA sequences are highly constrained and well-conserved throughout evolution.
- rRNA also contains highly variable regions, it is usually possible to identify regions of 20-30 bases that are unique to a particular species. While analyzing sequence differences between the rRNAs of different organisms, this approach is extremely narrow in that it looks at no other differences between organisms.
- identification of an organism is based on its overall morphological and biochemical patterns observed in culture.
- numerous organisms associated with disease may not be cultured in vitro. Indeed, some do not grow well in traditional in vivo culture systems, such as cell cultures or embryonated eggs. Nonetheless, their detection and identification is crucial for the appropriate treatment of affected individuals.
- Genetic testing methods have proven useful for the classification and identification of such organisms. For example, universal ribosomal primers designed to hybridize to and amplify all bacterial rRNA may be used to detect bacteria in any sterile body site (e.g., synovial fluid). Once detected, the organism may then be identified by sequencing and/or amplification methods, and comparing the results with those obtained from known organisms. While this method has led to the identification and classification of various organisms that were historically not cultivable, it is again limited in its focus on rRNA.
- the present invention can be used to identify genes unique to a particular species, subspecies or strain. Unlike the above-described currently used genetic approaches, the present invention is not limited to any particular genes or gene sequences (e.g. rRNA sequences).
- the present invention contemplates comparing the expressed genes of two samples suspected to be different species.
- a species that is suspected to have changed or diverged from the parent species is compared with the parent species.
- a species or strain of bacteria may develop different susceptibilities to a drug (e.g. antibiotics) as compared to the parent species: rapid identification of the specific species or subspecies aids diagnosis and allows initiation of appropriate treatment.
- RNAse-free DNAse-1 RQ-1 DNAse, Promega, Madison, WI
- phenol-Chloroform Sigma Chemical Company, St. Louis, MO
- cDNA used for the PCR reaction can be made in a variety of ways. However, in the examples below, single stranded cDNA (sscDNA) was synthesized using 1 μg of total RNA or 100 ⁇ g of mRNA with random primers according to the instructions supplied with a commercially available kit (Superscript, BRL-GIBCO, Gaithersburg, MD).
- the reverse transcriptase enzyme was inactivated by heating at 94°C for 15 min.
- PCR conditions can vary depending on desired outcome. Nonetheless, unless otherwise indicated, the conditions used were as follows. First, the amount of cDNA used in each PCR amplification reaction was empirically determined; 2-5 ng of sscDNA gave satisfactory results. Second, the PCR reactions were set up in precooled 0.2 ml thin-walled tubes on ice and contained 50 mM Tris-HCl (pH 8.5), 50 mM KCl, 1.5 mM MgCl₂, 1 mM of each dNTP, 2-5 ng of sscDNA, 10 pmoles of a K-primer, 10 pmoles of an RE-primer, 0.5 μl of α-³³P dCTP (10 μCi/μl, Amersham) and water to 20 μl.
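The amounts in the reaction mix translate into concentrations and totals by simple unit arithmetic, sketched below in Python. The figures used (10 pmoles of each primer, 1.5 mM MgCl₂, 1 mM of each dNTP, 0.5 μl of label at 10 μCi/μl, 20 μl final volume) are read from the conditions above; the function names are hypothetical.

```python
def micromolar(pmol, volume_ul):
    """Concentration in micromolar: 1 pmol per microliter equals 1 micromolar."""
    return pmol / volume_ul


def nanomoles(millimolar, volume_ul):
    """Amount in nanomoles: 1 mM in 1 microliter equals 1 nmol."""
    return millimolar * volume_ul


REACTION_VOLUME_UL = 20

primer_um = micromolar(10, REACTION_VOLUME_UL)      # each primer, in micromolar
mgcl2_nmol = nanomoles(1.5, REACTION_VOLUME_UL)     # MgCl2 per reaction
dntp_nmol_each = nanomoles(1, REACTION_VOLUME_UL)   # each dNTP per reaction
label_uci = 0.5 * 10                                # 0.5 ul at 10 uCi/ul

print(primer_um, mgcl2_nmol, dntp_nmol_each, label_uci)
```

Each primer is thus present at 0.5 μM and each reaction carries 5 μCi of label; the same two helpers can be reused when scaling to a different reaction volume.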
- the mixture can be subjected to PCR cycles in different ways.
- the first cycle (or even the first few cycles) involves a lower annealing temperature than the annealing temperature in subsequent cycles.
- an annealing temperature of between approximately 34°C and approximately 44°C, and more preferably between approximately 36°C and approximately 40°C, and most preferably approximately 38°C (for approximately 30 seconds), can be used for the first cycle (or even the first few cycles).
- the subsequent cycles of denaturation, annealing and extension can involve a higher temperature.
- annealing temperature is between approximately 40°C and approximately 60°C, more preferably between approximately 44°C and approximately 54°C, and most preferably approximately 48°C (for approximately 30 sec).
- the annealing temperature is approximately the same temperature for all cycles.
- the above-described mixture is subjected to 35 cycles of denaturation, annealing and extension wherein the annealing temperature is between approximately 38°C and approximately 40°C (for approximately 30 seconds).
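The two cycling embodiments (a lower annealing temperature for the first cycle or cycles followed by a higher one, or a single temperature throughout) can be written as a per-cycle schedule. The Python sketch below is only an illustration: the defaults of 38°C and 48°C are the "most preferably" values from the text, while applying 35 total cycles to the two-temperature embodiment is an assumption, and denaturation and extension temperatures are omitted because the text does not state them.

```python
def annealing_schedule(total_cycles=35, low_cycles=1, low_temp_c=38, high_temp_c=48):
    """Return the annealing temperature (in degrees C) for each cycle, in order.

    The first `low_cycles` cycles use `low_temp_c`; the rest use `high_temp_c`.
    Setting low_cycles == total_cycles gives a constant-temperature program.
    """
    if not 0 <= low_cycles <= total_cycles:
        raise ValueError("low_cycles must be between 0 and total_cycles")
    return [low_temp_c] * low_cycles + [high_temp_c] * (total_cycles - low_cycles)


schedule = annealing_schedule()                 # one permissive cycle, then 48 C
constant = annealing_schedule(low_cycles=35)    # 38 C for all 35 cycles
print(schedule[:3], len(schedule))
```

Keeping the permissive step to the first cycle or two matches the rationale given earlier: the primer only partially hybridizes to the original template, whereas the long products made in later cycles carry the full primer sequence.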
- RNA available commercially from Clontech
- the total RNA was reverse transcribed using 6-mer random primers (available from Pharmacia).
- the resultant cDNA was subjected to thirty-five cycles of PCR (in the presence of a radioactive precursor) using a mixture of two anchor primers ("K2" for Figure 3A and "K3" for Figure 3B) and restriction enzyme-based primers [for this experiment, the recognition sequence on the 5' end of the RE downstream primers was for EcoRI; the primer sequences were: GAATTCNNNGT(A or C)(G or T)AC (lanes 1-4); GAATTCNNNCGGC (lanes 5-8); GAATTCNNN(A or G)GCGC(C or T) (lanes 9-12); GAATTCNNNTTAA (lanes 13-16)].
- the PCR products were analyzed by PAGE using 6% sequencing gels (BRL) and visualized by autoradiography. The results show a large number of bands (see Figures 3A and 3B). Importantly, there is differential expression of transcripts in the various cell types.
- Bacterial DNA was prepared by standard methods. 10-50 ng of genomic DNA from E. coli and P. stuartii (in the first and second lane, respectively, of each two lane group in Figure 4) was subjected to thirty-five cycles of PCR (in the presence of a radioactive precursor) using a mixture of anchor primers (SD-primers: 5'-GGAATTCNNN-TAAGGAGG-3') and restriction enzyme-based primers (RE-primers: 5'-GGATTCCNNNGATC (this "MboI primer" was used in lanes 1 and 2 of Figure 4); GGATTCCNNNCTAG (this "BfaI primer" was used in lanes 3 and 4 of Figure 4); GGATTCCNNNCCGC (this "AciI primer" was used in lanes 5 and 6 in Figure 4); GGATTCCNNNCCGG (this "HpaII primer" was used in lanes 7 and 8 in Figure 4); and GGATTCCNNNAATT (this "Tsp509I primer" was used in lanes 9 and 10 in Figure 4)).
- the PCR products were analyzed by PAGE using 6% sequencing gels (BRL) and visualized by autoradiography.
- EXAMPLE 3 This example describes the cloning and sequencing of expressed transcripts. Briefly,
- DNA bands representing differently expressed transcripts were identified by visual scanning of the autoradiograph and marked (Figure 5, which represents a different exposure of the experiment run in Figure 3A).
- the film was then used as a template and the marked bands were cut out and eluted in water, precipitated with 0.3M sodium acetate, pH 6.0, and 2.5 vol of ethanol, pelleted by centrifugation (12,000 x g, 20 min), washed 2X with 70% ethanol, air dried and dissolved in 10 μl of nuclease free water.
- Half of the sample was then used for reamplification using the same primer combination and PCR conditions. Amplified material was resolved on a 2% agarose gel and the size of the amplified fragments was determined with reference to DNA size standards (100 bp ladder, BRL) and the amplified DNA fragments were gel purified using a commercially available kit
- NCBI National Center for Biotechnology Information
- This example describes the comparison of normal and malignant tissues.
- a variety of cell types were studied: 1) normal human keratinocytes, 2) normal human skin, and 3-5) three squamous cell carcinoma samples from patients (in the first, second, third, fourth and fifth lane of each five lane group in Figure 8).
- the total RNA was reverse transcribed using 6-mer random primers (available from Pharmacia).
- the resultant cDNA was subjected to thirty-five cycles (all cycles were performed using annealing temperatures between 38 and 42 degrees) of PCR (in the presence of a radioactive precursor) using a mixture of two anchor primers ("K1") and restriction enzyme-based primers (RE-primers: 5'-GGATTCCNNNGATC (this "MboI primer" was used in the reactions represented by lanes 1 through 5 of Figure 8); GGATTCCNNNCTAG (this "BfaI primer" was used in the reactions represented by lanes 6 through 10 of Figure 8); GGATTCCNNNCCGC (this "AciI primer" was used in reactions represented by lanes 11 through 15 in Figure 8); GGATTCCNNNCCGG (this "HpaII primer" was used in reactions represented by lanes 16 through 20 in Figure 8); and GGATTCCNNNAATT (this "Tsp509I primer" was used in the reactions represented in lanes 21 through 25 in Figure 8)).
- the PCR products were analyzed by PAGE using 6% sequencing
- the present invention provides a convenient method for distinguishing between the expression of genes in two or more biological samples. Importantly, the method also promotes follow-up analysis once a gene of interest is identified.
Abstract
Methods and compositions are described for distinguishing between the expression of genes in two or more biological samples. The present invention employs oligonucleotide primers targeting conserved motifs within each expressed gene. The primers and method of the present invention allow for the accurate identification of differentially expressed gene(s) in various cell types.
Description
METHODS AND COMPOSITIONS FOR IDENTIFYING EXPRESSED GENES
FIELD OF THE INVENTION
The present invention relates to the identification of expressed genes, and in particular, methods and compositions for distinguishing between the expression of genes in two or more biological samples.
BACKGROUND
The initial observations of the "hybridization" process, i.e., the ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction, by Marmur and Lane, Proc.Nat.Acad.Sci., U.S.A. 46, 453 (1960) and Doty, et al., Proc.Nat.Acad.Sci., U.S.A. 46, 461 (1960), have been followed by the refinement of this process into an essential tool of modern biology. Initial hybridization studies, such as those performed by Hayashi, et al., Proc.Nat.Acad.Sci., U.S.A. 50, 664 (1963), were performed in solution. Further development led to the immobilization of the target DNA or RNA on solid supports. With the discovery of specific restriction endonucleases by Smith and Wilcox, J.Mol.Biol. 51, 379 (1970), it became possible to isolate discrete fragments of DNA. Utilization of immobilization techniques, such as those described by Southern, J.Mol.Biol. 98, 503 (1975), in combination with restriction enzymes, has allowed for the identification by hybridization of single copy genes among a mass of fractionated, genomic DNA.
With the development of these complex and powerful biological techniques, an ambitious project has been undertaken. This project, called the Human Genome Project (HGP), involves the complete characterization of the archetypal human genome sequence which comprises 3 x 10⁹ DNA nucleotide base pairs. An implicit goal of the project is to find genes that may be involved in human health.
However, humans are greater than 99% identical at the DNA sequence level. Thus, merely finding the native sequence of genes will not reveal whether the gene is important in a disease-related process. Indeed, it is the identification of the differences between people that arguably will provide the information most relevant to individual health care.
Identifying differences between biological samples is not trivial. The first approach involved the production of a so-called "subtracted cDNA library." A subtracted cDNA library contains cDNA clones corresponding to mRNAs present in one sample and not present in another (e.g., present in a particular species, tissue or cell and not present in another species, tissue or cell). See generally, Current Protocols in Molecular Biology, Section 5.8.9 (1990). In the protocol, cDNA containing the gene(s) of interest ["+cDNA"] is prepared with EcoRI ends and the cDNA not containing the gene(s) of interest ["-cDNA"] is prepared with blunt ends. The +cDNA is mixed with a 50-fold excess of -cDNA inserts and the mixture is heated to make the DNA single-stranded. Thereafter, the mixture is cooled to allow for hybridization. Annealed cDNA inserts are ligated to a vector and transfected. In theory, the only +cDNA likely to be double-stranded with an EcoRI site at each end are those not hybridized to something in the -cDNA preparation; in other words, where a complementary sequence is in the -cDNA preparation, the sequence will not be transfected. Thus, only sequences unique to the +cDNA preparation will be cloned and amplified.
The subtraction approach is tedious. Moreover, the hybridizations and library production with a small amount of cDNA are technically demanding.
A second approach to identifying differences involves the differential display of mRNAs using arbitrarily primed polymerase chain reaction (DDRT-PCR). The polymerase chain reaction is described by Mullis, et al., in U.S. Patents Nos. 4,683,195, 4,683,202 and 4,965,188, hereby incorporated by reference. Briefly, the PCR process consists of introducing a molar excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence. The two primers are complementary to their respective strands of the double-stranded sequence. The mixture is denatured and then allowed to hybridize.
Following hybridization, the primers are extended with a thermostable DNA polymerase so as to form complementary strands. The steps of denaturation, hybridization, and polymerase extension can be repeated as often as needed to obtain a relatively high concentration of a segment of the desired target sequence. In the case of DDRT-PCR, the target is mRNA; the mRNA is, however, treated with reverse transcriptase in the presence of oligo(dT) primers to make cDNA prior to the PCR process. The PCR is carried out with random primers in combination with the oligo(dT) primer used for cDNA synthesis. In theory, since only mRNA is (indirectly) amplified, only the expressed genes are amplified. Where two samples are to be compared, the amplified products are placed in side-by-side lanes of a gel; following electrophoresis, the products can be compared or "differentially displayed."
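The side-by-side comparison described above amounts to asking which band sizes appear in one lane but not in the other. A minimal sketch in Python, assuming each lane has already been reduced to a list of fragment lengths in base pairs; the function name and the example band sizes are hypothetical, and a real gel would additionally need a size tolerance and a comparison of band intensities:

```python
def differential_bands(lane_a, lane_b):
    """Return bands unique to each lane and bands shared by both.

    lane_a and lane_b are iterables of fragment sizes in base pairs.
    """
    a, b = set(lane_a), set(lane_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "shared": sorted(a & b),
    }


# Hypothetical band sizes for two samples run in adjacent lanes.
result = differential_bands([150, 220, 310, 480], [150, 310, 395, 480])
print(result)
```

Bands reported as unique to one lane would be the candidates for excision and follow-up analysis.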
DDRT-PCR, while an improvement over subtractive hybridization, has a number of drawbacks. First, the use of arbitrary random primers can cause faint banding at essentially every position of the gel. Second, the process is generally biased toward high-copy number genes.
There have been some attempts to remedy these problems. For example, E. Haag, et al., "Effects of Primer Choice and Source of Taq DNA Polymerase on the Banding Patterns of Differential Display RT-PCR," Biotechniques 17:226-228 (1994) describes an improved DDRT-PCR method, whereby the use of the standard oligo-dT primer in the PCR step is omitted to decrease the faint banding at essentially every position of the electrophoresis gel. Instead, a second arbitrary primer was utilized in PCR. Another example is O.C. Ikonomov, et al., "Differential Display Protocol With Selected Primers That Preferentially Isolate mRNAs of Moderate to Low Abundance in a Microscopic System," Biotechniques 20:1030-1042 (1996); this paper describes the use of a modified DDRT-PCR protocol to increase bias towards moderate to low abundance transcripts. The authors utilized experimentally selected primer pairs directed at known coding sequences that avoid amplification of highly abundant ribosomal and mitochondrial transcripts. While such efforts have improved DDRT-PCR, the process remains unsatisfactory because of the continued amplification of material that is not of interest.
What is needed is a convenient method for distinguishing between the expression of genes in two or more biological samples. Such a method should also promote follow-up analysis once a gene of interest is identified.
SUMMARY OF THE INVENTION
The present invention relates to the identification of expressed genes, and in particular, methods and compositions for distinguishing between the expression of genes in two or more biological samples. The present invention employs oligonucleotide primers targeting conserved motifs within each expressed gene. In one embodiment, the present invention contemplates first and second oligonucleotide primers, said first oligonucleotide primer specific for the highly conserved Kozak sequence present before the translation initiating first methionine codon and said second oligonucleotide primer containing sequence complementary to a specific restriction endonuclease recognition site. It is contemplated that the specificity of the oligonucleotide primers can be enhanced by the presence of degenerate bases 5' and 3' of the target sequence, thus allowing for PCR to be performed at a higher annealing temperature, which in turn provides sufficient specificity to generate reproducible patterns of bands on a sequencing gel. This reproducibility enables the method of the present invention to accurately identify differentially expressed gene(s) in various human cell types including intermediate and low abundance transcripts. The present invention contemplates applying the method for the study of functional genomics and for analyzing the differentially expressed genes in various cell types. It is not intended that the present invention be limited by the nature of the sample.
The terms "sample" and "specimen" in the present specification and claims are used in their broadest sense. On the one hand they are meant to include a specimen or culture. On the other hand, they are meant to include both biological and environmental samples. These terms encompass all types of samples obtained from humans and other animals, including but not limited to, body fluids such as urine, blood, fecal matter, cerebrospinal fluid (CSF), semen, and saliva, cells as well as solid tissue (including both normal and diseased tissue). These terms also refer to swabs and other sampling devices which are commonly used to obtain samples for culture of microorganisms.
It is also not intended that the invention be limited by the particular purpose for carrying out the biological reactions. In one medical diagnostic application, it may be desirable to differentiate between normal and cancerous tissue. In one embodiment, the present invention may be used to differentiate between cancer tissue that is metastatic and cancer tissue that is non-metastatic. In yet another embodiment, the present invention may be used to detect drug resistance. In another medical diagnostic application, it may be desirable to simply detect the presence or absence of specific pathogens (or pathogenic variants) in a clinical sample. In yet another application, it may be desirable to distinguish one species or strain from another.
With regard to distinguishing different species, in one embodiment, the present invention contemplates comparing the expressed genes of two samples suspected to be different species. In another embodiment, a species that is suspected to have changed or diverged from the parent species is compared with the parent species. For example, a species or strain of bacteria may develop a different susceptibility to a drug (e.g., antibiotics) as compared to the parent species; rapid identification of the specific species or subspecies aids diagnosis and allows initiation of appropriate treatment. In one embodiment, the present invention contemplates a method of analyzing nucleic acid in a sample, comprising: a) providing: i) a sample containing nucleic acid, ii) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on a portion of said nucleic acid of said sample, iii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said nucleic acid of said sample, and iv) a polymerase and PCR reagents; b) preparing said nucleic acid from said sample under conditions so as to produce amplifiable nucleic acid; c) amplifying said nucleic acid with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated; d) detecting said amplified product.
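To make the geometry of the two primer targets concrete, the following Python sketch scans a sequence for a start-codon motif and the nearest downstream restriction site, and reports the span that would lie between them. It is an in-silico illustration only, under several assumptions that do not come from the specification: exact matching on the sense strand, the ATG of a Kozak-like context as the first target, the EcoRI site GAATTC as an arbitrary example of a recognition sequence, and a made-up example sequence. The function name `predict_products` is hypothetical.

```python
def predict_products(cdna, start_motif="ATG", site="GAATTC"):
    """List (start, end, length) for each start_motif and its nearest downstream site.

    Both motifs are matched exactly on the sense strand; degenerate flanking
    bases and the mismatches tolerated by real primers are ignored.
    """
    cdna = cdna.upper()
    products = []
    pos = cdna.find(start_motif)
    while pos != -1:
        hit = cdna.find(site, pos + len(start_motif))
        if hit != -1:
            end = hit + len(site)
            products.append((pos, end, end - pos))
        pos = cdna.find(start_motif, pos + 1)
    return products


# Hypothetical cDNA fragment with a Kozak-like context (GCCACCATGG)
# followed by an EcoRI site.
example = "TTGCCACCATGGCTAAAGGTCCGAATTCTT"
print(predict_products(example))
```

Each tuple gives the zero-based start of the motif, the end of the restriction site, and the length of the span between them, which is the quantity that would determine where a band runs on a gel.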
It is not intended that the present invention be limited by the nature of the non-coding sequence; the choice may depend on the type of sample. In one embodiment, said sample comprises eukaryotic cells and said natural common sequence is the Kozak sequence. In another embodiment, said sample comprises prokaryotic cells and said natural common sequence is the Shine-Dalgarno sequence.
It is not intended that the present invention be limited by the means of detection. In one embodiment, said detecting comprises gel electrophoresis. The present invention can be used with particular success when comparing samples.
In one embodiment, the present invention contemplates a method of analyzing expressed genes in biological samples, comprising: a) providing: i) two samples containing mRNA, ii) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on at least a portion of said mRNA of said two samples, iii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said mRNA of said two samples, and iv) a polymerase and PCR reagents; b) treating said mRNA of each of said two samples under conditions so as to produce amplifiable DNA from each sample; c) amplifying said DNA from each sample with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated from each of said two samples; d) detecting said amplified product.
The comparison can be made between cells of similar type. For example, in one embodiment, each of said two samples comprises eukaryotic cells and said natural common sequence is the Kozak sequence. On the other hand, dissimilar samples can be usefully compared. For example, in one embodiment, said two samples comprise prokaryotic cells and said natural common sequence is the Shine-Dalgarno sequence, and said two samples comprise bacterial cells of different species.
It is not intended that the present invention be limited by the number of samples compared. The present invention contemplates a method of analyzing expressed genes in multiple samples, comprising: a) providing: i) at least two samples containing mRNA, ii) random primers, iii) reverse transcriptase, iv) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on a portion of said mRNA of said samples, v) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said mRNA of said samples, and vi) a polymerase and PCR reagents; b) extracting mRNA from each of said samples and reverse transcribing said mRNA with said reverse transcriptase and said random primers under conditions such that cDNA is produced; c) amplifying said cDNA from each sample with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated from each of said samples; d) detecting said amplified product.
Clinical samples are specifically contemplated within the scope of the present invention. For example, where said samples comprise eukaryotic cells and said natural common sequence is the Kozak sequence, said samples can comprise human cancer cells. The present invention contemplates the primers of the present invention as unique compositions. The present invention also contemplates kits containing these novel compositions. In one embodiment, the kit comprises: i) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence, and ii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence. In one embodiment, said natural common sequence is the Kozak sequence. In another embodiment, said natural common sequence is the Shine-Dalgarno sequence. In one embodiment, said restriction enzyme recognition sequence is selected from the group consisting of the sequences set forth in Table 1.
A variety of primers are contemplated. In one embodiment, the present invention contemplates that said first primer is of the general formula:
5'-N_{0-10}-X-N_{1-10}ATGN_{0-10}-3', wherein N is A, T, G, C or nothing, and wherein X is the recognition sequence for a restriction enzyme or nothing.
The present invention also contemplates that said second primer is of the general formula: 5'-N_{1-10}-X-N_{1-10}TAAGGAGGN_{0-10}-3', where X is a recognition sequence or nothing, and where N is A, T, G, C or nothing.
Again, the recognition sequences can be selected from a variety of sources, including but not limited to those in Table 1.
DEFINITIONS To facilitate understanding of the invention, a number of terms are defined below.
"Nucleic acid sequence" and "nucleotide sequence" as used herein refer to an oligonucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded, and represent the sense or antisense strand. The term "recombinant DNA molecule" as used herein refers to a DNA molecule which is comprised of segments of DNA joined together by means of molecular biological techniques.
The term "recombinant protein" or "recombinant polypeptide" as used herein refers to a protein molecule which is expressed using a recombinant DNA molecule. As used herein, the terms "vector" and "vehicle" are used interchangeably in reference to nucleic acid molecules that transfer DNA segment(s) from one cell to another.
The term "expression vector" or "expression cassette" as used herein refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.
The terms "in operable combination", "in operable order" and "operably linked" as used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.
The term "transfection" as used herein refers to the introduction of foreign DNA into cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, biolistics (i.e., particle bombardment) and the like.
As used herein, the terms "complementary" or "complementarity" are used in reference to "polynucleotides" and "oligonucleotides" (which are interchangeable terms that refer to a sequence of nucleotides) related by the base-pairing rules. For example, the sequence "C-A-G-T" is complementary to the sequence "G-T-C-A." Complementarity can be "partial" or "total." "Partial" complementarity is where one or more nucleic acid bases is not matched according to the base pairing rules. "Total" or "complete" complementarity between nucleic acids is where each and every nucleic acid base is matched with another base under the base pairing rules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods which depend upon binding between nucleic acids.
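The base-pairing rules and the partial/total distinction can be illustrated with a short Python sketch. It compares two equal-length sequences position by position, as in the C-A-G-T / G-T-C-A example, and does not model the antiparallel orientation of real duplexes. The function names are hypothetical, and the "none" label for sequences with no matching position is an addition for completeness rather than a term from the definition.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def complementarity(seq_a, seq_b):
    """Classify two equal-length sequences as 'total', 'partial', or 'none'."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(PAIRS[a] == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    if matches == len(seq_a):
        return "total"
    return "partial" if matches > 0 else "none"


print(complement("CAGT"))               # GTCA
print(complementarity("CAGT", "GTCA"))  # total
print(complementarity("CAGT", "GTCC"))  # partial
```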
The terms "homology" and "homologous" as used herein in reference to nucleotide sequences refer to a degree of complementarity with other nucleotide sequences. There may be partial homology or complete homology (i.e., identity). A nucleotide sequence which is partially complementary, i.e., "substantially homologous," to a nucleic acid sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target sequence under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target sequence which lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.
Low stringency conditions comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5X Denhardt's reagent [50X Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5X SSPE, 0.1% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
The art knows well that numerous equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol), as well as components of the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, the art knows conditions which promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).
When used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, the term "substantially homologous" refers to any probe which can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.
When used in reference to a single-stranded nucleic acid sequence, the term "substantially homologous" refers to any probe which can hybridize to (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.
As used herein, the term "hybridization" is used in reference to the pairing of complementary nucleic acids using any process by which a strand of nucleic acid joins with a complementary strand through base pairing to form a hybridization complex. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids.
As used herein the term "hybridization complex" refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary G and C bases and between complementary A and T bases; these hydrogen bonds may be further stabilized by base stacking interactions. The two complementary nucleic acid sequences hydrogen bond in an antiparallel configuration. A hybridization complex may be formed in solution (e.g., C0t or Rot analysis) or between one nucleic acid
sequence present in solution and another nucleic acid sequence immobilized to a solid support [e.g., a nylon membrane or a nitrocellulose filter as employed in Southern and Northern blotting, dot blotting or a glass slide as employed in in situ hybridization, including FISH (fluorescent in situ hybridization)]. As used herein, the term "Tm" is used in reference to the "melting temperature." The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the Tm of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm = 81.5 + 0.41(% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl [see e.g., Anderson and Young,
Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)]. Other references include more sophisticated computations which take structural as well as sequence characteristics into account for the calculation of Tm.
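The simple estimate above can be computed directly from a sequence. The following Python sketch is a minimal illustration of the equation as quoted, Tm = 81.5 + 0.41(% G + C), for nucleic acid in aqueous solution at 1 M NaCl. The function names and the example sequence are hypothetical. Because the equation has no length term, it is probably a poor guide for short primers, for which the more sophisticated computations mentioned above would be a better fit.

```python
def gc_percent(seq):
    """Return the percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def estimate_tm(seq):
    """Simple estimate: Tm = 81.5 + 0.41 * (%G+C), in degrees Celsius."""
    return 81.5 + 0.41 * gc_percent(seq)


# Hypothetical sequence with 40% G+C content.
tm = estimate_tm("ATGCATGCAT")
print(round(tm, 1))
```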
As used herein the term "stringency" is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. "Stringency" typically occurs in a range from about Tm-5°C (5°C below the Tm of the probe) to about 20°C to 25°C below Tm. As will be understood by those of skill in the art, a stringent hybridization can be used to identify or detect identical polynucleotide sequences or to identify or detect similar or related polynucleotide sequences.
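The temperature range given in this definition follows from the Tm by simple subtraction. A minimal Python sketch, assuming the upper bound is taken as Tm minus 5°C and the lower bound as Tm minus 25°C (the outer end of the "20°C to 25°C below Tm" figure); the function name and its default offsets are illustrative choices:

```python
def stringency_window(tm_celsius, upper_offset=5.0, lower_offset=25.0):
    """Return (lowest, highest) hybridization temperature in degrees Celsius.

    The highest temperature is Tm - upper_offset; the lowest is
    Tm - lower_offset.
    """
    if lower_offset < upper_offset:
        raise ValueError("lower_offset must be at least upper_offset")
    return (tm_celsius - lower_offset, tm_celsius - upper_offset)


# For a hypothetical probe with a Tm of 80 degrees Celsius.
low, high = stringency_window(80.0)
print(low, high)
```

Temperatures nearer the top of the window favor detection of identical sequences, while temperatures nearer the bottom also admit similar or related sequences.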
As used herein, the term "amplifiable nucleic acid" is used in reference to nucleic acids which may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid" will usually comprise "sample template."
As used herein, the term "sample template" refers to nucleic acid originating from a sample which is analyzed for the presence of a target sequence of interest. In contrast, "background template" is used in reference to nucleic acid other than sample template which may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
"Amplification" is defined as the production of additional copies of a nucleic acid sequence and is generally carried out using polymerase chain reaction technologies well known in the art [Dieffenbach CW and GS Dveksler (1995) PCR Primer, a Laboratory
Manual, Cold Spring Harbor Press, Plainview NY]. As used herein, the term "polymerase chain reaction" ("PCR") refers to the method of K.B. Mullis, U.S. Patent Nos. 4,683,195 and 4,683,202, hereby incorporated by reference, which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. The length of the amplified segment of the desired target sequence is determined by the relative positions of two oligonucleotide primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be "PCR amplified".
With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
Amplification in PCR requires "PCR reagents" or "PCR materials", which herein are defined as all reagents necessary to carry out amplification except the polymerase, primers and template. PCR reagents normally include nucleic acid precursors (dCTP, dTTP, etc.) and buffer.
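Because each cycle's products serve as templates for the next, the copy number grows roughly geometrically. The Python sketch below uses the standard idealized model N = N0 x (1 + E)^n, where E is the per-cycle efficiency; this model and the function name are assumptions introduced for illustration and are not taken from the specification. Real reactions plateau as reagents are consumed, so the figures are upper bounds.

```python
def copies_after(cycles, initial_copies=1, efficiency=1.0):
    """Idealized copy number: N = N0 * (1 + efficiency) ** cycles.

    efficiency=1.0 means perfect doubling in every cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# A single starting copy after 30 cycles of perfect doubling.
print(copies_after(30))
```

Under perfect doubling, 30 cycles turn one starting copy into 2^30 copies, on the order of a billion, which is why a single-copy target can reach a detectable level.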
As used herein, the term "primer" refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact
lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labelled with any "reporter molecule," so that it is detectable using any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
DNA molecules are said to have "5' ends" and "3' ends" because mononucleotides are reacted to make oligonucleotides in a manner such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring. An end of an oligonucleotide is referred to as the "3' end" if its 3' oxygen is not linked to a 5' phosphate of another mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends. In either a linear or circular DNA molecule, discrete elements are referred to as being "upstream" or 5' of the "downstream" or 3' elements. This terminology reflects the fact that transcription proceeds in a 5' to 3' fashion along the DNA strand. The promoter and enhancer elements which direct transcription of a linked gene are generally located 5' or upstream of the coding region. However, enhancer elements can exert their effect even when located 3' of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3' or downstream of the coding region.
As used herein, the term "an oligonucleotide having a nucleotide sequence encoding a gene" means a nucleic acid sequence comprising the coding region of a gene, i.e., the nucleic acid sequence which encodes a gene product. The coding region may be present in either a cDNA, genomic DNA or RNA form. When present in a DNA form, the oligonucleotide may be single-stranded (i.e., the sense strand) or double-stranded. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.
As used herein, the term "regulatory element" refers to a genetic element which controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element which facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, etc.
Transcriptional control signals in eukaryotes comprise "promoter" and "enhancer" elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription [Maniatis, T. et al. , Science 236:1237 (1987)]. Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in plant, yeast, insect and mammalian cells and viruses (analogous control elements, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on what cell type is to be used to express the protein of interest.
The presence of "splicing signals" on an expression vector often results in higher levels of expression of the recombinant transcript. Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site [Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.7-16.8]. A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.
Efficient expression of recombinant DNA sequences in eukaryotic cells requires expression of signals directing the efficient termination and polyadenylation of the resulting transcript. Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length. The term "poly A site" or "poly A sequence" as used herein denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of
the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized in an expression vector may be "heterologous" or "endogenous." An endogenous poly A signal is one that is found naturally at the 3' end of the coding region of a given gene in the genome. A heterologous poly A signal is one which is isolated from one gene and placed 3' of another gene.
The term "transfection" or "transfected" refers to the introduction of foreign DNA into a cell.
As used herein, the terms "nucleic acid molecule encoding," "DNA sequence encoding," and "DNA encoding" refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.
As used herein, the term "antisense" is used in reference to RNA sequences which are complementary to a specific RNA sequence (e.g., mRNA). Antisense RNA may be produced by any method, including synthesis by splicing the gene(s) of interest in a reverse orientation to a viral promoter which permits the synthesis of a coding strand. Once introduced into a cell, this transcribed strand combines with natural mRNA produced by the cell to form duplexes. These duplexes then block either the further transcription of the mRNA or its translation. In this manner, mutant phenotypes may be generated. The term "antisense strand" is used in reference to a nucleic acid strand that is complementary to the "sense" strand. The designation (-) (i.e., "negative") is sometimes used in reference to the antisense strand, with the designation (+) sometimes used in reference to the sense (i.e., "positive") strand.
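By way of illustration only, the complementarity between a sense RNA and its antisense counterpart can be sketched in Python. The function name and the example sequence are assumptions made for this sketch and are not part of the disclosure.

```python
# Illustrative sketch: derive the antisense RNA for a given sense RNA.
# The helper name antisense_rna is hypothetical.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(sense_rna: str) -> str:
    """Return the antisense strand, written 5'->3', of a sense RNA written 5'->3'."""
    # Reverse the strand so the result also reads 5'->3', then complement each base.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sense_rna.upper()))


if __name__ == "__main__":
    print(antisense_rna("ACCAUGG"))
```

Applying the function twice returns the original sense sequence, which reflects the reciprocal nature of the (+) and (-) designations.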
The term "Southern blot" refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size, followed by transfer and immobilization of the
DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists [J. Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58].
The term "Northern blot" as used herein refers to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists [J. Sambrook et al. (1989) supra, pp 7.39-7.52].
The term "reverse Northern blot" as used herein refers to the analysis of DNA by electrophoresis of DNA on agarose gels to fractionate the DNA on the basis of size followed by transfer of the fractionated DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled oligo-ribonucleotide probe or RNA probe to detect DNA species complementary to the ribo probe used.
The term "isolated" when used in relation to a nucleic acid, as in "an isolated oligonucleotide" refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is nucleic acid present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA which are found in the state they exist in nature.
As used herein, the term "purified" or "to purify" refers to the removal of undesired components from a sample. As used herein, the term "substantially purified" refers to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% free, and most preferably 90% free from other components with which they are naturally associated. An "isolated polynucleotide" is therefore a substantially purified polynucleotide.
As used herein, the term "coding region" when used in reference to a structural gene refers to the nucleotide sequences which encode the amino acids found in the nascent polypeptide as a result of translation of a mRNA molecule. The coding region is bounded, in eukaryotes, on the 5' side by the nucleotide triplet "ATG" which encodes the initiator methionine and on the 3' side by one of the three triplets which specify stop codons (i.e., TAA, TAG, TGA).
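The boundaries of a coding region as defined above (an initiating ATG on the 5' side and an in-frame TAA, TAG or TGA on the 3' side) can be illustrated with a minimal Python sketch. The function name and the test sequence are hypothetical, and the sketch simply takes the first ATG that is followed by an in-frame stop codon; it is not a gene-prediction method.

```python
# Illustrative sketch: locate a coding region bounded by ATG and an in-frame stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_region(dna: str):
    """Return (start, end) indices of the first ATG-initiated open reading frame
    that ends with an in-frame stop codon, or None if there is none.
    The slice dna[start:end] includes both the ATG and the stop codon."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop for this ATG; try the next ATG downstream.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    seq = "GCCACCATGGCTTAAGG"
    bounds = coding_region(seq)
    print(bounds, seq[bounds[0]:bounds[1]])
```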
As used herein, the term "structural gene" refers to a DNA sequence coding for RNA or a protein. In contrast, "regulatory genes" are structural genes which encode products which control the expression of other genes (e.g., transcription factors).
As used herein, the term "gene" means the deoxyribonucleotide sequences comprising the coding region of a structural gene and including sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences which are located 5' of the coding region and which are present on the mRNA are referred to as 5' non-translated sequences. The sequences which are located 3' or downstream of the coding region and which are present on the mRNA are referred to as 3' non-translated sequences. The term "gene" encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed "introns" or "intervening regions" or "intervening sequences." Introns are segments of a gene which are transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or "spliced out" from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.
In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5' and 3' end of the sequences which are present on the RNA transcript. These sequences are referred to as "flanking" sequences or regions (these flanking sequences are located 5' or 3' to the non-translated sequences present on the mRNA transcript). The 5' flanking region may contain regulatory sequences such as promoters and enhancers which control or influence the transcription of the gene. The 3' flanking region may contain sequences which direct the termination of transcription, posttranscriptional cleavage and polyadenylation.
The term "sample" as used herein is used in its broadest sense and includes environmental and biological samples. Environmental samples include material from the environment such as soil and water. Biological samples may be animal, including human, fluid (e.g., blood, plasma and serum), solid (e.g., stool), tissue, liquid foods (e.g., milk), and solid foods (e.g., vegetables).
The terms "bacteria" and "bacterium" refer to all prokaryotic organisms, including those within all of the phyla in the Kingdom Procaryotae. It is intended that the term encompass all microorganisms considered to be bacteria including Mycoplasma, Chlamydia, Actinomyces, Streptomyces, and Rickettsia. All forms of bacteria are included within this definition including cocci, bacilli, spirochetes, spheroplasts, protoplasts, etc. Also included within this
term are prokaryotic organisms which are gram negative or gram positive. "Gram negative" and "gram positive" refer to staining patterns with the Gram-staining process which is well known in the art [Finegold and Martin, Diagnostic Microbiology, 6th Ed. (1982), CV Mosby St. Louis, pp 13-15]. "Gram positive bacteria" are bacteria which retain the primary dye used in the Gram stain, causing the stained cells to appear dark blue to purple under the microscope. "Gram negative bacteria" do not retain the primary dye used in the Gram stain, but are stained by the counterstain. Thus, gram negative bacteria appear red.
DESCRIPTION OF THE DRAWINGS
Figure 1 schematically shows one embodiment of the primers of the present invention (a "K primer") partially hybridized to one strand of a denatured double-stranded template.
Figure 2 schematically shows one embodiment of the primers of the present invention (an "RE Primer") partially hybridized to the other strand of denatured double-stranded target DNA.
Figure 3 is an autoradiograph of PAGE showing differential expression in a variety of human cell types.
Figure 4 is an autoradiograph of PAGE showing differential expression in a variety of species of bacteria.
Figure 5 is an autoradiograph of PAGE showing differential expression in a variety of human cell types where differentially expressed bands have been obtained and cloned.
Figure 6 shows the nucleic acid sequence of one of the cloned transcripts encoding a human mitochondrial hinge protein.
Figure 7 shows the sequence of one of the cloned transcripts corresponding to a coactivator gene.
Figure 8 is an autoradiograph of PAGE showing differential expression in normal and malignant tissue.
DESCRIPTION OF THE INVENTION
The present invention relates to the identification of expressed genes, and in particular, methods and compositions for distinguishing between the expression of genes in two or more biological samples. The description of the invention involves I) Design of the Primers; II) Preparation of RNA from Samples; and III) Comparing of Biological Samples.
I. Design of Primers
To identify differentially expressed genes, ideally one must be able to identify nearly all of the expressed genes (or at least a significant majority of them) in a cell type; only then can a meaningful comparison be made with a related cell or tissue sample. For this purpose, the present invention contemplates the use of specific primers able to anneal with sequences which are conserved in expressed genes.
In one embodiment, the present invention contemplates primers directed at the Kozak sequence, a string of non-random nucleotides which are present before the translation-initiating first ATG in the majority of the mRNAs which are transcribed and translated in eukaryotic cells. See M. Kozak, Cell 44:283-292 (1986). Thus, an oligonucleotide primer specific for the Kozak sequence (consensus sequence 5'-GCCA/GCCATGG-3') with degenerate bases at its 5' and 3' ends will provide sufficient specificity to be used in a PCR amplification reaction as an upstream primer. Additionally, the presence of degenerate bases at the 3' end of these primers (Kozak or K primers) would reduce the complexity of the copied transcript pool to which this primer may hybridize, thus allowing the primers to access and anneal with specific subsets of transcripts and overcoming the problem of "competition" in a PCR reaction.
Based on the knowledge of distribution of specific DNA sequences within the genome which are recognized by restriction endonucleases ("RE"), a second primer (an "RE primer") can be designed. Again, the presence of degenerate bases at the 5' and 3' end of these primers would provide length sufficient to give specificity in a PCR amplification reaction. Since the ability of a primer pair to amplify a transcript is a function of transcript abundance and the specificity of primer-template interactions, the use of K and RE-primers is likely to significantly improve the detection rate of rare mRNAs, an outcome not possible with standard or modified differential display methods because of the use of random primers.
A. Specific Design Considerations
1. Kozak Primers (Upstream Primers)
M. Kozak performed an analysis of nearly 700 vertebrate mRNAs. See M. Kozak, "An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs," Nucleic Acids Research 15:8125 (1987). The results provide a general approximation of the frequency of A, C, G and T around the translational start site in vertebrate mRNAs:
Position    -6    -5    -4    -3    -2    -1    +4
%A          17    18    25    61    27    15    23
%C          19    39    53     2    49    55    16
%G          44    23    15    36    13    21    46
%T          20    20     7     1    11     9    15
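As a check on how the frequencies above relate to the consensus Kozak sequence, the following Python sketch picks the most frequent base at each tabulated position. The variable and function names are illustrative assumptions; the numbers are those of the table.

```python
# Illustrative sketch: derive the most frequent base at each tabulated position.
POSITIONS = [-6, -5, -4, -3, -2, -1, +4]
FREQ = {
    "A": [17, 18, 25, 61, 27, 15, 23],
    "C": [19, 39, 53, 2, 49, 55, 16],
    "G": [44, 23, 15, 36, 13, 21, 46],
    "T": [20, 20, 7, 1, 11, 9, 15],
}


def most_frequent_bases():
    """Map each position to the base with the highest percentage."""
    result = {}
    for idx, pos in enumerate(POSITIONS):
        result[pos] = max(FREQ, key=lambda base: FREQ[base][idx])
    return result


def consensus():
    """Assemble positions -6..-1, the ATG (+1 to +3), and position +4."""
    top = most_frequent_bases()
    upstream = "".join(top[pos] for pos in POSITIONS if pos < 0)
    return upstream + "ATG" + top[4]


if __name__ == "__main__":
    print(consensus())
```

The result agrees with the A-containing form of the consensus GCC(A or G)CCATGG; at position -3, G (36%) is the second most frequent base after A (61%), which is why that position is written as A or G.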
A search of the GenBank and a random selection of 100 mRNA sequences (largely human) revealed that bases at +5 position (with reference to translation initiating ATG triplet of the Kozak sequence and A being +1) are also highly conserved. These results indicated that >38% of the mRNAs surveyed had a C at this position and approximately 25% had an A at this position.
The present invention therefore contemplates primers which can specifically hybridize with the nucleotide sequences present around the initiating codon. Collectively, these primers would hybridize with all of the expressed mRNAs although the hybridization of individual primers within an expressed gene pool may vary. This would help in reducing the complexity of the target transcripts by effectively dividing the transcript pool into subsets based on the presence of the nucleotides with reference to the ATG in the mRNA sequence.
Specifically, with regard to the primers of the present invention, it is contemplated that degenerate bases can be used i) before the consensus Kozak sequence at the 5' end, ii) inside the Kozak sequence (e.g. at position -5) and/or iii) after the ATG at the 3' end. In one embodiment, the primers are selected from the group consisting of the primers: NNN-X-GCC(A or G)CCATGGNN; NNN-X-GCC(A or G)CCATGANN; NNN-X-GCC(A or G)CCATGCNN; and NNN-X-GCC(A or G)CCATGTNN (wherein X is either a recognition sequence or nothing, and wherein N is either A,T,G,C or nothing). This embodiment contains primers that vary at the +4 position.
It is not intended that the present invention be limited by the nature of the recognition sequence. By "recognition sequence" it is meant that the sequence is a known sequence that can be targeted by a) nucleic acid hybridization (e.g. poly(dT) or poly(dA)), b) an enzyme (e.g. a restriction enzyme), or c) a ligand (e.g. biotin or avidin). Preferred primers are those where X is the recognition sequence for a restriction enzyme; introducing this sequence into expressed genes facilitates subsequent manipulation (e.g. cloning). For example, preferred primers are those where X is the recognition sequence for the restriction enzyme BamHI; these primers are selected from the group consisting of NNNGGATCCGCC(A or G)CCATGGNN; NNNGGATCCGCC(A or G)CCATGANN; NNNGGATCCGCC(A or G)CCATGCNN; and NNNGGATCCGCC(A or G)CCATGTNN (wherein N is either A,T,G,C or nothing). Table 1 sets forth, for illustrative purposes, a number of restriction enzyme recognition sequences. One skilled in the art will understand that "X" (see above) can be selected from this list depending on design considerations. Other restriction enzymes from commercially available sources have recognition sequences that can also be employed with success.
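To give a sense of how many individual oligonucleotides a degenerate primer of this kind represents, the following Python sketch enumerates one of the BamHI-containing K primers. It is a simplified illustration under two assumptions that are not in the text above: the "(A or G)" position is written with the single-letter code R, and each N is taken to be one of A, C, G or T (the "or nothing" alternative is not modelled). The function name is hypothetical.

```python
# Illustrative sketch: enumerate the concrete sequences of a degenerate primer.
from itertools import product

# Only the symbols needed for these primers; R stands for "A or G".
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def expand(primer: str):
    """Yield every concrete sequence represented by a degenerate primer."""
    pools = [CODES[symbol] for symbol in primer.upper()]
    return ("".join(bases) for bases in product(*pools))


if __name__ == "__main__":
    k_primer = "NNNGGATCCGCCRCCATGGNN"  # NNNGGATCCGCC(A or G)CCATGGNN
    sequences = list(expand(k_primer))
    # Five N positions and one R position: 4**5 * 2 combinations.
    print(len(sequences), sequences[0])
```

Under these assumptions each of the four primers is a pool of 4^5 x 2 = 2048 sequences, which illustrates how the degenerate ends divide the transcript pool into subsets.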
Primers containing facilitating moieties such as recognition sequences allow for the introduction of such sequences into the product of the amplification reaction. That is to say, amplification in PCR involves primer extension to make the so-called "long products." These long products are the template for subsequent cycles of amplification. While it is not intended that the present invention be limited by any understanding of the mechanism whereby the primers of the present invention successfully operate, it is believed that a primer such as NNN-X-GCC(A or G)CCATGGNN will only partially hybridize to one strand of the denatured double-stranded target nucleic acid in the first round as set forth in Figure 1.
To improve hybridization of the primer for making long products, the present invention, in one embodiment, contemplates using a lower annealing temperature (discussed more below). To improve the specificity of hybridization in subsequent cycles, the present invention, in one embodiment, also contemplates isolating the long products via the recognition sequence prior to subsequent cycles. In one embodiment, the long products are isolated using an oligo (dT) resin; the long products containing the corresponding recognition sequence bind to the resin, while the background template nucleic acid does not. In this manner, the background template can be removed and subsequent rounds of hybridization are carried out on the long products [with the same primers or with the primers that lack the recognition sequence (but that are otherwise the same)].
In another embodiment, the primers are selected from the group consisting of the primers: NNN-X-GCC(A or G)CCATGG(C or A)GNN; NNN-X-GCC(A or G)CCATGG(C or A)TNN; NNN-X-GCC(A or G)CCATGG(C or A)ANN; and NNN-X-GCC(A or G)CCATGG(C or A)CNN (wherein X is either a recognition sequence or nothing, and wherein N is either A,T,G,C or nothing). This embodiment contains primers with the consensus sequence extending to the +5 position, but that vary at the +6 position.
TABLE 1 - RECOGNITION SEQUENCES
Key:
D = A or G or T; N = A or C or G or T; H = A or C or T; R = A or G; K = G or T; S = C or G; M = A or C; Y = C or T
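The single-letter key above can be applied mechanically when searching a sequence for a degenerate recognition sequence. The Python sketch below converts such a sequence into a regular expression. The function names are hypothetical, and the example site GANTC (recognized by HinfI) is chosen only for illustration; it is not asserted to be an entry of Table 1.

```python
# Illustrative sketch: search DNA for a recognition sequence written with the key above.
import re

KEY = {
    "D": "AGT", "N": "ACGT", "H": "ACT", "R": "AG",
    "K": "GT", "S": "CG", "M": "AC", "Y": "CT",
}


def to_regex(site: str) -> str:
    """Translate a degenerate recognition sequence into a regex pattern string."""
    parts = []
    for symbol in site.upper():
        # Degenerate symbols become character classes; A, C, G, T stay literal.
        parts.append("[" + KEY[symbol] + "]" if symbol in KEY else symbol)
    return "".join(parts)


def find_sites(site: str, dna: str):
    """Return the 0-based start index of every (possibly overlapping) match."""
    lookahead = "(?=" + to_regex(site) + ")"
    return [m.start() for m in re.finditer(lookahead, dna.upper())]


if __name__ == "__main__":
    print(to_regex("GANTC"))
    print(find_sites("GANTC", "TTGAATCGGACTCAA"))
```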
The present invention contemplates primers where there are many degenerate bases after the ATG at the 3' end (e.g. between three and ten, more preferably between three and five) as well as where there is only one degenerate base after the ATG at the 3' end. In one embodiment, the primers are selected from the group consisting of the primers: GCC(A or G)CCATGN (wherein N is either A,T,G or C). These primers can be linked to a recognition sequence ("X") in the manner described above, if desired.
The present invention also contemplates primers where there are a number of degenerate bases at the 5' end (i.e. prior to the Kozak sequence). In one embodiment, the primers are selected from the group consisting of the primers: N(1-10)GCC(A or G)CCATGGNN; N(1-10)GCC(A or G)CCATGANN; N(1-10)GCC(A or G)CCATGCNN; and N(1-10)GCC(A or G)CCATGTNN (wherein N is either A,T,G or C).
In another embodiment, the primers are selected from the group consisting of the primers: CGGGATCCGCC(A or G)CNATGG (hereinafter "K1" when N is C); CGGGATCCGCC(A or G)CNATGA (hereinafter "K2" when N is C); CGGGATCCGCC(A or G)CNATGC (hereinafter "K3" when N is C); and CGGGATCCGCC(A or G)CNATGT (hereinafter "K4" when N is C).
In another embodiment, the primers are selected from the group consisting of the primers: CGGGATCCGCC(A or G)(C or G)NATGG (hereinafter "K-2-1" when N is C); CGGGATCCGCC(A or G)(C or G)NATGC (hereinafter "K-2-2" when N is C);
CGGGATCCGCC(A or G)(C or G)NATGT (hereinafter "K-2-3" when N is C); and CGGGATCCGCC(A or G)CNATGA (hereinafter "K-2-4" when N is C).
In another embodiment, the primers are selected from the group consisting of the primers: CGGGATCCGCC(A or G)(C or G)NATGGN (hereinafter "K-3-1" when N is C); CGGGATCCGCC(A or G)(C or G)NATGCN (hereinafter "K-3-2"); CGGGATCCGCC(A or G)(C or G)NATGTN (hereinafter "K-3-3"); and CGGGATCCGCC(A or G)CNATGAN (hereinafter "K-3-4"). In these embodiments, N can be A, C, G or T.
It is not intended that the present invention be limited to the entire Kozak sequence. It is specifically contemplated that the primer of the present invention can be only partially complementary to this natural common non-coding sequence. For example, in one embodiment, the present invention contemplates linking the ATG triplet to degenerate bases on either side (or both sides). A recognition sequence ("X") can be linked to such a primer on the 5' end. In such an embodiment, the primers are of the general formula: 5'-N(1-10)-X-N(1-10)ATGN(1-10)-3' (wherein N is A,T,G,C or nothing). In a preferred embodiment, X is the recognition sequence for a restriction enzyme; again, introducing this sequence into expressed genes facilitates subsequent manipulation (e.g. cloning). For example, preferred primers are those where X is the recognition sequence for the restriction enzyme BamHI; these primers are selected from the group consisting of NGGATCCNNNATGA; NGGATCCNNNATGC; NGGATCCNNNATGT; and NGGATCCNNNATGG (wherein N is either A,T,G,C or nothing).
While the above discussion has focused on primer extension or PCR of DNA using K primers, the present invention also contemplates hybridization of the K primers to the corresponding mRNA Kozak sequence: 5'-ACCAUGG. In addition, primers can be made having the ACCAUGG sequence that can be used to hybridize to DNA.
2. Primers Complementary To Restriction Enzyme Recognition Sequences (Downstream Primers)
Since the efficiency of sequencing gels in resolving DNA fragments greater than 600 bases is very limited, recognition sequences for 4 and 6 base cutting restriction enzymes were searched for within 600 bp of the putative Kozak sequence. It was found that the sequence GATC, which is recognized by the MboI enzyme and its isoschizomer Sau3AI, was present in the target region at least once in 72% of the cDNAs. The remainder of the cDNAs had sequences for other common restriction enzymes: HpaII (10%); HinP1I (6%); MaeII (5%). Sequences for MseI (TTAA) and MaeI (CTAG) restriction endonucleases were present in only 1 and 2% respectively of the cDNAs surveyed. Thus by using oligonucleotide primers having 3' sequences complementary to the recognition sequences for 4-6 common restriction enzymes in combination with Kozak primers one could amplify the entire repertoire of the expressed genes.
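The kind of survey described above can be sketched in Python as follows. This is an illustration only and not the search actually performed: it treats the first ATG in the sequence as the putative start, uses only the four-base sites whose sequences are stated in this description (GATC, CCGG, TTAA, CTAG), and all names and the example cDNA are hypothetical.

```python
# Illustrative sketch: look for 4-base recognition sequences within a window
# downstream of the putative initiating ATG of a cDNA.
SITES = {
    "MboI/Sau3AI": "GATC",
    "HpaII": "CCGG",
    "MseI": "TTAA",
    "MaeI": "CTAG",
}


def sites_near_start(cdna: str, window: int = 600):
    """Return {enzyme: offset} for each site found within `window` bases of the
    first ATG; offsets are counted from the A of the ATG (offset 0)."""
    cdna = cdna.upper()
    atg = cdna.find("ATG")
    if atg == -1:
        return {}
    region = cdna[atg:atg + window]
    hits = {}
    for enzyme, site in SITES.items():
        offset = region.find(site)
        if offset != -1:
            hits[enzyme] = offset
    return hits


if __name__ == "__main__":
    example = "GCCACCATGGCTGATCAAACCGGTT"
    print(sites_near_start(example))
```

The offset of the nearest site gives a rough indication of the length of the product expected from a K primer paired with the corresponding RE primer.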
Therefore, the present invention contemplates downstream primers designed with recognition sequences for common restriction enzymes (hereafter "RE" primers). In one embodiment, the RE primers are designed with degenerate bases on either side (or both sides) of the recognition sequence. In a preferred embodiment, the RE primer is designed with 3 degenerate bases at the 5' end and 2 degenerate bases at the 3' end (5'-N3-specific recognition sequence-N2-3').
In one embodiment, the downstream primers of the present invention are primers selected from the group consisting of the primers: 5'-X-NNNGATC-3' (i.e. having the recognition sequence for MboI); 5'-X-NNNCTAG-3' (i.e. having the recognition sequence for BfaI); 5'-X-NNNCCGC-3' (i.e. having the recognition sequence for AciI); 5'-X-NNNCCGG-3' (i.e. having the recognition sequence for HpaII); and 5'-X-NNNAATT-3' (i.e. having the recognition sequence for Tsp509I), wherein X is a recognition sequence on the 5' end that is different from the recognition sequence of the 3' end, or X is nothing.
It is not intended that the present invention be limited by the recognition sequence on the 5' end; again resort can be made to a variety of recognition sequences, including but not limited to those sequences found in Table 1. In one embodiment, the recognition sequence on the 5' end of the downstream primers of the present invention is for EcoRI. Such primers are selected from the group consisting of the primers: GAATTCNNNGATC; GAATTCNNNCTAG; GAATTCNNNCCGC; GAATTCNNNCCGG; GAATTCNNNAATT; GAATTCNNNTTAA; and GAATTCNNNGCGC.
In one embodiment, the recognition sequence on the 5' end of the downstream primers of the present invention is for BamHI. Such primers are selected from the group consisting of the primers: GGATTCCNNNGATC (hereinafter "MboI primer"); GGATTCCNNNCTAG (hereinafter "BfaI primer"); GGATTCCNNNCCGC (hereinafter "AciI primer"); GGATTCCNNNCCGG (hereinafter "HpaII primer"); and GGATTCCNNNAATT (hereinafter "Tsp509I primer").
Primers containing facilitating moieties such as 5' recognition sequences of the RE primers of the present invention allow for the introduction of such sequences into the product of the amplification reaction. As noted above, amplification in PCR involves primer extension to make the so-called "long products." These long products are the template for subsequent cycles of amplification. While it is not intended that the present invention be limited by any understanding of the mechanism whereby the primers of the present invention successfully operate, it is believed that a primer such as X-NNNGATC will only partially hybridize to one strand of the denatured double-stranded target nucleic acid in the first round as set forth in Figure 2.
It is not intended that the primers of the present invention be limited by the precise sequence of a restriction recognition sequence. Indeed, it is specifically contemplated that the primers of the present invention can be only partially complementary to the recognition sequence.
B. Shine-Dalgarno
The prokaryotic mRNA ribosome binding site (RBS) usually contains part or all of a polypurine domain UAAGGAGGU known as the Shine-Dalgarno (SD) sequence found just 5' to the translation initiation codon:
mRNA 5'-UAAGGAGGU - N(5-10) - AUG
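The arrangement just described (SD sequence, a spacer of 5 to 10 nucleotides, then AUG) can be expressed as a pattern search. The Python sketch below is a simplified illustration with hypothetical names: it requires the complete UAAGGAGGU domain, whereas a natural ribosome binding site may contain only part of it.

```python
# Illustrative sketch: find AUG codons preceded by a full SD sequence and a
# spacer of 5 to 10 nucleotides.
import re

SD_PATTERN = re.compile(r"UAAGGAGGU[ACGU]{5,10}AUG")


def find_sd_starts(mrna: str):
    """Return the 0-based index of the A of each AUG that closes a match.
    The spacer quantifier is greedy, so when several AUGs lie 5-10 nt after
    the SD sequence, the most distant one is the one reported."""
    return [m.end() - 3 for m in SD_PATTERN.finditer(mrna.upper())]


if __name__ == "__main__":
    example = "GG" + "UAAGGAGGU" + "AAAAAA" + "AUG" + "GCU"
    print(find_sd_starts(example))
```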
The present invention therefore contemplates primers containing this motif (in a manner similar to the Kozak motif discussed above). An oligonucleotide primer specific for the SD sequence (with or without degenerate bases at its 5' and 3' end) will provide sufficient specificity to be used in a PCR amplification reaction as an upstream primer. Additionally, Taq DNA polymerase adds an A to the 3' end of such PCR products and this can be used to clone by virtue of commercially available ligation kits (e.g. from Promega).
Based on the knowledge of distribution of specific DNA sequences within the genome which are recognized by restriction endonucleases ("RE"), a second primer (an "RE primer") can be designed for use with the SD primer. Again, the presence of degenerate bases at the 5' and 3' end of these primers would provide length sufficient to give specificity in a PCR amplification reaction.
In one embodiment, the SD primers of the present invention are of the general formula: 5'-N(1-10)-X-N(1-10)TAAGGAGGN(1-10)-3' (where X is a recognition sequence or nothing, and where N is A, T, G, C or nothing). In a preferred embodiment, the recognition sequence (X) is a restriction enzyme recognition sequence; such sequences can be selected from Table 1 or other known lists of such sequences. On the other hand, the recognition sequence can be a region of nucleic acid that can be targeted by hybridization or by a ligand. Such recognition sequences can be used to separate the products of the first cycles of PCR (as discussed above). Where the recognition sequence is a restriction enzyme recognition sequence, a preferred sequence is that for the enzyme EcoRI. In such an embodiment, the SD primers are selected from the group of the general formula:
5'-NGAATTCNNNTAAGGAGG-3'
where N is A, T, G, C or nothing.
It is not intended that the present invention be limited to the entire SD sequence. For example, in one embodiment, the present invention contemplates linking a portion of the SD sequence (e.g. AGGAGG) to degenerate bases on either side (or both sides) to create a useful primer.
It is also contemplated that the SD primers of the present invention need not hybridize completely to the target nucleic acid. In the manner set forth in Figure 1 for K primers, it is contemplated that the primer can be extended even though portions of the primer are not hybridized.
II. Preparation of RNA
The nucleic acid content of cells consists of deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). The DNA contains the genetic blueprint of the cell. RNA is involved as an intermediary in the production of proteins based on the DNA sequence. RNA exists in three forms within cells: structural RNA (i.e., ribosomal RNA "rRNA"), transfer RNA ("tRNA"), which is involved in translation, and messenger RNA ("mRNA"). Since the mRNA is the intermediate molecule between the genetic information encoded in the DNA, and the corresponding proteins, the cell's mRNA component at any given time is representative of the physiological state of the cell. In order to study and utilize the molecular biology of the cell, it is therefore important to be able to purify mRNA, including purifying mRNA from the total nucleic acid of a sample.
The preparation of RNA is complicated by the presence of ribonucleases that degrade RNA (e.g., T. Maniatis et al., Molecular Cloning, pp. 188-190, Cold Spring Harbor Laboratory [1982]). Furthermore, the preparation of amplifiable RNA is made difficult by the presence of ribonucleoproteins in association with RNA. (See, R. J. Slater, In: Techniques in Molecular Biology, J.M. Walker and W. Gaastra, eds., Macmillan, NY, pp. 113-120 [1983]).
Typically, the steps involved in purification of nucleic acid from cells include 1) cell lysis; 2) inactivation of cellular nucleases; and 3) separation of the desired nucleic acid from the cellular debris and other nucleic acid. Cell lysis may be achieved through various methods, including enzymatic, detergent or chaotropic agent treatment. Inactivation of cellular nucleases may be achieved by the use of proteases and/or the use of strong denaturing agents. Finally, separation of the desired nucleic acid is typically achieved by extraction of the nucleic acid with phenol or phenol-chloroform; this method partitions the sample into an aqueous phase (which contains the nucleic acids) and an organic phase (which contains other cellular components, including proteins). Commonly used protocols require the use of salts in conjunction with phenol (P. Chomczynski and N. Sacchi, Anal. Biochem. 162:156 [1987]),
or employ a centrifugation step to remove the protein (R.J. Slater, supra). While useful, phenol extraction is time consuming and creates a serious waste disposal problem.
Once the nucleic acid fraction has been isolated from the cell, the structure of the mRNA molecule may be used to assist in the purification of mRNA from DNA and other RNA molecules. Because the mRNA of higher organisms is usually polyadenylated on its 3' end ("poly-A tail" or "poly-A tract"), one means of isolating RNA from cells has been based on binding the poly-A tail with its complementary sequence (i.e., oligo-dT), that has been linked to a support such as cellulose. Commonly, the hybridized mRNA/oligo-dT is separated from the other components present in the sample through centrifugation or, in the case of magnetic formats, exposure to a magnetic field. Once the hybridized mRNA/oligo-dT is separated from the other sample components, the mRNA is usually removed from the oligo-dT. However, for some applications, the mRNA may remain bound to the oligo-dT that is linked to a solid support.
A wide variety of solid supports with linked oligo-dT have been developed and are commercially available. Cellulose remains the most common support for most oligo-dT systems, although formats with oligo-dT covalently linked to latex beads and paramagnetic particles have also been developed and are commercially available. The paramagnetic particles may be used in a biotin-avidin system, in which biotinylated oligo-dT is annealed in solution to mRNA. The hybrids are then captured with streptavidin-coated paramagnetic particles, and separated using a magnetic field. In addition to these methods, variations exist, such as affinity purification of polyadenylated RNA from eukaryotic total RNA in a spun-column format. These approaches allow for hybridization of poly A mRNA, but vary in efficiency and sensitivity.
It is not intended that the present invention be limited by the source of RNA; a variety of sources is contemplated, including but not limited to mammalian (e.g., liver tissue), plant (e.g., tobacco leaves) and microbial (e.g., yeast). In one embodiment, the present invention contemplates the isolation of PolyA+ RNA from extracts, including direct isolation from crude extracts.
III. Comparing Biological Samples
Successful amplification can be confirmed by characterization of the product(s) from the reaction. The present invention contemplates, in one embodiment, using electrophoresis to confirm product formation and compare the results between samples.
A. Cancer Tissue
As noted above, the present invention may be used to compare normal tissue with cancer tissue, as well as to differentiate between cancer tissue that is metastatic and cancer tissue that is non-metastatic. In yet another embodiment, the present invention may be used to detect drug resistance.
The treatment of cancer has been hampered by the fact that there is considerable heterogeneity even within one type of cancer. Some cancers, for example, have the ability to invade tissues and display an aggressive course of growth characterized by metastases. These tumors generally are associated with a poor outcome for the patient. And yet, without a means of identifying such tumors and distinguishing such tumors from non-invasive cancer, the physician is at a loss to change and/or optimize therapy.
With regard to metastatic disease, it is believed that cancer cells proteolytically alter basement membranes underlying epithelia or the endothelial linings of blood and lymphatic vessels, invade through the defects created by proteolysis, and enter the circulatory or lymphatic systems to colonize distant sites. During this process, the secretion of proteolytic enzymes is coupled with increased cellular motility and altered adhesion. After their colonization of distant sites, metastasizing tumor cells proliferate to establish metastatic nodules. The present invention can be used to compare metastatic cancer tissue with non-metastatic cancer tissue to identify differentially expressed genes as markers of metastatic potential. Thereafter, the present invention can be used to determine the presence or absence of these markers in various clinical cancer isolates. The present invention also contemplates "phenotyping" cancer cells adapted to tissue culture.
With regard to drug resistance, it should be noted that success with chemotherapeutics as anticancer agents has been severely hampered by the phenomenon of multiple drug resistance, i.e., resistance to a wide range of structurally unrelated cytotoxic anticancer compounds. J.H. Gerlach et al., Cancer Surveys, 5:25-46 (1986). The underlying cause of progressive drug resistance may be a small population of drug-resistant cells within the tumor (e.g., mutant cells) at the time of diagnosis. J.H. Goldie and Andrew J. Coldman, Cancer Research, 44:3643-3653 (1984). Treating such a tumor with a single drug first results in a remission, where the tumor shrinks in size as a result of the killing of the predominant drug-sensitive cells. With the drug-sensitive cells gone, the remaining drug-resistant cells continue to multiply and eventually dominate the cell population of the tumor. The present invention can be used to compare drug resistant cells with non-resistant cells to identify differentially expressed genes as markers of drug resistance. Thereafter, the present invention can be used to determine the presence or absence of these markers in various clinical cancer isolates.
B. Classification and Identification of Microorganisms
The detection and identification of microorganisms recovered from clinical specimens or environmental sources is an important aspect of clinical microbiology, as this information is important to physicians in making decisions related to methods of treatment. In order that a particular microorganism is identified correctly and consistently, regardless of the source or the laboratory identifying the organism, reproducible systems for identifying microorganisms are critical. As stated by Finegold, "The primary purpose of nomenclature of microorganisms is to permit us to know as exactly as possible what another clinician, microbiologist, epidemiologist, or author is referring to when describing an organism responsible for infection of an individual or outbreak" (S. Finegold, "Introduction to summary of current nomenclature, taxonomy, and classification of various microbial agents," Clin. Infect. Dis., 16:597 [1993]).
Classification, nomenclature, and identification are three separate, but interrelated aspects of taxonomy. Classification is the arranging of organisms into taxonomic groups (i.e., taxa) on the basis of similarities or relationships. A multitude of prokaryotic organisms has been identified, with great diversity in their types, and many more organisms are being characterized and classified on a regular basis. It is a matter of convenience to classify the organisms into groups based upon their similarities. Classification has been used to organize the seemingly chaotic array of individual bacteria into an orderly framework. Through use of a classification framework, a new isolate can be more easily characterized by comparison with known organisms. The choice of criteria for placement into groups is somewhat arbitrary, although most classifications are based on phylogenetic relationships. An example of the arbitrariness of bacterial classification is reflected in the genetic definition of a "species" as being strains of bacteria that exhibit 70% DNA relatedness, with 5% or less divergence within related sequences (Baron et al., "Classification and identification of bacteria," in Manual of Clinical Microbiology, Murray et al. (eds.), ASM Press, Washington, D.C., pp. 249-264 [1995]).
There are two basic genetic test methods used in the classification and identification of bacteria. Nucleic acid hybridization studies may be conducted to determine the degree of relatedness of organisms on the DNA level. Ribosomal RNA (rRNA) sequence analysis is another method used to study the relationships between organisms. In addition to these methods, molecular probes and amplification methods (e.g., PCR) may be used to detect and identify microorganisms.
In nucleic acid hybridization methods, the test DNA is denatured and exposed to denatured DNA of known sequence from a particular organism. The amount of hybridization between the test DNA and known DNA provides an indication of the degree of relatedness between the test and known organisms. An important drawback to this approach is that hybridization between two single DNA strands can occur even when 15% of the sequences are not complementary. Ribosomal RNA analysis is another method by which the relatedness of organisms has been determined. Because ribosomes are critical to cellular function and interact with many other molecules (e.g., mRNA and tRNAs), the core rRNA sequences are highly constrained and well-conserved throughout evolution. However, because rRNA also contains highly variable regions, it is usually possible to identify regions of 20-30 bases that are unique to a particular species. While this approach analyzes sequence differences between the rRNAs of different organisms, it is extremely narrow in that it looks at no other differences between organisms.
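The idea of locating a short rRNA region unique to one species can be sketched in a few lines of Python. This is an illustration only: the function name `unique_windows`, the species labels and the toy sequences are hypothetical, and the window length is shortened to 6 bases for readability (the text above refers to regions of 20-30 bases).

```python
def unique_windows(sequences, target, k):
    """Return the k-base windows of `target` that occur in no other sequence."""
    # Collect every k-base window present in the non-target sequences.
    others = set()
    for name, seq in sequences.items():
        if name != target:
            others.update(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Keep target windows absent from all others, in order of first appearance.
    found = []
    tseq = sequences[target]
    for i in range(len(tseq) - k + 1):
        window = tseq[i:i + k]
        if window not in others and window not in found:
            found.append(window)
    return found


# Toy sequences (hypothetical, not real rRNA).
toy = {
    "species_A": "ACGTACGTTTGACC",
    "species_B": "ACGTACGTAAGACC",
}
print(unique_windows(toy, "species_A", 6))
```

Windows shared by both toy sequences (the conserved "core") are discarded; only windows spanning the variable stretch remain as candidate species-specific regions.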
Generally, identification of an organism (e.g., bacteria) is based on its overall morphological and biochemical patterns observed in culture. However, numerous organisms associated with disease may not be cultured in vitro. Indeed, some do not grow well in traditional in vivo culture systems, such as cell cultures or embryonated eggs. Nonetheless, their detection and identification are crucial for the appropriate treatment of affected individuals. Genetic testing methods have proven useful for the classification and identification of such organisms. For example, universal ribosomal primers designed to hybridize to and amplify all bacterial rRNA may be used to detect bacteria in any sterile body site (e.g., synovial fluid). Once detected, the organism may then be identified by sequencing and/or amplification methods, and comparing the results with those obtained from known organisms. While this method has led to the identification and classification of various organisms that were historically not cultivable, it is again limited in its focus on rRNA.
The present invention can be used to identify genes unique to a particular species, subspecies or strain. Unlike the above-described currently used genetic approaches, the present invention is not limited to any particular genes or gene sequences (e.g., rRNA sequences).
With regard to distinguishing different species, in one embodiment, the present invention contemplates comparing the expressed genes of two samples suspected to be different species. In another embodiment, a species that is suspected to have changed or diverged from the parent species is compared with the parent species. For example, a species or strain of bacteria may develop a different susceptibility to a drug (e.g., antibiotics) as compared to the parent species; rapid identification of the specific species or subspecies aids diagnosis and allows initiation of appropriate treatment.
EXPERIMENTAL
The following examples serve to illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.
In the experimental disclosure which follows, the following abbreviations apply: eq (equivalents); M (Molar); μM (micromolar); N (Normal); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); gm (grams); mg (milligrams); μg (micrograms); L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); °C (degrees Centigrade); Ci (Curies); MW (molecular weight); OD (optical density); EDTA (ethylenediaminetetraacetic acid); PAGE (polyacrylamide gel electrophoresis); UV (ultraviolet); V (volts); W (watts); mA (milliamps); bp (base pair); CPM (counts per minute).
The present invention contemplates preparation of RNA. While a variety of preparation schemes can be used successfully with the present invention, in the experiments below, the total RNA and mRNA were either purchased commercially (Clontech, Palo Alto, CA) or prepared according to standard protocols. However, to remove any contaminating genomic DNA, all RNA preparations were digested with RNAse-free DNAse-1 (RQ-1 DNAse, Promega, Madison, WI), extracted with phenol-chloroform (Sigma Chemical Company, St. Louis, MO) and precipitated with ethanol.
The cDNA used for the PCR reaction can be made in a variety of ways. However, in the examples below, single stranded cDNA (sscDNA) was synthesized using 1 μg of total RNA or 100 μg of mRNA with random primers according to the instructions supplied with a commercially available kit (Superscript, BRL-GIBCO, Gaithersburg, MD). At the end of the synthesis reaction, the reverse transcriptase enzyme was inactivated by heating at 94°C for 15 min.
To calculate the amount of synthesized cDNA, an aliquot of the cDNA synthesis mixture was mixed with labelled dCTP (Amersham, Arlington Heights, IL) and the yield was determined by standard methods. The cDNAs were diluted to a final concentration of 1 ng/μl.
PCR conditions can vary depending on desired outcome. Nonetheless, unless otherwise indicated, the conditions used were as follows. First, the amount of cDNA used in each PCR amplification reaction was empirically determined; 2-5 ng of sscDNA gave satisfactory results. Second, the PCR reactions were set up in precooled 0.2 ml thin-walled tubes on ice and contained 50 mM Tris-HCl (pH 8.5), 50 mM KCl, 1.5 mM MgCl2, 1 mM of each dNTP, 2-5 ng of sscDNA, 10 pmoles of a K-primer, 10 pmoles of an RE-primer, 0.5 μl of α-33P-dCTP (10 μCi/μl, Amersham) and water to 20 μl.
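The arithmetic behind assembling such a 20 μl reaction can be sketched as follows. Only the final concentrations, the 1 ng/μl cDNA dilution, the 10 pmoles of each primer, the 0.5 μl of label and the 20 μl final volume come from the text; the stock concentrations (a 10X buffer carrying the Tris-HCl, KCl and MgCl2, a 10 mM dNTP mix, primers at 10 pmol/μl) and the function name `reaction_volumes` are assumptions made purely for illustration.

```python
FINAL_VOLUME_UL = 20.0  # final reaction volume stated in the text


def reaction_volumes(cdna_ng,
                     cdna_stock_ng_per_ul=1.0,       # dilution stated in the text
                     buffer_fold=10.0,               # assumed 10X buffer stock
                     dntp_stock_mm=10.0,             # assumed stock, each dNTP
                     dntp_final_mm=1.0,              # final, each dNTP
                     primer_stock_pmol_per_ul=10.0,  # assumed primer stock
                     primer_pmol=10.0,               # per primer, from the text
                     label_ul=0.5):                  # labelled dCTP, from the text
    """Return the volume (in microliters) of each component, including water."""
    vols = {
        "buffer": FINAL_VOLUME_UL / buffer_fold,
        "dNTP mix": FINAL_VOLUME_UL * dntp_final_mm / dntp_stock_mm,
        "sscDNA": cdna_ng / cdna_stock_ng_per_ul,
        "K-primer": primer_pmol / primer_stock_pmol_per_ul,
        "RE-primer": primer_pmol / primer_stock_pmol_per_ul,
        "labelled dCTP": label_ul,
    }
    water = FINAL_VOLUME_UL - sum(vols.values())
    if water < 0:
        raise ValueError("components exceed the final reaction volume")
    vols["water"] = water
    return vols


for component, volume in reaction_volumes(cdna_ng=3.0).items():
    print(f"{component}: {volume:.1f} ul")
```

With different stock concentrations only the keyword arguments change; the water volume is always whatever remains to reach 20 μl.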
The mixture can be subjected to PCR cycles in different ways. In one embodiment, the first cycle (or even the first few cycles) involves a lower annealing temperature than the annealing temperature in subsequent cycles. For example, an annealing temperature of between approximately 34°C and approximately 44°C, and more preferably between approximately 36°C and approximately 40°C, and most preferably approximately 38°C (for approximately 30 seconds), can be used for the first cycle (or even the first few cycles). The subsequent cycles of denaturation, annealing and extension can involve a higher temperature. For example, in one embodiment, there are approximately 25-35 subsequent cycles of denaturation (approximately 94°C for approximately 30 seconds), annealing, and extension (72°C for 1 min), wherein the annealing temperature is between approximately 40°C and approximately 60°C, more preferably between approximately 44°C and approximately 54°C, and most preferably approximately 48°C (for approximately 30 sec).
In another embodiment, the annealing temperature is approximately the same temperature for all cycles. For example, the above-described mixture is subjected to 35 cycles of denaturation, annealing and extension, wherein the annealing temperature is between approximately 38°C and approximately 40°C (for approximately 30 seconds).
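The two cycling schemes described above can be written down as a simple data structure. The sketch below is illustrative only: the function name `cycling_program` and its defaults are hypothetical, and it assumes that the low-stringency first cycle uses the same denaturation (94°C, 30 s) and extension (72°C, 60 s) steps as the later cycles, which the text does not state explicitly.

```python
def cycling_program(first_cycles=1, low_anneal_c=38,
                    later_cycles=30, high_anneal_c=48):
    """Return a list of cycles; each cycle is a list of (step, temp_C, seconds)."""
    def cycle(anneal_c):
        # Assumption: denaturation and extension are identical in every cycle.
        return [("denature", 94, 30), ("anneal", anneal_c, 30), ("extend", 72, 60)]

    program = [cycle(low_anneal_c) for _ in range(first_cycles)]
    program += [cycle(high_anneal_c) for _ in range(later_cycles)]
    return program


# Two-stage embodiment: one low-stringency cycle, then 30 cycles at 48 C.
two_stage = cycling_program()

# Single-temperature embodiment: 35 cycles, all annealing at 38 C.
uniform = cycling_program(first_cycles=0, later_cycles=35, high_anneal_c=38)

total_s = sum(seconds for cyc in two_stage for _, _, seconds in cyc)
print(len(two_stage), "cycles,", total_s, "s of hold time (ramping excluded)")
```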
Cycling was done using a Perkin-Elmer System 2400 Thermal Cycler (Perkin-Elmer, Norwalk, CT). PCR amplifications were performed with subsets of K-primers in combination with different RE-primers. Finally, the PCR products were analyzed by high resolution polyacrylamide gel electrophoresis using 6% sequencing-grade gels (BRL) and the amplified DNA fragments were visualized by autoradiography using BioMaxMR film (Kodak, Rochester, NY).
EXAMPLE 1
This example describes the generation of PCR product from several human cell types using one embodiment of the method of the present invention. Total RNA (available commercially from Clontech) was used from four human cell types: 1) the K562 tumor cell line, 2) placental tissue, 3) spleen cells, and 4) thymus cells (in the first, second, third and fourth lane of each four lane group in Figures 3A and 3B). The total RNA was reverse transcribed using 6-mer random primers (available from Pharmacia). The resultant cDNA was subjected to thirty-five cycles of PCR (in the presence of a radioactive precursor) using a mixture of two anchor primers ("K2" for Figure 3A and "K3" for Figure 3B) and restriction enzyme-based primers [for this experiment, the recognition sequence on the 5' end of the RE downstream primers was for EcoRI; the primer sequences were: GAATTCNNNGT(A or C)(G or T)AC (lanes 1-4); GAATTCNNNCGGC (lanes 5-8); GAATTCNNN(A or G)GCGC(C or T) (lanes 9-12); GAATTCNNNTTAA (lanes 13-16)]. The PCR products were analyzed by PAGE using 6% sequencing gels (BRL) and visualized by autoradiography. The results show a large number of bands (see Figures 3A and 3B). Importantly, there is differential expression of transcripts in the various cell types.
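The degenerate positions in the RE-primers above ("N" and "(A or C)"-style alternatives) determine how many distinct oligonucleotides each primer represents. The following sketch, offered as an illustration rather than as part of the described method, converts the notation used in this example to standard IUPAC codes and counts the degeneracy; the helper names `to_iupac`, `degeneracy` and `expand` are hypothetical.

```python
import re
from itertools import product

# Standard IUPAC nucleotide codes needed for the primers in this example.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "M": "AC", "K": "GT", "R": "AG", "Y": "CT", "N": "ACGT"}
PAIR_TO_CODE = {frozenset("AC"): "M", frozenset("GT"): "K",
                frozenset("AG"): "R", frozenset("CT"): "Y"}


def to_iupac(primer):
    """Rewrite '(A or C)'-style alternatives as single IUPAC letters."""
    def repl(match):
        return PAIR_TO_CODE[frozenset(match.group(1) + match.group(2))]
    return re.sub(r"\(([ACGT]) or ([ACGT])\)", repl, primer)


def degeneracy(iupac_primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in iupac_primer:
        count *= len(IUPAC[base])
    return count


def expand(iupac_primer):
    """List every concrete sequence encoded by a (short) degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in iupac_primer))]


primers = ["GAATTCNNNGT(A or C)(G or T)AC", "GAATTCNNNCGGC",
           "GAATTCNNN(A or G)GCGC(C or T)", "GAATTCNNNTTAA"]
for p in primers:
    code = to_iupac(p)
    print(code, degeneracy(code))
```

Each "NNN" spacer contributes a factor of 64 and each two-base alternative a factor of 2, so the four primers represent 256, 64, 256 and 64 sequences, respectively.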
EXAMPLE 2
This example describes the generation of PCR product from several bacterial species using one embodiment of the method of the present invention. Bacterial DNA was prepared by standard methods. 10-50 ng of genomic DNA from E. coli and P. stuarti (in the first and second lane, respectively, of each two lane group in Figure 4) was subjected to thirty-five cycles of PCR (in the presence of a radioactive precursor) using a mixture of anchor primers (SD-primers: 5'-GGAATTCNNN-TAAGGAGG-3') and restriction enzyme-based primers (RE-primers: 5'-GGATTCCNNNGATC (this "MboI primer" was used in lanes 1 and 2 of Figure 4); GGATTCCNNNCTAG (this "BfaI primer" was used in lanes 3 and 4 of Figure 4); GGATTCCNNNCCGC (this "AciI primer" was used in lanes 5 and 6 in Figure 4); GGATTCCNNNCCGG (this "HpaII primer" was used in lanes 7 and 8 in Figure 4); and GGATTCCNNNAATT (this "Tsp509I primer" was used in lanes 9 and 10 in Figure 4)). The PCR products were analyzed by PAGE using 6% sequencing gels (BRL) and visualized by autoradiography.
The results show a large number of bands (Figure 4). Importantly, there is differential expression of transcripts in the different species. For example, there are clearly DNA fragments that are associated with E. coli that are not found in P. stuarti. Such bands are markers for identification.
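Why two templates give different band patterns with the same primer pair can be illustrated with a deliberately simplified in silico model. The sketch below assumes exact matching on a single strand, ignores the primers' 5' tails and NNN spacers, and pairs each anchor motif (here the Shine-Dalgarno sequence TAAGGAGG) with the nearest downstream recognition site; the function name `predicted_fragments` and the toy template are hypothetical and are not taken from the experiments.

```python
def predicted_fragments(template, anchor, site, max_len=1000):
    """Return (anchor_start, fragment_length) for each anchor motif that is
    followed by a recognition site within max_len bases (simplified model)."""
    hits = []
    start = template.find(anchor)
    while start != -1:
        # Nearest recognition site downstream of this anchor motif.
        site_pos = template.find(site, start + len(anchor))
        if site_pos != -1:
            length = site_pos + len(site) - start
            if length <= max_len:
                hits.append((start, length))
        start = template.find(anchor, start + 1)
    return hits


# Toy template: anchor motif, a short spacer, then AciI-like and MboI-like sites.
toy_template = "CC" + "TAAGGAGG" + "TTTTATGAAA" + "CCGC" + "TT" + "GATC" + "AA"

for name, site in [("MboI", "GATC"), ("BfaI", "CTAG"), ("AciI", "CCGC")]:
    print(name, predicted_fragments(toy_template, "TAAGGAGG", site))
```

A template from another organism would place these sites at different distances from the anchor motif (or lack them), giving fragments of different length, which is the basis for reading the bands as identification markers.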
EXAMPLE 3
This example describes the cloning and sequencing of expressed transcripts. Briefly, DNA bands representing differently expressed transcripts (see single and double arrows) were identified by visual scanning of the autoradiograph and marked (Figure 5, which represents a different exposure of the experiment run in Figure 3A). The film was then used as a template and the marked bands were cut out and eluted in water, precipitated with 0.3M sodium acetate, pH 6.0, and 2.5 vol of ethanol, pelleted by centrifugation (12,000 x g, 20 min), washed 2X with 70% ethanol, air dried and dissolved in 10 μl of nuclease free water. Half of the sample was then used for reamplification using the same primer combination and PCR conditions. Amplified material was resolved on a 2% agarose gel and the size of the amplified fragments was determined with reference to DNA size standards (100 bp ladder, BRL) and the amplified DNA fragments were gel purified using a commercially available kit (QIAquick, Qiagen, Los Angeles, CA). Amplified fragments were then cloned into a T-tailed vector using a commercially available kit (pGEM-T, Promega, Madison, WI) and the recombinants were identified by blue-white color selection. Positive clones were grown in LB medium (BRL) and plasmid minipreps were prepared (Qiagen) and sequenced (CWRU Molecular Biology Core Facility). Sequencing homology searches were performed at the National Center for Biotechnology Information (NCBI) using BLAST network service.
One band (single arrow of Figure 5) was found to have a sequence corresponding to a human mitochondrial hinge protein (see Figure 6 for the partial nucleic acid sequence). The other band (double arrow of Figure 5) was found to have a sequence corresponding to a human coactivator gene (Figure 7 shows the partial nucleic acid sequence).
EXAMPLE 4
This example describes the comparison of normal and malignant tissues. A variety of cell types were studied: 1) normal human keratinocytes, 2) normal human skin, and 3-5) three squamous cell carcinoma samples from patients (in the first, second, third, fourth and fifth lane of each five lane group in Figure 8). The total RNA was reverse transcribed using 6-mer random primers (available from Pharmacia). The resultant cDNA was subjected to thirty-five cycles (all cycles were performed using annealing temperatures between 38°C and 42°C) of PCR (in the presence of a radioactive precursor) using a mixture of two anchor primers ("K1") and restriction enzyme-based primers (RE-primers: 5'-GGATTCCNNNGATC (this "MboI primer" was used in the reactions represented by lanes 1 through 5 of Figure 8); GGATTCCNNNCTAG (this "BfaI primer" was used in the reactions represented by lanes 6 through 10 of Figure 8); GGATTCCNNNCCGC (this "AciI primer" was used in reactions represented by lanes 11 through 15 in Figure 8); GGATTCCNNNCCGG (this "HpaII primer" was used in reactions represented by lanes 16 through 20 in Figure 8); and GGATTCCNNNAATT (this "Tsp509I primer" was used in the reactions represented in lanes 21 through 25 in Figure 8)). The PCR products were analyzed by PAGE using 6% sequencing gels (BRL) and visualized by autoradiography.
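The lane layout described for Figure 8 (five RE-primers, each run against the same five samples) can be decoded with simple integer arithmetic. The sketch below is only a reading aid: the function name `lane_info` is hypothetical, and the labels "carcinoma 1" through "carcinoma 3" are shorthand for the three squamous cell carcinoma samples.

```python
PRIMERS = ["MboI", "BfaI", "AciI", "HpaII", "Tsp509I"]
SAMPLES = ["normal keratinocytes", "normal skin",
           "carcinoma 1", "carcinoma 2", "carcinoma 3"]


def lane_info(lane):
    """Map a 1-based lane number to its (RE-primer, sample) pair."""
    if not 1 <= lane <= len(PRIMERS) * len(SAMPLES):
        raise ValueError("lane out of range")
    # Lanes are grouped by primer; within a group they follow sample order.
    group, offset = divmod(lane - 1, len(SAMPLES))
    return PRIMERS[group], SAMPLES[offset]


for lane in (1, 7, 25):
    print(lane, lane_info(lane))
```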
The results show a large number of bands (Figure 8). Importantly, there is differential expression of transcripts in the different samples. For example, there are clearly DNA fragments that are associated with normal cells that are not found in the lanes representing cancer cells. There are also DNA fragments that are expressed at much higher levels in cancer cells than in normal cells. These are useful markers for cancer identification.
From the above it should be evident that the present invention provides a convenient method for distinguishing between the expression of genes in two or more biological samples. Importantly, the method also promotes follow-up analysis once a gene of interest is identified.
Claims
1. A method of analyzing nucleic acid in a sample, comprising: a) providing: i) a sample containing nucleic acid, ii) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on a portion of said nucleic acid of said sample, iii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said nucleic acid of said sample, and iv) a polymerase and PCR reagents; b) preparing said nucleic acid from said sample under conditions so as to produce amplifiable nucleic acid; c) amplifying said nucleic acid with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated; and d) detecting said amplified product.
2. The method of Claim 1, wherein said sample comprises eukaryotic cells and said natural common sequence is the Kozak sequence.
3. The method of Claim 1, wherein said sample comprises prokaryotic cells and said natural common sequence is the Shine-Dalgarno sequence.
4. The method of Claim 1, wherein said detecting comprises gel electrophoresis.
5. A method of analyzing expressed genes in biological samples, comprising: a) providing: i) two samples containing mRNA, ii) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on at least a portion of said mRNA of said two samples, iii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said mRNA of said two samples, and iv) a polymerase and PCR reagents; b) treating said mRNA of each of said two samples under conditions so as to produce amplifiable DNA from each sample; c) amplifying said DNA from each sample with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated from each of said two samples; and d) detecting said amplified product.
6. The method of Claim 5, wherein each of said two samples comprises eukaryotic cells and said natural common sequence is the Kozak sequence.
7. The method of Claim 5, wherein each of said two samples comprises prokaryotic cells and said natural common sequence is the Shine-Dalgarno sequence.
8. The method of Claim 7, wherein said two samples comprise bacterial cells of different species.
9. The method of Claim 5, wherein said detecting comprises gel electrophoresis.
10. A method of analyzing expressed genes in multiple samples, comprising: a) providing: i) at least two samples containing mRNA, ii) random primers, iii) reverse transcriptase, iv) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence on a portion of said mRNA of said samples, v) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence present on a portion of said mRNA of said samples, and vi) a polymerase and PCR reagents; b) extracting mRNA from each of said samples and reverse transcribing said mRNA with said reverse transcriptase and said random primers under conditions such that cDNA is produced; c) amplifying said cDNA from each sample with said first and second primers, said polymerase and said PCR reagents under conditions such that amplified product is generated from each of said samples; and d) detecting said amplified product.
11. The method of Claim 10, wherein said samples comprise eukaryotic cells and said natural common sequence is the Kozak sequence.
12. The method of Claim 11, wherein a portion of said samples comprise human cancer cells.
13. The method of Claim 10, wherein said samples comprise prokaryotic cells and said natural common sequence is the Shine-Dalgarno sequence.
14. The method of Claim 10, wherein said analyzing means comprises gel electrophoresis.
15. A kit, comprising: i) a first primer having a sequence of which at least a portion is at least partially complementary to a natural common non-coding sequence, and ii) a second primer having a sequence of which at least a portion is at least partially complementary to a restriction enzyme recognition sequence.
16. The kit of Claim 15, wherein said natural common sequence is the Kozak sequence.
17. The kit of Claim 15, wherein said natural common sequence is the Shine-Dalgarno sequence.
18. The kit of Claim 15, wherein said restriction enzyme recognition sequence is selected from the group consisting of the sequences set forth in Table 1.
19. The kit of Claim 15, wherein said first primer is of the general formula: 5'-N1-10-X-N1-10ATGN1-10-3', wherein N is A, T, G, C or nothing, and wherein X is the recognition sequence for a restriction enzyme or nothing.
20. The kit of Claim 15, wherein said second primer is of the general formula: 5'-N1-10-X-N1-10TAAGGAGGN1-10-3', where X is a recognition sequence or nothing, and where N is A, T, G, C or nothing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU64446/98A AU6444698A (en) | 1997-03-03 | 1998-03-03 | Methods and compositions for identifying expressed genes |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US78552297A | 1997-03-03 | 1997-03-03 | |
US08/785,522 | 1997-03-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1998039480A1 (en) | 1998-09-11 |
Family
ID=25135787
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1998/004094 WO1998039480A1 (en) | 1997-03-03 | 1998-03-03 | Methods and compositions for identifying expressed genes |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU6444698A (en) |
WO (1) | WO1998039480A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5580726A (en) * | 1994-04-29 | 1996-12-03 | Geron Corporation | Method and Kit for enhanced differential display |
US5599672A (en) * | 1992-03-11 | 1997-02-04 | Dana-Farber Cancer Institute, Inc. | Method of differential display of exposed mRNA by RT/PCR |
-
1998
- 1998-03-03 AU AU64446/98A patent/AU6444698A/en not_active Abandoned
- 1998-03-03 WO PCT/US1998/004094 patent/WO1998039480A1/en active Application Filing
Non-Patent Citations (6)
Title |
---|
"RESTRICTION ENZYMES AND LINKERS.", PROMEGA PROTOCOLS AND APPLICATIONS GUIDE, XX, XX, 1 January 1991 (1991-01-01), XX, pages 26 - 29., XP002911085 * |
"THE LOGIC AND MACHINERY OF GENE EXPRESSION. GENES AND GENOMES, PASSAGE.", GENES AND GENOMES, XX, XX, 1 January 1991 (1991-01-01), XX, pages 168/169 + 415., XP002911086 * |
IVANOVA N. B., ET AL.: "IDENTIFICATION OF DIFFERENTIALLY EXPRESSED GENES BY RESTRICTION ENDONUCLEASE-BASED GENE EXPRESSION FINGERPRINTING.", NUCLEIC ACIDS RESEARCH, INFORMATION RETRIEVAL LTD., GB, vol. 23., no. 15., 11 August 1995 (1995-08-11), GB, pages 2954 - 2958., XP002911088, ISSN: 0305-1048 * |
JOHNSTON S. L., ET AL.: "A NOVEL METHOD FOR SEQUENCING MEMBERS OF MULTI-GENE FAMILIES.", NUCLEIC ACIDS RESEARCH, INFORMATION RETRIEVAL LTD., GB, vol. 23., no. 15., 1 August 1995 (1995-08-01), GB, pages 3074/3075., XP002911090, ISSN: 0305-1048 * |
KATO K.: "DESCRIPTION OF THE ENTIRE MRNA POPULATION BY A 3' END CDNA FRAGMENT GENERATED BY CLASS IIS RESTRICTION ENZYMES.", NUCLEIC ACIDS RESEARCH, INFORMATION RETRIEVAL LTD., GB, vol. 23., no. 18., 1 September 1995 (1995-09-01), GB, pages 3685 - 3690., XP002911089, ISSN: 0305-1048 * |
PRASHAR Y., ET AL.: "ANALYSIS OF DIFFERENTIAL GENE EXPRESSION BY DISPLAY OF 3' END RESTRICTION FRAGMENTS OF CDNAS.", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, NATIONAL ACADEMY OF SCIENCES, US, vol. 93., 1 January 1996 (1996-01-01), US, pages 659 - 663., XP002911087, ISSN: 0027-8424, DOI: 10.1073/pnas.93.2.659 * |
Also Published As
Publication number | Publication date |
---|---|
AU6444698A (en) | 1998-09-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1 Designated state(s): AU CA JP |
AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
NENP | Non-entry into the national phase | Ref country code: JP Ref document number: 1998538681 Format of ref document f/p: F |
NENP | Non-entry into the national phase | Ref country code: CA |
122 | Ep: pct application non-entry in european phase | |