CN112553361A - Method for identifying SNP (single nucleotide polymorphism) of broad beans by using simplified genome sequencing data - Google Patents
- Publication number
- CN112553361A (application CN202011310367.5A)
- Authority
- CN
- China
- Prior art keywords
- snp
- sequence
- sequencing data
- identifying
- rad
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C12Q1/6895—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
- C12Q1/6869—Methods for sequencing
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- C12Q2600/156—Polymorphic or mutational markers
Abstract
The invention discloses a method for identifying broad bean SNPs by using simplified genome sequencing data, which comprises the following steps: step one, extracting sample genomic DNA; step two, constructing a RAD library, digesting the genomic DNA with EcoRI, enriching the library by PCR amplification, recovering the target band from an agarose gel, and performing single-end sequencing; step three, generating the raw sequences, performing sequence quality control analysis, clustering the high-quality sequences according to sequence similarity to generate RAD-tags, clustering the RAD-tags to perform SNP calling, and correcting the SNP genotypes; step four, KASP marker development and SNP genotyping. Because the method mines SNPs from genome sequencing data, it can identify not only SNP mutations in expressed gene regions but also SNP mutations in non-coding regions, such as those within genes and between genes, so that the sources of SNPs are richer.
    Description
Technical Field
      The invention relates to the technical field of biological detection, in particular to a method for identifying SNP (single nucleotide polymorphism) of broad beans by using simplified genome sequencing data.
    Background
      Broad beans are rich in protein and cellulose and are easy to digest and absorb. The dry seeds can be used as grain and feed or processed into snack foods, and the fresh seeds can be eaten as a vegetable. The root system of broad bean carries out biological nitrogen fixation, which makes broad bean an important rotation and soil-improving crop in the structural adjustment of the planting industry.
      In recent years, with the rapid development of second-generation sequencing technologies and the significant reduction in sequencing cost, high-throughput sequencing has been widely applied to marker development, gene mapping and other work in crops with large and complex genomes, such as wheat. Single nucleotide polymorphism (SNP) markers, that is, DNA sequence polymorphisms caused by single-nucleotide variation at the genome level, have become ideal new-generation molecular markers because they are widely distributed across the genome, dense, stable and suitable for large-scale screening. In broad beans, however, the number of publicly reported SNP markers is still limited.
      Simplified genome sequencing, such as RAD-Seq (restriction site-associated DNA sequencing), uses restriction enzymes to digest genomic DNA and performs high-throughput sequencing of specific fragments, so that sequence data representing the genome of the target species are obtained at reduced genome complexity. Because the sequencing depth is moderate, the cost is low and no reference genome is required, the approach is now widely used for marker development, genetic map construction, target gene mapping and similar work in many non-model species.
      Broad bean is a diploid crop (2n = 2x = 12) with a genome of about 13 Gb, roughly 25 times that of alfalfa, another leguminous crop, which makes it one of the largest genomes among leguminous crops. This very large genome seriously hinders genome resource research such as whole-genome sequencing and marker development, so that work such as obtaining genetic gain with molecular markers progresses slowly. The prior art therefore needs to be improved.
    Disclosure of Invention
      The invention provides a method for identifying broad bean SNPs (single nucleotide polymorphisms) by using simplified genome sequencing data, which aims to solve the technical problem that the very large genome of broad bean seriously hinders genome resource research such as genome sequencing and marker development, so that work such as obtaining genetic gain with molecular markers progresses slowly.
      In order to solve the technical problems, the invention provides the following technical scheme:
      a method for identifying broad bean SNPs by using simplified genome sequencing data, comprising the following steps:
      step one, taking young leaves of broad bean seedlings as the sample, grinding them in liquid nitrogen, and extracting sample genomic DNA;
      step two, constructing a RAD library, digesting the genomic DNA with EcoRI, enriching the library by PCR amplification, recovering the target band from an agarose gel, and performing single-end sequencing;
      step three, generating the raw sequences, performing sequence quality control analysis, removing sequences shorter than 85 bp to obtain screened sequences, clustering the screened sequences according to sequence similarity to generate RAD-tags, clustering the RAD-tags to perform SNP calling, and correcting the SNP genotypes;
      step four, selecting a sequence that covers the SNP site and has a total length of 100 bp, designing KASP primers so that each KASP marker comprises two forward primer sequences for distinguishing the SNP alleles and one universal reverse primer sequence, and performing SNP genotyping to detect the SNP signals.
      Further, the young broad bean leaves in step one are young leaves that have grown for 1 week.
      Further, in the first step, after the sample genomic DNA is extracted, the method further comprises:
      the quality of the extracted sample genomic DNA was checked by agarose gel electrophoresis, and the concentration of the extracted sample genomic DNA was checked using a NanoDrop2000 ultramicro spectrophotometer.
      Further, in step two, after the genomic DNA is digested with EcoRI, an 'A' is added to the 3' end of the digested fragments and an MID adapter is ligated; single-end sequencing is performed on the Illumina HiSeq2000 platform.
      Further, in step three, the raw sequences are generated by the Illumina base-calling software CASAVA v1.8.2, and sequence quality control analysis is performed with Trimmomatic software under default parameters.
      Further, in step three, the screened sequences are clustered according to sequence similarity with ustacks software to generate RAD-tags, the RAD-tags are clustered with cstacks software under default parameters to perform SNP calling, and the SNP genotypes are finally corrected with a Bayesian algorithm.
      Further, in step four, the KASP primers are designed with Kraken™ software.
      Further, in step four, SNP genotyping uses the IntelliQube high-throughput genotyping platform to detect the SNP signals.
      Further, in step four, when SNP signals are detected during SNP genotyping, the single-site reaction volume is 1.6 μL, of which the sample DNA is 0.8 μL and the mixture of 2× Master mix and Primer mix is 0.8 μL.
      Further, in step four, when SNP signals are detected during SNP genotyping, the PCR amplification program is: 95 ℃ for 15 min; 10 cycles of 94 ℃ for 20 s and 61-55 ℃ for 60 s; and 26 cycles of 94 ℃ for 20 s and 55 ℃ for 60 s.
      The technical scheme provided by the invention has the beneficial effects that at least:
      the invention provides a method for identifying broad bean SNPs by using simplified genome sequencing data, which differs from the prior-art approach of mining SNPs from transcriptome sequencing data. The SNPs identified by the method can provide a powerful genetic tool for broad bean germplasm identification, gene mapping and molecular breeding.
    Drawings
      In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
      FIG. 1 is a schematic diagram of 4 amplification signals of KASP markers provided by an embodiment of the present invention.
    Detailed Description
      In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
      The embodiment provides a method for identifying broad bean SNP by using simplified genome sequencing data, which comprises the steps of carrying out simplified genome sequencing on broad bean varieties by using RAD-Seq technology, identifying broad bean SNP in a whole genome range by using a reference genome-independent SNP identification technology and analyzing the characteristics of the broad bean SNP.
      The broad bean germplasm used in this embodiment was collected and preserved by the Agriculture and Forestry Scientific Research Institute of Lishui City; 8 germplasms (FB017, FB032, FB036, FB056, FB076, FB079, FB080 and FB081) were used for simplified genome sequencing, and another 46 germplasms were used for verifying SNP accuracy.
      The method for identifying the broad bean SNP by using the simplified genome sequencing data comprises the following steps:
      step one, DNA extraction:
      taking young leaves of broad bean seedlings as samples, grinding the young leaves by using liquid nitrogen, and extracting sample genome DNA by using a DNA extraction kit; wherein, the young leaves of the broad beans adopted in the embodiment are young leaves of the broad beans which grow for 1 week; in this embodiment, after extracting the genomic DNA of the sample, the method further comprises: the quality of the extracted sample genomic DNA was checked by agarose gel electrophoresis, and the concentration of the extracted sample genomic DNA was checked using a NanoDrop2000 ultramicro spectrophotometer.
      Step two, library construction and sequencing:
      constructing the RAD library by the method of Baird et al. (Baird NA, Etter PD, Atwood TS, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers [J]. PLoS ONE, 2008, 3: e3376): digesting the genomic DNA with EcoRI, adding an 'A' to the 3' end of the digested fragments, and ligating an MID (multiple identifier) adapter; enriching the library by PCR amplification, recovering the target band from an agarose gel, and performing single-end sequencing on the Illumina HiSeq2000 platform.
      Step three, SNP identification:
      generating the raw sequences with the Illumina base-calling software CASAVA v1.8.2, performing sequence quality control analysis with Trimmomatic software under default parameters, removing sequences shorter than 85 bp to obtain screened sequences, clustering the screened sequences according to sequence similarity to generate RAD-tags, clustering the RAD-tags to perform SNP calling, and correcting the SNP genotypes;
      specifically, the screened sequences are clustered, SNPs are called and SNP genotypes are corrected as follows:
      clustering the screened sequences with ustacks software to generate RAD-tags; following the method of Xu et al. (Xu P, Xu S, Wu X, et al. Population genomic analyses from low-coverage RAD-Seq data: a case study on the non-model cucurbit bottle gourd. Plant J, 2014, 77: 430-442) for identifying genome-wide SNPs from RAD data without a reference genome, clustering the RAD-tags with cstacks software under default parameters to perform SNP calling; and finally correcting the SNP genotypes with a Bayesian algorithm (Hohenlohe PA, Bassham S, Etter PD, et al. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. PLoS Genet, 2010, 6: e1000862).
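      The clustering and SNP-calling idea in step three can be illustrated with a minimal Python sketch. It is not the ustacks or cstacks algorithm: it assumes equal-length reads, a greedy seed-based grouping, and hypothetical thresholds (`max_mismatch`, `min_allele_count`) chosen only for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def cluster_reads(reads, max_mismatch=2):
    """Greedily group equal-length reads into stacks (putative RAD-tags)."""
    stacks = []  # each stack is a list of reads; stack[0] is its seed
    # Visit the most abundant sequences first so that they become the seeds.
    for seq, count in Counter(reads).most_common():
        for stack in stacks:
            if len(stack[0]) == len(seq) and hamming(stack[0], seq) <= max_mismatch:
                stack.extend([seq] * count)
                break
        else:
            stacks.append([seq] * count)
    return stacks

def call_snps(stack, min_allele_count=2):
    """Return {position: alleles} for columns with two or more supported bases."""
    snps = {}
    for pos, column in enumerate(zip(*stack)):
        counts = Counter(column)
        alleles = sorted(b for b, n in counts.items() if n >= min_allele_count)
        if len(alleles) >= 2:
            snps[pos] = alleles
    return snps

# Toy reads (hypothetical): two alleles of one tag plus an unrelated tag.
reads = ["ACGTACGT"] * 5 + ["ACGTTCGT"] * 4 + ["GGGGCCCC"] * 3
stacks = cluster_reads(reads)
print(len(stacks))            # number of putative RAD-tags
print(call_snps(stacks[0]))   # candidate SNP positions in the first stack
```

      Requiring a minimum count per allele is one simple way to keep isolated sequencing errors from being reported as SNPs; the Bayesian correction described above addresses the same problem more rigorously.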
      Step four, KASP marker development and genotyping:
      selecting a sequence that covers the SNP site and has a total length of about 100 bp, and designing KASP primers with Kraken™ software so that each KASP marker comprises three primer sequences, namely two forward primer sequences for distinguishing the SNP alleles and one universal reverse primer sequence, and performing SNP genotyping to detect the SNP signals. SNP genotyping was carried out in the public laboratory of the Zhejiang Academy of Agricultural Sciences, and the SNP signals were detected on the IntelliQube high-throughput genotyping platform. The single-site reaction volume was 1.6 μL, of which the sample DNA was 0.8 μL and the mixture of 2× Master mix and Primer mix was 0.8 μL. The PCR amplification program was: 95 ℃ for 15 min; 10 cycles of 94 ℃ for 20 s and 61-55 ℃ for 60 s (touchdown PCR, decreasing by 0.6 ℃ per cycle); and 26 cycles of 94 ℃ for 20 s and 55 ℃ for 60 s. The results were analysed with the software supplied with the IntelliQube platform.
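      The touchdown program can be written out cycle by cycle. The sketch below is only an illustration; it assumes that the first touchdown cycle anneals at 61 ℃ and that the temperature drops by 0.6 ℃ in each subsequent cycle, and the function name is hypothetical.

```python
def kasp_touchdown_program(start_anneal=61.0, step=0.6, touchdown_cycles=10,
                           final_anneal=55.0, final_cycles=26):
    """Return the thermal program as a list of (temperature in C, seconds) steps."""
    program = [(95.0, 15 * 60)]                      # initial 15 min at 95 C
    for cycle in range(touchdown_cycles):
        # Assumption: cycle 0 anneals at start_anneal, then drops by `step` per cycle.
        anneal = round(start_anneal - step * cycle, 1)
        program += [(94.0, 20), (anneal, 60)]
    for _ in range(final_cycles):
        program += [(94.0, 20), (final_anneal, 60)]
    return program

program = kasp_touchdown_program()
# Annealing temperatures of the ten touchdown cycles.
touchdown_anneals = [t for t, s in program[1:21] if s == 60]
print(touchdown_anneals)
```

      Under this assumption the tenth touchdown cycle anneals at 55.6 ℃, so the following 26 cycles at 55 ℃ continue the same 0.6 ℃ step.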
      Further, in another possible embodiment, the implementation process of the second step is as follows:
      1) The specific experimental steps are as follows:
      1. constructing the library from an initial amount of 1 μg of DNA;
      2. shearing the DNA to 300-500 bp by ultrasonication with a Covaris M220;
      3. adding an A to the 3' end and ligating index adapters (TruSeq™ Nano DNA Sample Prep Kit);
      4. enriching the library by 8 cycles of PCR amplification;
      5. recovering the target band from a 2% agarose gel (Certified Low Range Ultra Agarose);
      6. quantifying with TBS380 (PicoGreen), then pooling and loading according to the data proportion;
      7. performing bridge PCR amplification on the cBot to generate clusters;
      8. performing 2 × 150 bp sequencing on the Illumina HiSeq platform.
      2) And (3) biological information analysis flow:
      the reads obtained by sequencing are aligned to a reference genome sequence with BWA software, and reads arising from PCR duplication are then removed with Picard-tools. The sequencing depth and coverage relative to the reference genome are then calculated from the alignment results. SNPs and small InDels are detected with the GATK software package, and structural variants (SV) are identified with Breakdancer-1.1.2 software.
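      As an illustration of the depth and coverage calculation, the following sketch works on plain alignment intervals rather than on BAM files. The interval convention (0-based, half-open) and the function name are assumptions made for the example, not part of the described pipeline.

```python
def depth_and_coverage(alignments, reference_length):
    """alignments: iterable of (start, end) 0-based half-open intervals on one reference.

    Returns (mean depth over the whole reference, fraction of bases covered >= 1x).
    """
    depth = [0] * reference_length
    for start, end in alignments:
        for pos in range(max(start, 0), min(end, reference_length)):
            depth[pos] += 1
    mean_depth = sum(depth) / reference_length
    coverage = sum(d > 0 for d in depth) / reference_length
    return mean_depth, coverage

# Three toy alignments on a 100 bp reference (hypothetical values).
mean_depth, coverage = depth_and_coverage([(0, 50), (25, 75), (25, 75)], 100)
print(mean_depth, coverage)
```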
      3) Raw sequencing data illustrates:
      raw image data obtained by Illumina sequencing is converted into sequence data through Base Calling, and the result is stored in a FASTQ file format. The FASTQ file is the most primitive data file, and contains sequence information of sequencing reads as well as sequencing quality information. The FASTQ file format is as follows:
      @K00169:186:HM5C2CCXX:6:1101:8136:2962 1:N:0:CTGGCATA
      CCACTCATAATCCAGCAAATACTAAATCTGCTGCAGGAAAAGAAATGCGGTTGAGCTTAAATAGCCCAG
      +
      AFFKKFKKFFKKKKFKAFKKAAKFAFFKKFKKFFKKKKFKAFKKAAKFAFFKKFKKFFKKKKFKAFKKAA
      each read occupies 4 lines: the first and third lines carry the read name and ID (the first line starts with "@" and the third line starts with "+"; the ID may be omitted from the third line, but the "+" may not), the second line is the base sequence of the read, and the fourth line gives the sequencing quality value of each base in the second line. To facilitate the storage and sharing of high-throughput sequencing data generated by different laboratories, the NCBI data center has built a large database, SRA (Sequence Read Archive, http://www.ncbi.nlm.nih.gov/Traces/SRA), for storing and sharing raw sequencing data. Raw data volume statistics are shown in Table 1:
      TABLE 1 Raw sequencing data
| Sample | Raw reads | Raw bases | Q20 (%) | Q30 (%) |
| FB017-0911-1 | 34620356 | 5089192332 | 94.85 | 88.84 |
| FB032-09016-3 | 35246208 | 5181192576 | 95.16 | 89.36 |
| FB036-09019-1 | 35445202 | 5210444694 | 95.11 | 89.36 |
| FB056-09031 | 35092390 | 5158581330 | 95.10 | 89.31 |
| FB076 | 35898166 | 5277030402 | 95.02 | 89.21 |
| FB079 | 36945126 | 5467878648 | 95.36 | 89.77 |
| FB080 | 29994288 | 4439154624 | 95.05 | 89.26 |
| FB081 | 35299488 | 5224324224 | 95.53 | 90.09 |
      Sample: the sample name;
      Raw reads: the number of sequencing reads in each file, counted from the raw sequence data with four lines as one unit;
      Raw bases: the number of sequencing reads multiplied by the read length;
      Q20, Q30: the percentage of all bases whose Phred quality value is greater than 20 and 30, respectively;
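      The four statistics can be computed directly from a FASTQ file. The sketch below is a minimal illustration; it assumes Phred+33 quality encoding and counts bases with a quality value of at least 20 or 30 (the usual convention), and the function name is hypothetical.

```python
import io

def fastq_stats(handle, phred_offset=33):
    """Compute raw reads, raw bases, Q20 (%) and Q30 (%) from a FASTQ stream."""
    reads = bases = q20 = q30 = 0
    while True:
        header = handle.readline()
        if not header:
            break                               # end of file
        if not header.strip():
            continue                            # skip stray blank lines
        seq = handle.readline().strip()
        handle.readline()                       # '+' separator line
        qual = handle.readline().strip()
        reads += 1                              # four lines make one read
        bases += len(seq)
        for ch in qual:
            q = ord(ch) - phred_offset          # assumption: Phred+33 encoding
            q20 += q >= 20
            q30 += q >= 30
    return {
        "raw_reads": reads,
        "raw_bases": bases,
        "Q20": 100.0 * q20 / bases if bases else 0.0,
        "Q30": 100.0 * q30 / bases if bases else 0.0,
    }

# Two toy records (hypothetical, not taken from the sequencing data).
example = io.StringIO("@read1\nACGT\n+\nII5#\n@read2\nACGTAC\n+\nIIIIII\n")
stats = fastq_stats(example)
print(stats)
```

      The same relation can be read off Table 1: for FB017-0911-1, 5089192332 raw bases divided by 34620356 raw reads gives a read length of 147 bp.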
      4) quality control of original sequencing data:
      Illumina sequencing is a second-generation sequencing technology in which a single run can generate billions of reads, so the quality of each read cannot be inspected one by one. Instead, statistics on the base distribution and quality fluctuation at each sequencing cycle across all reads give an intuitive overall picture of the sequencing quality and library construction quality of a sample.
      The raw Illumina HiSeq sequencing data contain sequencing adapter sequences, low-quality reads, sequences with a high proportion of N and sequences that are too short, all of which seriously affect the quality of subsequent assembly. To ensure the accuracy of the subsequent bioinformatic analysis, the raw sequencing data are first filtered to obtain high-quality sequencing data (clean data). The steps, in order, are: removing adapter sequences from the reads, and removing reads without inserts caused by adapter self-ligation and the like; trimming low-quality bases (quality value below 20) from the 3' end of each sequence, then discarding the whole sequence if the remaining sequence still contains a base with a quality value below 10, and otherwise keeping it; removing reads whose N content exceeds 10%; and discarding sequences shorter than 75 bp after adapter removal and quality trimming.
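      The trimming and filtering rules can be sketched for a single read as follows. This is an illustration only: adapter removal is assumed to have been done beforehand, quality values are assumed to be already decoded to integers, and the function and parameter names are hypothetical.

```python
def filter_read(seq, quals, trim_q=20, min_q=10, max_n_ratio=0.10, min_len=75):
    """Apply the trimming and filtering rules to one read.

    seq: base string; quals: list of integer Phred scores, same length as seq.
    Returns the surviving (seq, quals) pair, or None if the read is discarded.
    """
    # Trim bases with quality below trim_q from the 3' end.
    end = len(seq)
    while end > 0 and quals[end - 1] < trim_q:
        end -= 1
    seq, quals = seq[:end], quals[:end]

    # The remaining checks are independent discard rules, so their order
    # does not change which reads survive.
    if len(seq) < min_len:                       # too short after trimming
        return None
    if any(q < min_q for q in quals):            # a very low-quality base remains
        return None
    if seq.upper().count("N") / len(seq) > max_n_ratio:
        return None                              # too many undetermined bases
    return seq, quals

good = filter_read("A" * 80 + "CC", [35] * 80 + [12, 8])
bad = filter_read("A" * 40 + "N" * 40, [35] * 80)
print(good is not None, bad is None)
```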
      Further, in another possible embodiment, the SNP identification process in step three above is as follows:
      taking into account factors relating to data characteristics, sequencing quality and the experiment, a Bayesian model (GATK UnifiedGenotyper) is used to calculate the probability of each possible genotype given the observed data. The genotype with the highest probability is selected as the genotype of the sequenced individual at that site, a quality value reflecting the accuracy of the genotype is given, and a consensus sequence is obtained. Based on the consensus sequence, sites that are polymorphic relative to the reference sequence are screened and filtered.
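      The principle of choosing the most probable genotype can be shown with a deliberately simplified model. The sketch below is not the GATK implementation: it assumes a diploid biallelic site, a single per-base error rate, and uniform genotype priors, all of which are illustrative assumptions.

```python
import math

def genotype_posteriors(ref_count, alt_count, error_rate=0.01, priors=None):
    """Posterior probabilities of the genotypes ref/ref, ref/alt and alt/alt."""
    priors = priors or {"ref/ref": 1 / 3, "ref/alt": 1 / 3, "alt/alt": 1 / 3}
    # Probability that a single read shows the alt base under each genotype.
    p_alt = {"ref/ref": error_rate, "ref/alt": 0.5, "alt/alt": 1 - error_rate}
    log_post = {
        g: math.log(priors[g])
           + alt_count * math.log(p_alt[g])
           + ref_count * math.log(1 - p_alt[g])
        for g in priors
    }
    # Normalise in log space to avoid underflow at high depth.
    top = max(log_post.values())
    weights = {g: math.exp(v - top) for g, v in log_post.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

def call_genotype(ref_count, alt_count, **kwargs):
    """Return the most probable genotype and its posterior probability."""
    post = genotype_posteriors(ref_count, alt_count, **kwargs)
    best = max(post, key=post.get)
    return best, post[best]

print(call_genotype(12, 0))
print(call_genotype(6, 5))
print(call_genotype(0, 9))
```

      The posterior of the chosen genotype plays the role of the quality value mentioned above: the closer it is to 1, the more reliable the call.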
      The method mainly comprises the following steps:
      1. converting the SAM file into a BAM file and sorting the BAM file;
      2. marking PCR duplicates and removing the duplicate reads;
      3. filtering out aligned reads with a mapping quality (MAPQ) lower than 10, and indexing;
      4. local realignment around INDELs;
      5. calling SNPs and INDELs using GATK;
      6. filtering the variant results to obtain high-accuracy variants;
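Steps 2 and 3 of the list above amount to a simple per-record filter. The following sketch works on text SAM lines and is only an illustration, since in practice these steps are run with dedicated tools on BAM files. It relies on two facts from the SAM format: the FLAG bit 0x400 marks a PCR or optical duplicate, and MAPQ is the fifth column. The function name `filter_alignments` is hypothetical.

```python
from typing import Iterable, Iterator

DUPLICATE_FLAG = 0x400  # SAM FLAG bit for PCR or optical duplicates


def filter_alignments(lines: Iterable[str], min_mapq: int = 10) -> Iterator[str]:
    """Yield header lines and alignments that are not duplicates and have MAPQ >= min_mapq."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: FLAG
        mapq = int(fields[4])   # column 5: MAPQ
        if flag & DUPLICATE_FLAG:
            continue            # drop reads already marked as duplicates
        if mapq < min_mapq:
            continue            # drop low mapping quality alignments
        yield line
```

The sketch assumes duplicates were already marked by an upstream tool; it does not detect duplicates itself.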
      the statistical format for SNP identification is shown in Table 2:
      TABLE 2 SNP identification statistical Format
      | type | FB017-0911-1 | FB032-09016-3 | FB036-09019-1 | …… | 
| all-snp | 7915 | 10227 | 9896 | …… | 
| hom | 5259 | 6859 | 6676 | …… | 
| het | 2656 | 3368 | 3220 | …… | 
| all-indel | 302 | 368 | 385 | …… | 
| deletion | 144 | 183 | 182 | …… | 
| insertion | 158 | 185 | 203 | …… | 
hom denotes a homozygous mutation, for example A->T; het denotes a heterozygous mutation, for example A->A/T; insertion and deletion denote insertion mutations and deletion mutations, respectively.
      Genome-wide SNP mutations can be divided into 6 classes. Taking T: A > C: G as an example, this type of SNP mutation includes T > C and A > G. Since the sequencing data aligns to both the positive and negative strands of the reference genome, when a T > C type mutation occurs on the positive strand of the reference genome, an A > G type mutation is at the same position on the negative strand of the reference genome, and thus T > C and A > G are divided into one class.
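The strand symmetry described above, which collapses the 12 possible substitutions into 6 classes, can be expressed compactly. This is an illustrative sketch; the function name `snp_class` and the choice of T or C as the canonical reference base are assumptions made here to match the "T:A>C:G" notation.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def snp_class(ref: str, alt: str) -> str:
    """Map a substitution to one of the 6 strand-symmetric classes, e.g. 'T:A>C:G'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    # If the reference base is a purine (A or G), describe the same event
    # on the opposite strand so that the reference base is always T or C.
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"
```

Both `snp_class("T", "C")` and `snp_class("A", "G")` return `"T:A>C:G"`, which is the grouping described in the paragraph above.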
      SNP annotation: ANNOVAR is an efficient software tool that can functionally annotate genetic variants detected in multiple genomes using up-to-date information. ANNOVAR can annotate a variant given the chromosome on which it is located, its start position, its end position, the reference nucleotide and the variant nucleotide. In view of ANNOVAR's powerful annotation function and international acceptance, it was used here to annotate the SNP detection results. The statistics of SNP annotation results are shown in Table 3, and the statistics of small indel annotation results are shown in Table 4:
      TABLE 3 SNP annotation results
      | type | FB017-0911-1 | FB032-09016-3 | FB036-09019-1 | …… | 
| UTR3 | 111 | 120 | 133 | …… | 
| UTR5 | 132 | 129 | 148 | …… | 
| downstream | 190 | 204 | 287 | …… | 
| exonic | 2207 | 3222 | 2976 | …… | 
| exonic;splicing | 0 | 0 | 0 | …… | 
| intergenic | 3735 | 4761 | 4688 | …… | 
| intronic | 1227 | 1469 | 1299 | …… | 
| splicing | 9 | 8 | 7 | …… | 
| upstream | 241 | 254 | 290 | …… | 
TABLE 4 results of small indel annotation
      | type | FB017-0911-1 | FB032-09016-3 | FB036-09019-1 | …… | 
| UTR3 | 10 | 13 | 15 | …… | 
| UTR5 | 17 | 9 | 11 | …… | 
| downstream | 13 | 13 | 20 | …… | 
| exonic | 37 | 48 | 47 | …… | 
| intergenic | 136 | 172 | 181 | …… | 
| intronic | 60 | 80 | 71 | …… | 
| splicing | 2 | 3 | 4 | …… | 
| upstream | 23 | 26 | 32 | …… | 
For a detailed description of the categories in the above tables, see:
      http://www.openbioinformatics.org/annovar/annovar_gene.html
      Sample: sample name.
      Upstream: the variant is located in the 1 kb region upstream of a gene.
      Exonic: the variant is located in an exon region; missense: non-synonymous variant; stop gain: a variant that causes the gene to gain a stop codon; stop loss: a variant that causes the gene to lose a stop codon; synonymous: synonymous variant.
      Intronic: the variant is located in an intron region.
      Splicing: the variant is located at a splice site (within 2 bp of the exon/intron boundary, inside the intron).
      Downstream: the variant is located in the 1 kb region downstream of a gene.
      Upstream;downstream: the variant is located in the 1 kb region upstream of one gene and, at the same time, in the 1 kb region downstream of another gene.
      Intergenic: the variant is located in an intergenic region.
      For the SNP and small indel sites in the CDS region, the effect of the mutation site on protein translation will be annotated. The statistics of the results (SNPs) of the effect of the mutated site of the CDS region on protein translation are shown in table 5:
      TABLE 5 results of the influence of the mutated site of the CDS region on protein translation
      | type | FB017-0911-1 | FB032-09016-3 | FB036-09019-1 | …… | 
| nonsynonymous SNV | 888 | 1292 | 1201 | …… | 
| stopgain SNV | 23 | 33 | 24 | …… | 
| stoploss SNV | 2 | 3 | 3 | …… | 
| synonymous SNV | 1294 | 1894 | 1748 | …… | 
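
The four categories in Table 5 follow from comparing the amino acid encoded before and after the substitution. The sketch below is a simplified illustration of that comparison and not ANNOVAR's implementation; it uses the standard genetic code, and the function name `snv_effect` and its arguments are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def snv_effect(ref_codon: str, pos: int, alt_base: str) -> str:
    """Classify a single-base change at 0-based position `pos` of `ref_codon`."""
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:pos] + alt_base.upper() + ref_codon[pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous SNV"
    if alt_aa == "*":
        return "stopgain SNV"      # a sense codon became a stop codon
    if ref_aa == "*":
        return "stoploss SNV"      # a stop codon became a sense codon
    return "nonsynonymous SNV"
```

For example, changing TAT to TAA introduces a stop codon and is classified as a stopgain SNV, while changing CTT to CTC leaves leucine unchanged and is synonymous.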
Statistics of the effect of mutation sites in the CDS region on protein translation (small indels) are shown in Table 6, and the meanings of the degenerate bases are shown in Table 7:
      TABLE 6 Results of the influence of mutation sites in the CDS region on protein translation (small indel)
      | type | FB017-0911-1 | FB032-09016-3 | FB036-09019-1 | …… | 
| frameshift deletion | 10 | 19 | 14 | …… | 
| frameshift insertion | 15 | 18 | 19 | …… | 
| nonframeshift deletion | 4 | 3 | 6 | …… | 
| nonframeshift insertion | 8 | 8 | 7 | …… | 
| stopgain | 0 | 0 | 1 | …… | 
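
Whether a small indel in a coding region shifts the reading frame depends only on whether its length is a multiple of 3. The following sketch illustrates that rule; it does not cover the stopgain category, which requires translating the altered sequence, and the function name `indel_effect` is hypothetical.

```python
def indel_effect(ref: str, alt: str) -> str:
    """Classify a coding indel from its reference and alternate alleles."""
    diff = len(alt) - len(ref)
    if diff == 0:
        raise ValueError("alleles have equal length; not an indel")
    kind = "insertion" if diff > 0 else "deletion"
    # A length change that is a multiple of 3 adds or removes whole codons
    # and keeps the downstream reading frame intact.
    frame = "nonframeshift" if abs(diff) % 3 == 0 else "frameshift"
    return f"{frame} {kind}"
```

For instance, deleting one base gives a frameshift deletion, while inserting three bases gives a nonframeshift insertion.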
TABLE 7 base meanings
      | Degenerate/mixed bases | A+C+G | V | 
| Degenerate/mixed bases | A+T+G | D | 
| Degenerate/mixed bases | T+C+G | B | 
| Degenerate/mixed bases | A+T+C | H | 
| Degenerate/mixed bases | A+T | W | 
| Degenerate/mixed bases | C+G | S | 
| Degenerate/mixed bases | T+G | K | 
| Degenerate/mixed bases | A+C | M | 
| Degenerate/mixed bases | C+T | Y | 
| Degenerate/mixed bases | A+G | R | 
| Degenerate/mixed bases | A+G+C+T | N | 
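
The degenerate base codes in Table 7 are the standard IUPAC nucleotide codes, and they can be used to write a heterozygous genotype as a single character. The lookup below is a small illustration; the names `IUPAC`, `expand` and `code_for` are chosen here for the example.

```python
# Degenerate code -> set of bases it represents (as listed in Table 7).
IUPAC = {
    "V": "ACG", "D": "AGT", "B": "CGT", "H": "ACT",
    "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "Y": "CT", "R": "AG", "N": "ACGT",
}
# Reverse lookup: set of bases -> degenerate code.
_BY_BASES = {frozenset(bases): code for code, bases in IUPAC.items()}


def expand(code: str) -> str:
    """Return the bases represented by a code; plain bases map to themselves."""
    code = code.upper()
    return IUPAC.get(code, code)


def code_for(*bases: str) -> str:
    """Return the single-letter code for one or more bases, e.g. A and G -> R."""
    unique = frozenset(b.upper() for b in bases)
    if len(unique) == 1:
        return next(iter(unique))
    return _BY_BASES[unique]
```

A heterozygous A/G site is therefore written as `R`, and a heterozygous A/T site as `W`.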
Further, to illustrate the feasibility of the method of the present invention for identifying SNP in faba beans using simplified genomic sequencing data, the results of the method were statistically analyzed as follows:
      1) sequencing data statistics:
      In this example, 8 broad bean germplasms were sequenced on the Illumina HiSeq sequencing platform. A total of 35.47 Gb of data was obtained, comprising 245443516 reads with an average length of 144 bp. Among the 8 germplasms, the minimum amount of sequencing data was 3.83 Gb, the maximum was 4.77 Gb, and the average was 4.43 Gb; the minimum number of reads was 26415662, the maximum was 32822210, and the average was 30680439.5; Q20 and Q30 were above 97.89% and 93.83%, respectively, and the GC content ranged from 38.05% to 40.09%. Statistics of the sequencing data of the 8 germplasms are shown in Table 8:
      TABLE 8 Sequencing data of the 8 germplasms
      
      2) Broad bean whole genome SNP identification:
      In this example, a total of 3722 SNPs were identified using the special Bayesian algorithm for species without a reference genome, following the method used for identifying SNPs in bottle gourd; the statistics of SNP identification information in the 8 materials are shown in Table 9:
      TABLE 9 SNP identification information
      
      Among individual germplasms, FB076 had the fewest identified SNPs (3278) and FB079 had the most (3578). Across the 8 germplasms, the number of homozygous SNP mutations ranged from 1579 to 2033 and the number of heterozygous SNP mutations ranged from 1245 to 1804; except in FB080 and FB056, the number of homozygous SNP mutations was greater than the number of heterozygous SNP mutations (Table 9).
      Of the 6 SNP mutation types, T:A>C:G accounted for the largest proportion (average 38.8%), followed by C:G>T:A (average 28.0%), and T:A>A:T accounted for the smallest proportion (average 7.50%). The statistics of SNP mutation patterns are shown in Table 10:
      TABLE 10 SNP mutation patterns
      
      3) SNP validation
      To verify the validity of the SNPs, the SNPs were filtered according to the criteria of missing rate less than or equal to 20%, minor allele frequency (MAF) greater than or equal to 0.05, and SNP occurrence frequency greater than or equal to 40, and 56 SNPs were then selected for KASP marker development. Finally, 31 SNPs were converted into KASP markers, a conversion success rate of 55.3%. The 31 pairs of KASP markers developed in this example are shown in Table 11:
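The missing-rate and MAF criteria above can be computed per SNP from the genotype calls. The sketch below is illustrative only: genotypes are assumed to be strings such as "A/T" with `None` for a missing call, the third criterion (occurrence frequency) is not modelled because its exact definition is not given in the text, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple


def missing_rate_and_maf(genotypes: List[Optional[str]]) -> Tuple[float, float]:
    """Return (missing rate, minor allele frequency) for one SNP."""
    if not genotypes:
        raise ValueError("no genotypes given")
    called = [g for g in genotypes if g is not None]
    missing = 1 - len(called) / len(genotypes)
    # Count alleles over all called diploid genotypes.
    alleles = Counter(a for g in called for a in g.split("/"))
    if len(alleles) < 2:
        return missing, 0.0   # monomorphic or entirely missing
    # MAF is taken here as the frequency of the second most common allele.
    maf = alleles.most_common(2)[1][1] / sum(alleles.values())
    return missing, maf


def passes_filter(
    genotypes: List[Optional[str]],
    max_missing: float = 0.20,
    min_maf: float = 0.05,
) -> bool:
    """Apply the missing-rate and MAF thresholds stated in the text."""
    missing, maf = missing_rate_and_maf(genotypes)
    return missing <= max_missing and maf >= min_maf
```

A site with eight A/A calls, one A/T call and one missing call has a missing rate of 10% and a MAF of 1/18, so it passes both thresholds.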
      TABLE 11 The 31 pairs of KASP markers
      
      
      
      Genotyping of 46 broad bean germplasm resources with the 31 pairs of KASP markers showed that 22 pairs of markers gave successful amplification signals, of which 14 pairs detected a single genotype signal, 4 pairs showed 2 genotype signals, and 4 pairs showed 3 genotype signals, as shown in FIG. 1. FIG. 1 shows the 4 types of amplification signal obtained with the KASP markers in this example: A, failed amplification; B, single genotype; C, 2 genotypes; D, 3 genotypes.
      With the rapid decrease in sequencing costs, genome re-sequencing has been widely used to identify SNPs at the genome-wide scale in a variety of crops. Because the broad bean genome is large and no reference genome is currently available, SNP (single nucleotide polymorphism) mining in broad bean lags behind that in other leguminous crops such as soybean, kidney bean and cowpea.
      In this example, RAD-Seq data from 8 germplasms were used to identify 3722 SNP markers. Unlike Ocana et al. (Ocana S, Seoane P, Bautista R et al. Large-Scale Transcriptome Analysis in Faba Bean (Vicia faba L.) under Ascochyta fabae Infection. PLoS ONE, 2015, 10(8): e013514) and Webb et al. (Webb A, Cottage A, Wood T, et al. A SNP-based consensus genetic map for synteny-based trait targeting in faba bean (Vicia faba L.) [J]. Plant Biotechnology Journal, 2016, 14: 177-185), who mined SNPs from transcriptome sequencing data, this example mines SNPs from genome sequencing data. As a result, SNP mutations can be identified not only in expressed gene regions but also in non-coding regions within genes and between genes, so the sources of SNPs are richer.
      The foregoing is directed to the preferred embodiment of the present invention and it is noted that while the preferred embodiment of the present invention has been described, numerous modifications and adaptations may be made by those skilled in the art without departing from the principles of the invention and without departing from the scope of the invention. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
    Claims (10)
1. A method for identifying broad bean SNP by using simplified genome sequencing data is characterized by comprising the following steps:
      step one, taking young leaves of broad bean seedlings as a sample, and grinding the young leaves in liquid nitrogen to extract sample genomic DNA;
      step two, constructing a RAD library: digesting the genomic DNA with EcoRI, enriching the library by PCR amplification, recovering the target band from an agarose gel, and performing single-end sequencing;
      step three, generating raw sequences, performing sequence quality control analysis, removing sequences shorter than 85 bp to obtain screened sequences, clustering the screened sequences according to sequence similarity to generate RAD-tags, clustering the RAD-tags to perform SNP calling, and correcting the SNP genotypes;
      step four, selecting a sequence that covers the SNP locus and has a total length of 100 bp, designing KASP primers such that each KASP primer marker comprises two forward primer sequences for distinguishing the SNP alleles and one universal reverse primer sequence, and performing SNP genotyping for SNP signal detection.
    2. The method for identifying SNP of broad beans by using simplified genome sequencing data as set forth in claim 1, wherein the young leaves of broad bean seedlings in the first step are young leaves of broad bean seedlings which grow for 1 week.
    3. The method for identifying faba bean SNPs using simplified genomic sequencing data according to claim 1, wherein the step one, after extracting genomic DNA from the sample, further comprises:
      the quality of the extracted sample genomic DNA was checked by agarose gel electrophoresis, and the concentration of the extracted sample genomic DNA was checked using a NanoDrop2000 ultramicro spectrophotometer.
    4. The method for identifying SNP of broad beans by using simplified genome sequencing data as set forth in claim 1, wherein in the second step, when the genomic DNA is digested by EcoRI, the 3' end of the digested fragment is treated by adding ' A ' to connect with MID linker; single-ended sequencing used the Illumina HiSeq2000 platform.
    5. The method for identifying broad bean SNPs using simplified genomic sequencing data as claimed in claim 1, wherein in step three, the raw sequences are generated by the Illumina base calling software CASAVA v1.8.2, and sequence quality control analysis is performed using Trimmomatic software under default parameters.
    6. The method for identifying SNP in broad beans by using simplified genome sequencing data as set forth in claim 1, wherein in the third step, the screened sequences are clustered by using ustacks software according to sequence similarity to generate RAD-tags, the RAD-tags are clustered by using cstacks software under default parameters to perform SNP calling, and finally the SNP genotype is corrected by using a Bayesian algorithm.
    7. The method for identifying faba bean SNPs using simplified genomic sequencing data as claimed in claim 1, wherein in step four, the KASP primers are designed using Kraken™ software.
    8. The method for identifying faba bean SNPs using simplified genomic sequencing data as claimed in claim 1, wherein in step four, SNP genotyping is performed using IntelliQube high throughput genotyping detection platform for SNP signal detection.
    9. The method for identifying faba bean SNPs using simplified genomic sequencing data according to claim 1, wherein in the fourth step, when SNP genotyping is performed, the single-spot reaction volume is 1.6 μ L, wherein the sample DNA is 0.8 μ L, and the mixed volume of the 2xMaster mix and the Primer mix is 0.8 μ L.
    10. The method for identifying SNP in broad beans according to claim 1, wherein in the fourth step, when SNP genotyping is performed for SNP signal detection, the PCR amplification program is: 95 ℃ for 15 min; 10 cycles of 94 ℃ for 20 s and 61-55 ℃ for 60 s; and 26 cycles of 94 ℃ for 20 s and 55 ℃ for 60 s.
    Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202011310367.5A CN112553361A (en) | 2020-11-20 | 2020-11-20 | Method for identifying SNP (single nucleotide polymorphism) of broad beans by using simplified genome sequencing data | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| CN112553361A true CN112553361A (en) | 2021-03-26 | 
Family
ID=75044213
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN202011310367.5A Pending CN112553361A (en) | 2020-11-20 | 2020-11-20 | Method for identifying SNP (single nucleotide polymorphism) of broad beans by using simplified genome sequencing data | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN112553361A (en) | 
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US20060041955A1 (en) * | 2004-08-23 | 2006-02-23 | Pioneer Hi-Bred International, Inc. | Marker mapping and resistance gene associations in soybean | 
| CN101701255A (en) * | 2009-11-13 | 2010-05-05 | 中国检验检疫科学研究院 | Primers and method for identification of broad bean by PCR | 
| US20150322447A1 (en) * | 2012-07-06 | 2015-11-12 | Bayer Cropscience Nv | Soybean rod1 gene sequences and uses thereof | 
| CN106755328A (en) * | 2016-11-25 | 2017-05-31 | 中国农业科学院作物科学研究所 | A kind of construction method of broad bean SSR finger-prints | 
| CN110139872A (en) * | 2016-12-21 | 2019-08-16 | 中国农业科学院作物科学研究所 | Plant seed character-related protein, gene, promoter and SNP and haplotype | 
Non-Patent Citations (4)
| Title | 
|---|
| ANNE WEBB et al.: "A SNP-based consensus genetic map for synteny-based trait targeting in faba bean (Vicia faba L.)", PLANT BIOTECHNOLOGY JOURNAL * | 
| ANNE WEBB et al.: "A SNP-based consensus genetic map for synteny-based trait targeting in faba bean (Vicia faba L.)", PLANT BIOTECHNOLOGY JOURNAL, vol. 14, no. 1, 10 April 2015 (2015-04-10), pages 177 - 185 * | 
| LIU TINGFU et al.: "Identification of broad bean SNPs using simplified genome sequencing data", MOLECULAR PLANT BREEDING * | 
| LIU TINGFU et al.: "Identification of broad bean SNPs using simplified genome sequencing data", MOLECULAR PLANT BREEDING, 13 November 2020 (2020-11-13), pages 1 - 9 * | 
Similar Documents
| Publication | Publication Date | Title | 
|---|---|---|
| Yang et al. | Target SSR-Seq: a novel SSR genotyping technology associate with perfect SSRs in genetic analysis of cucumber varieties | |
| Lee et al. | Young inversion with multiple linked QTLs under selection in a hybrid zone | |
| US9976191B2 (en) | Rice whole genome breeding chip and application thereof | |
| CN102747138B (en) | Rice whole genome SNP chip and application thereof | |
| Ryu et al. | Genotyping-by-sequencing based single nucleotide polymorphisms enabled Kompetitive Allele Specific PCR marker development in mutant Rubus genotypes | |
| CN117144040B (en) | Fresh corn genotyping chip and application thereof | |
| CN116004898A (en) | Peanut 40K liquid-phase SNP chip PeannitGBTS 40K and application thereof | |
| Mishra et al. | Analysis of SSR and SNP markers | |
| Gaur et al. | A high-density SNP-based linkage map using genotyping-by-sequencing and its utilization for improved genome assembly of chickpea (Cicer arietinum L.) | |
| CN115992265A (en) | Grouper whole genome liquid phase chip and application thereof | |
| CN110959178B (en) | Systems and methods for targeted genome editing | |
| CN111916151B (en) | Traceability detection method and application of verticillium wilt of alfalfa | |
| CN117457075B (en) | A method for identifying oil-tea camellia varieties | |
| CN118064428A (en) | MNP molecular marker combination and method for constructing DNA fingerprint of rubber tree | |
| CN112553361A (en) | Method for identifying SNP (single nucleotide polymorphism) of broad beans by using simplified genome sequencing data | |
| KR101911307B1 (en) | Method for selecting and utilizing tag-SNP for discriminating haplotype in gene unit | |
| CN116732153A (en) | A method and application for whole-genome genotype identification of very large genome species | |
| CN113718342A (en) | Construction method of high-density genetic map of recombinant inbred line population | |
| Bello et al. | Genetic Diversity Analysis of Selected Sugarcane (Saccharum spp. Hybrids) Varieties Using DArT-Seq Technology. | |
| CN118240948B (en) | Identification method and application of genetic relationship of Litopenaeus vannamei based on targeted sequencing typing | |
| CN116622881B (en) | Tobacco whole genome SNP locus combination, probe, chip and application thereof | |
| Dong et al. | The mutational dynamics of the Arabidopsis centromeres | |
| CN117904317B (en) | SNP molecular marker combination for detecting propagation traits of Nile-Lafei buffalo and application | |
| CN119685521B (en) | Liquid phase chip detection method for leymus chinensis variety identification | |
| Li et al. | An initial exploration of core collection construction and DNA fingerprinting in Elymus sibiricus L. using SNP markers | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210326 |