
CN108728515A - Library construction and sequencing data analysis method for detecting low-frequency ctDNA mutations using the duplex method


Info

Publication number
CN108728515A
Authority
CN
China
Prior art keywords
sequence
single strand
connector
strand dna
ctdna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810585283.9A
Other languages
Chinese (zh)
Inventor
王沛
马兴勇
谭达
杨洁
李光宇
阎海
王思振
王晓月
焦宇辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Genetron Health Gen Technology Co Ltd
Original Assignee
Beijing Genetron Health Gen Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Genetron Health Gen Technology Co Ltd
Priority to CN201810585283.9A
Publication of CN108728515A


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6827: Hybridisation assays for detection of mutation or polymorphism
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a library construction method and a sequencing data analysis method for detecting low-frequency ctDNA mutations using the duplex method. The invention provides a method for constructing a library for detecting low-frequency ctDNA mutations, comprising the following steps in order: (1) subjecting a ctDNA sample to end repair followed by 3' A-tailing; (2) ligating the ctDNA treated in step (1) to an adapter mixture and obtaining the library after PCR amplification. The invention has the following advantages: 1. Barcode tags are used and are reduced to a set of 4 bp sequences, which improves utilization and detection sensitivity; the tags are simple and inexpensive to synthesize and easy to use, and they greatly increase both the adapter ligation efficiency and the proportion of duplex-tagged molecules in the sequencing data. 2. The bioinformatic analysis method used to identify duplexes quickly and effectively removes errors introduced during sequencing, capture, and PCR, reducing false positives.

Description

Library construction and sequencing data analysis method for detecting low-frequency ctDNA mutations using the duplex method
Technical field
The present invention relates to a library construction method and a sequencing data analysis method for detecting low-frequency ctDNA mutations using the duplex method.
Background art
ctDNA (circulating tumor DNA) refers to DNA fragments released by tumor cells into body fluids such as blood and cerebrospinal fluid, and is a characteristic tumor biomarker. Detection of ctDNA variants can be used for early tumor diagnosis, dynamic monitoring of tumor progression and treatment response, drug resistance detection, recurrence risk assessment, and so on. The ctDNA content of body fluids such as plasma is very low, usually less than 1% of total cfDNA, which makes variant detection very difficult.
The technologies currently used for ctDNA mutation detection are mainly ARMS methods (chiefly Super-ARMS), next-generation sequencing (NGS), and digital PCR (dPCR, including BEAMing). Super-ARMS and digital PCR are simple and fast, with relatively high specificity and sensitivity; their drawback is that they can detect only a limited number of known mutations, so throughput is low. NGS has high throughput, places no limit on the number of genes tested, and can detect both known and unknown mutations. However, NGS is technically complex, time-consuming, and difficult to standardize, and false positives can be introduced during both library preparation and sequencing, which affects interpretation of the results.
False positives in NGS results have three main sources: (1) PCR errors accumulated during library preparation; (2) sequencing errors generated during the sequencing run; (3) cross-sample contamination, mainly index misassignment to the wrong sample caused by inherent technical limitations of sequencing platforms such as HiSeq X and NovaSeq, also known as "index hopping".
Duplex technology uses special adapter sequences during library construction to introduce random tag sequences on both sides of the target fragment, thereby labeling the plus and minus strands that derive from the same molecule. Double correction using the plus and minus strands can significantly reduce false positives caused by PCR errors and sequencing errors. The technology has two main problems when applied to ctDNA variant detection: (1) the purity of the adapter sequences is relatively low, so starting molecules are lost during ligation; (2) the proportion of duplexes detected in the sequencing results is very low (about 10%-15%), so most molecules cannot undergo plus/minus-strand correction.
Summary of the invention
The object of the invention is to provide a library construction method and a sequencing data analysis method for detecting low-frequency ctDNA mutations using the duplex method.
The invention provides a method for constructing a library for detecting low-frequency ctDNA mutations, comprising the following steps in order:
(1) subjecting a ctDNA sample to end repair followed by 3' A-tailing;
(2) ligating the ctDNA treated in step (1) to an adapter mixture, and obtaining the library after PCR amplification.
The adapter mixture consists of n adapters.
Each adapter is obtained by forming a partially double-stranded structure from one upstream primer A and one downstream primer A.
Upstream primer A contains barcode tag A; downstream primer A contains barcode tag B.
Barcode tag A and barcode tag B are reverse complements of each other.
Barcode tag A consists of A, T, C, and G arranged in any order.
The n adapters use n different barcode tags A.
n is any natural number ≥ 8.
The first nucleotide at the 3' end of upstream primer A is a modified T; the purpose of the modification is to prevent degradation by exonucleases. The 5' end of downstream primer A is phosphorylated.
Upstream primer A also contains adapter sequence 1, and downstream primer A also contains adapter sequence 2. Adapter sequence 1 and adapter sequence 2 are chosen according to the sequencing platform, and part of each is the reverse complement of part of the other. The n adapters share the same adapter sequence 1 and adapter sequence 2.
The partially double-stranded structure is formed by reverse-complementary pairing of barcode tag A plus part of adapter sequence 1 with barcode tag B plus part of adapter sequence 2.
From the 5' end, upstream primer A consists of adapter sequence 1, barcode tag A, and the base T, in that order.
From the 5' end, downstream primer A consists of barcode tag B followed by adapter sequence 2.
The adapter can be obtained by annealing upstream primer A with downstream primer A.
In the adapter mixture, the adapters are mixed in equimolar amounts.
When the sequencing platform is an Illumina platform, adapter sequence 1 may specifically be nucleotides 1 to 21 from the 5' end of SEQ ID NO: 1 of the sequence listing, and adapter sequence 2 may specifically be nucleotides 5 to 25 from the 5' end of SEQ ID NO: 2 of the sequence listing.
n may specifically be 12.
When n is 12, barcode tags A of the 12 adapters are nucleotides 22 to 25 from the 5' end of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, and 23 of the sequence listing, respectively.
When n is 12, the 12 adapters are as follows:
Adapter 1 is obtained by forming a partially double-stranded structure from the single-stranded DNA molecules shown in SEQ ID NO: 1 and SEQ ID NO: 2 of the sequence listing;
Adapter 2 is obtained in the same way from the single-stranded DNA molecules shown in SEQ ID NO: 3 and SEQ ID NO: 4;
Adapter 3 from those shown in SEQ ID NO: 5 and SEQ ID NO: 6;
Adapter 4 from those shown in SEQ ID NO: 7 and SEQ ID NO: 8;
Adapter 5 from those shown in SEQ ID NO: 9 and SEQ ID NO: 10;
Adapter 6 from those shown in SEQ ID NO: 11 and SEQ ID NO: 12;
Adapter 7 from those shown in SEQ ID NO: 13 and SEQ ID NO: 14;
Adapter 8 from those shown in SEQ ID NO: 15 and SEQ ID NO: 16;
Adapter 9 from those shown in SEQ ID NO: 17 and SEQ ID NO: 18;
Adapter 10 from those shown in SEQ ID NO: 19 and SEQ ID NO: 20;
Adapter 11 from those shown in SEQ ID NO: 21 and SEQ ID NO: 22;
Adapter 12 from those shown in SEQ ID NO: 23 and SEQ ID NO: 24.
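The construction rule for the 12 adapters can be illustrated with a short sketch. It assumes the layout described above (upstream oligo: adapter sequence 1, 4 bp barcode, then T; downstream oligo: reverse-complemented barcode, then adapter sequence 2) and takes the constant sequences and barcodes from sequences 1 to 24 of the sequence listing. The names `reverse_complement` and `build_adapter` are hypothetical and are not part of the patent.

```python
# Sketch only: rebuild the two oligos of each adapter from its 4 bp barcode.
# Constant parts are taken from sequences 1 and 2 of the sequence listing.
ADAPTER_SEQ_1 = "tacacgacgctcttccgatct"  # nucleotides 1-21 of sequence 1
ADAPTER_SEQ_2 = "agatcggaagagcacacgtct"  # sequence 2 from nucleotide 5 onward

# Barcodes: nucleotides 22-25 of sequences 1, 3, 5, ..., 23.
BARCODES = [
    "agct", "atgc", "actg", "tgac", "tcga", "tacg",
    "gatc", "gcat", "gtca", "cagt", "ctag", "cgta",
]

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_adapter(barcode):
    """Return (upstream oligo, downstream oligo) for one barcode."""
    upstream = ADAPTER_SEQ_1 + barcode + "t"  # 3' T pairs with the A tail
    downstream = reverse_complement(barcode) + ADAPTER_SEQ_2
    return upstream, downstream


adapters = [build_adapter(bc) for bc in BARCODES]
```

Under these assumptions `adapters[0]` reproduces sequences 1 and 2 of the sequence listing, and `adapters[11]` reproduces sequences 23 and 24.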
In the method, the primer pair used for the PCR amplification consists of upstream primer B and downstream primer B; downstream primer B contains an index tag sequence.
Upstream primer B may specifically be as shown in SEQ ID NO: 25 of the sequence listing.
From the 5' end, downstream primer B consists of segment A, the index tag sequence, and segment B, in that order. Segment A is shown in SEQ ID NO: 26 of the sequence listing, and segment B is shown in SEQ ID NO: 27.
The method further comprises the step of performing target region capture on the library after PCR amplification.
The target region capture can use an existing commercial library capture kit, and the panel in it can be replaced with any panel containing the target mutations.
The invention also protects a library prepared by any of the methods described above.
The invention also protects the use of any of the methods or libraries described above in detecting target mutations and their mutation frequencies in a ctDNA sample.
The invention also protects a kit for constructing a library for detecting low-frequency ctDNA mutations, comprising any of the adapter mixtures described above.
The kit further comprises materials for library construction, such as reagents for ctDNA extraction, reagents for DNA library construction, reagents for library purification, and reagents for library capture.
The invention also protects a method for detecting target mutations and their mutation frequencies in a ctDNA sample, comprising the following steps:
(1) preparing a library according to any of the methods described above;
(2) sequencing the library to obtain sequencing results, and analyzing the target mutations and their mutation frequencies in the ctDNA sample based on the sequencing results.
The sequencing results are analyzed as follows:
(a) the sequencing results are aligned to the reference genome hg19;
(b) a cluster of reads that share the same start and end positions in the genome and carry the same barcode tags comes from the same strand of the same molecule. The reads within a cluster are compared: at a given position, if the consistency among the reads of the cluster is higher than 80%, the position is valid and the most frequent base is retained as the correct base; if the consistency is lower than 80%, the base is attributed to a sequencing, PCR, or capture error, is marked as N, and is excluded from subsequent variant detection;
(c) two clusters of reads that share the same start and end positions in the genome and whose read1 and read2 barcode tags are swapped come from the plus and minus strands of the same molecule, respectively, and are called a duplex. At a given genomic position, a base on which the duplex reads agree is considered correct, whereas a base on which they disagree is attributed to a sequencing, PCR, or capture error, is marked as N, and is excluded from subsequent variant detection.
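The per-cluster correction in step (b) can be sketched as a column-wise majority vote. This is a minimal illustration, not the applicant's implementation: it assumes the reads of one cluster have already been trimmed to the same aligned coordinates and contain no indels, and it treats exactly 80% agreement as failing, since the text only specifies higher or lower than 80%. The function name `cluster_consensus` is hypothetical.

```python
from collections import Counter


def cluster_consensus(reads, min_agreement=0.8):
    """Collapse equal-length reads of one cluster into one corrected read.

    A position keeps its most frequent base only if that base is found in
    more than `min_agreement` of the reads; otherwise it is masked as N.
    """
    corrected = []
    for column in zip(*reads):  # one tuple of bases per position
        base, count = Counter(column).most_common(1)[0]
        if count / len(column) > min_agreement:
            corrected.append(base)
        else:
            corrected.append("N")
    return "".join(corrected)


# Nine of ten reads agree at the last position, so T is retained.
high_agreement = cluster_consensus(["ACGT"] * 9 + ["ACGA"])
# Only three of five reads agree, so the last position is masked.
low_agreement = cluster_consensus(["ACGT"] * 3 + ["ACGA"] * 2)
```

Masking with N rather than discarding the whole read keeps the remaining positions of the molecule available for variant detection.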
The mutation frequency is calculated from the sequencing data covering the site as: data supporting the mutation / (data supporting the mutation + data supporting the wild type).
In step (a), the alignment of the sequencing results may specifically use the bwa software.
To calculate the mutation frequency, the samtools software can first be used to convert the aligned BAM file to mpileup format, after which the mutation frequency is calculated.
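The frequency formula above amounts to a single ratio per site. The sketch below assumes the supporting counts have already been extracted from the mpileup output; the function and parameter names are hypothetical, and returning `None` for an uncovered site is a design choice made here, not something the text specifies.

```python
def mutation_frequency(mutant_support, wildtype_support):
    """Return mutant / (mutant + wild type), or None if the site is uncovered."""
    total = mutant_support + wildtype_support
    if total == 0:
        return None
    return mutant_support / total


# Illustrative counts only: 2 mutant observations among 2,000 at one site.
example_frequency = mutation_frequency(2, 1998)
```

With these illustrative counts the result is 0.001, that is, the 0.1% level that the text gives as the lowest detectable mutation frequency.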
Any of the ctDNA samples described above may specifically be ctDNA from a human blood sample.
For any of the low-frequency mutations described above, the mutation frequency can be as low as 0.1%.
By adopting the above technical scheme, the invention has the following advantages:
1. Barcode tags are used, which can effectively label and distinguish all ctDNA molecules in the original sample. In general, the amount of ctDNA extracted is only a few tens of nanograms and the insert size distribution is fairly concentrated (Fig. 1); barcode tags can effectively distinguish different ctDNA molecules, improving utilization and detection sensitivity.
2. The barcode tags are reduced from the original 8 bp random tags to 12 fixed 4 bp sequences, which are simple and inexpensive to synthesize and easy to use. In addition, building duplex libraries with this method greatly increases the adapter ligation efficiency (Fig. 2) and the proportion of duplex-tagged molecules in the sequencing data.
3. The bioinformatic method used by the invention to identify duplexes can quickly and effectively remove errors introduced during sequencing, capture, and PCR, reducing false positives.
Description of the drawings
Fig. 1 is the fragment size distribution of ctDNA.
Fig. 2 shows the ligation efficiency of ctDNA molecules.
Fig. 3 shows the duplex rate.
Fig. 4 shows the detection accuracy for SNVs and indels.
Detailed description of the embodiments
The following examples facilitate a better understanding of the invention but do not limit it. Unless otherwise specified, the experimental methods in the following examples are conventional methods. Unless otherwise specified, the test materials used in the following examples were purchased from conventional biochemical reagent suppliers. Quantitative tests in the following examples were performed in triplicate, and the results were averaged.
Example 1. Library construction
I. Quality control of ctDNA samples
The ctDNA samples were quantified using a Qubit nucleic acid quantifier, and the fragment distribution of the ctDNA was examined using an Agilent 2100 Bioanalyzer (Fig. 1) to ensure that there was no genomic DNA contamination.
II. Adapter preparation
1. The 24 single-stranded DNAs shown in Table 1 were synthesized, where F denotes an upstream primer and R denotes a downstream primer. The nucleotide at the 3' end of each upstream primer is thio-modified (this can be replaced with any other modification that prevents exonuclease degradation). The nucleotide at the 5' end of each downstream primer is phosphorylated.
Table 1. Adapter sequence information
In Table 1, the upstream and downstream sequences in each group can be combined into a double-stranded adapter by annealing.
The underlined portion is the 4 bp barcode tag. Within each group, the barcode tags of the upstream and downstream sequences are reverse complements of each other. The barcode tags can be replaced with any 4 bp sequence formed by combining the four bases A, T, C, and G in any order with balanced content;
The T in bold is complementary to the "A" added to the end of the starting molecule and is used for TA ligation;
The sequences that are neither underlined nor in bold are the adapter sequences for the Illumina sequencing platform; if another sequencing platform is used, they can be replaced with the corresponding adapter sequences for that platform;
Table 1 contains 12 groups of adapters in total (i.e., 12 groups of barcode tags), which serve to (1) distinguish different ctDNA molecules and (2) identify the plus and minus strands of a ctDNA molecule. The 12 groups of barcode tags can form 12 × 12 = 144 combinations, which, together with the sequence information of the molecule itself, is sufficient to distinguish all molecules in the original sample. In practice, the number of groups can be increased (at higher synthesis cost) or reduced (with slightly weaker discrimination) as appropriate.
2. The single-stranded DNAs in Table 1 were dissolved and diluted in TE to a final concentration of 100 μM. The two single-stranded DNAs of each group were mixed in equal volumes (total volume not exceeding 100 μl) and annealed (annealing program: 95 °C for 30 min; 25 °C for 2 h) to give 12 groups of DNA solution. The 12 groups of DNA solution were mixed in equimolar amounts to give the adapter mixture (stored at -20 °C).
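The 12 × 12 = 144 barcode combinations mentioned in the notes to Table 1 can be enumerated directly. The sketch below also counts how many 4 bp sequences are "balanced" in the sense that each of A, T, C, and G is used exactly once, which is one reading of the replacement rule stated above. The barcode list is taken from the sequence listing; the variable names are hypothetical.

```python
from itertools import permutations, product

# Barcodes read from nucleotides 22-25 of sequences 1, 3, 5, ..., 23.
BARCODES = [
    "agct", "atgc", "actg", "tgac", "tcga", "tacg",
    "gatc", "gcat", "gtca", "cagt", "ctag", "cgta",
]

# Every 4 bp sequence that uses each of a, c, g, t exactly once.
balanced_barcodes = {"".join(p) for p in permutations("acgt")}

# A ligated fragment carries one barcode at each end, so the label of a
# molecule is an ordered pair of barcodes.
barcode_pairs = list(product(BARCODES, repeat=2))

print(len(balanced_barcodes), len(barcode_pairs))  # prints: 24 144
```

The 12 barcodes used are therefore half of the 24 balanced 4 bp sequences available under this reading, which leaves room to increase the number of groups as the text suggests.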
III. Library construction and amplification
1. 45 ng of ctDNA was taken and a library was built following the standard workflow of the KAPA Hyper DNA library preparation kit (KAPA, cat. no. KK8505), including end repair of the ctDNA, A-tailing, and ligation of the ctDNA to the adapter mixture prepared in step II. The product was purified and recovered with AMPure XP magnetic beads and eluted in 20 μl of nuclease-free water for use as the PCR template.
2. The PCR template obtained in step 1 was used to set up the reaction system according to Table 2, and PCR amplification was performed according to Table 3 to obtain the PCR product. The product was purified and recovered with AMPure XP magnetic beads and eluted in 20 μl of nuclease-free water to give the library.
Table 2. Reaction system
F1 (5'-3'): AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 25);
R1 (5'-3'): CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO: 26)-XXXXXXXX-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (SEQ ID NO: 27).
In primers F1 and R1, the underlined portion corresponds to the adapters in Table 1 and is used for library amplification.
In primer R1, XXXXXXXX denotes the index tag sequence, which labels all molecules from the same sample; when multiple samples are sequenced together, the samples can be distinguished by their index tag sequences. The index tag sequence is 6-8 bp long.
Table 3. Reaction program
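The structure of primer R1 (segment A, index, segment B) can be expressed as a small helper that assembles a sample-specific primer. This is a sketch under stated assumptions: the two constant segments are sequences 26 and 27 of the sequence listing, the 6-8 bp length limit comes from the text, and the function name `build_r1` and the example index `ACGTACGT` are hypothetical.

```python
SEGMENT_A = "CAAGCAGAAGACGGCATACGAGAT"            # sequence 26
SEGMENT_B = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"  # sequence 27


def build_r1(index_tag):
    """Return primer R1 (5'-3') for one sample index of 6-8 bases."""
    index_tag = index_tag.upper()
    if not 6 <= len(index_tag) <= 8:
        raise ValueError("index tag must be 6-8 bp long")
    if set(index_tag) - set("ACGT"):
        raise ValueError("index tag may contain only A, C, G, T")
    return SEGMENT_A + index_tag + SEGMENT_B


# Hypothetical 8 bp index, used only to illustrate the layout.
r1_example = build_r1("ACGTACGT")
```

With an 8 bp index the assembled primer is 24 + 8 + 34 = 66 nucleotides long.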
IV. Target region capture
Target region capture was performed using the commercial 63-gene ctDNA panel enrichment kit from Genetron Health (Agilent SureSelect XT probes). During probe hybridization, the P5 and P7 blocking oligos from IDT (Integrated DNA Technologies) were used in place of the blockers in the original kit. The captured product was amplified by PCR using the KAPA HiFi PCR kit for 16 cycles.
Example 2. Sequencing and bioinformatic analysis
1. The library prepared in Example 1 was sequenced on an Illumina HiSeq X sequencer with 151 bp paired-end reads. The sequencing depth was about 20,000×.
2. Alignment: the sequencing reads were aligned to the hg19 genome using the bwa software.
3. Deduplication and error correction: a cluster of reads with the same start and end positions in the genome and the same barcode sequences comes from the same strand of the same molecule. The reads within a cluster are compared: at a given position, if the consistency among the reads of the cluster is higher than 80%, the position is valid and the most frequent base is retained as the correct base; if the consistency is lower than 80%, the base is attributed to a sequencing, PCR, or capture error, is marked as N, and is excluded from subsequent variant detection. After error correction and deduplication, each cluster of reads yields one error-corrected read.
4. Duplex identification and error correction: two clusters of reads with the same start and end positions in the genome and with swapped read1 and read2 barcode sequences come from the plus and minus strands of the same molecule, respectively. For example, a cluster whose read1 and read2 barcode sequences are A and B and another cluster whose read1 and read2 barcode sequences are B and A come from the plus and minus strands of the same molecule. Such reads from the plus and minus strands of one molecule are called a duplex. The two clusters of a duplex each undergo deduplication and error correction to produce an error-corrected read, and the two corrected reads are then corrected against each other. That is, at a given genomic position, a base on which the duplex reads agree is considered correct, whereas a base on which they disagree is attributed to a sequencing, PCR, or capture error, is marked as N, and is excluded from subsequent variant detection.
5. Ligation efficiency and duplex rate statistics: the proportion of molecules in the library starting template that enter the sequencing data is the ligation efficiency, and the proportion of molecules that are duplex-tagged is the duplex rate.
Twelve ctDNA samples (from blood samples of healthy people) were tested with this method. The ligation efficiency was about 40% (Fig. 2) and the duplex rate was about 60% (Fig. 3), clearly higher than the duplex rate of about 10% reported in the literature (CAPP).
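The duplex identification in step 4 can be sketched as a lookup for the cluster with swapped barcodes, followed by a position-wise comparison. This is an illustration only. It assumes each cluster has already been collapsed to one corrected read in reference orientation and is keyed by (start, end, read1 barcode, read2 barcode). Clusters whose two barcodes are identical are skipped, because swapping them gives the same key; that handling is an assumption, as the text does not address the case. All function names are hypothetical.

```python
def duplex_consensus(plus_read, minus_read):
    """Keep a base only where both corrected strand reads agree."""
    return "".join(
        a if a == b and a != "N" else "N"
        for a, b in zip(plus_read, minus_read)
    )


def pair_duplexes(corrected):
    """Pair clusters whose read1/read2 barcodes are swapped.

    `corrected` maps (start, end, barcode1, barcode2) to a corrected read.
    Returns a dict with one duplex consensus per paired molecule.
    """
    duplexes = {}
    for (start, end, bc1, bc2), read in corrected.items():
        partner = corrected.get((start, end, bc2, bc1))
        # bc1 < bc2 reports each pair once and skips identical barcodes.
        if bc1 < bc2 and partner is not None:
            duplexes[(start, end, bc1, bc2)] = duplex_consensus(read, partner)
    return duplexes


# Illustrative coordinates and reads; only the first two clusters pair up.
example_clusters = {
    (100, 250, "agct", "atgc"): "ACGTT",
    (100, 250, "atgc", "agct"): "ACGAT",
    (300, 450, "gatc", "tcga"): "GGCCA",
}
example_duplexes = pair_duplexes(example_clusters)
```

In this example the two strands disagree at the fourth position, so the duplex read is `ACGNT` and that position is excluded from variant detection.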
Example 3. Method validation
The HD734 standard from Horizon (Tru-Q 7 (1.3% Tier) Reference Standard, cat. no. HD734) was used to validate detection of SNVs and indels. The standard contains 34 mutations at a frequency of about 1% (see Table 4; all mutation sites lie within the panel used in step IV of Example 1).
70 ng of the standard was dissolved in 100 μL of TE buffer (10 mM Tris-HCl, pH 8.0), and the DNA was sheared into 200 bp fragments using a Covaris M220. The sheared DNA fragments were recovered for later use. The DNA of the standard was diluted with leukocyte DNA from healthy people to mutation frequencies of about 0.5% and 0.1%, respectively. The undiluted standard DNA and the DNA at the two dilutions were each used for the subsequent accuracy validation.
Table 4. Comparison of detection results for the standard (mutation frequency 1%)
Table 5. Comparison of detection results for the standard (mutation frequency 0.5%)
Table 6. Comparison of detection results for the standard (mutation frequency 0.1%)
In Tables 5 and 6, ND means not detected.
Detection was performed according to the methods in Example 1 and Example 2, with 3 replicates.
The results show that, for SNVs and indels in the standard at mutation frequencies of 0.1%, 0.5%, and 1%, the detection sensitivity was 95.10%, 97.06%, and 100%, respectively. In addition, the detected mutation frequencies agreed closely with the expected mutation frequencies (Fig. 4, Tables 4-6).
Detection sensitivity: the standard contains 34 mutations and 3 replicates were performed, giving 102 mutations in total; sensitivity is the number of mutations detected / 102 × 100%.
Expected mutation frequency: the mutation frequency in the standard (number of mutant molecules / (number of mutant molecules + number of wild-type molecules)).
Detected mutation frequency: in the sequencing data covering the site, data supporting the mutation / (data supporting the mutation + data supporting the wild type).
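The sensitivity definition above can be checked with a few lines of arithmetic. The text reports only the percentages; the detected counts used below (97, 99, and 102 of 102) are back-calculated here as the counts consistent with those percentages and are not stated in the source. The function name is hypothetical.

```python
def sensitivity_percent(detected, n_mutations=34, n_replicates=3):
    """Detected mutations as a percentage of all expected detections."""
    return detected / (n_mutations * n_replicates) * 100


# Counts inferred from the reported percentages (an assumption).
inferred_counts = {"0.1%": 97, "0.5%": 99, "1%": 102}
sensitivities = {
    level: round(sensitivity_percent(count), 2)
    for level, count in inferred_counts.items()
}
```

Under this assumption, five of the 102 expected detections were missed at the 0.1% level and three at the 0.5% level.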
Sequence table
<110> Beijing Genetron Health Gen Technology Co Ltd
<120> Library construction and sequencing data analysis method for detecting low-frequency ctDNA mutations using the duplex method
<160> 27
<170> SIPOSequenceListing 1.0
<210> 1
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 1
tacacgacgc tcttccgatc tagctt 26
<210> 2
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 2
agctagatcg gaagagcaca cgtct 25
<210> 3
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 3
tacacgacgc tcttccgatc tatgct 26
<210> 4
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 4
gcatagatcg gaagagcaca cgtct 25
<210> 5
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 5
tacacgacgc tcttccgatc tactgt 26
<210> 6
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 6
cagtagatcg gaagagcaca cgtct 25
<210> 7
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 7
tacacgacgc tcttccgatc ttgact 26
<210> 8
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 8
gtcaagatcg gaagagcaca cgtct 25
<210> 9
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 9
tacacgacgc tcttccgatc ttcgat 26
<210> 10
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 10
tcgaagatcg gaagagcaca cgtct 25
<210> 11
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 11
tacacgacgc tcttccgatc ttacgt 26
<210> 12
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 12
cgtaagatcg gaagagcaca cgtct 25
<210> 13
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 13
tacacgacgc tcttccgatc tgatct 26
<210> 14
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 14
gatcagatcg gaagagcaca cgtct 25
<210> 15
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 15
tacacgacgc tcttccgatc tgcatt 26
<210> 16
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 16
atgcagatcg gaagagcaca cgtct 25
<210> 17
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 17
tacacgacgc tcttccgatc tgtcat 26
<210> 18
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 18
tgacagatcg gaagagcaca cgtct 25
<210> 19
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 19
tacacgacgc tcttccgatc tcagtt 26
<210> 20
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 20
actgagatcg gaagagcaca cgtct 25
<210> 21
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 21
tacacgacgc tcttccgatc tctagt 26
<210> 22
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 22
ctagagatcg gaagagcaca cgtct 25
<210> 23
<211> 26
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 23
tacacgacgc tcttccgatc tcgtat 26
<210> 24
<211> 25
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 24
tacgagatcg gaagagcaca cgtct 25
<210> 25
<211> 58
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 25
aatgatacgg cgaccaccga gatctacact ctttccctac acgacgctct tccgatct 58
<210> 26
<211> 24
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 26
caagcagaag acggcatacg agat 24
<210> 27
<211> 34
<212> DNA
<213>Artificial sequence (Artificial Sequence)
<400> 27
gtgactggag ttcagacgtg tgctcttccg atct 34

Claims (10)

1. A method of constructing a library for detecting low-frequency ctDNA mutations, comprising the following steps in order:
(1) subjecting a ctDNA sample to end repair followed by A-tailing of the 3' ends;
(2) ligating the ctDNA treated in step (1) to an adapter mixture, and obtaining the library after PCR amplification;
The adapter mixture consists of n adapters;
Each adapter is obtained by annealing one upstream primer A and one downstream primer A into a partially double-stranded structure;
Upstream primer A contains barcode tag A; downstream primer A contains barcode tag B;
Barcode tag A and barcode tag B are reverse complements of each other;
Barcode tag A consists of A, T, C and G, arranged in any order;
The n adapters use n different barcode tags A;
n is any natural number >= 8.
2. The method according to claim 1, characterized in that: the first nucleotide at the 3' end of upstream primer A is a modified T, the modification serving to prevent degradation by exonucleases; the 5' end of downstream primer A is phosphorylated.
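
As an illustration of the barcode rule in claim 1, the sketch below enumerates candidate barcode tags and pairs each with its reverse complement. It assumes that "A, T, C and G in any order" means a 4-nt tag using each base exactly once, which is one possible reading of the claim; under that reading only 24 tags exist, so the sketch cannot cover every n the claim allows. The function names and the lexicographic selection are hypothetical and are not taken from the patent.

```python
from itertools import permutations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Assumption: a barcode tag A is one ordering of the four bases (4! = 24 options).
candidates = ["".join(p) for p in permutations("ACGT")]


def pick_adapter_barcodes(n: int) -> list[tuple[str, str]]:
    """Return n (barcode tag A, barcode tag B) pairs with distinct tags A.

    The first n candidates are taken in lexicographic order, which is an
    arbitrary choice made only for this illustration.
    """
    if n < 8:
        raise ValueError("claim 1 requires n >= 8")
    if n > len(candidates):
        raise ValueError("only 24 tags exist under the 4-nt permutation reading")
    return [(tag_a, reverse_complement(tag_a)) for tag_a in candidates[:n]]


pairs = pick_adapter_barcodes(12)
```

With n = 12, as in claim 3, this yields twelve distinct tags A, each paired with the tag B that would sit on the complementary strand.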
3. The method according to claim 1 or 2, characterized in that: n = 12.
4. The method according to claim 3, characterized in that:
The barcode tags A of the 12 adapters are, respectively, nucleotides 22 to 25 from the 5' end of sequence 1, sequence 3, sequence 5, sequence 7, sequence 9, sequence 11, sequence 13, sequence 15, sequence 17, sequence 19, sequence 21 and sequence 23 of the sequence listing.
5. The method according to claim 3 or 4, characterized in that:
The 12 adapters are as follows, each obtained by annealing the two single-stranded DNA molecules shown in the indicated sequences of the sequence listing into a partially double-stranded structure:
Adapter 1: sequence 1 and sequence 2;
Adapter 2: sequence 3 and sequence 4;
Adapter 3: sequence 5 and sequence 6;
Adapter 4: sequence 7 and sequence 8;
Adapter 5: sequence 9 and sequence 10;
Adapter 6: sequence 11 and sequence 12;
Adapter 7: sequence 13 and sequence 14;
Adapter 8: sequence 15 and sequence 16;
Adapter 9: sequence 17 and sequence 18;
Adapter 10: sequence 19 and sequence 20;
Adapter 11: sequence 21 and sequence 22;
Adapter 12: sequence 23 and sequence 24.
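
The pairing described in claims 4 and 5 can be checked against the sequence listing. The sketch below copies sequences 5 to 24 from the listing with spaces removed; sequences 1 to 4 are not reproduced here. It takes nucleotides 22 to 25 of each odd-numbered sequence as the barcode and compares its reverse complement with the start of the paired even-numbered sequence. Treating the first four nucleotides of the even-numbered strand as its barcode is an assumption inferred from the listing, and the helper names are hypothetical.

```python
# Sequences 5-24 as given in the sequence listing (spaces removed).
LISTING = {
    5: "tacacgacgctcttccgatctactgt", 6: "cagtagatcggaagagcacacgtct",
    7: "tacacgacgctcttccgatcttgact", 8: "gtcaagatcggaagagcacacgtct",
    9: "tacacgacgctcttccgatcttcgat", 10: "tcgaagatcggaagagcacacgtct",
    11: "tacacgacgctcttccgatcttacgt", 12: "cgtaagatcggaagagcacacgtct",
    13: "tacacgacgctcttccgatctgatct", 14: "gatcagatcggaagagcacacgtct",
    15: "tacacgacgctcttccgatctgcatt", 16: "atgcagatcggaagagcacacgtct",
    17: "tacacgacgctcttccgatctgtcat", 18: "tgacagatcggaagagcacacgtct",
    19: "tacacgacgctcttccgatctcagtt", 20: "actgagatcggaagagcacacgtct",
    21: "tacacgacgctcttccgatctctagt", 22: "ctagagatcggaagagcacacgtct",
    23: "tacacgacgctcttccgatctcgtat", 24: "tacgagatcggaagagcacacgtct",
}

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def barcode_of_odd_strand(seq: str) -> str:
    """Nucleotides 22 to 25 from the 5' end (1-based), as stated in claim 4."""
    return seq[21:25]


def strands_pair_up(odd_seq: str, even_seq: str) -> bool:
    """Check that the paired strand starts with the reverse-complemented barcode.

    Assumption: the barcode of the even-numbered strand is its first 4 nt.
    """
    return reverse_complement(barcode_of_odd_strand(odd_seq)) == even_seq[:4]


odd_ids = range(5, 24, 2)
barcodes = [barcode_of_odd_strand(LISTING[i]) for i in odd_ids]
all_paired = all(strands_pair_up(LISTING[i], LISTING[i + 1]) for i in odd_ids)
```

For the ten adapters covered here, the barcodes are distinct, each uses every base once, and every pair satisfies the reverse-complement relation required by claim 1.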
6. A library prepared by the method according to any one of claims 1 to 5.
7. Use of the method according to any one of claims 1 to 5, or of the library according to claim 6, in detecting target mutations and their mutation frequencies in ctDNA samples.
8. A kit for constructing a library for detecting low-frequency ctDNA mutations, comprising the adapter mixture defined in any one of claims 1 to 5.
9. A method of detecting target mutations and their mutation frequencies in a ctDNA sample, comprising the following steps:
(1) preparing a library by the method according to any one of claims 1 to 5;
(2) sequencing the library to obtain sequencing results, and analyzing the target mutations and their mutation frequencies in the ctDNA sample from the sequencing results.
10. The method according to claim 9, characterized in that the sequencing results are analyzed as follows:
(a) the sequencing results are aligned to the reference genome hg19;
(b) reads that have the same start and end positions in the genome and carry the same barcode tags form a cluster and originate from the same strand of the same molecule; the reads of a cluster are compared position by position; if the agreement among the reads of the cluster at a position is higher than 80%, the position is valid and the most frequent base is taken as the correct base and retained; if the agreement is less than 80%, the base is attributed to a sequencing, PCR or capture error, marked as N, and excluded from subsequent variant detection;
(c) two clusters of reads that have the same start and end positions in the genome and whose read1 and read2 barcode tags are reversed originate from the forward and reverse strands of the same molecule, respectively, and are called a duplex; at each genomic position, bases on which the duplex reads agree are considered correct, while bases on which they disagree are attributed to sequencing, PCR or capture errors, marked as N, and excluded from subsequent variant detection;
The mutation frequency is calculated from the sequencing data covering the site as: data supporting the mutation / (data supporting the mutation + data supporting the wild type).
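
Steps (b) and (c) of claim 10 and the frequency formula can be sketched as follows. This is a minimal illustration and not the applicant's pipeline. It assumes the reads of a cluster are already aligned, gap-free and of equal length, and that both strand consensuses are expressed in reference orientation. The claim does not say what happens at exactly 80% agreement; here such positions are masked as N. All function and variable names are hypothetical.

```python
from collections import Counter


def single_strand_consensus(reads: list[str], min_agreement: float = 0.8) -> str:
    """Collapse one cluster of reads into a consensus sequence (step b)."""
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        # Keep the most frequent base only when agreement exceeds the threshold.
        consensus.append(base if count / len(column) > min_agreement else "N")
    return "".join(consensus)


def duplex_consensus(forward: str, reverse: str) -> str:
    """Keep only bases on which both strand consensuses agree (step c)."""
    return "".join(
        f if f == r and f != "N" else "N" for f, r in zip(forward, reverse)
    )


def mutation_frequency(mutant_support: int, wild_type_support: int) -> float:
    """Mutant / (mutant + wild type) among the data covering the site."""
    total = mutant_support + wild_type_support
    return mutant_support / total if total else 0.0


# Toy clusters, invented for illustration only.
plus_cluster = ["ACGTA", "ACGTA", "ACGTA", "ACGTA", "ACGTA", "ACTTA"]
minus_cluster = ["ACGTA", "ACGTA", "ACCTA"]

plus_consensus = single_strand_consensus(plus_cluster)    # 5 of 6 agree at the third base
minus_consensus = single_strand_consensus(minus_cluster)  # 2 of 3 agree at the third base
duplex = duplex_consensus(plus_consensus, minus_consensus)
```

In the toy data the third base survives on the plus strand (5/6 agreement) but is masked on the minus strand (2/3 agreement), so the duplex consensus masks it as well.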
CN201810585283.9A 2018-06-08 2018-06-08 A kind of analysis method of library construction and sequencing data using the detection ctDNA low frequencies mutation of duplex methods Pending CN108728515A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810585283.9A CN108728515A (en) 2018-06-08 2018-06-08 A kind of analysis method of library construction and sequencing data using the detection ctDNA low frequencies mutation of duplex methods

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810585283.9A CN108728515A (en) 2018-06-08 2018-06-08 A kind of analysis method of library construction and sequencing data using the detection ctDNA low frequencies mutation of duplex methods

Publications (1)

Publication Number Publication Date
CN108728515A true CN108728515A (en) 2018-11-02

Family

ID=63932572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810585283.9A Pending CN108728515A (en) 2018-06-08 2018-06-08 A kind of analysis method of library construction and sequencing data using the detection ctDNA low frequencies mutation of duplex methods

Country Status (1)

Country Link
CN (1) CN108728515A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109439729A (en) * 2018-12-27 2019-03-08 上海鲸舟基因科技有限公司 Detect connector, connector mixture and the correlation method of low frequency variation
CN111073961A (en) * 2019-12-20 2020-04-28 苏州赛美科基因科技有限公司 High-throughput detection method for gene rare mutation
CN113718034A (en) * 2021-09-27 2021-11-30 中国医学科学院肿瘤医院 Marker, detection kit and detection method for guiding medication and curative effect evaluation of ovarian cancer platinum drug-resistant patient

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105861710A (en) * 2016-05-20 2016-08-17 北京科迅生物技术有限公司 Sequencing joint and preparation method and application thereof in ultra-low frequency mutation detection
CN106599616A (en) * 2017-01-03 2017-04-26 上海派森诺医学检验所有限公司 duplex-seq-based ultralow-frequency mutation site detection analysis method
CN106834275A (en) * 2017-02-22 2017-06-13 天津诺禾医学检验所有限公司 The analysis method of the construction method, kit and library detection data in ctDNA ultralow frequency abrupt climatic changes library

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NEWMAN, A. M. et al.: "Integrated digital error suppression for improved detection of circulating tumor DNA", NATURE BIOTECHNOLOGY *

Similar Documents

Publication Publication Date Title
US10619214B2 (en) Detecting genetic aberrations associated with cancer using genomic sequencing
CN107475375B (en) A kind of DNA probe library, detection method and kit hybridized for microsatellite locus related to microsatellite instability
KR102339760B1 (en) Diagnosing fetal chromosomal aneuploidy using massively parallel genomic sequencing
KR102028375B1 (en) Systems and methods to detect rare mutations and copy number variation
CN115029451B (en) A sheep liquid phase chip and its application
CN115198023B (en) Hainan cattle liquid-phase breeding chip and application thereof
CN106834507B (en) DMD gene trap probe and its application in DMD detection in Gene Mutation
US12416047B2 (en) Noninvasive prenatal diagnostic methods
CN109461473B (en) Method and device for acquiring concentration of free DNA of fetus
EP3564391A1 (en) Method, device and kit for detecting fetal genetic mutation
CN111073961A (en) High-throughput detection method for gene rare mutation
CN113564266B (en) SNP typing genetic marker combination, detection kit and application
CN111321209A (en) Method for double-end correction of circulating tumor DNA sequencing data
CN108728515A (en) A kind of analysis method of library construction and sequencing data using the detection ctDNA low frequencies mutation of duplex methods
CN115083521A (en) Method and system for identifying tumor cell group in single cell transcriptome sequencing data
US20230265496A1 (en) Method for low frequency somatic cell mutation identification and quantification
CN109920480B (en) Method and device for correcting high-throughput sequencing data
Edwards Whole-genome sequencing for marker discovery
Genner et al. Haplotype-Resolved DNA Methylation at the APOE Locus identifies Allele-Specific Epigenetic Signatures Relevant to Alzheimer’s Disease Risk
KR100450816B1 (en) Selection method of probe set for genotyping
CN109280697A (en) The method for carrying out fetus genotype identification using pregnant woman blood plasma dissociative DNA
HK40004815A (en) Method and device for acquiring fetal free dna concentration
HK40004815B (en) Method and device for acquiring fetal free dna concentration
CN113897427A (en) Reagent and kit for detecting fetal FGA gene mutation
CN118147344A (en) Primer group and kit for identifying sunflower varieties and application of primer group and kit

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181102