CN109517882A - A quality control method and application for detecting unique paired-end library tag combinations - Google Patents

Info

Publication number
CN109517882A
Authority
CN
China
Prior art keywords
library
group
quality control
tags
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811337895.2A
Other languages
Chinese (zh)
Other versions
CN109517882B (en)
Inventor
张之宏
罗健
汉雨生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Burning Rock Dx Co ltd
Original Assignee
Guangzhou Burning Rock Dx Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Burning Rock Dx Co ltd
Priority to CN201811337895.2A (CN109517882B)
Priority to CN202111090137.7A (CN113957123B)
Publication of CN109517882A
Application granted
Publication of CN109517882B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a quality control method for detecting unique paired-end library tag combinations, and an application thereof, in the technical field of biological detection. The quality control method comprises the following steps: S1) constructing a gDNA library carrying unique paired-end library tag combinations, sequencing the constructed library, and reading the library tag sequences; S2) performing a first quality control analysis on the library tag sequences; S3) where the result of S2) requires it, replacing the problematic tag raw materials, reconstructing the gDNA library carrying unique paired-end library tag combinations by the method of step S1), sequencing the constructed library, and reading the library tag sequences; S4) performing a second quality control analysis on the library tag sequences and, depending on the result, continuing to replace problematic library tags by the method of S3) until all library tags meet the quality control criteria. The quality control method improves the efficiency of library tag detection and is better suited to the requirements of accurate library sequencing.

Description

A quality control method for detecting unique paired-end library tag combinations, and application thereof
Technical field
The invention belongs to the technical field of biological detection, and more particularly relates to a quality control method for detecting unique paired-end library tag combinations and an application thereof.
Background art
With the rapid development of high-throughput technology, sequencer throughput keeps increasing, and the early approach of physically separating sequencing libraries, for example by the lanes (Lane) of a flow cell (Flow Cell), is no longer adequate. Multiplex sequencing (Multiplex Sequencing) is widely used in every field of next-generation sequencing. The key to multiplex sequencing is the library tag (Index). A library tag is a specific sequence, generally 4 to 12 bases long, with which each sample is labeled during NGS (Next Generation Sequencing) library preparation so that different DNA samples can be distinguished. In high-throughput sequencing, libraries labeled with different known tag sequences are pooled and sequenced together; the library inserts and the tags are read in turn and converted into base calls. In the subsequent analysis, software uses the expected tag sequences to classify the sequencing results and split them into the different samples.
During multiplex sequencing, if library sequences are misassigned, sequences that do not belong to a given library are wrongly classified into it. For some applications such misassignment leads to erroneous analysis results. For example, when a library derived from a tissue sample of a cancer patient and a library derived from a tissue sample of a patient with a benign tumor are sequenced together, and some sequences from the cancer tissue sample are misassigned to the benign tumor sample, the report for the benign tumor patient may indicate a malignant tumor, resulting in misdiagnosis.
Many causes can lead to misassignment of library sequences. Common ones include: 1) cross contamination during library preparation; 2) cross contamination during production of the tag primers; 3) cross reactions that occur when multiple libraries undergo cluster generation on the flow cell; and 4) optical aberration caused by, for example, excessive cluster density.
Tag primers for next-generation sequencing libraries are typically 50 to 70 bases long and generally need to be purified to ensure the purity of the full-length primer. However, because purification requires gel extraction or passage through a column, purification itself often introduces additional cross contamination. In HPLC (high performance liquid chromatography), for example, adsorption of tag primers to the purification column and reuse of the column inevitably cause cross contamination. Although such contamination can be reduced by running a blank elution or an unrelated-sample elution between the column purifications of two different tag primers, cross contamination still cannot be avoided completely. Empirically, across two consecutive purifications, 0.5% to 5% of the previous tag primer remains in the next one.
Because of the high sensitivity that comes with the high throughput of NGS, quality inspection of tag primers requires a very sensitive method, able to detect possible contamination as low as one in a thousand or even one in ten thousand. In addition, because the sequences of tag primers are very similar to one another, conventional methods such as qPCR are unsuitable for detecting contamination in terms of both sensitivity and specificity. The usual approach is still to perform quality inspection on an NGS platform, but a conventional method can check at most one target tag primer per lane, which makes the cost of quality inspection prohibitive.
Therefore, a new quality control method for unique paired-end library tag combinations is needed to improve detection efficiency.
Summary of the invention
The object of the invention is to overcome the shortcomings of the prior art by providing a quality control method for detecting unique paired-end Index combinations, and an application thereof, which improves the efficiency of library tag detection and is better suited to accurate library sequencing.
To achieve this object, the invention adopts the following technical solution: a quality control method for detecting unique paired-end library tag combinations, comprising the following steps:
S1) using library tag standards and a gDNA standard as raw materials, constructing a gDNA library carrying unique paired-end library tag combinations, sequencing the constructed library, and reading the library tag sequences;
S2) performing a first quality control analysis on the library tag sequences, the criteria of which comprise the following items: maximum single-side tag contamination proportion ≤ 2.5%; maximum tag combination contamination proportion ≤ 0.01%; number of sequences per sample group ≥ 5000; coefficient of variation of the pooling proportions of all tag combinations ≤ 0.5; overall sequence pass rate ≥ 97%; sequence proportion per sample group ≥ 0.2 / number of library tag pairs; and proportion of tags with single-side contamination greater than 1% ≤ 10%;
S3) if the quality control analysis of step S2) shows that the criteria are not met, resynthesizing the library tags that fail the quality control requirements; then, following the method of step S1), using the resynthesized library tags, the library tags that passed the first quality control analysis, and gDNA as raw materials, constructing a gDNA library carrying unique paired-end library tag combinations, sequencing the constructed library again, and reading the library tag sequences;
S4) performing a second quality control analysis on the library tag sequences, and repeating until all library tags meet the quality control criteria;
In the quality control parameters, the unique paired-end library tag combination consists of an upstream library tag and a downstream library tag. The upstream library tags are collectively called IG5, and IG5 comprises A and B; the downstream library tags are collectively called IG7, and IG7 comprises a and b. The matched, correct unique paired-end library tag combinations are A-a and B-b; the mismatched combinations are A-b and B-a. After each sequencing run, the number of sequences for each of these combinations is obtained by analysis;
The single-side tag contamination proportion is the proportion of cross contamination occurring between tags within a group; contamination can occur only within a group, i.e. within the IG5 group and/or within the IG7 group;
When a of IG7 has undergone no cross contamination during production, then for A of IG5, the contamination proportion of B in A = number of sequences containing B-a / number of all sequences containing a;
When A of IG5 has undergone no cross contamination during production, then for a of IG7, the contamination proportion of b in a = number of sequences containing A-b / number of all sequences containing A;
When B contaminates A and b contaminates a, the B-b tag combination contamination proportion = (number of sequences containing B-a / number of all sequences containing a) × (number of sequences containing A-b / number of all sequences containing A);
When b of IG7 has undergone no cross contamination during production, then for B of IG5, the contamination proportion of A in B = number of sequences containing A-b / number of all sequences containing b;
When B of IG5 has undergone no cross contamination during production, then for b of IG7, the contamination proportion of a in b = number of sequences containing B-a / number of all sequences containing B;
When A contaminates B and a contaminates b, the A-a tag combination contamination proportion = (number of sequences containing A-b / number of all sequences containing b) × (number of sequences containing B-a / number of all sequences containing B);
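The single-side and combination contamination proportions defined above can be sketched as a short calculation. This is a minimal illustration, not the patent's analysis script: the function name `contamination_proportions`, the dictionary layout, and the read counts are assumptions made for the example, and "all sequences containing a" is taken as the sum over the two upstream tags A and B only.

```python
def contamination_proportions(counts):
    """Contamination proportions for the two pairs A-a and B-b.

    `counts` maps (upstream_tag, downstream_tag) to a read count.
    Hypothetical helper covering only the two-pair case in the text;
    it assumes every tag appears in at least one read.
    """
    def s(up, down):
        return counts.get((up, down), 0)

    all_a = s("A", "a") + s("B", "a")  # all sequences containing a
    all_b = s("A", "b") + s("B", "b")  # all sequences containing b
    all_A = s("A", "a") + s("A", "b")  # all sequences containing A
    all_B = s("B", "a") + s("B", "b")  # all sequences containing B

    B_in_A = s("B", "a") / all_a  # B contaminating A (a assumed clean)
    b_in_a = s("A", "b") / all_A  # b contaminating a (A assumed clean)
    A_in_B = s("A", "b") / all_b  # A contaminating B (b assumed clean)
    a_in_b = s("B", "a") / all_B  # a contaminating b (B assumed clean)

    return {
        "B_in_A": B_in_A,
        "b_in_a": b_in_a,
        "A_in_B": A_in_B,
        "a_in_b": a_in_b,
        "combo_Bb": B_in_A * b_in_a,  # B-b combination contamination
        "combo_Aa": A_in_B * a_in_b,  # A-a combination contamination
    }


# Illustrative (made-up) read counts for one run.
example_counts = {("A", "a"): 9900, ("B", "a"): 100,
                  ("A", "b"): 50, ("B", "b"): 9950}
example_result = contamination_proportions(example_counts)
```

Each combination value is the product of two single-side values, which is why it is much smaller than either of them.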
The number of sequences per sample group is the number of correctly matched sequences in each group after system filtering, i.e. the number of sequences containing A-a or the number of sequences containing B-b;
The coefficient of variation of the pooling proportions of all tag combinations is the coefficient of variation of the proportion that each group's correctly matched, system-filtered sequences make up of the total number of correctly matched, system-filtered sequences;
The overall sequence pass rate is the ratio, after the sequencing run, of the total number of system-filtered, correctly paired, valid sequences to the total number of all system-filtered sequences;
The sequence proportion per sample group is the ratio of each group's correctly matched, system-filtered sequences to the total number of system-filtered sequences;
The proportion of tags with single-side contamination greater than 1% is: among the upstream library tags, the number of library tags whose contamination proportion exceeds 1% as a fraction of the total number of library tags; and, among the downstream library tags, the number of library tags whose contamination proportion exceeds 1% as a fraction of the total number of library tags.
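The seven criteria of step S2) and the definitions above can be collected into a single check. The following is a minimal sketch under stated assumptions: the function name `qc_checks` and its argument layout are hypothetical, the coefficient of variation uses the population standard deviation (the text does not say which estimator is meant), and the per-group threshold uses the unrounded value 0.2 / number of pairs.

```python
from statistics import mean, pstdev


def qc_checks(correct_counts, total_filtered, max_single_side,
              max_combination, frac_tags_over_1pct):
    """Evaluate the S2) criteria; returns a dict of named pass/fail flags.

    correct_counts: correctly paired read count per tag pair (after filtering)
    total_filtered: all reads remaining after system filtering
    The three contamination inputs are fractions (0.01 means 1%).
    """
    n_pairs = len(correct_counts)
    total_correct = sum(correct_counts.values())
    # Share of each pair within all correctly paired reads.
    shares = [c / total_correct for c in correct_counts.values()]
    min_count = min(correct_counts.values())

    return {
        "max_single_side<=2.5%": max_single_side <= 0.025,
        "max_combination<=0.01%": max_combination <= 0.0001,
        "reads_per_group>=5000": min_count >= 5000,
        "pooling_cv<=0.5": pstdev(shares) / mean(shares) <= 0.5,
        "pass_rate>=97%": total_correct / total_filtered >= 0.97,
        "group_share>=0.2/pairs": min_count / total_filtered >= 0.2 / n_pairs,
        "tags_over_1pct<=10%": frac_tags_over_1pct <= 0.10,
    }


# Illustrative (made-up) run with four tag pairs.
example = qc_checks({"A-a": 6000, "B-b": 6200, "C-c": 5800, "D-d": 6000},
                    total_filtered=24500, max_single_side=0.01,
                    max_combination=0.00005, frac_tags_over_1pct=0.05)
run_passes = all(example.values())
```

Returning the individual flags rather than one boolean makes it easier to see which library tags would need to be resynthesized in step S3).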
As an improvement of the above technical solution, step S1) comprises, in order: preparation of the gDNA standard, gDNA fragmentation, end repair, adapter ligation, purification of the ligation product, library amplification, purification of the amplified library, quality inspection of the purified library, fragment size determination of the purified library, and library sequencing.
As an improvement of the above technical solution, the unique paired-end library tag combination consists of an IG5 group and an IG7 group; the Hamming distance between library tags within the IG5 group, and within the IG7 group, is ≥ 3, and the Hamming distance between library tag sequences of the IG5 group and the IG7 group is ≥ 2.
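The Hamming distance requirement can be verified programmatically before tags are ordered. This is a minimal sketch: the function names and the 8-base example sequences are made up for illustration and are not the tag sequences of the patent.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_violations(ig5, ig7, within_min=3, between_min=2):
    """Return (tag1, tag2, distance) for every pair below the limits."""
    violations = []
    # Within-group comparisons: IG5 against IG5, IG7 against IG7.
    for group in (ig5, ig7):
        for s, t in combinations(group, 2):
            d = hamming(s, t)
            if d < within_min:
                violations.append((s, t, d))
    # Between-group comparisons: every IG5 tag against every IG7 tag.
    for s, t in product(ig5, ig7):
        d = hamming(s, t)
        if d < between_min:
            violations.append((s, t, d))
    return violations


# Made-up example tags; an empty result means the set satisfies both limits.
example_violations = distance_violations(["ACGTACGT", "ACGTTGCA"],
                                         ["GGCCTTAA", "GGCCAATT"])
```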
As a further improvement of the above technical solution, the library tags are purified by high performance liquid chromatography and their molecular weight is confirmed by mass spectrometry, with a required purity of ≥ 85%.
As an improvement of the above technical solution, when the unique paired-end library tag combination consists of 96 pairs of library tags, i.e. 96 upstream library tags in the IG5 group and 96 downstream library tags in the IG7 group in one-to-one correspondence, the sequence proportion per sample group is adjusted accordingly to ≥ 0.2%.
As an improvement of the above technical solution, when the unique paired-end library tag combination consists of 48 pairs of library tags, i.e. 48 upstream library tags in the IG5 group and 48 downstream library tags in the IG7 group in one-to-one correspondence, the sequence proportion per sample group is adjusted accordingly to ≥ 0.4%.
As an improvement of the above technical solution, when the unique paired-end library tag combination consists of 192 pairs of library tags, i.e. 192 upstream library tags in the IG5 group and 192 downstream library tags in the IG7 group in one-to-one correspondence, the sequence proportion per sample group is adjusted accordingly to ≥ 0.1%.
As an improvement of the above technical solution, when the unique paired-end library tag combination consists of 288 pairs of library tags, i.e. 288 upstream library tags in the IG5 group and 288 downstream library tags in the IG7 group in one-to-one correspondence, the sequence proportion per sample group is adjusted accordingly to ≥ 0.07%.
As an improvement of the above technical solution, when the unique paired-end library tag combination consists of 384 pairs of library tags, i.e. 384 upstream library tags in the IG5 group and 384 downstream library tags in the IG7 group in one-to-one correspondence, the sequence proportion per sample group is adjusted accordingly to ≥ 0.05%.
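The adjusted thresholds listed above are consistent with computing 0.2 / number of tag pairs and keeping one non-zero digit, as the specification states for this criterion. A minimal sketch of that arithmetic follows; the function name is hypothetical and the value is returned in percent.

```python
from math import floor, log10


def min_group_share_percent(n_pairs):
    """0.2 / n_pairs expressed in percent, rounded to one significant digit."""
    raw = 0.2 / n_pairs * 100       # e.g. 0.2083...% for 96 pairs
    exponent = floor(log10(raw))    # position of the first non-zero digit
    return round(raw, -exponent)


# Thresholds for the plate sizes mentioned in the text.
thresholds = {n: min_group_share_percent(n) for n in (48, 96, 192, 288, 384)}
```

For example, 288 pairs give 0.2 / 288 = 0.0694%, which rounds to the stated 0.07%.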
In addition, the invention provides the application of the quality control method in sample sequence determination.
The beneficial effects of the invention are as follows: the invention provides a quality control method for detecting unique paired-end library tag combinations, and an application thereof. The method detects cross contamination of library tags efficiently and at relatively low cost, and is better suited to high-throughput determination of sample sequences.
Brief description of the drawings
Fig. 1 shows the quality control result of the first simulation in Example 1;
Fig. 2 shows the quality control result of the second simulation in Example 1;
Fig. 3 shows the result of the first quality control analysis of the IG5 end in Example 2;
Fig. 4 is a heat map of contamination proportions from the first quality control analysis of the IG7 end in Example 2. Fig. 4 covers 96 pairs of tag primers; the abscissa runs from left to right through IG5A01 to IG5A12, IG5B01 to IG5B12, IG5C01 to IG5C12, and so on up to IG5H01 to IG5H12, and the ordinate runs from top to bottom through IG7A01 to IG7A12, IG7B01 to IG7B12, IG7C01 to IG7C12, and so on up to IG7H01 to IG7H12. Points circled with an ellipse are tag primers that fail the requirements; the same applies to the figures below;
Fig. 5 is a heat map of contamination proportions from the first quality control analysis of the IG5 end in Example 2;
Fig. 6 is a distribution plot of contamination proportions from the first quality control analysis of the IG7 and IG5 ends in Example 2;
Fig. 7 shows the stability comparison of the two quality control analyses of the IG7 end in Example 2;
Fig. 8 shows the stability comparison of the two quality control analyses of the IG5 end in Example 2;
Fig. 9 shows the result of the first quality control analysis of the IG5 end in Example 3;
Fig. 10 is a heat map of contamination proportions from the first quality control analysis of the IG7 end in Example 3;
Fig. 11 is a heat map of contamination proportions from the first quality control analysis of the IG5 end in Example 3;
Fig. 12 is a distribution plot of contamination proportions from the first quality control analysis of the IG7 and IG5 ends in Example 3.
Detailed description of the embodiments
To better illustrate the purpose, technical solutions, and advantages of the invention, the invention is further described below with reference to specific examples and the accompanying drawings.
In addition, it should be noted that in this specification Index, library tag, and tag primer have the same meaning; and in calculating the sequence proportion per sample group, the result of 0.2 / number of library tag pairs is kept to one non-zero digit (rounded).
Principle by which unique paired-end library tags prevent sample contamination caused by cross contamination
In the NGS field, to distinguish different samples within the same sequencing run, a specific "tag" (Index) is added to each sample during library construction, so that the data of different samples can be separated in subsequent analysis. As sequencer throughput continues to increase, more samples are pooled into the same lane (Lane) for sequencing, which places higher demands on the number and distinguishability of Indexes. In addition, the Illumina HiSeq X/4000 and NovaSeq use a clustering method different from that of other Illumina sequencers, and the literature reports that they carry a higher risk of Index cross contamination. A traditional single-end Index primer splits data based on one end only, so data are easily misassigned when contamination occurs. Unique paired-end Index primers avoid, as far as possible, the risk of sample contamination caused by Index cross contamination and thereby ensure product reliability. They provide a "double safeguard" for sequencing reads: because data are split by uniquely paired Indexes at both ends, most contaminated sequences are discarded. Table 1 compares the tolerance of the single-end, combinatorial paired-end, and unique paired-end Index strategies to Index cross contamination.
Table 1
Principle of high-throughput contamination quality inspection of unique paired-end Index primers by NGS
Because unique paired-end Indexes are used, each sample is labeled with an Index twice, so the tolerance to cross contamination between single-end tag primers rises considerably. For example, if the single-side contamination proportion for each of two Index pairs is 1%, the resulting risk of sample misassignment is 1% × 1% = 0.01%. This tolerance also greatly reduces the pressure on Index primer synthesis and purification and further controls manufacturing cost.
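The arithmetic behind this example can be written out explicitly. The symbols below are introduced only for illustration, and the product form assumes that contamination on the upstream (IG5) side and on the downstream (IG7) side are independent events:

```latex
% p_{IG5}, p_{IG7}: single-side contamination proportions (illustrative symbols)
% Assumption: the two contamination events are independent.
P_{\text{misassign}} \approx p_{\mathrm{IG5}} \times p_{\mathrm{IG7}}
  = 0.01 \times 0.01 = 10^{-4} = 0.01\%
```

A read is misassigned only when both of its Indexes are wrong in a way that matches another expected pair, which is why the single-side proportions multiply rather than add.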
Taking advantage of unique paired-end Indexes, the invention provides a simple and feasible quality control method that uses NGS to detect cross contamination of tag primers. Its basic principle is to observe the proportion of unexpected paired-end Index combinations in the overall sequencing result, and from this to estimate the maximum cross contamination that may have occurred and the Indexes involved, so as to avoid misassignment between samples caused by cross contamination between Index primers.
For example, four libraries are labeled A+a, B+b, C+c, and D+d. In sequence analysis, only these four combinations are regarded as valid. Consider the combination A+b: in theory A pairs only with a, so if A+b is observed there are two possibilities: 1) tag primer b has entered primer a; defining S as the number of sequences containing a given Index, the estimated contamination proportion is S(A+b)/S(A); 2) primer A has entered primer B, with estimated contamination proportion S(A+b)/S(b). Note that this calculation presupposes that the Indexes of one class (A/B/C/D) contain no Index of the other class (a/b/c/d). The estimation model also considers only simple one-to-one contamination, not complex cases such as multiple contamination. Furthermore, the calculation estimates the maximum possible contamination and cannot determine its direction: after any single one-way contamination event, such as "A enters B", the event is detected as two possibilities, "A enters B" or "b enters a". Using this model, the maximum combined contamination risk from other primers for the unique paired-end Index library A+a in multiplex sequencing can be estimated as follows:
However, because the only expected combinations are A+a, B+b, C+c, and D+d, the actual effective maximum contamination risk may be calculated as:
In the practical examples, 48 or 96 pairs of Index primers are each used in a PCR that attaches the Indexes to the library; the libraries are then pooled and sequenced in a routine MiSeq run. After sequencing, an analysis script is called directly to analyze the 96 × 96 = 9216 sequence combinations, identify improper combinations, and calculate their respective contamination proportions.
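The kind of tally such an analysis script performs can be sketched as follows. This is an illustrative reconstruction, not the script used in the patent: the function name, the input format (one (upstream, downstream) Index tuple per read), and the example data are assumptions. For each unexpected combination it reports both estimates described above, S(A+b)/S(A) and S(A+b)/S(b).

```python
from collections import Counter


def unexpected_combinations(read_indexes, expected_pairs):
    """List unexpected Index combinations with both contamination estimates.

    read_indexes: iterable of (upstream, downstream) tuples, one per read
    expected_pairs: the valid combinations, e.g. [("A", "a"), ("B", "b")]
    """
    counts = Counter(read_indexes)
    upstream_totals = Counter()
    downstream_totals = Counter()
    for (up, down), n in counts.items():
        upstream_totals[up] += n      # S(up): reads containing this upstream Index
        downstream_totals[down] += n  # S(down): reads containing this downstream Index

    expected = set(expected_pairs)
    report = []
    for (up, down), n in counts.items():
        if (up, down) in expected:
            continue
        report.append({
            "combination": (up, down),
            "reads": n,
            "ratio_vs_upstream": n / upstream_totals[up],
            "ratio_vs_downstream": n / downstream_totals[down],
        })
    # Most abundant unexpected combinations first.
    return sorted(report, key=lambda row: row["reads"], reverse=True)


# Made-up example: 10 reads carry the unexpected combination A+b.
example_reads = [("A", "a")] * 990 + [("A", "b")] * 10 + [("B", "b")] * 1000
example_report = unexpected_combinations(example_reads, [("A", "a"), ("B", "b")])
```

Because the direction of contamination cannot be determined from the counts, both ratios are kept and the larger one can be treated as the worst case.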
Sequencing of the Indexes
1. Preparation of the gDNA standard
1) Quality inspection of 48plex Indexes requires 500 ng of gDNA standard; quality inspection of 96plex Indexes requires 1000 ng of gDNA standard;
2) Add 50 μl of 1× IDTE Buffer to a new 1.5 ml Eppendorf LoBind tube, then add the corresponding volume of gDNA standard: 2 μl for a 48plex Index plate, 4 μl for a 96plex Index plate. Vortex for 10 to 15 s, then centrifuge briefly to return the solution to the bottom of the tube;
3) Transfer the diluted standard to a Covaris microTUBE, top up to 50 μl with 1× IDTE Buffer, and proceed to DNA fragmentation.
2. gDNA fragmentation
Shear the DNA into fragments of 170 to 200 bp using a Covaris M220 instrument. After shearing, remove the Covaris microTUBE and centrifuge to return the liquid to the bottom of the tube.
3. End repair and 3' A-tailing
1) Reagent preparation: open the KAPA Hyper Prep 96 reaction Kit, take out the following 2 tubes, and thaw them on ice;
2) In a new 1.5 ml Eppendorf LoBind tube, prepare the end repair and A-tailing reaction mix on ice; flick the tube 3 to 5 times, invert 2 to 3 times to mix, and centrifuge for 1 to 3 s. The composition of the reaction is shown in Table 2;
3) Dispense 60 μl of the mix into each of 4 (48plex Index plate) or 8 (96plex Index plate) 0.2 ml flat-cap PCR tubes, and centrifuge briefly for 1 to 3 s;
4) Place the tubes in a PCR instrument and run: heated lid 85 °C; 20 °C for 30 min; 65 °C for 30 min; hold at 4 °C. Proceed to the next step within 2 h.
Table 2
4. Adapter ligation: ligate both ends of the A-tailed double-stranded DNA fragments to the prepared adapters (with T overhangs)
1) In a new 1.5 ml Eppendorf LoBind tube, prepare the adapter ligation reaction mix on ice; flick the tube 3 to 5 times, invert 2 to 3 times to mix, and centrifuge for 1 to 3 s. The composition of the reaction is shown in Table 3;
2) Add 50 μl of the mix to each of the above 0.2 ml tubes (4 tubes in total for a 48plex Index plate, 8 tubes for a 96plex Index plate), pipette up and down 5 times to mix, and centrifuge for 1 to 3 s;
3) Run the following program in the PCR instrument: 20 °C for 15 min, 70 °C for 10 min, hold at 4 °C (heated lid 85 °C).
Table 3
5. Purification of the ligation product: remove adapter dimers, unligated adapters, and other components
1) Invert the SPB magnetic beads, equilibrated to room temperature, 2 to 3 times and vortex for 5 to 10 s to homogenize them. Take a 1.5 ml centrifuge tube and add, in order, the homogenized beads and the ligation product at a ligation reaction : bead volume ratio of 1:0.8. Specifically: for 48plex Index, 352 μl of beads and 440 μl of ligation product, with 4 tubes merged into 1 tube for purification, 1 tube in total; for 96plex Index, 2 × 352 μl of beads and 2 × 440 μl of ligation product, with every 4 tubes merged into 1 tube, 2 tubes in total. After addition, vortex to mix, incubate with rotation for 5 min, and centrifuge briefly;
2) Place the tube on a magnetic rack and wait for the solution to clear. Keeping the tube on the rack, open the cap and carefully remove the cleared supernatant without touching the beads;
3) With the tube still on the magnetic rack, add 500 μl of freshly prepared 75% ethanol to each tube and wait 1 min for the beads to settle fully, slowly rotating the tube one full turn horizontally during this time; then remove the ethanol. Repeat this step once;
4) Centrifuge for 1 to 3 s, return the tube to the magnetic rack and let it stand for 30 s, remove residual ethanol with a pipette, and leave the cap open. Dry the beads at room temperature for 3 min, add 500 μl of EB solution to each tube, pipette thoroughly to mix, and incubate at room temperature for 2 min. Place the tube on the magnetic rack for 2 min until the solution clears, then transfer 490 μl of supernatant with a pipette to a new 1.5 ml Eppendorf LoBind centrifuge tube (for a 96plex Index plate, merge the two tubes into one after elution) and keep on ice until use.
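The bead volumes in step 1) follow directly from the 1:0.8 ratio. A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and the 110 μl per-tube volume is not stated in the text but derived from 440 μl of ligation product per 4 merged tubes.

```python
def ligation_cleanup_plan(n_tubes, volume_per_tube_ul=110.0,
                          tubes_per_pool=4, bead_ratio=0.8):
    """Volumes for bead purification of pooled ligation reactions.

    Returns one (ligation_product_ul, bead_ul) tuple per purification tube.
    volume_per_tube_ul is derived from the text (440 ul / 4 tubes), not quoted.
    """
    n_pools = n_tubes // tubes_per_pool
    product_ul = tubes_per_pool * volume_per_tube_ul  # merged ligation product
    bead_ul = product_ul * bead_ratio                 # product : beads = 1 : 0.8
    return [(product_ul, bead_ul)] * n_pools


plan_48plex = ligation_cleanup_plan(4)  # one purification tube
plan_96plex = ligation_cleanup_plan(8)  # two purification tubes
```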
6, amplified library, amplification have connected the library of connector
1) prepare respective volume reaction system in 5ml Eppendorf LoBind pipe (or 15ml centrifuge tube) to mix Liquid (is prepared) on ice, and finger flicks 3~5 times, is turned upside down mixing 2~3 times, stands 0.5~1min vertically;Reaction system is matched It sets as shown in table 4;
2) prepared reaction system mixed liquor is evenly distributed in 8 connecting legs, partial volume is 138 μ l every time (96Index pair Plate (refer part2#) detection needs to carry out mean allocation twice: 142 μ l+132 μ l);
3) reaction system mixed liquor is distributed into 48 new orifice plates (48plex Index) or 96PCR plate (96plex Index), packing volume is 22.5 holes μ l/;
4) taken out from IDP plate 2.5 μ l Index (be added to good 48 orifice plate of reaction system mixed liquor of above-mentioned packing or In 96 hole PCR plates, piping and druming is mixed 2~3 times repeatedly, and sealer;Knockout plate machine is centrifuged 1000rpm, 1min (25 μ l of reaction volume);It sets In running in PCR instrument, operation program is as shown in table 5.
Table 4
Table 5
7. Purification of the amplified library: remove primer dimers and residual reaction components
1) Invert the SPB magnetic beads 2~3 times and vortex at maximum speed for 5~10 s to homogenize them.
2) Transfer the required amount of SPB beads into a reagent reservoir, allowing 20 μL of SPB beads per sample (sample : beads = 1 : 0.8): add about 1440 μL of beads to the reservoir for 48 samples, or about 2880 μL for 96 samples.
3) Remove the 48-well plate from the PCR instrument, spin at 1000 rpm for 3 s, and carefully peel off the sealing film. Draw 20 μL of SPB beads from the reservoir into each well of the 48-well plate / 96-well PCR plate and pipette up and down 10 times.
4) Seal the 48-well plate / 96-well PCR plate with film, spin briefly at 1000 rpm for 3 s, and leave at room temperature for 5 min. Place the plate on a 96-well magnetic rack until the solution is clear; remove the film, then aspirate and discard 45 μL of supernatant.
5) With the plate still on the magnetic rack, add 200 μL of freshly prepared 75% ethanol to each sample well. Let the plate stand on the magnetic rack for 1 min so that the beads are thoroughly washed, then discard the ethanol. Repeat this step once.
6) Leave the plate on the magnetic rack for 30 s and remove all residual ethanol. Remove the plate from the magnetic rack and place it on a PCR plate rack at room temperature for 2 min to dry the beads. Add 14 μL of EB to each well, cap with 8-strip caps, vortex for about 5 s, and spin briefly at 1000 rpm for 3 s.
7) Incubate the plate at room temperature for 2 min, remove the film, and place the plate on the magnetic rack for 2 min until the solution is clear. Transfer 8 μL of supernatant to a new 48-well plate / 96-well PCR plate without aspirating any beads.
8) Transfer the libraries of each column into the same new 0.2 mL 8-tube strip, then transfer the libraries in the 0.2 mL 8-tube strip into the same new 1.5 mL Eppendorf LoBind tube to form the pooled library; vortex to mix and centrifuge. Transfer 20 μL of the mixed, purified library to a new 1.5 mL Eppendorf LoBind tube, add 180 μL of EB, and pipette 5~6 times to prepare a 10-fold pre-diluted library for subsequent testing.
8. Quality inspection of the purified library
Measure the concentration of the diluted library with the dsDNA HS (High Sensitivity) Assay Kit (Thermo Fisher) and convert it back to the concentration of the undiluted library. If the library concentration is between 9 and 60 ng/μL and the LabChip result is normal, library construction is considered qualified and the library can proceed to MiSeq sequencing; if these requirements are not met, library preparation must be repeated.
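The back-calculation and acceptance window described above can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the 10-fold factor is taken from the pre-dilution (20 μL library + 180 μL EB) in the purification step, and the LabChip result is reduced to a boolean supplied by the operator.

```python
DILUTION_FACTOR = 10.0  # 20 uL library + 180 uL EB, per the pre-dilution step


def stock_concentration(diluted_ng_per_ul: float, factor: float = DILUTION_FACTOR) -> float:
    """Convert the reading taken on the diluted library back to the undiluted library."""
    return diluted_ng_per_ul * factor


def library_passes(diluted_ng_per_ul: float, labchip_normal: bool) -> bool:
    """Apply the 9~60 ng/uL window to the back-calculated concentration."""
    stock = stock_concentration(diluted_ng_per_ul)
    return labchip_normal and 9.0 <= stock <= 60.0
```

For example, a reading of 2.5 ng/μL on the diluted library corresponds to 25 ng/μL undiluted, which lies inside the window.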
9. Fragment size detection of the purified library (Library QC)
Analyze the diluted library with the LabChip DNA High Sensitivity Reagent Kit (Perkin Elmer). A qualified library has its main peak at 350~500 bp and no obvious small fragments in the 10~150 bp range.
10. Library sequencing strategy (MiSeq Run)
1) Dilute the purified library to 4 nM according to the concentration measured in QC, and dilute 1 N NaOH to 0.2 N with nuclease-free water.
2) Library denaturation: add 5 μL of the 4 nM library to a new 1.5 mL Eppendorf LoBind tube, add 5 μL of 0.2 N NaOH, mix by pipetting 15~20 times, and incubate at room temperature for 5 min.
3) Dilute the library to 13 pM.
4) For subsequent operations follow the Illumina MiSeq operating guide, sequencing the library with the settings Read1 = 12 cycles, Index1 = 8 cycles and Index2 = 8 cycles.
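The dilution arithmetic behind steps 1) to 3) can be sketched as below. The text does not state how the mass concentration is converted to molarity; the sketch assumes the common approximation of 660 g/mol per base pair of double-stranded DNA and a mean fragment length taken from the LabChip trace. All function names and example values are hypothetical.

```python
from typing import Tuple

AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA; an assumed approximation, not from the text


def ng_per_ul_to_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Molar concentration (nM) of a dsDNA library from ng/uL and mean fragment length."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def dilution_volumes(stock: float, target: float, final_ul: float) -> Tuple[float, float]:
    """Return (library uL, diluent uL) that give `target` concentration in `final_ul`."""
    if target > stock:
        raise ValueError("target concentration exceeds stock concentration")
    library_ul = final_ul * target / stock
    return library_ul, final_ul - library_ul


def denatured_pM(library_nM: float = 4.0, library_ul: float = 5.0, naoh_ul: float = 5.0) -> float:
    """Library concentration (pM) after mixing with NaOH in the denaturation step."""
    return library_nM * library_ul / (library_ul + naoh_ul) * 1000.0


# Fold dilution still needed to go from the denatured library to 13 pM.
fold_to_13_pM = denatured_pM() / 13.0
```

With the volumes given in step 2), the denatured library is at 2000 pM, so roughly a 154-fold further dilution reaches 13 pM.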
11. Sequencing data analysis (QC Analysis)
Export the sequences of all index1 and index2 reads (FASTQ format) using Illumina bcl2fastq software with the appropriate parameters, then analyze the sequences statistically with the corresponding scripts to obtain each QC indicator.
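The analysis scripts themselves are not disclosed. As a rough sketch of what the first tallying step could look like, the code below counts tag pairs from two index FASTQ streams read in lockstep. The barcode sequences, the lookup tables and the exact-match policy are all assumptions made for illustration; real tag sequences and any mismatch tolerance would come from the actual tag design.

```python
from collections import Counter
from itertools import islice
from typing import Dict, Iterable, Iterator


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (second of every four lines) of FASTQ text."""
    return (line.strip() for line in islice(lines, 1, None, 4))


def count_index_pairs(i7_lines: Iterable[str], i5_lines: Iterable[str],
                      i7_names: Dict[str, str], i5_names: Dict[str, str]) -> Counter:
    """Count (IG7 name, IG5 name) pairs; unrecognized barcodes are tallied as 'unknown'."""
    counts: Counter = Counter()
    for s7, s5 in zip(fastq_sequences(i7_lines), fastq_sequences(i5_lines)):
        counts[(i7_names.get(s7, "unknown"), i5_names.get(s5, "unknown"))] += 1
    return counts


# Hypothetical 8-nt barcodes, used only to exercise the function.
i7_names = {"AAAACCCC": "IG7F01"}
i5_names = {"GGGGTTTT": "IG5F01", "TTTTGGGG": "IG5E01"}
i7_fastq = ["@r1", "AAAACCCC", "+", "IIIIIIII", "@r2", "AAAACCCC", "+", "IIIIIIII"]
i5_fastq = ["@r1", "GGGGTTTT", "+", "IIIIIIII", "@r2", "TTTTGGGG", "+", "IIIIIIII"]
pair_counts = count_index_pairs(i7_fastq, i5_fastq, i7_names, i5_names)
```

The resulting per-pair counts are the raw input for the contamination ratios and read-count checks used in the QC analysis.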
12. Criteria for judging the library sequencing results
MiSeq run metrics: sequencing data quality 01: Q30 > 90%; sequencing data quality 02: PF > 97%; sequencing data quality 03: Phasing and Prephasing both below 0.30.
Embodiment 1: Simulation of the quality control method
1) First simulation, one-way contamination: one cross-contamination is detected, with 2 presumed contamination directions and a maximum contamination ratio (i.e. the maximum single-sided tag contamination ratio) of 4%. The simulated data comprise 96 standard matched pairs; for the contaminated pair there are 48000 reads of the correct pairing IG7F01+IG5F01 and 2000 reads of IG7F01+IG5E01, and every other correct pairing has 50000 reads. The simulated library data were analyzed; the actual test results are shown in Table 6, and the QC analysis based on the parameters in Table 6 gives the results shown in Fig. 1. Here the maximum paired contamination product (i.e. the maximum tag-combination contamination ratio) = 4% × 0 = 0, the number of correctly paired reads is 48000, the number of correctly paired and valid reads is 48000, the sequence pass rate is 100%, the number of tags with more than 1% single-sided contamination is 1, and the proportion of indexes with more than 1% contamination (i.e. the proportion of tags with more than 1% single-sided contamination) = 1/96 = 1.04%.
Table 6
2) Second simulation, two-way contamination that can cause sample misassignment: two cross-contaminations are detected, and together they can cause samples to be misassigned; the maximum contamination ratio is 2% and the maximum paired contamination product is 0.04%. In the simulated data each standard matched pair has 50000 reads; for the contaminated pair there are 48000 reads of the correct pairing IG7F01+IG5F01 and 1000 reads each of the mispairings IG7F01+IG5E01 and IG7E01+IG5F01. The simulated library sample was analyzed; the actual test results are shown in Table 7, and the QC analysis based on the parameters in Table 7 gives the results shown in Fig. 2.
Table 7
It can be seen that the results of this simulation test are consistent with expectations.
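The figures quoted for the two simulations can be reproduced with a few lines of arithmetic. The sketch below follows the ratio definitions given in claim 1 (mispaired reads divided by all reads containing the fixed tag); the function names and the dictionary layout keyed by (IG7 tag, IG5 tag) are illustrative assumptions.

```python
from typing import Dict, Tuple

Counts = Dict[Tuple[str, str], int]  # key: (IG7 tag, IG5 tag)


def i5_contamination(counts: Counts, i7_tag: str, i5_contaminant: str) -> float:
    """Share of reads carrying i7_tag whose IG5 tag is i5_contaminant."""
    total = sum(n for (i7, _), n in counts.items() if i7 == i7_tag)
    return counts.get((i7_tag, i5_contaminant), 0) / total


def i7_contamination(counts: Counts, i5_tag: str, i7_contaminant: str) -> float:
    """Share of reads carrying i5_tag whose IG7 tag is i7_contaminant."""
    total = sum(n for (_, i5), n in counts.items() if i5 == i5_tag)
    return counts.get((i7_contaminant, i5_tag), 0) / total


# Simulation 1: one-way contamination.
sim1 = {("IG7F01", "IG5F01"): 48000, ("IG7F01", "IG5E01"): 2000}
sim1_single = i5_contamination(sim1, "IG7F01", "IG5E01")            # 2000 / 50000
sim1_product = sim1_single * i7_contamination(sim1, "IG5F01", "IG7E01")

# Simulation 2: two-way contamination.
sim2 = {("IG7F01", "IG5F01"): 48000,
        ("IG7F01", "IG5E01"): 1000,
        ("IG7E01", "IG5F01"): 1000}
sim2_i5 = i5_contamination(sim2, "IG7F01", "IG5E01")                # 1000 / 49000
sim2_i7 = i7_contamination(sim2, "IG5F01", "IG7E01")                # 1000 / 49000
sim2_product = sim2_i5 * sim2_i7

tags_over_1pct_share = 1 / 96  # one tag out of 96 exceeds 1% in simulation 1
```

Under these definitions simulation 1 gives exactly 4% and a product of 0, while simulation 2 gives about 2.04% per side and a product of about 0.042%, in line with the rounded 2% and 0.04% quoted above.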
Embodiment 2
In this embodiment, 96 pairs of library tags were quality-checked. The results of the first QC analysis report are shown in Tables 8 and 9, which list only the cases in which contamination was observed.
Table 8. Sequencing results for the IG7-end index
Columns, in order: query tag; expected partner; expected combination; unexpected combination; unexpected partner; total reads; reads in the unexpected combination; contamination source; contaminated tag; contamination ratio
IG7A01 IG5A01 IG7A01-IG5A01 IG7A01-IG5B01 IG5B01 96 45 IG5B01 IG5A01 46.88%
IG7A01 IG5A01 IG7A01-IG5A01 IG7A01-IG5A02 IG5A02 96 51 IG5A02 IG5A01 53.13%
IG7A08 IG5A08 IG7A08-IG5A08 IG7A08-IG5H07 IG5H07 53249 88 IG5H07 IG5A08 0.17%
IG7B02 IG5B02 IG7B02-IG5B02 IG7B02-IG5A03 IG5A03 40825 43 IG5A03 IG5B02 0.11%
IG7B10 IG5B10 IG7B10-IG5B10 IG7B10-IG5D08 IG5D08 46021 70 IG5D08 IG5B10 0.15%
IG7B11 IG5B11 IG7B11-IG5B11 IG7B11-IG5C11 IG5C11 47969 68 IG5C11 IG5B11 0.14%
IG7C01 IG5C01 IG7C01-IG5C01 IG7C01-IG5G12 IG5G12 39518 64 IG5G12 IG5C01 0.16%
IG7C06 IG5C06 IG7C06-IG5C06 IG7C06-IG5C07 IG5C07 60810 637 IG5C07 IG5C06 1.05%
IG7C08 IG5C08 IG7C08-IG5C08 IG7C08-IG5B08 IG5B08 67961 119 IG5B08 IG5C08 0.18%
IG7D03 IG5D03 IG7D03-IG5D03 IG7D03-IG5E03 IG5E03 44222 48 IG5E03 IG5D03 0.11%
IG7D03 IG5D03 IG7D03-IG5D03 IG7D03-IG5C03 IG5C03 44222 56 IG5C03 IG5D03 0.13%
IG7D04 IG5D04 IG7D04-IG5D04 IG7D04-IG5D03 IG5D03 40521 41 IG5D03 IG5D04 0.10%
IG7D07 IG5D07 IG7D07-IG5D07 IG7D07-IG5E08 IG5E08 39029 281 IG5E08 IG5D07 0.72%
IG7D08 IG5D08 IG7D08-IG5D08 IG7D08-IG5C08 IG5C08 53581 85 IG5C08 IG5D08 0.16%
IG7D09 IG5D09 IG7D09-IG5D09 IG7D09-IG5E09 IG5E09 54786 70 IG5E09 IG5D09 0.13%
IG7E03 IG5E03 IG7E03-IG5E03 IG7E03-IG5F03 IG5F03 60714 78 IG5F03 IG5E03 0.13%
IG7E07 IG5E07 IG7E07-IG5E07 IG7E07-IG5D07 IG5D07 57285 88 IG5D07 IG5E07 0.15%
IG7F04 IG5F04 IG7F04-IG5F04 IG7F04-IE5D04* IE5D04* 49814 54 IE5D04* IG5F04 0.11%
IG7F07 IG5F07 IG7F07-IG5F07 IG7F07-IG5E07 IG5E07 55273 63 IG5E07 IG5F07 0.11%
IG7G08 IG5G08 IG7G08-IG5G08 IG7G08-IG5F08 IG5F08 43769 167 IG5F08 IG5G08 0.38%
IG7G10 IG5G10 IG7G10-IG5G10 IG7G10-IG5F06 IG5F06 57227 60 IG5F06 IG5G10 0.10%
IG7H02 IG5H02 IG7H02-IG5H02 IG7H02-IG5H03 IG5H03 38360 58 IG5H03 IG5H02 0.15%
IG7H07 IG5H07 IG7H07-IG5H07 IG7H07-IG5G07 IG5G07 36388 42 IG5G07 IG5H07 0.12%
Table 9. Sequencing results for the IG5-end index
Columns, in order: query tag; expected partner; expected combination; unexpected combination; unexpected partner; total reads; reads in the unexpected combination; contamination source; contaminated tag; contamination ratio
IG5A01 IG7A01 IG5A01-IG7A01 IG5A01-IG7B01 IG7B01 26 26 IG7B01 IG7A01 100.00%
IG5A02 IG7A02 IG5A02-IG7A02 IG5A02-IG7A01 IG7A01 49928 51 IG7A01 IG7A02 0.10%
IG5A03 IG7A03 IG5A03-IG7A03 IG5A03-IG7B02 IG7B02 33067 43 IG7B02 IG7A03 0.13%
IG5A08 IG7A08 IG5A08-IG7A08 IG5A08-IG7B08 IG7B08 53201 60 IG7B08 IG7A08 0.11%
IG5B08 IG7B08 IG5B08-IG7B08 IG5B08-IG7C08 IG7C08 61974 119 IG7C08 IG7B08 0.19%
IG5B11 IG7B11 IG5B11-IG7B11 IG5B11-IG7A11 IG7A11 47967 51 IG7A11 IG7B11 0.11%
IG5C03 IG7C03 IG5C03-IG7C03 IG5C03-IG7D03 IG7D03 49273 56 IG7D03 IG7C03 0.11%
IG5C07 IG7C07 IG5C07-IG7C07 IG5C07-IG7C06 IG7C06 45027 637 IG7C06 IG7C07 1.41%
IG5C08 IG7C08 IG5C08-IG7C08 IG5C08-IG7D08 IG7D08 67868 85 IG7D08 IG7C08 0.13%
IG5C11 IG7C11 IG5C11-IG7C11 IG5C11-IG7B11 IG7B11 57807 68 IG7B11 IG7C11 0.12%
IG5D07 IG7D07 IG5D07-IG7D07 IG5D07-IG7E07 IG7E07 38866 88 IG7E07 IG7D07 0.23%
IG5D07 IG7D07 IG5D07-IG7D07 IG5D07-IG7C08 IG7C08 38866 49 IG7C08 IG7D07 0.13%
IG5D08 IG7D08 IG5D08-IG7D08 IG5D08-IG7B10 IG7B10 53619 70 IG7B10 IG7D08 0.13%
IG5D08 IG7D08 IG5D08-IG7D08 IG5D08-IG7E08 IG7E08 53619 65 IG7E08 IG7D08 0.12%
IG5E07 IG7E07 IG5E07-IG7E07 IG5E07-IG7F07 IG7F07 57203 63 IG7F07 IG7E07 0.11%
IG5E08 IG7E08 IG5E08-IG7E08 IG5E08-IG7D07 IG7D07 72767 281 IG7D07 IG7E08 0.39%
IG5E09 IG7E09 IG5E09-IG7E09 IG5E09-IG7D09 IG7D09 58757 70 IG7D09 IG7E09 0.12%
IG5F03 IG7F03 IG5F03-IG7F03 IG5F03-IG7E03 IG7E03 54811 78 IG7E03 IG7F03 0.14%
IG5F06 IG7F06 IG5F06-IG7F06 IG5F06-IG7G10 IG7G10 50348 60 IG7G10 IG7F06 0.12%
IG5F08 IG7F08 IG5F08-IG7F08 IG5F08-IG7G08 IG7G08 67091 167 IG7G08 IG7F08 0.25%
IG5G12 IG7G12 IG5G12-IG7G12 IG5G12-IG7C01 IG7C01 40234 64 IG7C01 IG7G12 0.16%
IG5H03 IG7H03 IG5H03-IG7H03 IG5H03-IG7H02 IG7H02 48832 58 IG7H02 IG7H03 0.12%
IG5H06 IG7H06 IG5H06-IG7H06 IG5H06-IG7A11 IG7A11 42784 62 IG7A11 IG7H06 0.14%
IG5H07 IG7H07 IG5H07-IG7H07 IG5H07-IG7A08 IG7A08 36410 88 IG7A08 IG7H07 0.24%
IG5H11 IG7H11 IG5H11-IG7H11 IG5H11-IG7E11 IG7E11 32519 50 IG7E11 IG7H11 0.15%
Statistical analysis of the first QC results in Tables 8 and 9 gives the information for IG7A01-IG5A01 shown in Table 10 and Fig. 3. In addition, statistical analysis of the 96 tag primer pairs gives the contamination-ratio heat maps of IG7 and IG5 (Figs. 4 and 5) and the distribution of the contamination ratios of IG7 and IG5 (Fig. 6). The conclusions are: 1) very few reads were obtained for the IG7A01-IG5A01 combination: only 96 reads contain IG7A01 and only 26 contain IG5A01, far below the QC requirement of at least 5000 reads and a proportion > 0.2%; 2) because so few reads were obtained for this combination, and the only combinations detected were unexpected ones, the contamination ratio is very high; 3) overall, the well corresponding to IG7A01-IG5A01 is defective and must be replaced, whether judged by the number of valid reads or by the likelihood of contamination.
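The contamination ratios in Tables 8 and 9 are the reads in the unexpected combination divided by the total reads for the query tag, and the conclusion for IG7A01-IG5A01 rests on the 5000-read minimum. A small sketch using four rows copied from those tables is given below; the tuple layout and helper name are illustrative assumptions.

```python
MIN_READS = 5000  # minimum read count quoted in the conclusions above

# (query tag, total reads for the tag, reads in the unexpected combination)
rows = [
    ("IG7A01", 96, 45),      # Table 8
    ("IG7A01", 96, 51),      # Table 8
    ("IG5A01", 26, 26),      # Table 9
    ("IG7C06", 60810, 637),  # Table 8
]


def contamination_pct(total_reads: int, unexpected_reads: int) -> float:
    """Unexpected-combination reads as a percentage of all reads for the query tag."""
    return 100.0 * unexpected_reads / total_reads


# (tag, contamination %, enough reads?) for each row
report = [(tag, contamination_pct(total, bad), total >= MIN_READS)
          for tag, total, bad in rows]
```

The IG7A01 and IG5A01 rows fail the read-count check, whereas IG7C06 has ample reads and a contamination ratio of about 1.05%.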
Table 10
Since the IG7A01-IG5A01 well was defective, the two tag primers IG7A01 and IG5A01 were each re-synthesized. The re-synthesized primers were dissolved to the standard concentration and placed in proportion into the corresponding well of a new deep-well plate; except for the original IG7A01-IG5A01 well, all liquid remaining in the original mother plate that failed QC was transferred to the corresponding positions in the new deep-well plate, and the contamination QC test was then repeated on the new plate. The results of the second QC analysis report are shown in Tables 11 and 12, which list only the cases in which contamination was observed.
Table 11. Sequencing results for the IG7-end index
Columns, in order: query tag; expected partner; expected combination; unexpected combination; unexpected partner; total reads; reads in the unexpected combination; contamination source; contaminated tag; contamination ratio
IG7A01 IG5A01 IG7A01-IG5A01 IG7A01-IG5B01 IG5B01 490179 579 IG5B01 IG5A01 0.12%
IG7A08 IG5A08 IG7A08-IG5A08 IG7A08-IG5H07 IG5H07 244997 300 IG5H07 IG5A08 0.12%
IG7A11 IG5A11 IG7A11-IG5A11 IG7A11-IG5B11 IG5B11 398075 580 IG5B11 IG5A11 0.15%
IG7B02 IG5B02 IG7B02-IG5B02 IG7B02-IG5A03 IG5A03 285357 358 IG5A03 IG5B02 0.13%
IG7B10 IG5B10 IG7B10-IG5B10 IG7B10-IG5D08 IG5D08 262786 435 IG5D08 IG5B10 0.17%
IG7B11 IG5B11 IG7B11-IG5B11 IG7B11-IG5C11 IG5C11 336262 664 IG5C11 IG5B11 0.20%
IG7C06 IG5C06 IG7C06-IG5C06 IG7C06-IG5C07 IG5C07 345406 4253 IG5C07 IG5C06 1.23%
IG7C08 IG5C08 IG7C08-IG5C08 IG7C08-IG5B08 IG5B08 306713 460 IG5B08 IG5C08 0.15%
IG7C09 IG5C09 IG7C09-IG5C09 IG7C09-IG5D09 IG5D09 233352 284 IG5D09 IG5C09 0.12%
IG7C11 IG5C11 IG7C11-IG5C11 IG7C11-IG5D11 IG5D11 343633 744 IG5D11 IG5C11 0.22%
IG7D01 IG5D01 IG7D01-IG5D01 IG7D01-IG5C01 IG5C01 260626 313 IG5C01 IG5D01 0.12%
IG7D03 IG5D03 IG7D03-IG5D03 IG7D03-IG5C03 IG5C03 230602 238 IG5C03 IG5D03 0.10%
IG7D03 IG5D03 IG7D03-IG5D03 IG7D03-IG5E03 IG5E03 230602 264 IG5E03 IG5D03 0.11%
IG7D07 IG5D07 IG7D07-IG5D07 IG7D07-IG5E08 IG5E08 243561 1585 IG5E08 IG5D07 0.65%
IG7D07 IG5D07 IG7D07-IG5D07 IG7D07-IG5C07 IG5C07 243561 255 IG5C07 IG5D07 0.10%
IG7D08 IG5D08 IG7D08-IG5D08 IG7D08-IG5C08 IG5C08 316348 448 IG5C08 IG5D08 0.14%
IG7D09 IG5D09 IG7D09-IG5D09 IG7D09-IG5E09 IG5E09 351153 672 IG5E09 IG5D09 0.19%
IG7E07 IG5E07 IG7E07-IG5E07 IG7E07-IG5D07 IG5D07 314916 468 IG5D07 IG5E07 0.15%
IG7E08 IG5E08 IG7E08-IG5E08 IG7E08-IG5D08 IG5D08 313695 328 IG5D08 IG5E08 0.10%
IG7E08 IG5E08 IG7E08-IG5E08 IG7E08-IG5A11 IG5A11 313695 318 IG5A11 IG5E08 0.10%
IG7E12 IG5E12 IG7E12-IG5E12 IG7E12-IG5D12 IG5D12 189902 195 IG5D12 IG5E12 0.10%
IG7G01 IG5G01 IG7G01-IG5G01 IG7G01-IG5F01 IG5F01 271347 362 IG5F01 IG5G01 0.13%
IG7G08 IG5G08 IG7G08-IG5G08 IG7G08-IG5F08 IG5F08 202711 733 IG5F08 IG5G08 0.36%
IG7G09 IG5G09 IG7G09-IG5G09 IG7G09-IG5H09 IG5H09 268926 317 IG5H09 IG5G09 0.12%
IG7G10 IG5G10 IG7G10-IG5G10 IG7G10-IG5F06 IG5F06 363743 533 IG5F06 IG5G10 0.15%
IG7G10 IG5G10 IG7G10-IG5G10 IG7G10-IG5F10 IG5F10 363743 510 IG5F10 IG5G10 0.14%
IG7H02 IG5H02 IG7H02-IG5H02 IG7H02-IG5H03 IG5H03 237964 294 IG5H03 IG5H02 0.12%
IG7H07 IG5H07 IG7H07-IG5H07 IG7H07-IG5G07 IG5G07 240529 355 IG5G07 IG5H07 0.15%
IG7H08 IG5H08 IG7H08-IG5H08 IG7H08-IG5A09 IG5A09 121660 253 IG5A09 IG5H08 0.21%
Table 12. Sequencing results for the IG5-end index
After the primers were replaced, IG7A01-IG5A01 showed no cross-contamination, and all indicators for the 96 pairs of tag primers met the quality inspection standards.
In addition, this embodiment compares the first and second QC analyses; the results are shown in Table 13, Fig. 7 (comparison for the IG7 tag primers) and Fig. 8 (comparison for the IG5 tag primers). The two QC analyses are reproducible, which indicates that the quality control method of the invention is stable.
Table 13
Embodiment 3
In this embodiment, 96 pairs of library tags were quality-checked. Statistical analysis of the first QC result gives the information for one of the tag primer pairs, shown in Fig. 9; in addition, statistical analysis of the 96 tag primer pairs gives the contamination-ratio heat maps of IG7 and IG5 (Figs. 10 and 11) and the distribution of the contamination ratios of IG7 and IG5 (Fig. 12). Conclusion: in the first QC analysis of this batch, all 96 pairs of tag primers met the indicators.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit its scope of protection. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solution of the invention may be modified or equivalently replaced without departing from its essence and scope.

Claims (10)

1. A quality control method for detecting unique paired-end library tag combinations, characterized by comprising the following steps:
S1) using library tag standards and a gDNA standard as raw materials, construct a gDNA library carrying unique paired-end library tag combinations, sequence the constructed library, and read the library tag sequences;
S2) perform a first QC analysis on the library tag sequences, the indicators of which include: maximum single-sided tag contamination ratio ≤ 2.5%; maximum tag-combination contamination ratio ≤ 0.01%; number of sample sequences per tag group ≥ 5000; coefficient of variation of the mixing proportions of all tag combinations ≤ 0.5; overall sequence pass rate ≥ 97%; proportion of sample sequences per tag group ≥ 0.2 / number of library tag combination pairs; and proportion of tags with more than 1% single-sided contamination ≤ 10%;
S3) if the QC analysis in step S2) shows that the indicators are not met, re-synthesize the library tags that fail the QC requirements; following the method of step S1), use the re-synthesized library tags, the library tags that passed the first QC analysis, and gDNA as raw materials to construct a gDNA library carrying unique paired-end library tag combinations, re-sequence the constructed library, and read the library tag sequences;
S4) perform a second QC analysis on the library tag sequences, until all library tags meet the indicators of the QC analysis;
among the parameters of the QC analysis, each unique paired-end library tag combination consists of an upstream library tag and a downstream library tag; the upstream library tags are collectively referred to as IG5, and IG5 includes A and B; the downstream library tags are collectively referred to as IG7, and IG7 includes a and b; the matched, correct unique paired-end library tag combinations are A-a and B-b; the mismatched unique paired-end library tag combinations are A-b and B-a; after each sequencing reaction, the number of sequences of each of the above combinations can be obtained by analysis;
the single-sided tag contamination ratio is the proportion of cross-contamination occurring between tags within a group, and contamination can only occur within a group, i.e. within the IG5 group and/or within the IG7 group;
when a of IG7 has undergone no cross-contamination during production, for A of IG5, the proportion of B contamination it contains = number of sequences containing B-a / number of all sequences containing a;
when A of IG5 has undergone no cross-contamination during production, for a of IG7, the proportion of b contamination it contains = number of sequences containing A-b / number of all sequences containing A;
when B contaminates A and b contaminates a, the B-b tag-combination contamination ratio = (number of sequences containing B-a / number of all sequences containing a) × (number of sequences containing A-b / number of all sequences containing A);
when b of IG7 has undergone no cross-contamination during production, for B of IG5, the proportion of A contamination it contains = number of sequences containing A-b / number of all sequences containing b;
when B of IG5 has undergone no cross-contamination during production, for b of IG7, the proportion of a contamination it contains = number of sequences containing B-a / number of all sequences containing B;
when A contaminates B and a contaminates b, the A-a tag-combination contamination ratio = (number of sequences containing A-b / number of all sequences containing b) × (number of sequences containing B-a / number of all sequences containing B);
the number of sample sequences per tag group is the number of correctly paired sequences in each group after system filtering, i.e. the number of sequences containing A-a or the number of sequences containing B-b;
the coefficient of variation of the mixing proportions of all tag combinations is the coefficient of variation of the proportion that the number of correctly paired sequences in each group after system filtering represents of the total number of correctly paired sequences after system filtering;
the overall sequence pass rate is the ratio of the total number of correctly paired and valid sequences after system filtering following the sequencing reaction to the total number of all sequences after system filtering;
the proportion of sample sequences per tag group is the ratio of the number of correctly paired sequences in each group after system filtering to the total number of sequences after system filtering;
the proportion of tags with more than 1% single-sided contamination is: within the upstream library tags, the ratio of the number of library tags with a contamination ratio greater than 1% to the total number of library tags; and, within the downstream library tags, the ratio of the number of library tags with a contamination ratio greater than 1% to the total number of library tags.
2. The quality control method according to claim 1, characterized in that step S1) comprises the following steps in sequence: gDNA standard preparation, gDNA fragmentation, end repair, adapter ligation, purification of the adapter-ligation product, library amplification, purification of the amplified library, quality inspection of the purified library, fragment size detection of the purified library, and library sequencing.
3. The quality control method according to claim 1, characterized in that the unique paired-end library tag combinations consist of an IG5 group and an IG7 group, the Hamming distance between library tags within each of the IG5 and IG7 groups is ≥ 3, and the sequence Hamming distance between library tags of the IG5 group and the IG7 group is ≥ 2.
4. The quality control method according to claim 3, characterized in that the library tags are purified by high performance liquid chromatography and their molecular weight is confirmed by mass spectrometry, with a required purity ≥ 85%.
5. The quality control method according to claim 1, characterized in that the unique paired-end library tag combinations consist of 96 pairs of library tags, i.e. 96 upstream library tags in the IG5 group and 96 downstream library tags in the IG7 group, in one-to-one correspondence; the proportion of sample sequences per tag group is correspondingly adjusted to ≥ 0.2%.
6. The quality control method according to claim 1, characterized in that the unique paired-end library tag combinations consist of 48 pairs of library tags, i.e. 48 upstream library tags in the IG5 group and 48 downstream library tags in the IG7 group, in one-to-one correspondence; the proportion of sample sequences per tag group is correspondingly adjusted to ≥ 0.4%.
7. The quality control method according to claim 1, characterized in that when the unique paired-end library tag combinations consist of 192 pairs of library tags, i.e. 192 upstream library tags in the IG5 group and 192 downstream library tags in the IG7 group, in one-to-one correspondence, the proportion of sample sequences per tag group is correspondingly adjusted to ≥ 0.1%.
8. The quality control method according to claim 1, characterized in that when the unique paired-end library tag combinations consist of 288 pairs of library tags, i.e. 288 upstream library tags in the IG5 group and 288 downstream library tags in the IG7 group, in one-to-one correspondence, the proportion of sample sequences per tag group is correspondingly adjusted to ≥ 0.07%.
9. The quality control method according to claim 1, characterized in that when the unique paired-end library tag combinations consist of 384 pairs of library tags, i.e. 384 upstream library tags in the IG5 group and 384 downstream library tags in the IG7 group, in one-to-one correspondence, the proportion of sample sequences per tag group is correspondingly adjusted to ≥ 0.05%.
10. Use of the quality control method according to claims 1~9 in sample sequencing.
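Two of the claimed conditions lend themselves to a short worked sketch: the per-group read-share floor of 0.2 divided by the number of tag pairs (claims 1 and 5 to 9), and the Hamming-distance constraints on the tag sequences (claim 3). The function names are hypothetical and the tag sequences in the usage lines are invented purely for illustration; they are not the tags of the invention.

```python
from itertools import combinations
from typing import Sequence


def min_pair_share(n_pairs: int) -> float:
    """Claim-1 floor on each tag pair's share of reads: 0.2 / number of tag pairs."""
    return 0.2 / n_pairs


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have equal length")
    return sum(x != y for x, y in zip(a, b))


def tags_meet_distance(ig5: Sequence[str], ig7: Sequence[str],
                       within: int = 3, between: int = 2) -> bool:
    """Check claim 3: distance >= `within` inside each group, >= `between` across groups."""
    within_ok = all(hamming(x, y) >= within
                    for group in (ig5, ig7)
                    for x, y in combinations(group, 2))
    between_ok = all(hamming(x, y) >= between for x in ig5 for y in ig7)
    return within_ok and between_ok


# Floors for the plate sizes named in claims 5~9, as percentages.
floors_pct = {n: min_pair_share(n) * 100 for n in (48, 96, 192, 288, 384)}

# Invented 8-nt tags, for illustration only.
good_set = tags_meet_distance(["AAAAAAAA", "AAAAATTT"], ["CCCCCCCC", "CCCCCGGG"])
bad_set = tags_meet_distance(["AAAAAAAA", "AAAAAATT"], ["CCCCCCCC", "CCCCCGGG"])
```

The computed floors (about 0.42%, 0.21%, 0.10%, 0.069% and 0.052%) round to the 0.4%, 0.2%, 0.1%, 0.07% and 0.05% values stated in claims 5 to 9.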
CN201811337895.2A 2018-11-09 2018-11-09 A quality control method and application for detecting unique paired-end library tag combinations Active CN109517882B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811337895.2A CN109517882B (en) 2018-11-09 2018-11-09 A quality control method and application for detecting unique paired-end library tag combinations
CN202111090137.7A CN113957123B (en) 2018-11-09 2018-11-09 Method for constructing and detecting gDNA library containing unique double-end library tag combination

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811337895.2A CN109517882B (en) 2018-11-09 2018-11-09 A quality control method and application for detecting unique paired-end library tag combinations

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202111090137.7A Division CN113957123B (en) 2018-11-09 2018-11-09 Method for constructing and detecting gDNA library containing unique double-end library tag combination

Publications (2)

Publication Number Publication Date
CN109517882A true CN109517882A (en) 2019-03-26
CN109517882B CN109517882B (en) 2021-08-17

Family

ID=65773575

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111090137.7A Active CN113957123B (en) 2018-11-09 2018-11-09 Method for constructing and detecting gDNA library containing unique double-end library tag combination
CN201811337895.2A Active CN109517882B (en) 2018-11-09 2018-11-09 A quality control method and application for detecting unique paired-end library tag combinations

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202111090137.7A Active CN113957123B (en) 2018-11-09 2018-11-09 Method for constructing and detecting gDNA library containing unique double-end library tag combination

Country Status (1)

Country Link
CN (2) CN113957123B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110970091A (en) * 2019-12-20 2020-04-07 北京优迅医学检验实验室有限公司 Label quality control method and device
CN111910258A (en) * 2020-08-19 2020-11-10 纳昂达(南京)生物科技有限公司 Paired-end library tag composition and application thereof in MGI sequencing platform
CN114807309A (en) * 2022-05-19 2022-07-29 广州微远基因科技有限公司 A quality control method for library index primers and its application
CN115197999A (en) * 2022-07-15 2022-10-18 纳昂达(南京)生物科技有限公司 Method and device for synthesizing crosstalk by quality control double-end unique tag connector

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104293783A (en) * 2014-09-30 2015-01-21 天津诺禾致源生物信息科技有限公司 Primer applicable to amplicon sequencing library construction, construction method, amplicon library and kit comprising amplicon library
CN104561294A (en) * 2014-12-26 2015-04-29 北京诺禾致源生物信息科技有限公司 Construction method and sequencing method of genetic typing sequencing library
CN105671644A (en) * 2016-02-26 2016-06-15 武汉冰港生物科技有限公司 Preparation method of genome mixing sequencing library
WO2016109981A1 (en) * 2015-01-09 2016-07-14 深圳华大基因研究院 High-throughput detection method for dna synthesis product
WO2018197950A1 (en) * 2017-04-23 2018-11-01 Illumina Cambridge Limited Compositions and methods for improving sample identification in indexed nucleic acid libraries

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104099666A (en) * 2013-04-15 2014-10-15 江苏基谱生物科技发展有限公司 Construction method for next-generation sequencing library
CN105734048A (en) * 2016-02-26 2016-07-06 武汉冰港生物科技有限公司 PCR-free sequencing library preparation method for genome DNA

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104293783A (en) * 2014-09-30 2015-01-21 天津诺禾致源生物信息科技有限公司 Primer applicable to amplicon sequencing library construction, construction method, amplicon library and kit comprising amplicon library
CN104561294A (en) * 2014-12-26 2015-04-29 北京诺禾致源生物信息科技有限公司 Construction method and sequencing method of genetic typing sequencing library
WO2016109981A1 (en) * 2015-01-09 2016-07-14 深圳华大基因研究院 High-throughput detection method for dna synthesis product
CN105671644A (en) * 2016-02-26 2016-06-15 武汉冰港生物科技有限公司 Preparation method of genome mixing sequencing library
WO2018197950A1 (en) * 2017-04-23 2018-11-01 Illumina Cambridge Limited Compositions and methods for improving sample identification in indexed nucleic acid libraries

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ILLUMINA WHITE PAPER: "Effects of index misassignment on multiplexing and downstream", ILLUMINA *
LAURA E. MACCONAILL et al.: "Unique, dual-indexed sequencing adapters with UMIs effectively eliminate index cross-talk and significantly improve sensitivity of massively parallel sequencing", BMC GENOMICS *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110970091A (en) * 2019-12-20 2020-04-07 北京优迅医学检验实验室有限公司 Label quality control method and device
CN110970091B (en) * 2019-12-20 2023-05-23 北京优迅医学检验实验室有限公司 Label quality control method and device
CN111910258A (en) * 2020-08-19 2020-11-10 纳昂达(南京)生物科技有限公司 Paired-end library tag composition and application thereof in MGI sequencing platform
CN111910258B (en) * 2020-08-19 2021-06-15 纳昂达(南京)生物科技有限公司 Paired-end library tag composition and application thereof in MGI sequencing platform
CN114807309A (en) * 2022-05-19 2022-07-29 广州微远基因科技有限公司 A quality control method for library index primers and its application
CN115197999A (en) * 2022-07-15 2022-10-18 纳昂达(南京)生物科技有限公司 Method and device for synthesizing crosstalk by quality control double-end unique tag connector
CN115197999B (en) * 2022-07-15 2024-01-23 纳昂达(南京)生物科技有限公司 Method and device for synthesizing crosstalk by quality control double-end unique tag connector

Also Published As

Publication number Publication date
CN113957123A (en) 2022-01-21
CN109517882B (en) 2021-08-17
CN113957123B (en) 2025-07-25

Similar Documents

Publication Publication Date Title
CN109517882A (en) A quality control method and application for detecting unique paired-end library tag combinations
CN116064753A (en) Method for constructing high-efficiency sequencing library, primer set and kit
CN108517567B (en) Adaptor, primer group, kit and library construction method for cfDNA library construction
CN114277096B (en) Method and kit for identifying thalassemia alpha anti4.2 heterozygotes and HK alpha heterozygotes
CN117004756A (en) MNP marker sites, primer compositions and kits for identification of Osmanthus fragrans varieties and their applications
CN101956005A (en) Fluorescently-labeled insertion/deletion (InDel) genetic polymorphism locus composite amplification system and application thereof
CN109234380B (en) Kit for detecting related gene of hereditary hearing loss and specific primer group
Ding et al. Monolithic, 3D-printed lab-on-disc platform for multiplexed molecular detection of SARS-CoV-2
CN111748637A (en) A SNP molecular marker combination, multiplex composite amplification primer set, kit and method for kinship analysis and identification
CN113293227A (en) SNP molecular marker primer for identifying color traits of waxberry fruits and application thereof
CN107475451B (en) European and American dual microdroplet digital PCR absolute quantitative detection kit for porcine reproductive and respiratory syndrome virus
CN119177321B (en) Primers and methods for whole genome amplification of porcine epidemic diarrhea virus based on nanopore sequencing
CN116287357A (en) Respiratory tract pathogenic bacteria detection kit based on targeted amplicon sequencing
CN106191319B (en) A multiplex fluorescence immunoassay method for rapid differentiation of 6 avian respiratory pathogens
WO2021203461A1 (en) Position anchoring bar code system for nanopore sequencing library construction
CN112522792B (en) Construction method of RNA sequencing library
WO2017219511A1 (en) Method for rapid homogenization or equal proportion of dna samples
CN106011313B (en) A multiplex fluorescence immunoassay method and reagent for rapid differentiation of ILTV, IBV, MG and MS
CN111139315A (en) Method for high-throughput detection of respiratory viruses by using second-generation sequencing and application
CN105256379A (en) Method for preparing novel genome simplified methylation sequencing library
CN106086241B (en) Primers, kit and method for multiplex fluorescence immunoassay for rapid differentiation of 4 avian respiratory pathogens
CN110438219B (en) Primers, probes, kits and methods for non-invasive prenatal diagnosis of Bart's hydrops fetalis based on droplet digital PCR
CN109266723A (en) Rare mutation detection method, its kit and application
CN105779581B (en) A set of core SNP markers suitable for the construction of nucleic acid fingerprint database of Chinese cabbage varieties and their applications
CN113046415A (en) Construction method and application of RNA sequencing library

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant