CN108070609B

CN108070609B - A method for expressing recombinant proteins using Trichoderma reesei as a host

Info

Publication number: CN108070609B
Application number: CN201711446635.4A
Authority: CN
Inventors: 秦丽娜; 董志扬; 江贤章; 黄建忠
Original assignee: Fujian Normal University
Current assignee: Shenzhen Upfo Biotech Co ltd
Priority date: 2017-12-27
Filing date: 2017-12-27
Publication date: 2020-12-18
Anticipated expiration: 2037-12-27
Also published as: CN108070609A

Abstract

The invention discloses a method for expressing a recombinant protein using Trichoderma reesei as a host. The method comprises the following step: cloning an expression cassette containing the gene encoding the target protein to be expressed into the non-coding region of the Trichoderma reesei genome located between the gene with Protein ID 68606 and the gene with Protein ID 68608. Experiments show that when the cbh1 expression cassette is integrated at site ND1 in a cbh1 deletion strain and cbh1 expression is compared with that of the wild-type strain, a recombinant gene inserted at this site reaches an expression level comparable to that obtained at the cbh1 locus. The invention further extends and improves the technical schemes available for efficient expression of recombinant proteins in Trichoderma reesei.

Description

A method for expressing recombinant proteins using Trichoderma reesei as a host

Technical field

The invention relates to the field of genetic engineering, and in particular to a method for expressing recombinant proteins using Trichoderma reesei as a host.

Background art

The filamentous fungus Trichoderma reesei is a major industrial producer of cellulases and hemicellulases and is a food-safe strain recognized by the US FDA. Trichoderma reesei has a powerful protein secretion capacity: some mutant strains secrete up to 100 g/L of extracellular protein. It also has a glycosylation system similar to that of higher mammals. Trichoderma reesei is therefore a very attractive host for recombinant protein expression. It has received close attention from leading international enzyme companies (such as Genencor and Novozymes) and has been applied successfully to the expression and production of a variety of drugs, chemical reagents and enzyme preparations. In recent years, with rapid progress in T. reesei genomics and continued improvement of its genetic manipulation system, the use of T. reesei as a host for recombinant protein expression has attracted growing attention.

It is worth noting that exogenous DNA fragments in Trichoderma reesei mostly integrate into the genome through the non-homologous end-joining (NHEJ) pathway, that is, by random insertion. The site at which the expression cassette of a recombinant protein gene is inserted in the genome directly affects its expression level. For example, if the recombinant gene is inserted into a region regulated by a silencer, it may not be transcribed at all and expression fails, whereas insertion into a region regulated by an enhancer may greatly increase expression. The locus of the cellobiohydrolase gene (cbh1) in T. reesei is generally regarded as a site of efficient gene expression.

Summary of the invention

The purpose of the present invention is to provide a method for expressing recombinant proteins using Trichoderma reesei as a host.

The method provided by the present invention for expressing a recombinant protein using Trichoderma reesei as a host may include the following step: cloning an expression cassette containing the gene encoding the target protein to be expressed into the non-coding region of the Trichoderma reesei genome located between the gene with Protein ID 68606 and the gene with Protein ID 68608.

Here, "cloning an expression cassette containing the gene encoding the target protein to be expressed into the non-coding region of the Trichoderma reesei genome located between the gene with Protein ID 68606 and the gene with Protein ID 68608" may be either of the following:

(a) replacing any segment of the non-coding region located between the gene with Protein ID 68606 and the gene with Protein ID 68608 in the Trichoderma reesei genome with the expression cassette containing the gene encoding the target protein to be expressed;

(b) inserting the expression cassette containing the gene encoding the target protein to be expressed at any position in the non-coding region located between the gene with Protein ID 68606 and the gene with Protein ID 68608 in the Trichoderma reesei genome.

Further, the nucleotide sequence of the non-coding region located between the gene with Protein ID 68606 and the gene with Protein ID 68608 in the Trichoderma reesei genome is SEQ ID No.1.

In the embodiments of the present invention, this cloning step is specifically: replacing positions 429-551 of SEQ ID No.1 in the Trichoderma reesei genome with the expression cassette containing the gene encoding the target protein to be expressed.
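The two options (a) and (b), and the specific replacement of positions 429-551, can be pictured as simple string operations on the non-coding region. The following is a minimal sketch, assuming 1-based inclusive coordinates as used in the sequence listing; the function names are hypothetical, and the 1000-base placeholder stands in for SEQ ID No.1, whose actual sequence and length are not reproduced here.

```python
def replace_region(region: str, cassette: str, start: int, end: int) -> str:
    """Option (a): replace 1-based inclusive positions start..end with the cassette."""
    if not (1 <= start <= end <= len(region)):
        raise ValueError("coordinates fall outside the region")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return region[:start - 1] + cassette + region[end:]


def insert_after(region: str, cassette: str, position: int) -> str:
    """Option (b): insert the cassette after 1-based position (0 = before the first base)."""
    if not (0 <= position <= len(region)):
        raise ValueError("position falls outside the region")
    return region[:position] + cassette + region[position:]


# Placeholder for SEQ ID No.1 (hypothetical length of 1000 bases).
region = "N" * 1000
cassette = "CASSETTE"  # stands in for promoter + coding gene + terminator

edited = replace_region(region, cassette, 429, 551)
removed = 551 - 429 + 1  # number of bases replaced in the embodiment
print(removed, len(edited))
```

With these coordinates the embodiment replaces a 123-base segment, so the edited region is 123 bases shorter than the original plus the length of the cassette.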

As is conventionally understood in the art, in the above method the expression cassette containing the gene encoding the target protein to be expressed comprises, in order from the 5' end to the 3' end, a promoter, the gene encoding the target protein whose expression is driven by that promoter, and a terminator. Other types of regulatory sequences may of course be included as needed.

The term "expression cassette" refers to a double-stranded nucleic acid molecule that contains the gene of interest and all regulatory sequences necessary for its expression. Under compatible conditions, the regulatory sequences direct expression of the coding sequence in a suitable host cell. The regulatory sequences include, but are not limited to, leader sequences, polyadenylation sequences, promoters, signal sequences and transcription terminators. At a minimum, the regulatory sequences include a promoter and transcriptional and translational termination signals. Regulatory sequences may be provided with linkers that introduce specific restriction sites for ligating the regulatory sequences to the coding region of the gene of interest. A regulatory sequence may be a suitable promoter sequence, that is, a nucleic acid sequence recognized by the host cell in which the nucleic acid sequence is expressed. The promoter sequence contains transcriptional regulatory sequences that mediate expression of the gene of interest. The promoter may be any nucleic acid sequence that is transcriptionally active in the chosen host cell, including mutant, truncated and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides that are homologous or heterologous to the host cell. A regulatory sequence may also be a suitable transcription termination sequence, that is, a sequence recognized by the host cell to terminate transcription. The termination sequence is operably linked to the 3' end of the nucleic acid sequence encoding the polypeptide. Any terminator that is functional in the chosen host cell may be used in the present invention. A regulatory sequence may also be a suitable leader sequence, that is, an untranslated region of the mRNA that is important for translation by the host cell. The leader sequence is operably linked to the 5' end of the gene of interest. Any leader sequence that is functional in the chosen host cell may be used in the present invention. In these cases, the nucleic acid sequence of the gene of interest should be operably linked to the regulatory sequences. The term "operably linked" is defined herein as a configuration in which a regulatory sequence is placed at an appropriate position relative to the gene of interest so that the regulatory sequence directs expression of the gene of interest.

In one embodiment of the present invention, the promoter is the promoter of the cellobiohydrolase I gene from Trichoderma reesei, and the terminator is the terminator of the cellobiohydrolase I gene from Trichoderma reesei.

In another embodiment of the present invention, the promoter is the promoter of the cellobiohydrolase I gene from Trichoderma reesei, and the terminator is the terminator of the cellobiohydrolase II gene from Trichoderma reesei.

Further, the sequence of the promoter of the Trichoderma reesei cellobiohydrolase I gene is specifically positions 1-2078 of SEQ ID No.2 or positions 1939-4016 of SEQ ID No.3. The sequence of the terminator of the Trichoderma reesei cellobiohydrolase I gene is specifically positions 3685-5927 of SEQ ID No.2. The sequence of the terminator of the Trichoderma reesei cellobiohydrolase II gene is specifically positions 5693-7784 of SEQ ID No.3.

In the above method, the target protein to be expressed is a protein of eukaryotic origin or a protein of prokaryotic origin.

Further, the eukaryote may be a fungus, a plant or an animal.

Still further, the fungus in the present invention is specifically Aspergillus niger or Trichoderma reesei.

In one embodiment of the present invention, the target protein to be expressed is a lipase from Aspergillus niger whose amino acid sequence is identical to the amino acid sequence encoded by positions 2138-2947 of SEQ ID No.2.

In another embodiment of the present invention, the target protein to be expressed is cellobiohydrolase I from Trichoderma reesei, whose amino acid sequence is identical to the amino acid sequence encoded by positions 4017-5692 of SEQ ID No.3.

In the above method, cloning the expression cassette into the non-coding region located between the gene with Protein ID 68606 and the gene with Protein ID 68608 in the Trichoderma reesei genome can be achieved by introducing a recombinant expression vector into a Trichoderma reesei host strain. The recombinant expression vector contains the expression cassette together with homology arms located upstream and downstream of the cassette for site-directed insertion or replacement.

Further, in one embodiment of the present invention, the homology arms are specifically the two DNA fragments shown at positions 1-1938 (upstream homology arm) and positions 7785-9710 (downstream homology arm) of SEQ ID No.3.
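The coordinates quoted for SEQ ID No.3 describe a targeting construct laid out as upstream arm, cbh1 promoter, cbh1 coding region, cbh2 terminator and downstream arm. As a quick consistency check, the sketch below records those coordinates (taken from the text) and verifies that the features tile the sequence without gaps or overlaps. The `Feature` class, the feature labels and the `is_contiguous` helper are illustrative names, not part of the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Coordinates within SEQ ID No.3 as cited in the description.
SEQ_ID_3 = [
    Feature("upstream homology arm", 1, 1938),
    Feature("cbh1 promoter", 1939, 4016),
    Feature("cbh1 coding region", 4017, 5692),
    Feature("cbh2 terminator", 5693, 7784),
    Feature("downstream homology arm", 7785, 9710),
]


def is_contiguous(features) -> bool:
    """True if each feature starts immediately after the previous one ends."""
    return all(b.start == a.end + 1 for a, b in zip(features, features[1:]))


for f in SEQ_ID_3:
    print(f"{f.name}: {f.start}-{f.end} ({f.length} bp)")
print("contiguous:", is_contiguous(SEQ_ID_3))
print("total:", sum(f.length for f in SEQ_ID_3), "bp")
```

The promoter length obtained this way (2078 bp) matches the length of the promoter segment cited for SEQ ID No.2 (positions 1-2078).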

In the present invention, transformants in which a heterologous lipase gene had inserted randomly at different positions in the Trichoderma reesei genome were screened at high throughput by flow cytometry, and gene copy number and insertion site were determined for transformants with high lipase expression. Insertion-site analysis of one high-yielding strain carrying a single copy of the recombinant protein gene identified ND1, a gene insertion site other than the Trichoderma reesei cbh1 locus that favors gene expression. Finally, the cbh1 expression cassette was integrated at this site in a cbh1 deletion strain and cbh1 expression was compared with that of the wild-type strain. Inserting the recombinant gene at this site gave an expression level comparable to that obtained at the cbh1 locus. The invention further extends and improves the technical schemes available for efficient expression of recombinant proteins in Trichoderma reesei.

Description of the drawings

Figure 1 is a schematic diagram of the structure of plasmid pSKLR.

Figure 2 shows protoplast preparation and induced expression of the red fluorescent protein.

Figure 3 shows the correspondence between red fluorescent protein and lipase expression levels.

Figure 4 shows the expression level of the recombinant lipase gene in strain ND1.

Figure 5 shows the component fragments required to construct strain ND1cbh1.

Figure 6 shows the expression level of the cbh1 gene in strains TU6, ND1 and ND1cbh1.

Detailed description of the embodiments

Unless otherwise specified, the experimental methods used in the following examples are conventional methods.

Unless otherwise specified, the materials, reagents and the like used in the following examples are commercially available.

Trichoderma reesei QM9414: ATCC 26921.

Trichoderma reesei TU6: ATCC MYA-256, also described in "F. Gruber, J. Visser, C.P. Kubicek, L.H. de Graaff. The development of a heterologous transformation system for the cellulolytic fungus Trichoderma reesei based on a pyrG-negative mutant strain. Curr Genet (1990) 18:71-76". It is available to the public from the applicant and may be used only to repeat the experiments of the present invention.

Trichoderma reesei N10: described in "Lina Qin, Fu-rong Cai, Xin-rui Dong, Zhen-bang Huang, Yong Tao, Jian-zhong Huang, Zhi-yang Dong. Improved production of heterologous lipase in Trichoderma reesei by RNAi mediated gene silencing of an endogenic highly expressed gene. Bioresour Technol, 2012, 109:116-122". It is available to the public from the applicant and may be used only to repeat the experiments of the present invention.

Trichoderma reesei TR1124: deposited on July 20, 2012 at the China General Microbiological Culture Collection Center (CGMCC; address: Institute of Microbiology, Chinese Academy of Sciences, No. 3, Yard 1, Beichen West Road, Chaoyang District, Beijing) under accession number CGMCC No.6384, and described in Chinese patent application 201210271826.2.

Trichoderma reesei Tu6Δku70: for construction of this strain see "Gene targeting in a nonhomologous end joining deficient Hypocrea jecorina", in which the strain is named Δtku70. A later publication names the strain Tu6Δku70; see "A novel major facilitator transporter TrSTR1 is essential for pentose utilization and involved in xylanase induction in Trichoderma reesei, Biochemical and Biophysical Research Communications, 2015, 460(3):663-669". It is available to the public from the applicant and may be used only to repeat the experiments of the present invention.

Plasmid pSK-Lip: described in "Lina Qin, Fu-rong Cai, Xin-rui Dong, Zhen-bang Huang, Yong Tao, Jian-zhong Huang, Zhi-yang Dong. Improved production of heterologous lipase in Trichoderma reesei by RNAi mediated gene silencing of an endogenic highly expressed gene. Bioresour Technol, 2012, 109:116-122". It is available to the public from the applicant and may be used only to repeat the experiments of the present invention.

Plasmid pDsRed-Monomer-N1: product of TAKARA, Cat. No. 632465.

Plasmid pBluescript SK(+): product of Snapgene, Addgene.

Example 1. Identification and application of ND1, a Trichoderma reesei genomic insertion site that favors recombinant gene expression

The invention uses a 2A peptide sequence from foot-and-mouth disease virus (FMDV) as a linker peptide to express the red fluorescent protein gene DsRed and a lipase gene LipA from Aspergillus niger as a fusion in Trichoderma reesei. The 2A sequence allows DsRed and LipA to be expressed at a 1:1 ratio. Because a signal peptide sequence is included, the LipA product is secreted from the cell, while the red fluorescent protein stays inside the cell and serves as the reporter for high-throughput screening by flow cytometry. The fusion expression cassette integrates randomly into the Trichoderma reesei genome, and differences in insertion site and copy number give transformants with different recombinant protein expression levels. Flow cytometry screening yielded one transformant with high recombinant protein expression. Copy-number determination showed that the recombinant gene in this strain is present as a single copy. The insertion site of the recombinant gene was then analyzed to obtain the flanking sequences on both sides of the site. Finally, the cbh1 expression cassette was integrated at this site in a cbh1 deletion strain and cbh1 expression was compared with that of the wild-type strain. Inserting the recombinant gene at this site gave an expression level comparable to that obtained at the cbh1 locus.

I. Identification of ND1, a Trichoderma reesei genomic insertion site that favors recombinant gene expression

(I) Construction of recombinant plasmids

1. Construction of pSKLR, the fusion expression vector for red fluorescent protein and heterologous lipase

(1) PCR amplification of the promoter and terminator of the cellobiohydrolase I (CBH1) gene

Using Trichoderma reesei QM9414 genomic DNA as the template and Pcbh1-F and Pcbh1-R as primers, PCR was carried out under the following conditions: initial denaturation at 95°C for 5 min; 30 cycles of denaturation at 94°C for 30 s, annealing at 57°C for 30 s and extension at 72°C for 2 min; final extension at 72°C for 10 min. This gave Pcbh1, the promoter fragment of the cbh1 gene. Likewise, using Trichoderma reesei QM9414 genomic DNA as the template and Tcbh1-F and Tcbh1-R as primers, PCR was carried out under the following conditions: initial denaturation at 95°C for 5 min; 30 cycles of denaturation at 94°C for 30 s, annealing at 60°C for 30 s and extension at 72°C for 2.5 min; final extension at 72°C for 10 min. This gave Tcbh1, the terminator fragment of the cbh1 gene. Both fragments were separated by agarose gel electrophoresis and recovered from the gel.
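The two cycling programs above differ only in annealing temperature and extension time. As a rough planning aid, the sketch below adds up the programmed hold times for each reaction. It assumes that only the hold times count (ramp times between temperatures are ignored, so a real run takes longer), and `program_minutes` is a hypothetical helper name.

```python
def program_minutes(initial_s: int, cycle_steps_s: tuple, cycles: int, final_s: int) -> float:
    """Total programmed hold time in minutes, ignoring temperature ramps."""
    return (initial_s + cycles * sum(cycle_steps_s) + final_s) / 60


# Pcbh1: 95 C 5 min; 30 x (94 C 30 s, 57 C 30 s, 72 C 2 min); 72 C 10 min
pcbh1_min = program_minutes(5 * 60, (30, 30, 2 * 60), 30, 10 * 60)

# Tcbh1: 95 C 5 min; 30 x (94 C 30 s, 60 C 30 s, 72 C 2.5 min); 72 C 10 min
tcbh1_min = program_minutes(5 * 60, (30, 30, 150), 30, 10 * 60)

print(pcbh1_min, tcbh1_min)  # 105.0 120.0
```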

(2) Overlap extension PCR amplification of L2aR, the fusion fragment of the lipase gene, the 2A sequence and the red fluorescent protein gene

First round of PCR: using plasmid pSK-Lip as the template and lipase2a-F and lipase2a-R as primers, a lipase gene fragment carrying the 2A sequence and part of the red fluorescent protein gene was amplified. Using plasmid pDsRed-Monomer-N1 as the template and 2ared-F and 2ared-R as primers, a red fluorescent protein gene fragment carrying the 2A sequence and part of the lipase gene was amplified. The two fragments were separated by agarose gel electrophoresis and purified with a gel recovery kit.

Second round of PCR: the two gel-purified PCR products were mixed in equimolar amounts as the template, and overlap PCR with lipase2a-F and 2ared-R as primers gave the fused fragment L2aR. This fragment was cloned into the pMD18T-simple vector and sequenced.
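Overlap extension works because the two first-round products share an identical stretch (here, the 2A sequence) at the end of one fragment and the start of the other. The sketch below illustrates only that joining logic on strings. The sequences are short made-up placeholders rather than the real lipase, 2A or DsRed sequences, and `overlap_join` is a hypothetical helper, not a model of polymerase behavior.

```python
def overlap_join(left: str, right: str, min_overlap: int = 4) -> str:
    """Join two fragments through the longest suffix of `left` that equals a prefix of `right`."""
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("fragments share no overlap of the required length")


# Toy placeholders (NOT the real sequences).
lipase_part = "ATGGCTAGC"
linker_2a = "GGATCCAAGCTT"   # stands in for the shared 2A region
dsred_part = "ATGGACAAC"

round1_lipase = lipase_part + linker_2a   # lipase fragment carrying the linker
round1_dsred = linker_2a + dsred_part     # DsRed fragment carrying the linker

fusion = overlap_join(round1_lipase, round1_dsred)
print(fusion)
```

In the fused product the shared region appears once, flanked by the two gene fragments, which is the arrangement that the outer primers lipase2a-F and 2ared-R then amplify.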

(3) Construction of the fusion expression vector pSKLR

Plasmid pBluescript SK(+) was double-digested with restriction enzymes SalI and NotI, the Pcbh1 fragment with SalI and EcoRI, the L2aR fragment with EcoRI and SpeI, and the Tcbh1 fragment with SpeI and NotI. The digested vector, Pcbh1, L2aR and Tcbh1 fragments were mixed in a four-fragment ligation and transformed into E. coli Top10. Transformant clones were checked by colony PCR with primers lipase2a-F and 2ared-R. Plasmid was extracted from clones that tested correct and, once verified by sequencing, gave the fusion expression vector pSKLR (structure shown in Figure 1).
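The four-fragment ligation is directional because each junction is formed by a different enzyme: SalI joins the vector to Pcbh1, EcoRI joins Pcbh1 to L2aR, SpeI joins L2aR to Tcbh1, and NotI closes the circle back to the vector. The sketch below derives that order from the enzyme pairs given in the text. It makes the simplifying assumption that an end only pairs with an end cut by the same enzyme, and `assembly_order` is a hypothetical helper.

```python
def assembly_order(vector_open: str, vector_close: str, inserts: dict) -> list:
    """Walk from the vector's open end to its closing end, chaining inserts by matching enzymes.

    `inserts` maps fragment name -> (enzyme at 5' end, enzyme at 3' end).
    """
    remaining = dict(inserts)
    order, current = [], vector_open
    while remaining:
        matches = [name for name, (left, _) in remaining.items() if left == current]
        if len(matches) != 1:
            raise ValueError(f"expected exactly one fragment starting with {current}")
        name = matches[0]
        order.append(name)
        current = remaining.pop(name)[1]
    if current != vector_close:
        raise ValueError("last fragment does not match the vector's closing end")
    return order


# Enzyme pairs as stated for each digested fragment.
inserts = {
    "Tcbh1": ("SpeI", "NotI"),
    "Pcbh1": ("SalI", "EcoRI"),
    "L2aR": ("EcoRI", "SpeI"),
}

# pBluescript SK(+) was cut with SalI and NotI.
order = assembly_order("SalI", "NotI", inserts)
print(" -> ".join(order))  # Pcbh1 -> L2aR -> Tcbh1
```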

2. Construction of pSKpyr4, the selection marker vector for uracil auxotrophy

(1) Cloning of the Trichoderma reesei pyr4 gene

Using Trichoderma reesei QM9414 genomic DNA as the template and Pyr4-F and Pyr4-R as primers, PCR was carried out under the following conditions: initial denaturation at 95°C for 5 min; 30 cycles of denaturation at 94°C for 30 s, annealing at 62°C for 40 s and extension at 72°C for 2 min; final extension at 72°C for 10 min. The PCR product was run on a 1% agarose gel, the target band was recovered with a gel recovery kit, and the fragment was TA-cloned into the pMD18T-Simple vector and sequenced.

(2) Construction of plasmid vector pSKpyr4

Plasmid pBluescript SK(+) and the sequence-verified pyr4 TA clone plasmid were each double-digested with restriction enzymes ClaI and EcoRI. The digested pBluescript SK(+) vector fragment and pyr4 gene fragment were recovered and ligated. The ligation was transformed into E. coli Top10, and transformants were picked and checked by colony PCR with primers Pyr4-F and Pyr4-R under the same PCR conditions as above. Plasmid was extracted and, once verified by sequencing, gave pSKpyr4, the complementation plasmid vector for the uracil-auxotrophic strain TU6.

All primers used in constructing the above recombinant plasmids are listed in Table 1.

Table 1. Primers used to construct the recombinant vectors

(II) Generation of Trichoderma reesei transformants expressing the red fluorescent protein and heterologous lipase fusion

1. Protoplast preparation from Trichoderma reesei strain TU6

(1) Take spores of Trichoderma reesei TU6 from a freshly grown slant or plate, wash them with an appropriate amount of sterile water to make a spore suspension, and filter through a 200-mesh sieve to remove residual hyphae. Inoculate the filtered spore suspension into a 500 mL Erlenmeyer flask containing 100 mL of MM medium (2% glucose as the carbon source) and incubate at 28°C for 13-14 h until the hyphae have extended.

(2) Filter the culture through a 200-mesh sieve to collect the mycelium, wash 2-3 times with sterile water and finally once with 1.2 M MgSO₄ solution, and let the solution drain off naturally.

(3) Rinse the mycelium from the sieve into an Erlenmeyer flask containing 15 mL of lysis solution (1.2 M MgSO₄ containing 150 mg of Lysing Enzyme, Sigma, Cat. No. L-1412, and 15 mg of Cellulase R-10, YAKULT, Japan). Incubate at 30°C for 1.5 h and monitor protoplast formation under the microscope, taking a sample every 10 min after the first hour.

(4) When protoplasts have formed in large numbers while many hyphae still remain, add an equal volume of 0.6 M sorbitol solution to stop the reaction, filter through a 200-mesh sieve to remove residual hyphae, and centrifuge at 3000 rpm for 10 min at room temperature to collect the protoplast pellet.

(5) Pour off the supernatant from the side opposite the pellet, resuspend the protoplast pellet in 1.0 M sorbitol solution, and centrifuge at 3000 rpm for 10 min at room temperature.

(6) Repeat step (5), discard the supernatant, suspend the protoplasts in 200 μL of 1.0 M sorbitol solution, and observe and count them with a hemocytometer.
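The text does not say how the hemocytometer count in step (6) is converted to a concentration. A common convention, assumed here, is an improved Neubauer chamber in which each large 1 mm x 1 mm square holds 0.1 μL, so the mean count per large square multiplied by 10,000 gives protoplasts per mL of the counted sample. The function name, the counts and the dilution factor below are hypothetical.

```python
def protoplasts_per_ml(square_counts, dilution_factor=1, squares_per_ml=10_000):
    """Concentration of the undiluted suspension from counts in large hemocytometer squares.

    Assumes each counted square holds 0.1 uL, i.e. 10,000 squares per mL
    (improved Neubauer geometry); adjust `squares_per_ml` for other chambers.
    """
    if not square_counts:
        raise ValueError("at least one square count is required")
    mean_count = sum(square_counts) / len(square_counts)
    return mean_count * squares_per_ml * dilution_factor


# Hypothetical counts from four large squares of a 1:10 diluted sample.
concentration = protoplasts_per_ml([52, 48, 50, 50], dilution_factor=10)
print(f"{concentration:.2e} protoplasts/mL")
```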

2、融合表达载体的里氏木霉转化2. Trichoderma reesei transformation of fusion expression vector

(1) Plasmids pSKpyr4 and pSKLR were prepared on a large scale, precipitated overnight with 2 volumes of absolute ethanol and 1/10 volume of 3.0 M sodium acetate (pH 5.2), washed twice with 70% ethanol, and dissolved in sterile ddH₂O so that each plasmid reached a concentration at the μg level.

(2) Plasmids pSKpyr4 and pSKLR were mixed at a molar ratio of 1:3 to 1:10 in a total volume of no more than 20 μL. The mixture was added to the protoplasts prepared above and mixed gently. 50 μL of PEG4000 was then added to each tube, and the tubes were mixed again and kept on ice for 30 min. For the control, an equal volume of sterile water replaced the DNA mixture.

(3) A further 1 mL of PEG4000 was added to each tube, mixed, and left at room temperature for 20 min.

(4) Finally, 1 mL of 1.0 M sorbitol was added to each tube and mixed again, and the entire protoplast transformation mixture was transferred to a 250 mL Erlenmeyer flask containing 50 mL of induction medium. Per litre, the induction medium contains 1 g glucose, 1 g glycerol, 20 g lactose, 180 g sorbitol, 0.05 g (NH₄)₂SO₄, 0.15 g KH₂PO₄, 0.006 g MgSO₄, 0.006 g CaCl₂, 0.00005 g FeSO₄·7H₂O, 0.000016 g MnSO₄·H₂O, 0.000014 g ZnSO₄·7H₂O and 0.00002 g CoCl₂. The cultures were shaken gently at 30°C and 100 rpm for 24 h, after which the speed was raised to 200 rpm and shaking was continued.

(5) After 3-6 days of culture at 30°C, all mycelia were collected and protoplasts were prepared again for flow cytometric sorting (see Figure 2). In Figure 2, A: mycelial growth after the protoplasts had been induced in the regeneration induction medium for 96 h; B: induction of red fluorescent protein observed under a fluorescence microscope; C: protoplasts re-prepared from the induced transformants, observed under an ordinary light microscope; D: expression of red fluorescent protein in the protoplasts re-prepared from the induced transformants, observed under a fluorescence microscope. Figure 2 shows that, once the recombinant protein has been expressed and the mycelia have been converted into protoplasts, red fluorescence is still clearly detectable, so the protoplasts can be used for flow cytometric sorting.
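The induction medium in step (4) is specified per litre but is used in 50 mL portions. The sketch below scales the recipe to a working volume. The dictionary keys and the function name `scale_recipe` are illustrative choices, not part of the original method; the per-litre amounts are copied from step (4).

```python
# Induction medium recipe from step (4), in grams per litre of medium.
INDUCTION_MEDIUM_G_PER_L = {
    "glucose": 1,
    "glycerol": 1,
    "lactose": 20,
    "sorbitol": 180,
    "(NH4)2SO4": 0.05,
    "KH2PO4": 0.15,
    "MgSO4": 0.006,
    "CaCl2": 0.006,
    "FeSO4.7H2O": 0.00005,
    "MnSO4.H2O": 0.000016,
    "ZnSO4.7H2O": 0.000014,
    "CoCl2": 0.00002,
}


def scale_recipe(recipe_g_per_l, volume_ml):
    """Return the grams of each component needed for volume_ml of medium."""
    return {name: grams * volume_ml / 1000 for name, grams in recipe_g_per_l.items()}


if __name__ == "__main__":
    # Amounts for one 50 mL flask of induction medium.
    for name, grams in scale_recipe(INDUCTION_MEDIUM_G_PER_L, 50).items():
        print(f"{name}: {grams:.3g} g")
```

In practice the trace salts would normally be added from a concentrated stock rather than weighed individually; that stock is not described in the source.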

(III) Flow cytometric screening of transformants

1. Sample preparation: All mycelia regenerated from the transformants under inducing conditions were collected, converted into a protoplast suspension again, and filtered through a 400-mesh sieve. The integrity of the protoplasts was checked by microscopy, and the protoplasts were counted with a hemocytometer. The suspension was adjusted to 10⁷ protoplasts/mL in a volume of 500 μL. Expression of red fluorescent protein was checked by fluorescence microscopy.

2. Sorter setup: Protoplasts carrying RFP were sorted on a FACS Aria flow cytometer (BD, USA) with a 100 μm nozzle, 557 nm green-light excitation, and a sheath pressure of 20 psi. The 100 μm nozzle was selected and cleaned ultrasonically for 1 min. After start-up, the main stream breakoff window was opened, the main stream was switched on, and the stream oscillation amplitude (Ampl) was adjusted until the drop gap value (Gap) was stable; the setup parameters were optimized as far as possible to keep the main stream continuous and stable. Once the main stream was stable, the voltage in the side stream window was switched on, Test Sort was clicked, and the parameters were adjusted until the side streams separated cleanly. The four-way sorting device was then installed, the side stream voltage was switched on, Test Sort was clicked, the waste drawer was retracted, and the voltage sliders in the side stream window were adjusted so that the deflected sort streams fell into the corresponding collection tubes. The Dmpl value actually obtained in the main stream window was entered into the default window.

3. Four-way cell sorting: Protoplasts of the starting strain TU6 served as the negative control for setting the sorting gates. The gates were set on the following principles: intact protoplasts have a relatively high ratio of forward scatter (FSC) to side scatter (SSC), and, relative to the negative control, RFP-positive cells emit strongly in the green-light channel (585 nm). Cells were sorted directly into protoplast regeneration medium (MM + sorbitol).

4. Secondary screening of transformant protoplasts: The group of protoplasts with the strongest fluorescence from the sort above was regenerated, a protoplast suspension was prepared again, and single cells were sorted into 96-well microplates pre-filled with induction regeneration medium.
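The gating logic in item 3 can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not say how the gate boundaries were chosen numerically, so the percentile cutoff on the TU6 negative control, the minimum FSC/SSC ratio, and the event field names (`fsc`, `ssc`, `rfp`) are all hypothetical.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (q in 0-100) of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def rfp_threshold(negative_control_rfp, q=99.9):
    """Assumed rule: gate above the q-th percentile of the TU6 negative control."""
    return percentile(negative_control_rfp, q)


def select_events(events, threshold, min_fsc_ssc_ratio):
    """Keep events that look like intact protoplasts and exceed the RFP gate."""
    return [
        e for e in events
        if e["ssc"] > 0
        and e["fsc"] / e["ssc"] >= min_fsc_ssc_ratio  # scatter criterion
        and e["rfp"] > threshold                      # fluorescence criterion
    ]


if __name__ == "__main__":
    negative = [10, 12, 11, 15, 9]  # made-up TU6 RFP-channel signals
    events = [
        {"fsc": 200, "ssc": 100, "rfp": 500},
        {"fsc": 200, "ssc": 100, "rfp": 14},
        {"fsc": 50, "ssc": 100, "rfp": 900},
    ]
    gate = rfp_threshold(negative)
    print(gate, select_events(events, gate, min_fsc_ssc_ratio=1.0))
```

On a real instrument the gates are drawn in the acquisition software; the code only makes the two criteria explicit.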

(IV) Relationship between the expression levels of red fluorescent protein and heterologous lipase

After single-cell sorting into 96-well microplates, 46 strains with differing fluorescence intensities were selected and sporulated on PDA solid plates; per litre, PDA solid medium contains 200 g potato, 20 g glucose and 20 g agar. The spores were made into a spore suspension and inoculated into seed flasks. Per litre, the seed medium contains 20 g glucose, 0.05 g (NH₄)₂SO₄, 0.15 g KH₂PO₄, 0.006 g MgSO₄, 0.006 g CaCl₂, 0.00005 g FeSO₄·7H₂O, 0.000016 g MnSO₄·H₂O, 0.000014 g ZnSO₄·7H₂O and 0.00002 g CoCl₂. After 48 h of shaking culture at 30°C and 200 rpm, the mycelia were filtered through four layers of gauze, and 1.8 g of mycelia from each strain was inoculated into fermentation medium, which is the seed medium with lactose replacing glucose as the carbon source. After 96 h of shaking culture at 30°C and 200 rpm, the lipase activity of the fermentation supernatant and the fluorescence intensity of the mycelia were measured.

1. Fluorescence measurement of red fluorescent protein

Equal weights of mycelia were placed in a 24-well microplate, and the fluorescence of each transformant was read on a multimode fluorescence plate reader with excitation at 557 nm and emission at 585 nm. Each well was scanned in a 13×13 multi-point pattern and the readings were averaged. The fluorescence of strain TU6, which expresses no fluorescent protein, was set to 1, and the transformants were reported as relative fluorescence values.
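The normalization described above (average the 13×13 scan points of a well, then divide by the TU6 value) can be written out as follows. The function names and the example readings are illustrative only; blank subtraction, if any was used, is not described in the source and is therefore omitted.

```python
from statistics import mean


def well_mean(scan_grid):
    """Average every point of a multi-point well scan (e.g. 13 x 13 readings)."""
    return mean(value for row in scan_grid for value in row)


def relative_fluorescence(sample_grid, reference_grid):
    """Fluorescence of a transformant relative to the non-fluorescent strain (= 1)."""
    return well_mean(sample_grid) / well_mean(reference_grid)


if __name__ == "__main__":
    tu6 = [[100] * 13 for _ in range(13)]          # made-up TU6 readings
    transformant = [[250] * 13 for _ in range(13)]  # made-up transformant readings
    print(relative_fluorescence(transformant, tu6))
```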

2. Determination of lipase activity

Lipase activity was determined according to the Chinese national standard GB/T 23535, with emulsified olive oil as the substrate. One lipase activity unit (U) is defined as 1 mL of enzyme solution releasing 1 μmol of free fatty acid from the substrate per minute under the assay conditions.

The relationship between the expression levels of red fluorescent protein and heterologous lipase is shown in Figure 3. The red fluorescent protein level of the transformants follows essentially the same trend as the lipase activity measured in vitro. This indicates that fusing red fluorescent protein and lipase through a 2A sequence as the linker peptide allows the two proteins to be expressed in equal amounts and to exist as separate polypeptides: the lipase, which carries a signal peptide, is efficiently secreted, whereas the red fluorescent protein, which lacks a signal peptide, stays inside the cell and serves as a reporter for high-throughput flow cytometric screening of strains with high lipase expression.
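One way to put a number on "follows the same trend" is a Pearson correlation coefficient between relative fluorescence and lipase activity across the transformants. The source reports only the qualitative agreement in Figure 3, so this is an optional analysis sketch; the data values in the example are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (sx * sy)


if __name__ == "__main__":
    rel_fluorescence = [1.2, 2.5, 3.1, 4.8]   # hypothetical values
    lipase_u_per_ml = [5.0, 11.0, 12.5, 20.0]  # hypothetical values
    print(round(pearson_r(rel_fluorescence, lipase_u_per_ml), 3))
```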

(V) Determination of the recombinant gene copy number in high-producing recombinant protein strains

1. Extraction of DNA from the high-producing recombinant protein strains and the control strain N10

(1) For each high-producing strain and for the Trichoderma reesei control strain N10, a suspension of 10⁸ spores was inoculated into MM medium containing 2% glucose and cultured for 36 h. The mycelia were collected by filtration through a 200-mesh sieve, pressed dry on filter paper, and ground to a powder in liquid nitrogen.

(2) 5 mL of DNA extraction buffer was added per gram of mycelia. After mixing by shaking, the sample was incubated in a 37°C water bath for 1 h and shaken every 10 min.

(3) An equal volume of phenol:chloroform:isoamyl alcohol (25:24:1, v/v/v) was added, mixed by inversion, and centrifuged at 12000 rpm for 10 min.

(4) The supernatant was transferred to a new centrifuge tube, 0.5-0.6 volume of ice-cold isopropanol was added and mixed by inversion, and the tube was kept at -20°C for 30-60 min and then centrifuged at 4°C and 12000 rpm for 10 min.

(5) The pellet was washed twice with 75% ethanol, dissolved in 50-100 μL of 100 mM Tris-HCl (pH 8.0) containing 20 μg/mL RNase, and stored at -20°C until use.

2. Determination of the copy number of the red fluorescent protein-lipase fusion gene by real-time quantitative PCR

With the single-copy actin gene as the internal reference, genomic DNA from N10 and from each high-producing transformant was amplified with primer pairs Actin-F/Actin-R and LipRT-F/LipRT-R (Table 1). All PCR reactions were run on an ABI Prism 7000 real-time detection system (ABI) with Power SYBR Green PCR Master Mix (ABI). The reaction volume was 25 μL and the program was: 50°C for 2 min; initial denaturation at 95°C for 10 min; then 40 cycles of 95°C for 30 s, 60°C for 30 s and 72°C for 30 s. Because the lipase gene is present as a single copy in strain N10 (see Lina Qin, Fu-rong Cai, Xin-rui Dong, Zhen-bang Huang, Yong Tao, Jian-zhong Huang, Zhi-yang Dong. Improved production of heterologous lipase in Trichoderma reesei by RNAi mediated gene silencing of an endogenic highly expressed gene. Bioresour Technol, 2012, 109: 116-122), the copy number of the recombinant gene in each transformant was determined by relative quantification with the 2^-ΔΔCt method.
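The 2^-ΔΔCt calculation used here can be sketched as below, with actin as the reference gene and N10 (one lipase copy) as the calibrator. The function and parameter names are illustrative, the Ct values in the example are invented, and the formula assumes roughly 100% amplification efficiency for both primer pairs, which the source does not report.

```python
def copy_number_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_calibrator, ct_ref_calibrator,
                     calibrator_copies=1):
    """Estimate gene copy number by the 2^-ddCt method.

    target     = gene of interest (here the lipase gene)
    ref        = single-copy reference gene (here actin)
    calibrator = strain with a known copy number (here N10, one copy)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return calibrator_copies * 2 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the transformant's lipase signal appears one
    # cycle earlier than in N10, which corresponds to about two copies.
    print(copy_number_ddct(19.0, 20.0, 20.0, 20.0))
```

A result close to 1 is read as a single copy; in practice the estimate is rounded to the nearest integer after averaging technical replicates.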

(VI) Identification of the insertion site in a high-producing strain carrying a single copy of the recombinant gene

In all transformants, plasmids pSKpyr4 and pSKLR integrated into the Trichoderma reesei genome by random insertion. Besides the complete expression cassette for the red fluorescent protein-lipase fusion protein, the plasmid also carries an E. coli origin of replication and an ampicillin resistance gene. The insertion site was therefore identified by plasmid rescue. In the present invention, the insertion site was analyzed mainly for ND1, a high-producing strain that carries a single copy of the recombinant protein gene. The procedure was as follows:

1. 5-10 μg of genomic DNA was digested with the restriction endonuclease SalI. Once digestion was complete, the reaction was held in a 65°C water bath for 10 min to inactivate the enzyme.

2. 1/10 volume of 3 M sodium acetate and 2.5 volumes of absolute ethanol were added to the digest, and the DNA was precipitated at -20°C for at least 1 h.

3. The sample was centrifuged at 13000 rpm and 4°C for 15 min. The pellet was washed twice with 70% ethanol, air-dried or vacuum-dried, and dissolved in 50-100 μL of sterile ddH₂O.

4. The DNA was ligated overnight at 16°C in the following ligation system.

5. The reaction was held at 65°C for 10 min to inactivate the T4 DNA ligase.

6. 1/10 volume of 3 M sodium acetate and 2.5 volumes of absolute ethanol were added, and the DNA was precipitated at -20°C for at least 1 h.

7. The sample was centrifuged at 13000 rpm and 4°C for 15 min. The pellet was washed twice with 70% ethanol, air-dried or vacuum-dried, and dissolved in 5 μL of sterile ddH₂O.

8. The entire 5 μL of dissolved DNA was used to transform E. coli XL10-gold competent cells.

9. Colonies were picked from the E. coli transformation plates and checked by colony PCR with primers Lip-F and Lip-R (Table 1) to confirm that the clones contained the lipase gene.

10. Plasmids were extracted from the positive clones of step 9 and sent to a sequencing company, which yielded the flanking sequences of the recombinant gene's integration site in the genome.
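Plasmid rescue depends on where the chosen enzyme cuts: SalI recognizes GTCGAC, and the fragment that is recircularized in step 4 runs between the nearest SalI sites on either side of the E. coli replicon and ampicillin marker. The helper below, a sketch that is not part of the original procedure, lists the 1-based start positions of a recognition site in a sequence, so that candidate fragments can be checked in silico. Because GTCGAC is palindromic, only one strand needs to be scanned.

```python
def find_sites(sequence, site="GTCGAC"):
    """Return 1-based start positions of every occurrence of site in sequence.

    The search is case-insensitive and reports overlapping matches.
    The default site is the SalI recognition sequence.
    """
    seq = sequence.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)        # convert to 1-based coordinates
        start = seq.find(site, start + 1)  # continue after this match start
    return positions


if __name__ == "__main__":
    toy = "aaGTCGACttgtcgac"  # toy sequence, not taken from the patent
    print(find_sites(toy))
```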

Strain ND1 was obtained through the screening and identification steps above.

First, the expression level of the recombinant lipase gene in strain ND1 was examined; the results are shown in Figure 4. In Figure 4, a is an SDS-PAGE analysis of the fermentation supernatant after 120 h of fermentation on cellulose, and b is the lipase activity after 120 h of fermentation on cellulose. Trichoderma reesei strain TR1124 is a high-producing lipase strain in which the fusion of the red fluorescent protein gene and the lipase gene is integrated at the cbh1 locus. Figure 4 shows that strain ND1, obtained by high-throughput flow cytometric screening, is also a high-producing lipase strain, and that in this strain the lipase gene is not integrated at the cbh1 locus (see below).

Second, the results in Table 2 show that the recombinant lipase gene is present as a single copy in strain ND1.

Table 2. Copy number of the recombinant lipase gene in strain ND1 determined by real-time PCR

Third, analysis of the recombinant gene insertion site in strain ND1

The recombinant gene insertion site in strain ND1 (referred to as the ND1 site) was found to lie in the non-coding region (SEQ ID No.1) between the gene with Protein ID 68606 and the gene with Protein ID 68608.

Sequencing showed that in strain ND1 the recombinant gene lipA-red replaced positions 429-551 of SEQ ID No.1 in the Trichoderma reesei genome.
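Sequence listings use 1-based, inclusive coordinates, so "positions 429-551" covers 551 - 429 + 1 = 123 bases. The sketch below shows how such a replacement maps onto Python's 0-based slicing; the function names are illustrative and the example uses a short toy sequence rather than SEQ ID No.1.

```python
def region_length(start, end):
    """Length of a 1-based, inclusive interval start..end."""
    return end - start + 1


def replace_region(sequence, start, end, insert):
    """Replace 1-based inclusive positions start..end of sequence with insert."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    # Position `start` is index start-1; index `end` is the first base kept.
    return sequence[:start - 1] + insert + sequence[end:]


if __name__ == "__main__":
    print(region_length(429, 551))                  # bases replaced at the ND1 site
    print(replace_region("AAACCCGGG", 4, 6, "TT"))  # toy example
```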

II. Analysis of the effect of the ND1 insertion site on recombinant protein expression

The genomic position of the cellobiohydrolase gene cbh1 is generally regarded as a site in the Trichoderma reesei genome that favors gene expression. To determine whether the ND1 site identified in Step I or the cbh1 locus is more favorable for gene expression, Trichoderma reesei strain Tu6Δku70, which has a very high homologous recombination rate, was used as the starting strain. The lipase gene lipA was integrated at the cbh1 locus to replace the cbh1 gene, and the cbh1 gene was simultaneously integrated at the ND1 insertion site identified in Step I. The expression level of cbh1 in the resulting strain was measured and compared with that of the control strains Tu6 and ND1. The procedure was as follows:

1. Construction of the component fragments shown in Figure 5

(1) PCR was performed with plasmid pSKLR as the template and Pcbh1-F and Tcbh1-R (Table 1) as primers. The amplified product is the fragment shown in Figure 5a.

The nucleotide sequence of the fragment in Figure 5a is given in SEQ ID No.2, in which positions 1-2078 are the Pcbh1 promoter, positions 2138-2947 are the lipase gene, positions 2948-2998 are the 2A sequence, positions 2999-3676 are the DsRed sequence, and positions 3685-5927 are the Tcbh1 terminator.
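The coordinates given for SEQ ID No.2 can be checked mechanically. The sketch below computes each annotated element's length and lists any stretches the description leaves unannotated. The feature coordinates and the total length of 5927 are taken from the text and the sequence listing; the function names are illustrative, and the code makes no claim about what the unannotated stretches contain.

```python
# (name, start, end) in 1-based inclusive coordinates, as stated for SEQ ID No.2.
SEQ2_FEATURES = [
    ("Pcbh1 promoter", 1, 2078),
    ("lipase gene", 2138, 2947),
    ("2A sequence", 2948, 2998),
    ("DsRed", 2999, 3676),
    ("Tcbh1 terminator", 3685, 5927),
]


def feature_lengths(features):
    """Map each feature name to its length in base pairs."""
    return {name: end - start + 1 for name, start, end in features}


def unannotated_gaps(features, total_length):
    """Return (start, end) intervals not covered by any listed feature."""
    gaps = []
    next_free = 1
    for _, start, end in sorted(features, key=lambda f: f[1]):
        if start > next_free:
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= total_length:
        gaps.append((next_free, total_length))
    return gaps


if __name__ == "__main__":
    print(feature_lengths(SEQ2_FEATURES))
    print(unannotated_gaps(SEQ2_FEATURES, 5927))
```

The lipase, 2A and DsRed intervals are each a multiple of three in length (810, 51 and 678 bp), which is consistent with an in-frame fusion.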

(2) PCR was performed with Trichoderma reesei QM9414 genomic DNA as the template and Flup-F and Flup-R (Table 1) as primers; the product is the upstream homology arm of the insertion site. PCR with Trichoderma reesei QM9414 genomic DNA as the template and Pcbh1-2F and cbh1-2R (Table 1) as primers gave the promoter and ORF of the cbh1 gene. PCR with Trichoderma reesei QM9414 genomic DNA as the template and Tcbh2-F and Tcbh2-R (Table 1) as primers gave the cbh2 terminator. PCR with Trichoderma reesei QM9414 genomic DNA as the template and Fldown-F and Fldown-R (Table 1) as primers gave the downstream homology arm of the insertion site. PCR with plasmid pBluescriptSK(+) as the template and vec-F and vec-R (Table 1) as primers gave the vector backbone. These PCR fragments were gel-purified, joined with a Gibson cloning kit, and transformed into E. coli XL10-gold competent cells to construct the vector pND1-cbh1. Amplification with plasmid pND1-cbh1 as the template and Flup-F and Fldown-R (Table 1) as primers gave the component fragment shown in Figure 5b.

The nucleotide sequence of the fragment in Figure 5b is given in SEQ ID No.3, in which positions 1-1938 are the upstream homology arm, positions 1939-4016 are the Pcbh1 promoter, positions 4017-5692 are the cbh1 gene, positions 5693-7784 are the Tcbh1 terminator, and positions 7785-9710 are the downstream homology arm.

2. The component fragments obtained by PCR above (the two fragments shown in SEQ ID No.2 and SEQ ID No.3) were purified and co-transformed with plasmid pSK-pyr4 into Trichoderma reesei strain Tu6Δku70, following the transformation method of Step I (II).

3. PCR identification of positive transformants: DNA was extracted from the transformants. PCR with primer pair Ocbh1-F/Lip-R (Table 1) was used to determine whether the lipase gene had integrated at the cbh1 locus, and PCR with primer pair OND1-F/Vcbh1-R (Table 1) was used to determine whether the cbh1 gene had integrated at the ND1 site. The confirmed positive transformant was named ND1cbh1.

In strain ND1cbh1, the cbh1 expression cassette replaced positions 429-551 of SEQ ID No.1 in the Trichoderma reesei genome.

4. Analysis of cbh1 expression in the positive transformant: The positive transformant ND1cbh1 and the control strains TU6 and ND1 were sporulated on PDA plates. The spores were washed off with sterile water to make a spore suspension, filtered through a 200-mesh sieve, and adjusted to 10⁸ spores/mL. 1 mL of spore suspension was inoculated into a 250 mL Erlenmeyer flask containing 50 mL of MM medium (2% glucose as the carbon source) and cultured at 28°C and 200 rpm for 48 h. The mycelia were filtered through four layers of gauze, and 1.8 g of wet mycelia was transferred into 50 mL of MM + 1% microcrystalline cellulose medium in a 250 mL Erlenmeyer flask. Uridine at 5 mM was added to the medium inoculated with strain TU6. After 120 h in the cellulose induction medium, CBH1 secretion into the fermentation supernatant was examined by SDS-PAGE and CBH1 activity was measured. DNA and RNA were also extracted from the mycelia, and the copy number and mRNA level of the cbh1 gene were analyzed by real-time quantitative PCR.

(1) Determination of the cbh1 copy number in strain ND1cbh1. Mycelia of strains TU6 and ND1cbh1 were collected and ground in liquid nitrogen, and total DNA was extracted. With the single-copy actin gene as the internal reference, genomic DNA from strains TU6 and ND1cbh1 was amplified with primer pairs Actin-F/Actin-R and RTcbh1-F/RTcbh1-R (Table 1). All PCR reactions were run on an ABI Prism 7000 real-time detection system (ABI) with Power SYBR Green PCR Master Mix (ABI). The reaction volume was 25 μL and the program was: 50°C for 2 min; initial denaturation at 95°C for 10 min; then 40 cycles of 95°C for 30 s, 60°C for 30 s and 72°C for 30 s. Because the cbh1 gene is present as a single copy in strain TU6 (see Table 3 of the article "Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina)"), the copy number of the gene in strain ND1cbh1 was determined by relative quantification with the 2^-ΔΔCt method.

(2) Real-time PCR analysis of the cbh1 mRNA level. Mycelia were collected and ground in liquid nitrogen, total RNA was extracted, and real-time quantitative PCR was carried out as a two-step procedure.

Step 1: Reverse transcription

The following reagents were added to a 200 μL PCR tube:

The tube was incubated at 65°C for 10 min and then placed immediately on ice for 5 min.

The following reagents were then added:

The reaction mixture was incubated at 50°C for 1 h and then heated to 85°C for 10 min to inactivate the reverse transcriptase.

Step 2: Real-time PCR

All PCR reactions were run on an ABI Prism 7000 real-time detection system (ABI) with Power SYBR Green PCR Master Mix (ABI), following the ABI Prism 7000 instructions. The reaction volume was 25 μL and the program was: 50°C for 2 min; initial denaturation at 95°C for 10 min; then 40 cycles of 95°C for 30 s, 60°C for 30 s and 72°C for 30 s. The actin gene served as the internal reference for all transcript levels.

(3) Determination of cellobiohydrolase (CBH) activity

4-Methylumbelliferyl-D-cellobioside (MUC, Sigma) was used as the substrate. 15 mg of MUC was weighed, dissolved in 500 μL of DMSO (Sigma), and transferred into 30 mL of citrate buffer (50 mM, pH 4.8). For the group without cellobiose, 12 μL of suitably diluted enzyme solution was mixed with 200 μL of MUC, 25 μL of glucose (1 M) and 25 μL of citrate buffer. In this mixture glucose inhibits the degradation of MUC by BG (β-glucosidase), so the measured activity is in effect that of CBH1 plus EG (endoglucanase). A group with cellobiose was set up in parallel: 12 μL of enzyme solution, 200 μL of MUC, 25 μL of glucose, 25 μL of cellobiose (50 mM) and 25 μL of citrate buffer. Cellobiose inhibits CBH, so this group measures EG activity. The reactions were incubated at 50°C for 10 min, and 250 μL of Na₂CO₃ (1 M) was then added. 100 μL was taken and diluted 11-fold, and the absorbance was measured at 370 nm. The OD of the group without cellobiose minus the OD of the group with cellobiose gives the activity that is largely specific to CBH. One unit of CBH activity is defined as the amount of enzyme required to catalyze the hydrolysis of 1 nmol of MUC per second.
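The subtraction and unit definition above can be turned into a small calculation. This is a sketch under assumptions: the source does not describe how absorbance was converted to nmol of hydrolyzed MUC, so `nmol_per_od` stands for a hypothetical calibration factor (nmol of product in the whole reaction per absorbance unit of the diluted sample) that would come from a standard curve. The 600 s reaction time and 12 μL enzyme volume are taken from the protocol.

```python
def cbh_activity_u_per_ml(od_without_cellobiose, od_with_cellobiose,
                          nmol_per_od, reaction_s=600, enzyme_ul=12,
                          enzyme_dilution=1):
    """Estimate CBH activity in U per mL of undiluted enzyme solution.

    One unit (U) hydrolyzes 1 nmol of MUC per second.
    nmol_per_od is an assumed calibration factor from a standard curve.
    """
    # Signal without cellobiose = CBH + EG; with cellobiose = EG only.
    delta_od = od_without_cellobiose - od_with_cellobiose
    nmol_hydrolyzed = delta_od * nmol_per_od
    units_in_assay = nmol_hydrolyzed / reaction_s   # nmol per second
    enzyme_ml = enzyme_ul / 1000
    return units_in_assay / enzyme_ml * enzyme_dilution


if __name__ == "__main__":
    # Hypothetical readings and calibration factor.
    print(cbh_activity_u_per_ml(0.8, 0.2, nmol_per_od=120))
```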

5. Results

The copy number of the cbh1 gene in strain ND1cbh1 is shown in Table 3. The copy number in strain ND1cbh1 is close to that in strain TU6, that is, a single copy.

Table 3. Copy number of the recombinant cbh1 gene in strain ND1cbh1 determined by real-time PCR

The expression levels of the cbh1 gene in strains Tu6, ND1 and ND1cbh1 are shown in Figure 6. In Figure 6, a is an SDS-PAGE analysis of the fermentation supernatant after 120 h of fermentation on cellulose, and b is the CBH activity after 120 h of fermentation on cellulose. The figure shows that cbh1 expressed from the ND1 site reaches a level comparable to that in strain TU6, which indicates that this site contributes to recombinant gene expression to a degree comparable to the cbh1 locus.

<110> Fujian Normal University

<120> Method for expressing recombinant protein using Trichoderma reesei as host

<130> GNCLN172245

<160> 3

<170> PatentIn version 3.5

<210> 1

<211> 1095

<212> DNA

<213> Trichoderma reesei

<400> 1

gagagcaatg aactatctct tgggatggct gtgatgacag gctgctccgt atgataggta 60

cagtatgtac gaagaagtac tactattaca attcttcatc aaggctgatt gttgctaggt 120

acacaaaggc cattactcag agcgaagccc aattgcccat cccccataga tcctctcttg 180

agaagaatga aatggatcag cattgacctc aactcccatg atccagtcaa cggctccaaa 240

caagtggcac aatggcccaa tccgtccaag tttttagggt cctcttcaac tatccccgag 300

tccaccacaa ctttcttagc aaataatatc ttcacgataa cacacacaca cagccctaaa 360

ttctctcatg gctaaaatca cacgcccgga taccaccgga tcatttcccg cccggctttt 420

ggccgttgaa tgtttgcttc caactgccga catcgccgtc actatgattg tatcgagagc 480

cgatcaattg gcccttaccc ctcctttttt aaatgtgacg gttggtgaga atccggtcga 540

ctataaattg ggatgaaaat cacgcagaag tcaatgctcg tttctcttcc ttttcacatc 600

tcgttcatcg ttcttccacc aagaaagaat caaaagtcaa ctcaatcgta acctatccaa 660

ataaaagtat tcagtgatat acttcacacc ccccctccaa ccccctctct ccaatgcatc 720

tgacacacag caggccaaat accacactct cctgtcggta caacctcgtc cgtcgccatg 780

gcgtccttcg gtggtagtcc gtccagacgg ggttctgtct cctcatcttc aggtaatgtg 840

tatcgcatga gccggtgccc gggctccccc ccttcctttg tcccacttca ctcctctccc 900

ctcgtctctg cttgtcttta caaacaagga aacacaccca atgagatgga gaaggtgatt 960

gtgtggaaat aaggaagagg gggctcgggc tgagaatata cggccctgaa cttgatctgg 1020

ataataccag cgaaaggatc atgctcctcc cccttttcaa tcccgtccaa atccctaacc 1080

cttgtaggcc catca 1095

<210> 2

<211> 5927

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 2

gacttccgcc cagagctgaa ggtcgcacaa ccgcatgata tagggtcggc aacggcaaaa 60

aagcacgtgg ctcaccgaaa agcaagatgt ttgcgatcta acatccagga acctggatac 120

atccatcatc acgcacgacc actttgatct gctggtaaac tcgtattcgc cctaaaccga 180

agtgcgtggt aaatctacac gtgggcccct ttcggtatac tgcgtgtgtc ttctctaggt 240

gccattcttt tcccttcctc tagtgttgaa ttgtttgtgt tggagtccga gctgtaacta 300

cctctgaatc tctggagaat ggtggactaa cgactaccgt gcacctgcat catgtatata 360

atagtgatcc tgagaagggg ggtttggagc aatgtgggac tttgatggtc atcaaacaaa 420

gaacgaagac gcctcttttg caaagttttg tttcggctac ggtgaagaac tggatacttg 480

ttgtgtcttc tgtgtatttt tgtggcaaca agaggccaga gacaatctat tcaaacacca 540

agcttgctct tttgagctac aagaacctgt ggggtatata tctagagttg tgaagtcggt 600

aatcccgctg tatagtaata cgagtcgcat ctaaatactc cgaagctgct gcgaacccgg 660

agaatcgaga tgtgctggaa agcttctagc gagcggctaa attagcatga aaggctatga 720

gaaattctgg agacggcttg ttgaatcatg gcgttccatt cttcgacaag caaagcgttc 780

cgtcgcagta gcaggcactc attcccgaaa aaactcggag attcctaagt agcgatggaa 840

ccggaataat ataataggca atacattgag ttgcctcgac ggttgcaatg caggggtact 900

gagcttggac ataactgttc cgtaccccac ctcttctcaa cctttggcgt ttccctgatt 960

cagcgtaccc gtacaagtcg taatcactat taacccagac tgaccggacg tgttttgccc 1020

ttcatttgga gaaataatgt cattgcgatg tgtaatttgc ctgcttgacc gactggggct 1080

gttcgaagcc cgaatgtagg attgttatcc gaactctgct cgtagaggca tgttgtgaat 1140

ctgtgtcggg caggacacgc ctcgaaggtt cacggcaagg gaaaccaccg atagcagtgt 1200

ctagtagcaa cctgtaaagc cgcaatgcag catcactgga aaatacaaac caatggctaa 1260

aagtacataa gttaatgcct aaagaagtca tataccagcg gctaataatt gtacaatcaa 1320

gtggctaaac gtaccgtaat ttgccaacgg cttgtggggt tgcagaagca acggcaaagc 1380

cccacttccc cacgtttgtt tcttcactca gtccaatctc agctggtgat cccccaattg 1440

ggtcgcttgt ttgttccggt gaagtgaaag aagacagagg taagaatgtc tgactcggag 1500

cgttttgcat acaaccaagg gcagtgatgg aagacagtga aatgttgaca ttcaaggagt 1560

atttagccag ggatgcttga gtgtatcgtg taaggaggtt tgtctgccga tacgacgaat 1620

actgtatagt cacttctgat gaagtggtcc atattgaaat gtaagtcggc actgaacagg 1680

caaaagattg agttgaaact gcctaagatc tcgggccctc gggccttcgg cctttgggtg 1740

tacatgtttg tgctccgggc aaatgcaaag tgtggtagga tcgaacacac tgctgccttt 1800

accaagcagc tgagggtatg tgataggcaa atgttcaggg gccactgcat ggtttcgaat 1860

agaaagagaa gcttagccaa gaacaatagc cgataaagat agcctcatta aacggaatga 1920

gctagtaggc aaagtcagcg aatgtgtata tataaaggtt cgaggtccgt gcctccctca 1980

tgctctcccc atctactcat caactcagat cctccaggag acttgtacac catcttttga 2040

ggcacagaaa cccaatagtc aaccgcggac tgcgcatcat gtatcggaag ttggccgtca 2100

tctcggcctt cttggccaca gctcgtgctg aattccgagt gtctcgactt ccacgttgga 2160

tgagctgcaa ttgttcgcgc aatggtctgc cgcagcttat tgctcgaaca atatcgactc 2220

ggacgactcc aacgtgacat gcacggccga cgcctgtcca tcagtcgagg aagcgagcac 2280

caagatgctg ctggagtttg atttgacaaa taactttgga ggcacagccg gtttcctggc 2340

cgcggacaac accaacaagc ggctcgtggt cgccttccga ggcagtagca ccatcaagaa 2400

ctggattgct gatctcggct tcatcctgca agataacgat gacctctgta ctggctgcaa 2460

ggttcacact ggattctgga aggcatggga agccgctgca gacaatctga cgagcaagat 2520

caagtccgcg atgagcacgt actcaggcta taccctctac ttcaccgggc acagcttggg 2580

cggcgcattg gctacgctgg gagcaacggt cttgcgaaat gacggttata gcgttgaact 2640

gtacacctat ggatgtcctc gagtcggaaa ctatgcgctg gccgagcaca tcaccagcca 2700

gggatcagga gcgaacttcc gcgttacaca cttgaacgac atcgtccccc ggttgccacc 2760

catggacttt ggattcagcc agccaagtcc agaatactgg atcaccagtg gcaccggagc 2820

cagtgtcacg gcgtcggata ttgaactcat cgagggaatc aattcgacgg cggggaatgc 2880

aggcgaagca acggtggacg ttttggctca cttgtggtac tttttcgcga tttccgagtg 2940

cctgctaaac ttcgacctcc tcaagctcgc cggcgacgtc gagtccaacc ccggccccat 3000

ggacaacacc gaggacgtca tcaaggagtt catgcagttc aaggtgcgca tggagggctc 3060

cgtgaacggc cactacttcg agatcgaggg cgagggcgag ggcaagccct acgagggcac 3120

ccagaccgcc aagctgcagg tgaccaaggg cggccccctg cccttcgcct gggacatcct 3180

gtccccccag ttccagtacg gctccaaggc ctacgtgaag caccccgccg acatccccga 3240

ctacatgaag ctgtccttcc ccgagggctt cacctgggag cgctccatga acttcgagga 3300

cggcggcgtg gtggaggtgc agcaggactc ctccctgcag gacggcacct tcatctacaa 3360

ggtgaagttc aagggcgtga acttccccgc cgacggcccc gtaatgcaga agaagactgc 3420

cggctgggag ccctccaccg agaagctgta cccccaggac ggcgtgctga agggcgagat 3480

ctcccacgcc ctgaagctga aggacggcgg ccactacacc tgcgacttca agaccgtgta 3540

caaggccaag aagcccgtgc agctgcccgg caaccactac gtggactcca agctggacat 3600

caccaaccac aacgaggact acaccgtggt ggagcagtac gagcacgccg aggcccgcca 3660

ctccggctcc cagtagacta gtccagctcc gtggcgaaag cctgacgcac cggtagattc 3720

ttggtgagcc cgtatcatga cggcggcggg agctacatgg ccccgggtga tttatttttt 3780

ttgtatctac ttctgaccct tttcaaatat acggtcaact catctttcac tggagatgcg 3840

gcctgcttgg tattgcgatg ttgtcagctt ggcaaattgt ggctttcgaa aacacaaaac 3900

gattccttag tagccatgca ttttaagata acggaataga agaaagagga aattaaaaaa 3960

aaaaaaaaaa caaacatccc gttcataacc cgtagaatcg ccgctcttcg tgtatcccag 4020

taccacggca aaggtatttc atgatcgttc aatgttgata ttgttcccgc cagtatggct 4080

ccacccccat ctccgcgaat ctcctcttct cgaacgcggt agtggcgcgc caattggtaa 4140

tgacccatag ggagacaaac agcataatag caacagtgga aattagtggc gcaataattg 4200

agaacacagt gagaccatag ctggcggcct ggaaagcact gttggagacc aacttgtccg 4260

ttgcgaggcc aacttgcatt gctgtcaaga cgatgacaac gtagccgagg accgtcacaa 4320

gggacgcaaa gttgtcgcgg atgaggtctc cgtagatggc atagccggca atccgagagt 4380

agcctctcaa caggtggcct tttcgaaacc ggtaaacctt gttcagacgt cctagccgca 4440

gctcaccgta ccagtatcga ggattgacgg cagaatagca gtggctctcc aggatttgac 4500

tggacaaaat cttccagtat tcccaggtca cagtgtctgg cagaagtccc ttctcgcgtg 4560

cgagtcgaaa gtcgctatag tgcgcaatga gagcacagta ggagaatagg aacccgcgag 4620

cacattgttc aatctccaca tgaattggat gactgctggg cagaatgtgc tgcctccaaa 4680

atcctgcgtc caacagatac tctggcaggg gcttcagatg aatgcctctg ggcccccaga 4740

taagatgcag ctctggattc tcggttacga tgatatcgcg agagagcacg agttggtgat 4800

ggaggggacg aggaggcata ggtcggccgc aggcccataa ccagtcttgc acagcattga 4860

tcttcctcac gaggagctcc tgatgcagaa actcctccat gttgctgatt gggttgagaa 4920

tttcatcgct cctggatcgt atggttgctg gcaagaccct gcttaaccgt gccgtgtcat 4980

ggtcatctct ggtggcttcg tcgctggcct gtctttgcaa ttcgacagca aatggtggag 5040

atctctctat cgtgacagtc atggtagcga tagctaggtg tcgttgcacg cacataggcc 5100

gaaatgcgaa gtggaaagaa tttcccggcg cggaatgaag tctcgtcatt ttgtactcgt 5160

actcgacacc tccaccgaag tgttaagaat ggatccacga tgccaaaaag cttgttcatt 5220

tcggctagcc cgtgatcctg gcgcttctag ggctgaaact gtgttgttaa tgtattattg 5280

gctgtgtaac tgacttgaat ggggaatgag gagcgcgatg gattcgcttg catgtcccct 5340

ggccaagacg agccgctttg gcggtttgtg attcgaaggt gtgtcagcgg aggcgccagg 5400

gcaacacgca ctgagccagc caacatgcat tgctgccgac atgaatagac acgcgccgag 5460

cagacatagg agacgtgttg actgtaaaaa ttctactgaa tattagcacg catggtctca 5520

ataagagcaa taggaatgct tgccaatcat aagtacgtat gtgctttttc ctgcaaatgg 5580

tacgtacgga cagttcatgt tgtctgtcat cccccactca ggctctcatg atcattttat 5640

gggactgggg ttttgctgac tgaatggatt cagccgcacg aaacaaattg ggggccatgc 5700

agaagggaag cccccccagc cccctgttca taatttgtta agagtcggag agctgcctag 5760

tatgaagcag caattgataa cgttgacttt gcgcatgagc tctgaagccg ggcatatgta 5820

tcacgtttct gcctagagcc gcacgggacc caagaagctc ttgtcataag gtatttatga 5880

gtgttcagct gccaacgctg gttctacttt ggctcaaccg catccca 5927

<210> 3

<211> 9710

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 3

gacacggtag acaagcgaat cgaaacttta ttgttttgtt cttgcgttct ccctaagtga 60

tgacgaagga ggtgtagcac tgaaagtggc agagccacgt acaggagaga acagctcgta 120

cctgatgaac tgctgtctca gattgaatat catacagctc gtggcattga agctccatag 180

atatgctagc gataagcagg tacagtgagg acacggcttg atcgtggctt gagaatcacc 240

aaggacagtg gaacggagtg gctgctttgg tgattccgac caactgttca cgagaccagc 300

tcggatgccc gggccaacga atcaaccctc cgtaagatgg agccacgatc gtataaccct 360

cgctctcaaa cagatgttaa tatttaggcg gaattgtaaa ctcgggggcc ctggagaggc 420

cgcaactggg cttcatgggc agaacagcgg cgccatgccg gcatcccaga agtggaatgg 480

gcggatcgga gctggtgcgc ggggggaagt aagctgggtg gcaggccctt cggatagctg 540

ggagctgtct caactactag aggagcttag ctttggcagc tcttgataac tgtgacagct 600

attcgccgtt tctcgtagtg ttcctggact gtgtctgaga gctgtcatcc cctcatcacc 660

acttacatgg acgaacagtt gtgtgaatga gcgcaatcat ccacctccgt tctccatctt 720

catcatcagc caacatgtcc agctcaacgc ccacgcggcg cctcatcggc ctctcgacca 780

aaatgtactt ttctctccag caaacaagag acttcacaaa cagcttctta tccaagctgg 840

cagacgtccc gatagagctc ctctcccaca ttgacatctt catcatcccc gacttcgtct 900

ccctgaccac cacaatcgcc cagcttaagt ccagctccgt tgccatttgg accggcgctc 960

aagactgcca ctgggaagac caaggcgcat tcaccggcga agtcagcccc tccgtgttgc 1020

gcagcgcagg cgtcaagctc gtggaagtcg gacatgcaga gcgcaggagc atctttggcg 1080

aggacgacac catggttgcg aataaggctg ccgctgtaag tcgcaatggc atgataccgc 1140

ttgtgtgcat tggagaaaag actcaaggcg acgtgcatat cgcggttgag gagtgtcggg 1200

tgcaggttga tgctgtcatg aatgctgtgc ccgccgaggc agaggtcatt ttggcgtatg 1260

agcctgtatg ggcgattggt gccagtcaac ctgctggaga gaagcacatc ctcgatgttg 1320

ttgctgggat cagagggttg gagtccgttg ggaggagggc ggggacgacg agggttttgt 1380

atggggggag tgccgggccg gggctctacg agaagttgaa gagcggactg gacggattgt 1440

tcctggggcg gttcggacac gatccggagc agtttgtcaa gacgatacag gaagtagctg 1500

aagcgtgagg gagagcaatg aactatctct tgggatggct gtgatgacag gctgctccgt 1560

atgataggta cagtatgtac gaagaagtac tactattaca attcttcatc aaggctgatt 1620

gttgctaggt acacaaaggc cattactcag agcgaagccc aattgcccat cccccataga 1680

tcctctcttg agaagaatga aatggatcag cattgacctc aactcccatg atccagtcaa 1740

cggctccaaa caagtggcac aatggcccaa tccgtccaag tttttagggt cctcttcaac 1800

tatccccgag tccaccacaa ctttcttagc aaataatatc ttcacgataa cacacacaca 1860

cagccctaaa ttctctcatg gctaaaatca cacgcccgga taccaccgga tcatttcccg 1920

cccggctttt ggccgttgga cttccgccca gagctgaagg tcgcacaacc gcatgatata 1980

gggtcggcaa cggcaaaaaa gcacgtggct caccgaaaag caagatgttt gcgatctaac 2040

atccaggaac ctggatacat ccatcatcac gcacgaccac tttgatctgc tggtaaactc 2100

gtattcgccc taaaccgaag tgcgtggtaa atctacacgt gggccccttt cggtatactg 2160

cgtgtgtctt ctctaggtgc cattcttttc ccttcctcta gtgttgaatt gtttgtgttg 2220

gagtccgagc tgtaactacc tctgaatctc tggagaatgg tggactaacg actaccgtgc 2280

acctgcatca tgtatataat agtgatcctg agaagggggg tttggagcaa tgtgggactt 2340

tgatggtcat caaacaaaga acgaagacgc ctcttttgca aagttttgtt tcggctacgg 2400

tgaagaactg gatacttgtt gtgtcttctg tgtatttttg tggcaacaag aggccagaga 2460

caatctattc aaacaccaag cttgctcttt tgagctacaa gaacctgtgg ggtatatatc 2520

tagagttgtg aagtcggtaa tcccgctgta tagtaatacg agtcgcatct aaatactccg 2580

aagctgctgc gaacccggag aatcgagatg tgctggaaag cttctagcga gcggctaaat 2640

tagcatgaaa ggctatgaga aattctggag acggcttgtt gaatcatggc gttccattct 2700

tcgacaagca aagcgttccg tcgcagtagc aggcactcat tcccgaaaaa actcggagat 2760

tcctaagtag cgatggaacc ggaataatat aataggcaat acattgagtt gcctcgacgg 2820

ttgcaatgca ggggtactga gcttggacat aactgttccg taccccacct cttctcaacc 2880

tttggcgttt ccctgattca gcgtacccgt acaagtcgta atcactatta acccagactg 2940

accggacgtg ttttgccctt catttggaga aataatgtca ttgcgatgtg taatttgcct 3000

gcttgaccga ctggggctgt tcgaagcccg aatgtaggat tgttatccga actctgctcg 3060

tagaggcatg ttgtgaatct gtgtcgggca ggacacgcct cgaaggttca cggcaaggga 3120

aaccaccgat agcagtgtct agtagcaacc tgtaaagccg caatgcagca tcactggaaa 3180

atacaaacca atggctaaaa gtacataagt taatgcctaa agaagtcata taccagcggc 3240

taataattgt acaatcaagt ggctaaacgt accgtaattt gccaacggct tgtggggttg 3300

cagaagcaac ggcaaagccc cacttcccca cgtttgtttc ttcactcagt ccaatctcag 3360

ctggtgatcc cccaattggg tcgcttgttt gttccggtga agtgaaagaa gacagaggta 3420

agaatgtctg actcggagcg ttttgcatac aaccaagggc agtgatggaa gacagtgaaa 3480

tgttgacatt caaggagtat ttagccaggg atgcttgagt gtatcgtgta aggaggtttg 3540

tctgccgata cgacgaatac tgtatagtca cttctgatga agtggtccat attgaaatgt 3600

aagtcggcac tgaacaggca aaagattgag ttgaaactgc ctaagatctc gggccctcgg 3660

gccttcggcc tttgggtgta catgtttgtg ctccgggcaa atgcaaagtg tggtaggatc 3720

gaacacactg ctgcctttac caagcagctg agggtatgtg ataggcaaat gttcaggggc 3780

cactgcatgg tttcgaatag aaagagaagc ttagccaaga acaatagccg ataaagatag 3840

cctcattaaa cggaatgagc tagtaggcaa agtcagcgaa tgtgtatata taaaggttcg 3900

aggtccgtgc ctccctcatg ctctccccat ctactcatca actcagatcc tccaggagac 3960

ttgtacacca tcttttgagg cacagaaacc caatagtcaa ccgcggactg cgcatcatgt 4020

atcggaagtt ggccgtcatc tcggccttct tggccacagc tcgtgctcag tcggcctgca 4080

ctctccaatc ggagactcac ccgcctctga catggcagaa atgctcgtct ggtggcacgt 4140

gcactcaaca gacaggctcc gtggtcatcg acgccaactg gcgctggact cacgctacga 4200

acagcagcac gaactgctac gatggcaaca cttggagctc gaccctatgt cctgacaacg 4260

agacctgcgc gaagaactgc tgtctggacg gtgccgccta cgcgtccacg tacggagtta 4320

ccacgagcgg taacagcctc tccattggct ttgtcaccca gtctgcgcag aagaacgttg 4380

gcgctcgcct ttaccttatg gcgagcgaca cgacctacca ggaattcacc ctgcttggca 4440

acgagttctc tttcgatgtt gatgtttcgc agctgccgta agtgacttac catgaacccc 4500

tgacgctatc ttcttgttgg ctcccagctg actggccaat tcaaggtgcg gcttgaacgg 4560

agctctctac ttcgtgtcca tggacgcgga tggtggcgtg agcaagtatc ccaccaacac 4620

cgctggcgcc aagtacggca cggggtactg tgacagccag tgtccccgcg atctgaagtt 4680

catcaatggc caggccaacg ttgagggctg ggagccgtca tccaacaacg cgaacacggg 4740

cattggagga cacggaagct gctgctctga gatggatatc tgggaggcca actccatctc 4800

cgaggctctt accccccacc cttgcacgac tgtcggccag gagatctgcg agggtgatgg 4860

gtgcggcgga acttactccg ataacagata tggcggcact tgcgatcccg atggctgcga 4920

ctggaaccca taccgcctgg gcaacaccag cttctacggc cctggctcaa gctttaccct 4980

cgataccacc aagaaattga ccgttgtcac ccagttcgag acgtcgggtg ccatcaaccg 5040

atactatgtc cagaatggcg tcactttcca gcagcccaac gccgagcttg gtagttactc 5100

tggcaacgag ctcaacgatg attactgcac agctgaggag gcagaattcg gcggatcctc 5160

tttctcagac aagggcggcc tgactcagtt caagaaggct acctctggcg gcatggttct 5220

ggtcatgagt ctgtgggatg atgtgagttt gatggacaaa catgcgcgtt gacaaagagt 5280

caagcagctg actgagatgt tacagtacta cgccaacatg ctgtggctgg actccaccta 5340

cccgacaaac gagacctcct ccacacccgg tgccgtgcgc ggaagctgct ccaccagctc 5400

cggtgtccct gctcaggtcg aatctcagtc tcccaacgcc aaggtcacct tctccaacat 5460

caagttcgga cccattggca gcaccggcaa ccctagcggc ggcaaccctc ccggcggaaa 5520

cccgcctggc accaccacca cccgccgccc agccactacc actggaagct ctcccggacc 5580

tacccagtct cactacggcc agtgcggcgg tattggctac agcggcccca cggtctgcgc 5640

cagcggcaca acttgccagg tcctgaaccc ttactactct cagtgcctgt aaagcgttga 5700

cttgcctctg gtctgtccag acgggggcac gatagaatgc gggcacgcag ggagctcgta 5760

gacattgggc ttaatatata agacatgcta tgttgtatct acattagcaa atgacaaaca 5820

aatgaaaaag aacttatcaa gcactgtacc aaggaagctc attatgcgtc tggcgggttc 5880

aaatgatccg tgataggtta tgccagctga ttgtttgccc gcagggttga caccaccagg 5940

ggataatggc cactttcatc tgaatcaagg acaggagccg ttgatgattt ccagatatcc 6000

tatcttcatc gctagtacta ataacaagta agcaaacagc cagctacact cgtacacacc 6060

gctcatgaaa acataagaca aagctcaagc catgtcaagc accaccccat cgtacacgta 6120

catcttctac caatctgtcc agatgccggc tacgtcagcc tccaacccat tgcgatacaa 6180

tgacggatgt cgtatagacc atggcaacgg ctcccaccac cattgtcact gcgtccaaag 6240

tcttctccca cctgctctca gcaacgccct tgaaatgcag gaatgccgga tacatgtaga 6300

ccaatggaat gcaggcaaaa gatcccgtca gcgccacaaa cttatccaaa tcacttgctc 6360

ccaccaccgc cacgccgatg catgccacaa tggtcaaggc ccggaggccg ttcttcttcc 6420

acttgattgc cgcgctcttc ttaccagtgg cgcgctctcc gaaaatcgac gtttccaata 6480

tgcggatcgc ggggaacagc tgcacaggct cgccagccag tacggccagc gagtagagaa 6540

actggactgc attgacgaga ggcgagtcct ggggaaaatt ggagatgatt tgaatcttgg 6600

tgtgttctcc aaacgtcgca tagcagagtg cgccgactga tgtgaagatg gccgtgatga 6660

ggagcatgac aaagtagagg aggcccttga actgatgagg cttcttcata ctcgactgaa 6720

ttggcaggat caggccaatg ccctcaaacg tgaagatcgc cgagcccaga gtcaggggaa 6780

aatcggaggg gttgaagagc ttgacgcttg gctccatacc gtgcctgacc aggcttcgaa 6840

tgtcaaaata ccagatgtat acgagtccaa tgaggatgaa ggcgtccgcc accagagcaa 6900

ccggacccag cttggagatg ttgcggatcc aggctaacgg aatgagaggg aggaactgaa 6960

gcgcaattag gcccgacacc ccgaaatgac caaggccgtt ggtgacagcg tcgaggaagg 7020

agaagagatt ttcagcagtg aagataatgc ctgagcagac gaagcccagc tgagaaatgg 7080

caatggaggc gagaatcaag cttcgaaacc ggggtccaac gacggcctcg ccaatgtctc 7140

catagccacc gccgtacttg tggcggcagt ccagaagtaa gcggaaacag aagcaattga 7200

ccagtgacac ggagatgagc gccagtgacg aaaagaggat acctccgttg cggaatgcct 7260

tgggcaggaa caggatgcct gtgccgatga aagccttgag cagcgtgaag aaggtcttga 7320

ctgtgctcgc gtcaccctgg cgggcaagac ggcgggagct tctccggcga ccagcactcg 7380

gtcgcgcaag cagcggacgc cgctcgtcgg cggcttcttc cgcctcatca gtgacggcag 7440

actcgtcatc gtcgtccgta tcgtcgagcg cttcgccggc aaagctgccg tagaggtcga 7500

ggaagtcaac gaacgtgttg gccacaatca tgccgctctg ctgcctctgt cgagctcggc 7560

gccgcaggtg ctgccgcctg aaaccgccgg gctcgagctg ctcagagtgc gtcaactcgc 7620

ctgccggact caggtcgggc gagcggcggt gagagaaggt attggatcgg cggagagagc 7680

ctgctttggc cttgatcttg aacatgtcgc gatggatgtc gccgccctga agcttgagcg 7740

aggactcgag cgtctcgagg gtcggcgttg cgtcgctgct cccggatgaa aatcacgcag 7800

aagtcaatgc tcgtttctct tccttttcac atctcgttca tcgttcttcc accaagaaag 7860

aatcaaaagt caactcaatc gtaacctatc caaataaaag tattcagtga tatacttcac 7920

accccccctc caaccccctc tctccaatgc atctgacaca cagcaggcca aataccacac 7980

tctcctgtcg gtacaacctc gtccgtcgcc atggcgtcct tcggtggtag tccgtccaga 8040

cggggttctg tctcctcatc ttcaggtaat gtgtatcgca tgagccggtg cccgggctcc 8100

cccccttcct ttgtcccact tcactcctct cccctcgtct ctgcttgtct ttacaaacaa 8160

ggaaacacac ccaatgagat ggagaaggtg attgtgtgga aataaggaag agggggctcg 8220

ggctgagaat atacggccct gaacttgatc tggataatac cagcgaaagg atcatgctcc 8280

tccccctttt caatcccgtc caaatcccta acccttgtag gcccatcatg agccctcccg 8340

ccgccgtctc gcccccccag cgaaccgctg aactcgtcac cccgtccaag atggccgttg 8400

cccagccgca gcagcacctc gaggcccagg ccaagtccgt ctcggacatg ttcggccagt 8460

gggactcgtt caccttctcg cccatccgcg agtcccaggt gtcgcgcgcc atgacccgcc 8520gggactcgtt caccttctcg cccatccgcg agtcccaggt gtcgcgcgcc atgacccgcc 8520

gctacttcga ggacctcgac cgctacgccg agtccgacat tgtcatcatc ggcgccggct 8580gctacttcga ggacctcgac cgctacgccg agtccgacat tgtcatcatc ggcgccggct 8580

cctgcggcct cagcaccgcc tacgtcctcg gcacccagcg cccggacctc aagattgcca 8640cctgcggcct cagcaccgcc tacgtcctcg gcacccagcg cccggacctc aagattgcca 8640

tcatcgaggc ctccgtctcc cccggcggcg gtgcctggct gggcggccag ctcttctccg 8700tcatcgaggc ctccgtctcc cccggcggcg gtgcctggct gggcggccag ctcttctccg 8700

ccatggtcat gcgcaagcct gccgatgcct tcctccgcga gattggcgtc ccgtacgagg 8760ccatggtcat gcgcaagcct gccgatgcct tcctccgcga gattggcgtc ccgtacgagg 8760

acgagggcaa ctacgtcgtc gtcaagcacg ccgccctctt cacctccacc atcatggcca 8820acgagggcaa ctacgtcgtc gtcaagcacg ccgccctctt cacctccacc atcatggcca 8820

aggtgctgca gctacccaac gtcaagctct tcaacgccac ctgcgtcgaa gacctcatca 8880aggtgctgca gctacccaac gtcaagctct tcaacgccac ctgcgtcgaa gacctcatca 8880

cccgcccctc tgccgagggc gtgcgcatcg ccggtgtcgt caccaactgg accctcgtct 8940cccgcccctc tgccgagggc gtgcgcatcg ccggtgtcgt caccaactgg accctcgtct 8940

ccatgcacca cgacgaccag tcctgcatgg accccaacac catcaacgcg cccctcgtca 9000ccatgcacca cgacgaccag tcctgcatgg accccaacac catcaacgcg cccctcgtca 9000

tctccaccac cggccacgac ggccccatgg gtgccttctg cgtcaagcgc cttgtcagca 9060tctccaccac cggccacgac ggccccatgg gtgccttctg cgtcaagcgc cttgtcagca 9060

tgggccgcat cgagaagctt ggcggcatgc gcggcctcga catgagcagg gctgaggatg 9120tgggccgcat cgagaagctt ggcggcatgc gcggcctcga catgagcagg gctgaggatg 9120

ccattgtcaa gaacacccgt gaggttgttc ccggcctgat tgtcggagga atggagctgt 9180ccattgtcaa gaacacccgt gaggttgttc ccggcctgat tgtcggagga atggagctgt 9180

ctgagattga cggagccaac cgcatgggta tgttccctct ttgaatctcc ctctcgcaag 9240ctgagattga cggagccaac cgcatgggta tgttccctct ttgaatctcc ctctcgcaag 9240

aacatgtatt aacctgtgtg tgtgtacagg tcctaccttt ggtgccatgg ccctcagcgg 9300aacatgtatt aacctgtgtg tgtgtacagg tcctaccttt ggtgccatgg ccctcagcgg 9300

tgtcaaggct gccgaggagg ctctcaaggt ctttgaggcc cgccgcaagg agaatgctca 9360tgtcaaggct gccgaggagg ctctcaaggt ctttgaggcc cgccgcaagg agaatgctca 9360

gtaaatgagc tctgtgacct tgttccgtca ttcatttgat agtgtacttg catgactggt 9420gtaaatgagc tctgtgacct tgttccgtca ttcatttgat agtgtacttg catgactggt 9420

cactcatttg atgtaaagca gtaaaagtac aaaagaccaa aaaaaagcat gatcacgctg 9480cactcatttg atgtaaagca gtaaaagtac aaaagaccaa aaaaaagcat gatcacgctg 9480

cgaaccagat tgaataaact gcagcatatt actcaagata atcgttctca ttccatccgt 9540cgaaccagat tgaataaact gcagcatatt actcaagata atcgttctca ttccatccgt 9540

gatcttctga tttggttcct gtaccctggt caggtaggta taggaaacgt attccacagt 9600gatcttctga tttggttcct gtaccctggt caggtaggta taggaaacgt attccacagt 9600

aaaacaagca cagcacttgc ctctgttgga tcggcacctc agaagagatt gccatgaaga 9660aaaacaagca cagcacttgc ctctgttgga tcggcacctc agaagagatt gccatgaaga 9660

cgcactagaa gaaacaagtc gacgtcgtgg caagctcatg cttgagcacg 9710cgcactagaa gaaacaagtc gacgtcgtgg caagctcatg cttgagcacg 9710
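
The sequence listing is laid out as groups of ten bases, with each line ending in a running count of the bases listed so far. A reader who wants to reassemble the sequence and confirm that no bases were lost in transcription could use something like the following minimal Python sketch. The function names `parse_listing_line` and `assemble` are illustrative assumptions and are not part of the patent; the two input lines are the final two lines of the listing (positions 9601-9710).

```python
import re


def parse_listing_line(line: str) -> tuple[str, int]:
    """Split one listing line into (bases, running_count).

    Assumes the layout used in this listing: whitespace-separated
    groups of bases followed by the cumulative position of the last base.
    """
    *groups, count = line.split()
    bases = "".join(groups)
    if not re.fullmatch(r"[acgtn]+", bases):
        raise ValueError(f"unexpected characters in line: {line!r}")
    return bases, int(count)


def assemble(lines: list[str], start: int = 0) -> str:
    """Concatenate listing lines, verifying each running count.

    `start` is the number of bases that precede the first line supplied.
    """
    parts = []
    total = start
    for line in lines:
        if not line.strip():
            continue  # skip blank separator lines
        bases, count = parse_listing_line(line)
        total += len(bases)
        if total != count:
            raise ValueError(f"count mismatch: expected {count}, got {total}")
        parts.append(bases)
    return "".join(parts)


# Final two lines of the listing (positions 9601-9710).
tail = [
    "aaaacaagca cagcacttgc ctctgttgga tcggcacctc agaagagatt gccatgaaga 9660",
    "cgcactagaa gaaacaagtc gacgtcgtgg caagctcatg cttgagcacg 9710",
]
fragment = assemble(tail, start=9600)
print(len(fragment))  # 110
```

A count mismatch raised by `assemble` would point to the first line where a base was dropped or duplicated.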

Claims

1. A method for expressing a recombinant protein using Trichoderma reesei as a host, comprising the following step: cloning an expression cassette containing the gene encoding the target protein to be expressed into the non-coding region between the gene with Protein ID 68606 and the gene with Protein ID 68608 in the Trichoderma reesei genome;

wherein said cloning of the expression cassette into the non-coding region between the gene with Protein ID 68606 and the gene with Protein ID 68608 in the Trichoderma reesei genome is carried out by replacing positions 429-551 of SEQ ID No.1 in the Trichoderma reesei genome with the expression cassette containing the gene encoding the target protein to be expressed.

2. The method of claim 1, wherein: the expression cassette containing the gene encoding the target protein to be expressed comprises, in order from the 5' end to the 3' end, a promoter, the gene encoding the target protein to be expressed, whose expression is driven by the promoter, and a terminator.

3. The method of claim 2, wherein: the promoter is derived from the cellobiohydrolase I gene of Trichoderma reesei; the terminator is derived from the cellobiohydrolase I gene of Trichoderma reesei or from the cellobiohydrolase II gene of Trichoderma reesei.

4. The method of claim 3, wherein: the sequence of the promoter derived from the cellobiohydrolase I gene of Trichoderma reesei is positions 1-2078 of SEQ ID No.2 or positions 1939-4016 of SEQ ID No.3; the sequence of the terminator derived from the cellobiohydrolase I gene of Trichoderma reesei is positions 3685-5927 of SEQ ID No.2; the sequence of the terminator derived from the cellobiohydrolase II gene of Trichoderma reesei is positions 5693-7784 of SEQ ID No.3.

5. The method according to any one of claims 1-4, wherein: the target protein to be expressed is a protein derived from a eukaryote or a prokaryote.

6. The method of claim 1, wherein: the cloning of the expression cassette containing the gene encoding the target protein to be expressed into the non-coding region between the gene with Protein ID 68606 and the gene with Protein ID 68608 in the Trichoderma reesei genome is achieved by introducing a recombinant expression vector into a Trichoderma reesei host strain; the recombinant expression vector contains the expression cassette and homology arms located upstream and downstream of the expression cassette for site-specific insertion or replacement.

7. The method of claim 6, wherein: the homology arms are the two DNA segments shown at positions 1-1938 and positions 7785-9710 of SEQ ID No.3.
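
The coordinates recited in claims 1, 4 and 7 can be checked with a short Python sketch. It treats every position range as 1-based and inclusive, which is an assumption about the numbering convention, and it reads claim 1 as replacing positions 429-551 of SEQ ID No.1 with the expression cassette. The helper names `segment` and `replace_region` and the placeholder strings are hypothetical and serve only as illustration; they are not the actual sequences.

```python
def segment(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} outside sequence of length {len(seq)}")
    return seq[start - 1:end]


def replace_region(locus: str, start: int, end: int, cassette: str) -> str:
    """Replace positions start..end of locus (1-based, inclusive) with cassette."""
    segment(locus, start, end)  # validates the range
    return locus[:start - 1] + cassette + locus[end:]


# Coordinates on SEQ ID No.3 as recited in claims 4 and 7.
FEATURES = {
    "upstream_homology_arm": (1, 1938),
    "cbh1_promoter": (1939, 4016),
    "cbh2_terminator": (5693, 7784),
    "downstream_homology_arm": (7785, 9710),
}

# Length of each feature in base pairs.
lengths = {name: end - start + 1 for name, (start, end) in FEATURES.items()}
for name, size in lengths.items():
    print(f"{name}: {size} bp")

# Toy illustration of the replacement in claim 1, using placeholder
# strings (NOT the real SEQ ID No.1 or a real expression cassette).
locus = "A" * 428 + "C" * 123 + "G" * 100   # positions 429-551 are the "C" run
cassette = "TTTT"                            # hypothetical stand-in cassette
edited = replace_region(locus, 429, 551, cassette)
```

Under this convention the promoter on SEQ ID No.3 (positions 1939-4016) is 2078 bp long, the same length as positions 1-2078 of SEQ ID No.2 recited in claim 4, and the region replaced in claim 1 is 123 bp long.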