Detailed Description
The following definitions and methods are provided to better define the present application and to guide those of ordinary skill in the art in the practice of the present application. Unless otherwise indicated, terms are to be construed according to conventional usage by those of ordinary skill in the relevant art. All patent documents, academic papers, industry standards, and other publications cited herein are incorporated by reference in their entirety.
As used herein, a "plant" is any plant, including whole plants, plant cells, plant organs, plant protoplasts, plant cell tissue cultures from which plants can be regenerated, plant calli, and plant cells that are intact in plants or plant parts, such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruits, stems, roots, root tips, anthers, and the like. Unless otherwise indicated, nucleic acids are written in the 5' to 3' direction from left to right, and amino acid sequences are written in the amino to carboxyl direction from left to right. Amino acids may be represented herein by their commonly known three-letter symbols or by the single-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Likewise, nucleotides may be referred to by their commonly accepted single-letter codes. Numerical ranges include the numbers defining the range. As used herein, "nucleic acid" includes reference to deoxyribonucleotide or ribonucleotide polymers in either single- or double-stranded form and, unless otherwise limited, includes known analogs (e.g., peptide nucleic acids) that have the basic properties of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides. As used herein, the term "encode" or "encoded", when used in the context of a particular nucleic acid, means that the nucleic acid contains the information necessary to direct translation of the nucleotide sequence into a particular protein. The information encoding the protein is specified by codons. As used herein, a "full-length sequence" of a particular polynucleotide, or of the protein encoded thereby, refers to the entire nucleic acid sequence or the entire amino acid sequence of a natural (non-synthetic) endogenous sequence. A full-length polynucleotide encodes the full-length, catalytically active form of the particular protein.
The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The terms "residue" and "amino acid" are used interchangeably herein to refer to an amino acid that is incorporated into a protein, polypeptide, or peptide (collectively, "protein"). The amino acid may be a naturally occurring amino acid and, unless otherwise limited, may include known analogs of natural amino acids that function in a manner similar to naturally occurring amino acids.
The term "trait" refers to a physiological, morphological, biochemical or physical characteristic of a plant or of a particular plant material or cell. In some cases, the characteristic is visible to the human eye, such as seed or plant size; in others it can be measured by biochemical techniques, such as detecting the protein, starch or oil content of the seed or leaf, or by observing metabolic or physiological processes, for example by measuring tolerance to water deprivation or to specific salt, sugar or nitrogen concentrations, or by observing the expression level of one or more genes, or by agronomic observations such as osmotic stress tolerance or yield.
"Transgenic" refers to any cell, cell line, callus, tissue, plant part or plant whose genome has been altered by the presence of a heterologous nucleic acid, such as a recombinant DNA construct. The term "transgenic" as used herein includes the initial transgenic events as well as those produced from the initial transgenic events by sexual crosses or asexual propagation, and does not encompass alterations of the genome (chromosomal or extrachromosomal) by conventional plant breeding methods or by naturally occurring events such as random fertilization, non-recombinant viral infection, non-recombinant bacterial transformation, non-recombinant transposition, or spontaneous mutation.
In the present application, the terms "comprises," "comprising," or variations thereof, are to be understood to encompass other elements, numbers, or steps in addition to those described. A "subject plant" or "subject plant cell" refers to a plant or plant cell in which genetic engineering has been effected, or to a progeny plant or cell of a plant or cell so engineered, which progeny comprises the engineered change. A "control" or "control plant" provides a reference point for measuring phenotypic changes in a subject plant.
The negative or control plant may include, for example, (a) a wild-type plant or cell, i.e., a plant or cell having the same genotype as the starting material that was genetically engineered to produce the test plant or cell, (b) a plant or plant cell having the same genotype as the starting material but that has been transformed with an empty construct (i.e., a construct that has no known effect on the trait of interest, such as a construct comprising a marker gene), (c) a plant or plant cell that is a non-transformed segregant of the test plant or plant cell, (d) a plant or plant cell that is genetically identical to the test plant or plant cell but that has not been exposed to conditions or stimuli that would induce expression of the gene of interest, or (e) the test plant or plant cell itself under conditions where the gene of interest is not expressed.
Those skilled in the art will readily recognize that advances in molecular biology, such as site-specific and random mutagenesis, polymerase chain reaction methods, and protein engineering techniques, provide a wide range of suitable tools and procedures for altering or engineering the amino acid sequences, and the underlying genetic sequences, of proteins of agricultural interest.
In some embodiments, the nucleotide sequences of the present application may be altered to make conservative amino acid substitutions. The principles and examples of conservative amino acid substitutions are described further below. In certain embodiments, the nucleotide sequence of the present application may be subjected to substitutions in accordance with the disclosed monocot codon preferences that do not alter the amino acid sequence; e.g., codons may be replaced with monocot-preferred codons encoding the same amino acids, without altering the amino acid sequence encoded by the nucleotide sequence. In some embodiments, a portion of the nucleotide sequence of the present application is replaced with different codons encoding the same amino acids, such that the nucleotide sequence is changed while the amino acid sequence encoded thereby is not. Conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of one of the proteins of an embodiment. In some embodiments, a portion of the nucleotide sequences of the present application is substituted according to monocot-preferred codons. Those skilled in the art will recognize that amino acid additions and/or substitutions are generally based on the relative similarity of the amino acid side-chain substituents, e.g., their hydrophobicity, charge, size, etc. Exemplary groups of amino acids sharing such properties are well known to those skilled in the art and include arginine and lysine; glutamic acid and aspartic acid; serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine. Guidelines for suitable amino acid substitutions that do not affect the biological activity of the protein of interest can be found in the model of Dayhoff et al. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Foundation, Washington, D.C.), incorporated herein by reference. Conservative substitutions, such as substitution of one amino acid for another with similar properties, may be made.
Sequence identity may be identified by techniques that include hybridization. For example, all or part of a known nucleotide sequence is used as a probe that selectively hybridizes to other corresponding nucleotide sequences present in a population of cloned genomic DNA fragments or cDNA fragments (i.e., a genomic library or cDNA library) from a selected organism. The hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as 32P or another detectable marker. Thus, for example, hybridization probes can be prepared by labeling synthetic oligonucleotides based on the sequences of the embodiments. Methods for preparing hybridization probes and for constructing cDNA and genomic libraries are generally known in the art. Hybridization of the sequences may be performed under stringent conditions. As used herein, the term "stringent conditions" or "stringent hybridization conditions" refers to conditions under which a probe will hybridize to its target sequence to a detectably greater extent than to other sequences (e.g., at least 2-fold, 5-fold, or 10-fold over background). Stringent conditions are sequence-dependent and will differ in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences that are 100% complementary to the probe can be identified (homologous probe method). Alternatively, stringency conditions can be adjusted to allow some sequence mismatches so that lower degrees of similarity are detected (heterologous probe method). Typically, the probe is less than about 1000 or 500 nucleotides in length.
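The synonymous codon substitutions described earlier in this section (changing the nucleotide sequence without changing the encoded amino acid sequence) can be checked computationally. The following is a minimal sketch, assuming the standard genetic code; the function names and the short test sequences are hypothetical illustrations and are not sequences of the present application. No monocot codon-preference table is included, because the text does not list one.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (DNA or RNA letters) into one-letter amino acids."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_synonymous_variant(original: str, variant: str) -> bool:
    """True if the nucleotide sequences differ but encode the same amino acids."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)


# Hypothetical 3-codon example: GCT->GCC (Ala) and AAA->AAG (Lys) are synonymous.
print(translate("ATGGCTAAA"))                           # MAK
print(is_synonymous_variant("ATGGCTAAA", "ATGGCCAAG"))  # True
```

A codon-optimization step would additionally need a species-specific preference table to choose among synonymous codons; this sketch only verifies that a proposed substitution leaves the protein unchanged.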
Typically, stringent conditions are those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 M to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is at least about 30 °C for short probes (e.g., 10 to 50 nucleotides) and at least about 60 °C for long probes (e.g., greater than 50 nucleotides). Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization at 37 °C in a buffer of 30% to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate), and washing in 1× to 2× SSC (20× SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50 °C to 55 °C. Exemplary moderately stringent conditions include hybridization in 40% to 45% formamide, 1.0 M NaCl, 1% SDS at 37 °C and washing in 0.5× to 1× SSC at 55 °C to 60 °C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37 °C, with a final wash in 0.1× SSC at 60 °C to 65 °C for at least about 20 minutes. Optionally, the wash buffer may comprise about 0.1% to about 1% SDS. The duration of hybridization is generally less than about 24 hours, usually from about 4 hours to about 12 hours. Specificity generally depends on the post-hybridization washes, the key factors being the ionic strength and temperature of the final wash solution. The Tm (thermal melting point) of a DNA-DNA hybrid can be approximated from the equation of Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284: Tm = 81.5 °C + 16.6 (log M) + 0.41 (% GC) - 0.61 (% formamide) - 500/L; where M is the molar concentration of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % formamide is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the temperature (at a defined ionic strength and pH) at which 50% of the complementary target sequence hybridizes to a perfectly matched probe.
Washing is typically performed at least until equilibrium is reached and a low background level of hybridization is achieved, such as for 2 hours, 1 hour, or 30 minutes. Each 1% of mismatch lowers the Tm by about 1 °C; thus, the Tm, hybridization and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with ≥90% identity are sought, the Tm can be decreased by 10 °C. Typically, stringent conditions are selected to be about 5 °C lower than the Tm for the specific sequence and its complement at a defined ionic strength and pH. However, under very stringent conditions hybridization and/or washing may be performed at 4 °C below the Tm, under moderately stringent conditions hybridization and/or washing may be performed at 6 °C below the Tm, and under low stringency conditions hybridization and/or washing may be performed at 11 °C below the Tm.
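The Tm approximation and the rule of about 1 °C per 1% mismatch described above can be worked through numerically. The following is a minimal sketch, assuming that "log" in the formula denotes the base-10 logarithm; the function names and the example input values are hypothetical and chosen only for illustration.

```python
import math


def tm_dna_dna(monovalent_molar: float, percent_gc: float,
               percent_formamide: float, hybrid_length_bp: int) -> float:
    """Approximate Tm (deg C) of a DNA-DNA hybrid:
    Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / hybrid_length_bp)


def tm_for_identity(tm_perfect_match: float, percent_identity: float) -> float:
    """Lower the Tm by about 1 deg C for each 1% of mismatch."""
    return tm_perfect_match - (100.0 - percent_identity)


# Hypothetical inputs: 1 M monovalent cation, 50% GC, 50% formamide, 500 bp hybrid.
tm = tm_dna_dna(1.0, 50.0, 50.0, 500)
print(round(tm, 1))                         # 70.5
print(round(tm_for_identity(tm, 90.0), 1))  # 60.5, i.e. 10 deg C lower for 90% identity
```

Because log10(1.0) is zero, the salt term drops out in this example; at lower salt concentrations the term becomes negative and the estimated Tm falls accordingly.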
In some embodiments, fragments of the nucleotide sequence and the amino acid sequence encoded thereby are also included. As used herein, the term "fragment" refers to a portion of the nucleotide sequence of a polynucleotide or a portion of the amino acid sequence of a polypeptide of an embodiment. Fragments of a nucleotide sequence may encode protein fragments that retain the biological activity of the native or corresponding full-length protein and thus have protein activity. Mutant proteins include biologically active fragments of a native protein that comprise consecutive amino acid residues that retain the biological activity of the native protein. Some embodiments also include a transformed plant cell or transgenic plant comprising the nucleotide sequence of at least one embodiment. In some embodiments, the plant is transformed with an expression vector comprising the nucleotide sequence of at least one embodiment and operably linked thereto a promoter that drives expression in a plant cell. Transformed plant cells and transgenic plants refer to plant cells or plants comprising a heterologous polynucleotide within the genome. In general, the heterologous polynucleotide is stably integrated within the genome of the transformed plant cell or transgenic plant, such that the polynucleotide is delivered to the offspring. The heterologous polynucleotide may be integrated into the genome, either alone or as part of an expression vector. In some embodiments, the plants to which the present application relates include plant cells, plant protoplasts, plant cell tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells, which are whole plants or parts of plants, such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruits, nuts, ears, cobs, hulls, stalks, roots, root tips, anthers, and the like. 
The application also includes plant cells, protoplasts, tissues, calli, embryos and flowers, stems, fruits, leaves and roots derived from the transgenic plants of the application or progeny thereof, and thus comprising at least in part the nucleotide sequences of the application.
The following examples are illustrative of the application and are not intended to limit the scope of the application. Modifications and substitutions to the methods, procedures, or conditions of the present application that do not depart from the spirit and nature of the application are intended to be within the scope of the present application. Unless otherwise indicated, the examples follow conventional experimental conditions, such as those described in Sambrook J & Russell DW, Molecular Cloning: A Laboratory Manual (2001), or the conditions recommended in the manufacturer's instructions. Unless otherwise indicated, all chemical reagents used in the examples were conventional commercial reagents, and the technical means used in the examples were conventional means well known to those skilled in the art.
Example 1: Editing results when Cas9 polypeptide expression is driven by a commonly used constitutive promoter in rice
In this example, the SE5 gene (Andrés F, Galbraith DW, Talón M, Domingo C (2009) Analysis of PHOTOPERIOD SENSITIVITY5 sheds light on the role of phytochromes in photoperiodic flowering in rice. Plant Physiol 151:681-690), whose homozygous mutation causes premature senescence in rice, was used as the target test gene. A gRNA specifically targeting SE5 was designed with a web-based design tool (http://crispr.hzau.edu.cn/CRISPR/). The target gene was edited using the CRISPR/Cas9 system.
The gRNA molecule targets the following segment of the SE5 gene: AGACGAGCTTGCTGTCGACG
The gene editing vector TKC (He Y, Zhu M, Wang L, Wu J, Wang Q, Wang R, Zhao Y. Programmed Self-Elimination of the CRISPR/Cas9 Construct Greatly Accelerates the Isolation of Edited and Transgene-Free Rice Plants. Mol Plant, 2018, 11(9):1210-1213), previously created by the inventors, was used as the backbone vector (FIG. 1). The procedure for inserting the gRNA expression cassette is described in detail in the same reference (He et al., 2018).
The sequence-verified positive plasmid was transformed into Agrobacterium (EHA105), which was then used to infect rice callus. The transformed variety was rice "Zhonghua 11" (also called ZH11, from the institute of crop science of the national academy of agricultural sciences). Genetic transformation of rice is now a routine procedure in the field of rice transgenesis; the detailed transformation steps and the medium formulations used are described in the literature: Hiei Y, Ohta S, Komari T, Kumashiro T. Efficient transformation of rice (Oryza sativa L.) mediated by Agrobacterium and sequence analysis of the boundaries of the T-DNA. Plant J, 1994, 6(2):271-282. Plants of the SE5-CK editing event were thereby obtained.
For the SE5-CK editing event, the identification primers SE5-F (GGAGGAGATGAGGGCGGTGGCCATGCGGCT) and SE5-R (TTTCCGATCACTCACACCAGGGGACGGCGGCGCGATCG) were designed upstream and downstream of the SE5 target site. Plants from the SE5-CK editing event were PCR amplified using SE5-F and SE5-R. When the target site is edited and mutated, the Sal I cleavage site (recognition sequence GTCGAC) at the SE5 target site is destroyed, and the PCR product of the mutant DNA cannot be cleaved by Sal I; when the target site is not edited, wild-type DNA is amplified, and the PCR product can be cleaved by Sal I. Therefore, the mutation type at the SE5 target site in plants of the SE5-CK editing event was identified by PCR amplification with SE5-F and SE5-R followed by Sal I digestion of the PCR product. The resulting gel image is shown in FIG. 3.
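The logic of this PCR and restriction-digest assay can be sketched as follows. This is a minimal illustration, not the inventors' analysis procedure: the function names are hypothetical, the mapping from band pattern to genotype is an assumed reading of the assay (a mix of cut and uncut product is read as heterozygous, although a chimeric plant could give the same pattern), and the mutant sequence is a made-up 1-bp deletion used only to show that an edit can remove the site.

```python
SAL_I_SITE = "GTCGAC"  # Sal I recognition sequence given in the text


def has_site(sequence: str, site: str = SAL_I_SITE) -> bool:
    """True if the restriction site is present, i.e. the product can be cleaved."""
    return site in sequence.upper()


def call_genotype(uncut_band: bool, cut_bands: bool) -> str:
    """Interpret a digest lane (assumed mapping, see lead-in)."""
    if uncut_band and cut_bands:
        return "heterozygous"          # one allele edited, one wild type
    if uncut_band:
        return "all alleles mutated"   # no allele retains the site
    if cut_bands:
        return "wild type"             # every allele retains the site
    return "no product"


WILD_TYPE_TARGET = "AGACGAGCTTGCTGTCGACG"    # SE5 target segment from the text
HYPOTHETICAL_MUTANT = "AGACGAGCTTGCTGTCACG"  # illustrative 1-bp deletion (not real data)

print(has_site(WILD_TYPE_TARGET))     # True
print(has_site(HYPOTHETICAL_MUTANT))  # False
print(call_genotype(uncut_band=True, cut_bands=True))  # heterozygous
```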
In the SE5-CK editing event, a total of 50 mutant plants were obtained, of which 6 were heterozygous, giving a heterozygous mutation proportion of 12%. When progeny of the mutant T0 plants were grown, plants carrying the SE5 mutation were detected among the progeny of every T0 mutant plant, showing that the mutations are stably inherited.
Example 2: Editing results when a spatiotemporally specific promoter drives Cas9 polypeptide expression
In the above example, excess Cas9/gRNA complex continues to modify the target gene in the plant cell until all alleles of the target gene in the cell are mutated, resulting in homozygous mutation and causing abnormal plant development or even death. If heterozygous mutant plants can be obtained, the plants are expected to grow normally, because a functionally intact copy of the gene remains at the other allele. This requires incomplete editing of the target site. In general, incomplete editing can be achieved by reducing the expression level of the Cas9/gRNA complex, but reducing the expression of the complex to a suitable level is itself relatively difficult. This example uses a spatiotemporally specific promoter to drive Cas9 polypeptide expression, with the aim of restricting Cas9 expression to the stage at which callus differentiates into multicellular embryos and thereafter, and thereby achieving the technical effect of incomplete editing. Driving Cas9 polypeptide expression with a spatiotemporally specific promoter is an incomplete-editing strategy, but such incomplete editing is likely to produce mutations that cannot be inherited by the progeny; it is therefore not apparent to those skilled in the art how to design the system so that it creates heterozygous mutations that are heritable.
In this example, in addition to the SE5 gene, whose homozygous mutation causes premature senescence in rice, the YSA gene (Su N, Hu ML, Wu DX, Wu FQ, Fei GL, Lan Y, Chen XL, Shu XL, Zhang X, Guo XP, Cheng ZJ, Lei CL, Qi CK, Jiang L, Wang H, Wan JM (2012) Disruption of a rice pentatricopeptide repeat protein causes a seedling-specific albino phenotype and its utilization to enhance seed purity in hybrid rice production. Plant Physiol 159:227-238), whose homozygous mutation causes an albino leaf phenotype at the seedling stage, was selected as a target test gene. The gRNAs specifically targeting YSA and SE5 were designed with a web-based design tool (http://crispr.hzau.edu.cn/CRISPR/). The target genes were edited using the CRISPR/Cas9 system.
The segment of the SE5 gene targeted by the gRNA molecule is the same as in Example 1.
The gRNA molecule targets the following segment of the YSA gene: GAGTAGGGCCGCTTCGGCCG
By comprehensively comparing and analyzing previously published microarray data covering the entire life cycle of rice (Wang L, Xie W, Chen Y, Tang W, Yang J, Ye R, Liu L, Lin Y, Xu C, Xiao J, Zhang Q (2010) A dynamic gene expression atlas covering the entire life cycle of rice. Plant J 61:752-766), we identified three genes, C5-1, C5-2 and C5-3, that are continuously and highly expressed during early callus differentiation and the subsequent stages.
The gene editing vector TKC previously created by the inventors was used as the backbone vector. The promoter of Cas9 was replaced with the promoter of the C5-1, C5-2 or C5-3 gene (the promoter sequences of the C5-1, C5-2 and C5-3 genes are SEQ ID NO.1, SEQ ID NO.2 and SEQ ID NO.3, respectively) to obtain the three vectors TKC-C51, TKC-C52 and TKC-C53, respectively (FIG. 2). The gRNA expression cassette was then ligated into each vector; for the specific procedure, see He Y, Zhu M, Wang L, Wu J, Wang Q, Wang R, Zhao Y. Programmed Self-Elimination of the CRISPR/Cas9 Construct Greatly Accelerates the Isolation of Edited and Transgene-Free Rice Plants. Mol Plant, 2018, 11(9):1210-1213.
The sequence-verified positive plasmids were transformed into Agrobacterium (EHA105), which was then used to infect rice callus. The transformed variety was rice "Zhonghua 11" (also called ZH11, from the institute of crop science of the national academy of agricultural sciences). Genetic transformation of rice is now a routine procedure in the field of rice transgenesis; the detailed transformation steps and the medium formulations used are described in the literature: Hiei Y, Ohta S, Komari T, Kumashiro T. Efficient transformation of rice (Oryza sativa L.) mediated by Agrobacterium and sequence analysis of the boundaries of the T-DNA. Plant J, 1994, 6(2):271-282. Plants of a total of four editing events, TKC-C51-YSA, TKC-C52-YSA, TKC-C53-YSA and TKC-C53-SE5, were thereby obtained.
For the TKC-C53-SE5 editing event, the mutation detection method was identical to that of Example 1.
For the TKC-C51-YSA, TKC-C52-YSA and TKC-C53-YSA editing events, the identification primers YSA-F (GTCCTGCGGCCACTTCCTCCCTC) and YSA-R (GCTGCTGCCCCTCGTAGCTGTCC) were designed upstream and downstream of the YSA target site. When the target gene is edited and mutated, the Eag I cleavage site (recognition sequence CGGCCG) at the YSA target site is destroyed, and the PCR product of the mutant DNA cannot be cleaved by Eag I; when the target site is not edited, wild-type DNA is amplified, and the PCR product is cleaved by Eag I. Therefore, the mutation type at the YSA target site in plants of the YSA editing events was identified by PCR amplification with YSA-F and YSA-R followed by Eag I digestion of the PCR product. The resulting gel image is shown in FIG. 4.
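The reason an edit is expected to destroy the restriction site can be illustrated by checking whether the site overlaps the expected Cas9 cut position in each target segment. The following is a minimal sketch under two assumptions that the text does not state: that each 20-nt target segment is written 5' to 3' with the PAM immediately following its 3' end, and that Cas9 cuts 3 bp upstream of the PAM. The function name is hypothetical.

```python
def site_spans_cut(protospacer: str, site: str, cut_offset_from_3prime: int = 3) -> bool:
    """True if a restriction site in the protospacer straddles the assumed cut position.

    The cut is assumed to fall between index (cut - 1) and index cut, where
    cut = len(protospacer) - cut_offset_from_3prime.
    """
    seq = protospacer.upper()
    cut = len(seq) - cut_offset_from_3prime
    start = seq.find(site)
    while start != -1:
        # The site occupies indices start .. start + len(site) - 1.
        if start < cut < start + len(site):
            return True
        start = seq.find(site, start + 1)
    return False


SE5_TARGET = "AGACGAGCTTGCTGTCGACG"  # target segment from Example 1
YSA_TARGET = "GAGTAGGGCCGCTTCGGCCG"  # target segment from Example 2

print(site_spans_cut(SE5_TARGET, "GTCGAC"))  # True: Sal I site straddles the assumed cut
print(site_spans_cut(YSA_TARGET, "CGGCCG"))  # True: Eag I site straddles the assumed cut
```

Under these assumptions, small insertions or deletions introduced at the cut position fall inside the recognition sequence, which is consistent with the edited PCR product becoming resistant to digestion.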
In the TKC-C51-YSA editing event, a total of 3 mutant plants were obtained, all of which were heterozygous, giving a heterozygous mutation proportion of 100%. When progeny of the heterozygous T0 plants were grown, 2 heterozygous YSA mutant plants were detected among 17 T1 plants, showing that the mutations generated by TKC-C51-YSA can be stably inherited by the progeny.
In the TKC-C52-YSA editing event, a total of 10 mutant plants were obtained, all of which were heterozygous, giving a heterozygous mutation proportion of 100%. When progeny of the heterozygous T0 plants were grown, no YSA mutant plants were detected among 63 T1 plants, indicating that the mutations produced by TKC-C52-YSA are not heritable and that the C5-2 promoter is not suitable for creating heterozygous genetic material.
In the TKC-C53-YSA editing event, a total of 4 mutant plants were obtained, of which 3 were heterozygous, giving a heterozygous mutation proportion of 75%. When progeny of the heterozygous T0 plants were grown, 2 heterozygous YSA mutant plants were detected among 20 T1 plants, showing that the mutations generated by TKC-C53-YSA can be stably inherited by the progeny.
In the TKC-C53-SE5 editing event, a total of 16 mutant plants were obtained, all 16 of which were heterozygous, giving a heterozygous mutation proportion of 100%. When progeny of the heterozygous T0 plants were grown, 7 heterozygous SE5 mutant plants were detected among 52 T1 plants, showing that the mutations generated by TKC-C53-SE5 can be stably inherited by the progeny.
The above results demonstrate that the TKC-C51 and TKC-C53 vectors, in which Cas9 protein expression is driven by the promoters of the C5-1 and C5-3 genes, respectively, produce a high proportion of heterozygous plants in the T0 generation, and that the resulting mutations can be inherited by the progeny.
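The proportions reported in Examples 1 and 2 can be recomputed from the plant counts given above. The following is a minimal sketch; the data structure and function name are hypothetical, and the T1 percentages are derived here from the stated counts rather than quoted from the text.

```python
# (T0 mutant plants, T0 heterozygous plants) for each editing event, from the text.
T0_COUNTS = {
    "SE5-CK": (50, 6),
    "TKC-C51-YSA": (3, 3),
    "TKC-C52-YSA": (10, 10),
    "TKC-C53-YSA": (4, 3),
    "TKC-C53-SE5": (16, 16),
}

# (T1 plants tested, T1 heterozygous mutants detected), from the text.
T1_COUNTS = {
    "TKC-C51-YSA": (17, 2),
    "TKC-C52-YSA": (63, 0),
    "TKC-C53-YSA": (20, 2),
    "TKC-C53-SE5": (52, 7),
}


def percent(part: int, total: int) -> float:
    """Return part/total as a percentage."""
    return 100.0 * part / total


for event, (mutants, heterozygous) in T0_COUNTS.items():
    print(f"{event}: {percent(heterozygous, mutants):.0f}% heterozygous in T0")

for event, (tested, detected) in T1_COUNTS.items():
    print(f"{event}: {detected}/{tested} heterozygous mutants in T1 "
          f"({percent(detected, tested):.1f}%)")
```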
While the invention has been described in detail in the foregoing general description and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.