CN114203258A - Single-stranded DNA screening method for regulating gene mRNA expression - Google Patents
Single-stranded DNA screening method for regulating gene mRNA expression
- Publication number
- CN114203258A
- Authority
- CN
- China
- Prior art keywords
- sequence
- stranded dna
- sequences
- mrna expression
- screening method
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/30—Detection of binding sites or motifs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/50—Mutagenesis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a single-stranded DNA screening method for regulating gene mRNA expression. A program is used to analyze complex DNA sequence information, which greatly improves efficiency and saves time. Program-based comparison against database information is then used to predict the probability that a sequence binds a transcription factor, so that repetitive sequences associated with important transcription factors regulating gene expression can be screened. The predicted single-stranded DNA sequences are synthesized artificially and verified in cell-based experiments, finally yielding single-stranded DNA that regulates gene mRNA expression. The screening process is general: it can be applied to search for single-stranded DNA regulating the mRNA expression of other genes, and is suitable for screening and verifying such single-stranded DNA for a variety of genes.
Description
Technical Field
The invention relates to a single-stranded DNA screening method for regulating gene mRNA expression, and belongs to the technical field of genetics.
Background
The genomes of eukaryotes contain many repetitive sequences, including palindromic repeats, inverted repeats, mirror repeats and complementary repeats, which play a variety of roles in processes such as the expression of cellular genes.
Several cis-acting elements, such as enhancers, silencers and promoters, lie upstream of eukaryotic genes and regulate downstream gene expression. When a eukaryotic cell transcribes a gene, the chromosome structure loosens and the DNA double strand separates and binds transcription factors. During this process, the various types of repeated sequences upstream of the gene can, through complementary pairing, cause the DNA strand to form a variety of spatial structures, which affects the binding of DNA to transcription factors and thereby regulates gene expression. A nucleic acid aptamer is a nucleic acid sequence that binds a specific protein domain by forming a stable spatial structure, and so performs a biological function. By screening specific types of repeated sequences in a genome, it may be possible to find short nucleic acid sequences that bind transcription factors well, and thus to identify sequences that regulate gene expression.
Studies have shown that single-stranded DNA can affect the expression of certain genes, but those studies did not concern repetitive single-stranded DNA sequences. At present there are few studies on the function of repetitive single-stranded DNA in cells, and few methods that combine computational screening in a programming language with experimental verification. The invention therefore has original significance.
Disclosure of Invention
The present invention aims to overcome the shortcomings of the prior art and provides a single-stranded DNA screening method for regulating gene mRNA expression.
To achieve this purpose, the invention adopts the following technical scheme:
A single-stranded DNA screening method for regulating gene mRNA expression comprises the following specific steps:
(1) analyzing the upstream sequence of the gene with a Python program to obtain all complementary repeat and reverse complementary repeat sequence pairs, and exporting the sequence length and the sequence start position to obtain candidate sequences;
(2) predicting transcription factor binding sites of the candidate sequences using the JASPAR database, retaining the transcription factors with scores greater than 9, and numbering the corresponding sequences to obtain pre-screened sequences;
(3) transfecting cells with the pre-screened sequences, setting an internal reference, examining the change in the mRNA expression level of the corresponding gene, and selecting the sequences whose change is statistically different, which are the single-stranded DNA regulating the mRNA expression of the gene.
Preferably, the single-stranded DNA comprises a single-stranded DNA sequence that up-regulates or down-regulates the expression of gene mRNA.
Preferably, the TRRUST database can be used to analyze the transcription factors corresponding to the obtained single-stranded DNA sequences and to determine their action targets and action effects for further research.
Preferably, the specific method of step (1) is as follows: read in the sequence information and the length range of the repeated sequences; loop over each length n and create a set data structure; for each sequence at position [i, i+n-1] (where i is the relative position of the first base and n is the number of bases in the sequence), if its inverted repeat/complementary repeat sequence is already in the set, store it and the sequence in the set as a repeat sequence pair. After each check, add the sequence [i, i+n-1] to the set and check the next sequence [i+1, i+n], continuing while i is less than or equal to 5001-n. Finally, output the sequence, length, position and other information of all retrieved repeat sequence pairs as the program result.
Preferably, in step (3), the transfected cells are K562 cells.
Preferably, in step (3), the criterion for a statistical difference is t < 0.05 (t-test).
The invention has the beneficial effects that:
The invention uses a program to analyze complex DNA sequence information, which greatly improves efficiency and saves time. It also uses program-based comparison against database information to predict the probability that a sequence binds a transcription factor, so that repetitive sequences associated with important transcription factors regulating gene expression can be screened. The predicted single-stranded DNA sequences are synthesized artificially and verified in cell-based experiments, finally yielding single-stranded DNA that regulates gene mRNA expression.
The key innovation of the invention is that, for the first time, single-stranded DNA is screened through a standardized workflow of sequence information retrieval, program design, transcription factor binding prediction and experimental verification. The screened single-stranded DNA can regulate the mRNA expression level of the specific gene screened for in cells, and a given single-stranded DNA may either up-regulate or down-regulate that gene.
The screening process is general: it can be applied to search for single-stranded DNA regulating the mRNA expression of other genes, and is suitable for screening and verifying such single-stranded DNA for a variety of genes.
Drawings
FIG. 1 is a diagram of HBB sequence information;
FIG. 2 is a flowchart of the python algorithm;
FIG. 3 shows the expression of mRNA of HBB gene.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and examples, which are provided for the purpose of illustration only and are not intended to limit the scope of the invention.
1. HBB upstream sequence query
The human HBB gene was located in the NCBI database on human chromosome 11, at 5227071-. The upstream 5730 bp sequence was taken as the fragment 5232800-5227071 of human chromosome 11 (FIG. 1).
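The stated fragment length can be checked from the two coordinates. The following is a minimal sketch that assumes the coordinates are inclusive; the function names `fragment_length` and `relative_position` are illustrative and do not come from the patent.

```python
# Coordinates quoted in the text (human chromosome 11).
# Assumption: both ends are inclusive.
UPSTREAM_FAR_END = 5232800  # far end of the upstream fragment
GENE_START = 5227071        # quoted coordinate at the HBB gene start


def fragment_length(a: int, b: int) -> int:
    """Length in bases of an inclusive coordinate interval, in either orientation."""
    return abs(a - b) + 1


def relative_position(coord: int, gene_start: int = GENE_START) -> int:
    """Distance in bases of a chromosomal coordinate from the gene start."""
    return abs(coord - gene_start)


print(fragment_length(UPSTREAM_FAR_END, GENE_START))  # 5730
```

Under the inclusive assumption the interval spans 5730 bases, which agrees with the 5730 bp stated in the text.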
2. Programming and analysis results
A Python program was used to analyze the upstream 5730 bp sequence, obtain all complementary repeat and inverted complementary repeat sequence pairs within it, and export their lengths and start positions (counted from the start of the HBB gene).
The specific idea is as follows: read in the sequence information and the length range of the repeated sequences; loop over each length n and create a set data structure; for each sequence at position [i, i+n-1] (where i is the relative position of the first base and n is the number of bases in the sequence), if its inverted repeat/complementary repeat sequence is already in the set, store it and the sequence in the set as a repeat sequence pair. After each check, add the sequence [i, i+n-1] to the set and check the next sequence [i+1, i+n], continuing while i is less than or equal to 5001-n. Finally, output the sequence, length, position and other information of all retrieved repeat sequence pairs as the program result. (FIG. 2)
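The sliding-window idea described above can be sketched as follows. This is an illustrative reconstruction, not the patent's actual program: the function names are hypothetical, a dictionary keyed by window sequence stands in for the set so that start positions can be reported, and the loop bound is written in terms of the input length rather than the fixed value 5001.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, same orientation."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite orientation."""
    return complement(seq)[::-1]


# (kind, earlier start, later start, later window sequence, length); positions are 1-based.
Pair = Tuple[str, int, int, str, int]


def find_repeat_pairs(seq: str, min_len: int, max_len: int) -> List[Pair]:
    seq = seq.upper()
    pairs: List[Pair] = []
    for n in range(min_len, max_len + 1):
        seen: Dict[str, List[int]] = {}  # window sequence -> start positions seen so far
        # i runs while i <= len(seq) + 1 - n, mirroring "i <= 5001 - n" in the text
        for i in range(1, len(seq) - n + 2):
            window = seq[i - 1:i - 1 + n]
            partners = (
                ("complementary", complement(window)),
                ("reverse_complementary", reverse_complement(window)),
            )
            for kind, partner in partners:
                for j in seen.get(partner, []):
                    pairs.append((kind, j, i, window, n))
            # the window is added only after it has been checked
            seen.setdefault(window, []).append(i)
    return pairs


print(find_repeat_pairs("AACCTTGG", 4, 4))
# [('complementary', 1, 5, 'TTGG', 4)]
```

Because each window is added only after it has been checked, a window never pairs with itself. Overlapping windows are not filtered out in this sketch; whether the original program excludes them is not stated in the text.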
The results of the program analysis are shown in tables 1 and 2.
TABLE 1 analysis of complementary repeats in 5730bp upstream of HBB Gene
TABLE 2 analysis of the inverted complementary repeat sequence in 5730bp upstream of the HBB gene
3. Screening of sequences having potential transcription factor binding sites
Transcription factor binding sites of the candidate sequences were predicted using the JASPAR database, the transcription factors with scores greater than 9 were retained, and the corresponding sequences were numbered, giving 12 sequences. The results are shown in Table 3.
TABLE 3 prediction of transcription factor binding
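The pre-screening step (keep predictions scoring above 9, then number the surviving sequences) can be sketched as a simple filter. The record layout, the names `Hit` and `prescreen`, and the idea that predictions arrive as a flat list are assumptions for illustration; the text does not describe how the JASPAR output was handled.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    """One predicted binding site (hypothetical record layout)."""
    sequence: str
    transcription_factor: str
    score: float


SCORE_CUTOFF = 9.0  # threshold stated in the text


def prescreen(hits: List[Hit], cutoff: float = SCORE_CUTOFF) -> Dict[int, List[Hit]]:
    """Keep hits scoring above the cutoff and number each surviving sequence.

    Numbers are assigned in order of first appearance, starting at 1.
    """
    numbers: Dict[str, int] = {}
    numbered: Dict[int, List[Hit]] = {}
    for hit in hits:
        if hit.score > cutoff:
            if hit.sequence not in numbers:
                numbers[hit.sequence] = len(numbers) + 1
            numbered.setdefault(numbers[hit.sequence], []).append(hit)
    return numbered
```

A sequence is dropped entirely when none of its predicted transcription factors scores above the cutoff, and a score of exactly 9 does not pass.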
4. RT-qPCR results
The changes in HBB mRNA expression after transfection of the 12 sequences into K562 cells (ATCC, USA) were plotted; the results are shown in FIG. 3 (panels A to L show HBB mRNA expression for sequences No. 1 to 12, in order). The GAPDH gene (synthesized in Shanghai) served as the internal reference, and the statistical difference criterion was set to t < 0.05. For each sequence, a t-test across replicates compared HBB expression after introduction of the sequence with the blank control group (NC). Sequences No. 1, 2, 3, 4, 5, 6, 11 and 12 showed no statistical difference. Sequences No. 7, 8, 9 and 10 differed significantly from the blank control group, with reduced expression; expression was markedly reduced after introduction of sequences No. 7 and 8, indicating that these sequences may have a greater influence on transcription of the HBB gene.
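The comparison against the blank control can be sketched as below. Several things here are assumptions not stated in the text: relative expression is computed with the common 2^-ΔΔCt formula, the t-test is a pooled-variance two-sample Student's test, and significance is decided by comparing |t| with standard two-sided 5% critical values. All function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def relative_expression(ct_target: float, ct_reference: float,
                        control_delta_ct: float) -> float:
    """2^-ddCt for one replicate (assumed quantification scheme).

    ct_target: Ct of the gene of interest (e.g. HBB)
    ct_reference: Ct of the internal reference (e.g. GAPDH)
    control_delta_ct: mean (target - reference) Ct of the blank control
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** -(delta_ct - control_delta_ct)


def student_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Pooled-variance two-sample t statistic (needs non-zero pooled variance)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Two-sided 5% critical values of Student's t for a few degrees of freedom.
T_CRIT_05 = {2: 4.303, 4: 2.776, 6: 2.447, 8: 2.306, 10: 2.228}


def is_significant(a: Sequence[float], b: Sequence[float]) -> bool:
    """True when |t| exceeds the tabulated critical value.

    Raises KeyError for degrees of freedom missing from T_CRIT_05.
    """
    df = len(a) + len(b) - 2
    return abs(student_t(a, b)) > T_CRIT_05[df]
```

With three replicates per group the test has 4 degrees of freedom, so a difference counts as significant when |t| exceeds 2.776. A statistics library that returns exact p-values would be the usual choice in practice; the lookup table only keeps the sketch free of dependencies.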
5. Sequence-corresponding transcription factor and mechanism prediction
The TRRUST database was used to analyze the transcription factors corresponding to sequences No. 7, 8, 9 and 10, and their known action targets, action effects and supporting literature were identified for further study (Table 4).
TABLE 4 sequence downstream Effect prediction analysis
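Looking up the targets and effects of each transcription factor can be sketched as a small table lookup. The sketch assumes a tab-separated export with four columns (transcription factor, target gene, mode of regulation, reference IDs); that layout, the function name `load_tf_targets`, and the placeholder rows are assumptions for illustration and are not real TRRUST entries.

```python
import csv
import io
from typing import Dict, List, Tuple

# (target gene, mode of regulation, reference IDs)
Regulation = Tuple[str, str, str]


def load_tf_targets(tsv_text: str) -> Dict[str, List[Regulation]]:
    """Group regulatory records by transcription factor.

    Rows that do not have exactly four tab-separated fields are skipped.
    """
    table: Dict[str, List[Regulation]] = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != 4:
            continue
        tf, target, mode, refs = row
        table.setdefault(tf, []).append((target, mode, refs))
    return table


# Placeholder rows only; TF_A, GENE_X and the reference IDs are made up.
EXAMPLE = (
    "TF_A\tGENE_X\tRepression\t00000001\n"
    "TF_A\tGENE_Y\tActivation\t00000002;00000003\n"
    "TF_B\tGENE_X\tUnknown\t00000004\n"
)

for target, mode, refs in load_tf_targets(EXAMPLE).get("TF_A", []):
    print(target, mode, refs)
```

Given the transcription factors predicted for a sequence, the lookup returns the recorded targets, the direction of regulation and the literature references, which is the information summarized in Table 4.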
Although the embodiments of the present invention have been described with reference to the accompanying drawings, the scope of the present invention is not limited thereto, and various modifications and variations which do not require inventive efforts and which are made by those skilled in the art are within the scope of the present invention.
Claims (6)
1. A single-stranded DNA screening method for regulating gene mRNA expression, characterized by comprising the following specific steps:
(1) analyzing the upstream sequence of the gene with a Python program to obtain all complementary repeat and reverse complementary repeat sequence pairs, and exporting the sequence length and the sequence start position to obtain candidate sequences;
(2) predicting transcription factor binding sites of the candidate sequences using the JASPAR database, retaining the transcription factors with scores greater than 9, and numbering the corresponding sequences to obtain pre-screened sequences;
(3) transfecting cells with the pre-screened sequences, setting an internal reference, examining the change in the mRNA expression level of the corresponding gene, and selecting the sequences whose change is statistically different, which are the single-stranded DNA regulating the mRNA expression of the gene.
2. The screening method according to claim 1, wherein the single-stranded DNA comprises a single-stranded DNA sequence that up-regulates or down-regulates gene mRNA expression.
3. The screening method according to claim 1, wherein the TRRUST database is further used to analyze the transcription factors corresponding to the obtained single-stranded DNA sequences and to determine their action targets and action effects for further research.
4. The screening method according to claim 1, wherein the specific method of step (1) is as follows: reading in the sequence information and the length range of the repeated sequences; looping over each length n and establishing a set data structure; for each sequence at position [i, i+n-1], if its inverted repeat/complementary repeat sequence is in the set, storing it and the sequence in the set as a repeat sequence pair; after each check, adding the sequence [i, i+n-1] to the set and checking the next sequence [i+1, i+n], continuing while i is less than or equal to 5001-n; and finally outputting the sequence, length, position and other information of all retrieved repeat sequence pairs as the program result.
5. The screening method according to claim 1, wherein in step (3) the transfected cells are K562 cells.
6. The screening method according to claim 1, wherein in step (3) the criterion for a statistical difference is t < 0.05.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111437868.4A CN114203258B (en) | 2021-11-29 | 2021-11-29 | A single-stranded DNA screening method for regulating gene mRNA expression |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114203258A (en) | 2022-03-18 |
| CN114203258B (en) | 2024-12-20 |
Family
ID=80649589
Family Applications (1)
| Application Number | Publication | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN202111437868.4A | CN114203258B (en) | Active | 2021-11-29 | 2021-11-29 | A single-stranded DNA screening method for regulating gene mRNA expression |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114203258B (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO1997010360A1 (en) * | 1995-09-13 | 1997-03-20 | Chiron Corporation | Method and construct for screening for inhibitors of transcriptional activation |
| US20040191779A1 (en) * | 2003-03-28 | 2004-09-30 | Jie Zhang | Statistical analysis of regulatory factor binding sites of differentially expressed genes |
| CN102839214A (en) * | 2012-09-06 | 2012-12-26 | 许汉鹏 | Method for screening specific gene regulation correlated mechanism and molecule in eukaryon genome |
Non-Patent Citations (1)
| Title |
|---|
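| Zhang Shichang; Zheng Jiang: "Bioinformatics-based study of the molecular mechanism by which miR-221 regulates prostate cancer", 医学理论与实践 (The Journal of Medical Theory and Practice), no. 05, 10 March 2020 (2020-03-10) * |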
Also Published As
| Publication number | Publication date |
|---|---|
| CN114203258B (en) | 2024-12-20 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |