
CN114203258A - Single-stranded DNA screening method for regulating gene mRNA expression - Google Patents


Info

Publication number
CN114203258A
Authority
CN
China
Prior art keywords
sequence
stranded dna
sequences
mrna expression
screening method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111437868.4A
Other languages
Chinese (zh)
Other versions
CN114203258B (en)
Inventor
孙曙明
王启航
廖永康
刘静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202111437868.4A
Publication of CN114203258A
Application granted
Publication of CN114203258B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/30: Detection of binding sites or motifs
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50: Mutagenesis
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a single-stranded DNA screening method for regulating gene mRNA expression. A program analyzes complex DNA sequence information, which greatly improves efficiency and saves time. Programmatic comparison against database information then predicts the probability that a sequence binds a transcription factor, so that repetitive sequences associated with transcription factors important for regulating gene expression can be screened out. The predicted single-stranded DNA sequences are synthesized artificially and verified in cytological experiments, finally yielding single-stranded DNA that regulates gene mRNA expression. The screening process is general and is suitable for screening and verifying single-stranded DNA that regulates the mRNA expression of other genes.

Description

Single-stranded DNA screening method for regulating gene mRNA expression
Technical Field
The invention relates to a single-stranded DNA screening method for regulating gene mRNA expression, and belongs to the technical field of genetics.
Background
The genomes of eukaryotes contain many repetitive sequences, including palindromic repeats, inverted repeats, mirror repeats and complementary repeats, which play a variety of roles in processes such as cellular gene expression.
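The relationships between these repeat types can be made concrete with a few string operations. The sketch below is illustrative only: the helper names are hypothetical, and the working definitions (a complementary repeat pairs a segment with its base-wise complement, a mirror repeat with its reversal, an inverted repeat with its reverse complement, and a palindrome equals its own reverse complement) are the common textbook ones, assumed here rather than taken from the patent.

```python
# Illustrative helpers only; names and definitions are assumptions, not the patent's code.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq: str) -> str:
    """Base-wise complement, same orientation (partner of a complementary repeat)."""
    return seq.translate(COMPLEMENT)

def reverse(seq: str) -> str:
    """Reversed sequence (partner of a mirror repeat)."""
    return seq[::-1]

def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (partner of an inverted repeat)."""
    return complement(seq)[::-1]

def is_palindromic(seq: str) -> bool:
    """A palindromic site reads the same as its own reverse complement."""
    return seq == reverse_complement(seq)

if __name__ == "__main__":
    segment = "GATTC"
    print(complement(segment), reverse(segment), reverse_complement(segment))
    print(is_palindromic("GAATTC"))
```

Two segments that are complements or reverse complements of each other can base-pair, which is why these pairings matter for the secondary structures discussed in the following paragraph.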
Upstream of a eukaryotic gene there are several cis-acting elements, such as enhancers, silencers and promoters, that regulate expression of the downstream gene. When a eukaryotic cell transcribes a gene, the chromosome structure loosens and the DNA double strand separates and binds transcription factors. During this process, the various types of repetitive sequences upstream of the gene can, through complementary pairing, cause the DNA strand to form various spatial structures, which affects the binding of the DNA to transcription factors and thereby regulates expression of the gene. A nucleic acid aptamer is a nucleic acid sequence that binds a specific protein domain by forming a stable spatial structure, and in this way performs a biological function. By screening specific types of repetitive sequences in a genome, it is possible to find short nucleic acid sequences that bind transcription factors well, and thus to screen for sequences that regulate gene expression.
Studies have shown that single-stranded DNA can affect the expression of certain genes, but those studies did not involve repetitive single-stranded DNA sequences. At present there are few studies on the function of repetitive single-stranded DNA in cells, and few methods that combine computational screening in a programming language with experimental verification. The invention is therefore an original technology.
Disclosure of Invention
The present invention aims to overcome the shortcomings of the prior art by providing a single-stranded DNA screening method for regulating gene mRNA expression.
To achieve this purpose, the invention adopts the following technical scheme:
A single-stranded DNA screening method for regulating gene mRNA expression comprises the following specific steps:
(1) analyzing the upstream sequence of the gene with a Python program to obtain all complementary repeat sequence pairs and reverse complementary repeat sequence pairs, and exporting the sequence length and sequence start position to obtain candidate sequences;
(2) predicting the transcription factor binding sites of the candidate sequences using the JASPAR database, selecting the transcription factors with scores greater than 9, and numbering the corresponding sequences to obtain pre-screened sequences;
(3) transfecting cells with the pre-screened sequences, setting an internal reference, examining the change in the mRNA expression level of the corresponding gene, and selecting the sequences whose change is statistically different; these are the single-stranded DNA that regulates the mRNA expression of the gene.
Preferably, the single-stranded DNA comprises a single-stranded DNA sequence that up-regulates or down-regulates the expression of a gene's mRNA.
Preferably, the TRRUST database can further be used to analyze the transcription factors corresponding to the obtained single-stranded DNA sequences and to determine their action targets and action effects for further research.
Preferably, the specific method of step (1) is as follows: read in the sequence information and the length range of the repeat sequences. For each length n in the range, create a set data structure. For each subsequence at positions [i, i + n - 1] (where i is the relative position of the first base and n is the subsequence length in bases), check whether its inverted repeat or complementary repeat sequence is already in the set; if so, store it together with the matching sequence in the set as a repeat sequence pair. After each check, add the subsequence [i, i + n - 1] to the set and move on to [i + 1, i + n], continuing while i ≤ 5001 - n. Finally, output the sequence, length, position and other information of all retrieved repeat sequence pairs as the program result.
Preferably, in step (3), the transfected cells are K562 cells.
Preferably, in step (3), the statistical difference criterion is t < 0.05.
The invention has the beneficial effects that:
The invention uses a program to analyze complex DNA sequence information, which greatly improves efficiency and saves time. It also uses programmatic comparison against database information to predict the probability that a sequence binds a transcription factor, and thereby screens out repetitive sequences associated with transcription factors important for regulating gene expression. The predicted single-stranded DNA sequences are synthesized artificially and verified in cytological experiments, finally yielding single-stranded DNA that regulates gene mRNA expression.
The key innovation of the invention is that, for the first time, single-stranded DNA is screened through a standardized process of sequence information retrieval, program design, transcription factor binding prediction and experimental verification. The screened single-stranded DNA can regulate the mRNA expression level of the specific screened gene in cells, and a given single-stranded DNA may either up-regulate or down-regulate the gene.
The screening process is general and is suitable for screening and verifying single-stranded DNA that regulates the mRNA expression of other genes.
Drawings
FIG. 1 is a diagram of HBB sequence information;
FIG. 2 is a flowchart of the python algorithm;
FIG. 3 shows the expression of mRNA of HBB gene.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and examples, which are provided for the purpose of illustration only and are not intended to limit the scope of the invention.
1.HBB upstream sequence query
The location of the human HBB gene was found in the NCBI database at human chromosome 11, 5227071-. The upstream 5730 bp sequence was taken as the fragment 5232800-5227071 of human chromosome 11 (FIG. 1).
2. Programming and analysis results
A Python program was used to analyze the upstream 5730 bp sequence, obtain all complementary repeat sequence pairs and inverted complementary repeat sequence pairs within it, and export their lengths and sequence start positions (counted from the start of the HBB gene).
The specific idea is as follows: read in the sequence information and the length range of the repeat sequences. For each length n in the range, create a set data structure. For each subsequence at positions [i, i + n - 1] (where i is the relative position of the first base and n is the subsequence length in bases), check whether its inverted repeat or complementary repeat sequence is already in the set; if so, store it together with the matching sequence in the set as a repeat sequence pair. After each check, add the subsequence [i, i + n - 1] to the set and move on to [i + 1, i + n], continuing while i ≤ 5001 - n. Finally, output the sequence, length, position and other information of all retrieved repeat sequence pairs as the program result (FIG. 2).
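The scan described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the inventors' program: the function and parameter names are hypothetical, positions are 0-based offsets into the input string rather than positions counted from the HBB gene start, the loop bound is derived from the input length instead of the fixed value in the text, and a dictionary (subsequence to earlier start positions) stands in for the set so that the position of the earlier partner can be reported.

```python
# Minimal sketch of the set-based repeat scan; names and conventions are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    return complement(seq)[::-1]

def find_repeat_pairs(seq, min_len, max_len, transform):
    """Return (length, earlier_start, earlier_seq, later_start, later_seq) tuples.

    `transform` maps a subsequence to the partner it should pair with,
    e.g. `complement` or `reverse_complement`.
    """
    pairs = []
    for n in range(min_len, max_len + 1):       # cycle through each length n
        seen = {}                               # subsequence -> start positions seen so far
        for i in range(len(seq) - n + 1):       # window [i, i + n - 1]
            window = seq[i:i + n]
            partner = transform(window)
            for j in seen.get(partner, []):     # partner occurred earlier: record the pair
                pairs.append((n, j, partner, i, window))
            seen.setdefault(window, []).append(i)   # add the window only after checking
    return pairs

if __name__ == "__main__":
    for pair in find_repeat_pairs("ACGAATGC", 3, 3, complement):
        print(pair)
    for pair in find_repeat_pairs("ACCGTTCGGT", 4, 4, reverse_complement):
        print(pair)
```

Because each window is added to the lookup only after it has been checked, a window never pairs with itself, and each pair is reported once, at the position of its later member. The hash lookup keeps the work per length roughly linear in the sequence length.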
The results of the program analysis are shown in tables 1 and 2.
TABLE 1 analysis of complementary repeats in 5730bp upstream of HBB Gene
[Table 1 is provided as images in the original publication: BDA0003382388580000031, BDA0003382388580000041]
TABLE 2 analysis of the inverted complementary repeat sequence in 5730bp upstream of the HBB gene
[Table 2 is provided as an image in the original publication: BDA0003382388580000042]
3. Screening of sequences having potential transcription binding sites
The transcription factor binding sites of the candidate sequences were predicted using the JASPAR database. The transcription factors with scores greater than 9 were selected and the corresponding sequences were numbered, giving 12 sequences. The results are shown in Table 3.
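The thresholding and numbering step can be sketched as follows. This assumes the JASPAR predictions have already been exported into simple records; the `Hit` record, the `prescreen` function and the sample values are hypothetical placeholders for illustration and are not data or code from the patent.

```python
# Hedged sketch: filter predicted binding-site hits by score and number the sequences.
# The record layout and all sample values are made up for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    sequence: str   # candidate repeat sequence that was scanned
    tf_name: str    # predicted transcription factor (placeholder labels below)
    score: float    # score reported by the binding-site prediction

def prescreen(hits, threshold=9.0):
    """Keep hits scoring strictly above `threshold`; number sequences in order of appearance."""
    kept = [h for h in hits if h.score > threshold]
    numbering = {}
    for h in kept:
        numbering.setdefault(h.sequence, len(numbering) + 1)
    return [(numbering[h.sequence], h) for h in kept]

if __name__ == "__main__":
    example_hits = [
        Hit("ACCGTTCGGT", "TF_A", 10.2),
        Hit("ACCGTTCGGT", "TF_B", 7.5),
        Hit("TTGACCAA", "TF_C", 9.0),
        Hit("GGCATGCC", "TF_A", 11.8),
    ]
    for number, hit in prescreen(example_hits):
        print(number, hit.sequence, hit.tf_name, hit.score)
```

The comparison is strict, matching the wording "greater than 9"; a hit scoring exactly 9 is dropped.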
TABLE 3 prediction of transcription factor binding
[Table 3 is provided as images in the original publication: BDA0003382388580000043, BDA0003382388580000051]
4. RT-qPCR result chart
The 12 sequences were each transfected into K562 cells (ATCC, USA), and the resulting changes in the mRNA expression level of the HBB gene were plotted (FIG. 3, in which panels A to L show the HBB mRNA expression levels for sequences No. 1 to 12 in order). The GAPDH gene (synthesized in Shanghai) served as the internal reference, and the criterion for a statistical difference was set to t < 0.05. Across the replicate experiments, t-tests compared the change in HBB expression after each sequence was introduced into the cells against the blank control group NC. Sequences No. 1, 2, 3, 4, 5, 6, 11 and 12 showed no statistical difference. Sequences No. 7, 8, 9 and 10 differed significantly, with lower expression than the blank control group; in particular, expression was significantly reduced after sequences No. 7 and 8 were introduced, which suggests that these sequences may have a greater influence on transcription of the HBB gene.
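A comparison of this kind can be sketched with the standard library alone. The patent does not state how relative expression was calculated or which variant of the t-test was used, so the 2^-ΔΔCt normalization against the internal reference and Welch's unequal-variance t statistic below are assumptions, and all function names are hypothetical.

```python
# Hedged sketch: relative expression by the 2^-ddCt method (an assumption, not stated
# in the source) and Welch's t statistic for treated vs. control replicates.
from math import sqrt
from statistics import mean, variance

def fold_changes(target_ct, reference_ct, control_delta_ct):
    """Per-replicate fold change relative to the control group's mean delta-Ct."""
    return [2 ** -((t - r) - control_delta_ct)
            for t, r in zip(target_ct, reference_ct)]

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

if __name__ == "__main__":
    # Made-up Ct values for illustration only.
    control_target, control_ref = [20.0, 20.2, 19.8], [15.0, 15.0, 15.0]
    treated_target, treated_ref = [21.0, 21.2, 20.8], [15.0, 15.0, 15.0]
    control_delta = mean(t - r for t, r in zip(control_target, control_ref))
    control_fc = fold_changes(control_target, control_ref, control_delta)
    treated_fc = fold_changes(treated_target, treated_ref, control_delta)
    print(welch_t(treated_fc, control_fc))
```

Turning the t statistic into a P value requires the t-distribution, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=False)` returns both values directly if SciPy is available.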
5. Sequence-corresponding transcription factor and mechanism prediction
The TRRUST database was used to analyze the transcription factors corresponding to sequences No. 7, 8, 9 and 10, and their known action targets, action effects and literature sources were retrieved for further study (Table 4).
TABLE 4 sequence downstream Effect prediction analysis
[Table 4 is provided as images in the original publication: BDA0003382388580000052, BDA0003382388580000061]
Although the embodiments of the present invention have been described with reference to the accompanying drawings, the scope of the present invention is not limited thereto. Modifications and variations that those skilled in the art can make without inventive effort fall within the scope of the present invention.

Claims (6)

1. A single-stranded DNA screening method for regulating gene mRNA expression, characterized by comprising the following specific steps:
(1) analyzing the upstream sequence of the gene with a Python program to obtain all complementary repeat sequence pairs and reverse complementary repeat sequence pairs, and exporting the sequence length and sequence start position to obtain candidate sequences;
(2) predicting the transcription factor binding sites of the candidate sequences using the JASPAR database, selecting the transcription factors with scores greater than 9, and numbering the corresponding sequences to obtain pre-screened sequences;
(3) transfecting cells with the pre-screened sequences, setting an internal reference, examining the change in the mRNA expression level of the corresponding gene, and selecting the sequences whose change is statistically different, these being the single-stranded DNA that regulates the mRNA expression of the gene.
2. The screening method of claim 1, wherein the single-stranded DNA comprises a single-stranded DNA sequence that up-regulates or down-regulates gene mRNA expression.
3. The screening method according to claim 1, wherein the TRRUST database is further used to analyze the transcription factors corresponding to the obtained single-stranded DNA sequences and to determine their action targets and action effects for further research.
4. The screening method according to claim 1, wherein the specific method of step (1) is as follows: reading in the sequence information and the length range of the repeat sequences; for each length n, creating a set data structure; for each subsequence at positions [i, i + n - 1], if its inverted repeat or complementary repeat sequence is in the set, storing it and the matching sequence in the set as a repeat sequence pair; after each check, adding the subsequence [i, i + n - 1] to the set and checking the next subsequence [i + 1, i + n], continuing while i ≤ 5001 - n; and finally outputting the sequence, length, position and other information of all retrieved repeat sequence pairs as the program result.
5. The screening method according to claim 1, wherein the transfected cells in step (3) are K562 cells.
6. The screening method according to claim 1, wherein in the step (3), the standard for the statistical difference is t < 0.05.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111437868.4A CN114203258B (en) 2021-11-29 2021-11-29 A single-stranded DNA screening method for regulating gene mRNA expression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111437868.4A CN114203258B (en) 2021-11-29 2021-11-29 A single-stranded DNA screening method for regulating gene mRNA expression

Publications (2)

Publication Number Publication Date
CN114203258A (en) 2022-03-18
CN114203258B (en) 2024-12-20

Family

ID=80649589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111437868.4A Active CN114203258B (en) 2021-11-29 2021-11-29 A single-stranded DNA screening method for regulating gene mRNA expression

Country Status (1)

Country Link
CN (1) CN114203258B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997010360A1 (en) * 1995-09-13 1997-03-20 Chiron Corporation Method and construct for screening for inhibitors of transcriptional activation
US20040191779A1 (en) * 2003-03-28 2004-09-30 Jie Zhang Statistical analysis of regulatory factor binding sites of differentially expressed genes
CN102839214A (en) * 2012-09-06 2012-12-26 许汉鹏 Method for screening specific gene regulation correlated mechanism and molecule in eukaryon genome

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
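张世昌; 郑江: "Bioinformatics-based study of the molecular mechanism by which miR-221 regulates prostate cancer", 医学理论与实践 (Journal of Medical Theory and Practice), no. 05, 10 March 2020 (2020-03-10) *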

Also Published As

Publication number Publication date
CN114203258B (en) 2024-12-20

Similar Documents

Publication Publication Date Title
Vaishnav et al. The evolution, evolvability and engineering of gene regulatory DNA
Leger et al. RNA modifications detection by comparative Nanopore direct RNA sequencing
Liu et al. A-to-I RNA editing is developmentally regulated and generally adaptive for sexual reproduction in Neurospora crassa
McGlincy et al. A genome-scale CRISPR interference guide library enables comprehensive phenotypic profiling in yeast
Lin et al. Expression dynamics, relationships, and transcriptional regulations of diverse transcripts in mouse spermatogenic cells
Cui et al. Comparative analysis and classification of cassette exons and constitutive exons
CN113066527B (en) Target prediction method and system for siRNA knockdown mRNA
Ferreira et al. Protein abundance prediction through machine learning methods
Yu et al. Bioinformatics resources for deciphering the biogenesis and action pathways of plant small RNAs
Caselle et al. Correlating overrepresented upstream motifs to gene expression: a computational approach to regulatory element discovery in eukaryotes
WO2008157299A9 (en) Differential expression profiling analysis of cell culture phenotypes and uses thereof
Ma et al. Convenient synthesis and delivery of a megabase-scale designer accessory chromosome empower biosynthetic capacity
Shao et al. Riboformer: a deep learning framework for predicting context-dependent translation dynamics
Jiang et al. Assessing base-resolution DNA mechanics on the genome scale
Wang et al. A review of deep learning models for the prediction of chromatin interactions with DNA and epigenomic profiles
CN114203258A (en) Single-stranded DNA screening method for regulating gene mRNA expression
Rajendran et al. Identification of small non-coding RNAs from Rhizobium etli by integrated genome-wide and transcriptome-based methods
CN112951322B (en) Rule weight distribution siRNA design method based on grid search
Yao et al. M1ARegpred: epitranscriptome target prediction of N1-methyladenosine (m1A) regulators based on sequencing features and genomic features
Pan et al. Prediction and motif analysis of 2’-O-methylation using a hybrid deep learning model from RNA primary sequence and nanopore signals
CN116312755B (en) Target determination method and device and computer equipment
CN116189756B (en) Target selection method and device, computer equipment
Aziz et al. A mixed convolution neural network for identifying rna pseudouridine sites
CN110415765A (en) A method for predicting the subcellular localization of long noncoding RNAs
Herman et al. Adaptation for protein synthesis efficiency in a naturally occurring self-regulating operon

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant