Accelerating data-intensive genome analysis in the cloud
Mohamed et al., 2013
- Document ID
- 5669466697575276270
- Author
- Mohamed N
- Lin H
- Feng W
- Publication year
- 2013
- Publication venue
- Proceedings of the 5th International Conference on Bioinformatics and Computational Biology (BICoB), Honolulu, Hawaii, USA
Snippet
Next-generation sequencing (NGS) technologies have made it possible to rapidly sequence the human genome, heralding a new era of health-care innovations based on personalized genetic information. However, these NGS technologies generate data at a rate that far …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
- G06F3/0601—Dedicated interfaces to storage systems
- G06F3/0628—Dedicated interfaces to storage systems making use of a particular technique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Lovrić et al. | | PySpark and RDKit: moving towards big data in cheminformatics |
| Decap et al. | | Halvade: scalable sequence analysis with MapReduce |
| Nothaft et al. | | Rethinking data-intensive science using scalable analytics systems |
| US10366053B1 (en) | | Consistent randomized record-level splitting of machine learning data |
| Mohamed et al. | | Accelerating data-intensive genome analysis in the cloud |
| US20240004838A1 (en) | | Quality score compression for improving downstream genotyping accuracy |
| Ferraro Petrillo et al. | | Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics |
| US9201916B2 (en) | | Method, system, and computer-readable medium for providing a scalable bio-informatics sequence search on cloud |
| Kienzler et al. | | Stream as you go: The case for incremental data access and processing in the cloud |
| Tabari et al. | | PorthoMCL: parallel orthology prediction using MCL for the realm of massive genome availability |
| Gurtowski et al. | | Genotyping in the cloud with crossbow |
| Huang et al. | | Analyzing large scale genomic data on the cloud with Sparkhit |
| Ye et al. | | H-BLAST: a fast protein sequence alignment toolkit on heterogeneous computers with GPUs |
| Diao et al. | | Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis. |
| Kienzler et al. | | Large-scale DNA sequence analysis in the cloud: a stream-based approach |
| Maarala et al. | | ViraPipe: scalable parallel pipeline for viral metagenome analysis from next generation sequencing reads |
| Shi et al. | | A case study of tuning MapReduce for efficient Bioinformatics in the cloud |
| Shanker | | Genome research in the cloud |
| Ge et al. | | Counting kmers for biological sequences at large scale |
| Piñeiro et al. | | BigSeqKit: a parallel Big Data toolkit to process FASTA and FASTQ files at scale |
| Deng et al. | | HiGene: A high-performance platform for genomic data analysis |
| Wilke et al. | | An experience report: porting the MG‐RAST rapid metagenomics analysis pipeline to the cloud |
| Vijayakumar et al. | | Optimizing sequence alignment in cloud using hadoop and mpp database |
| Yin et al. | | RabbitMash: accelerating hash-based genome analysis on modern multi-core architectures |
| Boulund et al. | | Tentacle: distributed quantification of genes in metagenomes |