Risvik et al., 2013
Maguro, a system for indexing and searching over very large text collections
- Document ID
- 9310260604649110553
- Author
- Risvik K
- Chilimbi T
- Tan H
- Kalyanaraman K
- Anderson C
- Publication year
- 2013
- Publication venue
- Proceedings of the sixth ACM international conference on Web search and data mining
Snippet
Maguro is a system for efficiently searching very large collections of text content, up to 1 trillion documents, at low cost. Search engines span content ranging from the very dynamic and heavily metadata-augmented to the tail content of the web. A long tail distribution of …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30477—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30864—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
- G06F17/30867—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
| Publication | Title |
|---|---|
| Risvik et al. | Maguro, a system for indexing and searching over very large text collections |
| Dhulavvagol et al. | Performance analysis of distributed processing system using shard selection techniques on elasticsearch |
| AU2012260534B2 (en) | Hybrid and iterative keyword and category search technique |
| Wang et al. | On summarization and timeline generation for evolutionary tweet streams |
| US10496642B2 (en) | Querying input data |
| US8352494B1 (en) | Distributed image search |
| US9858280B2 (en) | System, apparatus, program and method for data aggregation |
| He et al. | Combining implicit and explicit topic representations for result diversification |
| WO2019133928A1 (en) | Hierarchical, parallel models for extracting in real-time high-value information from data streams and system and method for creation of same |
| US20110179002A1 (en) | System and Method for a Vector-Space Search Engine |
| Kulkarni et al. | Shard ranking and cutoff estimation for topically partitioned collections |
| US7895210B2 (en) | Methods and apparatuses for information analysis on shared and distributed computing systems |
| US20110314026A1 (en) | System and Method for Retrieving Information Using a Query Based Index |
| Kulkarni et al. | Document allocation policies for selective searching of distributed indexes |
| Tandon et al. | Hawk: Hardware support for unstructured log processing |
| US8375022B2 (en) | Keyword determination based on a weight of meaningfulness |
| Gu et al. | Chronos: An elastic parallel framework for stream benchmark generation and simulation |
| Cheng et al. | Supporting entity search: a large-scale prototype search engine |
| CN118964686A (en) | Vector retrieval method, device, equipment and storage medium |
| US8484221B2 (en) | Adaptive routing of documents to searchable indexes |
| Hwang et al. | Organizing user search histories |
| Yu et al. | Finding the most similar documents across multiple text databases |
| US9830355B2 (en) | Computer-implemented method of performing a search using signatures |
| Wang et al. | Event Indexing and Searching for High Volumes of Event Streams in the Cloud |
| Hwang et al. | Binrank: Scaling dynamic authority-based search using materialized subgraphs |