Risvik et al., 2013 - Google Patents

Maguro, a system for indexing and searching over very large text collections

Document ID
9310260604649110553
Author
Risvik K
Chilimbi T
Tan H
Kalyanaraman K
Anderson C
Publication year
2013
Publication venue
Proceedings of the sixth ACM international conference on Web search and data mining

Snippet

Maguro is a system for efficiently searching very large collections of text content of up to 1 trillion documents at low cost. Search engines span across content that is very dynamic and highly augmented with metadata to the tail content of the web. A long tail distribution of …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/30675 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G06F17/30424 Query processing
    • G06F17/30477 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G06F17/30424 Query processing
    • G06F17/30533 Other types of queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613 Indexing
    • G06F17/30619 Indexing indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G06F17/30864 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
    • G06F17/30867 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705 Clustering or classification
    • G06F17/3071 Clustering or classification including class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30312 Storage and indexing structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943 Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946 Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass

Similar Documents

Publication | Title
Risvik et al. | Maguro, a system for indexing and searching over very large text collections
Dhulavvagol et al. | Performance analysis of distributed processing system using shard selection techniques on elasticsearch
AU2012260534B2 (en) | Hybrid and iterative keyword and category search technique
Wang et al. | On summarization and timeline generation for evolutionary tweet streams
US10496642B2 (en) | Querying input data
US8352494B1 (en) | Distributed image search
US9858280B2 (en) | System, apparatus, program and method for data aggregation
He et al. | Combining implicit and explicit topic representations for result diversification
WO2019133928A1 (en) | Hierarchical, parallel models for extracting in real-time high-value information from data streams and system and method for creation of same
US20110179002A1 (en) | System and Method for a Vector-Space Search Engine
Kulkarni et al. | Shard ranking and cutoff estimation for topically partitioned collections
US7895210B2 (en) | Methods and apparatuses for information analysis on shared and distributed computing systems
US20110314026A1 (en) | System and Method for Retrieving Information Using a Query Based Index
Kulkarni et al. | Document allocation policies for selective searching of distributed indexes
Tandon et al. | Hawk: Hardware support for unstructured log processing
US8375022B2 (en) | Keyword determination based on a weight of meaningfulness
Gu et al. | Chronos: An elastic parallel framework for stream benchmark generation and simulation
Cheng et al. | Supporting entity search: a large-scale prototype search engine
CN118964686A (en) | Vector retrieval method, device, equipment and storage medium
US8484221B2 (en) | Adaptive routing of documents to searchable indexes
Hwang et al. | Organizing user search histories
Yu et al. | Finding the most similar documents across multiple text databases
US9830355B2 (en) | Computer-implemented method of performing a search using signatures
Wang et al. | Event Indexing and Searching for High Volumes of Event Streams in the Cloud
Hwang et al. | Binrank: Scaling dynamic authority-based search using materialized subgraphs