Henson et al., 2005 - Google Patents

Guidelines for using compare-by-hash

Document ID: 11208421064707662631
Authors: Henson V, Henderson R, et al.
Publication year: 2005
Publication venue: Comput. Sci

Snippet

Recently, a new technique called compare-by-hash has become popular. Compare-by-hash is a method of content-based addressing in which data is identified only by the cryptographic hash of its contents. Hash collisions are ignored, with the justification that they occur less …
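
To make the idea in the snippet concrete, here is a minimal sketch of a content-addressed store in which a block is identified only by the hash of its contents. It is an illustration under stated assumptions, not the authors' implementation: the class name `CompareByHashStore`, the `verify` flag, and the choice of SHA-256 are all hypothetical, since the snippet does not name a particular hash function or API.

```python
import hashlib


class CompareByHashStore:
    """Toy content-addressed block store (hypothetical names, for illustration only)."""

    def __init__(self, verify=False):
        self.blocks = {}      # hex digest -> block bytes
        self.verify = verify  # if True, also compare bytes when a digest already exists

    def put(self, data: bytes) -> str:
        # The digest is the block's only identifier (SHA-256 is an assumption here).
        key = hashlib.sha256(data).hexdigest()
        existing = self.blocks.get(key)
        if existing is None:
            self.blocks[key] = data
        elif self.verify and existing != data:
            # Pure compare-by-hash skips this check and would silently
            # treat the two different blocks as identical.
            raise ValueError("hash collision detected for key " + key)
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


store = CompareByHashStore()
k1 = store.put(b"hello world")
k2 = store.put(b"hello world")  # duplicate content maps to the same key
k3 = store.put(b"hello there")
print(k1 == k2, k1 == k3, len(store.blocks))  # prints: True False 2
```

With `verify=False` the store behaves as the snippet describes: two blocks with the same digest are assumed to be the same data, and a collision would go unnoticed. Setting `verify=True` trades an extra byte comparison for detection of that case.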

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067 File systems; File servers
    • G06F17/30129 Details of further file system functionalities
    • G06F17/3015 Redundancy elimination performed by the file system
    • G06F17/30156 De-duplication implemented within the file system, e.g. based on file segments
    • G06F17/30159 De-duplication implemented within the file system, e.g. based on file segments based on file chunks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067 File systems; File servers
    • G06F17/30182 File system types
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10 Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communication
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communication including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236 Cryptographic mechanisms or cryptographic arrangements for secret or secure communication including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
    • G06F3/0601 Dedicated interfaces to storage systems
    • G06F3/0628 Dedicated interfaces to storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring

Similar Documents

Publication Publication Date Title
Xia et al. A comprehensive study of the past, present, and future of data deduplication
Batten et al. pStore: A secure peer-to-peer backup system
JP4263477B2 (en) System for identifying common digital sequences
JP4846156B2 (en) Hash file system and method for use in a commonality factoring system
US9344112B2 (en) Sampling based elimination of duplicate data
US7478113B1 (en) Boundaries
US8112477B2 (en) Content identification for peer-to-peer content retrieval
US6704730B2 (en) Hash file system and method for use in a commonality factoring system
KR101381551B1 (en) Group based complete and incremental computer file backup system, process and apparatus
Henson An analysis of compare-by-hash
US8577850B1 (en) Techniques for global data deduplication
US20200097452A1 (en) Data deduplication device, data deduplication method, and data deduplication program
US20110099200A1 (en) Data sharing and recovery within a network of untrusted storage devices using data object fingerprinting
JP2007202146A (en) Method and apparatus for distributed data replication
EP2087418A1 (en) Methods and systems for data management using multiple selection criteria
CA2670400A1 (en) Methods and systems for quick and efficient data management and/or processing
US10339124B2 (en) Data fingerprint strengthening
CN103649946A (en) Transmitting file system changes over the network
US11860739B2 (en) Methods for managing snapshots in a distributed de-duplication system and devices thereof
Park et al. Supporting Practical Content-Addressable Caching with CZIP Compression.
US20070055834A1 (en) Performance improvement for block span replication
Henson et al. Guidelines for using compare-by-hash
Cooley et al. ABS: the apportioned backup system
Tian et al. Sed‐Dedup: An efficient secure deduplication system with data modifications
CN112685219A (en) Method, apparatus and computer program product for backing up data