
Javed et al., 2015 - Google Patents

A direct approach for word and character segmentation in run-length compressed documents with an application to word spotting

Document ID: 10369158319323976233
Authors: Javed M, Nagabhushan P, Chaudhuri B
Publication year: 2015
Publication venue: 2015 13th International conference on document analysis and recognition (ICDAR)

Snippet

Segmentation of a text document into lines, words, and characters is an important objective in applications such as OCR and related analytics. However, documents today are commonly compressed for archival and transmission efficiency. Text segmentation in compressed …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613 Indexing
    • G06F17/30619 Indexing indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30289 Database design, administration or maintenance
    • G06F17/30303 Improving data quality; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30244 Information retrieval; Database structures therefor; File system structures therefor in image databases
    • G06F17/30247 Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00442 Document analysis and understanding; Document recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6201 Matching; Proximity measures
    • G06K9/6202 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/41 Bandwidth or redundancy reduction
    • H04N1/411 Bandwidth or redundancy reduction for the transmission or storage or reproduction of two-tone pictures, e.g. black and white pictures
    • H04N1/4115 Bandwidth or redundancy reduction for the transmission or storage or reproduction of two-tone pictures, e.g. black and white pictures involving the recognition of specific patterns, e.g. by symbol matching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/46 Colour picture communication systems
    • H04N1/64 Systems for the transmission or the storage of the colour picture signal; Details therefor, e.g. coding or decoding means therefor
    • H04N1/644 Systems for the transmission or the storage of the colour picture signal; Details therefor, e.g. coding or decoding means therefor using a reduced set of representative colours, e.g. each representing a particular range in a colour space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements

Similar Documents

Publication | Title
Javed et al. | A direct approach for word and character segmentation in run-length compressed documents with an application to word spotting
US9292690B2 (en) | Anomaly, association and clustering detection
US6928435B2 (en) | Compressed document matching
US7392472B2 (en) | Layout analysis
JP2020511726A (en) | Data extraction from electronic documents
US8340412B2 (en) | Image processing
EP2953064B1 (en) | Information conversion method, information conversion device, and information conversion program
CN108595710B (en) | Rapid massive picture de-duplication method
CN112949476A (en) | Text relation detection method and device based on graph convolution neural network and storage medium
JP2006505075A (en) | Nonlinear quantization and similarity matching method for video sequence retrieval with multiple image frames
CN103020321B (en) | Neighbor search method and system
US20120084305A1 (en) | Compiling method, compiling apparatus, and compiling program of image database used for object recognition
KR101634395B1 (en) | Video identification
CN102737243A (en) | Method and device for acquiring descriptive information of multiple images and image matching method
US20140153838A1 (en) | Computer vision-based methods for enhanced jbig2 and generic bitonal compression
Javed et al. | Extraction of line-word-character segments directly from run-length compressed printed text-documents
US20150012544A1 (en) | Index scan device and index scan method
JPWO2015005017A1 | Multidimensional range search apparatus and multidimensional range search method
US20160275370A1 (en) | Methods and systems for determining a perceptual similarity between images
Lu et al. | Detection of image seam carving using a novel pattern
Javed et al. | Automatic extraction of correlation-entropy features for text document analysis directly in run-length compressed domain
WO2013087250A1 (en) | Dynamic anomaly, association and clustering detection
CN103957012A (en) | Method and device for compressing DFA matrix
CN115269957A (en) | Method and device for data identification by adopting computer equipment
JP2014182617A (en) | Image processing apparatus, method, and program