[go: up one dir, main page]

Showing 266 open source projects for "tesseract-ocr"

View related business solutions
  • Monitor your whole IT Infrastructure Icon
    Monitor your whole IT Infrastructure

    Know what's up and what's new: Monitor all your systems, devices, traffic and applications.

    Caters to tech staff, system Administrators, and companies of any size, from small and medium sized businesses to enterprises that need their IT network to be reliable and easy to monitor in real-time. Equipped with an easy-to-use, intuitive interface with a cutting-edge monitoring engine. PRTG optimizes connections and workloads as well as reducing operational costs by avoiding outages while saving time and controlling service level agreements (SLAs).
    Start Your Free PRTG Trial Now
  • FusionAuth: Authentication and User Management Software Icon
    FusionAuth: Authentication and User Management Software

    Offer your users flexible authentication options, including passwords, passwordless, single sign-on (SSO), and multi-factor authentication (MFA).

    FusionAuth adds login, registration, SSO, MFA, and a bazillion other features to your app in days - not months.
    Learn More
  • 1

    Tesseract OCR

    Open Source OCR Engine

    Tesseract is an open source OCR or optical character recognition engine and command line program. OCR is a technology that allows for the recognition of text characters within a digital image. With the latest version of Tesseract, there is a greater focus on line recognition, however it still supports the legacy Tesseract OCR engine which recognizes character patterns. Tesseract can recognize over 100 languages out-of-the-box, and can be trained to recognize other languages. It supports...
    Downloads: 2,713 This Week
    Last Update:
    See Project
  • 2
    Tesseract.js

    Tesseract.js

    A pure Javascript Multilingual OCR

    Tesseract.js is a pure Javascript port of the popular Tesseract OCR engine. Tesseract.js' library supports more than 100 languages, automatic text orientation and script detection, a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js can run either in a browser and on a server with NodeJS. Tesseract.js is a javascript library that gets words in almost any spoken language out of images. The main Tesseract.js functions (ex. recognize, detect) take an image...
    Downloads: 21 This Week
    Last Update:
    See Project
  • 3
    Zerox OCR

    Zerox OCR

    PDF to Markdown with vision models

    A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, tables, charts, etc. The vision models just make sense. ZeroX is an open-source machine learning framework designed for fast experimentation and production deployment, optimized for speed and ease of use.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 4
    Rapid LaTeX OCR

    Rapid LaTeX OCR

    Formula recognition based on LaTeX-OCR and ONNXRuntime

    Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool to convert formula images to latex format. The reasoning code in the repo is modified from LaTeX-OCR, the model has all been converted to ONNX format, and the reasoning code has been simplified, Inference is faster and easier to deploy. The repo only has codes based on ONNXRuntime or OpenVINO inference in onnx format and does not contain training model codes. If you want to train your own model, please move...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Cloud-based observability solution that helps businesses track and manage workload and performance on a unified dashboard. Icon
    Cloud-based observability solution that helps businesses track and manage workload and performance on a unified dashboard.

    For developers, engineers, and operational teams in organizations of all sizes

    Monitor everything you run in your cloud without compromising on cost, granularity, or scale. groundcover is a full stack cloud-native APM platform designed to make observability effortless so that you can focus on building world-class products. By leveraging our proprietary sensor, groundcover unlocks unprecedented granularity on all your applications, eliminating the need for costly code changes and development cycles to ensure monitoring continuity.
    Learn More
  • 5
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 71 This Week
    Last Update:
    See Project
  • 6
    Video-subtitle-extractor

    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    Video hard subtitle extraction, generate srt file. There is no need to apply for a third-party API, and text recognition can be implemented locally. A deep learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. Use local OCR recognition, no need to set up and call any API, and do not need to access online OCR services such as Baidu...
    Downloads: 54 This Week
    Last Update:
    See Project
  • 7

    PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle

    PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that can help users train better models and apply them into practice. Inspired by PaddlePaddle, PaddleOCR is an ultra lightweight OCR system, with multilingual recognition, digit recognition, vertical text recognition, as well as long text recognition. It features a PPOCR series of high-quality pre-trained models, which includes: ultra lightweight ppocr_mobile series models, general ppocr_server...
    Downloads: 20 This Week
    Last Update:
    See Project
  • 8
    EasyOCR

    EasyOCR

    Ready-to-use OCR with 80+ supported languages

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc. EasyOCR is a python module for extracting text from image. It is a general OCR that can read both natural scene text and dense text in document. We are currently supporting 80+ languages and expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 9
    Paperless-ngx

    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Patch Management and Vulnerability Remediation Software | Action1 Icon
    Patch Management and Vulnerability Remediation Software | Action1

    Enable IT security and operations teams to detect, prioritize, and remediate vulnerabilities to ensure continuous compliance – all while reducing cost

    Action1 reinvents patching with an infinitely scalable, highly secure, cloud-native platform configurable in 5 minutes — it just works and is always free for the first 100 endpoints, with no functional limits. Featuring unified OS and third-party patching with peer-to-peer patch distribution and real-time vulnerability assessment with no VPN needed, it enables autonomous endpoint management that preempts ransomware and security risks, all while eliminating costly routine labor. Trusted by thousands of enterprises managing millions of endpoints globally, Action1 is certified for SOC 2 and ISO 27001.
    Learn More
  • 10
    JavaCV

    JavaCV

    Java interface to OpenCV, FFmpeg, and more

    JavaCV uses wrappers from the JavaCPP Presets of commonly used libraries by researchers in the field of computer vision (OpenCV, FFmpeg, libdc1394, FlyCapture, Spinnaker, OpenKinect, librealsense, CL PS3 Eye Driver, videoInput, ARToolKitPlus, flandmark, Leptonica, and Tesseract) and provides utility classes to make their functionality easier to use on the Java platform, including Android. JavaCV also comes with hardware accelerated full-screen image display (CanvasFrame and GLCanvasFrame), easy...
    Downloads: 27 This Week
    Last Update:
    See Project
  • 11
    Stirling-PDF

    Stirling-PDF

    Web application that allows you to perform operations on PDF files

    Stirling PDF is a powerful, locally hosted web-based PDF manipulation tool offering a wide range of editing, conversion, and utility features. It allows users to merge, split, compress, convert, OCR, and perform other operations on PDF files directly from a browser without uploading data to third-party servers. The tool is privacy-conscious, self-hostable via Docker, and built with modularity in mind to allow future expansion and integration.
    Downloads: 22 This Week
    Last Update:
    See Project
  • 12
    Self-Operating Computer

    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    .... The framework supports features like Optical Character Recognition (OCR) and Set-of-Mark (SoM) prompting to enhance visual grounding capabilities. It is designed to be compatible with macOS, Windows, and Linux (with X server installed), and is released under the MIT license.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 13
    DocTR

    DocTR

    Library for OCR-related tasks powered by Deep Learning

    DocTR provides an easy and powerful way to extract valuable information from your documents. Seemlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor. State-of-the-art performances on public document...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 14
    PDFMathTranslate

    PDFMathTranslate

    PDF scientific paper translation with preserved formats

    PDFMathTranslate is a Python-based tool that uses AI translation to convert academic PDFs into bilingual (e.g. Chinese-English) documents while preserving formatting, including math notation. It supports OCR-enhanced content and offers CLI, GUI, Docker, and Zotero integration under AGPL v3.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 15
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    ...-source software which means that transparency is the core value of our software development. Source code can be reviewed and improved by anyone from anywhere. Papermerge supports multiple users. Each user can be assigned different permissions to perform only a specific kind of action e.g. view only documents from a specific folder. OCR technology is vital part of Papermerge. It extracts text information from scanned documents, PDF, JPEG, TIFF files.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 16
    Texify

    Texify

    Math OCR model that outputs LaTeX and markdown

    Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 17
    Paper2GUI

    Paper2GUI

    Convert AI papers to GUI

    Convert AI papers to GUI,Make it easy and convenient for everyone to use artificial intelligence technology。让每个人都简单方便的使用前沿人工智能技术 Paper2GUI: An AI desktop APP toolbox for ordinary people. It can be used immediately without installation. It already supports 40+ AI models, covering AI painting, speech synthesis, video frame complementing, video super-resolution, object detection, and image stylization. , OCR recognition and other fields. Support Windows, Mac, Linux systems. Paper2GUI: 一款面向普通...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    MinerU

    MinerU

    A high-quality tool for convert PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 19
    EcoPaste

    EcoPaste

    Open source clipboard management tools for Windows, Macos and Linux

    Open source clipboard management tools for Windows, macOS, and Linux. Built with Tauri, the application is lightweight and refined, consuming minimal resources. It also delivers a uniform user experience across both Windows, MacOS, and Linux platforms. The application is resident in the background, wakes up with one click through custom shortcut keys, saves time, and improves efficiency. Allows you to bookmark clipboard content for easy and fast access. Whether it's crucial data for work or...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 20
    Qwen3-VL

    Qwen3-VL

    Qwen3-VL, the multimodal large language model series by Alibaba Cloud

    ... variants. Qwen3-VL is built for complex tasks such as GUI automation, multimodal coding (converting images or videos into HTML, CSS, JS, or Draw.io diagrams), long-context reasoning with support up to 1M tokens, and comprehensive video understanding. It also brings advanced perception capabilities, including spatial grounding, object recognition, OCR across 32 languages, and robust handling of challenging inputs like low-light or distorted text.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    Qwen3-Omni

    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 22
    DeepDetect

    DeepDetect

    Deep Learning API and Server in C++14 support for Caffe, PyTorch

    ... of image tagging, object detection, segmentation, OCR, Audio, Video, Text classification, CSV for tabular data and time series. Neural network templates for the most effective architectures for GPU, CPU, and Embedded devices. Training in a few hours and with small data thanks to 25+ pre-trained models. Full Open Source, with an ecosystem of tools (API clients, video, annotation, ...) Fast Server written in pure C++, a single codebase for Cloud, Desktop & Embedded.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    OpenRPA

    OpenRPA

    Free Open Source Enterprise Grade RPA

    Open Source Robotic Process Automation Software. OpenRPA is where the actual automation happens, inside it you create workflows and invoke them, run all the activities needed to complete your task and deploy them with the assistance of OpenFlow and Node-RED. The framework also offers integration with other tools that are essential such as a message broker and repository management tools. OpenRPA also includes other features within itself, such as Image detection/OCR, Browser Navigation and many...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    WindowTextExtractor

    WindowTextExtractor

    WindowTextExtractor allows you to get a text from any OS

    WindowTextExtractor allows you to get a text from any window of an operating system including asterisk passwords. Extract text from modal windows, buttons, textboxes, lists, etc. Show passwords stored behind asterisks (*****) from most of the windows apps. Show detailed window and process information. Show process environment variables. Show or hide almost any desktop window. Take a window screenshot. Record window stream in avi file. OCR support (including text, bar codes and QR codes...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    docconv

    docconv

    Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

    A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, Pages documents and images (see optional dependencies below) to plain text. See go help install for details on the installation location of the installed docd executable. Make sure that the full path to the executable is in your PATH environment variable. To add image support to the docconv library you first need to install and build gosseract. Now you can add -tags ocr to any go command when building/fetching/testing docconv...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next