Compare OCR results from images
Streamlit OCR Comparator is a web-based application for comparing OCR (Optical Character Recognition) results from different engines. It provides an interface to upload images, extract text with multiple OCR engines, and compare the outputs to find the most accurate one.
• Multi-Engine Support: Compare text extraction results from various OCR engines in one place.
• Image Upload: Directly upload images from your local filesystem or provide URLs.
• Result Comparison: Side-by-side comparison of extracted text to identify differences.
• Accuracy Analysis: Highlight mismatches and evaluate the performance of each OCR engine.
• Customizable Settings: Fine-tune OCR parameters like language, DPI, and layout analysis.
• Export Results: Download comparison reports for further analysis.
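The result comparison and mismatch highlighting described above can be illustrated with Python's standard difflib module. This is a minimal sketch, not the app's own code: the function names find_mismatches and similarity are hypothetical, and the two input strings stand in for the text returned by two OCR engines.

```python
import difflib


def find_mismatches(text_a: str, text_b: str) -> list[tuple[str, str, str]]:
    """Return (operation, words_from_a, words_from_b) for each run of differing words."""
    words_a, words_b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "equal" runs are agreements; everything else is a replace, insert, or delete.
        if tag != "equal":
            mismatches.append((tag, " ".join(words_a[i1:i2]), " ".join(words_b[j1:j2])))
    return mismatches


def similarity(text_a: str, text_b: str) -> float:
    """Character-level similarity between 0.0 and 1.0 (1.0 means identical strings)."""
    return difflib.SequenceMatcher(a=text_a, b=text_b, autojunk=False).ratio()


if __name__ == "__main__":
    # Hypothetical outputs from two engines for the same image.
    engine_a = "Invoice total 100 USD"
    engine_b = "Invoice tota1 100 USD"
    for operation, left, right in find_mismatches(engine_a, engine_b):
        print(f"{operation}: {left!r} vs {right!r}")
    print(f"similarity: {similarity(engine_a, engine_b):.2f}")
```

Comparing at word level keeps the mismatch list readable, while the character-level ratio gives a rough agreement score. Neither tells you which engine is correct; that requires a known reference text.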
• Run pip install streamlit-ocr-comparator in your terminal.
• Run streamlit run ocr_comparator.py.
What OCR engines are supported?
Streamlit OCR Comparator supports multiple engines, including Tesseract, Google Vision API, and Microsoft Azure Computer Vision.
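One way to picture multi-engine support is a mapping from engine names to functions that each take image bytes and return text. The sketch below is an assumption about how such a dispatcher could look, not the app's actual design: run_engines and the OcrEngine type are hypothetical, and real wrappers for Tesseract, Google Vision API, or Azure Computer Vision would have to be supplied by the caller.

```python
from typing import Callable

# Hypothetical engine interface: raw image bytes in, extracted text out.
OcrEngine = Callable[[bytes], str]


def run_engines(
    image_bytes: bytes, engines: dict[str, OcrEngine]
) -> tuple[dict[str, str], dict[str, str]]:
    """Run every engine on the same image.

    Returns (results, errors): extracted text per engine name, and an error
    message per engine that raised, so one failing engine does not hide the others.
    """
    results: dict[str, str] = {}
    errors: dict[str, str] = {}
    for name, engine in engines.items():
        try:
            results[name] = engine(image_bytes)
        except Exception as exc:  # a real app would catch narrower exception types
            errors[name] = str(exc)
    return results, errors


if __name__ == "__main__":
    # Stub engines stand in for real OCR backends.
    stub_engines = {
        "engine_a": lambda data: "Hello world",
        "engine_b": lambda data: "Hel1o world",
    }
    texts, failures = run_engines(b"fake image bytes", stub_engines)
    print(texts, failures)
```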
How do I install the app?
Run pip install streamlit-ocr-comparator in your terminal, then start the app with streamlit run ocr_comparator.py.
Can I customize the OCR settings?
Yes, users can customize settings like language, DPI, and layout analysis for each OCR engine to optimize results.
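For Tesseract, the three settings mentioned here correspond to command-line options that the tesseract CLI accepts: -l for language, --dpi for resolution, and --psm for the page segmentation (layout analysis) mode. The sketch below only builds the argument list; the OcrSettings class and tesseract_command function are hypothetical names, and how the app itself passes these settings is not stated in the source.

```python
from dataclasses import dataclass


@dataclass
class OcrSettings:
    """Hypothetical container for the settings named in the FAQ."""
    language: str = "eng"             # Tesseract language code, e.g. "eng" or "chi_sim"
    dpi: int = 300                    # resolution hint for the input image
    page_segmentation_mode: int = 3   # Tesseract --psm value; 3 is fully automatic layout analysis


def tesseract_command(image_path: str, settings: OcrSettings) -> list[str]:
    """Build a tesseract CLI invocation that prints the recognized text to stdout."""
    if not 0 <= settings.page_segmentation_mode <= 13:
        raise ValueError("Tesseract page segmentation modes range from 0 to 13")
    if settings.dpi <= 0:
        raise ValueError("DPI must be positive")
    return [
        "tesseract", image_path, "stdout",
        "-l", settings.language,
        "--dpi", str(settings.dpi),
        "--psm", str(settings.page_segmentation_mode),
    ]


if __name__ == "__main__":
    # Mode 6 treats the image as a single uniform block of text.
    command = tesseract_command("scan.png", OcrSettings(language="deu", dpi=200, page_segmentation_mode=6))
    print(" ".join(command))
```

The resulting list can be passed to subprocess.run on a machine where Tesseract is installed. Cloud engines such as Google Vision API or Azure Computer Vision expose different options, so a per-engine settings object is a reasonable design.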