OpenMLkit OCR Models

This repository hosts a collection of lightweight, optimized on-device OCR (Optical Character Recognition) models extracted from Google ML Kit APK components. The models run fully offline and are compatible with OpenMLkitOCR, a lightweight offline OCR engine for Python.

The repository includes:

  • Text Detection Model: A Region Proposal Network (RPN) architecture (rpn_detector.tflite) that identifies text bounding boxes in image tiles.
  • Text Recognition Models: Lightweight CRNN + CTC pipelines covering the 15 language and script codes listed in the registry below.
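
The detector works on image tiles rather than on the whole image. As a rough illustration of what that implies, the sketch below computes a grid of tiles covering an image and shifts a box predicted inside a tile back into full-image coordinates. The 256 default comes from the tile size given in the File Types section of this README; the helper names `tile_grid` and `to_image_coords`, the non-overlapping layout, and the clipping of edge tiles are assumptions made for illustration, and the actual pipeline may tile differently (for example with overlap or padding).

```python
def tile_grid(width, height, tile=256):
    """Return (x0, y0, x1, y1) boxes for non-overlapping tiles covering the image.

    Hypothetical helper: edge tiles are clipped to the image bounds. A real
    pipeline would typically pad them up to tile x tile before inference.
    """
    boxes = []
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            boxes.append((x0, y0, min(x0 + tile, width), min(y0 + tile, height)))
    return boxes


def to_image_coords(box, tile_origin):
    """Shift a box predicted inside a tile back into full-image coordinates."""
    ox, oy = tile_origin
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)


# A 500x300 image needs 2 columns x 2 rows of tiles.
grid = tile_grid(500, 300)
# A box found at (10, 20, 60, 40) inside the tile whose origin is (256, 0):
shifted = to_image_coords((10, 20, 60, 40), (256, 0))
```

Keeping the tile origin next to each prediction is what lets per-tile detections be merged into one list of boxes for the full image.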

Supported Languages and Scripts

Below is the registry of the available models, vocabulary maps, and language model priors stored in this repository:

| Code | Script / Language | Recognizer Model | Label Map File | Language Model / Priors |
| --- | --- | --- | --- | --- |
| detector | Text Detection (All) | rpn_detector.tflite | - | - |
| en | Latin / English | line_recognizer.fb | LabelMap.pb | - |
| ru | Cyrillic / Russian | recognizer_cyrl.tflite | LabelMap_cyrl.pb | FST LM + Priors |
| zh | Chinese / Han (Hani) | recognizer_hani.tflite | recognizer_hani_label_map.pb | FST LM + Priors |
| ja | Japanese (Jpan) | recognizer_jpan.tflite | recognizer_jpan_label_map.pb | FST LM + Priors |
| ko | Korean (Kore) | recognizer_kore.tflite | recognizer_kore_label_map.pb | FST LM + Priors |
| ar | Arabic (Arab) | recognizer_arab_retrained.tflite | recognizer_arab_label_map.pb | FST LM + Priors |
| he | Hebrew (Hebr) | hebr.tflite | hebr_label_map.pb | Priors |
| ka | Georgian (Geor) | geor.tflite | geor_label_map.pb | Priors |
| bn | Bengali & Devanagari (Bede) | bede.tflite | bede_label_map.pb | Priors |
| gu | Gujarati (Gujr) | gocr_tflite_recognizer_gujr.tflite | gocr_tflite_recognizer_gujr_label_map.pb | Priors |
| kn | Kannada (Knda) | recognizer_knda.tflite | recognizer_knda_label_map.pb | FST LM + Priors |
| ml | Malayalam (Mlym) | recognizer_mlym.tflite | recognizer_mlym_label_map.pb | FST LM + Priors |
| ta | Tamil (Taml) | recognizer_taml.tflite | recognizer_taml_label_map.pb | FST LM + Priors |
| te | Telugu (Telu) | recognizer_telu.tflite | recognizer_telu_label_map.pb | FST LM + Priors |
| vi | Vietnamese / Latin | gocr_tflite_recognizer_latn_vi.tflite | gocr_tflite_recognizer_latn_vi_label_map.pb | Priors |
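
For scripting against this registry, the entries can be mirrored as a plain dictionary. The following is only a sketch: the `REGISTRY` name and the `files_for` helper are hypothetical, just three rows are shown, and the file names are copied from the registry without any directory prefix, since the repository layout is not described here.

```python
# Hypothetical in-code mirror of a few registry rows (not part of the library).
DETECTOR = "rpn_detector.tflite"

REGISTRY = {
    "en": {"model": "line_recognizer.fb", "label_map": "LabelMap.pb",
           "extras": []},
    "ru": {"model": "recognizer_cyrl.tflite", "label_map": "LabelMap_cyrl.pb",
           "extras": ["fst_lm", "priors"]},
    "he": {"model": "hebr.tflite", "label_map": "hebr_label_map.pb",
           "extras": ["priors"]},
}


def files_for(lang):
    """Return the detector, recognizer, and label map file names for a code."""
    try:
        entry = REGISTRY[lang]
    except KeyError:
        raise ValueError(f"Unsupported language code: {lang!r}") from None
    return [DETECTOR, entry["model"], entry["label_map"]]


needed = files_for("ru")
```

Failing early on an unknown code avoids a confusing missing-file error later in the pipeline.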

📂 File Types Explained

  1. *.tflite / *.fb (Neural network models):
    • detector/rpn_detector.tflite is a Convolutional Neural Network (CNN) that processes 256x256 tiles of the image and predicts text bounding boxes.
    • The remaining *.tflite files and line_recognizer.fb are CRNN (Convolutional Recurrent Neural Network) models that predict CTC logits for cropped text-line images.
  2. *_label_map.pb / LabelMap*.pb (Vocabulary):
    • Binary Protobuf files mapping character indices to Unicode symbols for decoding CTC outputs.
  3. *_lm.compact_fst.gz & *.syms (Language Models):
    • Compact Finite State Transducer (FST) language models and their symbol tables. They are used during beam search decoding to favor likely words and character sequences, based on word frequencies, over raw per-character predictions.
  4. *_prior.pb / *_config.pb (Priors & Config):
    • Character prior probabilities and model configurations used to calibrate neural network outputs before applying the language model.
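
To make the role of the label map concrete, here is a minimal sketch of greedy CTC decoding: pick the best class at each time step, collapse consecutive repeats, and drop the blank symbol. This is not the decoder used by OpenMLkitOCR. The toy label map, the blank index of 0, and the function name `ctc_greedy_decode` are assumptions, and the real pipeline may add prior calibration and FST-based beam search on top of this step.

```python
def ctc_greedy_decode(logits, label_map, blank=0):
    """Greedy CTC decoding.

    logits: list of per-time-step score lists (one score per class).
    label_map: dict mapping class index -> character (blank excluded).
    """
    chars = []
    prev = blank
    for step in logits:
        best = max(range(len(step)), key=step.__getitem__)
        # Emit only when the class changes and is not the blank symbol.
        if best != blank and best != prev:
            chars.append(label_map[best])
        prev = best
    return "".join(chars)


# Toy vocabulary and scores, invented for illustration only.
toy_label_map = {1: "c", 2: "a", 3: "t"}
toy_logits = [
    [0.10, 0.80, 0.05, 0.05],  # "c"
    [0.10, 0.80, 0.05, 0.05],  # "c" again, collapsed as a repeat
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "a"
    [0.10, 0.10, 0.10, 0.70],  # "t"
]
decoded = ctc_greedy_decode(toy_logits, toy_label_map)  # "cat"
```

A blank between two identical predictions is what allows doubled letters to survive the collapse, which is why the blank class cannot simply be ignored.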

🚀 How to Use with openmlkitOCR

The Python package openmlkitOCR handles automatic downloading and caching of these models from Hugging Face if they are not present locally.
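
If you only want to inspect a single file without going through openmlkitOCR, a manual download might look like the sketch below. It relies on the separate `huggingface_hub` package and its `hf_hub_download` function, which is an assumption here rather than something this README documents. The `fetch_model_file` helper and its `download` parameter are hypothetical; the repository ID, the environment variable name, and the `detector/rpn_detector.tflite` path are taken from this README.

```python
import os

DEFAULT_REPO = "0cve0/OpenMLKitOCR"


def fetch_model_file(filename, repo_id=None, download=None):
    """Download one file from the model repo and return its local cache path.

    `download` can be replaced for testing. By default it is
    huggingface_hub.hf_hub_download (requires `pip install huggingface_hub`).
    """
    repo_id = repo_id or os.environ.get("OPENMLKIT_MODEL_REPO", DEFAULT_REPO)
    if download is None:
        # Imported lazily so the helper can be defined without the package.
        from huggingface_hub import hf_hub_download
        download = hf_hub_download
    return download(repo_id=repo_id, filename=filename)


# Example call (needs network access):
# path = fetch_model_file("detector/rpn_detector.tflite")
```

Files fetched this way land in the Hugging Face cache directory, so repeated calls do not download them again.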

1. Installation

Install the library directly using pip:

pip install openmlkitOCR

2. Python Usage Example

import os
import cv2
from openmlkit import OpenMLKitOCR

# Configure the pipeline to pull models from this Hugging Face repository
os.environ["OPENMLKIT_MODEL_REPO"] = "0cve0/OpenMLKitOCR"

# Initialize the pipeline for a specific language (e.g., 'en' for English/Latin)
# This will automatically download and cache the detector and recognizer files.
ocr = OpenMLKitOCR(lang='en')

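# Load an image (cv2.imread returns None if the file cannot be read)
image = cv2.imread("test_image.jpg")
if image is None:
    raise FileNotFoundError("Could not read test_image.jpg")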

# Run OCR (detection & recognition)
results = ocr.run(image, score_threshold=0.35)

# Output the localized text bounding boxes and recognised characters
for item in results:
    print(f"Box: {item['box']} -> Text: {item['text']}")
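
The loop above prints detections in whatever order the pipeline returns them. If reading order matters, a sort along the lines of the following sketch may help. It assumes each `box` is an axis-aligned `(x_min, y_min, x_max, y_max)` sequence, which this README does not confirm; adapt the indexing if the library returns polygons or another layout. The helper name `sort_reading_order` and the `line_tolerance` parameter are hypothetical.

```python
def sort_reading_order(results, line_tolerance=10):
    """Group items into text lines by their top edge, then sort left to right.

    Assumes item["box"] is (x_min, y_min, x_max, y_max); this layout is an
    assumption, not documented behavior of the library.
    """
    lines = []
    for item in sorted(results, key=lambda it: it["box"][1]):
        top = item["box"][1]
        # Same line if the top edge is close to the first item of that line.
        if lines and abs(top - lines[-1][0]["box"][1]) <= line_tolerance:
            lines[-1].append(item)
        else:
            lines.append([item])
    ordered = []
    for line in lines:
        ordered.extend(sorted(line, key=lambda it: it["box"][0]))
    return ordered


# Sample data invented for illustration.
sample = [
    {"box": (120, 12, 200, 30), "text": "world"},
    {"box": (10, 10, 100, 30), "text": "Hello"},
    {"box": (10, 50, 90, 70), "text": "again"},
]
text = " ".join(item["text"] for item in sort_reading_order(sample))
```

The tolerance absorbs small vertical jitter between boxes on the same line; a value tied to the typical box height would likely be more robust than a fixed pixel count.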

⚖️ License and Disclaimer

  • Software: The Python library OpenMLkitOCR is licensed under the Apache 2.0 License.
  • Model Weights: The model weights and configurations in this repository are extracted from Google ML Kit APK components and are subject to Google's terms of service and license agreements. These models are intended for educational, research, and non-commercial local testing purposes.