---
language: ja
tags:
- image-to-text
- onnx
datasets:
- manga109s
---

# Manga OCR (ONNX)

This is an ONNX version of the Manga OCR model, designed for optical character recognition of Japanese text, with a primary focus on manga.

This model is based on the original work by [kha-white/manga-ocr](https://github.com/kha-white/manga-ocr) and [kha-white/manga-ocr-base](https://huggingface.co/kha-white/manga-ocr-base), and on the modified version [jzhang533/manga-ocr-base-2025](https://huggingface.co/jzhang533/manga-ocr-base-2025).

The models in this repository were exported to the ONNX format using [Hugging Face Optimum](https://huggingface.co/docs/optimum/index).

## Original Model Information

Manga OCR uses the [Vision Encoder Decoder](https://huggingface.co/docs/transformers/model_doc/vision-encoder-decoder) framework. It is designed to be a high-quality text recognition tool, robust against various scenarios specific to manga:

- Both vertical and horizontal text
- Text with furigana
- Text overlaid on images
- A wide variety of fonts and font styles
- Low-quality images

The original training data included manga109-s and synthetic data.

## Using the ONNX Models

To use these ONNX models for inference, you will need the `optimum` library.
You can install it as follows:

```bash
# Quote the extra so shells such as zsh do not expand the square brackets
pip install "optimum[onnxruntime]"
```

Here is an example of how to run inference with the ONNX models:

```python
from transformers import TrOCRProcessor
from optimum.onnxruntime import ORTModelForVision2Seq
from PIL import Image

# Load the processor and model
processor = TrOCRProcessor.from_pretrained("l0wgear/manga-ocr-2025-onnx")
model = ORTModelForVision2Seq.from_pretrained("l0wgear/manga-ocr-2025-onnx")

# Load an image
image = Image.open("path/to/your/manga/image.jpg").convert("RGB")

# Process the image and generate text
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(generated_text)
```

## Acknowledgements

- **Original Author:** [kha-white](https://github.com/kha-white) for creating the original Manga OCR.
- **Fine-tuning:** [jzhang533](https://huggingface.co/jzhang533) for training the `manga-ocr-base-2025` model.
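
## Optional: Cleaning Up Decoded Text

Depending on the tokenizer, the string returned by `batch_decode` in the inference example may contain spaces between characters or a mix of ellipsis styles. The following is a minimal sketch of a cleanup step, loosely modeled on the kind of post-processing the upstream manga-ocr package applies. The helper name `clean_ocr_text` is hypothetical and not part of this repository, and the upstream half-width to full-width conversion is left out here to keep the sketch dependency-free.

```python
import re


def clean_ocr_text(text: str) -> str:
    """Hypothetical cleanup helper; an assumption, not shipped with this model."""
    # Remove all whitespace, including spaces a tokenizer may leave between tokens.
    text = "".join(text.split())
    # Replace the single-character ellipsis with three ASCII periods.
    text = text.replace("…", "...")
    # Rewrite runs of two or more middle dots or periods as the same number of periods.
    text = re.sub(r"[・.]{2,}", lambda m: "." * len(m.group()), text)
    return text


if __name__ == "__main__":
    print(clean_ocr_text("こ ん に ち は … 世 界"))
```

In the inference example, this could be applied as `clean_ocr_text(generated_text)` before printing. Whether it is needed depends on what the processor actually returns for this model, so it is worth checking the raw output first.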