Android Offline Live OCR & Translation with ML Kit and Tesseract

Repository: https://github.com/AndreiMaksimovich/android-live-ocr-and-translation-demo

In this article, I’ll walk through some technical details of an Android technical demo project that showcases offline live OCR and translation. The project uses Google ML Kit Text Recognition V2 for OCR, with Tesseract as a fallback when a language is not supported by ML Kit, and relies on Google ML Kit Translation for on-device translations.

Technology Stack

The application is built using Kotlin and XML layouts, and integrates several open-source and Google-provided components.

OCR Implementation

In this demo, OCR is implemented through an abstraction layer with a factory that provides an OCR service. The service accepts a Bitmap as input and returns a generic OCR response structured hierarchically as: Blocks → Lines → Words → Symbols.

https://github.com/AndreiMaksimovich/android-live-ocr-and-translation-demo/tree/main/app/src/main/java/com/amaxsoftware/ocrplayground/src/ocr
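A minimal sketch of what such an abstraction could look like. The type and property names below are assumptions for illustration, not the demo's actual identifiers; the image type is kept generic because `android.graphics.Bitmap` is Android-only.

```kotlin
// Hierarchical OCR result, as described: Blocks → Lines → Words → Symbols.
data class OcrSymbol(val text: String)
data class OcrWord(val text: String, val symbols: List<OcrSymbol>)
data class OcrLine(val text: String, val words: List<OcrWord>)
data class OcrBlock(val text: String, val lines: List<OcrLine>)
data class OcrResult(val blocks: List<OcrBlock>)

// The service accepts an image and returns the generic result. On Android
// the Image type parameter would be android.graphics.Bitmap.
interface OcrService<Image> {
    suspend fun recognize(image: Image): OcrResult
}

// A factory selects the backend: ML Kit when the language is supported,
// Tesseract otherwise.
object OcrServiceFactory {
    private val mlKitLanguages = setOf("en", "de", "zh") // illustrative subset

    fun backendFor(language: String): String =
        if (language in mlKitLanguages) "mlkit" else "tesseract"
}
```

The factory keeps backend selection out of the call sites, so the rest of the pipeline only ever sees the generic `OcrResult`.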

Translation

Translation is implemented using the same pattern as OCR, with a factory and an abstract TranslationService that exposes a suspending translation method.

https://github.com/AndreiMaksimovich/android-live-ocr-and-translation-demo/tree/main/app/src/main/java/com/amaxsoftware/ocrplayground/src/translation
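The same pattern can be sketched as follows. The names here are assumptions, and a trivial in-memory implementation stands in for the ML Kit-backed service.

```kotlin
// Abstract service exposing a suspending translation method.
interface TranslationService {
    val sourceLanguage: String
    val targetLanguage: String
    suspend fun translate(text: String): String
}

// A fake dictionary-based implementation used here in place of the
// on-device ML Kit translator.
class FakeTranslationService(
    override val sourceLanguage: String,
    override val targetLanguage: String,
    private val dictionary: Map<String, String>,
) : TranslationService {
    override suspend fun translate(text: String): String =
        dictionary[text] ?: text // fall back to the original text
}

// Factory mirroring the pattern described above.
object TranslationServiceFactory {
    fun create(source: String, target: String): TranslationService =
        FakeTranslationService(source, target, emptyMap())
}
```

Because `translate` is a suspend function, callers can run it off the main thread without blocking the camera preview.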

OCR Translation

The OCR translation service builds on the TranslationService: it accepts a generic OCRResult and returns a list of translated lines.

https://github.com/AndreiMaksimovich/android-live-ocr-and-translation-demo/tree/main/app/src/main/java/com/amaxsoftware/ocrplayground/src/ocr/translation
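The "OCR result in, translated lines out" step can be sketched like this. The types and the plain function parameter are assumptions made to keep the example self-contained; in the demo this step would delegate to the suspending TranslationService.

```kotlin
// A recognized line of text and its translated counterpart.
data class RecognizedLine(val text: String)
data class TranslatedLine(val original: String, val translated: String)

// Translate every recognized line with the supplied translator.
// Line-level translation preserves enough context for short phrases
// while keeping overlay rendering simple.
fun translateLines(
    lines: List<RecognizedLine>,
    translate: (String) -> String,
): List<TranslatedLine> =
    lines.map { TranslatedLine(it.text, translate(it.text)) }
```

Returning a flat list of translated lines, rather than the full block hierarchy, matches what the UI ultimately needs to draw over the camera preview.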

Note

This demo isn’t meant as a production-ready app, but as a reference implementation that illustrates how these technologies can work together.

  • Tesseract data files are stored in the assets folder and extracted during app initialization.
  • ML Kit language models are automatically downloaded during app initialization.
  • Supported languages are hardcoded.
  • The camera captures images without applying any filters or effects.

In a production application, you should implement dedicated managers that handle on-demand model downloads and dynamic language support.

The OCR workflow should begin with image preprocessing, including steps such as adjusting light balance, converting to grayscale or black-and-white, and applying region filtering.
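Two of those preprocessing steps can be sketched in pure Kotlin. This is illustrative code, not the demo's: it operates on a raw ARGB pixel array, which on Android would come from `Bitmap.getPixels()`.

```kotlin
// Convert ARGB pixels to grayscale using the standard luminance weights.
fun toGrayscale(pixels: IntArray): IntArray = IntArray(pixels.size) { i ->
    val p = pixels[i]
    val r = (p shr 16) and 0xFF
    val g = (p shr 8) and 0xFF
    val b = p and 0xFF
    (0.299 * r + 0.587 * g + 0.114 * b).toInt()
}

// Binarize grayscale values with a fixed threshold. A production pipeline
// would typically use adaptive thresholding instead of a constant.
fun toBlackAndWhite(gray: IntArray, threshold: Int = 128): IntArray =
    IntArray(gray.size) { i -> if (gray[i] >= threshold) 255 else 0 }
```

High-contrast black-and-white input generally improves Tesseract's accuracy in particular, since its recognition works best on clean binarized images.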