Published in rOpenSci - open tools for open science
Author Jeroen Ooms
Earlier this month we released a new version of the tesseract package to CRAN. This package provides R bindings to Tesseract, Google’s open source optical character recognition (OCR) engine. The two major new features are support for hOCR output and support for the upcoming Tesseract 4.

hOCR output

Support for hOCR output was requested by one of our users on GitHub.