Published in Martin Paul Eve
Author Martin Paul Eve
Yesterday, I wrote of a challenge that I faced in working out which texts in a corpus have decent OCR and then identifying which texts they actually are. This morning, I put together a small script that has a first go at this. I enclose it below for anybody who is interested.