Yay! I am now set up for OCR scanning using this: the Fujitsu ScanSnap. Having played with it for a few days, I’m really pleased with it.
It’s a neat little sheet-feed scanner, shown next to a Mac here. It’s permanently plugged in and you switch it on by opening it. Then you just feed in what you want to scan. It’ll take paper up to A4 size (and A3 if you fold it in half inside a carrier sheet). It scans both sides, but is clever enough to leave out blank pages.
But the magic is really in the software. OCR (Optical Character Recognition) software looks at the page image and extracts editable text from it. For printed text, such as a novel, the conversion is very nearly perfect. It’s not quite clever enough yet to work out page breaks, and it has a bit of trouble with accented characters (though it can be set to various languages, so that may fix it). It also has the occasional inexplicable wobble.
OCRed text will always need checking, but it is an ideal solution for authors wanting to re-release their backlist as e-books or print-on-demand titles.
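For anyone who likes to automate a bit of that checking, here is a rough Python sketch of the kind of first pass I have in mind. It is not part of the ScanSnap software; the function names are my own, and it assumes the OCR output has been saved as plain text. It rejoins words that were split by a hyphen at the end of a line, and lists words that mix letters and digits, which is often a sign of a misread character.

```python
import re


def rejoin_hyphenated(text: str) -> str:
    """Join words split by a hyphen at a line end: 'won-\\nderful' -> 'wonderful'.

    Caveat: this also merges genuine compounds that happened to break
    at the hyphen ('well-\\nknown' becomes 'wellknown'), so review the result.
    """
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def suspicious_words(text: str) -> list[str]:
    """Return words containing both a letter and a digit (e.g. 'c1ock').

    These are only candidates for a human to check; things like '1st'
    will be flagged too.
    """
    words = re.findall(r"\w+", text)
    return [
        w for w in words
        if re.search(r"[^\W\d_]", w) and re.search(r"\d", w)
    ]


if __name__ == "__main__":
    sample = "It was a won-\nderful day when the c1ock struck 12."
    cleaned = rejoin_hyphenated(sample)
    print(cleaned)
    print(suspicious_words(cleaned))
```

This only narrows down where to look. It is no substitute for reading the text through.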
I’ll post about the workflow later…