Optical character recognition (OCR), content translation, title generation, and detection and text extraction for more file formats are among the new features now part of your favorite crawlers: Norconex HTTP Collector 2.1.0 and Norconex Filesystem Collector 2.1.0. Both are available now and can be downloaded for free. Both ship with and use the latest version of the Norconex Importer module, which is in large part responsible for many of these new features.
For more details and usage examples, check this article.
These two Collector releases also include bug fixes and stability improvements. We recommend that existing users upgrade.