Atalasoft

OCR Capabilities in DotImage

Atalasoft DotImage OCR is an optical character recognition module that lets Microsoft .NET developers add character recognition to their applications. Atalasoft's approach is to provide a generic, object-oriented interface that can support any OCR engine. This lets users of DotImage OCR change OCR engines with a single line of code, which is also convenient for testing and evaluating different engines. Atalasoft currently provides four OCR engine interfaces, including our own GlyphReader engine, the Abbyy engine, and the royalty-free Tesseract engine.
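The engine-neutral design described above can be illustrated with a minimal sketch. It is written in Python purely to show the pattern and does not use the DotImage API: `OcrEngine`, `EngineA`, `EngineB`, `recognize`, and `process` are hypothetical names, and the stand-in engines return canned text instead of performing recognition.

```python
from abc import ABC, abstractmethod


class OcrEngine(ABC):
    """Hypothetical engine-neutral interface (not the DotImage API)."""

    @abstractmethod
    def recognize(self, image: bytes) -> str:
        """Return the text recognized in the image."""


class EngineA(OcrEngine):
    # Stand-in engine: returns canned text instead of doing real OCR.
    def recognize(self, image: bytes) -> str:
        return "text from engine A"


class EngineB(OcrEngine):
    # A second stand-in engine with the same interface.
    def recognize(self, image: bytes) -> str:
        return "text from engine B"


def process(engine: OcrEngine, image: bytes) -> str:
    # Application code depends only on the interface, never on a concrete engine.
    return engine.recognize(image).upper()


engine: OcrEngine = EngineA()  # swapping engines means changing only this line
result = process(engine, b"fake image bytes")
```

Because `process` only sees the abstract interface, replacing `EngineA()` with `EngineB()` is the single change needed to evaluate a different engine.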

Using OCR in DotImage requires a license of DotImage Document Imaging. The following OCR Engines are available, each one including the DotImage OCR interface.

An add-on module for generating searchable PDF documents from OCR results is also available.

Please contact Atalasoft if you are interested in acquiring an interface for other OCR engines.

Product Features

  • Fully extensible file and stream export
  • OCR Engine neutral, open API
  • Built-in image preprocessing
  • Fully overridable image preprocessing
  • Easy event model for tracking progress and reporting/modifying document layout
  • Fully extensible document and page model
  • Font abstraction
  • Confidence provided at region, line, word, and glyph levels
  • OCR any image that can be read by DotImage
  • Easy integration with Twain Capture
  • Images can come from any source, not just files
  • Output formats specified by MIME standard
  • Built-In Text Translator for formatted text output
  • Searchable PDF module for outputting results in highly compressed JBIG2 Adobe PDF as Text Only, or Hidden Text Underneath Image.
  • Supports engines that automatically localize regions (or zones) of an image, and also lets you zone images manually.
  • Support for Tesseract OCR Engine 
  • Support for GlyphReader OCR Engine
  • Support for Abbyy OCR Engine
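Two of the features listed above, extensible export and output formats specified by MIME type, suggest a registry of translators keyed by MIME string. The following is a minimal sketch of that idea under stated assumptions: `register`, `export`, and the `translators` table are hypothetical names chosen for illustration, and nothing here reflects how DotImage implements its translators.

```python
import html
from typing import Callable, Dict, List

# A translator turns recognized text lines into output bytes.
Translator = Callable[[List[str]], bytes]

# Hypothetical registry: MIME type -> translator.
translators: Dict[str, Translator] = {}


def register(mime_type: str, translator: Translator) -> None:
    """Add or replace the translator for a MIME type."""
    translators[mime_type] = translator


def export(lines: List[str], mime_type: str) -> bytes:
    """Render recognized lines in the format named by mime_type."""
    if mime_type not in translators:
        raise ValueError(f"no translator registered for {mime_type}")
    return translators[mime_type](lines)


# A plain-text translator.
register("text/plain", lambda lines: "\n".join(lines).encode("utf-8"))

# Extending the export is one more registration; callers are unchanged.
register(
    "text/html",
    lambda lines: "".join(f"<p>{html.escape(l)}</p>" for l in lines).encode("utf-8"),
)
```

Callers select a format by name, for example `export(lines, "text/plain")`, so new formats can be added without touching the code that requests them.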

Object Model Design

Atalasoft DotImage OCR is designed to interface easily with other parts of your application and to be extensible through an event-driven, object-oriented object model. In just a few lines of code, a developer can recognize an image and write the result to a file, or enumerate the recognized lines, words, and characters along with their confidence values.
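To make the idea of enumerating results with confidence concrete, here is a small sketch of a line/word/glyph hierarchy. It is an illustration only: the class names, the 0.0 to 1.0 confidence scale, and the rule that a word is as reliable as its weakest glyph are all assumptions, not the DotImage object model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Glyph:
    char: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


@dataclass
class Word:
    glyphs: List[Glyph] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "".join(g.char for g in self.glyphs)

    @property
    def confidence(self) -> float:
        # Assumption: a word is only as reliable as its weakest glyph.
        return min(g.confidence for g in self.glyphs)


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)


def low_confidence_words(lines: List[Line], threshold: float) -> List[str]:
    """Walk lines and words, collecting words below the threshold."""
    return [w.text for line in lines for w in line.words
            if w.confidence < threshold]


# Hand-built sample data standing in for a recognition result.
page = [Line([
    Word([Glyph("O", 0.98), Glyph("C", 0.95), Glyph("R", 0.97)]),
    Word([Glyph("t", 0.91), Glyph("e", 0.42), Glyph("x", 0.88), Glyph("t", 0.93)]),
])]
flagged = low_confidence_words(page, 0.5)
```

A check like this is one way an application might route uncertain words to manual review while accepting the rest automatically.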

This diagram represents the high level design of the OCR module.

[Figure: Engine Diagram, Engine Components]

For more information about the design, and how to use OCR, please download an evaluation.
