IT DATASOFT
Data Entry Data Conversion & IT Training
OUR SERVICES



OCR Conversion
Our OCR software converts your printed documents into accurate, usable data. The software was developed and is maintained by our in-house team, which makes the conversion of printed documents straightforward.

Document Scanning
IT Datasoft provides document scanning, indexing, archiving, and retrieval services for a wide variety of applications. Scanning documents can be a cost-effective alternative to the long-term storage of paper. Pages of forms are scanned and converted into bit-mapped images (usually TIFF), which are either compressed and stored for later batch processing or passed immediately in uncompressed form to an ICR engine for recognition.

Image analysis
The document image is cleaned up. Character image quality is improved, using image enhancement techniques. Background "noise" is removed from the form.
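As a rough illustration of background noise removal, the sketch below clears isolated specks from a binarized image. The representation (a list of rows, 1 for ink and 0 for background) and the function name `remove_speckles` are assumptions made for this example, not a description of any particular ICR product.

```python
def remove_speckles(image):
    """Return a copy of a binary image with isolated ink pixels cleared.

    `image` is a list of rows; 1 means ink, 0 means background.
    A pixel counts as noise when none of its 8 neighbours contains ink.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            neighbours = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbours += image[ny][nx]
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned
```

Real enhancement pipelines usually go further (for example median filtering or morphological operations), so this should be read as the simplest possible instance of the idea.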

Form Processing
We can capture data from all types of handwritten and typed forms accurately and cost-effectively. Just as paper documents must be prepared before they are fed into a scanner (removing staples, smoothing wrinkles, positioning them for optimal registration, and so on), the image of a form must be prepared through a series of steps before it can be intelligently recognized.

Data Entry Services
Data entry from paper/Books
Data entry from Image file in any format
Transaction sales/ purchase / payroll
Data entry of E-Books
Data Entry - Yellow Pages and White Pages
Data Entry and compilation from Web site
Data Capture / Collection
Business Card Data Entry into any Format
Data Entry from hardcopy/Printed Material
Receipt & Bill Data Entry
Data Entry of Prospect

Data Entry for Mailing List/Mailing Label

Copying, Pasting, Editing, Sorting, and Indexing Data into any format
Handwritten Data Entry
Coupon / Catalogue Entry
Data Processing Services
Data Processing
Insurance Claim Processing
OCR Cleanup
Data Mining
Image Processing
Data Cleansing
Financial Services

Accounting & Book Keeping

Balance Sheet
Profit & Loss Account
Receivable Accounts
Payable Accounts
Ad design services
OCR/ICR

Data Conversion

Book Conversion

Scanning, OCR with Editing & Indexing

PDF Conversion

Tiff Image Conversion
CAD / CAM Solution
Publishing and Prepress Services
College sports rosters
Membership application forms

Legal forms

Online forms
Questionnaires
Resume processing
E-mail forms
HTML Forms
CGI forms
ASP, JSP & PHP Forms
Account and Tax forms
Insurance Claim Forms
Medical Claim Forms
Survey Forms
Market Research Forms
Tax Forms
Subscription Forms
Form Alignment

The image is registered and deskewed by the ICR software, which automatically aligns the form using special symbols on the document, called registration marks, as guides.
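The alignment step can be sketched with a little geometry. The example below assumes two registration marks that should lie on the same horizontal line, estimates the skew angle from their detected positions, and rotates a point back into alignment. The function names and the two-mark layout are assumptions for illustration only; coordinates are in pixels with y increasing downward.

```python
import math


def skew_angle(left_mark, right_mark):
    """Angle in degrees of the line through two registration marks.

    Each mark is an (x, y) tuple. On a perfectly aligned form the two
    marks share the same y value and the angle is 0.
    """
    dx = right_mark[0] - left_mark[0]
    dy = right_mark[1] - left_mark[1]
    return math.degrees(math.atan2(dy, dx))


def deskew_point(point, origin, angle_degrees):
    """Rotate `point` around `origin` by -angle_degrees to undo the skew."""
    theta = math.radians(-angle_degrees)
    x = point[0] - origin[0]
    y = point[1] - origin[1]
    return (
        origin[0] + x * math.cos(theta) - y * math.sin(theta),
        origin[1] + x * math.sin(theta) + y * math.cos(theta),
    )
```

For example, with marks detected at (100, 100) and (1100, 150), `skew_angle` reports a skew of a little under 3 degrees, and applying `deskew_point` to the right-hand mark brings it back onto the same horizontal line as the left-hand one.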

Form background removal

This stage is not necessary if the document is a form that was originally printed in a colored ("drop out") ink that is invisible to the scanner being used.

If colored ink is not used, the form image may contain lines, boxes, fine print, and other form attributes (the passive data) that tend to confuse the ICR engine. These form attributes must be removed from the image of the form, so that only the character images (the active data) are left behind. Broken and fragmented characters are then automatically repaired and restored to their original shapes.
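One simple way to picture background removal is template subtraction: compare the filled-in form with an image of the blank form and keep only the ink that the blank form does not have. The sketch below assumes both images are already aligned, binarized (1 for ink, 0 for background), and the same size; `remove_form_background` is a hypothetical name, and production ICR software may use quite different methods.

```python
def remove_form_background(filled, blank_template):
    """Keep only ink that appears on the filled form but not on the blank one.

    Both arguments are binary images (lists of rows, 1 = ink) of identical
    size, already registered against each other.
    """
    if len(filled) != len(blank_template):
        raise ValueError("images must have the same number of rows")
    result = []
    for filled_row, blank_row in zip(filled, blank_template):
        if len(filled_row) != len(blank_row):
            raise ValueError("images must have the same number of columns")
        result.append(
            [1 if f == 1 and b == 0 else 0 for f, b in zip(filled_row, blank_row)]
        )
    return result
```

Note that any stroke crossing a printed line is erased along with the line, which is one reason a repair pass for broken characters is needed afterwards.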

Character segmentation
Sophisticated software routines analyze, separate, and break down the character fields into isolated characters. If the form is "ICR-friendly", characters are segmented with the aid of graphic devices such as boxes, tick-marks, and connected boxes called "combs" that serve to force the form user to legibly separate the characters from one another.
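A common textbook approach to segmentation is the vertical projection profile: count the ink in each pixel column and cut the field wherever a column is empty. The sketch below assumes a binarized field image (1 for ink) in which characters do not touch, which is what boxes and combs are meant to encourage. It illustrates the idea only and is not the routine any specific ICR engine uses.

```python
def segment_columns(field):
    """Split a binary field image into character column ranges.

    `field` is a list of rows (1 = ink). Returns a list of (start, end)
    column indices, end exclusive, one per run of columns containing ink.
    """
    if not field:
        return []
    width = len(field[0])
    # Vertical projection: amount of ink in each pixel column.
    profile = [sum(row[x] for row in field) for x in range(width)]
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, width))
    return segments
```

Touching or overlapping handwritten characters defeat this simple rule, which is why real systems need the more sophisticated routines mentioned above.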

Character Classification
Individual characters are classified by ICR algorithms according to their ASCII category and assigned a confidence value, which is an index of how "certain" the ICR engine "feels" about the selection it has made. Alternate character choices are ranked according to those values, so that they can be incorporated into editing procedures that improve ICR accuracy. For example, the alternate choice "1" might be used instead of the first-ranked choice "I" when contextual analysis reports that the field is all-numeric.
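The use of ranked alternates can be sketched as follows. The candidate list, the confidence numbers, and the function name `pick_character` are all made up for illustration; the only rule taken from the text is that a lower-ranked choice may win when the field type rules out the top choice.

```python
def pick_character(candidates, field_type):
    """Choose the best candidate that is valid for the field type.

    `candidates` is a list of (character, confidence) pairs in any order.
    `field_type` is "numeric", "alphabetic", or "alphanumeric".
    Returns the chosen (character, confidence) pair, or None if no
    candidate fits the field type.
    """
    def allowed(ch):
        if field_type == "numeric":
            return ch.isdigit()
        if field_type == "alphabetic":
            return ch.isalpha()
        return ch.isalnum()

    # Highest confidence first, then take the first one the context allows.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    for ch, confidence in ranked:
        if allowed(ch):
            return ch, confidence
    return None


# Hypothetical engine output for one character image.
candidates = [("I", 0.62), ("1", 0.58), ("l", 0.40)]
numeric_choice = pick_character(candidates, "numeric")
alphabetic_choice = pick_character(candidates, "alphabetic")
```

Here the numeric field selects "1" even though "I" has the higher raw confidence, while an alphabetic field keeps "I". A result of None would mark the character as a candidate for manual correction.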

Form Identification

The document is identified by certain predefined characteristics that the ICR software is trained to look for, so that the zones containing the fields designated for recognition can be located by a customized, predefined ICR template. Form ID attributes can include form numbers, corporate logos, or the name of the form itself imprinted somewhere on the form.
 

Character Field Location

The predefined ICR template automatically locates the fields that contain character data. The template identifies which individual fields on the form image require character recognition and what the nature of each field is: hand print, machine print, numeric, alphabetic, alphanumeric, etc. The template also identifies which areas are barcodes or check box recognition zones.
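
A template of this kind can be pictured as a list of named zones, each with a bounding box and a field type. The data layout below (the `Zone` class, its `kind` labels, and the helper functions) is a hypothetical sketch for illustration, not the format of any real ICR product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive
    kind: str     # e.g. "hand_print_numeric", "machine_print_alpha",
                  # "barcode", "checkbox"


def crop(image, zone):
    """Cut the zone's rectangle out of an image stored as a list of rows."""
    return [row[zone.left:zone.right] for row in image[zone.top:zone.bottom]]


def character_zones(template):
    """Zones that need character recognition (not barcodes or check boxes)."""
    return [z for z in template if z.kind not in ("barcode", "checkbox")]


# Hypothetical template for a small form.
template = [
    Zone("zip_code", left=1, top=1, right=4, bottom=3, kind="hand_print_numeric"),
    Zone("agree_box", left=4, top=0, right=6, bottom=1, kind="checkbox"),
]
```

Each cropped zone would then be passed to the recognizer that matches its `kind`.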
Post-Processing
The initial or "raw" recognition results are validated using edit procedures such as grammatical rules, spell-checkers, dictionaries, check-sum routines, and look-up tables.
Ambiguous and erroneous data fields (the "rejects") are identified and sent to data entry operators at workstations for manual correction.
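
The validation and reject routing can be sketched as below. The Luhn algorithm stands in for "check-sum routines" purely as a well-known example; the field names, the look-up table contents, and the 0.80 confidence threshold are assumptions chosen for illustration.

```python
def luhn_valid(digits):
    """Return True if a digit string passes the Luhn check-sum."""
    if not digits.isdigit():
        return False
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def split_rejects(fields, validators, min_confidence=0.80):
    """Separate recognized fields into accepted values and rejects.

    `fields` maps a field name to (text, confidence).
    `validators` maps a field name to a function returning True or False.
    A field is rejected if its confidence is too low or its validator fails.
    """
    accepted, rejects = {}, {}
    for name, (text, confidence) in fields.items():
        check = validators.get(name)
        ok = confidence >= min_confidence and (check is None or check(text))
        (accepted if ok else rejects)[name] = text
    return accepted, rejects


# Hypothetical raw results and edit rules for one form.
raw = {
    "account": ("79927398713", 0.95),
    "state": ("DL", 0.91),
    "zip": ("1I0001", 0.97),
    "name": ("RAVI", 0.55),
}
rules = {
    "account": luhn_valid,
    "state": lambda s: s in {"DL", "MH"},   # look-up table
    "zip": str.isdigit,
}
accepted, rejects = split_rejects(raw, rules)
```

In this example the zip field is rejected because it contains a letter, and the name field is rejected for low confidence; both would be queued for an operator.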

Manual correction of rejected character fields
The manner in which the data entry operator is presented the rejected data for correction can dramatically impact both the speed and the accuracy of the reject repair process. In particular, the data entry GUI is important because the ergonomics of data entry are what enable a given data entry operator to reach his or her maximum correction speed.
With the opportunity for error compounding at each successive step, it is remarkable that ICR accuracy rates can match (and sometimes exceed) human performance levels.


© IT Datasoft, New Delhi, India. All Rights Reserved.