
Optical character recognition, or "OCR," refers to an increasingly popular set of applications that process scanned documents, PDFs, and other files and translate them into text that can be saved, searched, and edited by other programs and human users.
Anyone who has ever had to re-type a scanned PDF or manually input data from hard-copy documents can appreciate the huge potential such tools have. Unfortunately, the adoption of OCR applications is not as universal as one might expect. This is likely due in large part to a lack of awareness and understanding of what OCR is and does and what its real-world applications are.
OCR refers to the ability of a computer or machine to process a variety of text and convert it into something that’s machine-readable, i.e., something a computer can understand.
A human might read a hard-copy book, a handwritten note, and a PDF on a screen in much the same way. In each case, that human’s brain is performing its own version of optical character recognition: recognizing characters and translating them into their underlying meaning. A family member’s scribbled note on the refrigerator might be sloppier than a typed note, but readers can generally derive the same meaning from both.
The same concept applies to computer OCR programs. Instead of requiring characters to be inputted through keyboard strokes or an application programming interface (API), OCR allows computers to optically scan printed or written text and translate that text into a standardized format the computer understands.
The term OCR is often used generically to refer to the ability of a computer to process scanned text, whether typed or handwritten. Strictly speaking, however, there’s an important distinction between optical character recognition (OCR) and intelligent character recognition, or ICR.
While OCR is generally used for reading printed text, ICR is used for reading handwriting. The “intelligent” element of ICR comes from the fact that a computer needs to be smart enough (either through extensive, brute-force programming or machine learning) to be able to process handwritten text. This is no small feat, considering many humans often have trouble reading handwritten text themselves.
ICR, therefore, can be considered a subset of OCR, as opposed to a separate function entirely.
When people talk about OCR, they may use a related term: image recognition. OCR neophytes need not be confused by the two terms, however. Put simply, OCR is a subset of image recognition, which covers not only OCR and ICR but also broader use cases such as facial recognition.
OCR systems have become increasingly useful as technology has developed. Businesses have discovered how to use OCR to improve several tedious manual processes so that employee efforts can be directed toward more creative problem-solving.
The following five optical character recognition uses are great examples of how OCR can reduce human bottlenecks in a business setting.
For years, accounting professionals have had to painstakingly and manually input data from customer or vendor invoices into a centralized system. Different organizations may use different formats for their invoices and purchase orders, so someone has to match like with like when compiling them all into a centralized location.
Before OCR, even when all invoices were in a standard and consistent format, it was typically necessary for a human to transfer data from those invoices into a centralized database. Using OCR instead can help dramatically reduce processing time as well as input errors.
Even when a company has scanned and uploaded documents to a central repository, it can be a tedious and unreliable process to search for specific information. OCR can supercharge data retrieval processes because it allows users to manage multiple different document types, including paper documents, images, and PDFs, and convert the information from those files to formats like Word, Excel, searchable PDFs, and others.
One use case for OCR that many people may be familiar with regardless of their technical aptitude is to scan checks. Many banks allow customers to save a trip to a local branch to cash a check by allowing them to take a picture of a check on their phone and upload that image to the bank’s website or app. The information contained on a check is limited and standardized, making this use case a relatively easy job for OCR.
Similarly, the number of fields contained in most forms of identification is fairly limited and standardized. These might include a person's name, address, date of birth, date of ID issuance and expiration, and other identifying information, such as height, weight, hair color, etc. By scanning and inputting that information, an OCR program can quickly compare it against the information it might expect to see from a presented ID.
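As a rough illustration of that comparison step, the sketch below checks a handful of fields after an OCR pass has already extracted them. The field names, the ISO date format, and the `check_id_fields` helper are all assumptions made for this example; a real ID layout and its validation rules would differ.

```python
from datetime import date

# Hypothetical field names; real IDs vary by issuer.
REQUIRED_FIELDS = ("name", "date_of_birth", "issued", "expires")
DATE_FIELDS = ("date_of_birth", "issued", "expires")


def check_id_fields(fields, today):
    """Return a list of problems found in OCR-extracted ID fields."""
    problems = []

    # Every expected field should be present and non-empty.
    for key in REQUIRED_FIELDS:
        if not fields.get(key):
            problems.append(f"missing field: {key}")

    # Dates are assumed to be ISO formatted (YYYY-MM-DD) for this sketch.
    parsed = {}
    for key in DATE_FIELDS:
        value = fields.get(key)
        if value:
            try:
                parsed[key] = date.fromisoformat(value)
            except ValueError:
                problems.append(f"unreadable date: {key}")

    # Simple consistency checks between the dates that did parse.
    if "issued" in parsed and "expires" in parsed:
        if parsed["expires"] <= parsed["issued"]:
            problems.append("expires on or before issue date")
    if "expires" in parsed and parsed["expires"] < today:
        problems.append("ID has expired")

    return problems


scanned = {
    "name": "Jane Doe",
    "date_of_birth": "1990-04-12",
    "issued": "2021-06-01",
    "expires": "2O29-06-01",  # OCR misread a zero as the letter O
}
print(check_id_fields(scanned, today=date(2024, 1, 1)))
```

A check like this would typically flag the record for human review rather than reject it outright, since the problem may lie in the scan and not in the ID.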
Few data processing tasks require more care than health records processing. Accuracy in this process can mean the difference between life and death in extreme cases. Even when the stakes aren't so high, the sheer volume of health records healthcare providers, insurers, and others sift through daily is staggering. OCR tools can be a tremendous asset in streamlining that process.
The ability of a computer to read and understand text may seem mindboggling. Indeed, the underlying technology is certainly impressive. But conceptually it’s not difficult for even non-technical users to grasp.
The first step is simply getting as clear an image as possible of the text being scanned. This means smoothing out any fuzzy or blurry lines so that OCR programs can interpret each character separately, one by one. Interpreting those characters, of course, is the tough part.
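To make the cleanup and separation steps concrete, here is a minimal sketch in plain Python. It assumes the scan is already a grid of grayscale values (0 for black, 255 for white), uses a fixed threshold to turn fuzzy gray pixels into clean ink or background, and then splits characters wherever a column contains no ink. Both function names and the threshold value are illustrative choices, not part of any particular OCR product.

```python
def binarize(gray, threshold=128):
    """Map grayscale pixels to 1 (ink) or 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def split_characters(binary):
    """Return (start, end) column ranges that contain ink.

    Ranges are separated by fully blank columns; `end` is exclusive.
    """
    width = len(binary[0]) if binary else 0
    spans = []
    start = None
    for col in range(width):
        has_ink = any(row[col] for row in binary)
        if has_ink and start is None:
            start = col                  # a character begins here
        elif not has_ink and start is not None:
            spans.append((start, col))   # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, width))     # character touches the right edge
    return spans


# A tiny 2-row "scan": one narrow character, a gap, then a wider one.
gray = [
    [255, 40, 200, 255, 90, 60],
    [255, 30, 255, 255, 255, 70],
]
binary = binarize(gray)
print(split_characters(binary))
```

Real scans usually need more than this, for example correcting skew or handling characters that touch each other, but the idea of cleaning the image first and isolating characters second is the same.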
Early forms of OCR, dating back to the 1960s, required all text fed into the computer to be in one standard font, called OCR-A. Computers were programmed to understand text in that font by checking the standard length, width, and shape of OCR-A characters against a finite library of pre-programmed options. If a document were presented to one of these early OCR applications in a different font, such as Times New Roman, the system wouldn’t be able to read it. Today, OCR tools have a much-expanded library of fonts against which to compare text.
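The library-comparison idea can be sketched as simple template matching. The example below is a toy: the 3x3 glyphs in `TEMPLATES` are invented for illustration (they are not OCR-A shapes), and the 0.8 cutoff is an arbitrary assumption. It scores a scanned glyph against each stored shape by the fraction of matching pixels and gives up when nothing matches well enough, which mirrors how an early system would fail on an unfamiliar font.

```python
# Hypothetical 3x3 glyph library; "1" is ink, "0" is background.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def similarity(glyph, template):
    """Fraction of pixels that agree between two same-sized bitmaps."""
    total = sum(len(row) for row in template)
    same = sum(
        a == b
        for glyph_row, template_row in zip(glyph, template)
        for a, b in zip(glyph_row, template_row)
    )
    return same / total


def recognize(glyph, templates=TEMPLATES, min_score=0.8):
    """Return the best-matching character, or None if nothing is close."""
    best_char, best_score = None, 0.0
    for char, template in templates.items():
        score = similarity(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= min_score else None


print(recognize(["111", "010", "011"]))  # a "T" with one stray pixel
print(recognize(["101", "010", "101"]))  # an "X", which is not in the library
```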
For unfamiliar fonts or handwriting, more sophisticated OCR applications use a rules-based system. For example, a rule to recognize the letter “V” might treat any character composed of two lines that slope to meet at a point at the bottom as the letter V. A separate rule or rule component, based on the height of that character relative to others, would be able to distinguish between upper and lower case.
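The “V” rule described here can be written down directly. The sketch below assumes an earlier stage has already reduced a character to straight strokes with known endpoints; the `Stroke` type, the pixel tolerance, and the 0.8 height ratio used to separate upper from lower case are all hypothetical values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Stroke:
    """A straight line segment; y grows downward, as in image coordinates."""
    x1: float
    y1: float
    x2: float
    y2: float


def _top_and_bottom(stroke):
    """Return the stroke's endpoints ordered as (top, bottom)."""
    a, b = (stroke.x1, stroke.y1), (stroke.x2, stroke.y2)
    return (a, b) if a[1] <= b[1] else (b, a)


def looks_like_v(strokes, tolerance=1.0):
    """Rule: exactly two strokes that start apart and meet at the bottom."""
    if len(strokes) != 2:
        return False
    (top_a, bot_a), (top_b, bot_b) = map(_top_and_bottom, strokes)
    meet_at_bottom = (
        abs(bot_a[0] - bot_b[0]) <= tolerance
        and abs(bot_a[1] - bot_b[1]) <= tolerance
    )
    apart_at_top = abs(top_a[0] - top_b[0]) > tolerance
    both_descend = (
        bot_a[1] - top_a[1] > tolerance and bot_b[1] - top_b[1] > tolerance
    )
    return meet_at_bottom and apart_at_top and both_descend


def classify_v(strokes, cap_height, case_ratio=0.8):
    """Return "V", "v", or None, using height relative to capital letters."""
    if not looks_like_v(strokes):
        return None
    ys = [y for s in strokes for y in (s.y1, s.y2)]
    height = max(ys) - min(ys)
    return "V" if height >= case_ratio * cap_height else "v"


tall = [Stroke(0, 0, 5, 10), Stroke(10, 0, 5, 10)]
short = [Stroke(0, 4, 3, 10), Stroke(6, 4, 3, 10)]
print(classify_v(tall, cap_height=10), classify_v(short, cap_height=10))
```

Because the rule looks at shape rather than exact pixels, it tolerates slanted or uneven handwriting in a way that a fixed template cannot, at the cost of needing a separate rule for every character.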
NITCO Inc. can calibrate your data needs and help you implement OCR as part of a comprehensive document understanding system. These systems can be deployed via Robotic Process Automation and Intelligent Process Automation solutions to elevate business productivity exponentially.
Contact NITCO today to learn how our team can help streamline your business processes.