

Accounting and Technology

Invoice Scanning OCR Technology in AP Automation Software

by Yooz on 08.21.2018


OCR is another one of those technical buzzwords that we’re hearing a lot about these days in accounting and technology. It’s of particular interest in FinTech (another new buzzword that stands for ‘financial technology’) and even more specifically in accounts payable (AP). But what is it? More importantly, what isn’t it? How does it even work? And how does it make a difference in the AP workflow? In this blog, we’ll answer all your questions about OCR and its role in automating invoice processing.


What is Optical Character Recognition (OCR)?

The official definition of optical character recognition (OCR) is the mechanical or electronic conversion of images of typed, printed, or even handwritten text into machine-encoded text. The text can come from a scanned document (like a vendor invoice), a photo of a document (like a receipt), a scene photo (for example, the text on signs and billboards in a landscape photo), or subtitle text superimposed on an image (like from a television broadcast).


OCR looks like this.


Simply put, it’s a computer looking at an image or document, such as a supplier invoice, and being able to identify what is on it.[1]


Don’t confuse OCR invoice scanning with data extraction.

OCR is a technology that turns a picture into words. The next layer, smart data extraction, understands and processes the text from the OCR to transform or format it into relevant data. As many of you are exploring AP automation software providers, you may ask, “Do you have OCR technology?” Good question. But what you really want to know when investigating invoice and payment processing automation providers is whether the software offers a complete technology stack that combines OCR (converting images to text), smart data extraction (transforming the text into relevant data), and machine learning (remembering the data and populating it into the applicable data fields each time the data is recognized). Today, there are three predominant types of extraction technology:


  1. Human verified or outsourced extraction.
  2. Zonal-based extraction that utilizes predefined templates.
  3. Systems based upon artificial intelligence (AI) or machine learning.
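
To make the difference between OCR and data extraction concrete, here is a minimal Python sketch. It assumes the OCR step has already run and returned plain text; the sample invoice text, the field names, and the patterns are all hypothetical and only illustrate the idea that OCR yields words while extraction yields named fields.

```python
import re

# Hypothetical text as an OCR engine might return it: words, but no structure.
OCR_TEXT = """ACME Supplies Inc.
Invoice No: INV-1042
Invoice Date: 08/21/2018
Total Amount: $1,250.00
"""

# Illustrative patterns only; a real extraction layer is far more robust.
FIELD_PATTERNS = {
    "invoice_number": r"Invoice No:\s*(\S+)",
    "invoice_date": r"Invoice Date:\s*(\d{2}/\d{2}/\d{4})",
    "total_amount": r"Total Amount:\s*\$?([\d,]+\.\d{2})",
}


def extract_fields(text):
    """Turn flat OCR text into a dict of named fields (None when not found)."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    return fields


invoice = extract_fields(OCR_TEXT)
```

The OCR text on its own is just a string. Only after the extraction step does the software know which part of that string is the invoice number and which part is the total.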


OCR processing needs assistance.

Each of these approaches exists because OCR by itself does not know what to do with the information it reads from your documents. OCR can scan the text on a PDF invoice or document but doesn’t know where to put that information. Some AP automation software providers might use OCR but then apply human extraction by outsourcing to a third party, also called third-party verification. OCR extraction that layers human verification uses people to put data read by the OCR into predefined fields. In this scenario, data entry is done by an outsourced firm and takes time because the data is being populated by people, typically 24 to 72 business hours. Kind of defeats the purpose of moving from a manual AP process to an automated process to save time, right?

In template-based data extraction software, a user has to build or predefine specifically “where” on a document a specific piece of information can be found and “what” the tool should do with the data it finds. The retrieval process can be done fairly quickly; however, it can become an administrative burden because templates must be manually managed and updated as documents change.

In this scenario, humans have to constantly manage the templates: read them, interpret them, and update them. This might defeat the intent of transitioning to invoice and payment automation because you are not saving time. It might even be more time-intensive and, in some respects, duplicate effort.
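
The template approach described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the OCR engine is assumed to return each word with page coordinates, and the `Word` and `Zone` classes, the coordinates, and the sample invoices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word, in page units
    y: float  # top edge of the word, in page units


@dataclass
class Zone:
    field_name: str  # "what" to do with the text found in the zone
    x_min: float     # "where" on the page to look
    y_min: float
    x_max: float
    y_max: float


def extract_by_template(words, template):
    """Return field name -> text found inside each predefined zone."""
    result = {}
    for zone in template:
        inside = [
            w for w in words
            if zone.x_min <= w.x <= zone.x_max and zone.y_min <= w.y <= zone.y_max
        ]
        inside.sort(key=lambda w: (w.y, w.x))  # reading order
        result[zone.field_name] = " ".join(w.text for w in inside)
    return result


# A template someone built by hand for one vendor's invoice layout.
TEMPLATE = [
    Zone("invoice_number", 400, 40, 600, 60),
    Zone("total_amount", 400, 700, 600, 720),
]

original_layout = [Word("INV-1042", 420, 50), Word("$1,250.00", 450, 710)]
# Same vendor after a redesign: the total has moved up the page.
changed_layout = [Word("INV-1043", 420, 50), Word("$980.00", 450, 650)]

before = extract_by_template(original_layout, TEMPLATE)
after = extract_by_template(changed_layout, TEMPLATE)
```

With the original layout, both fields are found. After the redesign, the total falls outside its zone and comes back empty until someone edits the template, which is the maintenance burden in question.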




Pair OCR with Artificial Intelligence and Machine Learning.

Data extraction using AI or machine learning is able to “understand” what information on a document or invoice needs to be used and, more importantly, what should be done with said information to make it relevant data. For example, software utilizing machine learning is able to populate the “Total Amount” of an invoice without being taught or shown where to grab the data and which field to associate it with. Because the tool has seen thousands of document examples, it is able to draw on past experience to reach conclusions. Smart!


Pick the right AP automation solution with the best invoice scanning software.

But what does all this mean, exactly, to finance leaders and their AP teams? Mark Brousseau, a consultant, speaker, and Institute of Finance Management (IOFM) spokesperson, puts it in perspective in an interview with IOFM.

“Businesses today are expecting more from their AP function. They realize that if they can get at the information and data housed in their AP department, they can use it to support better management of their working capital, mitigate potential risk, and make more strategic decisions.”

This allows companies to transform data into knowledge that can better inform their business decisions and streamline their business processes. And it’s made possible with the combination of OCR and smart data extraction, the most important feature of advanced AP automation solutions.

When it comes to the Yooz AP automation software, our smart data extraction technology leverages OCR to extract information from scanned invoices, photos of paper invoices, or invoice images received via email. It interprets the information, pulls out the relevant data, and applies it to the appropriate field in the application, where it can be reviewed and sent for approval. Finally, the data is exported to an ERP or financial package. If there are pieces of data that cannot be interpreted from a document, Yooz learns over time how to handle those missing pieces. This is referred to as machine learning and is powered by AI.

With constant enhancements, no end user ever has to teach the software. The staff transitions from manual data entry and third-party verification to simply reviewing the retrieved data for accuracy. If there is a miss, the reviewer can click inside the Yooz application to correct it and flag the missed information quickly and easily. Utilizing machine learning optimizations, the system becomes more intelligent over time, reducing the number of mistakes. It gets smarter the more you “Yooz” it!
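
As a rough illustration of how reviewer corrections can feed back into extraction, here is a toy Python sketch. It is not how Yooz works internally, which the article does not describe; production systems rely on trained models rather than a simple counter. The `LabelMemory` class, its methods, and the sample data are hypothetical, and the OCR output is assumed to arrive as label and value pairs.

```python
from collections import Counter, defaultdict


class LabelMemory:
    """Toy learner: remembers which printed label a reviewer tied to each field."""

    def __init__(self):
        # field name -> how often each label was confirmed for that field
        self.label_counts = defaultdict(Counter)

    def record_correction(self, field_name, label):
        """Store a reviewer's correction: this label holds this field's value."""
        self.label_counts[field_name][label.lower()] += 1

    def extract(self, field_name, lines):
        """Pick the value whose label was confirmed most often, else None.

        lines: list of (label, value) pairs as read by OCR.
        """
        known = self.label_counts[field_name]
        best = None
        for label, value in lines:
            score = known[label.lower()]  # Counter returns 0 for unseen labels
            if score > 0 and (best is None or score > best[0]):
                best = (score, value)
        return best[1] if best else None


memory = LabelMemory()

first_invoice = [("Invoice No", "INV-1042"), ("Balance Due", "$1,250.00")]
first_try = memory.extract("total_amount", first_invoice)  # nothing learned yet

# The reviewer clicks the missed value; the system records which label held it.
memory.record_correction("total_amount", "Balance Due")

next_invoice = [("Invoice No", "INV-1043"), ("Balance Due", "$980.00")]
second_try = memory.extract("total_amount", next_invoice)
```

On the first invoice, the total is missed because the label has never been seen. After a single correction, the next invoice with the same label is filled in without any template being edited.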


What you need is a complete process of invoice scanning OCR, data extraction, and machine learning powered by AI.

Today’s organizations are focused on speed, efficiency, and leveraging technology to solve business problems. When it comes to automating your AP process, take the time to first set your business goals and determine what challenges need to be solved. Then find an AP automation solution that meets as many of those critical needs as possible, if not all of them.

Sure, you can ask, “Do you have OCR?” But don’t stop there. Keep digging until you have a complete understanding of each solution you are considering and, more importantly, what best suits your business.


Article written by Justin Holden, VP of Sales, Yooz

