Amazon Textract improves accuracy in detecting currency symbols, key-value pairs, and checkboxes

Posted on: Sep 24, 2020

Amazon Textract is a machine learning service that enables customers to automatically extract text and data, including the contents of tables and forms, from scanned documents and images. As a fully managed service, Textract improves continuously over time. Today, we are pleased to announce quality enhancements to both the Optical Character Recognition (OCR) feature and the forms recognition feature. The new OCR model detects the degree symbol (°) and the currency symbols for the Chinese Yuan (CNY ¥), Japanese Yen (JPY ¥), Indian Rupee (₹), British Pound (£), and US Dollar ($) more accurately than before.
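As a rough illustration of where these symbols show up, the sketch below scans an OCR response for the characters listed above. It assumes a response shaped like the output of the DetectDocumentText API (a `Blocks` list whose `LINE` entries carry a `Text` field). The helper name `lines_with_symbols` and the sample data are hypothetical, not part of the service.

```python
import re

# Symbols named in the announcement: degree, yuan/yen, rupee, pound, dollar.
SYMBOLS = "°¥₹£$"
SYMBOL_PATTERN = re.compile("[" + re.escape(SYMBOLS) + "]")


def lines_with_symbols(response):
    """Return (text, symbols) for each LINE block that contains a listed symbol.

    `response` is assumed to be a dict shaped like a DetectDocumentText result.
    """
    hits = []
    for block in response.get("Blocks", []):
        # Only look at whole lines; WORD blocks would repeat the same text.
        if block.get("BlockType") != "LINE":
            continue
        text = block.get("Text", "")
        found = SYMBOL_PATTERN.findall(text)
        if found:
            hits.append((text, found))
    return hits


# Hand-written sample standing in for a real response (illustrative only).
sample_ocr_response = {
    "Blocks": [
        {"BlockType": "LINE", "Text": "Total due: $1,250.00"},
        {"BlockType": "WORD", "Text": "$1,250.00"},
        {"BlockType": "LINE", "Text": "Storage temperature 21°C"},
        {"BlockType": "LINE", "Text": "Thank you"},
    ]
}

for text, symbols in lines_with_symbols(sample_ocr_response):
    print(symbols, text)
```

Filtering on `LINE` blocks keeps each symbol next to its surrounding amount or reading, which is usually what a downstream check needs.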

The latest forms model is more accurate on a variety of forms, especially income verification documents such as pay stubs, bank statements, and tax documents. With these improvements, you can use Amazon Textract to more accurately detect contextual information about amounts, temperature readings, selected and not-selected checkbox states, and key-value pairs in documents with form elements.
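One way to read key-value pairs and checkbox states from a forms result is sketched below. It assumes a response shaped like the output of the AnalyzeDocument API called with the `FORMS` feature type, where `KEY_VALUE_SET` blocks link to their words and selection elements through `Relationships`. The helper names (`analyze_form`, `extract_key_values`, `_child_text`) and the sample data are hypothetical; treat this as a minimal sketch rather than a reference implementation.

```python
def analyze_form(image_bytes):
    """Call Textract on raw image bytes (needs boto3, AWS credentials, and a region)."""
    import boto3  # imported lazily so the parsing helpers work without it

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["FORMS"],
    )


def _child_text(block, blocks_by_id):
    """Join the words (or checkbox status) that a block points to via CHILD links."""
    parts = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                parts.append(child["Text"])
            elif child["BlockType"] == "SELECTION_ELEMENT":
                # SelectionStatus is SELECTED or NOT_SELECTED.
                parts.append(child["SelectionStatus"])
    return " ".join(parts)


def extract_key_values(response):
    """Map each form key to its value text or checkbox status."""
    blocks_by_id = {b["Id"]: b for b in response.get("Blocks", [])}
    pairs = {}
    for block in blocks_by_id.values():
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        key = _child_text(block, blocks_by_id)
        values = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    values.append(_child_text(blocks_by_id[value_id], blocks_by_id))
        pairs[key] = " ".join(v for v in values if v)
    return pairs


# Hand-written sample standing in for a real response (illustrative only).
sample_forms_response = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
        {"Id": "k2", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v2"]},
                           {"Type": "CHILD", "Ids": ["w4"]}]},
        {"Id": "v2", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["s1"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Net"},
        {"Id": "w2", "BlockType": "WORD", "Text": "pay:"},
        {"Id": "w3", "BlockType": "WORD", "Text": "$2,310.45"},
        {"Id": "w4", "BlockType": "WORD", "Text": "Full-time"},
        {"Id": "s1", "BlockType": "SELECTION_ELEMENT", "SelectionStatus": "SELECTED"},
    ]
}

print(extract_key_values(sample_forms_response))
```

With a real document, the same parser would be applied to the return value of `analyze_form(image_bytes)` instead of the hand-written sample.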

The OCR feature update launched today in the Asia Pacific (Singapore) Region and will roll out over the next few days to all other AWS Regions where Amazon Textract is available. The latest forms model launched today in all AWS Regions where Amazon Textract is available.

Get started with Amazon Textract today.