Form Recognizer 2021-09-30-preview

Form Recognizer extracts information from forms and images into structured data. It includes the following options:

  • Layout - Extracts text and table structure from documents using optical character recognition (OCR).
  • Document - Extracts text, selection marks, tables, entities, and general key-value pairs from documents.
  • Business Card - Detects and extracts data from business cards using optical character recognition (OCR) and our business card model, enabling you to easily extract structured data from business cards such as contact names, company names, phone numbers, emails, and more.
  • ID Document - Detects and extracts data from identification documents using optical character recognition (OCR) and our ID document model, enabling you to easily extract structured data from ID documents such as first name, last name, date of birth, document number, and more.
  • Invoices - Detects and extracts data from invoices using optical character recognition (OCR) and our invoice understanding deep learning models, enabling you to easily extract structured data from invoices such as customer, vendor, invoice ID, invoice due date, total, invoice amount due, tax amount, ship to, bill to, line items and more.
  • Receipt - Detects and extracts data from receipts using optical character recognition (OCR) and our receipt model, enabling you to easily extract structured data from receipts such as merchant name, merchant phone number, transaction date, transaction total, and more.
  • Custom - Extracts information from forms (PDFs and images) into structured data based on a model created from a set of representative training forms. Form Recognizer learns the structure of your forms to intelligently extract text and data. It ingests text from forms, applies machine learning technology to identify keys, tables, and fields, and then outputs structured data that includes the relationships within the original file.

Analyze - Get analyze result

Gets the result of document analysis.
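Because analysis is a long-running operation, a client typically calls this endpoint repeatedly until the status settles. The following is a minimal polling sketch, not an official client. It assumes a hypothetical `fetch` callable that performs the GET request and returns the decoded JSON body, and it treats any status other than the two values shown in this document's response examples (`succeeded`, `failed`) as still in progress.

```python
import time

# The two status values that appear in this document's response examples.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_result(fetch, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll fetch() until the operation reports a terminal status.

    fetch is a hypothetical callable that performs the GET request and
    returns the decoded JSON body as a dict.
    """
    for _ in range(max_polls):
        result = fetch()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        # Assumption: any other status means the analysis is still in progress.
        sleep(interval)
    raise TimeoutError(f"no terminal status after {max_polls} polls")
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.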

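Request URL

https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview

Request parameters

modelId (string)

Format - [a-zA-Z0-9][a-zA-Z0-9._~-]{1,63}. Unique model name.

resultId (string)

Analyze operation result ID.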
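As an illustration of the parameter format above, the sketch below validates a model name against the documented pattern and assembles the request URL used in the code samples. The `endpoint` argument and the function name are hypothetical; substitute the host of your own resource.

```python
import re
from urllib.parse import quote

# Pattern copied from the documented modelId format.
MODEL_ID_PATTERN = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9._~-]{1,63}")

API_VERSION = "2021-09-30-preview"


def build_result_url(endpoint: str, model_id: str, result_id: str) -> str:
    """Return the GET URL for an analyze result.

    endpoint is a hypothetical placeholder for your resource host.
    """
    if not MODEL_ID_PATTERN.fullmatch(model_id):
        raise ValueError(f"modelId does not match the documented format: {model_id!r}")
    return (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{quote(model_id, safe='')}/analyzeResults/{quote(result_id, safe='')}"
        f"?api-version={API_VERSION}"
    )
```

Note that the pattern requires at least two characters, so a single-character model name is rejected.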

Request headers

Ocp-Apim-Subscription-Key (string)
Subscription key which provides access to this API. Found in your Cognitive Services accounts.

Request body

Response 200

Supported Document Fields


prebuilt:businesscard
Field | Type | Description | Example
ContactNames | array | |
ContactNames.* | object | Contact name | Chris Smith
ContactNames.*.FirstName | string | First (given) name of contact | Chris
ContactNames.*.LastName | string | Last (family) name of contact | Smith
CompanyNames | array | |
CompanyNames.* | string | Company name | CONTOSO
JobTitles | array | |
JobTitles.* | string | Job title | Senior Researcher
Departments | array | |
Departments.* | string | Department or organization | Cloud & AI Department
Addresses | array | |
Addresses.* | string | Address | 4001 1st Ave NE Redmond, WA 98052
WorkPhones | array | |
WorkPhones.* | phoneNumber | Work phone number | +1 (987) 213-5674
MobilePhones | array | |
MobilePhones.* | phoneNumber | Mobile phone number | +1 (987) 123-4567
Faxes | array | |
Faxes.* | phoneNumber | Fax number | +1 (987) 312-6745
OtherPhones | array | |
OtherPhones.* | phoneNumber | Other phone number | +1 (987) 213-5673
Emails | array | |
Emails.* | string | Contact email | chris.smith@contoso.com
Websites | array | |
Websites.* | string | Website | https://www.contoso.com
prebuilt:idDocument:driverLicense
Field | Type | Description | Example
CountryRegion | countryRegion | Country or region code | USA
Region | string | State or province | Washington
DocumentNumber | string | Driver license number | WDLABCD456DG
FirstName | string | Given name and middle initial if applicable | LIAM R.
LastName | string | Surname | TALBOT
Address | string | Address | 123 STREET ADDRESS YOUR CITY WA 99999-1234
DateOfBirth | date | Date of birth (DOB) | 01/06/1958
DateOfExpiration | date | Date of expiration (EXP) | 08/12/2020
Sex | string | Sex | M
Endorsements | string | Endorsements | L
Restrictions | string | Restrictions | B
VehicleClassifications | string | Vehicle classification | D
prebuilt:idDocument:passport
Field | Type | Description | Example
MachineReadableZone | object | Machine readable zone (MRZ) | P
MachineReadableZone.FirstName | string | Given name and middle initial if applicable | JENNIFER
MachineReadableZone.LastName | string | Surname | BROOKS
MachineReadableZone.DocumentNumber | string | Passport number | 340020013
MachineReadableZone.CountryRegion | countryRegion | Issuing country or organization | USA
MachineReadableZone.Nationality | countryRegion | Nationality | USA
MachineReadableZone.DateOfBirth | date | Date of birth | 1980-01-01
MachineReadableZone.DateOfExpiration | date | Date of expiration | 2019-05-05
MachineReadableZone.Sex | string | Sex | F
prebuilt:invoice
Field | Type | Description | Example
CustomerName | string | Customer being invoiced | Microsoft Corp
CustomerId | string | Reference ID for the customer | CID-12345
PurchaseOrder | string | A purchase order reference number | PO-3333
InvoiceId | string | ID for this specific invoice (often 'Invoice Number') | INV-100
InvoiceDate | date | Date the invoice was issued | 11/15/2019
DueDate | date | Date payment for this invoice is due | 12/15/2019
VendorName | string | Vendor who has created this invoice | CONTOSO LTD.
VendorAddress | string | Mailing address for the Vendor | 123 456th St New York, NY, 10001
VendorAddressRecipient | string | Name associated with the VendorAddress | Contoso Headquarters
CustomerAddress | string | Mailing address for the Customer | 123 Other St, Redmond WA, 98052
CustomerAddressRecipient | string | Name associated with the CustomerAddress | Microsoft Corp
BillingAddress | string | Explicit billing address for the customer | 123 Bill St, Redmond WA, 98052
BillingAddressRecipient | string | Name associated with the BillingAddress | Microsoft Services
ShippingAddress | string | Explicit shipping address for the customer | 123 Ship St, Redmond WA, 98052
ShippingAddressRecipient | string | Name associated with the ShippingAddress | Microsoft Delivery
SubTotal | number | Subtotal field identified on this invoice | $100.00
TotalTax | number | Total tax field identified on this invoice | $10.00
InvoiceTotal | number | Total new charges associated with this invoice | $110.00
AmountDue | number | Total Amount Due to the vendor | $610.00
PreviousUnpaidBalance | number | Explicit previously unpaid balance | $500.00
RemittanceAddress | string | Explicit remittance or payment address for the customer | 123 Remit St New York, NY, 10001
RemittanceAddressRecipient | string | Name associated with the RemittanceAddress | Contoso Billing
ServiceAddress | string | Explicit service address or property address for the customer | 123 Service St, Redmond WA, 98052
ServiceAddressRecipient | string | Name associated with the ServiceAddress | Microsoft Services
ServiceStartDate | date | First date for the service period (for example, a utility bill service period) | 10/14/2019
ServiceEndDate | date | End date for the service period (for example, a utility bill service period) | 11/14/2019
Items | array | List of line items |
Items.* | object | A single line item | 3/4/2021 A123 Consulting Services 2 hours $30.00 10% $60.00
Items.*.Amount | number | The amount of the line item | $60.00
Items.*.Date | date | Date corresponding to each line item. Often it is the date the line item was shipped | 3/4/2021
Items.*.Description | string | The text description for the invoice line item | Consulting service
Items.*.Quantity | number | The quantity for this invoice line item | 2
Items.*.ProductCode | string | Product code, product number, or SKU associated with the specific line item | A123
Items.*.Tax | number | Tax associated with each line item. Possible values include tax amount, tax %, and tax Y/N | 10%
Items.*.Unit | string | The unit of the line item, e.g., kg, lb, etc. | hours
Items.*.UnitPrice | number | The net or gross price (depending on the gross invoice setting of the invoice) of one unit of this item | $30.00
prebuilt:receipt
Field | Type | Description | Example
ReceiptType | string | Type of receipt | Itemized
Locale | string | Locale | en-US
MerchantName | string | Name of the merchant issuing the receipt | Contoso
MerchantPhoneNumber | phoneNumber | Listed phone number of merchant | 987-654-3210
MerchantAddress | string | Listed address of merchant | 123 Main St Redmond WA 98052
Total | number | Full transaction total of receipt | $14.34
TransactionDate | date | Date the receipt was issued | June 06, 2019
TransactionTime | time | Time the receipt was issued | 4:49 PM
Subtotal | number | Subtotal of receipt, often before taxes are applied | $12.34
Tax | number | Tax on receipt, often sales tax or equivalent | $2.00
Tip | number | Tip included by buyer | $1.00
ArrivalDate | date | Date of arrival | 27Mar21
DepartureDate | date | Date of departure | 28Mar21
Currency | currency | Currency unit of receipt amounts, or 'MIXED' if multiple values are found | USD
MerchantAliases | array | |
MerchantAliases.* | string | Alternative name of merchant | Contoso (R)
Items | array | |
Items.* | object | Extracted line item | 1 Surface Pro 6 $999.00 $999.00
Items.*.TotalPrice | number | Total price of line item | $999.00
Items.*.Name | string | Item name | Surface Pro 6
Items.*.Quantity | number | Quantity of each item | 1
Items.*.Price | number | Individual price of each item unit | $999.00
Items.*.Description | string | Item description | Room Charge
Items.*.Date | date | Item date | 27Mar21
Items.*.Category | string | Item category | Room
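The field names in the tables above appear as keys under each extracted document's `fields` object. As a rough sketch, the helper below collects top-level fields whose confidence meets a threshold. It assumes only the field shape shown in the response example in this document (`content` and `confidence` on each field) and does not descend into array or object fields; the function name and the threshold are illustrative.

```python
def collect_fields(analyze_result: dict, min_confidence: float = 0.0) -> dict:
    """Map "docType/FieldName" to raw content for fields meeting min_confidence."""
    found = {}
    for document in analyze_result.get("documents", []):
        doc_type = document.get("docType", "")
        for name, field in document.get("fields", {}).items():
            # Fields without a confidence value are treated as confidence 0.0.
            if field.get("confidence", 0.0) >= min_confidence:
                found[f"{doc_type}/{name}"] = field.get("content")
    return found


sample_result = {
    "documents": [
        {
            "docType": "prebuilt:invoice",
            "fields": {
                "VendorName": {"type": "string", "content": "CONTOSO LTD.", "confidence": 0.99},
                "InvoiceId": {"type": "string", "content": "INV-100", "confidence": 0.40},
            },
        }
    ]
}
print(collect_fields(sample_result, min_confidence=0.5))
```

A threshold like this is one way to route low-confidence fields to manual review; the right cut-off depends on the application.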

Error

Form Recognizer uses a unified design to represent all errors encountered in the REST APIs. Whenever an API operation returns a 4xx or 5xx status code, additional information about the error is returned in the response JSON body as follows:
    
{
  "error": {
    "code": "InvalidRequest",
    "message": "Invalid request.",
    "innererror": {
      "code": "InvalidContent",
      "message": "The file format is unsupported or corrupted. Refer to documentation for the list of supported formats."
    }
  }
}
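
A client can walk this envelope to find the most specific error. The sketch below follows `innererror` links until none remain and returns the deepest code and message; the function name is illustrative, and the sample body is the one shown above.

```python
import json


def innermost_error(body: dict) -> tuple:
    """Follow error -> innererror -> ... and return the deepest (code, message)."""
    err = body["error"]
    while isinstance(err.get("innererror"), dict):
        err = err["innererror"]
    return err.get("code"), err.get("message")


sample_error = json.loads("""
{
  "error": {
    "code": "InvalidRequest",
    "message": "Invalid request.",
    "innererror": {
      "code": "InvalidContent",
      "message": "The file format is unsupported or corrupted. Refer to documentation for the list of supported formats."
    }
  }
}
""")
code, message = innermost_error(sample_error)
print(code)
```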
    
For long-running operations where multiple errors may be encountered, the top-level error code is set to the most severe error, with the individual errors listed under the error.details property. In such scenarios, the target property of each individual error specifies the trigger of the error.
    
{
    "status": "failed",
    "createdDateTime": "2021-07-14T10:17:51Z",
    "lastUpdatedDateTime": "2021-07-14T10:17:51Z",
    "error": {
        "code": "InternalServerError",
        "message": "An unexpected error occurred.",
        "details": [
            {
                "code": "InternalServerError",
                "message": "An unexpected error occurred."
            },
            {
                "code": "InvalidContentDimensions",
                "message": "The input image dimensions are out of range. Refer to documentation for supported image dimensions.",
                "target": "2"
            }
        ]
    }
}
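
For a failed operation, the individual errors can be listed from `error.details`. The following sketch formats one line per detail and includes `target` when present; the function name is illustrative, and the sample mirrors the body shown above with shortened messages.

```python
def describe_failure(operation: dict) -> list:
    """Return one line per individual error of a failed operation."""
    if operation.get("status") != "failed":
        return []
    lines = []
    for detail in operation.get("error", {}).get("details", []):
        target = detail.get("target")
        suffix = f" (target: {target})" if target is not None else ""
        lines.append(f"{detail['code']}: {detail['message']}{suffix}")
    return lines


failed_operation = {
    "status": "failed",
    "error": {
        "code": "InternalServerError",
        "message": "An unexpected error occurred.",
        "details": [
            {"code": "InternalServerError", "message": "An unexpected error occurred."},
            {"code": "InvalidContentDimensions", "message": "The input image dimensions are out of range.", "target": "2"},
        ],
    },
}
for line in describe_failure(failed_operation):
    print(line)
```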
    

{
    "status": "succeeded",
    "createdDateTime": "2021-09-30T12:42:07Z",
    "lastUpdatedDateTime": "2021-09-30T12:42:13Z",
    "analyzeResult": {
      // Basic analyze result metadata
      "apiVersion": "2021-09-30-preview",   // REST API version used
      "modelId": "prebuilt-invoice",        // ModelId used
      "stringIndexType": "textElements",    // Character unit used for string offsets and lengths: textElements, unicodeCodePoint, utf16CodeUnit
      // Concatenated content in global reading order across pages.
      // Words are generally delimited by space, except CJK (Chinese, Japanese, Korean) characters.
      // Lines and selection marks are generally delimited by newline character.
      // Selection marks are represented in Markdown emoji syntax (:selected:, :unselected:).
      "content": "CONTOSO LTD.\nINVOICE\nContoso Headquarters...",
      "pages": [                            // List of pages analyzed
        {
          // Basic page metadata
          "pageNumber": 1,                  // 1-indexed page number
          "angle": 0,                       // Orientation of content in clockwise direction (degree)
          "width": 0,                       // Page width
          "height": 0,                      // Page height
          "unit": "pixel",                  // Unit for width, height, and bounding box coordinates
          "spans": [                        // Parts of top-level content covered by page
            {
              "offset": 0,                  // Offset in content
              "length": 7                   // Length in content
            }
          ],
          // List of words in page
          "words": [
            {
              "content": "CONTOSO",         // Equivalent to $.content.Substring(span.offset, span.length)
              "boundingBox": [ ... ],       // Position in page
              "confidence": 0.99,           // Extraction confidence
              "span": { ... }               // Part of top-level content covered by word
            }, ...
          ],
          // List of selectionMarks in page
          "selectionMarks": [
            {
              "state": "selected",          // Selection state: selected, unselected
              "boundingBox": [ ... ],       // Position in page
              "confidence": 0.95,           // Extraction confidence
              "span": { ... }               // Part of top-level content covered by selection mark
            }, ...
          ],
          // List of lines in page
          "lines": [
            {
              "content": "CONTOSO LTD.",    // Concatenated content of line (may contain both words and selectionMarks)
              "boundingBox": [ ... ],       // Position in page
              "spans": [ ... ],             // Parts of top-level content covered by line
            }, ...
          ]
        }, ...
      ],
      // List of extracted tables
      "tables": [
        {
          "rowCount": 1,                    // Number of rows in table
          "columnCount": 1,                 // Number of columns in table
          "boundingRegions": [              // Bounding boxes potentially across pages covered by table
            {
              "pageNumber": 1,              // 1-indexed page number
              "boundingBox": [ ... ],       // Bounding box
            }
          ],
          "spans": [ ... ],                 // Parts of top-level content covered by table
          // List of cells in table
          "cells": [
            {
              "kind": "stubHead",           // Cell kind: content (default), rowHeader, columnHeader, stubHead, description
              "rowIndex": 0,                // 0-indexed row position of cell
              "columnIndex": 0,             // 0-indexed column position of cell
              "rowSpan": 1,                 // Number of rows spanned by cell (default=1)
              "columnSpan": 1,              // Number of columns spanned by cell (default=1)
              "content": "SALESPERSON",     // Concatenated content of cell
              "boundingRegions": [ ... ],   // Bounding regions covered by cell
              "spans": [ ... ]              // Parts of top-level content covered by cell
            }, ...
          ]
        }, ...
      ],
      // List of extracted key-value pairs
      "keyValuePairs": [
        {
          "key": {                          // Extracted key
            "content": "INVOICE:",          // Key content
            "boundingRegions": [ ... ],     // Key bounding regions
            "spans": [ ... ]                // Key spans
          },
          "value": {                        // Extracted value corresponding to key, if any
            "content": "INV-100",           // Value content
            "boundingRegions": [ ... ],     // Value bounding regions
            "spans": [ ... ]                // Value spans
          },
          "confidence": 0.95                // Extraction confidence
        }, ...
      ],
      // List of extracted entities
      "entities": [
        {
          "category": "DateTime",           // Primary entity category
          "subCategory": "Date",            // Secondary entity category
          "content": "11/15/2019",          // Entity content
          "boundingRegions": [ ... ],       // Entity bounding regions
          "spans": [ ... ],                 // Entity spans
          "confidence": 0.99                // Extraction confidence
        }, ...
      ],
      // List of extracted styles
      "styles": [
        {
          "isHandwritten": true,            // Is content in this style handwritten?
          "spans": [ ... ],                 // Spans covered by this style
          "confidence": 0.95                // Detection confidence
        }, ...
      ],
      // List of extracted documents
      "documents": [
        {
          "docType": "prebuilt:invoice",    // Classified document type (model dependent)
          "boundingRegions": [ ... ],       // Document bounding regions
          "spans": [ ... ],                 // Document spans
          "confidence": 0.99,               // Document splitting/classification confidence
          // List of extracted fields
          "fields": {
            "VendorName": {                 // Field name (docType dependent)
              "type": "string",             // Field value type: string, number, array, object, ...
              "valueString": "CONTOSO LTD.",// Normalized field value
              "content": "CONTOSO LTD.",    // Raw extracted field content
              "boundingRegions": [ ... ],   // Field bounding regions
              "spans": [ ... ],             // Field spans
              "confidence": 0.99            // Extraction confidence
            }, ...
          }
        }, ...
      ]
    }
}
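
Two parts of this result shape lend themselves to small helpers: resolving a span against the top-level `content`, and laying table cells out by `rowIndex` and `columnIndex`. The sketch below is one possible approach, with illustrative function names. It assumes span offsets count Python string characters (code points), which matches plain ASCII content but may differ from the `textElements` index type for text with combining characters, and it writes a spanning cell's content only at its top-left position.

```python
def text_for_span(content: str, span: dict) -> str:
    """Return the slice of the top-level content covered by a span."""
    # Assumption: offset and length count Python str characters (code points).
    start = span["offset"]
    return content[start:start + span["length"]]


def table_to_grid(table: dict) -> list:
    """Build a rowCount x columnCount grid of cell contents."""
    grid = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        # A spanning cell is written only at its top-left position.
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return grid


content = "CONTOSO LTD.\nINVOICE"
print(text_for_span(content, {"offset": 0, "length": 7}))

sample_table = {
    "rowCount": 2,
    "columnCount": 2,
    "cells": [
        {"kind": "stubHead", "rowIndex": 0, "columnIndex": 0, "content": "SALESPERSON"},
        {"rowIndex": 1, "columnIndex": 1, "content": "INV-100"},
    ],
}
for row in table_to_grid(sample_table):
    print(row)
```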

Response 404

The top-level error.code property can be one of the following:

Error Code | Message
NotFound | Resource not found.

When possible, additional details are specified in the innererror property:

Top Error Code | Inner Error Code | Message
NotFound | OperationNotFound | The requested operation was not found. The identifier may be invalid or the operation may have expired.

{
  "error": {
    "code": "NotFound",
    "message": "Resource not found.",
    "innererror": {
      "code": "OperationNotFound",
      "message": "The requested operation was not found. The identifier may be invalid or the operation may have expired."
    }
  }
}

Response 500

The top-level error.code property can be one of the following:

Error Code | Message
InternalServerError | An unexpected error occurred.

When possible, additional details are specified in the innererror property:

Top Error Code | Inner Error Code | Message
InternalServerError | Unknown | Unknown error.

{
  "error": {
    "code": "InternalServerError",
    "message": "An unexpected error occurred.",
    "innererror": {
      "code": "Unknown",
      "message": "Unknown error."
    }
  }
}

Response 503

The top-level error.code property can be one of the following:

Error Code | Message
ServiceUnavailable | A transient error has occurred. Please try again.

When possible, additional details are specified in the innererror property:

Top Error Code | Inner Error Code | Message
ServiceUnavailable | ServiceUnavailable | A transient error has occurred. Please try again.

{
  "error": {
    "code": "ServiceUnavailable",
    "message": "A transient error has occurred. Please try again.",
    "innererror": {
      "code": "ServiceUnavailable",
      "message": "A transient error has occurred. Please try again."
    }
  }
}
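
Since a 503 is described as transient, a client may simply retry. The sketch below retries with exponential backoff; it assumes a hypothetical `send` callable that performs the request and returns a `(status_code, body)` pair, and the attempt count and delays are illustrative rather than service recommendations.

```python
import time


def get_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-503 status or attempts run out.

    send is a hypothetical callable returning (status_code, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
    # Every attempt returned 503; hand the last response back to the caller.
    return status, body
```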

Code samples

Curl

@ECHO OFF

curl -v -X GET "https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview" -H "Ocp-Apim-Subscription-Key: {subscription key}"

C#

using System;
using System.Net.Http;
using System.Threading.Tasks;

namespace CSHttpClientSample
{
    static class Program
    {
        static async Task Main()
        {
            // Await the request so the program does not exit before it completes
            await MakeRequest();
            Console.WriteLine("Hit ENTER to exit...");
            Console.ReadLine();
        }

        static async Task MakeRequest()
        {
            var client = new HttpClient();

            // Request headers
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "{subscription key}");

            var uri = "https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview";

            // Send the GET request and print the JSON response body
            var response = await client.GetAsync(uri);
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}

Java

// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class JavaSample 
{
    public static void main(String[] args) 
    {
        HttpClient httpclient = HttpClients.createDefault();

        try
        {
            URIBuilder builder = new URIBuilder("https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview");


            URI uri = builder.build();
            HttpGet request = new HttpGet(uri);
            request.setHeader("Ocp-Apim-Subscription-Key", "{subscription key}");


            // Send the GET request (HttpGet carries no request body)
            HttpResponse response = httpclient.execute(request);
            HttpEntity entity = response.getEntity();

            if (entity != null) 
            {
                System.out.println(EntityUtils.toString(entity));
            }
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}

JavaScript

<!DOCTYPE html>
<html>
<head>
    <title>JSSample</title>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
    $(function() {
        var params = {
            // Request parameters
        };
      
        $.ajax({
            url: "https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview&" + $.param(params),
            beforeSend: function(xhrObj){
                // Request headers
                xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key","{subscription key}");
            },
            type: "GET"
        })
        .done(function(data) {
            alert("success");
        })
        .fail(function() {
            alert("error");
        });
    });
</script>
</body>
</html>
Objective-C

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    
    NSString* path = @"https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview";
    NSArray* array = @[
                         // Request parameters
                         @"entities=true",
                      ];
    
    NSString* string = [array componentsJoinedByString:@"&"];
    // The path already contains "?api-version=...", so further parameters are joined with "&"
    path = [path stringByAppendingFormat:@"&%@", string];

    NSLog(@"%@", path);

    NSMutableURLRequest* _request = [NSMutableURLRequest requestWithURL:[NSURL URLWithString:path]];
    [_request setHTTPMethod:@"GET"];
    // Request headers
    [_request setValue:@"{subscription key}" forHTTPHeaderField:@"Ocp-Apim-Subscription-Key"];
    
    NSURLResponse *response = nil;
    NSError *error = nil;
    NSData* _connectionData = [NSURLConnection sendSynchronousRequest:_request returningResponse:&response error:&error];

    if (nil != error)
    {
        NSLog(@"Error: %@", error);
    }
    else
    {
        NSError* error = nil;
        NSMutableDictionary* json = nil;
        NSString* dataString = [[NSString alloc] initWithData:_connectionData encoding:NSUTF8StringEncoding];
        NSLog(@"%@", dataString);
        
        if (nil != _connectionData)
        {
            json = [NSJSONSerialization JSONObjectWithData:_connectionData options:NSJSONReadingMutableContainers error:&error];
        }
        
        if (error || !json)
        {
            NSLog(@"Could not parse loaded json with error:%@", error);
        }
        
        NSLog(@"%@", json);
        _connectionData = nil;
    }
    
    [pool drain];

    return 0;
}
PHP

<?php
// This sample uses the PEAR HTTP_Request2 package (https://pear.php.net/package/HTTP_Request2)
require_once 'HTTP/Request2.php';

$request = new HTTP_Request2('https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview');
$url = $request->getUrl();

$headers = array(
    // Request headers
    'Ocp-Apim-Subscription-Key' => '{subscription key}',
);

$request->setHeader($headers);

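$parameters = array(
    // Request parameters
    // api-version is repeated here so it survives setQueryVariables(), which sets the URL's query string from this array
    'api-version' => '2021-09-30-preview',
);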

$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_GET);

try
{
    $response = $request->send();
    echo $response->getBody();
}
catch (HTTP_Request2_Exception $ex)
{
    echo $ex;
}

?>
Python

########### Python 2.7 #############
import httplib, urllib

headers = {
    # Request headers
    'Ocp-Apim-Subscription-Key': '{subscription key}',
}

params = urllib.urlencode({
})

try:
    conn = httplib.HTTPSConnection('*.cognitiveservices.azure.us')
    # GET requests carry no body, so only the headers are passed
    conn.request("GET", "/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview&%s" % params, headers=headers)
    response = conn.getresponse()
    data = response.read()
    print(data)
    conn.close()
except Exception as e:
    # Not every exception has errno/strerror, so print the exception itself
    print(e)

####################################

########### Python 3.2 #############
import http.client, urllib.parse

headers = {
    # Request headers
    'Ocp-Apim-Subscription-Key': '{subscription key}',
}

params = urllib.parse.urlencode({
})

try:
    conn = http.client.HTTPSConnection('*.cognitiveservices.azure.us')
    # GET requests carry no body, so only the headers are passed
    conn.request("GET", "/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2021-09-30-preview&%s" % params, headers=headers)
    response = conn.getresponse()
    data = response.read()
    print(data)
    conn.close()
except Exception as e:
    # Not every exception has errno/strerror, so print the exception itself
    print(e)

####################################
Ruby

require 'net/http'

uri = URI('https://*.cognitiveservices.azure.us/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}')
# Assigning uri.query replaces the whole query string, so api-version is set here
uri.query = URI.encode_www_form({
    'api-version' => '2021-09-30-preview'
})

request = Net::HTTP::Get.new(uri.request_uri)
# Request headers
request['Ocp-Apim-Subscription-Key'] = '{subscription key}'

response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
    http.request(request)
end

puts response.body