Document AI

The Document AI connector is an extraction engine that transforms unstructured visual documents, such as PDFs, invoices, and business cards, into structured, actionable database records.

Unlike Optical Character Recognition (OCR), which simply converts images to raw text, Document AI uses multimodal LLMs to understand the context, relationships, and semantics within a document. This allows the system to intelligently identify specific fields (such as "Company Name" or "Expiry Date") even across varying document layouts.

To ensure maximum precision and flexibility, the Document AI connector operates through a coordinated two-step workflow. This architecture allows you to decouple the visual extraction from the data formatting, enabling the use of different LLMs optimized for each specific task.
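The decoupled two-step flow can be sketched conceptually as below. The function names, stub logic, and data shapes are illustrative assumptions for this sketch only; they are not part of the iMbrace API.

```python
# Conceptual sketch of the Document AI pipeline: an extraction step
# (LLM 1) produces raw fields, and a separate formatting step (LLM 2)
# maps those fields onto a Data Board schema. Both steps are stubbed
# with deterministic logic purely for illustration.

def extract_fields(document_text, instructions):
    """Step 1: turn the document into raw key/value data.

    A real implementation would send the document and your extraction
    instructions to the multimodal LLM chosen in the first drop-down;
    here we just parse 'Key: Value' lines.
    """
    raw = {}
    for line in document_text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            raw[key.strip()] = value.strip()
    return raw

def map_to_board(raw_fields, column_mapping):
    """Step 2: reformat the raw data to match the target board's columns.

    A real implementation would use the second LLM and your mapping
    prompt; here the mapping is a direct rename.
    """
    return {column: raw_fields.get(source)
            for source, column in column_mapping.items()}

invoice = "Client: Acme Ltd.\nTotal: 120.00"
raw = extract_fields(invoice, "Extract the client and total amount.")
row = map_to_board(raw, {"Client": "Name", "Total": "Amount"})
print(row)  # {'Name': 'Acme Ltd.', 'Amount': '120.00'}
```

Because the two steps are separate, you can pick a vision-strong model for extraction and a cheaper text model for formatting.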

Step 1: Data Extraction

In this initial stage, the AI "sees" the document and identifies the relevant information based on your instructions.

  1. Go to Workflows > Create New Workflow > Add Document AI as a New Connector.

  2. Once added, click the Document AI node to open its settings drawer.

  3. Select AI Model: Choose an appropriate LLM from the first drop-down.

  4. Input Extraction Instructions: Provide specific instructions to the model regarding what information it should identify. Example: "Extract the name, company, and total amount from this invoice."

  5. Input File Link: Provide the variable or URL for the document to be processed (e.g., the file uploaded by the user in the chat).

  6. Preferred Language: Specify the language of the document to improve extraction accuracy (e.g., "English" or "Auto-detect").

The AI uses these instructions to transform the visual image into raw data.

Step 2: Data Board Schema Mapping

Once the data is extracted, this step formats it for your database. Note: This section is only required if you want to insert the data into a Data Board.

  1. Toggle Data Board Option: Switch the Data Board option to ON. If this is OFF, the mapping configuration is hidden.

  2. Select the exact Data Board where the information should be inserted.

  3. Choose an LLM from the second drop-down to handle the final data formatting. This can be the same model used in Step 1 or a different one, depending on the functionality you need.

  4. In the second prompt box, define how the extracted information maps to your board columns (e.g., mapping "Client" to the "Name" column).

  5. Advanced Configurations:

  • Time Offset: Add a specific time period to the entry if required for your reporting.

  • Concurrent: Enable this to sync the extracted data to multiple Data Boards simultaneously.

  • Retry Failed Requests: Enable this to automatically retry failed extraction requests, and specify the maximum number of retries in the next field.

Once configured, execute the connector to verify the output and ensure the data maps correctly to your board.
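The Retry Failed Requests option behaves like a standard bounded retry loop. The sketch below is a generic illustration of that pattern, not iMbrace code; the helper name and the simulated failure are assumptions made for the example.

```python
# Generic retry loop illustrating the "Retry Failed Requests" setting:
# re-run a failing request up to a configured number of extra attempts.

def with_retries(request, max_retries):
    """Call `request`; on failure, retry up to `max_retries` more times."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RuntimeError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure

# Simulate an extraction that fails twice, then succeeds on the third call.
calls = {"count": 0}
def flaky_extraction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("extraction failed")
    return {"Name": "Acme Ltd."}

result = with_retries(flaky_extraction, max_retries=3)
print(result)  # {'Name': 'Acme Ltd.'} after three attempts
```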

Deployment Options

There are two ways to implement Document AI within your iMbrace ecosystem:

1. Channel Workflow

This is the standard implementation for a linear AI agent.

  • The connector is placed directly within a specific communication channel's workflow.

  • The extraction triggers automatically when a user sends a document.

2. Tool Calling (Agentic AI)

Used for more complex, non-linear structures (like the Business Card Scanner in the OCR Demo).

  • The Document AI is added as a Capability under the AI Assistant.

  • This allows the AI agent to "decide" to call the Document AI tool only when it determines that document extraction is necessary to fulfill a user's request.
