Financial institutions process thousands of documents every day, including tax forms, mortgage statements, and purchase orders. Each has a unique format, structure, and set of field names, which makes it difficult to build automation workflows with optical character recognition (OCR) software. Amazon Bedrock Data Automation (BDA) helps solve these challenges by automating the extraction, validation, and analysis of data from financial documents. BDA goes beyond simple OCR by using foundation models that can:
- Understand document context
- Recognize relationships between different sections
- Extract structured, actionable data
- Validate information across multiple sources
Although foundation models like Anthropic Claude can extract content from PDFs, Amazon Bedrock Data Automation offers custom extractions with industry-leading accuracy at a lower cost, along with features such as visual grounding with confidence scores for explainability and built-in hallucination mitigation.
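As an illustration of how confidence scores might be used downstream, the following minimal sketch routes low-confidence fields to human review. The flat `{field: {"value", "confidence"}}` layout, the field names, and the 0.8 threshold are assumptions made for this example and are not the actual BDA output schema.

```python
from typing import Any

# Hypothetical, simplified view of extracted fields with confidence scores.
# The real BDA output layout may differ; adapt the accessors accordingly.
ExtractedFields = dict[str, dict[str, Any]]


def split_by_confidence(
    fields: ExtractedFields, threshold: float = 0.8
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (accepted, needs_review) values keyed by field name."""
    accepted: dict[str, Any] = {}
    needs_review: dict[str, Any] = {}
    for name, info in fields.items():
        # Treat a missing confidence as 0.0 so the field gets reviewed.
        confidence = float(info.get("confidence", 0.0))
        target = accepted if confidence >= threshold else needs_review
        target[name] = info.get("value")
    return accepted, needs_review


# Illustrative values only, not real extraction results.
sample = {
    "account_number": {"value": "XXXX-1234", "confidence": 0.97},
    "statement_period": {"value": "Jan 2025", "confidence": 0.55},
}
accepted, needs_review = split_by_confidence(sample)
```

The threshold is a workflow decision: a stricter value sends more fields to reviewers, while a looser one accepts more values automatically.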
In this post, we explore how Amazon Bedrock Data Automation can accurately extract information from four common types of financial documents: bank statements, W-2 forms, 1099-B tax forms, and vendor contracts. We highlight the complexity in each document, detail the custom extraction created in Amazon Bedrock Data Automation, and describe the results of the extraction process.
Solution overview
Amazon Bedrock Data Automation lets you configure output based on your processing needs by using blueprints. A blueprint in Amazon Bedrock Data Automation is a configuration template that defines how data should be extracted from documents. It specifies:
- The document type being processed
- The data fields to be extracted
- The validation rules for the extracted data
- The structure and format of the output
Think of it as a map that tells Amazon Bedrock Data Automation exactly what information to look for and how to process it. For extraction, you can use either a catalog blueprint or a custom blueprint. A custom blueprint lets organizations create extraction patterns for their specific needs. In this post, we created custom blueprints and used the BDA console to generate and validate the output.
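To make the idea of a blueprint concrete, here is a minimal sketch that assembles a blueprint-style schema as a Python dictionary and serializes it to JSON. The key names (`class`, `inferenceType`, `instruction`) follow our understanding of the BDA blueprint schema and should be verified against the documentation; the helper `build_blueprint_schema` and the field names are hypothetical and are not the blueprints used in this post.

```python
import json


def build_blueprint_schema(
    doc_class: str, description: str, fields: dict[str, str]
) -> dict:
    """Assemble a blueprint-style schema from {field_name: instruction}."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "description": description,
        "class": doc_class,
        "type": "object",
        "properties": {
            name: {
                "type": "string",
                # Assumed marker for values read directly from the document.
                "inferenceType": "explicit",
                "instruction": instruction,
            }
            for name, instruction in fields.items()
        },
    }


# Hypothetical fields for illustration only.
schema = build_blueprint_schema(
    doc_class="bank-statement",
    description="Extract account details from a bank statement.",
    fields={
        "account_holder": "The name of the account holder.",
        "statement_period": "The date range covered by the statement.",
    },
)
schema_json = json.dumps(schema, indent=2)
```

Keeping each instruction next to its field name makes it easier to refine prompts one field at a time, which mirrors how the prompts are refined on the console.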

How to develop blueprints for four types of financial documents
The following sections walk you through creating custom blueprints for bank statements, W-2 forms, 1099-B forms, and vendor contracts.
Prerequisites
If you’re not familiar with how custom blueprints are created, follow the instructions in the Amazon Bedrock documentation. For our evaluation, we uploaded the documents on the BDA console, refined the AI-generated prompts, and downloaded the results. Typically, a single custom blueprint suffices for a specific document type when the fields to extract are consistent. However, if workflow requirements vary or document formats change significantly, you might need multiple custom blueprints to accommodate these variations. After a blueprint is created, you can use it as part of the workflow for consistent downstream processing. With the same blueprint, BDA might return slightly different output when input documents contain different data (for example, some bank statements include total debits and credits). However, because BDA output is structured JSON, it’s straightforward to create appropriate rules for the downstream processing workflow (for example, discard totals if the workflow categorizes individual debit and credit transactions for accounting).
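The kind of downstream rule mentioned above can be sketched in a few lines. This example drops total and subtotal rows so that only individual debit and credit transactions remain. The row layout, the `description` key, and the prefix heuristic are assumptions for illustration; a real workflow would match whatever fields its blueprint defines.

```python
def drop_summary_rows(transactions: list[dict]) -> list[dict]:
    """Keep individual transaction rows and discard total/subtotal rows."""
    summary_prefixes = ("total", "subtotal")
    kept = []
    for row in transactions:
        description = str(row.get("description", "")).strip().lower()
        # Assumed heuristic: summary rows start with "total" or "subtotal".
        if description.startswith(summary_prefixes):
            continue
        kept.append(row)
    return kept


# Illustrative rows only, not real extraction output.
sample_rows = [
    {"description": "ATM withdrawal", "debit": "60.00"},
    {"description": "Payroll deposit", "credit": "2500.00"},
    {"description": "Total debits", "debit": "60.00"},
]
individual_rows = drop_summary_rows(sample_rows)
```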
The following screenshot illustrates the blueprint prompt for one of the document types.

The next section describes the four documents we tried as part of this project and the extraction achieved with custom blueprints tailored to each need. Output is available in JSON, CSV, and raw data formats, which lets the solution fit diverse integration and reporting needs.
Financial document types and custom blueprints
Amazon Bedrock Data Automation provides built-in blueprints for common document types, including bank statements and W-2 forms. These built-in blueprints offer comprehensive extraction out of the box. In this post, we use custom blueprints to demonstrate how organizations can tailor extraction to their specific workflow requirements. For example, you can extract only transaction data from bank statements for automated accounting, or group W-2 fields into logical structures (federal tax, state tax, code-amount pairs) that align with downstream tax processing systems. Custom blueprints are also the approach for document types that don’t have built-in blueprints, such as the 1099-B forms and vendor contracts shown later in this post.
1. Bank statements – Documents from banks detailing an account’s financial activity, including deposits, withdrawals, and fees, over a specific period, typically a month.
Bank statements present a complex challenge: they contain numerous monthly transactions, often spanning multiple pages, with varying formats and details. In many workflows, the critical task is to precisely capture transaction data, including dates, amounts, descriptions, and reference numbers, which can then feed directly into automated accounting workflows such as categorizing transactions in an accounting ledger. This automated extraction minimizes manual data entry errors and streamlines the reconciliation process. As part of our evaluation, we selected the following bank statement for a trial of the extraction process:

Account statement generated using the Amazon Nova Pro foundation model
Tailored blueprint instructions for Amazon Bedrock Data Automation:
Extraction results from table.csv:

Upon review, we confirmed that the system extracted the transactions accurately.
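As a sketch of how a transaction table like this could feed an accounting workflow, the following example parses CSV text into typed rows and computes the net change for the period. The column names (`date`, `description`, `debit`, `credit`) and the sample rows are assumptions for illustration and do not reproduce the actual table.csv produced in our trial.

```python
import csv
import io
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Convert '1,250.00' or '$45.10' to Decimal; blank cells become 0."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned) if cleaned else Decimal("0")


def load_transactions(csv_text: str) -> list[dict]:
    """Read CSV text into rows with Decimal debit and credit amounts."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "date": record["date"],
            "description": record["description"],
            "debit": parse_amount(record["debit"]),
            "credit": parse_amount(record["credit"]),
        })
    return rows


# Illustrative data only, with hypothetical column names.
sample_csv = (
    "date,description,debit,credit\n"
    '2025-01-03,Payroll deposit,,"2,500.00"\n'
    "2025-01-05,Utility payment,120.50,\n"
)
transactions = load_transactions(sample_csv)
# Net change = sum of credits minus sum of debits.
net_change = sum(
    (t["credit"] - t["debit"] for t in transactions), Decimal("0")
)
```

Using `Decimal` rather than `float` avoids rounding drift when the amounts are later reconciled against a ledger.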
2. Form W-2 – Reports income and tax withheld for an individual or a business.
W-2 tax forms present unique extraction challenges because of their standardized yet complex structure. As part of our evaluation, we used the following W-2 for a trial of the extraction process:

W-2 generated using the Amazon Nova Pro foundation model
Tailored blueprint instructions for Amazon Bedrock Data Automation:
Extraction results from result.json:


Upon review, we confirmed that the system extracted the fields accurately. Several extraction complexities were specifically verified in the project:
- The form has no specific grouping for federal tax and state tax information, but these values need to be processed together, so the extraction results should bring them together.
- A single Box 12 on the W-2 can carry any of up to 26 codes that report certain compensation and benefit amounts. It’s important to extract each code and its value as a pair.
- Employers can put almost anything in Box 14. It acts as a catch-all for items that don’t have their own dedicated box on the W-2, so these items should be grouped separately.
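The three grouping requirements above can be pictured as a target data structure. The sketch below shows one possible shape: federal and state values in separate groups, Box 12 entries kept as code-amount pairs, and Box 14 items collected on their own. The flat input keys and the `"CODE AMOUNT"` string format are assumptions for illustration, not the field names from our blueprint.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class W2Record:
    federal: dict[str, Decimal] = field(default_factory=dict)
    state: dict[str, Decimal] = field(default_factory=dict)
    box12: list[tuple[str, Decimal]] = field(default_factory=list)
    box14_other: list[str] = field(default_factory=list)


def group_w2(flat: dict[str, str]) -> W2Record:
    """Group flat, hypothetical W-2 fields into logical sections."""
    record = W2Record()
    for key, value in flat.items():
        if key.startswith("federal_"):
            record.federal[key[len("federal_"):]] = Decimal(value)
        elif key.startswith("state_"):
            record.state[key[len("state_"):]] = Decimal(value)
        elif key.startswith("box12"):
            # Keep the code and its amount together as one pair.
            code, amount = value.split(maxsplit=1)
            record.box12.append((code, Decimal(amount)))
        elif key.startswith("box14"):
            # Free-form employer entries stay as text in their own group.
            record.box14_other.append(value)
    return record


# Illustrative values only.
record = group_w2({
    "federal_wages": "52000.00",
    "federal_income_tax_withheld": "6100.00",
    "state_wages": "52000.00",
    "state_income_tax": "2100.00",
    "box12a": "D 1500.00",
    "box12b": "DD 8200.00",
    "box14_1": "Union dues 240.00",
})
```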
3. IRS Form 1099-B: Proceeds from Broker and Barter Exchange Transactions – This tax document tracks:
- Securities trading activity
- Broker-facilitated transactions
- Barter exchange participation
As part of our evaluation, we used the following 1099-B for a trial of the extraction process:

1099-B statement generated using the Amazon Nova Pro foundation model
Tailored blueprint instructions for Amazon Bedrock Data Automation:
Extraction results from table.csv:

A significant validation of BDA’s contextual understanding is that the system accurately identified and extracted ‘TSLA’ as the security descriptor for each of the stock transactions, even though it appeared in the document as a common descriptor shared by those transactions. This consistent extraction demonstrates BDA’s ability to maintain contextual accuracy throughout document processing.
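A check like the one described above can be automated. This minimal sketch reports which extracted rows do not carry the expected security descriptor. The `security` key and the sample rows are assumptions for illustration; only the ‘TSLA’ descriptor comes from our trial document.

```python
def rows_missing_descriptor(rows: list[dict], expected: str) -> list[int]:
    """Return indexes of rows whose security descriptor is absent or different."""
    wanted = expected.strip().upper()
    return [
        index
        for index, row in enumerate(rows)
        # Compare case-insensitively and ignore surrounding whitespace.
        if str(row.get("security", "")).strip().upper() != wanted
    ]


# Illustrative rows only; the third row simulates a missed descriptor.
sample_rows = [
    {"security": "TSLA", "quantity": "10"},
    {"security": "tsla ", "quantity": "5"},
    {"security": "", "quantity": "3"},
]
problem_rows = rows_missing_descriptor(sample_rows, "TSLA")
```

An empty result means every transaction row carries the shared descriptor, which is the behavior we observed in our trial.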
4. Vendor contract – This extraction process applies to a wide range of vendor contracts. The specific details to capture should be tailored to each company’s operational workflows and requirements.
As part of our evaluation, we selected the following vendor contract for a trial of the extraction process:




Tailored blueprint instructions for Amazon Bedrock Data Automation:
Extraction results from result.json:

The system successfully identified and extracted the blueprint-specified elements present in the contract.
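One way to confirm that blueprint-specified elements are present is a simple completeness check on the JSON output. The sketch below assumes a flat JSON object keyed by field name, which is a simplification of the real result.json layout, and the field names are hypothetical.

```python
import json


def missing_fields(result_json: str, required: list[str]) -> list[str]:
    """Return required field names that are absent or empty in the result."""
    data = json.loads(result_json)
    empty_values = (None, "", [], {})
    return [name for name in required if data.get(name) in empty_values]


# Illustrative JSON only, with hypothetical contract fields.
sample_result = json.dumps({
    "vendor_name": "Example Vendor LLC",
    "effective_date": "2025-01-01",
    "termination_clause": "",
})
gaps = missing_fields(
    sample_result,
    ["vendor_name", "effective_date", "termination_clause", "payment_terms"],
)
```

A field reported here is not necessarily an extraction error, because the contract itself might not contain that element.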
Conclusion
In this post, we demonstrated how you can use Amazon Bedrock Data Automation to accurately extract key information from financial documents, including bank statements, W-2 forms, 1099-B forms, and vendor contracts, to automate downstream processing. You learned how to:
- Create custom blueprints for different document types
- Extract structured data from complex financial documents
- Validate Amazon Bedrock Data Automation outputs for downstream processing
To learn more about implementing document processing with Amazon Bedrock, review the Amazon Bedrock Data Automation documentation. For production workflows involving sensitive information, follow your organization’s cybersecurity and legal guidelines to verify compliance with all applicable regulations, including but not limited to GDPR in Europe and any other regional or industry-specific requirements.
About the authors

