Documents in.
Governed metadata out.
A small team of AI agents that classify your documents, extract structured metadata, and stitch it to your business glossary.
Your governance team does the thinking. IDA does the typing.
Five agents. One pipeline. Your glossary, populated.
Collect
Drop in SDLC documents — PDFs, Word, Excel, CSV. No connectors required.
Classify
The Librarian agent reads each document, tags type and SDLC stage, and scores trustworthiness.
Extract
Specialised agents populate inventories — systems, pipelines, logical and physical data elements.
Stitch
An embeddings model proposes matches between business glossary terms and database columns.
Govern
Your team reviews and approves matches. The verified glossary becomes the source of truth.
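To make the Stitch and Govern steps concrete, here is a minimal sketch of how proposed matches might be scored and held for review. It is an illustration under stated assumptions, not IDA's implementation: the `embed` function is a toy stand-in that counts character trigrams so the example runs without downloading a model, and `ProposedMatch`, `propose_matches`, and the `threshold` value are hypothetical names chosen for this example.

```python
import math
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    """Toy stand-in for an embeddings model: character-trigram counts (assumption)."""
    s = f"  {text.lower().replace('_', ' ')} "
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class ProposedMatch:
    term: str
    column: str
    score: float
    status: str = "pending"  # a human reviewer later sets "approved" or "rejected"


def propose_matches(terms, columns, threshold=0.3):
    """Propose the best-scoring column for each glossary term, if it clears the threshold."""
    proposals = []
    for term in terms:
        term_vec = embed(term)
        best = max(columns, key=lambda c: cosine(term_vec, embed(c)))
        score = cosine(term_vec, embed(best))
        if score >= threshold:
            proposals.append(ProposedMatch(term, best, round(score, 3)))
    return proposals


if __name__ == "__main__":
    for match in propose_matches(
        ["Customer Name", "Account Balance"],
        ["customer_name", "account_balance", "txn_date"],
    ):
        print(match)
```

Every proposal starts as `pending`, which mirrors the page's point that nothing becomes part of the verified glossary until your team approves it.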
Four inventories. One source of truth.
Everything IDA extracts lands in the right place — systems and pipelines on one side, the data elements they carry on the other.
Inventories
Asset management
Systems
Data Pipelines
Pipeline Mapping Specs
Logical Data Elements
Physical Data Elements
DQ Rule Results
EUC / EUDA
Coming soon
AI Models
Coming soon
What you do with the inventories.
Inventories are the raw material. The tools are how you turn them into a verified glossary, a lineage graph, and the trust signals your governance team actually needs.
Tools
Data utilities
Data Lineage
Interactive flow mapping
Business Glossary
Per-system match & approve
Global Glossary
Enterprise-wide consolidation
Data Profiling
Coming soon
Quality Rules
Coming soon
Built for the messy reality of enterprise data.
Librarian Agent
Automatically classifies documents by SDLC stage, assigns trust scores, and routes them to specialist extractors.
Extraction Agents
Domain-expert AI agents that understand technical specs, requirements docs, and data dictionaries.
Semantic Stitching
Sentence transformers automatically propose matches between business terms and physical database elements, ready for your team to review.
Global Glossary
Deduplicated single source of truth with full lineage back to source documents.
Lineage Graphs
Interactive visualisation connecting systems, pipelines, and data elements end-to-end.
Human Oversight
Review, approve, and audit every AI decision. You stay in control.
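The Librarian Agent's classify, score, and route flow can be pictured with a small sketch. Everything here is an assumption made for illustration: simple keyword rules stand in for the AI classification, the trust score is just the share of keyword hits that support the chosen stage, and the stage names, extractor names, and `classify` function are hypothetical rather than IDA's actual taxonomy or API.

```python
from dataclasses import dataclass

# Hypothetical keyword rules standing in for the Librarian's AI classification.
STAGE_KEYWORDS = {
    "requirements": ("requirement", "user story", "acceptance criteria"),
    "design": ("data dictionary", "schema", "mapping spec"),
    "testing": ("test case", "test plan", "defect"),
}

# Hypothetical routing table from SDLC stage to a specialist extractor.
EXTRACTOR_FOR_STAGE = {
    "requirements": "logical_elements_extractor",
    "design": "physical_elements_extractor",
    "testing": "pipeline_extractor",
}


@dataclass
class Classification:
    stage: str
    trust: float
    extractor: str


def classify(text: str) -> Classification:
    """Tag a document with an SDLC stage, a trust score, and a target extractor."""
    lowered = text.lower()
    hits = {
        stage: sum(lowered.count(keyword) for keyword in keywords)
        for stage, keywords in STAGE_KEYWORDS.items()
    }
    total = sum(hits.values())
    if total == 0:
        # No evidence at all: send the document to a human instead of guessing.
        return Classification("unknown", 0.0, "manual_review")
    stage = max(hits, key=hits.get)
    trust = round(hits[stage] / total, 2)  # share of evidence supporting the chosen stage
    return Classification(stage, trust, EXTRACTOR_FOR_STAGE[stage])


if __name__ == "__main__":
    print(classify("Data dictionary for the payments schema"))
    print(classify("Meeting notes"))
```

Sending zero-evidence documents to `manual_review` reflects the Human Oversight principle above: when the classifier has nothing to go on, a person decides.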
Run it in bulk. Or one document at a time.
Use bulk for scale. Use chat for the documents you don't want to leave to a job queue.
Bulk processing
Automated at scale.
Drop in hundreds of documents and let the pipeline run. Classification, extraction, stitching — all happen in the background. Approve in batches when you're ready.
- Hundreds of documents in one run
- Background processing queue
- Batch approval workflows
Chat-driven
One document, total control.
Walk through a single complex document with the agent. Ask follow-up questions, refine extractions in real time, and commit only what you've reviewed.
- Deep-dive on complex documents
- Ask follow-up questions
- Fine-tune before committing
Or combine both. Bulk for the long tail, chat for the documents that matter.
A different league.
Our estimates, based on benchmark data for typical mid-tier banks compared with an IDA pilot. Your mileage will vary.
| | Manual | Scanners | IDA |
|---|---|---|---|
| Time to BAU | 5 years | 2 years | 6 months |
| Cost to BAU | £52M | £31M | £1.25M |
| Accuracy | 50% | 85% | 80% |
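For readers who want the ratios behind the table, the short calculation below works them out. It uses only the figures shown above, with the times converted to months; the `ESTIMATES` dictionary and `versus_ida` function are hypothetical names for this example, and the results inherit whatever uncertainty the underlying estimates carry.

```python
# Figures copied from the comparison table: (months to BAU, cost in £M, accuracy %).
ESTIMATES = {
    "Manual": (60, 52.0, 50),
    "Scanners": (24, 31.0, 85),
    "IDA": (6, 1.25, 80),
}


def versus_ida(approach: str) -> dict:
    """Compare one approach against IDA using the table's estimates."""
    months, cost, accuracy = ESTIMATES[approach]
    ida_months, ida_cost, ida_accuracy = ESTIMATES["IDA"]
    return {
        "times_faster": round(months / ida_months, 1),
        "times_cheaper": round(cost / ida_cost, 1),
        "accuracy_points": ida_accuracy - accuracy,  # negative means IDA trails
    }


if __name__ == "__main__":
    for approach in ("Manual", "Scanners"):
        print(approach, versus_ida(approach))
```

On these estimates, IDA reaches BAU 10 times faster than the manual approach at under a fortieth of the cost, and 4 times faster than scanners while trailing them by 5 accuracy points.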
Three months. Your documents. Real metadata.
A focused engagement to prove IDA on your own data estate. We bring the agents, you bring the documents, and at the end you have a populated glossary you can keep.
Numbers below are from a recent pilot with a UK bank.
Modern stack. Boringly reliable.
Frontend
- React
- Next.js
- Tailwind
Backend
- Lambda
- API Gateway
- AppSync
Storage
- S3
- DynamoDB
- Cognito
AI/ML
- Claude
- Transformers
- MiniLM
See it on your documents.
We'll run IDA over a few of your real files in a 30-minute call. No setup, no commitment.
Book a demo