Document AI Auditability

Every extracted answer comes with proof.

Ninjadoc returns structured document answers with the source evidence attached: page index, bounding box, citation URL, cropped citation URL, and a confidence score. Your application can show where each value came from instead of asking users to trust a black box.

Evidence object

{
  "question": "What is the total amount due?",
  "answer": "$4,250.00",
  "confidence": 0.99,
  "evidence": [{
    "page_index": 0,
    "evidence_texts": ["Total Amount Due: $4,250.00"],
    "citation_url": "https://api.ninjadoc.ai/citations/01JV.../0/page.jpg",
    "citation_url_cropped": "https://api.ninjadoc.ai/citations/01JV.../0/cropped.jpg",
    "located": [{
      "boxes": [{
        "normalized": { "x1": 680, "y1": 820, "x2": 890, "y2": 850 },
        "pixel": { "x1": 420, "y1": 680, "x2": 540, "y2": 700 }
      }]
    }]
  }]
}
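A minimal sketch of how an application might flatten this response into one record per located box. It assumes only the field names shown in the sample above; the `Citation` class and the `citations_from_response` helper are hypothetical names chosen for illustration, not part of a Ninjadoc SDK.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Citation:
    """One located region that backs an extracted answer (hypothetical type)."""
    page_index: int
    text: str
    citation_url: str
    cropped_url: str
    pixel_box: Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def citations_from_response(response: dict) -> List[Citation]:
    """Flatten the nested evidence -> located -> boxes structure."""
    citations = []
    for ev in response.get("evidence", []):
        text = " | ".join(ev.get("evidence_texts", []))
        for located in ev.get("located", []):
            for box in located.get("boxes", []):
                px = box["pixel"]
                citations.append(Citation(
                    page_index=ev["page_index"],
                    text=text,
                    citation_url=ev["citation_url"],
                    cropped_url=ev["citation_url_cropped"],
                    pixel_box=(px["x1"], px["y1"], px["x2"], px["y2"]),
                ))
    return citations


# Same payload as the sample above; the elided URLs are kept as shown.
SAMPLE = json.loads("""
{
  "question": "What is the total amount due?",
  "answer": "$4,250.00",
  "confidence": 0.99,
  "evidence": [{
    "page_index": 0,
    "evidence_texts": ["Total Amount Due: $4,250.00"],
    "citation_url": "https://api.ninjadoc.ai/citations/01JV.../0/page.jpg",
    "citation_url_cropped": "https://api.ninjadoc.ai/citations/01JV.../0/cropped.jpg",
    "located": [{
      "boxes": [{
        "normalized": { "x1": 680, "y1": 820, "x2": 890, "y2": 850 },
        "pixel": { "x1": 420, "y1": 680, "x2": 540, "y2": 700 }
      }]
    }]
  }]
}
""")

for c in citations_from_response(SAMPLE):
    print(c.page_index, c.pixel_box, c.text)
```

Flattening to one record per box keeps review UIs simple: each record carries everything needed to show a single highlighted region next to the answer.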

The evidence chain

Answer

A structured value your system can store, validate, or pass to another tool.

Evidence text

The source text or visual region used to produce the answer.

Bounding box

Coordinates that point to the exact region on the source page.

Citation URL

A direct link to the source page image, full or cropped, for review queues, audit logs, and agent responses.
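The four parts of the chain can be stored together so a later audit can replay how a value was justified. The sketch below is one possible approach, not a prescribed format: `audit_entry` is a hypothetical helper, the log layout is an assumption, and only the response field names come from the sample evidence object.

```python
import json
from datetime import datetime, timezone


def audit_entry(response, logged_at=None):
    """Build one JSON line tying an answer to its source evidence.

    The entry layout is an illustrative assumption, not a Ninjadoc format.
    """
    logged_at = logged_at or datetime.now(timezone.utc)
    sources = [
        {
            "page_index": ev["page_index"],
            "evidence_texts": ev.get("evidence_texts", []),
            "citation_url": ev["citation_url"],
            "citation_url_cropped": ev["citation_url_cropped"],
            "boxes": [
                box["normalized"]
                for located in ev.get("located", [])
                for box in located.get("boxes", [])
            ],
        }
        for ev in response.get("evidence", [])
    ]
    entry = {
        "logged_at": logged_at.isoformat(),
        "question": response["question"],
        "answer": response["answer"],
        "confidence": response["confidence"],
        "sources": sources,
    }
    # sort_keys gives stable output, which makes log lines easy to diff.
    return json.dumps(entry, sort_keys=True)


# Payload mirrors the sample evidence object; elided URLs are kept as shown.
RESPONSE = {
    "question": "What is the total amount due?",
    "answer": "$4,250.00",
    "confidence": 0.99,
    "evidence": [{
        "page_index": 0,
        "evidence_texts": ["Total Amount Due: $4,250.00"],
        "citation_url": "https://api.ninjadoc.ai/citations/01JV.../0/page.jpg",
        "citation_url_cropped": "https://api.ninjadoc.ai/citations/01JV.../0/cropped.jpg",
        "located": [{"boxes": [{
            "normalized": {"x1": 680, "y1": 820, "x2": 890, "y2": 850},
            "pixel": {"x1": 420, "y1": 680, "x2": 540, "y2": 700},
        }]}],
    }],
}

print(audit_entry(RESPONSE))
```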

Why this matters

AI outputs become usable when reviewers can verify them.

In legal, finance, insurance, healthcare operations, and compliance workflows, an answer alone is not enough. The system needs to show where the answer came from, how confident it is, and what a human should review before taking action.

Ninjadoc makes that source trail part of the extraction response, so auditability does not have to be built as a separate layer after parsing.

Use it when the output needs review

Contract review fields that need legal signoff
Invoice approvals that require source backup
Compliance evidence collected from policies and forms
Insurance claim facts routed to a human reviewer
KYC and financial document fields used in downstream decisions
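For workflows like these, one common pattern is to send low-confidence or unsupported answers to a human queue and let the rest through. The sketch below illustrates that pattern only: the `0.95` threshold, the `needs_review` and `route` helpers, and the rule that an answer without evidence always goes to review are all assumptions to tune for your own risk tolerance.

```python
REVIEW_THRESHOLD = 0.95  # assumed cutoff; not a Ninjadoc default


def needs_review(response, threshold=REVIEW_THRESHOLD):
    """Flag answers that are below the threshold or have no evidence attached."""
    low_confidence = response.get("confidence", 0.0) < threshold
    no_evidence = not response.get("evidence")
    return low_confidence or no_evidence


def route(responses, threshold=REVIEW_THRESHOLD):
    """Split responses into (auto_approved, review_queue)."""
    auto_approved, review_queue = [], []
    for response in responses:
        if needs_review(response, threshold):
            review_queue.append(response)
        else:
            auto_approved.append(response)
    return auto_approved, review_queue


# Illustrative inputs; only the field names follow the sample evidence object.
batch = [
    {"answer": "$4,250.00", "confidence": 0.99, "evidence": [{"page_index": 0}]},
    {"answer": "2024-03-01", "confidence": 0.62, "evidence": [{"page_index": 1}]},
    {"answer": "ACME Corp", "confidence": 0.99, "evidence": []},
]

auto_approved, review_queue = route(batch)
print(len(auto_approved), "auto-approved;", len(review_queue), "sent to review")
```

Because each queued item still carries its evidence, the reviewer can open the cited region directly rather than searching the document.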

FAQ

What makes a document extraction auditable?

An auditable extraction keeps the answer connected to the source document. Ninjadoc returns page indexes, bounding boxes, evidence text, citation URLs, and cropped citation URLs so reviewers can verify where each value came from.

Are citation URLs different from page numbers?

Yes. Page numbers tell a reviewer where to look. Citation URLs give the application or agent a direct source artifact that can be opened, embedded, or stored in an audit log.

Why do bounding boxes matter?

Bounding boxes identify the exact source region for an extracted value. They let your product highlight, crop, review, and explain the original evidence instead of asking a user to search the whole document.
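As a rough illustration, a normalized box can be turned into a CSS overlay positioned on top of the page image. This sketch assumes the normalized coordinates run from 0 to 1000 on each axis, which the sample values suggest but this page does not state; confirm the scale before relying on it. `highlight_style` is a hypothetical helper.

```python
def highlight_style(normalized, scale=1000):
    """Return CSS percentages for an absolutely positioned highlight.

    `scale` is the assumed upper bound of the normalized coordinate range.
    """
    left = 100 * normalized["x1"] / scale
    top = 100 * normalized["y1"] / scale
    width = 100 * (normalized["x2"] - normalized["x1"]) / scale
    height = 100 * (normalized["y2"] - normalized["y1"]) / scale
    return (
        f"left:{left:.2f}%;top:{top:.2f}%;"
        f"width:{width:.2f}%;height:{height:.2f}%"
    )


# Box values taken from the sample evidence object.
box = {"x1": 680, "y1": 820, "x2": 890, "y2": 850}
print(highlight_style(box))
# left:68.00%;top:82.00%;width:21.00%;height:3.00%
```

Percentages keep the highlight aligned when the page image is resized, which is why this sketch uses the normalized box rather than the pixel box.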

Can agents use this evidence downstream?

Yes. The evidence object is returned as structured JSON and can be passed through APIs, databases, review queues, and MCP-compatible agent tools.

Build the audit trail into extraction.

Start with answers your reviewers, systems, and agents can trace back to source.

Start building