Ampor Hub — AI document translation operations
How Ongkrong Consulting designed and built a controlled AI translation operating system for a visa, legal, and official document business — replacing manual file-by-file translation while retaining every document's context, format, and structure, with humans kept in the approval seat.
An account of Ongkrong's AI document-translation engagement for Ampor Translation — covering product design, AI pipeline engineering, OCR, structured quality evaluation, human review, layout mirroring for official documents, client operations, invoicing, portal delivery, production hosting, and a companion Telegram AI receptionist. Prepared by Ongkrong Consulting · accurate as of June 2026.
At a glance
- Client: Ampor Translation, a translation and consultancy business specialising in visa, official, and legal document translation
- The problem: Human translators processed every visa application, official form, legal contract, and certificate by hand. Files arrived by chat, email, and walk-in, with no OCR, no structured quality review, and jobs, invoices, and client records kept in separate places
- Engagement type: End-to-end (product architecture → AI translation pipeline → OCR → multi-language support → structured quality evaluation → layout mirroring → client management → invoicing → portal delivery → production deployment → AI receptionist)
- Quality evaluation: 7-dimension Quality Assessment Framework (QAF) covering protected-pattern integrity, script integrity, OCR confidence, semantic fidelity, terminology compliance, format fidelity, and glossary coverage, applied per segment, per job
- Document formats: DOCX · digital PDFs · scanned PDFs · JPG/PNG images · multi-page browser scans (the real formats a translation business receives, not only clean files)
- Language support: 12-language registry with a configurable language-hub rule. Every job involves one of the configured hub languages on one side, keeping the operational centre of gravity aligned to the markets served
- AI receptionist: Companion Telegram bot for FAQs, service guidance, appointment booking, document upload triage, language switching, staff handoff, owner notifications, and RAG from a controlled company knowledge base
- Production status: App, database, background worker, deployment path, runbooks, security baseline, and a working product surface for staff, in the shape of a production system
The client and the problem
Ampor Translation is a translation and consultancy business, and it receives far more than clean Word documents. Clients send visa applications, official forms, legal contracts, certificates, scans, phone photos, low-quality PDFs, and mixed-format material. A useful system had to work with the messy front door of the business, not only with ideal inputs. The operational requirement was also broader than translation accuracy: before Ampor Hub, the workflow was fragmented at every step.
- Files arrived through chat, email, phone photos, and walk-in requests — no single intake point.
- Scanned documents required manual reading or separate OCR tools outside the workflow.
- Translation output often lost document structure — tables, headings, field positions.
- Important details — dates, passport numbers, legal references, names, IDs — could be damaged with no automated preservation checks.
- Review happened outside the system, making status and quality hard to track.
- Client records, job history, invoices, and delivery were spread across separate tools with no linkage.
- Customer questions and booking requests relied on manual response time with no capture.
The brief: replace human translators for visa applications, official documents, and legal contracts with a controlled AI-assisted workflow, while retaining every document's context, structure, and format. The scope was translation operations rather than translation alone: intake, OCR, AI pipeline, quality evaluation, human review, layout mirroring, client records, invoices, portal delivery, and customer intake in one system.
Our role — ten workstreams
Ongkrong Consulting worked across product, engineering, AI workflow design, deployment, security, and operational handoff — ten workstreams from first intake to customer support, designed as an integrated system where each feeds the next.
Translation Operating System
Internal command centre — job flow, status tracking, worker pipeline, review, delivery.
Document Intake
DOCX, digital PDF, scanned PDF, image, browser scan — real formats, automatic routing.
Multi-language Controls
12-language registry, language-hub rule, glossary, templates, protected patterns.
OCR & Quality Review
AI vision OCR, document classification, 7-dimension QAF, segment approval flow.
Layout Mirroring
Layout analysis, region mapping, layout editor, final DOCX/PDF export.
Client Operations
Client records, job history, notes, services catalogue, invoices, VAT, deposits.
Client Portal
Token-gated portal — clients view and download their own jobs and invoices.
Production Architecture
Single container, Postgres, HTTPS, background worker, health checks, runbooks.
AI Receptionist
Telegram bot — FAQs, booking, document triage, RAG, staff handoff, notifications.
Security & Governance
Auth, role gating, upload validation, portal token model, secret handling, runbooks.
| Input type | Detection | Processing path | Pipeline entry |
|---|---|---|---|
| DOCX | File extension + MIME type | Structural XML parser → Document tree extractor → Section analyser | Structured segments |
| Digital PDF | Text layer present | Position-aware text extractor → Region grouper → Section analyser | Positioned segments |
| Scanned PDF | No extractable text layer | Page renderer → AI Vision OCR per page → Text assembler | OCR segments |
| JPG / PNG image | Image MIME type | Direct AI Vision OCR → Text assembler | OCR segments |
| Browser scan | Staff-initiated capture | Multi-page PDF export → Scanned PDF path | OCR segments |
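The routing in the table above can be sketched as a single decision function. This is a minimal illustration, not Ampor Hub's actual code: the `Route` enum, the `route_upload` function, and the `has_text_layer` flag (assumed to come from an upstream PDF text probe) are all hypothetical names.

```python
from enum import Enum


class Route(Enum):
    STRUCTURED = "structured segments"   # DOCX path
    POSITIONED = "positioned segments"   # digital PDF path
    OCR = "OCR segments"                 # scanned PDF, image, browser scan


DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def route_upload(filename: str, mime_type: str, has_text_layer: bool = False) -> Route:
    """Pick a processing path from the detection signals in the intake table.

    `has_text_layer` is an assumed input: something upstream has already
    checked whether the PDF contains extractable text.
    """
    if filename.lower().endswith(".docx") and mime_type == DOCX_MIME:
        return Route.STRUCTURED
    if mime_type == "application/pdf":
        # Browser scans are exported as multi-page PDFs without a text layer,
        # so they land on the OCR path like any other scanned PDF.
        return Route.POSITIONED if has_text_layer else Route.OCR
    if mime_type in ("image/jpeg", "image/png"):
        return Route.OCR
    raise ValueError(f"unsupported upload: {filename} ({mime_type})")
```

Rejecting unknown types outright, instead of guessing a path, keeps unsupported files out of the pipeline at the intake step.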
| Class | Detection signals | Translation register | Controls activated |
|---|---|---|---|
| Legal | Legal terminology, clause structures, section numbering | High formality — precise legal register | Name + reference protection, glossary enforcement, strict fidelity |
| Official | Government headers, seal markers, form-field patterns | Formal — government / administrative register | Date + ID preservation, place-name and entity protection, layout-mirror path |
| Form | Field labels, blank fields, table-grid structure | Literal / field-mapped translation | Field mapping, exact value preservation, position-aware output |
| Report | Section headers, paragraph structure, numbered lists | Professional — section-aware | Heading translation, section integrity check, structure validation |
| General | Default — no strong structural signal detected | Standard — neutral register | Basic date, ID, and code protection only |
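One way to picture the class table above is as a lookup from document class to translation register and active controls, with General as the fallback. The sketch below is illustrative only: `ClassProfile`, `PROFILES`, `profile_for`, and the control identifiers are hypothetical names derived from the table, not the system's real configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassProfile:
    register: str
    controls: tuple


# Hypothetical identifiers that mirror the rows of the classification table.
PROFILES = {
    "legal": ClassProfile("high formality", (
        "name_protection", "reference_protection",
        "glossary_enforcement", "strict_fidelity")),
    "official": ClassProfile("formal", (
        "date_preservation", "id_preservation",
        "place_name_protection", "entity_protection", "layout_mirror")),
    "form": ClassProfile("literal", (
        "field_mapping", "exact_value_preservation", "position_aware_output")),
    "report": ClassProfile("professional", (
        "heading_translation", "section_integrity_check", "structure_validation")),
    "general": ClassProfile("standard", ("basic_pattern_protection",)),
}


def profile_for(doc_class: str) -> ClassProfile:
    """Return the profile for a class, falling back to General when unknown."""
    return PROFILES.get(doc_class.strip().lower(), PROFILES["general"])
```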
The human stays in control
AI accelerates the work; staff still own the final output. No translated document reaches a client without human approval — and the QAF tells staff exactly where to look first. Before translation the system analyses and classifies each document; after translation the 7-dimension Quality Assessment Framework evaluates every segment before it reaches the review surface.
| # | Dimension | What is measured | Signal | Threshold |
|---|---|---|---|---|
| 01 | Protected-pattern integrity | Dates, IDs, codes, emails, reference numbers detected in source, verified in output | Hard block | Any mutation detected |
| 02 | Script integrity | Target-language output validated for correct Unicode range and script encoding | Hard block | Invalid characters found |
| 03 | OCR confidence | Character-recognition score assigned per segment by AI Vision OCR | Auto-warn | Score < 0.80 |
| 04 | Semantic fidelity | Word-count ratio between source and translation — extreme deviation signals truncation or hallucination | Auto-warn | > 1.6× or < 0.55× |
| 05 | Terminology compliance | Glossary term-match rate — required terms checked against the active job glossary | Auto-warn | < 95% term match |
| 06 | Format fidelity | Structural element count — headings, tables, lists, field labels matched source to output | Review flag | Any count mismatch |
| 07 | Glossary coverage | % of job-specific terms present in the active glossary before translation | Advisory | < 80% coverage |
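Two of the dimensions above are simple enough to sketch directly: protected-pattern integrity (01) and the word-count ratio used for semantic fidelity (04). The thresholds come from the table; the regular expressions and function names are assumptions for illustration, since the actual protected-pattern set is not described here.

```python
import re

# Hypothetical patterns; the real protected-pattern set is not given in this account.
PROTECTED_PATTERNS = [
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),      # ISO-style dates
    re.compile(r"\b[A-Z]{1,2}\d{6,9}\b"),      # passport-style identifiers
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),   # email addresses
]


def mutated_patterns(source: str, target: str) -> list:
    """Return protected values found in the source that are missing from the target."""
    found = [m.group(0) for p in PROTECTED_PATTERNS for m in p.finditer(source)]
    return [value for value in found if value not in target]


def length_ratio_warning(source: str, target: str) -> bool:
    """Flag a target/source word-count ratio above 1.6 or below 0.55."""
    src_words = len(source.split())
    tgt_words = len(target.split())
    if src_words == 0:
        return tgt_words > 0
    ratio = tgt_words / src_words
    return ratio > 1.6 or ratio < 0.55
```

Any non-empty result from `mutated_patterns` would correspond to the hard block in row 01. Whitespace splitting is a simplification: scripts written without spaces between words would need a proper tokenizer before the ratio means anything.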
Approve-high-confidence workflow
All 7 dimensions pass · confidence ≥ 0.85 · no flags. Batch-approve eligible segments without individual review.
Full segment-by-segment review
One or more dimensions flagged. Staff must review each segment individually before the job can be approved.
Segment rejected · not deliverable
Protected pattern mutated or script integrity failed. Segment must be manually corrected. Delivery blocked until resolved.
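The three outcomes above reduce to a small decision function. This is a sketch under stated assumptions: the dimension identifiers and the `SegmentResult` type are hypothetical, and routing an unflagged segment with confidence below 0.85 to full review is an assumption, since the text only defines the batch-approve condition.

```python
from dataclasses import dataclass, field

# Dimensions whose failure is a hard block (rows 01 and 02 of the QAF table).
HARD_BLOCK = {"protected_pattern_integrity", "script_integrity"}


@dataclass
class SegmentResult:
    confidence: float
    flagged: set = field(default_factory=set)  # names of dimensions that did not pass


def gate(segment: SegmentResult) -> str:
    """Map a segment's QAF result to one of the three gate outcomes."""
    if segment.flagged & HARD_BLOCK:
        return "rejected"        # must be corrected manually; delivery blocked
    if segment.flagged or segment.confidence < 0.85:
        return "full_review"     # staff review the segment individually
    return "batch_approve"       # eligible for approve-high-confidence


def job_deliverable(segments: list) -> bool:
    """A job is blocked from delivery while any segment is rejected."""
    return all(gate(s) != "rejected" for s in segments)
```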
Documents that look like documents — and the back office around them
Many translation jobs are not plain text. Official documents need layout sensitivity: forms, certificates, tables, headings, seals, margins, and field positions. The system analyses the original layout, maps translated content back into regions by section and field position, lets staff adjust in an editing surface, and exports a DOCX or PDF suitable for client delivery. For official documents, layout fidelity is not cosmetic — it is the delivery standard the client expects. Around that, Ampor Hub also carries the business operations: client records and job history, a services catalogue, and invoices in USD and KHR with deposits, discounts, and VAT — every invoice linked to its job, every job to its client. No separate spreadsheet, no separate invoicing tool, no separate file store.
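To make the invoicing side concrete, the arithmetic behind a dual-currency invoice with a discount, VAT, and a deposit could look like the sketch below. The VAT rate, the exchange rate, and the order of operations (discount applied before VAT, deposit subtracted last) are placeholder assumptions, not figures or rules taken from Ampor Hub.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def invoice_totals(line_items, discount=Decimal("0"), vat_rate=Decimal("0.10"),
                   deposit=Decimal("0"), khr_per_usd=Decimal("4100")):
    """Compute USD totals and a KHR balance for one invoice.

    All rates are placeholders passed in by the caller.
    """
    subtotal = sum(line_items, Decimal("0"))
    taxable = subtotal - discount
    vat = (taxable * vat_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    total = taxable + vat
    balance = total - deposit
    return {
        "subtotal_usd": subtotal,
        "vat_usd": vat,
        "total_usd": total,
        "balance_usd": balance,
        # KHR is rounded to whole riel here; that rounding rule is an assumption.
        "balance_khr": (balance * khr_per_usd).quantize(Decimal("1"), rounding=ROUND_HALF_UP),
    }
```

`Decimal` is used instead of floats so that currency amounts round predictably.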
Language coverage & continuous improvement
The platform supports a 12-language registry with a configurable language-hub rule: every job must involve one of the configured hub languages on one side. That lets the business handle multilingual demand while keeping the operational centre of gravity aligned to the markets it serves. The hub model governs which pairs are valid; review-driven feedback makes every subsequent job smarter.
| Pairing type | Validity | Routing |
|---|---|---|
| Hub A ↔ Hub B | ✓ Supported — primary hub bridge | Direct translation · highest-priority pair |
| Hub A ↔ Other | ✓ Supported — hub-A spoke | Direct translation with hub-A-side controls |
| Hub B ↔ Other | ✓ Supported — hub-B spoke | Direct translation with target-script integrity guardrails |
| Other ↔ Other | – Not in scope | Must involve a hub language — server-side validation prevents invalid pairs |
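The hub rule in the table above is a small server-side check. In this minimal sketch the language codes are placeholders (`hub_a`, `hub_b`, `other_1`, `other_2`), because the actual registry and hub languages are configurable and not listed here; the function name is also hypothetical.

```python
# Placeholder codes; the real 12-language registry and hub set are configuration.
HUB_LANGUAGES = frozenset({"hub_a", "hub_b"})
REGISTRY = HUB_LANGUAGES | frozenset({"other_1", "other_2"})


def pair_is_valid(source: str, target: str,
                  hubs: frozenset = HUB_LANGUAGES,
                  registry: frozenset = REGISTRY) -> bool:
    """A pair is valid when both languages are registered, they differ,
    and at least one side is a hub language."""
    if source == target:
        return False
    if source not in registry or target not in registry:
        return False
    return source in hubs or target in hubs
```

Running the check on the server, not only in the job form, means an invalid pair cannot be created by a client that bypasses the interface.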
How the system learns from every job
- 01 Job intake: Document enters; pipeline classifies and extracts content. Active glossary, templates, and patterns applied.
- 02 AI translation: Segments translated using the current knowledge base. QAF evaluates every segment before review.
- 03 Human review + edit: Staff correct errors, edit segments, flag issues. Every edit is a signal the system captures.
- 04 Corrections captured: Terminology errors, missed patterns, register issues, and glossary gaps identified and logged.
- 05 Knowledge base updated: Glossary extended, protected patterns refined, template instructions improved.
- 06 Every future job benefits: Updated controls applied to all subsequent translations, compounding accuracy over time.
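Steps 03 to 05 of the loop can be illustrated with a small correction log that promotes a reviewer's terminology fix into the glossary once it has been seen often enough. Everything here is hypothetical: the `CorrectionLog` class, its method names, and the promotion threshold are illustrative assumptions, since the account does not describe how corrections are actually promoted.

```python
from collections import Counter


class CorrectionLog:
    """Collects reviewer terminology corrections and promotes repeated ones."""

    def __init__(self, promote_after: int = 2):
        self.promote_after = promote_after   # assumed threshold, not from the source
        self.counts = Counter()
        self.glossary = {}                   # source term -> approved target term

    def record(self, source_term: str, corrected_target: str) -> None:
        """Log one reviewer edit; promote it once it recurs often enough."""
        key = (source_term.lower(), corrected_target)
        self.counts[key] += 1
        if self.counts[key] >= self.promote_after:
            self.glossary[source_term.lower()] = corrected_target

    def lookup(self, source_term: str):
        """Return the approved target term, or None if not yet in the glossary."""
        return self.glossary.get(source_term.lower())
```

Requiring a correction to recur before promotion is one way to keep a single reviewer's one-off edit from silently changing every later job; a staff approval step would serve the same purpose.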
Delivery, reliability, and a serious security posture
A controlled delivery channel
Clients receive a token-gated portal link to view their own jobs, download files, and see invoices — scoped to one client, isolated from staff routes, admin-managed tokens. No ad-hoc file sharing, no email attachments, no ambiguity about which version is final.
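A token-gated, client-scoped portal of this kind can be sketched with the standard library. The class and function names below are hypothetical, and storing only a hash of each token is an assumed design choice, not a confirmed detail of Ampor Hub.

```python
import hashlib
import secrets


class PortalTokens:
    """Admin-managed tokens, each scoped to exactly one client."""

    def __init__(self):
        self._client_by_hash = {}   # sha256(token) -> client_id

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, client_id: int) -> str:
        """Create an unguessable token and remember only its hash."""
        token = secrets.token_urlsafe(32)
        self._client_by_hash[self._digest(token)] = client_id
        return token

    def revoke(self, token: str) -> None:
        self._client_by_hash.pop(self._digest(token), None)

    def client_for(self, token: str):
        return self._client_by_hash.get(self._digest(token))


def can_view_job(tokens: PortalTokens, token: str, job_client_id: int) -> bool:
    """Allow access only when the token resolves to the job's own client."""
    client_id = tokens.client_for(token)
    return client_id is not None and client_id == job_client_id
```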
Small footprint, not fragile
Single web container with HTTPS ingress and health checks; internal Postgres with database-backed job queueing and atomic claiming; an in-process background worker with heartbeat monitoring and restart handling; migration runner at boot, plus backup, restore, and handoff runbooks.
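Atomic claiming on a database-backed queue means two workers can never pick up the same job. The sketch below shows the idea with a compare-and-set `UPDATE`, using SQLite from the standard library as a stand-in so it runs anywhere; the table layout and function name are assumptions. On Postgres, a common way to get the same guarantee is `SELECT ... FOR UPDATE SKIP LOCKED`, though this account does not say which approach Ampor Hub uses.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'queued',
    worker TEXT
)
"""


def claim_next_job(conn: sqlite3.Connection, worker_id: str):
    """Claim the oldest queued job, or return None when the queue is empty."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the WHERE clause makes the claim atomic:
        # if another worker got there first, no row is updated.
        cur = conn.execute(
            "UPDATE jobs SET status = 'running', worker = ? "
            "WHERE id = ? AND status = 'queued'",
            (worker_id, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # Lost the race for this job; loop and try the next one.
```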
The operating system reaches the front door
A companion Telegram AI receptionist handles FAQs and price guidance from a controlled knowledge base (RAG), appointment booking, document-upload triage, language switching, and staff/owner handoff with context — so staff receive a warm, documented enquiry, not a cold one.
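The receptionist's behaviour of answering from a controlled knowledge base and otherwise handing off to staff can be illustrated with a deliberately simple retriever. This sketch uses keyword overlap as a stand-in for whatever retrieval method the real bot uses, and the passages are placeholders, not Ampor Translation's actual content; all names are hypothetical.

```python
import re

# Placeholder passages, not the company's real knowledge base.
EXAMPLE_KB = [
    "Certified translation of a passport takes two working days.",
    "Appointments can be booked on weekdays.",
]


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: list, min_overlap: int = 2):
    """Return the best-matching passage, or None to signal a staff handoff."""
    question_tokens = _tokens(question)
    best_passage, best_score = None, 0
    for passage in passages:
        score = len(question_tokens & _tokens(passage))
        if score > best_score:
            best_passage, best_score = passage, score
    if best_score < min_overlap:
        return None   # nothing relevant enough: hand off instead of guessing
    return best_passage
```

Returning `None` when nothing matches well is the important part: the bot only answers from controlled content and passes everything else to a person.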
Security treated as part of the product
Staff authentication with admin and staff roles and server-side route gating; strong session-secret enforcement, password hashing, and login throttling; portal token isolation; upload validation and size limits; backup and secret-handling runbooks documented for handoff.
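Two of the controls listed above, password hashing and login throttling, can be sketched with the standard library alone. The iteration count, attempt limit, and window are placeholder values, and the function and class names are hypothetical; the account does not state which hashing algorithm or limits Ampor Hub uses.

```python
import hashlib
import hmac
import os
import time


def hash_password(password: str, *, iterations: int = 600_000):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes,
                    *, iterations: int = 600_000) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)   # constant-time comparison


class LoginThrottle:
    """Blocks further attempts after too many recent failures per username."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = {}   # username -> list of failure timestamps

    def _recent(self, username: str, now: float) -> list:
        recent = [t for t in self._failures.get(username, [])
                  if now - t < self.window_seconds]
        self._failures[username] = recent
        return recent

    def allowed(self, username: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        return len(self._recent(username, now)) < self.max_failures

    def record_failure(self, username: str, now: float = None) -> None:
        now = time.monotonic() if now is None else now
        self._recent(username, now).append(now)
```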
Deliverables
Major artifacts produced across the ten workstreams — from translation pipeline to production operations.
- 01 Production web application for translation operations: job dashboard, status tracking, and staff workflow
- 02 DOCX, digital PDF, scanned PDF, image, and browser-scan intake pipeline with automatic routing
- 03 AI Vision OCR pipeline: per-page recognition for scanned PDFs, phone photos, and browser scans
- 04 Document classification system (legal, official, form, report, general) with register-aware translation
- 05 7-dimension Quality Assessment Framework (QAF), applied per segment, per job, before human review
- 06 Segment confidence scoring: high / medium / low tiers driving quality-gate decisions
- 07 12-language registry with configurable language-hub rule and server-side pair validation
- 08 Script-integrity guardrails, protected-pattern handling, place-name and entity protection, document-type terminology controls
- 09 Glossary and template controls, accumulated and refined through the continuous-improvement loop
- 10 Layout editor and final DOCX/PDF export path for formatted official documents
- 11 Client records, job history, and notes, linked throughout the platform
- 12 Invoice builder: services catalogue, USD/KHR, VAT, deposits, discounts, PDF preview
- 13 Token-gated client portal: view, download, invoices, scoped to one client per token
- 14 Background worker: atomic job claiming, heartbeat, restart handling, recovery strategies
- 15 Production deployment path, backup and restore runbooks, security and handoff documentation
- 16 Telegram AI receptionist: RAG knowledge base, FAQs, booking, document triage, staff handoff, owner notifications
What changed for the client
Before Ampor Hub
- Files arriving through chat, email, and walk-in, with no single intake point
- Scanned documents required separate OCR tools
- No automated quality checks; errors caught only by human re-read
- Translation output losing document structure
- Review status and quality difficult to track
- Client records, jobs, and invoices in separate tools
- Customer questions handled manually, one at a time
- Delivery by email or ad-hoc file share
With Ampor Hub
- AI handles visa, contract, and official document translation; humans review and approve
- OCR support for scans, phone photos, and browser scans
- 7-dimension QAF flagging issues before human review begins
- Multi-language support anchored around the configured hub languages
- Layout mirroring for formatted official documents
- Client records tied to job history and invoices in one system
- Invoice generation inside the same workflow
- Token-gated portal for controlled client delivery
- AI receptionist for questions, booking, and lead capture
- Continuous-improvement loop compounding accuracy over time
What made this engagement different
A quality framework, not just a translation prompt.
Seven evaluation dimensions, four signal types (hard block, auto-warn, review flag, advisory), and three gate outcomes, applied to every segment of every job. Quality is measured continuously through the pipeline, not reviewed only at the end.
Built around the messy front door, not the ideal demo.
Five document formats, automatic classification into five document types, and an OCR path that handles the real inputs a translation business receives — not only clean Word documents.
AI output connected to operations that compound.
The continuous-improvement loop feeds review corrections back into glossary, protected patterns, and templates — so each visa, contract, and official document makes the next one better.
App showcase — see it in action
A short walkthrough of the live Ampor Hub translation operating system — intake, AI pipeline, quality review, and delivery.
If your business processes visa applications, official documents, or legal contracts through manual translators, we can build you a controlled AI translation operation that retains document context, format, and structure while keeping humans in the approval seat. That is the brief Ampor Hub was built against.