Model Card · Sentinel Voice-Note Transcription (OpenAI Whisper-1)
Altara Sentinel allows customers, employees and managed-services analysts to upload voice notes they have received as part of a suspected scam — for example a WhatsApp voice message claiming to be from a bank. The recording is transcribed to text before NAVI analyses it. This card describes the transcription model, how it is used, what is stored, and the limitations.
| Field | Value |
|---|---|
| Card version | 1.0 |
| Card last reviewed | 27 February 2026 |
| Card owner | Altara Sentinel Team — hello@altaracore.ai |
| Model name (internal) | Sentinel Transcribe |
| Underlying model | OpenAI Whisper |
| Provider model ID | whisper-1 |
| Provider | OpenAI OpCo, LLC |
| Access route | Emergent Universal LLM Key — Altara does not hold raw OpenAI credentials |
| Modality | Audio in → text out |
| Languages | Whisper-1 multilingual ASR — best accuracy on the languages OpenAI lists as supported. Altara has tested English, isiZulu (partial), Afrikaans (partial) and South African English accents. |
| Maximum input | 25 MB per file (provider limit). Altara enforces an additional ceiling at the request layer. |
1. Purpose & Intended Use
The transcription model exists to convert a single, user-supplied audio sample into a text transcript so that:
- NAVI can analyse the transcript content for scam indicators (urgency, impersonation cues, payment requests, link mentions).
- The case file shown to the human Sentinel analyst contains a readable version of the audio for evidentiary review.
- The Data-Rights Bundle exported to the user contains the transcript so they can verify what was processed.
Intended users
- Sentinel public submitters uploading suspicious voice notes for trust-score analysis.
- Sentinel managed-services analysts triaging escalated cases.
Out-of-scope
- Whisper-1 is not used for speaker identification, voice biometrics, deepfake detection, sentiment analysis, or any kind of profiling beyond converting audio to text.
- It is not used to transcribe call-centre recordings or any live audio stream.
- It is not used on audio the submitter has not chosen to upload to Altara.
2. Inputs
- A single audio file uploaded by the submitter via the Sentinel submission flow.
- Supported formats follow Whisper-1's accepted list (mp3, m4a, wav, ogg, webm, mp4 audio container, mpga, mpeg).
- File size is checked both client-side and at the FastAPI boundary.
No additional metadata is sent to the provider beyond the audio bytes themselves and the request to transcribe.
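The format and size guards described in this section could look roughly like the sketch below. It is illustrative only: the function name `validate_upload` and the value of `ALTARA_MAX_BYTES` are hypothetical (the card does not state Altara's own ceiling), and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import PurePath

# Extensions taken from the accepted-format list in this section.
ALLOWED_EXTENSIONS = {"mp3", "m4a", "wav", "ogg", "webm", "mp4", "mpga", "mpeg"}

# Provider limit from the card; the exact byte interpretation is an assumption.
PROVIDER_MAX_BYTES = 25 * 1024 * 1024

# Hypothetical Altara ceiling; the real value is not published in the card.
ALTARA_MAX_BYTES = 16 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> list[str]:
    """Return rejection reasons; an empty list means the upload may proceed."""
    problems = []
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if size_bytes <= 0:
        problems.append("empty file")
    elif size_bytes > min(ALTARA_MAX_BYTES, PROVIDER_MAX_BYTES):
        problems.append("file too large")
    return problems
```

Running the same check client-side and again at the backend boundary means a bypassed client check still cannot push an oversized or unsupported file to the provider.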
3. Outputs
- A plain-text transcript of the audio. No timestamps, diarisation or speaker labels are requested.
- The transcript is stored in the Sentinel case record alongside a hash of the original audio file.
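As a minimal sketch of how a transcript can be tied to the hash of its source audio, consider the following. The card does not name the hash algorithm or the record schema, so SHA-256, the `TranscriptRecord` fields and both function names are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptRecord:
    """Hypothetical shape of the transcript entry on a case record."""
    case_id: str
    audio_sha256: str  # hex digest of the original audio bytes (assumed SHA-256)
    transcript: str


def build_record(case_id: str, audio_bytes: bytes, transcript: str) -> TranscriptRecord:
    """Hash the uploaded audio and pair the digest with the plain-text transcript."""
    digest = hashlib.sha256(audio_bytes).hexdigest()
    return TranscriptRecord(case_id=case_id, audio_sha256=digest, transcript=transcript.strip())


def matches_audio(record: TranscriptRecord, audio_bytes: bytes) -> bool:
    """Check that a stored audio file is the one the transcript was produced from."""
    return hashlib.sha256(audio_bytes).hexdigest() == record.audio_sha256
```

Storing the digest rather than relying on a filename lets an analyst confirm later that the audio on file is byte-for-byte the sample that was transcribed.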
4. Training Data & Lineage
OpenAI Whisper was trained on a large multilingual speech corpus that Altara does not control and cannot independently verify. Altara has not contributed audio to the training of Whisper. The model is used in inference-only mode; no submitted audio is used to fine-tune the base model.
For full base-model lineage see OpenAI's Whisper paper and the OpenAI Audio API documentation.
5. Performance & Known Quality Bounds
- English — Altara's internal sample testing shows usable transcripts on clean and moderately noisy WhatsApp voice notes. Errors increase with low bit-rate audio, accented speech and background noise.
- South African accents — usable but error-prone on regional pronunciation; analyst review is mandatory.
- isiZulu / Afrikaans — partial. Whisper-1's accuracy on local Southern African languages varies by speaker; Altara plans to add a second-pass model for these languages (see Sentinel Phase B in the product roadmap).
- Audio with multiple speakers — Whisper-1 produces a single interleaved transcript without speaker labels; analysts must reconcile speakers manually if relevant.
Because the transcript is always reviewed by a human Sentinel analyst before any externally-facing action, ASR errors do not flow unchallenged into a decision.
6. Human-in-the-Loop Checkpoints
- Analyst review — every transcript is shown to a human Sentinel analyst alongside the original waveform link. The analyst must explicitly accept the transcript before any case-file action.
- Submitter visibility — the submitter can view the transcript in their case view and request correction or deletion via the Data-Rights Bundle.
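The analyst-acceptance gate described above can be pictured as a small state check. This is a sketch under stated assumptions, not Altara's implementation: the status values, the `TranscriptReview` class and the function names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReviewStatus(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class TranscriptReview:
    """Hypothetical review state attached to a transcript."""
    transcript: str
    status: ReviewStatus = ReviewStatus.PENDING
    reviewed_by: Optional[str] = None


def record_decision(review: TranscriptReview, analyst_id: str, accepted: bool) -> None:
    """Store an explicit analyst decision; an anonymous decision is refused."""
    if not analyst_id:
        raise ValueError("a named analyst must make the decision")
    review.status = ReviewStatus.ACCEPTED if accepted else ReviewStatus.REJECTED
    review.reviewed_by = analyst_id


def can_take_case_action(review: TranscriptReview) -> bool:
    """Case-file actions stay blocked until a human has accepted the transcript."""
    return review.status is ReviewStatus.ACCEPTED and review.reviewed_by is not None
```

Defaulting to `PENDING` means a transcript that nobody has looked at can never unlock a case action by accident.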
7. Data Flow & Residency
[ Submitter device ]
│ (HTTPS upload)
▼
[ Altara FastAPI backend · audio stored encrypted at rest ]
│ (Emergent Universal LLM Key · HTTPS)
▼
[ Emergent gateway ]
│
▼
[ OpenAI Whisper-1 — synchronous transcription · no persistent storage at provider ]
│
▼
[ Transcript stored on the Altara case record · linked to the original audio hash ]
- OpenAI states that audio submitted through its API is not retained for training purposes by default. Altara relies on that contractual position.
- The original audio file remains on Altara-managed storage; transcripts are stored alongside the case.
- Submitters can request deletion of both the audio and the transcript via hello@altaracore.ai (subject [ALTARA-DATA-RIGHTS]) or via the Data-Rights Bundle download → delete flow.
8. Known Limitations & Failure Modes
- Word-error rate scales with audio quality, accent, code-switching and background noise.
- Mis-transcription of named entities (bank names, scam URLs) — the analyst is responsible for catching these before any escalation.
- No deepfake / synthetic-audio detection — Whisper-1 transcribes whatever it receives. Detecting synthetic audio is a separate Sentinel Phase B initiative.
- Privacy spillage risk — if the submitter inadvertently uploads audio containing sensitive information about a third party, the audio still transits the transcription path. Altara mitigates with a pre-submission warning and a quick-delete option after submission.
- Language coverage — accuracy outside the well-supported language list is poor; do not rely on transcripts in those languages without analyst confirmation.
9. Monitoring, Incident Handling & Drift
- Every transcription call is logged with prompt version, model ID, latency and bucketised quality signal (analyst-accepted, analyst-corrected, analyst-rejected).
- Quality-signal trend is reviewed monthly by the Altara Sentinel team.
- OpenAI Whisper-1 model upgrades are tracked via the Emergent gateway release notes. The model ID is pinned in Altara's code, so moving to a different model requires a deliberate code change.
- Kill switch — Sentinel transcription can be disabled by feature flag without redeploy.
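The monitoring items above can be sketched together: a pinned model ID, a feature-flag check, a per-call log entry, and the aggregation a monthly review might use. All names here (`TranscriptionLog`, `quality_shares`, the flag key `sentinel_transcription`) are hypothetical, and the choice to treat a missing flag as "disabled" is an assumption rather than documented behaviour.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Mapping

MODEL_ID = "whisper-1"  # pinned model ID, as stated in the card

# Quality buckets named in the card.
QUALITY_BUCKETS = ("analyst-accepted", "analyst-corrected", "analyst-rejected")


def transcription_enabled(flags: Mapping[str, bool]) -> bool:
    """Kill-switch check. `flags` stands in for a runtime flag store;
    a missing flag is treated as disabled (an assumption)."""
    return flags.get("sentinel_transcription", False)


@dataclass(frozen=True)
class TranscriptionLog:
    """Hypothetical log entry with the fields the card lists."""
    prompt_version: str
    model_id: str
    latency_ms: float
    quality_signal: str

    def __post_init__(self) -> None:
        if self.quality_signal not in QUALITY_BUCKETS:
            raise ValueError(f"unknown quality signal: {self.quality_signal}")


def quality_shares(logs: Iterable[TranscriptionLog]) -> dict[str, float]:
    """Fraction of calls in each quality bucket, for trend review."""
    counts = Counter(log.quality_signal for log in logs)
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in QUALITY_BUCKETS}
```

A rising share of `analyst-corrected` or `analyst-rejected` entries from one month to the next would be the kind of drift signal the review is meant to catch.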
10. Frameworks This Card Maps To
| Framework | Section |
|---|---|
| NIST AI RMF 1.0 | Govern · Map · Measure — published deployer card |
| ISO/IEC 42001 | Annex A.7 (data for AI systems), Annex A.8 (information for interested parties), Annex A.9 (use of AI systems) |
| EU AI Act | Article 13 (transparency), Article 50 (transparency for certain AI systems) — relevant because the user is told an AI is generating the transcript |
| POPIA (South Africa) | Voice recordings are biometric-adjacent — submitter consent and right-to-deletion are surfaced at point of upload and in the Data-Rights Bundle |
| GDPR | Article 5 (data minimisation), Article 22 (no solely-automated decision — analyst sign-off mandatory) |
11. Versioning & Change Log
| Card version | Date | Change |
|---|---|---|
| 1.0 | 2026-02-27 | Initial public publication. Underlying model pinned to whisper-1. |
A new card version is published when the underlying model changes, when accepted languages are expanded, or when the data-flow / residency posture changes.
12. Contact & Reporting
- Card owner / questions: hello@altaracore.ai
- Responsible AI / incident reporting: hello@altaracore.ai · subject [SENTINEL-TRANSCRIBE-INCIDENT]
- Right-to-deletion: hello@altaracore.ai · subject [ALTARA-DATA-RIGHTS]
Altara Sentinel is a module of Altara Core, a division of Navigate Group (Pty) Ltd · Reg No 2016/343423/07. This document is published as a public AI transparency artefact. © Navigate Group (Pty) Ltd — all rights reserved.
