---
language:
  - en
license: other
tags:
  - nemo
  - parakeet
  - whisper
  - qwen3
  - ctranslate2
  - automatic-speech-recognition
  - text-generation
  - air-traffic-control
  - atc
  - singapore
  - military
pipeline_tag: automatic-speech-recognition
---

# ASTRA ATC Models

Fine-tuned models for Singapore military air traffic control, built for the [ASTRA](https://github.com/aether-raid) training simulator.

## Pipeline

```text
Audio  -->  VAD (Silero)  -->  ASR (Whisper or Parakeet)  -->  Rule Formatter  -->  Display Text
                               "camel climb flight level zero nine zero"
                                                                                  "CAMEL climb FL090"
```

The production pipeline uses a deterministic rule-based formatter instead of the legacy LLM formatter.
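To make the formatter stage concrete, here is a minimal sketch of what a deterministic rule could look like for the example in the diagram above. It is not the ASTRA formatter: the `DIGITS` map, the single-entry `CALLSIGNS` set, and the `format_display` function are all hypothetical names chosen for illustration, and only the flight-level rule is covered.

```python
import re

# Hypothetical digit vocabulary; "niner" is included as common radio usage.
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "niner": "9",
}

# Hypothetical callsign list; the real set of 100+ callsigns is not reproduced here.
CALLSIGNS = {"camel"}

# Matches "flight level" followed by one or more spoken digits.
_DIGIT_WORD = "|".join(DIGITS)
_FLIGHT_LEVEL = re.compile(rf"\bflight level((?: (?:{_DIGIT_WORD}))+)\b")


def _flight_level(match: re.Match) -> str:
    """Collapse the spoken digits captured by _FLIGHT_LEVEL into 'FLnnn'."""
    digits = "".join(DIGITS[word] for word in match.group(1).split())
    return f"FL{digits}"


def format_display(text: str) -> str:
    """Apply the flight-level rule, then upper-case any known callsign."""
    text = _FLIGHT_LEVEL.sub(_flight_level, text.lower().strip())
    words = [w.upper() if w in CALLSIGNS else w for w in text.split()]
    return " ".join(words)


print(format_display("camel climb flight level zero nine zero"))
```

Because every rule is a plain pattern and substitution, the same input always yields the same display text, which is the property that distinguishes this stage from an LLM formatter.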

## Models

### [ASR/whisper/](./ASR/whisper) - Whisper Large v3 (Legacy CTranslate2 backend)

Fine-tuned for Singapore military ATC speech. Uses CTranslate2 float16 format for fast inference with [faster-whisper](https://github.com/SYSTRAN/faster-whisper).

| Metric | Value |
|--------|-------|
| WER | **0.66%** |
| Base model | `openai/whisper-large-v3` |
| Size | 2.9 GB |
| Runtime | `faster-whisper` / CTranslate2 |

### [ASR/parakeet/](./ASR/parakeet) - Parakeet-TDT 0.6B v2 (NeMo checkpoint)

Fine-tuned NeMo Parakeet model for Singapore military ATC speech. Published as a raw checkpoint together with the tokenizer artifacts required to restore it.

| Metric | Value |
|--------|-------|
| Validation WER | **0.72%** |
| Base model | `nvidia/parakeet-tdt-0.6b-v2` |
| Size | 7.0 GB |
| Runtime | `nemo_toolkit[asr]` |

### [LLM/](./LLM) - Qwen3-1.7B Display Formatter (Legacy)

> **Legacy.** Superseded by the deterministic rule formatter. Retained for reference only.

Converts normalized ASR output into structured ATC display text.

| Metric | Value |
|--------|-------|
| Exact match | **100%** (161/161) |
| Base model | `unsloth/Qwen3-1.7B` |
| Size | 3.3 GB |

## Architecture

```text
Audio --> VAD (Silero) --> ASR backend --> Post-processing --> Rule Formatter --> Display Text
```

| Component | Technology | Notes |
|-----------|------------|-------|
| VAD | Silero VAD | Shared frontend for both ASR backends |
| ASR (legacy) | Whisper Large v3 (CTranslate2) | Lower-memory legacy backend |
| ASR (current NeMo path) | Parakeet-TDT 0.6B v2 | Fine-tuned NeMo checkpoint |
| Formatter | Deterministic rules | Converts normalized speech to ATC display text |
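The stage order in the table above can be sketched as a chain of callables. This is an illustrative composition only, assuming each stage is a plain function; the `Pipeline` class, its field names, and the use of `bytes` for audio are hypothetical and do not describe the ASTRA codebase.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Pipeline:
    """Hypothetical wiring of the four stages; each field is a plain callable."""

    vad: Callable[[bytes], Iterable[bytes]]   # audio -> speech segments
    asr: Callable[[bytes], str]               # segment -> raw transcript
    postprocess: Callable[[str], str]         # raw transcript -> normalized text
    formatter: Callable[[str], str]           # normalized text -> display text

    def run(self, audio: bytes) -> List[str]:
        """Return one display string per speech segment found by the VAD."""
        return [
            self.formatter(self.postprocess(self.asr(segment)))
            for segment in self.vad(audio)
        ]
```

Keeping the ASR backend behind a single callable is what would let Whisper and Parakeet share the same VAD frontend and formatter.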

## Domain

The models target Singapore military ATC: Tengah and Paya Lebar operations, military phraseology, 100+ callsigns, and approach, recovery, and emergency traffic.

## Training History

### ASR

| Run | WER | Base | Key Change |
|-----|-----|------|------------|
| ct2_run5 | 0.48% | jacktol/whisper-large-v3-finetuned-for-ATC | Initial fine-tune |
| ct2_run6 | 0.40% | jacktol/whisper-large-v3-finetuned-for-ATC | +augmentation, weight decay |
| ct2_run7 | 0.24% | jacktol/whisper-large-v3-finetuned-for-ATC | Frozen encoder, +50 real recordings |
| ct2_run8 | 0.66% | openai/whisper-large-v3 | Full retrain from base, enhanced augmentation |
| parakeet_atc | 0.72% | nvidia/parakeet-tdt-0.6b-v2 | NeMo fine-tune with ATC radio augmentation, best checkpoint at epoch 76 |
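For readers unfamiliar with the metric, the sketch below computes word error rate as word-level edit distance divided by the number of reference words. This is the standard definition, not necessarily the exact evaluation script behind the figures in the table; the function name `word_error_rate` and the whitespace tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

A corpus-level figure is usually obtained by summing the edit distances over all utterances and dividing by the total number of reference words, rather than averaging per-utterance rates.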

### LLM

| Run | Accuracy | Base | Key Change |
|-----|----------|------|------------|
| llm_run3 | 98.1% | Qwen3-8B | QLoRA 4-bit, 871 examples |
| llm_run4 | 100% | Qwen3-1.7B | bf16 LoRA, 1,915 examples with ASR noise augmentation |

## Quick Start

### Whisper ASR

```python
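from faster_whisper import WhisperModel

# Load the CTranslate2 model directory; float16 matches the published weights.
model = WhisperModel("./ASR/whisper", device="cuda", compute_type="float16")

# transcribe() returns a lazy generator; decoding runs as segments are consumed.
segments, info = model.transcribe("audio.wav", language="en", beam_size=5)
text = " ".join(seg.text.strip() for seg in segments)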
```

### Parakeet ASR

See [ASR/parakeet/README.md](./ASR/parakeet/README.md) for the NeMo restore example and tokenizer artifact requirements.

## Download

```bash
# Full repo
huggingface-cli download aether-raid/astra-atc-models --local-dir ./models

# Whisper ASR only
huggingface-cli download aether-raid/astra-atc-models --include "ASR/whisper/*" --local-dir ./models

# Parakeet ASR only
huggingface-cli download aether-raid/astra-atc-models --include "ASR/parakeet/*" --local-dir ./models

# LLM only (legacy)
huggingface-cli download aether-raid/astra-atc-models --include "LLM/*" --local-dir ./models
```
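The same downloads can be scripted from Python with `huggingface_hub.snapshot_download`, which accepts `allow_patterns` globs in place of the CLI's `--include`. The sketch below mirrors the commands above; the helper names `download_kwargs` and `download` and the `COMPONENT_PATTERNS` table are hypothetical conveniences, not part of this repo.

```python
from typing import Any, Dict

REPO_ID = "aether-raid/astra-atc-models"

# Glob patterns copied from the CLI commands above; None means the full repo.
COMPONENT_PATTERNS = {
    "all": None,
    "whisper": ["ASR/whisper/*"],
    "parakeet": ["ASR/parakeet/*"],
    "llm": ["LLM/*"],  # legacy formatter
}


def download_kwargs(component: str = "all", local_dir: str = "./models") -> Dict[str, Any]:
    """Build the keyword arguments for snapshot_download for one component."""
    if component not in COMPONENT_PATTERNS:
        raise ValueError(f"unknown component: {component!r}")
    return {
        "repo_id": REPO_ID,
        "allow_patterns": COMPONENT_PATTERNS[component],
        "local_dir": local_dir,
    }


def download(component: str = "all", local_dir: str = "./models") -> str:
    """Download the selected files and return the local directory path."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(component, local_dir))
```

For example, `download("whisper")` would fetch only the files under `ASR/whisper/` into `./models`.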