HachimiMT-60: Chinese→Vietnamese Web-Novel Translation Model
A 56.94M-parameter Marian-class Chinese-to-Vietnamese translation model optimized for web-novel content (xianxia, modern, cross-domain).
TL;DR
| Aspect | Value |
|---|---|
| Params | 56.94M |
| Architecture | Asymmetric Marian (8 encoder + 2 decoder, d_model 512) |
| Vocab | Custom SPM-BPE 24k joint ZH+VI |
| Max position | 512 |
| Best for | Xianxia + cross-domain web-novel paragraph translation |
Quick Start
from transformers import AutoTokenizer, MarianMTModel
import torch
# Use the GPU when one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("ngocdang83/HachimiMT-60-zh-vi")
model = MarianMTModel.from_pretrained("ngocdang83/HachimiMT-60-zh-vi").to(device).eval()
src = "他必须得抓紧时间了。凌伊山掏出手机,查询起了临江市最近开往雪霏市的机票。"
inp = tokenizer(src, return_tensors="pt", truncation=True, max_length=256).to(device)
with torch.inference_mode():
    out = model.generate(
        **inp,
        max_new_tokens=300,
        num_beams=4,
        early_stopping=True,
        no_repeat_ngram_size=2,
        repetition_penalty=1.2,
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
# Output: "Hắn phải tranh thủ thời gian rồi. Lăng Y Sơn lấy điện thoại ra, tra
# vé máy bay gần nhất từ thành phố Lâm Giang đến thành phố Tuyết Phi."
Fast CPU Runtime
This repository also includes a CTranslate2 INT8 export under
ct2-int8_float32/, used by the public demo Space for faster CPU inference.
import ctranslate2
from pathlib import Path
from huggingface_hub import snapshot_download
from transformers import AutoTokenizer
model_id = "ngocdang83/HachimiMT-60-zh-vi"
model_path = Path(snapshot_download(model_id, allow_patterns=[
    "config.json", "source.spm", "target.spm", "vocab.json", "tokenizer_config.json",
    "ct2-int8_float32/*",
]))
tokenizer = AutoTokenizer.from_pretrained(model_path)
translator = ctranslate2.Translator(
    str(model_path / "ct2-int8_float32"),
    device="cpu",
    compute_type="int8_float32",
)
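The snippet above stops once the translator is loaded. As a minimal sketch of the remaining step, the helper below follows the usual CTranslate2 pattern for Marian models: the tokenizer produces token strings, `translate_batch` returns hypotheses as token strings, and the tokenizer decodes them back to text. The function name `translate_ct2` and the default `beam_size=4` (chosen to mirror `num_beams=4` elsewhere in this card) are illustrative assumptions, not part of the repository.

```python
def translate_ct2(translator, tokenizer, text, beam_size=4):
    """Translate one string with a CTranslate2 translator and a Marian tokenizer."""
    # CTranslate2 consumes token strings rather than ids.
    source_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
    results = translator.translate_batch([source_tokens], beam_size=beam_size)
    # Best hypothesis for the single batch entry, as a list of token strings.
    target_tokens = results[0].hypotheses[0]
    target_ids = tokenizer.convert_tokens_to_ids(target_tokens)
    return tokenizer.decode(target_ids, skip_special_tokens=True)
```

With the `translator` and `tokenizer` objects created above, a call would look like `print(translate_ct2(translator, tokenizer, "他必须得抓紧时间了。"))`.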
Speed Benchmark
Tested on RTX 5070 Ti Laptop, num_beams=4, mixed test set (20 short + 20 medium + 20 long rows).
| Model | Params | Mean Latency | max_position | Notes |
|---|---|---|---|---|
| Hirashiba-tiny | 15.1M | 377ms | 512 | Fastest |
| Hirashiba-medium | 57.07M | 495ms | 128 | Truncates paragraphs |
| HachimiMT-60 (this) | 56.94M | 603ms | 512 | Handles long paragraphs without truncation |
Per-bucket mean latency (ms):
| Bucket | HachimiMT-60 | Hirashiba-medium | Hirashiba-tiny |
|---|---|---|---|
| short (~70-120 chars) | 330 | 390 | 310 |
| medium (~150-250 chars) | 626 | 546 | 430 |
| long (>250 chars) | 853 | 548 | 390 |
⚠️ Hirashiba-medium and Hirashiba-tiny truncate output on the medium and long
buckets because max_position_embeddings=128 caps output at ~120 tokens regardless
of source length. Their lower latency on the long bucket reflects truncated output
rather than faster decoding. HachimiMT-60 produces full-length output up to
~1000 chars without truncation.
For ultra-low-latency short-content use cases, consider Hirashiba-tiny. For paragraph-level web-novel translation, HachimiMT-60 is recommended.
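As a quick consistency check on the two tables above: the test set has 20 rows in each bucket, so the overall mean latency should equal the plain average of the three bucket means. The short sketch below recomputes it from the per-bucket figures; the dictionary simply restates the table values.

```python
from statistics import mean

# Per-bucket mean latency in ms (short, medium, long), copied from the table.
per_bucket_ms = {
    "HachimiMT-60": [330, 626, 853],
    "Hirashiba-medium": [390, 546, 548],
    "Hirashiba-tiny": [310, 430, 390],
}

# Equal bucket sizes (20 rows each) make the overall mean the average of bucket means.
overall_ms = {name: round(mean(values)) for name, values in per_bucket_ms.items()}
print(overall_ms)
# {'HachimiMT-60': 603, 'Hirashiba-medium': 495, 'Hirashiba-tiny': 377}
```

The recomputed values match the Mean Latency column.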
Architecture
MarianMTModel:
vocab_size: 24000
d_model: 512
encoder_layers: 8
decoder_layers: 2
encoder_attention_heads: 8
decoder_attention_heads: 8
encoder_ffn_dim: 3072
decoder_ffn_dim: 3072
max_position_embeddings: 512
share_encoder_decoder_embeddings: true
tie_word_embeddings: true
scale_embedding: true
activation_function: swish
Total params: 56,935,424 (~57M).
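The total can be reproduced from the configuration by hand. The sketch below assumes the standard Hugging Face Marian layout: biased linear layers, one layer norm after each sub-block, no embedding or final layer norm, a single shared embedding matrix, and the fixed sinusoidal position tables counted as (non-trainable) parameters. The helper name `marian_param_count` is illustrative.

```python
def marian_param_count(vocab, d_model, enc_layers, dec_layers, ffn_dim, max_pos):
    """Count parameters of a Marian-style encoder-decoder with tied embeddings."""
    def linear(n_in, n_out):
        return n_in * n_out + n_out          # weight + bias

    attention = 4 * linear(d_model, d_model)  # q, k, v and output projections
    layer_norm = 2 * d_model                  # scale + bias
    ffn = linear(d_model, ffn_dim) + linear(ffn_dim, d_model)

    encoder_layer = attention + layer_norm + ffn + layer_norm
    # Decoder layers add a cross-attention block with its own layer norm.
    decoder_layer = 2 * (attention + layer_norm) + ffn + layer_norm

    shared_embedding = vocab * d_model        # counted once because it is tied
    positions = 2 * max_pos * d_model         # encoder + decoder position tables
    return (shared_embedding + positions
            + enc_layers * encoder_layer + dec_layers * decoder_layer)

total = marian_param_count(24000, 512, 8, 2, 3072, 512)
print(f"{total:,}")  # 56,935,424
```

Under these assumptions the eight encoder layers account for about 33.6M parameters, the two decoder layers for about 10.5M, and the shared embedding for about 12.3M.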
Training Datasets
Primary training sources:
- ngocdang83/tran-vi-teacher: 350k strict-clean Chinese-Vietnamese parallel pairs from Gemini 2.5/3.0/3.1 teachers (Pro/Flash/Flash-Lite tiers). Provides paragraph-level training examples and cross-domain coverage (urban, fantasy, sci-fi, history).
- chi-vi/hirashiba-mt-zh2vi-b-filtered: filtered Chinese-Vietnamese translation dataset for the web-novel domain.
- Gold teacher data generated via the Gemini API, used as additional quality-targeted training examples.
Decode Configuration
Recommended generation parameters:
out = model.generate(
    **inputs,
    max_new_tokens=300,        # adjust based on expected length
    num_beams=4,               # quality/speed tradeoff
    early_stopping=True,
    no_repeat_ngram_size=2,    # prevent repetition
    repetition_penalty=1.2,
)
For shorter inputs (a single sentence), reduce max_new_tokens to 150.
For long paragraphs, increase it to 400.
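One way to apply this guidance automatically is to pick the budget from the source length. The sketch below is only an illustration: the card does not define where "short" ends or "long" begins, so the character thresholds are assumptions borrowed from the benchmark buckets (up to 120 chars short, over 250 chars long), and `pick_max_new_tokens` is a hypothetical helper.

```python
def pick_max_new_tokens(src, short_chars=120, long_chars=250):
    """Choose a max_new_tokens budget from the source length in characters."""
    n = len(src)
    if n <= short_chars:   # single sentence (assumed threshold)
        return 150
    if n > long_chars:     # long paragraph (assumed threshold)
        return 400
    return 300             # default from the recommended configuration
```

The result can be passed straight to `model.generate(..., max_new_tokens=pick_max_new_tokens(src))`.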
Intended Uses
Recommended
- Chinese-Vietnamese web-novel translation (xianxia, tu tiên, fantasy, sci-fi)
- Paragraph-level translation (handles up to ~1000 chars output without truncation)
- Cross-domain content (Lovecraftian, urban, school, military)
- Production deployment with batched inference
Not Recommended
- Non-Chinese sources (ZH→VI only, not bidirectional)
- Traditional Chinese (繁體) input: the model was trained on Simplified Chinese (简体). Traditional characters may degrade output quality; convert to Simplified first (e.g. via opencc).
- Bilingual editing/post-editing without verification: automated MT should be reviewed before publication.
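For the Traditional Chinese case, the conversion step might look like the sketch below. It assumes an OpenCC Python binding is installed that exposes `OpenCC` with the `t2s` (Traditional to Simplified) configuration; the `to_simplified` helper and its optional `converter` argument are illustrative and not part of this repository.

```python
def to_simplified(text, converter=None):
    """Convert Traditional Chinese text to Simplified before translation."""
    if converter is None:
        # Assumption: an OpenCC binding providing `OpenCC("t2s")` is installed.
        from opencc import OpenCC
        converter = OpenCC("t2s")
    return converter.convert(text)
```

Passing a pre-built converter avoids reloading the conversion dictionaries on every call when many inputs are processed.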
Limitations
- Hallucination on rare proper nouns: Western names (Klein, Audrey, Bernadette) are usually preserved, but uncommon proper nouns may be hallucinated.
- Trained on web-novel corpus: scientific, legal, or news domains may give suboptimal results.
- Long-context drift: When translating a single long input (>200 chars in one go), proper names and consistent terminology may drift after several mentions in the same context (e.g., "Trương Vũ" → "Trương Huyền" → "Trương XX"). Mitigation: split long inputs by paragraph/sentence and translate each chunk independently. The HF Space demo applies this automatically.
- Output length ceiling: quality may degrade when a single chunk produces more than ~1000 chars of output.
- Simplified Chinese only: Traditional Chinese inputs are untested and likely to degrade output quality.
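The chunking mitigation for long-context drift can be sketched as follows. This is not the demo Space's actual implementation, which is not shown here; it is a minimal greedy splitter that assumes sentences end with 。！？ (or ASCII ! ?) and uses the ~200-char figure from the limitation above as the default budget. The name `split_into_chunks` is hypothetical.

```python
import re

# A sentence: text up to sentence-ending punctuation, plus any closing quotes.
_SENTENCE = re.compile(r"[^。！？!?\n]+[。！？!?]*[”’」』]*")

def split_into_chunks(text, max_chars=200):
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    sentences = [s.strip() for s in _SENTENCE.findall(text)]
    chunks, current = [], ""
    for sentence in filter(None, sentences):
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks

src = "他必须得抓紧时间了。凌伊山掏出手机,查询起了临江市最近开往雪霏市的机票。"
print(split_into_chunks(src, max_chars=20))
# ['他必须得抓紧时间了。', '凌伊山掏出手机,查询起了临江市最近开往雪霏市的机票。']
```

Each chunk is then translated independently and the outputs are joined. A single sentence longer than `max_chars` is kept whole rather than cut mid-sentence.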
Evaluation Methodology
Quality validation uses a three-reviewer AI pattern to obtain cross-validated, human-style preference judgments and to reduce single-model bias.
Reviewers
Three independent reviewer sessions, each using a different agent/runtime context:
- Reviewer 1: gemini-3.1-pro via Gemini CLI
- Reviewer 2: gemini-3.5-flash via Gemini CLI (different temperature)
- Reviewer 3: gemini-3.5-flash via Antigravity (AGY) agents
Each reviewer reads one review TSV in isolation — they cannot see other reviewers' outputs.
Scoring
Per row, per model:
- Severity 0-3 scale (0 = OK / acceptable, 1 = minor error, 2 = moderate error, 3 = severe error — hallucination, truncation, or word salad)
- Winner pick: choose the best of 4 model outputs, or tie/all_bad
- winner_reason: short text (model-specific failure modes or strengths)
Aggregation
- Pooled severity = mean of all severity scores across reviewers (lower = better)
- Winner aggregate = vote count across 180 judgments (60 rows × 3 reviewers)
- Trio consensus = rows where all 3 reviewers agree on the same winner (highest-confidence signal)
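The three aggregates above can be computed in a few lines. The sketch below assumes each judgment is a dict with `row`, `reviewer`, `winner`, and a `severity` mapping from model name to a 0-3 score; this record layout and the `aggregate` function are illustrative, since the actual review TSV schema is not given here.

```python
from collections import Counter
from statistics import mean

def aggregate(judgments):
    """Return (pooled severity, winner votes, trio consensus) for a list of judgments."""
    # Pooled severity: mean over every (row, reviewer) score, per model.
    scores = {}
    for j in judgments:
        for model, severity in j["severity"].items():
            scores.setdefault(model, []).append(severity)
    pooled = {model: mean(values) for model, values in scores.items()}

    # Winner aggregate: one vote per judgment.
    votes = Counter(j["winner"] for j in judgments)

    # Trio consensus: rows where all three reviewers picked the same winner.
    by_row = {}
    for j in judgments:
        by_row.setdefault(j["row"], []).append(j["winner"])
    consensus = {row: picks[0] for row, picks in by_row.items()
                 if len(picks) == 3 and len(set(picks)) == 1}
    return pooled, votes, consensus
```

With 60 rows and 3 reviewers, `sum(votes.values())` is 180, matching the judgment count above.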
Test Sets
Two complementary evaluation sets covering web-novel translation diversity:
- Cross-novel paragraph (60 rows, 20 short + 20 medium + 20 long buckets): random paragraphs from two web-novels (Lovecraftian fantasy + sci-fi mecha); tests cross-domain and long-output handling.
- Xianxia in-distribution (60 rows, 30 classical xianxia + 30 modern xianxia hybrid chapter excerpts): tests xianxia genre quality and register polish (Hán Việt accuracy, tu tiên vocabulary, modern colloquial Vietnamese register).
Anti-Bias Rules
To prevent single-reviewer drift:
- Each session opens only one review file (no cross-read)
- Anti-boilerplate rules enforced (no default severity=0, no default winner=tie)
- Reviewer-specific bias patterns identified post-hoc and weighted in interpretation
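How the anti-boilerplate rule is enforced is not specified. One possible reading is an automatic check that rejects a review file in which every severity is 0 or every winner is `tie`. The sketch below implements only that reading; the row layout (dicts with `winner` and a `severity` mapping) and the name `looks_boilerplate` are assumptions.

```python
def looks_boilerplate(rows):
    """Flag one reviewer's file if it only contains default-looking answers."""
    if not rows:
        return False
    # Every model on every row scored 0: likely a default severity.
    all_zero = all(s == 0 for r in rows for s in r["severity"].values())
    # Every row marked as a tie: likely a default winner.
    all_tie = all(r["winner"] == "tie" for r in rows)
    return all_zero or all_tie
```

A flagged file would be a candidate for re-review rather than being pooled with the others.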
Citation
@misc{hachimimt60-2026,
  author = {ngocdang83},
  title = {HachimiMT-60: Chinese-to-Vietnamese Web-Novel Translation},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/ngocdang83/HachimiMT-60-zh-vi}
}
License
CC-BY-4.0 — free use with attribution. Training data includes Gemini API teacher distillation; downstream users should verify current Gemini API terms for derivative-work training.