60-second takeaway
We ran one consistent single-speaker benchmark on IMDA NSC FEMALE_01 with a single-GPU setup. VoxCPM, IndexTTS2, and Qwen3-TTS all produced usable outputs under specific settings; CosyVoice3 did not reach production-ready quality in this run. Treat this as an execution benchmark under one configuration, not a universal model ranking.
Who this is for
Founder / strategy reader: use the matrix and decision guide to pick what to deploy next.
Engineer reader: use each linked deep dive for exact recipes, checkpoints, and failure diagnostics.
Shared experiment setup
Dataset: IMDA NSC single-speaker set (FEMALE_01), with model-specific preprocessing.
Hardware: single NVIDIA RTX 3090 Ti (24 GB VRAM).
Evaluation: qualitative listening on naturalness, accent retention, noise profile, long-text stability, and operational friction (VRAM, disk, rerun complexity).
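To keep qualitative listening comparable across models, the five criteria above can be recorded in a small structured note. The following is a minimal sketch, not the tooling used in this benchmark: the `ListeningNote` class, its field names, and the 1 to 5 rating scale are all hypothetical.

```python
from dataclasses import dataclass, field

# Criteria taken from the evaluation line above.
# The 1-5 rating scale is an assumption made for this sketch.
CRITERIA = (
    "naturalness",
    "accent_retention",
    "noise_profile",
    "long_text_stability",
    "operational_friction",
)


@dataclass
class ListeningNote:
    """One listener's ratings for one model checkpoint (hypothetical schema)."""

    model: str
    checkpoint: str
    scores: dict = field(default_factory=dict)  # criterion name -> int in 1..5
    comment: str = ""

    def rate(self, criterion: str, score: int) -> None:
        """Store a rating, rejecting unknown criteria and out-of-range scores."""
        if criterion not in CRITERIA:
            raise ValueError(f"unknown criterion: {criterion}")
        if not 1 <= score <= 5:
            raise ValueError("score must be between 1 and 5")
        self.scores[criterion] = score

    def missing(self) -> list:
        """Return the criteria that still need a rating."""
        return [c for c in CRITERIA if c not in self.scores]
```

A note counts as complete once `missing()` returns an empty list, which makes it easy to spot checkpoints that were only partially reviewed.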
Comparison matrix
| Model | Dataset handling | Train recipe | Best checkpoint in this run | Main failure mode | Recommended inference setting |
| CosyVoice2 (baseline/control) | Baseline sample used as control | No finetune in this benchmark | Baseline control sample only | Not evaluated as a finetune target in this series | Use as control reference only |
| CosyVoice3 | IMDA NSC | | | | |
Interpretation note: "Not production-ready" here means "not production-ready in this experiment configuration." We plan a follow-up CosyVoice3 rerun with revised settings.
Decision guide
If you need deployable output fastest
Start with VoxCPM step 4000 or Qwen3-TTS LoRA epoch 10 at scale 0.3 to 0.35.
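For readers unfamiliar with what a LoRA "scale" controls, here is a minimal sketch of the usual formulation, in which the adapter's low-rank update is multiplied by the scale before being added to the base weights. This is an assumption about how the scale is applied, not a description of the Qwen3-TTS inference code; the function names and the toy matrices are illustrative only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base, lora_b, lora_a, scale):
    """Return base + scale * (lora_b @ lora_a).

    Assumption: this is the common LoRA merge rule, not a confirmed
    detail of any specific TTS model's implementation.
    """
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy 2x2 base weights with a rank-1 adapter; all numbers are illustrative.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # shape (2, 1)
lora_a = [[0.5, -0.5]]   # shape (1, 2)

for scale in (0.0, 0.3, 0.35, 1.0):
    print(scale, apply_lora(base, lora_b, lora_a, scale))
```

Under this formulation, a scale of 0 reproduces the base model and a scale of 1 applies the full finetuned update, so a value of 0.3 to 0.35 blends in only part of the adapter.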
If your priority is configuration stability and reproducibility
Use IndexTTS2 as a stable full-SFT reference, and pin checkpoints based on explicit listening tests.
If you are evaluating CosyVoice for this dataset
Use CosyVoice2 as your baseline control and treat CosyVoice3 in this run as a diagnostics case, not a final verdict.
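The three branches of the decision guide can also be written as a small lookup, which may be handy if the recommendation needs to live in a script or runbook. This is only a sketch: the priority keys and the `recommend` function are hypothetical names, and the recommendation text restates the guide above for this benchmark configuration.

```python
# Hypothetical priority keys; the recommendation text restates the decision guide.
RECOMMENDATIONS = {
    "fastest_deployable": (
        "Start with VoxCPM step 4000 or Qwen3-TTS LoRA epoch 10 "
        "at scale 0.3 to 0.35."
    ),
    "stability_reproducibility": (
        "Use IndexTTS2 as a stable full-SFT reference and pin checkpoints "
        "by explicit listening tests."
    ),
    "evaluating_cosyvoice": (
        "Use CosyVoice2 as the baseline control and treat CosyVoice3 "
        "in this run as a diagnostics case, not a final verdict."
    ),
}


def recommend(priority: str) -> str:
    """Return the guide's recommendation for a priority key."""
    try:
        return RECOMMENDATIONS[priority]
    except KeyError:
        valid = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {valid}")


print(recommend("fastest_deployable"))
```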
Full artifact mapping (checkpoints, sample paths, logs):
reports/tts-experiments-evidence-map.md
If you want the short version for exec stakeholders, use this page. If you want exact commands and failure traces, open the model-specific deep dives above.