IndexTTS2 Finetuning on IMDA NSC FEMALE_01
07 Feb 2026, 00:00 Z
60-second takeaway
IndexTTS2 gave us a usable full-SFT baseline with strong operational predictability once we stabilized restart behavior.
In this run, `model_step14000.pth` was the practical checkpoint to keep.
The major challenge was process reliability and checkpoint retention policy, not core output quality.
Where this fits
- For founders: IndexTTS2 is a steady full-finetune option in this benchmark.
- For engineers: this page focuses on run recovery and checkpoint management as much as quality.
The fine-tuning pipeline used in this benchmark is open-source: instavar/indextts2-finetuning - the first public fine-tuning code for IndexTTS2 (the official repo is inference-only).
Experiment setup
- Model: IndexTTS2
- Dataset: IMDA NSC `FEMALE_01_44k` processed manifests
- Hardware: RTX 3090 Ti 24 GB
- Training mode: full SFT with resume
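Full SFT with resume means every restart has to find the newest saved checkpoint before training continues. A minimal sketch of that discovery step, assuming the hypothetical `model_step{N}.pth` naming convention used in this run (the function name and layout are illustrative, not the pipeline's actual API):

```python
def latest_checkpoint(filenames):
    """Return the checkpoint filename with the highest step, or None.

    Assumes files follow the hypothetical model_step{N}.pth convention;
    unrelated files (logs, configs) are ignored.
    """
    steps = {}
    for name in filenames:
        if name.startswith("model_step") and name.endswith(".pth"):
            steps[int(name[len("model_step"):-len(".pth")])] = name
    return steps[max(steps)] if steps else None
```

On restart, the returned filename would be passed to `torch.load` to restore model and optimizer state before the loop resumes.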
Best checkpoint logic
- The best validation region was around step 13800.
- Saved checkpoints available around that region were `model_step14000.pth` and, later, `model_step15949.pth`.
- We treated `model_step14000.pth` as the practical best anchor for this run.
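The "practical best" logic above reduces to picking the saved step closest to the best validation region, since the exact best step was not retained. A one-line sketch, with hypothetical names:

```python
def nearest_checkpoint(saved_steps, best_val_step):
    """Pick the saved step closest to the best validation region.

    saved_steps: steps that survived the retention policy (illustrative).
    best_val_step: the step where validation loss bottomed out.
    """
    return min(saved_steps, key=lambda s: abs(s - best_val_step))
```

With the numbers from this run, `nearest_checkpoint([14000, 15949], 13800)` picks 14000, matching the choice of `model_step14000.pth`.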
Audio evidence
Representative sample
Settings: long-text prompt comparison path, step 14000 checkpoint.
Failure modes we saw
- Training runs were interrupted multiple times and required explicit resume management.
- Some crashes were low-level (segfault signs in `pt_autograd_0`), which made clean logs critical.
- The retention policy kept only a recent window of checkpoints, so older steps disappeared automatically.
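The retention failure mode above is worth guarding against: a pure last-K window silently deletes the best checkpoint once training runs past it. A sketch of a retention rule that keeps the recent window plus explicitly pinned steps (all names are illustrative, not the pipeline's actual config):

```python
def retained(saved_steps, keep_last=3, pinned=()):
    """Return the steps to keep under a window-plus-pins policy.

    keep_last: size of the recent window (hypothetical default).
    pinned: steps the operator marked as keep-forever, e.g. the
            best-validation anchor; only pins that actually exist survive.
    """
    keep = set(sorted(saved_steps)[-keep_last:]) | (set(pinned) & set(saved_steps))
    return sorted(keep)
```

Pinning step 14000 here would have prevented the silent loss we hit when the window rolled past the best validation region.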
Recommended inference settings
- Keep checkpoint selection tied to both listening and nearest validation region.
- Prefer explicit run logs and resume metadata over implicit state.