netlinux-ai

## 08-f5-vs-styletts2-tradeoff.md

      
            
            
Created May 9, 2026 07:13

F5-TTS vs StyleTTS2: real Pareto trade-off in fine-tune behaviour (accent strength vs phonetic stability)

F5-TTS vs StyleTTS2: a real Pareto trade-off in fine-tune behaviour

Running two TTS architectures on the same small fine-tune corpus surfaces
a real trade-off: F5-TTS commits hard to accent character at the cost of
phonetic stability; StyleTTS2 stays phonetically stable at the cost of
accent commitment. Neither dominates, and each has its own distinct
late-epoch failure mode.

This is a write-up of the comparison, with the concrete failure modes that
made the trade-off visible.
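
To make "neither dominates" concrete, here is a minimal sketch of a Pareto-dominance check over two axes (accent strength and phonetic stability, higher is better). The labels and scores are hypothetical placeholders chosen for illustration, not measurements from the comparison, and the function names are ours.

```python
def dominates(a, b):
    """True if a is at least as good as b on every axis and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the names of all points that no other point dominates."""
    return {
        name
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    }


# Hypothetical (accent_strength, phonetic_stability) scores. These are
# placeholder values, not results: "accent_heavy" stands in for a model that
# commits to accent, "stable" for one that stays phonetically stable.
SCORES = {
    "accent_heavy": (0.9, 0.5),
    "stable": (0.5, 0.9),
}

if __name__ == "__main__":
    print(sorted(pareto_front(SCORES)))
```

With scores shaped like these, both entries sit on the Pareto front: each wins on one axis and loses on the other, so choosing between them depends on which axis matters more for the use case.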

  
## 07-tts-listening-loop.md

      
            
            
Created May 9, 2026 07:11

How human feedback actually steers TTS fine-tuning: the listening loop, with worked examples

How human feedback actually steers TTS fine-tuning

Notes on the iteration loop we ran while fine-tuning F5-TTS and StyleTTS2 on
a small Northern English corpus. The headline finding is that the listening
test isn't optional polish at the end: it's the only measurement that
catches the failure modes that matter, and each round of listening produces
specific phonetic observations that map to specific engineering decisions.

This is a write-up of the methodology, with the concrete examples that
forced each decision.

  
## 06-non-avx2-cpu-tts-compat.md

      
            
            
Created May 9, 2026 07:10

Non-AVX2 CPU TTS compatibility: F5-TTS / StyleTTS2 / kokoro / whisper.cpp on a Phenom II X6

Running modern Python TTS toolchains on non-AVX2 CPUs

Notes from getting F5-TTS, StyleTTS2, kokoro/Misaki, and whisper.cpp to work
on an AMD Phenom II X6 1090T (2010 K10/Family-10h architecture).

The CPU has SSE/SSE2/SSE3/SSE4a, plus CX16/POPCNT/LAHF, but no SSE4.1, no
SSE4.2, no AVX, no AVX2, no FMA, no F16C. That puts it below the modern
x86-64-v2 baseline. A growing share of binary Python wheels in the AI
ecosystem assume v2 or v3, so they crash with SIGILL or SIGFPE at import.
This is a ground-truth list of what we hit and what worked.
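
Before installing binary wheels on a machine like this, it can help to check which x86-64 feature level the CPU actually meets. The sketch below assumes Linux and the flag spellings used on the `flags` line of `/proc/cpuinfo`. The helper names are ours, and the sample flag line is built only from the features named above; it is not a real dump from the machine.

```python
import os

# Flag names as they appear on the "flags" line of /proc/cpuinfo on Linux.
# SSE3 is reported as "pni" and LAHF/SAHF in long mode as "lahf_lm".
X86_64_V2 = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
# LZCNT is reported as "abm"; "xsave" stands in for OSXSAVE here (an assumption).
X86_64_V3 = X86_64_V2 | {
    "avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave",
}


def parse_flags(cpuinfo_text):
    """Return the feature-flag set from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


def missing_for_level(flags, required):
    """Return the sorted list of required flags that the CPU does not report."""
    return sorted(required - flags)


# Illustrative flag line built only from the features named in the text above,
# not a real /proc/cpuinfo dump.
SAMPLE = "flags\t\t: sse sse2 pni sse4a cx16 popcnt lahf_lm\n"

if __name__ == "__main__":
    text = SAMPLE
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            text = f.read()
    flags = parse_flags(text)
    print("missing for x86-64-v2:", missing_for_level(flags, X86_64_V2))
    print("missing for x86-64-v3:", missing_for_level(flags, X86_64_V3))
```

On the sample line, the v2 check reports `sse4_1`, `sse4_2` and `ssse3` as missing. Any wheel compiled against v2 or higher can therefore hit an illegal instruction on such a CPU.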

  
## 05-README.md

      
            
            
Created May 9, 2026 07:09

Minimal F5-TTS fine-tune trainer: bypasses HuggingFace datasets/accelerate, single file, ~250 LOC

Minimal F5-TTS fine-tune trainer (no datasets / pyarrow / accelerate)

A ~250-line trainer for F5-TTS that bypasses the HuggingFace datasets and
accelerate dependency stack. Useful when:

- the upstream f5-tts_finetune-cli won't install/run because of pyarrow /
  pandas / datasets issues (binary-wheel CPU baseline mismatches, missing
  dependencies, etc.)
- you want a single-file trainer you can read end-to-end and modify
- you want to use precomputed mel-spectrograms loaded from disk rather than