# Project: Local Text-to-Speech (TTS)
## Status: Environment Configured, Model Downloaded and Verified
- **Directory:** `/home/openclaw/.openclaw/workspace/projects/local-tts`
- **Environment:** Python venv
- **Packages Installed:** `piper-tts`, `beautifulsoup4`, `requests` (optimized for CPU-only use on 2026-03-27)
- **Model:** `en_US-libritts_r-medium.onnx`
- **Status:** Inference pipeline verified and operational. Disk footprint reduced from 6.1GB to 294MB.
## Recent Work
- Successfully generated ~400MB WAV files.
- Debugged `piper` and `wave` interaction to resolve 0-byte file issues.
- Established `tts_script.py` as a stable wrapper with dynamic speed adjustment (via `--speed` parameter).
- Optimized environment by removing unnecessary CUDA/PyTorch/NVIDIA dependencies.
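
These notes do not record the root cause of the 0-byte files, so the following is only a sketch of one defensive pattern for the `wave` side: set the header parameters explicitly, let a context manager close the file, and fail loudly if the result is header-only. A generated tone stands in for Piper output; the 22050 Hz sample rate, the `write_wav` and `make_test_tone` helpers, and the 44-byte header threshold are illustrative assumptions, not taken from `tts_script.py`.

```python
import math
import os
import struct
import tempfile
import wave

# Assumed sample rate; in practice read it from the model's JSON config.
SAMPLE_RATE = 22050
# Size of a canonical PCM WAV header; a file this small holds no audio.
WAV_HEADER_BYTES = 44


def make_test_tone(seconds: float = 0.5, freq: float = 440.0) -> bytes:
    """Hypothetical stand-in for synthesized audio: 16-bit mono PCM sine wave."""
    n_frames = int(SAMPLE_RATE * seconds)
    samples = (
        int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
        for i in range(n_frames)
    )
    return b"".join(struct.pack("<h", s) for s in samples)


def write_wav(path: str, pcm: bytes, sample_rate: int = SAMPLE_RATE) -> int:
    """Write 16-bit mono PCM to `path` and return the file size in bytes."""
    # The context manager closes the file, which finalizes the header sizes.
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    size = os.path.getsize(path)
    if size <= WAV_HEADER_BYTES:
        raise RuntimeError(f"{path} contains no audio frames ({size} bytes)")
    return size


demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
demo_pcm = make_test_tone()
demo_size = write_wav(demo_path, demo_pcm)
```

The size check after closing is cheap and turns a silent empty file into an immediate error, which is useful when the synthesis step upstream yields no audio.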
## Next Steps
1. User provides a target URL.
2. Execute `tts_script.py` (e.g., `python3 tts_script.py <url> <model> <config> <output> --speed 0.9`).
3. Retrieve or play back the audio from `workspace/projects/local-tts/`.
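
The command line in step 2 could be parsed along these lines. This is a hedged sketch, not the actual contents of `tts_script.py`: the `build_parser` and `speed_to_length_scale` helpers are hypothetical, and the mapping from `--speed` to Piper's `length_scale` (where larger values give slower speech) as its reciprocal is an assumption about how the wrapper works. The URL and file names in the usage lines are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument layout mirroring: tts_script.py <url> <model> <config> <output> --speed N."""
    parser = argparse.ArgumentParser(
        prog="tts_script.py",
        description="Fetch a page and synthesize its text to a WAV file.",
    )
    parser.add_argument("url", help="page to read aloud")
    parser.add_argument("model", help="path to the .onnx voice model")
    parser.add_argument("config", help="path to the model's JSON config")
    parser.add_argument("output", help="path of the WAV file to write")
    parser.add_argument("--speed", type=float, default=1.0,
                        help="playback speed factor; 1.0 is the model default")
    return parser


def speed_to_length_scale(speed: float) -> float:
    """Assumed mapping: speed 0.9 -> length_scale ~1.11 (slower speech)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return 1.0 / speed


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_parser().parse_args([
    "https://example.com/article",          # placeholder URL
    "en_US-libritts_r-medium.onnx",
    "en_US-libritts_r-medium.onnx.json",    # assumed config file name
    "out.wav",
    "--speed", "0.9",
])
length_scale = speed_to_length_scale(args.speed)
```

Keeping the speed conversion in its own function makes it easy to validate the value once and to change the mapping if the wrapper turns out to interpret `--speed` differently.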