# Bark TTS Integration Guide

## Overview

OpenCCB integrates with **Suno AI's Bark** text-to-speech system to generate audio versions of questions. Students can listen to questions instead of only reading them, which improves accessibility and supports different learning styles.

## Architecture

```
┌─────────────────┐     HTTP      ┌─────────────────┐
│   OpenCCB CMS   │ ────────────> │   Bark TTS API  │
│   (PostgreSQL)  │ <──────────── │  (Server t-800) │
│                 │     Audio     │                 │
└─────────────────┘               └─────────────────┘
```

## Deployment to t-800 Server

### Prerequisites

- SSH access to t-800 server
- At least 8GB RAM recommended (Bark loads large models)
- 10GB free disk space
- Python 3.8+
- GPU optional (CUDA support for faster generation)

### Quick Deploy

```bash
# From your local machine
cd /home/juan/dev/openccb
./scripts/deploy_to_t800.sh
```

This will:

1. SSH into t-800
2. Install Python dependencies
3. Clone Bark repository
4. Set up systemd service
5. Start the API server

### Manual Deploy

```bash
# SSH into t-800
ssh juan@t-800

# Run installation script
wget https://raw.githubusercontent.com/suno-ai/bark/main/scripts/install.sh
sudo bash install.sh
```

## API Endpoints

Once deployed, the Bark API is available at `http://t-800:8000`.

### Health Check

```bash
curl http://t-800:8000/health
```

### List Available Voices

```bash
curl http://t-800:8000/api/voices
```

### Generate Speech

```bash
# Basic usage
curl "http://t-800:8000/api/generate?text=What%20color%20is%20the%20sky%3F" \
  -o question.wav

# With specific voice and speed
curl "http://t-800:8000/api/generate?text=Hello%20World&voice=v2/en_speaker_6&speed=1.2" \
  -o greeting.wav

# Spanish voice
curl "http://t-800:8000/api/generate?text=Hola%20mundo&voice=v2/es_speaker_0" \
  -o saludo.wav
```

## Available Voices

### English Voices

- `v2/en_speaker_0` through `v2/en_speaker_9`

### Spanish Voices

- `v2/es_speaker_0` through `v2/es_speaker_9`

## Integration with OpenCCB

### Generate Audio for a Question

```bash
# Via API
curl -X POST "http://localhost:3001/question-bank/{question_id}/generate-audio" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "What color is the sky?",
    "voice": "v2/en_speaker_1",
    "speed": 1.0
  }'
```

### Automatic Audio Generation

When creating a question, set `generate_audio` to `true` in the `POST /question-bank` body to trigger asynchronous audio generation:

```json
{
  "question_text": "What is the capital of France?",
  "question_type": "multiple-choice",
  "options": ["Paris", "London", "Berlin", "Madrid"],
  "correct_answer": 0,
  "explanation": "Paris is the capital of France.",
  "generate_audio": true
}
```

## Configuration

### Environment Variables

Add to your `.env` file:

```bash
# Bark TTS API URL
BARK_API_URL=http://t-800:8000

# Optional: Default voice for audio generation
BARK_DEFAULT_VOICE=v2/en_speaker_1

# Optional: Default speed
BARK_DEFAULT_SPEED=1.0
```

## Performance Optimization

### Model Preloading

Bark preloads its models on startup, which takes about 30 seconds. The systemd service handles this automatically.

### Memory Management

The systemd service includes memory limits:

```ini
MemoryMax=4G
MemoryHigh=3G
```

Adjust these based on your server's capacity.

### Batch Generation

For importing many questions:

```bash
# Generate audio for multiple questions
curl "http://t-800:8000/api/generate/batch?texts=Question%201&texts=Question%202&voice=v2/en_speaker_1"
```

## Troubleshooting

### Service Not Starting

```bash
# Check status
sudo systemctl status bark-tts

# View logs
sudo journalctl -u bark-tts -f

# Restart service
sudo systemctl restart bark-tts
```

### Out of Memory

If Bark crashes due to memory:

1. Raise `MemoryMax` in the systemd service if the server has spare RAM
2. Use smaller models: `suno/bark-small`
3. Process questions one at a time

### Slow Generation

- GPU acceleration: Install CUDA-enabled PyTorch
- Reduce audio quality settings
- Use shorter text segments

## Testing

```bash
# Playback uses SoX (`play`); non-ASCII characters must be percent-encoded

# Test English voice
curl "http://t-800:8000/api/generate?text=The%20quick%20brown%20fox&voice=v2/en_speaker_1" | play -t wav -

# Test Spanish voice
curl "http://t-800:8000/api/generate?text=El%20r%C3%A1pido%20zorro%20marr%C3%B3n&voice=v2/es_speaker_0" | play -t wav -
```

## Security Notes

- Bark API runs on internal network only
- No authentication required (assumes trusted network)
- Rate limiting handled by OpenCCB
- Audio files stored in `uploads/audio/` directory

## Future Enhancements

- [ ] Add authentication to Bark API
- [ ] Support for custom voice cloning
- [ ] Audio preprocessing (noise reduction, normalization)
- [ ] Caching layer for repeated requests
- [ ] WebSocket support for streaming audio

## References

- [Bark GitHub](https://github.com/suno-ai/bark)
- [Bark Hugging Face](https://huggingface.co/suno/bark)
- [OpenCCB Question Bank Documentation](../docs/question-bank.md)
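
## Example: Building Generate URLs

The `curl` examples in this guide percent-encode the `text` parameter by hand, which is easy to get wrong for punctuation and accented characters. The following is a minimal sketch of doing the encoding in Python with the standard library only. The function name `build_generate_url` is hypothetical and not part of OpenCCB or Bark. The sketch assumes the `/api/generate` endpoint accepts the `text`, `voice`, and `speed` query parameters shown in the Generate Speech section.

```python
from urllib.parse import quote, urlencode

# Assumption: same base URL as BARK_API_URL in the Configuration section.
BARK_API_URL = "http://t-800:8000"


def build_generate_url(text, voice=None, speed=None, base_url=BARK_API_URL):
    """Return a /api/generate URL with percent-encoded query parameters.

    Hypothetical helper for illustration only.
    """
    params = {"text": text}
    if voice is not None:
        params["voice"] = voice
    if speed is not None:
        params["speed"] = speed
    # quote_via=quote encodes spaces as %20 (matching the curl examples);
    # safe="/" keeps the slash in voice names such as v2/en_speaker_1 readable.
    query = urlencode(params, quote_via=quote, safe="/")
    return f"{base_url}/api/generate?{query}"


if __name__ == "__main__":
    print(build_generate_url("What color is the sky?"))
    print(build_generate_url("El rápido zorro marrón", voice="v2/es_speaker_0"))
```

The resulting URL can be passed to `curl` or to any HTTP client. Non-ASCII characters are encoded as UTF-8 bytes, so `rápido` becomes `r%C3%A1pido`.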