Nurfog/openccb

Nurfog be699ad6ab feat: token count implement

2026-03-17 12:07:56 -03:00

Bark TTS Integration Guide

Overview

OpenCCB integrates with Bark, Suno AI's text-to-speech model, to generate audio versions of questions. Students can listen to a question instead of only reading it, which improves accessibility and supports different learning styles.

Architecture

┌─────────────────┐     HTTP      ┌─────────────────┐
│   OpenCCB CMS   │ ────────────> │   Bark TTS API  │
│  (PostgreSQL)   │ <──────────── │   (Server t-800)│
│                 │    Audio      │                 │
└─────────────────┘               └─────────────────┘

Deployment to t-800 Server

Prerequisites

SSH access to t-800 server
At least 8GB RAM recommended (Bark loads large models)
10GB free disk space
Python 3.8+
GPU optional (CUDA support for faster generation)

Quick Deploy

# From your local machine
cd /home/juan/dev/openccb
./scripts/deploy_to_t800.sh

This will:

SSH into t-800
Install Python dependencies
Clone Bark repository
Set up systemd service
Start the API server

Manual Deploy

# SSH into t-800
ssh juan@t-800

# Download the installation script, review it, then run it
wget https://raw.githubusercontent.com/suno-ai/bark/main/scripts/install.sh
sudo bash install.sh

API Endpoints

Once deployed, the Bark API is available at http://t-800:8000.

Health Check

curl http://t-800:8000/health

List Available Voices

curl http://t-800:8000/api/voices

Generate Speech

# Basic usage
curl "http://t-800:8000/api/generate?text=What%20color%20is%20the%20sky%3F" \
  -o question.wav

# With specific voice and speed
curl "http://t-800:8000/api/generate?text=Hello%20World&voice=v2/en_speaker_6&speed=1.2" \
  -o greeting.wav

# Spanish voice
curl "http://t-800:8000/api/generate?text=Hola%20mundo&voice=v2/es_speaker_0" \
  -o saludo.wav
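
Hand-encoding query strings is error-prone, particularly for Spanish text with accented characters. The following is a minimal Python sketch that builds the same URLs with the standard library. `build_generate_url` and `save_speech` are hypothetical helper names, not part of OpenCCB or Bark, and the sketch assumes the endpoint returns the audio bytes directly in the response body, as the `-o question.wav` examples suggest.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BARK_API_URL = "http://t-800:8000"  # base URL used throughout this guide


def build_generate_url(text: str, voice: Optional[str] = None,
                       speed: Optional[float] = None,
                       base_url: str = BARK_API_URL) -> str:
    """Return a percent-encoded /api/generate URL (hypothetical helper)."""
    params: Dict[str, object] = {"text": text}
    if voice is not None:
        params["voice"] = voice
    if speed is not None:
        params["speed"] = speed
    # quote() encodes spaces as %20 and non-ASCII text as UTF-8 bytes;
    # safe="/" leaves voice names such as v2/en_speaker_6 readable.
    query = urlencode(params, safe="/", quote_via=quote)
    return f"{base_url.rstrip('/')}/api/generate?{query}"


def save_speech(url: str, path: str, timeout: float = 120.0) -> None:
    """Download the response to a file (assumes the body is raw audio)."""
    with urlopen(url, timeout=timeout) as response, open(path, "wb") as out:
        out.write(response.read())
```

For example, `save_speech(build_generate_url("Hola mundo", voice="v2/es_speaker_0"), "saludo.wav")` would mirror the Spanish curl command above.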

Available Voices

English Voices

v2/en_speaker_0 through v2/en_speaker_9

Spanish Voices

v2/es_speaker_0 through v2/es_speaker_9
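
A client can check a voice name against the two ranges above before sending a request. This is a small sketch with hypothetical helper names; it only covers the voices listed in this guide, so the server's /api/voices response remains the authoritative list.

```python
import re
from typing import List

# Voice IDs listed in this guide: v2/<lang>_speaker_<0-9> for "en" and "es".
_VOICE_PATTERN = re.compile(r"v2/(en|es)_speaker_[0-9]")


def documented_voices() -> List[str]:
    """Enumerate the 20 voice IDs documented above."""
    return [f"v2/{lang}_speaker_{n}" for lang in ("en", "es") for n in range(10)]


def is_documented_voice(voice: str) -> bool:
    """Return True only for a voice ID in the documented ranges."""
    return _VOICE_PATTERN.fullmatch(voice) is not None
```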

Integration with OpenCCB

Generate Audio for a Question

# Via API (replace {question_id} and YOUR_TOKEN with real values)
curl -X POST "http://localhost:3001/question-bank/{question_id}/generate-audio" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "What color is the sky?",
    "voice": "v2/en_speaker_1",
    "speed": 1.0
  }'
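
The same request can be prepared from Python with the standard library. This is a sketch only: `build_generate_audio_request` is a hypothetical helper, the question ID is treated as an opaque string because its format is not specified here, and the shape of the response is not documented in this guide.

```python
import json
from urllib.request import Request

OPENCCB_URL = "http://localhost:3001"  # base URL from the curl example


def build_generate_audio_request(question_id: str, token: str, text: str,
                                 voice: str = "v2/en_speaker_1",
                                 speed: float = 1.0,
                                 base_url: str = OPENCCB_URL) -> Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"text": text, "voice": voice, "speed": speed})
    return Request(
        f"{base_url}/question-bank/{question_id}/generate-audio",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request.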

Automatic Audio Generation

To generate audio automatically, set generate_audio to true when creating a question. This triggers asynchronous audio generation:

POST /question-bank
{
  "question_text": "What is the capital of France?",
  "question_type": "multiple-choice",
  "options": ["Paris", "London", "Berlin", "Madrid"],
  "correct_answer": 0,
  "explanation": "Paris is the capital of France.",
  "generate_audio": true
}
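
When questions are created from a script, the request body can be serialized with `json.dumps` so that it is always valid JSON. The sketch below uses a hypothetical helper, `build_question_payload`, and assumes that `correct_answer` is a zero-based index into `options`, which is inferred from the example (0 selects "Paris") rather than stated by the API.

```python
import json
from typing import List


def build_question_payload(question_text: str, options: List[str],
                           correct_answer: int, explanation: str,
                           generate_audio: bool = True) -> str:
    """Serialize a multiple-choice question body for POST /question-bank."""
    # Assumption: correct_answer is a zero-based index into options.
    if not 0 <= correct_answer < len(options):
        raise ValueError("correct_answer must be an index into options")
    return json.dumps({
        "question_text": question_text,
        "question_type": "multiple-choice",
        "options": options,
        "correct_answer": correct_answer,
        "explanation": explanation,
        "generate_audio": generate_audio,
    })
```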

Configuration

Environment Variables

Add to your .env file:

# Bark TTS API URL
BARK_API_URL=http://t-800:8000

# Optional: Default voice for audio generation
BARK_DEFAULT_VOICE=v2/en_speaker_1

# Optional: Default speed
BARK_DEFAULT_SPEED=1.0
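
As an illustration of how these variables might be consumed, here is a minimal Python sketch. `BarkConfig` and `load_bark_config` are hypothetical names, and the fallback values simply repeat the examples above; OpenCCB's actual defaults may differ. The sketch reads the process environment and leaves loading the .env file to whatever mechanism the application already uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BarkConfig:
    api_url: str
    default_voice: str
    default_speed: float


def load_bark_config(env: Optional[Mapping[str, str]] = None) -> BarkConfig:
    """Read the Bark settings, treating only BARK_API_URL as required."""
    env = os.environ if env is None else env
    api_url = env.get("BARK_API_URL", "").strip()
    if not api_url:
        raise ValueError("BARK_API_URL is required")
    return BarkConfig(
        api_url=api_url.rstrip("/"),
        # Assumed fallbacks, copied from the example values above.
        default_voice=env.get("BARK_DEFAULT_VOICE", "v2/en_speaker_1"),
        default_speed=float(env.get("BARK_DEFAULT_SPEED", "1.0")),
    )
```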

Performance Optimization

Model Preloading

Bark preloads models on startup (takes ~30 seconds). The systemd service handles this automatically.
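
Because of this startup delay, a client that runs right after a restart may want to poll the health endpoint before sending work. The sketch below assumes that /health answers with HTTP 200 once the service is usable, which this guide does not confirm; `probe_health` and `wait_until_ready` are hypothetical helpers, and the default of 12 attempts 5 seconds apart is an arbitrary choice that comfortably covers the roughly 30 seconds mentioned above.

```python
import time
from typing import Callable
from urllib.request import urlopen


def probe_health(url: str = "http://t-800:8000/health") -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False


def wait_until_ready(probe: Callable[[], bool] = probe_health,
                     attempts: int = 12, delay: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The `probe` and `sleep` parameters exist so the retry loop can be exercised without a running server.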

Memory Management

The systemd service includes memory limits:

MemoryMax=4G
MemoryHigh=3G

Adjust based on your server's capacity.

Batch Generation

For importing many questions:

# Generate audio for multiple questions
curl "http://t-800:8000/api/generate/batch?texts=Question%201&texts=Question%202&voice=v2/en_speaker_1"
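
For an import script, the repeated `texts` parameters can be generated rather than typed by hand. This is a sketch with a hypothetical helper name, `build_batch_url`; the format of the batch response is not documented in this guide, and very long batches may exceed the URL length limits of a server or proxy.

```python
from typing import Iterable
from urllib.parse import quote, urlencode


def build_batch_url(texts: Iterable[str], voice: str,
                    base_url: str = "http://t-800:8000") -> str:
    """Build a /api/generate/batch URL with one texts= parameter per question."""
    pairs = [("texts", text) for text in texts]
    pairs.append(("voice", voice))
    # A list of pairs preserves order and allows the repeated "texts" key.
    query = urlencode(pairs, safe="/", quote_via=quote)
    return f"{base_url.rstrip('/')}/api/generate/batch?{query}"
```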

Troubleshooting

Service Not Starting

# Check status
sudo systemctl status bark-tts

# View logs
sudo journalctl -u bark-tts -f

# Restart service
sudo systemctl restart bark-tts

Out of Memory

If Bark crashes due to memory:

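Raise MemoryMax and MemoryHigh in the systemd service if the server has spare RAM (lowering them makes out-of-memory kills more likely)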
Use smaller models: suno/bark-small
Process questions one at a time

Slow Generation

GPU acceleration: Install CUDA-enabled PyTorch
Reduce audio quality settings
Use shorter text segments
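
One way to apply the last suggestion is to split long question text at sentence boundaries and request audio for each piece separately. The sketch below is illustrative only: `split_into_segments` is a hypothetical helper, the 200-character default is an arbitrary assumption rather than a Bark limit, and a single sentence longer than the limit is kept whole instead of being cut mid-sentence.

```python
import re
from typing import List


def split_into_segments(text: str, max_chars: int = 200) -> List[str]:
    """Group whole sentences into segments of at most max_chars where possible."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    segments: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments
```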

Testing

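# Test English voice (play is part of SoX; -t wav declares the format of the piped audio)
curl -s "http://t-800:8000/api/generate?text=The%20quick%20brown%20fox&voice=v2/en_speaker_1" | play -t wav -

# Test Spanish voice (accented characters are percent-encoded as UTF-8)
curl -s "http://t-800:8000/api/generate?text=El%20r%C3%A1pido%20zorro%20marr%C3%B3n&voice=v2/es_speaker_0" | play -t wav -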

Security Notes

Bark API runs on internal network only
No authentication required (assumes trusted network)
Rate limiting handled by OpenCCB
Audio files stored in uploads/audio/ directory

Future Enhancements

Add authentication to Bark API
Support for custom voice cloning
Audio preprocessing (noise reduction, normalization)
Caching layer for repeated requests
WebSocket support for streaming audio
