Tempo Agent is an interactive AI DJ web app that lets anyone create dynamic, synchronized, multi-layered music just by typing prompts. Starting from an initial beat, users can iteratively add new tracks, all in sync and harmonically aligned.
Built by Massimiliano Viola, Samuel Navarro, and Joe Guo.
- Python with pip (tested on 3.10)
- Node.js with npm (tested on v24.7.0)
- DeerAPI account and R2 (Cloudflare) storage credentials
# Backend dependencies
pip install -r requirements.txt
# Frontend dependencies
cd frontend
npm install
cd ..
cd backend/src
cp config.yaml.example config.yaml
# Edit config.yaml with your DeerAPI and R2 credentials
cd ../..
Terminal 1 - Backend
cd backend
./start_api.sh
The API will be available at http://localhost:8000
Terminal 2 - Frontend
cd frontend
npm run dev
The app will be available at http://localhost:5173
- Open http://localhost:5173 in your browser
- Play some music!
- Enter a prompt in the text box (e.g., "Pop, piano, synth, electric guitar, driving bass") and press Enter to start generation
- Wait for the AI to generate your accompaniment
- Play the generated track in the multi-track player
- Backend won't start: Check that config.yaml exists and has valid credentials
- Frontend can't connect: Ensure backend is running on port 8000
- Audio won't play: Try clicking somewhere on the page first to enable audio
- Generation fails: Check your DeerAPI quota and R2 credentials
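The first two checks above can be scripted. The following is a minimal sketch, assuming the config file lives at backend/src/config.yaml (where the setup steps copy it) and the backend listens on localhost:8000. The helper names `config_exists` and `port_is_open` are hypothetical and not part of the project.

```python
import socket
from pathlib import Path


def config_exists(path="backend/src/config.yaml"):
    """Return True if the config file is present (path assumed from the setup steps)."""
    return Path(path).is_file()


def port_is_open(host="localhost", port=8000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("config.yaml present:", config_exists())
    print("backend reachable on port 8000:", port_is_open())
```

This only confirms that the file exists and the port accepts connections. It does not validate the credentials inside config.yaml.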
- Frontend: User enters prompt → sends to backend API
- Backend: Calls add_instrumental() function → generates accompaniment
- Backend: Saves file with prompt-based name (e.g., electric_guitar_driving_bass.mp3) into the backend/tracks folder
- Frontend: Polls for completion → downloads and plays generated track
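The polling step in the flow above can be sketched as a small loop. This is an illustration only: the actual frontend is a JavaScript app, and the status payload shape (a `status` field with values such as `"completed"` or `"failed"`) is an assumption, not something this README confirms. `wait_for_task` and its parameters are hypothetical names.

```python
import time
from typing import Callable, Dict


def wait_for_task(
    get_status: Callable[[str], Dict],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call get_status(task_id) repeatedly until it reports a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(task_id)
        # "completed" and "failed" are assumed state names.
        if status.get("status") in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(interval)
```

Passing the status fetcher in as a function keeps the loop independent of any HTTP library, so it can be exercised with a stub before pointing it at the real status endpoint.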
- POST /api/generate - Start generation with prompt
- GET /api/status/{task_id} - Check generation status
- GET /api/download/{task_id} - Download generated file
- GET /api/tracks - List all tracks
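A rough sketch of how a script might address these endpoints using only the standard library. The paths come from the list above, but the request body shape (`{"prompt": ...}` sent as JSON) is an assumption, so check the backend code before relying on it. `endpoint` and `build_generate_request` are hypothetical helpers.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"

# Paths as listed in this README.
PATHS = {
    "generate": "/api/generate",
    "status": "/api/status/{task_id}",
    "download": "/api/download/{task_id}",
    "tracks": "/api/tracks",
}


def endpoint(name, task_id=None, base_url=BASE_URL):
    """Build the full URL for a named endpoint."""
    path = PATHS[name]
    if "{task_id}" in path:
        if task_id is None:
            raise ValueError(f"endpoint '{name}' requires a task_id")
        path = path.format(task_id=task_id)
    return base_url + path


def build_generate_request(prompt, base_url=BASE_URL):
    """Prepare (but do not send) a POST to /api/generate.

    The JSON body {"prompt": ...} is an assumed schema.
    """
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint("generate", base_url=base_url),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, `urllib.request.urlopen(build_generate_request("Pop, piano"))` would send the request; the response format is not documented here.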
Generated accompaniments are saved in backend/tracks/ with names like:
- pop_piano_synth_electric_guitar_driving_bass.mp3
- jazz_saxophone_smooth_bass_drums.mp3
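The backend's actual naming rule is not shown here, but the example names are consistent with lowercasing the prompt and joining its words with underscores. A minimal sketch under that assumption (`prompt_to_filename` is a hypothetical helper, not the project's function):

```python
import re


def prompt_to_filename(prompt, extension=".mp3"):
    """Turn a free-text prompt into a lowercase, underscore-separated file name."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return slug + extension


print(prompt_to_filename("Pop, piano, synth, electric guitar, driving bass"))
# pop_piano_synth_electric_guitar_driving_bass.mp3
```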
The backend automatically uses _input.mp3 as the conditioning vocal track source. Press Merge All to replace the default _input.mp3 file with a mix of all current sounds, so that newly generated sounds blend better with what is already playing.