The Rhythm in Anything

System description

Patrick O’Reilly, Julia Barnett, Hugo Flores Garcia, Annie Chu, Nathan Pruyne, Prem Seetharaman, Bryan Pardo

This work was supported by NSF Award Number 2222369.


We present TRIA (The Rhythm In Anything), a masked transformer model for mapping rhythmic sound gestures to high-fidelity drum recordings. Given one audio prompt specifying the desired rhythmic pattern and a second prompt representing the drumkit timbre, TRIA produces audio of a drumkit playing the desired rhythm (with appropriate elaborations) in the desired timbre.

GitHub

P. O’Reilly, J. Barnett, H. Flores Garcia, A. Chu, N. Pruyne, P. Seetharaman, B. Pardo, “The Rhythm In Anything: Audio-Prompted Drums Generation with Masked Language Modeling,” in ISMIR, 2025.