The Rhythm in Anything

Patrick O’Reilly, Julia Barnett, Hugo Flores Garcia, Annie Chu, Nathan Pruyne, Prem Seetharaman, Bryan Pardo
This work was supported by NSF Award Number 2222369.
We present TRIA (The Rhythm In Anything), a masked transformer model for mapping rhythmic sound gestures to high-fidelity drum recordings. Given an audio prompt of the desired rhythmic pattern and a second prompt to represent drumkit timbre, TRIA produces audio of a drumkit playing the desired rhythm (with appropriate elaborations) in the desired timbre.
GitHub
Related publications
pdf P. O’Reilly, J. Barnett, H. Flores Garcia, A. Chu, N. Pruyne, P. Seetharaman, B. Pardo, “The Rhythm In Anything: Audio-Prompted Drums Generation with Masked Language Modeling,” in ISMIR, 2025.