This course covers machine extraction of structure from audio files, in areas such as source separation (unmixing audio recordings into their individual component sounds), sound object recognition (labeling sounds), melody tracking, beat tracking, and perceptual mapping of audio to machine-quantifiable measures.
This course is approved for the Interfaces breadth requirement and the project requirement in the CS curriculum.
Prerequisite: prior programming experience sufficient to complete the laboratory assignments in Python, implementing algorithms and using libraries without being taught how (there is no language instruction in Python). Having taken EECS 211 and EECS 214 would demonstrate this experience.
Fundamentals of Music Processing
Lecture: Tue, Thu, 3:30 - 4:50pm CST in 2122 Sheridan Rd Classroom 250
Prof. Bryan Pardo 10am - 11am Thursdays in Mudd 3115
TA Annie Chu 11am - 1pm Tuesdays in Mudd 3202
Peer Mentor EJ Van De Grift 2pm - 3pm Wednesdays in Mudd 3108
Postdoc Jason Smith will help guide final projects.
Please use CampusWire for class-related questions.
You will be graded on a 100-point scale (e.g., 93-100 = A, 90-92 = A-, 87-89 = B+, 83-86 = B, 80-82 = B-, and so on).
Each assignment is worth 20 points. There are five assignments (including the final project). Your final grade is the sum of your midterm grade and your four highest assignment grades. This means you can skip any one assignment.
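To make the arithmetic concrete, here is a small sketch of the grading formula described above (the function and variable names are illustrative, not part of any official grading script):

```python
def final_grade(midterm, assignments):
    """Midterm score plus the four highest of the five assignment scores."""
    top_four = sorted(assignments, reverse=True)[:4]
    return midterm + sum(top_four)

# A student who skips one assignment entirely can still earn a perfect score:
print(final_grade(20, [20, 20, 0, 20, 20]))  # 100
```

Since only the top four assignment scores count, the lowest of the five is dropped automatically, whether it is a skipped assignment or simply your weakest one.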
Homework and reading assignments are solo assignments and must be your original work.
You are expected to write your own code and write up your own answers to questions. This means you. Not ChatGPT or Gemini or Copilot. This is an optional class you are (presumably) taking because you're interested, so put in the time to learn this material yourself.
Assignments must be submitted on the due date by the time specified on Canvas. If you are worried you can’t finish on time, upload a safety submission an hour early with what you have. I will grade the most recent item submitted before the deadline. Late submissions will not be graded.
| Week | Date | Topic | ASSIGNMENT | Points |
|---|---|---|---|---|
| 1 | Tue Jan 6 | Course intro, Recording basics | | |
| 1 | Thu Jan 8 | Frequency & Pitch, Tuning Systems | | |
| 2 | Tue Jan 13 | Loudness & Amplitude | | |
| 2 | Thu Jan 15 | Fourier Transforms & Spectrograms | | |
| 3 | Tue Jan 20 | Convolution & Filtering | HW 1 Audio Basics | 20 |
| 3 | Thu Jan 22 | Convolution & FFT notebooks | | |
| 4 | Tue Jan 27 | MFCCs and Chromagrams | | |
| 4 | Thu Jan 29 | Sound Object Labeling | HW 2 Spectrograms, Masking | 20 |
| 5 | Tue Feb 3 | Self Similarity & MFCC & Chroma notebooks | | |
| 5 | Thu Feb 5 | MIDTERM REVIEW | | |
| 6 | Tue Feb 10 | Pitch Tracking | MIDTERM | 20 |
| 6 | Thu Feb 12 | Deep Learning | | |
| 7 | Tue Feb 17 | Deep Learning | HW 3 Infinite Jukebox | 20 |
| 7 | Thu Feb 19 | Embeddings, VoiceID, Source Separation | | |
| 8 | Tue Feb 24 | Cross Modal Embeddings & Embeddings Notebook | | |
| 8 | Thu Feb 26 | Final project group formation & proposals | HW 4 Using Embeddings | 20 |
| 9 | Tue Mar 3 | Zoom meetings with project groups (no class: meetings by appointment) | Project proposal due | |
| 9 | Thu Mar 5 | Current research in music & audio processing (Annie) | | |
| 10 | Tue Mar 10 | Current research in music & audio processing | | |
| 10 | Thu Mar 12 | Zoom meetings with project groups (no class: meetings by appointment) | Project meeting | |
| 11 | Wed Mar 18 | Final project presentations 7-9pm | Final project | 20 |
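For a taste of the week-2 material on Fourier transforms and spectrograms, here is a toy magnitude-spectrogram sketch in plain NumPy. This is an illustrative example only, not the implementation used in class, and the framing and windowing choices (frame length, hop size, Hann window) are assumptions:

```python
import numpy as np

def magnitude_spectrogram(x, n_fft=1024, hop=256):
    """Slice the signal into overlapping Hann-windowed frames and take
    the magnitude of the FFT of each frame."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequencies of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, n_frames)

# One second of a 440 Hz sine sampled at 22050 Hz:
sr = 22050
t = np.arange(sr) / sr
spec = magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
peak_bin = spec.mean(axis=1).argmax()
peak_hz = peak_bin * sr / 1024
# peak_hz lands within one frequency bin (~21.5 Hz) of the true 440 Hz
```

With `n_fft = 1024` each bin is about 21.5 Hz wide, so a pure tone shows up as a bright horizontal stripe at the bin nearest its frequency; higher resolution costs longer frames and coarser timing, a trade-off the lectures cover.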
Fundamentals of Music Processing, Chapter 1
Fundamentals of Music Processing, Chapter 2 & Section 3.1
Fundamentals of Music Processing, Chapter 4
Fundamentals of Music Processing, Chapter 6
Fundamentals of Music Processing, Chapter 7
REPET for Background/Foreground Separation in Audio
Chapter 4 of Machine Learning: This is Tom Mitchell's book. A historical overview plus an explanation of backpropagation of error. It's a good starting point for actually understanding deep nets.
YIN, a fundamental frequency estimator for speech and music - perhaps the most popular pitch tracker.
CREPE: A Convolutional Representation for Pitch Estimation - a deep learning pitch tracker that improves on YIN.
The dummy’s guide to MFCC - an easy, high-level read. Start with this.
From Frequency to Quefrency: A History of the Cepstrum - a historical analysis of the uses of the cepstrum
Recovering sound sources from embedded repetition - This is a paper on how humans actually listen to and parse audio based on repetition. Read any time.
EECS 352 Final projects from 2017 and 2015
Facebook’s Universal Music Translation
A Coursera course on pitch tracking
The University of Iowa Musical Instrument Samples dataset
The SocialFX data set of word descriptors for audio
VocalSketch: thousands of vocal imitations of a large set of diverse sounds
Bach10: audio recordings of each part and the ensemble of ten pieces of four-part J.S. Bach chorales
Python Utilities for Detection and Classification of Acoustic Scenes
Librosa audio and music processing in Python
Essentia: an open-source music analysis toolkit that includes feature extractors and pre-trained models for extracting, e.g., beats per minute, mood, and genre
Yaafe - an audio feature extraction toolbox
Sonic Visualiser music visualization software
LilyPond, open source music notation software
SoundSlice guitar tab and notation website