

Course materials for the Interactive Audio Lab


Northwestern University Winter 2021

Top Calendar Links Slides Readings

Course Description

This course covers the machine extraction of structure from audio files, including source separation (unmixing audio recordings into individual component sounds), sound object recognition (labeling sounds), melody tracking, beat tracking, and perceptual mapping of audio to machine-quantifiable measures.

This course is approved for the Breadth Interfaces & project requirement in the CS curriculum.

You need prior programming experience sufficient to complete laboratory assignments in Python, implementing algorithms and using libraries without being taught how to do so (the course provides no Python instruction). Having taken EECS 211 and 214 would demonstrate this experience.

Course Textbook

Fundamentals of Music Processing

Time & Place

Lecture: Tuesday, Thursday, 6:30 - 7:50pm CST on ZOOM


Prof. Bryan Pardo Office Hours & Location:

Office Hours

Mondays 5:00 - 6:30pm CST on ZOOM

Course Policies

Questions outside of class

Please use CampusWire for class-related questions.

Grading Policy

You will be graded on a 100-point scale (e.g., 93-100 = A, 90-92 = A-, 87-89 = B+, 83-86 = B, 80-82 = B-, and so on).

Homework and reading assignments are solo assignments and must be original work.

Final projects are group assignments and all members of a group will share a grade for all parts of the assignment.

Submitting assignments

Assignments must be submitted on the due date by the time specified on Canvas. If you are worried you can’t finish on time, upload a safety submission an hour early with what you have. I will grade the most recent item submitted before the deadline. Late submissions will not be graded.

Extra credit.

Students can earn a MAXIMUM TOTAL of 10 extra-credit points (a full letter grade):

Participation during lecture: You will be asked to select 2 lectures for which you will be on-call. In your on-call lectures, I may call on you, and I will expect that you have done the relevant reading beforehand and can engage meaningfully with the lecture topic. Each on-call day is worth 3 points, for a total of 6 class participation points.

Paper reviews: You can earn extra credit by submitting reviews of up to 4 extra-credit papers in the field. Each paper review is worth 1 point, for a total of 4 paper review points.
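
The extra-credit arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the course's actual grading code: the function name `extra_credit` and the constant names are hypothetical, and the point values are copied from the two items above.

```python
# Point values copied from the extra-credit policy above.
# All names here are hypothetical; this is not official grading code.
ON_CALL_POINTS = 3         # points per on-call lecture
MAX_ON_CALL_LECTURES = 2   # lectures a student may be on-call for
REVIEW_POINTS = 1          # points per paper review
MAX_REVIEWS = 4            # reviews that count toward extra credit
MAX_EXTRA_CREDIT = 10      # overall cap stated in the policy


def extra_credit(on_call_lectures: int, paper_reviews: int) -> int:
    """Return extra-credit points, capped per category and overall."""
    if on_call_lectures < 0 or paper_reviews < 0:
        raise ValueError("counts must be non-negative")
    participation = ON_CALL_POINTS * min(on_call_lectures, MAX_ON_CALL_LECTURES)
    reviews = REVIEW_POINTS * min(paper_reviews, MAX_REVIEWS)
    return min(participation + reviews, MAX_EXTRA_CREDIT)


if __name__ == "__main__":
    print(extra_credit(2, 4))  # both categories maxed out
    print(extra_credit(1, 3))  # one on-call lecture, three reviews
```

Capping each category separately mirrors the policy as written: a fifth review or a third on-call lecture earns nothing further.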

Course Calendar

| Week | Date | Topic | Assignment | Points |
|------|------|-------|------------|--------|
| 1 | Tue Jan 12 | Course intro, Recording basics | | |
| 1 | Thu Jan 14 | How we hear, Frequency & Pitch | | |
| 2 | Tue Jan 19 | Loudness & Amplitude | | |
| 2 | Thu Jan 21 | The Fourier Series & Spectrogram | | |
| 3 | Tue Jan 26 | The Fourier Series & Spectrogram | | |
| 3 | Thu Jan 28 | Convolution, Repetition | HW 1 | 20 |
| 4 | Tue Feb 2 | Filters, Reverb | | |
| 4 | Thu Feb 4 | TBD | | |
| 5 | Tue Feb 9 | Time-frequency masking & MP3 | HW 2 | 20 |
| 5 | Thu Feb 11 | TBD | | |
| 6 | Tue Feb 16 | Audio Source Separation | | |
| 6 | Thu Feb 18 | Audio Source Separation | | |
| 7 | Tue Feb 23 | Labeling Sound Events | HW 3 | 20 |
| 7 | Thu Feb 25 | Labeling Sound Events | | |
| 8 | Tue Mar 2 | Music Similarity | | |
| 8 | Thu Mar 4 | Music Similarity | | |
| 9 | Tue Mar 9 | Deep Models for Audio | HW 4 | 20 |
| 9 | Thu Mar 11 | Deep Models for Audio | Xtra Credit | 4 |
| 10 | Thu Mar 18 | Final assignment due | HW 5 | 20 |

Lecture Slides

Recording Basics




Tuning Systems

The Fourier Transform and the Spectrogram

Cepstrograms and Chromagrams



Time-frequency Masking

Beats and Autocorrelation

The REPET algorithm

Sound Object Labeling

Textbook Reading

Fundamentals of Music Processing, Chapter 1

Fundamentals of Music Processing, Chapter 2 & Section 3.1

Fundamentals of Music Processing, Chapter 4

Fundamentals of Music Processing, Chapter 6

Fundamentals of Music Processing, Chapter 7

* REPET for Background/Foreground Separation in Audio

EXTRA CREDIT READING (Note 1/12/21: These will be updated. Don’t read yet!):

* Algorithms for Non-negative Matrix Factorization


* Deep clustering: Discriminative embeddings for segmentation and separation

* Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation

* Audealize: Crowdsourced Audio Production Tools

* An Industrial-Strength Audio Search Algorithm (Shazam)

* Comparative Evaluation of Various MFCC Implementations on the Speaker Verification Task

* Recovering sound sources from embedded repetition

* An End-to-End Neural Network for Polyphonic Piano Music Transcription

* Lessons learned building a large music recommender system (This one is a video)

* Piano Genie

* Score-Informed Source Separation for Musical Audio Recordings: An overview


* CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition

* Melody Extraction from Polyphonic Music Signals

Places to get ideas

EECS 352 Final projects from 2017 and 2015

The infinite jukebox

Google’s Project Magenta

Facebook’s Universal Music Translation

A Coursera course on pitch tracking


U of Iowa’s Music Instrument Samples Dataset

The SocialFX data set of word descriptors for audio

VocalSet: a singing voice dataset consisting of 10.1 hours of monophonic recorded audio of professional singers

VocalSketch: thousands of vocal imitations of a large set of diverse sounds

Bach10: audio recordings of each part and the ensemble of ten pieces of four-part J.S. Bach chorales

The Million Song Dataset


Python Utilities for Detection and Classification of Acoustic Scenes

Librosa audio and music processing in Python

Essentia: an open-source music analysis toolkit that includes feature extractors and pre-trained models for extracting, e.g., beats per minute, mood, and genre

Yaafe - audio features extraction toolbox

The Northwestern University Source Separation Library (nussl)

Sonic Visualiser: music visualization software

LilyPond: open-source music notation software

SoundSlice guitar tab and notation website
