teaching

Course materials for the Interactive Audio Lab

EECS 352: MACHINE PERCEPTION OF MUSIC AND AUDIO

Northwestern University Winter 2019

Course Description

This course covers machine extraction of structure from audio, in areas such as source separation (unmixing audio recordings into their individual component sounds), sound object recognition (labeling sounds), melody tracking, beat tracking, and perceptual mapping of audio to machine-quantifiable measures.

This course is approved for the Interfaces breadth area and the project requirement in the CS curriculum.

Prerequisites: prior programming experience sufficient to complete the laboratory assignments in Python, implementing algorithms and using libraries without step-by-step instruction (no Python language instruction is provided). Having taken EECS 211 and EECS 214 demonstrates this experience.
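
As a point of calibration, here is a minimal, hedged sketch of the kind of Python lab work the course assumes you can figure out on your own, using librosa (listed under Software below); the audio filename is a placeholder.

```python
import numpy as np
import librosa

# Load an audio file (placeholder path), keeping its native sample rate.
y, sr = librosa.load('example.wav', sr=None)

# Short-time Fourier transform -> magnitude spectrogram, then convert to decibels.
S = np.abs(librosa.stft(y, n_fft=2048, hop_length=512))
S_db = librosa.amplitude_to_db(S, ref=np.max)

print(f'{len(y) / sr:.1f} seconds of audio; spectrogram shape {S_db.shape}')
```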

Course Textbook

Fundamentals of Music Processing

Time & Place

Lecture: Monday, Wednesday, and Friday, 3:00PM - 3:50PM, Tech L361

Instructors, Office Hours, Online help

Prof. Bryan Pardo Office Hours & Location: Wednesday 4pm - 5pm, Mudd 3115

TA Fatemeh Pishdadian Office Hours & Location: Monday 4pm - 5pm, Mudd 3534

TA Bongjun Kim Office Hours & Location: Friday 2pm - 3pm, Mudd 3534

We’ll be using this Piazza page for course discussion and online help.

Policies

Grading: You can earn up to 110 points, but you are graded on a 100-point basis. In other words: 93 and up is an A, 90-92 is an A-, 87-89 is a B+, 83-86 is a B, 80-82 is a B-, and so on.

Late Policy: Assignments are due on Canvas by 11:59pm on the due date; Canvas is the only way assignments are accepted. Late assignments are docked 2 points per day, starting immediately: an assignment handed in at 12:00am the next day loses 2 points, and an assignment that is 3 days late loses 6 points.
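
For concreteness, a tiny sketch of the deduction rule described above (illustrative only; the helper name is ours, not part of the course tooling):

```python
import math

def late_penalty(hours_past_deadline: float) -> int:
    """Points deducted: 2 per (partial) day late, starting immediately after 11:59pm."""
    if hours_past_deadline <= 0:
        return 0
    return 2 * math.ceil(hours_past_deadline / 24)

assert late_penalty(0.02) == 2   # handed in at 12:00am the next day -> 2 points
assert late_penalty(72) == 6     # 3 days late -> 6 points
```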

Cheating & Academic Dishonesty: Do your own work. Academic dishonesty will be dealt with as laid out in the student handbook. Penalties include failing the class and can be more severe. If you have a question about whether something may be considered cheating, ask before submitting your work.

Attendance is not graded.

Course Calendar

| Week | Date | Topic | Due | Points |
| --- | --- | --- | --- | --- |
| 1 | Mon Jan 7 | Course intro, Recording basics | | |
| 1 | Wed Jan 9 | Loudness and Amplitude | | |
| 1 | Fri Jan 11 | Pitch, Tuning systems | | |
| 2 | Mon Jan 14 | Pitch, Tuning systems | HW 0 | 5 |
| 2 | Wed Jan 16 | The Fourier Series | | |
| 2 | Fri Jan 18 | The Spectrogram & The Cepstrum | | |
| 3 | Mon Jan 21 | NO CLASS: MLK Day | | |
| 3 | Wed Jan 23 | Filters | HW 1 | 20 |
| 3 | Fri Jan 25 | Reverb & Convolution | | |
| 4 | Mon Jan 28 | Reverb & Convolution (continued) | | |
| 4 | Wed Jan 30 | NO CLASS DUE TO EXTREME WEATHER | | |
| 4 | Fri Feb 1 | Time-frequency masking | | |
| 5 | Mon Feb 4 | Repetition, Correlation | | |
| 5 | Wed Feb 6 | Source separation with REPET | | |
| 5 | Fri Feb 8 | Sound Object Labeling | | |
| 6 | Mon Feb 11 | Sound Object Labeling | HW 2 | 20 |
| 6 | Wed Feb 13 | Cepstra and Chroma | Xtra Credit 1 | 5 |
| 6 | Fri Feb 15 | Final Projects + Audealize | | |
| 7 | Mon Feb 18 | Deep Clustering & Voogle | | |
| 7 | Wed Feb 20 | iSed: Interactive sound labeling | | |
| 7 | Fri Feb 22 | MCFT | Project Proposal | 5 |
| 8 | Mon Feb 25 | Project meetings | HW 3 | 20 |
| 8 | Wed Feb 27 | Project meetings | | |
| 8 | Fri Mar 1 | Project meetings | Project report | 5 |
| 9 | Mon Mar 4 | Project meetings | | |
| 9 | Wed Mar 6 | Project meetings | | |
| 9 | Fri Mar 8 | Project meetings | Project report | 5 |
| 10 | Mon Mar 11 | Project meetings | Xtra Credit 2 | 5 |
| 10 | Wed Mar 13 | Project meetings | | |
| 10 | Fri Mar 15 | Project meetings | Project report | 5 |
| 11 | Fri Mar 22 | Poster/Demo Session (3-5pm) | Final Project | 15 |

Lecture Slides

Recording Basics

Amplitude

Loudness

Pitch

Tuning Systems

The Fourier Transform and the Spectrogram

Cepstrograms and Chromagrams

Filtering

Convolution

Time-frequency Masking

Beats and Autocorrelation

The REPET algorithm (a rough code sketch follows this list)

Sound Object Labeling
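
Several of the topics above (the spectrogram, time-frequency masking, repetition, and REPET) can be tried out with librosa, which is listed under Software below. The following is a rough sketch of REPET-SIM-style background/foreground separation with a soft time-frequency mask; it is an illustration under assumptions (placeholder filename, default parameters), not the exact algorithm from lecture.

```python
import numpy as np
import librosa

# Load a mixture (placeholder filename) and move to the time-frequency domain.
y, sr = librosa.load('mixture.wav')
D = librosa.stft(y)
S = np.abs(D)

# Estimate the repeating background: aggregate each frame with its most
# similar frames across the recording (the REPET-SIM idea), then keep the
# estimate no larger than the mixture's magnitude.
S_repeat = librosa.decompose.nn_filter(S, aggregate=np.median, metric='cosine')
S_repeat = np.minimum(S, S_repeat)

# Soft time-frequency mask: the fraction of each bin assigned to the background.
mask = S_repeat / np.maximum(S, 1e-8)

# Apply the mask (and its complement) and return to the time domain.
background = librosa.istft(mask * D)
foreground = librosa.istft((1 - mask) * D)
```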

Course Reading

Week 1:

Fundamentals of Music Processing, Chapter 1

Week 2: Fourier Transform, Spectrogram

Fundamentals of Music Processing, Chapter 2 & Section 3.1

* Comparative Evaluation of Various MFCC Implementations on the Speaker Verification Task

Week 3: Filters, Convolution, Autocorrelation

Fundamentals of Music Processing, Chapter 6

Week 4: REPET

* REPET for Background/Foreground Separation in Audio

Week 5: Sound Object Labeling

* Algorithms for Non-negative Matrix Factorization

* Sound Event Detection in Real Life Recordings Using Coupled Matrix Factorization of Spectral Representations and Class Activity Annotations

Week 6: Recognizing sounds

* Deep clustering: Discriminative embeddings for segmentation and separation

* Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation

* Audealize: Crowdsourced Audio Production Tools

Week 7: The TAs’ research

* A Human-in-the-Loop System for Sound Event Detection and Annotation

* Multi-resolution Common Fate Transform

* An Industrial-Strength Audio Search Algorithm (Shazam)

Additional papers, readings, and video:

* Recovering sound sources from embedded repetition (we talked about this in class)

* An End-to-End Neural Network for Polyphonic Piano Music Transcription

* Lessons learned building a large music recommender system (This one is a video)

* Piano Genie

* Score-Informed Source Separation for Musical Audio Recordings: An overview

* Simultaneous Separation and Segmentation in Layered Music

* CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition

* Melody Extraction from Polyphonic Music Signals

Places to get ideas

EECS 352 Final projects from 2017 and 2015

The Infinite Jukebox

Google’s Project Magenta

Facebook’s Universal Music Translation

A Coursera course on pitch tracking

Datasets

U of Iowa’s Music Instrument Samples Dataset

The SocialFX data set of word descriptors for audio

VocalSet: a singing voice dataset consisting of 10.1 hours of monophonic recorded audio of professional singers

VocalSketch: thousands of vocal imitations of a large set of diverse sounds

Bach10: audio recordings of each part and the ensemble of ten pieces of four-part J.S. Bach chorales

The Million Song Dataset

Software

Python Utilities for Detection and Classification of Acoustic Scenes

Librosa, audio and music processing in Python (see the short example after this list)

Essentia: an open-source music analysis toolkit that includes many feature extractors and pre-trained models for extracting, e.g., beats per minute, mood, and genre

Yaafe, an audio feature extraction toolbox

The Northwestern University Source Separation Library (nussl)

Sonic Visualiser, music visualization software

LilyPond, open-source music notation software

Soundslice, a guitar tab and notation website
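
As a small illustration of the kind of analysis these toolkits provide, here is a hedged librosa snippet that estimates tempo, beat times, and a chromagram; the filename is a placeholder and results will vary with the recording.

```python
import librosa

# Load a recording (placeholder filename).
y, sr = librosa.load('song.wav')

# Estimate the global tempo and beat locations, then convert frames to seconds.
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

# Chromagram: energy in each of the 12 pitch classes over time.
chroma = librosa.feature.chroma_stft(y=y, sr=sr)

print('Estimated tempo (BPM):', tempo)
print('First few beat times (s):', beat_times[:4])
print('Chromagram shape:', chroma.shape)
```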
