teaching

Course materials for the Interactive Audio Lab


CS 352 MACHINE PERCEPTION OF MUSIC AND AUDIO

Northwestern University Winter 2019

Top Calendar Links Slides Readings

Course Description

This course covers machine extraction of structure from audio files, including source separation (unmixing audio recordings into individual component sounds), sound object recognition (labeling sounds), melody tracking, beat tracking, and perceptual mapping of audio to machine-quantifiable measures.

This course is approved for the Breadth Interfaces & project requirement in the CS curriculum.

Prerequisites: Prior programming experience sufficient to complete laboratory assignments in Python, implementing algorithms and using libraries without being taught to do so (the course provides no Python instruction). Having taken EECS 211 and 214 would demonstrate this experience.

Course Textbook

Fundamentals of Music Processing

Time & Place

Lecture: Monday, Wednesday, Friday 3:00PM - 3:50PM Tech L361

Instructors, Office Hours, Online help

Prof. Bryan Pardo Office Hours & Location: Wednesday 4pm - 5pm, Mudd 3115

TA Fatemeh Pishdadian Office Hours & Location: Monday 4pm - 5pm, Mudd 3534

TA Bongjun Kim Office Hours & Location: Friday 2pm - 3pm, Mudd 3534

We’ll use this Piazza page for course discussion and online help.

Policies

Grading: You can earn up to 110 points, but grades are computed on a 100-point basis. In other words, 93 and up is an A, 90-92 is an A-, 87-89 is a B+, 83-86 is a B, 80-82 is a B-, and so on.
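As a rough illustration of the arithmetic, the sketch below sums the points listed in the course calendar and maps a point total to the letter grades stated above. The names `CALENDAR_POINTS`, `LISTED_CUTOFFS`, `total_available`, and `letter_grade` are hypothetical, chosen only for this example. Treating each lower bound as a threshold for fractional scores is an assumption, and since cutoffs below 80 are not listed, the function returns `None` there rather than guessing.

```python
# Points per graded item, copied from the course calendar.
CALENDAR_POINTS = [
    ("HW 0", 5),
    ("HW 1", 20),
    ("HW 2", 20),
    ("Xtra Credit: paper reviews", 5),
    ("HW 3", 20),
    ("Project Proposal", 5),
    ("Project report", 5),
    ("Project report", 5),
    ("Xtra Credit: paper reviews", 5),
    ("Project report", 5),
    ("Final Project", 15),
]

# Only the cutoffs stated in the grading policy; lower bands are not listed.
LISTED_CUTOFFS = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-")]


def total_available():
    """Sum of all points that can be earned, including extra credit."""
    return sum(points for _, points in CALENDAR_POINTS)


def letter_grade(points):
    """Map a point total to a listed letter grade.

    Assumption: each band's lower bound acts as a threshold, so 92.5 is an A-.
    Returns None below 80 because the policy does not list those cutoffs.
    """
    for cutoff, letter in LISTED_CUTOFFS:
        if points >= cutoff:
            return letter
    return None


print(total_available())   # 110
print(letter_grade(91))    # A-
```

Because grades are computed out of 100 while 110 points are available, a total above 100 still maps to an A in this sketch.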

Late Policy: Assignments are due on Canvas by 11:59pm on the due date. Canvas is the only way assignments are accepted. Late assignments are docked 2 points per day, starting IMMEDIATELY. For example, an assignment handed in at 12:00am the next day has 2 points removed. An assignment that is 3 days late will have 6 points removed from the final grade.
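The late penalty can be sketched as a small calculation. This is an illustration under stated assumptions, not the official grading script: the function name `late_penalty` is hypothetical, the cutoff is taken to be midnight at the end of the due date, and every started 24-hour period after that cutoff is assumed to count as a full day.

```python
from datetime import date, datetime, timedelta

POINTS_PER_DAY = 2  # from the late policy


def late_penalty(due_date, submitted):
    """Return the points docked for a submission.

    due_date is the calendar due date; submitted is the submission time.
    Assumption: the cutoff is 12:00am on the day after the due date, and
    each started 24-hour period after the cutoff counts as one late day.
    """
    cutoff = datetime.combine(due_date, datetime.min.time()) + timedelta(days=1)
    if submitted < cutoff:
        return 0
    # Whole days elapsed since the cutoff, plus the day currently in progress.
    days_late = (submitted - cutoff) // timedelta(days=1) + 1
    return POINTS_PER_DAY * days_late


hw1_due = date(2019, 1, 23)
print(late_penalty(hw1_due, datetime(2019, 1, 24, 0, 0)))   # 2
print(late_penalty(hw1_due, datetime(2019, 1, 26, 10, 0)))  # 6
```

The two printed cases mirror the examples in the policy: a submission at 12:00am the next day loses 2 points, and one that is 3 days late loses 6.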

Cheating & Academic Dishonesty: Do your own work. Academic dishonesty will be dealt with as laid out in the student handbook. Penalties include failing the class and can be more severe than that. If you are unsure whether something may be considered cheating, ask before submitting your work.

Attendance is not graded.

Course Calendar

| Week | Date | Topic | Due | Points |
|------|------|-------|-----|--------|
| 1 | Mon Jan 7 | Course intro, Recording basics | | |
| 1 | Wed Jan 9 | Loudness and Amplitude | | |
| 1 | Fri Jan 11 | Pitch, Tuning systems | | |
| 2 | Mon Jan 14 | Pitch, Tuning systems | HW 0 | 5 |
| 2 | Wed Jan 16 | The Fourier Series | | |
| 2 | Fri Jan 18 | The Spectrogram & The Cepstrum | | |
| 3 | Mon Jan 21 | NO CLASS: MLK Day | | |
| 3 | Wed Jan 23 | Filters | HW 1 | 20 |
| 3 | Fri Jan 25 | Reverb & Convolution | | |
| 4 | Mon Jan 28 | Beats and Autocorrelation | | |
| 4 | Wed Jan 30 | Time-frequency masking | | |
| 4 | Fri Feb 1 | REPET | | |
| 5 | Mon Feb 4 | NMF for audio source separation | | |
| 5 | Wed Feb 6 | NMF for sound object labeling | HW 2 | 20 |
| 5 | Fri Feb 8 | Final Projects | | |
| 6 | Mon Feb 11 | Deep Clustering Source Separation | Xtra Credit: paper reviews | 5 |
| 6 | Wed Feb 13 | Voogle: Vocal Imitation Sound ID | | |
| 6 | Fri Feb 15 | Audealize: sound adjectives | | |
| 7 | Mon Feb 18 | Shazam and audio fingerprinting | | |
| 7 | Wed Feb 20 | iSed: Interactive sound labeling | HW 3 | 20 |
| 7 | Fri Feb 22 | MCFT | Project Proposal | 5 |
| 8 | Mon Feb 25 | Project meetings | | |
| 8 | Wed Feb 27 | Project meetings | | |
| 8 | Fri Mar 1 | Project meetings | Project report | 5 |
| 9 | Mon Mar 4 | Project meetings | | |
| 9 | Wed Mar 6 | Project meetings | | |
| 9 | Fri Mar 8 | Project meetings | Project report | 5 |
| 10 | Mon Mar 11 | Project meetings | Xtra Credit: paper reviews | 5 |
| 10 | Wed Mar 13 | Project meetings | | |
| 10 | Fri Mar 15 | Project meetings | Project report | 5 |
| 11 | Fri Mar 22 | Poster/Demo Session (3-5pm) | Final Project | 15 |

Lecture Slides

Recording Basics

Amplitude

Loudness

Pitch

Tuning Systems

The Fourier Transform

Spectrograms

Filtering

Convolution

Beats and Autocorrelation

Time-frequency Masking

Neural Networks

Course Reading

Week 1:

Fundamentals of Music Processing, Chapter 1

Week 2: Fourier Transform, Spectrogram

Fundamentals of Music Processing, Chapter 2 & Section 3.1

* Comparative Evaluation of Various MFCC Implementations on the Speaker Verification Task

Week 3: Filters, Convolution, Autocorrelation

Fundamentals of Music Processing, Chapter 6

Week 4: REPET

REPET for Background/Foreground Separation in Audio

Week 5: Non-negative Matrix Factorization

* Algorithms for Non-negative Matrix Factorization

* Score-Informed Source Separation for Musical Audio Recordings: An overview

* Sound Event Detection in Real Life Recordings Using Coupled Matrix Factorization of Spectral Representations and Class Activity Annotations

* Simultaneous Separation and Segmentation in Layered Music

Week 6: Recognizing sounds

* Deep clustering: Discriminative embeddings for segmentation and separation

* Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation

* Audealize: Crowdsourced Audio Production Tools

Week 7: The TAs’ research

* A Human-in-the-Loop System for Sound Event Detection and Annotation

* Multi-resolution Common Fate Transform

* An Industrial-Strength Audio Search Algorithm (Shazam)

ADDITIONAL PAPERS AND READINGS:

Coming soon…

Datasets

The SocialFX data set of word descriptors for audio

VocalSet: a singing voice dataset consisting of 10.1 hours of monophonic recorded audio of professional singers

VocalSketch: thousands of vocal imitations of a large set of diverse sounds

Bach10: audio recordings of each part and the ensemble of ten pieces of four-part J.S. Bach chorales

The Million Song Dataset

Places to get ideas

Final projects from 2017 and 2015

Software

Jupyter Notebook

TensorFlow: the most popular Python DNN package

Keras: a nice Python API for TensorFlow
