Title: Audio Classification with Small Training Datasets
Time: Friday, March 3rd, 3:00 PM
Location: CSIP library (room 5126), 5th floor, Centergy One building
Bio: Alexander Lerch is Associate Professor and Director of Graduate Studies at the School of Music, Georgia Institute of Technology. He received his “Diplom-Ingenieur” (EE) and his PhD (Audio Communications) from TU Berlin. Lerch’s research in Music Information Retrieval and Audio Content Analysis sits at the intersection of signal processing, machine learning, and music, and aims to create artificially intelligent software for music analysis, production, and generation. Lerch has authored more than 50 peer-reviewed journal and conference papers, as well as the textbook “An Introduction to Audio Content Analysis” (IEEE Press/Wiley, 2nd edition 2023). Before he joined Georgia Tech in 2013, Lerch was Co-Founder and Head of Research at his company zplane.development, an industry leader in music technology licensing. zplane technologies are integrated into a multitude of music software, from consumer to professional applications, and are used by millions of musicians and producers worldwide.
Abstract: Many tasks in music and audio classification lack large datasets, and researchers thus struggle to train deep state-of-the-art networks with a large number of parameters. This presentation will introduce recent research in the Music Informatics Group that addresses this challenge: first, a semi-supervised approach that explores the use of unlabeled data in training; second, a self-supervised representation learning approach inspired by knowledge distillation techniques; and third, an approach referred to as “reprogramming” or “model reprogramming” that transfers the knowledge from a deep model pre-trained on a different task by combining ideas from traditional transfer learning and adversarial attacks. The presentation will conclude with a short discussion of the advantages and disadvantages of the presented approaches.