Computer Audition


Computer Audition (CA) is the general field of audio understanding by machine. It encompasses questions of audio processing, synthesis, information retrieval, auditory scene analysis and machine listening. Inspired by models of human audition, it deals with questions of representation, transduction, grouping, use of musical knowledge and general sound semantics, with the aim of performing intelligent operations on audio and music signals by computer. Moreover, the tasks of a computer audition system go beyond classification or retrieval: such systems often use audition to drive intelligent audio activities, such as audition-driven sound processing and sound or music generation.

Accordingly, research on computer audition builds upon psychological and cognitive evidence about the human listening experience. It combines the disciplines of engineering, information processing, artificial intelligence, cognitive science, music theory and artistic creativity, making it a formidable interdisciplinary study.

The study of CA can be roughly divided into three main areas:


1.   Representation: signal and symbolic. This aspect deals with feature extraction, sound descriptors and auditory models. It is also concerned with audio analysis-synthesis and generative models, such as pattern playback or signal recreation from a partial representation.


2.   Signal alignment and comparison: One of the unique properties of musical signals is that they often combine different types of representation, from notated scores to performance actions in MIDI files, to audio recordings and human annotations. We study methods for finding correspondences between these representations, with applications to intelligent sound processing, performing with computers, automatic annotation and more.


3.   Musical Knowledge and Audio Semantics: Many aspects of topics 1 and 2 depend on human cognitive processing, ranging from the perception of scales, rhythms and harmonies to the modeling of emotions, musical memory and the perception of musical structure. We use machine learning to model the human cognitive modalities of anticipation, familiarity and appraisal, and use these models to describe musical style, with applications to machine improvisation and the building of an intelligent musical assistant.
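
As a rough illustration of how topics 1 and 2 fit together, the sketch below computes a very simple sound descriptor (frame-wise RMS energy) and aligns two descriptor sequences with dynamic time warping (DTW). DTW is a standard alignment technique chosen here for illustration only; this page does not say which alignment methods are actually used. The function names `frame_rms` and `dtw_align`, the absolute-difference local cost and the non-overlapping framing are all assumptions made for this example, and none of it is taken from CATbox.

```python
import math


def frame_rms(signal, frame_size):
    """Hypothetical descriptor: RMS energy of non-overlapping frames."""
    starts = range(0, len(signal) - frame_size + 1, frame_size)
    return [
        math.sqrt(sum(x * x for x in signal[s:s + frame_size]) / frame_size)
        for s in starts
    ]


def dtw_align(a, b):
    """Align two non-empty feature sequences with classic DTW.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    meaning that a[i] is matched to b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    # Toy "score" and a slower "performance" of the same contour.
    score = [0, 1, 2, 3]
    performance = [0, 0, 1, 2, 2, 3]
    total, path = dtw_align(score, performance)
    print(total, path)
```

In a real system the scalar features would typically be replaced by richer descriptors (for example spectral or chroma vectors) with a matching distance function, but the dynamic-programming structure stays the same.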


To read more about CA, check out the Computer Audition tutorial at ACM Multimedia, MM'06, October 23, 2006, Santa Barbara, California, USA.

Some Matlab CA tools can be found in CATbox.

Also visit our new Computer Audition Lab and read about Music Information Processing.