Acoustic Chord Transcription and Key Extraction From Audio Using Key-Dependent HMMs Trained on Synthesized A
Source:
IEEE Transactions on Audio, Speech, and Language Processing, IEEE, Volume 16, Issue 2, p.291–301 (2008)
Abstract:
We describe an acoustic chord transcription system
that uses symbolic data to train hidden Markov models and
gives best-of-class frame-level recognition results. We avoid the
extremely laborious task of human annotation of chord names and
boundaries—which must be done to provide machine learning
models with ground truth—by performing automatic harmony
analysis on symbolic music files. In parallel, we synthesize audio
from the same symbolic files and extract acoustic feature vectors
which are in perfect alignment with the labels. We, therefore,
generate a large set of labeled training data with a minimal amount
of human labor. This allows for richer models. Thus, we build 24
key-dependent HMMs, one for each key, using the key information
derived from symbolic data. Each key model defines a unique
state-transition characteristic and helps avoid confusions seen in
the observation vector. Given acoustic input, we identify a musical
key by choosing a key model with the maximum likelihood, and
we obtain the chord sequence from the optimal state path of the
corresponding key model, both of which are returned by a Viterbi
decoder. This not only increases the chord recognition accuracy,
but also gives key information. Experimental results show the
models trained on synthesized data perform very well on real
recordings, even though the labels automatically generated from
symbolic data are not 100% accurate. We also demonstrate the
robustness of the tonal centroid feature, which outperforms the
conventional chroma feature.
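
The decoding step described in the abstract (run a Viterbi decoder over each key model, keep the key whose best path scores highest, and read the chords off that path) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `viterbi_log` and `decode_key_and_chords`, the dictionary layout of `models`, and the assumption that per-frame observation log-likelihoods have already been computed for each key model are all hypothetical choices made for this example.

```python
import numpy as np


def viterbi_log(log_pi, log_A, log_B):
    """Log-domain Viterbi decoding for one HMM.

    log_pi: (S,)   initial state log-probabilities
    log_A:  (S, S) transition log-probabilities, indexed [from, to]
    log_B:  (T, S) observation log-likelihood of each frame under each state
    Returns (log-score of the best path, best state path as a list of ints).
    """
    T, S = log_B.shape
    delta = log_pi + log_B[0]              # best score ending in each state at t=0
    back = np.zeros((T, S), dtype=int)     # back-pointers for path recovery
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to j
        scores = delta[:, None] + log_A
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(S)] + log_B[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return float(np.max(delta)), path.tolist()


def decode_key_and_chords(models):
    """Pick the key model with the highest Viterbi score.

    models: dict mapping a key name to (log_pi, log_A, log_B), where log_B
            holds that model's observation log-likelihoods for the input
            frames (hypothetical layout, assumed for this sketch).
    Returns (key name, log-score, chord-state path).
    """
    best = None
    for key, (log_pi, log_A, log_B) in models.items():
        score, path = viterbi_log(log_pi, log_A, log_B)
        if best is None or score > best[1]:
            best = (key, score, path)
    return best
```

In a full system `models` would hold 24 entries, one per key, and the integers in the returned path would index chord states. The score compared across keys here is the likelihood of the single best path, which matches the abstract's statement that both the key and the chord sequence come from the Viterbi decoder.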
© 2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.