Automatic Transcription of Polyphonic Music Based on The Constant-Q Bispectral Analysis

In the area of music information retrieval (MIR), automatic music transcription is considered one of the most challenging tasks, and many different techniques have been proposed for it. This research presents a new technique for the automatic transcription of real, polyphonic, multi-instrumental music. The system implements a novel front-end, obtained by a constant-Q bispectral analysis of the input audio signal, which offers advantages over lower-dimensional spectral analysis in polyphonic pitch estimation. In every frame, pitch estimation is performed by means of a 2-D correlation between the signal bispectrum and a fixed two-dimensional harmonic pattern, while information about the intensity of detected pitches is taken directly from the magnitude spectrum. Onset times are detected by a procedure that highlights large energy variations between consecutive frames of the time-frequency signal representation. This representation is also the basis for note duration estimation: a pitch-against-time representation of detected notes is compared with the audio spectrogram, and the duration of each detected note event in the former is adjusted to the duration of the corresponding event in the latter. All the data concerning pitches, onset times, durations and volumes are tabulated and output as a numerical list, and a standard MIDI file is produced. The capabilities and performance of the proposed transcription system have been compared with those of a spectrum-based transcription system. The evaluation data set has been extracted from the standard RWC - Classical Database; for this purpose the whole architecture has been kept as general as possible, without introducing any a priori knowledge. Standard parameters have been used for validation. Our system successfully identified over 57% of voiced events, with an overall F-measure of 72.1%.
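The frame-level pitch estimation step can be illustrated with a minimal sketch. It is not the published implementation: it assumes a plain DFT instead of the constant-Q front-end, a single synthetic frame, and a hypothetical harmonic pattern made of the bispectrum points (i·f0, j·f0) with i + j no larger than the number of partials. The function names `dft`, `bispectrum` and `harmonic_score` are illustrative only.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real frame; returns bins 0 .. N/2 - 1 (assumption: no constant-Q)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2)
    ]


def bispectrum(spec, k1, k2):
    """B(k1, k2) = X(k1) X(k2) X*(k1 + k2); treated as zero outside the analysed band."""
    if k1 + k2 >= len(spec):
        return 0j
    return spec[k1] * spec[k2] * spec[k1 + k2].conjugate()


def harmonic_score(spec, f0_bin, n_partials=4):
    """Sum |B| over the points (i*f0, j*f0), i + j <= n_partials (hypothetical pattern)."""
    return sum(
        abs(bispectrum(spec, i * f0_bin, j * f0_bin))
        for i in range(1, n_partials)
        for j in range(1, n_partials - i + 1)
    )


# Synthetic test frame: fundamental at bin 12 with four decaying partials.
n = 256
amplitudes = [1.0, 0.8, 0.6, 0.4]
frame = [
    sum(a * math.cos(2 * math.pi * 12 * h * t / n)
        for h, a in enumerate(amplitudes, start=1))
    for t in range(n)
]

spec = dft(frame)
# Pick the candidate fundamental whose harmonic pattern best matches the bispectrum.
best = max(range(5, 31), key=lambda b: harmonic_score(spec, b))
print("estimated fundamental bin:", best)
```

In this toy setting the sub-harmonic and the octave candidates match only one point of the pattern, while the true fundamental matches all of them, which is the intuition behind correlating the bispectrum with a 2-D harmonic pattern rather than a 1-D one.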
Finally, a comparison with other methods has been made within the MIREX 2009 evaluation framework, in which the proposed system achieved good rankings: in particular, it was top ranked in the piano-only tracking task. The MIREX results show a very good overall recall rate in all three tasks to which the proposed system was submitted. The weakest aspect appears to be a still fairly high false positive rate, which lowers the precision rate. This could be improved by introducing physical, musicological or statistical models, or any other knowledge useful for solving the challenging task of music transcription. With respect to methods based on multi-F0 estimation via direct cancellation in the spectrum domain, the added values of the proposed solution are reduced leakage of information in the presence of partial overlapping, and the computation of a clearer 2-D cross-correlation, which leads to stronger decision capabilities.
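The link between false positives, precision, recall and F-measure can be made concrete with a short sketch. The definitions are the standard ones; the counts in the example are invented for illustration and are not results from the paper or from MIREX.

```python
def precision_recall_f(true_pos, false_pos, false_neg):
    """Standard precision, recall and F-measure from event counts."""
    detected = true_pos + false_pos      # all events reported by the system
    reference = true_pos + false_neg     # all events in the ground truth
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / reference if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical counts: good recall, but many false positives.
precision, recall, f_measure = precision_recall_f(true_pos=8, false_pos=8, false_neg=2)
print(precision, recall, f_measure)
```

With these made-up counts recall stays high while precision drops, so the F-measure is pulled down mainly by the false positives.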

F. Argenti, P. Nesi, G. Pantaleo, "Automatic Transcription of Polyphonic Music Based on The Constant-Q Bispectral Analysis", IEEE Transactions on Audio, Speech and Language Processing, IEEE Computer Society press, Vol. 19, n. 6, pp. 1610-1630, Aug. 2011.

If you are interested in the MATLAB code of this project, please download it and cite our work:

F. Argenti, P. Nesi, G. Pantaleo, "Automatic Music Transcription: from Monophonic to Polyphonic", in "Musical Robots and Interactive Multimodal Systems" (Kia Ng and Jorge Solis, editors), Springer, 2011, XVIII, 274 p., ISBN 978-3-642-22291-7. http://www.springer.com/engineering/robotics/book/978-3-642-22290-0

 

Contact:

Paolo Nesi, paolo.nesi@unifi.it
 
