Tait, Crawford (1997) Wavelet analysis for onset detection. PhD thesis, University of Glasgow.
Full text available as: PDF (scanned version of the original print thesis), 8MB
Abstract
Many of the auditory perception processes which researchers have sought to automate can be decomposed into stages, the first of which involves segmentation of the input audio. In music this stage equates to locating note onsets, and advances in this task should therefore ease further analyses. There are also many direct applications of onset detection, including synchronisation of audio with other media and location of significant time points in graphical editing of audio. It is for these reasons that this work focuses on the task of detecting onsets. An onset is considered as a particular type of change in the time-frequency representation of a sound. The modulus plane derived from a semitone-based harmonic wavelet analysis is first transformed, to account for the varying frequency sensitivity and mapping from amplitude to loudness observed in the human auditory system. Vectors are then derived from adjacent regions of the plane, and compared for change using Minkowski's distance measure. Peaks of distance correspond to significant changes, and commencing partials are sought at peak locations to identify onset peaks. The process of testing the method is considered in some detail, and an experiment is derived in which a test piece is recorded using a wide range of timbres from a MIDI synthesiser. The piece includes a repeated note and a range of intervals, and legato and staccato styles are demonstrated. Separate test cases demonstrate results in the presence of reverberation, dynamic variation, low notes, short notes, vibrato, tremolo and drum sounds (with overlapping cymbals). The main body of tests was conducted using a large number of parameter settings and variations of the analysis method (including different loudness scales and exponents in the distance measure) to achieve optimal results, but the reduction to a single analysis method with one parameter was also considered. 
The use of a novel technique to compensate for slowly rising onsets is also investigated. Although the domain is restricted to monophonic musical audio, many of the test cases contain overlap and the method is shown to have some potential in the analysis of polyphonic examples. The results of this experiment are assessed in the context of error tolerances, derived from consideration of a number of typical applications. It is shown that such assessment is not a straightforward matter and, for example, there may be interaction between the type of timbre and the error tolerance which will apply in a specific application. In summary, the thesis establishes that onset detection can be accomplished by monitoring a distance measure calculated from a harmonic wavelet analysis; and does this via the design and implementation of a comprehensive experiment.
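The core idea in the abstract, comparing vectors taken from adjacent regions of a time-frequency plane with a Minkowski distance and treating peaks in that distance as candidate onsets, can be illustrated with a small Python sketch. This is not the thesis implementation: the function names, the averaging used to form each region vector, the `width` and `threshold` parameters, and the toy plane are all assumptions made for illustration. The sketch also omits the harmonic wavelet analysis, the auditory transformation of the modulus plane, and the search for commencing partials at each peak.

```python
def minkowski_distance(u, v, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def region_vector(plane, start, width):
    """Average `width` consecutive frames, starting at `start`, into one vector.

    Averaging is an assumption; the abstract does not say how vectors are derived.
    """
    frames = plane[start:start + width]
    n_bands = len(frames[0])
    return [sum(f[k] for f in frames) / len(frames) for k in range(n_bands)]


def distance_curve(plane, width=2, p=2.0):
    """Distance between the regions just before and just after each frame."""
    curve = [0.0] * len(plane)
    for t in range(width, len(plane) - width + 1):
        before = region_vector(plane, t - width, width)
        after = region_vector(plane, t, width)
        curve[t] = minkowski_distance(before, after, p)
    return curve


def pick_peaks(curve, threshold):
    """Indices of local maxima of the curve that exceed `threshold`."""
    return [
        t for t in range(1, len(curve) - 1)
        if curve[t] > threshold
        and curve[t] > curve[t - 1]
        and curve[t] >= curve[t + 1]
    ]


# Toy modulus plane (hypothetical data): one row per time frame, one column
# per frequency band. Band 1 starts at frame 4; band 2 joins at frame 8.
plane = [[0.0, 0.0, 0.0]] * 4 + [[0.0, 1.0, 0.0]] * 4 + [[0.0, 1.0, 1.0]] * 4

curve = distance_curve(plane, width=2, p=2.0)
onsets = pick_peaks(curve, threshold=0.75)
print(onsets)  # [4, 8]
```

The exponent `p` corresponds to the "exponents in the distance measure" that the abstract says were varied during testing: `p=1` gives the city-block distance and `p=2` the Euclidean distance. The second detected frame, where a new band enters while another is still sounding, loosely mirrors the overlap cases mentioned in the abstract.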
Item Type: | Thesis (PhD) |
---|---|
Qualification Level: | Doctoral |
Additional Information: | Adviser: Bill Findlay |
Keywords: | Computer science, music, data processing. |
Subjects: | T Technology > T Technology (General) |
Colleges/Schools: | College of Science and Engineering |
Supervisor's Name: | Findlay, Bill and Patterson, John |
Date of Award: | 1997 |
Depositing User: | Enlighten Team |
Unique ID: | glathesis:1997-71757 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 17 May 2019 09:31 |
Last Modified: | 05 Sep 2022 16:05 |
Thesis DOI: | 10.5525/gla.thesis.71757 |
URI: | https://theses.gla.ac.uk/id/eprint/71757 |