Yadav, Sarthak (2022) Interpreting intermediate feature representations of raw-waveform deep CNNs by sonification. MSc(R) thesis, University of Glasgow.
Abstract
Most recent work on the interpretability of raw-waveform deep neural networks (DNNs) for audio processing focuses on spectral and frequency-response information, is often limited to visual and signal-theoretic means of interpretation, and considers only the first layer. This work proposes sonification, a method for interpreting the intermediate feature representations of sound event recognition (SER) 1D convolutional neural networks (1D-CNNs) trained on raw waveforms. Sonification maps these representations back into the discrete-time input signal domain, highlighting the substructures in the input that maximally activate a feature map as intelligible acoustic events. Sonification is used to compare supervised and contrastive self-supervised feature representations, and shows that the latter learn more acoustically discernible representations, especially in the deeper layers. A metric that quantifies the acoustic similarity between the interpretations and their corresponding inputs is also proposed, and a layer-by-layer analysis of the trained feature representations using this metric supports these observations.
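As a rough illustration of what "mapping a feature map back into the input signal domain" can mean, the sketch below projects one ReLU feature map of a single 1D convolution back onto the waveform with a transposed convolution (a deconvnet-style projection). This is an assumption made for illustration only: the abstract does not specify the thesis's actual procedure, and the names `conv1d_valid` and `project_to_input` are hypothetical.

```python
import numpy as np


def conv1d_valid(x, w):
    """Cross-correlate signal x (length N) with kernel w (length K).

    Returns the N - K + 1 'valid' outputs, as a 1D conv layer without padding.
    """
    return np.correlate(x, w, mode="valid")


def project_to_input(x, w):
    """Hypothetical helper: project one ReLU feature map back onto the input.

    Forward pass: a = relu(conv1d_valid(x, w)).
    Projection:   transposed convolution of a with w, which equals the
                  gradient of 0.5 * sum(a ** 2) with respect to x.
    The result has the same length as x, so it can be played back as audio.
    """
    a = np.maximum(conv1d_valid(x, w), 0.0)  # ReLU feature map
    return np.convolve(a, w, mode="full")    # length (N - K + 1) + K - 1 = N


if __name__ == "__main__":
    # Toy input: silence with one short burst that matches the kernel.
    w = np.array([1.0, -1.0, 1.0, -1.0])
    x = np.zeros(64)
    x[20:24] = w
    projection = project_to_input(x, w)
    # Print the sample indices where the projection is non-zero.
    print(np.nonzero(projection)[0])
```

In this toy case the projection is non-zero only around the burst, which is the sense in which such a mapping highlights the part of the input that activates the feature map. A deep network would need the same idea applied through every layer below the one being interpreted.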
Item Type: | Thesis (MSc(R)) |
---|---|
Qualification Level: | Masters |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Colleges/Schools: | College of Science and Engineering > School of Computing Science |
Supervisor's Name: | Foster, Dr. Mary Ellen |
Date of Award: | 2022 |
Depositing User: | Theses Team |
Unique ID: | glathesis:2022-82820 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 20 Apr 2022 10:48 |
Last Modified: | 20 Apr 2022 10:50 |
Thesis DOI: | 10.5525/gla.thesis.82820 |
URI: | https://theses.gla.ac.uk/id/eprint/82820 |