Bainbridge, D., Downie, J. S., & Ehmann, A. F. (2012). Structured audio content analysis and metadata in a digital library. In Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries (pp. 431-432). Washington DC, USA: Association for Computing Machinery.
Permanent Research Commons link: http://hdl.handle.net/10289/7125
This work illustrates how audio content analysis of music and manually assigned structural temporal metadata can be used to form a digital library designed for musicological exploration. In addition to text-based searching and browsing, the document view is enriched with an interactive structured audio time-line that shows ground-truth data representing the logical segments of the song, alongside an automatically generated version for comparison. An interactive self-similarity "heat" map is also displayed. Clicking within the map at a co-ordinate (x,y) plays the audio simultaneously at time offsets x and y, panned left and right, respectively, to make it easier for the listener to pick out the differences. The musicologist can also initiate an audio content based query starting at any point in the song. This produces a ranked result set whose items can be studied further through their respective document views. Alternatively, they can perform a musical structure search (for example, for songs that contain the structure b, b, c, b, c).
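The heat map interaction and the structure search described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each song is already reduced to a list of per-frame feature vectors, uses cosine similarity as a stand-in for whatever measure the system applies, and treats a structure query as a literal, contiguous run of segment labels. All function names and the `frame_seconds` parameter are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def self_similarity(frames):
    """Return an n x n matrix; cell [i][j] compares frame i with frame j."""
    return [[cosine(fi, fj) for fj in frames] for fi in frames]

def click_to_offsets(x, y, frame_seconds):
    """Map a clicked cell (x, y) to two playback offsets in seconds.

    The first offset would be played panned left, the second panned right.
    frame_seconds (assumed) is the duration covered by one matrix cell.
    """
    return x * frame_seconds, y * frame_seconds

def contains_structure(segment_labels, pattern):
    """True if pattern occurs as a contiguous run in the song's segment labels."""
    n, m = len(segment_labels), len(pattern)
    return any(segment_labels[i:i + m] == pattern for i in range(n - m + 1))

# Toy data: three frames, the first and third identical.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
matrix = self_similarity(frames)

# Segment labels for one hypothetical song, and the query from the abstract.
song = ["a", "b", "b", "c", "b", "c", "d"]
query = ["b", "b", "c", "b", "c"]
found = contains_structure(song, query)
```

In this toy case `matrix[0][2]` is high because the first and third frames match, which is the kind of off-diagonal hot spot a listener would click to compare the two passages. A real system would more likely match structure patterns up to relabelling, so that b, b, c, b, c also matches x, x, y, x, y; the literal comparison here is a simplifying assumption.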