Melody transcription for interactive applications

McNab, Rodger J.; Smith, Lloyd A.

Abstract

A melody transcription system has been developed to support interactive music applications. The system accepts monophonic voice input ranging from F2 (87 Hz) to G5 (784 Hz) and tracks the frequency, displaying the result in common music notation. Notes are segmented using adaptive thresholds operating on the signal's amplitude; users are required to separate notes using a stop consonant. The frequency resolution of the system is ±4 cents. Frequencies are internally represented by their distance in cents above MIDI note 0 (8.176 Hz); this allows accurate musical pitch labeling when a note is slightly sharp or flat, and supports a simple method of dynamically adapting the system's tuning to the user's singing. The system was evaluated by transcribing 100 recorded melodies (10 tunes, each sung by 5 male and 5 female singers) comprising approximately 5000 notes. The test data was transcribed in 2.8% of recorded time. Transcription error was 11.4%, with incorrect note segmentation accounting for virtually all errors. Error rate was highly dependent on the singer: one group of four singers had error rates ranging from 3% to 5%, while error rates for the remaining six singers ranged from 11% to 23%.
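
The cents representation described in the abstract can be illustrated with a short sketch. This is not the authors' implementation; it only shows how a frequency might be converted to cents above MIDI note 0 (using the 8.176 Hz reference given in the abstract) and then labeled with the nearest semitone. The function names, the note-name table, and the `tuning_offset_cents` parameter are assumptions introduced for illustration; the abstract does not say how the tuning adaptation is actually computed.

```python
import math

# Reference frequency of MIDI note 0, as stated in the abstract.
MIDI0_HZ = 8.176

# Hypothetical note-name table (sharps only), for illustration.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_cents(freq_hz):
    """Distance of freq_hz above MIDI note 0, in cents (100 cents = 1 semitone)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1200.0 * math.log2(freq_hz / MIDI0_HZ)


def label_pitch(freq_hz, tuning_offset_cents=0.0):
    """Return (midi_note, name, deviation_cents) for the nearest semitone.

    tuning_offset_cents is an assumed parameter: a constant shift that
    stands in for adapting the tuning to a singer who is consistently
    sharp or flat. The paper's actual adaptation method may differ.
    """
    cents = hz_to_cents(freq_hz) - tuning_offset_cents
    midi_note = int(math.floor(cents / 100.0 + 0.5))  # nearest semitone
    deviation = cents - 100.0 * midi_note             # how sharp (+) or flat (-)
    name = NOTE_NAMES[midi_note % 12] + str(midi_note // 12 - 1)
    return midi_note, name, deviation


if __name__ == "__main__":
    for f in (87.0, 440.0, 784.0):
        note, name, dev = label_pitch(f)
        print(f"{f:7.1f} Hz -> MIDI {note} ({name}), {dev:+.1f} cents")
```

Because each semitone spans exactly 100 cents, rounding the cents value to the nearest hundred gives the note label, and the remainder gives how sharp or flat the sung note was.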

Type

Working Paper

Series

Computer Science Working Papers

Citation

McNab, R. J. & Smith, L. A. (1996). Melody transcription for interactive applications. (Working paper 96/32). Hamilton, New Zealand: University of Waikato, Department of Computer Science.

Date

1996-12
