McNab, R. J. & Smith, L. A. (1996). Melody transcription for interactive applications. (Working paper 96/32). Hamilton, New Zealand: University of Waikato, Department of Computer Science.
Permanent Research Commons link: https://hdl.handle.net/10289/1194
A melody transcription system has been developed to support interactive music applications. The system accepts monophonic voice input ranging from F2 (87 Hz) to G5 (784 Hz) and tracks the frequency, displaying the result in common music notation. Notes are segmented using adaptive thresholds operating on the signal's amplitude; users are required to separate notes using a stop consonant. The frequency resolution of the system is ±4 cents. Frequencies are internally represented by their distance in cents above MIDI note 0 (8.176 Hz); this allows accurate musical pitch labeling when a note is slightly sharp or flat, and supports a simple method of dynamically adapting the system's tuning to the user's singing. The system was evaluated by transcribing 100 recorded melodies (10 tunes, each sung by 5 male and 5 female singers) comprising approximately 5000 notes. The test data was transcribed in 2.8% of recorded time. Transcription error was 11.4%, with incorrect note segmentation accounting for virtually all errors. Error rate was highly dependent on the singer: one group of four singers had error rates ranging from 3% to 5%, while error rates for the remaining six singers ranged from 11% to 23%.
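The cents-above-MIDI-note-0 representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `hz_to_cents` and `label_pitch`, the use of sharps for accidentals, and the octave convention (MIDI note 60 = C4, which agrees with the abstract's F2 = 87 Hz) are assumptions made for illustration. Only the 8.176 Hz reference and the 100-cents-per-semitone scale come from the text.

```python
import math

MIDI0_HZ = 8.176  # frequency of MIDI note 0, as stated in the abstract
# Assumption: sharps are used for all accidentals.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_cents(freq_hz):
    """Distance of freq_hz above MIDI note 0, in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq_hz / MIDI0_HZ)


def label_pitch(freq_hz):
    """Return (note name, MIDI note number, deviation in cents) for a frequency.

    The nearest semitone is the cents value rounded to a multiple of 100;
    the remainder says how sharp (+) or flat (-) the sung note is.
    """
    cents = hz_to_cents(freq_hz)
    midi_note = int(round(cents / 100.0))
    deviation = cents - 100.0 * midi_note
    octave = midi_note // 12 - 1  # assumed convention: MIDI 60 is C4
    name = NOTE_NAMES[midi_note % 12] + str(octave)
    return name, midi_note, deviation


if __name__ == "__main__":
    for hz in (87.0, 440.0, 450.0, 784.0):
        name, midi_note, deviation = label_pitch(hz)
        print(f"{hz:7.1f} Hz -> {name} (MIDI {midi_note}), {deviation:+.1f} cents")
```

Because each semitone spans exactly 100 cents on this scale, a note sung somewhat sharp or flat still rounds to the intended pitch, and the leftover deviation is the kind of quantity a tuning-adaptation scheme could track. How the system actually adapts its tuning is not specified in the abstract.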