1996 Working Papers

Permanent URI for this collection

https://researchcommons.waikato.ac.nz/handle/10289/10029

Melody transcription for interactive applications
(Working Paper, 1996-12) McNab, Rodger J.; Smith, Lloyd A.
A melody transcription system has been developed to support interactive music applications. The system accepts monophonic voice input ranging from F2 (87 Hz) to G5 (784 Hz) and tracks the frequency, displaying the result in common music notation. Notes are segmented using adaptive thresholds operating on the signal's amplitude; users are required to separate notes using a stop consonant. The frequency resolution of the system is ±4 cents. Frequencies are internally represented by their distance in cents above MIDI note 0 (8.176 Hz); this allows accurate musical pitch labeling when a note is slightly sharp or flat, and supports a simple method of dynamically adapting the system's tuning to the user's singing. The system was evaluated by transcribing 100 recorded melodies (10 tunes, each sung by 5 male and 5 female singers) comprising approximately 5000 notes. The test data was transcribed in 2.8% of recorded time. Transcription error was 11.4%, with incorrect note segmentation accounting for virtually all errors. Error rate was highly dependent on the singer: one group of four singers had error rates ranging from 3% to 5%, while error over the remaining 6 singers ranged from 11% to 23%.
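The cents-above-MIDI-note-0 representation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sharp-only note spelling, and the octave convention (MIDI note 60 = C4, which agrees with the abstract's F2 at 87 Hz and G5 at 784 Hz) are assumptions made for illustration.

```python
import math

# Frequency of MIDI note 0 as quoted in the abstract.
MIDI0_HZ = 8.176

# Assumed spelling: sharps only, one name per semitone.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_cents(freq_hz):
    """Distance of freq_hz above MIDI note 0, in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(freq_hz / MIDI0_HZ)


def label_pitch(freq_hz):
    """Return (note name, MIDI number, deviation in cents) for the nearest semitone.

    A positive deviation means the sung pitch is sharp, a negative one flat.
    """
    cents = hz_to_cents(freq_hz)
    midi = round(cents / 100.0)
    deviation = cents - 100.0 * midi
    # Assumed octave convention: MIDI note 60 is C4.
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, midi, deviation


if __name__ == "__main__":
    for f in (87.0, 440.0, 784.0):
        name, midi, dev = label_pitch(f)
        print(f"{f:7.1f} Hz -> {name} (MIDI {midi}), {dev:+.2f} cents")
```

Because every pitch is a single number of cents, a slightly sharp or flat note still rounds to the intended semitone, and the leftover deviation is available as a measure of how far the singer is from the reference tuning.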
Selecting multiway splits in decision trees
(Working Paper, 1996-12) Frank, Eibe; Witten, Ian H.
Decision trees in which numeric attributes are split several ways are more comprehensible than the usual binary trees because attributes rarely appear more than once in any path from root to leaf. There are efficient algorithms for finding the optimal multiway split for a numeric attribute, given the number of intervals in which it is to be divided. The problem we tackle is how to choose this number in order to obtain small, accurate trees.
Reconstructing Minard's graphic with the relational visualisation notation
(Working Paper, 1996-12) Humphrey, Matthew C.
Richly expressive information visualisations are difficult to design and rarely found. Few software tools can generate multi-dimensional visualisations at all, let alone incorporate artistic detail. The Relational Visualisation Toolkit is a new system for specifying highly expressive graphical representations of data without traditional programming. We seek to discover the accessible power of this notation: both its graphical expressiveness and its ease of use. Towards this end we have used the system to design and reconstruct Minard's Visualisation of Napoleon's Russian campaign of 1812. The resulting image is very similar to the original, and the design is straightforward to construct. Furthermore, the design is sufficiently general to be able to visualise Hitler's WWII defeat before Moscow.
OzCHI'96 Workshop on the Next Generation of CSCW Systems
(Working Paper, 1996-11-25) Grundy, John C.
This is the Proceedings of the OZCHI'96 Workshop on the Next Generation of CSCW Systems. Thanks must go to Andy Cockburn for inspiring the name of the workshop and thus giving it a (general) theme! The idea for this workshop grew out of discussions with John Venable concerning the Next Generation of CASE Tools workshop which he'd attended in 1995 and 1996. With CSCW research becoming more prominent within the CHI community in Australasia, it seemed a good opportunity to get people together at OZCHI'96 who share this interest. Focusing the workshop on next-generation CSCW system issues produced paper submissions which explored very diverse areas of CSCW, but which all share a common thread of "Where do we go from here?" and, perhaps even more importantly, "Why should we be doing this?".
Teaching students to critically evaluate the quality of Internet research resources
(Working Paper, 1996-11) Cunningham, Sally Jo
The Internet offers a host of high-quality research material in computer science and, unfortunately, some very low-quality resources as well. As part of learning the research process, students should be taught to critically evaluate the quality of all documents that they use. This paper discusses the application of document evaluation criteria to WWW resources, and describes activities for including quality evaluation in a course on research methods.
Timestamp representations for virtual sequences
(Working Paper, IEEE Computer Society Press, 1996-11) Cleary, John G.; McWha, David J.A.; Murray, Pearson
The problem of executing sequential programs optimistically using the Time Warp algorithm is considered. It is shown how to do this by first mapping the sequential execution to a control tree and then assigning timestamps to each node in the tree. For such timestamps to be effective they must be finite; this implies that they must be periodically rescaled to allow old timestamps to be reused. A number of timestamp representations are described and compared on the basis of their complexity, the frequency and cost of rescaling, and the cost of performing basic operations, including comparison and creation of new timestamps.
Dataset cataloging metadata for machine learning applications and research
(Working Paper, 1996-11) Cunningham, Sally Jo
As the field of machine learning (ML) matures, two types of data archives are developing: collections of benchmark data sets used to test the performance of new algorithms, and data stores to which machine learning/data mining algorithms are applied to create scientific or commercial applications. At present, the catalogs of these archives are ad hoc and not tailored to machine learning analysis. This paper considers the cataloging metadata required to support these two types of repositories, and discusses the organizational support necessary for archive catalog maintenance.
Identifying hierarchical structure in sequences: a linear-time algorithm
(Working Paper, 1996-11) Nevill-Manning, Craig G.; Witten, Ian H.
This paper describes an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing phrases which appear more than once by a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence. The algorithm works by maintaining two constraints: every digram in the grammar must be unique, and every rule must be used more than once. It breaks new ground by operating incrementally. Moreover, its simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 10,000 symbols/second and has been applied to an extensive range of sequences encountered in practice.
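The core idea in this abstract, replacing a repeated pair of adjacent symbols with a rule and repeating the process on the result, can be sketched as follows. This is deliberately not the paper's algorithm: it is a simple offline, most-frequent-pair-first variant that is neither incremental nor linear-time, and it does not enforce the constraint that every rule be used more than once. All function names are hypothetical; rules are numbered with integers so they cannot collide with the input's string symbols.

```python
from collections import Counter


def count_digrams(seq):
    """Count non-overlapping occurrences of each adjacent pair in seq."""
    counts = Counter()
    last_counted = {}
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        # In a run such as "aaa" the pairs at 0 and 1 overlap; count only one.
        if last_counted.get(pair) == i - 1:
            continue
        counts[pair] += 1
        last_counted[pair] = i
    return counts


def replace_pair(seq, pair, symbol):
    """Replace each left-to-right, non-overlapping occurrence of pair by symbol."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def build_grammar(text):
    """Return (top-level sequence, rules); rules maps an int id to a 2-symbol body."""
    seq, rules = list(text), {}
    while True:
        counts = count_digrams(seq)
        if not counts:
            break
        pair, n = counts.most_common(1)[0]
        if n < 2:  # no pair repeats in the top-level sequence any more
            break
        rule_id = len(rules) + 1
        rules[rule_id] = list(pair)
        seq = replace_pair(seq, pair, rule_id)
    return seq, rules


def expand(seq, rules):
    """Recursively expand rule ids back into the original symbols."""
    out = []
    for sym in seq:
        if isinstance(sym, int):
            out.extend(expand(rules[sym], rules))
        else:
            out.append(sym)
    return out


if __name__ == "__main__":
    seq, rules = build_grammar("abcabcabc")
    print(seq, rules)
    print("".join(expand(seq, rules)))
```

Expanding the top-level sequence through the rules reproduces the input exactly, so the grammar is a lossless hierarchical description of the sequence.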
Authorship patterns in information systems
(Working Paper, 1996-10) Cunningham, Sally Jo; Dillon, Stuart M.
This paper examines the patterns of multiple authorship in five information systems journals. Specifically, we determine the distribution of the number of authors per paper in this field, the proportion of male and female authors, gender composition of research teams, and the incidence of collaborative relationships spanning institutional affiliations and across different geographic regions.
Induction of model trees for predicting continuous classes
(Working Paper, 1996-10) Wang, Yong; Witten, Ian H.
Many problems encountered when applying machine learning in practice involve predicting a "class" that takes on a continuous numeric value, yet few machine learning schemes are able to do this. This paper describes a "rational reconstruction" of M5, a method developed by Quinlan (1992) for inducing trees of regression models. In order to accommodate data typically encountered in practice it is necessary to deal effectively with enumerated attributes and with missing values, and techniques devised by Breiman et al. (1984) are adapted for this purpose. The resulting system seems to outperform M5, based on the scanty published data that is available.
Understanding what machine learning produces - Part II: Knowledge visualization techniques
(Working Paper, 1996-10) Cunningham, Sally Jo; Humphrey, Matthew C.; Witten, Ian H.
Researchers in machine learning use decision trees, production rules, and decision graphs for visualizing classification data. Part I of this paper surveyed these representations, paying particular attention to their comprehensibility for non-specialist users. Part II turns attention to knowledge visualization—the graphic form in which a structure is portrayed and its strong influence on comprehensibility. We analyze the questions that, in our experience, end users of machine learning tend to ask of the structures inferred from their empirical data. By mapping these questions onto visualization tasks, we have created new graphical representations that show the flow of examples through a decision structure. These knowledge visualization techniques are particularly appropriate in helping to answer the questions that users typically ask, and we describe their use in discovering new properties of a data set. In the case of decision trees, an automated software tool has been developed to construct the visualizations.
Understanding what machine learning produces - Part I: Representations and their comprehensibility
(Working Paper, 1996-10) Cunningham, Sally Jo; Humphrey, Matthew C.; Witten, Ian H.
The aim of many machine learning users is to comprehend the structures that are inferred from a dataset, and such users may be far more interested in understanding the structure of their data than in predicting the outcome of new test data. Part I of this paper surveys representations based on decision trees, production rules and decision graphs that have been developed and used for machine learning. These representations have differing degrees of expressive power, and particular attention is paid to their comprehensibility for non-specialist users. The graphic form in which a structure is portrayed also has a strong effect on comprehensibility, and Part II of this paper develops knowledge visualization techniques that are particularly appropriate to help answer the questions that machine learning users typically ask about the structures produced.
Visual analogy in creative design: case study of fractals and crochet lace
(Working Paper, 1996-10) Cunningham, Sally Jo
One powerful technique for supporting creativity in design is analogy: drawing similarities between seemingly unrelated objects taken from different domains. A case study is presented in which fractal images serve as a source for novel crochet lace patterns. The human designer searches a potential design space by manipulating the parameters of fractal systems, and then translates portions of fractal forms to lacework. This approach to supporting innovation in design is compared with previous work based on formal modelling of the domain with generative grammars.
Theory combination: an alternative to data combination
(Working Paper, 1996-10) Ting, Kai Ming; Low, Boon Toh
The approach of combining theories learned from multiple batches of data provides an alternative to the common practice of learning one theory from all the available data (i.e., the data combination approach). This paper empirically examines the base-line behaviour of the theory combination approach in classification tasks. We find that theory combination can lead to better performance even if the disjoint batches of data are drawn randomly from a larger sample, and relate the relative performance of the two approaches to the learning curve of the classifier used.
Machine learning applied to fourteen agricultural datasets
(Working Paper, 1996-09) Thomson, Kirsten; McQueen, Robert J.
This document reports on an investigation conducted between November, 1995 and March, 1996 into the use of machine learning on 14 sets of data supplied by agricultural researchers in New Zealand. Our purpose here is to collect together short reports on trials with these datasets using the WEKA machine learning workbench, so that some understanding of the applicability and potential application of machine learning to similar datasets may result.
Serendipity: integrated environment support for process modelling, enactment and improvement
(Working Paper, 1996-08) Grundy, John C.; Hosking, John G.
Large cooperative work systems require work coordination, context awareness and process modelling and enactment mechanisms to be effective. The Serendipity environment provides visual languages for specifying process models and event processing. Enacted models can be modified during or after use and can act as plans of work to be done, describe work in progress, and record work done on a project. Serendipity has been integrated with an Information Systems engineering environment and office automation applications, without modification to these pre-existing tools. Animation of process models allows collaborating users to remain aware of the work contexts of their collaborators. Information about the current enacted process stage is attached to descriptions of changes made to work artefacts, recording the context of work. Such changes are also stored by the process stage, allowing collaborators to review the stage work history. Serendipity's visual event processing language allows users to specify rules and actions triggered by enactment, process or work artefact modification, or tool events. This paper describes Serendipity, our experiences using it, and its architecture and implementation.
Compression and explanation using hierarchical grammars
(Working Paper, 1996-07) Nevill-Manning, Craig G.; Witten, Ian H.
Data compression is an eminently pragmatic pursuit: by removing redundancy, storage can be utilised more efficiently. Identifying redundancy also serves a less prosaic purpose: it provides cues for detecting structure, and the recognition of structure coincides with one of the goals of artificial intelligence: to make sense of the world by algorithmic means. This paper describes an algorithm that excels at both data compression and structural inference. This algorithm is implemented in a system called SEQUITUR that efficiently deals with sequences containing millions of symbols.
CSCW in New Zealand: a snapshot
(Working Paper, 1996-07) Blackett, Colin; Reeves, Steve
This report has been produced as one of the outputs of the FORST funded project "Improved Computer Supported Collaborative Work Systems" which is currently running in the Department of Computer Science at the University of Waikato. Its aim is to give a snapshot of the uses of and possibilities for Computer Supported Collaborative (also Co-operative) Work (also Working) (CSCW) within New Zealand.
Digital libraries based on full-text retrieval
(Working Paper, 1996-07) Witten, Ian H.; Nevill-Manning, Craig G.; Cunningham, Sally Jo
Because digital libraries are expensive to create and maintain, Internet analogs of public libraries (reliable, quality, community services) have only recently begun to appear. A serious obstacle to their creation is the provision of appropriate cataloguing information. Without a database of titles, authors and subjects, it is hard to offer the searching and browsing facilities normally available in physical libraries. Full-text retrieval provides a way of approximating these services without a concomitant investment of human resources. This presentation will discuss the indexing, collection and maintenance processes, and the retrieval interface, to public digital libraries.
Tree browsing
(Working Paper, 1996-07) Apperley, Mark; Chester, Michael
Graphic representations of tree structures are notoriously difficult to create, display, and interpret, particularly when the volume of information they contain, and hence the number of nodes, is large. The problem of interactively browsing information held in tree structures is examined, and a design for a tree browser proposed. This design is based on distortion-oriented display techniques and intuitive direct manipulation interaction. The tree layout is automatically generated, but the location and extent of detail shown is controlled by the user. It is suggested that these techniques could be extended to the browsing of more general networks.