2002 Working Papers

Recent Submissions

  • Publication
    Toward a theory of music information retrieval queries: System design implications
    (Working Paper, Department of Computer Science, University of Waikato, 2002) Cunningham, Sally Jo; Downie, J. Stephen
    Interest in the development of content-based music information retrieval (MIR) systems is growing rapidly. The MIR research community consists of a multidisciplinary amalgam of librarians, digital librarians, information scientists, computer scientists, musicologists, audio engineers, lawyers and business persons. This multidisciplinary approach has given rise to significant technological advancements in retrieval algorithms, audio interfaces and data representation schemes. Notwithstanding these technological advancements, MIR research is currently a systems-centered research domain. For a variety of reasons (including intellectual property law, limited access to substantial, multi-genre, multi-format collections and the lack of a historical user base), MIR research has hitherto been unable to develop and exploit data concerning the nature of real-world user needs and use of music information.
  • Publication
    Research laboratory survey
    (Working Paper, University of Waikato, Department of Computer Science, 2002-11) Thomson, Kirsten
    This report presents the results of a survey conducted by the University of Waikato Usability Laboratory of the research laboratories at the Department of Computer Science, The University of Waikato, Hamilton, New Zealand. The study was conducted on behalf of the Department of Computer Science. The goals of the research were to inform the development of future laboratories; inform the process of any re-development of current laboratories; and provide information about the use and acceptance of the laboratories.
  • Publication
    The LIDS Research Project: appendage to usability study report (1/2002)
    (Working Paper, University of Waikato, Department of Computer Science, 2002-07) Thomson, Kirsten; McLeod, Laurie
    This report is a follow-on to an earlier report (titled Usability Study Report (1/2002), dated 1 July 2002) that presented the University of Waikato Usability Laboratory’s (Usability Laboratory) analysis of the Large Interactive Display Screen (LIDS) technologies as developed by the LIDS Research Project.
  • Publication
    Use of video shadow for small group interaction awareness on a large interactive display surface
    (Working Paper, University of Waikato, Department of Computer Science, 2002-07) Apperley, Mark; McLeod, Laurie; Masoodian, Masood; Paine, Lance; Phillips, Malcolm; Rogers, Bill; Thomson, Kirsten
    This paper reports work done as part of the Large Interactive Display Surface (LIDS) project at the University of Waikato. One application of the LIDS equipment is distributed meeting support. In this context large display surfaces are used as shared workspaces by people at collaborating sites. A meeting will start with a shared presentation document, typically an agenda document with summary and detail on agenda items as required. During the meeting, annotations will be made on the shared document, and new pages will be added with notes and drawings. To prevent access collisions and generally mediate use of the shared space, mechanisms to provide awareness of actions of people at other sites are required. In our system a web camera is used to capture a low-resolution image of the person/people near the board on each side. Rather than transmit the image directly, we compute a shadow/silhouette. The shadow is displayed behind other screen content. This provides awareness of position and impending write actions and allows intentional pointing to locations on the screen. It also has the advantage of being transmitted with low bandwidth, being relatively insensitive to low frame rates, and minimizing visual interference with substantive data being displayed on the screen.
  • Publication
    The LIDS Research Project: usability study report (1/2002)
    (Working Paper, University of Waikato, Department of Computer Science, 2002-07) Thomson, Kirsten; McLeod, Laurie
    This report represents the University of Waikato Usability Laboratory’s (Usability Laboratory) analysis of the Large Interactive Display Screen (LIDS) technologies as developed by the LIDS Research Group. The Usability Laboratory conducted three exploratory-type studies of the LIDS technology over January and February 2002. The studies each focused on individual elements of the LIDS technology, while at the same time contributing to the general understanding and knowledge of the technology.
  • Publication
    Benchmarking attribute selection techniques for discrete class data mining
    (Working Paper, University of Waikato, Department of Computer Science, 2002-04) Hall, Mark A.; Holmes, Geoffrey
    Data engineering is generally considered to be a central issue in the development of data mining applications. The success of many learning schemes, in their attempts to construct models of data, hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building phase can result in poor predictive performance and increased computation. Attribute selection generally involves a combination of search and attribute utility estimation plus evaluation with respect to specific learning schemes. This leads to a large number of possible permutations and has led to a situation where very few benchmark studies have been conducted. This paper presents a benchmark comparison of several attribute selection methods for supervised classification. All the methods produce an attribute ranking, a useful device for isolating the individual merit of an attribute. Attribute selection is achieved by cross-validating the attribute rankings with respect to a classification learner to find the best attributes. Results are reported for a selection of standard data sets and two diverse learning schemes, C4.5 and naïve Bayes.
  • Publication
    A logit boosting approach to inducing multiclass alternating decision trees
    (Working Paper, University of Waikato, Department of Computer Science, 2002-03) Holmes, Geoffrey; Pfahringer, Bernhard; Kirkby, Richard Brendon; Frank, Eibe; Hall, Mark A.
    The alternating decision tree (ADTree) is a successful classification technique that combines decision trees with the predictive accuracy of boosting into a set of interpretable classification rules. The original formulation of the tree induction algorithm restricted attention to binary classification problems. This paper empirically evaluates several methods for extending the algorithm to the multiclass case by splitting the problem into several two-class problems, and adapts the LogitBoost procedure to induce alternating decision trees directly. Experimental results confirm that this procedure is comparable in accuracy with methods that are based on the original ADTree formulation, while inducing much smaller trees.
  • Publication
    ZML: XML support for standard Z
    (Working Paper, 2002-12-01) Utting, Mark; Toyn, Ian; Sun, Jing; Martin, Andrew; Dong, Jin Song; Daley, Nicholas; Currie, David
    This paper proposes an XML format for standard Z. We describe several earlier XML proposals for Z, the problems and issues that arose, and the rationales behind our new proposal. The new proposal is based upon a comparison of various existing Z annotated syntaxes, to ensure that the mark-up will be widely usable. This XML format is expected to become a central feature of the CZT (Community Z Tools) initiative.
  • Publication
    Object orientation without extending Z
    (Working Paper, Springer-Verlag, 2002-12-01) Utting, Mark; Wang, Shaochun
    The good news of this paper is that without extending Z, we can elegantly specify object-oriented systems, including encapsulation, inheritance and subtype polymorphism (dynamic dispatch). The bad news is that this specification style is rather different to normal Z specifications, more abstract and axiomatic, which means that it is not so well supported by current Z tools such as animators. It also enforces behavioural subtyping, unlike most object-oriented programming languages. This paper explains the proposed style, with examples, and discusses its advantages and disadvantages.
  • Publication
    Accuracy bounds for ensembles under 0-1 loss
    (Working Paper, Dept. of Computer Science, 2002-06-01) Bouckaert, Remco R.
    This paper is an attempt to increase the understanding of the behavior of ensembles for discrete variables in a quantitative way. A set of tight upper and lower bounds for the accuracy of an ensemble is presented for wide classes of ensemble algorithms, including bagging and boosting. The ensemble accuracy is expressed in terms of the accuracies of the members of the ensemble. Since those bounds represent best and worst case behavior only, we study typical behavior as well, and discuss its properties. A parameterised bound is presented which describes ensemble behavior as a mixture of dependent base classifier and independent base classifier areas. Some empirical results are presented to support our conclusions.
  • Publication
    Racing committees for large datasets
    (Working Paper, University of Waikato, Department of Computer Science, 2002-06-01) Frank, Eibe; Holmes, Geoffrey; Kirkby, Richard Brendon; Hall, Mark A.
    This paper proposes a method for generating classifiers from large datasets by building a committee of simple base classifiers using a standard boosting algorithm. It allows the processing of large datasets even if the underlying base learning algorithm cannot efficiently do so. The basic idea is to split incoming data into chunks and build a committee based on classifiers built from these individual chunks [3]. Our method extends earlier work in two ways: (a) the best chunk size is chosen automatically by racing committees corresponding to different chunk sizes, and (b) the committees are pruned adaptively to keep the size of each individual committee as small as possible without negatively affecting accuracy. This paper shows that choosing an appropriate chunk size automatically is important because the accuracy of the resulting committee can vary significantly with the chunk size. It also shows that pruning is crucial to make the method practical for large datasets in terms of running time and memory requirements. Surprisingly, the results demonstrate that pruning can also improve accuracy.
  • Publication
    Usability and open source software
    (Working Paper, University of Waikato, 2002-12-01) Nichols, David M.; Twidale, Michael B.
    Open source communities have successfully developed many pieces of software, although most computer users only use proprietary applications. The usability of open source software is often regarded as one reason for this limited distribution. In this paper we review the existing evidence of the usability of open source software and discuss how the characteristics of open source development influence usability. We describe how existing human-computer interaction techniques can be used to leverage distributed networked communities of developers and users to address issues of usability.
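
Illustrative note on "Racing committees for large datasets" (listed above): the abstract's basic idea, splitting incoming data into chunks and building a committee from classifiers trained on the individual chunks, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the centroid-based base learner and the toy data are all hypothetical, unweighted majority voting stands in for the boosting step the paper uses, and the racing of chunk sizes and adaptive pruning are not shown.

```python
from collections import Counter
from statistics import mean


def train_centroid_classifier(chunk):
    """Hypothetical base learner: remember the mean feature value per class."""
    by_label = {}
    for x, label in chunk:
        by_label.setdefault(label, []).append(x)
    centroids = {label: mean(xs) for label, xs in by_label.items()}

    def predict(x):
        # Predict the class whose centroid lies closest to x.
        return min(centroids, key=lambda label: abs(x - centroids[label]))

    return predict


def build_committee(stream, chunk_size):
    """Train one base classifier per full chunk of the incoming stream."""
    committee, chunk = [], []
    for example in stream:
        chunk.append(example)
        if len(chunk) == chunk_size:
            committee.append(train_centroid_classifier(chunk))
            chunk = []
    # Assumption: a trailing partial chunk is simply discarded.
    return committee


def committee_predict(committee, x):
    """Unweighted majority vote (a stand-in for the paper's boosting)."""
    votes = Counter(member(x) for member in committee)
    return votes.most_common(1)[0][0]


# Toy data: one numeric feature, two classes.
data = [(0.1, "a"), (0.9, "b"), (0.2, "a"), (0.8, "b"),
        (0.3, "a"), (0.7, "b"), (0.0, "a"), (1.0, "b")]
committee = build_committee(data, chunk_size=4)
prediction_low = committee_predict(committee, 0.25)
prediction_high = committee_predict(committee, 0.75)
```

Because each base classifier only ever sees one chunk, memory use is bounded by the chunk size rather than the dataset size, which is the property the abstract relies on; the chunk size itself is the parameter the paper proposes to choose by racing.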