2003 Working Papers

Recent Submissions

  • Item
    Predicting Library of Congress Classifications from Library of Congress Subject Headings
    (Working Paper, University of Waikato, 2003-01) Frank, Eibe; Paynter, Gordon W.
    This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCCs are organized in a tree: the root node of this hierarchy comprises all possible topics, and leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified by its LCSH, automatically places that resource in the LCC hierarchy. The procedure uses machine learning techniques and training data from a large library catalog to learn a classification model mapping from sets of LCSH to nodes in the LCC tree. We present empirical results for our technique showing its accuracy on an independent collection of 50,000 LCSH/LCC pairs.
  • Item
    Visualizing class probability estimators
    (Working Paper, University of Waikato, Department of Computer Science, 2003-02-19) Frank, Eibe; Hall, Mark A.
    Inducing classifiers that make accurate predictions on future data is a driving force for research in inductive learning. However, how users can gain information from the models produced is also important. Unfortunately, some of the most powerful inductive learning algorithms generate "black boxes": the representation of the model makes it virtually impossible to gain any insight into what has been learned. This paper presents a technique that can help the user understand why a classifier makes the predictions that it does by providing a two-dimensional visualization of its class probability estimates. It requires the classifier to generate class probabilities, but most practical algorithms are able to do so (or can be modified to this end).
  • Item
    From sit-forward to lean-back: Using a mobile device to vary interactive pace
    (Working Paper, University of Waikato, Department of Computer Science, 2003-03) Jones, Mark Hedley; Jain, Preeti; Buchanan, George; Marsden, Gary
    Although online, handheld, mobile computers offer new possibilities for searching and retrieving information on the go, the fast-paced, "sit-forward" style of interaction may not be appropriate for all user search needs. In this paper, we explore how a handheld computer can be used to enable interactive search experiences that vary in pace from fast and immediate through to reflective and delayed. We describe a system that asynchronously combines an offline handheld computer and an online desktop personal computer, and discuss some results of an initial user evaluation.
  • Item
    Locally weighted naive Bayes
    (Working Paper, University of Waikato, Department of Computer Science, 2003-04) Frank, Eibe; Hall, Mark A.; Pfahringer, Bernhard
    Despite its simplicity, the naive Bayes classifier has surprised machine learning researchers by exhibiting good performance on a variety of learning problems. Encouraged by these results, researchers have looked to overcome naive Bayes' primary weakness, the attribute independence assumption, and improve the performance of the algorithm. This paper presents a locally weighted version of naive Bayes that relaxes the independence assumption by learning local models at prediction time. Experimental results show that locally weighted naive Bayes rarely degrades accuracy compared to standard naive Bayes and, in many cases, improves accuracy dramatically. The main advantage of this method compared to other techniques for enhancing naive Bayes is its conceptual and computational simplicity.
  • Item
    Comparison of data and process refinement
    (Working Paper, University of Waikato, Department of Computer Science, 2003-05) Reeves, Steve; Streader, David
    When is it reasonable, or possible, to refine a one-place buffer into a two-place buffer? In order to answer this question we characterise refinement based on substitution in restricted contexts. We see that data refinement (specifically in Z) and process refinement give differing answers to the original question, and we compare the precise circumstances which give rise to this difference by translating programs and processes into labelled transition systems, thus providing a common basis upon which to make the comparison. We also look at the closely related area of subtyping of objects. Along the way we see how all these sorts of computational constructs are related as far as refinement is concerned, discover and characterise some (as far as we can tell) new sorts of refinement and, finally, point out some research avenues for the future.