1998 Working Papers

Recent Submissions

  • Item
    Making oral history accessible over the World Wide Web
    (Working Paper, University of Waikato, Department of Computer Science, 1998-11) Bainbridge, David; Cunningham, Sally Jo
    We describe a multimedia, WWW-based oral history collection constructed from off-the-shelf or publicly available software. The source materials for the collection include audio tapes of interviews and summary transcripts of each interview, as well as photographs illustrating episodes mentioned in the tapes. Sections of the transcripts are manually matched to associated segments of the tapes, and the tapes are digitized. Users search a full-text retrieval system based on the text transcripts to retrieve relevant transcript sections and their associated audio recordings and photographs. It is also possible to search for photos by matching text queries against text descriptions of the photos in the collection, where the located photos link back to their respective interview transcript and audio recordings.
  • Item
    Melody based tune retrieval over the World Wide Web
    (Working Paper, University of Waikato, Department of Computer Science, 1998-11) Bainbridge, David; McNab, Rodger J.; Smith, Lloyd A.
    In this paper we describe the steps taken to develop a Web-based version of an existing stand-alone, single-user digital library application for melodic searching of a collection of music. For each of the three key components (input, searching, and output), we assess the suitability of various Web-based strategies for dealing with the now distributed software architecture and explain the decisions we made. The resulting melody indexing service, known as MELDEX, has been in operation for one year, and the feedback we have received has been favorable.
  • Item
    Link as you type: using key phrases for automated dynamic link generation
    (Working Paper, University of Waikato, Department of Computer Science, 1998-09) Jones, Steve
    When documents are collected together from diverse sources they are unlikely to contain useful hypertext links to support browsing amongst them. For large collections of thousands of documents it is prohibitively resource intensive to manually insert links into each document. Users of such collections may wish to relate documents within them to text that they are themselves generating. This process, often involving keyword searching, distracts from the authoring process and results in material related to query terms but not necessarily to the author’s document. Query terms that are effective in one collection might not be so in another. We have developed Phrasier, a system that integrates authoring (of text and hyperlinks), browsing, querying and reading in support of information retrieval activities. Phrasier exploits key phrases which are automatically extracted from documents in a collection, and uses them as link anchors and to identify candidate destinations for hyperlinks. This system suggests links into existing collections for purposes of authoring and retrieval of related information, creates links between documents in a collection and provides supportive document and link overviews.
  • Item
    Naive Bayes for regression
    (Working Paper, University of Waikato, Department of Computer Science, 1998-10) Frank, Eibe; Trigg, Leonard E.; Holmes, Geoffrey; Witten, Ian H.
    Despite its simplicity, the naïve Bayes learning scheme performs well on most classification tasks, and is often significantly more accurate than more sophisticated methods. Although the probability estimates that it produces can be inaccurate, it often assigns maximum probability to the correct class. This suggests that its good performance might be restricted to situations where the output is categorical. It is therefore interesting to see how it performs in domains where the predicted value is numeric, because in this case, predictions are more sensitive to inaccurate probability estimates. This paper shows how to apply the naïve Bayes methodology to numeric prediction (i.e. regression) tasks, and compares it to linear regression, instance-based learning, and a method that produces “model trees” - decision trees with linear regression functions at the leaves. Although we exhibit an artificial dataset for which naïve Bayes is the method of choice, on real-world datasets it is almost uniformly worse than model trees. The comparison with linear regression depends on the error measure: for one measure naïve Bayes performs similarly, for another it is worse. Compared to instance-based learning, it performs similarly with respect to both measures. These results indicate that the simplistic statistical assumption that naïve Bayes makes is indeed more restrictive for regression than for classification.
  • Item
    Measuring ATM traffic: final report for New Zealand Telecom
    (Working Paper, University of Waikato, Department of Computer Science, 1998-10) Cleary, John G.; Graham, Ian; Pearson, Murray W.; McGregor, Anthony James
    The report describes the development of a low-cost ATM monitoring system, hosted by a standard PC. The monitor can be used remotely, returning information on ATM traffic flows to a central site. The monitor is interfaced to a GPS timing receiver, which provides an absolute time accuracy of better than 1 µs. By monitoring the same traffic flow at different points in a network, it is possible to measure cell delay and delay variation in real time, using existing traffic. The monitoring system characterises cells by a CRC calculated over the cell payload, so special measurement cells are not required. Delays in both local-area and wide-area networks have been measured using this system. It is possible to measure delay in a network that is not end-to-end ATM, as long as some cells remain identical at the entry and exit points. Examples are given of traffic and delay measurements in both wide-area and local-area network systems, including delays measured over the Internet from Canada to New Zealand.
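The cell-matching idea in the ATM report abstract can be illustrated with a minimal sketch. It is not the report's implementation: the CRC-32 from Python's `zlib`, the record layout, and the names `payload_signature`, `match_delays`, and `delay_variation` are all assumptions made for illustration. The abstract only states that cells are characterised by a CRC over the payload and timestamped against GPS time at each monitoring point.

```python
import zlib
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical capture record: (GPS timestamp in microseconds, cell payload bytes).
Record = Tuple[float, bytes]


def payload_signature(payload: bytes) -> int:
    """Characterise a cell by a CRC over its payload.

    CRC-32 is an assumption; the abstract does not say which CRC is used.
    """
    return zlib.crc32(payload)


def match_delays(entry: Iterable[Record], exit_: Iterable[Record]) -> List[float]:
    """Return delays (exit time minus entry time) for cells seen at both points.

    Signatures seen more than once at the entry point are ambiguous and skipped.
    """
    first_seen: Dict[int, float] = {}
    ambiguous: Set[int] = set()
    for ts, payload in entry:
        sig = payload_signature(payload)
        if sig in first_seen or sig in ambiguous:
            ambiguous.add(sig)
            first_seen.pop(sig, None)
        else:
            first_seen[sig] = ts

    delays: List[float] = []
    for ts, payload in exit_:
        sig = payload_signature(payload)
        if sig in first_seen:
            # Pop so that a repeated payload at the exit is not matched twice.
            delays.append(ts - first_seen.pop(sig))
    return delays


def delay_variation(delays: List[float]) -> float:
    """Peak-to-peak delay variation over the matched cells (0.0 if none)."""
    return max(delays) - min(delays) if delays else 0.0


if __name__ == "__main__":
    entry_log = [(0.0, b"a" * 48), (10.0, b"b" * 48), (20.0, b"c" * 48), (30.0, b"a" * 48)]
    exit_log = [(105.0, b"a" * 48), (112.0, b"b" * 48), (125.0, b"c" * 48)]
    matched = match_delays(entry_log, exit_log)
    print(matched, delay_variation(matched))
```

Because matching relies only on payload content, the same approach applies when the path between the two points is not ATM end to end, provided some payloads arrive unchanged. Skipping repeated signatures trades a few lost samples for avoiding false matches.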