1993 Working Papers

Recent Submissions

  • Publication
    Parallel programming with PICSIL1
    (Working Paper, University of Waikato, Department of Computer Science, 1993-10) Pearson, Murray W.; Melchert, Matthew
    This paper describes the background and development of PICSIL1, a visual language for specifying parallel algorithms using structured decomposition. PICSIL1 draws upon graphical and textual specification techniques: the first for the high-level structure of an algorithm, the second for more detailed functional specifications. The graphical specification techniques used in PICSIL1 are based on Data Flow Diagrams (DFDs) and are well suited to the assembly and interconnection of abstract modules. However, minor modifications to DFDs have been needed to make them suitable for describing parallel algorithms. These include the ability to dynamically replicate sections of a diagram and to change the structure of parts of a diagram depending on the data being processed. Work is proceeding on the development of an editor to allow the direct capture and editing of PICSIL1 descriptions. Development of a compiler and visual debugging tools is planned for the near future.
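    As a rough illustration of the kind of structure such a diagram captures, the Python sketch below (not from the paper; all names are invented) models a data flow diagram as processes joined by channels, with a replicate operation standing in for dynamic replication of a diagram section.

        # Hypothetical sketch of a DFD-like structure with dynamic replication.
        class Process:
            def __init__(self, name):
                self.name = name

        class DFD:
            def __init__(self):
                self.processes = {}
                self.channels = []          # (source name, sink name) pairs

            def add(self, name):
                self.processes[name] = Process(name)

            def connect(self, src, dst):
                self.channels.append((src, dst))

            def replicate(self, name, n):
                # Clone process `name` n times, duplicating its channels,
                # mimicking data-dependent replication of a diagram section.
                for i in range(1, n + 1):
                    copy = f"{name}_{i}"
                    self.add(copy)
                    for src, dst in list(self.channels):
                        if src == name:
                            self.connect(copy, dst)
                        if dst == name:
                            self.connect(src, copy)

        dfd = DFD()
        for p in ("split", "work", "merge"):
            dfd.add(p)
        dfd.connect("split", "work")
        dfd.connect("work", "merge")
        dfd.replicate("work", 4)            # four parallel workers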
  • Publication
    The design of an optimistic AND-parallel Prolog
    (Working Paper, University of Waikato, Department of Computer Science, 1993-10) Cleary, John G.; Olthof, Ian
    A distributed AND-parallel Prolog implementation is described. The system can correctly handle all pure Prolog programs. In particular, it deals with the problem of distributed backtracking. Conflicts in variable bindings are resolved by assigning a time value to every unification. Bindings with smaller time values are given precedence over those with larger time values. The algorithm is based on the optimistic Time Warp system, with Prolog-specific optimizations. These include two new unification algorithms that permit unification and backtracking in any order. The result is a system that can fully exploit both dependent and independent AND-parallelism.
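    A minimal Python sketch of the conflict-resolution idea, not the paper's algorithm: bindings carry the logical time of the unification that produced them, and an earlier binding displaces a later conflicting one, forcing the later computation to roll back in Time Warp style.

        # Illustrative only: timestamped variable bindings with optimistic rollback.
        bindings = {}      # variable -> (value, time)
        rollback_log = []  # times of computations that must be undone

        def bind(var, value, time):
            if var not in bindings:
                bindings[var] = (value, time)
                return True
            old_value, old_time = bindings[var]
            if old_value == value:
                return True                      # consistent bindings coexist
            if time < old_time:
                bindings[var] = (value, time)    # smaller time takes precedence...
                rollback_log.append(old_time)    # ...later binding is rolled back
                return True
            return False                         # later conflicting binding fails

        bind("X", "a", time=5)     # one worker binds X = a at time 5
        bind("X", "b", time=3)     # another worker's earlier X = b wins
        print(bindings["X"])       # ('b', 3)
        print(rollback_log)        # [5]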
  • Publication
    Informal introduction to Starlog
    (Working Paper, University of Waikato, Department of Computer Science, 1993-10) Cleary, John G.
    This report provides an informal and gentle introduction to the logic programming language Starlog and is intended to eventually form the first chapter of a book on Starlog. Like those of Prolog (a widely used logic programming language), Starlog programs consist of sets of Horn clauses. Starlog differs from Prolog in the way it is executed and in its use of logical time to order execution. The resulting style of programming tends to differ from Prolog and to resemble programming for relational databases.
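    The Python sketch below illustrates execution driven by logical time rather than by Prolog's depth-first search; it is an illustration of the ordering idea only, and the rule it encodes is invented, not Starlog syntax.

        import heapq

        # Derive timestamped facts in logical-time order. The hypothetical
        # rule: from count(T, N), derive count(T+1, N+1), for T < 5.
        agenda = [(0, ("count", 0, 0))]     # (time, fact), smallest time first
        derived = []

        while agenda:
            t, fact = heapq.heappop(agenda)
            derived.append(fact)
            _, time, n = fact
            if time < 5:                    # rule body: produce the next fact
                heapq.heappush(agenda, (time + 1, ("count", time + 1, n + 1)))

        print(derived)   # count facts, produced strictly in time order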
  • Publication
    Proving the existence of solutions in logical arithmetic
    (Working Paper, University of Waikato, Department of Computer Science, 1993-10) Cleary, John G.
    Logical arithmetic is a logically correct technique for real arithmetic in Prolog which uses constraints over interval representations for its implementation. Four problems with the technique are considered: answers are conditional and uninformative; iterative computations may lead to unboundedly large constraint networks; it is difficult and ineffective to deal with negation; and computing extrema is often not effective. A solution to these problems is proposed in the form of "existential intervals" which record the existence of a solution to a set of constraints within an interval. It is shown how to operate on existential intervals and how they solve the four problems.
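    A minimal sketch of the style of interval propagation that logical arithmetic builds on, for the single constraint X + Y = Z; this is generic interval narrowing, not the paper's existential intervals.

        # Narrow the intervals of X, Y, Z under the constraint X + Y = Z.
        def narrow_sum(x, y, z):
            # Each argument is an interval (lo, hi); returns narrowed copies.
            xl, xh = x; yl, yh = y; zl, zh = z
            zl, zh = max(zl, xl + yl), min(zh, xh + yh)   # Z within X + Y
            xl, xh = max(xl, zl - yh), min(xh, zh - yl)   # X within Z - Y
            yl, yh = max(yl, zl - xh), min(yh, zh - xl)   # Y within Z - X
            if xl > xh or yl > yh or zl > zh:
                raise ValueError("constraint is unsatisfiable")
            return (xl, xh), (yl, yh), (zl, zh)

        # X in [0, 10], Y in [2, 3], X + Y = 5 narrows X to [2, 3].
        print(narrow_sum((0, 10), (2, 3), (5, 5)))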
  • Publication
    Language inference from function words
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Smith, Tony C.; Witten, Ian H.
    Language surface structures demonstrate regularities that make it possible to learn a capacity for producing an infinite number of well-formed expressions. This paper outlines a system that uncovers and characterizes regularities through principled wholesale pattern analysis of copious amounts of machine-readable text. The system uses the notion of closed-class lexemes to divide the input into phrases, and from these phrases infers lexical and syntactic information. The set of closed-class lexemes is derived from the text, and then these lexemes are clustered into functional types. Next the open-class words are categorized according to how they tend to appear in phrases and then clustered into a smaller number of open-class types. Finally these types are used to infer, and generalize, grammar rules. Statistical criteria are employed for each of these inference operations. The result is a relatively compact grammar that is guaranteed to cover every sentence in the source text that was used to form it. Closed-class inferencing compares well with current linguistic theories of syntax and offers a wide range of potential applications.
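    A rough Python sketch of the phrase-splitting step: closed-class (function) words mark phrase boundaries, and the open-class words between them form the phrase content. The tiny function-word list here is illustrative; the paper derives its closed-class set from the text itself.

        FUNCTION_WORDS = {"the", "a", "of", "in", "and", "to", "is"}

        def phrases(text):
            out, current = [], []
            for word in text.lower().split():
                if word in FUNCTION_WORDS:
                    if current:
                        out.append(current)
                    current = [word]    # each phrase opens with its function word
                else:
                    current.append(word)
            if current:
                out.append(current)
            return out

        print(phrases("the cat sat in the hat of a friend"))
        # [['the', 'cat', 'sat'], ['in'], ['the', 'hat'], ['of'], ['a', 'friend']]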
  • Publication
    Models for computer generated parody
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Smith, Tony C.; Witten, Ian H.
    This paper outlines two approaches to the construction of computer systems that generate prose in the style of a given author. The first involves using intuitive notions of stylistic trademarks to construct a grammar that characterizes a particular author, in this case Ernest Hemingway. The second uses statistical methods for inferring a grammar from samples of an author's work, in this instance Thomas Hardy. A brief outline of grammar induction principles is included as background material for the latter system. The relative merits of each approach are discussed, and text generated from the resulting grammars is assessed in terms of its parodic quality. Further to its esoteric interest, a discussion of parody generation as a useful technique for measuring the success of grammatical inferencing systems is included, along with suggestions for its practical application in areas of language modeling and text compression.
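    A toy Python illustration of the first approach: a tiny hand-written grammar of stylistic tics, expanded at random to generate "parody". The rules and vocabulary below are invented for the example, not taken from the paper.

        import random

        grammar = {
            "S":   [["NP", "VP", "."]],
            "NP":  [["the", "N"], ["the", "ADJ", "N"]],
            "VP":  [["V"], ["V", "NP"]],
            "N":   [["old man"], ["river"], ["fish"]],
            "ADJ": [["good"], ["clean"]],
            "V":   [["drank"], ["waited"], ["said nothing"]],
        }

        def generate(symbol="S"):
            if symbol not in grammar:
                return symbol                      # terminal word
            expansion = random.choice(grammar[symbol])
            return " ".join(generate(s) for s in expansion)

        print(generate())   # e.g. "the old man said nothing ."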
  • Publication
    Compression-based template matching
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Inglis, Stuart J.; Witten, Ian H.
    Textual image compression is a method of both lossy and lossless image compression that is particularly effective for images containing repeated sub-images, notably pages of text (Mohiuddin et al., 1984; Witten et al., 1992). The process comprises three main steps:
    • Extracting all the characters from an image;
    • Building a library that contains one representative for each character class;
    • Compressing the image with respect to the library.
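    A minimal Python sketch of the library-building step: each extracted character bitmap is matched against the library by counting disagreeing pixels, and starts a new class if no template is close enough. The threshold and bitmap representation are illustrative, not the paper's matching criterion.

        def distance(a, b):
            # a, b: equal-sized bitmaps as tuples of 0/1 rows
            return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

        def classify(glyph, library, threshold=3):
            for index, template in enumerate(library):
                if distance(glyph, template) <= threshold:
                    return index             # compress as a pointer to this class
            library.append(glyph)            # unseen shape: new library entry
            return len(library) - 1

        library = []
        glyph = ((0, 1, 0), (1, 1, 1), (1, 0, 1))
        glyph_noisy = ((0, 1, 0), (1, 1, 1), (1, 1, 1))   # one pixel differs
        print(classify(glyph, library))        # 0: new class
        print(classify(glyph_noisy, library))  # 0: matches within threshold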
  • Publication
    Compressing computer programs
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Davies, Rod M.; Witten, Ian H.
    This paper describes a scheme for compressing programs written in a particular programming language—which can be any language that has a formal lexical and syntactic description—in such a way that they can be reproduced exactly. Only syntactically correct programs can be compressed. The scheme is illustrated on the Pascal language, and compression results are given for a corpus of Pascal programs; but it is by no means restricted to Pascal. In fact, we discuss how a "compressor-generator" program can be constructed that creates a compressor automatically from a formal specification of a programming language, in much the same way as a parser generator creates a syntactic parser from a formal language description.
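    A toy Python sketch of the core idea: a syntactically correct program can be represented as the sequence of production choices its parse makes, and a choice among k alternatives costs about log2(k) bits. The grammar below is invented for the example and far smaller than Pascal's.

        import math

        grammar = {
            "expr": [["term"], ["term", "+", "expr"]],
            "term": [["x"], ["y"], ["(", "expr", ")"]],
        }

        def decode(symbol, choices):
            # Rebuild the program text by replaying recorded choice indices.
            if symbol not in grammar:
                return symbol
            production = grammar[symbol][choices.pop(0)]
            return "".join(decode(s, choices) for s in production)

        # Choices for "x+y": expr -> term + expr, term -> x, expr -> term, term -> y
        choices = [1, 0, 0, 1]
        print(decode("expr", list(choices)))            # x+y
        bits = sum(math.log2(len(grammar[nt]))
                   for nt in ["expr", "term", "expr", "term"])
        print(f"{bits:.1f} bits")   # ~5.2 bits instead of three bytes of text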
  • Publication
    Displaying 3D images: algorithms for single image random dot stereograms
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Witten, Ian H.; Inglis, Stuart J.; Thimbleby, Harold W.
    This paper describes how to generate a single image which, when viewed in the appropriate way, appears to the brain as a 3D scene. The image is a stereogram composed of seemingly random dots. A new, simple and symmetric algorithm for generating such images from a solid model is given, along with the design parameters and their influence on the display. The algorithm improves on previously described ones in several ways: it is symmetric and hence free from directional (right-to-left or left-to-right) bias, it corrects a slight distortion in the rendering of depth, it removes hidden parts of surfaces, and it also eliminates a type of artifact that we call an "echo". Random dot stereograms have one remaining problem: difficulty of initial viewing. If a computer screen rather than paper is used for output, the problem can be ameliorated by shimmering, or time-multiplexing of pixel values. We also describe a simple computational technique for determining what is present in a stereogram so that, if viewing is difficult, one can ascertain what to look for.
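    A stripped-down Python sketch of a symmetric single-scanline algorithm of this general kind: each scene point constrains a left and a right pixel to be equal, constraint chains are resolved to a canonical pixel, and unconstrained pixels are colored randomly. Hidden-surface removal and echo elimination are omitted, and the parameter values are illustrative.

        import random

        def stereogram_row(depth, eye_sep=90, mu=1 / 3):
            # depth[x] in [0, 1]: 0 = far plane, 1 = near plane
            n = len(depth)
            same = list(range(n))        # same[x]: pixel x must equal pixel same[x]
            for x in range(n):
                # separation of the two eyes' projections of this scene point
                s = round(eye_sep * (1 - mu * depth[x]) / (2 - mu * depth[x]))
                left = x - s // 2
                right = left + s
                if 0 <= left and right < n:
                    while same[left] != left:      # follow links to canonical pixels
                        left = same[left]
                    while same[right] != right:
                        right = same[right]
                    if left != right:
                        if left > right:
                            left, right = right, left
                        same[right] = left         # symmetric link: no L/R bias
            row = [0] * n
            for x in range(n):                     # same[x] <= x, so one pass works
                row[x] = random.randint(0, 1) if same[x] == x else row[same[x]]
            return row

        flat = [0.0] * 30 + [0.8] * 40 + [0.0] * 30     # a raised central slab
        print("".join(".#"[p] for p in stereogram_row(flat)))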
  • Publication
    Practical machine learning and its application to problems in agriculture
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Witten, Ian H.; Holmes, Geoffrey; McQueen, Robert J.; Smith, Lloyd A.; Cunningham, Sally Jo
    One of the most exciting and potentially far-reaching developments in contemporary computer science is the invention and application of methods of machine learning. These have evolved from simple adaptive parameter-estimation techniques to ways of (a) inducing classification rules from examples, (b) using prior knowledge to guide the interpretation of new examples, (c) using this interpretation to sharpen and refine the domain knowledge, and (d) storing and indexing example cases in ways that highlight their similarities and differences. Such techniques have been applied in domains ranging from the diagnosis of plant disease to the interpretation of medical test data. This paper reviews selected methods of machine learning with an emphasis on practical applications, and suggests how they might be used to address some important problems in the agriculture industries.
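    As a tiny illustration of (a), the Python sketch below induces a one-attribute classification rule from examples, in the style of simple rule learners; the dataset and attribute names are invented, not drawn from the paper.

        from collections import Counter, defaultdict

        def one_r(examples, attributes):
            # examples: list of (feature dict, class label) pairs
            best = None
            for attr in attributes:
                by_value = defaultdict(Counter)
                for features, label in examples:
                    by_value[features[attr]][label] += 1
                # predict the majority class for each observed attribute value
                rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
                errors = sum(sum(c.values()) - c[rule[v]]
                             for v, c in by_value.items())
                if best is None or errors < best[2]:
                    best = (attr, rule, errors)
            return best

        data = [
            ({"soil": "dry", "season": "spring"}, "irrigate"),
            ({"soil": "dry", "season": "summer"}, "irrigate"),
            ({"soil": "wet", "season": "spring"}, "wait"),
            ({"soil": "wet", "season": "summer"}, "irrigate"),
        ]
        print(one_r(data, ["soil", "season"]))
        # e.g. ('soil', {'dry': 'irrigate', 'wet': 'wait'}, 1)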
  • Publication
    Multiple viewpoint systems for music prediction
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Witten, Ian H.; Conklin, Darrell
    This paper examines the prediction and generation of music using a multiple viewpoint system, a collection of independent views of the musical surface each of which models a specific type of musical phenomenon. Both the general style and a particular piece are modeled using dual short-term and long-term theories, and the model is created using machine learning techniques on a corpus of musical examples. The models are used for analysis and prediction, and we conjecture that highly predictive theories will also generate original, acceptable works. Although the quality of the works generated is hard to quantify objectively, the predictive power of models can be measured by the notion of entropy, or unpredictability. Highly predictive theories will produce low-entropy estimates of a musical language. The methods developed are applied to the Bach chorale melodies. Multiple-viewpoint systems are learned from a sample of 95 chorales, estimates of entropy are produced, and a predictive theory is used to generate new, unseen pieces.
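    A minimal Python sketch of the two measurement ideas: combining predictions from independent viewpoints and scoring a melody by its entropy. The combination rule (normalized product) and the toy distributions are illustrative, not the paper's models, and real viewpoints would condition on musical context.

        import math

        def combine(dists):
            # Merge several next-note distributions into one prediction.
            notes = dists[0].keys()
            raw = {n: math.prod(d.get(n, 1e-6) for d in dists) for n in notes}
            total = sum(raw.values())
            return {n: p / total for n, p in raw.items()}

        pitch_view    = {"C": 0.5, "D": 0.3, "E": 0.2}   # favors low pitches
        interval_view = {"C": 0.2, "D": 0.6, "E": 0.2}   # favors rising seconds
        prediction = combine([pitch_view, interval_view])

        melody = ["D", "C", "D"]
        bits = -sum(math.log2(prediction[n]) for n in melody)
        print(prediction)
        print(f"{bits / len(melody):.2f} bits per note")  # entropy estimate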
  • Publication
    Compression by induction of hierarchical grammars
    (Working Paper, Department of Computer Science, University of Waikato, 1993) Nevill-Manning, Craig G.; Witten, Ian H.; Maulsby, David
    This paper describes a technique that develops models of symbol sequences in the form of small, human-readable, hierarchical grammars. The grammars are both semantically plausible and compact. The technique can induce structure from a variety of different kinds of sequence, and examples are given of models derived from English text, C source code and a file of numeric data. This paper explains the grammatical induction technique, demonstrates its application to three very different sequences, evaluates its compression performance, and concludes by briefly discussing its use as a method of knowledge acquisition.
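    A simplified batch sketch in Python of the kind of grammar such induction produces: repeatedly replace the most frequent pair of adjacent symbols with a new rule, so repeated structure becomes a hierarchy of short rules. This offline variant only illustrates the flavor of the output; it is not the paper's induction algorithm.

        from collections import Counter

        def induce(seq):
            rules, next_id = {}, 0
            while True:
                pairs = Counter(zip(seq, seq[1:]))
                if not pairs:
                    break
                pair, count = pairs.most_common(1)[0]
                if count < 2:
                    break                       # no repeated structure left
                name = f"R{next_id}"
                next_id += 1
                rules[name] = list(pair)
                out, i = [], 0
                while i < len(seq):             # rewrite the sequence using the rule
                    if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                        out.append(name)
                        i += 2
                    else:
                        out.append(seq[i])
                        i += 1
                seq = out
            return seq, rules

        top, rules = induce(list("abcabdabcabd"))
        print(top)    # ['R3', 'R3']
        print(rules)  # R0 -> a b, R1 -> R0 c, R2 -> R1 R0, R3 -> R2 d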