Publication: Language inference from function words
| dc.contributor.author | Smith, Tony C. | en_NZ |
| dc.contributor.author | Witten, Ian H. | en_NZ |
| dc.date.accessioned | 2016-02-18T02:17:31Z | |
| dc.date.available | 1993 | en_NZ |
| dc.date.available | 2016-02-18T02:17:31Z | |
| dc.date.issued | 1993 | en_NZ |
| dc.description.abstract | Language surface structures demonstrate regularities that make it possible to learn a capacity for producing an infinite number of well-formed expressions. This paper outlines a system that uncovers and characterizes regularities through principled wholesale pattern analysis of copious amounts of machine-readable text. The system uses the notion of closed-class lexemes to divide the input into phrases, and from these phrases infers lexical and syntactic information. The set of closed-class lexemes is derived from the text, and then these lexemes are clustered into functional types. Next the open-class words are categorized according to how they tend to appear in phrases and then clustered into a smaller number of open-class types. Finally these types are used to infer, and generalize, grammar rules. Statistical criteria are employed for each of these inference operations. The result is a relatively compact grammar that is guaranteed to cover every sentence in the source text that was used to form it. Closed-class inferencing compares well with current linguistic theories of syntax and offers a wide range of potential applications. | en_NZ |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | Smith, T. C., & Witten, I. H. (1993). Language inference from function words (Computer Science Working Papers 93/3). Hamilton, New Zealand: Department of Computer Science, University of Waikato. | en |
| dc.identifier.issn | 1170-487X | en_NZ |
| dc.identifier.uri | https://hdl.handle.net/10289/9927 | |
| dc.language.iso | en | |
| dc.publisher | Department of Computer Science, University of Waikato | en_NZ |
| dc.relation.isPartOf | Working Paper Series | en_NZ |
| dc.relation.ispartofseries | Computer Science Working Papers | |
| dc.rights | © 1993 by Tony C. Smith & Ian H. Witten | |
| dc.title | Language inference from function words | en_NZ |
| dc.type | Working Paper | |
| dspace.entity.type | Publication | |
| pubs.confidential | false | en_NZ |
| pubs.place-of-publication | Hamilton, New Zealand | |
| uow.relation.series | 93/3 |