Publication:
Language inference from function words

dc.contributor.author: Smith, Tony C. [en_NZ]
dc.contributor.author: Witten, Ian H. [en_NZ]
dc.date.accessioned: 2016-02-18T02:17:31Z
dc.date.available: 1993 [en_NZ]
dc.date.available: 2016-02-18T02:17:31Z
dc.date.issued: 1993 [en_NZ]
dc.description.abstract: Language surface structures demonstrate regularities that make it possible to learn a capacity for producing an infinite number of well-formed expressions. This paper outlines a system that uncovers and characterizes regularities through principled wholesale pattern analysis of copious amounts of machine-readable text. The system uses the notion of closed-class lexemes to divide the input into phrases, and from these phrases infers lexical and syntactic information. The set of closed-class lexemes is derived from the text, and then these lexemes are clustered into functional types. Next the open-class words are categorized according to how they tend to appear in phrases and then clustered into a smaller number of open-class types. Finally these types are used to infer, and generalize, grammar rules. Statistical criteria are employed for each of these inference operations. The result is a relatively compact grammar that is guaranteed to cover every sentence in the source text that was used to form it. Closed-class inferencing compares well with current linguistic theories of syntax and offers a wide range of potential applications. [en_NZ]
dc.format.mimetype: application/pdf
dc.identifier.citation: Smith, T. C., & Witten, I. H. (1993). Language inference from function words (Computer Science Working Papers 93/3). Hamilton, New Zealand: Department of Computer Science, University of Waikato. [en]
dc.identifier.issn: 1170-487X [en_NZ]
dc.identifier.uri: https://hdl.handle.net/10289/9927
dc.language.iso: en
dc.publisher: Department of Computer Science, University of Waikato [en_NZ]
dc.relation.isPartOf: Working Paper Series [en_NZ]
dc.relation.ispartofseries: Computer Science Working Papers
dc.rights: © 1993 by Tony C. Smith & Ian H. Witten
dc.title: Language inference from function words [en_NZ]
dc.type: Working Paper
dspace.entity.type: Publication
pubs.confidential: false [en_NZ]
pubs.place-of-publication: Hamilton, New Zealand
uow.relation.series: 93/3

Files

Original bundle

Name: uow-cs-wp-1993-03.pdf
Size: 3.69 MB
Format: Adobe Portable Document Format
Description: Published version

License bundle

Name: Deposit Agreement.txt
Size: 193 B
Format: Unknown data format