Yeates, Stuart Andrew
Witten, Ian H.
Bainbridge, David
2008-11-13
2008-11-13
2001
Yeates, S., Witten, I.H. & Bainbridge, D. (2001). Tag insertion complexity. In J. A. Stored (Ed.), Proceedings of the Data Compression Conference, March 2001, Snowbird, Utah (pp. 243-252). Washington DC, USA: IEEE Press.
https://hdl.handle.net/10289/1324
This paper is about inferring markup information, a generalization of part-of-speech tagging. We use compression models based on a marked-up training corpus and apply them to fresh, unmarked text. In effect, this technique builds filters that extract information from text in a way that is generalized because it depends on training text rather than preprogrammed heuristics.
application/pdf
en
Copyright © IEEE 2001. This article has been published in Proceedings of the Data Compression Conference, March 2001, Snowbird, Utah.
computer science
Machine learning
Tag insertion complexity
Conference Contribution
10.1109/DDC.2001.917155