This paper is about inferring markup information, a generalization of part-of-speech tagging. We train compression models on a marked-up corpus and apply them to fresh, unmarked text. In effect, this technique builds filters that extract information from text in a general way, because it depends on training text rather than preprogrammed heuristics.
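As a rough illustration of the idea, the sketch below trains one character-level model per markup tag and assigns an unmarked token the tag whose model would encode it in the fewest bits. It rests on several assumptions that do not come from the paper: the order-2 context model with add-one smoothing is a simple stand-in for a real compression model, the names `CharModel`, `train_models`, and `infer_tag` are hypothetical, the toy training pairs are invented for illustration, and tokens are tagged in isolation rather than by searching over possible markups of a whole text.

```python
import math
from collections import defaultdict


class CharModel:
    """Order-k character model with add-one smoothing.

    Assumption: this stands in for a real compression model; the
    number of bits it assigns to a string approximates how well a
    compressor trained on the same data would compress that string.
    """

    ALPHABET_SIZE = 256  # assumed symbol inventory used for smoothing

    def __init__(self, order=2):
        self.order = order
        self.counts = defaultdict(lambda: defaultdict(int))
        self.totals = defaultdict(int)

    def _pad(self, text):
        # "\x00" marks the start context, "\x01" marks end of token.
        return "\x00" * self.order + text + "\x01"

    def train(self, text):
        padded = self._pad(text)
        for i in range(self.order, len(padded)):
            ctx = padded[i - self.order:i]
            self.counts[ctx][padded[i]] += 1
            self.totals[ctx] += 1

    def cost_bits(self, text):
        """Return the code length of `text` in bits under this model."""
        padded = self._pad(text)
        bits = 0.0
        for i in range(self.order, len(padded)):
            ctx = padded[i - self.order:i]
            count = self.counts.get(ctx, {}).get(padded[i], 0)
            total = self.totals.get(ctx, 0)
            prob = (count + 1) / (total + self.ALPHABET_SIZE)
            bits -= math.log2(prob)
        return bits


def train_models(marked_up):
    """Build one model per tag from (token, tag) training pairs."""
    models = {}
    for token, tag in marked_up:
        models.setdefault(tag, CharModel()).train(token)
    return models


def infer_tag(models, token):
    """Pick the tag whose model encodes the token most cheaply."""
    return min(models, key=lambda tag: models[tag].cost_bits(token))


# Toy marked-up corpus (invented for illustration only).
TRAINING = [
    ("1998", "date"), ("2001", "date"), ("1987", "date"), ("1995", "date"),
    ("Smith", "name"), ("Jones", "name"), ("Miller", "name"), ("Johnson", "name"),
]

models = train_models(TRAINING)
for token in ["1999", "Johns"]:
    print(token, infer_tag(models, token))
```

Nothing in the sketch encodes what a date or a name looks like; changing the training pairs changes what the filter extracts, which is the sense in which the approach depends on training text rather than preprogrammed heuristics.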