Predicting Library of Congress Classifications from Library of Congress Subject Headings
Abstract
This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCCs are organized in a tree: the root node of this hierarchy comprises all possible topics, and leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified by its LCSH, automatically places that resource in the LCC hierarchy. The procedure uses machine learning techniques and training data from a large library catalog to learn a classification model that maps sets of LCSH to nodes in the LCC tree. We present empirical results showing the technique's accuracy on an independent collection of 50,000 LCSH/LCC pairs.
Type
Working Paper
Series
Computer Science Working Papers
Citation
Frank, E. & Paynter, G. (2003). Predicting Library of Congress Classifications from Library of Congress Subject Headings. (Working paper 01/03). Hamilton, New Zealand: University of Waikato, Department of Computer Science.
Date
2003-01
Publisher
University of Waikato