Thesaurus-based index term extraction for agricultural documents

Abstract
This paper describes a new algorithm for automatically extracting index terms from documents relating to the domain of agriculture. The domain-specific Agrovoc thesaurus developed by the FAO is used both as a controlled vocabulary and as a knowledge base for semantic matching. The automatically assigned terms are evaluated against a manually indexed 200-item sample of the FAO’s document repository, and the performance of the new algorithm is compared with a state-of-the-art system for keyphrase extraction.
Type
Conference Contribution
Citation
Medelyan, O., & Witten, I.H. (2005). Thesaurus-based index term extraction for agricultural documents. In Proceedings of 2005 EFITA/WCCA Joint Congress on IT in Agriculture, 25-28 July 2005, Vila Real, Portugal (pp. 1122-1129).
Date
2005
Publisher
EFITA/WCCA
Rights
© 2005 the authors.