Publication:
Thesaurus-based index term extraction for agricultural documents

Abstract

This paper describes a new algorithm for automatically extracting index terms from documents relating to the domain of agriculture. The domain-specific Agrovoc thesaurus developed by the FAO is used both as a controlled vocabulary and as a knowledge base for semantic matching. The automatically assigned terms are evaluated against a manually indexed 200-item sample of the FAO’s document repository, and the performance of the new algorithm is compared with a state-of-the-art system for keyphrase extraction.

Citation

Medelyan, O., & Witten, I.H. (2005). Thesaurus-based index term extraction for agricultural documents. In Proceedings of 2005 EFITA/WCCA Joint Congress on IT in Agriculture, 25-28 July 2005, Vila Real, Portugal (pp. 1122-1129).

Publisher

EFITA/WCCA