Thesaurus-based index term extraction for agricultural documents

Medelyan, Olena; Witten, Ian H.

Abstract

This paper describes a new algorithm for automatically extracting index terms from documents relating to the domain of agriculture. The domain-specific Agrovoc thesaurus developed by the FAO is used both as a controlled vocabulary and as a knowledge base for semantic matching. The automatically assigned terms are evaluated against a manually indexed 200-item sample of the FAO’s document repository, and the performance of the new algorithm is compared with a state-of-the-art system for keyphrase extraction.

Type

Conference Contribution

Citation

Medelyan, O., & Witten, I.H. (2005). Thesaurus-based index term extraction for agricultural documents. In Proceedings of 2005 EFITA/WCCA Joint Congress on IT in Agriculture, 25-28 July 2005, Vila Real, Portugal (pp. 1122-1129).

Date

2005

Publisher

EFITA/WCCA
