An Evaluation of Document Keyphrase Sets

Jones, Steve; Paynter, Gordon W.

Permanent Link

https://hdl.handle.net/10289/1525

Rights

This is an article published in the Journal of Digital Information. The original publication is available at http://journals.tdl.org/jodi/index

Abstract

Keywords and keyphrases have many useful roles as document surrogates and descriptors, but the manual production of keyphrase metadata for large digital library collections is at best expensive and time-consuming, and at worst logistically impossible. Algorithms for keyphrase extraction like Kea and Extractor produce a set of phrases that are associated with a document. Though these sets are often utilized as a group, keyphrase extraction is usually evaluated by measuring the quality of individual keyphrases. This paper reports an assessment that asks human assessors to rate entire sets of keyphrases produced by Kea, Extractor and document authors. The results provide further evidence that human assessors rate all three sources highly (with some caveats), but show that the relationship between the quality of the phrases in a set and the set as a whole is not always simple. Choosing the best individual phrases will not necessarily produce the best set; combinations of lesser phrases may result in better overall quality.

Citation

Jones, S. & Paynter, G. W. (2003). An evaluation of document keyphrase sets. Journal of Digital Information, 4(1).

Type

Conference Contribution

Date

2003

Publisher

British Computer Society
