Publication

Topic indexing with Wikipedia

Abstract
Wikipedia article names can be used as a controlled vocabulary for identifying the main topics in a document. Wikipedia’s 2M articles cover the terminology of nearly any document collection, which permits controlled indexing in the absence of manually created vocabularies. We combine state-of-the-art strategies for automatic controlled indexing with Wikipedia’s unique property: it is a richly hyperlinked encyclopedia. We evaluate the scheme by comparing automatically assigned topics with those chosen manually by human indexers. Analysis of indexing consistency shows that our algorithm outperforms some human subjects.
Type
Conference Contribution
Citation
Medelyan, O., Witten, I.H. & Milne, D. (2008). Topic indexing with Wikipedia. In Proceedings of AAAI Workshop on Wikipedia and Artificial Intelligence: an Evolving Synergy, AAAI Press, Chicago, USA, 13 July 2008 (pp. 19-24).
Date
2008
Publisher
AAAI Press
Rights
This is an author’s version of an article published in Proceedings of AAAI Workshop on Wikipedia and Artificial Intelligence: an Evolving Synergy, AAAI Press, Chicago, USA, 13 July 2008.