dc.contributor.author | Huang, Anna | |
dc.contributor.author | Witten, Ian H. | |
dc.contributor.author | Frank, Eibe | |
dc.contributor.author | Milne, David N. | |
dc.coverage.spatial | Conference held at Bangkok, Thailand | en_NZ |
dc.date.accessioned | 2010-02-10T20:30:25Z | |
dc.date.available | 2010-02-10T20:30:25Z | |
dc.date.issued | 2009 | |
dc.identifier.citation | Huang, A., Witten, I. H., Frank, E. & Milne, D. (2009). Clustering documents using a Wikipedia-based concept representation. In Proceedings of 13th Pacific-Asia Conference, PAKDD 2009 Bangkok, Thailand, April 27-30, 2009. (pp. 628-636). | en |
dc.identifier.uri | https://hdl.handle.net/10289/3559 | |
dc.description.abstract | This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation by mapping the terms and phrases within documents to their corresponding articles (or concepts) in Wikipedia. We also develop a similarity measure that evaluates the semantic relatedness between concept sets for two documents. We test the concept-based representation and the similarity measure on two standard text document datasets. Empirical results show that although further optimizations could be performed, our approach already improves upon related techniques. | en |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.publisher | Springer | en_NZ |
dc.relation.uri | http://www.springerlink.com/content/fn135n77w033t619/ | en |
dc.rights | This is an author’s accepted version of an article published in Proceedings of 13th Pacific-Asia Conference, PAKDD 2009 Bangkok, Thailand, April 27-30, 2009. ©2009 Springer-Verlag Berlin Heidelberg. | en |
dc.source | PAKDD 2009 | en_NZ |
dc.subject | Machine learning | |
dc.title | Clustering documents using a Wikipedia-based concept representation | en |
dc.identifier.doi | 10.1007/978-3-642-01307-2_62 | en |
dc.relation.isPartOf | Proc 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining | en_NZ |
pubs.begin-page | 628 | en_NZ |
pubs.elements-id | 18982 | |
pubs.end-page | 636 | en_NZ |
pubs.finish-date | 2009-04-30 | en_NZ |
pubs.place-of-publication | Germany | en_NZ |
pubs.start-date | 2009-04-27 | en_NZ |
pubs.volume | LNCS 5476 | en_NZ |