Publication:
KEA: Practical automatic keyphrase extraction

dc.contributor.author: Witten, Ian H.
dc.contributor.author: Paynter, Gordon W.
dc.contributor.author: Frank, Eibe
dc.contributor.author: Gutwin, Carl
dc.contributor.author: Nevill-Manning, Craig G.
dc.date.accessioned: 2008-10-13T03:30:14Z
dc.date.available: 2008-10-13T03:30:14Z
dc.date.issued: 2000-03
dc.description.abstract: Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea identifies candidate keyphrases using lexical methods, calculates feature values for each candidate, and uses a machine learning algorithm to predict which candidates are good keyphrases. The machine learning scheme first builds a prediction model using training documents with known keyphrases, and then uses the model to find keyphrases in new documents. We use a large test corpus to evaluate Kea’s effectiveness in terms of how many author-assigned keyphrases are correctly identified. The system is simple, robust, and publicly available. [en_US]
dc.format.mimetype: application/pdf
dc.identifier.citation: Witten, I.H., Paynter, G.W., Frank, E., Gutwin, C. & Nevill-Manning, C. G. (2000). KEA: Practical automatic keyphrase extraction. (Working paper 00/05). Hamilton, New Zealand: University of Waikato, Department of Computer Science. [en_US]
dc.identifier.issn: 1170-487X
dc.identifier.uri: https://hdl.handle.net/10289/1021
dc.language.iso: en
dc.publisher: University of Waikato, Department of Computer Science [en_US]
dc.relation.ispartofseries: Computer Science Working Papers
dc.subject: computer science [en_US]
dc.subject: Machine learning
dc.title: KEA: Practical automatic keyphrase extraction [en_US]
dc.type: Working Paper [en_US]
dspace.entity.type: Publication
uow.relation.series: 00/05

Files

Original bundle

Name: uow-cs-wp-2000-05.pdf
Size: 832.07 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.8 KB
Format: Item-specific license agreed upon to submission