dc.contributor.author | Fowke, Michael | |
dc.contributor.author | Hinze, Annika | |
dc.contributor.author | Heese, Ralf | |
dc.date.accessioned | 2014-01-29T03:22:08Z | |
dc.date.available | 2014-01-29T03:22:08Z | |
dc.date.issued | 2013-12 | |
dc.identifier.citation | Fowke, M., Hinze, A., & Heese, R. (2013). Text categorization and similarity analysis: implementation and evaluation. (Working paper 10/2013). Hamilton, New Zealand: University of Waikato, Department of Computer Science. | en_NZ |
dc.identifier.issn | 1177-777X | |
dc.identifier.uri | https://hdl.handle.net/10289/8430 | |
dc.description.abstract | This report covers the implementation of software that aims to identify document versions and semantically related documents. This is important due to the increasing amount of digital information. Key criteria were that the software was fast and required limited disk space. Previous research determined that the Simhash algorithm was the most appropriate for this application so this method was implemented. The structure of each component was well defined with the inputs and outputs constant and the result was a software system that can have interchangeable parts if required. | en_NZ |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | en_NZ |
dc.publisher | University of Waikato, Department of Computer Science | en_NZ |
dc.relation.ispartofseries | Computer Science Working Papers | en_NZ |
dc.rights | © 2013 Michael Fowke, Annika Hinze, Ralf Heese. | en_NZ |
dc.subject | computer science | en_NZ |
dc.title | Text categorization and similarity analysis: implementation and evaluation | en_NZ |
dc.type | Working Paper | en_NZ |
uow.relation.series | 10/2013 | en_NZ |