Providing pin-point page-level precision to 1 trillion tokens of text for workset creation
Citation
Bainbridge, D., Downie, J. S., & Capitanu, B. (2018). Providing pin-point page-level precision to 1 trillion tokens of text for workset creation. In Proceedings of 18th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2018) (pp. 407–408). New York, USA: ACM. https://doi.org/10.1145/3197026.3203875
Permanent Research Commons link: https://hdl.handle.net/10289/11929
Abstract
We report on work undertaken to develop a web environment that allows users to search over 1 trillion tokens of text, down to the page level, in the HathiTrust Part-of-Speech Extracted Features Dataset to help produce worksets for scholarly analysis. We present an extended example of the web environment in use, along with details about its implementation.
Date
2018
Publisher
ACM
Rights
© 2018 Copyright held by the author(s).