Providing pin-point page-level precision to 1 trillion tokens of text for workset creation
dc.contributor.author | Bainbridge, David | en_NZ |
dc.contributor.author | Downie, J. Stephen | en_NZ |
dc.contributor.author | Capitanu, Boris | en_NZ |
dc.coverage.spatial | Fort Worth, Texas | en_NZ |
dc.date.accessioned | 2018-07-11T02:33:24Z | |
dc.date.available | 2018 | en_NZ |
dc.date.available | 2018-07-11T02:33:24Z | |
dc.date.issued | 2018 | en_NZ |
dc.description.abstract | We report on the work undertaken in developing a web environment that allows users to search over 1 trillion tokens of text, down to the page level, from the HathiTrust Part-of-Speech Extracted Features Dataset to help produce worksets for scholarly analysis. We present an extended example of the web environment in use, along with details about its implementation. | |
dc.format.mimetype | application/pdf | |
dc.identifier.citation | Bainbridge, D., Downie, J. S., & Capitanu, B. (2018). Providing pin-point page-level precision to 1 trillion tokens of text for workset creation. In Proceedings of 18th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2018) (pp. 407–408). New York, USA: ACM. https://doi.org/10.1145/3197026.3203875 | en |
dc.identifier.doi | 10.1145/3197026.3203875 | en_NZ |
dc.identifier.isbn | 978-1-4503-5178-2 | en_NZ |
dc.identifier.uri | https://hdl.handle.net/10289/11929 | |
dc.language.iso | en | |
dc.publisher | ACM | en_NZ |
dc.relation.isPartOf | Proceedings of 18th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2018) | en_NZ |
dc.rights | © 2018 Copyright held by the author(s). | |
dc.source | JCDL 2018 | en_NZ |
dc.subject | computer science | en_NZ |
dc.subject | very large digital libraries | en_NZ |
dc.subject | extract feature text analysis | en_NZ |
dc.subject | workset creation | en_NZ |
dc.title | Providing pin-point page-level precision to 1 trillion tokens of text for workset creation | en_NZ |
dc.type | Conference Contribution | |
dspace.entity.type | Publication | |
pubs.begin-page | 407 | |
pubs.end-page | 408 | |
pubs.finish-date | 2018-06-07 | en_NZ |
pubs.place-of-publication | New York, USA | |
pubs.start-date | 2018-06-03 | en_NZ |
Files
Original bundle
- Name: p407-bainbridge.pdf
- Size: 968.44 KB
- Format: Adobe Portable Document Format
- Description: Published version
License bundle
- Name: Research Commons Deposit Agreement 2017.pdf
- Size: 188.11 KB
- Format: Adobe Portable Document Format
- Description: