Publication:
Using the HTRC Data Capsule Model to promote reuse and evolution of experimental analysis of digital library data: a case study of topic modeling

Abstract

We report on a case study to independently reproduce the work given in a publicly available blog on how to develop a topic model sourced from a collection of texts, where both the data set and source code used are readily available. More specifically, we detail the steps necessary, and the challenges that had to be overcome, to replicate the work using the HathiTrust Research Center’s virtual machine Data Capsule platform. From this we make recommendations for authors to follow, based on the lessons learned. We also show that the Data Capsule model can be put to work in a way that is of benefit to those interested in supporting computational reproducibility within their organizations.

Citation

Bainbridge, D., Nichols, D. M., Hinze, A., & Downie, J. S. (2019). Using the HTRC Data Capsule Model to promote reuse and evolution of experimental analysis of digital library data: a case study of topic modeling. In Proceedings of 19th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2019) (pp. 463–464). Champaign, IL, USA: IEEE. https://doi.org/10.1109/jcdl.2019.00124

Publisher

IEEE