Using the HTRC Data Capsule Model to promote reuse and evolution of experimental analysis of digital library data: a case study of topic modeling
Bainbridge, D., Nichols, D. M., Hinze, A., & Downie, J. S. (2019). Using the HTRC Data Capsule Model to promote reuse and evolution of experimental analysis of digital library data: a case study of topic modeling. In Proceedings of 19th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2019) (pp. 463–464). Champaign, IL, USA: IEEE. https://doi.org/10.1109/jcdl.2019.00124
Permanent Research Commons link: https://hdl.handle.net/10289/12856
We report on a case study that independently reproduces work described in a publicly available blog on how to develop a topic model from a collection of texts, where both the data set and the source code are readily available. More specifically, we detail the steps necessary, and the challenges that had to be overcome, to replicate the work using the HathiTrust Research Center’s virtual machine Data Capsule platform. Based on the lessons learned, we make recommendations for authors to follow. We also show that the Data Capsule model can be put to work in a way that benefits those interested in supporting computational reproducibility within their organizations.
© 2019 Copyright held by the author(s). Publication rights licensed to ACM.