Towards embedding data provenance in files
Phua, T. W., Patros, P., & Kumar, V. (2021). Towards embedding data provenance in files. In R. Paul (Ed.), Proceedings of 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (pp. 1319–1325). Washington, DC: IEEE. https://doi.org/10.1109/CCWC51732.2021.9375947
Permanent Research Commons link: https://hdl.handle.net/10289/14235
Data provenance (keeping track of who did what, where, when and how) offers attractive use cases for distributed systems, such as intrusion detection, forensic analysis and secure information dependability. This potential, however, can only be realized if provenance is accessible to its primary stakeholders: the end-users. Existing provenance systems are designed in an ‘all-or-nothing’ fashion, making provenance inaccessible, difficult to extract and, crucially, not controlled by its key stakeholders. To mitigate this, we propose that provenance be separated into system, data-specific and file-metadata provenance. Furthermore, we refine data-specific provenance into changes at a fine-grained level, or provenance-per-change, recorded alongside its source. We show that, with the use of delta-encoding, provenance-per-change is viable, indicating that our proposed architecture is effectively realizable.
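To make the idea of provenance-per-change concrete, the following is a minimal sketch and not the authors' implementation. It assumes a change record holding an author, a timestamp and a delta against the previous version; the names `Change`, `make_delta`, `apply_delta` and `replay` are hypothetical, and Python's standard `difflib` stands in for whichever delta-encoding scheme the paper evaluates.

```python
import difflib
from dataclasses import dataclass


@dataclass
class Change:
    """One provenance-per-change record (hypothetical layout)."""
    author: str     # who made the change
    timestamp: str  # when it was made
    delta: list     # how: (start, end, replacement) edits against the previous version


def make_delta(old: str, new: str) -> list:
    """Encode `new` relative to `old`, keeping only the spans that differ."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_delta(old: str, delta: list) -> str:
    """Rebuild the newer version from `old` and a delta produced by make_delta."""
    parts, pos = [], 0
    for start, end, replacement in delta:
        parts.append(old[pos:start])  # unchanged text before this edit
        parts.append(replacement)     # inserted or substituted text
        pos = end                     # skip the text that was replaced or deleted
    parts.append(old[pos:])           # unchanged tail
    return "".join(parts)


def replay(base: str, log: list) -> str:
    """Apply every recorded change in order to recover the latest version."""
    text = base
    for change in log:
        text = apply_delta(text, change.delta)
    return text
```

In this sketch the log of `Change` records would travel with the file itself, so an end-user can inspect who changed what and when, or call `replay` on the base content to reconstruct any later version, without access to a system-wide provenance store.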
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.