Data Provenance

From Open Risk Manual

Definition

Data Provenance denotes the documentation of the chronology (timeline) of data ownership and of any data transformations or modifications applied to a data set.

In the context of Risk Management, establishing the provenance of Risk Data is considered a core element of good Data Governance and is typically one of the Data Quality dimensions laid out in Data Quality Standards.

The primary benefit of tracing the provenance of data sets is to provide contextual evidence that helps establish whether the data are appropriate to use, and to reveal any limitations or other risks implicit in the sequence of data ownership and data transformations.
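As an illustration of the definition above, the following is a minimal Python sketch of a data set that keeps its own chronology of ownership changes and transformations. All names (TrackedDataset, ProvenanceEvent, fingerprint, the team names and the sample rows) are hypothetical and chosen for this example only; a production system would more likely store such a log in a database or a dedicated lineage tool.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def fingerprint(rows):
    """Return a SHA-256 digest of the rows in a canonical JSON form."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class ProvenanceEvent:
    """One entry in the chronology: who held the data and what was done."""
    owner: str
    operation: str
    input_hash: Optional[str]  # None for the initial creation event
    output_hash: str
    timestamp: str


@dataclass
class TrackedDataset:
    """Hypothetical wrapper that logs every ownership change and transformation."""
    rows: list
    owner: str
    history: list = field(default_factory=list)

    def __post_init__(self):
        self._log("created", None)

    def _log(self, operation, input_hash):
        self.history.append(
            ProvenanceEvent(
                owner=self.owner,
                operation=operation,
                input_hash=input_hash,
                output_hash=fingerprint(self.rows),
                timestamp=datetime.now(timezone.utc).isoformat(),
            )
        )

    def transform(self, operation, func):
        """Apply func to the rows and record the step with before/after hashes."""
        before = fingerprint(self.rows)
        self.rows = func(self.rows)
        self._log(operation, before)

    def transfer(self, new_owner):
        """Record a change of ownership; the data itself is unchanged."""
        before = fingerprint(self.rows)
        self.owner = new_owner
        self._log("ownership transferred", before)


# Illustrative usage with made-up data.
ds = TrackedDataset(
    rows=[{"id": 1, "exposure": 100.0}, {"id": 2, "exposure": None}],
    owner="credit-risk-team",
)
ds.transform(
    "drop rows with missing exposure",
    lambda rows: [r for r in rows if r["exposure"] is not None],
)
ds.transfer("model-validation-team")

for event in ds.history:
    print(event.timestamp, event.owner, event.operation)
```

Because each event stores the hash of its input and output, a reviewer can check that the chain is unbroken: the input hash of every step should equal the output hash of the step before it.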

Methods

  • Access Control (File / Database Permissions)
  • Documentation (User, Data, Operations)
  • Certification by Subject Matter Experts
  • Archiving
  • Reproducible Research

Data Provenance Standards

  • W3C PROV
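
To give a flavour of the W3C PROV vocabulary, the sketch below describes a single cleaning step using its three core concepts (entity, activity, agent) and a few of its relations (used, wasGeneratedBy, wasAssociatedWith, wasDerivedFrom). The dictionary layout is an illustrative assumption loosely modelled on those concepts and is not validated against any official PROV serialization; the ex: identifiers, labels, and the lineage helper are hypothetical.

```python
def build_prov_record():
    """Describe one cleaning step using PROV-style concepts (illustrative layout)."""
    return {
        "entity": {
            "ex:raw_loans": {"prov:label": "Raw loan tape"},
            "ex:clean_loans": {"prov:label": "Cleaned loan tape"},
        },
        "activity": {
            "ex:cleaning": {"prov:label": "Remove incomplete records"},
        },
        "agent": {
            "ex:risk_analyst": {"prov:label": "Risk analyst"},
        },
        # The activity consumed the raw data set ...
        "used": [
            {"prov:activity": "ex:cleaning", "prov:entity": "ex:raw_loans"},
        ],
        # ... and produced the cleaned one.
        "wasGeneratedBy": [
            {"prov:entity": "ex:clean_loans", "prov:activity": "ex:cleaning"},
        ],
        # The analyst was responsible for the activity.
        "wasAssociatedWith": [
            {"prov:activity": "ex:cleaning", "prov:agent": "ex:risk_analyst"},
        ],
        # Direct derivation link between the two data sets.
        "wasDerivedFrom": [
            {"prov:generatedEntity": "ex:clean_loans", "prov:usedEntity": "ex:raw_loans"},
        ],
    }


def lineage(record, entity_id):
    """Follow wasDerivedFrom links back from entity_id to its earliest ancestor."""
    chain = [entity_id]
    seen = {entity_id}
    current = entity_id
    while True:
        parents = [
            d["prov:usedEntity"]
            for d in record["wasDerivedFrom"]
            if d["prov:generatedEntity"] == current
        ]
        # Stop at the origin, or if a cycle would be entered.
        if not parents or parents[0] in seen:
            return chain
        current = parents[0]
        chain.append(current)
        seen.add(current)


record = build_prov_record()
print(lineage(record, "ex:clean_loans"))
```

Keeping the relations separate from the descriptions of entities, activities and agents means the derivation chain can be queried without knowing anything about the content of the data sets themselves.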

Contributors to this article

» Wiki admin