Data Quality
Definition
Data Quality (also Data Integrity) refers to the condition of information sets (data) that are to be used as inputs for qualitative or quantitative risk assessment, e.g. in the form of Portfolio Information, Algorithms and/or other decision support tools.
Regulated institutions are required to have in place a formal Data Quality Management Framework [1].
Data Quality Assurance
Data quality assurance is a planned and systematic set of processes aiming to provide the desired confidence that the information embodied in a given data set conforms to established requirements. Data quality considerations are typically grouped as follows:
- Data Quality Assessment, a highly contextual process (dependent on the intended uses of the data) that establishes metrics of data quality along a number of different dimensions (Data Quality Standards)
- Data Validation, which focuses primarily on verifying the integrity of data
- Data Cleansing, the process of correcting and possibly transforming data in order to produce a set that is suitable for use
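The three groups above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the loan records, the field names (loan_id, balance, origination), the two chosen dimensions (completeness, uniqueness) and the validation rules are hypothetical, and real assessments depend on the intended use of the data.

```python
from datetime import date

# Hypothetical loan records; field names and values are illustrative only.
records = [
    {"loan_id": "A1", "balance": 1200.0, "origination": "2019-03-01"},
    {"loan_id": "A2", "balance": None, "origination": "2020-07-15"},
    {"loan_id": "A2", "balance": 560.0, "origination": "2020-07-15"},
    {"loan_id": "A3", "balance": -40.0, "origination": "2021-13-01"},
]


def is_valid_date(text):
    """True if text is an ISO calendar date (YYYY-MM-DD)."""
    try:
        date.fromisoformat(text)
        return True
    except (TypeError, ValueError):
        return False


def assess(rows):
    """Assessment: simple metrics along two illustrative dimensions.

    Assumes rows is non-empty.
    """
    n = len(rows)
    complete = sum(r["balance"] is not None for r in rows)
    unique_ids = len({r["loan_id"] for r in rows})
    return {"completeness": complete / n, "uniqueness": unique_ids / n}


def validate(row):
    """Validation: list the rule violations found in one record."""
    errors = []
    if row["balance"] is None:
        errors.append("missing balance")
    elif row["balance"] < 0:
        errors.append("negative balance")
    if not is_valid_date(row["origination"]):
        errors.append("invalid origination date")
    return errors


def cleanse(rows):
    """Cleansing: keep the first valid record for each loan_id."""
    seen, clean = set(), []
    for r in rows:
        if r["loan_id"] in seen or validate(r):
            continue
        seen.add(r["loan_id"])
        clean.append(r)
    return clean


metrics = assess(records)
clean_records = cleanse(records)
```

In this sketch cleansing only drops invalid or duplicate records. Correcting or transforming values, which the definition above also allows, would need rules specific to the data set.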
Example
To support the development of an internal Credit Scorecard, a firm must have access to historical credit data that meet data quality criteria.
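As a rough sketch of what "meeting data quality criteria" could look like in practice, the following Python fragment compares observed metrics for a historical credit data set against minimum thresholds. The metric names, the threshold values and the observed figures are all hypothetical. Actual criteria depend on the scorecard's intended use and on the firm's own standards.

```python
# Hypothetical minimum acceptance criteria (illustrative values only).
CRITERIA = {"completeness": 0.95, "uniqueness": 1.0, "history_years": 5}


def meets_criteria(metrics, criteria=CRITERIA):
    """Return (passed, failures).

    failures maps each failing metric name to its (observed, required) pair.
    A metric that is missing from the input counts as a failure.
    """
    failures = {
        name: (metrics.get(name), required)
        for name, required in criteria.items()
        if metrics.get(name) is None or metrics[name] < required
    }
    return not failures, failures


# Hypothetical metrics observed on a candidate data set.
observed = {"completeness": 0.98, "uniqueness": 1.0, "history_years": 3}
passed, failures = meets_criteria(observed)
```

Here the data set would be rejected because its history is shorter than the assumed five-year minimum, even though the other two metrics pass.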
Issues and Challenges
- During the Financial Crisis, poor data quality was identified as a contributing cause of poor risk management [2].
- Data quality issues are a significant component of Model Risk, a relationship colloquially referred to as the "garbage in, garbage out" principle.