Missing Data
From Open Risk Manual
Definition
Missing Data is a typical phenomenon in real-world quantitative / statistical analysis, whereby a material amount of the input data required for performing the analysis (or for Model Development) is missing.
Causes
The causes of missing data can be varied:
- Due to operational challenges:
  - Lack of access to complete data due to commercial, technical (IT) or other reasons
  - Extraction errors
- Causes intrinsic to the process being modelled, e.g.:
  - Market illiquidity may lead to time periods where no market transactions exist
  - Rejected retail clients lead to missing data as to whether they were a good or bad risk (see Reject Inference)
  - High quality clients do not default within typical measurement / observation windows (Low Default Portfolios)
Impact
The impact of missing data can range from insignificant to essentially preventing the quantitative analysis from being carried out. It is thus of great importance for Data Quality programs and, further downstream, for Model Governance and the degree of Model Risk.
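One way to gauge where a data set falls on this range is to measure how much of each field is missing and how many observations would survive if every incomplete one were deleted. The following is a minimal sketch under stated assumptions: the records, the field names and the helper functions `missing_share` and `complete_case_share` are hypothetical and chosen for illustration only, and `None` is assumed to mark a missing value.

```python
# Hypothetical toy data set: None marks a missing value.
records = [
    {"income": 52000.0, "age": 34, "defaulted": 0},
    {"income": None, "age": 45, "defaulted": 0},
    {"income": 61000.0, "age": None, "defaulted": 1},
    {"income": 48000.0, "age": 29, "defaulted": None},
    {"income": None, "age": 51, "defaulted": 0},
]


def missing_share(rows, field):
    """Fraction of rows in which `field` is missing (None)."""
    return sum(row[field] is None for row in rows) / len(rows)


def complete_case_share(rows):
    """Fraction of rows with no missing value in any field."""
    complete = sum(all(v is not None for v in row.values()) for row in rows)
    return complete / len(rows)


# Per-field missing rates for the toy data.
shares = {field: missing_share(records, field) for field in records[0]}
print(shares)
print(complete_case_share(records))
```

In this toy example no single field is more than 40% missing, yet only one observation in five is complete, which illustrates how moderate per-field gaps can compound into a severe loss of usable data.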
Mitigation
Missing Data Imputation is used to correct or salvage data sets by filling in missing values (i.e., to avoid deleting incomplete observations).
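As an illustration of the idea, the sketch below fills missing values with the median of the observed ones. Median imputation is only one simple technique, picked here for brevity and not prescribed by this entry. The data and the function name `impute_median` are hypothetical, and `None` is assumed to mark a missing value.

```python
from statistics import median

# Hypothetical observations; None marks a missing income.
incomes = [52000.0, None, 61000.0, 48000.0, None]


def impute_median(values):
    """Replace each None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


imputed = impute_median(incomes)
print(imputed)
```

All five observations are retained, whereas deleting incomplete ones would leave three. The trade-off is that filling every gap with a single value understates the variability of the data, so the choice of imputation method is itself a modelling decision.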