Master Data Table

From Open Risk Manual

Definition

In the context of certain types of Model Development, a Master Data Table is a data table (stored, e.g., in a spreadsheet or a database) that captures the definitive collection of historical data to be used for developing a Risk Model of a certain class (statistical).

A master data table is typically the outcome of a sequence of data processing and Data Quality steps applied to the initial (input) Risk Data. Its precise structure and content depend on the risk model being developed.
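
As a rough illustration of how such a sequence of steps might produce a master data table, the following minimal Python sketch applies two simple quality checks (duplicate removal and rejection of incomplete records) followed by a type conversion. The record layout, the field names (`loan_id`, `duration_months`, `age`, `defaulted`), the values, and the function `build_master_table` are all hypothetical and chosen only for illustration; real pipelines will differ depending on the model and the input data.

```python
# Hypothetical raw input records; all field names and values are illustrative.
RAW_RECORDS = [
    {"loan_id": "A1", "duration_months": "6", "age": "67", "defaulted": "0"},
    {"loan_id": "A2", "duration_months": "48", "age": "22", "defaulted": "1"},
    {"loan_id": "A2", "duration_months": "48", "age": "22", "defaulted": "1"},  # duplicate
    {"loan_id": "A3", "duration_months": "", "age": "49", "defaulted": "0"},  # missing value
    {"loan_id": "A4", "duration_months": "42", "age": "45", "defaulted": "0"},
]

# Columns retained in the master data table (an assumption for this sketch).
COLUMNS = ["duration_months", "age", "defaulted"]


def build_master_table(records):
    """Apply simple quality checks and return a list of numeric rows."""
    seen_ids = set()
    table = []
    for record in records:
        # Quality check 1: skip records whose identifier was already seen.
        if record["loan_id"] in seen_ids:
            continue
        seen_ids.add(record["loan_id"])
        # Quality check 2: skip records with any missing required field.
        if any(record.get(col, "") == "" for col in COLUMNS):
            continue
        # Processing step: convert the text fields to integers.
        table.append([int(record[col]) for col in COLUMNS])
    return table


master_table = build_master_table(RAW_RECORDS)
for row in master_table:
    print(row)
```

In this sketch the duplicate record and the incomplete record are dropped, so only three of the five input records reach the master data table.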

Examples

In the context of building a credit scorecard, the master data table will contain all the data required for estimating a model, in a suitably formatted table that might look as follows:


   1   6   4  12   5   5   3   4   1  67   3   2   1   2   1   0   0   1   0   0   1   0   0   1   1 
   2  48   2  60   1   3   2   2   1  22   3   1   1   1   1   0   0   1   0   0   1   0   0   1   2 
   4  12   4  21   1   4   3   3   1  49   3   1   2   1   1   0   0   1   0   0   1   0   1   0   1 
   1  42   2  79   1   4   3   4   2  45   3   1   2   1   1   0   0   0   0   0   0   0   0   1   1 
   1  24   3  49   1   3   3   4   4  53   3   2   2   1   1   1   0   1   0   0   0   0   0   1   2 
   4  36   2  91   5   3   3   4   4  35   3   1   2   2   1   0   0   1   0   0   0   0   1   0   1 
   4  24   2  28   3   5   3   4   2  53   3   1   1   1   1   0   0   1   0   0   1   0   0   1   1 

In the above table each row includes a set of numerical variables that might encode either explanatory factors or outcomes.
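
A table of this kind can be read into memory and split into explanatory factors and outcomes with a few lines of Python. The sketch below uses the first three rows of the table above. Which columns hold outcomes is not stated here, so treating the last column as the outcome is purely an assumption made for illustration, as are the variable names.

```python
# First three rows of the example table, copied as whitespace-separated text.
SAMPLE = """
   1   6   4  12   5   5   3   4   1  67   3   2   1   2   1   0   0   1   0   0   1   0   0   1   1
   2  48   2  60   1   3   2   2   1  22   3   1   1   1   1   0   0   1   0   0   1   0   0   1   2
   4  12   4  21   1   4   3   3   1  49   3   1   2   1   1   0   0   1   0   0   1   0   1   0   1
"""

# Parse each non-empty line into a list of integers.
rows = [[int(value) for value in line.split()]
        for line in SAMPLE.splitlines() if line.strip()]

# Basic consistency check: every row should have the same number of columns.
widths = {len(row) for row in rows}
if len(widths) != 1:
    raise ValueError("rows have inconsistent column counts")

# ASSUMPTION (not stated in the text): the last column is the outcome,
# and all preceding columns are explanatory factors.
features = [row[:-1] for row in rows]
outcomes = [row[-1] for row in rows]

print(len(rows), "rows,", widths.pop(), "columns")
print("outcomes:", outcomes)
```

The consistency check on column counts is a small example of the kind of validation that would normally be applied before a table is accepted as the definitive data set for model estimation.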

Issues and Challenges

  • In modern big data environments, the concept of a single data table may not be adequate.

See Also