Difference between revisions of "Master Data Table"

From Open Risk Manual
 
 

Latest revision as of 13:29, 21 February 2023

Definition

In the context of specific types of Model Development, a Master Data Table is a data table (stored, for example, in a spreadsheet or a database) that captures the definitive collection of historical data used to develop a Risk Model of a certain class, namely a statistical model.

A master data table is typically the outcome of a sequence of data processing and Data Quality steps applied to the initial (input) Risk Data. Its precise structure and content depend on the risk model being developed.
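
As a rough illustration of how a sequence of processing and quality steps might turn input records into a master data table, the sketch below chains two simple checks. Everything in it is hypothetical: the field names (`loan_id`, `duration_months`, `defaulted`), the helper functions, and the choice of checks are assumptions made for the example, not steps prescribed by the source.

```python
def drop_incomplete(records):
    """Keep only records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def drop_duplicates(records, key):
    """Keep the first record seen for each value of the key field."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def build_master_table(records, columns, key):
    """Apply the quality steps in order, then lay out the chosen columns as rows."""
    cleaned = drop_duplicates(drop_incomplete(records), key)
    return [[record[c] for c in columns] for record in cleaned]


# Hypothetical input risk data: one missing value and one duplicated loan.
raw_records = [
    {"loan_id": 1, "duration_months": 6, "defaulted": 0},
    {"loan_id": 2, "duration_months": None, "defaulted": 1},
    {"loan_id": 1, "duration_months": 6, "defaulted": 0},
    {"loan_id": 3, "duration_months": 48, "defaulted": 1},
]

master_table = build_master_table(
    raw_records, columns=["duration_months", "defaulted"], key="loan_id"
)
print(master_table)
```

Which checks are appropriate, and in what order, would depend on the risk model and on the data sources involved.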

Examples

In the context of building a credit scorecard, the master data table contains all the data required for estimating the model, arranged in a suitably formatted table that might look as follows:


   1   6   4  12   5   5   3   4   1  67   3   2   1   2   1   0   0   1   0   0   1   0   0   1   1 
   2  48   2  60   1   3   2   2   1  22   3   1   1   1   1   0   0   1   0   0   1   0   0   1   2 
   4  12   4  21   1   4   3   3   1  49   3   1   2   1   1   0   0   1   0   0   1   0   1   0   1 
   1  42   2  79   1   4   3   4   2  45   3   1   2   1   1   0   0   0   0   0   0   0   0   1   1 
   1  24   3  49   1   3   3   4   4  53   3   2   2   1   1   1   0   1   0   0   0   0   0   1   2 
   4  36   2  91   5   3   3   4   4  35   3   1   2   2   1   0   0   1   0   0   0   0   1   0   1 
   4  24   2  28   3   5   3   4   2  53   3   1   1   1   1   0   0   1   0   0   1   0   0   1   1 

In the above table, each row holds a set of numerical variables that may encode either explanatory factors or outcomes.
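
A minimal sketch of reading such a table and separating explanatory factors from outcomes is shown below, using only the Python standard library. It assumes, purely for illustration, that the last column of each row is the outcome and that all preceding columns are explanatory factors; the source does not state which columns play which role. The function names are likewise hypothetical.

```python
# Sample rows copied from the table above.
RAW_TABLE = """
   1   6   4  12   5   5   3   4   1  67   3   2   1   2   1   0   0   1   0   0   1   0   0   1   1
   2  48   2  60   1   3   2   2   1  22   3   1   1   1   1   0   0   1   0   0   1   0   0   1   2
   4  12   4  21   1   4   3   3   1  49   3   1   2   1   1   0   0   1   0   0   1   0   1   0   1
   1  42   2  79   1   4   3   4   2  45   3   1   2   1   1   0   0   0   0   0   0   0   0   1   1
   1  24   3  49   1   3   3   4   4  53   3   2   2   1   1   1   0   1   0   0   0   0   0   1   2
   4  36   2  91   5   3   3   4   4  35   3   1   2   2   1   0   0   1   0   0   0   0   1   0   1
   4  24   2  28   3   5   3   4   2  53   3   1   1   1   1   0   0   1   0   0   1   0   0   1   1
"""


def parse_master_table(text):
    """Parse whitespace-separated integers into a list of equally long rows."""
    rows = [[int(v) for v in line.split()] for line in text.splitlines() if line.strip()]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("rows have inconsistent column counts")
    return rows


def split_factors_and_outcomes(rows):
    """Split each row into factors and outcome.

    Assumption (not confirmed by the source): the last column is the outcome.
    """
    factors = [row[:-1] for row in rows]
    outcomes = [row[-1] for row in rows]
    return factors, outcomes


rows = parse_master_table(RAW_TABLE)
factors, outcomes = split_factors_and_outcomes(rows)
print(len(rows), "rows,", len(rows[0]), "columns")
print("outcomes:", outcomes)
```

Checking that every row has the same number of columns is a basic consistency test that is cheap to run before any model estimation.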

Issues and Challenges

  • In modern big data environments, the concept of a single data table may not be adequate.

See Also