Difference between revisions of "Stratified Sampling"

From Open Risk Manual
(5.1 Calculate The Sample Size For The Reserve Sample)
 
(4 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
== Approach To Selecting A Stratified Sample ==
 
== Approach To Selecting A Stratified Sample ==
  
The approach to selecting the sample consists of five steps, as illustrated in the figure below. These steps are not necessarily consecutive, as the NCA bank team may decide, for instance, to prepare all the scripts and tools in advance. The remainder of this subsection describes the approach for each of the steps.
+
The [[AQR Manual]] approach to selecting a portfolio sample consists of five steps. These steps are not necessarily consecutive.
  
 
+
=== Step 1. Define Perimeter Of Selectable Debtors ===
==== 3.5.1 Step 1 Define Perimeter Of Selectable Debtors ====
 
  
 
Some parts of each portfolio will be excluded from sampling (and therefore projection of findings). The exclusions are:
 
Some parts of each portfolio will be excluded from sampling (and therefore projection of findings). The exclusions are:
  
- Retail exposures other than retail mortgages (i.e. retail SMEs and retail others) These exposures will be reviewed through the collective provisioning review (see Section 7 below on the collective provisioning review) ((Also retail mortgages shall be assessed through the collective provisioning review; however critical inputs for the calibration of the collective provisioning parameters shall be sourced through the review of files and collaterals. ));
+
* Retail exposures other than retail mortgages (i.e. retail SMEs and retail others). These exposures will be reviewed through the collective provisioning review (see AQR Manual Section on the collective provisioning review). (Retail mortgages shall also be assessed through the collective provisioning review; however, critical inputs for the calibration of the collective provisioning parameters shall be sourced through the review of files and collaterals.);
- Portfolios that have not been selected for Phase 2;
+
* Portfolios that have not been selected for Phase 2;
- Individual debtors from selected portfolios that are externally rated and this rating is better than an ECAI Credit Quality Step 4, as defined in the loan tape descriptive Excel –The risk of material misstatements is negligible;
+
* Individual debtors from selected portfolios that are externally rated with a rating better than ECAI Credit Quality Step 4, as defined in the loan tape descriptive Excel (for these debtors the risk of material misstatement is negligible);
- Corporates with both Debt/EBITDA < 1 and Equity/Assets > 50% based on audited accounts that are less than 12 months old;
+
* Corporates with both Debt/EBITDA < 1 and Equity/Assets > 50% based on audited accounts that are less than 12 months old;
- Debtors that have been 95% provisioned or more.
+
* Debtors that have been 95% provisioned or more.
  
  
=== 3.5.1.1 Calculation Approach ===
+
==== 1.1 Calculation Approach ====
  
 
Loan tape data is provided in three different views: debtor view, facility view and collateral view; as described in Section 0. This subsection outlines how these three views have to be combined to prepare the sampling dataset, which is defined at the debtor level and aggregates up past due and LTV. For the avoidance of doubt, each debtor represents one line in the sampling database, except for retail exposures in which each facility represents one line in the sampling database.
 
Loan tape data is provided in three different views: debtor view, facility view and collateral view; as described in Section 0. This subsection outlines how these three views have to be combined to prepare the sampling dataset, which is defined at the debtor level and aggregates up past due and LTV. For the avoidance of doubt, each debtor represents one line in the sampling database, except for retail exposures in which each facility represents one line in the sampling database.
Line 22: Line 21:
  
 
The third task is to exclude from the collated dataset the portfolios and debtors that are not subject to credit file review:
 
The third task is to exclude from the collated dataset the portfolios and debtors that are not subject to credit file review:
 
  
 
* Portfolio is not among the portfolios selected during Phase 1;
 
* Portfolio is not among the portfolios selected during Phase 1;
Line 28: Line 26:
 
* Portfolio = Other retail;
 
* Portfolio = Other retail;
 
* CQS better than 4;
 
* CQS better than 4;
* Both Debt/EBITDA<1 and Equity/Assets>50%;
+
* Both Debt/EBITDA < 1 and Equity/Assets > 50%;
 
* Provisions > 95% of Debtor exposure.
 
* Provisions > 95% of Debtor exposure.
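The exclusion criteria listed above could be expressed as a single predicate. The following is a minimal Python sketch, not part of the AQR Manual: all parameter names are hypothetical, `None` stands for an unavailable value, the check on the age of the audited accounts is omitted, and it is assumed that CQS 1 is the best step, so that "better than 4" means a CQS below 4.

```python
from typing import Optional


def is_excluded(portfolio: str,
                selected_in_phase1: bool,
                cqs: Optional[int],
                debt_to_ebitda: Optional[float],
                equity_to_assets: Optional[float],
                provisions: float,
                exposure: float) -> bool:
    """Return True if the debtor falls outside the sampling perimeter."""
    # Portfolio is not among the portfolios selected during Phase 1.
    if not selected_in_phase1:
        return True
    # Portfolio = Other retail.
    if portfolio == "Other retail":
        return True
    # Assumption: CQS 1 is the best step, so "better than 4" means cqs < 4.
    if cqs is not None and cqs < 4:
        return True
    # Both Debt/EBITDA < 1 and Equity/Assets > 50% (ratio given as a fraction).
    if (debt_to_ebitda is not None and equity_to_assets is not None
            and debt_to_ebitda < 1 and equity_to_assets > 0.50):
        return True
    # Provisions > 95% of debtor exposure.
    if exposure > 0 and provisions > 0.95 * exposure:
        return True
    return False
```

Debtors for which `is_excluded` returns `True` would be dropped from the collated dataset before stratification.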
  
Line 34: Line 32:
 
The general convention about how to treat missing values applies to this dataset: “not applicable” will be designated as “N/A” for text and “11111111111” for numeric fields; whereas “missing information” will be designated as “MISS” for text and “99999999999” for numeric fields.
 
The general convention about how to treat missing values applies to this dataset: “not applicable” will be designated as “N/A” for text and “11111111111” for numeric fields; whereas “missing information” will be designated as “MISS” for text and “99999999999” for numeric fields.
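When the sampling dataset is processed programmatically, the numeric placeholder values should be decoded before any aggregation, otherwise they would distort sums and percentiles. The following is a minimal Python sketch of such a decoding step; the constant and function names are hypothetical, and only the placeholder values themselves come from the convention above.

```python
from typing import Optional, Tuple

# Placeholder values taken from the convention described above.
NUM_NOT_APPLICABLE = 11111111111
NUM_MISSING = 99999999999
TEXT_NOT_APPLICABLE = "N/A"
TEXT_MISSING = "MISS"


def decode_numeric(value: float) -> Tuple[Optional[float], str]:
    """Split a raw numeric field into (usable value, status)."""
    if value == NUM_NOT_APPLICABLE:
        return None, "not applicable"
    if value == NUM_MISSING:
        return None, "missing"
    return value, "ok"


def decode_text(value: str) -> Tuple[Optional[str], str]:
    """Split a raw text field into (usable value, status)."""
    if value == TEXT_NOT_APPLICABLE:
        return None, "not applicable"
    if value == TEXT_MISSING:
        return None, "missing"
    return value, "ok"
```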
  
==== 3.5.2 Step 2 Stratify Portfolio ====
+
=== Step 2. Stratify Portfolio ===
  
 
Every portfolio will be split into strata. This stratification enables a manageable sample size, while maintaining high standards of accuracy and representativeness of the sample. Stratification will be based upon the criteria of exposure size and riskiness. Figure 5 below illustrates how each portfolio is divided into strata and how the stratified sample is selected. Matrix numbers represent the percentage of observations selected from each bucket, from an example large corporate portfolio.
 
Every portfolio will be split into strata. This stratification enables a manageable sample size, while maintaining high standards of accuracy and representativeness of the sample. Stratification will be based upon the criteria of exposure size and riskiness. Figure 5 below illustrates how each portfolio is divided into strata and how the stratified sample is selected. Matrix numbers represent the percentage of observations selected from each bucket, from an example large corporate portfolio.
  
=== 3.5.2.1 Step 2.1 – Stratify By Riskiness Buckets ===
+
==== 2.1. Stratify By Riskiness Buckets ====
  
 
Riskiness buckets (vertical axis of the Figure 5 above) are defined using basic definitions that all significant banks should be able to provide in their loan tape (see Section 2.4), such as past due status etc. To simplify this distinction, forward looking criteria – such as PD – have been avoided. The specific definitions are:
 
Riskiness buckets (vertical axis of the Figure 5 above) are defined using basic definitions that all significant banks should be able to provide in their loan tape (see Section 2.4), such as past due status etc. To simplify this distinction, forward looking criteria – such as PD – have been avoided. The specific definitions are:
  
 +
* ''Default more than 12 months'': Is and has been non-performing with days past due more than 12 months (internal or EBA definition).
 +
* ''Default more than six months'' but less than 12 months: Is and has been non-performing with days past due of more than six months but less than 12 (internal or EBA definition);
 +
* ''Default less than six months'': Is and has been non-performing with days past due of less than six months (internal or EBA definition);
 +
* ''High-risk cured'': Was NPE less than 12 months ago (internal or EBA definition), and currently shows any of the potential deterioration signs referred to below;
 +
* ''High risk'': Has not been non-performing for the last 12 months, but currently shows one of the signs of potential deterioration defined in Table 28;
 +
* ''Normal cured'': Currently has none of the high risk signs, but has been non-performing less than 12 months ago (internal or EBA definition);
 +
* ''Normal'': Currently has none of the high risk signs, and has not been non-performing for the last 12 months, at least;
  
* //Default more than 12 months//: Is and has been non-performing with days past due more than 12 months (internal or EBA definition).
 
* //Default more than six months// but less than 12 months: Is and has been non-performing with days past due of more than six months but less than 12 (internal or EBA definition);
 
* //Default less than six months:// Is and has been non-performing with days past due of less than six months (internal or EBA definition);
 
* //High-risk cured//: Was NPE less than 12 months ago (internal or EBA definition), and
 
currently shows any of the potential deterioration signs referred to below;
 
* //High risk:// Has not been non-performing for the last 12 months, but currently shows one of
 
the signs of potential deterioration defined in Table 28;
 
* //Normal cured:// Currently has none of the high risk signs, but has been non-performing less than 12 months ago (internal or EBA definition);
 
* //Normal:// Currently has none of the high risk signs, and has not been non-performing for the last 12 months, at least;
 
  
 
Note: Past due definitions should respect local definition of materiality as per Article 178 of CRR.
 
Note: Past due definitions should respect local definition of materiality as per Article 178 of CRR.
  
**Data required**
+
'''Data required'''
  
 
The basis for the stratification is the sampling dataset, as per the section above. The fields required are listed in the table below.
 
The basis for the stratification is the sampling dataset, as per the section above. The fields required are listed in the table below.
  
**Parameters required**
+
'''Parameters required'''
  
 
Riskiness buckets will be defined through the combination of three flags: //Current status flag//, //Time in default// and //Cured//:
 
Riskiness buckets will be defined through the combination of three flags: //Current status flag//, //Time in default// and //Cured//:
  
** Calculation approach **
+
'''Calculation approach'''
  
 
To calculate the riskiness buckets, the parameters above have to be simply combined:
 
To calculate the riskiness buckets, the parameters above have to be simply combined:
  
 
* Default more than 12 months when:

* Default more than 12 months when:
  * Current status flag = Default;
+
** Current status flag = Default;
  * And Time in default = More than 12 months;
+
** And Time in default = More than 12 months;
  * And Cured = N/A;
+
** And Cured = N/A;
 
* Default more than six months but less than 12 months when:

* Default more than six months but less than 12 months when:
  * Current status flag = Default;
+
** Current status flag = Default;
  * And Time in default = six to 12 months;
+
** And Time in default = six to 12 months;
  * And Cured = N/A.
+
** And Cured = N/A.
 
* Default less than 6 months when:
 
* Default less than 6 months when:
  * Current status flag = Default;
+
** Current status flag = Default;
  * And Time in default = Less than six months;
+
** And Time in default = Less than six months;
  * And Cured = N/A.
+
** And Cured = N/A.
 
* High-risk cured when:
 
* High-risk cured when:
  * Current status flag = High Risk;
+
** Current status flag = High Risk;
  * And Time in default = N/A;
+
** And Time in default = N/A;
  * And Cured = 1.
+
** And Cured = 1.
 
* High risk when:
 
* High risk when:
  * Current status flag = High Risk;
+
** Current status flag = High Risk;
  * And Time in default = N/A;
+
** And Time in default = N/A;
  * And Cured = 0.
+
** And Cured = 0.
 
* Normal cured when:
 
* Normal cured when:
  * Current status flag = Normal;
+
** Current status flag = Normal;
  * And Time in default = N/A;
+
** And Time in default = N/A;
  * And Cured = 1.
+
** And Cured = 1.
 
* Normal when:
 
* Normal when:
  * Current status flag = Normal;
+
** Current status flag = Normal;
  * And Time in default = N/A;
+
** And Time in default = N/A;
  * And Cured = 0.
+
** And Cured = 0.
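The combinations listed above amount to a lookup on the three flags. The following is a minimal Python sketch of that lookup; the function name and the string encodings of the flag values (for instance `"1"` and `"0"` for the Cured flag) are assumptions made for illustration, and any combination not listed above is rejected rather than guessed.

```python
# (Current status flag, Time in default, Cured) -> riskiness bucket
RISKINESS_BUCKETS = {
    ("Default", "More than 12 months", "N/A"): "Default more than 12 months",
    ("Default", "Six to 12 months", "N/A"):
        "Default more than six months but less than 12 months",
    ("Default", "Less than six months", "N/A"): "Default less than six months",
    ("High Risk", "N/A", "1"): "High-risk cured",
    ("High Risk", "N/A", "0"): "High risk",
    ("Normal", "N/A", "1"): "Normal cured",
    ("Normal", "N/A", "0"): "Normal",
}


def riskiness_bucket(status: str, time_in_default: str, cured: str) -> str:
    """Return the riskiness bucket for one debtor, or raise on invalid flags."""
    key = (status, time_in_default, cured)
    if key not in RISKINESS_BUCKETS:
        raise ValueError(f"Inconsistent flag combination: {key}")
    return RISKINESS_BUCKETS[key]
```

Raising an error on unlisted combinations surfaces data quality problems in the loan tape instead of silently assigning a bucket.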
  
=== 3.5.2.2 Step 2.2 – Stratify By Exposure Size Buckets ===
+
==== 2.2 – Stratify By Exposure Size Buckets ====
  
 
Exposure size buckets (horizontal axis of the Figure 5 above) are defined in three steps:
 
Exposure size buckets (horizontal axis of the Figure 5 above) are defined in three steps:
 
  
 
* Top ten debtors by exposure size of each portfolio and risk bucket are sampled;
 
* Top ten debtors by exposure size of each portfolio and risk bucket are sampled;
Line 105: Line 100:
 
* The range between the tenth debtor by exposure size and the 5th percentile (5% smallest exposures (based on total number of debtors) ordered by exposure size) is split into five buckets of the same absolute difference in exposure.
 
* The range between the tenth debtor by exposure size and the 5th percentile (5% smallest exposures (based on total number of debtors) ordered by exposure size) is split into five buckets of the same absolute difference in exposure.
  
** Data required **
+
 
 +
'''Data required'''
  
 
The basis for the stratification is the sampling dataset, as per the sections above. The fields required are listed in the table below.
 
The basis for the stratification is the sampling dataset, as per the sections above. The fields required are listed in the table below.
  
  
** Parameters required **
+
'''Parameters required'''
  
 
For clarity:
 
For clarity:
  
* A Stratum is a sub-segment of the portfolio with similar exposure size and risk classification – i.e. normal risk, exposure size bucket 1 would be an example of a Stratum Strata is the plural of Stratum(!)
+
* A Stratum is a sub-segment of the portfolio with similar exposure size and risk classification – i.e. normal risk, exposure size bucket 1 would be an example of a Stratum. Strata is the plural of Stratum(!)
 
* A Common Risk Strata is a group of Strata with different levels of exposure but the same risk characteristics – i.e. normal risk, exposure size bucket 1 and normal risk, exposure size bucket 2 would both be in a Common Risk Strata

* A Common Risk Strata is a group of Strata with different levels of exposure but the same risk characteristics – i.e. normal risk, exposure size bucket 1 and normal risk, exposure size bucket 2 would both be in a Common Risk Strata

* A Common Exposure Strata is a group of sub-segments with different levels of risk but the same exposure characteristics – i.e. normal risk, exposure size bucket 1 and normal cured risk, exposure size bucket 1 would both be in a Common Exposure Strata

* A Common Exposure Strata is a group of sub-segments with different levels of risk but the same exposure characteristics – i.e. normal risk, exposure size bucket 1 and normal cured risk, exposure size bucket 1 would both be in a Common Exposure Strata
Line 125: Line 121:
 
* Cut-off3;
 
* Cut-off3;
 
* Cut-off4;
 
* Cut-off4;
Top10th Exposure.
+
* Top10th Exposure.
 +
 
  
 
These cut-offs are specific to each portfolio and riskiness buckets, meaning that, for instance, cut-off points for retail mortgages normal will be different from cut-off points for retail mortgages defaulted >12 months and different from large corporates defaulted >12 months. The steps to calculate them are explained below and illustrated in the Figure 6:
 
These cut-offs are specific to each portfolio and riskiness buckets, meaning that, for instance, cut-off points for retail mortgages normal will be different from cut-off points for retail mortgages defaulted >12 months and different from large corporates defaulted >12 months. The steps to calculate them are explained below and illustrated in the Figure 6:
  
- Calculate the 5th Percentile of exposure (by debtor) for each portfolio and riskiness bucket, i.e. determine the exposure of the debtor whose exposure is smaller than that of 95% of the other debtors in the same Common Risk Strata;
+
* Calculate the 5th Percentile of exposure (by debtor) for each portfolio and riskiness bucket, i.e. determine the exposure of the debtor whose exposure is smaller than that of 95% of the other debtors in the same Common Risk Strata;
- Identify the exposure size of the Top 10th debtor by exposure size in each Common Risk Strata;
+
* Identify the exposure size of the Top 10th debtor by exposure size in each Common Risk Strata;
- Calculate the auxiliary variable “Step” as:
+
* Calculate the auxiliary variable “Step” as:
- Step = (Top10th Exposure - 5th Percentile) / 5
+
* Step = (Top10th Exposure - 5th Percentile) / 5
- For i = 1 to 4, calculate Cut-off_i as: Cut-off_i = 5th Percentile + (Step × i)
+
* For i = 1 to 4, calculate Cut-off_i as: Cut-off_i = 5th Percentile + (Step × i)
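The cut-off calculation above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, the input is assumed to be the list of debtor exposures for one portfolio and riskiness bucket, and the nearest-rank method used for the 5th percentile is an assumption, since the text does not prescribe a percentile convention.

```python
import math
from typing import List, Tuple


def exposure_cutoffs(exposures: List[float]) -> Tuple[float, List[float], float]:
    """Return (5th percentile, [Cut-off1..Cut-off4], Top10th exposure)."""
    if len(exposures) < 10:
        raise ValueError("At least 10 debtors are needed to identify the Top10th exposure")
    ascending = sorted(exposures)
    # Assumption: nearest-rank percentile (smallest value with at least 5% of debtors at or below it).
    rank = max(math.ceil(0.05 * len(ascending)), 1)
    p5 = ascending[rank - 1]
    # Exposure of the tenth largest debtor.
    top10th = ascending[-10]
    # Step = (Top10th Exposure - 5th Percentile) / 5
    step = (top10th - p5) / 5
    # Cut-off_i = 5th Percentile + Step * i, for i = 1..4
    cutoffs = [p5 + step * i for i in range(1, 5)]
    return p5, cutoffs, top10th
```

The four cut-offs, together with the 5th percentile and the Top10th exposure, delimit five exposure ranges of equal width between the smallest and largest sampled exposures.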
  
  
** Calculation approach **
+
'''Calculation approach'''
  
 
Once the parameters are calculated, each debtor is allocated to the corresponding exposure size bucket:
 
Once the parameters are calculated, each debtor is allocated to the corresponding exposure size bucket:
Line 147: Line 144:
 
* Exposure size bucket = 5th Percentile when Exposure ≤ 5th Percentile;
 
* Exposure size bucket = 5th Percentile when Exposure ≤ 5th Percentile;
  
==== 3.5.3 Step 3 – Select The Priority Debtors ====
+
=== Step 3. Select The Priority Debtors ===
  
 
In order to anticipate the beginning of the credit file review, the “priority debtors” will be selected. These consist of the top 10 debtors (top 5 for small granular non-retail portfolios) by exposure size per portfolio and riskiness bucket. Picking these files should be relatively straightforward, allowing credit file review to begin swiftly on completion of the loan tape. If the 10th and 11th debtors are strictly identical by exposure, the lowest allocated value of collateral can be used to select which debtor goes into the priority debtors. If allocated collateral is also equal, a random choice should be made.

In order to anticipate the beginning of the credit file review, the “priority debtors” will be selected. These consist of the top 10 debtors (top 5 for small granular non-retail portfolios) by exposure size per portfolio and riskiness bucket. Picking these files should be relatively straightforward, allowing credit file review to begin swiftly on completion of the loan tape. If the 10th and 11th debtors are strictly identical by exposure, the lowest allocated value of collateral can be used to select which debtor goes into the priority debtors. If allocated collateral is also equal, a random choice should be made.
Line 153: Line 150:
 
At NCA discretion, in addition to the top 10 debtors, all debtors within the top 20 groups of connected clients (across all selected portfolios, not by portfolio/riskiness bucket) can be selected as an additional priority group, to the extent they have not already been analysed. NCAs will decide at the beginning of this step if they wish to pursue this option.
 
At NCA discretion, in addition to the top 10 debtors, all debtors within the top 20 groups of connected clients (across all selected portfolios, not by portfolio/riskiness bucket) can be selected as an additional priority group, to the extent they have not already been analysed. NCAs will decide at the beginning of this step if they wish to pursue this option.
  
=== 3.5.3.1 Data Required ===
+
'''Data Required'''
  
 
The basis for the selection of the priority debtors is the sampling dataset, as per the sections above. The fields required are listed in the table below.
 
The basis for the selection of the priority debtors is the sampling dataset, as per the sections above. The fields required are listed in the table below.
  
 
+
'''Calculation Approach'''
=== 3.5.3.2 Calculation Approach ===
 
  
 
The selection of the priority debtors is as easy as picking the debtors that have been allocated to the Top10 exposure size bucket for all the portfolios and riskiness buckets. For the avoidance of doubt, this means that 70 debtors will be selected per portfolio (10 per riskiness bucket), though some debtors may belong to the same group of connected clients, and therefore be analysed together. In these circumstances, no extra priority debtors should be selected.
 
The selection of the priority debtors is as easy as picking the debtors that have been allocated to the Top10 exposure size bucket for all the portfolios and riskiness buckets. For the avoidance of doubt, this means that 70 debtors will be selected per portfolio (10 per riskiness bucket), though some debtors may belong to the same group of connected clients, and therefore be analysed together. In these circumstances, no extra priority debtors should be selected.
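A possible way to pick the priority debtors, including the tie-breaking rule for equal exposures, is sketched below in Python. The record keys (`debtor_id`, `portfolio`, `risk_bucket`, `exposure`, `collateral`) and the function name are hypothetical; `top_n` would be set to 5 for small granular non-retail portfolios.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def select_priority_debtors(debtors: List[dict],
                            top_n: int = 10,
                            seed: Optional[int] = None) -> Dict[Tuple[str, str], list]:
    """Return the ids of the top_n debtors by exposure per (portfolio, riskiness bucket)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for debtor in debtors:
        groups[(debtor["portfolio"], debtor["risk_bucket"])].append(debtor)

    selected = {}
    for key, members in groups.items():
        # Largest exposure first; ties broken by lowest allocated collateral,
        # and remaining ties by a random number drawn once per debtor.
        ranked = sorted(
            members,
            key=lambda d: (-d["exposure"], d["collateral"], rng.random()),
        )
        selected[key] = [d["debtor_id"] for d in ranked[:top_n]]
    return selected
```

Recording the seed makes the random tie-break reproducible if the selection has to be re-run or reviewed.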
  
==== 3.5.4 Step 4 Select Random Stratified Sample ====
+
=== Step 4. Select Random Stratified Sample ===
  
 
The stratification of the portfolios enables sufficient audit evidence with only a few observations per stratum. This section outlines how the number of observations per stratum is defined and how individual debtors will be picked once the sample size has been calculated.
 
The stratification of the portfolios enables sufficient audit evidence with only a few observations per stratum. This section outlines how the number of observations per stratum is defined and how individual debtors will be picked once the sample size has been calculated.
  
=== 3.5.4.1 Step 4.1 Calculate Sample Size ===
+
==== 4.1 Calculate Sample Size ====
 
 
Not all of the strata will be sampled. In general, small exposures will not be reviewed and in the case of retail mortgage portfolios, for those debtors that do not show any evidence of current or past reasons for potential impairment, only the largest exposures will be reviewed. This is illustrated in Figure 8 and Figure 9 below.
 
  
 +
Not all of the strata will be sampled. In general, small exposures will not be reviewed and in the case of retail mortgage portfolios, for those debtors that do not show any evidence of current or past reasons for potential impairment, only the largest exposures will be reviewed.
  
 
The number of files sampled per stratum is defined based on the following factors:
 
The number of files sampled per stratum is defined based on the following factors:
 
  
 
* The risk category of the stratum;
 
* The risk category of the stratum;
Line 181: Line 175:
  
  
** Data required **
+
'''Data required'''
  
 
The basis for the calculation of the sample size is the sampling dataset, as per the sections above. The fields required are listed in the table below.
 
The basis for the calculation of the sample size is the sampling dataset, as per the sections above. The fields required are listed in the table below.
  
 
+
'''Parameters required'''
** Parameters required **
 
  
 
The parameters required to determine the statistical sufficiency of the sample are provided by the CPMO. The parameters are shown in the Table below.
 
The parameters required to determine the statistical sufficiency of the sample are provided by the CPMO. The parameters are shown in the Table below.
 
  
 
NCA bank teams may apply the parameters for small concentrated non-retail portfolios when the total RWA of the portfolio is less than 5% of the total credit RWA of the bank and the top 50 debtors account for at least 40% of the total exposure in the portfolio. NCA bank teams may petition to apply these parameters where the total RWA of the portfolio is between 5% and 10% of the total credit RWA of the bank, the top 50 debtors account for at least 40% of the total exposure in the portfolio, and the number of files selected for the bank is greater than the expected number of files communicated by the CPMO at the end of Phase 1. The following subsection explains how these parameters are applied.

NCA bank teams may apply the parameters for small concentrated non-retail portfolios when the total RWA of the portfolio is less than 5% of the total credit RWA of the bank and the top 50 debtors account for at least 40% of the total exposure in the portfolio. NCA bank teams may petition to apply these parameters where the total RWA of the portfolio is between 5% and 10% of the total credit RWA of the bank, the top 50 debtors account for at least 40% of the total exposure in the portfolio, and the number of files selected for the bank is greater than the expected number of files communicated by the CPMO at the end of Phase 1. The following subsection explains how these parameters are applied.
  
** Calculation approach **
+
'''Calculation approach'''
  
 
The first step in the calculation is to allocate exposure and number of debtors (after exclusions) by stratum, as illustrated in the following figure.
 
The first step in the calculation is to allocate exposure and number of debtors (after exclusions) by stratum, as illustrated in the following figure.
 
  
 
The number of observations is then looked up for each stratum from the table above. In doing so, the correct set of corporate parameters (granular, non-granular or small and granular) should be looked up, depending on the number of observations in the portfolio after exclusions.
 
The number of observations is then looked up for each stratum from the table above. In doing so, the correct set of corporate parameters (granular, non-granular or small and granular) should be looked up, depending on the number of observations in the portfolio after exclusions.
 
  
 
If forbearance information is not available to determine the high risk segment and no conservative proxy is available (as described in section on DIV), the sample size for normal cured and normal should be increased by a factor of 4 (up to the total population of the stratum). For instance, if forbearance/restructuring information is not available for the above example, the revised sample size will be:
 
If forbearance information is not available to determine the high risk segment and no conservative proxy is available (as described in section on DIV), the sample size for normal cured and normal should be increased by a factor of 4 (up to the total population of the stratum). For instance, if forbearance/restructuring information is not available for the above example, the revised sample size will be:
  
  
** Example calculation **
+
==== 4.2 – Select Specific Debtors ====
 
 
An example calculation and output is shown in the example calculation in Excel “Sampling example tool.xlsx”.
 
 
 
=== 3.5.4.2 Step 4.2 – Select Specific Debtors ===
 
  
 
To ensure that the sample is representative and unbiased, random sampling will be applied to select specific debtors.
 
To ensure that the sample is representative and unbiased, random sampling will be applied to select specific debtors.
  
** Data required **
+
'''Data required'''
  
 
The basis for the selection of specific debtors is the sampling dataset, as per the sections above. The fields required are listed in the Table below.
 
The basis for the selection of specific debtors is the sampling dataset, as per the sections above. The fields required are listed in the Table below.
  
 
+
'''Calculation approach'''
** Calculation approach **
 
  
 
The approach to select specific debtors is:
 
The approach to select specific debtors is:
  
- Ensure that the portfolio follows a random order by assigning a randomly generated number ((ISA 530, Appendix 4, Paragraph a: “Random selection (applied through random number generators, for example, random number tables).” )) (e.g. SAS’ ranuni(seed)) to each debtor and sorting in descending order;
+
* Ensure that the portfolio follows a random order by assigning a randomly generated number ((ISA 530, Appendix 4, Paragraph a: “Random selection (applied through random number generators, for example, random number tables).” )) (e.g. SAS’ ranuni(seed)) to each debtor and sorting in descending order;
- Starting with the first debtor in the randomly sorted list, select the first “n” debtors, for each stratum where “n” is the total sample size for each stratum described in the previous section.
+
* Starting with the first debtor in the randomly sorted list, select the first “n” debtors, for each stratum where “n” is the total sample size for each stratum described in the previous section.
 
 
  
 
Alternatively, typical data management software offers solutions to run stratified samples easily (e.g. SAS’ PROC SURVEYSELECT combined with the statement “strata”). The NCA bank team may use these solutions as long as the randomness of the selection is ensured.
 
Alternatively, typical data management software offers solutions to run stratified samples easily (e.g. SAS’ PROC SURVEYSELECT combined with the statement “strata”). The NCA bank team may use these solutions as long as the randomness of the selection is ensured.
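The two-step selection described above (assign a random number to each debtor, sort in descending order, take the first “n” debtors per stratum) could look as follows in Python. This is a sketch under stated assumptions: the record keys `debtor_id` and `stratum` and the function name are hypothetical, and Python's `random` module stands in for the SAS random number function mentioned in the text.

```python
import random
from collections import defaultdict
from typing import Dict, List


def select_stratified_sample(debtors: List[dict],
                             sample_sizes: Dict[str, int],
                             seed: int) -> Dict[str, list]:
    """Pick the first n debtors per stratum from a randomly ordered list."""
    rng = random.Random(seed)
    # Step 1: assign a random number to each debtor and sort in descending order.
    keyed = [(rng.random(), debtor) for debtor in debtors]
    keyed.sort(key=lambda pair: pair[0], reverse=True)

    # Step 2: walk the random order and keep the first n debtors of each stratum.
    chosen = defaultdict(list)
    for _, debtor in keyed:
        stratum = debtor["stratum"]
        if len(chosen[stratum]) < sample_sizes.get(stratum, 0):
            chosen[stratum].append(debtor["debtor_id"])
    return dict(chosen)
```

Fixing and recording the seed makes the selection reproducible, which can support the quality assurance of the sample selection process.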
Line 229: Line 213:
 
Experience suggests that some parties can struggle to select samples randomly. Therefore, following selection of the sample, the party responsible for selecting the sample should sign a declaration that appropriate measures have been taken to ensure the sample is random, and the NCA should ensure that the sample selection process has been quality assured.

Experience suggests that some parties can struggle to select samples randomly. Therefore, following selection of the sample, the party responsible for selecting the sample should sign a declaration that appropriate measures have been taken to ensure the sample is random, and the NCA should ensure that the sample selection process has been quality assured.
  
==== 3.5.5 Step 5 Select The Reserve Sample ====
+
=== Step 5 Select The Reserve Sample ===
  
 
Together with the main sample, the NCA bank team will select a reserve sample. Its purpose is to allow the replacement of files under very precise circumstances, explained in Section 4.4 and Chapter 6, and to check anomalies in the projection of findings phase. This section outlines how the reserve sample is selected while preserving all the attributes defined for the main sample, such as representativeness, non-bias, sufficiency, etc.

Together with the main sample, the NCA bank team will select a reserve sample. Its purpose is to allow the replacement of files under very precise circumstances, explained in Section 4.4 and Chapter 6, and to check anomalies in the projection of findings phase. This section outlines how the reserve sample is selected while preserving all the attributes defined for the main sample, such as representativeness, non-bias, sufficiency, etc.
  
=== 3.5.5.1 Step 5.1 Calculate The Sample Size For The Reserve Sample ===
+
==== 5.1 Calculate The Sample Size For The Reserve Sample ====
  
 
The calculation of the reserve sample size is a parallel step to the calculation of the main sample size. The data required is the same as for the main sample, and the reserve sample size will be calculated right after the main sample size has been calculated.

The calculation of the reserve sample size is a parallel step to the calculation of the main sample size. The data required is the same as for the main sample, and the reserve sample size will be calculated right after the main sample size has been calculated.
  
** Calculation approach **
+
'''Calculation approach'''
  
 
The reserve sample, when combined with the main sample, can never exceed the total number of debtors in the stratum. Given “N” debtors per stratum and a main sample size of “n*”, the reserve sample size is calculated using the following expression:

The reserve sample, when combined with the main sample, can never exceed the total number of debtors in the stratum. Given “N” debtors per stratum and a main sample size of “n*”, the reserve sample size is calculated using the following expression:
Line 243: Line 227:
 
* R = min(n*, N – n*)
 
  
Figure 13 below illustrates the reserve sample size for the example large corporate portfolio.
+
==== 5.2 Designate Specific Debtors For The Reserve Sample ====
 
 
 
 
=== 3.5.5.2 Step 5.2 – Designate Specific Debtors For The Reserve Sample ===
 
  
 
The selection of the specific reserve sample debtors will be carried out after the selection of the main sample. The required dataset is therefore the same, excluding those files that have already been selected, and the approach is also the same as described above.

Latest revision as of 16:42, 10 June 2021

Approach To Selecting A Stratified Sample

The AQR Manual approach to selecting a portfolio sample consists of five steps. These steps are not necessarily consecutive.

Step 1. Define Perimeter Of Selectable Debtors

Some parts of each portfolio will be excluded from sampling (and therefore projection of findings). The exclusions are:

  • Retail exposures other than retail mortgages (i.e. retail SMEs and retail others). These exposures will be reviewed through the collective provisioning review (see AQR Manual Section on the collective provisioning review). Retail mortgages shall also be assessed through the collective provisioning review; however, critical inputs for the calibration of the collective provisioning parameters shall be sourced through the review of files and collaterals;
  • Portfolios that have not been selected for Phase 2;
  • Individual debtors from selected portfolios that are externally rated, where this rating is better than an ECAI Credit Quality Step 4, as defined in the loan tape descriptive Excel. For these debtors the risk of material misstatements is negligible;
  • Corporates with both Debt/EBITDA < 1 and Equity/Assets > 50% based on audited accounts that are less than 12 months old;
  • Debtors that have been 95% provisioned or more.


1.1 Calculation Approach

Loan tape data is provided in three different views: debtor view, facility view and collateral view, as described in Section 0. This subsection outlines how these three views have to be combined to prepare the sampling dataset, which is defined at the debtor level and aggregates past due and LTV information up to that level. For the avoidance of doubt, each debtor represents one line in the sampling database, except for retail exposures, for which each facility represents one line in the sampling database.

The first task is to prepare the sampling dataset, which contains the fields described in the following Table for each debtor (or facility for RRE). As the loan tape for RRE is collected at the facility level, throughout the description of the sampling process in this Chapter, “debtor” should be read as “facility” for RRE.

The third task is to exclude from the collated dataset the portfolios and debtors that are not subject to credit file review:

  • Portfolio is not among the portfolios selected during Phase 1;
  • Portfolio = Retail SME;
  • Portfolio = Other retail;
  • CQS better than 4;
  • Both Debt/EBITDA < 1 and Equity/Assets > 50%;
  • Provisions > 95% of Debtor exposure.
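
The exclusion criteria above can be sketched as a simple filter. This is a minimal illustration only: the `Debtor` fields and portfolio labels are hypothetical names, not the loan tape field names, and the sketch assumes that a "better" CQS means a lower step number. It follows the strict "> 95%" reading from the list above and does not check the age of the audited accounts.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical portfolio labels used only for this sketch.
EXCLUDED_PORTFOLIOS = {"Retail SME", "Other retail"}


@dataclass
class Debtor:
    debtor_id: str
    portfolio: str
    portfolio_selected: bool           # portfolio selected during Phase 1
    cqs: Optional[int]                 # ECAI Credit Quality Step, None if unrated
    debt_to_ebitda: Optional[float]
    equity_to_assets: Optional[float]  # fraction, e.g. 0.55 for 55%
    exposure: float
    provisions: float


def is_selectable(d: Debtor) -> bool:
    """Return True if the debtor stays in the sampling perimeter."""
    if not d.portfolio_selected:
        return False
    if d.portfolio in EXCLUDED_PORTFOLIOS:
        return False
    # Assumption: "better than 4" means a lower step number (1, 2 or 3).
    if d.cqs is not None and d.cqs < 4:
        return False
    # Both ratio conditions must hold for the exclusion to apply.
    if (d.debt_to_ebitda is not None and d.equity_to_assets is not None
            and d.debt_to_ebitda < 1 and d.equity_to_assets > 0.5):
        return False
    # Provisions above 95% of debtor exposure.
    if d.exposure > 0 and d.provisions > 0.95 * d.exposure:
        return False
    return True
```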


The general convention about how to treat missing values applies to this dataset: “not applicable” will be designated as “N/A” for text and “11111111111” for numeric fields; whereas “missing information” will be designated as “MISS” for text and “99999999999” for numeric fields.
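
As an illustration of the missing value convention, the sentinel values can be decoded before any numeric processing so that they are never mistaken for real exposures or ratios. The function name and the returned labels are hypothetical and chosen only for this sketch.

```python
# Sentinel values from the missing value convention described above.
NOT_APPLICABLE_TEXT = "N/A"
NOT_APPLICABLE_NUMERIC = 11111111111
MISSING_TEXT = "MISS"
MISSING_NUMERIC = 99999999999


def classify_value(value):
    """Return 'not_applicable', 'missing' or 'value' for a loan tape field."""
    if value in (NOT_APPLICABLE_TEXT, NOT_APPLICABLE_NUMERIC):
        return "not_applicable"
    if value in (MISSING_TEXT, MISSING_NUMERIC):
        return "missing"
    return "value"
```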

Step 2. Stratify Portfolio

Every portfolio will be split into strata. This stratification enables a manageable sample size, while maintaining high standards of accuracy and representativeness of the sample. Stratification will be based upon the criteria of exposure size and riskiness. Figure 5 below illustrates how each portfolio is divided into strata and how the stratified sample is selected. Matrix numbers represent the percentage of observations selected from each bucket, from an example large corporate portfolio.

2.1 Stratify By Riskiness Buckets

Riskiness buckets (vertical axis of Figure 5 above) are defined using basic definitions that all significant banks should be able to provide in their loan tape (see Section 2.4), such as past due status. To simplify this distinction, forward-looking criteria, such as PD, have been avoided. The specific definitions are:

  • Default more than 12 months: Is and has been non-performing with days past due more than 12 months (internal or EBA definition);
  • Default more than six months but less than 12 months: Is and has been non-performing with days past due of more than six months but less than 12 (internal or EBA definition);
  • Default less than six months: Is and has been non-performing with days past due of less than six months (internal or EBA definition);
  • High-risk cured: Was NPE less than 12 months ago (internal or EBA definition), and currently shows any of the potential deterioration signs referred to below;
  • High risk: Has not been non-performing for the last 12 months, but currently shows one of the signs of potential deterioration defined in Table 28;
  • Normal cured: Currently has none of the high risk signs, but has been non-performing less than 12 months ago (internal or EBA definition);
  • Normal: Currently has none of the high risk signs, and has not been non-performing for at least the last 12 months.


Note: Past due definitions should respect local definition of materiality as per Article 178 of CRR.

Data required

The basis for the stratification is the sampling dataset, as per the section above. The fields required are listed in the table below.

Parameters required

Riskiness buckets will be defined through the combination of three flags: Current status flag, Time in default and Cured:

Calculation approach

To calculate the riskiness buckets, the parameters above have to be simply combined:

  • Default more than 12 months when:
    • Current status flag = Default;
    • And Time in default = More than 12 months;
    • And Cured = N/A.
  • Default more than six months but less than 12 months when:
    • Current status flag = Default;
    • And Time in default = Six to 12 months;
    • And Cured = N/A.
  • Default less than six months when:
    • Current status flag = Default;
    • And Time in default = Less than six months;
    • And Cured = N/A.
  • High-risk cured when:
    • Current status flag = High Risk;
    • And Time in default = N/A;
    • And Cured = 1.
  • High risk when:
    • Current status flag = High Risk;
    • And Time in default = N/A;
    • And Cured = 0.
  • Normal cured when:
    • Current status flag = Normal;
    • And Time in default = N/A;
    • And Cured = 1.
  • Normal when:
    • Current status flag = Normal;
    • And Time in default = N/A;
    • And Cured = 0.
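
The combination of the three flags can be expressed as a lookup table. The sketch below is illustrative: the exact spelling of the flag values and of the bucket labels is an assumption, and any combination not listed in the rules above is rejected rather than guessed.

```python
# Lookup from (Current status flag, Time in default, Cured) to riskiness bucket.
# Flag spellings and bucket labels are assumptions made for this sketch.
RISKINESS_BUCKETS = {
    ("Default", "More than 12 months", "N/A"): "Default more than 12 months",
    ("Default", "Six to 12 months", "N/A"): "Default six to 12 months",
    ("Default", "Less than six months", "N/A"): "Default less than six months",
    ("High Risk", "N/A", 1): "High-risk cured",
    ("High Risk", "N/A", 0): "High risk",
    ("Normal", "N/A", 1): "Normal cured",
    ("Normal", "N/A", 0): "Normal",
}


def riskiness_bucket(status, time_in_default, cured):
    """Return the riskiness bucket for one debtor, or raise on an invalid combination."""
    key = (status, time_in_default, cured)
    if key not in RISKINESS_BUCKETS:
        raise ValueError(f"Inconsistent flag combination: {key}")
    return RISKINESS_BUCKETS[key]
```

Rejecting unlisted combinations (for example a defaulted debtor with Cured = 1) surfaces loan tape inconsistencies early instead of silently assigning a bucket.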

2.2 Stratify By Exposure Size Buckets

Exposure size buckets (horizontal axis of Figure 5 above) are defined in three steps:

  • Top ten debtors by exposure size of each portfolio and risk bucket are sampled;
  • Smallest exposures (i.e. below the 5th percentile, meaning the 5% smallest exposures based on the total number of debtors in the portfolio, ordered by exposure size) are excluded from the analysis on the basis of the immateriality of the potential adjustment;
  • The range between the tenth debtor by exposure size and the 5th percentile (5% smallest exposures (based on total number of debtors) ordered by exposure size) is split into five buckets of the same absolute difference in exposure.


Data required

The basis for the stratification is the sampling dataset, as per the sections above. The fields required are listed in the table below.


Parameters required

For clarity:

  • A Stratum is a sub-segment of the portfolio with similar exposure size and risk classification, e.g. normal risk, exposure size bucket 1 would be an example of a Stratum. Strata is the plural of Stratum;
  • A Common Risk Strata is a group of Strata with different levels of exposure but the same risk characteristics, e.g. normal risk, exposure size bucket 1 and normal risk, exposure size bucket 2 would both be in a Common Risk Strata;
  • A Common Exposure Strata is a group of Strata with different levels of risk but the same exposure characteristics, e.g. normal risk, exposure size bucket 1 and normal cured risk, exposure size bucket 1 would both be in a Common Exposure Strata.

Exposure size buckets will be defined through the comparison of the Exposure for each debtor and a number of exposure cut-off points:

  • 5th Percentile;
  • Cut-off1;
  • Cut-off2;
  • Cut-off3;
  • Cut-off4;
  • Top10th Exposure.


These cut-offs are specific to each portfolio and riskiness buckets, meaning that, for instance, cut-off points for retail mortgages normal will be different from cut-off points for retail mortgages defaulted >12 months and different from large corporates defaulted >12 months. The steps to calculate them are explained below and illustrated in the Figure 6:

  • Calculate the 5th Percentile of exposure (by debtor) for each portfolio and riskiness bucket, i.e. determine the exposure of the debtor whose exposure is smaller than that of 95% of the other debtors in the same Common Risk Strata;
  • Identify the exposure size of the Top 10th debtor by exposure size in each Common Risk Strata;
  • Calculate the auxiliary variable “Step” as: Step = (Top10th Exposure - 5th Percentile) / 5;
  • For i = 1 to 4, calculate Cut-off i as: Cut-off i = 5th Percentile + (Step × i).
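
A minimal sketch of the cut-off calculation for one Common Risk Strata follows. The percentile method is not specified in the text, so the nearest-rank method used here is an assumption, as is the fallback to the smallest exposure when fewer than ten debtors are present.

```python
def exposure_cutoffs(exposures):
    """Return (5th percentile, [Cut-off1..Cut-off4], Top10th exposure).

    `exposures` holds one exposure per debtor for a single portfolio
    and riskiness bucket.
    """
    if not exposures:
        raise ValueError("At least one exposure is required")
    ascending = sorted(exposures)
    n = len(ascending)
    # Nearest-rank 5th percentile: rank = ceil(0.05 * n), at least 1 (assumption).
    rank = max(1, (5 * n + 99) // 100)
    p5 = ascending[rank - 1]
    # Exposure of the tenth largest debtor (or the smallest one if n < 10).
    top10th = ascending[n - min(10, n)]
    step = (top10th - p5) / 5
    cutoffs = [p5 + step * i for i in range(1, 5)]
    return p5, cutoffs, top10th
```

For 100 debtors with exposures 1, 2, ..., 100, the 5th Percentile is 5, the Top10th Exposure is 91, Step is 17.2 and the cut-offs are approximately 22.2, 39.4, 56.6 and 73.8.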


Calculation approach

Once the parameters are calculated, each debtor is allocated to the corresponding exposure size bucket:

  • Exposure size bucket = Top10 when Top10th Exposure ≤ Exposure;
  • Exposure size bucket = 5 when Cut-off4 ≤ Exposure < Top10th Exposure;
  • Exposure size bucket = 4 when Cut-off3 ≤ Exposure < Cut-off4;
  • ...
  • Exposure size bucket = 1 when 5th Percentile < Exposure < Cut-off1;
  • Exposure size bucket = 5th Percentile when Exposure ≤ 5th Percentile.
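
The allocation rules above can be sketched as follows. The sketch assumes that the elided buckets 3 and 2 follow the same pattern as buckets 5 and 4, and that the Top10 rule takes precedence if a debtor satisfies both the Top10 and the 5th Percentile conditions (possible in very small strata); neither point is stated explicitly in the text.

```python
def exposure_size_bucket(exposure, p5, cutoffs, top10th):
    """Allocate one debtor to an exposure size bucket.

    `cutoffs` is the list [Cut-off1, Cut-off2, Cut-off3, Cut-off4].
    Returns "Top10", "5th Percentile" or a bucket number "1" to "5".
    """
    if exposure >= top10th:
        return "Top10"
    if exposure <= p5:
        return "5th Percentile"
    # Bucket k+1 when exactly k cut-offs are less than or equal to the exposure.
    bucket = 1 + sum(1 for c in cutoffs if exposure >= c)
    return str(bucket)
```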

Step 3. Select The Priority Debtors

In order to anticipate the beginning of the credit file review, the “priority debtors” will be selected. This will consist of the top 10 debtors (top 5 for small granular non-retail portfolios) by exposure size per portfolio and riskiness bucket. Picking these files should be relatively straightforward, allowing credit file review to begin swiftly on completion of the loan tape. If the 10th and 11th debtors are strictly identical by exposure, then the lowest allocated value of collateral can be used to select which debtor goes into the priority debtors. If allocated collateral is also equal, then a random choice should be made.

At NCA discretion, in addition to the top 10 debtors, all debtors within the top 20 groups of connected clients (across all selected portfolios, not by portfolio/riskiness bucket) can be selected as an additional priority group, to the extent they have not already been analysed. NCAs will decide at the beginning of this step if they wish to pursue this option.

Data Required

The basis for the selection of the priority debtors is the sampling dataset, as per the sections above. The fields required are listed in the table below.

Calculation Approach

The selection of the priority debtors is as easy as picking the debtors that have been allocated to the Top10 exposure size bucket for all the portfolios and riskiness buckets. For the avoidance of doubt, this means that 70 debtors will be selected per portfolio (10 per riskiness bucket), though some debtors may belong to the same group of connected clients, and therefore be analysed together. In these circumstances, no extra priority debtors should be selected.
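
The selection for a single portfolio and riskiness bucket, including the tie-break on exposure, can be sketched as below. The dictionary keys are hypothetical, and the sketch reads the tie-break rule as preferring the debtor with the lower allocated collateral, which is one interpretation of the text. Groups of connected clients are not handled.

```python
import random


def select_priority_debtors(debtors, k=10, rng=None):
    """Pick the top-k debtors by exposure for one portfolio and riskiness bucket.

    `debtors` is a list of dicts with the hypothetical keys
    'id', 'exposure' and 'collateral'.
    """
    rng = rng or random.Random()
    # Sort by exposure (descending), then collateral (ascending),
    # then a random number so that remaining ties are broken randomly.
    keyed = [(-d["exposure"], d["collateral"], rng.random(), d) for d in debtors]
    keyed.sort(key=lambda t: t[:3])
    return [t[3] for t in keyed[:k]]
```

Passing `k=5` would cover the small granular non-retail case mentioned above.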

Step 4. Select Random Stratified Sample

The stratification of the portfolios enables sufficient audit evidence with only a few observations per stratum. This section outlines how the number of observations per stratum is defined and how individual debtors will be picked once the sample size has been calculated.

4.1 Calculate Sample Size

Not all of the strata will be sampled. In general, small exposures will not be reviewed and in the case of retail mortgage portfolios, for those debtors that do not show any evidence of current or past reasons for potential impairment, only the largest exposures will be reviewed.

The number of files sampled per stratum is defined based on the following factors:

  • The risk category of the stratum;
  • The AQR asset segment (residential real estate (RRE) vs. non-retail);
  • Whether the portfolio is granular or not (i.e. has more than 1,000 individual debtors);
  • The size of the portfolio;
  • The number of debtors in the stratum.


Data required

The basis for the calculation of the sample size is the sampling dataset, as per the sections above. The fields required are listed in the table below.

Parameters required

The parameters required to determine the statistical sufficiency of the sample are provided by the CPMO. The parameters are shown in the Table below.

NCA bank teams may apply the parameters for small concentrated non-retail portfolios when the total RWA of the portfolio is less than 5% of the total credit RWA of the bank and the top 50 debtors account for at least 40% of the total exposure in the portfolio. NCA bank teams may petition to apply these parameters where the total RWA of the portfolio is between 5% and 10% of the total credit RWA of the bank, the top 50 debtors account for at least 40% of the total exposure in the portfolio, and the number of files selected for the bank is greater than the expected number of files communicated by the CPMO at the end of Phase 1. The following subsection explains how these parameters are applied.
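
The two conditions for the small concentrated parameters can be checked mechanically. In the sketch below the function name, argument names and return labels are hypothetical, and the 5% and 10% boundaries of the petition range are treated as inclusive, which the text does not specify.

```python
def small_concentrated_status(portfolio_rwa, bank_credit_rwa, exposures,
                              files_selected, files_expected):
    """Return 'eligible', 'petition' or 'not eligible' for one portfolio.

    `exposures` holds one exposure per debtor in the portfolio.
    """
    rwa_share = portfolio_rwa / bank_credit_rwa
    top50_share = sum(sorted(exposures, reverse=True)[:50]) / sum(exposures)
    if top50_share < 0.40:
        return "not eligible"
    if rwa_share < 0.05:
        return "eligible"
    # Petition range, boundaries assumed inclusive.
    if rwa_share <= 0.10 and files_selected > files_expected:
        return "petition"
    return "not eligible"
```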

Calculation approach

The first step in the calculation is to allocate exposure and number of debtors (after exclusions) by stratum, as illustrated in the following figure.

The number of observations is then looked up for each stratum from the table above. In doing so, the correct set of corporate parameters (granular, non-granular or small and granular) should be looked up, depending on the number of observations in the portfolio after exclusions.

If forbearance information is not available to determine the high risk segment and no conservative proxy is available (as described in section on DIV), the sample size for normal cured and normal should be increased by a factor of 4 (up to the total population of the stratum). For instance, if forbearance/restructuring information is not available for the above example, the revised sample size will be:


4.2 Select Specific Debtors

To ensure that the sample is representative and unbiased, random sampling will be applied to select specific debtors.

Data required

The basis for the selection of specific debtors is the sampling dataset, as per the sections above. The fields required are listed in the Table below.

Calculation approach

The approach to select specific debtors is:

  • Ensure that the portfolio follows a random order by assigning a randomly generated number (e.g. SAS’ ranuni(seed)) to each debtor and sorting in descending order. ISA 530, Appendix 4, Paragraph a, describes this as “Random selection (applied through random number generators, for example, random number tables)”;
  • Starting with the first debtor in the randomly sorted list, select the first “n” debtors, for each stratum where “n” is the total sample size for each stratum described in the previous section.
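
The two steps above can be sketched in Python as an alternative to the SAS functions mentioned in the text. The dictionary keys and the `sample_sizes` mapping are hypothetical; recording the seed makes the selection reproducible for later quality assurance.

```python
import random
from collections import defaultdict


def select_stratified_sample(debtors, sample_sizes, seed):
    """Select the first n debtors per stratum from a randomly ordered list.

    `debtors` is a list of dicts with the hypothetical keys 'id' and 'stratum';
    `sample_sizes` maps each stratum to its sample size n.
    """
    rng = random.Random(seed)
    # Assign a random number to each debtor and sort in descending order.
    keyed = sorted(((rng.random(), d) for d in debtors),
                   key=lambda t: t[0], reverse=True)
    taken = defaultdict(int)
    sample = []
    for _, debtor in keyed:
        stratum = debtor["stratum"]
        if taken[stratum] < sample_sizes.get(stratum, 0):
            sample.append(debtor)
            taken[stratum] += 1
    return sample
```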

Alternatively, typical data management software offers solutions to run stratified samples easily (e.g. SAS’ PROC SURVEYSELECT combined with the statement “strata”). The NCA bank team may use these solutions as long as the randomness of the selection is ensured.

Experience suggests that some parties can struggle to select samples randomly. Therefore, following selection of the sample, the party responsible for selecting the sample should sign a declaration that appropriate measures have been taken to ensure the sample is random, and the NCA should ensure the sample selection process has been quality assured.

Step 5. Select The Reserve Sample

Together with the main sample, the NCA bank team will select a reserve sample. Its purpose is to allow the replacement of files under very precise circumstances, explained in Section 4.4 and Chapter 6, and to check anomalies in the projection of findings phase. This section outlines how the reserve sample is selected while preserving all the attributes defined for the main sample, such as representativeness, non-bias and sufficiency.

5.1 Calculate The Sample Size For The Reserve Sample

The calculation of the reserve sample size is a parallel step to the calculation of the main sample size. The data required is the same as for the main sample, and the reserve sample size will be calculated right after the main sample size has been calculated.

Calculation approach

The reserve sample, when combined with the main sample, can never exceed the total number of debtors in the stratum. Given “N” debtors per stratum and a main sample size of “n*”, the reserve sample size is calculated using the following expression:

  • R = min(n*, N – n*)
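
As a small worked illustration of the expression, with a hypothetical function name: a stratum of 100 debtors with a main sample of 10 gives a reserve of 10, while a stratum of 15 debtors with a main sample of 10 gives a reserve of only 5, because only 5 debtors remain.

```python
def reserve_sample_size(n_stratum, n_main):
    """Return R = min(n*, N - n*) for one stratum."""
    if not 0 <= n_main <= n_stratum:
        raise ValueError("Main sample size must be between 0 and N")
    return min(n_main, n_stratum - n_main)
```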

5.2 Designate Specific Debtors For The Reserve Sample

The selection of the specific reserve sample debtors will be carried out after the selection of the main sample. The required dataset is therefore the same, excluding those files that have already been selected, and the approach is also the same as described above.
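
A minimal sketch of this designation for one stratum is shown below. The function and argument names are hypothetical, and the sketch assumes every main sample identifier belongs to the stratum. It shuffles the remaining debtors, which gives the same kind of random order as assigning random numbers and sorting.

```python
import random


def select_reserve_debtors(stratum_ids, main_sample_ids, seed):
    """Pick the reserve debtors for one stratum, excluding the main sample."""
    main = set(main_sample_ids)
    remaining = [i for i in stratum_ids if i not in main]
    # Reserve size R = min(n*, N - n*).
    reserve_size = min(len(main), len(stratum_ids) - len(main))
    rng = random.Random(seed)
    rng.shuffle(remaining)
    return remaining[:reserve_size]
```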