How to Build an SME Credit Scorecard

From Open Risk Manual

This article covers the stages involved in building and deploying an SME Credit Scorecard in a business context. It is a specific instance of the more general How to Build a Credit Scorecard entry; the focus here is primarily on the attributes unique to SME Lending.

SME Credit Scorecard Development Stages

Stage 1: Preliminary considerations

This stage defines the scope and objectives of the credit scorecard project. Indicatively, the documentation at this stage will address some of the following:

Selecting the type of scorecard

There is a very large variety of possible credit scorecards. Selecting the right type requires identifying the concrete needs of the project in terms of capabilities and functionality, but also the practicalities of implementation (availability of data, computer systems, human expertise, degree of automation). Some relevant key decision points are as follows:

Availability of Data

Statistical Models have minimum requirements on data availability and Data Quality. In the absence of available data (or for other reasons) one may opt instead for more judgemental or Expert Based Models.

Regulatory Context

Credit Scorecards used by regulated financial institutions must comply with the requirements of all regulatory bodies involved. This may place constraints on data requirements, model explainability etc.

Success Criteria Stage 1

  • Identifying the viability of the project in terms of availability of information, resources and pathways towards further development
  • Defining the ultimate success criteria for accepting the outcome of the development

Stage 2: Scorecard Development

The specifics of the scorecard development process depend on the type of scorecard. The following is a list of activities that will generally be required for most common types. We split the list into two:

  • the practical side which we might term the Data Engineering component and
  • the conceptual side, which we might term the Data Science component

Stage 2a: Practical Development Steps (Data Engineering)

The steps in this sequence aim to provide suitable resources for the development of the required scorecard

  • Data Collection. This step establishes links with existing databases or files. Depending on the available systems it involves writing and testing queries and filters, and importing data
  • Data Cleaning. This step involves reviewing and establishing the Data Quality of the collected data
  • Missing Data. In this step (where appropriate) Missing Data may be remedied with Missing Data Imputation
  • Creating a Master Data Table. This table of potential characteristics and outcomes (see Credit Event) is the basic input to quantitative estimation using common statistical models
  • Setting up a machine learning estimation framework (if applicable). This can be achieved using either a commercial or open source toolkit. Judgemental scorecards also need some form of implementation (e.g. spreadsheets)


The above steps are not necessarily sequential, nor do they strictly precede the data science component (for example, after pursuing a certain modelling approach it may transpire that there are additional data requirements).
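As an illustration, the collection, cleaning, imputation and master-table steps above might be sketched as follows. All file and column names (loans.csv, outcomes.csv, borrower_id, turnover, defaulted) are hypothetical placeholders, not part of any standard schema.

```python
# Illustrative sketch of the Stage 2a data engineering steps.
# File and column names are hypothetical placeholders.
import pandas as pd

def build_master_table(loans_path: str, outcomes_path: str) -> pd.DataFrame:
    # Data Collection: import from existing files / databases
    loans = pd.read_csv(loans_path)
    # Data Cleaning: e.g. remove duplicate borrower records
    loans = loans.drop_duplicates(subset="borrower_id")
    # Missing Data Imputation: e.g. fill missing turnover with the median
    loans["turnover"] = loans["turnover"].fillna(loans["turnover"].median())
    # Master Data Table: join characteristics to labelled Credit Event outcomes
    outcomes = pd.read_csv(outcomes_path)
    return loans.merge(outcomes, on="borrower_id", how="inner")
```

In a real development the cleaning and imputation rules would of course be scorecard-specific and documented as part of the Data Quality review.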

Success Criteria Stage 2a

  • Availability of the resources required for scorecard development (a clean Master Data Table and a working estimation framework)

Stage 2b: Conceptual Development Steps (Data Science)

Assuming the resources of the previous sequence are available, the conceptual development aims to identify a specific model to underpin the scorecard. There may be legal, regulatory or business (cost) limitations on the available paths. The relevant concepts for quantitative (statistical) development are:

  • Historical Sample Selection (the relevant population, temporal period, any exclusions)
    • The Sample Size must be sufficiently large, include labelled (known) outcomes, and be free of selection bias
    • This step is particularly important both for achieving Model Stability and, in a regulatory context, for assuring the Representativeness of the data
  • Population Segmentation (optional). It is possible that the scorecard will be applied to distinct sub-segments of the relevant population
  • The precise Credit Event definition. This is essentially what the scorecard aims to predict. It may have implications for data availability, e.g. relaxing the definition may significantly increase the event rate
  • Identification of Characteristics to include in the model. There is an enormous variety of possible characteristics depending on the type of credit risk being evaluated
  • Characteristic Selection. Narrowing down the list of characteristics, e.g. using Backward Selection
    • The objective is to identify characteristics that first in isolation and ultimately in combination improve the predictive power of the model
    • A variety of metrics is available to help rank characteristics
  • Transformation Methodologies. Investigating the application of non-linear transformations to characteristics (wikipedia:Feature Engineering)
  • Selection of Model Family (e.g. logistic regression or any of the catalog of Credit Scoring Models)
  • Performing the actual Statistical Fitting, usually by running a statistical algorithm such as maximum likelihood estimation
  • Reviewing Model Performance (model accuracy, stability over time, out-of-sample performance etc.)
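A minimal sketch of the Statistical Fitting step, assuming a logistic regression fitted by maximum likelihood via gradient ascent (the data and characteristics are synthetic; a production development would use a vetted statistical library):

```python
# Minimal maximum-likelihood fit of a logistic regression scorecard model.
# Illustrative only; not a production implementation.
import numpy as np

def fit_logistic(X: np.ndarray, y: np.ndarray,
                 lr: float = 0.1, n_iter: int = 2000) -> np.ndarray:
    X = np.column_stack([np.ones(len(X)), X])   # add intercept column
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = np.clip(X @ w, -30, 30)             # numerical safety
        p = 1.0 / (1.0 + np.exp(-z))            # predicted event probability
        w += lr * X.T @ (y - p) / len(y)        # log-likelihood gradient step
    return w

def predict_proba(X: np.ndarray, w: np.ndarray) -> np.ndarray:
    X = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
```

Model Performance would then be reviewed on a held-out (out-of-sample) portion of the historical sample, per the last step above.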

Classic SME Scoring Models

  • Altman Z Score Methodology (as applied to SME Lending)
  • Ohlson O Score
  • Logistic Regression
  • Neural Networks
  • Cox Proportional Hazard Model
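As an example of the first entry, the revised Altman Z'-score for private firms is a simple linear combination of financial ratios. The coefficients below are the commonly cited revised ones; verify them against the original Altman reference before any production use.

```python
# Revised Altman Z'-score for private firms (commonly cited coefficients;
# verify against the original reference before use).
def altman_z_private(working_capital: float, retained_earnings: float,
                     ebit: float, book_equity: float, sales: float,
                     total_assets: float, total_liabilities: float) -> float:
    x1 = working_capital / total_assets      # liquidity
    x2 = retained_earnings / total_assets    # cumulative profitability
    x3 = ebit / total_assets                 # operating profitability
    x4 = book_equity / total_liabilities     # leverage
    x5 = sales / total_assets                # asset turnover
    return 0.717 * x1 + 0.847 * x2 + 3.107 * x3 + 0.420 * x4 + 0.998 * x5
```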

Success Criteria Stage 2b

  • Delivery of a quantitative model with acceptable performance across a range of criteria

Stage 3: Model Validation

The Model Validation stage will include the following steps, depending on the rigour / independence required:

  • Review of conceptual methodology
  • Review of practical development steps
  • Independent replication of the model

Success Criteria Stage 3

  • Delivery of a Model Validation Report with assurance that Model Risk is within the organization's Risk Appetite and/or suggestions for further work required

Stage 4: Model Deployment

Depending on the systems of the entity using the scorecard, the following will be typical steps:

  • Implementation of the developed model as a scorecard inference system in production systems. (Production systems typically do not require ability to re-estimate models on the fly)
  • Selection of operational parameters (like Cut-Off Score) where applicable
  • User training
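The Cut-Off Score step might look as follows in a production inference system. The score scale, threshold of 600 and grey-zone width are purely hypothetical operational parameters.

```python
# Hypothetical decision rule applying a Cut-Off Score at inference time.
# Score scale and thresholds are illustrative, not a standard.
def decide(score: float, cut_off: float = 600.0, grey_zone: float = 50.0) -> str:
    """Map a scorecard score to an accept / refer / decline decision."""
    if score >= cut_off:
        return "accept"
    if score >= cut_off - grey_zone:
        return "refer"   # borderline cases routed to manual review
    return "decline"
```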

Success Criteria Stage 4

  • Acceptance testing of the implementation by the operating unit

Stage 5: Model Monitoring

Monitoring is done at the appropriate time-scale (e.g., from daily to quarterly). Various levels of monitoring might be used. Monitoring typically produces updated values for the metrics already used in the selection / validation of the model. This primarily includes:

  • Portfolio evolution statistics
  • Model performance statistics
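As one concrete example of a portfolio evolution statistic, the Population Stability Index (PSI) compares the current score distribution against the development sample. The decile bucketing below is one common, but not the only, convention.

```python
# Population Stability Index: a common drift metric for scorecard monitoring.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, n_bins: int = 10) -> float:
    # Bucket edges from deciles of the development ("expected") sample
    edges = np.quantile(expected, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf    # cover the full score range
    e = np.histogram(expected, bins=edges)[0] / len(expected)
    a = np.histogram(actual, bins=edges)[0] / len(actual)
    e = np.clip(e, 1e-6, None)               # guard against log(0)
    a = np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))
```

Rule-of-thumb readings often treat PSI below roughly 0.1 as stable and above 0.25 as a material shift, but these thresholds are conventions, not a standard.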

Success Criteria Stage 5

  • Timely and informative monitoring reports that allow pro-active intervention (if necessary)

Stage 6: Adjustment / Decommissioning

When a monitoring report or other insight suggests that the scorecard in production is no longer fit-for-purpose, then depending on the context the scorecard must be adjusted or replaced. Model adjustment can be:

  • minimal (e.g. re-estimation using an adjusted dataset)
  • substantial (e.g. introduction of a new characteristic, changing segmentation)
  • significant (e.g. changing the model family)

Depending on the context (e.g. regulation), any significant change may be classified as a new model and trigger a full validation / implementation cycle.

Success Criteria Stage 6

  • Flexibility to allow adjustments without undue cost, downtime or loss of institutional memory / continuity

Issues and Challenges

  • Developing and using quantitative risk models such as credit scorecards is full of pitfalls. Check The Zen of Modeling for a high-level list.
