How to Identify Data Outliers

From Open Risk Manual

A standardized procedure for systematically identifying data outliers in a univariate sense comprises the following steps:

Issues and Challenges

This methodology is intended as a powerful filter that can quickly flag candidate outliers across large sets of variables, but it is not an automatic solution.

  • Outliers are ultimately defined relative to a specific Data Generation Process, Data Collection Process, and data modelling and usage context. Hence what counts as an outlier can change with that context.
  • The above methodology does not apply to detecting outliers in a multivariate sense.
  • The above methodology is less suited to detecting outliers in categorical data.
  • The above methodology is less suited to data with complicated multi-modal distributions.
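
The multi-modal limitation can be illustrated with a small sketch. The procedure's own steps are not reproduced in this section, so the example below substitutes a common univariate filter, Tukey's interquartile-range (IQR) fences, purely as an assumption. The function name `iqr_outliers`, the multiplier `k`, and both samples are hypothetical and chosen only for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: this IQR rule is a generic stand-in for a univariate
    filter, not the procedure described by the source.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical unimodal sample with one suspicious value (25.0).
unimodal = [9.8, 10.1, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 25.0]

# Hypothetical bimodal sample: clusters near 1 and near 20,
# plus one value (10.5) that belongs to neither cluster.
bimodal = [1.0, 1.1, 0.9, 1.2, 1.0, 20.0, 20.1, 19.9, 20.2, 20.0, 10.5]

flagged_unimodal = iqr_outliers(unimodal)  # [25.0]
flagged_bimodal = iqr_outliers(bimodal)    # []

print("unimodal:", flagged_unimodal)
print("bimodal:", flagged_bimodal)
```

In the unimodal sample the filter flags 25.0 as expected. In the bimodal sample the two clusters stretch the interquartile range so far that the fences enclose every value, and 10.5 passes unflagged even though it sits between the clusters. Flagged values, and the absence of flags, therefore still need review in context.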
