Difference between revisions of "How to Identify Data Outliers"

From Open Risk Manual
(Created page with "== How to Identify Data Outliers == A standardized procedure for systematically identifying data outliers ''in a univariate sense'' comprises of the follo...")
 
(No difference)

Latest revision as of 13:28, 3 September 2019

How to Identify Data Outliers

A standardized procedure for systematically identifying data outliers in a univariate sense comprises of the following steps:

Issues and Challenges

This methodology aims to provide a powerful filter that can quickly identify outliers in large sets of variables but it does not provide an automatic solution.

  • Outliers are ultimately defined in a certain Data Generation Process, Data Collection Process and data modelling and usage context. Hence what is an outlier can change depending on that context
  • The above methodology does not apply to detecting outliers in a multivariate sense
  • The above methodology is less suited to detect outliers in categorical data
  • The above methodology is less suited for data with complicated multi-modal distributions

References

Contributors to this article

» Wiki admin