Descriptive Statistics

From Open Risk Manual

Definition

Descriptive Statistics is a collection of summary statistics (measures, metrics) that quantitatively describe (summarize) aspects of a Dataset. Descriptive statistics is distinct from Inferential Statistics (Statistical Models), which in general aims to infer underlying truths about the population that generated the data.

The core concept in the construction of descriptive statistics is that of a Distribution, the realization of observations in sufficiently large numbers that both necessitates a summary and ensures that summary statistics are relevant.

Classification

The precise collection of descriptive statistics that is adequate to summarize a given dataset depends on a number of factors: the type of dataset (time series, panel data, etc.), its dimensionality, the volume of data, and the context in which the data and their summary are used. In general, the following types of statistics are used:

  • Counts and other Aggregate figures (Sums)
  • Various forms of average (central tendency) that are used to indicate a typical or expected value for a set of data. There are many choices (e.g. mean, median, mode), each with its own pros and cons.
  • Measures of variability or spread (the range of a variable, standard deviation or other measures)
  • Quartiles or percentiles, which divide the distribution into parts containing equal occurrence frequencies
  • Histograms providing a visual representation of the distribution
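
As an illustration, the types of statistics listed above can be sketched for a one-dimensional numerical dataset using only the Python standard library. The function name describe, the sample values and the choice of four equal-width histogram bins are assumptions made for this example and are not prescribed by the manual. The quartile convention used here (the "inclusive" method) is one of several in common use.

```python
import statistics


def describe(values, bins=4):
    """Return a dictionary of basic descriptive statistics (illustrative sketch)."""
    data = sorted(values)
    lo, hi = data[0], data[-1]

    # Quartiles: three cut points splitting the data into four parts
    # of (approximately) equal occurrence frequency.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

    # Histogram: counts of observations in equal-width bins over [lo, hi].
    # If all values are identical the width would be zero, so fall back to 1.0.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for x in data:
        idx = min(int((x - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1

    return {
        "count": len(data),                   # counts and aggregates
        "sum": sum(data),
        "mean": statistics.mean(data),        # measures of central tendency
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "range": hi - lo,                     # measures of spread
        "std": statistics.pstdev(data),       # population standard deviation
        "quartiles": (q1, q2, q3),
        "histogram": counts,
    }


# Hypothetical sample data, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that the mean (5) and the median (4.5) of this sample differ, which illustrates why the choice of average matters when the data are not symmetric. The histogram here is returned as a list of bin counts; a plotting library would be needed to render it visually.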