Stratification is the sorting of data attributes concerning any group of entities (people, objects, etc.) into distinct, homogeneous subgroups and, optionally, the representation of subgroup information in aggregate form or the sampling of individuals from the subgroups.

The need for stratification arises from the accumulation of large amounts of granular data, which poses cognitive problems in understanding and communicating the information content of such data sets.


  • Stratification dimensions can be based on any Characteristic of the group; the attributes of a characteristic can be used as strata either directly or after aggregation.
  • The strata defining each stratification dimension should form an exhaustive and mutually exclusive partition of the group (i.e. every member of the group belongs to one and only one stratum).
  • Characteristics captured as categorical variables can directly suggest the relevant strata (the same holds for an Ordinal Variable).
  • Characteristics captured as numerical variables must first be converted into binned variables.
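
The points above can be illustrated with a minimal Python sketch. It assumes a hypothetical "age" characteristic, and the bin edges, labels and records are invented for illustration only. A numerical value is first binned, the bins then serve as strata, and a simple count confirms that every record falls into exactly one stratum.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical bin edges and labels for a numerical "age" characteristic.
# There is one more label than edges, so every possible value maps to a bin.
AGE_EDGES = [18, 35, 65]
AGE_LABELS = ["<18", "18-34", "35-64", "65+"]


def age_stratum(age):
    """Convert a numerical age into a binned (categorical) value."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


def stratify(records, key):
    """Sort records into strata; key(record) returns the stratum label."""
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    return dict(strata)


# Illustrative records (not real data).
people = [
    {"name": "A", "age": 12},
    {"name": "B", "age": 18},
    {"name": "C", "age": 34},
    {"name": "D", "age": 35},
    {"name": "E", "age": 70},
]

strata = stratify(people, lambda p: age_stratum(p["age"]))

# Aggregate representation: number of members per stratum.
counts = {label: len(members) for label, members in strata.items()}

# Partition check: each record was assigned to exactly one stratum,
# so the stratum sizes add up to the size of the group.
assert sum(counts.values()) == len(people)
```

Because the binning function returns a single label for any input, the strata are mutually exclusive by construction; exhaustiveness follows from the open-ended first and last bins.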


  • Stratification can be used to check or ensure a target Allocation of subgroups.
  • Stratification can be used to control for confounding variables (variables other than those the researcher is studying), thereby making it easier for the researcher to detect and interpret relationships between variables.
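
As a sketch of the allocation use case, the following Python example computes the share of each stratum and draws a sample with the same fraction from every stratum, so that the sample preserves the population allocation. The function names, the stratum labels and the data are hypothetical, and proportional sampling is only one of several possible allocation schemes.

```python
import random


def allocation(strata):
    """Return the share of the total held by each stratum."""
    total = sum(len(members) for members in strata.values())
    return {label: len(members) / total for label, members in strata.items()}


def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction of members, without replacement, from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {
        label: rng.sample(members, round(fraction * len(members)))
        for label, members in strata.items()
    }


# Illustrative population already sorted into two strata (not real data).
population = {
    "retail": list(range(6)),
    "corporate": list(range(100, 104)),
}

sample = stratified_sample(population, fraction=0.5)

# Compare the allocation of the sample against that of the population.
target = allocation(population)
achieved = allocation(sample)
max_gap = max(abs(achieved[label] - target[label]) for label in target)
```

Here `max_gap` measures how far the sample allocation deviates from the target; with stratum sizes that divide evenly, as in this example, the deviation is zero.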

Issues and Challenges

  • Stratification is related to, but distinct from, Descriptive Statistics, that is, summary statistics that quantitatively describe or summarize features of a collection of information.

