The normal distribution, also called the Gaussian distribution, is a type of statistical distribution. When you plot normally distributed data against its frequency (e.g. height on the x-axis and frequency on the y-axis), you get a bell-shaped curve. Data that does not follow a normal distribution is called non-parametric data.
The normal distribution is one of the distributions that numerical data can follow. Categorical data cannot have a normal distribution.
Parametric vs non-parametric data
Statistical tests that are undertaken on parametric data (i.e. data which has a normal distribution) are called parametric tests. Undertaking parametric tests on data that is not parametric can lead to erroneous results. There are specific non-parametric statistical tests for non-parametric data. Therefore, prior to undertaking statistical analysis, it is important to assess whether the data has a normal distribution (i.e. is parametric) or not.
Normality tests
Graph
How do you assess whether the numerical data you have collected follows a normal distribution (i.e. is parametric)? In statistics, it is always good practice to plot the data as a graph (e.g. a histogram). The shape of the graph gives a good indication of whether the data follows a normal distribution, i.e. whether it resembles a bell shape.
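As a rough illustration of the graphical check, the sketch below bins a sample and prints a text histogram using only the Python standard library. The helper name `text_histogram`, the simulated `heights` sample and its parameters (mean 170, SD 10) are assumptions made for this example; they do not come from any real data set.

```python
import random

def text_histogram(values, bins=10, width=40):
    """Return (bin_start, count, bar) rows for a simple text histogram."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value would fall just past the last bin, so clamp it.
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    return [(lo + i * step, c, "#" * round(width * c / peak))
            for i, c in enumerate(counts)]

random.seed(0)  # fixed seed so the sketch is repeatable
# Hypothetical sample: 1000 simulated heights in cm
heights = [random.gauss(170, 10) for _ in range(1000)]

for start, count, bar in text_histogram(heights):
    print(f"{start:7.1f} {count:4d} {bar}")
```

For normally distributed data the bars should be longest in the middle bins and taper off towards both ends. A plotting library would give a nicer picture, but the idea is the same.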
Statistical tests
There are also statistical tests to assess whether the data is parametric. Examples of these tests are:
- Anderson-Darling test
- D’Agostino-Pearson omnibus normality test
- Shapiro-Wilk normality test
- Kolmogorov-Smirnov normality test with Dallal-Wilkinson-Lilliefors P-value
The D’Agostino-Pearson omnibus normality test is one of the commonly performed normality tests. If the p-value of the test is > 0.05, the test has found no significant departure from normality, so you could assume that the data follows a normal distribution and undertake parametric tests on it. However, if the p-value of the D’Agostino-Pearson omnibus normality test is < 0.05, then the data is treated as non-parametric and you need to use a non-parametric test on the data.
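The decision rule above can be sketched in Python, assuming NumPy and SciPy are installed. SciPy's `scipy.stats.normaltest` implements the D’Agostino-Pearson omnibus test; the wrapper `check_normality`, the 0.05 threshold default and the two simulated samples are assumptions made for this illustration.

```python
import numpy as np
from scipy import stats

def check_normality(sample, alpha=0.05):
    """Run the D'Agostino-Pearson omnibus test; return (p_value, passes)."""
    statistic, p_value = stats.normaltest(sample)
    # passes is True when the test finds no significant departure from normality
    return float(p_value), bool(p_value > alpha)

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable
# Hypothetical samples: one drawn from a normal distribution, one right-skewed
bell_shaped = rng.normal(loc=170, scale=10, size=500)
skewed = rng.exponential(scale=10, size=500)

for name, sample in [("bell_shaped", bell_shaped), ("skewed", skewed)]:
    p, passes = check_normality(sample)
    verdict = "no evidence against normality" if passes else "normality rejected"
    print(f"{name}: p = {p:.4g} ({verdict})")
```

Note that a p-value above 0.05 does not prove normality; it only means the test did not detect a departure from it, which is why it helps to look at the histogram as well.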
What if the data is non-parametric?
If the data does not have a normal distribution, then you would need to use non-parametric tests to analyse the data. However, if you are undertaking Student's t-test (which is a parametric test), provided that the sample size is large, the results are robust regardless of whether the data follows a normal distribution.
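A minimal sketch of this choice for comparing two groups is shown below, assuming NumPy and SciPy are installed. The Mann-Whitney U test is used here as one example of a non-parametric counterpart to the t-test; that pairing, the helper name `compare_two_groups` and the simulated data are assumptions of this example rather than a recommendation from the text above.

```python
import numpy as np
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    """Pick a two-sample test based on a normality check of each group."""
    both_normal = all(stats.normaltest(g).pvalue > alpha for g in (a, b))
    if both_normal:
        # Parametric route: Student's t-test for independent samples
        name, result = "Student's t-test", stats.ttest_ind(a, b)
    else:
        # Non-parametric route: Mann-Whitney U test (illustrative choice)
        name = "Mann-Whitney U test"
        result = stats.mannwhitneyu(a, b, alternative="two-sided")
    return name, float(result.pvalue)

rng = np.random.default_rng(7)  # fixed seed so the sketch is repeatable
# Hypothetical right-skewed groups; group_b is shifted upwards by 15
group_a = rng.exponential(scale=10, size=200)
group_b = rng.exponential(scale=10, size=200) + 15

test_name, p_value = compare_two_groups(group_a, group_b)
print(f"{test_name}: p = {p_value:.3g}")
```

With large samples, as noted above, the t-test may still be acceptable even when the normality check fails, so a rule like this is a starting point rather than a fixed procedure.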
Normalisation
As mentioned above, if the data is non-parametric you would need to use non-parametric tests. However, there are also techniques to convert non-parametric data into parametric data in certain circumstances. This might involve normalisation calculations or converting the data into logarithmic values. We will review normalisation in another article.
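To illustrate the logarithmic conversion, the sketch below takes a right-skewed sample, applies a natural log, and compares the skewness before and after, using only the Python standard library. The sample is simulated from a log-normal distribution, so the transformation works well by construction; real data may not behave this neatly. The helper `skewness` and the sample parameters are assumptions of this example.

```python
import math
import random

def skewness(values):
    """Sample skewness: near 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

random.seed(1)  # fixed seed so the sketch is repeatable
# Hypothetical right-skewed measurements (log-normal by construction)
raw = [random.lognormvariate(0, 1) for _ in range(2000)]
logged = [math.log(v) for v in raw]  # only valid for strictly positive values

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(logged):.2f}")
```

After the transformation it is still worth re-checking the histogram and a normality test before moving on to parametric tests.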
