Normal distribution

Ganealingam Narenthiran

6 years ago

Contents

1 Parametric vs non-parametric data

2 Normality tests

2.1 Graph

2.2 Statistical tests

3 What if the data is non-parametric

4 Normalisation

The normal distribution, also called the Gaussian distribution, is a type of statistical distribution. When you plot data which has a normal distribution against its frequency (e.g. height on the x-axis and frequency on the y-axis), you get a bell-shaped curve. Data that does not follow a normal distribution is called non-parametric data.

The normal distribution is one of the distributions that numerical data can follow. Categorical data cannot have a normal distribution.

Parametric vs non-parametric data

Statistical tests that are undertaken on parametric data (i.e. data which has a normal distribution) are called parametric tests. Undertaking parametric tests on data that is not parametric would lead to erroneous results. There are specific non-parametric statistical tests for non-parametric data. Therefore, prior to undertaking statistical analysis, it is important to assess whether the data has a normal distribution (i.e. parametric data) or not.

Normality tests

Graph

How do you assess whether the numerical data you have collected follows a normal distribution (i.e. parametric data)? In statistics, it is always good practice to plot the data as a graph (e.g. a histogram). The shape of the graph gives a good indication of whether the data follows a normal distribution, i.e. whether the shape of the graph resembles a bell.
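As a rough illustration of the graphical check, here is a minimal sketch that draws a text histogram using only the Python standard library. The height data is simulated and the helper name `text_histogram` is hypothetical; with real data you would pass your own list of measurements, and a plotting library would normally give a nicer picture.

```python
import random
from collections import Counter


def text_histogram(values, bin_width):
    """Return a dict mapping each bin's lower edge to the number of values in it."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Simulated heights in cm (hypothetical data, drawn from a normal distribution).
rng = random.Random(42)
heights = [rng.gauss(170, 10) for _ in range(1000)]

histogram = text_histogram(heights, bin_width=5)

# Print one row per bin; each '#' stands for 5 observations.
for lower_edge, count in histogram.items():
    print(f"{lower_edge:5.0f}-{lower_edge + 5:<5.0f} {'#' * (count // 5)}")
```

If the data is roughly normal, the rows should be longest around the middle and taper off on both sides, which is the bell shape turned on its side.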

Statistical tests

There are also statistical tests to assess whether the data is parametric. Examples of these tests are:

Anderson-Darling test
D’Agostino-Pearson omnibus normality test
Shapiro-Wilk normality test
Kolmogorov-Smirnov normality test with Dallal-Wilkinson-Lilliefors P-value

The D’Agostino-Pearson omnibus normality test is one of the commonly performed normality tests. If the p-value of the test is > 0.05, the test has found no significant departure from normality, so you could assume that the data follows a normal distribution and undertake parametric tests on it. However, if the p-value of the D’Agostino-Pearson omnibus normality test is < 0.05, then the data is treated as non-parametric and you need to use a non-parametric test on the data.
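The p-value rule can be sketched in Python as follows. This assumes NumPy and SciPy are installed; `scipy.stats.normaltest` is SciPy's implementation of the D’Agostino-Pearson omnibus test. The helper name `looks_normal`, the 0.05 threshold as a default argument, and both samples are illustrative assumptions, not part of any particular study.

```python
import numpy as np
from scipy import stats


def looks_normal(sample, alpha=0.05):
    """Run the D'Agostino-Pearson omnibus test and apply the p > alpha rule."""
    statistic, p_value = stats.normaltest(sample)
    return bool(p_value > alpha), float(p_value)


# Simulated data (hypothetical): one bell-shaped sample and one right-skewed sample.
rng = np.random.default_rng(0)
normal_sample = rng.normal(loc=170, scale=10, size=500)
skewed_sample = rng.exponential(scale=2.0, size=500)

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    is_normal, p = looks_normal(sample)
    print(f"{name}: p = {p:.4f}, treat as normal: {is_normal}")
```

Keep in mind that a p-value above 0.05 does not prove normality; it only means this test did not detect a departure from it, so it is worth reading the result alongside the histogram.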

What if the data is non-parametric

If the data does not have a normal distribution, then you would need to use non-parametric tests to analyse the data. However, if you are undertaking Student's t-test (which is a parametric test), provided that the sample size is large, the results are robust regardless of whether the data follows a normal distribution or not.
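One way to see why a large sample helps: the t-test works with sample means, and the means of large samples tend towards a bell shape even when the underlying data is skewed. The simulation below is only a sketch of that idea using the standard library. The exponential population, the sample sizes of 5 and 200, and the helper names `skewness` and `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: near 0 for symmetric data, positive for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(7)


def sample_means(sample_size, repeats=2000):
    """Means of repeated samples drawn from a right-skewed (exponential) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(repeats)
    ]


small = skewness(sample_means(5))     # means of small samples
large = skewness(sample_means(200))   # means of large samples

print(f"skewness of means, n = 5:   {small:.2f}")
print(f"skewness of means, n = 200: {large:.2f}")
```

The means of the larger samples should come out noticeably less skewed, which is the behaviour the t-test relies on.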

Normalisation

As mentioned above, if the data is non-parametric you would need to use non-parametric tests. However, there are also techniques to convert non-parametric data into parametric data in certain circumstances. This might involve normalisation calculations or converting the data into logarithmic values. We will review normalisation in another article.
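A minimal sketch of the logarithmic conversion, using only the Python standard library, is shown below. The data is simulated (log-normal values, chosen because they become normal after taking logs), and comparing the mean with the median is just one quick, informal symmetry check assumed here for illustration. Note that a logarithm can only be taken of values greater than zero.

```python
import math
import random
import statistics

# Simulated right-skewed data (hypothetical): log-normally distributed values.
rng = random.Random(1)
raw = [rng.lognormvariate(0, 1) for _ in range(2000)]

# Convert every value to its natural logarithm; all values must be > 0.
logged = [math.log(v) for v in raw]

# In skewed data the mean is pulled away from the median;
# in symmetric (bell-shaped) data the two are close together.
gap_raw = statistics.fmean(raw) - statistics.median(raw)
gap_logged = statistics.fmean(logged) - statistics.median(logged)

print(f"mean - median before log: {gap_raw:.2f}")
print(f"mean - median after log:  {gap_logged:.2f}")
```

After the conversion you would still check normality again, for example with a histogram or a normality test, before choosing a parametric test.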