Many statistical tests assume that our data is normally distributed. There are two broad approaches to testing this assumption. The first is to assess normality visually, by reviewing a graph of our data. The second is to assess it numerically, by conducting a normality test.

In this quick tutorial, we will show you how to test for normality using both graphs and normality tests in R. We will work with RStudio, a program that makes it easier to work with R.

## The Data

We start from the assumption that you have created or imported a data frame in R containing the variable that you want to test for normality. Please see our tutorials on importing SPSS, Excel and CSV files into R, or our tutorial on manually entering data in R.

In this tutorial, we want to assess whether the variable (vector) *polsci* in the data frame *sats_polsci* is normally distributed. We will do this using both visual and numerical methods.
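If you do not yet have data loaded and want to follow along, you can simulate a stand-in data frame. This is only a minimal sketch: the values are randomly generated and are not the actual *sats_polsci* data used in this tutorial, and the sample size, mean and standard deviation are assumptions chosen purely for illustration. Results computed on this stand-in will differ from the results reported in this tutorial.

```r
# Simulated stand-in data (NOT the tutorial's real data set).
# n = 40, mean = 60 and sd = 10 are illustrative assumptions.
set.seed(123)  # fix the random seed so the example is reproducible

sats_polsci <- data.frame(
  polsci = rnorm(40, mean = 60, sd = 10)
)

# Quick checks that the variable exists and is numeric
str(sats_polsci)
summary(sats_polsci$polsci)
```

The `$` operator in `sats_polsci$polsci` selects the *polsci* column from the data frame, which is the form used in the commands in the rest of this tutorial.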

## Visual Methods for Assessing Normality

One of the most common methods for assessing normality visually is the Q-Q plot, or Quantile-Quantile plot, which plots the quantiles of your data against the quantiles of a theoretical normal distribution. If the data points on your Q-Q plot fall close to the straight diagonal line that runs from the bottom left to the top right corner of the plot, you can assume that your data is approximately normally distributed.

Enter the following commands in the RStudio console and then press **Enter** on your keyboard to create a Q-Q plot in R:

```r
qqnorm(dataframe$variable, frame = FALSE)
qqline(dataframe$variable)
```

Replace the placeholders *dataframe* and *variable* with your own data to create your own Q-Q plot:

- **dataframe**: the name of your data frame in RStudio. The example that we are using in this tutorial is *sats_polsci*.
- **variable**: the variable (vector) in the above data frame that you want to test for normality. In our example this is *polsci*.

The command that we use to generate the Q-Q plot for our own example is:

```r
qqnorm(sats_polsci$polsci, frame = FALSE)
qqline(sats_polsci$polsci)
```

Press the **Enter** key on your keyboard. You will see your Q-Q plot in the **Plots** tab of the bottom right panel of RStudio.

In our plot, the data points fall close to the diagonal line. We can therefore conclude that our data is approximately normally distributed.
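It can help to see what a departure from the diagonal line looks like. The sketch below draws two Q-Q plots side by side, one for a simulated normal sample and one for a simulated right-skewed sample. Both samples are randomly generated for illustration only and have nothing to do with the *sats_polsci* data; the object names and the sample size of 40 are assumptions.

```r
# Simulated samples for illustration only (not the tutorial's data)
set.seed(123)
normal_sample <- rnorm(40)  # drawn from a normal distribution
skewed_sample <- rexp(40)   # drawn from an exponential (right-skewed) distribution

# Show two plots side by side, remembering the previous layout
old_par <- par(mfrow = c(1, 2))

qqnorm(normal_sample, main = "Normal sample", frame = FALSE)
qqline(normal_sample)

qqnorm(skewed_sample, main = "Right-skewed sample", frame = FALSE)
qqline(skewed_sample)

# Restore the previous plot layout
par(old_par)
```

In the skewed sample you would typically expect the points to curve away from the line at one or both ends, whereas the points for the normal sample should stay close to it along most of its length.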

## Numerical Tests for Assessing Normality

It is a good idea to combine your Q-Q plot with a numerical test for normality such as the Shapiro-Wilk test or the Kolmogorov-Smirnov test. The Shapiro-Wilk test is usually recommended for smaller sample sizes (*n* < 50), like our variable.
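This tutorial focuses on the Shapiro-Wilk test, but for reference, the Kolmogorov-Smirnov test is available in base R as `ks.test()`. The following is a minimal sketch on simulated data, not the *sats_polsci* data; the object names and the sample size of 200 are assumptions. Note that the mean and standard deviation are estimated here from the same sample that is being tested, so the resulting *p* value should be treated as approximate.

```r
# Simulated data for illustration only (not the tutorial's data)
set.seed(42)
scores <- rnorm(200, mean = 60, sd = 10)

# Compare the sample against a normal distribution whose mean and
# standard deviation are estimated from the sample itself
ks_result <- ks.test(scores, "pnorm", mean = mean(scores), sd = sd(scores))

print(ks_result)
```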

To compute the Shapiro-Wilk test, enter the following command in the RStudio console and then press **Enter** on your keyboard:

```r
shapiro.test(dataframe$variable)
```

Replace the placeholders *dataframe* and *variable* with the information about the variable that you want to test for normality as follows:

- **dataframe**: the name of your data frame in RStudio (*sats_polsci* for our example).
- **variable**: the variable (vector) in the above data frame that you want to test for normality (*polsci* for our example).

So, this is what we enter to compute the Shapiro-Wilk test for our variable:

```r
shapiro.test(sats_polsci$polsci)
```

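R prints the results of the Shapiro-Wilk test in the console, including the test statistic (*W*) and the *p* value.

The null hypothesis of the Shapiro-Wilk test is that the data is normally distributed. If the *p* value is greater than .05, we do not reject this hypothesis and can assume that the data is normally distributed. The *p* value for our test is 0.8292, so we assume that our variable is normally distributed.

If, however, the *p* value is less than or equal to .05, we reject the null hypothesis and assume that the data is *not* normally distributed.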
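The decision rule above can also be applied in code, because `shapiro.test()` returns an object whose `statistic` and `p.value` elements can be read directly. The following is a minimal sketch using simulated data rather than the *sats_polsci* data, so its *p* value will not match the 0.8292 reported above; the object names and the .05 threshold variable `alpha` are assumptions for illustration.

```r
# Simulated data for illustration only (not the tutorial's data)
set.seed(42)
scores <- rnorm(40, mean = 60, sd = 10)

# Run the test and store the result instead of only printing it
result <- shapiro.test(scores)

w_stat  <- unname(result$statistic)  # the W test statistic
p_value <- result$p.value            # the p value

# Apply the decision rule with a significance level of .05
alpha <- 0.05
if (p_value > alpha) {
  message("p = ", round(p_value, 4), ": no evidence against normality")
} else {
  message("p = ", round(p_value, 4), ": data does not appear to be normally distributed")
}
```

Storing the result in this way is useful when you need to report the *p* value elsewhere or test several variables in a row.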

***************

That's it for this tutorial. You should now be able to test data for normality in R using the Q-Q plot and the Shapiro-Wilk test.

***************