In this tutorial, we will show you how to generate random numbers in R. Specifically, we will show you how to generate random numbers from both uniform distributions and normal distributions.
Throughout the tutorial we will be using RStudio, a program that makes it easier to work with R.
Generating Random Numbers from Uniform Distributions
A uniform distribution is a probability distribution in which each value within a specified range is equally likely to occur. For example, if you draw a card from a standard pack of playing cards, you are equally likely to draw any of the 52 cards in the pack.
Uniform distributions can be either discrete (as in the playing cards example) or continuous.
Discrete Uniform Distribution
A discrete uniform distribution has only a finite number of values within a specified range. For example, when we draw a card from a pack, there are only 52 possibilities.
Example
Imagine that we want to find out whether petting puppies decreases the stress levels of students who are about to take an exam. We have recruited 100 students for our study, and we want to randomly assign 50 of them to the "puppies" condition (they will pet puppies before their exam). The other 50 students will be assigned to the "no puppies" condition (they will not pet puppies before their exam).
To select 50 students for the "puppies" condition, we will generate 50 random numbers between 1 and 100. Since each student can only participate in the study once, we want to ensure that we don’t select the same number twice. That is, we want to sample without replacement.
R Function and Parameters
We use the sample() function to generate random numbers from a discrete uniform distribution in R. The parameters for this function are:
- x: a positive integer (whole number) or a vector of values from which we draw random numbers. When x is a single positive integer n, R draws from the whole numbers 1 to n – in our example, we will use the integer 100 because we are selecting students from a group of 100.
- size: the number of random numbers we need – here, we want to generate 50 numbers, each representing one student.
- replace: this parameter determines whether values are replaced after they are selected. TRUE instructs R to sample with replacement. FALSE instructs R to sample without replacement. In our example, if the first random number that R selects is 37, replace determines whether the student with ID 37 should be available to be selected again. Since each student can only participate in our study once, we want to sample without replacement and so we select FALSE.
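To make the effect of replace concrete, here is a small sketch that uses a group of only five IDs. The group size of 5 and the variable names are chosen purely for illustration and are not part of our study.

```r
# Illustration only: a tiny population of 5 IDs, not the 100 students in our study.

# Without replacement: each ID can appear at most once,
# so drawing all 5 simply shuffles the numbers 1 to 5.
without_replacement <- sample(5, 5, replace = FALSE)
print(without_replacement)

# With replacement: an ID goes back into the pool after it is drawn,
# so the same ID can appear more than once.
with_replacement <- sample(5, 10, replace = TRUE)
print(with_replacement)

# anyDuplicated() returns 0 when a vector contains no repeated values.
print(anyDuplicated(without_replacement))
```

Note that asking for more values than the group contains, for example sample(5, 10, replace = FALSE), produces an error, because R runs out of IDs to draw.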
So, for our example, this is what we type into the RStudio console to generate our random numbers:
sample(100, 50, replace = FALSE)
Once we press the Enter key on our keyboard, R will output 50 random numbers. Our set of 50 random numbers is below, but R will generate a different set of numbers each time.
These numbers represent the ID numbers of the students who will be assigned to the "puppies" condition in our experiment. For example, the first number here represents the student with ID 17, and so on.
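If we need the same assignment every time we rerun the analysis, one option is to fix R's random seed with set.seed() before sampling. The sketch below does this and also derives the "no puppies" group from the IDs that were not selected. The seed value 2024 and the variable names are arbitrary choices made for illustration.

```r
# The seed value is an arbitrary assumption; any integer works.
set.seed(2024)

# IDs of the 50 students assigned to the "puppies" condition.
puppies <- sample(100, 50, replace = FALSE)

# The remaining 50 IDs form the "no puppies" condition.
no_puppies <- setdiff(1:100, puppies)

length(puppies)       # number of students in the "puppies" condition
length(no_puppies)    # number of students in the "no puppies" condition
```

Running the whole block again, including the set.seed() line, reproduces the same two groups.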
Continuous Uniform Distribution
In a continuous uniform distribution, all of the values within a specified range are also equally likely. Here, however, there is an infinite number of values within the range.
Example
Imagine that you arrive at the metro station. You know that trains come once every two minutes, but you do not know what time the last one came. Your wait time will be somewhere between zero and two minutes, that is, between 0 and 120 seconds. In contrast to the playing cards example, however, there is an infinite number of values within this range. For example, your wait time could be 60 seconds or 62 seconds, but it could also be 62.5 seconds or 62.51 seconds, etc. Let’s see how we would generate 30 random wait times within this range.
R Function and Parameters
We can use the runif() function to generate numbers from a continuous uniform distribution in R. The parameters for this function are:
- n: the number of random numbers you want to generate – 30 in our example
- min: the minimum value for the distribution of these numbers – 0 (seconds) in our example
- max: the maximum value for the distribution of these numbers – 120 (seconds) in our example
Here is what we type into the RStudio console to generate the random numbers that we need for our example:
runif(30, min = 0, max = 120)
Once we press the Enter key on our keyboard, R will display our 30 random numbers. The numbers that R generated for us are below, but it will generate a different set of numbers each time.
Each of these numbers represents a wait time between 0 and 120 seconds. So, the first random number represents a wait time of 94.944958 seconds, and so on.
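As a quick sanity check, we can store the wait times in a variable and confirm that they fall inside the range we asked for. This is a minimal sketch; the variable name wait_times is our own choice.

```r
# Store the 30 simulated wait times (in seconds) so we can inspect them.
wait_times <- runif(30, min = 0, max = 120)

# Smallest and largest simulated wait time; both lie between 0 and 120.
range(wait_times)

# The average of the sample. For a uniform distribution on 0 to 120
# the theoretical mean is (0 + 120) / 2 = 60, and the sample mean
# will usually land near, but not exactly on, that value.
mean(wait_times)
```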
If we want to control the number of decimal places for our set of random numbers, we can use the round() function with the runif() function. For example, if we want to generate a set of random wait times rounded to two decimal places, we can type the following:
round(runif(30, min = 0, max = 120), 2)
The screenshot below displays the set of 30 random numbers that R gave us when we executed this command. Since R produces a different set of random numbers each time, this set of numbers is different from the one above. That is, it did not simply round the set of numbers it produced for the previous command.
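If we would rather round the numbers we already have instead of drawing new ones, one approach is to store the result of runif() in a variable first and then round that variable. The variable names below are hypothetical.

```r
# Draw the wait times once and keep them.
wait_times <- runif(30, min = 0, max = 120)

# Round the stored values; no new random numbers are generated here.
wait_times_rounded <- round(wait_times, 2)

# Compare the first few original and rounded values side by side.
head(data.frame(original = wait_times, rounded = wait_times_rounded))
```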
Generating Random Numbers from Normal Distributions
The normal distribution is a continuous probability distribution. When it is plotted, it has a symmetrical bell shape, with the majority of values clustering around the mean, and the number of values gradually tapering off as we move away from the mean. Many phenomena, such as human height and IQ scores, are approximately normally distributed.
Example
We want to generate a set of normally distributed exam scores for 30 students in a fictitious study. We decide that the mean of these exam scores will be 75, and the standard deviation of these scores will be 5.
R Function and Parameters
We can use the rnorm() function to generate random numbers from a normal distribution in R. The parameters for rnorm() are:
- n: the number of random numbers that we want to generate – 30 for our example
- mean: the mean of the random numbers that we want to generate – 75 for our example
- sd: the standard deviation of the numbers that we want to generate – 5 for our example
To generate random numbers using our example, we type the following into the RStudio console:
rnorm(30, mean = 75, sd = 5)
Once we press the Enter key on our keyboard, R gives us 30 random numbers. Our numbers are in the screenshot below, but yours will be different:
Each of the numbers generated represents the exam score of one of the students in our fictitious study. For example, the first score is 76.24171. If desired, we can control the number of decimal places for our set of random numbers by using the round() function with the rnorm() function. For example, if we want to generate a set of random exam scores rounded to two decimal places, we can type the following:
round(rnorm(30, mean = 75, sd = 5), 2)
As with our previous example, we can see that R has generated a different set of 30 numbers. That is, it did not simply round the set of numbers it produced when we ran the previous command.
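Because the scores are random draws, the mean and standard deviation of any one set of 30 scores will be close to, but rarely exactly equal to, the 75 and 5 we asked for. Here is a minimal sketch for checking this. The variable names and the larger sample size of 100,000 are our own choices for illustration.

```r
# Draw 30 exam scores and keep them for inspection.
exam_scores <- rnorm(30, mean = 75, sd = 5)

# Sample mean and sample standard deviation of this particular draw.
mean(exam_scores)
sd(exam_scores)

# With many more draws, the sample statistics settle closer to 75 and 5.
many_scores <- rnorm(100000, mean = 75, sd = 5)
mean(many_scores)
sd(many_scores)
```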
***************
That’s it for this tutorial. You should now be able to generate random numbers in R for discrete uniform distributions, continuous uniform distributions, and normal distributions.
***************