The post Interpreting Chi Square Results in SPSS appeared first on EZ SPSS Tutorials.

The tutorial starts from the assumption that you have already calculated the chi square statistic for your data set, and you want to know how to interpret the result that SPSS has generated. (We have a different tutorial explaining how to do a chi square test in SPSS).

You should be looking at a result something like this in the SPSS output viewer.

The crosstabs analysis above is for two categorical variables, Religion and Eating. Each variable has two possible values: No Religion and Christian for the Religion variable; Meat Eater and Vegetarian for the Eating variable.

The null hypothesis of our hypothetical study is that these variables are not associated with each other – they are independent variables. The chi square test allows us to test this hypothesis.

The output of a crosstabs analysis contains a number of elements. Let’s look at each in turn.

As its name suggests, the Case Processing Summary is just a summary of the cases that were processed when the crosstabs analysis ran.

In our example, as you can see above, we had 30 valid cases, and no missing cases.

This is the crosstabs table, and it provides a lot of information that is useful for interpreting a chi square test result.

Our crosstabs table includes information about observed counts (what SPSS calls “Count”) and expected counts.

The observed count is the observed frequency in a particular cell of the crosstabs table. For example, our table shows that 5 meat eaters (out of a total of 16) have no religion and 3 Christians (out of a total of 14) are vegetarian.

The expected count is the predicted frequency for a cell under the assumption that the null hypothesis is true. In our case, the null hypothesis is that there is no association between the Eating variable and the Religion variable, which means the expected count is the predicted frequency for a cell on the assumption that eating and religion are not dependent on each other.

If you want to understand the result of a chi square test, you’ve got to pay close attention to the observed and expected counts. Put simply, the more these values diverge from each other, the higher the chi square score, the more likely the result is to be significant, and the more likely it is that we’ll reject the null hypothesis and conclude that the variables are associated with each other.

If you look at the crosstabs table above, you’ll see that there are more Christian meat eaters than would be expected were the null hypothesis (that the variables are independent) true, and fewer Christian vegetarians. Similarly, there are more vegetarians with no religion than would be expected, and fewer meat eaters with no religion.

The question is whether these differences are big enough to allow us to conclude that the Eating variable and Religion variable are associated with each other. This is where the chi square statistic comes into play.

As you can see below, SPSS calculates a number of different measures of association.

We’re interested in the Pearson Chi-Square measure.

The chi square statistic appears in the Value column immediately to the right of “Pearson Chi-Square”. In this example, the value of the chi square statistic is 6.718.

The *p*-value (.010) appears in the same row in the “Asymptotic Significance (2-sided)” column. The result is significant if this value is equal to or less than the designated alpha level (normally .05). In this case, the *p*-value is smaller than the standard alpha value, so we’d reject the null hypothesis that asserts the two variables are independent of each other. To put it simply, the result is *significant* – the data suggests that the variables Religion and Eating are associated with each other.

The chi square statistic only tells you whether variables are associated. If you want to find out how they are associated, you need to return to the crosstabs table. In our example, the crosstabs table tells us that having no religion is disproportionately associated with vegetarianism, and meat eating is disproportionately associated with Christianity.
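In fact, the figures in the output above can be reproduced from the cell counts alone. Here's a quick sketch in Python using scipy, for readers who want to check SPSS's arithmetic (note `correction=False`, because the Pearson Chi-Square row in SPSS does not apply the Yates continuity correction):

```python
from scipy.stats import chi2_contingency

# Observed counts from the crosstabs table:
# rows = Religion (No Religion, Christian), cols = Eating (Meat Eater, Vegetarian)
observed = [[5, 11],
            [11, 3]]

# correction=False reproduces the plain Pearson chi-square (no Yates correction)
chi2, p, df, expected = chi2_contingency(observed, correction=False)

print(round(chi2, 3))  # 6.718, matching the Value column in SPSS
print(round(p, 3))     # 0.010, matching Asymptotic Significance (2-sided)
print(expected)        # the expected counts shown in the crosstabs table
```

The `expected` array is exactly what SPSS prints as "Expected Count" in each cell: row total times column total, divided by the grand total.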

***************

That’s all for this tutorial. You should now have a good idea of how to interpret chi square results in SPSS.

***************

The second half of our SPSS chi square video includes a discussion of how to interpret chi square results in SPSS.


The post Pearson Correlation Coefficient and Interpretation in SPSS appeared first on EZ SPSS Tutorials.

- Click on Analyze -> Correlate -> Bivariate
- Move the two variables you want to test over to the Variables box on the right
- Make sure Pearson is checked under Correlation Coefficients
- Press OK
- The result will appear in the SPSS output viewer

For the purposes of this tutorial, we’re using a data set that comes from the Philosophy Experiments website.

The Valid or Invalid? exercise is a logic test that requires people to determine whether deductive arguments are valid or invalid. This is the complete data set.

We’re interested in two variables, Score and Time.

Score is the number of questions that people get right. Time is the amount of time in seconds it takes them to complete the test. We want to find out if these two things are correlated. Put simply, do people get more questions right if they take longer to complete the test?

Pearson’s correlation coefficient will help us to answer this question.

To start, click on Analyze -> Correlate -> Bivariate.

This will bring up the Bivariate Correlations dialog box.

There are two things you’ve got to get done here. The first is to move the two variables of interest (i.e., the two variables you want to see whether they are correlated) into the Variables box on the right. You can do this by dragging and dropping (or using the arrow button in the middle).

The other thing is to ensure that “Pearson” is selected under Correlation Coefficients.

You can also select “Flag significant correlations”, though this is just optional.

That’s it. You’re set. Now just click OK.

The first thing you might notice about the result is that it is a 2×2 matrix. This means, in effect, you get two results for the price of one, because you get the correlation coefficient of Score and Time Elapsed, and the correlation coefficient of Time Elapsed and Score (which is the same result, obviously).

We’re interested in two parts of the result.

The first is the value of Pearson’s *r* – i.e., the correlation coefficient. That’s the Pearson Correlation figure (inside the square red box, above), which in this case is .094.

Pearson’s *r* varies between +1 and -1, where +1 is a perfect positive correlation, and -1 is a perfect negative correlation. 0 means there is no linear correlation at all.

Our figure of .094 indicates a very weak positive correlation. The more time that people spend doing the test, the better they’re likely to do, but the effect is very small.

We’re also interested in the 2-tailed significance value – which SPSS reports here as .000 (inside the red oval, above). SPSS rounds to three decimal places, so this actually means *p* < .001. The standard alpha value is .05, so our correlation is highly significant – very unlikely to be merely a function of random sampling error.

This seems counterintuitive. How can a very weak correlation be highly significant? How is it possible to be so confident that such a weak correlation is real?

The answer has to do with our sample size (see the figure for N, above). We have 16033 cases in our data set. This means that our study has enough statistical power to identify even very weak effects.
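This "weak yet highly significant" pattern is easy to reproduce with synthetic data of a similar size. The sketch below is illustrative only – the variable names and the effect size of 0.1 are our own assumptions, not the Philosophy Experiments data:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
n = 16033  # the same number of cases as the tutorial's data set

# A deliberately weak relationship: score depends only slightly on time taken
time_elapsed = rng.normal(0, 1, n)
score = 0.1 * time_elapsed + rng.normal(0, 1, n)

r, p = pearsonr(time_elapsed, score)
print(round(r, 3))  # a small positive r, in the region of .1
print(p < 0.001)    # True: with n this large, even a weak r is significant
```

With 16,033 cases, the standard error of *r* is tiny, so even a correlation this weak sits many standard errors away from zero.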

***************

Right, we’ve come to the end of this tutorial. You should now be able to calculate Pearson’s correlation coefficient within SPSS, and to interpret the result that you get.


The post How to Recode String Variables in SPSS appeared first on EZ SPSS Tutorials.

This is often done using the automatic recode functionality of SPSS, but in this case we’re going to do it manually because of the extra control we get.

- Click on Transform -> Recode into Different Variables
- Drag and drop the variable you wish to recode over into the Input Variable -> Output Variable box
- Create a new name for your output variable in the Output Variable (Name) text box, and click the Change button
- Click the Old and New Values… button
- Type the first value of your input variable into the Old Value (Value) text box, and the value you want to replace it with into the New Value (Value) text box. Then click Add to confirm the recoding
- Repeat this process for all the existing values of your input variable
- Press Continue, and then OK to do the recoding
- The new recoded output variable will appear in the Data View

We’re assuming that you’ve fired up SPSS, opened a data file, or entered new data, and you’re looking at the Data View window.

The issue we have with our data is that the Education variable has been coded as a string whereas it should be numeric. SPSS provides a number of options to help us to recode the variable. We’re going to look at the Recode into Different Variables method.

As its name suggests, if you choose this option, SPSS will use an input variable to create a new recoded variable.

To begin this process, click on Transform -> Recode into Different Variables, which will bring up its associated dialog box.

You need to drag and drop the variable you want to recode over into the Input Variable -> Output Variable box (it reads String Variable -> Output Variable, above, because SPSS has identified the Education variable as a string).

The next step is to give your new recoded output variable a name, and then to hit the Change button. As you can see, we’ve called our new recoded variable EdNumeric.

Once you’ve got this set up, click the Old and New Values… button so you can specify how you want to recode the variable.

The Old and New Values dialog box allows you to specify new values for your existing input variable.

This is easy to accomplish. The old value goes into the Old Value (Value) text box on the left, and the new value you want to replace it with into the New Value (Value) text box on the right. Then click Add to confirm the recoding.

Repeat for all the existing values of your input variable.

As you can see above, we’ve got “School” recoded as 1, and we’re about to add “Graduate” recoded as 2.

Once you’ve got the recoding set up, press Continue.

That’s it, really. Press OK to recode your variable.

If you take a look at the Data View, you’ll see you’ve got a new variable which contains the recoded values.

Our new EdNumeric variable is a numeric, nominal variable, where 1 = School, 2 = Graduate and 3 = Postgrad.
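Outside SPSS, the same manual recode amounts to a one-line mapping. Here's a sketch in Python with pandas, using the variable names and coding from the tutorial:

```python
import pandas as pd

df = pd.DataFrame({"Education": ["School", "Graduate", "Postgrad", "School"]})

# Recode into a different (new) variable, leaving the original intact,
# just as Recode into Different Variables does in SPSS
df["EdNumeric"] = df["Education"].map({"School": 1, "Graduate": 2, "Postgrad": 3})

print(df["EdNumeric"].tolist())  # [1, 2, 3, 1]
```

Any string value not listed in the mapping would come out as missing, which mirrors what happens in SPSS if you forget to add an old value in the Old and New Values dialog.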

You could just leave it at that, but probably you’d want to set up Value Labels. This is the topic of a separate tutorial, so we won’t explain how to do that here, but the advantage of doing so is that you’ll end up with meaningful labels in your output, and you don’t have to remember how the coding works.

Once you’ve set up Value Labels for the new EdNumeric variable, there will be no difference between its appearance and that of the old Education variable. The only thing that has changed is the underlying coding, which is now numeric.

***************

That’s it for this quick tutorial. You should now be able to recode string values into a different variable in SPSS. In future tutorials, we’ll look at some of the other options for recoding values in SPSS.


The post How to Select Cases in SPSS appeared first on EZ SPSS Tutorials.

The data we’re using for this tutorial comes from a hypothetical study that examines how long it takes people to fall asleep during a statistics lesson.

The two variables we’re interested in here are Sex, either male or female, and Duration, which is the number of minutes that elapses from the start of a statistics lesson before a subject falls asleep.

Imagine we already know that in the population as a whole the average amount of time it takes for a *woman* to fall asleep is 8.15 minutes. We want to compare this to the average time for women in our sample. But the trouble is our sample contains data for both males and females, and any tests we run will be on that basis. The question is how do we select only female cases, thereby excluding males from any tests that we run?

This is where the select cases functionality comes in useful.

To begin, click Data -> Select Cases.

This will bring up the the Select Cases dialog box. This provides a number of different options for selecting cases. We’re going to focus on the “If condition is satisfied” option, which you should select.

Once you’ve selected it, you need to click on the If… button (as above).

The Select Cases: If dialog box will appear. This is where you do the work of selecting female only cases.

The idea here is to construct an expression in the text box at the top that functions to select cases. You can see here we’ve got “Sex = 0”, which tells SPSS that it should only select cases where the value of the variable Sex is 0 (Female = 0, Male = 1).

Obviously, it is possible to build much more complex expressions than this simple test of equivalence. For example, you could tell SPSS to select cases where Sex is Female and Height is greater than 68 inches (“Sex = 0 & Height > 68”), or where Duration is greater than 8 minutes or Height is less than 60 inches (“Duration > 8 | Height < 60”).
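Selection expressions like these translate directly into boolean filters in other tools. A sketch in Python with pandas, using the same conditions (the Height variable is hypothetical here, just as it is in the tutorial's own examples):

```python
import pandas as pd

df = pd.DataFrame({
    "Sex":      [0, 1, 0, 1, 0],   # Female = 0, Male = 1
    "Duration": [7.0, 9.5, 8.4, 6.1, 10.2],
    "Height":   [64, 70, 69, 72, 62],
})

females = df[df["Sex"] == 0]                               # "Sex = 0"
tall_females = df[(df["Sex"] == 0) & (df["Height"] > 68)]  # "Sex = 0 & Height > 68"
either = df[(df["Duration"] > 8) | (df["Height"] < 60)]    # "Duration > 8 | Height < 60"

print(len(females))       # 3
print(len(tall_females))  # 1
print(len(either))        # 3
```

One difference worth noting: a pandas filter produces a new subset, whereas SPSS's Select Cases leaves the full data set in place and merely flags the unselected rows.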

Once you’ve set up the expression, as above, hit the Continue button, and then click OK in the Select Cases dialog box. SPSS will now select cases as per your instruction(s).

If you take a look at the Data View, you’ll see that things have changed to indicate that SPSS is now operating with a subset of the original data set.

As you can see, SPSS has struck out cases on the left that are not selected. It has also introduced a new filter variable that specifies whether a case has been selected or not. Finally, bottom right, it says Filter On, which tells you that any tests or analyses you run will be on a subset of the data – that is, on only the selected cases.

Let’s check this out by running a one sample t test to compare the average amount of time it takes for women in the general population to fall asleep in a statistics lesson with the average for the women in our sample.

Click on Analyze -> Compare Means -> One-Sample T Test, and then set up the test like this.

You can see we’ve got Duration as our test variable, and we’re comparing it against a population mean of 8.15 minutes (the average amount of time it takes women in the general population to fall asleep in a statistics lesson).

Hit OK to run the test.

This is the result.

The value for N here is 50, which tells you immediately that select cases has worked. Our dataset has 100 cases within it, of which 50 are women.

In terms of the result, we can see that the women in our sample fall asleep on average 1 minute faster than women in the general population. This is a significant difference, with a t value of -3.1 and a *p*-value of .003.

There are a couple of things to note before we finish.

The first is that you can return a data set to its non-filtered state by returning to the Select Cases dialog box (Data -> Select Cases), and choosing All cases (the first option available). This won’t delete the new filter variable, but it will render it inactive. You’ll also notice that “Filter On” will no longer show at the bottom right of the Data View.

The other thing to note is that SPSS offers an alternative to Select Cases that works better in many situations. This is Split File, and it will be the topic of a future tutorial.

***************

That’s all for this tutorial. You should now be able to select cases in SPSS, and to work with the resultant filtered data.


The post How to Do a One Sample T Test and Interpret the Result in SPSS appeared first on EZ SPSS Tutorials.

- Analyze -> Compare Means -> One-Sample T Test
- Drag and drop the variable you want to test against the population mean into the Test Variable(s) box
- Specify your population mean in the Test Value box
- Click OK
- Your result will appear in the SPSS output viewer

Our working assumption, as per usual, is that you’ve opened SPSS, and that you’re looking at the Data View within which you’ve got some data.

Our data is from a hypothetical study that examines how long it takes people to fall asleep during a statistics lesson.

For the purpose of this tutorial, we’re only interested in the Duration variable, which is the number of minutes that elapses from the start of the lesson before a subject falls asleep.

Imagine we already know that in the population as a whole the average amount of time it takes for somebody to fall asleep is 8.45 minutes. This compares to the average time in our sample of 7.35 minutes. The question is whether the difference between these two means is large enough for us to conclude there is a real difference between our sample group and the wider population in terms of the amount of time it takes to fall asleep.

If we knew the population standard deviation, we could do a z test to answer this question, but we don’t, which means a one sample t test is the appropriate test.

To begin the one sample t test, click on Analyze -> Compare Means -> One-Sample T Test. This will bring up the One-Sample T Test dialog box.

You’ve got to get the variable you want to test – in our case, the Duration variable – into the right hand Test Variable(s) box, and input the population mean into the Test Value box. For the variable, you can just drag and drop, or use the arrow in the middle of the dialog box.

Once it’s set up, it should look like this.

If you’ve got this far, you’re ready to run the test. Just hit the OK button.

The result of the one sample t test will appear in the SPSS output viewer. It will look like this.

This output is relatively easy to interpret.

The t value is -4.691 (see the One-Sample Test table, above), and the 2-tailed significance value is reported as .000. Since SPSS rounds to three decimal places, this means *p* < .001 – a significant result for any realistic alpha level.

A standard alpha level is .05, and our *p*-value is smaller than that, so we’re going to reject the null hypothesis, which asserts there is no difference between our sample mean and the population mean.

More technically, what the result shows is that on the assumption that the null hypothesis is true, a difference as big as we’ve got between our sample mean and the population mean is extremely unlikely to have arisen purely by chance.

This counts as evidence that the difference between our sample group and the population as a whole is real. Put simply, it seems that our subjects fall asleep in statistics lessons more quickly than the population as a whole.
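The same test is easy to run outside SPSS. We don't have the tutorial's raw data, so the sketch below constructs synthetic durations with the tutorial's sample mean of 7.35; the standard deviation (2.3) and sample size (100) are assumptions, so take the mechanics from this, not the exact t value:

```python
import numpy as np
from scipy.stats import ttest_1samp

# Synthetic data: exact sample mean 7.35; the SD (2.3) and n (100) are assumed
n, mean, sd = 100, 7.35, 2.3
z = np.arange(n, dtype=float)
z = (z - z.mean()) / z.std(ddof=1)   # standardise: mean 0, sample SD 1
durations = mean + sd * z            # rescale to the desired mean and SD

# One sample t test against the population mean of 8.45 minutes
t, p = ttest_1samp(durations, popmean=8.45)
print(t < 0)     # True: the sample mean is below the population mean
print(p < 0.05)  # True for these assumed values, so we'd reject the null
```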

***************

Okay, that’s it for this quick tutorial. You should now be able to run a one sample t test in SPSS, and to interpret the result that you get.


The post Export Data from SPSS into a MySQL Database appeared first on EZ SPSS Tutorials.

As you can see below, we have a simple data set with four variables. For the purposes of this tutorial, it doesn’t really matter what these variables represent, but for reasons that will become clear later it is worth taking note of the presence of the ID variable, which functions as a unique identifier for each case in the data set.

Our task is to get this data into a table in MySQL.

We’re working on the assumption that you have opened SPSS on a Windows operating system, and that you’re looking at this data set in the Data View.

Click on File -> Export -> Database. The Export to Database Wizard will pop up.

If you haven’t previously set up an ODBC data source connection, you’re not going to see anything in the Data Sources box, and you’re going to need to set up the connection.

We’re not going to show you how to do this here, because it’s exactly the same procedure as described in our import into SPSS from MySQL tutorial. You should check that out, set up the ODBC data source connection as detailed there, and then return to this tutorial.

We’re assuming that you now have the ODBC data source connection set up, and that you’re looking at the Export to Database Wizard.

Highlight your data source connection (as above), and then click the Next button.

You’ll now be asked to choose what sort of export you want to set up.

The simplest option is to create a new table within a MySQL database. We’ve selected this option, and named the new table PEFExperiment (see above).

Clicking the Next button will bring up a dialog box asking you to select the variables you want to be stored in the new table.

The SPSS variables show up in the text box on the left. The idea is to move the variables you want to import into your database over to the right, where you can set a number of their attributes (e.g., type and width).

This is where the significance of the ID variable comes into play. It is normal for a database table to have a primary key, which functions as a unique identifier for each database entry. This is often implemented by means of a field that increments automatically each time an entry is added to a table. This functionality is not supported by the SPSS database wizard. Ideally, therefore, you should ensure that your SPSS data set includes a variable that functions as a unique identifier, and which you can import into the database table. The ID variable performs this role in our data set.
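The primary-key idea is easy to see in code. The sketch below uses Python's built-in sqlite3 rather than MySQL so it runs anywhere, but the principle is the one described above: an ID column declared as the primary key and populated from the data, not auto-incremented. (The Score column is hypothetical; the tutorial only names the ID and Sex variables.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# ID comes from the data set itself, mirroring how the SPSS export wizard
# uses an existing variable as the primary key (no auto-increment involved)
cur.execute("CREATE TABLE PEFExperiment (ID INTEGER PRIMARY KEY, Sex TEXT, Score REAL)")
rows = [(1, "Male", 4.2), (2, "Female", 5.1), (3, "Female", 3.8)]
cur.executemany("INSERT INTO PEFExperiment VALUES (?, ?, ?)", rows)

cur.execute("SELECT COUNT(*) FROM PEFExperiment")
print(cur.fetchone()[0])  # 3
```

Because ID is the primary key, the database will refuse a second row with the same ID – which is exactly why your SPSS data set needs a variable of unique values before you export.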

You can specify that a variable should function as a primary key when you select the variables to store in a new table. This is what we’ve done below.

As you can see, we’ve elected to include all our variables within the new database table. To move them over from the left, you just drag and drop. We’ve specified ID as the primary key by ticking the little key icon.

The other thing worth noting is that we’ve instructed SPSS to export the value labels (Male, Female) for the Sex variable rather than the data values (1, 0). This is just for illustrative purposes, and shouldn’t be taken as a recommendation.

If you set things up correctly, then you can just hit Finish at this point. If you want to check the options you’ve chosen, click Next, and review the summary dialog box that appears.

We’re going to hit Finish to do the export.

This is the output that SPSS generates for an export to a database.

As you can see above, there are a couple of SQL statements (marked) that are responsible for creating the new table and generating the records. Let’s see if they’ve worked.

Here you can see the first 20 rows of the new table that SPSS has created within the MySQL database.

ID has been correctly instantiated as the primary key (though you can’t tell from this screenshot), and the Sex variable has been populated with its value labels rather than its numeric data values (in MySQL, the column has the varchar data type).

***************

That’s it, really. You should now have an idea of how to export data from SPSS to a MySQL database. In a later tutorial, we’ll look at some of the more sophisticated options on offer during this process.


The post How to Generate Random Numbers in SPSS appeared first on EZ SPSS Tutorials.

As a starting point, you should at least have an ID variable populated in the Data View of SPSS.

The ID variable functions to identify the number of cases in a data set for which SPSS will generate random numbers.

To generate a set of random numbers, we’re going to use SPSS’s Compute Variable dialog box.

Click on Transform -> Compute Variable.

You need to do a number of things to set up this dialog box so SPSS will generate random numbers.

First, name your target variable. We’ve called ours RandomNumbers. This is the variable that SPSS will create to hold the set of random numbers.

Once you’ve named your target variable, select Random Numbers in the Function group on the right. This will bring up a set of functions, all of which operate to generate different kinds of random numbers.

The function we need is called RV.Uniform. This returns a random value from a uniform distribution with a specified minimum and maximum value. Or, to put it a different way, it will generate a random number between two limits, where every possible value between the limits is equally likely to be generated.

It’s necessary to get the RV.Uniform function into the Numeric Expression box at the top of the dialog box. You can drag and drop (as above) or use the up arrow in the middle of the dialog.

After you drag the RV.Uniform function into the Numeric Expression box, you’ll notice it has two question marks after it (see above). This signals that you need to specify minimum and maximum values for your random numbers.

This is easy to do. Just replace each question mark with a value. We’ve chosen 0 as our minimum and 100 as our maximum (as above).

That completes the set up. Just hit OK to generate the variable containing the set of random numbers (in this case between 0 and 100).

As you can see below, SPSS has created a new variable called RandomNumbers, and filled it with random numbers, each with a value between 0 and 100.

One thing to note here is that although you’re seeing only 2 decimal places, SPSS has actually calculated the numbers with much more precision (which you’ll see if you select an individual cell). This means it’s very unlikely you’ll get a duplicate number.
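The equivalent of RV.UNIFORM takes a line or two in most environments. Here's a sketch in Python with numpy (the case count of 30 is just an example):

```python
import numpy as np

rng = np.random.default_rng()  # unseeded, so each run differs, as in SPSS

# Equivalent of RV.UNIFORM(0, 100): every value in [0, 100) is equally likely
random_numbers = rng.uniform(0, 100, size=30)

print(random_numbers.min() >= 0)   # True
print(random_numbers.max() < 100)  # True
```

As with SPSS, the values are generated at full floating-point precision, so duplicates are extremely unlikely.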

Consider the following scenario. You’ve recruited thirty people for a medical study. You want to allocate these people to treatment and control conditions on a random basis. How do you go about it using SPSS?

The following method will work.

Fire up the Compute Variable dialog box again (Transform -> Compute Variable).

This time we’re going to combine two functions together to allocate people to a treatment and control condition (where control means getting the placebo).

As before, the first thing to do is to name our target variable. We’ve chosen TreatmentGroup as our name.

Once you’ve named your target variable, select Arithmetic in the Function group on the right, and then scroll down until you get to the Trunc(1) function.

This function has the effect of rounding any decimal number down towards zero. Or, to put this another way, it truncates a decimal so you’re left with just the integer part of the number. For example, 2.91 will become 2 and 3.33 will become 3.

As before, you’ve got to get this function up into the Numeric Expression box, which you can do by dragging and dropping.

You’ll have noticed there’s a question mark immediately following the Trunc function in the Numeric Expression box (see above). This is a placeholder for the value that’ll be truncated.

We’re going to truncate a random number that lies between 0 and 2. The reason why will become apparent shortly.

To do this, we’re using the same RV.Uniform function as we used before. This time let’s just type it in, but with 0 and 2 as the Min and Max values. It should replace the question mark that appears between the brackets at the end of the Trunc function in the Numeric Expression box (as below).

To recap, the RV.UNIFORM(0,2) function is going to create a set of random numbers between 0 and 2. The truncate function will strip away the decimal part of each number just leaving the integer part.

Let’s press OK, and see how that turns out.

As you can see below, SPSS has created a new variable called TreatmentGroup, and in every case the value is either 0 or 1. (A value of exactly 2 is theoretically possible, but with a continuous distribution its probability is effectively zero.) This is because the truncate function has rounded every randomly generated value between 0 and 1 down to 0, and every randomly generated value between 1 and 2 down to 1.

In this context, 1 means the treatment condition and 0 the control condition, so we’ve achieved our goal of randomly allocating people to treatment and control conditions. However, you might want to tidy things up a little by going into the Variable View and setting up value labels (1 = Treatment, 0 = Control).
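The truncation trick works the same way outside SPSS. A sketch in Python with numpy (for positive values, truncation is just the floor):

```python
import numpy as np

rng = np.random.default_rng(7)

# RV.UNIFORM(0, 2) then TRUNC: values in [0, 1) become 0, values in [1, 2) become 1
group = np.trunc(rng.uniform(0, 2, size=30)).astype(int)

print(sorted(set(group)))  # each of the 30 subjects is 0 (control) or 1 (treatment)
```

Each subject independently has a 50/50 chance of landing in either condition, which is exactly the random allocation we wanted.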

SPSS allows you to generate random numbers that are drawn from a normal distribution with a specified mean and standard deviation. This functionality will often be useful for various sorts of computer simulation.

Imagine you want to run a population level simulation of the effectiveness of different treatment options for a particular disease, and you know that drug efficacy is affected by patient weight. In this situation, you’re going to want your population model to reflect the distribution of weights of people in the real world. We already know weight is normally distributed, which means so long as we know the mean and standard deviation of the distribution, we can create a random distribution of weights in SPSS that will match the characteristics of the distribution of real world weights.

This is how we’d do this for adult males, assuming that the mean weight of an adult male is 195 lbs and the standard deviation of the distribution of weights is 35 lbs.

Navigate to the Compute Variable dialog box again (Transform -> Compute Variable). Hit reset if you need to return it to its default state. You should by now be familiar with the next several steps.

First, name your Target Variable (we’ve got Weight as our variable name).

Second, choose Random Numbers in the Function group, and within the Functions and Special Variables text box, scroll down until you get to RV.Normal.

Third, drag RV.Normal up into the Numeric Expression text box.

Fourth, replace the first question mark with 195, which is the mean weight, and the second question mark with 35, which is the standard deviation.

And that’s it. Press OK, and SPSS will create a variable called Weight, and fill it with normally distributed weights.

As you can see below, we now have our distribution of weights.
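The RV.NORMAL equivalent is just as short. A sketch in numpy, using the same mean and standard deviation as above (the sample size of 10,000 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Equivalent of RV.NORMAL(195, 35): weights drawn from a normal distribution
weights = rng.normal(loc=195, scale=35, size=10_000)

print(round(weights.mean()))  # close to 195
print(round(weights.std()))   # close to 35
```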

***************

Right, that’s it for this tutorial. You should now have an idea of how to generate random numbers within SPSS, and how you can leverage this functionality to solve various sorts of problems.


The post How to Move Columns in SPSS appeared first on EZ SPSS Tutorials.

- Select the entire column you want to move by clicking on its name at the top
- Drag the column to its new location

In a previous tutorial, we created a new variable to hold a set of integers that would function to uniquely identify each case in a data set.

The problem is the ID column is on the far right of the Data View. It’d be better if it were the first column on the left.

So how do you move it over?

The solution is easy, but not wholly obvious.

First, you select the entire column by clicking on the column name (i.e., where it says ID).

Second, you drag it to its new location. You’ll notice it relocates to the left of whichever column you’re over when you release the mouse button.

And that’s all there is to it.

One interesting thing is that reordering the variables in the Data View also affects their order in the Variable View. As you can see below, ID is now the first variable in the Variable View list.
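For comparison, reordering columns in code is a matter of re-listing them. A sketch in Python with pandas (the Name and Score columns are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B"], "Score": [1, 2], "ID": [1, 2]})

# Move ID from the far right to the first column
df = df[["ID"] + [c for c in df.columns if c != "ID"]]

print(list(df.columns))  # ['ID', 'Name', 'Score']
```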

***************

That was certainly short and sweet. You should now be able to reorder columns in the Data View.


The post Create ID Number in SPSS appeared first on EZ SPSS Tutorials.

The advantage of doing this is it disambiguates cases that might otherwise be confused – for example, if you had two John Smiths in your sample.

- Click on Transform -> Compute Variable
- Give your new variable the name “ID” in the Target Variable box
- Click on All in the Function Group list, and then drag and drop $Casenum into the Numeric Expression box at the top
- Press OK
- You’ll be able to see your new ID variable in SPSS’s Data View

As you can see below, we have three variables in our data set, but nothing to distinguish individual cases from each other. The addition of a variable containing integers functioning as unique identifiers will sort out this issue.

We’re going to add a variable that will contain a set of number IDs by using the Transform -> Compute Variable menu item.

The Compute Variable dialog box looks like this.

There are three things you’ve got to do to set up this box to generate a set of unique identifiers.

The first is to specify the name of your target variable in the Target Variable box. You should name this ID (as above).

The second thing is to select All in the Function group on the right of the dialog. This will allow you to select $Casenum in the Functions and Special Variables box. The $Casenum variable is just the number of cases read up to and including the current case. So, for example, the function will return the number 3 for the third case (normally, but not always, the third row in the Data View), the number 4 for the fourth case, and so on. This will generate a set of unique integers that can function as ID numbers.

The third thing is to drag the $Casenum function into the Numeric Expression box (as per the red arrow above).

That’s it. If you press OK, SPSS will generate a variable named ID that will contain a unique identifier for each case in the Data View.

Here you can see the final result.

The new ID variable contains a unique identifier for each case in the data set.

There are a few things you could do to tidy this up – for example, get rid of the decimal point and drag the ID variable over to the left so it appears in the first column – but really this is job done. Each case is now associated with a unique identifier.
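If you prefer syntax to the menus, the steps above – including the decimal-point tidy-up just mentioned – can be sketched like this (the F8.0 format is one reasonable choice, not the only one):

```spss
* Assign each case its sequential case number as a unique identifier.
COMPUTE ID = $CASENUM.
* Display ID as a whole number (no decimal places).
FORMATS ID (F8.0).
EXECUTE.
```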

***************

Right, that’s it for this quick tutorial. You should now be able to create a unique ID number for each of your cases in an SPSS data set.


The post Paired Sample T Test in SPSS appeared first on EZ SPSS Tutorials.

A paired samples t test will sometimes be performed in the context of a pretest-posttest experimental design. For this tutorial, we’re going to use data from a hypothetical study looking at the effect of a new treatment for asthma by measuring the peak flow of a group of asthma patients before and after treatment.

- Analyze -> Compare Means -> Paired-Samples T Test
- Drag and drop the first of the paired variables into the Variable 1 box on the right, and the second into the Variable 2 box
- Click OK to run the test
- The result will appear in the SPSS output viewer

This is our hypothetical data as it appears in the SPSS Data View.

As you can see, there are three variables. *Sex* – Male or Female. *PrePEF* – pretest peak expiratory flow (measured in litres per minute). *PostPEF* – posttest peak expiratory flow (measured in litres per minute).

It’s a within-subjects design, with a repeated measure being taken for each subject.

We want to find out if there is a difference between the mean pretest PEF and mean posttest PEF, and if so whether it is statistically significant. The paired samples t test is appropriate for this task.

To begin the paired samples t test, click on Analyze -> Compare Means -> Paired-Samples T Test. This will bring up the paired-samples t test dialog box.

The next stage is to move the PrePEF and PostPEF variables from the box on the left into the Variable 1 and Variable 2 boxes on the right to create the pair of variables (as above). You can do this by dragging and dropping, or by selecting and using the arrow in the middle.

You should end up with a dialog box that looks like this.

Once you’ve set the dialog box up correctly, you should hit OK to run the paired samples t test.
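For reference, the same test can be run from a syntax window. This is a sketch of the syntax the dialog generates, assuming the paired variables are named PrePEF and PostPEF as in our data set:

```spss
* Paired samples t test comparing pretest and posttest peak flow.
T-TEST PAIRS=PrePEF WITH PostPEF (PAIRED)
  /CRITERIA=CI(.95)
  /MISSING=ANALYSIS.
```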

The result of the paired samples t test will pop up in the SPSS output viewer. It will look like this.

This is not as complicated as it might at first look.

The first thing to note is the difference between the means in the pretest and posttest conditions. As you can see above, the average PEF is nearly 25 L/min higher in the posttest condition. The question is whether this difference is large enough to reach statistical significance.

This is where the paired samples t test comes into play. We’ve got a t value of -9.272 (see the Paired Samples Test table), which gives us a *p*-value – or 2-tailed significance value – reported as .000. SPSS displays .000 whenever the p-value is below .001, so this is going to be a significant result for any plausible alpha level.

A standard alpha level is .05, and a p-value below .001 is smaller than .05, so we’re able to reject the null hypothesis, which asserts that there is no difference between the mean PEF scores in the pretest and posttest conditions. To put this another way, the difference between the means in the two conditions is extreme enough that it is very unlikely to have occurred merely by chance; therefore, we can conclude that it is a real difference.

It should be noted, however, that if this were a real experiment, it wouldn’t be well designed, because it doesn’t have a control group. To put it simply, we don’t know how much the PEF scores would have changed in the absence of the treatment intervention, so it’s very difficult to draw a conclusion about the efficacy of the treatment.

***************

Right, that’s it for this quick tutorial. You should now be able to run a paired samples t test in SPSS, and to interpret the result that you get.

