The post Calculate the Difference Between Two Dates in SPSS appeared first on EZ SPSS Tutorials.

You might want to do this, for example, if you’ve got a pretest/posttest design involving a therapeutic intervention at the pretest stage, and you want to see whether the length of time between the therapy and the posttest measurement makes a difference to the treatment outcome.

To illustrate, we’re going to use data from a hypothetical study that looks at the effect of a new treatment for asthma by measuring the peak flow of a group of asthma patients immediately before and then some time after treatment. Our task will be to compute the difference between the date of the pre-treatment peak flow measurement and the date of the post-treatment measurement.

This is our pretend data as it appears in the Data View of SPSS.

The variables PrePEF and PostPEF comprise the pretest and posttest peak expiratory flow measurements. The date of each pretest measurement can be found in the PrePEFDate variable (highlighted), and the date of each posttest measurement in the PostPEFDate variable (highlighted).

At this point it is worth noting that the data is in the dd.mm.yy format (so, for example, 07.08.19 is the 7th of August 2019).

We want to compute how many days there are between the pretest and posttest measurements for each subject, and we want to record this data in a new variable.

This is how it’s done using the Compute Variable dialog box.

To begin, click Transform -> Compute Variable, which will bring up the Compute Variable dialog box.

Here are the things you’ve got to do to set up this dialog box so that it will compute the difference in days between two dates, and save the results to a new variable.

First, type the name of your new variable into the Target Variable box. As you can see above, in our example the new variable is going to be called “DifferenceInDays”.

Second, select Date Arithmetic in the Function group list on the right of the dialog. This will allow you to select Datediff in the Functions and Special Variables box. This is the function that does the heavy lifting when calculating the difference in time between two dates.

Third, you’ve got to drag the Datediff function into the Numeric Expression box (as per the red arrow above).

The fourth stage is a little more involved.

If you look in the box beneath the keypad (see above), you’ll see the syntax for the Datediff function is specified as follows:

DATEDIFF(datetime2, datetime1, “unit”).

This tells you how you need to set up the function in the Numeric Expression box.

Currently, the function looks like this: DATEDIFF(?,?,?).

You’ve got to replace the question marks with datetime2, datetime1, and the unit of time in which you want the difference between the two dates to be measured. The options are years, quarters, months, weeks, days, hours, minutes and seconds, and you’ve got to surround the option you choose with quotes.

In our example, we want the expression to look like DATEDIFF(PostPEFDate, PrePEFDate, “days”). So we’ve got the date of our posttest in the datetime2 slot, the date of our pretest in the datetime1 slot, and we want the result to be in days.

A few things to note here: (a) datetime2 and datetime1 have to be date or time format variables; (b) pay attention to their order – DATEDIFF subtracts datetime1 from datetime2, so if the date in the datetime2 slot is the earlier of the two, you’re going to get negative difference numbers, for example, minus 17 days; (c) you can drag and drop your date variables into the Numeric Expression box, but be sure to delete the question marks if you do (alternatively, you can just type over them).

Once you’re done you want the dialog box to look something like this.

Right, that’s it for the set up. Check everything looks good, and then hit the OK button. SPSS will create a new variable called DifferenceInDays, and fill it with the new data.

Here you can see the final result.

The new DifferenceInDays variable (highlighted, above) shows the difference in days between the pretest date and the posttest date. You can now run calculations on this variable.
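Incidentally, if you prefer working with syntax, the Paste button in the Compute Variable dialog box generates the equivalent commands instead of running them. For our example, the pasted syntax would look something like this (variable names as above):

```spss
* Compute the difference in days between the two measurement dates.
COMPUTE DifferenceInDays=DATEDIFF(PostPEFDate, PrePEFDate, "days").
EXECUTE.
```

Running this from a syntax window produces exactly the same DifferenceInDays variable as the dialog box does.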

***************

Okay, that’s job done. You should now be able to use the Compute Variable dialog box to compute the difference between two dates in SPSS.


The post Test for Normality in SPSS appeared first on EZ SPSS Tutorials.

It is a requirement of many parametric statistical tests – for example, the independent-samples *t* test – that data is normally distributed. There are a number of different ways to test this requirement. We’re going to focus on the Kolmogorov-Smirnov and Shapiro-Wilk tests.

- Click Analyze -> Descriptive Statistics -> Explore…
- Move the variable of interest from the left box into the Dependent List box on the right.
- Click the Plots button, and tick the Normality plots with tests option.
- Click Continue, and then click OK.
- Your result will pop up – check out the Tests of Normality section.

Our example data, displayed above in SPSS’s Data View, comes from a pretend study looking at the effect of dog ownership on the ability to throw a frisbee.

Frisbee Throwing Distance in Metres (highlighted) is the dependent variable, and we need to know whether it is normally distributed before deciding which statistical test to use to determine if dog ownership is related to the ability to throw a frisbee.

To begin, click Analyze -> Descriptive Statistics -> Explore… This will bring up the Explore dialog box, as below.

The set up here is quite easy.

First, you’ve got to get the Frisbee Throwing Distance variable over from the left box into the Dependent List box. You can either drag and drop, or use the blue arrow in the middle.

The Factor List box allows you to split your dependent variable on the basis of the different levels of your independent variable(s). In our example, Dog Owner, our independent variable, has two levels – owner and non-owner – so we could add Dog Owner to the Factor List box, and look at our dependent variable split on that basis. However, since we can perfectly well test for normality without adding in this extra complexity, we’ll just leave the box empty.

Once you’ve got the variable you want to test for normality into the Dependent List box, you should click the Plots button. The Plots dialog box will pop up.

In this box, you want to make sure that the Normality plots with tests option is ticked, and it’s also sensible to select both descriptive statistics options (Stem-and-leaf and Histogram).

Now click Continue, which will take you back to the Explore dialog box. This should now look something like this.

You’re now ready to test whether your data is normally distributed.

Press the OK button.
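(If you’d rather skip the dialog boxes, the Paste button generates the equivalent EXAMINE syntax. It would look something like the following – we’re assuming the dependent variable is named FrisbeeDistance, as the actual name in the dataset isn’t shown.)

```spss
* Explore with normality plots and tests (the NPPLOT keyword produces the Kolmogorov-Smirnov and Shapiro-Wilk tests).
EXAMINE VARIABLES=FrisbeeDistance
  /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
  /STATISTICS DESCRIPTIVES
  /MISSING LISTWISE
  /NOTOTAL.
```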

The Explore option in SPSS produces quite a lot of output. Here’s what you need to assess whether your data distribution is normal.

SPSS runs two statistical tests of normality – Kolmogorov-Smirnov and Shapiro-Wilk.

If the significance value is greater than the alpha value (we’ll use .05 as our alpha value), then there is no reason to think that our data differs significantly from a normal distribution – i.e., we fail to reject the null hypothesis that the data is normally distributed. (Note that for both tests the null hypothesis is that the data *is* normal, so a non-significant result is the one you want here.)

As you can see above, both tests return a significance value that’s greater than .05; therefore, we have no reason to doubt that our data is normally distributed.

A complication that can arise here occurs when the results of the two tests don’t agree – that is, when one test shows a significant result and the other doesn’t. In this situation, use the Shapiro-Wilk result – in most circumstances, it is more reliable.

SPSS also provides a normal Q-Q Plot chart which provides a visual representation of the distribution of the data.

If a distribution is normal, then the dots will broadly follow the trend line.

As you can see above, our data does cluster around the trend line – which provides further evidence that our distribution is normal.

Put this Q-Q plot together with the results of the statistical tests, and we’re safe in assuming that our data is normally distributed. This means that at least one of the criteria for parametric statistical testing is satisfied.

***************

Okay, that’s this tutorial over and done with. You should now be able to interrogate your data in order to determine whether it is normally distributed.


The post Mann-Whitney U Test in SPSS, Including Interpretation appeared first on EZ SPSS Tutorials.

The Mann-Whitney U Test evaluates whether two samples are likely to originate from the same underlying population, and it tends to be used in situations where an independent-samples *t* test is not appropriate (for example, if either of the sample distributions are non-normal).

- Click Analyze -> Nonparametric Tests -> Legacy Dialogs -> 2 Independent Samples.
- Drag and drop the dependent variable into the Test Variable(s) box, and the grouping variable into the Grouping Variable box.
- Tick Mann-Whitney U under Test Type.
- Click on Define Groups, and input the values that define each of the groups that make up the grouping variable (i.e., the coded value for Group 1 and the coded value for Group 2).
- Press Continue, and then click on OK to run the test.
- The result will appear in the SPSS Output Viewer.

For this tutorial, we’re using data from a fake study that looks at the relationship between dog ownership and the ability to throw a frisbee.

As per usual, we’re working on the assumption that you’ve opened SPSS, you’re looking at the Data View, and it looks something like this.

In our example, Frisbee Throwing Distance in Metres is the dependent variable, and Dog Owner is the grouping variable. Put simply, we want to know whether owning a dog (independent variable) has any effect on the ability to throw a frisbee (dependent variable).

Given this setup, it would be usual to conduct an independent samples *t* test. One assumption of this parametric test is that data is normally distributed. The trouble is if we test our data for normality, we get this result.

Both Kolmogorov-Smirnov and Shapiro-Wilk suggest that our dependent variable is not distributed normally. This is confirmed by the histogram, which has a long left tail.

This means we’re better off using a non-parametric test to determine whether there is a relationship between our independent and dependent variables (though, actually, since we have a large number of observations, we’d probably get away with the *t* test). The obvious choice here is the Mann-Whitney U test.

To begin, click Analyze -> Nonparametric Tests -> Legacy Dialogs -> 2 Independent Samples. This will bring up the Two-Independent-Samples Tests dialog box.

The setup here is not too difficult.

To perform the Mann-Whitney U test, we’ve got to get our dependent variable (Frisbee Throwing Distance) into the Test Variable List box, and our grouping variable (Dog Owner) into the Grouping Variable box. To move the variables over, you can either drag and drop, or use the blue arrows.

You also need to select Mann-Whitney U under Test Type (by ticking the box).

The dialog should now look something like this.

You’ll notice that the Grouping Variable, DogOwner, has two question marks in brackets after it. This indicates that you need to define the groups that make up the grouping variable. Click on the Define Groups button.

We’re using 0 and 1 to specify each group, because these values match the way the variable is coded (the Data View shows value labels, not the underlying numeric values). 0 is No Dog; and 1 is Owns Dog.

It’s also worth noting that if you had coded your grouping variable as a String type, then you’d need to match the string values that appear in the Data View precisely – for example, “No Dog” and “Owns Dog”.

Once you have specified the values that define each group, press the Continue button, and then click on OK in the main dialog box to run the Mann-Whitney U test.
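For the record, the equivalent legacy syntax for this test looks something like the following (we’re assuming the dependent variable is named FrisbeeDistance; DogOwner is coded 0 and 1 as described above).

```spss
* Mann-Whitney U test: compare FrisbeeDistance across the two groups.
NPAR TESTS
  /M-W= FrisbeeDistance BY DogOwner(0 1)
  /MISSING ANALYSIS.
```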

The result will appear in the SPSS Output Viewer.

The Mann-Whitney test works by converting scores into ranks while ignoring the grouping variable (in our example, ownership and non-ownership of a dog), and then comparing the mean rank of each group. If the difference between the mean ranks is big enough to be significant, then the null hypothesis that the samples derive from the same population is rejected.

As you can see above, there is what looks like a sizeable difference between the mean ranks of the No Dog and Owns Dog groups. The Mann-Whitney test statistic will tell us whether this difference is big enough to reach significance.

SPSS produces a test statistics table to summarise the result of the Mann-Whitney U test. The key values are Mann-Whitney U, Z and the 2-tailed significance score.

In our example, the No Dog group comprises more than 20 observations. This means we can use the value of Z to derive our *p*-value. Otherwise, the significance value would come from U.

SPSS is reporting a Z score of -2.049 and a 2-tailed *p*-value of .040. This would normally be considered a significant result (the standard alpha level is .05). Therefore, we can be confident in rejecting the null hypothesis that holds that the Owns Dog and No Dog groups are drawn from the same underlying population. Or, to put this another way, the result of the Mann-Whitney U test supports the proposition that owners and non-owners of dogs have different frisbee throwing abilities.

***************

Okay, that’s the end of this tutorial. You should now have a good idea of how to perform the Mann-Whitney U test in SPSS, and how to interpret the result.


The post How to Compute Difference Scores in SPSS appeared first on EZ SPSS Tutorials.

Perhaps the most common scenario for computing difference scores is where you’ve got a pre-test/post-test scenario, and you want to see how the dependent variable has changed between the two conditions (difference scores are sometimes termed “change scores”).

- Transform -> Compute Variable…
- Name the variable to hold the new difference scores (in the Target Variable box)
- Use the Numeric Expression box to calculate difference scores, using this format: Variable2Name – Variable1Name (or vice versa)
- Click OK

Our data for this tutorial comes from a hypothetical study looking at the effect of a new treatment for asthma by measuring the peak flow of a group of asthma patients before and after treatment.

The two variables we are interested in here are *PrePEF* – pretest peak expiratory flow (measured in litres per minute) – and *FirstPostPEF* – posttest peak expiratory flow (measured in litres per minute).

We want to create an additional variable that holds the difference scores for these two variables allowing us to track how peak flow has changed after treatment.

To begin, click Transform -> Compute Variable…

This will bring up the Compute Variable dialog box.

This dialog enables us to create a new variable based on a variety of numeric (and other) operations. For example, suppose you have given your experimental subjects five different tests to complete, and you want to sum the scores of these tests for each subject, and fill a new variable with the totals. You can accomplish this task using the Compute Variable dialog box.

What we want to do here is to create a new variable that holds difference scores (or change scores) for our pretest and posttest variables. This is how the dialog box needs to be set up.

This is fairly straightforward. To compute the difference scores we need to subtract the pretest score from the posttest score. It’s this way around because we want a positive number (representing an increase) if the posttest score is higher than the pretest score. We also need to name a new variable within which we’ll store our new difference scores.

As you can see above, we set up the calculation for the difference scores in the Numeric Expression box. Drag and drop the FirstPostPEF variable into this box, then click the minus sign (on the keypad in the middle of the dialog box), and then drag and drop the PrePEF variable into the box.

Now you just need to type the name of the variable that’ll contain the difference scores in the Target Variable box. In our example, the new variable is called “Change”.

That’s all there is to it. Click the OK button to compute the difference scores and create a new variable.
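(If you hit the Paste button rather than OK, the dialog generates syntax along these lines – handy if you want to rerun the computation later.)

```spss
* Difference (change) scores: posttest minus pretest.
COMPUTE Change=FirstPostPEF - PrePEF.
EXECUTE.
```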

The result of the procedure looks like this.

As you can see, SPSS has created a new variable called “Change”, and filled it with difference scores (i.e., calculated by subtracting the PrePEF score from the FirstPostPEF score).

At this point it’s worth taking a look at the Variable View – just click on the tab towards the bottom of the screen – to check the properties of the variable that SPSS has created.

This all looks okay. The type is Numeric and the level of measurement has been correctly identified as Scale.

The only thing we might want to alter is the number of decimals we’re going to display on the Data View. This is currently set at 2, whereas the other variables are configured to display without decimals. Click in the appropriate box if you want to change it. (We have a separate tutorial that deals with the Variable View in detail.)

***************

That’s pretty much it for this tutorial. You should now be able to use the Compute Variable option to calculate difference scores in SPSS. Obviously, this only begins to scratch the surface of the power of the numerical operations on offer via this menu item. We’ll look at some other common usages in future tutorials.


The post Repeated-Measures ANOVA in SPSS, Including Interpretation appeared first on EZ SPSS Tutorials.

A repeated-measures ANOVA design is sometimes used to analyze data from a longitudinal study, where the requirement is to assess the effect of the passage of time on a particular variable. For this tutorial, we’re going to use data from a hypothetical study that looks at whether fear of spiders among arachnophobes increases over time if the disorder goes untreated.

- Click Analyze -> General Linear Model -> Repeated Measures
- Name your Within-Subject factor, specify the number of levels, then click Add
- Hit Define, and then drag and drop (left to right) a variable for each of the levels you specified (taking care to preserve their correct order)
- Click Options, and tick the Descriptive statistics and Estimate of effect size boxes, and then click Continue
- You’re now ready to run the test. Press the OK button, and your result will pop up in the Output Viewer

This is the data from our “study” as it appears in the SPSS Data View.

The variable we’re interested in here is SPQ, which is a measure of the fear of spiders that runs from 0 to 31. The average score for a person with a spider phobia is 23, which compares to a score of slightly under 3 for a non-phobic.

SPQ is the dependent variable. The independent variable – or, to adopt the terminology of ANOVA, the within-subjects factor – is time, and it has three levels: SPQ_Time1 is the time of the first SPQ assessment; SPQ_Time2 is one year later; and SPQ_Time3 two years later.

The null hypothesis is that the mean SPQ score is the same for all levels of the within-subjects factor. This is what we’ll test with a one-way repeated-measures ANOVA.

To start, click Analyze -> General Linear Model -> Repeated Measures. This will bring up the Repeated Measures Define Factor(s) dialog box.

As we noted above, our within-subjects factor is time, so type “time” in the Within-Subject Factor Name box. And we have 3 levels, so input 3 into Number of Levels. Then click Add.

The dialog box should now look like this.

Okay, it’s now time to set up the within-subjects variables (at the moment SPSS knows that our within-subjects factor has three levels, but it doesn’t know which of our variables corresponds to each level). Click on the Define button, which will bring up the Repeated Measures dialog box.

You’ve got to shift your within-subjects variables over to the Within-Subjects Variables box ensuring you maintain the correct order. You can drag and drop, or use the arrow button in the middle of the box. In our case, it just means moving SPQ_Time1, SPQ_Time2 & SPQ_Time3 into the three slots on the right.

The dialog box should look something like this once you’ve completed this stage.

We’re now ready to set up some of the options for the repeated-measures ANOVA. Click on the Options button.

What you see here depends on the version of SPSS you’re using. The most recent version of SPSS (26) has an options dialog box that looks like this.

Previous versions include an option for specifying estimated marginal means. It looks like this.

We’re going to assume that you’re using a previous version of SPSS, and you’re seeing the estimated marginal means option. If you’re not, then you need to click on the EM Means button (in the Repeated Measures dialog box) after you’ve finished with the Options dialog box, and set up the estimated marginal means there.

It’s not too difficult to get the options sorted out. You want to display descriptive statistics and estimates of effect size, so tick these options in the Display section (as above). And then in the Estimated Marginal Means section (or dialog box if you’re using the current version of SPSS), move “time” over to the Display Means for box, and then tick Compare main effects, and choose Bonferroni as the Confidence interval adjustment option.

Hit the Continue button(s) once you’ve got this set up.

That’s it, you’re ready to run the test. You should be looking at the original Repeated Measures dialog box. All you’ve got to do is hit OK, and you’ll see the result pop up in the Output Viewer.
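For reference, pasting rather than running the dialog produces GLM syntax something like this (with our within-subjects variables and the factor name “time” as set up above).

```spss
* One-way repeated-measures ANOVA with a three-level within-subjects factor.
GLM SPQ_Time1 SPQ_Time2 SPQ_Time3
  /WSFACTOR=time 3 Polynomial
  /METHOD=SSTYPE(3)
  /EMMEANS=TABLES(time) COMPARE ADJ(BONFERRONI)
  /PRINT=DESCRIPTIVE ETASQ
  /WSDESIGN=time.
```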

SPSS produces a lot of output for the one-way repeated-measures ANOVA test. For the purposes of this tutorial, we’re going to concentrate on a fairly simple interpretation of all this output. (In future tutorials, we’ll look at some of the more complex options available to you, including multivariate tests and polynomial contrasts).

The descriptive statistics that SPSS outputs are easy enough to understand. The comparison between means (see above) gives us an idea of the direction of any possible effect. In our example, it seems as if fear of spiders increases over time, with the greatest increase (20.90 to 22.26 on the SPQ scale) occurring between year 1 (SPQ_Time2) and year 2 (SPQ_Time3). Of course, we won’t know whether these differences in the means reach significance until we look at the result of the ANOVA test.

A requirement that must be met before you can trust the *p*-value generated by the standard repeated-measures ANOVA is the homogeneity-of-variance-of-differences (or sphericity) assumption. For our purposes, it doesn’t matter too much what this means, we just need to know how to figure out whether the requirement has been satisfied.

SPSS tests this assumption by running Mauchly’s test of sphericity.

What we’re looking for here is a *p*-value that’s *greater* than .05. Our *p*-value is .494, which means we meet the assumption of sphericity.

You’ve got to be careful here. This assumption is frequently violated. If it is, in order to calculate a reliable value for *p*, you’ll need to adjust the degrees of freedom of *F* in line with the extent to which the assumption is violated. Happily, SPSS does this work for you. All you’ve got to do is choose an alternative univariate test. Let’s look at this now.

This is where we read off the result of the repeated-measures ANOVA test.

As we have just discussed, our data meets the assumption of sphericity, which means we can read our result straight from the top row (Sphericity Assumed). The value of *F* is 5.699, which reaches significance with a *p*-value of .006 (which is less than the .05 alpha level). This means there is a statistically significant difference between the means of the different levels of the within-subjects variable (time).

If our data had not met the assumption of sphericity, we would need to use one of the alternative univariate tests. You’ll notice that these produce the same value for *F*, but that there is some variation in the reported degrees of freedom. In our case, there is not enough difference to alter the *p*-value – Greenhouse-Geisser and Huynh-Feldt, both produce significant results (*p* = .006).

Although we know that the differences between the means of our three within-subjects levels are large enough to reach significance, we don’t yet know between which of the various pairs of means the difference is significant. This is where pairwise comparisons come into play.

This table features three *unique* comparisons between the means for SPQ_Time1, SPQ_Time2 and SPQ_Time3. Only one of the differences reaches significance, and that’s the difference between the means for SPQ_Time1 and SPQ_Time3 (see above). It is worth noting that SPSS is using an adjusted *p*-value here in order to control for multiple comparisons, and that the program lets you know if a mean difference has reached significance by attaching an asterisk to the value in column 3.

When reporting the result it’s normal to reference both the ANOVA test and any post hoc analysis that has been done.

Thus, given our example, you could write something like:

A repeated-measures ANOVA determined that mean SPQ scores differed significantly across three time points (*F*(2, 58) = 5.699, *p* = .006). A post hoc pairwise comparison using the Bonferroni correction showed an increased SPQ score between the initial assessment and a follow-up assessment one year later (20.1 vs 20.9, respectively), but this was not statistically significant (*p* = .743). However, the increase in SPQ score did reach significance when comparing the initial assessment to a second follow-up assessment taken two years after the original assessment (20.1 vs 22.26, *p* = .010). Therefore, we can conclude that the results of the ANOVA indicate a significant time effect for untreated fear of spiders as measured on the SPQ scale.

***************

Okay, that’s all for this tutorial. You should now be able to run a repeated-measures ANOVA, test the assumption of sphericity, make use of a pairwise comparison, and report the result. In future tutorials, we’ll look at some of the more sophisticated options available for this test. But this tutorial should provide enough information for you to run a basic repeated-measures ANOVA test.


The post Rules for Naming Variables in SPSS appeared first on EZ SPSS Tutorials.

SPSS allows you to rename variables either via its Variable View or by using syntax. There are a number of rules governing the naming of variables.

- Names can be safely up to 32 characters long (recent versions of SPSS allow up to 64 bytes). Names may include letters, digits, and a period (.).
- Names must begin with a letter.
- You can’t have a space in a variable name.
- Don’t end a variable name with a period.
- Don’t end a variable name with an underscore.
- You can use periods and underscores *within* a variable name.
- You can use upper and lower case, and a mixture thereof, within a variable name.
- You can’t use SPSS reserved keywords as a variable name (i.e., you can’t use ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO or WITH).
- Each variable name must be unique.
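If you’re renaming via syntax, the RENAME VARIABLES command is the one to use. A couple of quick examples with made-up names that follow the rules above:

```spss
* Letters, digits and an internal underscore are all fine.
RENAME VARIABLES (var1 = Pretest_Score).
* A period is allowed within a name (but not at the end).
RENAME VARIABLES (var2 = Test.Score2).
```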

***************

That’s it, short and sweet. If you follow those rules when naming variables, you’re not going to go wrong.


The post How to Rename a Dataset in SPSS appeared first on EZ SPSS Tutorials.

- In the SPSS Data View, click File, then Rename Dataset…
- Type the new dataset name into the dialog box, following SPSS’s naming conventions
- Click OK

The normal scenario for renaming a dataset is where you’re working with two sets of data, each of which is open in its own Data View, and you want to give each dataset a memorable name to make them readily distinguishable.

SPSS uses a default convention to name datasets automatically. This is of the form, *DataSetn*, where *n* is an incremental integer value (i.e., 1, 2, 3, etc). You can see this with our example dataset above. (The exception to this is if you open a dataset using SPSS’s syntax language, in which case no name is given unless it is specified).

It’s easy to rename a dataset, though it’s not wildly obvious how it’s done.

Click on File, and then select the Rename Dataset option. This will bring up the Rename Dataset dialog box.

All you’ve got to do now is to type in the new name of the dataset. You need to follow SPSS’s naming rules, and you should try to make your name meaningful.

Once you’re done, just press the OK button.
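The syntax equivalent is a one-liner: the DATASET NAME command renames the active dataset.

```spss
* Give the active dataset a memorable name.
DATASET NAME FirstTimeTask.
```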

As you can see, the dataset has been renamed to FirstTimeTask, as specified in the dialog box.

***************

That’s it for this quick tutorial. You should now have all the information you need to rename a dataset in the SPSS statistics program.


The post How to Calculate the Median in SPSS appeared first on EZ SPSS Tutorials.

- Click Analyze -> Descriptive Statistics -> Frequencies
- Move the variable for which you wish to calculate the median into the right-hand column
- Click the Statistics button, select Median under Central Tendency, and then press Continue
- Click OK to perform the calculation

This is the data set with which we’re going to be working.

So we’ve got three variables here: (a) duration – which is the duration in seconds it takes to complete a certain task; (b) sex – male or female; and (c) height – in inches.

You want to find out the median of the *duration* variable. In other words, you want to know the duration in seconds that lies exactly at the midpoint of the distribution of all durations.

There are a number of different ways of calculating the median in SPSS. This is probably the easiest.

Click Analyze -> Descriptive Statistics -> Frequencies.

This will bring up the Frequencies dialog box.

You need to get the variable for which you wish to calculate the median into the Variable(s) box on the right. You can do this by dragging and dropping, or by selecting the variable on the left, and then clicking the arrow in the middle.

Once you’ve set this up, hit the Statistics button to bring up the Statistics dialog box.

Here you just want to tick the Median option under Central Tendency on the right.

We’ve also selected Mean and Standard Deviation, just because these are standard measures of central tendency and dispersion (respectively).

When you’re done, click Continue. You should now be looking at something like this.

It’s probably worth noting that we’ve also selected Display frequency tables at the bottom on the left. This isn’t necessary, but the option will provide useful additional information.

You’re now set up to calculate the median.

Just hit the OK button.
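(Again, the Paste button will give you the equivalent syntax, which should look something like this.)

```spss
* Frequencies with median, mean and standard deviation.
FREQUENCIES VARIABLES=duration
  /STATISTICS=MEDIAN MEAN STDDEV
  /ORDER=ANALYSIS.
```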

The result appears in SPSS’s output viewer.

As you can see, this is very easy to interpret.

For our example, the median value is 7.02. (The mean is 7.3541, and the standard deviation is 2.33632).

***************

That’s all for this quick tutorial. You should now know how to calculate the median in SPSS.


The post Frequency Distribution in SPSS appeared first on EZ SPSS Tutorials.

- Click on Analyze -> Descriptive Statistics -> Frequencies
- Move the variable of interest into the right-hand column
- Click on the Charts button, select Histograms, and then press the Continue button
- Click OK to generate a frequency distribution table

This is the data set we’ll be using.

It comes from a logic test featured on the Philosophy Experiments website that requires people to identify whether arguments are valid or invalid.

We’re interested in the Score variable, which is the number of questions people get right out of 15.

A frequency distribution table provides a snapshot view of the characteristics of a data set. It allows you to see how scores are distributed across the whole set of scores – whether, for example, they are spread evenly or skew towards a particular end of the distribution.

To make a frequency distribution table, click on Analyze -> Descriptive Statistics -> Frequencies.

This will bring up the Frequencies dialog box.

You need to get the variable for which you wish to generate the frequencies into the Variable(s) box on the right. You can do this by dragging and dropping, or by selecting the variable on the left, and then clicking the arrow in the middle.

Once you’ve set this up, hit the Charts button to bring up the Charts dialog box.

Now select Histograms as the chart type (and additionally it’s a good idea to tick the show normal curve option).

Click Continue when you’re done, which will bring you back to the Frequencies dialog box. This should look something like this.

Now you’re ready to generate the frequency distribution table and histogram. Just hit the OK button.

The output produced by SPSS is fairly easy to understand.

First we have the frequency distribution table:

The scores (in our case, the number of correct answers) are in the left column. The number of occurrences of a given score is specified in the Frequency column.

You’ve also got columns for Percent and Cumulative Percent. Percent is the number of occurrences of a given score divided by the total number of scores, multiplied by 100; Cumulative Percent is the running total of the Percent values as you move down the rows.

The size of the sample is effectively the total number of valid scores, which you can see at the top of the table and at the bottom of the Frequency column.
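The arithmetic behind the table is straightforward, and can be sketched in a few lines of Python with the standard library’s Counter. The scores below are made up for illustration (not the logic-test data used in this tutorial):

```python
from collections import Counter

# Hypothetical scores out of 15 (illustrative only)
scores = [7, 9, 9, 10, 10, 10, 11, 12, 12, 14]

counts = Counter(scores)
n = len(scores)  # total number of valid scores

cumulative = 0.0
table = []
for score in sorted(counts):
    freq = counts[score]
    percent = freq * 100 / n     # share of all scores, as a percentage
    cumulative += percent        # running total down the rows
    table.append((score, freq, percent, cumulative))
```

By construction, the frequencies sum to the sample size and the cumulative percent reaches 100 in the final row, just as in SPSS’s output.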

A histogram provides a graphical representation of a frequency distribution.

Here’s ours.

The y-axis (on the left) represents a frequency count, and the x-axis (across the bottom), the value of the variable (in this case the number of correct answers). You’ll notice that SPSS also provides values for mean (9.7) and standard deviation (2.654). It appears that our distribution is somewhat skewed to the left.

If you want to save your histogram, you can right-click on it within the output viewer, and choose to copy it to an image file (which you can then use within other programs).

***************

We hope you have found this quick tutorial useful. You should now be able to generate a frequency distribution table in SPSS and also select the histogram option.

The post One Way ANOVA in SPSS Including Interpretation appeared first on EZ SPSS Tutorials.

- Click on Analyze -> Compare Means -> One-Way ANOVA
- Drag and drop your independent variable into the Factor box and dependent variable into the Dependent List box
- Click on Post Hoc, select Tukey, and press Continue
- Click on Options, select Homogeneity of variance test, and press Continue
- Press the OK button, and your result will pop up in the Output viewer

We’re starting from the assumption that you’ve already got your data into SPSS, and you’re looking at a Data View screen that looks a bit like this.

Our fictitious dataset contains a number of different variables. For the purposes of this tutorial, we’re interested in whether level of education has an effect on the ability of a person to throw a frisbee. Our independent variable, therefore, is Education, which has three levels – High School, Graduate and PostGrad – and our dependent variable is Frisbee Throwing Distance (i.e., the distance a subject throws a frisbee).

The one-way ANOVA test allows us to determine whether there is a significant difference in the mean distances thrown by each of the groups.

To start, click on Analyze -> Compare Means -> One-Way ANOVA.

This will bring up the One-Way ANOVA dialog box.

To set up the test, you’ve got to get your independent variable into the Factor box (Education in this case, see above) and dependent variable into the Dependent List box. You can do this by dragging and dropping, or by highlighting a variable, and then clicking on the appropriate arrow in the middle of the dialog.

After you’ve moved the variables over, you should click the Post Hoc button, which will allow you to specify the post hoc test(s) you wish to run.

The ANOVA test will tell you whether there is a significant difference between the means of two or more levels of a variable. However, if you’ve got more than two levels it’s not going to tell you between *which* of the various pairs of means the difference is significant. You need to do a post hoc test to find this out.

The Post Hoc dialog box looks like this.

You should select Tukey, as shown above, and ensure that your significance level is set to 0.05 (or whatever alpha level is right for your study).

Now press Continue to return to the previous dialog box.

You should be looking at this dialog box again.

Click Options to bring up the Options dialog box.

At the very least, you should select the Homogeneity of variance test option (since homogeneity of variance is required for the ANOVA test). Descriptive statistics and a Means plot are also useful.

Once you’ve made your selections, click Continue.

At this point, you’re ready to run the test.

Review your options, and click the OK button. You’ll see the result pop up in the Output Viewer.

SPSS produces a lot of data for the one-way ANOVA test. Let’s deal with the important bits in turn.

It’s worth having a quick glance at the descriptive statistics generated by SPSS.

If you look above, you’ll see that our sample data produces a difference in the mean scores of the three levels of our education variable. In particular, the data analysis shows that the subjects in the PostGrad group throw the frisbee quite a bit further than subjects in the other two groups. The key question, of course, is whether the difference in mean scores reaches significance.

A requirement for the ANOVA test is that the variances of each comparison group are equal. We have tested this using the Levene statistic. What you’re looking for here is a significance value that is greater than .05. You *don’t* want a significant result, since a significant result would suggest a real difference between variances.

In our example, as you can see above, the significance value of the Levene statistic based on a comparison of medians is .155. This is *not* a significant result, which means the requirement of homogeneity of variance has been met, and the ANOVA test can be considered to be robust.
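The “based on medians” version of the Levene test (often called the Brown–Forsythe variant) amounts to running a one-way ANOVA on each score’s absolute deviation from its group median. Under that assumption, here is a minimal Python sketch with toy data (not the frisbee dataset):

```python
import statistics

def brown_forsythe_w(groups):
    """Median-based Levene statistic: a one-way ANOVA F computed on
    absolute deviations from each group's median. Toy sketch only."""
    # Transform each score to its absolute deviation from the group median
    deviations = [[abs(x - statistics.median(g)) for x in g] for g in groups]

    all_d = [d for g in deviations for d in g]
    grand = sum(all_d) / len(all_d)
    means = [sum(g) / len(g) for g in deviations]

    # Between-groups and within-groups sums of squares of the deviations
    ss_between = sum(len(g) * (m - grand) ** 2
                     for g, m in zip(deviations, means))
    ss_within = sum((d - m) ** 2
                    for g, m in zip(deviations, means) for d in g)

    df_b = len(groups) - 1
    df_w = len(all_d) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w)

# Two groups with identical spread around their medians -> statistic of 0
w = brown_forsythe_w([[1, 2, 3], [11, 12, 13]])
```

A small statistic (and hence a large p-value, like the .155 here) indicates no detectable difference in spread between the groups.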

Now that we know we have equal variances, we can look at the result of the ANOVA test.

The ANOVA result is easy to read. You’re looking for the value of F that appears in the Between Groups row (see above) and whether this reaches significance (next column along).

In our example, we have a significant result. The value of F is 3.5, which reaches significance with a *p*-value of .038 (which is less than the .05 alpha level). This means there is a statistically significant difference between the means of the different levels of the education variable.
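The F value in that row comes from the between-groups and within-groups sums of squares. The arithmetic can be sketched in Python with tiny made-up groups (not the frisbee data, where SPSS does this work for you):

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups SS: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups SS: spread of scores around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Three toy groups of three scores each -> F(2, 6) = 3.0
f, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The degrees of freedom follow the same pattern as the tutorial’s F(2,47): number of groups minus one, and total sample size minus number of groups.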

However, as yet we don’t know between *which* of the various pairs of means the difference is significant. For this we need to look at the result of the post hoc Tukey HSD test.

If you take a look at the Multiple Comparisons table above you’ll see that significance values have been generated for the mean differences between pairs of the various levels of the education variable (Graduate – High School; Graduate – PostGrad; and High School – PostGrad).

In our example, the Tukey HSD (Honest Significant Difference) shows that it is only the mean difference between the High School and PostGrad groups that reaches significance (see the Sig. column, above). The *p*-value is .034, which is less than the standard .05 alpha level.

When reporting the result it’s normal to reference both the ANOVA test and the post hoc Tukey HSD test.

Thus, given our example here, you could write something like:

There was a statistically significant difference between groups as demonstrated by one-way ANOVA (F(2,47) = 3.5, *p* = .038). A Tukey post hoc test showed that the PostGrad group was able to throw the frisbee statistically significantly further than the High School group (*p* = .034). There was no statistically significant difference between the Graduate and High School groups (*p* = .691) or between the Graduate and PostGrad groups (*p* = .099).

***************

Right, that’s it for this tutorial. You should now be able to perform a one-way ANOVA test in SPSS, check the homogeneity of variance assumption has been met, run a post hoc test, and interpret and report your result.
