11.2. The t-Test

11.2. The t-Test

By Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler and Dana C. Leighton, adapted by Marc Chao and Muhamad Alif Bin Ibrahim

In psychology, many studies examine differences between two means, and the t-test is the most common tool for analysing such relationships. This section focuses on three types of t-tests, each suited to different research designs:

One-Sample t-Test: Compares a sample mean to a known population mean.
Dependent-Samples t-Test: Used when comparing two related groups, such as the same participants tested at different times.
Independent-Samples t-Test: Compares the means of two independent groups, such as different sets of participants.

Even if you have taken a statistics course before, this section will help refresh your understanding of these essential tools.

One-Sample t-Test

The one-sample t-test is used to compare a sample mean (M) to a hypothetical population mean (μ₀) that serves as a reference point. The test evaluates two competing hypotheses:

Null Hypothesis (H₀): The population mean (μ) equals the hypothetical mean (μ₀).
Alternative Hypothesis (H𝑎): The population mean (μ) is different from the hypothetical mean (μ₀).

To decide between these hypotheses, the t-test calculates the probability (p-value) of obtaining a sample mean as extreme as the one observed if the null hypothesis were true. First, a statistic called t is calculated using this formula:

Here:

M = Sample mean
μ₀ = Hypothetical population mean
SD = Sample standard deviation
N = Sample size

The Role of the t Distribution

The t-statistic is useful because its behaviour under the null hypothesis is well understood as it follows a t distribution. This distribution is unimodal, symmetric, and centred at 0. Its shape depends on degrees of freedom (df), calculated as N − 1 for a one-sample t-test.

For example, with df = 24, the t distribution looks like Figure 11.2.1, where extreme t values occur in the tails. To calculate the p-value, the proportion of scores in the tails of this distribution that are as extreme as the observed t value is determined. For instance, a t value of 1.50 with df = 24 corresponds to a p-value of 0.14.

Diagram illustrating the Distribution of t scores (with 24 degrees of freedom) when the null hypothesis is true — Figure 11.2.1. Distribution of t scores (with 24 degrees of freedom) when the null hypothesis is true. The red vertical lines represent the two-tailed critical values, and the green vertical lines the one-tailed critical values when α = .05, by R. S. Jhangiani et al. in Research Methods in Psychology 4e is used under a CC BY-NC-SA licence

Decision Rule

Using statistical software, you can directly obtain the t value and p-value. Here’s how you interpret the results:

If p ≤ 0.05, reject the null hypothesis and conclude that the population mean likely differs from the hypothetical mean.
If p > 0.05, retain the null hypothesis, meaning there is insufficient evidence to conclude a difference.

When manually computing the t value, you can refer to a critical t-table (like Table 11.2.1). The table provides critical t values for different degrees of freedom (df) and significance levels (α = 0.05). For a two-tailed test, any t value beyond the critical values (e.g., ±2.064 for df = 24) leads to rejecting the null hypothesis.

Table 11.2.1. Table of critical values of t when α = .05
df	One-tailed	Two-tailed
	Critical value
3	2.353	3.182
4	2.132	2.776
5	2.015	2.571
6	1.943	2.447
7	1.895	2.365
8	1.860	2.306
9	1.833	2.262
10	1.812	2.228
11	1.796	2.201
12	1.782	2.179
13	1.771	2.160
14	1.761	2.145
15	1.753	2.131
16	1.746	2.120
17	1.740	2.110
18	1.734	2.101
19	1.729	2.093
20	1.725	2.086
21	1.721	2.080
22	1.717	2.074
23	1.714	2.069
24	1.711	2.064
25	1.708	2.060
30	1.697	2.042
35	1.690	2.030
40	1.684	2.021
45	1.679	2.014
50	1.676	2.009
60	1.671	2.000
70	1.667	1.994
80	1.664	1.990
90	1.662	1.987
100	1.660	1.984

One-Tailed vs. Two-Tailed Tests

Two-Tailed Test:

Used when you suspect the sample mean might differ from the hypothetical mean, but do not know in which direction.
Reject the null hypothesis if the t value falls in either tail of the distribution (e.g., below −2.064 or above +2.064 for df = 24).

One-Tailed Test:

Applied when you expect a difference in a specific direction.
You decide beforehand whether to test for a sample mean higher or lower than the hypothetical mean.
For df = 24, one-tailed critical values are less extreme (±1.711). However, the disadvantage is that you lose the ability to detect differences in the opposite direction.

In both cases, the significance level (α = 0.05) is maintained. The one-tailed test gives more power to detect a directional difference but completely ignores the possibility of a difference in the opposite direction.

Example: One-Sample t-Test

A health psychologist is interested in evaluating how accurately university students estimate the calorie content of a chocolate chip cookie. The cookie contains 250 calories, which serves as the hypothetical population mean (μ₀). The psychologist’s null hypothesis (H₀) states that the population mean of students’ calorie estimates (μ) is equal to 250. Since he has no prior expectation about whether students overestimate or underestimate, he decides to conduct a two-tailed test to examine deviations in either direction.

To test the hypothesis, the psychologist gathers calorie estimates from a sample of 10 university students. Their individual estimates are as follows:

250, 280, 200, 150, 175, 200, 200, 220, 180, 250

From this data, the psychologist calculates the following sample statistics:

Sample Mean (M): 212.00 calories
Sample Standard Deviation (SD): 39.17 calories
Sample Size (N): 10

The psychologist uses the one-sample t-test to determine if the sample mean significantly differs from the hypothetical population mean. The formula for the t-test is:

Where:

M = Sample mean = 212.00
μ₀ = Hypothetical population mean = 250
SD = Sample standard deviation = 39.17
N = Sample size = 10

Substituting the values into the formula:

The computed t value is −3.07.

The psychologist enters the data into statistical software (e.g., SPSS or Excel). The software calculates the p value for this t-score with df = N − 1 = 10 − 1 = 9. The output shows a two-tailed p value of 0.013.

Decision: Since p = 0.013 is less than the significance level α = 0.05, the psychologist rejects the null hypothesis.
Conclusion: There is sufficient evidence to conclude that university students significantly underestimate the calorie content of a chocolate chip cookie.

If performing the test manually, the psychologist would consult a table of critical t values (e.g., Table 11.2.1). For a two-tailed test with df = 9 and α = 0.05, the critical value of t is:

t_critical = ±2.262

The observed t value (t = −3.07) is more extreme than the critical value (−2.262), leading to the same conclusion: the null hypothesis is rejected.

The psychologist reports the results as follows:

t(9) = −3.07, p = .01

Key formatting points in APA style include:

The symbols t and p are italicised.
The degrees of freedom (df) are placed in parentheses immediately after t.
Both the t value and p value are rounded to two decimal places.

If the psychologist had a strong reason to believe that students would specifically underestimate the calories, a one-tailed test could be used. In this case:

The critical value for df = 9 at α = 0.05 would be −1.833 (for the lower tail).
Because t = −3.07 is more extreme than −1.833, the null hypothesis would still be rejected.

However, one-tailed tests have limitations. If the data showed an overestimation (a positive t value), the null hypothesis could not be rejected, as one-tailed tests only examine deviations in a pre-specified direction.

Dependent-Samples t-Test

The dependent-samples t-test, also called the paired-samples t-test, is used when comparing two means from the same group of participants. This test is typically applied in scenarios like pretest-posttest designs or within-subjects experiments, where each participant is tested under two conditions or at two different times.

Key Hypotheses

Null hypothesis (H₀): The population means for the two conditions or time points are the same (μ₀ = 0).
Alternative hypothesis (H_𝑎): The population means for the two conditions or time points are different (μ₀ ≠ 0).

If there is a clear expectation about the direction of the difference (e.g., an increase or decrease), a one-tailed test can be used instead of the more common two-tailed test.

How the Dependent-Samples t-Test Works

Think of the dependent-samples t-test as a variation of the one-sample t-test, with one additional step:

Calculate the difference scores:

For each participant, compute the difference between their two scores (e.g., posttest score minus pretest score). These difference scores summarise the change for each individual.

Apply the One-Sample t-test:

Once the difference scores are calculated, the test becomes a one-sample t-test conducted on the mean of the difference scores.
The hypothetical population mean (μ₀) for the difference scores is 0 because, under the null hypothesis, there is no average difference between the two conditions or time points.

Reframing the Hypotheses

Under the null hypothesis, the mean difference score in the population is 0 (μ₀ = 0).
The alternative hypothesis states that the mean difference score in the population is not 0 (μ₀ ≠ 0).

This approach allows the dependent-samples t-test to determine whether the observed differences between the two means are statistically significant, reflecting a real effect rather than random variation.

Example: Dependent-Samples t-Test

A health psychologist seeks to determine whether a specialised training program can enhance people’s ability to estimate the calorie content of junk food accurately. To test the program’s effectiveness, a pretest-posttest study is conducted with a sample of 10 participants. Each participant provides an estimate of the calories in a chocolate chip cookie before and after completing the training program. Since the psychologist expects the program to improve calorie estimates, a one-tailed test is used to analyse the data.

Pretest and Posttest Data Collection

The calorie estimates provided by the participants are recorded as follows:

Pretest estimates: 230, 250, 280, 175, 150, 200, 180, 210, 220, 190
Posttest estimates: 250, 260, 250, 200, 160, 200, 200, 180, 230, 240

The goal is to evaluate whether the posttest estimates, on average, are higher than the pretest estimates, indicating an improvement in accuracy due to the training.

To assess the change in estimates for each participant, the difference scores are calculated by subtracting the pretest estimates from the posttest estimates. A positive difference score indicates that a participant’s calorie estimate increased after the training, while a negative score indicates a decrease.

Difference Score = Posttest Estimate − Pretest Estimate

The resulting difference scores are:

Difference scores: 20, 10, −30, 25, 10, 0, 20, −30, 10, 50

The direction of subtraction (posttest minus pretest) does not matter as long as it is applied consistently for all participants.

From the difference scores, the following descriptive statistics are calculated:

Mean of the difference scores (M): 8.50
Standard deviation of the difference scores (SD): 27.27
Sample size (N): 10

The null hypothesis (H₀) states that the mean difference in calorie estimates after training is zero (μ₀ = 0), implying no effect of the training program. The alternative hypothesis (H_A) posits that the mean difference is greater than zero (μ > 0), consistent with an expected improvement.

The t-statistic is calculated using the following formula:

Where:

M = Mean of the difference scores = 8.50
μ₀ = Hypothetical population mean = 0
SD = Standard deviation of the difference scores = 27.27
N = Sample size = 10

Substituting the values into the formula:

The calculated t value is 1.11.

The data can be entered into software such as SPSS, Excel, or an online t-test calculator. For df = N − 1 = 10 − 1 = 9, the one-tailed p value is 0.148.

Decision: Since p = 0.148 is greater than the significance threshold (α = 0.05), the psychologist fails to reject the null hypothesis.
Conclusion: There is insufficient evidence to conclude that the training program significantly improves calorie estimation accuracy.

If calculating by hand, the psychologist would refer to a critical t value table. For a one-tailed test with df = 9 and α = 0.05, the critical t value is 1.833.

Comparison: The observed t value (t = 1.11) is less extreme than the critical value (t = 1.833).
Decision: The psychologist fails to reject the null hypothesis, consistent with the software output.

The results suggest that the training program does not lead to a statistically significant improvement in calorie estimation. While the mean difference score (8.50) indicates a slight increase in posttest estimates, this difference is not large enough to rule out the possibility that it occurred by chance.

The Independent-Samples t-Test

The independent-samples t-test is used to compare the means (M₁ and M₂) of two separate groups. These groups may have been exposed to different conditions in a between-subjects experiment or represent naturally occurring categories in a cross-sectional study (e.g., women vs. men, extraverts vs. introverts).

Hypotheses

Null hypothesis (H₀): The population means are equal (μ₁ = μ₂).
Alternative hypothesis (H_A): The population means are not equal (μ₁ ≠ μ₂).

If there is strong evidence to expect a difference in a specific direction, a one-tailed test may be used instead of a two-tailed test.

Formula for the t-Statistic

The formula for the independent-samples t-test accounts for two sample means, their variances, and sample sizes. The t-statistic is calculated as:

Where:

M₁ and M₂: Sample means for the two groups.
SD²₁and SD²₂: Variances (squared standard deviations) for the two groups.
n₁ and n₂: Sample sizes for the two groups.

Key Points

Variances: The formula uses the squared standard deviations (variances) to reflect variability within each group. These variances are added together inside the square root symbol.
Sample Sizes: The formula uses n₁ and n₂ to calculate the contribution of each group’s variance relative to its size.
Degrees of Freedom: For the independent-samples t-test, the degrees of freedom are calculated as N − 2, where N is the total number of participants across both groups.

By combining the differences between the two group means, their variances, and sample sizes, the independent-samples t-test determines whether the observed difference is statistically significant.

Example: Independent-Samples t-Test

A health psychologist aims to investigate whether individuals who regularly consume junk food differ in their calorie estimates compared to those who rarely eat junk food. Since the psychologist does not have a clear hypothesis about the direction of the difference, a two-tailed test is chosen for the analysis.

The psychologist collects calorie estimation data from two independent groups:

Junk food eaters (8 participants): 180, 220, 150, 85, 200, 170, 150, 190
- Mean (M₁) = 168.12
- Standard deviation (SD₁) = 42.66
Non-junk food eaters (7 participants): 200, 240, 190, 175, 200, 300, 240
- Mean (M₂) = 220.71
- Standard deviation (SD₂) = 41.23

The goal is to determine if there is a statistically significant difference between the two groups’ calorie estimates.

Hypotheses

Null Hypothesis (H₀): There is no difference between the population means of the two groups (μ₁ = μ₂).

Alternative Hypothesis (H_A): There is a difference between the population means of the two groups (μ₁ ≠ μ₂).

The t-statistic formula for independent samples is:

Where:

M₁ and M₂ are the sample means of the two groups.
SD²₁ and SD²₂ are the variances (squared standard deviations) of the two groups.
n₁ and n₂ are the sample sizes of the two groups.

Substituting the values into the formula:

The calculated t-statistic is t = −2.42.

The psychologist inputs the data into statistical software such as SPSS, Excel, or an online t-test calculator. The software calculates a two-tailed p-value of 0.015 for t = −2.42 with df = 13 (degrees of freedom: N − 2 = 15 − 2).

Decision: Since p = 0.015 is less than the significance threshold (α = 0.05), the psychologist rejects the null hypothesis.
Conclusion: The results suggest that people who eat junk food regularly estimate fewer calories compared to those who rarely eat junk food.

To confirm the results, the psychologist refers to a t-distribution table. For a two-tailed test with df = 13 and α = 0.05, the critical t value is ±2.160.

Comparison: The observed t value (−2.42) is more extreme than the critical value (−2.42 < −2.160).
Decision: The null hypothesis is rejected.

The results can be reported as follows:

t(13) = −2.42, p = .015

This format includes the degrees of freedom (df = 13), the t-statistic (t = −2.42), and the p-value (p = .015).

The analysis indicates a statistically significant difference between the calorie estimates of junk food eaters and non-junk food eaters. Specifically, junk food eaters provide lower calorie estimates compared to those who rarely eat junk food. This finding could suggest that regular junk food consumption might influence perceptions of calorie content, potentially contributing to dietary misjudgements.

Chapter Attribution

Content adapted, with editorial changes, from:

Research methods in psychology, (4th ed.), (2019) by R. S. Jhangiani et al., Kwantlen Polytechnic University, is used under a CC BY-NC-SA licence.

License

Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

11.2. The t-Test Copyright © 2025 by Marc Chao and Muhamad Alif Bin Ibrahim is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

By Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler and Dana C. Leighton, adapted by Marc Chao and Muhamad Alif Bin Ibrahim

One-Sample t-Test

The Role of the t Distribution

Decision Rule

One-Tailed vs. Two-Tailed Tests

Example: One-Sample t-Test

Dependent-Samples t-Test

Example: Dependent-Samples t-Test

The Independent-Samples t-Test

Example: Independent-Samples t-Test

Chapter Attribution

License

Share This Book