11.6. Statistical Power
By Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler and Dana C. Leighton, adapted by Marc Chao and Muhamad Alif Bin Ibrahim
Statistical power is a critical concept in research methodology, reflecting the likelihood of correctly rejecting the null hypothesis when it is false. In other words, it measures a study’s ability to detect a true effect or relationship in the population based on the sample data. Several factors influence statistical power, including the sample size, the expected strength of the relationship, and the chosen significance level (α).
Consider a scenario where a researcher conducts a study with 50 participants to investigate a population correlation, expecting Pearson’s r to be +0.30. The statistical power for this study is 0.59. This indicates a 59% probability of correctly rejecting the null hypothesis if the population correlation is indeed +0.30. The probability of failing to reject a false null hypothesis, a Type II error, is the complement of power. In this example, the likelihood of making a Type II error is calculated as 1 − 0.59 = 0.41, or 41%.
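To see roughly where a figure like 0.59 comes from, the sketch below approximates the power of a two-tailed test of Pearson's r using the Fisher z transformation. This is a rough illustration in Python (assuming NumPy and SciPy are available; the function name correlation_power is purely illustrative), not necessarily the exact method behind the value quoted above.

```python
import numpy as np
from scipy.stats import norm

def correlation_power(r, n, alpha=0.05):
    """Approximate power of a two-tailed test of Pearson's r
    via the Fisher z transformation."""
    z_r = np.arctanh(r)               # Fisher z of the expected correlation
    se = 1 / np.sqrt(n - 3)           # standard error of z
    z_crit = norm.ppf(1 - alpha / 2)  # two-tailed critical value (about 1.96)
    # Probability that the observed z statistic lands beyond either cut-off
    return norm.sf(z_crit - z_r / se) + norm.cdf(-z_crit - z_r / se)

print(round(correlation_power(0.30, 50), 2))
```

For r = +.30 and N = 50 this approximation yields a power of roughly 0.56, close to the 0.59 reported above; the small gap reflects the approximation rather than a different logic.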
Statistical power serves as an essential safeguard against Type II errors, where true effects in the population go undetected. To minimise such errors, researchers aim for an adequate power level, typically set at 0.80. This standard implies an 80% chance of detecting a true effect, assuming it exists, which balances reliability and resource efficiency.
How to Calculate Statistical Power
While calculating statistical power involves formulas tailored to specific research designs, modern tools simplify the process. Researchers can use statistical software or an online calculator by supplying inputs such as the following (a brief code sketch follows the list):
- Sample size (N): The number of participants or observations.
- Effect size: The expected strength of the relationship (e.g., Cohen’s d or Pearson’s r).
- Significance level (α): Typically set at 0.05, reflecting a 5% risk of a Type I error.
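As one concrete illustration, the free Python library statsmodels will compute power from exactly these three inputs. The sketch below does so for a two-tailed independent-samples t-test; the particular numbers (50 participants per group, d = 0.50) are only an example.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power of a two-tailed independent-samples t-test with 50 participants
# per group, an expected medium effect size (d = 0.50), and alpha = .05
power = analysis.power(effect_size=0.50, nobs1=50, alpha=0.05,
                       ratio=1.0, alternative='two-sided')
print(f"Power: {power:.2f}")  # roughly 0.70
```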
Table 11.6.1 illustrates the sample sizes required to achieve a power of 0.80 for various effect sizes in two-tailed independent-samples t-tests and tests of Pearson’s r:
| Relationship Strength | Independent-Samples t-Test | Test of Pearson's r |
|---|---|---|
| Strong (d = .80, r = .50) | 52 | 28 |
| Medium (d = .50, r = .30) | 128 | 84 |
| Weak (d = .20, r = .10) | 788 | 782 |
These numbers highlight a critical point: detecting weak relationships requires much larger sample sizes compared to strong or medium relationships.
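Sample-size figures like those in Table 11.6.1 can be recovered with the same tools. The sketch below, again assuming statsmodels, solves for the per-group sample size that gives 80% power in a two-tailed independent-samples t-test and then doubles it; the Pearson's r column would require a separate calculation, such as the Fisher z approximation shown earlier.

```python
import math
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

for label, d in [("Strong", 0.80), ("Medium", 0.50), ("Weak", 0.20)]:
    # Per-group n needed for 80% power, rounded up to whole participants
    n_per_group = math.ceil(analysis.solve_power(effect_size=d, alpha=0.05,
                                                 power=0.80, ratio=1.0,
                                                 alternative='two-sided'))
    print(f"{label} (d = {d}): {2 * n_per_group} participants in total")
```

Rounding up to whole participants gives totals of 52, 128, and 788, matching the t-test column of the table.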
What If Power Is Inadequate?
Insufficient statistical power undermines the reliability of research findings, increasing the risk of false negatives (Type II errors). For instance, imagine a researcher conducting a between-subjects experiment with 20 participants in each group and expecting a medium effect size (d = 0.50). The statistical power of this study is only 0.34. This means there is just a 34% chance of detecting a true effect, leaving a 66% chance of missing it. Such low power renders the study unreliable, risking wasted time, resources, and misleading conclusions.
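The 0.34 figure can be checked with the same kind of calculation; the sketch below assumes statsmodels, as in the earlier examples.

```python
from statsmodels.stats.power import TTestIndPower

# Two-tailed independent-samples t-test: 20 participants per group,
# expected medium effect size (d = 0.50), alpha = .05
power = TTestIndPower().power(effect_size=0.50, nobs1=20, alpha=0.05,
                              ratio=1.0, alternative='two-sided')
print(f"Power: {power:.2f}")  # roughly 0.34
```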
Strategies for Increasing Statistical Power
Improving statistical power is crucial for ensuring robust and reliable findings. Two primary strategies can help:
Increase the Strength of the Relationship
Researchers can enhance the effect size by:
- Strengthening manipulations: For example, using more intense experimental conditions.
- Controlling extraneous variables: Reducing variability caused by noise or irrelevant factors improves the clarity of the observed effect.
- Switching designs: Using a within-subjects design instead of a between-subjects design can reduce error variance, increasing power (see the sketch after this list).
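To give a rough sense of the design point in the last bullet, the sketch below compares the sample sizes needed for 80% power in a between-subjects design and a within-subjects (paired) design, assuming purely for illustration that the standardised effect size is d = 0.50 in both cases; in practice the within-subjects effect size depends on how strongly scores in the two conditions correlate.

```python
import math
from statsmodels.stats.power import TTestIndPower, TTestPower

d, alpha, target = 0.50, 0.05, 0.80

# Between-subjects: participants needed per group, then doubled
n_between = TTestIndPower().solve_power(effect_size=d, alpha=alpha, power=target)
# Within-subjects: each participant provides data in both conditions
n_within = TTestPower().solve_power(effect_size=d, alpha=alpha, power=target)

print(f"Between-subjects total: {2 * math.ceil(n_between)}")  # about 128
print(f"Within-subjects total:  {math.ceil(n_within)}")       # about 34
```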
Increase the Sample Size
The most common and straightforward approach to boosting power is collecting more data. A larger sample reduces the influence of random error, making it easier to detect true relationships. Importantly, for any given effect size, there is always a sample size large enough to achieve adequate power.
Tools for Computing Statistical Power
To aid researchers in planning studies with sufficient power, various tools are available for free or online:
- Russ Lenth’s Power and Sample Size Page. A user-friendly online tool where researchers can calculate power or determine the required sample size based on their study parameters.
- G*Power. A comprehensive, free software program offering advanced capabilities for computing power, effect sizes, and sample size requirements for a wide range of statistical tests.
Chapter Attribution
Content adapted, with editorial changes, from:
Research Methods in Psychology (4th ed., 2019) by R. S. Jhangiani et al., Kwantlen Polytechnic University, is used under a CC BY-NC-SA licence.