"

Sampling, Sample Size, and Particpant Selection

Learning Objectives

In this chapter you will:

  • understand the differences between non-probability and probability sampling
  • understand the reasons qualitative research has far fewer participants than quantitative research
  • discover and understand how to employ a formula to calculate sample size for quantitative research
  • discover the best way to obtain participants for your research.

10.1 Sampling

When undertaking research, it is not possible, unless you are conducting the census, to obtain data from the whole population under consideration. Therefore, you collect data from a representative sample of the population targeted in the study. The size of the sample depends on the type of research you are undertaking. Calculating the size of a sample is discussed later in this chapter.

Population Versus Sampling

A sample is a subset of the population we are interested in. Sampling is the process of selecting a subset of the population of interest. A target population is the group of people, objects, or events that you want to study. Remember the target population does not have to be just people.

You start with the population (everyone) which you are not going to be able to access unless you are the government conducting a census. Then we come to the target population, they are those (people, objects, events) the target of your study. Then we have the sample, the portion of the target population that can be accessed or is selectable (Figure 10.1).

Figure 10.1 Population, target population, sample

The sample size is determined by the population highlighted in the research question(s) and objectives. Qualitative sampling is purposive, while quantitative sampling needs to support inferential statistics.

Inclusion/Exclusion Criteria

Inclusion and exclusion criteria for participants in research depend largely on the research question(s) and the target population. Inclusion criteria may be on age, for example, the target population are those who fall in the category of Gen Z, therefore those who do not fall in that generational age group are excluded from the study. Your research may require participants to fit several categories to be included in your study, such as age, visitor to the area, length of stay, mode of transport, etc. Inclusion and exclusion criteria are determined once your research topic and question(s) are decided.

Researcher(s) must always be aware of sample selection bias. This can occur for several reasons, such as, one or more sections of the population being over- or under-represented in the sample; an incorrect sampling frame has been used, or is not appropriate, or has insufficient coverage; the data are old, or out-of-date, or the location of data collection is not right. Another bias that can occur is self-selection bias, this occurs when only people with certain characteristics provide information, and non-responsive bias restricts sampling criteria.

Sampling Methods

Figure showing the different types of sampling for probability sampling and non-probability sampling.
Figure 10.2. Sampling – probability and non-probability types.

Probability Sampling Versus Non-Probability Sampling

A probability sample is what all researchers aim for. To obtain this type of sample you need to know the exact size of the population, be able to identify every individual within that population, the target population must be completely accessible, and every element in the target population must have a known and equal chance of being selected. Non-probability sampling is more common, but less robust than probability sampling, hence the statistical data is more conservative. Specific probability and non-probability sampling techniques (Figure 10.2) are discussed in the following sections.

Probability Sampling

Simple Random Sample

When using a simple random sample, each unit of a population has the same probability of being selected. Where ‘n’ is the sample size, each combination of ‘n’ elements has the same probability of being selected. An example of this is Saturday Night Lotto (in Australia), there are 45 balls, a total of 8 balls are randomly selected (6 winning, and 2 supplementary) from the lottery machine. All the balls, and all combinations of the balls, have the same a priori probability of being selected. In this case, the lottery machine which contains the 45 balls functions as the sampling frame (Sallis et al., 2021).

In ‘real-life’ it is quite difficult to find a perfect sampling frame that lists or covers all units in a population. Meaning the classic process of drawing a simple random sample where the population units are numbered from 1 to ‘n’, and a random selection of units is sampled from the population is rare. An alternative is to try to select units that as closely as possible represent the population.

Systematic Sample

With systematic sampling, you select every nth unit in your population beginning with a randomly selected unit between 1 and ‘n’. For example:

  • You want a sample size of 55 households from a local suburb of 774.
  • You might sample every 9th house starting with randomly selected or generated number from 1 to 9.
  • The random number is 5, then houses numbered 5, 14, 23, 32, 41, 50… and so on until 55 households were sampled.
  • However, you must be aware that systematic bias may occur. This can occur, for example, when every 9th house is in the same position in a street.

If you were investigating traffic noise and every 9th house was a corner house they would be getting traffic noise from two sides. This doubling of the traffic noise may bias your results towards the noise from traffic being greater than those households facing only one road. Therefore, your conclusions on traffic noise in that suburb may not be correct (Sekaran & Bougie, 2013).

Stratified Sample

Stratified sampling is used when the researcher has a variable of interest, and it has been determined that there are subgroup units within the population that are expected to have different parameters on that variable. In this instance, subgroups are generally called ‘stratum’ (singular) or ‘strata’ (plural).

To proceed:

  • Divide the population into mutually exclusive, collectively exhaustive strata.
  • These strata must be relevant, appropriate, and meaningful to the study’s context.
  • Take a simple random sample from each stratum.
  • Sample size in each stratum may vary depending on the homogeneity of each stratum.
  • Higher levels of homogeneity within a population, the smaller the sample size needed.

Stratified sampling ensures specific subgroups within a population are represented, and certain variables are sufficiently measured. This is useful when heterogeneous subgroups are to be represented, and when each subgroup is homogenous for the variables to be measured. When there is homogeneity within a subgroup fewer units need to be selected. This means weighting can be applied to each subgroup to ensure specific variables are properly represented at the population level (Sallis et al., 2021; Sekaran & Bougie, 2013).

Example using an imaginary scenario:

  1. How many flat white coffees do students in the undergraduate statistics subject drink, on average in one week?

  2. Assume 45% of these students take the subject as an elective.
  3. The information already obtained says students who take statistics as an elective drink fewer flat whites, on average, in one week, than those who have statistics as a ‘core’ subject.
  4. Divide the population into 2 strata: ‘elective’ and ‘core’ students.
  5. Conduct a simple random sample within each stratum.
  6. This provides an estimate of average flat white consumption of elective, and core statistics subject takers respectively.
  7. Next, calculate the average consumption for the population by weighting the results of the two strata.
  8. If the estimate for students where statistics is a core subject is 8 flat whites per week, and the estimate for students where statistics is an elective is 3 flat whites per week, the calculation is as follows: (8 x 0.55) + (3 x 0.45) = (4.4) + (1.35) = 5.75
  9. From the weighted calculations the population average drinks 5.75 flat whites per week. However, you would generally round this number to six (6).

Note: Be aware that as stratification can take place over several stages, it can become quite complex, on occasion even requiring a ‘masterplan’ to keep track (Sallis et al., 2021).

Cluster Sample

Cluster random sampling:

  • Divide a larger population into smaller groups or clusters.
  • Then, randomly select clusters to form your sample.
  • Is generally used for quite large populations.
  • Sample size is also quite large.
  • When population, and therefore sample size is too large to study successfully, cluster sampling is used to reduce the total number of participants.
  • Occasionally pre-existing groups may be used as clusters, for example, schools, households, towns/cities etc (Simkus, 2023).
There are several cluster sampling techniques:
Area Cluster Sampling
  • Consists of geographic regions (areas), which can be council areas, suburbs, or specified areas of a state (e.g., Far North Queensland)
  • If you are surveying residents of a suburb, you would obtain a map of the area, take a sample of streets within the suburb, and select households within each street.
  • This can be relatively inexpensive and does not depend on a sampling frame as you already have the map.
Single-Stage Cluster Sampling
  • The population is divided into a pre-determined number of clusters.
  • The required number of clusters to be sampled are randomly chosen.
  • Each element/unit in each selected cluster is investigated.
Double-Stage Cluster Sampling
  • Clusters are selected, then data are only obtained from a random sub-sample of individual elements/units within each of the selected clusters.
  • Not as accurate as single-stage cluster sampling.
  • Generally used only when the cost of testing the entire cluster is prohibitive, or testing the entire cluster is too challenging.
Multi-Stage Cluster Sampling
  • It is undertaken in several stages.
  • For example, you may have selected urban, regional and rural geographical locations for your study.
  • Next, you select specific areas within each location.
  • Then you might select primary schools within each selected area.
  • You keep going until you have the final clusters of sample elements/units.
  • You then sample every element/unit of the final selected clusters (Sekaran & Bougie, 2013; Simkus, 2023).

Non-Probability Sampling

Quota Sampling

Quota sampling is a non-random, convenience method and often used as an alternative to probability sampling as part of a strategy for internet and/or interviewer completed questionnaires. Results from research using quota sampling cannot be generalised to the wider population.

Quota sampling is achieved by:

  • Dividing the population into sub-groups.
  • All sub-groups must be mutually exclusive.
  • Sub-groups are in the same proportion as the population.
  • Convenience sample taken from each sub-group.
  • Relationship comparisons between selected sub-groups can be tested (Futri et al., 2022; Stratton, 2019).
Purposive Sampling

With purposive sampling researchers select participants that are knowledgeable and/or experienced in relation to the research question/phenomena. These participants must be available and agreeable to participating in the study (Stratton, 2019).

Some of the most common purposive sampling designs are:

Deviant Sampling
  • often used in programs to improve processes
  • subjects/cases chosen in anticipation of discovering information not commonly available, and that may demonstrate good/problematic findings.
Homogeneous (AKA Dominant) Sampling
  • can be used to form focus groups
  • participants are chosen to form a sample group where there are similar, dominant characteristics present in relation to the phenomena of interest.
Case Sampling
  • The researcher(s) choose cases from a group that have similar characteristics.
  • There is no randomisation involved.
  • Does not involve all available cases.
  • Commonly used with medical cases, for example, medical cases are selected by the researcher for data extraction when they have one or more diagnostic codes that are the same.
Sequential (AKA Consecutive) Sampling
  • often used in qualitative studies in developing themes
  • sequential subjects/cases are included in the study until no new themes/information emerge
  • once no new themes/information emerge, it is said the study has reached ‘saturation point’
  • randomisation is not used in selection of participants; therefore, any sampling error is impossible to determine.
Theoretical Sampling
  • Research objectives are developed.
  • A group is identified to be interviewed in relation to the research question(s).
  • Interview criteria are pre-established.
  • Researcher(s) analyse the information obtained.
  • A second group is selected.
  • This second group is interviewed about the findings from the first group.
  • The second group may or may not confirm findings of the first group.
  • To refine the study, the findings from the first two groups are combined, and a third group is selected.
  • This process continues until saturation point is reached.
  • This may lead to sample error that cannot be measured. (Stratton, 2019)
Convenience Sample

Convenience sampling is just that, it is the most convenient and easy way to reach potential participants for a study.

For example:

  • The Researcher(s) are employed at a university.
  • The research population are those aged 18 to 30 who have a smartphone.
  • The researcher(s) send emails to all university students aged 18 to 30 asking them if they would participate in a study and providing a link to the survey.
  • Even when potential participants do complete the survey, they cannot be said to be a statistically representative sample of the population.
  • In this example, all participants are aged 18 to 30 and own smartphones; however, they are not representative of the whole population of those aged 18 to 30 who own smartphones.

Convenience sampling is often used when time constraints are an issue. Results from studies using convenience sampling cannot be generalised to the wider population.

Volunteer Sample

There are 2 techniques for volunteer sampling: Snowball and self-selection:

Snowball Sampling
  • It is a continuous referral method.
  • Requires research participants to have the same characteristics.
  • Researcher(s) recruit an initially limited number of participants.
  • In some instances, these first recruits receive a small incentive to recruit other participants for the study.
  • These initial participants recruit other participants from family, friends, members of their social groups, members of their sporting clubs etc., these participants then go on to recruit other participants, who recruit other participants… and so on.
  • The final participants recruited may have no connection to the initial participants other than the same characteristics under investigation.
  • This may be used when the research is focused on hard-to-reach or vulnerable communities. (Valerio et al., 2016)
Self-Selection
  • Participants nominate themselves to participate in a survey or similar research.
  • Participants volunteer as they have an interest in what is being studied.
  • Their interest in the topic may be at the extremes of positive and negative opinions, therefore any average view on the topic is hidden.
  • This means there is a high possibility that the results from the sample are biased.
  • If using a questionnaire, self-selection sampling can be achieved by leaving the questionnaires in a range of locations appropriate to the topic under study.
  • Other means of recruitment may be achieved by using posters or flyers which provide researcher contact information, a QR code can make this easier for potential participants.
  • Web pages or posts (where platforms permit) can encourage people to self-select and complete online surveys (Galloway, 2005).

10.2 Sample Size and Participant Selection

Qualitative Research

For qualitative research using interviews or observations, a researcher generally targets a sample size between 10 and 40. However, this all depends on the process you have selected and the target population. Then you keep sampling until theoretical saturation, that is, continue sampling and data collection, and analysis, until no new conceptual insights are generated.

When focus groups are used, the target number for each group is generally 8 to 12. This provides the best balance of productive interaction against managing the interaction effectively. The number of focus groups required depends on the research question(s) and objectives. But you do need the same number of members in each focus group for that specific study.

Quantitative Research

Remember, in quantitative research, a sample is used as a substitute for the population. The sample should be free of sample selection bias and be large enough for the researcher(s) to be confident any number that describes the sample (sample statistics) is precise enough to be useful to the actual population number (parameter). This is based on probability laws in mathematics where the odds are calculated that any given sample mean is likely to be the actual population mean (confidence). The idea is that if the exact same research was done repeatedly, with different samples from the same population, the mean of the sample means should equal the population mean.

When deciding on the sample size a researcher must consider costs, if the sample is too big, or too small the data collection is just a waste of money and time. A researcher when considering sample size must consider, and allow, for non-responses and incomplete responses; how many subgroups have to be accurately described? Generally, it is a balancing act between precision and accuracy.

With a survey, to increase representativeness, precision, and confidence, a larger sample size is required. If the researcher does not need to describe subgroups or test for any differences, in Table 10.1, the confidence level is 95%. Look down the ‘N’ column for your population size (nearest to it), then look across the row to your margin of error (5%, 3%, 2%, 1%) to find the required sample size. If you Google “sample size with margin of error table”, you get multiple examples under ‘images’.

Table 10.1. Population with sample size for margins of error at 95% confidence level (N=population) (sample sizes were calculated with Qualtrics’ sample size calculator)

Sample Size with Margin of Error   Sample Size with Margin of Error
N 5% 3% 2% 1%   N 5% 3% 2% 1%
10 10 10 10 10 440 206 312 372 421
15 15 15 15 15 460 210 322 387 439
20 20 20 20 20 480 214 332 401 458
25 24 25 25 25 500 218 341 414 476
30 28 30 30 30 550 227 363 448 521
35 33 34 25 35 600 235 385 481 565
40 37 39 40 40 650 242 404 512 609
45 41 44 45 45 700 249 423 542 653
50 45 48 49 50 750 255 441 572 696
55 49 53 54 55 800 260 458 601 739
60 52 57 59 60 850 265 474 628 781
65 56 62 64 65 900 270 489 655 823
70 60 66 69 70 950 274 503 681 865
75 63 71 73 75 1000 278 517 706 906
80 67 75 78 80 1100 285 542 755 987
85 70 79 83 85 1200 291 565 801 1067
90 73 83 87 90 1300 297 587 844 1145
95 77 88 92 95 1400 302 606 885 1222
100 80 92 97 99 1500 306 624 924 1298
110 86 100 106 109 1600 310 641 961 1372
120 92 108 115 119 1700 314 656 996 1445
130 98 116 124 129 1800 317 670 1029 1516
140 103 124 133 138 1900 320 684 1061 1587
150 108 132 142 148 2000 323 696 1092 1656
160 113 140 151 158 2200 328 719 1148 1790
170 118 147 159 168 2400 332 739 1201 1921
180 123 155 168 177 2600 335 757 1249 2047
190 128 162 177 187 2800 338 773 1293 2168
200 132 169 185 196 3000 341 788 1334 2286
210 136 176 194 206 3500 347 818 1424 2566
220 140 183 202 216 4000 351 843 1501 2824
230 144 190 210 225 4500 354 863 1566 3065

Quantitative Sample Size Formula

The manual formula for calculating a quantitative sample size is as follows:

  • your confidence level is 95%
  • your margin of error is 5% = 0.05
  • when the confidence level is 95% the constant used in the formula is 1.96
  • when the confidence level is 90% the constant used is 1.64; for 99% it is 2.58
  • 0.5 is a conservative estimate of how many subjects have the characteristics being measured; this can be viewed as a constant.

Example: margin of error = 5% = 0.05, confidence level = 95%

Calculation of survey sample size with a margin of error of 5% and confidence level of 95%. n equals one point nine six squared divided by zero point zero five squared. This equals zero point nine six zero four divided by zero point zero zero two five which equals three hundred and eighty five.
Figure 10.3. Survey sample size with 5% margin of error.

In Figure 10.3, this is a minimum sample size where the margin of error is 5%, the total is rounded to 385. This example is a general survey where the researcher(s) want to be 95% confident with their results and are prepared to live with 5% error; therefore this formula can be used to determine at least 385 subjects are required in the sample. The final formula is rounding 0.9604 to a whole number, one (1), which then provides an approximate sample size of 400. Research suggests that even if there is allowance for non-response error there should still be at least a 60% response rate.

When the margin of error is changed to 3% (0.03) (Figure 10.4) and the confidence level is 95%, a larger sample size is required. The finer the margin of error, the larger the sample size required.

Calculation of survey sample size with a margin of error at three percent and confidence level at ninety five percent. n equals one thousand and sixty eight.
Figure 10.4. Survey sample size with 3% margin of error.

Response Rate

The response rate for surveys can vary greatly, but the researcher should have a response rate exceeding 50%. A quick, easy-to-understand example of how to work out a response rate of 80% is for example, a researcher asks 100 people to undertake their survey, and 80 people actually complete the survey. The response rate is therefore 80%.

Figure 10.5a calculates a survey sample size when the base sample size has been calculated at 385 (Figure 21.3), and the response rate required is 80%. The new sample size required is 482.

When the response rate expected is eighty percent the calculation is n equals n divided by response rate which equals three hundred and eighty five from early calculations divided by zero point eight zero which equals four hundred and eighty two as the sample size to obtain the required eighty percent response rate.
Figure 10.5a. Survey sample size for response rate of 80% and original sample size of 385.

Using the base sample size from Figure 10.4 of 1,068, and a response rate of 80% the new sample size required is 1,335 (Figure 10.5b).

Sample size calculation when response rate is eighty percent and the base sample size is one thousand and sixty eight. New sample size is one thousand three hundred and thirty five.
Figure 10.5b. Survey sample size for response rate of 80% and base sample size of 1,068.

Now, what if you expect a lower response rate, say 65%. We plug this into our formula along with our base sample size (Figure 10.3) of 385 for an updated sample size requirement of 593 (Figure 10.6).

Survey sample size for a response rate of sixty five percent and base saple size of three hundred and eighty five. New sample size is five hundred and ninety three
Figure 10.6. Survey sample size for response rate of 65% and base sample size of 385.

We go back to Figure 10.4, the margin of error is 3% and the confidence level is 95%, the formula provided a base sample size of 1,068. We plug this into the formula and calculate the new required sample size at a response rate of 65% is 1,644 (Figure 10.7).

Survey sample size for response rate of sixty five percent and base sample size of one thousand and sixty eight, new sample size one thousand six hundred and forty four.
Figure 10.7. Survey sample size for response rate of 65% and base sample size of 1,068.

Key Takeaways

  • Non-probability sampling is more common, but less robust than probability sampling; hence, the statistical data is more conservative.

  • Probability sampling and non-probability sampling are generally used separately, but can be used together.
  • Qualitative research has fewer participants than quantitative research.
  • Qualitative research generally uses small groups of participants or multiple small groups of participants.
  • To determine survey sample size for quantitative research, there are specific formulas to use.
  • The formulas require a margin of error and confidence level for the initial calculations.
  • Once you have a base sample size, you decide on a response rate, for example, 80% or 65%. The response rate, along with the base sample size figure, are plugged into the response rate formula to provide an updated sample size figure.
  • All response rates should exceed 50%.

References

Futri, I. N., Risfandy, T., & Ibrahim, M. H. (2022). Quota sampling method in online household surveys. MethodsX, 9, Article 101877. https://doi.org/10.1016/j.mex.2022.101877 

Galloway, A. (2005). Non-probability sampling. In K. Kempf-Leonard (Ed.). Encyclopedia of social measurement (Vol. 2, pp. 859-864). Elsevier. https://doi.org/10.1016/BO-12-369398-5/00382-0 

Martelli, J., & Greener, S. (2018). An Introduction to business research methods (3rd ed.). Bookboon. https://bookboon.com/en/an-introduction-to-business-research-methods-ebook

Sallis, J. E., Gripsrud, G., Olsson, U. H., & Silkoset, R. (2021). Research methods and data analysis for business decisions: A primer using SPSS. Springer.

Sekaran, U. & Bougie, R. (2013). Research methods for business: A skill building approach (6th ed.). Wiley.

Simkus, J. (2023, July 31). Cluster sampling: Definition, method, and examples. Simply Psychology. https://www.simplypsychology.org/cluster-sampling.html 

Stratton, S. J. (2019). Data sampling strategies for disaster and emergency health research. Prehospital and Disaster Medicine, 34(3), 227-230.  https://doi.org/10.1017/S1049023X19004412 

Valerio, M. A., Rodriguez, N., Winkler, P., Lopez, J., Dennison, M., Liang, Y., & Turner, B. J. (2016). Comparing two sampling methods to engage hard-to-reach communities in research priority setting. BMC Medical Research Methodology, 16, 2-11. https://doi.org/10.1186/s12874-016-0242-z 

License

Icon for the Creative Commons Attribution-NonCommercial 4.0 International License

Business Research Approaches Copyright © 2025 by Leonie Cassidy and Josephine Pryce is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.

Share This Book