Sampling Methods

Last updated: March 2026 · Beginner

The way you select your sample determines how trustworthy your conclusions are. A perfectly executed analysis on a poorly chosen sample still produces misleading results. Before collecting any data, you need to decide how you will choose the individuals, items, or observations that make up your sample. That decision — your sampling method — is one of the most important choices in any study.

This page covers the five major sampling methods: simple random, stratified, cluster, systematic, and convenience. Each has trade-offs between cost, practicality, and the quality of the conclusions you can draw.

Why Sampling Method Matters

Recall from What Is Statistics? that a sample is a subset of a population used to draw conclusions about the whole group. The goal is a representative sample — one that accurately reflects the population’s characteristics.

A biased sample leads to biased results, no matter how large it is. If you survey 10,000 people but they all share the same background, your results tell you about that background, not about the population as a whole. The method you use to select your sample is your primary defense against this kind of systematic error.

The key question is always: Does every member of the population have a known, fair chance of being selected?

Simple Random Sampling (SRS)

In a simple random sample, every possible group of the desired sample size has an equal chance of being chosen. This means every individual also has an equal probability of being selected, but the key distinction is that the sample as a whole is random — not just each individual pick. This is the gold standard of sampling because it eliminates selection bias at the design stage.

How to do it:

  1. Obtain or create a complete list (a sampling frame) of every member of the population.
  2. Assign a unique number to each member.
  3. Use a random number generator (or a table of random numbers) to select the desired number of members.

Example 1: Selecting Patients for a Study

A hospital database contains records for 5,000 patients treated in the past year. A researcher wants to study 50 of them.

  • Assign each patient a number from 1 to 5,000.
  • Use a random number generator to produce 50 distinct numbers in that range.
  • The patients corresponding to those 50 numbers form the sample.

Every patient has the same probability of being selected:

$$P(\text{selected}) = \frac{50}{5{,}000} = \frac{1}{100} = 1\%$$
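The selection steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `simple_random_sample` and the fixed seed are assumptions made here so the draw is reproducible.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw distinct IDs from 1..population_size without replacement."""
    rng = random.Random(seed)
    # random.sample never repeats an ID, and every subset of the
    # requested size is equally likely to be returned
    return rng.sample(range(1, population_size + 1), sample_size)

# Patients are numbered 1 to 5,000; the seed is only for reproducibility
patient_ids = simple_random_sample(5000, 50, seed=42)
selection_probability = 50 / 5000

print(sorted(patient_ids)[:5])
print(selection_probability)
```

Drawing without replacement matters here: it is what guarantees 50 distinct patients rather than 50 draws that might repeat.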

Pros: Eliminates selection bias. Results are straightforward to analyze because standard statistical formulas assume random sampling.

Cons: You need a complete list of the population, which is not always available. Small subgroups may be underrepresented by chance — if 3% of the 5,000 patients have a rare condition, a random sample of 50 might include zero of them.
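To see how real the underrepresentation risk is, the chance of drawing no patients with the rare condition can be computed directly. The sketch below assumes exactly 150 of the 5,000 patients (3%) have the condition and that the 50 patients are drawn without replacement, which makes this a hypergeometric probability.

```python
from math import comb

population = 5000
with_condition = 150      # 3% of 5,000, assumed exact for this illustration
sample_size = 50

# All 50 sampled patients must come from the 4,850 without the condition
p_zero = comb(population - with_condition, sample_size) / comb(population, sample_size)

print(f"P(no patients with the condition) = {p_zero:.3f}")
```

Under these assumptions the probability comes out at a little over one in five, so missing the subgroup entirely is far from a remote possibility.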

Stratified Sampling

In stratified sampling, you divide the population into non-overlapping subgroups called strata based on a characteristic that matters to your study. Then you take a simple random sample within each stratum.

This guarantees that each subgroup is represented in proportion to its share of the population (or in whatever proportion you choose).

Example 2: Employee Satisfaction Survey

A company has 500 employees: 300 full-time (60%) and 200 part-time (40%). Management wants to survey 200 employees and ensure both groups are represented proportionally.

  • Full-time stratum: randomly select $200 \times 0.60 = 120$ full-time employees from the 300.
  • Part-time stratum: randomly select $200 \times 0.40 = 80$ part-time employees from the 200.

The combined sample of 200 reflects the company’s actual composition. If you had used simple random sampling, it would be possible (though unlikely) to draw a sample that was 90% full-time — stratified sampling prevents that.
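A proportional stratified draw like the one in Example 2 might look as follows. The employee IDs and the function name `proportional_stratified_sample` are hypothetical, used only to make the sketch runnable.

```python
import random

def proportional_stratified_sample(strata, total_sample_size, seed=None):
    """strata maps each stratum name to the list of its members."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation: stratum share of the population
        # multiplied by the total sample size
        n_stratum = round(total_sample_size * len(members) / population_size)
        # Simple random sample within the stratum
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical employee IDs: 300 full-time, 200 part-time
strata = {
    "full_time": [f"FT{i}" for i in range(1, 301)],
    "part_time": [f"PT{i}" for i in range(1, 201)],
}
sample = proportional_stratified_sample(strata, 200, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
# {'full_time': 120, 'part_time': 80}
```

The allocation divides evenly in this example. When it does not, simple rounding can leave the stratum counts one or two off the intended total, so a real study would need an explicit rule for the leftovers.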

Pros: Guarantees representation of key subgroups. Produces more precise estimates when the strata differ from each other.

Cons: You must know the strata in advance, which requires information about the population before you begin. It also requires a separate sampling frame for each stratum.

Cluster Sampling

In cluster sampling, you divide the population into groups called clusters — often based on geography or some natural grouping — then randomly select entire clusters and study every member within the selected clusters.

This is fundamentally different from stratified sampling. In stratified sampling, you sample within every group. In cluster sampling, you randomly pick which groups to include, then take everyone in those groups.

Example 3: Surveying Nurses Across a Hospital System

A hospital system operates 20 locations across a state. To survey nursing staff, it would be expensive and logistically difficult to visit all 20 sites.

  • Treat each hospital location as a cluster.
  • Randomly select 5 of the 20 locations.
  • Survey every nurse at those 5 locations.

If each location has approximately 60 nurses, the sample includes about $5 \times 60 = 300$ nurses without needing to coordinate across all 20 sites.
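The cluster design in Example 3 can be sketched like this. The roster is invented for illustration (20 locations with exactly 60 nurses each), and `cluster_sample` is a hypothetical helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # No sampling happens inside a chosen cluster: everyone is included
    return {name: clusters[name] for name in chosen}

# Hypothetical roster: 20 locations, 60 nurses at each
clusters = {
    f"location_{i:02d}": [f"nurse_{i:02d}_{j:02d}" for j in range(1, 61)]
    for i in range(1, 21)
}
selected = cluster_sample(clusters, 5, seed=7)
total_nurses = sum(len(staff) for staff in selected.values())

print(sorted(selected))
print(total_nurses)   # 300
```

Note that the randomness sits entirely at the cluster level, which is the contrast with stratified sampling: there, every group is used and the random draw happens inside each one.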

Pros: Practical and cost-effective for geographically spread-out populations. You only need a complete list of members within the selected clusters, not the entire population.

Cons: Clusters may not be internally diverse. If one hospital location serves a primarily elderly population while another serves mostly young families, your results depend heavily on which clusters happen to be selected. This increases sampling variability compared to SRS.

Systematic Sampling

In systematic sampling, you select every $k$th member from an ordered list, starting from a randomly chosen position.

How to do it:

  1. Determine the sampling interval: $k = \frac{N}{n}$, where $N$ is the population size and $n$ is the desired sample size.
  2. Choose a random starting point between 1 and $k$.
  3. Select every $k$th member from that starting point.

Example 4: Quality Control on a Production Line

A factory produces 1,000 items per shift. The quality team wants to inspect 50 items.

$$k = \frac{1{,}000}{50} = 20$$

They randomly pick a starting number between 1 and 20 — say, 7. Then they inspect items 7, 27, 47, 67, 87, and so on, every 20th item until they reach 50 inspections.
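The procedure can be written as a short function. This sketch assumes the population size divides evenly by the sample size, as it does in this example. The name `systematic_sample` is an illustrative choice, and the start is passed in explicitly to reproduce the example.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the item numbers picked by a systematic sample."""
    k = population_size // sample_size             # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start in 1..k inclusive
    return [start + i * k for i in range(sample_size)]

items = systematic_sample(1000, 50, start=7)
print(items[:5], items[-1])   # [7, 27, 47, 67, 87] 987
```

Leaving `start` unset lets the function choose the starting point at random, which is what an actual inspection plan would do.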

Pros: Simple to implement, especially in production or sequential settings. Does not require a complete numbered list — you just count off.

Cons: If the list or process has a repeating pattern that aligns with the interval $k$, the sample will be biased. For instance, if every 20th item comes off a particular machine that has a defect and the random start happens to land on that machine's position, every item in the sample would be defective, dramatically overstating the defect rate. With any other start, the sample would miss those defects entirely.
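A small calculation makes the periodicity problem concrete. It assumes a purely hypothetical pattern in which every item whose number is a multiple of 20 is defective, and it checks what a systematic sample with interval 20 would report for each possible starting point.

```python
def observed_defect_rate(start, k=20, population_size=1000):
    """Share of defective items in a systematic sample with the given start."""
    sampled = range(start, population_size + 1, k)
    # Assumed pattern: item numbers that are multiples of 20 are defective
    defective = sum(1 for item in sampled if item % 20 == 0)
    return defective / len(sampled)

true_rate = 50 / 1000   # 50 defective items among 1,000
rates = {start: observed_defect_rate(start) for start in range(1, 21)}

print(true_rate, rates[20], rates[7])   # 0.05 1.0 0.0
```

Only one of the 20 possible starts sees the defects at all, and it sees nothing else. No single sample comes close to the true 5% rate under this pattern.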

Convenience Sampling

A convenience sample consists of whoever or whatever is easiest to reach. There is no randomization and no systematic selection.

Example 5: Common Convenience Samples

  • Surveying shoppers at one mall on a Saturday afternoon
  • Asking your coworkers to fill out a questionnaire
  • Collecting responses from an online poll posted on social media
  • A teacher using their own class as research subjects

This is the most common sampling method in everyday life — and the least reliable. The people who happen to be available are rarely representative of the broader population.

Pros: Fast, cheap, and easy. Useful for pilot studies or exploratory research where generalizability is not the goal.

Cons: High risk of bias. Results may not generalize to any population beyond the specific group surveyed. A mall survey on Saturday misses people who work weekends, shop online, or avoid malls. An online poll attracts people who use that platform and feel strongly enough to click.

Comparison Table

| Method | Random? | Representation | Practical Cost | Best For |
| --- | --- | --- | --- | --- |
| Simple Random (SRS) | Yes | Excellent if sample is large enough | Moderate; requires full list | General-purpose research |
| Stratified | Yes (within strata) | Excellent for subgroups | Higher; requires stratum info | Studies comparing subgroups |
| Cluster | Yes (at cluster level) | Good if clusters are diverse | Lower; fewer sites to visit | Geographically spread populations |
| Systematic | Mostly (random start) | Good if no list patterns | Low; easy to execute | Production lines, ordered lists |
| Convenience | No | Poor; high bias risk | Very low | Pilot studies, informal feedback |

Voluntary Response Sampling

A special case worth highlighting is voluntary response sampling, where participants choose to respond on their own rather than being selected. Online product reviews, call-in radio polls, and comment sections all rely on voluntary responses.

The problem is self-selection: people who feel strongly — either very satisfied or very dissatisfied — are far more likely to participate. Moderate opinions are underrepresented.

This is why online reviews tend to cluster at 5 stars and 1 star, with fewer ratings in the middle. It is not that most customers have extreme opinions; it is that extreme opinions drive people to write reviews.
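The mechanism can be illustrated with a quick calculation. Every number below is made up for the illustration: `customer_share` is an assumed distribution of true opinions, and `response_rate` is an assumed chance that a customer with that opinion writes a review.

```python
# Hypothetical numbers, chosen only to illustrate self-selection
customer_share = {1: 0.05, 2: 0.10, 3: 0.30, 4: 0.35, 5: 0.20}
response_rate = {1: 0.30, 2: 0.05, 3: 0.02, 4: 0.03, 5: 0.25}

# Weight each opinion by how likely its holders are to respond
responders = {stars: customer_share[stars] * response_rate[stars]
              for stars in customer_share}
total = sum(responders.values())
review_share = {stars: weight / total for stars, weight in responders.items()}

for stars in sorted(review_share):
    print(f"{stars} stars: customers {customer_share[stars]:.0%}, "
          f"reviews {review_share[stars]:.0%}")
```

With these assumed rates, 1-star and 5-star opinions are held by a quarter of customers but account for roughly three quarters of the reviews.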

Rule of thumb: If participants chose to be in the study rather than being selected, the results are suspect regardless of sample size.

Real-World Application: Nursing — Choosing Patients for a Drug Study

A pharmaceutical company wants to test a new blood pressure medication. The study requires 400 patients from hospitals across the country.

Bad approach (convenience): Enroll the first 400 patients who walk into one hospital and agree to participate. This sample is biased by geography, the hospital’s patient demographics, and self-selection (patients who agree may differ from those who refuse).

Better approach (stratified + cluster):

  1. Cluster stage: Randomly select 10 hospitals from a list of 200 eligible facilities across the country.
  2. Stratification: Within each hospital, stratify eligible patients by age group (18-40, 41-60, 61+) to ensure balanced representation.
  3. Random selection: Randomly select patients within each age stratum, 40 per hospital in total, yielding $10 \times 40 = 400$ patients.

This design ensures geographic diversity (multiple hospitals), demographic balance (age strata), and randomization (within each stratum). The results are far more likely to generalize to the broader population of patients with high blood pressure.
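One way to sketch this multistage design in code is shown below. The text does not say how the 40 patients per hospital are split across age groups, so the sketch assumes proportional allocation. The hospital names, patient IDs, and stratum sizes are all invented for the illustration.

```python
import random

AGE_GROUPS = ("18-40", "41-60", "61+")

def allocate(stratum_sizes, n):
    """Proportional allocation (an assumption); largest remainders make counts sum to n."""
    total = sum(stratum_sizes.values())
    exact = {g: n * size / total for g, size in stratum_sizes.items()}
    counts = {g: int(x) for g, x in exact.items()}
    leftover = n - sum(counts.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - counts[g], reverse=True)
    for g in by_remainder[:leftover]:
        counts[g] += 1
    return counts

def two_stage_sample(hospitals, n_hospitals=10, per_hospital=40, seed=None):
    rng = random.Random(seed)
    sample = []
    # Cluster stage: randomly choose hospitals
    for name in rng.sample(sorted(hospitals), n_hospitals):
        by_age = hospitals[name]   # patients already grouped by age stratum
        counts = allocate({g: len(p) for g, p in by_age.items()}, per_hospital)
        # Random selection within each age stratum
        for group, n in counts.items():
            sample.extend(rng.sample(by_age[group], n))
    return sample

# Hypothetical data: 200 hospitals, 30 to 120 eligible patients per age group
data_rng = random.Random(0)
hospitals = {
    f"hospital_{h:03d}": {
        g: [f"h{h:03d}-{g}-{i}" for i in range(data_rng.randint(30, 120))]
        for g in AGE_GROUPS
    }
    for h in range(1, 201)
}
sample = two_stage_sample(hospitals, seed=1)
print(len(sample))   # 400
```

An equal split across age groups, or deliberate oversampling of one group, would be equally valid choices. The right allocation depends on the study's goals.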

Practice Problems

Test your understanding with these problems.

Problem 1: A manager places all 250 employee names in a hat and draws 30 names. What sampling method is this?

Every possible group of 30 names is equally likely to be drawn, so every employee has an equal chance of being selected.

Answer: This is simple random sampling (SRS).

Problem 2: A school district has 40 elementary schools. Researchers randomly select 8 schools and test every fourth-grader at those schools. What sampling method is this?

The researchers randomly selected entire groups (schools) and then studied all members within the selected groups.

Answer: This is cluster sampling. The schools are the clusters.

Problem 3: A political polling firm divides voters into three income brackets — low, middle, and high — then randomly selects 200 voters from each bracket. What sampling method is this?

The population is divided into non-overlapping subgroups (income brackets), and a random sample is drawn from each one.

Answer: This is stratified sampling. The income brackets are the strata.

Problem 4: A website posts a survey asking, “How satisfied are you with our service?” and 1,200 users respond voluntarily. What type of sampling is this, and what is the main concern?

Participants were not selected — they chose to respond on their own.

Answer: This is voluntary response sampling. The main concern is self-selection bias: users with strong opinions (very satisfied or very dissatisfied) are more likely to respond, so the results do not represent the typical user.

Problem 5: A quality inspector checks every 15th bottle coming off a filling line, starting from a randomly chosen position between 1 and 15. What sampling method is this? Name one potential problem.

The inspector uses a fixed interval ($k = 15$) with a random starting point.

Answer: This is systematic sampling. A potential problem is periodicity: if the production line has a repeating pattern that aligns with the interval of 15 (for example, if every 15th bottle comes from the same filling nozzle), the sample would contain only bottles from that nozzle. It could miss defects from the other nozzles entirely, or overstate the defect rate if the sampled nozzle is the faulty one.

Key Takeaways

  • Simple random sampling gives every possible sample of the chosen size, and therefore every member, an equal chance of selection. It is the baseline against which other methods are compared.
  • Stratified sampling divides the population into subgroups and samples within each, guaranteeing representation of key groups.
  • Cluster sampling randomly selects entire groups and studies all members within them — practical for spread-out populations but increases variability.
  • Systematic sampling selects every $k$th member from an ordered list; it is simple but vulnerable to periodic patterns.
  • Convenience sampling uses whoever is available — fast but unreliable for drawing conclusions about a population.
  • Voluntary response sampling attracts people with strong opinions and overrepresents extreme views.
  • The sampling method you choose determines whether your results can generalize beyond your sample. Random selection is the foundation of trustworthy inference.

Last updated: March 29, 2026