Bias in Studies

Last updated: March 2026 · Beginner

Bias is a systematic error that pushes results in one direction. Unlike random error, which scatters results unpredictably above and below the true value, bias consistently skews them the same way. The critical point: increasing sample size does not fix bias. A survey of 100,000 people can be more biased than a well-designed survey of 500 if the larger sample was collected in a biased way.

Understanding bias is essential for anyone who reads research, interprets data, or makes decisions based on evidence — which, in practice, means everyone.

What Is Bias?

In statistics, bias refers to any systematic tendency in the data collection or analysis process that produces results that differ from the truth in a consistent direction.

There are two fundamentally different types of error in any study:

  • Bias (systematic error): Consistently pushes results too high or too low. Predictable direction. Cannot be reduced by collecting more data.
  • Random error (noise): Pushes results in unpredictable directions. Averages out with larger samples. Can be reduced by increasing sample size.

Think of it like a bathroom scale. If the scale adds 3 pounds to every reading, that is bias — weighing yourself 50 times and averaging the results still gives you a number that is 3 pounds too high. If the scale fluctuates randomly by a pound or two each time, that is random error — averaging many readings will get you close to your true weight.

The formula that ties them together:

Total Error = Bias + Random Error

Good study design minimizes both, but they require different strategies. Random error is handled by increasing sample size. Bias is handled by improving how you select subjects, phrase questions, and collect measurements.
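The bathroom-scale analogy can be made concrete with a small simulation. This is a minimal sketch with made-up numbers (a true weight of 150 lb, a 3 lb bias, a 2 lb random fluctuation): averaging many readings drives random error toward zero, but the biased scale stays 3 lb high no matter how many readings you take.

```python
import random

random.seed(42)
TRUE_WEIGHT = 150.0  # hypothetical true value

def biased_scale():
    # Systematic error: every reading is exactly 3 lb too high
    return TRUE_WEIGHT + 3.0

def noisy_scale():
    # Random error: readings fluctuate around the true value
    return TRUE_WEIGHT + random.gauss(0, 2.0)

# Averaging many readings removes random error but not bias
noisy_avg = sum(noisy_scale() for _ in range(10_000)) / 10_000
biased_avg = sum(biased_scale() for _ in range(10_000)) / 10_000

print(f"noisy scale average:  {noisy_avg:.2f}")   # close to 150
print(f"biased scale average: {biased_avg:.2f}")  # stuck near 153
```

The noisy average lands within a fraction of a pound of the truth; the biased average is still 3 lb off, illustrating why "collect more data" is the wrong fix for bias.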

Selection Bias

Selection bias occurs when the way you choose your sample systematically excludes part of the population, so the sample does not represent the group you are trying to study.

Example 1: Daytime Hospital Survey

A hospital surveys patients about their care experience by sending staff to interview patients between 9 AM and 3 PM on weekdays.

The problem: This misses patients who were admitted overnight and discharged before morning, patients in surgery during the day, and patients whose family members handle communication during business hours. The sample systematically overrepresents stable, long-stay patients who are awake and available during the day.

Undercoverage

A specific form of selection bias is undercoverage — when some groups in the population have zero chance of being selected. If a telephone survey calls only landline numbers, it excludes everyone who uses only a cell phone. Since younger adults are far more likely to be cell-phone-only, the survey systematically underrepresents them.

Key principle: If any identifiable group has no chance of appearing in your sample, the results cannot represent them — and any conclusions that include them are suspect.
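The landline example can be simulated. The numbers below are assumptions chosen for illustration (younger adults are 80% cell-only, older adults 20%): sampling only people with landlines makes the sample's average age systematically older than the population's.

```python
import random

random.seed(0)

# Hypothetical population: (age, has_landline), where younger adults
# are far more likely to be cell-phone-only
population = []
for _ in range(100_000):
    age = random.randint(18, 80)
    cell_only_prob = 0.8 if age < 35 else 0.2  # assumed rates
    has_landline = random.random() > cell_only_prob
    population.append((age, has_landline))

true_mean_age = sum(a for a, _ in population) / len(population)
landline_sample = [a for a, hl in population if hl]
sample_mean_age = sum(landline_sample) / len(landline_sample)

print(f"true mean age:          {true_mean_age:.1f}")
print(f"landline-only mean age: {sample_mean_age:.1f}")  # skews older
```

Because cell-only adults had a reduced (here, for the under-35s, an 80% reduced) chance of selection, the sample overstates the average age by several years, and no amount of extra dialing repairs it.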

Response Bias

Response bias occurs when the way a question is asked, who is asking it, or the circumstances of the interview influence the answers people give. The data is collected, but it does not reflect what people actually think or do.

Leading Questions

The wording of a question can steer respondents toward a particular answer.

  • Leading: “Don’t you agree that the new policy has been effective?”
  • Neutral: “How would you rate the effectiveness of the new policy?”

The first version signals the “expected” answer. Research consistently shows that people are more likely to agree with a statement than to disagree, a tendency called acquiescence bias.

Social Desirability Bias

People systematically overreport behaviors they see as positive and underreport behaviors they see as negative. This is not deliberate lying — it is an unconscious tendency to present oneself favorably.

Common examples:

  • People overreport how often they exercise, vote, and read
  • People underreport how much they drink, how much television they watch, and how often they skip medications

Example 2: Hand-Washing Survey

A hospital asks staff: “How often do you wash your hands between patient contacts?”

Nearly everyone reports “always” or “almost always.” But observational studies — where researchers secretly watch hand-washing behavior — consistently find compliance rates between 40% and 60% in most hospitals.

The gap is social desirability bias. No healthcare worker wants to admit, even on an anonymous survey, that they sometimes skip hand-washing. This is why behavioral research often relies on direct observation rather than self-reports.

Nonresponse Bias

Nonresponse bias occurs when the people who choose not to respond differ systematically from those who do. Even if your initial sample was perfectly random, a low response rate can destroy that randomness.

Example 3: Patient Satisfaction Survey

A hospital sends satisfaction surveys to 2,000 recently discharged patients. Only 600 respond — a 30% response rate.

Question: Are the 600 who responded representative of the full 2,000?

Possibly not. Patients who had extreme experiences — very good or very bad — may be more motivated to fill out the survey. Patients who were generally satisfied but not strongly so may toss the survey in the recycling bin. Patients who are elderly, less literate, or who experienced complications may be less likely to complete the form.

The result is a sample that overrepresents certain viewpoints and underrepresents others. The direction of the distortion depends on who fails to respond and why.

Guideline: Response rates below 60% raise serious concerns about nonresponse bias. The lower the rate, the more cautious you should be about the results.
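A simulation makes the distortion visible. All numbers here are assumptions for illustration: true satisfaction averages 7 on a 10-point scale, and dissatisfied patients are assumed to be more motivated to return the form.

```python
import random

random.seed(1)

# Hypothetical: true satisfaction scores for 2,000 discharged patients
population = [random.gauss(7.0, 1.5) for _ in range(2_000)]

def responds(score):
    # Assumption: the less satisfied the patient, the more likely
    # they are to send the survey back
    p = 0.15 + 0.12 * max(0.0, 7.0 - score)
    return random.random() < min(1.0, p)

responders = [s for s in population if responds(s)]

pop_mean = sum(population) / len(population)
resp_mean = sum(responders) / len(responders)
print(f"response rate:    {len(responders) / len(population):.0%}")
print(f"true mean score:  {pop_mean:.2f}")
print(f"responders' mean: {resp_mean:.2f}")  # noticeably lower
```

Even though the original 2,000 were a perfect census, the self-selected responders report meaningfully lower satisfaction than the population actually feels, purely because of who chose to answer.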

Measurement Bias

Measurement bias occurs when the data collection instrument or process systematically distorts the values it records. The error is in the measurement itself, not in who or how you selected your sample.

Example 4: Calibration Error

A clinic’s scale reads 2 pounds heavier than actual weight. Every patient’s recorded weight is 2 pounds too high. If the study compares average weight before and after a program, the bias cancels out (both measurements are 2 pounds high). But if the study compares against an external standard, every result is systematically wrong.

Example 5: Self-Report vs Measurement

Studies consistently find that people overestimate their height by about half an inch and underestimate their weight by several pounds when self-reporting versus being measured. Since BMI divides weight by height squared, both errors push in the same direction: any study that relies on self-reported height and weight will systematically underestimate BMI.

The fix for measurement bias is calibration (for instruments) and objective measurement (for human data).
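The calibration example can be checked with simple arithmetic (the weights and the 175 lb external standard below are made-up numbers): a constant offset cancels when you take a before/after difference on the same scale, but not when you compare against an external reference.

```python
# Hypothetical: a scale that reads a constant 2 lb high
OFFSET = 2.0
true_before, true_after = 180.0, 172.0

measured_before = true_before + OFFSET  # 182.0
measured_after = true_after + OFFSET    # 174.0

# Within-scale difference: the bias cancels
print(measured_before - measured_after)  # 8.0, matches the true change

# Against an external 175 lb standard: the bias does not cancel
print(measured_after - 175.0)  # reads -1.0, but the true gap is -3.0
```

This is why a miscalibrated instrument can be harmless for one research question and fatal for another: the bias matters exactly when it fails to cancel.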

Survivorship Bias

Survivorship bias occurs when you analyze only the cases that made it through a selection process and ignore those that did not. The “survivors” are a biased sample of the original group because whatever process eliminated the non-survivors is hidden from your data.

The Classic Example: WWII Aircraft Armor

During World War II, the U.S. military examined returning bombers and found bullet holes concentrated on the wings and fuselage. The initial recommendation: add armor to those areas.

Statistician Abraham Wald pointed out the flaw: the bullet holes showed where a plane could be hit and still return. The planes hit in the engines and cockpit never came back — they were the ones that needed armor. By looking only at survivors, the military was about to protect the wrong areas.

Modern Example: Business Success Studies

A study examines 100 successful companies that have survived 20 years and identifies common traits: strong leadership, innovative culture, aggressive marketing. The conclusion: these traits cause success.

The problem: The study ignores the hundreds of companies that had the same traits but failed. Without comparing survivors to non-survivors, you cannot determine which traits actually matter. The “successful” traits might be equally common among companies that went bankrupt.
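A simulation shows how a trait can look like a success factor among survivors while being equally common among failures. The setup is a deliberate assumption: 60% of companies have the trait, and survival is pure chance, completely independent of it.

```python
import random

random.seed(7)

# Hypothetical: 60% of startups have an "innovative culture" trait,
# and survival (10%) is independent of the trait by construction
companies = [{"trait": random.random() < 0.6,
              "survived": random.random() < 0.1}
             for _ in range(50_000)]

survivors = [c for c in companies if c["survived"]]
failures = [c for c in companies if not c["survived"]]

trait_in_survivors = sum(c["trait"] for c in survivors) / len(survivors)
trait_in_failures = sum(c["trait"] for c in failures) / len(failures)

# The trait is common among survivors (~60%) -- but it is just as
# common among the failed companies nobody studied.
print(f"trait rate among survivors: {trait_in_survivors:.2f}")
print(f"trait rate among failures:  {trait_in_failures:.2f}")
```

A study that looked only at the survivors would report "60% of successful companies share this trait" and conclude it matters; only the comparison with non-survivors reveals that it predicts nothing.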

Wording Effects

Beyond leading questions, the specific words chosen in a survey can shift responses dramatically — even when the questions are technically asking the same thing.

Framing matters:

  • “Government spending” vs “government investment” vs “taxpayer-funded programs” — all describe the same thing, but each phrase triggers different reactions.
  • “Estate tax” vs “death tax” — the same policy described with different words produces support gaps of 10 to 20 percentage points in polls.
  • “How fast was the car going when it smashed into the other car?” vs “…when it contacted the other car?” — in a famous experiment, the word “smashed” produced speed estimates 25% higher than “contacted.”

Order effects: The sequence in which options are presented influences choices. The first option listed in a multiple-choice question tends to be selected more frequently than it would be if listed last. When possible, surveys should randomize the order of response options.
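For unordered response options, randomization is a one-line fix. A minimal sketch (the dining options are hypothetical): each respondent sees an independently shuffled order, so no option systematically benefits from being listed first.

```python
import random

# Hypothetical answer choices with no natural ordering
choices = ["Cafeteria", "Food trucks", "Vending machines", "Off-campus"]

def presented_order(rng=random):
    # A fresh random order for each respondent
    return rng.sample(choices, k=len(choices))

print(presented_order())
```

Note this applies to unordered options only; ordinal scales such as "strongly agree … strongly disagree" should keep their natural order.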

How to Spot Misleading Statistics

When you encounter a statistic — in the news, in a study, or in a workplace report — run through this checklist:

  1. Who funded the study? A study on sugar’s health effects funded by a beverage company deserves extra scrutiny. Funding does not automatically mean bias, but it is a red flag worth noting.
  2. How was the sample selected? Was it random, convenience, or voluntary response? If the article does not say, be cautious.
  3. What was the response rate? A 20% response rate means 80% of the selected sample is missing. The reported results represent only those who chose to respond.
  4. How were questions worded? Look for leading phrasing, loaded terms, or double-barreled questions (two questions packaged as one).
  5. Are they reporting relative or absolute numbers? “Drug X reduces risk by 50%” sounds dramatic. But if the original risk was 2 in 10,000, the new risk is 1 in 10,000 — a real but tiny change. Relative risk reduction can make small effects sound large.
  6. What is the sample size? Very small samples can produce dramatic results by chance.
  7. Is the comparison fair? Are they comparing groups that differ in important ways besides the variable of interest?
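Checklist item 5 is worth working through numerically. Using the figures from the example (risks of 2 in 10,000 and 1 in 10,000), the same result can be stated as a dramatic relative reduction or a tiny absolute one:

```python
# The same trial result, reported two ways
baseline_risk = 2 / 10_000   # risk without the drug
treated_risk = 1 / 10_000    # risk with the drug

relative_reduction = (baseline_risk - treated_risk) / baseline_risk
absolute_reduction = baseline_risk - treated_risk
nnt = 1 / absolute_reduction  # patients treated per one averted case

print(f"relative risk reduction: {relative_reduction:.0%}")   # 50%
print(f"absolute risk reduction: {absolute_reduction:.4%}")   # 0.0100%
print(f"number needed to treat:  {nnt:.0f}")                  # 10000
```

Both statements are true; only the absolute numbers (and the number needed to treat) convey how small the real-world effect is.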

Real-World Application: Nursing — Evaluating a Clinical Study

A pharmaceutical company announces: “80% of patients improved on our new medication.”

Before accepting this claim, a nurse or healthcare professional should ask:

How were patients selected? If the company enrolled only patients with mild symptoms, the high improvement rate may not apply to patients with severe cases. This is selection bias — the study population does not match the clinical population.

Was there a control group? Without a comparison group, you cannot know whether the improvement was caused by the drug or by natural recovery, the placebo effect, or other treatments patients were receiving simultaneously.

What was the dropout rate? If 500 patients started the study and 200 dropped out (perhaps due to side effects), the “80% improved” refers only to the 300 who remained. This is survivorship bias — the patients who had bad experiences are excluded from the results.

How was “improvement” defined? If the company set a low bar — say, any measurable change in symptoms, no matter how small — 80% is less impressive than it sounds. Measurement bias can enter through the definition of the outcome.

Who measured the improvement? If the treating physicians knew which patients received the drug (an unblinded study), their expectations could influence their assessments. This is a form of response bias on the researcher’s side.

The bottom line: a single number like “80% improved” means almost nothing without context about how the study was designed, who was included, and how outcomes were measured.

Practice Problems

Test your understanding with the problems below. Try each one before reading the answer.

Problem 1: A university emails a survey to all 15,000 students asking about campus dining. Only 800 respond, and 70% of respondents say dining options are “poor.” What type of bias is the main concern, and why?

The response rate is 800/15,000 ≈ 5.3%, which is extremely low. Students who are dissatisfied with dining are more motivated to respond than those who are neutral or satisfied.

Answer: The main concern is nonresponse bias. The 70% dissatisfaction rate likely overestimates the true level because dissatisfied students are disproportionately represented among the 5.3% who responded.

Problem 2: A fitness app reports: “Users who track their meals lose an average of 12 pounds in 3 months.” What type of bias should you suspect?

The users who track their meals are self-selected — they chose to use this feature. People who are more motivated, disciplined, or already losing weight are more likely to track meals. The app is comparing a motivated subgroup against the general user base.

Answer: This is selection bias (specifically self-selection). The meal-tracking users differ systematically from non-trackers in motivation and behavior, so the weight loss cannot be attributed to meal tracking itself.

Problem 3: A survey asks employees: “Given the company’s difficult financial situation, do you support the proposed pay freeze?” 78% say yes. What type of bias is present?

The phrase “given the company’s difficult financial situation” frames the question in a way that pressures respondents toward agreement. A neutral version would ask: “Do you support the proposed pay freeze?” without the preamble.

Answer: This is response bias caused by a leading question. The framing primes respondents to consider the company’s hardship before answering, which inflates support.

Problem 4: A magazine publishes an article: “The top 50 CEOs all practiced daily meditation.” The article concludes that meditation leads to business success. What bias is at work?

The article examines only successful CEOs. It does not consider the thousands of people who meditate daily and are not CEOs, or the many unsuccessful business leaders who also meditated. By studying only “survivors” (top CEOs), the analysis cannot determine whether meditation is actually associated with success.

Answer: This is survivorship bias. Without data on non-successful individuals who also meditate, the conclusion is unfounded.

Problem 5: A clinic weighs patients using a scale that has not been calibrated in two years. A quality audit discovers the scale reads 1.5 kg too low. What type of bias is this, and does a larger sample size fix it?

The scale systematically underreports every patient’s weight by 1.5 kg. This is an error in the measuring instrument, not in how patients were selected.

Answer: This is measurement bias. A larger sample size does not fix it — averaging more readings from the same faulty scale still produces an average that is 1.5 kg too low. The fix is to calibrate the scale.

Key Takeaways

  • Bias is systematic error that pushes results in one direction. It cannot be fixed by increasing sample size.
  • Selection bias occurs when the sample systematically excludes part of the population — undercoverage is a common form.
  • Response bias occurs when question wording, social pressure, or the interview setting influences answers.
  • Nonresponse bias occurs when the people who fail to respond differ systematically from those who do — low response rates are a major warning sign.
  • Measurement bias occurs when the data collection instrument or method consistently distorts values.
  • Survivorship bias occurs when you analyze only the cases that “survived” a selection process and ignore those that were eliminated.
  • To evaluate any statistic, ask: Who was sampled? How were they selected? What was the response rate? How were questions worded? Who funded the study?
  • A critical consumer of statistics does not just look at the numbers — they look at the process that produced the numbers.
