One-Sample T-Test
The one-sample t-test is the workhorse of statistical testing. It tests claims about a population mean $\mu$ when the population standard deviation $\sigma$ is unknown, which is almost always the case. In practice, you rarely know $\sigma$ with certainty. You have a sample, you compute the sample mean $\bar{x}$ and sample standard deviation $s$, and you need to decide whether the data provide evidence against a hypothesized value $\mu_0$. The t-test handles this by using the t-distribution instead of the standard normal, accounting for the additional uncertainty introduced by estimating $\sigma$ with $s$. If you have worked through the introduction to hypothesis testing and the z-test, you already know the seven-step framework. Here you will apply it with the t-distribution.
When to Use the T-Test
Use a one-sample t-test when all of the following are true:
- You are testing a claim about a single population mean
- The population standard deviation $\sigma$ is unknown (you only have the sample standard deviation $s$)
- You have data from one sample, not from two groups or paired observations
If $\sigma$ is known (rare), use a z-test instead. If you are comparing two groups, use a two-sample test. If you have paired data (before/after on the same subjects), use a paired t-test (also covered in the two-sample tests lesson).
The T-Test Statistic
The test statistic for a one-sample t-test is:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
where:
- $\bar{x}$ = sample mean
- $\mu_0$ = hypothesized population mean (from $H_0$)
- $s$ = sample standard deviation
- $n$ = sample size
- $s / \sqrt{n}$ = the estimated standard error of the mean
The test statistic follows a t-distribution with $df = n - 1$ degrees of freedom. Unlike the standard normal distribution (which is fully specified), the t-distribution requires knowing the degrees of freedom. With fewer degrees of freedom, the t-distribution has heavier tails, meaning you need a more extreme test statistic to achieve the same p-value. As $n$ increases, the t-distribution approaches the standard normal.
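The statistic measures the gap between the sample mean and the hypothesized mean in units of the estimated standard error. A minimal Python sketch of that calculation from summary statistics follows; the function name `one_sample_t` and the input values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    se = s / math.sqrt(n)      # estimated standard error of the mean
    t = (xbar - mu0) / se      # distance from mu0 in standard-error units
    return t, n - 1            # degrees of freedom: df = n - 1

# Hypothetical inputs: sample mean 101.2, null value 100, s = 3.0, n = 16
t, df = one_sample_t(101.2, 100, 3.0, 16)
print(round(t, 3), df)         # standard error is 3.0 / 4 = 0.75
```

The function stops at the statistic and degrees of freedom; turning them into a p-value requires the t-distribution itself, from a table or statistical software.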
Conditions
Before performing a one-sample t-test, verify these conditions:
- Random sample: the data must come from a random sampling process or a randomized experiment
- Independence: individual observations must be independent (typically satisfied if the sample is less than 10% of the population)
- Nearly normal population OR large sample size: what this requires depends on $n$
  - If $n$ is small (under 15): the population distribution should be approximately normal, with no outliers
  - If $n$ is moderate (at least 15 but under 30): mild skewness is acceptable, but there should be no strong outliers
  - If $n$ is large (30 or more): the Central Limit Theorem ensures the sampling distribution of $\bar{x}$ is approximately normal, even if the population is skewed
Practical tip: Always plot your data. A histogram, dotplot, or boxplot of the sample can reveal outliers or extreme skewness that would violate the normality condition for small samples.
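Alongside a plot, a quick numeric screen can help. The sketch below reports which sample-size guideline applies and flags points outside the usual boxplot fences. The 1.5 × IQR fence is an assumption borrowed from the boxplot convention, not a rule stated in this lesson, and the function name `screen_sample` and the data are hypothetical.

```python
import statistics

def screen_sample(data):
    """Report the sample-size guideline and any 1.5*IQR outliers."""
    n = len(data)
    if n < 15:
        regime = "small: needs approximately normal data, no outliers"
    elif n < 30:
        regime = "moderate: mild skew acceptable, no strong outliers"
    else:
        regime = "large: CLT supports the t-test even with skew"
    q1, _, q3 = statistics.quantiles(data, n=4)    # sample quartiles
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # boxplot fences (assumed rule)
    outliers = [x for x in data if x < low or x > high]
    return regime, outliers

# Hypothetical cup volumes in oz, including one suspicious reading
cups = [15.6, 15.8, 15.7, 15.9, 15.5, 15.7, 15.8, 15.6, 15.7, 18.2]
regime, outliers = screen_sample(cups)
print(regime)
print(outliers)
```

A flagged point is a prompt to look at the data, not an automatic reason to delete an observation.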
Worked Examples
Example 1: Two-Sided T-Test — Coffee Shop Cup Size
A coffee shop advertises that its large cups contain 16 oz. A skeptical customer measures the contents of 20 randomly selected large cups and finds $\bar{x} = 15.7$ oz with $s = 0.6$ oz. Is there evidence the cups do not contain 16 oz? Test at $\alpha = 0.05$.
Step 1: State the hypotheses.
$$H_0: \mu = 16 \qquad H_a: \mu \neq 16$$
Step 2: Choose significance level: $\alpha = 0.05$.
Step 3: Check conditions. Random sample of 20 cups ✓. Independence (20 cups is a tiny fraction of all cups sold) ✓. Sample size is moderate ($n = 20$), so we need to assume no strong outliers or extreme skew in cup volumes, which is reasonable for a manufacturing process ✓.
Step 4: Calculate the test statistic.
$$t = \frac{15.7 - 16}{0.6 / \sqrt{20}} = \frac{-0.3}{0.1342} \approx -2.235$$
Step 5: Find the p-value. With $df = 19$ and a two-sided test:
$$p = 2 \cdot P(T_{19} \leq -2.235) \approx 0.038$$
Step 6: Make a decision. Since $p = 0.038 < \alpha = 0.05$, we reject $H_0$.
Step 7: Conclusion in context. There is statistically significant evidence that the large cups do not contain 16 oz as advertised. The sample mean of 15.7 oz is significantly below 16 oz. The coffee shop appears to be under-pouring by about 0.3 oz on average. While this may seem small, it represents a consistent shortfall that could affect customer satisfaction — and potentially violate consumer protection standards.
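For readers who want to check Example 1 numerically, here is a minimal sketch using only the Python standard library. It assumes $s = 0.6$ oz, the value consistent with the $t = -2.235$ and 95% interval shown in the software output later in this lesson, and it approximates the t-distribution tail area with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are illustrative; in practice a statistics library (for example SciPy's `scipy.stats.ttest_1samp` on the raw data) would normally do this step.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3         # area beyond |t| in one tail
    return upper if t >= 0 else 1 - upper

xbar, mu0, s, n = 15.7, 16.0, 0.6, 20   # s = 0.6 is an assumed value
t = (xbar - mu0) / (s / math.sqrt(n))
p_two_sided = 2 * t_upper_tail(abs(t), n - 1)
print(f"t = {t:.3f}, df = {n - 1}, p = {p_two_sided:.3f}")
```

The p-value should land close to the 0.038 reported for this example. The statistic may differ from $-2.235$ in the third decimal place because the code does not round the standard error before dividing.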
Example 2: One-Sided T-Test — Hospital Discharge Time
Hospital protocol states that the average discharge process should take no more than 45 minutes. A hospital administrator studies a random sample of 30 patients and finds a sample mean of $\bar{x} = 48.3$ minutes with sample standard deviation $s$. Is there evidence that the average discharge time exceeds 45 minutes? Test at significance level $\alpha$.
Step 1: State the hypotheses.
$$H_0: \mu = 45 \qquad H_a: \mu > 45$$
Step 2: Choose significance level $\alpha$.
Step 3: Check conditions. Random sample ✓. Independence (30 patients from a large hospital population) ✓. Sample size $n = 30$ is at the threshold for the CLT ✓.
Step 4: Calculate the test statistic.
$$t = \frac{48.3 - 45}{s / \sqrt{30}} = \frac{3.3}{s / \sqrt{30}}$$
Step 5: Find the p-value. With $df = 29$ and a one-sided right test:
$$p = P(T_{29} \geq t)$$
Step 6: Make a decision. Since $p < \alpha$, we reject $H_0$.
Step 7: Conclusion in context. There is statistically significant evidence that the average discharge time exceeds the 45-minute protocol. The observed mean of 48.3 minutes is significantly longer than the target. The hospital should investigate bottlenecks in the discharge process — common causes include delayed paperwork, waiting for prescriptions, and scheduling follow-up appointments.
Reading T-Test Output
When you use a calculator or software (TI-84, Excel, R, Python, etc.) to perform a t-test, the output typically includes:
- t-statistic: the calculated value of $t$
- df: degrees of freedom ($n - 1$)
- p-value: the probability used for the decision. Make sure you know whether the software reports a one-sided or two-sided p-value. Some software always reports two-sided; if you need one-sided, divide by 2 when the sample mean falls on the side of $\mu_0$ named in $H_a$ (otherwise the one-sided p-value is 1 minus half the two-sided value).
- Sample mean: the observed mean $\bar{x}$
- Confidence interval: many programs also report a confidence interval for $\mu$, which provides the same information in a different form
Example software output:
| Statistic | Value |
|---|---|
| t | -2.235 |
| df | 19 |
| p-value (two-sided) | 0.038 |
| Sample mean | 15.7 |
| 95% CI | (15.42, 15.98) |
Reading this: the test statistic is $t = -2.235$, with 19 degrees of freedom. The two-sided p-value is 0.038, which is less than 0.05, so we reject the null hypothesis. The 95% confidence interval does not contain 16, which is consistent with the rejection decision.
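When software reports only a two-sided p-value, the conversion to a one-sided value depends on which way the statistic points. The sketch below shows one way to write that rule; the function name `one_sided_p` and its `alternative` labels are hypothetical, and the rule relies on the symmetry of the t-distribution.

```python
def one_sided_p(two_sided_p, t_stat, alternative):
    """Convert a two-sided t-test p-value to a one-sided p-value.

    alternative is "less" (H_a: mu < mu0) or "greater" (H_a: mu > mu0).
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = two_sided_p / 2
    in_direction = t_stat < 0 if alternative == "less" else t_stat > 0
    # Halve only when the statistic points the way H_a predicts
    return half if in_direction else 1 - half

# Values from the example output above: t = -2.235, two-sided p = 0.038
print(one_sided_p(0.038, -2.235, "less"))
print(one_sided_p(0.038, -2.235, "greater"))
```

With a negative statistic, the left-tailed p-value is half the two-sided value, while the right-tailed p-value is close to 1, so simply dividing by 2 would be wrong for the "greater" alternative.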
T-Test vs Z-Test — When to Use Which
| Feature | Z-Test | T-Test |
|---|---|---|
| Use when | $\sigma$ is known | $\sigma$ is unknown (use $s$) |
| How common | Rare in practice | Very common (the default choice) |
| Reference distribution | Standard normal ($N(0, 1)$) | t-distribution with $df = n - 1$ |
| Test statistic | $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$ | $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ |
| Critical values | Same for all sample sizes | Depend on degrees of freedom |
| For large $n$ | Results are virtually identical to the t-test | Approaches the z-test as $n \to \infty$ |
| Typical scenario | Known manufacturing $\sigma$, textbook problems | Any real data set where $\sigma$ must be estimated |
Bottom line: When in doubt, use the t-test. If $\sigma$ happens to be known and you use a z-test, that is fine. But if $\sigma$ is unknown and you use a z-test anyway (substituting $s$ for $\sigma$), your p-values will be too small, especially for small samples. The t-test correctly accounts for this additional uncertainty.
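A small simulation can make this concrete. The sketch below draws many samples of size 5 from a normal population where the null hypothesis is true, then counts how often the statistic exceeds the z cutoff (1.96) versus the t cutoff for $df = 4$ (2.776). The function name, seed, and trial count are arbitrary choices for illustration.

```python
import math
import random
import statistics

def false_positive_rate(n, cutoff, trials=10000, seed=1):
    """Share of samples from a true null with |statistic| > cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]    # H0 is true: mu = 0
        s = statistics.stdev(sample)                          # s stands in for sigma
        stat = statistics.fmean(sample) / (s / math.sqrt(n))
        hits += abs(stat) > cutoff
    return hits / trials

# 1.96 is the two-sided 5% cutoff for z; 2.776 is the t cutoff for df = 4
print(false_positive_rate(n=5, cutoff=1.96))
print(false_positive_rate(n=5, cutoff=2.776))
```

Under these assumptions, the z cutoff rejects a true null well above the nominal 5% of the time, while the t cutoff stays near 5%.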
Confidence Interval Connection
There is a deep link between hypothesis tests and confidence intervals. A two-sided hypothesis test at significance level $\alpha$ is equivalent to checking whether the corresponding $(1 - \alpha) \times 100\%$ confidence interval for $\mu$ contains the null value $\mu_0$.
- If $\mu_0$ falls inside the confidence interval: fail to reject $H_0$
- If $\mu_0$ falls outside the confidence interval: reject $H_0$
Example: In Example 1 above, we rejected $H_0: \mu = 16$ at $\alpha = 0.05$. The 95% confidence interval for $\mu$ is:
$$\bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}} = 15.7 \pm 2.093 \times \frac{0.6}{\sqrt{20}} = 15.7 \pm 0.28 = (15.42, 15.98)$$
Since 16 is not in the interval $(15.42, 15.98)$, we reject $H_0$, consistent with the hypothesis test result. The confidence interval gives additional information: not only is the mean significantly different from 16, but the plausible values for the true mean lie between 15.42 and 15.98 oz.
This duality is especially useful for communicating results. Instead of saying “we rejected the null hypothesis with p = 0.038,” you can say “we are 95% confident the true mean is between 15.42 and 15.98 oz, which does not include the advertised 16 oz.”
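The interval check can be sketched in a few lines of Python. The critical value 2.093 is the standard table entry for 95% confidence with $df = 19$; the standard deviation $s = 0.6$ oz is an assumption consistent with the interval reported for Example 1, and the function name `t_confidence_interval` is illustrative.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (low, high) for the mean, given a t critical value."""
    margin = t_crit * s / math.sqrt(n)    # critical value times standard error
    return xbar - margin, xbar + margin

# Example 1 inputs; s = 0.6 is assumed, 2.093 is the tabled value for df = 19
low, high = t_confidence_interval(15.7, 0.6, 20, 2.093)
mu0 = 16
reject = not (low <= mu0 <= high)         # same decision as the two-sided test
print(round(low, 2), round(high, 2), reject)   # prints: 15.42 15.98 True
```

Passing the critical value in as an argument keeps the sketch free of any distribution library; software would normally look it up from the confidence level and degrees of freedom.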
Real-World Application: Nursing — Testing Average Recovery Time
A surgical unit adopts a new post-operative care protocol. The established average recovery time under the old protocol was 5.2 days. The head nurse wants to know whether the new protocol has changed recovery time. A random sample of 18 patients under the new protocol yields a sample mean of $\bar{x} = 4.6$ days with sample standard deviation $s$. Test at significance level $\alpha$.
$$H_0: \mu = 5.2 \qquad H_a: \mu \neq 5.2$$
Check conditions: random sample ✓, independence ✓, $n = 18$ is moderate so we need approximate normality. Recovery times are typically right-skewed, but mild skew is acceptable for a sample of this size ✓.
$$t = \frac{4.6 - 5.2}{s / \sqrt{18}} = \frac{-0.6}{s / \sqrt{18}}$$
With $df = 17$, the two-sided p-value: $p = 2 \cdot P(T_{17} \leq t)$.
Since $p < \alpha$, reject $H_0$. There is evidence that the new protocol has changed recovery time. The sample mean of 4.6 days is significantly lower than the historical 5.2 days, a reduction of 0.6 days per patient that is both statistically significant and clinically meaningful. Shorter recovery times translate directly to improved patient well-being, faster bed turnover, and reduced hospital costs.
Practice Problems
Test your understanding with these problems.
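Problem 1: A food company labels its soup cans as containing 305 grams. A consumer group measures 25 randomly selected cans and finds a sample mean of $\bar{x} = 301$ g with sample standard deviation $s$. Is there evidence the cans contain less than 305 g? (Test at significance level $\alpha$.)
$H_0: \mu = 305$, $H_a: \mu < 305$ (one-sided left)
$t = \frac{301 - 305}{s / \sqrt{25}}$. With $df = 24$, $p = P(T_{24} \leq t)$.
Since $p < \alpha$, reject $H_0$.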
Answer: There is statistically significant evidence that the cans contain less than 305 grams. The sample mean of 301 g is significantly below the label claim.
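Problem 2: A teacher claims the average score on a standardized test at her school is 500. A random sample of 32 students yields a sample mean of $\bar{x} = 512$ and sample standard deviation $s$. Test whether the mean score differs from 500 at significance level $\alpha$.
$H_0: \mu = 500$, $H_a: \mu \neq 500$ (two-sided)
$t = \frac{512 - 500}{s / \sqrt{32}}$. With $df = 31$, two-sided p-value $p = 2 \cdot P(T_{31} \geq t)$.
Since $p > \alpha$, fail to reject $H_0$.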
Answer: There is not sufficient evidence to conclude the mean score differs from 500. The observed difference of 12 points above 500 could reasonably occur by chance.
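Problem 3: A fitness trainer claims her program reduces resting heart rate to below 72 bpm. She measures 15 clients after 8 weeks and finds a sample mean of $\bar{x} = 69.5$ bpm with sample standard deviation $s$. Test at significance level $\alpha$ (one-sided).
$H_0: \mu = 72$, $H_a: \mu < 72$ (one-sided left)
$t = \frac{69.5 - 72}{s / \sqrt{15}}$. With $df = 14$, $p = P(T_{14} \leq t)$.
Since $p < \alpha$, reject $H_0$.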
Answer: There is statistically significant evidence that the program reduces resting heart rate below 72 bpm. The sample mean of 69.5 bpm is significantly lower than the threshold.
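Problem 4: The recommended daily water intake is 2,000 mL. A dietitian surveys 40 office workers and finds a sample mean of $\bar{x} = 1{,}850$ mL with sample standard deviation $s$. Is there evidence that office workers drink less than the recommended amount? ($\alpha = 0.01$)
$H_0: \mu = 2000$, $H_a: \mu < 2000$ (one-sided left)
$t = \frac{1850 - 2000}{s / \sqrt{40}}$. With $df = 39$, $p = P(T_{39} \leq t)$.
Since $p < \alpha = 0.01$, reject $H_0$.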
Answer: There is statistically significant evidence at the 1% level that office workers drink less than the recommended 2,000 mL per day. The average of 1,850 mL represents a meaningful shortfall of 150 mL.
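Problem 5: A bakery says its loaves weigh 680 g on average. An inspector weighs 10 random loaves and finds a sample mean of $\bar{x} = 673$ g with sample standard deviation $s$. Test whether the mean weight differs from 680 g at significance level $\alpha$. (Note: with $n = 10$, you need the normality assumption.)
$H_0: \mu = 680$, $H_a: \mu \neq 680$ (two-sided)
$t = \frac{673 - 680}{s / \sqrt{10}}$. With $df = 9$, two-sided p-value $p = 2 \cdot P(T_{9} \leq t)$.
Since $p > \alpha$, fail to reject $H_0$.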
Answer: There is not sufficient evidence to conclude the mean loaf weight differs from 680 g. The sample of 10 is small, and while the observed mean is 7 g below the target, this difference is not statistically significant. A larger sample would provide more power to detect such a difference.
Key Takeaways
- The one-sample t-test is the standard test for a population mean when $\sigma$ is unknown, which covers nearly all real-world situations
- The test statistic is $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ with $df = n - 1$ degrees of freedom
- The t-distribution has heavier tails than the standard normal, producing wider intervals and larger p-values for small samples. This correctly accounts for the uncertainty in estimating $\sigma$ with $s$
- For large $n$ (30 or more), the t-test and z-test give nearly identical results because the t-distribution approaches the normal
- Check conditions carefully, especially for small samples: the data should be approximately normal with no strong outliers
- A two-sided t-test at level $\alpha$ and a $(1 - \alpha) \times 100\%$ confidence interval always agree: reject $H_0$ if and only if $\mu_0$ is outside the confidence interval
- Report both the p-value and a confidence interval whenever possible — the CI communicates the direction and magnitude of the effect, not just whether it is significant
- In clinical and healthcare settings, the t-test is essential for evaluating protocols, treatment outcomes, and quality benchmarks with real patient data
Last updated: March 29, 2026