
Chi-Square Tests

Last updated: March 2026 · Advanced

Before you start

You should be comfortable with:

Introduction to Hypothesis Testing
Two-Way Tables and Probability

The z-tests and t-tests you have learned so far work well for comparing means and proportions — but what about categorical data? When your variable is not a number but a category (eye color, political party, treatment outcome, product preference), you cannot compute a mean or a standard deviation. Instead, you work with counts — how many observations fall into each category. The chi-square test (pronounced “ky-square” and written $\chi^2$ ) is the standard tool for testing hypotheses about categorical data. In this lesson you will learn two versions: the goodness-of-fit test for one categorical variable, and the test of independence for two categorical variables.

The Chi-Square Statistic

Both chi-square tests use the same core statistic:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

where $O$ is the observed count (what you actually counted in the data) and $E$ is the expected count (what you would expect if the null hypothesis were true). The sum is taken over all categories or cells.

The logic is straightforward:

For each category, compute how far the observed count is from the expected count
Square the difference (so negative and positive deviations both contribute positively)
Divide by the expected count (a difference of 10 matters more when the expected count is 20 than when it is 2000)
Add up all the terms

A large $\chi^2$ value means the observed data is far from what the null hypothesis predicts — evidence against $H_0$ . A small $\chi^2$ value means the data fits the expected pattern well — no reason to reject $H_0$ . The chi-square statistic is always non-negative (zero or positive), and the test is always right-tailed: you reject $H_0$ only when $\chi^2$ is large enough.
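The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a library routine: the function name `chi_square_terms` and the two toy categories are assumptions made up for the example. It shows why dividing by the expected count matters, since the same difference of 10 contributes far more when the expected count is 20 than when it is 2000.

```python
def chi_square_terms(observed, expected):
    """Return the per-category contributions (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical counts: both categories are off by 10,
# but the expected counts are 20 and 2000.
terms = chi_square_terms(observed=[30, 1990], expected=[20, 2000])
statistic = sum(terms)  # the chi-square statistic is the sum of the terms

print(terms)
print(statistic)
```

The first category contributes 100 times as much as the second, even though both differences are the same size.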

The degrees of freedom depend on which test you are performing, as described below.

Chi-Square Goodness-of-Fit Test

The goodness-of-fit test asks: does a single categorical variable follow a specified distribution? You have observed counts for each category, and you compare them to what you would expect under a hypothesized distribution.

$H_0$ : the distribution of the variable matches the expected proportions
$H_a$ : the distribution does not match the expected proportions
Degrees of freedom: $df = (\text{number of categories}) - 1$

Example 1: Testing a Die for Fairness

You roll a six-sided die 120 times and record the results:

Face	1	2	3	4	5	6	Total
Observed	15	22	25	18	17	23	120
Expected	20	20	20	20	20	20	120

If the die is fair, each face should appear $120 / 6 = 20$ times. Is there evidence the die is unfair? Test at $\alpha = 0.05$ .

Step 1: State the hypotheses.

$H_0: \text{the die is fair (each face has probability } 1/6\text{)}$

$H_a: \text{the die is not fair (at least one face differs from } 1/6\text{)}$

Step 2: Calculate the chi-square statistic.

$\chi^2 = \frac{(15-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(25-20)^2}{20} + \frac{(18-20)^2}{20} + \frac{(17-20)^2}{20} + \frac{(23-20)^2}{20}$

$= \frac{25}{20} + \frac{4}{20} + \frac{25}{20} + \frac{4}{20} + \frac{9}{20} + \frac{9}{20}$

$= 1.25 + 0.20 + 1.25 + 0.20 + 0.45 + 0.45 = 3.80$

Step 3: Find degrees of freedom and the critical value.

$df = 6 - 1 = 5$

At $\alpha = 0.05$ with $df = 5$ , the critical value is $\chi^2_{0.05} = 11.070$ .

Step 4: Make a decision. Since $3.80$ is less than $11.070$ , we fail to reject $H_0$ .

Step 5: Conclusion in context. There is no statistically significant evidence that the die is unfair. The observed variation (some faces appearing slightly more or less than 20 times) is well within what random chance would produce with a fair die.
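The hand calculation in Example 1 can be reproduced with a short script. This is only a sketch of the same arithmetic; the variable names are our own, and the critical value is copied from the table rather than computed.

```python
observed = [15, 22, 25, 18, 17, 23]   # counts for faces 1 to 6
n = sum(observed)                      # 120 rolls
expected = [n / 6] * 6                 # 20 per face if the die is fair

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
critical_value = 11.070                # alpha = 0.05, df = 5 (from the table)

reject_h0 = chi_sq > critical_value
print(round(chi_sq, 2), df, reject_h0)  # 3.8 5 False
```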

Example 2: Customer Preference

A store manager believes customers prefer four product flavors equally. She surveys 200 customers:

Flavor	Vanilla	Chocolate	Strawberry	Mint	Total
Observed	62	48	55	35	200
Expected	50	50	50	50	200

$\chi^2 = \frac{(62-50)^2}{50} + \frac{(48-50)^2}{50} + \frac{(55-50)^2}{50} + \frac{(35-50)^2}{50}$

$= \frac{144}{50} + \frac{4}{50} + \frac{25}{50} + \frac{225}{50} = 2.88 + 0.08 + 0.50 + 4.50 = 7.96$

With $df = 4 - 1 = 3$ and $\alpha = 0.05$ , the critical value is $\chi^2_{0.05} = 7.815$ . Since $7.96 > 7.815$ , we reject $H_0$ . There is evidence that customer preferences are not equally distributed. Mint appears to be the least popular flavor, while vanilla is the most popular.
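Rejecting $H_0$ does not by itself say which categories are responsible. One informal way to see this, sketched below, is to list each category's contribution to $\chi^2$ . The dictionary layout and variable names are assumptions for illustration; the counts are those of Example 2.

```python
flavors = ["Vanilla", "Chocolate", "Strawberry", "Mint"]
observed = [62, 48, 55, 35]
expected = [sum(observed) / len(observed)] * len(observed)  # 50 each

# Contribution of each flavor: (O - E)^2 / E
contributions = {
    name: (o - e) ** 2 / e
    for name, o, e in zip(flavors, observed, expected)
}
chi_sq = sum(contributions.values())
largest = max(contributions, key=contributions.get)

for name, value in contributions.items():
    print(f"{name:<10} {value:.2f}")
print(f"total      {chi_sq:.2f}  (largest contribution: {largest})")
```

Mint and vanilla together account for most of the statistic, which matches the interpretation given in the example.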

Chi-Square Test of Independence

The test of independence asks: are two categorical variables related (associated), or are they independent? The data is organized in a contingency table (also called a two-way table) with rows for one variable and columns for the other.

$H_0$ : the two variables are independent (no association)
$H_a$ : the two variables are not independent (there is an association)
Expected count for each cell: $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$
Degrees of freedom: $df = (r - 1)(c - 1)$ , where $r$ is the number of rows and $c$ is the number of columns

Example 3: Smoking and Lung Disease

A health researcher collects data on 1000 adults:

	Lung Disease	No Lung Disease	Total
Smoker	90	210	300
Non-smoker	60	640	700
Total	150	850	1000

Is there an association between smoking status and lung disease? Test at $\alpha = 0.05$ .

Step 1: State the hypotheses.

$H_0: \text{smoking status and lung disease are independent}$

$H_a: \text{smoking status and lung disease are associated}$

Step 2: Calculate expected counts under independence.

$E(\text{Smoker, Disease}) = \frac{300 \times 150}{1000} = 45$

$E(\text{Smoker, No Disease}) = \frac{300 \times 850}{1000} = 255$

$E(\text{Non-smoker, Disease}) = \frac{700 \times 150}{1000} = 105$

$E(\text{Non-smoker, No Disease}) = \frac{700 \times 850}{1000} = 595$

Verify: $45 + 255 = 300$ ✓, $105 + 595 = 700$ ✓, $45 + 105 = 150$ ✓, $255 + 595 = 850$ ✓.
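The expected-count formula applies to every cell in the same way, so it is easy to express as a small helper. The sketch below assumes the table is stored as a list of rows; the function name `expected_counts` is our own.

```python
def expected_counts(table):
    """Expected counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # E = (row total * column total) / grand total, cell by cell
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


observed = [
    [90, 210],   # smokers: lung disease, no lung disease
    [60, 640],   # non-smokers: lung disease, no lung disease
]
expected = expected_counts(observed)
print(expected)

# The expected counts keep the same row totals as the observed table.
print([sum(row) for row in expected])
```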

Step 3: Calculate the chi-square statistic.

$\chi^2 = \frac{(90-45)^2}{45} + \frac{(210-255)^2}{255} + \frac{(60-105)^2}{105} + \frac{(640-595)^2}{595}$

$= \frac{2025}{45} + \frac{2025}{255} + \frac{2025}{105} + \frac{2025}{595}$

$= 45.000 + 7.941 + 19.286 + 3.403 = 75.630$

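Notice that every cell has the same squared difference ( $45^2 = 2025$ ). This always happens in a 2-by-2 table: the observed and expected counts share the same row and column totals, so the four deviations $O - E$ must have the same size, with signs alternating across the table ( $+45$ , $-45$ , $-45$ , $+45$ here).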

Step 4: Find degrees of freedom and the critical value.

$df = (2-1)(2-1) = 1$

At $\alpha = 0.05$ with $df = 1$ , the critical value is $\chi^2_{0.05} = 3.841$ .

Step 5: Make a decision. Since $75.630$ far exceeds $3.841$ , we reject $H_0$ .

Step 6: Conclusion in context. There is overwhelming evidence of an association between smoking status and lung disease. Smokers had a lung disease rate of $90/300 = 30\%$ , compared to $60/700 = 8.6\%$ for non-smokers. While this test does not prove causation, the strong association is consistent with the well-established medical understanding that smoking increases lung disease risk.
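The whole procedure of Example 3 (expected counts, statistic, degrees of freedom) fits in one function. This is a minimal sketch under the assumption that the table is a list of rows of counts; `chi_square_independence` is a name chosen for this example, not a library function.

```python
def chi_square_independence(table):
    """Return (statistic, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


statistic, df = chi_square_independence([[90, 210], [60, 640]])
print(round(statistic, 2), df)
print(statistic > 3.841)   # compare with the critical value for df = 1
```

The same function works for larger tables, such as the 3-by-2 tables in the later examples.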

Conditions for Chi-Square Tests

Both chi-square tests require the following conditions:

Random sample or random assignment — the data must come from a properly designed study
Independence — each observation must be independent of the others; one person’s response cannot influence another’s
Expected counts are at least 5 — every cell in the table must have an expected count of 5 or more. This ensures the chi-square distribution is a good approximation for the test statistic. If any expected count is below 5, consider combining categories or using Fisher’s exact test (for 2-by-2 tables)

Note that the “at least 5” rule applies to expected counts, not observed counts. An observed count of 0 or 2 is fine as long as the expected count for that cell is at least 5.
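The expected-count condition can be checked mechanically before running a test. The sketch below flags cells whose expected count falls below 5; the function name and both small tables are hypothetical, invented only to show one table that passes (despite an observed count of 0) and one that does not.

```python
def small_expected_cells(table, minimum=5):
    """Return (row, column, expected) for cells with expected count below minimum."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    flagged = []
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / grand_total
            if expected < minimum:
                flagged.append((i, j, expected))
    return flagged


# Hypothetical table with an observed 0 whose expected count is still about 6.67.
ok_table = [[0, 40], [20, 60]]
# Hypothetical sparse table: two cells have expected counts below 5.
sparse_table = [[3, 7], [2, 28]]

print(small_expected_cells(ok_table))      # no cells flagged
print(small_expected_cells(sparse_table))  # cells (0, 0) and (1, 0) flagged
```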

Chi-Square Critical Values Reference Table

df	$\alpha = 0.10$	$\alpha = 0.05$	$\alpha = 0.01$
1	2.706	3.841	6.635
2	4.605	5.991	9.210
3	6.251	7.815	11.345
4	7.779	9.488	13.277
5	9.236	11.070	15.086
10	15.987	18.307	23.209

To use the table: find your degrees of freedom in the left column and your chosen $\alpha$ across the top. Reject $H_0$ if your calculated $\chi^2$ exceeds the table value.
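The table values can be cross-checked by computing the right-tail probability directly: at a critical value for $\alpha = 0.05$ , the area to the right should be about 0.05. The sketch below uses the standard finite-sum formulas for the chi-square tail with whole-number degrees of freedom and only the Python standard library. The function name `chi_square_sf` is our own; in practice a statistics package (for example SciPy's `scipy.stats.chi2.sf`) would normally be used instead.

```python
import math


def chi_square_sf(x, df):
    """Right-tail probability P(X >= x) for chi-square with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + total


# Each table entry in the alpha = 0.05 column should give a tail area near 0.05.
for df, critical in [(1, 3.841), (2, 5.991), (5, 11.070), (10, 18.307)]:
    print(df, round(chi_square_sf(critical, df), 4))
```

The same function gives a p-value for any computed statistic: reject $H_0$ when the p-value is below $\alpha$ , which is equivalent to the statistic exceeding the critical value.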

Real-World Application: Nursing — Treatment Effectiveness Across Treatment Types

A nurse researcher wants to know whether healing rates differ among three wound-care treatments. She classifies 360 patients by treatment type and treatment outcome (healed vs. not healed):

	Healed	Not Healed	Total
Treatment A	85	35	120
Treatment B	70	50	120
Treatment C	90	30	120
Total	245	115	360

Expected counts (each row total is 120, each column total is 245 or 115):

$E(\text{A, Healed}) = \frac{120 \times 245}{360} = 81.67 \qquad E(\text{A, Not Healed}) = \frac{120 \times 115}{360} = 38.33$

By symmetry (all row totals are 120), every row has the same expected counts: 81.67 healed and 38.33 not healed.

$\chi^2 = \frac{(85-81.67)^2}{81.67} + \frac{(35-38.33)^2}{38.33} + \frac{(70-81.67)^2}{81.67} + \frac{(50-38.33)^2}{38.33} + \frac{(90-81.67)^2}{81.67} + \frac{(30-38.33)^2}{38.33}$

$= \frac{11.09}{81.67} + \frac{11.09}{38.33} + \frac{136.19}{81.67} + \frac{136.19}{38.33} + \frac{69.39}{81.67} + \frac{69.39}{38.33}$

$= 0.136 + 0.289 + 1.668 + 3.553 + 0.850 + 1.810 = 8.306$

With $df = (3-1)(2-1) = 2$ and $\alpha = 0.05$ , the critical value is $5.991$ . Since $8.306 > 5.991$ , we reject $H_0$ . There is significant evidence that healing rates differ among the three treatments. Treatment C has the highest healing rate (75%), Treatment A is close behind (70.8%), and Treatment B is notably lower (58.3%). The overall test does not say which specific treatments differ from one another, but the pattern gives the nursing team a clear starting point: investigate why Treatment B underperforms, and treat the smaller gap between Treatments A and C with caution.
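As a check on the arithmetic, the wound-care table can be recomputed without rounding the expected counts. The sketch below is an illustration with names of our own choosing. With unrounded expected counts the statistic comes out at about 8.305; the hand calculation differs in the third decimal place only because it rounds the expected counts to two decimals, and the decision is the same either way.

```python
table = {
    "Treatment A": (85, 35),   # (healed, not healed)
    "Treatment B": (70, 50),
    "Treatment C": (90, 30),
}
rows = list(table.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)

chi_sq = 0.0
for row, row_total in zip(rows, row_totals):
    for observed, col_total in zip(row, col_totals):
        expected = row_total * col_total / n   # unrounded expected count
        chi_sq += (observed - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)
healing_rates = {
    name: healed / (healed + not_healed)
    for name, (healed, not_healed) in table.items()
}

print(round(chi_sq, 3), df, chi_sq > 5.991)
for name, rate in healing_rates.items():
    print(f"{name}: {rate:.1%}")
```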

Practice Problems

Test your understanding with these problems. A worked solution follows each one.

Problem 1: A candy company claims its bags contain equal proportions of 5 colors. A student counts 200 candies: Red 52, Blue 38, Green 45, Yellow 30, Orange 35. Test whether the distribution matches the claim at $\alpha = 0.05$.

$H_0$ : all five colors occur equally ( $E = 200/5 = 40$ each). $H_a$ : not all equal.

$\chi^2 = \frac{(52-40)^2}{40} + \frac{(38-40)^2}{40} + \frac{(45-40)^2}{40} + \frac{(30-40)^2}{40} + \frac{(35-40)^2}{40}$

$= \frac{144}{40} + \frac{4}{40} + \frac{25}{40} + \frac{100}{40} + \frac{25}{40} = 3.60 + 0.10 + 0.625 + 2.50 + 0.625 = 7.45$

$df = 5 - 1 = 4$ . Critical value $\chi^2_{0.05} = 9.488$ .

Since $7.45$ is less than $9.488$ , fail to reject $H_0$ .

Answer: There is not sufficient evidence to conclude the color distribution differs from equal proportions. The observed variation is consistent with random sampling from equal proportions.

Problem 2: A survey of 500 adults cross-classifies political affiliation (Democrat, Republican, Independent) with opinion on a policy (Favor, Oppose). Results: Dem favor 120, Dem oppose 80; Rep favor 70, Rep oppose 130; Ind favor 55, Ind oppose 45. Test for independence at $\alpha = 0.05$.

	Favor	Oppose	Total
Democrat	120	80	200
Republican	70	130	200
Independent	55	45	100
Total	245	255	500

Expected counts: $E = \frac{\text{row total} \times \text{col total}}{500}$ .

$E(\text{Dem, Favor}) = 200 \times 245 / 500 = 98.0$
$E(\text{Dem, Oppose}) = 200 \times 255 / 500 = 102.0$
$E(\text{Rep, Favor}) = 200 \times 245 / 500 = 98.0$
$E(\text{Rep, Oppose}) = 200 \times 255 / 500 = 102.0$
$E(\text{Ind, Favor}) = 100 \times 245 / 500 = 49.0$
$E(\text{Ind, Oppose}) = 100 \times 255 / 500 = 51.0$

$\chi^2 = \frac{(120-98)^2}{98} + \frac{(80-102)^2}{102} + \frac{(70-98)^2}{98} + \frac{(130-102)^2}{102} + \frac{(55-49)^2}{49} + \frac{(45-51)^2}{51}$

$= \frac{484}{98} + \frac{484}{102} + \frac{784}{98} + \frac{784}{102} + \frac{36}{49} + \frac{36}{51}$

$= 4.939 + 4.745 + 8.000 + 7.686 + 0.735 + 0.706 = 26.811$

$df = (3-1)(2-1) = 2$ . Critical value $\chi^2_{0.05} = 5.991$ .

Since $26.811 > 5.991$ , reject $H_0$ .

Answer: There is strong evidence of an association between political affiliation and policy opinion. Democrats favor the policy at a much higher rate (60%) than Republicans (35%), with Independents in between (55%).

Problem 3: A quality inspector checks 300 items from three production shifts. Shift A: 8 defective out of 100. Shift B: 15 defective out of 100. Shift C: 7 defective out of 100. Is there a significant difference in defect rates across shifts? ($\alpha = 0.05$)

	Defective	Not Defective	Total
Shift A	8	92	100
Shift B	15	85	100
Shift C	7	93	100
Total	30	270	300

Expected counts (all row totals are 100): $E(\text{defective}) = 100 \times 30/300 = 10$ and $E(\text{not defective}) = 100 \times 270/300 = 90$ for each shift.

$\chi^2 = \frac{(8-10)^2}{10} + \frac{(92-90)^2}{90} + \frac{(15-10)^2}{10} + \frac{(85-90)^2}{90} + \frac{(7-10)^2}{10} + \frac{(93-90)^2}{90}$

$= \frac{4}{10} + \frac{4}{90} + \frac{25}{10} + \frac{25}{90} + \frac{9}{10} + \frac{9}{90}$

$= 0.400 + 0.044 + 2.500 + 0.278 + 0.900 + 0.100 = 4.222$

$df = (3-1)(2-1) = 2$ . Critical value $\chi^2_{0.05} = 5.991$ .

Since $4.222$ is less than $5.991$ , fail to reject $H_0$ .

Answer: There is not sufficient evidence of a significant difference in defect rates across shifts. Although Shift B has a higher observed defect rate (15%) compared to Shifts A (8%) and C (7%), this variation could plausibly be due to chance.

Problem 4: A genetics experiment predicts offspring in a 9:3:3:1 phenotype ratio. Out of 160 offspring observed: 84, 35, 26, 15. Test the genetic model at $\alpha = 0.05$.

Expected counts based on the 9:3:3:1 ratio (out of 160):

Category 1: $160 \times 9/16 = 90$
Category 2: $160 \times 3/16 = 30$
Category 3: $160 \times 3/16 = 30$
Category 4: $160 \times 1/16 = 10$

Verify: $90 + 30 + 30 + 10 = 160$ ✓.

$\chi^2 = \frac{(84-90)^2}{90} + \frac{(35-30)^2}{30} + \frac{(26-30)^2}{30} + \frac{(15-10)^2}{10}$

$= \frac{36}{90} + \frac{25}{30} + \frac{16}{30} + \frac{25}{10} = 0.400 + 0.833 + 0.533 + 2.500 = 4.266$

$df = 4 - 1 = 3$ . Critical value $\chi^2_{0.05} = 7.815$ .

Since $4.266$ is less than $7.815$ , fail to reject $H_0$ .

Answer: The observed data is consistent with the predicted 9:3:3:1 genetic ratio. There is no significant deviation from the expected phenotype distribution.
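When the hypothesized proportions are not all equal, as in Problem 4, the expected counts come from scaling the ratio to the sample size. The sketch below shows that step in code; the variable names are assumptions for illustration. Computed without intermediate rounding, the statistic is about 4.267, so the 4.266 in the hand calculation reflects the rounded terms being summed.

```python
ratio = [9, 3, 3, 1]             # hypothesized 9:3:3:1 phenotype ratio
observed = [84, 35, 26, 15]
n = sum(observed)                # 160 offspring

# Scale each part of the ratio to the sample size.
expected = [n * part / sum(ratio) for part in ratio]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
critical_value = 7.815           # alpha = 0.05, df = 3 (from the table)

print(expected)
print(round(chi_sq, 3), df, chi_sq > critical_value)
```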

Problem 5: A hospital records patient satisfaction (Satisfied, Neutral, Dissatisfied) for two departments. Department X: 80 satisfied, 30 neutral, 10 dissatisfied. Department Y: 60 satisfied, 40 neutral, 20 dissatisfied. Is there an association between department and satisfaction? ($\alpha = 0.05$)

	Satisfied	Neutral	Dissatisfied	Total
Dept X	80	30	10	120
Dept Y	60	40	20	120
Total	140	70	30	240

Expected counts (both row totals are 120):

$E(\text{X, Sat}) = 120 \times 140/240 = 70$
$E(\text{X, Neu}) = 120 \times 70/240 = 35$
$E(\text{X, Dis}) = 120 \times 30/240 = 15$
$E(\text{Y, Sat}) = 120 \times 140/240 = 70$
$E(\text{Y, Neu}) = 120 \times 70/240 = 35$
$E(\text{Y, Dis}) = 120 \times 30/240 = 15$

$\chi^2 = \frac{(80-70)^2}{70} + \frac{(30-35)^2}{35} + \frac{(10-15)^2}{15} + \frac{(60-70)^2}{70} + \frac{(40-35)^2}{35} + \frac{(20-15)^2}{15}$

$= \frac{100}{70} + \frac{25}{35} + \frac{25}{15} + \frac{100}{70} + \frac{25}{35} + \frac{25}{15}$

$= 1.429 + 0.714 + 1.667 + 1.429 + 0.714 + 1.667 = 7.620$

$df = (2-1)(3-1) = 2$ . Critical value $\chi^2_{0.05} = 5.991$ .

Since $7.620 > 5.991$ , reject $H_0$ .

Answer: There is significant evidence of an association between department and patient satisfaction. Department X has a higher proportion of satisfied patients (66.7% vs 50%) and a lower proportion of dissatisfied patients (8.3% vs 16.7%). Hospital administrators should investigate what Department X does differently.

Key Takeaways

The chi-square statistic $\chi^2 = \sum \frac{(O-E)^2}{E}$ measures how far observed counts are from expected counts — it works exclusively with counts, not proportions or means
The goodness-of-fit test checks whether a single categorical variable follows a specified distribution, with $df = k - 1$ where $k$ is the number of categories
The test of independence checks whether two categorical variables are associated, with $df = (r-1)(c-1)$ and expected counts computed as $E = \frac{\text{row total} \times \text{col total}}{\text{grand total}}$
Both tests are always right-tailed — you only reject $H_0$ when $\chi^2$ is large
The key condition is that all expected counts must be at least 5
A significant chi-square test tells you that the variables are associated, but it does not tell you the direction or strength — examine the observed percentages to interpret the nature of the relationship
Chi-square tests are essential in medical research (treatment outcomes), quality control (defect patterns), survey analysis (opinion by demographic), and genetics (phenotype ratios)
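The takeaway about examining observed percentages can be made concrete with row percentages, which show the direction of an association that the $\chi^2$ value alone does not. The sketch below reuses the smoking table from Example 3; the helper name `row_percentages` is our own.

```python
def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


labels = ["Smoker", "Non-smoker"]
observed = [[90, 210], [60, 640]]   # columns: lung disease, no lung disease
percentages = row_percentages(observed)

for label, row in zip(labels, percentages):
    print(f"{label:<11} disease: {row[0]:.1f}%   no disease: {row[1]:.1f}%")
```

Comparing the first column across rows (30.0% versus about 8.6%) shows which group has the higher disease rate.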

Last updated: March 29, 2026