Two-Way Tables and Probability
A two-way table (also called a contingency table) organizes data by two categorical variables. It is one of the most practical tools for calculating real-world probabilities because it displays all the information you need — joint counts, row totals, column totals, and the grand total — in a single grid.
Anatomy of a Two-Way Table
Every two-way table has the same structure:
- Rows represent the categories of one variable
- Columns represent the categories of the other variable
- Cells (the interior values) show how many observations fall into each combination
- Row totals (rightmost column) sum across each row
- Column totals (bottom row) sum down each column
- Grand total (bottom-right corner) is the total number of observations
| | Category X₁ | Category X₂ | Row Total |
|---|---|---|---|
| Category Y₁ | cell count | cell count | row sum |
| Category Y₂ | cell count | cell count | row sum |
| Column Total | col sum | col sum | Grand Total |
Each cell count, row total, column total, and grand total gives you a different type of probability.
Joint, Marginal, and Conditional Probabilities
Three types of probability can be read from a two-way table:
Joint probability is the probability that two specific categories occur together:
$$P(A \text{ and } B) = \frac{\text{cell count for } A \text{ and } B}{\text{grand total}}$$
Marginal probability is the probability of a single category, ignoring the other variable:
$$P(A) = \frac{\text{row or column total for } A}{\text{grand total}}$$
The name “marginal” comes from the fact that these totals appear in the margins (edges) of the table.
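Conditional probability is the probability of one category given that another is known:
$$P(A \mid B) = \frac{\text{cell count for } A \text{ and } B}{\text{row or column total for } B}$$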
Worked Example: Student Survey
Example 1
A university surveyed 650 students about their preferred class format. The results are organized by class year:
| | Prefers Online | Prefers In-Person | Total |
|---|---|---|---|
| Freshman | 80 | 120 | 200 |
| Sophomore | 110 | 90 | 200 |
| Junior | 95 | 55 | 150 |
| Senior | 65 | 35 | 100 |
| Total | 350 | 300 | 650 |
Verification: Row totals: $200 + 200 + 150 + 100 = 650$. Column totals: $350 + 300 = 650$. Both match the grand total.
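The margin check can also be done programmatically. The following is a minimal Python sketch, assuming the survey counts are stored in a plain dictionary; the variable names (`counts`, `row_totals`, `col_totals`) are illustrative choices, not part of the original page.

```python
# Cell counts from the student survey: rows are class years,
# columns are (Prefers Online, Prefers In-Person).
counts = {
    "Freshman": (80, 120),
    "Sophomore": (110, 90),
    "Junior": (95, 55),
    "Senior": (65, 35),
}

# Row totals: sum across each row.
row_totals = {year: sum(cells) for year, cells in counts.items()}

# Column totals: sum down each column.
col_totals = [sum(column) for column in zip(*counts.values())]

# The grand total can be reached from either margin; the two must agree.
grand_total = sum(row_totals.values())
assert grand_total == sum(col_totals)

print(row_totals)
print(col_totals)   # [350, 300]
print(grand_total)  # 650
```

If the two margins disagreed, the `assert` would fail, which signals a data-entry or arithmetic error in the table.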
Let’s calculate each type of probability.
Joint probability: What is the probability that a randomly selected student is a freshman who prefers online classes?
$$P(\text{Freshman and Online}) = \frac{80}{650} \approx 0.123$$
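Marginal probabilities: What is the overall probability of being a freshman? Of preferring online?
$$P(\text{Freshman}) = \frac{200}{650} \approx 0.308 \qquad P(\text{Online}) = \frac{350}{650} \approx 0.538$$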
Conditional probability: Given that a student is a freshman, what is the probability they prefer online classes?
Restrict to the Freshman row (200 students) and look at the Online cell (80):
$$P(\text{Online} \mid \text{Freshman}) = \frac{80}{200} = 0.400$$
Reversed conditional: Given that a student prefers online classes, what is the probability they are a freshman?
Restrict to the Online column (350 students) and look at the Freshman cell (80):
$$P(\text{Freshman} \mid \text{Online}) = \frac{80}{350} \approx 0.229$$
Notice that 40% of freshmen prefer online, but only 22.9% of online-preferring students are freshmen. These are different questions, so always check which condition goes after the "|" symbol.
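The calculations in this example can be reproduced with exact fractions. This is a sketch in Python using the standard-library `fractions` module; the variable names are illustrative assumptions, and the counts are taken from the survey table.

```python
from fractions import Fraction

# Counts read from the student survey table.
freshman_online = 80   # cell: Freshman and Prefers Online
freshman_total = 200   # row total: all freshmen
online_total = 350     # column total: all who prefer online
grand_total = 650

# Joint: cell count over grand total.
joint = Fraction(freshman_online, grand_total)

# Marginal: row or column total over grand total.
marginal_freshman = Fraction(freshman_total, grand_total)
marginal_online = Fraction(online_total, grand_total)

# Conditional: cell count over the total of the given category.
online_given_freshman = Fraction(freshman_online, freshman_total)
freshman_given_online = Fraction(freshman_online, online_total)

# The two conditionals share a numerator but use different denominators.
for label, p in [
    ("P(Freshman and Online)", joint),
    ("P(Freshman)", marginal_freshman),
    ("P(Online)", marginal_online),
    ("P(Online | Freshman)", online_given_freshman),
    ("P(Freshman | Online)", freshman_given_online),
]:
    print(f"{label}: {p} ≈ {float(p):.3f}")

# Sanity check: conditional = joint / marginal of the given category.
assert joint / marginal_freshman == online_given_freshman
```

Using `Fraction` keeps the arithmetic exact, so the final check on the relationship between joint, marginal, and conditional probability holds without rounding concerns.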
Testing for Independence Using Two-Way Tables
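Two variables are independent if knowing one does not change the probability of the other. The test is straightforward: for every pair of categories $A$ and $B$, check whether
$$P(A \mid B) = P(A)$$
Equivalently, you can check whether $P(A \text{ and } B) = P(A) \cdot P(B)$.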
Example 2: Are Class Year and Format Preference Independent?
Using the student survey data, check whether class year and format preference are independent.
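Overall probability of preferring online:
$$P(\text{Online}) = \frac{350}{650} \approx 0.538$$
Probability of preferring online for each class year:
- $P(\text{Online} \mid \text{Freshman}) = \frac{80}{200} = 0.400$
- $P(\text{Online} \mid \text{Sophomore}) = \frac{110}{200} = 0.550$
- $P(\text{Online} \mid \text{Junior}) = \frac{95}{150} \approx 0.633$
- $P(\text{Online} \mid \text{Senior}) = \frac{65}{100} = 0.650$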
If class year and format preference were independent, all of these conditional probabilities would equal the marginal probability of 0.538. Instead, they range from 0.400 (freshmen) to 0.650 (seniors).
Conclusion: The variables are not independent. Online preference increases with class year — freshmen are the least likely to prefer online (40%), while seniors are the most likely (65%).
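The same comparison can be sketched in Python: compute the marginal probability of preferring online, compute the conditional probability for each class year, and check whether they all agree. The names and the `tolerance` value are assumptions made for this illustration; the tolerance only absorbs floating-point rounding and is not a statistical threshold.

```python
# Student survey counts: (Prefers Online, Prefers In-Person) per class year.
counts = {
    "Freshman": (80, 120),
    "Sophomore": (110, 90),
    "Junior": (95, 55),
    "Senior": (65, 35),
}

grand_total = sum(sum(cells) for cells in counts.values())
online_total = sum(online for online, _ in counts.values())

# Marginal probability of preferring online.
p_online = online_total / grand_total

# Conditional probability of preferring online, given each class year.
conditionals = {
    year: online / (online + in_person)
    for year, (online, in_person) in counts.items()
}

# Independence requires every conditional to equal the marginal.
tolerance = 1e-9  # assumed value, covers float rounding only
independent = all(abs(p - p_online) < tolerance for p in conditionals.values())

print(f"P(Online) = {p_online:.3f}")
for year, p in conditionals.items():
    print(f"P(Online | {year}) = {p:.3f}")
print("Independent?", independent)
```

For this data the check reports that the variables are not independent, matching the conclusion above.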
Relative Frequency Tables
A relative frequency table converts all counts to proportions by dividing every cell by the grand total. This makes it easy to read joint probabilities directly from the table.
Starting with the student survey:
| | Prefers Online | Prefers In-Person | Total |
|---|---|---|---|
| Freshman | 80/650 ≈ 0.123 | 120/650 ≈ 0.185 | 0.308 |
| Sophomore | 110/650 ≈ 0.169 | 90/650 ≈ 0.138 | 0.308 |
| Junior | 95/650 ≈ 0.146 | 55/650 ≈ 0.085 | 0.231 |
| Senior | 65/650 ≈ 0.100 | 35/650 ≈ 0.054 | 0.154 |
| Total | 0.538 | 0.462 | 1.000 |
Now every cell is a joint probability, every margin is a marginal probability, and the grand total is 1.000. You can also create row-relative tables (each row sums to 1) to compare conditional probabilities across groups, or column-relative tables (each column sums to 1) to compare the composition within each column.
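As a rough illustration, the conversion from counts to relative frequencies takes only a few lines of Python. The dictionary layout and names below are assumptions for this sketch.

```python
# Student survey counts: (Prefers Online, Prefers In-Person) per class year.
counts = {
    "Freshman": (80, 120),
    "Sophomore": (110, 90),
    "Junior": (95, 55),
    "Senior": (65, 35),
}

grand_total = sum(sum(cells) for cells in counts.values())

# Divide every cell by the grand total to get joint probabilities.
relative = {
    year: tuple(cell / grand_total for cell in cells)
    for year, cells in counts.items()
}

# Print each row with its marginal probability (the row sum).
for year, (online, in_person) in relative.items():
    print(f"{year:<10} {online:.3f}  {in_person:.3f}  {online + in_person:.3f}")

# All joint probabilities together must sum to 1.
total_probability = sum(sum(cells) for cells in relative.values())
print(f"Total: {total_probability:.3f}")
```

Because the rounded values are only displayed and never reused in later arithmetic, the final sum stays at 1 up to floating-point error.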
Row-relative table (each row divided by its row total):
| | Prefers Online | Prefers In-Person | Total |
|---|---|---|---|
| Freshman | 80/200 = 0.400 | 120/200 = 0.600 | 1.000 |
| Sophomore | 110/200 = 0.550 | 90/200 = 0.450 | 1.000 |
| Junior | 95/150 ≈ 0.633 | 55/150 ≈ 0.367 | 1.000 |
| Senior | 65/100 = 0.650 | 35/100 = 0.350 | 1.000 |
This table makes the trend immediately visible: online preference grows steadily from 40% among freshmen to 65% among seniors.
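Row-relative and column-relative tables differ only in which total is used as the divisor. The sketch below, with illustrative names, computes both from the survey counts so the two kinds of conditional probability can be compared side by side.

```python
# Student survey counts: (Prefers Online, Prefers In-Person) per class year.
counts = {
    "Freshman": (80, 120),
    "Sophomore": (110, 90),
    "Junior": (95, 55),
    "Senior": (65, 35),
}

# Row-relative: divide each cell by its row total.
# Each entry is P(format | class year).
row_relative = {
    year: tuple(cell / sum(cells) for cell in cells)
    for year, cells in counts.items()
}

# Column-relative: divide each cell by its column total.
# Each entry is P(class year | format).
col_totals = [sum(column) for column in zip(*counts.values())]
col_relative = {
    year: tuple(cell / total for cell, total in zip(cells, col_totals))
    for year, cells in counts.items()
}

for year in counts:
    print(
        f"{year:<10} P(Online | {year}) = {row_relative[year][0]:.3f}, "
        f"P({year} | Online) = {col_relative[year][0]:.3f}"
    )
```

The first printed value in each line comes from the row-relative table and the second from the column-relative table, which makes the difference between the two conditioning directions easy to see.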
Building a Two-Way Table from Raw Data
Sometimes you need to construct the table yourself from a description. Here is the process:
Example 3
A store tracks 400 customer transactions. Of the 240 cash transactions, 36 involved a return. Of the 160 credit card transactions, 48 involved a return. Build the two-way table and find the probability of a return given credit card payment.
Step 1: Set up the rows and columns.
| | Return | No Return | Total |
|---|---|---|---|
| Cash | 36 | ? | 240 |
| Credit | 48 | ? | 160 |
| Total | ? | ? | 400 |
Step 2: Fill in the missing cells by subtraction.
- Cash, No Return: $240 - 36 = 204$
- Credit, No Return: $160 - 48 = 112$
- Total Returns: $36 + 48 = 84$
- Total No Returns: $204 + 112 = 316$
| | Return | No Return | Total |
|---|---|---|---|
| Cash | 36 | 204 | 240 |
| Credit | 48 | 112 | 160 |
| Total | 84 | 316 | 400 |
Verification: Column totals: $84 + 316 = 400$. Row totals: $240 + 160 = 400$. Both match the grand total.
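Step 3: Answer the question.
$$P(\text{Return} \mid \text{Credit}) = \frac{48}{160} = 0.30$$
For comparison: $P(\text{Return} \mid \text{Cash}) = \frac{36}{240} = 0.15$. Credit card purchases have double the return rate.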
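The three steps of this example translate directly into code. The following Python sketch starts from the figures given in the problem statement, fills in the missing cells by subtraction, verifies the margins, and then computes the conditional probabilities; all variable names are illustrative.

```python
# Figures stated in the problem.
grand_total = 400
cash_total, cash_returns = 240, 36
credit_total, credit_returns = 160, 48

# Step 2: fill in the missing cells by subtraction and addition.
cash_no_return = cash_total - cash_returns
credit_no_return = credit_total - credit_returns
total_returns = cash_returns + credit_returns
total_no_returns = cash_no_return + credit_no_return

# Verification: both margins must add up to the grand total.
assert total_returns + total_no_returns == grand_total
assert cash_total + credit_total == grand_total

# Step 3: conditional probability of a return for each payment type.
p_return_given_credit = credit_returns / credit_total
p_return_given_cash = cash_returns / cash_total

print(f"P(Return | Credit) = {p_return_given_credit:.2f}")
print(f"P(Return | Cash)   = {p_return_given_cash:.2f}")
```

Running the verification before answering the question catches inconsistent inputs early, for example if the stated payment totals did not add up to 400.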
Real-World Application: Nursing — Treatment Outcomes
A hospital compares outcomes for two physical therapy approaches used on 200 patients recovering from knee surgery:
| | Improved | No Change | Worsened | Total |
|---|---|---|---|---|
| Traditional PT | 55 | 35 | 10 | 100 |
| Aquatic PT | 70 | 22 | 8 | 100 |
| Total | 125 | 57 | 18 | 200 |
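Key conditional probabilities for the nursing team:
- $P(\text{Improved} \mid \text{Traditional PT}) = \frac{55}{100} = 0.55$
- $P(\text{Improved} \mid \text{Aquatic PT}) = \frac{70}{100} = 0.70$
- $P(\text{Worsened} \mid \text{Traditional PT}) = \frac{10}{100} = 0.10$
- $P(\text{Worsened} \mid \text{Aquatic PT}) = \frac{8}{100} = 0.08$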
Aquatic PT shows a higher improvement rate (70% vs 55%) and a slightly lower worsening rate (8% vs 10%). A nurse reviewing this data could use these conditional probabilities to inform patient discussions — while noting that this observational data alone cannot prove causation (other factors like patient age or injury severity could explain the difference).
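The same approach extends to tables with more than two columns. Below is a minimal Python sketch, with assumed names, that computes the outcome rate for each therapy from the table above.

```python
# Outcome labels, in the same order as the table columns.
outcomes = ("Improved", "No Change", "Worsened")

# Patient counts per therapy, from the hospital table.
counts = {
    "Traditional PT": (55, 35, 10),
    "Aquatic PT": (70, 22, 8),
}

# Conditional probability of each outcome, given the therapy:
# divide each cell by its row total.
rates = {
    therapy: {
        outcome: cell / sum(cells)
        for outcome, cell in zip(outcomes, cells)
    }
    for therapy, cells in counts.items()
}

for therapy, by_outcome in rates.items():
    summary = ", ".join(f"{o}: {p:.0%}" for o, p in by_outcome.items())
    print(f"{therapy}: {summary}")
```

These are descriptive rates only; as noted above, they do not by themselves establish that one therapy causes better outcomes.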
Practice Problems
Test your understanding with these problems.
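Problem 1: A survey of 300 adults asked about exercise habits and sleep quality. Of 180 who exercise regularly, 126 reported good sleep. Of 120 who do not exercise regularly, 48 reported good sleep. What is $P(\text{Good Sleep} \mid \text{Exercise})$?
$$P(\text{Good Sleep} \mid \text{Exercise}) = \frac{126}{180} = 0.70$$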
Answer: Among those who exercise regularly, 70% report good sleep quality.
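Problem 2: Using the same data, what is $P(\text{Exercise} \mid \text{Good Sleep})$?
Total with good sleep: $126 + 48 = 174$.
$$P(\text{Exercise} \mid \text{Good Sleep}) = \frac{126}{174} \approx 0.724$$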
Answer: Among those with good sleep, about 72.4% exercise regularly. This is a different question from Problem 1.
Problem 3: Using the student survey data at the top of this page, what is the joint probability that a randomly selected student is a junior who prefers in-person classes?
$$P(\text{Junior and In-Person}) = \frac{55}{650} \approx 0.085$$
Answer: About 8.5% of all students surveyed are juniors who prefer in-person classes.
Problem 4: A hospital tested 500 patients. Of 300 who received Drug A, 240 improved. Of 200 who received Drug B, 140 improved. Are drug type and outcome independent?
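Since $P(\text{Improved} \mid \text{Drug A}) = \frac{240}{300} = 0.80$ and $P(\text{Improved} \mid \text{Drug B}) = \frac{140}{200} = 0.70$ are not equal, the variables are not independent. Drug A has a higher improvement rate.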
Problem 5: A store tracked 500 orders. Online: 300 total (45 returned, 255 not). In-Store: 200 total (15 returned, 185 not). Find the joint, marginal, and conditional probabilities for “Online and Returned.”
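Joint: $P(\text{Online and Returned}) = \frac{45}{500} = 0.09$
Marginal: $P(\text{Online}) = \frac{300}{500} = 0.60$, and $P(\text{Returned}) = \frac{45 + 15}{500} = \frac{60}{500} = 0.12$
Conditional: $P(\text{Returned} \mid \text{Online}) = \frac{45}{300} = 0.15$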
Also: $P(\text{Online} \mid \text{Returned}) = \frac{45}{60} = 0.75$
Answer: The joint probability is 9%, the marginal probabilities are 60% (Online) and 12% (Returned), and the conditional probability of a return given online purchase is 15%.
Key Takeaways
- A two-way table organizes data by two categorical variables, displaying cell counts, row totals, column totals, and the grand total.
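- Joint probability: divide the cell count by the grand total, $P(A \text{ and } B) = \frac{\text{cell count}}{\text{grand total}}$.
- Marginal probability: divide the row or column total by the grand total, $P(A) = \frac{\text{row or column total}}{\text{grand total}}$.
- Conditional probability: divide the cell count by the relevant row or column total, $P(A \mid B) = \frac{\text{cell count}}{\text{total for } B}$.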
- Relative frequency tables convert counts to proportions, making probabilities directly readable.
- To test for independence, check whether conditional probabilities equal the corresponding marginal probabilities.
- Two-way tables are widely used in healthcare, business, and social science to compare outcomes across groups.
Last updated: March 29, 2026