Two-Way Tables and Probability

Last updated: March 2026 · Intermediate

A two-way table (also called a contingency table) organizes data by two categorical variables. It is one of the most practical tools for calculating real-world probabilities because it displays all the information you need — joint counts, row totals, column totals, and the grand total — in a single grid.

Anatomy of a Two-Way Table

Every two-way table has the same structure:

  • Rows represent the categories of one variable
  • Columns represent the categories of the other variable
  • Cells (the interior values) show how many observations fall into each combination
  • Row totals (rightmost column) sum across each row
  • Column totals (bottom row) sum down each column
  • Grand total (bottom-right corner) is the total number of observations

| | Category X₁ | Category X₂ | Row Total |
|---|---|---|---|
| Category Y₁ | cell count | cell count | row sum |
| Category Y₂ | cell count | cell count | row sum |
| Column Total | col sum | col sum | Grand Total |

Each cell count, row total, column total, and grand total gives you a different type of probability.

Joint, Marginal, and Conditional Probabilities

Three types of probability can be read from a two-way table:

Joint probability — the probability that two specific categories occur together:

$$P(A \text{ and } B) = \frac{\text{cell count}}{\text{grand total}}$$

Marginal probability — the probability of a single category, ignoring the other variable:

$$P(A) = \frac{\text{row or column total}}{\text{grand total}}$$

The name “marginal” comes from the fact that these totals appear in the margins (edges) of the table.

Conditional probability — the probability of one category given that another is known:

$$P(A \mid B) = \frac{\text{cell count}}{\text{row or column total for the given condition}}$$
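
The three quantities are linked. Dividing a joint probability by the marginal probability of the condition cancels the grand total and leaves the conditional probability. As a short worked identity, assuming the total for the condition $B$ is not zero:

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{\text{cell count} / \text{grand total}}{\text{total for } B / \text{grand total}} = \frac{\text{cell count}}{\text{total for } B}$$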

Worked Example: Student Survey

Example 1

A university surveyed 650 students about their preferred class format. The results are organized by class year:

| | Prefers Online | Prefers In-Person | Total |
|---|---|---|---|
| Freshman | 80 | 120 | 200 |
| Sophomore | 110 | 90 | 200 |
| Junior | 95 | 55 | 150 |
| Senior | 65 | 35 | 100 |
| Total | 350 | 300 | 650 |

Verification: Row totals: $200 + 200 + 150 + 100 = 650$. Column totals: $350 + 300 = 650$. Both match the grand total.
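
The same consistency check can be automated. The following is a minimal Python sketch, assuming the survey is stored as a dictionary of dictionaries; the names `counts` and `margins` are illustrative choices, not part of the original material.

```python
# Cell counts from the student survey (rows: class year, columns: format).
counts = {
    "Freshman":  {"Online": 80,  "In-Person": 120},
    "Sophomore": {"Online": 110, "In-Person": 90},
    "Junior":    {"Online": 95,  "In-Person": 55},
    "Senior":    {"Online": 65,  "In-Person": 35},
}

def margins(table):
    """Return (row_totals, column_totals, grand_total) for a dict-of-dicts table."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    column_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            column_totals[col] = column_totals.get(col, 0) + n
    grand_total = sum(row_totals.values())
    return row_totals, column_totals, grand_total

row_totals, column_totals, grand_total = margins(counts)

# Row totals and column totals must add up to the same grand total.
assert sum(column_totals.values()) == grand_total
print(row_totals, column_totals, grand_total)
```

If the two sets of margins ever disagree, a cell was most likely entered incorrectly.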

Let’s calculate each type of probability.

Joint probability: What is the probability that a randomly selected student is a freshman who prefers online classes?

$$P(\text{Freshman and Online}) = \frac{80}{650} \approx 0.123 = 12.3\%$$

Marginal probabilities: What is the overall probability of being a freshman? Of preferring online?

$$P(\text{Freshman}) = \frac{200}{650} \approx 0.308 = 30.8\%$$

$$P(\text{Online}) = \frac{350}{650} \approx 0.538 = 53.8\%$$

Conditional probability: Given that a student is a freshman, what is the probability they prefer online classes?

Restrict to the Freshman row (200 students) and look at the Online cell (80):

$$P(\text{Online} \mid \text{Freshman}) = \frac{80}{200} = 0.40 = 40\%$$

Reversed conditional: Given that a student prefers online classes, what is the probability they are a freshman?

Restrict to the Online column (350 students) and look at the Freshman cell (80):

$$P(\text{Freshman} \mid \text{Online}) = \frac{80}{350} \approx 0.229 = 22.9\%$$

Notice that 40% of freshmen prefer online, but only 22.9% of online-preferring students are freshmen. These are different questions, so always check which condition goes after the “|” symbol.
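
The four calculations in this example can be reproduced in a few lines. This is a sketch only, assuming the survey counts are held in a dictionary named `counts` (an illustrative name); it uses Python's standard `fractions` module so that each probability stays exact until it is printed.

```python
from fractions import Fraction

# Cell counts from the student survey (rows: class year, columns: format).
counts = {
    "Freshman":  {"Online": 80,  "In-Person": 120},
    "Sophomore": {"Online": 110, "In-Person": 90},
    "Junior":    {"Online": 95,  "In-Person": 55},
    "Senior":    {"Online": 65,  "In-Person": 35},
}

grand_total = sum(sum(cells.values()) for cells in counts.values())
freshman_total = sum(counts["Freshman"].values())                 # row total
online_total = sum(cells["Online"] for cells in counts.values())  # column total
cell = counts["Freshman"]["Online"]

joint = Fraction(cell, grand_total)                     # P(Freshman and Online)
p_freshman = Fraction(freshman_total, grand_total)      # P(Freshman)
p_online = Fraction(online_total, grand_total)          # P(Online)
online_given_freshman = Fraction(cell, freshman_total)  # P(Online | Freshman)
freshman_given_online = Fraction(cell, online_total)    # P(Freshman | Online)

for label, p in [
    ("P(Freshman and Online)", joint),
    ("P(Freshman)", p_freshman),
    ("P(Online)", p_online),
    ("P(Online | Freshman)", online_given_freshman),
    ("P(Freshman | Online)", freshman_given_online),
]:
    print(f"{label} = {p} = {float(p):.3f}")
```

The two conditional probabilities share the same numerator (80) and differ only in the denominator, which is exactly why the order of the condition matters.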

Testing for Independence Using Two-Way Tables

Two variables are independent if knowing one does not change the probability of the other. The test is straightforward:

$$\text{If } P(A \mid B) = P(A), \text{ the variables are independent.}$$

Equivalently, you can check whether $P(A \text{ and } B) = P(A) \times P(B)$.

Example 2: Are Class Year and Format Preference Independent?

Using the student survey data, check whether class year and format preference are independent.

Overall probability of preferring online:

$$P(\text{Online}) = \frac{350}{650} \approx 0.538$$

Probability of preferring online for each class year:

$$P(\text{Online} \mid \text{Freshman}) = \frac{80}{200} = 0.400$$

$$P(\text{Online} \mid \text{Sophomore}) = \frac{110}{200} = 0.550$$

$$P(\text{Online} \mid \text{Junior}) = \frac{95}{150} \approx 0.633$$

$$P(\text{Online} \mid \text{Senior}) = \frac{65}{100} = 0.650$$

If class year and format preference were independent, all of these conditional probabilities would equal the marginal probability of 0.538. Instead, they range from 0.400 (freshmen) to 0.650 (seniors).

Conclusion: The variables are not independent. Online preference increases with class year — freshmen are the least likely to prefer online (40%), while seniors are the most likely (65%).
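
The comparison in this example can be written as a small check. The sketch below is one possible way to do it, not a standard routine: the function names and the tolerance `tol` are assumptions made for illustration, and the check simply compares every conditional probability with its marginal.

```python
import math

# Cell counts from the student survey (rows: class year, columns: format).
counts = {
    "Freshman":  {"Online": 80,  "In-Person": 120},
    "Sophomore": {"Online": 110, "In-Person": 90},
    "Junior":    {"Online": 95,  "In-Person": 55},
    "Senior":    {"Online": 65,  "In-Person": 35},
}

def conditional_by_row(table, column):
    """P(column | row) for every row: cell count divided by that row's total."""
    return {row: cells[column] / sum(cells.values()) for row, cells in table.items()}

def marginal(table, column):
    """P(column): column total divided by the grand total."""
    grand_total = sum(sum(cells.values()) for cells in table.values())
    return sum(cells[column] for cells in table.values()) / grand_total

def looks_independent(table, tol=1e-9):
    """True if every conditional probability matches its marginal within tol."""
    columns = next(iter(table.values())).keys()
    return all(
        math.isclose(p, marginal(table, col), abs_tol=tol)
        for col in columns
        for p in conditional_by_row(table, col).values()
    )

print(round(marginal(counts, "Online"), 3))
print({row: round(p, 3) for row, p in conditional_by_row(counts, "Online").items()})
print(looks_independent(counts))
```

For the survey data the check reports `False`, in line with the conclusion above. Counts from a real sample rarely match the marginal exactly, so the strict tolerance used here is a simplifying assumption.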

Relative Frequency Tables

A relative frequency table converts all counts to proportions by dividing every cell by the grand total. This makes it easy to read joint probabilities directly from the table.

Starting with the student survey:

| | Prefers Online | Prefers In-Person | Total |
|---|---|---|---|
| Freshman | 80/650 ≈ 0.123 | 120/650 ≈ 0.185 | 0.308 |
| Sophomore | 110/650 ≈ 0.169 | 90/650 ≈ 0.138 | 0.308 |
| Junior | 95/650 ≈ 0.146 | 55/650 ≈ 0.085 | 0.231 |
| Senior | 65/650 ≈ 0.100 | 35/650 ≈ 0.054 | 0.154 |
| Total | 0.538 | 0.462 | 1.000 |

Now every cell is a joint probability, every margin is a marginal probability, and the grand total is 1.000. You can also create row-relative tables (each row sums to 1) to compare conditional probabilities across groups, or column-relative tables (each column sums to 1) to compare the composition within each column.

Row-relative table (each row divided by its row total):

| | Prefers Online | Prefers In-Person | Total |
|---|---|---|---|
| Freshman | 80/200 = 0.400 | 120/200 = 0.600 | 1.000 |
| Sophomore | 110/200 = 0.550 | 90/200 = 0.450 | 1.000 |
| Junior | 95/150 ≈ 0.633 | 55/150 ≈ 0.367 | 1.000 |
| Senior | 65/100 = 0.650 | 35/100 = 0.350 | 1.000 |

This table makes the trend immediately visible: online preference grows steadily from 40% among freshmen to 65% among seniors.
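
All three kinds of relative frequency table differ only in the denominator. The following Python sketch illustrates that idea under the assumption that the counts are stored as a dictionary of dictionaries; the function name `relative_frequencies` and its `mode` argument are hypothetical, chosen for this example.

```python
# Cell counts from the student survey (rows: class year, columns: format).
counts = {
    "Freshman":  {"Online": 80,  "In-Person": 120},
    "Sophomore": {"Online": 110, "In-Person": 90},
    "Junior":    {"Online": 95,  "In-Person": 55},
    "Senior":    {"Online": 65,  "In-Person": 35},
}

def relative_frequencies(table, mode="all"):
    """Divide each cell by the grand total ("all"), its row total ("row"),
    or its column total ("column")."""
    grand_total = sum(sum(cells.values()) for cells in table.values())
    column_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            column_totals[col] = column_totals.get(col, 0) + n

    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {}
        for col, n in cells.items():
            if mode == "all":
                denominator = grand_total
            elif mode == "row":
                denominator = row_total
            elif mode == "column":
                denominator = column_totals[col]
            else:
                raise ValueError(f"unknown mode: {mode!r}")
            result[row][col] = n / denominator
    return result

for mode in ("all", "row", "column"):
    print(mode)
    for row, cells in relative_frequencies(counts, mode).items():
        print("  ", row, {col: round(p, 3) for col, p in cells.items()})
```

With `mode="all"` each cell is a joint probability, with `mode="row"` each cell is conditional on the class year, and with `mode="column"` each cell is conditional on the format preference.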

Building a Two-Way Table from Raw Data

Sometimes you need to construct the table yourself from a description. Here is the process:

Example 3

A store tracks 400 customer transactions. Of the 240 cash transactions, 36 involved a return. Of the 160 credit card transactions, 48 involved a return. Build the two-way table and find the probability of a return given credit card payment.

Step 1: Set up the rows and columns.

| | Return | No Return | Total |
|---|---|---|---|
| Cash | 36 | ? | 240 |
| Credit | 48 | ? | 160 |
| Total | ? | ? | 400 |

Step 2: Fill in the missing cells by subtraction.

  • Cash, No Return: $240 - 36 = 204$
  • Credit, No Return: $160 - 48 = 112$
  • Total Returns: $36 + 48 = 84$
  • Total No Returns: $204 + 112 = 316$

| | Return | No Return | Total |
|---|---|---|---|
| Cash | 36 | 204 | 240 |
| Credit | 48 | 112 | 160 |
| Total | 84 | 316 | 400 |

Verification: $84 + 316 = 400$. $240 + 160 = 400$. Both match.

Step 3: Answer the question.

$$P(\text{Return} \mid \text{Credit}) = \frac{48}{160} = 0.30 = 30\%$$

For comparison: $P(\text{Return} \mid \text{Cash}) = \frac{36}{240} = 0.15 = 15\%$. Credit card purchases have double the return rate.
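
When the data arrives as individual records rather than a summary, the table can be tallied directly. The sketch below assumes each transaction is a `(payment, outcome)` pair; the `records` list is not real transaction data but is expanded from the counts in Example 3 purely for illustration, and it uses `collections.Counter` from the Python standard library.

```python
from collections import Counter

# One (payment method, outcome) pair per transaction, expanded from Example 3.
records = (
    [("Cash", "Return")] * 36
    + [("Cash", "No Return")] * 204
    + [("Credit", "Return")] * 48
    + [("Credit", "No Return")] * 112
)

cell_counts = Counter(records)                              # interior cells
row_totals = Counter(payment for payment, _ in records)     # totals by payment method
column_totals = Counter(outcome for _, outcome in records)  # totals by outcome

# Conditional probability: cell count divided by the total for the condition.
p_return_given_credit = cell_counts[("Credit", "Return")] / row_totals["Credit"]
p_return_given_cash = cell_counts[("Cash", "Return")] / row_totals["Cash"]

print(dict(cell_counts))
print(dict(row_totals), dict(column_totals), len(records))
print(p_return_given_credit, p_return_given_cash)
```

Tallying from records removes the subtraction step, because every cell is counted directly instead of being inferred from a total.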

Real-World Application: Nursing — Treatment Outcomes

A hospital compares outcomes for two physical therapy approaches used on 200 patients recovering from knee surgery:

| | Improved | No Change | Worsened | Total |
|---|---|---|---|---|
| Traditional PT | 55 | 35 | 10 | 100 |
| Aquatic PT | 70 | 22 | 8 | 100 |
| Total | 125 | 57 | 18 | 200 |

Key conditional probabilities for the nursing team:

$$P(\text{Improved} \mid \text{Traditional}) = \frac{55}{100} = 0.55 = 55\%$$

$$P(\text{Improved} \mid \text{Aquatic}) = \frac{70}{100} = 0.70 = 70\%$$

$$P(\text{Worsened} \mid \text{Traditional}) = \frac{10}{100} = 0.10 = 10\%$$

$$P(\text{Worsened} \mid \text{Aquatic}) = \frac{8}{100} = 0.08 = 8\%$$

Aquatic PT shows a higher improvement rate (70% vs 55%) and a slightly lower worsening rate (8% vs 10%). A nurse reviewing this data could use these conditional probabilities to inform patient discussions — while noting that this observational data alone cannot prove causation (other factors like patient age or injury severity could explain the difference).

Practice Problems

Test your understanding with these problems. The worked answer follows each problem.

Problem 1: A survey of 300 adults asked about exercise habits and sleep quality. Of 180 who exercise regularly, 126 reported good sleep. Of 120 who do not exercise regularly, 48 reported good sleep. What is $P(\text{Good Sleep} \mid \text{Exercise})$?

$$P(\text{Good Sleep} \mid \text{Exercise}) = \frac{126}{180} = 0.70 = 70\%$$

Answer: Among those who exercise regularly, 70% report good sleep quality.

Problem 2: Using the same data, what is $P(\text{Exercise} \mid \text{Good Sleep})$?

Total with good sleep: $126 + 48 = 174$.

$$P(\text{Exercise} \mid \text{Good Sleep}) = \frac{126}{174} \approx 0.724 = 72.4\%$$

Answer: Among those with good sleep, about 72.4% exercise regularly. This is a different question from Problem 1.

Problem 3: Using the student survey data at the top of this page, what is the joint probability that a randomly selected student is a junior who prefers in-person classes?

$$P(\text{Junior and In-Person}) = \frac{55}{650} \approx 0.085 = 8.5\%$$

Answer: About 8.5% of all students surveyed are juniors who prefer in-person classes.

Problem 4: A hospital tested 500 patients. Of 300 who received Drug A, 240 improved. Of 200 who received Drug B, 140 improved. Are drug type and outcome independent?

$$P(\text{Improved}) = \frac{240 + 140}{500} = \frac{380}{500} = 0.76$$

$$P(\text{Improved} \mid \text{Drug A}) = \frac{240}{300} = 0.80$$

$$P(\text{Improved} \mid \text{Drug B}) = \frac{140}{200} = 0.70$$

Since $0.80 \neq 0.76$ and $0.70 \neq 0.76$, the variables are not independent. Drug A has a higher improvement rate.
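
The equivalent product-rule check described in the independence section gives the same verdict. As a short worked sketch using the Problem 4 counts:

$$P(\text{Drug A and Improved}) = \frac{240}{500} = 0.48, \qquad P(\text{Drug A}) \times P(\text{Improved}) = \frac{300}{500} \times \frac{380}{500} = 0.60 \times 0.76 = 0.456$$

Because $0.48 \neq 0.456$, the joint probability does not equal the product of the marginals, so drug type and outcome are not independent.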

Problem 5: A store tracked 500 orders. Online: 300 total (45 returned, 255 not). In-Store: 200 total (15 returned, 185 not). Find the joint, marginal, and conditional probabilities for “Online and Returned.”

Joint: $P(\text{Online and Returned}) = \frac{45}{500} = 0.09 = 9\%$

Marginal: $P(\text{Online}) = \frac{300}{500} = 0.60$, and $P(\text{Returned}) = \frac{60}{500} = 0.12$

Conditional: $P(\text{Returned} \mid \text{Online}) = \frac{45}{300} = 0.15 = 15\%$

Also: $P(\text{Online} \mid \text{Returned}) = \frac{45}{60} = 0.75 = 75\%$

Answer: The joint probability is 9%, the marginal probabilities are 60% (Online) and 12% (Returned), and the conditional probability of a return given online purchase is 15%.

Key Takeaways

  • A two-way table organizes data by two categorical variables, displaying cell counts, row totals, column totals, and the grand total.
  • Joint probability: divide the cell count by the grand total, $P(A \text{ and } B) = \frac{\text{cell}}{\text{grand total}}$.
  • Marginal probability: divide the row or column total by the grand total, $P(A) = \frac{\text{margin}}{\text{grand total}}$.
  • Conditional probability: divide the cell count by the relevant row or column total, $P(A \mid B) = \frac{\text{cell}}{\text{condition total}}$.
  • Relative frequency tables convert counts to proportions, making probabilities directly readable.
  • To test for independence, check whether conditional probabilities equal the corresponding marginal probabilities.
  • Two-way tables are widely used in healthcare, business, and social science to compare outcomes across groups.


Last updated: March 29, 2026