CAPE PENINSULA UNIVERSITY OF TECHNOLOGY STAT151X

Hypothesis Testing - Two Samples

StudyAI Editorial

Reviewed by StudyAI tutors

· Published May 29, 2026 Updated May 29, 2026

From the statistics 1B curriculum

1. Introduction & Overview

The Mental Model: Two-sample hypothesis testing is akin to a forensic comparison. It evaluates whether an observed difference between two sets of evidence (data) reflects a genuine difference between the underlying populations or is merely an artifact of random sampling variability.
Significance:
- Medical Research: Comparing efficacy of two drugs (Drug A vs. Drug B), comparing incidence rates of disease in treated vs. control groups.
- Engineering Quality Control: Assessing if two production lines yield products with significantly different defect rates or tensile strengths.
- Social Sciences: Determining if two demographic groups (Gender A vs. Gender B, Age Group X vs. Age Group Y) exhibit statistically different mean scores on a psychological construct.
- Business Analytics: Evaluating if a new marketing strategy (Strategy A) results in significantly higher sales conversion rates than an old one (Strategy B).
- Environmental Science: Comparing pollutant levels in two different geographical regions or at two different time points.

mindmap
  root((Hypothesis Testing - Two Samples))
    Objectives
      Compare means ("Quantitative Data")
        "Independent Samples"
          "Known Variance"
          "Unknown Variance (Pooled)"
          "Unknown Variance (Welch's)"
        "Paired Samples"
      Compare proportions ("Categorical Data")
        "Independent Samples"
        "Known N, P"
      Compare variances
        "F-test"
    Assumptions
      "Independence"
      "Normality"
      "Homoscedasticity"
      "Random Sampling"
    Test Statistics
      "t-statistic"
      "z-statistic"
      "F-statistic"
    Decision Rule
      "p-value approach"
      "Critical value approach"
    "Type I Error (α)"
    "Type II Error (β)"
    "Power (1-β)"

2. In-Depth Theory, Equations & Mechanisms

Hypothesis testing for two samples primarily involves comparing parameters (means, proportions, variances) from two distinct populations based on sample data. The fundamental principle remains the construction of a null hypothesis ($H_0$), representing no difference, and an alternative hypothesis ($H_1$ or $H_a$), representing a significant difference.

2.1 Comparison of Two Population Means ($\mu_1 - \mu_2$)

2.1.1 Independent Samples, Population Variances Known ($\sigma_1^2, \sigma_2^2$ known)

This scenario, though rare in practice (as known population variances usually imply known means), serves as a foundational theoretical case.
* Assumptions:
1. Samples are drawn independently from two populations.
2. Both populations are normally distributed, or sample sizes ($n_1, n_2$) are sufficiently large ($n_1 \geq 30, n_2 \geq 30$) for the Central Limit Theorem to apply.
3. Population variances $\sigma_1^2$ and $\sigma_2^2$ are known.
* Hypotheses Formulation:
* Two-tailed: $H_0: \mu_1 = \mu_2$ (or $\mu_1 - \mu_2 = 0$) vs. $H_1: \mu_1 \neq \mu_2$ (or $\mu_1 - \mu_2 \neq 0$)
* One-tailed (Left): $H_0: \mu_1 \geq \mu_2$ (or $\mu_1 - \mu_2 \geq 0$) vs. $H_1: \mu_1 < \mu_2$ (or $\mu_1 - \mu_2 < 0$)
* One-tailed (Right): $H_0: \mu_1 \leq \mu_2$ (or $\mu_1 - \mu_2 \leq 0$) vs. $H_1: \mu_1 > \mu_2$ (or $\mu_1 - \mu_2 > 0$)
* Test Statistic: The $z$-statistic is employed due to known population variances.
$$Z_{calc} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{H_0}}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
Under $H_0: \mu_1 = \mu_2$, the term $(\mu_1 - \mu_2)_{H_0}$ becomes 0.
$$Z_{calc} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
* Distribution: Standard Normal Distribution $N(0, 1)$.
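
The $z$-statistic above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `two_sample_z`, the `delta0` parameter and the summary figures are all hypothetical, and only the standard library is assumed.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, var1, var2, n1, n2, delta0=0.0):
    """Return (z, two-tailed p-value) for H0: mu1 - mu2 = delta0, known variances."""
    se = sqrt(var1 / n1 + var2 / n2)  # standard error of the difference in means
    z = ((mean1 - mean2) - delta0) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Hypothetical summary data (illustrative only): known population variances 16 and 9.
z, p = two_sample_z(mean1=52.0, mean2=50.0, var1=16.0, var2=9.0, n1=40, n2=36)
print(f"z = {z:.3f}, two-tailed p = {p:.4f}")
```

With these made-up inputs $z$ is about 2.48 and the two-tailed p-value about 0.013, so $H_0$ would be rejected at $\alpha = 0.05$.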

2.1.2 Independent Samples, Population Variances Unknown But Assumed Equal ($\sigma_1^2 = \sigma_2^2$)

This is a very common scenario, often justified by prior knowledge or an F-test on sample variances.
* Assumptions:
1. Samples are drawn independently from two populations.
2. Both populations are normally distributed.
3. Population variances are unknown but assumed equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$).
* Pooled Sample Variance ($S_p^2$): Since we assume equal population variances, we pool the sample variances to get a better estimate of the common population variance.
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
where $S_1^2$ and $S_2^2$ are the sample variances.
* Test Statistic: The $t$-statistic is used.
$$t_{calc} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{H_0}}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Under $H_0: \mu_1 = \mu_2$, the term $(\mu_1 - \mu_2)_{H_0}$ becomes 0.
$$t_{calc} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
* Degrees of Freedom (df): $df = n_1 + n_2 - 2$.
* Distribution: Student's $t$-distribution with $n_1 + n_2 - 2$ degrees of freedom.
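
As a rough sketch of the pooled calculation, the Python below computes $S_p^2$, $t_{calc}$ and the degrees of freedom from summary statistics. The function name `pooled_t` and the numbers are hypothetical; the critical value 2.101 is the standard two-tailed $t$-table entry for $\alpha = 0.05$ and $df = 18$.

```python
from math import sqrt


def pooled_t(mean1, mean2, s1_sq, s2_sq, n1, n2):
    """Return (t, df, pooled variance) for H0: mu1 = mu2, equal variances assumed."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df  # pooled sample variance
    t = (mean1 - mean2) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, df, sp_sq


# Hypothetical summary statistics (illustrative only).
t_calc, df, sp_sq = pooled_t(mean1=25.0, mean2=22.0, s1_sq=9.0, s2_sq=16.0, n1=10, n2=10)

T_CRITICAL = 2.101  # two-tailed t-table value for alpha = 0.05, df = 18
decision = "Reject H0" if abs(t_calc) > T_CRITICAL else "Fail to reject H0"
print(f"Sp^2 = {sp_sq:.2f}, t = {t_calc:.3f}, df = {df}: {decision}")
```

Here $S_p^2 = 12.5$ and $t_{calc} \approx 1.897$, which is below 2.101, so these illustrative data do not lead to rejection of $H_0$.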

2.1.3 Independent Samples, Population Variances Unknown and Unequal ($\sigma_1^2 \neq \sigma_2^2$) - Welch's t-test

This is the most robust and generally recommended approach when population variances are unknown. Its degrees of freedom are approximated with the Welch-Satterthwaite equation.
* Assumptions:
1. Samples are drawn independently from two populations.
2. Both populations are normally distributed.
3. Population variances are unknown and not assumed equal.
* Test Statistic: The $t$-statistic is used, similar to the pooled case but without pooling variances.
$$t_{calc} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{H_0}}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
Under $H_0: \mu_1 = \mu_2$, the term $(\mu_1 - \mu_2)_{H_0}$ becomes 0.
$$t_{calc} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
* Degrees of Freedom (df): This is approximated using the Welch-Satterthwaite equation, which typically results in a non-integer value.
$$df = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$$
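When critical values are read from a printed $t$-table, this value is usually rounded down to the nearest integer, which gives a slightly conservative critical value. Statistical software typically uses the fractional value directly.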
* Distribution: Student's $t$-distribution with calculated (approximated) degrees of freedom.
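
The Welch statistic and the Welch-Satterthwaite degrees of freedom are easy to mis-key by hand, so a small sketch may help. The function name `welch_t` and the summary figures below are hypothetical, chosen only to show a non-integer $df$.

```python
from math import floor, sqrt


def welch_t(mean1, mean2, s1_sq, s2_sq, n1, n2):
    """Return (t, df) for Welch's t-test of H0: mu1 = mu2 from summary statistics."""
    v1, v2 = s1_sq / n1, s2_sq / n2  # per-sample variance of the mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics with clearly unequal variances.
t_calc, df = welch_t(mean1=25.0, mean2=22.0, s1_sq=4.0, s2_sq=36.0, n1=10, n2=15)
print(f"t = {t_calc:.3f}, df = {df:.2f}, table df = {floor(df)}")
```

For these inputs $t_{calc} \approx 1.793$ and $df \approx 18.27$, which would be read as $df = 18$ in a printed table.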

2.1.4 Paired Samples (Dependent Samples)

This scenario occurs when observations in the two samples are naturally linked or matched (e.g., before-and-after measurements on the same subjects, or matched pairs of subjects).
* Assumptions:
1. The pairs are independent.
2. The differences ($D_i = X_{1i} - X_{2i}$) are normally distributed.
* Hypotheses Formulation:
* $H_0: \mu_D = 0$ (Mean difference is zero) vs. $H_1: \mu_D \neq 0$ (Mean difference is not zero).
* Test Statistic: The $t$-statistic is employed for the mean difference.
$$t_{calc} = \frac{\bar{D} - \mu_{D,H_0}}{S_D / \sqrt{n}}$$
where $\bar{D} = \frac{\sum D_i}{n}$ is the mean of the differences, $S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n-1}}$ is the standard deviation of the differences, and $n$ is the number of pairs. Under $H_0: \mu_D = 0$, the term $\mu_{D,H_0}$ becomes 0.
$$t_{calc} = \frac{\bar{D}}{S_D / \sqrt{n}}$$
* Degrees of Freedom (df): $df = n - 1$.
* Distribution: Student's $t$-distribution with $n-1$ degrees of freedom.
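
A paired test reduces to a one-sample calculation on the differences, as the following sketch illustrates. The function name `paired_t` and the before/after scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(sample1, sample2):
    """Return (t, df) for H0: mu_D = 0, where D_i = sample1[i] - sample2[i]."""
    if len(sample1) != len(sample2):
        raise ValueError("Paired samples must have the same number of observations.")
    diffs = [x1 - x2 for x1, x2 in zip(sample1, sample2)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    return d_bar / (s_d / sqrt(n)), n - 1


# Hypothetical before/after measurements on the same five subjects.
before = [82, 75, 90, 68, 77]
after = [79, 74, 85, 66, 75]
t_calc, df = paired_t(before, after)
print(f"t = {t_calc:.3f}, df = {df}")
```

The differences are 3, 1, 5, 2, 2, giving $\bar{D} = 2.6$, $S_D \approx 1.517$ and $t_{calc} \approx 3.83$ on 4 degrees of freedom.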

2.2 Comparison of Two Population Proportions ($p_1 - p_2$)

This test is used when comparing the success rates or prevalence of an event in two independent categorical datasets.
* Assumptions:
1. Samples are drawn independently from two populations.
2. Both samples are large enough such that $n_1p_1 \geq 5, n_1(1-p_1) \geq 5$, $n_2p_2 \geq 5, n_2(1-p_2) \geq 5$. (Sometimes $n_ip_i \geq 10$ and $n_i(1-p_i) \geq 10$). These conditions ensure the sampling distribution of the sample proportion is approximately normal.
* Hypotheses Formulation:
* $H_0: p_1 = p_2$ (or $p_1 - p_2 = 0$) vs. $H_1: p_1 \neq p_2$ (or $p_1 - p_2 \neq 0$).
* Pooled Sample Proportion ($\hat{p}$): Under the null hypothesis that $p_1 = p_2 = p$, we pool the sample data to estimate this common proportion.
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
where $x_1$ and $x_2$ are the number of successes in samples 1 and 2, respectively.
* Test Statistic: The $z$-statistic is employed as the sampling distribution of the difference in proportions is approximately normal for large samples.
$$Z_{calc} = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_{H_0}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Under $H_0: p_1 = p_2$, the term $(p_1 - p_2)_{H_0}$ becomes 0.
$$Z_{calc} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
* Distribution: Standard Normal Distribution $N(0, 1)$.
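
The pooled-proportion $z$-test can be sketched as below. The function name `two_proportion_z` and the counts are hypothetical, and the sketch assumes the large-sample conditions listed above already hold.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-tailed p-value) for H0: p1 = p2 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # common proportion estimated under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 60 of 200 successes in group 1, 45 of 200 in group 2.
z, p_value = two_proportion_z(x1=60, n1=200, x2=45, n2=200)
print(f"z = {z:.3f}, two-tailed p = {p_value:.4f}")
```

With these counts $\hat{p} = 0.2625$, $Z_{calc} \approx 1.70$ and the two-tailed p-value is roughly 0.088, so $H_0$ would not be rejected at $\alpha = 0.05$.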

2.3 Comparison of Two Population Variances ($\sigma_1^2, \sigma_2^2$)

This test is often conducted as a preliminary step before comparing two means, especially to ascertain whether a pooled $t$-test or Welch's $t$-test is appropriate.
* Assumptions:
1. Samples are drawn independently from two populations.
2. Both populations are normally distributed. (This assumption is critical and the F-test is highly sensitive to violations).
* Hypotheses Formulation:
* Two-tailed: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$
* One-tailed: $H_0: \sigma_1^2 \leq \sigma_2^2$ vs. $H_1: \sigma_1^2 > \sigma_2^2$ (or vice-versa)
* Test Statistic: The F-statistic. Conventionally, the sample with the larger variance is labelled sample 1 and placed in the numerator, which ensures $F_{calc} \geq 1$.
$$F_{calc} = \frac{S_1^2}{S_2^2}$$
* Degrees of Freedom (df): $df_1 = n_1 - 1$ (numerator degrees of freedom), $df_2 = n_2 - 1$ (denominator degrees of freedom).
* Distribution: F-distribution with $df_1$ and $df_2$ degrees of freedom.
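
For the two-tailed case, the larger-variance-on-top convention also determines which sample supplies the numerator degrees of freedom. The sketch below makes that bookkeeping explicit; the function name `f_statistic` and the inputs are hypothetical, and the critical value would still have to come from an F-table or software.

```python
def f_statistic(s1_sq, n1, s2_sq, n2):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, n1 - 1, n2 - 1
    # Swap roles so that F >= 1; the degrees of freedom swap with the variances.
    return s2_sq / s1_sq, n2 - 1, n1 - 1


# Hypothetical sample variances: the second sample is the more variable one.
f_calc, df_num, df_den = f_statistic(s1_sq=4.0, n1=10, s2_sq=36.0, n2=15)
print(f"F = {f_calc:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

Here the second sample goes in the numerator, so $F_{calc} = 9.0$ with 14 numerator and 9 denominator degrees of freedom.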

radar-beta
  title "Comparative Robustness & Sensitivity"
  series
    name "Sensitivity to Normality"
    data [9, 6, 7, 5, 8]
  series
    name "Robustness to Unequal Variances"
    data [5, 9, 10, 6, 6]
  series
    name "Power (typical scenarios)"
    data [7, 8, 9, 9, 7]
  series
    name "Ease of Calculation"
    data [10, 8, 7, 9, 6]
  labels ["Z-test (known σ)", "Pooled t-test (equal σ)", "Welch's t-test (unequal σ)", "Paired t-test", "F-test (for variances)"]

3. Technical Procedures & Applications

3.1 Procedure for Two-Sample Independent t-test (Welch's approach)

This procedure outlines the steps for performing a Welch's t-test, which is generally preferred due to its robustness against the assumption of equal variances.

sequenceDiagram
    participant Analyst
    participant DataCollection
    participant Statistician
    participant DecisionMaker

    Analyst->>DataCollection: 1. Define Research Question (e.g., "Is mean yield of Process A different from Process B?")
    DataCollection->>Analyst: 2. Collect two independent samples (n1, n2)
    Analyst->>Analyst: 3. Calculate sample statistics: mean (X̄1, X̄2), stDev (S1, S2), size (n1, n2) for each sample.
    Analyst->>Statistician: 4. Formulate Hypotheses:
        Note left of Statistician: H0: µ1 = µ2 (No difference)
        Note left of Statistician: H1: µ1 ≠ µ2 (Difference exists)
    Analyst->>Analyst: 5. Set Significance Level (α), typically 0.05.
    Analyst->>Analyst: 6. Calculate Test Statistic (Welch's t):
        Note left of Analyst: $$t_{calc} = (\bar{X}_1 - \bar{X}_2) / \sqrt{(S_1^2/n_1) + (S_2^2/n_2)}$$
    Analyst->>Analyst: 7. Calculate Degrees of Freedom (Welch-Satterthwaite):
        Note left of Analyst: $$df = ( (S_1^2/n_1) + (S_2^2/n_2) )^2 / ( ( (S_1^2/n_1)^2 / (n1-1) ) + ( (S_2^2/n_2)^2 / (n2-1) ) )$$
    Analyst->>Analyst: 8. Determine Critical Value(s) or p-value:
        Note left of Analyst: Using t-distribution table or software with calculated df and α.
    Analyst->>Analyst: 9. Compare Test Statistic to Critical Value OR p-value to α.
        alt If |t_calc| > t_critical (or p-value < α)
            Analyst->>Statistician: 10a. Reject H0
        else If |t_calc| <= t_critical (or p-value >= α)
            Analyst->>Statistician: 10b. Fail to Reject H0
        end
    Statistician->>DecisionMaker: 11. Interpret Results and Draw Conclusion.
        Note right of DecisionMaker: "There is/is no sufficient evidence at α level to conclude a difference in means."
    DecisionMaker->>DecisionMaker: 12. Make Practical Decision Based on Statistical Conclusion.
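
The procedure in the diagram can be followed end to end from raw data. The sketch below is one possible arrangement, not a prescribed implementation: the yields for the two processes are invented, the helper name `welch_from_samples` is hypothetical, and the critical value 2.447 is the standard two-tailed $t$-table entry for $\alpha = 0.05$ and $df = 6$.

```python
from math import floor, sqrt
from statistics import mean, variance


def welch_from_samples(sample1, sample2):
    """Steps 3, 6 and 7: sample statistics, Welch's t and Welch-Satterthwaite df."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # S1^2 / n1
    v2 = variance(sample2) / n2  # S2^2 / n2
    t_calc = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_calc, df


# Step 2: hypothetical yields from two independent processes.
process_a = [20.0, 22.0, 19.0, 24.0, 25.0]
process_b = [16.0, 18.0, 17.0, 15.0, 19.0]

t_calc, df = welch_from_samples(process_a, process_b)
df_table = floor(df)  # round down for a conservative table lookup

# Steps 5 and 8: alpha = 0.05, two-tailed critical value for df = 6.
T_CRITICAL = 2.447

# Steps 9 and 10: decision rule.
decision = "Reject H0" if abs(t_calc) > T_CRITICAL else "Fail to reject H0"
print(f"t = {t_calc:.3f}, df = {df:.2f} (table df {df_table}): {decision}")
```

For this made-up data $t_{calc} \approx 3.73$ and $df \approx 6.68$, so $H_0$ is rejected at the 5% level.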

3.2 Procedure for Two-Sample Z-test for Proportions

This models the technical procedure for comparing two proportions.

sequenceDiagram
    participant Researcher
    participant DataEngineer
    participant StatsModule
    participant ReportGen

    Researcher->>DataEngineer: 1. Define Hypotheses (e.g., H0: p1=p2 vs. H1: p1!=p2)
    DataEngineer->>DataEngineer: 2. Extract counts of successes (x1, x2) and total sample sizes (n1, n2) for two groups.
    DataEngineer->>StatsModule: 3. Calculate sample proportions:
        Note over StatsModule: $$\hat{p}_1 = x_1/n_1$$
        Note over StatsModule: $$\hat{p}_2 = x_2/n_2$$
    StatsModule->>StatsModule: 4. Check large sample conditions:
        Note over StatsModule: $$n_i \hat{p}_i \ge 5, n_i(1-\hat{p}_i) \ge 5$$
        break If conditions not met
            StatsModule->>ReportGen: Flag: "Sample size insufficient for Z-test. Consider Fisher's Exact Test."
        end
    StatsModule->>StatsModule: 5. Calculate pooled proportion under H0:
        Note over StatsModule: $$\hat{p} = (x_1 + x_2) / (n_1 + n_2)$$
    StatsModule->>StatsModule: 6. Calculate Standard Error of the difference:
        Note over StatsModule: $$SE_{\hat{p}_1-\hat{p}_2} = \sqrt{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)}$$
    StatsModule->>StatsModule: 7. Calculate Test Statistic (Z-score):
        Note over StatsModule: $$Z_{calc} = (\hat{p}_1 - \hat{p}_2) / SE_{\hat{p}_1-\hat{p}_2}$$
    StatsModule->>StatsModule: 8. Determine p-value from Standard Normal Distribution.
    StatsModule->>ReportGen: 9. Output Z-calc, p-value, and confidence interval for (p1-p2).
    ReportGen->>Researcher: 10. Present conclusive report & recommendation.
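
Steps 4 and 9 of this procedure can be sketched as follows. The diagram does not say which interval is reported, so the code assumes the common Wald interval with an unpooled standard error. The function names `large_sample_ok` and `diff_proportion_ci` and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ok(x, n, threshold=5):
    """Step 4: n * p_hat = x and n * (1 - p_hat) = n - x must both reach the threshold."""
    return x >= threshold and (n - x) >= threshold


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Step 9 (assumed form): Wald interval for p1 - p2 with unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Hypothetical counts for two groups.
x1, n1, x2, n2 = 60, 200, 45, 200
if large_sample_ok(x1, n1) and large_sample_ok(x2, n2):
    lower, upper = diff_proportion_ci(x1, n1, x2, n2)
    print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
else:
    print("Sample size insufficient for Z-test. Consider Fisher's Exact Test.")
```

The interval here runs from about -0.011 to 0.161. Because it contains 0, it agrees with a two-tailed test that fails to reject $H_0$ at $\alpha = 0.05$.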

4. Examiner's Breakdown

4.1 Comparative Analysis

| Feature | One-Sample Test | Two-Sample Independent Test | Two-Sample Paired Test |
| --- | --- | --- | --- |
| Primary Objective | Compare sample parameter to known population parameter/hypothesized value. | Compare parameters of two independent populations. | Compare parameters from the same or matched subjects under two conditions. |
| Statistical Units | $n$ observations from 1 group | $n_1$ observations from Group 1, $n_2$ from Group 2 | $n$ pairs of observations (e.g., $X_{before}, X_{after}$) |
| Relationship between Samples | Single sample | No direct relationship; random sampling ensures independence. | Direct, one-to-one correspondence or repeated measures. |
| Variability Focus | Sample mean vs. population mean | Difference in sample means/proportions | Variability of the differences within pairs. |
| Degrees of Freedom (mean) | $n-1$ | $n_1+n_2-2$ (pooled), complex (Welch's) | $n-1$ (where $n$ is number of pairs) |
| Primary Benefit | Simplest baseline comparison | Versatile for comparing distinct groups | Controls for inter-subject variability, increasing statistical power. |
| Use Case Example | Test if average product weight is 100g. | Test if Drug A lowers blood pressure more than Drug B. | Test if a new diet reduces weight in the same individuals. |
| Sensitivity to Assumptions | Normality of sample mean (CLT) | Normality of populations, equal variances (pooled t). | Normality of differences. |
| Robustness | Good with large N (CLT) | Welch's t-test: Robust to unequal variances. | Generally robust if differences are symmetric. |
| Test Statistics | Z or t | Z (proportions, known $\sigma$), t (unknown $\sigma$) | t |

4.2 High-Yield Marking Keywords

"Null Hypothesis ($H_0$) and Alternative Hypothesis ($H_1$)": Explicitly stated, using appropriate symbols ($\mu, p, \sigma^2$) and directionality.
"Appropriate Test Statistic": Correct selection from $Z$, $t$, or $F$ based on known/unknown population parameters and sample structure.
"Degrees of Freedom (df)": Correct calculation, specifically for pooled $t$-test ($n_1 + n_2 - 2$), paired $t$-test ($n-1$), or Welch's approximation.
"Critical Value(s) or p-value comparison": Clear statement of comparison mechanics and decision rule.
"Pooled Sample Variance ($S_p^2$)": If applicable, the correctly formulated equation for estimating common variance.
"Independence of Samples": A stated assumption crucial for all non-paired tests.
"Normality or Large Sample Sizes": Justification for using Z/t distributions.
"Conclusion in Context": Interpreting the statistical decision within the problem's real-world implications, avoiding definitive claims of "proof."

4.3 Trapdoor Mistakes

Incorrectly Using Pooled t-test when Variances are Unequal: Students often default to the pooled $t$-test formula ($df = n_1+n_2-2$) without first checking the assumption of equal variances (e.g., via an F-test or by inspection).
- Correct way: If variances are known/assumed unequal, use Welch's $t$-test with its specific, complex degrees of freedom formula ($df_{Welch}$). If an F-test leads to rejection of $H_0: \sigma_1^2 = \sigma_2^2$, then Welch's test is mandated.
Applying Independent Sample Test to Paired Data: Treating paired observations (e.g., before/after measurements on the same subject) as independent samples. This ignores the dependency within pairs and, when the paired measurements are positively correlated (the usual case), inflates the standard error, thereby reducing power.
- Correct way: Formulate the differences ($D_i = X_{1i} - X_{2i}$) and perform a one-sample $t$-test on these differences with $df = n-1$ (where $n$ is the number of pairs). This effectively controls for inter-subject variability.
Misinterpreting "Fail to Reject $H_0$": Concluding that failing to reject the null hypothesis definitively proves the null hypothesis is true.
- Correct way: "Fail to reject $H_0$" means there is insufficient evidence at the specified significance level to conclude that $H_1$ is true. It does not imply that $H_0$ is proven. Consider the possibility of Type II error or inadequate statistical power.
Ignoring Sample Size Conditions for Z-test on Proportions: Applying the Z-test for two proportions when $np$ or $n(1-p)$ for either sample is less than 5 (or 10, depending on conservative guidelines).
- Correct way: If these conditions ($n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, $n_2(1-\hat{p}_2) \geq 5$) are not met, the normal approximation to the binomial distribution is invalid. Fisher's Exact Test or other exact methods based on the hypergeometric distribution should be considered.
Incorrectly Placing sample variances in F-test: Placing the smaller sample variance in the numerator for a two-tailed F-test.
- Correct way: For a two-tailed F-test for variances, always place the larger sample variance in the numerator ($F_{calc} = S_{larger}^2 / S_{smaller}^2$). This ensures $F_{calc} \ge 1$ and allows direct comparison with a single critical value from the upper tail of the F-distribution (using $\alpha/2$). For one-tailed tests, the hypothesized direction dictates the numerator.
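
The paired-data trap is easiest to see numerically. The sketch below analyses one invented before/after data set both ways; the numbers are illustrative only, and the unpooled (Welch) standard error stands in for the independent-samples analysis.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical before/after measurements on the same five subjects.
before = [82, 75, 90, 68, 77]
after = [79, 74, 85, 66, 75]
n = len(before)

# Correct analysis: one-sample t on the within-pair differences.
diffs = [b - a for b, a in zip(before, after)]
se_paired = stdev(diffs) / sqrt(n)
t_paired = mean(diffs) / se_paired

# Mistaken analysis: treat the two columns as independent samples.
se_indep = sqrt(variance(before) / n + variance(after) / n)
t_indep = (mean(before) - mean(after)) / se_indep

print(f"paired:      SE = {se_paired:.3f}, t = {t_paired:.3f}")
print(f"independent: SE = {se_indep:.3f}, t = {t_indep:.3f}")
```

The mean difference is 2.6 in both analyses, but the standard error grows from about 0.68 to about 4.82 when the pairing is ignored, shrinking $t_{calc}$ from roughly 3.83 to roughly 0.54.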
