CAPE PENINSULA UNIVERSITY OF TECHNOLOGY STAT151X

Sampling Distributions and Estimation

From the statistics 1B curriculum · Updated May 29, 2026

1. Introduction & Overview

  • The Mental Model: Imagine an infinite ocean of individual particles (the population), from which we draw finite buckets of water (samples). A sampling distribution is the theoretical probability distribution of a statistic (e.g., the average salinity of the water), calculated from all possible such finite buckets.
  • Significance:
    • Quantifies the uncertainty associated with sample statistics.
    • Provides the theoretical foundation for inferential statistics, including hypothesis testing and confidence intervals.
    • Enables estimation of unknown population parameters using sample data.
    • Crucial for experimental design, determining necessary sample sizes to achieve desired statistical power.
    • Underpins quality control processes in manufacturing by providing probabilistic guarantees.
mindmap
  root((Sampling Distributions & Estimation))
    Sampling Distribution
      "Definition (Population Parameter vs. Sample Statistic)"
      "Role of Random Sampling"
      "Central Limit Theorem (CLT)"
        "Conditions (n, independence)"
        "Implications (Normality)"
      "Types of Sampling Distributions"
        "Sample Mean (X̄)"
        "Sample Proportion (P̂)"
        "Sample Variance (S²)"
        "Difference of Means (X̄₁ - X̄₂)"
        "Difference of Proportions (P̂₁ - P̂₂)"
    Estimation
      "Point Estimation"
        "Estimator Properties"
          "Unbiasedness"
          "Efficiency (Minimum Variance)"
          "Consistency"
          "Sufficiency"
        "Maximum Likelihood Estimation (MLE)"
          "Likelihood Function L(θ|x)"
          "Log-Likelihood ln L(θ|x)"
          "Score Function"
          "Fisher Information"
          "Cramér-Rao Lower Bound (CRLB)"
      "Interval Estimation (Confidence Intervals)"
        "Confidence Level (1-α)"
        "Margin of Error"
        "Interpretation (Frequentist vs. Bayesian)"
        "Intervals for Specific Parameters"
          "Mean (σ known, unknown)"
          "Proportion"
          "Variance"
          "Difference of Means"
          "Difference of Proportions"
    "Applications"
      "Statistical Inference"
      "Hypothesis Testing (Foundation)"
      "Quality Control"
      "Survey Design"

2. In-Depth Theory, Equations & Mechanisms

2.1 Populations and Samples

The population ($\mathcal{P}$) refers to the entire collection of objects or individuals about which information is desired. A parameter ($\theta$) is a numerical characteristic of the population (e.g., population mean $\mu$, population variance $\sigma^2$, population proportion $p$). A sample ($\mathcal{S}$) is a subset of the population selected for observation. A statistic ($T$) is a numerical characteristic of the sample (e.g., sample mean $\bar{X}$, sample variance $S^2$, sample proportion $\hat{p}$). The process of deducing properties of $\mathcal{P}$ from $\mathcal{S}$ is statistical inference.

2.2 Sampling Distributions

A sampling distribution is the probability distribution of a statistic obtained from all possible samples of a specific size taken from a population. This distribution describes the long-run behavior of the statistic.

2.2.1 Sampling Distribution of the Sample Mean ($\bar{X}$)

Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ drawn from a population with mean $\mu$ and finite variance $\sigma^2$. The sample mean is defined as:
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

  • Expected Value of the Sample Mean:
    $E[\bar{X}] = E\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \frac{1}{n} \sum_{i=1}^{n} \mu = \frac{n\mu}{n} = \mu$
    This demonstrates that $\bar{X}$ is an unbiased estimator of $\mu$.

  • Variance of the Sample Mean:
    $Var(\bar{X}) = Var\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} Var\left(\sum_{i=1}^{n} X_i\right)$
    If the observations $X_1, \ldots, X_n$ are independent (as in simple random sampling with replacement, or sampling from a very large population), the variance of the sum is the sum of the variances, so:
    $Var(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$

  • Standard Deviation of the Sample Mean (Standard Error of the Mean):
    $SE(\bar{X}) = \sqrt{Var(\bar{X})} = \frac{\sigma}{\sqrt{n}}$

  • Shape:

    • If the population is normally distributed: $\bar{X}$ is also normally distributed for any sample size $n$.
      $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
    • If the population is not normally distributed (and $n$ is sufficiently large, typically $n \ge 30$): The Central Limit Theorem (CLT) states that the sampling distribution of $\bar{X}$ will be approximately normally distributed.
      $\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right)$
      The standardized variable $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ follows approximately a standard normal distribution $N(0,1)$.
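
The results above can be checked by simulation. The following is a minimal sketch, assuming a skewed exponential population with mean 2 (so $\sigma = 2$) and samples of size $n = 40$; the population, the function name `simulate_sample_means`, and the number of repetitions are illustrative choices, not part of the course material.

```python
import random
import statistics

def simulate_sample_means(pop_mean, n, reps, seed=0):
    """Draw `reps` samples of size n from an exponential population
    with the given mean and return the list of sample means."""
    rng = random.Random(seed)
    rate = 1.0 / pop_mean  # expovariate takes the rate, not the mean
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(reps)
    ]

pop_mean = 2.0   # for an exponential population, sigma equals the mean
pop_sd = 2.0
n = 40

means = simulate_sample_means(pop_mean, n, reps=20000)

empirical_mean = statistics.fmean(means)  # should be close to mu
empirical_se = statistics.stdev(means)    # should be close to sigma / sqrt(n)
theoretical_se = pop_sd / n ** 0.5

print(f"mean of sample means: {empirical_mean:.3f} (theory {pop_mean})")
print(f"SD of sample means:   {empirical_se:.3f} (theory {theoretical_se:.3f})")
```

Even though the population is strongly skewed, the simulated sample means should centre on $\mu$ with spread close to $\sigma/\sqrt{n}$, as the CLT suggests.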

2.2.2 Sampling Distribution of the Sample Proportion ($\hat{p}$)

Let $X$ be the number of successes in $n$ independent Bernoulli trials, where the probability of success in a single trial is $p$. Then $X \sim Bin(n, p)$. The sample proportion is defined as:
$\hat{p} = \frac{X}{n}$

  • Expected Value of the Sample Proportion:
    $E[\hat{p}] = E\left[\frac{X}{n}\right] = \frac{1}{n} E[X] = \frac{1}{n} (np) = p$
    $\hat{p}$ is an unbiased estimator of $p$.

  • Variance of the Sample Proportion:
    $Var(\hat{p}) = Var\left(\frac{X}{n}\right) = \frac{1}{n^2} Var(X) = \frac{1}{n^2} (np(1-p)) = \frac{p(1-p)}{n}$

  • Standard Error of the Proportion:
    $SE(\hat{p}) = \sqrt{Var(\hat{p})} = \sqrt{\frac{p(1-p)}{n}}$

  • Shape:

    • For large $n$ (typically $np \ge 10$ and $n(1-p) \ge 10$), the sampling distribution of $\hat{p}$ is approximately normally distributed due to the CLT applied to Bernoulli random variables.
      $\hat{p} \approx N\left(p, \frac{p(1-p)}{n}\right)$
      The standardized variable $Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$ follows approximately a standard normal distribution $N(0,1)$.
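
As a worked illustration of the normal approximation, the sketch below computes $P(\hat{p} \le 0.45)$ when $p = 0.40$ and $n = 100$. The numbers and the helper name `prob_phat_at_most` are hypothetical; the check inside the function mirrors the $np \ge 10$, $n(1-p) \ge 10$ rule of thumb.

```python
from math import sqrt
from statistics import NormalDist

def prob_phat_at_most(threshold, p, n):
    """Normal-approximation estimate of P(p_hat <= threshold)
    when the number of successes is Bin(n, p)."""
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("rule of thumb np >= 10 and n(1-p) >= 10 not met")
    se = sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    return NormalDist(mu=p, sigma=se).cdf(threshold)

prob = prob_phat_at_most(0.45, p=0.40, n=100)
print(f"P(p_hat <= 0.45) is approximately {prob:.4f}")
```

This is equivalent to standardizing to $Z = (0.45 - 0.40)/\sqrt{0.40 \cdot 0.60/100}$ and reading the standard normal table.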

2.2.3 Sampling Distribution of the Sample Variance ($S^2$)

Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a normal population $N(\mu, \sigma^2)$. The sample variance is defined as:
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$

  • Expected Value of the Sample Variance:
    $E[S^2] = \sigma^2$
    $S^2$ is an unbiased estimator of $\sigma^2$. Note that $\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$ is a biased estimator.

  • Shape: If the population is normally distributed, the statistic $\frac{(n-1)S^2}{\sigma^2}$ follows a chi-squared distribution with $n-1$ degrees of freedom ($\chi^2_{n-1}$).
    $Y = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$
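
The difference between dividing by $n-1$ and dividing by $n$ can be seen in a small simulation. This is a sketch under assumed values ($\sigma^2 = 4$, $n = 5$, normal population); theory predicts that the $n$-divisor version averages about $\frac{n-1}{n}\sigma^2 = 3.2$.

```python
import random

rng = random.Random(1)
sigma2, n, reps = 4.0, 5, 20000  # illustrative values

sum_unbiased = 0.0
sum_biased = 0.0
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
    sum_unbiased += ss / (n - 1)  # S^2, divisor n - 1
    sum_biased += ss / n          # divisor n

mean_unbiased = sum_unbiased / reps  # should be close to sigma^2
mean_biased = sum_biased / reps      # should be close to (n-1)/n * sigma^2

print(f"average of S^2 (n-1 divisor): {mean_unbiased:.3f}")
print(f"average with n divisor:       {mean_biased:.3f}")
```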

2.2.4 The t-Distribution

When the population standard deviation $\sigma$ is unknown, it must be replaced by the sample standard deviation $S$, and the resulting statistic no longer follows the standard normal distribution exactly. The discrepancy matters most when the sample size $n$ is small. For a normally distributed population, the statistic:
$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
follows a Student's t-distribution with $n-1$ degrees of freedom ($t_{n-1}$).
The t-distribution is symmetric and bell-shaped like the normal distribution, but has heavier tails, accounting for the additional uncertainty introduced by estimating $\sigma$. As $n \to \infty$, the t-distribution approaches the standard normal distribution.
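
The heavier tails can be illustrated numerically. The sketch below, with assumed values $n = 5$ and a standard normal population, estimates how often $|T|$ exceeds 1.96. For a standard normal statistic this would happen about 5% of the time; for $T$ with 4 degrees of freedom it should happen noticeably more often.

```python
import random
from math import sqrt

rng = random.Random(2)
n, reps, cutoff = 5, 20000, 1.96  # illustrative values
mu = 0.0

exceed = 0
for _ in range(reps):
    sample = [rng.gauss(mu, 1.0) for _ in range(n)]
    xbar = sum(sample) / n
    s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    t_stat = (xbar - mu) / (s / sqrt(n))  # T = (X-bar - mu) / (S / sqrt(n))
    if abs(t_stat) > cutoff:
        exceed += 1

tail_fraction = exceed / reps
print(f"fraction with |T| > {cutoff}: {tail_fraction:.3f} (normal theory: 0.050)")
```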

2.3 Fundamentals of Estimation

2.3.1 Point Estimation

A point estimate is a single value, calculated from sample data, used to estimate an unknown population parameter. A point estimator is the rule or function that produces that value (e.g., $\bar{X}$ is an estimator for $\mu$, and an observed $\bar{x}$ is the corresponding estimate).

  • Properties of Good Estimators:

    • Unbiasedness: An estimator $\hat{\theta}$ is unbiased for $\theta$ if $E[\hat{\theta}] = \theta$. This means, on average, the estimator hits the true parameter value.
    • Efficiency: An unbiased estimator $\hat{\theta}_1$ is more efficient than an unbiased estimator $\hat{\theta}_2$ if $Var(\hat{\theta}_1) < Var(\hat{\theta}_2)$ for all $\theta$. The most efficient unbiased estimator is called the Minimum Variance Unbiased Estimator (MVUE). The Cramér-Rao Lower Bound (CRLB) provides a theoretical lower bound for the variance of any unbiased estimator: under regularity conditions, $Var(\hat{\theta}) \ge \frac{1}{I(\theta)}$, where $I(\theta)$ is the Fisher Information contained in the sample.
    • Consistency: An estimator $\hat{\theta}_n$ is consistent for $\theta$ if it converges in probability to $\theta$ as $n \to \infty$. That is, $\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| < \epsilon) = 1$ for any $\epsilon > 0$.
    • Sufficiency: A statistic $T(\mathbf{X})$ is sufficient for $\theta$ if the conditional distribution of the sample $\mathbf{X}$ given $T(\mathbf{X})$ does not depend on $\theta$. This means $T(\mathbf{X})$ captures all the information about $\theta$ present in the sample.
  • Maximum Likelihood Estimation (MLE):
    The MLE method chooses the parameter value that makes the observed data most probable.
    Let $X_1, \ldots, X_n$ be an i.i.d. random sample from a distribution with probability density function (or mass function) $f(x_i|\theta)$.
    The likelihood function $L(\theta|\mathbf{x})$ is defined as the joint density of the observed data:
    $L(\theta|\mathbf{x}) = \prod_{i=1}^{n} f(x_i|\theta)$
    The log-likelihood function $\ell(\theta|\mathbf{x}) = \ln L(\theta|\mathbf{x}) = \sum_{i=1}^{n} \ln f(x_i|\theta)$ is often maximized due to its mathematical tractability.
    The MLE $\hat{\theta}_{MLE}$ is the value of $\theta$ that maximizes $L(\theta|\mathbf{x})$ (or $\ell(\theta|\mathbf{x})$). This is typically found by solving $\frac{d}{d\theta} \ell(\theta|\mathbf{x}) = 0$.
    Properties of MLEs:

    1. Consistency: Under broad conditions, MLEs are consistent.
    2. Asymptotic Normality: As $n \to \infty$, $\hat{\theta}_{MLE}$ is approximately normally distributed.
    3. Asymptotic Efficiency: As $n \to \infty$, MLEs achieve the CRLB, becoming asymptotically efficient.
    4. Invariance: If $\hat{\theta}$ is the MLE for $\theta$, then $g(\hat{\theta})$ is the MLE for $g(\theta)$ for any function $g$.
radar-beta
  title "Desirable Estimator Properties"
  series
    name "Ideal Estimator"
    data
      Unbiasedness: 5
      Efficiency: 5
      Consistency: 5
      Sufficiency: 5
      Robustness: 4
    name "Sample Mean (for Normal Mean)"
    data
      Unbiasedness: 5
      Efficiency: 5
      Consistency: 5
      Sufficiency: 5
      Robustness: 3
    name "Sample Median (for Normal Mean)"
    data
      Unbiasedness: 5
      Efficiency: 3
      Consistency: 5
      Sufficiency: 2
      Robustness: 5
  variables
    - Unbiasedness
    - Efficiency
    - Consistency
    - Sufficiency
    - Robustness
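
The efficiency comparison between the sample mean and the sample median for normal data can be checked by simulation. This is a sketch under assumed settings ($n = 25$, standard normal population); both estimators are centred on $\mu$, but the median should show the larger variance.

```python
import random
import statistics

rng = random.Random(3)
n, reps = 25, 20000  # illustrative values

means, medians = [], []
for _ in range(reps):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)      # close to sigma^2 / n = 0.04
var_median = statistics.pvariance(medians)  # larger than var_mean
ratio = var_median / var_mean

print(f"Var(mean)   = {var_mean:.4f}")
print(f"Var(median) = {var_median:.4f}")
print(f"ratio       = {ratio:.2f}")
```

For large $n$ the ratio approaches $\pi/2 \approx 1.57$, which is why the median scores lower on efficiency for normal data while remaining more robust to outliers.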

2.3.2 Interval Estimation (Confidence Intervals)

A confidence interval (CI) for a parameter $\theta$ is an interval computed from sample data by a procedure that, over repeated sampling, captures the true parameter value with a specified probability. This probability is called the confidence level, denoted by $1-\alpha$.

A $(1-\alpha)100\%$ confidence interval for $\theta$ is typically of the form:
$\hat{\theta} \pm Z_{\alpha/2} \cdot SE(\hat{\theta})$ (for large $n$ or known $\sigma$)
or
$\hat{\theta} \pm t_{n-1, \alpha/2} \cdot SE(\hat{\theta})$ (for small $n$ and unknown $\sigma$)

Interpretation of a $(1-\alpha)100\%$ CI: If we were to repeat the sampling process many times and construct a confidence interval for each sample, approximately $(1-\alpha)100\%$ of these intervals would contain the true population parameter. It does not mean there is a $(1-\alpha)100\%$ probability that the specific interval calculated contains the true parameter.
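
The repeated-sampling interpretation can be demonstrated directly. The sketch below assumes a normal population with known $\sigma$ and builds a 95% Z-interval from each of many simulated samples; the function name `coverage` and all numeric values are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu, sigma, n, conf_level, reps, seed=4):
    """Fraction of simulated Z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # Z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

rate = coverage(mu=100.0, sigma=15.0, n=20, conf_level=0.95, reps=10000)
print(f"proportion of intervals containing mu: {rate:.3f}")
```

Each individual interval either contains $\mu$ or does not; the 95% describes the long-run success rate of the procedure.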

  • Confidence Interval for a Population Mean ($\mu$)

    • Case 1: Population Standard Deviation ($\sigma$) is Known
      Assumptions: Sample is random, population is normal or $n \ge 30$.
      Formula: $\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
      Here, $Z_{\alpha/2}$ is the critical Z-value such that $P(Z > Z_{\alpha/2}) = \alpha/2$.
    • Case 2: Population Standard Deviation ($\sigma$) is Unknown
      Assumptions: Sample is random, population is normal or $n \ge 30$. ($\sigma$ replaced by $S$, $Z$ replaced by $t$).
      Formula: $\bar{X} \pm t_{n-1, \alpha/2} \frac{S}{\sqrt{n}}$
      Here, $t_{n-1, \alpha/2}$ is the critical t-value with $n-1$ degrees of freedom such that $P(T > t_{n-1, \alpha/2}) = \alpha/2$.
  • Confidence Interval for a Population Proportion ($p$)
    Assumptions: Sample is random, $np \ge 10$ and $n(1-p) \ge 10$ (using $\hat{p}$ as an estimate for $p$).
    Formula: $\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
    This is based on the normal approximation to the binomial distribution. For small samples, exact methods (e.g., Clopper-Pearson interval based on Beta distribution) or Wilson score interval are preferred.

  • Confidence Interval for a Population Variance ($\sigma^2$)
    Assumptions: Sample is random, population is normally distributed.
    Formula: $\left(\frac{(n-1)S^2}{\chi^2_{n-1, \alpha/2}}, \frac{(n-1)S^2}{\chi^2_{n-1, 1-\alpha/2}}\right)$
    Here, $\chi^2_{n-1, \alpha/2}$ is the value from the chi-squared distribution with $n-1$ degrees of freedom such that $P(\chi^2 > \chi^2_{n-1, \alpha/2}) = \alpha/2$. Note the asymmetric nature due to the non-symmetric chi-squared distribution.
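
The interval formulas above translate into short functions. This is a minimal sketch with hypothetical function names and example data. The Z critical value is computed from the standard library; the chi-squared critical values are passed in as arguments because they are assumed to be read from a table (here $\chi^2_{9, 0.025} = 19.023$ and $\chi^2_{9, 0.975} = 2.700$).

```python
from math import sqrt
from statistics import NormalDist

def z_critical(conf_level):
    """Z_{alpha/2}: upper-tail critical value of N(0, 1)."""
    return NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

def mean_ci_known_sigma(xbar, sigma, n, conf_level=0.95):
    """CI for mu when sigma is known."""
    margin = z_critical(conf_level) * sigma / sqrt(n)
    return xbar - margin, xbar + margin

def proportion_ci(successes, n, conf_level=0.95):
    """Normal-approximation CI for p."""
    p_hat = successes / n
    margin = z_critical(conf_level) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def variance_ci(s2, n, chi2_alpha_half, chi2_one_minus_alpha_half):
    """CI for sigma^2; both critical values use upper-tail notation
    and are supplied by the caller (e.g., from a chi-squared table)."""
    lower = (n - 1) * s2 / chi2_alpha_half
    upper = (n - 1) * s2 / chi2_one_minus_alpha_half
    return lower, upper

# Hypothetical example data
print(mean_ci_known_sigma(xbar=72.0, sigma=8.0, n=64))
print(proportion_ci(successes=120, n=300))
print(variance_ci(s2=6.25, n=10,
                  chi2_alpha_half=19.023,
                  chi2_one_minus_alpha_half=2.700))
```

Note that the larger chi-squared value goes in the denominator of the lower bound, which is what makes the variance interval asymmetric around $S^2$.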

2.4 Determination of Sample Size

The required sample size $n$ can be determined to achieve a desired margin of error $E$ with a specified confidence level $(1-\alpha)$.

  • For Estimating a Population Mean:
    $E = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
    Solving for $n$: $n = \left(\frac{Z_{\alpha/2} \sigma}{E}\right)^2$
    If $\sigma$ is unknown, it can be estimated from a pilot study or historical data, or approximated from the anticipated range of the data (e.g., $\sigma \approx Range/4$; $Range/6$ gives a smaller, less conservative value). Always round $n$ up to the next integer.

  • For Estimating a Population Proportion:
    $E = Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
    Solving for $n$: $n = p(1-p) \left(\frac{Z_{\alpha/2}}{E}\right)^2$
    If no prior estimate of $p$ is available, use $p=0.5$ as it maximizes $p(1-p)$ and thus provides the most conservative (largest) sample size.
    $n = 0.25 \left(\frac{Z_{\alpha/2}}{E}\right)^2$
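
Both sample size formulas can be wrapped in small helpers. The function names and the example inputs below ($\sigma = 15$, $E = 3$ for a mean; $E = 0.03$ for a proportion) are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist

def n_for_mean(sigma, margin, conf_level=0.95):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil((z * sigma / margin) ** 2)  # always round up

def n_for_proportion(margin, conf_level=0.95, p=0.5):
    """Smallest n for a proportion; p = 0.5 is the conservative default."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil(p * (1 - p) * (z / margin) ** 2)

print(n_for_mean(sigma=15, margin=3))   # 97
print(n_for_proportion(margin=0.03))    # 1068
```

The second result is the familiar "about 1,070 respondents" figure quoted for surveys with a margin of error of 3 percentage points at 95% confidence.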

2.5 Finite Population Correction Factor (FPCF)

When sampling without replacement from a finite population of size $N$, the variance of the sample mean (and other statistics) is reduced. The FPCF is applied to the standard error:
$SE_{FPCF}(\bar{X}) = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
The term $\sqrt{\frac{N-n}{N-1}}$ is the FPCF. It is approximately 1 and can be ignored if $n/N < 0.05$ (i.e., less than 5% of the population is sampled).
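
A quick numerical sketch of the correction, using hypothetical values $N = 1000$, $n = 100$, $\sigma = 20$ (here $n/N = 0.10$, so the correction is not negligible):

```python
from math import sqrt

def se_mean_fpc(sigma, n, N):
    """Standard error of the sample mean with the finite population
    correction, for sampling without replacement from N units."""
    if N < 2 or not 0 < n <= N:
        raise ValueError("need N >= 2 and 0 < n <= N")
    fpc = sqrt((N - n) / (N - 1))
    return sigma / sqrt(n) * fpc

uncorrected = 20 / sqrt(100)
corrected = se_mean_fpc(sigma=20, n=100, N=1000)
print(f"uncorrected SE: {uncorrected:.4f}")
print(f"corrected SE:   {corrected:.4f}")
```

When $n = N$ (a census) the corrected standard error is zero, since the sample mean then equals the population mean exactly.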

stateDiagram-v2
    direction LR
    Population: Population Size N
    Sample: Sample Size n

    state "Parameter Known" as Known_Sigma
    state "Parameter Unknown" as Unknown_Sigma

    Population --> Sample : "Random Sampling (SRS)"

    subgraph "Sampling Distribution of Mean"
        Known_Sigma --> Z_Dist : "σ known"
        Unknown_Sigma --> T_Dist : "σ unknown"
        Z_Dist : "Z ~ N(0,1)"
        T_Dist : "T ~ t(n-1)"
    end

    state "Estimation Process" as Estimation
    Sampling_Mean --> Estimation
    Sampling_Proportion --> Estimation
    Sampling_Variance --> Estimation

    subgraph "Estimation Types"
        Point_Estimate
        Confidence_Interval
    end

    Estimation --> Point_Estimate
    Estimation --> Confidence_Interval

    state "Sample Size Considerations" as Sample_Size
    Confidence_Interval --> Sample_Size

    Sample_Size --> "Margin of Error E"
    "Margin of Error E" --> "Confidence Level (1-α)"
    "Confidence Level (1-α)" --> "Required n"

    state "CLT Applicability" as CLT
    Sample --> CLT
    CLT --> "Approx. Normal Distribution (if n large)"

3. Technical Procedures & Applications

3.1 Procedure for Constructing a Confidence Interval for a Population Mean (Unknown $\sigma$)

This procedure outlines the steps for calculating a confidence interval for the population mean when the population standard deviation ($\sigma$) is unknown, relying on the sample standard deviation ($S$) and the t-distribution.

sequenceDiagram
    participant Analyst as "Statistical Analyst"
    participant Data as "Sample Data Points"
    participant Calcs as "Computational Engine"
    participant T_Table as "t-Distribution Table"

    Analyst->>Data: 1. Collect a random sample (x₁, ..., xₙ)
    activate Data
    Data-->>Analyst: Sample size n, observed values
    deactivate Data

    Analyst->>Calcs: 2. Calculate Sample Mean (X̄)
    Calcs-->>Analyst: X̄ = (Σxᵢ) / n
    Analyst->>Calcs: 3. Calculate Sample Standard Deviation (S)
    Calcs-->>Analyst: S = √[Σ(xᵢ - X̄)² / (n-1)]

    Analyst->>Analyst: 4. Specify Confidence Level (1-α)
    Analyst->>T_Table: 5. Determine Degrees of Freedom (df = n-1)
    Analyst->>T_Table: 6. Find Critical t-value (tₙ₋₁, α/2) for selected α and df
    T_Table-->>Analyst: tₙ₋₁, α/2 value

    Analyst->>Calcs: 7. Calculate Standard Error (SE)
    Calcs-->>Analyst: SE = S / √n
    Analyst->>Calcs: 8. Calculate Margin of Error (ME)
    Calcs-->>Analyst: ME = tₙ₋₁, α/2 * SE

    Analyst->>Analyst: 9. Construct Confidence Interval
    Analyst-->>Analyst: CI = X̄ ± ME
    Analyst-->>Analyst: Lower Bound = X̄ - ME
    Analyst-->>Analyst: Upper Bound = X̄ + ME

    Analyst->>Analyst: 10. Interpret the interval and conclusion
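
The same steps can be sketched in code. The data values below are hypothetical, and the critical value $t_{9, 0.025} = 2.262$ is assumed to be looked up from a t-table (step 6) rather than computed, since the Python standard library has no t-distribution.

```python
from math import sqrt

def t_confidence_interval(data, t_critical):
    """CI for mu with sigma unknown, following steps 2 to 9 above.
    t_critical is t_{n-1, alpha/2}, supplied from a table."""
    n = len(data)
    xbar = sum(data) / n                                    # step 2
    s = sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))  # step 3
    se = s / sqrt(n)                                        # step 7
    me = t_critical * se                                    # step 8
    return xbar - me, xbar + me                             # step 9

# Hypothetical sample of n = 10 measurements
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]
lower, upper = t_confidence_interval(data, t_critical=2.262)  # df = 9, 95%
print(f"95% CI for mu: ({lower:.3f}, {upper:.3f})")
```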

3.2 Maximum Likelihood Estimation Procedure for an Exponential Distribution Parameter

Consider an i.i.d. random sample $X_1, \ldots, X_n$ from an Exponential distribution with parameter $\lambda$, where $f(x|\lambda) = \lambda e^{-\lambda x}$ for $x \ge 0$ and $\lambda > 0$.

  1. Write down the likelihood function:
    $L(\lambda|\mathbf{x}) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^n e^{-\lambda \sum_{i=1}^{n} x_i}$

  2. Write down the log-likelihood function:
    $\ell(\lambda|\mathbf{x}) = \ln L(\lambda|\mathbf{x}) = \ln(\lambda^n e^{-\lambda \sum x_i}) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i$

  3. Calculate the first derivative with respect to $\lambda$ (the score function):
    $\frac{d\ell}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i$

  4. Set the derivative to zero and solve for $\lambda$ to find the MLE $\hat{\lambda}_{MLE}$:
    $\frac{n}{\hat{\lambda}_{MLE}} - \sum_{i=1}^{n} x_i = 0$
    $\frac{n}{\hat{\lambda}_{MLE}} = \sum_{i=1}^{n} x_i$
    $\hat{\lambda}_{MLE} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{X}}$

  5. Calculate the second derivative to confirm it's a maximum (optional but good practice):
    $\frac{d^2\ell}{d\lambda^2} = -\frac{n}{\lambda^2}$
    Since $n > 0$ and $\lambda > 0$, the second derivative is always negative, confirming that $\hat{\lambda}_{MLE}$ corresponds to a maximum.

This procedure yields $\hat{\lambda}_{MLE} = 1/\bar{X}$, indicating that the reciprocal of the sample mean is the maximum likelihood estimator for the rate parameter of an exponential distribution.
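
The closed-form result can be sanity-checked numerically. The sketch below simulates exponential data with an assumed true rate $\lambda = 2$, computes $\hat{\lambda}_{MLE} = n / \sum x_i$, and confirms that the log-likelihood is lower at nearby values of $\lambda$. Function names are illustrative.

```python
import random
from math import log

def exp_log_likelihood(lam, data):
    """l(lambda | x) = n ln(lambda) - lambda * sum(x_i)."""
    return len(data) * log(lam) - lam * sum(data)

def exp_mle(data):
    """Closed-form MLE of the exponential rate: n / sum(x_i) = 1 / x-bar."""
    return len(data) / sum(data)

rng = random.Random(5)
true_rate = 2.0  # assumed for the simulation
data = [rng.expovariate(true_rate) for _ in range(5000)]

lam_hat = exp_mle(data)
ll_at_mle = exp_log_likelihood(lam_hat, data)
ll_below = exp_log_likelihood(0.9 * lam_hat, data)
ll_above = exp_log_likelihood(1.1 * lam_hat, data)

print(f"MLE of lambda: {lam_hat:.3f} (true value {true_rate})")
print(f"log-likelihood is highest at the MLE: {ll_at_mle > max(ll_below, ll_above)}")
```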

4. Examiner's Breakdown

4.1 Comparative Analysis

| Feature | Point Estimation | Interval Estimation (Confidence Interval) |
| --- | --- | --- |
| Output Type | Single numerical value | Range of numerical values |
| Purpose | Provides a "best guess" for a population parameter | Provides a range of plausible values for a parameter, accounting for variability |
| Information Conveyed | Specific value but no measure of precision or uncertainty | Quantifies the uncertainty of the estimate through the margin of error and confidence level |
| Statistical Basis | Properties of estimators (unbiasedness, efficiency, consistency) | Sampling distributions, critical values (Z, t, χ²), standard errors |
| Interpretation | $\hat{\theta}$ is our best estimate for $\theta$. | We are $X\%$ confident that the interval contains $\theta$. (Frequentist view: $X\%$ of intervals constructed this way will contain $\theta$.) |
| Advantages | Simplicity, ease of calculation. Useful when a single summary is needed. | Provides a more complete picture of parameter uncertainty. Fundamental for decision-making. |
| Disadvantages | No information on reliability; almost certainly incorrect in practice for continuous parameters. | Can be misinterpreted (e.g., as probability of the interval containing the parameter). Width depends on $n$, $\sigma$, and confidence level. |
| Example | Sample mean $\bar{X}$ as an estimate for $\mu$. | $95\%$ CI for $\mu$: $(\bar{X} - E, \bar{X} + E)$. |

4.2 High-Yield Marking Keywords

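  1. Central Limit Theorem (CLT): States that the sampling distribution of the mean (or sum) of independent observations is approximately normal for large $n$, regardless of the shape of the population distribution, provided the population variance is finite.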
  2. Standard Error: Standard deviation of a sampling distribution of a statistic, quantifying its variability across samples.
  3. Unbiased Estimator: An estimator $\hat{\theta}$ where its expected value $E[\hat{\theta}]$ equals the true population parameter $\theta$.
  4. Confidence Level (1-$\alpha$): The long-run proportion of confidence intervals that would contain the true population parameter if the sampling process were repeated many times.
  5. Degrees of Freedom (df): The number of independent pieces of information used to calculate a statistic, relevant for t and chi-squared distributions (e.g., $n-1$ for one sample mean/variance).
  6. Maximum Likelihood Estimator (MLE): The parameter value that maximizes the likelihood of observing the given sample data.
  7. Sufficient Statistic: A statistic that captures all the information in the sample relevant to the estimation of a parameter.
  8. Cramér-Rao Lower Bound (CRLB): The theoretical minimum variance an unbiased estimator can achieve; it serves as the benchmark against which estimator efficiency is measured.

4.3 Trapdoor Mistakes

  1. Misinterpreting Confidence Intervals:

    • TRAP: Stating, "There is a 95% probability that the true population mean lies within this specific calculated interval."
    • CORRECT: "If we were to repeat this sampling process many times, 95% of the confidence intervals constructed would contain the true population mean." The probability applies to the method of interval construction, not to a single, already calculated interval. For a specific interval, the parameter is either in it or it isn't (probability is 0 or 1).
  2. Ignoring the Central Limit Theorem's Conditions or Replacing $\sigma$ with $S$ Incorrectly:

    • TRAP: Assuming a normal sampling distribution for $\bar{X}$ when $n$ is small (e.g., $n<30$) and the population distribution is non-normal (e.g., highly skewed). Or, using a Z-distribution for a CI for $\mu$ when $\sigma$ is unknown and $n$ is small, forcing the replacement of $\sigma$ with $S$.
    • CORRECT: If the population is non-normal and $n < 30$, the CLT approximation may be too poor to assume normality for $\bar{X}$. If $\sigma$ is unknown and the population is normal, use the t-distribution with $n-1$ degrees of freedom for any $n$; the choice matters most when $n$ is small. Using the Z-distribution with $S$ is justified only for sufficiently large $n$, where $t_{n-1} \approx Z$.
  3. Confusing Standard Deviation with Standard Error:

    • TRAP: Using $\sigma$ (population standard deviation) as the measure of variability for the sample mean, or dividing by $S$ without accounting for $n$ (i.e., not using $S/\sqrt{n}$).
    • CORRECT: $\sigma$ measures the spread of individual observations in the population. The standard error ($SE = \sigma/\sqrt{n}$ or $S/\sqrt{n}$) measures the spread of the sample means (or other statistics) in their sampling distribution. It accounts for the reduction in variability due to averaging multiple observations.
  4. Incorrectly Determining Sample Size 'n' (Magnitude and Rounding):

    • TRAP: Calculating a fractional sample size and rounding it down, or failing to use a conservative estimate for $p$ (e.g., $p=0.5$) when no prior information is available for proportion estimation.
    • CORRECT: Sample size calculations for desired precision (margin of error) and confidence level must always be rounded up to the next whole integer to ensure the specified conditions are met (e.g., $72.3$ implies $n=73$). For proportions, if $p$ is unknown, using $p=0.5$ in $n = p(1-p)\left(\frac{Z_{\alpha/2}}{E}\right)^2$ guarantees the largest possible sample size, thus ensuring the desired margin of error is achieved regardless of the true $p$.
