Interval Estimation for Population Mean (Sigma Unknown) - T-Distribution

StudyAI Editorial
Reviewed by StudyAI tutors


TL;DR

When the population standard deviation (σ) is unknown, especially with a small sample, you use the t-distribution to create an interval estimate for the population mean. This method relies on the sample standard deviation (s) and accounts for the increased uncertainty compared to when σ is known. As your sample size grows, the t-distribution's shape approaches that of the z-distribution.

1. The Mental Model

Imagine you're trying to guess the average height of all students at a university, but you don't know how spread out their heights usually are. Instead, you take a small sample and use its spread to help you estimate the average height for everyone, realizing your estimate will have a bit more wiggle room because you're working with less initial information.

2. The Core Material

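Interval estimation is about creating a range that you're confident contains the true population mean (μ). When you don't know the population standard deviation (σ), and especially when your sample size is small (generally n < 30), an interval built from the z-distribution is too narrow because it ignores the extra uncertainty from estimating σ. This is where the t-distribution comes in. The method assumes a random sample from a population that is at least approximately normal, an assumption that matters most when n is small.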

Why the t-Distribution?

The t-distribution is used when:
* The population standard deviation (σ) is unknown.
* You use the sample standard deviation (s) instead as an estimate for σ.
* The sample size is small (though it's technically applicable for any sample size when σ is unknown).

Like the z-distribution, it is bell-shaped and symmetric. However, the t-distribution:
* Is wider and has heavier tails than the z-distribution. This reflects the increased uncertainty when you're estimating σ from your sample.
* Approaches the z-distribution as your sample size increases. This is because with more data, your sample standard deviation (s) becomes a better estimate of the true population standard deviation (σ).
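The convergence toward the z-distribution can be checked numerically. The sketch below is one possible illustration using only Python's standard library: it swaps the usual table lookup for numerical integration of the t density (Simpson's rule) plus a bisection search. The helper names `t_pdf`, `upper_tail`, and `t_critical` are hypothetical, and the search assumes the critical value is below 50.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(x, df, steps=1000):
    """P(T > x) for x >= 0: 0.5 minus the Simpson's-rule area over [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    area = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - area * h / 3

def t_critical(alpha_half, df):
    """Bisection search for the t-value whose upper-tail area is alpha_half."""
    low, high = 0.0, 50.0  # assumption: the answer lies in this range
    for _ in range(50):
        mid = (low + high) / 2
        if upper_tail(mid, df) > alpha_half:
            low = mid   # tail area still too large, move right
        else:
            high = mid
    return (low + high) / 2

z = NormalDist().inv_cdf(0.975)  # z-value for a 95% interval
for df in (2, 5, 10, 24, 30, 100, 1000):
    t = t_critical(0.025, df)
    print(f"df={df:>4}  t={t:.3f}  t - z = {t - z:.3f}")
```

The gap `t - z` should shrink as df grows, and the df = 24 row should land close to the 2.064 table value used in the worked example.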

The General Form of an Interval Estimate

Regardless of whether σ is known or unknown, the general idea for an interval estimate of a population mean is:

Sample Mean ± Margin of Error

When σ is unknown, the interval estimate for μ is specifically:

$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$

Where:
* $\bar{x}$ is the sample mean.
* $t_{\alpha/2}$ is the t-value from the t-distribution table. This value depends on your desired confidence level and the degrees of freedom (df), which are n - 1 (sample size minus 1).
* s is the sample standard deviation.
* n is the sample size.
* $\frac{s}{\sqrt{n}}$ is the estimated standard error of the mean.

The $t_{\alpha/2}$ value replaces the $z_{\alpha/2}$ value used when σ is known.
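As a minimal sketch of the formula applied to raw data, the function below computes the sample mean, the sample standard deviation (with the n - 1 denominator), and the interval limits. The function name `t_interval` and the five data points are hypothetical, chosen only for illustration. The critical value 2.776 is the standard table entry for a 95% interval with df = 4.

```python
import math
import statistics

def t_interval(sample, t_crit):
    """Return (lower, upper) for x-bar +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)          # sample std dev, divides by n - 1
    margin = t_crit * s / math.sqrt(n)    # t times the estimated standard error
    return xbar - margin, xbar + margin

# Hypothetical sample of n = 5 observations, so df = 4.
data = [10.0, 12.0, 14.0, 16.0, 18.0]
lower, upper = t_interval(data, t_crit=2.776)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

Passing the t-value in as an argument keeps the sketch free of any lookup logic; in practice it would come from a t table or statistical software.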

How Confidence Levels Affect the Interval

The confidence level (like 90%, 95%, 98%) determines how wide your interval estimate will be. A higher confidence level means a wider interval, as you're trying to be more certain that the interval captures the true population mean.
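To see the effect of the confidence level in numbers, the sketch below holds the sample fixed and varies only the t-value. It borrows n = 25 and s = 80 from the worked example in Section 3. The three critical values are standard t-table entries for df = 24 and are supplied here as an assumption, not computed.

```python
import math

n, s = 25, 80.0
standard_error = s / math.sqrt(n)

# t_{alpha/2} for df = 24, read from a standard t table.
t_table = {0.90: 1.711, 0.95: 2.064, 0.98: 2.492}

margins = {level: t * standard_error for level, t in t_table.items()}
for level, margin in margins.items():
    print(f"{level:.0%} confidence: margin = {margin:.3f}, width = {2 * margin:.3f}")
```

With the same data, each step up in confidence level increases the margin of error and therefore the width of the interval.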

Here’s a diagram illustrating the decision-making process for choosing between a z-distribution or a t-distribution for interval estimation of a population mean:

graph TD
    A["Interval Estimation: Population Mean (μ)"] --> B{"Is Population Standard Deviation (σ) Known?"};
    B -- Yes --> C["Known Case: Use z-distribution"];
    C --> D["Interval Estimate: x̄ ± zα/2 * (σ/√n)"];
    B -- No --> E["Unknown Case: Use t-distribution"];
    E --> F["Use Sample Standard Deviation (s) to estimate σ"];
    F --> G["Interval Estimate: x̄ ± tα/2 * (s/√n)"];

Meaning of Confidence

When you state, "We are 95% confident that the mean rent per month is between $720 and $800," it means if you were to repeat the sampling and interval estimation process many times, about 95% of the intervals you construct would actually contain the true population mean. It doesn't mean there's a 95% chance that this specific interval contains the population mean; rather, it reflects the reliability of the method over many repetitions.
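The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a mean of 750 and a standard deviation of 80 (values invented for the simulation), each sample has n = 25, and the function name `coverage` is hypothetical. The t-value 2.064 is the table entry for 95% confidence with df = 24.

```python
import math
import random
import statistics

def coverage(true_mean, true_sd, n, t_crit, repetitions, seed=0):
    """Fraction of simulated t-intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        margin = t_crit * s / math.sqrt(n)
        if xbar - margin <= true_mean <= xbar + margin:
            hits += 1
    return hits / repetitions

rate = coverage(true_mean=750, true_sd=80, n=25, t_crit=2.064, repetitions=4000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The share should come out near 0.95. Any single interval either contains the true mean or it does not; the 95% describes the long-run behaviour of the procedure.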

3. Worked Example

Let's use the Apartment Rents example from your source material, but assume σ is unknown.

A reporter wants a 95% confidence interval for the mean monthly rent of one-bedroom apartments within a half-mile of campus. The reporter samples 25 apartments (n = 25) and finds the following:

  • Sample Mean (x̄): $750
  • Sample Standard Deviation (s): $80

Steps:

  1. Identify Knowns:

    • n = 25
    • x̄ = $750
    • s = $80
    • Confidence Level = 95%
    • Since σ is unknown and n < 30, we use the t-distribution.
  2. Determine Degrees of Freedom (df):

    • df = n - 1 = 25 - 1 = 24
  3. Find the t-value (tα/2):

    • For a 95% confidence level, α = 1 - 0.95 = 0.05.
    • So, α/2 = 0.025.
    • Look up $t_{\alpha/2}$ in a t-distribution table with df = 24 and α/2 = 0.025. You'd find $t_{0.025,\,24} = 2.064$.
  4. Calculate the Margin of Error (ME):

    • ME = tα/2 * (s/√n)
    • ME = 2.064 * (80 / √25)
    • ME = 2.064 * (80 / 5)
    • ME = 2.064 * 16
    • ME = 33.024
  5. Construct the Confidence Interval:

    • Interval = x̄ ± ME
    • Interval = 750 ± 33.024
    • Lower Limit = 750 - 33.024 = 716.976
    • Upper Limit = 750 + 33.024 = 783.024

Therefore, we are 95% confident that the mean rent per month for one-bedroom apartments within a half-mile of campus is between $716.98 and $783.02.
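The arithmetic in the steps above can be re-checked with a few lines of Python. This is only a verification sketch: the t-value is entered by hand from the table lookup in step 3, and the variable names are illustrative.

```python
import math

n = 25
xbar = 750.0
s = 80.0
t_crit = 2.064  # t_{0.025, 24}, taken from the table lookup in step 3

df = n - 1                          # step 2: degrees of freedom
standard_error = s / math.sqrt(n)   # 80 / 5
margin = t_crit * standard_error    # step 4: margin of error
lower = xbar - margin               # step 5: interval limits
upper = xbar + margin

print(f"df = {df}, ME = {margin:.3f}")
print(f"95% interval: ({lower:.2f}, {upper:.2f})")
```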

4. Key Takeaways

  • When the population standard deviation (σ) is unknown, you must use the t-distribution for interval estimation of the population mean.
  • The t-distribution accounts for the extra uncertainty from using the sample standard deviation (s) as an estimate for σ.
  • The key parameters for using the t-distribution are the confidence level and the degrees of freedom (n - 1).
  • As the sample size (n) increases, the t-distribution becomes more like the z-distribution.
  • For the same sample, a higher confidence level always leads to a wider interval because you're trying to be more certain.

Common Mistakes to Avoid:
* Using the z-distribution when σ is unknown: This understates the uncertainty and produces an interval that is too narrow, most noticeably with small samples.
* Forgetting to subtract 1 for degrees of freedom (df = n - 1): This will lead to an incorrect t-value.
* Confusing sample standard deviation (s) with population standard deviation (σ): Always check which one you have.
* Misinterpreting confidence: It describes how reliable the method is over repeated sampling, not the probability that one specific interval contains μ.
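The first mistake in the list can be made concrete with the apartment rent figures (n = 25, s = 80). The sketch below compares the margin of error from the t-value 2.064 with the one a z-value would give. The z-value comes from `statistics.NormalDist` in the standard library, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, s = 25, 80.0
standard_error = s / math.sqrt(n)

t_crit = 2.064                        # table value for 95% confidence, df = 24
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96

t_margin = t_crit * standard_error
z_margin = z_crit * standard_error

print(f"Margin with t: {t_margin:.2f}")
print(f"Margin with z: {z_margin:.2f}")
print(f"z interval is narrower by {2 * (t_margin - z_margin):.2f} in total width")
```

The z-based margin is smaller, so an interval built with it would claim more precision than a sample of 25 with an estimated standard deviation supports.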

5. Now Try It

A new coffee shop wants to estimate the average time a customer spends in their store. They take a random sample of 16 customers and find the average time is 12 minutes with a sample standard deviation of 3 minutes.

Provide a 90% confidence interval for the true mean time all customers spend in the coffee shop.

What success looks like: You should be able to state the lower and upper bounds of the confidence interval and interpret what that interval means in plain language.
