CAPE PENINSULA UNIVERSITY OF TECHNOLOGY STAT151X

Probability Distributions

StudyAI Editorial

Reviewed by StudyAI tutors

Published May 29, 2026 · Updated May 29, 2026

From the statistics 1B curriculum

1. Introduction & Overview

The Mental Model: A probability distribution is the blueprint of a random variable. It specifies every possible outcome and how likely each one is, and so fully characterizes the random variable's statistical behavior.
Significance:
- Inferential Statistics: Foundation for hypothesis testing and confidence interval estimation in varied disciplines such as biostatistics, econometrics, and quality control.
- Risk Management: Quantifying financial risk (e.g., Value at Risk via Extreme Value Distributions) and actuarial science.
- Machine Learning: Bayesian inference, generative models (e.g., Gaussian Mixture Models), and regularization techniques.
- Engineering & Physics: Modeling noise, system reliability, and quantum phenomena (e.g., Bose-Einstein or Fermi-Dirac distributions).
- Operations Research: Stochastic optimization and queuing theory (e.g., Erlang distribution).

mindmap
    root((Probability Distributions))
        Discrete Distributions
            Bernoulli(p)
                "P(X=k) = p^k (1-p)^(1-k)"
            Binomial(n, p)
                "P(X=k) = C(n,k) p^k (1-p)^(n-k)"
            Poisson(λ)
                "P(X=k) = (e^(-λ) λ^k) / k!"
            Geometric(p)
                "P(X=k) = (1-p)^(k-1) p"
            Hypergeometric(N, K, n)
                "P(X=k) = (C(K,k) C(N-K, n-k)) / C(N,n)"
        Continuous Distributions
            Uniform(a, b)
                "f(x) = 1/(b-a)"
            Normal(μ, σ^2)
                "f(x) = (1/(σ√(2π))) e^(-(x-μ)^2 / (2σ^2))"
            Exponential(λ)
                "f(x) = λ e^(-λx)"
            Gamma(α, β)
                "f(x) = (β^α / Γ(α)) x^(α-1) e^(-βx)"
            Beta(α, β)
                "f(x) = (x^(α-1) (1-x)^(β-1)) / B(α,β)"
            Chi-squared(k)
                "f(x) = (1 / (2^(k/2) Γ(k/2))) x^((k/2)-1) e^(-x/2)"
            Student's T(ν)
                "f(t) = Γ((ν+1)/2) / (√(νπ) Γ(ν/2)) (1 + t^2/ν)^(- (ν+1)/2)"
        Key Concepts
            Probability Mass Function (PMF)
            Probability Density Function (PDF)
            Cumulative Distribution Function (CDF)
            Expected Value (Mean)
            Variance
            Moment Generating Function (MGF)
            Characteristic Function (CF)
            Quantile Function
            Parameters
            Support

2. In-Depth Theory, Equations & Mechanisms

Probability distributions are formal mathematical descriptions of the likelihood of different outcomes for a random variable. They are classified into discrete and continuous types based on the nature of their random variable's support.

2.1 Discrete Probability Distributions

A discrete random variable $X$ has a countable number of possible outcomes. Its probability distribution is defined by a Probability Mass Function (PMF), $P(X=x)$, where $x$ is a specific outcome.

2.1.1 Bernoulli Distribution

Notation: $X \sim \text{Bernoulli}(p)$
Context: Models the outcome of a single trial with two possible results: "success" (value 1) or "failure" (value 0).
Parameters: $p$, the probability of success, where $p \in [0, 1]$.
Support: $x \in \{0, 1\}$.
PMF:
$P(X=x) = p^x (1-p)^{1-x}$ for $x \in \{0, 1\}$
Expected Value (Mean): $E[X] = p$
Variance: $\text{Var}(X) = p(1-p)$
Moment Generating Function (MGF): $M_X(t) = (1-p) + pe^t$ for all real $t$.
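As a brief worked check (added here for illustration), the mean and variance stated above can be recovered from the MGF by differentiating and evaluating at $t = 0$:
$M_X'(t) = pe^t \Rightarrow E[X] = M_X'(0) = p$
$M_X''(t) = pe^t \Rightarrow E[X^2] = M_X''(0) = p$
$\text{Var}(X) = E[X^2] - (E[X])^2 = p - p^2 = p(1-p)$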

2.1.2 Binomial Distribution

Notation: $X \sim \text{Binomial}(n, p)$
Context: Models the number of successes in a fixed number of $n$ independent Bernoulli trials, each with probability of success $p$.
Parameters: $n$, number of trials ($n \in \mathbb{N}^+$); $p$, probability of success ($p \in [0, 1]$).
Support: $x \in \{0, 1, \dots, n\}$.
PMF:
$P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}$ for $x \in \{0, 1, \dots, n\}$
where $\binom{n}{x} = \frac{n!}{x!(n-x)!}$ is the binomial coefficient.
Expected Value (Mean): $E[X] = np$
Variance: $\text{Var}(X) = np(1-p)$
MGF: $M_X(t) = ((1-p) + pe^t)^n$ for all real $t$.
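The PMF, mean, and variance above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` and the values $n = 10$, $p = 0.3$ are illustrative assumptions, not part of the notes.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0  # outside the support
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.3  # illustrative values
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                    # should be close to 1
mean = sum(x * q for x, q in enumerate(pmf))        # should be close to n*p
variance = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))  # n*p*(1-p)

print(total, mean, variance)
```

Summing over the whole support is feasible here because the support is finite; the same pattern does not apply directly to distributions with infinite support.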

2.1.3 Poisson Distribution

Notation: $X \sim \text{Poisson}(\lambda)$
Context: Models the number of events occurring in a fixed interval of time or space, given these events occur with a known constant mean rate $\lambda$ and independently of the time since the last event. Often used as an approximation to the Binomial distribution when $n$ is large and $p$ is small, such that $np = \lambda$.
Parameters: $\lambda$, the average rate of occurrence ($\lambda > 0$).
Support: $x \in \{0, 1, 2, \dots\}$.
PMF:
$P(X=x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for $x \in \{0, 1, 2, \dots\}$
Expected Value (Mean): $E[X] = \lambda$
Variance: $\text{Var}(X) = \lambda$
MGF: $M_X(t) = e^{\lambda(e^t - 1)}$ for all real $t$.
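To illustrate the Poisson approximation to the Binomial mentioned in the context above, the sketch below compares the two PMFs for large $n$ and small $p$ with $np = \lambda$. The helper names and the values $\lambda = 2$, $n = 1000$ are assumptions chosen for illustration.

```python
from math import comb, exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


lam = 2.0     # illustrative rate
n = 1000      # large number of trials
p = lam / n   # small success probability, so that n*p = lam

# Largest pointwise difference between the two PMFs over x = 0..10
max_gap = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11)
)
print(max_gap)  # small when n is large and p is small
```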

2.1.4 Geometric Distribution

Notation: $X \sim \text{Geometric}(p)$
Context: Models the number of Bernoulli trials required to achieve the first success.
Parameters: $p$, probability of success ($p \in (0, 1]$).
Support: $x \in \{1, 2, 3, \dots\}$. (Note: Some definitions start from $x=0$, representing the number of failures before the first success. Here, we use the "number of trials" definition.)
PMF:
$P(X=x) = (1-p)^{x-1} p$ for $x \in \{1, 2, 3, \dots\}$
Expected Value (Mean): $E[X] = 1/p$
Variance: $\text{Var}(X) = (1-p)/p^2$
MGF: $M_X(t) = \frac{pe^t}{1-(1-p)e^t}$ for $t < -\ln(1-p)$.
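As a rough numerical check of the mean and variance formulas under the "number of trials" definition, the sketch below truncates the infinite sums at a point where the tail is negligible. The value $p = 0.25$ and the cutoff of 500 terms are illustrative assumptions.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) for the number of trials up to and including the first success."""
    if x < 1:
        return 0.0  # the support starts at 1 under this definition
    return (1 - p) ** (x - 1) * p


p = 0.25       # illustrative success probability
cutoff = 500   # (1 - p)**500 is negligible, so the truncation error is tiny

xs = range(1, cutoff + 1)
mean = sum(x * geometric_pmf(x, p) for x in xs)                    # about 1/p
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)  # about (1-p)/p**2

print(mean, variance)
```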

2.2 Continuous Probability Distributions

A continuous random variable $X$ has an uncountable number of possible outcomes. Its probability distribution is defined by a Probability Density Function (PDF), $f(x)$, satisfying $f(x) \ge 0$ for all $x$ and $\int_{-\infty}^{\infty} f(x) dx = 1$. The probability of $X$ falling within an interval $[a, b]$ is $\int_a^b f(x) dx$. $P(X=x) = 0$ for any specific $x$.

2.2.1 Uniform Distribution

Notation: $X \sim \text{Uniform}(a, b)$
Context: Models situations where all outcomes between two specified bounds, $a$ and $b$, are equally likely.
Parameters: $a$, lower bound; $b$, upper bound ($a < b$).
Support: $x \in [a, b]$.
PDF:
$f(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
Expected Value (Mean): $E[X] = \frac{a+b}{2}$
Variance: $\text{Var}(X) = \frac{(b-a)^2}{12}$
MGF: $M_X(t) = \frac{e^{tb} - e^{ta}}{t(b-a)}$ for $t \ne 0$, and $M_X(0) = 1$.

2.2.2 Normal (Gaussian) Distribution

Notation: $X \sim N(\mu, \sigma^2)$
Context: The most ubiquitous distribution in natural and social sciences, crucial due to the Central Limit Theorem. Many natural phenomena, measurement errors, and sampling distributions approximate normality.
Parameters: $\mu$, mean; $\sigma^2$, variance ($\sigma > 0$).
Support: $x \in (-\infty, \infty)$.
PDF:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ for $x \in (-\infty, \infty)$
Expected Value (Mean): $E[X] = \mu$
Variance: $\text{Var}(X) = \sigma^2$
MGF: $M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$ for all real $t$.
Standard Normal Distribution: A special case where $\mu=0$ and $\sigma^2=1$, denoted $Z \sim N(0, 1)$. Its PDF is $\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ and its CDF is $\Phi(z)$. Any normal variable $X$ can be standardized by $Z = (X-\mu)/\sigma$.
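The standardization step can be sketched with `statistics.NormalDist` from the Python standard library. The values $\mu = 100$, $\sigma = 15$, and $x = 130$ are hypothetical and chosen only to give a round $z$-score.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical population parameters
x = 130.0                 # hypothetical observation

z = (x - mu) / sigma      # standardize: Z = (X - mu) / sigma

# P(X <= x) computed directly, and via the standard normal CDF Phi(z)
p_direct = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)

print(z, p_direct, p_standard)  # the two probabilities should agree
```

Note that `NormalDist` takes the standard deviation $\sigma$, not the variance $\sigma^2$, as its second argument.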

2.2.3 Exponential Distribution

Notation: $X \sim \text{Exp}(\lambda)$
Context: Models the time until an event occurs in a Poisson process, i.e., the time between two successive events. It exhibits the property of "memorylessness."
Parameters: $\lambda$, rate parameter ($\lambda > 0$). Some texts instead use the scale parameter $\beta = 1/\lambda$.
Support: $x \in [0, \infty)$.
PDF:
$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $0$ otherwise.
Expected Value (Mean): $E[X] = 1/\lambda$
Variance: $\text{Var}(X) = 1/\lambda^2$
MGF: $M_X(t) = \frac{\lambda}{\lambda-t}$ for $t < \lambda$.
Memorylessness Property: $P(X > s+t \mid X > s) = P(X > t)$ for $s, t \ge 0$.
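The memorylessness property follows from the survival function $P(X > x) = e^{-\lambda x}$. A minimal numerical sketch, with illustrative values for $\lambda$, $s$, and $t$:

```python
from math import exp


def exp_survival(x: float, lam: float) -> float:
    """P(X > x) for X ~ Exp(lam); equals 1 for negative x."""
    return exp(-lam * x) if x >= 0 else 1.0


lam, s, t = 0.5, 3.0, 2.0  # illustrative values

# P(X > s + t | X > s) = P(X > s + t) / P(X > s)
conditional = exp_survival(s + t, lam) / exp_survival(s, lam)
unconditional = exp_survival(t, lam)  # P(X > t)

print(conditional, unconditional)  # the two values should agree
```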

2.2.4 Gamma Distribution

Notation: $X \sim \text{Gamma}(\alpha, \beta)$
Context: A flexible distribution modeling waiting times. When $\alpha=1$, it reduces to the Exponential distribution with rate $\beta$. When $\alpha$ is a positive integer, it models the sum of $\alpha$ independent Exponential random variables, each with rate $\beta$ (the Erlang distribution).
Parameters: $\alpha$, shape parameter ($\alpha > 0$); $\beta$, rate parameter ($\beta > 0$). (Some texts use $\theta = 1/\beta$ as a scale parameter).
Support: $x \in [0, \infty)$.
PDF:
$f(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}$ for $x \ge 0$, and $0$ otherwise.
where $\Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t} dt$ is the Gamma function. For integer $\alpha$, $\Gamma(\alpha) = (\alpha-1)!$.
Expected Value (Mean): $E[X] = \alpha/\beta$
Variance: $\text{Var}(X) = \alpha/\beta^2$
MGF: $M_X(t) = \left(\frac{\beta}{\beta-t}\right)^\alpha$ for $t < \beta$.
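The sketch below evaluates the Gamma density in the rate parameterization, checks by simple midpoint integration that it integrates to roughly 1 with mean roughly $\alpha/\beta$, and confirms the $\alpha = 1$ reduction to the Exponential density. The helper names, the values $\alpha = 3$, $\beta = 2$, and the integration cutoff are assumptions for illustration.

```python
from math import exp, gamma


def gamma_pdf(x: float, alpha: float, beta: float) -> float:
    """Gamma(alpha, beta) density, with beta as the rate parameter."""
    if x <= 0:
        # x = 0 is skipped for simplicity: the density there depends on alpha,
        # and a single point carries zero probability anyway.
        return 0.0
    return beta**alpha / gamma(alpha) * x ** (alpha - 1) * exp(-beta * x)


def midpoint_integral(g, lo: float, hi: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


alpha, beta = 3.0, 2.0  # illustrative shape and rate
upper = 40.0            # the tail beyond this point is negligible here

area = midpoint_integral(lambda x: gamma_pdf(x, alpha, beta), 0.0, upper)
mean = midpoint_integral(lambda x: x * gamma_pdf(x, alpha, beta), 0.0, upper)

# With alpha = 1 the density should match beta * exp(-beta * x)
x0 = 1.3
reduction_gap = abs(gamma_pdf(x0, 1.0, beta) - beta * exp(-beta * x0))

print(area, mean, reduction_gap)
```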

2.2.5 Beta Distribution

Notation: $X \sim \text{Beta}(\alpha, \beta)$
Context: Models probabilities, proportions, or values bounded between 0 and 1. Highly flexible shape based on parameters. Often used as a prior distribution in Bayesian statistics.
Parameters: $\alpha > 0$, $\beta > 0$.
Support: $x \in [0, 1]$.
PDF:
$f(x) = \frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha, \beta)}$ for $0 \le x \le 1$, and $0$ otherwise.
where $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$ is the Beta function.
Expected Value (Mean): $E[X] = \frac{\alpha}{\alpha+\beta}$
Variance: $\text{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$
MGF: No simple closed-form MGF exists.

2.2.6 Key Distribution Properties & Relationships

radar-beta
    title "Comparative Properties of Distributions"
    series
        "Normal (μ=0, σ=1)"
        "Exponential (λ=1)"
        "Uniform (a=0, b=1)"
        "Poisson (λ=1)"
    data
        Skewness = 0, 2, 0, 1
        Kurtosis = 3, 9, 1.8, 4
        "Memoryless P" = 0, 1, 0, 0
        "Finite Support" = 0, 0, 1, 0
        "Countable Outcomes" = 0, 0, 0, 1
        "Approximates Binom" = 1, 0, 0, 1

Central Limit Theorem (CLT): The sampling distribution of the mean of a large number of independent, identically distributed random variables with finite variance approaches a normal distribution, regardless of the shape of the original population distribution. This is fundamental for parametric inference.
If $X_1, X_2, \dots, X_n$ are i.i.d. with $E[X_i] = \mu$ and $\text{Var}(X_i) = \sigma^2 < \infty$, then as $n \to \infty$,
$\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{D} N(0, \sigma^2)$ or $\bar{X}_n \approx N(\mu, \sigma^2/n)$.
Relationships:
- Bernoulli is a special case of Binomial (n=1).
- Exponential is a special case of Gamma ($\alpha=1$).
- Chi-squared distribution ($X \sim \chi^2_k$) is a special case of Gamma distribution where $\alpha = k/2$ and $\beta = 1/2$. It describes the sum of squares of $k$ independent standard normal random variables.
- Student's t-distribution ($X \sim t_\nu$) arises when estimating the mean of a normally distributed population when the sample size is small and the population standard deviation is unknown. It approaches the standard normal distribution as $\nu \to \infty$.
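- F-distribution arises as the ratio of two independent chi-squared random variables, each divided by its degrees of freedom. It is used in ANOVA and in tests comparing the variances of two normally distributed populations.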
- The Poisson distribution can be derived as the limit of the Binomial distribution under specific conditions ($n \to \infty, p \to 0, np \to \lambda$).
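The Central Limit Theorem stated above can be illustrated by simulation. The sketch below draws repeated samples from a skewed Exponential(1) population and looks at the distribution of the sample means; the seed, the sample size $n = 50$, and the 5000 repetitions are arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is reproducible

lam = 1.0     # Exp(1) population: mean 1, variance 1, clearly non-normal
n = 50        # size of each sample
reps = 5000   # number of simulated samples

# One sample mean per simulated sample
sample_means = [
    fmean([random.expovariate(lam) for _ in range(n)]) for _ in range(reps)
]

center = fmean(sample_means)    # should be near mu = 1
spread = pstdev(sample_means)   # should be near sigma / sqrt(n)
predicted_spread = 1.0 / sqrt(n)

print(center, spread, predicted_spread)
```

A histogram of `sample_means` would look approximately bell-shaped even though the underlying population is strongly right-skewed.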

2.3 Cumulative Distribution Function (CDF)

Definition: For any random variable $X$, the CDF, denoted $F_X(x)$, gives the probability that $X$ will take a value less than or equal to $x$.
Formula:
- Discrete: $F_X(x) = P(X \le x) = \sum_{t \le x} P(X=t)$
- Continuous: $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f(t) dt$
Properties:
- $0 \le F_X(x) \le 1$ for all $x$.
- $F_X(x)$ is non-decreasing: if $x_1 < x_2$, then $F_X(x_1) \le F_X(x_2)$.
- $\lim_{x \to -\infty} F_X(x) = 0$.
- $\lim_{x \to \infty} F_X(x) = 1$.
- For continuous distributions, $f(x) = \frac{d}{dx} F_X(x)$.
- $P(a < X \le b) = F_X(b) - F_X(a)$.
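For a discrete distribution, the CDF is a running sum of the PMF, and the interval rule $P(a < X \le b) = F_X(b) - F_X(a)$ can be checked directly. A minimal sketch using a Binomial(5, 0.5) variable as an illustrative example:

```python
from itertools import accumulate
from math import comb

n, p = 5, 0.5  # illustrative Binomial(5, 0.5)
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# cdf[x] = F_X(x) = P(X <= x), built by cumulative summation of the PMF
cdf = list(accumulate(pmf))

a, b = 1, 3
via_cdf = cdf[b] - cdf[a]           # P(a < X <= b) = F(b) - F(a)
via_pmf = sum(pmf[a + 1 : b + 1])   # P(X = 2) + P(X = 3)

print(cdf)
print(via_cdf, via_pmf)  # the two values should agree
```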

stateDiagram-v2
    direction LR
    type Discrete_State_Space { "Countable Outcomes" }
    type Continuous_State_Space { "Uncountable Outcomes" }

    "Random Variable" --> "Discrete RV" : Finite/Countable range
    "Random Variable" --> "Continuous RV" : Infinite/Uncountable range

    "Discrete RV" --> "PMF P(X=x)": "Probability Mass Function (P_x)"
    "Continuous RV" --> "PDF f(x)dx": "Probability Density Function (f_x)"

    "PMF P(X=x)" --> "CDF F(x) = ΣP(t)": "Calculated via summation"
    "PDF f(x)dx" --> "CDF F(x) = ∫f(t)dt": "Calculated via integration"

    "CDF F(x) = ΣP(t)" --> "Quantile Function Q(p)"
    "CDF F(x) = ∫f(t)dt" --> "Quantile Function Q(p)"

    subgraph Characteristic Properties
        "PMF P(X=x)" -- "P(X=x)>0"
        "PMF P(X=x)" -- "Σ P(X=x) = 1"
        "PDF f(x)dx" -- "f(x) >= 0"
        "PDF f(x)dx" -- "∫ f(x)dx = 1"
        "CDF F(x) = ΣP(t)" -- "Monotonically Non-Decreasing"
        "CDF F(x) = ∫f(t)dt" -- "Monotonically Non-Decreasing"
        "CDF F(x) = ΣP(t)" -- "Right-Continuous"
        "CDF F(x) = ∫f(t)dt" -- "Continuous"
    end

3. Technical Procedures & Applications

3.1 Parameter Estimation for a Normal Distribution (Maximum Likelihood Estimation)

This procedure estimates the mean ($\mu$) and variance ($\sigma^2$) of a normal distribution from a sample dataset $X_1, X_2, \dots, X_n$ using Maximum Likelihood Estimation (MLE). It is a standard procedure in statistical modeling.

sequenceDiagram
    participant Data_Acquisition as "Data Set (x1, ..., xn)"
    participant Analyst as "Statistical Analyst"
    participant PDF as "Normal PDF f(x|μ, σ²)"
    participant Likelihood as "Likelihood Function L(μ, σ²|x)"
    participant LogLikelihood as "Log-Likelihood ln(L)"
    participant Optimization as "Optimization Engine"
    participant MLE_Estimates as "MLE Estimates (μ̂, σ̂²)"

    Data_Acquisition->Analyst: "Provide sample data xi"
    Analyst->PDF: "Identify parametric model: Normal Distribution"
    PDF-->Analyst: "Retrieve PDF: (1/(σ√(2π))) e^(-(xi-μ)² / (2σ²))"

    Analyst->Likelihood: "Formulate Likelihood Function"
    Likelihood->Analyst: "L(μ, σ²|x) = Π f(xi|μ, σ²)"
    Analyst->LogLikelihood: "Take natural logarithm for simplification"
    LogLikelihood->Analyst: "ln(L) = Σ [-0.5 ln(2π) - ln(σ) - (xi-μ)² / (2σ²)]"

    Analyst->Optimization: "Compute Partial Derivatives w.r.t. μ and σ²"
    Optimization->Analyst: "d(lnL)/dμ = Σ (xi-μ)/σ²"
    Optimization->Analyst: "d(lnL)/d(σ²) = Σ [(xi-μ)²/(2σ⁴) - 1/(2σ²)]"

    Analyst->Optimization: "Set derivatives to zero and solve for μ̂, σ̂²"
    Optimization->MLE_Estimates: "∂lnL/∂μ = 0  => μ̂ = (1/n) Σ xi"
    Optimization->MLE_Estimates: "∂lnL/∂(σ²) = 0 => σ̂² = (1/n) Σ (xi - μ̂)²"

    MLE_Estimates->Analyst: "Provide Maximum Likelihood Estimators"
    Analyst->Analyst: "Interpret and utilize μ̂, σ̂²"

Detailed Step-by-Step Procedure:

Define the Probability Density Function (PDF): For a normal distribution, the PDF for a single observation $x_i$ is:
$f(x_i; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$
Formulate the Likelihood Function ($L$): Assuming $n$ independent observations $x_1, x_2, \dots, x_n$, the likelihood function is the product of the individual PDFs:
$L(\mu, \sigma^2 | x_1, \dots, x_n) = \prod_{i=1}^n f(x_i; \mu, \sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$
$L(\mu, \sigma^2 | \mathbf{x}) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2\right)$
Transform to Log-Likelihood Function ($\ln L$): To simplify differentiation, we take the natural logarithm of the likelihood function. Because $\ln$ is strictly increasing, $L$ and $\ln L$ are maximized at the same parameter values.
$\ln L(\mu, \sigma^2 | \mathbf{x}) = \ln\left((2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2\right)\right)$
$\ln L(\mu, \sigma^2 | \mathbf{x}) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2$
Compute Partial Derivatives:
- With respect to $\mu$:
  $\frac{\partial \ln L}{\partial \mu} = \frac{\partial}{\partial \mu} \left( -\frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2 \right)$
  $\frac{\partial \ln L}{\partial \mu} = -\frac{1}{2\sigma^2} \sum_{i=1}^n 2(x_i - \mu)(-1) = \frac{1}{\sigma^2} \sum_{i=1}^n (x_i - \mu)$
- With respect to $\sigma^2$ (treating it as a single parameter):
  $\frac{\partial \ln L}{\partial (\sigma^2)} = \frac{\partial}{\partial (\sigma^2)} \left( -\frac{n}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2 \right)$
  $\frac{\partial \ln L}{\partial (\sigma^2)} = -\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2} \sum_{i=1}^n (x_i - \mu)^2$
Set Derivatives to Zero and Solve:
- For $\mu$:
  $\frac{1}{\sigma^2} \sum_{i=1}^n (x_i - \mu) = 0$
  $\sum_{i=1}^n x_i - \sum_{i=1}^n \mu = 0$
  $\sum_{i=1}^n x_i - n\mu = 0$
  $n\mu = \sum_{i=1}^n x_i$
  $\hat{\mu}_{\text{MLE}} = \frac{1}{n}\sum_{i=1}^n x_i = \bar{X}$ (the sample mean)
- For $\sigma^2$ (substituting $\hat{\mu}$):
  $-\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2} \sum_{i=1}^n (x_i - \hat{\mu})^2 = 0$
  Multiply by $2(\sigma^2)^2$:
  $-n\sigma^2 + \sum_{i=1}^n (x_i - \hat{\mu})^2 = 0$
  $n\sigma^2 = \sum_{i=1}^n (x_i - \hat{\mu})^2$
  $\hat{\sigma}^2_{\text{MLE}} = \frac{1}{n}\sum_{i=1}^n (x_i - \hat{\mu})^2$ (the biased sample variance)
Note: The MLE of $\sigma^2$ is a biased estimator, since $E[\hat{\sigma}^2_{\text{MLE}}] = \frac{n-1}{n}\sigma^2$. The unbiased estimator is the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{X})^2$. However, $\hat{\sigma}^2_{\text{MLE}}$ is asymptotically unbiased and consistent.
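The closed-form estimators derived above can be computed in a few lines. The sketch below uses a small hypothetical sample, compares the results with `statistics.pvariance` (divisor $n$) and `statistics.variance` (divisor $n-1$), and evaluates the log-likelihood to confirm that the MLE values do at least as well as nearby alternatives. The data and function name are assumptions for illustration.

```python
from math import log, pi
from statistics import pvariance, variance

sample = [4.1, 5.3, 4.8, 6.0, 5.5, 4.7]  # hypothetical data
n = len(sample)

# Closed-form maximum likelihood estimates
mu_hat = sum(sample) / n
sigma2_mle = sum((x - mu_hat) ** 2 for x in sample) / n        # divisor n
s2_unbiased = sum((x - mu_hat) ** 2 for x in sample) / (n - 1)  # divisor n - 1


def log_likelihood(mu: float, sigma2: float, xs: list[float]) -> float:
    """Normal log-likelihood: -(n/2)ln(2*pi) - (n/2)ln(sigma2) - SS/(2*sigma2)."""
    m = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * m * log(2 * pi) - 0.5 * m * log(sigma2) - ss / (2 * sigma2)


best = log_likelihood(mu_hat, sigma2_mle, sample)
shifted_mean = log_likelihood(mu_hat + 0.1, sigma2_mle, sample)
other_variance = log_likelihood(mu_hat, s2_unbiased, sample)

print(mu_hat, sigma2_mle, s2_unbiased)
print(best >= shifted_mean, best >= other_variance)
```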

4. Examiner's Breakdown

4.1 Comparative Analysis

| Feature | Discrete Probability Distributions | Continuous Probability Distributions |
| --- | --- | --- |
| Output Type | Countable outcomes | Uncountable outcomes |
| Quantification | Probability Mass Function (PMF), $P(X=x)$ | Probability Density Function (PDF), $f(x)$ |
| $P(X=x)$ for specific $x$ | $P(X=x) > 0$ for allowed $x$ | $P(X=x) = 0$ for any specific $x$ |
| Sum/Integral | $\sum_x P(X=x) = 1$ | $\int_{-\infty}^{\infty} f(x) dx = 1$ |
| Probability over range | $\sum_{a \le x \le b} P(X=x)$ | $\int_a^b f(x) dx$ |
| CDF behavior | Step function | Continuous function |
| Key examples | Bernoulli, Binomial, Poisson, Geometric | Uniform, Normal, Exponential, Gamma |
| Derivative of CDF | Not directly applicable (step function) | $f(x) = d/dx F(x)$ if CDF is differentiable |
| Common use cases | Counting events, number of successes | Measuring quantities, time, rates |
| Moment Generating Function (MGF) | $M_X(t) = E[e^{tX}] = \sum_x e^{tx} P(X=x)$ | $M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f(x) dx$ |

4.2 High-Yield Marking Keywords

Probability Mass Function (PMF): Used exclusively for discrete random variables, listing probabilities for point outcomes.
Probability Density Function (PDF): Used exclusively for continuous random variables, where probability is determined by integrating over an interval.
Cumulative Distribution Function (CDF): $F_X(x) = P(X \le x)$, universally applicable and monotonically non-decreasing.
Memoryless Property: Characteristic of the Exponential and Geometric distributions, meaning the remaining waiting time is independent of the time already elapsed.
Central Limit Theorem (CLT): The sampling distribution of means or sums of independent, identically distributed random variables approaches a normal distribution as sample size $n \to \infty$.
Moment Generating Function (MGF): $M_X(t) = E[e^{tX}]$. When it exists in an open interval around $t=0$, it uniquely determines the distribution and simplifies finding moments.
Parameters (e.g., $\mu, \sigma^2, \lambda, n, p, \alpha, \beta$): Specific constants defining the shape and location of a distribution.
Support (of a distribution): The set of values a random variable can take: the points with non-zero probability in the discrete case, or the region where the density is positive in the continuous case.

4.3 Trapdoor Mistakes

Confusing PMF and PDF: Students often apply $\int f(x) dx = 1$ to discrete distributions or state $P(X=x) > 0$ for continuous distributions.
- Correct Answer: Discrete: Sum PMF to 1, $P(X=x)$ are point probabilities. Continuous: Integrate PDF to 1, $P(X=x)=0$, probability is over intervals.
Incorrectly Applying Normal Approximation: Assuming normality for binomial or Poisson without adequate conditions ($np \ge 5$ and $n(1-p) \ge 5$ for binomial; $\lambda \ge 10$ for Poisson) or without continuity correction.
- Correct Answer: State and verify conditions for approximation. Apply continuity correction (e.g., $P(X \le k) \approx P(X_{Normal} \le k+0.5)$ for discrete to continuous conversion).
Misinterpreting MGF properties: Confusing $M_X(t)$ directly with moments or failing to realize its uniqueness property.
- Correct Answer: State that $E[X^k] = M_X^{(k)}(0)$, the $k$-th derivative of MGF evaluated at $t=0$. Explicitly state MGFs uniquely determine the distribution.
Ignoring Support Conditions: Calculating probabilities or moments outside the defined support of a distribution (e.g., negative values for Exponential, values > 1 for Beta).
- Correct Answer: Always explicitly state the distribution's support and ensure all calculations are bounded within it. For probabilities outside, they are 0.
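To make the continuity correction from the normal approximation item above concrete, the sketch below compares the exact Binomial probability $P(X \le k)$ with the normal approximation computed with and without the $+0.5$ correction. The values $n = 40$, $p = 0.5$, $k = 22$ are illustrative and satisfy $np \ge 5$ and $n(1-p) \ge 5$.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 40, 0.5, 22  # illustrative values; n*p = 20 and n*(1-p) = 20

# Exact P(X <= k) from the Binomial PMF
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

# Normal approximation with matching mean n*p and variance n*p*(1-p)
approx = NormalDist(n * p, sqrt(n * p * (1 - p)))
with_correction = approx.cdf(k + 0.5)   # P(X_Normal <= k + 0.5)
without_correction = approx.cdf(k)      # correction omitted

print(exact, with_correction, without_correction)
```

In this example the corrected value lands much closer to the exact probability than the uncorrected one.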
