CAPE PENINSULA UNIVERSITY OF TECHNOLOGY STAT151X

Simple Linear Regression and Correlation

StudyAI Editorial

Reviewed by StudyAI tutors

Published May 29, 2026 · Updated May 29, 2026

From the statistics 1B curriculum


1. Introduction & Overview

The Mental Model: Imagine fitting the trajectory of a ballistic missile's flight path with a precisely defined parabolic equation, where minute variations in initial velocity and launch angle dictate its exact landing coordinates, offering a predictive model of its impact based on observable, continuous input parameters.
Significance:
- Financial Forecasting: Predicting stock prices, commodity futures, or economic indicators based on historical data and related variables (e.g., GDP, interest rates).
- Biomedical Research: Modeling drug dosage response curves (e.g., concentration of drug vs. physiological effect) or correlating genetic markers with disease susceptibility.
- Engineering Diagnostics: Predicting material fatigue life based on stress cycles, or estimating energy consumption from ambient temperature and operational load.
- Environmental Science: Relating pollutant concentrations to emission sources, or predicting agricultural yields based on rainfall and fertilizer application.
- Quality Control: Establishing relationships between manufacturing process parameters (e.g., temperature, pressure) and product quality metrics (e.g., tensile strength, purity).

mindmap
  root((Simple Linear Regression & Correlation))
    "Fundamentals"
      "Deterministic vs. Stochastic"
      "Population vs. Sample Regression Function"
      "Assumptions (Gauss-Markov)"
    "Regression Analysis"
      "Model Specification"
        "Y_i = beta_0 + beta_1 * X_i + epsilon_i"
      "Parameter Estimation (OLS)"
        "Normal Equations"
        "Beta hats"
      "Goodness-of-Fit"
        "R-squared"
        "Standard Error of Regression"
    "Correlation Analysis"
      "Pearson Product-Moment Coefficient (r)"
      "Properties of r"
      "Covariance"
    "Inference"
      "Hypothesis Testing (t-tests, F-tests)"
      "Confidence Intervals"
      "Prediction Intervals"
    "Diagnostics"
      "Residual Analysis"
        "Homoscedasticity"
        "Normality"
        "Independence"
      "Outliers & Influential Points"

2. In-Depth Theory, Equations & Mechanisms

Simple Linear Regression (SLR) models the relationship between two continuous quantitative variables: a dependent variable, $Y$, and an independent variable, $X$. This relationship is assumed to be linear in its parameters. Correlation quantifies the strength and direction of the linear association between these variables.

2.1 The Simple Linear Regression Model

The population regression function (PRF) describes the true, unknown relationship:
$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
Where:
* $Y_i$: The $i$-th observation of the dependent variable.
* $X_i$: The $i$-th observation of the independent variable.
* $\beta_0$: The population Y-intercept, representing the expected value of $Y$ when $X=0$.
* $\beta_1$: The population slope coefficient, representing the expected change in $Y$ for a one-unit change in $X$.
* $\epsilon_i$: The $i$-th error term (or disturbance), representing all unobserved factors affecting $Y$ and the inherent randomness in the relationship. $\epsilon_i$ is a random variable.
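
To make the role of the error term concrete, the sketch below simulates observations from a population regression function. The parameter values (`beta0 = 2.0`, `beta1 = 0.5`, `sigma = 1.0`), the seed, and the helper name `simulate_prf` are illustrative assumptions, not values taken from these notes.

```python
import random

def simulate_prf(xs, beta0, beta1, sigma, seed=0):
    """Draw Y_i = beta0 + beta1 * X_i + eps_i with eps_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

xs = [float(x) for x in range(1, 11)]
ys = simulate_prf(xs, beta0=2.0, beta1=0.5, sigma=1.0)

# Compare the deterministic part E(Y | X) with each simulated observation;
# the gap between the two columns is the realised error term.
for x, y in zip(xs, ys):
    print(f"X={x:4.1f}  E(Y|X)={2.0 + 0.5 * x:5.2f}  Y={y:5.2f}")
```

Setting `sigma=0.0` removes the error term, so every simulated point falls exactly on the line $\beta_0 + \beta_1 X_i$.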

2.2 Assumptions of the Classical Linear Regression Model (CLRM) for OLS Estimation

The validity and efficiency of Ordinary Least Squares (OLS) estimators depend critically on these assumptions (Gauss-Markov assumptions):
1. Linearity in Parameters: The model is linear in the coefficients $\beta_0$ and $\beta_1$.
* Equation: $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
2. Random Sampling: The data $(X_i, Y_i)$ are a random sample from the population. This ensures the observations are independent.
3. No Perfect Collinearity of $X$: The independent variable $X$ must exhibit some variation in the sample (i.e., $X_i$ values are not all identical). If $Var(X) = 0$, $\beta_1$ is undefined.
* Condition: $\sum_{i=1}^{n} (X_i - \bar{X})^2 > 0$
4. Zero Conditional Mean of Error Term: The expected value of the error term, conditional on $X$, is zero. This implies that $X$ is exogenous; it is not correlated with the error term.
* Equation: $E(\epsilon_i | X_i) = 0$ for all $i$.
* Direct Implication: $E(Y_i | X_i) = \beta_0 + \beta_1 X_i$. This is the PRF.
5. Homoscedasticity (Constant Variance of Error Term): The variance of the error term, conditional on $X$, is constant for all observations.
* Equation: $Var(\epsilon_i | X_i) = \sigma^2$ (a constant) for all $i$.
* Violation is called heteroscedasticity.
6. No Autocorrelation (No Serial Correlation): The error terms for different observations are uncorrelated.
* Equation: $Cov(\epsilon_i, \epsilon_j | X_i, X_j) = 0$ for $i \neq j$.
7. Normality of Error Term (for Inference): The error terms are normally distributed. This assumption is crucial for hypothesis testing and constructing confidence intervals, particularly in small samples. For large samples, the Central Limit Theorem helps ensure estimators are approximately normally distributed even if errors are not.
* Equation: $\epsilon_i \sim N(0, \sigma^2)$

2.3 Ordinary Least Squares (OLS) Estimation

The objective of OLS is to find the sample regression function (SRF):
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
Where $\hat{\beta}_0$ and $\hat{\beta}_1$ are the OLS estimators of $\beta_0$ and $\beta_1$, respectively. The "hat" denotes an estimated value.
The OLS principle minimizes the sum of squared residuals (SSR):
$SSR = \sum_{i=1}^{n} \hat{\epsilon}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_i))^2$

To find $\hat{\beta}_0$ and $\hat{\beta}_1$, we take partial derivatives of $SSR$ with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$, set them to zero, and solve the resulting system of "normal equations."

$\frac{\partial SSR}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$
$\frac{\partial SSR}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$

Solving these equations yields:
$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{Cov(X, Y)}{Var(X)}$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

Where:
* $\bar{X} = \frac{1}{n} \sum X_i$ is the sample mean of $X$.
* $\bar{Y} = \frac{1}{n} \sum Y_i$ is the sample mean of $Y$.
* $Cov(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$ is the sample covariance.
* $Var(X) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ is the sample variance.
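
As a minimal sketch of the closed-form solution, the code below computes the slope and intercept from the deviation sums and then checks that the residuals satisfy both normal equations. The five-point dataset and the function name `ols_fit` are hypothetical and chosen only to keep the arithmetic easy to follow by hand.

```python
def ols_fit(xs, ys):
    """Return (b0, b1), the OLS intercept and slope for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        # Assumption 3 fails: X has no variation, so the slope is undefined.
        raise ValueError("X has no variation; the slope is undefined")
    b1 = s_xy / s_xx           # slope: S_XY / S_XX
    b0 = y_bar - b1 * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = ols_fit(xs, ys)
print(f"intercept = {b0:.4f}, slope = {b1:.4f}")

# Normal equations: residuals sum to zero and are orthogonal to X.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
sum_resid = sum(residuals)
sum_x_resid = sum(x * e for x, e in zip(xs, residuals))
print(f"sum of residuals = {sum_resid:.2e}, sum of X * residual = {sum_x_resid:.2e}")
```

Both printed sums should be zero up to floating-point rounding, which is exactly what the two first-order conditions require.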

2.4 Properties of OLS Estimators (Gauss-Markov Theorem)

Under assumptions 1-6 (the Gauss-Markov assumptions; normality, assumption 7, is not required), the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are the Best Linear Unbiased Estimators (BLUE).
* Linear: They are linear functions of the observed $Y_i$ values.
* Unbiased: Their expected values are equal to the true population parameters: $E(\hat{\beta}_0) = \beta_0$ and $E(\hat{\beta}_1) = \beta_1$.
* Best: They have the minimum variance among all linear unbiased estimators.
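
Unbiasedness is a statement about repeated sampling, which a small Monte Carlo experiment can illustrate. The sketch below repeatedly simulates data from a known line and averages the slope estimates. The true parameters, error standard deviation, number of replications, seed, and function names are all assumptions made for illustration.

```python
import random

def ols_slope(xs, ys):
    """OLS slope estimate S_XY / S_XX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

def mean_slope_estimate(beta0, beta1, sigma, xs, n_reps, seed=0):
    """Average the OLS slope over n_reps samples simulated from the same PRF."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        # Errors are i.i.d. N(0, sigma^2), so the CLRM assumptions hold by construction.
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        total += ols_slope(xs, ys)
    return total / n_reps

xs = [float(x) for x in range(1, 21)]
avg_b1 = mean_slope_estimate(beta0=1.0, beta1=0.5, sigma=2.0, xs=xs, n_reps=5000)
print(f"true slope = 0.5, average estimate over 5000 samples = {avg_b1:.4f}")
```

Individual estimates scatter around the true slope, but their average should land close to 0.5. A simulation of this kind illustrates unbiasedness; it does not prove it.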

2.5 Goodness-of-Fit: R-squared ($R^2$) and Standard Error of the Regression ($s_e$)

Total Sum of Squares (TSS): Measures the total variation in the dependent variable.
$TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$
Explained Sum of Squares (ESS): Measures the variation in $Y$ explained by the regression model.
$ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
Residual Sum of Squares (RSS): Measures the unexplained variation in $Y$ (sum of squared residuals).
$RSS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\epsilon}_i^2$

Crucially, $TSS = ESS + RSS$.

Coefficient of Determination ($R^2$): Represents the proportion of the total variation in $Y$ that is explained by the independent variable $X$.
$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$
- Properties: $0 \le R^2 \le 1$.
- An $R^2$ of 0 means the model explains none of the variation in $Y$. An $R^2$ of 1 means the model explains all the variation in $Y$.
Standard Error of the Regression ($s_e$): An estimate of the standard deviation of the error term ($\sigma$). It measures the average distance that the observed values fall from the regression line.
$s_e = \sqrt{\frac{RSS}{n-k}} = \sqrt{\frac{\sum_{i=1}^{n} \hat{\epsilon}_i^2}{n-2}}$
(where $k=2$ for SLR, as there are 2 parameters: $\beta_0, \beta_1$)
- Also known as the Root Mean Squared Error (RMSE).
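
The decomposition and the two fit measures can be checked numerically. The sketch below uses a small hypothetical dataset and a helper named `goodness_of_fit` (both assumptions for illustration) to compute TSS, ESS, RSS, $R^2$, and $s_e$ directly from their definitions.

```python
import math

def goodness_of_fit(xs, ys):
    """Return TSS, ESS, RSS, R^2 and s_e for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]

    tss = sum((y - y_bar) ** 2 for y in ys)              # total variation in Y
    ess = sum((f - y_bar) ** 2 for f in fitted)          # variation explained by the line
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
    r2 = 1 - rss / tss
    s_e = math.sqrt(rss / (n - 2))                       # n - 2: two estimated parameters
    return {"TSS": tss, "ESS": ess, "RSS": rss, "R2": r2, "s_e": s_e}

# Hypothetical data for illustration only.
stats = goodness_of_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

For this dataset, TSS should equal ESS + RSS up to rounding, and $R^2$ computed as $1 - RSS/TSS$ should agree with $ESS/TSS$.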

2.6 Simple Linear Correlation: Pearson Product-Moment Correlation Coefficient ($r$)

The Pearson correlation coefficient quantifies the linear association between two variables $X$ and $Y$.
$r = \frac{Cov(X, Y)}{s_X s_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$

Properties of $r$:
1. Range: $-1 \le r \le 1$.
2. Sign: Indicates the direction of the linear relationship (positive for direct, negative for inverse).
3. Magnitude: Indicates the strength of the linear relationship (closer to $\pm 1$ indicates stronger).
4. Symmetry: $r_{XY} = r_{YX}$.
5. Scale Invariance: $r$ is unaffected by changes in the origin or (positive) scale of measurement of either variable; multiplying one variable by a negative constant only flips the sign of $r$.
6. $r^2 = R^2$ in Simple Linear Regression. This relationship is specific to SLR.
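
Several of these properties can be verified numerically. The following sketch, which assumes a hypothetical five-point dataset and a helper named `pearson_r`, checks symmetry, invariance to a positive linear rescaling, the sign flip under a negative rescaling, and the identity $r^2 = R^2$.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: S_XY / sqrt(S_XX * S_YY)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_swapped = pearson_r(ys, xs)                                    # symmetry
r_rescaled = pearson_r([10 * x + 3 for x in xs],
                       [0.5 * y - 1 for y in ys])                # new origin and positive scale
r_negated = pearson_r([-x for x in xs], ys)                      # negative scale flips the sign

# R^2 from regressing Y on X, for comparison with r^2.
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar
rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
tss = sum((y - y_bar) ** 2 for y in ys)
r_squared_regression = 1 - rss / tss

print(f"r = {r:.4f}, swapped = {r_swapped:.4f}, rescaled = {r_rescaled:.4f}")
print(f"negated X gives r = {r_negated:.4f}")
print(f"r^2 = {r * r:.4f}, regression R^2 = {r_squared_regression:.4f}")
```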

radar-beta
    title OLS Estimator Properties Matrix
    series
        name "Unbiasedness"
        data [100, 70, 85, 90, 60]
    series
        name "Efficiency (Minimum Variance)"
        data [75, 100, 80, 70, 95]
    series
        name "Consistency"
        data [90, 80, 100, 85, 75]
    series
        name "Distributional Normality (asymptotic)"
        data [60, 65, 70, 100, 80]
    labels
        "CLRM Assumptions"
        "Robust Standard Errors (Heteroskedasticity-consistent)"
        "Large Sample Size (n -> infinity)"
        "Normality of Errors"
        "Absence of Outliers"

The radar chart above visualizes how key properties of OLS estimators, essential for valid inference, are influenced by various conditions. For instance, "Unbiasedness" is largely dependent on CLRM assumptions, especially $E(\epsilon_i|X_i)=0$. "Efficiency" (minimum variance) is maximally attained under all CLRM assumptions (BLUE property). "Consistency" (estimators converging to true parameters as $n \to \infty$) is a large-sample property, robust to some CLRM violations. "Distributional Normality" for hypothesis testing is directly enhanced by the normality of errors or by large sample sizes (Central Limit Theorem). "Absence of Outliers" impacts all properties, as outliers can bias estimators and inflate variance.

3. Technical Procedures & Applications

3.1 Procedure for Conducting a Simple Linear Regression Analysis

This procedure outlines the steps from data collection to interpretation and diagnostic checking, emphasizing rigorous statistical practice.

sequenceDiagram
    participant Analyst as "Statistical Analyst"
    participant Data as "Raw Data Set (X, Y)"
    participant Model as "Regression Model (Y = β0 + β1X + ε)"
    participant Software as "Statistical Software (R, Python, SAS)"
    participant Report as "Analysis Report"

    Analyst->Data: 1. Acquire raw data (n observations)
    Analyst->Analyst: 2. Visualize data (Scatter plot of Y vs. X)
    Note over Analyst: Identify potential linearity, outliers, heteroscedasticity.
    Analyst->Software: 3. Specify OLS model formula
    Software->Model: 4. Estimate parameters (β̂₀, β̂₁)
    Note over Software: Applies Normal Equations using matrix algebra: <br/>β̂ = (X'X)⁻¹X'Y
    Software->Model: 5. Calculate residuals (ε̂ᵢ = Yᵢ - Ŷᵢ)
    Software->Model: 6. Calculate RSS, ESS, TSS, R²
    Software->Model: 7. Compute standard errors for β̂₀, β̂₁ (SE(β̂₀), SE(β̂₁))
    Software->Model: 8. Calculate t-statistics for β̂₀, β̂₁
    Analyst->Software: 9. Request diagnostic plots
    Software-->Analyst: 10. Generate Residuals vs. Fitted, Normal Q-Q, Scale-Location plots
    Analyst->Analyst: 11. Interpret model coefficients: β̂₀, β̂₁
    Note over Analyst: β̂₁ represents the estimated average change in Y for a one-unit increase in X.
    Analyst->Analyst: 12. Evaluate goodness-of-fit: R², s_e
    Analyst->Analyst: 13. Perform hypothesis tests for significance (t-tests for β̂₀, β̂₁)
    Note over Analyst: H₀: β₁ = 0 vs. H₁: β₁ ≠ 0. Compare p-value to α.
    Analyst->Analyst: 14. Construct confidence intervals for β₀, β₁
    Note over Analyst: CI for β₁: β̂₁ ± t(α/2, n-2) * SE(β̂₁)
    Analyst->Analyst: 15. Assess model assumptions using diagnostic plots
    Analyst->Analyst: Consistency Check (Homoscedasticity, Normality, Independence)
    Analyst->Report: 16. Compile results, interpretations, and diagnostics.
    Analyst->Report: 17. Formulate predictions for new X values (point and interval forecasts).

3.2 Detailed Calculation of Key Statistics

Given a dataset $(X_i, Y_i)$ for $i=1, \dots, n$:

Sample Means:
$\bar{X} = \frac{1}{n} \sum X_i$
$\bar{Y} = \frac{1}{n} \sum Y_i$
Sums of Squares and Cross-Products (dividing by $n-1$ gives the sample variances and covariance):
$S_{XX} = \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum X_i^2 - \frac{(\sum X_i)^2}{n}$
$S_{YY} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum Y_i^2 - \frac{(\sum Y_i)^2}{n}$
$S_{XY} = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \frac{(\sum X_i)(\sum Y_i)}{n}$
OLS Estimators:
$\hat{\beta}_1 = \frac{S_{XY}}{S_{XX}}$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
Predicted Values and Residuals:
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
$\hat{\epsilon}_i = Y_i - \hat{Y}_i$
Sums of Squares:
$RSS = \sum_{i=1}^{n} \hat{\epsilon}_i^2 = S_{YY} - \hat{\beta}_1 S_{XY}$ (This is an important computational shortcut)
$TSS = S_{YY}$
$ESS = TSS - RSS = \hat{\beta}_1 S_{XY}$
Coefficient of Determination:
$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$
Standard Error of the Regression:
$s_e = \sqrt{\frac{RSS}{n-2}}$
Standard Errors of the Estimators:
$SE(\hat{\beta}_1) = \frac{s_e}{\sqrt{S_{XX}}}$
$SE(\hat{\beta}_0) = s_e \sqrt{\frac{1}{n} + \frac{\bar{X}^2}{S_{XX}}}$
Test Statistics for Hypothesis Testing:
For $H_0: \beta_1 = 0$: $t_{\hat{\beta}_1} = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)}$ (follows a $t$-distribution with $n-2$ degrees of freedom)
For $H_0: \beta_0 = 0$: $t_{\hat{\beta}_0} = \frac{\hat{\beta}_0}{SE(\hat{\beta}_0)}$ (follows a $t$-distribution with $n-2$ degrees of freedom)
Pearson Correlation Coefficient:
$r = \frac{S_{XY}}{\sqrt{S_{XX} S_{YY}}}$
Note: $r^2 = R^2$ for SLR.
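
The whole calculation chain can be followed in a few lines using the raw-sum shortcuts. The sketch below is a worked example on a hypothetical five-point dataset; the critical value 3.182 is the tabulated two-sided 5% value of the $t$-distribution with 3 degrees of freedom and is hard-coded as an assumption rather than computed.

```python
import math

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Raw sums, as they would be accumulated in an exam table.
sum_x, sum_y = sum(xs), sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Shortcut formulas for the corrected sums of squares and cross-products.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b1 = s_xy / s_xx
b0 = sum_y / n - b1 * sum_x / n

rss = s_yy - b1 * s_xy              # computational shortcut for RSS
s_e = math.sqrt(rss / (n - 2))

se_b1 = s_e / math.sqrt(s_xx)
se_b0 = s_e * math.sqrt(1 / n + (sum_x / n) ** 2 / s_xx)
t_b1 = b1 / se_b1
t_b0 = b0 / se_b0

T_CRIT = 3.182  # assumed table value: t(0.025, df = n - 2 = 3)
ci_b1 = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"S_XX={s_xx:.1f}  S_YY={s_yy:.1f}  S_XY={s_xy:.1f}")
print(f"b1={b1:.4f}  b0={b0:.4f}  RSS={rss:.4f}  s_e={s_e:.4f}")
print(f"SE(b1)={se_b1:.4f}  t={t_b1:.4f}   SE(b0)={se_b0:.4f}  t={t_b0:.4f}")
print(f"95% CI for beta1: ({ci_b1[0]:.4f}, {ci_b1[1]:.4f})   r={r:.4f}")
```

With only five observations, $|t_{\hat{\beta}_1}|$ falls below the critical value and the interval for $\beta_1$ contains zero, so $H_0: \beta_1 = 0$ would not be rejected at the 5% level for this illustrative dataset.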

3.3 Prediction Intervals versus Confidence Intervals

Confidence Interval for the Mean Response $E(Y|X_0)$: Provides an interval estimate for the average value of $Y$ for a given $X_0$.
$\hat{Y}_0 \pm t_{(n-2, \alpha/2)} \cdot s_e \sqrt{\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{S_{XX}}}$
Prediction Interval for a New Observation $Y_0$: Provides an interval estimate for a single future observation $Y_0$ for a given $X_0$. It is wider than the confidence interval for the mean response because it accounts for both the uncertainty in estimating the mean and the inherent variability of individual observations.
$\hat{Y}_0 \pm t_{(n-2, \alpha/2)} \cdot s_e \sqrt{1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{S_{XX}}}$
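
The difference in width is easiest to see side by side. The sketch below computes both half-widths at a chosen $X_0$ for a hypothetical dataset. The function name `interval_half_widths`, the variable name `leverage`, the choice $X_0 = 4$, and the hard-coded table value 3.182 (two-sided 5%, 3 degrees of freedom) are assumptions for illustration.

```python
import math

def interval_half_widths(xs, ys, x0, t_crit):
    """Return (y0_hat, CI half-width for the mean response, PI half-width for a new Y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(rss / (n - 2))

    y0_hat = b0 + b1 * x0
    # Term shared by both intervals: grows as x0 moves away from x_bar.
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
    ci_half = t_crit * s_e * math.sqrt(leverage)       # uncertainty in the estimated mean
    pi_half = t_crit * s_e * math.sqrt(1 + leverage)   # adds the variance of one new error
    return y0_hat, ci_half, pi_half

# Hypothetical data; t_crit is the assumed table value t(0.025, df = 3).
y0_hat, ci_half, pi_half = interval_half_widths(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5], x0=4, t_crit=3.182
)
print(f"point prediction at X0=4: {y0_hat:.4f}")
print(f"95% CI for mean response: {y0_hat - ci_half:.4f} to {y0_hat + ci_half:.4f}")
print(f"95% PI for new observation: {y0_hat - pi_half:.4f} to {y0_hat + pi_half:.4f}")
```

Because the only difference is the extra `1 +` under the square root, the prediction interval is wider than the confidence interval at every $X_0$, and both are narrowest at $X_0 = \bar{X}$.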

3.4 Conditions During Application

Data Type: Both X and Y must be quantitative and continuous or nearly continuous. For categorical data, specific transformations or different regression models are required.
Sample Size: Larger samples (a common rule of thumb is $n$ of at least 20 to 30) are preferred so that the asymptotic properties of OLS estimators hold and the Central Limit Theorem gives approximate normality. For small $n$, the normality of errors assumption becomes critical.
Absence of Extreme Outliers: Outliers can disproportionately influence OLS estimates and inflate standard errors, leading to misleading conclusions. Robust regression methods may be necessary in such cases.
Domain Expertise: The relationship explored should be logically plausible based on theoretical considerations or prior empirical evidence. Blindly fitting a line without domain knowledge can lead to spurious correlations.

4. Examiner's Breakdown

4.1 Comparative Analysis

Feature	Simple Linear Regression (SLR)	Simple Linear Correlation (SLC)
Primary Objective	Prediction of $Y$ from $X$ and estimation of the change in $Y$ per unit change in $X$ (a causal reading requires the assumptions, especially exogeneity, to hold). *Asymmetric* in X and Y.	Quantification of linear association between two variables; strength and direction. *Symmetric* in X and Y.
Model Equation	$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$ where $\epsilon_i$ is explicitly modeled with assumptions.	None explicitly. Focus on $r$ (Pearson product-moment correlation coefficient).
Assumptions	Strict: linearity in parameters, random sampling, $E(\epsilon_i \mid X_i)=0$, homoscedasticity, no autocorrelation, (optional for inference) $\epsilon_i \sim N(0, \sigma^2)$.
Output Metrics	$\hat{\beta}_0, \hat{\beta}_1$ (coefficients), $s_e$ (standard error of regression), $R^2$ (coefficient of determination), SEs of coefficients, t-statistics, p-values.	$r$ (correlation coefficient). Potentially p-value for testing $r=0$.
Interpretation	$\hat{\beta}_1$ is the expected change in $Y$ for a one-unit change in $X$, holding other factors (captured in $\epsilon$) constant. $R^2$ is % variance in Y explained by X.	$r$ indicates strength and direction of linear association. $r=0$ implies no linear association. $r=\pm 1$ implies perfect positive/negative linear association. *Does NOT imply causation.*
Causality Implication	Can imply causality IF all CLRM assumptions are met, especially exogeneity ($E(\epsilon_i \mid X_i)=0$) and the model is correctly specified, which is difficult to prove.
Predictive Power	High. Provides a functional form for prediction of $Y$ given $X$. Allows for prediction intervals and confidence intervals.	Limited. While 'r' indicates relationship strength, correlation itself does not provide a direct framework for predicting specific values of $Y$ based on $X$ values in the same way a regression equation does (though $r$ is a component of prediction).
Relationship to each other	For SLR, $R^2 = r^2$. Correlation measures the strength of the linear relationship that SLR then models.	Correlation is a prerequisite or a summary statistic that motivates or accompanies a regression analysis.

4.2 High-Yield Marking Keywords

Ordinary Least Squares (OLS) Estimators: $\hat{\beta}_0, \hat{\beta}_1$ derived by minimizing the Sum of Squared Residuals.
Gauss-Markov Assumptions: Specifically, linearity, random sampling, zero conditional mean of error, homoscedasticity, no autocorrelation.
BLUE (Best Linear Unbiased Estimator): Property of OLS estimators under the Gauss-Markov assumptions.
Coefficient of Determination ($R^2$): Proportion of total variation in $Y$ explained by $X$.
Pearson Product-Moment Correlation Coefficient ($r$): Quantifies strength and direction of linear association; ranges from -1 to 1.
Homoscedasticity: Constant variance of the error term across all levels of $X$.
Exogeneity: The independent variable $X$ is uncorrelated with the error term ($\text{Cov}(X, \epsilon) = 0$).
Prediction Interval vs. Confidence Interval: Crucial distinction in purpose and width for single-point forecasting vs. mean response estimation.

4.3 Trapdoor Mistakes

Inferring Causation from Correlation: Students frequently state or imply that a strong correlation ($|r|$ close to 1) means $X$ causes $Y$.
- Correct Answer: Emphasize that correlation only measures linear association and does NOT imply causation. Acknowledge possible confounding variables, reverse causality, or mere coincidence. State that establishing causation is complex and requires rigorous experimental design or satisfying very strict econometric conditions beyond mere statistical association.
Misinterpreting $R^2$ as a measure of model adequacy or superiority: Students often assume that a high $R^2$ automatically implies a good model or a causally significant relationship.
- Correct Answer: A high $R^2$ simply means the model explains a large proportion of variance in $Y$. It does not guarantee that the model is correctly specified, unbiased, or free of assumption violations (e.g., heteroscedasticity, omitted variable bias). A low $R^2$ might still represent a statistically significant and important relationship.
Confusing Confidence Interval for Mean Response with Prediction Interval for a New Observation: These are distinct and have different widths.
- Correct Answer: Clearly state that the confidence interval estimates the average value of $Y$ for a given $X_0$, while the prediction interval estimates a single, new observation of $Y$ for a given $X_0$. Explain that the prediction interval is always wider due to the additional uncertainty associated with individual variation. Write out both formulas to highlight the $\sqrt{1+\dots}$ term in the prediction interval.
Ignoring or improperly analyzing diagnostic plots: Students often report regression results without checking underlying OLS assumptions.
- Correct Answer: Discuss the systematic examination of residual plots:
  - Residuals vs. Fitted Values Plot: To check for homoscedasticity (should show a random scatter around zero, no discernible pattern or fanning/funneling) and linearity (no obvious curves).
  - Normal Q-Q Plot of Residuals: To assess the normality of error terms (points should lie approximately along a straight diagonal line).
  - Scale-Location Plot (Sqrt(|Residuals|) vs. Fitted Values): A variant for detecting heteroscedasticity more clearly, where points should be randomly scattered without any trend.
  - Emphasize that violations of these assumptions (e.g., heteroscedasticity, non-normality) invalidate standard error estimates and p-values, making inference unreliable, requiring robust standard errors or transformations.
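
The three plots are built from a handful of derived quantities, and computing them by hand can clarify what each plot shows. The sketch below produces those coordinates without any plotting library. The dataset and the function name `diagnostic_quantities` are hypothetical, and the plotting positions $(i - 0.5)/n$ for the Q-Q plot are one common convention among several.

```python
import math
from statistics import NormalDist

def diagnostic_quantities(xs, ys):
    """Return the coordinates behind the three residual diagnostic plots."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar

    fitted = [b0 + b1 * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]

    # Scale-Location: sqrt(|residual|) against fitted values.
    sqrt_abs_resid = [math.sqrt(abs(e)) for e in residuals]

    # Normal Q-Q: sorted residuals against standard normal quantiles.
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    qq_pairs = list(zip(theoretical, sorted(residuals)))
    return fitted, residuals, sqrt_abs_resid, qq_pairs

# Hypothetical data for illustration only.
fitted, residuals, sqrt_abs_resid, qq_pairs = diagnostic_quantities(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
)
print("fitted vs residual:")
for f, e in zip(fitted, residuals):
    print(f"  {f:5.2f}  {e:6.2f}")
print("theoretical quantile vs ordered residual:")
for q, e in qq_pairs:
    print(f"  {q:6.3f}  {e:6.2f}")
```

Plotting `residuals` against `fitted` gives the first plot, `qq_pairs` gives the second, and `sqrt_abs_resid` against `fitted` gives the third. With five points no pattern can be judged reliably, so the example only shows how the coordinates are obtained.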

