Advanced Linear Models for Data Science 1: Least Squares


Welcome to the Advanced Linear Models for Data Science Class 1: Least Squares. This class is an introduction to least squares from a linear algebraic and mathematical perspective. Before beginning the class, make sure that you have the following:

– A basic understanding of linear algebra and multivariate calculus.
– A basic understanding of statistics and regression models.
– At least a little familiarity with proof-based mathematics.
– Basic knowledge of the R programming language.

After taking this course, students will have a firm foundation in the linear algebraic treatment of regression modeling, which will deepen an applied data scientist’s general understanding of regression models.

What you will learn


We cover some basic matrix algebra results that we will need throughout the class, including some basic vector derivatives. In addition, we cover some basic uses of matrices to create summary statistics from data. This includes calculating means and subtracting them from observations (centering), as well as calculating the variance.
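
As a rough illustration of centering and the variance written as matrix operations, the sketch below uses plain Python lists rather than R. The data values and the names `H`, `y_centered`, and `variance` are assumptions made for the example, not taken from the course materials.

```python
# Hypothetical toy data; the values are illustrative only.
y = [1.0, 2.0, 4.0, 7.0]
n = len(y)

# Centering matrix H = I - (1/n) * J, where J is the n x n matrix of ones.
H = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]

# Multiplying H by y subtracts the sample mean from every observation.
y_centered = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]

# Sample variance: squared length of the centered vector divided by n - 1.
variance = sum(v * v for v in y_centered) / (n - 1)

print(y_centered)  # [-2.5, -1.5, 0.5, 3.5]
print(variance)    # 7.0
```

Forming the full n x n matrix is wasteful for real data; it is written out here only to make the linear algebra visible.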

One and two parameter regression

In this module, we cover the basics of regression through the origin and linear regression. Regression through the origin is an interesting case, as one can build up all of multivariate regression with it.
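
A minimal sketch of regression through the origin, assuming a single predictor and the usual sum-of-squares criterion. The function name `fit_through_origin` and the data are hypothetical, chosen only for illustration.

```python
def fit_through_origin(x, y):
    """Return the slope b that minimizes sum((y_i - b * x_i) ** 2)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one nonzero value")
    # Setting the derivative with respect to b to zero gives b = <x, y> / <x, x>.
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical toy data.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.5]
slope = fit_through_origin(x, y)

# The residuals are orthogonal to x at the minimizer.
residual_dot_x = sum(xi * (yi - slope * xi) for xi, yi in zip(x, y))
```

The orthogonality check at the end is the one-dimensional version of the normal equations that appear in the general case.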

Linear regression

In this lecture, we focus on linear regression, the most standard technique for investigating unconfounded linear relationships.
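
One way to see the link to regression through the origin is to fit the line on centered data: the slope is the through-the-origin estimate for the centered variables, and the intercept is recovered from the means. The sketch below assumes a single predictor; `fit_line` and the data are hypothetical names and values.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line for y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Center both variables by subtracting their means.
    xc = [xi - x_bar for xi in x]
    yc = [yi - y_bar for yi in y]
    sxx = sum(v * v for v in xc)
    if sxx == 0:
        raise ValueError("x must not be constant")
    # Regression through the origin on the centered data gives the slope.
    slope = sum(a * b for a, b in zip(xc, yc)) / sxx
    # The fitted line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(x, y)
print(intercept, slope)  # 1.0 2.0
```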

General least squares

We now move on to general least squares where an arbitrary full rank design matrix is fit to a vector outcome.
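
The general case can be sketched as a short derivation. The notation is an assumption for this illustration: y is the n-vector outcome, X is the n x p design matrix with full column rank, and β is the p-vector of coefficients.

```latex
\begin{aligned}
\hat{\beta} &= \arg\min_{\beta} \, \lVert y - X\beta \rVert^{2}, \\
\frac{\partial}{\partial \beta} \lVert y - X\beta \rVert^{2}
  &= -2 X^{\top} (y - X\beta) = 0
  \quad\Longrightarrow\quad X^{\top} X \hat{\beta} = X^{\top} y, \\
\hat{\beta} &= (X^{\top} X)^{-1} X^{\top} y .
\end{aligned}
```

The full rank assumption is what makes X'X invertible, so the solution of the normal equations is unique.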
