Applied Machine Learning

Cornell University - CS 5785 (Open Online Version).

About The Course

This open online course is a broad introduction to the field of machine learning (ML), based on CS 5785 at Cornell Tech. It covers machine learning algorithms (linear regression, kernel methods, neural networks, etc.), their mathematical foundations, and their implementation. We are thrilled to make this material open to everyone and welcome you to the course.


Highlights
  • 23 lectures with detailed course notes
  • 30+ hours of lecture videos
  • 20+ implementations of ML algorithms in Python

    Instructors

    This course is based on materials created by faculty and instructors at Cornell University and Cornell Tech.

    Volodymyr Kuleshov

    Assistant Professor,
    Cornell Tech


    Volodymyr Kuleshov focuses on machine learning and its applications in scientific discovery, health, and sustainability. He is also the co-founder of Afresh, a startup that uses AI to significantly drive down food waste.


    Nathan Kallus

    Assistant Professor,
    Cornell Tech


    Nathan Kallus' research interests include personalization; optimization, especially under uncertainty; causal inference; sequential decision making; credible and robust inference; and algorithmic fairness.


    Serge Belongie

    Professor,
    University of Copenhagen


    Serge Belongie specializes in computer vision, machine learning, augmented reality, and human-in-the-loop computing. He is now the head of the Pioneer Centre for Artificial Intelligence at the University of Copenhagen.


    Hongjun Wu

    Producer,
    Cornell Tech


    Hongjun Wu is interested in applying machine learning techniques to optimize 3D animation and video games. He is working on automated procedural mesh and shader generation with artificial intelligence.



    What's Inside

    Machine Learning Algorithms

    A broad overview of the field of ML.

    Introduction to algorithms from a broad range of areas across machine learning: generative models, support vector machines, tree-based algorithms, neural networks, gradient boosting, and more.


    Mathematical Foundations

    A rigorous definition of key concepts.

    Algorithms derived from first principles using mathematical notation. Rigorous introduction to key concepts in machine learning.


    Algorithm Implementations

    Every algorithm is implemented in Python.

    Executable lecture notes: Jupyter notebooks that display algorithm definitions and their implementations side-by-side. Over 20 algorithms are implemented from scratch in Python.



    General Information

    What you will learn

    Key elements of this course include:

    • The most important algorithms in machine learning
    • Fundamental concepts needed to reason about learning algorithms
    • Techniques for implementing and deploying ML in practice

    This course provides both a theoretical and a practical foundation in machine learning. It covers a broad range of algorithms, and defines each algorithm using both mathematics and Python code.


    Prerequisites

    This master's-level course requires a background in mathematics and programming at the level of introductory college courses. Experience in Python is recommended but not required. A certain degree of comfort with mathematics will be helpful.

    • Programming experience (ideally Python; Cornell CS 1110 or equivalent)
    • Linear algebra (Cornell MATH 2210, MATH 4310, or equivalent)
    • Statistics and probability (Cornell STSCI 2100 or equivalent)



    Course Content

    Each lecture features executable Jupyter lecture notes and slides, as well as lecture videos on YouTube. We define algorithms in terms of probability and linear algebra, and we implement them in Python with NumPy and scikit-learn.

    Lecture 1: Introduction.

    Supervised Learning · Unsupervised Learning · Reinforcement Learning

    This lecture provides an introduction to the field of machine learning, defining it as a field of study that gives computers the ability to learn without being explicitly programmed. It also discusses the three main approaches to machine learning (supervised learning, unsupervised learning, and reinforcement learning) and provides an overview of the topics covered in the course.

    Lecture 2: Anatomy of Supervised Machine Learning.

    Introduction · Models · Features · Objectives · Model training · Ordinary least squares

    This lecture introduces supervised learning and its three main components: the dataset, the learning algorithm, and the predictive model. Using a running example of predicting diabetes risk from BMI, it illustrates how a supervised learning algorithm combines a model class with an optimizer. It also emphasizes the importance of optimization algorithms in machine learning and introduces scikit-learn, a popular machine learning library.
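
    To make the pipeline concrete, here is a minimal sketch of the dataset / learning algorithm / predictive model workflow (our example, not code from the lecture), using scikit-learn's bundled diabetes dataset; its BMI column plays the role of the single input feature.

        from sklearn.datasets import load_diabetes
        from sklearn.linear_model import LinearRegression

        # Dataset: load the diabetes data; column 2 is the (standardized) BMI feature.
        X, y = load_diabetes(return_X_y=True)
        X_bmi = X[:, [2]]  # keep a 2D array of shape (n_samples, 1)

        # Learning algorithm: ordinary least squares, as wrapped by scikit-learn.
        model = LinearRegression()
        model.fit(X_bmi, y)

        # Predictive model: maps a new BMI value to a predicted progression score.
        print(model.predict(X_bmi[:3]), y[:3])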


    Lecture 3: Optimization and Linear Regression.

    Optimization by gradient descent · Normal equations · Polynomial feature expansion · Extensions of linear regression

    Our previous lecture defined the task of supervised learning. In this lecture, we define our first supervised learning algorithm: ordinary least squares (OLS). The OLS algorithm performs linear regression, fitting a linear model to a regression dataset.
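
    As a sketch of both approaches (our example, not code from the lecture), OLS can be solved in closed form via the normal equations, and gradient descent on the least-squares objective reaches the same solution:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 3))
        w_true = np.array([2.0, -1.0, 0.5])
        y = X @ w_true + 0.1 * rng.normal(size=100)

        # Normal equations: solve (X^T X) w = X^T y rather than inverting X^T X.
        w_ols = np.linalg.solve(X.T @ X, X.T @ y)

        # Gradient descent on the mean squared error.
        w, lr = np.zeros(3), 0.01
        for _ in range(2000):
            grad = 2 * X.T @ (X @ w - y) / len(y)
            w -= lr * grad

        print(w_ols, w)  # both should be close to w_true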


    Lecture 4: Foundations of Supervised Learning.

    Data distribution · Hypothesis classes · Bayes optimality · Over/under fitting · Regularization

    In this lecture, we examine the foundations that make supervised learning work. We formalize the data distribution and hypothesis classes, define the Bayes optimal predictor, and study two common failure modes of supervised learning, overfitting and underfitting, which we address with a general technique called regularization.


    Lecture 5: Maximum Likelihood Learning.

    Maximum likelihood learning · Bayesian ML · MAP Learning · Example Algorithms

    Where do the objectives that we optimize come from? In this lecture, we derive learning algorithms from probabilistic first principles: we introduce maximum likelihood learning and contrast it with Bayesian and MAP learning, illustrating each approach with example algorithms.
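
    As a tiny illustration (ours, not from the lecture), the maximum likelihood estimate of a coin's heads probability under a Bernoulli model is the empirical frequency, which we can confirm by maximizing the log-likelihood numerically:

        import numpy as np

        # Observed coin flips (1 = heads); the Bernoulli MLE is the sample mean.
        flips = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])
        mle_closed_form = flips.mean()

        # Check numerically: evaluate the log-likelihood on a grid of parameters.
        thetas = np.linspace(0.01, 0.99, 99)
        heads, tails = flips.sum(), len(flips) - flips.sum()
        log_lik = heads * np.log(thetas) + tails * np.log(1 - thetas)

        print(mle_closed_form, thetas[np.argmax(log_lik)])  # both should be ~0.7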


    Lecture 6: Classification Algorithms.

    KNN · Logistic Regression · Softmax Regression

    In this lecture, we turn our attention to another important task in supervised learning: classification. We define this problem and introduce a first set of algorithms: K-nearest neighbors, logistic regression, and softmax regression.
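
    For instance, a K-nearest neighbors classifier takes only a few lines of NumPy (a sketch on toy data of our own, not code from the lecture):

        import numpy as np

        def knn_predict(X_train, y_train, x, k=3):
            # Classify x by majority vote among its k nearest training points.
            dists = np.linalg.norm(X_train - x, axis=1)
            nearest = np.argsort(dists)[:k]
            return np.bincount(y_train[nearest]).argmax()

        # Toy 2D dataset: two clusters with labels 0 and 1.
        X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
        y_train = np.array([0, 0, 0, 1, 1, 1])
        print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # -> 0
        print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # -> 1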


    Lecture 7: Generative Algorithms.

    Generative models · Gaussian Discriminant Analysis

    This lecture introduces a new family of machine learning models: generative models. We begin by contrasting generative and discriminative models through a simple classification problem, and then derive our first generative algorithm, Gaussian Discriminant Analysis.


    Lecture 8: Naive Bayes.

    Naive Bayes · Bag of Words · Generative vs. Discriminative Methods

    Naive Bayes is a simple and popular classification algorithm. We will go over the mechanics of Naive Bayes as well as how to apply it to text classification.
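
    As an illustration of the text-classification use case (our sketch, with made-up toy sentences), scikit-learn pairs a bag-of-words featurizer with multinomial Naive Bayes:

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB

        # Toy corpus with binary labels (1 = spam, 0 = not spam).
        texts = ["win money now", "limited offer win prize",
                 "meeting at noon", "lunch with the team"]
        labels = [1, 1, 0, 0]

        # Bag of words: each document becomes a vector of word counts.
        vectorizer = CountVectorizer()
        X = vectorizer.fit_transform(texts)

        # Multinomial Naive Bayes models the word counts of each class.
        clf = MultinomialNB()
        clf.fit(X, labels)

        print(clf.predict(vectorizer.transform(["win a free prize"])))  # likely [1]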


    Lecture 9: Support Vector Machines.

    SVM · Margins · Max-margin Classifiers · Hinge Loss · Sub-gradient Descent

    Support vector machines are among the most robust prediction methods in machine learning. We will go over margins, the hinge loss, and optimization for SVMs.
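
    The optimization view can be sketched in a few lines (our example, not code from the lecture): minimize the regularized hinge loss by sub-gradient descent.

        import numpy as np

        # Linearly separable toy data with labels in {-1, +1}.
        X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
        y = np.array([1, 1, -1, -1])

        w, b, lam, lr = np.zeros(2), 0.0, 0.01, 0.1
        for _ in range(200):
            margins = y * (X @ w + b)
            viol = margins < 1  # points inside the margin contribute a sub-gradient
            grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(y)
            grad_b = -y[viol].sum() / len(y)
            w, b = w - lr * grad_w, b - lr * grad_b

        print(np.sign(X @ w + b))  # should match y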


    Lecture 10: Dual Formulation of Support Vector Machines.

    Lagrange Duality · Dual Formulation of SVM · SMO algorithm

    We will dive deeper into support vector machines by introducing Lagrange duality and deriving the dual form of the SVM, which can be solved with the SMO algorithm.


    Lecture 11: Kernels.

    Kernels · Mercer's Theorem · RBF Kernels

    In this lecture, we explain what kernels are in machine learning, give several examples, and show how to apply the kernel trick in support vector machines.
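
    As a quick sketch (ours), the RBF kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)) is easy to compute directly, and the kernel trick is one argument away in scikit-learn's SVC:

        import numpy as np
        from sklearn.svm import SVC

        def rbf_kernel(x1, x2, sigma=1.0):
            # k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
            return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))

        print(rbf_kernel(np.array([0.0, 0.0]), np.array([1.0, 1.0])))

        # The kernel trick in practice: a nonlinear boundary on XOR-like data.
        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
        y = np.array([0, 1, 1, 0])  # not linearly separable
        clf = SVC(kernel="rbf", gamma=2.0, C=10.0).fit(X, y)
        print(clf.predict(X))  # an RBF SVM can fit this perfectly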


    Lecture 12: Decision Trees.

    Bagging · Ensembling · CART

    Decision trees are simple and interpretable algorithms for regression and classification. We will talk about what they are and how to use them, and show how bagging and random forests improve the performance of decision tree models.
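
    For example (our sketch, using scikit-learn's bundled iris data), a random forest is simply a bagged ensemble of randomized decision trees:

        from sklearn.datasets import load_iris
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_iris(return_X_y=True)

        # A single CART tree versus a bagged ensemble of 100 randomized trees.
        tree = DecisionTreeClassifier(random_state=0)
        forest = RandomForestClassifier(n_estimators=100, random_state=0)

        print(cross_val_score(tree, X, y, cv=5).mean())
        print(cross_val_score(forest, X, y, cv=5).mean())  # typically at least as good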


    Lecture 13: Boosting.

    AdaBoost · Gradient Boosting

    Boosting helps weak learners become strong ones. In this lecture, we will talk about the essence of boosting, additive models, and gradient boosting.
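
    As a sketch of the additive-model view (ours, not code from the lecture), each stage of gradient boosting with squared loss fits a small tree to the current residuals:

        import numpy as np
        from sklearn.tree import DecisionTreeRegressor

        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(200, 1))
        y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

        # Gradient boosting with squared loss: repeatedly fit trees to residuals.
        pred, trees, lr = np.zeros(200), [], 0.1
        for _ in range(100):
            residuals = y - pred  # the negative gradient of the squared loss
            t = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
            pred += lr * t.predict(X)
            trees.append(t)  # the final model is the lr-weighted sum of stage trees

        print(np.mean((y - pred) ** 2))  # training error shrinks as stages are added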


    Lecture 14: Neural Networks.

    NN · Perceptrons · Multi-layer Neural Networks

    Neural networks are machine learning models inspired by the brain. In this lecture, we will give an introduction to neural networks, explain how perceptrons mimic biological neurons, and cover backpropagation.
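
    To make backpropagation concrete, here is a minimal sketch (ours) of a two-layer network trained on XOR with manually derived gradients:

        import numpy as np

        rng = np.random.default_rng(0)
        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
        y = np.array([[0.0], [1.0], [1.0], [0.0]])  # XOR labels

        W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
        W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
        sigmoid = lambda z: 1 / (1 + np.exp(-z))

        for _ in range(5000):
            # Forward pass.
            h = np.tanh(X @ W1 + b1)
            out = sigmoid(h @ W2 + b2)
            # Backward pass: apply the chain rule layer by layer.
            d_out = out - y  # gradient of cross-entropy w.r.t. the output logits
            d_h = (d_out @ W2.T) * (1 - h ** 2)  # back through the tanh layer
            W2 -= 0.5 * h.T @ d_out
            b2 -= 0.5 * d_out.sum(axis=0)
            W1 -= 0.5 * X.T @ d_h
            b1 -= 0.5 * d_h.sum(axis=0)

        print(out.round(2).ravel())  # should approach [0, 1, 1, 0]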


    Lecture 15: Deep Learning.

    DL · Convolutional neural networks · Applications

    We will provide a brief introduction to deep learning. After introducing convolutions, we will present convolutional neural networks, one of the most important model families in machine learning.
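
    To give a flavor of the operation itself (our sketch, not code from the lecture), here is a direct 2D "valid" convolution in NumPy, the building block of a convolutional layer:

        import numpy as np

        def conv2d(image, kernel):
            # 'Valid' 2D cross-correlation (what deep learning frameworks call convolution).
            H, W = image.shape
            kh, kw = kernel.shape
            out = np.zeros((H - kh + 1, W - kw + 1))
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
            return out

        image = np.arange(25, dtype=float).reshape(5, 5)
        edge_filter = np.array([[1.0, -1.0]])  # responds to horizontal changes
        print(conv2d(image, edge_filter))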


    Lecture 16: Unsupervised Learning.

    Unsupervised Learning · Introduction · Language · Practice

    Let’s begin our journey into unsupervised learning. We will introduce the concept of unsupervised learning, become familiar with its terminology, and talk about how unsupervised learning is used in practice.


    Lecture 17: Density Estimation.

    Density Estimation · Probabilistic Models · K-Nearest Neighbors

    Density estimation is an important tool in unsupervised machine learning. We will discuss probabilistic models for density estimation, including kernel density estimation, as well as latent variable models.
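
    As a quick sketch (ours, not from the lecture), a kernel density estimate places a small Gaussian bump at every data point and averages them:

        import numpy as np

        def kde(x, data, bandwidth=0.5):
            # Average of Gaussian bumps centered at each data point.
            z = (x - data) / bandwidth
            bumps = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
            return bumps.mean() / bandwidth

        rng = np.random.default_rng(0)
        data = np.concatenate([rng.normal(-2, 0.5, 100), rng.normal(2, 0.5, 100)])

        # The estimate is high near the two modes and low in between.
        for x in [-2.0, 0.0, 2.0]:
            print(x, kde(x, data))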


    Lecture 18: Clustering.

    Clustering · K-means · Expectation-Maximization

    Clustering is perhaps the most popular class of unsupervised learning algorithms. We will introduce Gaussian mixture models, expectation-maximization, and generalization in probabilistic models.
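
    As a sketch (ours, not code from the lecture), the K-means loop alternates between assigning points to their nearest centroid and recomputing the centroids, the same alternating structure that expectation-maximization uses:

        import numpy as np

        rng = np.random.default_rng(0)
        # Two well-separated blobs in 2D.
        X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])

        # Initialize with one point from each region for simplicity;
        # real implementations use random restarts or k-means++.
        centroids = X[[0, 50]].copy()
        for _ in range(20):
            # Assignment step: attach each point to its nearest centroid.
            dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
            labels = np.argmin(dists, axis=1)
            # Update step: move each centroid to the mean of its points.
            centroids = np.array([X[labels == j].mean(axis=0) for j in range(2)])

        print(centroids)  # approximately [0, 0] and [5, 5]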


    Lecture 19: Dimensionality Reduction.

    Dimensionality Reduction · PCA · ICA

    Note: Lecture video 19 part 2 was incorrectly titled as part 3 on YouTube.

    Reducing the dimensionality of the data can help us interpret it better and can make learning algorithms more efficient. The most widely used method for this is principal component analysis (PCA).
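
    As a sketch (ours, not from the lecture), PCA can be computed from the singular value decomposition of the centered data matrix:

        import numpy as np

        rng = np.random.default_rng(0)
        # Correlated 2D data stretched along one direction.
        z = rng.normal(size=200)
        X = np.column_stack([z, 2 * z + 0.1 * rng.normal(size=200)])

        # Center the data, then take the SVD; rows of Vt are principal components.
        Xc = X - X.mean(axis=0)
        U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

        proj = Xc @ Vt[0]  # 1D projection onto the top principal component
        explained = S ** 2 / np.sum(S ** 2)
        print(Vt[0], explained)  # the first component captures almost all variance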


    Lecture 20: Evaluating Machine Learning Models.

    Evaluation · Dataset Splits · Cross-Validation · Performance Measures

    So, now you have a model: how do you evaluate it? We will talk about the ML development workflow and how to evaluate classification and regression models.
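
    For instance (our sketch, on scikit-learn's bundled breast cancer data), k-fold cross-validation gives a less noisy performance estimate than a single train/test split:

        from sklearn.datasets import load_breast_cancer
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score, train_test_split
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        X, y = load_breast_cancer(return_X_y=True)
        model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

        # A single held-out split: one (noisy) estimate of accuracy.
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
        print(model.fit(X_tr, y_tr).score(X_te, y_te))

        # 5-fold cross-validation: five estimates averaged into a more stable one.
        scores = cross_val_score(model, X, y, cv=5)
        print(scores.mean(), scores.std())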


    Lecture 21: Model Iteration and Improvement.

    Diagnosis · Model Iteration Process · Bias/Variance Tradeoff · Baselines · Learning Curves

    Developing machine learning models is an iterative process. We will encounter difficulties and errors, and it is part of our job to fix them. We will talk about error and bias/variance analysis.


    Lecture 22: Tools for Diagnosing Model Performance.

    Diagnosis · Error Analysis · Data Integrity · Human-Level Performance

    We need tools for diagnosing the performance of our models. We will describe learning, loss, and validation curves, as well as how to detect distribution mismatch.
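
    As a sketch (ours, on scikit-learn's bundled digits data), a learning curve reports training and validation scores as a function of training set size:

        import numpy as np
        from sklearn.datasets import load_digits
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import learning_curve
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        X, y = load_digits(return_X_y=True)
        model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

        # Cross-validated scores at five increasing training set sizes.
        sizes, train_scores, val_scores = learning_curve(
            model, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

        for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
            print(n, round(tr, 3), round(va, 3))  # a persistent gap suggests overfitting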


    Lecture 23: Overview.

    Bias/variance Tradeoff · Empirical risk minimization · Learning theory

    Note: Videos for Lecture 23 were not recorded; however, the slides and notes are available.

    It's your turn to apply what you learned here to the world!