Applied Machine Learning

Cornell University - CS 5785 (Open Online Version).

About The Course

This open online course is a broad introduction to the field of machine learning (ML), based on CS 5785 at Cornell Tech. It covers machine learning algorithms (linear regression, kernel methods, neural networks, etc.), their mathematical foundations, and their implementation. We are thrilled to make this material open to everyone and welcome you to the course.

  • 23 lectures with detailed course notes
  • 30+ hours of lecture videos
  • 20+ implementations of ML algorithms in Python

    Instructors

    This course is based on materials created by faculty and instructors at Cornell University and Cornell Tech.

    Volodymyr Kuleshov

    Assistant Professor,
    Cornell Tech

    Volodymyr Kuleshov focuses on machine learning and its applications in scientific discovery, health, and sustainability. He is also the co-founder of Afresh, a startup that uses AI to significantly drive down food waste.


    Nathan Kallus

    Assistant Professor,
    Cornell Tech

    Nathan Kallus' research interests include personalization; optimization, especially under uncertainty; causal inference; sequential decision making; credible and robust inference; and algorithmic fairness.


    Serge Belongie

    University of Copenhagen

    Serge Belongie specializes in computer vision, machine learning, augmented reality, and human-in-the-loop computing. He is now the head of the Pioneer Centre for Artificial Intelligence at the University of Copenhagen.


    Hongjun Wu

    Cornell Tech

    Hongjun Wu is interested in applying machine learning techniques to optimize 3D animation and video games. He is working on automated procedural mesh and shader generation with artificial intelligence.


    What's Inside

    Machine Learning Algorithms

    A broad overview of the field of ML.

    Introduction to algorithms from a broad range of areas across machine learning: generative models, support vector machines, tree-based algorithms, neural networks, gradient boosting, and more.


    Mathematical Foundations

    A rigorous definition of key concepts.

    Algorithms derived from first principles using mathematical notation. Rigorous introduction to key concepts in machine learning.


    Algorithm Implementations

    Every algorithm is implemented in Python.

    Executable lecture notes: Jupyter notebooks that display algorithm definitions and their implementations side by side. Over 20 algorithms are implemented from scratch in Python.


    General Information

    What you will learn

    Key elements of this course include:

    • The most important algorithms in machine learning
    • Fundamental concepts needed to reason about learning algorithms
    • Techniques for implementing and deploying ML in practice

    This course provides both a theoretical and a practical foundation in machine learning. It covers a broad range of algorithms, and defines each algorithm using both mathematics and Python code.



    Prerequisites

    This master's-level course requires a background in mathematics and programming at the level of introductory college courses. Experience in Python is recommended, but not required. A certain degree of ease with mathematics will be helpful.

    • Programming experience (ideally Python; Cornell CS 1110 or equivalent)
    • Linear algebra (Cornell MATH 2210, MATH 4310, or equivalent)
    • Statistics and probability (Cornell STSCI 2100 or equivalent)


    Course Content

    Each lecture features executable Jupyter lecture notes and slides, as well as lecture videos on YouTube. We define algorithms in terms of probability and linear algebra, and we implement them in Python, NumPy, and scikit-learn.

    Lecture 1: Introduction.

    This lecture provides an introduction to the field of machine learning, defining it as a field of study that gives computers the ability to learn without being explicitly programmed. The lecture also discusses three approaches to machine learning, namely supervised learning, unsupervised learning, and reinforcement learning, and provides an overview of the topics covered in the course.

    Supervised Learning Unsupervised Learning Reinforcement Learning

    Lecture 2: Anatomy of Supervised Machine Learning.

    Introduction Models Features Objectives Model training Ordinary least squares

    This lecture introduces the concept of supervised learning and explains its three main components: dataset, learning algorithm, and predictive model. This lecture provides a running example of predicting diabetes risk from BMI and illustrates how a supervised learning algorithm works through a model class and an optimizer. This lecture also emphasizes the importance of optimization algorithms in machine learning and introduces scikit-learn, a popular machine learning library.

    Lecture 3: Optimization and Linear Regression.

    Optimization by gradient descent Normal equations Polynomial feature expansion Extensions of linear regression

    Our previous lecture defined the task of supervised learning. In this lecture, we define our first supervised learning algorithm, ordinary least squares (OLS). The OLS algorithm performs linear regression: it fits a linear model to a regression dataset.
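
    As a rough illustration of what fitting a linear model means, the sketch below fits a one-feature OLS model in two ways: by solving the normal equations in closed form and by batch gradient descent on the mean squared error. The toy data, function names, learning rate, and step count are assumptions chosen for this example and are not taken from the course notebooks.

```python
# Minimal sketch: fit y ≈ theta0 + theta1 * x by ordinary least squares.
# The toy data and hyperparameters are illustrative assumptions.

def ols_closed_form(xs, ys):
    """Solve the one-feature normal equations exactly."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def ols_gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Minimize the mean squared error with batch gradient descent."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = 2.0 / n * sum(residuals)
        grad1 = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        theta0 -= lr * grad0
        theta1 -= lr * grad1
    return theta0, theta1


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 1 + 2x, no noise

exact = ols_closed_form(xs, ys)
approx = ols_gradient_descent(xs, ys)
print("closed form:", exact)
print("gradient descent:", approx)
```

    On noise-free data both routes should recover nearly the same intercept and slope; the closed form is exact, while gradient descent only approaches it as the number of steps grows.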

    Lecture 4: Foundations of Supervised Learning.

    Data distribution Hypothesis classes Bayes optimality Over/under fitting Regularization

    In this lecture, we turn our attention to another important task in supervised learning: classification. We define this problem and introduce a first set of algorithms.

    Lecture 5: Maximum Likelihood Learning.

    Maximum likelihood learning Bayesian ML MAP Learning Example Algorithms

    Next, we look at how to evaluate and understand these algorithms. In the process, we will identify two common failure modes of supervised learning and develop new algorithms that address them. These algorithms rely on a general technique called regularization.

    Lecture 6: Classification Algorithms.

    KNN Logistic Regression Softmax Regression

    This lecture will introduce a new family of machine learning models called generative models. Our motivating example will be text classification, and we will derive for this task a famous algorithm called Naive Bayes.

    Lecture 7: Generative Algorithms.

    Generative models Gaussian Discriminant Analysis

    In the last lecture, we introduced generative modeling and Naive Bayes. In this lecture, we will see more examples of generative models, namely Gaussian Discriminant Analysis. Let’s first review generative models and their distinction from discriminative models through a simple classification problem.

    Lecture 8: Naive Bayes.

    Naive Bayes Bag of Words Generative vs. Discriminative Methods

    Naive Bayes is a simple and popular classification algorithm. We will go over the mechanics of Naive Bayes as well as how to apply it to text classification.
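
    To make the mechanics concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier over a bag-of-words representation, with add-one (Laplace) smoothing. The tiny corpus, the labels, and the function names are made up for illustration; the course notebooks may structure this differently.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Estimate class counts and per-class word counts from tokenized docs."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in zip(docs, labels):
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict(model, words):
    """Return the class with the highest log posterior (up to a constant)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities.
            p = (word_counts[label][w] + 1) / (total_words + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, already tokenized by whitespace.
docs = [
    "win cash prize now".split(),
    "cheap prize click now".split(),
    "meeting agenda for monday".split(),
    "lecture notes for monday".split(),
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(docs, labels)
print(predict(model, "cash prize".split()))
print(predict(model, "monday lecture".split()))
```

    Working in log space keeps the product of many small word probabilities from underflowing, which matters once documents get longer than a few words.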

    Lecture 9: Support Vector Machines.

    SVM Margins Max-margin Classifiers Hinge Loss Sub-gradient Descent

    Support vector machines are one of the most robust prediction methods in machine learning. We will go over margins, loss, as well as optimization for SVMs.
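
    The hinge loss and its optimization by sub-gradient descent can be sketched as follows for a linear classifier with labels in {-1, +1}. This is an illustrative sketch only: the regularization strength, learning rate, epoch count, and toy data are assumptions, and the code is not the course's implementation.

```python
def score(w, b, x):
    """Linear decision function w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def hinge_loss(w, b, X, y, lam):
    """Average hinge loss plus an L2 penalty on the weights."""
    data_term = sum(max(0.0, 1.0 - yi * score(w, b, xi)) for xi, yi in zip(X, y))
    return data_term / len(X) + 0.5 * lam * sum(wj * wj for wj in w)


def train_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch sub-gradient descent on the regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) < 1.0:  # margin violated: hinge term active
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical linearly separable toy data.
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_svm(X, y)
predictions = [1 if score(w, b, xi) > 0 else -1 for xi in X]
print(predictions, hinge_loss(w, b, X, y, 0.01))
```

    Points whose margin is already at least 1 contribute nothing to the sub-gradient, so only margin violators and the regularizer move the weights.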

    Lecture 10: Dual Formulation of Support Vector Machines.

    Lagrange Duality Dual Formulation of SVM SMO algorithm

    We will dive deeper into support vector machines by introducing Lagrange duality, and define the dual form of support vector machines.

    Lecture 11: Kernels.

    Kernels Mercer's Theorem RBF Kernels

    In this lecture, we will explain what kernels are in machine learning, provide examples, and show how to apply the kernel trick in support vector machines.
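
    One way to see the kernel trick is to check that a kernel evaluated on the raw inputs equals an inner product in an explicit feature space. The sketch below does this for a homogeneous quadratic kernel in two dimensions and also defines an RBF kernel and its Gram matrix. The function names, the `gamma` value, and the sample points are assumptions made for this example.

```python
import math


def quadratic_kernel(x, z):
    """k(x, z) = (x . z)^2, computed without any feature expansion."""
    return sum(a * b for a, b in zip(x, z)) ** 2


def quadratic_features(x):
    """Explicit feature map for 2-D inputs whose inner product equals (x . z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def rbf_kernel(x, z, gamma=0.5):
    """k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, kernel):
    """Matrix of pairwise kernel values K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


x, z = (1.0, 2.0), (3.0, 4.0)
implicit = quadratic_kernel(x, z)
explicit = sum(a * b for a, b in zip(quadratic_features(x), quadratic_features(z)))
print(implicit, explicit)

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K = gram_matrix(points, rbf_kernel)
print(K)
```

    The quadratic kernel never builds the three-dimensional feature vector, yet it yields the same inner product up to floating-point rounding; for the RBF kernel the corresponding feature space is infinite-dimensional, so the kernel is the only practical route.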

    Lecture 12: Decision Trees.

    Bagging Ensembling CART

    Decision trees are simple and interpretable algorithms for regression and classification. We will talk about what they are, how to use them, as well as bagging and random forests to improve the performance of decision tree models.

    Lecture 13: Boosting.

    Adaboost Gradient Boosting

    Boosting combines many weak learners into a single stronger one. In this lecture, we will talk about the essence of boosting, additive models, and gradient boosting.

    Lecture 14: Neural Networks.

    NN Perceptrons Multi-layer Neural Networks

    Neural networks are machine learning models inspired by the brain. In this lecture, we will give an introduction to neural networks, explain how perceptrons mimic human neurons, and cover backpropagation.
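
    As a small illustration of the perceptron, the sketch below trains a single threshold unit with the classic perceptron update rule on the logical AND function. The dataset, the unit learning rate, and the epoch count are assumptions for this example; backpropagation for multi-layer networks is not shown.

```python
def predict_perceptron(w, b, x):
    """Threshold unit: fire (1) when the weighted input exceeds zero."""
    activation = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if activation > 0 else 0


def train_perceptron(X, y, epochs=20):
    """Perceptron rule: nudge the weights by the error on each example."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            error = yi - predict_perceptron(w, b, xi)  # -1, 0, or +1
            w = [wj + error * xj for wj, xj in zip(w, xi)]
            b += error
    return w, b


# Logical AND is linearly separable, so the perceptron rule converges on it.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]

w, b = train_perceptron(X, y)
outputs = [predict_perceptron(w, b, xi) for xi in X]
print(w, b, outputs)
```

    A single perceptron can only represent linearly separable functions; a function such as XOR needs at least one hidden layer, which is where multi-layer networks come in.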

    Lecture 15: Deep Learning.

    DL Convolutional neural networks Applications

    We will provide a brief introduction to deep learning. After introducing convolutions, we will cover convolutional neural networks, one of the most important topics in machine learning.

    Lecture 16: Unsupervised Learning.

    Unsupervised Learning Introduction Language Practice

    Let’s start our journey in unsupervised learning. We will introduce the concept, get familiar with its terminology, and talk about unsupervised learning in practice.

    Lecture 17: Density Estimation.

    Density Estimation Probabilistic Models K-Nearest Neighbors

    Density estimation is an important tool in unsupervised machine learning. We will also discuss kernel density estimation, as well as latent variable models.

    Lecture 18: Clustering.

    Clustering K-means Expectation-Maximization

    Clustering is perhaps the most popular class of unsupervised learning algorithms. We will introduce Gaussian mixture models, expectation maximization, as well as generalization in probabilistic models.
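
    For intuition, here is a minimal sketch of K-means (Lloyd's algorithm), which alternates a hard assignment step with a centroid update step, loosely mirroring the two alternating steps of expectation maximization. It is not a Gaussian mixture model. The points, the initial centroids, and the iteration count are assumptions chosen for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate cluster assignment and centroid update."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                sum((a - c) ** 2 for a, c in zip(p, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
initial = [points[0], points[3]]

centroids, clusters = kmeans(points, initial)
print(centroids)
print([len(c) for c in clusters])
```

    The result depends on the initial centroids; in practice one would typically run several random initializations and keep the solution with the lowest within-cluster distance.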

    Lecture 19: Dimensionality Reduction.

    Dimensionality Reduction PCA ICA

    Note: Lecture video 19 part 2 is incorrectly titled as part 3 on YouTube.

    Reducing the dimensionality of the data can help us better interpret it and also make learning algorithms more efficient. The most widely used method for this is principal component analysis, or PCA.

    Lecture 20: Evaluating Machine Learning Models.

    Evaluation Dataset Splits Cross-Validation Performance Measures

    So, now you have a model. How do you evaluate it? We will talk about the ML development workflow, and how to evaluate classification and regression models.
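
    Dataset splits for cross-validation can be sketched without any library. The helper below, whose name and signature are hypothetical, partitions sample indices into k contiguous folds and yields one train/validation split per fold. It assumes the data has already been shuffled if ordering matters.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

    Each sample lands in exactly one validation fold, so averaging a metric over the k validation folds uses every sample once for evaluation while never scoring a model on data it was trained on.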

    Lecture 21: Model Iteration and Improvement.

    Diagnosis Model Iteration Process Bias/Variance Tradeoff Baselines Learning Curves

    Developing machine learning models is an iterative process. We will encounter difficulties and errors, and it is part of our job to fix them. We will talk about error and bias/variance analysis.

    Lecture 22: Tools for Diagnosing Model Performance.

    Diagnosis Error Analysis Data Integrity Human-Level Performance

    We need tools for diagnosing the performance of our models. We will describe learning/loss/validation curves, as well as distribution mismatch.

    Lecture 23: Overview.

    Bias/variance Tradeoff Empirical risk minimization Learning theory

    Note: The video for Lecture 23 was not recorded; however, the slides and notes are available.

    It's your turn to apply what you learned here to the world!