# Teaching

## Lecturer

**Probability II**, *Spring 2020.*

*University of Washington, Seattle.*

This is the second quarter of a sequence in probability theory. This quarter, we study jointly distributed random variables, independence, and conditional distributions. We also cover representations of probability distributions beyond density and cumulative distribution functions; in particular, we introduce moment generating functions. We then study the convergence of random variables, culminating in the central limit theorem.
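For reference, the form of the central limit theorem covered at the end of the quarter can be stated as follows (a standard statement, sketched here for context):

```latex
\text{If } X_1, X_2, \dots \text{ are i.i.d. with mean } \mu
\text{ and finite variance } \sigma^2 > 0, \text{ then}
\quad
\frac{1}{\sigma\sqrt{n}} \sum_{i=1}^{n} \left( X_i - \mu \right)
\;\xrightarrow{d}\; \mathcal{N}(0, 1)
\quad \text{as } n \to \infty.
```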

Lecture slides

Lecture notes

Homeworks

**Statistical Learning: Modeling, Prediction, and Computing**, *Winter 2020.*

*University of Washington, Seattle. Co-taught with Zaid Harchaoui.*

The course presents advanced statistical machine learning methods from a functional estimation (nonparametric statistics) viewpoint. The course covers the theoretical analysis of kernel-based methods, as well as their practical implementation using gradient-based optimization algorithms and numerical linear algebra algorithms. The course also covers an introduction to recent theoretical analyses of deep networks.

## Teaching Assistant

**Convex Optimization**, *2014-2017.*

*Master Mathematics, Vision, Learning, École Normale Supérieure Paris-Saclay, Paris.*

Taught by Alexandre d’Aspremont

website

**Oral Examinations in Mathematics**, *2013-2014.*

*Classes Préparatoires in Mathematics and Physics, Lycée Janson de Sailly, Paris.*

## Tutorials

**Automatic Differentiation**, *Statistical Machine Learning for Data Scientists, University of Washington, Seattle.*

Lecture on automatic differentiation with code examples covering: how to compute gradients of a chain of computations, how to use automatic-differentiation software, and how to use automatic differentiation beyond gradient computations.

Slides, notebook
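As a minimal sketch of the first topic above (illustrative code, not the lecture's own examples), the gradient of a chain of computations can be obtained with forward-mode automatic differentiation using dual numbers:

```python
import math

class Dual:
    """Dual number (value, derivative) for forward-mode automatic differentiation."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def sin(x):
    # Chain rule for sin: d/dx sin(u) = cos(u) * u'
    if isinstance(x, Dual):
        return Dual(math.sin(x.val), math.cos(x.val) * x.dot)
    return math.sin(x)

def f(x):
    # A chain of computations: f(x) = x * sin(x) + x
    return x * sin(x) + x

x = Dual(2.0, 1.0)  # seed the derivative with dx/dx = 1
y = f(x)
# y.val holds f(2); y.dot holds f'(2) = sin(2) + 2*cos(2) + 1
```

Each arithmetic operation propagates the derivative alongside the value, so `y.dot` is the exact derivative of the chain, not a finite-difference approximation.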

**Optimization for Deep Learning**, *Jul. 2018.*

*Summer School on Fundamentals of Data Analysis, University of Wisconsin-Madison, Madison.*

Interactive Jupyter notebook for 30 attendees covering the basics of optimization for deep learning: automatic differentiation, convergence guarantees of SGD, and an illustration of the effect of batch normalization.
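The SGD update at the core of such a tutorial can be sketched in a few lines (hypothetical data and step size, not the notebook's contents), here fitting a one-parameter least-squares model:

```python
import random

# Synthetic data from the model y = w*x with true slope w = 3.0 (assumed example).
random.seed(0)
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]

w, lr = 0.0, 0.1  # initial parameter and step size
for step in range(200):
    x, y = random.choice(data)    # sample one example: this is the "stochastic" part
    grad = 2.0 * (w * x - y) * x  # gradient of the squared loss (w*x - y)^2 w.r.t. w
    w -= lr * grad                # SGD update
# w converges toward the true slope 3.0
```

Because each step uses the gradient of a single example rather than the full dataset, the per-step cost is constant in the dataset size, which is the property that makes SGD the workhorse of deep learning.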