Talks
2023
Bayesian Deep Learning
Bayesian Neural Networks (BNNs) take a probabilistic approach to learning in neural networks by placing distributions over the weights and performing (approximate) Bayesian inference. In this talk, I will introduce the basics of BNNs, the challenges in training them, and some of their properties; a minimal code sketch of the variational approach follows below. Presented at the Machine Learning Reading Group at the Cambridge University Engineering Department, UK.
Slides
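To make the idea concrete, here is a minimal sketch (my own, not from the talk) of mean-field variational inference for a single Bayesian linear layer: each weight gets a Gaussian approximate posterior whose mean and log-standard-deviation are fit by maximising the ELBO. The toy data, layer size, and hyperparameters are all illustrative assumptions.

```python
import torch

torch.manual_seed(0)
x = torch.randn(64, 3)                                  # toy inputs (assumed)
y = x @ torch.tensor([[1.0], [-2.0], [0.5]]) + 0.1 * torch.randn(64, 1)

# Variational posterior q(W) = N(mu, sigma^2) per weight; prior p(W) = N(0, 1)
mu = torch.zeros(3, 1, requires_grad=True)
log_sigma = torch.full((3, 1), -2.0, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=1e-2)

for step in range(2000):
    sigma = log_sigma.exp()
    w = mu + sigma * torch.randn_like(mu)               # reparameterised sample
    nll = 0.5 * ((y - x @ w) ** 2).sum()                # Gaussian likelihood term
    # Closed-form KL(q || p) between diagonal Gaussians
    kl = 0.5 * (sigma**2 + mu**2 - 1.0 - 2.0 * log_sigma).sum()
    loss = nll + kl                                     # negative ELBO
    opt.zero_grad(); loss.backward(); opt.step()

# Predictions average over weight samples, giving uncertainty estimates for free
with torch.no_grad():
    ws = mu + log_sigma.exp() * torch.randn(100, 3, 1)  # 100 posterior samples
    preds = torch.einsum('nd,sdo->sno', x, ws)
    print(preds.mean(0)[:2], preds.std(0)[:2])
```

The reparameterised weight sample is what keeps the ELBO differentiable with respect to the variational parameters.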
2022
Sparse MoEs meet Efficient Ensembles
I presented our paper Sparse MoEs meet Efficient Ensembles at the Machine Learning Efficiency Workshop at the Deep Learning Indaba 2022.
Slides
Energy-Based Models
While the log-likelihood of the data is analytic in many familiar machine learning models, there exist certain models, called Energy-Based Models (EBMs), for which the log-likelihood is intractable. Lifting the restriction of a tractable log-likelihood opens the door to great flexibility in model architecture and design, but naturally complicates training. In this presentation, we provide an overview of both established and recently developed methods for training EBMs; a small code sketch of one such method follows below. Presented at the Machine Learning Reading Group at the Cambridge University Engineering Department, UK.
Video, Slides
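For a flavour of what training without a tractable log-likelihood looks like, here is a rough sketch (my own 1-D toy example, not from the talk) of maximum-likelihood EBM training, where the intractable negative phase is approximated with short-run Langevin MCMC; the energy network, data, and step sizes are all assumptions.

```python
import torch

torch.manual_seed(0)
energy = torch.nn.Sequential(                 # E_theta: R -> R (scalar energy)
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(energy.parameters(), lr=1e-3)
data = torch.randn(512, 1) * 0.5 + 2.0        # toy data centred at x = 2 (assumed)

def langevin_samples(n, steps=30, eps=0.1):
    """Approximate samples from p(x) ~ exp(-E(x)) via short-run Langevin dynamics."""
    x = torch.randn(n, 1)                     # initialise negatives from noise
    for _ in range(steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - 0.5 * eps**2 * grad + eps * torch.randn_like(x)
    return x.detach()

for step in range(500):
    pos = data[torch.randint(len(data), (128,))]
    neg = langevin_samples(128)
    # Maximum-likelihood gradient: push energy down on data, up on model samples
    loss = energy(pos).mean() - energy(neg).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```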
2021
Monte Carlo Gradient Estimation in Machine Learning
In this talk, I go over the (semi-)recent review paper on Monte Carlo gradient estimation methods in machine learning. This work discusses the problem of estimating the gradient of an expectation, which comes up regularly in machine learning, for example in variational inference and reinforcement learning. The paper looks at three different methods for solving the problem: the pathwise, score function, and measure-valued gradient estimators. In addition to describing the gradient estimation problem, I'll describe each of these estimators and their properties, and give some advice for choosing one in practice; a small numerical comparison of two of the estimators follows below. Presented at the Machine Learning Reading Group at the Cambridge University Engineering Department, UK.
Video, Slides
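As a concrete illustration (mine, not from the paper or talk), the snippet below compares the score-function and pathwise estimators on a problem with a known answer, d/dμ E over x ~ N(μ, 1) of x² = 2μ. Both are unbiased, but the score-function estimator has much higher variance, which is one of the trade-offs the paper discusses.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n = 1.5, 100_000
eps = rng.standard_normal(n)
x = mu + eps                                  # reparameterised draws from N(mu, 1)
f = x**2

# Score-function (REINFORCE) estimator: E[f(x) * d/dmu log N(x; mu, 1)]
score_terms = f * (x - mu)
# Pathwise estimator: differentiate through x = mu + eps, so d f/d mu = 2x
path_terms = 2 * x

print(f"true grad = {2 * mu:.3f}")
print(f"score:    mean {score_terms.mean():.3f}, variance {score_terms.var():.1f}")
print(f"pathwise: mean {path_terms.mean():.3f}, variance {path_terms.var():.1f}")
```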
2020
Depth Uncertainty in Neural Networks
A talk about our NeurIPS paper Depth Uncertainty in Neural Networks at the weekly seminar of the Amsterdam Machine Learning Lab (AMLAB).
Slides
World Models
A World Model is a generative recurrent neural network that is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. Ha and Schmidhuber achieve state-of-the-art results on OpenAI Gym environments such as CarRacing-v0 by evolving a simple policy that uses these compressed representations. In our talk, we give an introduction to Markov Decision Processes and model-based reinforcement learning (RL), then dissect the Ha and Schmidhuber paper and describe more recent work expanding on these ideas; a schematic code sketch of the architecture follows below. Presented at the Machine Learning Reading Group at the Cambridge University Engineering Department, UK.
Slides
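For orientation, here is a schematic sketch of how the three World Models components fit together: V compresses an observation to a latent z, M updates a recurrent belief h, and a tiny controller C acts on [z, h]. The shapes are my assumptions, and the linear modules are simple stand-ins for the paper's convolutional VAE and MDN-RNN.

```python
import torch

Z, H, A = 32, 256, 3                          # latent, hidden, action dims (assumed)
encoder = torch.nn.Linear(64 * 64, Z)         # stand-in for the conv VAE encoder (V)
rnn = torch.nn.LSTMCell(Z + A, H)             # stand-in for the MDN-RNN (M)
controller = torch.nn.Linear(Z + H, A)        # the simple evolved policy (C)

frame = torch.rand(1, 64 * 64)                # one flattened observation
h, c = torch.zeros(1, H), torch.zeros(1, H)

z = encoder(frame)                                          # V: compress the frame
action = torch.tanh(controller(torch.cat([z, h], dim=1)))   # C: act on [z, h]
h, c = rnn(torch.cat([z, action], dim=1), (h, c))           # M: update the belief
print(action.shape)  # torch.Size([1, 3]), e.g. steering, gas, brake
```

Because only the controller's small parameter set is optimised against the environment, evolving it with a simple strategy becomes feasible.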
Variational Depth Search in ResNets
Oral presentation of our paper Variational Depth Search in ResNets at the Neural Architecture Search (NAS) workshop at ICLR 2020.
Video, Slides
2019
Equivariance and Symmetries in CNNs
This talk discusses applications of group theory to deep learning, specifically to the design of CNNs. We focus on a few key papers by Cohen and Welling, each of which proposes new kinds of convolutional layers that enjoy equivariance to more symmetries than the standard planar CNN we've all come to know and love. We motivate the use of these new convolutions, build an intuition for how they work, give some practical considerations for their use, and finally dive into the theory behind them; a quick numerical check of the equivariance property follows below. Presented at the Machine Learning Reading Group at the Cambridge University Engineering Department, UK.
Slides
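As a quick warm-up to the group-theoretic view, the check below (my own example) verifies numerically that an ordinary convolution satisfies the equivariance property f(g · x) = g · f(x) for translations but not for rotations; circular padding is used so translation equivariance holds exactly, without boundary effects. Group-equivariant layers of the kind Cohen and Welling propose extend the guarantee to the rotation case.

```python
import torch

torch.manual_seed(0)
conv = torch.nn.Conv2d(1, 4, kernel_size=3, padding=1,
                       padding_mode='circular', bias=False)
x = torch.randn(1, 1, 16, 16)

shift = lambda t: torch.roll(t, shifts=2, dims=-1)     # g: translate 2 px right
print(torch.allclose(conv(shift(x)), shift(conv(x)), atol=1e-5))  # True

rot = lambda t: torch.rot90(t, 1, dims=(-2, -1))       # g: rotate 90 degrees
print(torch.allclose(conv(rot(x)), rot(conv(x)), atol=1e-5))      # False in general
```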
Convolutional Models
Introduction to convolutional networks for image classification and spatial data. Presented at the Deep Learning Indaba 2019.
Video, Slides
A Primer on Missing Data
In the real world, datasets are often messy – it is common for values to be missing or corrupted. Examples include empty cells in spreadsheets, unanswered survey questions, or readings from faulty sensors. Unfortunately, despite the frequent occurrence of such defects, software engineers tend not to develop algorithms that are robust to missing values, and as a result many standard algorithms fail on such datasets. This talk briefly discusses the theory of missing data and practical approaches for dealing with missingness in real-world machine learning; a tiny illustration of one such approach follows below. Presented at IndabaX South Africa 2019.
Video, Slides
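As one concrete example of the practical side (toy data of my own, not from the talk): mean-impute the missing entries, but keep a binary mask so a downstream model can still see which values were missing, since missingness can itself be informative.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [5.0, np.nan]])

mask = np.isnan(X).astype(float)              # 1 where a value was missing
col_means = np.nanmean(X, axis=0)             # per-feature mean, ignoring NaNs
X_imputed = np.where(np.isnan(X), col_means, X)

# Feed both the imputed values and the mask to a downstream model
X_aug = np.hstack([X_imputed, mask])
print(X_aug)
```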