## New Directions

Machine learning can be broadly understood as the science of prediction. Recent algorithmic advances have greatly reduced the difficulty of training **black box** predictors on large amounts of data.

Predictions alone are not always sufficient: one also needs the means to **understand** and **capture** uncertainty. This can be achieved by:

- Requiring the **black box** machine learning model to output both a **prediction** and its **uncertainty**.
- Opening the **black box** to understand how the machine **reached its conclusions**.

I aim to tackle these issues head-on.

### Loss Functions for Uncertainty Estimation

Many machine learning methods proceed via the minimization of a loss function over a provided training set. Much work has been done on designing loss functions with good statistical/computational properties.

A natural extension is to augment the loss function, penalizing predictors that do not accurately report their uncertainty. This work aims to provide new loss functions for estimating uncertainty.
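As a toy illustration of how a loss function can encode uncertainty, consider the pinball (quantile) loss: the constant prediction that minimizes it at level τ is the empirical τ-quantile, so fitting the same model at two levels brackets the data with a prediction interval. A minimal sketch in pure Python (my own illustration, not a method from this text):

```python
import random

def pinball_loss(q, ys, tau):
    """Average pinball (quantile) loss of the constant prediction q at level tau."""
    return sum(max(tau * (y - q), (tau - 1) * (y - q)) for y in ys) / len(ys)

rng = random.Random(0)
ys = [rng.gauss(0, 1) for _ in range(1000)]

# Grid-search the minimizer at tau = 0.9: it lands near the empirical
# 90th percentile of the sample, so the loss itself yields an uncertainty band.
grid = [i / 100 for i in range(-300, 301)]
q90 = min(grid, key=lambda q: pinball_loss(q, ys, 0.9))
print(q90)
```

Fitting at τ = 0.1 and τ = 0.9 in the same way would produce the two endpoints of an 80% prediction interval, which is one concrete way a loss can penalize predictors that misreport their uncertainty.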

**Buzzwords:** Convex Optimization, Robust Statistics, Down with the Bootstrap.

### Opening the Black Box with Machine Teaching

Computers can extract patterns from *very* large data sets, in a *fraction* of the time it takes humans. For example, performing a regression with thousands of relevant features and millions of training examples takes seconds on my laptop!

Machine teaching provides a means to understand these patterns. In much the same way a professor distills hundreds (if not thousands!) of academic papers into an undergraduate curriculum, machine teaching produces a reduced training set that contains the same information as the larger corpus analyzed by the machine.

A great example is the notion of a support vector: in a support vector machine, the fitted decision boundary depends only on a small subset of the training points, and the rest can be discarded without changing it. In this work, I aim to generalize this notion to other procedures and problems, with firm statistical and computational guarantees a key focus.
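To make the support-vector idea concrete, here is a hedged sketch (a toy example of my own, in pure Python): a linear SVM fit by subgradient descent on the hinge loss. The points lying on or near the margin are the support vectors; the far-away points contribute nothing to the fit, which is exactly the compression this research direction seeks to generalize.

```python
def train_linear_svm(data, lam=1e-4, lr=0.02, epochs=2000):
    """Full-batch subgradient descent on the l2-regularized hinge loss."""
    w, b, n = [0.0, 0.0], 0.0, len(data)
    for _ in range(epochs):
        gw, gb = [lam * w[0], lam * w[1]], 0.0
        for (x, y) in data:
            if y * (w[0] * x[0] + w[1] * x[1] + b) < 1:  # margin violated
                gw[0] -= y * x[0] / n
                gw[1] -= y * x[1] / n
                gb -= y / n
        w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
        b -= lr * gb
    return w, b

def support_vectors(data, w, b, tol=0.2):
    """Training points on or near the margin; these alone pin down the fit."""
    return [(x, y) for (x, y) in data
            if y * (w[0] * x[0] + w[1] * x[1] + b) <= 1 + tol]

# Toy separable data: class +1 in the upper-right, class -1 in the lower-left.
data = [((2.0, 2.0), 1), ((3.0, 1.5), 1), ((2.5, 3.0), 1), ((4.0, 4.0), 1),
        ((-2.0, -2.0), -1), ((-3.0, -1.5), -1),
        ((-2.5, -3.0), -1), ((-4.0, -4.0), -1)]
w, b = train_linear_svm(data)
sv = support_vectors(data, w, b)
# Interior points such as (4, 4) sit far beyond the margin and are not
# support vectors: deleting them would not move the decision boundary.
```

Retraining on `sv` alone would recover (essentially) the same boundary, which is the "reduced training set" of machine teaching in its simplest form.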

**Buzzwords:** Compression Bounds, Generalized Support Vectors, Clustering (with a purpose!)