What is WDRO?
Wasserstein Distributionally Robust Optimization (WDRO) is a mathematical programming framework that can make machine learning models robust to shifts in the data distribution.
Machine Learning models
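Let us denote the cost $f_\theta(\xi)$ of a prediction parametrized by $\theta$ for some uncertain variable $\xi$.
For instance, in linear regression, we have $\xi = (x, y)$ with $x$ the data and $y$ the label. Then, $f_\theta(\xi) = (y - \langle \theta, x \rangle)^2$.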
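As a small illustration of the linear regression case, the sketch below evaluates the cost of one prediction, assuming the usual squared error between the label and the linear prediction. The function name `squared_cost` and the numerical values are hypothetical and chosen only for this example.

```python
import numpy as np


def squared_cost(theta, x, y):
    """Cost of one sample (x, y) under a linear model with parameter theta.

    Assumption: the cost is the squared error (y - <theta, x>)^2.
    """
    residual = y - np.dot(theta, x)
    return residual ** 2


# Hypothetical parameter and sample, for illustration only.
theta = np.array([1.0, -2.0])
x = np.array([3.0, 0.5])
y = 1.0

cost = squared_cost(theta, x, y)
print(cost)
```

Here the uncertain variable is the pair `(x, y)`, and `theta` is the quantity the model is trained on.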
In machine learning, it is usual to train our model (or fit, i.e. optimize over $\theta$) using data samples $(\xi_i)_{i=1}^N$ of the uncertain variable by minimizing the empirical risk, which leads to the problem:
$$\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} f_\theta(\xi_i) \tag{1}$$
Equation (1) is usually called Empirical Risk Minimization (ERM) in the literature.
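To make ERM concrete, here is a minimal sketch for linear regression on synthetic data, assuming a squared-error cost. The helper `empirical_risk`, the synthetic data, and the use of `numpy.linalg.lstsq` as the solver are choices made for this example; they are not a description of how any particular WDRO library fits models.

```python
import numpy as np


def empirical_risk(theta, X, y):
    """Average squared-error cost over the N samples (rows of X, entries of y)."""
    residuals = y - X @ theta
    return np.mean(residuals ** 2)


# Synthetic samples (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
N, d = 200, 3
theta_true = np.array([0.5, -1.0, 2.0])
X = rng.normal(size=(N, d))
y = X @ theta_true + 0.1 * rng.normal(size=N)

# For the squared-error cost, minimizing the empirical risk is an
# ordinary least-squares problem, which lstsq solves directly.
theta_erm, *_ = np.linalg.lstsq(X, y, rcond=None)

print("ERM estimate:", theta_erm)
print("empirical risk:", empirical_risk(theta_erm, X, y))
```

The estimate minimizes the average cost over the observed samples only, so it reflects the empirical distribution of the data and nothing beyond it.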