skwdro.torch module
API
This is the main dish of the API: use this interface first, before trying the more complicated ones. See the PyTorch interface tutorial to learn more.
- skwdro.torch.robustify(loss_: Module | Callable[[...], Tensor], transform_: Module | None, rho: Tensor, xi_batchinit: Tensor, xi_labels_batchinit: Tensor | None, post_sample: bool = True, cost_spec: str | None = None, n_samples: int = 10, seed: int = 42, *, reduction: str | None = None, learning_rate: float | None = None, epsilon: float | None = None, sigma: float | None = None, l2reg: float | None = None, adapt: str | None = 'prodigy', n_iter: int | Tuple[int, int] | None = None, imp_samp: bool = True, loss_reduces_spatial_dims: bool = False) -> _DualFormulation
Provide the wrapped version of the primal loss.
- Parameters:
- loss_: nn.Module|Callable
the primal loss \(L_\theta\). Can be given either as a torch.nn.Module or as a (functional) callable.
- transform_: nn.Module|None
the transformation to apply to the (non-label) data before feeding it to the loss. Identity if set to None (default).
- rho: Tensor, scalar tensor
Wasserstein radius
- xi_batchinit: Tensor, shape (n_samples, n_features)
Data points to initialize the samplers and \(\lambda_0\)
- xi_labels_batchinit: Optional[Tensor], shape (n_samples, n_features)
Labels to initialize the samplers and \(\lambda_0\)
- post_sample: bool
whether to use a post-sampled dual loss
- cost_spec: str|None
the cost specification in the format (k, p) for a sample k-norm and p-power. None to use the default (2, 2).
- n_samples: int
number of \(\zeta\) samples to draw before the gradient descent begins (can be changed if needed between inferences)
- seed: int
the seed for the samplers
- reduction: str | None
specifies the reduction to apply to the outer expectation of the SkWDRO formula applied: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output; 'sum': the output will be summed. Default: None, which translates to 'mean'.
- learning_rate: float
the step size for the default descent algorithm linked to the loss function
- epsilon: float|None
Epsilon if hard coded, None to let the algo find it.
- sigma: float|None
Sigma if hard coded, None to let the algo find it.
- l2reg: float|None
L2 regularization if needed
- adapt: str|None
the adaptive step-size scheme to use, either “prodigy” or “mechanic”.
- n_iter: int|tuple[int, int]|None
sets the default number of iterations when the default solving routines are used. Mostly an internal parameter. If an int, it is the number of internal robust optimization steps; if a 2-tuple of ints, it is the number of ERM steps preceding the robust solve, then the number of robust steps; if None, it will be filled by default.
- imp_samp: bool
whether to use importance sampling (will work only for (2, 2) costs).
- loss_reduces_spatial_dims: bool
flag that can be set to True if the primal loss reduces the last dimension of the losses batch with its reduction set to 'none', e.g. for torch.nn.CrossEntropyLoss which will take one dimension as channel axis. Defaults to False.
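To make the cost_spec and reduction descriptions above concrete, here is a minimal plain-Python sketch of what they describe. It is an illustration only and not skwdro code: the helper names transport_cost and reduce_losses are hypothetical, and reading (k, p) as the cost \(\|\xi - \zeta\|_k^p\) is an assumption based on the "sample k-norm and p-power" wording. The string encoding that cost_spec actually expects is not shown on this page, so the sketch takes k and p as plain numbers.

```python
import math


def transport_cost(xi, zeta, k=2, p=2):
    """Hypothetical helper: ground cost ||xi - zeta||_k ** p.

    Assumed reading of a (k, p) cost; the default (2, 2) is the
    squared Euclidean distance.
    """
    diffs = [abs(a - b) for a, b in zip(xi, zeta)]
    norm = sum(d ** k for d in diffs) ** (1.0 / k)
    return norm ** p


def reduce_losses(values, reduction=None):
    """Hypothetical helper mirroring the documented reduction options."""
    if reduction is None:
        # Documented behaviour: None translates to 'mean'.
        reduction = "mean"
    if reduction == "none":
        return list(values)
    if reduction == "sum":
        return sum(values)
    if reduction == "mean":
        return sum(values) / len(values)
    raise ValueError(f"unknown reduction: {reduction!r}")


# Squared Euclidean cost between (0, 0) and (3, 4).
print(transport_cost([0.0, 0.0], [3.0, 4.0]))
# Per-sample losses reduced with the default ('mean').
print(reduce_losses([1.0, 2.0, 3.0]))
```

In the library these choices are passed to robustify through the cost_spec and reduction arguments rather than computed by hand; the sketch only spells out the arithmetic the parameter descriptions refer to.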