mygrad.nnet.losses.softmax_focal_loss#
- mygrad.nnet.losses.softmax_focal_loss(scores: ArrayLike, targets: ArrayLike, *, alpha: float = 1, gamma: float = 0, constant: bool | None = None) → Tensor [source]#
Applies the softmax normalization to the input scores before computing the per-datum focal loss.
- Parameters:
- scores : ArrayLike, shape=(N, C)
The C class scores for each of the N pieces of data.
- targets : ArrayLike, shape=(N,)
The correct class indices, in [0, C), for each datum.
- alpha : Real, optional (default=1)
The α weighting factor in the loss formulation.
- gamma : Real, optional (default=0)
The γ focusing parameter; must be a non-negative value. Note that for γ=0 and α=1, this reduces to the standard cross-entropy loss.
- constant : Optional[bool]
If True, the returned tensor is a constant (it does not back-propagate a gradient).
- Returns:
- mygrad.Tensor, shape=(N,)
The per-datum focal loss.
Notes
The focal loss formulation was introduced in https://arxiv.org/abs/1708.02002. It is given by \(-\alpha(1-p)^\gamma\log(p)\).
The focal loss for datum \(i\) is given by
\[-\alpha \hat{y}_i(1-p_i)^\gamma\log(p_i)\]
where \(\hat{y}_i\) is the one-hot encoding of the datum's label: it is 1 at the entry corresponding to the correct class and 0 elsewhere. That is, if the label \(y_k\) is 2 and there are four possible label values, then \(\hat{y}_k = (0, 0, 1, 0)\).
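For concreteness, here is a minimal NumPy sketch of the formula above (illustrative only, not part of MyGrad's API): it applies a softmax to the raw class scores and evaluates \(-\alpha(1-p_i)^\gamma\log(p_i)\) at each datum's true class. The name softmax_focal_loss_ref and its argument names mirror the parameters documented above but are purely hypothetical.

```python
import numpy as np

def softmax_focal_loss_ref(scores, targets, alpha=1.0, gamma=0.0):
    """Illustrative per-datum focal loss, computed directly from the formula."""
    scores = np.asarray(scores, dtype=float)
    targets = np.asarray(targets)
    # softmax normalization of the class scores (shifted for numerical stability)
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # p: predicted probability assigned to the true class of each datum
    p = probs[np.arange(len(targets)), targets]
    return -alpha * (1 - p) ** gamma * np.log(p)  # shape-(N,) per-datum loss
```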
It is recommended in the paper that you normalize by the number of foreground samples.
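A usage sketch (illustrative, not taken from the library's official examples), assuming the import path shown in the signature above; it computes the per-datum focal loss for a small batch and back-propagates through the result:

```python
import numpy as np
import mygrad as mg
from mygrad.nnet.losses import softmax_focal_loss

# four data, three classes; targets hold the correct class index per datum
scores = mg.tensor(np.random.randn(4, 3))
targets = np.array([0, 2, 1, 1])

loss = softmax_focal_loss(scores, targets, alpha=0.25, gamma=2.0)  # shape-(4,)
loss.sum().backward()  # gradients of the summed loss accumulate in scores.grad
```

Passing constant=True would instead return a tensor that does not back-propagate a gradient, per the constant parameter documented above.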