mygrad.nnet.losses.focal_loss

mygrad.nnet.losses.focal_loss(class_probs: ArrayLike, targets: ArrayLike, *, alpha: float = 1, gamma: float = 0, constant: bool | None = None) → Tensor

Return the per-datum focal loss.

Parameters:
class_probs : ArrayLike, shape=(N, C)

The C class probabilities for each of the N pieces of data. Each value is expected to lie in (0, 1].

targets : ArrayLike, shape=(N,)

The correct class indices, in [0, C), for each datum.

alpha : Real, optional (default=1)

The α weighting factor in the loss formulation.

gamma : Real, optional (default=0)

The γ focusing parameter. Note that for γ=0 and α=1 this reduces to the cross-entropy loss (see the Examples below). Must be a non-negative value.

constant : Optional[bool]

If True, the returned tensor is a constant (it does not back-propagate a gradient).

Returns:
mygrad.Tensor, shape=(N,)

The per-datum focal loss.

Notes

The focal loss was introduced in https://arxiv.org/abs/1708.02002 (Lin et al., "Focal Loss for Dense Object Detection").

The focal loss for datum-\(i\) is given by

\[-\alpha \hat{y}_i(1-p_i)^\gamma\log(p_i)\]

where \(\hat{y}_i\) is the one-hot encoding of datum-\(i\)'s label: its entry is 1 at the position of the correct class and 0 elsewhere. For example, if the label \(y_k\) is 2 and there are four possible classes, then \(\hat{y}_k = (0, 0, 1, 0)\).
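As a concrete sketch of this formula (the probability values below are purely illustrative), the per-datum loss can be evaluated by hand with NumPy and checked against focal_loss:

>>> import numpy as np
>>> from mygrad.nnet.losses import focal_loss
>>> probs = np.array([[0.2, 0.7, 0.1]])  # N=1 datum, C=3 classes
>>> targets = np.array([1])              # the correct class is index 1
>>> alpha, gamma = 1.0, 2.0
>>> p = probs[0, targets[0]]             # probability assigned to the true class
>>> by_hand = -alpha * (1 - p) ** gamma * np.log(p)
>>> np.isclose(by_hand, focal_loss(probs, targets, alpha=alpha, gamma=gamma).item())
True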

The paper recommends normalizing the loss by the number of foreground samples.
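Examples

A minimal usage sketch (the probability values are illustrative). The returned tensor holds one loss value per datum and back-propagates through class_probs unless constant=True:

>>> import mygrad as mg
>>> import numpy as np
>>> from mygrad.nnet.losses import focal_loss
>>> class_probs = mg.tensor([[0.7, 0.1, 0.2],
...                          [0.3, 0.6, 0.1]])
>>> targets = np.array([0, 1])
>>> loss = focal_loss(class_probs, targets, alpha=1.0, gamma=2.0)
>>> loss.shape  # one loss value per datum
(2,)
>>> loss.sum().backward()  # gradient flows back to class_probs
>>> class_probs.grad.shape
(2, 3)

With γ=0 and α=1, the result coincides with the per-datum cross-entropy, computed here directly from the true-class probabilities:

>>> cp = class_probs.data
>>> ce = -np.log(cp[np.arange(len(targets)), targets])  # per-datum cross-entropy
>>> np.allclose(focal_loss(cp, targets, alpha=1, gamma=0).data, ce)
True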