mygrad.nnet.activations.logsoftmax
- mygrad.nnet.activations.logsoftmax(x: ArrayLike, axis: None | int | Tuple[int, ...] = -1, *, constant: bool | None = None) → Tensor
Applies the log-softmax activation function:
f(x) = log ( exp(x) / sum( exp(x) ) )
Computes the log-softmax over one or more axes of an ND-tensor.
- Parameters:
- x : ArrayLike
Input data.
- axis : Union[None, int, Tuple[int, ...]], optional (default=-1)
The axis/axes over which to compute the log-softmax. By default, the log-softmax is computed over the trailing axis (illustrated in the example below).
- constant : Optional[bool]
If True, the returned tensor is a constant (it does not back-propagate a gradient); see the final example below.
- Returns:
- log_softmax : mygrad.Tensor
Tensor with the same shape as x.
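A minimal sketch of the axis behavior described above (illustrative values; the use of Tensor.data and numpy.allclose for the check is an assumption about mygrad's standard tensor interface, not something documented on this page):

>>> import numpy as np
>>> import mygrad as mg
>>> from mygrad.nnet import logsoftmax
>>> x = mg.Tensor([[0., 1., 2.],
...                [3., 4., 5.]])
>>> out = logsoftmax(x, axis=0)  # normalize down each column instead of the trailing axis
>>> np.allclose(mg.exp(out).data.sum(axis=0), 1.0)  # exponentiated scores sum to 1 along axis-0
True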
Notes
In the following, \(N\) is the number of samples in the batch and \(C\) is the number of possible classes for which scores are provided.
This implements a numerically-stable version of log-softmax, compared to the naive implementation using mygrad.log, mygrad.exp, and mygrad.sum.
Given the shape-\((N, C)\) tensor of scores, x, the log-softmax classification scores are computed. That is, the score for class-\(k\) of a given datum (\(s_{k}\)) is normalized using the ‘softmax’ transformation, and its logarithm is taken:

\[p_{k} = \log{\frac{e^{s_k}}{\sum_{i=1}^{C}{e^{s_i}}}}\]
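To make the stability note concrete, here is a rough sketch contrasting the naive composition mentioned above with the standard log-sum-exp (max-subtraction) trick; it is illustrative only and does not depict mygrad's actual internal implementation:

>>> import numpy as np
>>> import mygrad as mg
>>> from mygrad.nnet import logsoftmax
>>> x = mg.Tensor([1000., 1000., 999.])
>>> # naive composition: exp(1000.) overflows to inf in float64, yielding nan everywhere
>>> naive = mg.log(mg.exp(x) / mg.sum(mg.exp(x)))
>>> naive.data
array([nan, nan, nan])
>>> # log-sum-exp trick: subtract the max before exponentiating so every term stays finite
>>> shifted = x - x.data.max()
>>> stable = shifted - mg.log(mg.sum(mg.exp(shifted)))
>>> np.allclose(stable.data, logsoftmax(x).data)
True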
Examples

>>> import mygrad as mg
>>> from mygrad.nnet import logsoftmax
>>> x = mg.Tensor([[  2.,   2.,   2.],
...                [2E50, 2E50, 1E50]])
>>> logsoftmax(x)
Tensor([[-1.09861229e+00, -1.09861229e+00, -1.09861229e+00],
        [ 0.00000000e+00,  0.00000000e+00, -1.00000000e+50]])
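As a rough illustration of the constant flag, the following sketch assumes mygrad's usual Tensor.backward, Tensor.grad, and Tensor.constant attributes, which are not documented on this page:

>>> x = mg.Tensor([1., 2., 3.])
>>> mg.sum(logsoftmax(x)).backward()  # back-propagate from a scalar through the log-softmax
>>> x.grad is None  # a gradient was propagated back to x
False
>>> logsoftmax(x, constant=True).constant  # with constant=True, no gradient will flow back
True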