mygrad.linalg.norm
mygrad.linalg.norm(x: ArrayLike, ord: int | float | None = None, axis: int | Tuple[int] | None = None, keepdims: bool = False, *, nan_to_num: bool = True, constant: bool | None = None) -> Tensor
Vector norm.

This function is able to return one of an infinite number of vector norms (described below), depending on the value of the ord parameter. In contrast to numpy.linalg.norm, matrix norms are not supported.

This docstring was adapted from that of numpy.linalg.norm [1].

- Parameters:
- x : ArrayLike
Input tensor. If axis is None, then x must be 1-D unless ord is None. If both axis and ord are None, the 2-norm of x.ravel will be returned.
- ord : Optional[Union[int, float]]
Order of the norm (see table under Notes). inf means numpy's inf object. The default is None.
- axis : Optional[Union[int, Tuple[int]]]
If axis is an integer, it specifies the axis of x along which to compute the vector norms. The default is None.
- keepdims : bool, optional (default=False)
If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.
- nan_to_num : bool, optional (default=True)
If True, gradients that would store nans due to the presence of zeros in x will instead store zeros in those places.
- constant : Optional[bool]
If True, this tensor is treated as a constant, and thus does not facilitate back propagation (i.e. constant.grad will always return None). Defaults to False for float-type data and to True for integer-type data. Integer-type tensors must be constant. See the sketch that follows the Returns section below.
- Returns:
- Tensor
Norm(s) of the vector(s).
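The following is a minimal sketch of how the constant flag behaves, assuming the usual mygrad Tensor.constant and Tensor.grad attributes; the expected norm value is noted in a comment.

>>> import mygrad as mg
>>> x = mg.tensor([3.0, 4.0])
>>> out = mg.linalg.norm(x)            # default 2-norm of x -> 5.0
>>> out.backward()                     # x.grad now holds d||x||/dx = x / ||x||
>>> const_out = mg.linalg.norm(mg.tensor([3.0, 4.0]), constant=True)
>>> const_out.constant                 # the result is treated as a constant
True
>>> const_out.grad is None             # and thus never carries a gradient
True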
Notes
For values of ord < 1, the result is, strictly speaking, not a mathematical 'norm', but it may still be useful for various numerical purposes.

The following norms can be calculated:
ord    norm for vectors
-----  --------------------------
inf    max(abs(x))
-inf   min(abs(x))
0      sum(x != 0)
1      as below
-1     as below
2      as below
-2     as below
other  sum(abs(x)**ord)**(1./ord)
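To make the table concrete, here is a small illustrative sketch; it assumes the inf and 0 orders listed above are accepted for 1-D input, and the expected values are noted in comments rather than shown as output.

>>> import mygrad as mg
>>> import numpy as np
>>> v = mg.tensor([3.0, -4.0, 0.0])
>>> two_norm = mg.linalg.norm(v)               # sqrt(9 + 16 + 0) -> 5.0
>>> one_norm = mg.linalg.norm(v, ord=1)        # sum(abs(v))      -> 7.0
>>> inf_norm = mg.linalg.norm(v, ord=np.inf)   # max(abs(v))      -> 4.0
>>> zero_norm = mg.linalg.norm(v, ord=0)       # sum(v != 0)      -> 2.0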
The Frobenius norm is given by [2]:
\(||A||_F = [\sum_{i,j} abs(a_{i,j})^2]^{1/2}\)
The nuclear norm is the sum of the singular values.
Both the Frobenius and nuclear norm orders are only defined for matrices and raise a ValueError when x.ndim != 2.
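Since matrix norms are not supported here, the Frobenius norm above can be illustrated with plain NumPy; this is a minimal sketch of the formula, not a mygrad feature.

>>> import numpy as np
>>> A = np.array([[1.0, 2.0],
...               [3.0, 4.0]])
>>> fro = np.linalg.norm(A, ord="fro")         # sqrt(1 + 4 + 9 + 16) ~= 5.4772
>>> direct = np.sqrt(np.sum(np.abs(A) ** 2))   # the formula applied directly
>>> bool(np.isclose(fro, direct))
True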
References
[1] numpy.linalg.norm, NumPy Reference Documentation: https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html
[2] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD, Johns Hopkins University Press, 1985, pg. 15
Examples
>>> import mygrad as mg
>>> x = mg.tensor([[1.0, 2.0, 3.0],
...                [1.0, 0.0, 0.0]])
>>> l2_norms = mg.linalg.norm(x, axis=1, ord=2)
>>> l2_norms
Tensor([3.74165739, 1.        ])
The presence of the elementwise absolute values in the norm operation means that zero-valued entries in any of the input vectors have an undefined derivative. When nan_to_num=False is specified, these derivatives will be reported as nan; otherwise they will be made to be 0.0.
>>> l2_norms = mg.linalg.norm(x, axis=1, ord=2, nan_to_num=False)
>>> l2_norms.backward()
>>> x.grad
array([[0.26726124, 0.53452248, 0.80178373],
       [1.        ,        nan,        nan]])
This is rigorously true, but it is often not the desired behavior in autodiff applications. Rather, it can be preferable to use 0.0 to fill these undefined derivatives. This is the default behavior when nan_to_num is not specified.
>>> l2_norms = mg.linalg.norm(x, axis=1, ord=2, nan_to_num=True)  # default setting: `nan_to_num=True`
>>> l2_norms.backward()
>>> x.grad
array([[0.26726124, 0.53452248, 0.80178373],
       [1.        , 0.        , 0.        ]])
L1 norms along each of the three columns:
>>> mg.linalg.norm(x, axis=0, ord=1)
Tensor([2., 2., 3.])
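The keepdims option can be sketched in the same vein, assuming it retains the normed axis as a size-one dimension, as in NumPy:

>>> norms = mg.linalg.norm(x, axis=1, ord=2, keepdims=True)
>>> norms.shape                   # the normed axis is kept with size 1
(2, 1)
>>> unit_rows = x / norms         # broadcasts against the original x
>>> unit_rows.shape
(2, 3)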