mygrad.multi_matmul

mygrad.multi_matmul(tensors: ArrayLike, *, constant: bool | None = None) -> Tensor

Matrix product of two or more tensors, calculated in the optimal ordering.

This documentation was adapted from numpy.linalg.multi_dot

Compute the matrix multiplication of two or more arrays in a single function call, while automatically selecting the fastest evaluation order. multi_matmul chains matmul and uses optimal parenthesization [1] [2]. Depending on the shapes of the matrices, this can reduce the number of scalar multiplications substantially; for the three matrices in the Notes below, the better ordering is ten times cheaper.

If the first argument is 1-D it is treated as a row vector.

If the last argument is 1-D it is treated as a column vector.

The other arguments must be 2-D or greater.

Think of multi_matmul as an optimized version of:

import functools
import mygrad as mg

def multi_matmul(tensors):
    return functools.reduce(mg.matmul, tensors)
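
The reduce-based form above always groups the products from left to right, whatever the shapes are. The following sketch makes that grouping visible by reducing over labels instead of tensors; the helper name record is an assumption made for illustration and is not part of MyGrad.

```python
import functools


def record(left, right):
    """Stand-in for a two-argument matmul that records the grouping as text."""
    return f"({left} @ {right})"


# functools.reduce folds from the left: the first two items are combined,
# then the result is combined with the third item, and so on.
grouping = functools.reduce(record, ["A", "B", "C", "D"])
print(grouping)  # (((A @ B) @ C) @ D)
```

Because the grouping is fixed, a plain reduce cannot avoid an expensive early product; choosing the grouping from the shapes is what the optimized version adds.
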
Parameters:
tensors: Sequence[array_like]

The sequence of tensors to be matrix-multiplied.

constant: Optional[bool]

If True, the returned tensor is treated as a constant, and thus does not facilitate back-propagation (i.e. its .grad attribute will always be None).

Defaults to False for float-type data. Defaults to True for integer-type data.

Integer-type tensors must be constant.

Returns:
mygrad.Tensor

The matrix product of the tensors provided.

Raises:
ValueError

If tensors contains fewer than two array_like items.

ValueError

If any tensor other than the first or last has fewer than two dimensions.

See also

matmul

matrix multiplication with two arguments.

Notes

The cost for a matrix multiplication can be calculated with the following function:

def cost(A, B):
    return A.shape[0] * A.shape[1] * B.shape[1]

Let’s assume we have three matrices \(A_{10 \times 100}, B_{100 \times 5}, C_{5 \times 50}\).

The costs for the two different parenthesizations are as follows:

cost((AB)C) = 10*100*5 + 10*5*50   = 5000 + 2500   = 7500
cost(A(BC)) = 10*100*50 + 100*5*50 = 50000 + 25000 = 75000
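
The two totals above can be reproduced, and the cheaper ordering found automatically, with the textbook dynamic-programming approach to matrix-chain ordering. This is a minimal sketch in plain Python and not MyGrad's actual implementation; the function name chain_order, the shape-tuple input, and the single-letter labels are assumptions made for illustration.

```python
from functools import lru_cache


def chain_order(shapes):
    """Return (min_cost, parenthesization) for a chain of 2-D shapes.

    `shapes` is a list of (rows, cols) tuples whose inner dimensions align.
    Cost model: multiplying (m, k) by (k, n) costs m * k * n.
    Matrices are labelled A, B, C, ... (so at most 26 in this sketch).
    """
    names = [chr(ord("A") + i) for i in range(len(shapes))]

    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest way to multiply matrices i..j (inclusive).
        if i == j:
            return 0, names[i]
        candidates = []
        for k in range(i, j):
            left_cost, left = best(i, k)
            right_cost, right = best(k + 1, j)
            # Left block is rows_i x cols_k, right block is rows_(k+1) x cols_j.
            step = shapes[i][0] * shapes[k][1] * shapes[j][1]
            candidates.append((left_cost + right_cost + step, f"({left}{right})"))
        return min(candidates)

    return best(0, len(shapes) - 1)


cost, order = chain_order([(10, 100), (100, 5), (5, 50)])
print(cost, order)  # 7500 ((AB)C)
```

The memoized helper evaluates each sub-chain once, so the search stays polynomial in the number of matrices rather than enumerating every parenthesization.
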

References

[1]

Cormen, “Introduction to Algorithms”, Chapter 15.2, p. 370-378

Examples

multi_matmul allows you to write:

>>> from mygrad.math.misc.funcs import matmul
>>> from mygrad import multi_matmul, Tensor
>>> import numpy as np
>>> # Prepare some random tensors
>>> A = Tensor(np.random.random((10000, 100)))
>>> B = Tensor(np.random.random((100, 1000)))
>>> C = Tensor(np.random.random((1000, 5)))
>>> D = Tensor(np.random.random((5, 333)))
>>> # the actual matrix multiplication
>>> multi_matmul([A, B, C, D]) # computes (A @ (B @ C)) @ D

instead of:

>>> matmul(matmul(matmul(A, B), C), D)
>>> # or
>>> A @ B @ C @ D
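
To see what the reordering buys for these particular shapes, the scalar-multiplication counts of the two groupings can be compared using the cost model from the Notes. The sketch below works on shape tuples only, so it runs without MyGrad; the helper name matmul_cost is an assumption made for illustration.

```python
def matmul_cost(left, right):
    """Return (scalar multiplications, result shape) for (m, k) @ (k, n)."""
    (m, k), (k2, n) = left, right
    assert k == k2, "inner dimensions must match"
    return m * k * n, (m, n)


# Shapes of A, B, C, D from the example above.
A, B, C, D = (10000, 100), (100, 1000), (1000, 5), (5, 333)

# Left to right, as in A @ B @ C @ D.
c1, AB = matmul_cost(A, B)
c2, ABC = matmul_cost(AB, C)
c3, _ = matmul_cost(ABC, D)
left_to_right = c1 + c2 + c3

# Grouped as (A @ (B @ C)) @ D.
c1, BC = matmul_cost(B, C)
c2, ABC = matmul_cost(A, BC)
c3, _ = matmul_cost(ABC, D)
reordered = c1 + c2 + c3

print(left_to_right, reordered)  # 1066650000 22150000
```

Under this cost model the reordered product needs roughly 48 times fewer scalar multiplications; actual wall-clock gains depend on the underlying BLAS and memory behaviour.
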