mygrad.multi_matmul
- mygrad.multi_matmul(tensors: ArrayLike, *, constant: bool | None = None) -> Tensor
Matrix product of two or more tensors calculated in the optimal ordering.
This documentation was adapted from numpy.linalg.multi_dot.
Compute the matrix multiplication of two or more arrays in a single function call, while automatically selecting the fastest evaluation order.
multi_matmul chains matmul and uses optimal parenthesization [1] [2]. Depending on the shapes of the matrices, this can speed up the multiplication a lot.
If the first argument is 1-D it is treated as a row vector.
If the last argument is 1-D it is treated as a column vector.
The other arguments must be 2-D or greater.
Think of multi_matmul as an optimized version of:
def multi_matmul(tensors):
    return functools.reduce(mg.matmul, tensors)
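As an illustrative check of this equivalence (the Tensor.data attribute used below to access the underlying numpy array is assumed from mygrad's usual API and is not documented on this page), the optimally ordered product matches the plain left-to-right reduction:
>>> import functools
>>> import numpy as np
>>> import mygrad as mg
>>> tensors = [mg.Tensor(np.random.random(shape))
...            for shape in [(2, 3), (3, 4), (4, 5)]]
>>> chained = functools.reduce(mg.matmul, tensors)  # plain left-to-right product
>>> optimal = mg.multi_matmul(tensors)              # optimally parenthesized product
>>> bool(np.allclose(chained.data, optimal.data))   # identical up to float round-off
True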
- Parameters:
- tensors: Sequence[array_like]
The sequence of tensors to be matrix-multiplied.
- constant: Optional[bool]
If True, this tensor is treated as a constant, and thus does not facilitate back propagation (i.e. constant.grad will always return None).
Defaults to False for float-type data. Defaults to True for integer-type data. Integer-type tensors must be constant.
- Returns:
- mygrad.Tensor
Returns the matrix product of the tensors provided.
- Raises:
- ValueError
If tensors contains fewer than two array_like items.
- ValueError
If a tensor other than the first or last is less than two-dimensional.
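For illustration only, a small sketch of these error conditions (the calls that would raise are left as comments; the exact exception messages are not shown on this page):
import numpy as np
from mygrad import Tensor, multi_matmul

t = Tensor(np.random.random((2, 2)))
v = Tensor(np.random.random(2))      # a 1-D tensor

multi_matmul([t, t])        # fine: two 2-D tensors
# multi_matmul([t])         # ValueError: fewer than two items supplied
# multi_matmul([t, v, t])   # ValueError: a middle argument is 1-D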
See also
matmul
matrix multiplication with two arguments.
Notes
The cost for a matrix multiplication can be calculated with the following function:
def cost(A, B):
    return A.shape[0] * A.shape[1] * B.shape[1]
Let’s assume we have three matrices \(A_{10\times 100}, B_{100\times 5}, C_{5\times 50}\).
The costs for the two different parenthesizations are as follows:
cost((AB)C) = 10*100*5 + 10*5*50 = 5000 + 2500 = 7500
cost(A(BC)) = 10*100*50 + 100*5*50 = 50000 + 25000 = 75000
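To make the arithmetic above concrete, here is a small sketch evaluating both parenthesization costs with the cost function from these Notes; the (10, 5) and (100, 50) arrays stand in for the intermediate products of each ordering:
import numpy as np

def cost(A, B):
    return A.shape[0] * A.shape[1] * B.shape[1]

A = np.empty((10, 100))
B = np.empty((100, 5))
C = np.empty((5, 50))

# (AB)C: A @ B costs 10*100*5, then the (10, 5) result times C costs 10*5*50
cost(A, B) + cost(np.empty((10, 5)), C)    # 5000 + 2500 = 7500
# A(BC): B @ C costs 100*5*50, then A times the (100, 50) result costs 10*100*50
cost(B, C) + cost(A, np.empty((100, 50)))  # 25000 + 50000 = 75000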
References
[1] Cormen, “Introduction to Algorithms”, Chapter 15.2, pp. 370-378
Examples
multi_matmul allows you to write:
>>> from mygrad.math.misc.funcs import matmul
>>> from mygrad import multi_matmul, Tensor
>>> import numpy as np
>>> # Prepare some random tensors
>>> A = Tensor(np.random.random((10000, 100)))
>>> B = Tensor(np.random.random((100, 1000)))
>>> C = Tensor(np.random.random((1000, 5)))
>>> D = Tensor(np.random.random((5, 333)))
>>> # the actual matrix multiplication
>>> multi_matmul([A, B, C, D])  # computes (A @ (B @ C)) @ D
instead of:
>>> matmul(matmul(matmul(A, B), C), D)
>>> # or
>>> A @ B @ C @ D
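As a further sketch of how the constant flag interacts with back-propagation (assuming mygrad's usual Tensor.backward(), Tensor.grad, and Tensor.constant behavior, none of which is documented on this page):
>>> import numpy as np
>>> from mygrad import multi_matmul, Tensor
>>> x = Tensor(np.random.random((4, 3)))
>>> y = Tensor(np.random.random((3, 5)))
>>> z = Tensor(np.random.random((5, 2)))
>>> out = multi_matmul([x, y, z])
>>> out.backward()     # back-propagates, filling .grad on each non-constant tensor
>>> x.grad.shape
(4, 3)
>>> multi_matmul([x, y, z], constant=True).constant  # constant output: no backprop
True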