MyGrad’s Tensor#
Tensor is the most critical piece of MyGrad. It is a numpy-array-like object capable of serving as a node in a computational graph that supports back-propagation of derivatives via the chain rule.
You can effectively use a Tensor as a drop-in replacement for a numpy array in all basic mathematical operations. This includes basic and advanced indexing, broadcasting, sums over axes, etc.; it will simply work.
>>> import mygrad as mg # note that we replace numpy with mygrad here
>>> x = mg.arange(9).reshape(3, 3)
>>> x
Tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])
>>> y = x[x == 4] ** 2
>>> y
Tensor([16], dtype=int32)
Thus MyGrad users can spend their time mastering numpy and their skills will transfer seamlessly when using this autograd library.
Creating a Tensor#
Tensor can be passed any “array-like” object of numerical data. This includes numbers, sequences (e.g. lists), nested sequences, numpy-ndarrays, and other mygrad-tensors. mygrad also provides familiar numpy-style tensor-creation functions (e.g. arange(), linspace(), etc.).
>>> import mygrad as mg
>>> import numpy as np
>>> mg.tensor(2.3) # creating a 0-dimensional tensor
Tensor(2.3)
>>> mg.tensor(np.array([1.2, 3.0])) # casting a numpy-array to a tensor
Tensor([1.2, 3. ])
>>> mg.tensor([[1, 2], [3, 4]]) # creating a 2-dimensional tensor from lists
Tensor([[1, 2],
        [3, 4]])
>>> mg.arange(4) # using numpy-style tensor creation functions
Tensor([0, 1, 2, 3])
Integer-valued tensors are treated as constants
>>> mg.astensor(1, dtype=np.int8).constant
True
By default, float-valued tensors are not treated as constants
>>> mg.astensor(1, dtype=np.float32).constant
False
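As a further illustration, here is a minimal sketch (assuming the constant keyword argument of mg.tensor) showing that a float tensor can be marked as a constant explicitly; constant tensors do not receive gradients during back-propagation:
>>> c = mg.tensor(2.0, constant=True)  # explicitly treat this float tensor as a constant
>>> out = c * mg.tensor(3.0)
>>> out.backward()
>>> c.grad is None  # constants are skipped during backprop
True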
Forward and Back-Propagation#
Let’s construct a computational graph consisting of two zero-dimensional tensors, x and y, which are used to compute an output tensor, ℒ. This is a “forward pass imperative” style for creating a computational graph: the graph is constructed as we carry out the forward-pass computation.
>>> x = mg.tensor(3.0)
>>> y = mg.tensor(2.0)
>>> ℒ = 2 * x + y ** 2
Invoking ℒ.backward() signals the computational graph to compute the total-derivative of ℒ with respect to each one of its dependent variables. I.e. x.grad will store dℒ/dx and y.grad will store dℒ/dy. Thus we have back-propagated a gradient from ℒ through our graph.
Each tensor of derivatives is computed elementwise. That is, if x = Tensor([x0, x1, x2]), then dℒ/dx represents [dℒ/d(x0), dℒ/d(x1), dℒ/d(x2)]; a vector-valued example follows the scalar walk-through below.
>>> ℒ.backward() # computes dℒ/dx and dℒ/dy
>>> x.grad # dℒ/dx
array(2.0)
>>> y.grad # dℒ/dy
array(4.0)
>>> ℒ.grad # dℒ/dℒ
array(1.0)
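For illustration, here is a minimal sketch of this elementwise behavior using a vector-valued tensor (it assumes only the tensor-creation and sum-reduction functions shown elsewhere in these docs):
>>> w = mg.tensor([1.0, 2.0, 3.0])
>>> ℒ = mg.sum(w ** 2)  # ℒ = w0² + w1² + w2²
>>> ℒ.backward()
>>> w.grad  # [dℒ/d(w0), dℒ/d(w1), dℒ/d(w2)] = 2 * w
array([2., 4., 6.])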
Once the gradients are computed, the computational graph containing x, y, and ℒ is cleared automatically. Additionally, involving any of these tensors in a new computational graph will automatically null their gradients.
>>> 2 * x  # involving x in a new computation nulls its gradient
Tensor(6.)
>>> x.grad is None
True
Or, you can use the null_grad() method to manually clear a tensor’s gradient
>>> y.null_grad()
Tensor(2.)
>>> y.grad is None
True
Accessing the Underlying NumPy Array#
Tensor is a thin wrapper on numpy.ndarray. A tensor’s underlying numpy-array can be accessed via .data. This returns a direct reference to the numpy array.
>>> x = mg.tensor([1, 2])
>>> x.data
array([1, 2])
>>> import numpy as np
>>> np.asarray(x)
array([1, 2])
Producing a “View” of a Tensor#
MyGrad’s tensors exhibit the same view semantics and memory-sharing relationships as NumPy arrays. I.e. any (non-scalar) tensor produced via basic indexing will share memory with its parent.
>>> x = mg.tensor([1., 2., 3., 4.])
>>> y = x[:2] # the view: Tensor([1., 2.])
>>> y.base is x
True
>>> np.shares_memory(x, y)
True
Mutating shared data will propagate through views:
>>> y *= -1
>>> x
Tensor([-1., -2., 3., 4.])
>>> y
Tensor([-1., -2.])
And this view relationship will also manifest between the tensors’ gradients
>>> (x ** 2).backward()
>>> x.grad
array([-2., -4., 6., 8.])
>>> y.grad
array([-2., -4.])
Documentation for mygrad.Tensor#
astype : Copy of the tensor with the specified dtype.
backward : Trigger backpropagation and compute the derivatives of this tensor.
base : A reference to the base tensor that the present tensor is a view of.
clear_graph : Removes the current tensor – and tensors above it – from their shared computational graph.
constant : If True, this tensor is treated as a constant and does not back-propagate a gradient.
copy : Produces a copy of self with copy.creator=None.
creator : The Operation instance that produced self.
dtype : Data-type of the tensor's elements.
grad : Returns the derivative of ℒ with respect to this tensor.
item : Copy an element of a tensor to a standard Python scalar and return it.
ndim : Number of tensor dimensions.
null_grad : Sets this tensor's gradient to be None.
null_gradients : Deprecated: Tensors will automatically have their computational graphs cleared during backprop.
shape : Tuple of tensor dimension-sizes.
size : Number of elements in the tensor.
T : Same as self.transpose(), except that self is returned if self.ndim < 2 and a view of the underlying data is utilized whenever possible.
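For illustration, a minimal sketch exercising a few of these attributes and methods (the exact reprs shown are assumed to follow NumPy's conventions):
>>> t = mg.tensor([[1., 2.], [3., 4.]])
>>> t.shape
(2, 2)
>>> t.ndim
2
>>> t.astype(np.float32).dtype
dtype('float32')
>>> t.sum().item()  # copy out a standard Python scalar
10.0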