mygrad.Tensor.grad
- property Tensor.grad: ndarray | None
Returns the derivative of ℒ with respect to this tensor. ℒ is the terminal node in the computational graph from which ℒ.backward() was invoked. If this tensor is a view of another tensor, then their gradients will exhibit the same memory-sharing relationship as their data.
- Returns:
- dℒ/dx: numpy.ndarray
  The gradient of the terminal node in a computational graph with respect to this tensor. The shape of this numpy array matches self.shape.
Examples
>>> import mygrad as mg
>>> x = mg.Tensor([1.0, 2.0])

Prior to backpropagation, tensors have None set for their gradients.

>>> x.grad is None
True
Now we trigger backpropagation…
>>> ℒ = x ** 2
>>> ℒ.backward()

and we see that x.grad stores dℒ/dx.

>>> x.grad  # dℒ/dx
array([2., 4.])
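As a quick sanity check (a minimal sketch, assuming NumPy is imported as np and that Tensor.data exposes the underlying array), the gradient agrees with the analytic derivative 2x, and its shape matches x.shape:

>>> import numpy as np
>>> np.allclose(x.grad, 2 * x.data)  # d(x ** 2)/dx = 2x
True
>>> x.grad.shape == x.shape
True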
Now we will demonstrate the relationship between the gradient of a view tensor and that of its base.

>>> base = mg.Tensor([1.0, 2.0, 3.0])
>>> view = base[:2]; view
Tensor([1., 2.])

>>> ℒ = base ** 2
>>> ℒ.backward()
Although view is not directly involved in the computation of ℒ, and thus would not typically have a gradient set by ℒ.backward(), it shares memory with base and thus it stores a gradient in correspondence to this “view relationship”. I.e. because view == base[:2], we expect to find that view.grad == base.grad[:2].

>>> base.grad
array([2., 4., 6.])
>>> view.grad
array([2., 4.])
>>> view.grad.base is base.grad
True
The reasoning here is that, because a base tensor and its view share the same array data, varying an element of that data implies that both the base tensor and the view will change (assuming the variation occurs specifically in a shared region). It follows that the base tensor’s gradient must share the same relationship with the view tensor’s gradient, since these gradients measure the “cause and effect” associated with varying elements of that data (albeit infinitesimally).
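To make this memory-sharing relationship concrete, here is a minimal check (assuming NumPy is imported as np) that the two gradient arrays draw on the same underlying buffer:

>>> import numpy as np
>>> np.shares_memory(view.grad, base.grad)
True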