tinygrad/tinygrad/pull/15463/changes
This commit adds weakint to the promotion lattice, the object used for type promotion.
How does it work?
The promotion lattice is a hashmap used to build a DAG of dtype nodes. A helper finds the least common ancestor (LCA) of two dtypes.
Why would the user create a weak int instead of defining their dtype up front?
Tensor([1.0, 2.0]).half() + 3 could cast the Tensor's float16 to float32 if 3 defaulted to an int32 instead of a weakint.
Are there any ML frameworks where all dtypes must be defined or they will error?
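A minimal sketch of how such a lattice and LCA lookup might work. The dtype names, edges, and helper names here are hypothetical, chosen for illustration; tinygrad's actual tables and helpers differ.

```python
# Sketch of a promotion lattice: each dtype maps to the dtypes it can
# promote to, forming a DAG. (Hypothetical edges, not tinygrad's real table.)
promo = {
    "bool":    ["int32"],
    "weakint": ["int32", "float16"],  # a weak int can fold into either branch
    "int32":   ["float32"],
    "float16": ["float32"],
    "float32": [],
}

def ancestors(d):
    """All dtypes reachable upward from d, including d itself."""
    seen, stack = set(), [d]
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(promo[cur])
    return seen

def promote(a, b):
    """Least common ancestor: the lowest dtype both a and b can promote to."""
    common = ancestors(a) & ancestors(b)
    # the lowest common dtype is the one that can still reach the most others
    return max(common, key=lambda d: len(ancestors(d)))

print(promote("float16", "weakint"))  # float16: the weak int defers to the tensor
print(promote("float16", "int32"))    # float32: a concrete int32 forces an upcast
```

This reproduces the motivating example: adding a weak 3 to a float16 tensor stays float16, while a concrete int32 would force a float32 upcast.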
tinygrad/tinygrad/pull/15356/changes
Refactoring of shared logic (bitwise_not() and elementwise.py) that touched Tinygrad's inverse/min/max functions. These functions are interesting because they are dtype-dependent and also require a particular property.
There is no MIN op in Tinygrad. MIN is just inverse(MAX(inverse(X), inverse(Y))). Inverse must therefore be an involution, and it must also reverse ordering, or MAX of the inverted values would not pick out the minimum.
bool, float, and int are stored differently in memory, so inversion must be handled differently for each (e.g. a bitwise NOT works for inverting an int, but will scramble a float).
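A scalar-level sketch of the trick in plain Python (not tinygrad's tensor ops), showing both why bitwise NOT works as the integer inverse and how applying the same bit pattern trick to a float goes wrong:

```python
import struct

def min_via_max(x, y, inv):
    # MIN expressed as inverse(MAX(inverse(x), inverse(y))). This requires
    # inv to be an order-reversing involution on the dtype in question.
    return inv(max(inv(x), inv(y)))

# For two's-complement ints, bitwise NOT qualifies: ~~x == x, and
# x < y implies ~x > ~y.
print(min_via_max(3, 7, lambda v: ~v))  # 3

# Flipping all the bits of a float32 is still an involution, but it does
# not reverse the ordering of positive floats, so MIN comes out wrong.
def bitnot_f32(x):
    (b,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", b ^ 0xFFFFFFFF))
    return y

print(min_via_max(1.0, 2.0, bitnot_f32))  # 2.0, not the minimum
```

This is why the refactor has to keep the inverse dtype-dependent: the involution that reverses ordering for ints scrambles floats.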
tinygrad/tinygrad/pull/15416/changes
I found that the sign() function in elementwise.py contained a hack similar to the one removed in the commit I read yesterday. It could be safely removed, which also had the (pleasant?) side effect of no longer casting boolean input to integer output.
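As a scalar-level illustration (plain Python, not the actual tensor code) of the dtype-preserving behavior after the hack's removal:

```python
def sign(x):
    # Returns -1, 0, or 1 in the same type as the input, rather than
    # unconditionally casting (bool in -> bool out, float in -> float out).
    return type(x)((x > 0) - (x < 0))

print(sign(-2.5))  # -1.0 (float stays float)
print(sign(True))  # True (bool stays bool, no cast to int)
```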
tinygrad/tinygrad/pull/15367/changes
To prevent a runtime error from being raised when the user called .gradient(some_tensor) after writing another_tensor.copysign(some_tensor), there was a hack in copysign() that added other.sign()*0 to link other to the rest of the graph, so that .backward could reach other from the root.
Mathematically, if a tensor is not connected to the root by the backward pass, then even if its gradient isn't materialized, its derivative with respect to the root (the loss) is 0. So the materialize_grads parameter was removed, and gradient simply returns 0 for anything that doesn't exist in the graph.
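A toy reverse-mode sketch of that idea (hypothetical Node/gradient names, not tinygrad's implementation): propagate gradients to everything reachable from the loss, then hand back 0 for anything that was never reached.

```python
class Node:
    def __init__(self, val, parents=()):
        # parents: sequence of (parent_node, local_gradient) pairs
        self.val, self.parents = val, parents

def gradient(loss, *wrt):
    grads = {loss: 1.0}
    stack = [loss]
    while stack:  # naive DFS; adequate for this tree-shaped example
        n = stack.pop()
        for p, local in n.parents:
            grads[p] = grads.get(p, 0.0) + grads[n] * local
            stack.append(p)
    # tensors unreachable from the loss get 0 instead of an error
    return [grads.get(t, 0.0) for t in wrt]

x = Node(3.0)
y = Node(6.0, parents=[(x, 2.0)])  # y = 2*x, so dy/dx = 2
z = Node(1.0)                       # never used to compute y
dx, dz = gradient(y, x, z)
print(dx, dz)  # 2.0 0.0
```

The disconnected z gets 0 without any special linking hack, which is exactly what makes the other.sign()*0 trick unnecessary.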