Mirror of https://github.com/tinygrad/tinygrad.git (synced 2026-06-24 02:14:17 +00:00)
* triton can add
* print stuff from triton
* write out file
* ops triton working
* reduce ops
* sort of works
* Triton bugfixes & implementation of remaining ops (#490)
* padding
* support pow, max, relu, gt0
* allocate return buffer
* Fix reduce
* Add tests for power op
* Fix triton illegal memory accesses and memory leak (#512)
* Fix mypy issue
* Add triton to setup.py
* Replace torch with pycuda
* Use one cuda stream for data transfer and kernels
* Remove triton submodule
* Fix memory leak by using weakrefs for caching
* Fix memory access by adding valid as mask for load
* Fix invalid kernel launches by flattening the grid (#515)

---------

Co-authored-by: Martin Loretz <20306567+martinloretzzz@users.noreply.github.com>

| Name | ||
|---|---|---|
| efficientnet | ||
| __init__.py | ||
| Dockerfile | ||
| external_osx_profiling.py | ||
| external_test_gpu_ast.py | ||
| external_test_image.py | ||
| external_test_llvm.py | ||
| external_test_opt.py | ||
| graph_batchnorm.py | ||
| test_conv.py | ||
| test_efficientnet.py | ||
| test_gc.py | ||
| test_mnist.py | ||
| test_net_speed.py | ||
| test_nn.py | ||
| test_onnx.py | ||
| test_ops.py | ||
| test_optim.py | ||
| test_shapetracker.py | ||
| test_speed_v_torch.py | ||
| test_symbolic.py | ||
| test_tensor.py | ||
| test_train.py | ||