tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

Timmy 4592fc8fe7 Multireduce Kernels - prereq refactor (#4173) * refactor rendering a reduceop into its own function (will help for kernels with multiple reduceops) * linters * addressing concerns		2024-04-14 20:16:54 -04:00
codegen	Multireduce Kernels - prereq refactor (#4173)	2024-04-14 20:16:54 -04:00
engine	optionally use a copy kernel instead of SDMA (#4116)	2024-04-12 23:10:41 -07:00
features	move sum acc_dtype into lazy so it applies to backward (#4149)	2024-04-11 14:43:56 -04:00
nn	Resnet fp16 training with fp32 master weight copy (#4144)	2024-04-14 11:25:08 -04:00
renderer	Update ssa input order and annotate types in cstyle and assembly (#4117)	2024-04-09 13:10:29 -04:00
runtime	hotfix: CUDA_P2P works (#4155)	2024-04-12 18:20:12 +03:00
shape	assert if expr_idxs return might be outside of int32 (#4157)	2024-04-12 14:18:35 -04:00
__init__.py	spend 5 lines to bring mnist into the repo (#4122)	2024-04-09 19:24:57 -07:00
buffer.py	use Buffer.ensure_allocated in search _ensure_buffer_alloc (#4132)	2024-04-10 13:11:50 -04:00
device.py	multitensor shouldn't recompile (#4164)	2024-04-13 00:03:48 -07:00
dtype.py	rename Scalar to ConstType and cast_scalar to as_const (#3946)	2024-03-26 22:39:58 -04:00
function.py	move sum acc_dtype into lazy so it applies to backward (#4149)	2024-04-11 14:43:56 -04:00
helpers.py	PADTO SUM if parents of sum are all zero-preserving (#4140)	2024-04-10 22:16:12 -04:00
lazy.py	optionally use a copy kernel instead of SDMA (#4116)	2024-04-12 23:10:41 -07:00
ops.py	optionally use a copy kernel instead of SDMA (#4116)	2024-04-12 23:10:41 -07:00
tensor.py	cleanup lbs (#4163)	2024-04-12 22:32:16 -07:00