tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

Patrick Tsai 971d7f5d7c O(n) arange attempt (#3530)	2024-03-11 16:09:20 -07:00

* It works?
* Clamp correctly
* Refactor
* Make code better
* Undo some stuff
* First step to trying to make floats work
* Floats work in Python op but not metal because int div is different
  Python integer division was implemented as //, which rounds towards negative infinity, but C integer division rounds towards 0, so there is an off-by-1 division error
* arange does cumsum with ints and then multiplies by step
  This is so loop optimization can remain int only
* Undo a lot of symbolic changes
* Final check
* Cleanup
* There can be multiple phis
* Fix multiple phi op removal
* const sets dtype correctly
* Fix bugs
* Fix a couple bugs and add loop vars to resolve
* missed one
* Don't trim too many ops
* Fix symbolic test
* Use ones instead of full
* Delete test
* Lint passes
* max node error
* Small updates to loop logic
* Remove unnecessary changes
* We are getting somewhere
* Simple case
* Fix
* rm, prn
* Better
* If NumNode doesn't work then continue
* clamp is needed for arange(256)
* Move everything into the optim fn
* Replace correctly
* Order optimizations better
* Delete
* mypy
* Test for simplification
* Rename
* Fix test
* update test description
* Undo more
* Cleanup
* No replaced_ops map
* Fix lint
* AssertionError
* back again
* Reinstate assertion
* Return true and make diff not as big
* Bigger range for test
* Change cumsum impl
* fix bug
* make big cumsum work
* lint
* Undo cumsum 2-stage removal
* No while helper
* optional min/max clamping
* floats work
* rm giant arange test
* fix python cast None
* Check phi parents
* one phi allowed per where
* Fix one phi per where
* Rework iteration
* Delete assertions
* convert to int
* Try mul -1 instead of neg for hip..?
* Remove one phi per where requirements
* one accum only
* Lint
* should simplify a loop at a time
* Don't get rid of loop explicitly
* Need to iterate backwards
* lint
* unary neg
* Make optim work for onnx and sum_pad_collapse
* Better message
* filter alu ops correctly
* Fix the limiter
* lint and simplify
* Add it back
* off by one error
* test wheres and phis
* test max ops and non-if stuff
* <=
* cast_scalar
* Oops
* Change test
* Pass loop uops instead of a modified map
* Cut param transfer between linearizer and uops
* Fix issues
* Fix lint
* fix efficientnet python 3.8 invalid syntax
* distinct vars in seen_vars
* accurate var names

---------

Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>

external	ptx timing vs cuda timing (#3659)	2024-03-08 10:17:49 -08:00
imported	fix pad 0 size (#3277)	2024-01-30 08:58:10 -08:00
models	lazy.py: cache consts (#3577)	2024-03-02 03:50:05 -08:00
testextra	fix test extra issue (#3159)	2024-01-17 11:58:08 -08:00
unit	changes for teenygrad (#3665)	2024-03-09 15:30:34 -08:00
web	fast path for copy (#2548)	2023-12-01 11:34:47 -08:00
__init__.py	All devices are equal! (#196)	2020-12-15 23:44:08 -08:00
Dockerfile	Docker fix (#1039)	2023-06-25 10:38:58 -07:00
helpers.py	new lazy, benchmark (#2878)	2023-12-20 14:33:21 -08:00
test_assign.py	Remove Interpreted device & remaining CPU/TORCH ref (#3423)	2024-02-16 00:30:21 -05:00
test_conv.py	add back "CPU" in test_onnx_backend supports_device (#3426)	2024-02-16 00:49:30 -05:00
test_conv_shapetracker.py	move create schedule and delete old API (#3377)	2024-02-12 18:10:45 +01:00
test_copy_speed.py	remove cpu and torch backends (#3399)	2024-02-15 16:55:39 +01:00
test_custom_function.py	define var can be removed from vars to keep (#3549)	2024-02-29 17:44:19 -08:00
test_device_speed.py	cleanup tests Device[Device.DEFAULT] is always Compiled (#3645)	2024-03-07 11:15:42 -05:00
test_dtype.py	changes for teenygrad (#3665)	2024-03-09 15:30:34 -08:00
test_dtype_alu.py	skip METAL sin test in test_dtype_alu (#3633)	2024-03-06 17:29:19 -05:00
test_fusion_op.py	Remove Interpreted device & remaining CPU/TORCH ref (#3423)	2024-02-16 00:30:21 -05:00
test_fuzz_shape_ops.py	fix: align .split, .chunk and .unsqueeze with torch, add fuzz tests (#3505)	2024-02-28 17:06:39 -08:00
test_gc.py	fix up eye and fix gc test	2023-02-27 06:53:18 -08:00
test_image_dtype.py	ban __bool__ on Tensor (#3632)	2024-03-06 17:12:35 -05:00
test_jit.py	hotfix: test_jit_copyin	2024-02-15 12:37:53 +01:00
test_kernel_cache.py	move the compiler cache to be global (#2957)	2024-01-01 10:59:56 -08:00
test_lazybuffer.py	test for the split reduce kernel (#3515)	2024-02-27 21:29:25 -05:00
test_lazyop.py	move create schedule and delete old API (#3377)	2024-02-12 18:10:45 +01:00
test_linearizer.py	O(n) arange attempt (#3530)	2024-03-11 16:09:20 -07:00
test_linearizer_failures.py	O(n) arange attempt (#3530)	2024-03-11 16:09:20 -07:00
test_linearizer_overflows.py	bring ptx back (#3623)	2024-03-06 13:34:21 -08:00
test_masked_st.py	multitensor start (#2676)	2023-12-07 17:07:05 -08:00
test_method_cache.py	cleanup tests Device[Device.DEFAULT] is always Compiled (#3645)	2024-03-07 11:15:42 -05:00
test_multitensor.py	fix Tensor.to preserves grad.data (#3636)	2024-03-06 21:44:49 -05:00
test_net_speed.py	Remove pytest markers (#2831)	2023-12-18 18:53:28 -05:00
test_nn.py	fix SCE ignore_index with label_smoothing (#3574)	2024-03-01 22:19:45 -05:00
test_ops.py	O(n) arange attempt (#3530)	2024-03-11 16:09:20 -07:00
test_optim.py	remove realize from optimizer (#2880)	2023-12-20 16:42:41 -08:00
test_randomness.py	fix: make Tensor.rand produce correct values for float16 (#3654)	2024-03-10 18:48:00 -04:00
test_sample.py	enable test_sample for all backend (#2593)	2023-12-03 17:20:27 -05:00
test_schedule.py	cleanup tests Device[Device.DEFAULT] is always Compiled (#3645)	2024-03-07 11:15:42 -05:00
test_search.py	cleanup tests Device[Device.DEFAULT] is always Compiled (#3645)	2024-03-07 11:15:42 -05:00
test_specific_conv.py	remove cpu and torch backends (#3399)	2024-02-15 16:55:39 +01:00
test_speed_v_torch.py	move graph.py and jit.py into features (#3376)	2024-02-12 17:34:34 +01:00
test_symbolic_jit.py	bring ptx back (#3623)	2024-03-06 13:34:21 -08:00
test_symbolic_ops.py	bring ptx back (#3623)	2024-03-06 13:34:21 -08:00
test_symbolic_shapetracker.py	unbind view or shapetracker also returns var_val (#3067)	2024-01-09 21:45:05 -05:00
test_tensor.py	Fix Tensor's __repr__ for printing out grad (#3673)	2024-03-10 17:04:29 -04:00
test_tensor_data.py	dtype fmt (#3132)	2024-01-15 11:31:54 -08:00
test_to_numpy.py	Apply ruff linting rules to tests (#2473)	2023-11-27 21:24:06 -08:00
test_uops.py	constant folding (#3675)	2024-03-10 14:47:24 -07:00
test_uops_stats.py	uops flop counter (#3373)	2024-02-20 09:36:30 +01:00
test_winograd.py	uops flop counter (#3373)	2024-02-20 09:36:30 +01:00
test_zero_copy.py	remove numpy from device (#3123)	2024-01-14 19:36:05 -08:00