mirrors/tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

Author	SHA1	Message	Date
b1tg	24d328e313	onnx parser (#10435) * onnx parser * fix compile, lint * onnx.load -> onnx_load * compatible with ModelProto * fix test external_test_onnx_ops.py * fix tests * fix signed int * reduce to 261 lines * fix TypeProto.Optional * debug for _parse_message, add TypeProto.Sequence, cleanup * onnx_load from Tensor * remove BufferedReader * 174 lines and reduce tensor copy * cleanup * use onnx_load in external_model_benchmark.py * fix qcom test * [onnx] parser support external data --------- Co-authored-by: b1tg <b1tg@users.noreply.github.com> Co-authored-by: chenyu <chenyu@fastmail.com>	2025-06-09 12:44:28 -04:00
George Hotz	81b9c04574	move high level stuff to unit tests [pr] (#10708) * move high level stuff to unit tests [pr] * process replay on unit tests * fix pr, less compute * set omp num threads * set 200MB buffer size limit * delete junk * fix tests * faster * move test_indexing to unit * faster	2025-06-08 14:05:56 -07:00
George Hotz	4e2c3560b4	smaller tests are faster tests [pr] (#10704) * remove del spam from CI * more * preconstruct default buffer spec * ignore those errors * check exception * more exception check * skip stuff * smaller tests mean faster tests * a few more	2025-06-08 10:54:19 -07:00
George Hotz	32e9949052	rename lazydata to uop (#10698)	2025-06-08 08:42:22 -07:00
uuuvn	8e3f337075	Skip flaky test in ci (#10696) `test_data_parallel_resnet_train_step` is already skipped on LLVM/CPU: ```python @unittest.skipIf(CI and REAL_DEV in ("CUDA", "NV", "LLVM", "CPU"), "slow, and flaky on LLVM/CPU") @unittest.skipIf(REAL_DEV == "WEBGPU" and not OSX, "WEBGPU Vulkan can only run kernels with up to 10 buffers") def test_data_parallel_resnet_train_step(self): ``` It looks like `test_data_parallel_resnet` (no `_train_step`) is flaky in a similar way: https://github.com/tinygrad/tinygrad/actions/runs/15472667248/job/43560773882?pr=10642#step:9:64	2025-06-08 08:24:09 -07:00
George Hotz	8c76250d31	speed up a few tests (#10692)	2025-06-07 20:39:25 -07:00
ihar	40c1479267	added unit tests for 'argfix' (#10678)	2025-06-07 22:17:10 -04:00
ihar	74b849b5e1	remove unnecessary 'argfix' because 'view' is an alias to 'reshape'. all functionality must be inside 'reshape' (#10677) * remove unnecessary 'argfix' because 'view' is an alias to 'reshape'. all functionality must be inside 'reshape' * added the same set of unit tests for 'view' as for 'reshape' since 'view' is just an alias for 'reshape' * improved tests for 'view' op	2025-06-07 22:15:31 -04:00
Sieds Lykles	c29a56dd51	Fix whisper OOB (#10685) * fix whisper and test * remove import	2025-06-07 20:23:50 -04:00
George Hotz	53ed64e133	ci speed work 1 (#10676) * skip a few slow tests * use a venv for python packages * create venv * no user, it's in venv * ignore venv * venv * new cache key * try that * this * version the python cache	2025-06-07 16:33:11 -07:00
qazal	cb61774ab6	move shared viz fields out of serve.py [pr] (#10684) * move shared viz fields out [pr] * update javascript * update test_viz	2025-06-07 17:18:18 +03:00
qazal	b515d796fb	inline viz get_name [pr] (#10682) * inline viz get_name [pr] * changing name_fxn makes this simpler * waitUntil dom	2025-06-07 11:16:16 +03:00
wozeparrot	e3805171e2	feat: variable bs bitcast (#10674)	2025-06-06 17:21:53 -07:00
George Hotz	54db1f8ee8	prevent huge waste of multi ram (#10669) * prevent huge waste of multi ram * fix ram usage * only define var * add resolve * fix tests * fix cifar training * remove that logic * fix test without long	2025-06-06 17:17:21 -07:00
George Hotz	b68b7dbc2a	test winograd is close to normal conv [pr] (#10557) Co-authored-by: chenyu <chenyu@fastmail.com>	2025-06-06 19:11:49 -04:00
leopf	eb7305e6a4	Tensor.keccak("sha3_256") (#7186) Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com> Co-authored-by: George Hotz <geohot@gmail.com> Co-authored-by: wozeparrot <wozeparrot@gmail.com>	2025-06-06 15:24:05 -07:00
chenyu	bdede4924e	fix odd number in get_test_global_size (#10671) factor might not be a integer if input global_size has an odd number in it	2025-06-06 17:31:35 -04:00
George Hotz	7f0f97aa76	new test_multitensor tests (#10667) * new test_multitensor tests * cleanup scheduler	2025-06-06 10:26:28 -07:00
chenyu	4a6d84c4c3	hotfix llama start_pos vmax is max_context-1 (#10659) * hotfix llama start_pos vmax is max_context-1 fixed `IGNORE_OOB=0 python3 examples/llama3.py --size 1B --benchmark --temperature 0` * hotfix: multitensor transformer test tests kv cache --------- Co-authored-by: George Hotz <geohot@gmail.com>	2025-06-06 00:41:25 -04:00
George Hotz	5eb6e1e65a	Revert "hotfix: multitensor transformer test tests kv cache" This reverts commit `ad9f88419a`.	2025-06-05 21:15:34 -07:00
George Hotz	ad9f88419a	hotfix: multitensor transformer test tests kv cache	2025-06-05 21:08:57 -07:00
George Hotz	8325c4f192	tests for multi assign (#10658) * tests for multi assign * transformer tests * add that assert	2025-06-05 20:56:40 -07:00
wozeparrot	0d86f8d375	fix failed threefry (#10646)	2025-06-05 17:17:42 -07:00
chenyu	ff1aad7b69	fix const float pow to int tensor (#10655) was incorrectly casted into int	2025-06-05 19:15:12 -04:00
George Hotz	baba274a76	minimal mstack pr to fix allreduce (#10649) * minimal mstack pr to fix allreduce * fix webgpu	2025-06-05 15:14:53 -07:00
George Hotz	4c315f8e17	MSTACK little non-functional changes (#10648)	2025-06-05 13:20:22 -07:00
chenyu	46811d0d3c	minor external_model_benchmark cleanup (#10644)	2025-06-05 14:13:28 -04:00
qazal	26afbc954f	delete redundant tests from test_schedule [pr] (#10643)	2025-06-05 20:08:39 +03:00
chenyu	80ebce421d	remove metal buffer limit in external_model_benchmark [pr] (#10642) not needed anymore	2025-06-05 13:00:51 -04:00
qazal	28c4997236	check for matching shape order in fused reduce (#10641) * failing test * shapes match with ones removed	2025-06-05 19:37:22 +03:00
qazal	1190062812	prevent grouper can_chase while fusing arange [pr] (#10623)	2025-06-05 18:50:21 +03:00
qazal	8c5ea00522	push permutes through fused reduces (#10628) * fix pushing reshapes through reduceops * reduceop_view_right should assert on ndims mismatch * update that, view.reshape asserts it	2025-06-05 16:14:04 +03:00
chenyu	d0969f5a1f	cleanup multi tests (#10635)	2025-06-05 00:28:44 -04:00
qazal	571c0296a9	linearizer failure from FUSE_ARANGE default diff (#10629) * start with test_arange_sum * test_arange_avgpool2d * device.renderer.supports_float4	2025-06-04 19:11:52 +03:00
qazal	5056d21b29	add failing TestSchedule.test_arange_sum [pr] (#10627)	2025-06-04 17:23:59 +03:00
qazal	7114b6ab31	viz browser tests (#10626) * viz browser tests * expect failure if js/ isn't included * back green	2025-06-04 14:58:24 +03:00
wozeparrot	4d1686f767	clean: becnhmark -> benchmark (#10620)	2025-06-03 19:28:18 -07:00
qazal	ce9f12dc13	reorder cast before masking constants (#10609) * failing test from fuzzer * .numpy() handles bfloat16 better * const->view->cast becomes const->cast->view * update TestMovedConstFolding.test_cast_padded	2025-06-03 15:44:03 +03:00
qazal	910cabb081	add kernel count to grouper process replay differ [pr] (#10611)	2025-06-03 15:21:27 +03:00
Ahmed Harmouche	650404a143	[webgpu] Proper shared mem size for packed types (#10585) * Proper shared mem size in webgpu * Add test * Refactor test	2025-06-01 20:18:33 -04:00
qazal	3cc73a0172	simpler process replay main loop [pr] (#10588) * simpler process replay main loop [pr] * use logging * default to 1	2025-06-01 15:03:21 +03:00
qazal	dc882d3d7d	merge process replay and viz captures [pr] (#10581) * refactoring * test script * work * more work * diff * repr splits lines correctly * that * add location * add location * also don't need name_override * k.copy * [pr] * name_override 2 * err	2025-06-01 12:30:10 +03:00
qazal	1f8a8721e9	remove test_unaligns_idxs, UOps don't have order like this [pr] (#10587)	2025-06-01 12:16:14 +03:00
Ahmed Harmouche	35eb4d357a	[webgpu] Fix atomic shared mem load inside loop (#10530) * Disable shared mem atomics on webgpu * allow_any_len in load pattern matcher to fix temp load inside loop	2025-05-31 09:29:02 -04:00
qazal	5b59728c75	refactor LOAD(DEFINE_GLOBAL, VIEW) in kernels to LOAD(VIEW(DEFINE_GLOBAL)) (#10541) * changes to core tinygrad * fixups pt1 TC=3 docs/abstractions2.py IMAGE=2 test_quantize_dsp test_schedule * more tests * green now * images stay images	2025-05-30 14:27:58 +03:00
chenyu	116ffc4e92	cstyle strips paren for AND and OR (#10560)	2025-05-30 07:09:05 -04:00
qazal	bbf05110a2	use kernelize in TestLinearizer.test_indexing_multireduce [pr] (#10571)	2025-05-30 11:27:09 +03:00
qazal	7051bf3fd5	fixup hardcoded asts ptr dtype and constants [pr] (#10570) * fixup hardcoded asts ptr dtype and constants [pr] * use kernelize for test_kernel_count	2025-05-30 09:38:32 +03:00
qazal	066196415f	UOp.valid and const_like work with just shapes [pr] (#10569) * UOp.valid and const_like work with just shapes [pr] * pm_quant left * pm_quant	2025-05-30 08:55:06 +03:00
George Hotz	b3b43a82c4	remove Tensor.no_grad, it's meaningless now [pr] (#10556)	2025-05-28 22:20:02 -07:00

5,694 commits