mirrors/tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

Author	SHA1	Message	Date
George Hotz	1253819151	make beautiful indexing use a Variable (#10063 ) * make beautiful indexing use a Variable * stunning test * better color * training is broken * fix tests * fix variable indexing * fix test * no contiguous * revert that * revert that too * indexing two bind * skip for webgpu * make not slow	2025-04-27 08:22:38 -04:00
chenyu	4c1ce1a299	don't simplify if div folding resulted in negative numerator (#10064 ) * don't simplify if div folding resulted in negative numerator * test	2025-04-26 17:01:18 -04:00
George Hotz	1805403821	fix rand arange folding (#10060 ) * test rand range * --amend * fix rand arange folding * reduce_rangeless fix	2025-04-26 12:24:05 -04:00
qazal	d13c100981	don't sort dims in verify_sink_dims [pr] (#10059 ) * don't sort dims in verify_sink_dims [pr] * 1 can exist with n * put process_replay warn last * assert shape is the same * bring that back	2025-04-26 23:24:30 +08:00
George Hotz	11113c9d07	reduce_unparented (#10056 )	2025-04-26 09:48:16 -04:00
George Hotz	ea5dddc537	reduce collapse generic (#10045 ) * reduce collapse generic * new arange folder * new range folding * correct with sym * all tests pass * indexing ops passes * failing tests * fix tests, remove unused * revert that * torch indexing is fast * skip on webgpu * touchups * comments	2025-04-26 09:13:24 -04:00
quortus	5cdc96409e	Update outdated renderer.render calls (#10044 )	2025-04-26 07:35:19 -04:00
nimlgen	0fc85a2b0a	hcqfuzz: init (#10049 ) * hcqfuzz: init * fix fuzz * linter * graph * taht test * update readme	2025-04-25 23:19:21 +03:00
Ignacio Sica	76a86735c0	hotfix `amd` bf16 is supported case (#10039 ) * hotfix amd and amd_llvm * bf16 not supported in ci * hotfix amd_llvm is not a device * remove default * dont gate on ci and amd_llvm * minor cleanup * skip bf16 tc test for amd_llvm	2025-04-24 21:29:27 -03:00
Ignacio Sica	b4f823acbe	fix helper_tc_allclose (#9606 ) * fix helper_tc_allclose * cleanup * hotfix * cleanup * cleanup * check real buffer and add cast for bf16 * cleanup * fix padded for ops_python * avoid assert on amd emulated tc * swap dimensions * revert, should have nothing to do with padded * revert fix, should not go in this pr * remove skip	2025-04-24 18:36:40 -03:00
Rory Clear	3a189fa561	More yolo processing in tinygrad (#9928 ) * more tg less np * update webgpu html for new compile * resize boxes * remove text * add back note * fix indentation * fix indentation * remove magic num * remove now unused funcs * back to numpy nms * no loop * fix iou suppression * update test * dont suppress other classes * add working scale * fix expected value, rounded up 0.24 was being counted * add postprocess bool for onnx test * fix indents * clean * clean * fix indent * remove print * fix indent * remove unused import * remove hardcoded 0.25 * space * spacing * clean label_predictions func * remove single item lists * space * use postprocess output in test * space * clean * clean * remove redundant threshold * remove redundant threshold * clean * rename var * move loop into func * unhardcode iou_threshold * remove unused values * clean * add note * clean * keep const * move back funcs --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-04-24 16:21:46 -04:00
Ignacio Sica	51ca19d061	set `test_tensor_cores_padded_amd` to expectedFailure (#10036 ) * init * add expected failure to correctly track progres * hotfix * skip for amd_llvm as well * add skip * add pr number * move comment to amd test * change reason	2025-04-24 17:11:40 -03:00
uuuvn	779aa1e2e9	Enable image tests on cloud if clouddev supports image (#9903 ) Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-04-24 14:30:12 -04:00
Ignacio Sica	373ca59b7f	use is_dtype_supported to check dtype support in tc tests (#10035 )	2025-04-24 14:59:14 -03:00
uuuvn	754d789f51	Fix and enable jit tests on CLOUD (#10031 )	2025-04-24 18:39:31 +03:00
George Hotz	aec75f51ef	fixup some slow CI tests [pr] (#10027 ) * fixup some slow CI tests [pr] * shrink test index	2025-04-24 09:20:49 -04:00
qazal	c990aac2b1	skip flaky test_transcribe_file1_OOB (#10026 )	2025-04-24 21:09:43 +08:00
Sieds Lykles	e75be6eafc	[bounty] [pr] index validation with z3 (#9981 ) * index validation with z3 * Change comment * toposort -> toposort() --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-04-24 08:06:08 -04:00
quortus	9e49721c47	CPUGraph support for clang (#10014 ) Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-04-24 07:52:35 -04:00
Park Jun	c3ad7b2a84	create randperm and support pytorch backend (#10019 )	2025-04-24 07:29:02 -04:00
nimlgen	1c5e353249	am: use mmio iface (#10012 ) * am: use mmio iface * linters * fixes * fixes + cleanups * mute * mypy * style	2025-04-24 00:27:04 +03:00
George Hotz	2ed3acd767	toposort is a function [pr] (#10004 )	2025-04-23 16:25:03 +01:00
uuuvn	0730ff0e50	Skip test that requires lru if device's allocator isn't lru (#10003 )	2025-04-23 16:12:56 +01:00
uuuvn	9de73ccc22	Skip test that requires python 3.12 on older versions (#10001 ) `out.cast(it.dtype.fmt).tolist()` fails with `ValueError: memoryview: destination format must be a native single character format prefixed with an optional '@'`	2025-04-23 10:09:26 -04:00
George Hotz	71ecc7fa1a	use a pattern matcher for upcast [pr] (#10000 )	2025-04-23 14:24:23 +01:00
George Hotz	cc1087d2ec	move simplify into views_to_indexed_uops (#9999 ) * move simplify into views_to_indexed_uops * cache that	2025-04-23 13:50:27 +01:00
pkotzbach	dbbd755cba	FP8s truncate (#9937 ) * truncate fp8 * fix * maybe like that? * fix linters * ruff * move from extra and add ml_types to tests * minor changes * str to dtypes and nan support --------- Co-authored-by: pkotzbach <pawkotz@gmail.com>	2025-04-22 19:12:49 -04:00
qazal	f4ec57baff	new schedule linearizer enqueues KERNEL UOps [pr] (#9993 ) * new schedule linearizer enqueues kernels [pr] * no defaultdict * diff * minor	2025-04-23 05:17:58 +08:00
George Hotz	d1f6701eb7	hotfix: lower amd threshold + improve block reorder test	2025-04-22 20:44:29 +01:00
nimlgen	db51133537	rename HWInterface -> FileIOInterface (#9989 ) * rename HWInterface -> FileIOInterface * ugh	2025-04-22 22:18:57 +03:00
George Hotz	c1539b0319	putting add first orders loads as expected (#9991 )	2025-04-22 20:12:05 +01:00
nimlgen	bd580d8ea4	hcq: use mmio interface in nv (#9986 ) * hcq: start mmio interface * allow double cast * revert * faster? * simpler, not needed more now * dd * types * fix	2025-04-22 21:58:12 +03:00
George Hotz	feee6986c9	faster block reorder (#9990 ) * faster block reorder [pr] * that shouldn't change order * key just in sorted * ind	2025-04-22 19:18:57 +01:00
qazal	6cb2d18c03	refactor schedule linearize to defaultdict [pr] (#9984 ) * refactor schedule linearize to defaultdict [pr] * skip that * don't need .get	2025-04-23 00:00:23 +08:00
chenyu	9e5e371999	make DISABLE_COMPILER_CACHE a ContextVar [pr] (#9983 )	2025-04-22 10:32:54 -04:00
qazal	bbc324f5dc	remove CAST_AFTER_EXPAND (#9980 )	2025-04-22 21:06:11 +08:00
George Hotz	c519b553db	non recursive toposort is 2x+ faster (#9979 ) * non recursive toposort is 2x+ faster * don't change the order	2025-04-22 13:59:38 +01:00
qazal	7b55846e08	prep STORE UOp creation for multi output [pr] (#9975 ) * prep STORE UOp creation for multi output [pr] * test_multioutput_ast	2025-04-22 19:34:52 +08:00
George Hotz	e358e0a0c6	move metadata set to tensor [pr] (#9976 ) * move metadata set to tensor [pr] * only track that in tensor.py	2025-04-22 12:30:35 +01:00
George Hotz	f5dc70c624	microbenchmarks + micro speed ups (#9972 ) * microbenchmarks * forgot the ubenchs * clean up type verify	2025-04-22 11:30:46 +01:00
qazal	1cf4e24ca5	fix kernelize usage with pm_gradient (#9953 ) * fix kernelize usage with pm_gradient * remove that	2025-04-22 17:26:05 +08:00
qazal	36ed3c3253	fix kernelize with VIEW children (#9961 )	2025-04-21 23:38:46 +08:00
qazal	e8910540f6	Kernelize can be called multiple times on a Tensor (#9949 ) * Kernelize can be called multiple times on a Tensor * add (failing) test_kernelize_bw	2025-04-21 06:28:47 +08:00
qazal	1d90be2cff	match kernelize API in process replay (#9948 )	2025-04-21 05:23:41 +08:00
qazal	e20ef7196a	Tensor.kernelize (#9845 ) * add kernelize * remove that * kernelize returns self * update abstractions2.py * kernelize in test_schedule * temp: assert BUFFER_VIEW's existence * ASSIGN must have a buffer or subbuffer target * assert and shrink * fix * padded setitem * var * toposort once * extra * base_buffer * end with BUFFER_VIEW * setitem for disk * test_setitem_becomes_subbuffer * mul slice test * torch backend fix 1 * non-deterministic * keep subbuffer	2025-04-20 20:53:49 +08:00
qazal	dd16087f62	fold double ASSIGN to same target (#9941 )	2025-04-20 19:06:38 +08:00
qazal	9a9aba4cd5	setitem tests (some failing) from kernelize (#9940 )	2025-04-20 18:47:55 +08:00
chenyu	6c30948df6	hand_coded_optimizations returns list[Opt] [pr] (#9938 ) new api looks like `k.apply_opts(hand_coded_optimizations(k))`	2025-04-19 20:26:59 -04:00
chenyu	720f20865b	remove required_optimizations (#9848 )	2025-04-19 16:51:16 -04:00
Ignacio Sica	023b1c28a2	`test_tensor_cores_padded` refactor (#9724 ) * set pad t 3 for amd padded tc test * change pad for amd regardless CI * test tc padded uops and correctness separately * add test_tensor_cores_padded_uops test to ci * remove redundant chack for amd device * cleanup	2025-04-18 17:05:54 -03:00

... 40 41 42 43 44 ...

5,694 commits