tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

Christopher Milan 0aabc1e938 Mesa NIR backend (NAK/LLVMpipe) (#12089)	2025-10-15 17:38:33 +08:00

* nak works
* TestOps::test_add works
* testop has no crashes
* fix bool casts
* fix typo
* add disassemble
* RANGE and locals/regs
* simplify NAKCompiler
* disass cleanup
* cleanup nir codegen
* almost all tests passing
* cleanup notes in extra/
* old notes
* only import nak if NIR=1
* fix new SPECIAL syntax
* fix local/shared memory
* more tests passing
* add DEFINE_VAR support
* llvmpipe kinda works
* diskcache
* some mypy stuff
* lvp passing test_ops.py
* fix imports
* actually fix imports
* remove 'stdout'
* fix llvm import
* fix mypy issues
* nicer errors
* simpler test_dtype skips
* test lvp in CI
* fix github action syntax
* fix more actions typos
* switch to mesa 25.1.0
* diskcache_put
* better generation for lvp nir_options
* b64encode shader blobs
* Revert diskcache changes This reverts commits `930fa3de8a` and `8428c694b3`.
* general cleanup
* better error messages
* fix llvm import
* fix windows tests
* link with libm and libgcc_s
* fix some errors
* dont check for 'float4'
* NIR uses pointer arithmetic
* use tinymesa
* bump tinymesa
* bump tinymesa again
* update lvp nir_options
* print nir shader with DEBUG
* simplify LVPCompiler
* more tests
* "gated" STORE
* NAK is cacheable
* more tests
* all tests pass locally for NAK
* test autogen in CI
* autogen deps
* more deps
* fix uop_gc
* fix macos
* mypy
* save 2 lines
* save two more lines
* save 1 line
* save 4 lines
* save more lines
* Revert "save more lines" This reverts commit `dd3a720c5a`.
* save more lines
* fix LVP on windows
* refactor
* reorganize some code
* refactor lib_gpu
* move LVP check
* out of order loads
* remove support.mesa
* bump tinymesa version
* simplify LVP jit
* macos
* macos ci
* shell: bash
* testing
* more testing
* compute brew prefix
* stupid typo
* actually fix
* lib
* stdout on macos
* inline gallivm_compile_module
* Revert "inline gallivm_compile_module" This reverts commit `b65983b151`.
* elf macos
* semicolon
* inherit from CPULLVMCompiler
* ruff
* disas test
* fix libm linking
* default is fine actually
* arm works
* add elf loader link test
* fix NAK beam
* pylint is too smart by half

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>

amdpci	am: init support for aql (#11888)	2025-08-28 18:41:46 +03:00
assembly	ops_gpu -> ops_cl (#12103)	2025-09-10 15:15:48 -04:00
backends	`var_vals` uses str for var (#12011)	2025-09-06 04:16:12 +02:00
datasets	very tiny generate_dataset (#11013)	2025-06-27 17:10:45 -04:00
disassemblers/adreno	qcom fix disasm (#6703)	2024-09-24 15:23:43 +08:00
dsp	dsp stuff / sniff ioctls from snpe (#9490)	2025-03-20 10:38:23 +08:00
gemm	remove trivial use of RANGEIFY flag (#12550)	2025-10-09 02:29:38 -04:00
hcq	ast seems to probe nv as well (#11494)	2025-08-04 11:47:07 +03:00
hcqfuzz	remove FUSE_ARANGE (#12511)	2025-10-08 04:54:07 -04:00
hip_gpu_driver	amd: support rocm7 (#12502)	2025-10-08 14:30:39 +08:00
hiprtc	use comgr to compile (#3248)	2024-01-26 18:27:49 -08:00
huggingface_onnx	move frontend dir to nn [pr] (#12470)	2025-10-07 10:42:22 +08:00
junk	coder.py can write and run code (#2439)	2023-11-25 12:27:54 -08:00
mesa	Mesa NIR backend (NAK/LLVMpipe) (#12089)	2025-10-15 17:38:33 +08:00
mmapeak	mmapeak implementation for 7900 XTX (#10417)	2025-05-23 16:26:12 -07:00
models	Stable Diffusion model init for mlperf (#12314)	2025-10-02 02:28:41 -04:00
nv_gpu_driver	auto-select available compilers (#12094)	2025-09-10 19:52:01 +03:00
optimization	ShapeTracker.real_strides -> is_expanded [pr] (#12579)	2025-10-09 22:52:45 -04:00
perfetto	upd perfetto (#11528)	2025-08-06 14:00:34 +03:00
qcom_gpu_driver	ops_gpu -> ops_cl (#12103)	2025-09-10 15:15:48 -04:00
remu	remu: add new instructions introduced in RANGEIFY (#12363)	2025-09-30 12:36:29 +03:00
resnet18	remove Tensor.no_grad, it's meaningless now [pr] (#10556)	2025-05-28 22:20:02 -07:00
sched	move fuzz_schedule.py to extra [pr] (#10444)	2025-05-21 10:07:24 +03:00
sqtt	sqtt: osx decoder installer (#12637)	2025-10-13 17:26:12 +08:00
thunder	feat: add thunderkittens (#12590)	2025-10-10 00:32:33 -07:00
tinyfs	fetch raid from cloud (#10799)	2025-10-14 07:53:55 -07:00
torch_backend	remove assign contiguous hack (#12659)	2025-10-14 16:42:14 +08:00
torch_hook	rename lazydata to uop (#10698)	2025-06-08 08:42:22 -07:00
usbgpu	amd: usb4/thunderbolt on macs (#12641)	2025-10-15 13:02:01 +08:00
webgpu	Autogen webgpu dawn, removing wgpu-py dependency (f16 support part 1) (#8646)	2025-02-07 15:16:59 +08:00
archprobe.py	ops_gpu -> ops_cl (#12103)	2025-09-10 15:15:48 -04:00
augment.py	[ready] Replacing os with pathlib (#1708)	2023-08-30 10:41:08 -07:00
bench_log.py	hotfix: BenchEvent MLPERF_RUN is mlperf_run (#10526)	2025-05-26 20:19:37 -04:00
disk_read_speed.py	io_uring for copies from disk (#5035)	2024-06-21 11:36:51 +03:00
dump_cache.py	wow how did i think that was okay (#2339)	2023-11-16 21:21:11 -08:00
export_model.py	ops_gpu -> ops_cl (#12103)	2025-09-10 15:15:48 -04:00
f16_decompress.py	u32 to f16 in tinygrad (#8074)	2024-12-06 12:00:13 +01:00
gradcheck.py	tests from grad uop path [pr] (#8313)	2024-12-18 09:25:05 -08:00
hip_events.py	move autogen to runtime/autogen (#3254)	2024-01-26 12:44:19 -08:00
hip_large_kernel.py	minimum change for rdna4 [pr] (#9455)	2025-03-16 13:39:24 +08:00
hook_cuda.py	cuda hooking (#9180)	2025-02-20 19:20:01 +08:00
introspection.py	move files into uop dir (#10399)	2025-05-18 11:38:28 -07:00
lr_scheduler.py	more beautiful cifar (#10551)	2025-05-28 20:48:20 -07:00
mcts_search.py	`var_vals` uses str for var (#12011)	2025-09-06 04:16:12 +02:00
multitensor.py	rename lazydata to uop (#10698)	2025-06-08 08:42:22 -07:00
onnx_helpers.py	move frontend dir to nn [pr] (#12470)	2025-10-07 10:42:22 +08:00
reduce_speed.py	VALIDATE_WITH_CPU [pr] (#9488)	2025-03-18 15:15:04 +08:00
replay_pkl.py	update Kernel API in tests + move optimize_local_size (#11907)	2025-08-28 15:12:47 -07:00
ring_copy.py	ring copy example (#3185)	2024-01-19 23:34:30 -05:00
setup_mock_amd_osx.sh	add rocm 6.4 support (#10491)	2025-05-23 16:20:54 -07:00
setup_mock_nv_osx.sh	hotfix: setup_mock_nv_osx	2025-02-13 12:26:15 +08:00
test_pyrender.py	test pyrender (#12005)	2025-09-04 11:48:40 -07:00
thneed.py	ops_gpu -> ops_cl (#12103)	2025-09-10 15:15:48 -04:00
threefry.py	feat: make buffer (#6745)	2024-09-25 18:31:03 +08:00
to_movement_ops.py	update torch 2.8 (#12172)	2025-09-14 15:19:03 -04:00
torch_muon.py	[bounty] Muon optim (#11414)	2025-08-13 14:27:55 -04:00
training.py	tinytqdm.set_description and tinytrange (#5101)	2024-06-22 14:45:06 -04:00
transfer_speed.py	hotfix: copy size is in bytes	2024-01-17 16:44:15 +00:00
weekly_commits_table.py	fix weekly commits table (i didn't know we linted extra)	2025-10-10 09:23:33 +08:00