tinygrad/extra
qazal 616e9c1483
CDNA assembly gemm in tensor.py with flag (#14310)
* work
* work
* the assembly
* remove the old one
* remove ws bufs, assert splitk
* notes cleanup
* work
* gemm args
* gemm in mixins would be nice
* add gemm gradient
* print counters
* the realize is for DEBUG=2 aesthetics
* dedup
* rewrite to python dsl, no list copies
* leave that
* add B, M, N, K to gemm name
* it's M0 not NULL
* fp16 support
* test cleanup + more gemms
* work from viz
* more work
* gemm batch_size
* xccg path work
* tiny comments on the label naming
* s_waitcnt
2026-01-31 22:34:14 +09:00
| Name | Last commit | Date |
| --- | --- | --- |
| amdpci | am_smi: mi350 (#14018) | 2026-01-05 13:10:56 +03:00 |
| assembly/amd | assembly/amd: test more instructions (#14365) | 2026-01-31 12:40:22 +08:00 |
| datasets | remove more stale stuff (#13765) | 2025-12-19 17:14:56 -04:00 |
| dsp | dsp stuff / sniff ioctls from snpe (#9490) | 2025-03-20 10:38:23 +08:00 |
| fp8 | train bert with fp8 (#13874) | 2026-01-09 09:21:59 -05:00 |
| gemm | CDNA assembly gemm in tensor.py with flag (#14310) | 2026-01-31 22:34:14 +09:00 |
| hcq | hcq_smi: kill mac pids (#14398) | 2026-01-28 15:00:28 +03:00 |
| hcqfuzz | feat: add repro command to summary (#10930) | 2025-11-13 08:52:27 -08:00 |
| hevc | hevc: decoder as iterator (#14091) | 2026-01-10 14:57:56 +03:00 |
| hip_gpu_driver | amd: alive wgps (#14149) | 2026-01-23 00:08:45 +03:00 |
| hiprtc | use comgr to compile (#3248) | 2024-01-26 18:27:49 -08:00 |
| huggingface_onnx | move frontend dir to nn [pr] (#12470) | 2025-10-07 10:42:22 +08:00 |
| mesa | In-tree autogen: all C libraries (#13220) | 2025-11-13 18:57:44 -08:00 |
| mmapeak | mfma loop in asm dsl (#14349) | 2026-01-27 11:11:37 +09:00 |
| models | don't allow jit input to be const (#14045) | 2026-01-06 18:15:22 -05:00 |
| nv_gpu_driver | nv: pma for 5090 (#14420) | 2026-01-29 20:06:01 +03:00 |
| nv_pma | nv: add prof props to dev (#14437) | 2026-01-30 12:51:43 +03:00 |
| optimization | fix generate_dataset.sh (#14324) | 2026-01-24 16:47:10 -05:00 |
| perfetto | upd perfetto (#11528) | 2025-08-06 14:00:34 +03:00 |
| qcom_gpu_driver | working ioctls (#14272) | 2026-01-21 20:29:04 +03:00 |
| remu | simplify mi350x gemm / viz asm tests (#13984) | 2026-01-03 11:11:07 +09:00 |
| sqtt | viz: replace llvm disasm with our disasm (#14325) | 2026-01-25 13:56:56 +09:00 |
| thunder | fa: 32 block size (#14416) | 2026-01-29 13:59:13 -08:00 |
| tinyfs | tinyfs tweaks (#13444) | 2025-11-24 18:07:32 -08:00 |
| torch_backend | update torch backend function (#14333) | 2026-01-25 16:39:34 -05:00 |
| torch_hook | rename lazydata to uop (#10698) | 2025-06-08 08:42:22 -07:00 |
| usbgpu | USBGPU: debug script for comma chestnut (#14252) | 2026-01-20 18:52:25 +03:00 |
| viz | fix bufferize cost function for multi, improve VIZ=-1 cli (#14394) | 2026-01-28 15:53:18 +09:00 |
| webgpu | Autogen webgpu dawn, removing wgpu-py dependency (f16 support part 1) (#8646) | 2025-02-07 15:16:59 +08:00 |
| archprobe.py | ops_gpu -> ops_cl (#12103) | 2025-09-10 15:15:48 -04:00 |
| bench_log.py | hotfix: BenchEvent MLPERF_RUN is mlperf_run (#10526) | 2025-05-26 20:19:37 -04:00 |
| cl_android.sh | source extra/cl_android.sh to fix opencl on android | 2025-10-26 15:27:51 +08:00 |
| export_model.py | no core_id (#14265) | 2026-01-23 21:30:12 +03:00 |
| f16_decompress.py | u32 to f16 in tinygrad (#8074) | 2024-12-06 12:00:13 +01:00 |
| gradcheck.py | tests from grad uop path [pr] (#8313) | 2024-12-18 09:25:05 -08:00 |
| hip_large_kernel.py | minimum change for rdna4 [pr] (#9455) | 2025-03-16 13:39:24 +08:00 |
| hook_cuda.py | cuda hooking (#9180) | 2025-02-20 19:20:01 +08:00 |
| introspection.py | move files into uop dir (#10399) | 2025-05-18 11:38:28 -07:00 |
| lr_scheduler.py | more beautiful cifar (#10551) | 2025-05-28 20:48:20 -07:00 |
| multitensor.py | rename lazydata to uop (#10698) | 2025-06-08 08:42:22 -07:00 |
| nvJitLink.h | In-tree autogen: all C libraries (#13220) | 2025-11-13 18:57:44 -08:00 |
| onnx_helpers.py | onnx helper intermediate node output validation (#12740) | 2025-10-16 11:17:47 -04:00 |
| setup_mock_amd_osx.sh | add rocm 6.4 support (#10491) | 2025-05-23 16:20:54 -07:00 |
| setup_mock_nv_osx.sh | hotfix: setup_mock_nv_osx | 2025-02-13 12:26:15 +08:00 |
| test_mi350.sh | amd fp8 llvm (#13186) | 2025-11-20 12:35:57 -05:00 |
| thneed.py | ops_gpu -> ops_cl (#12103) | 2025-09-10 15:15:48 -04:00 |
| training.py | tinytqdm.set_description and tinytrange (#5101) | 2024-06-22 14:45:06 -04:00 |
| weekly_commits_table.py | add chrism | 2025-12-14 00:45:57 -05:00 |