tinygrad/extra/gemm/asm/cdna
qazal 616e9c1483
CDNA assembly gemm in tensor.py with flag (#14310)
* work

* work

* the assembly

* remove the old one

* remove ws bufs, assert splitk

* notes cleanup

* work

* gemm args

* gemm in mixins would be nice

* add gemm gradient

* print counters

* the realize is for DEBUG=2 aesthetics

* dedup

* rewrite to python dsl, no list copies

* leave that

* add B, M, N, K to gemm name

* it's M0 not NULL

* fp16 support

* test cleanup + more gemms

* work from viz

* more work

* gemm batch_size

* xccg path work

* tiny comments on the label naming

* s_waitcnt
2026-01-31 22:34:14 +09:00
asm.py            CDNA assembly gemm in tensor.py with flag (#14310)  2026-01-31 22:34:14 +09:00
gemm.py           CDNA assembly gemm in tensor.py with flag (#14310)  2026-01-31 22:34:14 +09:00
test_asm_gemm.py  CDNA assembly gemm in tensor.py with flag (#14310)  2026-01-31 22:34:14 +09:00