tinygrad/extra/llama_kernels
qazal f998b9930a
fp8 gemm inv_scale in epilogue (#16625)
* fuse scale

* remove python inv_scale

* more inv_scale removal

* more cleanups

* cleaner

* diff polish

* work

* rename

* simpler

* simpler

* compute

* c

* Revert "c"

This reverts commit 8941fec7ca.

* Revert "compute"

This reverts commit 9db573a6d3.

* Revert "simpler"

This reverts commit 910ad33f87.

* Revert "simpler"

This reverts commit bf75d235a1.

* s_g

* update types

* less diff noise

* remove
2026-06-15 18:44:41 +09:00
Name                            Last commit                                                              Date
cast_amax                       fp8 gemm inv_scale in epilogue (#16625)                                  2026-06-15 18:44:41 +09:00
fp8_transpose                   llama speed 6 (#16071)                                                   2026-05-06 20:51:03 -07:00
fused_ce                        llama: no E_ copy after bf16 GEMM (#16458)                               2026-06-02 14:14:13 +09:00
fused_rmsnorm_mul_quantize_fp8  fp8 gemm inv_scale in epilogue (#16625)                                  2026-06-15 18:44:41 +09:00
quantize_fp8_delayed            quantize_fp8 kernels in uops (#16288)                                    2026-05-22 20:54:06 +09:00
rmsnorm                         llama: move llama kernels to llama_kernels (#15952)                      2026-04-27 22:48:53 -07:00
__init__.py                     llama: update local amax implementation after ParamArgs change (#16446)  2026-05-30 16:55:43 +09:00