8 commits

Author SHA1 Message Date
George Hotz
81ef879da3
non recursive top_down_rewrite (#10729)
* non recursive top_down_rewrite

* nicer algorithm

* rewrite bottom up also

* only top down is broken?

* simpler iterative algo

* no recursion errors

* top down and bottom up

* unified rewrite

* simpler rewrite

* clean up comments

* move that comment
2025-06-09 16:33:04 -07:00
George Hotz
eaceafecae
do fusion locally (#10095)
* do fusion locally

* oops, that's the right way

* explicit delete closure
2025-04-28 20:45:37 -04:00
George Hotz
dd52951dd0
fix single kernel softmax with cast (#9842)
* fix single kernel softmax with cast

* tolerate none

* 3e-4

* skip on dtype
2025-04-11 12:12:02 +08:00
chenyu
7fa5f29582
add test_embedding to test_softmax_fusion (#9832)
2025-04-10 08:25:34 -04:00
George Hotz
53f0b2aad7
fix infinite loop in flash attention (#9827)
* fix infinite loop in flash attention

* get_contraction_with_reduce

* skip that test

* SINGLE_KERNEL_SOFTMAX + fix multi

* default IGNORE_OOB

* print change
2025-04-10 20:06:44 +08:00
George Hotz
fce432d2e3
Ops.FUSE makes softmax a single kernel (#9808)
* KERNELIZE makes softmax a single kernel

* single kernel works

* softmax works

* broken

* correct

* skip that test

* kernelize tests

* rename to fuse

* better reduce_push_add_ones code

* correct now

* cleanups

* oops

* return None if we can't push ones

* rename + docs

* atol fixes group

* flash attention broken test
2025-04-09 22:56:28 +08:00
chenyu
7a28133b37
failed test for single softmax backward (#9778)
getting RecursionError with DONT_GROUP_REDUCES=1
2025-04-08 02:36:32 -04:00
George Hotz
fefee5d3ab
single kernel softmax (#9776)
* real single kernel softmax

* cleanup

* fix blockend insertion

* add to bert test
2025-04-08 12:35:48 +08:00