Commit graph

13,577 commits

Author SHA1 Message Date
George Hotz
785f09aac4
lil cleanup 2026-06-11 17:44:22 -07:00
George Hotz
32ad5e8b96
port NIR to new_style (fable) 2026-06-11 17:34:01 -07:00
Christopher Milan
4d893f626a
move a bunch of test_schedule to null (#16578) 2026-06-11 20:26:34 -04:00
George Hotz
b57639a6cc
port python to new_style (fable) (#16579)
* port python to new_style (fable)

* doesn't have to be const in python
2026-06-11 17:26:05 -07:00
George Hotz
a04d2fa4eb
port ptx to new_style (fable) (#16577)
* port ptx to new_style (fable)

* simplify

* simpler
2026-06-11 17:05:03 -07:00
George Hotz
587333fddb
replace DEFINE_VAR with PARAM (#16576)
* replace DEFINE_VAR with PARAM

* cleanups

* cleanups
2026-06-11 15:03:20 -07:00
chenyu
5f1e2d3900
PADTO pads Invalids (#16562) 2026-06-11 16:54:26 -04:00
George Hotz
434a8ffc38
move llvm to new style (#16573)
* move llvm to new style

* fix wmma

* buffer is early
2026-06-11 12:59:02 -07:00
George Hotz
347608a523
put loads back on reg (#16572)
* put loads back on reg

* fix dsp
2026-06-11 11:24:50 -07:00
nimlgen
e5f498de3b
hcq2: debug=2 info (#16569)
* hcq2: debug=2 info

* t

* x

* hcq2: debug=2 info

* x
2026-06-11 19:52:01 +03:00
qazal
a83710396c
support mselect input to CALL, less kernels in allreduce (#16567)
* support mselect input to CALL, less kernels in allreduce

* resolve mstack
2026-06-11 18:10:47 +09:00
qazal
7d4a77dce4
relax comma benchmark timeout (#16568) 2026-06-11 18:03:37 +09:00
qazal
21f1101691
add allreduce kernel count test (#16566) 2026-06-11 15:54:12 +09:00
wozeparrot
c38d6a7e3a
mxfp8 part 2 (#16561) 2026-06-10 23:36:11 -07:00
Christopher Milan
83971860d8
ci: simplify webgpu install (#16557) 2026-06-10 22:57:19 -04:00
Christopher Milan
6e1b61f16f
cleanup some amd deps (#16563)
don't load hsa runtime, remove ib autogen
2026-06-10 19:01:56 -04:00
George Hotz
7e6d617935
addrspace cleanups (#16565)
* addrspace cleanups

* bumps

* eh, relax a little
2026-06-10 15:57:18 -07:00
nimlgen
2c9d2c0d31
jit: memplan before compile (#16560) 2026-06-10 15:05:15 +03:00
qazal
34481830f1
rangeify: fix cost function for AFTER(out, CALL) (#16559)
* simple failing test

* fix rangeify cost function

* new ops count
2026-06-10 17:30:50 +09:00
chenyu
623b66e0e4
more tensor and mixin cleanups [PR] (#16558) 2026-06-10 00:39:33 -04:00
chenyu
7366d32247
getitem cleanups [PR] (#16556) 2026-06-09 22:48:58 -04:00
George Hotz
fd76ac992e
cstyle renderer is new style [pr] (#16484)
* cstyle new style

* switch cstyle renderer to new style

* fix hip

* fixes

* fix webgpu

* correct webgpu is_packed

* fix dsp

* fixes

* fix Ops.RANGE must be CONST

* old style render access

* this is correct

* fix cstyle to good

* dl/dr

* as array

* fix spec

* remove define_local/define_reg

* buffer in shrink

* fix test_tiny

* all tests fix

* param args aren't realized

* wgsl fix

* work

* new gate

* fix opencl qcom

* process replay

* sort order

* fix render index
2026-06-09 18:36:01 -07:00
Christopher Milan
97d483350c
ci: download prebuilt ocelot (#16554) 2026-06-09 19:51:33 -04:00
Christopher Milan
f9d88d3c3a
fix race in test_quantize_onnx (#16555) 2026-06-09 18:39:48 -04:00
wozeparrot
2bdc360606
gemm: mxfp8 hipkittens gemm (#16541)
* gemm: mxfp8 hipkittens gemm

* feat: update hipkittens

* feat: kernel signature

* clean: just kernel

* feat: from tinygrad

* feat: test

* fix: add back utils

* clean: no diff

* clean: no diff
2026-06-09 15:20:05 -07:00
chenyu
12addee14f
tesnor and mixin cleanups [PR] (#16553) 2026-06-09 15:33:13 -04:00
nimlgen
2ab2d51099
hcq2: fix repeated calls (#16552) 2026-06-09 19:11:42 +03:00
chenyu
3f053a3370
move functional part of rand to RandMixin (#16551) 2026-06-09 09:40:48 -04:00
nimlgen
fa31c744b9
hcq2: cleaner (#16550) 2026-06-09 16:33:05 +03:00
qazal
598cc13ad2
more readable null graph profile in VIZ (#16548)
* more readable null graph profile in VIZ

* change

* fix flaky test
2026-06-09 18:35:05 +09:00
qazal
d18ad49f20
fix flaky test_disktensor (#16549) 2026-06-09 18:23:22 +09:00
qazal
fa400f9790
less E kernels in all2all (#16546) 2026-06-09 13:51:57 +09:00
qazal
b8931440ae
add all2all schedule test (#16545) 2026-06-09 12:41:35 +09:00
wozeparrot
5ef30005fa
update hipkittens (#16544) 2026-06-08 18:53:25 -07:00
Christopher Milan
4e2e2e9956
ocelot: use c.DLL (#16540) 2026-06-08 21:27:28 -04:00
chenyu
11fee53527
RandMixin [PR] (#16543) 2026-06-08 19:11:28 -04:00
chenyu
e2ef5cf5c9
no args and kwargs for _multi_like [PR] (#16539) 2026-06-08 17:35:15 -04:00
chenyu
12764161c9
UOp.shard support axis=None [PR] (#16538)
match Tensor
2026-06-08 11:36:50 -04:00
chenyu
ebc5390c9a
advance indexing to mixin [PR] (#16532) 2026-06-08 09:24:49 -04:00
nimlgen
95d63d6c07
hcq2: lower to ins (#16535)
* hcq2: lower to ins

* pm4

* f
2026-06-08 16:15:30 +03:00
nimlgen
8baca185d5
hcq2: add kfd (#16537) 2026-06-08 13:48:27 +03:00
chenyu
03943cd1a0
use more _uop for cleanup [PR] (#16531)
`t.uop if isinstance(t, Tensor) else t` -> `t._uop`
2026-06-07 17:41:36 -04:00
chenyu
937aeaec60
remove device= from UPat.const [PR] (#16530) 2026-06-07 16:38:43 -04:00
George Hotz
eb1238436a
more prereqs for DL/DR -> BUFFER (#16529) 2026-06-07 12:25:11 -07:00
George Hotz
0336ba8eb1
buffer param arg + dsp fixups (#16528) 2026-06-07 12:07:00 -07:00
Dmitriy Strunin
75e903d533
remove unused device arg from _get_winograd_matcols (#16527) 2026-06-07 08:15:09 -04:00
chenyu
90b556ca48
move gradient to mixin [PR] (#16526) 2026-06-07 00:05:02 -04:00
chenyu
4e7c6260b0
clean up test_tesnor_uop_mixin (#16525)
most of those don't have UNIQUE anymore
2026-06-06 23:25:44 -04:00
George Hotz
2a2f81dd3d
remove ANON from addrspace, refactor marg (#16523)
* remove ANON from addrspace, refactor marg

* as_shape

* as_shape is cached
2026-06-06 09:49:09 -07:00
qazal
e69b4189b0
viz: hide STACK on PARAM by default (#16522) 2026-06-06 16:41:15 +09:00