Commit graph

13,438 commits

Author SHA1 Message Date
George Hotz
4218cc9257 fix spec 2026-05-27 17:35:46 -07:00
George Hotz
17419edc4a fix slice store to remove the index 2026-05-27 17:21:49 -07:00
qazal
88e88d63d6
viz: click on +- toggles sources (#16409) 2026-05-28 09:12:43 +09:00
George Hotz
b21afb4883
marg line cleanup (#16408)
* marg line cleanup

* bitcast is a mop
2026-05-27 16:41:04 -07:00
wozeparrot
dac3743d75
llama: delayed scaling in optim (#16407) 2026-05-27 15:40:03 -07:00
George Hotz
8ee3a37524
shrink/pad use (new_shape, offset) (#16405)
* shrink uses offset and shape

* pad does too

* fix
2026-05-27 15:13:08 -07:00
Christopher Milan
171401e8df
skip modulo by zero in test_dtype_alu (#16404) 2026-05-27 17:09:05 -04:00
qazal
452c7d4230
llama: don't allocate grad_xw13 in bf16 (#16359) 2026-05-28 04:33:07 +09:00
nimlgen
0c385e31c6
hcq2 rewrite (#16375)
* hcq2 rewrite

* fi

* x

* simpler
2026-05-27 22:25:35 +03:00
chenyu
c33b767407
bring back test and torch backend change for unique const (#16403) 2026-05-27 15:16:08 -04:00
Christopher Milan
bacabf0866
webgpu: fix enums (#16402) 2026-05-27 13:09:50 -04:00
chenyu
6da785562b
test_custom_kernel_precompile_multidevice (#16401)
add a test to show what invalids need
2026-05-27 11:19:16 -04:00
chenyu
3e80f375ee
skip test_setitem_fancy_on_unrealized_view (#16400)
crashes in linux llvm ci
2026-05-27 09:50:26 -04:00
chenyu
945ed4f689
revert const unique changes (#16395) 2026-05-27 00:06:41 -04:00
Christopher Milan
aacc8addf4
ci: use ubuntu 24.04 (#16393) 2026-05-26 23:22:01 -04:00
chenyu
fa14cde05c
test update for arange and eye (#16394)
these will need explicit clone to make a buffer
2026-05-26 22:48:34 -04:00
wozeparrot
3a7a6da7d5
llama: fakedata uses real vocab size (#16389) 2026-05-26 18:58:55 -07:00
George Hotz
156a4438d9
rename BUFFER_VIEW to SLICE (#16391)
* rename BUFFER_VIEW to SLICE

* fix comments
2026-05-26 18:15:00 -07:00
Christopher Milan
3adf7f5d95
disable flaky cl test (#16388) 2026-05-26 19:56:57 -04:00
Christopher Milan
d23659d38b
cleanup some old test skips (#16384) 2026-05-26 19:07:22 -04:00
George Hotz
fd963038a0
remove allow_any_len from store (#16385)
* remove allow_any_len from store

* a few more

* no bv there

* more fixes

* fixes

* oh that
2026-05-26 15:26:53 -07:00
chenyu
0b88827482
remove CONST(UNIQUE) (#16383) 2026-05-26 14:45:22 -04:00
chenyu
d861c50dce
remove unique_const (#16382) 2026-05-26 13:53:31 -04:00
George Hotz
bac82d4949
fix emu bug in gfx950 (#16381)
* fix emu bug in gfx950

* fix renderer
2026-05-26 10:32:03 -07:00
chenyu
9b00defc8c
Revert "remove unique_const (#16372)" (#16380)
This reverts commit 09019d6761.
2026-05-26 12:30:07 -04:00
chenyu
09019d6761
remove unique_const (#16372)
* remove unique_const

* fix SDWA thing

* that?
2026-05-26 12:18:03 -04:00
George Hotz
7f1b02854e
bufferview offset is units of input dtype (#16378) 2026-05-26 08:49:31 -07:00
qazal
846a809af7
viz: add +- toggle for hidden UOps (#16368)
* first

* remove

* move src toggles to client side

* line

* update viz server tests

* remove those

* logic

* cleanup

* call matches

* fix const arg

* add labels

* keep changes

* the stack on movement ops hiding change

* structure

* rename to expandedNodes

* work

* test intention
2026-05-26 22:31:54 +09:00
nimlgen
032905dec9
hcq2: simpler (#16361) 2026-05-26 14:28:48 +03:00
George Hotz
322693dcd3 hotfix: bump Mac pytest timeout to 4 minutes (try 2) 2026-05-25 18:23:21 -07:00
George Hotz
41ee7dab1c
script to generate testsig for DSP (#16371)
* script to generate testsig for DSP

* cleanups
2026-05-25 17:54:58 -07:00
wozeparrot
76fc39ccc0
gather to single device (#16354) 2026-05-25 17:27:08 -07:00
George Hotz
942cb42b97 Revert "hotfix: bump Mac pytest timeout to 4 minutes"
This reverts commit 695a0069ed.
2026-05-25 17:25:11 -07:00
Christopher Milan
8ddd1328df
remove getenv(CI) (#16365)
gone everywhere except test_interop, because torch MPS does not work in actions
2026-05-25 20:23:33 -04:00
George Hotz
695a0069ed hotfix: bump Mac pytest timeout to 4 minutes 2026-05-25 17:20:19 -07:00
George Hotz
689ab6a49f
move buffer view offset to src (#16364)
* this work?

* failed
2026-05-25 17:07:55 -07:00
Christopher Milan
d8f86be613
webgpu: shader-f16 support in arch (#16370) 2026-05-25 19:20:59 -04:00
qazal
4bcc53eb26
viz: stable node position for +- toggle (#16367) 2026-05-26 06:30:47 +09:00
qazal
3506eb08ec
viz: sidebar toggles always recenter (#16366)
* viz: sidebar toggles always recenters

* python brain
2026-05-26 06:14:32 +09:00
chenyu
cdeb861828
invalids is empty [pr] (#16353) 2026-05-25 16:11:38 -04:00
qazal
b73d2d17b9
viz/cli: add --interval (#16363)
* interval support

* add test_interval

* llama uses interval
2026-05-26 03:35:06 +09:00
C T
2ab90f31b1
use windows-specific alias nvcuda when loading cuda on windows (#16260)
This also makes it possible to use cuda on windows by specifying 3 env
vars with direct dll paths: NVCUDA_PATH, NVRTC_PATH and NVJITLINK_PATH
without name collision with CUDA_PATH which is used for cuda headers
include path in NVRTCCompiler.
2026-05-25 08:50:50 -07:00
wozeparrot
68d2102fd2
llama: offload master weights (#16355) 2026-05-25 08:48:13 -07:00
qazal
eecd4706ff
fix mailbox comment, add types (#16360) 2026-05-25 22:24:00 +09:00
nimlgen
64095cf2e2
use get_buf in exec_kernel (#16356) 2026-05-25 15:13:40 +03:00
chenyu
5d5e02871f
remove Tensor.from_uop (#16344)
and no device for const in Tensor init
2026-05-24 18:53:09 -04:00
nimlgen
a891727c9f
hcq2: multi (#16347)
* hcq2: multi

* cleaner a bit
2026-05-24 19:28:33 +03:00
chenyu
926d125a63
update test_stack (#16345)
also skip COMPILE_ONLY, it was comparing 0==0
2026-05-23 10:42:35 -04:00
chenyu
149a87dac2
deviceless const cleanups (#16341) 2026-05-22 20:11:01 -04:00
Christopher Milan
35461d4d8f
ci: cleanup some deps [pr] (#16340) 2026-05-22 19:16:08 -04:00