Commit graph

9,569 commits

Author SHA1 Message Date
George Hotz
50be175f88 store 2025-07-22 19:28:39 -07:00
George Hotz
33d80db64c this can happen later 2025-07-22 19:12:37 -07:00
George Hotz
28b6cff043 broken 2025-07-22 19:05:11 -07:00
George Hotz
23ee2769bd gate store 2025-07-22 19:03:06 -07:00
George Hotz
5055668d1e oops, forgot that 2025-07-22 18:54:14 -07:00
George Hotz
8305a5804c just the range thing 2025-07-22 18:52:26 -07:00
George Hotz
911bb4ff44 Merge branch 'master' into endrange 2025-07-22 18:43:22 -07:00
George Hotz
53339e62f7 no gate store anymore (#11338)
* no gate store anymore

* fix up spec
2025-07-22 18:41:15 -07:00
chenyu
7a9a5cfd28 isolate test/external/external_test_am.py (#11335)
seems to be the one crashing, also remove -n=auto for that
2025-07-22 19:02:20 -04:00
George Hotz
fcbd0e4de3 assigns are no longer used [pr] (#11333) 2025-07-22 15:35:07 -07:00
George Hotz
942d69e139 store is endrange 2025-07-22 15:30:10 -07:00
George Hotz
343921f873 Revert "end the ranges in the stores"
This reverts commit c88ad9d24f.
2025-07-22 15:12:21 -07:00
George Hotz
c88ad9d24f end the ranges in the stores 2025-07-22 15:04:10 -07:00
George Hotz
60dcc9f4df insert endrange 2025-07-22 14:38:49 -07:00
George Hotz
09431d4ad1 make DEFINE_REG behave like the others (#11273)
* simpler define reg

* cast

* PTRCAT define_acc

* cleanups

* fix uops stats

* fix linearizer tests

* llvm

* define reg sets const

* define reg sets const

* no assign

* collapse that

* fix test_max_pool2d_bigger_stride_dilation

* use index, fix webgpu

* devec

* fix tests

* fix webgpu

* fix llvm

* threads for python

* fix ops_python

* only for reg

* acc_half is real now in the emulator

* fix llvm

* fix webgpu init

* fix wgpu test

* fix some tests

* fix ptx

* fix ptx bool acc

* cleanups

* broken, meh. will fix with ENDRANGE

* line count
2025-07-22 13:53:56 -07:00
chenyu
4535908679 update keccak test_long (#11331)
it should compare with arg "shake_128"
2025-07-22 16:08:01 -04:00
nimlgen
3faa352dcc am: bump version after mm changes (#11328) 2025-07-22 21:54:10 +03:00
George Hotz
affd83961c small changes from define_reg (#11327)
* small changes from define_reg

* fix webgpu
2025-07-22 11:11:48 -07:00
nimlgen
53b3d87456 am: use 4-lvl pdir (#11326) 2025-07-22 20:58:15 +03:00
chenyu
2d7c28de6a clean up dup lambdas in helper_test_exception (#11325) 2025-07-22 12:21:57 -04:00
chenyu
c6aa8e58ca fix TestDropoutProbabilityEdgeCases (#11322) 2025-07-22 11:13:56 -04:00
chenyu
fb42c84365 merge TestRollEdgeCases into test_ops (#11321) 2025-07-22 10:55:57 -04:00
chenyu
1d8b3e9d1c movementop only Tensor.roll (#11317)
* movementop only Tensor.roll

* fixed
2025-07-22 10:34:15 -04:00
chenyu
a41140241b truncate unsigned const in cstyle (#11318)
it can be a warning or a hard error in clang

PTX and PYTHON also need fix, skipping for now
2025-07-22 08:02:12 -04:00
qazal
6668d6d241 fix word_wrap with newlines in input string [pr] (#11319) 2025-07-22 12:03:13 +03:00
qazal
0c4e19f270 hotfix: disable process replay in REMOTE=1 tests (#11320)
* hotfix: disable process replay in REMOTE=1 tests

* comment
2025-07-22 10:41:58 +03:00
George Hotz
3b674df34b generic changes from define_reg_2 (#11315)
* generic changes from define_reg_2

* fix for ptx

* ugh, that one
2025-07-21 15:14:06 -07:00
chenyu
6e9506e6fd Tensor.roll supports dims=None (#11313) 2025-07-21 17:29:23 -04:00
George Hotz
108aac8af4 use AddrSpace instead of local (#11314)
* use AddrSpace instead of local

* addrspace in test
2025-07-21 14:00:06 -07:00
chenyu
d3a93185a6 clean up test_roll (#11312) 2025-07-21 16:00:50 -04:00
George Hotz
532b52fcef store has a dtype, like assign (#11309)
* store has a dtype, like assign

* fix upat

* fix test
2025-07-21 12:50:01 -07:00
geohotstan
445ff8de56 ONNX onnx_parser and buffer_parse clean up (#11000)
* start

* remove onnx.load from compile4 and move np to dropout

* clean up and enable test

* clean up

* move WebGPU ONNX test into MacOS (WebGPU)

* leave test in ONNX (CPU)

* fix raw_data init None, and simplify onnx_runner test a little?

* THESE TESTS ARE SO UGLY UGHH

* need to really think about how to structure the test

* wow LLMs are quite something

* not always on disk now

* also add external data loading test

* cleaner tests

* minimize diff and add const folding tests

* add external data loading too

* whoops add webgpu back.. but why was it not needed in the first place?

* better comment

* move webgpu test to macos(webgpu)?

* llm english so much better than me wow

* trigger CI to check flakiness

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-21 15:10:25 -04:00
George Hotz
842184a1ab rename kernelize to schedule, try 2 (#11305) 2025-07-21 11:18:36 -07:00
George Hotz
7e8f5dde74 matmul style is still reshape (#11308) 2025-07-21 11:14:57 -07:00
George Hotz
41de76a7fd put assign and store next to each other [pr] (#11306) 2025-07-21 11:07:35 -07:00
nimlgen
de2df92551 hcq: use devices instead of ids in HCQGraph (#11303)
* hcq: use devices instead of ids in HCQGraph

* fiz
2025-07-21 20:03:12 +03:00
wozeparrot
30ce16a424 feat: failing test for long keccak (#11292) 2025-07-21 12:49:23 -04:00
uuuvn
178dbf3f66 Remote scheduler changes (#11177) 2025-07-21 09:29:44 -07:00
वेदांत
e368628736 Add amin support to Tensor operations in Torch backend (#11290)
* intiger div mod fix

* Revert "intiger div mod fix"

This reverts commit d5d2f201bf.

* feat arg_min support

* tets update

* test fix
2025-07-21 09:14:08 -04:00
qazal
5eb54e2499 viz: close event streams before profiler render (#11300) 2025-07-21 15:42:31 +03:00
nimlgen
cc3c1e4c14 hcq: move cpu to hcq (#11262)
* hcq: move cpu to hcq

* import time

* upd

* fix

* windows support

* hm

* cleaner

* fix timer

* fix timing

* std is ns

* skip profiler

* mypy

* cleaner

* cleanups

* after merge

* default is back
2025-07-21 15:10:38 +03:00
nimlgen
816c01c2d4 hcq: default copy_queue_t=None (#11297) 2025-07-21 14:45:20 +03:00
qazal
6520a7fcb6 viz: factorize event stream (#11298) 2025-07-21 14:42:00 +03:00
nimlgen
9c533e5c38 hcq: cpu prereq (#11296) 2025-07-21 13:35:18 +03:00
nimlgen
e87a42e243 hcq: prepare for windows (#11293)
* hcq: prepare for windows

* comments
2025-07-21 13:08:56 +03:00
nimlgen
df3ba0a7c0 autogen: fix imports in libusb (#11294) 2025-07-21 13:04:27 +03:00
nimlgen
dd6a2d432f hcq: default timestamp metrics is ns (#11295) 2025-07-21 12:56:30 +03:00
wozeparrot
53345ef4e2 feat: make ops_disk work on block devices (#11291) 2025-07-20 14:39:50 -07:00
qazal
3002c63b1e process replay: optionally pass tinygrad import error (#11289)
* process replay: optionally pass tinygrad import error

* gate all tinygrad internals

* s/getenv/os.getenv pre import

* diff
2025-07-20 22:57:56 +03:00
chenyu
9e3a593313 minor kernel.py cleanups [pr] (#11286) 2025-07-20 10:15:31 -04:00