968 commits

Author SHA1 Message Date
George Hotz
2611907afb
start ripping out old scheduler -- no maps (#14909)
* start ripping out old scheduler -- no maps

* no more metadata
2026-02-20 21:05:04 +08:00
George Hotz
fc5677c28b
resnet dataloader + more test cleanups (#14899)
* resnet dataloader

* tests
2026-02-20 10:05:47 +08:00
George Hotz
f081f154ae
parameterize the CDNA asm gemm (#14813)
* parameterize the CDNA asm gemm

* fix llama test

* fix

* add more gemm tests

* confirm all match

* test these asm gemms
2026-02-17 11:35:18 +08:00
George Hotz
bc3487d607
VIZ display cleanups (#14811)
* exclude reshape/expand broadcasts from viz

* limit src lines
2026-02-17 10:03:08 +08:00
qazal
9da7f5e733
disable process replay for AMD emulator renderer [pr] (#14766)
* disable process replay for AMD emulator renderer [pr]

* line

* skip
2026-02-15 18:52:37 +09:00
nimlgen
3bee6638e3
external_test_hive_reset (#14729)
* external_test_hive_reset

* add fault
2026-02-13 19:08:36 +03:00
George Hotz
4680247e35
renderer/amd: move in tree (#14702)
* renderer/amd: move in tree

* fix paths in tests

* 24000 lines

* no delete for amd files
2026-02-12 18:09:16 +08:00
George Hotz
befc1e800c
assembly/amd: disasm is test only (#14694)
* assembly/amd: disasm is test only

* viz uses str
2026-02-12 12:33:46 +08:00
George Hotz
c331798201
move tests to test/backend (#14691)
* move tests to test/backend

* fix imports

* fix CI

* revert that one

* Fix formatting in README for test command
2026-02-12 11:09:44 +08:00
George Hotz
4565958792
some lil speedups (#14679)
2026-02-11 10:01:58 +08:00
George Hotz
2d4ad9e739
add a waitlist for graph rewrite (#14678)
* add a waitlist for graph rewrite

* cleaner

* one context on spec check
2026-02-11 09:30:13 +08:00
chenyu
884592f6c8
pin z3-solver version (#14605)
found exact input that crashes z3 4.15.4
2026-02-06 22:49:31 -05:00
George Hotz
7a2a3b5c71
Remove Ops.KERNEL, it's all Ops.CALL now (#14603)
2026-02-07 10:21:54 +08:00
chenyu
b9fe8b7591
fix opt in process replay [pr] (#14599)
2026-02-06 16:49:56 -05:00
chenyu
197ebcbbbc
log seed with flush=True in fuzz_symbolic (#14597)
* log seed with flush=True in fuzz_symbolic

i think z3 can crash. added reading seed from argv to see if we repro later

* fuzz_symbolic_symbolic_div
2026-02-06 15:03:57 -05:00
chenyu
d57d24c7d4
Buffer.as_buffer -> Buffer.as_memoryview [pr] (#14535)
it casts to memoryview. also inline the as_typed_buffer checks to Tensor._data
2026-02-04 11:31:11 -05:00
nimlgen
2f55005ad9
qcom: sync cpu cache when from_blob (#14518)
* um

* fx

* d

* x

* x

* x

* x

* f

* ren
2026-02-03 21:51:03 +03:00
George Hotz
dd2de4f838
rename all DEFINE_GLOBAL to PARAM (#14511)
2026-02-03 15:09:38 +08:00
chenyu
66d2b02f11
delete files that depend on extra.optimization.helpers (#14499)
2026-02-02 13:33:33 -05:00
George Hotz
ec0398fceb
test amd gpu crashes (#14459)
* test amd gpu crashes

* cleanup

* less sketch tests
2026-02-02 18:57:47 +03:00
nimlgen
230d08ec70
test for am recovery and faults handling (#14421)
* test for am recovery and faults handling

* linter
2026-01-29 17:11:24 +03:00
George Hotz
88bc5ee212
assembly/amd: rename to better names (#14384)
* assembly/amd: rename to better names

* might help fuzzing segfault

* emu2 -> emu
2026-01-28 10:00:54 +08:00
George Hotz
984cdc4840
add wrapper class for the -0.0 != 0.0 issue (#14339)
* add wrapper class for the -0.0 != 0.0 issue

* fixes

* spec fix

* missed one
2026-01-26 16:52:37 +08:00
nimlgen
26220a472e
no core_id (#14265)
* no core_id

* kwargs

* est

* linters

* ugh

* revert this

* deps

* glb

* should work?

* nn

* line

* fx

* ym

* z

* d

* um?

* revert

* this one?

* first half

* um p2

* all?

* um

* cleaner

* um
2026-01-23 21:30:12 +03:00
chenyu
073c6a81b5
raise if Tensor._buffer is called during jit (#14114)
* raise if Tensor._buffer is called during jit

* cleaner
2026-01-22 17:30:18 -05:00
chenyu
574d171fa6
fix onnx Pad constant_value=None (#14271)
also removed a dead branch in _resolve_pool_pads
2026-01-21 11:51:34 -05:00
chenyu
9ea63d7d52
failed test case for onnx IF with jit (#14235)
silently fails now since onnx treats IF cond as a const
2026-01-19 18:10:05 -05:00
chenyu
5e6a72c33f
new Onnx Gather (#14187)
instead of assuming const indices, check if it showed as a const
2026-01-16 22:24:07 -05:00
chenyu
ab244c7f81
onnx Gather should not assume indices to be const (#14185)
* onnx Gather should not assume indices to be const

added a failed test case

* just list
2026-01-16 20:55:00 -05:00
chenyu
2a2c1eacf6
disable fast_idiv on metal (#14137)
there's a metal compiler bug which was the root cause that keccak needs a contiguous hack
2026-01-13 21:40:40 -05:00
chenyu
cad7feec02
more onnx ops (#14104)
HannWindow, HammingWindow, BlackmanWindow, Hardmax, LpNormalization
2026-01-12 09:11:13 -05:00
chenyu
9973a81356
add channels_last to QLinearGlobalAveragePool (#14094)
and other minor cleanups
2026-01-10 18:38:19 -05:00
chenyu
83063cc3e4
onnx TensorScatter (#14024)
2026-01-05 09:05:22 -05:00
chenyu
9497ec00f2
fix onnx attention permute (#14025)
* fix onnx attention permute

* skip test_attention_4d_fp16_cpu too
2026-01-05 08:58:50 -05:00
chenyu
7a81a3cb98
more passed onnx tests (#14022)
2026-01-05 07:46:27 -05:00
chenyu
aae08b20e0
enable passed onnx tests (#14017)
2026-01-04 22:12:50 -05:00
chenyu
f6a78a29e0
support einsum trace (#14012)
* support einsum trace

* test_einsum_scalar_cpu
2026-01-04 19:27:27 -05:00
qazal
bdb421f13e
process_replay: passthrough sink arg for Ops.PROGRAM input (#14000)
2026-01-04 13:09:39 +09:00
chenyu
51398edf9c
fix indirect import (#13958)
also deleted old external tests
2026-01-01 14:22:45 -05:00
nimlgen
25440f0f72
all2all (#13902)
* all2all

* um

* fix

* x

* um

* simpler

* mypy

* fix

* t

* cmnts
2025-12-31 16:38:32 +03:00
George Hotz
43c6e973d8
add optional compiler in Renderer (#13817)
* add optional compiler in Renderer [pr]

* fix

* late init

* remove precompiled

* cleanup
2025-12-23 17:58:46 -05:00
nimlgen
90b217896f
am: xgmi p2p (#13811)
* system: use addr space

* am: xgmi

* fix

* ugh
2025-12-23 20:11:38 +03:00
George Hotz
8dcba2e2cc
no full_rewrite [pr] (#13809)
* no full_rewrite [pr]

* fix

* fix docs
2025-12-22 23:20:01 -05:00
chenyu
7f1d41c9f9
delete files that import ShapeTracker (#13805)
2025-12-22 15:54:18 -05:00
George Hotz
45c459848d
remove more stale stuff (#13765)
* remove more stale stuff

* remove disassemblers/adreno

* stale
2025-12-19 17:14:56 -04:00
George Hotz
744af193f0
remove ScheduleItem and merge it with ExecItem (#13759)
* remove ExecItem and merge it with ScheduleItem

* less diff

* fix issues

* min diff

* don't change bufs in _lower

* min diff

* update

* revert

* fixes

* diff
2025-12-19 17:04:24 -04:00
George Hotz
3dbde178c1
mark slow tests as slow instead of as CI (#13736)
* mark slow tests as slow instead of as CI

* CI shouldn't have different behavior

* more skips / CI

* slow
2025-12-17 10:29:57 -04:00
George Hotz
4b741e893f
remove REMOTE=1 (#13722)
* remove REMOTE=1

* leave ibverbs
2025-12-16 15:58:10 -04:00
nimlgen
e36385e570
am: support xgmi systems (#13659)
* am: support xgmi systems

* fake_am
2025-12-12 18:55:45 +03:00
Douglas Nyberg
947c6eefc3
add Swish op (#13541)
* add Swish ONNX operator

* add Swish regression test

* remove trailing whitespace

* upgrade ONNX to 1.20, add excludes for unimplemented ops

* upgrade ONNX to 1.19, add Swish op

* upgrade ONNX to 1.19, TensorFlow to 2.18, add Swish op

* exclude attention_3d and attention_4d_gqa tests

* exclude attention fp16 tests

* exclude all attention tests

* retrigger CI

* retrigger CI - worker crash
2025-12-08 12:41:18 -05:00