Commit graph

250 commits

Author SHA1 Message Date
George Hotz
bf05a2762e
Merge branch 'master' into image_no_vec 2026-05-08 16:32:08 -07:00
George Hotz
f68c224b71
don't use vec(2) for image index 2026-05-08 10:52:24 -07:00
chenyu
235044c9d8
Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD (#16093)
* Ops.IDIV -> Ops.CDIV, Ops.MOD -> Ops.CMOD

* ruff
2026-05-07 23:18:15 -04:00
wozeparrot
d11f4d0ec2
fix: don't copy on slice of DP weight (#16089) 2026-05-07 17:58:01 -07:00
George Hotz
b796bbae87
fix valid in indexing tests (#16087) 2026-05-07 14:11:28 -07:00
chenyu
072db9924c
div to mixin (#16078)
also deleted idiv method
2026-05-07 12:52:37 -04:00
chenyu
516b00e286
mod and fmod to mixin (#16077) 2026-05-07 12:13:39 -04:00
chenyu
ef085304bc
stronger divmod_recombine (#16066) 2026-05-06 15:41:54 -04:00
chenyu
af4140f3be
fix divmod recombine for floordiv (#16062) 2026-05-06 14:22:42 -04:00
chenyu
c6ad3d3ac2
better divmod late rewrite (#16061)
better order
2026-05-06 11:31:48 -04:00
chenyu
aaabe42373
relax fold_divmod_general (#16058) 2026-05-05 21:37:56 -04:00
chenyu
869eae6b37
fix double div rewrites (#16054) 2026-05-05 19:34:35 -04:00
qazal
795501e1da
fix device in null graph events (#16053)
* failing test

* fix compute

* fix sdma
2026-05-06 07:44:08 +09:00
chenyu
34fe37d64e
use FLOORDIV and FLOORMOD (#16048)
* use FLOORDIV and FLOORMOD

also removed CORRECT_DIVMOD_FOLDING

* fix

* Revert "fix"

This reverts commit 86af33b88ef31943c61e67189b072eca4896409a.

* fix

* fix
2026-05-05 18:32:54 -04:00
chenyu
9c37a0c75d
Ops.FLOORDIV and Ops.FLOORMOD (#16038)
* Ops.FLOORDIV and Ops.FLOORMOD

lowered into IDIV and MOD in get_late_rewrite_patterns

* still need this

* exclude

* like that?
2026-05-05 11:42:14 -04:00
Christopher Milan
8e99c4f097
fetch checks sha256 (#16037) 2026-05-04 16:08:38 -04:00
George Hotz
1884f67a39
simplify full_rewrite_to_sink spec (#16035)
* simplify full_rewrite_to_sink spec

* test cleanups
2026-05-04 11:44:13 -07:00
qazal
b1d88ebf02
viz/cli: aggregate flops in -t (#16031)
* 38

* plumbing

* more flops

* flop/s and bytes/s

* arithmetic mean

* tests

* harmonic mean

* range

* better

* simplify

* fix prints

* no string parsing needed
2026-05-04 17:35:02 +03:00
qazal
c02e390c2b
viz: encode flops, mem and metadata in json (#16032)
* gate print

* update everywhere to check path

* server encodes json

* ui changes

* cli changes

* tests never need regex

* no str replace

* update test_pipes

* remove that
2026-05-04 23:06:18 +09:00
qazal
9684334dfe
viz: fix flops in graph, add null graph tracing (#16024)
* min repro, todos

* null graph tracing

* work

* work

* work

* only test_flops

* exec points back

* first

* better

* integral timestamps maybe

* cleanup

* simpler, update NULL to use SDMA naming

* integration test

* sdma
2026-05-03 22:32:44 +09:00
qazal
7daf4b7d52
viz: split cli test (#16015)
* viz: split cli test

* arg3 is msg
2026-05-03 01:47:11 +09:00
George Hotz
0f7e296f5b
fix some indexing edge cases (#15988) 2026-04-30 08:05:30 -07:00
qazal
55915584e5
viz: fix cfg for emulated amd on the null device (#15976)
* simple failing when i test it end to end

* pass

* linter

* assemble
2026-04-30 05:18:09 +09:00
qazal
a37b605523
remove arch from asm kernel class (#15977)
* rm arch from kernel

* update other tests

* update abstractions4.py
2026-04-30 03:39:52 +09:00
chenyu
654e611a29
_bits_to_rand to mixin (#15972) 2026-04-29 13:47:25 -04:00
George Hotz
5f441ecffc
unify reduce + reduce_axis (#15973)
* unify reduce + reduce_axis

* fix all tests

* lil cleanups
2026-04-29 10:29:56 -07:00
nimlgen
7787f76dcc
get_runner -> get_runtime (#15967)
* get_runner -> get_runtime

* do not use get_runner

* fix

* remove get_tunner

* remove

* fix

* x
2026-04-29 18:29:49 +03:00
chenyu
fb188c3c23
UOp.bitcast noop early return (#15968)
matches Tensor
2026-04-29 09:41:40 -04:00
chenyu
c4bea54e9c
_threefry_random_bits to mixin (#15959)
start RandMixin
2026-04-28 19:13:57 -04:00
chenyu
77f9125c21
move Tensor.pad to OpMixin (#15946) 2026-04-27 16:56:04 -04:00
nimlgen
4164666c72
programinfo (#15942)
* programinfo

* fix

* m

* x

* x

* changes

* x

* fix

* rm
2026-04-27 23:12:03 +03:00
chenyu
fe38d6de94
_pad_circular and _pad_reflect_replicate to mixin (#15944) 2026-04-27 16:07:05 -04:00
nimlgen
bb652352c7
remove execitem (#15932)
* remove execitem

* f

* x
2026-04-25 19:33:04 +03:00
qazal
9a23de7d27
viz/cli: unify profile and rewrites, -s ALL default (#15931)
* work

* workg

* better

* cleanup

* better defaults

* --ls

* better

* work

* update llama

* update
2026-04-25 22:31:24 +09:00
nimlgen
a5e9ea7a60
remove schedule batch 4 (#15927)
* remove schedule batch 4

* fini
2026-04-25 12:36:55 +03:00
nimlgen
3c8a2db870
remove schedule() from tests batch 2 (#15923)
* remove schedule() from tests batch 2

* batch 4
2026-04-25 10:44:41 +03:00
Christopher Milan
57fbaa3d49
amd: fallback to llvm when comgr is not available (#15914) 2026-04-24 23:30:16 -04:00
nimlgen
d3378010ee
schedule() -> schedule_linear() in tests (batch 1) (#15915)
* schedule_with_vars -> linear_with_vars in tests

* tests batch 1

* batch 2

* estimate_uop

* simpler

* rm
2026-04-24 23:40:53 +03:00
chenyu
b501ba3e42
nll_loss to mixin (#15918) 2026-04-24 15:50:31 -04:00
chenyu
2f9fdb4a37
scatter to mixin (#15917) 2026-04-24 15:37:37 -04:00
nimlgen
f2751955cb
remove linear_to_schedule from tests (#15912)
* remove linear_to_schedule from tests

* x
2026-04-24 20:02:10 +03:00
chenyu
03a7604f76
sort argsort topk allclose to mixin (#15910) 2026-04-24 10:20:46 -04:00
chenyu
c24da99d56
avg_pool2d, max_pool2d to mixin (#15903)
* avg_pool2d, max_pool2d to mixin

* fix

* just dtype

* that
2026-04-23 23:36:17 -04:00
chenyu
08d9106c9f
scatter_reduce and sparse_categorical_crossentropy to mixin (#15902)
also use `.ne` to fix `# type: ignore[comparison-overlap]`
2026-04-23 21:06:36 -04:00
chenyu
8cc2c69e21
fix isclose mixin (#15898)
use `.eq` instead of `==`
2026-04-23 20:40:43 -04:00
chenyu
782bc6aece
broadcast in ElementwiseMixin.div [pr] (#15897) 2026-04-23 16:02:43 -04:00
chenyu
11c197955b
interpolate and cross_entropy to mixin (#15895) 2026-04-23 14:59:45 -04:00
chenyu
f0dbc68aa9
gather to mixin (#15891) 2026-04-23 14:00:57 -04:00
chenyu
87223f870e
logcumsumexp, argmax, argmin, sequential to mixin (#15890) 2026-04-23 12:10:42 -04:00
George Hotz
0c3260d5d9
rename VECTORIZE to STACK (#15880) 2026-04-23 10:43:42 +08:00