Commit graph

5,694 commits

Author SHA1 Message Date
George Hotz
e140f8f0d8
linearizer test_failure_61 (#10552)
* enumerate cases of Tensors in the JIT

* optional fused optimizers

* add fused optimizer test

* move that there

* ugh

* work on beautiful_cifar

* speed close to hlb_cifar

* test_failure_61

* just the failure
2025-05-28 21:30:50 -07:00
Sieds Lykles
ae02a1e232
[bounty] Z3 symbolic fuzzer [pr] (#10514)
* First version, caught a bug?

* Nicely print failure to reproduce

* Remove that

* Put the assert back

* Change fuzzing to use testing_unit so it has z3

* Test key to match

* Add rule

* Add test

* Add test for edge case 0

* Merge patterns

* update comment

* consistent whitespace

* whitespace

* add condition

* add test

* update comment

* use Variable

* fuzzer using z3_renderer

* Cleaned up printing and debugging

* working new fuzzer

* change some comments and printing

* more formatting

* fuzz failures in separate file

* fix fstring

* more tests

* naming

* remove added line

* remove comment

* print number of skipped expressions

* use self.assertEqual

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-28 16:28:37 -04:00
George Hotz
98f3d1c26d
enumerate cases of Tensors in the JIT (#10548)
2025-05-28 11:51:27 -07:00
qazal
d1f0043331
use store_val helper in test_schedule asserts [pr] (#10540)
2025-05-27 21:48:06 +03:00
George Hotz
5b268121d4
remove becomes map (#10533)
* remove becomes map

* add comment and delete dead code

* multi is a view
2025-05-27 11:47:11 -07:00
George Hotz
a07caaca0d
handle stride 0 variable reshape (#10536)
2025-05-27 10:00:24 -07:00
George Hotz
41e3d07d7f
view gradient is tricky (#10528)
* view gradient is tricky

* explicit
2025-05-26 22:28:30 -07:00
uuuvn
c29c46853f
Very basic mock sqtt (#10512)
This mockgpu sqtt emulation will just ignore basically everything and end
up with a 0x1000 size trace full of zeroes, but just testing for things
like register rename is better than nothing i guess
2025-05-26 14:38:28 -07:00
qazal
6d07087fe1
remove contiguous from MSELECT 2 (#10522)
* remove contiguous from MSELECT

* test_shrink_on_shard_axis

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2025-05-26 19:19:01 +03:00
geohotstan
602a145f8f
Add Tensor.unfold (#10518)
* yoinked 10272

* eitanturok's fixes

* hmmm should size be sint?

* add test
2025-05-26 11:15:44 -04:00
qazal
9169dcfb49
do not create kernels with more inputs than the backend allows (#10510)
* work

* no itertools + top down pass

* clean viz

* python can do that

* webgpu

* gbarrier of gbarrier is gbarrier

* device can be tuple

* bug in toposort

* failing test for gated toposort

* contiguous of gbarrier is gbarrier

* check for binops

* Revert "check for binops"

This reverts commit 53e3cdf720.

* viz + match on gbarrier, self exists by default

* alt

* green now

* cleanup
2025-05-26 18:02:03 +03:00
Sieds Lykles
478c76f4b7
More div conditions (#10432)
* add condition

* add test

* use Variable
2025-05-26 07:36:05 -04:00
Sieds Lykles
c6c7882bdf
bugfix: separate rule for x//d<-c (#10148)
* Add rule

* Add test

* Add test for edge case 0

* Merge patterns

* update comment

* consistent whitespace

* whitespace

* update comment
2025-05-26 07:35:41 -04:00
geohotstan
fd9f236a82
move test over (#10508)
2025-05-25 21:51:51 -04:00
Ahmed Harmouche
bbb6deff53
Increase op limit in test_index_mnist to pass on webgpu (#10504)
* Increase op limit to enable mnist indexing on webgpu

* Only relax op_limit on WebGPU
2025-05-24 09:37:31 -04:00
qazal
a9d0bf5c4c
proper error for device mismatch (#10500)
* failing test

* use bufs

* buf_uop

* not on cpu
2025-05-24 12:17:41 +03:00
George Hotz
9eee5ae276
it's copying the dataset every time (#10498)
* it's copying the dataset every time

* add comment

* expect failure

* todo
2025-05-23 21:25:53 -07:00
George Hotz
b58f2d4544
fix tests (#10493)
2025-05-23 18:38:07 -07:00
wozeparrot
a18963d9e7
feat: use tinygrad useragent (#10488)
2025-05-23 15:44:40 -07:00
qazal
7a762f01ab
s/shape_spec/ast_spec [pr] (#10485)
2025-05-23 15:43:54 +03:00
qazal
127a7c8aee
assert AST views only exist in the edges (#10484)
* assert AST views only exist in the edges

* valid without device
2025-05-23 15:27:09 +03:00
qazal
e491168685
add metadata note + whitespace fixup [pr] (#10483)
* add metadata note + whitespace fixup [pr]

* TestSchedule.test_kernelize_diamond
2025-05-23 14:37:45 +03:00
Sieds Lykles
ce6ebfb8ee
verify rewrites in test_uop_symbolic (#10430)
* verify rewrites in test_uop_symbolic

* use global context
2025-05-23 06:57:29 -04:00
George Hotz
1e4d63e06e
uops can have multiple metadata (#10479)
* uops can have multiple metadata

* fixups
2025-05-22 21:35:02 -07:00
George Hotz
9fc01c1e03
support for uop tags (#10477)
* support for uop tags [pr]

* test uop tags
2025-05-22 19:53:48 -07:00
chenyu
8cc2dff4d8
only float Tensors have gradient [pr] (#10475)
2025-05-22 21:02:11 -04:00
George Hotz
147f7747f2
remove the map from create_schedule_with_vars [pr] (#10472)
2025-05-22 15:58:25 -07:00
George Hotz
0d39bb5de1
rename to get_kernelize_map (#10465)
2025-05-22 11:44:44 -07:00
chenyu
7bfb20757c
fix tensor int floor div (#10327)
* fix tensor int floor div

* test_float_floordiv_scalar
2025-05-21 06:46:54 -04:00
Sieds Lykles
2b4375f36d
Correct divmod folding behind flag (#10433)
* add flag

* add test

* remove import
2025-05-21 06:46:13 -04:00
qazal
df4cbb69e9
move fuzz_schedule.py to extra [pr] (#10444)
2025-05-21 10:07:24 +03:00
chenyu
29624af872
skip commavq in external_model_benchmark (#10439)
precision issue with different onnxruntime version
2025-05-21 01:45:33 -04:00
George Hotz
03e7a99ca8
add edge cases found by codex [pr] (#10423)
* add edge cases found by codex [pr]

* another test

* more edgecases

* docs

* instructions

* fine, add that one

* nan cases

* roll failures

* inv prob

* more failing tests

* err, that's failing

* more tests

* more failures

* uop verif

* failures

* webgpu
2025-05-20 14:53:18 -07:00
nimlgen
2895198c36
am: download regs (#10419)
* am: download regs

* x

* linter

* mypy

* after merge

* raise

* fixed name

* fix

* xx

* remove

* missing reg

* missing reg

* move to online

* ops
2025-05-20 18:59:56 +03:00
uuuvn
ec9955c956
Use REAL_DEV for test skips (#10420)
This should fix remote cpu tests flakiness (segfaults were in
`test_data_parallel_resnet_train_step` which is skipped on cpu but wasn't
skipped on remote cpu)
2025-05-19 17:32:14 -07:00
Sieds Lykles
db09676250
Don't simplify gate in gate, fix FUSE_ARANGE=1 python test/test_ops.py TestOps.test_scatter_add (#10411)
* substitute out index

* Add test

* change comment
2025-05-19 13:16:21 -04:00
qazal
cc8dda1d75
move multi_map to grouper rewrite pass (#10409)
* move multi_map to grouper rewrite pass

* delete that
2025-05-19 10:44:06 +03:00
George Hotz
b06291077c
no amdgpu kernel driver (#10408)
* no amdgpu kernel driver

* don't test hip

* lower req
2025-05-18 20:52:39 -07:00
George Hotz
411392dfb7
move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
uuuvn
27c12be471
amd mockgpu graph support (#10385)
For testing remote graph stuff (prompted by #10371) in ci
2025-05-18 09:43:16 -07:00
qazal
04b23087d8
grouper tests from fuse_arange_default [pr] (#10394)
2025-05-18 18:42:43 +03:00
qazal
9e2089dcd4
don't raise Exception in process replay [pr] (#10392)
* don't raise Exception in process replay [pr]

* continue generating diffs unless [pr] is set, exit(1) otherwise

* change

* works
2025-05-18 11:23:23 +03:00
qazal
0294bfe507
simpler can_pad (#10364)
* simpler can_pad [pr]

* 3 kernels

* tests

* less kernels
2025-05-18 10:00:07 +03:00
George Hotz
6f77b938d7
Move getbits tests into test_helpers (#10382)
2025-05-17 17:04:00 -07:00
George Hotz
6ec88d94df
add tests for multi ram usage [pr] (#10376)
2025-05-17 15:33:40 -07:00
वेदांत
2453d99050
rms matching pytorch implementation (#10319)
* rms matching pytorch implementation

* pre commit fix

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-17 08:23:11 -07:00
qazal
e054b53a75
kernel count tests for pad [pr] (#10369)
* kernel count tests for pads

* handcoded rand one kernel

* comment

* prerealize device rng counter

* test_rand_handcoded generates /0

* remove track_rewrites
2025-05-17 17:20:46 +03:00
George Hotz
e13f2a3092
multi is O(1) (#10183)
* multi is O(1)

* allreduce

* no new uops needed

* junk

* something

* simple

* that's really what i want

* closer

* inject _device_num

* pretty print

* cleanups

* this

* early dnum

* ops allreduce is good

* ish

* device is the tuple and this is fine

* simpler

* progress

* copy_multi

* work

* more tests

* more tests pass

* work

* no None axis

* tests

* no none multi

* type fixes

* pre commit passes

* lil

* remove this

* mlperf dataloader on mac

* that test was wrong

* unbind

* support DEBUG=2

* realize

* only unbind bound vars

* don't include fixedvars

* graph test

* one test

* fixedvars in hcq

* new ring reduce

* ring reduce

* simpler ring

* mselect

* mselect doesn't work

* Revert "mselect doesn't work"

This reverts commit c78b77bd7d.

* Revert "mselect"

This reverts commit bb2e430ac3.

* simpler

* fixups

* no optional

* fix jit

* move things around

* cleanup multi

* simpler multi

* simpler reshape
2025-05-16 23:14:23 -07:00
George Hotz
e1a40e8040
add hcq fixedvars support [pr] (#10356)
* add hcq fixedvars support [pr]

* different test

* fixedvars are only for comp_queues

* fix hcq varvals
2025-05-16 22:05:53 -07:00
George Hotz
876d2275a1
changes from new multi (#10353)
* changes from new multi

* revert hcq change
2025-05-16 13:07:29 -07:00