5,694 commits

Author SHA1 Message Date
qazal
4ef10c57f9
remove unused test helper (#10999) 2025-06-27 13:48:48 +03:00
qazal
a39343e39f
viz: move timeline layout to python (#10998)
* viz: move timeline layout to python

* DevEvent has a device and a name
2025-06-27 13:06:00 +03:00
George Hotz
b4eb876d5a
kernel.py no longer permutes reduce axis [pr] (#10968)
* kernel.py no longer permutes reduce axis [pr]

* delete tests that handcode uops

* regen of sops is broken...

* put import back

* just remove that

* disable those tests
2025-06-26 17:44:58 -07:00
qazal
1127302c46
move perfetto to extra (#10994)
* move perfetto to extra

* update TestViz and fix tests

* remove perfetto.html from viz directory

* work

* mypy
2025-06-27 01:53:54 +03:00
qazal
712980e167
fix extract_dataset + add tests to CI (#10995)
* fix extract_dataset + tests

* add CI

* sops.gz itself is same as master

* yml + gzip -c + ge

* don't commit that

* bump limit to 1000

* axis=7

* test_tiny
2025-06-27 01:51:36 +03:00
Ignacio Sica
579194f523
remove some linearize calls from tests 2 [pr] (#10992)
* refactor count_float4 to take uops as input instead of kernel

* remove some calls to linearize in test_linearizer

* remove some more calls

* remove one more call
2025-06-26 18:22:27 -03:00
geohotstan
50936b4a18
ONNX real float16 (#10694)
* squash commits

* temp fix for const tensor

* actually realizing float16 can only happen in raw_data

* .float -> cast(float) to rerun CI

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-26 14:05:12 -04:00
chenyu
49bba2f0a0
improve test_nll_loss (#10986)
build target and weight tensors outside so it tests backward too.
2025-06-26 02:46:55 -04:00
chenyu
0612acfc70
improve Tensor.cross_entropy (#10985)
separate when Y is prob vs indices and check shapes for indices. also fix higher dim cases
2025-06-26 01:39:48 -04:00
chenyu
8751d47985
CosineAnnealingLRWithWarmup (#10981) 2025-06-25 17:45:21 -04:00
Ignacio Sica
21f1c4cc09
remove some linearize calls from tests [pr] (#10978)
* remove some linearize calls from tests

speed_compare_cuda_ptx
test_uop_spec
test_linearizer
test_uops
test_winograd

* more clear assert message
2025-06-25 12:37:17 -07:00
Ignacio Sica
98d2cde293
revert tc_group feature (#10971) 2025-06-24 20:58:13 -07:00
George Hotz
cf60ccac6a
support new const lowering (#10967)
* support new const lowering

* delete invalid linearizer failure tests
2025-06-24 15:21:41 -07:00
George Hotz
8a65720528
hotfix: disable test_tensor_core_opts_group test on real metal 2025-06-24 15:21:33 -07:00
George Hotz
8743ca40e2
force reduce to be in axis order (#10837)
* force reduce to be in axis order

* disable rule causing loop

* disable that rule

* no ra there

* only move non reduce

* fix tests
2025-06-24 13:00:16 -07:00
chenyu
bfa87f3490
clean up binary_crossentropy_logits (#10958) 2025-06-24 12:23:40 -04:00
qazal
de4b9bf53b
add opts_to_apply option to AST KernelInfo (#10950)
* proposal: add option to override opts in the get_program API

* update test_linearizer_rewrite

* state in uops

* update process_replay and names

* empty isn't none

* fix process replay
2025-06-24 18:55:39 +03:00
chenyu
18e264a449
Tensor.logsigmoid (#10955) 2025-06-24 11:16:14 -04:00
b1tg
cc32394b32
support copyin/copyout/is_allocated for subbuffers (#10869)
* support copyin/copyout/is_allocated for subbuffers

* simple

* clean up

* rm underlying_buf
* add function is_initialized
* add tests

* better test_subbuffer_copy_in_out

* fix allocator

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-06-24 07:49:04 -07:00
chenyu
35504c938e
torch.clip(x,y) -> x.clip(y) in test_ops (#10954)
* torch.clip(x,y) -> x.clip(y) in test_ops

* test_binary_crossentropy_logits_pos_weights
2025-06-24 10:22:19 -04:00
Fang-Pen Lin
86d458533f
Add pos_weight for binary_crossentropy_logits (#10855)
* Add pos_weight for binary_crossentropy_logits

* Remove debug code

* Code style

* Code style

* Rename
2025-06-24 09:42:37 -04:00
Sieds Lykles
61dad3740f
fix min_max and add test (#10952) 2025-06-24 09:33:26 -04:00
qazal
f41c28a048
update test_tensor_uop_representation comments [pr] (#10946)
These comments can be updated to match the new tinygrad.
2025-06-24 10:47:09 +03:00
qazal
7a5e4e0bf1
fix unittests process replay [pr] (#10947) 2025-06-24 10:30:23 +03:00
George Hotz
7d560dbd75
hotfix: corealize in the tiny mnist test 2025-06-23 17:41:16 -07:00
George Hotz
0f89660ce4
Revert "change clang -march flag to -mcpu on arm (#10841)" (#10942)
This reverts commit 897e42fd1b.
2025-06-23 16:48:28 -07:00
Ignacio Sica
956a8391a5
minor cleanup on test_tensor_core_opts tests (#10924)
* minor cleanup on test_tensor_core_opts tests

Tests now notify when skipped
Before, they silently skipped if the backend didn't have half precision and
accumulation
Also cleaned up atol and rtol setup

* refactor test_tensor_core_opts_group

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-06-23 16:30:21 -07:00
ttomsa
897e42fd1b
change clang -march flag to -mcpu on arm (#10841)
* change clang -march flag to -mcpu with fp16 disassembly test

* fix

* add capstone to macos dependencies

* just check no cast in test

* rm import

* woops

* lets check

* move check

* llvm init before cpu check

* try this

* bump autogen llvm version

* also update libclang?

* revert

* add comment

* skip llvm test and add comment

* linter
2025-06-23 16:28:48 -07:00
Sieds Lykles
772cd02ad2
Perform index validation on load/store, not on the index (#10849)
* move index validation to load/stores

* add name

* add linearizer_failure

* add validate_store with implicit gates

* linearizer_failure_58 is fixed!

* add test_uop_graph test

* rename cond to gate

* test gated load/stores

* use or_casted()
2025-06-23 16:25:05 -07:00
George Hotz
ae4d2d71b4
bump line count to 14500 2025-06-23 15:32:27 -07:00
Harsh Natuskar
79d7cdd9ba
Fix device (#10929)
* fix: pkg

* better

* added test

* less lines
2025-06-23 15:30:19 -07:00
George Hotz
e15754db28
remove (some) kernelize from llama and test schedule speed (#10939)
* remove kernelize from llama

* 405B

* space
2025-06-23 15:07:31 -07:00
chenyu
42b1c9625b
skip test TestKiTS19Dataset::test_training_set (#10936)
flaky
2025-06-23 14:27:24 -04:00
patrini32
9e9fd44987
refactor test/external/external_llama_eval.py (#10567)
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-06-23 10:43:20 -07:00
qazal
ac39f27ae6
viz: non blocking UOp tracing (#10913)
* viz: non blocking UOp tracing

* u.arg

* no if Ops.KERNEL

* drop replace

* switch to weakref.WeakKeyDictionary

* back

* remove ram usage skips, viz works here

* cache on reconstruct
2025-06-23 19:59:28 +03:00
Ignacio Sica
b8d09a1dae
tc with group/grouptop (#10903) 2025-06-23 09:58:41 -07:00
qazal
7820aeca8e
update codegen process replay to use get_program [pr] (#10921)
* update codegen process replay to get_program [pr]

* precommit

* try str replace

* +to_function_name

* fixup tc

* local2.sh

* fix openpilot NOLOCALS

* new local.sh

* correct merge

* beam cache

* back

* revert beam thing

* adding opts_override and name_override makes output of get_program
reproducible

* min diff
2025-06-23 17:31:41 +03:00
alpharush
22f9696522
Fix/hcqfuzz harness bug (#10923)
* update command so extra module is found

* fix empty range in randrange errors

* lint
2025-06-23 11:22:30 +03:00
qazal
9201224e0b
viz: remove Kernel check [pr] (#10920)
* viz: remove Kernel check [pr]

* TestVizIntegration

* test/unit allows opening of devices

* kernel -> Kernel
2025-06-22 20:47:54 +03:00
geohotstan
4ab7d792cc
ONNX improve dtype fallback (#10800)
* fix

* add early verbose demo test

* is this how to write tests :s

* is definition drift even a thing? gemini says it is

* clean up

* better

* even better

* try add to CI

* doesn't work quite yet

* much more work to be done

* whoops

* partition the test heh

* skipif

* some nits for better names

* add webgpu test for onnxrunner

* fix reference links

* flush for now
2025-06-21 19:29:45 -04:00
chenyu
0480139def
log_perplexity metrics (#10912) 2025-06-21 10:44:47 -04:00
nimlgen
0e7bd9fd03
factor out generic MemoryManager (#10910)
* allocator -> memory

* just move it out

* mm is abstracted

* need entry abstraction

* fix

* mypy
2025-06-21 16:18:33 +03:00
qazal
c7ec913210
viz: cleanup unit tests (#10909)
* cleanup test_viz

* tree view
2025-06-21 12:35:09 +03:00
chenyu
2d9c61e39e
test more dims in test_logsumexp and test_logcumsumexp (#10907)
refactoring squeeze and unsqueeze is easy to get wrong
2025-06-20 21:42:18 -04:00
Nino Risteski
3771cc0f77
fix test logcumsumexp broken devectorize=0 (#10880)
* fix test logcumsumexp numerical

* lint

* Use dtypes.min instead of -1e4
2025-06-20 20:54:50 -04:00
George Hotz
7636d2cdc5
flip order of get_program args (#10905) 2025-06-20 17:23:23 -07:00
George Hotz
1ce63f8d04
move functions to view and update docs [pr] (#10904)
* move functions to view and update docs [pr]

* move quantize
2025-06-20 16:47:58 -07:00
George Hotz
b41e0563a3
move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
92678e59ee
move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
George Hotz
fc9f883870
if upat returns self, it's none (#10898)
* if upat returns self, it's none

* fix pm tests
2025-06-20 12:11:19 -07:00