Commit graph

189 commits

Author SHA1 Message Date
chenyu
bb8cf948f2
variation of (x%c)+(x//c)*c = x (#13135)
when x is in the form of y//b, the idiv term might have combined
2025-11-06 18:53:28 -05:00
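The identity in the title above can be checked by brute force. The following is a minimal sketch in plain Python, independent of tinygrad. `cdiv` and `cmod` are local helpers written for this illustration (C-style truncating division and remainder), and `nested_variation` is only one possible reading of the "x is in the form of y//b" case, restricted to positive `b` and `c`.

```python
def cdiv(x: int, y: int) -> int:
    """Integer division that truncates toward zero (C semantics)."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q

def cmod(x: int, y: int) -> int:
    """Remainder paired with cdiv, so its sign follows the dividend."""
    return x - cdiv(x, y) * y

def recombines(x: int, c: int) -> bool:
    """(x%c) + (x//c)*c == x under floor and under truncating semantics."""
    floor_ok = (x % c) + (x // c) * c == x
    trunc_ok = cmod(x, c) + cdiv(x, c) * c == x
    return floor_ok and trunc_ok

def nested_variation(y: int, b: int, c: int) -> bool:
    """With x = y//b, the div term may already appear merged as y//(b*c)."""
    return ((y // b) % c) + (y // (b * c)) * c == y // b

all_hold = all(recombines(x, c) for x in range(-50, 51) for c in range(-7, 8) if c != 0)
nested_hold = all(nested_variation(y, b, c)
                  for y in range(-50, 51) for b in range(1, 6) for c in range(1, 6))
print(all_hold, nested_hold)
```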
Sieds Lykles
4c8362128b
New symbolic renderer + strip parens (#13017)
* new uop renderer

* better tester

* strip parens

* update tests

* split method check_uop_against_string

* use ctx.update instead of add_rendered method

* strip parens based on precedence

* update test

* new symbolic renderer

* add comment
2025-10-30 16:41:32 +01:00
Sieds Lykles
70bce62c67
dont collapse possibly empty symbolic range (#12994)
* dont collapse a symbolic range based on min/max

* refactor z3 renderer

* include sink explicitly instead of dtypes.void

* use dtype.scalar()
2025-10-29 12:17:09 +01:00
George Hotz
7762b3558b
clean up the spec (#12868)
* tighten up the spec

* move validate into a different file

* that moved to validate

* after(barr)
2025-10-22 19:50:42 +08:00
Sieds Lykles
e0139fafc1
UOp symbolic tests use eval to check against string (#12643)
2025-10-13 14:19:42 +02:00
Sieds Lykles
b465c17b56
Revert "UOp.factor and add chain sorting (#12413)" (#12492)
This reverts commit e74be4a140.
2025-10-08 03:20:23 +02:00
Sieds Lykles
e74be4a140
UOp.factor and add chain sorting (#12413)
* add ordering

* fix some tests

* fix more tests

* shorten comment

* update test

* add rule and test

* add rule and test

* remove check

* use fold_divmod_congruence instead of simplify

* adjust tests

* shorten line

* new algo

* add test

* add function to un-nest the div

* add UOp.factor

* test UOp.factor

* uop_given_valid tries to factor simplex expression

* shorten line

* symbolic_flat is back

* change that back

* fix those new tests

* new rule for ordering

* factor multiple factors

* no symbolic_flat

* symbolic_flat to there

* move that back

* fix imports

* merge correctly

* linter happy

* add rule

* add a test

* cleanup

* revert that for now

* UOp.factor returns self instead of None

* try all_candidates

* remove or_else

* post index symbolic

* add test

* make this closer to the original

* increase mac hlb_cifar min step time

* add some ordering tests

* cleanup

* increase pytest timeout time

* check dtype
2025-10-04 06:05:38 +02:00
Sieds Lykles
16a65b4fd0
fix test_symbolic_gcd_div hang (#12427)
2025-10-03 04:21:16 +02:00
George Hotz
cdfa0f29fd
add rendering to index (#12338)
2025-09-30 09:18:05 +08:00
Sieds Lykles
45c7252aed
Better div nesting 2 (#11812)
* remove check

* use fold_divmod_congruence instead of simplify

* adjust tests

* shorten line

* new algo

* add test

* cleanup

* update tests

* ALLOWED_GATED_READ_IMAGE from 16 -> 12

* only remove the call to simplify

* add option to simplify with factor_remainder

* Allowed readimage gates back to 16
2025-09-24 04:50:26 +02:00
Sieds Lykles
8d703a6369
z3 xor doesnt use bitcast (#12243)
2025-09-19 00:31:44 +02:00
Sieds Lykles
158506b91e
Upgrade some divmod folding for symbolic divs (#12216)
* use const_factor() instead of arg

* add test

* change div min_max

* add tests

* add divide_by_symbolic_gcd

* add tests

* one more test

* Slice to unbind symbolic

* deal with const factor properly

* minor cleanup

* divide_by_symbolic_gcd becomes UOp.gcd and UOp.divide_exact

* add tests

* add gcd_without_const

* fix divide_exact bug

* add factor_remainder

* add tests

* fix imports

* elif -> if

* remove expectedFailure

* add more tests

* add more unwrap

* fix signature of pop_const

* remove that

* remove that
2025-09-17 03:00:50 +02:00
Sieds Lykles
e3a3764917
delete fold_unrolled_divs (#12146)
2025-09-13 03:09:36 +02:00
Sieds Lykles
1f3950a484
Invalid idx (#12067)
* merge index_dtype_3

* new lowering with Invalid idx

* remove that dtype from range

* finish merge

* annotate better

* indentation

* dont need that anymore

* always process replay for openpilot

* more uop_given_valid for idx

* valid past index_child

* fix bug preventing load getting an alt value

* add track_match_stats back in in shapetracker and remove cache

* get_valid_idx -> get_valid and get_idx

* fix heuristics with new idx

* split line

* fix typo

* fix signature

* dont skip idx if stride is 0

the idx may still be invalid

* lower const with new valid

* delete to_indexed_uops

* update shapetracker test

* delete axis_is_masked

* add cache back

* move around comment

* fix get_valid bug

* move invalid fold to symbolic so its earlier

* cleanup

* update applying padto to new idx

* add unit tests

* cleanup

* fold line

* improve spec

* dont try to render Invalid as a float

* more consistent invalid index

* update some tests

* Fold index with true cond

* skip test

* vconst min max if Invalid in arg

* fix signature of UOp.const

* add test for min/max of Invalid CONST/VCONST

* add InvalidType to as_const signature

* is Invalid to isinstance

* Add InvalidType to ConstLike

* index gate is a where gate

* make that a metaclass

* fix heuristics for new idx

* mypy happy
2025-09-12 01:42:02 +02:00
Sieds Lykles
499f50483b
x | !x -> True (#12090)
2025-09-10 03:26:01 +02:00
Sieds Lykles
75b58fe2d3
move simplify_valid pat to sym (#12065)
* move simplify_valid pat to sym

* fix expectedfailure
2025-09-08 07:01:26 +02:00
Sieds Lykles
581b2388c2
add dtypes.index (#12015)
* add dtypes.index

* cast shape, stride and mask to dtypes.index in view.create

* move pm_lower_index_dtype to ops

* DEFINE_VAR is dtype.index by default

* merge var_val_using_str

* remove int from commutative

* fix test_rewrite_map

* change that to dtypes.index

* change some int to index

* shorten those

* remove old cast in renderer

* cleanup

* change that back

* add comment

* delete comment

* just delete those

* view doesnt have to cast anymore

* adjust comment
2025-09-06 06:03:44 +02:00
Sieds Lykles
c6c16b2946
var_vals uses str for var (#12011)
* var_vals is str,int

* remove imports

* remove print

* fix test

* change var_vals in hcq

* update test_hcq

* fix multitensor _device_num var

* fix syminfer test

* shorten line

* p.vars stays list[Variable]

* shorten line

* vars is back to tuple[Variable, ...]

* change var_vals in extra

* change var_vals from shapetracker

* var_vals is str:int

* fix signature
2025-09-06 04:16:12 +02:00
Sieds Lykles
f5404ca53c
Divmod combine - associative variations (#12017)
* add rule and test

* more rules and tests

* add all four variations

* fix test

* test fixed!

* adjust comment

* add new variations

* disable intel tensor core ops count test for bigger_matmul_half
2025-09-05 03:44:02 +02:00
Sieds Lykles
86e908db57
cast parents of int64 alu to int32 if possible (#11977)
* add overflows helper

* add rules

* x -> y

* check overflow of u too

* cleaner

* use alu instead of replace to preserve vectorization

* just one rule

* add test
2025-09-03 11:05:04 +02:00
Sieds Lykles
0bc34c000f
simplify range mod its own upper bound (#11917)
* add rules

* add tests
2025-08-30 08:37:35 +02:00
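The simplification named in this commit rests on a simple fact: a value that only ranges over 0..n-1 is unchanged by `% n`. A small sketch of that fact in plain Python, assuming the range starts at 0 and `n` is positive (the commit's actual rule conditions are not shown in this log):

```python
def mod_is_identity_on_range(n: int) -> bool:
    """True if r % n == r for every r produced by range(n)."""
    return all(r % n == r for r in range(n))

# Holds for every positive upper bound tried, which is why the mod can be dropped
# even when n is symbolic, as long as n is known to be positive.
holds_everywhere = all(mod_is_identity_on_range(n) for n in range(1, 200))
print(holds_everywhere)
```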
Sieds Lykles
d39365809a
add ctx to z3_renderer arg (#11867)
* add ctx to z3_renderer arg

* update symbolic fuzzer

* rewrite u1,u2,u3

* update fuzz_fast_idiv

* remove imports
2025-08-27 03:38:15 +02:00
Sieds Lykles
a3aeef45cc
associative variation of where branch-merging (#11851)
* add rule and test

* change comment
2025-08-26 19:27:05 +02:00
George Hotz
6540bb32a6
move into codegen late [pr] (#11823)
2025-08-24 10:23:25 -07:00
Sieds Lykles
dd69114573
Revert "Better div nesting (#11811)" (#11818)
This reverts commit 952f729b07.
2025-08-24 18:11:24 +02:00
Sieds Lykles
952f729b07
Better div nesting (#11811)
* remove check

* use fold_divmod_congruence instead of simplify

* adjust tests

* shorten line
2025-08-24 04:17:40 +02:00
Sieds Lykles
e652062f92
tweak divmod_folding condition (#11810)
2025-08-24 02:59:02 +02:00
Sieds Lykles
07d4ed7e4c
one more symbolic add variation (#11807)
2025-08-24 01:15:04 +02:00
Sieds Lykles
6a50ab6b87
adjust idiv min_max (#11802)
* change div min_max

* add tests
2025-08-23 22:25:51 +02:00
ttomsa
70c3f1fb29
x.where(False, True) -> !x (#11738)
* add pat

* add test
2025-08-19 19:08:16 -04:00
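The rewrite in this commit's title is a two-row truth table. A sketch in plain Python, where `where` is a local stand-in for a select operation (an assumption for illustration, not tinygrad's API):

```python
def where(cond: bool, if_true: bool, if_false: bool) -> bool:
    """Select if_true when cond holds, otherwise if_false."""
    return if_true if cond else if_false

# Evaluate x.where(False, True) for both boolean inputs.
table = {x: where(x, False, True) for x in (False, True)}
matches_not = all(table[x] == (not x) for x in (False, True))
print(table, matches_not)
```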
Sieds Lykles
06beeb6e13
Nest div even if factor is negative (#11666)
2025-08-14 13:58:59 +02:00
Sieds Lykles
661e9a2d5d
div_and_mod_folding refactor (#11585)
* divmod const folding is its own function

* split nested mod optimization out of div and mod folding

* make `fold_binary_numerator` its own function

* factor out `fold_divmod_congruence`

* check sign of numerator

* add tests

* assert int on vmin and vmax

* add type: ignore

* factor out more rules

* remove div_and_mod_folding

* cached_property to property

* remove import

* add returns

* restore old order

* check sign of x.vmin and newx.vmin

* check more signs

* add some test that would have caught bugs

* better test if the div simplified

* shorten line

* replace terms_factors_const with pop_const

* move that back

* minor cleanup

* remove comments

* some cleanup
2025-08-14 11:52:42 +02:00
chenyu
4fe19eec72
Ops.TRUNC (#11659)
2025-08-13 18:40:48 -04:00
Sieds Lykles
4c3982c44e
Take sign out of mod (#11631)
* Add rule and test

* fix tests
2025-08-12 18:44:36 +02:00
chenyu
e0106b6b25
1/(x*c) -> (1/c)*(1/x) (#11491)
example: 2*(2*a).reciprocal() -> a.reciprocal()

# TODO: bounds for reciprocal
# TODO: should z3 work?
2025-08-03 23:35:46 -04:00
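The reciprocal split above, and the example given in the commit message, can be verified exactly with rational arithmetic. This is a sketch using Python's `fractions` module rather than floats or tinygrad tensors; `recip` is a local helper named for this illustration.

```python
from fractions import Fraction

def recip(v: Fraction) -> Fraction:
    """Exact reciprocal of a nonzero rational."""
    return 1 / v

def split_holds(x: Fraction, c: Fraction) -> bool:
    """1/(x*c) == (1/c)*(1/x)"""
    return recip(x * c) == recip(c) * recip(x)

def example_holds(a: Fraction) -> bool:
    """2*(2*a).reciprocal() == a.reciprocal()"""
    return 2 * recip(2 * a) == recip(a)

samples = [Fraction(n, d) for n in (-5, -1, 1, 3, 7) for d in (1, 2, 9)]
split_ok = all(split_holds(x, c) for x in samples for c in samples)
example_ok = all(example_holds(a) for a in samples)
print(split_ok, example_ok)
```

With floating-point values the two sides can differ by rounding, which is one reason the check here uses exact rationals.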
George Hotz
e14b4fefa5
ranges on store (#11334)
* ranges on store

* fix store spec

* fix that

* fix gates

* fix tests

* fix ptx
2025-07-22 21:00:50 -07:00
Sieds Lykles
53985297bd
add test, fix rewrite rule and raise error on division by zero (#11073)
2025-07-03 08:25:06 -04:00
Sieds Lykles
61dad3740f
fix min_max and add test (#10952)
2025-06-24 09:33:26 -04:00
Sieds Lykles
b1fefb76dd
More conditions for (x//c1+a)//c2 -> (x+a*c1)//(c1*c2) (#10834)
* add rule and test

* typo

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-16 16:34:52 -04:00
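Why this rewrite needs conditions can be seen by comparing division semantics. The sketch below, in plain Python, checks (x//c1+a)//c2 against (x+a*c1)//(c1*c2) for positive `c1` and `c2`: under floor division the two always agree on the tested range, while under truncating division (the local helper `cdiv`, written for this illustration) they can disagree. Which exact conditions the commit adds is not shown in this log.

```python
def cdiv(x: int, y: int) -> int:
    """Integer division that truncates toward zero (C semantics)."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q

def floor_div(x: int, y: int) -> int:
    return x // y

def nested(div, x, a, c1, c2):
    """(x//c1 + a)//c2 with a pluggable division."""
    return div(div(x, c1) + a, c2)

def merged(div, x, a, c1, c2):
    """(x + a*c1)//(c1*c2) with a pluggable division."""
    return div(x + a * c1, c1 * c2)

cases = [(x, a, c1, c2) for x in range(-20, 21) for a in range(-5, 6)
         for c1 in range(1, 5) for c2 in range(1, 5)]
floor_always_equal = all(nested(floor_div, *t) == merged(floor_div, *t) for t in cases)
trunc_mismatches = [t for t in cases if nested(cdiv, *t) != merged(cdiv, *t)]
print(floor_always_equal, len(trunc_mismatches) > 0)
```

One mismatch under truncation is x=-1, a=2, c1=2, c2=2: the nested form gives 1 and the merged form gives 0.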
Sieds Lykles
deb6af0638
Remove incorrect rule for x%-d -> (x%d)*-1 (#10832)
* fix rule and add test

* combine tests
2025-06-16 11:37:44 -04:00
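A quick check shows why x%-d -> (x%d)*-1 is not a valid rewrite. In this plain-Python sketch, `cdiv` and `cmod` are local helpers for C-style truncating semantics (an assumption for illustration). Under those semantics the remainder's sign follows the dividend, so x % -d equals x % d, not its negation.

```python
def cdiv(x: int, y: int) -> int:
    """Integer division that truncates toward zero (C semantics)."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q

def cmod(x: int, y: int) -> int:
    """Remainder paired with cdiv, so its sign follows the dividend."""
    return x - cdiv(x, y) * y

pairs = [(x, d) for x in range(-30, 31) for d in range(1, 8)]

# Negating the divisor leaves the truncating remainder unchanged.
sign_of_divisor_ignored = all(cmod(x, -d) == cmod(x, d) for x, d in pairs)

# The removed rule fails whenever the remainder is nonzero.
removed_rule_failures = [(x, d) for x, d in pairs if cmod(x, -d) != cmod(x, d) * -1]
print(sign_of_divisor_ignored, (7, 3) in removed_rule_failures)
```

The rule is also wrong under Python's floor semantics: `7 % -3` is `-2`, while `(7 % 3) * -1` is `-1`.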
leopf
118a09ddcf
xor self folding (#10806)
* xor folding

* tests + z3 bitwise xor
2025-06-14 10:01:17 -04:00
Sieds Lykles
478c76f4b7
More div conditions (#10432)
* add condition

* add test

* use Variable
2025-05-26 07:36:05 -04:00
Sieds Lykles
c6c7882bdf
bugfix: separate rule for x//d<-c (#10148)
* Add rule

* Add test

* Add test for edge case 0

* Merge patterns

* update comment

* consistent whitespace

* whitespace

* update comment
2025-05-26 07:35:41 -04:00
Sieds Lykles
ce6ebfb8ee
verify rewrites in test_uop_symbolic (#10430)
* verify rewrites in test_uop_symbolic

* use global context
2025-05-23 06:57:29 -04:00
Sieds Lykles
2b4375f36d
Correct divmod folding behind flag (#10433)
* add flag

* add test

* remove import
2025-05-21 06:46:13 -04:00
George Hotz
411392dfb7
move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
Kirill R.
4c7c139102
Use cmod/cdiv in sym_infer (#10258)
* Use cmod/cdiv in sym_infer

* test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-12 09:07:28 -04:00
Sieds Lykles
74e40aafa0
use cdiv in div and mod folding (#10216)
* use cdiv

* use cdiv and cmod there as well

* Add tests

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-09 12:37:24 -04:00
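The distinction behind this commit is between Python's floor division and C-style truncating division, which only differ when the operands have opposite signs. A minimal sketch follows; the `cdiv` and `cmod` defined here are local stand-ins written for this illustration, not necessarily the helpers the commit uses.

```python
def cdiv(x: int, y: int) -> int:
    """Integer division that truncates toward zero (C semantics)."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q

def cmod(x: int, y: int) -> int:
    """Remainder paired with cdiv, so its sign follows the dividend."""
    return x - cdiv(x, y) * y

# Same result for non-negative operands.
agree_when_nonnegative = all(
    cdiv(x, y) == x // y and cmod(x, y) == x % y
    for x in range(0, 40) for y in range(1, 9))

# Different results once the dividend is negative and the division is inexact.
floor_pair = (-7 // 2, -7 % 2)            # Python floor semantics
trunc_pair = (cdiv(-7, 2), cmod(-7, 2))   # C-style truncation
print(agree_when_nonnegative, floor_pair, trunc_pair)
```

Constant folding done with Python's `//` and `%` would therefore give a different answer than truncating code on negative inputs, which is the kind of mismatch a C-style helper avoids.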
Sieds Lykles
8da9c070ca
take gcd out of trunc div (#10238)
2025-05-09 12:08:10 -04:00
chenyu
9846435c2e
fix test_div_numerator_negative (#10229)
the simplification was wrong with negative const_factor
2025-05-09 06:19:59 -04:00