tinygrad/test
gswangg df44a4e861
Make vectorization of CONST explicit (#5322)
* remove test_const_vectorize_fold

* remove const folding UPat for VECTORIZE

* refactor cstyle render_const

* remove calls to dtype.scalar() in render_const

* add assert

* add vectorized const to UOp.const

* add UPat GEP-VECTORIZE-CONST -> CONST

* render_vectorize for DEFINE_ACC in cstyle

* add back missing render_cast in render_const

* generate vectorized consts as UOps for DEFINE_ACC

* update asserts for DEFINE_ACC with VECTORIZE src

* add UPats for PHI with VECTORIZE src

* use prev rendered vectorize in DEFINE_ACC render

* update DEFINE_ACC in python runtime

* update vectorized DEFINE_ACC in PTXRenderer

* rebase DEFINE_ACC changes on lowerer

* verbose rewrite of bad UPats

* simplify UOps.CONST implementation in ops_python

* update sum_collapse UPats for DEFINE_ACC-VECTORIZE

* revert linearizer to TOT

* fix DEFINE_ACC implementation in ops_python

* simplify DEFINE_ACC in cstyle

* Fix linter error

* support VECTORIZE in fold gated load/store UPat

* support VECTORIZE in other fold gated load UPats

* rewrite VECTORIZE in UPat for no input DEFINE_ACC

* simplify DEFINE_ACC render in cstyle

* make VECTORIZE rules more concise

* add more vectorize fold tests

* inline VECTORIZE-CONSTs in cstyle render

* revert VECTORIZE/GEP rule refactor

* revert cstyle render_const refactor

* inline VECTORIZE-CONSTs in cstyle render

* implicitly vectorized const rendering -> explicit

* WMMA VECTORIZE CONST process replay hacks

* VECTORIZE CONST NAN process_replay hacks

* more VECTORIZE CONST NAN hacks

* cleanup process_replay hacks

* isnan() -> not isfinite() cstyle VECTORIZE CONST

* tweak isnan and isfinite checks VECTORIZE CONST

* tweak for positive vs negative infinity VECTORIZE CONST

* add assert to PTX CONST render

* process_replay VECTORIZE CONST render parity for PTX STORE

* vmin/vmax for VECTORIZE'd CONST

* update WMMA folding rules

* add tests for WMMA VECTORIZE fold

* hack for cstyle half4 CONST zero process_replay parity

* revert PTX backend changes

* add back minimal DEFINE_ACC PTX change

* remove cstyle process_replay hacks

* remove dead code in PTX CONST render

* cleanup vmin/vmax logic for VECTORIZE'd CONSTs

* update vectorize fold tests to use DEFINE_VAR

* fix long line formatting in test

* remove unwanted merge artifact

* more vmin/vmax cleanup

* remove unnecessary asserts

* yet more vmin/vmax cleanup

* get rid of explicit VECTORIZE CONST logic in _min_max

* reuse CONST instead of creating a new one

* remove unneeded cast

* handle DType correctly in sconst

* improve readability of tests

* save a line

* save another line

* tuplize pats in src

* remove GEP-VECTORIZE pats

* add vec +0 fold

* HACK: fold only vec8 +0

* remove vectorized ALU fold hack

---------

Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-08-08 20:59:05 +03:00
external graph LBScheduleItem [run_process_replay] (#5960) 2024-08-07 19:59:11 +03:00
imported skip some redundant and slow tests in ci (#5416) 2024-07-12 14:43:13 -04:00
models add failing regression test for image (#5540) 2024-07-17 17:27:18 -07:00
testextra names shadowing builtins (#5179) 2024-06-27 08:15:01 -04:00
unit trim const in UOp div_folding (#5982) 2024-08-08 12:49:05 -04:00
web fast path for copy (#2548) 2023-12-01 11:34:47 -08:00
__init__.py All devices are equal! (#196) 2020-12-15 23:44:08 -08:00
Dockerfile Docker fix (#1039) 2023-06-25 10:38:58 -07:00
helpers.py UOp.const_factor [run_process_replay] (#5945) 2024-08-06 18:18:29 -04:00
test_arange.py don't reduce the same thing in a vector (#5950) 2024-08-06 16:59:15 -07:00
test_assign.py test masked assign views (#4599) 2024-05-15 15:06:48 +03:00
test_const_folding.py MetaOps.KERNEL (#5543) 2024-07-17 19:41:23 -07:00
test_conv.py db in wal mode (#5388) 2024-07-12 20:43:36 -07:00
test_conv_shapetracker.py test: put conv in one reduce (#4441) 2024-07-22 12:16:13 +03:00
test_copy_speed.py remove cpu and torch backends (#3399) 2024-02-15 16:55:39 +01:00
test_custom_function.py s/loadops/metaops [run_process_replay] (#5421) 2024-07-12 13:26:50 -07:00
test_device_speed.py move uopgraph to file [run_process_replay] (#5364) 2024-07-10 17:34:50 -07:00
test_dtype.py remove CUDACPU flag in tests [run_process_replay] (#5902) 2024-08-04 16:06:38 -04:00
test_dtype_alu.py remove CUDACPU flag in tests [run_process_replay] (#5902) 2024-08-04 16:06:38 -04:00
test_fusion_op.py process replay in all of CI (#4884) 2024-06-10 14:49:29 -04:00
test_fuzz_shape_ops.py fix typing for test to run in py38 (#4930) 2024-06-12 13:22:30 -04:00
test_gc.py threefry again (#3785) 2024-03-18 16:47:07 -04:00
test_graph.py fix hcq sync (#5062) 2024-06-26 17:50:37 +03:00
test_hcq.py fix non-jitted transfers in profile (#5980) 2024-08-08 17:58:08 +03:00
test_image_dtype.py add failing regression test for image (#5540) 2024-07-17 17:27:18 -07:00
test_jit.py remove realize from threefry (#5969) 2024-08-07 15:08:49 -07:00
test_kernel_cache.py move the compiler cache to be global (#2957) 2024-01-01 10:59:56 -08:00
test_lazybuffer.py scheduleitem is not Tuple [run_process_replay] (#5425) 2024-07-12 15:13:19 -07:00
test_lazyop.py scheduleitem is not Tuple [run_process_replay] (#5425) 2024-07-12 15:13:19 -07:00
test_linearizer.py remove realize from threefry (#5969) 2024-08-07 15:08:49 -07:00
test_linearizer_dumb.py embedding doesn't cast (#5952) 2024-08-06 17:49:14 -07:00
test_linearizer_failures.py test lin fail 47 for UOP_IS_SYMBOLIC (#5853) 2024-07-31 23:09:22 -04:00
test_linearizer_overflows.py lowerer is kernel [run_process_replay] (#5437) 2024-07-12 18:50:55 -07:00
test_masked_st.py multitensor start (#2676) 2023-12-07 17:07:05 -08:00
test_method_cache.py simple LoadOps.ASSIGN (#3745) 2024-03-14 20:44:34 -07:00
test_multitensor.py MLB support reshape for uneven shards (#5804) 2024-08-01 02:36:03 -07:00
test_net_speed.py nv mockgpu (#4600) 2024-05-15 23:46:08 +03:00
test_nn.py MetaOps.KERNEL (#5543) 2024-07-17 19:41:23 -07:00
test_ocl.py hotfix: don't run OOM test in CI 2024-08-07 22:19:29 -07:00
test_ops.py remove CUDACPU flag in tests [run_process_replay] (#5902) 2024-08-04 16:06:38 -04:00
test_optim.py improve test_dropout_on_shard (#4912) 2024-06-11 11:36:02 -04:00
test_pattern_matcher.py multiple locals + get_kernel_modifier + fix valid (#5739) 2024-07-26 15:10:10 -07:00
test_pickle.py some TestPickleJIT tests (#5860) 2024-08-01 12:39:59 -07:00
test_profiler.py fix non-jitted transfers in profile (#5980) 2024-08-08 17:58:08 +03:00
test_randomness.py jit sampling functionn in test_randomness.test_multinomial (#5034) 2024-06-18 14:21:05 -04:00
test_renderer_failures.py test cstyle compile error for max with inline const (#5838) 2024-08-05 19:02:16 +03:00
test_sample.py enable test_sample for all backend (#2593) 2023-12-03 17:20:27 -05:00
test_schedule.py hotfix: contiguous on precompute_freqs_cis 2024-08-07 14:40:56 -07:00
test_search.py BEAM bugfix, kernels dedup now (#5617) 2024-07-20 19:43:50 -07:00
test_setitem.py setitem in-place operator tests (#4577) 2024-05-14 01:28:02 -04:00
test_specific_conv.py nv mockgpu (#4600) 2024-05-15 23:46:08 +03:00
test_speed_v_torch.py remove CUDACPU flag in tests [run_process_replay] (#5902) 2024-08-04 16:06:38 -04:00
test_subbuffer.py remove CUDACPU flag in tests [run_process_replay] (#5902) 2024-08-04 16:06:38 -04:00
test_symbolic_jit.py sort vars in jit when building expected input args (#4990) 2024-06-16 15:55:51 -04:00
test_symbolic_ops.py symbolic Tensor.var (#4843) 2024-06-05 12:55:54 -04:00
test_symbolic_shapetracker.py support symbolic reshape with non-contiguous (#4844) 2024-06-05 16:01:19 -04:00
test_tensor.py hotfix: adjust test_backward_pass_diamond_model thresholds (#5981) 2024-08-09 00:20:53 +08:00
test_tensor_data.py BEAM_COMPARE=2 validates the correctness of BEAM kernels (#5458) 2024-07-13 13:53:43 -07:00
test_tensor_variable.py Should this symbolic test fail? (#4501) 2024-06-18 15:21:26 -04:00
test_to_numpy.py Apply ruff linting rules to tests (#2473) 2023-11-27 21:24:06 -08:00
test_transcendental.py lower test_transcendental fuzz test threshold for sin float64 (#5956) 2024-08-07 02:04:37 -04:00
test_uop_graph.py Make vectorization of CONST explicit (#5322) 2024-08-08 20:59:05 +03:00
test_uops.py use CONTRACT before REDUCE (#5903) 2024-08-04 16:17:33 -07:00
test_uops_stats.py UOpGraph not in renderer or Program [run_process_replay] (#5867) 2024-08-01 16:20:30 -07:00
test_verify_lazyop.py pretty print lazy op per default (#5505) 2024-07-18 09:34:08 -07:00
test_winograd.py MetaOps.KERNEL (#5543) 2024-07-17 19:41:23 -07:00
test_zero_copy.py remove numpy from device (#3123) 2024-01-14 19:36:05 -08:00