* put the ranges on store instead of after
* better assert
* fix stuff
* comment out slow rules i don't understand
* simpler rule
* closer
* return false for store
* fix loop
* only a few schedule failures remain
* remove stores to self
* all tests pass locally
* remove junk
* regression test and fix
* better test, bump broken torch count
* bugfix with regression test
* new fusion is better
* preallocate all realized buffers
* contiguous
* work
* comment that out
* move to schedule
* better
* correct fix
* just buffer
* disk bufs
* fixes disk tensor stuff
* fix symbolic stuff
* fix multi
* 162 failures
* bugfixes
* don't check that anymore
* fix schedule tests
* mnist should be contiguious
* type and buffer
* fix tests
* shrink axis correction
* mypy fixes
* tests skips
* same 37 failures
* dedup
* no shrink in the graph
* 29 failures
* skips
* fix custom kernel
* fix training
* those optimizations aren't supported currently
* simpler
* more correct
* tests
* 14 failures
* works
* fix that test
* broken
* 11 failures
* only kernel counts left
* fixes
* all tests pass
* remove tensor_map
* op test
* 200 -> 230
* test fixes
* fixes
* revert test_tiny thing
* guard
* revert that
* test tiny passes
* no contigs there
* base realize back
* Revert "no contigs there"
This reverts commit c45bb9fcfd.
* revert that
* chop many assigns
* 12 failures
* fix tests
* tests
* apply after
* pre-commit
* remove old code
* delete that
* fix types
* remove extra contig
* fix dataloader
* torch fix
* disk fix
* update kernel fusion numbres
* runs on amd
* restore kernel count
* add that rule back
* that
* disable that
* wrong
* add the correct rule for that folding
* more tests
* guard c1.arg
* no newlines
* realize those
* split into a different file
* remove detach/contig back
* skip 2
* update that
* update the backend to fix torch deprecation warning
* use param_hook to avoid full backward hook needlessly firing on inputs which do not require gradients
* fix indentation
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* Implement private _linalg_eigh function for tensor eigenvalue decomposition in torch backend
* Add unit test for linalg.eigh function in TestTorchBackend
This test verifies the eigenvalue decomposition of a 2x2 tensor using the linalg.eigh function, ensuring the computed eigenvalues and reconstructed tensor match the expected results.
- Implemented a new function `equal` in the torch backend to compare two tensors for equality.
- Added unit tests for the `equal` function to verify its correctness with different tensor inputs.
* Enhance tensor random functions with dtype support
- Updated `aten.uniform_` and `aten.normal_` to include dtype parameter in backend.py
- Added unit tests for uniform and normal tensor generation with specific dtypes in test.py
* Refactor test name for clarity
- Renamed `test_normal_dtype` to `test_normal` in `extra/torch_backend/test.py`
- Aims to improve readability and better reflect the test's purpose
* bug in div range folding
* simpler
* oh, this is right for indexing, but the div mod folding needs to be fixed
* reenable
* Passing test_complexity_w_unroll2 (#10068)
* Passing
* remove non_folded_divs
* Add check for negative tern in div folding
* Add test
* bump that limit
* fix casted
---------
Co-authored-by: Sieds Lykles <93992551+S-Lykles@users.noreply.github.com>
* add kernelize
* remove that
* kernelize returns self
* update abstractions2.py
* kernelize in test_schedule
* temp: assert BUFFER_VIEW's existence
* ASSIGN must have a buffer or subbuffer target
* assert and shrink
* fix
* padded setitem
* var
* toposort once
* extra
* base_buffer
* end with BUFFER_VIEW
* setitem for disk
* test_setitem_becomes_subbuffer
* mul slice test
* torch backend fix 1
* non-deterministic
* keep subbuffer
* Add amax support to Tensor operations
- Implemented amax function in backend.py for tensor max operations.
- Added unit tests for amax in test.py to ensure correct functionality.
* Fix formatting in amax output function
- Adjusted spacing in the amax output lambda function in backend.py
- Improved code readability for better maintenance