2,793 commits

Author SHA1 Message Date
chenyu
18e159c9ac
comment about multi real and more tests [pr] (#7467) 2024-11-01 11:49:11 -04:00
geohotstan
6513690223
Add Tensor.hardsigmoid (#7433)
* move hardsigmoid to new branch

* add to test

* add NOTE to mention differing values for alpha and beta that match torch

* shift from relu6

* correct shift implementation

* or we just use relu? no more 666
2024-11-01 08:36:52 -04:00
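
For readers unfamiliar with the function added in #7433, here is a minimal sketch in plain Python. It assumes the torch convention hardsigmoid(x) = relu6(x + 3) / 6, which corresponds to alpha = 1/6 and beta = 1/2, and it writes the clamp to [0, 1] using only relu, as the last bullet of the commit hints. The helper names and defaults are illustrative assumptions, not tinygrad's actual code.

```python
def relu(x: float) -> float:
    """Plain rectifier: max(x, 0)."""
    return max(x, 0.0)


def hardsigmoid(x: float, alpha: float = 1 / 6, beta: float = 0.5) -> float:
    """Piecewise-linear sigmoid approximation: clamp(alpha * x + beta, 0, 1).

    The clamp is expressed as relu(y) - relu(y - 1), so no relu6 is needed.
    The defaults follow the torch convention (assumed here).
    """
    y = alpha * x + beta
    return relu(y) - relu(y - 1.0)
```

With these defaults the function is 0 for x <= -3, 1 for x >= 3, and linear in between, passing through 0.5 at x = 0.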
George Hotz
a7ba3d2d91
move reduce to lowerer [pr] (#7462)
* move reduce to lowerer [pr]

* simpler
2024-11-01 16:39:20 +08:00
Tobias Fischer
1a9e145388
Tensor Clone Function (#7154)
* implemented clone function

* cleanup linting, single func

* added tests, cleaned up grad cloning

* fixed whitespace
2024-11-01 12:24:43 +08:00
chenyu
a21434504b
update payne_hanek_reduction [pr] (#7455) 2024-10-31 18:41:22 -04:00
chenyu
4065c3dec8
remove special 0 case in frexp (#7450)
we can safely assume input is non-zero, also removed unneeded bitcast
2024-10-31 13:02:33 -04:00
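
As background for #7450, the sketch below shows what a frexp decomposition returns, using Python's standard `math.frexp` rather than tinygrad's transcendental code. The wrapper name `frexp_nonzero` is hypothetical; it only makes the commit's "input is non-zero" assumption explicit.

```python
import math


def frexp_nonzero(x: float) -> tuple[float, int]:
    """Split a finite, non-zero x into (m, e) with x == m * 2**e and 0.5 <= |m| < 1.

    Zero has no such normalized form, which is why an implementation either
    special-cases it or, as the commit message states, assumes it never occurs.
    """
    if x == 0.0 or not math.isfinite(x):
        raise ValueError("expected a finite, non-zero input")
    return math.frexp(x)


m, e = frexp_nonzero(8.0)  # 8.0 == 0.5 * 2**4
```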
chenyu
53db3478fe
cast to float32 for float16 xlog2 (#7447)
formula has 2X error with denormal floats
2024-10-31 10:36:29 -04:00
George Hotz
5dd1ffd5d0
don't const rewrite in cstyle (#7442)
* don't const rewrite in cstyle

* Update cstyle.py

* simple_symbolic

* fix bfloat16 const on AMD
2024-10-31 19:16:49 +08:00
George Hotz
50ddd11350
lil cleanup matchers [pr] (#7437)
* move delete_redundant_gates [pr]

* simpler uops test

* addr in delete_redundant_gates

* lines

* correct early delete gates

* shorter find_gate
2024-10-31 17:52:22 +08:00
George Hotz
2e3048fc57
Revert "improve full_graph_rewrite matchers for speed (#7431)" (#7434)
This reverts commit 996152d2de.
2024-10-31 16:16:47 +08:00
George Hotz
996152d2de
improve full_graph_rewrite matchers for speed (#7431)
* remove finalize [pr]

* early transcendental

* fix tests

* load store indexing runs with devectorize

* move delete_redundant_gates

* ptx has to wait for the mask to move
2024-10-31 16:13:11 +08:00
George Hotz
17c9a9fde4
pm_render [pr] (#7430)
* pm_render [pr]

* test fixes

* use gep, not src

* ptx only symbolic, not sym

* move cast rules
2024-10-31 15:04:50 +08:00
George Hotz
e446e95974
enforce ctx is called ctx [pr] (#7424)
* enforce ctx is called ctx [pr]

* fix bug and use has_ctx

* inspect signature

* assert

* no slow asserts

* now we can support contextual reduce
2024-10-31 11:39:19 +08:00
chenyu
0739895b4d
tiny clean up pow2if and payne_hanek_reduction (#7423) 2024-10-30 22:22:48 -04:00
chenyu
118dd7721f
clean up transcendental.rintk [pr] (#7422)
added unit tests and updated the comment. it's rounding away from 0 for negatives
2024-10-30 20:37:28 -04:00
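
The rounding behaviour mentioned in #7422 can be illustrated with a small sketch. It assumes the usual "add or subtract 0.5, then truncate toward zero" formulation of round-half-away-from-zero; the function name `rint_half_away` is hypothetical and this is not a copy of `transcendental.rintk`.

```python
def rint_half_away(x: float) -> int:
    """Round to the nearest integer, with ties going away from zero.

    int() truncates toward zero, so the offset must carry the sign of x:
    +0.5 for non-negative inputs, -0.5 for negative ones.
    """
    return int(x + 0.5) if x >= 0 else int(x - 0.5)
```

Under this assumption 2.5 rounds to 3 and -2.5 rounds to -3, whereas Python's built-in `round` uses ties-to-even and gives 2 and -2.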
chenyu
fb694a63eb
Tensor.erf (#7419)
the same one used in onnx and the one in bert.
2024-10-30 18:12:28 -04:00
qazal
e955aa1bee
hotfix: process replay (#7418) 2024-10-30 22:45:40 +02:00
qazal
1a2ee37dd3
hotfix: remove redundant test_schedules [pr] (#7412) 2024-10-31 01:10:31 +08:00
George Hotz
7039fba406
move indexing first (#7409)
* move indexing first [pr]

* no create gate

* fix create_gate

* fix load/store folding

* fix index folding

* remove comment, no process replay
2024-10-31 00:50:35 +08:00
George Hotz
133fe81cc5
Revert "Revert "move up migrate + new gated fold (#7403)" (#7406)" (#7407)
* Revert "Revert "move up migrate + new gated fold (#7403)" (#7406)"

This reverts commit ea5654a9bc.

* test padded in emulation too

* bring back early folding
2024-10-30 23:25:45 +08:00
chenyu
ea5654a9bc
Revert "move up migrate + new gated fold (#7403)" (#7406)
This reverts commit adccfade7f.
2024-10-30 23:02:18 +08:00
George Hotz
adccfade7f
move up migrate + new gated fold (#7403)
* move up migrate + new gated fold [pr]

* vcount for const ptr

* move those rules there

* fix openpilot
2024-10-30 22:14:01 +08:00
chenyu
16e60d25b9
move polyN to helper [pr] (#7405)
also move `eval_uop` to `test.helpers`
2024-10-30 10:09:57 -04:00
George Hotz
f3bd5cbf78
simplest migration of indexing [pr] (#7402)
* simplest migration of indexing [pr]

* fix locals/barrier
2024-10-30 20:58:18 +08:00
George Hotz
ee9ef93617
delete old rules [pr] (#7400) 2024-10-30 19:45:04 +08:00
George Hotz
76a41a1083
don't compare with pointer dtype (#7394)
* don't compare with pointer dtype

* more cleanup

* images are pointers

* handle IMAGE better

* cleaner test_image

* this work

* pr match

* cleanup
2024-10-30 17:48:27 +08:00
George Hotz
4e2895f8d2
safe changes from new dtype branch [pr] (#7397)
* safe changes from new dtype branch [pr]

* only image test on GPU
2024-10-30 17:18:48 +08:00
George Hotz
27995a2a04
vcount + cleanups (#7393)
* Revert "Revert "Restore vcount [pr] (#7390)" (#7392)"

This reverts commit 4ca53db604.

* ugh bugfix [pr]

* uops_to_dtypes function

* fixups

* varnames

* fix mypy

* just 4,8

* tests
2024-10-30 12:50:15 +08:00
George Hotz
4ca53db604
Revert "Restore vcount [pr] (#7390)" (#7392)
This reverts commit 1058f9c9ff.
2024-10-30 11:40:25 +08:00
George Hotz
1058f9c9ff
Restore vcount [pr] (#7390)
* Revert "Revert "add vcount to PtrDtype (#7388)""

This reverts commit 399a5219dd.

* Revert "Revert "add tests to vcount stuff [pr] (#7389)""

This reverts commit cc8d6dbdf3.

* no ptr
2024-10-30 11:27:55 +08:00
George Hotz
cc8d6dbdf3
Revert "add tests to vcount stuff [pr] (#7389)"
This reverts commit 1b7084899b.
2024-10-30 10:56:49 +08:00
George Hotz
1b7084899b
add tests to vcount stuff [pr] (#7389) 2024-10-30 10:54:54 +08:00
chenyu
f389e1a8a0
test more special values for sin/cos/tan [pr] (#7386) 2024-10-29 21:13:37 -04:00
chenyu
6bf38c35e5
clean up transcendental frexp [pr] (#7384)
also added some unit tests for frexp
2024-10-29 18:51:37 -04:00
chenyu
d3c192b056
Device method cleanup [pr] (#7375) 2024-10-29 12:49:47 -04:00
qazal
51c0c8d27e
cachable small graph rewrite (#7371) 2024-10-29 22:28:13 +08:00
George Hotz
2cfc7b6695
Index everywhere 2 (#7363)
* indexing everywhere [pr]

* fix tests
2024-10-29 19:29:40 +08:00
qazal
7149eabb34
assert set equality in TestTensorMetadata [pr] (#7364) 2024-10-29 19:29:29 +08:00
qazal
0ebdb136e8
revert metadata with graph_rewrite (#7353) (#7362)
This reverts commit 540e4179e7.
2024-10-29 19:16:31 +08:00
George Hotz
0af1212164
use assertEqual with new style uops [pr] (#7360) 2024-10-29 18:43:21 +08:00
George Hotz
572499c71a
add indexing to ops_python (#7358)
* add indexing to ops_python

* fix image
2024-10-29 18:11:03 +08:00
qazal
540e4179e7
global UOp to Metadata mapping + inverse DEBUG=2 metadata order [pr] (#7353)
* add ctx.buf_metadata [pr]

* revert metadata insertion order

* lint rename
2024-10-29 17:12:00 +08:00
George Hotz
2fdfcffe4c
improve ci speed [pr] (#7357) 2024-10-29 17:00:35 +08:00
George Hotz
b647fa7514
rename MathTraits to maximum [pr] (#7356) 2024-10-29 16:43:04 +08:00
George Hotz
3989bd2682
idiv + reciprocal [pr] (#7354)
* idiv + reciprocal

* remove upcast from div

* fix docs
2024-10-29 15:54:19 +08:00
George Hotz
d9d4dd6756
faster ci [pr] (#7348) 2024-10-29 14:01:44 +08:00
George Hotz
4cb236a495
index in cstyle (#7328)
* index only in cstyle

* fix prefix dtypes

* fix tests

* global indexing

* Revert "global indexing"

This reverts commit 4d507e8abb.

* fix image

* fix image

* ptx tests

* fix CUDA dtype rendering
2024-10-29 13:06:26 +08:00
George Hotz
4fe1945df6
llvm if load (#7345)
* llvm if load

* unneeded line

* local llvm CI
2024-10-29 11:33:22 +08:00
chenyu
6021bf87f4
unify T = TypeVar("T") (#7342) 2024-10-28 18:43:44 -04:00
chenyu
c398f2467c
test uop mul min/max do not have nan in 0*inf (#7340) 2024-10-28 17:52:01 -04:00