Commit graph

4,667 commits

Author SHA1 Message Date
chenyu
1ac958a058
update pytest marks and CI test filters (#2587)
* remove pytest marks

* test more stuff

* fine revert some

* add that mark back

* skip that

* hmm LLVM does not work on ubuntu

* too slow on CUDA CI

* dup test
2023-12-03 15:20:44 -05:00
qazal
ab2d4d8d29
Fix cl import in the copy_speed test and cifar example (#2586)
* fix CL import

* update test to only run on GPU

* update hlb_cifar too
2023-12-03 09:22:07 -08:00
chenyu
3226b3d96b
enable the jit random test (#2580) 2023-12-02 20:25:23 -05:00
chenyu
09c9794f3f
clean external_test_opt.py (#2578) 2023-12-02 19:51:08 -05:00
George Hotz
171543fc8d
cleanups to save lines and files (#2577)
* runtime/graph -> features/graph

* put all the cstyle renderers in cstyle

* same line for those

* how did that pass mypy
2023-12-02 16:29:56 -08:00
George Hotz
d6b404ac11
No dtype alloc (#2570)
* fix all allocs

* improve docs

* ugh fix fake alloc
2023-12-02 13:29:40 -08:00
chenyu
c8774713c5
lazy cleanup (#2567) 2023-12-02 13:21:43 -05:00
George Hotz
5068e99d18
refactor to remove extra kernel params (#2563)
* refactor to have compiled kernel

* bugfixes

* docs/beautiful.py

* revert that

* fix tests
2023-12-02 00:32:25 -08:00
George Hotz
27481b9206
Switch ops_gpu -> gpuctypes (#2532)
* ops_gpu is go

* fix size 0

* fix image, and add more tests

* nerf openpilot test, doesn't test thneed

* run the schedule

* better

* oops, new inputs

* delete pyopencl

* Update ops_gpu.py
2023-12-01 22:30:21 -08:00
George Hotz
6733425095
lower schedule (#2559)
* lower schedule

* remove RAND, and don't put load in the JIT yet

* better fix for that test
2023-12-01 19:17:46 -08:00
Christopher Mauri Milan
077567f62d
Remove as_buffer for TORCH (#2554)
* remove as_buffer for torch

* enable torch zerocopy if on cpu

* remove as_buffer even on torch:cpu
2023-12-01 18:51:38 -08:00
chenyu
86fbd413f3
update test_real_world configs (#2557) 2023-12-01 20:03:52 -05:00
andresgit
00523d5656
New fix accessing elements created by padding (#2529)
* pad slice test cases, many failing

* fix failing test cases

check mask if we are outside the base buffer
also create a multi-view if in that case we reshape to an empty shape

* real_offset calculation more readable

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-01 19:08:10 -05:00
chenyu
67f4e03724
rewrite 0 size loadop into a CONST (#2556)
* rewrite 0 size loadop into a CONST

* check alloc size

* EMPTY is better

* Revert "EMPTY is better"

This reverts commit 574fe0f9ed28f1b97da5a81afdfd2cd5d9a94ff9.

* no ast is created

* fix test
2023-12-01 18:29:06 -05:00
George Hotz
4447188051
gate METAL_FAST_LOAD 2023-12-01 15:28:40 -08:00
chenyu
e9426f4fe4
simpler get_contraction (#2552)
* simpler get_contraction

* and test
2023-12-01 18:02:52 -05:00
George Hotz
f5de21e753
fast path for copy (#2548)
* fast copy

* ruff first

* flat_mv on malloc

* order + webgpu test
2023-12-01 11:34:47 -08:00
George Hotz
12fa846122
zero copy (#2531)
* zero copy

* zero copy test

* loads coder in milliseconds

* zero copy for cpu and torch

* src_from_buffer is None

* SLOW_METAL_COPY there
2023-11-30 18:38:41 -08:00
George Hotz
2c363b5f0b
new style device (#2530)
* cpu tests pass

* torch works

* works

* metal works

* fix ops_disk

* metal jit works

* fix openpilot

* llvm and clang work

* fix webgpu

* docs are rly broken

* LRU works on metal

* delete comment

* revert name to ._buf. LRU only on Compiled

* changes

* allocator

* allocator, getting closer

* lru alloc

* LRUAllocator

* all pass

* metal

* cuda

* test examples

* linearizer

* test fixes

* fix custom + clean realize

* fix hip

* skip tests

* fix tests

* fix size=0

* fix MOCKHIP

* fix thneed

* copy better

* simple

* old style metal copy

* fix thneed

* np reshape

* give cuda a device
2023-11-30 17:07:16 -08:00
chenyu
7d26452305
call ruff with --preview (#2522)
some checks are ignored without --preview
2023-11-30 13:59:00 -05:00
chenyu
5db0cdfbd3
support list of ints (or other Tensorable) in tensor indices (#2520)
* support list of ints (or other Tensorable) in tensor indices

* enable some index test cases
2023-11-30 12:46:33 -05:00
chenyu
bd941a0df1
first version of test_indexing (#2515)
* first version of test_indexing

* move to test/imported
2023-11-30 00:03:59 -05:00
qazal
370cfbb957
Cleanup vectorized hip renders (#2497)
* add typedefs and make_dtypen functions

use ext_vector_type for half16 kernels

* remove the old test_render because we just use whatever cstyle has

* align vectors
2023-11-29 14:02:12 -08:00
George Hotz
065aff747e
make webgpu test reliable (#2502)
* remove retry that doesn't work

* fix cleanup

* process exit in cleanup

* add space
2023-11-29 10:02:24 -08:00
George Hotz
6707f2588e
use copyin (#2500)
* it's always copyin

* all RawBuffer are RawBufferCopyIn

* cleanups

* this fixes it

* requirements='C'

* more correct
2023-11-29 09:34:00 -08:00
chenyu
3eb3c74675
metal ci tests everything (#2499)
* metal ci tests everything

* pretty good

* METAL
2023-11-29 12:04:37 -05:00
George Hotz
889acefe85
Support weird loads in Image (#2498)
* image support weird loads

* umm, that was always wrong

* openpilot compile fails with a weird error

* image test passes

* we have valids now

* clean that up

* no more required opts

* add fastvits test, fix bug

* minor cleanups
2023-11-29 08:30:46 -08:00
George Hotz
5629fc368c
Use Buffer.STORE at the end of ASTs (#2494)
* work

* store broken

* interpreteds work

* this passes

* symbolic cpu

* fix tests

* fix opt tests

* images fail

* fix InterpretedFlopCounter

* stupid hack for images
2023-11-28 20:11:37 -08:00
Liam
cf0c9096a9
Removing METAL Skips as CI works (#2488)
* Test metal CI

* remove metal and CI restrictions

* enable dtype tests for metal ci
2023-11-28 19:46:59 -08:00
George Hotz
d87a246439
move to new cached fetch (#2493)
* move to new cached fetch

* extra.utils is over

* loads

* bump download cache

* bump timeout
2023-11-28 17:36:55 -08:00
George Hotz
ab5d14d4ba
MEM -> LOAD (#2492)
* MEM -> LOAD

* keep legacy working
2023-11-28 16:46:37 -08:00
chenyu
847f0a02b1
non-simplifiable mod should result in ModNode (#2490)
* non-simplifiable mod should result in ModNode

* space
2023-11-28 16:52:19 -05:00
mmmkkaaayy
ddb6a33ae5
improve test assertions for jit cache len with graph executor (#2476)
* improve test assertions for jit cache len with graph executor

* delete newline

* unused import

* another unused import
2023-11-27 23:02:45 -08:00
chenyu
28a67106ca
enable symbolic ops tests for hip (#2485) 2023-11-27 22:33:41 -08:00
Christopher Mauri Milan
7f01dd04f0
Apply ruff linting rules to tests (#2473)
* everything except F821

* enable F821 with noqa

* dumb fix

* fix remaining imports and (former) lambdas

* replace _ with noqa to avoid gc
2023-11-27 21:24:06 -08:00
Davi Silva
136dbd8b36
HIP CI that compiles (to RDNA3) but doesn't have to run (#2482)
* hip amd compilation

* gate the test properly

* cleanup unused import

* remove superfluous numpy conversion

* add SpeedyNet tests (f32 [passes] & f16 [fails])

* make CI verbose (error log from hip compiler)

* test the real ops_hip

* Merge branch 'tinygrad:master' into ci/hip-compilation

* fix CI

* cleanup

* really fix CI

* Fix CI Three: the refixening

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-27 21:17:06 -08:00
George Hotz
acbe6d1b53
Revert "HIP compilation on CI targeting RDNA3 (#2459)" (#2481)
This reverts commit d275ff930a.
2023-11-27 20:41:21 -08:00
qtkite
cb507a9389
Remove the toCPU copy (#2445)
* Remove the rawbuffer copy in runtime/lib.py on line 44

* remove buffer view

* added metadata back, oops

* delayed cpu testcase

* whitespace

* whitespace

* buffer behavior as is

* Update test_jit.py
2023-11-27 20:37:13 -08:00
Davi Silva
d275ff930a
HIP compilation on CI targeting RDNA3 (#2459)
* hip amd compilation

* gate the test properly

* cleanup unused import

* remove superfluous numpy conversion

* add SpeedyNet tests (f32 [passes] & f16 [fails])

* make CI verbose (error log from hip compiler)

* test the real ops_hip

* Merge branch 'tinygrad:master' into ci/hip-compilation

* fix CI

* cleanup

* really fix CI
2023-11-27 20:33:11 -08:00
Paul Gustafson
98cd9e8926
Add assertion to prevent nonsense mod values (#2474) 2023-11-27 18:37:44 -08:00
qazal
e267a93124
reset seed on every run (#2468) 2023-11-27 12:55:54 -08:00
George Hotz
9e07824542
move device to device.py (#2466)
* move device to device.py

* pylint test --disable R,C,W,E --enable E0611

* fix tests
2023-11-27 11:34:37 -08:00
qazal
262cd26d28
Simplify openpilot kernel (#2460)
* a conditional with the same results either way is a noop

* add unit test
2023-11-27 10:02:27 -08:00
chenyu
61a80a0675
asserts LtNodes of SumNode with MulNode of Nodes (#2465) 2023-11-27 12:56:59 -05:00
Paul Gustafson
1d89c018fa
Add isinstance check before gcd call in SumNode.__lt__ (#2450)
* Add isinstance check before gcd call

* Delete blank lines

* Fix unit test typo

* Delete blank lines again

---------

Co-authored-by: Paul Gustafson <paul.gustafson@theambrusgroup.com>
2023-11-26 13:05:04 -08:00
George Hotz
8e9cdef61f
clean up the buffers (#2447)
* clean up the buffers

* remove allocate_output

* functools.lru_cache is methodcache

* add TestShapeTrackerSize

* cache_clear

* no 0 sz buffer, add _ on functions that shouldn't be imported

* fix size

* if -> while
2023-11-26 11:02:29 -08:00
chenyu
511310737e
test_linearizer_failures to run on all backends (#2443)
* test_linearizer_failures to run on all backends

* test ubuntu and cuda

* failed only in CUDA CI

* move asserts
2023-11-26 01:17:29 -05:00
George Hotz
9eb2746d62
fix copy issue + add regression test (#2441) 2023-11-25 14:06:08 -08:00
George Hotz
7170a9a057
coder.py can write and run code (#2439)
* wip mistral

* coder

* touchups

* cleanups

* mistral cleanups

* clean up cache create

* download the weights, fix tests

* fix llama loading

* global fixup

* clean up all

* move llama model

* cleanups

* Revert "cleanups"

This reverts commit a71c5d59eb.

* fine, leave it
2023-11-25 12:27:54 -08:00
chenyu
9a5d0e70de
Device.DEFAULT instead of getenv to exclude tests (#2429) 2023-11-24 17:10:24 -05:00