tinygrad/extra
wozeparrot a0ab755317
threefry again (#3785)
* feat: initial xor

* feat: initial threefly

* feat: remove custom random

* fix: really need to install precommit

* feat: lmao forgot that this is rotate not a shift

* clean: put that there

* feat: numpy xor

* feat: quick test for xor

* feat: llvm xor

* feat: slightly working xor in torch

* feat: rand works in jit

* clean: save a line

* feat: match jax

* feat: maybe test against jax

* feat: requires_grad

* fix: fix test_symbolic_ops

* feat: lower alpha

* feat: just pad

* fix: maybe fix training tests?

* fix: fix some llvm stuff

* feat: cursed realize on the way out

* feat: testing jax

* fix: why is the jax install process not simple

* fix: maybe passing test

* fix: symbolic workarounds

* clean: still need that precommit

* fix: aaaa

* fix: more test fixes

* fix: quick fix for wgsl

* feat: need to set requires_grad on the final tensor

* feat: one more tensor

* feat: don't take forever

* feat: seeing y ci is brok

* feat: can't allocate 64GiB lmao

* fix: fix this

* feat: hope this doesn't break smth before i go to bed

* feat: don't destroy ram

* feat: int

* feat: remove jax

* feat: properish workaround?

* feat: skip slow webgpu tests

* feat: no longer fails

* feat: use dtypes

* feat: real number

* fix: torch

* fix: don't test against reference for torch

* feat: to device

* feat: fix advanced indexing

* feat: correct casting

* feat: even rng_counter

* feat: match master

* feat: this was actually bad

* fix: maybe?

* feat: store

* feat: remove realizes

* feat: somehow this is important

* feat: somehow this is also important

* feat: save a line

* fix: don't need that anymore

* feat: restore this

* fix: linter

* feat: remove realizes

* fix: realized is in base now

* fix: add back cast

* fix: bump deadline

* fix: bump deadline

* fix: bump deadline

* fix: bump deadline

* fix: bump deadline

* fix: :(

* fix: :(

* fix: not being dumb

* feat: try changing less tests

* feat: shouldn't have to change that

* feat: contiguous bumps it by one

* fix: hmm

* fix: numpy memory moment

* fix: cl_khr_fp16

* fix: torch has different tensor count

* fix: missing contiguous

* hmm: hmm

* fix: some fixes

* fix: typing

* feat: dont do that

* feat: typing fixes

* feat: why is this realize required?

* feat: ngl kinda odd typing

* feat: oh

* feat: remove realizes

* feat: why is this realize required?

* fix: hacky patch for cudacpu

* fix: without this realize pytest crashes?????

* fix: shorter line

* fix: cudacpu fixes

* fix: cudacpu fixes

* feat: real buffer

* feat: don't search when searching lmao

* fix: can't use contiguous things

* fix: no more 100GB arrays

* fix: revert

* fix: skip 7 and 10

* feat: working ish beam

* feat: minimize changes

* feat: seed 0 stable diffusion example changed

* fix: different on ci

* fix: no beam

* feat: make threefry optional

* fix: check value

* fix: unused import

* feat: threefry default

* fix: 5d

* feat: allow non upcast div

* fix: 5d better

* fix: 5d better

* fix: save all dtype

* feat: proper error

* feat: lazyop key

* fix: check float

* feat: try removing this realize now

* feat: disable threefry for uops hip tensor cores

* feat: don't need that

* feat: only check upcast

* fix: disable threefry for some metal tests

* feat: disable for metal tensor uops as well

* feat: disable for most uops

* fix: disable threefry for new uops tests

* feat: multitensor

* fix: typing

* feat: threefry default off

* feat: skip threefry half rand

* feat: restore old

* fix: bad git

* clean: ruff

* feat: bfloat16 fix

* fix: :|

* feat: restore old

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-03-18 16:47:07 -04:00
accel               move things, clean up extra (#2292)  2023-11-13 20:18:40 -08:00
assembly            move dtypes to dtype.py (#2964)  2024-01-01 14:58:48 -08:00
backends            remove hip backend (#3783)  2024-03-17 10:12:16 -07:00
datasets            MLPerf Resnet (cleaned up) (#3573)  2024-03-14 00:53:41 -04:00
dist                move graph.py and jit.py into features (#3376)  2024-02-12 17:34:34 +01:00
gemm                fix 'Import Error: cannot import name compile_cuda from tinygrad.runtime.ops_cuda' error in extra/gemm/cuda_matmul.py (#3531)  2024-02-28 17:15:32 -08:00
hip_gpu_driver      disk_read_speed example  2024-01-04 13:59:43 -08:00
hiprtc              use comgr to compile (#3248)  2024-01-26 18:27:49 -08:00
junk                coder.py can write and run code (#2439)  2023-11-25 12:27:54 -08:00
models              apply the same fix_bf16 in llama and coder (#3789)  2024-03-17 21:25:24 -04:00
optimization        threefry again (#3785)  2024-03-18 16:47:07 -04:00
qcom_gpu_driver     start Qualcomm GPU driver (#2804)  2023-12-16 23:10:50 -08:00
archprobe.py        move dtypes to dtype.py (#2964)  2024-01-01 14:58:48 -08:00
augment.py          [ready] Replacing os with pathlib (#1708)  2023-08-30 10:41:08 -07:00
autopad.py          move create schedule and delete old API (#3377)  2024-02-12 18:10:45 +01:00
disk_read_speed.py  fast hip read (#3014)  2024-01-05 10:33:13 -08:00
dump_cache.py       wow how did i think that was okay (#2339)  2023-11-16 21:21:11 -08:00
export_model.py     move graph.py and jit.py into features (#3376)  2024-02-12 17:34:34 +01:00
gradcheck.py        Fix: Jacobian tests [WIP] (#1126)  2023-07-05 15:36:22 -07:00
hip_events.py       move autogen to runtime/autogen (#3254)  2024-01-26 12:44:19 -08:00
introspection.py    move globalcounters to ops (#2960)  2024-01-01 14:21:02 -08:00
lr_scheduler.py     make LR scheduler work with multigpu (#3011)  2024-01-04 12:10:56 -08:00
multitensor.py      multitensor start (#2676)  2023-12-07 17:07:05 -08:00
onnx.py             Fix: Always cast ONNX Slice op arguments into ints (#3317)  2024-02-04 18:40:48 -05:00
onnx_ops.py         simple LoadOps.ASSIGN (#3745)  2024-03-14 20:44:34 -07:00
ring_copy.py        ring copy example (#3185)  2024-01-19 23:34:30 -05:00
thneed.py           new style device (#2530)  2023-11-30 17:07:16 -08:00
to_movement_ops.py  Revert "track size in shapetracker" (#3043)  2024-01-08 13:13:39 -08:00
training.py         move graph.py and jit.py into features (#3376)  2024-02-12 17:34:34 +01:00
transfer_speed.py   hotfix: copy size is in bytes  2024-01-17 16:44:15 +00:00