tinygrad/extra
geohotstan 8c0d0a122c
Add return_indices to max_pool (#9506)
* wow argmax is so good

* 1 less line

* clean up and better variable names

* is this torch thing right...?

* add more tests

* slap a TODO on it

* clean ups

* prettier looking code and fix ceil mode test

* add return types and some docs

* ok that was a bad example since indices == value, just no example
2025-03-19 15:25:37 -04:00
accel move things, clean up extra (#2292) 2023-11-13 20:18:40 -08:00
amdpci am: rename soc21 to soc (#9482) 2025-03-18 08:54:26 +08:00
assembly s/UOps/Ops (#7500) 2024-11-03 11:26:10 +08:00
backends CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
datasets do not construct unmasked VALID (#8759) 2025-01-28 20:51:21 +02:00
disassemblers/adreno qcom fix disasm (#6703) 2024-09-24 15:23:43 +08:00
dsp dsp simulator (#8869) 2025-02-04 09:45:04 +08:00
gemm extra/gemm/max_matmul: start of custom kernels for GEMM (#6926) 2025-03-19 15:04:57 +08:00
hip_gpu_driver amd: autogen ip bases (#9360) 2025-03-05 22:30:38 +03:00
hiprtc use comgr to compile (#3248) 2024-01-26 18:27:49 -08:00
huggingface_onnx benchmark huggingface onnx models (#8493) 2025-03-12 20:13:12 -04:00
junk coder.py can write and run code (#2439) 2023-11-25 12:27:54 -08:00
models olmoe (from stream, wip) (#9390) 2025-03-10 13:46:33 +08:00
nv_gpu_driver nv fix shared_memory_size (#7239) 2024-10-23 21:59:47 +03:00
optimization fix import time_linearizer [pr] (#9118) 2025-02-15 21:33:28 -05:00
qcom_gpu_driver qcom match texture/sampler descriptors to OpenCL (#7622) 2024-11-11 21:56:51 +03:00
resnet18 beat mlx at resnet 18 (#6611) 2024-09-20 11:28:01 +08:00
sqtt SQTT profiling (#9278) 2025-03-11 13:19:56 +08:00
torch_backend Add return_indices to max_pool (#9506) 2025-03-19 15:25:37 -04:00
torch_hook torch_hook fixes (#9334) 2025-03-03 23:07:30 +03:00
webgpu Autogen webgpu dawn, removing wgpu-py dependency (f16 support part 1) (#8646) 2025-02-07 15:16:59 +08:00
archprobe.py move dtypes to dtype.py (#2964) 2024-01-01 14:58:48 -08:00
augment.py [ready] Replacing os with pathlib (#1708) 2023-08-30 10:41:08 -07:00
disk_read_speed.py io_uring for copies from disk (#5035) 2024-06-21 11:36:51 +03:00
dump_cache.py wow how did i think that was okay (#2339) 2023-11-16 21:21:11 -08:00
export_model.py tinychat in browser, Part 2: model export (#9274) 2025-03-04 15:53:30 +08:00
f16_decompress.py u32 to f16 in tinygrad (#8074) 2024-12-06 12:00:13 +01:00
gradcheck.py tests from grad uop path [pr] (#8313) 2024-12-18 09:25:05 -08:00
hip_events.py move autogen to runtime/autogen (#3254) 2024-01-26 12:44:19 -08:00
hip_large_kernel.py minimum change for rdna4 [pr] (#9455) 2025-03-16 13:39:24 +08:00
hook_cuda.py cuda hooking (#9180) 2025-02-20 19:20:01 +08:00
introspection.py rename LazyBuffer -> UOp [pr] (#8169) 2024-12-11 16:15:52 -08:00
lr_scheduler.py use at least float32 for optim.lr (#4297) 2024-04-25 14:42:28 -04:00
mcts_search.py [TIP-9] rename Opt's amt to arg 2 (#8770) 2025-01-27 14:19:04 -05:00
multitensor.py multitensor start (#2676) 2023-12-07 17:07:05 -08:00
onnx.py Add return_indices to max_pool (#9506) 2025-03-19 15:25:37 -04:00
onnx_helpers.py benchmark huggingface onnx models (#8493) 2025-03-12 20:13:12 -04:00
reduce_speed.py VALIDATE_WITH_CPU [pr] (#9488) 2025-03-18 15:15:04 +08:00
replay_pkl.py dsp work try 3 (#9475) 2025-03-17 16:42:12 +08:00
ring_copy.py ring copy example (#3185) 2024-01-19 23:34:30 -05:00
setup_mock_amd_osx.sh add script to install amd mockgpu on macOS (#8536) 2025-01-09 01:29:25 +03:00
setup_mock_nv_osx.sh hotfix: setup_mock_nv_osx 2025-02-13 12:26:15 +08:00
thneed.py new style device (#2530) 2023-11-30 17:07:16 -08:00
threefry.py feat: make buffer (#6745) 2024-09-25 18:31:03 +08:00
to_movement_ops.py full fix for as_strided in torch backend (#9257) 2025-02-26 22:34:05 +08:00
training.py tinytqdm.set_description and tinytrange (#5101) 2024-06-22 14:45:06 -04:00
transfer_speed.py hotfix: copy size is in bytes 2024-01-17 16:44:15 +00:00