tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

nimlgen 81a4a9623c add qcom dsp runtime (#6112)	2024-09-13 21:01:33 +03:00

* calling qualcomm dsp from python
* include so files
* add include file
* adsprpc.py
* running with adsprpc
* work
* 32-bit support in elf
* compilation works
* ion
* msm_ion
* working DSP backend
* getting 500 MFLOPS on matmul
* beam works with timing
* move to autogen
* disasm
* progress
* simple tests pass
* qcom_dsp
* more dsp autogen
* progress
* some progress
* works w/o lib
* checkpoint
* no lib
* ugh, better
* cleaner, but with lib. test good, but with the hack
* remove autogens
* small
* push
* simpler
* revert this
* run_3
* simpler
* android
* handle
* run it
* why?
* run2
* to gen
* cc
* cleaner
* elf
* part of autogen
* comemnt
* no lib
* autohen
* linter
* bug reproducer
* cleaner
* this repro is almost empty and doesn't work!!!!
* with this test_ops passes, no crashes anymore
* cleaner
* linter
* renames
* shorter
* remoev contextlib
* ugh
* myoy
* cleaner
* cleaner
* remove import
* conn
* import
* revert this
* remove heavy .so
* shorter alloc
* not tue anymore

---------

Co-authored-by: Comma Device <device@comma.ai>
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <george@comma.ai>
.gitignore	updates from the chonker branch	2022-11-07 21:12:08 -08:00
amx.py	update amx gemm (#3991)	2024-03-29 11:45:03 -04:00
cuda_matmul.py	fix 'Import Error: cannot import name compile_cuda from tinygrad.runtime.ops_cuda' error in extra/gemm/cuda_matmul.py (#3531)	2024-02-28 17:15:32 -08:00
fuzz_matmul.py	wmma: widen TC usage in search by using PADTO on TC axes when possible (#4216)	2024-04-22 16:50:31 -04:00
gemm.c	only 62 gflops (#2629)	2023-12-05 13:28:24 -08:00
gemm.py	only 62 gflops (#2629)	2023-12-05 13:28:24 -08:00
hip_matmul.py	retire hsa (#4885)	2024-06-09 11:33:03 +03:00
intel_xmx.py	Intel XMX Tensor Core Support (#5622)	2024-08-16 09:19:21 -07:00
jax_pmatmul.py	jax parallel matmul example	2023-11-28 13:48:11 -08:00
metal_conv.py	create engine folder and move code (#3948)	2024-03-26 20:38:03 -07:00
metal_matmul.py	create engine folder and move code (#3948)	2024-03-26 20:38:03 -07:00
metal_matvec.py	move GlobalCounter to helpers (#4002)	2024-03-30 00:30:30 -04:00
mlx_matmul.py	mlx benchmark, a lil slower than tg	2023-12-05 19:00:43 -08:00
real_pmatmul.py	pmatmul example + GB/s bugfix [run_process_replay] (#5974)	2024-08-07 22:32:11 -07:00
simple_conv.py	wmma: refactor to remove wmma_func and create TC funcs as needed (#3945)	2024-03-27 16:43:09 -04:00
simple_matmul.py	add qcom dsp runtime (#6112)	2024-09-13 21:01:33 +03:00
simple_matvec.py	extra/gemm/simple_matvec: add simple_matvec.py (#4021)	2024-03-31 16:38:52 -04:00
tf_gemm.py	Add tensorflow GEMM benchmark script (#1000)	2023-06-18 10:57:45 -07:00
tinygrad_nv_matmul.py	work to make GEMV fast (#5824)	2024-07-30 17:41:40 -07:00
torch_gemm.py	faster RDNA assembly backend (#990)	2023-06-16 12:06:38 -07:00
triton_nv_matmul.py	extra/gemm/triton_nv_matmul: fix Program arguments (#6212)	2024-08-20 14:05:38 -07:00
tvm_gemm.py	lowerer is kernel [run_process_replay] (#5437)	2024-07-12 18:50:55 -07:00