tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00


Yixiang Gao 13e872b53f add mutigpu support for llama attention (#3064)	2024-01-11 16:31:02 -08:00
* add llama attention test for multigpu
* test fails
* kv cache trying to shrink on sharded axis
* mask None works for scale dot product
* kv cache seems to be working but scale dot product breaks
* scaled dot product works, but the last linear layer failed
* running into the reshape case where it could be wrong for multigpu
* making sure it was the reshape
* adding contiguous doesn't solve
* need to shard more properly
* remove reshape test
* minor adjustment to scale dot product attention test
* weights are sharded wrong
* continue fix new weight sharding
* clean up
* fix attention when start_pos is 0
* remove print
* add TODOs for the best mutigpu interface
accel	move things, clean up extra (#2292)	2023-11-13 20:18:40 -08:00
assembly	move dtypes to dtype.py (#2964)	2024-01-01 14:58:48 -08:00
backends	webgl backend in extra (#3041)	2024-01-08 09:29:13 -08:00
datasets	regenerate kernel ast dataset (#2968)	2024-01-01 20:26:17 -05:00
dist	hip & cuda to gpuctypes (#2539)	2023-12-01 09:25:27 -08:00
gemm	remove ACCUM_FP32 in simple_matmul.py (#3045)	2024-01-08 17:37:57 -05:00
hip_gpu_driver	disk_read_speed example	2024-01-04 13:59:43 -08:00
junk	coder.py can write and run code (#2439)	2023-11-25 12:27:54 -08:00
models	add mutigpu support for llama attention (#3064)	2024-01-11 16:31:02 -08:00
optimization	Revert "track size in shapetracker" (#3043)	2024-01-08 13:13:39 -08:00
qcom_gpu_driver	start Qualcomm GPU driver (#2804)	2023-12-16 23:10:50 -08:00
archprobe.py	move dtypes to dtype.py (#2964)	2024-01-01 14:58:48 -08:00
augment.py	[ready] Replacing os with pathlib (#1708)	2023-08-30 10:41:08 -07:00
autopad.py	fix PADTO optimization (#2935)	2023-12-25 22:52:49 -05:00
disk_read_speed.py	fast hip read (#3014)	2024-01-05 10:33:13 -08:00
dump_cache.py	wow how did i think that was okay (#2339)	2023-11-16 21:21:11 -08:00
export_model.py	webgl backend in extra (#3041)	2024-01-08 09:29:13 -08:00
gradcheck.py	Fix: Jacobian tests [WIP] (#1126)	2023-07-05 15:36:22 -07:00
introspection.py	move globalcounters to ops (#2960)	2024-01-01 14:21:02 -08:00
lr_scheduler.py	make LR scheduler work with multigpu (#3011)	2024-01-04 12:10:56 -08:00
multitensor.py	multitensor start (#2676)	2023-12-07 17:07:05 -08:00
onnx.py	removed redundant dtype hacks in onnx_ops (#2939)	2024-01-04 01:45:24 -05:00
onnx_ops.py	touchup onnx xor and not (#3008)	2024-01-04 02:02:42 -05:00
thneed.py	new style device (#2530)	2023-11-30 17:07:16 -08:00
to_movement_ops.py	Revert "track size in shapetracker" (#3043)	2024-01-08 13:13:39 -08:00
training.py	hotfix: examples/transformer.py	2024-01-09 19:28:09 -08:00