Mirror of https://github.com/tinygrad/tinygrad.git, synced 2026-06-24 02:14:17 +00:00
* add llama attention test for multigpu
* test fails
* kv cache trying to shrink on sharded axis
* mask None works for scale dot product
* kv cache seems to be working but scale dot product breaks
* scaled dot product works, but the last linear layer failed
* running into the reshape case where it could be wrong for multigpu
* making sure it was the reshape
* adding contiguous doesn't solve
* need to shard more properly
* remove reshape test
* minor adjustment to scale dot product attention test
* weights are sharded wrong
* continue fix new weight sharding
* clean up
* fix attention when start_pos is 0
* remove print
* add TODOs for the best mutigpu interface

| Name | | |
|---|---|---|
| accel | ||
| assembly | ||
| backends | ||
| datasets | ||
| dist | ||
| gemm | ||
| hip_gpu_driver | ||
| junk | ||
| models | ||
| optimization | ||
| qcom_gpu_driver | ||
| archprobe.py | ||
| augment.py | ||
| autopad.py | ||
| disk_read_speed.py | ||
| dump_cache.py | ||
| export_model.py | ||
| gradcheck.py | ||
| introspection.py | ||
| lr_scheduler.py | ||
| multitensor.py | ||
| onnx.py | ||
| onnx_ops.py | ||
| thneed.py | ||
| to_movement_ops.py | ||
| training.py | ||