Mirror of https://github.com/tinygrad/tinygrad.git, synced 2026-06-24 02:14:17 +00:00

* add llama attention test for multigpu
* test fails
* kv cache trying to shrink on sharded axis
* mask None works for scale dot product
* kv cache seems to be working but scale dot product breaks
* scaled dot product works, but the last linear layer failed
* running into the reshape case where it could be wrong for multigpu
* making sure it was the reshape
* adding contiguous doesn't solve
* need to shard more properly
* remove reshape test
* minor adjustment to scale dot product attention test
* weights are sharded wrong
* continue fix new weight sharding
* clean up
* fix attention when start_pos is 0
* remove print
* add TODOs for the best multigpu interface

| File | | |
|---|---|---|
| bert.py | ||
| convnext.py | ||
| efficientnet.py | ||
| llama.py | ||
| mask_rcnn.py | ||
| resnet.py | ||
| retinanet.py | ||
| rnnt.py | ||
| transformer.py | ||
| unet3d.py | ||
| vit.py | ||