tinygrad/extra/models
Daniel Xu 4edaaf19e5
Handle tied embeddings for llama 3.2 1B (#13796)
Previously the output.weight tensor was not loaded and kept its
randomly initialized values, so a forward pass produced junk output.

Signed-off-by: Daniel Xu <daniel@thinkingmachines.ai>
2025-12-22 16:31:40 -05:00
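
The fix described in the commit above can be illustrated with a minimal sketch. This is not the code from llama.py; it only shows the general idea of weight tying, where a checkpoint that ships no separate output projection reuses the token embedding matrix for it. The helper name `tie_output_weight` and the key `tok_embeddings.weight` are assumptions made for illustration; only `output.weight` is named in the commit message. Plain dicts and lists stand in for a real state dict and tensors.

```python
from typing import Any, Dict


def tie_output_weight(
    state_dict: Dict[str, Any],
    embed_key: str = "tok_embeddings.weight",  # assumed key name, not confirmed by the commit
    output_key: str = "output.weight",         # key named in the commit message
) -> Dict[str, Any]:
    """Return a shallow copy of state_dict in which output_key falls back to
    the embedding weights when the checkpoint does not provide it."""
    tied = dict(state_dict)  # shallow copy, the caller's dict is left untouched
    if output_key not in tied and embed_key in tied:
        # Share the same object so both layers see identical weights,
        # instead of leaving the output layer at its random initialization.
        tied[output_key] = tied[embed_key]
    return tied


if __name__ == "__main__":
    # Hypothetical checkpoint with tied embeddings: no "output.weight" entry.
    checkpoint = {"tok_embeddings.weight": [[0.1, 0.2], [0.3, 0.4]]}
    loaded = tie_output_weight(checkpoint)
    assert loaded["output.weight"] is loaded["tok_embeddings.weight"]
```

A checkpoint that does carry its own `output.weight` passes through unchanged, so the same loading path can serve both tied and untied models.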
bert.py add contiguous in BertIntermediate (#13713) 2025-12-15 22:37:36 -05:00
clip.py Clip model updates for Stable Diffusion mlperf training (#12313) 2025-09-29 21:50:14 -04:00
convnext.py remove Tensor.no_grad, it's meaningless now [pr] (#10556) 2025-05-28 22:20:02 -07:00
efficientnet.py remove the magic methods for moving between devices [pr] (#6881) 2024-10-04 20:27:52 +08:00
inception.py don't hardcode weights path (#12171) 2025-09-15 00:33:47 -04:00
llama.py Handle tied embeddings for llama 3.2 1B (#13796) 2025-12-22 16:31:40 -05:00
mask_rcnn.py move BoxCoder to mlperf helpers (#9773) 2025-04-07 20:27:06 -04:00
resnet.py Fix FC layer ResNet load_from_pretrained error (#8387) 2024-12-26 18:11:27 -05:00
retinanet.py RetinaNet INITMLPERF support (#9950) 2025-04-21 10:32:05 -04:00
rnnt.py change Tensor.stack to method (#4719) 2024-05-24 17:04:19 -04:00
t5.py Flux.1 (#6334) 2024-09-24 10:08:04 +08:00
transformer.py _one_hot_along_dim input needs to be int (#9179) 2025-02-20 09:00:43 -05:00
unet.py Stable Diffusion model init for mlperf (#12314) 2025-10-02 02:28:41 -04:00
unet3d.py move to new cached fetch (#2493) 2023-11-28 17:36:55 -08:00
vit.py move to new cached fetch (#2493) 2023-11-28 17:36:55 -08:00