tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

chenyu ff05bff221 put bert data shard inside jit (#9160): python time 45ms -> 9ms; it was spending time scheduling the shard. Also init bert data on CLANG since it's from numpy, so we don't create the tensor on the default device and then shard it across GPUs.	2025-02-18 10:36:54 -05:00
scripts	UNet3D MLPerf (#3470)	2024-09-10 04:37:28 -04:00
training_submission_v4.0/tinycorp	copy mlperf 4.0 to mlperf 4.1 (#5614)	2024-07-20 16:12:00 -04:00
training_submission_v4.1/tinycorp	update mlperf systems and copy 4.1 to 5.0 (#7004)	2024-10-11 16:20:34 -04:00
training_submission_v5.0/tinycorp	free_intermediates in bert (#9040)	2025-02-12 10:00:39 -05:00
dataloader.py	put bert data shard inside jit (#9160)	2025-02-18 10:36:54 -05:00
helpers.py	put bert data shard inside jit (#9160)	2025-02-18 10:36:54 -05:00
initializers.py	Tuple -> tuple, List -> list [pr] (#8936)	2025-02-06 14:21:19 -05:00
losses.py	[MLPerf][UNet3D] Add DICE loss + metrics (#4204)	2024-04-17 20:09:33 -04:00
lr_schedulers.py	fp16 resnet (without expand backwards sum in float, doesn't work) (#3816)	2024-03-28 01:25:37 -04:00
metrics.py	[MLPerf][UNet3D] Add DICE loss + metrics (#4204)	2024-04-17 20:09:33 -04:00
model_eval.py	[MLPerf] Prepare openimages dataset script (#6747)	2024-09-27 11:13:56 -04:00
model_spec.py	move globalcounters to ops (#2960)	2024-01-01 14:21:02 -08:00
model_train.py	put bert data shard inside jit (#9160)	2025-02-18 10:36:54 -05:00
README	start on mlperf models	2023-05-10 16:30:49 -07:00

README

Each model should be a clean single file.
The models are imported from the top-level `models` directory.

Each model should be capable of loading weights from the reference implementation.

We will focus on these 5 models:

# Resnet50-v1.5 (classic) -- 8.2 GOPS/input
# Retinanet
# 3D UNET (upconvs)
# RNNT
# BERT-large (transformer)
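
The 8.2 GOPS/input figure listed for Resnet50-v1.5 can be turned into a rough throughput ceiling. The sketch below is only a back-of-envelope calculation: it assumes the figure counts the operations of one forward pass, and the sustained device rates in the loop are hypothetical placeholders, not measurements of any real hardware.

```python
# Back-of-envelope throughput bound from a per-input operation count.
# Assumption: 8.2 GOPS/input is the cost of one forward pass, and the
# device sustains the given rate on every operation (it never does in practice).
RESNET50_GOPS_PER_INPUT = 8.2  # figure quoted in the model list above


def max_inputs_per_second(sustained_tops: float,
                          gops_per_input: float = RESNET50_GOPS_PER_INPUT) -> float:
    """Return the upper bound on inputs/s for a device sustaining `sustained_tops` tera-ops/s."""
    return (sustained_tops * 1e12) / (gops_per_input * 1e9)


if __name__ == "__main__":
    # Hypothetical sustained rates, chosen only to illustrate the scaling.
    for tops in (10.0, 50.0, 100.0):
        bound = max_inputs_per_second(tops)
        print(f"{tops:6.1f} TOPS sustained -> at most {bound:8.0f} inputs/s")
```

Real throughput will land below this bound, since memory bandwidth, data loading, and kernel launch overhead are all ignored here.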

These five models are used in both the training and inference benchmarks:
https://mlcommons.org/en/training-normal-21/
https://mlcommons.org/en/inference-edge-30/
And we will submit to both.

NOTE: we submit in the Edge category since we don't have ECC RAM.