mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-06-24 02:14:17 +00:00
| File |
|---|
| dataloader.py |
| helpers.py |
| initializers.py |
| lr_schedulers.py |
| metrics.py |
| model_eval.py |
| model_spec.py |
| model_train.py |
| optimizers.py |
| README |
Each model should be a clean single file.
They are imported from the top level `models` directory.

Each model should be capable of loading weights from the reference implementation.

We will focus on these 5 models:

# Resnet50-v1.5 (classic) -- 8.2 GOPS/input
# Retinanet
# 3D UNET (upconvs)
# RNNT
# BERT-large (transformer)

They are used in both the training and inference benchmarks:
https://mlcommons.org/en/training-normal-21/
https://mlcommons.org/en/inference-edge-30/
And we will submit to both.

NOTE: we submit in the Edge category since we don't have ECC RAM
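
The requirement that a model can load weights from the reference implementation usually comes down to renaming checkpoint keys and checking that shapes line up. Below is a minimal sketch of that idea in plain Python. The names `TinyModel`, `load_reference_weights`, and the key map are hypothetical and chosen only for illustration; they are not tinygrad API, and parameters are plain lists rather than tensors.

```python
from typing import Dict, List


class TinyModel:
    """Hypothetical stand-in for a single-file model; parameters are flat lists."""

    def __init__(self) -> None:
        self.params: Dict[str, List[float]] = {
            "fc.weight": [0.0, 0.0, 0.0, 0.0],
            "fc.bias": [0.0, 0.0],
        }


def load_reference_weights(
    model: TinyModel,
    reference_state: Dict[str, List[float]],
    key_map: Dict[str, str],
) -> List[str]:
    """Copy reference values into the model, renaming keys and checking sizes."""
    loaded: List[str] = []
    for ref_key, value in reference_state.items():
        # Translate the reference name to the local name; keep it if unmapped.
        name = key_map.get(ref_key, ref_key)
        if name not in model.params:
            raise KeyError(f"unexpected key in reference checkpoint: {ref_key}")
        if len(value) != len(model.params[name]):
            raise ValueError(f"size mismatch for {name}")
        model.params[name] = list(value)
        loaded.append(name)
    # Fail loudly if the checkpoint did not cover every parameter.
    missing = sorted(set(model.params) - set(loaded))
    if missing:
        raise KeyError(f"missing keys: {missing}")
    return loaded
```

A caller would build the model, pass the reference checkpoint as a dict, and supply a map such as `{"classifier.weight": "fc.weight", "classifier.bias": "fc.bias"}`. Raising on unexpected, mismatched, or missing keys is a design choice here so that a partially loaded model never goes unnoticed.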