tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00

Francis Lata 2793cca9a6 RetinaNet MLPerf (#8385)	2025-04-12 22:11:51 -04:00
* add support for a custom BASEDIR for openimages download
* make export step faster
* add focal loss
* update model_eval with new dataloader
* generate_anchors in tinygrad
* update initializers for model
* small cleanup
* revert isin enhancements
* recursively go through backbone layers to freeze them
* add optimizer
* minor cleanup
* start dataloader work with input images
* add first transform for train set
* reuse existing prepare_target
* continue with dataloader implementation
* add dataloader
* separate out KiTS19 dataset test cases
* create mock data samples for test
* add dataloader + test
* cleanup dataloader test and revert shm path
* trim dataloader related code needed from ref
* got dataloader with normalize working
* update image to be float32
* add back normalization and negate it in test
* clean up reference dataset implementation + ruff changes
* add validation set test
* add proper training loop over the training dataset
* add LambdaLR support
* add LR scheduler and the start of training step
* get forward call to model work and setup multi-GPU
* already passed device
* return matches from dataloader
* hotfix for dataloader typo causing some hang
* start some work on classification loss
* update focal loss to support masking
* add missing test and cleanup focal loss
* cleanup unit tests
* remove masking support for sigmoid_focal_loss
* make ClassificationHead loss work
* cleanups + fix dataloader tests
* remove sigmoid when computing loss
* make anchors use Tensors
* simplify anchors batching
* revert anchors to use np
* implement regression loss
* fix regression loss
* cleanup losses
* move BoxCoder to MLPerf helpers
* revert helper changes
* fixes after helper refactor cleanup
* add tests for l1_loss
* start re-enabling training step
* minor cleanup
* add pycocotools to testing dependencies
* make training work
* adjust regression loss to mask after L1 loss is calculated
* reduce img and lbl sizes by half for KiTS19 dataset tests
* Revert "reduce img and lbl sizes by half for KiTS19 dataset tests" This reverts commit `d115b0c664`.
* temporarily disable openimages dataset tests to debug CI
* enable openimages dataset test and create samples once
* temporarily disable openimages validation set test
* reenable test and add some debugging to the test
* add boto3 testing dependencies
* add pandas to testing dependencies
* This reverts commit `467704fec6`.
* reenable test
* move sample creation to setup
* realize boxcoder's encoding
* add wandb
* fix wandb resuming feature
* move anchors as part of dataloader
* fix dtype for anchor inside dataloader and fix horizontal flip transformation
* add support for BENCHMARK
* set seed
* debug dataset test failuire
* Revert "debug dataset test failuire" This reverts commit `1b2f9d7f50`.
* fix dataloader script
* do not realize when sharding model weights
* setup openimages samples differently
* create the necessary samples per test case
* enable lr scheduler and fix benchmark timing
* add jit to the training loop
* add checkpointing and training resume capabilities
* refactor on training loop and start the work on val looop
* add debug logging for dataloader test
* debug test
* assert boxes again
* update validation dataloader and more cleanups
* fix validation test case
* add multi device support to retinanet eval
* fix issue with realized on dataloader
* remove optional disk tensors in dataloader
* remove verbose debugging on datasets test
* put back parallel testing and remove img_ids Tensor from dataloader
* cleanup train and validation dataloader
* return validation targets in dataloader
* cleanup boxes and labels in dataloader
* fix img_ids repeating its values
* remove unnecessary targets from validation dataloader
* add validation loop to training script
* adjust LR to be the ratio of the batch size
* minor cleanups
* remove frozen layers from optimizer's params
* hyperparameter adjustments and cleanups
* model init, hyperparam, and data preprocessing updates
* no need to return loaded keys for resnet
* fix train script
* update loss calculation for regresionhead and some cleanups
* add JIT reset support
* add nan check during training
* Revert "add nan check during training" This reverts commit `ddf1f0d5dd`.
* Revert "Revert "add nan check during training"" This reverts commit `b7b2943197`.
* some typing cleanups
* update seeding on dataloader and the start of training script
* undo changse
* undo more changes
* more typing fixes
* minor cleanups
* update dataloader seed
* hotfix: log metric and move target metric check outside of CKPT
* check for CKPT when target metric is reached before saving
* add TRAIN_BEAM and EVAL_BEAM
* minor cleanup
* update hyperparams and add support for EVAL_BS
* add green coloring to metric reached statement
* initial work to support f16
* update model initializers to be monkeypatched
* update layers to support float32 weight loading + float16 training
* don't return loss that's scaled
* run eval on benchmark beam
* move BEAM to their respective steps
* update layers to be compatible with fp16
* end BENCHMARK after first eval
* cleanups and adjust learning rate for fp16
* remove duplicated files from test
* revert losses changes
* Revert "revert losses changes" This reverts commit `aebccf93ac`.
* go back to old LR
* cast batchnorm to float32
* set new loss scaler default value for float16
* remove LambdaLRScheduler
* remove runner and use dataloader on eval
* fix retinanet eval with new dataloader
* remove unused import
* revert lr_scheduler updates
* use BS=96 with new learning rate
* rename module initializers
* more cleanups on training loop
* remove contig from optim.step
* simplify sum when computing loss
conversation_data	Whisper + LLAMA + VITS (#2332)	2023-12-02 15:03:46 -08:00
llm.c	CLANG -> CPU (#9189)	2025-02-20 18:03:09 -05:00
mlperf	RetinaNet MLPerf (#8385)	2025-04-12 22:11:51 -04:00
openpilot	add onnx frontend stub [pr] (#9558)	2025-03-24 12:24:34 +08:00
other_mnist	add TORCHVIZ=1 to beautiful_mnist_torch (#9576)	2025-03-26 11:17:08 +08:00
rl	more beautiful_cartpole with exposed hparams	2024-01-07 17:41:09 -08:00
sovits_helpers	combine pad2d with pad (#7677)	2024-11-14 17:56:02 +08:00
tinychat	tinychat in browser, Part 3: browser app (#9276)	2025-03-07 15:07:33 +08:00
vgg7_helpers	leakyrelu to leaky_relu (#9270)	2025-02-26 13:22:08 -05:00
webgpu	improve reproducibility of WebGPU CI puppeteer test (#9496)	2025-03-18 09:27:38 -04:00
__init__.py	failing llama test	2023-03-11 16:28:10 -08:00
beautiful_cartpole.py	tinytqdm.set_description and tinytrange (#5101)	2024-06-22 14:45:06 -04:00
beautiful_cifar.py	Fix mypy examples/beautiful_*.py (#6978)	2024-10-10 11:34:29 -04:00
beautiful_mnist.py	Revert "switch beautiful_mnist to use new optimizer [pr] (#8231)" (#8233)	2024-12-13 19:07:09 -08:00
beautiful_mnist_multigpu.py	Fix mypy examples/beautiful_*.py (#6978)	2024-10-10 11:34:29 -04:00
benchmark_onnx.py	more stuff from DSP (#9689)	2025-04-02 15:27:48 +08:00
coder.py	apply the same fix_bf16 in llama and coder (#3789)	2024-03-17 21:25:24 -04:00
compile_efficientnet.py	CLANG -> CPU (#9189)	2025-02-20 18:03:09 -05:00
compile_tensorflow.py	add onnx frontend stub [pr] (#9558)	2025-03-24 12:24:34 +08:00
conversation.py	Fix examples/conversation.py (#8425)	2024-12-26 12:45:19 -05:00
efficientnet.py	remove clang program header (#4422)	2024-05-04 08:38:01 -07:00
flux1.py	flux set model path in args (#7660)	2024-11-12 22:11:40 -05:00
flux1_seed0.png	Flux.1 (#6334)	2024-09-24 10:08:04 +08:00
gpt2.py	cleanup ci, split docs/autogen, testing_minimal, LLVM Speed [pr] (#8952)	2025-02-07 19:01:59 +08:00
handcode_opt.py	move hand_coded_optimizations to heuristic.py [pr] (#9844)	2025-04-10 23:40:16 -04:00
hlb_cifar10.py	MultiLazyBuffer is UOp [pr] (#8662)	2025-01-24 13:28:55 +09:00
llama.py	validate llama quantize output (#7901)	2024-11-25 16:46:23 -05:00
llama3.py	acc_dtype -> dtype (#9402)	2025-03-10 16:05:30 -04:00
mamba.py	prev speed improvements (#5252)	2024-07-03 09:06:01 -07:00
mask_rcnn.py	change Tensor.stack to method (#4719)	2024-05-24 17:04:19 -04:00
mixtral.py	tinytqdm.set_description and tinytrange (#5101)	2024-06-22 14:45:06 -04:00
mnist_gan.py	leakyrelu to leaky_relu (#9270)	2025-02-26 13:22:08 -05:00
olmoe.py	olmoe memory usage cleanups	2025-03-19 12:28:18 +08:00
openelm.py	nn.RMSNorm (#5272)	2024-07-02 21:39:01 -04:00
qwq.py	QwQ-32B-Preview support (#7962)	2024-12-04 21:46:37 -05:00
sdv2.py	Stable Diffusion v2 Inference (#5283)	2024-07-03 22:47:10 -04:00
sdxl.py	GlobalCounters.reset() in sdxl step [pr] (#8664)	2025-01-17 21:10:28 -05:00
sdxl_seed0.png	default threefry (#6116)	2024-09-25 17:45:13 +08:00
self_tokenize.py	make self_tokenize output more like a python file (#8411)	2024-12-25 14:16:30 -05:00
serious_mnist.py	combine pad2d with pad (#7677)	2024-11-14 17:56:02 +08:00
simple_conv_bn.py	fix various examples (#4691)	2024-05-22 20:43:21 -04:00
so_vits_svc.py	use tuple in isinstance for type checking (#9583)	2025-03-26 19:36:48 +08:00
stable_diffusion.py	Remove wgpu specific checks from stable diffusion example (#7991)	2024-12-02 11:31:14 +01:00
stable_diffusion_seed0.png	default threefry (#6116)	2024-09-25 17:45:13 +08:00
stunning_mnist.py	stunning_mnist [run_process_replay] (#6828)	2024-10-01 15:00:48 +08:00
test_onnx_imagenet.py	fixes from the dsp branch + 12500 lines (#9683)	2025-04-02 13:07:17 +08:00
test_pkl_imagenet.py	more stuff from DSP (#9689)	2025-04-02 15:27:48 +08:00
torch_cuda_kernel.py	hotfix: interop example (#9237)	2025-02-25 10:32:00 +03:00
train_efficientnet.py	tinytqdm.set_description and tinytrange (#5101)	2024-06-22 14:45:06 -04:00
train_resnet.py	move things, clean up extra (#2292)	2023-11-13 20:18:40 -08:00
transformer.py	fix onehot and jit in examples/transformer (#3073)	2024-01-10 02:22:41 -05:00
vgg7.py	waifu2x vgg7: testcase, auto-RGBA->RGB, function to grab pretrained models, training "fix" (#2117)	2023-10-19 22:07:15 -07:00
vit.py	move to new cached fetch (#2493)	2023-11-28 17:36:55 -08:00
vits.py	leakyrelu to leaky_relu (#9270)	2025-02-26 13:22:08 -05:00
whisper.py	enable whisper batch for long sequences (#6458)	2024-09-17 00:42:10 -04:00
yolov3.py	leakyrelu to leaky_relu (#9270)	2025-02-26 13:22:08 -05:00
yolov8-onnx.py	add onnx frontend stub [pr] (#9558)	2025-03-24 12:24:34 +08:00
yolov8.py	YoloV8 on WebGPU (#8007)	2024-12-03 15:10:41 +01:00