mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-06-24 02:14:17 +00:00
* fix eval, lr decay, best eval * 82.27 * 82.64 * 82.79, reproducible * add lr sched, 85.26 * 87.42 * 87.94 * 87.42 * tta with flip * training flip aug * refactor * using Tensor for LR is faster * 89.5 * refactor, flip only train set * 90.01 * 90.64 * eval jit * refactor * only JIT model * fix eval JIT * fix eval JIT * 90.82 * STEPS=900 reaches 90.22 * TTA envvar * TTA default 0 * fully jit training * refactor optim * fix sched * add label smoothing * param changes * partial gelu * OneCycle with pause * gelu maybe works * 90.12 * remove pause lr * maybe fix lr schedulers * scheduler test passing * comments * try mixup * shuffle! * add back the missing last eval * fix shuffle bugs * add mixup prob * fix mixup prob * 90.19 * correct mixup * correct mixup * correct mixup * 90.24 * 90.33 * refactor, add type hints * add gradient clipping * maybe fix test * full JIT * back to relu for now * pass mixup prob as param * add typehints * maybe CI works * try erf gelu * CI, types * remove useless import * refactor optim * refactor optim * try leakyrelu * try celu * gelu * 90.67 * remove grad clip * remove grad clip tests * revert params * add test for OneCycleLR * 90.62 * fix eval timing * fix eval timing again * so where i calculate mixup_prob matters --------- Co-authored-by: Kunwar Raj Singh <kunwar31@pop-os.localdomain> |
||
|---|---|---|
| mlperf | ||
| vgg7_helpers | ||
| __init__.py | ||
| benchmark_train_efficientnet.py | ||
| compile_efficientnet.py | ||
| compile_tensorflow.py | ||
| deep_deterministic_policy_gradient.py | ||
| efficientnet.py | ||
| hlb_cifar10.py | ||
| hlb_cifar10_torch.py | ||
| llama.py | ||
| mask_rcnn.py | ||
| mnist_gan.py | ||
| serious_mnist.py | ||
| simple_conv_bn.py | ||
| stable_diffusion.py | ||
| train_efficientnet.py | ||
| train_resnet.py | ||
| transformer.py | ||
| vgg7.py | ||
| vit.py | ||
| whisper.py | ||
| yolov3.py | ||
| yolov8-onnx.py | ||
| yolov8.py | ||