tinygrad/examples
Ahmed Harmouche 265304e7fd
Stable diffusion WebGPU port (#1370)
* WIP: Stable diffusion WebGPU port

* Load whole model: split safetensor to avoid Chrome allocation limit

* Gitignore .DS_Store, remove debug print

* Clip tokenizer in JS

* WIP: Compile model in parts (text model, diffusor, get_x_prev_and_pred_x0, decoder), and recreate forward logic in JS

* e2e stable diffusion flow

* Create initial random latent tensor in JS

* SD working e2e

* Log if some weights were not loaded properly

* Remove latent_tensor.npy used for debugging

* Cleanup, remove useless logs

* Improve UI

* Add progress bar

* Remove .npy files used for debugging

* Add clip tokenizer as external dependency

* Remove alphas_cumprod.js and load it from safetensors

* Refactor

* Simplify a lot

* Dedup base when limiting elementwise merge (webgpu)

* Add return type to safe_load_metadata

* Do not allow run when webgpu is not supported

* Add progress bar, refactor, fix special names

* Add option to choose from local vs huggingface weights

* lowercase tinygrad :)

* fp16 model dl, decompression client side

* Cache f16 model in browser, better progress

* Cache miss recovery

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-03 18:29:16 -07:00
mlperf op logger + replay (#2021) 2023-10-08 15:10:18 -07:00
sovits_helpers Implementation of SoftVC VITS SVC model (#1371) 2023-08-13 19:43:23 -07:00
vgg7_helpers waifu2x vgg7: testcase, auto-RGBA->RGB, function to grab pretrained models, training "fix" (#2117) 2023-10-19 22:07:15 -07:00
webgpu/stable_diffusion Stable diffusion WebGPU port (#1370) 2023-11-03 18:29:16 -07:00
__init__.py failing llama test 2023-03-11 16:28:10 -08:00
benchmark_train_efficientnet.py add cache collector (#1595) 2023-08-28 19:59:55 -07:00
compile_efficientnet.py Enable Multi-Output Export (#2179) 2023-10-30 18:42:26 -07:00
compile_tensorflow.py moved extras/jit.py -> tinygrad/jit.py (#599) 2023-02-25 08:32:33 -08:00
efficientnet.py Fix plt output comment (#1428) 2023-08-03 23:35:52 -07:00
f16_w_uint32.py add exp2 (#2192) 2023-10-31 17:48:42 -07:00
gpt2.py beam=16 makes gpt2 gpu-time < 5ms on 3090 (#2154) 2023-10-27 10:21:27 -10:00
handcode_resnet50_opt.py merge kernel and optimizer (#2200) 2023-11-01 15:20:01 -07:00
hlb_cifar10.py hip multigpu training (#1878) 2023-10-24 17:35:53 -04:00
index.html Enable Multi-Output Export (#2179) 2023-10-30 18:42:26 -07:00
llama.py fix codellama params and repeat_kv (#2181) 2023-10-30 10:16:26 -07:00
mask_rcnn.py MaskRCNN Inference (#884) 2023-06-25 15:37:51 -07:00
mnist_gan.py move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
serious_mnist.py move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
simple_conv_bn.py with Tensor.train() (#1935) 2023-09-28 18:02:31 -07:00
so_vits_svc.py use class Foo: instead of class Foo(): (#1797) 2023-09-06 12:20:25 -07:00
stable_diffusion.py Stable diffusion WebGPU port (#1370) 2023-11-03 18:29:16 -07:00
train_efficientnet.py Fix examples/train_efficientnet (#1947) 2023-10-02 02:23:38 -07:00
train_resnet.py move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
transformer.py move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
vgg7.py waifu2x vgg7: testcase, auto-RGBA->RGB, function to grab pretrained models, training "fix" (#2117) 2023-10-19 22:07:15 -07:00
vit.py .cpu().numpy() -> .numpy() (#1594) 2023-08-21 09:53:29 -07:00
vits.py [ready] Replacing os with pathlib (#1708) 2023-08-30 10:41:08 -07:00
whisper.py whisper: make file transcription work, add basic CI test (#2042) 2023-10-13 17:13:35 -07:00
yolov3.py .cpu().numpy() -> .numpy() (#1594) 2023-08-21 09:53:29 -07:00
yolov8-onnx.py [ready] Replacing os with pathlib (#1708) 2023-08-30 10:41:08 -07:00
yolov8.py use class Foo: instead of class Foo(): (#1797) 2023-09-06 12:20:25 -07:00