mirrors/tinygrad: You like pytorch? You like micrograd? You love tinygrad! ❤️

Ahmed Harmouche 265304e7fd Stable diffusion WebGPU port (#1370)	2023-11-03 18:29:16 -07:00
.github/workflows	extract const if it's const (#2193)	2023-10-31 18:52:35 -07:00
cache	add ff_dim to transformer	2021-11-29 12:40:52 -05:00
disassemblers/adreno	[ready] Replacing os with pathlib (#1708)	2023-08-30 10:41:08 -07:00
docs	simple runtime args (#2211)	2023-11-03 12:31:29 -07:00
examples	Stable diffusion WebGPU port (#1370)	2023-11-03 18:29:16 -07:00
extra	Replace (getenv("CI", "") != "") with helpers.CI (#2213)	2023-11-03 15:20:44 -07:00
models	resnet50 hand coded optimization (#1945)	2023-09-29 09:34:51 -07:00
openpilot	fix shape	2023-10-31 11:36:19 -07:00
test	Replace (getenv("CI", "") != "") with helpers.CI (#2213)	2023-11-03 15:20:44 -07:00
tinygrad	Stable diffusion WebGPU port (#1370)	2023-11-03 18:29:16 -07:00
weights	gitignore in weights	2023-08-02 16:26:41 +00:00
.editorconfig	Revert "update editorconfig, enforce via CI (#1343)" (#1380)	2023-07-31 10:35:50 -07:00
.flake8	flake8 (#1323)	2023-07-24 11:19:58 -04:00
.gitignore	Stable diffusion WebGPU port (#1370)	2023-11-03 18:29:16 -07:00
.pre-commit-config.yaml	remove arm64, caching for cuda (#2201)	2023-11-01 18:44:00 -07:00
.pylintrc	style: else-after-return (#1216)	2023-07-12 10:26:38 -07:00
.tokeignore	Add a quick start guide (#900)	2023-06-04 08:51:20 -07:00
compile.sh	stop wasting time with the compiler. tinygrad needs to just jit	2023-03-12 12:08:46 -07:00
CONTRIBUTING.md	feat: reword contributing (#1131)	2023-07-04 22:17:47 -07:00
LICENSE	Updated LICENSE year (#760)	2023-05-01 15:35:23 -07:00
mypy.ini	ci: use `mypy.ini` (#1993)	2023-10-06 01:45:28 -07:00
push_pypi.sh	push pypi	2020-10-27 08:13:15 -07:00
pytest.ini	Update pytest.ini format (#1398)	2023-08-01 18:00:51 -04:00
README.md	add accelerator links to readme (#1649)	2023-08-23 14:47:55 -04:00
rmso.sh	compile works (#688)	2023-03-12 11:01:25 -07:00
ruff.toml	no functions with same names in test/ (#1811)	2023-09-07 11:27:31 -07:00
run_multibackend.sh	convert `$@` to `"$@"` in `run_multibackend.sh` (#1379)	2023-07-31 10:39:22 -07:00
setup.py	pin onnx to 1.14.1	2023-11-02 18:03:21 -07:00
strip_whitespace.sh	strip whitespace	2023-06-27 10:11:43 -07:00
sz.py	fixes (#1893)	2023-09-22 07:20:27 +08:00

Ahmed Harmouche 265304e7fd

* WIP: Stable diffusion WebGPU port

* Load whole model: split safetensor to avoid Chrome allocation limit

* Gitignore .DS_Store, remove debug print

* Clip tokenizer in JS

* WIP: Compile model in parts (text model, diffusor, get_x_prev_and_pred_x0, decoder), and recreate forward logic in JS

* e2e stable diffusion flow

* Create initial random latent tensor in JS

* SD working e2e

* Log if some weights were not loaded properly

* Remove latent_tensor.npy used for debugging

* Cleanup, remove useless logs

* Improve UI

* Add progress bar

* Remove .npy files used for debugging

* Add clip tokenizer as external dependency

* Remove alphas_cumprod.js and load it from safetensors

* Refactor

* Simplify a lot

* Dedup base when limiting elementwise merge (webgpu)

* Add return type to safe_load_metadata

* Do not allow run when webgpu is not supported

* Add progress bar, refactor, fix special names

* Add option to chose from local vs huggingface weights

* lowercase tinygrad :)

* fp16 model dl, decompression client side

* Cache f16 model in browser, better progress

* Cache miss recovery

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>

README.md

Homepage | Documentation | Examples | Showcase | Discord

Features

LLaMA and Stable Diffusion

Laziness

Neural networks

Neural network example (from test/models/test_mnist.py)

Accelerators

Installation

From source

Documentation

Quick example comparing to PyTorch

Contributing

Running tests