mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-06-24 02:14:17 +00:00
* [WIP]: implementation of SoftVC VITS SVC model
* fix typo
* fix whitespace
* Fully implement Generator & Synthesizer
  - implement SineGen & SourceHnNSF to reconstruct source signal from F0
  - source signal is added during Generator
  - fix various typos
  - start loading state dict for synthesizer
* Load Synthesizer weights
  - Fix typos in Synthesizer
  - Slightly modify vits::load_checkpoint to skip a specified layer
  - Test with Saul Goodman model because Drake weights are on mega
* start work on ContentVec
  - implement ConvFeatureExtractionModel for ContentVec
  - start work on TransformerEncoder for ContentVec:
    - this transformer probably needs its own MultiheadAttention implementation
  - fix various typos in synthesizer
  - add helpers to mask behavior of ~ and % operator of torch
* use normal and kaiming_normal
* Implement ContentVec
  - load ContentVec weights and config from fairseq hyperparams
  - use MultiHeadAttention from whisper.py
  - TransformerSentenceEncoderLayer might still need some tweaking, will see during inference testing
  - redid tilde()
  - some cleanup
* rename the file so it can be imported
* forgot to lint
* use float() instead of cast()
* add contentvec256l9 and cleanup
* Implement SoVITS fully and run it
  - Fully run sovits with .wav file
  - Drake weights need to be manually downloaded for now
  - Fix bugs
  - Add examples/sovits_helpers
  - Big TODO: INVALID Kernel for recordings > 4.5 secs
* temp fix for longer audio recordings
* Upsample no more torch
* cleanup & detailed inference time measuring
* Completely remove torch(audio)
  - Implement sinc resample in tinygrad
  - Load audio via Soundfile
  - Some cleanups
* move stuff to helper files
* Cleanup
* fix invalid kernel
* Cleanup & add more models
* Metal sounds good after master merge
  - But Synthesizer pass became much slower
* drake weights now marked save
* do load/store in numpy
* no commas needed here
* remove extra newline
* call Tensor::where on object
* use Tensor::cat instead of numpy
* pull out first iteration
* remove Sequential, Dropout, GELU, TransposeLast
* cast during loading
* clean up attention
* remove SamePad
* Major cleanup / line reduction
  - Finish implementation of GroupNormMasked
  - Simplify parts of TransformerEncoder
  - Simplify parts of Generator
  - Move all helpers to common section
  - Only use repeat_expand_left for interp after SpeechEncoder
  - Moved SVC-specific ContentVec impls up (canonically)
  - Proper annotations for get_encoder
  - Finished all TODOs
  - Squashed some whitespaces
* clean up preprocess as well
* more straightforward bool expr
* add demo mode

| | | |
|---|---|---|
| mlperf | ||
| sovits_helpers | ||
| vgg7_helpers | ||
| __init__.py | ||
| benchmark_train_efficientnet.py | ||
| compile_efficientnet.py | ||
| compile_tensorflow.py | ||
| deep_deterministic_policy_gradient.py | ||
| efficientnet.py | ||
| gpt2.py | ||
| hlb_cifar10.py | ||
| hlb_cifar10_torch.py | ||
| index.html | ||
| llama.py | ||
| mask_rcnn.py | ||
| mnist_gan.py | ||
| serious_mnist.py | ||
| simple_conv_bn.py | ||
| so_vits_svc.py | ||
| stable_diffusion.py | ||
| train_efficientnet.py | ||
| train_resnet.py | ||
| transformer.py | ||
| vgg7.py | ||
| vit.py | ||
| vits.py | ||
| whisper.py | ||
| yolov3.py | ||
| yolov8-onnx.py | ||
| yolov8.py | ||