leopf | 4f0ee4e982 | BPE tokenizer (#11415)
* BPE works
* refactor tok
* oops
* basic tests
* fix eval
* smaller diff
* fix error
* proper vocab decoding
* use regex for splitting
* escape ucatrange
* full compat
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
| 2025-08-04 09:52:38 -07:00 |