Commit graph

8,475 commits

Author SHA1 Message Date
George Hotz
f3cb4c3eef oops 2025-04-01 23:44:44 +08:00
George Hotz
6ecaf11224 ugh many hacks 2025-04-01 23:33:09 +08:00
George Hotz
8b24f9cb0d oops, didn't mean to change that 2025-04-01 17:55:04 +08:00
George Hotz
797e512c00 all correct 2025-04-01 17:51:24 +08:00
George Hotz
f600482982 correctness 2025-04-01 17:27:16 +08:00
George Hotz
da35edbb55 reenable that upcast 2025-04-01 17:09:02 +08:00
George Hotz
661431ee75 correctness 2025-04-01 17:01:46 +08:00
George Hotz
8340d9c1c2 disable padding 2025-04-01 16:27:54 +08:00
George Hotz
910cddbbca correct but slower 2025-04-01 16:11:47 +08:00
George Hotz
e6e0c0ec86 should work 2025-04-01 15:25:15 +08:00
George Hotz
d0eedb5a79 hack 2025-04-01 15:05:13 +08:00
George Hotz
f69deddbd4 opt 2025-04-01 14:43:36 +08:00
George Hotz
be11fbbf78 works 2025-04-01 14:38:38 +08:00
George Hotz
812c391617 fp mul 2025-04-01 13:43:16 +08:00
George Hotz
3306083f42 YOU DIDNT FOIL 2025-04-01 12:32:00 +08:00
George Hotz
18d7e9d3f1 oops 2025-04-01 11:56:57 +08:00
George Hotz
1c3f249ecf fix multicore flop tracking 2025-04-01 10:16:01 +08:00
nimlgen
bb7b89475c dsp multicore 2 (#9644)
* dsp multicore 2

* hmm

* better
2025-03-31 23:56:54 +08:00
George Hotz
8005e6c974 write test pkl imagenet 2025-03-31 19:37:28 +08:00
George Hotz
a3d61a0372 save pkl from benchmark 2025-03-31 19:31:48 +08:00
George Hotz
c73e35aa24 non const fix 2025-03-31 19:10:06 +08:00
George Hotz
0b4b9f61b9 simpler 2025-03-31 19:03:06 +08:00
George Hotz
ee3ddfcdc1 many l2fetch 2025-03-31 18:58:52 +08:00
George Hotz
220d682489 prefetch l2 is so winning 2025-03-31 18:29:12 +08:00
George Hotz
9c388c3539 try to be smarter 2025-03-31 18:23:49 +08:00
George Hotz
4b3a4c8c46 fix prefetch l2 2025-03-31 18:09:48 +08:00
George Hotz
eb606d7230 MULTICORE=1 PYTHONPATH=. QUANTIZE=1 DEBUG=2 DEVECTORIZE=0 python3 extra/replay_pkl.py /tmp/im.pkl 2025-03-31 15:37:07 +08:00
George Hotz
49d52a2763 support acc in __builtin_HEXAGON_A2_vraddub 2025-03-31 15:12:00 +08:00
George Hotz
a59c3dd09a err, that's a bug 2025-03-31 14:56:15 +08:00
George Hotz
a640292aed delete extra 2025-03-31 14:35:32 +08:00
George Hotz
2f48c12441 Merge branch 'master' into dsp_search 2025-03-31 14:27:27 +08:00
George Hotz
e4c545b396 linearizer fix from dsp branch (#9641)
* linearizer fix from dsp branch

* revert that
2025-03-31 14:26:39 +08:00
George Hotz
be3b5efc64 fix precommit a bit 2025-03-31 14:26:19 +08:00
George Hotz
996d0ac1d2 multicore all the way 2025-03-31 14:17:19 +08:00
George Hotz
ec405b919f Revert "Revert "do not block gc in UOp.toposort (#9623)" (#9624)" (#9639)
This reverts commit 7ef02d0e1c.
2025-03-31 14:03:38 +08:00
George Hotz
77e897b3b1 Merge branch 'master' into dsp_search 2025-03-31 13:03:29 +08:00
George Hotz
49b1c46d16 good changes from the dsp branch (#9638) 2025-03-31 13:02:53 +08:00
George Hotz
273dde69bd remove range split support 2025-03-31 12:43:21 +08:00
George Hotz
a64030d8c8 ignore hacks 2025-03-31 12:36:39 +08:00
qazal
9d67d3a2f3 simpler viz codeblocks (#9636)
* simpler viz codeblocks

* err
2025-03-31 11:48:35 +08:00
George Hotz
9b19129e87 mc 2025-03-31 11:34:22 +08:00
George Hotz
48221d9024 2 global dim 2025-03-31 11:25:12 +08:00
George Hotz
bcfcd60f55 opt weights 2025-03-31 11:02:03 +08:00
chenyu
60eb0c4ed7 exclude slow tests on PYTHON (#9634) 2025-03-30 22:55:05 -04:00
George Hotz
abc90024ac hand coded opts 2025-03-31 10:44:09 +08:00
chenyu
5012ba3f04 cumalu touchup [pr] (#9632) 2025-03-30 22:43:11 -04:00
chenyu
d8d7ac1bb1 fix bert free_intermediates (#9633)
fix when only run eval `TRAIN=0 BERT_SIZE=tiny examples/mlperf/training_submission_v5.0/tinycorp/benchmarks/bert/implementations/tinybox_green/dev_beam.sh`
2025-03-30 22:42:52 -04:00
qazal
ff984c807d hotfix: less lines for viz helpers (#9631) 2025-03-31 10:10:34 +08:00
George Hotz
f0e6d8394c Merge branch 'master' into dsp_search 2025-03-31 10:01:19 +08:00
qazal
c206a7ae6d refactor viz state updates (#9630)
* refactor viz state updates

* onclick
2025-03-31 09:54:54 +08:00