13,471 commits

Author SHA1 Message Date
nimlgen
18cfb54736
amd: a bit better se limiting (#13440)
* amd: a bit better se limiting

* SQTT_LIMIT_SE=0
2025-11-24 21:51:47 +03:00
C T
2d53029be3
Whisper less flaky tests (#13435)
* use less flaky metric for whisper long transcription

* multiline long transcription 3 reference

* fix reference transcript

see https://homepage.ntu.edu.tw/~karchung/miniconversations/MC.htm
sanitized for whisper

* try lower wer threshold

* add test for wer metric

* extract TRANSCRIPTION_3_ALT

* rename test

* rename

* add tests for high WER difference

* move tests

* sync metric
2025-11-24 09:50:49 -08:00
qazal
2a9bd12700
sqtt: add occupancy events to the timeline (#13430) 2025-11-24 22:28:05 +08:00
Sieds Lykles
63a931ff76
Symbolic divisor fuzzer (#13433)
* render z3 range better

* working version

* rename

* add to workflow

* factor out variable_names

* smaller expressions

* smaller

* + back
2025-11-23 20:29:32 +01:00
nimlgen
677db34eba
nv: cleanup map flags (#13434) 2025-11-23 19:54:52 +03:00
qazal
712c7a6448
sqtt loader cleanups from the occupancy branch (#13431)
* cleanup err handling

* from disasms

* s/wave_execs/wave_insts
2025-11-23 21:50:34 +08:00
George Hotz
9d7a17ee39
beautiful SQTT_PARSE=1 with color (#13428)
* beautiful SQTT_PARSE=1 with color

* linter

* linter 2

* a few more labels

* filter and or

* wave alloc

* a few more
2025-11-23 01:05:14 -08:00
qazal
474a631877
viz: align left offset for nested items (#13420) 2025-11-23 14:22:51 +08:00
George Hotz
da0aa57a3b
add cu parsing to attempt_sqtt_parse 2025-11-22 22:09:05 -08:00
qazal
320ed78803
can view wave timeline with SQTT_ITRACE_SE_MASK=0 (#13427) 2025-11-23 13:55:47 +08:00
Pranil
c1838c71fc
display service name typo (#13426)
its tinybox-display.service
2025-11-22 20:49:56 -08:00
George Hotz
5110409339
continue work on parse sqtt, enable with SQTT_PARSE (#13425)
* continue work on parse sqtt, enable with SQTT_PARSE

* fix timing

* delta is pre instruction

* hi8 values

* a few more

* a bit more

* let it crash if you enabled it

* figure out simd

* hide 0x11
2025-11-22 19:03:17 -08:00
George Hotz
92170d0ff1
lil op cleanup (#13424)
* track flag count and op count

* text

* more

* file count

* lil op cleanup

* cleanups

* move
2025-11-22 15:21:15 -08:00
George Hotz
423b76a852
improve sqtt format parser (saturday coffee shop project) (#13419)
* improve sqtt format parser

* actually read the trash code ChatGPT wrote

* cleanups

* hand written parser

* quality

* more

* was missing first packet

* maybe

* filt

* fixups

* label the waves

* progress
2025-11-22 15:04:10 -08:00
George Hotz
9d6cf3472e
remove op/sentinel 2025-11-22 15:01:47 -08:00
Christopher Milan
310da2a201
remove hashFiles in setup-tinygrad (#13423)
* fix hashFiles in setup-tinygrad on macos

* remove hashFiles altogether
2025-11-22 17:47:10 -05:00
qazal
c14033e10f
viz: faster startup time with SQTT=1 (#13337)
* roc.py cleanups

* direct append

* viz index cleanup

* simd row details

* add kernel arg

* late instructions decode

* more instruction decode to sep server request

* 200ms startup, 6 second to waves timeline

* sort units

* creating new http paths is easy now

* instructions unpacker

* min diff, use hyphens

* summary table
2025-11-22 22:02:30 +08:00
qazal
1655fdb6de
viz: cleanup sqtt loader (#13417) 2025-11-22 20:10:23 +08:00
qazal
903eec3754
fix sz.py tinygrad import in ci (#13418) 2025-11-22 19:20:26 +08:00
nimlgen
3a42680e22
amd: pmc generic arch for gfx10+ (#13407) 2025-11-22 12:31:23 +03:00
George Hotz
1f8b24a6b9
track flag count and op count (#13416)
* track flag count and op count

* text

* more

* file count
2025-11-21 22:46:33 -08:00
George Hotz
4c0f4226b9
delete the PRECAST op [p] (#13415)
* don't use PRECAST in cstyle renderer [p]

* fix in metal

* fix opencl

* __builtin_bit_cast

* precast is unused

* cuda is c99?

* lambda_union_bitcast

* helper function

* delete precast op
2025-11-21 21:47:14 -08:00
wozeparrot
1f648bb1ba
feat: reenable mobilenetv2 dsp (#13320) 2025-11-21 15:21:49 -08:00
chenyu
054477a44f
remove full_symbolic in simplify (#13413)
only flip one schedule in winograd backward, no functional difference
2025-11-21 15:04:00 -05:00
chenyu
cb29265f23
add test that shows the validhack regression with bad rewrite order (#13411) 2025-11-21 13:48:30 -05:00
qazal
fdfe83880b
viz: unique sqtt wave names (#13410)
* viz: unique sqtt wave names

* better name for the shape

* it's a per program counter now

* table view, refactor to wave:insts dict
2025-11-22 02:43:31 +08:00
chenyu
a6c9b4ff6a
fix symbolic comments [pr] (#13408) 2025-11-21 09:18:50 -05:00
Sieds Lykles
114bb94c55
Fix load collapse MAX to ADD (#13406)
* add Ops.ADD to pattern

* add test
2025-11-21 12:26:14 +01:00
qazal
87c248eafa
small cleanups from viz memory usage fixes (#13405)
* shape link cleanups

* cleanup findRectAtPosition
2025-11-21 17:05:08 +08:00
qazal
0de1b24154
viz: SE : CU : SIMD : WAVE in sqtt timeline (#13404)
* wave id in device rows

* SE : CU : SIMD : WAVE

* automatic width

* better styling

* rm the blue

* sort
2025-11-21 15:42:29 +08:00
George Hotz
dabb02767f
set AMD profile mode with sudo on SQTT or PMC (#13403)
* require profile mode

* add mode setter

* cleanup

* not needed

* SQTT_LIMIT_SE
2025-11-20 23:19:11 -08:00
George Hotz
e1051d00d7
multi like on full_like as well as rand_like (#13402)
* multi like on full_like as well as rand_like

* add test and fix bug

* mismatch, optim match

* one line
2025-11-20 20:46:48 -08:00
chenyu
fa3def2f12
call less simplify in simplify_valid_load [pr] (#13401) 2025-11-20 19:54:22 -05:00
qazal
895ec7417e
viz: enable mapping function names to colors (#13400) 2025-11-21 06:43:02 +08:00
George Hotz
a74f6020d5
track apply map to tensors (#13399)
* track apply map to tensors

* sub
2025-11-20 14:24:55 -08:00
chenyu
647fde64e6
no sym in pm_reduce [pr] (#13398)
* no sym in pm_reduce [pr]

* fix that
2025-11-20 16:49:09 -05:00
qazal
1313250e0d
viz: use system helper for llvm-mca (#13395) 2025-11-21 04:47:25 +08:00
Christopher Milan
de3593957f
Revert "Revert "autogen: fix formatting on zero-argument function-like macros…" (#13388)
This reverts commit 0901a40685.
2025-11-20 15:36:13 -05:00
qazal
1220072328
viz: refactor to generic steps api (#13393) 2025-11-21 04:33:23 +08:00
George Hotz
26ccbf7040
debufferize with symbolic in one pm (#13392) 2025-11-20 11:47:03 -08:00
George Hotz
c46f608703
top down remove_bufferize (#13391)
* top down remove_bufferize

* removable if ALWAYS_CONTIGUOUS
2025-11-20 11:32:00 -08:00
Christopher Milan
4043489803
set curl -f in setup-tinygrad (#13389)
* set curl -f in setup-tinygrad

* test bad redirect

* Revert "test bad redirect"

This reverts commit ad945e7ffc.
2025-11-20 13:45:47 -05:00
chenyu
0251a8e628
parse_valid minor cleanup [pr] (#13385)
* stricter parse_valid [pr]

* not stricter

* no VCONST

* Revert "no VCONST"

This reverts commit 330dbdf4060562596febcbf970bda6051a35012f.
2025-11-20 13:15:06 -05:00
Christopher Milan
0901a40685
Revert "autogen: fix formatting on zero-argument function-like macros (#13386)" (#13387)
This reverts commit 58d85d4bab.
2025-11-20 12:45:35 -05:00
b1tg
91e289cb14
amd fp8 llvm (#13186)
* amd fp8 llvm support

* fix max

* clean

* add test_mi350.sh

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-11-20 12:35:57 -05:00
Roelof van Dijk
1058748440
torch backend: no aten.detach for torch 2.10 compat (#13381)
* this works, less cpp?

* simpler = better

* keep torch 2.9 working as well
2025-11-20 09:12:15 -08:00
Christopher Milan
58d85d4bab
autogen: fix formatting on zero-argument function-like macros (#13386)
* fix formatting on zero-argument function-like macros

* autogen tests should run

* ugh
2025-11-20 12:11:04 -05:00
qazal
9dbc550692
roc: map disassembly to prog name (#13384) 2025-11-20 23:47:19 +08:00
qazal
ebcdf68bab
viz: use content headers for profiler (#13383) 2025-11-20 23:33:16 +08:00
nimlgen
0b0ea4981c
hcq: unwrap signals (#13382) 2025-11-20 18:12:41 +03:00