Commit graph

358 commits

Author SHA1 Message Date
George Hotz
1cd10a2e3c
amd isel renderer 2026-02-13 17:50:39 +08:00
nimlgen
10c94d2c2d
amd: print more info about device hang (#14705) 2026-02-12 15:34:08 +03:00
nimlgen
42ded7c34d
amd: bind aql (#14666)
* amd: bind to aql

* bind

* x

* f
2026-02-10 16:28:11 +03:00
Christopher Milan
e6562a5061
remove CompilerPair (#14638) 2026-02-09 19:51:18 -05:00
nimlgen
88c3022223
amd: kfd iface early exit (#14612)
* amd: kfd iface early exit

* l

* revert
2026-02-07 18:57:10 +03:00
nimlgen
fbeb978170
diff devices for sdma (#14589)
* start

* x

* fix

* sdma

* c

* clean

* x

* hm

* cleaer
2026-02-06 16:39:12 +03:00
nimlgen
ec2b6bbda8
hcq: update signal logic (#14531) 2026-02-04 19:32:56 +03:00
nimlgen
6e4238c016
amd: recovery (#14461)
* rec

* ?

* rv

* cleaner

* post merge

* not used

* um

* clnr

* x

* x

* d

* move
2026-02-02 18:57:35 +03:00
nimlgen
e0978498dc
amd: read_ptr/write_ptr/doorbells are not lists (#14445) 2026-01-30 23:11:57 +03:00
chenyu
15aed51544
return types for all math.py function (#14413)
calling int() on sint -> int, i think it's better support since some UOp can be safely cast to int
2026-01-28 20:10:11 -05:00
nimlgen
ec1b28bc2c
am: exit early in case of failures (#14376)
* am: exit early in case of failures

* sorry, pre-linter

* reset when error state
2026-01-27 22:10:02 +03:00
chenyu
cd22ee9ed0
add InvalidType to ConstType [pr] (#14373)
* add InvalidType to ConstType [pr]

TYPED=1 python test/test_tiny.py passes.
added PyConst = float|int|bool for some Tensor level input types

* hcq
2026-01-27 14:09:34 -05:00
qazal
a5f3d46423
hcq: do not assume kernel names are unique (#14371)
* hcq: do not assume kernel names are unique

* colored kernel name
2026-01-27 23:03:15 +09:00
nimlgen
3f25eb3026
am: ih (#14346)
* am: ih

* um

* fix

* line

* no trap and fix ring

* keep

* fix
2026-01-26 20:11:04 +03:00
nimlgen
26220a472e
no core_id (#14265)
* no core_id

* kwargs

* est

* linters

* ugh

* revert this

* deps

* glb

* should work?

* nn

* line

* fx

* ym

* z

* d

* um?

* revert

* this one?

* first half

* um p2

* all?

* um

* cleaner

* um
2026-01-23 21:30:12 +03:00
nimlgen
8cd22df2dd
amd: alive wgps (#14149)
* amd: disabled wgps

* l

* wgp

* uoops

* mockgpu

* drm

* ad this

* fi

* reg
2026-01-23 00:08:45 +03:00
nimlgen
7cb7abeeb0
amd: fix scratch_wave64_lane_byte_size (#14223) 2026-01-19 15:21:39 +03:00
nimlgen
979ce211f7
amd: missing self in aql's exec (#14224) 2026-01-19 14:27:54 +03:00
George Hotz
31bcbed6bb
AMD_DISABLE_SDMA for testing with -n12 (#14216) 2026-01-19 16:10:30 +09:00
chenyu
b12a9fea80
runtime int call instead of cast(int) (#14183) 2026-01-17 20:34:45 -05:00
Christopher Milan
0cb024a5bb
remove ctypes.Structure (#13651) 2026-01-15 05:06:22 -05:00
nimlgen
8c55ef4f01
amd: cleanup props (#14145)
* amd: cleanup props

* f
2026-01-14 20:27:41 +03:00
George Hotz
e5500ae4ad
add ALU stuff to default perf counters (#14135)
* add ALU stuff to default perf counters

* lds

* add alu utilization

* cleaner

* format as percent

* cleanest

* roc
2026-01-14 19:47:59 +09:00
nimlgen
62c1a014a6
amd: rename to be consistent (#14141) 2026-01-14 11:41:04 +03:00
George Hotz
a28c8105a5
assembly/amd: 2% faster amd_uop_matmul + SQTT (#14122)
* assembly/amd: 2% faster amd_uop_matmul

* SQTT_TOKEN_EXCLUDE + SQTT_SIMD_SEL

* sqtt printer

* fix printer

* fast decode

* fast decoder

* test packet counts

* ugh it's not faster

* dead
2026-01-13 19:55:32 +09:00
qazal
d8aba24967
amd: use kernel descriptor struct in AMDProgram (#14096) 2026-01-11 18:25:16 +09:00
nimlgen
325f4006ff
amd: copies w/o sdma (#14036)
* amd: copies w/o sdma

* as_args

* fixes

* f
2026-01-06 21:15:58 +03:00
nimlgen
a49924a0e9
hcq: _sleep report status (#13992)
* hcq: _sleep report status

* msg

* print all
2026-01-03 14:28:28 +03:00
nimlgen
b8ea0d779c
am: remove pipe, queue from setup_ring (#13947) 2026-01-01 21:06:41 +03:00
nimlgen
1c5ed8e8b5
am: remove doorbells from setup_ring (#13946) 2026-01-01 14:39:21 +03:00
nimlgen
f7ee644950
amd: lazy sdma queue allocation (#13920)
* ams: lazy queue

* nv

* linter

* f
2025-12-31 15:17:13 +03:00
George Hotz
25ef866e89
write python emulator from RDNA3 psuedocode in pdf (#13841)
* write python emulator from RDNA3 psuedocode in pdf

* emu2

* more emu

* working

* more psueod

* progress

* cleanups

* delete junk

* delete stale files

* just emu

* work

* emu compare

* bemu

* cleanups and more failures

* revert bench emu

* fix emu cmp

* four tests fail

* bugfixes

* dsl

* ext

* refactor

* dsl

* div scale fix

* test_emu

* fix emu tests

* pcode

* test pcode

* top imports

* fix test_emu to use run_asm

* emu tests on real hardware

* more tests

* more emu tests

* more

* work

* work

* bug fix

* bugfixes

* fix fp16 gemm

* all ops tests pass in emulator

* fix llvm tests

* fix a few more tests

* fix mockgpu timeout
2025-12-29 07:39:53 -05:00
George Hotz
f1111ac7de
move amd compilers to new style (#13831)
* move amd compilers to new style

* simplest diff

* AMDHIPrenderer
2025-12-25 13:42:24 -05:00
George Hotz
9d94b8c6b2
python asm dsl in extra + python REMU (#13436)
* having fun with python asm dsl

* rdna3

* meh

* all in rdna3

* work

* more work

* work

* integration

* tests

* simpler

* simpler

* asm

* better

* simpler

* progress

* emu

* simpler

* emu

* tests

* types

* vopd

* cleaups

* work

* memory ranges

* add tracing

* refactors

* run_asm exit

* more readable

* compare to remu

* test gemm

* bug + stale

* more tests

* refactor

* tests fix

* more ins

* more instructions

* refactor

* faster

* match case

* match case

* simpler

* work

* tests

* run_asm

* work

* bug fixes

* more emu

* alu/emu

* refactor

* no pipeline emu yet

* alu direct

* fix

* bugfixes + new test

* fix exceptions in emulators

* update gen.py

* pylint

* no pdf

* improve bench_emu

* speedups

* cleanups

* more tests
2025-12-25 13:04:14 -05:00
nimlgen
90b217896f
am: xgmi p2p (#13811)
* system: use addr space

* am: xgmi

* fix

* ugh
2025-12-23 20:11:38 +03:00
nimlgen
f6bda6ae4e
am: continue from saved state (#13799)
* am: gfx queue cont

* f

* reset

* f

* l
2025-12-22 15:55:07 +03:00
George Hotz
a987a8ed44
add neg VIZ support to not start server (#13772) 2025-12-20 00:36:38 -04:00
nimlgen
3eecb4f123
am: mi350 support (#13733) 2025-12-17 14:57:21 +03:00
nimlgen
5778722979
am: restore queues (#13714)
* am: restore queues

* l

* cmnt
2025-12-16 15:21:42 +03:00
nimlgen
615dcab767
am: minimal mi300 boot (#13679)
* nbio7_9

* psp

* gmc

* gfx

* sdma

* ih

* linter

* linter

* minor

* finish

* add missing

* do not allow warm boot for now
2025-12-15 15:55:03 +03:00
nimlgen
0b15c573ca
amd: xccs in PCIIface (#13669) 2025-12-13 17:22:11 +03:00
qazal
019e71f8ca
lds bank count tests from pmc counters (#13667)
* lds bank count tests from pmc counters

* these tests run on the RDNA3 card too

* rename duration to cycles, other rename comment

* add SQ_LDS_IDX_ACTIVE to gfx9 defaults
2025-12-13 17:39:32 +08:00
nimlgen
b4796e2d32
amd: set queue prio to normal (#13658) 2025-12-12 18:25:41 +03:00
nimlgen
dd8a1a10d4
amd: tiny cleanups (#13616) 2025-12-08 13:15:56 +03:00
nimlgen
dcd50baca4
amd/nv: cleanup (#13608) 2025-12-07 17:05:26 +03:00
nimlgen
abafb96441
hcq: check all subbufs are free (#13599)
* hcq: check all subbufs are free

* fix

* Update ops_amd.py
2025-12-06 17:43:18 +03:00
nimlgen
f2b549d921
amd: refactor scratch calc (#13595)
* amd: refactor scratch calc

* fix
2025-12-06 16:41:35 +03:00
chenyu
0977206b1c
Revert am (#13591)
* Revert "hotfix: amd: tmpring (#13589)"

This reverts commit 4d8b283b36.

* Revert "amd: use correct structs (#13583)"

This reverts commit d8b09eda57.
2025-12-05 11:03:12 -05:00
nimlgen
4d8b283b36
hotfix: amd: tmpring (#13589)
* hotfix: amd: tmpring

* more
2025-12-05 18:19:05 +03:00
nimlgen
d8b09eda57
amd: use correct structs (#13583) 2025-12-05 14:46:38 +03:00