tinygrad/extra
George Hotz 2abb474d43
kfd driver wip (#3912)
* kfd driver wip

* cleanups

* kfd almost ready to ring doorbell

* ding dong?

* issues with signals

* something

* works

* ops kfd

* add amd_signal_t

* works...sometimes

* program runs

* _gpu_alloc cleanup

* cleanups

* work

* header + enable profiling (#3959)

* header + enable profiling

* just cleaner

* measure

* only local time domain

* remove old comments

* fix with master

* elf parsing (#3965)

* elf parsing

* fix kernels with private

* not used

* clean up

* clean up 2

* add flags

* kfd sdma (#3970)

* working sdma

* remove driver, shorter

* all commands we might need

* svm

* kfd remove hardcoded values (#4007)

* remove hardcoded values

* match above line

* 7k lines + revert hsa

* update that from origin

* fix sdma reg gen

* not the updated SDMA

* compiler_opts

* don't require kfd_ioctl

* get ioctls from python

* get ioctls from python

* remove build_sdma_command

* merge into 64-bit fields

* shorter

* fix property spelling and off by one

---------

Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2024-03-30 15:08:12 -07:00
accel               move things, clean up extra (#2292)  2023-11-13 20:18:40 -08:00
assembly            move dtypes to dtype.py (#2964)  2024-01-01 14:58:48 -08:00
backends            LinearizerOptions -> CompilerOptions (#3978)  2024-03-28 17:50:23 -04:00
datasets            create engine folder and move code (#3948)  2024-03-26 20:38:03 -07:00
gemm                extra/gemm/hip_matmul: fix to use new HSA devices and no headers (#3999)  2024-03-30 15:42:23 -04:00
hip_gpu_driver      kfd driver wip (#3912)  2024-03-30 15:08:12 -07:00
hiprtc              use comgr to compile (#3248)  2024-01-26 18:27:49 -08:00
junk                coder.py can write and run code (#2439)  2023-11-25 12:27:54 -08:00
models              fp16 resnet (without expand backwards sum in float, doesn't work) (#3816)  2024-03-28 01:25:37 -04:00
nv_gpu_driver       nv ioctl sniffer (#3892)  2024-03-23 00:29:30 -07:00
optimization        LinearizerOptions -> CompilerOptions (#3978)  2024-03-28 17:50:23 -04:00
qcom_gpu_driver     start Qualcomm GPU driver (#2804)  2023-12-16 23:10:50 -08:00
archprobe.py        move dtypes to dtype.py (#2964)  2024-01-01 14:58:48 -08:00
augment.py          [ready] Replacing os with pathlib (#1708)  2023-08-30 10:41:08 -07:00
autopad.py          split to schedule.py (#3949)  2024-03-26 21:02:46 -07:00
disk_read_speed.py  fast hip read (#3014)  2024-01-05 10:33:13 -08:00
dump_cache.py       wow how did i think that was okay (#2339)  2023-11-16 21:21:11 -08:00
export_model.py     create engine folder and move code (#3948)  2024-03-26 20:38:03 -07:00
gradcheck.py        Fix: Jacobian tests [WIP] (#1126)  2023-07-05 15:36:22 -07:00
hip_events.py       move autogen to runtime/autogen (#3254)  2024-01-26 12:44:19 -08:00
introspection.py    move GlobalCounter to helpers (#4002)  2024-03-30 00:30:30 -04:00
lr_scheduler.py     add lars to nn (#3750)  2024-03-24 11:43:12 -04:00
multitensor.py      multitensor start (#2676)  2023-12-07 17:07:05 -08:00
onnx.py             Fix: Always cast ONNX Slice op arguments into ints (#3317)  2024-02-04 18:40:48 -05:00
onnx_ops.py         simple LoadOps.ASSIGN (#3745)  2024-03-14 20:44:34 -07:00
ring_copy.py        ring copy example (#3185)  2024-01-19 23:34:30 -05:00
thneed.py           new style device (#2530)  2023-11-30 17:07:16 -08:00
to_movement_ops.py  fixup to_movement_ops and add back to CI (#3881)  2024-03-22 18:14:49 -04:00
training.py         create engine folder and move code (#3948)  2024-03-26 20:38:03 -07:00
transfer_speed.py   hotfix: copy size is in bytes  2024-01-17 16:44:15 +00:00