tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-06-24 02:14:17 +00:00


divinity76 bec4f59ce8 workaround f16 cast ambiguity (#8935)

For unknown reasons, without this change, trying to execute "Llama 3.2 1B" fails with the error below. Fwiw, I do not know the performance impact of this change. I can't even get exo running, but this change allows me to /get further/ (before running into a separate issue with VRAM allocation? A story for another day, I suppose).

error:
```
Failed to fetch completions: Error processing prompt (see logs with DEBUG>=2): Nvrtc Error 6, NVRTC_ERROR_COMPILATION
<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
    function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
  ((half4)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
  ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
    function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
  ((half4)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
  ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
    function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
  ((half4)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
  ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
    function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
    function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
  ((half4)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
  ^

4 errors detected in the compilation of "<null>".
```

2025-02-11 09:38:56 +08:00
..
bert.py	simpler bert acc [pr] (#8714)	2025-01-22 10:32:19 -05:00
clip.py	clip device fix (#6924)	2024-10-07 00:47:32 +08:00
convnext.py	move to new cached fetch (#2493)	2023-11-28 17:36:55 -08:00
efficientnet.py	remove the magic methods for moving between devices [pr] (#6881)	2024-10-04 20:27:52 +08:00
inception.py	Compute FID Score (#6802)	2024-10-01 19:47:58 -04:00
llama.py	workaround f16 cast ambiguity (#8935)	2025-02-11 09:38:56 +08:00
mask_rcnn.py	explicitly check value for not None (#8382)	2024-12-23 11:12:39 -05:00
resnet.py	Fix FC layer ResNet load_from_pretrained error (#8387)	2024-12-26 18:11:27 -05:00
retinanet.py	combine pad2d with pad (#7677)	2024-11-14 17:56:02 +08:00
rnnt.py	change Tensor.stack to method (#4719)	2024-05-24 17:04:19 -04:00
t5.py	Flux.1 (#6334)	2024-09-24 10:08:04 +08:00
transformer.py	replace with tensor op (#3099)	2024-01-12 14:13:40 -05:00
unet.py	These casts should only happen if these are supported (#7644)	2024-11-12 07:56:50 +08:00
unet3d.py	move to new cached fetch (#2493)	2023-11-28 17:36:55 -08:00
vit.py	move to new cached fetch (#2493)	2023-11-28 17:36:55 -08:00
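
The NVRTC error quoted in the latest commit message reports that the compiler cannot choose a single conversion from `nv_bfloat16` to `half`. One plausible way to avoid that kind of ambiguity is to widen to float32 first and then narrow to float16. The sketch below illustrates that two-step conversion on raw bit patterns, using only the Python standard library. It is not the change made in #8935 (the diff is not shown on this page), and the helper names are hypothetical.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Widen a bfloat16 bit pattern to a float.

    bfloat16 is the upper 16 bits of an IEEE-754 float32, so widening
    is a 16-bit left shift followed by a float32 reinterpretation.
    """
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_f16_bits(value: float) -> int:
    """Narrow a float to an IEEE-754 binary16 (half) bit pattern.

    The struct "e" format performs the rounding. Values outside the
    float16 range are not handled here.
    """
    return struct.unpack("<H", struct.pack("<e", value))[0]


def bf16_to_f16_bits(bits: int) -> int:
    """Hypothetical helper: convert bf16 -> float32 -> f16 explicitly."""
    return float_to_f16_bits(bf16_bits_to_float(bits))


if __name__ == "__main__":
    # 0x3F80 is 1.0 in bfloat16; 0x3C00 is 1.0 in float16.
    print(hex(bf16_to_f16_bits(0x3F80)))
```

Running the script prints `0x3c00`. Note that bfloat16 covers a much wider exponent range than float16, so a real conversion also has to decide what to do with values that do not fit; this sketch leaves that case out.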