Ecosyste.ms: Issues

An open API service providing issue and pull request metadata for open source projects.

GitHub / Dao-AILab/flash-attention issues and pull requests

#1222 - Plan to support V100

Issue - State: closed - Opened by hiker-lw 5 months ago - 2 comments

#1220 - Unable to install flash-attention in Docker

Issue - State: open - Opened by shivance 5 months ago

#1219 - Additive Bias in Flash Attention

Issue - State: open - Opened by kkh517 5 months ago

#1218 - Support sliding window attention in FA3

Issue - State: open - Opened by lin-ht 5 months ago - 3 comments
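
FA2 already exposes sliding-window (local) attention through its window_size parameter; this issue asks for the same option in FA3. A minimal sketch of the existing FA2 call, with illustrative shapes:

```python
import torch
from flash_attn import flash_attn_func

q = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)
# Each query attends to at most 256 tokens to its left and none to its right,
# i.e. a causal local window.
out = flash_attn_func(q, k, v, causal=True, window_size=(256, 0))
```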

#1216 - ONNX export issue

Issue - State: open - Opened by scuizhibin 5 months ago

#1214 - [FA3][Varlen] bug for head_dim not in [64, 128, 256] for varlen

Issue - State: open - Opened by YLGH 5 months ago - 1 comment

#1212 - Question about FA3 supporting (256, 256)

Issue - State: open - Opened by YTianZHU 5 months ago

#1210 - Add q, k, v descales to FA3 interface

Pull Request - State: closed - Opened by cyanguwa 5 months ago

#1209 - [rfc][torch.compile] Make custom kernels torch.compile compatible

Pull Request - State: closed - Opened by anijain2305 5 months ago - 1 comment

#1207 - Pipelining GmemCopy on kHeadDim

Issue - State: open - Opened by phantaurus 5 months ago - 6 comments

#1206 - feat: change minimum supported CUDA version to 11.7

Pull Request - State: closed - Opened by jue-jue-zi 5 months ago

#1205 - Increase TensorCore Active % for Flash Attention Kernels

Issue - State: closed - Opened by phantaurus 5 months ago - 6 comments

#1204 - TiledMMA scales KNWarps times on the M dimension

Issue - State: closed - Opened by phantaurus 5 months ago - 2 comments

#1203 - [AMD] Triton Backend for ROCm

Pull Request - State: closed - Opened by micmelesse 5 months ago - 6 comments

#1202 - Abnormal execution time / Mismatch of FLOPs obtained from Nsys / Ncu

Issue - State: closed - Opened by phantaurus 5 months ago - 3 comments

#1199 - Is the bf16 datatype available for FA3?

Issue - State: closed - Opened by YTianZHU 5 months ago - 1 comment

#1198 - Support page kvcache in AMD ROCm

Pull Request - State: closed - Opened by rocking5566 5 months ago - 2 comments

#1197 - Add local attention in Hopper FAv3

Pull Request - State: closed - Opened by ipiszy 5 months ago - 1 comment

#1195 - Bug in RotaryEmbed Kernel

Issue - State: open - Opened by tianyan01 5 months ago

#1193 - Install flash-attn error on Windows 11

Issue - State: open - Opened by AbsoluteMode 5 months ago - 5 comments

#1192 - Fix a wrong reference to seqlen_k variable in the varlen kernel

Pull Request - State: closed - Opened by cakeng 5 months ago

#1189 - Sync compile flags with CK Tile for ROCm 6.2

Pull Request - State: closed - Opened by rocking5566 5 months ago

#1188 - flash_attn_varlen: support tree attention

Pull Request - State: open - Opened by efsotr 5 months ago - 4 comments

#1187 - Will there be an XPU implementation?

Issue - State: closed - Opened by yasha1255 5 months ago - 1 comment

#1185 - FA3 RuntimeError: q must be on CUDA

Issue - State: closed - Opened by GMALP 5 months ago - 1 comment

#1184 - flash_attn_with_kvcache has a performance issue with torch 2.5.0

Issue - State: open - Opened by jianc99 5 months ago - 1 comment
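
For context, a minimal single-decode-step sketch of flash_attn_with_kvcache as exposed by flash-attn 2.x; batch size, head counts, and cache length are illustrative:

```python
import torch
from flash_attn import flash_attn_with_kvcache

batch, nheads, headdim, max_cache = 2, 8, 128, 1024
q = torch.randn(batch, 1, nheads, headdim, device="cuda", dtype=torch.float16)
k_cache = torch.zeros(batch, max_cache, nheads, headdim, device="cuda", dtype=torch.float16)
v_cache = torch.zeros_like(k_cache)
k_new = torch.randn(batch, 1, nheads, headdim, device="cuda", dtype=torch.float16)
v_new = torch.randn_like(k_new)
# Number of valid entries per cached sequence; k_new/v_new are appended in place.
cache_seqlens = torch.full((batch,), 17, dtype=torch.int32, device="cuda")

out = flash_attn_with_kvcache(q, k_cache, v_cache, k=k_new, v=v_new,
                              cache_seqlens=cache_seqlens, causal=True)
# out: (batch, 1, nheads, headdim)
```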

#1183 - A question about BetterTransformer, Flash Attention, and Nvidia TensorRT

Issue - State: closed - Opened by bzr1 5 months ago - 1 comment

#1182 - Add seqused_q in fwd / bwd and seqused_k in bwd in hopper FA.

Pull Request - State: closed - Opened by ipiszy 5 months ago

#1177 - [Feature] FA2 support for attention mask (shape: (seq_len, seq_len))

Issue - State: closed - Opened by efsotr 5 months ago - 3 comments

#1169 - FP8 for flash attention 3 and possible concerns

Issue - State: open - Opened by TheTinyTeddy 6 months ago - 8 comments

#1166 - Add support for qk hidden dim different from v hidden dim

Pull Request - State: open - Opened by smallscientist1 6 months ago - 5 comments

#1156 - google/gemma-2-2b

Issue - State: closed - Opened by mhillebrand 6 months ago - 4 comments

#1146 - CUDA Error: no kernel image is available for execution on the device

Issue - State: closed - Opened by qiuqiu10 6 months ago - 5 comments

#1142 - How can I install with CUDA 12.1?

Issue - State: open - Opened by tian969 6 months ago - 2 comments

#1139 - Add custom ops for compatibility with PT Compile

Pull Request - State: closed - Opened by ani300 6 months ago - 19 comments
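
Both this PR and #1209 follow the same general pattern: register the opaque CUDA kernels as PyTorch custom ops so torch.compile can trace over them without graph breaks. A hedged sketch of that mechanism (PyTorch >= 2.4; the op name and body here are stand-ins, not the PR's actual code):

```python
import torch

@torch.library.custom_op("mylib::attn_proxy", mutates_args=())
def attn_proxy(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Stand-in for a real kernel call such as flash_attn_func.
    return torch.nn.functional.scaled_dot_product_attention(q, k, v)

@attn_proxy.register_fake
def _(q, k, v):
    # Shape/dtype propagation so the compiler can trace without running the kernel.
    return torch.empty_like(q)

compiled = torch.compile(lambda q, k, v: attn_proxy(q, k, v))
```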

#1138 - Is FA3 less accurate than FA2 in bf16 computation?

Issue - State: closed - Opened by complexfilter 6 months ago - 4 comments

#1137 - How to obtain differentiable softmax_lse

Issue - State: open - Opened by albert-cwkuo 6 months ago - 8 comments
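
softmax_lse is the row-wise log-sum-exp of the attention logits; flash_attn_func can return it (return_attn_probs=True), but as a raw side output rather than a differentiable graph node. A differentiable value can be recomputed in plain PyTorch at the cost of materializing the logits. A sketch, assuming a (batch, nheads, seqlen, headdim) layout:

```python
import torch

def softmax_lse_reference(q, k, softmax_scale=None):
    # q, k: (batch, nheads, seqlen, headdim)
    scale = softmax_scale if softmax_scale is not None else q.shape[-1] ** -0.5
    logits = torch.einsum("bhqd,bhkd->bhqk", q, k) * scale
    return torch.logsumexp(logits, dim=-1)  # (batch, nheads, seqlen_q), differentiable
```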

#1136 - FA3 unit test fails

Issue - State: closed - Opened by zhipeng93 6 months ago - 2 comments

#1134 - block scaling support not found

Issue - State: open - Opened by complexfilter 6 months ago - 4 comments

#1128 - Flash attn 3 has large numerical mismatches with torch SDPA

Issue - State: open - Opened by Fuzzkatt 6 months ago - 8 comments

#1125 - FA2's flash_attn_varlen_func is 300x slower than flash_attn_func

Issue - State: open - Opened by ex3ndr 6 months ago - 6 comments
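
For reference, the two call paths being compared: flash_attn_varlen_func takes one packed (total_tokens, nheads, headdim) tensor described by cumulative sequence lengths. A sketch with equal-length sequences so both calls compute the same thing (shapes illustrative); one commonly reported cause of such slowdowns is rebuilding the cu_seqlens offsets on the host every step rather than the kernel itself:

```python
import torch
from flash_attn import flash_attn_func, flash_attn_varlen_func

batch, seqlen, nheads, headdim = 4, 512, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

out_fixed = flash_attn_func(q, k, v, causal=True)

# Same data packed into a single ragged batch.
q_p, k_p, v_p = (t.reshape(batch * seqlen, nheads, headdim) for t in (q, k, v))
cu_seqlens = torch.arange(0, (batch + 1) * seqlen, seqlen,
                          dtype=torch.int32, device="cuda")
out_varlen = flash_attn_varlen_func(q_p, k_p, v_p, cu_seqlens, cu_seqlens,
                                    seqlen, seqlen, causal=True)
```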

#1122 - Install flash-attn 2 with CUDA 12: flash-attn is looking for CUDA 11

Issue - State: open - Opened by YerongLi 6 months ago - 7 comments

#1121 - Does Flash Attention 3 fp8 support the 4090?

Issue - State: open - Opened by huanpengchu 6 months ago - 2 comments

#1112 - Add how to import FA3 to documentation.

Pull Request - State: closed - Opened by AdamLouly 6 months ago - 1 comment
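
Per the FA3 beta instructions, the Hopper implementation is built from the repo's hopper/ subdirectory and imported under a different name than FA2's flash_attn package. A hedged sketch, assuming that build:

```python
import torch
from flash_attn_interface import flash_attn_func  # FA3 (hopper/) interface

q = torch.randn(1, 128, 8, 128, device="cuda", dtype=torch.bfloat16)
k, v = torch.randn_like(q), torch.randn_like(q)
# Depending on the FA3 version, this returns either `out` or `(out, softmax_lse)`.
out = flash_attn_func(q, k, v, causal=True)
```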

#1107 - [QST] flash_attn2: why is tOrVt not swizzled?

Issue - State: open - Opened by itsliupeng 6 months ago - 3 comments

#1106 - [QST] How does flash-attn calculate dropout?

Issue - State: closed - Opened by zhang22222 6 months ago - 3 comments

#1094 - There is no cu123 for PyTorch, only cu124

Issue - State: open - Opened by nasyxx 7 months ago - 6 comments

#1075 - Changes For FP8

Pull Request - State: closed - Opened by ganeshcolfax 7 months ago

#1072 - Add var-seq-len to FA3 fp16 / bf16 fwd

Pull Request - State: closed - Opened by ipiszy 7 months ago - 1 comment

#1043 - High memory requirements when compiling

Issue - State: open - Opened by haampie 7 months ago - 5 comments

#1038 - Building flash-attn takes a lot of time

Issue - State: open - Opened by Sayli2000 7 months ago - 16 comments

#1036 - Windows actions

Pull Request - State: open - Opened by bdashore3 7 months ago - 3 comments

#1035 - How to debug?

Issue - State: closed - Opened by Achazwl 7 months ago - 2 comments

#1028 - Failed to build flash-attn

Issue - State: open - Opened by xiaoyerrr 7 months ago - 2 comments

#1026 - Could not build wheels for flash-attn

Issue - State: open - Opened by FiReTiTi 7 months ago - 6 comments

#1017 - build failure

Issue - State: open - Opened by alxmke 7 months ago - 9 comments

#1009 - Availability of wheel

Issue - State: open - Opened by nikonikolov 8 months ago - 2 comments

#1007 - Unable to build wheel of flash_attn

Issue - State: open - Opened by Zer0TheObserver 8 months ago - 2 comments

#1004 - Flash attention is broken for CUDA 12.x

Issue - State: open - Opened by Bhagyashreet20 8 months ago - 3 comments

#991 - Error in Algorithm 1 of Flash Attention 2 paper

Issue - State: open - Opened by mbchang 8 months ago - 2 comments
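
For readers checking the algebra, the recurrence in question is the online-softmax update at the heart of Algorithm 1: a running row max m, a running denominator l, and a partial output o that is rescaled whenever m grows. A reference sketch (single head, no masking, block size illustrative):

```python
import torch

def attention_online(q, k, v, block=128):
    # q: (seqlen_q, d); k, v: (seqlen_k, d)
    scale = q.shape[-1] ** -0.5
    m = torch.full((q.shape[0], 1), float("-inf"))   # running row max
    l = torch.zeros(q.shape[0], 1)                   # running softmax denominator
    o = torch.zeros_like(q)                          # unnormalized partial output
    for j in range(0, k.shape[0], block):
        s = (q @ k[j:j + block].T) * scale           # one block of logits
        m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
        p = torch.exp(s - m_new)                     # block numerator
        alpha = torch.exp(m - m_new)                 # rescale factor for old state
        l = alpha * l + p.sum(dim=-1, keepdim=True)
        o = alpha * o + p @ v[j:j + block]
        m = m_new
    return o / l                                     # equals softmax(q @ k.T * scale) @ v
```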

#986 - Has anyone successfully used flash_attn on Jetson?

Issue - State: open - Opened by cthulhu-tww 8 months ago - 10 comments

#980 - [Draft] support qk head_dim different from vo head_dim

Pull Request - State: open - Opened by defei-coder 8 months ago - 2 comments

#978 - Fix +/-inf in LSE returned by forward

Pull Request - State: open - Opened by sgrigory 8 months ago - 3 comments

#977 - Apple Silicon Support

Issue - State: open - Opened by chigkim 8 months ago - 2 comments

#969 - How to install flash_attn with torch==2.1.0

Issue - State: open - Opened by foreverpiano 8 months ago - 5 comments