Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / Dao-AILab/flash-attention issues and pull requests

#1315 - Why do we iteratively arrive at barrier_O?

Issue - State: open - Opened by ziyuhuang123 3 months ago - 4 comments

#1313 - Package is uninstallable

Issue - State: open - Opened by chrisspen 3 months ago - 1 comment

#1312 - Operation Error: /usr/bin/ld: cannot find -lcuda

Issue - State: open - Opened by ying123ww 3 months ago - 1 comment

#1311 - Varlen flash attention: CUDA illegal memory access

Issue - State: open - Opened by clessig 3 months ago - 13 comments

#1310 - Flash Attention 3 benchmark for H20 Hopper

Issue - State: closed - Opened by aftersnow 3 months ago - 9 comments

#1309 - Looking for compatible version

Issue - State: open - Opened by mahmoodn 3 months ago - 1 comment

#1307 - ROCm compilation error with PyTorch 2.5.1

Issue - State: open - Opened by calebthomas259 3 months ago - 4 comments

#1306 - Result mismatch with headdim=256 bwd

Issue - State: open - Opened by zidanehuang001 3 months ago - 5 comments

#1305 - Make namespace comment consistent

Pull Request - State: closed - Opened by ngocson2vn 3 months ago

#1304 - Question about disabling the causal mask

Issue - State: closed - Opened by volcverse 3 months ago - 2 comments

#1302 - whl for torch 2.5.0

Issue - State: open - Opened by Galaxy-Husky 3 months ago - 4 comments

#1301 - Does flash_attn_with_kvcache return block_lse or attention_score?

Issue - State: open - Opened by NonvolatileMemory 3 months ago - 2 comments

#1299 - using `out` argument will change the output

Issue - State: open - Opened by youkaichao 3 months ago

#1298 - Different number of KV and Q tokens

Issue - State: closed - Opened by kilianhaefeli 3 months ago - 2 comments

#1297 - Promote wheels as alternative to pip install flash-attn

Pull Request - State: open - Opened by simonw 3 months ago - 4 comments

#1296 - Failure to initialize the TMA descriptor for head_dim of 192

Issue - State: closed - Opened by NiuMa-1234 3 months ago - 2 comments

#1295 - Build stuck on torch2.5.0

Issue - State: open - Opened by ycformal 4 months ago - 15 comments

#1294 - Any plan for varlen fwd to support Hopper FP8?

Issue - State: closed - Opened by pengwu22 4 months ago

#1293 - Request for New Release with PT Compile Ops

Issue - State: open - Opened by kostum123 4 months ago

#1292 - Support for CUDA 12.4 and above? URGENT PERHAPS?

Issue - State: open - Opened by BBC-Esq 4 months ago - 7 comments

#1291 - Support different shape attention mask

Issue - State: open - Opened by SunzeY 4 months ago

#1289 - CUTLASS 3.5.1 makes Flash Attention 3 slower?

Issue - State: open - Opened by fno2010 4 months ago - 4 comments

#1285 - Fix compilation with clang on ARM64

Pull Request - State: closed - Opened by sclarkson 4 months ago

#1284 - Feat: Add support for PyTorch 2.5 in workflows

Pull Request - State: closed - Opened by NanoCode012 4 months ago - 5 comments

#1281 - Softcap for FlashAttention v3

Issue - State: open - Opened by Jeff-Zilence 4 months ago - 1 comment

#1279 - Fix copy-paste error in hopper tests

Pull Request - State: closed - Opened by milesvant 4 months ago

#1278 - Unable to import my new kernel function after compilation success.

Issue - State: open - Opened by jpli02 4 months ago - 2 comments

#1274 - Does FA2 support 4D attention mask?

Issue - State: open - Opened by XiangTodayEatsWhat 4 months ago

#1273 - Why does the flash attention fp8 kernel use fp16 for the output?

Issue - State: closed - Opened by cccddd77 4 months ago

#1272 - Six Flash-Attention-3 unit tests fail on H20

Issue - State: closed - Opened by cailun01 4 months ago - 5 comments

#1268 - Paged Attention support for FA3

Pull Request - State: closed - Opened by kadeng 4 months ago - 3 comments

#1266 - Where in the code is the inter-warp policy demonstrated?

Issue - State: open - Opened by ziyuhuang123 4 months ago - 4 comments

#1265 - Intra-Warpgroup Overlapping GEMMs and Softmax in FA3

Issue - State: closed - Opened by ziyuhuang123 4 months ago - 7 comments

#1264 - flash-attention

Issue - State: open - Opened by 21X5122 4 months ago - 1 comment

#1263 - FlashAttention3 support for forward pass with kv cache

Issue - State: open - Opened by jorgeantonio21 4 months ago - 1 comment

#1262 - No module named moe_kernel in Flash Attention Installation

Issue - State: closed - Opened by abhasin14 4 months ago - 1 comment

#1261 - Speeding up exp with lookup tables?

Issue - State: open - Opened by ethansmith2000 4 months ago - 3 comments

#1260 - Questions about calculating the number of HBM accesses

Issue - State: closed - Opened by uniqueness 4 months ago - 5 comments

#1258 - Is TileShape_MNK shape 128, 176, 80, 192 kind of strange?

Issue - State: closed - Opened by ziyuhuang123 4 months ago - 1 comment

#1257 - Can I print a value within a function (like the load function)?

Issue - State: closed - Opened by ziyuhuang123 4 months ago - 4 comments

#1256 - Why do we have a mask in the non-causal case?

Issue - State: open - Opened by ziyuhuang123 4 months ago - 2 comments

#1255 - how to remove softmax operations?

Issue - State: closed - Opened by ziyuhuang123 4 months ago - 3 comments

#1254 - FA3 varlen_bwd hangs (FA2 works in the same case)

Issue - State: open - Opened by goldhuang 4 months ago - 2 comments

#1253 - Why does attn_ref use fp32 in fwd but fp16/bf16 in bwd?

Issue - State: open - Opened by muoshuosha 4 months ago - 4 comments

#1252 - Look into sequence packing

Issue - State: closed - Opened by alex-hh 4 months ago

#1251 - Does dropout in FA3 need to be fixed?

Issue - State: closed - Opened by jundaf2 4 months ago - 4 comments

#1250 - Runtime error from transformers

Issue - State: open - Opened by HarryK4673 4 months ago - 2 comments

#1249 - Difference between FusedMLP and MLP?

Issue - State: open - Opened by prmudgal 4 months ago - 4 comments

#1248 - Where is the flash-decoding second-stage (reduce) code?

Issue - State: open - Opened by liuqi123123 4 months ago - 9 comments

#1244 - Why do we have an all_reduce with wrong backward?

Issue - State: closed - Opened by zhuzilin 5 months ago - 1 comment

#1243 - CUDA versions > 12.3 do not correctly compile H100 Flash Attention 3

Issue - State: open - Opened by rohany 5 months ago - 1 comment

#1241 - b

Issue - State: closed - Opened by rgitfiletransfer 5 months ago

#1240 - Fix FAv3 compilation with MSVC

Pull Request - State: closed - Opened by hlky 5 months ago

#1239 - Sync api change for ROCm Flash attention

Pull Request - State: closed - Opened by rocking5566 5 months ago

#1238 - 【HDIM=96】head dim = 96 ?

Issue - State: open - Opened by SunNy820828449 5 months ago

#1237 - Minify `torch.torch.int32` to `torch.int32` in Bert

Pull Request - State: closed - Opened by imShZh 5 months ago

#1236 - FA3 kvcache + split kv + gqa parallelization

Pull Request - State: closed - Opened by jayhshah 5 months ago

#1235 - When can we get flash-attention 2.x for Turing GPUs?

Issue - State: open - Opened by eileen2003-w 5 months ago - 3 comments

#1233 - Add local attention in Hopper FAv3

Pull Request - State: closed - Opened by ipiszy 5 months ago

#1232 - fp8 not enabled for mha_varlen_fwd

Issue - State: open - Opened by goldhuang 5 months ago

#1231 - [BUG] 2 tests failed...?

Issue - State: open - Opened by ziyuhuang123 5 months ago

#1230 - Turing architecture error on Nvidia Quadro T1000

Issue - State: open - Opened by Tortoise17 5 months ago - 2 comments

#1229 - ERROR [12/13] RUN pip install flash-attn --no-build-isolation

Issue - State: open - Opened by promaprogga 5 months ago - 1 comment

#1228 - Avoid padding computation with `cu_seqlens`

Issue - State: open - Opened by imoneoi 5 months ago - 3 comments

#1227 - ImportError: fused_dense is not installed

Issue - State: open - Opened by kanebay 5 months ago - 2 comments

#1226 - PyTorch 2.4.1 with flash-attn 2.5.8

Issue - State: closed - Opened by adtian2 5 months ago - 2 comments