Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / Dao-AILab/flash-attention issues and pull requests

#1420 - Is there a plan to support flash_attn_varlen_backward with fp8

Issue - State: open - Opened by gaodaheng about 1 month ago - 1 comment

#1419 - Add a macro for namespace

Pull Request - State: closed - Opened by drisspg about 1 month ago

#1418 - Encounter some problems when building wheel

Issue - State: open - Opened by ZarkPanda about 1 month ago

#1416 - [CK_TILE] FAv3 bwd bugfix

Pull Request - State: closed - Opened by poyenc about 2 months ago

#1415 - RuntimeError: Error compiling objects for extension

Issue - State: open - Opened by ProgramerSalar about 2 months ago - 2 comments

#1412 - UnboundLocalError: local variable 'out' referenced before assignment

Issue - State: closed - Opened by chuangzhidan about 2 months ago - 6 comments

#1411 - Can't intall it

Issue - State: open - Opened by TherrenceF about 2 months ago - 1 comment

#1410 - Impact of Register Spills on FA3 Kernel Performance

Issue - State: closed - Opened by ziyuhuang123 about 2 months ago - 1 comment

#1409 - FA 2.4.2 is falling unitest on A6000 and A5880

Issue - State: open - Opened by BoxiangW about 2 months ago - 5 comments

#1406 - fix bug when is_grad is false

Pull Request - State: closed - Opened by woaixiaoxiao about 2 months ago

#1405 - Add missing tests/__init__.py

Pull Request - State: open - Opened by BioGeek about 2 months ago

#1404 - 4 Failing `test_flash_attn_output_fp8` tests on H100

Issue - State: open - Opened by BioGeek about 2 months ago - 3 comments

#1403 - Does bar.sync Emit Semaphores Alongside bar.arrive?

Issue - State: closed - Opened by ziyuhuang123 about 2 months ago - 1 comment

#1402 - is flash_attn_with_kvcache() supposed to work for seqlen > 1 ?

Issue - State: closed - Opened by vince62s about 2 months ago - 1 comment

#1401 - Understanding sync and arrive in FA3 Store Function

Issue - State: open - Opened by ziyuhuang123 about 2 months ago

#1400 - Understanding the Role of arrive in NamedBarrier Synchronization

Issue - State: open - Opened by ziyuhuang123 about 2 months ago - 1 comment

#1399 - Fix incorrect torch dtype

Pull Request - State: closed - Opened by kevmo314 about 2 months ago

#1397 - check torch.is_grad_enabled before calling customer flash atten ops

Pull Request - State: closed - Opened by XiaobingSuper about 2 months ago - 5 comments

#1396 - Why Doesn't FlashAttention3 Allow KV and O to Share Memory Space?

Issue - State: open - Opened by ziyuhuang123 about 2 months ago - 1 comment

#1394 - Create PEP 517 build metadata

Pull Request - State: closed - Opened by frostming about 2 months ago - 1 comment

#1394 - Create PEP 517 build metadata

Pull Request - State: open - Opened by frostming about 2 months ago

#1393 - Add hipBLAS/cuBLAS distinction in benchmark_gemm.py

Pull Request - State: closed - Opened by garrettbyrd about 2 months ago

#1392 - fix a bug (issue #1390) caused by typo

Pull Request - State: closed - Opened by liguohao96 about 2 months ago - 1 comment

#1392 - fix a bug (issue #1390) caused by typo

Pull Request - State: open - Opened by liguohao96 about 2 months ago

#1391 - Large loss of accuracy between flashattention and native

Issue - State: open - Opened by fanfanaaaa about 2 months ago - 3 comments

#1391 - Large loss of accuracy between flashattention and native

Issue - State: open - Opened by fanfanaaaa about 2 months ago - 4 comments

#1390 - a small typo and fix

Issue - State: open - Opened by liguohao96 about 2 months ago - 3 comments

#1388 - Windows 11 Installation Error

Issue - State: open - Opened by 404-xianjin about 2 months ago

#1387 - FA-3 installation errors

Issue - State: closed - Opened by asahni04 about 2 months ago - 1 comment

#1386 - is fwd_kvcache compatible with torch.compile in 2.7.2post1 ?

Issue - State: open - Opened by vince62s about 2 months ago - 6 comments

#1385 - How to get actual col idx

Issue - State: open - Opened by wenkechen 2 months ago

#1384 - Support dedicated compile[For Research]

Pull Request - State: open - Opened by AllenDou 2 months ago

#1382 - Fix deprecation warnings

Pull Request - State: open - Opened by rongou 2 months ago

#1379 - Possible to install with just `torch` installed?

Issue - State: closed - Opened by davidmezzetti 2 months ago - 6 comments

#1378 - seq_lens variable used in the attention kernel

Issue - State: closed - Opened by chakpongchung 2 months ago - 1 comment

#1377 - Flash attention 3 does not use Dropout_p?

Issue - State: open - Opened by nighting0le01 2 months ago - 6 comments

#1375 - FA3 for cuda12.3 but torch only releases cuda 12.4 version

Issue - State: closed - Opened by wplf 2 months ago - 2 comments

#1374 - Headdim==96 in FA3

Issue - State: closed - Opened by wplf 2 months ago - 2 comments

#1372 - Why we have a third barrier::QueryEmpty arrive?

Issue - State: open - Opened by ziyuhuang123 2 months ago - 1 comment

#1369 - GLT

Issue - State: open - Opened by deepgandu 2 months ago

#1368 - The byzantine copy of Tensor O

Issue - State: closed - Opened by phantaurus 2 months ago - 4 comments

#1365 - Change {q,k,v}_descale to be per-batch-element

Pull Request - State: closed - Opened by ericauld 2 months ago

#1364 - Is there any way to compile the codes with nvcc debug flag(-G)?

Issue - State: open - Opened by Dev-Jahn 2 months ago - 6 comments

#1361 - Fix FA3 Varlen Performance regression

Pull Request - State: closed - Opened by kadeng 2 months ago

#1360 - Need `tests/__init__.py` for `hopper/test_flash_attn.py`

Issue - State: open - Opened by hancheolcho 3 months ago - 2 comments

#1359 - Output Discrepancy Between FlashAttention and PyTorch Attention

Issue - State: closed - Opened by pengzhangzhi 3 months ago - 2 comments

#1357 - How to get attention score? "return_attn_probs=True" is not work.

Issue - State: closed - Opened by UnableToUseGit 3 months ago - 3 comments

#1357 - How to get attention score? "return_attn_probs=True" is not work.

Issue - State: closed - Opened by UnableToUseGit 3 months ago - 1 comment

#1354 - Flashdecoding with appendKV might incorrect

Issue - State: open - Opened by DD-DuDa 3 months ago

#1353 - Added a Benchmark for Rotary and Improved Rotary Performance

Pull Request - State: closed - Opened by alexkranias-amd 3 months ago - 1 comment

#1352 - FP8 test failure on the latest 'decode' branch

Issue - State: closed - Opened by cscyuge 3 months ago - 1 comment

#1350 - How could I use a query to calculate the attention with multiple k-v

Issue - State: closed - Opened by DongyuXu77 3 months ago - 1 comment

#1349 - Question of the equation in Flash Attention 2 Paper

Issue - State: open - Opened by jeffrey-sunh1 3 months ago - 5 comments

#1347 - breaking change for head size non divisble by 8

Issue - State: closed - Opened by felix-red-panda 3 months ago - 1 comment

#1346 - RuntimeError: Error compiling objects for extension

Issue - State: closed - Opened by beyondguo 3 months ago - 5 comments

#1345 - [Q] why flash attention MFU is over 100% in A800

Issue - State: closed - Opened by wonderisland 3 months ago

#1344 - [Bug] Potential hazard in epilogue when kUseVarSeqLen=true

Issue - State: closed - Opened by QiZhangNV 3 months ago - 2 comments

#1343 - FA3 Failed to initialize the TMA descriptor

Issue - State: open - Opened by li-yi-dong 3 months ago