Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / thu-ml/sageattention issues and pull requests
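The metadata shown in the listing below can also be pulled programmatically from the Ecosyste.ms issues API. The following is a minimal sketch only: the base URL, endpoint path, repository identifier, and response field names are assumptions and should be checked against the API documentation before use.

import requests

# Minimal sketch (assumptions): base URL, endpoint shape, and field names are
# not confirmed by this listing; consult the Ecosyste.ms API docs before use.
BASE_URL = "https://issues.ecosyste.ms/api/v1"  # assumed base URL

# Assumed endpoint shape: issues for one repository on a given host.
url = f"{BASE_URL}/hosts/GitHub/repositories/thu-ml%2FSageAttention/issues"

resp = requests.get(url, params={"per_page": 50}, timeout=30)
resp.raise_for_status()

for issue in resp.json():
    # Expected fields mirror the listing below: number, state, title,
    # opener, and comment count (field names assumed).
    print(issue.get("number"), issue.get("state"), issue.get("title"))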
#93 - sageattn_qk_int8_pv_fp16_cuda black output with pv_accum fp16, results in black screen in opensora
Issue -
State: open - Opened by nighting0le01 8 days ago
- 2 comments
#92 - Alibi or window Attention Support
Issue -
State: open - Opened by asahni04 9 days ago
#91 - (windows): Add ushort type definition for CUDA compilation for Sageattention 2.x
Pull Request -
State: open - Opened by Panchovix 10 days ago
- 2 comments
#90 - Help: when I use SageAttention==1.0.6, I get a 9% speedup but a 7% accuracy drop
Issue -
State: open - Opened by shushanxingzhe 11 days ago
- 4 comments
#89 - Import fails on ComfyUI
Issue -
State: open - Opened by deadman3000 12 days ago
#88 - Incompatible CUDA & torch version
Issue -
State: closed - Opened by ET823828 13 days ago
- 2 comments
#87 - How could I get the code for "SAGEAttn-vT" in the paper?
Issue -
State: closed - Opened by Ezio-csm 15 days ago
- 4 comments
#86 - Question about how to get the attention weight matrix for Q@K^T
Issue -
State: closed - Opened by shiniannian 16 days ago
- 1 comment
#85 - Cannot run SageAttention 2.0.1 on CC 8.9 (4090D) with RuntimeError
Issue -
State: closed - Opened by wzw1994 21 days ago
- 4 comments
#84 - Got an error, unable to install
Issue -
State: open - Opened by u-madara 22 days ago
- 2 comments
#83 - Example of accelerated inference for the glm4 model
Issue -
State: open - Opened by xiaoxue-roy 22 days ago
#82 - end to end performance
Issue -
State: closed - Opened by raskolnikov028 27 days ago
- 1 comment
#81 - Error installing SageAttention2 on Ubuntu
Issue -
State: open - Opened by tahseensheik about 1 month ago
#80 - Rename folder 'resource' in repo to avoid error when running on Windows
Issue -
State: closed - Opened by lyxkilo about 1 month ago
- 1 comment
#79 - Install Assistance Multiple Errors - ComfyUI Standalone
Issue -
State: open - Opened by NC17z about 1 month ago
- 1 comment
#78 - Cannot install on RTX 6000 Passive (Turing compute)
Issue -
State: open - Opened by Nemesis-the-Warlock about 1 month ago
- 1 comment
#77 - The cuda extension is never used for 3090s?
Issue -
State: closed - Opened by Ph0rk0z about 1 month ago
- 3 comments
#76 - 2.0.1 release
Pull Request -
State: closed - Opened by jason-huang03 about 1 month ago
#75 - Long sequence error
Issue -
State: closed - Opened by l1cacheDell about 1 month ago
- 9 comments
#74 - Compilation with full CUDA graphs (without breaks)
Issue -
State: open - Opened by bm-synth about 1 month ago
- 18 comments
Labels: enhancement
#73 - Graph break warning when used with torch.compile.
Issue -
State: closed - Opened by akedia about 1 month ago
- 1 comment
#72 - End To End performance example and problem with batch size
Issue -
State: closed - Opened by SamuraiBUPT about 1 month ago
- 3 comments
#71 - SageAttention support for vLLM?
Issue -
State: open - Opened by Zachary-ai-engineer about 2 months ago
#70 - SageAttention 1.0 runs slower than FA2 on A100.
Issue -
State: open - Opened by foreverpiano about 2 months ago
- 12 comments
#69 - Poor performance on L40S, no `int4`.
Issue -
State: closed - Opened by bm-synth about 2 months ago
- 3 comments
#68 - How much time will we save if we use sageattn_cogvideo.py instead of original_cogvideo.py
Issue -
State: closed - Opened by rxmao about 2 months ago
- 3 comments
#67 - Regarding the issue of installing the 2.0.0 source code
Issue -
State: closed - Opened by MikeAiJF about 2 months ago
- 1 comment
#66 - Possibility of supporting Pascal
Issue -
State: open - Opened by sorasoras about 2 months ago
- 4 comments
#65 - Issue Compiling on Windows
Issue -
State: open - Opened by RichyRich515 about 2 months ago
- 1 comment
#64 - MultiheadAttention conversion
Issue -
State: open - Opened by Maritime-Moon about 2 months ago
- 10 comments
#63 - SageAttention support for GPU T4?
Issue -
State: open - Opened by ivankxt about 2 months ago
- 1 comment
#62 - sageattn_qk_int8_pv_fp16_cuda only supports head_dim [64, 128]
Issue -
State: closed - Opened by wangshankun about 2 months ago
- 2 comments
#61 - Launch error at 4090 for sageattn_qk_int8_pv_fp8_cuda
Issue -
State: open - Opened by Andy0422 about 2 months ago
- 13 comments
Labels: bug
#60 - support for pytorch autograd
Issue -
State: open - Opened by bghira about 2 months ago
- 3 comments
Labels: enhancement
#59 - How does the performance of SageAttention compare to that of FA3 on Hopper GPUs?
Issue -
State: open - Opened by alexngng about 2 months ago
- 2 comments
Labels: enhancement
#58 - Support for flux?
Issue -
State: closed - Opened by ali-afridi26 about 2 months ago
- 3 comments
#57 - How to fairly compare FA2 vs. SageAttention?
Issue -
State: closed - Opened by zhangxin81 about 2 months ago
- 1 comment
#56 - Question about per-warp quantization in the SageAttention2 paper
Issue -
State: closed - Opened by wzw1994 2 months ago
- 3 comments
#55 - LLM accuracy problem
Issue -
State: closed - Opened by laomao0 2 months ago
- 10 comments
Labels: bug
#54 - Question about performance of qwen2-vl on A10
Issue -
State: open - Opened by gxm651182644 2 months ago
- 5 comments
#53 - SageAttention PV8 issue: unspecified launch failure
Issue -
State: closed - Opened by BirdChristopher 2 months ago
- 1 comment
#52 - Where is the 4bit attention API?
Issue -
State: open - Opened by BirdChristopher 2 months ago
- 7 comments
#51 - Turing support?
Issue -
State: open - Opened by Ph0rk0z 2 months ago
- 2 comments
#50 - Parallel SageAttention Inference
Pull Request -
State: closed - Opened by DefTruth 2 months ago
- 3 comments
#49 - sageattn 2.0 compiles successfully but errors at runtime; the triton version runs fine
Issue -
State: closed - Opened by DefTruth 2 months ago
- 3 comments
#48 - speed issue compared with sdpa, fa1
Issue -
State: closed - Opened by 2877992943 2 months ago
- 3 comments
#47 - Only `head_dim==128` and `head_dim==64` are supported?
Issue -
State: closed - Opened by xmfbit 2 months ago
- 3 comments
#46 - To allow Windows compilation of sageattention v2: Update math.cuh
Pull Request -
State: open - Opened by EnragedAntelope 2 months ago
- 2 comments
#45 - some questions about SageAttention
Issue -
State: closed - Opened by hermosayhl 2 months ago
- 4 comments
#44 - 1.0.5 errors with bf16 CogVideoX: AssertionError: First input (fp16) and second input (bf16) must have the same dtype!
Issue -
State: closed - Opened by kijai 2 months ago
- 10 comments
#43 - Could you provide some FA examples to illustrate the improvement in FA2?
Issue -
State: closed - Opened by RyeYuan 2 months ago
- 9 comments
#42 - AssertionError
Issue -
State: closed - Opened by Maritime-Moon 3 months ago
- 10 comments
#41 - Getting triton compilation errors when calculating attention
Issue -
State: closed - Opened by JohnnyRacer 3 months ago
- 2 comments
#40 - add cuda kernel for per block and per warp quantization
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#39 - update license
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#38 - v1.0.4
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#37 - Please create a ComfyUI node to use SageAttention
Issue -
State: open - Opened by wardensc2 3 months ago
- 4 comments
#36 - add sageattn_varlen support
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#35 - better support for hd96
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#34 - Support returning LSE for sequence parallel
Issue -
State: closed - Opened by jason-huang03 3 months ago
Labels: enhancement
#33 - [update] support non-contiguous input, different qo_len and kv_len, HND and NHD layout, group query attention
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#32 - Sageattention in flux
Issue -
State: closed - Opened by todochenxi 3 months ago
- 4 comments
#31 - Are you planning to provide a varlen and bnsd API?
Issue -
State: closed - Opened by tlogn 3 months ago
- 8 comments
Labels: enhancement
#30 - initialize l_i as zeros
Pull Request -
State: closed - Opened by feifeibear 3 months ago
#29 - Why does SageAttention show a clear performance gain on RTX 4090 and RTX 3090?
Issue -
State: closed - Opened by MeJerry215 3 months ago
- 1 comment
#28 - Compatible with other quantization methods
Issue -
State: closed - Opened by chenchunhui97 3 months ago
- 2 comments
#27 - Q matrix quantization
Issue -
State: closed - Opened by liangan1 3 months ago
- 1 comment
#26 - Got an incorrect result when the seq_length of q does not equal that of k/v
Issue -
State: closed - Opened by beegerous 3 months ago
- 7 comments
Labels: enhancement
#25 - q_kernel_per_block_int8 error in distributed settings.
Issue -
State: closed - Opened by feifeibear 3 months ago
#24 - Why divide by ln 2 when quantizing the Q value?
Issue -
State: closed - Opened by MeJerry215 3 months ago
- 1 comment
#23 - All-black videos are generated for Open-Sora-Plan when using SageAttention
Issue -
State: closed - Opened by littletomatodonkey 3 months ago
- 4 comments
#22 - Real acceleration benefits
Issue -
State: closed - Opened by lswzjuer 3 months ago
- 2 comments
#21 - Why does running Llama inference on A10 give wrong answers?
Issue -
State: closed - Opened by MeJerry215 3 months ago
- 4 comments
#20 - Can SageAttention be made available on AMD GPUs?
Issue -
State: closed - Opened by guanchenl 3 months ago
- 2 comments
#19 - NaN appears when using sageattn
Issue -
State: closed - Opened by Pydataman 3 months ago
- 6 comments
#18 - Notation error in Equation (2)
Issue -
State: closed - Opened by Coco58323 3 months ago
- 1 comment
#17 - Would you support other head_dim values?
Issue -
State: closed - Opened by v4if 3 months ago
- 2 comments
Labels: enhancement
#16 - Other SageAttention Kernels
Issue -
State: closed - Opened by Andy0422 3 months ago
- 5 comments
Labels: enhancement
#15 - Do you plan to integrate this algorithm into the vllm project?
Issue -
State: open - Opened by Alienfeel 3 months ago
- 2 comments
#14 - Ran into some compatibility issues
Issue -
State: closed - Opened by otoTree 3 months ago
- 5 comments
#13 - Can you provide an example for LLaMA?
Issue -
State: closed - Opened by jyweky 3 months ago
- 1 comment
#12 - Question about INT8 v.s. FP8
Issue -
State: closed - Opened by lingffff 4 months ago
- 1 comment