Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / thu-ml/sageattention issues and pull requests

#92 - Alibi or window Attention Support

Issue - State: open - Opened by asahni04 9 days ago

#91 - (windows): Add ushort type definition for CUDA compilation for Sageattention 2.x

Pull Request - State: open - Opened by Panchovix 10 days ago - 2 comments

#89 - Imports fails on ComfyUI

Issue - State: open - Opened by deadman3000 12 days ago

#88 - incompatible CUDA&torch version

Issue - State: closed - Opened by ET823828 13 days ago - 2 comments

#87 - How could I get the code for "SAGEAttn-vT" in the paper?

Issue - State: closed - Opened by Ezio-csm 15 days ago - 4 comments

#86 - Question about how to get the attention weight matrix for Q@K^T

Issue - State: closed - Opened by shiniannian 16 days ago - 1 comment

#85 - Can not run SageAttention 2.0.1 in CC89 (4090D)with Runtime Error

Issue - State: closed - Opened by wzw1994 21 days ago - 4 comments

#84 - Got an error, cannot install

Issue - State: open - Opened by u-madara 22 days ago - 2 comments

#83 - Example of accelerated inference for the glm4 model

Issue - State: open - Opened by xiaoxue-roy 22 days ago

#82 - end to end performance

Issue - State: closed - Opened by raskolnikov028 27 days ago - 1 comment

#81 - Error installing SageAttention2 on Ubuntu

Issue - State: open - Opened by tahseensheik about 1 month ago

#80 - Rename folder 'resource' in repo to avoid error when running on Windows

Issue - State: closed - Opened by lyxkilo about 1 month ago - 1 comment

#79 - Install Assistance Multiple Errors - ComfyUI Standalone

Issue - State: open - Opened by NC17z about 1 month ago - 1 comment

#78 - can not install on rtx 6000 passive - turing compute

Issue - State: open - Opened by Nemesis-the-Warlock about 1 month ago - 1 comment

#77 - The cuda extension is never used for 3090s?

Issue - State: closed - Opened by Ph0rk0z about 1 month ago - 3 comments

#76 - 2.0.1 release

Pull Request - State: closed - Opened by jason-huang03 about 1 month ago

#75 - Long sequence error

Issue - State: closed - Opened by l1cacheDell about 1 month ago - 9 comments

#74 - Compilation with full CUDA graphs (without breaks)

Issue - State: open - Opened by bm-synth about 1 month ago - 18 comments
Labels: enhancement

#73 - Graph break warning when using with torch compile.

Issue - State: closed - Opened by akedia about 1 month ago - 1 comment

#72 - End To End performance example and problem with batch size

Issue - State: closed - Opened by SamuraiBUPT about 1 month ago - 3 comments

#71 - SageAttention support for vLLM?

Issue - State: open - Opened by Zachary-ai-engineer about 2 months ago

#70 - Sageattention1.0 runs slower than FA2 on A100.

Issue - State: open - Opened by foreverpiano about 2 months ago - 12 comments

#69 - Poor performance on L40S, no `int4`.

Issue - State: closed - Opened by bm-synth about 2 months ago - 3 comments

#67 - Regarding the issue of installing the 2.0.0 source code

Issue - State: closed - Opened by MikeAiJF about 2 months ago - 1 comment

#66 - Possibilities of support Pascal

Issue - State: open - Opened by sorasoras about 2 months ago - 4 comments

#65 - Issue Compiling on Windows

Issue - State: open - Opened by RichyRich515 about 2 months ago - 1 comment

#64 - MultiheadAttention conversion

Issue - State: open - Opened by Maritime-Moon about 2 months ago - 10 comments

#63 - SageAttention support for GPU T4?

Issue - State: open - Opened by ivankxt about 2 months ago - 1 comment

#62 - sageattn_qk_int8_pv_fp16_cuda only support head_dim [64, 128]

Issue - State: closed - Opened by wangshankun about 2 months ago - 2 comments

#61 - Launch error at 4090 for sageattn_qk_int8_pv_fp8_cuda

Issue - State: open - Opened by Andy0422 about 2 months ago - 13 comments
Labels: bug

#60 - support for pytorch autograd

Issue - State: open - Opened by bghira about 2 months ago - 3 comments
Labels: enhancement

#59 - How the performance of Sage Attention compares to that of FA3 on Hopper GPUs?

Issue - State: open - Opened by alexngng about 2 months ago - 2 comments
Labels: enhancement

#58 - Support for flux?

Issue - State: closed - Opened by ali-afridi26 about 2 months ago - 3 comments

#57 - How to fairely compare fa2 vs sageattention?

Issue - State: closed - Opened by zhangxin81 about 2 months ago - 1 comment

#56 - Question about per-warp quantization in the SageAttention2 paper

Issue - State: closed - Opened by wzw1994 2 months ago - 3 comments

#55 - LLM acc problem

Issue - State: closed - Opened by laomao0 2 months ago - 10 comments
Labels: bug

#54 - Question about performance of qwen2-vl on A10

Issue - State: open - Opened by gxm651182644 2 months ago - 5 comments

#53 - SageAttention PV8 issue: unspecified launch failure

Issue - State: closed - Opened by BirdChristopher 2 months ago - 1 comment

#52 - Where is the 4bit attention API?

Issue - State: open - Opened by BirdChristopher 2 months ago - 7 comments

#51 - Turning support?

Issue - State: open - Opened by Ph0rk0z 2 months ago - 2 comments

#50 - Parallel SageAttention Inference

Pull Request - State: closed - Opened by DefTruth 2 months ago - 3 comments

#49 - sageattn 2.0 compiles successfully but errors at runtime; the Triton version runs normally

Issue - State: closed - Opened by DefTruth 2 months ago - 3 comments

#48 - speed issue compared with sdpa, fa1

Issue - State: closed - Opened by 2877992943 2 months ago - 3 comments

#47 - Only `head_dim==128` and `headdim==64` are supported?

Issue - State: closed - Opened by xmfbit 2 months ago - 3 comments

#46 - To allow Windows compilation of sageattention v2: Update math.cuh

Pull Request - State: open - Opened by EnragedAntelope 2 months ago - 2 comments

#45 - some questions about SageAttention

Issue - State: closed - Opened by hermosayhl 2 months ago - 4 comments

#43 - Could you provide some FA examples to illustrate the improvement in FA2?

Issue - State: closed - Opened by RyeYuan 2 months ago - 9 comments

#42 - AssertionError

Issue - State: closed - Opened by Maritime-Moon 3 months ago - 10 comments

#41 - Getting triton compilation errors when calculating attention

Issue - State: closed - Opened by JohnnyRacer 3 months ago - 2 comments

#40 - add cuda kernel for per block and per warp quantization

Pull Request - State: closed - Opened by jason-huang03 3 months ago

#39 - update license

Pull Request - State: closed - Opened by jason-huang03 3 months ago

#38 - v1.0.4

Pull Request - State: closed - Opened by jason-huang03 3 months ago

#37 - Please create a ComfyUI node to use SageAttention

Issue - State: open - Opened by wardensc2 3 months ago - 4 comments

#36 - add sageattn_varlen support

Pull Request - State: closed - Opened by jason-huang03 3 months ago

#35 - better support for hd96

Pull Request - State: closed - Opened by jason-huang03 3 months ago

#34 - Suppose return LSE for sequence parallel

Issue - State: closed - Opened by jason-huang03 3 months ago
Labels: enhancement

#32 - Sageattention in flux

Issue - State: closed - Opened by todochenxi 3 months ago - 4 comments

#31 - Are you planning to provide a varlen and bnsd API?

Issue - State: closed - Opened by tlogn 3 months ago - 8 comments
Labels: enhancement

#30 - initialize l_i as zeros

Pull Request - State: closed - Opened by feifeibear 3 months ago

#28 - compatible with other quantization methos

Issue - State: closed - Opened by chenchunhui97 3 months ago - 2 comments

#27 - Q matrix quantization

Issue - State: closed - Opened by liangan1 3 months ago - 1 comment

#26 - got result error when seq_length of q not equals to k/v

Issue - State: closed - Opened by beegerous 3 months ago - 7 comments
Labels: enhancement

#25 - q_kernel_per_block_int8 error in distributed settings.

Issue - State: closed - Opened by feifeibear 3 months ago

#24 - Why divide ln 2 in quantiation Q value?

Issue - State: closed - Opened by MeJerry215 3 months ago - 1 comment

#22 - Real accelerated benefits

Issue - State: closed - Opened by lswzjuer 3 months ago - 2 comments

#21 - Why Running Llama infer in A10 get Wrong answer?

Issue - State: closed - Opened by MeJerry215 3 months ago - 4 comments

#20 - Can SageAttention available on AMD GPUs?

Issue - State: closed - Opened by guanchenl 3 months ago - 2 comments

#19 - exist nan when using sageattn

Issue - State: closed - Opened by Pydataman 3 months ago - 6 comments

#18 - Notation error in Equation (2)

Issue - State: closed - Opened by Coco58323 3 months ago - 1 comment

#17 - Would support other headdim

Issue - State: closed - Opened by v4if 3 months ago - 2 comments
Labels: enhancement

#16 - Other SageAttention Kenerls

Issue - State: closed - Opened by Andy0422 3 months ago - 5 comments
Labels: enhancement

#15 - Do you plan to integrate this algorithm into the vllm project?

Issue - State: open - Opened by Alienfeel 3 months ago - 2 comments

#14 - Encountered some compatibility issues

Issue - State: closed - Opened by otoTree 3 months ago - 5 comments

#13 - Can you provide an example for LLaMA?

Issue - State: closed - Opened by jyweky 3 months ago - 1 comment

#12 - Question about INT8 v.s. FP8

Issue - State: closed - Opened by lingffff 4 months ago - 1 comment