Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / thu-ml/sageattention issues and pull requests
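The metadata shown in the listing below can also be pulled programmatically from the Ecosyste.ms issues API. The following is a minimal sketch only: the base URL, endpoint path, repository identifier, and response field names are assumptions and should be checked against the API documentation before use.

import requests

# Minimal sketch (assumptions): base URL, endpoint shape, and field names are
# not confirmed by this listing; consult the Ecosyste.ms API docs before use.
BASE_URL = "https://issues.ecosyste.ms/api/v1"  # assumed base URL

# Assumed endpoint shape: issues for one repository on a given host.
url = f"{BASE_URL}/hosts/GitHub/repositories/thu-ml%2FSageAttention/issues"

resp = requests.get(url, params={"per_page": 50}, timeout=30)
resp.raise_for_status()

for issue in resp.json():
    # Expected fields mirror the listing below: number, state, title,
    # opener, and comment count (field names assumed).
    print(issue.get("number"), issue.get("state"), issue.get("title"))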
#93 - sageattn_qk_int8_pv_fp16_cuda black output with pv_accum fp16, results in black screen in opensora
Issue -
State: open - Opened by nighting0le01 8 days ago
- 2 comments
#92 - Alibi or window Attention Support
Issue -
State: open - Opened by asahni04 9 days ago
#91 - (windows): Add ushort type definition for CUDA compilation for Sageattention 2.x
Pull Request -
State: open - Opened by Panchovix 10 days ago
- 2 comments
#90 - Help: when I use SageAttention==1.0.6, I get a 9% speedup but a 7% accuracy drop
Issue -
State: open - Opened by shushanxingzhe 11 days ago
- 4 comments
#89 - Import fails on ComfyUI
Issue -
State: open - Opened by deadman3000 12 days ago
#88 - Incompatible CUDA & torch version
Issue -
State: closed - Opened by ET823828 13 days ago
- 2 comments
#87 - How could I get the code for "SAGEAttn-vT" in the paper?
Issue -
State: closed - Opened by Ezio-csm 15 days ago
- 4 comments
#86 - Question about how to get the attention weight matrix for Q@K^T
Issue -
State: closed - Opened by shiniannian 16 days ago
- 1 comment
#85 - Cannot run SageAttention 2.0.1 on CC 8.9 (4090D) with RuntimeError
Issue -
State: closed - Opened by wzw1994 21 days ago
- 4 comments
#84 - Got an error, unable to install
Issue -
State: open - Opened by u-madara 22 days ago
- 2 comments
#83 - Example of accelerated inference for the glm4 model
Issue -
State: open - Opened by xiaoxue-roy 22 days ago
#82 - end to end performance
Issue -
State: closed - Opened by raskolnikov028 27 days ago
- 1 comment
#81 - Error installing SageAttention2 on Ubuntu
Issue -
State: open - Opened by tahseensheik about 1 month ago
#80 - Rename folder 'resource' in repo to avoid error when running on Windows
Issue -
State: closed - Opened by lyxkilo about 1 month ago
- 1 comment
#79 - Install Assistance Multiple Errors - ComfyUI Standalone
Issue -
State: open - Opened by NC17z about 1 month ago
- 1 comment
#78 - Cannot install on RTX 6000 Passive (Turing compute)
Issue -
State: open - Opened by Nemesis-the-Warlock about 1 month ago
- 1 comment
#77 - The cuda extension is never used for 3090s?
Issue -
State: closed - Opened by Ph0rk0z about 1 month ago
- 3 comments
#76 - 2.0.1 release
Pull Request -
State: closed - Opened by jason-huang03 about 1 month ago
#75 - Long sequence error
Issue -
State: closed - Opened by l1cacheDell about 1 month ago
- 9 comments
#74 - Compilation with full CUDA graphs (without breaks)
Issue -
State: open - Opened by bm-synth about 1 month ago
- 18 comments
Labels: enhancement
#73 - Graph break warning when used with torch.compile.
Issue -
State: closed - Opened by akedia about 1 month ago
- 1 comment
#72 - End To End performance example and problem with batch size
Issue -
State: closed - Opened by SamuraiBUPT about 1 month ago
- 3 comments
#71 - SageAttention support for vLLM?
Issue -
State: open - Opened by Zachary-ai-engineer about 2 months ago
#70 - SageAttention 1.0 runs slower than FA2 on A100.
Issue -
State: open - Opened by foreverpiano about 2 months ago
- 12 comments
#69 - Poor performance on L40S, no `int4`.
Issue -
State: closed - Opened by bm-synth about 2 months ago
- 3 comments
#68 - How much time will we save if we use sageattn_cogvideo.py instead of original_cogvideo.py
Issue -
State: closed - Opened by rxmao about 2 months ago
- 3 comments
#67 - Regarding the issue of installing the 2.0.0 source code
Issue -
State: closed - Opened by MikeAiJF about 2 months ago
- 1 comment
#66 - Possibility of supporting Pascal
Issue -
State: open - Opened by sorasoras about 2 months ago
- 4 comments
#65 - Issue Compiling on Windows
Issue -
State: open - Opened by RichyRich515 about 2 months ago
- 1 comment
#64 - MultiheadAttention conversion
Issue -
State: open - Opened by Maritime-Moon about 2 months ago
- 10 comments
#63 - SageAttention support for GPU T4?
Issue -
State: open - Opened by ivankxt about 2 months ago
- 1 comment
#62 - sageattn_qk_int8_pv_fp16_cuda only supports head_dim [64, 128]
Issue -
State: closed - Opened by wangshankun about 2 months ago
- 2 comments
#61 - Launch error at 4090 for sageattn_qk_int8_pv_fp8_cuda
Issue -
State: open - Opened by Andy0422 about 2 months ago
- 13 comments
Labels: bug
#60 - support for pytorch autograd
Issue -
State: open - Opened by bghira about 2 months ago
- 3 comments
Labels: enhancement
#59 - How does the performance of SageAttention compare to that of FA3 on Hopper GPUs?
Issue -
State: open - Opened by alexngng about 2 months ago
- 2 comments
Labels: enhancement
#58 - Support for flux?
Issue -
State: closed - Opened by ali-afridi26 about 2 months ago
- 3 comments
#57 - How to fairly compare FA2 vs. SageAttention?
Issue -
State: closed - Opened by zhangxin81 about 2 months ago
- 1 comment
#56 - Question about per-warp quantization in the SageAttention2 paper
Issue -
State: closed - Opened by wzw1994 2 months ago
- 3 comments
#55 - LLM accuracy problem
Issue -
State: closed - Opened by laomao0 2 months ago
- 10 comments
Labels: bug
#54 - Question about performance of qwen2-vl on A10
Issue -
State: open - Opened by gxm651182644 2 months ago
- 5 comments
#53 - SageAttention PV8 issue: unspecified launch failure
Issue -
State: closed - Opened by BirdChristopher 2 months ago
- 1 comment
#52 - Where is the 4bit attention API?
Issue -
State: open - Opened by BirdChristopher 2 months ago
- 7 comments
#51 - Turing support?
Issue -
State: open - Opened by Ph0rk0z 2 months ago
- 2 comments
#50 - Parallel SageAttention Inference
Pull Request -
State: closed - Opened by DefTruth 2 months ago
- 3 comments
#49 - sageattn 2.0 compiles successfully but errors at runtime; the triton version runs fine
Issue -
State: closed - Opened by DefTruth 2 months ago
- 3 comments
#48 - speed issue compared with sdpa, fa1
Issue -
State: closed - Opened by 2877992943 2 months ago
- 3 comments
#47 - Only `head_dim==128` and `head_dim==64` are supported?
Issue -
State: closed - Opened by xmfbit 2 months ago
- 3 comments
#46 - To allow Windows compilation of sageattention v2: Update math.cuh
Pull Request -
State: open - Opened by EnragedAntelope 2 months ago
- 2 comments
#45 - some questions about SageAttention
Issue -
State: closed - Opened by hermosayhl 2 months ago
- 4 comments
#44 - 1.0.5 errors with bf16 CogVideoX: AssertionError: First input (fp16) and second input (bf16) must have the same dtype!
Issue -
State: closed - Opened by kijai 2 months ago
- 10 comments
#43 - Could you provide some FA examples to illustrate the improvement in FA2?
Issue -
State: closed - Opened by RyeYuan 2 months ago
- 9 comments
#42 - AssertionError
Issue -
State: closed - Opened by Maritime-Moon 3 months ago
- 10 comments
#41 - Getting triton compilation errors when calculating attention
Issue -
State: closed - Opened by JohnnyRacer 3 months ago
- 2 comments
#40 - add cuda kernel for per block and per warp quantization
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#39 - update license
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#38 - v1.0.4
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#37 - Please create a ComfyUI node to use SageAttention
Issue -
State: open - Opened by wardensc2 3 months ago
- 4 comments
#36 - add sageattn_varlen support
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#35 - better support for hd96
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#34 - Support returning LSE for sequence parallel
Issue -
State: closed - Opened by jason-huang03 3 months ago
Labels: enhancement
#33 - [update] support non-contiguous input, different qo_len and kv_len, HND and NHD layout, group query attention
Pull Request -
State: closed - Opened by jason-huang03 3 months ago
#32 - Sageattention in flux
Issue -
State: closed - Opened by todochenxi 3 months ago
- 4 comments
#31 - Are you planning to provide a varlen and bnsd API?
Issue -
State: closed - Opened by tlogn 3 months ago
- 8 comments
Labels: enhancement
#30 - initialize l_i as zeros
Pull Request -
State: closed - Opened by feifeibear 3 months ago
#29 - Why does SageAttention show a clear performance gain on RTX 4090 and RTX 3090?
Issue -
State: closed - Opened by MeJerry215 3 months ago
- 1 comment
#28 - Compatible with other quantization methods
Issue -
State: closed - Opened by chenchunhui97 3 months ago
- 2 comments
#27 - Q matrix quantization
Issue -
State: closed - Opened by liangan1 3 months ago
- 1 comment
#26 - Got an incorrect result when the seq_length of q does not equal that of k/v
Issue -
State: closed - Opened by beegerous 3 months ago
- 7 comments
Labels: enhancement
#25 - q_kernel_per_block_int8 error in distributed settings.
Issue -
State: closed - Opened by feifeibear 3 months ago
#24 - Why divide by ln 2 when quantizing the Q value?
Issue -
State: closed - Opened by MeJerry215 3 months ago
- 1 comment
#23 - All-black videos are generated for Open-Sora-Plan when using SageAttention
Issue -
State: closed - Opened by littletomatodonkey 3 months ago
- 4 comments
#22 - Real acceleration benefits
Issue -
State: closed - Opened by lswzjuer 3 months ago
- 2 comments
#21 - Why does running Llama inference on A10 give wrong answers?
Issue -
State: closed - Opened by MeJerry215 3 months ago
- 4 comments
#20 - Can SageAttention be made available on AMD GPUs?
Issue -
State: closed - Opened by guanchenl 3 months ago
- 2 comments
#19 - NaN appears when using sageattn
Issue -
State: closed - Opened by Pydataman 3 months ago
- 6 comments
#18 - Notation error in Equation (2)
Issue -
State: closed - Opened by Coco58323 3 months ago
- 1 comment
#17 - Would you support other head_dim values?
Issue -
State: closed - Opened by v4if 3 months ago
- 2 comments
Labels: enhancement
#16 - Other SageAttention Kernels
Issue -
State: closed - Opened by Andy0422 3 months ago
- 5 comments
Labels: enhancement
#15 - Do you plan to integrate this algorithm into the vllm project?
Issue -
State: open - Opened by Alienfeel 3 months ago
- 2 comments
#14 - Ran into some compatibility issues
Issue -
State: closed - Opened by otoTree 3 months ago
- 5 comments
#13 - Can you provide an example for LLaMA?
Issue -
State: closed - Opened by jyweky 3 months ago
- 1 comment
#12 - Question about INT8 v.s. FP8
Issue -
State: closed - Opened by lingffff 4 months ago
- 1 comment