Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / jy-yuan/KIVI issues and pull requests
#28 - Multi GPUs
Issue - State: open - Opened by yisunlp 23 days ago - 5 comments
#27 - Unable to Reproduce Results for LongBench
Issue - State: open - Opened by ilil96 26 days ago - 2 comments
#26 - How can the code support 1-bit quantization?
Issue - State: closed - Opened by yuhuixu1993 29 days ago - 2 comments
#25 - Develop
Pull Request - State: open - Opened by Davids048 about 2 months ago
#24 - ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)
Issue - State: open - Opened by xzwj1699 3 months ago
#23 - How to understand the code: triton_quantize_and_pack_along_last_dim(value_states_full[:, :, :1, :].contiguous(), self.group_size, self.v_bits)
Issue - State: closed - Opened by chenyehuang 3 months ago - 3 comments
#22 - Difference between gemv_forward_cuda and gemv_forward_cuda_outer_rim?
Issue - State: closed - Opened by yifeikong 3 months ago - 2 comments
#21 - Add missing flash_attn_func import in llama_kivi model
Pull Request - State: closed - Opened by yifeikong 3 months ago
#20 - NameError: name 'flash_attn_func' is not defined
Issue - State: closed - Opened by zwhong714 3 months ago - 1 comment
#19 - [FIX] use flash attention in example.py
Pull Request - State: closed - Opened by Davids048 3 months ago
#18 - The difference in batch size leads to different results in LongBench testing
Issue - State: open - Opened by Felixvillas 4 months ago - 5 comments
#17 - Running example.py with llama2-7B-hf only saves 500MB of KV cache memory compared to base transformers?
Issue - State: open - Opened by riou-chen 4 months ago - 2 comments
#16 - CUDA version
Issue - State: closed - Opened by hensiesp32 4 months ago - 4 comments
#15 - Why is model inference slow when KIVI is applied to Mistral-7B-Instruct-v0.2?
Issue - State: closed - Opened by lichongod 4 months ago - 7 comments
#14 - Where is the falcon_kivi?
Issue - State: closed - Opened by Felixvillas 5 months ago - 4 comments
#13 - Which commit of lm-eval-harness is the lmeval branch based on?
Issue - State: closed - Opened by condy0919 5 months ago - 3 comments
#12 - An error occurred while using evaluate.load("act_match")
Issue - State: closed - Opened by Felixvillas 5 months ago - 1 comment
#11 - Which file do I need to run to obtain the result in Figure 4?
Issue - State: closed - Opened by Felixvillas 5 months ago - 2 comments
#10 - Evaluation not supported with ROCm
Issue - State: open - Opened by ym-guan 5 months ago - 1 comment
#9 - Support for ChatGLM3
Issue - State: open - Opened by redscv 5 months ago - 1 comment
#8 - Provide an accuracy testing interface?
Issue - State: closed - Opened by ascendpoet 5 months ago - 1 comment
#7 - Discrepancy in Reproduced Results for LLaMA2 on "qmsum" and "qasper" tasks
Issue - State: closed - Opened by ilur98 5 months ago - 2 comments
#6 - W/ or w/o weight quantization?
Issue - State: closed - Opened by deephanson94 5 months ago - 4 comments
#5 - [fix] add the missing comma in pyproject.toml to enable correct pip i…
Pull Request - State: closed - Opened by wln20 6 months ago - 1 comment
#4 - Integrate KIVI into inference frameworks?
Issue - State: closed - Opened by andakai 6 months ago - 1 comment
#3 - LlamaConfig.attention_dropout does not exist in transformers==4.35.2
Issue - State: closed - Opened by RalphMao 6 months ago - 1 comment
#2 - Could you please open-source the code for the calculation and visualization of the statistic information of the KV Cache?
Issue - State: closed - Opened by wln20 7 months ago - 3 comments
#1 - Can this be used with any autoregressive model?
Issue - State: closed - Opened by hello-fri-end 7 months ago - 1 comment