Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / mit-han-lab/streaming-llm issues and pull requests
#88 - why recompute can differ from window attention?
Issue -
State: open - Opened by habaohaba about 1 month ago
#87 - im confused with the PPL of sliding window with recomputation
Issue -
State: open - Opened by coderwayne3025 about 1 month ago
#86 - Can you provide the code related to the visualization in the paper?
Issue -
State: open - Opened by micelvrice 2 months ago
#85 - [question] Does streaming-llm focus on accelerating decoding stage? How about the prefilling stage?
Issue -
State: open - Opened by Code24Man 4 months ago
#84 - Tokenizer issue with Transformers 4.33.0
Issue -
State: open - Opened by PedemonteGiacomo 5 months ago
#83 - Evaluation code and dataset release inquiry
Issue -
State: open - Opened by DerrickYLJ 5 months ago
#82 - How to visualize attention logits?
Issue -
State: closed - Opened by OStars 5 months ago
- 1 comment
#81 - what is the difference between window attention and sliding window recomputation
Issue -
State: closed - Opened by seeyourcell 6 months ago
#80 - Progressively decreasing attention windows
Issue -
State: open - Opened by Vorlent 6 months ago
#79 - Using LLaVA model
Issue -
State: open - Opened by JesseZZZZZ 6 months ago
#78 - why `max_gen_len` is needed when considering `space_needed`?
Issue -
State: open - Opened by Mr-lonely0 8 months ago
#77 - How to evaluate ppl?
Issue -
State: open - Opened by Jiawei-Yang 8 months ago
- 2 comments
#76 - StreamEval
Issue -
State: open - Opened by Zhangchaoran000 10 months ago
#75 - Support mistral-7b?
Issue -
State: open - Opened by spring1915 10 months ago
#74 - Run with start_size=0 looks just fine
Issue -
State: open - Opened by cyr0930 10 months ago
#73 - question about positions encoding when apply ROLLING KV CACHE WITH ATTENTION SINKS
Issue -
State: closed - Opened by bugm 11 months ago
- 1 comment
#72 - Error happened
Issue -
State: open - Opened by ForrestPi 11 months ago
- 2 comments
#71 - Questions about ARC datasets
Issue -
State: open - Opened by Zoeyyao27 11 months ago
#70 - How much GPU memory needed to run example ?
Issue -
State: open - Opened by fangming-he 12 months ago
- 3 comments
#69 - Is there the way of parallel prompt ?
Issue -
State: open - Opened by DavideHe 12 months ago
#68 - Question about attention sink arising in pretrained models
Issue -
State: open - Opened by kevinli573 12 months ago
#67 - Request for Code and Details on Figures 2 and 7
Issue -
State: open - Opened by ZhouZineng 12 months ago
#66 - Questions Related to the Application and Results of Attention Sinks After the Paper
Issue -
State: open - Opened by dsdanielpark 12 months ago
#65 - Questions Regarding "Sink Tokens"
Issue -
State: open - Opened by clarenceluo78 about 1 year ago
#64 - Doubts in "run_streaming_llama.py" file
Issue -
State: open - Opened by Rishab9991 about 1 year ago
#63 - Question about Naive Sliding Window
Issue -
State: closed - Opened by kevinli573 about 1 year ago
- 2 comments
#62 - why starting sink token is not a special token '\n'?
Issue -
State: closed - Opened by dhcode-cpp about 1 year ago
- 2 comments
#61 - Results for Section 3.2 Rolling KV Cache (Without Pretraining)
Issue -
State: open - Opened by timljj about 1 year ago
- 1 comment
#60 - The position id for q
Issue -
State: open - Opened by ofhwei about 1 year ago
- 1 comment
#59 - The reason for the importance of the initial token.
Issue -
State: open - Opened by freyamom about 1 year ago
#58 - [Feature Request] Support InternLM Model
Issue -
State: open - Opened by vansin about 1 year ago
- 1 comment
#57 - Can support to ChatGLM2?
Issue -
State: open - Opened by KareEnges about 1 year ago
#56 - Enable explictly setting transformer model cache
Pull Request -
State: open - Opened by JiaxuanYou about 1 year ago
#55 - question about Table 1 in paper
Issue -
State: open - Opened by AresXD about 1 year ago
- 1 comment
#54 - question about initial tokens
Issue -
State: open - Opened by chaojiewang94 about 1 year ago
- 2 comments
#53 - While streaming with sinks, how does the framework change the positional encodings of the KV cache without having to multiply with the Key and Value matrices?
Issue -
State: open - Opened by Bhuvanesh09 about 1 year ago
- 4 comments
#52 - Finetuning a model in the streaming mode ?
Issue -
State: closed - Opened by MohamedAliRashad about 1 year ago
- 1 comment
#51 - question about re-computation
Issue -
State: closed - Opened by ysanimals about 1 year ago
- 4 comments
#50 - Implementation of lama2 7b chat hf model
Issue -
State: open - Opened by MuhammadIshaq-AI about 1 year ago
- 7 comments
#49 - Implementing lama2 7b
Issue -
State: closed - Opened by MuhammadIshaq-AI about 1 year ago
#48 - Is code's position wrong with "kv_cache.evict_for_space" ?
Issue -
State: closed - Opened by DavideHe about 1 year ago
- 2 comments
#47 - some question about paper
Issue -
State: closed - Opened by Vincentyua about 1 year ago
- 1 comment
#46 - Does past_key_values be repeatedly compute?
Issue -
State: open - Opened by freyamom about 1 year ago
- 5 comments
#45 - How to use streaming llm to train a new model? is there any sample code . thansk
Issue -
State: closed - Opened by mega-cqz about 1 year ago
- 1 comment
#44 - I'm (A Bit) Suspicious of Table 3.
Issue -
State: closed - Opened by FrederickGeek8 about 1 year ago
- 1 comment
#43 - Questions on the demo results
Issue -
State: closed - Opened by BitCalSaul about 1 year ago
- 2 comments
#42 - Question on intuition of "attention sink" and "alibi PE"
Issue -
State: closed - Opened by bowencohere about 1 year ago
- 3 comments
#41 - Question about long input and difference between streaming-llm and dense attention.
Issue -
State: closed - Opened by hxs91 about 1 year ago
- 2 comments
#40 - RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
Issue -
State: closed - Opened by chnl about 1 year ago
- 2 comments
#39 - Question about evaluation results and demo
Issue -
State: closed - Opened by hsm1997 about 1 year ago
- 2 comments
#38 - How to answer the question in the middle of long input
Issue -
State: open - Opened by yangzhj53 about 1 year ago
#37 - RuntimeError in run_streaming_llama.py When Using Accelerate with Streaming LLMa Model on A4500 GPU
Issue -
State: open - Opened by ZexinLi0w0 about 1 year ago
- 4 comments
#36 - Questions about "Run Streaming Llama Chatbot"
Issue -
State: closed - Opened by ChuanhongLi about 1 year ago
- 3 comments
#35 - Can support to codellama34b?
Issue -
State: closed - Opened by willshion about 1 year ago
- 1 comment
#34 - Can support to Qwen14B?
Issue -
State: closed - Opened by ChenTao98 about 1 year ago
- 1 comment
#33 - Confused with four attention mechanism and their performance mentioned by paper
Issue -
State: closed - Opened by weizhenhuan about 1 year ago
- 5 comments
#32 - The k_seq_dim and v_seq_dim in StartRecentKVCache look related to the type of model
Issue -
State: open - Opened by wangxiaochun520 about 1 year ago
- 2 comments
#31 - Model paths randomly set
Issue -
State: closed - Opened by HyperUpscale about 1 year ago
- 1 comment
#30 - Tested it but saw no speedup; what's going on? (translated from Chinese)
Issue -
State: closed - Opened by xxm1668 about 1 year ago
- 3 comments
#29 - can support to Baichuan2?
Issue -
State: open - Opened by luzhongqiu about 1 year ago
#28 - Is there a ChatGPT-style API for this? (translated from Chinese)
Issue -
State: closed - Opened by xxm1668 about 1 year ago
- 1 comment
#27 - How to generate longer token streams?
Issue -
State: open - Opened by GenTxt about 1 year ago
- 3 comments
#26 - b979594a04f1bbefe1ff21eb8affacef2a186d25
Issue -
State: closed - Opened by ghost about 1 year ago
#25 - Strim
Issue -
State: closed - Opened by ghost about 1 year ago
#24 - Comparison with SWA in Mistral
Issue -
State: open - Opened by casper-hansen about 1 year ago
- 12 comments
#23 - output
Issue -
State: closed - Opened by 21pl about 1 year ago
#22 - wrong
Issue -
State: closed - Opened by QingChengLineOne about 1 year ago
- 3 comments
#21 - add suport codellama
Issue -
State: closed - Opened by willshion about 1 year ago
- 1 comment
#20 - Streaming example: Move input_ids to model device rather than "cuda"
Pull Request -
State: closed - Opened by tomaarsen about 1 year ago
- 1 comment
#19 - hi
Issue -
State: closed - Opened by Kompiuter89 about 1 year ago
#18 - Metal Support
Issue -
State: closed - Opened by jordo1138 about 1 year ago
- 7 comments
#17 - I keep getting a 403 forbidden
Issue -
State: closed - Opened by odfhgodhfighdf about 1 year ago
#16 - Update mt_bench.jsonl
Pull Request -
State: closed - Opened by t562 about 1 year ago
#15 - [Feature Request] Release StreamEval dataset and evaluation code in OpenCompass
Issue -
State: open - Opened by vansin about 1 year ago
- 2 comments
#14 - TypeError: llama_pos_shift_attention_forward() got an unexpected keyword argument 'padding_mask'
Issue -
State: closed - Opened by MartinKratochvilProgramy about 1 year ago
- 4 comments
#13 - Have you run any passkey retrieval tests on streaming-llm?
Issue -
State: open - Opened by RonanKMcGovern about 1 year ago
- 2 comments
#12 - Questions on "streaming-llm" Paper
Issue -
State: closed - Opened by llsj14 about 1 year ago
- 2 comments
#11 - 'CUDA_VISIBLE_DEVICES' is not recognized as an internal or external command, operable program or batch file.
Issue -
State: closed - Opened by IntrovertsBedroom about 1 year ago
- 1 comment
#10 - Convert demo video from MOV to MP4
Pull Request -
State: closed - Opened by cosmojg about 1 year ago
#9 - The video included in the README does not play in Firefox
Issue -
State: closed - Opened by cosmojg about 1 year ago
#8 - Google Colab installation
Issue -
State: closed - Opened by narita63755930 about 1 year ago
- 10 comments
#7 - window_size attention pretrain
Issue -
State: closed - Opened by wawpaopao about 1 year ago
- 3 comments