Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / deftruth/awesome-llm-inference issues and pull requests
#41 - 🔥[Speculative Decoding] Parallel Speculative Decoding with Adaptive Draft Length
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#40 - Update README.md
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#39 - Bump up to v2.0
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#38 - [Token Recycling] Turning Trash into Treasure: Accelerating Inference…
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#37 - 🔥[ABQ-LLM] Arbitrary-Bit Quantized Inference Acceleration for Large Language Models
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#36 - Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#35 - KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#34 - 🔥🔥[Eigen Attention] Attention in Low-Rank Space for KV Cache Compression
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#33 - 🔥🔥[LUT TENSOR CORE] Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#32 - Bump up to v1.9
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#31 - 🔥🔥[500xCompressor] 500xCompressor: Generalized Prompt Compression for…
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#30 - 🔥[Automatic Inference Engine Tuning] Towards SLO-Optimized LLM Servin…
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#29 - 🔥[Zero-Delay QKV Compression] Zero-Delay QKV Compression for Mitigati…
Pull Request -
State: closed - Opened by DefTruth 6 months ago
#28 - 🔥[DynamoLLM] DynamoLLM: Designing LLM Inference Clusters for Performa…
Pull Request -
State: closed - Opened by DefTruth 7 months ago
#27 - Bump up to v1.8
Pull Request -
State: closed - Opened by DefTruth 7 months ago
#26 - 🔥[SentenceVAE] SentenceVAE: Faster, Longer and More Accurate Inferenc…
Pull Request -
State: closed - Opened by DefTruth 7 months ago
#25 - 🔥[Palu] Palu: Compressing KV-Cache with Low-Rank Projection(@nycu.edu…
Pull Request -
State: closed - Opened by DefTruth 7 months ago
#24 - 🔥[flashinfer] FlashInfer: Kernel Library for LLM Serving(@flashinfer-ai)
Pull Request -
State: closed - Opened by DefTruth 7 months ago
#23 - Flashinier
Issue -
State: closed - Opened by milinxiaobo 7 months ago
- 1 comment
#22 - Update README.md
Pull Request -
State: closed - Opened by clevercool 7 months ago
#21 - Add paper "Internal Consistency and Self-Feedback in Large Language Models: A Survey"
Pull Request -
State: closed - Opened by fan2goa1 7 months ago
#20 - add MInference 1.0 from microsoft
Pull Request -
State: closed - Opened by liyucheng09 7 months ago
#19 - [MoA] MoA: Mixture of Sparse Attention for Automatic LLM Compression
Pull Request -
State: closed - Opened by liyucheng09 8 months ago
#18 - Update README.md
Pull Request -
State: closed - Opened by Kthyeon 8 months ago
- 1 comment
#17 - How about wechat group? Let's set up a group
Issue -
State: closed - Opened by HarryWu99 9 months ago
- 2 comments
Labels: stale
#16 - update [Decoding Speculative Decoding] github repo
Pull Request -
State: closed - Opened by KylinC 9 months ago
- 1 comment
#15 - Update README.md
Pull Request -
State: closed - Opened by preminstrel 10 months ago
#14 - Add Microbenchmark
Pull Request -
State: closed - Opened by Miroier 10 months ago
#13 - [KVcache] add "Gear" paper and code of "Keyformer"
Pull Request -
State: closed - Opened by HarryWu99 10 months ago
#12 - add SnapKV
Pull Request -
State: closed - Opened by liyucheng09 10 months ago
#11 - LLMLingua-2
Pull Request -
State: closed - Opened by liyucheng09 10 months ago
#10 - update
Pull Request -
State: closed - Opened by DefTruth 10 months ago
#9 - Add github link for paper FP8-Quantization[2208.09225]
Pull Request -
State: closed - Opened by Mr-Philo 11 months ago
#8 - Add an ICLR paper for KV cache compression
Pull Request -
State: closed - Opened by Janghyun1230 11 months ago
- 1 comment
#7 - add context compression & new papers KV compression
Pull Request -
State: closed - Opened by liyucheng09 11 months ago
#6 - fix typo
Pull Request -
State: closed - Opened by lkm2835 11 months ago
- 1 comment
#5 - New papers: KV Compression/Quant
Issue -
State: closed - Opened by liyucheng09 11 months ago
- 3 comments
#4 - Context compression methods?
Issue -
State: closed - Opened by liyucheng09 11 months ago
- 2 comments
#3 - correct affiliation error
Pull Request -
State: closed - Opened by liyucheng09 11 months ago
- 1 comment
#2 - Update README.md
Pull Request -
State: closed - Opened by HuangLianghong about 1 year ago
- 1 comment
#1 - [Docs] resources handle
Issue -
State: closed - Opened by DefTruth over 1 year ago
- 12 comments