FasterDecoding/Medusa issues and pull requests

#107 - The implementation of stage 2 with axolotl

Issue - State: open - Opened by boxiaowave 4 months ago

#106 - PPL compute

Issue - State: open - Opened by yuyangxie96 4 months ago

#105 - Fix TGI's medusa link

Pull Request - State: open - Opened by fxmarty 4 months ago

#104 - Containerization with Dockerfile to setup medusa

Issue - State: open - Opened by gangooteli 4 months ago

#103 - Fix for removing LM_HEAD and upgrading Medusa v2

Pull Request - State: closed - Opened by tgaddair 4 months ago

#102 - Conversation roles must alternate user/assistant/user/assistant/

Issue - State: open - Opened by gangooteli 4 months ago

#101 - [bug] fix preprocess function

Issue - State: open - Opened by xiezipeng-ML 5 months ago

#100 - Using Medusa with Whisper

Issue - State: open - Opened by AvivSham 5 months ago - 5 comments

#99 - Token-wise the same generalization?

Issue - State: closed - Opened by Ageliss 5 months ago - 2 comments

#98 - ImportError: cannot import name 'is_flash_attn_available' from 'transformers.utils'

Issue - State: open - Opened by imneov 5 months ago - 1 comment

#97 - Creating medusa2.

Pull Request - State: closed - Opened by Narsil 5 months ago - 1 comment

#96 - Is there a bug in gen_model_answer_baseline.py?

Issue - State: open - Opened by qspang 5 months ago - 1 comment

#95 - Medusa Training Loss

Issue - State: open - Opened by TomYang-TZ 6 months ago - 5 comments

#94 - train medusa stage-2

Issue - State: open - Opened by smartliuhw 6 months ago - 1 comment

#93 - mistral.json

Issue - State: open - Opened by Git-L1 6 months ago

#92 - which dataset should i use when training medusa heads with llama2 7b

Issue - State: open - Opened by tu2022 6 months ago

#91 - Can't it support ChatGLM?

Issue - State: open - Opened by PeterXiaTian 6 months ago

#90 - HYDRA support?

Issue - State: open - Opened by arunpatala 6 months ago

#89 - Misleading Name LLM Name MEDUSA

Issue - State: open - Opened by Pittconnect 7 months ago

#88 - about Medusa mask details

Issue - State: closed - Opened by dhcode-cpp 7 months ago

#85 - Why medusa-2 train llama2 with no such great improvement?

Issue - State: open - Opened by MeJerry215 7 months ago - 2 comments

#84 - release medusa-llm v1.0

Issue - State: closed - Opened by zhyncs 7 months ago - 1 comment

#83 - Adding recipe for other models (non llama, non vicuna).

Pull Request - State: closed - Opened by Narsil 7 months ago

#82 - [Dynamic Batching] Concerns about whether features are not supported using Medusa

Issue - State: open - Opened by Ageliss 7 months ago

#81 - Encountered a CUDA error when setting the Medusa head

Issue - State: open - Opened by 1649759610 7 months ago

#80 - Support batch size > 1

Pull Request - State: open - Opened by xwang365 7 months ago

#79 - Why the speed up of Medusa 1 on vicuna changed?

Issue - State: closed - Opened by niyunsheng 8 months ago - 2 comments

#78 - deepspeed support

Issue - State: open - Opened by jiangix-paper 8 months ago

#77 - Is there no way to inference without training?

Issue - State: open - Opened by MoOo2mini 8 months ago - 3 comments

#76 - medusa-2 HF repo has no 'medusa_num_heads' in config

Issue - State: closed - Opened by HaebinShin 8 months ago - 1 comment

#75 - How to use the finetuned mistral model for inference with Medusa

Issue - State: open - Opened by pradeepdev-1995 8 months ago - 7 comments

#74 - Question about Heads warmup

Issue - State: open - Opened by eloooooon 8 months ago - 1 comment

#73 - Medusa 1 and 2 speed up

Issue - State: closed - Opened by LotuSrc 8 months ago - 2 comments

#72 - update Community Adoption for RTP-LLM

Pull Request - State: closed - Opened by zhyncs 8 months ago - 2 comments

#71 - V1.0 prerelease

Pull Request - State: closed - Opened by ctlllll 8 months ago

#70 - Training Medusa heads

Issue - State: open - Opened by mmilunovic-mdcs 8 months ago - 6 comments

#69 - OSError

Issue - State: open - Opened by qspang 8 months ago - 3 comments

#68 - About changing LLM from LLAMA to LLAMA-2

Issue - State: closed - Opened by dydrkfl06 8 months ago - 2 comments

#67 - how did you construct the sparse tree architecture

Issue - State: closed - Opened by pengfeiwu1999 9 months ago - 2 comments

#66 - Clarifications on Models + Batch Size

Issue - State: closed - Opened by RonanKMcGovern 9 months ago - 5 comments

#65 - Can I make an AWQ quantization?

Issue - State: closed - Opened by RonanKMcGovern 9 months ago - 1 comment

#64 - Sparse candidate generation confusion

Issue - State: closed - Opened by zankner 10 months ago - 6 comments

#63 - Some questions about sampling strategy

Issue - State: closed - Opened by qianxiao1111 11 months ago - 3 comments

#62 - Results for different configs

Issue - State: closed - Opened by zankner 11 months ago - 8 comments

#61 - How to load finetune checkpoint files directly?

Issue - State: closed - Opened by qianxiao1111 11 months ago

#60 - AttributeError: 'LlamaForCausalLM' object has no attribute 'medusa_head'

Issue - State: closed - Opened by blwaji 11 months ago - 2 comments

#59 - AttributeError: 'LlamaForCausalLM' object has no attribute 'medusa_head'

Issue - State: closed - Opened by blwaji 11 months ago

#57 - FasterTransformer support

Issue - State: open - Opened by niyunsheng 11 months ago - 1 comment

#56 - Will using this method result in inconsistent output results?

Issue - State: closed - Opened by niyunsheng 11 months ago - 8 comments

#55 - TypeError: __init__() got an unexpected keyword argument 'medusa_num_heads'

Issue - State: closed - Opened by HackGiter 11 months ago - 4 comments

#54 - Mistral 7B model support

Pull Request - State: closed - Opened by JianbangZ 12 months ago - 4 comments

#53 - Llm judge update

Pull Request - State: closed - Opened by leeyeehoo 12 months ago

#52 - [Feature Request] Qwen model support

Issue - State: open - Opened by JianbangZ 12 months ago - 1 comment

#51 - errors occurred when running simple_gradio_interface.py

Issue - State: closed - Opened by MeWannaSleep 12 months ago - 2 comments

#50 - Install the package with the console script ?

Issue - State: closed - Opened by devrimcavusoglu 12 months ago - 1 comment

#49 - How to test latency between medusa & baseline

Issue - State: closed - Opened by YixinSong-e 12 months ago - 3 comments

#48 - name not exist "from medusa.model.medusa_choices import medusa_choices"

Issue - State: closed - Opened by JianbangZ almost 1 year ago - 4 comments

#46 - update roadmap

Pull Request - State: closed - Opened by leeyeehoo about 1 year ago

#45 - CUBLAS_STATUS_EXECUTION_FAILED when training Medusa Head with base model set to Llama2 7B

Issue - State: closed - Opened by void-main about 1 year ago - 7 comments

#42 - Sparse tree

Pull Request - State: closed - Opened by ctlllll about 1 year ago

#41 - vLLM support

Issue - State: open - Opened by MichaelJayW about 1 year ago - 11 comments

#40 - Pull main to sparse_tree

Pull Request - State: closed - Opened by leeyeehoo about 1 year ago

#39 - [New feature] More sampling schemes

Issue - State: closed - Opened by Jokoe66 about 1 year ago - 2 comments
Labels: enhancement

#38 - add development bounty

Pull Request - State: closed - Opened by ctlllll about 1 year ago

#37 - Benchmark results

Issue - State: closed - Opened by JianbangZ about 1 year ago - 3 comments

#36 - [New feature] Fine-tune Medusa heads during SFT

Issue - State: closed - Opened by ctlllll about 1 year ago - 4 comments
Labels: enhancement

#35 - [New feature] llama.cpp support

Issue - State: open - Opened by ctlllll about 1 year ago - 7 comments
Labels: enhancement

#34 - [Research] Explore tree sparsity (speed +10%-20%)

Issue - State: closed - Opened by ctlllll about 1 year ago
Labels: research

#33 - [New feature] mlc-llm support

Issue - State: open - Opened by ctlllll about 1 year ago - 8 comments
Labels: enhancement

#32 - [New feature] exllama support

Issue - State: open - Opened by ctlllll about 1 year ago
Labels: enhancement

#31 - [Inference] IndexError: list index out of range

Issue - State: closed - Opened by helldog-star about 1 year ago - 2 comments

#30 - Fork base model's last two decoder layers

Pull Request - State: closed - Opened by austinsilveria about 1 year ago - 21 comments

#29 - Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding

Issue - State: closed - Opened by YixinSong-e about 1 year ago

#27 - add docstrings

Pull Request - State: closed - Opened by rajveer43 about 1 year ago - 1 comment

#26 - Add an option to override base model path

Pull Request - State: closed - Opened by Btlmd about 1 year ago - 1 comment

#25 - batch support

Issue - State: closed - Opened by thistleknot about 1 year ago - 1 comment

#24 - gguf

Issue - State: closed - Opened by thistleknot about 1 year ago - 1 comment

#23 - Add a simple gradio interface, make life easier

Pull Request - State: closed - Opened by Mrw33554432 about 1 year ago - 2 comments

#22 - update roadmap

Pull Request - State: closed - Opened by leeyeehoo about 1 year ago

#21 - Update ROADMAP.md

Pull Request - State: closed - Opened by leeyeehoo about 1 year ago

#20 - Merge pull request #19 from FasterDecoding/main

Pull Request - State: closed - Opened by leeyeehoo about 1 year ago

#19 - N/A

Pull Request - State: closed - Opened by leeyeehoo about 1 year ago

#18 - finetune vicuna-7b-1.5-16k

Issue - State: closed - Opened by JianbangZ about 1 year ago - 6 comments

#17 - Fix save medusa_head

Pull Request - State: closed - Opened by ctlllll about 1 year ago

#16 - Unable to save `medusa_lm_head.pt` file in default folder

Issue - State: closed - Opened by caiyuhu about 1 year ago - 4 comments

#15 - Add git-lfs instruction

Pull Request - State: closed - Opened by ctlllll about 1 year ago

#14 - OOM question

Issue - State: closed - Opened by helldog-star about 1 year ago - 2 comments

#13 - Is there any document that describes the Medusa inference details and theory?

Issue - State: closed - Opened by YingHH1 about 1 year ago - 4 comments

#12 - train_vicuna_7b.sh script does not install dataset from git-lfs

Issue - State: closed - Opened by guberti about 1 year ago - 2 comments

#11 - AttributeError: module 'collections' has no attribute 'MutableMapping'

Issue - State: closed - Opened by beingPurple about 1 year ago - 2 comments

#10 - Update ROADMAP.md

Pull Request - State: closed - Opened by eltociear about 1 year ago - 1 comment

#9 - Medusa can't find accelerate or bitsandbytes

Issue - State: closed - Opened by beingPurple about 1 year ago - 6 comments

#8 - Support for megatron-llm models

Issue - State: closed - Opened by Jokoe66 about 1 year ago - 2 comments

#7 - Add support for LoRA finetuning for LLaMA-65B

Issue - State: closed - Opened by YingHH1 about 1 year ago - 1 comment

#6 - Questions about the Performance

Issue - State: closed - Opened by Adam1679 about 1 year ago - 1 comment

#5 - Benchmarks

Issue - State: closed - Opened by someone13574 about 1 year ago - 2 comments

#4 - Cleaned up README

Pull Request - State: closed - Opened by kalomaze about 1 year ago

#3 - Roadmap

Issue - State: open - Opened by ctlllll about 1 year ago - 15 comments
Labels: documentation

#2 - Fine-tune?

Issue - State: closed - Opened by loretoparisi about 1 year ago - 1 comment

#1 - Number of heads

Issue - State: closed - Opened by abacaj about 1 year ago - 3 comments
