Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / PKU-YuanGroup/MoE-LLaVA issues and pull requests
#92 - [Question] Train scripts and eval scripts for the non-MoE LLaVA-phi in Table 7 of the paper
Issue -
State: open - Opened by sharkdrop 2 months ago
#91 - [Question] Is there any Moe checkpoint of Qwen1.5 or Qwen2 released?
Issue -
State: open - Opened by double-fire-0 3 months ago
#90 - [Question] How to eval textqa
Issue -
State: open - Opened by fanminshi 3 months ago
- 1 comment
#89 - [Question] Step 3 loss curve
Issue -
State: open - Opened by fanminshi 3 months ago
#88 - [Question] Question about the tokenizer of required pretrained model stabilityai/stablelm-2-1_6
Issue -
State: open - Opened by Taylorfire 4 months ago
- 1 comment
#87 - [Question] In paper Table 6, why variant (d) is better than variant (c)?
Issue -
State: open - Opened by pkumc 4 months ago
#86 - [Feature request] Will more advanced models be trained?
Issue -
State: open - Opened by gesen2egee 5 months ago
#85 - Training of Stage 3: the actual training parameters in the code do not match the paper
Issue -
State: open - Opened by Wuyingwen 5 months ago
- 1 comment
#84 - [Question] What exactly does the language model mean?
Issue -
State: open - Opened by dana-niu 5 months ago
#83 - [Discussion] What is the expert relationship between different layers with the same index? If not, what is the role of figures 4, 5 and 6 in the paper?
Issue -
State: open - Opened by meteorlium 6 months ago
#82 - [Question] ValueError: Unknown image tower: /hy-tmp/LLaVA/clip-vit-large-patch14-336
Issue -
State: open - Opened by FanshuoZeng 6 months ago
- 5 comments
#81 - Can the confidence coefficient of an answer be obtained?
Issue -
State: open - Opened by IsabelJimenez99 6 months ago
#80 - [Question] Inconsistency on MoE Layer Number in paper and model config
Issue -
State: open - Opened by QAQdev 7 months ago
#79 - [Usage] ADD windows support for more exposure
Issue -
State: open - Opened by mr-lab 7 months ago
#78 - can you please give me python script to use API with your demo ?
Issue -
State: open - Opened by gamesubzero 7 months ago
#77 - Moe finetuning error
Issue -
State: open - Opened by sahilqure 7 months ago
#76 - [Question] Multi-image collate_fn
Issue -
State: open - Opened by PangziZhang523 7 months ago
#75 - Minor fix and tips update for README
Pull Request -
State: closed - Opened by QAQdev 7 months ago
#74 - [Question] Can you explain the class LlavaQWenMetaForCausalLM(LlavaMetaForCausalLM) in llava_arch?
Issue -
State: open - Opened by 20191864218 7 months ago
#73 - [Question] Pretrain step
Issue -
State: closed - Opened by rlagustmd82 7 months ago
#72 - [Question] CUDA OOM when finetune phi2-clipL336 at stage 2 with 8-A100-40G
Issue -
State: closed - Opened by terry-for-github 7 months ago
- 1 comment
#71 - [Feature request] Support Llama3
Issue -
State: open - Opened by xiweideng 7 months ago
#70 - [Question] About parameter ep_size
Issue -
State: open - Opened by puppy2000 7 months ago
#69 - [Usage] tokenizer.pad_token_id == None?
Issue -
State: open - Opened by sjtu-cz 8 months ago
- 1 comment
#68 - [Question] The error that occurred while running cli.py for inference, using Qwen-7B-base as the LLM.
Issue -
State: closed - Opened by 20191864218 8 months ago
- 1 comment
#67 - [Question] Discussion of the paper's parameters
Issue -
State: open - Opened by bufanx 8 months ago
- 1 comment
#66 - [Question] About the stage 3 training loss
Issue -
State: open - Opened by rangmiao 8 months ago
#65 - [Usage] Deepspeed MoE hangs when EP_SIZE > 1
Issue -
State: closed - Opened by Wadaxiwan 9 months ago
- 1 comment
#64 - DeepSpeed MoE problem
Issue -
State: open - Opened by BlackBearBiscuit 9 months ago
#63 - RuntimeError: mat1 and mat2 must have the same dtype
Issue -
State: open - Opened by Crystalxd 9 months ago
#62 - How to fine-tune MoE-LLaVA with my own dataset
Issue -
State: open - Opened by Tunanzzz 9 months ago
- 4 comments
#61 - [Question]Can't find the "mm_projecotr.bin" in the model_path
Issue -
State: closed - Opened by sdlyzhq 9 months ago
#60 - [Question] The evaluation results vary every time.
Issue -
State: open - Opened by koda-11 9 months ago
#59 - [Question] The evaluation results vary every time.
Issue -
State: closed - Opened by koda-11 9 months ago
#58 - [Question] Adding to the dataset.
Issue -
State: open - Opened by arthurwolf 9 months ago
#57 - [Question] Will the stage 2 fine-tuned model be open-sourced?
Issue -
State: open - Opened by murray-z 9 months ago
- 1 comment
#56 - [Question] How can I further fine-tune on my own data, starting from the MoE model?
Issue -
State: open - Opened by murray-z 9 months ago
- 3 comments
#55 - [Question] how to visualize routing distribution ?
Issue -
State: closed - Opened by koda-11 9 months ago
- 1 comment
#54 - [Question] Model and Dataset Size
Issue -
State: open - Opened by adrielkuek 9 months ago
#53 - [Question] How did u using 768x768 resolution?
Issue -
State: open - Opened by lucasjinreal 9 months ago
#52 - [Question] How to finetune the moe-llava model on customized data?
Issue -
State: open - Opened by RayshenSL 9 months ago
#51 - [Usage] ValueError: Unknown image tower: /data1/ljq/Moellava/MoE-LLaVA-Qwen-1.8B-4e/clip-vit-large-patch14-336
Issue -
State: open - Opened by xiangchihuoguo 9 months ago
- 3 comments
#50 - [Question] Scale down further to support IoT use cases?
Issue -
State: open - Opened by kinchahoy 9 months ago
- 1 comment
#49 - [Question] About nlp_tune data.
Issue -
State: open - Opened by Lucky-Lance 9 months ago
- 2 comments
#48 - Error during training on custom dataset
Issue -
State: open - Opened by saeedkhaki92 9 months ago
- 1 comment
#47 - Question about inference efficiency comparison
Issue -
State: open - Opened by aprilehannibal 9 months ago
- 1 comment
#46 - [Discussion] How to improve model's understanding of high-resolution images?
Issue -
State: open - Opened by whalefa1I 9 months ago
- 1 comment
#45 - [Question] how to check activate parameters of MoE models?
Issue -
State: closed - Opened by koda-11 9 months ago
- 2 comments
#44 - > Hi, everyone. Sorry for that, we updated the new runing command to fix it. Checking out [here](https://github.com/PKU-YuanGroup/MoE-LLaVA/blob/main/scripts/v1/qwen/finetune_moe.sh)
Issue -
State: closed - Opened by hxhcreate 9 months ago
- 2 comments
#43 - [Question] Image patch representation in this work
Issue -
State: closed - Opened by cydiachen 9 months ago
- 1 comment
#42 - Swapping in Qwen2 with the official LLaVA script and training with the mpt template gives loss 0
Issue -
State: open - Opened by lucasjinreal 9 months ago
- 26 comments
#41 - [Usage] The training always stuck after formatting inputs
Issue -
State: closed - Opened by detectRecog 9 months ago
- 8 comments
#40 - Inference without Deepspeed
Issue -
State: open - Opened by aaronnat23 9 months ago
- 1 comment
#39 - [Discussion] Implementation of Qwen1.5 for the project
Issue -
State: closed - Opened by cydiachen 10 months ago
- 19 comments
#38 - inference error in llavamistral
Issue -
State: open - Opened by saeedkhaki92 10 months ago
- 2 comments
#37 - Support cuda 12
Issue -
State: open - Opened by fishfree 10 months ago
- 1 comment
#36 - Eval on MMVET
Issue -
State: open - Opened by BeachWang 10 months ago
- 1 comment
#35 - Wrong dependencies, why deepspeed dependency for inference, better transformers integration
Issue -
State: open - Opened by sujitvasanth 10 months ago
- 3 comments
#34 - Memory usage is too high in the finetune stage
Issue -
State: open - Opened by awzhgw 10 months ago
- 2 comments
#33 - Can i use this to detect events in the video???
Issue -
State: open - Opened by Shekharmeena28 10 months ago
- 1 comment
#32 - License Questions
Issue -
State: open - Opened by kbrostrom 10 months ago
- 1 comment
#31 - In stage 2, what loss value is reasonable to reach?
Issue -
State: open - Opened by awzhgw 10 months ago
- 1 comment
#30 - MoE-LLaVA-StableLM for 4-bits and 8-bit
Issue -
State: open - Opened by NikiBase 10 months ago
- 1 comment
#29 - panic on finetune
Issue -
State: closed - Opened by awzhgw 10 months ago
- 3 comments
#28 - When training MOE-LLAVA on my own data, the loss drops very quickly in the pretrain stage
Issue -
State: closed - Opened by awzhgw 10 months ago
- 4 comments
#27 - Reproducing the stage1 and stage2 Model problem on L40s
Issue -
State: closed - Opened by cydiachen 10 months ago
- 14 comments
#26 - training dataset?
Issue -
State: closed - Opened by luohao123 10 months ago
- 14 comments
#25 - Is specifying use_flash_attion_2 at launch supported?
Issue -
State: closed - Opened by awzhgw 10 months ago
- 2 comments
#24 - Whether stage-2 pre-train model(llavaphi-2.7b-finetune) is released?
Issue -
State: open - Opened by yucheng-zyc 10 months ago
- 4 comments
#23 - Wrong cuda allocation
Issue -
State: open - Opened by paulgavrikov 10 months ago
- 5 comments
#22 - Metrics in the paper and README are inconsistent
Issue -
State: closed - Opened by sxu1997 10 months ago
- 2 comments
#21 - languageBindVideo model may be hang ?
Issue -
State: open - Opened by awzhgw 10 months ago
- 8 comments
#20 - When I integrated mixtral 7BX8 using the moe-llava architecture, something strange happened
Issue -
State: closed - Opened by awzhgw 10 months ago
- 4 comments
#19 - ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects
Issue -
State: open - Opened by andysingal 10 months ago
- 2 comments
#18 - Error in predict.py
Issue -
State: closed - Opened by KuofengGao 10 months ago
- 1 comment
#17 - Moe finetune error
Issue -
State: closed - Opened by AdonLee072348 10 months ago
- 12 comments
#16 - Do you replicate the weights of the FFNs from stage 1 or stage 2?
Issue -
State: closed - Opened by simon-lund 10 months ago
- 4 comments
#15 - is support video ?
Issue -
State: open - Opened by awzhgw 10 months ago
- 3 comments
#14 - Hello, I want to know how much GPU memory is needed to run this model.
Issue -
State: closed - Opened by dforel 10 months ago
- 2 comments
#13 - Openchat, quantisation, multiimage
Issue -
State: closed - Opened by sujitvasanth 10 months ago
- 5 comments
#12 - /deepspeed/comm/comm.py", line 341, in all_to_all_single return cdb.all_to_all_single(output=output, AttributeError: 'NoneType' object has no attribute 'all_to_all_single'
Issue -
State: open - Opened by lucasjinreal 10 months ago
- 13 comments
#11 - image not processed
Issue -
State: closed - Opened by leosongwei 10 months ago
- 1 comment
#10 - Allow custom storage path of the google/siglip-so400m-patch14-384
Issue -
State: closed - Opened by leosongwei 10 months ago
- 1 comment
#9 - Is llavallama moe supported?
Issue -
State: open - Opened by DietDietDiet 10 months ago
- 14 comments
#8 - can support mixtral 7BX8 model ?
Issue -
State: closed - Opened by awzhgw 10 months ago
- 1 comment
#7 - Update qwen_generation_utils.py
Pull Request -
State: open - Opened by eltociear 10 months ago
#6 - Very bad at language ability
Issue -
State: closed - Opened by lucasjinreal 10 months ago
- 1 comment
#5 - supports Chinese or multiple images?
Issue -
State: closed - Opened by BaoyanWang 10 months ago
- 3 comments
#4 - Images for training
Issue -
State: closed - Opened by phellonchen 10 months ago
- 2 comments
#3 - Method to Replicate Results from Huggingface Spaces
Issue -
State: closed - Opened by hiroalchem 10 months ago
- 5 comments
#2 - Can the author elaborate a bit more about how the stage 3 was achieved?
Issue -
State: closed - Opened by CanyonWind 10 months ago
- 2 comments
#1 - {'loss': 0.0, 'learning_rate': 1.6877637130801689e-07, 'epoch': 0.0}
Issue -
State: closed - Opened by whalefa1I 10 months ago
- 14 comments