Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / PKU-YuanGroup/MoE-LLaVA issues and pull requests

#90 - [Question] How to eval textqa

Issue - State: open - Opened by fanminshi about 1 month ago - 1 comment

#89 - [Question] Step 3 loss curve

Issue - State: open - Opened by fanminshi about 1 month ago

#84 - [Question] What exactly does the language model mean?

Issue - State: open - Opened by dana-niu 3 months ago

#79 - [Usage] ADD windows support for more exposure

Issue - State: open - Opened by mr-lab 4 months ago

#77 - Moe finetuning error

Issue - State: open - Opened by sahilqure 5 months ago

#76 - [Question] Multi-image collate_fn

Issue - State: open - Opened by PangziZhang523 5 months ago

#75 - Minor fix and tips update for README

Pull Request - State: closed - Opened by QAQdev 5 months ago

#73 - [Question] Pretrain step

Issue - State: closed - Opened by rlagustmd82 5 months ago

#71 - [Feature request] Support Llama3

Issue - State: open - Opened by xiweideng 5 months ago

#70 - [Question] About parameter ep_size

Issue - State: open - Opened by puppy2000 5 months ago

#69 - [Usage] tokenizer.pad_token_id == None?

Issue - State: open - Opened by sjtu-cz 5 months ago - 1 comment

#67 - [Question] Discussion of the paper's parameters

Issue - State: open - Opened by bufanx 6 months ago - 1 comment

#66 - [Question] About the stage 3 training loss

Issue - State: open - Opened by rangmiao 6 months ago

#65 - [Usage] Deepspeed MoE hangs when EP_SIZE > 1

Issue - State: closed - Opened by Wadaxiwan 6 months ago - 1 comment

#64 - DeepSpeed MoE issue

Issue - State: open - Opened by BlackBearBiscuit 6 months ago

#63 - RuntimeError: mat1 and mat2 must have the same dtype

Issue - State: open - Opened by Crystalxd 6 months ago

#62 - How to fine-tune MoE-LLaVA with your own dataset

Issue - State: open - Opened by Tunanzzz 6 months ago - 4 comments

#60 - [Question] The evaluation results vary every time.

Issue - State: open - Opened by koda-11 6 months ago

#59 - [Question] The evaluation results vary every time.

Issue - State: closed - Opened by koda-11 6 months ago

#58 - [Question] Adding to the dataset.

Issue - State: open - Opened by arthurwolf 7 months ago

#57 - [Question] Will the stage 2 fine-tuned model be open-sourced?

Issue - State: open - Opened by murray-z 7 months ago - 1 comment

#55 - [Question] how to visualize routing distribution ?

Issue - State: closed - Opened by koda-11 7 months ago - 1 comment

#54 - [Question] Model and Dataset Size

Issue - State: open - Opened by adrielkuek 7 months ago

#53 - [Question] How did u using 768x768 resolution?

Issue - State: open - Opened by lucasjinreal 7 months ago

#50 - [Question] Scale down futher to support IOT usecases?

Issue - State: open - Opened by kinchahoy 7 months ago - 1 comment

#49 - [Question] About nlp_tune data.

Issue - State: open - Opened by Lucky-Lance 7 months ago - 2 comments

#48 - Error during training on custom dataset

Issue - State: open - Opened by saeedkhaki92 7 months ago - 1 comment

#47 - Question about the inference efficiency comparison

Issue - State: open - Opened by aprilehannibal 7 months ago - 1 comment

#45 - [Question] how to check activate parameters of MoE models?

Issue - State: closed - Opened by koda-11 7 months ago - 2 comments

#43 - [Question] Image patch representation in this work

Issue - State: closed - Opened by cydiachen 7 months ago - 1 comment

#42 - Swapping in Qwen2 with the official LLaVA script and training with the mpt template gives loss 0

Issue - State: open - Opened by lucasjinreal 7 months ago - 26 comments

#41 - [Usage] The training always stuck after formatting inputs

Issue - State: closed - Opened by detectRecog 7 months ago - 8 comments

#40 - Inference without Deepspeed

Issue - State: open - Opened by aaronnat23 7 months ago - 1 comment

#39 - [Discussion] Implementation of Qwen1.5 for the project

Issue - State: closed - Opened by cydiachen 7 months ago - 19 comments

#38 - inference error in llavamistral

Issue - State: open - Opened by saeedkhaki92 7 months ago - 2 comments

#37 - Support cuda 12

Issue - State: open - Opened by fishfree 7 months ago - 1 comment

#36 - Eval on MMVET

Issue - State: open - Opened by BeachWang 7 months ago - 1 comment

#34 - Memory usage is too high during the finetune stage

Issue - State: open - Opened by awzhgw 7 months ago - 2 comments

#33 - Can i use this to detect events in the video???

Issue - State: open - Opened by Shekharmeena28 8 months ago - 1 comment

#32 - License Questions

Issue - State: open - Opened by kbrostrom 8 months ago - 1 comment

#31 - In stage 2, what is a reasonable value for the loss to drop to?

Issue - State: open - Opened by awzhgw 8 months ago - 1 comment

#30 - MoE-LLaVA-StableLM for 4-bits and 8-bit

Issue - State: open - Opened by NikiBase 8 months ago - 1 comment

#29 - panic on finetune

Issue - State: closed - Opened by awzhgw 8 months ago - 3 comments

#28 - Training MOE-LLAVA on my own data: loss drops very quickly in the pretrain stage

Issue - State: closed - Opened by awzhgw 8 months ago - 4 comments

#27 - Reproducing the stage1 and stage2 Model problem on L40s

Issue - State: closed - Opened by cydiachen 8 months ago - 14 comments

#26 - traning dataset?

Issue - State: closed - Opened by luohao123 8 months ago - 14 comments

#25 - Is specifying use_flash_attion_2 at startup supported??

Issue - State: closed - Opened by awzhgw 8 months ago - 2 comments

#24 - Whether stage-2 pre-train model(llavaphi-2.7b-finetune) is released?

Issue - State: open - Opened by yucheng-zyc 8 months ago - 4 comments

#23 - Wrong cuda allocation

Issue - State: open - Opened by paulgavrikov 8 months ago - 5 comments

#22 - Metrics in the paper and README are inconsistent

Issue - State: closed - Opened by sxu1997 8 months ago - 2 comments

#21 - languageBindVideo model may be hang ?

Issue - State: open - Opened by awzhgw 8 months ago - 8 comments

#18 - Error in predict.py

Issue - State: closed - Opened by KuofengGao 8 months ago - 1 comment

#17 - Moe finetune error

Issue - State: closed - Opened by AdonLee072348 8 months ago - 12 comments

#16 - Do you replicate the weights of the FFNs from stage 1 or stage 2?

Issue - State: closed - Opened by simon-lund 8 months ago - 4 comments

#15 - is support video ?

Issue - State: open - Opened by awzhgw 8 months ago - 3 comments

#14 - Hello, I want to know how much GPU memory is needed to run this model.

Issue - State: closed - Opened by dforel 8 months ago - 2 comments

#13 - Openchat, quantisation, multiimage

Issue - State: closed - Opened by sujitvasanth 8 months ago - 5 comments

#11 - image not processed

Issue - State: closed - Opened by leosongwei 8 months ago - 1 comment

#10 - Allow custom storage path of the google/siglip-so400m-patch14-384

Issue - State: closed - Opened by leosongwei 8 months ago - 1 comment

#9 - Is llavallama moe supported?

Issue - State: open - Opened by DietDietDiet 8 months ago - 14 comments

#8 - can support mixtral 7BX8 model ?

Issue - State: closed - Opened by awzhgw 8 months ago - 1 comment

#7 - Update qwen_generation_utils.py

Pull Request - State: open - Opened by eltociear 8 months ago

#6 - Very bad at language ability

Issue - State: closed - Opened by lucasjinreal 8 months ago - 1 comment

#5 - supports Chinese or multiple images?

Issue - State: closed - Opened by BaoyanWang 8 months ago - 3 comments

#4 - Images for training

Issue - State: closed - Opened by phellonchen 8 months ago - 2 comments

#3 - Method to Replicate Results from Huggingface Spaces

Issue - State: closed - Opened by hiroalchem 8 months ago - 5 comments

#2 - Can the author elaborate a bit more about how the stage 3 was achieved?

Issue - State: closed - Opened by CanyonWind 8 months ago - 2 comments

#1 - {'loss': 0.0, 'learning_rate': 1.6877637130801689e-07, 'epoch': 0.0}

Issue - State: closed - Opened by whalefa1I 8 months ago - 14 comments