haotian-liu/LLaVA issues and pull requests

#1705 - [Question] Want to learn about multimodal MLLMs? Join the study and discussion group

Issue - State: open - Opened by km1994 3 days ago

#1704 - [Question] What is the optimizer of LLaVA?

Issue - State: open - Opened by 20191864218 4 days ago

#1703 - How to judge whether we train the vision tower during our lora-fine tuning

Issue - State: open - Opened by lmingze 7 days ago

#1702 - what is the difference between liuhaotian/llava-v1.5-7b and (vicuna-7b-v1.5 + vision tower + mm_projector)

Issue - State: open - Opened by lmingze 7 days ago

#1701 - [Question] Unable to generate pydantic-core schema for <class 'starlette.requests.Request'>. Set `arbitrary_types_allowed=True`

Issue - State: open - Opened by sxchen123 8 days ago - 1 comment

#1700 - [Question] May I ask if there is a 7B-sized file of Llama_2_7b_chat?

Issue - State: open - Opened by xlnn 8 days ago

#1699 - [Question] if not vision_tower.is_loaded: AttributeError: 'NoneType' object has no attribute 'is_loaded'

Issue - State: open - Opened by shen1005 10 days ago

#1698 - [Usage] The difference between finetune_lora.sh and finetune_task_lora.sh

Issue - State: open - Opened by PixelChen24 10 days ago

#1697 - [Question] Why the link of LAION/CC/SBU BLIP-Caption Concept-balanced 558K Meta Data(meta.json) is empty?

Issue - State: open - Opened by Liuqibaa 10 days ago

#1696 - [Question] What base python version should `LLaVA` use ? like `3.10` ,`3.11` ?

Issue - State: closed - Opened by q2333gh 11 days ago - 2 comments

#1695 - [Question] Why is the output always numerical when using model inference, like this

Issue - State: open - Opened by yuese1234 12 days ago

#1694 - Can I load the parameters of llava1.5 with full parameter fine-tuning

Issue - State: open - Opened by zhangzef 12 days ago

#1693 - [Usage] can I load

Issue - State: closed - Opened by zhangzef 12 days ago

#1692 - Error while fine-tuning llava-v1.6 using finetune_task_lora.sh: DeepSpeed Zero-3 is not compatible with `low_cpu_mem_usage=True` or with passing a `device_map`.

Issue - State: open - Opened by LuYinMiao 13 days ago

#1691 - [Question] How can I fix this trouble when I use LLaVa_Med?

Issue - State: open - Opened by TimoFan1998 15 days ago

#1690 - [Question] A series of questions about fine tuning , I want to learn this stuff

Issue - State: open - Opened by bdv29 15 days ago

#1688 - Enhance error handling and optimize file handling

Pull Request - State: open - Opened by mandlinsarah 16 days ago

#1687 - OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./checkpoints/vicuna-7b-v1.5.

Issue - State: open - Opened by Leo-Lsc 16 days ago

#1686 - [Question] Why is the accuracy low when I evaluate llava-v1.5-7b-lora on VQAv2 ?

Issue - State: open - Opened by tanghao2118 18 days ago - 1 comment

#1685 - [Question] Any ideas on how to design a customized loss function for LLaVA and integrate several new layers that would be compatible with it?

Issue - State: open - Opened by powenlo 19 days ago

#1683 - [Question] how much vram for batch 1 full fine tune?

Issue - State: open - Opened by nub2927 20 days ago

#1682 - [Question] History is distorted; the data source is slavishly pro-American and pro-Western

Issue - State: open - Opened by wushh 21 days ago - 3 comments

#1681 - [Usage] Adding very few parameters when using LoRA to finetune LLaVA 1.5

Issue - State: open - Opened by XiaoruiMaLU 21 days ago

#1680 - [Question] Custom Conversations

Issue - State: open - Opened by KansaiUser 22 days ago

#1679 - [Usage] Some weights of LlavaLlamaForCausalLM were not initialized from the model checkpoint

Issue - State: open - Opened by RobitsG 22 days ago - 2 comments

#1678 - [Usage] Can you release the code for concept balancing filtering in pretrain data?

Issue - State: open - Opened by williamium3000 22 days ago

#1677 - [Question] why model generate takes more time the second time?

Issue - State: open - Opened by KansaiTraining 23 days ago

#1676 - [Question] Is there a parameter to limit the amount of text output, or is the output more streamlined

Issue - State: open - Opened by zou-yawen 23 days ago

#1675 - [Question] How can I use CLIPVisionTowerS2?

Issue - State: closed - Opened by dongbinShin96 24 days ago

#1674 - [Question] How to get output embeddings or last hidden states?

Issue - State: open - Opened by flab305 26 days ago

#1673 - [Question] Understanding the Licensing and Commercial Use of LLaVa-NeXT

Issue - State: open - Opened by sayanbiswas59 27 days ago

#1672 - [Question] The difference between llava ckpt and llava_hf ckpt?

Issue - State: open - Opened by hxhcreate 28 days ago

#1670 - [Usage] pretrain about "gradient_accumulation_steps" para use

Issue - State: closed - Opened by wanlipeng 29 days ago

#1669 - [Feature request] Add better support for Brazilian Portuguese

Issue - State: open - Opened by insinfo 29 days ago - 1 comment

#1668 - [Usage] about fine tuning

Issue - State: open - Opened by liucheny 30 days ago

#1667 - [Usage] From Hugging face llava gives no response

Issue - State: open - Opened by MonoMarkor about 1 month ago

#1666 - [Usage] Error - `ollama._types.ResponseError: error parsing llm response stream: error: {"content":"internal_error"}`

Issue - State: open - Opened by laksh-2193 about 1 month ago - 1 comment

#1665 - [Question] How to correctly write a prompt for this model?

Issue - State: open - Opened by KansaiTraining about 1 month ago

#1664 - [Question] evaluation problem in textVQA

Issue - State: open - Opened by jiinhui about 1 month ago

#1663 - [Usage] How do i get a LLaVA model converted to contain vision tower related config items?

Issue - State: closed - Opened by BountyMage about 1 month ago - 2 comments

#1662 - [Question] Support for Multi-Image Input in One-Shot/Few-Shot Learning Scenarios

Issue - State: open - Opened by vedernikovphoto about 1 month ago - 1 comment

#1661 - [Question] Single-GPU training works normally, but multi-GPU training gives loss == 0

Issue - State: open - Opened by Camellia-hz about 1 month ago - 2 comments

#1660 - Dependency errors

Issue - State: open - Opened by Chanete about 1 month ago - 1 comment

#1659 - [Question] The pure text process in function "prepare_inputs_labels_for_multimodal"

Issue - State: open - Opened by zhanglixuan0720 about 1 month ago - 1 comment

#1658 - [Question] When finetuning with my custom dataset, do I need to mix the custom datasets and original LLaVA instruction datasets?

Issue - State: open - Opened by zy1996829 about 1 month ago

#1657 - [Usage] Out of bounds error

Issue - State: open - Opened by happywinder about 1 month ago

#1656 - comment out torch requirement

Pull Request - State: closed - Opened by nguyendinhson-kaist about 1 month ago

#1655 - [Question]detailed information about tokenizer and encoder

Issue - State: open - Opened by dnkscu about 1 month ago

#1654 - [Usage] why the demo of LLaVA-1.5 can not use now?

Issue - State: open - Opened by moonnnpie about 1 month ago

#1653 - [Question] Could you tell me any difference between llava-v1.5-13b.jsonl and old llava-v1.5-7b.jsonl?

Issue - State: open - Opened by yukio0321 about 1 month ago

#1651 - [Usage] LLaVA-v1.6-vicuna-7b generates incomplete sentence

Issue - State: closed - Opened by gymbeijing about 1 month ago - 1 comment

#1650 - [Question] Learning is completed, but only the weights of the projector are output.

Issue - State: closed - Opened by kouyakamada about 1 month ago

#1649 - Why are the first image and the remaining images processed separately? in prepare_inputs_labels_for_multimodal()

Issue - State: open - Opened by shure-dev about 1 month ago

#1648 - llava 1.2.2.post1 requires torch==2.1.2, but you have torch 2.0.1 which is incompatible.

Issue - State: open - Opened by Mike-ihr about 1 month ago - 3 comments

#1647 - [Usage] Freezing Vision Encoder During LoRA Training

Issue - State: open - Opened by xie-qiang about 1 month ago

#1646 - You are using a model of type llava to instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors.

Issue - State: open - Opened by XCF-Mike about 1 month ago - 1 comment

#1645 - [Question] Inference Code

Issue - State: closed - Opened by bryanwong17 about 2 months ago

#1644 - [Question] Misaligned image_grid_pinpoints

Issue - State: open - Opened by Forence1999 about 2 months ago

#1643 - about multi-image input

Issue - State: open - Opened by eternal8080 about 2 months ago

#1642 - [Question] Image and Text Embeddings For Downstream Tasks

Issue - State: open - Opened by dipikakhullar about 2 months ago - 1 comment

#1641 - load_dataset('liuhaotian/LLaVA-Instruct-150K') ERROR

Issue - State: open - Opened by YerongLi about 2 months ago

#1639 - [Usage] KeyError: 'LlavaMistralConfig' when I start a model-worker with a fine-tuned Llava-v1.6-mistral-7B model.

Issue - State: open - Opened by yunsaijc about 2 months ago

#1638 - Error when load model in 4bit

Issue - State: open - Opened by rin2401 about 2 months ago - 1 comment

#1637 - [Question] Fine-tuned model ignore some of the captions

Issue - State: open - Opened by AbdulrahmanSoliman1 about 2 months ago

#1636 - How can I custom loss function during fine-tuning?[Question]

Issue - State: open - Opened by qingyunyanran about 2 months ago - 4 comments

#1635 - Error when saving model: Invalid generation config due to conflicting parameters

Issue - State: closed - Opened by ohhan777 about 2 months ago - 5 comments

#1634 - NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.[Usage]

Issue - State: open - Opened by sivang about 2 months ago

#1633 - Can this model be adapted to torch 2.3?

Issue - State: open - Opened by zxy1728 about 2 months ago

#1632 - [Question] About how to deal with multiple-answer question in vqav2

Issue - State: open - Opened by BroJunn about 2 months ago

#1630 - [Usage] Visual instruction tuning for LLaVa 1.6

Issue - State: open - Opened by mattia-re-learn about 2 months ago - 6 comments

#1628 - Created a technical discussion group for multimodal large models; everyone is welcome to join

Issue - State: open - Opened by feihuamantian about 2 months ago - 3 comments

#1625 - [Usage] There is no output from the ASSISTANT when using CLI Inference????

Issue - State: open - Opened by xlxcomputer about 2 months ago - 1 comment

#1624 - [Usage] from llava.mm_utils import ( ImportError: cannot import name 'load_pretrained_model' from 'llava.mm_utils'

Issue - State: closed - Opened by seven112233 about 2 months ago - 1 comment

#1622 - [Usage] You are using a model of type llava to instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors.

Issue - State: open - Opened by matengxiaotiancai 2 months ago - 2 comments

#1619 - [Question] "Size mismatch" Error when finetuning from a projector

Issue - State: open - Opened by hvgupta 2 months ago - 3 comments

#1618 - Some images of ocr vqa data in llava_v1_5_mix665k.json do not exist!

Issue - State: open - Opened by Vicent0205 2 months ago - 5 comments

#1613 - [Question] why 'mm_vision_select_layer' == -2 in config ?

Issue - State: open - Opened by fmy7834 2 months ago - 2 comments

#1610 - [Question] Want to replace the Vicuna model

Issue - State: open - Opened by wanglongpeng1 2 months ago - 1 comment

#1603 - Pretraining LLaVA with SigLIP

Pull Request - State: open - Opened by nahidalam 2 months ago - 2 comments

#1600 - Need either a `state_dict` or a `save_folder` containing offloaded weights.

Issue - State: open - Opened by zhaodaojie 2 months ago - 1 comment

#1597 - [Usage] how to load the trained lora?

Issue - State: closed - Opened by feiyangsuo 2 months ago - 4 comments

#1582 - device mis-match error on pre-training

Issue - State: open - Opened by oroojlooy 3 months ago - 3 comments

#1580 - [Question] Mac run but get importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes

Issue - State: open - Opened by Timothychen00 3 months ago - 4 comments

#1578 - [Question] Why LLaVA-1.5-13b keep repeatting answer in the first round conversation?

Issue - State: open - Opened by PoopBear1 3 months ago - 3 comments

#1573 - add qwen2 support for pretraining and finetuning

Pull Request - State: open - Opened by TobyYang7 3 months ago - 5 comments

#1568 - "Assertion `srcIndex < srcSelectDimSize` failed" in Docker on some systems

Issue - State: closed - Opened by Careiner 3 months ago - 3 comments

#1567 - Having issues while merging LoRA attention weights

Issue - State: closed - Opened by anas-zafar 3 months ago - 4 comments

#1554 - [Usage] Not able to fine tune the LLaVA model with llava-v1.5-7b.

Issue - State: open - Opened by ayushgupta9198 3 months ago - 6 comments

#1552 - [Usage] Merging LoRa weights into llava-13b fails with bizarre error

Issue - State: open - Opened by maxall41 3 months ago - 3 comments

#1551 - [Discussion] Request for Guidance on Stage Two: Converting LLaVA-V1.6 into 4-bit GGUF Format (PAPER)

Issue - State: open - Opened by rohithbojja 3 months ago - 2 comments
