unslothai/unsloth issues and pull requests

#970 - Infinite Generation

Issue - State: open - Opened by MuhammadBilalKhan267 3 months ago - 5 comments

#969 - how can i compare before and after finetune model results at same time

Issue - State: open - Opened by paras007frnd 3 months ago - 1 comment

#968 - Unsloth defaults to 0 GPU always

Issue - State: open - Opened by kishoretvk 3 months ago - 5 comments

#967 - FastLanguageModel import error

Issue - State: closed - Opened by VatsalPatel18 3 months ago - 1 comment

#966 - Add StepFlushCallback

Pull Request - State: closed - Opened by vTuanpham 3 months ago - 1 comment

#965 - Error when saving unsloth Llama 3.1 model to GGUF format

Issue - State: open - Opened by okoliechykwuka 3 months ago - 3 comments

#964 - ModuleNotFoundError: No module named 'triton'

Issue - State: closed - Opened by boseong-yun 3 months ago - 7 comments
Labels: fixed - pending confirmation, URGENT BUG

#963 - TypeError in `orpo_trainer.train()`: 'str' object is not callable

Issue - State: open - Opened by kdunee 3 months ago - 7 comments
Labels: fixed - pending confirmation

#962 - can't import FastLanguageModel from unsloth

Issue - State: open - Opened by sagniknandigit 3 months ago - 2 comments

#961 - AttributeError: 'RMSNorm' object has no attribute 'variance_epsilon'

Issue - State: closed - Opened by Silentssss 3 months ago - 3 comments
Labels: fixed - pending confirmation

#960 - KeyError: 'base_model.model.model.layers.0.mlp.down_proj.lora_A.weight'

Issue - State: open - Opened by Iven2132 3 months ago - 4 comments

#959 - Multiple Generation Similar to Huggingface `num_return_sequences`

Issue - State: open - Opened by ankitprezent 3 months ago - 2 comments

#958 - How to fine-tune using PyTorch dataset instead of hf's dataset

Issue - State: open - Opened by xugy16 3 months ago - 2 comments

#957 - issue while finetuning llama 3.1, 'tuple' object has no attribute 'remove_unused_columns'

Issue - State: open - Opened by Abhisekgit1994 3 months ago - 6 comments

#956 - Conda environment setup issues

Issue - State: closed - Opened by timlai4 3 months ago - 7 comments
Labels: currently fixing

#955 - Phi 3.5 bug fix

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#954 - Able to finetune `homebrewltd/llama3.1-s-instruct-v0.2` (Input=Text & Audio, Output=Text)

Issue - State: open - Opened by asmith26 3 months ago - 3 comments

#953 - KeyError: '__EOS_TOKEN__'

Issue - State: closed - Opened by zs856 3 months ago - 4 comments
Labels: currently fixing

#952 - update token retrieval logic

Pull Request - State: closed - Opened by not-lain 3 months ago - 2 comments

#951 - Exclamation points are not periods.

Issue - State: open - Opened by briandw 3 months ago - 2 comments

#950 - Output_attentions not available

Issue - State: open - Opened by Paoloc99 3 months ago - 2 comments

#949 - Add nvidia/Minitron-8B-Base support

Issue - State: open - Opened by minipasila 3 months ago - 1 comment
Labels: feature request

#948 - Your Flash Attention 2 installation seems to be broken

Issue - State: closed - Opened by C0casio45 3 months ago - 5 comments

#947 - Fix DPO

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#946 - AttributeError: 'LongRopeRotaryEmbedding' object has no attribute 'inv_freq' when finetuning Phi3.5 mini

Issue - State: open - Opened by beniz 3 months ago - 9 comments
Labels: fixed - pending confirmation

#945 - stablelm-zephyr-3b support

Issue - State: open - Opened by ItzAmirreza 3 months ago - 2 comments
Labels: feature request

#944 - GPU = NVIDIA GeForce RTX 4060 Ti 16G, Finetuning unsloth/Meta-Llama-3.1-8B-bnb-4bit OOM

Issue - State: open - Opened by 1272870698 3 months ago - 1 comment

#943 - model loading failed: RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM: size mismatch for base_model.model.model.embed_tokens.weight

Issue - State: open - Opened by BoyangGu1 3 months ago - 5 comments
Labels: currently fixing

#942 - Save the original model name in `config.json` instead of mapped name when saving LoRA

Issue - State: open - Opened by ivsanro1 3 months ago - 4 comments
Labels: currently fixing

#941 - Phi 3.5

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#940 - Phi 3.5

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#939 - 'tuple' object has no attribute 'remove_unused_columns'

Issue - State: open - Opened by Abhisekgit1994 3 months ago - 3 comments

#938 - Update README.md

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#937 - Fix NEFTune

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#936 - issue with Kaggle continue pretraining

Issue - State: closed - Opened by Dusker233 3 months ago - 7 comments

#935 - issue with Merging lora adapters

Issue - State: closed - Opened by Ammar-Alnagar 3 months ago - 1 comment

#934 - Unable to load unsloth trained model saved to a local directory.

Issue - State: open - Opened by InderjeetVishnoi 3 months ago - 7 comments
Labels: currently fixing

#933 - FlashAttention only support fp16 and bf16 data type

Issue - State: open - Opened by ArcherShirou 3 months ago - 3 comments

#932 - Supporting "LoRA-GA: Low-Rank Adaptation with Gradient Approximation"?

Issue - State: open - Opened by fzyzcjy 3 months ago - 2 comments
Labels: feature request

#931 - Bug #930

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#930 - newest commit (untrained tokens llama 3.1 base) creates bfloat16 issue with Mistral Nemo training

Issue - State: closed - Opened by Nazzaroth2 3 months ago - 2 comments
Labels: fixed - pending confirmation

#929 - untrained tokens llama 3.1 base

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#928 - Something is wrong with `save_pretrained_merged` or `FastLanguageModel.from_pretrained`

Issue - State: closed - Opened by tengerye 3 months ago - 3 comments

#927 - NameError: name 'os' is not defined

Issue - State: open - Opened by Tizzzzy 3 months ago - 1 comment

#926 - Issue with running trainer.train()

Issue - State: open - Opened by dilerbatu 3 months ago - 1 comment

#925 - Can not load tokenizer after `save_pretrained_merged`

Issue - State: closed - Opened by tengerye 3 months ago - 2 comments

#924 - pass trust_remote_code to AutoConfig.from_pretrained and PeftConfig .from_pretrained

Issue - State: open - Opened by wellhowtosay 3 months ago - 1 comment
Labels: fixed - pending confirmation

#923 - beam search does not work for gemma2b

Issue - State: open - Opened by world2vec 3 months ago - 5 comments
Labels: currently fixing, URGENT BUG

#922 - Jamba Models not Supported yet

Issue - State: open - Opened by chintanckg 3 months ago
Labels: feature request

#921 - Fix mapping

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#920 - Bug Fixes

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#919 - Is it possible to use some of the kernels with deepspeed?

Issue - State: closed - Opened by huyiwen 3 months ago - 2 comments

#918 - Gemma 2 9b lm_head, emb_tokens probably not loading (but were saved)

Issue - State: open - Opened by richardxoldman 3 months ago - 1 comment

#917 - Fix chat templates

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#916 - Fix Chat Templates

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#915 - Llama 3 template issue

Issue - State: open - Opened by minipasila 3 months ago - 4 comments

#914 - PermissionError: [WinError 5]

Issue - State: open - Opened by silentgameshub 3 months ago - 1 comment

#913 - Converting unsloth finetuned model to AWQ using autoawq package.

Issue - State: open - Opened by fusesid 3 months ago - 8 comments

#912 - Why do the number of parameters of unsloth models differ from the original model?

Issue - State: open - Opened by Keramatfar 3 months ago - 3 comments

#911 - Tinyllama issues

Issue - State: closed - Opened by Srini-98 3 months ago - 4 comments
Labels: currently fixing

#910 - Providing more flexibility for users to customize their llama when using LoRA

Pull Request - State: closed - Opened by Brownwang0426 3 months ago - 1 comment

#909 - unsloth's FastLanguageModel target_modules does not support custom attention layer

Issue - State: closed - Opened by Brownwang0426 3 months ago - 2 comments

#908 - Request for Support: Phi-3 Vision Model

Issue - State: open - Opened by rahatarinasir 3 months ago - 2 comments
Labels: feature request

#907 - google/recurrentgemma-9b-it support

Issue - State: closed - Opened by ammary25 3 months ago - 1 comment
Labels: feature request

#906 - Fix DPO stats

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#905 - Unsloth should probably not alter or delete source models.

Issue - State: open - Opened by rwl4 3 months ago - 1 comment

#904 - The issue regarding the parameter settings during llama3.1 training.

Issue - State: closed - Opened by Xtian-hub 3 months ago - 3 comments

#903 - Why attention_mask is ignored?

Issue - State: closed - Opened by Theodotus1243 3 months ago - 1 comment

#902 - Torch 2.4, Xformers>0.0.27, TRL>0.9, Python 3.12 + bug fixes

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#901 - Add ability to set flash_attn_func(deterministic=True)

Issue - State: closed - Opened by Theodotus1243 3 months ago - 1 comment

#900 - Create mode.bin file when we are saving the model.

Issue - State: open - Opened by JaynouOliver 3 months ago - 2 comments

#899 - I ran into errors when running the conda installation locally on my Dell Precision 7920 with the RTX 3090

Issue - State: open - Opened by kirahman2 3 months ago - 1 comment

#898 - Not able to push the whole model to hugging face after training : help urgent , Hackathon submission EOD

Issue - State: open - Opened by Data-Scientist-Sahil 3 months ago - 6 comments

#897 - Feature Request: Loading the model again requires GPU even though quantization has been done.

Issue - State: closed - Opened by ahmedembeddedxx 3 months ago - 1 comment

#896 - Load Unsloth-FT-Merged-Model with AutoModel Attribute Error

Issue - State: open - Opened by carstendraschner 3 months ago - 6 comments

#895 - Error: one of the variables needed for gradient computation has been modified by an inplace operation

Issue - State: open - Opened by YZHang2333 3 months ago - 2 comments
Labels: currently fixing

#894 - Does unsloth support token classification using phi3?

Issue - State: open - Opened by xinyudong93 3 months ago - 1 comment

#893 - Supporting "LoRA+: Efficient Low Rank Adaptation of Large Models"

Issue - State: open - Opened by fzyzcjy 3 months ago - 1 comment

#892 - LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct support

Issue - State: closed - Opened by Ammar-Alnagar 3 months ago - 2 comments
Labels: feature request

#891 - Notebook for Fine-Tuning on Chat Dataset and Batch Inference

Issue - State: open - Opened by binhmed2lab 3 months ago - 1 comment

#890 - Could you please upgrade the trl library to the latest version?

Issue - State: open - Opened by ArcherShirou 3 months ago - 3 comments
Labels: fixed - pending confirmation

#889 - AttributeError: torch._dynamo.config.vocab_size does not exist

Issue - State: open - Opened by nyl199310 3 months ago - 8 comments
Labels: fixed - pending confirmation

#888 - Cache only has 0 layers, attempted to access layer with index 0

Issue - State: open - Opened by arturwplantecs 3 months ago - 2 comments

#887 - Fix tokenizers

Pull Request - State: closed - Opened by danielhanchen 3 months ago

#886 - support torch=2.4.0 and python=3.12

Issue - State: open - Opened by NiuBlibing 3 months ago - 8 comments
Labels: currently fixing

#885 - Phi-3 models not supported

Issue - State: closed - Opened by umasehs 3 months ago - 3 comments
Labels: fixed - pending confirmation

#884 - [Urgent] Llama3 NOT Working in PPO Trainer

Issue - State: open - Opened by yuan-xia 3 months ago - 5 comments

#883 - Extra response with fine-tuned model

Issue - State: open - Opened by LiuAlex1109 3 months ago - 2 comments

#882 - The fine-tuning of the 2024.8 version has become very poor compared to the previous version.

Issue - State: open - Opened by githubzuoyi 3 months ago - 12 comments
Labels: currently fixing

#881 - Dubious !

Issue - State: open - Opened by dromeuf 3 months ago - 23 comments
Labels: currently fixing, URGENT BUG

#880 - Outdated tokenizer chat template for llama3.1 8B (`unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit`)

Issue - State: closed - Opened by ivsanro1 3 months ago - 5 comments
Labels: fixed - pending confirmation

#879 - xFormers can't load C++/CUDA extensions

Issue - State: closed - Opened by DaddyCodesAlot 3 months ago - 2 comments

#878 - how to train on non instruction dataset .

Issue - State: open - Opened by hemangjoshi37a 3 months ago - 8 comments

#877 - Issue Report: Inconsistent Behavior and Meaningless Output

Issue - State: closed - Opened by seolhokim 3 months ago - 4 comments

#876 - Request: Flux (Diffusion transformer)

Issue - State: open - Opened by RefractAI 3 months ago - 3 comments
Labels: currently fixing, feature request
