unslothai/unsloth issues and pull requests

#1267 - save_pretrained_merged ruins my model

Issue - State: open - Opened by Romiroz 1 day ago

#1266 - Couldn't build proto file into descriptor pool! Invalid proto descriptor for file "sentencepiece_model.proto": sentencepiece_model.proto: A file with this name is already in the pool.

Issue - State: open - Opened by CurtiusSimplus 2 days ago - 2 comments

#1265 - LoRA on Qwen 2.5 does not patch qkv matrices

Issue - State: open - Opened by MinghaoYan 2 days ago

#1264 - TypeError: SFTTrainer.__init__() got an unexpected keyword argument 'dataset_text_field'

Issue - State: open - Opened by officialsahyaboutorabi 2 days ago

#1263 - [ERROR BUG] 'triton.language' has no attribute 'cast'. Did you mean: 'cat'?

Issue - State: open - Opened by arianyambao 2 days ago - 3 comments

#1262 - `train_on_responses_only` doesn't work for Mistral models

Issue - State: open - Opened by XiaomoWu 3 days ago - 2 comments

#1261 - CAN'T LOAD: AttributeError: 'LlamaForCausalLM' object has no attribute 'update'

Issue - State: open - Opened by yukiarimo 3 days ago - 2 comments

#1260 - [FIXED] `dtype c10::BFloat16, but got float`

Issue - State: closed - Opened by CurtiusSimplus 3 days ago - 8 comments
Labels: fixed - pending confirmation

#1259 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 3 days ago

#1258 - ValueError: Unsloth: Untrained tokens of [[128004]] found

Issue - State: open - Opened by Hyfred 3 days ago

#1257 - Getting "compiled_autograd.enable() requires no threads in backwards()" on running SFTTrainer on unsloth/gemma-2 models

Issue - State: closed - Opened by sudha-kannan 3 days ago

#1256 - fix/autograd_compile

Pull Request - State: closed - Opened by Erland366 3 days ago - 2 comments

#1255 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 3 days ago

#1254 - Fix: cast logits to float32 in cross_entropy_forward to prevent errors

Pull Request - State: closed - Opened by Erland366 3 days ago - 2 comments

#1253 - Problem with installing packages and dependencies (triton in particular)

Issue - State: closed - Opened by AllenY687 4 days ago - 1 comment

#1252 - import unsloth causes error: pip install unsloth-zoo

Issue - State: open - Opened by DaddyCodesAlot 4 days ago - 3 comments

#1251 - issue when using default settings for training

Issue - State: open - Opened by Ammar-Alnagar 4 days ago - 3 comments

#1250 - [FIXED] `compiled_autograd.enable()` Gemma

Issue - State: closed - Opened by InderjeetVishnoi 4 days ago - 4 comments
Labels: fixed - pending confirmation

#1249 - Bug fix

Pull Request - State: closed - Opened by danielhanchen 4 days ago - 1 comment

#1248 - [FIXED] `AssertionError('initial value for logits` error

Issue - State: open - Opened by daegonYu 4 days ago - 9 comments
Labels: fixed - pending confirmation, URGENT BUG

#1247 - Errors occurring in Pip Installation : torch 2.5 and CUDA 12.4

Issue - State: closed - Opened by daegonYu 4 days ago - 1 comment

#1246 - fix/get_chat_template

Pull Request - State: open - Opened by Erland366 4 days ago

#1245 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 4 days ago

#1244 - RuntimeError: CUDA error during inference from saved lora weights

Issue - State: open - Opened by danisharoonds 5 days ago - 1 comment

#1243 - Dataset creation to use with unsloth fine tuning

Issue - State: open - Opened by gaussiangit 5 days ago - 1 comment

#1242 - Unsloth error unable to push to hub

Issue - State: open - Opened by hung-ngm 5 days ago - 1 comment

#1241 - how to only do lora on the lm_head?

Issue - State: open - Opened by brando90 5 days ago - 3 comments
Labels: feature request

#1240 - why is unsloth thinking I'm doing multi gpu optimization when I'm not?

Issue - State: open - Opened by brando90 5 days ago - 2 comments

#1239 - Fine tuned Llama3.1 does not support tools

Issue - State: open - Opened by darkroasted 5 days ago - 1 comment

#1238 - erorr

Issue - State: open - Opened by werruww 5 days ago - 4 comments

#1237 - RuntimeError: `ptxas` failed with error code 4294967295:

Issue - State: open - Opened by heiheiheibj 5 days ago - 2 comments

#1236 - Throw error when inferencing longer than max_position_embeddings

Pull Request - State: closed - Opened by Datta0 5 days ago

#1235 - CLI now handles user input strings for dtype correctly

Pull Request - State: closed - Opened by Rabbidon 6 days ago - 1 comment

#1234 - Which Torch & Python

Issue - State: closed - Opened by IzzyHibbert 6 days ago - 3 comments

#1233 - Overlap matrix multiplication (needs tensor core) and other things like activation (needs cuda core and memory bandwidth) to speed up

Issue - State: open - Opened by fzyzcjy 6 days ago - 2 comments

#1232 - AttributeError: 'torchvision' has no attribute 'extension' When Using Unsloth on Kaggle

Issue - State: closed - Opened by Saber120 7 days ago - 1 comment

#1231 - Unsloth error with trl 0.11.4

Issue - State: closed - Opened by mohit-raghavendra 9 days ago - 7 comments

#1230 - Why is memory bandwidth only half used? Is it possible we speed up by utilizing this?

Issue - State: open - Opened by fzyzcjy 9 days ago - 2 comments

#1229 - Is it possible to use `train_on_responses_only` with the Mistral template?

Issue - State: open - Opened by kldzj 9 days ago - 2 comments

#1228 - support

Issue - State: open - Opened by Qarqor5555555 9 days ago - 3 comments

#1227 - Remove "embed_tokens" and "lm_head" Lora layers when loading CPT trained models

Issue - State: closed - Opened by daegonYu 9 days ago - 2 comments

#1226 - Update README.md

Pull Request - State: closed - Opened by WontonSam 9 days ago - 1 comment

#1225 - fix/load-checkpoint-add-new-tokens

Pull Request - State: open - Opened by Erland366 10 days ago - 3 comments

#1224 - OSError: could not get source code when loading a model using a for loop

Issue - State: open - Opened by daegonYu 10 days ago - 4 comments

#1223 - Adding New Tokens

Issue - State: open - Opened by StrangePineAplle 10 days ago - 5 comments

#1222 - FastLanguageModel.from_pretrained fails validate_repo_id in huggingface_hub

Issue - State: open - Opened by AndreBremer 10 days ago - 3 comments

#1221 - Official Colab - unsloth/Llama-3.2-1B-Instruct-bnb-4bit randomly does not produce EOS tokens

Issue - State: open - Opened by jchook 10 days ago - 6 comments

#1220 - Load And Unload Model Error: OSError: could not get source code

Issue - State: open - Opened by DaddyCodesAlot 11 days ago - 4 comments

#1219 - Feat/all tmp

Pull Request - State: closed - Opened by danielhanchen 11 days ago - 1 comment

#1218 - Granite support

Pull Request - State: open - Opened by Datta0 11 days ago

#1217 - Potential bugfix in FlexAttention

Pull Request - State: closed - Opened by AdityaKane2001 12 days ago - 2 comments

#1216 - Cross entropy for packing

Issue - State: open - Opened by fzyzcjy 12 days ago - 2 comments

#1215 - Fail to load checkpoints trained with extended tokenizer

Issue - State: open - Opened by AbnetS 12 days ago - 4 comments

#1214 - Error - 'OutOfMemoryError: CUDA out of memory.'

Issue - State: open - Opened by raghavendra-k-j 12 days ago - 3 comments

#1213 - 3B finetuned model - being Merged in to 7b Model, When saving to use in VLLM

Issue - State: closed - Opened by pusapatiakhilraju 12 days ago - 1 comment

#1212 - GGUF breaks

Issue - State: open - Opened by awesomecoolraj 12 days ago - 2 comments

#1211 - Error saving PEFT adapter, re-loading model & adapter, and continuing to train

Issue - State: closed - Opened by laura-burdick-sil 12 days ago - 4 comments

#1210 - Continued Pre-Training Notebook not working with unsloth/Llama-3.2-1B-bnb-4bit

Issue - State: open - Opened by githomein 12 days ago - 5 comments

#1209 - Please add the model: EleutherAI/polyglot-ko-5.8b

Issue - State: open - Opened by SabaPivot 13 days ago - 1 comment

#1208 - Error `KeyError: 'layers.0.mlp.down_proj.weight'` when running Merged 4-bit Mistral Nemo in vLLM

Issue - State: closed - Opened by josiah-redjade 13 days ago - 3 comments

#1207 - Is there proper attention masking done when applying packing=true?

Issue - State: open - Opened by LostRuins 13 days ago - 2 comments

#1206 - Installation for torch 2.5.0

Issue - State: closed - Opened by Galaxy-Husky 13 days ago - 1 comment

#1205 - Unable to use "unsloth/gemma-2b-bnb-4bit" model via vLLM

Issue - State: open - Opened by InderjeetVishnoi 13 days ago - 2 comments

#1204 - merging w/ hacky gpu

Pull Request - State: closed - Opened by Alex-Gurung 13 days ago - 2 comments

#1203 - ORPO trainer not works after SFT

Issue - State: open - Opened by Romiroz 13 days ago - 1 comment

#1202 - Question: How to fine tune an already finetuned model like NuExtract as a fine tune of Phi-3.5

Issue - State: open - Opened by KIC 13 days ago - 1 comment

#1201 - Check if final_location is in /tmp in Kaggle environment

Pull Request - State: closed - Opened by dendarrion 14 days ago - 2 comments

#1200 - Fix/casting continue pretraining

Pull Request - State: closed - Opened by Erland366 14 days ago - 3 comments

#1199 - Phi-3.5-mini generation becomes instable after 4096 tokens

Issue - State: open - Opened by NicolasSteen 14 days ago - 1 comment

#1198 - Mistral Instruct v3 `sentencepiece_model.proto` error

Issue - State: open - Opened by CurtiusSimplus 14 days ago - 19 comments
Labels: currently fixing, help wanted

#1195 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 15 days ago - 1 comment

#1194 - unsloth_train() does not work, shows more step than trainer.train()

Issue - State: open - Opened by Linguiniotta 15 days ago - 1 comment

#1193 - Fix/phi-longrope

Pull Request - State: closed - Opened by Erland366 15 days ago

#1192 - Train_on_completions cant handle eval_datasets as dictionary

Issue - State: open - Opened by R4ZZ3 15 days ago - 1 comment
Labels: fixed - pending confirmation

#1191 - URGENT: unsloth saved lora adapter config not supported in VLLM

Issue - State: closed - Opened by xinyudong93 15 days ago - 1 comment

#1190 - Errors with pip installation in Docker containers with torch 2.5

Issue - State: closed - Opened by SyedA5688 16 days ago - 5 comments

#1189 - raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'") AttributeError: 'LongRopeRotaryEmbedding' object has no attribute 'long_cos_cached'. Did you mean: 'short_cos_cached'?

Issue - State: open - Opened by SnehaKumari14 16 days ago - 4 comments

#1188 - Cleanup upcast logs

Pull Request - State: closed - Opened by Datta0 16 days ago

#1187 - pip install --upgrade --no-cache-dir unsloth BROKE CUDA packages. Inference slower.

Issue - State: open - Opened by pusapatiakhilraju 16 days ago - 3 comments

#1186 - 25% less mem and 10% faster training: Do not upcast lm_head and embedding to float32

Pull Request - State: closed - Opened by Datta0 16 days ago

#1185 - ModuleNotFoundError : Failed to import transformers.models.falcon_mamba.configuration_falcon_mamba

Issue - State: open - Opened by CurtiusSimplus 16 days ago - 2 comments

#1184 - RuntimeError: Expected out tensor to have dtype c10::BFloat16, but got float instead

Issue - State: open - Opened by Brightatkmitl 16 days ago - 3 comments

#1183 - does unlsoth support freeze tunning

Issue - State: open - Opened by NathanaelTamirat 16 days ago - 2 comments

#1182 - Fix 4.47 issue

Pull Request - State: closed - Opened by danielhanchen 16 days ago

#1181 - NameError: name 'Unpack' is not defined

Issue - State: open - Opened by CurtiusSimplus 16 days ago - 12 comments
Labels: fixed - pending confirmation

#1180 - fix/transformers-unpack

Pull Request - State: closed - Opened by Erland366 16 days ago - 2 comments

#1179 - Can't import unsloth when both the latest version of unsloth and transformers are installed

Issue - State: open - Opened by lossflow 17 days ago - 7 comments

#1178 - DPO, ORPO - grad accumulation fix

Issue - State: open - Opened by danielhanchen 17 days ago
Labels: feature request, help wanted

#1177 - Fix DPO, ORPO

Pull Request - State: closed - Opened by danielhanchen 17 days ago - 2 comments

#1176 - Unsloth full finetune: Does the fast speed and small memory come with a cost of performance degrading or not?

Issue - State: open - Opened by fzyzcjy 17 days ago - 2 comments

#1175 - torch.compile fails

Issue - State: closed - Opened by fzyzcjy 17 days ago - 3 comments

#1174 - Fix/kaggle pytorch

Pull Request - State: closed - Opened by Erland366 17 days ago

#1173 - [FIXED] Kaggle broken

Issue - State: closed - Opened by danielhanchen 17 days ago - 2 comments
Labels: fixed - pending confirmation, URGENT BUG

#1172 - [FIXED] `TypeError: 'NoneType' object is not callable`

Issue - State: closed - Opened by BlackWyvernX 18 days ago - 11 comments
Labels: fixed - pending confirmation, URGENT BUG

#1171 - Fix/patch tokenizer

Pull Request - State: closed - Opened by Erland366 18 days ago

#1170 - A bug in save.py

Issue - State: open - Opened by serendipity800 18 days ago - 2 comments
Labels: fixed - pending confirmation
