unslothai/unsloth issues and pull requests

#1280 - Qwen 2.5

Pull Request - State: closed - Opened by danielhanchen 1 day ago - 1 comment

#1279 - FileExistsError: [WinError 183]

Issue - State: open - Opened by rogersohandsome 2 days ago

#1278 - Will the open source version support multiple Gpus later?

Issue - State: open - Opened by first-li 2 days ago

#1277 - Can you support the fine-tuning of the MiniCPM3-4B model ?

Issue - State: open - Opened by faceair 2 days ago

#1276 - fix/sft-trainer

Pull Request - State: open - Opened by Erland366 2 days ago - 1 comment

#1275 - Finetuned Llama 3.1 8B (base) gets stuck in a loop

Issue - State: open - Opened by skerit 2 days ago

#1274 - Add Support for Pre-Training

Issue - State: open - Opened by dame-cell 2 days ago

#1273 - Resizing tokenizer leads to missing end token and garbage response?

Issue - State: open - Opened by Mark-DelGrande 4 days ago - 1 comment

#1272 - Jupyter notebook: No module named 'unsloth'

Issue - State: open - Opened by iwouldratherbeatthebeach 4 days ago - 3 comments

#1271 - dataset for train model to translate language

Issue - State: closed - Opened by nichellehouston 4 days ago - 1 comment

#1270 - feat: add option for using ADOPT optimizer based on Taniguchi, Shohei, et al.

Issue - State: open - Opened by Selich 5 days ago - 1 comment
Labels: feature request

#1269 - DOC Update - Update README.md with os.environ in example

Pull Request - State: open - Opened by udaygirish 5 days ago

#1268 - cannot load some models via vllm

Issue - State: open - Opened by yananchen1989 5 days ago - 10 comments

#1267 - save_pretrained_merged ruins my model

Issue - State: open - Opened by Romiroz 6 days ago - 1 comment

#1266 - Couldn't build proto file into descriptor pool! Invalid proto descriptor for file "sentencepiece_model.proto": sentencepiece_model.proto: A file with this name is already in the pool.

Issue - State: open - Opened by CurtiusSimplus 6 days ago - 8 comments

#1265 - LoRA on Qwen 2.5 does not patch qkv matrices

Issue - State: open - Opened by MinghaoYan 6 days ago - 1 comment

#1264 - TypeError: SFTTrainer.__init__() got an unexpected keyword argument 'dataset_text_field'

Issue - State: closed - Opened by officialsahyaboutorabi 6 days ago - 6 comments
Labels: currently fixing

#1263 - [ERROR BUG] 'triton.language' has no attribute 'cast'. Did you mean: 'cat'?

Issue - State: open - Opened by arianyambao 6 days ago - 9 comments

#1262 - `train_on_responses_only` doesn't work for Mistral models

Issue - State: open - Opened by XiaomoWu 7 days ago - 2 comments

#1261 - CAN'T LOAD: AttributeError: 'LlamaForCausalLM' object has no attribute 'update'

Issue - State: open - Opened by yukiarimo 7 days ago - 5 comments

#1260 - [FIXED] `dtype c10::BFloat16, but got float`

Issue - State: closed - Opened by CurtiusSimplus 7 days ago - 8 comments
Labels: fixed - pending confirmation

#1259 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 7 days ago

#1258 - ValueError: Unsloth: Untrained tokens of [[128004]] found

Issue - State: open - Opened by Hyfred 7 days ago - 1 comment

#1257 - Getting "compiled_autograd.enable() requires no threads in backwards()" on running SFTTrainer on unsloth/gemma-2 models

Issue - State: closed - Opened by sudha-kannan 7 days ago

#1256 - fix/autograd_compile

Pull Request - State: closed - Opened by Erland366 7 days ago - 2 comments

#1255 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 7 days ago

#1254 - Fix: cast logits to float32 in cross_entropy_forward to prevent errors

Pull Request - State: closed - Opened by Erland366 7 days ago - 2 comments

#1253 - Problem with installing packages and dependencies (triton in particular)

Issue - State: closed - Opened by AllenY687 8 days ago - 1 comment

#1252 - import unsloth causes error: pip install unsloth-zoo

Issue - State: open - Opened by DaddyCodesAlot 8 days ago - 5 comments
Labels: currently fixing

#1251 - issue when using default settings for training

Issue - State: closed - Opened by Ammar-Alnagar 8 days ago - 5 comments

#1250 - [FIXED] `compiled_autograd.enable()` Gemma

Issue - State: closed - Opened by InderjeetVishnoi 8 days ago - 4 comments
Labels: fixed - pending confirmation

#1249 - Bug fix

Pull Request - State: closed - Opened by danielhanchen 8 days ago - 1 comment

#1248 - [FIXED] `AssertionError('initial value for logits` error

Issue - State: open - Opened by daegonYu 8 days ago - 9 comments
Labels: fixed - pending confirmation, URGENT BUG

#1247 - Errors occurring in Pip Installation : torch 2.5 and CUDA 12.4

Issue - State: closed - Opened by daegonYu 8 days ago - 1 comment

#1246 - fix/get_chat_template

Pull Request - State: open - Opened by Erland366 8 days ago

#1245 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 8 days ago

#1244 - RuntimeError: CUDA error during inference from saved lora weights

Issue - State: open - Opened by danisharoonds 9 days ago - 1 comment

#1243 - Dataset creation to use with unsloth fine tuning

Issue - State: open - Opened by gaussiangit 9 days ago - 1 comment

#1242 - Unsloth error unable to push to hub

Issue - State: closed - Opened by hung-ngm 9 days ago - 2 comments

#1241 - how to only do lora on the lm_head?

Issue - State: open - Opened by brando90 9 days ago - 3 comments
Labels: feature request

#1240 - why is unsloth thinking I'm doing multi gpu optimization when I'm not?

Issue - State: open - Opened by brando90 9 days ago - 3 comments

#1239 - Fine tuned Llama3.1 does not support tools

Issue - State: open - Opened by darkroasted 9 days ago - 1 comment

#1238 - erorr

Issue - State: open - Opened by werruww 9 days ago - 5 comments

#1237 - RuntimeError: `ptxas` failed with error code 4294967295:

Issue - State: open - Opened by heiheiheibj 9 days ago - 2 comments

#1236 - Throw error when inferencing longer than max_popsition_embeddings

Pull Request - State: closed - Opened by Datta0 9 days ago

#1235 - CLI now handles user input strings for dtype correctly

Pull Request - State: closed - Opened by Rabbidon 10 days ago - 1 comment

#1234 - Which Torch & Python

Issue - State: closed - Opened by IzzyHibbert 10 days ago - 3 comments

#1233 - Overlap matrix multiplication (needs tensor core) and other things like activation (needs cuda core and memory bandwidth) to speed up

Issue - State: open - Opened by fzyzcjy 10 days ago - 2 comments

#1232 - AttributeError: 'torchvision' has no attribute 'extension' When Using Unsloth on Kaggle

Issue - State: closed - Opened by Saber120 11 days ago - 1 comment

#1231 - Unsloth error with trl 0.11.4

Issue - State: closed - Opened by mohit-raghavendra 13 days ago - 7 comments

#1230 - Why is memory bandwidth only half used? Is it possible we speed up by utilizing this?

Issue - State: open - Opened by fzyzcjy 13 days ago - 2 comments

#1229 - Is it possible to use `train_on_responses_only` with the Mistral template?

Issue - State: open - Opened by kldzj 13 days ago - 2 comments

#1228 - support

Issue - State: open - Opened by Qarqor5555555 13 days ago - 3 comments

#1227 - Remove "embed_tokens" and "lm_head" Lora layers when loading CPT trained models

Issue - State: closed - Opened by daegonYu 13 days ago - 2 comments

#1226 - Update README.md

Pull Request - State: closed - Opened by WontonSam 13 days ago - 1 comment

#1225 - fix/load-checkpoint-add-new-tokens

Pull Request - State: open - Opened by Erland366 14 days ago - 3 comments

#1224 - OSError: could not get source code when loading a model using a for loop

Issue - State: open - Opened by daegonYu 14 days ago - 4 comments

#1223 - Adding New Tokens

Issue - State: open - Opened by StrangePineAplle 14 days ago - 5 comments

#1222 - FastLanguageModel.from_pretrained fails validate_repo_id in huggingface_hub

Issue - State: open - Opened by AndreBremer 14 days ago - 3 comments

#1221 - Official Colab - unsloth/Llama-3.2-1B-Instruct-bnb-4bit randomly does not produce EOS tokens

Issue - State: open - Opened by jchook 14 days ago - 6 comments

#1220 - Load And Unload Model Error: OSError: could not get source code

Issue - State: open - Opened by DaddyCodesAlot 15 days ago - 4 comments

#1219 - Feat/all tmp

Pull Request - State: closed - Opened by danielhanchen 15 days ago - 1 comment

#1218 - Granite support

Pull Request - State: open - Opened by Datta0 15 days ago

#1217 - Potential bugfix in FlexAttention

Pull Request - State: closed - Opened by AdityaKane2001 16 days ago - 2 comments

#1216 - Cross entropy for packing

Issue - State: open - Opened by fzyzcjy 16 days ago - 2 comments

#1215 - Fail to load checkpoints trained with extended tokenizer

Issue - State: open - Opened by AbnetS 16 days ago - 4 comments

#1214 - Error - 'OutOfMemoryError: CUDA out of memory.'

Issue - State: open - Opened by raghavendra-k-j 16 days ago - 3 comments

#1213 - 3B finetuned model - being Merged in to 7b Model, When saving to use in VLLM

Issue - State: closed - Opened by pusapatiakhilraju 16 days ago - 1 comment

#1212 - GGUF breaks

Issue - State: open - Opened by awesomecoolraj 16 days ago - 2 comments

#1211 - Error saving PEFT adapter, re-loading model & adapter, and continuing to train

Issue - State: closed - Opened by laura-burdick-sil 16 days ago - 4 comments

#1210 - Continued Pre-Training Notebook not working with unsloth/Llama-3.2-1B-bnb-4bit

Issue - State: open - Opened by githomein 16 days ago - 5 comments

#1209 - Please add the model: EleutherAI/polyglot-ko-5.8b

Issue - State: open - Opened by SabaPivot 17 days ago - 1 comment

#1208 - Error `KeyError: 'layers.0.mlp.down_proj.weight'` when running Merged 4-bit Mistral Nemo in vLLM

Issue - State: closed - Opened by josiah-redjade 17 days ago - 3 comments

#1207 - Is there proper attention masking done when applying packing=true?

Issue - State: open - Opened by LostRuins 17 days ago - 2 comments

#1206 - Installation for torch 2.5.0

Issue - State: closed - Opened by Galaxy-Husky 17 days ago - 1 comment

#1205 - Unable to use "unsloth/gemma-2b-bnb-4bit" model via vLLM

Issue - State: open - Opened by InderjeetVishnoi 17 days ago - 2 comments

#1204 - merging w/ hacky gpu

Pull Request - State: closed - Opened by Alex-Gurung 17 days ago - 2 comments

#1203 - ORPO trainer not works after SFT

Issue - State: open - Opened by Romiroz 17 days ago - 1 comment

#1202 - Question: How to fine tune an already finetuned model like NuExtract as a fine tune of Phi-3.5

Issue - State: open - Opened by KIC 17 days ago - 2 comments

#1201 - Check if final_location is in /tmp in Kaggle environment

Pull Request - State: closed - Opened by dendarrion 18 days ago - 2 comments

#1200 - Fix/casting continue pretraining

Pull Request - State: closed - Opened by Erland366 18 days ago - 3 comments

#1199 - Phi-3.5-mini generation becomes instable after 4096 tokens

Issue - State: open - Opened by NicolasSteen 18 days ago - 1 comment

#1198 - Mistral Instruct v3 `sentencepiece_model.proto` error

Issue - State: open - Opened by CurtiusSimplus 18 days ago - 25 comments
Labels: currently fixing, help wanted

#1195 - Bug fixes

Pull Request - State: closed - Opened by danielhanchen 19 days ago - 1 comment

#1194 - unsloth_train() does not work, shows more step than trainer.train()

Issue - State: open - Opened by Linguiniotta 19 days ago - 1 comment

#1193 - Fix/phi-longrope

Pull Request - State: closed - Opened by Erland366 19 days ago

#1192 - Train_on_completions cant handle eval_datasets as dictionary

Issue - State: open - Opened by R4ZZ3 19 days ago - 1 comment
Labels: fixed - pending confirmation

#1191 - URGENT: unsloth saved lora adapter config not supported in VLLM

Issue - State: closed - Opened by xinyudong93 19 days ago - 1 comment

#1190 - Errors with pip installation in Docker containers with torch 2.5

Issue - State: closed - Opened by SyedA5688 20 days ago - 5 comments

#1189 - raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'") AttributeError: 'LongRopeRotaryEmbedding' object has no attribute 'long_cos_cached'. Did you mean: 'short_cos_cached'?

Issue - State: open - Opened by SnehaKumari14 20 days ago - 4 comments

#1188 - Cleanup upcast logs

Pull Request - State: closed - Opened by Datta0 20 days ago

#1187 - pip install --upgrade --no-cache-dir unsloth BROKE CUDA packages. Inference slower.

Issue - State: open - Opened by pusapatiakhilraju 20 days ago - 3 comments

#1186 - 25% less mem and 10% faster training: Do not upcast lm_head and embedding to float32

Pull Request - State: closed - Opened by Datta0 20 days ago

#1185 - ModuleNotFoundError : Failed to import transformers.models.falcon_mamba.configuration_falcon_mamba

Issue - State: open - Opened by CurtiusSimplus 20 days ago - 2 comments

#1184 - RuntimeError: Expected out tensor to have dtype c10::BFloat16, but got float instead

Issue - State: open - Opened by Brightatkmitl 20 days ago - 3 comments

#1183 - does unlsoth support freeze tunning

Issue - State: open - Opened by NathanaelTamirat 20 days ago - 2 comments

#1182 - Fix 4.47 issue

Pull Request - State: closed - Opened by danielhanchen 20 days ago

#1181 - NameError: name 'Unpack' is not defined

Issue - State: open - Opened by CurtiusSimplus 20 days ago - 12 comments
Labels: fixed - pending confirmation

#1180 - fix/transformers-unpack

Pull Request - State: closed - Opened by Erland366 20 days ago - 2 comments
