arcee-ai/mergekit issues and pull requests

#498 - Handle unsupported architecture error

Pull Request - State: closed - Opened by FBarrca 4 days ago

#497 - Update copyright

Pull Request - State: closed - Opened by cg123 5 days ago

#496 - Allow specifying modules to save in mergekit-extract-lora

Pull Request - State: closed - Opened by cg123 5 days ago

#495 - Add decorator for easier merge method definition

Pull Request - State: closed - Opened by cg123 6 days ago

#494 - how to merge gguf quantized models?

Issue - State: open - Opened by lexasub 6 days ago

#493 - Add SCE merging

Pull Request - State: closed - Opened by yangzy39 6 days ago - 3 comments

#492 - MoE wish list and future implementation exploration

Issue - State: open - Opened by johnr14 8 days ago

#491 - Adds a new merge method SCE

Pull Request - State: closed - Opened by yangzy39 8 days ago - 1 comment

#490 - Handle weight aliases

Pull Request - State: closed - Opened by jrruethe 9 days ago - 2 comments

#489 - mismatch shape issues

Issue - State: open - Opened by imrankh46 12 days ago

#487 - Deepseek MoE, Union Tokenizer, Forwarding, Consistency

Pull Request - State: closed - Opened by jrruethe 15 days ago - 1 comment

#486 - Bug (extract-lora): CUSOLVER_STATUS_INVALID_VALUE

Issue - State: open - Opened by ngxson 19 days ago

#485 - support geodesic interpolation merging of chipalign paper ?

Issue - State: open - Opened by dlmastery 20 days ago - 1 comment

#484 - KeyError: 'lm_head.weight' while lora extraction

Issue - State: open - Opened by ehristoforu 20 days ago - 3 comments

#483 - extract_lora : support tied embeddings

Pull Request - State: closed - Opened by ngxson 22 days ago - 1 comment

#482 - ValueError: operands could not be broadcast together with shapes (12582912,1) (3072,8192)

Issue - State: open - Opened by bhuvneshsaini 24 days ago - 1 comment

#481 - Qwen MoE gives strange output

Issue - State: open - Opened by khoangothe 25 days ago

#480 - Add NearSwap

Pull Request - State: closed - Opened by popbyte 25 days ago - 4 comments

#479 - [Bug?] `dict()` error when loading YAML config

Issue - State: open - Opened by T145 28 days ago

#478 - [Question] Customizing special tokens

Issue - State: closed - Opened by T145 29 days ago

#477 - [Bug] "Mega merging" seems to no longer be supported.

Issue - State: closed - Opened by T145 about 1 month ago - 1 comment

#475 - Requirements for Model Merging

Issue - State: closed - Opened by eunbin079 about 1 month ago - 2 comments

#474 - Question about the implementation of sign election in TIES

Issue - State: closed - Opened by kobayashikanna01 about 1 month ago - 2 comments

#473 - Fix NuSLERP when merging tokenizers

Pull Request - State: closed - Opened by cg123 about 2 months ago

#472 - NuSLERP breaks

Issue - State: closed - Opened by RedrixHD about 2 months ago - 2 comments

#471 - Please Provide Configuration Example for nuslerp?

Issue - State: closed - Opened by ArcherShirou about 2 months ago

#470 - Merging LoRA adapters: "TypeError: object of type 'NoneType' has no len()"

Issue - State: closed - Opened by anika-ilieva about 2 months ago - 2 comments

#469 - Mergekit produces broken Tokenizers

Issue - State: closed - Opened by RedrixHD about 2 months ago - 2 comments

#468 - Spellcheck on the evolve doc

Pull Request - State: closed - Opened by T145 about 2 months ago - 1 comment

#467 - Update README

Pull Request - State: closed - Opened by cg123 about 2 months ago

#466 - docs: update README.md

Pull Request - State: closed - Opened by eltociear about 2 months ago - 1 comment

#465 - Pad embeds to multiple

Pull Request - State: closed - Opened by cg123 2 months ago

#464 - Better tied weight handling

Pull Request - State: closed - Opened by cg123 2 months ago

#463 - Handle optional weights in mergekit-moe

Pull Request - State: closed - Opened by cg123 2 months ago

#462 - Rewrite readme more novice-friendly

Issue - State: open - Opened by clover1980 2 months ago

#461 - mergekit for vision models

Issue - State: open - Opened by prince0310 2 months ago

#460 - Why are the names of parameters hard-coded? Is it possible to read it from index.json in HF checkpoints?

Issue - State: open - Opened by zhangzx-uiuc 2 months ago - 1 comment

#459 - Qwen2.5 LoRA Extraction not working in vLLM & Aphrodite Engine

Issue - State: open - Opened by Nero10578 2 months ago - 3 comments

#458 - add whisper model

Pull Request - State: open - Opened by sagewe 2 months ago

#457 - Critical Merging Bug just started...

Issue - State: open - Opened by David-AU-github 3 months ago - 2 comments

#455 - About Model-Breadcrumbs merge implementation

Issue - State: open - Opened by vishaal27 3 months ago

#454 - Base Model generation time increases when passed through the MergeKit

Issue - State: open - Opened by ahmedamrelhefnawy 3 months ago

#453 - N-model ModelStock merging

Issue - State: closed - Opened by vishaal27 3 months ago - 1 comment

#452 - Moe merging failed

Issue - State: open - Opened by PsoriasiIR 3 months ago - 2 comments

#451 - Use sst2 to eval merging

Pull Request - State: closed - Opened by VivianeGalvao 3 months ago

#450 - Merge Models with Non-Standard Architectures (e.g., Multimodal Models)

Pull Request - State: closed - Opened by ElliotStein 3 months ago - 10 comments

#449 - [question] multi gpu available?

Issue - State: open - Opened by eunbin079 3 months ago

#448 - Bump version number

Pull Request - State: closed - Opened by cg123 3 months ago

#447 - mergekit-extract-lora does not extract - the destination is empty

Issue - State: open - Opened by raulod 3 months ago - 2 comments

#446 - KeyError model[0] did not exist in tensor?

Issue - State: open - Opened by FrozzDay 3 months ago - 3 comments

#445 - Report issues regarding the architecture-agnostic branch.

Issue - State: open - Opened by win10ogod 3 months ago - 3 comments

#444 - Bump dependencies

Pull Request - State: closed - Opened by cg123 3 months ago

#442 - RuntimeError: Need to specify cache dir to merge adapters

Issue - State: closed - Opened by Zolilio 3 months ago - 1 comment

#441 - Add methods from https://arxiv.org/abs/2405.07813

Pull Request - State: closed - Opened by zsgvivo 3 months ago - 2 comments

#440 - add methods from https://arxiv.org/abs/2405.07813

Pull Request - State: closed - Opened by zsgvivo 3 months ago

#439 - 11

Issue - State: closed - Opened by meiyiyeshi 4 months ago

#438 - [question] `task_arithmetic` simple question

Issue - State: closed - Opened by eunbin079 4 months ago - 2 comments

#437 - After the two Qwen1.5-7B-chat models were merged, garbled inference results appeared.

Issue - State: closed - Opened by Zhangfanfan0101 4 months ago

#435 - Fixed the YML/YAML documentation for Qwen MoE creation

Pull Request - State: open - Opened by Nottlespike 4 months ago - 1 comment

#434 - [request] Support for Vision Language Models

Issue - State: closed - Opened by NickGao96 4 months ago - 14 comments

#433 - [request]Can it support architectures such as stable diffusion Xl and flux dev?

Issue - State: open - Opened by win10ogod 4 months ago - 2 comments

#432 - Initial implementation of PCB merge method

Pull Request - State: open - Opened by cg123 4 months ago

#431 - Update actors.py

Pull Request - State: open - Opened by kwon13 4 months ago

#430 - Handle merges stored as list instead of space-separated string

Pull Request - State: closed - Opened by cg123 4 months ago

#429 - Update Llama architecture to handle 3b/1b

Pull Request - State: closed - Opened by cg123 4 months ago

#428 - Broken tokenizer in Yi-34B merge

Issue - State: closed - Opened by Asherathe 4 months ago - 3 comments

#427 - I would like to merge the deepseekForCausalLM model. Are there any related examples available?

Issue - State: open - Opened by xaiocaibi 4 months ago

#426 - Merging Lora fine-tuned models with MoE

Issue - State: open - Opened by AmineBechar07 4 months ago

#425 - Qwen2.5 14B models are ... sometimes? ... having their token vocabulary truncated down to 'actual'?

Issue - State: open - Opened by ann-brown 4 months ago - 6 comments

#424 - Support for new Llama 3.2 - 1B / 3B ?

Issue - State: closed - Opened by David-AU-github 4 months ago - 14 comments

#423 - Support for Vision Model such as ViT

Issue - State: open - Opened by redagavin 4 months ago

#422 - Support for xlm-roberta

Issue - State: open - Opened by umiron 4 months ago - 2 comments

#421 - "mergekit-yaml" not created upon installation

Issue - State: open - Opened by BovineOverlord 4 months ago - 2 comments

#420 - How to use multi GPUs

Issue - State: open - Opened by liudan193 4 months ago - 1 comment

#419 - would you like to support Qwen2.5 Model?

Issue - State: closed - Opened by ArcherShirou 4 months ago - 1 comment

#418 - Input should be a valid dictionary or instance of MergeConfiguration

Issue - State: open - Opened by Hugo-Calero 4 months ago - 2 comments

#417 - Make Cohere lm_head optional

Pull Request - State: closed - Opened by cg123 5 months ago

#416 - Add Solar And Exaone Model

Pull Request - State: closed - Opened by shing100 5 months ago - 1 comment

#415 - Add support Exaone Model

Pull Request - State: closed - Opened by shing100 5 months ago - 2 comments

#414 - Re-Train every block with reduced width

Issue - State: closed - Opened by snapo 5 months ago

#413 - Fix README links

Pull Request - State: closed - Opened by cg123 5 months ago

#412 - Broken links on main page - " Arcee App"

Issue - State: closed - Opened by David-AU-github 5 months ago

#411 - The DARE-TIES experiment.

Issue - State: open - Opened by David-AU-github 5 months ago - 4 comments

#410 - Cloud Merging

Pull Request - State: closed - Opened by Jacobsolawetz 5 months ago

#409 - I am having problem merging GPT-Neo

Issue - State: open - Opened by 2625554780 5 months ago - 1 comment

#408 - support for GPT-Neo needed!

Issue - State: closed - Opened by 2625554780 5 months ago - 2 comments

#407 - Is it possible to merge Mistral 7B and Mistral NeMo 12B?

Issue - State: open - Opened by azulika 5 months ago - 1 comment

#406 - Set Gemma2 lm_head optional instead of aliasing to embed_tokens

Pull Request - State: closed - Opened by cg123 5 months ago

#405 - Add Phi3SmallForCausalLM and tweak Phi3

Pull Request - State: closed - Opened by cg123 5 months ago

#404 - How does a beginner merge models? YAML file configuration

Issue - State: open - Opened by yhyub 5 months ago - 1 comment

#403 - How to resolve: mergekit-yaml qwen_sail.yaml ./fddfgh/ Warmup loader cache: 100%|▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒| 2/2 [00:00<00:00, 64.02it/s] Executing graph: 0%| | 0/1457 [00:00<?, ?it/s]Segmentation fault

Issue - State: open - Opened by yhyub 5 months ago

#402 - Resolve runtime error

Issue - State: open - Opened by yhyub 5 months ago - 1 comment

#401 - Merging two mistral based models with different architectures. Looking for some guidance.

Issue - State: open - Opened by AshD 5 months ago - 1 comment

#400 - Example of a config file for task_arithmetic 'negative' operation and a case for 'Task analogies'

Issue - State: open - Opened by eunbin079 5 months ago - 1 comment

#399 - Working Example of the Mergekit-Evo

Issue - State: open - Opened by nthangelane 5 months ago

#398 - passthrough merge error: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407

Issue - State: closed - Opened by AshD 5 months ago - 2 comments

#397 - MergeKit GUI not working.

Issue - State: closed - Opened by Abdulhanan535 6 months ago

#396 - Support for Phi-3-Small [Feature ?]

Issue - State: open - Opened by hammoudhasan 6 months ago

#395 - Error at MoE Qwen 1.5B

Issue - State: closed - Opened by ehristoforu 6 months ago - 3 comments
