Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / arcee-ai/mergekit issues and pull requests
#395 - Error at MoE Qwen 1.5B
Issue -
State: closed - Opened by ehristoforu 6 months ago
- 3 comments
#394 - Null vocab_file Issue with mistral v03 based models when using union tokenizer source
Issue -
State: open - Opened by guillermo-gabrielli-fer 6 months ago
- 2 comments
#393 - Is there a way to run LORA extraction using multi GPU? 70B LORA extraction OOM on 24GB 3090Ti
Issue -
State: open - Opened by Nero10578 6 months ago
- 4 comments
#392 - Example case of task_arithmetic needed
Issue -
State: open - Opened by Opdoop 6 months ago
- 1 comment
#391 - MoE exits itself after expert prompts 100% 2/2
Issue -
State: open - Opened by SameedHusayn 6 months ago
#390 - mergekit saves tied and ignored weights unlike what transformers does when saving
Issue -
State: open - Opened by nyxkrage 6 months ago
#389 - Create Communication Channels for MergeKit
Issue -
State: open - Opened by aditya-cherukuru 6 months ago
#388 - The speed issue with the GTATask.
Issue -
State: open - Opened by daidaiershidi 6 months ago
- 3 comments
#387 - ABM corrections
Pull Request -
State: open - Opened by metric-space 6 months ago
#386 - How to Create a New Merging Method
Issue -
State: open - Opened by Guozhenyuan 6 months ago
- 1 comment
#385 - Result of merging 2 Gemma2 9B models gains 1B parameters somehow
Issue -
State: closed - Opened by jim-plus 6 months ago
- 6 comments
#383 - does not appear to have a file named config.json
Issue -
State: open - Opened by bxf1001 6 months ago
- 2 comments
#382 - Added support for DeepseekV2 model
Pull Request -
State: open - Opened by aditya-29 7 months ago
- 4 comments
#379 - Does mergekit-moe support qwen?
Issue -
State: open - Opened by hoooooli 7 months ago
- 5 comments
#378 - Questions about Config
Issue -
State: open - Opened by Zheng-Jay 7 months ago
- 2 comments
#377 - mergekit-evolve doesn't account for higher_is_better: false tasks.
Issue -
State: open - Opened by mekaneeky 7 months ago
- 1 comment
#375 - Network is unreachable
Issue -
State: closed - Opened by guanfaqian 7 months ago
- 1 comment
#370 - remove strict version of pydantic
Pull Request -
State: closed - Opened by sreev 7 months ago
- 1 comment
#366 - Add Della merge method
Pull Request -
State: closed - Opened by Tej-Deep 7 months ago
- 6 comments
#364 - gracefully pause evolutionary optimization?
Issue -
State: open - Opened by johnwee1 7 months ago
- 1 comment
#360 - Condense a model's layers.
Issue -
State: open - Opened by DewEfresh 7 months ago
- 1 comment
#357 - NuSLERP
Pull Request -
State: closed - Opened by cg123 8 months ago
- 1 comment
#350 - qwen2-0.5B cannot be merged into MoE
Issue -
State: closed - Opened by letterk 8 months ago
- 5 comments
#341 - Evolutionary Merging out of memory
Issue -
State: open - Opened by ArcherShirou 8 months ago
- 4 comments
#340 - Weights Metrics
Pull Request -
State: open - Opened by ElliotStein 8 months ago
#335 - Merge arbitrary pytorch models
Pull Request -
State: open - Opened by cg123 8 months ago
- 1 comment
#333 - `extract_lora.py` improvements and fixes
Pull Request -
State: closed - Opened by jukofyork 9 months ago
- 12 comments
#332 - Add --load-in-4bit and --load-in-8bit for HF eval backend
Pull Request -
State: closed - Opened by cg123 9 months ago
#319 - How to merge a VLM and LLM with different model type.
Issue -
State: open - Opened by tanyakansal30 9 months ago
- 1 comment
#312 - Qwen/Qwen1.5-1.8B MoE Merging fails
Issue -
State: closed - Opened by dgolchin 9 months ago
- 4 comments
#251 - Attempt to make zipit work speak the same language as rest of mergekit
Pull Request -
State: closed - Opened by metric-space 10 months ago
#249 - Mainly adding modified M_U computation.
Pull Request -
State: closed - Opened by shamanez 10 months ago
#243 - _pickle.UnpicklingError: Unsupported type torch._tensor._rebuild_from_type_v2
Issue -
State: open - Opened by rangan2510 10 months ago
- 5 comments
#207 - Evolutionary Merging Method
Issue -
State: open - Opened by codelauncher444 11 months ago
- 19 comments
#198 - Idea: Downscaling the K and/or Q matrices for repeated layers in franken-merges?
Issue -
State: open - Opened by jukofyork 11 months ago
- 63 comments
#195 - Add support for GPTBigCodeForCausalLM
Pull Request -
State: closed - Opened by cg123 11 months ago
- 2 comments
#179 - Automatic Weight Calc based on NearSwap
Pull Request -
State: closed - Opened by Steel-skull 12 months ago
- 2 comments
#168 - Support for Merge methods which require some input data?
Issue -
State: closed - Opened by ita9naiwa 12 months ago
- 2 comments
#167 - Adds a new method to shuffle/swap values
Pull Request -
State: open - Opened by Ar57m 12 months ago
- 5 comments
#158 - qwen2 architecture definition
Pull Request -
State: closed - Opened by thomasgauthier about 1 year ago
- 6 comments
#150 - Fix phi-2 merging to MoE.
Pull Request -
State: closed - Opened by PhilipMay about 1 year ago
- 4 comments
#101 - moe - ValidationError: 1 validation error for MergeConfiguration
Issue -
State: closed - Opened by naseerfaheem about 1 year ago
- 1 comment
#100 - Adds a way of merging models with different sizes(B)
Pull Request -
State: closed - Opened by Ar57m about 1 year ago
- 10 comments
#99 - Add JAISLMHeadModel
Pull Request -
State: closed - Opened by cg123 about 1 year ago
#98 - JapaneseStableLMAlphaForCausalLM support
Pull Request -
State: closed - Opened by cg123 about 1 year ago
#97 - KeyError: 'model.embed_tokens.weight' when using mergekit-moe
Issue -
State: open - Opened by axrwl about 1 year ago
- 5 comments
#96 - Support for Qwen model
Issue -
State: closed - Opened by sorasoras about 1 year ago
- 8 comments
#95 - Why can't I run mergekit-moe command in mixtral branch ?
Issue -
State: closed - Opened by ZhangEnmao about 1 year ago
- 1 comment
#94 - Can you implement the expansion and merging of hidden_size and expand the original hidden?
Issue -
State: open - Opened by win10ogod about 1 year ago
- 2 comments
#93 - Just merge models
Issue -
State: closed - Opened by ftgreat about 1 year ago
- 1 comment
#92 - support for JAISLMHeadModel
Issue -
State: closed - Opened by h9-tect about 1 year ago
- 7 comments
#91 - Support for JapaneseStableLMAlphaForCausalLM
Issue -
State: open - Opened by azulika about 1 year ago
- 1 comment
#90 - Don't use Safetensors
Issue -
State: closed - Opened by fakerybakery about 1 year ago
- 1 comment
#89 - Latest commit to Mixtral branch causes script to never run
Issue -
State: closed - Opened by Dakraid about 1 year ago
- 1 comment
#88 - Support LLaMA MoE?
Issue -
State: closed - Opened by cdj0311 about 1 year ago
- 2 comments
#87 - Even when gate_mode is set to random, it is still required to input different positive prompts.
Issue -
State: closed - Opened by aoi-naive about 1 year ago
- 1 comment
#86 - Can mergekit be applied to merge multiple LoRA checkpoints by weights?
Issue -
State: open - Opened by authurlord about 1 year ago
- 3 comments
#85 - RuntimeError: Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again
Issue -
State: closed - Opened by Quang-elec44 about 1 year ago
- 6 comments
#84 - Question
Issue -
State: closed - Opened by dillfrescott about 1 year ago
- 3 comments
#83 - Add support for GPTBigCodeForCausalLM
Pull Request -
State: closed - Opened by cg123 about 1 year ago
- 1 comment
#82 - Convert Phi to Llama
Issue -
State: closed - Opened by fakerybakery about 1 year ago
- 5 comments
#81 - Support for MoE Phi-2
Issue -
State: open - Opened by beratcmn about 1 year ago
- 2 comments
#80 - Support for GPTBigCodeForCausalLM (StarCoder)
Issue -
State: closed - Opened by rpand002 about 1 year ago
- 1 comment
#79 - Add tokenizer merging tests
Pull Request -
State: closed - Opened by cg123 about 1 year ago
#78 - fix(mega): use config file name for final merge
Pull Request -
State: closed - Opened by nyxkrage about 1 year ago
- 2 comments
#77 - [Mixtral] Positive Prompt Format
Issue -
State: closed - Opened by fakerybakery about 1 year ago
- 2 comments
#76 - merge_method on moe
Issue -
State: closed - Opened by guidevops about 1 year ago
- 2 comments
#75 - no error simple question
Issue -
State: open - Opened by kalle07 about 1 year ago
- 3 comments
#74 - The differences in principles and effects between the various merging methods
Issue -
State: open - Opened by hywchina about 1 year ago
- 1 comment
#73 - Tokenizer merge fix
Pull Request -
State: closed - Opened by cg123 about 1 year ago
- 1 comment
#72 - mergekit-mega: compound merging using multiple yaml documents in a single merge config
Pull Request -
State: closed - Opened by nyxkrage about 1 year ago
- 5 comments
#71 - Union tokenizer merging seems to break lazy tensor loading
Issue -
State: closed - Opened by brucethemoose about 1 year ago
- 1 comment
#70 - confuse about parameter t in slerp
Issue -
State: open - Opened by zyh3826 about 1 year ago
- 5 comments
#69 - Fix fp16 DARE on CPU
Pull Request -
State: closed - Opened by cg123 about 1 year ago
#68 - Has anyone tried modal.com for merging models ?
Issue -
State: open - Opened by MarcelBP about 1 year ago
- 1 comment
#67 - Computational Graph Overhaul
Pull Request -
State: closed - Opened by cg123 about 1 year ago
#66 - Phi 2
Pull Request -
State: closed - Opened by cg123 about 1 year ago
#65 - Move argument parsing to click
Pull Request -
State: closed - Opened by cg123 about 1 year ago
#64 - Mixtral branch : What happens when we give both positive and negative prompts per an expert ?
Issue -
State: open - Opened by shamanez about 1 year ago
- 4 comments
#63 - While quantized by awq, error `KeyError: 'block_sparse_moe.experts.0.w2'`
Issue -
State: open - Opened by xiechengmude about 1 year ago
- 2 comments
#62 - gradient merge
Issue -
State: open - Opened by thistleknot about 1 year ago
- 1 comment
#61 - Mixtral-moe branch minor issue
Issue -
State: closed - Opened by RoseTheLocalFem about 1 year ago
- 1 comment
#60 - Convert Mistral -> Llama
Issue -
State: closed - Opened by fakerybakery about 1 year ago
- 3 comments
#59 - frankenllama_22
Issue -
State: closed - Opened by fakerybakery about 1 year ago
- 2 comments
#58 - Generate Hugging Face model card
Pull Request -
State: closed - Opened by cg123 about 1 year ago
- 11 comments
#57 - Lazy tensor loader
Issue -
State: open - Opened by sudy-super about 1 year ago
- 3 comments
#56 - Eval
Issue -
State: open - Opened by darkzbaron about 1 year ago
#55 - Relevant literature for these methods
Issue -
State: closed - Opened by petroskarypis about 1 year ago
- 1 comment
#54 - mergekit-moe seems to fail
Issue -
State: closed - Opened by dillfrescott about 1 year ago
- 3 comments
#53 - Could you please explain how passthrough slicing works?
Issue -
State: closed - Opened by dillfrescott about 1 year ago
- 2 comments
#52 - phi-2 error
Issue -
State: closed - Opened by win10ogod about 1 year ago
- 4 comments
#51 - Runtime Error, please help
Issue -
State: closed - Opened by TuyulBrutal about 1 year ago
- 2 comments
#50 - MergeKit models do not behave the same as the original model
Issue -
State: open - Opened by casper-hansen about 1 year ago
- 2 comments
#49 - Why two different options generate different size of models?
Issue -
State: closed - Opened by DopeorNope-Lee about 1 year ago
- 1 comment
#48 - add for loop for slices_in
Pull Request -
State: closed - Opened by teilomillet about 1 year ago
- 1 comment
#47 - Can Models with Different vocab_size be Merged?
Issue -
State: open - Opened by ZeroYuJie about 1 year ago
- 6 comments
#46 - Support for Microsoft Phi-2
Issue -
State: closed - Opened by shamanez about 1 year ago
- 3 comments
#45 - Here is a fix for mergekit-moe on Apple Silicon
Issue -
State: closed - Opened by ddh0 about 1 year ago
- 1 comment
#44 - Can you conduct TIES merging only on the embedding weights of two models?
Issue -
State: closed - Opened by shamanez about 1 year ago
- 2 comments
#43 - Deci 7B Support
Issue -
State: open - Opened by fakerybakery about 1 year ago
- 2 comments