lyogavin/airllm issues and pull requests

#182 - B70 need

Issue - State: open - Opened by ayttop 6 days ago

#181 - How to set system prompt

Issue - State: open - Opened by OKHand-Zy 17 days ago - 1 comment

#180 - unsloth/Meta-Llama-3.1-405B-Instruct-bnb-4bit

Issue - State: open - Opened by kendiyang 19 days ago - 2 comments

#179 - delete_original

Issue - State: open - Opened by ayttop 20 days ago - 4 comments

#178 - RuntimeError: shape '[1, 5, 8, 128]' is invalid for input of size 10240 LLama 405B 4-bit on Layer 1

Issue - State: open - Opened by TitleOS 20 days ago - 3 comments

#177 - Compression does not work with MLX / Apple Silicon

Issue - State: open - Opened by sammcj 23 days ago

#176 - Fix pip not found when install in Jupyter

Pull Request - State: closed - Opened by chinkan 24 days ago

#175 - CUDA Out of memory RTX 4060TI 16G

Issue - State: open - Opened by 1272870698 27 days ago

#174 - Fixing mlx model load

Pull Request - State: closed - Opened by Razikus about 1 month ago - 1 comment

#173 - added delete_original support for single modelfiles

Pull Request - State: closed - Opened by NavodPeiris about 1 month ago

#172 - RuntimeError: shape '[1, 13, 8, 128]' is invalid for input of size 26624

Issue - State: open - Opened by zhuojun1024 about 1 month ago - 6 comments

#170 - #169: fixed error when running on cpu and added post install command to upgrade transformers

Pull Request - State: closed - Opened by NavodPeiris about 1 month ago

#169 - Error when running on CPU device and rope_scaling error when using old version of transformers

Issue - State: closed - Opened by NavodPeiris about 1 month ago - 1 comment

#168 - mlx Linear weight arrays were loaded with a dict of arrays

Issue - State: closed - Opened by shiwanlin about 1 month ago - 1 comment

#167 - mlx embedding indexing failure - ValueError: Cannot index mlx array using the given type.

Issue - State: closed - Opened by shiwanlin about 1 month ago - 2 comments

#166 - how to increase speed of inference

Issue - State: open - Opened by Tdrinker about 1 month ago - 1 comment

#165 - Position Embedding with Seq > 512

Issue - State: open - Opened by Codys12 about 2 months ago - 1 comment

#164 - Data Parallel across multiple GPUs?

Issue - State: open - Opened by Codys12 about 2 months ago

#163 - name 'dynamically_import_QuantLinear' is not defined

Issue - State: open - Opened by gyyixr about 2 months ago - 1 comment
Labels: enhancement, future work

#162 - layer_name is not defined before use

Issue - State: open - Opened by yjleo17 about 2 months ago - 4 comments

#161 - Circular import error in importing partially initialised module airllm

Issue - State: closed - Opened by samarthpusalkar about 2 months ago - 1 comment

#160 - AssertionError: model.safetensors.index.json should exist

Issue - State: open - Opened by huangyifu about 2 months ago

#159 - I can’t run llama-3.1-405B-Instruct-bnb-4bit because of a ValueError: rope_scaling must be a dictionary with two fields.

Issue - State: open - Opened by LCG22 about 2 months ago - 1 comment

#158 - can not run llama 3.1 405B

Issue - State: open - Opened by taozhiyuai about 2 months ago - 2 comments

#157 - docs: add Japanese README

Pull Request - State: closed - Opened by eltociear about 2 months ago

#156 - AttributeError: 'AirLLMLlama2' object has no attribute '_supports_cache_class'

Issue - State: open - Opened by Source61 2 months ago - 2 comments

#155 - Ramdisk

Issue - State: open - Opened by HennethAnnun 2 months ago

#154 - how to use Qwen2-72B-instuct

Issue - State: open - Opened by shenhai-ran 2 months ago - 2 comments

#153 - AssertionError: Torch not compiled with CUDA enabled

Issue - State: open - Opened by smartdawg 3 months ago - 1 comment

#152 - Some grammar suggested fixes in README.md

Pull Request - State: closed - Opened by TheTechOddBug 3 months ago

#151 - No english readme for rlhf

Issue - State: open - Opened by drawnwren 3 months ago

#150 - How?

Issue - State: closed - Opened by nonetrix 3 months ago - 1 comment

#149 - AttributeError: 'list' object has no attribute 'absmax' when I load Qwen-72B-Chat with 8-bit compression with AirLLMQWen

Issue - State: open - Opened by Yang-bug-star 3 months ago

#148 - I want to use in-context learning in qwen1.5-72b-chat inference and thus use tokenizer.apply_chat_template as in the official tutorial, however ValueError: max() arg. Doesn't airllm support the official inference way ?

Issue - State: open - Opened by Yang-bug-star 3 months ago

#147 - I want to use in-context learning in qwen1.5-72b-chat inference and thus use tokenizer.apply_chat_template as in the official tutorial, however ValueError: max() arg is an empty sequence

Issue - State: closed - Opened by Yang-bug-star 3 months ago

#146 - Add support for Mistral model inference

Issue - State: open - Opened by kunling-cxk 3 months ago

#145 - ImportError: cannot import name 'AutoModel' from partially initialized module 'airllm' (most likely due to a circular import)

Issue - State: closed - Opened by leobilocastro 3 months ago

#144 - Linear(in_features=28672, out_features=8192, bias=False) does not have a parameter or a buffer named qweight.

Issue - State: open - Opened by luzacao 4 months ago

#143 - WeChat QR Code out of date

Issue - State: open - Opened by zixianwang2022 4 months ago

#142 - air_llm: README fix MacOS typo

Pull Request - State: closed - Opened by hiemal 4 months ago

#137 - safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer

Issue - State: open - Opened by chuangzhidan 5 months ago

#136 - Insufficient disk space

Issue - State: open - Opened by ulisesbussi 5 months ago - 3 comments

#135 - CPU ram offload

Issue - State: open - Opened by NicolasMejiaPetit 5 months ago

#134 - error in apple mac m3

Issue - State: open - Opened by mustangs0786 5 months ago - 5 comments

#133 - Does airllm support quantized gguf/gptq/awq models ?

Issue - State: open - Opened by robik72 5 months ago

#132 - COMPILED_WITH_CUDA error requires libcuda.so

Issue - State: open - Opened by nickums 5 months ago

#131 - Error with Llama3: ValueError: Trying to set a tensor of shape torch.Size([1024, 8192]) in "weight" (which has shape torch.Size([8192, 8192])), this look incorrect.

Issue - State: closed - Opened by Cangshanqingshi 5 months ago

#130 - Cannot get chatglm3 to run; please advise.

Issue - State: open - Opened by ZiQiangXie 5 months ago - 2 comments

#129 - segmentation fault python3 airllm2.py

Issue - State: open - Opened by taozhiyuai 5 months ago - 3 comments

#128 - to run llama3-70b,but fail to import. why?

Issue - State: closed - Opened by taozhiyuai 5 months ago

#127 - Any CoreML implementation plans?

Issue - State: open - Opened by Proryanator 5 months ago

#126 - Mac 'str' object has no attribute 'sequences'

Issue - State: open - Opened by gr3enarr0w 5 months ago

#125 - "src" directory name is conflicted

Issue - State: open - Opened by Rambo55555 5 months ago

#124 - how to delete the original download model after it has been downloaded

Issue - State: open - Opened by ruiguo-bio 5 months ago - 1 comment

#123 - Running on Mac get traceback error

Issue - State: closed - Opened by gr3enarr0w 5 months ago - 3 comments

#122 - How can a model downloaded through Ollama be used directly in airllm?

Issue - State: open - Opened by w1005444804 5 months ago - 2 comments

#121 - Request for llama3 support

Issue - State: closed - Opened by CrazyBoyM 5 months ago - 2 comments
Labels: enhancement

#120 - The following error is encountered when running the sample code

Issue - State: open - Opened by Nuclear6 6 months ago

#119 - compression parameter on Mac doesn't work

Issue - State: open - Opened by dnvs 6 months ago

#118 - Support for OPT Architecture

Issue - State: open - Opened by varunlmxd 6 months ago

#117 - Is it possible to use AirLLM with a quantized input model?

Issue - State: open - Opened by Verdagon 6 months ago - 3 comments
Labels: enhancement

#116 - mac m2 run air llm garage-bAInd/Platypus2-7B get error Input must be a file-like object opened in binary mode, or string

Issue - State: open - Opened by wuxiongwei 7 months ago - 6 comments

#115 - It seems only very few characters can be generated

Issue - State: closed - Opened by andeyeluguo 7 months ago - 2 comments

#114 - Add UI like AUTOMATIC1111 for stable-diffusion-webui

Issue - State: open - Opened by janmartin 7 months ago

#112 - Which 70B model does macOS support?

Issue - State: open - Opened by ruifengma 7 months ago

#111 - Generation takes forever

Issue - State: closed - Opened by Kira-Pgr 8 months ago - 4 comments

#109 - Optimize for consumer GPU, eg 11GB or 16GB

Issue - State: open - Opened by profintegra 8 months ago

#108 - AirLLM: Support for DirectML

Issue - State: open - Opened by vegax87 8 months ago - 1 comment

#107 - attn impl to sdpa...

Issue - State: open - Opened by saa1028 8 months ago - 4 comments

#106 - AMD gpu support

Issue - State: open - Opened by hanq-moreh 8 months ago - 1 comment

#105 - For me this model is extremely underperforming

Issue - State: open - Opened by SadafShafi 8 months ago - 1 comment

#104 - Macbook "Torch not compiled with CUDA enabled" Error

Issue - State: closed - Opened by LanLanBoom 8 months ago - 2 comments

#103 - Running the Yi-34B-chat model with airllm reports this error after layer splitting

Issue - State: open - Opened by peiyanyang 8 months ago - 1 comment

#102 - Will the airllm framework be adapted for the streaming output functionality of different models in the future?

Issue - State: open - Opened by wangqn1 8 months ago
Labels: future work

#101 - ValueError: LlamaForCausalLM does not support an attention implementation through torch.nn.functional.scaled_dot_product_attention yet.

Issue - State: open - Opened by sleeper1023 8 months ago - 1 comment
Labels: bug

#100 - AirLLMLlamaMlx fails to load model with mlx==0.0.7

Issue - State: open - Opened by jakule 9 months ago
Labels: bug

#99 - Whether chat models can be used with airllm

Issue - State: open - Opened by wzz981 9 months ago - 1 comment
Labels: question

#98 - how to infer on multiple gpus?

Issue - State: closed - Opened by yuxx0218 9 months ago - 1 comment
Labels: wontfix

#97 - Fix TYPO

Pull Request - State: closed - Opened by Naozumi520 9 months ago

#96 - Finetune 70B on 24GB 4090?

Issue - State: open - Opened by Naozumi520 9 months ago - 1 comment
Labels: future work

#95 - microsoft-phi2:max() arg is an empty sequence

Issue - State: open - Opened by zazaji 9 months ago - 1 comment
Labels: future work

#94 - ImportError: cannot import name AutoMode

Issue - State: closed - Opened by zazaji 9 months ago - 1 comment

#93 - safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

Issue - State: open - Opened by fudp 9 months ago - 1 comment
Labels: bug

#91 - ValueError: max() arg is an empty sequence(Apple M2 Max, macOS 14.2.1)

Issue - State: open - Opened by tvsj 9 months ago - 6 comments
Labels: future work

#90 - Discord Invite Expired in the readme

Issue - State: open - Opened by birdup000 9 months ago - 1 comment
Labels: help wanted

#89 - Would adding Parallelism speed up AirLLM?

Issue - State: open - Opened by birdup000 9 months ago
Labels: question

#88 - Mac quantization

Issue - State: open - Opened by ageorgios 9 months ago
Labels: question

#87 - Mac Airllm Inference tigerbot-70b-chat-v2

Issue - State: open - Opened by ageorgios 9 months ago
Labels: bug

#86 - configure the chunk split size

Issue - State: open - Opened by ageorgios 9 months ago
Labels: question

#85 - Does Airllm support sqlcoder-34b which was fine-tuned on codellama?

Issue - State: closed - Opened by mw-hv 9 months ago - 1 comment

#84 - Mixtral models seem to run forever

Issue - State: closed - Opened by Josh-XT 9 months ago - 1 comment

#83 - mistral model loads indefinitely

Issue - State: closed - Opened by fenglui 9 months ago - 2 comments
Labels: help wanted

#82 - Mistral Mixtral model support

Issue - State: closed - Opened by birdup000 9 months ago

#81 - RuntimeError: cannot pin 'torch.cuda.HalfTensor' only dense CPU tensors can be pinned

Issue - State: closed - Opened by birdup000 9 months ago - 2 comments
Labels: bug

#80 - what's the difference or advantage of airllm vs flexgen?

Issue - State: open - Opened by showkeyjar 9 months ago - 1 comment
Labels: enhancement

#79 - [Feature Request] Mixtral Model Support

Issue - State: closed - Opened by birdup000 9 months ago - 11 comments
Labels: bug

#78 - Does airllm support loading fine-tuned models such as P-tuning and LoRA?

Issue - State: open - Opened by estuday 9 months ago
Labels: enhancement
