lyogavin/Anima issues and pull requests

#157 - docs: add Japanese README

Pull Request - State: open - Opened by eltociear 4 months ago

#103 - Running the Yi-34B-chat model with airllm reports this error after layer splitting

Issue - State: open - Opened by peiyanyang 10 months ago - 1 comment

#102 - Will the airllm framework be adapted for the streaming output functionality of different models in the future?

Issue - State: open - Opened by wangqn1 10 months ago
Labels: future work

#101 - ValueError: LlamaForCausalLM does not support an attention implementation through torch.nn.functional.scaled_dot_product_attention yet.

Issue - State: open - Opened by sleeper1023 10 months ago - 1 comment
Labels: bug

#100 - AirLLMLlamaMlx fails to load model with mlx==0.0.7

Issue - State: open - Opened by jakule 10 months ago
Labels: bug

#99 - Can chat models be used with airllm?

Issue - State: open - Opened by wzz981 10 months ago - 1 comment
Labels: question

#98 - how to infer on multiple gpus?

Issue - State: closed - Opened by yuxx0218 10 months ago - 1 comment
Labels: wontfix

#97 - Fix TYPO

Pull Request - State: closed - Opened by Naozumi520 10 months ago

#96 - Finetune 70B on 24GB 4090?

Issue - State: open - Opened by Naozumi520 10 months ago - 1 comment
Labels: future work

#95 - microsoft-phi2:max() arg is an empty sequence

Issue - State: open - Opened by zazaji 10 months ago - 1 comment
Labels: future work

#94 - ImportError: cannot import name AutoMode

Issue - State: closed - Opened by zazaji 10 months ago - 1 comment

#93 - safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

Issue - State: open - Opened by fudp 11 months ago - 1 comment
Labels: bug

#91 - ValueError: max() arg is an empty sequence(Apple M2 Max, macOS 14.2.1)

Issue - State: open - Opened by tvsj 11 months ago - 6 comments
Labels: future work

#90 - Discord Invite Expired in the readme

Issue - State: open - Opened by birdup000 11 months ago - 1 comment
Labels: help wanted

#89 - Would adding Parallelism speed up AirLLM?

Issue - State: open - Opened by birdup000 11 months ago
Labels: question

#88 - Mac quantization

Issue - State: open - Opened by ageorgios 11 months ago
Labels: question

#87 - Mac Airllm Inference tigerbot-70b-chat-v2

Issue - State: open - Opened by ageorgios 11 months ago
Labels: bug

#86 - configure the chunk split size

Issue - State: open - Opened by ageorgios 11 months ago
Labels: question

#85 - Does Airllm support sqlcoder-34b which was fine-tuned on codellama?

Issue - State: closed - Opened by mw-hv 11 months ago - 1 comment

#84 - Mixtral models seem to run forever

Issue - State: closed - Opened by Josh-XT 11 months ago - 1 comment

#83 - mistral model loads indefinitely

Issue - State: closed - Opened by fenglui 11 months ago - 2 comments
Labels: help wanted

#82 - Mistral Mixtral model support

Issue - State: closed - Opened by birdup000 11 months ago

#81 - RuntimeError: cannot pin 'torch.cuda.HalfTensor' only dense CPU tensors can be pinned

Issue - State: closed - Opened by birdup000 11 months ago - 2 comments
Labels: bug

#80 - what's the difference or advantage of airllm vs flexgen?

Issue - State: open - Opened by showkeyjar 11 months ago - 1 comment
Labels: enhancement

#79 - [Feature Request] Mixtral Model Support

Issue - State: closed - Opened by birdup000 11 months ago - 11 comments
Labels: bug

#78 - Does airllm support loading fine-tuned models such as p-tuning and lora?

Issue - State: open - Opened by estuday 11 months ago
Labels: enhancement

#77 - Mistral Model ValueError: Asking to pad but the tokenizer does not have a padding token

Issue - State: closed - Opened by birdup000 11 months ago - 2 comments

#76 - fix unbound error

Pull Request - State: closed - Opened by birdup000 11 months ago - 2 comments

#75 - Running the example inference code and I get error

Issue - State: closed - Opened by birdup000 11 months ago - 3 comments

#74 - ImportError: cannot import name 'AirLLMLlama2' from partially initialized module 'airllm' (most likely due to a circular import)

Issue - State: closed - Opened by kw408 11 months ago - 1 comment

#73 - RuntimeError: expected scalar type struct c10::Half but found double

Issue - State: closed - Opened by quifas 11 months ago - 2 comments
Labels: bug

#72 - cant initial NVML.... No CUDA GPUs are available

Issue - State: closed - Opened by hiqsociety 11 months ago - 4 comments
Labels: help wanted

#71 - cant initial NVML

Issue - State: closed - Opened by hiqsociety 11 months ago - 2 comments
Labels: help wanted

#70 - Support for T5 based models

Issue - State: open - Opened by balachandarsv 11 months ago - 1 comment
Labels: enhancement

#69 - So much time loading

Issue - State: open - Opened by Alvaro8gb 11 months ago - 2 comments
Labels: bug, help wanted

#68 - Does max-position-embeddings need to be adjusted for extrapolation training?

Issue - State: open - Opened by ScottishFold007 11 months ago - 1 comment
Labels: enhancement

#67 - AirLLMLlama2 error: TypeError: llama_forward() got an unexpected keyword argument 'padding_mask'

Issue - State: closed - Opened by nguyen-viet-hung 11 months ago - 5 comments
Labels: bug

#66 - Create kunrt

Pull Request - State: open - Opened by liqikun0000 11 months ago

#65 - Error on first run

Issue - State: open - Opened by jalle007 11 months ago - 2 comments
Labels: question

#64 - Question on inference run time.

Issue - State: open - Opened by KryptixOne 11 months ago
Labels: question

#62 - ValueError: max() arg is an empty sequence

Issue - State: open - Opened by tutu329 11 months ago - 1 comment
Labels: help wanted

#61 - Pruning the internlm-chat-20b model fails

Issue - State: closed - Opened by hingkan 11 months ago - 2 comments

#60 - Provide minimum required versions

Issue - State: open - Opened by rmlopes 11 months ago - 1 comment
Labels: enhancement

#59 - qwen raises an error

Issue - State: closed - Opened by lymanzhao 11 months ago - 11 comments
Labels: bug

#58 - Training llama2 70B with dpo+lora fails

Issue - State: open - Opened by LMXKO 11 months ago
Labels: bug

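#57 - Could a parameter setting be added? For example, a configurable option to choose manually, or automatically by usage frequency, which layers go in GPU memory, which in RAM, and which on disk, to speed up inference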

Issue - State: open - Opened by quida01 11 months ago
Labels: enhancement

#56 - Inference seems somewhat slow. qwen-14B inference on a 2080TI takes about two and a half minutes.

Issue - State: closed - Opened by sunzhaoyang1 11 months ago - 2 comments
Labels: question

#55 - Error when importing qwen: ValueError: max() arg is an empty sequence. airllm is the latest version.

Issue - State: closed - Opened by sunzhaoyang1 11 months ago - 5 comments
Labels: bug

#54 - Error in use: cannot import name 'cached_path' from 'transformers'. Perhaps related to the transformers version? My version is transformers-4.35.2

Issue - State: closed - Opened by sunzhaoyang1 11 months ago - 1 comment

#53 - loading qwen get TypeError: 'NoneType' object is not subscriptable

Issue - State: closed - Opened by Minami-su 11 months ago - 1 comment

#52 - max() arg is an empty sequence

Issue - State: closed - Opened by Minami-su 11 months ago - 6 comments
Labels: bug

#51 - Can qwen72b and qwen72b-int4 be supported?

Issue - State: closed - Opened by jeffreychen567 11 months ago - 1 comment
Labels: enhancement, future work

#50 - Is Qwen not supported yet?

Issue - State: closed - Opened by lymanzhao 11 months ago - 1 comment
Labels: enhancement

#48 - Typo in air_llm setup.py

Issue - State: closed - Opened by volkerjaenisch 11 months ago - 2 comments
Labels: bug

#47 - Doesn't work on Apple M1/M2. AssertionError: Torch not compiled with CUDA enabled.

Issue - State: closed - Opened by netandreus 11 months ago - 7 comments
Labels: future work

#46 - request to add support for CHATGLM3

Issue - State: closed - Opened by lyogavin 12 months ago - 1 comment
Labels: enhancement

#45 - Add reference to my work

Issue - State: closed - Opened by SimJeg 12 months ago - 1 comment
Labels: enhancement

#44 - More clever batching of layers

Issue - State: open - Opened by priamai 12 months ago - 11 comments
Labels: question

#43 - The model precision used for training with qlora

Issue - State: closed - Opened by Nipi64310 12 months ago - 3 comments
Labels: question

#42 - Does not deliver the correct result

Issue - State: closed - Opened by volkerjaenisch 12 months ago - 4 comments
Labels: question

#41 - more examples

Issue - State: open - Opened by csv610 12 months ago - 1 comment
Labels: question

#40 - pytorch_model.bin.index.json should exists.

Issue - State: open - Opened by vuminhquang 12 months ago - 6 comments
Labels: enhancement, future work

#39 - Printing one token in output

Issue - State: closed - Opened by thedunston 12 months ago - 3 comments
Labels: question

#38 - Support Mistral

Issue - State: closed - Opened by lyogavin 12 months ago - 1 comment
Labels: enhancement

#37 - Can AIR LLM support models such as Baichuan and qwen?

Issue - State: closed - Opened by sunzhaoyang1 12 months ago - 3 comments
Labels: enhancement

#36 - Speed comparison

Issue - State: open - Opened by lucasjinreal 12 months ago - 1 comment
Labels: enhancement

#35 - How to support training on very long texts?

Issue - State: closed - Opened by rookielyb about 1 year ago - 1 comment

#34 - Could you share how the loss changed during training?

Issue - State: closed - Opened by alex-ht about 1 year ago - 1 comment
Labels: question

#33 - The group QR code has expired

Issue - State: closed - Opened by bltcn about 1 year ago - 2 comments

#32 - Will the 100K training dataset be open-sourced?

Issue - State: closed - Opened by chaochen99 about 1 year ago - 1 comment

#31 - How should multi-turn dialogue data be formatted?

Issue - State: closed - Opened by ymmbb8882ymmbb about 1 year ago
Labels: future work

#30 - Minimum GPU memory required for model training

Issue - State: closed - Opened by LawlightXY about 1 year ago - 1 comment
Labels: question

#29 - Thanks

Issue - State: closed - Opened by aohan237 over 1 year ago

#28 - How exactly is accelerate used for multi-GPU inference?

Issue - State: closed - Opened by jaycehw over 1 year ago - 1 comment
Labels: good first issue

#27 - Which parameters need to change to turn qlora into lora?

Issue - State: open - Opened by dongteng over 1 year ago - 1 comment
Labels: question

#26 - Are there any evaluations? I would like to see them

Issue - State: closed - Opened by Mousaic over 1 year ago - 1 comment
Labels: question

#25 - Can deepspeed be used with this?

Issue - State: closed - Opened by LzhinFdu over 1 year ago - 1 comment
Labels: future work

#24 - Heavy repetition during inference with the lyogavin/Anima33B-DPO-Belle-1k-merged model

Issue - State: closed - Opened by xiu-ze over 1 year ago - 3 comments
Labels: enhancement, question

#23 - peft version issue: "addmm_impl_cpu_" not implemented for 'Half'

Issue - State: closed - Opened by liuyeah over 1 year ago - 2 comments
Labels: question

#22 - Want to run ppo on llama-13b

Issue - State: closed - Opened by zhaobinNF over 1 year ago
Labels: future work

#21 - Question about training parameters

Issue - State: closed - Opened by BillKiller over 1 year ago - 1 comment
Labels: question

#20 - Merging the Adapter model

Issue - State: closed - Opened by LiuChen19960902 over 1 year ago - 2 comments
Labels: question

#19 - About source_max_length and target_max_len in rlhf

Issue - State: open - Opened by jiahuanluo over 1 year ago - 3 comments
Labels: bug

#18 - Question about train

Issue - State: closed - Opened by wanghao-007 over 1 year ago - 1 comment
Labels: question

#17 - Must the Reference be the same as the base?

Issue - State: closed - Opened by BillKiller over 1 year ago - 1 comment
Labels: question

#16 - Is there a way to get stable Chinese output?

Issue - State: closed - Opened by AIchenkai over 1 year ago - 3 comments

#15 - Fix typo in README.md

Pull Request - State: closed - Opened by eltociear over 1 year ago

#14 - What does the special license refer to, what are the restrictions, and how can this be confirmed?

Issue - State: closed - Opened by wzg-zhuo over 1 year ago - 1 comment
Labels: question

#13 - What are the license requirements and what restrictions apply?

Issue - State: closed - Opened by wzg-zhuo over 1 year ago - 3 comments
Labels: question

#12 - [Suggestion] Hoping the author can provide a fine-tuning demo with 4bit or 8bit quantization

Issue - State: closed - Opened by AILWQ over 1 year ago
Labels: enhancement, future work

#11 - Inference fails for the example in the README

Issue - State: closed - Opened by 1MLightyears over 1 year ago - 5 comments
Labels: bug

#10 - Can it be deployed and run on a single 3090 24G card?

Issue - State: closed - Opened by RickyWang111 over 1 year ago - 1 comment

#9 - Low-cost fine-tuning and inference is really important

Issue - State: closed - Opened by TodayWei over 1 year ago - 2 comments
Labels: future work

#8 - Asking here: how to use the qlora training results?

Issue - State: closed - Opened by forestbat over 1 year ago - 1 comment

#7 - Error when reproducing training

Issue - State: closed - Opened by acbogeh over 1 year ago - 3 comments
Labels: bug

#6 - Are there plans to open-source the parallel training code and run scripts?

Issue - State: closed - Opened by liutongyang over 1 year ago - 1 comment
Labels: future work

#5 - About training time

Issue - State: closed - Opened by dourgey over 1 year ago - 2 comments

#4 - Does Anima have hardware requirements?

Issue - State: closed - Opened by kulame over 1 year ago - 1 comment

#3 - TypeError: not a string

Issue - State: closed - Opened by kulame over 1 year ago - 2 comments

#2 - fine-tuning QLoRA on other base models

Issue - State: closed - Opened by RanchiZhao over 1 year ago - 1 comment
Labels: future work
