Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / THUDM/SwissArmyTransformer issues and pull requests
#188 - Why does this error still occur even though I have changed all the paths: Traceback (most recent call last): File "/root/autodl-tmp/model/XrayGLM/VisualGLM-6B-main/finetune_visualglm.py", line 9, in <module> from sat.model.finetune.lora2 import LoraMixin ModuleNotFoundError: No module named 'sat.model.finetune.lora2' [2024-12-25 00:31:28,139] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 1338
Issue -
State: open - Opened by zxdpro 27 days ago
#187 - ImportError: cannot import name 'rotate_half' from 'sat.model.official.llama_model'
Issue -
State: open - Opened by kkkwjr about 2 months ago
#186 - UnboundLocalError: local variable 'batch_size' referenced before assignment
Issue -
State: open - Opened by Money8888 about 2 months ago
- 1 comment
#185 - Is there a reference example of the training data format passed to MetaDistributedWebDataset?
Issue -
State: open - Opened by xiabaoyulo 4 months ago
#184 - SAT fused_ema_adam_frontend error
Issue -
State: closed - Opened by AlphaNext 4 months ago
#183 - assert isinstance(new_mixin, BaseMixin) AssertionError: were the weights updated, causing this error?
Issue -
State: closed - Opened by corkiyao 4 months ago
- 1 comment
#182 - When training visualglm-6B with Deepspeed zero stage2, the final weights come out at 25G
Issue -
State: open - Opened by corkiyao 4 months ago
#181 - Problem encountered when converting llama3.1
Issue -
State: closed - Opened by strivebfq 5 months ago
- 2 comments
#180 - Is the ema model saved correctly?
Issue -
State: open - Opened by Blueskyvvvvv 5 months ago
- 1 comment
#179 - TypeError: sat.model.transformer.BaseTransformer() got multiple values for keyword argument 'parallel_output'
Issue -
State: open - Opened by deep-practice 6 months ago
- 35 comments
#178 - How should resuming training from a checkpoint be configured?
Issue -
State: open - Opened by elesun2018 9 months ago
- 6 comments
#177 - transfer_param.py reports an error when converting the vincuna hf model to a sat model
Issue -
State: open - Opened by Lunatic-Solar 10 months ago
- 17 comments
#176 - How to install a model to the right path?
Issue -
State: closed - Opened by link89 10 months ago
- 1 comment
#175 - NO cogagent?
Issue -
State: open - Opened by Mac0q 10 months ago
- 2 comments
#174 - ModuleNotFoundError: No module named 'localAttention'
Issue -
State: open - Opened by BlueSkyyyyyy 10 months ago
#173 - “No backend type associated with device type cpu” when run cli_demo_sat.py
Issue -
State: open - Opened by yileld 11 months ago
- 5 comments
#172 - To finetune while bypassing deepspeed, can I call model.step() directly during training?
Issue -
State: open - Opened by cocoshe 11 months ago
- 2 comments
#171 - Using CogVLM - KeyError (MODEL_URLS) - Google Colab
Issue -
State: closed - Opened by Baggiorobertozoba 11 months ago
- 1 comment
#170 - In MixtralMlpMixin(), the moe only computes the expert logits, but I do not see any dispatch logic
Issue -
State: open - Opened by AlenjandroWang 12 months ago
- 1 comment
#169 - Can't AutoModel.from_pretrained() load the hf version of the weights?
Issue -
State: open - Opened by AlenjandroWang 12 months ago
- 1 comment
#168 - Can't AutoModel.from_pretrained() load hf weights?
Issue -
State: closed - Opened by AlenjandroWang 12 months ago
#167 - How to resume finetuning from a checkpoint
Issue -
State: open - Opened by zoumaguanxin 12 months ago
- 1 comment
#166 - MoE support
Pull Request -
State: closed - Opened by 1049451037 12 months ago
#165 - fix rotary bug when q seqlen > cos seqlen
Pull Request -
State: closed - Opened by leizhao1234 12 months ago
#164 - support chatglm rotary in triton
Pull Request -
State: closed - Opened by leizhao1234 12 months ago
#163 - How can samples be balanced for a dataset with imbalanced sample counts?
Issue -
State: open - Opened by lln556 12 months ago
- 1 comment
#162 - Questions about your LoRA codes
Issue -
State: closed - Opened by miznchimaki 12 months ago
- 7 comments
#161 - deepspeed distributed training: loss nan or inf
Issue -
State: open - Opened by JohnTang93 about 1 year ago
- 1 comment
#160 - Does sat support saving checkpoints in fp16 or bf16?
Issue -
State: open - Opened by xxxwuwq about 1 year ago
- 4 comments
#159 - add accumulate ema and fix fp32 weight bug
Pull Request -
State: closed - Opened by leizhao1234 about 1 year ago
#158 - Memory usage too high during single-node multi-GPU training
Issue -
State: closed - Opened by zodiacg about 1 year ago
- 2 comments
#157 - Can SwissArmyTransformer read bin weight files? The visualglm-6b project has no pt files, only bin, which makes finetuning difficult
Issue -
State: closed - Opened by qq577288254 about 1 year ago
- 5 comments
#156 - fix zero3 check
Pull Request -
State: closed - Opened by Sleepychord about 1 year ago
#155 - fix model parallel inconsistent init
Pull Request -
State: closed - Opened by Sleepychord about 1 year ago
#154 - update ema
Pull Request -
State: closed - Opened by leizhao1234 about 1 year ago
#153 - support MoE & Mixtral-8x7b
Pull Request -
State: closed - Opened by 1049451037 about 1 year ago
#152 - fix profiling
Pull Request -
State: closed - Opened by leizhao1234 about 1 year ago
#151 - merge main to glu
Pull Request -
State: closed - Opened by 1049451037 about 1 year ago
#150 - add profiling
Pull Request -
State: closed - Opened by leizhao1234 about 1 year ago
#149 - sat ValueError inconsistent occurs during deepspeed distributed training
Issue -
State: open - Opened by elesun2018 about 1 year ago
- 1 comment
#148 - How to embed video encoder module from pytorch?
Issue -
State: open - Opened by zyhzyh88 about 1 year ago
- 3 comments
#147 - mqa cross & stream chat
Pull Request -
State: closed - Opened by 1049451037 about 1 year ago
#146 - Can you help to confirm if chatglm3 model is same as GPT or it's original from GLM architecture?
Issue -
State: closed - Opened by tiendung about 1 year ago
- 3 comments
#145 - How to load the icetk_glm_130B tokenizer and the GLM130B model with hf?
Issue -
State: closed - Opened by Ajay-Wong about 1 year ago
- 6 comments
#144 - FileLock - out of date?
Issue -
State: closed - Opened by taziksh about 1 year ago
- 1 comment
#143 - How to load and initialize llama2 models downloaded from Huggingface
Issue -
State: closed - Opened by microhu about 1 year ago
- 2 comments
#142 - ore.exceptions.ResponseStreamingError
Issue -
State: open - Opened by AnnaYang2020 about 1 year ago
- 1 comment
#141 - Cannot use torch.compile with SAT
Issue -
State: open - Opened by lijing1996 about 1 year ago
#140 - Rotary embedding
Pull Request -
State: closed - Opened by leizhao1234 over 1 year ago
#139 - Rotary embedding
Pull Request -
State: closed - Opened by leizhao1234 over 1 year ago
#138 - Streaming datasets are not supported
Issue -
State: closed - Opened by af-74413592 over 1 year ago
- 2 comments
#137 - Fail to load random states from checkpoints saved
Issue -
State: open - Opened by minkowski0125 over 1 year ago
- 2 comments
#136 - Fix params dtype bug
Pull Request -
State: closed - Opened by Jintao-Huang over 1 year ago
- 1 comment
#135 - fix lost bias when quantize from pre-trained model parameters
Pull Request -
State: closed - Opened by jimmieliu over 1 year ago
- 3 comments
#134 - fix lost bias when quantize from pre-trained model parameters
Pull Request -
State: closed - Opened by jimmieliu over 1 year ago
- 1 comment
#133 - ModuleNotFoundError: No module named 'SwissArmyTransformer'
Issue -
State: open - Opened by B-1368 over 1 year ago
- 6 comments
#132 - How to handle running out of memory during finetuning when the dataset is too large?
Issue -
State: closed - Opened by Syno8 over 1 year ago
- 1 comment
#131 - A question: how should the loss be written when using mp_size=2?
Issue -
State: open - Opened by kunden0612 over 1 year ago
- 1 comment
#130 - How to configure lora finetuning with model parallelism?
Issue -
State: open - Opened by kunden0612 over 1 year ago
- 5 comments
#129 - During BERT decoding, past_key_values is used to accelerate calculation. Do we have a similar implementation?
Issue -
State: open - Opened by etrigger over 1 year ago
- 1 comment
#128 - Windows installation error
Issue -
State: open - Opened by mai1015 over 1 year ago
- 2 comments
#127 - Error when using version 0.2.x
Issue -
State: open - Opened by Yuziyi1117 over 1 year ago
- 1 comment
#126 - fix a slightly inappropriate default value.
Pull Request -
State: closed - Opened by hhnqqq over 1 year ago
#125 - Error when testing the qlora.py provided in the source code
Issue -
State: open - Opened by shituo123456 over 1 year ago
- 7 comments
#124 - sat.arguments.get_args failed to handle the "-h" option
Issue -
State: open - Opened by limjcst over 1 year ago
- 1 comment
#123 - chatglm model parallel
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#122 - reframe mp_split
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#121 - llama-30b & -65b
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#120 - Kv cache
Pull Request -
State: open - Opened by leizhao1234 over 1 year ago
#119 - change model-parallel-size online
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#118 - SAT Tokenizer URL is down
Issue -
State: open - Opened by youngstu over 1 year ago
- 1 comment
#117 - chatglm2 release
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#116 - ChatGLM2-6B
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#115 - save ema parameters
Pull Request -
State: closed - Opened by leizhao1234 over 1 year ago
- 1 comment
#114 - stream_filling_sequence function
Pull Request -
State: closed - Opened by wenyihong over 1 year ago
#113 - Can the dataloader shuffle?
Issue -
State: closed - Opened by XaviLv over 1 year ago
- 2 comments
#112 - Which branch holds the source code for the 0.2.12 release?
Issue -
State: closed - Opened by guohuanliang1 over 1 year ago
- 1 comment
#111 - The huggingface version of visualglm raises an error in the forward pass, Exception: cuda rng state model-parallel-rng is not added
Issue -
State: open - Opened by zhangyuanscall over 1 year ago
- 1 comment
#110 - fix webds import
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#109 - fix prefatch typo
Pull Request -
State: closed - Opened by Sleepychord over 1 year ago
#108 - merge meta info
Pull Request -
State: closed - Opened by Sleepychord over 1 year ago
#107 - add llama 13b & generation example
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#106 - add llama inference
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#105 - add llama
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#104 - add ema-adam
Pull Request -
State: closed - Opened by leizhao1234 over 1 year ago
#103 - Request for documentation: the relationship between "Swing Army Transformer" and Nvidia's "FasterTransformer"
Issue -
State: closed - Opened by Oukaishen over 1 year ago
- 1 comment
#102 - Error when installing in an image
Issue -
State: open - Opened by xinyubai1209 over 1 year ago
- 2 comments
#101 - fix lora merge
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#100 - Can sat integrate seamlessly with transformers & huggingface hub?
Issue -
State: open - Opened by SwordFaith over 1 year ago
- 1 comment
#99 - Qlora
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#98 - preserve linear parallel lora
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#97 - remove redundant argument
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#96 - add qlora support
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#95 - How to use DeepSpeed's offload feature to reduce GPU memory usage?
Issue -
State: open - Opened by yt7589 over 1 year ago
#94 - add lora merging interface
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#93 - adapt vit to new version
Pull Request -
State: closed - Opened by 1049451037 over 1 year ago
#92 - Error during installation
Issue -
State: open - Opened by ge90114b over 1 year ago
- 1 comment
#91 - pypl Tsinghua mirror does not have swissarmytransformer
Issue -
State: closed - Opened by yhyu13 over 1 year ago
- 2 comments
#90 - fix base_strategy
Pull Request -
State: closed - Opened by lykeven over 1 year ago