tsinghuaai/cpm-2-finetune issues and pull requests

#50 - Cpm2 open

Issue - State: open - Opened by Suleymanlizad 6 months ago

#49 - After changing mp size, training fails with a size mismatch error

Issue - State: open - Opened by samulew almost 2 years ago - 2 comments

#48 - How can a trained model be converted to the Hugging Face model format?

Issue - State: open - Opened by Tron1994 almost 2 years ago

#47 - The official pretrained model parameters come as 4 model-parallel files; does this require the model-parallel size to be 4?

Issue - State: open - Opened by Tron1994 about 2 years ago - 1 comment

#46 - Loading the 100000 model: _load_zero_checkpoint fails, reporting that no zero_pp_rank* files were found

Issue - State: open - Opened by Tron1994 about 2 years ago - 1 comment

#45 - T5Dataset in CPM2Datasets.py raises an error

Issue - State: closed - Opened by dingtine about 2 years ago - 1 comment

#44 - Error loading FusedAdam on A100

Issue - State: open - Opened by giter000 about 2 years ago - 1 comment

#43 - Why is there no text generation example for CPM2? After prompt training, it is unclear how to run inference

Issue - State: open - Opened by touwenameng over 2 years ago - 1 comment

#42 - Docker DeepSpeed error: ssh: Could not resolve hostname node0: Name or service not known

Issue - State: closed - Opened by sebastian-nehrdich over 2 years ago - 4 comments

#41 - prompt adgen files missing

Issue - State: closed - Opened by zhu1090093659 over 2 years ago - 3 comments

#40 - Could you share the data and demos for the classical poetry translation and recipe generation demos built internally?

Issue - State: closed - Opened by jiangliqin over 2 years ago

#39 - How to do few-shot text generation tasks with CPM2

Issue - State: closed - Opened by zhihao-chen over 2 years ago

#38 - Question: using the Chinese-English bilingual model produces the following error:

Issue - State: closed - Opened by Chunhui-Zou almost 3 years ago - 1 comment

#37 - Question: why is the test data not used when running the prompt script? And where are the prompt-trained model parameters saved?

Issue - State: closed - Opened by Chunhui-Zou almost 3 years ago - 5 comments

#36 - Help: what is the maximum input sequence length the model supports?

Issue - State: closed - Opened by Chunhui-Zou almost 3 years ago - 2 comments

#35 - Which part of the code should be modified to see the model's generated output?

Issue - State: closed - Opened by Chunhui-Zou almost 3 years ago

#34 - Is the Math23K test_private.json file not publicly available?

Issue - State: closed - Opened by XiaoqingNLP almost 3 years ago - 3 comments

#33 - RuntimeError: Unable to proceed, no GPU resources available

Issue - State: open - Opened by louxingrui about 3 years ago - 2 comments

#32 - How should the dataset be processed? I downloaded the LSCTS dataset and the program raises an error when run.

Issue - State: closed - Opened by Chunhui-Zou about 3 years ago - 2 comments

#31 - Create finetune_cpm2_sogou-log.sh

Pull Request - State: closed - Opened by xcjthu about 3 years ago

#30 - CPM2 model inference code

Issue - State: closed - Opened by Bournet about 3 years ago - 1 comment

#29 - Fine-tuning strategy for CPM2 on generation tasks

Issue - State: closed - Opened by XiaoqingNLP about 3 years ago - 3 comments

#28 - How to add new words and then finetune

Issue - State: closed - Opened by LinglingGreat about 3 years ago - 3 comments

#27 - cublas error in an 8-GPU A100 environment

Issue - State: closed - Opened by linjianz about 3 years ago - 1 comment

#26 - Converting the cpm2.0 pt model file to fp32_state_dict with the deepspeed tool fails

Issue - State: closed - Opened by linjianz about 3 years ago

#25 - How to use BMInf to inference 100000.tar 11B model?

Issue - State: closed - Opened by linjianz about 3 years ago - 1 comment

#24 - Partitioning problem when changing the model-parallel degree

Issue - State: closed - Opened by leelinglin about 3 years ago - 1 comment

#23 - GPU out of memory during model fine-tune

Issue - State: closed - Opened by leelinglin about 3 years ago - 4 comments

#22 - how to use 32000.tar?

Issue - State: closed - Opened by AdamBear about 3 years ago

#21 - Distributed training on two machines with 8 GPUs

Issue - State: closed - Opened by forrestbing over 3 years ago - 2 comments

#20 - the decoder input in evaluate_gen()

Issue - State: closed - Opened by GMago-LeWay over 3 years ago - 1 comment

#19 - GPU memory usage

Issue - State: closed - Opened by 2020zyc over 3 years ago - 2 comments

#18 - Is it possible to run without deepspeed?

Issue - State: closed - Opened by 2020zyc over 3 years ago - 7 comments

#17 - bug? save_zero

Issue - State: closed - Opened by 2020zyc over 3 years ago - 1 comment

#16 - attention.dense.weight not found when prompt fine tuning

Issue - State: closed - Opened by luotongml over 3 years ago - 17 comments

#15 - Can it run on 2 A100-40G GPUs?

Issue - State: closed - Opened by 2020zyc over 3 years ago - 11 comments

#14 - prompt tuning question

Issue - State: closed - Opened by zirui over 3 years ago - 4 comments

#13 - What is the purpose of the sentinel id?

Issue - State: closed - Opened by wakafengfan over 3 years ago - 1 comment

#12 - Can CPM-2 run in playground model, any prompt hint?

Issue - State: closed - Opened by qhduan over 3 years ago - 1 comment

#11 - Finetune loss and acc are poor

Issue - State: closed - Opened by k15201363625 over 3 years ago - 10 comments

#10 - docker run fails

Issue - State: closed - Opened by windflee over 3 years ago - 1 comment

#9 - MoE Finetune

Issue - State: closed - Opened by lizy14 over 3 years ago - 2 comments

#8 - Error at startup when using two machines

Issue - State: closed - Opened by lonelydancer over 3 years ago - 3 comments

#7 - deepspeed init hangs

Issue - State: closed - Opened by lonelydancer over 3 years ago - 3 comments

#6 - How fast is CPM-2-Finetuning inference (on V100)?

Issue - State: closed - Opened by lonelydancer over 3 years ago - 2 comments

#5 - adgen dataset

Issue - State: closed - Opened by zhenhao-huang over 3 years ago - 4 comments

#4 - What is the minimum configuration for finetune?

Issue - State: closed - Opened by eshaoliu over 3 years ago - 10 comments

#3 - Can inference run on a single V100?

Issue - State: closed - Opened by zuowang over 3 years ago - 1 comment

#2 - Error

Issue - State: closed - Opened by superqing001 over 3 years ago

#1 - Could you provide a reference script for running with docker?

Issue - State: closed - Opened by superqing001 over 3 years ago - 2 comments
