Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / shibing624/MedicalGPT issues and pull requests

#426 - Loss decreases slowly when pretraining from scratch

Issue - State: closed - Opened by dage0127 6 days ago - 3 comments
Labels: enhancement

#425 - Problems reproducing the medical LLM and loading training data

Issue - State: closed - Opened by veresse about 1 month ago - 1 comment
Labels: question

#424 - Continued pretraining of a domain LLM on Qwen-7B

Issue - State: closed - Opened by veresse about 1 month ago - 6 comments
Labels: question

#423 - Could a copy of the shibing624/medical dataset be uploaded to the ModelScope community?

Issue - State: open - Opened by hecheng64 about 2 months ago - 2 comments
Labels: question

#422 - Please fix requirements.txt: one package is missing an equals sign

Issue - State: closed - Opened by immmor about 2 months ago
Labels: bug

#421 - Takes too long

Issue - State: open - Opened by ucaslei 2 months ago
Labels: question

#420 - Error when running ppo_training.py on multiple GPUs

Issue - State: open - Opened by cqray1990 2 months ago - 1 comment
Labels: bug

#419 - Fine-tuning a Qwen1.8b model fails with TypeError: argument 'tokens': 'NoneType' object cannot be interpreted as an integer

Issue - State: open - Opened by wangxinwwang 2 months ago
Labels: bug

#418 - Question about data processing: how is the position_ids field generated after tokenization?

Issue - State: open - Opened by XiaozhuLove 2 months ago - 4 comments
Labels: question

#417 - Error when training a reward model with Llama-2-13b-hf

Issue - State: closed - Opened by cqray1990 2 months ago - 1 comment
Labels: bug

#416 - Perplexity increased after fine-tuning; should fine-tuning continue?

Issue - State: open - Opened by cqray1990 2 months ago - 1 comment
Labels: bug

#415 - The sharegpt_gpt4 data looks unrelated to medicine; why can it be used to fine-tune medical multi-turn dialogue?

Issue - State: open - Opened by cqray1990 2 months ago - 1 comment
Labels: question

#414 - Which metrics were used to compare the SFT model against the base model? Could you share a repository link?

Issue - State: closed - Opened by cqray1990 2 months ago - 1 comment
Labels: question

#413 - Is fine-tuning llama 3.1 supported?

Issue - State: open - Opened by cqray1990 3 months ago - 2 comments
Labels: bug

#412 - Can the vocabulary expansion script be applied to Qwen2?

Issue - State: closed - Opened by LarryLong45 3 months ago - 1 comment
Labels: question

#411 - qwen1.5-0.5b-chat produces no output at inference after SFT following the tutorial

Issue - State: closed - Opened by LarryLong45 3 months ago - 8 comments
Labels: question

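#410 - Running baichuan2-13b on Windows with four 3090 GPUs: the model does not seem to be distributed across the GPUs, and GPU memory fills up immediately and hits OOM. How can this be solved?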

Issue - State: open - Opened by Ruiruiz30 3 months ago - 3 comments
Labels: bug

#409 - On the composition of reward model training data

Issue - State: open - Opened by Eren139 3 months ago - 6 comments
Labels: question

#408 - Running run_rm.sh with full-parameter training on two GPUs raises RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons:

Issue - State: open - Opened by XiaozhuLove 3 months ago - 1 comment
Labels: bug

#407 - Update README.md

Pull Request - State: closed - Opened by ZhuangXialie 3 months ago

#406 - Incremental pretraining

Issue - State: closed - Opened by cqray1990 3 months ago - 1 comment
Labels: bug

#405 - Create validate_jsonl.py

Pull Request - State: closed - Opened by ZhuangXialie 3 months ago

#404 - Test the perplexity

Pull Request - State: closed - Opened by ZhuangXialie 3 months ago

#403 - In the RM stage, loss drops to 0 and the plot looks strange

Issue - State: closed - Opened by zhengshi119 3 months ago - 9 comments
Labels: question

#402 - Update README_EN.md

Pull Request - State: closed - Opened by ZhuangXialie 3 months ago

#401 - Update model_quant.py

Pull Request - State: closed - Opened by ZhuangXialie 3 months ago

#400 - Complete quantification

Pull Request - State: closed - Opened by ZhuangXialie 3 months ago

#399 - Create Multi-GPUs-deployment.sh

Pull Request - State: closed - Opened by ZhuangXialie 3 months ago

#398 - At prediction time, warnings say Attention Mask is not set and Attention Mask is not set

Issue - State: closed - Opened by huangrs494 3 months ago - 3 comments
Labels: question

#397 - Is multilingual English and Spanish possible?

Issue - State: closed - Opened by johnfelipe 3 months ago - 1 comment
Labels: question

#396 - Problem loading large amounts of data

Issue - State: closed - Opened by dage0127 4 months ago - 2 comments
Labels: question

#391 - Running sh ./run_ppo.sh raises ValueError: Target modules q_proj,v_proj not found in the base model. Please check the target modules and try again (steps to reproduce)

Issue - State: closed - Opened by iomgaa-ycz 4 months ago - 2 comments
Labels: question

#388 - Question about local training

Issue - State: closed - Opened by Ruiruiz30 4 months ago - 1 comment
Labels: question

#385 - Notebook error

Issue - State: closed - Opened by cheun726 5 months ago - 1 comment
Labels: question

#378 - On llama3 weight conversion

Issue - State: closed - Opened by tszslovewanpu 6 months ago - 1 comment
Labels: question

#377 - Trying out the full medical LLM pipeline

Issue - State: closed - Opened by YoshuaBengio 6 months ago - 2 comments
Labels: question

#376 - Error when running pretraining.py: RuntimeError: CUDA error: device-side assert triggered

Issue - State: closed - Opened by Wenting1227 6 months ago - 5 comments
Labels: bug

#373 - Model merging problem after vocab expansion

Issue - State: closed - Opened by sungatetop 6 months ago - 1 comment
Labels: question

#366 - After continued pretraining of a chat model, it asks and answers its own questions

Issue - State: closed - Opened by wsl1014 7 months ago - 2 comments
Labels: question

#365 - Why are the training stages all independent? RM does not even use the SFT adapter

Issue - State: closed - Opened by cqray1990 7 months ago - 3 comments
Labels: bug

#361 - Question about reward_modeling

Issue - State: closed - Opened by tuqingwen 7 months ago - 1 comment
Labels: question

#358 - Regarding RLHF and DPO training data

Issue - State: closed - Opened by Aniketto16 7 months ago - 2 comments
Labels: question

#356 - ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (3,2) and requested shape (1,2)

Issue - State: closed - Opened by Riapy 8 months ago - 1 comment
Labels: bug

#355 - LoRA model merging

Issue - State: closed - Opened by sevenandseven 8 months ago - 2 comments
Labels: question

#352 - Can SFT be run directly after expanding the vocabulary?

Issue - State: closed - Opened by HaotianLiu123 8 months ago - 2 comments
Labels: question

#351 - After pretraining, the model answers its own questions, outputs unknown sequences, and stutters with repetitions

Issue - State: closed - Opened by Peter-of-Astora 8 months ago - 7 comments
Labels: question

#349 - Evaluating the effect of incremental pretraining

Issue - State: closed - Opened by qibao77 8 months ago - 1 comment
Labels: question

#347 - RM training with llama fails with ValueError: weight is on the meta device, we need a `value` to put in on cpu.

Issue - State: closed - Opened by cove1011 8 months ago - 1 comment
Labels: bug

#346 - Pretraining with qwen fails with: Cannot copy out of meta tensor; no data!

Issue - State: closed - Opened by cove1011 8 months ago - 1 comment
Labels: bug

#315 - In the PT stage with a fairly large base model (Yi-67B), which multi-node multi-GPU training approach works best?

Issue - State: closed - Opened by listwebit 10 months ago - 1 comment
Labels: question

#293 - Question about DPO with multi-turn dialogue

Issue - State: closed - Opened by chloefresh 11 months ago - 3 comments
Labels: question

#291 - Is the strategy used for single-node multi-GPU supervised fine-tuning DP or DDP?

Issue - State: closed - Opened by CNUIGB 11 months ago - 1 comment
Labels: question

#284 - When validating the Reward model's classification scores, why does one question return two tensors?

Issue - State: closed - Opened by waycup7 11 months ago - 2 comments
Labels: question

#280 - Loss rises instead of falling during incremental pretraining on my own data

Issue - State: closed - Opened by SevenMpp 11 months ago - 12 comments
Labels: question

#270 - Is model evaluation needed?

Issue - State: closed - Opened by chenkang404 12 months ago - 2 comments
Labels: question

#264 - After multi-turn dialogue SFT, replies contain repeated sentences during testing

Issue - State: closed - Opened by chloefresh 12 months ago - 2 comments
Labels: question

#261 - About incremental pretraining

Issue - State: closed - Opened by tszslovewanpu about 1 year ago - 2 comments
Labels: question

#258 - convert_dataset.py always processes only a single file!

Issue - State: closed - Opened by tuqingwen about 1 year ago - 1 comment
Labels: question

#256 - DPO training suddenly aborts partway through the run

Issue - State: closed - Opened by tuqingwen about 1 year ago - 1 comment

#255 - Question about the format of merged models

Issue - State: closed - Opened by tszslovewanpu about 1 year ago - 2 comments
Labels: question

#134 - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cuda:0! (when checking argument for argument target in method wrapper_CUDA_nll_loss_forward)

Issue - State: closed - Opened by gaoxc315 over 1 year ago - 3 comments
Labels: question

#102 - Output prints as empty after loading the model

Issue - State: closed - Opened by yingzhao27 over 1 year ago - 1 comment
Labels: question

#101 - Problem running the PT stage with a chatglm model on the author-provided medical dataset

Issue - State: closed - Opened by daimazz1 over 1 year ago - 16 comments
Labels: question

#100 - Is Lora a PT method? Is Lora a fine-tuning method?

Issue - State: closed - Opened by chengzhen123 over 1 year ago - 1 comment
Labels: question

#99 - About model training parameters

Issue - State: closed - Opened by daimazz1 over 1 year ago - 3 comments
Labels: question

#98 - Is there any evaluation of results? How does each step (PT, SFT, RM, PPO) perform?

Issue - State: closed - Opened by ljch2018 over 1 year ago - 5 comments
Labels: question

#97 - Multi-node multi-GPU CUDA problem when fine-tuning chatglm2 with deepspeed

Issue - State: closed - Opened by 1716649290 over 1 year ago - 3 comments
Labels: bug

#96 - What is the purpose of the group_texts() method in pretraining?

Issue - State: closed - Opened by Zagreus-lzy over 1 year ago - 4 comments
Labels: question

#95 - Can Lora_rank be set differently for each stage?

Issue - State: closed - Opened by Alfer-Feng over 1 year ago - 2 comments
Labels: question

#93 - Optimize prompt

Pull Request - State: closed - Opened by iioSnail over 1 year ago

#92 - Error when merging lora weights for chatglm2

Issue - State: closed - Opened by lzw2000118 over 1 year ago - 5 comments
Labels: bug

#91 - Cannot resume training after an interruption

Issue - State: closed - Opened by yangcm1986 over 1 year ago - 1 comment
Labels: bug

#90 - Question about preparing the corpus for continued pretraining

Issue - State: closed - Opened by valkryhx over 1 year ago - 1 comment
Labels: question

#89 - num_train_samples in the chatglm2-6b PT stage

Issue - State: closed - Opened by 535603775 over 1 year ago - 1 comment
Labels: question

#88 - Error calling the merge script after continued pretraining of chatglm2-6B

Issue - State: closed - Opened by valkryhx over 1 year ago
Labels: question

#87 - deepspeed distributed GPU allocation problem

Issue - State: closed - Opened by xxm1668 over 1 year ago
Labels: bug

#86 - Problems in the RM and RL stages with chatglm-6b

Issue - State: closed - Opened by daimazz1 over 1 year ago - 1 comment
Labels: question

#85 - Why does GPU memory usage not decrease when passing load_in_8bit=True?

Issue - State: closed - Opened by valkryhx over 1 year ago - 5 comments
Labels: question, wontfix

#84 - [chatglm2 and Baichuan-13B] Support for continued pretraining and fine-tuning?

Issue - State: closed - Opened by LouisHeck over 1 year ago - 10 comments
Labels: bug

#83 - Two questions after incremental pretraining: 1) token long tail 2) group texts

Issue - State: closed - Opened by Zagreus-lzy over 1 year ago - 12 comments
Labels: question

#82 - Stage4 PPOtrainer logging problem

Issue - State: closed - Opened by ymyjl over 1 year ago - 3 comments
Labels: question

#81 - Question

Issue - State: closed - Opened by fxb392 over 1 year ago - 1 comment
Labels: question

#80 - How to obtain the ziya-llama-13b-medical-lora model weights

Issue - State: closed - Opened by daimazz1 over 1 year ago - 3 comments
Labels: question

#79 - Question about pretraining

Issue - State: closed - Opened by xiaohengDa over 1 year ago - 6 comments
Labels: bug

#78 - datasets version problem: validation_split_percentage type error

Issue - State: closed - Opened by yysirs over 1 year ago - 1 comment
Labels: bug

#77 - RuntimeError: Distributed package doesn't have NCCL built in when pretrain

Issue - State: closed - Opened by SeekPoint over 1 year ago - 1 comment
Labels: bug

#76 - I started fine-tuning with my own data; is this caused by a problem with my data?

Issue - State: closed - Opened by opooopooo over 1 year ago - 8 comments
Labels: bug

#75 - I extracted your code to run on its own and it raised an error; any pointers would be appreciated, thanks!

Issue - State: closed - Opened by xxm1668 over 1 year ago - 9 comments
Labels: question

#74 - chatglm raises an error in the reward model stage; advice appreciated

Issue - State: closed - Opened by xxm1668 over 1 year ago - 1 comment
Labels: bug

#73 - Running run_rm.sh with deepspeed

Issue - State: closed - Opened by yangzhipeng1108 over 1 year ago - 4 comments
Labels: bug

#72 - The gradio_demo demo misbehaves erratically

Issue - State: closed - Opened by YuanEZhou over 1 year ago - 3 comments
Labels: question

#71 - Error during SFT of a bloom model

Issue - State: closed - Opened by DDYuudachi over 1 year ago - 3 comments
Labels: help wanted, wontfix

#70 - Multi-node multi-GPU

Issue - State: closed - Opened by zhr0313 over 1 year ago - 4 comments
Labels: question

#69 - [SFT + DeepSpeed full-parameter training] Inference results are abnormal, very strange!!!

Issue - State: closed - Opened by yanqiangmiffy over 1 year ago - 15 comments
Labels: bug

#68 - Error when merging after SFT fine-tuning of chatglm2

Issue - State: closed - Opened by shawnlihst over 1 year ago - 3 comments
Labels: question

#67 - Does chatglm's PROMPT_TEMPLATE need to follow the format of the official prediction script?

Issue - State: closed - Opened by MRKINKI over 1 year ago - 1 comment
Labels: question

#66 - run_pt.sh: 42: run_pt.sh: --deepspeed: not found

Issue - State: closed - Opened by yangzhipeng1108 over 1 year ago - 3 comments
Labels: bug

#65 - Error running run_pt.py with ChatGLM2

Issue - State: closed - Opened by dijkstra-mose over 1 year ago - 1 comment
Labels: question

#64 - Will it be beneficial to apply qlora in the pretraining stage?

Issue - State: closed - Opened by yeeeqichen over 1 year ago - 1 comment
Labels: question

#63 - Pretraining for chatglm and chatglm2

Issue - State: closed - Opened by calvinzhan over 1 year ago - 2 comments
Labels: question, wontfix