Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / PKU-Alignment/safe-rlhf issues and pull requests
#182 - [Question] A ValueError occurs during reward.sh execution
Issue -
State: closed - Opened by leezy18 16 days ago
Labels: question
#181 - Failing to train cost model (ValueError: The safer answer is not safer than the unsafer answer.)
Issue -
State: closed - Opened by cemiu 2 months ago
- 5 comments
Labels: question
#169 - [Question] Signals SIGKILL occurs during execution
Issue -
State: closed - Opened by NNStrings 8 months ago
Labels: question
#133 - [Question] reward model
Issue -
State: closed - Opened by kylin-zhou about 1 year ago
- 7 comments
Labels: question, need information
#110 - feat(logger): save script and hyperparameters to output directory
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, new feature
#109 - [Question] About the reward model and the reward critic model
Issue -
State: closed - Opened by zhaobinNF over 1 year ago
- 4 comments
Labels: question
#108 - [Question] Using opt1.3b as the reward model: the loss decreases but oscillates heavily
Issue -
State: closed - Opened by zhaobinNF over 1 year ago
- 5 comments
Labels: question
#107 - feat(serve): set `dtype` while loading models
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, cuda, new feature
#106 - fix(trainers/rl_trainer): always pass `max_length` argument when loading models
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
Labels: bug
#105 - fix(trainers/rl_trainer): fix assertion for micro training batch size
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug, enhancement
#104 - feat(values): Score Model Normalization
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
Labels: enhancement, new feature
#103 - feat(datasets): eliminate duplicate prompts for RLHF training
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement
#102 - fix(scripts): fix error messages for unknown arguments
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
#101 - feat(dataset): add HhRLHFPreference Dataset
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
#100 - feat(datasets): support preference model and rlhf training for dialogue
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
Labels: enhancement
#99 - feat(serve): support streaming output for CLI
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
Labels: enhancement, new feature
#98 - [Question] score_model training support for baichuan model
Issue -
State: closed - Opened by skepsun over 1 year ago
- 2 comments
Labels: question
#96 - docs(README): add notes for Chinese support
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: documentation, enhancement
#95 - docs(README): 🎉 release checkpoints for `beaver-7b-v1.0` and its friends
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: documentation
#94 - feat(scripts): randomize torch distributed master port
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, dependency, new feature
#93 - chore(score_model): set architectures for `ScoreModel`s in `model.config`
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
#92 - [Question] Is the slow generate during rollout related to zero3?
Issue -
State: closed - Opened by zhaobinNF over 1 year ago
- 4 comments
Labels: question
#91 - [Feature Request] To deal with hh-rlhf dialogue data
Issue -
State: closed - Opened by jc-ryan over 1 year ago
- 3 comments
Labels: enhancement
#90 - feat(datasets): add more raw dataset support
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, new feature
#89 - feat(rl_trainer): add generation config for RL rollout
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
Labels: enhancement, new feature
#88 - fix(rl_trainer): fix advantage calculation (GAE) when response lengths are different
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug
#87 - feat(rl): log sequence-wise KL-divergence to reference model during training
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
- 1 comment
Labels: enhancement, new feature
#86 - [Feature Request] log sequence-wise KL-divergence to reference model during training
Issue -
State: closed - Opened by rockmagma02 over 1 year ago
- 1 comment
Labels: enhancement, new feature
#85 - [Question] Will there be a Chinese version of the dataset?
Issue -
State: closed - Opened by ghost over 1 year ago
- 4 comments
Labels: question
#84 - feat(values): enhance logging for training value models
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: enhancement
#83 - feat(serve): better markdown format code block rendering
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement
#82 - [Question] How to debug beaver with pycharm, e.g. sft.sh
Issue -
State: closed - Opened by diehualong over 1 year ago
- 3 comments
Labels: question
#81 - chore(logger): log global step during training
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement
#80 - feat(datasets): support dataset proportion > 1
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
Labels: enhancement, new feature
#79 - feat(datasets): lazy tokenization support for `TokenizedDataset`s
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, new feature
#78 - feat(logger): enable manual logging level setting
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: enhancement
#77 - [Question] Can the trained cost model be used directly as a classifier for whether a Q+A pair is safe?
Issue -
State: closed - Opened by lierer007 over 1 year ago
- 5 comments
Labels: question
#76 - fix(datasets): raise errors when got duplicate dataset names
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug, enhancement
#75 - feat(serve): add new special command `/reset`
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, evaluation
#74 - chore(datasets): better error message when raw dataset class not found
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement
#73 - fix(models): handle model embeddings resizing on model parallel
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug, enhancement, new feature
#72 - fix(serve): handle `UnicodeDecodeError` for CJK inputs on deletion
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug, enhancement
#71 - [Question] After ppo training, the outputs become longer and more repetitive.
Issue -
State: closed - Opened by SpongebBob over 1 year ago
- 5 comments
Labels: question
#69 - [Question] About the saved model size doubling after PPO
Issue -
State: closed - Opened by Tinker250 over 1 year ago
- 6 comments
Labels: question
#68 - fix(datasets): check tensor size before comparing their contents
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug
#67 - feat(datasets): add duplication check for preference datasets
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug, enhancement, new feature
#66 - fix(configs/deepspeed_config): fix argument passing for evaluation batch size
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug
#65 - fix(configs/deepspeed_config): only support stage 0 and 3 in deepspeed config for evaluation
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: bug
#64 - [Question] OSError: [Errno 12] Cannot allocate memory
Issue -
State: closed - Opened by glsoon over 1 year ago
- 4 comments
Labels: question
#63 - [Question] A question about the loss computation in the SFT stage
Issue -
State: closed - Opened by EthenZhang over 1 year ago
- 1 comment
Labels: question
#62 - feat(datasets): add `PKU-SafeRLHF` / `BeaverTails` datasets and their friends
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: documentation, enhancement, new feature, evaluation
#61 - fix(evalute/gpt4): fix GPT-4 evaluation script
Pull Request -
State: closed - Opened by rockmagma02 over 1 year ago
Labels: bug
#60 - docs(README): add Beaver (1 round) preference distribution results
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: documentation, evaluation
#59 - [Question] Were the comparison figures in the readme produced with the currently released 10K data and the default configuration in scripts?
Issue -
State: closed - Opened by LiuShixing over 1 year ago
- 2 comments
Labels: question
#58 - [Question] A question about left padding
Issue -
State: closed - Opened by DwarfWarriors over 1 year ago
- 2 comments
Labels: question
#56 - [Question] The model produces no output after PPO training
Issue -
State: closed - Opened by liumingzhu6060 over 1 year ago
- 5 comments
Labels: question, need information
#55 - [Question] Why must the Reward critic tokenizer be the same as the actor tokenizer?
Issue -
State: closed - Opened by liumingzhu6060 over 1 year ago
- 1 comment
Labels: question
#54 - feat(datasets): accept local repo paths while loading datasets
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, new feature
#53 - [Question] Data format is misaligned
Issue -
State: closed - Opened by AlexXx-Wu over 1 year ago
- 4 comments
Labels: bug, question, need information
#52 - [Question] How to plot the graph after running GPT eval and obtaining a JSON file?
Issue -
State: closed - Opened by yifan123 over 1 year ago
- 2 comments
Labels: question, evaluation
#51 - [BUG] RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:5 and cpu!
Issue -
State: closed - Opened by Yanfei-Qin over 1 year ago
- 4 comments
Labels: bug, need information
#50 - feat(values): add a new sequence-wise loss for reward/cost models
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: enhancement, new feature
#49 - feat(evaluate/arena): allow using different tokenizers in arena evaluation
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement, evaluation
#48 - feat(algorithms/dpo): add implementation for the DPO algorithm
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
- 3 comments
Labels: enhancement, new feature
#47 - feat(trainers/rl_trainer): ensure RL dataset is exhausted when also using PTX dataset
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement
#46 - chore(scripts): update default pre-train model path
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
#45 - fix(models): temporarily disable LLaMA fast tokenizer
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug, upstream
#44 - style(algorithms): merge and move `torch.no_grad()` context manager to method level
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
#43 - [Question] Results of arena evaluation
Issue -
State: closed - Opened by nonstopfor over 1 year ago
- 8 comments
Labels: question, need information, evaluation
#42 - [Question] Feeding the dataset translated into Chinese raises the error “AssertionError: The better and worse answer are the same!”
Issue -
State: closed - Opened by liumingzhu6060 over 1 year ago
- 5 comments
Labels: question
#41 - refactor(utils): refactor pytree registration for `ModelOutput` subclasses
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement
#40 - [Question] generate in the rollout function takes too long
Issue -
State: closed - Opened by Mandy0016 over 1 year ago
- 10 comments
Labels: question
#39 - [Question] Using the PKU-SafeRLHF-1M dataset
Issue -
State: closed - Opened by zhaobinNF over 1 year ago
- 4 comments
Labels: question
#38 - [BUG][Upstream] `deepspeed` failed to compile `FusedAdam` CUDA operator
Issue -
State: closed - Opened by Harry-mic over 1 year ago
- 6 comments
Labels: bug, dependency, installation, upstream, cuda
#37 - [Question] Question about the actor loss in RLHF training
Issue -
State: closed - Opened by xyjsjruiliu over 1 year ago
- 1 comment
Labels: question, need information
#35 - chore(.github): update issue template
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: documentation, enhancement
#34 - [BUG] Poor internet connection: failed to download datasets from Hugging Face
Issue -
State: closed - Opened by Harry-mic over 1 year ago
- 1 comment
Labels: bug, invalid
#33 - [Question] Question about dataset splitting for different training stage
Issue -
State: closed - Opened by liumingzhu6060 over 1 year ago
- 3 comments
Labels: question
#32 - docs(README): add `alpaca-farm` to the comparison table
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: documentation
#31 - refactor(trainers/supervised_trainer): split the eval dataset with `eval_split_ratio` argument
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: enhancement
#30 - [BUG] Poor internet connection: failed to download datasets from Hugging Face
Issue -
State: closed - Opened by Harry-mic over 1 year ago
- 2 comments
Labels: bug, invalid
#29 - [Question] Question about the PTX Step in RLHF training
Issue -
State: closed - Opened by zhaobinNF over 1 year ago
- 4 comments
Labels: question
#27 - [BUG] unlimited recursion when calling tokenizer.unk_token_id
Issue -
State: closed - Opened by feiliya333 over 1 year ago
- 2 comments
Labels: bug, upstream
#26 - fix(algorithms): handle potential index error for empty generation
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug
#24 - [Question] What are the recommended hyper-parameters?
Issue -
State: closed - Opened by nonstopfor over 1 year ago
- 4 comments
Labels: question
#23 - fix(algorithms): skip special tokens when re-tokenizing with the reward/cost tokenizer
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: bug, enhancement
#21 - [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648
Issue -
State: closed - Opened by zhaobinNF over 1 year ago
- 8 comments
Labels: question, need information
#20 - [Feature Request] LoRA support for memory efficient fine-tuning
Issue -
State: open - Opened by 70557dzqc over 1 year ago
- 2 comments
Labels: enhancement, in progress, new feature
#18 - fix(models/pretrained): set special token ids in `model.config`
Pull Request -
State: closed - Opened by calico-1226 over 1 year ago
Labels: bug, enhancement
#17 - [Question] Metric/task used to evaluate Beaver
Issue -
State: closed - Opened by feiliya333 over 1 year ago
- 2 comments
Labels: question, evaluation
#15 - [Feature Request] Releasing the Reward Model
Issue -
State: closed - Opened by d223302 over 1 year ago
- 6 comments
Labels: enhancement, question
#14 - [Feature Request] Will rm training and rl training for chatglm be supported in the future?
Issue -
State: closed - Opened by iamsile over 1 year ago
- 2 comments
Labels: enhancement, invalid, dependency
#13 - feat(algorithms): allow reward/cost models use different tokenizers than actor tokenizer
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: enhancement
#12 - [Feature Request] loading dataset from local files
Issue -
State: closed - Opened by haorannlp over 1 year ago
- 5 comments
Labels: enhancement, in progress, new feature
#11 - [Feature Request] Support Actor and Reward/Cost Models using different tokenizers
Issue -
State: closed - Opened by calico-1226 over 1 year ago
- 1 comment
Labels: enhancement
#9 - [BUG] Error when running the PPO stage: CUDA error: device-side assert triggered
Issue -
State: closed - Opened by HaixHan over 1 year ago
- 23 comments
Labels: bug, invalid, need information, cuda
#8 - How to set up the data in the sft process: should I just make a dir Alpaca and put the downloaded data in it?
Issue -
State: closed - Opened by zhaobinNF over 1 year ago
Labels: question
#7 - fix(serve): fix argument passing for chatbot generation
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: bug
#6 - deps(tokenizers): pin `tokenizers` minimum version for fast tokenizer support for LLaMA models
Pull Request -
State: closed - Opened by XuehaiPan over 1 year ago
Labels: dependency, installation
#5 - [Question] Trlx doesn't support Reward model training?
Issue -
State: closed - Opened by wqw547243068 over 1 year ago
- 2 comments
Labels: question