Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / OpenRLHF/OpenRLHF issues and pull requests

#442 - [WIP] Add PRM training with hard estimation

Pull Request - State: open - Opened by zhuzilin 1 day ago

#441 - Why ZeRO-3 is only supported when vLLM enabled

Issue - State: open - Opened by liuxsh9 2 days ago - 2 comments

#439 - Add context parallel to DPO

Pull Request - State: closed - Opened by zhuzilin 5 days ago

#438 - only import bitsandbytes when necessary

Pull Request - State: closed - Opened by zhuzilin 7 days ago

#437 - why advantage calculate ops [::-1]

Issue - State: closed - Opened by DavideHe 7 days ago

#436 - Lora merge error after dpo training with lora.

Issue - State: open - Opened by KaedinLian 10 days ago - 1 comment

#435 - Request: support sequence parallelism (sequence_parallel)

Issue - State: open - Opened by kangyishuai 12 days ago - 3 comments

#434 - train_knowledge_distillation.sh script fails to run

Issue - State: closed - Opened by Rookie-Kai 12 days ago - 2 comments

#433 - Support for Token-Level Rewards?

Issue - State: open - Opened by pagepal666 14 days ago - 4 comments

#432 - Is n_samples_per_prompt actually used?

Issue - State: closed - Opened by Rosenberg37 15 days ago - 1 comment

#431 - The need of micro_rollout_batch_size

Issue - State: closed - Opened by Unfinito 17 days ago - 4 comments

#430 - Will OpenRLHF handle gradient_accumulation_steps with loss?

Issue - State: closed - Opened by mzhaoshuai 19 days ago - 4 comments

#429 - The behavior of log

Issue - State: closed - Opened by visionxyz 23 days ago - 6 comments

#428 - How to load an open-source model without 'value head'

Issue - State: closed - Opened by kleinzcy about 1 month ago - 3 comments

#427 - batch_inference NCCL time out error

Issue - State: open - Opened by BeyonderXX about 1 month ago - 1 comment

#425 - Add feature of load_from_disk to utils.py

Pull Request - State: closed - Opened by tongyx361 about 1 month ago

#424 - CUDA out of memory error when training qwen2-7b-instruct on 4x 4090 GPUs

Issue - State: open - Opened by SuiJiGuoChengSuiJiGuo about 1 month ago - 2 comments

#423 - Enabling overlap_comm in PPO training greatly affects training performance

Issue - State: open - Opened by andylrx about 1 month ago - 1 comment

#422 - add 'num_return_sequences' feature in actor

Pull Request - State: closed - Opened by 0xWelt about 1 month ago - 4 comments

#421 - Why is dist.barrier() set in the loop here?

Issue - State: closed - Opened by lyz22233 about 1 month ago - 2 comments

#420 - PPO takes very long

Issue - State: closed - Opened by mandyyyyii about 1 month ago - 10 comments

#419 - Is there a distillation loss curve available for reference?

Issue - State: closed - Opened by Schnabel-8 about 1 month ago - 1 comment

#418 - flash_attn problem

Issue - State: closed - Opened by tbsxxxH about 1 month ago - 3 comments

#417 - Add makedirs before writing in batch_inference

Pull Request - State: closed - Opened by tongyx361 about 1 month ago

#416 - ppo error

Issue - State: closed - Opened by ldh127 about 1 month ago - 1 comment

#415 - AssertionError: Session name does not match persisted value

Issue - State: closed - Opened by tbsxxxH about 2 months ago - 1 comment

#414 - update link to code in readme

Pull Request - State: closed - Opened by coding-famer about 2 months ago

#413 - Version conflict problem

Issue - State: closed - Opened by tbsxxxH about 2 months ago - 1 comment

#412 - Speed Up Data Processing by Using Multi-Processing in Dataset.map

Pull Request - State: closed - Opened by Ricardokevins about 2 months ago

#411 - torch.distributed.broadcast timeout

Issue - State: closed - Opened by lyz22233 about 2 months ago - 2 comments

#410 - Speed Up Data Processing by Using Multi-Processing in Dataset.map

Pull Request - State: closed - Opened by Ricardokevins about 2 months ago - 1 comment

#409 - Data Preprocess Speed Up

Issue - State: closed - Opened by Ricardokevins about 2 months ago - 1 comment

#408 - ppo error

Issue - State: closed - Opened by ldh127 about 2 months ago - 7 comments

#407 - Relationship between model parameter count and GPU memory usage

Issue - State: closed - Opened by tbsxxxH about 2 months ago - 1 comment

#406 - ppo ray training problem on 8x 4090 GPUs

Issue - State: closed - Opened by hehebamei about 2 months ago - 1 comment

#405 - ppo error

Issue - State: closed - Opened by ldh127 about 2 months ago - 2 comments

#404 - Cannot distill from Qwen2-70b to Qwen2-1.5b: error says tensor sizes do not match

Issue - State: closed - Opened by xiechengmude about 2 months ago - 1 comment

#403 - Qlora load model error

Issue - State: closed - Opened by ldh127 about 2 months ago - 1 comment

#401 - an unexpected error while SFT

Issue - State: closed - Opened by liwd190019 about 2 months ago - 2 comments

#400 - No module named 'vllm'

Issue - State: closed - Opened by tbsxxxH about 2 months ago - 3 comments

#398 - PPO OOM 8*A100 40G

Issue - State: closed - Opened by tbsxxxH about 2 months ago - 11 comments

#397 - train_batch_size

Issue - State: closed - Opened by mandyyyyii about 2 months ago - 1 comment

#396 - rename wandb args in scripts

Pull Request - State: closed - Opened by coding-famer about 2 months ago

#395 - GPU memory usage is very strange

Issue - State: closed - Opened by lyz22233 about 2 months ago - 4 comments

#394 - Support Checkpoint

Pull Request - State: closed - Opened by xiaoxigua999 about 2 months ago

#393 - SFT dataset tokenization scheme bug when using llama3

Issue - State: closed - Opened by ZhaofengWu about 2 months ago - 8 comments

#392 - train_ppo_ray OOM

Issue - State: closed - Opened by syx11237744 about 2 months ago - 4 comments

#391 - support remote rm

Pull Request - State: closed - Opened by xiaoxigua999 about 2 months ago

#390 - Why multiplying rstd instead of dividing by rstd?

Issue - State: closed - Opened by gohsyi about 2 months ago - 1 comment

#389 - Performance of Iterative DPO?

Issue - State: closed - Opened by yesiam-png about 2 months ago - 1 comment

#388 - Update version.txt

Pull Request - State: closed - Opened by xiaoxigua999 about 2 months ago

#387 - Zero stage 3 error

Issue - State: closed - Opened by syx11237744 about 2 months ago - 1 comment

#386 - Feature: add DPO-P

Issue - State: closed - Opened by catqaq about 2 months ago
Labels: enhancement

#380 - Fix loading dataset from local text files

Pull Request - State: closed - Opened by tongyx361 2 months ago

#368 - Support training from breakpoint

Issue - State: closed - Opened by luo-li-ba-suo 2 months ago - 4 comments

#366 - Model inference output after DPO is all garbled symbols

Issue - State: open - Opened by 2024WY 2 months ago - 3 comments

#364 - DPO Finetuning constantly gives preference loss as 0.6931

Issue - State: closed - Opened by mandyyyyii 2 months ago - 9 comments

#361 - support remote rm api for ppo and ppo ray

Pull Request - State: closed - Opened by catqaq 2 months ago - 8 comments
Labels: enhancement, P0

#340 - [pre-commit.ci] pre-commit suggestions

Pull Request - State: closed - Opened by pre-commit-ci[bot] 3 months ago

#331 - PPO hangs at bundle_reservation_check_func after loading the model

Issue - State: open - Opened by lixsh6 3 months ago - 3 comments

#311 - Can SimPO support be added?

Issue - State: open - Opened by victorShawFan 4 months ago - 3 comments

#305 - Strange Kill of Critic Model

Issue - State: closed - Opened by Ricardokevins 4 months ago - 6 comments

#300 - [Question] EOS in reward model dataset

Issue - State: open - Opened by qwenzo 4 months ago - 4 comments

#293 - PPO produces a timeout error after switching to ZeRO stage 3

Issue - State: open - Opened by victorShawFan 4 months ago - 5 comments

#288 - RM training loss becomes NAN when finish the first training step.

Issue - State: open - Opened by lixsh6 4 months ago - 2 comments
Labels: bug

#283 - fix vLLM v0.4.1

Pull Request - State: closed - Opened by hijkzzz 5 months ago - 3 comments

#281 - Revert "vllm 0.4.1 compatibility (#278)"

Pull Request - State: closed - Opened by hijkzzz 5 months ago

#270 - Issue with models not using `position_ids`

Issue - State: closed - Opened by kfertakis 5 months ago - 2 comments

#267 - add test pipeline: use small LLM and small data

Issue - State: closed - Opened by catqaq 5 months ago
Labels: documentation, enhancement

#266 - Documentation for using Kuberay

Issue - State: closed - Opened by karthik-nexusflow 5 months ago - 4 comments

#256 - Is save checkpoint not yet supported for ppo ray trainer?

Issue - State: closed - Opened by mickel-liu 6 months ago - 6 comments

#251 - [For your information] Ways to build environment and run openrlhf codes on a slurm cluster

Issue - State: closed - Opened by glorgao 6 months ago - 3 comments
Labels: documentation, envs

#245 - enable_ema cause runtime error when running train_ppo_llama.sh

Issue - State: open - Opened by dshnightmare 6 months ago - 7 comments

#241 - Fix yi-34b tokenizer, use_fast=False

Pull Request - State: closed - Opened by hijkzzz 6 months ago

#238 - Forced EOS token in vllm generation?

Issue - State: open - Opened by mgerstgrasser 7 months ago - 7 comments

#236 - adding length penalty to reward

Issue - State: open - Opened by karthik-nexusflow 7 months ago - 2 comments

#232 - Is left-padding in PPO strictly necessary?

Issue - State: open - Opened by mgerstgrasser 7 months ago - 8 comments

#230 - Actor-Critic-Model

Issue - State: open - Opened by mgerstgrasser 7 months ago - 6 comments

#221 - Citation or comparison to trlX and NeMo-align.

Issue - State: closed - Opened by LouisCastricato 7 months ago - 3 comments

#220 - Support top models stage2

Issue - State: closed - Opened by catqaq 7 months ago
Labels: enhancement

#219 - use_right_pad

Pull Request - State: closed - Opened by hijkzzz 7 months ago - 1 comment

#211 - vllm +zero2 hangs

Issue - State: closed - Opened by karthik19967829 7 months ago - 32 comments

#205 - Improve ease of use

Issue - State: closed - Opened by hijkzzz 8 months ago - 1 comment

#183 - support mixtral 8*7b balancing loss

Pull Request - State: closed - Opened by hijkzzz 9 months ago

#174 - upgrade container for to_bettertransformer

Pull Request - State: closed - Opened by hijkzzz 9 months ago

#169 - refactor ds config and fix flash_attn/ model.config.pad_token_id

Pull Request - State: closed - Opened by hijkzzz 9 months ago

#168 - update Logo

Pull Request - State: closed - Opened by hijkzzz 9 months ago

#163 - remove pad token and embedding resize for llama

Pull Request - State: closed - Opened by hijkzzz 9 months ago

#155 - Add pipeline module to support more scientific comparative experiments and research

Issue - State: closed - Opened by catqaq 9 months ago
Labels: enhancement, P1

#151 - feature: add api support for hosting a reward model

Issue - State: closed - Opened by ftmtk 10 months ago - 5 comments
Labels: enhancement, P1

#102 - Feature: Support detailed running process management: save_steps, log_steps, eval_steps

Issue - State: closed - Opened by catqaq about 1 year ago - 7 comments
Labels: enhancement, P0

#101 - Bug: AttributeError: 'DeepspeedStrategy' object has no attribute 'save_hf_format'

Issue - State: closed - Opened by catqaq about 1 year ago - 2 comments
Labels: bug