Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / OpenRLHF/OpenRLHF issues and pull requests

#433 - Support for Token-Level Rewards?

Issue - State: closed - Opened by pagepal666 3 months ago - 5 comments

#432 - Is n_samples_per_prompt actually used?

Issue - State: closed - Opened by Rosenberg37 3 months ago - 1 comment

#431 - The need of micro_rollout_batch_size

Issue - State: closed - Opened by Unfinito 3 months ago - 4 comments

#430 - Will OpenRLHF handle gradient_accumulation_steps with loss?

Issue - State: closed - Opened by mzhaoshuai 3 months ago - 4 comments

#429 - The behavior of log

Issue - State: closed - Opened by visionxyz 3 months ago - 6 comments

#428 - How to load an open-sourced model without 'value head'

Issue - State: closed - Opened by kleinzcy 3 months ago - 3 comments

#427 - batch_inference NCCL time out error

Issue - State: closed - Opened by BeyonderXX 3 months ago - 1 comment

#425 - Add feature of load_from_disk to utils.py

Pull Request - State: closed - Opened by tongyx361 3 months ago

#424 - CUDA out of memory error when training qwen2-7b-instruct on 4x 4090 GPUs

Issue - State: closed - Opened by SuiJiGuoChengSuiJiGuo 3 months ago - 2 comments

#423 - Enabling overlap_comm in PPO training significantly affects training performance

Issue - State: open - Opened by andylrx 3 months ago - 1 comment

#422 - add 'num_return_sequences' feature in actor

Pull Request - State: closed - Opened by 0xWelt 4 months ago - 4 comments

#421 - Why is dist.barrier() set in the loop here?

Issue - State: closed - Opened by lyz22233 4 months ago - 2 comments

#420 - PPO takes very long

Issue - State: closed - Opened by mandyyyyii 4 months ago - 10 comments

#419 - Is there a distillation loss curve available for reference?

Issue - State: closed - Opened by Schnabel-8 4 months ago - 8 comments

#418 - flash_attn issue

Issue - State: closed - Opened by tbsxxxH 4 months ago - 3 comments

#417 - Add makedirs before writing in batch_inference

Pull Request - State: closed - Opened by tongyx361 4 months ago

#416 - ppo error

Issue - State: closed - Opened by ldh127 4 months ago - 1 comment

#415 - AssertionError: Session name does not match persisted value

Issue - State: closed - Opened by tbsxxxH 4 months ago - 1 comment

#414 - update link to code in readme

Pull Request - State: closed - Opened by coding-famer 4 months ago

#413 - Version conflict issue

Issue - State: closed - Opened by tbsxxxH 4 months ago - 1 comment

#411 - torch.distributed.broadcast timeout

Issue - State: closed - Opened by lyz22233 4 months ago - 2 comments

#410 - Speed Up Data Processing by Using Multi-Processing in Dataset.map

Pull Request - State: closed - Opened by Ricardokevins 4 months ago - 1 comment

#409 - Data Preprocess Speed Up

Issue - State: closed - Opened by Ricardokevins 4 months ago - 1 comment

#408 - ppo error

Issue - State: closed - Opened by ldh127 4 months ago - 7 comments

#407 - Relationship between model parameter count and GPU memory usage

Issue - State: closed - Opened by tbsxxxH 4 months ago - 1 comment

#406 - ppo ray training issue on 8x 4090 GPUs

Issue - State: closed - Opened by hehebamei 4 months ago - 1 comment

#405 - ppo error

Issue - State: closed - Opened by ldh127 4 months ago - 2 comments

#403 - Qlora load model error

Issue - State: closed - Opened by ldh127 4 months ago - 1 comment

#401 - an unexpected error while SFT

Issue - State: closed - Opened by liwd190019 4 months ago - 2 comments

#400 - No module named 'vllm'

Issue - State: closed - Opened by tbsxxxH 4 months ago - 3 comments

#398 - PPO OOM 8*A100 40G

Issue - State: closed - Opened by tbsxxxH 4 months ago - 11 comments

#397 - train_batch_size

Issue - State: closed - Opened by mandyyyyii 4 months ago - 1 comment

#396 - rename wandb args in scripts

Pull Request - State: closed - Opened by coding-famer 4 months ago

#395 - GPU memory usage is very strange

Issue - State: closed - Opened by lyz22233 4 months ago - 4 comments

#394 - Support Checkpoint

Pull Request - State: closed - Opened by xiaoxigua999 4 months ago

#393 - SFT dataset tokenization scheme bug when using llama3

Issue - State: closed - Opened by ZhaofengWu 4 months ago - 8 comments

#392 - train_ppo_ray OOM

Issue - State: closed - Opened by syx11237744 4 months ago - 4 comments

#391 - support remote rm

Pull Request - State: closed - Opened by xiaoxigua999 4 months ago

#390 - Why multiplying rstd instead of dividing by rstd?

Issue - State: closed - Opened by gohsyi 4 months ago - 1 comment

#389 - Performance of Iterative DPO?

Issue - State: closed - Opened by yesiam-png 4 months ago - 1 comment

#388 - Update version.txt

Pull Request - State: closed - Opened by xiaoxigua999 4 months ago

#387 - Zero stage 3 error

Issue - State: closed - Opened by syx11237744 4 months ago - 1 comment

#386 - Feature: add DPO-P

Issue - State: closed - Opened by catqaq 4 months ago
Labels: enhancement

#385 - Online DPO support

Issue - State: closed - Opened by Ashura5 4 months ago - 5 comments

#380 - Fix loading dataset from local text files

Pull Request - State: closed - Opened by tongyx361 4 months ago

#371 - Support RLOO

Issue - State: closed - Opened by gohsyi 4 months ago - 1 comment

#368 - Support training from breakpoint

Issue - State: closed - Opened by luo-li-ba-suo 4 months ago - 4 comments

#366 - Model inference after DPO outputs only garbled symbols

Issue - State: open - Opened by 2024WY 5 months ago - 3 comments

#364 - DPO Finetuning constantly gives preference loss as 0.6931

Issue - State: closed - Opened by mandyyyyii 5 months ago - 9 comments

#361 - support remote rm api for ppo and ppo ray

Pull Request - State: closed - Opened by catqaq 5 months ago - 8 comments
Labels: enhancement, P0

#353 - Will asynchronous generation and training be supported?

Issue - State: open - Opened by syx11237744 5 months ago - 1 comment
Labels: enhancement

#340 - [pre-commit.ci] pre-commit suggestions

Pull Request - State: closed - Opened by pre-commit-ci[bot] 5 months ago

#331 - PPO hangs at bundle_reservation_check_func after loading the model

Issue - State: open - Opened by lixsh6 5 months ago - 4 comments

#329 - qwen2 72B PPO OOM

Issue - State: closed - Opened by lixsh6 5 months ago - 5 comments

#311 - Can support for SimPO be added?

Issue - State: open - Opened by victorShawFan 6 months ago - 3 comments

#308 - Dummy token for prompts in HH datasets

Issue - State: closed - Opened by louieworth 6 months ago - 2 comments

#307 - Will 2 x GPU setups be supported

Issue - State: closed - Opened by llmlocal 6 months ago - 1 comment

#305 - Strange Kill of Critic Model

Issue - State: closed - Opened by Ricardokevins 6 months ago - 6 comments

#304 - Suggestion on the configurations

Issue - State: closed - Opened by Ricardokevins 6 months ago - 1 comment

#300 - [Question] EOS in reward model dataset

Issue - State: open - Opened by qwenzo 6 months ago - 4 comments

#297 - Avoid monkey patching vLLM

Issue - State: closed - Opened by Atry 6 months ago - 1 comment

#295 - QLORA model loading error

Issue - State: closed - Opened by karthik-nexusflow 7 months ago - 5 comments

#293 - PPO produces a time out error after switching to zero stage 3

Issue - State: open - Opened by victorShawFan 7 months ago - 5 comments

#288 - RM training loss becomes NAN when finish the first training step.

Issue - State: open - Opened by lixsh6 7 months ago - 2 comments
Labels: bug

#285 - Custom ExperienceMaker

Issue - State: closed - Opened by mgerstgrasser 7 months ago - 4 comments

#283 - fix vLLM v0.4.1

Pull Request - State: closed - Opened by hijkzzz 7 months ago - 3 comments

#281 - Revert "vllm 0.4.1 compatibility (#278)"

Pull Request - State: closed - Opened by hijkzzz 7 months ago

#270 - Issue with models not using `position_ids`

Issue - State: closed - Opened by kfertakis 8 months ago - 2 comments

#269 - The configuration for Llama-7b on 4 RTX4090

Issue - State: closed - Opened by LinkyLiu 8 months ago - 5 comments

#267 - add test pipeline: use small LLM and small data

Issue - State: closed - Opened by catqaq 8 months ago
Labels: documentation, enhancement

#266 - Documentation for using Kuberay

Issue - State: closed - Opened by karthik-nexusflow 8 months ago - 4 comments

#262 - How long does a single LLM's tuning require?

Issue - State: closed - Opened by alphahumancoder 8 months ago - 3 comments

#256 - Is save checkpoint not yet supported for ppo ray trainer?

Issue - State: closed - Opened by mickel-liu 8 months ago - 6 comments

#253 - Support ORPO

Issue - State: closed - Opened by paulcx 9 months ago - 1 comment

#251 - [For your information] Ways to build environment and run openrlhf codes on a slurm cluster

Issue - State: closed - Opened by glorgao 9 months ago - 3 comments
Labels: documentation, envs

#246 - Unexpected long actor_time when train_ppo_ray

Issue - State: closed - Opened by LSC527 9 months ago - 9 comments
Labels: enhancement, help wanted

#245 - enable_ema cause runtime error when running train_ppo_llama.sh

Issue - State: open - Opened by dshnightmare 9 months ago - 7 comments

#242 - The tokenizer of reward model and policy model.

Issue - State: closed - Opened by eyuansu62 9 months ago - 4 comments

#241 - Fix yi-34b tokenizer, use_fast=False

Pull Request - State: closed - Opened by hijkzzz 9 months ago

#239 - why generate use flash-attn is slower?

Issue - State: closed - Opened by dshnightmare 9 months ago - 2 comments

#238 - Forced EOS token in vllm generation?

Issue - State: open - Opened by mgerstgrasser 9 months ago - 8 comments

#236 - adding length penalty to reward

Issue - State: open - Opened by karthik-nexusflow 9 months ago - 2 comments

#232 - Is left-padding in PPO strictly necessary?

Issue - State: open - Opened by mgerstgrasser 9 months ago - 8 comments

#230 - Actor-Critic-Model

Issue - State: open - Opened by mgerstgrasser 9 months ago - 6 comments

#221 - Citation or comparison to trlX and NeMo-align.

Issue - State: closed - Opened by LouisCastricato 9 months ago - 3 comments

#220 - Support top models stage2

Issue - State: closed - Opened by catqaq 9 months ago
Labels: enhancement

#219 - use_right_pad

Pull Request - State: closed - Opened by hijkzzz 9 months ago - 1 comment

#211 - vllm +zero2 hangs

Issue - State: closed - Opened by karthik19967829 10 months ago - 32 comments

#205 - Improve ease of use

Issue - State: closed - Opened by hijkzzz 10 months ago - 1 comment

#188 - About using vLLM for generation

Issue - State: closed - Opened by LSC527 11 months ago - 5 comments
Labels: enhancement, help wanted

#183 - support mixtral 8*7b balancing loss

Pull Request - State: closed - Opened by hijkzzz 11 months ago