Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / OpenLLMAI/OpenRLHF issues and pull requests

#362 - SFT loss calculation issue

Issue - State: closed - Opened by ZhaofengWu 3 months ago - 2 comments

#361 - support remote rm api for ppo and ppo ray

Pull Request - State: open - Opened by catqaq 3 months ago - 3 comments

#359 - Possible minor bug

Issue - State: closed - Opened by ZhaofengWu 3 months ago

#358 - vllm engine not working

Issue - State: open - Opened by babu111 3 months ago - 4 comments

#357 - process groups for actor and vllm engine

Issue - State: closed - Opened by babu111 3 months ago - 1 comment

#355 - load transformers' issue

Issue - State: closed - Opened by chauncygu 3 months ago - 1 comment

#353 - Will asynchronous generation and training be supported?

Issue - State: open - Opened by syx11237744 3 months ago - 1 comment

#351 - SFT fine-tuning of the Qwen2-1.5b model fails with a padding_side='right' error

Issue - State: closed - Opened by ProsperousYe 3 months ago - 3 comments

#350 - train_ppo_llama_ray training never runs the train step

Issue - State: closed - Opened by syx11237744 3 months ago - 2 comments

#348 - Why is the RM head separately initialized?

Issue - State: closed - Opened by ZhaofengWu 3 months ago - 7 comments

#345 - reward is always 0 when training DPO

Issue - State: closed - Opened by UbeCc 3 months ago - 1 comment

#343 - Qwen-32B train RM using adam_offload& zero3 lead to Runtime Error

Issue - State: closed - Opened by victorShawFan 3 months ago - 2 comments

#342 - An error occurs when I'm trying to build a docker container.

Issue - State: closed - Opened by hehebamei 3 months ago - 3 comments

#341 - support remote rm and ref model api for ppo

Pull Request - State: closed - Opened by catqaq 3 months ago - 8 comments

#340 - [pre-commit.ci] pre-commit suggestions

Pull Request - State: closed - Opened by pre-commit-ci[bot] 3 months ago

#338 - An error occurred during supervised fine-tuning.

Issue - State: closed - Opened by hehebamei 3 months ago - 2 comments

#337 - Multi-node training. Slurm vs Slurm + Ray

Issue - State: closed - Opened by yannikkellerde 3 months ago - 1 comment

#335 - Support LoRA+VLLM, especially for ZeRO-3.

Pull Request - State: closed - Opened by luo-li-ba-suo 3 months ago - 4 comments

#334 - train_rm apply custom tokenizer chat template

Pull Request - State: closed - Opened by mickelliu 3 months ago

#333 - Qwen2 ppo

Issue - State: closed - Opened by Yusifu 3 months ago - 1 comment

#331 - PPO hangs at bundle_reservation_check_func after loading the model

Issue - State: open - Opened by lixsh6 3 months ago - 1 comment

#330 - Easy to miss bug that results in min_new_tokens not working

Pull Request - State: closed - Opened by yannikkellerde 3 months ago

#329 - qwen2 72B PPO OOM

Issue - State: open - Opened by lixsh6 4 months ago - 5 comments

#328 - Update requirements.txt

Pull Request - State: closed - Opened by Atry 4 months ago - 3 comments

#327 - Could you give an example of testing deepspeed-chat time?

Issue - State: closed - Opened by youngyoung321 4 months ago - 7 comments

#326 - KTO training of a Qwen2 model after SFT gives NaN loss

Issue - State: closed - Opened by vincezengqiang 4 months ago - 3 comments

#324 - Generate function for distributional training

Issue - State: open - Opened by louieworth 4 months ago - 2 comments

#323 - model.generate does not work with multi-GPU parallelism

Issue - State: closed - Opened by louieworth 4 months ago - 2 comments

#322 - /openrlhf must be an existing directory or a zip package

Issue - State: closed - Opened by harvinyou 4 months ago - 1 comment

#321 - How do I specify the number of GPUs when launching training?

Issue - State: closed - Opened by harvinyou 4 months ago - 1 comment

#320 - [Question] Is multi-nodes stage 3 model loading supported?

Issue - State: closed - Opened by mickelliu 4 months ago - 2 comments

#319 - Could you provide the best training and inference parameters for Mixtral 8*7B?

Issue - State: closed - Opened by harvinyou 4 months ago - 1 comment

#318 - train_ppo_llama_ray.sh run two H800 machine error

Issue - State: closed - Opened by yangzhipeng1108 4 months ago - 3 comments

#316 - train_ppo_llama_ray_70b.sh run two H800 machine error

Issue - State: closed - Opened by yangzhipeng1108 4 months ago - 1 comment

#315 - Moving model between GPU and CPU

Issue - State: closed - Opened by kfertakis 4 months ago - 3 comments

#314 - run train_ppo_llama_ray.sh error

Issue - State: closed - Opened by yangzhipeng1108 4 months ago

#313 - Failed to update weights to vLLM

Issue - State: closed - Opened by thirteenflt 4 months ago - 3 comments

#312 - zero3 training error

Issue - State: closed - Opened by karthik-nexusflow 4 months ago - 1 comment

#311 - Could support for SimPO be added?

Issue - State: open - Opened by victorShawFan 4 months ago

#311 - Could support for SimPO be added?

Issue - State: open - Opened by victorShawFan 4 months ago - 2 comments

#310 - wrong action_log_probs returned?

Issue - State: closed - Opened by thirteenflt 4 months ago - 1 comment

#309 - Does this codebase consider using "torch.compile"?

Issue - State: closed - Opened by eyuansu62 4 months ago - 2 comments

#308 - Dummy token for prompts in HH datasets

Issue - State: open - Opened by louieworth 4 months ago - 2 comments

#307 - Will 2 x GPU setups be supported

Issue - State: open - Opened by llmlocal 4 months ago - 1 comment

#305 - Strange Kill of Critic Model

Issue - State: open - Opened by Ricardokevins 4 months ago - 5 comments

#304 - Suggestion on the configurations

Issue - State: open - Opened by Ricardokevins 4 months ago - 1 comment

#303 - Incompatibility with Qwen

Issue - State: closed - Opened by Ricardokevins 4 months ago - 2 comments

#302 - Support Llama-3 models

Issue - State: closed - Opened by wenlinyao 5 months ago - 1 comment

#301 - action_log_probs is computed twice

Issue - State: closed - Opened by cdm114514 5 months ago - 2 comments

#300 - [Question] EOS in reward model dataset

Issue - State: open - Opened by qwenzo 5 months ago - 3 comments

#299 - Claim your paper on HF

Issue - State: closed - Opened by adeenayakup 5 months ago - 1 comment

#298 - Added GPU memory specs and clarifications, fixed typo.

Pull Request - State: closed - Opened by KT313 5 months ago - 2 comments

#297 - Avoid monkey patching vLLM

Issue - State: open - Opened by Atry 5 months ago - 1 comment

#295 - QLORA model loading error

Issue - State: open - Opened by karthik-nexusflow 5 months ago - 5 comments

#294 - maybe data bug with dpo trainer

Issue - State: closed - Opened by none0663 5 months ago - 1 comment

#293 - PPO produces a timeout error after switching to ZeRO stage 3

Issue - State: open - Opened by victorShawFan 5 months ago - 3 comments

#293 - PPO produces a timeout error after switching to ZeRO stage 3

Issue - State: open - Opened by victorShawFan 5 months ago - 2 comments

#292 - No response after enabling PPO Ray

Issue - State: closed - Opened by victorShawFan 5 months ago - 3 comments

#291 - RLHF for classification tasks

Issue - State: closed - Opened by vinodrajendran001 5 months ago - 2 comments

#291 - RLHF for classification tasks

Issue - State: open - Opened by vinodrajendran001 5 months ago - 2 comments

#290 - HTTPError when running train_ppo_llama_ray.sh

Issue - State: closed - Opened by Zeyuan-Liu 5 months ago - 5 comments

#290 - HTTPError when running train_ppo_llama_ray.sh

Issue - State: open - Opened by Zeyuan-Liu 5 months ago - 5 comments

#289 - [question] long context for single model ppo training

Issue - State: closed - Opened by yananchen1989 5 months ago - 1 comment

#288 - RM training loss becomes NAN when finish the first training step.

Issue - State: open - Opened by lixsh6 5 months ago - 1 comment