Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / OpenLLMAI/OpenRLHF issues and pull requests
#363 - update "reward_dataset" to properly handle "prompt_key" when "apply_chat_template==True"
Pull Request -
State: open - Opened by Nickydusk 3 months ago
- 2 comments
#362 - SFT loss calculation issue
Issue -
State: closed - Opened by ZhaofengWu 3 months ago
- 2 comments
#361 - support remote rm api for ppo and ppo ray
Pull Request -
State: open - Opened by catqaq 3 months ago
- 3 comments
#360 - A worker died or was killed while executing a task by an unexpected system error.
Issue -
State: open - Opened by lusongshuo-mt 3 months ago
- 3 comments
#359 - Possible minor bug
Issue -
State: closed - Opened by ZhaofengWu 3 months ago
#358 - vllm engine not working
Issue -
State: open - Opened by babu111 3 months ago
- 4 comments
#357 - process groups for actor and vllm engine
Issue -
State: closed - Opened by babu111 3 months ago
- 1 comment
#355 - load transformers' issue
Issue -
State: closed - Opened by chauncygu 3 months ago
- 1 comment
#354 - What class of GPU does RM model training roughly require, and how many?
Issue -
State: closed - Opened by hehebamei 3 months ago
- 7 comments
#353 - Will asynchronous generation and training be supported?
Issue -
State: open - Opened by syx11237744 3 months ago
- 1 comment
#352 - OpenLLama always generate EOS token after SFT in shareGPT dataset
Issue -
State: open - Opened by MGDDestiny 3 months ago
#351 - SFT fine-tuning of the Qwen2-1.5b model fails with a padding_side='right' error
Issue -
State: closed - Opened by ProsperousYe 3 months ago
- 3 comments
#350 - train_ppo_llama_ray training has no train phase
Issue -
State: closed - Opened by syx11237744 3 months ago
- 2 comments
#349 - When I run the train_ppo_llama_ray.sh script, the async_fit_actor_model gets stuck
Issue -
State: closed - Opened by syx11237744 3 months ago
- 3 comments
#348 - Why is the RM head separately initialized?
Issue -
State: closed - Opened by ZhaofengWu 3 months ago
- 7 comments
#347 - This project is great in every way, but why is data loading so complicated? Can't SFT, RL, and the chat template each be configured in one simple way?
Issue -
State: closed - Opened by ldh127 3 months ago
- 10 comments
#346 - AssertionError: Check batch related parameters. train_batch_size is not equal to micro_batch_per_gpu * gradient_acc_step * world_size 256 != 2 * 18 * 7
Issue -
State: closed - Opened by hehebamei 3 months ago
- 7 comments
#345 - reward is always 0 when training DPO
Issue -
State: closed - Opened by UbeCc 3 months ago
- 1 comment
#344 - Feature: Define a set of default data formats for OpenRLHF to reduce the cost of using custom data for everyone.
Issue -
State: closed - Opened by catqaq 3 months ago
- 1 comment
#343 - Qwen-32B train RM using adam_offload& zero3 lead to Runtime Error
Issue -
State: closed - Opened by victorShawFan 3 months ago
- 2 comments
#342 - it occurs error when im trying to build a docker container.
Issue -
State: closed - Opened by hehebamei 3 months ago
- 3 comments
#341 - support remote rm and ref model api for ppo
Pull Request -
State: closed - Opened by catqaq 3 months ago
- 8 comments
#340 - [pre-commit.ci] pre-commit suggestions
Pull Request -
State: closed - Opened by pre-commit-ci[bot] 3 months ago
#339 - Status message: Unexpected error occurred: The actor 2c5251641e72297b4e3f4d7f01000000 is unavailable
Issue -
State: closed - Opened by lusongshuo-mt 3 months ago
- 2 comments
#338 - An error occurred during supervised fine-tuning.
Issue -
State: closed - Opened by hehebamei 3 months ago
- 2 comments
#337 - Multi-node training. Slurm vs Slurm + Ray
Issue -
State: closed - Opened by yannikkellerde 3 months ago
- 1 comment
#336 - vLLM related: model's max seq len (8192) is larger than the maximum number of tokens that can be stored in KV cache (6048).
Issue -
State: closed - Opened by mickelliu 3 months ago
- 2 comments
#335 - Support LoRA+VLLM, especially for ZeRO-3.
Pull Request -
State: closed - Opened by luo-li-ba-suo 3 months ago
- 4 comments
#334 - train_rm apply custom tokenizer chat template
Pull Request -
State: closed - Opened by mickelliu 3 months ago
#333 - Qwen2 ppo
Issue -
State: closed - Opened by Yusifu 3 months ago
- 1 comment
#332 - How much memory(RAM) is required to train a 70B Llama2 model with two 80G A800 nodes?
Issue -
State: open - Opened by luo-li-ba-suo 3 months ago
- 7 comments
#331 - PPO hangs at bundle_reservation_check_func after loading the models
Issue -
State: open - Opened by lixsh6 3 months ago
- 1 comment
#330 - Easy to miss bug that results in min_new_tokens not working
Pull Request -
State: closed - Opened by yannikkellerde 3 months ago
#329 - qwen2 72B PPO OOM
Issue -
State: open - Opened by lixsh6 4 months ago
- 5 comments
#328 - Update requirements.txt
Pull Request -
State: closed - Opened by Atry 4 months ago
- 3 comments
#327 - Could you give an example of testing deepspeed-chat time?
Issue -
State: closed - Opened by youngyoung321 4 months ago
- 7 comments
#326 - Loss is NaN when training a Qwen2 post-SFT model with KTO
Issue -
State: closed - Opened by vincezengqiang 4 months ago
- 3 comments
#325 - [rank3]: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cpu!
Issue -
State: closed - Opened by xiechengmude 4 months ago
- 3 comments
#324 - Generate function for distributional training
Issue -
State: open - Opened by louieworth 4 months ago
- 2 comments
#323 - model.generate does not work with multi-GPU parallelism
Issue -
State: closed - Opened by louieworth 4 months ago
- 2 comments
#322 - /openrlhf must be an existing directory or a zip package
Issue -
State: closed - Opened by harvinyou 4 months ago
- 1 comment
#321 - How do I specify the number of GPUs when launching training?
Issue -
State: closed - Opened by harvinyou 4 months ago
- 1 comment
#320 - [Question] Is multi-nodes stage 3 model loading supported?
Issue -
State: closed - Opened by mickelliu 4 months ago
- 2 comments
#319 - Could you provide the best training and inference parameters for Mixtral 8*7B?
Issue -
State: closed - Opened by harvinyou 4 months ago
- 1 comment
#318 - train_ppo_llama_ray.sh run two H800 machine error
Issue -
State: closed - Opened by yangzhipeng1108 4 months ago
- 3 comments
#317 - In Ray multi-node training, is DeepSpeed ZeRO-3 sharding still done across (number of nodes) * 8 GPUs?
Issue -
State: closed - Opened by lma-c4d 4 months ago
- 1 comment
#316 - train_ppo_llama_ray_70b.sh run two H800 machine error
Issue -
State: closed - Opened by yangzhipeng1108 4 months ago
- 1 comment
#315 - Moving model between GPU and CPU
Issue -
State: closed - Opened by kfertakis 4 months ago
- 3 comments
#314 - run train_ppo_llama_ray.sh error
Issue -
State: closed - Opened by yangzhipeng1108 4 months ago
#313 - Failed to update weights to vLLM
Issue -
State: closed - Opened by thirteenflt 4 months ago
- 3 comments
#312 - zero3 training error
Issue -
State: closed - Opened by karthik-nexusflow 4 months ago
- 1 comment
#311 - Can support for SimPO be added?
Issue -
State: open - Opened by victorShawFan 4 months ago
- 2 comments
#310 - wrong action_log_probs returned?
Issue -
State: closed - Opened by thirteenflt 4 months ago
- 1 comment
#309 - Does this codebase consider using "torch.compile"?
Issue -
State: closed - Opened by eyuansu62 4 months ago
- 2 comments
#308 - Dummy token for prompts in HH datasets
Issue -
State: open - Opened by louieworth 4 months ago
- 2 comments
#307 - Will 2 x GPU setups be supported
Issue -
State: open - Opened by llmlocal 4 months ago
- 1 comment
#306 - Training DPO with Deepseek-lite shows: expected mat1 and mat2 to have the same type, but got: float != c10::BFloat16
Issue -
State: open - Opened by victorShawFan 4 months ago
- 3 comments
#305 - Strange Kill of Critic Model
Issue -
State: open - Opened by Ricardokevins 4 months ago
- 5 comments
#304 - Suggestion on the configurations
Issue -
State: open - Opened by Ricardokevins 4 months ago
- 1 comment
#303 - Incompatibility with Qwen
Issue -
State: closed - Opened by Ricardokevins 4 months ago
- 2 comments
#302 - Support Llama-3 models
Issue -
State: closed - Opened by wenlinyao 5 months ago
- 1 comment
#301 - action_log_probs is computed redundantly
Issue -
State: closed - Opened by cdm114514 5 months ago
- 2 comments
#300 - [Question] EOS in reward model dataset
Issue -
State: open - Opened by qwenzo 5 months ago
- 3 comments
#299 - Claim your paper on HF
Issue -
State: closed - Opened by adeenayakup 5 months ago
- 1 comment
#298 - Added GPU memory specs and clarifications, fixed typo.
Pull Request -
State: closed - Opened by KT313 5 months ago
- 2 comments
#297 - Avoid monkey patching vLLM
Issue -
State: open - Opened by Atry 5 months ago
- 1 comment
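#296 - We are comparing the performance of DSChat and OpenRLHF to choose a framework; could you provide the fixed DSChat code so we can reproduce the performance comparison data published by the community?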
Issue -
State: closed - Opened by yinzhijian 5 months ago
- 7 comments
#295 - QLORA model loading error
Issue -
State: open - Opened by karthik-nexusflow 5 months ago
- 5 comments
#294 - maybe data bug with dpo trainer
Issue -
State: closed - Opened by none0663 5 months ago
- 1 comment
#293 - PPO produces a timeout error after switching to ZeRO stage 3
Issue -
State: open - Opened by victorShawFan 5 months ago
- 3 comments
#293 - PPO produces a timeout error after switching to ZeRO stage 3
Issue -
State: open - Opened by victorShawFan 5 months ago
- 2 comments
#292 - No response after enabling PPO Ray
Issue -
State: closed - Opened by victorShawFan 5 months ago
- 3 comments
#291 - RLHF for classification tasks
Issue -
State: closed - Opened by vinodrajendran001 5 months ago
- 2 comments
#291 - RLHF for classification tasks
Issue -
State: open - Opened by vinodrajendran001 5 months ago
- 2 comments
#290 - HTTPError when running train_ppo_llama_ray.sh
Issue -
State: closed - Opened by Zeyuan-Liu 5 months ago
- 5 comments
#290 - HTTPError when running train_ppo_llama_ray.sh
Issue -
State: open - Opened by Zeyuan-Liu 5 months ago
- 5 comments
#289 - [question] long context for single model ppo training
Issue -
State: closed - Opened by yananchen1989 5 months ago
- 1 comment
#288 - RM training loss becomes NAN when finish the first training step.
Issue -
State: open - Opened by lixsh6 5 months ago
- 1 comment