OpenLLMAI/OpenRLHF issues and pull requests

#288 - RM training loss becomes NAN when finish the first training step.

Issue - State: open - Opened by lixsh6 5 months ago - 1 comment

#287 - Model refuses to answer after PPO training

Issue - State: open - Opened by burger-pb 5 months ago - 3 comments

#287 - Model refuses to answer after PPO training

Issue - State: closed - Opened by burger-pb 5 months ago - 3 comments

#286 - Vllm0.42 + Lora configs

Pull Request - State: closed - Opened by hijkzzz 5 months ago

#285 - Custom ExperienceMaker

Issue - State: open - Opened by mgerstgrasser 5 months ago - 4 comments

#284 - when import requests, class NewLineFormatter(logging.Formatter): AttributeError: partially initialized module 'logging' has no attribute 'Formatter' (most likely due to a circular import)

Issue - State: open - Opened by catqaq 5 months ago

#283 - fix vLLM v0.4.1

Pull Request - State: closed - Opened by hijkzzz 5 months ago - 3 comments

#282 - Update NGC and vllm version.

Issue - State: closed - Opened by THINK2TRY 5 months ago - 2 comments

#281 - Revert "vllm 0.4.1 compatibility (#278)"

Pull Request - State: closed - Opened by hijkzzz 5 months ago

#280 - fix typos in train_ppo_ray.py

Pull Request - State: closed - Opened by mickelliu 5 months ago - 1 comment

#279 - fix typos in train_ppo_ray.py

Pull Request - State: closed - Opened by mickelliu 5 months ago

#278 - vllm 0.4.1 compatibility

Pull Request - State: closed - Opened by mgerstgrasser 5 months ago - 6 comments

#277 - Out-of-memory problem

Issue - State: closed - Opened by burger-pb 5 months ago - 3 comments

#277 - Out-of-memory problem

Issue - State: open - Opened by burger-pb 5 months ago - 4 comments

#273 - Reward model dataset question

Issue - State: closed - Opened by burger-pb 6 months ago - 3 comments
Labels: documentation

#272 - PPO training configuration for train_ppo_llama.sh

Issue - State: closed - Opened by MurrayTom 6 months ago - 1 comment

#263 - [Baseline] LLaMA2-7B RLHF training curves

Issue - State: closed - Opened by hijkzzz 6 months ago - 2 comments

#263 - [Baseline] LLaMA2-7B RLHF training curves

Issue - State: open - Opened by hijkzzz 6 months ago - 2 comments

#256 - Is save checkpoint not yet supported for ppo ray trainer?

Issue - State: open - Opened by mickel-liu 6 months ago - 5 comments

#249 - add perf and benchmark scripts

Pull Request - State: closed - Opened by wuxibin89 7 months ago - 3 comments
Labels: documentation, enhancement, P0

#245 - enable_ema cause runtime error when running train_ppo_llama.sh

Issue - State: open - Opened by dshnightmare 7 months ago - 6 comments

#235 - DPO Loss

Issue - State: closed - Opened by paulcx 7 months ago - 14 comments

#211 - vllm +zero2 hangs

Issue - State: open - Opened by karthik19967829 8 months ago - 32 comments

#181 - why not include eos_token in action_seq, which may make mistakes?

Issue - State: closed - Opened by ZiyiLiubird 9 months ago - 16 comments

#158 - Why is it as much as 4x faster than DeepSpeed Chat?

Issue - State: closed - Opened by tingshua-yts 10 months ago - 4 comments

#102 - Feature: Support detailed running process management: save_steps, log_steps, eval_steps

Issue - State: closed - Opened by catqaq about 1 year ago - 7 comments
Labels: enhancement, P0

#101 - Bug: AttributeError: 'DeepspeedStrategy' object has no attribute 'save_hf_format'

Issue - State: closed - Opened by catqaq about 1 year ago - 2 comments
Labels: bug

#100 - HfDeepSpeedConfig must be kept during AutoModel.from_pretrained if using ZeRO-3

Issue - State: closed - Opened by wuxibin89 about 1 year ago - 1 comment
Labels: envs

#99 - basemodel and qlora add.

Pull Request - State: closed - Opened by John-Ge about 1 year ago

#98 - Add GPT-4 evaluation scripts

Issue - State: closed - Opened by hijkzzz about 1 year ago - 1 comment
Labels: enhancement

#97 - Do you have a plan for applying Reinforced Self-Training (ReST)?

Issue - State: closed - Opened by missflash about 1 year ago - 1 comment
Labels: enhancement

#96 - Add flash-attention2.0 support

Pull Request - State: closed - Opened by suc16 about 1 year ago
Labels: enhancement

#95 - PPO OOM

Issue - State: closed - Opened by catqaq about 1 year ago - 4 comments
Labels: envs

#94 - Enabling ppo-ptx raises an error about gradients being computed twice

Issue - State: closed - Opened by skepsun about 1 year ago - 9 comments

#93 - Support more prompt template in datasets

Issue - State: closed - Opened by hijkzzz about 1 year ago
Labels: enhancement

#92 - Larger models

Issue - State: closed - Opened by wanghao-007 about 1 year ago - 2 comments

#91 - A few questions

Issue - State: closed - Opened by skepsun about 1 year ago - 2 comments

#90 - available for reward model: OpenAssistant / reward-model-deberta-v3-large-v2

Pull Request - State: closed - Opened by RanchiZhao about 1 year ago - 1 comment

#88 - feat: add wandb logger in ppo trainer

Pull Request - State: closed - Opened by dabney777 about 1 year ago - 2 comments

#87 - Vocabulary overflow Issue with [PAD] for SFT

Issue - State: closed - Opened by leeeizhang about 1 year ago - 4 comments

#86 - feat: add Wandb logger

Pull Request - State: closed - Opened by dabney777 about 1 year ago - 3 comments

#85 - fix ds cpuadam bug

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#84 - fix cpu adam bug

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#83 - Support pretrain and post-pretrain

Issue - State: closed - Opened by catqaq about 1 year ago - 1 comment
Labels: enhancement, P1

#82 - refactor eval

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#81 - Dev add eval/ceval

Pull Request - State: closed - Opened by catqaq about 1 year ago - 1 comment

#80 - Dev

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#79 - revert init on gpu

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#78 - Dev

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#77 - fix update_timesteps

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#76 - support dataset with subfold

Pull Request - State: closed - Opened by wwxFromTju about 1 year ago - 1 comment

#75 - set seed

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#74 - Dev

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#73 - update license

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#72 - add ppo examples and fix container

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#71 - update docker version

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#70 - fix local rank

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#69 - fix gpus_per_node in scripts and readme

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#68 - [#52] Support Multi-nodes training on Slurm

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#67 - Support llama2 flash attention

Issue - State: closed - Opened by hijkzzz about 1 year ago - 2 comments
Labels: enhancement

#66 - Support DPO

Issue - State: closed - Opened by hijkzzz about 1 year ago - 2 comments
Labels: enhancement

#65 - Support checkpoint to prevent training from collapse

Issue - State: open - Opened by hijkzzz about 1 year ago - 9 comments
Labels: enhancement

#64 - update readme

Pull Request - State: closed - Opened by pikaqqqqqq about 1 year ago - 2 comments

#63 - pydantic.error_wrappers.ValidationError: 3 validation errors for DeepSpeedZeroConfig zero_hpz_partition_size extra fields not permitted (type=value_error.extra)

Issue - State: closed - Opened by pikaqqqqqq about 1 year ago - 1 comment

#62 - Support Evaluation Tools

Issue - State: closed - Opened by hijkzzz about 1 year ago - 3 comments
Labels: enhancement

#61 - fix readme error and add citation

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#60 - [QUESTION] huggingface login in readme

Issue - State: closed - Opened by suc16 about 1 year ago - 1 comment

#59 - update readme

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#58 - Support Lora & QLora

Issue - State: closed - Opened by hijkzzz about 1 year ago - 4 comments
Labels: enhancement, P0

#56 - Add better docs and usage examples

Issue - State: open - Opened by hijkzzz about 1 year ago - 2 comments
Labels: documentation, enhancement

#55 - Support Adam Optimizer offload and reload to GPU

Issue - State: closed - Opened by hijkzzz about 1 year ago
Labels: enhancement

#54 - Support wandb logs

Issue - State: closed - Opened by hijkzzz about 1 year ago - 1 comment
Labels: enhancement

#53 - Support Decision Transformer

Issue - State: closed - Opened by hijkzzz about 1 year ago
Labels: enhancement

#52 - Support Multi-nodes training on Slurm

Issue - State: closed - Opened by hijkzzz about 1 year ago - 1 comment

#51 - Support Multiple Reward Models

Issue - State: closed - Opened by hijkzzz about 1 year ago - 3 comments
Labels: enhancement

#50 - Support Rejection Sampling

Issue - State: closed - Opened by hijkzzz about 1 year ago - 2 comments
Labels: enhancement, P0

#49 - add: save huggingface checkpoint

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#48 - Support running on Ray as distributed RLHF framework.

Issue - State: closed - Opened by jovany-wang about 1 year ago - 1 comment
Labels: enhancement

#47 - Introduce LINT tools

Issue - State: closed - Opened by jovany-wang about 1 year ago - 2 comments
Labels: enhancement

#46 - remove cuda_launch_blocking

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#45 - fix oom

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#44 - Dev

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#43 - new datasets

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#42 - fix scripts

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#41 - fix prompt data name

Pull Request - State: closed - Opened by hijkzzz about 1 year ago

#40 - fix

Pull Request - State: closed - Opened by hijkzzz about 1 year ago
