GitHub / Stable-Baselines-Team/stable-baselines3-contrib issues and pull requests
#296 - Add vanilla RNN support to `RecurrentPPO`
Pull Request -
State: open - Opened by gcroci2 about 2 months ago
#295 - [Feature Request] Add vanilla RNN support to `RecurrentPPO`
Issue -
State: open - Opened by gcroci2 about 2 months ago
Labels: enhancement
#294 - Use classes for schedules instead of lambdas
Pull Request -
State: closed - Opened by akanto 3 months ago
#293 - [Question] rollout_buffer_class in RecurrentPPO
Issue -
State: open - Opened by SaltyPoseidon 3 months ago
- 4 comments
Labels: question
#292 - [Bug]: MaskablePPO Inaccurate update counting when target_kl early exits
Issue -
State: open - Opened by Sean-Fuhrman 3 months ago
- 1 comment
Labels: bug, good first issue, help wanted
#291 - [Question] Integrating Behavior Cloning With Maskable PPO
Issue -
State: open - Opened by kaihansen8 3 months ago
- 2 comments
Labels: question, custom gym env
#290 - Fix `RecurrentRolloutBuffer` not taking `env_change` into account for resetting states
Pull Request -
State: open - Opened by araffin 4 months ago
#289 - [Question] Is it possible to not load optimizer state during inference?
Issue -
State: closed - Opened by superkido511 4 months ago
- 1 comment
Labels: question, RTFM
#288 - Release v2.6.0
Pull Request -
State: closed - Opened by araffin 4 months ago
#287 - Maskable Recurrent PPO
Pull Request -
State: open - Opened by akbaig 4 months ago
- 4 comments
#286 - Add unit tests for RecurrentDictRolloutBuffer
Pull Request -
State: open - Opened by DanielAvdar 5 months ago
#285 - [Question] RecurrentRolloutBuffer samples LSTM states from multiple VecEnv environments in a single sample
Issue -
State: open - Opened by DanielAvdar 5 months ago
- 3 comments
Labels: question
#284 - [Question] Why env_change[batch_inds] is not considered during _get_samples(*) in RecurrentRolloutBuffer?
Issue -
State: open - Opened by Hhannzch 5 months ago
- 4 comments
Labels: question
#283 - [Question] Significant Performance Disparity Between Maskable PPO and PPO
Issue -
State: open - Opened by gemelom 5 months ago
- 1 comment
Labels: question
#282 - [Question]: while I'm training with RecurrentPPO the entropy_loss becomes positive. How is this possible?
Issue -
State: closed - Opened by LucaGiorcelli 5 months ago
Labels: question
#281 - Upgrade to Gymnasium v1.1
Pull Request -
State: closed - Opened by araffin 5 months ago
#280 - Test using Gymnasium v1.1.0
Pull Request -
State: closed - Opened by pseudo-rnd-thoughts 5 months ago
#279 - [Feature Request] VecMaskWrapper for MaskablePPO
Issue -
State: open - Opened by CAI23sbP 5 months ago
- 1 comment
Labels: enhancement
#278 - Rename `_dump_logs()`
Pull Request -
State: closed - Opened by araffin 6 months ago
#277 - [Question] How to test recurrent + maskable + dependent multidiscrete actions?
Issue -
State: open - Opened by maxmax1992 6 months ago
- 1 comment
Labels: question
#276 - Use `has_attr` for detecting masking support, fixes several issues
Pull Request -
State: closed - Opened by araffin 6 months ago
#275 - Hybrid Group Relative Policy Optimization (Hybrid GRPO): A Multi-Sample Approach to Reinforcement Learning
Pull Request -
State: open - Opened by Soham4001A 6 months ago
- 2 comments
#274 - Fix crash using SubprocVecEnv with MaskablePPO (#49)
Pull Request -
State: closed - Opened by KiuIras 6 months ago
- 1 comment
#273 - [Feature Request] Group Relative Proximity Optimization (GRPO)
Issue -
State: open - Opened by Soham4001A 6 months ago
- 5 comments
Labels: enhancement
#272 - GPRO - Feature Addition
Pull Request -
State: open - Opened by Soham4001A 6 months ago
- 4 comments
#271 - Release 2.5.0
Pull Request -
State: closed - Opened by araffin 6 months ago
#270 - [Bug]: in "RecurrentPPO" not work "model.policy.evaluate_actions()"
Issue -
State: open - Opened by drulye 7 months ago
- 2 comments
Labels: bug, more information needed, check the checklist
#269 - [Feature Request] Support for multi input policies in CrossQ
Issue -
State: open - Opened by RaikoPipe 7 months ago
- 1 comment
Labels: enhancement
#268 - Added MultiInputPolicy support to CrossQ
Pull Request -
State: open - Opened by RaikoPipe 7 months ago
#267 - [Feature Request] ACERAC
Issue -
State: open - Opened by lychanl 8 months ago
- 2 comments
Labels: enhancement
#266 - Add policy documentation links to policy_kwargs parameter
Pull Request -
State: closed - Opened by kplers 8 months ago
#265 - [Question] Not updating lstm states during training
Issue -
State: open - Opened by abhinavj98 8 months ago
- 1 comment
Labels: question
#264 - Add missing condition in CI
Pull Request -
State: closed - Opened by araffin 9 months ago
#263 - Drop python 3.8, add python 3.12 support
Pull Request -
State: closed - Opened by araffin 9 months ago
#262 - Release v2.4.0
Pull Request -
State: closed - Opened by araffin 9 months ago
#261 - Add support for gymnasium v1.0
Pull Request -
State: closed - Opened by araffin 9 months ago
#260 - Update deps for read the doc
Pull Request -
State: closed - Opened by araffin 9 months ago
#259 - Fix QRDQN loading `target_update_interval`
Pull Request -
State: closed - Opened by jak3122 10 months ago
#258 - [Bug]: loading QRDQN changes target_update_interval
Issue -
State: closed - Opened by jak3122 10 months ago
Labels: bug
#257 - [Question] Why can't directly use the PPO (RecurrentActorCriticPolicy, "CartPole-v1", verbose = 1)
Issue -
State: open - Opened by dajianer 11 months ago
- 1 comment
Labels: question, more information needed, check the checklist
#256 - [Bug]: Is sb3_contrib/common/maskable/utils.py the cause of "WARN: env.action_masks to get variables from other wrappers is deprecated and will be removed in v1.0"?
Issue -
State: open - Opened by mkbg8 12 months ago
- 1 comment
Labels: enhancement, custom gym env
#255 - Fix warning when loading a `RecurrentPPO` model
Pull Request -
State: closed - Opened by araffin 12 months ago
#254 - [Bug]: FutureWarning: You are using `torch.load` with `weights_only=False`
Issue -
State: closed - Opened by drulye 12 months ago
- 3 comments
Labels: bug, more information needed
#253 - [Feature Request] same random seed for every env in AsyncEval
Issue -
State: open - Opened by 1-Bart-1 about 1 year ago
- 1 comment
Labels: enhancement, check the checkboxes
#252 - Update QR-DQN optimizer to only use q_net parameters
Pull Request -
State: closed - Opened by corentinlger about 1 year ago
- 1 comment
#251 - Update SB3 and remove gSDE resampling
Pull Request -
State: closed - Opened by araffin about 1 year ago
#250 - [Question] Masked actions PPO in multiagent setting using PettingZoo
Issue -
State: open - Opened by MarcoPicione about 1 year ago
Labels: question
#249 - [Question] Apply Masking using ActionMasker on composite actions
Issue -
State: closed - Opened by mwalidcharrwi about 1 year ago
- 4 comments
Labels: duplicate, question, more information needed
#248 - [Question] How to do pre-training on the RecurrentPPO MlpLstmPolicy
Issue -
State: open - Opened by iwishiwasaneagle about 1 year ago
- 1 comment
Labels: question
#247 - MaskablePPO Masking Doesn't Work with Big Action Space
Issue -
State: closed - Opened by orkunkn about 1 year ago
- 4 comments
Labels: custom gym env, check the checklist
#246 - RecurrentActorCriticPolicy Behaviour Not Clear
Issue -
State: open - Opened by pasinit about 1 year ago
- 1 comment
Labels: documentation
#245 - TQC: ep_len_mean and ep_rew_mean does not match real values
Issue -
State: open - Opened by btabia about 1 year ago
Labels: bug, custom gym env, check the checkboxes
#244 - ep_len_mean discrepancy
Issue -
State: closed - Opened by btabia about 1 year ago
Labels: custom gym env
#243 - Implemented CrossQ
Pull Request -
State: closed - Opened by danielpalen about 1 year ago
- 11 comments
#242 - Dependent Actions in MultiDiscrete Action Space
Issue -
State: open - Opened by bbarisbaturay over 1 year ago
- 6 comments
Labels: question
#241 - [Question] Recurrent Maskable PPO ?!? Rudder ?!?
Issue -
State: closed - Opened by tty666 over 1 year ago
- 1 comment
Labels: duplicate, question, trading warning
#240 - [Question] What is the difference between old_distribution and distribution in train function of TRPO
Issue -
State: closed - Opened by 0Addicted0 over 1 year ago
- 2 comments
Labels: question
#239 - [Question] RecurrentPPO: Reset LSTM states early?
Issue -
State: open - Opened by phisad over 1 year ago
- 3 comments
Labels: enhancement, question
#238 - [Feature Request] Implement CrossQ
Issue -
State: closed - Opened by danielpalen over 1 year ago
Labels: enhancement
#237 - Fix typo in changelog
Pull Request -
State: closed - Opened by araffin over 1 year ago
#236 - Release v2.3.0
Pull Request -
State: closed - Opened by araffin over 1 year ago
#235 - Log success rate for PPO variants
Pull Request -
State: closed - Opened by araffin over 1 year ago
#234 - [Question] Why does MaskablePPO does not mask with some logic with last observation?
Issue -
State: open - Opened by EloyAnguiano over 1 year ago
- 4 comments
Labels: question
#233 - Fix PPO maskable type annotations
Pull Request -
State: closed - Opened by araffin over 1 year ago
#232 - Update ruff and SB3 dependencies
Pull Request -
State: closed - Opened by araffin over 1 year ago
#231 - [Question] Simple way to implement data augmentation when training agent
Issue -
State: closed - Opened by thomashirtz over 1 year ago
- 2 comments
Labels: question
#230 - [Question] LSTM observations
Issue -
State: closed - Opened by suargi over 1 year ago
- 3 comments
Labels: question
#229 - Fix `train_freq` type annotation for TQC and QR-DQN
Pull Request -
State: closed - Opened by Armandpl over 1 year ago
#228 - Episodic training with TQC?
Issue -
State: closed - Opened by Armandpl over 1 year ago
- 2 comments
Labels: enhancement, question
#227 - Add note about MaskableEvalCallback
Pull Request -
State: closed - Opened by icheered over 1 year ago
#226 - EvalCallback crashes Maskable PPO without error
Issue -
State: closed - Opened by icheered over 1 year ago
- 3 comments
Labels: documentation, help wanted, custom gym env
#225 - Update QRDQN defaults
Pull Request -
State: closed - Opened by araffin over 1 year ago
#224 - Implementing "Sibling Rivalry" Method from "Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards" Paper
Issue -
State: open - Opened by vladyskai over 1 year ago
- 2 comments
Labels: enhancement
#223 - [Feature Request] STAC algorithm
Issue -
State: open - Opened by EloyAnguiano over 1 year ago
- 4 comments
Labels: enhancement
#222 - [Question] how to use "lstm_states" from rollout_buffer to reconstruct LSTM states during training
Issue -
State: closed - Opened by DeepRowLie over 1 year ago
- 2 comments
Labels: question
#221 - [Bug]: producing NAN values during training in MaskablePPO
Issue -
State: open - Opened by vahidqo over 1 year ago
- 5 comments
Labels: bug, more information needed, custom gym env, No tech support
#220 - [Feature Request] Expand RNN Options and Algorithm Flexibility
Issue -
State: open - Opened by mtnusf97 over 1 year ago
- 3 comments
Labels: enhancement
#219 - Update `_process_sequence()` docstring
Pull Request -
State: closed - Opened by rogerioagjr over 1 year ago
#218 - [Question] Recurrent PPO evaluation
Issue -
State: closed - Opened by CAI23sbP over 1 year ago
- 2 comments
Labels: question
#217 - Release v2.2.1: hotfix file closing
Pull Request -
State: closed - Opened by araffin over 1 year ago
#216 - Release v2.2.0
Pull Request -
State: closed - Opened by araffin over 1 year ago
#215 - Remove PyType and upgrade to latest SB3 version
Pull Request -
State: closed - Opened by araffin over 1 year ago
#214 - Add rollout_buffer_class to TRPO
Pull Request -
State: closed - Opened by ernestum almost 2 years ago
- 2 comments
#213 - Sync SB3 Contrib with SB3
Pull Request -
State: closed - Opened by araffin almost 2 years ago
#212 - Predicting actions after using MaskablePPO model outputs invalid action
Issue -
State: closed - Opened by vivek-kumar9696 almost 2 years ago
- 2 comments
Labels: duplicate, question, RTFM
#211 - Recurrent PPO Not Training Well on a Very Simple Environment
Issue -
State: open - Opened by sreejank almost 2 years ago
- 1 comment
Labels: custom gym env, No tech support
#210 - Worse training with Vectorized Environment
Issue -
State: closed - Opened by pklochowicz almost 2 years ago
Labels: more information needed, custom gym env, No tech support
#209 - How to use LSTM ? RecurrentPPO from sb3-contrib
Issue -
State: closed - Opened by PedroIAgithub almost 2 years ago
- 6 comments
Labels: question
#208 - Maskable PPO selects illegal actions, although everything looks correct
Issue -
State: closed - Opened by DominikRoB almost 2 years ago
- 2 comments
Labels: duplicate, question, more information needed
#207 - Decrease in reward during training with MaskablePPO
Issue -
State: open - Opened by vahidqo almost 2 years ago
Labels: question, more information needed, custom gym env
#206 - [Feature Request] BBF algorithm implementation
Issue -
State: open - Opened by Alian3785 almost 2 years ago
- 2 comments
Labels: enhancement
#205 - Speed up when using MaskablePPO
Issue -
State: open - Opened by vahidqo almost 2 years ago
- 2 comments
Labels: question
#204 - Release v2.1.0
Pull Request -
State: closed - Opened by araffin almost 2 years ago
#203 - SACD Discrete Soft Actor Critic
Pull Request -
State: open - Opened by splatter96 almost 2 years ago
- 5 comments
#202 - [Feature Request] Hybrid PPO
Issue -
State: open - Opened by AlexPasqua almost 2 years ago
- 5 comments
Labels: enhancement
#201 - [Feature Request] Implement Recurrent SAC
Issue -
State: open - Opened by masterdezign almost 2 years ago
- 17 comments
Labels: enhancement
#200 - [Bug]: inappropriate actions despite the MaskablePPO applied
Issue -
State: closed - Opened by koliber31 about 2 years ago
- 1 comment
Labels: custom gym env, No tech support, check the checkboxes
#199 - Bugfix/ppo mask stats window size
Pull Request -
State: closed - Opened by PatrickHelm about 2 years ago
- 3 comments
#198 - [Bug]: MaskablePPO ignores stats_window_size argument
Issue -
State: closed - Opened by PatrickHelm about 2 years ago
- 2 comments
Labels: bug, help wanted
#197 - [Question] Action mask dimensions for action combinations in a MultiDiscrete space
Issue -
State: closed - Opened by npit about 2 years ago
- 2 comments
Labels: question