Stable-Baselines-Team/stable-baselines3-contrib issues and pull requests

#296 - Add vanilla RNN support to `RecurrentPPO`

Pull Request - State: open - Opened by gcroci2 about 2 months ago

#295 - [Feature Request] Add vanilla RNN support to `RecurrentPPO`

Issue - State: open - Opened by gcroci2 about 2 months ago
Labels: enhancement

#294 - Use classes for schedules instead of lambdas

Pull Request - State: closed - Opened by akanto 3 months ago

#293 - [Question] rollout_buffer_class in RecurrentPPO

Issue - State: open - Opened by SaltyPoseidon 3 months ago - 4 comments
Labels: question

#292 - [Bug]: MaskablePPO Inaccurate update counting when target_kl early exits

Issue - State: open - Opened by Sean-Fuhrman 3 months ago - 1 comment
Labels: bug, good first issue, help wanted

#291 - [Question] Integrating Behavior Cloning With Maskable PPO

Issue - State: open - Opened by kaihansen8 3 months ago - 2 comments
Labels: question, custom gym env

#290 - Fix `RecurrentRolloutBuffer` not taking `env_change` into account for resetting states

Pull Request - State: open - Opened by araffin 4 months ago

#289 - [Question] Is it possible to not load optimizer state during inference?

Issue - State: closed - Opened by superkido511 4 months ago - 1 comment
Labels: question, RTFM

#288 - Release v2.6.0

Pull Request - State: closed - Opened by araffin 4 months ago

#287 - Maskable Recurrent PPO

Pull Request - State: open - Opened by akbaig 4 months ago - 4 comments

#286 - Add unit tests for RecurrentDictRolloutBuffer

Pull Request - State: open - Opened by DanielAvdar 5 months ago

#285 - [Question] RecurrentRolloutBuffer samples LSTM states from multiple VecEnv environments in a single sample

Issue - State: open - Opened by DanielAvdar 5 months ago - 3 comments
Labels: question

#284 - [Question] Why env_change[batch_inds] is not considered during _get_samples(*) in RecurrentRolloutBuffer?

Issue - State: open - Opened by Hhannzch 5 months ago - 4 comments
Labels: question

#283 - [Question] Significant Performance Disparity Between Maskable PPO and PPO

Issue - State: open - Opened by gemelom 5 months ago - 1 comment
Labels: question

#282 - [Question]: while I'm training with RecurrentPPO the entropy_loss becomes positive. How is this possible?

Issue - State: closed - Opened by LucaGiorcelli 5 months ago
Labels: question

#281 - Upgrade to Gymnasium v1.1

Pull Request - State: closed - Opened by araffin 5 months ago

#280 - Test using Gymnasium v1.1.0

Pull Request - State: closed - Opened by pseudo-rnd-thoughts 5 months ago

#279 - [Feature Request] VecMaskWrapper for MaskablePPO

Issue - State: open - Opened by CAI23sbP 5 months ago - 1 comment
Labels: enhancement

#278 - Rename `_dump_logs()`

Pull Request - State: closed - Opened by araffin 6 months ago

#277 - [Question] How to test recurrent + maskable + dependent multidiscrete actions?

Issue - State: open - Opened by maxmax1992 6 months ago - 1 comment
Labels: question

#276 - Use `has_attr` for detecting masking support, fixes several issues

Pull Request - State: closed - Opened by araffin 6 months ago

#275 - Hybrid Group Relative Policy Optimization (Hybrid GRPO): A Multi-Sample Approach to Reinforcement Learning

Pull Request - State: open - Opened by Soham4001A 6 months ago - 2 comments

#274 - Fix crash using SubprocVecEnv with MaskablePPO (#49)

Pull Request - State: closed - Opened by KiuIras 6 months ago - 1 comment

#273 - [Feature Request] Group Relative Policy Optimization (GRPO)

Issue - State: open - Opened by Soham4001A 6 months ago - 5 comments
Labels: enhancement

#272 - GRPO - Feature Addition

Pull Request - State: open - Opened by Soham4001A 6 months ago - 4 comments

#271 - Release 2.5.0

Pull Request - State: closed - Opened by araffin 6 months ago

#270 - [Bug]: in "RecurrentPPO" not work "model.policy.evaluate_actions()"

Issue - State: open - Opened by drulye 7 months ago - 2 comments
Labels: bug, more information needed, check the checklist

#269 - [Feature Request] Support for multi input policies in CrossQ

Issue - State: open - Opened by RaikoPipe 7 months ago - 1 comment
Labels: enhancement

#268 - Added MultiInputPolicy support to CrossQ

Pull Request - State: open - Opened by RaikoPipe 7 months ago

#267 - [Feature Request] ACERAC

Issue - State: open - Opened by lychanl 8 months ago - 2 comments
Labels: enhancement

#266 - Add policy documentation links to policy_kwargs parameter

Pull Request - State: closed - Opened by kplers 8 months ago

#265 - [Question] Not updating lstm states during training

Issue - State: open - Opened by abhinavj98 8 months ago - 1 comment
Labels: question

#264 - Add missing condition in CI

Pull Request - State: closed - Opened by araffin 9 months ago

#263 - Drop python 3.8, add python 3.12 support

Pull Request - State: closed - Opened by araffin 9 months ago

#262 - Release v2.4.0

Pull Request - State: closed - Opened by araffin 9 months ago

#261 - Add support for gymnasium v1.0

Pull Request - State: closed - Opened by araffin 9 months ago

#260 - Update deps for read the doc

Pull Request - State: closed - Opened by araffin 9 months ago

#259 - Fix QRDQN loading `target_update_interval`

Pull Request - State: closed - Opened by jak3122 10 months ago

#258 - [Bug]: loading QRDQN changes target_update_interval

Issue - State: closed - Opened by jak3122 10 months ago
Labels: bug

#257 - [Question] Why can't directly use the PPO (RecurrentActorCriticPolicy, "CartPole - v1", verbose = 1)

Issue - State: open - Opened by dajianer 11 months ago - 1 comment
Labels: question, more information needed, check the checklist

#256 - [Bug]: Is sb3_contrib/common/maskable/utils.py the cause of "WARN: env.action_masks to get variables from other wrappers is deprecated and will be removed in v1.0"?

Issue - State: open - Opened by mkbg8 12 months ago - 1 comment
Labels: enhancement, custom gym env

#255 - Fix warning when loading a `RecurrentPPO` model

Pull Request - State: closed - Opened by araffin 12 months ago

#254 - [Bug]: FutureWarning: You are using `torch.load` with `weights_only=False`

Issue - State: closed - Opened by drulye 12 months ago - 3 comments
Labels: bug, more information needed

#253 - [Feature Request] same random seed for every env in AsyncEval

Issue - State: open - Opened by 1-Bart-1 about 1 year ago - 1 comment
Labels: enhancement, check the checkboxes

#252 - Update QR-DQN optimizer to only use q_net parameters

Pull Request - State: closed - Opened by corentinlger about 1 year ago - 1 comment

#251 - Update SB3 and remove gSDE resampling

Pull Request - State: closed - Opened by araffin about 1 year ago

#250 - [Question] Masked actions PPO in multiagent setting using PettingZoo

Issue - State: open - Opened by MarcoPicione about 1 year ago
Labels: question

#249 - [Question] Apply Masking using ActionMasker on composite actions

Issue - State: closed - Opened by mwalidcharrwi about 1 year ago - 4 comments
Labels: duplicate, question, more information needed

#248 - [Question] How to do pre-training on the RecurrentPPO MlpLstmPolicy

Issue - State: open - Opened by iwishiwasaneagle about 1 year ago - 1 comment
Labels: question

#247 - MaskablePPO Masking Doesn't Work with Big Action Space

Issue - State: closed - Opened by orkunkn about 1 year ago - 4 comments
Labels: custom gym env, check the checklist

#246 - RecurrentActorCriticPolicy Behaviour Not Clear

Issue - State: open - Opened by pasinit about 1 year ago - 1 comment
Labels: documentation

#245 - TQC: ep_len_mean and ep_rew_mean does not match real values

Issue - State: open - Opened by btabia about 1 year ago
Labels: bug, custom gym env, check the checkboxes

#244 - ep_len_mean discrepancy

Issue - State: closed - Opened by btabia about 1 year ago
Labels: custom gym env

#243 - Implemented CrossQ

Pull Request - State: closed - Opened by danielpalen about 1 year ago - 11 comments

#242 - Dependent Actions in MultiDiscrete Action Space

Issue - State: open - Opened by bbarisbaturay over 1 year ago - 6 comments
Labels: question

#241 - [Question] Recurrent Maskable PPO ?!? Rudder ?!?

Issue - State: closed - Opened by tty666 over 1 year ago - 1 comment
Labels: duplicate, question, trading warning

#240 - [Question] What is the difference between old_distribution and distribution in train function of TRPO

Issue - State: closed - Opened by 0Addicted0 over 1 year ago - 2 comments
Labels: question

#239 - [Question] RecurrentPPO: Reset LSTM states early?

Issue - State: open - Opened by phisad over 1 year ago - 3 comments
Labels: enhancement, question

#238 - [Feature Request] Implement CrossQ

Issue - State: closed - Opened by danielpalen over 1 year ago
Labels: enhancement

#237 - Fix typo in changelog

Pull Request - State: closed - Opened by araffin over 1 year ago

#236 - Release v2.3.0

Pull Request - State: closed - Opened by araffin over 1 year ago

#235 - Log success rate for PPO variants

Pull Request - State: closed - Opened by araffin over 1 year ago

#234 - [Question] Why does MaskablePPO does not mask with some logic with last observation?

Issue - State: open - Opened by EloyAnguiano over 1 year ago - 4 comments
Labels: question

#233 - Fix PPO maskable type annotations

Pull Request - State: closed - Opened by araffin over 1 year ago

#232 - Update ruff and SB3 dependencies

Pull Request - State: closed - Opened by araffin over 1 year ago

#231 - [Question] Simple way to implement data augmentation when training agent

Issue - State: closed - Opened by thomashirtz over 1 year ago - 2 comments
Labels: question

#230 - [Question] LSTM observations

Issue - State: closed - Opened by suargi over 1 year ago - 3 comments
Labels: question

#229 - Fix `train_freq` type annotation for TQC and QR-DQN

Pull Request - State: closed - Opened by Armandpl over 1 year ago

#228 - Episodic training with TQC?

Issue - State: closed - Opened by Armandpl over 1 year ago - 2 comments
Labels: enhancement, question

#227 - Add note about MaskableEvalCallback

Pull Request - State: closed - Opened by icheered over 1 year ago

#226 - EvalCallback crashes Maskable PPO without error

Issue - State: closed - Opened by icheered over 1 year ago - 3 comments
Labels: documentation, help wanted, custom gym env

#225 - Update QRDQN defaults

Pull Request - State: closed - Opened by araffin over 1 year ago

#224 - Implementing "Sibling Rivalry" Method from "Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards" Paper

Issue - State: open - Opened by vladyskai over 1 year ago - 2 comments
Labels: enhancement
