Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / nikhilbarhate99/PPO-PyTorch issues and pull requests

#71 - Addition of some files and editing for the moving obstacle case

Pull Request - State: open - Opened by sidwat 4 months ago

#67 - (Solved) No env.reset() at the end of each training epoch.

Issue - State: open - Opened by slDeng1003 8 months ago - 2 comments

#66 - the version problem with gym and roboschool

Issue - State: open - Opened by ShunZuo-AI 9 months ago - 2 comments

#65 - Cannot see what purpose policy_old serves at all

Issue - State: open - Opened by haduoken 10 months ago - 6 comments

#64 - ValueError: expected sequence of length 8 at dim 1 (got 0)

Issue - State: open - Opened by kavinwkp almost 1 year ago - 1 comment

#63 - question

Issue - State: open - Opened by yinshuangshuang621671 over 1 year ago

#62 - optimize the existing Chinese generation model

Issue - State: open - Opened by ARES3366 over 1 year ago

#61 - Minor change

Pull Request - State: closed - Opened by mychoi97 almost 2 years ago

#59 - Test results are not good

Issue - State: open - Opened by 295885025 about 2 years ago

#58 - Would a shared network work?

Issue - State: open - Opened by Miguel-s-Amaral about 2 years ago

#57 - Setting Model to eval() mode in test.py

Issue - State: open - Opened by rllyryan about 2 years ago

#56 - error

Issue - State: open - Opened by deperado007 over 2 years ago - 5 comments

#55 - Update from roboschool to pybulletgym

Pull Request - State: closed - Opened by rahatsantosh almost 3 years ago - 1 comment

#54 - roboschool is deprecated

Issue - State: closed - Opened by rahatsantosh almost 3 years ago - 1 comment

#53 - [email protected]

Pull Request - State: open - Opened by Lilnoon2040 almost 3 years ago - 2 comments

#52 - Confusion about the loss function

Issue - State: closed - Opened by tlt18 almost 3 years ago - 1 comment

#51 - Convolutional?

Issue - State: closed - Opened by Bobingstern almost 3 years ago - 1 comment

#50 - About environment configuration

Issue - State: closed - Opened by BIT-KaiYu almost 3 years ago - 2 comments

#49 - how can I use this code for a problem with 3 different actions?

Issue - State: closed - Opened by m031n about 3 years ago - 1 comment

#48 - How to improve the performance based on your code?

Issue - State: closed - Opened by 4thfever about 3 years ago - 1 comment

#46 - policy.eval() after load_state_dict()

Issue - State: closed - Opened by xinqin23 over 3 years ago - 1 comment

#45 - The reward function for training?

Issue - State: closed - Opened by DongXingshuai over 3 years ago - 1 comment

#44 - PPO with determinate variance

Issue - State: closed - Opened by keinccgithub over 3 years ago - 3 comments

#43 - why detaching the state values when computing the advantage functions

Issue - State: closed - Opened by jingxixu over 3 years ago - 1 comment
Labels: duplicate

#42 - I got an error while running the program

Issue - State: closed - Opened by robot-xyh over 3 years ago - 2 comments

#41 - Fix for RuntimeError for Environments with single continuous actions.

Pull Request - State: closed - Opened by Aakarshan-chauhan over 3 years ago - 1 comment

#39 - Why does PPO use Monte Carlo estimation instead of value function estimation?

Issue - State: closed - Opened by outdoteth over 3 years ago - 1 comment

#38 - Discounted Reward Calculation (Generalized Advantage Estimation)

Issue - State: open - Opened by artest08 almost 4 years ago - 5 comments
Labels: help wanted, question

#37 - Monotonic improvement of PPO

Issue - State: closed - Opened by olixu about 4 years ago - 2 comments

#36 - Performance of PPO on other projects

Issue - State: closed - Opened by pengzhi1998 about 4 years ago - 3 comments

#35 - advantages = rewards - state_values.detach() problem

Issue - State: closed - Opened by fatalfeel over 4 years ago - 2 comments
Labels: duplicate

#34 - Question on multiple actors

Issue - State: closed - Opened by pengzhi1998 over 4 years ago - 2 comments

#33 - CUDA training error: expected dtype Double but got dtype Float

Issue - State: closed - Opened by fatalfeel over 4 years ago - 1 comment

#32 - Unexpected key(s) in state_dict: "affine.weight", "affine.bias".

Issue - State: closed - Opened by fatalfeel over 4 years ago - 2 comments

#31 - loss.mean().backward() crash

Issue - State: closed - Opened by fatalfeel over 4 years ago - 1 comment

#30 - added tensorboard to track several key metrics

Pull Request - State: closed - Opened by junkwhinger over 4 years ago

#29 - Question regarding state_values.detach()

Issue - State: closed - Opened by junkwhinger over 4 years ago - 3 comments

#28 - I'm a beginner, and I have a question about PPO_continuous.py

Issue - State: closed - Opened by GrehXscape over 4 years ago - 1 comment

#26 - Including GAE

Issue - State: closed - Opened by CesMak over 4 years ago - 1 comment

#25 - Export as ONNX Model

Issue - State: closed - Opened by CesMak over 4 years ago - 1 comment

#24 - can it be used for chess?

Issue - State: closed - Opened by Unimax over 4 years ago - 1 comment

#23 - Question about PPO_continuous.py

Issue - State: closed - Opened by HeegerGao almost 5 years ago - 2 comments

#22 - Fix Squeeze Under 1d Action Case

Pull Request - State: closed - Opened by xunzhang almost 5 years ago - 2 comments

#21 - When to Update

Issue - State: closed - Opened by xunzhang almost 5 years ago - 5 comments

#20 - PPO instead of PPO-M

Issue - State: closed - Opened by murtazabasu almost 5 years ago - 2 comments

#19 - Shared parameters for NN action_layer and NN value_layer

Issue - State: closed - Opened by ArnoudWellens almost 5 years ago - 3 comments

#18 - Ratio Calculation

Issue - State: closed - Opened by murtazabasu almost 5 years ago - 1 comment

#17 - Minor change

Pull Request - State: closed - Opened by noanabeshima almost 5 years ago

#16 - RAM gets full, which stops the training session

Issue - State: closed - Opened by murtazabasu almost 5 years ago - 2 comments

#15 - Learning from scratch without using pre-trained model

Issue - State: closed - Opened by EnnaSachdeva almost 5 years ago - 4 comments

#14 - Why maintain two policies?

Issue - State: closed - Opened by biggzlar almost 5 years ago - 3 comments

#13 - Question about GAE

Issue - State: closed - Opened by CatIIIIIIII almost 5 years ago - 1 comment

#12 - Why are ratios not always 1?

Issue - State: closed - Opened by BigBadBurrow about 5 years ago - 2 comments
Labels: duplicate

#11 - resetting timestep wrong?

Issue - State: closed - Opened by YilunZhou about 5 years ago - 1 comment

#10 - forget to copy policy to policy_old during ppo initialization?

Issue - State: closed - Opened by YilunZhou about 5 years ago - 1 comment

#9 - Generalized Advantage Estimation / GAE ?

Issue - State: closed - Opened by BigBadBurrow about 5 years ago - 3 comments

#8 - update() retains discounted_reward from previous episodes

Issue - State: closed - Opened by BigBadBurrow about 5 years ago - 2 comments

#7 - update() function to minimize, rather than maximize?

Issue - State: closed - Opened by BigBadBurrow about 5 years ago - 2 comments

#6 - Implementation issues

Issue - State: closed - Opened by kierkegaard13 over 5 years ago - 2 comments

#5 - The `ppo_continuous.py` model does not learn

Issue - State: closed - Opened by chingandy over 5 years ago - 2 comments

#4 - PPO for continuous env

Issue - State: closed - Opened by zbenic over 5 years ago - 1 comment

#3 - how did you figure out continuous?

Issue - State: closed - Opened by nyck33 over 5 years ago - 1 comment

#2 - edit advantage in surrogate

Pull Request - State: closed - Opened by AlpoGIT over 5 years ago - 1 comment

#1 - Create LICENSE

Pull Request - State: closed - Opened by nikhilbarhate99 about 6 years ago