MorvanZhou/Reinforcement-learning-with-tensorflow issues and pull requests

#101 - A question about tkinter

Issue - State: open - Opened by MorchelPeng about 6 years ago

#100 - Actor Critic neural combine

Issue - State: open - Opened by RozenAstrayChen about 6 years ago

#99 - About the NaN problem during the learn process

Issue - State: closed - Opened by zhouchunyi about 6 years ago - 4 comments

#98 - When changing actor return, values not changing?

Issue - State: open - Opened by ghost about 6 years ago - 1 comment

#97 - DDPG Gamma not working?

Issue - State: open - Opened by ghost about 6 years ago - 1 comment

#96 - DDPG soft replacement: zip function argument lengths do not match

Issue - State: closed - Opened by lzher about 6 years ago - 1 comment

#95 - Question about large amount of outputs

Issue - State: closed - Opened by YuffieHuang about 6 years ago - 1 comment

#94 - DDPG Sigmoid is spitting really high values

Issue - State: open - Opened by ghost about 6 years ago - 1 comment

#93 - Some advice on Actor-Critic neural

Issue - State: closed - Opened by RozenAstrayChen over 6 years ago - 3 comments

#92 - Some questions about the objective functions in policy gradient and PPO

Issue - State: open - Opened by BrainWWW over 6 years ago

#91 - In A3C_continuous_action.py every thread can update the global network parameters; could they conflict?

Issue - State: closed - Opened by exceeddream over 6 years ago - 2 comments

#90 - no problem

Issue - State: closed - Opened by exceeddream over 6 years ago

#89 - EMA Getter in DDPG not getting called?

Issue - State: closed - Opened by ghost over 6 years ago - 3 comments

#87 - avoid possible NaN error

Pull Request - State: closed - Opened by ruihuili over 6 years ago - 1 comment

#86 - Why is no noise added in the 7_Policy_gradient_softmax example?

Issue - State: closed - Opened by ouyangzhuzhu over 6 years ago - 1 comment

#85 - About how to feed action to critic

Issue - State: closed - Opened by sasforce over 6 years ago - 2 comments

#84 - Pytorch version

Issue - State: closed - Opened by flexibility2 over 6 years ago - 1 comment

#83 - How should a varying number of actions be handled?

Issue - State: closed - Opened by liqinxiao over 6 years ago - 1 comment

#82 - A small suggestion on the algorithm in the 4_Sarsa_lambda_maze tutorial

Issue - State: closed - Opened by xhxt2008 over 6 years ago - 1 comment

#81 - A question about batch_size

Issue - State: closed - Opened by Curry30h over 6 years ago - 1 comment

#80 - In the code, batch_size is the size of which data?

Issue - State: closed - Opened by Curry30h over 6 years ago - 1 comment

#79 - A writing mistake in DoubleDQN?

Issue - State: closed - Opened by YifanZhou95 over 6 years ago - 1 comment

#78 - How should DQN handle a three-dimensional observation?

Issue - State: closed - Opened by AlisaBen over 6 years ago - 1 comment

#77 - DDPG Dimensions

Issue - State: closed - Opened by ghost over 6 years ago

#76 - Could DDPG made without using Gym?

Issue - State: closed - Opened by ghost over 6 years ago - 1 comment

#75 - Question about global step

Issue - State: closed - Opened by raoqiyu over 6 years ago - 1 comment

#74 - DDPG env not working

Issue - State: closed - Opened by ghost over 6 years ago - 1 comment

#73 - why random action?

Issue - State: closed - Opened by sezan92 over 6 years ago - 7 comments

#72 - why no action required in value function?

Issue - State: closed - Opened by sezan92 over 6 years ago - 1 comment

#71 - Training results are wrong when running the 2_Q_learning_maze example

Issue - State: closed - Opened by wdxairforce over 6 years ago - 4 comments

#70 - the description of dynamic q learning

Issue - State: closed - Opened by yujianyuanhaha over 6 years ago - 1 comment

#69 - DPPO with discrete action space

Pull Request - State: closed - Opened by ruihuili over 6 years ago - 1 comment

#68 - OpenAI gym:observation

Issue - State: closed - Opened by AlisaBen over 6 years ago - 3 comments

#67 - a small problem about update order

Pull Request - State: closed - Opened by UesugiErii over 6 years ago - 3 comments

#66 - Q-Learning Loop

Issue - State: closed - Opened by ghost over 6 years ago

#65 - ppo algorithm question

Issue - State: closed - Opened by weihaosky over 6 years ago - 2 comments

#64 - DDPG: when an episode ends, is the Q value in that state 0?

Issue - State: closed - Opened by michelliming over 6 years ago - 1 comment

#63 - On the CartPole task, AC seems to perform worse than Policy Gradient?

Issue - State: closed - Opened by Junshuai-Song over 6 years ago - 6 comments

#62 - In Actor-Critic, in each training round, should td_error be taken first, or should the critic be trained first and td_error taken afterwards?

Issue - State: closed - Opened by Junshuai-Song over 6 years ago - 2 comments

#61 - Does traditional policy gradient suffer from a data correlation problem?

Issue - State: closed - Opened by Junshuai-Song over 6 years ago - 10 comments

#60 - Does DPPO speed up a lot if only data is pushed?

Issue - State: closed - Opened by Junshuai-Song over 6 years ago - 1 comment

#59 - How to Modify the Code?

Issue - State: closed - Opened by ghost over 6 years ago - 1 comment

#58 - Shouldn't this be super(RL... ?

Issue - State: closed - Opened by quyuanhang over 6 years ago - 2 comments

#57 - sum tree capacity in prioritized experienced replay

Issue - State: closed - Opened by XavierDDD over 6 years ago - 3 comments

#56 - A3C example fail after updating TF==1.6

Issue - State: open - Opened by zsdonghao over 6 years ago - 4 comments

#55 - Problem with more than one action - A3C

Issue - State: closed - Opened by akhilsanand over 6 years ago - 7 comments

#54 - why there is some ‘nans’ stored in the self.tree.tree arrays?

Issue - State: closed - Opened by zengjie617789 over 6 years ago - 1 comment

#53 - mu, sigma = mu * A_BOUND[1], sigma + 1e-4

Issue - State: closed - Opened by michelliming over 6 years ago - 2 comments

#52 - Does the apply_gradients function need a lock?

Issue - State: closed - Opened by michelliming over 6 years ago - 2 comments

#51 - Try to apply RL_brain to new gym enviroment

Issue - State: closed - Opened by nik31096 over 6 years ago
