MorvanZhou/Reinforcement-learning-with-tensorflow issues and pull requests

#50 - How should A3C's UPDATE_GLOBAL_ITER be chosen when the reward is very sparse?

Issue - State: closed - Opened by mas17kaworu over 6 years ago - 1 comment

#49 - DDPG Critic implementation

Issue - State: closed - Opened by dynamik1703 almost 7 years ago - 1 comment

#48 - Q-learning and the other algorithms are not explained thoroughly

Issue - State: closed - Opened by freelogic almost 7 years ago - 1 comment

#47 - Why there is stop_gradient for td?

Issue - State: closed - Opened by mas17kaworu almost 7 years ago - 1 comment

#46 - How should I output my response policy table?

Issue - State: closed - Opened by Ostnie almost 7 years ago - 1 comment

#45 - Is DeepMind suspected of plagiarizing you? :)

Issue - State: closed - Opened by freelogic almost 7 years ago - 1 comment

#44 - How to save only the global network's parameters in the A3C code, rather than all parameters

Issue - State: closed - Opened by xiaokeZuo almost 7 years ago - 1 comment

#43 - How is the state dimension 7? For 3 arms would it be 9?

Issue - State: closed - Opened by Sarthak-02 almost 7 years ago

#42 - How to save network in DDPG_update2.py?

Issue - State: closed - Opened by dynamik1703 almost 7 years ago

#41 - A question about training the Critic network

Issue - State: closed - Opened by sasforce almost 7 years ago - 1 comment

#40 - Using DDPG_update2.py with pendulum, reward not converging

Issue - State: closed - Opened by lisayan almost 7 years ago - 1 comment

#39 - Solved some issues encounter during research

Pull Request - State: closed - Opened by tomymehdi almost 7 years ago

#38 - maze_env.py raises an error in q_learning

Issue - State: closed - Opened by nuass almost 7 years ago - 1 comment

#37 - Playing Flappy Bird with A3C

Issue - State: closed - Opened by XueXiXueXiHaHa almost 7 years ago

#36 - A question about Prioritized Experience DQN

Issue - State: closed - Opened by huzhejie almost 7 years ago

#35 - Better Exploration with Parameter Noise

Issue - State: closed - Opened by dynamik1703 almost 7 years ago - 2 comments

#34 - Avoid high frequent changes in DPPO

Issue - State: closed - Opened by dynamik1703 almost 7 years ago - 2 comments

#33 - Fix Epsilon

Pull Request - State: closed - Opened by mauricepoirrier almost 7 years ago - 1 comment

#32 - Save model in "experiments/Solve_LunarLander/A3C.py"

Issue - State: closed - Opened by joseska about 7 years ago - 3 comments

#31 - fix treasure_on_right choose_action bug

Pull Request - State: closed - Opened by chucklqsun about 7 years ago - 2 comments

#30 - Problem in a3c discrete implements about encourage exploration

Issue - State: closed - Opened by LoneWolfDog about 7 years ago - 1 comment

#29 - Questions regarding DQN_modified

Issue - State: closed - Opened by EveLIn3 about 7 years ago - 1 comment

#28 - The hidden layers of A3C have too many neurons

Issue - State: closed - Opened by caozhenxiang-kouji about 7 years ago - 8 comments

#27 - Why is the mean of A subtracted from A when combining into Q?

Issue - State: closed - Opened by HencyChen about 7 years ago - 3 comments

#26 - lambda parameter

Issue - State: closed - Opened by mynameisvinn about 7 years ago - 1 comment

#25 - small refactoring of RL_brain.py

Pull Request - State: closed - Opened by alexpantyukhin about 7 years ago

#24 - Substitute pandas `ix` with `iloc`.

Issue - State: closed - Opened by alexpantyukhin about 7 years ago - 1 comment

#23 - Changed argmax to idxmax.

Pull Request - State: closed - Opened by hiroyachiba about 7 years ago - 1 comment

#22 - argmax() is deprecated, use idxmax() instead

Issue - State: closed - Opened by yangliu28 about 7 years ago - 1 comment

#21 - how to save A3C model

Issue - State: closed - Opened by endymecy about 7 years ago - 1 comment

#20 - 'terminal' in 2_Q_Learning_maze

Issue - State: closed - Opened by freebooterish about 7 years ago - 1 comment

#19 - AttributeError: 'NoneType' object has no attribute 'decode'

Issue - State: closed - Opened by deepmeng about 7 years ago - 4 comments

#18 - network cost don't convergence

Issue - State: closed - Opened by beimingmaster over 7 years ago - 1 comment

#17 - Why both AC and A3C examples use Value function not Q-function?

Issue - State: closed - Opened by zsdonghao over 7 years ago - 1 comment

#16 - stochastic policy for continuous control

Issue - State: closed - Opened by sufengniu over 7 years ago - 2 comments

#15 - Q-learning vs. Sarsa_lambda

Issue - State: closed - Opened by MHaneferd over 7 years ago - 1 comment

#14 - Main function of run_this.py?

Issue - State: closed - Opened by HalleyXie over 7 years ago - 2 comments

#13 - the code report the pyglet issues with OpenGL

Issue - State: closed - Opened by wuyohee2004 over 7 years ago - 1 comment

#12 - Question

Issue - State: closed - Opened by shivajid over 7 years ago - 3 comments

#11 - Question for Deep Q network

Issue - State: closed - Opened by wetliu over 7 years ago - 1 comment

#10 - Some questions about PPO

Issue - State: closed - Opened by 20chase over 7 years ago - 1 comment

#9 - please add # coding=utf-8 in the beginning of every python file

Issue - State: closed - Opened by zhaoying9105 over 7 years ago - 1 comment

#8 - car_env.py SyntaxError

Issue - State: closed - Opened by zhaoying9105 over 7 years ago - 1 comment

#7 - What do the generated images mean?

Issue - State: closed - Opened by huangh12 over 7 years ago - 2 comments

#6 - AC Cartpole: I think the better loss function is this one.

Issue - State: closed - Opened by zsdonghao over 7 years ago - 1 comment
