ShangtongZhang/reinforcement-learning-an-introduction issues and pull requests

#66 - chapter13

Issue - State: closed - Opened by liiiiiiiiil over 6 years ago - 3 comments

#65 - Changed the numpy array display of 'Random Policy' and 'Optimal Policy' to pyplot table

Pull Request - State: closed - Opened by cbrom over 6 years ago - 1 comment

#64 - Possible Error in updating Gradient Bandit algorithm

Issue - State: closed - Opened by cbrom over 6 years ago - 4 comments

#63 - np.random.shuffle(self.indices) in Chapter02 affects gradient

Issue - State: closed - Opened by cbrom over 6 years ago - 2 comments

#62 - Chapter 4, Gambler's problem incorrect output

Issue - State: closed - Opened by sharmavedic over 6 years ago - 2 comments

#61 - Minor changes

Pull Request - State: closed - Opened by tegg89 over 6 years ago

#60 - sample all the states in playerTrajectory in MC on policy

Pull Request - State: closed - Opened by huanghua1668 over 6 years ago

#59 - First-visit MC prediction for estimating v_pi

Issue - State: closed - Opened by huanghua1668 over 6 years ago - 1 comment

#58 - The link to the book is broken

Issue - State: closed - Opened by ghost almost 7 years ago - 1 comment

#57 - Is there any boundary for policies with off-policy Q-learning using the tree backup algorithm?

Issue - State: closed - Opened by ghost almost 7 years ago - 1 comment

#56 - One question posted on SOF

Issue - State: closed - Opened by cinqs almost 7 years ago - 1 comment

#55 - Update chapter 04

Pull Request - State: closed - Opened by Kulbear almost 7 years ago - 1 comment

#54 - Remove commented code

Pull Request - State: closed - Opened by yuhang-lin almost 7 years ago

#53 - Make graph and messages more friendly

Pull Request - State: closed - Opened by yuhang-lin almost 7 years ago - 1 comment

#52 - Fix not allowing same value in both arrays

Pull Request - State: closed - Opened by yuhang-lin almost 7 years ago - 3 comments

#51 - np.argmax may lead to unexpected behavior

Issue - State: closed - Opened by ShangtongZhang almost 7 years ago - 6 comments
Labels: Hint

#50 - Shouldn't update state action value using absolute priority in Priority Sweeping

Issue - State: closed - Opened by wumo almost 7 years ago - 1 comment

#49 - Could you please help to answer some questions?

Issue - State: closed - Opened by JieMEI1994 almost 7 years ago - 2 comments

#48 - Sarsa, Q-Learning Optimal Policy Output is Not Optimal

Issue - State: closed - Opened by mosicr about 7 years ago - 3 comments

#47 - questions on Q-learning applied for racetrack

Issue - State: closed - Opened by xubo92 about 7 years ago - 1 comment

#46 - About the action selection in Double Q-Learning

Issue - State: closed - Opened by ewanlee about 7 years ago - 4 comments

#45 - why 3 nested loop in function banditSimulation in TenArmedTestbed.py

Issue - State: closed - Opened by bestbzw about 7 years ago - 1 comment

#44 - Example 4.3 Gambler's Problem Cannot Reproduce Book's Plotting

Issue - State: closed - Opened by superxingzheng about 7 years ago - 3 comments

#43 - Merge pull request #1 from ShangtongZhang/master

Pull Request - State: closed - Opened by weixu000 about 7 years ago

#42 - A little problem in the Gambler's question in chapter4

Issue - State: closed - Opened by xubo92 over 7 years ago - 8 comments

#41 - chapter4:Gambler question. Cannot reproduce result, why?

Issue - State: closed - Opened by xubo92 over 7 years ago - 2 comments

#40 - exercise 4.8

Issue - State: closed - Opened by persistforever over 7 years ago - 2 comments

#39 - gambler's problem

Issue - State: closed - Opened by datahaki over 7 years ago - 4 comments

#37 - ImportError: No module named utils.utils

Issue - State: closed - Opened by SJTUGuofei over 7 years ago - 7 comments

#36 - Tic-Tac-Toe program didn't work well when AI is player 2.

Issue - State: closed - Opened by shengchun over 7 years ago - 2 comments

#35 - should there be some changes about the code in chapter 12 "Eligibility Traces", RandomWalk.py, class "OffLineLambdaReturn" function "nStepReturnFromTime"

Issue - State: closed - Opened by xiaogengyaokeyan over 7 years ago - 6 comments

#34 - Ch02 TenArmedTestbed sampleAverage

Issue - State: closed - Opened by wugh over 7 years ago - 2 comments

#33 - Some explanation of tictactoe is required

Issue - State: closed - Opened by mohanr over 7 years ago - 1 comment

#32 - divide by one

Pull Request - State: closed - Opened by ndvanforeest over 7 years ago

#31 - np.argmax

Pull Request - State: closed - Opened by ndvanforeest over 7 years ago - 5 comments

#30 - Updated contribution list.

Pull Request - State: closed - Opened by kentan over 7 years ago - 2 comments

#29 - Set the boolean improvePolicy to false after each policy Improvement …

Pull Request - State: closed - Opened by loopinvariant4 over 7 years ago

#28 - Policy Iteration in Chapter4 for RentalCar

Issue - State: closed - Opened by loopinvariant4 over 7 years ago - 4 comments
Labels: bug

#27 - debug errors for missing modules

Issue - State: closed - Opened by huiwenzhang over 7 years ago - 1 comment

#26 - Maybe Error in Chapter03/GridWorld.py

Issue - State: closed - Opened by ZiJianZhao over 7 years ago - 4 comments
Labels: invalid

#25 - Fixed typo.

Pull Request - State: closed - Opened by kentan over 7 years ago

#24 - Updating to run on a local envs

Pull Request - State: closed - Opened by dendisuhubdy over 7 years ago - 1 comment

#23 - update states when batch is False

Pull Request - State: closed - Opened by gwding over 7 years ago

#22 - Typo

Pull Request - State: closed - Opened by vfdev-5 over 7 years ago

#21 - Set up the Travis.CI integration environment and packaging.

Pull Request - State: closed - Opened by aoboturov over 7 years ago - 2 comments

#20 - Licensing: MIT?

Issue - State: closed - Opened by aoboturov over 7 years ago - 1 comment

#19 - Returning list type in case of "False" option.

Pull Request - State: closed - Opened by kentan over 7 years ago

#18 - Improved the performance of argmax.

Pull Request - State: closed - Opened by kentan over 7 years ago - 6 comments

#17 - Modified the codes so that they have python v3 compatibility.

Pull Request - State: closed - Opened by kentan over 7 years ago - 6 comments

#16 - Performance improvement.

Pull Request - State: closed - Opened by kentan over 7 years ago - 1 comment

#15 - Fix for the step size value for the non-average and non-gradient case.

Pull Request - State: closed - Opened by aoboturov over 7 years ago - 3 comments

#14 - Confused about the implementation of figure-5-3 in Blackjack.py

Issue - State: closed - Opened by findmyway over 7 years ago - 1 comment
Labels: question

#13 - add Figure 2.1

Pull Request - State: closed - Opened by findmyway over 7 years ago

#12 - Fix on value iteration.

Pull Request - State: closed - Opened by kentan almost 8 years ago

#11 - Remove redundant reset

Pull Request - State: closed - Opened by findmyway almost 8 years ago - 1 comment

#10 - Observation & Offer To Help

Issue - State: closed - Opened by atki4564 almost 8 years ago - 1 comment
Labels: question

#9 - Ch2, line 48, 62, & 77 : don't seem to match book calc

Issue - State: closed - Opened by atki4564 almost 8 years ago - 7 comments
Labels: invalid

#8 - Ch2, line 62: What does the operator '+ \' do?

Issue - State: closed - Opened by atki4564 almost 8 years ago - 1 comment
Labels: invalid

#7 - Release date - 2nd Edition

Issue - State: closed - Opened by andrewcz almost 8 years ago - 12 comments

#6 - Just fixed misspelling.

Pull Request - State: closed - Opened by kentan almost 8 years ago

#5 - Just fixed misspelling.

Pull Request - State: closed - Opened by kentan almost 8 years ago - 1 comment

#4 - Make the value function invariant under rotation and mirror of the board for Tic-Tac-Toe

Issue - State: closed - Opened by ShangtongZhang almost 8 years ago
Labels: enhancement

#3 - Remove unused variable

Pull Request - State: closed - Opened by datahaki almost 8 years ago

#2 - Add a link to learning resource

Issue - State: closed - Opened by dmulitsa almost 8 years ago - 1 comment
Labels: enhancement

#1 - Fixed a bug on coping ndarray.

Pull Request - State: closed - Opened by kentan almost 8 years ago - 5 comments
