Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / boyu-ai/Hands-on-RL issues and pull requests
#92 - Chapter 10: ordering issue in the actor-critic algorithm
Issue -
State: closed - Opened by ppap36 about 1 month ago
- 3 comments
#91 - Request: please provide a requirements.txt
Issue -
State: open - Opened by HEHUA2005 about 1 month ago
#90 - Chapter 8: DDQN code fails to run
Issue -
State: open - Opened by FengYeXuanLv about 1 month ago
- 3 comments
#89 - Is it correct that the AC CartPole reward now exceeds 200?
Issue -
State: open - Opened by liaojiaxin97 about 2 months ago
- 1 comment
#88 - DQN runs after being modified for the new gym version, but the results are not as expected; testing with 1.8.3 is OK
Issue -
State: closed - Opened by mango-zx about 2 months ago
#87 - Question about PPO
Issue -
State: open - Opened by 394262597 3 months ago
- 1 comment
#86 - Chapter 7
Issue -
State: open - Opened by A1513906286 3 months ago
- 1 comment
#85 - Chapter 7
Issue -
State: open - Opened by A1513906286 3 months ago
#84 - Multi-armed bandit ε-greedy algorithm: problem in the explanation section
Issue -
State: open - Opened by gymdarius 3 months ago
#83 - trpo
Issue -
State: open - Opened by L-lorish 3 months ago
#82 - Typo in the policy gradient proof?
Issue -
State: open - Opened by lanceyliao 3 months ago
- 2 comments
#81 - Chapter 10 Actor-Critic: why is torch.mean applied to actor_loss?
Issue -
State: closed - Opened by lanceyliao 3 months ago
#80 - 3.6. Occupancy measure: why is it computed in reverse order?
Issue -
State: closed - Opened by lanceyliao 4 months ago
- 1 comment
#79 - Chapter 9: the policy gradient loss function
Issue -
State: open - Opened by mgt-lya 5 months ago
- 1 comment
#78 - https://www.boyuai.com/ is no longer accessible
Issue -
State: open - Opened by virtualxiaoman 6 months ago
- 1 comment
#77 - Markov decision process: the P computed when converting the MDP to an MRP appears to be wrong
Issue -
State: open - Opened by zyy777 6 months ago
- 1 comment
#76 - Suggestion about the layout of the web tutorial
Issue -
State: open - Opened by dctwan15 7 months ago
#75 - Chapter 13, DDPG algorithm: a small oversight in the code practice
Issue -
State: open - Opened by xiyanzzz 8 months ago
#74 - Why is init_prob 1.0 in the multi-armed bandit code?
Issue -
State: open - Opened by mafan1506 8 months ago
#73 - A tip about environment initialization
Issue -
State: open - Opened by Summer907 8 months ago
#72 - Training reward in the CartPole-v0 environment exceeds the upper limit of 200?
Issue -
State: closed - Opened by SHTechBoBo 8 months ago
- 1 comment
#71 - Web tutorial 3.3.2, value function: the derivation is somewhat unclear
Issue -
State: open - Opened by wangdehua01 8 months ago
#70 - Why does PPO modify the reward as reward=(reward+8)/8 in the pendulum experiment?
Issue -
State: closed - Opened by xxoospring 10 months ago
- 2 comments
#69 - A small problem in the SAC pseudocode
Issue -
State: open - Opened by taojunhui 11 months ago
#68 - DQN ReplayBuffer
Issue -
State: open - Opened by xxoospring 11 months ago
- 1 comment
#67 - Running the PPO code in Spyder, the kernel shuts down automatically
Issue -
State: closed - Opened by Shawkncok 11 months ago
- 1 comment
#66 - After changing AC to off-policy, an error occurs after training on roughly 500 experiences each time; the line action_dist = torch.distributions.Categorical(probs) shows a result of tensor([[nan, nan]]
Issue -
State: open - Opened by Chensyfighting 11 months ago
- 3 comments
#65 - Formula error---https://hrl.boyuai.com/chapter/1/时序差分算法#55-q-learning-算法
Issue -
State: closed - Opened by wslgqq277g 12 months ago
- 1 comment
#64 - 7.4 DQN algorithm: does backpropagation actually compute gradients??
Issue -
State: open - Opened by anranyicheng about 1 year ago
- 1 comment
#63 - SAC algorithm: problem with the state value function
Issue -
State: open - Opened by Dilettante258 about 1 year ago
#62 - Runtime environment
Issue -
State: open - Opened by zheng-lv about 1 year ago
- 1 comment
#61 - Chapter 21 MADDPG code problem: dimension mismatch
Issue -
State: open - Opened by CorneliusDeng about 1 year ago
- 2 comments
#60 - Code problem in Chapter 20
Issue -
State: open - Opened by Wayne857 about 1 year ago
- 3 comments
#59 - Chapter 7: DNQ return exceeds 200
Issue -
State: closed - Opened by KingOfChuXuan about 1 year ago
- 1 comment
#58 - Resolved
Issue -
State: closed - Opened by Thovenfish about 1 year ago
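#57 - Someone unemployed for three years disagrees with this claim!: In UCB, the denominator of U_t(a) is the number of times each arm has been pulled plus the constant 1, which ensures every action is **explored at least once**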
Issue -
State: open - Opened by StevenJokess about 1 year ago
#56 - Why are the references on slides 7 and 8 of the MARL PPT identical?
Issue -
State: open - Opened by StevenJokess about 1 year ago
- 1 comment
#55 - Chapter 3, Markov decision process: the function for computing the return in 3.3.1 has a problem
Issue -
State: open - Opened by Sen1553 over 1 year ago
#54 - Chapter 8, `In [7]` code block: VAnet() appears to be wrong
Issue -
State: open - Opened by Aegis1863 over 1 year ago
- 1 comment
#53 - Chapter 8, further reading: the formula derivation result is wrong; adding the integration-by-parts steps
Issue -
State: open - Opened by Aegis1863 over 1 year ago
#52 - Chapter 9, policy gradient algorithm: where in the code does the cross-entropy loss appear?
Issue -
State: open - Opened by chensisi0730 over 1 year ago
#51 - About development environment configuration
Issue -
State: open - Opened by mellody11 over 1 year ago
- 4 comments
#50 - Chapter 7: DQN code raises an error when run
Issue -
State: open - Opened by ShuoZheLi over 1 year ago
- 3 comments
#49 - Made an EPUB version
Issue -
State: open - Opened by wizardforcel over 1 year ago
#48 - In the DQN and AC algorithms, why is q_targets multiplied by (1-done) at the end of the loss computation?
Issue -
State: open - Opened by superbignut over 1 year ago
- 2 comments
#47 - Monte Carlo sampling of actions and states: why is the temp variable accumulated?
Issue -
State: open - Opened by ChengchengDu over 1 year ago
- 1 comment
#46 - Typo in the DDPG algorithm chapter
Issue -
State: closed - Opened by Neuerliu over 1 year ago
- 1 comment
#45 - Chapter 18 cql code
Issue -
State: open - Opened by Jaceyxy over 1 year ago
#44 - Chapter 14, SAC algorithm code practice: the log probability density of the tanh_normal distribution looks incorrect
Issue -
State: open - Opened by SurprisedCat over 1 year ago
- 11 comments
#43 - Chapter 16, model predictive control, EnsembleModel class: problem with the train method
Issue -
State: open - Opened by Yandong23 over 1 year ago
- 1 comment
#42 - Chapter 20: win is undefined?
Issue -
State: closed - Opened by beyondliaaaa over 1 year ago
#41 - The formula in 3.5 is inaccurate
Issue -
State: closed - Opened by administrator418 over 1 year ago