Reinforcement Learning / Machine Learning Research Journal JMLR
Published in the Journal of Machine Learning Research: reinforcement learning...
(Entries are added over time; this is not a complete list.)
*Inverse Reinforcement Learning [#pf244e68]
-[[''Inverse Reinforcement Learning in Partially Observab...
Jaedeug Choi, Kee-Eung Kim~
JMLR 12:691-730 (2011).~
'''Keywords:''' inverse reinforcement learning, partially obs...
*POMDP [#i24c2b85]
-[[''A Bayesian Approach for Learning and Planning in Par...
Stéphane Ross, Joelle Pineau, Brahim Chaib-draa, Pierre K...
JMLR 12:1729-1770 (2011).~
'''Keywords:''' reinforcement learning, Bayesian inferenc...
-[[''Multi-task Reinforcement Learning in Partially Obser...
Hui Li, Xuejun Liao, Lawrence Carin (Duke University)~
JMLR 10:1131-1186 (2009).~
'''Keywords:''' reinforcement learning, partially observa...
*Transfer Learning [#eba4fe72]
-[[''Transfer Learning for Reinforcement Learning Domains...
Matthew E. Taylor (The University of Southern California)...
JMLR 10:1633-1685 (2009).~
'''Keywords:''' transfer learning, reinforcement learning,...
-[[''Transfer Learning via Inter-Task Mappings for Tempor...
Matthew E. Taylor, Peter Stone, Yaxin Liu~
JMLR 8:2125-2167 (2007).~
'''Keywords:''' transfer learning, reinforcement learning...
*Environmental Change and Dynamic Environments [#l4a70c46]
-[[''Value Function Based Reinforcement Learning in Chang...
Balázs Csanád Csáji, László Monostori~
JMLR 9:1679-1709 (2008).~
'''Keywords:''' Markov decision processes, reinforcement ...
-[[''ε-MDPs: Learning in Varying Environments'':http://jm...
István Szita, Bálint Takács, András Lőrincz~
JMLR 3:145-174 (2002).~
'''Keywords:''' reinforcement learning, convergence, even...
*Multi-Agent [#xbea664c]
-[[''Multi-Agent Reinforcement Learning in Common Interes...
Avraham Bab, Ronen I. Brafman~
JMLR 9:2635-2675 (2008).~
'''Keywords:''' reinforcement learning, multi-agent reinf...
-[[''Collaborative Multiagent Reinforcement Learning by P...
Jelle R. Kok, Nikos Vlassis~
JMLR 7:1789-1828 (2006).~
'''Keywords:''' collaborative multiagent system, coordina...
*Hierarchical Reinforcement Learning [#p19d8166]
-[[''Hierarchical Average Reward Reinforcement Learning''...
Mohammad Ghavamzadeh, Sridhar Mahadevan~
JMLR 8:2629-2669 (2007).~
'''Keywords:''' semi-Markov decision processes, hierarchi...
*Batch Learning [#ad2f5f99]
-[[''Tree-Based Batch Mode Reinforcement Learning'':http:...
Damien Ernst, Pierre Geurts, Louis Wehenkel~
JMLR 6:503-556 (2005).~
'''Keywords:''' batch mode reinforcement learning, regres...
*Multi-Objective Reinforcement Learning [#bc0870fa]
-[[''A Geometric Approach to Multi-Criterion Reinforcemen...
Shie Mannor, Nahum Shimkin~
JMLR 5:325-360 (2004).
*Exploration-Exploitation Dilemma [#k5f6e822]
-[[''Using Confidence Bounds for Exploitation-Exploration...
Peter Auer~
JMLR 3:397-422 (2002).~
'''Keywords:''' Online Learning, Exploitation-Exploration...
*Learning Analysis [#w8d839f2]
-[[''Reinforcement Learning in Finite MDPs: PAC Analysis'...
Alexander L. Strehl, Lihong Li, Michael L. Littman~
JMLR 10:2413-2444 (2009).
-[[''Provably Efficient Learning with Typed Parametric Mo...
Emma Brunskill, Bethany R. Leffler, Lihong Li, Michael L....
JMLR 10:1955-1988 (2009).~
'''Keywords:''' reinforcement learning, provably efficien...
-[[''Action Elimination and Stopping Conditions for the M...
Eyal Even-Dar, Shie Mannor, Yishay Mansour~
JMLR 7:1079-1105 (2006).
-[[''Lyapunov Design for Safe Reinforcement Learning'':ht...
Theodore J. Perkins, Andrew G. Barto~
JMLR 3:803-832 (2002).~
'''Keywords:''' Reinforcement Learning, Lyapunov Function...
*TD Learning [#e6e25c4d]
-[[''Evolutionary Function Approximation for Reinforcemen...
Shimon Whiteson, Peter Stone~
JMLR 7:877-917 (2006).~
'''Keywords:''' reinforcement learning, temporal differen...
-[[''Reinforcement Learning with Factored States and Acti...
Brian Sallans, Geoffrey E. Hinton~
JMLR 5:1063-1088 (2004).~
'''Keywords:''' product of experts, Boltzmann machine, re...
-[[''Least-Squares Policy Iteration'':http://jmlr.csail.m...
Michail G. Lagoudakis, Ronald Parr~
JMLR 4:1107-1149 (2003).~
'''Keywords:''' Reinforcement Learning, Markov Decision P...
*Actor-Critic [#mf88a299]
-[[''A Convergent Online Single Time Scale Actor Critic A...
Dotan Di Castro, Ron Meir~
JMLR 11:367-410 (2010).
-[[''Variance Reduction Techniques for Gradient Estimates...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxter~
JMLR 5:1471-1530 (2004).~
'''Keywords:''' reinforcement learning, policy gradient, base...
*Model-Based [#j7d124d1]
-[[''R-MAX - A General Polynomial Time Algorithm for Near...
Ronen I. Brafman, Moshe Tennenholtz~
JMLR 3:213-231 (2002).~
'''Keywords:''' Reinforcement Learning, Learning in Games...
*Exploration [#t3da9fdf]
-[[''Policy Gradient in Continuous Time'':http://jmlr.csa...
Rémi Munos~
JMLR 7:771-791 (2006).~
'''Keywords:''' optimal control, reinforcement learning, ...
-[[''Policy Search using Paired Comparisons'':http://jmlr...
Malcolm J. A. Strens, Andrew W. Moore~
JMLR 3:921-950 (2002).~
'''Keywords:''' Reinforcement Learning, Policy Search, Ex...
*Tools [#z719584b]
-[[''RL-Glue: Language-Independent Software for Reinforce...
Brian Tanner, Adam White (University of Alberta)~
JMLR 10:2133-2136 (2009).~
'''Keywords:''' reinforcement learning, empirical evaluat...