Reinforcement Learning / International Conference on Machine Learning (ICML)
Presented at the International Conference on Machine Learning...
(Entries are added progressively, starting with the most recent; this is not a complete list...
*Finance [#b2eaf14d]
-[[''Reinforcement learning for optimized trade execution...
Yuriy Nevmyvaka, Yi Feng, Michael Kearns~
ICML 2006, pp. 673-680.
*Games [#za08746b]
-[[''Sample-based learning and search with permanent and ...
David Silver, Richard Sutton, and Martin Mueller~
ICML 2008, pp. 968-975.
-[[''Learning algorithms for online principal-agent probl...
Vincent Conitzer, Nikesh Garera~
ICML 2006, pp. 209-216.
-[[''Learning to compete, compromise, and cooperate in re...
Jacob W. Crandall, Michael A. Goodrich~
ICML 2005, pp. 161-168.
*Robotics [#t5c9846c]
-[[''Learning complex motions by sequencing simpler motio...
Gerhard Neumann, Wolfgang Maass and Jan Peters~
ICML 2009, pp. 753-760.
-[[''Reinforcement learning by reward-weighted regression...
Jan Peters, Stefan Schaal~
ICML 2007, pp. 745-750.
*Multi-agent [#qbc49045]
-[[''Dynamic analysis of multiagent Q-learning with ε-gre...
Eduardo Rodrigues Gomes and Ryszard Kowalczyk~
ICML 2009, pp. 369-376.
-[[''Privacy-preserving reinforcement learning'':http://d...
Jun Sakuma, Shigenobu Kobayashi, and Rebecca Wright~
ICML 2008, pp. 864-871.
-[[''Conditional random fields for multi-agent reinforcem...
Xinhua Zhang, Douglas Aberdeen, S. V. N. Vishwanathan~
ICML 2007, pp. 1143-1150.
*Hierarchical reinforcement learning [#n3d65970]
-[[''Hierarchical model-based reinforcement learning: R-m...
Nicholas Jong and Peter Stone~
ICML 2008, pp. 432-439.
*Multi-objective reinforcement learning [#dae8ad5c]
-[[''Learning All Optimal Policies with Multiple Criteria...
Leon Barrett and Srinivas Narayanan~
ICML 2008, pp. 41-47.
-[[''Multi-task reinforcement learning: a hierarchical Ba...
Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepalli~
ICML 2007, pp. 1015-1022.
-[[''Dynamic preferences in multi-criteria reinforcement ...
Sriraam Natarajan, Prasad Tadepalli~
ICML 2005, pp. 601-608.
*Transfer learning [#u5e36ccb]
-[[''Transfer of samples in batch reinforcement learning'...
Alessandro Lazaric, Marcello Restelli, and Andrea Bonarini~
ICML 2008, pp. 544-551.
-[[''Automatic discovery and transfer of MAXQ hierarchies...
Neville Mehta, Soumya Ray, Prasad Tadepalli, and Thomas D...
ICML 2008, pp. 648-655.
-[[''Automatic shaping and decomposition of reward functi...
Bhaskara Marthi~
ICML 2007, pp. 601-608.
-[[''Cross-domain transfer for reinforcement learning'':h...
Matthew E. Taylor, Peter Stone~
ICML 2007, pp. 879-886.
-[[''Autonomous shaping: knowledge transfer in reinforcem...
George Konidaris, Andrew Barto~
ICML 2006, pp. 489-496.
-[[''Identifying useful subgoals in reinforcement learnin...
Özgür Şimşek, Alicia P. Wolfe, Andrew G. Barto~
ICML 2005, pp. 816-823.
*Relational reinforcement learning [#jccd787b]
-[[''Relational temporal difference learning'':http://doi...
Nima Asgharbeygi, David Stracuzzi, Pat Langley~
ICML 2006, pp. 49-56.
-[[''Learning the structure of Factored Markov Decision P...
Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin~
ICML 2006, pp. 257-264.
*Active learning [#dc6ee45a]
-[[''Active reinforcement learning'':http://doi.acm.org/1...
Arkady Epshteyn, Adam Vogel, and Gerald DeJong~
ICML 2008, pp. 296-303.
-[[''Reinforcement learning with limited reinforcement: u...
Finale Doshi, Joelle Pineau, and Nicholas Roy~
ICML 2008, pp. 256-263.
*POMDP [#p3d4e7ac]
-[[''Predictive representations for policy gradient in PO...
Abdeslam Boularias and Brahim Chaib-draa~
ICML 2009, pp. 65-72.
-[[''Region-based value iteration for partially observabl...
Hui Li, Xuejun Liao, Lawrence Carin~
ICML 2006, pp. 561-568.
*PSR [#xb4b8a08]
-[[''Efficiently learning linear-linear exponential famil...
David Wingate and Satinder Singh~
ICML 2008, pp. 1176-1183.
-[[''Learning predictive state representations using non-...
Michael Bowling, Peter McCracken, Michael James, James Ne...
ICML 2006, pp. 129-136.
-[[''Predictive state representations with options'':http...
Britton Wolfe, Satinder Singh~
ICML 2006, pp. 1025-1032.
-[[''Learning predictive state representations in dynamic...
Britton Wolfe, Michael R. James, Satinder Singh~
ICML 2005, pp. 980-987.
-[[''Learning predictive representations from a history''...
Eric Wiewiora~
ICML 2005, pp. 964-971.
*Dynamic environments [#k92e74c9]
-[[''Dealing with non-stationary environments using conte...
Bruno C. da Silva, Eduardo W. Basso, Ana L. C. Bazzan, Pa...
ICML 2006, pp. 217-224.
*Value function approximation [#k0519134]
-[[''Constructing basis functions from directed graphs fo...
Jeff Johns, Sridhar Mahadevan~
ICML 2007, pp. 385-392.
-[[''Learning state-action basis functions for hierarchic...
Sarah Osentoski, Sridhar Mahadevan~
ICML 2007, pp. 705-712.
-[[''Analyzing feature generation for value-function appr...
Ronald Parr, Christopher Painter-Wakefield, Lihong Li, Mi...
ICML 2007, pp. 737-744.
-[[''Tracking value function dynamics to improve reinforc...
Chee Wee Phua, Robert Fitch~
ICML 2007, pp. 751-758.
-[[''Automatic basis function construction for approximat...
Philipp W. Keller, Shie Mannor, Doina Precup~
ICML 2006, pp. 449-456.
*Continuous action spaces [#i80e2ed1]
-[[''Binary action search for learning continuous-action ...
Jason Pazis and Michail Lagoudakis~
ICML 2009, pp. 793-800.
*Large state spaces [#g740c173]
-[[''Bayesian sparse sampling for on-line reward optimiza...
Tao Wang, Daniel Lizotte, Michael Bowling, Dale Schuurmans~
ICML 2005, pp. 956-963.
-[[''Proto-value functions: developmental reinforcement l...
Sridhar Mahadevan~
ICML 2005, pp. 553-560.
*Exploration [#h48019f5]
-[[''The adaptive k-meteorologists problem and its applic...
Carlos Diuk, Lihong Li and Bethany Leffler~
ICML 2009, pp. 249-256.
-[[''Near-Bayesian exploration in polynomial time'':http:...
J. Zico Kolter and Andrew Ng~
ICML 2009, pp. 513-520.
-[[''Optimistic initialization and greediness lead to pol...
Istvan Szita and Andras Lorincz~
ICML 2009, pp. 1001-1008.
-[[''Hoeffding and Bernstein races for selecting policies...
Verena Heidrich-Meisner and Christian Igel~
ICML 2009, pp. 401-408.
-[[''The many faces of optimism: a unifying approach'':ht...
Istvan Szita and Andras Lorincz~
ICML 2008, pp. 1048-1055.
-[[''Reinforcement learning in the presence of rare event...
Jordan Frank, Shie Mannor, and Doina Precup~
ICML 2008, pp. 336-343.
-[[''Percentile optimization in uncertain Markov decision...
Erick Delage, Shie Mannor~
ICML 2007, pp. 225-232.
-[[''An intrinsic reward mechanism for efficient explorat...
Özgür Şimşek, Andrew G. Barto~
ICML 2006, pp. 833-840.
-[[''Qualitative reinforcement learning'':http://doi.acm....
Arkady Epshteyn, Gerald DeJong~
ICML 2006, pp. 305-312.
-[[''Experience-efficient learning in associative bandit ...
Alexander L. Strehl, Chris Mesterharm, Michael L. Littman...
ICML 2006, pp. 889-896.
-[[''Exploration and apprenticeship learning in reinforce...
Pieter Abbeel, Andrew Y. Ng~
ICML 2005, pp. 1-8.
*Policy evaluation [#w8d3ad24]
-[[''A semiparametric statistical approach to model-free ...
Tsuyoshi Ueno, Motoaki Kawanabe, Takeshi Mori, Shin-Ichi ...
ICML 2008, pp. 1072-1079.
-[[''Exploration scavenging'':http://doi.acm.org/10.1145/...
John Langford, Alexander Strehl, and Jennifer Wortman~
ICML 2008, pp. 528-535.
-[[''Fast direct policy evaluation using multiscale analy...
Mauro Maggioni, Sridhar Mahadevan~
ICML 2006, pp. 601-608.
*Analysis of learning [#ie0cad04]
-[[''A worst-case comparison between temporal difference ...
Lihong Li~
ICML 2008, pp. 560-567.
-[[''An analysis of linear models, linear value-function ...
Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter...
ICML 2008, pp. 752-759.
-[[''An analysis of reinforcement learning with function ...
Francisco Melo, Sean Meyn, and Isabel Ribeiro~
ICML 2008, pp. 664-671.
-[[''A theoretical analysis of Model-Based Interval Estim...
Alexander L. Strehl, Michael L. Littman~
ICML 2005, pp. 856-863.
-[[''Relating reinforcement learning performance to class...
John Langford, Bianca Zadrozny~
ICML 2005, pp. 473-480.
*Gradient methods [#kb3fd704]
-[[''Fast gradient-descent methods for temporal-differenc...
Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh...
ICML 2009, pp. 993-1000.
-[[''Non-parametric policy gradients: a unified treatment...
Kristian Kersting and Kurt Driessens~
ICML 2008, pp. 456-463.
*TD learning [#c5efc134]
-[[''Proto-predictive representation of states with simpl...
Takaki Makino~
ICML 2009, pp. 697-704.
-[[''Regularization and feature selection in least-square...
J. Zico Kolter and Andrew Ng~
ICML 2009, pp. 521-528.
-[[''Kernelized value function approximation for reinforc...
Gavin Taylor and Ronald Parr~
ICML 2009, pp. 1017-1024.
-[[''Constraint relaxation in approximate linear programs...
Marek Petrik and Shlomo Zilberstein~
ICML 2009, pp. 809-816.
-[[''Preconditioned temporal difference learning'':http:/...
Hengshuai Yao and Zhi-Qiang Liu~
ICML 2008, pp. 1208-1215.
-[[''PAC model-free reinforcement learning'':http://doi.a...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, John Langf...
ICML 2006, pp. 881-888.
-[[''Reinforcement learning with Gaussian processes'':htt...
Yaakov Engel, Shie Mannor, Ron Meir~
ICML 2005, pp. 201-208.
-[[''TD(λ) networks: temporal-difference networks with el...
Brian Tanner, Richard S. Sutton~
ICML 2005, pp. 888-895.
*Actor-critic [#l44a6bfc]
-[[''Bayesian actor-critic algorithms'':http://doi.acm.or...
Mohammad Ghavamzadeh, Yaakov Engel~
ICML 2007, pp. 297-304.
*Model-based [#xee6c308]
-[[''An analytic solution to discrete Bayesian reinforcem...
Pascal Poupart, Nikos Vlassis, Jesse Hoey, Kevin Regan~
ICML 2006, pp. 697-704.
-[[''Using inaccurate models in reinforcement learning'':...
Pieter Abbeel, Morgan Quigley, Andrew Y. Ng~
ICML 2006, pp. 1-8.
*Other / uncategorized [#ne78a287]
-[[''Discovering options from example trajectories'':http...
Peng Zang, Peng Zhou, David Minnen and Charles Isbell~
ICML 2009, pp. 1217-1224.
-[[''An object-oriented representation for efficient rein...
Carlos Diuk, Andre Cohen, and Michael Littman~
ICML 2008, pp. 240-247.
-[[''Online kernel selection for Bayesian reinforcement l...
Joseph Reisinger, Peter Stone, and Risto Miikkulainen~
ICML 2008, pp. 816-823.