*Overview [#m9fefdc1]
-International Conference on Machine Learning (ICML)
-June 14-18, 2009
-Montreal
-http://www.cs.mcgill.ca/~icml2009/
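This is the international conference most closely related to reinforcement learning.
This year there are three sessions on reinforcement learning.
(Last year there were as many as six.)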
*1C - Exploration in Reinforcement Learning [#f8839b2b]
-''[[The Adaptive k-Meteorologists Problem and Its Application to Structure Learning and Feature Selection in Reinforcement Learning.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#302]]''~
Carlos Diuk, Lihong Li and Bethany Leffler.
-''[[Near-Bayesian Exploration in Polynomial Time.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#332]]''~
J. Zico Kolter and Andrew Ng.
-''[[Optimistic Initialization and Greediness Lead to Polynomial Time Learning in Factored MDPs.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#290]]''~
Istvan Szita and Andras Lorincz.
-''[[Dynamic Analysis of Multiagent Q-learning with ε-greedy Exploration.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#61]]''~
Eduardo Rodrigues Gomes and Ryszard Kowalczyk.
-''[[Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#229]]''~
Verena Heidrich-Meisner and Christian Igel.
*3C - Reinforcement Learning with Temporal Differences [#s9e3982e]
-''[[Proto-Predictive Representation of States with Simple Recurrent Temporal-Difference Networks.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#211]]''~
Takaki Makino.
-''[[Regularization and Feature Selection in Least Squares Temporal-Difference Learning.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#439]]''~
J. Zico Kolter and Andrew Ng.
-''[[Fast gradient-descent methods for temporal-difference learning with linear function approximation.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#546]]''~
Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvari, Eric Wiewiora.
-''[[Kernelized Value Function Approximation for Reinforcement Learning.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#467]]''~
Gavin Taylor and Ronald Parr.
-''[[Constraint Relaxation in Approximate Linear Programs.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#340]]''~
Marek Petrik and Shlomo Zilberstein.
*5C - Reinforcement Learning in High Order Environments [#db5191a1]
-''[[Binary Action Search for Learning Continuous-Action Control Policies.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#532]]''~
Jason Pazis and Michail Lagoudakis.
-''[[Predictive Representations for Policy Gradient in POMDPs.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#446]]''~
Abdeslam Boularias and Brahim Chaib-draa.
-''[[Stochastic Search using the Natural Gradient.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#556]]''~
Sun Yi, Daan Wierstra, Tom Schaul, and Juergen Schmidhuber.
-''[[Approximate Inference for Planning in Stochastic Relational Worlds.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#90]]''~
Tobias Lang and Marc Toussaint.
-''[[Discovering Options from Example Trajectories.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#376]]''~
Peng Zang, Peng Zhou, David Minnen and Charles Isbell.