International Conference on Machine Learning (ICML) 2009

*Overview [#m9fefdc1]
-International Conference on Machine Learning (ICML)
-June 14-18, 2009
-Montreal
-http://www.cs.mcgill.ca/~icml2009/

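ICML is the international conference most closely related to reinforcement learning.

This year there are three sessions on reinforcement learning.
(Last year there were six.)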

*1C - Exploration in Reinforcement Learning [#f8839b2b]

-''[[The Adaptive k-Meteorologists Problem and Its Application to Structure Learning and Feature Selection in Reinforcement Learning.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#302]]''~
Carlos Diuk, Lihong Li and Bethany Leffler.
-''[[Near-Bayesian Exploration in Polynomial Time.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#332]]''~
J. Zico Kolter and Andrew Ng.
-''[[Optimistic Initialization and Greediness Lead to Polynomial Time Learning in Factored MDPs.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#290]]''~
Istvan Szita and Andras Lorincz.
-''[[Dynamic Analysis of Multiagent Q-learning with ε-greedy Exploration.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#61]]''~
Eduardo Rodrigues Gomes and Ryszard Kowalczyk.
-''[[Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#229]]''~
Verena Heidrich-Meisner and Christian Igel.

*3C - Reinforcement Learning with Temporal Differences [#s9e3982e]

-''[[Proto-Predictive Representation of States with Simple Recurrent Temporal-Difference Networks.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#211]]''~
Takaki Makino.
-''[[Regularization and Feature Selection in Least Squares Temporal-Difference Learning.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#439]]''~
J. Zico Kolter and Andrew Ng.
-''[[Fast gradient-descent methods for temporal-difference learning with linear function approximation.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#546]]''~
Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvari, Eric Wiewiora.
-''[[Kernelized Value Function Approximation for Reinforcement Learning.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#467]]''~
Gavin Taylor and Ronald Parr.
-''[[Constraint Relaxation in Approximate Linear Programs.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#340]]''~
Marek Petrik and Shlomo Zilberstein.

*5C - Reinforcement Learning in High Order Environments [#db5191a1]

-''[[Binary Action Search for Learning Continuous-Action Control Policies.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#532]]''~
Jason Pazis and Michail Lagoudakis.
-''[[Predictive Representations for Policy Gradient in POMDPs.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#446]]''~
Abdeslam Boularias and Brahim Chaib-draa.
-''[[Stochastic Search using the Natural Gradient.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#556]]''~
Sun Yi, Daan Wierstra, Tom Schaul, and Juergen Schmidhuber.
-''[[Approximate Inference for Planning in Stochastic Relational Worlds.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#90]]''~
Tobias Lang and Marc Toussaint.
-''[[Discovering Options from Example Trajectories.:http://www.cs.mcgill.ca/~icml2009/abstracts.html#376]]''~
Peng Zang, Peng Zhou, David Minnen and Charles Isbell.