International Conference on Machine Learning ICML 2010

*Overview [#tb4acdf8]

-The 27th International Conference on Machine Learning (ICML 2010)
-June 21-24, 2010
-Israel
-http://www.icml2010.org/


*Reinforcement Learning 1 [#y26c66c6]

-[[''Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems'':http://www.icml2010.org/abstracts.html#336]]~
Christophe Thiery (Loria); Bruno Scherrer (Loria)
-[[''Finite-Sample Analysis of LSTD'':http://www.icml2010.org/abstracts.html#598]]~
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria); Remi Munos (Inria)
-[[''Convergence of Least Squares Temporal Difference Methods Under General Conditions'':http://www.icml2010.org/abstracts.html#187]]~
Huizhen Yu (Univ. of Helsinki)
-[[''Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view'':http://www.icml2010.org/abstracts.html#654]]~
Bruno Scherrer (Loria)


*Reinforcement Learning 2 [#zd62c780]

-[[''Approximate Predictive Representations of Partially Observable Systems'':http://www.icml2010.org/abstracts.html#588]]~
Doina Precup (McGill University); Monica Dinculescu (McGill University)
-[[''Constructing States for Reinforcement Learning'':http://www.icml2010.org/abstracts.html#593]]~
M. M. Mahmud (Australian National University)
-[[''Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda'':http://www.icml2010.org/abstracts.html#295]]~
Carlton Downey (Victoria University of Wellington); Scott Sanner (NICTA)
-[[''Bayesian Multi-Task Reinforcement Learning'':http://www.icml2010.org/abstracts.html#269]]~
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria)


*Reinforcement Learning 3 [#fe04a101]

-[[''Generalizing Apprenticeship Learning across Hypothesis Classes'':http://www.icml2010.org/abstracts.html#475]]~
Thomas Walsh (Rutgers University); Kaushik Subramanian (Rutgers University); Michael Littman (Rutgers University); Carlos Diuk (Princeton University)
-[[''Toward Off-Policy Learning Control with Function Approximation'':http://www.icml2010.org/abstracts.html#627]]~
Hamid Maei (University of Alberta); Csaba Szepesvari (University of Alberta); Shalabh Bhatnagar (Indian Institute of Science); Richard Sutton (University of Alberta)
-[[''Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis'':http://www.icml2010.org/abstracts.html#464]]~
Daniel Lizotte (University of Michigan); Michael Bowling (University of Alberta); Susan Murphy (University of Michigan)
-[[''Internal Rewards Mitigate Agent Boundedness'':http://www.icml2010.org/abstracts.html#442]]~
Jonathan Sorg (University of Michigan); Satinder Singh (University of Michigan); Richard Lewis (University of Michigan)


*Reinforcement Learning 4 [#b301e15c]

-[[''Analysis of a Classification-based Policy Iteration Algorithm'':http://www.icml2010.org/abstracts.html#303]]~
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria); Remi Munos (Inria)
-[[''Nonparametric Return Distribution Approximation for Reinforcement Learning'':http://www.icml2010.org/abstracts.html#652]]~
Tetsuro Morimura (IBM Research - Tokyo); Masashi Sugiyama (Tokyo Institute of Technology); Hisashi Kashima (University of Tokyo); Hirotaka Hachiya; Toshiyuki Tanaka
-[[''Inverse Optimal Control with Linearly Solvable MDPs'':http://www.icml2010.org/abstracts.html#571]]~
Krishnamurthy Dvijotham (University of Washington); Emanuel Todorov (University of Washington)
-[[''Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes'':http://www.icml2010.org/abstracts.html#52]]~
Marek Petrik (University of Massachusetts); Gavin Taylor (Duke); Ron Parr (Duke); Shlomo Zilberstein (University of Massachusetts Amherst)

