*Overview [#tb4acdf8]
-The 27th International Conference on Machine Learning
-June 21-24, 2010
-Israel
-http://www.icml2010.org/
*Reinforcement Learning 1 [#y26c66c6]
-[[''Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems'':http://www.icml2010.org/abstracts.html#336]]~
Christophe Thiery (Loria); Bruno Scherrer (Loria)
-[[''Finite-Sample Analysis of LSTD'':http://www.icml2010.org/abstracts.html#598]]~
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria); Remi Munos (Inria)
-[[''Convergence of Least Squares Temporal Difference Methods Under General Conditions'':http://www.icml2010.org/abstracts.html#187]]~
Huizhen Yu (Univ. of Helsinki)
-[[''Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view'':http://www.icml2010.org/abstracts.html#654]]~
Bruno Scherrer (Loria)
*Reinforcement Learning 2 [#zd62c780]
-[[''Approximate Predictive Representations of Partially Observable Systems'':http://www.icml2010.org/abstracts.html#588]]~
Doina Precup (McGill University); Monica Dinculescu (McGill University)
-[[''Constructing States for Reinforcement Learning'':http://www.icml2010.org/abstracts.html#593]]~
M. M. Mahmud (Australian National University)
-[[''Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda'':http://www.icml2010.org/abstracts.html#295]]~
Carlton Downey (Victoria University of Wellington); Scott Sanner (NICTA)
-[[''Bayesian Multi-Task Reinforcement Learning'':http://www.icml2010.org/abstracts.html#269]]~
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria)
*Reinforcement Learning 3 [#fe04a101]
-[[''Generalizing Apprenticeship Learning across Hypothesis Classes'':http://www.icml2010.org/abstracts.html#475]]~
Thomas Walsh (Rutgers University); Kaushik Subramanian (Rutgers University); Michael Littman (Rutgers University); Carlos Diuk (Princeton University)
-[[''Toward Off-Policy Learning Control with Function Approximation'':http://www.icml2010.org/abstracts.html#627]]~
Hamid Maei (University of Alberta); Csaba Szepesvari (University of Alberta); Shalabh Bhatnagar (Indian Institute of Science); Richard Sutton (University of Alberta)
-[[''Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis'':http://www.icml2010.org/abstracts.html#464]]~
Daniel Lizotte (University of Michigan); Michael Bowling (University of Alberta); Susan Murphy (University of Michigan)
-[[''Internal Rewards Mitigate Agent Boundedness'':http://www.icml2010.org/abstracts.html#442]]~
Jonathan Sorg (University of Michigan); Satinder Singh (University of Michigan); Richard Lewis (University of Michigan)
*Reinforcement Learning 4 [#b301e15c]
-[[''Analysis of a Classification-based Policy Iteration Algorithm'':http://www.icml2010.org/abstracts.html#303]]~
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria); Remi Munos (Inria)
-[[''Nonparametric Return Distribution Approximation for Reinforcement Learning'':http://www.icml2010.org/abstracts.html#652]]~
Tetsuro Morimura (IBM Research - Tokyo); Masashi Sugiyama (Tokyo Institute of Technology); Hisashi Kashima (University of Tokyo); Hirotaka Hachiya; Toshiyuki Tanaka
-[[''Inverse Optimal Control with Linearly Solvable MDPs'':http://www.icml2010.org/abstracts.html#571]]~
Krishnamurthy Dvijotham (University of Washington); Emanuel Todorov (University of Washington)
-[[''Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes'':http://www.icml2010.org/abstracts.html#52]]~
Marek Petrik (University of Massachusetts); Gavin Taylor (Duke); Ron Parr (Duke); Shlomo Zilberstein (University of Massachusetts Amherst)