Overview
- The 27th International Conference on Machine Learning (ICML 2010)
- June 21-24, 2010
- Israel
- http://www.icml2010.org/
Reinforcement Learning 1
- Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems
Christophe Thiery (Loria); Bruno Scherrer (Loria)
- Finite-Sample Analysis of LSTD
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria); Remi Munos (Inria)
- Convergence of Least Squares Temporal Difference Methods Under General Conditions
Huizhen Yu (Univ. of Helsinki)
- Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view
Bruno Scherrer (Loria)
Reinforcement Learning 2
- Approximate Predictive Representations of Partially Observable Systems
Doina Precup (McGill University); Monica Dinculescu (McGill University)
- Constructing States for Reinforcement Learning
M. M. Mahmud (Australian National University)
- Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Carlton Downey (Victoria University of Wellington); Scott Sanner (NICTA)
- Bayesian Multi-Task Reinforcement Learning
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria)
Reinforcement Learning 3
- Generalizing Apprenticeship Learning across Hypothesis Classes
Thomas Walsh (Rutgers University); Kaushik Subramanian (Rutgers University); Michael Littman (Rutgers University); Carlos Diuk (Princeton University)
- Toward Off-Policy Learning Control with Function Approximation
Hamid Maei (University of Alberta); Csaba Szepesvari (University of Alberta); Shalabh Bhatnagar (Indian Institute of Science); Richard Sutton (University of Alberta)
- Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis
Daniel Lizotte (University of Michigan); Michael Bowling (University of Alberta); Susan Murphy (University of Michigan)
- Internal Rewards Mitigate Agent Boundedness
Jonathan Sorg (University of Michigan); Satinder Singh (University of Michigan); Richard Lewis (University of Michigan)
Reinforcement Learning 4
- Analysis of a Classification-based Policy Iteration Algorithm
Alessandro Lazaric (Inria); Mohammad Ghavamzadeh (Inria); Remi Munos (Inria)
- Nonparametric Return Distribution Approximation for Reinforcement Learning
Tetsuro Morimura (IBM Research - Tokyo); Masashi Sugiyama (Tokyo Institute of Technology); Hisashi Kashima (University of Tokyo); Hirotaka Hachiya; Toshiyuki Tanaka
- Inverse Optimal Control with Linearly Solvable MDPs
Krishnamurthy Dvijotham (University of Washington); Emanuel Todorov (University of Washington)
- Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Marek Petrik (University of Massachusetts); Gavin Taylor (Duke); Ron Parr (Duke); Shlomo Zilberstein (University of Massachusetts Amherst)