"The optimism principle in various sequential decision making settings"
Automatic and sequential decision making is a fundamental objective in modern artificial intelligence and machine learning. The simplest model for this problem is the bandit setting, where at each time t the learner chooses one of K sources of information and receives data only about that source. The learner makes this choice with an objective in mind, for instance maximizing the sum of the collected data, finding the source with the best average outcome, performing inference about the sources, or finding abnormal sources. Although these problems differ widely in their objectives, an underlying principle, optimism in the face of uncertainty, can serve as a guiding framework for addressing all of them. In this talk, I will present this general principle and explain how it can be applied in several situations.
Jul 11, 2016 | 04:00 PM - 06:00 PM