Inverse reinforcement learning (IRL) is a powerful framework for inferring an agent's reward function from observations of its behavior, but IRL algorithms that infer point estimates can be misleading. A Bayesian approach to IRL instead models a distribution over the reward functions that could explain a set of observations, alleviating the shortcomings of learning a point estimate. Unfortunately, existing Bayesian approaches to IRL use a $Q$-value function, estimated using $Q$-learning, in place of the likelihood function. The resulting posterior is computationally intensive to calculate and has few theoretical guarantees. We introduce kernel density Bayesian IRL (KD-BIRL), a method that uses conditional kernel density estimation to directly approximate the likelihood used in Bayesian inference. The resulting posterior distribution contracts to the optimal reward function as the sample size increases, leading to a flexible and efficient framework that extends to environments with complex state spaces. We demonstrate KD-BIRL’s computational benefits and its ability to represent uncertainty in the recovered reward function through a series of experiments in a Gridworld environment and on a healthcare task.
A preliminary version appeared in the Symposium on Advances in Approximate Bayesian Inference, 2022.
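
As a rough illustration of the conditional kernel density estimation idea mentioned in the abstract, the sketch below estimates the density of an observation $x$ (for example, a state-action feature vector) given a reward parameter $r$, using training pairs $(x_j, r_j)$. It is a minimal sketch under stated assumptions, not the KD-BIRL implementation: the Gaussian kernels, the Euclidean distance, the bandwidths `h_x` and `h_r`, and all function names are hypothetical choices made for illustration and are not taken from this abstract.

```python
import math


def gaussian_kernel(dist, bandwidth):
    """Unnormalized Gaussian kernel evaluated at a non-negative distance."""
    return math.exp(-(dist ** 2) / (2.0 * bandwidth ** 2))


def euclidean(u, v):
    """Euclidean distance between two equal-length sequences of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def conditional_kde(query_x, query_r, train_x, train_r, h_x=1.0, h_r=1.0):
    """Estimate p(x | r), up to the x-kernel's normalizing constant.

    Each training pair (x_j, r_j) is weighted by how close r_j is to the
    query reward parameter; the weighted average of the x-kernels is the
    conditional density estimate. Kernels and bandwidths are assumptions.
    """
    # Weight of each training sample, based on closeness in reward space.
    weights = [gaussian_kernel(euclidean(query_r, r), h_r) for r in train_r]
    total = sum(weights)
    if total == 0.0:
        # No training reward is close enough to carry any weight.
        return 0.0
    # Weighted sum of kernels in observation space.
    numerator = sum(
        w * gaussian_kernel(euclidean(query_x, x), h_x)
        for w, x in zip(weights, train_x)
    )
    return numerator / total


# Toy data: two observations, each generated under a different reward parameter.
train_x = [(0.0, 0.0), (1.0, 1.0)]
train_r = [(0.0,), (1.0,)]

matched = conditional_kde((0.0, 0.0), (0.0,), train_x, train_r, h_x=0.5, h_r=0.5)
mismatched = conditional_kde((0.0, 0.0), (1.0,), train_x, train_r, h_x=0.5, h_r=0.5)
```

In this toy setting, `matched` exceeds `mismatched`, since the observation `(0.0, 0.0)` was paired with the reward parameter `(0.0,)` in the training data. An estimate of this kind could stand in for the likelihood term inside a standard posterior sampler, which is the role the abstract describes for the kernel density estimate.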