ONR – Interactive Machine Learning

PacMan

We are interested in machines that can learn new things from people who are not Machine Learning (ML) experts. We propose a research agenda framed around the human factors (HF) and ML research questions of teaching an agent via demonstration and critique. Ultimately, we will develop a training simulation game with several non-player characters, all of which can easily be taught new behaviors by an end-user.

With respect to the Science of Autonomy, this proposal is focused on Interactive Intelligence. We seek to understand how an automated system can partner with a human to learn how to act and reason about a new domain. Interactive learning machines that adapt to the needs of a user have long been a goal of AI research. ML promises a way to build adaptive systems while avoiding tedious pre-programming (which in sufficiently complex domains is all but impossible); however, we have yet to see many successful applications in which machines learn from everyday users. ML techniques are not designed for input from naïve users, remaining by and large a tool built by experts for experts.
Many prior efforts to design Machine Learning systems with human input pose the problem as: “What can I get the human to do to help my machine learn better?” Because our goal is for systems to learn from everyday people, we reframe the problem as: “How can machines take better advantage of the input that an everyday person is able to provide?” This approach, Interactive Machine Learning (IML), brings Human Factors to bear on the problem of Machine Learning. IML has two major complementary research goals: (1) to develop interaction protocols that allow people to teach an ML agent in a way they find natural and intuitive; and (2) to design ML algorithms that take better advantage of a human teacher’s guidance; that is, to understand formally how to optimize the information source that is the human, even when that human has an imperfect model of the learning algorithm or a suboptimal policy of their own. Our research agenda addresses both of these IML research questions in two complementary types of learning interactions:

  • Learning from Demonstration (LfD)—A human teacher provides demonstrations of the desired behavior in a given task domain, from which the agent infers a policy of action (see the first sketch following this list).
  • Learning from Critique—A human teacher watches the agent perform and critiques its behavior with high-level feedback (see the second sketch following this list).
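
To make the LfD interaction concrete, the sketch below casts demonstration learning as supervised policy learning: each demonstrated step becomes a (state, action) training pair, and the agent generalizes a policy from those pairs. This is a minimal illustration under our own assumptions, not the project's algorithm; the class name, the feature encoding, and the scikit-learn decision tree standing in for the policy class are all hypothetical choices.

from sklearn.tree import DecisionTreeClassifier

class DemonstrationLearner:
    """Infers a policy of action from teacher-provided demonstrations."""

    def __init__(self):
        self.states = []   # feature vectors observed during demonstrations
        self.actions = []  # the action the teacher took in each state
        # Hypothetical policy class; any supervised classifier would do.
        self.policy = DecisionTreeClassifier()

    def record(self, state_features, teacher_action):
        # Log one step of the teacher's demonstration.
        self.states.append(state_features)
        self.actions.append(teacher_action)

    def fit(self):
        # Generalize from the logged demonstrations to unseen states.
        self.policy.fit(self.states, self.actions)

    def act(self, state_features):
        # Execute the inferred policy in a new state.
        return self.policy.predict([state_features])[0]

In this framing, an end-user simply plays the game while record() logs each step; after fit(), the non-player character acts on its own, with no ML expertise required of the teacher.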
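
Learning from Critique can be sketched in the spirit of policy shaping: each positive or negative critique updates a per-(state, action) count, and the agent selects actions in proportion to the estimated probability that the teacher considers them good, given an assumed probability c that any single critique is consistent with the teacher's intent. Again, this is a hedged sketch rather than the proposal's method; the names and the fixed consistency constant are illustrative assumptions.

import random
from collections import defaultdict

class CritiqueLearner:
    """Biases action selection using accumulated human critique."""

    def __init__(self, actions, consistency=0.9):
        self.actions = actions
        # Net count of positive minus negative critiques per (state, action).
        self.delta = defaultdict(int)
        # Assumed probability that any one critique matches the
        # teacher's true evaluation of the action.
        self.c = consistency

    def critique(self, state, action, positive):
        # High-level feedback: +1 for praise, -1 for disapproval.
        self.delta[(state, action)] += 1 if positive else -1

    def act(self, state):
        # Weight each action by the probability it is good given the
        # feedback so far: c**d / (c**d + (1 - c)**d), where d is the
        # net critique count; d = 0 leaves an action at weight 0.5.
        weights = []
        for a in self.actions:
            d = self.delta[(state, a)]
            weights.append(self.c ** d / (self.c ** d + (1 - self.c) ** d))
        return random.choices(self.actions, weights=weights)[0]

Under this scheme, actions the teacher has praised are chosen more often, disapproved ones less often, and the agent keeps exploring where feedback is sparse, which suits the intermittent critique an everyday teacher actually provides.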

This project is a joint effort between Dr. Andrea Thomaz (PI), Dr. Charles Isbell, and Dr. Mark Riedl.