Hybrid Reinforcement Learning & Transfer Learning: MEAC

While exploring potential solutions for the DARPA TIAMAT program (which we ultimately did not secure), I inadvertently developed a new reinforcement learning technique called Multi-Expert Actor-Critic (MEAC). The approach blends actor-critic reinforcement learning with transfer learning methods. Preliminary theoretical studies and a Gridworld implementation, conducted by our partners at the University of Washington, showed that MEAC reduced training time by roughly an order of magnitude compared to standard actor-critic algorithms. One of my students is currently validating these results in a real-world application by training a bipedal robot to walk.

Beyond training speed-ups, MEAC's compositional properties allow it to draw on multiple expert models simultaneously. For example, to train a robot to navigate a slippery slope, we can combine an expert skilled at walking on inclines with another trained on slippery flat surfaces, thereby accelerating learning for the combined scenario; a sketch of this idea follows.
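Since MEAC's actual update rule is not spelled out here, the following is only a minimal sketch of one plausible way to combine frozen expert policies with an actor-critic learner: the student is trained with a standard advantage actor-critic loss plus a distillation term that pulls its policy toward a weighted mixture of the experts' action distributions. The network shapes, the fixed expert weights `expert_w`, the `distill_coef` coefficient, and the KL-based mixture are all illustrative assumptions, not the published method.

```python
# Hedged sketch of a multi-expert actor-critic update (NOT the MEAC algorithm
# itself; the expert-weighting scheme and loss form below are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, N_ACTIONS = 8, 4

def mlp(out_dim):
    return nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, out_dim))

class ActorCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.actor = mlp(N_ACTIONS)   # logits over discrete actions
        self.critic = mlp(1)          # state-value estimate

    def forward(self, obs):
        return self.actor(obs), self.critic(obs).squeeze(-1)

# Two frozen "experts" standing in for pre-trained sub-task policies
# (e.g., incline walking and slippery flat walking); random here.
experts = [mlp(N_ACTIONS).eval() for _ in range(2)]
for e in experts:
    for p in e.parameters():
        p.requires_grad_(False)

student = ActorCritic()
opt = torch.optim.Adam(student.parameters(), lr=3e-4)

# Fixed mixture weights over experts; a learned, state-conditioned
# gating network would be another option.
expert_w = torch.tensor([0.5, 0.5])

def update(obs, actions, returns, distill_coef=0.1):
    logits, values = student(obs)
    logp = F.log_softmax(logits, dim=-1)
    adv = returns - values.detach()

    # Standard advantage actor-critic terms.
    pg_loss = -(logp.gather(1, actions.unsqueeze(1)).squeeze(1) * adv).mean()
    v_loss = F.mse_loss(values, returns)

    # Distillation toward the weighted mixture of expert action
    # distributions (one assumed way of "combining" experts).
    with torch.no_grad():
        mix = sum(w * F.softmax(e(obs), dim=-1) for w, e in zip(expert_w, experts))
    distill = F.kl_div(logp, mix, reduction="batchmean")

    loss = pg_loss + 0.5 * v_loss + distill_coef * distill
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Smoke test on random transitions in place of real environment rollouts.
obs = torch.randn(32, OBS_DIM)
actions = torch.randint(N_ACTIONS, (32,))
returns = torch.randn(32)
print(update(obs, actions, returns))
```

In a real setting the experts would be the previously trained sub-task policies, and annealing `distill_coef` toward zero would let the student eventually outgrow its teachers on the combined task.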