ICRA 2025
IEEE International Conference on Robotics and Automation | Atlanta | May 19–23, 2025

Robots are on the cusp of transforming society, powered by bold research and rapid innovation. From embodied AI to adaptive autonomy, Georgia Tech is building machines that learn, collaborate, and have the potential to reshape industries. The future is not science fiction — it’s being engineered today, and it begins here.
Welcome to Georgia Tech at ICRA 2025.

Smile and say, “ICRA!” Robots showed off their skills at #ICRA2025. So did our researchers presenting at the international event. It’s not just fancy hardware; it’s the future of embodied AI.
Georgia Tech at ICRA 2025
Learn more about robotics at Georgia Tech’s Institute for Robotics and Intelligent Machines (IRIM)
By the Numbers
[Infographic: counts of Georgia Tech faculty with papers, spanning the College of Sciences, the Georgia Tech Research Institute, and the Institute for People & Technology]
Partner Organizations on Papers
Arizona State University • Boston Dynamics AI Institute • California Institute of Technology • Carnegie Mellon University • Columbia University • Cornell University • Emory University • ETRI • Georgia Tech • Google • Harvard University • Hillsdale College • Honda Research Institute USA • Hong Kong University of Science and Technology • Intuitive Surgical • Jet Propulsion Laboratory • Johns Hopkins University • Massachusetts Institute of Technology • Max Planck Institute for Intelligent Systems • Mercedes-Benz Research & Development North America • Mitsubishi Electric Research Labs • National University of Singapore • New York University • Nuro AI • Nvidia Research • Pusan National University • Rochester Institute of Technology • RWTH Aachen University • Sandia National Labs • Shanghai Jiao Tong University • Sogang University • Stanford University • Technical University of Munich • The Hong Kong University of Science and Technology (Guangzhou) • The Ohio State University • The University of Electro-Communications • Tsinghua University • United States Department of Agriculture – Agricultural Research • Université De Montréal • University of Arkansas • University of California, Berkeley • University of California, Los Angeles • University of Cambridge • University of Delaware • University of Freiburg • University of Illinois Chicago • University of Michigan • University of North Carolina at Charlotte • University of Southern California • University of Texas at Austin • University of the Bundeswehr Munich • University of Toronto • University of Wisconsin, Madison • US Naval Research Laboratory • Wuhan University • Zoox
The Big Picture 

Welcome to ICRA 2025: Advancing the Frontiers of Robotics and Automation

Senior Program Committee, ICRA 2025
School of Interactive Computing, Georgia Tech

medusai: Interactive AI-driven Robotic Sculpture

ARTS IN ROBOTICS
The AI-driven robotic sculpture medusai is an invited ICRA installation by Gil Weinberg and his team. It responds to and interacts with humans through sound, light, touch, and movement. Inspired by the Greek myth of Medusa, the sculpture features seven robotic arms in the form of “snake hair” installed on top of an 8×10 ft. metallic “face” structure. Human movement around medusai is captured by a top-mounted camera, and an artificial vision tracking system drives the robotic arms to follow humans, pluck strings, and hit drums around its surface. medusai also responds with light and electronic sound to human activity such as drumming and plucking strings.
May 19, 7:30 PM – 9:30 PM | The Goat Farm ATL
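For readers curious how such an installation might close the loop from camera to motion, here is a minimal illustrative sketch: an overhead tracker reports a visitor’s floor position, and the nearest arm is aimed toward them. All names, coordinates, and parameters are hypothetical; they are not the installation’s actual software.

```python
# Hypothetical sketch of a follow behavior: an overhead camera tracks a
# visitor's floor position, and the nearest "snake" arm is steered to face
# them. All names and parameters here are illustrative assumptions.
import math

ARM_BASES = [(x, 0.0) for x in range(7)]  # seven arms along the structure (m)

def follow(human_xy):
    """Pick the arm closest to the tracked visitor and aim it at them."""
    hx, hy = human_xy
    arm = min(range(len(ARM_BASES)),
              key=lambda i: math.dist(ARM_BASES[i], (hx, hy)))
    bx, by = ARM_BASES[arm]
    yaw = math.atan2(hy - by, hx - bx)  # base joint angle toward the visitor
    return arm, yaw

arm_id, yaw_cmd = follow((2.5, 1.8))
print(f"arm {arm_id} -> yaw {yaw_cmd:.2f} rad")
```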

Advancing Robotics for Complex Aquatic Navigation
ROBOPHYSICAL MODELS
AquaMILR and AquaMILR+ are new untethered limbless robots designed for agile navigation in complex aquatic environments. The robots use a bilateral actuation mechanism, inspired by musculoskeletal actuation in anguilliform swimming organisms, allowing undulatory swimming from head to tail. This actuation is enhanced by mechanical intelligence, improving maneuverability around obstacles.

AquaMILR+ also features a depth control system inspired by the swim bladders of eels and sea snakes, offering capabilities that most anguilliform robots lack. Additional features such as fins and a tail improve stability and propulsion efficiency. Tests in open water and indoor aquatic environments highlight the robots’ capabilities, positioning them for search-and-rescue and deep-sea exploration tasks.
TEAM: Tianyu Wang, Matthew Fernandez, Nishanth Mankame, Galen Tunnicliffe, Velin Kojouharov, Donoven Dortilus, Peter Gunnarson, John Dabiri, Daniel Goldman
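As an illustration of the head-to-tail undulation described above, the sketch below generates a serpenoid-style traveling wave of joint angles, a common model for anguilliform swimming. The joint count, amplitude, and frequency are assumed values, not the robots’ specifications, and the bilateral cable actuation itself is not modeled.

```python
# A minimal sketch of head-to-tail undulation, assuming the common serpenoid
# form theta_i(t) = A * sin(2*pi*(f*t - i/N)). Parameters are assumptions.
import math

N = 10      # number of body joints (assumed)
A = 0.6     # joint amplitude in radians (assumed)
FREQ = 1.0  # undulation frequency in Hz (assumed)

def joint_angles(t):
    """Traveling wave from head (i = 0) to tail (i = N - 1)."""
    return [A * math.sin(2 * math.pi * (FREQ * t - i / N)) for i in range(N)]

for t in (0.0, 0.25, 0.5):
    print([f"{q:+.2f}" for q in joint_angles(t)])
```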


Next-Gen Robotic Pollination for Indoor Farming
AGRICULTURAL AUTOMATION
Effective pollination is a key challenge for indoor farming, since bees struggle to navigate without the sun. While a variety of robotic system solutions have been proposed, it remains difficult to autonomously check that a flower has been sufficiently pollinated to produce high-quality fruit, which is especially critical for self-pollinating crops such as strawberries.
New agricultural robotics research combines a novel robotic system with an algorithmic approach that can handle delicate flora for indoor farming. The proposed hardware combines a 7-degree-of-freedom (DOF) manipulator arm with a custom end-effector comprising an endoscope camera, a 2-DOF microscope subsystem, and a custom vibrating pollination tool; this is paired with algorithms to detect and estimate the pose of strawberry flowers, navigate to each flower, pollinate using the tool, and inspect with the microscope. The key novelty is vibrating the flower from below while simultaneously inspecting it with a microscope from above. Each subsystem is validated via extensive experiments.
TEAM: Chuizheng Kong, Alex Qiu, Idris Wibowo, Marvin Ren, Aishik Dhori, Kai-Shu Ling, Ai-Ping Hu, Shreyas Kousik
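A rough sketch of the detect, navigate, pollinate, and inspect cycle described above appears below. The classes and functions are illustrative stand-ins, not the team’s software stack.

```python
# A hedged sketch of the detect -> navigate -> pollinate -> inspect cycle.
# All classes here are minimal stand-ins, not the authors' software.
from dataclasses import dataclass

@dataclass
class Flower:
    pose: tuple          # estimated flower pose (stubbed as a tuple)
    pollinated: bool = False

def vibrate_and_inspect(f):
    # stand-in for tool vibration from below plus microscope inspection
    # from above; the real system checks pollination quality here
    return True

def pollination_cycle(flowers):
    """One pass over detected flowers: vibrate from below, inspect from above."""
    for f in flowers:
        # navigating the 7-DOF arm under the flower is stubbed out here
        f.pollinated = vibrate_and_inspect(f)
        if not f.pollinated:             # retry once before moving on
            f.pollinated = vibrate_and_inspect(f)

pollination_cycle([Flower(pose=(0.4, 0.1, 0.3))])
```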


Improving Multi-Legged Locomotion on Complex Terrains
LEGGED LOCOMOTION
Characterized by their elongate bodies and relatively simple legs, multi-legged robots have the potential to move through complex terrains for applications such as search-and-rescue and terrain inspection. Prior work has developed effective and reliable locomotion strategies for multi-legged robots by propagating two waves, lateral body undulation and leg stepping, which the research team refers to as the two-wave template. However, these robots have limited capability to climb over obstacles with sizes comparable to their heights, and the team hypothesizes that these limitations stem from the two-wave template.
Seeking effective alternative waves for obstacle-climbing, the researchers designed a five-segment robot with static (non-actuated) legs, where each cable-driven joint has a rotational degree-of-freedom (DoF) in the sagittal plane (vertical wave) and a linear DoF (peristaltic wave).
The research team tested robot locomotion performance on flat and rugose terrain. While peristalsis (wave-like muscle contractions that move the body) offered only marginal benefits on flat ground, the inclusion of a peristaltic wave substantially improved locomotion performance on rugose terrain: it enabled the robot to climb obstacles of height comparable to its own, and it significantly improved the robot’s ability to traverse such terrain. The results demonstrate an alternative actuation mechanism for multi-legged robots, paving the way toward all-terrain multi-legged robots.
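To make the two added waves concrete, the sketch below drives each cable-driven joint with a sagittal rotation (vertical wave) and a linear extension (peristaltic wave) traveling head to tail. Joint counts, amplitudes, and frequency are assumed for illustration and are not the paper’s values.

```python
# A hedged sketch of the robot's joint commands: a vertical wave plus a
# peristaltic wave, both traveling head to tail. Parameters are assumptions.
import math

JOINTS = 4      # joints between the five segments
A_VERT = 0.4    # sagittal-plane rotation amplitude (rad, assumed)
A_PERI = 0.02   # linear extension amplitude (m, assumed)

def joint_commands(t, freq=0.5):
    """Return (rotation, extension) per joint at time t."""
    cmds = []
    for i in range(JOINTS):
        phase = 2 * math.pi * (freq * t - i / JOINTS)
        cmds.append((A_VERT * math.sin(phase),    # vertical wave
                     A_PERI * math.sin(phase)))   # peristaltic wave
    return cmds

print(joint_commands(0.5))
```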



TEAM: Massimiliano Iaschi* (pictured), Baxi Chong*, Tianyu Wang, Jianfeng Lin, Zhaochen Xu, Daniel Soto, Juntao He, Daniel Goldman
*equal contribution
Also at ICRA | Effective Self-Righting Strategies for Elongate Multi-Legged Robots
A Better Understanding of Robot Teleoperator Performance
TELEOPERATION
Advances in robot teleoperation have enabled groundbreaking innovations in many fields, such as space exploration, healthcare, and disaster relief. The human operator’s performance plays a key role in the success of any teleoperation task, with prior evidence suggesting that operator stress and workload can impact task performance. As robot teleoperation is currently deployed in safety-critical domains, it is essential to analyze how different stress and workload levels impact the operator.
New Georgia Tech research represents one of the first studies investigating how both stress and workload impact teleoperation performance. The team conducted a novel study to jointly manipulate users’ stress and workload and analyzed performance through objective and subjective measures. Results indicate that over 70% of participants performed better as stress increased up to a moderate level, yet the majority of participants performed worse as workload increased. Importantly, the experimental design elucidated that stress and workload have related yet distinct impacts on task performance, with workload mediating the effects of distress on performance.
TEAM: Sam Yi Ting, Erin Hedlund-Botti, Manisha Natarajan, Jamison Heard, Matthew Gombolay (pictured)
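The mediation finding can be illustrated with a standard two-step regression on synthetic data, as in the hedged sketch below; this is not the study’s analysis code or data.

```python
# Illustration of mediation (workload mediating the stress -> performance
# effect) via two-step OLS on synthetic data. Not the study's analysis.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
stress = rng.normal(size=200)
workload = 0.6 * stress + rng.normal(scale=0.5, size=200)      # mediator
performance = -0.7 * workload + rng.normal(scale=0.5, size=200)

# Step 1: stress -> workload. Step 2: performance on stress and workload.
m1 = sm.OLS(workload, sm.add_constant(stress)).fit()
X = sm.add_constant(np.column_stack([stress, workload]))
m2 = sm.OLS(performance, X).fit()

indirect = m1.params[1] * m2.params[2]   # a*b, the mediated path
print(f"indirect effect of stress via workload: {indirect:.2f}")
```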


Collective Behavior of Entangled Robotic Worms
MULTI-ROBOT SWARMS
Researchers from ETH Zurich, Harvard, and Georgia Tech have combined soft robots and living worms as systems to study collective behaviors driven by physical entanglement. The researchers demonstrated individual and group movement as well as tunable cohesion in both robotic and biological “blobs,” highlighting the generalizability of entanglement-based behaviors. Researchers say the work can be extended to investigate tasks such as collective transport, where entangled groups move objects too large for individuals, and explore mechanisms for controlled disentanglement. By comparing the robot and worm systems, the authors aim to identify key features underlying observed collective behaviors and plan to develop autonomous robot systems with embedded sensing and untethered operation.
The foreground image shows a proof-of-concept demonstration of collective transport with a robot blob. The blob entangles with a 3D-printed bust and carries it along as it undergoes collective locomotion.
BEST PAPER FINALIST
TEAM: Carina Kaeser, Junghan Kwon, Elio Challita, Harry Tuazon, Robert Wood, Saad Bhamla, Justin Werfel

FEATURED
New Algorithm Teaches Robots Through Human Perspective
With just 90 minutes of first-person video, a humanoid robot learned household tasks 4x faster — paving the way for scalable, human-taught assistive robots.


By Nathan Dean, School of Interactive Computing
A new data creation paradigm and algorithmic breakthrough from Georgia Tech has laid the groundwork for humanoid assistive robots to help with laundry, dishwashing, and other household chores. The framework enables these robots to learn new skills by mimicking actions from first-person videos of everyday activities.
Current training methods prevent robots from being produced at the scale needed to put a robot in every home, said Simar Kareer, a Ph.D. student in the School of Interactive Computing.
“Traditionally, collecting data for robotics means creating demonstration data,” Kareer said. “You operate the robot’s joints with a controller to move it and achieve the task you want, and you do this hundreds of times while recording sensor data, then train your models. This is slow and difficult. The only way to break that cycle is to detach the data collection from the robot itself.”
Other fields, such as computer vision and natural language processing (NLP), already leverage training data passively culled from the internet to create powerful generative AI and large-language models (LLMs).

Many roboticists, however, have shifted toward interventions that allow individual users to teach their robots how to perform tasks. Kareer believes a similar source of passive data can be established to enable practical generalized training that scales the production of humanoid robots.
This is why Kareer collaborated with School of IC Assistant Professor Danfei Xu and his Robot Learning and Reasoning Lab to develop EgoMimic, an algorithmic framework that leverages data from egocentric videos.
Meta’s Ego4D dataset inspired Kareer’s project. The benchmark dataset, released in 2023, consists of first-person videos of humans performing daily activities. The open-source dataset is used to train AI models from a first-person human perspective.
“When I looked at Ego4D, I saw a dataset that’s the same as all the large robot datasets we’re trying to collect, except it’s with humans,” Kareer said. “You just wear a pair of glasses, and you go do things. It doesn’t need to come from the robot. It should come from something more scalable and passively generated, which is us.”
Kareer acquired a pair of Meta’s Project Aria research glasses, which contain a rich sensor suite and can record video from a first-person perspective through external RGB and SLAM cameras.
Kareer recorded himself folding a shirt while wearing the glasses and repeated the process. He did the same with other tasks such as placing a toy in a bowl and groceries into a bag. Then, he constructed a humanoid robot with pincers for hands and attached the glasses to the top to mimic a first-person viewpoint.
The robot performed each task repeatedly for two hours. Kareer said building a traditional training algorithm would take days of teleoperating and recording robot sensory data. For his project, he only needed to gather a baseline of sensory data to ensure performance improvement.
Kareer bridged the gap between the two training sets with the EgoMimic algorithm. The robot’s task performance increased by as much as 400% across various tasks with just 90 minutes of recorded footage. It also showed the ability to perform these tasks in previously unseen environments.
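Conceptually, this means co-training a single policy on plentiful human egocentric data and a small amount of robot data. The sketch below illustrates that idea with a toy co-training step; the losses, weighting, and network here are illustrative assumptions, not the paper’s implementation.

```python
# A conceptual co-training step in the spirit of EgoMimic: human egocentric
# data supervises hand trajectories, robot data supervises actions. The loss
# design, weighting, and architecture are illustrative assumptions.
import torch

def cotrain_step(policy, human_batch, robot_batch, opt, human_weight=0.5):
    """One gradient step mixing human and robot imitation losses."""
    loss_h = torch.nn.functional.mse_loss(policy(human_batch["obs"]),
                                          human_batch["hand_traj"])
    loss_r = torch.nn.functional.mse_loss(policy(robot_batch["obs"]),
                                          robot_batch["actions"])
    loss = human_weight * loss_h + (1 - human_weight) * loss_r
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

policy = torch.nn.Linear(16, 6)   # toy stand-in for the real network
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
mk = lambda key: {"obs": torch.randn(8, 16), key: torch.randn(8, 6)}
print(cotrain_step(policy, mk("hand_traj"), mk("actions"), opt))
```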
If enough people wear Aria glasses or other smart glasses while performing daily tasks, they could create the passive data bank needed to train robots at a massive scale.
This type of data collection could open nearly endless possibilities for roboticists to help humans achieve more in their everyday lives. Humanoid robots could be produced and trained at an industrial scale to perform tasks the same way humans do.
“This work is most applicable to jobs that you can get a humanoid robot to do,” Kareer said. “In whatever industry we are allowed to collect egocentric data, we can develop humanoid robots.”
Kareer will present his paper on EgoMimic at the 2025 IEEE International Conference on Robotics and Automation (ICRA), which will take place May 19–23 in Atlanta. The paper was co-authored by Xu and School of IC Assistant Professor Judy Hoffman, fellow Tech students Dhruv Patel, Ryan Punamiya, Pranay Mathur, and Shuo Cheng, and Chen Wang, a Ph.D. student at Stanford.
Meet the Team

Simar Kareer
Ph.D. student

Dhruv Patel
Robotics M.S. student

Ryan Punamiya
Computer Science B.S. student

Pranay Mathur
Robotics M.S. 2024

Shuo Cheng
Ph.D. student

Chen Wang
Ph.D. student, Stanford

Judy Hoffman
Associate Professor, Interactive Computing

Danfei Xu
Associate Professor, Interactive Computing

Bruce Walker Named Founding Director of New Center to Advance Human-AI-Robot Collaboration
Imagine a future where robotic guide dogs lead the visually impaired, flying cars navigate the skies, and electric self-driving vehicles communicate effortlessly with pedestrians.
That future is being shaped today at Georgia Tech’s Center for Human-AI-Robot Teaming (CHART). Led by Bruce Walker, a professor in the School of Psychology and the School of Interactive Computing, the newly launched Center aims to transform how humans, artificial intelligence, and robots work together. By focusing on the dynamic partnership between humans and intelligent systems, CHART will explore how humans can collaborate more effectively with artificial intelligence systems and robots to solve critical scientific and societal challenges.
Walker is a coauthor of the paper Do Looks Matter? Exploring Functional and Aesthetic Design Preferences for a Robotic Guide Dog in the ICRA 2025 technical program.
Robotics World Converges on Atlanta for ICRA 2025
The world’s largest robotics conference is coming to Atlanta, where 136 researchers and students from Georgia Tech will showcase their contributions to a booming field.
The IEEE International Conference on Robotics and Automation (ICRA) will be held Monday through Friday, May 19–23, at the Georgia World Congress Center.
“This is the flagship robotics conference,” said Seth Hutchinson, a former Georgia Tech professor who served as one of two general chairs for this year’s event. “Most of the robotics researchers you want to hear from or see will be at this conference.”

ICRA 2025 Expo
Wed, May 21, 3-5 pm
Showcasing Cutting-Edge Robotic Research
Smart Foot System for Enhanced Robot Mobility on Challenging Terrains
Deniz Kerimoglu, Burak Catalbas, Bahadir Catalbas, Daniel Goldman
Robotic platforms mainly focus on locomoting and operating in structured environments such as homes, factories, highways, and streets. These robots can navigate reliably in relatively predictable and controlled settings by relying on consistent ground interactions to perform various tasks. However, robots must also achieve stable locomotion in unpredictable and challenging terrain, such as natural environments and hazardous areas where human operation is difficult, enabling tasks such as exploration, load carrying, and infrastructure maintenance.
Ground Control Robotics ICRA 2025 Demo Proposal
Daniel Soto, Esteban Flores, Daniel Goldman
Multi-legged, undulatory robots possess many advantageous properties for locomotion over unstructured and crowded terrain, including low profiles and robustness to missing foot contacts. Despite these advantages, coordinating a high number of legs (6+) and body joints represents a many-degree-of-freedom control problem that has limited their practical and commercial viability. Ground Control Robotics, LLC (GCR), in collaboration with researchers at Georgia Tech, seeks to commercialize these systems for agricultural use by advancing robophysical theories and developing robust mechanical systems.
Multimodal Perception with Legged Mobile Manipulator for Visual, Thermal, and Radiation Monitoring
Hojoon Son, Youndo Do, Marc Zebrowitz, Jacob Faile, Spencer Banks, Myeongjun Choi, Fan Zhang
The proposal presents a multimodal robotic platform for remote visual, thermal, and radiation monitoring in hazardous or unknown environments. The system integrates a Unitree B1 quadruped robot with a Unitree Z1 robotic arm to create a mobile and semi-autonomous perception platform. Equipped with a Teledyne FLIR Hadron 640R, which integrates a long-wave infrared (thermal) camera and a 1080p visible-light imaging sensor, along with an SPRD-ER gamma radiation detector, the robot fuses visual, thermal, and radiation data for real-time monitoring and environmental awareness.
EgoMimic-Expo: Demonstrating Robot Learning from Egocentric Data
Simar Kareer, Dhruv Patel, Ryan Punamiya, Pranay Mathur, Shuo Cheng, Chen Wang, Judy Hoffman, Danfei Xu
This demo showcases EgoMimic, a robotic system that learns from egocentric human data captured by wearable smart glasses. We will demonstrate how robot manipulation skills can be scaled using easily collected human data. This interactive demo features: (1) live egocentric video streaming from Project Aria glasses capturing human demonstrations; (2) policy execution on “Eve,” our low-cost, humanoid-style bimanual robot performing contact-rich tasks (e.g., shirt folding, grocery packing). Project info: https://egomimic.github.io/
Light Following Robophysical Space Rover with Closed-Loop Gait Strategies at Granular Slope
Bahadir Catalbas, Deniz Kerimoglu, Burak Catalbas, Malone Lincoln Hemsley, Daniel Goldman
Exploring extraterrestrial environments requires planetary rovers to gather data and conduct experiments on challenging terrains like steep granular slopes, obstacles, and craters. To overcome these difficulties, modern rovers have leg-like movement systems that can lift, sweep, and spin their wheels; such a mechanism can apply selective substrate fluidization by changing the sweep and spin speed of wheels to generate effective thrust. We utilize a 30 cm-long laboratory-scale robophysical rover model on a tiltable fluidizing testbed containing poppy seeds.

Monday, May 19
9:10 am – 11:30 am | Advanced Manufacturing Pilot Facility
9:15 am – 11:30 am | Georgia Tech Research Institute
9:15 am – 11:30 am | Georgia Tech Hi-bay Robotics and Human-Augmentation Space
11:40 am – 2:00 pm | Klaus Building Robotics Lab Tour
12:45 pm – 3:00 pm | Advanced Manufacturing Pilot Facility 2
12:45 pm – 3:00 pm | Georgia Tech Research Institute 2
12:45 pm – 3:00 pm | Georgia Tech Hi-bay Robotics and Human-Augmentation Space 2
1:00 pm – 3:30 pm | Klaus Building Robotics Lab Tour 2

RESEARCH 
ICRA 2025 papers with Georgia Tech coauthors are listed below, sorted alphabetically by session. Search the complete program by day and keyword (e.g., “Georgia Tech”) for other work, including Late-Breaking Results, or see the Author Index. Also check out the Workshop listing.
Aerial Robots: Mechanics and Control
Dense Fixed-Wing Swarming Using Receding-Horizon NMPC
Varun Madabushi, Yocheved Kopel, Adam Polevoy, Joseph Moore
This paper presents a method for controlling agile fixed-wing aerial vehicles flying closely together in a swarm. The method uses receding-horizon nonlinear model predictive control (NMPC) to plan dynamic maneuvers while avoiding inter-agent collisions.
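As a toy illustration of the receding-horizon structure (not the paper’s fixed-wing NMPC), the sketch below re-solves a short-horizon problem at each step for a single 2-D agent, penalizing proximity to its neighbors; horizon, step size, and separation distance are assumptions.

```python
# A hedged sketch of receding-horizon planning: each step re-solves a
# short-horizon problem that penalizes inter-agent proximity. Toy 2-D
# single-integrator agent, not the paper's fixed-wing NMPC.
import numpy as np
from scipy.optimize import minimize

H, DT, D_SAFE = 5, 0.1, 1.0   # horizon steps, step (s), min separation (m)

def plan(x0, goal, others):
    """Solve for H velocity commands; apply only the first one."""
    def cost(u):
        u = u.reshape(H, 2)
        x, c = x0.copy(), 0.0
        for k in range(H):
            x = x + DT * u[k]                      # roll out dynamics
            c += np.sum((x - goal) ** 2) + 0.01 * np.sum(u[k] ** 2)
            for xo in others:                      # soft collision penalty
                c += 100.0 * max(0.0, D_SAFE - np.linalg.norm(x - xo)) ** 2
        return c
    u = minimize(cost, np.zeros(2 * H)).x.reshape(H, 2)
    return u[0]

x = np.array([0.0, 0.0])
print(plan(x, goal=np.array([5.0, 0.0]), others=[np.array([1.0, 0.2])]))
```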
Agricultural Automation
Robotic 3D Flower Pose Estimation for Small-Scale Urban Farms
Venkata Harsh Suhith Muriki, Hong Ray Teo, Ved Sengupta, Ai-Ping Hu
This paper presents a novel approach for flower pose estimation using a FarmBot platform with a custom camera end-effector to automate plant phenotyping. By leveraging 3D point cloud data, the system generates 2D images corresponding to six orthogonal viewpoints of the flower.
Agricultural Automation
Towards Closing the Loop in Robotic Pollination for Indoor Farming Via Autonomous Microscopic Inspection
Chuizheng Kong, Alex Qiu, Idris Wibowo, Marvin Ren, Aishik Dhori, Kai-Shu Ling, Ai-Ping Hu, Shreyas Kousik
This work proposes a novel robotic system for indoor farming. The proposed hardware combines a 7-degree-of-freedom (DOF) manipulator arm with a custom end-effector comprising an endoscope camera, a 2-DOF microscope subsystem, and a custom vibrating pollination tool.
Assistive Robotics
Do Looks Matter? Exploring Functional and Aesthetic Design Preferences for a Robotic Guide Dog
Aviv Cohav, Xinran Gong, Joanne Taery Kim, Clint Zeagler, Sehoon Ha, Bruce Walker
Dog guides offer an effective mobility solution for blind or visually impaired (BVI) individuals, but conventional dog guides have limitations including the need for care, potential distractions, societal prejudice, high costs, and limited availability.
Bioinspiration and Biomimetics
AquaMILR: Mechanical Intelligence Simplifies Control of Undulatory Robots in Cluttered Fluid Environments
Tianyu Wang, Nishanth Mankame, Matthew Fernandez, Velin Kojouharov, Daniel Goldman
This paper explores how mechanical intelligence—the idea that physical body mechanics can simplify control—applies to undulatory robots swimming in cluttered aquatic environments.
Bioinspiration and Biomimetics
AquaMILR+: Design of an Untethered Limbless Robot for Complex Aquatic Terrain Navigation
Matthew Fernandez, Tianyu Wang, Galen Tunnicliffe, Donoven Dortilus, Peter Gunnarson, John Dabiri, Daniel Goldman
This paper presents AquaMILR+, an untethered limbless robot designed for agile navigation in complex aquatic environments. The robot uses a bilateral actuation mechanism, inspired by musculoskeletal actuation in anguilliform swimming organisms, allowing undulatory swimming from head to tail.
Bioinspiration and Biomimetics
Bird-Inspired Tendon Coupling Improves Paddling Efficiency by Shortening Phase Transition Times
Jianfeng Lin, Zhao Guo, Alexander Badri-Spröwitz
This paper explores the design of drag-based swimming vehicles, inspired by the coupling tendons of aquatic birds. The challenge of transitioning between the recovery and power phases in swimming is addressed by incorporating tendon coupling mechanisms.
Bio-Inspired Robot Learning
Materials Matter: Investigating Functional Advantages of Bio-Inspired Materials Via Simulated Robotic Hopping
Andrew Schulz, Ayah Ahmad, Maegan Tucker
This paper explores how material properties impact robot performance, inspired by how natural systems use varied materials to their advantage. Using a simulated single-limb hopping robot, the authors tested different material profiles.
Design and Control
Continuously Variable Transmission and Stiffness Actuator Based on Actively Variable Four-Bar Linkage for Highly Dynamic Robot Systems
Jungwoo Hur, Hangyeol Song, Seokhwan Jeong
This paper presents a novel actuation mechanism that combines a continuously variable transmission (CVT) mechanism with a variable stiffness actuator (VSA) for highly dynamic robot systems such as legged robots.
Diffusion for Manipulation
Legibility Diffuser: Offline Imitation for Intent Expressive Motion
Matthew Bronars, Shuo Cheng, Danfei Xu
This paper introduces Legibility Diffuser, a diffusion-based policy for generating intent-expressive motion in human-robot collaboration. Unlike classical motion planners, this approach learns directly from offline human demonstration data.
Diffusion Models
Learning Diverse Robot Striking Motions with Diffusion Models and Kinematically Constrained Gradient Guidance
Kin Man Lee, Sean Ye, Qingyu Xiao, Zixuan Wu, Zulfiqar Zaidi, David D’Ambrosio, Pannag Sanketi, Matthew Gombolay
This paper presents a novel approach to robot learning, focusing on learning diverse robot striking motions for tasks such as table tennis. The authors propose a diffusion modeling approach that is offline and constraint-guided.
Explainable AI in Robotics
CE-MRS: Contrastive Explanations for Multi-Robot Systems
Ethan Schneider, Daniel Wu, Devleena Das, Sonia Chernova
As multi-robot systems grow in complexity, their decisions often become hard for humans to understand. This paper introduces CE-MRS, a framework for generating contrastive explanations—natural language explanations that answer questions like “Why did the system do this instead of that?”
ID and Estimation for Legged Robots
Simultaneous Collision Detection and Force Estimation for Dynamic Quadrupedal Locomotion
Ziyi Zhou, Stefano Di Cairano, Yebin Wang, Karl Berntorp
In this paper we address the simultaneous collision detection and force estimation problem for quadrupedal locomotion using joint encoder information and the robot dynamics only. We design an interacting multiple-model Kalman filter (IMM-KF) that estimates the external force exerted on the robot and multiple possible contact modes.
Imitation Learning
Learning Wheelchair Tennis Navigation from Broadcast Videos with Domain Knowledge Transfer and Diffusion Motion Planning
Zixuan Wu, Zulfiqar Zaidi, Adithya Patil, Qingyu Xiao, Matthew Gombolay
In this paper, we propose a novel and generalizable zero-shot knowledge transfer framework that distills expert sports navigation strategies from web videos into robotic systems with adversarial constraints and out-of-distribution image trajectories.
Imitation Learning for Manipulation
EgoMimic: Scaling Imitation Learning Via Egocentric Video
Simar Kareer, Dhruv Patel, Ryan Punamiya, Pranay Mathur, Shuo Cheng, Chen Wang, Judy Hoffman, Danfei Xu
A major challenge in imitation learning is the need for large, diverse demonstration data. EgoMimic is a full-stack framework designed to scale robotic manipulation using egocentric human videos.
Imitation Learning for Manipulation
Learning Prehensile Dexterity by Imitating and Emulating State-Only Observations
Yunhai Han, Zhenyang Chen, Kyle Williams, Harish Ravichandar
Humans often learn physical skills by first observing experts and then emulating them through practice. Inspired by this, the authors propose CIMER (Combining IMitation and Emulation for Motion Refinement) — a two-stage learning framework for dexterous prehensile manipulation from state-only observations.
Imitation Learning for Manipulation
RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations
Ezra Ameperosa, Jeremy Collins, Mrinal Jain, Animesh Garg
RoCoDA is a data augmentation framework for imitation learning that improves generalization and efficiency by combining three key ideas: invariance, equivariance, and causality. It makes synthetic demonstrations by (1) modifying task-irrelevant parts of the scene (causal invariance), and (2) applying rigid SE(3) transformations to objects and adjusting the actions (equivariance).
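The equivariance idea can be illustrated with a small sketch: rigidly transform a demonstration’s waypoints with an SE(3) motion, exactly as the object is transformed. Frames and action spaces here are simplified stand-ins for RoCoDA’s actual pipeline.

```python
# A hedged sketch of SE(3)-equivariant augmentation: map 3-D waypoints of a
# demonstration through a rigid motion. Frames simplified for illustration.
import numpy as np

def se3(yaw, t):
    """Homogeneous transform: planar-yaw rotation plus translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    T[:3, 3] = t
    return T

def augment(demo_points, yaw=0.3, t=(0.05, -0.02, 0.0)):
    """Apply one rigid motion to every waypoint of a demonstration."""
    T = se3(yaw, np.asarray(t))
    pts = np.c_[demo_points, np.ones(len(demo_points))]  # homogeneous coords
    return (pts @ T.T)[:, :3]

demo = np.array([[0.40, 0.00, 0.10], [0.42, 0.01, 0.12]])
print(augment(demo))
```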
In-Hand Manipulation
Diffusion-Informed Probabilistic Contact Search for Multi-Finger Manipulation
Abhinav Kumar, Thomas Power, Fan Yang, Sergio Aguilera, Soshi Iba, Rana Soltani Zarrin, Dmitry Berenson
Contact-rich manipulation with multi-fingered hands is difficult due to the complex and hybrid nature of dynamics. This paper introduces DIPS (Diffusion-Informed Probabilistic Search), a planning method that uses A* search informed by a diffusion model trained on high-quality demonstrations.
Integrating Motion Planning/Learning
CLIMB: Language-Guided Continual Learning for Task Planning with Iterative Model Building
Walker Byrnes, Miroslav Bogdanovic, Avi Balakirsky, Stephen Balakirsky, Animesh Garg
Intelligent and reliable task planning is a core capability for generalized robotics, which requires a descriptive domain representation that sufficiently models all object and state information for the scene. We present CLIMB, a continual learning framework for robot task planning.
Learning for Manipulation
Catch It! Learning to Catch in Flight with Mobile Dexterous Hands
Yuanhang Zhang, Tianhai Liang, Zhenyang Chen, Yanjie Ze, Huazhe Xu
This paper tackles the challenge of catching objects thrown through the air—a skill that demands precise, agile, and full-body control. The team built a mobile robot system with a mobile base, a 6-DoF arm, and a 12-DoF dexterous hand.
Legged Locomotion: Novel Platforms
Addition of a Peristaltic Wave Improves Multi-Legged Locomotion Performance on Complex Terrains
Massimiliano Iaschi, Baxi Chong, Tianyu Wang, Jianfeng Lin, Zhaochen Xu, Daniel Soto, Juntao He, Daniel Goldman
Characterized by their elongate bodies and relatively simple legs, multi-legged robots have the potential to locomote through complex terrains for applications such as search-and-rescue and terrain inspection. Prior work has developed effective and reliable locomotion strategies for multi-legged robots by propagating the two waves of lateral body undulation and leg stepping, which we will refer to as the two-wave template.
Legged Locomotion: Novel Platforms
Berkeley Humanoid: A Research Platform for Learning-Based Control
Qiayuan Liao, Bike Zhang, Xuanyu Huang, Xiaoyu Huang, Zhongyu Li, Koushil Sreenath
We introduce Berkeley Humanoid, a reliable and low-cost mid-scale humanoid research platform for learning-based control. Our lightweight, in-house-built robot is designed specifically for learning algorithms with accurate simulation, low simulation complexity, anthropomorphic motion, and high reliability against falls.
Legged Locomotion: Novel Platforms
Effective Self-Righting Strategies for Elongate Multi-Legged Robots
Erik Teder, Baxi Chong, Juntao He, Tianyu Wang, Massimiliano Iaschi, Daniel Soto, Daniel Goldman
Centipede-like robots offer an effective and robust solution to navigation over complex terrain with minimal sensing. However, when climbing over obstacles, such multi-legged robots often elevate their center-of-mass into unstable configurations, where even moderate terrain uncertainty can cause tipping over.
Manipulation Planning and Control
Is Linear Feedback on Smoothed Dynamics Sufficient for Stabilizing Contact-Rich Plans?
Yuki Shirai, Tong Zhao, Hyung Ju Terry Suh, Huaijiang Zhu, Xinpei Ni, Jiuguang Wang, Max Simchowitz, Tao Pang
This paper explores the challenges of designing planners and controllers for contact-rich manipulation, where contact disrupts the smoothness assumptions of many gradient-based controller synthesis tools. The authors analyze the effectiveness of linear feedback control for smoothed dynamics, using contact smoothing to approximate non-smooth systems.
Marine Robotics
A Data-Driven Velocity Estimator for Autonomous Underwater Vehicles Experiencing Unmeasurable Flow and Wave Disturbance
Jinzhi Cai, Scott Mayberry, Huan Yin, Fumin Zhang
Autonomous Underwater Vehicles (AUVs) encounter significant challenges in confined spaces like ports and testing tanks, where vehicle-environment interactions, such as wave reflections and unsteady flows, introduce complex, time-varying disturbances. Model-based state estimation methods can struggle to handle these dynamics, leading to localization errors.
Mechanism Design and Control
Guaranteed Reach-Avoid for Black-Box Systems through Narrow Gaps Via Neural Network Reachability
Long Kiu Chung, Wonsuhk Jung, Srivatsank Pullabhotla, Parth Kishor Shinde, Yadu Krishna Sunil, Saihari Kota, Luis F. W. Batista, Cedric Pradalier, Shreyas Kousik
In the classical reach-avoid problem, autonomous mobile robots are tasked to reach a goal while avoiding obstacles. However, it is difficult to provide guarantees on the robot’s performance when the obstacles form a narrow gap and the robot is a black-box (i.e. the dynamics are not known analytically, but interacting with the system is cheap).
Mechanism Design and Control
RAIL: Reachability-Aided Imitation Learning for Safe Policy Execution
Wonsuhk Jung, Dennis Anthony, Utkarsh Mishra, Nadun Ranawaka Arachchige, Matthew Bronars, Danfei Xu, Shreyas Kousik
Imitation learning (IL) has shown great success in learning complex robot manipulation tasks. However, there remains a need for practical safety methods to justify widespread deployment.
Medical Robot Systems
Design and Modeling of a Compact Spooling Mechanism for the COAST Guidewire Robot
Timothy A. Brumfiel, Jared Grinberg, Betina Siopongco, Jaydev P. Desai
The treatment of many intravascular procedures begins with a clinician manually placing a guidewire to the target lesion to aid in placing other devices. Manually steering the guidewire is challenging due to the lack of direct tip control and the high tortuosity of vessel structures, potentially resulting in vessel perforation or guidewire fracture.
Model Control, Legged Robots
Terrain-Aware Model Predictive Control of Heterogeneous Bipedal and Aerial Robot Coordination for Search and Rescue Tasks
Abdulaziz Shamsah, Jesse Jiang, Ziwon Yoon, Samuel Coogan, Ye Zhao
This study presents a task and motion planning framework for search and rescue operations using a heterogeneous robot team composed of humanoids and aerial robots. A terrain-aware Model Predictive Controller (MPC) is proposed, incorporating terrain elevation gradients learned using Gaussian processes (GP).
Motion Planning
Propagative Distance Optimization for Motion Planning
Yu Chen, Jinyun Xu, Yilin Cai, Ting-Wei Wong, Zhongqiang Ren, Howie Choset, Guanya Shi
This paper focuses on the motion planning problem for serial articulated robots with revolute joints under kinematic constraints. Many motion planners leverage iterative local optimization methods but are often trapped in local minima due to non-convexity of the problem.
Multi-Robot Exploration
Communication-Aware Iterative Map Compression for Online Path-Planning
Evangelos Psomiadis, Ali Reza Pedram, Dipankar Maity, Panagiotis Tsiotras
This paper addresses the problem of optimizing communicated information among heterogeneous, resource-aware robot teams to facilitate their navigation. A mobile robot compresses its local map to assist another robot in reaching a target within an uncharted environment.
Multi-Robot Swarms
Best Paper Finalist
Individual and Collective Behaviors in Soft Robot Worms Inspired by Living Worm Blobs
Carina Kaeser, Junghan Kwon, Elio Challita, Harry Tuazon, Robert Wood, Saad Bhamla, Justin Werfel
California blackworms constitute a recently identified animal system exhibiting unusual collective behaviors, in which dozens to thousands of worms entangle to form a “blob” capable of actions like locomotion as an aggregate. In this paper we describe a system of pneumatic soft robots inspired by the blackworms.
Multi-Robot Systems
A Streamlined Heuristic for the Problem of Min-Time Coverage in Constricted Environments (I)
Young-In Kim, Spiridon Reveliotis
This paper tackles the minimum-time coverage problem for robotic fleets operating in constricted, structured environments, like pipes or narrow service tunnels. It introduces a streamlined heuristic that reduces computation time and a local search method that refines the solution for better performance and scalability.
Multi-Robot Systems
Integrating Multi-Robot Adaptive Sampling and Informative Path Planning for Spatiotemporal Natural Environment Prediction
Siva Kailas, Srujan Deolasee, Wenhao Luo, Woojun Kim, Katia Sycara
This work presents a decentralized framework for multi-robot adaptive sampling and informative path planning to predict spatiotemporal environmental processes. Using Gaussian Process models and peer-to-peer coordination, the system identifies informative sampling locations and plans efficient paths under constraints.
Multi-Robot Systems
Residual Descent Differential Dynamic Game (RD3G) – a Fast Newton Solver for Constrained General Sum Games
Zhiyuan Zhang, Panagiotis Tsiotras
This paper introduces RD3G, a Newton-based solver for multi-agent differential dynamic games with constraints. It dynamically handles constraints and uses efficient techniques to achieve 4× faster performance and 2× higher convergence compared to previous methods.
Novel Methods for Mapping/Localization
Evaluating Global Geo-Alignment for Precision Learned Autonomous Vehicle Localization Using Aerial Data
Yi Yang, Xuran Zhao, Haicheng Charles Zhao, Shumin Yuan, Samuel Bateman, Tiffany A. Huang, Chris Beall, Will Maddern
This paper investigates aligning aerial data with sensor data for autonomous vehicle localization. Evaluated on a 1600 km dataset, the method achieves sub-0.3 meter and 0.5° errors, showing promise for precise, scalable localization.
Offroad Navigation
Dynamics Modeling Using Visual Terrain Features for High-Speed Autonomous Off-Road Driving
Jason Gibson, Anoushka Alavilli, Erica Tevere, Evangelos Theodorou, Patrick Spieler
This paper presents a hybrid dynamics model that uses visual terrain features—extracted via a foundation model like DINOv2—to improve high-speed autonomous off-road driving. The method enables real-time planning and is validated on diverse terrain data from the DARPA RACER program.
Optimization and Optimal Control
Second-Order Stein Variational Dynamic Optimization
Yuichiro Aoyama, Peter Lehmann, Evangelos Theodorou
The authors introduce a new optimization algorithm—Stein Variational Differential Dynamic Programming—that merges sampling-based and gradient-based approaches to improve trajectory optimization. It performs well in Model Predictive Control tasks, offering better convergence and avoiding local minima.
Perception for Mobile Robots
DreamDrive: Generative 4D Scene Modeling from Street View Images
Jiageng Mao, Boyi Li, Boris Ivanovic, Yuxiao Chen, Yan Wang, Yurong You, Chaowei Xiao, Danfei Xu, Marco Pavone, Yue Wang
DreamDrive synthesizes 3D-consistent, 4D driving scenes from street view images using generative video diffusion models and hybrid Gaussian representations. It produces realistic and generalizable driving videos from in-the-wild data, enhancing autonomous perception and planning.
Reinforcement Learning
Learning a High-Quality Robotic Wiping Policy Using Systematic Reward Analysis and Visual-Language Model Based Curriculum
Yihong Liu, Dongyeop Kang, Sehoon Ha
This work improves robotic wiping through a bounded reward formulation and a VLM-based curriculum that guides learning. The combined approach trains high-quality wiping policies across varied surfaces, solving convergence issues faced by standard Deep RL methods.
Reinforcement Learning
PrivilegedDreamer: Explicit Imagination of Privileged Information for Rapid Adaptation of Learned Policies
Morgan Byrd, Jackson Crandell, Mili Das, Jessica Inman, Robert Wright, Sehoon Ha
PrivilegedDreamer is a model-based reinforcement learning framework for hidden-parameter MDPs. It uses a dual recurrent architecture to estimate hidden environment parameters and conditions its networks on them, significantly improving sim-to-real transfer and outperforming state-of-the-art methods on multiple tasks.
Reinforcement Learning Applications
Learning Multi-Agent Coordination for Replenishment at Sea
Byeolyi Han, Minwoo Cho, Letian Chen, Rohan Paleja, Zixuan Wu, Sean Ye, Esmaeil Seraj, David Sidoti, Matthew Gombolay
This work introduces Marine, a MARL environment simulating sea-based logistics with real wave data. Their SchedHGNN model combines a heterogeneous graph neural network and intrinsic rewards to improve coordination under dynamic weather, achieving up to 37.8% better performance than prior baselines.
Representation Learning
Learning Dynamics of a Ball with Differentiable Factor Graph and Roto-Translational Invariant Representations
Qingyu Xiao, Zixuan Wu, Matthew Gombolay
This paper proposes a differentiable factor graph and roto-translational invariant representations to model fast, nonlinear ball dynamics in games like ping pong. The approach achieves improved accuracy with low RMSE and supports agile robot planning in dynamic, contact-rich environments.
Representation Learning
MI-HGNN: Morphology-Informed Heterogeneous Graph Neural Network for Legged Robot Contact Perception
Daniel Chase Butterfield, Sandilya Sai Garimella, NaiJen Cheng, Lu Gan
MI-HGNN is a graph neural network informed by robot morphology for legged robot contact perception. It outperforms a leading baseline by 8.4% using only 0.21% of its parameters, and generalizes to other multi-body systems. Code is available on GitHub: Morphology-Informed-HGNN.
Resiliency and Security
Affine Transformation-Based Perfectly Undetectable False Data Injection Attacks on Remote Manipulator Kinematic Control with Attack Detector
Jun Ueda, Jacob Blevins
This paper demonstrates the viability of perfectly undetectable affine transformation attacks against robotic manipulators. The attacker can implement these communication line attacks by satisfying three conditions presented in this work.
Resiliency and Security
Perfectly Undetectable False Data Injection Attacks on Encrypted Bilateral Teleoperation System Based on Dynamic Symmetry and Malleability
Hyukbin Kwon, Hiroaki Kawase, Heriberto Andres Nieves-Vazquez, Kiminao Kogiso, Jun Ueda
This paper investigates the vulnerability of bilateral teleoperation systems to perfectly undetectable False Data Injection Attacks (FDIAs). The paper focuses on a specific class of cyberattacks: perfectly undetectable FDIAs, where attackers alter signals without leaving detectable traces at all.
Robot Safety
Dynamic Gap: Safe Gap-Based Navigation in Dynamic Environments
Maxwell Asselmeier, Dhruv Ahuja, Abdel Zaro, Ahmad Abuaish, Ye Zhao, Patricio Vela
This paper extends the family of gap-based local planners to unknown dynamic environments by generating provably collision-free properties for hierarchical navigation systems. Unlike existing planners that rely on empirical robustness for dynamic obstacle avoidance, this method performs a formal analysis of dynamic obstacles.
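For intuition, a bare-bones version of the gap-based idea is sketched below: find the widest free angular gap in a range scan and steer toward its center. The paper’s dynamic-gap analysis and collision-free guarantees go far beyond this toy.

```python
# A toy illustration of gap-based navigation: pick the widest run of
# far-enough range readings and steer toward its center angle.
import numpy as np

def widest_gap(ranges, angles, r_free=2.0):
    """Return the center angle of the widest free gap, or None."""
    free = ranges > r_free
    best, cur, start = (0, 0), 0, 0
    for i, f in enumerate(np.append(free, False)):  # sentinel closes last run
        if f:
            cur += 1
            if cur == 1:
                start = i
            if cur > best[0]:
                best = (cur, start)
        else:
            cur = 0
    n, s = best
    return angles[s + n // 2] if n else None

angles = np.linspace(-np.pi / 2, np.pi / 2, 181)
ranges = np.where(np.abs(angles) < 0.3, 1.0, 5.0)   # obstacle dead ahead
print(widest_gap(ranges, angles))
```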
SLAM
HDPlanner: Advancing Autonomous Deployments in Unknown Environments through Hierarchical Decision Networks
Jingsong Liang, Yuhong Cao, Yixiao Ma, Hanqi Zhao, Guillaume Adrien Sartoretti
In this paper, we introduce HDPlanner, a deep reinforcement learning (DRL) based framework designed to tackle two core and challenging tasks for mobile robots: autonomous exploration and navigation. Specifically, HDPlanner relies on novel hierarchical attention networks to empower the robot to reason about its belief across multiple spatial scales.
Soft Robotic Grasping
Kinetostatics and Retention Force Analysis of Soft Robot Grippers with External Tendon Routing
Anthony Gunderman, Yifan Wang, Benjamin Gunderman, Alex Qiu, Milad Azizkhani, Joseph Sommer, Yue Chen
Soft robots (SR) are a class of continuum robots that enable safe human interaction with task versatility beyond rigid robots. This letter presents a kinetostatic modeling approach based on strain energy minimization subject to mechanics and geometric constraints for shape estimation of SR grippers with external tendon routing (ETR).
Software Tools
A Survey on Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms
Armin Mokhtarian, Jianye Xu, Patrick Scheffe, Maximilian Kloock, Simon Schäfer, Heeseung Bang, Viet-Anh Le, Sangeet Ulhas, Johannes Betz, Sean Wilson, Spring Berman, Liam Paull, Amanda Prorok, Bassam Alrifaee
This work serves to facilitate researchers’ efforts in identifying existing small-scale testbeds suitable for their experiments and provide insights for those who want to build their own for connected and automated vehicles and robot swarms.
Surgical Robotics: Catheters/Needles
Model-Based Parameter Selection for a Steerable Continuum Robot — Applications to Bronchoalveolar Lavage (BAL)
Amber K. Rothe, Timothy A. Brumfiel, Revanth Konda, Kirsten Williams, Jaydev P. Desai
Bronchoalveolar lavage (BAL) is a minimally invasive procedure for diagnosing lung infections and diseases. Continuum robots could improve the navigation of catheters, guidewires, and endoscopes in such procedures.
Surgical Robotics: Planning
SuFIA-BC: Generating High Quality Demonstration Data for Visuomotor Policy Learning in Surgical Subtasks
Masoud Moghani, Nigel Nelson, Mohamed Ghanem, Andres Diaz-Pinto, Kush Hari, Mahdi Azizian, Ken Goldberg, Sean Huver, Animesh Garg
Behavior cloning facilitates the learning of dexterous manipulation skills, yet the complexity of surgical environments, the difficulty and expense of obtaining patient data, and robot calibration errors present unique challenges for surgical robot learning. We present SuFIA-BC: visual Behavior Cloning policies for Surgical First Interactive Autonomy Assistants.
Task and Motion Planning
Optimization-Based Task and Motion Planning under Signal Temporal Logic Specifications Using Logic Network Flow
Xuan Lin, Jiming Ren, Samuel Coogan, Ye Zhao
This paper proposes an optimization-based task and motion planning framework, called Logic Network Flow, to integrate signal temporal logic (STL) specifications into efficient mixed-binary linear programs. Logic Network Flow encodes temporal predicates as polyhedral constraints on edges of a network flow.
Teleoperation
The Impact of Stress and Workload on Human Performance in Robot Teleoperation Tasks
Sam Yi Ting, Erin Hedlund-Botti, Manisha Natarajan, Jamison Heard, Matthew Gombolay
Advances in robot teleoperation have enabled groundbreaking innovations in many fields, such as space exploration, healthcare, and disaster relief. The human operator’s performance plays a key role in the success of any teleoperation task.
Testing and Validation
Learning-Based Bayesian Inference for Testing of Autonomous Systems
Anjali Parashar, Ji Yin, Charles Dawson, Panagiotis Tsiotras, Chuchu Fan
For the safe operation of robotic systems, it is important to accurately understand their failure modes through prior testing. Hardware testing of robotic infrastructure is known to be slow and costly.
Vision-Based Navigation
Safer Gap: Safe Navigation of Planar Nonholonomic Robots with a Gap-Based Local Planner
Shiyu Feng, Ahmad Abuaish, Patricio Vela
This paper extends the idea of gap-based robot navigation to nonholonomic robots with safety guarantees. The authors propose a safe navigation technique that ensures robot movement in dynamic environments while guaranteeing collision avoidance.
Vision-Based Navigation
X-MOBILITY: End-To-End Generalizable Navigation Via World Modeling
Wei Liu, Huihua Zhao, Chenran Li, Joydeep Biswas, Billy Okal, Pulkit Goyal, Yan Chang, Soha Pouya
This paper introduces X-Mobility, an end-to-end generalizable navigation model that overcomes existing challenges by leveraging three key ideas. First, X-Mobility employs an auto-regressive world modeling architecture with a latent state space to capture world dynamics.