publications
publications by categories in reversed chronological order. generated by jekyll-scholar.
2025
- LEARN: Learning End-to-End Aerial Resource-Constrained Multi-Robot NavigationDarren Chiu* , Zhehui Huang* , Ruohai Ge , and Gaurav S. Sukhatme2025
Nano-UAV teams offer great agility yet face severe navigation challenges due to constrained onboard sensing, communication, and computation. Existing approaches rely on high-resolution vision or compute intensive planners, rendering them infeasible for these platforms. We introduce LEARN, a lightweight, two-stage safety-guided reinforcement learning (RL) framework for multi-UAV navigation in cluttered spaces. Our system combines low-resolution Time-of-Flight (ToF) sensors and a simple motion planner with a compact, attention-based RL policy. In simulation, LEARN outperforms two state-of-the-art planners by 10% while using substantially fewer resources. We demonstrate LEARN’s viability on six Crazyflie quadrotors, achieving fully onboard flight in diverse indoor and outdoor environments at speeds up to 2.0m/s and traversing 0.2m gaps.
- Latent Activation Editing: Inference-Time Refinement of Learned Policies for Safer Multirobot NavigationSatyajeet Das , Darren Chiu , Zhehui Huang , Lars Lindemann , and Gaurav S. Sukhatme2025
Reinforcement learning has enabled significant progress in complex domains such as coordinating and navigating multiple quadrotors. However, even well-trained policies remain vulnerable to collisions in obstacle-rich environments. Addressing these infrequent but critical safety failures through retraining or fine-tuning is costly and risks degrading previously learned skills. Inspired by activation steering in large language models and latent editing in computer vision, we introduce a framework for inference-time Latent Activation Editing (LAE) that refines the behavior of pre-trained policies without modifying their weights or architecture. The framework operates in two stages: (i) an online classifier monitors intermediate activations to detect states associated with undesired behaviors, and (ii) an activation editing module that selectively modifies flagged activations to shift the policy towards safer regimes. In this work, we focus on improving safety in multi-quadrotor navigation. We hypothesize that amplifying a policy’s internal perception of risk can induce safer behaviors. We instantiate this idea through a latent collision world model trained to predict future pre-collision activations, thereby prompting earlier and more cautious avoidance responses. Extensive simulations and real-world Crazyflie experiments demonstrate that LAE achieves statistically significant reduction in collisions (nearly 90% fewer cumulative collisions compared to the unedited baseline) and substantially increases the fraction of collision-free trajectories, while preserving task completion. More broadly, our results establish LAE as a lightweight paradigm, feasible on resource-constrained hardware, for post-deployment refinement of learned robot policies.
- SAFE-GIL: SAFEty Guided Imitation Learning for Robotic SystemsYusuf Umut Ciftci , Darren Chiu , Zeyuan Feng , Gaurav S. Sukhatme , and Somil BansalIn 2025 IEEE International Conference on Robotics and Automation (ICRA) , 2025
Behavior cloning (BC) is a widely-used approach in imitation learning, where a robot learns a control policy by observing an expert supervisor. However, the learned policy can make errors and might lead to safety violations, which limits their utility in safety-critical robotics applications. While prior works have tried improving a BC policy via additional real or synthetic action labels, adversarial training, or runtime filtering, none of them explicitly focus on reducing the BC policy’s safety violations during training time. We propose SAFE-GIL, a design-time method to learn safety-aware behavior cloning policies. SAFE-GIL deliberately injects adversarial disturbance in the system during data collection to guide the expert towards safety-critical states. This disturbance injection simulates potential policy errors that the system might encounter during the test time. By ensuring that training more closely replicates expert behavior in safety-critical states, our approach results in safer policies despite policy errors during the test time. We further develop a reachability-based method to compute this adversarial disturbance. We compare SAFE-GIL with various behavior cloning techniques and online safety-filtering methods in three domains: autonomous ground navigation, aircraft taxiing, and aerial navigation on a quadrotor testbed. Our method demonstrates a significant reduction in safety failures, particularly in low data regimes where the likelihood of learning errors, and therefore safety violations, is higher.
2024
- Collective Bayesian Decision-Making in a Swarm of Miniaturized Robots for Surface InspectionThiemen Siemensma , Darren Chiu , Sneha Ramshanker , Radhika Nagpal , and Bahar HaghighatIn Swarm Intelligence , 2024
Robot swarms can effectively serve a variety of sensing and inspection applications. Certain inspection tasks require a binary classification decision. This work presents an experimental setup for a surface inspection task based on vibration sensing and studies a Bayesian two-outcome decision-making algorithm in a swarm of miniaturized wheeled robots. The robots are tasked with individually inspecting and collectively classifying a }}1\backslash,\backslashtext {m} \backslashtimes 1\backslash,\backslashtext {m}}}1m\texttimes1mtiled surface consisting of vibrating and non-vibrating tiles based on the majority type of tiles. The robots sense vibrations using onboard IMUs and perform collision avoidance using a set of IR sensors. We develop a simulation and optimization framework leveraging the Webots robotic simulator and a Particle Swarm Optimization (PSO) method. We consider two existing information sharing strategies and propose a new one that allows the swarm to rapidly reach accurate classification decisions. We first find optimal parameters that allow efficient sampling in simulation and then evaluate our proposed strategy against the two existing ones using 100 randomized simulation and 10 real experiments. We find that our proposed method compels the swarm to make decisions at an accelerated rate, with an improvement of up to 20.52% in mean decision time at only 0.78% loss in accuracy.
- Optimization and Evaluation of a Multi Robot Surface Inspection Task Through Particle Swarm OptimizationDarren Chiu , Radhika Nagpal , and Bahar HaghighatIn 2024 IEEE International Conference on Robotics and Automation (ICRA) , 2024
Robot swarms can be tasked with a variety of automated sensing and inspection applications in aerial, aquatic, and surface environments. In this paper, we study a simplified two-outcome surface inspection task. We task a group of robots to inspect and collectively classify a 2D surface section based on a binary pattern projected on the surface. We use a decentralized Bayesian decision-making algorithm and deploy a swarm of 3-cm sized wheeled robots to inspect a randomized black and white tiled surface section of size 1mx1m in simulation. We first describe the model parameters that characterize our simulated environment, the robot swarm, and the inspection algorithm. We then employ a noise-resistant heuristic optimization scheme based on the Particle Swarm Optimization (PSO) using a fitness evaluation that combines the swarms classification decision accuracy and decision time. We use our fitness measure definition to asses the optimized parameters through 100 randomized simulations that vary surface pattern and initial robot poses. The optimized algorithm parameters show up to 55% improvement in median of fitness evaluations against an empirically chosen parameter set.