Repository: ryanbgriffiths/ICRA2023PaperList
Branch: main
Commit: 01ed0d2981f9
Files: 2
Total size: 2.7 MB

Directory structure:
gitextract_fbkt9p15/
├── ImgRG-ICRA-2023-full-paper-list.csv
└── README.md

================================================
FILE CONTENTS
================================================

================================================
FILE: ImgRG-ICRA-2023-full-paper-list.csv
================================================
Title,Authors,Organisation,Session,Abstract
Picking up Speed: Continuous-Time Lidar-Only Odometry Using Doppler Velocity Measurements,"Yuchen Wu, David Juny Yoon, Keenan Burnett, Sören Kammel, Yi Chen, Heethesh Vhavle, Timothy Barfoot","University of Toronto,Aeva Inc,Aeva,Aeva, Inc",SLAM 1,"Frequency-Modulated Continuous-Wave (FMCW) lidar is a recently emerging technology that additionally enables per-return instantaneous relative radial velocity measurements via the Doppler effect. In this letter, we present the first continuous-time lidar-only odometry algorithm using these Doppler velocity measurements from an FMCW lidar to aid odometry in geometrically degenerate environments. We apply an existing continuous-time framework that efficiently estimates the vehicle trajectory using Gaussian process regression to compensate for motion distortion due to the scanning-while-moving nature of any mechanically actuated lidar (FMCW and non-FMCW). We evaluate our proposed algorithm on several real-world datasets, including publicly available ones and datasets we collected. Our algorithm outperforms the only existing method that also uses Doppler velocity measurements, and we study difficult conditions where including this extra information greatly improves performance. We additionally demonstrate state-of-the-art performance of lidar-only odometry with and without using Doppler velocity measurements in nominal conditions. Code for this project can be found at: https://github.com/utiasASRL/steam_icp."
Stein ICP for Uncertainty Estimation in Point Cloud Matching,"Fahira Afzal Maken, Fabio Ramos, Lionel Ott","Data,,, CSIRO,University of Sydney, NVIDIA,ETH Zurich",SLAM 1,"Quantification of uncertainty in point cloud matching is critical in many tasks such as pose estimation, sensor fusion, and grasping. Iterative closest point (ICP) is a commonly used pose estimation algorithm which provides a point estimate of the transformation between two point clouds. There are many sources of uncertainty in this process that may arise due to sensor noise, ambiguous environment, initial condition, and occlusion. However, for safety critical problems such as autonomous driving, a point estimate of the pose transformation is not sufficient as it does not provide information about the multiple solutions. Current probabilistic ICP methods usually do not capture all sources of uncertainty and may provide unreliable transformation estimates which can have a detrimental effect in state estimation or decision making tasks that use this information. In this work we propose a new algorithm to align two point clouds that can precisely estimate the uncertainty of ICP's transformation parameters. We develop a Stein variational inference framework with gradient based optimization of ICP's cost function. The method provides a non-parametric estimate of the transformation, can model complex multi-modal distributions, and can be effectively parallelized on a GPU. Experiments using 3D kinect data as well as sparse indoor/outdoor LiDAR data show that our method is capable of efficiently producing accurate pose uncertainty estimates." Direct and Sparse Deformable Tracking,"Jose Lamarca, Juan Jose Gomez Rodriguez, Juan D. Tardos, Jose M M Montiel","Apple Inc.,Universidad de Zaragoza,I,A. Universidad de Zaragoza",SLAM 1,"Deformable Monocular SLAM algorithms recover the localization of a camera in an unknown deformable environment. 
Current approaches use a template-based deformable tracking to recover the camera pose and the deformation of the map. These template-based methods use an underlying global deformation model. In this paper, we introduce a novel deformable camera tracking method with a local deformation model for each point. Each map point is defined as a single textured surfel that moves independently of the other map points. Thanks to a direct photometric error cost function, we can track the position and orientation of the surfel without an explicit global deformation model. In our experiments, we validate the proposed system and observe that our local deformation model estimates more accurately the targeted deformations of the map in both laboratory-controlled experiments and in-body scenarios undergoing quasi-isometric deformations, with changing topology or discontinuities." ASRO-DIO: Active Subspace Random Optimization Based Depth Inertial Odometry,"Jiazhao Zhang, Yijie Tang, He Wang, Kai Xu","National University of Defense Technology,Peking University",SLAM 1,"High-dimensional nonlinear state estimation is at the heart of inertial-aided navigation systems (INS). Traditional methods usually rely on good initialization and find difficulty in handling large inter-frame transformations due to fast camera motion. We opt to tackle these challenges by solving the depth inertial odometry (DIO) problem with random optimization. To address the exponentially increased amount of candidate states sampled for the high-dimensional state space, we propose a highly efficient variant of random optimization based on the idea of active subspace. Our method identifies the active dimensions which contribute the most significantly to the decrease of the cost function in each iteration, and samples candidate states only within the corresponding subspace. 
This allows us to efficiently explore the 18D state space of DIO and achieve good optimality by sampling and evaluating only thousands of candidate states. Experiments show that our method attains highly robust and accurate DIO under fast camera motions and low light conditions, without needing a slow-motion warm-up for initialization." Discrete-Continuous Smoothing and Mapping,"Kevin Doherty, Ziqi Lu, Kurran Singh, John Leonard","Massachusetts Institute of Technology,MIT",SLAM 1,"We describe a general approach for maximum a posteriori (MAP) inference in a class of discrete-continuous factor graphs commonly encountered in robotics applications. While there are openly available tools providing flexible and easy-to-use interfaces for specifying and solving inference problems formulated in terms of either discrete or continuous graphical models, at present, no similarly general tools exist enabling the same functionality for hybrid discrete-continuous problems. We aim to address this problem. In particular, we provide a library, DC-SAM, extending existing tools for inference problems defined in terms of factor graphs to the setting of discrete-continuous models. A key contribution of our work is a novel solver for efficiently recovering approximate solutions to discrete-continuous inference problems. The key insight to our approach is that while joint inference over continuous and discrete state spaces is often hard, many commonly encountered discrete-continuous problems can naturally be split into a “discrete part” and a “continuous part” that can individually be solved easily. Leveraging this structure, we optimize discrete and continuous variables in an alternating fashion. In consequence, our proposed work enables straightforward representation of and approximate inference in discrete-continuous graphical models. We also provide a method to approximate the uncertainty in estimates of both discrete and continuous variables." 
Anderson Acceleration for On-Manifold Iterated Error State Kalman Filters,"Xiang Gao, Tao Xiao, Chunge Bai, Dezhao Zhang, Fang Zhang","idriverplus.com,Beijing Idriverplus Technology Co. Ltd.,Tsinghua University,Beijing Idriverplus Technology Co., Ltd.",SLAM 1,"Iterated Extended Kalman Filter is a promising and widely-used estimator for real-time localization applications. It iterates the observation equation to find a better linearization point and, simultaneously, only maintains the state estimation in a single time to save the computation resources. Inspired by the recent development of the iterative closest point algorithm, this paper investigates an acceleration approach to the iterations in iterative error state Kalman filters (IESKFs). We show that the IESKF can be seen as a fixed point problem, and the Anderson acceleration (AA) can be elegantly applied to the iterations of IESKF since the error state naturally lies in the tangent space and does not require additional transforms. However, the tangent space of the current estimation may change during the iterations, so we should switch the tangent space to the starting point to perform Anderson acceleration. We propose the AA-IEKF and apply it to the lidar-inertial odometry (LIO) systems to estimate the ego-motion of a lidar. The experiments show that the Anderson acceleration can efficiently reduce the number of iterations in ESKF and achieve a lower computational cost." Generalized LOAM: LiDAR Odometry Estimation with Trainable Local Geometric Features,"Kohei Honda, Kenji Koide, Masashi Yokozuka, Shuji Oishi, Atsuhiko Banno","Nagoya University Graduate School,National Institute of Advanced Industrial Science and Technology,Nat. Inst. 
of Advanced Industrial Science and Technology,National Institute of Advanced Industrial Science and Technology (AIST),National Instisute of Advanced Industrial Science and Technology",SLAM 1,"This paper presents a LiDAR odometry estimation framework called Generalized LOAM. Our proposed method is generalized in that it can seamlessly fuse various local geometric shapes around points to improve the position estimation accuracy compared to the conventional LiDAR odometry and mapping (LOAM) method. To utilize continuous geometric features for LiDAR odometry estimation, we incorporate tiny neural networks into a generalized iterative closest point (GICP) algorithm. These neural networks improve the data association metric and the matching cost function using local geometric features. Experiments with the KITTI benchmark demonstrate that our proposed method reduces relative trajectory errors compared to the GICP and LOAM methods." BoW3D: Bag of Words for Real-Time Loop Closing in 3D LiDAR SLAM,"Yunge Cui, Xieyuanli Chen, Yinlong Zhang, Jiahua Dong, Qingxiao Wu, Feng Zhu","Shenyang Institute of Automation Chinese Academy of Sciences,National University of Defense Technology,Shenyang Institute of Automation, Chinese Academy of Sciences,Shenyang Institute of Automation,Chinese Academy of Scien",SLAM 1,"Loop closing is a fundamental part of simultaneous localization and mapping (SLAM) for autonomous mobile systems. In the field of visual SLAM, bag of words (BoW) has achieved great success in loop closure. The BoW features for loop searching can also be used in the subsequent 6-DoF loop correction. However, for 3D LiDAR SLAM, the state-of-the-art methods may fail to effectively recognize the loop in real time, and usually cannot correct the full 6-DoF loop pose. To address this limitation, we present a novel Bag of Words for real-time loop closing in 3D LiDAR SLAM, called BoW3D. 
Our method not only efficiently recognizes the revisited loop places, but also corrects the full 6-DoF loop pose in real time. BoW3D builds the bag of words based on the 3D LiDAR feature LinK3D, which is efficient, pose-invariant and can be used for accurate point-to-point matching. We furthermore embed our proposed method into 3D LiDAR odometry system to evaluate loop closing performance. We test our method on public dataset, and compare it against other state-of-the-art algorithms. Our BoW3D shows better performance in terms of F1 max and extended precision scores in most scenarios with superior real-time performance. It is noticeable that BoW3D takes an average of 50 ms to recognize and correct the loops on KITTI 00 (includes 4K+ 64-ray LiDAR scans), when executed on a notebook with an Intel Core i7 @2.2 GHz processor. We release the implementation of our method here: https://github.com/YungeCui/" Gaussian Mixture Midway-Merge for Object SLAM with Pose Ambiguity,"Jae Hyung Jung, Chan Gook Park",Seoul National University,SLAM 1,"In this letter, we propose a novel method to merge a Gaussian mixture on matrix Lie groups and present its application for a simultaneous localization and mapping problem with symmetric objects. The key idea is to predetermine the weighted mean called a midway point and merge Gaussian mixture components at the associated tangent space. Through this rule, the covariance matrix captures the original density more accurately, and the need for the back-projection is spared when compared to the conventional merge. We highlight the midway-merge by numerically evaluating dissimilarity metrics of density functions before and after the merge on the rotational group. Furthermore, we experimentally discover that the rotational error of symmetric objects follows heavy-tailed behavior and formulate the Gaussian sum filter to model it by a Gaussian mixture noise. The effectiveness of our approach is validated through virtual and real-world datasets." 
Design and Characterization of a 3D-Printed Pneumatically-Driven Bistable Valve with Tunable Characteristics,"Sihan Wang, Liang He, Perla Maiolino",University of Oxford,Soft Robot Applications,"Although research studies in pneumatic soft robots develop rapidly, most pneumatic actuators are still controlled by rigid valves and conventional electronics. The existence of these rigid, electronic components sacrifices the compliance and adaptability of soft robots. Current electronics-free valve designs based on soft materials are facing challenges in behaviour consistency, design flexibility, and fabrication complexity. Taking advantages of soft material 3D printing, this paper presents a new design of a bi-stable pneumatic valve, which utilises two soft, pneumatically-driven, and symmetrically-oriented conical shells with structural bistability to stabilise and regulate the airflow. The critical pressure required to operate the valve can be adjusted by changing the design features of the soft bi-stable structure. Multi-material printing simplifies the valve fabrication, enhances the flexibility in design feature optimisations, and improves the system repeatability. In this work, both a theoretical model and physical experiments are introduced to examine the relationships between the critical operating pressure and the key design features. Results with valve characteristic tuning via material stiffness changing show better effectiveness compared to the change of geometry design features (demonstrated largest tunable critical pressure range from 15.3 to 65.2 kPa and fastest response time" Design of Fully Controllable and Continuous Programmable Surface Based on Machine Learning,"Jue Wang, Jiaqi Suo, Alex Chortos","Purdue University,Gensler Baltimore,Purdue",Soft Robot Applications,"Programmable surfaces (PSs) consist of a 2D array of actuators that can deform in the third dimension, providing the ability to create continuous 3D profiles. 
Discrete PSs can be realized using an array of independent solid linear actuators. Continuous PSs consist of actuators that are mechanically coupled, providing deformation states that are more similar to real surfaces with reduced complexity of the control electronics. However, continuous PSs have been limited in size by the lack of the control systems required to take into account the complex internal coupling between actuators in the array. In this work, we computationally explore the deformation of a fully continuous PS with 81 independent actuation pixels based on ionic bending actuator. We establish a control strategy using machine learning (ML) regression models. Both forward and inverse control are achieved based on the training datasets which are derived from the finite element analysis (FEA) data of our PS. The prediction of surface deformation achieved by forward control with accuracy under 1% is 15000 times faster than FEM. And the real-time inverse control of continuous PSs that is to reproduce any arbitrary pre-defined surfaces, which possess high practical value for tactile display or human-machine interactive devices, is first proposed in the letter." On the Use of Magnets to Robustify the Motion Control of Soft Hands,"Sara Marullo, Gionata Salvietti, Domenico Prattichizzo",University of Siena,Soft Robot Applications,"In this letter, we propose a physics-based framework to exploit magnets in robotic manipulation. More specifically, we suggest equipping soft and underactuated hands with magnetic elements, which can generate a magnetic actuation able to synergistically interact with tendon-driven and pneumatic actuations, engendering a complementarity that enriches the capabilities of the actuation system. Magnetic elements can act as additional Degrees of Actuation (DoAs), robustifying the motion control of the device and augmenting the hand manipulation capabilities. 
We investigate the interaction of a soft hand with itself for enriching possible hand shaping, and the interaction of the hand with the environment for enriching possible grasping capabilities. Physics laws and notions reported in the manuscript can be used as a guidance for DoAs augmentation and can provide tools for the design of novel soft hands." Kinegami: Algorithmic Design of Compliant Kinematic Chains from Tubular Origami,"Wei-Hsi Chen, Woohyeok Yang, Lucien Peach, Daniel Koditschek, Cynthia Sung",University of Pennsylvania,Soft Robot Applications,"Origami processes can generate both rigid and compliant structures from the same homogeneous sheet material. We advance the origami robotics literature by showing that it is possible to construct an arbitrary rigid kinematic chain with prescribed joint compliance from a single tubular sheet. Our ""Kinegami"" algorithm converts a Denavit-Hartenberg specification into a single-sheet crease pattern for an equivalent serial robot mechanism by composing origami modules from a catalogue. The algorithm arises from the key observation that tubular origami linkage design reduces to a Dubins path planning problem. The automatically generated structural connections and movable joints that realize the specified design can also be endowed with independent user-specified compliance. We apply the Kinegami algorithm to a number of common robot mechanisms and hand-fold their algorithmically generated single-sheet crease patterns into functioning kinematic chains. We believe this is the first completely automated end-to-end system for converting an abstract manipulator specification into a physically realizable origami design that requires no additional human input." Entrainment During Human Locomotion Using a Lightweight Soft Robotic Hip Exosuit (SR-HExo),"Lily C. 
Baye-wallace, Carly Thalman, Hyunglae Lee","Southwest Research Institute; Arizona State University,Arizona State University",Soft Robot Applications,"A gait entrainment study was conducted using a lightweight soft robotic hip exosuit (SR-HExo) that can apply perturbations at the hip joint during treadmill walking. Periodic perturbations were applied by flat fabric Pneumatic Artificial Muscle actuators starting at a subject’s preferred gait frequency and increasing up to 15% higher in 3% increments. Anterior hip flexion perturbations and posterior hip extension perturbations were tested in two separate experiments. All 11 healthy participants showed successful entrainment in all 12 experimental conditions (i.e., from preferred gait frequency to 15% higher in both flexion and extension perturbation directions). This study confirmed that there exists a single stable point attractor during gait entrainment to unilateral, unidirectional hip perturbations, which is consistent with previous ankle studies. Phase-locking was consistently observed around toe-off phase of the gait cycle (GC). Group averaged results showed gait synchronization with extension perturbations occurred earlier in the gait cycle (around 50% GC where the hip angle reaches maximum extension) than with flexion perturbations (just after 60% GC where the transition from maximum hip extension towards hip flexion occurs). Other gait entrainment characteristics (success rate of entrainment, basin of entrainment, and transient response) observed in this study posits the potential of the SR-HExo for entrainment-based gait training in rehabilitation contexts." SOPHIE: SOft and Flexible Aerial Vehicle for PHysical Interaction with the Environment,"Fernando Ruiz Vincueria, Begoña C. 
Arrue, Aníbal Ollero","UNIVERSIDAD DE SEVILLA,Universidad de Sevilla,University of Seville",Soft Robot Applications,"This paper presents the first design of a soft, 3D-printed in flexible filament, lightweight UAV, capable of performing full-body perching using soft tendons, specifically landing and stabilizing on pipelines and irregular surfaces without the need for an auxiliary system. The flexibility of the UAV can be controlled during the additive manufacturing process by adjusting the infill rate distribution. However, the increase in flexibility implies difficulties in controlling the UAV, as well as structural, aerodynamic, and aeroelastic effects. This article provides insight into the dynamics of the system and validates the flyability of the vehicle for densities as low as 6%. Within this range, quasi-static arm deformations can be considered, thus the autopilot is fed back through a static arm deflection model. At lower densities, strong non-linear elastic dynamics appear, which translates to complex modeling, and it is suggested to switch to data-based approaches." A Tensegrity-Based Inchworm-Like Robot for Crawling in Pipes with Varying Diameters,"Yixiang Liu, Xiaolin Dai, Zhe Wang, Qing Bi, Rui Song, Jie Zhao, Yibin Li","Shandong University,Volvo Construction Equipment Technology (China) Co., Ltd,shandong university,Harbin Institute of Technology",Soft Robot Applications,"Most current in-pipe robots are usually designed for pipes of a specific size. In this paper, we propose a novel inchworm-like in-pipe robot based on the concept of tensegrity for moving in pipes with varying diameters. Firstly, a tensegrity-based robotic module capable of two kinds of shape change is designed. One kind is extension in the axial direction accompanied by contraction in the radial direction, which is the basis for the wave-like crawling movement of the in-pipe robot. 
The other kind is expansion in the radial direction while keeping changeless in the axial direction, enabling the module adaptable to pipes with different diameters. Then, the geometrical equilibrium configuration of the tensegrity module is determined, followed by kinematic analysis using force density method. By cascading three modules, the in-pipe crawling robot is developed. Finally, a series of experiments are performed to test the shape changeability and friction force of the tensegrity module, and the mobility, load capacity, and adaptability of the in-pipe robot. The results validate that the robot can crawl in horizontal pipes, vertical pipes, and elbow pipes under the control of a simple actuation sequence. Furthermore, the robot has the abilities to adapt to pipes with different diameters varying from 100 mm to 180 mm. It is suggested that the usage of tensegrity structures brings about higher adaptability, flexibility, and mobility to the in-pipe crawling robot." Untethered Robotic Millipede Driven by Low-Pressure Microfluidic Actuators for Multi-Terrain Exploration,"Qi Shao, Xuguang Dong, Zhonghan Lin, Chao Tang, Hao Sun, Xin-Jun Liu, Huichan Zhao",Tsinghua University,Soft Robot Applications,"Mobile robots that can adapt to an extensive range of terrains play essential roles in many applications. Millipedes are one of the most terrain-adaptive creatures in nature due to their multi-legged locomotion and flexible body. Inspired by natural millipedes, we report an untethered robotic millipede with a 6-segments soft-rigid hybrid body that can actively bend and 24 legs driven by low-pressure microfluidic actuators. The 24 microfluidic actuators are driven by two independent low-pressure sources from miniature pumps, which allows the untethered locomotion of the robotic millipede in small size (length, 23 cm; width, 5 cm; height, 4 cm) and lightweight (150 g). 
Using a pre-defined gait for the multi-legs, the robotic millipede can locomote with a maximum speed of 30.96 cm/min (1.35 body length per minute) and a minimum turning radius of 15 cm (0.65 body length). Experiments also demonstrated that the robot was able to locomote effectively in various uneven terrains. Utilizing its passive or active mode of its flexible body, the robot could also achieve adaptive moves. The robotic millipede has the potential to perform a variety of environment exploration tasks by remotely controlling and transmitting real time images wirelessly." FEA-Based Soft Robotic Modeling: Simulating a Soft-Actuator in SOFA,"Pasquale Ferrentino, Ellen Roels, Joost Brancart, Seppe Terryn, Guy Van Assche, Bram Vanderborght","Vrije Universiteit Brussels,Vrije Universiteit Brussel,Vrije Universiteit Brussel (VUB)",Soft Robot Applications,"Soft robotics modeling is a research topic that is evolving fast. Many techniques are present in literature but most of them require analytical models with a lot of equations that are time-consuming, hard to resolve, and not so easy to handle. For this reason, the help of a soft mechanics simulator is essential in this field. In fact, this paper presents a tutorial on how to build a soft-robot model using an open-source Finite Element Analysis (FEA) simulator, called SOFA. This software is able to generate a simulation scene from a code written in Python or XML, so it can be used by people that with different fields of competence like mechanical knowledge, knowledge of material properties and programming skills. As a case study, a Python simulation of a cable-driven soft actuator that makes contact with a rigid object is considered. The basic working principles of SOFA required to make a scene are explained step by step. In particular, it shows how to simulate the mechanics and animate the bending behavior of the actuator. 
Furthermore, it will be shown also how to retrieve and save data from simulation, demonstrating that SOFA can easily adapt to a multi-disciplinary subject as the research in soft-robotics, but also be useful for teaching simulation and programming language principles to engineering students." Inflated Bendable Eversion Cantilever Mechanism with Inner Skeleton for Increased Stiffness,"Tomoya Takahashi, Masahiro Watanabe, Kazuki Abe, Kenjiro Tadakuma, Naoto Saiki, Masashi Konyo, Satoshi Tadokoro",Tohoku University,Soft Robot Applications,"Inflatable structures used in soft robotics applications have unique characteristics. In particular, the tip-extension structure, which extends the structure from its tip, can grow without creating friction with the environment. However, these inflatable structures need high pressure to maintain their stiffness under various conditions. Excessive inner pressure limits their application in that it prevents the structure from maintaining its curved shape and from complying with specifications. This study aimed to simultaneously lower the pressure and increase the rigidity of the structure. Our work resulted in the proposal of a mechanism that combines a skeleton structure consisting of multi-joint links with functions to increase the rigidity. Insertion of this mechanism into an inflatable structure obviates the need for high inner pressure, yet enables the structure to bend and maintain the intended shape. We devised a design based on rigid articulated links and combined it with a membrane structure that utilizes the advantages of the tip-extension structure. The experimental results show that the payload of the structure designed to operate at low pressure increases compared to that of the membrane-only structure. The findings of this research can be applied to long robots that can be extended into open space without drooping and to mechanisms that enable structures to wrap around the human body." 
Energy-Based Design Optimization of a Miniature Wave-Like Robot Inside Curved Compliant Tubes,"Rotem Katz, Dan Shachaf, David Zarrouk","Ben Gurion University of the Negev,BGU,Ben Gurion University",Design of Mechanisms,This paper analyzes the crawling locomotion of a wave-like robot in curved tubes. We use an energy-based approach to determine the optimal crawling orientation of the robot that minimizes the surface energy while advancing. The results showed that the robot rotated its body along the roll direction so that the wave motion would be in the same plane as the curvature plane of the tube. The incorporation of a passive bending joint along the plane of the wave motion decreased the surface energy and enhanced the robot’s ability to advance in even tighter curves. Given these findings we designed and manufactured two new robots with either one or two passive bending joints. We molded custom flexible surfaces and tubes and experimentally tested our robots in them. These validating experiments indicated that the bending joints substantially improved the robots’ ability to traverse curved tubes (see video). A Palm-Sized Omnidirectional Mobile Robot Driven by 2-DOF Torus Wheels,"Yunosuke Sato, Ayato Kanada, Tomoaki Mashimo","Toyohashi University of Technology,Kyushu University,Okayama University",Design of Mechanisms,"This paper proposes a palm-sized omnidirectional mobile robot with two torus wheels. A single torus wheel is made of an elastic elongated coil spring in which the two ends of the coil connected each other and is driven by a piezoelectric actuator (stator) that can generate 2-degrees-of-freedom (axial and angular) motions. The stator converts its thrust force and torque into longitudinal and meridian motions of the torus wheel, respectively, making the torus work as an omnidirectional wheel on a plane. 
In this paper, we build a control system of a piezo-driven 2-degrees-of-freedom torus wheel and evaluate its performance measures, such as the transient characteristics, the orientation accuracy and the payload capacity. An omnidirectional robot with the two torus wheels is constructed, and the feedback control for a desired planar motion is demonstrated. The design inspired by a ring torus represents the possibility toward the creation of an unprecedentedly simple, light, and compact 2-wheel omnidirectional robot." Flipper-Style Locomotion through Strong Expanding Modular Robots,"Lillian Chin, Max Burns, Gregory Xie, Daniela Rus","Massachusetts Institute of Technology,MIT",Design of Mechanisms,"Modular robotic units that can change their size at will presents an exciting pathway for modular robotics. However, current attempts have been relatively limited, requiring tethers, complex fabrication or slow cycle times. In this work, we present AuxBots: an auxetic-based approach to create high force, fast cycle time self-contained modules. By driving the auxetic shell's inherent mathematical expansion with a motor and leadscrew, these robots are capable of expanding their volume by 274% in 0.7 seconds with a maximum strength to weight ratio of 76x. These force and expansion properties enable us to use these modules in conjunction with flexible wire constraints to get shape changing behavior and independent locomotion. We demonstrate the power of this modular system by using a limited number of AuxBots to mimic the flipper-style locomotion of mudskippers and sea turtles. These structures are entirely untethered and can still move forward even as some AuxBots stall out, achieving the key modular robotics goal of versatility and robustness." 
Simplified Configuration Design of Anthropomorphic Hand Imitating Specific Human Hand Grasps,"Xinyang Tian, Qiang Zhan, Yin Zhang, Junyi Zou, Lingxiao Jiang, Qinhuan Xu","Beihang university,Beihang University",Design of Mechanisms,"How to design an anthropomorphic hand imitating specific human hand grasps with as few actuators as possible is still a challenge. This paper presents a method for obtaining a simplified configuration of anthropomorphic hand imitating specific human hand grasps based on the motion analyses of the human hand. A participation matrix which characterizes a human hand grasp on joint motion level is constructed according to the motion participation of each finger joint. By adding all participation matrices of expected human hand grasps together a total participation matrix can be derived, and through mathematical processing a simplified anthropomorphic hand configuration can be obtained. Following the proposed method, a simplified anthropomorphic hand configuration that imitates six basic human hand grasps was obtained. A series of grasp experiments with the anthropomorphic hand prototype were conducted to validate the grasping capability as well as the proposed simplified configuration design method. This method can help to obtain a reasonably simplified configuration of an anthropomorphic hand when expected human hand grasps are definite." Meta Reinforcement Learning for Optimal Design of Legged Robots,"Alvaro Belmonte-baeza, Joonho Lee, Giorgio Valsecchi, Marco Hutter","University of Alicante,ETH Zurich Robotic Systems Laboratory,Robotic System Lab, ETH,ETH Zurich",Design of Mechanisms,"The process of robot design is a complex task and the majority of design decisions are still based on human intuition or tedious manual tuning. A more informed way of facing this task is computational design methods where design parameters are concurrently optimized with corresponding controllers. 
Existing approaches, however, are strongly influenced by predefined control rules or motion templates and cannot provide end-to-end solutions. In this paper, we present a design optimization framework using model-free meta reinforcement learning, and its application to optimizing the kinematics and actuator parameters of quadrupedal robots. We use meta reinforcement learning to train a locomotion policy that can quickly adapt to different designs. This policy is used to evaluate each design instance during the design optimization. We demonstrate that the policy can control robots of different designs to track random velocity commands over various rough terrains. With controlled experiments, we show that the meta policy achieves close-to-optimal performance for each design instance after adaptation. Lastly, we compare our results against a model-based baseline and show that our approach allows higher performance while not being constrained by predefined motions or gait patterns." Advanced 2-DOF Counterbalance Mechanism Based on Gear Units and Springs to Minimize Required Torques of Robot Arm,"Hwi-Su Kim, Jongwoo Park, Myeongsu Bae, Dongil Park, Chanhun Park, Hyunmin Do, Taeyong Choi, Doo-hyeong Kim, Jinho Kyung","Korea Institute of Machinery & Materials,Korea Institute of Machinery & Materials,Dyence tech,Korea Institute of Machinery and Materials (KIMM),KIMM,Korea Institute of Machinery and Materials,Korea Institute of Machinery & Materials (KIMM)",Design of Mechanisms,"In recent years, human-robot cooperation has enhanced productivity and achieved high payload, speed, and accuracy. Integrating typical industrial robots in human-robot cooperation is challenging because their arms may cause serious injuries to humans during a collision due to malfunction or errors due to robot operators. 
Therefore, counterbalance robot arms that are capable of counterbalancing the gravitational torques due to the robot mass have been developed to decrease the required capacity of the motors and speeds of these robots. In this research, we propose an advanced counterbalance mechanism using gear units and springs to improve the durability and reliability compared to the previously proposed wire-based counterbalance mechanism, which is difficult to apply to a commercialized product because it can easily be broken or stretched when an excessive force is applied for a long period. Moreover, our proposed method was extended to a multi-DOF system using a parallelogram mechanism based on a timing belt and pulleys to achieve multi-DOF robotic arms. A 2-DOF counterbalanced arm was designed to verify the effectiveness of the proposed mechanism. The simulations and experimental results showed that the proposed mechanism effectively reduced the gravitational torques of each joint of the multi-DOF arm." Permanent-Magnetically Amplified Robotic Gripper with Less Clamping Width Influence on Compensation Realized by a Stepless Width Adjustment Mechanism,"Tori Shimizu, Kenjiro Tadakuma, Masahiro Watanabe, Kazuki Abe, Masashi Konyo, Satoshi Tadokoro",Tohoku University,Design of Mechanisms,"Machines such as robotic grippers use powerful actuators or gearboxes to exert large loads at the expense of energy consumption, volume, and mass. We propose a stepless force amplification mechanism that assists clamping by a pair of permanent magnets, in which the external control force required to adjust their distance, and thus the output force, is suppressed by compensation springs. For further sophistication, we invented a new width adjuster using a lever. By separating the actuation of fingers and compensated magnets temporarily, the adjuster eliminated the nonlinear influence of the object width on the clamping force. 
The prototype gripper for proof of concept revealed that the adjuster successfully linearized the width-force characteristic with an inclination of 0.15 N/mm, which is sufficiently insignificant compared to the major output force of approximately 50 N. The force amplification effect coexisted with this phenomenon, such that the clamping force was amplified to 137.5% while maintaining the energy consumption of a DC motor, and the force-energy efficiency was multiplied by 1.39. Thus, able to be driven by a weaker, smaller, and lighter actuator, the gripper contributes to extension of the operation time of robots with limited power supply." Design of a New Bio-Inspired Dual-Axis Compliant Micromanipulator with Millimeter Strokes,"Zekui Lyu, Qingsong Xu",University of Macau,Design of Mechanisms,"This paper proposes the concept design of a novel bio-inspired dual-axis compliant micromanipulator with millimeter working strokes dedicated to fiber alignment. It subtly mimics the gripping and rubbing function of the human hand consisting of the forefinger, purlicue, and thumb. Compared with traditional dual-axis grippers, its advantages lie in millimeter-level stroke, bi-directional rotation, less slippage, and comprehensive force sensing. To achieve dexterous and reliable manipulation, a two-degree-of-freedom (2-DOF) flexible decoupling mechanism and a displacement reversing mechanism based on the leaf-shaped flexible hinge are introduced. A prototype driven by two voice coil motors is fabricated for experimental testing. Three high-precision strain gauges with temperature compensation are glued on the sensitive region to measure the gripping force and rubbing force. Experimental results show that the gripping and rubbing strokes of the manipulator are up to 2.3 mm and 2.1 mm, respectively. 
For a custom-made fiber flag with a diameter of 200 µm, a rotation stroke of more than 1000 degrees has been achieved, which cannot be realized by previous work with the same level of compact mechanism design." Optimal Elastic Wing for Flapping-Wing Robots through Passive Morphing,"Cristina Ruiz Paez, Jose Angel Acosta, Aníbal Ollero",University of Seville,Design of Mechanisms,"Flapping wing robots show promise as platforms for safe and efficient flight in near-human operations, thanks to their ability to maneuver with agility or perch at a low Reynolds number. The growing trend in the automatization of these robots has to go hand in hand with an increase in the payload capacity. This work provides a new passive morphing wing prototype to increase the payload of this type of UAV. The prototype is based on a biased elastic joint and the holistic research also includes the modelling, simulation and optimization scheme, thus allowing the prototype to be adapted to any flapping wing robot. This model has been validated through flight experiments on the available platform, and it has also been demonstrated that the morphing prototype can increase the lift of the robot under study by up to 16% in real flight and reduce the estimated consumption by 10%." Robust Multi-Robot Trajectory Optimization Using Alternating Direction Method of Multiplier,"Ruiqi Ni, Zherong Pan, Xifeng Gao","Florida State University,Tencent America",Planning,"We propose a variant of alternating direction method of multiplier (ADMM) to solve constrained trajectory optimization problems. Our ADMM framework breaks a joint optimization into small sub-problems, leading to a low iteration cost and decentralized parameter updates. Starting from a collision-free initial trajectory, our method inherits the theoretical properties of primal interior point method (P-IPM), i.e., guaranteed collision avoidance and homotopy preservation throughout optimization, while being orders of magnitude faster. 
We have analyzed the convergence and evaluated our method for time-optimal multi-UAV trajectory optimizations and simultaneous goal-reaching of multiple robot arms, where we take into consideration kinematic limits, dynamic limits, and homotopy-preserving collision constraints. Our method achieves an order-of-magnitude speedup while generating trajectories of quality comparable to the state-of-the-art P-IPM solver." Autonomous Exploration in a Cluttered Environment for a Mobile Robot with 2D-Map Segmentation and Object Detection,"Hyung Seok Kim, Hyeongjin Kim, Seon-il Lee, Hyeonbeom Lee",Kyungpook National University,Planning,"Frontier-based exploration is widely adopted for exploring an unknown region. The conventional frontier-based exploration for a mobile robot may collide with three-dimensional (3D) obstacles or can suffer from a slower exploration time because the robot may move to another place before completely exploring the current area. To solve this problem, in this paper, we propose a new exploration algorithm by considering a path traveled by a mobile robot and segmenting a two-dimensional (2D) map. The segmented 2D map is generated in real-time by using the position of the robot and the location of the detected frontiers. To apply our algorithm to the actual experiment, we develop an object detection-based exploration algorithm that can remarkably reduce the probability of collision with 3D obstacles. To verify the effectiveness of our proposed algorithm, we perform simulations (Gazebo) and experiments (in the real world) to compare the conventional approach and our algorithm in a cluttered environment. The simulation and experiment results show that our algorithm can satisfactorily shorten the exploration path and time." 
Distributionally Safe Path Planning: Wasserstein Safe RRT,"Paul Lathrop, Beth Boardman, Sonia Martinez","University of California, San Diego,Los Alamos National Laboratory,UC San Diego",Planning,"In this paper, we propose a Wasserstein metric-based random path planning algorithm. Wasserstein Safe RRT (W-Safe RRT) provides finite-sample probabilistic guarantees on the safety of a returned path in an uncertain obstacle environment. Vehicle and obstacle states are modeled as distributions based upon state and model observations. We define limits on distributional sampling error so the Wasserstein distance between a vehicle state distribution and obstacle distributions can be bounded. This enables the algorithm to return safe paths with a confidence bound through combining finite sampling error bounds with calculations of the Wasserstein distance between discrete distributions. W-Safe RRT is compared against a baseline minimum encompassing ball algorithm, which ensures balls that minimally encompass discrete state and obstacle distributions do not overlap. The improved performance is verified in a 3D environment using single, multi, and rotating non-convex obstacle cases, with and without forced obstacle error in adversarial directions, showing that W-Safe RRT can handle poorly modeled complex environments." Sim2Real Learning of Obstacle Avoidance for Robotic Manipulators in Uncertain Environments,"Tan Zhang, Kefang Zhang, Jiatao Lin, Wing-yue Geoffrey Louie, Hui Huang","Shenzhen Technology University,Shenzhen University,Oakland University",Planning,"Obstacle avoidance for robotic manipulators can be challenging when they operate in unstructured environments. This problem is probed with sim-to-real (sim2real) deep reinforcement learning, such that a moving policy of the robotic arm is learnt in a simulator and then adapted to the real world. However, the problem of sim2real adaptation is notoriously difficult. 
To this end, this work proposes (1) a unified representation of obstacles and targets to capture the underlying dynamics of the environment while allowing generalization to unseen goals and (2) a flexible end-to-end model combining the unified representation with the deep reinforcement learning control module that can be trained by interacting with the environment. Such a representation is agnostic to the shape and appearance of the underlying objects, which simplifies and unifies the scene representation in both simulated and real worlds. We implement this idea with a vision-based actor-critic framework by devising a bounding box predictor module. The predictor estimates the 3D bounding boxes of obstacles and targets from the RGB-D input. The features extracted by the predictor are fed into the policy network, and all the modules are jointly trained. Our experiments in simulated environment and the real-world show that the end-to-end model of the unified representation achieves better sim2real adaption and scene generalization than state-of-the-art techniques." Bidirectional Sampling-Based Motion Planning without Two-Point Boundary Value Solution,"Sharan Nayak, Michael W. Otte","University of Maryland, College Park,University of Maryland",Planning,"Bidirectional path and motion planning approaches decrease planning time, on average, compared to their unidirectional counterparts. In single-query feasible motion planning, using bidirectional search to find a continuous motion plan requires an edge connection between the forward and the reverse search tree. Such a tree–tree connection requires solving a two-point boundary value problem (BVP). However, obtaining a closed-form two-point BVP solution can be difficult or impossible for many systems. While numerical methods can provide a reasonable solution in many cases, they are often computationally expensive or numerically unstable for the purposes of single-query sampling-based motion planning. 
To overcome this challenge, we present a novel bidirectional search strategy that does not require solving the two-point BVP. Instead of connecting the forward and reverse trees directly, the reverse tree’s cost information is used as a guiding heuristic for forward search. This enables the forward search to quickly grow down the reverse tree—converging to a fully feasible solution without the solution to a two-point BVP. In this article, we propose two algorithms that use this strategy for single-query feasible motion planning for various dynamical systems, performing experiments in both simulation and hardware testbeds. We find that these algorithms perform better than or comparable to existing state-of-the-art methods with respect to quickly finding an initial feasible solution." Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly,"Valentin Hartmann, Andreas Orthey, Danny Driess, Ozgur S. Oguz, Marc Toussaint","University of Stuttgart,TU Berlin,Bilkent University",Planning,"Robotic construction assembly planning aims to find feasible assembly sequences as well as the corresponding robot-paths and can be seen as a special case of task and motion planning (TAMP). As construction assembly can well be parallelized, it is desirable to plan for multiple robots acting concurrently. Solving TAMP instances with many robots and over a long time-horizon is challenging due to coordination constraints, and the difficulty of choosing the right task assignment. We present a planning system which enables parallelization of complex task and motion planning problems by iteratively solving smaller subproblems. Combining optimization methods to jointly solve for manipulation constraints with a sampling-based bi-directional space-time path planner enables us to plan cooperative multi-robot manipulation with unknown arrival-times. Thus, our solver allows for completing subproblems and tasks with differing timescales and synchronizes them effectively. 
We demonstrate the approach on multiple construction case-studies to show the robustness over long planning horizons and scalability to many objects and agents." A Reachability-Based Spatio-Temporal Sampling Strategy for Kinodynamic Motion Planning,"Yongxing Tang, Zhanxia Zhu, Hongwen Zhang","Northwestern Polytechnical University,Zhejiang Lab",Planning,"By limiting the planning domain to the “L2 Informed Set”, some sampling-based motion planners (SBMPs) (e.g. Informed RRT*, BIT*) can solve the geometric motion planning problems efficiently. However, the construction of the informed set (IS) becomes very challenging when further differential constraints are considered. For the time-optimal kinodynamic motion planning problem, this paper defines a modified time informed set (MTIS) to limit the planning domain. Drawing inspiration from Hamilton-Jacobi-Bellman (HJB) reachability analysis, MTIS, compared with the original TIS, can not only help save the running time of SBMPs, but also extend the applicable scope from linear systems to polynomial nonlinear systems with control constraints. On this basis, a spatio-temporal sampling strategy adapted to MTIS is proposed. Firstly, MTIS is used to estimate the optimal cost and the valid tree structure is reused, so that we do not need to provide a solution trajectory in advance. Secondly, this strategy is generic, allowing it to be combined with common SBMPs (such as SST) to accelerate convergence and reduce memory complexity. Several simulation experiments also demonstrate the effectiveness of the proposed method." 
Efficient Speed Planning for Autonomous Driving in Dynamic Environment with Interaction Point Model,"Yingbing Chen, Ren Xin, Jie Cheng, Qingwen Zhang, Xiaodong Mei, Ming Liu, Lujia Wang","The Hong Kong University of Science and Technology,the Hong Kong University of Science and Technology,Hong Kong University of Science and Technology,KTH Royal Institute of Technology,HKUST,The Hong Kong University of Technology",Planning,"Safely interacting with other traffic participants is one of the core requirements for autonomous driving, especially in intersections and occlusions. Most existing approaches are designed for particular scenarios and require significant human labor in parameter tuning to be applied to different situations. To solve this problem, we first propose a learning-based Interaction Point Model (IPM), which describes the interaction between agents with the protection time and interaction priority in a unified manner. We further integrate the proposed IPM into a novel planning framework, demonstrating its effectiveness and robustness through comprehensive simulations in highly dynamic environments." Efficient Anytime CLF Reactive Planning System for a Bipedal Robot on Undulating Terrain,"Bruce Jk Huang, J.W Grizzle",University of Michigan,Planning,"We propose and experimentally demonstrate a reactive planning system for bipedal robots on unexplored, challenging terrain. The system includes: a multi-layer local map for assessing traversability; an anytime omnidirectional Control Lyapunov Function (CLF) for use with a Rapidly Exploring Random Tree Star (RRT*) that generates a vector field for specifying motion between nodes; a sub-goal finder when the final goal is outside of the current map; and a finite-state machine to handle high-level mission decisions. The system also includes a reactive thread that copes with robot deviations via a vector field, defined by a closed-loop feedback policy. 
The vector field provides real-time control commands to the robot's gait controller as a function of instantaneous robot pose. The system is evaluated on various challenging outdoor terrains and cluttered indoor scenes in both simulation and experiment on Cassie Blue, a bipedal robot with 20 degrees of freedom. All implementations are coded in C++ with the Robot Operating System (ROS) and are available at https://github.com/UMich-BipedLab/CLF_reactive_planning_system." A Framework to Co-Optimize Robot Exploration and Task Planning in Unknown Environments,"Yuanfan Xu, Zhaoliang Zhang, Yu Jincheng, Yuan Shen, Yu Wang",Tsinghua University,Planning,"Robots often need to accomplish complex tasks in unknown environments, which is a challenging problem, involving autonomous exploration for acquiring necessary scene knowledge and task planning. In traditional approaches, the agent first explores the environment to instantiate a complete planning domain and then invokes a symbolic planner to plan and perform high-level actions. However, task execution is inefficient since the two processes involve many repetitive states and actions. Hence, this paper proposes a framework to co-optimize robot exploration and task planning in unknown environments. To afford robot exploration and symbolic planning not being independent and separated, we design a unified structure named subtask, which is exploited to decompose the robot exploration and planning phases. To select the appropriate subtask each time, we develop a value function and a value-based scheduler to co-optimize exploration and task processing. Our framework is evaluated in a photo-realistic simulator with three complex household tasks, increasing task efficiency by 25%-29%." 
Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA,"Yuki Kadokawa, Yoshihisa Tsurumine, Takamitsu Matsubara",Nara Institute of Science and Technology,Reinforcement Learning,"This paper explores a Deep Reinforcement Learning (DRL) approach for designing image-based control for edge robots to be implemented on Field Programmable Gate Arrays (FPGAs). Although FPGAs are more power-efficient than CPUs and GPUs, a typical (DRL) method cannot be applied since they are composed of many Logic Blocks (LBs) for high-speed logical operations but low-speed real-number operations. To cope with this problem, we propose a novel DRL algorithm called Binarized P-Network (BPN), which learns image-input control policies using Binarized Convolutional Neural Networks (BCNNs). To alleviate the instability of reinforcement learning caused by a BCNN with low function approximation accuracy, our BPN adopts a robust value update scheme called Conservative Value Iteration, which is tolerant of function approximation errors. We confirmed the BPN's effectiveness through applications to a visual tracking task in simulation and real-robot experiments with FPGA." Automating Reinforcement Learning with Example-Based Resets,"Jigang Kim, J. Hyeon Park, Daesol Cho, H. Jin Kim",Seoul National University,Reinforcement Learning,"Deep reinforcement learning has enabled robots to learn motor skills from environmental interactions with minimal to no prior knowledge. However, existing reinforcement learning algorithms assume an episodic setting, in which the agent resets to a fixed initial state distribution at the end of each episode, to successfully train the agents from repeated trials. Such reset mechanism, while trivial for simulated tasks, can be challenging to provide for real-world robotics tasks. Resets in robotic systems often require extensive human supervision and task-specific workarounds, which contradicts the goal of autonomous robot learning. 
In this paper, we propose an extension to conventional reinforcement learning towards greater autonomy by introducing an additional agent that learns to reset in a self-supervised manner. The reset agent preemptively triggers a reset to prevent manual resets and implicitly imposes a curriculum for the forward agent. We apply our method to learn from scratch on a suite of simulated and real-world continuous control tasks and demonstrate that the reset agent successfully learns to reduce manual resets whilst also allowing the forward policy to improve gradually over time." Improving the Robustness of Reinforcement Learning Policies with L1 Adaptive Control,"Yikun Cheng, Pan Zhao, Fanxin Wang, Daniel Block, Naira Hovakimyan","University of Illinois at Urbana-Champaign,University of Illinois Urbana-Champaign,University of Illinois",Reinforcement Learning,"A reinforcement learning (RL) control policy could fail in a new/perturbed environment that is different from the training environment, due to the presence of dynamics variations. For controlling systems with continuous state and action spaces, we propose an add-on approach to robustifying a pre-trained RL policy by augmenting it with an L1 adaptive controller (L1AC). Leveraging the capability of an L1AC for fast estimation and active compensation of dynamic variations, the proposed approach can improve the robustness of an RL policy that is trained either in a simulator or in the real world without consideration of a broad class of dynamics variations. Numerical and real-world experiments empirically demonstrate the efficacy of the proposed approach in robustifying RL policies trained using both model-free and model-based methods." 
Developing Cooperative Policies for Multi-Stage Reinforcement Learning Tasks,"Jordan Erskine, Christopher Lehnert",Queensland University of Technology,Reinforcement Learning,"Many hierarchical reinforcement learning algorithms utilise a series of independent skills as a basis to solve tasks at a higher level of reasoning. These algorithms don't consider the value of using skills that are cooperative instead of independent. This paper proposes the Cooperative Consecutive Policies (CCP) method of enabling consecutive agents to cooperatively solve long time horizon multi-stage tasks. This method is achieved by modifying the policy of each agent to maximise both the current and next agent's critic. Cooperatively maximising critics allows each agent to take actions that are beneficial for its task as well as subsequent tasks. Using this method in a multi-room maze domain and a peg in hole manipulation domain, the cooperative policies were able to outperform a set of naive policies, a single agent trained across the entire domain, as well as another sequential HRL algorithm." Learning Performance Graphs from Demonstrations Via Task-Based Evaluations,"Aniruddh Gopinath Puranic, Jyotirmoy Deshmukh, Stefanos Nikolaidis","University of Southern California,UNIVERSITY OF SOUTHERN CALIFORNIA",Reinforcement Learning,"In the paradigm of robot learning-from-demonstrations (LfD), understanding and evaluating the demonstrated behaviors plays a critical role in extracting control policies for robots. Without this knowledge, a robot may infer incorrect reward functions that lead to undesirable or unsafe control policies. Prior work has used temporal logic specifications, manually ranked by human experts based on their importance, to learn reward functions from imperfect/suboptimal demonstrations. To overcome reliance on expert rankings, we propose a novel algorithm that learns from demonstrations, a partial ordering of provided specifications in the form of a performance graph. 
Through various experiments, including simulation of industrial mobile robots, we show that extracting reward functions with the learned graph results in robot policies similar to those generated with the manually specified orderings. We also show in a user study that the learned orderings match the orderings or rankings by participants for demonstrations in a simulated driving domain. These results show that we can accurately evaluate demonstrations with respect to provided task specifications from a small set of imperfect data with minimal expert input." Tumbling Robot Control Using Reinforcement Learning,"Andrew Schwartzwald, Matthew Tlachac, Luis Guzman, Athanasios Bacharis, Nikos Papanikolopoulos","CSE, UMN,CSE, University of Minnesota,University of Minnesota",Reinforcement Learning,"Tumbling robots are simple platforms that are able to traverse large obstacles relative to their size, at the cost of being difficult to control. Existing control methods apply only a subset of possible robot motions and make the assumption of flat terrain. Reinforcement learning allows for the development of sophisticated control schemes that can adapt to diverse environments. By utilizing domain randomization while training in simulation, a robust control policy can be learned which transfers well to the real world. In this paper, we implement autonomous setpoint navigation on a tumbling robot prototype and evaluate it on flat and uneven terrain. The flexibility of our system demonstrates the viability of nontraditional robots for navigational tasks." Guided Reinforcement Learning – a Review and Evaluation for Efficient and Effective Real-World Robotics,"Julian Eßer, Nicolas Bach, Christian Jestel, Oliver Urbann, Sören Kerner",Fraunhofer IML,Reinforcement Learning,"Recent successes aside, reinforcement learning still faces significant challenges in its application to the real-world robotics domain. 
Guiding the learning process with additional knowledge offers a potential solution, thus leveraging the strengths of data- and knowledge-driven approaches. However, this field of research encompasses several disciplines and hence would benefit from a structured overview. In this paper, we propose the concept of guided reinforcement learning that provides a systematic approach towards accelerating the training process and improving the performance for real-world robotic settings. We introduce a classification that structures guided reinforcement learning approaches and shows how different sources of knowledge can be integrated into the learning pipeline in a practical way. Based upon this, we describe available approaches in this field and evaluate their specific impact in terms of efficiency, effectiveness, and sim-to-real transfer within the robotics domain." Robust Adaptive Ensemble Adversary Reinforcement Learning,"Peng Zhai, Taixian Hou, Xiaopeng Ji, Zhiyan Dong, Lihua Zhang","Fudan University,FuDan University,Zhejiang University",Reinforcement Learning,"Reinforcement learning needs to learn policies through trial and error. The unstable policies in the early stage of training make it expensive (and time-consuming) to train directly in the real environment, which may cause disastrous consequences. The popular solution is to use the simulator to train the policy and deploy it in a real environment. However, the modeling error and external disturbance between the simulation and the real environment may fail the physical deployment, resulting in the sim2real transfer problem. In this letter, we propose a novel robust adversarial reinforcement learning framework, which uses the ensemble training of multi-adversarial agents that can adaptively adjust adversaries’ strength to enhance RL policy’s robustness. 
More specifically, we take the accumulative reward as feedback and construct a PID controller to adjust the adversary’s output magnitude to perform the adversarial training well. Experiments in the simulated and the real environment show that our algorithm improves the generalization ability of the policy for the modeling error and the uncertain disturbance simultaneously, outperforming the next best prior methods across all domains. The algorithm was further proven to be effective in a sim2real transfer task through the load experiment of a real racing drone, and the tracking performance is better than the PID-based flight controller." GIN: Graph-Based Interaction-Aware Constraint Policy Optimization for Autonomous Driving,"Se-Wook Yoo, Chan Kim, Jinwoo Choi, Seong-woo Kim, Seung-Woo Seo",Seoul National University,Reinforcement Learning,"Applying reinforcement learning to autonomous driving entails particular challenges, primarily due to dynamically changing traffic flows. To address such challenges, it is necessary to quickly determine response strategies to the changing intentions of surrounding vehicles. This paper proposes a new policy optimization method for safe driving using graph-based interaction-aware constraints. In this framework, the motion prediction and control modules are trained simultaneously while sharing a latent representation that contains a social context. To reflect social interactions, we illustrate the movements of agents in graph form and filter the features with the graph convolution networks. This helps preserve the spatiotemporal locality of adjacent nodes. Furthermore, we create feedback loops to combine these two modules effectively. As a result, this approach encourages the learned controller to be safe from dynamic risks and renders the motion prediction robust to abnormal movements. In the experiment, we set up a navigation scenario comprising various situations with CARLA, an urban driving simulator. 
The experiments show state-of-the-art performance in navigation strategy and motion prediction compared to the baselines. The code is available online." Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning,"Nicolai Dorka, Tim Welschehold, Joschka Boedecker, Wolfram Burgard","University of Freiburg,Albert-Ludwigs-Universität Freiburg,University of Technology Nuremberg",Reinforcement Learning,"Accurate value estimates are important for off-policy reinforcement learning. Algorithms based on temporal difference learning typically are prone to an over- or underestimation bias building up over time. In this paper, we propose a general method called Adaptively Calibrated Critics (ACC) that uses the most recent high variance but unbiased on-policy rollouts to alleviate the bias of the low variance temporal difference targets. We apply ACC to Truncated Quantile Critics [1], which is an algorithm for continuous control that allows regulation of the bias with a hyperparameter tuned per environment. The resulting algorithm adaptively adjusts the parameter during training rendering hyperparameter search unnecessary and sets a new state of the art on the OpenAI gym continuous control benchmark among all algorithms that do not tune hyperparameters for each environment. ACC further achieves improved results on different tasks from the Meta-World robot benchmark. Additionally, we demonstrate the generality of ACC by applying it to TD3 [2] and showing an improved performance also in this setting." An Investigation on the Effect of Actuation Pattern on the Power Consumption of Legged Robots for Extraterrestrial Exploration,"Yuan Hu, Weizhong Guo, Rongfu Lin","University of Shanghai for Science and Technology,Shanghai Jiao Tong University,ShangHai JiaoTong university",Marine and Field Robotics,"Legged robots have great potential to be extraterrestrial exploration rovers of extraordinary versatility. 
Minimizing power consumption is of vital importance in the scenarios of extraterrestrial explorations. The actuation pattern, which refers to the combination of necessary actuators that output torque, has a significant influence on the power consumption of legged robots. This article seeks to investigate the effect of actuation patterns on the power consumption of legged robots that perform motion in a quasi-static manner. The power consumption model of legged robots considering actuation patterns is deduced. Based on that, the effect of the actuation pattern on mechanical power and heat power, which are the main power-loss terms, is investigated. The lowest power consumption under various conditions achieved by different actuation patterns is investigated. Simulation results show that the power consumption can be reduced by choosing the actuation pattern properly. Furthermore, the principles of selecting the optimal actuation pattern from the perspective of power consumption are summarized, which are expected to facilitate the minimal power consumption motion planning of legged robots." Intent Inference-Based Ship Collision Avoidance in Encounters with Rule-Violating Vessels,"Yonghoon Cho, Jonghwi Kim, Jinwhan Kim","Agency for Defense Development,KAIST",Marine and Field Robotics,"All vessels operating in a marine environment are required to comply with the international regulations for preventing collisions at sea (COLREGs), which provide the guidelines and evasive procedures required to resolve potential conflicts between vessels. However, not all vessels strictly abide by COLREGs, often leading to dangerous situations. This paper presents a novel approach for robust collision avoidance in encounter situations involving COLREG-violating vessels. A probabilistic velocity obstacle algorithm based on intent inference is designed and implemented with consideration of the tradeoff between the adherence to traffic rules and the proactive evasive actions for safety. 
One-to-one and multi-ship encounter situations in the presence of rule-violating vessels are examined through Monte-Carlo simulations, and the results are discussed to demonstrate the feasibility and performance of the proposed approach." Nezha-Mini: Design and Locomotion of a Miniature Low-Cost Hybrid Aerial Underwater Vehicle,"Yuanbo Bi, Yufei Jin, Chenxin Lyu, Zheng Zeng, Lian Lian","Shanghai jiao tong University,Shanghai Jiao Tong University,Shanghai Jiaotong University",Marine and Field Robotics,"The distinct design concepts of the vehicles operating in air and water are one of the tremendous challenges that constrain the development of the hybrid aerial underwater vehicle (HAUV). This incompatibility consequently results in the enlarging volume and weight of the existing prototypes, as well as the unmatched maneuvering characteristics in both domains. This letter presents a novel miniaturized and lightweight HAUV, ""Nezha-mini"", which weighs 953 g and is only A4-scaled. Besides, the low cost and high modularity allow the convenient repair and remanufacturing. Nezha-mini reconciles the complete multi-domain maneuverability within 50 m aerially and 6 m underwater whilst sufficing for the rapid and stable cross-domain locomotion, which benefits from the selection and unique layout of the propulsion system, as well as our proposed multi-modal control strategy and the cross-domain triggering mechanism. The results of the field experiments are in good agreement with the dynamics simulation, demonstrating the performance of multi-domain locomotion in real environments. The preliminary exploration in this letter provides a referential solution for the miniaturization of the highly maneuverable HAUVs for practical applications and creates a feasible platform for the future clustering and networking of HAUVs." 
CPG-Based Motion Planning of Hybrid Underwater Hexapod Robot for Wall Climbing and Transition,"Feiyu Ma, Weisheng Yan, Lepeng Chen, Rongxin Cui",Northwestern Polytechnical University,Marine and Field Robotics,"Most of the existing underwater legged robots are capable of moving on small-angled slopes, but few of them can climb the large-angled slope or transition from one plane to another, such as transition from horizontal plane to vertical plane. In this paper, we propose a motion planning method of a hybrid underwater hexapod robot (HUHR) driven by six C-shape legs and eight thrusters. By analyzing the relationship between rotation and displacement of the hip joint, we establish a single-leg kinematic model. By analyzing the force at the touchpoint, we propose a locomotion mechanism to ensure no slip of the C-shape leg. Based on the central pattern generator (CPG) and tripod gait, we design an aperiodic mapping between the oscillator outputs and the desired rotation angles of hip joints. Overall, a gait planning and control method for our robot is proposed to realize continuous legged locomotion from one plane to another, including directional climbing and transition between them. Furthermore, the effectiveness of the proposed method has been verified on HUHR." Improving Self-Consistency in Underwater Mapping through Laser-Based Loop Closure,"Thomas Hitchcox, James Richard Forbes",McGill University,Marine and Field Robotics,"Accurate, self-consistent bathymetric maps are needed to monitor changes in subsea environments and infrastructure. These maps are increasingly collected by underwater vehicles, and mapping requires an accurate vehicle navigation solution. Commercial off-the-shelf (COTS) navigation solutions for underwater vehicles often rely on external acoustic sensors for localization, however survey-grade acoustic sensors are expensive to deploy and limit the range of the vehicle. 
Techniques from the field of simultaneous localization and mapping, particularly loop closures, can improve the quality of the navigation solution over dead-reckoning, but are difficult to integrate into COTS navigation systems. This work presents a method to improve the self-consistency of bathymetric maps by smoothly integrating loop-closure measurements into the state estimate produced by a commercial subsea navigation system. Integration is done using a white-noise-on-acceleration motion prior, without access to raw sensor measurements or proprietary models. Improvements in map self-consistency are shown for both simulated and experimental datasets, including a 3D scan of an underwater shipwreck in Wiarton, Ontario, Canada." Passive Inverted Ultra-Short Baseline Positioning for a Disc-Shaped Autonomous Underwater Vehicle: Design and Field Experiments,"Yingqiang Wang, Ruoyu Hu, S. H. Huang, Zhikun Wang, Peizhou Du, Wencheng Yang, Ying Chen","Zhejiang University,Zhejiang Univ.,China",Marine and Field Robotics,"Underwater positioning is critical to autonomous underwater vehicles (AUVs) for navigation and geo-referencing. The rapid attenuation of the electromagnetic wave in the underwater environment prevents the use of traditional positioning methods such as the Global Positioning System, whereupon acoustic methods like ultra-short baseline (USBL) positioning systems play an important role in AUV navigation. However, the high cost and complexity of classical USBL systems have stifled the democratization of these technologies, which leads to a new method called passive inverted ultra-short baseline (piUSBL) positioning. In a typical piUSBL system, a single beacon is placed at a reference point, periodically broadcasting a positioning signal. A passive USBL receiver, time-synchronized to the beacon, is mounted on an AUV to get one-way travel-time (OWTT) slant range and azimuth estimates. 
The passive nature of the receiver means the system is inexpensive, low-power, and lightweight. Particularly, the omnidirectional broadcasted signals offer a feasible solution for concurrent multi-AUV navigation. This letter demonstrates a full-stack design and development of a piUSBL positioning system, and presents evaluations of the accuracy and reliability of the system through a series of experiments. More significantly, a successful sea trial of a disc-shaped AUV outfitted with our piUSBL was conducted in the South China Sea." The Robustness of Tether Friction in Non-Idealized Terrains,"Justin Page, Laura Treers, Steven Jens Jorgensen, Ronald Fearing, Hannah Stuart","UC Berkeley Mechanical Engineering,University of California Berkeley,Apptronik,University of California at Berkeley,UC Berkeley",Marine and Field Robotics,"Reduced traction limits the ability of mobile robotic systems to resist or apply large external loads, such as tugging a massive payload. One simple and versatile solution is to wrap a tether around naturally occurring objects to leverage the capstan effect and create exponentially-amplified holding forces. Experiments show that an idealized capstan model explains force amplification experienced on common irregular outdoor objects – trees, rocks, posts. Robust to variable environmental conditions, this exponential amplification method can harness single or multiple capstan objects, either in series or in parallel with a team of robots. This adaptability allows for a range of potential configurations especially useful for when objects cannot be fully encircled or gripped. This versatility is demonstrated with teleoperated mobile platforms to (1) control the lowering and arrest of a payload, (2) to achieve planar control of a payload, and (3) to act as an anchor point for a more massive platform to winch towards. 
We show the simple addition of a tether, wrapped around shallow stones in sand, amplifies holding force of a low-traction platform by up to 774x." Reconfigurable Inflated Soft Arms,"Nam Gyun Kim, Jee-Hwan Ryu",Korea Advanced Institute of Science and Technology,Soft Robots I,"Inflatable structures have attracted considerable research attention in many fields owing to their numerous advantages, such as being light and able to engage in interactions safely. However, in most cases, the inflatable structure can only have one stable configuration, which is undesirable for robotic arms. This study proposes a novel inflatable structure that can be easily reconfigured into multiple stable configurations, even with single-body inflation. In the proposed mechanism, the structure length can be freely adjusted, and its respective joints can be set in the desired directions to facilitate the reconfiguration of its pose. An additional advantage of the proposed mechanism is that it can withstand external forces as well as its own weight. This study analyzes and experimentally validates the shape locking and load-carrying properties of the proposed mechanism. Further, the fabrication process and design guidelines for the proposed mechanism are presented. Through a suitable demonstration, the proposed mechanism is shown to exhibit multiple stable configurations and lock its poses." A Soft Hybrid-Actuated Continuum Robot Based on Dual Origami Structures,"Jian Tao, Qiqiang Hu, Tianzhi Luo, Erbao Dong","University of Science and Technology of China,City University of Hong Kong",Soft Robots I,"Soft continuum robots have shown tremendous potential for medical and industrial applications owing to their flexibility and continuous deformability. However, their telescopic and bending capabilities and variable stiffness are still limited. 
This study proposes a novel origami-inspired soft continuum robot to possess large telescopic and bending capabilities while improving stiffness based on the principle of antagonistic actuation. The soft robot consists of dual origami structures. The inner forms an air chamber actuated by pneumatics, and the outer is controlled by nine tendon-driven actuators. The proposed design uses the advantages of a hybrid actuation to achieve motion and stiffness control. The performance of the soft robot is studied experimentally based on single and three robot modules. Results show that the robot has an excellent stretch ratio and a maximum bending angle of 180°. The robot can also increase stiffness to resist the bending deformation induced by self-weight and loads." Direct and Inverse Modeling of Soft Robots by Learning a Condensed FEM Model,"Etienne Ménager, Tanguy Navez, Olivier Goury, Christian Duriez","Univ. Lille, Inria, CNRS, Centrale Lille, UMR ,,,, CRIStAL,University of Lille - INRIA,Inria - Lille Nord Europe,INRIA",Soft Robots I,"The Finite Element Method (FEM) is a powerful modeling tool for predicting the behavior of soft robots. However, its use for control can be difficult for non-specialists of numerical computation: it requires an optimization of the computation to make it real-time. In this paper, we propose a learning-based approach to obtain a compact but sufficiently rich mechanical representation. Our choice is based on non-linear compliance data in the actuator/effector space provided by a condensation of the FEM model. We demonstrate that this compact model can be learned with a reasonable amount of data and, at the same time, be very efficient in terms of modeling, since we can deduce the direct and inverse kinematics of the robot. We also show how to couple some models learned individually in particular on an example of a gripper composed of two soft fingers. 
Other results are shown by comparing the inverse model derived from the full FEM model and the one from the compact learned version. This work opens new perspectives, namely for the embedded control of soft robots, but also for their design. These perspectives are also discussed in the paper." Limit Cycle Generation with Pneumatically Driven Physical Reservoir Computing,"Hiroaki Shinkawa, Toshihiro Kawase, Tetsuro Miyazaki, Takahiro Kanno, Maina Sogabe, Kenji Kawashima","The University of Tokyo,Tokyo Denki University,Riverfield Inc.,the University of Tokyo",Soft Robots I,"One of the recent developments in physical reservoir computing, which uses the complex dynamics of a physical system as a computational resource, is the use of a pneumatic pipeline system as a computational resource. This uses the dynamics of air for computation, and because it is lightweight and power-saving, it is used for gait-assist control using a soft exoskeleton with pneumatic rubber artificial muscles. In this study, we verified that by feeding back the estimated information to a pneumatic pipeline system, the pneumatic physical reservoir computing can generate periodic pressure changes as a stable limit cycle, such as those seen in walking. A pneumatic reservoir with feedback loops was modeled to generate limit cycles in the simulation, and it was confirmed that the system could generate limit cycles with high accuracy even from initial positions far from the target limit cycle. This system is expected to be applied to assist walking movements with a soft exoskeleton with a lightweight computational device." Toward Zero-Shot Sim-To-Real Transfer Learning for Pneumatic Soft Robot 3D Proprioception Sensing,"Uksang Yoo, Hanwen Zhao, Alvaro Altamirano, Wenzhen Yuan, Chen Feng","Carnegie Mellon University,New York University",Soft Robots I,"Pneumatic soft robots present many advantages in manipulation tasks. 
Notably, their inherent compliance makes them safe and reliable in unstructured and fragile environments. However, full-body shape sensing for pneumatic soft robots is challenging because of their high degrees of freedom and complex deformation behaviors. Vision-based proprioception sensing methods relying on embedded cameras and deep learning provide a good solution to proprioception sensing by extracting the full-body shape information from the high-dimensional sensing data. But the current training data collection process makes it difficult for many applications. To address this challenge, we propose and demonstrate a robust sim-to-real pipeline that allows the collection of the soft robot's shape information in high-fidelity point cloud representation. The model trained on simulated data was evaluated with real internal camera images. The results show that the model performed with averaged Chamfer distance of 8.85 mm and tip position error of 10.12 mm even with external perturbation for a pneumatic soft robot with a length of 100.0 mm. We also demonstrated the sim-to-real pipeline’s potential for exploring different configurations of visual patterns to improve vision-based reconstruction results. The code and dataset are available at https://github.com/DeepSoRo/DeepSoRoSim2Real." Cross-Domain Transfer Learning and State Inference for Soft Robots Via a Semi-Supervised Sequential Variational Bayes Framework,"Shageenderan Sapai, Junn Yong Loo, Ze Yang Ding, Chee Pin Tan, Raphael Phan, Vishnu Monn Baskaran, Surya G. Nurzaman","Monash University,Monash Malaysia,Monash University Malaysia",Soft Robots I,"Recently, data-driven models such as deep neural networks have shown to be promising tools for modelling and state inference in soft robots. However, voluminous amounts of data are necessary for deep models to perform effectively, which requires exhaustive and quality data collection, particularly of state labels. 
Consequently, obtaining labelled state data for soft robotic systems is challenging for various reasons, including difficulty in the sensorization of soft robots and the inconvenience of collecting data in unstructured environments. To address this challenge, in this paper, we propose a semi-supervised sequential variational Bayes (DSVB) framework for transfer learning and state inference in soft robots with missing state labels on certain robot configurations. Considering that soft robots may exhibit distinct dynamics under different robot configurations, a feature space transfer strategy is also incorporated to promote the adaptation of latent features across multiple configurations. Unlike existing transfer learning approaches, our proposed DSVB employs a recurrent neural network to model the nonlinear dynamics and temporal coherence in soft robot data. The proposed framework is validated on multiple setup configurations of a pneumatic-based soft robot finger. Experimental results on four transfer scenarios demonstrate that DSVB performs effective transfer learning and accurate state inference amidst missing state labels." "Image-Based Pose Estimation and Shape Reconstruction for Robot Manipulators and Soft, Continuum Robots Via Differentiable Rendering","Jingpei Lu, Fei Liu, Cedric Girerd, Michael Yip","University of California San Diego,UCSD,University of California, San Diego",Soft Robots I,"State estimation from measured data is crucial for robotic applications as autonomous systems rely on sensors to capture the motion and localize in the 3D world. Among sensors that are designed for measuring a robot's pose, or for soft robots, their shape, vision sensors are favorable because they are information-rich, easy to set up, and cost-effective. With recent advancements in computer vision, deep learning-based methods no longer require markers for identifying feature points on the robot. 
However, learning-based methods are data-hungry and hence not suitable for soft and prototyping robots, as building such bench-marking datasets is usually infeasible. In this work, we achieve image-based robot pose estimation and shape reconstruction from camera images. Our method requires no precise robot meshes, but rather utilizes a differentiable renderer and primitive shapes. It hence can be applied to robots for which CAD models might not be available or are crude. Our parameter estimation pipeline is fully differentiable. The robot shape and pose are estimated iteratively by back-propagating the image loss to update the parameters. We demonstrate that our method of using geometrical shape primitives can achieve high accuracy in shape reconstruction for a soft continuum robot and pose estimation for a robot manipulator." Discrete-Time Model Based Control of Soft Manipulator with FBG Sensing,"Enrico Franco, Ayhan Aktas, Shen Treratanakulchai, Arnau Garriga-casanovas, Abdulhamit Donder, Ferdinando Rodriguez Y Baena","Imperial College London,Imperial College,Imperial College, London, UK",Soft Robots I,In this article we investigate the discrete-time model based control of a planar soft continuum manipulator with proprioceptive sensing provided by fiber Bragg gratings. A control algorithm is designed with a discrete-time energy shaping approach which is extended to account for control-related lag of digital nature. A discrete-time nonlinear observer is employed to estimate the uncertain bending stiffness of the manipulator and to compensate constant matched disturbances. Simulations and experiments demonstrate the effectiveness of the controller compared to a continuous time implementation. 
A Soft Robot with Three Dimensional Shape Sensing and Contact Recognition Multi-Modal Sensing Via Tunable Soft Optical Sensors,"Max Mccandless, Frank Juliá Wise, Sheila Russo",Boston University,Soft and Flexible Sensors,"Soft optical sensing strategies are rapidly developing for soft robotic systems as a means to increase the controllability of soft compliant robots. In this paper, we present a roughness tuning strategy for the fabrication of soft optical sensors to achieve the dual functionality of shape sensing combined with contact recognition within a single multi-modal sensor. The molds used to fabricate the soft sensors are roughened via laser micromachining to achieve asymmetrical sensor responses when bent in opposite directions. We demonstrate the integration of these sensors into a fully soft robotic platform consisting of a multi-directional bending module with integrated 3D shape sensing and a gripper with tip position monitoring along with contact force recognition. We show the accuracy of our sensing strategy in validation experiments and a pick-and-place task is performed to demonstrate the robot’s functionality." A Flexible 3D Force Sensor with Tunable Sensitivity,"James J. Davies, Mai Thanh Thai, Trung Thien Hoang, Nguyen Chi Cong, Phuoc Thien Phan, Kefan Zhu, Dang Bao Nhi Tran, Van Ho, Hung La, Q P Ha, Nigel Lovell, Thanh Nho Do","University of New South Wales,UNSW Sydney,RMIT,Japan Advanced Institute of Science and Technology,University of Nevada at Reno,University of Technology Sydney",Soft and Flexible Sensors,"Following biology’s lead, soft robotics has emerged as a perfect candidate for actuation within complex environments. While soft actuation has been developed intensively over the last few decades, soft sensing has so far slowed to catch up. A largely unresearched area is the change of the soft material properties through prestress to achieve a degree of mechanical sensitivity tunability within soft sensors. 
Here, a new 3D force sensor which employs novel hydraulic filament artificial muscles capable of sensitivity tunability is introduced. Using a neural network (NN) model, the new soft 3D sensor can precisely detect external forces based on the change of the hydraulic pressures with error of ~1.0, ~1.3, and ~0.94 % in the x, y, and z-axis directions, respectively. The sensor is also able to sense large force ranges, comparable to other similar sensors available in the literature. The sensor is then integrated into a soft robotic surgical arm for monitoring the tool-tissue interaction during the ablation process." STEV: Stretchable Triboelectric E-Skin Enabled Proprioceptive Vibration Sensing for Soft Robot,"Zihan Wang, Kai-chong Lei, Tang Huaze, Shoujie Li, Yuan Dai, Wenbo Ding, Xiao-Ping (Steven) Zhang","Tsinghua University,Tsinghua Shenzhen International Graduate School,Tencent,Ryerson University",Soft and Flexible Sensors,"Vibration perception is essential for robotic sensing and dynamic control. Nevertheless, due to the rigorous demand for sensor conformability and stretchability, enabling soft robots with proprioceptive vibration sensing remains challenging. This paper proposes a new liquid metal-based stretchable e-skin via a kirigami-inspired design to enable soft robot proprioceptive vibration sensing. The e-skin is fabricated into 0.1mm ultrathin thickness, ensuring its negligible influence on the overall stiffness of the soft robot. Moreover, the working mechanism of the e-skin is based on the ubiquitous triboelectrification effect, which transduces mechanical stimuli without external power supply. To demonstrate the practicability of the e-skin, we built a soft gripper consisting of three soft robotic fingers with proprioceptive vibration sensing. Our experiment shows that the gripper can accurately distinguish the grain category (six grains with the same mass, 99.9% accuracy) and the packaging quality (100% accuracy) by simply shaking the gripped bottle. 
In summary, a soft robotic proprioceptive vibration sensing solution is proposed; it helps soft robots to have a more comprehensive awareness of their self-state and may inspire further research on soft robotics." Design and Development of a Hydrogel-Based Soft Sensor for Multi-Axis Force Control,"Yichen Cai, David Hardman, Fumiya Iida, Thomas George Thuruthel","University of Cambridge,University College London",Soft and Flexible Sensors,"As soft robotic systems become increasingly complex, there is a need to develop sensory systems which can provide rich state information to the robot for feedback control. Multi-axis force sensing and control is one of the less explored problems in this domain. There are numerous challenges in the development of a multi-axis soft sensor: from the design and fabrication to the data processing and modelling. This work presents the design and development of a novel multi-axis soft sensor using a gelatin-based ionic hydrogel and 3D printing technology. A learning-based modelling approach coupled with sensor redundancy is developed to model the environmentally dependent soft sensors. Numerous real-time experiments are conducted to test the performance of the sensor and its applicability in closed-loop control tasks. Our results indicate that the soft sensor can predict force values and orientation angle within 4% and 7% of their total range, respectively." "Design and Characterization of a Low Mechanical Loss, High-Resolution Wearable Strain Gauge","Addison Liu, Oluwaseun Adelowo Araromi, Conor James Walsh, Robert Wood","Harvard University,Harvard University Science and Engineering Building",Soft and Flexible Sensors,"Soft, wearable systems hold promise for a wide variety of new or enhanced applications in the realm of human-computer interaction, physiological monitoring, wearable robotics, and a host of other human-centric devices. 
Soft sensor systems have been developed concurrently in order to allow these wearable systems to respond intelligently with their surroundings. A recently reported sensing mechanism based on the strain-mediated contact in anisotropically resistive structures (SCARS) is an attractive solution due to its high sensing resolution, low-profile nature, and high mechanical resilience. Furthermore, the resistance-based output provides a simple electronic readout, facilitating its use in a wide variety of applications. However, previous iterations of the sensing mechanism have exhibited stress relaxation and hysteretic behaviors that limit the scope of its use. Here, we report an iteration of the SCARS mechanism that uses silicone-based materials with low mechanical loss in order to improve the sensor signal stability and bandwidth. A new fabrication approach is developed which permits the incorporation of a liquid elastomer adhesive layer while also preserving the SCARS sensing functionality. The silicone-based SCARS sensors exhibited fast stress relaxation response (< 1 s) and reduced cyclic drift properties by more than half that of previously reported designs. A physiological monitoring demonstration is presented, validating that the new sensor design is mechanically resilient to such applications and has potential for use in real-world wearable use cases." "Identifying Contact Distance Uncertainty in Whisker Sensing with Tapered, Flexible Whiskers","Teresa Kent, Hannah Emnett, Mahnoush Babaei, Mitra Hartmann, Sarah Bergbreiter","Carnegie Mellon University,Northwestern University,The University of Texas at Austin",Soft and Flexible Sensors,"Whisker-based tactile sensors have the potential to perform fast and accurate 3D mappings of the environment, complementing vision-based methods under conditions of glare, reflection, proximity, and occlusion. 
However, current algorithms for mapping with whiskers make assumptions about the conditions of contact, and these assumptions are not always valid and can cause significant sensing errors. Here we introduce a new whisker sensing system with a tapered, flexible whisker. The system provides inputs to two separate algorithms for estimating radial contact distance on a whisker. Using a Gradient-Moment (GM) algorithm, we correctly detect contact distance in most cases (within 4% of the whisker length). We introduce the Z-Dissimilarity score as a new metric that quantifies uncertainty in the radial contact distance estimate using both the GM algorithm and a Moment-Force (MF) algorithm that exploits the tapered whisker design. Combining the two algorithms ultimately results in contact distance estimates more robust than either algorithm alone." "Learning Decoupled Multi-Touch Force Estimation, Localization and Stretch for Soft Capacitive E-Skin","Abu Bakar Dawood, Claudio Coppola, Kaspar Althoefer",Queen Mary University of London,Soft and Flexible Sensors,"Distributed sensor arrays capable of detecting multiple spatially distributed stimuli are considered an important element in the realisation of exteroceptive and proprioceptive soft robots. This paper expands upon the previously presented idea of decoupling the measurements of pressure and location of a local indentation from global deformation, using the overall stretch experienced by a soft capacitive e-skin. We employed machine learning methods to decouple and predict these highly coupled deformation stimuli, collecting data from a soft sensor e-skin which was then fed to a machine learning system comprising a linear regressor, a Gaussian process regressor, an SVM, and a random forest classifier for stretch, force, detection, and localisation, respectively. We also studied how the localisation and forces are affected when two forces are applied simultaneously. 
Soft sensor arrays aided by appropriately chosen machine learning techniques can pave the way to e-skins capable of deciphering multi-modal stimuli in soft robots." OptiGap: A Modular Optical Sensor System for Bend Localization,"Jr. Bupe, Cindy Harnett",University of Louisville,Soft and Flexible Sensors,"This paper presents the novel use of air gaps in flexible optical light pipes to create coded segments for use in bend localization. The OptiGap sensor system allows for the creation of extrinsic intensity modulated bend sensors that function as flexible absolute linear encoders. Coded segment patterns are identified by a Gaussian naive Bayes classifier running on an STM32 microcontroller. Fitting of the classifier is aided by a custom software suite that simplifies data collection and processing from the sensor. The sensor model is analyzed and verified through simulation and experiments, highlighting key properties and parameters that aid in the design of OptiGap sensors using different light pipe materials and for various applications. This system allows for realtime and accurate bend localization in many robotics and automation applications, in wet and dry conditions." A Silicone-Sponge-Based Variable-Stiffness Device,"Tianqi Yue, Tsam Lung You, Hemma Philamore, Hermes Gadelha, Jonathan Rossiter","University of Bristol,Kyoto University,Department of engineering, University of Bristol, UK",Actuation,"Soft devices employ variable stiffness to ensure safety and improve the robustness in the interaction between robots and objects. Using soft materials is one of the most popular approaches to design a variable-stiffness device, while the use of silicone sponge remains less explored in this field. Here we present a novel silicone-sponge-based variable-stiffness device (SVD). The SVD is easy-to-make and low-cost, and fabricated by an air-tight bellow enclosing a silicone sponge core. 
This allows easy access to the hyper-elastic response of the porous sponge whilst stiffness tuning of the device via pneumatic pressure difference. A detailed mathematical model of the SVD is proposed, by which the stiffness can be precisely controlled by the pressure difference applied. The stiffness of SVD can be tuned in the range of [1.55, 22.82]×10^3 N/m, up to 14.7 times increase. The high stiffness is easily triggered by a low pressure difference (ΔP < 12 kPa). The SVD is a versatile and compact module, with small axial size (10 mm height) and light weight (14.3 g), making it highly suitable for integration in a wide range of robotics and industrial applications. This, in addition to its easy-to-fabricate and low-cost features, may appeal to the robotics community at large. We further detail its working principle, fabrication processes, mathematical model and automated control methods to show its versatility." Design and Control of a Tunable-Stiffness Coiled-Spring Actuator,"Shivangi Misra, Mason Mitchell, Rongqian Chen, Cynthia Sung","University of Pennsylvania,Worcester Polytechnic Institute",Actuation,"We propose a novel design for a lightweight and compact tunable stiffness actuator capable of stiffness changes up to 20x. The design is based on the concept of a coiled spring, where changes in the number of layers in the spring change the bulk stiffness in a near linear fashion. We present an elastica nested rings model for the deformation of the proposed actuator and empirically verify that the designed stiffness-changing spring abides by this model. Using the resulting model, we design a physical prototype of the tunable-stiffness coiled-spring actuator and discuss the effect of design choices on the resulting achievable stiffness range and resolution. In the future, this actuator design could be useful in a wide variety of soft robotics applications, where fast, controllable, and local stiffness change is required over a large range of stiffnesses." 
Wirelessly-Controlled Untethered Piezoelectric Planar Soft Robot Capable of Bidirectional Crawling and Rotation,"Zhiwu Zheng, Hsin Cheng, Prakhar Kumar, Sigurd Wagner, Minjie Chen, Naveen Verma, James C. Sturm",Princeton University,Actuation,"Electrostatic actuators provide a promising approach to creating soft robotic sheets, due to their flexible form factor, modular integration, and fast response speed. However, their control requires kilo-Volt signals and understanding of complex dynamics resulting from force interactions by on-board and environmental effects. In this work, we demonstrate an untethered planar five-actuator piezoelectric robot powered by batteries and on-board high-voltage circuitry, and controlled through a wireless link. The scalable fabrication approach is based on bonding different functional layers on top of each other (steel foil substrate, actuators, flexible electronics). The robot exhibits a range of controllable motions, including bidirectional crawling (up to ~0.6 cm/s), turning, and in-place rotation (at ~1 degree/s). High-speed videos and control experiments show that the richness of the motion results from the interaction of an asymmetric mass distribution in the robot and the associated dependence of the dynamics on the driving frequency of the piezoelectrics. The robot's speed can reach 6 cm/s with specific payload distribution." Origami Folding Enhances Modularity and Mechanical Efficiency of Soft Actuators,"Zheng Wang, Yazhou Song, Zhongkui Wang, Hongying Zhang","National University of Singapore,Ritsumeikan University",Actuation,"Soft robots have long been attractive to robotic engineers due to their remarkable dexterity; however, reports that standardize soft actuators into modularized off-shelf devices akin to rigid robots are still rare, and the mechanical efficiency of existing designs is still limited. 
This work identifies origami folding to enable the design of LEGO-like modularized soft actuators with high mechanical efficiency in terms of payload capability and workspace. Herein, three modularized origami actuators that can generate translational, bending, and twisting motion are designed, prototyped, and tested. The translational actuator can contract to 40% of its original length, and the twisting and bending actuators can exert 31° and 52° angular motions, respectively. The translational actuator can exert a blocked force of about 821 times self-weight. The motion of origami soft actuators is accurately modeled using rigid body kinematics, and complex systems built by them are captured by homogeneous transformation. Finally, the modularized design and efficient kinematic model are verified on a manipulator and a reconfigurable letter. Benefiting from the unprecedented modularity and mechanical efficiency, these LEGO-like origami actuators are promising for practical applications like food handling and healthcare." "Characterisation of Antagonistically Actuated, Stiffness-Controllable Joint-Link Units for Cobots","Wenlong Gaozhang, Jialei Shi, Yue Li, Agostino Stilli, Helge Wurdemann","University College London,Kings College London",Actuation,"Soft robotic structures may play a major role in the 4th industrial revolution. Researchers have successfully demonstrated the advantages of soft robotics over traditional robots made of rigid links and joints in many application areas. Variable stiffness links (VSL) and joints (VSJ) have been investigated to achieve on-demand forces and, at the same time, be inherently safe in interactions with humans. However, a thorough characterisation of soft and rigid robotic components is still required. This paper investigates the influence of antagonistically actuated, stiffness-controllable joint-link units (JLUs) on the performance of collaborative robots (i.e. 
stiffness, load capacity, repetitive precision) and characterizes the difference compared with rigid units. A JLU is made of a combination of a VSL, a VSJ, and their rigid counterparts. Experimental results show that the VSL has minor differences in terms of stiffness (0.62 ~ 0.95), output force (0.93 ~ 0.94), and repetitive precision compared with the rigid link. For the VSJ, our results show a significant gap compared with the servo motor with regards to maximum stiffness (0.14 ~ 0.21) and repetitive position precision (0.07 ~ 0.25). However, similar performance on repetitive force precision and better performance on the maximum output force (1.54 ~ 1.55 times) are demonstrated." A Fluidic Actuator with an Internal Stiffening Structure Inspired by Mammalian Erectile Tissue,"Jan Fras, Kaspar Althoefer",Queen Mary University of London,Actuation,"One of the biggest problems with soft robots is precisely the fact that they are soft. Indeed the softer they are, the less force they can exert on the environment. Researchers have proposed a number of stiffening methods, but all of them have drawbacks, such as locking the shape of the device in a way that precludes further adjustments. In this paper we propose a stiffening method inspired by the internal structure of the mammalian penis. The soft actuation chamber is divided into small compartments that trap the actuation fluid, leading to locally amplified pressure increase under certain conditions. At the same time, the proposed solution does not affect the actuation mechanism, allowing the actuator to be adjusted in one direction just as if it was in non-stiffened mode, while offering a stiff response in the opposite direction. Our prototype achieves an increase in stiffening of approximately a factor of two. The paper describes the concept, the mathematical justification of the working principle, the prototype design, its implementation and our experimental results." 
On Tendon Driven Continuum Robots with Compressible Backbones,"Manu Srivastava, Ian Walker",Clemson University,Actuation,"This paper discusses the effect of axial backbone compression on tendon-driven continuum robots. A new mechanics model for compensating for this effect that does not require tendon tension sensing or knowledge of manipulator material properties/stiffnesses is introduced and analyzed. In addition, we provide an analytical expression for the minimum preload on the tendons to achieve a given bend, a quantity determined empirically thus far. Our model is computationally efficient and achieves real time control on low cost hardware. The analysis is supported by experimental results demonstrating significant improvement over kinematics in open loop control of a tendon-driven continuum hose robot." FourStr: When Multi-Sensor Fusion Meets Semi-Supervised Learning,"Bangquan Xie, Liang Yang, Zongming Yang, Ailin Wei, Xiaoxiong Weng, Bing Li","South China University of Technology,Apple Inc,Clemson University,Clemson University",Sensor Fusion I,"This article proposes a novel semi-supervised learning framework FourStr (Four-Stream formed by two two-stream models) that focuses on the improvement of fusion and labeling efficiency for 3D multi-sensor detectors. FourStr adopts a multi-sensor single-stage detector named adaptive fusion network (AFNet) as the backbone and trains it through the semi-supervised learning (SSL) strategy Stereo Fusion. Note that multi-sensor AFNet and SSL Stereo Fusion can benefit each other. On the one hand, the Four-stream composed of two AFNets naturally provides rich inputs and large models for SSL Stereo Fusion, whereas other SSL works have to use massive augmentation to obtain rich inputs, and deepen and widen the network for large models. On the other hand, by the novel three fusion stages and Loss Pruning, Stereo Fusion improves the fusion and labeling efficiency for AFNet. 
Finally, extensive experiments demonstrate that FourStr performs excellently on outdoor dataset (KITTI and Waymo Open Dataset) and indoor dataset (SUN RGB-D), especially for the small contour objects. And compared to the fully-supervised methods, FourStr achieves similar accuracy with only 2% labeled data on KITTI (or with 50% labeled data on SUN RGB-D)." Combining Motion and Appearance for Robust Probabilistic Object Segmentation in Real Time,"Vito Mengers, Aravind Battaje, Manuel Baum, Oliver Brock","Technische Universität Berlin,TU Berlin",Sensor Fusion I,"We present a robust method to visually segment scenes into objects based on motion and appearance. Both these cues provide complementary information that we fuse using two interconnected recursive estimators: One estimates object segmentation from motion as a probabilistic clustering of tracked 3D points, and the other estimates object segmentation from appearance as a probabilistic image segmentation. The interconnected estimators provide a probabilistic and consistent object segmentation in real time, which makes them well suited for many downstream robotic tasks. We evaluate our method on one such task, kinematic structure estimation, on a dataset of interactions with articulated objects and show that our fusion improves object segmentation by 70% and in turn estimated kinematic joints by 26% over a purely motion-based approach. Furthermore, we show the necessity of probabilistic modeling for downstream robotic tasks, achieving 339% of the performance of a recent multimodal but deterministic RNN for object segmentation on the estimation of kinematic structure." Event-Based Real-Time Moving Object Detection Based on IMU Ego-Motion Compensation,"Chunhui Zhao, Yakun Li, Yang Lyu",Northwestern Polytechnical University,Sensor Fusion I,"Accurate and timely onboard perception is a prerequisite for mobile robots to operate in highly dynamic scenarios. 
The bio-inspired event camera can capture more motion details than a traditional camera by triggering each pixel asynchronously and therefore is more suitable in such scenarios. Among various perception tasks based on the event camera, ego-motion removal is one fundamental procedure to reduce perception ambiguities. Recent ego-motion removal methods are mainly based on optimization processes and may be computationally expensive for robot applications. In this paper, we consider the challenging perception task of detecting fast-moving objects from an aggressively operated platform equipped with an event camera, achieving computational cost reduction by directly employing IMU motion measurement. First, we design a nonlinear warping function to capture rotation information from an IMU and to compensate for the camera motion during an asynchronous events stream. The proposed nonlinear warping improves accuracy by 10%-15%. Afterward, we segment the moving parts on the warped image through dynamic threshold segmentation, optical flow calculation, and clustering. Finally, we validate the proposed detection pipeline on public datasets and real-world data streams containing challenging light conditions and fast-moving objects." Estimating the Motion of Drawers from Sound,"Manuel Baum, Amelie Froessl, Aravind Battaje, Oliver Brock","TU Berlin,Technische Universitaet Berlin,Technische Universität Berlin",Sensor Fusion I,"Robots need to understand articulated objects, such as drawers. The state of articulated structures is commonly estimated using vision, but visual perception is limited when objects are occluded, have few salient features, or are not in the camera's field of view. Audio sensing does not face these challenges, since sound propagates in a fundamentally different way than light. Therefore we propose to fuse vision and audio sensing to overcome the challenges faced by vision alone. 
We estimate motion in several drawers and show that an audio-visual approach estimates drawer motion more reliably than only vision -- even in settings where the purely visual approach completely breaks down. Additionally, we perform an in-depth analysis of the regularities that govern how motion in drawers shapes their sound." Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents That See and Hear,"Ruohan Gao, Hao Li, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Fei-Fei Li, Jiajun Wu","Stanford University,Google Inc",Sensor Fusion I,"Developing embodied agents in simulation has been a key research topic in recent years. Exciting new tasks, algorithms, and benchmarks have been developed in various simulators. However, most of them assume deaf agents in silent environments, while we humans perceive the world with multiple senses. We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear. Sonicverse models realistic continuous audio rendering in 3D environments in real-time. Together with a new audio-visual VR interface that allows humans to interact with agents with audio, Sonicverse enables a series of embodied AI tasks that need audio-visual perception. For semantic audio-visual navigation in particular, we also propose a new multi-task learning model that achieves state-of-the-art performance. In addition, we demonstrate Sonicverse's realism via sim-to-real transfer, which has not been achieved by other simulators: an agent trained in Sonicverse can successfully perform audio-visual navigation in real-world environments. Sonicverse is available at: https://github.com/StanfordVL/Sonicverse." 
LAPTNet-FPN: Multi-Scale LiDAR-Aided Projective Transform Network for Real Time Semantic Grid Prediction,"Manuel Diaz Zapata, David Sierra Gonzalez, Ozgur Erkent, Christian Laugier, Jilles Dibangoye","Inria Grenoble,Inria Grenoble Rhône-Alpes,Hacettepe University,INRIA,Univ Lyon",Sensor Fusion I,"Semantic grids can be useful representations of the scene around an autonomous system. By having information about the layout of the space around itself, a robot can leverage this type of representation for crucial tasks such as navigation or tracking. By fusing information from multiple sensors, robustness can be increased and the computational load for the task can be lowered, achieving real time performance. Our multi-scale LiDAR-Aided Perspective Transform network uses information available in point clouds to guide the projection of image features to a top-view representation, resulting in a relative improvement in the state of the art for semantic grid generation for human (+8.67%) and movable object (+49.07%) classes in the nuScenes dataset, as well as achieving results close to the state of the art for the vehicle, drivable area and walkway classes, while performing inference at 25 FPS." Collision-Aware In-Hand 6D Object Pose Estimation Using Multiple Vision-Based Tactile Sensors,"Gabriele Mario Caddeo, Nicola Agostino Piga, Fabrizio Bottarel, Lorenzo Natale",Istituto Italiano di Tecnologia,Sensor Fusion I,"In this paper, we address the problem of estimating the in-hand 6D pose of an object in contact with multiple vision-based tactile sensors. We reason on the possible spatial configurations of the sensors along the object surface. Specifically, we filter contact hypotheses using geometric reasoning and a Convolutional Neural Network (CNN), trained on simulated object-agnostic images, to promote those that better comply with the actual tactile images from the sensors. 
We use the selected sensors configurations to optimize over the space of 6D poses using a Gradient Descent-based approach. We finally rank the obtained poses by penalizing those that are in collision with the sensors. We carry out experiments in simulation using the DIGIT vision-based sensor with several objects, from the standard YCB model set. The results demonstrate that our approach estimates object poses that are compatible with actual object-sensor contacts in 87.5% of cases while reaching an average positional error in the order of 2 centimeters. Our analysis also includes qualitative results of experiments with a real DIGIT sensor." CalibDepth: Unifying Depth Map Representation for Iterative LiDAR-Camera Online Calibration,"Jiangtong Zhu, Jianru Xue, Pu Zhang",Xi'an Jiaotong University,Sensor Fusion I,"LiDAR-Camera online calibration is of great significance for building a stable autonomous driving perception system. For online calibration, a key challenge lies in constructing a unified and robust representation between multimodal sensor data. Most methods extract features manually or implicitly with an end-to-end deep learning method. The former suffers poor robustness, while the latter has poor interpretability. In this paper, we propose CalibDepth, which uses depth maps as the unified representation for image and LiDAR point cloud. CalibDepth introduces a sub-network for monocular depth estimation to assist online calibration tasks. To further improve the performance, we regard online calibration as a sequence prediction problem, and introduce global and local losses to optimize the calibration results. CalibDepth shows excellent performance in different experimental setups." 
Shape Visual Servoing of a Tether Cable from Parabolic Features,"Lev Smolentsev, Alexandre Krupa, Francois Chaumette","INRIA Rennes - Bretagne Atlantique,Centre Inria de l'Université de Rennes,Inria center at University of Rennes",Visual Servoing,In this paper we propose a visual servoing approach that controls the deformation of a suspended tether cable subject to gravity from visual data provided by a RGB-D camera. The cable shape is modelled with a parabolic curve together with the orientation of the plane containing the tether. The visual features considered are the parabolic coefficients and the yaw angle of that plane. We derive the analytical expression of the interaction matrix that relates the variation of the visual features to the velocities of the cable extremities. Singularities are demonstrated to occur if and only if the cable is taut horizontally or vertically. An image processing algorithm is also developed to extract in real-time the current features fitting the parabola to the cable from the observed point cloud. Simulations and experimental results demonstrate the efficiency of our visual servoing approach to deform the tether cable toward a desired shape configuration. Deep Metric Learning for Visual Servoing: When Pose and Image Meet in Latent Space,"Samuel Felton, Elisa Fromont, Eric Marchand","Université de Rennes ,, IRISA,Université of Rennes ,-- IRISA/Inria rba,Univ Rennes, Inria, CNRS, IRISA",Visual Servoing,"We propose a new visual servoing method that controls a robot's motion in a latent space. We aim to extract the best properties of two previously proposed servoing methods: we seek to obtain the accuracy of photometric methods such as Direct Visual Servoing (DVS), as well as the behavior and convergence of pose-based visual servoing (PBVS). 
Photometric methods suffer from limited convergence area due to a highly non-linear cost function, while PBVS requires estimating the pose of the camera which may introduce some noise and incurs a loss of accuracy. Our approach relies on shaping (with metric learning) a latent space, in which the representations of camera poses and the embeddings of their respective images are tied together. By leveraging the multimodal aspect of this shared space, our control law minimizes the difference between latent image representations thanks to information obtained from a set of pose embeddings. Experiments in simulation and on a robot validate the strength of our approach, showing that the sought out benefits are effectively found." CNN-Based Visual Servoing for Simultaneous Positioning and Flattening of Soft Fabric Parts,"Fuyuki Tokuda, Akira Seino, Akinari Kobayashi, Kazuhiro Kosuge","Centre for Transformative Garment Production,Tohoku University,The University of Hong Kong",Visual Servoing,"This paper proposes CNN-based visual servoing for simultaneous positioning and flattening of a soft fabric part placed on a table by a dual manipulator system. We propose a network for multimodal data processing of grayscale images captured by a camera and force/torque applied to force sensors. The training dataset is collected by moving the real manipulators, which enables the network to map the captured images and force/torque to the manipulator’s motion in Cartesian space. We apply structured lighting to emphasize the features of the surface of the fabric part since the surface shape of the non-textured fabric part is difficult to recognize by a single grayscale image. Through experiments, we show that the fabric part with unseen wrinkles can be positioned and flattened by the proposed visual servoing scheme." 
Dynamical System-Based Imitation Learning for Visual Servoing Using the Large Projection Formulation,"Antonio Paolillo, Paolo Robuffo Giordano, Matteo Saveriano","IDSIA USI-SUPSI,IRISA CNRS UMR,,,,,University of Trento",Visual Servoing,"Nowadays ubiquitous robots must be adaptive and easy to use. To this end, dynamical system-based imitation learning plays an important role. In fact, it allows to realize stable and complex robotic tasks without explicitly coding them, thus facilitating the robot use. However, the adaptation capabilities of dynamical systems have not been fully exploited due to the lack of closed-loop implementations making use of visual feedback. In this regard, the integration of visual information allows higher flexibility to cope with environmental changes. This work presents a dynamical system-based imitation learning for visual servoing, based on the large projection task priority formulation. The proposed scheme enables complex and stable visual tasks, as demonstrated by a simulation analysis and experiments with a robotic manipulator." Constant Distance and Orientation Following of an Unknown Surface with a Cable-Driven Parallel Robot,"Thomas Rousseau, Nicolo Pedemonte, Stephane Caro, Francois Chaumette","Nantes Université, LS,N, IRT Jules Verne,IRT Jules Verne,CNRS/LS,N,Inria center at University of Rennes",Visual Servoing,"Cable-Driven Parallel Robots (CDPRs) are well-adapted to large workspaces since they replace rigid links by cables. However, they lack in positioning accuracy and new control methods are necessary to achieve profile-following tasks. This paper presents a control scheme designed for these tasks, relying on a combination of accurate boarded distance sensors and of a less accurate remote camera. 
The profile-following task is divided into two subtasks that are partially conflicting: maintaining a parallel orientation and a constant distance with the surface to follow, and following a trajectory between two points on the surface. The data fusion to solve the redundancy is based on the Gradient Projection Method. This control scheme is validated experimentally on a CDPR prototype and shown to provide the expected behaviour." 3D Spectral Domain Registration-Based Visual Servoing,"Komlan Adjigble, Brahim Tamadazte, Cristiana De Farias, Rustam Stolkin, Naresh Marturi","University of Birmingham,CNRS",Visual Servoing,"This paper presents a spectral domain registration-based visual servoing scheme that works on 3D point clouds. Specifically, we propose a 3D model/point cloud alignment method, which works by finding a global transformation between reference and target point clouds using spectral analysis. A 3D Fast Fourier Transformation (FFT) in R3 is used for the translation estimation, and the real spherical harmonics in SO(3) are used for the rotations estimation. Such an approach allows us to derive a decoupled 6 degrees of freedom (DoF) controller, where we use gradient ascent optimisation to minimise translation and rotational costs. We then show how this methodology can be used to regulate a robot arm to perform a positioning task. In contrast to the existing state-of-the-art depth-based visual servoing methods that either require dense depth maps or dense point clouds, our method works well with partial point clouds and can effectively handle larger transformations between the reference and the target positions. Furthermore, the use of spectral data (instead of spatial data) for transformation estimation makes our method robust to sensor-induced noise and partial occlusions. We validate our approach by performing experiments using point clouds acquired by a robot-mounted depth camera. 
Obtained results demonstrate the effectiveness of our visual servoing approach." Autonomous Endoscope Control Algorithm with Visibility and Joint Limits Avoidance Constraints for Da Vinci Research Kit Robot,"Rocco Moccia, Fanny Ficuciello","Università degli Studi di Napoli Federico II,Università di Napoli Federico II",Visual Servoing,"This paper presents a novel autonomous endoscope control method for the dVRK’s Endoscopic Camera Manipulator (ECM), which allows the camera to track the surgical instruments on the Patient Side Manipulator (PSM). An Image-based Visual Servoing (IBVS) is enforced by the addition of a visibility constraint that ensures the identified surgical tool remains in the camera’s Field Of View (FOV) for the continued availability of image feedback and a joint limits avoidance constraint that prevents the ECM from exceeding its joint limits. The work relies on an optimization approach, with constraints performed using the Control Barrier Functions concept (CBFs). The goal is to minimize the surgeon’s cognitive and physical workload by removing the time-consuming job of camera reorientation, offering an enforced method compared to the traditional IBVS endoscopic camera controller." Safe Control Using Vision-Based Control Barrier Function (V-CBF),"Hossein Abdi, Golnaz Raja, Reza Ghabcheloo",Tampere University,Visual Servoing,"Safe motion control in unknown environments is one of the challenging tasks in robotics, such as autonomous navigation. Control Barrier Function (CBF), as a strong mathematical tool, has been widely used in many safety-critical systems to satisfy safety requirements. However, there are only a handful of recent studies on safety controllers with perception inputs. Common assumptions in most of the works are that the CBF is already known and obstacles have predefined shapes. 
In this work, we introduce a novel Vision-based Control Barrier Function (V-CBF), which enables generalization to new environments and obstacles of arbitrary shapes. We then derive CBF safety conditions over RGB-D space and relate those to actual robot control inputs. To train the CBF function, we introduce a method to generate ground truth with desired properties complying with CBF and a method to generate part of the CBF as an image-to-image translation problem. We finally demonstrate the efficacy of V-CBF on the safe control of an autonomous car in the CARLA simulator." DC-MOT: Motion Deblurring and Compensation for Multi-Object Tracking in UAV Videos,"Song Cheng, Meibao Yao, Xueming Xiao","Jilin University,Changchun University of Science and Technology",Visual Tracking,"In this paper, we propose a multi-object tracking framework for videos captured by UAVs, considering motion imperfection in the following two aspects: 1) motion blurring of objects due to high-speed motion of the UAV and the objects, deteriorating the performance of the detector; 2) motion coupling of the global movement of the UAV camera with the object motion, making the objects' trajectories in adjacent frames more difficult to predict. For motion blurring, this paper proposes a hybrid deblurring module that deals with the blurred frames while retaining the clear frames, trading off between video tracking performance and spatio-temporal consistency. For motion coupling, we propose a motion compensation module to align adjacent frames by feature matching, and the corrected target position is obtained in the next frame to alleviate the interference of camera movement with tracking. We evaluate the proposed methods on the VisDrone dataset and validate that our framework achieves new state-of-the-art performance on UAV-based MOT systems." 
Fast Event-Based Double Integral for Real-Time Robotics,"Shijie Lin, Yinqiang Zhang, Dongyue Huang, Bin Zhou, Xiaowei Luo, Jia Pan","The University of Hong Kong,The Chinese University of Hong Kong,Beihang University,City University, HONG KONG,University of Hong Kong",Visual Tracking,"Motion deblurring is a critical ill-posed problem that is important in many vision-based robotics applications. The recently proposed event-based double integral (EDI) provides a theoretical framework for solving the deblurring problem with the event camera and generating clear images at high frame-rate. However, the original EDI is mainly designed for offline computation and does not meet the real-time requirements of many robotics applications. In this paper, we propose the fast EDI, an efficient implementation of EDI that can achieve real-time online computation on single-core CPU devices, which are common for physical robotic platforms used in practice. In experiments, our method can handle event rates as high as 13 million events per second in a wide variety of challenging lighting conditions. We demonstrate the benefit on multiple downstream real-time applications, including localization, visual tag detection, and feature matching." Continuous-Time Gaussian Process Motion-Compensation for Event-Vision Pattern Tracking with Distance Fields,"Cedric Le Gentil, Ignacio Alzugaray, Teresa A. Vidal-Calleja","University of Technology Sydney,Imperial College London",Visual Tracking,"This work addresses the issue of motion compensation and pattern tracking in event camera data. An event camera generates asynchronous streams of events triggered independently by each of the pixels upon changes in the observed intensity. Providing great advantages in low-light and rapid-motion scenarios, such unconventional data present significant research challenges as traditional vision algorithms are not directly applicable to this sensing modality. 
The proposed method decomposes the tracking problem into a local SE(2) motion-compensation step followed by a homography registration of small motion-compensated event batches. The first component relies on Gaussian Process (GP) theory to model the continuous occupancy field of the events in the image plane and embed the camera trajectory in the covariance kernel function. In doing so, estimating the trajectory is done similarly to GP hyperparameter learning by maximising the log marginal likelihood of the data. The continuous occupancy fields are turned into distance fields and used as templates for homography-based registration. By benchmarking the proposed method against other state-of-the-art techniques, we show that our open-source implementation performs high-accuracy motion compensation and produces high-quality tracks in real-world scenarios." EXOT: Exit-Aware Object Tracker for Safe Robotic Manipulation of Moving Object,"Hyunseo Kim, Hye Jung Yoon, Minji Kim, Dong-sig Han, Byoung-Tak Zhang",Seoul National University,Visual Tracking,"Current robotic hand manipulation narrowly operates with objects in predictable positions in limited environments. Thus, when the location of the target object deviates severely from the expected location, a robot sometimes responds in an unexpected way, especially when it operates with a human. For safe robot operation, we propose the EXit-aware Object Tracker (EXOT) on a robot hand camera that recognizes an object's absence during manipulation. The robot decides whether to proceed by examining the tracker's bounding box output containing the target object. We adopt an out-of-distribution classifier for more accurate object recognition since trackers can mistrack a background as a target object. To the best of our knowledge, our method is the first approach of applying an out-of-distribution classification technique to a tracker output. 
We evaluate our method on the first-person video benchmark dataset, TREK-150, and on the custom dataset, RMOT-223, that we collect from the UR5e robot. Then we test our tracker on the UR5e robot in real-time with a conveyor-belt sushi task, to examine the tracker's ability to track target dishes and to determine the exit status. Our tracker shows 38% higher exit-aware performance than a baseline method. The dataset and the code will be released at https://github.com/hskAlena/EXOT." Mono-STAR: Mono-Camera Scene-Level Tracking and Reconstruction,"Haonan Chang, Dhruv Metha Ramesh, Shijie Geng, Yuqiu Gan, Abdeslam Boularias","Rutgers University,Columbia University",Visual Tracking,"We present Mono-STAR, the first real-time RGB-D 3D reconstruction system that simultaneously supports semantic fusion, fast motion tracking, non-rigid object deformation, and topological change under a unified framework. The proposed system solves a new optimization problem incorporating optical-flow-based 2D constraints to deal with fast motion and a novel semantic-aware deformation graph (SAD-graph) for handling topology change. We test the proposed system under various challenging scenes and demonstrate that it significantly outperforms existing state-of-the-art methods." DFR-FastMOT: Detection Failure Resistant Tracker for Fast Multi-Object Tracking Based on Sensor Fusion,"Mohamed Nagy, Majid Khonji, Jorge Dias, Sajid Javed",Khalifa University,Visual Tracking,"Persistent multi-object tracking (MOT) allows autonomous vehicles to navigate safely in highly dynamic environments. One of the well-known challenges in MOT is object occlusion, when an object becomes unobservable for subsequent frames. Current MOT methods store object information, such as object trajectories, in internal memory to recover the objects after occlusions. However, they retain short-term memory to save computational time and avoid slowing down the MOT method. 
As a result, they lose track of objects in some occlusion scenarios, particularly long ones. In this paper, we propose DFR-FastMOT, a light MOT method that uses data from a camera and LiDAR sensors and relies on an algebraic formulation for object association and fusion. The formulation reduces the computational time and permits long-term memory that tackles more occlusion scenarios. Our method shows outstanding tracking performance over recent learning and non-learning benchmarks with about 3% and 4% margin in MOTA, respectively. Also, we conduct extensive experiments that simulate occlusion phenomena by employing detectors with various distortion levels. The proposed solution enables superior performance under various distortion levels in detection over current state-of-the-art methods. Our framework processes about 7,763 frames in 1.48 seconds, which is seven times faster than recent benchmarks. The framework will be available at https://github.com/MohamedNagyMostafa/DFR-FastMOT." Fusion of Events and Frames Using 8-DOF Warping Model for Robust Feature Tracking,"Min Seok Lee, Ye Jun Kim, Jae Hyung Jung, Chan Gook Park","Seoul National University,Hyundai motor group",Visual Tracking,"Event cameras are asynchronous neuromorphic vision sensors with high temporal resolution and no motion blur, offering advantages over standard frame-based cameras especially in high-speed motions and high dynamic range conditions. However, event cameras are unable to capture the overall context of the scene, and produce different events for the same scenery depending on the direction of the motion, creating a challenge in data association. Standard cameras, on the other hand, provide frames at a fixed rate that are independent of the motion direction and are rich in context. 
In this paper, we present a robust feature tracking method that employs an 8-DOF warping model to minimize the difference between brightness increment patches from events and frames, exploiting the complementary nature of the two data types. Unlike previous works, the proposed method enables tracking of features under complex motions accompanying distortions. Extensive quantitative evaluation over publicly available datasets was performed where our method shows an improvement over state-of-the-art methods in robustness with greatly prolonged feature age and in accuracy for challenging scenarios." 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds,"Jyoti Kini, Ajmal Mian, Mubarak Shah","University of Central Florida,University of Western Australia",Visual Tracking,"We propose a method for joint detection and tracking of multiple objects in 3D point clouds, a task conventionally treated as a two-step process comprising object detection followed by data association. Our method embeds both steps into a single end-to-end trainable network eliminating the dependency on external object detectors. Our model exploits temporal information employing multiple frames to detect objects and track them in a single network, thereby making it a utilitarian formulation for real-world scenarios. Computing the affinity matrix from feature similarity across consecutive point cloud scans forms an integral part of visual tracking. We propose an attention-based refinement module to refine the affinity matrix by suppressing erroneous correspondences. The module is designed to capture the global context in the affinity matrix by employing self-attention within each affinity matrix and cross-attention across a pair of affinity matrices. Unlike competing approaches, our network does not require complex post-processing algorithms, and processes raw LiDAR frames to directly output tracking results. 
We demonstrate the effectiveness of our method on the three tracking benchmarks: JRDB, Waymo, and KITTI. Experimental evaluations indicate the ability of our model to generalize well across datasets." Inverse Reinforcement Learning Framework for Transferring Task Sequencing Policies from Humans to Robots in Manufacturing Applications,"Omey Mohan Manyar, Zachary Mcnulty, Stefanos Nikolaidis, Satyandra K. Gupta","University of Southern California,UNIVERSITY OF SOUTHERN CALIFORNIA",Robot Learning,"In this work, we present an inverse reinforcement learning approach for solving the problem of task sequencing for robots in complex manufacturing processes. Our proposed framework is adaptable to variations in process and can perform sequencing for entirely new parts. We prescribe an approach to capture feature interactions in a demonstration dataset based on a metric that computes feature interaction coverage. We then actively learn the expert's policy by keeping the expert in the loop. Our training and testing results reveal that our model can successfully learn the expert's policy. We demonstrate the performance of our method on a real-world manufacturing application where we transfer the policy for task sequencing to a manipulator. Our experiments show that the robot can perform these tasks to produce human-competitive performance. Code and video can be found at: https://sites.google.com/usc.edu/irlfortasksequencing" Learning State Conditioned Linear Mappings for Low-Dimensional Control of Robotic Manipulators,"Michael Przystupa, Kerrick Johnstonbaugh, Zichen(Vincent) Zhang, Laura Petrich, Masood Dehghan, Faezeh Haghverd, Martin Jagersand","University of Alberta,University of Alberta, Canada",Robot Learning,"Identifying an appropriate task space can simplify solving robotic manipulation problems. One solution is deploying control algorithms in a learned low-dimensional action space. 
Linear and nonlinear action mapping methods have trade-offs between simplicity and the ability to express motor commands outside of a single low-dimensional subspace. We propose that learning local linear action representations can achieve both of these benefits. Our state-conditioned linear maps ensure that for any given state, the high-dimensional robotic actuation is linear in the low-dimensional actions. As the robot state evolves, so do the action mappings, so that necessary motions can be performed during a task. These local linear representations guarantee desirable theoretical properties by design. We validate these findings empirically through two user studies. Results suggest state-conditioned linear maps outperform conditional autoencoder and PCA baselines on a pick-and-place task and perform comparably to mode switching in a more complex pouring task." Decoupling Skill Learning from Robotic Control for Generalizable Object Manipulation,"Kai Lu, Bo Yang, Bing Wang, Andrew Markham","University of Oxford,The Hong Kong Polytechnic University,Oxford University",Robot Learning,"Recent works in robotic manipulation through reinforcement learning (RL) or imitation learning (IL) have shown potential for tackling a range of tasks e.g., opening a drawer or a cupboard. However, these techniques generalize poorly to unseen objects. We conjecture that this is due to the high-dimensional action space for joint control. In this paper, we take an alternative approach and separate the task of learning 'what to do' from 'how to do it' i.e., whole-body control. We pose the RL problem as one of determining the skill dynamics for a disembodied virtual manipulator interacting with articulated objects. The whole-body robotic kinematic control is optimized to execute the high-dimensional joint motion to reach the goals in the workspace. It does so by solving a quadratic programming (QP) model with robotic singularity and kinematic constraints. 
Our experiments on manipulating complex articulated objects show that the proposed approach is more generalizable to unseen objects with large intra-class variations, outperforming previous approaches. The evaluation results indicate that our approach generates more compliant robotic motion and outperforms the pure RL and IL baselines in task success rates. Additional information and videos are available at https://kl-research.github.io/decoupskill." Comparison of Model-Based and Model-Free Reinforcement Learning for Real-World Dexterous Robotic Manipulation Tasks,"David Patricio Valencia Redrovan, John Jia, Raymond Li, Alex Hayashi, Reuel Terezakis, Trevor Gee, Minas Liarokapis, Bruce Macdonald, Henry Williams","The University of Auckland,University of AUCKLAND,University of Auckland",Robot Learning,"Model Free Reinforcement Learning (MFRL) has shown significant promise for learning dexterous robotic manipulation tasks, at least in simulation. However, the high number of samples, as well as the long training times, prevent MFRL from scaling to complex real-world tasks. Model-Based Reinforcement Learning (MBRL) emerges as a potential solution that, in theory, can improve the data efficiency of MFRL approaches. This could drastically reduce the training time of MFRL, and increase the application of RL for real-world robotic tasks. This article presents a study on the feasibility of using the state-of-the-art MBRL to improve the training time for two real-world dexterous manipulation tasks. The evaluation is conducted on a real low-cost robot gripper where the predictive model and the control policy are learned from scratch. The results indicate that MBRL is capable of learning accurate models of the world, but does not show clear improvements in learning the control policy in the real world as prior literature suggests should be expected." 
Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control,"Murad Elnagdi, Nils Dengler, Jorge De Heuvel, Maren Bennewitz",University of Bonn,Robot Learning,"Reinforcement learning (RL) has recently shown great success in various domains. Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour. Using a sparse reward conveniently mitigates these challenges. However, the sparse reward represents a challenge on its own, often resulting in unsuccessful training of the agent. In this paper, we therefore address the sparse reward problem in RL. Our goal is to find an effective alternative to reward shaping, without using costly human demonstrations, that would also be applicable to a wide range of domains. Hence, we propose to use model predictive control (MPC) as an experience source for training RL agents in sparse reward environments. Without the need for reward shaping, we successfully apply our approach in the field of mobile robot navigation both in simulation and real-world experiments with a Kobuki Turtlebot 2. We furthermore demonstrate great improvement over pure RL algorithms in terms of success rate as well as number of collisions and timeouts. Our experiments show that MPC as an experience source improves the agent's learning process for a given task in the case of sparse rewards." Task-Driven Graph Attention for Hierarchical Relational Object Navigation,"Michael Lingelbach, Chengshu Li, Minjune Hwang, Andrey Kurenkov, Alan Lou, Roberto Martín-martín, Ruohan Zhang, Fei-Fei Li, Jiajun Wu","Stanford University,University of Texas at Austin,Stanford University",Robot Learning,"Embodied AI agents in large scenes often need to navigate to find objects. 
In this work, we study a naturally emerging variant of the object navigation task, hierarchical relational object navigation (HRON), where the goal is to find objects specified by logical predicates organized in a hierarchical structure—objects related to furniture and then to rooms—such as finding an apple on top of a table in the kitchen. Solving such a task requires an efficient representation to reason about object relations and correlate the relations in the environment and in the task goal. HRON in large scenes (e.g. homes) is particularly challenging due to its partial observability and long horizon, which invites solutions that can compactly store the past information while effectively exploring the scene. We demonstrate experimentally that scene graphs are the best-suited representation compared to conventional representations such as images or 2D maps. We propose a solution that uses scene graphs as part of its input and integrates graph neural networks as its backbone, with an integrated task-driven attention mechanism, and demonstrate better scalability and learning efficiency than state-of-the-art baselines." Safety-Guaranteed Skill Discovery for Robot Manipulation Tasks,"Sunin Kim, Jaewoon Kwon, Taeyoon Lee, Younghyo Park, Julien Perez","NAVER LABS,Naver labs,MIT,NAVER LABS EUROPE",Robot Learning,"Programming manipulation behaviors can become increasingly difficult with a growing number and complexity of manipulation tasks, particularly in a dynamic and unstructured environment. Recent progress in unsupervised skill discovery algorithms has shown great promise in learning an extensive collection of behaviors without extrinsic supervision. On the other hand, safety is one of the most critical factors for real-world robot applications. As skill discovery methods typically encourage exploratory and dynamic behaviors, it can often be the case that a large portion of learned skills remain too dangerous and unsafe. 
In this paper, we introduce the novel problem of Safety-Aware Skill Discovery, which aims to learn, in a task-agnostic fashion, a repertoire of reusable skills that are inherently safe to be composed for solving downstream tasks. We present a computationally tractable algorithm that learns a latent-conditioned skill policy that maximizes intrinsic rewards regularized with a safety-critic that can model any user-defined safety constraints. Using the pretrained safe skill repertoire, hierarchical reinforcement learning can solve multiple downstream tasks without the need for explicit consideration of safety during training and testing. We evaluate our algorithm on a collection of force-controlled robotic manipulation tasks in simulation and show promising downstream task performance while satisfying safety constraints." A Framework for the Unsupervised Inference of Relations between Sensed Object Spatial Distributions and Robot Behaviors,"Christopher Morse, Lu Feng, Matthew Dwyer, Sebastian Elbaum",University of Virginia,Robot Learning,"The spatial distribution of sensed objects strongly influences the behavior of mobile robots. Yet, as robots evolve in complexity to operate in increasingly rich environments, it becomes much more difficult to specify the underlying relations between sensed object spatial distributions and robot behaviors. We aim to address this challenge by leveraging system trace data to automatically infer relations that help to better characterize these spatial associations. In particular, we introduce SpRInG, a framework for the unsupervised inference of system specifications from traces that characterize the spatial relationships under which a robot operates. Our method builds on a parameterizable notion of reachability to encode relationships of spatial neighborship, which are used to instantiate a language of patterns. 
These patterns provide the structure to infer, from system traces, the connection between such relationships and robot behaviors. We show that SpRInG can automatically infer spatial relations over two distinct domains: autonomous vehicles in traffic and a surgical robot. Our results demonstrate the power and expressiveness of SpRInG, in its ability to learn existing specifications as machine-checkable first-order logic, uncover previously unstated specifications that are rich and insightful, and reveal contextual differences between executions." Learning Video-Conditioned Policies for Unseen Manipulation Tasks,"Elliot Chane-sane, Cordelia Schmid, Ivan Laptev","Inria PARIS,Inria,INRIA",Robot Learning,"The ability to specify robot commands by a non-expert user is critical for building generalist agents capable of solving a large variety of tasks. One convenient way to specify the intended robot goal is by a video of a person demonstrating the target task. While prior work typically aims to imitate human demonstrations performed in robot environments, here we focus on a more realistic and challenging setup with demonstrations recorded in natural and diverse human environments. We propose Video-conditioned Policy learning (ViP), a data-driven approach that maps human demonstrations of previously unseen tasks to robot manipulation skills. To this end, we learn our policy to generate appropriate actions given current scene observations and a video of the target task. To encourage generalization to new tasks, we avoid particular tasks during training and learn our policy from unlabelled robot trajectories and corresponding robot videos. Both robot and human videos in our framework are represented by video embeddings pre-trained for human action recognition. At test time we first translate human videos to robot videos in the common video embedding space, and then use resulting embeddings to condition our policies. 
Notably, our approach enables robot control by human demonstrations in a zero-shot manner, i.e., without using robot trajectories paired with human instructions during training. We validate our approach on a set of challenging multi-task robot manipulation environments and outperform state of the art. Our method also demonstrates excellent performance in a new challenging zero-shot setup where no paired data is used during training." Learning Food Picking without Food: Fracture Anticipation by Breaking Reusable Fragile Objects,"Rinto Yagawa, Reina Ishikawa, Masashi Hamaya, Kazutoshi Tanaka, Atsushi Hashimoto, Hideo Saito","Keio University,OMRON SINIC X Corporation,OMRON SINIC X",Robot Learning,"Food picking is trivial for humans but not for robots, as foods are fragile. Presetting foods' physical properties does not help robots much due to the objects' inter- and intra-category diversity. A recent study proved that learning-based fracture anticipation with tactile sensors could overcome this problem; however, the method trains the model for each food to deal with intra-category differences, and tuning robots for each food leads to an undesirable amount of food consumption. This study proposes a novel framework for learning food-picking tasks without consuming foods. The key idea is to leverage the object-breaking experiences of several reusable fragile objects instead of consuming real foods while making the picking ability object-invariant with domain generalization (DG). In real-robot experiments, we trained a model with reusable objects (toy blocks, ping-pong balls, and jellies), which are selected by three typical fracture types (crack, rupture, and crush). We then tested the model with four real food objects (tofu, bananas, potato chips, and tomatoes). The results showed that the proposed combination of reusable objects' breaking experiences and DG is effective for the food-picking task." 
Learning Risk-Aware Costmaps Via Inverse Reinforcement Learning for Off-Road Navigation,"Samuel Triest, Mateo Guaman Castro, Parv Maheshwari, Matthew Sivaprakasam, Wenshan Wang, Sebastian Scherer","Carnegie Mellon University,Indian Institute of Technology Kharagpur",Robot Learning,"The process of designing costmaps for off-road driving tasks is often a challenging and engineering-intensive task. Recent work in costmap design for off-road driving focuses on training deep neural networks to predict costmaps from sensory observations using corpora of expert driving data. However, such approaches are generally subject to overconfident mis-predictions and are rarely evaluated in-the-loop on physical hardware. We present an inverse reinforcement learning-based method of efficiently training deep cost functions that are uncertainty-aware. We do so by leveraging recent advances in highly parallel model-predictive control and robotic risk estimation. In addition to demonstrating improvement at reproducing expert trajectories, we also evaluate the efficacy of these methods in challenging off-road navigation scenarios. We observe that our method significantly outperforms a geometric baseline, resulting in 44% improvement in expert path reconstruction and 57% fewer interventions in practice. We also observe that varying the risk tolerance of the vehicle results in qualitatively different navigation behaviors, especially with respect to higher-risk scenarios such as slopes and tall grass." How Does It Feel? Self-Supervised Costmap Learning for Off-Road Vehicle Traversability,"Mateo Guaman Castro, Samuel Triest, Wenshan Wang, Jason M. Gregory, Felix Sanchez, John G. Rogers Iii, Sebastian Scherer","Carnegie Mellon University,US Army Research Laboratory,Booz Allen Hamilton",Robot Learning,"Estimating terrain traversability in off-road environments requires reasoning about complex interaction dynamics between the robot and these terrains. 
However, it is challenging to create informative labels to learn a model in a supervised manner for these interactions. We propose a method that learns to predict traversability costmaps by combining exteroceptive environmental information with proprioceptive terrain interaction feedback in a self-supervised manner. Additionally, we propose a novel way of incorporating robot velocity into the costmap prediction pipeline. We validate our method in multiple short and large-scale navigation tasks on challenging off-road terrains using two different large, all-terrain robots. Our short-scale navigation results show that using our learned costmaps leads to overall smoother navigation, and provides the robot with a more fine-grained understanding of the robot-terrain interactions. Our large-scale navigation trials show that we can reduce the number of interventions by up to 57% compared to an occupancy-based navigation baseline in challenging off-road courses ranging from 400 m to 3150 m. Appendix and full experiment videos can be found on our website: https://mateoguaman.github.io/hdif." Global and Reactive Motion Generation with Geometric Fabric Command Sequences,"Weiming Zhi, Iretiayo Akinola, Karl Van Wyk, Nathan Ratliff, Fabio Ramos","Carnegie Mellon University, University of Sydney,Columbia University,NVIDIA,University of Sydney, NVIDIA",Learning for Control I,"Motion generation seeks to produce safe and feasible robot motion from start to goal. Various tools at different levels of granularity have been developed. On one extreme, sampling-based motion planners focus on completeness -- a solution, if it exists, would eventually be found. However, the produced paths are often of low quality and contain superfluous motion. On the other extreme, reactive methods optimise the immediate cost to obtain the next controls, producing smooth and legible motion that can quickly adapt to perturbations, uncertainties, and changes in the environment. 
However, reactive methods are highly local, and often produce motion that becomes trapped in non-convex regions of the environment. This paper contributes Geometric Fabric Command Sequences, a method that lies in the middle ground. It can produce globally optimal motion that is smooth and intuitive, while also being reactive. We model motion via a reactive Geometric Fabric policy that ingests a sequence of attractor states, or commands, and then apply global optimisation over the space of commands. We postulate that solutions for different problems and scenes are highly transferable when conditioned on environmental features. Therefore, an implicit generative model is trained on solutions from optimisation and environment features in a self-supervised manner. That is, faced with multiple motion generation problems, the learning and optimisation are contained within the same loop: the optimisation generates labels for learning, while the learning improves the optimisation for the next problem, which in turn provides higher quality labels. We empirically validate our method in both simulation and on a real-world 6-DOF JACO arm." Enforcing the Consensus between Trajectory Optimization and Policy Learning for Precise Robot Control,"Quentin Le Lidec, Wilson Jallet, Ivan Laptev, Cordelia Schmid, Justin Carpentier","INRIA-ENS-PSL,LAAS-CNRS,INRIA,Inria",Learning for Control I,"Reinforcement learning (RL) and trajectory optimization (TO) present strong complementary advantages. On one hand, RL approaches are able to learn global control policies directly from data, but generally require large sample sizes to properly converge towards feasible policies. On the other hand, TO methods are able to exploit gradient-based information extracted from simulators to quickly converge towards a locally optimal control trajectory which is only valid within the vicinity of the solution. 
Over the past decade, several approaches have aimed to adequately combine the two classes of methods in order to obtain the best of both worlds. Following on from this line of research, we propose several improvements on top of these approaches to learn global control policies more quickly, notably by leveraging sensitivity information stemming from TO methods via Sobolev learning, and Augmented Lagrangian (AL) techniques to enforce the consensus between TO and policy learning. We evaluate the benefits of these improvements on various classical tasks in robotics through comparison with existing approaches in the literature." Neural Optimal Control Using Learned System Dynamics,"Kazim Selim Engin, Volkan Isler",University of Minnesota,Learning for Control I,"We study the problem of generating control laws for systems with unknown dynamics. Our approach is to represent the controller and the value function with neural networks, and to train them using loss functions adapted from the Hamilton-Jacobi-Bellman (HJB) equations. In the absence of a known dynamics model, our method first learns the state transitions from data collected by interacting with the system in an offline process. The learned transition function is then integrated into the HJB equations and used to forward simulate the control signals produced by our controller in a feedback loop. In contrast to trajectory optimization methods that optimize the controller for a single initial state, our controller can generate near-optimal control signals for initial states from a large portion of the state space. Compared to recent model-based reinforcement learning algorithms, we show that our method is more sample efficient and trains faster by an order of magnitude. We demonstrate our method in a number of tasks, including the control of a quadrotor with 12 state variables." 
Learned Risk Metric Maps for Kinodynamic Systems,"Ross Allen, Wei Xiao, Daniela Rus","MIT Lincoln Laboratory,MIT",Learning for Control I,"We present Learned Risk Metric Maps (LRMM) for real-time estimation of coherent risk metrics of high-dimensional dynamical systems operating in unstructured, partially observed environments. LRMM models are simple to design and train, requiring only procedural generation of obstacle sets, state and control sampling, and supervised training of a function approximator, which makes them broadly applicable to arbitrary system dynamics and obstacle sets. In a parallel autonomy setting, we demonstrate the model's ability to rapidly infer collision probabilities of a fast-moving car-like robot driving recklessly in an obstructed environment; allowing the LRMM agent to intervene, take control of the vehicle, and avoid collisions. In this time-critical scenario, we show that LRMMs can evaluate risk metrics 20-100x faster than alternative safety algorithms based on control barrier functions (CBFs) and Hamilton-Jacobi reachability (HJ-reach), leading to 5-15% fewer obstacle collisions by the LRMM agent than CBFs and HJ-reach. This performance improvement comes in spite of the fact that the LRMM model only has access to local/partial observation of obstacles, whereas the CBF and HJ-reach agents are granted privileged/global information. We also show that our model can be equally well trained on a 12-dimensional quadrotor system operating in an obstructed indoor environment. All software for training and experiments is provided at https://github.com/mit-drl/pyrmm" Autonomous Drifting with 3 Minutes of Data Via Learned Tire Models,"Franck Djeumou, Jonathan Goh, Ufuk Topcu, Avinash Balachandran","University of Texas at Austin,Toyota Research Institute,The University of Texas at Austin,Toyota Research Institute",Learning for Control I,"Near the limits of adhesion, the forces generated by a tire are nonlinear and intricately coupled. 
Efficient and accurate modelling in this region could improve safety, especially in emergency situations where high forces are required. To this end, we propose a novel family of tire force models based on neural ordinary differential equations and a neural-ExpTanh parameterization. These models are designed to satisfy physically insightful assumptions while also having sufficient fidelity to capture higher-order effects directly from vehicle state measurements. They are used as drop-in replacements for an analytical brush tire model in an existing nonlinear model predictive control framework. Experiments with a customized Toyota Supra show that a small amount of driving data (less than three minutes) is sufficient to achieve high-performance autonomous drifting on various trajectories with speeds up to 45 mph. Comparisons with the benchmark model show a 4x improvement in tracking performance, smoother control inputs, and faster and more consistent computation time." DDK: A Deep Koopman Approach for Longitudinal and Lateral Control of Autonomous Ground Vehicles,"Yongqian Xiao, Xinglong Zhang, Xin Xu, Lu Yang, Junxiang Li","National University of Defense Technology,National university of defense technology",Learning for Control I,"Autonomous driving has attracted lots of attention in recent years. For some tasks, e.g., trajectory prediction, motion planning, and trajectory tracking, an accurate vehicle model can reduce the difficulty of these tasks and improve task completion performance. Prior works focused on parameter estimation of physical models or modeling nonlinear dynamics using neural networks. Still, these methods rely on internal parameters of vehicles or are not friendly for control due to the strong nonlinearity of models. This paper proposes a data-driven method to approximate vehicle dynamics based on the Koopman operator. 
The resulting model is an interpretable linear time-invariant model, facilitating controller design and solving related optimization problems. In the proposed approach, the state transition matrix is constructed based on the learned Koopman eigenvalues, while the input matrix is trained as a tensor. Based on the resulting model, a linear model predictive controller is designed to implement coupled longitudinal and lateral trajectory tracking. Simulations and experiments, including vehicle dynamics modeling and coupled longitudinal and lateral trajectory tracking, are performed in a high-fidelity CarSim environment and a real vehicle platform. An oil-driven D-Class SUV is selected in the simulation, while a real electric SUV is utilized in the experiment. Simulation and experiment results illustrate that the model of the nonlinear vehicle dynamics can be identified effectively via the proposed method, and high-quality trajectory tracking performance can be obtained with the resulting model." Meta-Learning-Based Optimal Control for Soft Robotic Manipulators to Interact with Unknown Environments,"Zhiqiang Tang, Peiyi Wang, Wenci Xin, Zhexin Xie, Longxin Kan, Muralidharan Mohanakrishnan, Cecilia Laschi","National University of Singapore,Beijing Jiaotong University",Learning for Control I,"Safe and efficient robot-environment interaction is a critical but challenging problem as robots are being increasingly employed to operate in unstructured and unpredictable environments. Soft robots are inherently compliant to safely interact with environments but their high nonlinearity exacerbates control difficulties. Meta-learning provides a powerful tool for fast online model adaptation because it can learn an efficient model from data across different environments. Thus, this work applies the idea of meta-learning for the control of soft robotics. 
In particular, a target-oriented proactive search strategy is first performed to collect environment-specific data efficiently when a new interaction environment occurs. Then meta-learning exploits past experience to train a data-driven probabilistic model prior, and the model prior is updated online to adapt quickly to the new environment. Lastly, a model-based optimal control policy is utilized to drive the robot to desired performance. Our approach controls a soft robotic manipulator to achieve the desired position and contact force simultaneously when interacting with unknown changing environments. Experimental results demonstrate that the tracking error stays within 1 mm for position and 0.01 N for contact force. Overall, this work provides a viable control approach for soft robots to interact with unknown environments." Dealing with Sparse Rewards in Continuous Control Robotics Via Heavy-Tailed Policy Optimization,"Souradip Chakraborty, Amrit Bedi, Kasun Weerakoon, Prithvi Poddar, Alec Koppel, Pratap Tokekar, Dinesh Manocha","UNIVERSITY OF MARYLAND,University of Maryland, College Park,IISER Bhopal,JP Morgan Chase,University of Maryland",Learning for Control I,"In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-SPG) algorithm to deal with the challenges of sparse rewards in continuous control problems. Sparse rewards are common in continuous control robotics tasks such as manipulation and navigation and make the learning problem hard due to the non-trivial estimation of value functions over the state space. This demands either reward shaping or expert demonstrations for the sparse reward environment. However, obtaining high-quality demonstrations is quite expensive and sometimes even impossible. We propose a heavy-tailed policy parametrization along with a modified momentum-based policy gradient tracking scheme (HT-SPG) to induce stable exploratory behavior in the algorithm. 
The proposed algorithm does not require access to expert demonstrations. We test the performance of HT-SPG on various benchmark tasks of continuous control with sparse rewards such as 1D Mario, Pathological Mountain Car, Sparse Pendulum in OpenAI Gym, and Sparse MuJoCo environments (Hopper-v2, Half-Cheetah, Walker-2D). We show consistent performance improvement across all tasks in terms of high average cumulative reward without requiring access to expert demonstrations. We further demonstrate that a navigation policy trained using HT-SPG can be easily transferred to a Clearpath Husky robot to perform real-world navigation tasks." MPC with Sensor-Based Online Cost Adaptation,"Avadesh Meduri, Huaijiang Zhu, Armand Jordana, Ludovic Righetti","New York University,NYU",Learning for Control I,"Model predictive control is a powerful tool to generate complex motions for robots. However, it often requires solving non-convex problems online to produce rich behaviors, which is computationally expensive and not always practical in real time. Additionally, direct integration of high dimensional sensor data (e.g. RGB-D images) in the feedback loop is challenging with current state-space methods. This paper aims to address both issues. It introduces a model predictive control scheme, where a neural network constantly updates the cost function of a quadratic program based on sensory inputs, aiming to minimize a general non-convex task loss without solving a non-convex problem online. By updating the cost, the robot is able to adapt to changes in the environment directly from sensor measurement without requiring a new cost design. Furthermore, since the quadratic program can be solved efficiently with hard constraints, a safe deployment on the robot is ensured. 
Experiments with a wide variety of reaching tasks on an industrial robot manipulator demonstrate that our method can efficiently solve complex non-convex problems with high-dimensional visual sensory inputs, while still being robust to external disturbances." ReachLipBnB: A Branch-And-Bound Method for Reachability Analysis of Neural Network Autonomous Systems Using Lipschitz Bounds,"Taha Entesari, Sina Sharifi, Mahyar Fazlyab",Johns Hopkins University,Learning for Control I,"We propose a novel Branch-and-bound method for reachability analysis of neural networks. Our idea is to first compute accurate bounds on the Lipschitz constant of the neural network in specific directions of interest offline using a convex program. We then use these computations to obtain an instantaneous but conservative polyhedral approximation of the reachable set online using Lipschitz continuity arguments. To reduce conservatism, we incorporate our bounding algorithm within a branching strategy to decrease the over-approximation error within an arbitrary accuracy. We then extend our method to reachability analysis of control systems with neural network controllers. Finally, to capture the shape of the reachable sets as accurately as possible, we use sample trajectories to inform the directions of the reachable set over-approximations using Principal Component Analysis (PCA). We evaluate the performance of the proposed method in several open-loop and closed-loop settings." Gradient-Based Trajectory Optimization with Learned Dynamics,"Bhavya Sukhija, Nathanael Köhler, Miguel Zamora, Simon Zimmermann, Sebastian Curi, Stelian Coros, Andreas Krause","ETH Zürich,ETH Zurich",Learning for Control I,"Trajectory optimization methods have achieved an exceptional level of performance on real-world robots in recent years. These methods heavily rely on accurate analytical models of the dynamics, yet some aspects of the physical world can only be captured to a limited extent. 
An alternative approach is to leverage machine learning techniques to learn a differentiable dynamics model of the system from data. In this work, we use trajectory optimization and model learning for performing highly dynamic and complex tasks with robotic systems in the absence of accurate analytical models of the dynamics. We show that a neural network can model highly nonlinear behaviors accurately for large time horizons, from data collected in only 25 minutes of interactions on two distinct robots: (i) the Boston Dynamics Spot and (ii) an RC car. Furthermore, we use the gradients of the neural network to perform gradient-based trajectory optimization. In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and Radio-controlled (RC) car, and gives good performance in combination with trajectory optimization methods." RAMP-Net: A Robust Adaptive MPC for Quadrotors Via Physics-Informed Neural Network,"Sourav Sanyal, Kaushik Roy",Purdue University,Learning for Control I,"Model Predictive Control (MPC) is a state-of-the-art (SOTA) control technique which requires solving hard constrained optimization problems iteratively. In the event of uncertain dynamics (typically encountered in real-life), analytical model based MPC requires setting conservative bounds on disturbances to obtain robust controllers. This, however, increases the hardness of the problem, as more constraint satisfactions are required. The problem is exacerbated in performance-critical applications, when more compute is required in less time. Data-driven regression methods such as Neural Networks have been proposed in the past to approximate system dynamics. However, such models rely on high volumes of labeled data, in the absence of symbolic analytical priors. This incurs non-trivial training overheads. 
Physics-informed Neural Networks (PINNs) have gained traction for approximating non-linear systems of ordinary differential equations (ODEs) with reasonable accuracy. In this work, we propose a Robust Adaptive MPC framework via PINNs (RAMP-Net), which uses a neural network trained partly from simple ODEs and partly from data. A physics loss is used to learn simple ODEs (representing ideal dynamics). Having access to analytical functions inside the loss function acts as a regularizer, enforcing robust behavior for parametric uncertainties. On the other hand, a regular data loss is used for adapting to residual disturbances (non-parametric uncertainties), unaccounted for during mathematical modelling. Experiments are performed in a simulated environment for trajectory tracking of a quadrotor. We report 7.8% to 43.2% and 8.04% to 61.5% reduction in tracking errors for speeds ranging from 0.5 to 1.75 m/s compared to two SOTA regression based MPC methods." 3-D Reconstruction Using Monocular Camera and Lights: Multi-View Photometric Stereo for Non-Stationary Robots,"Monika Roznere, Philippos Mordohai, Ioannis Rekleitis, Alberto Quattrini Li","Dartmouth College,Stevens Institute of Technology,University of South Carolina",Marine Robotics I,"This paper proposes a novel underwater Multi-View Photometric Stereo (MVPS) framework for reconstructing scenes in 3-D with a non-stationary low-cost robot equipped with a monocular camera and fixed lights. The underwater realm is the primary focus of study here, due to the challenges in utilizing underwater camera imagery and lack of low-cost reliable localization systems. Previous underwater PS approaches provided accurate scene reconstruction results, but assumed that the robot was stationary at the bottom. This assumption is limiting, as many artifacts, reefs, and man-made structures are large and meters above the bottom. 
Our proposed MVPS framework relaxes the stationarity assumption by utilizing a monocular SLAM system to estimate small robot motions and extract an initial sparse feature map. To compensate for the scale inconsistency in monocular SLAM output, our MVPS optimization scheme collectively estimates a high-quality, dense 3-D reconstruction and corrects the camera pose estimates. We also present an attenuation and camera-light extrinsic parameter calibration method for non-stationary robots. Finally, validation experiments with a BlueROV2 demonstrated the low-cost capability of producing high-quality scene reconstructions. Overall, this work is the foundation of an active perception pipeline for robots (i.e., underwater, ground, and aerial) to explore and map complex structures in high accuracy and resolution with an inexpensive sensor-light configuration." GMM Registration: A Probabilistic Scan Matching Approach for Sonar-Based AUV Navigation,"Pau Vial, Miguel Malagón Pedrosa, Ricard Segura, Narcís Palomeras, Marc Carreras","Universitat de Girona ESQ,,,,,,,E,Universitat de Girona",Marine Robotics I,"Acoustic perception in underwater environments is challenging due to the low frequency of the acquisition system and multiple and huge sources of noise. Therefore, point clouds built by profiling sonars mounted on Autonomous Underwater Vehicles (AUV) are sparse and noisy. To solve the mapping task, AUVs need a registration algorithm to prevent maps from inconsistencies. Many scan matching algorithms are available, however, a few of them are specialized in acoustic data. In this paper, a probabilistic scan matching methodology based on Gaussian Mixtures Models (GMM) is presented and, for the first time, the Bayesian-GMM algorithm is applied in this context to model acoustic data. The scan matching problem is properly formulated using Lie groups to define pose. 
In addition, this methodology can return an uncertainty measure for the matching result, which is fundamental in Pose SLAM applications. This tool is implemented in a public C++ library that can process 2D and 3D scans acquired by a profiling sonar in real time. Theoretical justification and results with real data are provided to benchmark our method against the state-of-the-art Normal Distributions Transforms (NDT) technique. The library repository can be found at https://bitbucket.org/gmmregistration/gmm_registration." Neural Implicit Surface Reconstruction Using Imaging Sonar,"Mohamad Qadri, Michael Kaess, Ioannis Gkioulekas",Carnegie Mellon University,Marine Robotics I,"We present a technique for dense 3D reconstruction of objects using an imaging sonar, also known as forward-looking sonar (FLS). Compared to previous methods that model the scene geometry as point clouds or volumetric grids, we represent the geometry as a neural implicit function. Additionally, given such a representation, we use a differentiable volumetric renderer that models the propagation of acoustic waves to synthesize imaging sonar measurements. We perform experiments on real and synthetic datasets and show that our algorithm reconstructs high-fidelity surface geometry from multi-view FLS images at much higher quality than was possible with previous techniques and without suffering from their associated memory overhead." Conditional GANs for Sonar Image Filtering with Applications to Underwater Occupancy Mapping,"Tianxiang Lin, Akshay Hinduja, Mohamad Qadri, Michael Kaess",Carnegie Mellon University,Marine Robotics I,"Underwater robots typically rely on acoustic sensors like sonar to perceive their surroundings. However, these sensors are often inundated with multiple sources and types of noise, which makes using raw data for any meaningful inference with features, objects, or boundary returns very difficult. 
While several conventional methods of dealing with noise exist, their success rates are unsatisfactory. This paper presents a novel application of conditional Generative Adversarial Networks to train a model to produce noise-free sonar images, outperforming several conventional filtering methods. Estimating free space is crucial for autonomous robots performing active exploration and mapping. Thus we apply our approach to the task of underwater occupancy mapping and show superior free and occupied space inference when compared to conventional methods." Stochastic Planning for ASV Navigation Using Satellite Images,"Yizhou Huang, Hamza Dugmag, Florian Shkurti, Timothy Barfoot",University of Toronto,Marine Robotics I,"Autonomous surface vessels (ASV) represent a promising technology to automate water-quality monitoring of lakes. In this work, we use satellite images as a coarse map and plan sampling routes for the robot. However, inconsistency between the satellite images and the actual lake, as well as environmental disturbances such as wind, aquatic vegetation, and changing water levels can make it difficult for robots to visit places suggested by the prior map. This paper presents a robust route-planning algorithm that minimizes the expected total travel distance given these environmental disturbances, which induce uncertainties in the map. We verify the efficacy of our algorithm in simulations of over a thousand Canadian lakes and demonstrate an application of our algorithm in a 3.7 km-long real-world robot experiment on a lake in Northern Ontario, Canada." 
Autonomous Underwater Docking Using Flow State Estimation and Model Predictive Control,"Rakesh Vivekanandan, Geoffrey Hollinger, Dongsik Chang","Oregon State University,Amazon",Marine Robotics I,"We present a navigation framework to perform autonomous underwater docking to a wave energy converter (WEC) under various ocean conditions by incorporating flow state estimation into the design of model predictive control (MPC). Existing methods lack the ability to perform dynamic rendezvous and autonomously dock in energetic conditions. The use of exteroceptive sensors or high performing acoustic sensors has been previously investigated to obtain or estimate the flow states. However, the use of such sensors increases the overall cost of the system and expects the vehicle to navigate close to the seafloor or other landmarks. To overcome these limitations, our method couples an active perception framework with MPC to estimate the flow states simultaneously while moving towards the dock. Our simulation results demonstrate the robustness and reliability of the proposed framework for autonomous docking under various ocean conditions. Furthermore, we conducted laboratory trials with a BlueROV2 docking with an oscillating dock and achieved a greater than 70% success rate." Real-Time Navigation for Autonomous Surface Vehicles in Ice-Covered Waters,"Rodrigue De Schaetzen, Alexander Botros, Robert Gash, Kevin Murrant, Stephen L. Smith","University of Waterloo,National Research Council of Canada",Marine Robotics I,"Vessel transit in ice-covered waters poses unique challenges in safe and efficient motion planning. When the concentration of ice is high, it may not be possible to find collision-free paths. Instead, ice can be pushed out of the way if it is small or if contact occurs near the edge of the ice. In this work, we propose a real-time navigation framework that minimizes collisions with ice and distance travelled by the vessel. 
We exploit a lattice-based planner with a cost that captures the ship interaction with ice. To address the dynamic nature of the environment, we plan motion in a receding horizon manner based on updated vessel and ice state information. Further, we present a novel planning heuristic for evaluating the cost-to-go, which is applicable to navigation in a channel without a fixed goal location. The performance of our planner is evaluated across several levels of ice concentration both in simulated and in real-world experiments." Experiments in Underwater Feature Tracking with Performance Guarantees Using a Small AUV,"Benjamin Adams Biggs, Hans He, James Mcmahon, Daniel Stilwell","Virginia Polytechnic Institute and State University,Virginia Tech,The Naval Research Laboratory",Marine Robotics I,"We present the results of experiments performed using a small autonomous underwater vehicle to determine the location of an isobath within a bounded area. The primary contribution of this work is to implement and integrate several recent developments in real-time planning for environmental mapping, and to demonstrate their utility in a challenging practical example. We model the bathymetry within the operational area using a Gaussian process and propose a reward function that represents the task of mapping a desired isobath. As is common in applications where plans must be continually updated based on real-time sensor measurements, we adopt a receding horizon framework where the vehicle continually computes near-optimal paths. The sequence of paths does not, in general, inherit the optimality properties of each individual path. Our real-time planning implementation incorporates recent results that lead to performance guarantees for receding-horizon planning." 
Robust Imaging Sonar-Based Place Recognition and Localization in Underwater Environments,"Hogyun Kim, Kang Gilhwan, Seokhwan Jeong, Seungjun Ma, Younggun Cho","Inha University,Inha university",Marine Robotics I,"Place recognition using SOund Navigation and Ranging (SONAR) images is an important task for simultaneous localization and mapping (SLAM) in underwater environments. This paper proposes a robust and efficient imaging SONAR-based place recognition, SONAR context, and loop closure method. Unlike previous methods, our approach encodes geometric information based on the characteristics of raw SONAR measurements without prior knowledge or training. We also design a hierarchical searching procedure for fast retrieval of candidate SONAR frames and apply adaptive shifting and padding to achieve robust matching on rotation and translation changes. In addition, we can derive the initial pose through adaptive shifting and apply it to the iterative closest point (ICP)-based loop closure factor. We evaluate the SONAR context’s performance in various underwater sequences such as simulated open water, real water tank, and real underwater environments. The proposed approach shows the robustness and improvements of place recognition on various datasets and evaluation metrics. Supplementary materials are available at https://github.com/sparolab/sonar_context." Deep Underwater Monocular Depth Estimation with Single-Beam Echosounder,"Haowen Liu, Monika Roznere, Alberto Quattrini Li",Dartmouth College,Marine Robotics I,"Underwater depth estimation is essential for safe Autonomous Underwater Vehicles (AUV) navigation. While there have been recent advances in out-of-water monocular depth estimation, it is difficult to apply these methods to the underwater domain due to the lack of well-established datasets with labelled ground truths. 
In this paper, we propose a novel method for self-supervised underwater monocular depth estimation by leveraging a low-cost single-beam echosounder (SBES). We also present a synthetic dataset for underwater depth estimation to facilitate visual learning research in the underwater domain, available at https://github.com/hdacnw/sbes-depth. We evaluated our method on the proposed dataset with results outperforming previous methods and tested our method in a dataset we collected with an inexpensive AUV. We further investigated the use of SBES as an additional component in our self-supervised method for up-to-scale depth estimation providing insights on next research directions." Self-Supervised Monocular Depth Underwater,"Shlomi Amitai, Itzik Klein, Tali Treibitz",University of Haifa,Marine Robotics I,"Depth estimation is critical for any robotic system. In the past years estimation of depth from monocular images has shown great improvement; however, in the underwater environment results are still lagging behind due to appearance changes caused by the medium. So far little effort has been invested in overcoming this. Moreover, underwater, there are more limitations for using high resolution depth sensors; this makes generating ground truth for learning methods another enormous obstacle. So far unsupervised methods that tried to solve this have achieved very limited success as they relied on domain transfer from datasets in air. We suggest training using subsequent frames self-supervised by a reprojection loss, as was demonstrated successfully above water. We suggest several additions to the self-supervised framework to cope with the underwater environment and achieve state-of-the-art results on a challenging forward-looking underwater dataset." 
Performance Evaluation of 3D Keypoint Detectors and Descriptors on Coloured Point Clouds in Subsea Environments,"Kyungmin Jung, Thomas Hitchcox, James Richard Forbes",McGill University,Marine Robotics I,"The recent development of high-precision subsea optical scanners allows for 3D keypoint detectors and feature descriptors to be leveraged on point cloud scans from subsea environments. However, the literature lacks a comprehensive survey to identify the best combination of detectors and descriptors to be used in these challenging and novel environments. This paper aims to identify the best detector/descriptor pair using a challenging field dataset collected using a commercial underwater laser scanner. Furthermore, studies have shown that incorporating texture information to extend geometric features adds robustness to feature matching on synthetic datasets. This paper also proposes a novel method of fusing images with underwater laser scans to produce coloured point clouds, which are used to study the effectiveness of 6D point cloud descriptors." Puppeteer and Marionette: Learning Anticipatory Quadrupedal Locomotion Based on Interactions of a Central Pattern Generator and Supraspinal Drive,"Milad Shafiee, Guillaume Bellegarda, Auke Ijspeert",EPFL,Biomimetic Systems,"Quadruped animal locomotion emerges from the interactions between the spinal central pattern generator (CPG), sensory feedback, and supraspinal drive signals from the brain. Computational models of CPGs have been widely used for investigating the spinal cord contribution to animal locomotion control in computational neuroscience and in bio-inspired robotics. However, the contribution of supraspinal drive to anticipatory behavior, i.e. motor behavior that involves planning ahead of time (e.g. of footstep placements), is not yet properly understood. 
In particular, it is not clear whether the brain modulates CPG activity and/or directly modulates muscle activity (hence bypassing the CPG) for accurate foot placements. In this paper, we investigate the interaction of supraspinal drive and a CPG in an anticipatory locomotion scenario that involves stepping over gaps. By employing deep reinforcement learning (DRL), we train a neural network policy that replicates the supraspinal drive behavior. This policy can either modulate the CPG dynamics, or directly change actuation signals to bypass the CPG dynamics. Our results indicate that the direct supraspinal contribution to the actuation signal is a key component for a high gap crossing success rate. However, the CPG dynamics in the spinal cord are beneficial for gait smoothness and energy efficiency. Moreover, our investigation shows that sensing the front feet distances to the gap is the most important and sufficient sensory information for learning gap crossing. Our results support the biological hypothesis that cats and horses mainly control the front legs for obstacle avoidance, and that hind limbs follow an internal memory based on the front limbs' information. Our method enables the quadruped robot to cross gaps of up to 20 cm (50% of body-length) without any explicit dynamics modeling or Model Predictive Control (MPC)." A Performance Optimization Strategy Based on Improved NSGA-II for a Flexible Robotic Fish,"Ben Lu, Jian Wang, Xiaocun Liao, Qianqian Zou, Min Tan, Chao Zhou","Institute of Automation, Chinese Academy of Sciences,Institution of Automation, Chinese Academy of sciences,Institute of Automation,Chinese Academy of Sciences,Chinese Academy of Sciences",Biomimetic Systems,"High speed and low energy cost are two conflicting but very important objectives in the motion optimization of bio-inspired underwater robots. 
To this end, this paper proposes an optimization strategy for swimming speed and power cost using an improved NSGA-II for a flexible robotic fish. A dynamic model involving flexible deformation is established for speed prediction with the hydrodynamic parameters identified. A back propagation (BP) neural network is applied to perform compensation of power cost prediction with the dynamic model’s prediction as input. In particular, an NSGA-II-AMS method is developed to improve the efficiency of solving the two-objective optimization problem based on NSGA-II. Finally, extensive simulations and experimental results demonstrate the effectiveness of the proposed optimization strategy, which offers promising prospects for the flexible robotic fish performing aquatic tasks with different performance constraints." Swarm Robotics Search and Rescue: A Bee-Inspired Swarm Cooperation Approach without Information Exchange,"Yue Li, Yan Gao, Sijie Yang, Quan Quan","Beihang University,School of Automation Science and Electrical Engineering, Beihang",Biomimetic Systems,"Swarm robotics plays a non-negligible role in actual practice because of its scalability and robustness. Besides some specific studies, there is still a lack of an overall approach to solving the search and rescue problem in a communication-denied environment. This paper presents a bee-inspired swarm cooperation approach without information exchange, including a target grouping method suitable for multi-objective and multi-robot, a finite behavior state machine, and the corresponding control law. Finally, the effectiveness of the proposed approach is shown via simulation. The overall approach proposed in this paper requires no global position of the swarm and two-way information exchange, making swarm robotics search and rescue in a communication-denied environment possible." 
Achieving Extensive Trajectory Variation in Impulsive Robotic Systems,"Luis Viornery, Chloe Goode, Gregory Sutton, Sarah Bergbreiter","Carnegie Mellon University,University of Lincoln",Biomimetic Systems,"Robots that use impulsive mechanisms to achieve high-speed and high-powered motion are becoming more common and better understood, but control of these systems remains relatively rudimentary. Among robots that use spring actuation to generate motion, robot actuation and mechanisms are usually not controlled intentionally in order to achieve variation in the system's behavior, or they are controlled only roughly via adjustments made to the amount of energy stored in the mechanism. We describe the development, construction, and test of an impulsive catapult mechanism whose design is inspired by the grasshopper leg and for which extensive variation in the projectile trajectory is achieved by force control of the actuator that restrains the spring. As a step toward future controlled jumping robots, we give a detailed model of this system, validate this model experimentally, and explain how the actuator dynamics are critical to our ability to vary the system's trajectory using this approach. This work represents a novel approach to the control of spring actuated robots and illustrates how they can be controlled even under highly limiting actuator constraints." Towards Safe Landing of Falling Quadruped Robots Using a 3-DoF Morphable Inertial Tail,"Yunxi Tang, Jiajun An, Xiangyu Chu, Shengzhi Wang, Ching Yan Wong, Samuel Au",The Chinese University of Hong Kong,Biomimetic Systems,"Falling cat problem is well-known where cats show their super aerial reorientation capability and can land safely. For their robotic counterparts, a similar falling quadruped robot problem, has not been fully addressed, although achieving safe landing as the cats has been increasingly investigated. 
Unlike imposing the burden on landing control, we approach to safe landing of falling quadruped robots by effective flight phase control. Different from existing work like swinging legs and attaching reaction wheels or simple tails, we propose to deploy a 3-DoF morphable inertial tail on a medium-size quadruped robot. In the flight phase, the tail with its maximum length can self-right the body orientation in 3D effectively; before touch-down, the tail length can be retracted to about 1/4 of its maximum for impressing the tail's side-effect on landing. To enable aerial reorientation for safe landing in the quadruped robots, we design a control architecture, which is verified in a high-fidelity physics simulation environment with different initial conditions. Experimental results on a customized flight-phase test platform with comparable inertial properties are provided and show the tail's effectiveness on 3D body reorientation and its fast retractability before touch-down. An initial falling quadruped robot experiment is shown, where the robot Unitree A1 with the 3-DoF tail can land safely subject to non-negligible initial body angles." Bioinspired Tearing Manipulation with a Robotic Fish,"Stanley Wang, Juan Romero, Monica Li, Peter Wainwright, Hannah Stuart","University of California, Berkeley,UC Berkeley,University of California, Davis",Biomimetic Systems,"We present SunBot, a robotic system for the study and implementation of fish-inspired tearing manipulations. Various fish species -- such as the sunburst butterflyfish -- feed on prey fixed to substrates, a maneuver previously not demonstrated by robotic fish which typically specialize for open water swimming and surveillance. Biological studies indicate that a dynamic ``head flick'' behavior may play a role in tearing off soft prey during such feeding. In this work, we study whether the robotic tail is an effective means to generate such head motions for ungrounded tearing manipulations in water. 
We describe the function of SunBot and compare the forces that it applies to a fixed prey in the lab while varying tail speeds and ranges of motion. A simplified dynamic template model for the tail-driven head flick maneuver matches peak force magnitudes from experiments, indicating that inertial effects of the fish's body play a substantial role. Finally, we demonstrate a tearing scenario and evaluate a free-swimming trial of SunBot -- this is important to show that the actuator that enables swimming also provides the new dual purpose of forceful tearing manipulation." Learnable Tegotae-Based Feedback in CPGs with Sparse Observation Produces Efficient and Adaptive Locomotion,"Christopher Herneth, Mitsuhiro Hayashibe, Dai Owaki","Technical University Munich,Tohoku University",Biomimetic Systems,"Central Pattern generators (CPG) are a biologically inspired, decentralized control architecture that enables model-free, but yet adaptively stable and computationally lightweight locomotion capabilities on complex robots. Nevertheless, no unified design guidelines for closed-loop CPG controllers are available in the literature. Therefore, we propose a task-distributed, end-to-end trainable, closed-loop CPG control policy by generalizing and extending Tegotae control. The Tegotae approach modulates CPG activity by quantifying the discrepancy between internal belief states and environmental reactions. Spontaneous and adaptive gait formation towards situationally efficient locomotion patterns are intrinsic properties of Tegotae control. The Tegotae control policy is trained and benchmarked in simulation on a 1D hopping robot. We found that our approach can learn efficient and adaptive locomotion on minimal feedback information, while outperforming unstructured, classic reinforcement learning policies of equal complexity. 
To the best of our knowledge, this is the first study to fully generalize the Tegotae approach and construct unimpeded, end-to-end trainable Tegotae control policies." "Multi-Segmented, Adaptive Feet for Versatile Legged Locomotion in Natural Terrain","Abhishek Chatterjee, An Mo, Bernadett Kiss, Emre Cemal Gonen, Alexander Badri-Spröwitz","Max Planck Institute for Intelligent Systems, Stuttgart,MPI IS Stuttgart,Max Planck Institute for Intelligent Systems",Biomimetic Systems,"Most legged robots are built with leg structures from serially mounted links and actuators and are controlled through complex controllers and sensor feedback. In comparison, animals developed multi-segment legs, mechanical coupling between joints, and multi-segmented feet. They run agile over all terrains, arguably with simpler locomotion control. Here we focus on developing foot mechanisms that resist slipping and sinking also in natural terrain. We present first results of multi-segment feet mounted to a bird-inspired robot leg with multi-joint mechanical tendon coupling. Our one- and two-segment, mechanically adaptive feet show increased viable horizontal forces on multiple soft and hard substrates before starting to slip. We also observe that segmented feet reduce sinking on soft substrates compared to ball-feet and cylinder-feet. We report how multi-segmented feet provide a large range of viable centre of pressure points well suited for bipedal robots, but also for quadruped robots on slopes and natural terrain. Our results also offer a functional understanding of segmented feet in animals like ratite birds." Burst Stimulation for Enhanced Locomotion Control of Terrestrial Cyborg Insects,"Huu Duoc Nguyen, Hirotaka Sato, Tat Thang Vo Doan","Nanyang Technological University,University of Freiburg",Biomimetic Systems,"Terrestrial cyborg insects are biohybrid systems integrating living insects as mobile platforms. 
The insects’ locomotion is controlled by the electrical stimulation of their sensory, muscular, or neural systems, in which continuous pulse trains are usually chosen as the stimulation waveform. Although this waveform is easy to generate and can elicit graded responses from the insects, its locomotion control efficiency has not been consistent among existing literature. This study demonstrates the improvement of locomotion control by using a new stimulation protocol, named Burst Stimulation, to stimulate a cyborg beetle’s antennae (Zophobas morio). Modulating the continuous pulse train into multiple bursts enhanced the beetle’s turning responses. At the same stimulation intensity (amplitude, pulse width, and active duration), the Burst Stimulation improved the turning angle by up to 50% compared to the continuous waveform. Moreover, the beetle’s graded response was preserved. Increasing the stimulation frequency from 10 Hz to 40 Hz raised the turning rate by 40 deg/s. In addition, the initial implementation of this protocol in the feedback control-based navigation achieved a success rate of 81%, suggesting its potential use to optimize further the autonomous navigation of terrestrial cyborg insects." Twisting Spine or Rigid Torso: Exploring Quadrupedal Morphology Via Trajectory Optimization,"J. Diego Caporale, Zeyuan Feng, Shane Rozen-levy, Aja Carter, Daniel Koditschek",University of Pennsylvania,Biomimetic Systems,"Modern legged robot morphologies assign most of their actuated degrees of freedom (DoF’s) to the limbs and designs continue to converge to twelve DoF quadrupeds with three actuators per leg and a rigid torso often modeled as a Single Rigid Body (SRB). This is in contrast to the animal kingdom, which provides tantalizing hints that core actuation of a jointed torso confers substantial benefit for efficient agility. 
Unfortunately, the limited specific power of available actuators continues to hamper roboticists' efforts to capitalize on this bio-inspiration. This paper presents the initial steps in a comparative study of the costs and benefits associated with a traditionally neglected torso degree of freedom: a twisting spine. We use trajectory optimization to explore how a one-DoF, axially twisting spine might help or hinder a set of axially-active (twisting) behaviors: trots, sudden turns while bounding, and parkour-style wall jumps. By optimizing for minimum electrical energy or average power, intuitive cost functions for robots, we avoid hand-tuning the behaviors and explore the activation of the spine. Initial evidence suggests that for lower energy behaviors the spine increases the electrical energy required when compared to the rigid torso, but for higher energy runs the spine trends toward having no effect or reducing the electrical work. These results support future, more bio-inspired versions of the spine with inherent stiffness or dampening built into their mechanical design." Dynamic Locomotion of a Quadruped Robot with Active Spine Via Model Predictive Control,"Wanyue Li, Zida Zhou, Hui Cheng",Sun Yat-sen University,Biomimetic Systems,"As an active spine introduces more degrees of freedom (DOFs) as well as time-varying inertia, locomotion control of spined quadruped robots is challenging. Direct optimization on the full dynamics model causes prohibitive calculation time and is difficult to apply to embedded platforms. Model predictive control (MPC) based on centroidal dynamics is a prevalent approach for ordinary quadruped robots, regarding the whole robot as a single rigid body (SRB). However, the approach ignores the changes of the center of mass (CoM) and inertia, which seriously affects the robot’s stability, and cannot be used in spined quadruped robots directly. 
To resolve the above issue, this paper presents an MPC approach that considers the movements of the spine in the SRB model. Since the mass of the robot is concentrated on its body, the whole robot is modelled as an unactuated 3D SRB with fully-actuated internal spine joints. MPC finds the optimal ground reaction forces (GRFs) based on the SRB centroidal dynamics, in which the missing spine part is complemented by the pre-defined spine joints’ states and corresponding inertia sequence. According to the GRFs, the full dynamic model calculates the precise joint torques. In addition, a quadruped robot with a 3-DOF active spine, Yat-sen Lion, is developed. With the presented approach, experimental results illustrate that Yat-sen Lion freely achieves bending, arching, and turning behaviors while trotting at speeds of 3.8 m/s in simulations and 0.5 m/s in real-world experiments." Scalable Task-Driven Robotic Swarm Control Via Collision Avoidance and Learning Mean-Field Control,"Kai Cui, Mengguang Li, Christian Fabian, Heinz Koeppl",Technische Universität Darmstadt,Aerial Robotics I,"In recent years, reinforcement learning and its multi-agent analogue have achieved great success in solving various complex control problems. However, multi-agent reinforcement learning remains challenging both in its theoretical analysis and empirical design of algorithms, especially for large swarms of embodied robotic agents where a definitive toolchain remains part of active research. We use emerging state-of-the-art mean-field control techniques in order to convert many-agent swarm control into more classical single-agent control of distributions. This allows profiting from advances in single-agent reinforcement learning at the cost of assuming weak interaction between agents. However, the mean-field model is violated by the nature of real systems with embodied, physically colliding agents. 
Thus, we combine collision avoidance and learning of mean-field control into a unified framework for tractably designing intelligent robotic swarm behavior. On the theoretical side, we provide novel approximation guarantees for general mean-field control both in continuous spaces and with collision avoidance. On the practical side, we show that our approach outperforms multi-agent reinforcement learning and allows for decentralized open-loop application while avoiding collisions, both in simulation and real UAV swarms. Overall, we propose a framework for the design of swarm behavior that is both mathematically well-founded and practically useful, enabling the solution of otherwise intractable swarm problems." STD-Trees: Spatio-Temporal Deformable Trees for Multirotors Kinodynamic Planning,"Hongkai Ye, Chao Xu, Fei Gao",Zhejiang University,Aerial Robotics I,"In constrained solution spaces with a huge number of homotopy classes, stand-alone sampling-based kinodynamic planners suffer low efficiency in convergence. Local optimization is integrated to alleviate this problem. In this paper, we propose to thrive the trajectory tree growing by optimizing the tree in the forms of deformation units, and each unit contains one tree node and all the edges connecting it. The deforming proceeds both spatially and temporally by optimizing the node state and edge time durations efficiently. Deforming the unit only changes the tree locally yet improves the overall quality of a corresponding subtree. Further, to consider the computation burden and optimizing level, patterns to deform different tree parts in combination of different deformation units are studied and compared, all showing much faster convergence. The proposed deformation can be easily integrated into different RRT-based kinodynamic planning methods, and numerical experiments show that integrating the spatio-temporal deformation greatly accelerates the convergence and outperforms the spatial-only deformation." 
PredRecon: A Prediction-Boosted Planning Framework for Fast and High-Quality Autonomous Aerial Reconstruction,"Chen Feng, Haojia Li, Fei Gao, Boyu Zhou, Shaojie Shen","The Hong Kong University of Science and Technology,Zhejiang University,Sun Yat-sen University,Hong Kong University of Science and Technology",Aerial Robotics I,"Autonomous UAV path planning for 3D reconstruction has been actively studied in various applications for high-quality 3D models. However, most existing works have adopted explore-then-exploit, prior-based or exploration-based strategies, demonstrating inefficiency with repeated flight and low autonomy. In this paper, we propose PredRecon, a prediction-boosted planning framework that can autonomously generate paths for high 3D reconstruction quality. We take inspiration from the fact that humans can roughly infer the complete construction structure from partial observation. Hence, we devise a surface prediction module (SPM) to predict the coarse complete surfaces of the target from the current partial reconstruction. Then, the uncovered surfaces awaiting observation by the UAV are produced by online volumetric mapping. Lastly, a hierarchical planner plans motions for 3D reconstruction, which sequentially finds efficient global coverage paths, plans local paths for maximizing the performance of Multi-View Stereo (MVS), and generates smooth trajectories for image-pose pairs acquisition. We conduct benchmarks in the realistic simulator, which validates the performance of PredRecon compared with the classical and state-of-the-art methods. The open-source code is released at https://github.com/HKUST-Aerial-Robotics/PredRecon." 
Vision-Aided UAV Navigation and Dynamic Obstacle Avoidance Using Gradient-Based B-Spline Trajectory Optimization,"Zhefan Xu, Yumeng Xiu, Xiaoyang Zhan, Baihan Chen, Kenji Shimada",Carnegie Mellon University,Aerial Robotics I,"Navigating dynamic environments requires the robot to generate collision-free trajectories and actively avoid moving obstacles. Most previous works designed path planning algorithms based on one single map representation, such as the geometric, occupancy, or ESDF map. Although they have shown success in static environments, due to the limitation of map representation, those methods cannot reliably handle static and dynamic obstacles simultaneously. To address the problem, this paper proposes a gradient-based B-spline trajectory optimization algorithm utilizing the robot's onboard vision. The depth vision enables the robot to track and represent dynamic objects geometrically based on the voxel map. The proposed optimization first adopts the circle-based guide-point algorithm to approximate the costs and gradients for avoiding static obstacles. Then, with the vision-detected moving objects, our receding-horizon distance field is simultaneously used to prevent dynamic collisions. Finally, the iterative re-guide strategy is applied to generate the collision-free trajectory. The simulation and physical experiments prove that our method can run in real-time to navigate dynamic environments safely." 
Multi-Agent Spatial Predictive Control with Application to Drone Flocking,"Andreas Brandstätter, Scott Smolka, Scott Stoller, Ashish Tiwari, Radu Grosu","Technische Universität Wien,Stony Brook University,Microsoft Corp,TU Wien",Aerial Robotics I,"We introduce Spatial Predictive Control (SPC), a technique for solving the following problem: given a collection of robotic agents with black-box positional low-level controllers (PLLCs) and a mission-specific distributed cost function, how can a distributed controller achieve and maintain cost-function minimization without a plant model and only positional observations of the environment? Our fully distributed SPC controller is based strictly on the position of the agent itself and on those of its neighboring agents. This information is used in every time step to compute the gradient of the cost function and to perform a spatial look-ahead to predict the best next target position for the PLLC. Using a simulation environment, we show that SPC outperforms Potential Field Controllers, a related class of controllers, on the drone flocking problem. We also show that SPC works on real hardware, and is therefore able to cope with the potential sim-to-real transfer gap. We demonstrate its performance using as many as 16 Crazyflie 2.1 drones in a number of scenarios, including obstacle avoidance." Multimodal Image Registration for GPS-Denied UAV Navigation Based on Disentangled Representations,"Huandong Li, Zhunga Liu, Yanyi Lyu, Feiyan Wu",Northwestern Polytechnical University,Aerial Robotics I,"Visual navigation plays an important role for Unmanned Aerial Vehicles (UAVs). In some applications, the landmark image and the real-time image may be heterogeneous, like near-infrared and visible images. In this work, we propose a multimodal image registration method to deal with near-infrared and visible images so that it can be applied to a visual navigation system for the localization of UAVs in GPS-denied environments. 
First, a new feature extraction strategy is developed to embed different modalities of images into the common feature space based on disentangled representations. Such common space is independent of the image modality, and this can eliminate the modality differences. Meanwhile, an intensity loss is introduced to measure the similarity of mono-modal images. In the proposed method, we directly predict the transformation parameters and thus accelerate the localization of UAVs. Extensive experiments on synthetic datasets are conducted to demonstrate the validity of our method, and the experimental results show that the proposed method can effectively improve the localization accuracy." SEER: Safe Efficient Exploration for Aerial Robots Using Learning to Predict Information Gain,"Yuezhan Tao, Yuwei Wu, Beiming Li, Fernando Cladera, Alex Zhou, Dinesh Thakur, Vijay Kumar",University of Pennsylvania,Aerial Robotics I,"We address the problem of efficient 3-D exploration in indoor environments for micro aerial vehicles with limited sensing capabilities and payload/power constraints. We develop an indoor exploration framework that uses learning to predict the occupancy of unseen areas, extracts semantic features, samples viewpoints to predict information gains for different exploration goals, and plans informative trajectories to enable safe and smart exploration. Extensive experimentation in simulated and real-world environments shows the proposed approach outperforms the state-of-the-art exploration framework by 24% in terms of the total path length in a structured indoor environment and with a higher success rate during exploration." Trajectory Planning for the Bidirectional Quadrotor As a Differentially Flat Hybrid System,"Katherine Mao, Jake Welde, M. Ani Hsieh, Vijay Kumar",University of Pennsylvania,Aerial Robotics I,"The use of bidirectional propellers provides quadrotors with greater maneuverability which is advantageous in constrained environments. 
This paper addresses the development of a trajectory planning algorithm for quadrotors with bidirectional motors. Previous work has shown that the property of differential flatness can be leveraged for efficient trajectory planning. However, planners that leverage flatness for quadrotors fail at points where the acceleration of the center of mass is equal to gravity, i.e., when the vehicle experiences free fall. The central contribution of this paper is a flatness-based trajectory planning method that allows quadrotors to use bidirectional propellers and pass through the so-called free-fall singularity. We model our system as a differentially flat hybrid system with the aid of coordinate charts derived from the Hopf fibration and develop an algorithm that computes forward and reverse thrusts for each propeller, resulting in smooth trajectories everywhere in $SE(3)$. We demonstrate the planner’s versatility by planning knife-edge maneuvers and trajectories passing through the free-fall singularity, while transitioning from forward to reverse thrust." Fisher Information Based Active Planning for Aerial Photogrammetry,"Jaeyoung Lim, Nicholas Lawrance, Florian Achermann, Thomas Stastny, Rik Marian Kai Bähnemann, Roland Siegwart","ETH Zurich,CSIRO Data,,,ETH Zurich, ASL,Swiss Federal Institute of Technology (ETH Zurich),ETH Zürich",Aerial Robotics I,"Small uncrewed aerial systems are great for 3D reconstruction due to their speed, ease of use, and ability to access high-utility viewpoints. Today, most aerial survey approaches generate a preplanned coverage pattern assuming a planar target region. However, this is inefficient since it results in superfluous overlap and suboptimal viewing angles and does not utilize the entire flight envelope. In this work, we propose active path planning for photogrammetric reconstruction. Our main contribution is a view utility function based on Fisher information approximating the offline reconstruction uncertainty. 
The metric enables online path planning to make in-flight decisions to collect geometrically informative image data in complex terrain. We evaluate our approach in a photorealistic simulation. A viewpoint selection study shows that our metric leads to faster and more precise reconstruction than state-of-the-art active planning metrics and adapts to different camera resolutions. Comparing our online planning approach to an ordinary fixed-wing aerial survey yields 3.2 times faster coverage of 16 ha undulated terrain without sacrificing precision." Integrated Vector Field and Backstepping Control for Quadcopters,"Arthur Henrique Dias Nunes, Guilherme Vianna Raffo, Luciano Pimenta",Universidade Federal de Minas Gerais,Aerial Robotics I,"In this work, we present an Integrated Guidance and Controller (IGC) scheme to drive quadcopters in path-following tasks with obstacle avoidance and constant uncertainties rejection. This scheme is based on the combination of a time-varying artificial vector field and Backstepping with integral action control. The vector field switches between two behaviors: (i) path-following; and (ii) obstacle circumnavigation to allow collision avoidance. This vector field is then integrated into a non-linear controller designed via Backstepping with Integral Action to deal with the quadcopter vehicle dynamics and reject constant uncertainties. The considered vehicle model is based on quaternion algebra. The control inputs are considered to be the total thrust and torques. Stability is proved by using Lyapunov's Theory and Matrosov's Theorem. To illustrate our proposed solution, we show computational simulations." 
Learning a Single Near-Hover Position Controller for Vastly Different Quadcopters,"Dingqi Zhang, Antonio Loquercio, Xiangyu Wu, Ashish Kumar, Jitendra Malik, Mark Wilfried Mueller","University of California, Berkeley,UC Berkeley",Aerial Robotics I,"This paper proposes an adaptive near-hover position controller for quadcopters, which can be deployed to quadcopters of very different mass, size and motor constants, and also shows rapid adaptation to unknown disturbances during runtime. The core algorithmic idea is to learn a single policy that can adapt online at test time not only to the disturbances applied to the drone, but also to the robot dynamics and hardware in the same framework. We achieve this by training a neural network to estimate a latent representation of the robot and environment parameters, which is used to condition the behaviour of the controller, also represented as a neural network. We train both networks exclusively in simulation with the goal of flying the quadcopters to goal positions and avoiding crashes to the ground. We directly deploy the same controller trained in the simulation without any modifications on two quadcopters in the real world with differences in mass, size, motors, and propellers with mass differing by 4.5 times. In addition, we show rapid adaptation to sudden and large disturbances up to one-third of the mass of the quadcopters. We perform an extensive evaluation in both simulation and the physical world, where we outperform a state-of-the-art learning-based adaptive controller and a traditional PID controller specifically tuned to each platform individually." Forming and Controlling Hitches in Midair Using Aerial Robots,"Diego Salazar-Dantonio, Subhrajit Bhattacharya, David Saldana",Lehigh University,Aerial Robotics I,"The use of cables for aerial manipulation has been shown to be a lightweight and versatile way to interact with objects. However, fastening objects using cables is still a challenge, and a human is required. 
In this work, we propose a novel way to secure objects using hitches. The hitch can be formed and morphed in midair using a team of aerial robots with cables. The hitch’s shape is modeled as a convex polygon, making it versatile and adaptable to a wide variety of objects. We propose an algorithm to form the hitch systematically. The steps can run in parallel, allowing hitches with a large number of robots to be formed in constant time. We develop a set of actions that include different actions to change the shape of the hitch. We demonstrate our methods using a team of aerial robots via simulation and actual experiments." AirTrack: Onboard Deep Learning Framework for Long-Range Aircraft Detection and Tracking,"Sourish Ghosh, Jay Patrikar, Brady Moon, Milad Moghassem Hamidi, Sebastian Scherer",Carnegie Mellon University,Aerial Robot Learning,"Detect-and-Avoid (DAA) capabilities are critical for safe operations of unmanned aircraft systems (UAS). This paper introduces AirTrack, a real-time vision-only detection and tracking framework that respects the size, weight, and power (SWaP) constraints of sUAS systems. Given the low Signal-to-Noise ratios (SNR) of far away aircraft, we propose using full resolution images in a deep learning framework that aligns successive images to remove ego-motion. The aligned images are then used downstream in cascaded primary and secondary classifiers to improve detection and tracking performance on multiple metrics. We show that AirTrack outperforms state-of-the-art baselines on the Amazon Airborne Object Tracking (AOT) Dataset. Multiple real world flight tests with a Cessna 172 interacting with general aviation traffic and additional near-collision flight tests with a Bell helicopter flying towards a UAS in a controlled setting showcase that the proposed approach satisfies the newly introduced ASTM F3442/F3442M standard for DAA. Empirical evaluations show that our system has a probability of track of more than 95% up to a range of 700m." 
Towards a Reliable and Lightweight Onboard Fault Detection in Autonomous Unmanned Aerial Vehicles,"Sai Srinadhu Katta, Eduardo Viegas","TII,Pontifícia Universidade Catolica do Paraná (PUCPR), Brazil",Aerial Robot Learning,"This paper proposes a new model for onboard physical fault detection on autonomous unmanned aerial vehicles (UAV) through machine learning (ML) techniques. The proposal performs the detection task with high accuracies and minimal processing requirements, while signaling an unreliable ML model to the operator, implemented in two main phases. First, a wrapper-based feature selection is performed to decrease the feature extraction computational costs, coped with a classification assessment technique to identify ML model unreliability. Second, physical UAV faults are signaled through a multi-view rationale that evaluates a variety of UAV sensors, while triggering alerts based on a sliding window scheme. Experiments performed on a real quadcopter UAV with a broken propeller use case show the proposal's feasibility. Our model can decrease the false-positive rates up to only 0.4%, while also decreasing the computational costs by at least 43% when compared to traditional techniques. Notwithstanding, it can identify ML model unreliability, signaling the UAV operator when model fine-tuning is needed." Variable Admittance Interaction Control of UAVs Via Deep Reinforcement Learning,"Yuting Feng, Chuanbeibei Shi, Jianrui Du, Yushu Yu, Fuchun Sun, Yixu Song","Beijing Institute of Technology,University of Toronto,Tsinghua University,Tsinghua university",Aerial Robot Learning,"A compliant control model based on reinforcement learning (RL) is proposed to allow robots to interact with the environment more effectively and autonomously execute force control tasks. The admittance model learns an optimal adjustment policy for interactions with the external environment using RL algorithms. 
The model combines energy consumption and trajectory tracking of the agent state using a cost function. Therein, an Unmanned Aerial Vehicle (UAV) can operate stably in unknown environments where interaction forces exist. Furthermore, the model ensures that the interaction process is safe, comfortable, and flexible while protecting the external structures of the UAV from damage. To evaluate the model performance, we verified the approach in a simulation environment using a UAV in three external force scenes. We also tested the model across different UAV platforms and various low-level control parameters, and the proposed approach provided the best results." Learning Tethered Perching for Aerial Robots,"Fabian Hauf, Başaran Bahadır Koçer, Hai-nguyen Nguyen, Oscar Kwong Fai Pang, Ronald Clark, Edward Johns, Mirko Kovac","Imperial College London,CNRS,University of Oxford",Aerial Robot Learning,"Aerial robots have a wide range of applications, such as collecting data in hard-to-reach areas. This requires the longest possible operation time. However, because currently available commercial batteries have limited specific energy of roughly 300 Whkg^{−1}, a drone's flight time is a bottleneck for sustainable long-term data collection. Inspired by birds in nature, a possible approach to tackle this challenge is to perch drones on trees, and environmental or man-made structures, to save energy whilst in operation. In this paper, we propose an algorithm to automatically generate trajectories for a drone to perch on a tree branch, using the proposed tethered perching mechanism with a pendulum-like structure. This enables a drone to perform an energy-optimised, controlled 180-degree flip to disarm upside down safely. 
To fine-tune a set of reachable trajector" Credible Online Dynamics Learning for Hybrid UAVs,"David Rohr, Nicholas Lawrance, Olov Andersson, Roland Siegwart","ETH Zurich,CSIRO Data,,,ETH Zürich",Aerial Robot Learning,"Hybrid unmanned aerial vehicles (H-UAVs) are highly versatile platforms with the ability to transition between rotary- and fixed-wing flight. However, their (aero)dynamics tend to be highly nonlinear which increases the risk of introducing safety-critical modeling errors in a controller. Designing a safe, yet not too cautious controller, requires a credible model which provides accurate dynamics uncertainty quantification. We present a data-efficient, probabilistic semi-parametric dynamics modeling approach that allows for online, filter-based inference. The proposed model leverages prior knowledge using a nominal parametric model, and combines it with residuals in form of sparse Gaussian processes (GP) to account for possibly unmodeled forces and moments. Uncertain nominal and residual parameters are jointly estimated using Bayesian filtering. The resulting model accuracy and the reliability of its predicted uncertainty are analyzed for both a simulated and a real example, where we learn the 6DoF nonlinear dynamics of a tiltwing H-UAV from a few minutes of flight data. Compared to a residual-free nominal model, the proposed semi-parametric approach provides increased model accuracy in relevant parts of the flight envelope and substantially higher credibility overall." AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning,"Xijun Wang, Ruiqi Xian, Tianrui Guan, Celso De Melo, Stephen Nogar, Aniket Bera, Dinesh Manocha","University of Maryland, College Park,University of Maryland-College Park,University of Maryland,CCDC US Army Research Laboratory,CCDC U.S. Army Research Laboratory,Purdue University",Aerial Robot Learning,"We propose a novel approach for aerial video action recognition. 
Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also present an efficient temporal reasoning algorithm to capture the action information along the spatial and temporal domains within a controllable computational cost. Our approach has been implemented and evaluated both on the desktop with high-end GPUs and on the low power Robotics RB5 Platform for robots and drones. In practice, we achieve 6.1-7.4% improvement over SOTA in Top-1 accuracy on the RoCoG-v2 dataset, 8.3-10.4% improvement on the UAV-Human dataset and 3.2% improvement on the Drone Action dataset." Follow the Rules: Online Signal Temporal Logic Tree Search for Guided Imitation Learning in Stochastic Domains,"Jasmine Jerry Aloor, Jay Patrikar, Parv Kapoor, Jean Oh, Sebastian Scherer","Massachusetts Institute of Technology,Carnegie Mellon University",Aerial Robot Learning,"Seamlessly integrating rules in Learning-from-Demonstrations (LfD) policies is a critical requirement to enable the real-world deployment of AI agents. Recently Signal Temporal Logic (STL) has been shown to be an effective language for encoding rules as spatio-temporal constraints. This work uses Monte Carlo Tree Search (MCTS) as a means of integrating STL specification into a vanilla LfD policy to improve constraint satisfaction. We propose augmenting the MCTS heuristic with STL robustness values to bias the tree search towards branches with higher constraint satisfaction. While the domain-independent method can be applied to integrate STL rules online into any pre-trained LfD algorithm, we choose goal-conditioned Generative Adversarial Imitation Learning as the offline LfD policy. 
We apply the proposed method to the domain of planning trajectories for General Aviation aircraft around a non-towered airfield. Results using the simulator trained on real-world data showcase 60% improved performance over baseline LfD methods that do not use STL heuristics." Continuity-Aware Latent Interframe Information Mining for Reliable UAV Tracking,"Changhong Fu, Mutian Cai, Sihang Li, Kunhan Lu, Haobo Zuo, Chongjun Liu","Tongji University,Harbin Engineering University",Aerial Robot Learning,"Unmanned aerial vehicle (UAV) tracking is crucial for autonomous navigation and has broad applications in robotic automation fields. However, reliable UAV tracking remains a challenging task due to various difficulties like frequent occlusion and aspect ratio change. Additionally, most of the existing work mainly focuses on explicit information to improve tracking performance, ignoring potential interframe connections. To address the above issues, this work proposes a novel framework with continuity-aware latent interframe information mining for reliable UAV tracking, i.e., ClimRT. Specifically, a new efficient continuity-aware latent interframe information mining network (ClimNet) is proposed for UAV tracking, which can generate highly-effective latent frame between two adjacent frames. Besides, a novel location-continuity Transformer (LCT) is designed to fully explore continuity-aware spatial-temporal information, thereby markedly enhancing UAV tracking. Extensive qualitative and quantitative experiments on three authoritative aerial benchmarks strongly validate the robustness and reliability of ClimRT in UAV tracking performance. Furthermore, real-world tests on the aerial platform validate its practicability and effectiveness. The code and demo materials are released at https://github.com/vision4robotics/ClimRT." 
Weighted Maximum Likelihood for Controller Tuning,"Angel Romero, Shreedhar Govil, Gonca Yilmaz, Yunlong Song, Davide Scaramuzza",University of Zurich,Aerial Robot Learning,"Recently, Model Predictive Contouring Control (MPCC) has arisen as the state-of-the-art approach for model-based agile flight. MPCC benefits from great flexibility in trading-off between progress maximization and path following at runtime without relying on globally optimized trajectories. However, finding the optimal set of tuning parameters for MPCC is challenging because (i) the full quadrotor dynamics are non-linear, (ii) the cost function is highly non-convex, and (iii) of the high dimensionality of the hyperparameter space. This paper leverages a probabilistic Policy Search method, Weighted Maximum Likelihood (WML), to automatically learn the optimal objective for MPCC. WML is sample-efficient due to its closed-form solution for updating the learning parameters. Additionally, the data efficiency provided by the use of a model-based approach allows us to directly train in a high-fidelity simulator, which in turn makes our approach able to transfer zero-shot to the real world. We validate our approach in the real world, where we show that our method outperforms both the previous manually tuned controller and the state-of-the-art auto-tuning baseline reaching speeds of 75 km/h." User-Conditioned Neural Control Policies for Mobile Robotics,"Leonard Bauersfeld, Elia Kaufmann, Davide Scaramuzza","University of Zurich (UZH),,University of Zurich",Aerial Robot Learning,"Recently, learning-based controllers have been shown to push mobile robotic systems to their limits and provide the robustness needed for many real-world applications. However, only classical optimization-based control frameworks offer the inherent flexibility to be dynamically adjusted during execution by, for example, setting target speeds or actuator limits. 
We present a framework to overcome this shortcoming of neural controllers by conditioning them on an auxiliary input. This advance is enabled by including a feature-wise linear modulation layer (FiLM). We use model-free reinforcement-learning to train quadrotor control policies for the task of navigating through a sequence of waypoints in minimum time. By conditioning the policy on the maximum available thrust or the viewing direction relative to the next waypoint, a user can regulate the aggressiveness of the quadrotor's flight during deployment. We demonstrate in simulation and in real-world experiments that a single control policy can achieve close to time-optimal flight performance across the entire performance envelope of the robot, reaching up to 60km/h and 4.5g in acceleration. The ability to guide a learned controller during task execution has implications beyond agile quadrotor flight, as conditioning the control policy on human intent helps safely bringing learning based systems out of the well-defined laboratory environment into the wild." Training Efficient Controllers Via Analytic Policy Gradient,"Nina Wiedemann, Valentin Wueest, Antonio Loquercio, Matthias Mueller, Dario Floreano, Davide Scaramuzza","Robotics and Perception Group, University of Zürich,EPFL,UC Berkeley,Intel,Ecole Polytechnique Federal, Lausanne,University of Zurich",Aerial Robot Learning,"Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately. Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power. Conversely, learning-based offline optimization approaches, such as Reinforcement Learning (RL), allow fast and efficient execution on the robot but hardly match the accuracy of MPC in trajectory tracking tasks. 
In systems with limited compute, such as aerial vehicles, an accurate controller that is efficient at execution time is imperative. We propose an Analytic Policy Gradient (APG) method to tackle this problem. APG exploits the availability of differentiable simulators by training a controller offline with gradient descent on the tracking error. We address training instabilities that frequently occur with APG through curriculum learning and experiment on a widely used controls benchmark, the CartPole, and two common aerial robots, a quadrotor and a fixed-wing drone. Our proposed method outperforms both model-based and model-free RL methods in terms of tracking error. Concurrently, it achieves similar performance to MPC while requiring more than an order of magnitude less computation time. Our work provides insights into the potential of APG as a promising control method for robotics. To facilitate the exploration of APG, we open-source our code and make it available at https://github.com/lis-epfl/apg_trajectory_tracking." Parallel Reinforcement Learning Simulation for Visual Quadrotor Navigation,"Jack Saunders, Sajad Saeedi, Wenbin Li","University of Bath,Toronto Metropolitan University",Aerial Robot Learning,"Reinforcement learning (RL) is an agent-based approach for teaching robots to navigate within the physical world. Gathering data for RL is known to be a laborious task, and real-world experiments can be risky. Simulators facilitate the collection of training data in a quicker and more cost-effective manner. However, RL frequently requires a significant number of simulation steps for an agent to become skilful at simple tasks. This is a prevalent issue within the field of RL-based visual quadrotor navigation where state dimensions are typically very large and dynamic models are complex. Furthermore, rendering images and obtaining physical properties of the agent can be computationally expensive. 
To solve this, we present a simulation framework, built on AirSim, which provides efficient parallel training. Building on this framework, Ape-X is modified to incorporate parallel training of AirSim environments to make use of numerous networked computers. Through experiments we were able to achieve a reduction in training time from 3.9 hours to 11 minutes, for a toy problem, using the aforementioned framework and a total of 74 agents and two networked computers." Toward Efficient Physical and Algorithmic Design of Automated Garages,"Teng Guo, Jingjin Yu",Rutgers University,Multi-Robot Systems I,"Parking in large metropolitan areas is often a time-consuming task with further implications toward traffic patterns that affect urban landscaping. Reducing the premium space needed for parking has led to the development of automated mechanical parking systems. Compared to regular garages having one or two rows of vehicles in each island, automated garages can have multiple rows of vehicles stacked together to support higher parking demands. Although this multi-row layout reduces parking space, it makes the parking and retrieval more complicated. In this work, we propose an automated garage design that supports near 100% parking density. Modeling the problem of parking and retrieving multiple vehicles as a special class of multi-robot path planning problem, we propose associated algorithms for handling all common operations of the automated garage, including (1) optimal algorithm and near-optimal methods that find feasible and efficient solutions for simultaneous parking/retrieval and (2) a novel shuffling mechanism to rearrange vehicles to facilitate scheduled retrieval at rush hours. 
We conduct thorough simulation studies showing the proposed methods are promising for large and high-density real-world parking applications." Chronos and CRS: Design of a Miniature Car-Like Robot and a Software Framework for Single and Multi-Agent Robotics and Control,"Andrea Carron, Bodmer Sabrina, Lukas Vogel, René Zurbruegg, David Helm, Rahel Rickenbach, Simon Muntwiler, Jerome Sieber, Melanie N. Zeilinger","ETH Zurich,ETH Zürich",Multi-Robot Systems I,"From both an educational and research point of view, experiments on hardware are a key aspect of robotics and control. In the last decade, many open-source hardware and software frameworks for wheeled robots have been presented, mainly in the form of unicycles and car-like robots, with the goal of making robotics accessible to a wider audience and to support control systems development. Unicycles are usually small and inexpensive, and therefore facilitate experiments in a larger fleet, but they are not suited for high-speed motion. Car-like robots are more agile, but they are usually larger and more expensive, thus requiring more resources in terms of space and money. In order to bridge this gap, we present Chronos, a new car-like 1/28th scale robot with customized open-source electronics, and CRS, an open-source software framework for control and robotics. The CRS software framework includes the implementation of various state-of-the-art algorithms for control, estimation, and multi-agent coordination. With this work, we aim to provide easier access to hardware and reduce the engineering time needed to start new educational and research projects." 
Multi-Agent Path Integral Control for Interaction-Aware Motion Planning in Urban Canals,"Lucas Michael Streichenberg, Elia Trevisan, Jen Jen Chung, Roland Siegwart, Javier Alonso-Mora","ETH Zurich,Delft University of Technology,The University of Queensland",Multi-Robot Systems I,"Autonomous vehicles that operate in urban environments shall comply with existing rules and reason about the interactions with other decision-making agents. In this paper, we introduce a decentralized and communication-free interaction-aware motion planner and apply it to Autonomous Surface Vessels (ASVs) in urban canals. We build upon a sampling-based method, namely Model Predictive Path Integral control (MPPI), and employ it to, in each time instance, compute both a collision-free trajectory for the vehicle and a prediction of other agents' trajectories, thus modeling interactions. To improve the method's efficiency in multi-agent scenarios, we introduce a two-stage sample evaluation strategy and define an appropriate cost function to achieve rule compliance. We evaluate this decentralized approach in simulations with multiple vessels in real scenarios extracted from Amsterdam's canals, showing superior performance than a state-of-the-art trajectory optimization framework and robustness when encountering different types of agents." Mixed Observable RRT: Multi-Agent Mission-Planning in Partially Observable Environments,"Kasper Johansson, Ugo Rosolia, Wyatt Ubellacker, Andrew Singletary, Aaron Ames","Stanford University,Caltech,California Institute of Technology",Multi-Robot Systems I,"This paper considers centralized mission-planning for a heterogeneous multi-agent system with the aim of locating a hidden target. We propose a mixed observable setting, consisting of a fully observable state-space and a partially observable environment, using a hidden Markov model. 
First, we construct rapidly exploring random trees (RRTs) to introduce the mixed observable RRT for finding plausible mission plans giving way-points for each agent. Leveraging this construction, we present a path-selection strategy based on a dynamic programming approach, which accounts for the uncertainty from partial observations and minimizes the expected cost. Finally, we combine the high-level plan with model predictive control algorithms to evaluate the approach on an experimental setup consisting of a quadruped robot and a drone. It is shown that agents are able to make intelligent decisions to explore the area efficiently and to locate the target through collaborative actions." RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments,"Aakriti Agrawal, Amrit Bedi, Dinesh Manocha","University of Maryland, College Park,University of Maryland",Multi-Robot Systems I,"We present a novel reinforcement learning based algorithm for multi-robot task allocation problem in warehouse environments. We formulate it as a Markov Decision Process and solve via a novel deep multi-agent reinforcement learning method (called RTAW) with attention inspired policy architecture. Hence, our proposed policy network uses global embeddings that are independent of the number of robots/tasks. We utilize proximal policy optimization algorithm for training and use a carefully designed reward to obtain a converged policy. The converged policy ensures cooperation among different robots to minimize total travel delay (TTD) which ultimately improves the makespan for a sufficiently large task-list. In our extensive experiments, we compare the performance of our RTAW algorithm to state of the art methods such as myopic pickup distance minimization (greedy) and regret based baselines on different navigation schemes. 
We show an improvement of up to 14% (25-1000 seconds) in TTD on scenarios with hundreds or thousands of tasks for different challenging warehouse layouts and task generation schemes. We also demonstrate the scalability of our approach by showing performance with up to 1000 robots in simulations." Hybrid SUSD-Based Task Allocation for Heterogeneous Multi-Robot Teams,"Shengkang Chen, Tony Lin, Said Al-abri, Ronald Arkin, Fumin Zhang","Georgia Tech,Georgia Institute of Technology",Multi-Robot Systems I,"Effective task allocation is an essential component to the coordination of heterogeneous robots. This paper proposes a hybrid task allocation algorithm that improves upon given initial solutions, for example from the popular decentralized market-based allocation algorithm, via a derivative-free optimization strategy called Speeding-Up and Slowing-Down (SUSD). Based on the initial solutions, SUSD performs a search to find an improved task assignment. Unique to our strategy is the ability to apply a gradient-like search to solve a classical integer-programming problem. The proposed strategy outperforms other state-of-the-art algorithms in terms of total task utility and can achieve near optimal solutions in simulation. Experimental results using the Robotarium are also provided." Search Algorithms for Multi-Agent Teamwise Cooperative Path Finding,"Zhongqiang Ren, Chaoran Zhang, Sivakumar Rathinam, Howie Choset","Carnegie Mellon University,TAMU",Multi-Robot Systems I,"Multi-Agent Path Finding (MA-PF) computes a set of collision-free paths for multiple agents from their respective starting locations to destinations. This paper considers a generalization of MA-PF called Multi-Agent Teamwise Cooperative Path Finding (MA-TC-PF), where agents are grouped as multiple teams and each team has its own objective to be minimized. For example, an objective can be the sum or max of individual arrival times of the agents. 
In general, there is more than one team, and MA-TC-PF is thus a multi-objective planning problem with the goal of finding the entire Pareto-optimal front that represents all possible trade-offs among the objectives of the teams. To solve MA-TC-PF, we propose two algorithms TC-CBS and TC-M*, which leverage the existing CBS and M* for conventional MA-PF. We discuss the conditions under which the proposed algorithms are complete and are guaranteed to find the Pareto-optimal front. We present numerical results for several types of MA-TC-PF problems." Collaborative Scheduling with Adaptation to Failure for Heterogeneous Robot Teams,"Peng Gao, Sriram Siva, Anthony Micciche, Hao Zhang","University of Maryland, College Park,Colorado School of Mines,University of Massachusetts Amherst",Multi-Robot Systems I,"Collaborative scheduling is an essential ability for a team of heterogeneous robots to collaboratively complete complex tasks, e.g., in a multi-robot assembly application. To enable collaborative scheduling, two key problems should be addressed, including allocating tasks to heterogeneous robots and adapting to robot failures in order to guarantee the completion of all tasks. In this paper, we introduce a novel approach that integrates deep bipartite graph matching and imitation learning for heterogeneous robots to complete complex tasks as a team. Specifically, we use a graph attention network to represent attributes and relationships of the tasks. Then, we formulate collaborative scheduling with failure adaptation as a new deep learning-based bipartite graph matching problem, which learns a policy by imitation to determine task scheduling based on the reward of potential task schedules. During normal execution, our approach generates robot-task pairs as potential allocations. When a robot fails, our approach identifies not only individual robots but also subteams to replace the failed robot. 
We conduct extensive experiments to evaluate our approach in the scenarios of collaborative scheduling with robot failures. Experimental results show that our approach achieves promising, generalizable and scalable results on collaborative scheduling with robot failure adaptation." AMSwarm: An Alternating Minimization Approach for Safe Motion Planning of Quadrotor Swarms in Cluttered Environments,"Vivek Kantilal Adajania, Siqi Zhou, Arun Singh, Angela P. Schoellig","University of Toronto,Technical University of Munich,University of Tartu,TU Munich",Multi-Robot Systems I,"This paper presents a scalable online algorithm to generate safe and kinematically feasible trajectories for quadrotor swarms. Existing approaches rely on linearizing Euclidean distance-based collision constraints and on axis-wise decoupling of kinematic constraints to reduce the trajectory optimization problem for each quadrotor to a quadratic program (QP). This conservative approximation often fails to find a solution in cluttered environments. We present a novel alternative that handles collision constraints without linearization and kinematic constraints in their quadratic form while still retaining the QP form. We achieve this by reformulating the constraints in a polar form and applying an Alternating Minimization algorithm to the resulting problem. Through extensive simulation results, we demonstrate that, as compared to Sequential Convex Programming (SCP) baselines, our approach achieves on average, a 72% improvement in success rate, a 36% reduction in mission time, and a 42 times faster per-agent computation time. We also show that collision constraints derived from discrete-time barrier functions (BF) can be incorporated, leading to different safety behaviours without significant computational overhead. 
Moreover, our optimizer outperforms the state-of-the-art optimal control solver ACADO in handling BF constraints with a 31 times faster per-agent computation time and a 44% reduction in mission time on average. We experimentally validated our approach on a Crazyflie quadrotor swarm of up to 12 quadrotors. The code with supplementary material and video are released for reference." Decentralized Deadlock-Free Trajectory Planning for Quadrotor Swarm in Obstacle-Rich Environments,"Jungwon Park, Inkyu Jang, H. Jin Kim",Seoul National University,Multi-Robot Systems I,"This paper presents a decentralized multi-agent trajectory planning (MATP) algorithm that guarantees to generate a safe, deadlock-free trajectory in an obstacle-rich environment under a limited communication range. The proposed algorithm utilizes a grid-based multi-agent path planning (MAPP) algorithm for deadlock resolution, and we introduce the subgoal optimization method to make the agent converge to the waypoint generated from the MAPP without deadlock. In addition, the proposed algorithm ensures the feasibility of the optimization problem and collision avoidance by adopting a linear safe corridor (LSC). We verify that the proposed algorithm does not cause a deadlock in both random forests and dense mazes regardless of communication range, and it outperforms our previous work in flight time and distance. We validate the proposed algorithm through a hardware demonstration with ten quadrotors." A Negative Imaginary Theory-Based Time-Varying Group Formation Tracking Scheme for Multi-Robot Systems: Applications to Quadcopters,"Yu-Hsiang Su, Parijat Bhowmick, Alexander Lanzon","The University of Manchester,Indian Institute of Technology Guwahati",Multi-Robot Systems I,"This paper proposes a new methodology to develop a time-varying group formation tracking scheme for a class of multi-agent systems (e.g. different types of multi-robot systems) utilising Negative Imaginary (NI) theory. 
It offers a two-loop control scheme in which the inner loop deploys an appropriate feedback linearising control law to transform the nonlinear dynamics of each agent into a double integrator system, while the outer loop applies an NI-based time-varying group formation control protocol on the linearised agents. This approach offers greater flexibility in choosing a controller, easy implementation and tuning, reduces the overall complexity of the scheme, and uses only output feedback (hence reduced sensing requirements) to achieve formation control in contrast to the existing formation control schemes. The paper has also provided lab-based experimental validation results to demonstrate the feasibility and usefulness of the proposed scheme. Two experiments were conducted on a group of small-scale quadcopters connected via a network to test the time-varying group formation tracking performance." Data-Driven Risk-Sensitive Model Predictive Control for Safe Navigation in Multi-Robot Systems,"Atharva Navsalkar, Ashish Hota","Indian Institute of Technology Kharagpur,Indian Institute of Technology (IIT) Kharagpur",Multi-Robot Systems I,"Safe navigation is a fundamental challenge in multi-robot systems due to the uncertainty surrounding the future trajectory of the robots that act as obstacles for each other. In this work, we propose a principled data-driven approach where each robot repeatedly solves a finite horizon optimization problem subject to collision avoidance constraints with latter being formulated as distributionally robust conditional value-at-risk (CVaR) of the distance between the agent and a polyhedral obstacle geometry. Specifically, the CVaR constraints are required to hold for all distributions that are close to the empirical distribution constructed from observed samples of prediction error collected during execution. 
The generality of the approach allows us to robustify against prediction errors that arise under commonly imposed assumptions in both distributed and decentralized settings. We derive tractable finite-dimensional approximations of this class of constraints by leveraging convex and minmax duality results for Wasserstein distributionally robust optimization problems. The effectiveness of the proposed approach is illustrated in a multi-drone navigation setting implemented in Gazebo platform." Multi-Modal Hierarchical Transformer for Occupancy Flow Field Prediction in Autonomous Driving,"Haochen Liu, Zhiyu Huang, Chen Lv",Nanyang Technological University,Intelligent Transportation Systems I,"Forecasting the future states of surrounding traffic participants is a crucial capability for autonomous vehicles. The recently proposed occupancy flow field prediction introduces a scalable and effective representation to jointly predict surrounding agents' future motions in a scene. However, the challenging part is to model the underlying social interactions among traffic agents and the relations between occupancy and flow. Therefore, this paper proposes a novel Multi-modal Hierarchical Transformer network that fuses the vectorized (agent motion) and visual (scene flow, map, and occupancy) modalities and jointly predicts the flow and occupancy of the scene. Specifically, visual and vector features from sensory data are encoded through a multi-stage Transformer module and then a late-fusion Transformer module with temporal pixel-wise attention. Importantly, a flow-guided multi-head self-attention (FG-MSA) module is designed to better aggregate the information on occupancy and flow and model the mathematical relations between them. The proposed method is comprehensively validated on the Waymo Open Motion Dataset and compared against several state-of-the-art models. 
The results reveal that our model with much more compact architecture and data inputs than other methods can achieve comparable performance. We also demonstrate the effectiveness of incorporating vectorized agent motion features and the proposed FG-MSA module. Compared to the ablated model without the FG-MSA module, which won the 2nd place in the 2022 Waymo Occupancy and Flow Prediction Challenge, the current model shows better separability for flow and occupancy and further performance improvements." Annotating Covert Hazardous Driving Scenarios Online: Utilizing the Driver's Electroencephalography (EEG) Signals,"Chen Zheng, Muxiao Zi, Wenjie Jiang, Mengdi Chu, Yan Zhang, Jirui Yuan, Guyue Zhou, Jiangtao Gong","Institute for AI Industry Research, Tsinghua University,Tsinghua University",Intelligent Transportation Systems I,"As autonomous driving systems prevail, it is becoming increasingly critical that the systems learn from databases containing fine-grained driving scenarios. Most databases currently available are human-annotated; they are expensive, time-consuming, and subject to behavioral biases. In this paper, we provide initial evidence supporting a novel technique utilizing drivers’ electroencephalography (EEG) signals to implicitly label hazardous driving scenarios while passively viewing recordings of real-road driving, thus sparing the need for manual annotation and avoiding human annotators’ behavioral biases during explicit report. We conducted an EEG experiment using real-life and animated recordings of driving scenarios and asked participants to report danger explicitly whenever necessary. Behavioral results showed the participants tended to report danger only when overt hazards (e.g., a vehicle or a pedestrian appearing unexpectedly from behind an occlusion) were in view. 
By contrast, their EEG signals were enhanced at the sight of both an overt hazard and a covert hazard (e.g., an occlusion signalling possible appearance of a vehicle or a pedestrian from behind). Thus, EEG signals were more sensitive to driving hazards than explicit reports. Further, the Time-Series AI (TSAI, [1]) successfully classified EEG signals corresponding to overt and covert hazards. We discuss future steps necessary to materialize the technique in real life." Pedestrian Crossing Action Recognition and Trajectory Prediction with 3D Human Keypoints,"Jiachen Li, Xinwei Shi, Feiyu Chen, Jonathan Stroud, Zhishuai Zhang, Tian Lan, Junhua Mao, Jeonhyung Kang, Khaled Refaat, Weilong Yang, Eugene Ie, Congcong Li","Stanford University,Waymo LLC,Waymo,Google,Waymo Inc.",Intelligent Transportation Systems I,"Accurate understanding and prediction of human behaviors are critical prerequisites for autonomous vehicles, especially in highly dynamic and interactive scenarios such as intersections in dense urban areas. In this work, we aim at identifying crossing pedestrians and predicting their future trajectories. To achieve these goals, we not only need the context information of road geometry and other traffic participants but also need fine-grained information of the human pose, motion and activity, which can be inferred from human keypoints. In this paper, we propose a novel multi-task learning framework for pedestrian crossing action recognition and trajectory prediction, which utilizes 3D human keypoints extracted from raw sensor data to capture rich information on human pose and activity. Moreover, we propose to apply two auxiliary tasks and contrastive learning to enable auxiliary supervisions to improve the learned keypoints representation, which further enhances the performance of major tasks. 
We validate our approach on a large-scale in-house dataset, as well as a public benchmark dataset, and show that our approach achieves state-of-the-art performance on a wide range of evaluation metrics. The effectiveness of each model component is validated in a detailed ablation study." Model-Agnostic Multi-Agent Perception Framework,"Runsheng Xu, Weizhe Chen, Hao Xiang, Xia Xin, Lantao Liu, Jiaqi Ma","UCLA,Indiana University Bloomington,University of California, Los Angeles,Indiana University",Intelligent Transportation Systems I,"Existing multi-agent perception systems assume that every agent utilizes the same model with identical parameters and architecture. The performance can be degraded with different perception models due to the mismatch in their confidence scores. In this work, we propose a model-agnostic multi-agent perception framework to reduce the negative effect caused by the model discrepancies without sharing the model information. Specifically, we propose a confidence calibrator that can eliminate the prediction confidence score bias. Each agent performs such calibration independently on a standard public database to protect intellectual property. We also propose a corresponding bounding box aggregation algorithm that considers the confidence scores and the spatial agreement of neighboring boxes. Our experiments shed light on the necessity of model calibration across different agents, and the results show that the proposed framework improves the baseline 3D object detection performance of heterogeneous agents." Explainable Action Prediction through Self-Supervision on Scene Graphs,"Pawit Kochakarn, Daniele De Martini, Daniel Omeiza, Lars Kunze",University of Oxford,Intelligent Transportation Systems I,"This work explores scene graphs as a distilled representation of high-level information for autonomous driving, applied to future driver-action prediction. 
Given the scarcity and strong imbalance of data samples, we propose a self-supervision pipeline to infer representative and well-separated embeddings. Key aspects are interpretability and explainability; as such, we embed in our architecture attention mechanisms that can create spatial and temporal heatmaps on the scene graphs. We evaluate our system on the ROAD dataset against a fully-supervised approach, showing the superiority of our training regime." CueCAn: Cue-Driven Contextual Attention for Identifying Missing Traffic Signs on Unconstrained Roads,"Varun Gupta, Anbumani Subramanian, C.V. Jawahar, Rohit Saluja","IIIT, Hyderabad,Intel,IIIT Hyderabad",Intelligent Transportation Systems I,"Unconstrained Asian roads often involve poor infrastructure, affecting overall road safety. Missing traffic signs are a regular part of such roads. Missing or non-existing object detection has been studied for locating missing curbs and estimating reasonable regions for pedestrians on road scene images. Such methods involve analyzing task-specific single object cues. In this paper, we present the first and most challenging video dataset for missing objects, with multiple types of traffic signs for which the cues are visible without the signs in the scenes. We refer to it as the Missing Traffic Signs Video Dataset (MTSVD). MTSVD is challenging compared to the previous works in two aspects i) The traffic signs are generally not present in the vicinity of their cues, ii) The traffic signs’ cues are diverse and unique. Also, MTSVD is the first publicly available missing object dataset. To train the models for identifying missing signs, we complement our dataset with 10K traffic sign tracks, with 40% of the traffic signs having cues visible in the scenes. For identifying missing signs, we propose the Cue-driven Contextual Attention units (CueCAn), which we incorporate in our model’s encoder. 
We first train the encoder to classify the presence of traffic sign cues and then train the entire segmentation model end-to-end to localize missing traffic signs. Quantitative and qualitative analysis shows that CueCAn significantly improves the performance of base models." Tackling Clutter in Radar Data - Label Generation and Detection Using PointNet++,"Johannes Kopp, Dominik Kellner, Aldi Piroli, Klaus Dietmayer","Ulm University, Germany,BMW AG,Universität Ulm,University of Ulm",Intelligent Transportation Systems I,"Radar sensors employed for environment perception, e.g. in autonomous vehicles, output a lot of unwanted clutter. These points, for which no corresponding real objects exist, are a major source of errors in following processing steps like object detection or tracking. We therefore present two novel neural network setups for identifying clutter. The input data, network architectures and training configuration are adjusted specifically for this task. Special attention is paid to the downsampling of point clouds composed of multiple sensor scans. In an extensive evaluation, the new setups display substantially better performance than existing approaches. Because there is no suitable public data set in which clutter is annotated, we design a method to automatically generate the respective labels. By applying it to existing data with object annotations and releasing its code, we effectively create the first freely available radar clutter data set representing real-world driving scenarios. Code and instructions are accessible at www.github.com/kopp-j/clutter-ds." "Effective Combination of Vertical, Longitudinal and Lateral Data for Vehicle Mass Estimation","Younesse EL MRHASLI, Bruno Monsuez, Xavier Mouton","ENSTA PARIS,ENSTA-ParisTech,Groupe Renault",Intelligent Transportation Systems I,"Real-time knowledge of the vehicle mass is valuable for several applications, mainly: active safety systems design and energy consumption optimization. 
This work describes a novel strategy for mass estimation in static and dynamic conditions. First, when the vehicle is powered-up, an initial estimation is given by observing the variations of one suspension deflection sensor mounted on the rear. Then, the estimation is refined based on conditioned and filtered longitudinal and lateral motions. In this study, we suggest using these extracted events on two different algorithms, namely: the recursive least squares and the prior-recursive Bayesian inference. That is to express the results in a deterministic and statistical sense. Both simulations and experimental tests show that our approach encompasses the benefits of various works in the literature, preeminently, robustness to resistive loads, fast convergence, and minimal instrumentation." Receding Horizon Planning with Rule Hierarchies for Autonomous Vehicles,"Sushant Veer, Karen Yan Ming Leung, Ryan Cosner, Yuxiao Chen, Peter Karkus, Marco Pavone","NVIDIA,Stanford University, NVIDIA Research, University of Washington,California Institute of Technology,Nvidia research,Stanford University",Intelligent Transportation Systems I,"Autonomous vehicles must often contend with conflicting planning requirements, e.g., safety and comfort could be at odds with each other if avoiding a collision calls for slamming the brakes. To resolve such conflicts, assigning importance ranking to rules (i.e., imposing a rule hierarchy) has been proposed, which, in turn, induces rankings on trajectories based on the importance of the rules they satisfy. On one hand, imposing rule hierarchies can enhance interpretability, but introduce combinatorial complexity to planning; while on the other hand, differentiable reward structures can be leveraged by modern gradient-based optimization tools, but are less interpretable and unintuitive to tune. 
In this paper, we present an approach to equivalently express rule hierarchies as differentiable reward structures amenable to modern gradient-based optimizers, thereby achieving the best of both worlds. We achieve this by formulating rank-preserving reward functions that are monotonic in the rank of the trajectories induced by the rule hierarchy; i.e., higher ranked trajectories receive higher reward. Equipped with a rule hierarchy and its corresponding rank-preserving reward function, we develop a two-stage planner that can efficiently resolve conflicting planning requirements. We demonstrate that our approach can generate motion plans at ~7-10 Hz in various challenging road navigation and intersection negotiation scenarios." Active Probing and Influencing Human Behaviors Via Autonomous Agents,"Shuangge Wang, Yiwei Lyu, John Dolan","University of Southern California,Carnegie Mellon University",Intelligent Transportation Systems I,"Autonomous agents (robots) face tremendous challenges while interacting with heterogeneous human agents in close proximity. One of these challenges is that the autonomous agent does not have an accurate model tailored to the specific human that the autonomous agent is interacting with, which could sometimes result in inefficient human-robot interaction and suboptimal system dynamics. Developing an online method to enable the autonomous agent to learn information about the human model is therefore an ongoing research goal. Existing approaches position the robot as a passive learner in the environment to observe the physical states and the associated human response. This passive design, however, only allows the robot to obtain information that the human chooses to exhibit, which sometimes doesn’t capture the human’s full intention. In this work, we present an online optimization-based probing procedure for the autonomous agent to clarify its belief about the human model in an active manner. 
By optimizing an information radius, the autonomous agent chooses the action that most challenges its current conviction. This procedure allows the autonomous agent to actively probe the human agents to reveal information that was previously unavailable to the autonomous agent. With this gathered information, the autonomous agent can interactively influence the human agent for some designated objectives. Our main contributions include a coherent theoretical framework that unifies the probing and influence procedures and two case studies in autonomous driving that show how active probing can help to create better participant experience during influence, like higher efficiency or fewer perturbations." TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction,"Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc Van Gool","ETH Zurich,ETH Zürich",Intelligent Transportation Systems I,"Data-driven simulation has become a favorable way to train and test autonomous driving algorithms. The idea of replacing the actual environment with a learned simulator has also been explored in model-based reinforcement learning in the context of world models. In this work, we show data-driven traffic simulation can be formulated as a world model. We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles. Existing data-driven traffic simulators lack configurability and scalability. To generate configurable behaviors, for each agent we introduce a destination as navigational information, and a time-invariant latent personality that specifies the behavioral style. To improve the scalability, we present a new scheme of positional encoding for angles, allowing all agents to share the same vectorized context and the use of an architecture based on dot-product attention. 
As a result, we can simulate all traffic participants seen in dense urban scenarios. Experiments on the Waymo open motion dataset show TrafficBots can simulate realistic multi-agent behaviors and achieve good performance on the motion prediction task." SHAIL: Safety-Aware Hierarchical Adversarial Imitation Learning for Autonomous Driving in Urban Environments,"Arec Jamgochian, Etienne Buehrle, Johannes Fischer, Mykel Kochenderfer","Stanford University,Karlsruhe Institute of Technology",Intelligent Transportation Systems I,"Designing a safe and human-like decision-making system for an autonomous vehicle is a challenging task. Generative imitation learning is one possible approach for automating policy-building by leveraging both real-world and simulated decisions. Previous work that applies generative imitation learning to autonomous driving policies focuses on learning a low-level controller for simple settings. However, to scale to complex settings, many autonomous driving systems combine fixed, safe, optimization-based low-level controllers with high-level decision-making logic that selects the appropriate task and associated controller. In this paper, we attempt to bridge this gap in complexity by employing Safety-Aware Hierarchical Adversarial Imitation Learning (SHAIL), a method for learning a high-level policy that selects from a set of low-level controller instances in a way that imitates low-level driving data on-policy. We introduce an urban roundabout simulator that controls non-ego vehicles using real data from the Interaction dataset. We then demonstrate empirically that even with simple controller options, our approach can produce better behavior than previous approaches in driver imitation that have difficulty scaling to complex environments. Our implementation is available at https://github.com/sisl/InteractionImitation." 
Reinforcement Learning-Based Optimal Multiple Waypoint Navigation,"Christos Vlachos, Panagiotis Rousseas, Charalampos Bechlioulis, Kostas Kyriakopoulos","National Technical University of Athens,University of Patras,National Technical Univ. of Athens",Motion and Path Planning I,"In this paper, a novel method based on Artificial Potential Field (APF) theory is presented, for optimal motion planning in fully-known, static workspaces, for multiple final goal configurations. Optimization is achieved through a Reinforcement Learning (RL) framework. More specifically, the parameters of the underlying potential field are adjusted through a policy gradient algorithm in order to minimize a cost function. The main novelty of the proposed scheme lies in the method that provides optimal policies for multiple final positions, in contrast to most existing methodologies that consider a single final configuration. An assessment of the optimality of our results is conducted by comparing our novel motion planning scheme against a RRT∗ method." DriveIRL: Drive in Real Life with Inverse Reinforcement Learning,"Tung Phan-minh, Forbes Howington, Ting-sheng Chu, Momchil Tomov, Robert Beaudoin, Sang Uk Lee, Nanxiang Li, Caglayan Dicle, Samuel Findler, Francisco Suárez-Ruiz, Bo Yang, Sammy Omari, Eric Wolff","Motional AD,Motional,University of Michigan,Bosch Research and Technology Center,Senior Software Engineer at Motional,Nanyang Technological University,ETH Zurich,California Institute of Technology",Motion and Path Planning I,"In this paper, we introduce the first published planner to drive a car in dense, urban traffic using Inverse Reinforcement Learning (IRL). Our planner, DriveIRL, generates a diverse set of trajectory proposals and scores them with a learned model. The best trajectory is tracked by our self-driving vehicle's low-level controller. 
We train our trajectory scoring model on a 500+ hour real-world dataset of expert driving demonstrations in Las Vegas within the maximum entropy IRL framework. DriveIRL's benefits include: a simple design due to only learning the trajectory scoring function, a flexible and relatively interpretable feature engineering approach, and strong real-world performance. We validated DriveIRL on the Las Vegas Strip and demonstrated fully autonomous driving in heavy traffic, including scenarios involving cut-ins, abrupt braking by the lead vehicle, and hotel pickup/dropoff zones. Our dataset is currently undergoing public release to help further research in this area." LES: Locally Exploitative Sampling for Robot Path Planning,"Sagar Joshi, Seth Hutchinson, Panagiotis Tsiotras","Aurora Innovation,Georgia Institute of Technology,Georgia Tech",Motion and Path Planning I,"Sampling-based algorithms solve the path planning problem by generating random samples in the search-space and incrementally growing a connectivity graph or a tree. Conventionally, the sampling strategy used in these algorithms is biased towards exploration to acquire information about the search-space. In contrast, this work proposes an optimization-based procedure that generates new samples so as to improve the cost-to-come value of vertices in a given neighborhood. The application of the proposed algorithm adds an exploitative bias to sampling and results in a faster convergence to the optimal solution compared to other state-of-the-art sampling techniques. This is demonstrated using benchmarking experiments performed for 7 DOF Panda and 14 DOF Baxter robots." Boundary Conditions in Geodesic Motion Planning for Manipulators,"Mario Laux, Andreas Zell",University of Tübingen,Motion and Path Planning I,"In dynamic environments, robotic manipulators and especially cobots must be able to react to changing circumstances while in motion. 
This substantiates the need for quick trajectory planning algorithms that are able to cope with arbitrary velocity and acceleration boundary conditions. Apart from dynamic re-planning, being able to seamlessly join trajectories together opens the door for divide-and-conquer-type algorithms to focus on the individual parts of a motion separately. While geodesic motion planning has proven that it can produce very smooth and efficient actuator movement, the problem of incorporating non-zero boundary conditions has not been addressed yet. We show how a set of generalized coordinates can be used to transition between boundary conditions and free movement in an optimal way while still retaining the known advantages of geodesic planners. We also outline how our approach can be combined with the family of time-scaling algorithms for further improvement of the generated trajectories." TOFG: A Unified and Fine-Grained Environment Representation in Autonomous Driving,"Zihao Wen, Yifan Zhang, Xinhong Chen, Jianping Wang",City University of Hong Kong,Motion and Path Planning I,"In autonomous driving, an accurate understanding of the environment, e.g., the vehicle-to-vehicle and vehicle-to-lane interactions, plays a critical role in many driving tasks such as trajectory prediction and motion planning. Environment information comes from high-definition (HD) maps and historical trajectories of vehicles. Due to the heterogeneity of the map data and trajectory data, many data-driven models for trajectory prediction and motion planning extract vehicle-to-vehicle and vehicle-to-lane interactions in a separate and sequential manner. However, such a manner may capture biased interpretation of interactions, causing lower prediction and planning accuracy. Moreover, separate extraction leads to a complicated model structure and hence the overall efficiency and scalability are sacrificed. 
To address the above issues, we propose an environment representation, Temporal Occupancy Flow Graph (TOFG). Specifically, the occupancy flow-based representation unifies the map information and vehicle trajectories into a homogeneous data format and enables a consistent prediction. The temporal dependencies among vehicles can help capture the change of occupancy flow timely to further promote model performance. To demonstrate that TOFG is capable of simplifying the model architecture, we incorporate TOFG with a simple graph attention (GAT) based neural network and propose TOFG-GAT, which can be used for both trajectory prediction and motion planning. Experiment results show that TOFG-GAT achieves better or competitive performance than all the SOTA baselines with less training time." Unidirectional-Road-Network-Based Global Path Planning for Cleaning Robots in Semi-Structured Environments,"Yong Li, Hui Cheng","Guangzhou Shiyuan Electronic Technology Co., Ltd,Sun Yat-sen University",Motion and Path Planning I,"Practical global path planning is critical for commercializing cleaning robots working in semi-structured environments. In the literature, global path planning methods for free space usually focus on path length and neglect the traffic rule constraints of the environments, which leads to high-frequency re-planning and increases collision risks. In contrast, those for structured environments are developed mainly by strictly complying with the road network representing the traffic rule constraints, which may result in an overlong path that hinders the overall navigation efficiency. This article proposes a general and systematic approach to improve global path planning performance in semi-structured environments. A unidirectional road network is built to represent the traffic constraints in semi-structured environments and a hybrid strategy is proposed to achieve a guaranteed planning result. 
Cutting across the road at the starting and the goal points is allowed to achieve a shorter path. In particular, a two-layer potential map is proposed to achieve a guaranteed performance when the starting and the goal points are in complex intersections. Comparative experiments are carried out to validate the effectiveness of the proposed method. Quantitative experimental results show that, compared with the state of the art, the proposed method guarantees a much better balance between path length and the consistency with the road network." A Hierarchical Decoupling Approach for Fast Temporal Logic Motion Planning,"Ziyang Chen, Zhangli Zhou, Shaochen Wang, Zhen Kan",University of Science and Technology of China,Motion and Path Planning I,"Fast motion planning is of great significance, especially when a timely mission is desired. However, the complexity of motion planning can grow drastically with the increase of environment details and mission complexity. This challenge can be further exacerbated if the tasks are coupled with the desired locations in the environment. To address these issues, this work aims at fast motion planning problems with temporal logical specifications. In particular, we develop a hierarchical decoupling framework that consists of three layers: the high-level task planner, the decoupling layer, and the low-level motion planner. The decoupling layer is designed to bridge the high and low layers by providing necessary information exchange. Such a framework enables the decoupling of the task planner and path planner, so that they can run independently, which significantly reduces the search space and enables fast planning in continuous or high-dimensional discrete workspaces. In addition, the implicit constraint during task-level planning is taken into account, so that the low-level path planning is guaranteed to satisfy the mission requirements. 
Numerical simulations demonstrate at least one order of magnitude speed up in terms of computational time over existing methods." A Fast Two-Stage Approach for Multi-Goal Path Planning in a Fruit Tree,"Werner Kroneman, João Valente, Frank Van Der Stappen","University College Roosevelt,Wageningen University & Research,Utrecht University",Motion and Path Planning I,"We consider the problem of planning the motion of a drone equipped with a robotic arm, tasked with bringing its end-effector up to many (150+) targets in a fruit tree; to inspect every piece of fruit, for example. The task is complicated by the intersection of a version of Neighborhood TSP (to find an optimal order and a pose to visit every target), and a robotic motion-planning problem through a planning space that features numerous cavities and narrow passages that confuse common techniques. In this contribution, we present a framework that decomposes the problem into two stages: planning approach paths for every target, and quickly planning between the start points of those approach paths. Then, we compare our approach by simulation to a more straightforward method based on multi-query planning, showing that our approach outperforms it in both time and solution cost." Online Whole-Body Motion Planning for Quadrotor Using Multi-Resolution Search,"Yunfan Ren, Siqi Liang, Fangcheng Zhu, Guozheng Lu, Fu Zhang","The University of Hong Kong,Harbin Institute of Technology, Shenzhen,University of Hong Kong",Motion and Path Planning I,"In this paper, we address the problem of online quadrotor whole-body motion planning (SE(3) planning) in unknown and unstructured environments. We propose a novel multi-resolution search method, which discovers narrow areas requiring full pose planning and normal areas requiring only position planning. As a consequence, a quadrotor planning problem is decomposed into several SE(3) (if necessary) and R^3 sub-problems. 
To fly through the discovered narrow areas, a carefully designed corridor generation strategy for narrow areas is proposed, which significantly increases the planning success rate. The overall problem decomposition and hierarchical planning framework substantially accelerate the planning process, making it possible to work online with fully onboard sensing and computation in unknown environments. Extensive simulation benchmark comparisons show that the proposed method is one to several orders of magnitude faster than the state-of-the-art methods in computation time while maintaining high planning success rate. The proposed method is finally integrated into a LiDAR-based autonomous quadrotor, and various real-world experiments in unknown and unstructured environments are conducted to demonstrate the outstanding performance of the proposed method." Intermittent Diffusion Based Path Planning for Heterogeneous Groups of Mobile Sensors in Cluttered Environments,"Christina Frederick, Haomin Zhou, Frank Crosby","NJIT,Georgia Institute of Technology,USNWC PC",Motion and Path Planning I,"This paper presents a method for task-oriented path planning and collision avoidance for a group of heterogeneous holonomic mobile sensors. It is a generalization of the authors' prior work on diffusion-based path planning. The proposed variant allows one to plan paths in environments cluttered with obstacles. The agents follow flow dynamics, i.e., the negative gradient of a function that is the sum of two functions: the first minimizes the distance from desired target regions and the second captures distance from other agents within a field of view. When it becomes necessary to steer around an obstacle, this function is augmented by a projection term that is carefully designed in terms of obstacle boundaries. More importantly, a diffusion term is added intermittently so that agents can exit local minima. 
In addition, the new approach skips the offline planning phase in the prior approach to improve computational performance and handle collision avoidance with a completely decentralized method. This approach also provably finds collision-free paths under certain conditions. Numerical simulations of three deployment missions further support the performance of ID-based diffusion." GANet: Goal Area Network for Motion Forecasting,"Mingkun Wang, Xinge Zhu, Changqian Yu, Wei Li, Yuexin Ma, Ruochun Jin, Xiaoguang Ren, Dongchun Ren, Mingxu Wang, Wenjing Yang","Peking University,CUHK,Meituan,Inceptio,ShanghaiTech University,National University of Defense Technology,Academy of Military Sciences,Fudan University,State Key Laboratory of High Performance Computing (HPCL), Schoo",Motion and Path Planning I,"Predicting the future motion of road participants is crucial for autonomous driving but is extremely challenging due to staggering motion uncertainty. Recently, most motion forecasting methods resort to the goal-based strategy, i.e., predicting endpoints of motion trajectories as conditions to regress the entire trajectories, so that the search space of solution can be reduced. However, accurate goal coordinates are hard to predict and evaluate. In addition, the point representation of the destination limits the utilization of a rich road context, leading to inaccurate prediction results in many cases. Goal area, i.e., the possible destination area, rather than goal coordinate, could provide a more soft constraint for searching potential trajectories by involving more tolerance and guidance. In view of this, we propose a new goal area-based framework, named Goal Area Network (GANet), for motion forecasting, which models goal areas as preconditions for trajectory prediction, performing more robustly and accurately. 
Specifically, we propose a GoICrop (Goal Area of Interest) operator to effectively extract semantic lane features in goal areas and model actors' future interactions, which greatly benefits future trajectory estimation. GANet ranks 1st on the leaderboard of Argoverse Challenge among all public literature (till the paper submission), and its source code will be released." FlowMap: Path Generation for Automated Vehicles in Open Space Using Traffic Flow,"Wenchao Ding, Jieru Zhao, Yubin Chu, Haihui Huang, Tong Qin, Chunjing Xu, Yuxiang Guan, Zhongxue Gan","Fudan University,Shanghai Jiao Tong University,Dalian University of Technology,Zhejiang University,Huawei Technology,Huawei Technologies",Motion and Path Planning I,"There is extensive literature on perceiving road structures by fusing various sensor inputs such as lidar point clouds and camera images using deep neural nets. Leveraging the latest advances in neural architectures (such as transformers) and bird-eye-view (BEV) representation, the road cognition accuracy keeps improving. However, how to cognize the “road” for automated vehicles where there are no well-defined “roads” remains an open problem. For example, how to find paths inside intersections without HD maps is hard since there is neither an explicit definition for “roads” nor explicit features such as lane markings. The idea of this paper comes from a proverb: it becomes a way when people walk on it. Although there are no “roads” from sensor readings, there are “roads” from tracks of other vehicles. In this paper, we propose FlowMap, a path generation framework for automated vehicles based on traffic flows. FlowMap is built by extending our previous work RoadMap [1], a light-weight semantic map, with an additional traffic flow layer. A path generation algorithm on traffic flow fields (TFFs) is proposed to generate human-like paths. 
The proposed framework is validated using real-world driving data and is amenable to generating paths for super complicated intersections without using HD maps." An Architecture for Reactive Mobile Manipulation On-The-Move,"Ben Burgess-Limerick, Christopher Lehnert, Jurgen Leitner, Peter Corke","Queensland University of Technology,LYRO Robotics & Monash University",Reactive and Sensor-Based Planning,"We present a generalised architecture for reactive mobile manipulation while a robot's base is in motion toward the next objective in a high-level task. By performing tasks on-the-move, overall cycle time is reduced compared to methods where the base pauses during manipulation. Reactive control of the manipulator enables grasping objects with unpredictable motion while improving robustness against perception errors, environmental disturbances, and inaccurate robot control compared to open-loop, trajectory-based planning approaches. We present an example implementation of the architecture and investigate the performance on a series of pick and place tasks with both static and dynamic objects and compare the performance to baseline methods. Our method demonstrated a real-world success rate of over 99%, failing in only a single trial from 120 attempts with a physical robot system. The architecture is further demonstrated on other mobile manipulator platforms in simulation. Our approach reduces task time by up to 48%, while also improving reliability, gracefulness, and predictability compared to existing architectures for mobile manipulation." Multi-Robot Mission Planning in Dynamic Semantic Environments,"Samarth Kalluraya, George J. Pappas, Yiannis Kantaros","Washington University in St. Louis,University of Pennsylvania",Reactive and Sensor-Based Planning,"This paper addresses a new semantic multi-robot planning problem in uncertain and dynamic environments. Particularly, the environment is occupied with mobile and uncertain semantic targets. 
These targets are governed by stochastic dynamics while their current and future positions as well as their semantic labels are uncertain. Our goal is to control mobile sensing robots so that they can accomplish collaborative semantic tasks defined over the uncertain current/future positions and semantic labels of these targets. We express these tasks using Linear Temporal Logic (LTL). We propose a sampling-based approach that explores the robot motion space, the mission specification space, as well as the future configurations of the semantic targets to design optimal paths. These paths are revised online to adapt to uncertain perceptual feedback. To the best of our knowledge, this is the first work that addresses semantic mission planning problems in uncertain and dynamic semantic environments. We provide extensive experiments that demonstrate the efficiency of the proposed method." A System for Generalized 3D Multi-Object Search,"Kaiyu Zheng, Anirudha Paul, Stefanie Tellex","Brown University,Brown",Reactive and Sensor-Based Planning,"Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability for robots, similar to e.g., object detection and SLAM. In contrast, however, no system for 3D object search exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploited the octree structure for representing belief in 3D, we present GenMOS (Generalized Multi-Object Search), the first general-purpose system for multi-object search (MOS) in a 3D region that is robot-independent and environment-agnostic. GenMOS takes as input point cloud observations of the local region, object detection results, and localization of the robot's view pose, and outputs a 6D viewpoint to move to through online planning. 
In particular, GenMOS uses point cloud observations in three ways: (1) to simulate occlusion; (2) to inform occupancy and initialize octree belief; and (3) to sample a belief-dependent graph of view positions that avoid obstacles. We evaluate our system both in simulation and on two real robot platforms. Our system enables, for example, a Boston Dynamics Spot robot to find a toy cat hidden underneath a couch in under one minute. We further integrate 3D local search with 2D global search to handle larger areas, demonstrating the resulting system in a 25m$^2$ lobby area." A General Class of Combinatorial Filters That Can Be Minimized Efficiently,"Yulin Zhang, Dylan Shell","Amazon,Texas A&M University",Reactive and Sensor-Based Planning,"State minimization of combinatorial filters is a fundamental problem that arises, for example, in building cheap, resource-efficient robots. But exact minimization is known to be NP-hard. This paper conducts a more nuanced analysis of this hardness than up till now, and uncovers two factors which contribute to this complexity. We show each factor is a distinct source of the problem’s hardness and are able, thereby, to shed some light on the role played by (1) structure of the graph that encodes compatibility relationships, and (2) determinism-enforcing constraints. Just as a line of prior work has sought to introduce additional assumptions and identify sub-classes that lead to practical state reduction, we next use this new, sharper understanding to explore special cases for which exact minimization is efficient. We introduce a new algorithm for constraint repair that applies to a large sub-class of filters, subsuming three distinct special cases for which the possibility of optimal minimization in polynomial time was known earlier. While the efficiency in each of these three cases previously appeared to stem from seemingly dissimilar properties, when seen through the lens of the present work, their commonality now becomes clear. 
We also provide entirely new families of filters that are efficiently reducible." Cautious Planning with Incremental Symbolic Perception: Designing Verified Reactive Driving Maneuvers,"Disha Kamale, Sofie Haesaert, Cristian Ioan Vasile","Lehigh University,Eindhoven University of Technology",Reactive and Sensor-Based Planning,"This work presents a step towards utilizing incrementally-improving symbolic perception knowledge of the robot’s surroundings for provably correct reactive control synthesis applied to an autonomous driving problem. Combining abstract models of motion control and information gathering, we show that assume-guarantee specifications (a subclass of Linear Temporal Logic) can be used to define and resolve traffic rules for cautious planning. We propose a novel representation called symbolic refinement tree for perception that captures the incremental knowledge about the environment and embodies the relationships between various symbolic perception inputs. The incremental knowledge is leveraged for synthesizing verified reactive plans for the robot. The case studies demonstrate the efficacy of the proposed approach in synthesizing control inputs even in case of partially occluded environments." Decision Diagrams As Plans: Answering Observation-Grounded Queries,"Dylan Shell, Jason O'kane",Texas A&M University,Reactive and Sensor-Based Planning,"We consider a robot that answers questions about its environment by traveling to appropriate places and then sensing. Questions are posed as structured queries and may involve conditional or contingent relationships between observable properties. After formulating this problem, and emphasizing the advantages of exploiting deducible information, we describe how non-trivial knowledge of the world and queries can be given a convenient, concise, unified representation via reduced ordered binary decision diagrams (BDDs). 
To use these data structures directly for inference and planning, we introduce a new product operation and generalize the classic dynamic variable reordering techniques to solve planning problems. Also, finally, we evaluate optimizations that exploit locality." Obstacle Avoidance Using Raycasting and Riemannian Motion Policies at kHz Rates for MAVs,"Michael Pantic, Isar Meijer, Rik Marian Kai Bähnemann, Nikhilesh Alatur, Olov Andersson, Cesar D. Cadena Lerma, Roland Siegwart, Lionel Ott","ETH Zürich,ETH Zurich",Reactive and Sensor-Based Planning,"This paper presents a novel method for using Riemannian Motion Policies on volumetric maps, shown in the example of obstacle avoidance for Micro Aerial Vehicles (MAVs). Today, most robotic obstacle avoidance algorithms rely on sampling or optimization-based planners with volumetric maps. However, they are computationally expensive and often have inflexible monolithic architectures. Riemannian Motion Policies are a modular, parallelizable, and efficient navigation alternative but are challenging to use with the widely used voxel-based environment representations. We propose using GPU raycasting and tens of thousands of concurrent policies to provide direct obstacle avoidance using Riemannian Motion Policies in voxelized maps without needing map smoothing or pre-processing. Additionally, we present how the same method can directly plan on LiDAR scans without any intermediate map. We show how this reactive approach compares favorably to traditional planning methods and can evaluate up to $200$ million rays per second. We demonstrate the planner successfully on a real MAV for static and dynamic obstacles. The presented planner is made available as an open-source package." 
Adaptive and Explainable Deployment of Navigation Skills Via Hierarchical Deep Reinforcement Learning,"Kyowoon Lee, Seongun Kim, Jaesik Choi","Ulsan National Institute of Science and Technology,Korea Advanced Institute of Science and Technology",Reactive and Sensor-Based Planning,"For robotic vehicles to navigate robustly and safely in unseen environments, it is crucial to decide the most suitable navigation policy. However, most existing deep reinforcement learning based navigation policies are trained with a hand-engineered curriculum and reward function which are difficult to be deployed in a wide range of real-world scenarios. In this paper, we propose a framework to learn a family of low-level navigation policies and a high-level policy for deploying them. The main idea is that, instead of learning a single navigation policy with a fixed reward function, we simultaneously learn a family of policies that exhibit different behaviors with a wide range of reward functions. We then train the high-level policy which adaptively deploys the most suitable navigation skill. We evaluate our approach in simulation and the real world and demonstrate that our method can learn diverse navigation skills and adaptively deploy them. We also illustrate that our proposed hierarchical learning framework presents explainability by providing semantics for the behavior of an autonomous agent." Learning Agile Flight Maneuvers: Deep SE(3) Motion Planning and Control for Quadrotors,"Yixiao Wang, Bingheng Wang, Shenning Zhang, Han Wei Sia, Lin Zhao","National University of Singapore,ST Engineering",Collision Avoidance,"Agile flights of autonomous quadrotors in cluttered environments require constrained motion planning and control subject to translational and rotational dynamics. Traditional model-based methods typically demand complicated design and heavy computation. 
In this paper, we develop a novel deep reinforcement learning-based method that tackles the challenging task of flying through a dynamic narrow gate. We design a model predictive controller with its adaptive tracking references parameterized by a deep neural network (DNN). These references include the traversal time and the quadrotor SE(3) traversal pose that encourage the robot to fly through the gate with maximum safety margins from various initial conditions. To cope with the difficulty of training in highly dynamic environments, we develop a reinforce-imitate learning framework to train the DNN efficiently that generalizes well to diverse settings. Furthermore, we propose a binary search algorithm that allows online adaption of the SE(3) references to dynamic gates in real-time. Finally, through extensive high-fidelity simulations, we show that our approach is adaptive to different gate trajectories, velocities, and orientations." Robust MADER: Decentralized and Asynchronous Multiagent Trajectory Planner Robust to Communication Delay,"Kota Kondo, Jesus Tordesillas Torres, Reinaldo Figueroa, Juan Rached, Joseph Merkel, Parker Lusk, Jonathan Patrick How","Massachusetts Institute of Technology,MIT Aerospace Controls Lab",Collision Avoidance,"Although communication delays can disrupt multiagent systems, most of the existing multiagent trajectory planners lack a strategy to address this issue. State-of-the-art approaches typically assume perfect communication environments, which is hardly realistic in real-world experiments. This paper presents Robust MADER (RMADER), a decentralized and asynchronous multiagent trajectory planner that can handle communication delays among agents. By broadcasting both the newly optimized trajectory and the committed trajectory, and by performing a delay check step, RMADER is able to guarantee safety even under communication delay. 
RMADER was validated through extensive simulation and hardware flight experiments and achieved a 100% success rate of collision-free trajectory generation, outperforming state-of-the-art approaches." Obstacle Identification and Ellipsoidal Decomposition for Fast Motion Planning in Unknown Dynamic Environments,"Mehmetcan Kaymaz, Nazim Ure",Istanbul Technical University,Collision Avoidance,"Collision avoidance in the presence of dynamic obstacles in unknown environments is one of the most critical challenges for unmanned systems. In this paper, we present a method that identifies obstacles in terms of ellipsoids to estimate linear and angular obstacle velocities. Our proposed method is based on the idea that any object can be approximately expressed by ellipsoids. To achieve this, we propose a method based on variational Bayesian estimation of a Gaussian mixture model, the Khachiyan algorithm, and a refinement algorithm. Our proposed method does not require knowledge of the number of clusters and can operate in real-time, unlike existing optimization-based methods. In addition, we define an ellipsoid-based feature vector to match obstacles given two timely close point frames. Our method can be applied to any environment with static and dynamic obstacles, including ones with rotating obstacles. We compare our algorithm with other clustering methods and show that when coupled with a trajectory planner, the overall system can efficiently traverse unknown environments in the presence of dynamic obstacles." Safe Operations of an Aerial Swarm Via a Cobot Human Swarm Interface,"Sydrak Abdi, Derek Paley",University of Maryland,Collision Avoidance,"Command and control of an aerial swarm is a complex task. This task increases in difficulty when the flight volume is restricted and the swarm and operator inhabit the same workspace. This work presents a novel method for interacting with and controlling a swarm of quadrotors in a confined space. 
EMG-based gesture control is used to control the position, orientation, and density of the swarm. Inter-agent as well as agent-operator collisions are prevented through a velocity controller based on a distance-based potential function. State feedback is relayed to the operator via a vibrotactile haptic vest. This cobot human swarm interface prioritizes operator safety while reducing the cognitive load during control of a cobot swarm. This work demonstrates that an operator can safely and intuitively control a swarm of aerial robots in the same workspace." MonoGraspNet: 6-DoF Grasping with a Single RGB Image,"Guangyao Zhai, Dianye Huang, Shun-cheng Wu, Hyunjun Jung, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam","Technical University of Munich,Google,Technische Universität München,TU Munich",Perception for Grasping and Manipulation I,"6-DoF robotic grasping is a long-lasting but unsolved problem. Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors, demonstrating superior accuracy on common objects but performing unsatisfactorily on photometrically challenging objects, e.g., objects in transparent or reflective materials. The bottleneck lies in that the surface of these objects can not reflect accurate depth due to the absorption or refraction of light. In this paper, in contrast to exploiting the inaccurate depth data, we propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet that utilizes stable 2D features to simultaneously handle arbitrary object grasping and overcome the problems induced by photometrically challenging objects. MonoGraspNet leverages a keypoint heatmap and a normal map to recover the 6-DoF grasping poses represented by our novel representation parameterized with 2D keypoints with corresponding depth, grasping direction, grasping width, and angle. 
Extensive experiments in real scenes demonstrate that our method can achieve competitive results in grasping common objects and surpass the depth-based competitor by a large margin in grasping photometrically challenging objects. To further stimulate robotic manipulation research, we annotate and open-source a multi-view and multi-scene grasping dataset in the real world containing 120 objects of mixed photometric complexity with 20M accurate grasping labels." USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable Manipulation,"Zhengrong Xue, Zhecheng Yuan, Jiashun Wang, Xueqian Wang, Yang Gao, Huazhe Xu","Shanghai Jiao Tong University,Tsinghua University,Carnegie Mellon University,Center for Artificial Intelligence and Robotics, Graduate School",Perception for Grasping and Manipulation I,"Can a robot manipulate intra-category unseen objects in arbitrary poses with the help of a mere demonstration of grasping pose on a single object instance? In this paper, we try to address this intriguing challenge by using USEEK, an unsupervised SE(3)-equivariant keypoints method that enjoys alignment across instances in a category, to perform generalizable manipulation. USEEK follows a teacher-student structure to decouple the unsupervised keypoint discovery and SE(3)-equivariant keypoint detection. With USEEK in hand, the robot can infer the category-level task-relevant object frames in an efficient and explainable manner, enabling manipulation of any intra-category objects from and to any poses. Through extensive experiments, we demonstrate that the keypoints produced by USEEK possess rich semantics, thus successfully transferring the functional knowledge from the demonstration object to the novel ones. Compared with other object representations for manipulation, USEEK is more adaptive in the face of large intra-category shape variance, more robust with limited demonstrations, and more efficient at inference time. 
Project website: https://sites.google.com/view/useek/." Semantic Mapping with Confidence Scores through Metric Embeddings and Gaussian Process Classification,"Jungseok Hong, Suveer Garg, Volkan Isler","University of Minnesota,University of Pennsylvania",Perception for Grasping and Manipulation I,"Recent advances in robotic mapping enable robots to use both semantic and geometric understanding of their surroundings to perform complex tasks. Current methods are optimized for reconstruction quality, but they do not provide a measure of how certain they are of their outputs. Therefore, algorithms that use these maps do not have a way of assessing how much they can trust the outputs. We present a mapping approach that unifies semantic information and shape completion inferred from RGBD images and computes confidence scores for its predictions. We use a Gaussian Process (GP) classification model to merge confidence scores (if available) for the given information. A novel aspect of our method is that we lift the measurement to a learned metric space over which the GP parameters are learned. After training, we can evaluate the uncertainty of objects’ completed shapes with their semantic information. We show that our approach can achieve more accurate predictions than a classic GP model and provide robots with the flexibility to decide whether they can trust the estimate at a given location using the confidence scores." The Third Generation (G3) Dual-Modal and Dual Sensing Mechanisms (DMDSM) Pretouch Sensor for Robotic Grasping,"Cheng Fang, Shuangliang Li, Di Wang, Fengzhi Guo, Dezhen Song, Jun Zou",Texas A&M University,Perception for Grasping and Manipulation I,"Fingertip-mounted pretouch sensors are very useful for robotic grasping. In this paper, we report a new (G3) dual-modal and dual sensing mechanisms (DMDSM) pretouch sensor for near-distance ranging and material sensing, which is based on pulse-echo ultrasound (US) and optoacoustics (OA). 
Different from previously reported versions, the G3 sensor utilizes a self-focused US/OA transceiver, thereby eliminating the need for a bulky parabolic reflective mirror for focusing the ultrasound and laser beams. The self-focused laser and ultrasound beams can be easily steered by a (flat) scanning mirror which expands from single-point ranging and detection to areal mapping or imaging. To verify the new design, a prototype G3 DMDSM sensor with a scanning mirror is fabricated. The US and OA ranging performances are tested in experiments. Together with the scanning mirror, thin wire targets made of the same or different materials at different positions are scanned and imaged. The ranging and imaging results show that the G3 DMDSM sensor can provide new and better pretouch mapping and imaging capabilities for robotic grasping than its predecessors." Learning Height for Top-Down Grasps with the DIGIT Sensor,"Thais Bernardi, Yoann Fleytoux, Jean-Baptiste Mouret, Serena Ivaldi","Inria,INRIA",Perception for Grasping and Manipulation I,"We address the problem of grasping unknown objects identified from top-down images with a parallel gripper. When no object 3D model is available, the state-of-the-art grasp generators identify the best candidate locations for planar grasps using the RGBD image. However, while they generate the Cartesian location and orientation of the gripper, the height of the grasp center is often determined by heuristics based on the highest point in the depth map, which leads to unsuccessful grasps when the objects are not thick, or have transparencies or curved shapes. In this paper, we propose to learn a regressor that predicts the best grasp height based on the image. We train this regressor with a dataset that is automatically acquired thanks to the DIGIT optical tactile sensors, which can evaluate grasp success and stability. 
Using our predictor, the grasping success is improved by 6% for all objects, by 16% on average on difficult objects, and by 40% for objects that are notably very difficult to grasp (e.g., transparent, curved, thin)." Instance-Wise Grasp Synthesis for Robotic Grasping,"Yucheng Xu, Mohammadreza Kasaei, Hamidreza Kasaei, Zhibin Li","University of Edinburgh,University of Groningen,University College London",Perception for Grasping and Manipulation I,"Generating high-quality instance-wise grasp configurations provides critical information of how to grasp specific objects in a multi-object environment and is of high importance for robot manipulation tasks. This work proposed a novel Single-Stage Grasp (SSG) synthesis network, which performs high-quality instance-wise grasp synthesis in a single stage: instance mask and grasp configurations are generated for each object simultaneously. Our method outperforms state-of-the-art on robotic grasp prediction based on the OCID-Grasp dataset, and performs competitively on the JACQUARD dataset. The benchmarking results showed significant improvements compared to the baseline on the accuracy of generated grasp configurations. The performance of the proposed method has been validated through both extensive simulations and real robot experiments for three tasks including single object pick-and-place, grasp synthesis in cluttered environments and table cleaning task." Joint Segmentation and Grasp Pose Detection with Multi-Modal Feature Fusion Network,"Xiaozheng Liu, Yunzhou Zhang, He Cao, Shan Dexing, Jiaqi Zhao",Northeastern University,Perception for Grasping and Manipulation I,"Efficient grasp pose detection is essential for robotic manipulation in cluttered scenes. However, most methods only utilize point clouds or images for prediction, ignoring the advantages of different features. In this paper, we present a multi-modal fusion network for joint segmentation and grasp pose detection. 
We design a point cloud and image co-guided feature fusion module that can be used to fuse features and adaptively estimate the importance of the point-pixel feature pairs. Moreover, we develop a seed point sampling algorithm that simultaneously considers the distance, semantics and attention scores. For selected seed points, we design a local feature aggregation module to fully utilize the local features in the grasp region. Experimental results on the GraspNet-1Billion Dataset show that our network outperforms several state-of-the-art methods. We also conduct real robot grasping experiments to demonstrate the effectiveness of our approach." GraspNeRF: Multiview-Based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF,"Qiyu Dai, Yan Zhu, Yiran Geng, Ciyu Ruan, Jiazhao Zhang, He Wang","Peking University,National University of Defense Technology",Perception for Grasping and Manipulation I,"In this work, we tackle 6-DoF grasp detection for transparent and specular objects, which is an important yet challenging problem in vision-based robotic systems, due to the failure of depth cameras in sensing their geometry. We, for the first time, propose a multiview RGB-based 6-DoF grasp detection network, GraspNeRF, that leverages the generalizable neural radiance field (NeRF) to achieve material-agnostic object grasping in clutter. Compared to the existing NeRF-based 3-DoF grasp detection methods that rely on densely captured input images and time-consuming per-scene optimization, our system can perform zero-shot NeRF construction with sparse RGB inputs and reliably detect 6-DoF grasps, both in real-time. The proposed framework jointly learns generalizable NeRF and grasp detection in an end-to-end manner, optimizing the scene representation construction for the grasping. 
For training data, we generate a large-scale photorealistic domain-randomized synthetic dataset of grasping in cluttered tabletop scenes that enables direct transfer to the real world. Our extensive experiments in synthetic and real-world environments demonstrate that our method significantly outperforms all the baselines in all the experiments while remaining in real-time. Project page can be found at https://pku-epic.github.io/GraspNeRF." Elastic Context: Encoding Elasticity for Data-Driven Models of Textiles,"Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael Welle, Alexander Kravberg, Yufei Wang, David Held, Zackory Erickson, Danica Kragic","KTH Royal Institute of Technology,Carnegie Mellon University,KTH",Perception for Grasping and Manipulation I,"Physical interaction with textiles, such as assistive dressing or household tasks, requires advanced dexterous skills. The complexity of textile behavior during stretching and pulling is influenced by the material properties of the yarn and by the textile's construction technique, which are often unknown in real-world settings. Moreover, identification of physical properties of textiles through sensing commonly available on robotic platforms remains an open problem. To address this, we introduce Elastic Context (EC), a method to encode the elasticity of textiles using stress-strain curves adapted from textile engineering for robotic applications. We employ EC to learn generalized elastic behaviors of textiles and examine the effect of EC dimension on accurate force modeling of real-world non-linear elastic behaviors." Vision-Based Six-Dimensional Peg-In-Hole for Practical Connector Insertion,"Kun Zhang, Chen Wang, Hua Chen, Jia Pan, Michael Y. 
Wang, Wei Zhang","Hong Kong University of Science and Technology,The University of Hong Kong,Southern University of Science and Technology,University of Hong Kong,Monash University",Perception for Grasping and Manipulation I,"We study the six-dimensional (6D) perceptive peg-in-hole problem for practical connector insertion tasks in this paper. To enable the manipulator system to handle different types of pegs in complex environments, we develop a perceptive robotic assembly system that utilizes an in-hand RGB-D camera for peg-in-hole with multiple types of pegs. The proposed framework addresses the critical hole detection and pose estimation problem through combining the learning-based detection with model-based pose estimation strategies. By exploiting the structure of the peg-in-hole task, we consider a rectangle-shape based characterization for modeling the candidate socket. Such a characterization allows us to design simple learning-based methods to detect and estimate the 6D pose of the target socket that balances between processing speed and accuracy. To validate our method, we test the performance of the proposed perceptive peg-in-hole solution using a KUKA iiwa7 robotic arm to accomplish the socket insertion task with two types of practical sockets (RJ45/HDMI). Without the need for additional search, our method achieves an acceptable success rate in the connector insertion tasks. The results confirm the reliability of our method and show that our method is suitable for real world application." 
RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control,"Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen, Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield","University of Illinois Urbana-Champaign,NVIDIA Corporation,Nvidia,NVIDIA,Carnegie Mellon University,University of Illinois at Urbana-Champaign",Perception for Grasping and Manipulation I,"We present a system for collision-free control of a robot manipulator that uses only RGB views of the world. Perceptual input of a tabletop scene is provided by multiple images of an RGB camera (without depth) that is either handheld or mounted on the robot end effector. A NeRF-like process is used to reconstruct the 3D geometry of the scene, from which the Euclidean full signed distance function (ESDF) is computed. A model predictive control algorithm is then used to control the manipulator to reach a desired pose while avoiding obstacles in the ESDF. We show results on a real dataset collected and annotated in our lab." Multi-View Object Pose Estimation from Correspondence Distributions and Epipolar Geometry,"Rasmus Haugaard, Thorbjørn Mosekjær Iversen","University of Southern Denmark,The Maersk Mc-Kinney Moller Institute, University of Southern De",Perception for Grasping and Manipulation I,"In many automation tasks involving manipulation of rigid objects, the poses of the objects must be acquired. Vision-based pose estimation using a single RGB or RGB-D sensor is especially popular due to its broad applicability. However, single-view pose estimation is inherently limited by depth ambiguity and ambiguities imposed by various phenomena like occlusion, self-occlusion, reflections, etc. 
Aggregation of information from multiple views can potentially resolve these ambiguities, but the current state-of-the-art multi-view pose estimation method only uses multiple views to aggregate single-view pose estimates, and thus relies on obtaining good single-view estimates. We present a multi-view pose estimation method which aggregates learned 2D-3D distributions from multiple views for both the initial estimate and optional refinement. Our method performs probabilistic sampling of 3D-3D correspondences under epipolar constraints using learned 2D-3D correspondence distributions which are implicitly trained to respect visual ambiguities such as symmetry. Evaluation on the T-LESS dataset shows that our method reduces pose estimation errors by 80-91% compared to the best single-view method, and we present state-of-the-art results on T-LESS with four views, even compared with methods using five and eight views." FSG-Net: A Deep Learning Model for Semantic Robot Grasping through Few-Shot Learning,"Leonardo Barcellona, Alberto Bacchin, Alberto Gottardi, Emanuele Menegatti, Stefano Ghidoni","University of Padova,University of Padua,The University of Padua",Learning for Grasping and Manipulation I,"Robot grasping has been widely studied in the last decade. Recently, Deep Learning has made it possible to achieve remarkable results in grasp pose estimation, using depth and RGB images. However, only a few works consider the choice of the object to grasp. Moreover, they require a huge amount of data for generalizing to unseen object categories. For this reason, in this work, we define the Few-shot Semantic Grasping task where the objective is inferring a correct grasp given only five labelled images of a target unseen object. We propose a new deep learning architecture able to solve the aforementioned problem, leveraging a Few-shot Semantic Segmentation module. We have evaluated the proposed model both in the Graspnet dataset and in a real scenario. 
In Graspnet, we achieve 40.95% accuracy in the Few-shot Semantic Grasping task, outperforming baseline approaches. In the real experiments, the results confirmed the generalization ability of the network." Learning Pre-Grasp Manipulation of Flat Objects in Cluttered Environments Using Sliding Primitives,"Jiaxi Wu, Haoran Wu, Shanlin Zhong, Quqin Sun, Yinlin Li","Peking University,University of Science and Technology of China,Institute of Automation, Chinese Academy of Sciences,Wuhan Second Ship Design and Research Institute",Learning for Grasping and Manipulation I,"Flat objects with negligible thicknesses like books and disks are challenging to be grasped by the robot because of the width limit of the robot's gripper, especially when they are in cluttered environments. Pre-grasp manipulation is conducive to rearranging objects on the table and moving the flat objects to the table edge, making them graspable. In this paper, we formulate this task as Parameterized Action Markov Decision Process, and a novel method based on deep reinforcement learning is proposed to address this problem by introducing sliding primitives as actions. A weight-sharing policy network is utilized to predict the sliding primitive's parameters for each object, and a Q-network is adopted to select the acted object among all the candidates on the table. Meanwhile, via integrating a curriculum learning scheme, our method can be scaled to cluttered environments with more objects. In both simulation and real-world experiments, our method surpasses the existing methods and achieves pre-grasp manipulation with higher task success rates and fewer action steps. Without fine-tuning, it can be generalized to novel shapes and household objects with more than 85% success rates in the real world. Videos and supplementary materials are available at https://sites.google.com/view/pre-grasp-sliding." 
Learning Category-Level Manipulation Tasks from Point Clouds with Dynamic Graph CNNs,"Junchi Liang, Abdeslam Boularias",Rutgers University,Learning for Grasping and Manipulation I,"This paper presents a new technique for learning category-level manipulation from raw RGB-D videos of task demonstrations, with no manual labels or annotations. Category-level learning aims to acquire skills that can be generalized to new objects, with geometries and textures that are different from the ones of the objects used in the demonstrations. We address this problem by first viewing both grasping and manipulation as special cases of tool use, where a tool object is moved to a sequence of key-poses defined in a frame of reference of a target object. Tool and target objects, along with their key-poses, are predicted using a dynamic graph convolutional neural network that takes as input an automatically segmented depth and color image of the entire scene. Empirical results on object manipulation tasks with a real robotic arm show that the proposed network can efficiently learn from real visual demonstrations to perform the tasks on novel objects within the same category, and outperforms alternative approaches." Neural Grasp Distance Fields for Robot Manipulation,"Thomas Weng, David Held, Franziska Meier, Mustafa Mukadam","Carnegie Mellon University,Facebook,Facebook AI Research",Learning for Grasping and Manipulation I,"We formulate grasp learning as a neural field and present Neural Grasp Distance Fields (NGDF). Here, the input is a 6D pose of a robot end effector and output is a distance to a continuous manifold of valid grasps for an object. In contrast to current approaches that predict a set of discrete candidate grasps, the distance-based NGDF representation is easily interpreted as a cost, and minimizing this cost produces a successful grasp pose. 
This grasp distance cost can be incorporated directly into a trajectory optimizer for joint optimization with other costs such as trajectory smoothness and collision avoidance. During optimization, as the various costs are balanced and minimized, the grasp target is allowed to smoothly vary, as the learned grasp field is continuous. We evaluate NGDF on joint grasp and motion planning in simulation and the real world, outperforming baselines by 63% execution success while generalizing to unseen query poses and unseen object shapes. Project page: https://sites.google.com/view/neural-grasp-distance-fields." Planning for Multi-Object Manipulation with Graph Neural Network Relational Classifiers,"Yixuan Huang, Adam Conkey, Tucker Hermans",University of Utah,Learning for Grasping and Manipulation I,"Objects rarely sit in isolation in human environments. As such, we’d like our robots to reason about how multiple objects relate to one another and how those relations may change as the robot interacts with the world. To this end, we propose a novel graph neural network framework for multi-object manipulation to predict how inter-object relations change given robot actions. Our model operates on partial-view point clouds and can reason about multiple objects dynamically interacting during the manipulation. By learning a dynamics model in a learned latent graph embedding space, our model enables multi-step planning to reach target goal relations. We show our model trained purely in simulation transfers well to the real world. Our planner enables the robot to rearrange a variable number of objects with a range of shapes and sizes using both push and pick and place skills." 
Local Neural Descriptor Fields: Locally Conditioned Object Representations for Manipulation,"Ethan Chun, Yilun Du, Anthony Simeonov, Tomas Lozano-Perez, Leslie Kaelbling","Massachusetts Institute of Technology,MIT",Learning for Grasping and Manipulation I,"A robot operating in a household environment will see a wide range of unique and unfamiliar objects. While a system could train on many of these, it is infeasible to predict all the objects a robot will see. In this paper, we present a method to generalize object manipulation skills acquired from a limited number of demonstrations, to novel objects from unseen shape categories. Our approach, Local Neural Descriptor Fields (L-NDF), utilizes neural descriptors defined on the local geometry of the object to effectively transfer manipulation demonstrations to novel objects at test time. In doing so, we leverage the local geometry shared between objects to produce a more general manipulation framework. We illustrate the efficacy of our approach in manipulating novel objects in novel poses -- both in simulation and in the real world." Practical Visual Deep Imitation Learning Via Task-Level Domain Consistency,"Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang","Google X,UC Berkeley,Everyday Robots,X, The Moonshot Factory,Halodi Robotics",Learning for Grasping and Manipulation I,"Recent work in visual end-to-end learning for robotics has shown the promise of imitation learning across a variety of tasks. Such approaches are however expensive both because they require large amounts of real world data and rely on time-consuming real-world evaluations to identify the best model for deployment. These challenges can be mitigated by using simulation evaluations to identify high performing policies. However, this introduces the well-known ""reality gap"" problem, where simulator inaccuracies decorrelate performance in simulation from that of reality. 
In this paper, we build on top of prior work in GAN-based domain adaptation and introduce the notion of a Task Consistency Loss (TCL), a self-supervised loss that encourages sim and real alignment both at the feature and action-prediction levels. We demonstrate the effectiveness of our approach by teaching a 9-DoF mobile manipulator to perform the challenging task of latched door opening purely from visual inputs such as RGB and depth images. We achieve 69% success across twenty seen and unseen meeting rooms using only ~16.2 hours of teleoperated demonstrations in sim and real. To the best of our knowledge, this is the first work to tackle latched door opening from a purely end-to-end learning approach, where the task of navigation and manipulation are jointly modeled by a single neural network." SEIL: Simulation-Augmented Equivariant Imitation Learning,"Mingxi Jia, Dian Wang, Guanang Su, David Klee, Xupeng Zhu, Robin Walters, Robert Platt",Northeastern University,Learning for Grasping and Manipulation I,"In robotic manipulation, acquiring samples is extremely expensive because it often requires interacting with the real world. Traditional image-level data augmentation has shown the potential to improve sample efficiency in various machine learning tasks. However, image-level data augmentation is insufficient for an imitation learning agent to learn good manipulation policies in a reasonable amount of demonstrations. We propose Simulation-augmented Equivariant Imitation Learning (SEIL), a method that combines a novel data augmentation strategy of supplementing expert trajectories with simulated transitions and an equivariant model that exploits the O(2) symmetry in robotic manipulation. Experimental evaluations demonstrate that our method can learn non-trivial manipulation tasks within ten demonstrations and outperform the baselines by a significant margin." 
Dextrous Tactile In-Hand Manipulation Using a Modular Reinforcement Learning Architecture,"Johannes Pitz, Lennart Röstel, Leon Sievers, Berthold Bäuml","German Aerospace Center,German Aerospace Center (DLR)",Learning for Grasping and Manipulation I,"Dextrous in-hand manipulation with a multi-fingered robotic hand is a challenging task, esp. when performed with the hand oriented upside down, demanding permanent force-closure, and when no external sensors are used. For the task of reorienting an object to a given goal orientation (vs. infinitely spinning it around an axis), the lack of external sensors is an additional fundamental challenge as the state of the object has to be estimated all the time, e.g., to detect when the goal is reached. In this paper, we show that the task of reorienting a cube to any of the 24 possible goal orientations in a π/2-raster using the torque-controlled DLR-Hand II is possible. The task is learned in simulation using a modular deep reinforcement learning architecture: the actual policy has only a small observation time window of 0.5s but gets the cube state as an explicit input which is estimated via a deep differentiable particle filter trained on data generated by running the policy. In simulation, we reach a success rate of 92% while applying significant domain randomization. Via zero-shot Sim2Real-transfer on the real robotic system, all 24 goal orientations can be reached with a high success rate." Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation,"Mengxi Li, Rika Antonova, Dorsa Sadigh, Jeannette Bohg",Stanford University,Learning for Grasping and Manipulation I,"When humans perform contact-rich manipulation tasks, customized tools are often necessary to simplify the task. For instance, we use various utensils for handling food, such as knives, forks and spoons. Similarly, robots may benefit from specialized tools that enable them to more easily complete a variety of tasks. 
We present an end-to-end framework to automatically learn tool morphology for contact-rich manipulation tasks by leveraging differentiable physics simulators. Previous work relied on manually constructed priors requiring detailed specification of a 3D object model, grasp pose and task description to facilitate the search or optimization process. Our approach only requires defining the objective with respect to task performance and enables learning a robust morphology through randomizing variations of the task. We make this optimization tractable by casting it as a continual learning problem. We demonstrate the effectiveness of our method for designing new tools in several scenarios, such as winding ropes, flipping a box and pushing peas onto a scoop in simulation. Additionally, experiments with real robots show that the tool shapes discovered by our method help them succeed in these scenarios." CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation,"Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Adam Fishman, Dieter Fox","Nvidia Corporation,NVIDIA,University of Washington",Learning for Grasping and Manipulation I,"We address the important problem of generalizing robotic rearrangement to clutter without any explicit object models. We first generate over 650K cluttered scenes---orders of magnitude more than prior work---in diverse everyday environments, such as cabinets and shelves. We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture. CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation, and predicts collisions for SE(3) object poses in the scene. Our representation has a fast inference speed of 7 micro-seconds/query with nearly 20% higher performance than baseline approaches in challenging environments. 
We use this collision model in conjunction with a Model Predictive Path Integral (MPPI) planner to generate collision-free trajectories for picking and placing in clutter. CabiNet also predicts waypoints, computed from the scene’s signed distance field (SDF), that allows the robot to navigate tight spaces during rearrangement. This improves rearrangement performance by nearly 35% compared to baselines. We systematically evaluate our approach, procedurally generate simulated experiments, and demonstrate that our approach directly transfers to the real world, despite training exclusively in simulation. Supplementary material and videos of robot experiments in completely unknown scenes are available at: https://cabinet-object-rearrangement.github.io." NIFT: Neural Interaction Field and Template for Object Manipulation,"Zeyu Huang, Juzhan Xu, Sisi Dai, Kai Xu, Hao Zhang, Hui Huang, Ruizhen Hu","Shenzhen University,National University of Defense Technology,Simon Fraser University",Learning for Grasping and Manipulation I,"We introduce NIFT, Neural Interaction Field and Template, a descriptive and robust interaction representation of object manipulations to facilitate imitation learning. Given a few object manipulation demos, NIFT guides the generation of the interaction imitation for a new object instance by matching the Neural Interaction Template (NIT) extracted from the demos in the target Neural Interaction Field (NIF) defined for the new object. Specifically, the NIF is a neural field that encodes the relationship between each spatial point and a given object, where the relative position is defined by a spherical distance function rather than occupancies or signed distances, which are commonly adopted by conventional neural fields but less informative. For a given demo interaction, the corresponding NIT is defined by a set of spatial points sampled in the demo NIF with associated neural features. 
To better capture the interaction, the points are sampled on the Interaction Bisector Surface (IBS), which consists of points that are equidistant to the two interacting objects and has been used extensively for interaction representation. With both point selection and pointwise features defined for better interaction encoding, NIT effectively guides the feature matching in the NIFs of the new object instances such that the relative poses are optimized to realize the manipulation while imitating the demo interactions. Experiments show that our NIFT solution outperforms state-of-the-art imitation learning methods for object manipulation and generalizes better to objects from new categories." Place Recognition under Occlusion and Changing Appearance Via Disentangled Representations,"Yue Chen, Xingyu Chen, Yicen Li","Xi'an Jiaotong University,Laboratory of Visual Cognitive Computing and Intelligent Vehicle,McMaster University",Localization I,"Place recognition is a critical and challenging task for mobile robots, aiming to retrieve an image captured at the same place as a query image from a database. Existing methods tend to fail while robots move autonomously under occlusion (e.g., car, bus, truck) and changing appearance (e.g., illumination changes, seasonal variation), because they encode the image into only one code, entangling place features with appearance and occlusion features. To overcome this limitation, we propose PROCA, an unsupervised approach to decompose the image representation into three codes: a place code used as a descriptor to retrieve images, an appearance code that captures appearance properties, and an occlusion code that encodes occlusion content. Extensive experiments show that our model outperforms the state-of-the-art methods." 
GIDP: Learning a Good Initialization and Inducing Descriptor Post-Enhancing for Large-Scale Place Recognition,"Zhaoxin Fan, Zhenbo Song, Jun He, Hongyan Liu","Renmin University of China,Nanjing University of Science and Technology,Tsinghua University",Localization I,"Large-scale place recognition is a fundamental but challenging task, which plays an increasingly important role in autonomous driving and robotics. Existing methods have achieved acceptably good performance; however, most of them concentrate on designing elaborate global descriptor learning network structures. The importance of feature generalization and descriptor post-enhancing has long been neglected. In this work, we propose a novel method named GIDP to learn a Good Initialization and Inducing Descriptor Post-enhancing for Large-scale Place Recognition. In particular, an unsupervised momentum contrast point cloud pretraining module and a reranking-based descriptor post-enhancing module are proposed respectively in GIDP. The former aims at learning a good initialization for the point cloud encoding network before training the place recognition model, while the latter aims at post-enhancing the predicted global descriptor through reranking at inference time. Extensive experiments on both indoor and outdoor datasets demonstrate that our method can achieve state-of-the-art performance using simple and general point cloud encoding backbones." STD: Stable Triangle Descriptor for 3D Place Recognition,"Yuan Chongjian, Jiarong Lin, Zuhao Zou, Xiaoping Hong, Fu Zhang","The University of Hong Kong,HongKong University,Southern University of Science and Technology,University of Hong Kong",Localization I,"In this work, we present a novel global descriptor termed stable triangle descriptor (STD) for 3D place recognition. For a triangle, its shape is uniquely determined by the length of the sides or included angles. Moreover, the shape of triangles is completely invariant to rigid transformations. 
Based on this property, we first design an algorithm to efficiently extract local key points from the 3D point cloud and encode these key points into triangular descriptors. Then, place recognition is achieved by matching the side lengths (and some other information) of the descriptors between point clouds. The point correspondence obtained from the descriptor matching pair can be further used in geometric verification, which greatly improves the accuracy of place recognition. In our experiments, we extensively compare our proposed system against other state-of-the-art systems (i.e., M2DP, Scan Context) on public datasets (i.e., KITTI, NCLT, and Complex-Urban) and our self-collected dataset (with a non-repetitive scanning solid-state LiDAR). All the quantitative results show that STD has stronger adaptability and a great improvement in precision over its counterparts. To share our findings and make contributions to the community, we open source our code on our GitHub: https://github.com/hku-mars/STD" DeepRING: Learning Roto-Translation Invariant Representation for LiDAR Based Place Recognition,"Sha Lu, Xuecheng Xu, Li Tang, Rong Xiong, Yue Wang",Zhejiang University,Localization I,"LiDAR based place recognition is popular for loop closure detection and re-localization. In recent years, deep learning brings improvements to place recognition by learnable feature extraction. However, these methods degenerate when the robot re-visits previous places with a large perspective difference. To address the challenge, we propose DeepRING to learn the roto-translation invariant representation from LiDAR scan, so that robot visiting the same place with a different perspective can have similar representations. There are two keys in DeepRING: the feature is extracted from sinogram, and the feature is aggregated by magnitude spectrum. The two steps keep the final representation with both discrimination and roto-translation invariance. 
Moreover, we state place recognition as a one-shot learning problem with each place being a class, leveraging relation learning to build representation similarity. Substantial experiments are carried out on public datasets, validating the effectiveness of each proposed component, and showing that DeepRING outperforms the comparative methods, especially in dataset level generalization." Sensor Localization by Few Distance Measurements Via the Intersection of Implicit Manifolds,"Michael Moshe Bilevich, Steven M Lavalle, Dan Halperin","Tel Aviv University,University of Oulu",Localization I,"We present a general approach for determining the unknown (or uncertain) position and orientation of a sensor mounted on a robot in a known environment, using only a few distance measurements (between 2 to 6 typically), which is advantageous, among others, in sensor cost, and storage and information-communication resources. In-between the measurements, the robot can perform predetermined local motions in its workspace, which are useful for narrowing down the candidate poses of the sensor. We demonstrate our approach for planar workspaces, and show that, under mild transversality assumptions, already two measurements are sufficient to reduce the set of possible poses to a set of curves (one-dimensional objects) in the three-dimensional configuration space of the sensor $\mathbb{R}^2\times\mathbb{S}^1$, and three or more measurements reduce the set of possible poses to a finite collection of points. However, analytically computing these potential poses for non-trivial intermediate motions between measurements raises substantial hardships and thus we resort to numerical approximation. We reduce the localization problem to a carefully tailored procedure of intersecting two or more implicitly defined two-manifolds, which we carry out to any desired accuracy, proving guarantees on the quality of the approximation. 
We demonstrate the real-time effectiveness of our method even at high accuracy on various scenarios and different allowable intermediate motions. We also present experiments with a physical robot. Our open-source software and supplementary materials are available at https://bitbucket.org/taucgl/vb-fdml-public" Boosting Performance of a Baseline Visual Place Recognition Technique by Predicting the Maximally Complementary Technique,"Connor Malone, Stephen Hausler, Tobias Fischer, Michael J Milford","Queensland University of Technology,CSIRO",Localization I,"One recent promising approach to the Visual Place Recognition (VPR) problem has been to fuse the place recognition estimates of multiple complementary VPR techniques using methods such as shared representative appearance learning (SRAL) and multi-process fusion. These approaches come with a substantial practical limitation: they require all potential VPR methods to be brute-force run before they are selectively fused. The obvious solution to this limitation is to predict the viable subset of methods ahead of time, but this is challenging because it requires a predictive signal within the imagery itself that is indicative of high performance methods. Here we propose an alternative approach that instead starts with a known single base VPR technique, and learns to predict the most complementary additional VPR technique to fuse with it, that results in the largest improvement in performance. The key innovation here is to use a dimensionally reduced difference vector between the query image and the top-retrieved reference image using this baseline technique as the predictive signal of the most complementary additional technique, both during training and inference. 
We demonstrate that our approach can train a single network to select performant, complementary technique pairs across datasets which span multiple modes of transportation (train, car, walking) as well as to generalise to unseen datasets, outperforming multiple baseline strategies for manually selecting the best technique pairs based on the same training data." Loosely-Coupled Localization Fusion System Based on Track-To-Track Fusion with Bias Alignment,"Soyeong Kim, Kichun Jo, Benazouz Bradai, Paulo Resende, Jaeyoung Jo","Konkuk University,Valeo,Konkuk university, Smart vehicle engineering",Localization I,"The localization system is an essential element in robotics, which can provide accurate position information. Multiple localization systems can be integrated for reliable localization operations because there are various methods for measuring the position or processing algorithms. Significantly, the track-to-track (T2T) fusion method can fuse multiple localization systems using each system’s estimate without accessing the sensor's low data. However, most T2T fusion-based localization systems ignore slowly varying biases, such as drift errors, odometry errors, and offsets among multiple maps. This can degrade the localization performance because a slowly varying bias is directly reflected in the localization estimate. Therefore, a slowly varying bias must be considered in the fusion process to derive reliable estimates. This study proposes a T2T fusion-based localization system that considers a slowly varying bias. First, the slow-varying bias difference between the systems was estimated. Because each localization system can have a different bias, the estimated bias difference was used to align it with the reference system. Second, a fused estimate can be obtained by T2T fusion using bias-aligned estimates. The proposed fusion system can also be used without limiting the number of inputs to the localization system. 
The proposed system was compared with various T2T-based localization fusion algorithms for verification in a simulation environment, and it exhibited the best performance in RMSE error comparison." Portable Multi-Hypothesis Monte Carlo Localization for Mobile Robots,"Alberto García, Francisco Martin Rico, Jose Miguel Guerrero, Francisco Javier Rodríguez Lera, Vicente Matellan","Universidad Rey Juan Carlos,Carnegie Mellon University,Rey Juan Carlos University,Universidad de León,Universidad de Leon",Localization I,"Self-localization is a fundamental capability that mobile robot navigation systems integrate to move from one point to another using a map. Thus, any enhancement in localization accuracy is crucial to perform delicate dexterity tasks. This paper describes a new localization algorithm that maintains several populations of particles using the Monte Carlo Localization (MCL) algorithm, always choosing the best one as the system’s output. As novelties, our work includes a multi-scale map matching algorithm to create new MCL populations and a metric to determine the most reliable. It also contributes the state of the art implementations, enhancing recovery times from erroneous estimates or unknown initial positions. The proposed method is evaluated in ROS2 in a module fully integrated with Nav2 and compared with the current state-of-the-art Adaptive AMCL solution, obtaining good accuracy/recovery times." CPnP: Consistent Pose Estimator for Perspective-N-Point Problem with Bias Elimination,"Guangyang Zeng, Shiyu Chen, Biqiang Mu, Guodong Shi, Junfeng Wu","The Chinese University of Hong Kong, Shenzhen,Chinese Academy of Sciences,The University of Sydney,The Chinese Unviersity of Hong Kong, Shenzhen",Localization I,"The Perspective-n-Point (PnP) problem has been widely studied in both computer vision and photogrammetry societies. With the development of feature extraction techniques, a large number of feature points might be available in a single shot. 
It is promising to devise a consistent estimator, i.e., the estimate can converge to the true camera pose as the number of points increases. To this end, we propose a consistent PnP solver, named CPnP, with bias elimination. Specifically, linear equations are constructed from the original projection model via measurement model modification and variable elimination, based on which a closed-form least-squares solution is obtained. We then analyze and subtract the asymptotic bias of this solution, resulting in a consistent estimate. Additionally, Gauss-Newton (GN) iterations are executed to refine the consistent solution. Our proposed estimator is efficient in terms of computations: it has O(n) time complexity. Simulations and real dataset tests show that our proposed estimator is superior to some well-known ones for images with dense visual features, in terms of estimation precision and computing time." LiDAR-Based Indoor Localization with Optimal Particle Filters Using Surface Normal Constraints,"Heruka Andradi, Sebastian Blumenthal, Erwin Prassler, Paul G. Plöger","Hochschule Bonn Rhein Sieg,Locomotec,Bonn-Rhein-Sieg Univ. of Applied Sciences",Localization I,"Accurate and robust localization systems are often highly desired in autonomous mobile robots. Existing LiDAR-based localization systems generally use standard particle filters which suffer from the well-known particle degeneracy problem. Furthermore, standard particle filters are ill-suited for handling discrepancies between maps and the actual operating environments. In this work, we present an effective LiDAR-based indoor localization system which addresses these two issues. The particle degeneracy problem is tackled with an efficient implementation of an optimal particle filter. Map discrepancies are then handled with the use of a high-fidelity observation model for accurate particle propagation and a separate low-fidelity observation model for robust weight update. 
Evaluations were carried out against a standard particle filter baseline on both real-world and simulated data from challenging indoor environments. The proposed system was found to show significantly better performance in terms of accuracy, robustness to ambiguity, and robustness to map discrepancies. These performance gains were observed even with more than ten times smaller particle set sizes than in the baseline, while the increase in the computation time per particle was only around 20%." Efficient Planar Pose Estimation Via UWB Measurements,"Haodong Jiang, Wentao Wang, Yuan Shen, Xinghan Li, Xiaoqiang Ren, Biqiang Mu, Junfeng Wu","The Chinese University of Hong Kong, Shenzhen,ZhejiangUniversity,Nanjing University of Science and Technology,Zhejiang university,Shanghai University,Chinese Academy of Sciences,The Chinese Unviersity of Hong Kong, Shenzhen",Localization I,"State estimation is an essential part of autonomous systems. Integrating the Ultra-Wideband (UWB) technique has been shown to correct the long-term estimation drift and bypass the complexity of loop closure detection. However, few works on robotics treat UWB as a stand-alone state estimation solution. The primary purpose of this work is to investigate planar pose estimation using only UWB range measurements. We prove the excellent property of a two-step scheme, which says we can refine a consistent estimator to be asymptotically efficient by one step of Gauss-Newton iteration. Grounded on this result, we design the GN-ULS estimator, which reduces the computation time significantly compared to previous methods and presents the possibility of using only UWB for real-time state estimation." Visual Pitch and Roll Estimation for Inland Water Vessels,"Dennis Griesser, Georg Umlauf, Matthias Franz","University of Applied Sciences Konstanz, Institute for Optical S",Vision-Based Navigation I,"Motion estimation is an essential element for autonomous vessels. It is used e.g. 
for lidar motion compensation as well as mapping and detection tasks in a maritime environment. Because the use of gyroscopes is not reliable and a high performance inertial measurement unit is quite expensive, we present an approach for visual pitch and roll estimation that utilizes a convolutional neural network for water segmentation, a stereo system for reconstruction and simple geometry to estimate pitch and roll. The algorithm is validated on a novel, publicly available dataset recorded at Lake Constance. Our experiments show that the pitch and roll estimator provides accurate results in comparison to an Xsens IMU sensor. We can further improve the pitch and roll estimation by sensor fusion with a gyroscope. The algorithm is available in its implementation as a ROS node." GPF-BG: A Hierarchical Vision-Based Planning Framework for Safe Quadrupedal Navigation,"Shiyu Feng, Ziyi Zhou, Justin Smith, Maxwell Asselmeier, Ye Zhao, Patricio A. Vela",Georgia Institute of Technology,Vision-Based Navigation I,"Safe quadrupedal navigation through unknown environments is a challenging problem. This paper proposes a hierarchical vision-based planning framework (GPF-BG) integrating our previous Global Path Follower (GPF) navigation system and a gap-based local planner using Bézier curves, so called Bézier Gap (BG). This BG-based trajectory synthesis can generate smooth trajectories and guarantee safety for point-mass robots. With a gap analysis extension based on non-point, rectangular geometry, safety is guaranteed for an idealized quadrupedal motion model and significantly improved for an actual quadrupedal robot model. Stabilized perception space improves performance under oscillatory internal body motions that impact sensing. Simulation-based and real experiments under different benchmarking configurations test safe navigation performance. GPF-BG has the best safety outcomes across all experiments." 
Direct Angular Rate Estimation without Event Motion-Compensation at High Angular Rates,"Matthew Ng, Xinyu Cai, Shaohui Foong",Singapore University of Technology and Design,Vision-Based Navigation I,"Feature-based methods are a popular method for camera state estimation using event cameras. Due to the spatiotemporal nature of events, all event images exhibit smearing of events analogous to motion blur for a camera under motion. As such, events must be motion compensated to derive a sharp event image. However, this presents a causality dilemma where motion prior is required to unsmear the events, but a sharp event image is required to estimate motion. While it is possible to use the IMU to develop motion prior, it has been shown that the limited dynamic range of ±2000°/s is insufficient for high angular rate rotorcrafts. Furthermore, smoothing of motion-compensated images due to actual event detection time latency in event cameras severely limits the performance of feature-based methods at high angular rates. This paper proposes a Fourier-based angular rate estimator capable of estimating angular rates directly on non-motion compensated event images. This method circumvents the need for external motion priors in camera state estimation and sidesteps problematic smoothing of features in the spatial domain due to motion blur. Lastly, using an NVIDIA Jetson Xavier NX, the algorithm is demonstrated to be real-time performant up to 3960°/s" StereoVAE: A Lightweight Stereo-Matching System Using Embedded GPUs,"Qiong Chang, Li Xiang, Xu Xin, Xin Liu, Yun Li, Jun Miyazaki","Tokyo Institute of Technology,NanJing University,National Institute of Advanced Industrial Science and Technology,Tokyo Institute of Technology School of Computing",Vision-Based Navigation I,"We propose a lightweight system for stereo-matching using embedded graphic processing units (GPUs). 
The proposed system overcomes the trade-off between accuracy and processing speed in stereo matching, thus further improving the matching accuracy while ensuring real-time processing. The basic idea is to construct a tiny neural network based on a variational autoencoder (VAE) to upscale and refine a small coarse disparity map. This map is initially generated using a traditional matching method. The proposed hybrid structure maintains the advantage of low computational complexity found in traditional methods. Additionally, it achieves matching accuracy with the help of a neural network. Extensive experiments on the KITTI 2015 benchmark dataset demonstrate that our tiny system exhibits high robustness in improving the accuracy of coarse disparity maps generated by different algorithms, while running in real-time on embedded GPUs." Learning Perception-Aware Agile Flight in Cluttered Environments,"Yunlong Song, Kexin Shi, Robert Pěnička, Davide Scaramuzza","University of Zurich,Universität Zürich,Czech Technical University in Prague",Vision-Based Navigation I,"Recently, neural control policies have outperformed existing model-based planning-and-control methods for autonomously navigating quadrotors through cluttered environments in minimum time. However, they are not perception aware, a crucial requirement in vision-based navigation due to the camera's limited field of view and the underactuated nature of a quadrotor. We propose a learning-based system that achieves perception-aware, agile flight in cluttered environments. Our method combines imitation learning with reinforcement learning (RL) by leveraging a privileged learning-by-cheating framework. Using RL, we first train a perception-aware teacher policy with full-state information to fly in minimum time through cluttered environments. Then, we use imitation learning to distill its knowledge into a vision-based student policy that only perceives the environment via a camera.
Our approach tightly couples perception and control, showing a significant advantage in computation speed ($10\times$ faster) and success rate. We demonstrate the closed-loop control performance using hardware-in-the-loop simulation." NanoFlowNet: Real-Time Dense Optical Flow on a Nano Quadcopter,"Rik Jan Bouwmeester, Federico Paredes-valles, Guido De Croon","Delft University of Technology,TU Delft",Vision-Based Navigation I,"Nano quadcopters are small, agile, and cheap platforms that are well suited for deployment in narrow, cluttered environments. Due to their limited payload, these vehicles are highly constrained in processing power, rendering conventional vision-based methods for safe and autonomous navigation incompatible. Recent machine learning developments promise high-performance perception at low latency, while dedicated edge computing hardware has the potential to augment the processing capabilities of these limited devices. In this work, we present NanoFlowNet, a lightweight convolutional neural network for real-time dense optical flow estimation on edge computing hardware. We draw inspiration from recent advances in semantic segmentation for the design of this network. Additionally, we guide the learning of optical flow using motion boundary ground truth data, which improves performance with no impact on latency. Validation results on the MPI-Sintel dataset show the high performance of the proposed network given its constrained architecture. Additionally, we successfully demonstrate the capabilities of NanoFlowNet by deploying it on the ultra-low power GAP8 microprocessor and by applying it to vision-based obstacle avoidance on board a Bitcraze Crazyflie, a 34 g nano quadcopter."
Zero-Shot Active Visual Search (ZAVIS): Intelligent Object Search for Robotic Assistants,"Jeongeun Park, Taerim Yoon, Jejoon Hong, Youngjae Yu, Matthew Pan, Sungjoon Choi","Korea University,Yonsei University,Queen's University",Vision-Based Navigation I,"In this paper, we focus on the problem of efficiently locating a target object described with free-form text using a mobile robot equipped with vision sensors (e.g., an RGBD camera). Conventional active visual search predefines a set of objects to search for, rendering these techniques restrictive in practice. To provide added flexibility in active visual searching, we propose a system where a user can enter target commands using free-form text; we call this system Zero-shot Active Visual Search (ZAVIS). ZAVIS detects and plans to search for a target object inputted by a user through a semantic grid map represented by static landmarks (e.g., desk or bed). For efficient planning of object search patterns, ZAVIS considers commonsense knowledge-based co-occurrence and predictive uncertainty while deciding which landmarks to visit first. We validate the proposed method with respect to SR (success rate) and SPL (success weighted by path length) in both simulated and real-world environments. The proposed method outperforms previous methods in terms of SPL in simulated scenarios, and we further demonstrate ZAVIS with a Pioneer-3AT robot in real-world studies." Memory-Based Exploration-Value Evaluation Model for Visual Navigation,"Yongquan Feng, Liyang Xu, Minglong Li, Ruochun Jin, Da Huang, Shaowu Yang, Wenjing Yang","National University of Defense Technology,NUDT,the State Key Laboratory of High Performance Computing (HPCL) &,State Key Laboratory of High Performance Computing (HPCL), Schoo",Vision-Based Navigation I,"We propose a hierarchical visual navigation solution, called Memory-based Exploration-value Evaluation Model (MEEM), to improve the agent's navigation performance.
MEEM employs a hierarchical policy to tackle the challenge of sparse rewards, holds an episodic memory to store the historical information of the agent, and applies an Exploration-value Evaluation Model to calculate an exploration-value for action planning at each location in the observable area. We experimentally verify MEEM by navigation performance comparison on two datasets including the grid-map dataset and the 3D scenes Gibson dataset, where our approach achieves state-of-the-art performance on both. Specifically, the overall success rate of MEEM is 95% on the grid-map dataset while the best competitor reaches 68% only. As for the Gibson dataset, the success rate of ours and the best competitor SemExp are 69.8% and 54.4%, respectively. Ablation analysis on the tile-map dataset indicates that all three components of MEEM have positive effects." ViNL: Visual Navigation and Locomotion Over Obstacles,"Simar Kareer, Naoki Yokoyama, Dhruv Batra, Sehoon Ha, Joanne Truong","Georgia Tech,Georgia Institute of Technology,Georgia Tech / Facebook AI Research,The Georgia Institute of Technology",Vision-Based Navigation I,"We present Visual Navigation and Locomotion over obstacles (ViNL), which enables a quadrupedal robot to navigate unseen apartments while stepping over small obstacles that lie in its path (e.g., shoes, toys, cables), similar to how humans and pets lift their feet over objects as they walk. ViNL consists of: (1) a visual navigation policy that outputs linear and angular velocity commands that guides the robot to a goal coordinate in unfamiliar indoor environments; and (2) a visual locomotion policy that controls the robot’s joints to avoid stepping on obstacles while following provided velocity commands. Both the policies are entirely “model-free”, i.e. sensors-to-actions neural networks trained end-to-end. 
The two are trained independently in two entirely different simulators and then seamlessly co-deployed by feeding the velocity commands from the navigator to the locomotor, entirely “zero-shot” (without any co-training). While prior works have developed learning methods for visual navigation or visual locomotion, to the best of our knowledge, this is the first fully learned approach that leverages vision to accomplish both (1) intelligent navigation in new environments, and (2) intelligent visual locomotion that aims to traverse cluttered environments without disrupting obstacles. On the task of navigation to distant goals in unknown environments, ViNL using just egocentric vision significantly outperforms prior work on robust locomotion using privileged terrain maps (+32.8% success and -4.42 collisions per meter). Additionally, we ablate our locomotion policy to show that each aspect of our approach helps reduce obstacle collisions. Videos and code at http://www.joannetruong.com/projects/vinl.html." Zero-Shot Object Goal Visual Navigation,"Qianfan Zhao, Lu Zhang, Bin He, Hong Qiao, Zhiyong Liu","State Key Laboratory of Management and Control for Complex Syste,Institute of Automation, Chinese Academy of Science,Tongji University,Institute of Automation, Chinese Academy of Sciences,Institute of Automation Chinese Academy of Sciences",Vision-Based Navigation I,"Object goal visual navigation is a challenging task that aims to guide a robot to find the target object based on its visual observation, and the target is limited to the classes pre-defined in the training stage. However, in real households, there may exist numerous target classes that the robot needs to deal with, and it is hard for all of these classes to be contained in the training stage. To address this challenge, we study the zero-shot object goal visual navigation task, which aims at guiding robots to find targets belonging to novel classes without any training samples. 
To this end, we also propose a novel zero-shot object navigation framework called semantic similarity network (SSNet). Our framework uses the detection results and the cosine similarity between semantic word embeddings as input. This type of input data has a weak correlation with classes and thus our framework has the ability to generalize the policy to novel classes. Extensive experiments on the AI2-THOR platform show that our model outperforms the baseline models in the zero-shot object navigation task, which proves the generalization ability of our model. Our code is available at: https://github.com/pioneer-innovation/Zero-Shot-Object-Navigation." Monocular Simultaneous Localization and Mapping Using Ground Textures,"Kyle Hart, Brendan Englot, Ryan O'shea, John Kelly, David Martinez","Stevens Institute of Technology,Naval Air Warfare Center Aircraft Division,RISE Laboratory at Naval Air Warfare Center,Pennsylvania State University",Vision-Based Navigation I,"Recent work has shown impressive localization performance using only images of ground textures taken with a downward facing monocular camera. This provides a reliable navigation method that is robust to feature sparse environments and challenging lighting conditions. However, these localization methods require an existing map for comparison. Our work aims to relax the need for a map by introducing a full simultaneous localization and mapping (SLAM) system. By not requiring an existing map, setup times are minimized and the system is more robust to changing environments. This SLAM system uses a combination of several techniques to accomplish this. Image keypoints are identified and projected into the ground plane. These keypoints, visual bags of words, and several threshold parameters are then used to identify overlapping images and revisited areas. The system then uses robust M-estimators to estimate the transform between robot poses with overlapping images and revisited areas.
These optimized estimates make up the map used for navigation. We show, through experimental data, that this system performs reliably on many ground textures, but not all." "WAVN: Wide Area Visual Navigation for Large-Scale, GPS-Denied Environments","Damian Lyons, Mohamed Rahouti",Fordham University,Vision-Based Navigation I,"This paper introduces a novel approach to GPS-denied visual navigation of a robot team over a wide (i.e., out of line of sight) area which we call WAVN (Wide Area Visual Navigation). Application domains include small-scale precision agriculture as well as exploration and surveillance. The proposed approach requires no exploration or map generation, merging, and updating, some of the most computationally intensive aspects of multi-robot navigation, especially in dynamic environments and for long-term deployments. In contrast, we extend the visual homing paradigm to leverage visual information from the entire team to allow a robot to home to a distant location. Since it only employs the latest imagery, the approach can be resilient to the current state of the environment. WAVN requires three components: identification of common landmarks between robots, a communication infrastructure, and an algorithm to find a sequence of common landmarks to navigate to a goal. The principal contribution of this paper is the navigation algorithm in addition to simulation and physical robot results characterizing performance. The approach is also compared to more traditional map-based approaches." ORORA: Outlier-Robust Radar Odometry,"Hyungtae Lim, Kawon Han, Gunhee Shin, Giseop Kim, Songcheol Hong, Hyun Myung","Korea Advanced Institute of Science and Technology,Inha University,NAVER LABS,KAIST (Korea Advanced Institute of Science and Technology)",Localization and Mapping I,"Radar sensors are emerging as solutions for perceiving surroundings and estimating ego-motion in extreme weather conditions. 
Unfortunately, radar measurements are noisy and suffer from mutual interference, which degrades the performance of feature extraction and matching, triggering imprecise matching pairs, which are referred to as outliers. To tackle the effect of outliers on radar odometry, a novel outlier-robust method called ORORA is proposed, which is an abbreviation of Outlier-RObust RAdar odometry. To this end, a novel decoupling-based method is proposed, which consists of graduated non-convexity (GNC)-based rotation estimation and anisotropic component-wise translation estimation (A-COTE). Furthermore, our method leverages the anisotropic characteristics of radar measurements, each of whose uncertainty along the azimuthal direction is somewhat larger than that along the radial direction. As verified in the public dataset, it was demonstrated that our proposed method yields robust ego-motion estimation performance compared with other state-of-the-art methods. Our code is available at https://github.com/url-kaist/outlier-robust-radar-odometry." AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion,"Yu Chen, Zihao Yu, Shu Song, Jianming Li, Tianning Yu, Gim Hee Lee","National University of Singapore,Beihang University,Nreal,Segway Ninebot,Willand Company",Localization and Mapping I,"Despite the impressive results achieved by many existing Structure from Motion (SfM) approaches, there is still a need to improve the robustness, accuracy, and efficiency on large-scale scenes with many outlier matches and sparse view graphs. In this paper, we propose AdaSfM: a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets. Our approach first does a coarse global SfM which improves the reliability of the view graph by leveraging measurements from low-cost sensors such as Inertial Measurement Units (IMUs) and wheel encoders.
Subsequently, the view graph is divided into sub-scenes that are refined in parallel by a fine local incremental SfM regularised by the result from the coarse global SfM to improve the camera registration accuracy and alleviate scene drifts. Finally, our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of global SfM. Extensive experiments on large-scale benchmark datasets show that our approach achieves state-of-the-art accuracy and efficiency." Robust Map Fusion with Visual Attention Utilizing Multi-Agent Rendezvous,"Jaein Kim, Dong-sig Han, Byoung-Tak Zhang",Seoul National University,Localization and Mapping I,"The map fusion for multi-robot simultaneous localization and mapping (SLAM) consistently combines robot maps built independently into the global map. An established approach to map fusion is utilizing rendezvous, which refers to an encounter between multiple agents, to calculate the transformation into the global map. However, previous works using rendezvous have a limitation in that they are unreliable for certain circumstances, where the amount of agent observations or overlapping landmarks is limited. This work proposes a novel map fusion system which robustly fuses local maps in challenging rendezvous that lack shared information. Our system utilizes the single visual perception from rendezvous and estimates the relative pose between agents with the DOPE. Then our scheme transforms local maps with an estimated relative pose and predicts the misalignment from approximated maps by utilizing the attention mechanism of the vision transformer. Comparisons with the Hough transform-based method show that ours is significantly better when the overlap between local maps is insufficient. We also verify the robustness of our system against a similar real-world scenario." 
Wi-Closure: Reliable and Efficient Search of Inter-Robot Loop Closures Using Wireless Sensing,"Weiying Wang, Anne Kemmeren, Daniel Son, Javier Alonso-Mora, Stephanie Gil","Harvard University,Delft University,Delft University of Technology",Localization and Mapping I,"In this paper we propose a novel algorithm, Wi-Closure, to improve the computational efficiency and robustness of loop closure detection in multi-robot SLAM. Our approach decreases the computational overhead of classical approaches by pruning the search space of potential loop closures, prior to evaluation by a typical multi-robot SLAM pipeline. Wi-Closure achieves this by identifying candidates that are spatially close to each other measured via sensing over the wireless communication signal between robots, even when they are operating in non-line-of-sight or in remote areas of the environment from one another. We demonstrate the validity of our approach in simulation and hardware experiments. Our results show that using Wi-Closure greatly reduces computation time, by 54.1% in simulation and by 76.8% in hardware experiments, compared with a multi-robot SLAM baseline. Importantly, this is achieved without sacrificing accuracy. Using Wi-Closure reduces absolute trajectory estimation error by 98.0% in simulation and 89.2% in hardware experiments. This improvement is partly due to Wi-Closure’s ability to avoid catastrophic optimization failure that typically occurs with classical approaches in challenging repetitive environments." COVINS-G: A Generic Back-End for Collaborative Visual-Inertial SLAM,"Manthan Patel, Marco Karrer, Philipp Baenninger, Margarita Chli",ETH Zurich,Localization and Mapping I,"Collaborative SLAM is at the core of perception in multi-robot systems as it enables the co-localization of the team of robots in a common reference frame, which is of vital importance for any coordination amongst them. The paradigm of a centralized architecture is well established, with the robots (i.e.
agents) running Visual-Inertial Odometry (VIO) onboard while communicating relevant data, such as e.g. Keyframes (KFs), to a central back-end (i.e. server), which then merges and optimizes the joint maps of the agents. While these frameworks have proven to be successful, their capability and performance are highly dependent on the choice of the VIO front-end, thus limiting their flexibility. In this work, we present COVINS-G, a generalized back-end building upon the COVINS framework, enabling the compatibility of the server-back-end with any arbitrary VIO front-end, including, for example, off-the-shelf cameras with odometry capabilities, such as the Realsense T265. The COVINS-G back-end deploys a multi-camera relative pose estimation algorithm for computing the loop-closure constraints allowing the system to work purely on 2D image data. In the experimental evaluation, we show on-par accuracy with state-of-the-art multi-session and collaborative SLAM systems, while demonstrating the flexibility and generality of our approach by employing different front-ends onboard collaborating agents within the same mission. The COVINS-G codebase along with a generalized front-end wrapper to allow any existing VIO front-end to be readily used in combination with the proposed collaborative back-end is open-sourced. Video--https://youtu.be/FoJfXCfaYDw" PIEKF-VIWO: Visual-Inertial-Wheel Odometry Using Partial Invariant Extended Kalman Filter,"Tong Hua, Tao Li, Ling Pei",Shanghai Jiao Tong University,Localization and Mapping I,"Invariant Extended Kalman Filter (IEKF) has been successfully applied in Visual-inertial Odometry (VIO) as an advanced achievement of Kalman filter, showing great potential in sensor fusion. In this paper, we propose partial IEKF (PIEKF), which only incorporates rotation-velocity state into the Lie group structure and apply it for Visual-Inertial-Wheel Odometry (VIWO) to improve positioning accuracy and consistency. 
Specifically, we derive the rotation-velocity measurement model, which combines wheel measurements with kinematic constraints. The model circumvents the wheel odometer’s 3D integration and covariance propagation, which is essential for filter consistency. A plane constraint is also introduced to enhance the position accuracy. A dynamic outlier detection method is adopted, leveraging the velocity state output. Through simulation and real-world tests, we validate the effectiveness of our approach, which outperforms the standard Multi-State Constraint Kalman Filter (MSCKF) based VIWO in consistency and accuracy." Observability-Aware Active Extrinsic Calibration of Multiple Sensors,"Shida Xu, Jonatan Scharff Willners, Ziyang Hong, Kaicheng Zhang, Y. R. Petillot, Sen Wang","Imperial College London,Heriot-Watt University",Localization and Mapping I,"The extrinsic parameters play a crucial role in multi-sensor fusion, such as visual-inertial Simultaneous Localization and Mapping (SLAM), as they enable the accurate alignment and integration of measurements from different sensors. However, extrinsic calibration is challenging in scenarios, such as underwater, where in-view structures are scanty and/or visibility is limited, causing incorrect extrinsic calibration due to insufficient motion on all degrees of freedom. In this paper, we propose an entropy-based active extrinsic calibration algorithm which leverages observability analysis and information entropy to enhance the accuracy and reliability of extrinsic calibration. It determines the system observability numerically by using singular value decomposition (SVD) of the Fisher Information Matrix (FIM). Furthermore, when the calibration parameter is not fully observable, our method actively searches for the best next motion to recover the system's observability via entropy-based optimisation.
Experimental results on synthetic data, in a simulation and using a real underwater vehicle verify that the proposed method is able to avoid/reduce the calibration failure while improving the calibration accuracy and reliability." Learning Continuous Control Policies for Information-Theoretic Active Perception,"Pengzhi Yang, Yuhan Liu, Shumon Koga, Arash Asgharivaskasi, Nikolay A. Atanasov","University of Electronic Science and Technology of China,University of California, San Diego,University of California San Diego",Localization and Mapping I,"This paper proposes a method for learning continuous control policies for active landmark localization and exploration using an information-theoretic cost. We consider a mobile robot detecting landmarks within a limited sensing range, and tackle the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations. We employ a Kalman filter to convert the partially observable problem in the landmark state to the Markov decision process (MDP), a differentiable field of view to shape the reward, and an attention-based neural network to represent the control policy. The approach is further unified with active volumetric mapping to promote exploration in addition to landmark localization. The performance is demonstrated in several simulated landmark localization tasks in comparison with benchmark methods." "Structure PLP-SLAM: Efficient Sparse Mapping and Localization Using Point, Line and Plane for Monocular, RGB-D and Stereo Cameras","Fangwen Shu, Jiaxuan Wang, Alain Pagani, Stricker Didier","DFKI,German Research Center for Artificial Intelligence",Localization and Mapping I,"This paper presents a visual SLAM system that uses both points and lines for robust camera localization, and simultaneously performs a piece-wise planar reconstruction (PPR) of the environment to provide a structural map in real-time. 
One of the biggest challenges in parallel tracking and mapping with a monocular camera is to keep the scale consistent when reconstructing the geometric primitives. This further introduces difficulties in graph optimization of the bundle adjustment (BA) step. We solve these problems by proposing several run-time optimizations on the reconstructed lines and planes. Our system is able to run with depth and stereo sensors in addition to the monocular setting. Our proposed SLAM tightly incorporates the semantic and geometric features to boost both frontend pose tracking and backend map optimization. We evaluate our system exhaustively on various datasets, and show that we outperform state-of-the-art methods in terms of trajectory precision. The code of PLP-SLAM has been made available in open-source for the research community (https://github.com/PeterFWS/Structure-PLP-SLAM)." Rotation Synchronization Via Deep Matrix Factorization,"Tejus Gk, Giacomo Zara, Paolo Rota, Andrea Fusiello, Elisa Ricci, Federica Arrigoni","Indian Institute of Technology (ISM) Dhanbad,University of Trento,University of Udine,Politecnico di Milano",Localization and Mapping I,"In this paper we address the rotation synchronization problem, where the objective is to recover absolute rotations starting from pairwise ones, where the unknowns and the measures are represented as nodes and edges of a graph, respectively. This problem is an essential task for structure from motion and simultaneous localization and mapping. We focus on the formulation of synchronization via neural networks, which has only recently begun to be explored in the literature. Inspired by deep matrix completion, we express rotation synchronization in terms of matrix factorization with a deep neural network. Our formulation exhibits implicit regularization properties and, more importantly, is unsupervised, whereas previous deep approaches are supervised. 
Our experiments show that we achieve comparable accuracy to the closest competitors in most scenes, while working under weaker assumptions." Object-Based SLAM Utilizing Unambiguous Pose Parameters Considering General Symmetry Types,"Taekbeom Lee, Youngseok Jang, H. Jin Kim",Seoul National University,Localization and Mapping I,"Existence of symmetric objects, whose observation at different viewpoints can be identical, can deteriorate the performance of simultaneous localization and mapping (SLAM). This work proposes a system for robustly optimizing the pose of cameras and objects even in the presence of symmetric objects. We classify objects into three categories depending on their symmetry characteristics, which is efficient and effective in that it allows to deal with general objects and the objects in the same category can be associated with the same type of ambiguity. Then we extract only the unambiguous parameters corresponding to each category and use them in data association and joint optimization of the camera and object pose. The proposed approach provides significant robustness to the SLAM performance by removing the ambiguous parameters and utilizing as much useful geometric information as possible. Comparison with baseline algorithms confirms the superior performance of the proposed system in terms of object tracking and pose estimation, even in challenging scenarios where the baseline fails." Towards View-Invariant and Accurate Loop Detection Based on Scene Graph,"Chuhao Liu, Shaojie Shen",Hong Kong University of Science and Technology,Localization and Mapping I,"Loop detection plays a key role in visual Simultaneous Localization and Mapping (SLAM) by correcting the accumulated pose drift. In indoor scenarios, the richly distributed semantic landmarks are view-point invariant and hold strong descriptive power in loop detection. The current semantic-aided loop detection embeds the topology between semantic instances to search a loop.
However, current semantic-aided loop detection methods face challenges in dealing with ambiguous semantic instances and drastic viewpoint differences, which are not fully addressed in the literature. This paper introduces a novel loop detection method based on an incrementally created scene graph, targeting the visual SLAM at indoor scenes. It jointly considers the macro-view topology, micro-view topology, and occupancy of semantic instances to find correct correspondences. Experiments using handheld RGB-D sequence show our method is able to accurately detect loops in drastically changed viewpoints. It maintains a high precision in observing objects with similar topology and appearance. Our method also demonstrates that it is robust in changed indoor scenes." ViViD++: Vision for Visibility Dataset,"Alex Lee, Younggun Cho, Young-Sik Shin, Ayoung Kim, Hyun Myung","Hyundai Motor Company,Inha University,KIMM,Seoul National University,KAIST (Korea Advanced Institute of Science and Technology)",SLAM 2,"In this paper, we present a dataset capturing diverse visual data formats that target varying luminance conditions. While RGB cameras provide nourishing and intuitive information, changes in lighting conditions potentially result in catastrophic failure for robotic applications based on vision sensors. Approaches overcoming illumination problems have included developing more robust algorithms or other types of visual sensors, such as thermal and event cameras. Despite the alternative sensors’ potential, there still are few datasets with alternative vision sensors. Thus, we provided a dataset recorded from alternative vision sensors, by handheld or mounted on a car, repeatedly in the same space but in different conditions. We aim to acquire visible information from co-aligned alternative vision sensors. 
Our sensor system collects data more independently from visible light intensity by measuring the amount of infrared dissipation, depth by structured reflection, and instantaneous temporal changes in luminance. We provide these measurements along with inertial sensors and ground-truth for developing robust visual SLAM under poor illumination. The full dataset is available at: https://visibilitydataset.github.io/" CamMap: Extrinsic Calibration of Non-Overlapping Cameras Based on SLAM Map Alignment,"Jie Xu, Ruifeng Li, Lijun Zhao, Wenlu Yu, Zhiheng Liu, Bo Zhang, Yuchen Li","Harbin Institute of Technology,harbin institute of technology",SLAM 2,"Multiple cameras have emerged as a promising technology for robots and vehicles due to their broad fields of view (FoV) and high resolution. However, there are often limited or no overlapping FoVs among cameras, bringing challenges to estimating extrinsic camera parameters. To overcome this problem, we propose CamMap: a novel 6-degree-of-freedom (DoF) extrinsic calibration pipeline. Following three operating rules, we make a multi-camera rig capture some similar image sequences individually to create sparse feature-based maps with a SLAM system. A two-stage optimization problem is formulated to align the maps and obtain the transformations between them based on bidirectional reprojection. The transformations are exactly the extrinsic parameters. Supporting diverse camera types, the pipeline is available in any texture-rich environment. It can calibrate any number of cameras simultaneously without requiring calibration patterns, synchronization, same resolution and frequency. The pipeline is evaluated on cameras with limited and no overlapping FoVs. In the experiments, we demonstrate our method's accuracy and efficiency. The absolute pose error (APE) between Kalibr and CamMap is less than 0.025. We make the source codes public at github.com/jiejie567/SlamForCalib."
Hybrid Visual SLAM for Underwater Vehicle Manipulator Systems,"Gideon Billings, Richard Camilli, Matthew Johnson-Roberson","University of Sydney, Australian Center for Field Robotics,Woods Hole Oceanographic Institution,University of Michigan",SLAM 2,"This paper presents a novel visual feature based scene mapping method for underwater vehicle manipulator systems (UVMSs), with specific emphasis on robust mapping in natural seafloor environments. Our method uses GPU accelerated SIFT features in a graph optimization framework to build a feature map. The map scale is constrained by features from a vehicle mounted stereo camera, and we exploit the dynamic positioning capability of the manipulator system by fusing features from a wrist mounted fisheye camera into the map to extend it beyond the limited viewpoint of the vehicle mounted cameras. Our hybrid SLAM method is evaluated on challenging image sequences collected with a UVMS in natural deep seafloor environments of the Costa Rican continental shelf margin, and we also evaluate the stereo only mode on a shallow reef survey dataset. Results on these datasets demonstrate the high accuracy of our system and suitability for operating in diverse and natural seafloor environments. We also contribute these datasets for public use." WOLF: A Modular Estimation Framework for Robotics Based on Factor Graphs,"Joan Solà, Joan Vallvé, Joaquim Casals, Jeremie Deray, Mederic Fourmy, Dinesh Atchuthan, Andreu Corominas-murtra, Juan Andrade-Cetto","Institut de Robòtica i Informàtica Industrial,CSIC-UPC,Institut de Robòtica i Informàtica Industrial, CSIC-UPC,LAAS, CNRS,EasyMile,Beta Robots SL",SLAM 2,"This paper introduces WOLF, a C++ estimation framework based on factor graphs and targeted at mobile robotics. WOLF can be used beyond SLAM to handle self-calibration, model identification, or the observation of dynamic quantities other than localization. The architecture of WOLF allows for a modular yet tightly-coupled estimator. 
Modularity is enhanced via reusable plugins that are loaded at runtime depending on the application setup. This setup is achieved conveniently through YAML files, allowing users to configure a wide range of applications without the need of writing or compiling code. Most procedures are coded as abstract algorithms in base classes with varying levels of specialization. Overall, all these assets allow for coherent processing and favor code reusability and scalability. WOLF can be used with ROS and is made publicly available and open to collaboration." "Point Cloud Change Detection with Stereo V-SLAM: Dataset, Metrics and Baseline","Zihan Lin, Yu Jincheng, Lipu Zhou, Xudong Zhang, Jian Wang, Yu Wang","Tsinghua University,MeiTuan,Tsinghua Univ.",SLAM 2,"Localization and navigation are basic robotic tasks requiring an accurate and up-to-date map to finish these tasks, with crowdsourced data to detect map changes posing an appealing solution. Collecting and processing crowdsourced data requires low-cost sensors and algorithms, but existing methods rely on expensive sensors or computationally expensive algorithms. Additionally, there is no existing dataset to evaluate point cloud change detection. Thus, this paper proposes a novel framework using low-cost sensors like stereo cameras and IMU to detect changes in a point cloud map. Moreover, we create a dataset and the corresponding metrics to evaluate point cloud change detection with the help of the high-fidelity simulator Unreal Engine 4. Experiments show that our visual-based framework can effectively detect the changes in our dataset." 
Hilti-Oxford Dataset: A Millimeter-Accurate Benchmark for Simultaneous Localization and Mapping,"Lintong Zhang, Michael Helmberger, Lanke Frank Tarimo Fu, David Wisth, Marco Camurri, Davide Scaramuzza, Maurice Fallon","University of Oxford,HILTI AG,Free University of Bozen-Bolzano,University of Zurich",SLAM 2,"Simultaneous Localization and Mapping (SLAM) is being deployed in real-world applications, however many state-of-the-art solutions still struggle in many common scenarios. A key necessity in progressing SLAM research is the availability of high-quality datasets and fair and transparent benchmarking. To this end, we have created the Hilti-Oxford Dataset, to push state-of-the-art SLAM systems to their limits. The dataset has a variety of challenges ranging from sparse and regular construction sites to a 17th century neoclassical building with fine details and curved surfaces. To encourage multi-modal SLAM approaches, we designed a data collection platform featuring a lidar, five cameras, and an IMU. With the goal of benchmarking SLAM algorithms for tasks where accuracy and robustness are paramount, we implemented a novel ground truth collection method that enables our dataset to accurately measure SLAM pose errors with millimeter accuracy. To further ensure accuracy, the extrinsics of our platform were verified with a micrometer-accurate scanner, and temporal calibration was managed online using hardware time synchronization. The multi-modality and diversity of our dataset attracted a large field of academic and industrial researchers to enter the second edition of the Hilti SLAM challenge. The results of the challenge show that while the top three teams could achieve an accuracy of 2cm or better for some sequences, the performance dropped off in more difficult sequences." 
Long-Term Visual SLAM with Bayesian Persistence Filter Based Global Map Prediction,"Tianchen Deng, Hongle Xie, Jingchuan Wang, Weidong Chen",Shanghai Jiao Tong University,SLAM 2,"With the rapidly growing demand for accurate localization in real-world environments, visual SLAM has received significant attention in recent years. However, those existing methods still suffer from the degradation of localization accuracy in long-term changing environments. To address these problems, we propose a novel long-term SLAM system with map prediction and dynamics removal. First, a visual point cloud matching algorithm is designed to efficiently fuse 2D pixel information and 3D voxel information. Second, each map point is classified into three types: static, semi-static, and dynamic, based on the Bayesian persistence filter. Then we remove the dynamic map points to eliminate the influence of those map points. We can obtain a global predicted map by modeling the time series of semi-static map points. Finally, we incorporate the predicted global map into a state-of-the-art SLAM method, achieving an efficient visual SLAM system for long-term dynamic environments. Extensive experiments are carried out on a wheelchair robot in an indoor environment over several months. The results demonstrate that our method has better map prediction accuracy and achieves more robust localization performance." Wheel-SLAM: Simultaneous Localization and Terrain Mapping Using One Wheel-Mounted IMU,"Yibin Wu, Jian Kuang, Xiaoji Niu, Jens Behley, Lasse Klingbeil, Heiner Kuhlmann","University of Bonn,Wuhan University",SLAM 2,"A reliable pose estimator robust to environmental disturbances is desirable for mobile robots. To this end, inertial measurement units (IMUs) play an important role because they can perceive the full motion state of the vehicle independently. However, it suffers from accumulative error due to inherent noise and bias instability, especially for low-cost sensors. 
In our previous studies on Wheel-INS, we proposed to limit the error drift of the pure inertial navigation system (INS) by mounting an IMU to the wheel of the robot to take advantage of rotation modulation. However, it still drifted over a long period of time due to the lack of external correction signals. In this letter, we propose to exploit the environmental perception ability of Wheel-INS to achieve simultaneous localization and mapping (SLAM) with only one IMU. To be specific, we use the road bank angles (mirrored by the robot roll angles estimated by Wheel-INS) as terrain features to enable the loop closure with a Rao-Blackwellized particle filter. The road bank angle is sampled and stored according to the robot position in the grid maps maintained by the particles. The weights of the particles are updated according to the difference between the currently estimated roll sequence and the terrain map. Field experiments suggest the feasibility of the idea to perform SLAM in Wheel-INS using the robot roll angle estimates. In addition, the positioning accuracy is improved significantly (more than 30%) over Wheel-INS." Maplab 2.0 - a Modular and Multi-Modal Mapping Framework,"Andrei Cramariuc, Lukas Bernreiter, Florian Tschopp, Marius Fehr, Victor Reijgwart, Juan Nieto, Roland Siegwart, Cesar D. Cadena Lerma","ETHZ,ETH Zurich, Autonomous Systems Lab,Arrival Ltd,Voliro AG,ETH Zurich,Microsoft",SLAM 2,"Integration of multiple sensor modalities and deep learning into Simultaneous Localization And Mapping (SLAM) systems are areas of significant interest in current research. Multi-modality is a stepping stone towards achieving robustness in challenging environments and interoperability of heterogeneous multi-robot systems with varying sensor setups. With maplab 2.0, we provide a versatile open-source platform that facilitates developing, testing, and integrating new modules and features into a fully-fledged SLAM system. 
Through extensive experiments, we show that maplab 2.0’s accuracy is comparable to the state-of-the-art on the HILTI 2021 benchmark. Additionally, we showcase the flexibility of our system with three use cases: i) large-scale (approx. 10 km) multi-robot multi-session (23 missions) mapping, ii) integration of non-visual landmarks, and iii) incorporating a semantic object-based loop closure module into the mapping framework. The code is available open-source at https://github.com/ethz-asl/maplab." Simulation Data Driven Design Optimization for Reconfigurable Soft Gripper System,"Jun LIU, Jin Huat Low, Qian Qian Han, Marisa Lim, Dingjie Lu, Yangfan Li, Chen-Hua Yeow, Zhuangjian Liu","IHPC, A*STAR,National University of Singapore,IHPC, ASTAR,Institute of High Performance Computing, A*Star,INSTITUTE OF HIGH PERFORMANCE COMPUTING","Modeling, Control, and Learning for Soft Robots","In the soft gripper design work, most of the designs such as gripping width and the design of finger actuator are purely based on experience, and repeated trial-and-error. In most scenarios, the designed actuators cannot achieve the best/optimized grasping performance with a specific design type. This optimized design is important especially for the food grasping application, as a minor improvement of the grasping capability will be helpful to increase the grasping success ratio, especially during high-speed pick and place tasks. That motivates us to develop a design optimization framework, focusing on how to achieve an optimized grasping performance with a multi-objective design optimization. In this work, a simulation aided data-driven optimization framework for guiding the design of a reconfigurable soft gripper system is presented. To achieve an effective optimization, a simulation model is developed based on the Simulation Open Framework Architecture (SOFA) platform. This model can predict the bending and grasping behavior under actuation and external loading. 
This model is then used in a data-driven design optimization framework for optimizing the actuator design. An artificial neural network (ANN) is built based on the simulation results as training data, and used as a surrogate model in a multi-objective optimization framework, to achieve an optimal grasping capability with design constraints. This simulation and optimization capability can significantly reduce the tr" Research on Design and Experiment of a Wearable Hand Rehabilitation Device Driven by Fiber-Reinforced Soft Actuator,"Kaiwei Ma, Zhenjiang Jiang, Shuang Gao, Guoping Jiang, Fengyu Xu","Nanjing University of Posts and Telecommunications,southeast university","Modeling, Control, and Learning for Soft Robots","Fiber-reinforced soft actuators have great potential for the development of wearable technology. However, its complex structural design, nonlinear soft material body, fluid-driven dynamics and high manufacturing costs have brought huge challenges to system modeling, control and application. To improve this situation, a novel fiber-reinforced soft actuator is designed and analyzed. First, a wearable hand rehabilitation device based on fiber-reinforced soft actuators with three-air-chamber structure is designed. Next, using Yeoh model and principle of virtual work, we establish a bending mathematical model of the soft actuator, whose input parameters are air pressure P and winding number N, and output parameter is bending angle b. Finally, through the finite element analysis, the optimal N is obtained, and the correctness of the model is verified. To verify the above research, an experimental platform is constructed. The results show that the relative error of the model is in an acceptable state. The device can imitate common gestures, easily grasp objects with a volume of 1.6 dm³ and mass of 335.7 g, which can realize the hand rehabilitation training." 
DNN-Based Predictive Model for a Batoid-Inspired Soft Robot,"Guangtong Li, Thileepan Stalin, Truong Van Tien, Pablo Valdivia","Singapore University of Technology and Design,Singapore University of Technology and Design, MIT","Modeling, Control, and Learning for Soft Robots","Soft robots have a unique potential to harness advanced functionalities through materials engineering, chemistry, and advanced fabrication. However, modeling and control of soft robot bodies is challenging due to non-linearities and time-dependencies of materials physico-chemical properties. With the rapid development of artificial intelligence technologies, deep neural networks (DNN) have become an essential tool for exploring the relationships between inputs and outputs of challenging systems under complex environmental conditions. In this work, rather than physically modeling a soft robotic system, we treat the entire system, including its environment, as a complex but deterministic input-output system, and we use DNNs to estimate these relationships. As an application example, our training results show that DNNs can accurately simulate the physical properties of an underwater bio-inspired soft robot. Validation experiments show that measured propulsive forces are in good agreement with target values predicted by DNNs. Our experiments show the potential of using DNNs to accomplish rapid modeling of bio-inspired propulsion and facilitate control." Modeling the Locomotion of Articulated Soft Robots in Granular Medium,"Yayun Du, Jacqueline Lam, Karunesh Sachanandani, Mohammad Khalid Jawed","University of California, Los Angeles,UCLA","Modeling, Control, and Learning for Soft Robots","Structural flexibility and robot-environment interaction make soft robot modeling challenging. We introduce a numerical tool that couples discrete differential geometry-based simulation of elastic rods, our model for articulated structure, and other external forces. 
Parallel to simulations, we build an untethered robot testbed, in the granular medium, comprised of multiple flexible flagella that are rotated about an axis by a motor. Drag from the granules causes the flagella to deform and the deformed shape generates a net forward propulsion. External drag depends on the flagellar shape, while the change in flagellar shape is the result of the competition between the external loading and elastic forces. We find reasonable quantitative agreement between experiments and simulations. Owing to a rod-based kinematic representation of the robot, the simulation can run faster than real-time in some cases, and, therefore, we can use it as a design tool for this class of soft robots. We find that there is an optimal rotational speed at which maximum efficiency is achieved. Moreover, both experiments and simulations show that increasing the number of flagella from two to three decreases the speed of the robot. This indicates that our simulator is potentially applicable for unknown physics exploration. We also gain insight into the mechanics of granular medium - while resistive force theory can successfully describe the propulsion at low number of flagella, it fails when more flagella a" SoRoSim: A MATLAB Toolbox for Hybrid Rigid-Soft Robots Based on the Geometric Variable-Strain Approach,"Anup Teejo Mathew, Ikhlas Mohamed Ben Hmida, Costanza Armanini, Frédéric Boyer, Federico Renda","Khalifa University,IMT atlantique,Khalifa University of Science and Technology","Modeling, Control, and Learning for Soft Robots","Soft robotics has been a trending topic within the robotics community for almost two decades. However, available tools for the modeling and analysis of soft robots are still limited. This paper introduces a user-friendly MATLAB toolbox, SoRoSim, that integrates the Geometric Variable Strain model of Cosserat rods to facilitate the static and dynamic analysis of soft, rigid, or hybrid robotic systems. 
We present a brief overview of the design and structure of the toolbox and validate it by comparing its results with those published in the literature. To highlight the toolbox's potential to efficiently model, simulate, optimize, and control various robotic systems, we demonstrate four sample applications. The demonstrated applications explore different actuator and external loading conditions of single-, branched-, open-, and closed-chain robotic systems. We think that the soft-robotics research community will significantly benefit from the SoRoSim toolbox for a wide variety of applications." A Geometrically-Exact Assumed Strain Modes Approach for the Geometrico and Kinemato-Static Modellings of Continuum Parallel Robots,"Sébastien Briot, Frédéric Boyer","LS,N,IMT atlantique","Modeling, Control, and Learning for Soft Robots","There is a growing interest on the study of continuum parallel robots (CPRs) due to their higher stiffness and better dynamics capacities than serial continuum robots (SCRs). Several works have focused on the computation of their geometrico- and kinemato-static models, that can be sorted into two main categories: (i) models based on the continuous Cosserat equations: They are very accurate but assessing elastic stability with them is tricky; (ii) discretized models: They allow easily checking the elastic stability but they require a large number of elastic variables to be accurate. In this paper, we extend an approach based on assumed strain modes developed for the dynamics of SCRs to the statics of CPRs. This method is able to predict the robot configuration with an excellent accuracy with a very limited number of elastic variables, contrary to other discretization methods. The method is also more than 100 times faster than finite differences for a better prediction accuracy. 
Finally, it is possible to assess the robot elastic stability by only checking the Hessian of the potential energy as for any discretization method, thus making the analysis of this property simpler than f" Towards a Physics-Based Model for Steerable Eversion Growing Robots,"Zicong Wu, Mikel De Iturrate Reyzabal, S.M.Hadi Sadati, Hongbin Liu, Sebastien Ourselin, Daniel Richard Leff, Robert Kevin Katzschmann, Kawal Rhode, Christos Bergeles","King's College London,Hong Kong Institute of Science & Innovation, Chinese Academy of ,University College London,Imperial College London,ETH Zurich","Modeling, Control, and Learning for Soft Robots","Soft robots that grow through eversion/apical extension can effectively navigate fragile environments such as ducts and vessels inside the human body. This letter presents the physics based model of a miniature steerable eversion growing robot. We demonstrate the robot’s growing, steering, stiffening and interaction capabilities. The interaction between two robot-internal components is explored, i.e., a steerable catheter for robot tip orientation, and a growing sheath for robot elongation/retraction. The behavior of the growing robot under different inner pressures and external tip forces is investigated. Simulations are carried out within the SOFA framework. Extensive experimentation with a physical robot setup demonstrates agreement with the simulations. The comparison demonstrates a mean absolute error of 10–20% between simulation and experimental results for curvature values, including catheter-only experiments, sheath-only experiments and full system experiments. To our knowledge, this is the first work to explore physics-based modelling of a tendon-driven steerable eversion growing robot. While our work is motivated by early breast cancer detection through mammary duct inspection and uses our MAMMOBOT robot prototype, our approach is general and relevant to similar growing robots." 
P-satI-D Shape Regulation of Soft Robots,"Pietro Pustina, Pablo Borja, Cosimo Della Santina, Alessandro De Luca","Sapienza University of Rome,University of Plymouth,TU Delft","Modeling, Control, and Learning for Soft Robots","Soft robots are intrinsically underactuated mechanical systems that operate under uncertainties and disturbances. In these conditions, this letter proposes two versions of PID-like control laws with a saturated integral action for the particularly challenging shape regulation task. The closed-loop system is asymptotically stabilized and matched constant disturbances are rejected using a very reduced amount of system information for control implementation. Stability is assessed on the underactuated dynamic model through the Invariant Set Theorem for two relevant classes of soft robots, i.e., elastically decoupled and elastically dominated soft robots. Extensive simulation results validate the proposed controllers." Statics and Dynamics of Continuum Robots Based on Cosserat Rods and Optimal Control Theories,"Frédéric Boyer, Vincent Lebastard, Fabien Candelier, Federico Renda, Mazen Alamir","IMT atlantique,Université Aix Marseille,Khalifa University of Science and Technology,LAG","Modeling, Control, and Learning for Soft Robots","This paper explores the relationship between optimal control and Cosserat beam theory from the perspective of solving the forward and inverse dynamics (and statics as a subcase) of continuous manipulators and snake-like bio-inspired locomotors. By invoking the principle of minimum potential energy, and the Gauss principle of least constraint, it is shown that the quasi-static and dynamic evolution of these robots, are solutions of optimal control problems (OCPs) in the space variable, which can be solved at each step (of loading or time) of a simulation with the shooting method. 
In addition to offering an alternative viewpoint on several simulation approaches proposed in the recent past, the optimal control viewpoint allows us to improve some of them while providing a better understanding of their numerical properties. The approach and its properties are illustrated through a set of numerical examples validated against a reference simulator." Robotic Fiber Threading from a Gel-Like Substance Based on Impedance Control with Force Tracking,"Houari Bettahar, P. A. Diluka Harischandra, Quan Zhou","Aalto university,Aalto University","Modeling, Control, and Learning for Soft Robots","Gel-like matter is used extensively in a wide range of application fields including industrial applications such as the manufactory and assembly of garment and footwear products, soft macro/micro-robotics, medical diagnostics, and drug delivery. However, the manipulation of gel-like matter is very challenging, due to its high deformability, high viscosity, and fast phase changing. In this paper, a robotic fiber threading approach based on impedance control with force tracking is proposed to mimic the threading process of natural species. The aim is to automatically tune the mechanical properties of the gel-like substance to fabricate high-performance artificial fibers with the desired mechanical properties. The proposed approach estimates the stiffness of the gel using only force and position measurements during the whole threading process, without a need to specify any gain. The obtained force tracking error is 0.019 mN (0.95%) and the position error is 0.0053 mm (0.23%). The proposed approach fabricates the higher performance of artificial fibers compared to the widely used velocity control fiber fabrication approach." 
Overload Clutch with Integrated Torque Sensing and Decoupling Detection for Collision Tolerant Hybrid High-Speed Industrial Cobots,"Frederik Ostyn, Bram Vanderborght, Guillaume Crevecoeur","Ghent University,VUB",Compliant Mechanisms,"A hybrid high-speed industrial collaborative robot can switch between collaborative mode and high-speed mode, combining the advantages of both. While promising, this concept comes with some challenges such as dealing with collisions at high speed. An overload clutch with integrated torque sensing and clutch decoupling detection is presented as enabling technology. Both joint torque sensing and clutch decoupling detection are realized with the same capacitive measurement hardware that consists of paired electrodes. A prototype device is experimentally validated through comparison with a reference torque sensor." A Micro Aircraft with Passive Variable-Sweep Wings,"Songnan Bai, Runze Ding, Pakpong Chirarattananon","City University of Hong Kong,CITY UNIVERSITY OF HONGKONG",Compliant Mechanisms,"Traditional fixed-wing vehicles are equipped with multiple active aerodynamic surfaces for flight control. This inevitably necessitates several actuators, complicates the mechanical structure, and adversely impacts the flight efficiency, particularly for small aerial vehicles. As a lightweight and efficient solution, this work proposes to employ passive variable-sweep wings on a micro airplane to eliminate the need for active control surfaces while retaining effective pitch maneuverability. Depending on the thrust produced by the propellers, the wings passively sweep back and forth, relocating the center of pressure and affecting the pitch moment. The thrust-induced deformation substitutes the elevators for pitch control. Through aerodynamic modeling, the flight dynamics of the proposed vehicle is analyzed. The results show that the proposed design brings about amplified and accelerated pitch response. 
Lastly, a prototype was constructed to demonstrate and verify the enhanced aircraft’s pitch control ability." Design and Voluntary Control of Variable Stiffness Exoskeleton Based on sEMG Driven Model,"Yanghui Zhu, Qingcong Wu, Bai Chen, Ziyue Zhao",Nanjing University of Aeronautics and Astronautics,Compliant Mechanisms,"Exoskeleton robots are a feasible solution for patients with motor dysfunction to restore their daily activities. First, a variable stiffness exoskeleton robot (VSA-EXO) is designed based on the variable stiffness actuator. Then, a sEMG-driven musculoskeletal model is proposed, which can estimate the joint torque and quasi-stiffness of joint through the sEMG signals of some muscles related to the joint. Furthermore, a voluntary control strategy is proposed, which can adjust the degree of assistance according to the voluntary efforts of the subjects, and can transfer the stiffness adjustment skills of the human joints to the exoskeleton joints. Unlike the traditional assist-as-needed (AAN), the voluntary control strategy in this paper does not need to define a trajectory. Finally, some feasibility experiments are conducted on three healthy subjects. The calibration experiments of the musculoskeletal model show that the NRMSE of the estimated torque and the actual torque of the three subjects are lower than 7.21%, proving the effectiveness of the musculoskeletal model. In the assisted experiment, with the exoskeleton assisting ratio of 0.3, the subjects’ effort is reduced by up to 25.32%." A Robotic Torso Joint with Adjustable Linear Spring Mechanism for Natural Dynamic Motions in a Differential-Elastic Arrangement,"Jens Reinecke, Alexander Dietrich, Anton Shu, Bastian Deutschmann, Marco Hutter","DLR,German Aerospace Center (DLR),German Aerospace Center,ETH Zurich",Compliant Mechanisms,"To be operated in unknown or complex environments, modern robots have to fulfill various challenging criteria. 
Among them, one finds requirements such as a high level of robustness to withstand impacts and the capabilities to physically interact in a safe manner. One way to achieve that is to integrate variable-stiffness actuators into the systems, enabling compliant behavior through the elastic components and providing the additional adaptability of the impedance. Here, we introduce a novel adjustable linear stiffness joint mounted in a differential-elastic arrangement. The mechanism is integrated into the anthropomorphic upper body of the DLR David robot and responsible for the spinal rotation. Consequently, the actuator is crucial for the overall workspace of the robot and the realization of energy-efficient natural motions such as in dynamic running. The proposed hardware setup is experimentally validated in terms of the linearity in the spring characteristics, intrinsic damping, the excitation of resonance frequencies, and the ability to alter these resonance frequencies through stiffness adaptation during dynamic motions." Requirements on the Spatial Distribution of Elastic Components Used in Compliance Realization,"Shuguang Huang, Joseph Schimmels",Marquette University,Compliant Mechanisms,"In this paper, necessary conditions on the locations and orientations of elastic components in a compliant mechanism used to realize any given spatial compliance are identified. The topologies considered are either fully parallel or fully serial mechanisms having an arbitrary number of lumped elastic components. It is shown that the requirements on elastic components are characterized by a sphere for the location distribution and by three cones for the orientation distribution. The easy to assess conditions on the set of components can be used to achieve a more desirable mechanism geometry when used in conjunction with existing spatial compliance synthesis procedures." 
A Novel Metamorphic Foot Mechanism with Toe Joints Based on Spring-Loaded Linkages,"Jianwei Sun, Zhenyu Wang, Meiling Zhang, Songyu Zhang, Zhihui Qian, Jinkui Chu","Changchun University of Technology,Jilin University,Dalian University of Technology",Compliant Mechanisms,"The toe joints play an important role in human walking and running movement patterns. In this paper, we propose a design method for metamorphic foot structures based on spring-loaded linkages to realize metatarsal-toe switching by using the self-recovery and self-stabilization properties of the spring-loaded linkages. We constrain the degrees of freedom (DOFs) of the foot mechanism by using the singular position characteristic of the four-bar mechanism to compensate for the lack of rigidity when the foot mechanism is in the toe line state. In addition, the structural parameters are calculated based on the static analysis method, and the validity of the design method is verified by simulation using Adams software. Finally, the compliance of the metatarsal state and the stability of the toe line state are demonstrated by physical prototyping and experiments. The results show that the novel metamorphic foot mechanism provides a uniform solution to cope with both walking and running movement modes." Haptic-Based and SE(3)-Aware Object Insertion Using Compliant Hands,"Osher Azulay, Maxim Monastirsky, Avishai Sintov","Tel Aviv University,Tel-Aviv University",Compliant Mechanisms,"Object insertion is primarily studied using rigid robotic hands. However, these may have difficulties overcoming spatial uncertainties originating from an uncertain initial grasp. Compliant hands, on the other hand, can cope with SE(3) uncertainties and adapt to the environment upon contact. Nevertheless, contact forces may contribute additional uncertainties and lead to failure if not controlled properly. 
In this letter, we take inspiration from human insertion and study how haptic glances with compliant hands during contact can provide valuable information regarding object state. Using a force/torque sensor, we show that a haptic glance based on excitation of finger perturbations can provide accurate contact localization and indication of a successful insertion. With such insight, we propose an online learning scheme for general precision control of contact-rich object insertion. A deep residual Reinforcement Learning (RL) policy leverages the contact dynamics of the compliant hand to cope with SE(3) uncertainties. Several experiments of precision insertion tasks with various objects and grasp uncertainties exhibit high success rate and validate the effectiveness of the approach." Dynamic Modeling and Performance Analysis for a Wire-Driven Elastic Robotic Fish,"Xiaocun Liao, Chao Zhou, Qianqian Zou, Jian Wang, Ben Lu","Institute of Automation, Chinese Academy of Sciences,Chinese Academy of Sciences,Institution of Automation, Chinese Academy of sciences",Compliant Mechanisms,"The complex and continuous undulation of fishtail facilitates extraordinary underwater motion performance for natural fish. For the widely used Multi-Joint robotic fish, a lot of joints are used to simulate continuum fishtail, resulting in some challenges, e.g., the mechanism complexity, friction losses of adjacent joints, load disequilibrium and unsmooth servomotor output power. To overcome these intractable hurdles, motivated by natural fish, this letter proposes a wire-driven elastic robotic fish, which simulates fish muscle through multi-wire drive and adopts a fishlike spine design based on elastic component. 
Compared with the existing wire-driven robotic fish with discrete multiple-joints-spine and single-wire drive, our robotic fish not only has continuum fishtail, but also can swing with C-Shape and S-Shape owing to multi-wire coupling drive, and simulate the energy storage behavior of fish by elastic component. Further, a Lagrangian dynamic model that models a robotic fish with continuum fishtail and passive flexible joint is developed to explore the propulsive performance and validated by extensive experiments and simulations, and our robotic fish reaches the maximum swimming speed of 0.58 m/s, i.e., 1.04 BL/s. Finally, the superiority of the proposed drive mechanism in disposing of load disequilibrium and smoothing output power of servomotor is analyzed and validated by the comparisons between a Multi-Joint robotic fish and our robotic fish." A 2-Degree-Of-Freedom Quasi-Passive Prosthetic Wrist with Two Levels of Compliance,"Leonardo Cappello, Daniele D'accolti, Marta Gherardini, Marco Controzzi, Christian Cipriani","Scuola Superiore Sant'Anna,The Biorobotics Institute, Sant'Anna School of Advanced Studies",Compliant Mechanisms,"Restoring the function of a missing hand is still a grand challenge for bioengineers. We witnessed significant recent advances in the development of myoelectric hand prostheses and their controllers. Conversely, the wrist joint is generally overlooked in prosthetics, despite playing a fundamental role in orienting the hand in space. Indeed, it may account for several degrees of freedom of the hand in reducing compensatory movements. We acknowledge that an active, three-degree-of-freedom prosthetic wrist is not a viable option for a self-contained prosthesis, therefore we merged in one design two opposed passive behaviors. 
The proposed wrist can automatically transition between a compliant mode, which exhibits relatively low stiffness allowing for passive motions around two rotational axes (wrist flexion/extension and radial/ulnar deviation), and a stiff mode, which grants stability during manipulation. To switch mode, no additional control input - hence cognitive burden - from the user is needed: it occurs synchronously with the prosthetic hand opening and closing motion, such that the wrist is compliant during reaching and stiff during manipulation. Our device proved reliable on the test bench and useful in a pilot test with an amputee volunteer, motivating further developments and more extensive testing to prove its effectiveness." DiffCo: Auto-Differentiable Proxy Collision Detection with Multi-Class Labels for Safety-Aware Trajectory Optimization,"Yuheng Zhi, Nikhil Das, Michael Yip","University of California, San Diego,UCSD",Path Planning and Collision Avoidance,"The objective of trajectory optimization algorithms is to achieve an optimal collision-free path between a start and goal state. In real-world scenarios where environments can be complex and non-homogeneous, a robot needs to be able to gauge whether a state will be in collision with various objects in order to meet some safety metrics. The collision detector should be computationally efficient and, ideally, analytically differentiable to facilitate stable and rapid gradient descent during optimization. However, methods today lack an elegant approach to detect collision differentiably, relying rather on numerical gradients that can be unstable. We present DiffCo, the first, fully auto-differentiable, non-parametric model for collision detection. Its non-parametric behavior allows one to compute collision boundaries on-the-fly and update them, requiring no pre-training and allowing it to update continuously in dynamic environments. 
It provides robust gradients for trajectory optimization via backpropagation and is often 10-100x faster to compute than its geometric counterparts. DiffCo also extends trivially to modeling different object collision classes for semantically informed trajectory optimization." Risk-Aware Submodular Optimization for Multi-Robot Coordination,"Lifeng Zhou, Pratap Tokekar","Drexel University,University of Maryland",Path Planning and Collision Avoidance,"We study the problem of incorporating risk while making combinatorial decisions under uncertainty. We formulate a discrete submodular maximization problem for selecting a set using Conditional-Value-at-Risk (CVaR), a risk metric commonly used in financial analysis. While CVaR has recently been used in the optimization of linear cost functions in robotics, we take the first step towards extending this to discrete submodular optimization and provide several positive results. Specifically, we propose the Sequential Greedy Algorithm that provides an approximation guarantee on finding the maxima of the CVaR cost function under a matroidal constraint. The approximation guarantee shows that the solution produced by our algorithm is within a constant factor of the optimal and an additive term that depends on the optimal. Our analysis uses the curvature of the submodular set function, and proves that the algorithm runs in polynomial time. We also study the problem of adaptive risk-aware submodular maximization. We design a heuristic solution that triggers the replanning only when certain conditions are satisfied, to eliminate unnecessary planning." 
Risk-Aware Fast Trajectory Planner for Uncertain Environments Based on Probabilistic Surrogate Reliability and Risk Contours,Guobiao Wang,Southeast university,Path Planning and Collision Avoidance,"This paper presents the risk-aware fast trajectory planner (RAFTER) for autonomous vehicles in dynamic uncertain environments, which is based on the probabilistic surrogate reliability of other traffic participants and risk contours. In contrast to the conventional risk metric, RAFTER not only provides the upper bound of the probability of constraint violation but deduces an infimum on the risk which the controlled plant can stand by the probabilistic reliable surrogate model. Such a risk-aware algorithm is capable of perceiving uncertainty and handling robustness. A series of covering disks are constructed utilizing a concise geometric configuration for a lower conservatism representation of the vehicle profile, which attains a desirable tradeoff between the quantity and area of occupation. Safe travel corridors are built on the dilated map via covering disks, significantly reducing the computational burden of a reliability-based optimization procedure for optimal trajectory planning. The effectiveness of the proposed method is confirmed by two numerical simulations derived from real scenarios." Collision Avoidance among Dense Heterogeneous Agents Using Deep Reinforcement Learning,"Kai Zhu, Bin Li, Wen Ming Zhe, Tao Zhang","Tsinghua University,JD",Path Planning and Collision Avoidance,"Navigating in a complex congested social environment without collision is a crucial and challenging task. Recent studies have demonstrated the considerable success of Deep Reinforcement Learning (DRL) in multi-agent collision avoidance. However, the assumption of these studies that agents are homogeneous circles deviates from reality, leading to performance deterioration in congested scenarios. 
The current work extends the DRL-based approaches to develop a collision avoidance method for congested scenarios wherein the heterogeneity of agents can no longer be disregarded. Considering shape heterogeneity, we use the Orientated Bounding Capsule (OBC) to model the agents and transform the interactive state space of Robot-Obstacle agent pair. For speed heterogeneity, we design a velocity-related collision risk function to shape the behavior of the robot. Experimental results demonstrate that our proposed method outperforms state-of-the-art DRL-based approaches in terms of success rate and safety. It also exhibits desired collision avoidance behavior." Maximum-Entropy Multi-Agent Dynamic Games: Forward and Inverse Solutions,"Negar Mehr, Mingyu Wang, Maulik Bhatt, Mac Schwager","University of Illinois Urbana-Champaign,Stanford University",Path Planning and Collision Avoidance,"In this paper, we study the problem of multiple stochastic agents interacting in a dynamic game scenario with continuous state and action spaces. We define a new notion of stochastic Nash equilibrium for boundedly rational agents, which we call the Entropic Cost Equilibrium (ECE). We show that ECE is a natural extension to multiple agents of Maximum Entropy optimality for a single agent. We solve both the ``forward'' and ``inverse'' problems for the multi-agent ECE game. For the forward problem, we provide a Riccati algorithm to compute closed-form ECE feedback policies for the agents, which are exact in the Linear-Quadratic-Gaussian case. We give an iterative variant to find locally ECE feedback policies for the nonlinear case. For the inverse problem, we present an algorithm to infer the cost functions of the multiple interacting agents given noisy, boundedly rational input and state trajectory examples from agents acting in an ECE. 
The effectiveness of our algorithms is demonstrated in a simulated multi-agent collision avoidance scenario, and with data from the INTERACTION traffic dataset. In both cases, we show that, by taking into account the agents' game theoretic interactions using our algorithm, a more accurate model of agents' costs can be learned, compared with standard inverse optimal control methods." Distributing Collaborative Multi-Robot Planning with Gaussian Belief Propagation,"Aalok Patwardhan, Riku Murai, Andrew J Davison",Imperial College London,Path Planning and Collision Avoidance,"Precise coordinated planning over a forward time window enables safe and highly efficient motion when many robots must work together in tight spaces, but this would normally require centralised control of all devices which is difficult to scale. We demonstrate GBP Planning, a new purely distributed technique based on Gaussian Belief Propagation for multi-robot planning problems, formulated by a generic factor graph defining dynamics and collision constraints. In simulations, we show that our method allows extremely high performance collaborative planning where robots are able to cross each other in busy, intricate scenarios. They maintain shorter, quicker and smoother trajectories than alternative distributed planning techniques even in cases of communication failure. We encourage the reader to view the accompanying video demonstration at https://youtu.be/8VSrEUjH610." Interactive Multi-Modal Motion Planning with Branch Model Predictive Control,"Yuxiao Chen, Ugo Rosolia, Wyatt Ubellacker, Noel Csomay-Shanklin, Aaron Ames","Nvidia research,Caltech,California Institute of Technology",Path Planning and Collision Avoidance,"Motion planning for autonomous robots and vehicles in presence of uncontrolled agents remains a challenging problem as the reactive behaviors of the uncontrolled agents must be considered. 
Since the uncontrolled agents usually demonstrate multimodal reactive behavior, the motion planner needs to solve a continuous motion planning problem under these behaviors, which contains a discrete element. We propose a branch Model Predictive Control (MPC) framework that plans over feedback policies to leverage the reactive behavior of the uncontrolled agent. In particular, a scenario tree is constructed from a finite set of policies of the uncontrolled agent, and the branch MPC solves for a feedback policy in the form of a trajectory tree, which shares the same topology as the scenario tree. Moreover, coherent risk measures such as the Conditional Value at Risk (CVaR) are used as a tuning knob to adjust the tradeoff between performance and robustness. The proposed branch MPC framework is tested on an autonomous vehicle planning problem in simulation, and on an autonomous quadruped robot alongside an uncontrolled quadruped in experiments. The result demonstrates interesting human-like behaviors, achieving a balance between safety and performance." A Sequential MPC Approach to Reactive Planning for Bipedal Robots Using Safe Corridors in Highly Cluttered Environments,"Kunal Sanjay Narkhede, Abhijeet Kulkarni, Dhruv Ashwinkumar Thanki, Ioannis Poulakakis",University of Delaware,Path Planning and Collision Avoidance,"This paper presents a sequential Model Predictive Control (MPC) approach to reactive motion planning for bipedal robots in highly cluttered environments with moving obstacles. The approach relies on a polytopic decomposition of the free space, which provides a safe corridor in the form of an ordered collection of mutually intersecting obstacle-free polytopes and waypoints. These are subsequently used to define a corresponding sequence of MPC programs that drive the system to a goal location avoiding static and moving obstacles. 
This way, the planner focuses on the free space in the vicinity of the robot, thus alleviating the need to consider all the obstacles simultaneously and reducing computational time. We verify the efficacy of our approach in high-fidelity simulations with the bipedal robot Digit, demonstrating robust reactive planning in the presence of static and moving obstacles." Towards a Continuous Solution of the D-Visibility Watchman Route Problem in a Polygon with Holes,"Jan Mikula, Miroslav Kulich","Faculty of Electrical Engineering – Czech Technical University in Prague,Czech Technical University in Prague",Path Planning and Collision Avoidance,"A new heuristic solution framework is proposed to address the challenging watchman route problem (WRP) in a polygonal domain, which can be viewed as an offline version of the robot exploration task. The solution is the shortest route from which the robot can visually inspect a known 2D environment. Our framework considers a circular robot with radius r equipped with an omnidirectional sensor with limited visibility range d. Instead of a standard decoupled solution, the framework generates a set of specifically constrained regions covering the domain and then solves the traveling salesman problem with continuous neighborhoods (TSPN) to obtain the solution route. The TSPN is solved by another proposed heuristic algorithm that finds a discretized solution first and then improves it back in the continuous domain. The whole framework is evaluated experimentally, compared to two approaches from the literature, and shown to provide the highest-quality solutions. The current version of the framework is one step from a fully continuous approach to the WRP that we will address in the future." 
Learning Deep Neural Network Controller for Path Following of Unicycle Robots,"Priyabrata Saha, Luis Guerrero-bonilla, Magnus Egerstedt, Saibal Mukhopadhyay","Georgia Institute of Technology,Instituto Tecnologico y de Estudios Superiores de Monterrey,University of California, Irvine",Deep Learning and Neural Networks in Robotics,"This paper investigates the scope of deep neural network (DNN) based controllers in the path following task for unicycle mobile robots. A DNN-based controller is trained to follow paths with arbitrary curvature in two-dimensional space. The training process does not require initialization or supervision from any other known expert controller. Rather, the training of the DNN controller is guided by another predictive neural network that represents a path following error dynamics which is exponentially stable at the origin. The two DNNs are trained jointly in a simulated environment. The learned DNN controller is then employed as a standalone controller in a real unicycle robot for the tasks of following various linear and curved paths." ViewBirdiformer: Learning to Recover Ground-Plane Crowd Trajectories and Ego-Motion from a Single Ego-Centric View,"Mai Nishimura, Shohei Nobuhara, Ko Nishino","OMRON SINIC X,Kyoto University",Deep Learning and Neural Networks in Robotics,"We introduce a novel learning-based method for view birdification, the task of recovering ground-plane trajectories of pedestrians of a crowd and their observer in the same crowd just from the observed ego-centric video. View birdification becomes essential for mobile robot navigation and localization in dense crowds where the static background is hard to see and reliably track. 
It is challenging mainly for two reasons; i) absolute trajectories of pedestrians are entangled with the movement of the observer which needs to be decoupled from their observed relative movements in the ego-centric video, and ii) a crowd motion model describing the pedestrian movement interactions is specific to the scene yet unknown a priori. For this, we introduce a Transformer-based network referred to as ViewBirdiformer which implicitly models the crowd motion through self-attention and decomposes relative 2D movement observations onto the ground-plane trajectories of the crowd and the camera through cross-attention between views. Most important, ViewBirdiformer achieves view birdification in a single forward pass which opens the door to accurate real-time, always-on situational awareness. Extensive experimental results demonstrate that ViewBirdiformer achieves accuracy similar to or better than state-of-the-art with three orders of magnitude reduction in execution time." Closing the Planning-Learning Loop with Application to Autonomous Driving,"Panpan Cai, David Hsu","Shanghai Jiao Tong University,National University of Singapore",Deep Learning and Neural Networks in Robotics,"Real-time planning under uncertainty is critical for robots operating in complex dynamic environments. Planning explicitly over a long time horizon, however, incurs prohibitive computational cost. To achieve real-time performance for large-scale planning, this work introduces a new algorithm, Learning from Tree Search for Driving (LeTS-Drive), which integrates planning and learning in a closed loop, and applies it to autonomous driving in crowded urban traffic in simulation. Specifically, LeTS-Drive learns a policy and its value function from data provided by an online planner, which searches a sparsely-sampled belief tree; the online planner in turn uses the learned policy and value functions as heuristics to scale up its run-time performance for real-time robot control. 
These two steps are repeated to form a closed loop so that the planner and the learner inform each other and improve in synchrony. The algorithm learns on its own in a self-supervised manner, without human effort on explicit data labeling. Experimental results demonstrate that LeTS-Drive outperforms either planning or learning alone, as well as open-loop integration of planning and learning." Learning from Demonstrations Via Multi-Level and Multi-Attention Domain-Adaptive Meta-Learning,"Ziye Hu, Zhongxue Gan, Wei Li, Weikun Guo, Xiang Gao, Jiwei Zhu","Fudan University,Jihua Lab",Deep Learning and Neural Networks in Robotics,"Despite significant advances in few-shot classification, object detection, or speech recognition in recent years, training an effective robot to adapt to previously unseen environments in a small data regime is still a long-lasting problem for learning from demonstrations (LfD). A promising solution is meta-learning. However, we notice that simply constructing a model with a more complicated and deeper network via previous meta-learning methods does not perform as well as we expected. One possible reason is that the shallow features are gradually lost as the network deepens, while these shallow features play an essential role in the adaptation process of meta-learning. Thus, we present a novel yet effective Multi-Level and Multi-Attention Domain-Adaptive Meta-Learning (MLMA-DAML) framework, which meta-learns multiple visual features via different attention heads to update the model policy. Once the model is updated, our MLMA-DAML predicts robot actions (e.g., positions of end-effectors) via fully connected layers (FCL). As we notice that directly converting visual signals to robot actions via FCL following prior methods is not robust to perform robot manipulation tasks, we further extend our MLMA-DAML to MLMA-DAML++. 
The proposed MLMA-DAML++ learns an effective representation of manipulation tasks via an extra goal prediction network with convolutional layers (CL) to predict more reliable robot actions (represented by feature pixels/grids)." Learning Stable Vector Fields on Lie Groups,"Julen Urain, Davide Tateo, Jan Peters","TU Darmstadt,Technische Universität Darmstadt",Deep Learning and Neural Networks in Robotics,"Learning robot motions from demonstration requires models able to specify vector fields for the full robot pose when the task is defined in operational space. Recent advances in reactive motion generation have shown that learning adaptive, reactive, smooth, and stable vector fields is possible. However, these approaches define vector fields on a flat Euclidean manifold, while representing vector fields for orientations requires modeling the dynamics in non-Euclidean manifolds, such as Lie Groups. In this paper, we present a novel vector field model that can guarantee most of the properties of previous approaches i.e., stability, smoothness, and reactivity beyond the Euclidean space. In the experimental evaluation, we show the performance of our proposed vector field model to learn stable vector fields for full robot poses as SE(2) and SE(3) in both simulated and real robotics tasks." Learning to Play Table Tennis from Scratch Using Muscular Robots,"Dieter Buechler, Simon Guist, Roberto Calandra, Vincent Berenz, Bernhard Schölkopf, Jan Peters","Max Planck Institute for Intelligent Systems Tübingen,Max Planck Institute for Intelligent Systems,Meta AI,Technische Universität Darmstadt",Deep Learning and Neural Networks in Robotics,"Dynamic tasks like table tennis are relatively easy to learn for humans but pose significant challenges to robots. Such tasks require accurate control of fast movements and precise timing in the presence of imprecise state estimation of the flying ball and the robot. 
Reinforcement Learning (RL) has shown promise in learning complex control tasks from data. However, applying step-based RL to dynamic tasks on real systems is safety-critical as RL requires exploring and failing safely for millions of time steps in high-speed and high-acceleration regimes. This paper demonstrates that safe learning of table tennis using model-free Reinforcement Learning can be achieved by using robot arms driven by pneumatic artificial muscles (PAMs). Softness and back-drivability properties of PAMs prevent the system from leaving the safe region of its state space. In this manner, RL empowers the robot to return and smash real balls with 5m/s and 12m/s on average respectively to a desired landing point. Our setup allows the agent to learn this safety-critical task (i) without safety constraints in the algorithm, (ii) while maximizing the speed of returned balls directly in the reward fu" Particle Filters in Latent Space for Robust Deformable Linear Object Tracking,"Yuxuan Yang, Johannes A. Stork, Todor Stoyanov","Örebro University,Orebro University",Deep Learning and Neural Networks in Robotics,"Tracking of deformable linear objects (DLOs) is important for many robotic applications. However, achieving robust and accurate tracking is challenging due to the lack of distinctive features or appearance on the DLO, the object's high-dimensional state space, and the presence of occlusion. In this letter, we propose a method for tracking the state of a DLO by applying a particle filter approach within a lower-dimensional state embedding learned by an autoencoder. The dimensionality reduction preserves state variation, while simultaneously enabling a particle filter to accurately track DLO state evolution with a practically feasible number of particles. Compared to previous works, our method requires neither running a high-fidelity physics simulation, nor manual designs of constraints and regularization. 
Without the assumption of knowing the initial DLO state, our method can achieve accurate tracking even under complex DLO motions and in the presence of severe occlusions. Our implementation is available at https://amm.aass.oru.se/dlo-pf-tracking/." Multi-Scale Interaction for Real-Time LiDAR Data Segmentation on an Embedded Platform,"Shijie Li, Xieyuanli Chen, Yun Liu, Dengxin Dai, Cyrill Stachniss, Juergen Gall","Bonn University,National University of Defense Technology,Agency for Science, Technology and Research (A*STAR),ETH Zurich,University of Bonn",Deep Learning and Neural Networks in Robotics,"Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles and robots, which are usually equipped with an embedded platform and have limited computational resources. Approaches that operate directly on the point cloud use complex spatial aggregation operations, which are very expensive and difficult to deploy on embedded platforms. As an alternative, projection-based methods are more efficient and can run on embedded hardware. However, current projection-based methods either have a low accuracy or require millions of parameters. In this paper, we therefore propose a projection-based method, called Multi-scale Interaction Network (MINet), which is very efficient and accurate. The network uses multiple paths with different scales and balances the computational resources between the scales. Additional dense interactions between the scales avoid redundant computations and make the network highly efficient. The proposed network outperforms point-based, image-based, and projection-based methods in terms of accuracy, number of parameters, and runtime. Moreover, the network processes more than 24 scans per second on an embedded platform, which is higher than the framerates of LiDAR sensors. The network is therefore suitable for robotics applications." 
Stable Neural Adaptive Filters for Teleoperations with Uncertain Delays,"Parham Kebria, Abbas Khosravi, Saeid Nahavandi",Deakin University,Deep Learning and Neural Networks in Robotics,"Uncertainties in communication networks negatively affect the performance and usability of teleoperation systems, especially in time-critical applications such as telesurgery. There already exist different methods to tackle this problem using filtering and learning approaches to smoothly estimate perturbed reference signals. Despite these efforts, the instability issue remains unsolved for such systems under random time-delay perturbations. This study employs and extends one of the best filtering techniques and proposes a new strategy based on the learning capabilities of artificial neural networks to adaptively filter out delay-related disturbances and provide the most stable yet accurate estimation. To achieve this goal, an adaptive learning mechanism is proposed based on a Lyapunov-Krasovskii functional not only to analyse and guarantee the stability of the overall system, but also to prevent the learning algorithm from getting stuck in local optima. The proposed method is experimentally evaluated and compared with other closely similar methods in the recent literature. The results demonstrate the outstanding performance of the proposed solution in this study." Compliant Microgripper Using Soft Polymer Actuator,"Jung-Hwan Youn, Je-Sung Koh, Ki-Uk Kyung","Electronics and Telecommunications Research Institute (ETRI),Ajou University,Korea Advanced Institute of Science & Technology (KAIST)",Soft Robots II,"Miniaturization of robotic grippers enables precise manipulation of small-size objects. However, most microgrippers are actuated by rigid actuators, and thus retain challenges such as micro-fabrication, complex structure, and lack of compliance. Here, we present a compliant microgripper driven by a soft polymer actuator. 
The proposed millimeter-scale soft polymer actuator can produce a linear displacement and output force with a fast operation. Then, we designed the gripper linkage to convert the linear displacement of the actuator into a gripping motion. Fabricated compliant microgripper has a size of 10 × 10 × 10 mm and a weight of 0.36 g, with a maximum gripping width of 8 mm. Demonstration of the gripper shows the feasibility of gripping various sub-millimeter scale objects regardless of their shape owing to its compliance." Development of Hydraulically-Driven Soft Hand for Handling Heavy Vegetables and Its Experimental Evaluation,"Osamu Azami, Kyosuke Ishibashi, Mitsuo Komagata, Ko Yamamoto","Tokyo University,The University of Tokyo,University of Tokyo",Soft Robots II,"In this study, we develop a hydraulically-driven soft robotic hand for handling heavy vegetables in a vegetable factory and report its experimental validations. The working population in agriculture is decreasing worldwide, creating a lot of demands for the robotic automation in harvest and transportation of agricultural produces. In particular, a vegetable factory deals with large and heavy vegetables, e.g., cabbages, with 2--3 kg weight and 20--30 cm diameter. A soft robot hand is suitable for handling a food or vegetable; however, most of existing soft robot hands cannot generate necessary output because they are usually actuated by the air-pressure. Therefore, we employ the hydraulic actuation for our soft hand to generate 1 or 2 MPa pressure. Using the developed soft hand, we report experimental validations including basic control performance evaluation and grasping experiments assuming a vegetable factory environment." 
Two-Stage Grasping: A New Bin Picking Framework for Small Objects,"Hanwen Cao, Jianshu Zhou, Yichuan Li, Rui Cao, Qi Dou, Yunhui Liu","The Chinese University of Hong Kong,Chinese University of Hong Kong",Soft Robots II,"This paper proposes a novel bin picking framework, two-stage grasping, aiming at precise grasping of cluttered small objects. Object density estimation and rough grasping are conducted in the first stage. Fine segmentation, detection, grasping, and pushing are performed in the second stage. A small object bin picking system has been realized to exhibit the concept of two-stage grasping. Experiments have shown the effectiveness of the proposed framework. Unlike traditional bin picking methods focusing on vision-based grasping planning using classic frameworks, the challenges of picking cluttered small objects can be solved by the proposed new framework with simple vision detection and planning." Electroadhesive Auxetics As Programmable Layer Jamming Skins for Formable Crust Shape Displays,"Ahad Rauf, John Settimio Bernardo, Sean Follmer",Stanford University,Soft Robots II,"Shape displays are a class of haptic devices that enable whole-hand haptic exploration of 3D surfaces. However, their scalability is limited by the mechanical complexity and high cost of traditional actuator arrays. In this paper, we propose using electroadhesive auxetic skins as a strain-limiting layer to create programmable shape change in a continuous (""formable crust"") shape display. Auxetic skins are manufactured as flexible printed circuit boards with dielectric-laminated electrodes on each auxetic unit cell (AUC), using monolithic fabrication to lower cost and assembly time. By layering multiple sheets and applying a voltage between electrodes on subsequent layers, electroadhesion locks individual AUCs, achieving a maximum in-plane stiffness variation of 7.6x with a power consumption of 50 uW/AUC. 
We first characterize an individual AUC and compare results to a kinematic model. We then validate the ability of a 5x5 AUC array to actively modify its own axial and transverse stiffness. Finally, we demonstrate this array in a continuous shape display as a strain-limiting skin to programmatically modulate the shape output of an inflatable LDPE pouch. Integrating electroadhesion with auxetics enables new capabilities for scalable, low-profile, and low-power control of flexible robotic systems." Navigating Soft Robots through Wireless Heating,"Yiwen Song, Mason Zadan, Kushaan Misra, Zefang Li, Jingxian Wang, Carmel Majidi, Swarun Kumar","Carnegie Mellon University,Microsoft & National University of Singapore",Soft Robots II,"Recent work on battery-free soft robotics has demonstrated the use of liquid crystal elastomers (LCE) to build shape-changing materials activated by applied external heat. However, sources of heat must typically be in direct field-of-view of the robot (i.e. NIR, laser, and visual light EM sources or convective heats guns), be tethered to an external power supply (i.e. thermoelectric heating or resistive joule heaters), or require a heavy on-board battery that limits mobility and range. This paper presents a novel battery-free soft-robotics platform that can crawl through confined, enclosed, and hard-to-reach spaces (e.g. packages, machinery, pipes, etc.), hidden from view of heating infrastructure. This is achieved through the co-design of a soft robotics platform and integrated soft conductive traces that enable wireless (microwave) heating through remote stimulation. We achieve fast actuation through a careful choice of materials and the overall mechanical structure of the robot to maximize heating efficiency. Further, the robot is actively tracked through enclosed spaces using a mmWave radar to direct heat to its location. We provide a detailed evaluation on the robot's heating efficiency, location-tracking accuracy and crawling speed." 
Fast Untethered Soft Robotic Crawler with Elastic Instability,"Zechen Xiong, Yufeng Su, Hod Lipson","Columbia University,Columbia university",Soft Robots II,"Enlightened by the fast-running gait of mammals like cheetahs and wolves, we design and fabricate a single-actuated untethered compliant robot that is capable of galloping at a speed of 313 mm/s or 1.56 body lengths per second (BL/s), faster than most reported soft crawlers in mm/s and BL/s. An in-plane prestressed hair clip mechanism (HCM) made up of semi-rigid materials, i.e. plastics, is used as the supporting chassis, the compliant spine, and the force amplifier of the robot at the same time, enabling the robot to be simple, rapid, and strong. With experiments, we find that the HCM robotic locomotion speed is linearly related to actuation frequencies and substrate friction differences except for concrete surface, that tethering slows down the crawler, and that asymmetric actuation creates a new galloping gait. This paper demonstrates the potential of HCM-based soft robots." An Underwater Jet-Propulsion Soft Robot with High Flexibility Driven by Water Hydraulics,"Siqing Chen, He Xu, Xiong Xiao, Ben Lu","Harbin Engineering University,College of Mechanical and Electrical Engineering, Harbin Enginee,Institute of Automation, Chinese Academy of Sciences",Soft Robots II,"Compared with rigid robots, soft robots have the advantages of inherent compliance, high adaptability, and impact tolerance. Many researchers are very interested in the motion design of soft robots underwater. In this paper, inspired by the method of octopus propulsion, a jet propulsion unit with 80% soft materials driven by pressure is designed. It can change the volume of its cavity to absorb and eject the fluid medium to make the robot move. According to the working characteristics of the jet unit, corresponding experiments are designed to analyze its force output, deformation, ejection flow, and pressure response characteristics. 
In order to expand the motion space of the robot, a buoyancy unit is designed to control the depth of the robot in the water. Three jet units and a buoyancy element are combined into a tetrahedron robot - jet soft robot (JSR). The feasibility of its motion is verified by experiments. Compared with other similar jet robots, the biggest feature of this robot is that the drive unit can bend or twist roughly along the centerline, which can prevent accidental collision and damage." Force/Torque Sensing for Soft Grippers Using an External Camera,"Jeremy Collins, Patrick Grady, Charlie Kemp",Georgia Institute of Technology,Soft Robots II,"Robotic manipulation can benefit from wrist-mounted force/torque (F/T) sensors, but conventional F/T sensors can be expensive, difficult to install, and damaged by high loads. We present Visual Force/Torque Sensing (VFTS), a method that visually estimates the 6-axis F/T measurement that would be reported by a conventional F/T sensor. In contrast to approaches that sense loads using internal cameras placed behind soft exterior surfaces, our approach uses an external camera with a fisheye lens that observes a soft gripper. VFTS includes a deep learning model that takes a single RGB image as input and outputs a 6-axis F/T estimate. We trained the model with sensor data collected while teleoperating a robot (Stretch RE1 from Hello Robot Inc.) to perform manipulation tasks. VFTS outperformed F/T estimates based on motor currents, generalized to a novel home environment, and supported three autonomous tasks relevant to healthcare: grasping a blanket, pulling a blanket over a manikin, and cleaning a manikin's limbs. VFTS also performed well with a manually operated pneumatic gripper. Overall, our results suggest that an external camera observing a soft gripper can perform useful visual force/torque sensing for a variety of manipulation tasks." 
Data-Driven Spectral Submanifold Reduction for Nonlinear Optimal Control of High-Dimensional Robots,"John Irvin Alora, Mattia Cenedese, Edward Schmerling, George Haller, Marco Pavone","Stanford University,ETH Zürich,ETH Zurich",Modelling and Control,"Modeling and control of high-dimensional, nonlinear robotic systems remains a challenging task. While various model- and learning-based approaches have been proposed to address these challenges, they broadly lack generalizability to different control tasks and rarely preserve the structure of the dynamics. In this work, we propose a new, data-driven approach for extracting low-dimensional models from data using Spectral Submanifold Reduction (SSMR). In contrast to other data-driven methods which fit dynamical models to training trajectories, we identify the dynamics on generic, low-dimensional attractors embedded in the full phase space of the robotic system. This allows us to obtain computationally-tractable models for control which preserve the system's dominant dynamics and better track trajectories radically different from the training data. We demonstrate the superior performance and generalizability of SSMR in dynamic trajectory tracking tasks vis-a-vis the state of the art, including Koopman operator-based approaches." Control of Shape Memory Alloy Actuator Via Electrostatic Capacitive Sensor for Meso-Scale Mirror Tilting System,"Baekgyeom Kim, Doohoe Lee, Dongjin Kim, Seungyong Han, Daeshik Kang, Uikyum Kim, Je-Sung Koh",Ajou University,Modelling and Control,"Shape memory alloy (SMA) has superior actuation capability over the limit of the scale. However, inherently low controllability is a primary issue that hinders practical usage. To address this challenge, this paper presents an SMA-based artificial muscle actuator capable of self-displacement sensing through the capacitive sensor. To realize sensing capability, the theoretical model-based design and fabrication process are proposed. 
Here, we show that the actuator can be controlled at intervals of 100 µm as well as maintaining sensing capability while lifting 90 times heavier than its weight. To exhibit the usefulness of the actuator to an optical device, we integrate the actuator into the mirror tilting device, which has 20 degrees tilting angle. We expect that the proposed actuator can overcome the scale limit of meso-scale devices, which require payload capacity and controllability, simultaneously." Data-Efficient Non-Parametric Modelling and Control of an Extensible Soft Manipulator,"Mohammadreza Kasaei, Keyhan Kouhkiloui Babarahmati, Zhibin Li, Mohsen Khadem","University of Edinburgh,University College London",Modelling and Control,"Data-driven approaches have shown promising results in modeling and controlling robots, specifically soft and flexible robots where developing physics-based models are more challenging. However, these methods often require a large number of real data, and gathering such data is time-consuming and can damage the robot as well. This paper proposed a novel data-efficient and non-parametric approach to develop a continuous model using a small dataset of real robot demonstrations (only 25 points). To the best of our knowledge, the proposed approach is the most sample-efficient method for soft continuum robot. Furthermore, we employed this model to develop a controller to track arbitrary trajectories in the feasible kinematic space. To show the performance of the proposed approach, a set of trajectory-tracking experiments has been conducted. The results showed that the robot was able to track the references precisely even in presence of external loads (up to 25 grams). Moreover, fine object manipulation experiments were performed to demonstrate the effectiveness of the proposed method in real-world tasks. Finally, we compared its performance with common data-driven approaches in seen/unseen-before trajectory tracking scenarios. 
The results validated that the proposed approach significantly outperformed the existing approaches in unseen-before scenarios and offered similar performance in seen-before scenarios." Analytical Approach to Inverse Kinematics of Single Section Mobile Continuum Manipulators,"Audrey Hyacinthe Bouyom Boutchouang, Achille Melingui, Joseph Jean-baptiste Mvogo Ahanda, Xinrui Yang, Othman Lakhal, Frederic Biya Motto, Rochdi Merzouki","University of Yaounde I,Higher Technical Teacher Training collage, University of Bame,University of Lille,University Lille, CRIStAL, CNRS-UMR ,,,,,CRIStAL, CNRS UMR ,,,,, University of Lille,",Modelling and Control,"This paper proposes a novel mathematical solution to solve the inverse kinematics (IK) of single section mobile continuum manipulators (SSMCMs). Thus, to achieve a given pose of the end-effector (EE), the proposed mathematical solution consists in determining the position and orientation parameters of the mobile platform and of a single section of the continuum manipulator. As advantages, the proposed mathematical solution eliminates the EE pose errors when the dynamic parameters are neglected and the continuum manipulator is cylindrical in shape. A simulation and an experiment validate the proposed approach." A Fast Geometric Framework for Dynamic Cosserat Rods with Discrete Actuated Joints,"Hossain Samei, Robin Chhabra",Carleton University,Modelling and Control,"Current dynamical models of Cosserat rods often use the finite element method limited by computational efficiency or the finite difference method in a Cartesian framework with a compromise to accuracy. We employ the finite difference method in a geometric framework to develop solutions that are both computationally efficient and accurate. A numerical study is conducted on various backward-differentiation discretization and Runge-Kutta-Munthe-Kaas integration schemes, focusing on their accuracy and computational efficiency. 
Case studies are conducted on a single-degree-of-freedom joint actuated Cosserat rod to mitigate additional sources of undesired error from the numerical analysis, e.g. multi-body interactions, moving base dynamics, etc. The proposed geometric integrators are demonstrated to improve solution accuracy compared to the published finite difference models. The presented solution is parameterization-free and also computationally efficient with the potential for use in real-time applications, e.g., model-based control of soft manipulators." Data-Driven Estimation of Forces Along the Backbone of Concentric Tube Continuum Robots,"Heiko Donat, Pouya Mohammadi, Jochen Steil",Technische Universität Braunschweig,Modelling and Control,"Concentric tube continuum robots (CTCRs) belong to the family of continuum robots with applications in minimally invasive surgeries. Because of this application domain, measuring the external forces along the body of the robot is paramount. CTCRs are made up of thin elastic rods and are intended to be applied inside the human body, where conventional sensor-based measurements are not feasible. Consequently, research is resorting to estimate the forces through geometric, numeric, or optimization methods. However, these methods often suffer from slow convergence. In this paper, we introduce a novel data-driven approach for estimating contact forces along the body of a CTCR that offers an estimation precision comparable to the current state-of-the-art optimization-based approaches, but exhibits nearly two orders of magnitude faster convergence. The proposed method is scalable and exhibits a significant performance in response to a wide range of external forces. The approach was evaluated in simulations and on a real 2-tube CTCR." 
Bootstrapping the Dynamic Gait Controller of the Soft Robot Arm,"Rudolf Szadkowski, Muhammad Sunny Nazeer, Matteo Cianchetti, Egidio Falotico, Jan Faigl","Czech Technical University in Prague,The BioRobotics Institute, Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna",Modelling and Control,"In this paper, we propose a novel dynamic gait controller for the repetitive behavior of soft robot manipulators performing routine tasks. Compliance with soft robots is advantageous when the robot interacts with living organisms and other fragile objects. However, predicting and controlling repetitive behavior is challenging because of hysteresis and non-linear dynamics governing the interactions. Existing prior-free methods track the dynamic state using recurrent neural networks or rely on known generalized coordinates describing the robot's state. We propose to model the interaction induced by the repetitive behavior as gait dynamics and represent the dynamic state with Central Pattern Generator (CPG) tracking the motion phase and thus reduce the complexity of the robot's forward model. The proposed method bootstraps an ensemble of the forward models exploring multiple dynamic contexts that are expanded as it searches for repetitive motion producing the target repetitive behavior. The proposed approach is experimentally validated on a pneumatically actuated soft robot arm I-Support, where the method infers gaits for different targets." Model Based Position Control of Soft Hydraulic Actuators,"Mark Runciman, Enrico Franco, James Avery, Ferdinando Rodriguez Y Baena, George Mylonas","Imperial College London,Imperial College, London, UK",Modelling and Control,"In this article, we investigate the model based position control of soft hydraulic actuators arranged in an antagonistic pair. A dynamical model of the system is constructed by employing the port-Hamiltonian formulation. 
A control algorithm is designed with an energy shaping approach, which accounts for the pressure dynamics of the fluid. A nonlinear observer is included to compensate the effect of unknown external forces. Simulations demonstrate the effectiveness of the proposed approach, and experiments achieve positioning accuracy of 0.043 mm with a standard deviation of 0.033 mm in the presence of constant external forces up to 1 N." Multiple Surgical Instruments Tracking-By-Prediction with Graph Hierarchy,"Rui Guo, Xi Liu, Ziheng Wang, Tony Jarc",Intuitive Surgical,Medical Imaging and Perception I,"Current research strive has tremendously changed the horizon of computer vision tasks in multiple agents tracking. Nevertheless, in the research of robotic assisted surgery, reliable surgical instrument tracking imposes challenge due to the high complexity in state modeling for the hierarchical structure of the instrument versus de-coupling the spatial-temporal correlations naturally embedded in the task. In this paper, we present a new tracking paradigm integrating the trajectory prediction to reduce the data association error that is propagated from the false detection. As a key component in the system, a proposed predictor disentangles the hierarchical modeling and agent kinematic learning by introducing inductive attention mechanism in spatial-temporal graph network. Experiments on real anatomical datasets show that our tracking-by-prediction scheme improves overall localization accuracy over the frames by up to 81%, in comparison to the generic pipelines of tracking, even with transductive graph representation learning, with a large margin of gain in terms of precise localization." 
Fully Robotized 3D Ultrasound Image Acquisition for Artery,"Mingcong Chen, Yuanrui Huang, Jian Chen, Tongxi Zhou, Jiuan Chen, Hongbin Liu","Institute of Automation Chinese Academy of Sciences,University of Chinese Academy of Sciences,Institute of Automation, Chinese Academy of Sciences,institute of Automation, Chinese Academy of Sciences,Institute of Automation,Chinese Academy of Sciences",Medical Imaging and Perception I,"Current imaging of the artery relies primarily on computed tomography angiography (CTA), which requires contrast injections and exposure to radiation. In this paper, we present a method for fully autonomous artery 3D image acquisition using a linear ultrasound (US) probe and a 6 DoFs robot arm with a 3D camera. Robotic vessel acquisition can minimize tissue deformation and permit the reproduction of scans. Additionally, the robotic-based acquisition can provide more precise vessel position data that can be utilized for 3D reconstruction as a preoperative image. The first scanning point is determined by the 3D camera using a neural network for leg area estimation. A visual servo algorithm adjusts the in-plane motions using a cross-sectional vessel segmentation produced by a neural network with a UNet structure, while a US confidence map regulates the in-plane rotation. The robot is equipped with impedance control to maintain a constant and safe scan. Experiments on a leg phantom and a volunteer indicate that the robot can follow the vessel and modify its position to provide a sharper US image. The average errors of phantom scanning in the y-axis and z-axis are 0.2536 mm and 0.2928 mm, respectively, while the root mean square error (RMSE) of contact force in the volunteer experiment is 0.2664 N. In addition, a 3D vessel reconstruction demonstrates the possibility of substituting CTA with robotic US acquisition as a preoperative image." 
Depth Estimation for Oral Cavity by Shape from Shading with Endoscope,"Xi Wu, Gangtie Zheng",Tsinghua University,Medical Imaging and Perception I,"The tracheal intubation for patients with respiratory infectious diseases requires doctors to wear a full set of protective clothing, which takes a certain time. How to protect doctors from infection when facing an emergency operation has become an important issue. The intubation robot may solve this contradiction. In order to provide visual information for real-time path planning for robotic intubation, this study recovers depth information of the oral environment using a low-cost and widely used endoscope. Since the oral cavity is small with less texture, Shape from Shading (SFS) method may be a good choice for oral depth estimation. This paper proposes the ""oral elbow"" hypothesis, filters out outliers caused by saliva, calculates a 3-D contour map, and highlights the contour map features from different views. This work expands the application scenarios of depth estimation and can provide depth information for the visual navigation of the intubation surgical robot." Dynamic Interactive Relation Capturing Via Scene Graph Learning for Robotic Surgical Report Generation,"Hongqiu Wang, Yueming Jin, Lei Zhu","Hong Kong University of Science and Technology (Guangzhou),University College London,The Hong Kong University of Science and Technology (Guangzhou)",Medical Imaging and Perception I,"For robot-assisted surgery, an accurate surgical report reflects clinical operations during surgery and helps document entry tasks, post-operative analysis and follow-up treatment. It is a challenging task due to many complex and diverse interactions between instruments and tissues in the surgical scene. 
Although existing surgical report generation methods based on deep learning have achieved large success, they often ignore the interactive relation between tissues and instrumental tools, thereby degrading the report generation performance. This paper presents a neural network to boost surgical report generation by explicitly exploring the interactive relation between tissues and surgical instruments. To do so, we first devise a relational exploration (RE) module to model the interactive relation via graph learning, and an interaction perception (IP) module to assist the graph learning in RE module. In our IP module, we first devise a node tracking system to identify and append missing graph nodes of the current video frame for constructing graphs at RE module. Moreover, the IP module generates a global attention model to indicate the existence of the interactive relation on the whole scene of the current video frame to eliminate the graph learning at the current video frame. Furthermore, our IP module predicts a local attention model to more accurately identify the interaction relation of each graph node for assisting the graph updating at the RE module. After that, we concatenate features of all graph nodes of RE module and pass concatenated features into a transformer for generating the output surgical report. We validate the effectiveness of our method on a widely-used robotic surgery benchmark dataset, and experimental results show that our network can significantly outperform existing state-of-the-art surgical report generation methods (e.g., 7.48% and 5.43% higher for BLEU-1 and ROUGE)." 
Reslicing Ultrasound Images for Data Augmentation and Vessel Reconstruction,"Cecilia Morales, Jason Yao, Tejas Rane, Robert Edman, Howie Choset, Artur Dubrawski","Carnege Mellon University,Carnegie Mellon University",Medical Imaging and Perception I,"Robot-guided vascular access has the potential to deliver crucial medical care in situations where medical personnel are unavailable. However, this technique requires accurate and reliable segmentation of anatomical landmarks in the body. For the ultrasound imaging modality, obtaining large amounts of training data for a segmentation model is time-consuming and expensive. This paper introduces RESUS (RESlicing of UltraSound Images), a weak supervision data augmentation technique for ultrasound images based on slicing reconstructed 3D volumes from tracked 2D images. This technique allows us to generate views which cannot be easily obtained due to physical constraints of ultrasound imaging, and use these augmented ultrasound images to train a semantic segmentation model. We demonstrate that RESUS achieves statistically significant improvement over training with non-augmented images and highlight qualitative improvements through vessel reconstruction." Expert-Agnostic Ultrasound Image Quality Assessment Using Deep Variational Clustering,"Deepak Raina, Dimitrios Ntentia, Sh Chandrashekhara, Richard Voyles, Subir Kumar Saha","Indian Institute of Technology Delhi and Purdue University USA,Purdue university,All India Insititute of Medical Sciences, New Delhi,Purdue University,Indain Institute of Technology Delhi",Medical Imaging and Perception I,"Ultrasound imaging is a commonly used modality for several diagnostic and therapeutic procedures. However, the diagnosis by ultrasound relies heavily on the quality of images assessed manually by sonographers, which diminishes the objectivity of the diagnosis and makes it operator-dependent. 
The supervised learning-based methods for automated quality assessment require manually annotated datasets, which are highly labour-intensive to acquire. These ultrasound images are low in quality and suffer from noisy annotations caused by inter-observer perceptual variations, which hampers learning efficiency. We propose an UnSupervised UltraSound image Quality assessment Network, US2QNet, that eliminates the burden and uncertainty of manual annotations. US2QNet uses the variational autoencoder embedded with the three modules, pre-processing, clustering and post-processing, to jointly enhance, extract, cluster and visualize the quality feature representation of ultrasound images. The pre-processing module uses filtering of images to point the network's attention towards salient quality features, rather than getting distracted by noise. Post-processing is proposed for visualizing the clusters of feature representations in 2D space. We validated the proposed framework for quality assessment of the urinary bladder ultrasound images. The proposed framework achieved 78% accuracy and superior performance to state-of-the-art clustering methods. The project page with source codes is available at https://sites.google.com/view/us2qnet." A Curvature and Trajectory Optimization-Based 3D Surface Reconstruction Pipeline for Ultrasound Trajectory Generation,"Ananya Bal, Ashutosh Gupta, Fnu Abhimanyu, John Galeotti, Howie Choset","Carnegie Mellon University,BITS Pilani KK Birla Goa campus",Medical Imaging and Perception I,"Ultrasound scanning is an efficient imaging modality preferred for quick medical procedures. However, due to the lack of skilled sonographers, researchers have developed many Robotic Ultrasound System (RUS) prototypes for various procedures. Most of these systems have a human-in-the-loop and require an expert to point the robot to the region of the subject to be scanned. 
Only a few systems try to incorporate some knowledge from the exterior shape of the subject for ultrasound scanning. Accurate 3D surface reconstruction of a patient’s exterior can enable an RUS to perceive subjects more like a clinician would. It can help localize the subject for the robot while eliminating input from an expert. Ultrasound scanning trajectories can be better planned if the RUS first detects critical regions on the surface of the subject and corresponding curvatures. We use an RGB-D sensor to acquire point clouds representing the 3D surface of the subject, which in the present work is for a lower-torso leg phantom. A consolidated pipeline for creating an optimized 3D surface reconstruction of a subject is presented and is used to autonomously identify a region of interest for scanning femoral vessels with an ultrasound probe. To make our system more robust to inter-subject variations in shape and size, we incorporate a trajectory optimization module of the RUS-mounted RGB-D sensor. To this end, we introduce a comprehensive evaluation score to quantify the quality of point cloud reconstructions. The resulting improvements in 3D surface scanning and reconstruction enable near-automation in generating ultrasound scanning trajectories for femoral vessels. Our pipeline produces ultrasound images with an average ZNCC score of 0.86 and our 3D point cloud reconstructions are accurate up to 1e-5 m from a ground-truth high-resolution CT scan." Graph-Based Pose Estimation of Texture-Less Surgical Tools for Autonomous Robot Control,"HAOZHENG XU, Mark Runciman, João Cartucho, Chi Xu, Stamatia Giannarou","Imperial college london,Imperial College London",Medical Imaging and Perception I,"In Robot-assisted Minimally Invasive Surgery (RMIS), the estimation of the pose of surgical tools is crucial for applications such as surgical navigation, visual servoing, autonomous robotic task execution and augmented reality. 
A plethora of hardware-based and vision-based methods have been proposed in the literature. However, direct application of these methods to RMIS has significant limitations due to partial tool visibility, occlusions and changes in the surgical scene. In this work, a novel keypoint-graph-based network is proposed to estimate the pose of texture-less cylindrical surgical tools of small diameter. To deal with the challenges in RMIS, keypoint object representation is used and for the first time, temporal information is combined with spatial information in keypoint graph representation, for keypoint refinement. Finally, stable and accurate tool pose is computed using a PnP solver. Our performance evaluation study has shown that the proposed method is able to accurately predict the pose of a textureless robotic shaft with an ADD-S score of over 98%. The method outperforms state-of-the-art pose estimation models under challenging conditions such as object occlusion and changes in the lighting of the scene." Adaptive Sampling-Based Particle Filter for Visual-Inertial Gimbal in the Wild,"Xueyang Kang, Ariel Herrera, Henry Lema, Esteban Valencia, Patrick Vandewalle","KU Leuven,Escuela Politécnica Nacional,Escuela Politecnica Nacional",Sensor Fusion II,"In this paper, we present a Computer Vision (CV) based tracking and fusion algorithm, dedicated to a 3D printed gimbal system on drones flying in nature. The whole gimbal system can stabilize the camera orientation robustly in challenging environments by using skyline and ground plane as references. 
Our main contributions are the following: a) a light-weight Resnet-18 backbone network model was trained from scratch, and deployed onto the Jetson Nano platform to segment the image specifically into binary parts (ground and sky); b) our geometry assumption from the skyline and ground cues delivers the potential for robust visual tracking in the wild by using the skyline and ground plane as references; c) a manifold surface-based adaptive particle sampling can fuse orientation from multiple sensor sources flexibly. The whole algorithm pipeline is tested on our 3D-printed gimbal module with Jetson Nano. The experiments were performed on top of a building in a real landscape." DAMS-LIO: A Degeneration-Aware and Modular Sensor-Fusion LiDAR-Inertial Odometry,"Fuzhang Han, Han Zheng, Wenjun Huang, Rong Xiong, Yue Wang, Yanmei Jiao","Zhejiang University,Hangzhou Normal University",Sensor Fusion II,"With robots being deployed in increasingly complex environments like underground mines and planetary surfaces, the multi-sensor fusion method has gained more and more attention as a promising solution to state estimation in such scenes. The fusion scheme is a central component of these methods. In this paper, a light-weight iEKF-based LiDAR-inertial odometry system is presented, which utilizes a degeneration-aware and modular sensor-fusion pipeline that takes both LiDAR points and relative pose from another odometry as the measurement in the update process only when degeneration is detected. Both the Cramer-Rao Lower Bound (CRLB) theory and simulation test are used to demonstrate the higher accuracy of our method compared to methods using a single observation. Furthermore, the proposed system is evaluated in perceptually challenging datasets against various state-of-the-art sensor-fusion methods. The results show that the proposed system achieves real-time and high estimation accuracy performance despite the challenging environment and poor observations." 
ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions,"Anjun Chen, Xiangyu Wang, Kun Shi, Shaohao Zhu, Bin Fang, Yingfeng Chen, Jiming Chen, Yuchi Huo, Qi Ye","Zhejiang University,Tsinghua university,Netease Inc",Sensor Fusion II,"3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather. Complementary, mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather. However, combining RGB and mmWave signals for robust all-weather 3D human reconstruction is still an open challenge, given the sparse nature of mmWave and the vulnerability of RGB images. In this paper, we present ImmFusion, the first mmWave-RGB fusion solution to reconstruct 3D human bodies in all weather conditions robustly. Specifically, our ImmFusion consists of image and point backbones for token feature extraction and a Transformer module for token fusion. The image and point backbones refine global and local features from original data, and the Fusion Transformer Module aims for effective information fusion of two modalities by dynamically selecting informative tokens. Extensive experiments on a large-scale dataset, mmBody, captured in various environments demonstrate that ImmFusion can efficiently utilize the information of two modalities to achieve a robust 3D human body reconstruction in all weather conditions. In addition, our method's accuracy is significantly superior to that of state-of-the-art Transformer-based LiDAR-camera fusion methods." 
Simple-BEV: What Really Matters for Multi-Sensor BEV Perception?,"Adam Harley, Zhaoyuan Fang, Jie Li, Rares Ambrus, Aikaterini Fragkiadaki","Stanford University,Carnegie Mellon University,Toyota Research Institute",Sensor Fusion II,"Building 3D perception systems for autonomous vehicles that do not rely on high-density LiDAR is a critical research problem because of the expense of LiDAR systems compared to cameras and other sensors. Recent research has developed a variety of camera-only methods, where features are differentiably ""lifted"" from the multi-camera images onto the 2D ground plane, yielding a ""bird's eye view"" (BEV) feature representation of the 3D space around the vehicle. This line of work has produced a variety of novel ""lifting"" methods, but we observe that other details in the training setups have shifted at the same time, making it unclear what really matters in top-performing methods. We also observe that using cameras alone is not a real-world constraint, considering that additional sensors like radar have been integrated into real vehicles for years already. In this paper, we first of all attempt to elucidate the high-impact factors in the design and training protocol of BEV perception models. We find that batch size and input resolution greatly affect performance, while lifting strategies have a more modest effect---even a simple parameter-free lifter works well. Second, we demonstrate that radar data can provide a substantial boost to performance, helping to close the gap between camera-only and LiDAR-enabled systems. We analyze the radar usage details that lead to good performance, and invite the community to re-consider this commonly-neglected part of the sensor platform." 
MVFusion: Multi-View 3D Object Detection with Semantic-Aligned Radar and Camera Fusion,"Zizhang Wu, Guilian Chen, Yuanzhu Gan, Wang Robin, Jian Pu","Zongmu Technology,Fudan University",Sensor Fusion II,"Multi-view radar-camera fused 3D object detection provides a farther detection range and more helpful features for autonomous driving, especially under adverse weather. The current radar-camera fusion methods deliver kinds of designs to fuse radar information with camera data. However, these fusion approaches usually adopt the straightforward concatenation operation between multi-modal features, which ignores the semantic alignment with radar features and sufficient correlations across modals. In this paper, we present MVFusion, a novel Multi-View radar-camera Fusion method to achieve semantic-aligned radar features and enhance the cross-modal information interaction. To achieve so, we inject the semantic alignment into the radar features via the semantic-aligned radar encoder (SARE) to produce image-guided radar features. Then, we propose the radar-guided fusion transformer (RGFT) to fuse our radar and image features to strengthen the two modals' correlation from the global scope via the cross-attention mechanism. Extensive experiments show that MVFusion achieves state-of-the-art performance (51.7% NDS and 45.3% mAP) on the nuScenes dataset. We shall release our code and trained networks upon publication." BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation,"Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela Rus, Song Han","MIT,Massachusetts Institute of Technology,Shanghai Jiao Tong University,OmniML",Sensor Fusion II,"Multi-sensor fusion is essential for an accurate and reliable autonomous driving system. Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with camera features. 
However, the camera-to-LiDAR projection throws away the semantic density of camera features, hindering the effectiveness of such methods, especially for semantic-oriented tasks (such as 3D scene segmentation). In this paper, we propose BEVFusion, an efficient and generic multi-task multi-sensor fusion framework. It unifies multi-modal features in the shared bird’s-eye view (BEV) representation space, which nicely preserves both geometric and semantic information. To achieve this, we diagnose and lift key efficiency bottlenecks in the view transformation with optimized BEV pooling, reducing latency by more than 40x. BEVFusion is fundamentally task-agnostic and seamlessly supports different 3D perception tasks with almost no architectural changes. It establishes the new state of the art on the nuScenes benchmark, achieving 1.3% higher mAP and NDS on 3D object detection and 13.6% higher mIoU on BEV map segmentation, with 1.9x lower computation cost. Code to reproduce our results is available at https://github.com/mit-han-lab/bevfusion." Fusing Event-Based Camera and Radar for SLAM Using Spiking Neural Networks with Continual STDP Learning,"Ali Safa, Tim Verbelen, Ilja Ocket, André Bourdoux, Hichem Sahli, Catthoor Francky, Georges Gielen","KU Leuven - IMEC,Ghent University - imec,imec - KU Leuven,imec,Vrije Universiteit Brussel",Sensor Fusion II,"This work proposes a first-of-its-kind SLAM architecture fusing an event-based camera and a Frequency Modulated Continuous Wave (FMCW) radar for drone navigation. Each sensor is processed by a bio-inspired Spiking Neural Network (SNN) with continual Spike-Timing-Dependent Plasticity (STDP) learning, as observed in the brain. In contrast to most learning-based SLAM systems, our method does not require an offline training phase but rather, the SNN continuously learns features from the input data on the fly via STDP. At the same time, the SNN outputs are used as feature descriptors for loop closure detection and map correction. 
We conduct numerous experiments to benchmark our system against state-of-the-art RGB methods and we demonstrate the robustness of our DVS-Radar SLAM approach against strong lighting variations." AI-Based Multi-Object Relative State Estimation with Self-Calibration Capabilities,"Thomas Jantos, Christian Brommer, Eren Allak, Stephan Weiss, Jan Steinbrener","University of Klagenfurt,Universität Klagenfurt",Sensor Fusion II,"The capability to extract task specific, semantic information from raw sensory data is a crucial requirement for many applications of mobile robotics. Autonomous inspection of critical infrastructure with Unmanned Aerial Vehicles (UAVs), for example, requires precise navigation relative to the structure that is to be inspected. Recently, Artificial Intelligence (AI)-based methods have been shown to excel at extracting semantic information such as 6 degree-of-freedom (6-DoF) poses of objects from images. In this paper, we propose a method combining a state-of-the-art AI-based pose estimator for objects in camera images with data from an inertial measurement unit (IMU) for 6-DoF multi-object relative state estimation of a mobile robot. The AI-based pose estimator detects multiple objects of interest in camera images along with their relative poses. These measurements are fused with IMU data in a state-of-the-art sensor fusion framework. We illustrate the feasibility of our proposed method with real world experiments for different trajectories and number of arbitrarily placed objects. We show that the results can be reliably reproduced due to the self-calibrating capabilities of our approach." Are All Point Clouds Suitable for Completion? 
Weakly Supervised Quality Evaluation Network for Point Cloud Completion,"Jieqi Shi, Peiliang Li, Xiaozhi Chen, Shaojie Shen","Hong Kong University of Technology and Science,HKUST, Robotics Institute,DJI,Hong Kong University of Science and Technology",Point Clouds,"In the practical application of point cloud completion tasks, real data quality is usually much worse than the CAD datasets used for training. A small amount of noisy data will usually significantly impact the overall system’s accuracy. In this paper, we propose a quality evaluation network to score the point clouds and help judge the quality of the point cloud before applying the completion model. We believe our scoring method can help researchers select more appropriate point clouds for subsequent completion and reconstruction and avoid manual parameter adjustment. Moreover, our evaluation model is fast and straightforward and can be directly inserted into any model’s training or use process to facilitate the automatic selection and post-processing of point clouds. We propose a complete dataset construction and model evaluation method based on ShapeNet. We verify our network using detection and flow estimation tasks on KITTI, a real-world dataset for autonomous driving. The experimental results show that our model can effectively distinguish the quality of point clouds and help in practical tasks." From Semi-Supervised to Omni-Supervised Room Layout Estimation Using Point Clouds,"Huan-ang Gao, Beiwen Tian, Pengfei Li, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Yurong Chen, Hongbin Zha","Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University,Intel,Peking University",Point Clouds,"Room layout estimation is a long-existing robotic vision task that benefits both environment sensing and motion planning. However, layout estimation using point clouds (PCs) still suffers from data scarcity due to annotation difficulty. 
As such, we address the semi-supervised setting of this task based upon the idea of model exponential moving averaging. But adapting this scheme to the state-of-the-art (SOTA) solution for PC-based layout estimation is not straightforward. To this end, we define a quad set matching strategy and several consistency losses based upon metrics tailored for layout quads. Besides, we propose a new online pseudo-label harvesting algorithm that decomposes the distribution of a hybrid distance measure between quads and PC into two components. This technique does not need manual threshold selection and intuitively encourages quads to align with reliable layout points. Surprisingly, this framework also works for the fully-supervised setting, achieving a new SOTA on the ScanNet benchmark. Last but not least, we also push the semi-supervised setting to the realistic omni-supervised setting, demonstrating significantly promoted performance on a newly annotated ARKitScenes testing set. Our codes, data and models will be made publicly available." Few-Shot Point Cloud Semantic Segmentation Via Contrastive Self-Supervision and Multi-Resolution Attention,"Jiahui Wang, Haiyue Zhu, Haoren Guo, Abdullah Al Mamun, Cheng Xiang, Tong Heng Lee","National University of Singapore,Agency for Science, Technology and Research (A*STAR)",Point Clouds,"This paper presents an effective few-shot point cloud semantic segmentation approach for real-world applications. Existing few-shot segmentation methods on point cloud heavily rely on the fully-supervised pretrain with large annotated datasets, which causes the learned feature extraction bias to those pretrained classes. However, as the purpose of few-shot learning is to handle unknown/unseen classes, such class-specific feature extraction in pretrain is not ideal to generalize into new classes for few-shot learning. Moreover, point cloud datasets hardly have a large number of classes due to the annotation difficulty. 
To address these issues, we propose a contrastive self-supervision framework for few-shot learning pretrain, which aims to eliminate the feature extraction bias through class-agnostic contrastive supervision. Specifically, we implement a novel contrastive learning approach with a learnable augmentor for a 3D point cloud to achieve point-wise differentiation, so that to enhance the pretrain with managed overfitting through the self-supervision. Furthermore, we develop a multi-resolution attention module using both the nearest and farthest points to extract the local and global point information more effectively, and a center-concentrated multi-prototype is adopted to mitigate the intra-class sparsity. Comprehensive experiments are conducted to evaluate the proposed approach, which shows our approach achieves state-of-the-art performance. Moreover, a case study on practical CAM/CAD segmentation is presented to demonstrate the effectiveness of our approach for real-world applications." Scene-Level Point Cloud Colorization with Semantics-And-Geometry-Aware Networks,"Rongrong Gao, Tian-zhu Xiang, Chenyang Lei, Jaesik Park, Qifeng Chen","HongKong university of science and engineering,Inception Institute of Artificial Intelligence,HKUST,POSTECH",Point Clouds,"In robotic applications, we often obtain tons of 3D point cloud data without color information, and it is difficult to visualize point clouds in a meaningful and colorful way. Can we colorize 3D point clouds for better visualization? Existing deep learning-based colorization methods usually only take simple 3D objects as input, and their performance for complex scenes with multiple objects is limited. To this end, this paper proposes a novel semantics-and-geometry-aware colorization network, termed SGNet, for vivid scene-level point cloud colorization. Specifically, we propose a novel pipeline that explores geometric and semantic cues from point clouds containing only coordinates for color prediction. 
We also design two novel losses, including a colorfulness metric loss and a pairwise consistency loss, to constrain model training for genuine colorization. To the best of our knowledge, our work is the first to generate realistic colors for point clouds of large-scale indoor scenes. Extensive experiments on the widely used ScanNet benchmarks demonstrate that the proposed method achieves state-of-the-art performance on point cloud colorization." Deep Interactive Full Transformer Framework for Point Cloud Registration,"Guangyan Chen, Meiling Wang, Qingxiang Zhang, Li Yuan, Tong Liu, Yufeng Yue","Beijing Institute of technology,Beijing Institute of Technology,Peking University",Point Clouds,"Point cloud registration is a crucial technology in the fields of robotics and computer vision. Despite the significant advances in point cloud registration enabled by Transformer-based methods, limitations persist due to indistinct feature extraction, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) to capture global relationships due to their local receptive fields, resulting in extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers, coupled with a lack of positional information, leading to inefficient information interaction and indistinct feature extraction; and (3) the omission of geometrical compatibility leads to ambiguous identification of incorrect correspondences. 
To overcome these limitations, we propose the Deep Interactive Full Transformer (DIFT) network for point cloud registration, which consists of three key components: (1) a Point Cloud Structure Extractor (PSE) for modeling global relationships and retrieving structural information; (2) a Point Feature Transformer (PFT) for establishing comprehensive associations and directly learning the relative positions between points; and (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method for measuring spatial consistency and estimating correspondence confidence. Experimental results on ModelNet40 and 3DMatch datasets demonstrate the superior performance of our proposed method compared to existing state-of-the-art methods. The code for our method is publicly available at https://github.com/CGuangyan-BIT/DIFT." Coarse-To-Fine Point Cloud Registration with SE(3)-Equivariant Representations,"Cheng-wei Lin, Tung-i Chen, Hsin-ying Lee, Wen-chin Chen, Winston Hsu",National Taiwan University,Point Clouds,"Point cloud registration is a crucial problem in computer vision and robotics. Existing methods either rely on matching local geometric features, which are sensitive to the pose differences, or leverage global shapes, which leads to inconsistency when facing distribution variances such as partial overlapping. Combining the advantages from both type of methods, we adopt a coarse-to-fine pipeline that concurrently handles both issues. We first reduce the pose differences between input point clouds by aligning global features; then we match the local features to further refine the inaccurate alignments resulting from distribution variances. As global feature alignment requires the features to preserve the poses of input point clouds and local feature matching expects the features to be invariant to these poses, we propose an SE(3)-equivariant feature extractor to simultaneously generate two types of features. 
In this feature extractor, representations that preserve the poses are first encoded by our novel SE(3)-equivariant network and then converted into pose-invariant ones by a pose-detaching module. Experiments demonstrate that our proposed method increases the recall rate by 20% compared to state-of-the-art methods when facing both pose differences and distribution variances." LiDAR-SGM: Semi-Global Matching on LiDAR Point Clouds and Their Cost-Based Fusion into Stereo Matching,"Bianca Forkel, Hans J Wuensche",Universität der Bundeswehr München,Point Clouds,"Stereo matching can be used to estimate dense but inaccurate depth information for each pixel of a camera image. A LiDAR can provide accurate but sparse depth measurements. The fusion of both can combine their advantages. We propose an efficient method for fusing stereo and LiDAR at the cost level of Semi-Global Matching. It significantly improves density and accuracy of the estimated disparities while remaining real-time capable. Based on a LiDAR point cloud projected into the camera image, costs are calculated for each possible disparity. These costs are added to the costs from stereo matching. Our LiDAR-SGM outperforms other real-time capable fusion approaches evaluated on the KITTI Stereo 2015 dataset. In addition to this real data, synthetic datasets are created (and made available) for a detailed analysis of the benefit of stereo LiDAR fusion as well as the evaluation of different sensors." 
Segregator: Global Point Cloud Registration with Semantic and Geometric Cues,"Pengyu Yin, Shenghai Yuan, Cao Haozhi, Xingyu Ji, Shuyang Zhang, Lihua Xie","Nanyang Technological University,NANYANG TECHNOLOGICAL UNIVERSITY,The Hong Kong University of Science and Technology,NanyangTechnological University",Point Clouds,"This paper presents Segregator, a global point cloud registration framework that exploits both semantic information and geometric distribution to efficiently build up outlier-robust correspondences and search for inliers. Current state-of-the-art algorithms rely on point features to set up putative correspondences and refine them by employing pair-wise distance consistency checks. However, such a scheme suffers from degenerate cases, where the descriptive capability of local point features downgrades, and unconstrained cases, where length-preserving (l-TRIMs)-based checks cannot sufficiently constrain whether the current observation is consistent with others, resulting in a complexified NP-complete problem to solve. To tackle these problems, on the one hand, we propose a novel degeneracy-robust and efficient corresponding procedure consisting of both instance-level semantic clusters and geometric-level point features. On the other hand, Gaussian distribution-based translation and rotation invariant measurements (G-TRIMs) are proposed to conduct the consistency check and further constrain the problem size. We validated our proposed algorithm on extensive real-world data-based experiments. The code is available: https://github.com/Pamphlett/Segregator." StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images Via Back-View NOCS,"Kai Chen, Stephen James, Congying Sui, Yunhui Liu, Pieter Abbeel, Qi Dou","The Chinese University of Hong Kong,Dyson,Chinese University of Hong Kong,UC Berkeley",Pose Estimation,"Most existing methods for category-level pose estimation rely on object point clouds. 
However, when considering transparent objects, depth cameras are usually not able to capture high-quality data, resulting in point clouds with severe artifacts. Without a complete point cloud, existing methods are not applicable to challenging transparent objects. To tackle this problem, we present StereoPose, a novel stereo image based framework for category-level object pose estimation, ideally suited for transparent objects. For a robust estimation from pure stereo images, we develop a pipeline that decouples category-level pose estimation into object size estimation, initial pose estimation, and pose refinement. StereoPose then estimates object pose based on representation in the normalized object coordinate space (NOCS). To address the issue of image content aliasing, we further define a back-view NOCS map for the transparent object. The back-view NOCS aims to reduce the network learning ambiguity caused by content aliasing, and leverage informative cues on the back of the transparent object for more accurate pose estimation. To further improve the performance of the stereo framework, StereoPose is equipped with a parallax attention module for stereo feature fusion and an epipolar loss for improving the stereo-view consistency of network predictions. Extensive experiments on the public TOD dataset demonstrate the superiority of the proposed StereoPose framework for category-level 6D transparent object pose estimation. Code and demos will be available on the project homepage: www.cse.cuhk.edu.hk/~kaichen/stereopose.html." Non-Minimal Solvers for Relative Pose Estimation with a Known Relative Rotation Angle,Deshun Hu,Harbin Institute of Technology,Pose Estimation,"Knowing the relative rotation angle improves relative pose estimation accuracy. We consider the problem of computing relative motion from a non-minimal number of correspondences with a known relative rotation angle. 
While several solvers for minimum correspondences have been proposed, no non-minimal solver for this problem currently exists. In this work, we propose two non-minimal solvers for this problem. The first solver solves the problem using convex relaxation and semidefinite programming, yielding certifiable solutions. The second method approaches the problem through local eigenvalue optimization with random initialization. Increasing the number of initial guesses lowers the chances of missing the correct solution. We conduct experiments on synthetic and real data, confirming our methods' advantages over competing methods." Generalizable Pose Estimation Using Implicit Scene Representations,"Vaibhav Saxena, Kamal Rahimi Malekshan, Linh Tran, Yotto Koga","Georgia Institute of Technology,Autodesk",Pose Estimation,"6-DoF pose estimation is an essential component of robotic manipulation pipelines. However, it usually suffers from a lack of generalization to new instances and object types. Most widely used methods learn to infer the object pose in a discriminative setup where the model filters useful information to infer the exact pose of the object. While such methods offer accurate poses, the model does not store enough information to generalize to new objects. In this work, we address the generalization capability of pose estimation using models that contain enough information about the object to render it in different poses. We follow the line of work that inverts neural renderers to infer the pose. We propose i-$\sigma$SRN to maximize the information flowing from the input pose to the rendered scene and invert them to infer the pose given an input image. Specifically, we extend Scene Representation Networks (SRNs) by incorporating a separate network for density estimation and introduce a new way of obtaining a weighted scene representation. We investigate several ways of initial pose estimates and losses for the neural renderer. 
Our final evaluation shows a significant improvement in inference performance and speed compared to existing approaches." RFFCE: Residual Feature Fusion and Confidence Evaluation Network for 6DoF Pose Estimation,"Qiwei Meng, Shanshan Ji, Shiqiang Zhu, Tianlei Jin, Te Li, Jianjun Gu, Wei Song","Zhejiang Lab,zhejiang lab",Pose Estimation,"In this paper, we propose a novel RGBD-based object 6DoF pose estimation network - RFFCE. It is a two-stage method that firstly leverages deep neural networks for feature extraction and object points matching, and then the geometric constraints are utilized for final pose computation. Our approach consists of three primary innovations: residual feature fusion for representative RGBD feature extraction; confidence evaluation and confidence-based paired points offsets regression for self-evaluation and self-optimization respectively. Their effectiveness is verified through an ablation study, and our RFFCE achieves the SOTA performance on LineMOD, Occlusion-LineMOD and YCB-Video datasets. Additionally, we also conduct a real-world object grasping experiment for visualization and qualitative evaluation of the RFFCE." Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation of In-Hand Objects,"Alireza Rezazadeh, Snehal Dikhale, Soshi Iba, Nawid Jamali","University of Minnesota,Honda Research Institute USA",Pose Estimation,"Robotic manipulation, in particular, in-hand object manipulation often requires an accurate estimate of the object’s 6D pose. To improve the accuracy of the estimated pose, state-of-the-art approaches in 6D object pose estimation use observational data from one or more modalities, e.g., RGB images, depth, and tactile readings. However, existing approaches make limited use of the underlying geometric structure of the object captured by these modalities, thereby, increasing their reliance on visual features. 
This results in poor performance when presented with objects that lack such visual features or when visual features are simply occluded. Furthermore, current approaches do not take advantage of the proprioceptive information embedded in the position of the fingers. To address these limitations, in this paper: (1) we introduce a hierarchical graph neural network architecture for combining multimodal (vision and touch) data that allows for a geometrically informed 6D object pose estimation, (2) we introduce a hierarchical message passing operation that flows the information within and across modalities to learn a graph-based object representation, and (3) we introduce a method that accounts for the proprioceptive information for in-hand object representation. We evaluate our model on a diverse subset of objects from the YCB Object and Model Set and show that our method substantially outperforms existing state-of-the-art work in accuracy and robustness to occlusion. We also deploy our proposed framework on a real robot and qualitatively demonstrate successful transfer to real settings." Interactive Object Segmentation in 3D Point Clouds,"Theodora Kontogianni, Ekin Celikkan, Siyu Tang, Konrad Schindler","ETH Zurich,RWTH Aachen University,ETH Zürich",Pose Estimation,"We propose an interactive approach for 3D instance segmentation, where users can iteratively collaborate with a deep learning model to segment objects in a 3D point cloud directly. Current methods for 3D instance segmentation are generally trained in a fully-supervised fashion, which requires large amounts of costly training labels, and does not generalize well to classes unseen during training. Few works have attempted to obtain 3D segmentation masks using human interactions. Existing methods rely on user feedback in the 2D image domain. 
As a consequence, users are required to constantly switch between 2D images and 3D representations, and custom architectures are employed to combine multiple input modalities. Therefore, integration with existing standard 3D models is not straightforward. The core idea of this work is to enable users to interact directly with 3D point clouds by clicking on desired 3D objects of interest (or their background) to interactively segment the scene in an open-world setting. Specifically, our method does not require training data from any target domain and can adapt to new environments where no appropriate training sets are available. Our system continuously adjusts the object segmentation based on the user feedback and achieves accurate dense 3D segmentation masks with minimal human effort (few clicks per object). Besides its potential for efficient labeling of large-scale and varied 3D datasets, our approach, where the user directly interacts with the 3D environment, enables new AR/VR and human-robot interaction applications." GSNet: Model Reconstruction Network for Category-Level 6D Object Pose and Size Estimation,"Penglei Liu, Qieshi Zhang, Jun Cheng","Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences,Shenzhen Institutes of Advanced Technology, Chinese Academy of S,Shenzhen Institutes of Advanced Technology",Pose Estimation,"Category-level 6D pose and size estimation is to estimate the rotation, translation and size of the observed instance objects from an arbitrary angle in a cluttered scene. Compared with instance-level 6D pose estimation, there are two main challenges for category-level 6D pose estimation. One is that the algorithm needs to estimate the 6D pose and size of unseen objects, and no 3D models are available. Another is that different instance objects of the same class of objects differ greatly in shape. This paper proposes a novel method to estimate the 6D pose and size of unseen objects from an RGBD image. 
To handle intra-class shape variation, we propose an autoencoder-decoder that is trained on a set of object models to learn structural feature-invariant and shape-variant features of intra-class objects, and constructs a category-level priori model containing the structure feature and shape feature. To solve the problem of 3D model, this paper proposes a model reconstruction network including 3D graph convolution and spherical convolution (GSNet), which can reconstruct the 3D model of the observed instance object from the input RGB-D image and the priori model, and establish a dense correspondence between the 3D model and the observed instance object. Finally, random sample consensus (RANSAC) algorithm and Umeyama algorithm are used to estimate the 6D pose and size of the object. Extensive experiments on benchmark datasets show that the proposed method achieves state-of-the-art performance in category-level 6D object pose estimation. In order to prove that our method can be applied to the grasping and operation tasks of robots in industry and life, we deploy our method to a physical UR5 robot to perform grasping tasks on unseen but category known instances, and the results validate the efficacy of our proposed method." 6D Pose Estimation for Textureless Objects on RGB Frames Using Multi-View Optimization,"Jun Yang, Wenjie Xue, Sahar Ghavidel, Steven Lake Waslander","University of Toronto,Epson Canada",Pose Estimation,"6D pose estimation of textureless objects is a valuable but challenging task for many robotic applications. In this work, we propose a framework to address this challenge using only RGB images acquired from multiple viewpoints. The core idea of our approach is to decouple 6D pose estimation into a sequential two-step process, first estimating the 3D translation and then the 3D rotation of each object. 
This decoupled formulation first resolves the scale and depth ambiguities in single RGB images, and uses these estimates to accurately identify the object orientation in the second stage, which is greatly simplified with an accurate scale estimate. Moreover, to accommodate the multi-modal distribution present in rotation space, we develop an optimization scheme that explicitly handles object symmetries and counteracts measurement uncertainties. In comparison to the state-of-the-art multi-view approach, we demonstrate that the proposed approach achieves substantial improvements on a challenging 6D pose estimation dataset for textureless objects." Learning Stabilization Control from Observations by Learning Lyapunov-Like Proxy Models,"Milan Ganai, Chiaki Hirayama, Ya-Chien Chang, Sicun Gao","University of California San Diego,UCSD",Imitation Learning,"The deployment of Reinforcement Learning to robotics applications faces the difficulty of reward engineering. Therefore, approaches have focused on creating reward functions by Learning from Observations (LfO) which is the task of learning policies from expert trajectories that only contain state sequences. We propose new methods for LfO for the important class of continuous control problems of learning to stabilize, by introducing intermediate proxy models acting as reward functions between the expert and the agent policy based on Lyapunov stability theory. Our LfO training process consists of two steps. The first step attempts to learn a Lyapunov-like landscape proxy model from expert state sequences without access to any kinematics model, and the second step uses the learned landscape model to guide in training the learner's policy. We formulate novel learning objectives for the two steps that are important for overall training success. 
We evaluate our methods in real automobile robot environments and other simulated stabilization control problems in model-free settings, like Quadrotor control and maintaining upright positions of Hopper in MuJoCo. We compare with state-of-the-art approaches and show the proposed methods can learn efficiently with fewer expert observations." Efficient Preference-Based Reinforcement Learning Using Learned Dynamics Models,"Yi Liu, Gaurav Datta, Ellen Novoseller, Daniel Brown","UC Berkeley,University of California, Berkeley,University of Utah",Imitation Learning,"Preference-based reinforcement learning (PbRL) can enable robots to learn to perform tasks based on an individual's preferences without requiring a hand-crafted reward function. However, existing approaches either assume access to a high-fidelity simulator or analytic model or take a model-free approach that requires extensive, possibly unsafe online environment interactions. In this paper, we study the benefits and challenges of using a learned dynamics model when performing PbRL. In particular, we provide evidence that a learned dynamics model offers the following benefits when performing PbRL: (1) preference elicitation and policy optimization require significantly fewer environmental interactions than model-free PbRL, (2) diverse preference queries can be synthesized safely and efficiently as a by-product of standard model-based RL, and (3) reward pretraining based on suboptimal demonstrations can be performed without any environmental interaction. Our paper provides empirical evidence that learned dynamics models enable robots to learn customized policies based on user preferences in ways that are safer and more sample efficient than prior preference learning approaches."
BITS: Bi-Level Imitation for Traffic Simulation,"Danfei Xu, Yuxiao Chen, Boris Ivanovic, Marco Pavone","Stanford University,Nvidia research,NVIDIA,Stanford University",Imitation Learning,"Simulation is the key to scaling up validation and verification for robotic systems such as autonomous vehicles. Despite advances in high-fidelity physics and sensor simulation, a critical gap remains in simulating realistic behaviors of road users. This is because devising first principle models for human-like behaviors is generally infeasible. In this work, we take a data-driven approach to generate traffic behaviors from real-world driving logs. The method achieves high sample efficiency and behavior diversity by exploiting the bi-level hierarchy of high-level intent inference and low-level driving behavior imitation. The method also incorporates a planning module to obtain stable long-horizon behaviors. We empirically validate our method with scenarios from two large-scale driving datasets and show our method achieves balanced traffic simulation performance in realism, diversity, and long-horizon stability. We also explore ways to evaluate behavior realism and introduce a suite of evaluation metrics for traffic simulation. Finally, as part of our core contributions, we develop and open source a software tool that unifies data formats across different driving datasets and converts scenes from existing datasets into interactive simulation environments." Off-Policy Imitation Learning from Visual Inputs,"Zhihao Cheng, Li Shen, Dacheng Tao","The University of Sydney,JD Explore Academy",Imitation Learning,"Recently, various successful applications utilizing expert states in imitation learning (IL) have been witnessed. However, IL from visual inputs (ILfVI), which has a greater promise to be widely applied by using online visual resources, suffers from low data-efficiency and poor performance resulting from on-policy learning and high-dimensional visual inputs.
We propose OPIfVI (Off-Policy Imitation from Visual Inputs), which is composed of an off-policy learning manner, data augmentation, and encoder techniques, to tackle the mentioned challenges, respectively. More specifically, to improve data-efficiency, OPIfVI conducts IL in an off-policy manner, so that sampled data can be used multiple times. In addition, we enhance the stability of OPIfVI with spectral normalization to mitigate the side effect of off-policy training. We think the core factor contributing to the poor performance of ILfVI is that agents cannot extract meaningful features from visual inputs. Hence, OPIfVI employs data augmentation from computer vision to help train encoders to better extract features from visual inputs. Besides, a specific structure of gradient backpropagation for the encoder is designed to stabilize the encoder training. Finally, we demonstrate that OPIfVI can achieve expert-level performance and outperform existing baselines via extensive experiments using the DeepMind Control Suite." Versatile Skill Control Via Self-Supervised Adversarial Imitation of Unlabeled Mixed Motions,"Chenhao Li, Sebastian Blaes, Pavel Kolev, Marin Vlastelica, Jonas Frey, Georg Martius","ETH Zürich,Max Planck Institute for Intelligent Systems,ETH Zurich",Imitation Learning,"Learning diverse skills is one of the main challenges in robotics. To this end, imitation learning approaches have achieved impressive results. These methods require explicitly labeled datasets or assume consistent skill execution to enable learning and active control of individual behaviors, which limits their applicability. In this work, we propose a cooperative adversarial method for obtaining single versatile policies with controllable skill sets from unlabeled datasets containing diverse state transition patterns by maximizing their discriminability.
Moreover, we show that by utilizing unsupervised skill discovery in the generative adversarial imitation learning framework, novel and useful skills emerge with successful task fulfillment. Finally, the obtained versatile policies are tested on an agile quadruped robot called Solo 8 and present faithful replications of diverse skills encoded in the demonstrations." Curriculum-Based Imitation of Versatile Skills,"Maximilian Xiling Li, Onur Celik, Philipp Becker, Denis Blessing, Rudolf Lioutikov, Gerhard Neumann","Karlsruhe Institute of Technology,KIT,Karlsruhe Institute of Technology (KIT)",Imitation Learning,"Learning skills by imitation is a promising concept for the intuitive teaching of robots. A common way to learn such skills is to learn a parametric model by maximizing the likelihood given the demonstrations. Yet, human demonstrations are often multi-modal, i.e., the same task is solved in multiple ways which is a major challenge for most imitation learning methods that are based on such a maximum likelihood (ML) objective. The ML objective forces the model to cover all data, it prevents specialization in the context space and can cause mode-averaging in the behavior space, leading to suboptimal or potentially catastrophic behavior. Here, we alleviate those issues by introducing a curriculum using a weight for each data point, allowing the model to specialize on data it can represent while incentivizing it to cover as much data as possible by an entropy bonus. We extend our algorithm to a Mixture of (linear) Experts (MoE) such that the single components can specialize on local context regions, while the MoE covers all data points. We evaluate our approach in complex simulated and real robot control tasks and show it learns from versatile human demonstrations and significantly outperforms current SOTA methods." 
Learning Stable Dynamics Via Iterative Quadratic Programming,"Paul Gesel, Momotaz Begum",University of New Hampshire,Imitation Learning,"This paper proposes a novel autonomous dynamic system (ADS) based controller for trajectory learning from demonstration (LfD). We call our method Learning Stable Dynamics via Iterative Quadratic Programming (LSD-IQP). LSD-IQP learns an energy function and an ADS from demonstrations via semi-infinite quadratic programming. Energy function constraints are imposed on the learned ADS to ensure convergence to a single goal position. Unlike other energy-based methods, LSD-IQP allows the energy function to have both local maximums and saddle points. This flexibility enables LSD-IQP to learn a broader class of motions compared to other ADS based controllers. We demonstrate the capabilities of LSD-IQP via several experiments, including: 1) learning handwritten symbols and comparing the swept error area to several other ADS methods 2) learning a pick and place task with novel goal positions for a robot, and 3) learning a point to point motion in the presence of a non-convex obstacle for a robot." Holistic Graph-Based Motion Prediction,"Daniel Grimm, Philip Schörner, Moritz Dreßler, Johann Marius Zöllner","FZI Research Center for Information Technology,Karlsruhe Institute of Technology (KIT),FZI Forschungszentrum Informatik",Imitation Learning,"Motion prediction for automated vehicles in complex environments is a difficult task that is to be mastered when automated vehicles are to be used in arbitrary situations. Many factors influence the future motion of traffic participants starting with traffic rules and reaching from the interaction between each other to personal habits of human drivers. 
Therefore we present a novel approach for a graph-based prediction based on a heterogeneous holistic graph representation that combines temporal information, properties and relations between traffic participants as well as relations with static elements like the road network. The information is encoded through different types of nodes and edges that both are enriched with arbitrary features. We evaluated the approach on the INTERACTION and the Argoverse dataset and conducted an informative ablation study to demonstrate the benefit of different types of information for the motion prediction quality." Extraneousness-Aware Imitation Learning,"Ray Chen Zheng, Kaizhe Hu, Zhecheng Yuan, Boyuan Chen, Huazhe Xu","Tsinghua University,Massachusetts Institute of Technology",Imitation Learning,"Visual imitation learning provides an effective framework to learn skills from demonstrations. However, the quality of the provided demonstrations usually significantly affects the ability of an agent to acquire desired skills. Therefore, the standard visual imitation learning assumes near-optimal demonstrations, which are expensive or sometimes prohibitive to collect. Previous works propose to learn from noisy demonstrations; however, the noise is usually assumed to follow a context-independent distribution such as a uniform or Gaussian distribution. In this paper, we consider another crucial yet underexplored setting: imitation learning with task-irrelevant yet locally consistent segments in the demonstrations (e.g., wiping sweat while cutting potatoes in a cooking tutorial). We argue that such noise is common in real world data and term such segments ""extraneous"" segments. To tackle this problem, we introduce Extraneousness-Aware Imitation Learning (EIL), a self-supervised approach that learns visuomotor policies from third-person demonstrations with extraneous subsequences.
EIL learns action-conditioned observation embeddings in a self-supervised manner and retrieves task-relevant observations across visual demonstrations while excluding the extraneous ones. Experimental results show that EIL outperforms strong baselines and achieves comparable policies to those trained with perfect demonstration on both simulated and real-world robot control tasks. The project page can be found here: https://sites.google.com/view/eil-website." Wayformer: Motion Forecasting Via Simple & Efficient Attention Networks,"Nigamaa Nayakanti, Rami Al-rfou, Aurick Zhou, Kratarth Goel, Khaled Refaat, Benjamin Sapp",Waymo,Imitation Learning,"Motion forecasting for autonomous driving is a challenging task because complex driving scenarios involve a heterogeneous mix of static and dynamic inputs. It is an open problem how best to represent and fuse information about road geometry, lane connectivity, time-varying traffic light state, and history of dynamic set of agents and their interactions into an effective encoding. To model this diverse set of input features, many approaches proposed to design an equally complex system with a diverse set of modality specific modules. This results in systems that are difficult to scale, extend, or tune in rigorous ways to trade off quality and efficiency. In this paper we present Wayformer, a family of simple and homogeneous attention based architectures for motion forecasting. Wayformer offers a compact model description consisting of an attention based scene encoder and a decoder. In the scene encoder we study the choice of early, late and hierarchical fusion of input modalities. For each fusion type we explore strategies to trade off efficiency and quality via factorized attention or latent query attention.
We show that early fusion, despite its simplicity, is not only modality agnostic but also achieves state-of-the-art results on both Waymo Open Motion Dataset (WOMD) and Argoverse leaderboards, demonstrating the effectiveness of our design philosophy." A Non-Parametric Skill Representation with Soft Null Space Projectors for Fast Generalization,"João Silvério, Yanlong Huang","German Aerospace Center,University of Leeds",Imitation Learning,"Over the last two decades, the robotics community witnessed the emergence of various motion representations that have been used extensively, particularly in behavioral cloning, to compactly encode and generalize skills. Among these, probabilistic approaches have earned a relevant place, owing to their encoding of variations, correlations and adaptability to new task conditions. Modulating such primitives, however, is often cumbersome due to the need for parameter re-optimization which frequently entails computationally costly operations. In this paper we derive a non-parametric movement primitive formulation that contains a null space projector. We show that such formulation allows for fast and efficient motion generation with computational complexity O(n^2) without involving matrix inversions, whose complexity is O(n^3). This is achieved by using the null space to track secondary targets, with a precision determined by the training dataset. Using a 2D example associated with time input we show that our non-parametric solution compares favourably with a state-of-the-art parametric approach. For demonstrated skills with high-dimensional inputs we show that it permits on-the-fly adaptation as well."
Sample Efficient Dynamics Learning for Symmetrical Legged Robots: Leveraging Physics Invariance and Geometric Symmetries,"Jee-Eun Lee, Jaemin Lee, Tirthankar Bandyopadhyay, Luis Sentis","The University of Texas at Austin,California Institute of Technology,CSIRO",Learning for Control II,"Model generalization of the underlying dynamics is critical for achieving data efficiency when learning for robot control. This paper proposes a novel approach for learning dynamics leveraging the symmetry in the underlying robotic system, which allows for robust extrapolation from fewer samples. Existing frameworks that represent all data in vector space fail to consider the structured information of the robot, such as leg symmetry, rotational symmetry, and physics invariance. As a result, these schemes require vast amounts of training data to learn the system's redundant elements because they are learned independently. Instead, we propose considering the geometric prior by representing the system in symmetrical object groups and designing neural network architecture to assess invariance and equivariance between the objects. Finally, we demonstrate the effectiveness of our approach by comparing the generalization to unseen data of the proposed model and the existing models. We also implement a controller of a climbing robot based on learned inverse dynamics models. The results show that our method generates accurate control inputs that help the robot reach the desired state while requiring less training data than existing methods." Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion,"Lev Grossman, Brian Plancher","Berkshire Grey,Barnard College, Columbia University",Learning for Control II,"Deep reinforcement learning (DRL) is one of the most powerful tools for synthesizing complex robotic behaviors. 
But training DRL models is incredibly compute and memory intensive, requiring large training datasets and replay buffers to achieve performant results. This poses a challenge for the next generation of field robots that will need to learn on the edge to adapt to their environment. In this paper, we begin to address this issue through observation space quantization. We evaluate our approach using four simulated robot locomotion tasks and two state-of-the-art DRL algorithms, the on-policy Proximal Policy Optimization (PPO) and off-policy Soft Actor-Critic (SAC) and find that observation space quantization reduces overall memory costs by as much as 4.2x without impacting learning performance." Causal Inference for De-Biasing Motion Estimation from Robotic Observational Data,"Junhong Xu, Kai Yin, Jason M. Gregory, Lantao Liu","Indiana University,Expedia Group,US Army Research Laboratory",Learning for Control II,"Robot data collected in complex real-world scenarios are often biased due to safety concerns, human preferences, and mission or platform constraints. Consequently, robot learning from such observational data poses great challenges for accurate parameter estimation. We propose a principled causal inference framework for robots to learn the parameters of a stochastic motion model using observational data. Specifically, we leverage the de-biasing functionality of the potential outcome causal inference framework, the Inverse Propensity Weighting (IPW), and the Doubly Robust (DR) methods, to obtain a better parameter estimation of the robot’s stochastic motion model. The IPW is a re-weighting approach to ensure unbiased estimation and the DR approach further combines any two estimators to strengthen the unbiased result even if one of these estimators is biased. We then develop an approximate policy iteration algorithm using the bias-eliminated estimated state transition function. 
We validate our framework using both simulation and real-world experiments, and the results have revealed that the proposed causal inference-based navigation and control framework can correctly and efficiently learn the parameters from biased observational data." Active Predictive Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems,"Alex Ororbia, Ankur Mali","Rochester Institute of Technology,University of South Florida",Learning for Control II,"In this article, we propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC), designing an agent completely built from predictive processing circuits that facilitate dynamic, online learning from sparse rewards, embodying the principles of planning-as-inference. Concretely, we craft an adaptive agent system, which we call active predictive coding (ActPC), that balances an internally-generated epistemic signal (meant to encourage intelligent exploration) with an internally-generated instrumental signal (meant to encourage goal-seeking behavior) to learn how to control various simulated robotic systems as well as a complex robotic arm using a realistic simulator, i.e., the Surreal Robotics Suite, for the block lifting task and the can pick-and-place problem. Notably, our results demonstrate that the proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backpropagation-based reinforcement learning approaches." Approximating Discontinuous Nash Equilibrial Values of Two-Player General-Sum Differential Games,"Lei Zhang, Mukesh Ghimire, Wenlong Zhang, Zhe Xu, Yi Ren",Arizona State University,Learning for Control II,"Finding Nash equilibrial policies for two-player differential games requires solving Hamilton-Jacobi-Isaacs (HJI) PDEs. 
Self-supervised learning has been used to approximate solutions of such PDEs while circumventing the curse of dimensionality. However, this method fails to learn discontinuous PDE solutions due to its sampling nature, leading to poor safety performance of the resulting controllers in robotics applications when player rewards are discontinuous. This paper investigates two potential solutions to this problem: a hybrid method that leverages both supervised Nash equilibria and the HJI PDE, and a value-hardening method where a sequence of HJIs are solved with a gradually hardening reward. We compare these solutions using the resulting generalization and safety performance in two vehicle interaction simulation studies with 5D and 9D state spaces, respectively. Results show that with informative supervision (e.g., collision and near-collision demonstrations) and the low cost of self-supervised learning, the hybrid method achieves better safety performance than the supervised, self-supervised, and value hardening approaches on equal computational budget. Value hardening fails to generalize in the higher-dimensional case without informative supervision. Lastly, we show that the neural activation function needs to be continuously differentiable for learning PDEs and its choice can be case dependent." Visual Affordance Prediction for Guiding Robot Exploration,"Homanga Bharadhwaj, Abhinav Gupta, Shubham Tulsiani",Carnegie Mellon University,Learning for Control II,"Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning `visual affordances'. Given an input image of a scene, we infer a distribution over plausible future states that can be achieved via interactions with it. 
To allow predicting diverse plausible futures, we discretize the space of continuous images with a VQ-VAE and use a Transformer-based model to learn a conditional distribution in the latent embedding space. We show that these models can be trained using large-scale and diverse passive data, and that the learned models exhibit compositional generalization to diverse objects beyond the training distribution. We evaluate the quality and diversity of the generations, and demonstrate how the trained affordance model can be used for guiding exploration during visual goal-conditioned policy learning in robotic manipulation." Generating Stable and Collision-Free Policies through Lyapunov Function Learning,"Alexandre Coulombe, Hsiu-Chin Lin",McGill University,Learning for Control II,"The need for rapid and reliable robot deployment is on the rise. Imitation Learning (IL) has become popular for producing motion planning policies from a set of demonstrations. However, many methods in IL are not guaranteed to produce stable policies. The generated policy may not converge to the robot target, reducing reliability, and may collide with its environment, reducing the safety of the system. Stable Estimator of Dynamic Systems (SEDS) produces stable policies by constraining the Lyapunov stability criteria during learning, but the Lyapunov candidate function was manually selected. In this work, we propose a novel method for learning a Lyapunov function and a policy using a single neural network model. The method can be equipped with an obstacle avoidance module for convex object pairs to guarantee no collision. We demonstrate that our method is capable of finding policies in several simulation environments and transfers to a real-world scenario."
ALAN: Autonomously Exploring Robotic Agents in the Real World,"Russell Mendonca, Shikhar Bahl, Deepak Pathak","Carnegie Mellon University,UC Berkeley",Learning for Control II,"In order to build robotic agents that can autonomously operate in the real world, it is crucial to explore the environment. While it is possible to build agents that can learn without supervision, current methods struggle to scale to the real world. Thus, we propose ALAN, an autonomously exploring robotic agent that can perform many tasks in the real world with little training and interaction time. Our approach builds a shared model of the world, across all possible tasks, and continuously explores. We propose a novel intrinsic motivation reward that leverages both self-supervised agent-centric prediction as well as environment-centric priors for visual change. We show our approach on a real world play kitchen setting, performing multiple manipulation tasks and showing much better exploratory performance than state-of-the-art methods. Videos can be found at https://robo-explorer.github.io/" Throwing Objects into a Moving Basket While Avoiding Obstacles,"Hamidreza Kasaei, Mohammadreza Kasaei","University of Groningen,University of Edinburgh",Learning for Control II,"The capabilities of a robot will be increased significantly by exploiting throwing behavior. In particular, throwing will enable robots to rapidly place the object into the target basket, located outside its feasible kinematic space, without traveling to the desired location. In previous approaches, the robot often learned a parameterized throwing kernel through analytical approaches, imitation learning, or hand-coding. There are many situations in which such approaches do not work/generalize well due to various object shapes, heterogeneous mass distribution, and also obstacles that might be present in the environment. It is obvious that a method is needed to modulate the throwing kernel through its meta-parameters.
In this paper, we tackle the object throwing problem through a deep reinforcement learning approach that enables robots to precisely throw objects into a moving basket while there is an obstacle obstructing the path. To the best of our knowledge, we are the first group that addresses throwing objects with obstacle avoidance. Such a throwing skill not only increases the physical reachability of a robotic arm but also improves the execution time. In particular, the robot detects the pose of the target object, basket, and obstacle at each time step, predicts the proper grasp configuration for the target object, and then infers appropriate parameters to throw the object into the basket. Due to safety constraints, we develop a simulation environment in Gazebo to train the robot and then use the learned policy directly on the real robot. To assess the performance of the proposed approach, we perform extensive sets of experiments in both simulation and on the real robot in three scenarios. Experimental results showed that the robot could precisely throw a target object into the basket outside its kinematic range and generalize well to new locations and objects without colliding with obstacles. The video of our experiments can be found at https://youtu.be/VmIFF__c_84" AIMY: An Open-Source Table Tennis Ball Launcher for Versatile and High-Fidelity Trajectory Generation,"Alexander Dittrich, Jan Schneider, Simon Guist, Nico Gürtler, Heiko Ott, Thomas Steinbrenner, Bernhard Schölkopf, Dieter Buechler","Max Planck Institute for Intelligent Systems, Tübingen, Germany,Max Planck Institute for Intelligent Systems,Max Planck Institute for Intelligent Systems Tübingen,MPI for Intelligent Systems",Learning for Control II,"To approach the level of advanced human players in table tennis with robots, generating varied ball trajectories in a reproducible and controlled manner is essential.
Current ball launchers used in robot table tennis either do not provide an interface for automatic control or are limited in their capabilities to adapt speed, direction, and spin of the ball. For these reasons, we present AIMY, a three-wheeled open-hardware and open-source table tennis ball launcher, which can generate ball speeds and spins of up to 15.44 m s^-1 and 192 s^-1, respectively, which is comparable to advanced human players. The wheel speeds, launch orientation and time can be fully controlled via an open Ethernet or Wi-Fi interface. We provide a detailed overview of the core design features, as well as open-source the software to encourage distribution and duplication within and beyond the robot table tennis research community. We also extensively evaluate the ball launcher’s accuracy for different system settings and learn to launch a ball to desired locations. With this ball launcher, we enable long-duration training of robot table tennis approaches where the complexity of the ball trajectory can be automatically adjusted, enabling large-scale real-world online reinforcement learning for table tennis robots." Data-Efficient Characterization of the Global Dynamics of Robot Controllers with Confidence Guarantees,"Ewerton Vieira, Aravind Sivaramakrishnan, Yao Song, Edgar Granados, Marcio Gameiro, Konstantin Mischaikow, Ying Hung, Kostas E. Bekris","Rutgers University,Rutgers,Rutgers, the State University of New Jersey",Learning for Control II,"This paper proposes an integration of surrogate modeling and topology to significantly reduce the amount of data required to describe the underlying global dynamics of robot controllers, including closed-box ones. A Gaussian Process (GP), trained with randomized short trajectories over the state-space, acts as a surrogate model for the underlying dynamical system. Then, a combinatorial representation is built and used to describe the dynamics in the form of a directed acyclic graph, known as a Morse graph.
The Morse graph is able to describe the system's attractors and their corresponding regions of attraction (RoA). Furthermore, a pointwise confidence level of the global dynamics estimation over the entire state space is provided. In contrast to alternatives, the framework does not require estimation of Lyapunov functions, alleviating the need for high prediction accuracy of the GP. The framework is suitable for data-driven controllers that do not expose an analytical model as long as Lipschitz-continuity is satisfied. The method is compared against established analytical and recent machine learning alternatives for estimating RoAs, outperforming them in data efficiency without sacrificing accuracy." Modeling and Inertial Parameter Estimation of Cart-Like Nonholonomic Systems Using a Mobile Manipulator,"Sergio Aguilera, Muhammad Ali Murtaza, Jonathan Rogers, Seth Hutchinson",Georgia Institute of Technology,Learning for Control II,"The introduction of Mobile Manipulators (MMs) in industrial settings, hospitals, hotels, homes, and office environments requires that they are able to perform a wide range of capabilities. In all these settings, there are cart-like systems with similar wheel configurations and varying dynamic properties that we desire to have MMs maneuver around. To push/pull a cart-like system, we need to understand its dynamics model, nonholonomic constraints, and inertial parameters to predict its behavior. In this study, we propose a dynamic model for cart-like systems using constrained Euler-Lagrange equations to model nonholonomic constraints due to the wheel configuration. Then, we discuss parameter estimation of these systems using an Extended Kalman Filter (EKF) with an augmented state representation, while actuating the system using an MM.
Since the cart-like system can have discrete changes in its inertial parameters, we discuss online parameter estimation while pushing/pulling the system along simple trajectories, with both fixed and changing grasping points on the object. Simulation and experimental results show that an accurate mass estimation of the system can be accomplished, and a good estimation of the Center of Mass (CoM) of the cart-like system can help reduce the force/torque needed to successfully control the system." Using Registration with Fourier-SOFT in 2D (FS2D) for Robust Scan Matching of Sonar Range Data,"Tim Hansen, Andreas Birk","Constructor University,Jacobs University",Marine Robotics II,"In this paper, we introduce Fourier-SOFT 2D (FS2D) as a new robust registration method. FS2D operates in the frequency domain where it exploits the well-known decoupling of rotation and translation. The challenging part of determining the rotation parameter is solved here based on a projection of the Fourier magnitude on a sphere and the SO(3) Fourier Transform (SOFT). The underlying use case is underwater mapping with sonar, i.e., with very noisy and partially overlapping environment data under non-trivial localization and navigation challenges. Fourier-SOFT 2D is compared with openly available registration methods on two real-world datasets and a simulated dataset. Results show the robustness of FS2D, i.e., its capabilities to handle large amounts of noise and occlusions of consecutive scans. The implementation in C++ is openly available." A Robotic Cooperative Network for Localising a Submarine in Distress: Results from REPMUS21,"Gabriele Ferri, Alessandro Faggiani, Roberto Petroccia, Pietro Stinco, Alessandra Tesei","NATO Centre for Maritime Research and Experimentation,CMRE,NATO Ctr. on Maritime Research and Experimentation (CMRE),NATO STO CMRE",Marine Robotics II,"Autonomy, cooperation and data fusion can increase the performance of robotic networks in many underwater applications.
In this paper, we describe a novel occupancy grid (OG) based perception layer, and its use for controlling a network of autonomous underwater vehicles (AUVs), sensorised with passive sonars. Data fusion between the robots’ bearing-only measurements (typical of passive sonars) enables the estimate of target position. The developed OG framework exploits networking and the spatial diversity provided by the multi-robot system. The perception layer was integrated in the intelligent Cooperative Autonomous Decision Making Engine (iCADME) control architecture and validated for the first time in the Robotics Experimentation and Prototyping MUS (REPMUS) Exercise, held in Portugal in September 2021. Our robotic network participated in a technical demonstration, whose main objective was to localise a bottomed submarine which emitted a periodic acoustic help request during a simulated distress situation. We report results which are among the first examples to demonstrate how cooperative robotics, supported by data fusion, can be effective in a passive sonar scenario. They also confirm the viability of adopting such solutions in real-world applications, characterised by poor communications and challenging environments. What was achieved at REPMUS21 clearly demonstrates how a network of cooperative robots can improve search & rescue operations for a submarine." DeepSeeColor: Realtime Adaptive Color Correction for Autonomous Underwater Vehicles Via Deep Learning Methods,"Stewart Jamieson, Jonathan Patrick How, Yogesh Girdhar","Massachusetts Institute of Technology,Woods Hole Oceanographic Institution",Marine Robotics II,"Successful applications of complex vision-based behaviours underwater have lagged behind progress in terrestrial and aerial domains. This is largely due to the degraded image quality resulting from the physical phenomena involved in underwater image formation. 
Spectrally-selective light attenuation drains some colors from underwater images while backscattering adds others, making it challenging to perform vision-based tasks underwater. State-of-the-art methods for underwater color correction optimize the parameters of image formation models to restore the full spectrum of color to underwater imagery. However, these methods have high computational complexity that is unfavourable for realtime use by autonomous underwater vehicles (AUVs), as a result of having been primarily designed for offline color correction. Here, we present DeepSeeColor, a novel algorithm that combines a state-of-the-art underwater image formation model with the computational efficiency of deep learning frameworks. In our experiments, we show that DeepSeeColor offers comparable performance to the popular ""Sea-Thru"" algorithm while being able to rapidly process images at up to 60Hz, thus making it suitable for use onboard AUVs as a preprocessing step to enable more robust vision-based behaviours." From Concept to Field Tests: Accelerated Development of Multi-AUV Missions Using a High-Fidelity Faster-Than-Real-Time Simulator,"Tim Player, Arjo Chakravarty, Mabel Zhang, Ben Yair Raanan, Brian Kieft, Yanwu Zhang, Brett Hobson","Oregon State University,Open Robotics, Singapore University of Science and Technology,Open Robotics team at Intrinsic,Monterey Bay Aquarium Research Institute,MBARI",Marine Robotics II,"We designed and validated a novel simulator for efficient development of multi-robot marine missions. To accelerate development of cooperative behaviors, the simulator models the robots’ operating conditions with moderately high fidelity and runs significantly faster than real time, including acoustic communications, dynamic environmental data, and high-resolution bathymetry in large worlds. 
The simulator’s ability to exceed a real-time factor (RTF) of 100 has been stress-tested with a robust continuous integration suite and was used to develop a multi-robot field experiment." Deep Reinforcement Learning Based Tracking Control of an Autonomous Surface Vessel in Natural Waters,"Wei Wang, Xiaojing Cao, Alejandro Gonzalez-garcia, Lianhao Yin, Niklas Hagemann, Yuanyuan Qiao, Carlo Ratti, Daniela Rus","Massachusetts Institute of Technology,Beijing University of Posts and Telecommunications,KU Leuven,MIT",Marine Robotics II,"Accurate control of autonomous marine robots still poses challenges due to the complex dynamics of the environment. In this paper, we propose a Deep Reinforcement Learning (DRL) approach to train a controller for autonomous surface vessel (ASV) trajectory tracking and compare its performance with an advanced nonlinear model predictive controller (NMPC) in real environments. Taking into account environmental disturbances (e.g., wind, waves, and currents), noisy measurements and non-ideal actuators present in the physical ASV, several effective reward functions for DRL tracking control policies are carefully designed. The control policies were trained in a simulation environment with diverse tracking trajectories and disturbances. The performance of the DRL controller has been verified and compared with the NMPC in both simulations with model-based environmental disturbances and in natural waters. Simulations show that the DRL controller has 53.33% lower tracking error than that of NMPC. Experimental results further show that, compared to NMPC, the DRL controller has 35.51% lower tracking error, indicating that DRL controllers offer better disturbance rejection in river environments than NMPC." 
UDepth: Fast Monocular Depth Estimation for Visually-Guided Underwater Robots,"Boxiao Yu, Jiayi Wu, Md Jahidul Islam",University of Florida,Marine Robotics II,"In this paper, we present a fast monocular depth estimation method for enabling 3D perception capabilities of low-cost underwater robots. We formulate a novel end-to-end deep visual learning pipeline named UDepth, which incorporates domain knowledge of image formation characteristics of natural underwater scenes. First, we adapt a new input space from raw RGB image space by exploiting underwater light attenuation prior, and then devise a least-squared formulation for coarse pixel-wise depth prediction. Subsequently, we extend this into a domain projection loss that guides the end-to-end learning of UDepth on over 9K RGB-D training samples. UDepth is designed with a computationally light MobileNetV2 backbone and a Transformer-based optimizer for ensuring fast inference rates on embedded systems. By domain-aware design choices and through comprehensive experimental analyses, we demonstrate that it is possible to achieve state-of-the-art depth estimation performance while ensuring a small computational footprint. Specifically, with 70%-80% less network parameters than existing benchmarks, UDepth achieves comparable and often better depth estimation performance. While the full model offers over 66 FPS (13 FPS) inference rates on a single GPU (CPU core), our domain projection for coarse depth prediction runs at 51.5 FPS rates on single-board Jetson TX2s. The inference pipelines are available at https://github.com/uf-robopi/UDepth." 
Improved Benthic Classification Using Resolution Scaling and SymmNet Unsupervised Domain Adaptation,"Heather Doig, Oscar Pizarro, Stefan Bernard Williams","University of Sydney,Australian Centre for Field Robotics",Marine Robotics II,"Autonomous Underwater Vehicles (AUVs) conduct regular visual surveys of marine environments to characterise and monitor the composition and diversity of the benthos. The use of machine learning classifiers for this task is limited by the low numbers of annotations available and the many fine-grained classes involved. In addition to these challenges, there are domain shifts between image sets acquired during different AUV surveys due to changes in camera systems, imaging altitude, illumination and water column properties leading to a drop in classification performance for images from a different survey where some or all these elements may have changed. This paper proposes a framework to improve the performance of a benthic morphospecies classifier when used to classify images from a different survey compared to the training data. We adapt the SymmNet state-of-the-art Unsupervised Domain Adaptation method with an efficient bilinear pooling layer and image scaling to normalise spatial resolution, and show improved classification accuracy. We test our approach on two datasets with images from AUV surveys with different imaging payloads and locations. The results show that generic domain adaptation can be enhanced to produce a significant increase in accuracy for images from an AUV survey that differs from the training images." Data-Driven Loop Closure Detection in Bathymetric Point Clouds for Underwater SLAM,"Jiarui Tan, Ignacio Torroba Balmori, Yiping Xie, John Folkesson","KTH Royal Institute of Technology,KTH",Marine Robotics II,"Simultaneous localization and mapping (SLAM) frameworks for autonomous navigation rely on a robust data association module to identify loop closures for back-end trajectory optimization. 
In the case of autonomous underwater vehicles (AUVs) equipped with multibeam echosounders (MBES), data association is particularly challenging due to the scarcity of identifiable landmarks in the seabed, the large drifts to which AUVs are prone and the low resolution characteristic of MBES data. Deep learning solutions to loop closure detection have shown excellent performance on data from more structured and less dynamic environments. However, their transfer to the seabed domain is not immediate and efforts to port them are hindered by the lack of bathymetric datasets. Thus, in this paper we propose a neural network architecture aimed to showcase the potential of adapting such techniques to correspondence matching in bathymetric data. We train our framework on real bathymetry from an AUV mission and evaluate its performance on the tasks of loop closure detection and coarse point cloud alignment. Finally, we show its potential against a more traditional method and we release the implementation of the network." ResiPlan: Closing the Planning-Acting Loop for Safe Underwater Navigation,"Marios Xanthidis, Eleni Kelasidi, Kostas Alexis","SINTEF Ocean,NTNU - Norwegian University of Science and Technology",Marine Robotics II,"Autonomous operation in underwater environments is, arguably, one of the most complex domains. It requires safe operations under the presence of unpredictable surge, currents, uncertainty, and dynamic obstacles that challenges to the highest degree real-time motion planning; the primary focus of this paper. Although previous work addressed the problem of safe real-time 3D navigation in cluttered underwater environments, it did not account explicitly for disturbances, currents, dynamic obstacles, or uncertainty growth. 
This paper presents ResiPlan, a novel motion planning framework that utilizes past information of errors monitoring the path follower's performance, along with estimation of dynamic obstacles and uncertainty, to produce adaptive paths by adjusting the safety margins accordingly. Extensive numerical experiments and simulations validate the safety guarantees of the technique, in a variety of different environments with various types of disturbance, showcasing the strong potential to be utilized for operations in challenging underwater environments." Diver Interest Via Pointing: Human-Directed Object Inspection for AUVs,"Chelsey Edge, Junaed Sattar","University of Minnesota-Twin Cities,University of Minnesota",Marine Robotics II,"In this paper, we present the Diver Interest via Pointing (DIP) algorithm, a highly modular method for conveying a diver’s area of interest to an autonomous underwater vehicle (AUV) using pointing gestures for underwater human-robot collaborative tasks. DIP uses a single monocular camera and exploits human body pose, even with complete dive gear, to extract underwater human pointing gesture poses and their directions. By extracting 2D scene geometry based on the human body pose and density of salient feature points along the direction of pointing, using a low-level feature detector, the DIP algorithm is able to locate objects of interest as indicated by the diver. DIP makes it possible for scuba divers and swimmers to use directional cues, through pointing, to an AUV for inspection, surveillance, manipulation, and navigation. We examine the elements that make up our method, provide quantitative and qualitative evaluation, and demonstrate AUV actuation based on diver pointing gestures in closed-water human-robot collaborative experiments. Our evaluations demonstrate the high efficacy of the DIP algorithm in correctly identifying the direction of a pointing gesture and locating an object within that region of interest. 
We also show that the findings of the algorithm qualitatively conform with human assessment of pointing gestures, directions, and targets." Robust Uncertainty Estimation for Classification of Maritime Objects,"Jonathan Becktor, Frederik Scholler, Evangelos Boukas, Lazaros Nalpantidis","Technical University of Denmark,Technical University of Denmark",Marine Robotics II,"We explore the use of uncertainty estimation in the maritime domain, showing the efficacy on toy datasets (CIFAR10) and proving it on an in-house dataset, SHIPS. We present a method joining the intra-class uncertainty achieved using Monte Carlo Dropout, with recent discoveries in the field of outlier detection, to gain more holistic uncertainty measures. We explore the relationship between the introduced uncertainty measures and examine how well they work on CIFAR10 and in a real-life setting. Our work improves the FPR95 by 8% compared to the current highest performing work when the models are trained without out-of-distribution data. We increase the performance by 77% compared to a vanilla implementation of the Wide ResNet. We release the SHIPS dataset and show the effectiveness of our method by improving the FPR95 by 44.2% from the baseline. Our approach is model agnostic and easy to implement, and does not require models to be retrained." Adaptive Heading for Perception-Aware Trajectory Following,"Jonatan Scharff Willners, Sean Katagiri, Shida Xu, Tomasz Luczynski, Joshua Roe, Y. R. Petillot","Heriot-Watt University,Imperial College London",Marine Robotics II,"This paper presents an adaptive heading approach for perception awareness during trajectory following. By adapting the heading of a robot to improve the feature tracking in the current mapped environment, the accuracy in localisation can be improved. This can have a significant advantage for autonomous operations in GPS-denied environments such as subsea or in caves. 
The aim of the proposed approach is to position the sensor used for perception and feature tracking in such a way that it obtains a view that contains a good observation of the previously mapped environment, faces forward along the direction of travel, reduces the change in heading, and views the perceived environment along the surface’s estimated normals. These four objectives create a weighted utility function that is used to find the most beneficial heading. The benefit is a system that improves feature tracking for simultaneous localisation and mapping (SLAM) while considering the safety of the robot by being aware of its surroundings. To sense the environment, a simulated sensor is discretised to a set of vertical rays based on the vertical field of view. The vertical rays are swept 360 degrees around a position to evaluate for a new heading. This allows for the simulated sensor data from ray casting to be reused and therefore reduces the computational load to find the heading which maximises the utility function. The paper is focused on holonomic robots capable of controlling the robot’s heading or sensor orientation independently from the position. We present results and evaluation in a simulated environment where we show a great improvement in the SLAM’s pose estimation. In addition, we endow an autonomous underwater vehicle (AUV) with the proposed approach during field trials and present the result in two different environments." An Optimal Open-Loop Strategy for Handling a Flexible Beam with a Robot Manipulator,"Shamil Mamedov, Alejandro Astudillo, Daniele Ronzani, Wilm Decré, Jean-philippe Noël, Jan Swevers","KU Leuven,Katholieke Universiteit Leuven",Optimization and Optimal Control,"Fast and safe manipulation of flexible objects with a robot manipulator necessitates measures to cope with vibrations. Existing approaches either increase the task execution time or require complex models and/or additional instrumentation to measure vibrations. 
This paper develops a model-based method that overcomes these limitations. It relies on a simple pendulum-like model for modeling the beam, open-loop optimal control for suppressing vibrations, and does not require any exteroceptive sensors. We experimentally show that the proposed method drastically reduces residual vibrations (by at least 90%) and outperforms the commonly used input shaping (IS) for trajectories with the same execution time. Besides, our method can also execute the task faster than IS with a minor reduction in vibration suppression performance, thereby facilitating the development of new solutions for flexible object manipulation tasks." Constraint Manifolds for Robotic Inference and Planning,"Yetong Zhang, Fan Jiang, Gerry Chen, Varun Agrawal, Adam Rutkowski, Frank Dellaert","Georgia Institute of Technology,Air Force Research Laboratory",Optimization and Optimal Control,"We propose a manifold optimization approach for solving constrained inference and planning problems. The approach employs a framework that transforms an arbitrary nonlinear equality constrained optimization problem into an unconstrained manifold optimization problem. The core of the transformation process is the formulation of constraint manifolds that represent sets of variables subject to equality constraints. We propose various approaches to define the tangent spaces and retraction operations of constraint manifolds, which are crucial for manifold optimization. We evaluate our constraint manifold optimization approach on multiple constrained inference and planning problems, and show that it generates strictly feasible results with increased efficiency as compared to state-of-the-art constrained optimization methods." Model Predictive Optimized Path Integral Strategies,"Dylan M. 
Asmar, Ransalu Senanayake, Shawn Manuel, Mykel Kochenderfer",Stanford University,Optimization and Optimal Control,"We generalize the derivation of model predictive path integral control (MPPI) to allow for a single joint distribution across controls in the control sequence. This reformulation allows for the implementation of adaptive importance sampling (AIS) algorithms into the original importance sampling step while still maintaining the benefits of MPPI such as working with arbitrary system dynamics and cost functions. The benefit of optimizing the proposal distribution by integrating AIS at each control step is demonstrated in simulated environments including controlling multiple cars around a track. The new algorithm is more sample efficient than MPPI, achieving better performance with fewer samples. This performance disparity grows as the dimension of the action space increases. Results from simulations suggest the new algorithm can be used as an anytime algorithm, increasing the value of control at each iteration versus relying on a large set of samples." Real-Time Solutions to Multimodal Partially Observable Dynamic Games,"Oswin So, Paul Drews, Thomas Balch, Velin Dimitrov, Guy Rosman, Evangelos Theodorou","Massachusetts Institute of Technology,Toyota Research Institute,Georgia Institute of Technology",Optimization and Optimal Control,"Game theoretic methods have become popular for planning and prediction in situations involving rich multi-agent interactions. However, these methods often assume the existence of a single local Nash equilibrium and are hence unable to handle uncertainty in the intentions of different agents. While maximum entropy (MaxEnt) dynamic games try to address this issue, practical approaches solve for MaxEnt Nash equilibria using linear-quadratic approximations which are restricted to unimodal responses and unsuitable for scenarios with multiple local Nash equilibria. 
By reformulating the problem as a POMDP, we propose MPOGames, a method for efficiently solving MaxEnt dynamic games that captures the interactions between local Nash equilibria. We show the importance of uncertainty-aware game theoretic methods via a two-agent merge case study. Finally, we prove the real-time capabilities of our approach with hardware experiments on a 1/10th scale car platform." Autonomous Drone Racing: Time-Optimal Spatial Iterative Learning Control within a Virtual Tube,"Shuli Lv, Yan Gao, Jiaxing Che, Quan Quan","Beihang University,School of Automation Science and Electrical Engineering, Beihang",Optimization and Optimal Control,"It is often necessary for drones to complete delivery, photography, and rescue in the shortest time to increase efficiency. Many autonomous drone races provide platforms to pursue algorithms to finish races as quickly as possible for the above purpose. Unfortunately, existing methods often fail to keep training and racing time short in drone racing competitions. This motivates us to develop a highly efficient learning method by imitating the training experience of top racing drivers. Unlike traditional iterative learning control methods for accurate tracking, the proposed approach iteratively learns a trajectory online to finish the race as quickly as possible. Simulations and experiments using different models show that the proposed approach is model-free and is able to achieve the optimal result with low computation requirements. Furthermore, this approach surpasses some state-of-the-art methods in racing time on a benchmark drone racing platform. An experiment on a real quadcopter is also performed to demonstrate its effectiveness." Curvature-Aware Model Predictive Contouring Control,"Lorenzo Lyons, Laura Ferranti",Delft University of Technology,Optimization and Optimal Control,"We present a novel Curvature-Aware Model Predictive Contouring Control (CA-MPCC) formulation for mobile robotics motion planning. 
Our method aims at generalizing the traditional contouring control formulation derived from machining to autonomous driving applications. The proposed controller is capable of handling sharp curvatures in the reference path while subject to non-linear constraints, such as lane boundaries and dynamic obstacle collision avoidance. Compared to a standard MPCC formulation, our method improves the reliability of the path-following algorithm and simplifies the tuning, while preserving real-time capabilities. We validate our findings in both simulations and experiments on a scaled-down car-like robot." A Sequential Quadratic Programming Approach to the Solution of Open-Loop Generalized Nash Equilibria,"Edward Zhu, Francesco Borrelli","University of California, Berkeley",Optimization and Optimal Control,"In this work, we propose a numerical method for the solution of local generalized Nash equilibria (GNE) for the class of open-loop general-sum dynamic games for agents with nonlinear dynamics and constraints. In particular, we formulate a sequential quadratic programming (SQP) approach which requires only the solution of a single convex quadratic program at each iteration and is locally convergent. Central to the effectiveness of our approach is a non-monotonic line search method and a novel merit function for SQP step acceptance which helps to improve solver convergence beyond the local neighborhood of a GNE. We demonstrate the effectiveness of the algorithm in the context of car racing, where we see up to 32% improvement of success rate when comparing against a recent solution approach for dynamic games." RPGD: A Small-Batch Parallel Gradient Descent Optimizer with Explorative Resampling for Nonlinear Model Predictive Control,"Frederik Heetmeyer, Marcin Paluch, Diego Bolliger, Florian Bolli, Xiang Deng, Ennio Filicicchia, Tobi Delbruck","ETH Zurich,University of Zurich,Univ. 
of Zurich & ETH Zurich",Optimization and Optimal Control,"Nonlinear model predictive control often involves nonconvex optimization for which real-time control systems require fast and numerically stable solutions. This work proposes RPGD, a Resampling Parallel Gradient Descent optimizer designed to exploit small-batch parallelism of modern hardware like neural accelerators or multithreaded microcontrollers. After initialization, it continuously maintains a small population of good control trajectory solution candidates and improves them using gradient information, followed by selection of elite candidates and resampling of the others. In simulation on a cartpole, the OpenAI Gym mountain car, a Dubins car with obstacles, and a high input dimensional 2D arm, it produces similar or lower MPC costs than benchmark cross-entropy and path integral methods. On a physical cartpole, it performs swing-up and cart target following of the pole, using either a differential equation or multilayer perceptron as dynamics model. RPGD drives an F1TENTH simulated race car at near-optimal lap times and a real F1TENTH car in laps around a cluttered room. We study alterations of RPGD's building blocks to justify its composition. RPGD compute time in Python with TensorFlow optimization running on CPU is 2 to 4 times slower than the FORCESPRO commercial embedded solver." Distributionally Robust Optimization with Unscented Transform for Learning-Based Motion Control in Dynamic Environments,"Astghik Hakobyan, Insoon Yang",Seoul National University,Optimization and Optimal Control,"Safety is one of the main challenges when applying learning-based motion controllers to practical robotic systems, especially when the dynamics of the robots and their surrounding dynamic environments are unknown. This issue is further exacerbated when the learned information is unreliable and inaccurate. 
In this paper, we aim to enhance the safety of learning-enabled mobile robots in dynamic environments from the perspective of distributionally robust optimization (DRO) and the unscented transform (UT). Our method infers the unknown dynamics of both the robot and the environment by adopting Gaussian process regression with an uncertainty propagation scheme based on UT to improve prediction accuracy. This leads to a novel learning-based model predictive control (MPC) method in which state information about both the robot and the environment is propagated via UT. The proposed method uses DRO to proactively limit the risk of collisions or other unsafe events in the presence of learning errors. However, the distributionally robust risk constraint is intractable because it involves a separate infinite-dimensional optimization problem. To overcome this challenge, we exploit UT with modern DRO techniques to replace the risk constraint with its simple upper bound. The performance and the utility of our method are demonstrated through simulations in autonomous driving scenarios, showing its capability to enhance safety and computational efficiency." Event-Triggered Optimal Formation Tracking Control Using Reinforcement Learning for Large-Scale UAV Systems,"Ziwei Yan, Liang Han, Xiaoduo Li, Jinjie Li, Zhang Ren","Beihang University,Shanghai Jiao Tong University,Beihang University",Optimization and Optimal Control,"Large-scale UAV switching formation tracking control has been widely applied in many fields such as search and rescue, cooperative transportation, and UAV light shows. In order to optimize the control performance and reduce the computational burden of the system, this study proposes an event-triggered optimal formation tracking controller for discrete-time large-scale UAV systems (UASs). An optimal decision-optimal control framework is completed by introducing the Hungarian algorithm and actor-critic neural networks (NNs) implementation. 
Finally, a large-scale mixed reality experimental platform is built to verify the effectiveness of the proposed algorithm, which includes large-scale virtual UAV nodes and limited physical UAV nodes. This compensates for the limitations of the experimental field and equipment in real-world scenarios, ensures the experimental safety, significantly reduces the experimental cost, and is suitable for realizing large-scale UAV formation light shows." Differentiable Collision Detection: A Randomized Smoothing Approach,"Louis Montaut, Quentin Le Lidec, Antoine Bambade, Vladimír Petrík, Josef Sivic, Justin Carpentier","INRIA (Paris) - CIIRC (Prague),INRIA-ENS-PSL,INRIA Paris, ENPC France,Czech Technical University in Prague,Czech Technical University,INRIA",Optimization and Optimal Control,"Collision detection appears as a canonical operation in a large range of robotics applications from robot control to simulation, including motion planning and estimation. While the seminal works on the topic date back to the 80s, it is only recently that the question of properly differentiating collision detection has emerged as a central issue, thanks notably to the ongoing and various efforts made by the scientific community around the topic of differentiable physics. Yet, very few solutions have been suggested so far, and only with a strong assumption on the nature of the shapes involved. In this work, we introduce a generic and efficient approach to compute the derivatives of collision detection for any convex shapes, by notably leveraging randomized smoothing techniques which have been shown to be particularly adapted to capture the derivatives of non-smooth problems. This approach is implemented in the HPP-FCL and Pinocchio ecosystems, and evaluated on classic datasets and problems of the robotics literature, depicting few micro-seconds timings to compute informative derivatives directly exploitable by many real robotic applications including differentiable simulation." 
Start State Selection for Control Policy Learning from Optimal Trajectories,"Christoph Zelch, Jan Peters, Oskar Von Stryk",Technische Universität Darmstadt,Optimization and Optimal Control,"Combining optimal control methods and machine learning approaches allows one to profit from the complementary benefits of each field in the control of robotic systems. Data from optimal trajectories provides valuable information that can be used to learn a near-optimal state-dependent feedback control policy. To obtain high-quality learning data, careful selection of optimal trajectories, determined by a set of start states, is essential to achieve a good learning performance. In this paper, we extend previous work with new complementing strategies to generate start points. These methods complement the existing approach, as they introduce new criteria to identify relevant regions in joint state space that need coverage by new trajectories. It is demonstrated that the extensions significantly improve the overall performance of the previous method in simulation on a full nonlinear dynamics model of the industrial Manutec r3 robot arm. Further, it is demonstrated that it suffices to learn a policy that reaches the proximity of the goal state, from where a PI controller can be used for stable control reaching the final system state." Swarm-LIO: Decentralized Swarm LiDAR-Inertial Odometry,"Fangcheng Zhu, Yunfan Ren, Fanze Kong, Huajie Wu, Siqi Liang, Nan Chen, Wei Xu, Fu Zhang","The University of Hong Kong,Hong Kong University,Harbin Institute of Technology, Shenzhen,University of Hong Kong",Aerial Robotics II,"Accurate self and relative state estimation are the critical preconditions for completing swarm tasks, e.g., collaborative autonomous exploration, target tracking, search and rescue. 
This paper proposes Swarm-LIO: a fully decentralized state estimation method for aerial swarm systems, in which each drone performs precise ego-state estimation, exchanges ego-state and mutual observation information by wireless communication, and estimates relative state with respect to (w.r.t.) the rest of UAVs, all in real-time and only based on LiDAR-inertial measurements. A novel 3D LiDAR-based drone detection, identification and tracking method is proposed to obtain observations of teammate drones. The mutual observation measurements are then tightly-coupled with IMU and LiDAR measurements to perform real-time and accurate estimation of ego-state and relative state jointly. Extensive real-world experiments show the broad adaptability to complicated scenarios, including GPS-denied scenes, degenerate scenes for camera (dark night) or LiDAR (facing a single wall). Compared with ground-truth provided by motion capture system, the result shows the centimeter-level localization accuracy which outperforms other state-of-the-art LiDAR-inertial odometry for single UAV system." HALO: Hazard-Aware Landing Optimization for Autonomous Systems,"Christopher Hayner, Samuel Buckner, Daniel Broyles, Evelyn Madewell, Karen Yan Ming Leung, Behcet Acikmese","University of Washington,Stanford University, NVIDIA Research, University of Washington",Aerial Robotics II,"With autonomous aerial vehicles enacting safety-critical missions, such as the Mars Science Laboratory Curiosity rover’s landing on Mars, the identification and reasoning of potentially hazardous landing sites is paramount. This paper presents a coupled perception-planning solution which addresses the hazard detection, optimal landing trajectory generation, and contingency planning challenges encountered when landing in uncertain environments. 
Specifically, we develop and combine two novel algorithms, Hazard-Aware Landing Site Selection (HALSS) and Adaptive Deferred-Decision Trajectory Optimization (Adaptive-DDTO), to address the perception and planning challenges respectively. The HALSS framework processes point cloud information to identify feasible safe landing zones, while Adaptive-DDTO is a multi-target contingency planner that adaptively replans as new perception information is received. We demonstrate the efficacy of our approach using a simulated Martian environment and show that our coupled perception-planning method achieves greater landing success whilst being more fuel efficient compared to a non-adaptive DDTO approach." Onboard Controller Design for Nano UAV Swarm in Operator-Guided Collective Behaviors,"Tugay Alperen Karagüzel, Victor Retamal Guiberteau, Eliseo Ferrante",Vrije Universiteit Amsterdam,Aerial Robotics II,"In this paper, we present a swarm of Crazyflie nano-drones. The swarm can show various collective behaviors: Flocking, gradient following, going to a chosen point, formation, and scattered search of the environment. The methodology behind the behaviors is implemented on-board. Crazyflies use a common radio channel to share positions with each other. If desired, an operator can use the same channel and start, end, change or guide the collective behaviors on-air. We use the virtual force vectors and modify the way they are combined to achieve different behaviors instead of developing unique algorithms for each. This allows us to develop more collective behavior types with less effort. In the results, we show a detailed analysis of the behaviors and assess the coordination and the safety of the agents in addition to the performance as a collective. We conclude that our swarm of 6 Crazyflies was successful in the desired behaviors." 
EFTrack: A Lightweight Siamese Network for Aerial Object Tracking,"Wenqi Zhang, Yuan Yao, Xincheng Liu, Kai Kou, Gang Yang",Northwestern Polytechnical University,Aerial Robotics II,"Visual object tracking is a very important task for unmanned aerial vehicles (UAVs). The limited resources of UAVs lead to strong demand for efficient and robust trackers. In recent years, deep learning-based trackers, especially Siamese trackers, have achieved very impressive results. Though Siamese trackers can run at a relatively fast speed on high-end GPUs, they are becoming heavier and heavier, which restricts their deployment on UAV platforms. In this work, we propose a lightweight aerial tracker based on the Siamese network. We use EfficientNet as the backbone, which has fewer parameters and stronger feature extraction ability compared with ResNet-50. After a pixel-wise correlation, a classification branch and a regression branch are applied to predict the front/back score and offset of the target without the predefined anchor. The results show that our tracker works efficiently and achieves impressive performance on UAV tracking datasets. In addition, the real-world test shows that it runs effectively on the Nvidia Jetson NX deployed on a DJI UAV." Active Metric-Semantic Mapping by Multiple Aerial Robots,"Xu Liu, Ankit Prabhu, Fernando Cladera, Ian Douglas Miller, Lifeng Zhou, Camillo Jose Taylor, Vijay Kumar","University of Pennsylvania,Drexel University",Aerial Robotics II,"Traditional approaches for active mapping focus on building geometric maps. For most real-world applications, however, actionable information is related to semantically meaningful objects in the environment. We propose an approach to the active metric-semantic mapping problem that enables multiple heterogeneous robots to collaboratively build a map of the environment. The robots actively explore to minimize the uncertainties in both semantic (object classification) and geometric (object modeling) information. 
We represent the environment using informative but sparse object models, each consisting of a basic shape and a semantic class label, and characterize uncertainties empirically using a large amount of real-world data. Given a prior map, we use this model to select actions for each robot to minimize uncertainties. The performance of our algorithm is demonstrated through multi-robot experiments in diverse real-world environments. The proposed framework is applicable to a wide range of real-world problems, such as precision agriculture, infrastructure inspection, and asset mapping in factories." Multi-Target Pursuit by a Decentralized Heterogeneous UAV Swarm Using Deep Multi-Agent Reinforcement Learning,"Maryam Kouzehgar, Youngbin Song, Malika Meghjani, Roland Bouffanais","Singapore University of Technology and Design,University of Ottawa",Aerial Robotics II,"Multi-agent pursuit-evasion tasks involving intelligent targets are notoriously challenging coordination problems. In this paper, we investigate new ways to learn such coordinated behaviors of unmanned aerial vehicles (UAVs) aimed at keeping track of multiple evasive targets. Within a Multi-Agent Reinforcement Learning (MARL) framework, we specifically propose a variant of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) method. Our approach addresses multi-target pursuit-evasion scenarios within non-stationary and unknown environments with random obstacles. In addition, given the critical role played by collective exploration in terms of detecting possible targets, we implement heterogeneous roles for the pursuers for enhanced exploratory actions balanced by exploitation (i.e. tracking) of previously identified targets. Our proposed role-based MADDPG algorithm is not only able to track multiple targets, but also is able to explore for possible targets by means of the proposed Voronoi-based rewarding policy. 
We implemented, tested and validated our approach in a simulation environment prior to deploying a real-world multi-robot system comprising Crazyflie drones. Our results demonstrate that a multi-agent pursuit team has the ability to learn highly efficient coordinated control policies in terms of target tracking and exploration even when confronted with multiple fast evasive targets in complex environments." A Moving Target Tracking System of Quadrotors with Visual-Inertial Localization,"Ziyue Lin, Wenbo Xu, Wei Wang","Institute of Automation, Chinese Academy of Sciences",Aerial Robotics II,"This paper implements a vision-based moving target tracking system of quadrotors with visual-inertial localization in GNSS-denied indoor environments. We use visual-inertial odometry to estimate the states of the UAV by minimizing visual and inertial residuals, and estimate the states of the target with an extended Kalman filter from visual detection. This research formulates the target tracking problem as optimization-based trajectory generation where a weighted sum cost function jointly penalizes the tracking error, the control cost of the trajectory and the trajectory length, while enforcing the safety and feasibility constraints. We present a strategy that represents the trajectory as piecewise Bézier curves using the Bernstein polynomial basis. Due to the special properties of Bézier curves, the position of the entire trajectory and its derivatives can be directly bounded within the safe spaces, thus facilitating the dynamics of the quadrotor. The proposed strategy can generate smooth and collision-free tracking trajectories and is time and space efficient. We conduct simulations and real-world experiments to validate the effectiveness of our system." 
BogieCopter: A Multi-Modal Aerial-Ground Vehicle for Long-Endurance Inspection Applications,"Teodoro Dias, Meysam Basiri",Instituto Superior Técnico,Aerial Robotics II,"The use of Micro Aerial Vehicles (MAVs) for inspection and surveillance missions has proved to be extremely useful, however, their usability is negatively impacted by the large power requirements and the limited operating time. This work describes the design and development of a novel hybrid aerial-ground vehicle, enabling multi-modal mobility and long operating time, suitable for long-endurance inspection and monitoring applications. The design consists of a MAV with two tiltable axles and four independent passive wheels, allowing it to fly, approach, land and move on flat and inclined surfaces, while using the same set of actuators for all modes of locomotion. In comparison to existing multi-modal designs with passive wheels, the proposed design enables a higher ground locomotion efficiency, provides a higher payload capacity, and presents one of the lowest mass increases due to the ground actuation mechanism. The vehicle’s performance is evaluated through a series of real experiments, demonstrating its flying, ground locomotion and wall-climbing capabilities, and the energy consumption for all modes of locomotion is evaluated." Towards Autonomous UAV Railway DC Line Recharging: Design and Simulation,"Frederik Falk Nyboe, Nicolaj Malle, Gerd Vom Bögel, Linda Cousin, Thomas Heckel, Konstantin Troidl, Anders Schack Madsen, Emad Samuel Malki Ebeid","University of Southern Denmark,Fraunhofer IMS,Fraunhofer IISB",Perception,"Autonomously recharging UAVs from existing infrastructure has enormous potential for various applications, such as infrastructure inspection, surveillance, and search and rescue. 
While it is an active area of research, most related work focuses on alternating current (AC) infrastructure while very little work has been done on investigating the potential of recharging UAVs from direct current (DC) infrastructure. This work proposes a UAV system designed to autonomously recharge from existing DC infrastructure. Two onboard powerline grippers and a motorized cable drum enable the UAV to perform a two-stage landing on railway DC lines where a wire is connected between them through the UAV for recharging. Light-weight electronics designed to be carried by the UAV are developed to harvest energy from up to 3kV DC railway lines. The recharge mission is autonomously executed using fully onboard and real-time perception and trajectory planning and tracking algorithms. The potential of the system is shown in lab setting validation, with hardware-in-the-loop simulation, and partly in a real overhead powerline environment, verifying the functionality of the sub-components." Fast Region of Interest Proposals on Maritime UAVs,"Benjamin Kiefer, Andreas Zell","University of Tuebingen,University of Tübingen",Perception,"Unmanned aerial vehicles assist in maritime search and rescue missions by flying over large search areas to autonomously search for objects or people. Reliably detecting objects of interest requires fast models to employ on embedded hardware. Moreover, with increasing distance to the ground station only part of the video data can be transmitted. In this work, we consider the problem of finding meaningful region of interest proposals in a video stream on an embedded GPU. Current object or anomaly detectors are not suitable due to their slow speed, especially on limited hardware and for large image resolutions. Lastly, objects of interest, such as pieces of wreckage, are often not known a priori. Therefore, we propose an end-to-end future frame prediction model running in real-time on embedded GPUs to generate region proposals. 
We analyze its performance on large-scale maritime data sets and demonstrate its benefits over traditional and modern methods." TRADE: Object Tracking with 3D Trajectory and Ground Depth Estimates for UAVs,"Pedro Proença, Patrick Spieler, Robert Hewitt, Jeff Delaune","NASA-JPL,JPL,Jet Propulsion Laboratory",Perception,"We propose TRADE for robust tracking and 3D localization of a moving target in complex environments, from UAVs equipped with a single camera. Ultimately, TRADE enables 3D-aware target following. Tracking-by-detection approaches are vulnerable to target switching, especially between similar objects. Thus, TRADE predicts and incorporates the target 3D trajectory to select the right target from the tracker’s response map. Unlike in static environments, depth estimation of a moving target from a single camera is an ill-posed problem. Therefore, we propose a novel 3D localization method for ground targets on complex terrain. It reasons about scene geometry by combining ground plane segmentation, depth-from-motion and single-image depth estimation. The benefits of using TRADE are demonstrated as tracking robustness and depth accuracy on several dynamic scenes simulated in this work. Additionally, we demonstrate autonomous target following using a thermal camera by running TRADE on a quadcopter’s onboard computer." Adaptive Keyframe Generation Based LiDAR Inertial Odometry for Complex Underground Environments,"Boseong Kim, Chanyoung Jung, David Hyunchul Shim, Ali-Akbar Agha-Mohammadi","KAIST,NASA-JPL, Caltech",Perception,"In this paper, we present a LiDAR Inertial Odometry (LIO) algorithm utilizing adaptive keyframe generation which achieves fast and accurate state estimation for aerial and ground robots. It is known that keyframe generation significantly affects the performance of Simultaneous Localization and Mapping (SLAM) algorithms. 
Unlike existing SLAM algorithms that generate keyframes based on fixed conditions, we propose to use adaptive keyframe generation conditions considering characteristics of surrounding environment using real-time LiDAR scans. When a keyframe is generated, the keyframe and the corresponding LiDAR measurements are stored in our novel data structure designed for efficient sub-map generation. The scan to sub-map matching module then uses the Generalized Iterative Closest Point (GICP) algorithm to adjust estimated states at a global scale, producing more accurate and globally consistent state estimation results even in large-scale underground environments. Experimental results from diverse types of underground environments show that the proposed method outperforms the existing state-of-the-art LIO algorithms in various metrics such as computational speed, CPU usage, and accuracy." Finding Things in the Unknown: Semantic Object-Centric Exploration with an MAV,"Sotirios Papatheodorou, Nils Funk, Dimos Tzoumanikas, Christopher Choi, Binbin Xu, Stefan Leutenegger","Imperial College London,University of Toronto,Technical University of Munich",Perception,"Exploration of unknown space with an autonomous mobile robot is a well-studied problem. In this work we broaden the scope of exploration, moving beyond the pure geometric goal of uncovering as much free space as possible. We believe that for many practical applications, exploration should be contextualised with semantic and object-level understanding of the environment for task-specific exploration. Here, we study the task of both finding specific objects in unknown space as well as reconstructing them to a target level of detail. We therefore extend our environment reconstruction to not only consist of a background map, but also object-level and semantically fused submaps. 
Importantly, we adapt our previous objective function of uncovering as much free space as possible in as little time as possible with two additional elements: first, we require a maximum observation distance of background surfaces to ensure target objects are not missed by image-based detectors because they are too small to be detected. Second, we require an even smaller maximum distance to the found objects in order to reconstruct them with the desired accuracy. We further created a Micro Aerial Vehicle (MAV) semantic exploration simulator based on Habitat in order to quantitatively demonstrate how our framework can be used to efficiently find specific objects as part of exploration. Finally, we showcase this capability can be deployed in real-world scenes involving our drone equipped with an Intel RealSense D455 RGB-D camera." Stealthy Perception-Based Attacks on Unmanned Aerial Vehicles,"Amir Khazraei, Haocheng Meng, Miroslav Pajic","Duke university,Duke University",Perception,"In this work, we study vulnerability of unmanned aerial vehicles (UAVs) to stealthy attacks on perception-based control. To guide our analysis, we consider two specific missions: (i) ground vehicle tracking (GVT), and (ii) vertical take-off and landing (VTOL) of a quadcopter on a moving ground vehicle. Specifically, we introduce a method to consistently attack both the sensors measurements and camera images over time, in order to cause control performance degradation (e.g., by failing the mission) while remaining stealthy (i.e., undetected by the deployed anomaly detector). Unlike existing attacks that mainly rely on vulnerability of deep neural networks to small input perturbations (e.g., by adding small patches and/or noise to the images), we show that stealthy yet effective attacks can be designed by changing images of the ground vehicle's landing markers as well as suitably falsifying sensing data. We illustrate the effectiveness of our attacks in Gazebo 3D robotics simulator." 
SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking,"Liangliang Yao, Changhong Fu, Sihang Li, Guangze Zheng, Junjie Ye",Tongji University,Perception,"Vision-based object tracking has boosted extensive autonomous applications for unmanned aerial vehicles (UAVs). However, the dynamic changes in flight maneuver and viewpoint encountered in UAV tracking pose significant difficulties, e.g., aspect ratio change, and scale variation. The conventional cross-correlation operation, while commonly used, has limitations in effectively capturing perceptual similarity and incorporates extraneous background information. To mitigate these limitations, this work presents a novel saliency-guided dynamic vision Transformer (SGDViT) for UAV tracking. The proposed method designs a new task-specific object saliency mining network to refine the cross-correlation operation and effectively discriminate foreground and background information. Additionally, a saliency adaptation embedding operation dynamically generates tokens based on initial saliency, thereby reducing the computational complexity of the Transformer architecture. Finally, a lightweight saliency filtering Transformer further refines saliency information and increases the focus on appearance information. The efficacy and robustness of the proposed approach have been thoroughly assessed through experiments on three widely-used UAV tracking benchmarks and real-world scenarios, with results demonstrating its superiority. The source code and demo videos are available at https://github.com/vision4robotics/SGDViT." Semantics-Aware Exploration and Inspection Path Planning,"Mihir Rahul Dharmadhikari, Kostas Alexis",NTNU - Norwegian University of Science and Technology,Perception,"This paper contributes a novel strategy for semantics-aware autonomous exploration and inspection path planning. 
Attuned to the fact that environments that need to be explored often involve a sparse set of semantic entities of particular interest, the proposed method offers volumetric exploration combined with two new planning behaviors that together ensure that a complete mesh model is reconstructed for each semantic, while its surfaces are observed at appropriate resolution and through suitable viewing angles. Evaluated in extensive simulation studies and experimental results using a flying robot, the planner delivers efficient combined exploration and high-fidelity inspection planning that is focused on the semantics of interest. Comparisons against relevant methods of the state-of-the-art are further presented." Inverted Landing in a Small Aerial Robot Via Deep Reinforcement Learning for Triggering and Control of Rotational Maneuvers,"Bryan Habas, Jack W. Langelaan, Bo Cheng","The Pennsylvania State University,Penn State University,Pennsylvania State University",Micro Aerial Robots,"Inverted landing in a rapid and robust manner is a challenging feat for aerial robots, especially while depending entirely on onboard sensing and computation. In spite of this, this feat is routinely performed by biological fliers such as bats, flies, and bees. Our previous work has identified a direct causal connection between a series of onboard visual cues and kinematic actions that allow for reliable execution of this challenging aerobatic maneuver in small aerial robots. In this work, we utilized Deep Reinforcement Learning and a physics-based simulation to obtain a general, optimal control policy for robust inverted landing starting from any arbitrary approach condition. This optimized control policy provides a computationally-efficient mapping from the system's emulated observational space to its motor command action space, including both triggering and control of rotational maneuvers. 
This was accomplished by training the system over a large range of approach flight velocities that varied with magnitude and direction. Next, we performed a sim-to-real transfer and experimental validation of the learned policy via domain randomization, by varying the robot's inertial parameters in the simulation. Through experimental trials, we identified several dominant factors which greatly improved landing robustness and the primary mechanisms that determined inverted landing success. We expect the reinforcement learning framework developed in this study can be generalized to solve more challenging tasks, such as utilizing noisy onboard sensory data, landing on surfaces of various orientations, or landing on dynamically-moving surfaces." Heading Control of a Long-Endurance Insect-Scale Aerial Robot Powered by Soft Artificial Muscles,"Yi-Hsuan Hsiao, Suhan Kim, Zhijian Ren, Yufeng Chen","Massachusetts Institute of Technology,Massachusetts Institute of Technology (MIT)",Micro Aerial Robots,"Aerial insects demonstrate fast and precise heading control when they perform body saccades and rapid escape maneuvers. While insect-scale micro-aerial-vehicles (IMAVs) have demonstrated early results on heading control, their flight endurance and heading angle tracking accuracy remain far inferior to that of natural fliers. In this work, we present a long endurance sub-gram aerial robot that can demonstrate effective heading control during hovering flight. Through using a tilted wing stroke-plane design, our robot demonstrates a 10-second flight where it tracks a desired yaw trajectory with maximum and root-mean-square (RMS) error of 14.2° and 5.8°. The new robot design requires 7% higher lift forces for enabling heading angle control, which creates higher stress on wing hinges and adversely influences robot endurance. To address this challenge, we developed novel 3-layered wing hinges that exhibit 1.82 times improvement of lifetime. 
With the new wing hinges, our robot demonstrates a 40-second hovering flight – the longest among existing sub-gram IMAVs. These results represent substantial improvement of flight capabilities in soft-actuated IMAVs, showing the potential of operating these insect-like fliers in cluttered natural environments." "Robust, High-Rate Trajectory Tracking on Insect-Scale Soft-Actuated Aerial Robots with Deep-Learned Tube MPC","Andrea Tagliabue, Yi-Hsuan Hsiao, Urban Fasel, J. Nathan Kutz, Steven L. Brunton, Yufeng Chen, Jonathan Patrick How","Massachusetts Institute of Technology,Imperial College London,University of Washington",Micro Aerial Robots,"Accurate and agile trajectory tracking in sub-gram MAVs is challenging, as the small scale of the robot induces large model uncertainties, demanding robust feedback controllers, while the fast dynamics and computational constraints prevent the deployment of computationally expensive strategies. In this work, we present an approach for agile and computationally efficient trajectory tracking on the MIT SoftFly, a sub-gram MAV (0.7 grams). Our strategy employs a cascaded control scheme, where an adaptive attitude controller is combined with a neural network policy trained to imitate a trajectory tracking robust tube model predictive controller (RTMPC). The neural network policy is obtained using our recent work, which enables the policy to preserve the robustness of RTMPC, but at a fraction of its computational cost. We experimentally evaluate our approach, achieving position Root Mean Square Error (RMSE) lower than 1.8 cm even in the more challenging maneuvers, obtaining a 60% reduction in maximum position error compared to our prior work, and demonstrating robustness to large external disturbances." 
A New Sensation: Digital Strain Sensing for Disturbance Detection in Flapping Wing Micro Aerial Vehicles,"Regan Kubicek, Mahnoush Babaei, Alison Weber, Sarah Bergbreiter","Carnegie Mellon University,The University of Texas at Austin,University of Washington",Micro Aerial Robots,"Flapping wing micro aerial vehicles face challenges in sensing and reacting to disturbances like wind gusts. This work introduces a new microscale bio-inspired digital strain sensor to detect these perturbations. The sensor is designed to change logic states when a specified strain threshold has been reached. The sensors are 3D printed on a flexible Mylar wing using two-photon polymerization. Three digital sensors with varying strain thresholds demonstrate differences in activation timing due to different design parameters. The sensors are tested at the 25 Hz flapping frequency of a hawkmoth, an insect with comparable wing size. A perturbation was added to the flapping wing by subjecting it to a 3 m/s wind gust. A single digital sensor is able to identify the wind disturbance by comparing the time of the first strain threshold crossing." A Lightweight High-Voltage Boost Circuit for Soft-Actuated Micro-Aerial-Robots,"Zhijian Ren, Jiahui Yang, Suhan Kim, Yi-Hsuan Hsiao, Jeffrey Lang, Yufeng Chen","Massachusetts Institute of Technology,Southern University of Science and Technology,Massachusetts Institute of Technology (MIT),MIT",Micro Aerial Robots,"Flight is an energetically expensive task. While aerial insects can effortlessly fly through natural environments, achieving power autonomous flights in insect-scale robots remains a major challenge. In prior works, we developed soft-actuated insect-scale aerial robots that demonstrated unique capabilities such as in-flight collision recovery and somersaults. However, the soft dielectric elastomer actuators (DEAs) have low efficiency (600 V). These properties represent formidable obstacles for soft aerial robots to achieve power autonomous flights. 
In this work, we developed a 127 mg boost circuit that can convert a 7.7 V DC input into a 600 V and 400 Hz output for driving a 120 mg DEA. The DEA has an equivalent capacitance and resistance of 20 nF and 5 kΩ, respectively. The DEA is assembled into a 158 mg aerial robot, which can demonstrate liftoff while carrying the boost circuit as a payload. Although the robot remains tethered to an offboard power supply, this result represents a first step towards achieving power autonomy in soft aerial robots." Hummingbird-Bat Hybrid Wing by 3-D Printing,"Tomoya Fujii, Jinqiang Dang, Hiroto Tanaka","Tokyo institute of technology,Tokyo Institute of Technology",Micro Aerial Robots,"Hovering hummingbirds have inspired small flapping-wing aerial robots. Natural flyers, including hummingbirds and bats, undergo torsional wing deformation during flapping flight owing to complex wing structure, while previous artificial wings were relatively simple and difficult to design the torsional flexibility. In this paper, we proposed a hummingbird-bat hybrid (HBH) wing in which torsional flexibility was implemented by an available fabrication technology. The HBH wing had a torsional arm at the leading edge inspired by a torsional wrist of a hummingbird. A bat-like stretchable wing membrane was also employed not to constrain the wing torsion. The membrane was supported by wing shafts of which bending stiffness was designed based on that of the feather shaft of a hummingbird. The three-dimensional (3-D) shape of the torsional arm and wing shafts was created by 3-D printing. The effect of the torsional arm and stretchable membrane on lift generation and deformation was evaluated using an electric flapping mechanism. It was confirmed that the torsional arm actually enhanced the passive wing torsion. The stretchable wing membrane further promoted the torsion effect of the torsional arm. 
Consequently, the HBH wing did not increase lift, but efficacy, defined as lift per input power, was greatly improved by 14% at most compared with the wing without a torsional arm." Ultra-Low Power Deep Learning-Based Monocular Relative Localization Onboard Nano-Quadrotors,"Stefano Bonato, Stefano Carlo Lambertenghi, Elia Cereda, Alessandro Giusti, Daniele Palossi","USI and SUPSI,USI, SUPSI,IDSIA USI-SUPSI,IDSIA Lugano, SUPSI,ETH Zurich",Micro Aerial Robots,"Precise relative localization is a crucial functional block for swarm robotics. This work presents a novel autonomous end-to-end system that addresses the monocular relative localization, through deep neural networks (DNNs), of two peer nano-drones, i.e., sub-40g of weight and sub-100mW processing power. To cope with the ultra-constrained nano-drone platform, we propose a vertically-integrated framework, from the dataset collection to the final in-field deployment, including dataset augmentation, quantization, and system optimizations. Experimental results show that our DNN can precisely localize a 10cm-size target nano-drone by employing only low-resolution monochrome images, up to ~2m distance. On a disjoint testing dataset our model yields a mean R2 score of 0.42 and a root mean square error of 18cm, which results in a mean in-field prediction error of 15cm and in a closed-loop control error of 17cm, over a ~60s-flight test. Ultimately, the proposed system improves the State-of-the-Art by showing long-endurance tracking performance (up to 2 min continuous tracking), generalization capabilities being deployed in a never-seen-before environment, and requiring a minimal power consumption of 95mW for an onboard real-time inference-rate of 48Hz." 
A Hybrid Quadratic Programming Framework for Real-Time Embedded Safety-Critical Control,"Ryan Bena, Sushmit Hossain, Buyun Chen, Wei Wu, Quan Nguyen",University of Southern California,Micro Aerial Robots,"We present a new framework for implementing real-time embedded safety-critical controllers which utilizes hybrid computing to address the issue of limited computational resources, a problem that is particularly prevalent in microrobotics. In our approach, the nominal stabilizing control algorithm is implemented digitally while the safety-critical quadratic program is solved via a dedicated analog resistor array. We apply this hybrid computing architecture to a simulated collision avoidance task for a micro-aerial vehicle and show the benefit relative to a purely-digital implementation. By leveraging analog quadratic programming on the Crazyflie 2.1 micro quadrotor, a reduction in overall processing time from 8.9 ms to 0.6 ms is estimated for this computationally-limited system. We further display the viability of our proposed safety-critical control framework through real-time flight demonstrations, utilizing a novel prototype analog circuit tethered to the Crazyflie. The flight results confirm the functionality of the control structure and prototype circuit while highlighting the overall capabilities of hybrid computing." D2CoPlan: A Differentiable Decentralized Planner for Multi-Robot Coverage,"Vishnu Sharma, Lifeng Zhou, Pratap Tokekar","University of Maryland,Drexel University",Multi-Robot Systems II,"Centralized approaches for multi-robot coverage planning problems suffer from the lack of scalability. Learning-based distributed algorithms provide a scalable avenue in addition to bringing data-oriented feature generation capabilities to the table, allowing integration with other learning-based approaches. 
To this end, we present a learning-based, differentiable distributed coverage planner (D2CoPlan) which scales efficiently in runtime and number of agents compared to the expert algorithm, and performs on par with the classical distributed algorithm. In addition, we show that D2CoPlan can be seamlessly combined with other learning methods to learn end-to-end, resulting in a better solution than the individually trained modules, opening doors to further research for tasks that remain elusive with classical methods." Accelerating Multi-Agent Planning Using Graph Transformers with Bounded Suboptimality,"Chenning Yu, Qingbiao Li, Sicun Gao, Amanda Prorok","University of California San Diego,The University of Cambridge,UCSD,University of Cambridge",Multi-Robot Systems II,"Conflict-Based Search is one of the most popular methods for multi-agent path finding. Though it is complete and optimal, it does not scale well. Recent works have been proposed to accelerate it by introducing various heuristics. However, whether these heuristics can apply to non-grid-based problem settings while maintaining their effectiveness remains an open question. In this work, we find that the answer tends to be no. To this end, we propose a learning-based component, i.e., the Graph Transformer, as a heuristic function to accelerate the planning. The proposed method is provably complete and bounded-suboptimal with any desired factor. We conduct extensive experiments on two environments with dense graphs. Results show that the proposed Graph Transformer can be trained in problem instances with relatively few agents and generalizes well to a larger number of agents, while achieving better performance than state-of-the-art methods." 
Environment Optimization for Multi-Agent Navigation,"Zhan Gao, Amanda Prorok",University of Cambridge,Multi-Robot Systems II,"Traditional approaches to the design of multi-agent navigation algorithms consider the environment as a fixed constraint, despite the obvious influence of spatial constraints on agents' performance. Yet hand-designing improved environment layouts and structures is inefficient and potentially expensive. The goal of this paper is to consider the environment as a decision variable in a system-level optimization problem, where both agent performance and environment cost can be accounted for. We begin by proposing a novel environment optimization problem. We show, through formal proofs, under which conditions the environment can change while guaranteeing completeness (i.e., all agents reach their navigation goals). Our solution leverages a model-free reinforcement learning approach. In order to accommodate a broad range of implementation scenarios, we include both online and offline optimization, and both discrete and continuous environment representations. Numerical results corroborate our theoretical findings and validate our approach." Heterogeneous Coverage and Multi-Resource Allocation in Supply-Constrained Teams,"Mela Coffey, Alyssa Pierson",Boston University,Multi-Robot Systems II,"We consider a team of heterogeneous robots, each equipped with various types and quantities of resources, and tasked with supplying these resources to multiple areas of demand. We propose a Voronoi-based coverage control approach to deploy robots to areas of demand by defining a position- and time-varying density function to represent the quality at which demand is being met in the environment. This approach allows robots to prioritize the various demand locations in a continuous, distributed fashion. 
We present analyses to show that our controls drive the robots to critical points in the environment, along with simulations and hardware-in-the-loop experiments to demonstrate our approach." Sequential Stochastic Multi-Task Assignment for Multi-Robot Deployment Planning,"Colin Mitchell, Graeme Best, Geoffrey Hollinger","Oregon State University,University of Technology Sydney",Multi-Robot Systems II,"Real time sequential decision making under uncertainty is a challenging task for autonomous robots. Such problems are even more challenging when making decisions involving heterogeneous teams of robots completing multiple tasks. Deploying autonomous taxi cabs and utilizing drones for package delivery represent relevant examples of these types of problems. In this paper, we present an effective solution to a multi-robot multi-task sequential stochastic assignment problem using a simulation-based optimization algorithm (MARP). Our algorithm employs a novel approach that uses Monte Carlo simulation to seek the deployment with the highest probability of being optimal. To demonstrate MARP's performance and robustness, we performed more than 2,000 numerical experiments in two different problem domains, evaluating MARP's performance against three different comparison algorithms. These numerical studies show that it significantly outperforms the comparison methods, achieving results within 5% of the maximum possible reward." Path Planning under Uncertainty to Localize mmWave Sources,"Kai Pfeiffer, Yuze Jia, Mingsheng Yin, Akshaj Kumar Veldanda, Yaqi Hu, Amee Trivedi, Jeff Jun Zhang, Siddharth Garg, Elza Erkip, Sundeep Rangan, Ludovic Righetti","Nanyang Technological University,NYU,UBC,Yale,New York University",Multi-Robot Systems II,"In this paper, we study a navigation problem where a mobile robot needs to locate a mmWave wireless signal. 
Using the directionality properties of the signal, we propose an estimation and path planning algorithm that can efficiently navigate in cluttered indoor environments. We formulate Extended Kalman filters for emitter location estimation in cases where the signal is received in line-of-sight or after reflections. We then propose to plan motion trajectories based on belief-space dynamics in order to minimize the uncertainty of the position estimates. The associated non-linear optimization problem is solved by a state-of-the-art constrained iLQR solver. In particular, we propose a method that can handle a large number of obstacles (∼ 300) with reasonable computation times. We validate the approach in an extensive set of simulations. We show that our estimators can help increase navigation success rate and that planning to reduce estimation uncertainty can improve the overall task completion speed." Communication-Critical Planning Via Multi-Agent Trajectory Exchange,"Nathaniel Glaser, Zsolt Kira",Georgia Institute of Technology,Multi-Robot Systems II,"This paper addresses the task of joint multi-agent perception and planning, especially as it relates to the real-world challenge of collision-free navigation for connected self-driving vehicles. For this task, several communication-enabled vehicles must navigate through a busy intersection while avoiding collisions with each other and with obstacles. To this end, this paper proposes a learnable costmap-based planning mechanism, given raw perceptual data, that is (1) distributed, (2) uncertainty-aware, and (3) bandwidth-efficient. Our method produces a costmap and uncertainty-aware entropy map to sort and fuse candidate trajectories as evaluated across multiple-agents. The proposed method demonstrates several favorable performance trends on a suite of open-source overhead datasets as well as within a novel communication-critical simulator. 
It produces accurate semantic occupancy forecasts as an intermediate perception output, attaining a 72.5% average pixel-wise classification accuracy. By selecting the top trajectory, the multi-agent method scales well with the number of agents, reducing the hard collision rate by up to 57% with eight agents compared to the single-agent version." Distributed Potential iLQR: Scalable Game-Theoretic Trajectory Planning for Multi-Agent Interactions,"Zach Williams, Jushan Chen, Negar Mehr",University of Illinois Urbana-Champaign,Multi-Robot Systems II,"In this work, we develop a scalable, local trajectory optimization algorithm that enables robots to interact with other robots. It has been shown that agents’ interactions can be successfully captured in game-theoretic formulations, where the interaction outcome can be best modeled via the equilibria of the underlying dynamic game. However, it is typically challenging to compute equilibria of dynamic games as it involves simultaneously solving a set of coupled optimal control problems. Existing solvers operate in a centralized fashion and do not scale up tractably to multiple interacting agents. We enable scalable distributed game-theoretic planning by leveraging the structure inherent in multi-agent interactions, namely, interactions belonging to the class of dynamic potential games. Since equilibria of dynamic potential games can be found by minimizing a single potential function, we can apply distributed and decentralized control techniques to seek equilibria of multi-agent interactions in a scalable and distributed manner. We compare the performance of our algorithm with a centralized interactive planner in a number of simulation studies and demonstrate that our algorithm results in better efficiency and scalability. We further evaluate our method in hardware experiments involving multiple quadcopters." 
FRAME: Fast and Robust Autonomous 3D Point Cloud Map-Merging for Egocentric Multi-Robot Exploration,"Nikolaos Stathoulopoulos, Anton Koval, Ali-Akbar Agha-Mohammadi, George Nikolakopoulos","Luleå University of Technology, Robotics and AI Group,Luleå University of Technology,NASA-JPL, Caltech",Multi-Robot Systems II,"This article presents a 3D point cloud map-merging framework for egocentric heterogeneous multi-robot exploration, based on overlap detection and alignment, that is independent of a manual initial guess or prior knowledge of the robots' poses. The novel proposed solution utilizes state-of-the-art place recognition learned descriptors, that through the framework's main pipeline, offer a fast and robust region overlap estimation, hence eliminating the need for the time-consuming global feature extraction and feature matching process that is typically used in 3D map integration. The region overlap estimation provides a homogeneous rigid transform that is applied as an initial condition in the point cloud registration algorithm Fast-GICP, which provides the final and refined alignment. The efficacy of the proposed framework is experimentally evaluated based on multiple field multi-robot exploration missions in underground environments, where both ground and aerial robots are deployed, with different sensor configurations." Autonomous Task Planning for Heterogeneous Multi-Agent Systems,"Anatoli Tziola, Savvas Loizou",Cyprus University of Technology,Multi-Robot Systems II,"This paper presents a solution to the automatic task planning problem for multi-agent systems. A formal framework is developed based on Nondeterministic Finite Automata with $\epsilon$-transitions, where given the capabilities, constraints and failure modes of the agents involved, any initial state of the system and a task specification, an optimal solution is generated that satisfies the system constraints and the task specification.
The resulting solution is guaranteed to be complete and optimal; moreover a heuristic solution that offers significant reduction of the computational requirements while relaxing the completeness and optimality requirements is proposed. The constructed system model is independent from the initial conditions and the task specifications, eliminating the need to repeat the costly pre-processing cycle, while allowing the incorporation of failure modes on-the-fly. A case study is provided to demonstrate the effectiveness and validity of the methodology." Graph Neural Networks for Multi-Robot Active Information Acquisition,"Mariliza Tzes, Nikolaos Bousias, Evangelos Chatzipantazis, George J. Pappas",University of Pennsylvania,Multi-Robot Systems II,"This paper addresses the Multi-Robot Active Information Acquisition (AIA) problem, where a team of mobile robots, communicating through an underlying graph, estimates a hidden state expressing a phenomenon of interest. Applications like target tracking, coverage and SLAM can be expressed in this framework. Existing approaches, though, are either not scalable, unable to handle dynamic phenomena or not robust to changes in the communication graph. To counter these shortcomings, we propose an Information-aware Graph Block Network (I-GBNet), an AIA adaptation of Graph Neural Networks, that aggregates information over the graph representation and provides sequential-decision making in a distributed manner. The I-GBNet, trained via imitation learning with a centralized sampling-based expert solver, exhibits permutation equivariance and time invariance, while harnessing the superior scalability, robustness and generalizability to previously unseen environments and robot configurations. 
Numerical simulations on significantly larger graphs and dimensionality of the hidden state and more complex environments than those seen in training validate the properties of the proposed architecture and its efficacy in the application of localization and tracking of dynamic targets." Balancing Efficiency and Unpredictability in Multi-Robot Patrolling: A MARL-Based Approach,"Lingxiao Guo, Haoxuan Pan, Xiaoming Duan, Jianping He","Shanghai Jiao Tong University,Department of Automation, Shanghai Jiao Tong University",Multi-Robot Systems II,"Patrolling with multiple robots is a challenging task. While the robots collaboratively and repeatedly cover the regions of interest in the environment, their routes should satisfy two often conflicting properties: i) (efficiency) the time intervals between two consecutive visits to the regions are small; ii) (unpredictability) the patrolling trajectories are random and unpredictable. We manage to strike a balance between the two goals by i) recasting the original patrolling problem as a Graph Deep Learning problem; ii) directly solving this problem on the graph in the framework of cooperative multi-agent reinforcement learning. Treating the decisions of a team of agents as a sequence input, our model outputs the agents' actions in order by an autoregressive mechanism. Extensive simulation studies show that our approach has comparable performance with existing algorithms in terms of efficiency and outperforms them in terms of unpredictability. To our knowledge, this is the first work that successfully solves the patrolling problem with reinforcement learning on a graph." 
Learning to Influence Vehicles' Routing in Mixed-Autonomy Networks by Dynamically Controlling the Headway of Autonomous Cars,"Xiaoyu Ma, Negar Mehr","University of Illinois at Urbana-Champaign,University of Illinois Urbana-Champaign",Intelligent Transportation Systems II,"It is known that autonomous cars can increase road capacities by maintaining a smaller headway through vehicle platooning. Recent works have shown that these capacity increases can influence vehicles' route choices in unexpected ways similar to the well-known Braess's paradox, such that the network congestion might increase. In this paper, we propose that in mixed-autonomy networks, i.e., networks where roads are shared between human-driven and autonomous cars, the headway of autonomous cars can be directly controlled to influence vehicles' routing and reduce congestion. We argue that the headway of autonomous cars, and consequently the capacity of link segments, is not just a fixed design choice; but rather, it can be leveraged as an infrastructure control strategy to dynamically regulate capacities. Imagine that similar to variable speed limits which regulate the maximum speed of vehicles on a road segment, a control policy regulates the headway of autonomous cars along each road segment. We seek to influence vehicles' route choices by directly controlling the headway of autonomous cars to prevent Braess-like unexpected outcomes and increase network efficiency. We model the dynamics of mixed-autonomy traffic networks while accounting for the vehicles' route choice dynamics. We train an RL policy that learns to regulate the headway of autonomous cars such that the total travel time in the network is minimized. We will show empirically that our trained policy can not only prevent Braess-like inefficiencies but also decrease total travel time." Traffic-Aware Autonomous Driving with Differentiable Traffic Simulation,"Laura Zheng, Sanghyun Son, Ming C.
Lin","University of Maryland, College Park,University of Maryland,University of Maryland at College Park",Intelligent Transportation Systems II,"While there have been advancements in autonomous driving control and traffic simulation, there have been little to no works exploring their unification with deep learning. Works in both areas seem to focus on entirely different exclusive problems, yet traffic and driving are inherently related in the real world. In this paper, we present Traffic-Aware Autonomous Driving (TrAAD), a generalizable distillation-style method for traffic-informed imitation learning that directly optimizes for faster traffic flow and lower energy consumption. TrAAD focuses on the supervision of speed control in imitation learning systems, as most driving research focuses on perception and steering. Moreover, our method addresses the lack of co-simulation between traffic and driving simulators and provides a basis for directly involving traffic simulation with autonomous driving in future work. Our results show that, with information from traffic simulation involved in the supervision of imitation learning methods, an autonomous vehicle can learn how to accelerate in a fashion that is beneficial for traffic flow and overall energy consumption for all nearby vehicles." Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand,"Daniel Garces, Sushmita Bhattacharya, Stephanie Gil, Dimitri Bertsekas","Harvard University,MIT",Intelligent Transportation Systems II,"We derive a learning framework to generate routing/pickup policies for a fleet of autonomous vehicles tasked with servicing stochastically appearing requests on a city map. We focus on policies that 1) give rise to coordination amongst the vehicles, thereby reducing wait times for servicing requests, 2) are non-myopic, considering a-priori potential future requests, and 3) can adapt to changes in the underlying demand distribution. 
Specifically, we are interested in adapting to fluctuations of actual demand conditions in urban environments, such as on-peak vs. off-peak hours. We achieve this through a combination of (i) an online play algorithm that improves the performance of an offline-trained policy, and (ii) an offline approximation scheme that allows for adapting to changes in the underlying demand model. In particular, we achieve adaptivity of our learned policy to different demand distributions by quantifying a region of validity using the q-valid radius of a Wasserstein Ambiguity Set. We propose a mechanism for switching the originally trained offline approximation when the current demand is outside the original validity region. In this case, we propose to use an offline architecture, trained on a historical demand model that is closer to the current demand in terms of Wasserstein distance. We learn routing and pickup policies over real taxicab requests in San Francisco with high variability between on-peak and off-peak hours, demonstrating the ability of our method to adapt to real fluctuation in demand distributions. Our numerical results demonstrate that our method outperforms alternative rollout-based reinforcement learning schemes, as well as other classical methods from operations research." Cooperative Driving in Mixed Traffic of Manned and Unmanned Vehicles Based on Human Driving Behavior Understanding,"Jiaxing Lu, Sanzida Hossain, Weihua Sheng, He Bai",Oklahoma State University,Intelligent Transportation Systems II,"To achieve safe cooperative driving in mixed traffic of manned and unmanned vehicles, it is necessary to understand and model human drivers' driving behaviors. This paper proposed a Hidden Markov Model (HMM)-based method to analyze human driver's control and vehicle's dynamics; and then recognize the human driver's action, such as accelerating, braking, and changing lanes. 
With the knowledge of the human driver's actions, a probability model is used to predict the human-driven vehicle's acceleration. Such information on the driver behavior and the vehicle behavior can be used to achieve safer cooperative driving, which is realized using vehicle-to-vehicle (V2V) communication and model predictive control (MPC). The proposed method was tested and evaluated in our custom-built cooperative driving testbed. Experimental results show that the above driver action model is effective and accurate. A preliminary case study on a lane merging scenario is provided to further validate its effectiveness and capability." Exploring Navigation Maps for Learning-Based Motion Prediction,"Julian Schmidt, Julian Jordan, Franz Gritschneder, Thomas Monninger, Klaus Dietmayer","Mercedes-Benz AG, Ulm University,Mercedes-Benz AG,Ulm University,Mercedes-Benz AG, University of Stuttgart,University of Ulm",Intelligent Transportation Systems II,"The prediction of surrounding agents' motion is a key for safe autonomous driving. In this paper, we explore navigation maps as an alternative to the predominant High Definition (HD) maps for learning-based motion prediction. Navigation maps provide topological and geometrical information on road-level, HD maps additionally have centimeter-accurate lane-level information. As a result, HD maps are costly and time-consuming to obtain, while navigation maps with near-global coverage are freely available. We describe an approach to integrate navigation maps into learning-based motion prediction models. To exploit locally available HD maps during training, we additionally propose a model-agnostic method for knowledge distillation. In experiments on the publicly available Argoverse dataset with navigation maps obtained from OpenStreetMap, our approach shows a significant improvement over not using a map at all. 
Combined with our method for knowledge distillation, we achieve results that are close to the original HD map-reliant models. Our publicly available navigation map API for Argoverse enables researchers to develop and evaluate their own approaches using navigation maps." SLAMesh: Real-Time LiDAR Simultaneous Localization and Meshing,"Jianyuan Ruan, Bo Li, Yibo Wang, Yuxiang Sun","Hong Kong Polytechnic University,Zhejiang University,The Hong Kong Polytechnic University",Intelligent Transportation Systems II,"Most current LiDAR simultaneous localization and mapping (SLAM) systems build maps in point clouds, which are sparse when zoomed in, even though they seem dense to human eyes. Dense maps are essential for robotic applications, such as map-based navigation. Due to the low memory cost, mesh has become an attractive dense model for mapping in recent years. However, existing methods usually produce mesh maps by using an offline post-processing step to generate mesh maps. This two-step pipeline does not allow these methods to use the built mesh maps online and to enable localization and meshing to benefit each other. To solve this problem, we propose the first CPU-only real-time LiDAR SLAM system that can simultaneously build a mesh map and perform localization against the mesh map. A novel and direct meshing strategy with Gaussian process reconstruction realizes the fast building, registration, and updating of mesh maps. We perform experiments on several public datasets. The results show that our SLAM system can run at around 40Hz. The localization and meshing accuracy also outperforms the state-of-the-art methods, including the TSDF map and Poisson reconstruction. Our code and video demos are available at: https://github.com/lab-sun/SLAMesh." 
CenterLineDet: CenterLine Graph Detection for Road Lanes with Vehicle-Mounted Sensors by Transformer for HD Map Generation,"Zhenhua Xu, Yuxuan Liu, Yuxiang Sun, Ming Liu, Lujia Wang","the Hong Kong University of Science and Technology,Hong Kong University of Science and Technology,The Hong Kong Polytechnic University,The Hong Kong University of Technology",Intelligent Transportation Systems II,"With the fast development of autonomous driving technologies, there is an increasing demand for high-definition (HD) maps, which provide reliable and robust prior information about the static part of the traffic environments. As one of the important elements in HD maps, road lane centerline is critical for downstream tasks, such as prediction and planning. Manually annotating centerlines for road lanes in HD maps is labor-intensive, expensive and inefficient, severely restricting the wide applications of autonomous driving systems. Previous work seldom explores the lane centerline detection problem due to the complicated topology and severe overlapping issues of lane centerlines. In this paper, we propose a novel method named CenterLineDet to detect lane centerlines for automatic HD map generation. Our CenterLineDet is trained by imitation learning and can effectively detect the graph of centerlines with vehicle-mounted sensors (i.e., six cameras and one LiDAR) through iterations. Due to the use of the DETR-like transformer network, CenterLineDet can handle complicated graph topology, such as lane intersections. The proposed approach is evaluated on the large-scale public dataset NuScenes. The superiority of our CenterLineDet is demonstrated by the comparative results. Our code, supplementary materials, and video demonstrations are available at https://tonyxuqaq.github.io/projects/CenterLineDet/."
Guided Conditional Diffusion for Controllable Traffic Simulation,"Ziyuan Zhong, Davis Rempe, Danfei Xu, Yuxiao Chen, Sushant Veer, Tong Che, Baishakhi Ray, Marco Pavone","Columbia University,Stanford University,Stanford University,Nvidia research,NVIDIA,Columbia University in the City of New York",Intelligent Transportation Systems II,"Controllable and realistic traffic simulation is critical for developing and verifying autonomous vehicles. Typical heuristic-based traffic models offer flexible control to make vehicles follow specific trajectories and traffic rules. On the other hand, data-driven approaches generate realistic and human-like behaviors, improving transfer from simulated to real-world traffic. However, to the best of our knowledge, no traffic model offers both controllability and realism. In this work, we develop a conditional diffusion model for controllable traffic generation (CTG) that allows users to control desired properties of trajectories at test time (e.g., reach a goal or follow a speed limit) while maintaining realism and physical feasibility through enforced dynamics. The key technical idea is to leverage recent advances from diffusion modeling and differentiable logic to guide generated trajectories to meet rules defined using signal temporal logic (STL). We further extend guidance to multi-agent settings and enable interaction-based rules like collision avoidance. CTG is extensively evaluated on the nuScenes dataset for diverse and composite rules, demonstrating improvement over strong baselines in terms of the controllability-realism tradeoff.
Demo videos can be found at https://aiasd.github.io/ctg.github.io" TrafficGen: Learning to Generate Diverse and Realistic Traffic Scenarios,"Lan Feng, Quanyi Li, Zhenghao Peng, Shuhan Tan, Bolei Zhou","ETH ZURICH,University of Edinburgh,University of California, Los Angeles,UT Austin",Intelligent Transportation Systems II,"Diverse and realistic traffic scenarios are crucial for evaluating the AI safety of autonomous driving systems in simulation. This work introduces a data-driven method called TrafficGen for traffic scenario generation. It learns from the fragmented human driving data collected in the real world and then generates realistic traffic scenarios. TrafficGen is an autoregressive neural generative model with an encoder-decoder architecture. In each autoregressive iteration, it first encodes the current traffic context with the attention mechanism and then decodes a vehicle's initial state followed by generating its long trajectory. We evaluate the trained model in terms of vehicle placement and trajectories, and the experimental result shows our method has substantial improvements over baselines for generating traffic scenarios. After training, TrafficGen can also augment existing traffic scenarios, by adding new vehicles and extending the fragmented trajectories. We further demonstrate that importing the generated scenarios into a simulator as an interactive training environment improves the performance and safety of a driving agent learned from reinforcement learning. Code and data will be made available." 
Infrastructure-Based End-To-End Learning and Prevention of Driver Failure,"Noam Buckman, Shiva Sreeram, Mathias Lechner, Yutong Ban, Ramin Hasani, Sertac Karaman, Daniela Rus","Massachusetts Institute of Technology,MIT,Massachusetts Institute of Technology (MIT)",Intelligent Transportation Systems II,"Intelligent intersection managers can improve safety by detecting dangerous drivers or failure modes in autonomous vehicles, warning oncoming vehicles as they approach an intersection. In this work, we present FailureNet, a recurrent neural network trained end-to-end on trajectories of both nominal and reckless drivers in a scaled miniature city. FailureNet observes the poses of vehicles as they approach an intersection and detects whether a failure is present in the autonomy stack, warning cross-traffic of potentially dangerous drivers. FailureNet can accurately identify control failures, upstream perception errors, and speeding drivers, distinguishing them from nominal driving. The network is trained and deployed with autonomous vehicles in the MiniCity. Compared to speed or frequency-based predictors, FailureNet's recurrent neural network structure provides improved predictive power, yielding upwards of 84% accuracy when deployed on hardware." V2XP-ASG: Generating Adversarial Scenes for Vehicle-To-Everything Perception,"Hao Xiang, Runsheng Xu, Xia Xin, Zhaoliang Zheng, Bolei Zhou, Jiaqi Ma","University of California, Los Angeles,UCLA",Intelligent Transportation Systems II,"Recent advancements in Vehicle-to-Everything communication technology have enabled autonomous vehicles to share sensory information to obtain better perception performance. With the rapid growth of autonomous vehicles and intelligent infrastructure, the V2X perception systems will soon be deployed at scale, which raises a safety-critical question: how can we evaluate and improve its performance under challenging traffic scenarios before the real-world deployment? 
Collecting diverse large-scale real-world test scenes seems to be the most straightforward solution, but it is expensive and time-consuming, and the collections can only cover limited scenarios. To this end, we propose the first open adversarial scene generator V2XP-ASG that can produce realistic, challenging scenes for modern LiDAR-based multi-agent perception systems. V2XP-ASG learns to construct an adversarial collaboration graph and simultaneously perturb multiple agents’ poses in an adversarial and plausible manner. The experiments demonstrate that V2XP-ASG can effectively identify challenging scenes for a large range of V2X perception systems. Meanwhile, by training on the limited number of generated challenging scenes, the accuracy of V2X perception systems can be further improved by 12.3% on challenging and 4% on normal scenes. Our code will be released at https://github.com/XHwind/V2XP-ASG." Satellite Image Based Cross-View Localization for Autonomous Vehicle,"Shan Wang, Yanhao Zhang, Ankit Vora, Akhil Perincherry, Hongdong Li","The Australian National University,Australian National University,Ford Motor Company,Australian National university and NICTA",Intelligent Transportation Systems II,"Existing spatial localization techniques for autonomous vehicles mostly use a pre-built 3D-HD map, often constructed using a survey-grade 3D mapping vehicle, which is not only expensive but also laborious. This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, we are able to achieve cross-view vehicle localization up to a satisfactory accuracy, providing a cheaper and more practical way for localization. While the utilization of satellite imagery for cross-view localization is an established concept, the conventional methodology focuses primarily on image retrieval. This paper introduces a novel approach to cross-view localization that departs from the conventional image retrieval method.
Specifically, our method develops (1) a Geometric-align Feature Extractor (GaFE) that leverages measured 3D points to bridge the geometric gap between ground and overhead views, (2) a Pose Aware Branch (PAB) adopting a triplet loss to encourage pose-aware feature extraction, and (3) a Recursive Pose Refine Branch (RPRB) using the Levenberg-Marquardt (LM) algorithm to align the initial pose towards the true vehicle pose iteratively. Our method is validated on KITTI and Ford Multi-AV Seasonal datasets as ground view and Google Maps as the satellite view. The results demonstrate the superiority of our method in cross-view localization with median spatial and angular errors within 1 meter and 1 degree, respectively." Collision-Free Coverage Path Planning for the Variable-Speed Curvature-Constrained Robot,"Lin Li, Dianxi Shi, Songchang Jin, Yixuan Sun, Xing Zhou, Shaowu Yang, Hengzhu Liu","National University of Defense Technology,Defense Innovation Institute",Motion and Path Planning II,"Dubins coverage has been extensively researched to address the coverage path planning (CPP) problem of a known environment for the curvature-constrained robot. However, its fixed-speed assumption prevents the robot from accelerating to reduce the time and limits its flexibility to avoid obstacles. Therefore, this paper presents a collision-free CPP (CFC) approach for the obstacle-constrained environment, which enhances time efficiency by constructing the variable-speed Dubins paths and ensures robot safety by building a risk potential surface for representing the possibility of collision. Furthermore, the CFC approach models the CPP problem as an asymmetric traveling salesman problem (ATSP) and utilizes a graph pruning strategy to reduce the computational cost. 
Comparison tests with other Dubins coverage methods demonstrate that the CFC approach provides shorter coverage times and better runtimes than the other Dubins coverage methods while preventing collision risk between the robot and obstacles. Physical experiments in a laboratory setting demonstrate the applicability of the CFC approach to the physical robot." Stochastic Traveling Salesperson Problem with Neighborhoods for Object Detection,"Cheng Peng, Minghan Wei, Volkan Isler","University of Minnesota, Twin Cities,University of Minnesota",Motion and Path Planning II,"We introduce a new route-finding problem which considers perception and travel costs simultaneously. Specifically, we consider the problem of finding the shortest tour such that all objects of interest can be detected successfully. To represent a viable detection region for each object, we propose to use an entropy-based viewing score that generates a diameter-bounded region as a viewing neighborhood. We formulate the detection-based trajectory planning problem as a stochastic traveling salesperson problem with neighborhoods and propose a center-visit method that obtains an approximation ratio of O(Dmax/Dmin) for disjoint regions. For non-disjoint regions, our method provides a novel finite detour in 3D, which utilizes the region’s minimum curvature property. Finally, we show that our method can generate efficient trajectories compared to a baseline method in a photo-realistic simulation environment." Optimal Allocation of Many Robot Guards for Sweep-Line Coverage,"Si Wei Feng, Teng Guo, Jingjin Yu",Rutgers University,Motion and Path Planning II,"We study the problem of allocating many mobile robots for the execution of a pre-defined sweep schedule in a known two-dimensional environment, with applications toward search and rescue, coverage, surveillance, monitoring, pursuit-evasion, and so on. 
The mobile robots (or agents) are assumed to have one-dimensional sensing capability with probabilistic guarantees that deteriorate as the sensing distance increases. In solving such tasks, a time-parameterized distribution of robots along the sweep frontier must be computed, with the objective to minimize the number of robots used to achieve some desired coverage quality guarantee or to maximize the probabilistic guarantee for a given number of robots. We propose a max-flow based algorithm for solving the allocation task, which builds on a decomposition technique of the workspace as a generalization of the well-known boustrophedon decomposition. Our proposed algorithm has a very low polynomial running time and completes in under two seconds for polygonal environments with over $10^5$ vertices. Simulation experiments are carried out on three realistic use cases with randomly generated obstacles of varying shapes, sizes, and spatial distributions, which demonstrate the applicability and scalability of our proposed method." A Linear and Exact Algorithm for Whole-Body Collision Evaluation Via Scale Optimization,"Qianhao Wang, Zhepei Wang, Liuao Pei, Chao Xu, Fei Gao",Zhejiang University,Motion and Path Planning II,"Collision evaluation is of essential importance in various applications. However, existing methods are either cumbersome to calculate or not exact. Therefore, considering the cost of implementation, most whole-body planning works, which require evaluating collision between robots and environments, struggle to trade off accuracy and computational efficiency. In this paper, we propose a zero-gap whole-body collision evaluation that can be formulated as a low-dimensional linear program. This evaluation can be solved analytically in linear complexity. Moreover, the method provides gradients efficiently, making it accessible to optimization-based applications. 
Additionally, this method provides support for obstacles represented by either points or hyperplanes. Experiments on the widely used aerial and car-like robots validate the versatility and practicality of our method." Probabilistic Risk Assessment for Chance-Constrained Collision Avoidance in Uncertain Dynamic Environments,"Khaled Alaaeldin Abdelfattah Mustafa, Oscar De Groot, Xinwei Wang, Jens Kober, Javier Alonso-Mora","TU Delft,Delft University of Technology",Motion and Path Planning II,"Balancing safety and efficiency when planning in crowded scenarios with uncertain dynamics is challenging where it is imperative to accomplish the robot’s mission without incurring any safety violations. Typically, chance constraints are incorporated into the planning problem to provide probabilistic safety guarantees by imposing an upper bound on the collision probability of the planned trajectory. Yet, this results in an overly conservative behavior on the grounds that the gap between the obtained risk and the specified upper limit is not explicitly restricted. To address this issue, we propose a real-time capable approach to quantify the risk associated with planned trajectories obtained from multiple probabilistic planners, running in parallel, with different upper bounds of the acceptable risk level. Based on the evaluated risk, the least conservative plan is selected provided that its associated risk is below a specified threshold. In such a way, the proposed approach provides probabilistic safety guarantees by attaining a closer bound to the specified risk, while being applicable to generic uncertainties of moving obstacles. We demonstrate the efficiency of our proposed approach, by improving the performance of a state-of-the-art probabilistic planner, in simulations and experiments using a mobile robot in an environment shared with humans." 
Computational Tradeoff in Minimum Obstacle Displacement Planning for Robot Navigation,"Antony Thomas, Giulio Ferro, Fulvio Mastrogiovanni, Michela Robba",University of Genoa,Motion and Path Planning II,"In this paper, we look into the minimum obstacle displacement (MOD) planning problem from a mobile robot motion planning perspective. This problem finds an optimal path to the goal by displacing movable obstacles when no path exists due to collision with obstacles. However, this problem is computationally expensive, with a cost that grows exponentially in the number of movable obstacles. This work looks into approximate solutions that are computationally less intensive and differ from the optimal solution by a factor of the optimal cost." A Trajectory Planner for Mobile Robots Steering Non-Holonomic Wheelchairs in Dynamic Environments,"Martin Schulze, Friedrich Graaf, Lea Steffen, Arne Roennau, Rüdiger Dillmann","FZI Research Center for Information Technology,FZI Research Center for Information Technology, ,,,,, Karlsruhe,,FZI Forschungszentrum Informatik, Karlsruhe,FZI - Forschungszentrum Informatik - Karlsruhe",Motion and Path Planning II,"Motion planning for mobile robot platforms is one of the long-established research fields in robotics. In this paper, we propose a trajectory planner for mobile holonomic robots to steer non-holonomic conventional passive wheelchairs in dynamic environments. The challenges to overcome when steering a wheelchair are to find smooth feasible trajectories, maintain a fast reactive response to dynamic obstacles and to satisfy a set of additional constraints such as limiting physical forces acting on the wheelchair occupants. Our approach is a variant of the timed-elastic-bands (TEB) planner, which includes a footprint of the wheelchair during optimization, and generates a steering angle which is then consumed by an arm controller to actuate the relative orientation between the wheelchair and the mobile platform. 
This is realized by posing new non-holonomic and kinodynamic constraints on the TEB planner and an implementation of a suitable real-time dual-arm controller for executing steering commands. We demonstrate our results based on a TEB baseline comparison in simulation using functional models of our robot HoLLiE and a wheelchair." Safe Bipedal Path Planning Via Control Barrier Functions for Polynomial Shape Obstacles Estimated Using Logistic Regression,"Chengyang Peng, Octavian Donca, Guillermo Castillo, Ayonga Hereid","The Ohio State University,Ohio State University",Motion and Path Planning II,"Safe path planning is critical for bipedal robots to operate in safety-critical environments. Common path planning algorithms, such as RRT or RRT*, typically use geometric or kinematic collision check algorithms to ensure collision-free paths toward the target position. However, such approaches may generate non-smooth paths that do not comply with the dynamics constraints of walking robots. It has been shown that the control barrier function (CBF) can be integrated with RRT/RRT* to synthesize dynamically feasible collision-free paths. Yet, existing work has been limited to simple circular or elliptical shape obstacles due to the challenging nature of constructing appropriate barrier functions to represent irregularly shaped obstacles. In this paper, we present a CBF-based RRT* algorithm for bipedal robots to generate a collision-free path through space with multiple polynomial-shaped obstacles. In particular, we used logistic regression to construct polynomial barrier functions from a grid map of the environment to represent irregularly shaped obstacles. Moreover, we developed a multi-step CBF steering controller to ensure the efficiency of free space exploration. The proposed approach was first validated in simulation for a differential drive model, and then experimentally evaluated with a 3D humanoid robot, Digit, in a lab setting with randomly placed obstacles." 
Real-Time Decentralized Navigation of Nonholonomic Agents Using Shifted Yielding Areas,"He Liang, Zherong Pan, Dinesh Manocha","University of North Carolina at Chapel Hill,Tencent America,University of Maryland",Motion and Path Planning II,"We present a lightweight, decentralized algorithm for navigating multiple nonholonomic agents through challenging environments with narrow passages. Our key idea is to allow agents to yield to each other in large open areas instead of narrow passages, to increase the success rate of conventional decentralized algorithms. At the pre-processing time, our method computes a medial axis for the free space. A reference trajectory is then computed and projected onto the medial axis for each agent. During runtime, when an agent senses other agents moving in the opposite direction, our algorithm uses the medial axis to estimate a Point of Impact (POI) as well as the available area around the POI. If the area around the POI is not large enough for yielding behaviors to be successful, we shift the POI to nearby large areas by modulating the agent’s reference trajectory and traveling speed. We evaluate our method on a row of 4 environments with up to 15 robots, and we find our method incurs a marginal computational overhead of 10-30 ms on average, achieving real-time performance. Afterward, our planned reference trajectories can be tracked using local navigation algorithms to achieve up to 100% higher success rate over local navigation algorithms alone." Differentiable Collision Detection for a Set of Convex Primitives,"Kevin Tracy, Taylor Howell, Zachary Manchester","Carnegie Mellon University,Stanford University",Motion and Path Planning II,"Collision detection between objects is critical for simulation, control, and learning for robotic systems. However, existing collision detection routines are inherently non-differentiable, limiting their applications in gradient-based optimization tools. 
In this work, we propose DCOL: a fast and fully differentiable collision-detection framework that reasons about collisions between a set of composable and highly expressive convex primitive shapes. This is achieved by formulating the collision detection problem as a convex optimization problem that solves for the minimum uniform scaling applied to each primitive before they intersect. The optimization problem is fully differentiable with respect to the configurations of each primitive and is able to return a collision detection metric and contact points on each object, agnostic of interpenetration. We demonstrate the capabilities of DCOL on a range of robotics problems from trajectory optimization and contact physics, and have made an open-source implementation available." Shunted Collision Avoidance for Multi-UAV Motion Planning with Posture Constraints,"Gang Xu, Deye Zhu, Junjie Cao, Yong Liu, Jian Yang","Zhejiang University,Institute of Cyber Systems and Control, Zhejiang University,China Research and Development Academy of Machinery Equipment",Motion and Path Planning II,"This paper investigates the problem of motion planning for fixed-wing unmanned aerial vehicles (UAVs) with posture constraints and the problem of the more general symmetrical situations where UAVs have more than one optimal solution. In this paper, the posture constraints are formulated in the 3D Dubins method, and the symmetrical situations are overcome by a more collaborative strategy called the shunted strategy. The effectiveness of the proposed method has been validated by conducting extensive simulation experiments. Meanwhile, we compared the proposed method with the other state-of-the-art methods, and the comparison results show that the proposed method advances the previous works. Finally, the practicality of the proposed algorithm was analyzed using statistics on its computational cost." 
Dynamic Control Barrier Function-Based Model Predictive Control to Safety-Critical Obstacle-Avoidance of Mobile Robot,"Zhuozhu Jian, Zihong Yan, Xuanang Lei, Zihong Lu, Bin Lan, Xueqian Wang, Bin Liang","Tsinghua University,Tsinghua university,ETH Zurich,Harbin Institute of Technology, Shenzhen,Center for Artificial Intelligence and Robotics, Graduate School",Motion and Path Planning II,"This paper presents an efficient and safe method to avoid static and dynamic obstacles based on LiDAR. First, point cloud is used to generate a real-time local grid map for obstacle detection. Then, obstacles are clustered by DBSCAN algorithm and enclosed with minimum bounding ellipses (MBEs). In addition, data association is conducted to match each MBE with the obstacle in the current frame. Considering MBE as an observation, Kalman filter (KF) is used to estimate and predict the motion state of the obstacle. In this way, the trajectory of each obstacle in the forward time domain can be parameterized as a set of ellipses. Due to the uncertainty of the MBE, the semi-major and semi-minor axes of the parameterized ellipse are extended to ensure safety. We extend the traditional Control Barrier Function (CBF) and propose Dynamic Control Barrier Function (D-CBF). We combine D-CBF with Model Predictive Control (MPC) to implement safety-critical dynamic obstacle avoidance. Experiments in simulated and real scenarios are conducted to verify the effectiveness of our algorithm. The source code is released for the reference of the community." 
A Minimum Swept-Volume Metric Structure for Configuration Space,"Yann Dubois De Mont-marin, Jean Ponce, Jean-Paul Laumond","Inria, DI ENS,Ecole Normale Supérieure,Inria, DI ENS PSL",Task and Motion Planning,"Borrowing elementary ideas from solid mechanics and differential geometry, this presentation shows that the volume swept by a regular solid undergoing a wide class of volume-preserving deformations induces a rather natural metric structure with well-defined and computable geodesics on its configuration space. This general result applies to concrete classes of articulated objects such as robot manipulators, and we demonstrate as a proof of concept the computation of geodesic paths for a free flying rod and planar robotic arms as well as their use in path planning with many obstacles." Task-Space Clustering for Mobile Manipulator Task Sequencing,"Quang-Nam Nguyen, Nicholas Adrian, Quang-Cuong Pham","Nanyang Technological University,NTU Singapore",Task and Motion Planning,"Mobile manipulators have gained attention for the potential in performing large-scale tasks which are beyond the reach of fixed-base manipulators. The Robotic Task Sequencing Problem for mobile manipulators often requires optimizing the motion sequence of the robot to visit multiple targets while reducing the number of base placements. A two-step approach to this problem is clustering the task-space into clusters of targets before sequencing the robot motion. In this paper, we propose a task-space clustering method which formulates the clustering step as a Set Cover Problem using bipartite graph and reachability analysis, then solves it to obtain the minimum number of target clusters with corresponding base placements. We demonstrated the practical usage of our method in a mobile drilling experiment containing hundreds of targets. 
Multiple simulations were conducted to benchmark the algorithm and also showed that our proposed method found, in practical time, better solutions than the existing state-of-the-art methods." Sampling-Based Path Planning under Temporal Logic Constraints with Real-Time Adaptation,"Yizhou Chen, Ruoyu Wang, Xinyi Wang, Ben M. Chen","Chinese University of Hong Kong,The Chinese University of Hong Kong",Task and Motion Planning,"Replanning in temporal logic tasks is extremely difficult during the online execution of robots. This study introduces an effective path planner that computes solutions for temporal logic goals and instantly adapts to non-static and partially unknown environments. Given prior knowledge and a task specification, the planner first identifies an initial feasible solution by growing a sampling-based search tree. While carrying out the computed plan, the robot maintains a solution library to continuously enhance the unfinished part of the plan and store backup plans. The planner updates existing plans when meeting unexpected obstacles or recognizing flaws in prior knowledge. Once a high-level path is obtained, a trajectory generator tracks the path by dividing it into segments of motion primitives. Our planner is integrated into an autonomous mobile robot system, further deployed on a multicopter with limited onboard processing power. In simulation and real-world experiments, our planner is demonstrated to swiftly and effectively adjust to environmental uncertainties." Optimal Grasps and Placements for Task and Motion Planning in Clutter,"Carlos Quintero-Pena, Zachary Kingston, Tianyang Pan, Rahul Shome, Anastasios Kyrillidis, Lydia Kavraki","Rice University,The Australian National University",Task and Motion Planning,"Many methods that solve robot planning problems, such as task and motion planners, employ discrete symbolic search to find sequences of valid symbolic actions that are grounded with motion planning. 
Much of the efficacy of these planners lies in this grounding---bad placement and grasp choices can lead to inefficient planning when a problem has many geometric constraints. Moreover, grounding methods such as naive sampling often fail to find appropriate values for these choices in the presence of clutter. Towards efficient task and motion planning, we present a novel optimization-based approach for grounding to solve cluttered problems that have many constraints that arise from geometry. Our approach finds an optimal grounding and can provide feedback to discrete search for more effective planning. We demonstrate our method against baseline methods in complex simulated environments." Resolution Complete In-Place Object Retrieval Given Known Object Models,"Daniel Nakhimovich, Yinglong Miao, Kostas E. Bekris","Rutgers, the State University of New Jersey,Rutgers University",Task and Motion Planning,"This work proposes a robot task planning framework for retrieving a target object in a confined workspace among multiple stacked objects that obstruct the target. The robot can use prehensile picking and in-workspace placing actions. The method assumes access to 3D models for the visible objects in the scene. The key contribution is in achieving desirable properties, i.e., to provide (a) safety, by avoiding collisions with sensed obstacles, objects, and occluded regions, and (b) resolution completeness (RC) - or probabilistic completeness (PC) depending on implementation - which indicates a solution will be eventually found (if it exists) as the resolution of algorithmic parameters increases. A heuristic variant of the basic RC algorithm is also proposed to solve the task more efficiently while retaining the desirable properties. Simulation results compare using random picking and placing operations against the basic RC algorithm that reasons about object dependency as well as its heuristic variant. 
The success rate is higher for the RC approaches given the same amount of time. The heuristic variant is able to solve the problem even more efficiently than the basic approach. The integration of the RC algorithm with perception, where an RGB-D sensor detects the objects as they are being moved, enables real robot demonstrations of safely retrieving target objects from a cluttered shelf." Task-Directed Exploration in Continuous POMDPs for Robotic Manipulation of Articulated Objects,"Aidan Curtis, Leslie Kaelbling, Siddarth Jain","MIT,Mitsubishi Electric Research Laboratories (MERL)",Task and Motion Planning,"Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and have been shown to extend to continuous domains. However, these solutions struggle to find long-horizon plans in problems with significant uncertainty. Exploration heuristics can help guide planning, but many real-world settings contain significant task-irrelevant uncertainty that might distract from the task objective. In this paper, we propose STRUG, an online POMDP solver capable of handling domains that require long-horizon planning with significant task-relevant and task-irrelevant uncertainty. We demonstrate our solution on several temporally extended versions of toy POMDP problems as well as robotic manipulation of articulated objects using a neural perception frontend to construct a distribution of possible models. Our results show that STRUG outperforms the current sample-based online POMDP solvers on several tasks." 
Learning Feasibility of Factored Nonlinear Programs in Robotic Manipulation Planning,"Joaquim Ortiz De Haro, Jung-su Ha, Danny Driess, Erez Karpas, Marc Toussaint","University of Stuttgart,TU Berlin,Technion",Task and Motion Planning,"A factored Nonlinear Program (Factored-NLP) explicitly models the dependencies between a set of continuous variables and nonlinear constraints, providing an expressive formulation for relevant robotics problems such as manipulation planning or simultaneous localization and mapping. When the problem is over-constrained or infeasible, a fundamental issue is to detect a minimal subset of variables and constraints that are infeasible. Previous approaches require solving several nonlinear programs, incrementally adding and removing constraints, and are thus computationally expensive. In this paper, we propose a graph neural architecture that predicts which variables and constraints are jointly infeasible. The model is trained with a dataset of labeled subgraphs of Factored-NLPs, and importantly, can make useful predictions on larger factored nonlinear programs than the ones seen during training. We evaluate our approach in robotic manipulation planning, where our model is able to generalize to longer manipulation sequences involving more objects and robots, and different geometric environments. The experiments show that the learned model accelerates general algorithms for conflict extraction (by a factor of 50) and heuristic algorithms that exploit expert knowledge (by a factor of 4)." Learning to Predict Action Feasibility for Task and Motion Planning in 3D Environments,"Smail Ait Bouhsain, Alami Rachid, Thierry Simeon","LAAS-CNRS,CNRS",Task and Motion Planning,"In Task and motion planning (TAMP), symbolic search is combined with continuous geometric planning. A task planner finds an action sequence while a motion planner checks its feasibility and plans the corresponding sequence of motions. 
However, due to the high combinatorial complexity of discrete search, the number of calls to the geometric planner can be very large. Previous works leverage learning methods to efficiently predict the feasibility of actions, much like humans do, on tabletop scenarios. This way, the time spent on motion planning can be greatly reduced. In this work, we generalize these methods to 3D environments, thus covering the whole workspace of the robot. We propose an efficient method for 3D scene representation, along with a deep neural network capable of predicting the probability of feasibility of an action. We develop a simple TAMP algorithm that integrates the trained classifier, and demonstrate the performance gain of using our approach on multiple problem domains. On complex problems, our method can reduce the time spent on geometric planning by up to 90%." Policy Guided Lazy Search with Feedback for Task and Motion Planning,"Mohamed Khodeir, Atharv Sonwane, Ruthrash Hari, Florian Shkurti","University of Toronto,Microsoft Research",Task and Motion Planning,"PDDLStream solvers have recently emerged as viable solutions for Task and Motion Planning (TAMP) problems, extending PDDL to problems with continuous action spaces. Prior work has shown how PDDLStream problems can be reduced to a sequence of PDDL planning problems, which can then be solved using off-the-shelf planners. However, this approach can suffer from long runtimes. In this paper we propose LAZY, a solver for PDDLStream problems that maintains a single integrated search over action skeletons, which gets progressively more geometrically informed, as samples of possible motions are lazily drawn during motion planning. We explore how learned models of goal-directed policies and current motion sampling data can be incorporated in LAZY to adaptively guide the task planner. 
We show that this leads to significant speed-ups in the search for a feasible solution evaluated over unseen test environments of varying numbers of objects, goals, and initial conditions. We evaluate our TAMP approach by comparing to existing solvers for PDDLStream problems on a range of simulated 7DoF rearrangement/manipulation problems." A Reachability Tree-Based Algorithm for Robot Task and Motion Planning,"Kanghyun Kim, Daehyung Park, Min Jun Kim","Korea Advanced Institute of Science and Technology (KAIST),Korea Advanced Institute of Science and Technology, KAIST,KAIST",Task and Motion Planning,"This paper presents a novel algorithm for robot task and motion planning (TAMP) problems by utilizing a reachability tree. While tree-based algorithms are known for their speed and simplicity in motion planning (MP), they are not well-suited for TAMP problems that involve both abstracted and geometrical state variables. To address this challenge, we propose a hierarchical sampling strategy, which first generates an abstracted task plan using Monte Carlo tree search (MCTS) and then fills in the details with a geometrically feasible motion trajectory. Moreover, we show that the performance of the proposed method can be significantly enhanced by selecting an appropriate reward for MCTS and by using a pre-generated goal state that is guaranteed to be geometrically feasible. A comparative study using TAMP benchmark problems demonstrates the effectiveness of the proposed approach." Dual Quaternion Based Dynamic Movement Primitives to Learn Industrial Tasks Using Teleoperation,"Rohit CHANDRA, Victor Henri Giraud, Mohammad Alkhatib, Youcef Mezouar","SIGMA, UCA Clermont-Ferrand, France,SIGMA-Clermont / Institut Pascal,Université Clermont Auvergne,Clermont Auvergne INP - SIGMA Clermont",Task and Motion Planning,"Dynamic movement primitives (DMPs) provide an effective method of learning manipulation skills from human demonstration. 
DMPs can be especially useful for imitating industrial manipulation tasks which are performed by humans and are difficult to model, for instance, deformable object manipulation. In this work the effectiveness of a conventional Cartesian space DMP is enhanced using a compact and efficient representation of dual quaternions (DQ). We demonstrate that our DQ-based DMP learning approach, which utilizes the geometrical meaning of screw-based kinematics, outperforms traditional decoupled task-space DMPs in terms of accuracy during learning in certain situations. Our DMP formulation affords two additional applications: (1) Filter the noisy and irregular sensing of human demonstration; (2) Limit the robotic manipulator’s task-space velocity during teleoperation, thus improving the safety of the robot and the environment. The learning and filtering strategies are validated on a bimanual robotic system and a motion capture system. We demonstrate the effectiveness of DMP-based manipulation of deformable objects by learning a bimanual deformation trajectory and then using it to perform the same task in new scenarios." Multi-Contact Task and Motion Planning Guided by Video Demonstration,"Kateryna Zorina, David Kovar, Florent Lamiraux, Nicolas Mansard, Justin Carpentier, Josef Sivic, Vladimír Petrík","CIIRC,Czech Technical University in Prague,CNRS,INRIA,Czech Technical University",Task and Motion Planning,"This work aims at leveraging instructional video to guide the solving of complex multi-contact task-and-motion planning tasks in robotics. Towards this goal, we propose an extension of the well-established Rapidly-Exploring Random Tree (RRT) planner, which simultaneously grows multiple trees around grasp and release states extracted from the guiding video. 
Our key novelty lies in combining contact states and 3D object poses extracted from the guiding video with a traditional planning algorithm, which allows us to solve tasks with sequential dependencies, for example, if an object needs to be placed at a specific location to be grasped later. To demonstrate the benefits of the proposed video-guided planning approach, we design a new benchmark with three challenging tasks: (i) 3D re-arrangement of multiple objects between a table and a shelf, (ii) multi-contact transfer of an object through a tunnel, and (iii) transferring objects using a tray in a way similar to how a waiter transfers dishes. We demonstrate the effectiveness of our planning algorithm on several robots, including the Franka Emika Panda and the KUKA KMR iiwa." MVTrans: Multi-View Perception of Transparent Objects,"Yi Ru Wang, Yuchi Zhao, Haoping Xu, Sagi Eppel, Alan Aspuru-guzik, Florian Shkurti, Animesh Garg","University of Toronto, University of Washington,University of Waterloo,University of Toronto",Perception for Grasping and Manipulation II,"Transparent object perception is a crucial skill for applications such as robot manipulation in household and laboratory settings. Existing methods utilize RGB-D or stereo inputs to handle a subset of perception tasks including depth and pose estimation. However, transparent object perception remains an open problem. In this paper, we forgo the unreliable depth map from RGB-D sensors and extend the stereo based method. Our proposed method, MVTrans, is an end-to-end multi-view architecture with multiple perception capabilities, including depth estimation, segmentation, and pose estimation. Additionally, we establish a novel procedural photo-realistic dataset generation pipeline and create a large-scale transparent object detection dataset, Syn TODD, which is suitable for training networks with all three modalities, RGB-D, stereo and multi-view RGB. 
Website: https://ac-rad.github.io/MVTrans/" The Sum of Its Parts: Visual Part Segmentation for Inertial Parameter Identification of Manipulated Objects,"Philippe Nadeau, Matthew Giamou, Jonathan Kelly",University of Toronto,Perception for Grasping and Manipulation II,"To operate safely and efficiently alongside human workers, collaborative robots (cobots) require the ability to quickly understand the dynamics of manipulated objects. However, traditional methods for estimating the full set of inertial parameters rely on motions that are necessarily fast and unsafe (to achieve a sufficient signal-to-noise ratio). In this work, we take an alternative approach: by combining visual and force-torque measurements, we develop an inertial parameter identification algorithm that requires slow or ""stop-and-go"" motions only, and hence is ideally tailored for use around humans. Our technique, called Homogeneous Part Segmentation (HPS), leverages the observation that man-made objects are often composed of distinct, homogeneous parts. We combine a surface-based point clustering method with a volumetric shape segmentation algorithm to quickly produce a part-level segmentation of a manipulated object; the segmented representation is then used by HPS to accurately estimate the object's inertial parameters. To benchmark our algorithm, we create and utilize a novel dataset consisting of realistic meshes, segmented point clouds, and inertial parameters for 20 common workshop tools. Finally, we demonstrate the real-world performance and accuracy of HPS by performing an intricate ""hammer balancing act"" autonomously and online with a low-cost collaborative robotic arm. Our code and dataset are open source and freely available." SLURP! 
Spectroscopy of Liquids Using Robot Pre-Touch Sensing,"Nathaniel Hanson, Wesley Lewis, Kavya Puthuveetil, Donelle Furline Jr, Akhil Padmanabha, Taskin Padir, Zackory Erickson","Northeastern University,Carnegie Mellon University",Perception for Grasping and Manipulation II,"Liquids and granular media are pervasive throughout human environments. Their free-flowing nature causes people to constrain them into containers. We do so with thousands of different types of containers made out of different materials with varying sizes, shapes, and colors. In this work, we present a state-of-the-art sensing technique for robots to perceive what liquid is inside of an unknown container. We do so by integrating Visible to Near Infrared (VNIR) reflectance spectroscopy into a robot's end effector. We introduce a hierarchical model for inferring the material classes of both containers and internal contents given spectral measurements from two integrated spectrometers. To train these inference models, we capture and open source a dataset of spectral measurements from over 180 different combinations of containers and liquids. Our technique demonstrates over 85% accuracy in identifying 13 different liquids and granular media contained within 13 different containers. The sensitivity of our spectral readings allow our model to also identify the material composition of the containers themselves with 96% accuracy. Overall, VNIR spectroscopy presents a promising method to give household robots a general-purpose ability to infer the liquids inside of containers, without needing to open or manipulate the containers." 
Tactile Based Robotic Skills for Cable Routing Operations,"Andrea Monguzzi, Martina Pelosi, Andrea Maria Zanchettin, Paolo Rocco",Politecnico di Milano,Perception for Grasping and Manipulation II,"This paper proposes a set of tactile based skills to perform robotic cable routing operations for deformable linear objects (DLOs) characterized by considerable stiffness and constrained at both ends. In particular, tactile data are exploited to reconstruct the shape of the grasped portion of the DLO and to estimate the future local one. This information is exploited to obtain a grasping configuration aligned to the local shape of the DLO, starting from a rough initial grasping pose, and to follow the DLO's contour in the three-dimensional space. Taking into account the distance travelled along the arc length of the DLO, the robot can detect the cable segments that must be firmly grasped and inserted in intermediate clips, continuing then to slide along the contour until the next portion of the DLO that has to be clipped is reached. The proposed skills are experimentally validated with an industrial robot on different DLOs in several configurations and on a cable routing use case." Category-Level Global Camera Pose Estimation with Multi-Hypothesis Point Cloud Correspondences,"Jun-Jee Chao, Kazim Selim Engin, Nicolai Häni, Volkan Isler",University of Minnesota,Perception for Grasping and Manipulation II,"Correspondence search is an essential step in rigid point cloud registration algorithms. Most methods maintain a single correspondence at each step and gradually remove wrong correspondences. However, building one-to-one correspondence with hard assignments is extremely difficult, especially when matching two point clouds with many locally similar features. This paper proposes an optimization method that retains all possible correspondences for each keypoint when matching a partial point cloud to a complete point cloud.
These uncertain correspondences are then gradually updated with the estimated rigid transformation by considering the matching cost. Moreover, we propose a new point feature descriptor that measures the similarity between local point cloud regions. Extensive experiments show that our method outperforms the state-of-the-art (SoTA) methods even when matching different objects within the same category. Notably, our method outperforms the SoTA methods by up to 20% when registering real-world noisy depth images to a template shape." GSMR-CNN: An End-To-End Trainable Architecture for Grasping Target Objects from Multi-Object Scenes,"Valerija Holomjova, Andrew Joe Starkey, Pascal Meißner",University of Aberdeen,Perception for Grasping and Manipulation II,"We present an end-to-end trainable multi-task model that locates and retrieves target objects from multi-object scenes. The model is an extension of the Siamese Mask R-CNN, which combines the components of Siamese Neural Networks (SNNs) and Mask R-CNN for performing one-shot instance segmentation. The proposed network, called Grasping Siamese Mask R-CNN (GSMR-CNN), extends Siamese Mask R-CNN by adding a branch for grasp detection in parallel to the previous object detection head branches. This allows our model to identify a target object with a suitable grasp simultaneously, as opposed to other approaches that require the training of separate models to achieve the same task. The inherent SNN properties enable the proposed model to generalize and recognize new object categories that were not present during training, which is beyond the capabilities of standard object detectors. Moreover, an end-to-end solution uses shared features, entailing fewer model parameters. The model achieves grasp accuracy scores of 92.1% and 90.4% on the OCID grasp dataset on image-wise and object-wise splits.
Physical experiments show that the model achieves a grasp success rate of 76.4% when correctly identifying the object." 3DSGrasp: 3D Shape-Completion for Robotic Grasp,"Seyed Saber Mohammadi, Nuno Ferreira Duarte, Plinio Moreno, Atabak Dehban, Dimitrios Dimou, Pietro Morerio, Matteo Taiana, Yiming Wang, Alexandre Bernardino, Alessio Del Bue, José Santos-Victor","Istituto Italiano di Tecnologia (IIT),IST-ID,IST-ID ,,, ,,, ,,,,Instituto Superior Tecnico, University of Lisbon,Istituto Italiano di Tecnologia,Italian Institute of Technology (IIT),Fondazione Bruno Kessler,IST - Técnico Lisboa,Instituto Superior Técnico - Lisbon",Perception for Grasping and Manipulation II,"Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and point’s permutation, which generates PCDs that are geometrically consistent and completed properly. Experiments on a wide range of partial PCD show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset are available at: https://github.com/NunoDuarte/3DSGrasp." 
Goal-Conditioned Action Space Reduction for Deformable Object Manipulation,"Shengyin Wang, Rafael Papallas, Matteo Leonetti, Mehmet Remzi Dogar","University of Leeds,King's College London",Perception for Grasping and Manipulation II,"Planning for deformable object manipulation has been a challenge for a long time in robotics due to its high computational cost. In this work, we propose to reduce this cost by reducing the number of pick points on a deformable object in the action space. We do this by identifying a small number of key particles that are sufficient as pick points to reach a given goal state. We find these key particles through a geometric model simplification process, which finds the minimal geometric model that still enables a good approximation of the original model at the goal state. We present an implementation of this general approach for 1-D linear deformable objects (e.g., ropes) that uses a piece-wise line fitted model, and for 2-D flat deformable objects (e.g., cloth) that uses a mesh simplified model. We conducted simulation experiments on ropes and cloths, which demonstrate the effectiveness of the proposed method. Finally, the planned paths are executed in a real-world setting for two cloth folding tasks." MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes,"Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nan-Ning Zheng","Xi'an Jiaotong University,Xi'an Jiaotong Univ.",Perception for Grasping and Manipulation II,"Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important to ensure the safety and reliability of grasping in object stacked scenes. Previous works infer manipulation relationship by deep neural network trained with data collected from a predefined view, which has limitation in visual dislocation in unstructured environments. 
Multi-view data provide more comprehensive information in space, while a challenge of multi-view MRD is domain shift. In this paper, we propose a novel multi-view fusion framework, namely multi-view MRD network (MMRDN), which is trained on 2D and 3D multi-view data. We project the 2D data from different views into a common hidden space and fit the embeddings with a set of Von-Mises-Fisher distributions to learn the consistent representations. Besides, taking advantage of position information within the 3D data, we select a set of $K$ Maximum Vertical Neighbors (KMVN) points from the point cloud of each object pair, which encodes the relative position of these two objects. Finally, the features of multi-view 2D and 3D data are concatenated to predict the pairwise relationship of objects. Experimental results on the challenging REGRAD dataset show that MMRDN outperforms the state-of-the-art methods in multi-view MRD tasks. The results also demonstrate that our model trained on synthetic data is capable of transferring to real-world scenarios." SCARP: 3D Shape Completion in ARbitrary Poses for Improved Grasping,"Bipasha Sen, Aditya Agarwal, Gaurav Singh, Brojeshwar Bhowmick, Srinath Sridhar, Madhava Krishna","International Institute of Information Technology,IIIT Hyderabad,Tata Consultancy Services,Brown University",Perception for Grasping and Manipulation II,"Recovering full 3D shapes from partial observations is a challenging task that has been extensively addressed in the computer vision community. Many deep learning methods tackle this problem by training 3D shape generation networks to learn a prior over the full 3D shapes. In this training regime, the methods expect the inputs to be in a fixed canonical form, without which they fail to learn a valid prior over the 3D shapes. We propose SCARP, a model that performs Shape Completion in ARbitrary Poses.
Given a partial pointcloud of an object, SCARP learns a disentangled feature representation of pose and shape by relying on rotationally equivariant pose features and geometric shape features trained using a multi-tasking objective. Unlike existing methods that depend on an external canonicalization method, SCARP performs canonicalization, pose estimation, and shape completion in a single network, improving the performance by 45% over the existing baselines. In this work, we use SCARP for improving grasp proposals on tabletop objects. By completing partial tabletop objects directly in their observed poses, SCARP enables a SOTA grasp proposal network to improve its proposals by 71.2% on partial shapes. Project page: https://bipashasen.github.io/scarp" Category-Level Shape Estimation for Densely Cluttered Objects,"Zhenyu Wu, Ziwei Wang, Jiwen Lu, Haibin Yan","Beijing University of Posts and Telecommunications,Tsinghua University",Perception for Grasping and Manipulation II,"Accurately estimating the shape of objects in dense clutter makes an important contribution to robotic packing, because the optimal object arrangement requires the robot planner to acquire shape information of all existing objects. However, the objects for packing are usually piled in dense clutter with severe occlusion, and the object shape varies significantly across different instances for the same category. They respectively cause large object segmentation errors and inaccurate shape recovery on unseen instances, which both degrade the performance of shape estimation during deployment. In this paper, we propose a category-level shape estimation method for densely cluttered objects. Our framework partitions each object in the clutter via the multi-view visual information fusion to achieve high segmentation accuracy, and the instance shape is recovered by deforming the category templates with diverse geometric transformation to obtain strengthened generalization ability.
Specifically, we first collect the multi-view RGB-D images of the object clutter for point cloud reconstruction. Then we fuse the feature maps representing the visual information of multi-view RGB images and the pixel affinity learned from the clutter point cloud, where the acquired instance segmentation masks of multi-view RGB images are projected to partition the clutter point cloud. Finally, the instance geometry information is obtained from the partially observed instance point cloud and the corresponding category template, and the deformation parameters regarding the template are predicted for shape estimation. Experiments in the simulated environment and real world show that our method achieves high shape estimation accuracy for densely cluttered everyday objects with various shapes." Counter-Hypothetical Particle Filters for Single Object Pose Tracking,"Elizabeth Olson, Jana Pavlasek, Jasmine Berry, Odest Chadwicke Jenkins",University of Michigan,Perception for Grasping and Manipulation II,"Particle filtering is a common technique for six degree of freedom (6D) pose estimation due to its ability to tractably represent belief over object pose. However, the particle filter is prone to particle deprivation due to the high-dimensional nature of 6D pose. When particle deprivation occurs, it can cause mode collapse of the underlying belief distribution during importance sampling. If the region surrounding the true state suffers from mode collapse, recovering its belief is challenging since the area is no longer represented in the probability mass formed by the particles. Previous methods mitigate this problem by randomizing and resetting particles in the belief distribution, but determining the frequency of reinvigoration has relied on hand-tuning abstract heuristics. In this paper, we estimate the necessary reinvigoration rate at each time step by introducing a Counter-Hypothetical likelihood function, which is used alongside the standard likelihood.
Inspired by the notions of plausibility and implausibility from Evidential Reasoning, the addition of our Counter-Hypothetical likelihood function assigns a level of doubt to each particle. The competing cumulative values of confidence and doubt across the particle set are used to estimate the level of failure within the filter, in order to determine the portion of particles to be reinvigorated. We demonstrate the effectiveness of our method on the rigid body object 6D pose tracking task." Reinforcement Learning Based Pushing and Grasping Objects from Ungraspable Poses,"Hao Zhang, Hongzhuo Liang, Lin Cong, Jianzhi Lyu, Long Zeng, Pingfa Feng, Jianwei Zhang","Tsinghua University,University of Hamburg",Learning for Grasping and Manipulation II,"Grasping an object when it is in an ungraspable pose is a challenging task, such as books or other large flat objects placed horizontally on a table. Inspired by human manipulation, we address this problem by pushing the object to the edge of the table and then grasping it from the hanging part. In this paper, we develop a model-free Deep Reinforcement Learning framework to synergize pushing and grasping actions. We first pre-train a Variational Autoencoder to extract high-dimensional features of input scenario images. One Proximal Policy Optimization algorithm with the common reward and sharing layers of Actor-Critic is employed to learn both pushing and grasping actions with high data efficiency. Experiments show that our one network policy can converge 2.5 times faster than the policy using two parallel networks. Moreover, the experiments on unseen objects show that our policy can generalize to the challenging case of objects with curved surfaces and off-center irregularly shaped objects. Lastly, our policy can be transferred to a real robot without fine-tuning by using CycleGAN for domain adaption and outperforms the push-to-wall baseline." 
Efficient Bimanual Handover and Rearrangement Via Symmetry-Aware Actor-Critic Learning,"Yunfei Li, Chaoyi Pan, Huazhe Xu, Xiaolong Wang, Yi Wu","Tsinghua University,UC San Diego",Learning for Grasping and Manipulation II,"Bimanual manipulation is important for building intelligent robots that unlock richer skills than single arms. We consider a multi-object bimanual rearrangement task, where a reinforcement learning (RL) agent aims to jointly control two arms to rearrange these objects as fast as possible. Solving this task efficiently is challenging for an RL agent due to the requirement of discovering precise intra-arm coordination in an exponentially large control space. We develop a symmetry-aware actor-critic framework that leverages the interchangeable roles of the two manipulators in the bimanual control setting to reduce the policy search space. To handle the compositionality over multiple objects, we augment training data with an object-centric relabeling technique. The overall approach produces an RL policy that can rearrange up to 8 objects with a success rate of over 70% in simulation. We deploy the policy to two Franka Panda arms and further show a successful demo on human-robot collaboration. Videos can be found at https://sites.google.com/view/bimanual." EDO-Net: Learning Elastic Properties of Deformable Objects from Graph Dynamics,"Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael Welle, David Held, Zackory Erickson, Danica Kragic","KTH Royal Institute of Technology,Carnegie Mellon University,KTH",Learning for Grasping and Manipulation II,"We study the problem of learning graph dynamics of deformable objects that generalizes to unknown physical properties. Our key insight is to leverage a latent representation of elastic physical properties of cloth-like deformable objects that can be extracted, for example, from a pulling interaction. 
In this paper we propose EDO-Net (Elastic Deformable Object Net), a model of graph dynamics trained on a large variety of samples with different elastic properties that does not rely on ground-truth labels of the properties. EDO-Net jointly learns an adaptation module, and a forward-dynamics module. The former is responsible for extracting a latent representation of the physical properties of the object, while the latter leverages the latent representation to predict future states of cloth-like objects represented as graphs. We evaluate EDO-Net both in simulation and real world, assessing its capabilities of: 1) generalizing to unknown physical properties, 2) transferring the learned representation to new downstream tasks." Edge Grasp Network: A Graph-Based SE(3)-Invariant Approach to Grasp Detection,"Haojie Huang, Dian Wang, Xupeng Zhu, Robin Walters, Robert Platt",Northeastern University,Learning for Grasping and Manipulation II,"Given point cloud input, the problem of 6-DoF grasp pose detection is to identify a set of hand poses in SE(3) from which an object can be successfully grasped. This important problem has many practical applications. Here we propose a novel method and neural network model that enables better grasp success rates relative to what is available in the literature. The method takes standard point cloud data as input and works well with single-view point clouds observed from arbitrary viewing directions." Learning Dexterous Manipulation from Exemplar Object Trajectories and Pre-Grasps,"Sudeep Dasari, Abhinav Gupta, Vikash Kumar","Carnegie Mellon University,Meta AI",Learning for Grasping and Manipulation II,"Learning diverse dexterous manipulation behaviors with assorted objects remains an open grand challenge. While policy learning methods offer a powerful avenue to attack this problem, these approaches require extensive per-task engineering and algorithmic tuning. 
This paper seeks to escape these constraints, by developing a Pre-Grasp informed Dexterous Manipulation (PGDM) framework that generates diverse dexterous manipulation behaviors, without any task-specific reasoning or hyper-parameter tuning. At the core of PGDM is a well known robotics construct, pre-grasps (i.e. the hand-pose preparing for object interaction). This simple primitive is enough to induce efficient exploration strategies for acquiring complex dexterous manipulation behaviors. To exhaustively verify these claims, we introduce TCDM, a benchmark of 50 diverse manipulation tasks defined over multiple objects and dexterous manipulators. Tasks for TCDM are defined automatically using exemplar object trajectories from diverse sources (animators, human behaviors, etc.), without any per-task engineering and/or supervision. Our experiments validate that PGDM’s exploration strategy, induced by a surprisingly simple ingredient (single pre-grasp pose), matches the performance of prior methods, which require expensive per-task feature/reward engineering, expert supervision, and hyper-parameter tuning. For animated visualizations, trained policies, and project code, please refer to: https://pregrasps.github.io/." A Multi-Agent Approach for Adaptive Finger Cooperation in Learning-Based In-Hand Manipulation,"Lingfeng Tao, Jiucai Zhang, Michael Bowman, Xiaoli Zhang","Colorado School of Mines,Guangzhou Automotive Group R&D Center, Silicon Valley",Learning for Grasping and Manipulation II,"In-hand manipulation is challenging for a multi-finger robotic hand due to its high degrees of freedom and the complex interaction with the object. To enable in-hand manipulation, existing deep reinforcement learning based approaches mainly focus on training a single robot-structure-specific policy through the centralized learning mechanism, lacking adaptability to changes like robot malfunction. 
To solve this limitation, this work treats each finger as an individual agent and trains multiple agents to control their assigned fingers to complete the in-hand manipulation task cooperatively. We propose the Multi-Agent Global-Observation Critic and Local-Observation Actor (MAGCLA) method, where the critic can observe all agents’ actions globally, and the actor only locally observes its neighbors’ actions. Besides, conventional individual experience replay may cause unstable cooperation due to the asynchronous performance increment of each agent, which is critical for in-hand manipulation tasks. To solve this issue, we propose the Synchronized Hindsight Experience Replay (SHER) method to synchronize and efficiently reuse the replayed experience across all agents. The methods are evaluated in two in-hand manipulation tasks on the Shadow dexterous hand. The results show that SHER helps MAGCLA achieve comparable learning efficiency to a single policy. The MAGCLA approach is more generalizable in different tasks. The trained policies have higher adaptability in the robot malfunction test compared to the baseline multi-agent and single-agent approaches." Bimanual Rope Manipulation Skill Synthesis through Context Dependent Correction Policy Learning from Human Demonstration,"Baturhan Akbulut, Tuba Girgin, Arash Mehrabi, Minoru Asada, Emre Ugur, Erhan Oztop","Boğaziçi University,Bogazici University,Ozyegin University,Open and Transdisciplinary Research Initiatives, Osaka Universit,Osaka University / Ozyegin University",Learning for Grasping and Manipulation II,"Learning from demonstration (LfD) with behavior cloning is attractive for its simplicity; however, compounding errors in long and complex skills can be a hindrance. Considering a target skill as a sequence of motor primitives is helpful in this respect. Then the requirement that a motor primitive ends in a state that allows the successful execution of the subsequent primitive must be met.
In this study, we focus on this problem by proposing to learn an explicit correction policy when the expected transition state between primitives is not achieved. The correction policy is learned via behavior cloning by the use of Conditional Neural Motor Primitives (CNMPs) that can generate correction trajectories in a context-dependent way. The advantage of the proposed system over learning the complete task as a single action is shown with a table-top setup in simulation, where an object has to be pushed through a corridor in two steps. Then, the applicability of the proposed method to bi-manual knotting in the real world is shown by equipping an upper-body humanoid robot with the skill of making knots over a bar in 3D space." Sim-And-Real Reinforcement Learning for Manipulation: A Consensus-Based Approach,"Wenxing Liu, Hanlin Niu, Wei Pan, Guido Herrmann, Joaquin Carrasco","United Kingdom Atomic Energy Authority,Delft University of Technology,The University of Manchester",Learning for Grasping and Manipulation II,"Sim-and-real training is a promising alternative to sim-to-real training for robot manipulations. However, the current sim-and-real training is neither efficient, i.e., slow convergence to the optimal policy, nor effective, i.e., sizeable real-world robot data. Given limited time and hardware budgets, the performance of sim-and-real training is not satisfactory. In this paper, we propose a Consensus-based Sim-And-Real deep reinforcement learning algorithm (CSAR) for manipulator pick-and-place tasks, which shows comparable performance in both sim-and-real worlds. In this algorithm, we train the agents in simulators and the real world to get the optimal policies for both sim-and-real worlds. We found two interesting phenomena: (1) The best policy in simulation is not the best for sim-and-real training. (2) The more simulation agents, the better the sim-and-real training. The experimental video is available at: https://youtu.be/mcHJtNIsTEQ."
AutoBag: Learning to Open Plastic Bags and Insert Objects,"Lawrence Yunliang Chen, Baiyu Shi, Daniel Seita, Richard Cheng, Thomas Kollar, David Held, Ken Goldberg","UC Berkeley,Carnegie Mellon University,California Institute of Technology,Toyota Research Institute",Learning for Grasping and Manipulation II,"Thin plastic bags are ubiquitous in retail stores, healthcare, food handling, recycling, homes, and school lunchrooms. They are challenging both for perception (due to specularities and occlusions) and for manipulation (due to the dynamics of their 3D deformable structure). We formulate the task of ""bagging:"" manipulating common plastic shopping bags with two handles from an unstructured initial state to an open state where at least one solid object can be inserted into the bag and lifted for transport. We propose a self-supervised learning framework where a dual-arm robot learns to recognize the handles and rim of plastic bags using UV-fluorescent markings; at execution time, the robot does not use UV markings or UV light. We propose the AutoBag algorithm, where the robot uses the learned perception model to open a plastic bag through iterative manipulation. We present novel metrics to evaluate the quality of a bag state and new motion primitives for reorienting and opening bags based on visual observations. In physical experiments, a YuMi robot using AutoBag is able to open bags and achieve a success rate of 16/30 for inserting at least one item across a variety of initial bag configurations. Supplementary material is available at https://sites.google.com/view/autobag." Toward Fine Contact Interactions: Learning to Control Normal Contact Force with Limited Information,"Jinda Cui, Jiawei Xu, David Saldana, Jeffrey Trinkle","Honda Research Institute USA, Inc.,Lehigh University",Learning for Grasping and Manipulation II,"Dexterous manipulation of objects through fine control of physical contacts is essential for many important tasks of daily living. 
A fundamental ability underlying fine contact control is compliant control, i.e., controlling the contact forces while moving. For robots, the most widely explored approaches heavily depend on models of manipulated objects and expensive sensors to gather contact location and force information needed for real-time control. The models are difficult to obtain, and the sensors are costly, hindering personal robots' adoption in our homes and businesses. This study performs model-free reinforcement learning of a normal contact force controller on a robotic manipulation system built with a low-cost, information-poor tactile sensor. Despite the limited sensing capability, our force controller can be combined with a motion controller to enable fine contact interactions during object manipulation. Promising results are demonstrated in non-prehensile, dexterous manipulation experiments." Ditto in the House: Building Articulation Models of Indoor Scenes through Interactive Perception,"Cheng-chun Hsu, Zhenyu Jiang, Yuke Zhu","The University of Texas at Austin,The University of Texas at Austin",Learning for Grasping and Manipulation II,"Virtualizing the physical world into virtual models has been a critical technique for robot navigation and planning in the real world. To foster manipulation with articulated objects in everyday life, this work explores building articulation models of indoor scenes through a robot's purposeful interactions in these scenes. Prior work on articulation reasoning primarily focuses on siloed objects of limited categories. To extend to room-scale environments, the robot has to efficiently and effectively explore a large-scale 3D space, locate articulated objects, and infer their articulations. We introduce an interactive perception approach to this task.
Our approach, named Ditto in the House, discovers possible articulated objects through affordance prediction, interacts with these objects to produce articulated motions, and infers the articulation properties from the visual observations before and after each interaction. It tightly couples affordance prediction and articulation inference to improve both tasks. We demonstrate the effectiveness of our approach in both simulation and real-world scenes. Code and additional results are available at https://ut-austin-rpl.github.io/HouseDitto/" Zero-Shot Transfer of Haptics-Based Object Insertion Policies,"Samarth Brahmbhatt, Ankur Deka, Andrew Spielberg, Matthias Mueller","Intel Corporation,Intel Labs,Harvard University, MIT,Intel",Learning for Grasping and Manipulation II,"Humans naturally exploit haptic feedback during contact-rich tasks like loading a dishwasher or stocking a bookshelf. Current robotic systems focus on avoiding unexpected contact, often relying on strategically placed environment sensors. Recently, contact-exploiting manipulation policies have been trained in simulation and deployed on real robots. However, they require some form of real-world adaptation to bridge the sim-to-real gap, which might not be feasible in all scenarios. In this paper we train a contact-exploiting manipulation policy in simulation for the contact-rich household task of loading plates into a slotted holder, which transfers without any fine-tuning to the real robot. We investigate various factors necessary for this zero-shot transfer, like time delay modeling, memory representation, and domain randomization. Our policy transfers with minimal sim-to-real gap and significantly outperforms heuristic and learnt baselines. It also generalizes well to a cup and plates of different sizes and weights. The project website is https://sites.google.com/view/compliant-object-insertion." 
Moment-Based Kalman Filter: Nonlinear Kalman Filtering with Exact Moment Propagation,"Yutaka Shimizu, Ashkan Jasour, Maani Ghaffari, Shinpei Kato","TIER IV,MIT,University of Michigan,The University of Tokyo",Localization II,"This paper develops a new nonlinear filter, called Moment-based Kalman Filter (MKF), using an exact moment propagation method. Existing state estimation methods use linearization techniques or sampling points to compute approximated values of moments. However, moment propagation of probability distributions of random variables through nonlinear process and measurement models plays a key role in the development of state estimation and directly affects their performance. The proposed moment propagation method can compute exact moments of non-independent Gaussian random variables, and this allows MKF to propagate exact moments of the uncertain state variables up to any desired order. MKF is derivative-free and does not require tuning parameters. Moreover, MKF has the same computation time complexity as the extended or unscented Kalman filters, i.e., EKF and UKF. The experimental evaluations show that MKF is the preferred filter in comparison to EKF and UKF, and outperforms both filters in non-Gaussian noise regimes." Unsupervised Quality Prediction for Improved Single-Frame and Weighted Sequential Visual Place Recognition,"Helen Carson, Jason Ford, Michael J Milford",Queensland University of Technology,Localization II,"While substantial progress has been made in the absolute performance of localization and Visual Place Recognition (VPR) techniques, it is becoming increasingly clear from translating these systems into applications that other capabilities like integrity and predictability are just as important, especially for safety- or operationally-critical autonomous systems. 
In this research we present a new, training-free approach to predicting the likely quality of localization estimates, and a novel method for using these predictions to bias a sequence-matching process to produce additional performance gains beyond that of a naive sequence matching approach. Our combined system is lightweight, runs in real-time and is agnostic to the underlying VPR technique. On extensive experiments across four datasets and three VPR techniques, we demonstrate our system improves precision performance, especially at the high-precision/low-recall operating point. We also present ablation and analysis identifying the performance contributions of the prediction and weighted sequence matching components in isolation, and the relationship between the quality of the prediction system and the benefits of the weighted sequential matcher." Towards Consistent Batch State Estimation Using a Time-Correlated Measurement Noise Model,"David Juny Yoon, Timothy Barfoot",University of Toronto,Localization II,"In this paper, we present an algorithm for learning time-correlated measurement covariances for application in batch state estimation. We parameterize the inverse measurement covariance matrix to be block-banded, which conveniently factorizes and results in a computationally efficient approach for correlating measurements across the entire trajectory. We train our covariance model through supervised learning using the groundtruth trajectory. In applications where the measurements are time-correlated, we demonstrate improved performance in both the mean posterior estimate and the covariance (i.e., improved estimator consistency). We use an experimental dataset collected using a mobile robot equipped with a laser rangefinder to demonstrate the improvement in performance. We also verify estimator consistency in a controlled simulation using a statistical test over several trials." 
A Probabilistic Framework for Visual Localization in Ambiguous Scenes,"Fereidoon Zangeneh, Leonard Bruns, Amit Dekel, Alessandro Pieropan, Patric Jensfelt","KTH Royal Institute of Technology,Univrses AB,KTH,KTH - Royal Institute of Technology",Localization II,"Visual localization allows autonomous robots to relocalize when losing track of their pose by matching their current observation with past ones. However, ambiguous scenes pose a challenge for such systems, as repetitive structures can be viewed from many distinct, equally likely camera poses, which means it is not sufficient to produce a single best pose hypothesis. In this work, we propose a probabilistic framework that for a given image predicts the arbitrarily shaped posterior distribution of its camera pose. We do this via a novel formulation of camera pose regression using variational inference, which allows sampling from the predicted distribution. Our method outperforms existing methods on localization in ambiguous scenes. We open-source our approach and share our recorded data sequence at https://github.com/efreidun/vapor." RoLM: Radar on LiDAR Map Localization,"Yukai Ma, Xiangrui Zhao, Han Li, Yaqing Gu, Xiaolei Lang, Yong Liu","Zhejiang University,Zhejiang University",Localization II,"Multi-sensor fusion-based localization technology has achieved high accuracy in autonomous systems. How to improve the robustness is the main challenge at present. The most commonly used LiDAR and camera are weather-sensitive, while the FMCW radar has strong adaptability but suffers from noise and ghost effects. In this paper, we propose a heterogeneous localization method of Radar on LiDAR Map (RoLM), which can eliminate the accumulated error of radar odometry in real-time to achieve higher localization accuracy without dependence on loop closures. 
We embed the two sensor modalities into a density map and calculate the spatial vector similarity with offset to seek the corresponding place index in the candidates and calculate the rotation and translation. Based on the coarse alignment, we use the ICP to pursue perfect matching on the LiDAR submap. Extensive experiments on Mulran Radar Dataset, the Oxford Radar RobotCar Dataset, and our data verify the feasibility and effectiveness of our approach." Direct LiDAR-Inertial Odometry: Lightweight LIO with Continuous-Time Motion Correction,"Kenny Chen, Ryan Nemiroff, Brett Lopez","University of California, Los Angeles",Localization II,"Aggressive motions from agile flights or traversing irregular terrain induce motion distortion in LiDAR scans that can degrade state estimation and mapping. Some methods exist to mitigate this effect, but they are still too simplistic or computationally costly for resource-constrained mobile robots. To this end, this paper presents Direct LiDAR-Inertial Odometry (DLIO), a lightweight LiDAR-inertial odometry algorithm with a new coarse-to-fine approach in constructing continuous-time trajectories for precise motion correction. The key to our method lies in the construction of a set of analytical equations which are parameterized solely by time, enabling fast and parallelizable point-wise deskewing. This method is feasible only because of the strong convergence properties in our nonlinear geometric observer, which provides provably correct state estimates for initializing the sensitive IMU integration step. Moreover, by simultaneously performing motion correction and prior generation, and by directly registering each scan to the map and bypassing scan-to-scan, DLIO's condensed architecture is nearly 20% more computationally efficient than the current state-of-the-art with a 12% increase in accuracy. 
We demonstrate DLIO's superior localization accuracy, map quality, and lower computational overhead as compared to four state-of-the-art algorithms through extensive tests using multiple public benchmark and self-collected datasets." Large-Scale Radar Localization Using Online Public Maps,"Ziyang Hong, Y. R. Petillot, Kaicheng Zhang, Shida Xu, Sen Wang","Heriot-Watt University,Imperial College London",Localization II,"In this paper, we propose using online public maps, e.g., OpenStreetMap (OSM), for large-scale radar-based localization without needing a prior sensing map. This can potentially extend the localization system to anywhere worldwide without building, saving, or maintaining a sensing map, as long as an online public map covers the operating area. Existing methods using OSM only use route network or semantics information. These two sources of information are not combined in the previous works, while our proposed system fuses them to improve localization accuracy. Our experiments, on three open datasets collected from three different continents, show that the proposed system outperforms the state-of-the-art localization methods, reducing up to 50% of position errors. We release an open-source implementation for the community." Continuous-Time LiDAR-Inertial-Vehicle Odometry Method with Lateral Acceleration Constraint,"Bin He, Weichen Dai, Zeyu Wan, Hong Zhang, Yu Zhang","Zhejiang University,Hangzhou Dianzi University",Localization II,"In this paper, we propose a continuous-time-based LiDAR-inertial-vehicle odometry method, which can tightly fuse the data from Light Detection And Ranging (LiDAR), inertial measurement units (IMU), and vehicle measurements. The lateral acceleration constraint is further added to trajectory estimation to make the estimated trajectory follow the motion characteristics of vehicles. 
In addition, since vehicle model parameters vary with different motion conditions and tyre pressure, we estimate vehicle correction factors that rectify changes in vehicle model parameters online, and also analyze the observability of these vehicle correction factors. In experiments, the proposed method is evaluated and compared with state-of-the-art methods in the public dataset. The experimental results show that the proposed method achieves more accurate results in all sequences since we add additional sensor measurements and utilize the characteristic of vehicle motion to restrict the trajectory estimation. The ablation study also proved the effectiveness of continuous-time representation, online correction factor estimation, and incorporation of lateral acceleration constraint." Cross-Modal Monocular Localization in Prior LiDAR Maps Utilizing Semantic Consistency,"Zhang Chi, Hengwang Zhao, Chunxiang Wang, Xuanlai Tang, Ming Yang","Shanghai Jiao Tong University,Shanghai Jiaotong University,KEENON Robotics Co., Ltd",Localization III,"Visual localization for mobile robots and intelligent vehicles in prior LiDAR maps can achieve high accuracy and low cost. However, algorithms for finding the cross-modal correspondences between images and LiDAR map points are not yet stable. In this paper, we propose a monocular visual localization system in prior LiDAR maps, which is based on the cross-modal registration to optimize the camera pose. To align the point clouds from vision and LiDAR map, a point-to-plane Iterative Closest Point algorithm utilizing semantic consistency is designed, and a decoupling optimization strategy is proposed to compute the affine transformation for the monocular scale ambiguity. Experiments on KITTI dataset show that utilizing the semantic consistency and geometric information of the map makes our system competitive with other methods. 
On the self-collected dataset, experiments on different light intensities demonstrate the robustness of the system in long-term localization tasks, and the ablation study demonstrates the effectiveness of the proposed algorithms." Multi-State Tightly-Coupled EKF-Based Radar-Inertial Odometry with Persistent Landmarks,"Jan Michalczyk, Roland Jung, Christian Brommer, Stephan Weiss","University of Klagenfurt,Universität Klagenfurt",Localization III,"In this paper, we present a RIO approach that utilizes performance improving modules, enhanced for the sparse and noisy radar signals, from the vision community in order to estimate the full 6DoF pose and 3D velocity of a robot in an unprepared environment. Our method leverages a multi-state approach in which we make use of several past robot poses and trails of measurements from a lightweight and inexpensive FMCW radar sensor. Furthermore, in our estimation framework we include a method for promoting measurement trails to persistent landmarks which correspond to salient features in the environment. In an EKF framework, we fuse the range measurements to the persistent landmarks, trails, and the velocity measurements of the detected 3D points together with the IMU readings. Our method is particularly relevant for (but not limited to) UAV, enabling them to localize while performing missions in GNSS-denied environments and, thanks to the properties of the radar sensor, in environments generally challenging for robot perception due to external factors such as smoke or extreme illumination. We show in real flight experiments the effectiveness of our estimator and compare it to the state-of-the-art." 
Loc-NeRF: Monte Carlo Localization Using Neural Radiance Fields,"Dominic Maggio, Marcus Abate, Jingnan Shi, Courtney Mario, Luca Carlone","Massachusetts Institute of Technology,MIT,Draper",Localization III,"We present Loc-NeRF, a real-time vision-based robot localization approach that combines Monte Carlo localization and Neural Radiance Fields (NeRF). Our system uses a pre-trained NeRF model as the map of an environment and can localize itself in real-time using an RGB camera as the only exteroceptive sensor onboard the robot. While neural radiance fields have seen significant applications for visual rendering in computer vision and graphics, they have found limited use in robotics. Existing approaches for NeRF-based localization require both a good initial pose guess and significant computation, making them impractical for real-time robotics applications. By using Monte Carlo localization as a workhorse to estimate poses using a NeRF map model, Loc-NeRF is able to perform localization faster than the state of the art and without relying on an initial pose estimate. In addition to testing on synthetic data, we also run our system using real data collected by a Clearpath Jackal UGV and demonstrate for the first time the ability to perform real-time and global localization (albeit over a small workspace) with neural radiance fields. We make our code publicly available at https://github.com/MIT-SPARK/Loc-NeRF." RoSS: Rotation-Induced Aliasing for Audio Source Separation,"Hyungjoo Seo, Sahil Bhandary Karnoor, Romit Roy Choudhury",University of Illinois at Urbana-Champaign,Localization III,"This paper considers the problem of audio source separation, where the goal is to isolate a target audio signal (say Alice’s speech) from a mixture of multiple interfering signals (e.g., when many people are talking). 
This problem has gained renewed interest mainly due to the significant growth in voice-controlled devices, including robots in homes, offices, and other public facilities. Although a rich body of work exists on the core topic of source separation, we find that rotational motion of the microphones (e.g., a swiveling robot-head) offers complementary gains. We show that rotating the microphone array to the optimal orientation can produce desirable delay aliasing between two interferers, causing the two interferers to appear as one. In general, a mixture of K signals becomes a mixture of (K-1) signals, a mathematically concrete gain. We show that the gain translates well to practice, provided two rotation-related challenges can be mitigated. This paper is focused on mitigating these challenges and demonstrating the end-to-end performance on a fully functional prototype. We believe that our Rotational Source Separation (RoSS) module could be plugged into actual robot heads or into other devices (like Amazon Show) that are also capable of rotation." L-C*: Visual-Inertial Loose Coupling for Resilient and Lightweight Direct Visual Localization,"Shuji Oishi, Kenji Koide, Masashi Yokozuka, Atsuhiko Banno","National Institute of Advanced Industrial Science and Technology (AIST),National Institute of Advanced Industrial Science and Technology,Nat. Inst. of Advanced Industrial Science and Technology,National Institute of Advanced Industrial Science and Technology",Localization III,"This study presents a framework, L-C*, for resilient and lightweight direct visual localization, employing a loosely coupled fusion of visual and inertial data. Unlike indirect methods, direct visual localization facilitates accurate pose estimation on general color three-dimensional maps that are not tailored for visual localization. However, it suffers from temporal localization failures and high computational costs for real-time applications. 
For long-term and real-time visual localization, we developed an L-C* that incorporates direct visual localization C* in a visual-inertial loose coupling. By capturing ego-motion via visual-inertial odometry to interpolate global pose estimates, the framework allows for a significant reduction in the frequency of demanding global localization, thereby facilitating lightweight but reliable visual localization. In addition, forming a closed loop that feeds the latest pose estimate to the visual localization component as an initial guess for the next pose inference renders the system highly robust. A quantitative evaluation of a simulation dataset demonstrated the accuracy and efficiency of the proposed framework. Experiments using smartphone sensors also demonstrated the robustness and resiliency of L-C* in real-world situations." GRM: Gradient Rectification Module for Visual Place Retrieval,"Boshu Lei, Wenjie Ding, Limeng Qiao, Xi Qiu","University of Pennsylvania,MEGVII Inc,Megvii Inc.,Megvii",Localization III,"Visual place retrieval aims to search images in the database that depict similar places as the query image. However, global descriptors encoded by the network usually fall into a low dimensional principal space, which is harmful to the retrieval performance. We first analyze the cause of this phenomenon, pointing out that it is due to degraded distribution of the gradients of descriptors. Then, we propose Gradient Rectification Module (GRM) to alleviate this issue. GRM is appended after the final pooling layer and can rectify the gradients to the complement space of the principal space. With GRM, the network is encouraged to generate descriptors more uniformly in the whole space. At last, we conduct experiments on multiple datasets and generalize our method to classification task under prototype learning framework." 
DytanVO: Joint Refinement of Visual Odometry and Motion Segmentation in Dynamic Environments,"Shihao Shen, Yilin Cai, Wenshan Wang, Sebastian Scherer",Carnegie Mellon University,Localization III,"Learning-based visual odometry (VO) algorithms achieve remarkable performance on common static scenes, benefiting from high-capacity models and massive annotated data, but tend to fail in dynamic, populated environments. Semantic segmentation is largely used to discard dynamic associations before estimating camera motions but at the cost of discarding static features and is hard to scale up to unseen categories. In this paper, we leverage the mutual dependence between camera ego-motion and motion segmentation and show that both can be jointly refined in a single learning-based framework. In particular, we present DytanVO, the first supervised learning-based VO method that deals with dynamic environments. It takes two consecutive monocular frames in real-time and predicts camera ego-motion in an iterative fashion. Our method achieves an average improvement of 27.7% in ATE over state-of-the-art VO solutions in real-world dynamic environments, and even performs competitively among dynamic visual SLAM systems which optimize the trajectory on the backend. Experiments on plentiful unseen environments also demonstrate our method's generalizability." NOCaL: Calibration-Free Semi-Supervised Learning of Odometry and Camera Intrinsics,"Ryan Griffiths, Jack Naylor, Donald G Dansereau",University of Sydney,Localization III,"There are a multitude of emerging imaging technologies that could benefit robotics. However the need for bespoke models, calibration and low-level processing represents a key barrier to their adoption. In this work we present NOCaL, Neural Odometry and Calibration using Light fields, a semi-supervised learning architecture capable of interpreting previously unseen cameras without calibration. 
NOCaL learns to estimate camera parameters, relative pose, and scene appearance. It employs a scene-rendering hypernetwork pre-trained on a large number of existing cameras and scenes, and adapts to previously unseen cameras using a small supervised training set to enforce metric scale. We demonstrate NOCaL on rendered and captured imagery using conventional cameras, demonstrating calibration-free odometry and novel view synthesis. This work represents a key step toward automating the interpretation of general camera geometries and emerging imaging technologies. Code and datasets are available at https://roboticimaging.org/Projects/NOCaL/." Efficient View Path Planning for Autonomous Implicit Reconstruction,"Jing Zeng, Yanxu Li, Yunlong Ran, Shuo Li, Shibo He, Fei Gao, Lincheng Li, Jiming Chen, Qi Ye","Zhejiang University,NetEase Fuxi AI Lab",Vision-Based Navigation II,"Implicit neural representations have shown promising potential for 3D scene reconstruction. Recent work applies it to autonomous 3D reconstruction by learning information gain for view path planning. Effective as it is, the computation of the information gain is expensive, and compared with that using volumetric representations, collision checking using the implicit representation for a 3D point is much slower. In the paper, we propose to 1) leverage a neural network as an implicit function approximator for the information gain field and 2) combine the implicit fine-grained representation with coarse volumetric representations to improve efficiency. Further with the improved efficiency, we propose a novel informative path planning based on a graph-based planner. Our method demonstrates significant improvements in the reconstruction quality and planning efficiency compared with autonomous reconstructions with implicit and explicit representations. We deploy the method on a real UAV and the results show that our method can plan informative views and reconstruct a scene with high quality." 
"Lighthouses and Global Graph Stabilization: Active SLAM for Low-Compute, Narrow-FoV Robots","Mohit Deshpande, Richard Kim, Dhruva Kumar, Jong Jin Park, James Zamiska","Amazon Lab,,,,Amazon, Lab,,,,Amazon",Vision-Based Navigation II,"Autonomous exploration to build a map of an unknown environment is a fundamental robotics problem. However, the quality of the map directly influences the quality of subsequent robot operation. Instability in a simultaneous localization and mapping (SLAM) system can lead to poor-quality maps and subsequent navigation failures during or after exploration. This becomes particularly noticeable in consumer robotics, where compute budget and limited field-of-view are very common. In this work, we propose (i) the concept of lighthouses: panoramic views with high visual information content that can be used to maintain the stability of the map locally in their neighborhoods and (ii) the final stabilization strategy for global pose graph stabilization. We call our exploration strategy SLAM-aware exploration (SAE), and it is currently deployed on Astro, the first home robot from Amazon. To the best of the authors’ knowledge, this is the first large-scale real-world application with a functioning active visual SLAM system." ExAug: Robot-Conditioned Navigation Policies Via Geometric Experience Augmentation,"Noriaki Hirose, Dhruv Shah, Ajay Sridhar, Sergey Levine","UC Berkeley / TOYOTA Motor North America,University of California, Berkeley,UC Berkeley",Vision-Based Navigation II,"Machine learning techniques rely on large and diverse datasets for generalization. Computer vision, natural language processing, and other applications can often reuse public datasets to train many different models. However, due to differences in physical configurations, it is challenging to leverage public datasets for training robotic control policies on new robot platforms or for new tasks. 
In this work, we propose a novel framework, ExAug to augment the experiences of different robot platforms from multiple datasets in diverse environments. ExAug leverages a simple principle: by extracting 3D information in the form of a point cloud, we can create much more complex and structured augmentations, utilizing both generating synthetic images and geometric-aware penalization that would have been suitable in the same situation for a different robot, with different size, turning radius, and camera placement. The trained policy is evaluated on two new robot platforms with three different cameras in indoor and outdoor environments with obstacles." Multi-Object Navigation in Real Environments Using Hybrid Policies,"Assem Sadek, Guillaume Bono, Boris Chidlovskii, Atilla Baskurt, Christian Wolf","Naver Labs Europe,Naverlabs Europe,INSA Lyon",Vision-Based Navigation II,"Navigation has been classically solved in robotics through the combination of SLAM and planning. More recently, beyond waypoint planning, problems involving significant components of (visual) high-level reasoning have been explored in simulated environments, mostly addressed with large-scale machine learning, in particular RL, offline-RL or imitation learning. These methods require the agent to learn various skills like local planning, mapping objects and querying the learned spatial representations. In contrast to simpler tasks like waypoint planning (PointGoal), for these more complex tasks the current state-of-the-art models have been thoroughly evaluated in simulation but, to our best knowledge, not yet in real environments. In this work we focus on sim2real transfer. We target the challenging Multi-Object Navigation (Multi-ON) task and port it to a physical environment containing real replicas of the originally virtual Multi-ON objects. 
We introduce a hybrid navigation method, which decomposes the problem into two different skills: (1) waypoint navigation is addressed with classical SLAM combined with a symbolic planner, whereas (2) exploration, semantic mapping and goal retrieval are dealt with deep neural networks trained with a combination of supervised learning and RL. We show the advantages of this approach compared to end-to-end methods both in simulation and a real environment and outperform the SOTA for this task." AeriaLPiPS: A Local Planner for Aerial Vehicles with Geometric Collision Checking,"Justin Smith, Patricio A. Vela",Georgia Institute of Technology,Vision-Based Navigation II,"Real-time navigation in non-trivial environments by micro aerial vehicles (MAVs) predominantly relies on modelling the MAV with idealized geometry, such as a sphere. Simplified, conservative representations increase the likelihood of a planner failing to identify valid paths. That likelihood increases the more a robot's geometry differs from the idealized version. Few current approaches consider these situations; we are unaware of any that do so using perception space representations. This work introduces the egocan, a perception space obstacle representation using line-of-sight free space estimates, and 3DGap, a perception space approach to gap finding for identifying goal-directed, collision-free directions of travel through 3D space. Both are integrated, with real-time considerations in mind, to define a local planner module of a hierarchical navigation system. The result is Aerial Local Planning in Perception Space (AeriaLPiPS). AeriaLPiPS is shown to be capable of safely navigating a MAV with non-idealized geometry through various environments, including those impassable by traditional real-time approaches. The open source implementation of this work is available at github.com/ivaROS/AeriaLPiPS." 
Frontier Semantic Exploration for Visual Target Navigation,"Bangguo Yu, Hamidreza Kasaei, Ming Cao",University of Groningen,Vision-Based Navigation II,"This work focuses on the problem of visual target navigation, which is very important for autonomous robots as it is closely related to high-level tasks. To find a special object in unknown environments, classical and learning-based approaches are fundamental components of navigation that have been investigated thoroughly in the past. However, due to the difficulty in the representation of complicated scenes and the learning of the navigation policy, previous methods are still not adequate, especially for large unknown scenes. Hence, we propose a novel framework for visual target navigation using the frontier semantic policy. In this proposed framework, the semantic map and the frontier map are built from the current observation of the environment. Using the features of the maps and object category, deep reinforcement learning enables to learn a frontier semantic policy which can be used to select a frontier cell as a long-term goal to explore the environment efficiently. Experiments on Gibson and Habitat-Matterport 3D (HM3D) demonstrate that the proposed framework significantly outperforms existing map-based methods in terms of success rate and efficiency. Ablation analysis also indicates that the proposed approach learns a more efficient exploration policy based on the frontiers. A demonstration is provided to verify the applicability of applying our model to real-world transfer. The supplementary video and code can be accessed via the following link: https://sites.google.com/view/fsevn." 
VINet: Visual and Inertial-Based Terrain Classification and Adaptive Navigation Over Unknown Terrain,"Tianrui Guan, Ruitao Song, Zhixian Ye, Liangjun Zhang","University of Maryland,Aptiv Corporation,Baidu",Vision-Based Navigation II,"We present a visual and inertial-based terrain classification network (VINet) for robotic navigation over different traversable surfaces. We use a novel navigation-based labeling scheme for terrain classification and generalization on unknown surfaces. Our proposed perception method and adaptive scheduling control framework can make predictions according to terrain navigation properties and lead to better performance on both terrain classification and navigation control on known and unknown surfaces. Our VINet can achieve 98.37% in terms of accuracy under supervised setting on known terrains and improve the accuracy by 8.51% on unknown terrains compared to previous methods. We deploy VINet on a mobile tracked robot for trajectory following and navigation on different terrains, and we demonstrate an improvement of 10.3% compared to a baseline controller in terms of RMSE." Ground Then Navigate: Language-Guided Navigation in Dynamic Scenes,"Kanishk Jain, Varun Chhangani, Amogh Tiwari, Madhava Krishna, Vineet Gandhi",IIIT Hyderabad,Vision-Based Navigation II,"We investigate the problem of Vision-and-Language Navigation (VLN) in the context of autonomous driving in outdoor settings. We solve the problem by explicitly grounding the navigable regions corresponding to the textual command. At each timestamp, the model predicts a segmentation mask corresponding to the intermediate or the final navigable region. Our work contrasts with existing efforts in VLN, which pose this task as a node selection problem, given a discrete connected graph corresponding to the environment. We do not assume the availability of such a discretised map. 
Our work moves towards continuity in action space, provides interpretability through visual feedback and allows VLN on commands like “park between the two cars”, requiring finer manoeuvres. Furthermore, we propose a novel meta dataset CARLA-NAV to allow efficient training and validation. The dataset comprises pre-recorded training sequences and a live environment for validation and testing. We provide extensive qualitative and quantitative empirical results to validate the efficacy of the proposed approach." 3-Dimensional Sonic Phase-Invariant Echo Localization,Christopher Hahne,University of Bern,Localization and Mapping II,"Parallax and Time-of-Flight (ToF) are often regarded as complementary in robotic vision where various light and weather conditions remain challenges for advanced camera-based 3-Dimensional (3-D) reconstruction. To this end, this paper establishes Parallax among Corresponding Echoes (PaCE) to triangulate acoustic ToF pulses from arbitrary sensor positions in 3-D space for the first time. This is achieved through a novel round-trip reflection model that pinpoints targets at the intersection of ellipsoids, which are spanned by sensor locations and detected arrival times. Inter-channel echo association becomes a crucial prerequisite for target detection and is learned from feature similarity obtained by a stack of Siamese Multi-Layer Perceptrons (MLPs). The PaCE algorithm enables phase-invariant 3-D object localization from only 1 isotropic emitter and at least 3 ToF receivers with relaxed sensor position constraints. Experiments are conducted with airborne ultrasound sensor hardware and back this hypothesis with quantitative results." Calibration and Uncertainty Characterization for Ultra-Wideband Two-Way-Ranging Measurements,"Mohammed A.
Shalaby, Charles Champagne Cossette, James Richard Forbes, Jerome Le Ny","McGill University,Polytechnique Montreal",Localization and Mapping II,"Ultra-Wideband (UWB) systems are becoming increasingly popular for indoor localization, where range measurements are obtained by measuring the time-of-flight of radio signals. However, the range measurements typically suffer from a systematic error or bias that must be corrected for high accuracy localization. In this paper, a ranging protocol is proposed alongside a robust and scalable antenna-delay calibration procedure to accurately and efficiently calibrate antenna delays for many UWB tags. Additionally, the bias and uncertainty of the measurements are modelled as a function of the received-signal power. The full calibration procedure is presented using experimental training data of 3 aerial robots fitted with 2 UWB tags each, and then evaluated on 2 test experiments. A localization problem is then formulated on the experimental test data, and the calibrated measurements and their modelled uncertainty are fed into an extended Kalman filter (EKF). The proposed calibration is shown to yield an average of 46% improvement in localization accuracy. Lastly, the paper is accompanied by an open-source UWB-calibration Python library, which can be found at https://github.com/decargroup/uwb_calibration." High Resolution Point Clouds from mmWave Radar,"Akarsh Prabhakara, Tao Jin, Arnav Das, Gantavya Bhatt, Lilly Kumari, Elahe Soltanaghai, Jeff Bilmes, Swarun Kumar, Anthony Rowe","Carnegie Mellon University,University of Washington,University of Illinois Urbana-Champaign",Localization and Mapping II,"This paper explores a machine learning approach on data from a single-chip mmWave radar for generating high resolution point clouds -- a key sensing primitive for robotic applications such as mapping, odometry and localization. 
Unlike lidar and vision-based systems, mmWave radar can operate in harsh environments and see through occlusions like smoke, fog, and dust. Unfortunately, current mmWave processing techniques offer poor spatial resolution compared to lidar point clouds. This paper presents RadarHD, an end-to-end neural network that constructs lidar-like point clouds from low resolution radar input. Enhancing radar images is challenging due to the presence of specular and spurious reflections. Radar data also doesn't map well to traditional image processing techniques due to the signal's sinc-like spreading pattern. We overcome these challenges by training RadarHD on a large volume of raw I/Q radar data paired with lidar point clouds across diverse indoor settings. Our experiments show the ability to generate rich point clouds even in scenes unobserved during training and in the presence of heavy smoke occlusion. Further, RadarHD's point clouds are high-quality enough to work with existing lidar odometry and mapping workflows." Pyramid Learnable Tokens for 3D LiDAR Place Recognition,"Congcong Wen, Hao Huang, Yu-shen Liu, Yi Fang","New York University Abu Dhabi,New York University,Tsinghua University",Localization and Mapping II,"3D LiDAR place recognition plays a vital role in various robot applications, including robotic navigation, autonomous driving, and simultaneous localization and mapping. However, most previous studies evaluated their models on accumulated 2D scans instead of real-world 3D LiDAR scans with a larger number of points, which limits the application in real scenarios. To address this limitation, we propose a point transformer network with pyramid learnable tokens (PTNet-PLT) to learn global descriptors for an actual scanned 3D LiDAR place recognition. Specifically, we first present a novel shifted cube attention module that consists of a self-attention module for local feature extraction and a cross-attention module for regional feature aggregation.
The self-attention module constrains attention computation on a locally partitioned cube and builds connections across cubes based on the shifted cube scheme. In addition, the cross-attention module introduces several learnable tokens to separately aggregate features of points with similar features but spatially distant into an arbitrarily shaped region, which enables the model to capture long-term dependencies of the points. Next, we build a pyramid architecture network to learn multi-scale features and involve a decreasing number of tokens at each layer to aggregate features over a larger region. Finally, we obtain the global descriptor by concatenating learned region tokens of all layers. Experiments on three datasets, including USyd Campus, Oxford Robot-Car, and KITTI, demonstrate the effectiveness and generalization of the proposed model for large-scale 3D LiDAR place recognition." A Decoupled and Linear Framework for Global Outlier Rejection Over Planar Pose Graph,"Tianyue Wu, Fei Gao",Zhejiang University,Localization and Mapping II,"We propose a robust framework for the planar pose graph optimization contaminated by loop closure outliers. Our framework rejects outliers by first decoupling the robust PGO problem wrapped by a Truncated Least Squares kernel into two subproblems. Then, the framework introduces a linear angle representation to rewrite the first subproblem that is originally formulated with rotation matrices. The framework is configured with the Graduated Non-Convexity (GNC) algorithm to solve the two non-convex subproblems in succession without initial guesses. Thanks to the linearity properties of both the subproblems, our framework requires only linear solvers to optimally solve the optimization problems encountered in GNC. We extensively validate the proposed framework, named DEGNC-LAF (DEcoupled Graduated Non-Convexity with Linear Angle Formulation) in planar PGO benchmarks. 
It turns out that it runs significantly (sometimes up to over 30 times) faster than the standard and general-purpose GNC while resulting in high-quality estimates." Robust Incremental Smoothing and Mapping (riSAM),"Daniel Mcgann, John G. Rogers Iii, Michael Kaess","Carnegie Mellon University,US Army Research Laboratory",Localization and Mapping II,"This paper presents a method for robust optimization for online incremental Simultaneous Localization and Mapping (SLAM). Due to the NP-Hardness of data association in the presence of perceptual aliasing, tractable (approximate) approaches to data association will produce erroneous measurements. We require SLAM back-ends that can converge to accurate solutions in the presence of outlier measurements while meeting online efficiency constraints. Existing robust SLAM methods either remain sensitive to outliers, become increasingly sensitive to initialization, or fail to provide online efficiency. We present the robust incremental Smoothing and Mapping (riSAM) algorithm, a robust back-end optimizer for incremental SLAM based on Graduated Non-Convexity. We demonstrate on benchmarking datasets that our algorithm achieves online efficiency, outperforms existing online approaches, and matches or improves the performance of existing offline methods." Real-Time Simultaneous Localization and Mapping with LiDAR Intensity,"Wenqiang Du, Giovanni Beltrame","Polytechnique Montreal,Ecole Polytechnique de Montreal",Localization and Mapping II,"We propose a novel real-time LiDAR intensity image-based simultaneous localization and mapping method, which addresses the geometry degeneracy problem in unstructured environments. Traditional LiDAR-based front-end odometry mostly relies on geometric features such as points, lines and planes. A lack of these features in the environment can lead to the failure of the entire odometry system. 
To avoid this problem, we extract feature points from the LiDAR-generated point cloud that match features identified in LiDAR intensity images. We then use the extracted feature points to perform scan registration and estimate the robot ego-movement. For the back-end, we jointly optimize the distance between the corresponding feature points, and the point to plane distance for planes identified in the map. In addition, we use the features extracted from intensity images to detect loop closure candidates from previous scans and perform pose graph optimization. Our experiments show that our method can run in real time with high accuracy and works well with illumination changes, low-texture, and unstructured environments." iMODE: Real-Time Incremental Monocular Dense Mapping Using Neural Field,"Hidenobu Matsuki, Edgar Sucar, Tristan Laidlow, Kentaro Wada, Raluca Scona, Andrew J Davison","Imperial College London,Mujin, Inc.,Ocado Technology",Localization and Mapping II,"We present the first real-time dense and semantic neural field mapping system that uses only monocular images as input. Our scene representation is a dense continuous radiance field represented by a Multi-Layer Perceptron (MLP), trained from scratch in real-time. We build on high-performance sparse visual SLAM and use camera poses and sparse keypoint depths as supervision alongside RGB keyframes. Since no prior training is required, our system flexibly fits to arbitrary scale and structure at runtime, and works even with strong specular reflections. We demonstrate reconstruction over a range of scenes from small indoor to large outdoor spaces. We also show that the method can straightforwardly benefit from additional inputs such as learned depth priors or semantic labels for more precise and advanced mapping." 
Probabilistic Uncertainty Quantification of Prediction Models with Application to Visual Localization,"Junan Chen, Josephine Monica, Wei-Lun Chao, Mark Campbell",Cornell University,Localization and Mapping II,"The uncertainty quantification of prediction models (e.g., neural networks) is crucial for their adoption in many robotics applications. This is arguably as important as making accurate predictions, especially for safety-critical applications such as self-driving cars. This paper proposes our approach to uncertainty quantification in the context of visual localization for autonomous driving, where we predict locations from images. Our proposed framework estimates probabilistic uncertainty by creating a sensor error model that maps an internal output of the prediction model to the uncertainty. The sensor error model is created using multiple image databases of visual localization, each with ground-truth location. We demonstrate the accuracy of our uncertainty prediction framework using the Ithaca365 dataset, which includes variations in lighting, weather (sunny, snowy, night), and alignment errors between databases. We analyze both the predicted uncertainty and its incorporation into a Kalman-based localization filter. Our results show that prediction error variations increase with poor weather and lighting condition, leading to greater uncertainty and outliers, which can be predicted by our proposed uncertainty model. Additionally, our probabilistic error model enables the filter to remove ad hoc sensor gating, as the uncertainty automatically adjusts the model to the input data." 
Extrinsic Calibration for Highly Accurate Trajectories Reconstruction,"Maxime Vaidis, William Dubois, Alexandre Guénette, Johann Laconte, Vladimir Kubelka, Francois Pomerleau","Université Laval,University of Toronto,Örebro University",Localization and Mapping II,"In the context of robotics, accurate ground-truth positioning is the cornerstone for the development of mapping and localization algorithms. In outdoor environments and over long distances, total stations provide accurate and precise measurements, that are unaffected by the usual factors that deteriorate the accuracy of Global Navigation Satellite System (GNSS). While a single robotic total station can track the position of a target in three Degrees Of Freedom (DOF), three robotic total stations and three targets are necessary to yield the full six DOF pose reference. Since it is crucial to express the position of targets in a common coordinate frame, we present a novel extrinsic calibration method of multiple robotic total stations with field deployment in mind. The proposed method does not require the manual collection of ground control points during the system setup, nor does it require tedious synchronous measurement on each robotic total station. Based on extensive experimental work, we compare our approach to the classical extrinsic calibration methods used in geomatics for surveying and demonstrate that our approach brings substantial time savings during the deployment. Tested on more than 30 km of trajectories, our new method increases the precision of the extrinsic calibration by 25 % compared to the best state-of-the-art method, which is the one taking manually static ground control points." 
Cerberus: Low-Drift Visual-Inertial-Leg Odometry for Agile Locomotion,"Shuo Yang, Zixin Zhang, Zhengyu Fu, Zachary Manchester","Carnegie Mellon University,The Hong Kong University of Science and Technology",Localization and Mapping II,"We present an open-source Visual-Inertial-Leg Odometry (VILO) state estimation solution for legged robots, called Cerberus, which precisely estimates position on various terrains in real-time using a set of standard sensors, including stereo cameras, IMU, joint encoders, and contact sensors. In addition to estimating robot states, we perform online kinematic parameter calibration and outlier rejection to substantially reduce position drift. Hardware experiments in various indoor and outdoor environments validate that online calibration of kinematic parameters can reduce estimation drift to less than 1% during long-distance, high-speed locomotion. Our drift results are better than those of any other state estimation method using the same set of sensors reported in the literature. Moreover, our state estimator performs well even when the robot experiences large impacts and camera occlusion. The implementation of the state estimator, along with the datasets used to compute our results, is available at https://github.com/ShuoYangRobotics/Cerberus." "Ensembles of Compact, Region-Specific & Regularized Spiking Neural Networks for Scalable Place Recognition","Somayeh Hussaini, Michael J Milford, Tobias Fischer",Queensland University of Technology,Localization and Mapping II,"Spiking neural networks have significant potential utility in robotics due to their high energy efficiency on specialized hardware, but proof-of-concept implementations have not yet typically achieved competitive performance or capability with conventional approaches. 
In this paper, we tackle one of the key practical challenges of scalability by introducing a novel modular ensemble network approach, where compact, localized spiking networks each learn and are solely responsible for recognizing places in a local region of the environment only. This modular approach creates a highly scalable system. However, it comes with a high-performance cost where a lack of global regularization at deployment time leads to hyperactive neurons that erroneously respond to places outside their learned region. Our second contribution introduces a regularization approach that detects and removes these problematic hyperactive neurons during the initial environmental learning phase. We evaluate this new scalable modular system on benchmark localization datasets Nordland and Oxford RobotCar, with comparisons to standard techniques NetVLAD, DenseVLAD, and SAD, and a previous spiking neural network system. Our system substantially outperforms the previous SNN system on its small dataset, but also maintains performance on 27 times larger benchmark datasets where the operation of the previous system is computationally infeasible, and performs competitively with the conventional localization systems." Line As a Visual Sentence: Context-Aware Line Descriptor for Visual Localization,"Sungho Yoon, Ayoung Kim","NAVER LABS,Seoul National University",Localisation 1,"Along with feature points for image matching, line features provide additional constraints to solve visual geometric problems in robotics and computer vision (CV). Although recent convolutional neural network (CNN)-based line descriptors are promising for viewpoint changes or dynamic environments, we claim that the CNN architecture has innate disadvantages to abstract variable line length into the fixed dimensional descriptor. In this paper, we effectively introduce Line-Transformers dealing with variable lines. 
Inspired by natural language processing (NLP) tasks where sentences can be understood and abstracted well in neural nets, we view a line segment as a sentence that contains points (words). By attending to well-describable points on a line dynamically, our descriptor performs excellently on variable line lengths. We also propose line signature networks sharing the line’s geometric attributes with neighborhoods. Performing as group descriptors, the networks enhance line descriptors by understanding lines’ relative geometries. Finally, we present the proposed line descriptor and matching in a Point and Line Localization (PL-Loc). We show that the visual localization with feature points can be improved using our line features. We validate the proposed method for homography estimation and visual localization." Robust Visual Localization of a UAV Over a Pipe-Rack Based on the Lie Group SE(3),"Vincenzo Lippiello, Jonathan Cacace","University of Naples FEDERICO II,University of Naples",Localisation 1,"Visual inspection and maintenance of industrial pipes using robots represent an emerging application in Oil & Gas and refinery facilities. In this domain, we present a pose tracking system based on a single camera sensor to localize an unmanned aerial vehicle operating over a pipe-rack to carry out inspection activities. We propose a unified framework based on the Lie group SE(3) which allows the simultaneous estimation of the pose of the UAV along with some parameters of the pipe-rack model. Numerical simulations have been performed to demonstrate the effectiveness of the proposed approach." Finding the Right Place: Sensor Placement for UWB Time Difference of Arrival Localization in Cluttered Indoor Environments,"Zhao Wenda, Abhishek Goudar, Angela P. Schoellig","University of Toronto,TU Munich",Localisation 1,"Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization has recently emerged as a promising indoor positioning solution. 
However, in cluttered environments, both the UWB radio positions and the obstacle-induced non-line-of-sight (NLOS) measurement biases significantly impact the quality of the position estimate. Consequently, the placement of the UWB radios must be carefully designed to provide satisfactory localization accuracy for a region of interest. In this work, we propose a novel algorithm that optimizes the UWB radio positions for a pre-defined region of interest in the presence of obstacles. The mean-squared error (MSE) metric is used to formulate an optimization problem that balances the influence of the geometry of the radio positions and the NLOS effects. We further apply the proposed algorithm to compute a minimal number of UWB radios required for a desired localization accuracy and their corresponding positions. In a real-world cluttered environment, we show that the designed UWB radio placements provide 51% and 76% localization root-mean-squared error (RMSE) reduction in 2D and 3D experiments, respectively, when compared against trivial placements." EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale,"Jacek Komorowski, Monika Wysoczanska, Tomasz Trzcinski",Warsaw University of Technology,Localisation 1,"The paper presents a deep neural network-based method for global and local descriptors extraction from a point cloud acquired by a rotating 3D LiDAR. The descriptors can be used for two-stage 6DoF relocalization. First, a coarse position is retrieved by finding candidates with the closest global descriptor in the database of geo-tagged point clouds. Then, the 6DoF pose between a query point cloud and a database point cloud is estimated by matching local descriptors and using a robust estimator such as RANSAC. Our method has a simple, fully convolutional architecture based on a sparse voxelized representation.
It can efficiently extract a global descriptor and a set of keypoints with local descriptors from large point clouds with tens of thousands of points. Our code and pretrained models are publicly available on the project website." "Stein Particle Filter for Nonlinear, Non-Gaussian State Estimation","Fahira Afzal Maken, Fabio Ramos, Lionel Ott","Data,,, CSIRO,University of Sydney, NVIDIA,ETH Zurich",Localisation 1,"Estimation of a dynamical system's latent state subject to sensor noise and model inaccuracies remains a critical yet difficult problem in robotics. While Kalman filters provide the optimal solution in the least squared sense for linear and Gaussian noise problems, the general nonlinear and non-Gaussian noise case is significantly more complicated, typically relying on sampling strategies that are limited to low-dimensional state spaces. In this paper we devise a general inference procedure for filtering of nonlinear, non-Gaussian dynamical systems that exploits the differentiability of both the update and prediction models to scale to higher dimensional spaces. Our method, Stein particle filter, can be seen as a deterministic flow of particles, embedded in a reproducing kernel Hilbert space, from an initial state to the desirable posterior. The particles evolve jointly to conform to a posterior approximation while interacting with each other through a repulsive force. We evaluate the method in simulation and in complex localization tasks while comparing it to sequential Monte Carlo solutions." Faster-LIO: Lightweight Tightly Coupled Lidar-Inertial Odometry Using Parallel Sparse Incremental Voxels,"Chunge Bai, Tao Xiao, Yajie Chen, Haoqian Wang, Fang Zhang, Xiang Gao","Tsinghua University,Beijing Idriverplus Technology Co. Ltd.,IDRIVERPLUS,Beijing Idriverplus Technology Co., Ltd.,idriverplus.com",Localisation 1,"This paper presents an incremental voxel-based lidar-inertial odometry (LIO) method for fast-tracking spinning and solid-state lidar scans.
To achieve the high tracking speed, we neither use complicated tree-based structures to divide the spatial point cloud nor the strict k nearest neighbor (k-NN) queries to compute the point matching. Instead, we use the incremental voxels (iVox) as our point cloud spatial data structure, which is modified from the traditional voxels and supports incremental insertion and parallel approximated k-NN queries. We propose the linear iVox and PHC (Pseudo Hilbert Curve) iVox as two alternative underlying structures in our algorithm. The experiments show that the speed of iVox reaches 1000-2000 Hz per scan in solid-state lidars and over 200 Hz for 32 lines spinning lidars only with a modern CPU while still preserving the same level of accuracy." Homography-Based Loss Function for Camera Pose Regression,"Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Vincent Hugel","Ifremer,Université de Toulon, Aix Marseille Univ, CNRS, LIS,Université de Toulon,University of Toulon",Localisation 1,"Some recent visual-based relocalization algorithms rely on deep-learning methods to perform camera pose regression from image data. This paper focuses on the loss functions that embed the error between two poses to perform deep-learning based camera pose regression. Existing loss functions are either difficult-to-tune multi-objective functions or present unstable reprojection errors that rely on ground-truth 3D scene points and require a two-step training. To deal with these issues, we introduce a novel loss function which is based on a multiplane homography integration. This new function does not require prior initialization and only depends on physically interpretable hyperparameters. Furthermore, the experiments carried out on well established relocalization datasets show that it minimizes best the mean square reprojection error during training when compared with existing loss functions." 
Broadband Sound Source Localization Via Non-Synchronous Measurements for Service Robots: A Tensor Completion Approach,"Long Chen, Weize Sun, Lei Huang, Liang Yu","Northwestern Polytechnical University,Shenzhen University,Shanghai Jiao Tong University",Localisation 1,"Constrained by the physical geometry, the lower and upper frequency bound and the scale of the scanning area of a microphone array are limited. Owing to its movable feature, for the service robots, achieving a wider working frequency range with a global view requires a virtually larger and denser array, which can be realised using non-synchronous measurements beamforming with a movable microphone array prototype. However, even when using the state-of-the-art method, it is challenging to localise multiple broadband sources, owing to the difficulty in selecting an appropriate operating frequency without any prior information about the target signal. Therefore, this letter proposes a tensor-completion-based non-synchronous measurements method for broadband multiple-sound-source localisation. The tensor data structure of the broadband signal is analysed, and an alternating direction method based on multiplier optimisation with a tensor multi-norm constraint is proposed. This algorithm can provide a sound map with a distinct global view of three different speech signal sources with high accuracy. Compared with the matrix-based optimisation method, the proposed method can significantly reduce the mean square error of the estimated source location." Proprioceptive Soft Pneumatic Gripper for Extreme Environments Using Hybrid Optical Fibers,"Babar Jamil, Gyeongjae Yoo, Youngjin Choi, Hugo Rodrigue","Sungkyunkwan University,University of Rochester,Hanyang University",Soft Sensors and Actuators,"Soft pneumatic robotic grippers can be used to conduct a wide range of tasks in extreme environments as they are generally free of conductive or magnetic components.
But the sensorization of these grippers can potentially make them unsuitable for such environments and lower their performance through the introduction of electronic components or conductive materials into their structure. This work introduces the design of a sensorized soft pneumatic gripper using hybrid optical fibers making use of rigid and hard segments to measure the bending deformation and the contact force at specific segments of the finger. Using rigid optical fibers, which have low optical loss, for optical signal transmission allows all electronic components to be placed outside and away from the gripper, ensuring the structure of the gripper and its nearby components are non-conductive and non-magnetic. The utilization of rigid reinforcements on the soft pneumatic actuator also helps maintain the performance of the actuator close to existing non-sensorized soft pneumatic actuators. This sensorized actuator is then used in a proprioceptive soft robotic gripper capable of controlling its grasping force." Modeling and Characterizing Two Dielectric Elastomer Folding Actuators for Origami-Inspired Robot,"Li Yang, Ting Zhang",Soochow University,Soft Sensors and Actuators,"Origami-inspired robots are automated machines that achieve form and function through folding. This letter models and characterizes two dielectric elastomer folding actuators, and provides a new driving method for origami-inspired robots. We place multilayer bending dielectric elastomer actuators (MBDEAs) and minimum energy structural dielectric elastomer actuators (MESDEAs) at the creases to directly drive a rigid origami-inspired robot joint. An analysis model of MBDEAs is established based on beam bending theory. This model establishes the relationships between the deformation angle and output moment of the MBDEA and the input electric field intensity according to the design parameters.
In addition, this study provides a design model of MESDEAs with the rectangular hollow area for the equilibrium deformation angle and maximum output moment. The analysis model of MBDEAs for the deformation angle and the design model of MESDEAs are both experimentally verified. Finally, we realize the drive of the joint based on MBDEAs and MESDEAs. The design and model are evaluated through experiments. This work paves the way for subsequent motion control of origami-inspired robots and further expands the application fields of dielectric elastomer actuators (DEAs)." Deployable Soft Pneumatic Networks (D-PneuNets) Actuator with Dual-Morphing Origami Chambers for High Compactness,"Woongbae Kim, Bada Seo, Sung Yol Yu, Kyu-Jin Cho","Korea Institute of Science and Technology,Seoul National University,Seoul National University, Biorobotics Laboratory",Soft Sensors and Actuators,"Soft pneumatic networks (PneuNets) actuators are widely considered as a robotic solution for safety and simplicity, yet the trade-off relationship between output force and bulkiness limits usability. In this letter, we propose a deployable soft pneumatic networks (D-PneuNets) actuator consisting of origami-designed chambers. In response to applied pressure, the origami chambers rapidly deploy to increase the moment arm, while simultaneously inflating to bend the whole body through mutual contacts and asymmetric lengthening. Consequently, the D-PneuNets actuator can generate relatively large force, and also provide compactness when not in use. Experiment and finite element analysis results show that the D-PneuNets actuator can grow to more than 2.5 times its initial height, and generate an output force more than 10 times higher compared to that of a conventional soft pneumatic networks actuator with the same initial height. In addition, a dual-material origami chamber with relatively stiff side walls was developed to prevent an unnecessary bulge out.
Finally, a robotic soft glove using D-PneuNets actuators was developed and verified for grasping various everyday-objects. Our approach offers a design strategy to overcome the trade-off relationship of soft pneumatic networks actuators, enabling compact designs of soft robotic devices." Soft Fluidic Actuator for Locomotion in Multi-Phase Environments,"Roza Gkliva, Maarja Kruusmaa",Tallinn University of Technology,Soft Sensors and Actuators,"This letter presents the design, development, and experimental assessment of a soft fluidic actuator that can enable locomotion in a variety of aquatic and terrestrial environments. Most actuation strategies for amphibious locomotion rely on rigid, fast moving components to generate thrust and tractive forces. Our prototype, comprising soft materials, and relying on simple motion planning and control strategies, demonstrates two gaits, that we employ for locomotion in two vastly different scenarios, underwater swimming and moving on granular terrain with varying levels of water content. By adjusting its internal pressure, the actuator dynamically varies its stiffness and shape, and transitions between wheel and soft paddle form. Experimental results of locomotion in controlled laboratory conditions serve as proof-of-concept for the proposed actuator's efficacy. Using two different motion patterns and control schemes, we show that this prototype achieves both thrust and tractive forces." Contact Surface and Pose Recognition: Utilizing Multipole Magnetic Tactile Sensor with Meta Learning Model,"Ziwei Xia, Bin Fang, Fuchun Sun, Huaping Liu, Wei Feng Xu, Ling Fu, Yiyong Yang","China University of Geosciences, Haidian District, Beijing, Chin,Tsinghua university,Tsinghua University,Siemens Ltd., China,School of Engineering and Technology, China University of Geosci",Soft Sensors and Actuators,"Soft magnetic tactile sensors have been gradually applied to robotic systems due to their low-cost and simple fabrication. 
The previous soft magnetic tactile sensor was developed for tactile features of a single point (i.e., force/location) estimation and proved the feasibility of experiments. However, extracting tactile features of a surface (i.e., contact shape) by magnetic sensors remains a challenge, which limits the application. In this paper, a soft magnetic tactile sensor that can extract contact surface shape and pose features is fabricated and a multipole magnetization method is developed to improve the performance of the tactile sensor. Furthermore, we propose a metric-based meta-learning method to extract the tactile feature of the contact surface shape and pose from magnetic data under limited sample conditions and the method is verified by a series of experiments. The experimental results show that our method can achieve more than 80% accuracy in contact shape recognition and more than 95% accuracy in contact pose recognition. The experimental results demonstrate that our method can extract tactile features under limited data conditions and has a certain generalization ability for new contact data." Force/Torque-Sensorless Joint Stiffness Estimation in Articulated Soft Robots,"Maja Trumic, Giorgio Grioli, Kosta Jovanovic, Adriano Fagiolini","University of Belgrade,Istituto Italiano di Tecnologia,University of Belgrade, Serbia,University of Palermo",Soft Sensors and Actuators,"Currently, the access to the knowledge of stiffness values is typically constrained to a-priori identified models or datasheet information, which either do not usually take into account the full range of possible stiffness values or need extensive experiments. This work tackles the challenge of stiffness estimation in articulated soft manipulators, and it proposes an innovative solution adding value to the previous research by removing the necessity for force/torque sensors and generalizing to multi-degree-of-freedom robots. 
Built upon the theory of unknown input-state observers and recursive least-square algorithms, the solution is independent of the actuator model parameters and its internal control signals. The validity of the approach is proven analytically for single and multiple degree-of-freedom robots. The obtained estimators are first evaluated via simulations on articulated soft robots with different actuations and then tested in experiments with real robotic setups using antagonistic variable stiffness actuators." Retractable Locking System Driven by Shape Memory Alloy Actuator for Lightweight Soft Robotic Application,"Young Jin Gong, Seong Taek Hwang, Sang Yul Yang, Kihyeon Kim, Jae Hyeong Park, Hosang Jung, Dongsu Shin, Hyouk Ryeol Choi","SungKyunKwan university(SKKU),Sungkyunkwan University(SKKU),Sungkyunkwan university,Sungkyunkwan University,Sungkwunkwan University",Soft Sensors and Actuators,"Shape memory alloy actuators (SMA) are widely used in robots owing to their flexibility and light weight. However, there is a drawback in that SMAs need to be used as a bundle owing to their lower force density than the motor. In addition, SMAs require continuous current to maintain a contracted state, which results in the overheating of the actuator, causing breakdown and excessive energy consumption. This study presents a new locking system that can mechanically hold the contracted state of an actuator without consuming energy to overcome the disadvantages of a spring-type SMA. The locking mechanism was designed with a ratchet structure and pawl to prevent the actuator from stretching in the opposite direction when locked. In addition, to switch the locking and unlocking states with a single one-way actuation, a bistable retractable mechanism was applied to the system. Repeatability experiments and an analysis of the locking system resolution were conducted to validate the performance of the proposed mechanism. 
Furthermore, an underactuated finger applied to the proposed locking system was fabricated, and its feasibility is experimentally presented." Elastic-Actuation Mechanism for Repetitive Hopping Based on Power Modulation and Cyclic Trajectory Generation,"Won Dong Shin, William Stewart, Matthew Estrada, Auke Ijspeert, Dario Floreano","EPFL,Ecole Polytechnique Federale de Lausanne,École polytechnique fédérale de Lausanne,Ecole Polytechnique Federal, Lausanne",Soft Sensors and Actuators,"Animal locomotion results from a combination of power modulation and cyclic appendage trajectories, but combining these two properties in small-sized robots is difficult. Here we introduce and characterize a new elastic actuation system based on an inverted cam that is capable of generating cyclic locomotion with controlled elastic energy charge and release for small-sized robots. We designed a leg linkage and attached it to the inverted cam to develop a single-legged hopping platform with one actuated degree of freedom. The hopping platform was able to continuously hop forward at 1.82 Hz. The average horizontal hopping distance was 18.7 cm, and the average forward speed was 0.34 m/s. This speed corresponds to a Froude number of 0.14. The energy consumed for one hop was 2.09 J, and the corresponding energetic cost of transport was 6.43. The combination of inverted cam and cyclic trajectory generation has the potential to be used in other robotic applications such as flapping wings in the air and tail fin waving in water." Learning-Based Fabric Folding and Box Wrapping,"Xiaoman Wang, Jie Zhao, Xin Jiang, Yunhui Liu","Harbin Institute of Technology, Shenzhen,Chinese University of Hong Kong",Manipulation and Grasping I,"Manipulation of deformable objects is an essential task in surgery, the textile industry, and household tasks, such as washing, hanging, and folding clothes. 
However, current studies on fabric manipulation have rarely considered scenes in which rigid objects have to be wrapped by fabrics. This type of operation is widely adopted in logistics packaging and in the packaging of surgical instrument baskets. In this study, we propose a method to perform this operation, which can be used in fabric folding or wrapping a box with fabric. Owing to the complex dynamics and configuration spaces of fabric, our method is based on deep imitation learning to estimate pick-place points and the phase of the manipulation process. The dataset was completely generated from an open-source physical simulator. To make the data as realistic as that in actual scenarios, we adopted domain randomization and rendered the texture of fabric and box from the real world to simulation data. This not only helps in transferring the learned policies to a physical robot, but also allows the robot to wrap a box with complex patterns. The experiments demonstrate the efficiency of the developed method for accomplishing complex manipulation tasks. The results also showed that it could be generalized to fabrics with different colors and boxes with different sizes, textures, or geometric shapes." Few-Shot Instance Grasping of Novel Objects in Clutter,"Weikun Guo, Wei Li, Ziye Hu, Zhongxue Gan","Fudan University,ENN Group",Manipulation and Grasping I,"Instance grasping, which aims to grasp a specific object out of clutter, is a fundamental task within robotics. However, allowing a robot to quickly learn to perform instance grasping for new, previously unseen objects remains challenging. In this work, we present an instance grasping meta-learning framework (IGML), a simple yet effective end-to-end approach that not only teaches robots to identify novel objects but also how to grasp them. 
Given only a few examples to specify the grasping point of the target object, our IGML can quickly learn to recognize the target object and grasp it at the demonstrated grasping point by leveraging prior experience. Experimental results on the test sets show that IGML achieves decent success rates in cluttered environments, significantly surpassing state-of-the-art methods. Then we deployed IGML on a UR5 robot arm to handle pick-and-place scenarios and achieved a precision rate of 93.4% and a recall rate of 87.1%." TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and a Grasping Baseline,"Hongjie Fang, Hao-shu Fang, Sheng Xu, Cewu Lu","Shanghai Jiao Tong University,ShangHai Jiao Tong University",Manipulation and Grasping I,"Transparent objects are common in our daily life and frequently handled in the automated production line. Robust vision-based robotic grasping and manipulation for these objects would be beneficial for automation. However, the majority of current grasping algorithms would fail in this case since they heavily rely on the depth image, while ordinary depth sensors usually fail to produce accurate depth information for transparent objects owing to the reflection and refraction of light. In this work, we address this issue by contributing a large-scale real-world dataset for transparent object depth completion, which contains 57,715 RGB-D images from 130 different scenes. Our dataset is the first large-scale, real-world dataset that provides ground truth depth, surface normals, transparent masks in diverse and cluttered scenes. Cross-domain experiments show that our dataset is more general and can enable better generalization ability for models. Moreover, we propose an end-to-end depth completion network, which takes the RGB image and the inaccurate depth map as inputs and outputs a refined depth map. 
Experiments demonstrate superior efficacy, efficiency and robustness of our method over previous works, and it is able to process images of high resolutions under limited hardware resources. Real robot experiments show that our method can also be applied to novel transparent object grasping robustly. The full dataset and our method are publicly available at www.graspnet.net/transcg." Dual-Arm Control for Coordinated Fast Grabbing and Tossing of an Object,"Michael Bombile, Aude G. Billard","Ecole Polytechnique Federale de Lausanne (EPFL),EPFL",Manipulation and Grasping I,"Picking up objects to toss them on a conveyor belt are activities generated on a daily basis in the industry. These tasks are still done largely by humans. This paper proposes a unified motion generator for a bimanual robotic arm system that enables two 7 degrees of freedom robotic arms to grab and toss an object in one swipe. Unlike classical approaches that grab the object with quasi-zero relative contact velocity, the proposed approach is able to grasp the object while in motion. We control the contact forces prior to and following impact so as to stabilize the robots' grip on the object. We show that such swift grasping speeds up the pick-and-place process and reduces energy expenditure for tossing. Continuous control of reach, grab and toss motion is achieved by combining a sequence of time-invariant dynamical systems in a single control framework. We introduce a state-dependent modulation function to control the generated velocity in different directions. The framework is validated in simulation and on a real dual-arm system. We show that we can precisely toss objects within a workspace of 0.2 by 0.4 square meters. Moreover, we show that the algorithm can adapt on the fly to changes in object location." 
RBO Hand 3 - a Platform for Soft Dexterous Manipulation,"Steffen Puhlmann, Jason Harris, Oliver Brock","TU Berlin,Technische Universitaet Berlin,Technische Universität Berlin",Manipulation and Grasping I,"We present the RBO Hand 3, a highly capable and versatile anthropomorphic soft hand based on pneumatic actuation. The RBO Hand 3 is designed to enable dexterous manipulation, to facilitate transfer of insights about human dexterity, and to serve as a robust research platform for extensive real-world experiments. It achieves these design goals by combining many degrees of actuation with intrinsic compliance, replicating relevant functioning of the human hand, and by combining robust components in a modular design. The RBO Hand 3 possesses 16 independent degrees of actuation, implemented in a dexterous opposable thumb, two-chambered fingers, an actuated palm, and the ability to spread the fingers. In this work, we derive the design objectives that are based on experimentation with the hand’s predecessors, observations about human grasping, and insights about principles of dexterity. We explain in detail how the design features of the RBO Hand 3 achieve these goals and evaluate the hand by demonstrating its ability to achieve the highest possible score in the Kapandji test for thumb opposition, to realize all 33 grasp types of the comprehensive GRASP taxonomy, to replicate common human grasping strategies, and to perform dexterous in-hand manipulation." A Multi-DoF Exoskeleton Haptic Device for the Grasping of a Compliant Object Adapting to a User's Motion Using Jamming Transitions,"Ryohei Michikawa, Takahiro Endo, Fumitoshi Matsuno","Kyoto university,Kyoto University",Manipulation and Grasping I,"In the recently growing field of virtual reality, the virtual grasping sensation remains in its infancy, and the discrepancy between real and virtual grasping has become a problem. 
This paper develops a new haptic glove that presents the sensation of different types of grasping (e.g., power/precision/intermediate grasping) a compliant object. To this end, we develop an exoskeleton that fixes the movements of fingers to present a braking force and a flexible pad that reproduces a wide range of stiffness. By applying the ``string jamming mechanism,'' our exoskeleton has functions that existing devices do not. In that, it constrains the motion of finger extension/flexion and adduction/abduction with only one actuator and presents the braking force to all surfaces of the finger. In addition, we propose a lightweight and compact variable-stiffness pad that reproduces an extensive stiffness range based on the layer jamming technique. We conducted experiments to evaluate the mechanical performance of the prototype integrating the exoskeleton and the variable-stiffness pad, and demonstrated the usefulness of the proposed glove." Peg-In-Hole Assembly with Dual-Arm Robot and Dexterous Robot Hands,"Dong-Hyuk Lee, Myoung-su Choi, Hyeonjun Park, Ga-ram Jang, Jae-Han Park, Ji-Hun Bae","Korea Institute of Industrial Technology (KITECH),KITECH, UST,Korea Institute of Robotics & Technology Convergence,Korea Institute of Industrial Technology",Manipulation and Grasping I,"This study focuses on implementing robotic peg-in-hole in a more human-like approach, namely using dual-arms and dexterous robotic hands. The peg-in-hole strategy in this study mainly consists of two parts: grasping strategy (hand part) and assembly strategy (arm part). The grasping strategy explains the fundamental grasping method called ``advanced blind grasping'' and the in-hand manipulation method that is required for the reorientation of the workpiece. A feed-forward taskspace force control scheme is proposed for the actual implementation. 
The assembly strategy presents fundamental unit motions called ``perturbation pattern'' and proposes four assembly stages that are constructed by a combination of the unit motions. A force-position hybrid control for the implementation of the assembly strategy was also addressed. For evaluation, a peg-in-hole assembly demonstration with a keyhole-like shape was conducted using a human-sized 50-DOF upper-body robot." Manipulation Planning Using Wave Variables,"Phongsaen Pitakwatchara, Jetnipit Arunrat","Chulalongkorn University,Chula university",Manipulation and Grasping I,"There are plenty of low-level controllers for the robots to perform manipulation tasks. Most of them require specifying of either reference motion or force as the input. However, motion or force planning to interact with unstructured environment is very difficult. This letter proposes the use of wave variables which facilitate the computation of suitable reference motion or force through the wave planner framework. The notion is to specify the supplied power, via input wave, instead of desired motion or force. Input wave will be resolved into motion and force according to the port impedance. The method ensures safe manipulation when applying to passive control systems. Experimental results show advantages of using this wave planner with the impedance and force controllers." Active Inference and Behavior Trees for Reactive Action Planning and Execution in Robotics,"Corrado Pezzato, Carlos Hernandez Corbato, Stefan Bonhof, Martijn Wisse","Delft University of Technology,TU Delft",Manipulation and Grasping I,"We propose a hybrid combination of active inference and behavior trees (BTs) for reactive action planning and execution in dynamic environments, showing how robotic tasks can be formulated as a free-energy minimization problem. 
The proposed approach allows handling partially observable initial states and improves the robustness of classical BTs against unexpected contingencies while at the same time reducing the number of nodes in a tree. In this work, we specify the nominal behavior offline, through BTs. However, in contrast to previous approaches, we introduce a new type of leaf node to specify the desired state to be achieved rather than an action to execute. The decision of which action to execute to reach the desired state is performed online through active inference. This results in continual online planning and hierarchical deliberation. By doing so, an agent can follow a predefined offline plan while still keeping the ability to locally adapt and take autonomous decisions at runtime, respecting safety constraints. We provide proof of convergence and robustness analysis, and we validate our method in two different mobile manipulators performing similar tasks, both in a simulated and real retail environment. The results showed improved runtime adaptability with a fraction of the hand-coded nodes compared to classical BTs." Physically Consistent Preferential Bayesian Optimization for Food Arrangement,"Yuhwan Kwon, Yoshihisa Tsurumine, Takeshi Shimmura, Sadao Kawamura, Takamitsu Matsubara","Nara Institute of Science and Technology,Ritsumeikan University",Human Centered and Inspired Robotics,"This paper considers the problem of estimating a preferred food arrangement for users from interactive pairwise comparisons using Computer Graphics (CG)-based dish images. As a foodservice industry requirement, we need to utilize domain rules for the geometry of the arrangement (e.g., the food layout of some Japanese dishes is reminiscent of mountains). However, those rules are qualitative and ambiguous; the estimated result might be physically inconsistent (e.g., each food physically interferes, and the arrangement becomes infeasible). 
To cope with this problem, we propose Physically Consistent Preferential Bayesian Optimization (PCPBO) as a method that obtains physically feasible and preferred arrangements that satisfy domain rules. PCPBO employs a bi-level optimization that combines a physical simulation-based optimization and a Preference-based Bayesian Optimization (PbBO). Our experimental results demonstrated the effectiveness of PCPBO on simulated and actual human users." Multi-Objective Trajectory Optimization to Improve Ergonomics in Human Motion,"Waldez Gomes, Pauline Maurice, Eloise Dalin, Jean-Baptiste Mouret, Serena Ivaldi","Université Paris-Saclay,CNRS - LORIA,INRIA,Inria",Human Centered and Inspired Robotics,"Work-related musculoskeletal disorders are a major health issue often caused by awkward postures. Identifying and recommending more ergonomic body postures requires optimizing the worker's motion with respect to ergonomics criteria based on the human kinematic/kinetic state. However, many ergonomics scores assess different risks at different places of the human body, and therefore, optimizing for only one score might lead to postures that are either inefficient or that transfer the risk to a different location. We verified, in two work activities, that optimizing for a single ergonomics score may lead to motions that degrade scores other than the optimized one. To address this problem, we propose a multi-objective optimization approach that can find better Pareto-optimal trade-off motions that simultaneously optimize multiple scores. Our simulation-based approach is also user-specific and can be used to recommend ergonomic postures to workers with different body morphologies. Additionally, it can be used to generate ergonomic reference trajectories for robot controllers in human-robot collaboration." 
Interactive Dynamic Walking: Learning Gait Switching Policies with Generalization Guarantees,"Prem Chand, Sushant Veer, Ioannis Poulakakis","University of Delaware,NVIDIA",Human Centered and Inspired Robotics,"In this paper, we consider the problem of adapting a dynamically walking bipedal robot to follow a leading co worker while engaging in tasks that require physical interaction. Our approach relies on switching among a family of Dynamic Movement Primitives (DMPs) as governed by a supervisor. We train the supervisor to orchestrate the switching among the DMPs in order to adapt to the leader's intentions, which are only implicitly available in the form of interaction forces. The primary contribution of our approach is its ability to furnish certificates of generalization to novel leader intentions for the trained supervisor. This is achieved by leveraging the Probably Approximately Correct (PAC)-Bayes bounds from generalization theory. We demonstrate the efficacy of our approach by training a neural-network supervisor to adapt the gait of a dynamically walking biped to a leading collaborator whose intended trajectory is not known explicitly." Deep Predictive Model Learning with Parametric Bias: Handling Modeling Difficulties and Temporal Model Changes,"Kento Kawaharazuka, Kei Okada, Masayuki Inaba",The University of Tokyo,Human Centered and Inspired Robotics,"When a robot executes a task, it is necessary to model the relationship among its body, target objects, tools, and environment, and to control its body to realize the target state. However, it is difficult to model them using classical methods if the relationship is complex. In addition, when the relationship changes with time, it is necessary to deal with the temporal changes of the model. In this study, we have developed Deep Predictive Model with Parametric Bias (DPMPB) as a more human-like adaptive intelligence to deal with these modeling difficulties and temporal model changes. 
We categorize and summarize the theory of DPMPB and various task experiments on the actual robots, and discuss the effectiveness of DPMPB." Power-Based Velocity-Domain Variable Structure Passivity Signature Control for Physical Human-(Tele)Robot Interaction,"Peter Paik, Smrithi Thudi, S. Farokh Atashzar","New York University,New York University (NYU), US",Human Centered and Inspired Robotics,"The Excess of Passivity (EoP) of the human biomechanics plays an imperative role in absorbing the interaction energy during physical human-(tele)robot interaction and can be exploited by controllers used for stabilization of human-centered (tele)robotic systems. However, the first generation of nonlinear EoP-based stabilizers loaded the force reflection channel resulting in degradation of the force profile. This will challenge applications that are heavily dependent on the quality of force reflection such as telerobotic rehabilitation. This paper explores the possibility of developing a nonlinear stabilizer that modifies the reflected velocity to the follower-side operator based on the corresponding EoP map. The paper provides the mathematical derivation and stability proof of the nonlinear design of the stabilizer named “power-based velocity-domain variable structure passivity signature control (PV-VSPSC).” The proposed nonlinear stabilizer is evaluated through systematic experiments and systematic grid simulation studies in this paper." Human-Multirobot Collaborative Mobile Manipulation: The Omnid Mocobots,"Matthew Elwin, Billie Strong, Randy Freeman, Kevin Lynch",Northwestern University,Human Centered and Inspired Robotics,"The Omnid human-collaborative mobile manipulators are an experimental platform for testing control architectures for autonomous and human-collaborative multirobot mobile manipulation. 
An Omnid consists of a mecanum-wheel omnidirectional mobile base and a series-elastic Delta-type parallel manipulator, and it is a specific implementation of a broader class of mobile collaborative robots (``mocobots'') suitable for safe human co-manipulation of delicate, flexible, and articulated payloads. Key features of mocobots include passive compliance, for the safety of the human and the payload, and high-fidelity end-effector force control independent of the potentially imprecise motions of the mobile base. We describe general considerations for the design of teams of mocobots; the design of the Omnids in light of these considerations; manipulator and mobile base controllers to achieve useful multirobot collaborative behaviors; and initial experiments in human-multirobot collaborative mobile manipulation of large, unwieldy payloads, where the mocobot team renders the payloads weightless for effortless human co-manipulation. For these experiments, the only communication among the humans and Omnids is mechanical, through the payload." TransDSSL: Transformer Based Depth Estimation Via Self-Supervised Learning,"Daechan Han, Jeongmin Shin, Namil Kim, Soonmin Hwang, Yukyung Choi","Sejong university,Sejong University,NAVER LABS,Carnegie Mellon University",Deep Learning for Visual Perception,"Recently, transformers have been widely adopted for various computer vision tasks and show promising results due to their ability to encode long-range spatial dependencies in an image effectively. However, very few studies on adopting transformers in self-supervised depth estimation have been conducted. When replacing the CNN architecture with the transformer in self-supervised learning of depth, we encounter several problems, such as a problematic multi-scale photometric loss function when used with transformers and an insufficient ability to capture local details. 
In this paper, we propose an attention-based decoder module, Pixel-Wise Skip Attention (PWSA), to enhance fine details in feature maps while keeping global context from transformers. In addition, we propose utilizing self-distillation loss with single-scale photometric loss to alleviate the instability of transformer training by using correct training signals. We demonstrate that the proposed model performs accurate predictions on large objects and thin structures that require global context and local details. Our model achieves state-of-the-art performance among the self-supervised monocular depth estimation methods on KITTI and DDAD benchmarks." Stereo Plane R-CNN: Accurate Scene Geometry Reconstruction Using Planar Segments and Camera-Agnostic Representation,"Jan Wietrzykowski, Dominik Belter",Poznan University of Technology,Deep Learning for Visual Perception,"The article introduces a novel method for planar segments detection and description from a stereo pair of images. The existing systems for planes detection utilize single RGB images and have accuracy- and scale-related problems regarding 3D reconstruction with the obtained planar segments. The proposed approach draws inspiration from deep-learning-based systems for plane detection and depth reconstruction. Firstly, we improve the planes detection in the image. Secondly, we enhance geometry reconstruction accuracy using a stereo setup. To achieve the 3D model of the observed planes, we introduce a novel neural network architecture and training strategy that jointly optimizes the prediction of disparity, normal vectors, and plane parameters. Moreover, the proposed approach utilizes an efficient camera-agnostic representation of the problem. Finally, we show that our system outperforms existing approaches to planar segments detection and parameters estimation and improves the reconstruction accuracy of indoor environments." 
Object-Aware Monocular Depth Prediction with Instance Convolutions,"Enis Simsar, Evin Pınar Örnek, Fabian Manhardt, Helisa Dhamo, Nassir Navab, Federico Tombari","ETH Zurich,Technical University of Munich,Google,Technische Universität München,TU Munich",Deep Learning for Visual Perception,"With the advent of deep learning, estimating depth from a single RGB image has recently received a lot of attention, being capable of empowering many different applications ranging from path planning for robotics to computational cinematography. Nevertheless, while the depth maps are in their entirety fairly reliable, the estimates around object discontinuities are still far from satisfactory. This can be attributed to the fact that the convolutional operator naturally aggregates features across object discontinuities, resulting in smooth transitions rather than clear boundaries. Therefore, in order to circumvent this issue, we propose a novel convolutional operator which is explicitly tailored to avoid feature aggregation of different object parts. In particular, our method is based on estimating per-part depth values by means of super-pixels. The proposed convolutional operator, which we dub “Instance Convolution”, then only considers each object part individually on the basis of the estimated super-pixels. Our evaluation with respect to the NYUv2, iBims and KITTI datasets demonstrates the advantages of Instance Convolutions over the classical convolution at estimating depth around occlusion boundaries, while producing comparable results elsewhere. Our code is available at github.com/enisimsar/instance-conv." Uncertainty Guided Policy for Active Robotic 3D Reconstruction Using Neural Radiance Fields,"Soomin Lee, Le Chen, Jiahao Wang, Alexander Liniger, Suryansh Kumar, Fisher Yu","Oracle,ETH Zurich,ETH Zürich",Deep Learning for Visual Perception,"In this paper, we tackle the problem of active robotic 3D reconstruction of an object. 
In particular, we study how a mobile robot with an arm-held camera can select a favorable number of views to recover an object's 3D shape efficiently. Contrary to the existing solution to this problem, we leverage the popular neural radiance fields-based object representation, which has recently shown impressive results for various computer vision tasks. However, it is not straightforward to directly reason about an object's explicit 3D geometric details using such a representation, making the next-best-view selection problem for dense 3D reconstruction challenging. This paper introduces a ray-based volumetric uncertainty estimator, which computes the entropy of the weight distribution of the color samples along each ray of the object's implicit neural representation. We show that it is possible to infer the uncertainty of the underlying 3D geometry given a novel view with the proposed estimator. We then present a next-best-view selection policy guided by the ray-based volumetric uncertainty in neural radiance fields-based representations. Encouraging experimental results on synthetic and real-world data suggest that the approach presented in this paper can enable a new research direction of using an implicit 3D object representation for the next-best-view problem in robot vision applications, distinguishing our approach from the existing approaches that rely on explicit 3D geometric modeli" Detaching and Boosting: Dual Engine for Scale-Invariant Self-Supervised Monocular Depth Estimation,"Peizhe Jiang, Wei Yang, Xiaoqing Ye, Xiao Tan, Meng Wu","Northwestern Polytechnical University,Baidu,Baidu Inc.",Deep Learning for Visual Perception,"Monocular depth estimation (MDE) in the self-supervised scenario has emerged as a promising method as it refrains from the requirement of ground truth depth. Despite continuous efforts, MDE is still sensitive to scale changes especially when all the training samples are from one single camera. 
Meanwhile, it deteriorates further since camera movement results in heavy coupling between the predicted depth and the scale change. In this paper, we present a scale-invariant approach for self-supervised MDE, in which scale-sensitive features (SSFs) are detached away while scale-invariant features (SIFs) are boosted further. To be specific, a simple but effective data augmentation that imitates the camera zooming process is proposed to detach SSFs, making the model robust to scale changes. Besides, a dynamic cross-attention module is designed to boost SIFs by fusing multi-scale cross-attention features adaptively. Extensive experiments on the KITTI dataset demonstrate that the detaching and boosting strategies are mutually complementary in MDE and our approach achieves new state-of-the-art performance, improving the absolute relative error over existing works from 0.097 to 0.090. The code will be made public soon." Lidar Upsampling with Sliced Wasserstein Distance,"Artem Savkin, Yida Wang, Sebastian Wirkert, Nassir Navab, Federico Tombari","TUM,Technical University of Munich,German Cancer Research Center,TU Munich,Technische Universität München",Deep Learning for Visual Perception,"Lidar has become an important component of the perception systems in autonomous driving. But challenges of training data acquisition and annotation have emphasized the role of sensor-to-sensor domain adaptation. In this work, we address the problem of lidar upsampling. Learning on lidar point clouds is rather a challenging task due to their irregular and sparse structure. Here we propose a method for lidar point cloud upsampling which can reconstruct fine-grained lidar scan patterns. The key idea is to utilize edge-aware dense convolutions for both feature extraction and feature expansion. Additionally applying a more accurate Sliced Wasserstein Distance facilitates learning of the fine lidar sweep structures. 
This in turn enables our method to employ a one-stage upsampling paradigm without the need for coarse and fine reconstruction. We conduct several experiments to evaluate our method and demonstrate that it provides better upsampling" Accurate 3D Single Object Tracker with Local-To-Global Feature Refinement,"Baojie Fan, Kai Wang, Wuyang Zhou, Yu Shi Yang, Kaiwei Ma, Guoping Jiang","Nanjing University of Posts and Telecommunications,Nanjing University of Posts and Telecommunications",Deep Learning for Visual Perception,"3D single object tracking in point clouds is an essential task in robotics and autonomous driving. Many astonished trackers only adopt the voting-based RPN module to regress the object's location. However, they suffer from heavy outlier votes and ignore the global semantic features of targets. To resolve the problems, we propose a two-stage RPN module with local-to-global feature refinement for accurate tracking in point clouds. Specifically, the deep Hough voting is applied to obtain coarse proposals in the first stage. In the second stage, we design a local feature refinement (LFR) module and a global feature refinement (GFR) module to realize accurate localization jointly. The LFR module excludes noisy outliers in disordered point clouds and obtains refined local features for coarse proposals. After that, the GFR module explores the relationships among all proposals to weigh the proposal-wise global context features. Integrating the proposed two-stage RPN module into the previous method BAT, we develop a coarse-to-fine 3D single object tracker in point clouds abbreviated as C2FT. Extensive experiments on KITTI and nuScenes benchmarks demonstrate that C2FT achieves favorable performance with a real-time speed (~50 FPS). Furthermore, the proposed LFR and GFR modules are generalized and can be easily integrated into other trackers." 
Aggregation Functions for Simultaneous Attitude and Image Estimation with Event Cameras at High Angular Rates,"Matthew Ng, Zi Min Er, Gim Song Soh, Shaohui Foong",Singapore University of Technology and Design,Aerial Robots and Autonomous Agents,"For fast-moving event cameras, projection of events onto the image frame exhibits smearing of events analogous to high motion blur. For camera attitude estimation, this presents a causality dilemma where motion prior is required to unsmear events, but an image prior is required to estimate motion. This dilemma is typically circumvented by including an IMU to provide motion priors. However, the limited dynamic range of IMUs (±2000 °/s) is shown to be insufficient for high angular rate rotorcraft. Contrast Maximization is an event-only optimization framework that computes the optimal motion compensation parameter while generating an event image simultaneously. This paper analyses the performance of existing aggregation functions of the contrast maximization framework and proposes a non-convolution-based aggregation function that outperforms existing implementations. The use of discrete event images for optimizers is discussed, demonstrating alternate avenues of the framework to exploit. The effect of motion blur in motion-compensated images is defined and studied for Contrast Maximization at high angular rates. Lastly, the framework is applied to rotation datasets with angular rates exceeding 2000 °/s to demonstrate high angular rate motion estimation without motion priors." RAST: Risk-Aware Spatio-Temporal Safety Corridors for MAV Navigation in Dynamic Uncertain Environments,"Gang Chen, Siyuan Wu, Moji Shi, Wei Dong, Hai Zhu, Javier Alonso-Mora","Delft University of Technology,Shanghai Jiao Tong University,Chinese Academy of Military Sciences",Aerial Robots and Autonomous Agents,"Autonomous navigation of Micro Aerial Vehicles (MAVs) in dynamic and unknown environments is a complex and challenging task. 
Current works rely on assumptions to solve the problem. The MAV's pose is precisely known, the dynamic obstacles can be explicitly segmented from static ones, their number is known and fixed, or they can be modeled with given shapes. In this paper, we present a method for MAV navigation in dynamic uncertain environments without making any of these assumptions. The method employs a particle-based dynamic map to represent the local environment and predicts it to the near future. Collision risk is defined based on the predicted maps and a series of risk-aware spatio-temporal (RAST) safety corridors are constructed, which are finally used to optimize a dynamically-feasible collision-free trajectory for the MAV. We compared our method with several state-of-the-art works in 12000 simulation tests in Gazebo with the physical engine enabled. The results show that our method has the highest success rate at different uncertainty levels. Finally, we validated the proposed method in real experiments." Energy Aware Impedance Control of a Flying End-Effector in the Port-Hamiltonian Framework,"Ramy Rashad, Davide Bicego, Jelle Zult, Santiago Sanchez-escalonilla, Ran Jiao, Antonio Franchi, Stefano Stramigioli","University of Twente,Bond High Performance ,D Technology,Beihang University",Aerial Robots and Autonomous Agents,"This work addresses the interaction control problem of a fully-actuated aerial vehicle considered as a flying end-effector. We tackle the problem using geometrically-consistent variable-stiffness impedance control for safe wrench regulation using the concept of energy tanks, where both the modeling and the control are carried out in the port Hamiltonian framework. We exploit previous well-known results in the literature of ground manipulators and extend them to be applied for novel and challenging aerial physical interaction with a focus on quasi-static applications. 
The energy-awareness of the presented control method guarantees the stability of the aerial robot in both free-flight and in-contact scenarios together with a level of safety in the case of contact-loss with the unknown environment. Furthermore, by utilizing bond graphs we demonstrate how the closed-loop passivity can be graphically conducted. The validity of our proposed approach is shown via several experiments. We also provide several insights on how the proposed framework could be extended to a generic dynamic aerial physical interaction." Momentum-Based Extended Kalman Filter for Thrust Estimation on Flying Multibody Robots,"Hosameldin Awadalla Omer Mohamed, Gabriele Nava, Giuseppe L'erario, Silvio Traversaro, Fabio Bergonti, Luca Fiorio, Punith Reddy Vanteddu, Francesco Braghin, Daniele Pucci","Italian Institute of Technology,Istituto Italiano di Tecnologia,Politecnico di Milano",Aerial Robots and Autonomous Agents,"Effective control design of flying vehicles requires a reliable estimation of the propellers’ thrust forces to secure a successful flight. Direct measurements of thrust forces, however, are seldom available in practice and on-line thrust estimation usually follows from the application of fusion algorithms that process on-board sensor data. This letter proposes a framework for the estimation of the thrust intensities on flying multibody systems that are not equipped with sensors for direct thrust measurement. The key ingredient of the proposed framework is the so-called centroidal momentum of a multibody system, which, combined with the propeller model, enables the design of Extended Kalman Filters (EKF) for on-line thrust estimation. The presented approach tackles the additional complexity in thrust estimation due to the possibly large number of degrees of freedom of the system and uncertainties in the propeller model. 
For instance, a covariance scheduling approach based on the turbines' RPM error is proposed to ensure a reliable estimation even in case of turbine failures. Simulations are presented to validate the proposed algorithm during robot flight. Moreover, an experimental setup is designed to evaluate the accuracy of the estimation algorithm using iRonCub, a jet-powered humanoid robot, while standing on the ground." Overcoming Bias: Equivariant Filter Design for Biased Attitude Estimation with Online Calibration,"Alessandro Fornasier, Yonhon Ng, Christian Brommer, Christoph Böhm, Robert Mahony, Stephan Weiss","University of Klagenfurt,Australian National University,University Klagenfurt,Universität Klagenfurt",Aerial Robots and Autonomous Agents,"Stochastic filters for on-line state estimation are a core technology for autonomous systems. The performance of such filters is one of the key limiting factors to a system's capability. Both asymptotic behavior (e.g., for regular operation) and transient response (e.g., for fast initialization and reset) of such filters are of crucial importance in guaranteeing robust operation of autonomous systems. This paper introduces a new generic formulation for a gyroscope aided attitude estimator using N direction measurements including both body-frame and reference-frame direction type measurements. The approach is based on an integrated state formulation that incorporates navigation, extrinsic calibration for all direction sensors, and gyroscope bias states in a single equivariant geometric structure. This newly proposed symmetry allows modular addition of different direction measurements and their extrinsic calibration while maintaining the ability to include bias states in the same symmetry. The subsequently proposed filter-based estimator using this symmetry noticeably improves the transient response, and the asymptotic bias and extrinsic calibration estimation compared to state-of-the-art approaches. 
The estimator is verified in statistically representative simulations and is tested in real-world experiments." DIDER: Discovering Interpretable Dynamically Evolving Relations,"Enna Sachdeva, Chiho Choi",Honda Research Institute,Aerial Robots and Autonomous Agents,"Effective understanding of dynamically evolving multiagent interactions is crucial to capturing the underlying behavior of agents in social systems. It is usually challenging to observe these interactions directly, and therefore modeling the latent interactions is essential for realizing the complex behaviors. Recent work on Dynamic Neural Relational Inference (DNRI) captures explicit inter-agent interactions at every step. However, prediction at every step results in noisy interactions and lacks intrinsic interpretability without post-hoc inspection. Moreover, it requires access to ground truth annotations to analyze the predicted interactions, which are hard to obtain. This paper introduces DIDER, Discovering Interpretable Dynamically Evolving Relations, a generic end-to-end interaction modeling framework with intrinsic interpretability. DIDER discovers an interpretable sequence of inter-agent interactions by disentangling the task of latent interaction prediction into sub-interaction prediction and duration estimation. By imposing the consistency of a sub-interaction type over an extended time duration, the proposed framework achieves intrinsic interpretability without requiring any post-hoc inspection. We evaluate DIDER on both synthetic and real-world datasets. The experimental results demonstrate that modeling disentangled and interpretable dynamic relations improves performance on trajectory forecasting tasks." A Global Max-Flow-Based Multi-Resolution Next-Best-View Method for Reconstruction of 3D Unknown Objects,"Sicong Pan, Hui Wei",Fudan University,Aerial Robots and Autonomous Agents,"Many robot tasks, such as grasping and inspection, may require complete 3D models with enough surface details. 
Fully autonomous environment exploration and 3D reconstruction of unknown objects, therefore, are challenging for a robot, when little or no knowledge about an object is known as a priori. Previous work updated sensor measurements from a greedy ordered set of views into a map with an unchangeable voxel size, leading to lack of details on the object surface, sampling problem, and no adaption to small objects. We propose a global max-flow-based multi-resolution next-best-view (NBV) method to improve performance on these problems. In particular, it utilizes a max-flow-based global view quality function to obtain optimal NBVs, and a multi-resolution strategy to optimize the reconstruction quality and efficiency. Results of comparative simulation experiments with state-of-the-art (SOTA) methods show that our method achieves a higher voxel coverage under the same resolution. Ablation studies confirm that the global view function and multi-resolution strategy are effective. Results of real-world experiments show that our method achieves complete reconstruction of small objects with only 6–8 views." A Stack-Of-Tasks Approach Combined with Behavior Trees: A New Framework for Robot Control,"David Caceres Dominguez, Marco Iannotta, Johannes A. Stork, Erik Schaffernicht, Todor Stoyanov","Örebro University,Orebro University,Örebro University, AASS Research Center",Aerial Robots and Autonomous Agents,"Stack-of-Tasks (SoT) control allows a robot to simultaneously fulfill a number of prioritized goals formulated in terms of (in)equality constraints in error space. Since this approach solves a sequence of Quadratic Programs (QP) at each time-step, without taking into account any temporal state evolution, it is suitable for dealing with local disturbances. However, its limitation lies in the handling of situations that require non-quadratic objectives to achieve a specific goal, as well as situations where countering the control disturbance would require a locally suboptimal action. 
Recent works address this shortcoming by exploiting Finite State Machines (FSMs) to compose the tasks in such a way that the robot does not get stuck in local minima. Nevertheless, the intrinsic trade-off between reactivity and modularity that characterizes FSMs makes them impractical for defining reactive behaviors in dynamic environments. In this letter, we combine the SoT control strategy with Behavior Trees (BTs), a task switching structure that addresses some of the limitations of the FSMs in terms of reactivity, modularity and re-usability. Experimental results on a Franka Emika Panda 7-DOF manipulator show the robustness of our framework, that allows the robot to benefit from the reactivity of both SoT and BTs." Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot,"Tao Huang, Kai Chen, Bin Li, Yunhui Liu, Qi Dou","The Chinese University of Hong Kong,Chinese University of Hong Kong",Medical Robotics I,"Task automation of surgical robot has the potentials to improve surgical efficiency. Recent reinforcement learning (RL) based approaches provide scalable solutions to surgical automation, but typically require extensive data collection to solve a task if no prior knowledge is given. This issue is known as the exploration challenge, which can be alleviated by providing expert demonstrations to an RL agent. Yet, how to make effective use of demonstration data to improve exploration efficiency still remains an open challenge. In this work, we introduce Demonstration-guided EXploration (DEX), an efficient reinforcement learning algorithm that aims to overcome the exploration problem with expert demonstrations for surgical automation. To effectively exploit demonstrations, our method estimates expert-like behaviors with higher values to facilitate productive interactions, and adopts non-parametric regression to enable such guidance at states unobserved in demonstration data. 
Extensive experiments on 10 surgical manipulation tasks from SurRoL, a comprehensive surgical simulation platform, demonstrate significant improvements in the exploration efficiency and task success rates of our method. Moreover, we also deploy the learned policies to the da Vinci Research Kit (dVRK) platform to show the effectiveness on the real robot. Code is available at https://github.com/med-air/DEX." Dual-Robot Collaborative System for Autonomous Venous Access Based on Ultrasound and Bioimpedance Sensing Technology,"Maria Koskinopoulou, Alperen Acemoglu, Veronica Penza, Leonardo Mattos","Heriot Watt University,Istituto Italiano di Tecnologia",Medical Robotics I,"Accurate needle insertion is an important task in many medical procedures. This paper studies the case of an autonomous needle insertion system for central venous access, which is a risky and challenging procedure involving the simultaneous manipulation of an ultrasound probe and of a catheterization needle. The goal of this medical operation is to provide access to a deep central vein, which is a key step in cardiovascular treatments or for the administration of drugs and treatments for cancer or infections. Accordingly, in this work we propose an autonomous dual-arm system for central venous access. The system is composed of two Franka robotic arms that are precisely co-registered and collaborate to achieve accurate needle insertion by combining ultrasound and bioimpedance sensing to ensure robust deep vessels visualization and venipuncture detection. The proposed system performance is evaluated on a phantom trainer through experiments simulating the jugular vein access for cardiac catheterization purposes. Quantitative results show the system is able to autonomously scan the area of interest, localize the vein and perform autonomous needle insertion with high accuracy and placement error below 2 mm, proving the potential of the technology for real clinical use." 
Vitreoretinal Surgical Robotic System with Autonomous Orbital Manipulation Using Vector-Field Inequalities,"Yuki Koyama, Murilo Marinho, Kanako Harada",The University of Tokyo,Medical Robotics I,"Vitreoretinal surgery pertains to the treatment of delicate tissues on the fundus of the eye using thin instruments. Surgeons frequently rotate the eye during surgery, which is called orbital manipulation, to observe regions around the fundus without moving the microscope or patient. In this paper, we propose the autonomous orbital manipulation of the eye in robot-assisted vitreoretinal surgery with our tele-operated surgical system. In a simulation study, we preliminarily investigated the increase in the manipulability of our system using orbital manipulation. Furthermore, we demonstrated the feasibility of our method in experiments with a physical robot and a realistic eye model, showing an increase in the view-able area of the fundus when compared to a conventional technique. Source code and minimal example available at https://github.com/mmmarinho/icra2023_orbitalmanipulation." Autonomous Needle Navigation in Retinal Microsurgery: Evaluation in Ex Vivo Porcine Eyes,"Peiyao Zhang, Ji Woong Kim, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov","Johns Hopkins University,Johns Hopkins Medical Institute",Medical Robotics I,"Important challenges in retinal microsurgery include prolonged operating time, inadequate force feedback, and poor depth perception due to a constrained top-down view of the surgery. The introduction of robot-assisted technology could potentially deal with such challenges and improve the surgeon's performance. Motivated by such challenges, this work develops a strategy for autonomous needle navigation in retinal microsurgery aiming to achieve precise manipulation, reduced end-to-end surgery time, and enhanced safety. 
This is accomplished through real-time geometry estimation and chance-constrained Model Predictive Control (MPC) resulting in high positional accuracy while keeping scleral forces within a safe level. The robotic system is validated using both open-sky and intact (with lens and partial vitreous removal) ex vivo porcine eyes. The experimental results demonstrate that the generation of safe control trajectories is robust to small motions associated with head drift. The mean navigation time and scleral force for MPC navigation experiments are 7.208 s and 11.97 mN, which can be considered efficient and well within acceptable safe limits. The resulting mean errors along lateral directions of the retina are below 0.06 mm, which is below the typical hand tremor amplitude in retinal microsurgery." Dynamic Modeling and Identification of a Robotic Intracardiac Echo Catheter,"Mohammad Salehizadeh, Filipe Pedrosa, Harmanpreet Bassan, Rajnikant V. Patel, Jagadeesan Jayender","Harvard Medical School, Brigham and Women's Hospital,Western University,The University of Western Ontario",Medical Robotics I,"Catheter-based cardiac ablation is the preferred method of treating atrial fibrillation. Conventionally, the catheter is navigated in the heart using X-ray fluoroscopy imaging and an electroanatomical map. Although successful, these imaging modalities do not provide real-time feedback on the quality of lesions created, which in turn could lead to recurrence of arrhythmia. Intracardiac echo (ICE) catheter provides real-time imaging within the heart to visualize both the ablation catheter and lesions created. However, manipulating the ablation and ICE catheters simultaneously is tedious and time consuming. As a first step towards developing a robotic ICE catheter that can autonomously follow the ablation catheter and monitor the lesions, we have developed a dynamic model for the ICE catheter. 
The model is based on the Cosserat theory for flexible rods that relies on strain parametrization. The model also accounts for frictional forces between the catheter sheath and tendons, external loads and fluid forces acting on the catheter. A good nominal model for describing the catheter dynamics is essential to develop a robust control scheme for the robotic ICE catheter. The parameters of the ICE catheter are estimated using weight release, tendon-driven actuation and fluid flow experiments. To the best of our knowledge, this is the first dynamic model for the ICE catheter that accurately reflects the dynamics of the catheter under pulsatile fluid flow within a heart phantom." Modeling of a Robotic Transcatheter Delivery System,"Namrata Unnikrishnan Nayar, Ronghuai Qi, Jaydev Desai","Georgia Institute of Technology, RoboMed Lab,Georgia Institute of Technology",Medical Robotics I,"Intracardiac transcatheter systems guided by advanced imaging modalities are gaining popularity in treating mitral regurgitation in non-surgical candidates. Robotically steerable transcatheter systems must use model-based control strategies to ensure safer and more effective transcatheter procedures with less trauma while using smaller control gains. In this paper, a 4-DoF robotically steerable tendon-driven robot was fabricated, and the relationship between the tendon displacement and the joint angle was derived. This relation was derived in two parts to make this approach applicable to any other catheter system. A model was derived to determine the tendon tensions needed to achieve desired joint angles. Then, the tendon characteristics were studied, and a tendon elongation (TE) model was derived as a function of tendon length. Executing the modeling process in two steps makes it easy to introduce additional parameters like length, friction, and pose, to characterize complex systems like catheters. 
The TE model was used to actuate the joints of the robot and RMSE was computed to characterize its performance. Also, PID control was used along with the TE model to improve the system's performance, and the contribution of the model and the controller in the system was recorded." A Handheld Hydraulic Cardiac Catheter with Omnidirectional Manipulator and Touch Sensing,"Nguyen Chi Cong, James J. Davies, Mai Thanh Thai, Trung Thien Hoang, Phuoc Thien Phan, Kefan Zhu, Dang Bao Nhi Tran, Van Ho, Hung La, Hoang Phuong Phan, Nigel Lovell, Thanh Nho Do","University of New South Wales,UNSW Sydney,RMIT,Japan Advanced Institute of Science and Technology,University of Nevada at Reno,The University of Tokyo",Medical Robotics I,"Atrial fibrillation (AF) is mostly treated via robotic catheter-based cardiac ablation procedures. Over the last few decades, cables or tendon mechanisms are at the core of available cardiac catheters. Despite advances, the use of cables often results in considerable force loss, nonlinear hysteresis, and control challenges. Most catheters are not equipped with force sensing, which increases the risk of the ablation process and decreases their efficacy in clinical settings. In addition, current catheters have a poor user interface and therefore the ablation process requires skilled or trained surgeons to steer the complex motion of the catheter tip within the heart chambers. To improve the cardiac ablation procedure, a new robotic catheter that has the ability to extend its working space without moving its flexible body and a real-time force sensor for safe operation is highly desired. In this work, a new handheld and soft robotic catheter for AF ablation is introduced. The new device consists of several improved components such as a soft manipulator for navigation and bending motion, an ergonomic handheld controller, and a soft force sensor for monitoring tool-tissue contact. 
The design, modeling, and fabrication of the device are presented and followed by experimental characterizations and ex-vivo validation." Optimized Design and Analysis of Active Propeller-Driven Capsule Endoscopic Robot for Gastric Examination,"Yi Zhang, Weihao Wang, Wende Ke, Chengzhi Hu",Southern University of Science and Technology,Medical Robotics I,"Capsule endoscopic robot holds great promise for the early diagnosis of gastrointestinal diseases without causing discomfort to patients. However, currently available active capsule endoscopic robots suffer from issues such as complex structure, poor mobility, large size, and high cost, which have hindered their widespread adoption and resulted in a lower screening rate for gastrointestinal diseases. To address these challenges, this paper proposes a highly integrated propeller-driven capsule endoscopic robot (PCER) system that integrates STM32 processor, magnetic sensor, IMU, RF communication unit, and motor drive. The micro propeller of the PCER has been analyzed through finite element simulation to ensure its efficiency. FLUENT software has been utilized to simulate the fluid force acting on the PCER as it moves through a liquid medium. The results of the simulation are then used to determine the optimal pitch angle for the robot's movement. The thrust generated by the capsule robot propellers has been measured using a lever mechanism to investigate the relationship between the thrust and voltage applied to the motors. The experiments confirmed that the PCER is capable of performing flexible motions within fluid environments, such as changing pitch angle during movement, passing circular obstacles, horizontal motion, and spiral ascent. These findings demonstrate the feasibility of the proposed PCER as an effective tool for non-invasive early screening of gastrointestinal diseases." 
QuadMag: A Mobile-Coil System with Enhanced Magnetic Actuation Efficiency and Dexterity,"Lidong Yang, Moqiu Zhang, Zhengxin Yang, Haojin Yang, Li Zhang","The Hong Kong Polytechnic University,The Chinese University of Hong Kong,The Chinese Univeristy of HongKong",Medical Robotics I,"Magnetic field is a favorable power source for actuation and control of micro-/nanorobots. To overcome the fast decay of magnetic field for large-workspace microrobotic actuation, mobile field source-based systems have been proposed. In this work, we report a new mobile-coil system, i.e., QuadMag. It consists of four electromagnetic coils, whose motion is actuated by a parallel mechanism. Compared to previous systems with three mobile coils, e.g., DeltaMag, the additional coil in the QuadMag increases the degree-of-freedom (DoF) for magnetic control. However, to control QuadMag, new control methods should be developed for the over-constrained parallel mechanism and for the field/force of the four coils. We derive the Jacobian matrix for the differential motion of the parallel mechanism and then formulate the field, force, and simultaneous field and force control methods for magnetic actuation. Comparative experiments validate the enhanced actuation efficiency when controlling torque-driven helical microrobots. Moreover, the magnetic actuation dexterity is also enhanced by the additional coil. We conduct simulated navigation experiments and prove the actuation capability of QuadMag for 3D force-driven microrobot navigation with controlled robot orientation." Evaluating the Feasibility of Magnetic Tools for the Minimum Dynamic Requirements of Microneurosurgery,"Cameron Forbrigger, Erik Fredin, Eric Diller",University of Toronto,Medical Robotics I,"Neurosurgery could benefit from robot-assisted minimally invasive approaches, but existing robot tools are insufficiently small and compact. 
Magnetic actuation is an attractive approach to medical robotics because it allows small, modular serial mechanisms to be remotely actuated. Despite these advantages, magnetic actuation is relatively weak compared to alternative actuation methods. In this paper, we introduce a novel analytical model for magnetic serial robots, use this model to design two prototypes, and then demonstrate that a 4-mm-diameter prototype without any internal mechanical transmission can produce forces up to 0.181 N: high enough to perform delicate microsurgical tasks. We also demonstrate that the robot can achieve a closed-loop step response rise time of 0.71 seconds with an overshoot of 7.8%: sufficiently fast for surgical motions while maintaining a tip precision of less than 2 mm during a worst-case dynamic motion. These experiments provide strong evidence for the feasibility of directly-driven magnetic tools for neurosurgical applications, and they motivate future investigations in this area." A Novel Concentric Tube Steerable Drilling Robot for Minimally Invasive Treatment of Spinal Tumors Using Cavity and U-Shape Drilling Techniques,"Susheela Sharma, Ji Hwan Park, Jordan P. Amadio, Mohsen Khadem, Farshid Alambeigi","University of Texas at Austin,The University of Texas at Austin,University of Texas Dell Medical School,University of Edinburgh",Medical Robotics I,"In this paper, we present the design, fabrication, and evaluation of a novel flexible, yet structurally strong, Concentric Tube Steerable Drilling Robot (CT-SDR) to improve minimally invasive treatment of spinal tumors. Inspired by concentric tube robots, the proposed two degree-of-freedom (DoF) CT-SDR, for the first time, not only allows a surgeon to intuitively and quickly drill smooth planar and out-of-plane J- and U- shape curved trajectories, but it also, enables drilling cavities through a hard tissue in a minimally invasive fashion. 
We successfully evaluated the performance and efficacy of the proposed CT-SDR in drilling various planar and out-of-plane J-shape branch, U-shape, and cavity drilling scenarios on simulated bone materials." Magnetic Ball Chain Robots for Endoluminal Interventions,"Giovanni Pittiglio, Margherita Mencattelli, Pierre Dupont","Harvard University,Boston Children's Hospital, Harvard Medical School,Children's Hospital Boston, Harvard Medical School",Medical Robotics I,"This paper introduces a novel class of hyperredundant robots comprised of chains of permanently magnetized spheres enclosed in a cylindrical polymer skin. With their shape controlled using an externally-applied magnetic field, the spherical joints of these robots enable them to bend to very small radii of curvature. These robots can be used as steerable tips for endoluminal instruments. A kinematic model is derived based on minimizing magnetic and elastic potential energy. Simulation is used to demonstrate the enhanced steerability of these robots in comparison to magnetic soft continuum robots designed using either distributed or lumped magnetic material. Experiments are included to validate the model and to demonstrate the steering capability of ball chain robots in bifurcating channels." Robotic Navigation Autonomy for Subretinal Injection Via Intelligent Real-Time Virtual iOCT Volume Slicing,"Shervin Dehghani, Michael Sommersperger, Peiyao Zhang, Alejandro Martin-gomez, Benjamin Busam, Peter Gehlbach, Nassir Navab, M. Ali Nasseri, Iulian Iordachita","TUM,Technical University of Munich,Johns Hopkins University,Johns Hopkins Medical Institute,TU Munich,Technische Universitaet Muenchen",Medical Imaging and Perception II,"In the last decade, various robotic platforms have been introduced that could support delicate retinal surgeries. 
Concurrently, to provide semantic understanding of the surgical area, recent advances have enabled microscope-integrated intraoperative Optical Coherent Tomography (iOCT) with high-resolution 3D imaging at near video rate. The combination of robotics and semantic understanding enables task autonomy in robotic retinal surgery, such as for subretinal injection. This procedure requires precise needle insertion for best treatment outcomes. However, merging robotic systems with iOCT introduces new challenges. These include, but are not limited to high demands on data processing rates and dynamic registration of these systems during the procedure. In this work, we propose a framework for autonomous robotic navigation for subretinal injection, based on intelligent real-time processing of iOCT volumes. Our method consists of an instrument pose estimation method, an online registration between the robotic and the iOCT system, and trajectory planning tailored for navigation to an injection target. We also introduce intelligent virtual B-scans, a volume slicing approach for rapid instrument pose estimation, which is enabled by Convolutional Neural Networks (CNNs). Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method. Finally, we discuss identified challenges in this work and suggest potential solutions to further the development of such systems." 3D Reconstruction of Tibia and Fibula Using One General Model and Two X-Ray Images,"Kai Pan, Shuai Zhang, Liang Zhao, Shoudong Huang, Yanhao Zhang, Hua Wang, Qi Luo","University of Technology Sydney,University of Technology, Sydney,Australian National University,Osteoarthropathy surgery department, Shenzhen People's Hospital",Medical Imaging and Perception II,"The 3D reconstruction of patient specific bone models plays a crucial role in orthopaedic surgery for clinical evaluation, surgical planning and precise implant design or selection. 
This paper considers the problem of reconstructing a patient-specific 3D tibia and fibula model from only two 2D X-ray images and one 3D general model segmented from the lower leg CT scans of one randomly selected patient. Currently, the bone 3D reconstruction mainly relies on computed tomography (CT) and magnetic resonance imaging (MRI) scanning-based mode segmentation, which results in high radiation exposure or high cost. In contrast, the proposed algorithm can accurately and efficiently deform a 3D general model to achieve a patient-specific 3D model that matches the patient’s tibia and fibula projections in two 2D X-rays. The algorithm undergoes a preliminary deformation, 2D contour registration, and optimisation based on the deformation graph that represents the shape deformation of models. Evaluations using simulations, cadaver and in-vivo experiments demonstrate that the proposed algorithm can effectively reconstruct the patient’s 3D tibia and fibula surface model with high accuracy." "Semantic-SuPer: A Semantic-Aware Surgical Perception Framework for Endoscopic Tissue Classification, Reconstruction, and Tracking","Shan Lin, Albert Miao, Jingpei Lu, Shunkai Yu, Zih-Yun Chiu, Florian Richter, Michael Yip","University of California, San Diego,University of California San Diego,UC San Diego",Medical Imaging and Perception II,"Accurate and robust tracking and reconstruction of the surgical scene is a critical enabling technology toward autonomous robotic surgery. Existing algorithms for 3D perception in surgery mainly rely on geometric information, while we propose to also leverage semantic information inferred from the endoscopic video using image segmentation algorithms. In this paper, we present a novel, comprehensive surgical perception framework, Semantic-SuPer, that integrates geometric and semantic information to facilitate data association, 3D reconstruction, and tracking of endoscopic scenes, benefiting downstream tasks like surgical navigation. 
The proposed framework is demonstrated on challenging endoscopic data with deforming tissue, showing its advantages over our baseline and several other state-of-the-art approaches. Our code and dataset are available at https://github.com/ucsdarclab/Python-SuPer." Suture Thread Spline Reconstruction from Endoscopic Images for Robotic Surgery with Reliability-Driven Keypoint Detection,"Neelay Joglekar, Fei Liu, Ryan Orosco, Michael Yip","University of California, San Diego,UCSD",Medical Imaging and Perception II,"Automating the process of manipulating and delivering sutures during robotic surgery is a prominent problem at the frontier of surgical robotics, as automating this task can significantly reduce surgeons' fatigue during tele-operated surgery and allow them to spend more time addressing higher-level clinical decision making. Accomplishing autonomous suturing and suture manipulation in the real world requires accurate suture thread localization and reconstruction, the process of creating a 3D shape representation of suture thread from 2D stereo camera surgical image pairs. This is a very challenging problem due to how limited pixel information is available for the threads, as well as their sensitivity to lighting and specular reflection. We present a suture thread reconstruction work that uses reliable keypoints and a Minimum Variation Spline (MVS) smoothing optimization to construct a 3D centerline from a segmented surgical image pair. This method is comparable to previous suture thread reconstruction works, with the possible benefit of increased accuracy of grasping point estimation. Our code and datasets will be available at: https://github.com/ucsdarclab/thread-reconstruction." 
CDFI: Cross Domain Feature Interaction for Robust Bronchi Lumen Detection,"Jiasheng Xu, Tianyi Zhang, Yangqian Wu, Jie Yang, Guang-Zhong Yang, Yun Gu","Shanghai Jiao Tong University,Shanghai Jiaotong University,SJTU",Medical Imaging and Perception II,"Endobronchial intervention is increasingly used as a minimally invasive means for the treatment of pulmonary diseases. In order to reduce the difficulty of manipulation in complex airway networks, robust lumen detection is essential for intraoperative guidance. However, these methods are sensitive to visual artifacts which are inevitable during the surgery. In this work, a cross domain feature interaction (CDFI) network is proposed to extract the structural features of lumens, as well as to provide artifact cues to characterize the visual features. To effectively extract the structural and artifact features, the Quadruple Feature Constraints (QFC) module is designed to constrain the intrinsic connections of samples with various imaging-quality. Furthermore, we design a Guided Feature Fusion (GFF) module to supervise the model for adaptive feature fusion based on different types of artifacts. Results show that the features extracted by the proposed method can preserve the structural information of lumen in the presence of large visual variations, bringing much-improved lumen detection accuracy." Real-Time Constrained 6D Object-Pose Tracking of an In-Hand Suture Needle for Minimally Invasive Robotic Surgery,"Zih-Yun Chiu, Florian Richter, Michael Yip","University of California, San Diego",Award Finalists 1,"Autonomous suturing has been a long-sought-after goal for surgical robotics. Outside of staged environments, accurate localization of suture needles is a critical foundation for automating various suture needle manipulation tasks in the real world. When localizing a needle held by a gripper, previous work usually tracks them separately without considering their relationship. 
Because of the significant errors that can arise in the stereo-triangulation of objects and instruments, their reconstructions may often not be consistent. This can lead to unrealistic tool-needle grasp reconstructions that are infeasible. Instead, an obvious strategy to improve localization would be to leverage constraints that arise from contact, thereby constraining reconstructions of objects and instruments into a jointly feasible space. In this work, we consider feasible grasping constraints when tracking the 6D pose of an in-hand suture needle. We propose a reparameterization trick to define a new state space for describing a needle pose, where grasp constraints can be easily defined and satisfied. Our proposed state space and feasible grasping constraints are then incorporated into Bayesian filters for real-time needle localization. In the experiments, we show that our constrained methods outperform previous unconstrained/constrained tracking approaches and demonstrate the importance of incorporating feasible grasping constraints into automating suture needle manipulation tasks." Exploring Robot-Assisted Optical Coherence Elastography for Surgical Palpation,"Yeon Hee Chang, Elan Ahronovich, Nabil Simaan, Cheol Song","DGIST,Vanderbilt ARMA,Vanderbilt University",Award Finalists 1,"Optical Coherence Elastography (OCE) is a method that discerns local tissue stiffness using optical information. This method has recently been explored for laryngeal cancer tumor margin detection but has not been widely deployed clinically. Part of the challenge hindering such clinical deployment is the need for controlled high-precision mechanical probing of the tissue. This paper explores the concept of robot-assisted optical coherence elastography (OCE) and presents a preliminary system integration that is used to demonstrate the approach for stiffness mapping and discerning tumor margins. 
The approach is demonstrated on a custom Cartesian stage robot and a custom-built OCE system comprising an 830 nm broadband laser with a vector-analysis method for phase gradient estimation and strain imaging. The paper illustrates one of the advantages of robot-controlled probing in terms of increasing the accuracy of the OCE system in a large range of displacement and strain. By leveraging motion information from the robot, online re-calibration of the OCE strain map may be achieved, thereby reducing OCE errors. After calibration, it is shown that the error in estimating the local Young’s modulus is 0.485 % in silicon phantom and 0.531 % in agar phantom. These results suggest that future integration of optical coherence tomography (OCT) in clinically deployable robots may offer advantages in enabling local stiffness map estimation using OCE." Locate before Segment: Topology-Guided Retinal Layer Segmentation in Optical Coherence Tomography Images,"Ye Lu, Yutian Shen, Xiaohan Xing, Max Qing Hu Meng",The Chinese University of Hong Kong,Medical Imaging and Perception II,"Optical Coherence Tomography (OCT) is a non-invasive imaging technique that is instrumental in retinal disease diagnosis and treatment. Segmentation of retinal layers in OCT is an essential step, but remains challenging because common pixel-wise segmentation methods usually fail to obtain the correct layer topology. To tackle this challenge, we propose a novel Locate-to-Segment (L2S) framework to provide a layer region location guidance for pixel-wise labeling learning so as to obtain better segmentation with the correct topology and smooth boundaries. Specifically, a Structured Boundary Regression Network (SBRNet) is devised to first predict the surface positions. For effective learning on normal-size images, we design two regression branches to regress the top surface and eight layer widths separately in SBRNet to locate each layer region with absolutely correct orderings. 
Then, we take the prediction of SBRNet as an additional input for a common pixel-wise segmentation network to provide the guidance of correct topology. In this L2S manner, our framework combines the merits of regression-based methods and pixel-wise labeling-based methods to obtain accurate segmentation with the correct topology and smooth continuous boundaries. Experimental results on a public retinal OCT dataset demonstrate the effectiveness of our method, outperforming state-of-the-art segmentation methods with the highest average Dice score of 90.29% and the lowest average MAD score of 0.782." Visual Tracking of Needle Tip in 2D Ultrasound Based on Global Features in a Siamese Architecture,"Wanquan Yan, Qingpeng Ding, Jianghua Chen, Kim Yan, Raymond Shing-yan Tang, Shing Shin Cheng","The Chinese University of HongKong,The Chinese University of Hong Kong,The Chinese University of Hong Kong, Department of Medicine and",Medical Imaging and Perception III,"Ultrasound (US) is widely used in image-guided needle procedures. Correctly tracking the needle tip position in US images during the procedure plays an important role in improving the needle targeting accuracy and patient safety. This paper presents a learning-based visual tracking network with a Siamese architecture, which makes full use of the attention mechanism to explore the potential of global features and takes advantage of an online target model prediction module to robustly track the needle tip in US images. Several self- and cross-attention modules are applied to learn global features from the whole US image. A discriminative target model is also learned as a complementary part to improve the discriminability of the proposed tracker. The template used during the tracking is updated frequently according to the tracking results to ensure that the tracker can always capture the latest characteristics of the appearance of the needle tip. 
Experimental results in both phantom and tissue showed that the proposed tracking network was more robust than other state-of-the-art visual trackers. The mean success rates of the proposed tracker are 7.1% and 9.2% higher than the second best performing visual tracker when the needle was inserted by motors and human hands in the tissue experiments." Model-Based Pose Estimation of Steerable Catheters under Bi-Plane Image Feedback,"Jared Lawson, Rohan Chitale, Nabil Simaan","Vanderbilt University,Vanderbilt University Medical Center",Medical Imaging and Perception III,"Small catheters undergo significant torsional deflections during endovascular interventions. A key challenge in enabling robot control of these catheters is the estimation of their bending planes. This paper considers approaches for estimating these bending planes based on bi-plane image feedback. The proposed approaches attempt to minimize error between either the direct (position-based) or instantaneous (velocity-based) kinematics with the reconstructed kinematics from bi-plane image feedback. A comparison between these methods is carried out on a setup using two cameras in lieu of a bi-plane fluoroscopy setup. The results show that the position-based approach is less susceptible to segmentation noise and works best when the segment is in a non-straight configuration. These results suggest that estimation of the bending planes can be accompanied with errors under 30°. Considering that the torsional buildup of these catheters can be more than 180°, we believe that this method can be used for catheter control with improved safety due to the reduction of this uncertainty." 
Pose Quality Prediction for Vision Guided Robotic Shoulder Arthroplasty,"Morgan Windsor, Jing Peng, Ashish Gupta, Peter Pivonka, Michael J Milford","Queensland University of Technology,Queensland University of Technology (QUT)",Medical Imaging and Perception III,"Surgical assistive robots offer the potential for drastically improved patient outcomes through more accurate, more repeatable surgical procedures like shoulder arthroplasty operations. Existing robotic systems typically rely on optical marker tracking and require invasive marker attachment for localization, complicating the surgical workflow and patient recovery. But moving towards a markerless system is very challenging, both because of the absolute difficulty and the large variation in localization conditions across thousands of surgical procedures. In this paper we propose an alternative approach: rather than try to create a ``perfect"" and fully generalizable markerless localization system, instead create a reliable and trustworthy localization system that is able to continually self-assess the likely quality of its localization estimates, and act accordingly. We propose a lightweight method for predicting vision-based pose estimation performance using internal pipeline artifacts (without needing external ground truth from a marker-based system). Using extensive real robot experiments with challenging actual imagery from surgery, we demonstrate our prediction system accurately self-characterizes the localization system's performance across a wide range of localization conditions, and demonstrate that this prediction system generalizes to a range of surgical conditions. We then show how online performance prediction can drive active robot navigation that minimizes localization error, reducing target pose estimation error by 96.1% for rotation and 96.7% for translation compared to rejected alternative trajectories." 
Image Segmentation for Continuum Robots from a Kinematic Prior,"Connor Watson, Anna Nguyen, Tania Morimoto","Morimoto Lab, UCSD,University of California San Diego",Medical Imaging and Perception III,"In this work, we address the problem of robust segmentation of a continuum robot from images without the need for training data or markers. We present a method that leverages information about the kinematics of these robots to produce an estimate of the robot shape, which is refined through optimization over global image statistics. Our approach can be straightforwardly applied to any continuum robot design and is able to handle partial occlusions of the robot body, as well as challenging background conditions. We validate our method experimentally for a concentric tube robot in a simulated surgical environment and show that our method significantly outperforms a naive projection of the robot shape and color thresholding, which is commonly used in current vision-based estimation algorithms for these robots. Overall, this work has the potential to improve the viability of vision-based state estimation for continuum robots in real-world settings." Robust Collaborative 3D Object Detection in Presence of Pose Errors,"Yifan Lu, Quanhao Li, Baoan Liu, Mehrdad Dianati, Chen Feng, Siheng Chen, Yanfeng Wang","Shanghai Jiao Tong University,Nanjing University,Meta,University of Warwick,New York University",Object Detection I,"Collaborative 3D object detection exploits information exchange among multiple agents to enhance accuracy of object detection in presence of sensor impairments such as occlusion. However, in practice, pose estimation errors due to imperfect localization would cause spatial message misalignment and significantly reduce the performance of collaboration. To alleviate adverse impacts of pose errors, we propose CoAlign, a novel hybrid collaboration framework that is robust to unknown pose errors. 
The proposed solution relies on a novel agent-object pose graph modeling to enhance pose consistency among collaborating agents. Furthermore, we adopt a multi-scale data fusion strategy to aggregate intermediate features at multiple spatial resolutions. Compared with previous works, which require ground-truth pose for training supervision, our proposed CoAlign is more practical since it doesn’t require any ground-truth pose supervision in the training and makes no specific assumptions on pose errors. Extensive evaluation of the proposed method is carried out on multiple datasets, certifying that CoAlign significantly reduces relative localization error and achieves state-of-the-art detection performance when pose errors exist. Code is made available for the use of the research community at https://github.com/yifanlu0227/CoAlign." Joint Semi-Supervised and Active Learning Via 3D Consistency for 3D Object Detection,"Sihwan Hwang, Sanmin Kim, YoungSeok Kim, Dongsuk Kum","Korea Advanced Institute of Science and Technology,KAIST",Object Detection I,"Autonomous driving powered by deep learning requires large-scale and high-quality training data from diverse driving environments to operate effectively worldwide. However, collecting and annotating such data is costly and time-consuming. To address this challenge, active learning methods have been explored to select the most informative data samples for training. Nevertheless, most existing methods focus on 2D tasks and do not fully exploit the value of unselected data. In this paper, we propose a semi-supervised active learning approach for 3D object detection tasks that leverages the potential of collected data and reduces annotation costs. Our method considers the 3D consistency of bounding box predictions in both semi-supervised and active learning processes, thereby improving the performance of point cloud-based 3D object detection models. 
Specifically, our framework leverages self-supervision to reduce bounding box uncertainties and selects occluded or distant objects that still have high uncertainty for annotation, even after semi-supervised training has reduced their uncertainty. Experiments on the KITTI dataset demonstrate that our semi-supervised active learning approach selects objects with high measurement uncertainties and enhances the model's ability to detect occluded objects. Using only 1500 annotated frames, our approach improves the baseline by more than 60% (+17.12 mAP)." StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks,"Hongyu Li, Zhengang Li, Neset Unver Akmandor, Huaizu Jiang, Yanzhi Wang, Taskin Padir",Northeastern University,Object Detection I,"Obstacle detection is a safety-critical problem in robot navigation, where stereo matching is a popular vision-based approach. While deep neural networks have shown impressive results in computer vision, most of the previous obstacle detection works only leverage traditional stereo matching techniques to meet the computational constraints for real-time feedback. This paper proposes a computationally efficient method that employs a deep neural network to detect occupancy from stereo images directly. Instead of learning the point cloud correspondence from the stereo data, our approach extracts the compact obstacle distribution based on volumetric representations. In addition, we prune the computation of safety irrelevant spaces in a coarse-to-fine manner based on octrees generated by the decoder. As a result, we achieve real-time performance on the onboard computer (NVIDIA Jetson TX2). Our approach detects obstacles accurately in the range of 32 meters and achieves better IoU (Intersection over Union) and CD (Chamfer Distance) scores with only 2% of the computation cost of the state-of-the-art stereo model. 
Furthermore, we validate our method's robustness and real-world feasibility through autonomous navigation experiments with a real robot. Hence, our work contributes toward closing the gap between the stereo-based system in robot perception and state-of-the-art stereo models in computer vision. To counter the scarcity of high-quality real-world indoor stereo datasets, we collect a 1.36 hours stereo dataset with a mobile robot which is used to fine-tune our model. The dataset, the code, and further details including additional visualizations are available at https://lhy.xyz/stereovoxelnet/." Perceiving Unseen 3D Objects by Poking the Objects,"Linghao Chen, Yunzhou Song, Hujun Bao, Xiaowei Zhou",Zhejiang University,Object Detection I,"We present a novel approach to interactive 3D object perception for robots. Unlike previous perception algorithms that rely on known object models or a large amount of annotated training data, we propose a poking-based approach that automatically discovers and reconstructs 3D objects. The poking process not only enables the robot to discover unseen 3D objects but also produces multi-view observations for 3D reconstruction of the objects. The reconstructed objects are then memorized by neural networks with regular supervised learning and can be recognized in new test images. The experiments on real-world data show that our approach could unsupervisedly discover and reconstruct unseen 3D objects with high quality, and facilitate real-world applications such as robotic grasping. The code and supplementary materials are available at the project page: https://zju3dv.github.io/poking_perception." MonoPGC: Monocular 3D Object Detection with Pixel Geometry Contexts,"Zizhang Wu, Yuanzhu Gan, Wang Robin, Guilian Chen, Jian Pu","Zongmu Technology,Fudan University",Object Detection I,"Monocular 3D object detection reveals an economical but challenging task in autonomous driving. 
Recently center-based monocular methods have developed rapidly with a great trade-off between speed and accuracy, where they usually depend on the object center’s depth estimation via 2D features. However, the visual semantic features without sufficient pixel geometry information, may affect the performance of clues for spatial 3D detection tasks. To alleviate this, we propose MonoPGC, a novel end-to-end Monocular 3D object detection framework with rich Pixel Geometry Contexts. We introduce the pixel depth estimation as our auxiliary task and design depth cross-attention pyramid module (DCPM) to inject local and global depth geometry knowledge into visual features. In addition, we present the depth-space-aware transformer (DSAT) to integrate 3D space position and depth-aware features efficiently. Besides, we design a novel depth-gradient positional encoding (DGPE) to bring more distinct pixel geometry contexts into the transformer for better object detection. Extensive experiments demonstrate that our method achieves the state-of-the-art performance on the KITTI dataset." CrossDTR: Cross-View and Depth-Guided Transformers for 3D Object Detection,"Ching-yu Tseng, Yi-rong Chen, Hsin-ying Lee, Tsung-han Wu, Wen-chin Chen, Winston Hsu",National Taiwan University,Object Detection I,"To achieve accurate 3D object detection at a low cost for autonomous driving, many multi-camera methods have been proposed and solved the occlusion problem of monocular approaches. However, due to the lack of accurate estimated depth, existing multi-camera methods often generate multiple bounding boxes along a ray of depth direction for difficult small objects such as pedestrians, resulting in an extremely low recall. Furthermore, directly applying depth prediction modules to existing multi-camera methods, generally composed of large network architectures, cannot meet the real-time requirements of self-driving applications. 
To address these issues, we propose Cross-view and Depth-guided Transformers for 3D Object Detection, CrossDTR. First, our lightweight depth predictor is designed to produce precise object-wise sparse depth maps and low-dimensional depth embeddings without extra depth datasets during supervision. Second, a cross-view depth-guided transformer is developed to fuse the depth embeddings as well as image features from cameras of different views and generate 3D bounding boxes. Extensive experiments demonstrated that our method hugely surpassed existing multi-camera methods by 10 percent in pedestrian detection and about 3 percent in overall mAP and NDS metrics. Also, computational analyses showed that our method is 5 times faster than prior approaches. Our codes will be made publicly available at https://github.com/sty61010/CrossDTR." DOTIE - Detecting Objects through Temporal Isolation of Events Using a Spiking Architecture,"Manish Nagaraj, Chamika Mihiranga Liyanagedera, Kaushik Roy",Purdue University,Object Detection I,"Vision-based autonomous navigation systems rely on fast and accurate object detection algorithms to avoid obstacles. Algorithms and sensors designed for such systems need to be computationally efficient, due to the limited energy of the hardware used for deployment. Biologically inspired event cameras are a good candidate as a vision sensor for such systems due to their speed, energy efficiency, and robustness to varying lighting conditions. However, traditional computer vision algorithms fail to work on event-based outputs, as they lack photometric features such as light intensity and texture. In this work, we propose a novel technique that utilizes the temporal information inherently present in the events to efficiently detect moving objects. Our technique consists of a lightweight spiking neural architecture that is able to separate events based on the speed of the corresponding objects. 
These separated events are then further grouped spatially to determine object boundaries. This method of object detection is both asynchronous and robust to camera noise. In addition, it shows good performance in scenarios with events generated by static objects in the background, where existing event-based algorithms fail. We show that by utilizing our architecture, autonomous navigation systems can have minimal latency and energy overheads for performing object detection." CEAFFOD: Cross-Ensemble Attention-Based Feature Fusion Architecture towards a Robust and Real-Time UAV-Based Object Detection in Complex Scenarios,"Ahmed Elhagry, Hang Dai, Abdulmotaleb El Saddik, Wail Gueaieb, Giulia De Masi","MBZUAI,Mohamed bin Zayed University of Artificial Intelligence,University of Ottawa,Technology Innovation Institute",Object Detection I,"Deploying object detectors in embedded devices such as unmanned aerial vehicles (UAVs) comes with many challenges. This is due to both the UAV itself having low embedded resources in terms of computation and memory, and also due to the nature of the captured visual data with the variations in objects' scale, orientation, density, viewpoint, distribution, shape, context and others. It is crucial for the object detector to be robust with high accuracy, real-time with fast inference and light-weight to be applicable. Inspired by YOLO architecture, we propose a novel single-stage detection architecture. Our contributions are, first, feature fusion spatial pyramid pooling (FFSPP) block that applies attention-based feature fusion across both time and space utilizing the information of subsequent frames and scales in an efficient manner. Secondly, we introduce a multi-dilated attention-based cross-stage partial connection (MDACSP) block that helps in increasing the receptive field and producing per-channel modulation weights after aggregating the feature maps across their spatial domain. 
Third, scaled feature fusion head (SFFH) fuses both the FFSPP block features and the connected MDACSP block features specific for this head. For a more robust result across different scenarios, we perform cross-ensembling with three of the top UAV/traffic surveillance datasets: UAVDT, UA-DETRAC and VisDrone. Our ablation study shows how every contribution improves over the baseline. Our approach yielded the state-of-the-art results in all the aforementioned datasets achieving 89.3% mAP, 93.5% mAP, and 42.9% mAP respectively. Testing the model performance on NVIDIA Jetson Xavier NX board shows a desirable balance between the inference time and the memory cost. We also show qualitatively the model robustness and efficiency across the diverse complex scenarios of these datasets. We hope this work facilitates the advancement of the UAV-based perception in such crucial industrial applications." Test Time Domain Adaptation for Monocular Depth Estimation,"Zhi Li, Shaoshuai Shi, Bernt Schiele, Dengxin Dai","Max Planck Institute for Informatics,Max Planck,ETH Zurich",Depth Estimation and RGB-D Sensing,"Test-time domain adaptation, i.e. adapting source pretrained models to the test data on-the-fly in a source-free, unsupervised manner, is a highly practical yet very challenging task. Due to the domain gap between source and target data, inference quality on the target domain can drop drastically especially in terms of absolute scale of depth. In addition, unsupervised adaptation can degrade the model performance due to inaccurate pseudo labels. Furthermore, the model can suffer from catastrophic forgetting when errors are accumulated over time. 
We propose a test-time domain adaptation framework for monocular depth estimation which achieves both stability and adaptation performance by benefiting from both self-training of the supervised branch and pseudo labels from self-supervised branch, and is able to tackle the above problems: our scale alignment scheme aligns the input features between source and target data, correcting the absolute scale inference on the target domain; with pseudo label consistency check, we select confident pixels thus improve pseudo label quality; regularisation and self-training schemes are applied to help avoid catastrophic forgetting. Without requirement of further supervisions on the target domain, our method adapts the source-trained models to the test data with significant improvements over the direct inference results, providing scale-aware depth map outputs that outperform the state-of-the-arts. Code is available at https://github.com/Malefikus/ada-depth." TODE-Trans: Transparent Object Depth Estimation with Transformer,"Kang Chen, Shaochen Wang, Beihao Xia, Dongxu Li, Zhen Kan, Bin Li","University of Science and Technology of China,Huazhong University of Science and Technology",Depth Estimation and RGB-D Sensing,"Transparent objects are widely used in industrial automation and daily life. However, robust visual recognition and perception of transparent objects have always been a major challenge. Currently, most commercial-grade depth cameras are still not good at sensing the surfaces of transparent objects due to the refraction and reflection of light. In this work, we present a transformer-based transparent object depth estimation approach from a single RGB-D input. We observe that the global characteristics of the transformer make it easier to extract contextual information to perform depth estimation of transparent areas. In addition, to better enhance the fine-grained features, a feature fusion module (FFM) is designed to assist coherent prediction. 
Our empirical evidence demonstrates that our model delivers significant improvements in recent popular datasets, e.g., 25% gain on RMSE and 21% gain on REL compared to previous state-of-the-art convolutional-based counterparts in ClearGrasp dataset. Extensive results show that our transformer-based model enables better aggregation of the object’s RGB and inaccurate depth information to obtain a better depth representation. Our code and the pre-trained model are available at https://github.com/yuchendoudou/TODE." Learning Depth Completion of Transparent Objects Using Augmented Unpaired Data,"Floris Marc Arden Erich, Bruno Leme, Noriaki Ando, Ryo Hanai, Yukiyasu Domae","National Institute of Advanced Industrial Science and Technology,University of Florida,National Institute of Industrial Science and Technology(AIST),The National Institute of Advanced Industrial Science and Techno",Depth Estimation and RGB-D Sensing,"We propose a technique for depth completion of transparent objects using augmented data captured directly from real environments with complicated geometry. Using cyclic adversarial learning we train translators to convert between painted versions of the objects and their real transparent counterpart. The translators are trained on unpaired data, hence datasets can be created rapidly and without any manual labeling. Our technique does not make any assumptions about the geometry of the environment, unlike SOTA systems that assume easily observable occlusion and contact edges, such as ClearGrasp. We show how our technique outperforms ClearGrasp in a dishwasher environment, in which occlusion and contact edges are difficult to observe. We also show how the technique can be used to create an object manipulation application with a humanoid robot. 
Supplementary URI: https://florise.github.io/faking_depth_web/" Lightweight Monocular Depth Estimation Via Token-Sharing Transformer,"Dong-jae Lee, Jae Young Lee, Hyounguk Shon, Eojindl Yi, Yeong-hun Park, Sung-sik Cho, Junmo Kim","Korea Advanced Institute of Science & Technology (KAIST),Korea Advanced Institute of Science and Technology,KAIST,Hyundai Mobis",Depth Estimation and RGB-D Sensing,"Depth estimation is an important task in various robotics systems and applications. In mobile robotics systems, monocular depth estimation is desirable since a single RGB camera can be deployable at a low cost and compact size. Due to its significant and growing needs, many lightweight monocular depth estimation networks have been proposed for mobile robotics systems. While most lightweight monocular depth estimation methods have been developed using convolution neural networks, the Transformer has been gradually utilized in monocular depth estimation recently. However, massive parameters and large computational costs in the Transformer disturb the deployment to embedded devices. In this paper, we present a Token-Sharing Transformer (TST), an architecture using the Transformer for monocular depth estimation, optimized especially in embedded devices. The proposed TST utilizes global token sharing, which enables the model to obtain an accurate depth prediction with high throughput in embedded devices. Experimental results show that TST outperforms the existing lightweight monocular depth estimation methods. On the NYU Depth v2 dataset, TST can deliver depth maps up to 63.4 FPS in NVIDIA Jetson nano and 142.6 FPS in NVIDIA Jetson TX2, with lower errors than the existing methods. Furthermore, TST achieves real-time depth estimation of high-resolution images on Jetson TX2 with competitive results." 
Improved Event-Based Dense Depth Estimation Via Optical Flow Compensation,"Dianxi Shi, Luoxi Jing, Ruihao Li, Zhe Liu, Huachi Xu, Lin Wang, Yi Zhang","Defense Innovation Institute,Peking University,National University of Defense Technology",Depth Estimation and RGB-D Sensing,"Event cameras have the potential to overcome the limitations of classical computer vision in real-world applications. Depth estimation is a crucial step for high-level robotics tasks and has attracted much attention from the community. In this paper, we propose an event-based dense depth estimation architecture, Mixed-EF2DNet, which firstly predicts inter-grid optical flow to compensate for lost temporal information, and then estimates multiple contextual depth maps that are fused to generate a robust depth estimation map. To supervise the network training, we further design a smoothing loss function used to smooth local depth estimates and facilitate estimating reasonable depth for pixels without events. In addition, we introduce SE-resblocks in the depth network to enhance the network representation by selecting feature channels. Experimental evaluations on both real-world and synthetic datasets show that our method performs better in terms of accuracy when compared to state-of-the-art algorithms, especially in scene detail estimation. Besides, our method demonstrates excellent generalization in cross-dataset tasks." TTCDist: Fast Distance Estimation from an Active Monocular Camera Using Time-To-Contact,"Levi Burner, Nitin Sanket, Cornelia Fermuller, Yiannis Aloimonos","University of Maryland, College Park,University of Maryland",Depth Estimation and RGB-D Sensing,"Distance estimation from vision is fundamental for a myriad of robotic applications such as navigation, manipulation, and planning. 
Inspired by the mammal's visual system, which gazes at specific objects, we develop two novel constraints relating time-to-contact, acceleration, and distance that we call the τ-constraint and Φ-constraint. They allow an active (moving) camera to estimate depth efficiently and accurately while using only a small portion of the image. The constraints are applicable to range sensing, sensor fusion, and visual servoing. We successfully validate the proposed constraints with two experiments. The first applies both constraints in a trajectory estimation task with a monocular camera and an Inertial Measurement Unit (IMU). Our methods achieve 30-70% less average trajectory error while running 25× and 6.2× faster than the popular Visual-Inertial Odometry methods VINS-Mono and ROVIO respectively. The second experiment demonstrates that when the constraints are used for feedback with efference copies the resulting closed loop system's eigenvalues are invariant to scaling of the applied control signal. We believe these results indicate the τ and Φ constraint's potential as the basis of robust and efficient algorithms for a multitude of robotic applications." STEPS: Joint Self-Supervised Nighttime Image Enhancement and Depth Estimation,"Yupeng Zheng, Chengliang Zhong, Pengfei Li, Huan-ang Gao, Yuhang Zheng, Bu Jin, Ling Wang, Hao Zhao, Guyue Zhou, Qichao Zhang, Dongbin Zhao","Institute of Automation,Chinese Academy of Sciences,Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University,Beihang University,Institute of Automation, Chinese Academy of Sciences,Xi’an Research Institute of High-Tech,Chinese Academy of Sciences",Depth Estimation and RGB-D Sensing,"Self-supervised depth estimation draws a lot of attention recently as it can promote the 3D sensing capabilities of self-driving vehicles. However, it intrinsically relies upon the photometric consistency assumption, which hardly holds during nighttime. 
Although various supervised nighttime image enhancement methods have been proposed, their generalization performance on challenging driving scenarios is not satisfactory. To this end, we propose the first method that jointly learns a nighttime image enhancer and a depth estimator, without using ground truth for either task. Our method tightly entangles two self-supervised tasks using a newly proposed uncertain pixel masking strategy. This strategy originates from the observation that nighttime images not only suffer from underexposed regions but also from overexposed regions. By fitting a bridge-shaped curve to the illumination map distribution, both regions are suppressed and two tasks are bridged naturally. We benchmark the method on two established datasets: nuScenes and Oxford and demonstrate state-of-the-art performance on both of them. Detailed ablations also reveal the mechanism of our proposal. Last but not least, to mitigate the problem of sparse ground truth of existing datasets, we provide a new photo-realistically enhanced nighttime dataset based upon CARLA. It brings meaningful new challenges to the community. Codes, data, and models are available at https://github.com/ucaszyp/STEPS." FG-Depth: Flow-Guided Unsupervised Monocular Depth Estimation,"Junyu Zhu, Lina Liu, Yong Liu, Wanlong Li, Feng Wen, Hongbo Zhang","zhejiang University,Zhejiang University,Beijing Huawei Digital Technologies Co., Ltd.,Huawei Technologies Co., Ltd,Huawei Technologies",Depth Estimation and RGB-D Sensing,"The great potential of unsupervised monocular depth estimation has been demonstrated by many works due to low annotation cost and impressive accuracy comparable to supervised methods. To further improve the performance, recent works mainly focus on designing more complex network structures and exploiting extra supervised information, e.g., semantic segmentation. 
These methods optimize the models by exploiting the reconstructed relationship between the target and reference images in varying degrees. However, previous methods prove that this image reconstruction optimization is prone to get trapped in local minima. In this paper, our core idea is to guide the optimization with prior knowledge from pretrained Flow-Net. And we show that the bottleneck of unsupervised monocular depth estimation can be broken with our simple but effective framework named FG-Depth. In particular, we propose (i) a flow distillation loss to replace the typical photometric loss that limits the capacity of the model and (ii) a prior flow based mask to remove invalid pixels that bring the noise in training loss. Extensive experiments demonstrate the effectiveness of each component, and our approach achieves state-of-the-art results on both KITTI and NYU-Depth-v2 datasets." Light-Weight Pointcloud Representation with Sparse Gaussian Process,"Mahmoud Ali, Lantao Liu",Indiana University,Depth Estimation and RGB-D Sensing,This paper presents a framework to represent high-fidelity pointcloud sensor observations for efficient communication and storage. The proposed approach exploits Sparse Gaussian Process to encode pointcloud into a compact form. Our approach represents both the free space and the occupied space using only one model (one 2D Sparse Gaussian Process) instead of the existing two-model framework (two 3D Gaussian Mixture Models). We achieve this by proposing a variance-based sampling technique that effectively discriminates between the free and occupied space. The new representation requires less memory footprint and can be transmitted across limited-bandwidth communication channels. The framework is extensively evaluated in simulation and it is also demonstrated using a real mobile robot equipped with a 3D LiDAR. Our method results in a 70 to 100 times reduction in the communication rate compared to sending the raw pointcloud. 
Test-Time Synthetic-To-Real Adaptive Depth Estimation,"Eojindl Yi, Junmo Kim",KAIST,Depth Estimation and RGB-D Sensing,"Is it possible for a synthetic to realistic domain adapted neural network in single image depth estimation to truly generalize on real world data? The resultant, adapted model will only generalize on the realistic domain dataset, which only reflects a small portion of the true, real world. As a result, the network still has to cope with the potential danger of domain shift between the realistic domain dataset and the real world data. Instead, a viable solution is to design the model to be capable of continuously adapting to the distribution of data it receives at test-time. In this paper, we propose a depth estimation method that is capable of adapting to the domain shift at test-time. Our method adapts to the unseen test-time domain, by updating the network using our proposed objective functions. Following former work, we reduce the entropy of the current prediction for refinement and adaptation. We propose a Logit Order Enforcement loss that can prevent the network from deviating into wrong solutions, which can result from the mere reduction of the aforementioned entropy. Qualitative and quantitative results show the effectiveness of our method. Our method reduces the dependency on training data by 5.8 times on average, while achieving comparable performance to state-of-the-art unsupervised domain adaptation (UDA) and domain generalization methods (DG) on the KITTI dataset." 
Unseen Object Instance Segmentation with Fully Test-Time RGB-D Embeddings Adaptation,"Lu Zhang, Siqi Zhang, Xu Yang, Hong Qiao, Zhiyong Liu","Institute of Automation, Chinese Academy of Science,Chinese Academy of Sciences, Institute of Automation,Institute of Automation, Chinese Academy of Sciences,Institute of Automation Chinese Academy of Sciences",Depth Estimation and RGB-D Sensing,"Segmenting unseen objects is a crucial ability for the robot since it may encounter new environments during the operation. Recently, a popular solution is leveraging RGB-D features of large-scale synthetic data and directly applying the model to unseen real-world scenarios. However, the domain shift caused by the sim2real gap is inevitable, posing a crucial challenge to the segmentation model. In this paper, we emphasize the adaptation process across sim2real domains and model it as a learning problem on the BatchNorm parameters of a simulation-trained model. Specifically, we propose a novel non-parametric entropy objective, which formulates the learning objective for the test-time adaptation in an open-world manner. Then, a cross-modality knowledge distillation objective is further designed to encourage the test-time knowledge transfer for feature enhancement. Our approach can be efficiently implemented with only test images, without requiring annotations or revisiting the large-scale synthetic training data. Besides significant time savings, the proposed method consistently improves segmentation results on the overlap and boundary metrics, achieving state-of-the-art performance on unseen object instance segmentation." 
Robust Double-Encoder Network for RGB-D Panoptic Segmentation,"Matteo Sodano, Federico Magistri, Tiziano Guadagnino, Jens Behley, Cyrill Stachniss","Photogrammetry and Robotics Lab, University of Bonn,University of Bonn,Sapienza University of Rome",Depth Estimation and RGB-D Sensing,"Perception is crucial for robots that act in real-world environments, as autonomous systems need to see and understand the world around them to act properly. Panoptic segmentation provides an interpretation of the scene by computing a pixelwise semantic label together with instance IDs. In this paper, we address panoptic segmentation using RGB-D data of indoor scenes. We propose a novel encoder-decoder neural network that processes RGB and depth separately through two encoders. The features of the individual encoders are progressively merged at different resolutions, such that the RGB features are enhanced using complementary depth information. We propose a novel merging approach called ResidualExcite, which reweighs each entry of the feature map according to its importance. With our double-encoder architecture, we are robust to missing cues. In particular, the same model can train and infer on RGB-D, RGB-only, and depth-only input data, without the need to train specialized models. We evaluate our method on publicly available datasets and show that our approach achieves superior results compared to other common approaches for panoptic segmentation." Explain What You See: Open-Ended Segmentation and Recognition of Occluded 3D Objects,"Hamed Ayoobi, Hamidreza Kasaei, Ming Cao, Rineke Verbrugge, Bart Verheij","Imperial College London,University of Groningen",3D Vision,"Local-HDP (Local Hierarchical Dirichlet Process) is a hierarchical Bayesian method recently used for open-ended 3D object category recognition. It has been proven to be efficient in real-time robotic applications. However, the method is not robust to a high degree of occlusion. We address this limitation in two steps. 
First, we propose a novel semantic 3D object-parts segmentation method that has the flexibility of Local-HDP. This method is shown to be suitable for open-ended scenarios where the number of 3D objects or object parts are not fixed and can grow over time. We show that the proposed method has a higher percentage of mean intersection over union, using a smaller number of learning instances. Second, we integrate this technique with a recently introduced argumentation-based online incremental learning method, enabling the model to handle a high degree of occlusion. We show that the resulting model produces explicit explanations for the 3D object category recognition task." GMCR: Graph-Based Maximum Consensus Estimation for Point Cloud Registration,"Michael Gentner, Prajval Kumar Murali, Mohsen Kaboli","BMW Group and Technical University of Munich,BMW Group and University of Glasgow,BMW & Radboud University",3D Vision,"Point cloud registration is a fundamental and challenging problem for autonomous robots interacting in unstructured environments for applications such as object pose estimation, simultaneous localization and mapping, robot-sensor calibration, and so on. In global correspondence-based point cloud registration, data association is a highly brittle task and commonly produces high amounts of outliers. Failure to reject outliers can lead to errors propagating to downstream perception tasks. Maximum Consensus (MC) is a widely used technique for robust estimation, which is however known to be NP-hard. Exact methods struggle to scale to realistic problem instances, whereas high outlier rates are challenging for approximate methods. To this end, we propose Graph-based Maximum Consensus Registration (GMCR), which is highly robust to outliers and scales to realistic problem instances. We propose novel consensus functions to map the decoupled MC-objective to the graph domain, wherein we find a tight approximation to the maximum consensus set as the maximum clique. 
The final pose estimate is given in closed-form. We extensively evaluated our proposed GMCR on a synthetic registration benchmark, robotic object localization task, and additionally on a scan matching benchmark. Our proposed method shows high accuracy and time efficiency compared to other state-of-the-art MC methods and compares favorably to other robust registration methods." Toward Cooperative 3D Object Reconstruction with Multi-Agent,"Xiong Li, Zhenyu Wen, Zhou Leiqiang, Chenwei Li, Yejian Zhou, Taotao Li, Zhen Hong","Zhejiang University of Technology,Zhejiang University of technology,Zhejiang",3D Vision,"We study the problem of object reconstruction in a multi-agent collaboration scenario. Specifically, we focus on the reconstruction of specific goals through several cooperative agents equipped with vision sensors to achieve higher efficiency than single agents. Our main insight is that a complete 3D object can be split into several local 3D models and assigned to different agents. In addition, we can use the salient characteristics of the collaboration agent itself to help realize the integration of local models. We develop a novel pipeline that first restores local 3D models from the images obtained from different agents, then the relative poses between collaborative agents are estimated by aligning intrinsic features. After that, all local models are integrated using the estimated parameters. Extensive experiments show that our proposed method is capable of accurately reconstructing 3D objects in the real world in a multi-agent collaborative manner. The full reconstruction pipeline is released to the public as an open-source project." SwinDepth: Unsupervised Depth Estimation Using Monocular Sequences Via Swin Transformer and Densely Cascaded Network,"Dongseok Shim, H. 
Jin Kim",Seoul National University,3D Vision,"Monocular depth estimation plays a critical role in various computer vision and robotics applications such as localization, mapping, and 3D object detection. Recently, learning-based algorithms achieve huge success in depth estimation by training models with a large amount of data in a supervised manner. However, it is challenging to acquire dense ground truth depth labels for supervised training, and the unsupervised depth estimation using monocular sequences emerges as a promising alternative. Unfortunately, most studies on unsupervised depth estimation explore loss functions or occlusion masks, and there is little change in model architecture in that ConvNet-based encoder-decoder structure becomes a de-facto standard for depth estimation. In this paper, we adopt a convolution-free Swin Transformer as an image feature extractor so that the network can capture both local geometric features and global semantic features for depth estimation. Also, we propose a Densely Cascaded Multi-scale Network (DCMNet) that connects every feature map directly with another from different scales via a top-down cascade pathway. This densely cascaded connectivity reinforces the interconnection between decoding layers and produces high-quality multi-scale depth outputs. The experiments on two different datasets, KITTI and Make3D, demonstrate that our proposed method outperforms existing state-of-the-art unsupervised algorithms." GAN-Based Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback,"Jie Huang, Jiangshan Hao, Rongshun Juan, Randy Gomez, Keisuke Nakamura, Guangliang Li","Ocean University of China,Tianjin University,Honda Research Institute Japan Co., Ltd.",Learning from Demonstration,"Generative adversarial imitation learning (GAIL) — a general model-free imitation learning method, allows robots to directly learn policies from expert trajectories in large environments. 
However, GAIL shares the limitation of other imitation learning methods that they can seldom surpass the performance of demonstrations. In this paper, to address the limit of GAIL, we propose GAN-based interactive reinforcement learning (GAIRL) from demonstrations and human evaluative feedback, by combining the advantages of GAIL and interactive reinforcement learning. We test GAIRL in six physics-based control tasks, ranging from simple low-dimensional control tasks — Cart Pole, Mountain Car and Lunar Lander, to difficult high-dimensional tasks — Inverted Double Pendulum, Hopper and HalfCheetah. Our results suggest that, the GAIRL agent can generally surpass the performance of demonstrations in both low-dimensional and high-dimensional tasks and get an optimal or close to optimal policy." Demonstration-Guided Optimal Control for Long-Term Non-Prehensile Planar Manipulation,"Teng Xue, Hakan Girgin, Teguh Santoso Lembono, Sylvain Calinon","Idiap Research Institute, EPFL,Idiap Research Institute",Learning from Demonstration,"Long-term non-prehensile planar manipulation is a challenging task for robot planning and feedback control. It is characterized by underactuation, hybrid control, and contact uncertainty. One main difficulty is to determine both the continuous and discrete contact configurations, e.g., contact points and modes, which requires joint logical and geometrical reasoning. To tackle this issue, we propose a demonstration-guided hierarchical optimization framework to achieve offline task and motion planning (TAMP). Our work extends the formulation of the dynamics model of the pusher-slider system to include separation mode with face switching mechanism, and solves a warm-started TAMP problem by exploiting human demonstrations. We show that our approach can cope well with the local minima problems currently present in the state-of-the-art solvers and determine a valid solution to the task. 
We validate our results in simulation and demonstrate its applicability on a pusher-slider system with a real Franka Emika robot in the presence of external disturbances." Learning Reward Functions for Robotic Manipulation by Observing Humans,"Minttu Alakuijala, Gabriel Dulac-arnold, Julien Mairal, Jean Ponce, Cordelia Schmid","Inria,Google,INRIA,Ecole Normale Supérieure",Learning from Demonstration,"Observing a human demonstrator manipulate objects provides a rich, scalable and inexpensive source of data for learning robotic policies. However, transferring skills from human videos to a robotic manipulator poses several challenges, not least a difference in action and observation spaces. In this work, we use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies. Thanks to the diversity of this training data, the learned reward function sufficiently generalizes to image observations from a previously unseen robot embodiment and environment to provide a meaningful prior for directed exploration in reinforcement learning. We propose two methods for scoring states relative to a goal image: through direct temporal regression, and through distances in an embedding space obtained with time-contrastive learning. By conditioning the function on a goal image, we are able to reuse one model across a variety of tasks. Unlike prior work on leveraging human videos to teach robots, our method, Human Offline Learned Distances (HOLD) requires neither a priori data from the robot environment, nor a set of task-specific human demonstrations, nor a predefined notion of correspondence across morphologies, yet it is able to accelerate training of several manipulation tasks on a simulated robot arm compared to using only a sparse reward obtained from task completion." 
Data-Driven Stochastic Motion Evaluation and Optimization with Image by Spatially-Aligned Temporal Encoding,"Takeru Oba, Norimichi Ukita",Toyota Technological Institute,Learning from Demonstration,"This paper proposes a probabilistic motion prediction method for long motions. The motion is predicted so that it accomplishes a task from the initial state observed in the given image. While our method evaluates the task achievability by the Energy-Based Model (EBM), previous EBMs are not designed for evaluating the consistency between different domains (i.e., image and motion in our method). Our method seamlessly integrates the image and motion data into the image feature domain by spatially-aligned temporal encoding so that features are extracted along the motion trajectory projected onto the image. Furthermore, this paper also proposes a data-driven motion optimization method, Deep Motion Optimizer (DMO), that works with EBM for motion prediction. Different from previous gradient-based optimizers, our self-supervised DMO alleviates the difficulty of hyper-parameter tuning to avoid local minima. The effectiveness of the proposed method is demonstrated with a variety of experiments with similar SOTA methods." Demonstration-Bootstrapped Autonomous Practicing Via Multi-Task Reinforcement Learning,"Abhishek Gupta, Corey Lynch, Brandon Kinman, Garrett Peake, Sergey Levine, Karol Hausman","University of Washington,Google Brain,Google LLC,Google Inc,UC Berkeley",Learning from Demonstration,"Reinforcement learning systems have the potential to enable continuous improvement in unstructured environments, leveraging data collected autonomously. However, in practice these systems require significant amounts of instrumentation or human intervention to learn in the real world. 
In this work, we propose a system for reinforcement learning that leverages multi-task reinforcement learning bootstrapped with prior data to enable continuous autonomous practicing, minimizing the number of resets needed while being able to learn temporally extended behaviors. We show how appropriately provided prior data can help bootstrap both low-level multi-task policies and strategies for sequencing these tasks one after another to enable learning with minimal resets. This mechanism enables our robotic system to practice with minimal human intervention at training time, while being able to solve long horizon tasks at test time. We show the efficacy of the proposed system on a challenging kitchen manipulation task both in simulation and the real world, demonstrating the ability to practice autonomously in order to solve temporally extended problems." Minimizing Human Assistance: Augmenting a Single Demonstration for Deep Reinforcement Learning,"Abraham George, Alison Bartsch, Amir Barati Farimani",Carnegie Mellon University,Learning from Demonstration,"The use of human demonstrations in reinforcement learning has proven to significantly improve agent performance. However, any requirement for a human to manually 'teach' the model is somewhat antithetical to the goals of reinforcement learning. This paper attempts to minimize human involvement in the learning process while retaining the performance advantages by using a single human example collected through a simple-to-use virtual reality simulation to assist with RL training. Our method augments a single demonstration to generate numerous human-like demonstrations that, when combined with Deep Deterministic Policy Gradients and Hindsight Experience Replay (DDPG + HER) significantly improve training time on simple tasks and allows the agent to solve a complex task (block stacking) that DDPG + HER alone cannot solve. 
The model achieves this significant training advantage using a single human example, requiring less than a minute of human input. Moreover, despite learning from a human example, the agent is not constrained to human-level performance, often learning a policy that is significantly different from the human demonstration." Learning Robotic Cutting from Demonstration: Non-Holonomic DMPs Using the Udwadia-Kalaba Method,"Artūras Straižys, Michael Burke, Subramanian Ramamoorthy","University of Edinburgh,Monash University,The University of Edinburgh",Learning from Demonstration,"Dynamic Movement Primitives (DMPs) offer great versatility for encoding, generating and adapting complex end-effector trajectories. DMPs are also very well suited to learning manipulation skills from human demonstration. However, the reactive nature of DMPs restricts their applicability for tool use and object manipulation tasks involving non-holonomic constraints, such as scalpel cutting or catheter steering. In this work, we extend the Cartesian space DMP formulation by adding a coupling term that enforces a pre-defined set of non-holonomic constraints. We obtain the closed-form expression for the constraint forcing term using the Udwadia-Kalaba method. This approach offers a clean and practical solution for guaranteed constraint satisfaction at run-time. Further, the proposed analytical form of the constraint forcing term enables efficient trajectory optimization subject to constraints. We demonstrate the usefulness of this approach by showing how we can learn robotic cutting skills from human demonstration." KRIS: A Novel Device for Kinesthetic Corrective Feedback During Robot Motion,"Jorn Verhggen, Kim Baraka","Vrije universiteit,Vrije Universiteit Amsterdam",Learning from Demonstration,"This paper presents a novel device that can be used to perform kinesthetic corrective feedback for robotic systems. 
KRIS (Kinesthetic Robotic Interaction System) is a device that can be mounted on the end-effector of an articulated robot. From here it can be manipulated by a human to give corrective feedback to the robot system during execution and in an intuitive way. The device can provide feedback in six degrees of freedom while giving passive haptic feedback to the user about the position, rotation, and movement of the robot. We evaluated KRIS in a user study with respect to a baseline based on keyboard feedback in the areas of usability, intuitiveness, accuracy of corrections, and user task load. KRIS outperformed our baseline on the first three metrics and performed similarly on task load. We believe that KRIS can enable a wide variety of robots to be taught interactively by non-expert humans in diverse collaborative settings." Guided Learning from Demonstration for Robust Transferability,"Fouad Sukkar, Victor Hernandez Moreno, Teresa A. Vidal-Calleja, Jochen Deuse",University of Technology Sydney,Learning from Demonstration,"Learning from demonstration (LfD) has the potential to greatly increase the applicability of robotic manipulators in modern industrial applications. Recent progress in LfD methods has put more emphasis on learning robustness than on guiding the demonstration itself in order to improve robustness. The latter is particularly important to consider when the target system reproducing the motion is structurally different to the demonstration system, as some demonstrated motions may not be reproducible. In light of this, this paper introduces a new guided learning from demonstration paradigm where an interactive graphical user interface (GUI) guides the user during demonstration, preventing them from demonstrating non-reproducible motions. The key aspect of our approach is determining the space of reproducible motions based on a motion planning framework which finds regions in the task space where trajectories are guaranteed to be of bounded length. 
We evaluate our method on two different setups with a six-degree-of-freedom (DOF) UR5 as the target system. First our method is validated using a seven-DOF Sawyer as the demonstration system. Then an extensive user study is carried out where several participants are asked to demonstrate, with and without guidance, a mock weld task using a hand held tool tracked by a VICON system. With guidance users were able to always carry out the task successfully in comparison to only 44% of the time without guidance." One-Shot Visual Imitation Via Attributed Waypoints and Demonstration Augmentation,"Matthew Chang, Saurabh Gupta","University of Illinois at Urbana-Champaign,UIUC",Learning from Demonstration,"In this paper, we analyze the behavior of existing techniques and design new solutions for the problem of one-shot visual imitation. In this setting, an agent must solve a novel instance of a novel task given just a single visual demonstration. Our analysis reveals that current methods fall short because of three errors: the DAgger problem arising from purely offline training, last centimeter errors in interacting with objects, and mis-fitting to the task context rather than to the actual task. This motivates the design of our modular approach where we a) separate out task inference (what to do) from task execution (how to do it), and b) develop data augmentation and generation techniques to mitigate mis-fitting. The former allows us to leverage hand-crafted motor primitives for task execution which side-steps the DAgger problem and last centimeter errors, while the latter gets the model to focus on the task rather than the task context. Our model gets 100% and 48% success rates on two recent benchmarks, improving upon the current state-of-the-art by absolute 90% and 20% respectively." 
Show Me What You Want: Inverse Reinforcement Learning to Automatically Design Robot Swarms by Demonstration,"Ilyes Gharbi, Jonas Kuckling, David Garzon Ramos, Mauro Birattari","Université libre de Bruxelles,Université Libre de Bruxelles",Learning from Demonstration,"Automatic design is a promising approach to generating control software for robot swarms. So far, automatic design has relied on mission-specific objective functions to specify the desired collective behavior. In this paper, we explore the possibility to specify the desired collective behavior via demonstrations. We develop Demo-Cho, an automatic design method that combines inverse reinforcement learning with automatic modular design of control software for robot swarms. We show that, only on the basis of demonstrations and without the need to be provided with an explicit objective function, Demo-Cho successfully generated control software to perform four missions. We present results obtained in simulation and with physical robots." Immersive Demonstrations Are the Key to Imitation Learning,"Kelin Li, Digby Chappell, Nicolas Rojas",Imperial College London,Learning from Demonstration,"Achieving successful robotic manipulation is an essential step towards robots being widely used in industry and home settings. Recently, many learning-based methods have been proposed to tackle this challenge, with imitation learning showing great promise. However, imperfect demonstrations and a lack of feedback from teleoperation systems may lead to poor or even unsafe results. In this work we explore the effect of demonstrator force feedback on imitation learning, using a feedback glove and a robot arm to render fingertip-level and palm-level forces, respectively. 10 participants recorded 5 demonstrations of a pick-and-place task with 3 grippers, under conditions with no force feedback, fingertip force feedback, and fingertip and palm force feedback. 
Results show that force feedback significantly reduces demonstrator fingertip and palm forces, leads to a lower variation in demonstrator forces, and recorded trajectories that are quicker to execute. Using behavioral cloning, we find that agents trained to imitate these trajectories mirror these benefits, even though agents have no force data shown to them during training. We conclude that immersive demonstrations, achieved with force feedback, may be the key to unlocking safer, quicker-to-execute dexterous manipulation policies." DreamWaQ: Learning Robust Quadrupedal Locomotion with Implicit Terrain Imagination Via Deep Reinforcement Learning,"I Made Aswin Nahrendra, Byeongho Yu, Hyun Myung","KAIST,KAIST (Korea Advanced Institute of Science and Technology)",Learning for Locomotion,"Quadrupedal robots resemble the physical ability of legged animals to walk through unstructured terrains. However, designing a controller for quadrupedal robots poses a significant challenge due to their functional complexity and requires adaptation to various terrains. Recently, deep reinforcement learning, inspired by how legged animals learn to walk from their experiences, has been utilized to synthesize natural quadrupedal locomotion. However, state-of-the-art methods strongly depend on a complex and reliable sensing framework. Furthermore, prior works that rely only on proprioception have shown a limited demonstration for overcoming challenging terrains, especially for a long distance. This work proposes a novel quadrupedal locomotion learning framework that allows quadrupedal robots to walk through challenging terrains, even with limited sensing modalities. The proposed framework was validated in real-world outdoor environments with varying conditions within a single run for a long distance." 
Learning Low-Frequency Motion Control for Robust and Dynamic Robot Locomotion,"Siddhant Gangapurwala, Luigi Campanaro, Ioannis Havoutis","Sony AI,University of Oxford",Learning for Locomotion,"Robotic locomotion is often approached with the goal of maximizing robustness and reactivity by increasing motion control frequency. We challenge this intuitive notion by demonstrating robust and dynamic locomotion with a learned motion controller executing at as low as 8 Hz on a real ANYmal C quadruped. The robot is able to robustly and repeatably achieve a high heading velocity of 1.5 m/s, traverse uneven terrain, and resist unexpected external perturbations. We further present a comparative analysis of deep reinforcement learning (RL) based motion control policies trained and executed at frequencies ranging from 5 Hz to 200 Hz. We show that low-frequency policies are less sensitive to actuation latencies and variations in system dynamics. This is to the extent that a successful sim-to-real transfer can be performed even without any dynamics randomization or actuation modeling. We support this claim through a set of rigorous empirical evaluations. Moreover, to assist reproducibility, we provide the training and deployment code." OPT-Mimic: Imitation of Optimized Trajectories for Dynamic Quadruped Behaviors,"Yuni Fuchioka, Zhaoming Xie, Michiel Van De Panne","University of British Columbia,Stanford University",Learning for Locomotion,"Reinforcement Learning (RL) has seen many recent successes for quadruped robot control. The imitation of reference motions provides a simple and powerful prior for guiding solutions towards desired solutions without the need for meticulous reward design. While much work uses motion capture data or hand-crafted trajectories as the reference motion, relatively little work has explored the use of reference motions coming from model-based trajectory optimization. 
In this work, we investigate several design considerations that arise with such a framework, as demonstrated through four dynamic behaviours: trot, front hop, 180 backflip, and biped stepping. These are trained in simulation and transferred to a physical Solo 8 quadruped robot without further adaptation. In particular, we explore the space of feed-forward designs afforded by the trajectory optimizer to understand its impact on RL learning efficiency and sim-to-real transfer. These findings contribute to the long standing goal of producing robot controllers that combine the interpretability and precision of model-based optimization with the robustness that model-free RL-based controllers offer." Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments,"Mingyo Seo, Ryan Gupta, Yifeng Zhu, Alexy Skoutnev, Luis Sentis, Yuke Zhu","The University of Texas at Austin,University of Texas at Austin",Learning for Locomotion,"We tackle the problem of perceptive locomotion in dynamic environments. In this problem, a quadrupedal robot must exhibit robust and agile walking behaviors in response to environmental clutter and moving obstacles. We present a hierarchical learning framework, named PRELUDE, which decomposes the problem of perceptive locomotion into high-level decision-making to predict navigation commands and low-level gait generation to realize the target commands. In this framework, we train the high-level navigation controller with imitation learning on human demonstrations collected on a steerable cart and the low-level gait controller with reinforcement learning (RL). Therefore, our method can acquire complex navigation behaviors from human supervision and discover versatile gaits from trial and error. We demonstrate the effectiveness of our approach in simulation and with hardware experiments. Videos and code can be found at the project page: https://ut-austin-rpl.github.io/PRELUDE." 
Legs As Manipulator: Pushing Quadrupedal Agility Beyond Locomotion,"Xuxin Cheng, Ashish Kumar, Deepak Pathak","Carnegie Mellon University,UC Berkeley",Learning for Locomotion,"Locomotion has seen dramatic progress for walking or running across challenging terrains. However, robotic quadrupeds are still far behind their biological counterparts, such as dogs, which display a variety of agile skills and can use the legs beyond locomotion to perform several basic manipulation tasks like interacting with objects and climbing. In this paper, we take a step towards bridging this gap by training quadruped robots not only to walk but also to use the front legs to climb walls, press buttons, and perform object interaction in the real world. To handle this challenging optimization, we decouple the skill learning broadly into locomotion, which involves anything that involves movement whether via walking or climbing a wall, and manipulation, which involves using one leg to interact while balancing on the other three legs. These skills are trained in simulation using curriculum and transferred to the real world using our proposed sim2real variant that builds upon recent locomotion success. Finally, we combine these skills into a robust long-term plan by learning a behavior tree that encodes a high-level task hierarchy from one clean expert demonstration. We evaluate our method in both simulation and real-world showing successful executions of both short as well as long-range tasks and how robustness helps confront external perturbations. 
Videos at https://robot-skills.github.io/" Force Control for Robust Quadruped Locomotion: A Linear Policy Approach,"Aditya Shirwatkar, Vamshi Kumar Kurva, Devaraju Vinoda, Aman Singh, Aditya Varma Sagi, Himanshu Lodha, Bhavya Giri Goswami, Shivam Sood, Ketan Nehete, Shishir Kolathaya","Indian Institute of Science Bengaluru,IISc,Indian Institute of Science, Bengaluru,Indian Institute of Science,Stoch Lab, Indian Institute of Science, Bengaluru,Indian Institute of Science (IISc), Bengaluru,Indian Institute of Technology Kharagpur",Learning for Locomotion,"This work presents a simple linear policy for direct force control for quadrupedal robot locomotion. The motivation is that force control is essential for highly dynamic and agile motions. We learn a linear policy to generate end-foot trajectory parameters and a centroidal wrench, which is then distributed among the legs based on the foot contact information using a quadratic program (QP) to get the desired ground reaction forces. Unlike the majority of the existing works that use complex nonlinear function approximators to represent the RL policy or model predictive control (MPC) methods with many optimization variables in the order of hundred, our controller uses a simple linear function approximator to represent policy along with only a twelve variable QP for the force distribution. A centroidal dynamics-based MPC method is used to generate reference trajectory data, and then the linear policy is trained using imitation learning to minimize the deviations from the reference trajectory. We demonstrate this compute-efficient controller on our robot Stoch3 in simulation and real-world experiments on indoor and outdoor terrains with push recovery." 
Advanced Skills through Multiple Adversarial Motion Priors in Reinforcement Learning,"Eric Vollenweider, Marko Bjelonic, Victor Klemm, Nikita Rudin, Joonho Lee, Marco Hutter","ETH, Microsoft,ETH Zurich,ETH Zurich, NVIDIA,ETH Zurich Robotic Systems Laboratory",Learning for Locomotion,"Reinforcement learning (RL) has emerged as a powerful approach for locomotion control of highly articulated robotic systems. However, one major challenge is the tedious process of tuning the reward function to achieve the desired motion style. To address this issue, imitation learning approaches such as adversarial motion priors have been proposed, which encourage a pre-defined motion style. In this work, we present an approach to enhance the concept of adversarial motion prior-based RL, allowing for multiple, discretely switchable motion styles. Our approach demonstrates that multiple styles and skills can be learned simultaneously without significant performance differences, even in combination with motion data-free skills. We conducted several real-world experiments using a wheeled-legged robot to validate our approach. The experiments involved learning skills from existing RL controllers and trajectory optimization, such as ducking and walking, as well as novel skills, such as switching between a quadrupedal and humanoid configuration. For the latter skill, the robot was required to stand up, navigate on two wheels, and sit down. Instead of manually tuning the sit-down motion, we found that a reverse playback of the stand-up movement helped the robot discover feasible sit-down behaviors and avoided the need for tedious reward function tuning." Deep Reinforcement Learning Based Personalized Locomotion Planning for Lower-Limb Exoskeletons,"Javad K. Mehr, Edward Guo, Mojtaba Akbari, Vivian K. 
Mushahwar, Mahdi Tavakoli","University of Alberta,University of Calgary",Learning for Locomotion,"This paper introduces intelligent central pattern generators (iCPGs) that can plan personalized walking trajectories for lower-limb exoskeletons. This can make walking more comfortable for the users by resolving one of the significant shortcomings of most commercially available exoskeletons, which is the use of pre-defined fixed trajectories for all users. The proposed method combines reinforcement learning (RL) with previously introduced adaptable central pattern generators (ACPGs) to learn a user's physical interaction behaviour and refine the exoskeleton's walking trajectories. The ACPG method embeds physical human-robot interaction (pHRI) in CPGs to make changing gait trajectories in real-time, possible. However, to effectively refine gait trajectories based on pHRIs, the parameters must be precisely identified and updated as a user interacts with the exoskeleton. Our proposed method uses RL to modify (amplify/attenuate) the pHRI energy based on a user's interaction behaviour, and form an effective energy value which can facilitate reaching desired gait pattern for users via iCPG dynamics. The proposed method can resolve the aforementioned challenges with ACPGs and personalized trajectory generation. The simulation and experimental results provide evidence that the proposed method can effectively adapt to the user's behaviour in different walking scenarios with the Indego lower-limb exoskeleton." Expanding Versatility of Agile Locomotion through Policy Transitions Using Latent State Representation,"Guilherme Christmann, Jonathan Hans Soeseno, Ying-sheng Luo, Wei-chao Chen","Inventec Corporation,Inventec Inc.",Learning for Locomotion,"This paper proposes the transition-net, a robust transition strategy that expands the versatility of robot locomotion in the real-world setting. 
To this end, we start by distributing the complexity of different gaits into dedicated locomotion policies applicable to real-world robots. Next, we expand the versatility of the robot by unifying the policies with robust transitions into a single coherent meta-controller by examining the latent state representations. Our approach enables the robot to iteratively expand its skill repertoire and robustly transition between any policy pair in a library. In our framework, adding new skills does not introduce any process that alters the previously learned skills. Moreover, training of a locomotion policy takes less than an hour with a single consumer GPU. Our approach is effective in the real-world and achieves a 19% higher average success rate for the most challenging transition pairs in our experiments compared to existing approaches." Sim-To-Real Transfer for Quadrupedal Locomotion Via Terrain Transformer,"Hang Lai, Weinan Zhang, Xialin He, Chen Yu, Zheng Tian, Yong Yu, Jun Wang","Shanghai Jiao Tong University,ShanghaiTech University,University College London",Learning for Locomotion,"Deep reinforcement learning has recently emerged as an appealing alternative for legged locomotion over multiple terrains by training a policy in physical simulation and then transferring it to the real world (i.e., sim-to-real transfer). Despite considerable progress, the capacity and scalability of traditional neural networks are still limited, which may hinder their applications in more complex environments. In contrast, the Transformer architecture has shown its superiority in a wide range of large-scale sequence modeling tasks, including natural language processing and decision-making problems. In this paper, we propose Terrain Transformer (TERT), a high-capacity Transformer model for quadrupedal locomotion control on various terrains. 
Furthermore, to better leverage Transformer in sim-to-real scenarios, we present a novel two-stage training framework consisting of an offline pretraining stage and an online correction stage, which can naturally integrate Transformer with privileged training. Extensive experiments in simulation demonstrate that TERT outperforms state-of-the-art baselines on different terrains in terms of return, energy consumption and control smoothness. In further real-world validation, TERT successfully traverses nine challenging terrains, including sand pit and stair down, which cannot be accomplished by strong baselines." Agile and Versatile Robot Locomotion Via Kernel-Based Residual Learning,"Milo Carroll, Zhaocheng Liu, Mohammadreza Kasaei, Zhibin Li","Entrepreneur First,The University of Edinburgh,University of Edinburgh,University College London",Learning for Locomotion,"This work developed a kernel-based residual learning framework for quadrupedal robotic locomotion. Initially, a kernel neural network is trained with data collected from an MPC controller. Alongside a frozen kernel network, a residual controller network is trained via reinforcement learning to acquire generalized locomotion skills and resilience against external perturbations. With this proposed framework, a robust quadrupedal locomotion controller is learned with high sample efficiency and controllability, providing omnidirectional locomotion at continuous velocities. Its versatility and robustness are validated on unseen terrains that the expert MPC controller fails to traverse. Furthermore, the learned kernel can produce a range of functional locomotion behaviors and can generalize to unseen gaits." 
DribbleBot: Dynamic Legged Manipulation in the Wild,"Yandong Ji, Gabriel Margolis, Pulkit Agrawal","MIT,Massachusetts Institute of Technology",Learning for Locomotion,"We present DribbleBot (Dexterous Ball Manipulation with a Legged Robot), a legged robotic system that can dribble a soccer ball under the same real-world conditions as humans. We identify key challenges of in-the-wild soccer ball manipulation, including variable ball motion dynamics and perception using body-mounted cameras. To overcome these challenges, we propose a domain and task specification for learning viable soccer dribbling behaviors in simulation that transfer to real fields. Our system provides promising evidence that current legged robots are physically capable and adequately sensorized for varied and dynamic real-world soccer play." Knowledge Distillation for Feature Extraction in Underwater VSLAM,"Jinghe Yang, Mingming Gong, Girish N. Nair, Jung Hoon Lee, Jason Monty, Ye Pu","The University of Melbourne,University of Melbourne",Marine Robotics III,"In recent years, learning-based feature detection and matching have outperformed manually-designed methods in in-air cases. However, it is challenging to learn the features in the underwater scenario due to the absence of annotated underwater datasets. This paper proposes a cross-modal knowledge distillation framework for training an underwater feature detection and matching network (UFEN). In particular, we use in-air RGBD data to generate synthetic underwater images based on a physical underwater imaging formation model and employ these as the medium to distil knowledge from a teacher model SuperPoint pretrained on in-air images. We embed UFEN into the ORB-SLAM3 framework to replace the ORB feature by introducing an additional binarization layer. 
To test the effectiveness of our method, we built a new underwater dataset with groundtruth measurements named EASI (https://github.com/Jinghe-mel/UFEN-SLAM), recorded in an indoor water tank for different turbidity levels. The experimental results on the existing dataset and our new dataset demonstrate the effectiveness of our method." OysterNet: Enhanced Oyster Detection Using Simulation,"Xiaomin Lin, Nitin Sanket, Nare Karapetyan, Yiannis Aloimonos","University of Maryland,University of Maryland, College Park",Marine Robotics III,"Oysters play a pivotal role in the bay living ecosystem and are considered the living filters for the ocean. In recent years, oyster reefs have undergone major devastation caused by commercial over-harvesting, requiring preservation to maintain ecological balance. The foundation of this preservation is to estimate the oyster density which requires accurate oyster detection. However, systems for accurate oyster detection require large datasets obtaining which is an expensive and labor-intensive task in underwater environments. To this end, we present a novel method to mathematically model oysters and render images of oysters in simulation to boost the detection performance with minimal real data. Utilizing our synthetic data along with real data for oyster detection, we obtain up to 35.1% boost in performance as compared to using only real data with our OysterNet network. We also improve the state-of-the-art by 12.7%. This shows that using underlying geometrical properties of objects can help to enhance recognition task accuracy on limited datasets successfully and we hope more researchers adopt such a strategy for hard-to-obtain datasets." SyreaNet: A Physically Guided Underwater Image Enhancement Framework Integrating Synthetic and Real Images,"Junjie Wen, Jinqiang Cui, Zhenjun Zhao, Ruixin Yan, Zhi Gao, Lihua Dou, Ben M. 
Chen","The Chinese University of Hong Kong,Peng Cheng Laboratory,Temasek Laboratories @ NUS,Beijing Institute of Technology,Chinese University of Hong Kong",Marine Robotics III,"Underwater image enhancement (UIE) is vital for high-level vision-related underwater tasks. Although learning-based UIE methods have achieved remarkable results in recent years, it's still challenging for them to consistently deal with various underwater conditions, which could be caused by: 1) the use of the simplified atmospheric image formation model in UIE may result in severe errors; 2) the network trained solely with synthetic images might have difficulty in generalizing well to real underwater images. In this work, we, for the first time, propose a framework SyreaNet for UIE that integrates both synthetic and real data under the guidance of the revised underwater image formation model and novel domain adaptation (DA) strategies. First, an underwater image synthesis module based on the revised model is proposed. Then, a physically guided disentangled network is designed to predict the clear images by combining both synthetic and real underwater images. The intra- and inter-domain gaps are abridged by fully exchanging the domain knowledge. Extensive experiments demonstrate the superiority of our framework over other state-of-the-art (SOTA) learning-based UIE methods qualitatively and quantitatively. The code and dataset are publicly available at https://github.com/RockWenJJ/SyreaNet.git." Real-Time Dense 3D Mapping of Underwater Environments,"Weihan Wang, Bharat Joshi, Nathaniel Burgdorfer, Konstantinos Batsos, Alberto Quattrini Li, Philippos Mordohai, Ioannis Rekleitis","Stevens Institute of Technology,University of South Carolina,Stevens Institute of Technology,Dartmouth College",Marine Robotics III,"This paper addresses real-time dense 3D reconstruction for a resource-constrained Autonomous Underwater Vehicle (AUV). 
Underwater vision-guided operations are among the most challenging as they combine 3D motion in the presence of external forces, limited visibility, and absence of global positioning. Obstacle avoidance and effective path planning require online dense reconstructions of the environment. Autonomous operation is central to environmental monitoring, marine archaeology, resource utilization, and underwater cave exploration. To address this problem, we propose to use SVIn2, a robust VIO method, together with a real-time 3D reconstruction pipeline. We provide extensive evaluation on four challenging underwater datasets. Our pipeline produces reconstructions comparable to those of COLMAP, the state-of-the-art offline 3D reconstruction method, at high frame rates on a single CPU." SM/VIO: Robust Underwater State Estimation Switching between Model-Based and Visual Inertial Odometry,"Bharat Joshi, Hunter Damron, Sharmin Rahman, Ioannis Rekleitis","University of South Carolina,Amazon",Marine Robotics III,"This paper addresses the robustness problem of visual-inertial state estimation for underwater operations. Underwater robots operating in a challenging environment are required to know their pose at all times. All vision-based localization schemes are prone to failure due to poor visibility conditions, color loss, and lack of features. The proposed approach utilizes a model of the robot’s kinematics together with proprioceptive sensors to maintain the pose estimate during visual-inertial odometry (VIO) failures. Furthermore, the trajectories from successful VIO and the ones from the model-driven odometry are integrated in a coherent set that maintains a consistent pose at all times. Health-monitoring tracks the VIO process ensuring timely switches between the two estimators. Finally, loop closure is implemented on the overall trajectory. The resulting framework is a robust estimator switching between model-based and visual-inertial odometry (SM/VIO). 
Experimental results from numerous deployments of the Aqua2 vehicle demonstrate the robustness of our approach over coral reefs and a shipwreck." Image-Based Visual Servoing Switchable Leader-Follower Control of Heterogeneous Multi-Agent Underwater Robot System,"Kanzhong Yao, Nathalie Bauschmann, Thies Lennart Alff, Wei Cheah, Daniel Andre Duecker, Keir Groves, Ognjen Marjanovic, Simon Watson","University of Manchester,Hamburg University of Technology,Technische Universität Hamburg,The University of Manchester,Technical University of Munich (TUM)",Marine Robotics III,"Confined and cluttered aquatic environments present a number of significant challenges with respect to inspection by robotic platforms, including localisation and communications. Some of these can be mitigated by using collaborative heterogeneous multi-robot teams. An important element of such a system is collaborative control. This paper addresses this challenge by presenting an Image-Based Visual Servoing (IBVS), leader-follower control system for heterogeneous aquatic robots. Experiments were conducted in an uncluttered pond to demonstrate the capabilities of the system. The results show robots can maintain tracking each other with maximum x and y displacements of 0.42 m and 0.41 m; the maximum projection distance in the xy-plane of maintaining formation is 0.45 m, showing the stability and feasibility of deploying such a system on underwater platforms." Buoyancy Enabled Autonomous Underwater Construction with Cement Blocks,"Samuel Lensgraf, Devin Balkcom, Alberto Quattrini Li",Dartmouth College,Marine Robotics III,"We present the first free-floating autonomous underwater construction system capable of using active ballasting to transport cement building blocks efficiently. It is the first free-floating autonomous construction robot to use a paired set of resources: compressed air for buoyancy and a battery for thrusters. 
In construction trials, our system built structures of up to 12 components and weighing up to 100Kg (75Kg in water). Our system achieves this performance by combining a novel one-degree-of-freedom manipulator, a novel two-component cement block construction system that corrects errors in placement, and a simple active ballasting system combined with compliant placement and grasp behaviors. The passive error correcting components of the system minimize the required complexity in sensing and control. We also explore the problem of buoyancy allocation for building structures at scale by defining a convex program which allocates buoyancy to minimize the predicted energy cost for transporting blocks." Mapping Waves with an Uncrewed Surface Vessel Via Gaussian Process Regression,"Thomas Sears, Michael Riley Cooper, Joshua Marshall",Queen's University,Marine Robotics III,"Mobile robots are well suited for environmental surveys because they can travel to any area of interest and react to observations without the need for pre-existing infrastructure or significant setup time. However, vehicle motion constraints limit where and when measurements occur. This is challenging for a single vehicle observing a time-varying phenomenon, such as coastal waves, but the ability to generate a spatiotemporal map would have immediate scientific and engineering applications. In this paper, an uncrewed surface vessel (USV) was used to measure waves on the coast of Lake Ontario, Canada. Data were collected from a low-cost inertial measurement system onboard the USV and processed in an offline Gaussian process regression (GPR) workflow to create a spatiotemporal wave model. Frequency analysis of raw sensor data was used to best select and design kernel functions, and to initialize hyperparameters. The relative speed of the waves limited the ability to make complete wave reconstructions, but GPR captured the dominant periodic components of the waves despite irregularities in the signals. 
After optimization, the hyperparameters indicate a dominant signal with a wave period of 0.87 s, which concurs with ground truth estimates." Enforcing Constraints for Dynamic Obstacle Avoidance by Compliant Robots,"Leonidas Koutras, Konstantinos Vlachos, George Kanakis, Fotios Dimeas, Zoe Doulgeri, George Rovithakis","Aristotle University of Thessaloniki,Aristotel University of Thessaloniki",Compliance and Impedance Control,"In this work a control scheme is proposed to enforce dynamic obstacle avoidance constraints to the full body of actively compliant robots. We argue that both compliance and accuracy are necessary to build safe collaborative robotic systems; obstacle avoidance is usually not enough, due to the reliance on perception systems which exhibit delays and errors. Our scheme is able to successfully avoid obstacles, while remaining compliant in the entirety of the executed task. Therefore, in case of unexpected collisions due to perception system errors, the robot remains safe for humans and its environment. Our approach is validated through experiments with simulated and real obstacles utilizing a 7-dof KUKA LBR iiwa robotic manipulator." Increasing Admittance of Industrial Robots by Velocity Feedback Inner-Loop Shaping,"Kangwagye Samuel, Kevin Haninger, Sehoon Oh","DGIST,Fraunhofer IPK",Compliance and Impedance Control,"Admittance and impedance controllers are often purely feedforward, using measured external force or motion, respectively, to generate a reference for an inner-loop controller. In this case, the range of dynamics which can be rendered is limited by the inner-loop, which causes, e.g. contact stability issues for low admittance industrial robots in stiff contact. When both position and force are measured, feedback control can be added to more flexibly reshape the rendered dynamics. This paper uses velocity feedback to increase the admittance of motion-controlled industrial robots in force control applications. 
This allows an industrial robot with a lower intrinsic admittance, which may be needed for payload, speed, or accuracy, to realize a higher admittance by control, allowing lighter manual guidance and safer contact. This is achieved by a modified disturbance observer, where an inverse dynamic model estimates external forces and amplifies them with positive feedback. This approach is compared with using positive velocity feedback with a shaping filter. Here, velocity reference calculated by the virtual admittance model is modified by the DOB (Dist-Add) or the positive velocity feedback (Vel-Add). When combined with an outer-loop admittance controller, these methods can render a higher admittance while maintaining contact stability compared to standard feedforward admittance control." Bounded Compensation with Friction Estimation for Accurate Motion Tracking and Compliant Behavior of Industrial Manipulators,"Dongwoo Ko, Donghyeon Lee, Wan Kyun Chung, Keehoon Kim","POSTECH,Pohang University of Science and Technology(POSTECH),POSTECH, Pohang University of Science and Technology",Compliance and Impedance Control,"This paper proposes a control structure for accurate tracking and compliant behavior of industrial manipulators without additional sensors. To achieve control objectives, friction, one of the biggest causes of performance degradation, should be compensated. For tracking performance, the estimated friction cancels most friction effects as a feed-forward, and the modified robust control structure eliminates the remaining friction uncertainty, which was originally equivalent to the disturbance observer. For compliant behavior, the compensation force, which is fed to the real plant, is bounded in contrast to the conventional DOB structure. The compensation bound could be determined through the experiments. The proposed method is validated by experiments with a 6-DOF collaborative industrial manipulator." 
A Passivity-Based Approach on Relocating High-Frequency Robot Controller to the Edge Cloud,"Xiao Chen, Hamid Sadeghian, Lingyun Chen, Mario Troebinger, Abdalla Swikir, Abdeldjallil Naceri, Sami Haddadin",Technical University of Munich,Compliance and Impedance Control,"As robots become more and more intelligent, the complexity of the algorithms behind them is increasing. Since these algorithms require high computation power from the onboard robot controller, the weight of the robot and energy consumption increases. A promising solution to tackle this issue is to relocate the expensive computation to the cloud. In this pioneering work, the possibility of relocating a state-of-the-art nonlinear control is investigated. To this end, the Unified Force-Impedance Controller (UFIC) is relocated to a remote location and high frequency feedback loop is established by including the remote controller in the loop. Passivity analysis is used to ensure the stability of the whole system, comprising the robot in interaction with the environment, the communication channel, as well as the remote controller. The instability associated with the communication channel is resolved by Time Domain Passivity Approach (TDPA). The performance of the proposed framework is experimentally evaluated on a robot arm in interaction with the environment. The results illustrate the stability of the system to a time-varying delay of up to $50 \pm 10$ ms." A Framework for Simultaneous Workpiece Registration in Robotic Machining Applications,"Steffan Lloyd, Rishad Irani, Mojtaba Ahmadi",Carleton University,Compliance and Impedance Control,"This article presents a novel framework called Simultaneous Registration and Machining (SRAM), a generalized method to improve workpiece registration using real-time acquired data in robotic contouring applications. 
The method allows for online corrections to the toolpath, while a live covariance estimate is simultaneously leveraged to adaptively tune the force controller aggressively when uncertainty is high, but conservatively otherwise to minimize chatter and instability. The SRAM framework is validated in simulation and shown to significantly reduce the path corrections required from the force controller, while correctly predicting optimal controller tuning adaptations. The SRAM method is proposed to improve force control stability, increase peripheral accuracy, smooth surface finish, and reduce cycle times in contouring applications." Contact Force Control with Continuously Compliant Robotic Legs,"Robin Bendfeld, C. David Remy",University of Stuttgart,Award Finalists 1,"This paper presents a novel robotic leg design and an associated control approach, which aims at providing an extension to the classical series elastic actuation concept. We propose to directly integrate the series compliance into the structure of the robotic leg itself, as opposed to co-locating spring and motor as done in traditional series elastic actuators. Our approach will eliminate mechanical design complexity and lead to a reduction of mass in the legs. This will, as a secondary benefit, improve the energy efficiency of locomotion. The primary contribution of this work is a model-based controller that can stably and precisely regulate the ground contact forces during stance. This control approach is demonstrated in a set of test-bench experiments, in which we control the contact forces of a modified version of the robotic leg scarleth. Here, the rigid shank is replaced by a continuously compliant element made of spring steel. This work presents the first step towards a new generation of robotic legs with structural compliance." Generalization of Impact Response Factors for Proprioceptive Collaborative Robots,"Carlos Relaño, Daniel Sanz-merodio, Miguel López Estévez, Concepción A. 
Monje","University Carlos III of Madrid,Arquimea Research Center",Compliance and Impedance Control,"Physical Human-Robot Interaction (pHRI) requires taking safety into account from the design board to the collaborative operation of any robot. For collaborative robotic environments, where human and machine are sharing space and interacting physically, the analysis and quantification of impacts becomes very relevant and necessary. Furthermore, analyses of this kind are a valuable source of information for the design of safer, more efficient pHRI. In the definition of the first parameter for dynamic impact analysis, the dynamic impact mitigation capacity was considered for certain configurations of the robot, but the design characteristics of the robot, such as the inertia of actuators, were not included. This paradigm changed when MIT presented the “impact mitigation factor” (IMF) with which, in addition to considering the ability of a certain robot to mitigate impacts for every configuration, it was possible to quantify backdriveability by taking the inertia of actuators into account for the calculation of the factor. However, IMF was proposed as a method to analyse floating robots like. This paper presents the Generalised Impact Absorption Factor (GIAF), suitable for both floating and fixed-base robots. GIAF is a valuable design parameter, as it provides information about the backdriveability of each joint, while allowing the comparison of impact response between floating and fixed-base robotic platforms. In this work, the mathematical definition of GIAF is developed and examples of possible uses of GIAF are presented" Robotic Fastening with a Manual Screwdriver,"Ling Tang, Yan-Bin Jia",Iowa State University,Compliance and Impedance Control,"The robotic hand is still no match for the human hand on many skills. 
Manipulation of hand tools, which usually requires sophisticated finger movements and fine controls, not only poses a clear technical challenge but also carries a great potential for enabling the robot to assist humans in a wide range of tasks accomplishable using tools. This paper takes a first step to investigate how a robotic arm mounts a rigidly attached screwdriver onto a screw (pre-mounted in a tapped hole) and then tightens it using the tool. Mounting begins with sliding the screwdriver tip on the screw head along preplanned paths to search for the drive and follows with rotating the screwdriver to drop the tip into the drive. Prevention of a slip off the screw head is achieved via impedance control to install a “virtual fence” along its boundary. Turning of the screw is conducted via hybrid position/admittance control based on modeling the reaction force between the screw and the substrate. Simulation results with a KUKA Arm demonstrate the smoothness of the entire action." Model and Acceleration-Based Pursuit Controller for High Performance Autonomous Racing,"Jonathan Becker, Nadine Imholz, Luca Schwarzenbach, Edoardo Ghignone, Nicolas Baumann, Michele Magno","ETH Zurich,ETH",Robot Control,"Autonomous racing is a research field gaining large popularity, as it pushes autonomous driving algorithms to their limits and serves as a catalyst for general autonomous driving. For scaled autonomous racing platforms, the computational constraint and complexity often limit the use of Model Predictive Control (MPC). As a consequence, geometric controllers are the most frequently deployed controllers. They prove to be performant while yielding implementation and operational simplicity. Yet, they inherently lack the incorporation of model dynamics, thus limiting the race car to a velocity domain where tire slip can be neglected. 
This paper presents Model- and Acceleration-based Pursuit (MAP) a high-performance model-based trajectory tracking controller that preserves the simplicity of geometric approaches while leveraging tire dynamics. The proposed algorithm allows accurate tracking of a trajectory at unprecedented velocities compared to State-of-the-Art (SotA) geometric controllers. The MAP controller is experimentally validated and outperforms the reference geometric controller four-fold in terms of lateral tracking error, yielding a tracking error of 0.055 m at tested speeds up to 11 m/s on a scaled racecar." Extremum Seeking-Based Adaptive Sliding Mode Control with Sliding Perturbation Observer for Robot Manipulators,"Muhammad Hamza Khan, Min Cheol Lee","Pusan National University.,Pusan National University",Robot Control,"This paper proposed an adaptive robust sliding mode control (SMC) with a nonlinear sliding perturbation observer (SPO) for robot manipulators. SPO estimates the perturbation (nonlinearities, uncertainties, and disturbances) with minimal system information and enhances the controller performance. The estimation is mainly dependent on the selection of SMCSPO gain, and if not tuned well, it might result in increased error dynamics of the system. Therefore, minimizing the error dynamics by improving the estimation is the primary goal of this research. In this regard, the current study accomplishes adaptation of controller gain in real-time by using an optimization technique called extremum seeking (ES). The quality adaptation is controlled with the help of a cost function. Based on the Lyapunov-based stability analysis of SMCSPO, the cost function consisting of the estimation error of the observer and error dynamics is proposed. The unique cost function now guarantees the tracking performance within the defined error tolerance. The effectiveness of the proposed algorithm is illustrated and validated in simulation and experiments. 
It is shown that the adaptation based on ES with the proposed cost function converges to the optimal control gain enabling the reduced estimation error and error dynamics with enhanced tracking performance." Experimental Validation of Functional Iterative Learning Control on a One-Link Flexible Arm,"Sjoerd Drost, Pietro Pustina, Franco Angelini, Alessandro De Luca, Gerwin Smit, Cosimo Della Santina","Delft University of Technology, Delft, The Netherlands,Sapienza University of Rome,University of Pisa,Delft University of Technology,TU Delft",Robot Control,"Performing precise, repetitive motions is essential in many robotic and automation systems. Iterative learning control (ILC) allows learning the necessary control command and using a rough system model to speed up the process. Functional Iterative Learning Control is a novel technique that promises to solve several limitations of classic ILC. It operates by merging the input space into a large functional space, resulting in an over-determined control task in the iteration domain. This way, it can deal with systems having more outputs than inputs and accelerate the learning process without resorting to model discretizations. However, the framework lacks so far of any experimental validation. This paper aims to provide such experimental validation in the context of robotics. To this end, we designed and built a one-link flexible arm that we actuate with a stepper motor - this way simultaneously making the development of an accurate model more challenging and the validation closer to the industrial practice. We provide multiple experimental results across several conditions, proving this method's feasibility in practice." 
Robust Output Feedback Controller for a Serial Robotic Manipulator with Unknown Nonlinearities and External Disturbances,"Mohammad Al Saaideh, Almuatazbellah Boker, Mohammad Al Janaideh","Memorial University of Newfoundland,Virginia Tech,Memorial University &University of Toronto",Robot Control,"This paper presents a robust output feedback controller for an n-joint serial robotic manipulator with unknown dynamics and external disturbances. First, the robotic manipulator’s dynamic model is formulated with unknown nonlinearities and joint coupling. Then, a robust backstepping controller is proposed to eliminate bounded external disturbances. Second, the extended high gain observer estimates the controller’s joint states and dynamics. Simulations and experiments on 4 DOF robotic manipulators verify the proposed control approach. The proposed control approach achieved the desired end-effector trajectory under unknown disturbances" Collaborative Control Based on Payload Leading for Multi-Quadrotors Transportation Systems,"Yuan Ping, Mingming Wang, Juntong Qi, Chong Wu, Jinjin Guo","Tianjin University,Shanghai University,EFY Intelligent Control (Tianjin) Technology Co., Ltd",Robot Control,"This paper presents a collaborative control method based on payload-leading for the multi-quadrotor transportation systems. The goal is to keep the relative distance between the quadrotors and the payload as constant as possible during the transportation, so as to ensure the stable attitude of the payload. The control mechanism consists of a guidance control law that generates the common desired velocity for the quadrotors, an internal feedback controller for each quadrotor, and a decentralized formation controller. The stability of the control structure is proved by Lyapunov theory. Finally, the experimental platform of the multi-quadrotor transportation system is built to verify the effectiveness of the control method. 
Experimental results show that the proposed method has an excellent control effect." Torque Control with Joints Position and Velocity Limits Avoidance,"Venus Pasandi, Daniele Pucci","Femto-st Institute,Italian Institute of Technology",Robot Control,"The design of a control architecture for providing the desired motion along with the realization of the joint limitation of a robotic system is still an open challenge in control and robotics. This paper presents a torque control architecture for fully actuated manipulators for tracking the desired time-varying trajectory while ensuring the joints position and velocity limits. The presented architecture stems from the parametrization of the feasible joints position and velocity space by exogenous states. The proposed parametrization transforms the control problem with constrained states to an un-constrained one by replacing the joints position and velocity with the exogenous states. With the help of Lyapunov-based arguments, we prove that the proposed control architecture ensures the stability and convergence of the desired joint trajectory along with the joints position and velocity limits avoidance. We validate the performance of proposed architecture through various simulations on a simple two-degree-of-freedom manipulator and the humanoid robot iCub." Low-Level Controller in Response to Changes in Quadrotor Dynamics,"Jaekyung Cho, Chan Kim, Mohamed Khalid M Jaffar, Michael W. Otte, Seong-woo Kim","Seoul national university,Seoul National University,University of Maryland, College Park,University of Maryland",Robot Control,"The dynamics of all real quadrotors inevitably differ even if they are the same product. In particular, the dynamics can change significantly during the flight due to additional device attachments or overheating motors. 
In this study, we focused on training a low-level controller, which operates in response to dynamics changes without prior knowledge or fine-tuning of the parameters, using reinforcement learning. We randomized the dynamics of quadrotors in the simulator and trained the policy based on dynamics information extracted from the state–action history through recurrent neural networks (RNNs). In addition, our experiment demonstrated the difficulties in applying existing actor-critic structures that extract dynamics information using end-to-end RNNs for unstable quadrotors; hence, we proposed a novel structure with better performance. Finally, the excellent performance of the proposed controller was verified by testing experiments that stabilize quadrotors with different dynamics. The experiment videos and the code can be found at https://github.com/jackyoung96/RNN-Quadrotor-controller." Biodegradable Origami Gripper Actuated with Gelatin Hydrogel for Aerial Sensor Attachment to Tree Branches,"Christian Geckeler, Benito Armas Pizzani, Stefano Mintchev","ETH Zürich,ETH Zurich",Manipulation and Control,"Forest canopies are vital ecosystems, but remain understudied due to difficult access. Forests could be monitored with a network of biodegradable sensors that break down into environmentally friendly substances at the end of their life. As a first step in this direction, this paper details the development of a biodegradable origami gripper to attach conventional sensors to branches, deployable with an aerial robot. Through exposure to sufficient moisture the gripper loses contractile force, dropping the sensor to the ground for easier collection. The origami design of the gripper as well as biodegradable materials selection is detailed, allowing for further extensions utilizing biodegradable origami. 
Both the gripper and the gelatin hydrogel used as an actuating elastic element for generating the grasping force are experimentally characterized, with the gripper demonstrating a maximum holding force of 1 N. Additionally, the degradation of the gripper until failure in the presence of moisture is also investigated, where the gripper can absorb up to 10 ml of water before falling off a branch. Finally, deployment of the gripper on a tree branch with an aerial robot is demonstrated. Overall, the biodegradable origami gripper represents a first step towards a more scalable and environmentally sustainable approach for ecosystem monitoring." PARSEC: An Aerial Platform for Autonomous Deployment of Self-Anchoring Payloads on Natural Vertical Surfaces,"Patrick Spieler, Skylar Wei, Monica Li, Andrew Galassi, Kyle Uckert, Arash Kalantari, Joel Burdick","JPL,Caltech,UC Berkeley,Jet Propulsion Laboratory,NASA JPL,California Institute of Technology",Manipulation and Control,"PARSEC (Payload Anchoring Robotic System for the Exploration of Cliffs) is an autonomy-equipped aerial manipulator that can deploy self-anchoring payloads on rocky vertical surfaces. It consists of a hexacopter and a two Degrees of Freedom (2 DoF) mass balancing manipulator, which can autonomously deploy a self-anchoring payload from its custom end-effector. The payload anchors itself via an actuated microspine gripper. Payload sensor data is wirelessly transmitted to the primary vehicle during and after deployment. A novel state machine controls the four-stage PARSEC deployment process. First, the rotorcraft brings the payload into contact with the surface and applies a constant 6 N normal force through a feedback control loop to preload the payload microspine gripper. Second, while the rotorcraft maintains the constant normal force, the gripper is commanded to close until engagement with the surface is confirmed through the current feedback sensing. 
Then, the aerial manipulator pulls with 5 N force on the anchored payload to ensure a secure grip before releasing the package and flying away. We present experimental validation of a successful deployment of a 430 g payload on a vertical vesicular basalt surface." Autonomous Control for Orographic Soaring of Fixed-Wing UAVs,"Tom Suys, Sunyou Hwang, Guido De Croon, Bart Remes","Delft University of Technology,TU Delft",Manipulation and Control,"We present a novel controller for fixed-wing UAVs that enables autonomous soaring in an orographic wind field, extending flight endurance. Our method identifies soaring regions and addresses position control challenges by introducing a target gradient line (TGL) on which the UAV achieves an equilibrium soaring position, where sink rate and updraft are balanced. Experimental testing validates the controller's effectiveness in maintaining autonomous soaring flight without using any thrust in a non-static wind field. We also demonstrate a single degree of control freedom in a soaring position through manipulation of the TGL." Stable Contact Guaranteeing Motion/Force Control for an Aerial Manipulator on an Arbitrarily Tilted Surface,"Jeonghyun Byun, Byeongjun Kim, Changhyeon Kim, Donggeon David Oh, H. Jin Kim",Seoul National University,Manipulation and Control,"This study aims to design a motion/force controller for an aerial manipulator which guarantees the tracking of time-varying motion/force trajectories as well as the stability during the transition between free and contact motions. To this end, we model the force exerted on the end-effector as the Kelvin-Voigt linear model and estimate its parameters by recursive least-squares estimator. Then, the gains of the disturbance-observer (DOB)-based motion/force controller are calculated based on the stability conditions considering both the model uncertainties in the dynamic equation and switching between the free and contact motions. 
To validate the proposed controller, we conducted the time-varying motion/force tracking experiments with different approach speeds and orientations of the surface. The results show that our controller enables the aerial manipulator to track the time-varying motion/force trajectories." Design and Control of a Micro Overactuated Aerial Robot with an Origami Delta Manipulator,"Eugenio Cuniato, Christian Geckeler, Maximilian Brunner, Dario Strübin, Elia Bähler, Fabian Ospelt, Marco Tognon, Stefano Mintchev, Roland Siegwart","ETH Zurich,ETH Zürich,Inria Rennes-Bretagne Atlantique",Manipulation and Control,"This work presents the mechanical design and control of a novel small-size and lightweight Micro Aerial Vehicle (MAV) for aerial manipulation. To our knowledge, with a total take-off mass of only 2.0 kg, the proposed system is the most lightweight Aerial Manipulator (AM) that has 8-DOF independently controllable: 5 for the aerial platform and 3 for the articulated arm. We designed the robot to be fully-actuated in the body forward direction. This allows independent pitching and instantaneous force generation, improving the platform’s performance during physical interaction. The robotic arm is an origami delta manipulator driven by three servomotors, enabling active motion compensation at the end-effector. Its composite multimaterial links help reduce the weight, while their flexibility allow for compliant aerial interaction with the environment. In particular, the arm’s stiffness can be changed according to its configuration. We provide an in depth discussion of the system design and characterize the stiffness of the delta arm. A control architecture to deal with the platform’s overactuation while exploiting the delta arm is presented. Its capabilities are experimentally illustrated both in free flight and physical interaction, highlighting advantages and disadvantages of the origami’s folding mechanism." 
Simplifying Aerial Manipulation Using Intentional Collisions,"Mark Nail, Nicholas Janne, Olivia Ma, Gabriel Arellano, Ella Atkins, Brent Gillespie",University of Michigan,Manipulation and Control,"Aerial manipulation describes a process that includes physical interaction between an unmanned aircraft system (UAS) and its environment. We aim to apply aerial manipulation to sample leaves and small branches from rainforest trees. Current approaches to aerial manipulation involve extended periods of UAS-environment interaction, during which forces and moments can lead to a loss in attitude or position control in underactuated multicopters. By adapting intelligent foot placement strategies found in dynamically stable hopping robots, this work proposes a strategy involving carefully managed intentional collisions between the UAS and its environment. We designed an attitude controller denoted a Velocity Matching controller that aligns a UAS-mounted pogo stick foot with the center of mass velocity vector during collision approach to maximize UAS ability to recover a hover state after collision. We propose the use of a flight envelope involving altitude and horizontal speed states to assess recoverability prior to initiating each approach to collision. We identify this flight envelope from a simulation study built on a model of flight in Conventional Waypoint Following and Velocity Matching control modes as well as a model of collision response. Experimental flight testing evaluates the simulation-based envelope resulting in an actual envelope that is somewhat smaller but similarly shaped to the envelope identified in simulation." 
Hierarchical Whole-Body Control of the Cable-Suspended Aerial Manipulator Endowed with Winch-Based Actuation,"Yuri Sarkisov, Andre Coelho, Maihara Gabrieli Santos, Min Jun Kim, Dzmitry Tsetserukou, Christian Ott, Konstantin Kondak","SberAutoTech,German Aerospace Center (DLR),Instituto Tecnologico de Aeronautica,KAIST,Toyohashi University of Technology,TU Wien,German Aerospace Center",Manipulation and Control,"During operation, aerial manipulation systems are affected by various disturbances. Among them is a gravitational torque caused by the weight of the robotic arm. Common propeller-based actuation is ineffective against such disturbances because of possible overheating and high power consumption. To overcome this issue, in this paper we propose a winch-based actuation for the crane-stationed cable-suspended aerial manipulator. Three winch-controlled suspension rigging cables produce a desired cable tension distribution to generate a wrench that reduces the effect of gravitational torque. In order to coordinate the robotic arm and the winch-based actuation, a model-based hierarchical whole-body controller is adapted. It resolves two tasks: keeping the robotic arm end-effector at the desired pose and shifting the system center of mass in the location with zero gravitational torque. The performance of the introduced actuation system as well as control strategy is validated through experimental studies." Heading for the Abyss: Control Strategies for Exploiting Swinging of a Descending Tethered Aerial Robot,"Max Polzin, Frank Centamori, Josie Hughes",EPFL,Manipulation and Control,"The use of aerial vehicles for exploration and data collection has the potential to significantly aid environmental monitoring in environments which are dangerous and hard to navigate. However, within these environments navigation can often be restricted by overhangs which are challenging to navigate, particularly so with the high payloads required for environmental monitoring. 
We propose utilizing a tethered bicopter with horizontal propellers. This spherical pendulum like system can exploit the tether, not only as a means of powering and recovering the robot, but also to assist its motion, i.e. by swinging to increase the workspace of the robot. Using PD-based control, we demonstrate how the system can be stabilized and bang-bang control to excite the system to achieve large amplitude swinging. By combining these controllers, we show how the system can be used to navigate in a glacial-inspired scenario where there are overhangs and obstacles through which the robot must navigate." Vector Field Aided Trajectory Tracking by a 10-Gram Flapping-Wing Micro Aerial Vehicle,"Abdoullah Ndoye, José De Jesús Castillo Zamora, Sabrine Samorah-laki, Romain Miot, Edwin Van Ruymbeke, Franck Ruffier","Aix Marseille Université, CNRS, ISM and Gipsa-Lab,Aix-Marseille Universite, ISM CNRS,Aix Marseille Université, CNRS, ISM,XTIM - Bionic Bird,CNRS / Aix-Marseille Univ.",Manipulation and Control,"Here we describe how a 10-gram Flapping-Wing Micro Aerial Vehicle (FWMAV) was able to perform an automatic trajectory tracking task based on a vector field method. In this study, the desired heading was provided by a vector field which was computed depending on the desired trajectory. The FWMAV heading was changed by a rear steering mechanism. This rear mechanism simultaneously (i) tenses one wing and relaxes the opposite wing, and (ii) moves the rudder in the same direction as the wing is relaxed. Due to the complex dynamics, system identification methods were used to identify simple linear models using a set of dedicated free flight tests. This yaw and roll simple models help to adjust the yaw controller and the inner loop roll controller. The experimental results obtained here show that a time-independent vector field-based strategy is robust to various initial position and/or speed conditions. 
The task of tracking circular and 8-shaped trajectories was accomplished successfully over tens of meters." Globally Defined Dynamic Modelling and Geometric Tracking Controller Design for Aerial Manipulator,"Byeongjun Kim, Dongjae Lee, Jeonghyun Byun, H. Jin Kim",Seoul National University,Manipulation and Control,"This study presents a globally defined dynamics for a conventional multirotor equipped with a single n-DOF manipulator using modified Lagrangian dynamics. This enables the reformulation of entire dynamics directly on SO(3) without exploiting any local coordinates, and thus problems such as the singularity of Euler angles can be avoided. Since skew-symmetric property of Coriolis matrix C and inertia matrix facilitates stability analysis, we propose a method to compute C which guarantees the skew-symmetric property by considering C as a summation of two sub matrices. Then, a geometric tracking controller is designed based on decoupled dynamics applying passive decomposition. The proposed controller guarantees almost global region of attraction. We validate our method via consecutive aerial flipping experiments." FlowDrone: Wind Estimation and Gust Rejection on UAVs Using Fast-Response Hot-Wire Flow Sensors,"Nate Simon, Allen Z. Ren, Alex Pique, David Snyder, Daphne Barretto, Marcus Hultmark, Anirudha Majumdar",Princeton University,Manipulation and Control,"Unmanned aerial vehicles (UAVs) are finding use in applications that place increasing emphasis on robustness to external disturbances including extreme wind. However, traditional multirotor UAV platforms do not directly sense wind; conventional flow sensors are too slow, insensitive, or bulky for widespread integration on UAVs. Instead, drones typically observe the effects of wind indirectly through accumulated errors in position or trajectory tracking. 
In this work, we integrate a novel flow sensor based on micro-electro-mechanical systems (MEMS) hot-wire technology developed in our prior work onto a multirotor UAV for wind estimation. Our sensor is omnidirectional (in the plane), lightweight, fast, and accurate. In order to achieve superior hover performance in windy conditions, we train a 'wind-aware' residual-based controller via reinforcement learning using simulated wind gusts and their aerodynamic effects on the drone. In extensive hardware experiments, we demonstrate the wind-aware controller outperforming two strong 'wind-unaware' baseline controllers in challenging windy conditions." AutoCharge: Autonomous Charging for Perpetual Quadrotor Missions,"Alessandro Saviolo, Jeffrey Mao, Roshan Balu Thalaivirithan Margabandu Balakr, Vivek Radhakrishnan, Giuseppe Loianno","New York University,Technology Innovation Institute, New York University",Manipulation and Control,"Battery endurance represents a key challenge for long-term autonomy and long-range operations, especially in the case of aerial robots. In this paper, we propose AutoCharge, an autonomous charging solution for quadrotors that combines a portable ground station with a flexible, lightweight charging tether and is capable of universal, highly efficient, and robust charging. We design and manufacture a pair of circular magnetic connectors to ensure a precise orientation-agnostic electrical connection between the ground station and the charging tether. Moreover, we supply the ground station with an electromagnet that largely increases the tolerance to localization and control errors during the docking maneuver, while still guaranteeing smooth un-docking once the charging process is completed. We demonstrate AutoCharge on a perpetual 10-hour quadrotor flight experiment and show that the docking and un-docking performance is solidly repeatable, enabling perpetual quadrotor flight missions."
DQN-Based On-Line Path Planning Method for Automatic Navigation of Miniature Robots,"Jialin Jiang, Lidong Yang, Li Zhang","The Chinese University of HONG KONG,The Hong Kong Polytechnic University,The Chinese University of Hong Kong",Micro Robotics,"Untethered magnetic microrobots with controllable locomotion property and multiple functions have attracted lots of attention in recent years. Owing to the small scale, microrobots with automatic navigation possess a promising perspective for biomedical applications including precise delivery and targeted therapy in confined and narrow space, especially for in vivo scenarios. However, the practical working environment for microrobots can be various, dynamic and complicated, and a path planning algorithm applicable to both dynamic obstacle avoidance and planning in maze-like environments still remains a challenge. Furthermore, considering the sizes, different types of microrobots may occupy different proportions of the field of vision. The safe distance between the way-points and the obstacles needs to be taken into account. In this work, we proposed a reinforcement learning-based strategy capable of real-time path planning for microrobots in different scales. The reference moving direction at each control period is provided by a deep Q network (DQN) according to the local surrounding environment, and the corresponding control magnetic field is generated via a 3-axis Helmholtz coil system. A disturbance observer (DOB) is responsible for the locomotion state observation and direction error compensation. Experiments demonstrate the effectiveness of our proposed strategy using microrobots with different locomotion mechanisms and scales, in both virtual dynamic obstacle environments and channel-like environments."
Rendezvous and Docking of Magnetic Helical Microrobots Along Arc Orbits for Field-Directed Assembly and Disassembly,"Shuideng Wang, Zejie Yu, Chaojian Hou, Kun Wang, Lixin Dong","City University of Hongkong,City University of Hong Kong",Micro Robotics,"Due to the limited cargo/functional element loading and other capabilities of individual microrobots, assembling them for locomotion and disassembling them as arriving at the target is more effective. An approach called rendezvous and docking is proposed in this paper to control the assembly and disassembly of helical microrobots actuated by a uniform rotating magnetic field. Docking is realized around the intersection of their arc orbits with the assistance of a fluidic field. To adjust the distance between the adjacent helical robots suspending in solution but with a distance beyond the acceptable range of the magnetic interaction, their asynchronized velocities are achieved using the interaction between the robots and fluids. For robots rotating at different speeds around their longitudinal axes at a driving frequency lower than the cut-off frequency, different fluidic flows will be generated. Based on the interaction between the robots and the fluids, the translational trajectory paths may be tuned, causing the adjacent robots to move closer. Docking along the tangential direction of rendezvous arc trajectories avoids the instability of the helical robot rotating around the radial direction and the problem of excessive linear speed at the end during assembly so that the robot can rotate stably around its axis while completing the assembly. Besides these, assembled microrobots can also lower the requirements on the imaging resolution of motion tracking and the forces for driving; hence much lower cost for both imaging and driving equipment." 
MRI-Powered Magnetic Miniature Capsule Robot with HIFU-Controlled On-Demand Drug Delivery,"Mehmet Efe Tiryaki, Fatih Doğangün, Cem Balda Dayan, Paul Wrede, Metin Sitti","Max Plank Institute for Intelligent Systems,Max Planck Institute for Intelligent Systems,Max Planck Institute for Intelligent Systems Stuttgart,Max-Planck Institute for Intelligent Systems",Award Finalists 1,"Magnetic resonance imaging (MRI)-guided robotic systems offer great potential for new minimally invasive medical tools, including MRI-powered miniature robots. By re-purposing the imaging hardware of an MRI scanner, the magnetic miniature robot could be navigated into the remote part of the patient's body without needing tethered endoscopic tools. However, state-of-the-art MRI-powered magnetic miniature robots have limited functionality besides navigation. Here, we propose an MRI-powered magnetic miniature capsule robot benefiting from acoustic streaming forces generated by MRI-guided high-intensity focused ultrasound (HIFU) for controlled drug release. Our design comprises a polymer capsule shell with a submillimeter-diameter drug-release hole that captures an air bubble functioning as a stopper. We use the HIFU pulse to initiate drug release by removing the air bubble once the capsule robot reaches the target location. By controlling acoustic pressure, we also regulate the drug release rate for multi-location targeting during navigation. We demonstrated that the proposed magnetic capsule robot could travel at high speed, up to 1.13 cm/s in ex vivo porcine small intestine, and release drug to multiple target sites in a single operation, using a combination of MRI-powered actuation and HIFU-controlled release. The proposed MRI-guided microrobotic drug release system will greatly impact minimally invasive medical procedures by allowing on-demand targeted drug delivery."
Structural Design and Frequency Tuning of Piezoelectric Energy Harvesters Based on Topology Optimization,"Abbas Homayouni-amlashi, Micky Rakotondrabe, Abdenbi Mohand-Ousaid","FEMTO-ST Institute, Université Bourgogne Franche,Laboratoire Génie de Production (LGP),University of Franche-Comte",Micro Robotics,"Vibrational piezoelectric energy harvesters (vPEH) are of great interest in several fields such as autonomous sensors and wireless sensor networks, bird tracking devices, or autonomous miniaturized robotic systems. They capture energy from mechanical vibrations available in the ambient environment and convert it into electrical energy to power those systems. Basically, a vPEH is composed of three main parts: the transducer mechanical structure, an electronic interface and the storage unit. In this paper, we focus on the optimization of the mechanical structure of the harvester. To this end, an optimization framework based on topology optimization is proposed. It combines the Solid Isotropic Material with Penalization (SIMP) approach with a frequency tuning technique to further increase the efficiency of the harvesters. The fundamental frequency of the design is tuned by considering the mass of the attachment as an optimization variable in addition to the classical density and polarity variables. Two numerical examples, including a new piezoelectric energy harvester configuration, are investigated to demonstrate the effectiveness of the topology optimization framework." Input-Output Boundedness of a Magnetically-Actuated Helical Device,"Leendert-Jan Wouter Ligtenberg, Islam S. M. Khalil",University of Twente,Micro Robotics,"To date, all previous research in the wireless magnetic actuation of untethered helical devices has achieved motion stability using feedback control in vitro. However, feedback control systems are likely to be affected by the increased sensory uncertainty during in vivo trials.
In this study we investigate the input-output boundedness of an interconnection between a helical device and a single rotating magnet actuator in low-Reynolds-number regime. Using the resistive-force theory, the interconnection is expressed in terms of all possible input-output pairs. Inputs representing the actuation frequency, pitch angle, lateral speed, and field strength are analyzed numerically and experimentally. We demonstrate input-output boundedness of the states of the helical device during circular and straight runs in open-loop, and we demonstrate bounded input-output propulsion without orienting the angle of attack (the often used input to swim horizontally without vertical drift) of the helical device to counteract gravity. Our results are important for a number of minimally invasive applications and tasks requiring improved control authority for stable runs of helical devices without drift due to gravity and without feedback control and restricted configuration imposed on the helical device’s motion." Atomic-Level Tracking and Analyzing of Quantum-Dot Motion Steered by an Electrostatic Field Positioned by a Nanorobotic Manipulation Tip,"Zhi Qu, Wenqi Zhang, Lixin Dong","City University of Hong Kong,City University of HongKong",Micro Robotics,"Field-control-based nanorobotic manipulation of ions at the single atomic level is an enabling technique for such applications as in-situ prototyping and characterization for fundamental research and rapid product development of nanoscale and quantum devices such as sensors, batteries, neuromorphic devices, and neuro/brain interfaces. Taking the motion of quantum dots (QDs) manipulated by an electrostatic field steered by a probe tip on a target surface as an example, here we show a deep-learning-based approach for their global motion tracking via the individual atoms both on the surface and inside the body. 
Transmission electron graphs, element analysis, and crystal topology acquired from an aberration-corrected transmission electron microscope (Cs-TEM) are used to identify the positions, types, and structures of the atoms to understand their kinematics. The results show the feasibility of multi-target tracking of homogeneous atoms by their spatial structure projection, which is very encouraging for further extension to the tracking and regulation of crystalline grains, swarms of ions, ion filaments, and single ions." 3D-Printed Adaptive Microgripper Driven by Thin-Film NiTi Actuators,"Sukjun Kim, Sarah Bergbreiter",Carnegie Mellon University,Micro Robotics,"Creating microscale actuated mechanisms in 3D space is extremely challenging due to limitations in microfabrication processes. In this work, we present a 3D-printed adaptive microgripper that is driven by thin-film NiTi microactuators with 3D-printed linkage mechanisms. The microgripper’s fingers are passively adaptive so that the microgripper can provide conformal gripping on 3D objects. The microgripper can move its fingers by 225 μm and apply a blocking force of 30 μN per one finger when 20 mA was applied to the NiTi actuators. The microgripper was also integrated onto a printed circuit board with a current regulating circuit and a 9 V battery. Since the NiTi actuator requires a low voltage for actuation, the microgripper could be integrated with simple and affordable electronics. The fully integrated microgripper system was demonstrated playing with a shape sorting box at the microscale for the first time." Automatic Cell Rotation Method Based on Deep Reinforcement Learning,"Huiying Gong, Yujie Zhang, Yaowei Liu, Qili Zhao, Xin Zhao, Mingzhu Sun",Nankai University,Micro Robotics,"Cell rotation is widely used to adjust cell posture in sub-cellular micromanipulations. 
The trajectory planning of the injection micropipette is needed, so that the cells can be rotated with the minimum deformation to reduce cell damage and keep cell viability. Due to the uncertainty of cell properties and manipulation environment, it is difficult to identify the parameters of the mechanical models in traditional robotic cell rotation methods. In this paper, deep reinforcement learning is introduced into cell manipulation for the first time to perform trajectory planning of the micropipette. We first abstract the cell rotation process by using the mechanical model and microscopic vision techniques and build a cell rotation simulation environment. Then we design a reward function by combining various factors of cell rotation and implement a reinforcement learning framework based on deep Q-learning (DQL). Finally, we train the cell rotation process based on the deep reinforcement learning algorithm. The simulation results indicate the proposed DQL agent achieved an average success rate of 97% without useless exploration. Moreover, the proposed method rotated the cells in a way that causes less mechanical damage than humans, demonstrating the DRL ability for cell rotation with high efficiency and low cell damage." Noncontact Particle Manipulation on Water Surface with Ultrasonic Phased Array System and Microscopic Vision,"Yexin Zhang, Jiaqi Li, Yuyu Jia, Teng Li, Hu Su, Song Liu, David C. Jeong, Yang Wang","ShanghaiTech University,Tsinghua University,Institute of Automation, Chinese Academy of Science,Santa Clara University,Shanghaitech University",Micro Robotics,"Noncontact particle manipulation (NPM) shows greater application potential than its conventional counterpart, particularly in terms of non-invasiveness, and thus has significantly extended robotic manipulation capacity into bio-medical engineering, material science, etc.
As NPM by means of electric, magnetic, and optical field has successfully demonstrated powerful strength in both academia and industry, NPM boosted by acoustic field, however, still faces staggering challenges. It is indeed in the very recent years that controllable dynamic airborne or waterborne acoustic field modulation technology emerged in academia. In this paper, we report our latest research regarding dexterous and dynamic noncontact micro-particle manipulation on water surface effected by acoustic field in terms of automated trapping, closed-loop positioning, and real-time motion planning, which can be applied to scenarios such as parallel 3D printing, cell assembly, etc. The main contribution of this work is we demonstrated the feasibility of objective-oriented and fully automated acoustic manipulation of micro-particle in precision scale based on robotic approach in 2D plane. Experiment results showed that the repetitive positioning accuracy can reach as high as 16 μm, which is essentially the pixel scale factor." Real-Time Acoustic Holography with Iterative Unsupervised Learning for Acoustic Robotic Manipulation,"Chengxi Zhong, Zhenhuan Sun, Teng Li, Hu Su, Song Liu","ShanghaiTech University,Shanghaitech University,Tsinghua University,Institute of Automation, Chinese Academy of Science",Micro Robotics,"Phase-only acoustic holography is a fundamental and promising technique for contactless robotic manipulation. Through independently controlling phase-only hologram (POH) of phase array of transducers (PAT) and simultaneously driving each channel by sophisticated circuits, a certain acoustic field is dynamically generated in working medium (e.g., air, water or biological tissues) at certain moment. The phase profile of PAT is required dynamically and precisely as per arbitrary expected acoustic field for the sake of versatile and stable robotic manipulation. 
However, the most conventional methods rely on iterative optimization algorithms which are inevitably time-consuming and probably non-convergent, moreover hindering versatility and fidelity of acoustic robotic manipulation. To address these issues, this paper reports a real-time phase-only acoustic holography algorithm by virtue of iterative unsupervised learning. Using a physics model to construct two queues, which we refer to as experience pools, data pairs consisting of a target acoustic amplitude hologram in expected acoustic field and corresponding POH of PAT are collected on-the-fly, circumventing costly preparation of annotated dataset in advance. With iterative learning between neural network training and experience pools update, both the solution of objective inverse mapping and the adaptation for arbitrary desired acoustic field are mutually enhanced. The experiments and results validated that the proposed approach surpasses previous algorithms in terms of real time and precision." ROSMC: A High-Level Mission Operation Framework for Heterogeneous Robotic Teams,"Ryo Sakagami, Sebastian Georg Brunner, Andreas Dömel, Armin Wedler, Freek Stulp","German Aerospace Center (DLR),DLR German Aerospace Center, Robotics and Mechatronics Center,DLR - German Aerospace Center,DLR - Deutsches Zentrum für Luft- und Raumfahrt e.V.",Multi-Robot Systems III,"Heterogeneous teams of multiple mobile robots will be important for future scientific explorations of extraterrestrial surfaces or hazardous areas. Mission operation in such harsh, unknown environments poses diverse challenges. Robots need to cooperate autonomously due to the large network latency to the ground station while operators need to adapt the ongoing mission flexibly based on new discoveries obtained during execution. Furthermore, shared situational awareness between operators and roboticists is highly required to deal with execution failures promptly. 
To overcome these challenges, this paper proposes the high-level mission operation framework ROSMC. The concept of mission synchronization to robots enables continuous mission adaptations and future planning by operators while robots execute the mission autonomously. The ROS-based GUIs enable operators to intuitively create and monitor the mission for robots as well as to communicate with roboticists smoothly. The proposed framework was evaluated by a pilot study with a simulator and demonstrated at a Moon-analogue field on Mt. Etna in Sicily, Italy, involving 3 robots and around 70 researchers for 4 weeks." Non-Cooperative Stochastic Target Encirclement by Anti-Synchronization Control Via Range-Only Measurement,"Fen Liu, Shenghai Yuan, Wei Meng, Rong Su, Lihua Xie","Guangdong University of Technology,NANYANG TECHNOLOGICAL UNIVERSITY,Nanyang Technological University,NanyangTechnological University",Multi-Robot Systems III,... Estimation of Continuous Environments by Robot Swarms: Correlated Networks and Decision-Making,"Mohsen Raoufi, Pawel Romanczuk, Heiko Hamann","Technical University of Berlin,Humboldt-University Berlin,University of Konstanz",Multi-Robot Systems III,"Collective decision-making is an essential capability of large-scale multi-robot systems to establish autonomy on the swarm level. A large portion of literature on collective decision-making in swarm robotics focuses on discrete decisions selecting from a limited number of options. Here we assign a decentralized robot system with the task of exploring an unbounded environment, finding consensus on the mean of a measurable environmental feature, and aggregating at areas where that value is measured (e.g., a contour line). A unique quality of this task is a causal loop between the robots' dynamic network topology and their decision-making.
For example, the network's mean node degree influences time to convergence while the currently agreed-on mean value influences the swarm's aggregation location, hence, also the network structure as well as the precision error. We propose a control algorithm and study it in real-world robot swarm experiments in different environments. We show that our approach is effective and achieves higher precision than a control experiment. We anticipate applications, for example, in containing pollution with surface vehicles." FogROS2: An Adaptive Platform for Cloud and Fog Robotics Using ROS 2,"Jeffrey Ichnowski, Kaiyuan Chen, Karthik Dharmarajan, Simeon Oluwafunmilore Adebola, Michael Danielczuk, Victor Mayoral-Vilches, Nikhil Jha, Hugo Zhan, Edith Llontop, Derek Xu, Camilo Buscaron, John Kubiatowicz, Ion Stoica, Joseph E. Gonzalez, Ken Goldberg","Carnegie Mellon University,University of California, Berkeley,UC Berkeley,Klagenfurt University,University of California, Berkely,Anytime.ai",Multi-Robot Systems III,"Mobility, power, and price points often dictate that robots do not have sufficient computing power on board to run contemporary robot algorithms at desired rates. Cloud computing providers such as AWS, GCP, and Azure offer immense computing power and increasingly low latency on demand, but tapping into that power from a robot is non-trivial. We present FogROS2, an open-source platform to facilitate cloud and fog robotics that is included in the Robot Operating System 2 (ROS 2) distribution. FogROS2 is distinct from its predecessor FogROS1 in 9 ways, including lower latency, overhead, and startup times; improved usability, and additional automation, such as region and computer type selection. Additionally, FogROS2 gains performance, timing, and additional improvements associated with ROS 2. In common robot applications, FogROS2 reduces SLAM latency by 50%, reduces grasp planning time from 14s to 1.2s, and speeds up motion planning 45x. 
When compared to FogROS1, FogROS2 reduces network utilization by up to 3.8x, improves startup time by 63%, and network round-trip latency by 97% for images using video compression. The source code, examples, and documentation for FogROS2 are available at https://github.com/BerkeleyAutomation/FogROS2 and through the official ROS 2 repository at https://index.ros.org/p/fogros2/." Stackelberg Games for Learning Emergent Behaviors During Competitive Autocurricula,"Boling Yang, Liyuan Zheng, Lillian J. Ratliff, Byron Boots, Joshua R. Smith",University of Washington,Multi-Robot Systems III,"Autocurricular training is an important sub-area of multi-agent reinforcement learning (MARL) that allows multiple agents to learn emergent skills in an unsupervised co-evolving scheme. The robotics community has experimented with autocurricular training on physically grounded problems, such as robust control and interactive manipulation tasks. However, the asymmetric nature of these tasks makes the generation of sophisticated policies challenging. Indeed, the asymmetry in the environment may implicitly or explicitly provide an advantage to a subset of agents which could, in turn, lead to a low-quality equilibrium. This paper proposes a novel game-theoretic algorithm, Stackelberg Multi-Agent Deep Deterministic Policy Gradient (ST-MADDPG), which formulates a two-player MARL problem as a Stackelberg game with one player as the 'leader' and the other as the 'follower' in a hierarchical interaction structure wherein the leader has an advantage. We first demonstrate that the leader's advantage from ST-MADDPG can be used to alleviate the inherent asymmetry in the environment. By exploiting the leader's advantage, ST-MADDPG improves the quality of a co-evolution process and results in more sophisticated and complex strategies that work well even against an unseen strong opponent."
On Legible and Predictable Robot Navigation in Multi-Agent Environments,"Jean-Luc Bastarache, Christopher Nielsen, Stephen L. Smith",University of Waterloo,Multi-Robot Systems III,"Legible motion is intent-expressive, which, when employed during social robot navigation, allows others to quickly infer the intended avoidance strategy. Predictable motion matches an observer's expectation which, during navigation, allows others to confidently carry out the interaction. In this work, we present a navigation framework capable of reasoning on its legibility and predictability with respect to dynamic interactions, e.g., a passing side. Our approach generalizes the previously formalized notions of legibility and predictability by allowing dynamic goal regions in order to navigate in dynamic environments. This generalization also allows us to quantitatively evaluate the legibility and the predictability of trajectories with respect to navigation interactions. Our approach is shown to promote legible behavior in ambiguous scenarios and predictable behavior in unambiguous scenarios. In a multi-agent environment, this yields an increase in safety while remaining competitive in terms of goal-efficiency when compared to other robot navigation planners in multi-agent environments." Explainable Action Advising for Multi-Agent Reinforcement Learning,"Yue (Sophie) Guo, Joseph Campbell, Simon Stepputtis, Ruiyu Li, Dana Hughes, Fei Fang, Katia Sycara",Carnegie Mellon University,Multi-Robot Systems III,"Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm. An expert teacher provides advice to a student during training in order to improve the student’s sample efficiency and policy performance. Such advice is commonly given in the form of state-action pairs. However, it makes it difficult for the student to reason with and apply to novel states.
We introduce Explainable Action Advising, in which the teacher provides action advice as well as associated explanations indicating why the action was chosen. This allows the student to self-reflect on what it has learned, enabling advice generalization and leading to improved sample efficiency and learning performance – even in environments where the teacher is sub-optimal. We empirically show that our framework is effective in both single-agent and multi-agent scenarios, yielding improved policy returns and convergence rates when compared to state-of-the-art methods." A Complete Set of Connectivity-Aware Local Topology Manipulation Operations for Robot Swarms,"Karthik Soma, Koresh Khateri, Mahdi Pourgholi, Mohsen Montazeri, Lorenzo Sabattini, Giovanni Beltrame","École Polytechnique de Montréal,Shahid Beheshti University,University of Modena and Reggio Emilia,Ecole Polytechnique de Montreal",Multi-Robot Systems III,"The topology of a robotic swarm affects the convergence speed of consensus and the mobility of the robots. In this paper, we prove the existence of a complete set of local topology manipulation operations that allow the transformation of a swarm topology. The set is complete in the sense that any other possible set of manipulation operations can be performed by a sequence of operations from our set. The operations are local as they depend only on the first and second hop neighbors' information to transform any initial spanning tree of the network's graph to any other connected tree with the same number of nodes. The flexibility provided by our method is similar to global methods that require full knowledge of the swarm network. We prove the existence of a sequence of transformations for any tree-to-tree transformation, and derive sequences of operations to form a line or star from any initial spanning tree. 
Our work provides a theoretical and practical framework for topological control of a swarm, establishing global properties using only local information." Decentralized Multi-Agent Exploration with Limited Inter-Agent Communications,"Hans He, Alec Koppel, Amrit Bedi, Daniel Stilwell, Mazen Farhood, Benjamin Adams Biggs","Virginia Tech,JP Morgan Chase,University of Maryland, College Park,Virginia Polytechnic Institute and State University",Multi-Robot Systems III,"We consider the problem of decentralized multiagent environmental learning through maximizing the joint information gain among a team of agents. Inspired by subsea applications where bandwidth is severely limited, we explicitly consider the challenge of restricted communication between agents. The environment is modeled as a Gaussian process (GP), and the global information gain maximization problem in a GP is a set-valued optimization problem involving all agents’ locally acquired data. We develop a decentralized method to solve it based on decomposition of information gain and exchange of limited subsets of data between agents. A key technical novelty of our approach is that we formulate the incentives for information exchange among agents as a submodular set optimization problem in terms of the log-determinant of their local covariance matrices. Numerical experiments on real-world data demonstrate the ability of our algorithm to explore tradeoff between objectives. In particular, we demonstrate favorable performance on mapping problems where both decentralized information gathering and limited information exchange are essential." 
A Distributed Online Optimization Strategy for Cooperative Robotic Surveillance,"Lorenzo Pichierri, Guido Carnevale, Lorenzo Sforni, Andrea Testa, Giuseppe Notarstefano","University of Bologna,Alma Mater Studiorum - Università di Bologna",Multi-Robot Systems III,"In this paper, we propose a distributed algorithm to control a team of cooperating robots aiming to protect a target from a set of intruders. Specifically, we model the strategy of the defending team by means of an online optimization problem inspired by the emerging distributed aggregative framework. In particular, each defending robot determines its own position depending on (i) the relative position between an associated intruder and the target, (ii) its contribution to the barycenter of the team, and (iii) collisions to avoid with its teammates. We highlight that each agent is only aware of local, noisy measurements about the location of the associated intruder and the target. Thus, in each robot, our algorithm needs to (i) locally reconstruct global unavailable quantities and (ii) predict its current objective functions starting from the local measurements. The effectiveness of the proposed methodology is corroborated by simulations and experiments on a team of cooperating quadrotors." Risk-Aware Recharging Rendezvous for a Collaborative Team of UAVs and UGVs,"Ahmad Bilal Asghar, Guangyao Shi, Nare Karapetyan, James Humann, Jean-paul Reddinger, James Dotterweich, Pratap Tokekar","University of Maryland,DEVCOM Army Research Laboratory,,Engility Corp.",Multi-Robot Systems III,"We introduce and investigate the recharging rendezvous problem for a collaborative team of Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs), in which UAVs with limited battery capacity and UGVs persistently monitor an area. The UGVs also act as mobile recharging stations for the UAVs. 
In contrast to prior work on such problems, we consider the challenge of dealing with stochastic energy consumption in a risk-aware fashion. Specifically, we consider a bi-criteria optimization problem of minimizing the time taken by the UAVs on recharging detours while ensuring that the probability that no UAV runs out of charge is greater than a user-defined risk tolerance. This problem (termed the Risk-aware Recharging Rendezvous Problem (RRRP)) is a combinatorial problem with a matching constraint --- to ensure UAVs are assigned to the limited UGV recharging slots, and a knapsack constraint --- to capture the risk tolerance. We propose a novel bicriteria approximation algorithm to solve RRRP and demonstrate its effectiveness in the context of a persistent monitoring mission compared to baseline methods." Cross-Agent Relocalization for Decentralized Collaborative SLAM,"Philipp Baenninger, Ignacio Alzugaray, Marco Karrer, Margarita Chli","ETH Zurich,Imperial College London",Multi-Robot Systems III,"State-of-the-art decentralized collaborative Simultaneous Localization And Mapping (SLAM) systems crucially lack the ability to effectively use well-mapped areas generated by other agents in the team for relocalization. This often leads to map redundancy between agents, inefficient communication, and the need for costly re-mapping of areas previously mapped by other agents. In this work, we propose a strategy to efficiently share the areas mapped by different agents in a collaborative, decentralized SLAM system. This approach directly addresses map redundancy while maintaining the consistency of the estimates across the agents and keeping the overall system scalable in terms of cross-agent communication and individual computational effort. Our method leverages covisibility information between keyframes instantiated by different agents to transfer local sub-maps on-the-fly in a completely decentralized, peer-to-peer fashion. 
A globally consistent estimate is achieved by solving a distributed bundle adjustment problem using the Alternating Direction Method of Multipliers (ADMM), where we enforce constraints on shared map points and keyframes across agents." Planning with Occluded Traffic Agents Using Bi-Level Variational Occlusion Models,"Filippos Christianos, Peter Karkus, Boris Ivanovic, Stefano V. Albrecht, Marco Pavone","University of Edinburgh,NVIDIA,Stanford University",Intelligent Transportation Systems III,"Reasoning with occluded traffic agents is a significant open challenge for planning for autonomous vehicles. Recent deep learning models have shown impressive results for predicting occluded agents based on the behaviour of nearby visible agents; however, as we show in experiments, these models are difficult to integrate into downstream planning. To this end, we propose Bi-level Variational Occlusion Models (BiVO), a two-step generative model that first predicts likely locations of occluded agents, and then generates likely trajectories for the occluded agents. In contrast to existing methods, BiVO outputs a trajectory distribution which can then be sampled from and integrated into standard downstream planning. We evaluate the method in closed-loop replay simulation using the real-world NuScenes dataset. Our results suggest that BiVO can successfully learn to predict occluded agent trajectories, and these predictions lead to better subsequent motion plans in critical scenarios." Robust Forecasting for Robotic Control: A Game-Theoretic Approach,"Shubhankar Agarwal, David Fridovich-Keil, Sandeep Chinchali",The University of Texas at Austin,Intelligent Transportation Systems III,"Modern robots require accurate forecasts to make optimal decisions in the real world. For example, self-driving cars need an accurate forecast of other agents' future actions to plan safe trajectories. Current methods rely heavily on historical time series to accurately predict the future. 
However, relying entirely on the observed history is problematic since it could be corrupted by noise, have outliers, or not completely represent all possible outcomes. To solve this problem, we propose a novel framework for generating robust forecasts for robotic control. In order to model real-world factors affecting future forecasts, we introduce the notion of an adversary, which perturbs observed historical time series to increase a robot's ultimate control cost. Specifically, we model this interaction as a zero-sum two-player game between a robot's forecaster and this hypothetical adversary. We show that our proposed game may be solved to a local Nash equilibrium using gradient-based optimization techniques. Furthermore, we show that a forecaster trained with our method performs 30.14% better on out-of-distribution real-world lane change data than baselines." Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios,"Zhili Zhang, Songyang Han, Jiangwei Wang, Fei Miao",University of Connecticut,Intelligent Transportation Systems III,"Communication technologies enable coordination among connected and autonomous vehicles (CAVs). However, it remains unclear how to utilize shared information to improve the safety and efficiency of the CAV system in dynamic and complicated driving scenarios. In this work, we propose a framework of constrained multi-agent reinforcement learning (MARL) with a parallel Safety Shield for CAVs in challenging driving scenarios that includes unconnected hazard vehicles. The coordination mechanisms of the proposed MARL include information sharing and cooperative policy learning, with Graph Convolutional Network (GCN)-Transformer as a spatial-temporal encoder that enhances the agent's environment awareness. The Safety Shield module with Control Barrier Functions (CBF)-based safety checking protects the agents from taking unsafe actions. 
We design a constrained multi-agent advantage actor-critic (CMAA2C) algorithm to train safe and cooperative policies for CAVs. With the experiment deployed in the CARLA simulator, we verify the performance of the safety checking, spatial-temporal encoder, and coordination mechanisms designed in our method by comparative experiments in several challenging scenarios with unconnected hazard vehicles. Results show that our proposed methodology significantly increases system safety and efficiency in challenging scenarios." Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library,"Xinyu Cai, Wentao Jiang, Runsheng Xu, Wenquan Zhao, Jiaqi Ma, Si Liu, Yikang Li","Shanghai AI Laboratory,Beihang University,UCLA,Harbin Institute of Technology,University of California, Los Angeles,Sensetime Ltd.",Intelligent Transportation Systems III,"Recently, Vehicle-to-Everything (V2X) cooperative perception has attracted increasing attention. Infrastructure sensors play a critical role in this research field; however, how to find the optimal placement of infrastructure sensors is rarely studied. In this paper, we investigate the problem of infrastructure sensor placement and propose a pipeline that can efficiently and effectively find optimal installation positions for infrastructure sensors in a realistic simulated environment. To better simulate and evaluate LiDAR placement, we establish a Realistic LiDAR Simulation library that can simulate the unique characteristics of different popular LiDARs and produce high-fidelity LiDAR point clouds in the CARLA simulator. Through simulating point cloud data in different LiDAR placements, we can evaluate the perception accuracy of these placements using multiple detection models. 
Then, we analyze the correlation between the point cloud distribution and perception accuracy by calculating the density and uniformity of regions of interest. Experiments show that when using the same number and type of LiDAR, the placement scheme optimized by our proposed method improves the average precision by 15%, compared with the conventional placement scheme in the standard lane scene. We also analyze the correlation between perception performance in the region of interest and LiDAR point cloud distribution and validate that density and uniformity can be indicators of performance. Both the RLS Library and related code will be released at https://github.com/PJLab-ADG/LiDARSimLib-and-Placement-Evaluation." Uncertainty Quantification of Collaborative Detection for Self-Driving,"Sanbao Su, Yiming Li, Sihong He, Songyang Han, Chen Feng, Caiwen Ding, Fei Miao","University of Connecticut,New York University",Self-Driving Cars I,"Sharing information between connected and autonomous vehicles (CAVs) fundamentally improves the performance of collaborative object detection for self-driving. However, CAVs still have uncertainties on object detection due to practical challenges, which will affect the later modules in self-driving such as planning and control. Hence, uncertainty quantification is crucial for safety-critical systems such as CAVs. Our work is the first to estimate the uncertainty of collaborative object detection. We propose a novel uncertainty quantification method, called Double-M Quantification, which tailors a moving block bootstrap (MBB) algorithm with direct modeling of the multivariant Gaussian distribution of each corner of the bounding box. Our method captures both the epistemic uncertainty and aleatoric uncertainty with one inference pass based on the offline Double-M training process. And it can be used with different collaborative object detectors. 
Through experiments on the comprehensive collaborative perception dataset, we show that our Double-M method achieves more than 4× improvement on uncertainty score and more than 3% accuracy improvement, compared with the state-of-the-art uncertainty quantification methods. Our code is public on https://coperception.github.io/double-m-quantification/." WS-3D-Lane: Weakly Supervised 3D Lane Detection with 2D Lane Labels,"Jianyong Ai, Wenbo Ding, Jiuhua Zhao, Jiachen Zhong",SAIC AI Lab,Self-Driving Cars I,"Compared to 2D lanes, real 3D lane data is difficult to collect accurately. In this paper, we propose a novel method for training 3D lanes with only 2D lane labels, called weakly supervised 3D lane detection WS-3D-Lane. By assumptions of constant lane width and equal height on adjacent lanes, we indirectly supervise 3D lane heights in the training. To overcome the problem of the dynamic change of the camera pitch during data collection, a camera pitch self-calibration method is proposed. In anchor representation, we propose a double-layer anchor with non-maximum suppression (NMS) method, which enables the anchor-based method to predict two lane lines that are close. Experiments are conducted on the base of 3D-LaneNet under two supervision methods. Under weakly supervised setting, our WS-3D-Lane outperforms previous 3D-LaneNet: F-score rises to 92.3% on Apollo 3D synthetic dataset, and F1 rises to 74.5% on ONCE-3DLanes. Meanwhile, WS-3D-Lane in purely supervised setting makes more increments and outperforms state-of-the-art. To the best of our knowledge, WS-3D-Lane is the first try of 3D lane detection under weakly supervised setting. Our code is available on https://github.com/SAIC-Vision/WS-3D-Lane." 
One Training for Multiple Deployments: Polar-Based Adaptive BEV Perception for Autonomous Driving,"Huitong Yang, Xuyang Bai, Xinge Zhu, Yuexin Ma","ShanghaiTech University,Hong Kong University of Science and Technology,CUHK",Self-Driving Cars I,"Current on-board chips usually have different computing power, which means multiple training processes are needed for adapting the same learning-based algorithm to different chips, costing huge computing resources. The situation becomes even worse for 3D perception methods with large models. Previous vision-centric 3D perception approaches are trained with regular grid-represented feature maps of fixed resolutions, which is not applicable to adapt to other grid scales, limiting wider deployment. In this paper, we leverage the Polar representation when constructing the BEV feature map from images in order to achieve the goal of training once for multiple deployments. Specifically, the feature along rays in Polar space can be easily adaptively sampled and projected to the feature in Cartesian space with arbitrary resolutions. To further improve the adaptation capability, we make multi-scale contextual information interact with each other to enhance the feature representation. Experiments on a large-scale autonomous driving dataset show that our method outperforms others as for the good property of one training for multiple deployments." Deep Occupancy-Predictive Representations for Autonomous Driving,"Eivind Meyer, Lars Frederik Peiss, Matthias Althoff",Technische Universität München,Self-Driving Cars I,"Manually specifying features that capture the diversity in traffic environments is impractical. Consequently, learning-based agents cannot realize their full potential as neural motion planners for autonomous vehicles. Instead, this work proposes to learn which features are task-relevant. 
Given its immediate relevance to motion planning, our proposed architecture encodes the probabilistic occupancy map as a proxy for obtaining pre-trained state representations of the environment. By leveraging a map-aware traffic graph formulation, our agent-centric encoder generalizes to arbitrary road networks and traffic situations. We show that our approach significantly improves the downstream performance of a reinforcement learning agent operating in urban traffic environments." PriorLane: A Prior Knowledge Enhanced Lane Detection Approach Based on Transformer,"Qibo Qiu, Haiming Gao, Wei Hua, Gang Huang, Xiaofei He","Zhejiang Lab,Zhejiang University",Self-Driving Cars I,"Lane detection is one of the fundamental modules in self-driving. In this paper we employ a transformer-only method for lane detection, thus it could benefit from the blooming development of fully vision transformer and achieve the state-of-the-art (SOTA) performance on both CULane and TuSimple benchmarks, by fine-tuning the weight fully pre-trained on large datasets. More importantly, this paper proposes a novel and general framework called PriorLane, which is used to enhance the segmentation performance of the fully vision transformer by introducing the low-cost local prior knowledge. Specifically, PriorLane utilizes an encoder-only transformer to fuse the feature extracted by a pre-trained segmentation model with prior knowledge embeddings. Note that a Knowledge Embedding Alignment (KEA) module is adapted to enhance the fusion performance by aligning the knowledge embedding. Extensive experiments on our Zjlab dataset show that PriorLane outperforms SOTA lane detection methods by a 2.82% mIoU when prior knowledge is employed, and the code will be released at: https://github.com/vincentqqb/PriorLane." 
Reinforcement Learning with Probabilistically Safe Control Barrier Functions for Ramp Merging,"Soumith Udatha, Yiwei Lyu, John Dolan",Carnegie Mellon University,Self-Driving Cars I,"Prior work has looked at applying reinforcement learning (RL) and imitation learning (IL) approaches to autonomous driving scenarios, but either the safety or the efficiency of the algorithm is compromised with IL approaches being close to the dataset provided and RL methods not having well-curated reward functions. With the use of control barrier functions embedded into the RL policy, we arrive at safe policies to optimize the performance of the autonomous driving vehicle through the advantage of a safety layer over the RL methods to ease the design of reward functions. However, the control barrier functions need a good approximation of the model of the system. We use probabilistic control barrier functions to account for model uncertainty. The Safety-Assured Policy Optimization - Ramp Merging (SAPO-RM) algorithm is implemented online in the CARLA[1] Simulator and offline on the US I-80 dataset extracted from the NGSIM Database provided by NHTSA[2]. We further test the algorithm and perform ablation studies of it on the US-101 and exi-D datasets to compare the approaches. The proposed algorithm is not just a safe ramp merging algorithm, but a safe autonomous driving algorithm applied to address ramp merging on highways." Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms,"Resul Dagdanov, Halil Durmuş, Nazim Ure","Eatron Yazilim ve Muhendislik Teknolojileri A.S.,İstanbul Technical University,Istanbul Technical University",Self-Driving Cars I,"In this work, we propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents using black-box verification methods. 
RL algorithms have become popular in AD applications in recent years. However, the performance of existing RL algorithms heavily depends on the diversity of training scenarios. A lack of safety-critical scenarios during the training phase could result in poor generalization performance in real-world driving applications. We propose a novel framework in which the weaknesses of the training set are explored through black-box verification methods. After discovering AD failure scenarios, the RL agent's training is re-initiated via transfer learning to improve the performance of previously unsafe scenarios. Simulation results demonstrate that our approach efficiently discovers safety failures of action decisions in RL-based adaptive cruise control (ACC) applications and significantly reduces the number of vehicle collisions through iterative applications of our method. The source code is publicly available at https://github.com/data-and-decision-lab/self-improving-RL." Multi-Source Domain Adaptation for Unsupervised Road Defect Segmentation,"JONGMIN YU, Hyeontaek Oh, Sebastiano Fichera, Paolo Paoletti, Shan Luo","King's College London,Korea Advanced Institute of Science and Technology,University of Liverpool",Self-Driving Cars I,"The performance of road defect segmentation (a.k.a. pixel-level road defect detection) has been improved alongside with remarkable achievement of deep learning. Those improvements need a large-scale and well-constructed dataset. However, road surface materials or designs vary from country to country, and the patterns of defects are hard to pre-define. In this paper, we propose a novel multi-source domain adaptation method to boost the performance of road defect segmentation on an unlabelled dataset. The proposed method generates multi-source ensembled labels by using transferred information from models trained by multiple labelled source domains, and utilises it as supervisory signals for the unlabelled target domain. 
Furthermore, to reduce the domain gap between each source domain and a target domain, these domains are re-aligned with outlier repositioning to improve the defect segmentation performance. We demonstrate the effectiveness of our proposed method on Cracktree200, CRACK500, CFD, and Crack360 datasets. Experimental results show that the proposed method outperforms the existing unsupervised road defect segmentation methods and achieves competitive performance compared with recent supervised methods. The source code is publicly available on https://github.com/andreYoo/MSDA_RDS.git." A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations,"Sohan Rudra, Saksham Goel, Anirban Santara, Claudio Gentile, Laurent Perron, Fei Xia, Vikas Sindhwani, Carolina Parada, Gaurav Aggarwal","Google,Google Inc,Google Brain, NYC",Motion and Path Planning III,"Object-goal navigation (Object-nav) entails searching, recognizing and navigating to a target object. Object-nav has been extensively studied by the Embodied-AI community, but most solutions are often restricted to considering static objects (e.g., television, fridge, etc.). We propose a modular framework for object-nav that is able to efficiently search indoor environments for not just static objects but also movable objects (e.g. fruits, glasses, phones, etc.) that frequently change their positions due to human interaction. Our contextual-bandit agent efficiently explores the environment by showing optimism in the face of uncertainty and learns a model of the likelihood of spotting different objects from each navigable location. The likelihoods are used as rewards in a weighted minimum latency solver to deduce a trajectory for the robot. We evaluate our algorithms in two simulated environments and a real-world setting, to demonstrate high sample efficiency and reliability." Safe and Efficient Navigation in Extreme Environments Using Semantic Belief Graphs,"M. 
Fadhil Ginting, Sung Kyun Kim, Oriana Peltzer, Joshua Ott, Sunggoo Jung, Mykel Kochenderfer, Ali-Akbar Agha-Mohammadi","Stanford University,NASA Jet Propulsion Laboratory, Caltech,JPL,NASA-JPL, Caltech",Motion and Path Planning III,"To achieve autonomy in unknown and unstructured environments, we propose a method for semantic-based planning under perceptual uncertainty. This capability is crucial for safe and efficient robot navigation in environments with mobility-stressing elements that require terrain-specific locomotion policies. We propose the Semantic Belief Graph (SBG), a geometric- and semantic-based representation of a robot's probabilistic roadmap in the environment. The SBG nodes comprise the robot geometric state and the semantic-knowledge of the terrains in the environment. The SBG edges represent local semantic-based controllers that drive the robot between the nodes or invoke an information gathering action to reduce semantic belief uncertainty. We formulate a semantic-based planning problem on SBG that produces a policy for the robot to safely navigate to the target location with minimal traversal time. We analyze our method in simulation and present real-world results with a legged robotic platform navigating multi-level outdoor environments." Risk-Aware Neural Navigation from BEV Input for Interactive Driving,"Suzanna Jiwani, Xiao Li, Sertac Karaman, Daniela Rus","Massachusetts Institute of Technology,MIT",Motion and Path Planning III,"Safety has been a key goal for autonomous driving since its inception, and we believe recognizing and responding to risk is a key component of safety. In this work, we aim to answer the question, ""How can explainable risk representations be generated and used to produce risk-averse trajectories?"" To answer this question, previous work uses risk metrics to formulate an optimization problem. 
In contrast, our work is based on research showing the usefulness of grids as a representation to generate image-based risk maps through a trained neural network. We propose a method of determining risk from a bird's eye view (BEV) of an autonomous vehicle's surroundings. Our method consists of (1) a risk map generator, which is trained to recognize risk associated with nearby agents and the map, (2) differentiable value iteration using the risk map to learn a policy, and (3) a trajectory sampler, which samples from this policy to generate a trajectory. We evaluate our planner in a closed-loop manner and find improvements in its overall ability to mimic human driving while maintaining comparable safety statistics. Self-ablation also reveals the potential for fine-tuning the behavior of the planner given a designer's needs." Informable Multi-Objective and Multi-Directional RRT* System for Robot Path Planning,"Bruce Jk Huang, Yingwen Tan, Dongmyeong Lee, Vishnu Desaraju, J.W Grizzle","University of Michigan,Woven Planet North America",Motion and Path Planning III,"Multi-objective or multi-destination path planning is crucial for mobile robotics applications such as mobility as a service, robotics inspection, and electric vehicle charging for long trips. This work proposes an anytime iterative system to concurrently solve the multi-objective path planning problem and determine the visiting order of destinations. The system is comprised of an anytime informable multi-objective and multi-directional RRT* algorithm to form a simple connected graph, and a proposed solver that consists of an enhanced cheapest insertion algorithm and a genetic algorithm to solve approximately the relaxed traveling salesman problem in polynomial time. Moreover, a list of waypoints is often provided for robotics inspection and vehicle routing so that the robot can preferentially visit certain equipment or areas of interest. 
We show that the proposed system can inherently incorporate such knowledge to navigate challenging topology. The proposed anytime system is evaluated on large and complex graphs built for real-world driving applications. C++ implementations are available at: https://github.com/UMich-BipedLab/IMOMD-RRTStar." Leveraging Scene Embeddings for Gradient-Based Motion Planning in Latent Space,"Jun Yamada, Chia-Man Hung, Jack Collins, Ioannis Havoutis, Ingmar Posner","University of Oxford,Oxford University",Motion and Path Planning III,"Motion planning framed as optimisation in structured latent spaces has recently emerged as competitive with traditional methods in terms of planning success while significantly outperforming them in terms of computational speed. However, the real-world applicability of recent work in this domain remains limited by the need to express obstacle information directly in state-space, involving simple geometric primitives. In this work we address this challenge by leveraging learned scene embeddings together with a generative model of the robot manipulator to drive the optimisation process. In addition we introduce an approach for efficient collision checking which directly regularises the optimisation undertaken for planning. Using simulated as well as real-world experiments, we demonstrate that our approach, AMP-LS, is able to successfully plan in novel, complex scenes while outperforming competitive traditional baselines in terms of computation speed by an order of magnitude. We show that the resulting system is fast enough to enable closed-loop planning in real-world dynamic scenes." Sample-Driven Connectivity Learning for Motion Planning,"Sihui Li, Neil Dantam",Colorado School of Mines,Motion and Path Planning III,"Sampling-based motion planning works well in many cases but is less effective if the configuration space has narrow passages. 
In this paper, we propose a learning-based strategy to sample in these narrow passages, which improves overall planning time. Our algorithm first learns from the configuration space planning graphs and then uses the learned information to effectively generate narrow passage samples. We perform experiments in various 6D and 7D scenes. The algorithm offers one order of magnitude speed-up compared to baseline planners in some of these scenes." Online Coverage Path Planning Scheme for a Size-Variable Robot,"M. A. Viraj J. Muthugala, Bhagya Samarakoon, Rajesh Elara Mohan",Singapore University of Technology and Design,Motion and Path Planning III,"Coverage Path Planning (CPP) is an essential feature of robots deployed for applications such as lawn mowing, cleaning, painting, and exploration. However, most of the state-of-the-art CPP methods are proposed for fixed-morphology robots, and the coverage performance is limited by physical constraints such as the inaccessibility of narrow spaces. Apart from area coverage, productivity depends on coverage time and energy usage. A robot capable of varying its footprint size could be a solution for improving productivity in these aspects. In addition to that, the environments, where robots are deployed for coverage, are often subjected to changes causing uncertainties. Therefore, this paper proposes an online CPP scheme for a size-variable robot to improve coverage productivity. The navigation planning of the proposed Size-Variable CPP (VSCPP) scheme has been implemented by adapting a Glasius bio-inspired neural network that guides a robot in an efficient path for coverage while coping with dynamic changes. The size variation required for a situation is determined by analyzing a set of occupancy grid maps corresponding to the size steps of the robot. According to the results, the proposed VSCPP can ascertain coverage while coping with dynamic changes in an environment. 
The reduction of the coverage time due to the size variability is significant compared to a robot with no VSCPP scheme." Navigation with Polytopes and B-Spline Path Planner,"Ngoc Thinh Nguyen, Pranav Tej Gangavarapu, Arne Sahrhage, Georg Schildbach, Floris Ernst",University of Lübeck,Motion and Path Planning III,"This paper firstly presents our optimal path planning algorithm within a 2D non-convex, polytopic region defined as a sequence of connected convex polytopes. The path is a B-spline curve but being parametrized with its equivalent Bézier representation. By doing this, the local convexity bound of each curve's interval is significantly tighter. Thus, it allows many more possibilities for constraining the entire curve to remain inside the region by using only linear constraints on the control points of the curve. We further guarantee the existence of the valid path by pointing out an algebraic solution. We integrate the algorithm, together with our previously published results, into the Navigation with polytopes toolbox which can be used as a global path planner, compatible with ROS navigation tools. It provides a framework for constructing a polytope map from a standard occupancy gridmap, searching for an appropriate sequence of connected polytopes and finally, planning a minimal-length path with different options on B-spline or Bézier parametrizations. The validation and comparison with existing methods are done using gridmaps collected under Gazebo simulations and real experiments." Probabilistic Planning with Partially Ordered Preferences Over Temporal Goals,"Hazhar Rahmani, Abhishek Kulkarni, Jie Fu","University of Florida,University of Florida, Gainesville",Planning under Uncertainty I,"In this paper, we study planning in stochastic systems, modeled as Markov decision processes (MDPs), with preferences over temporally extended goals. 
Prior work on temporal planning with preferences assumes that the user preferences form a total order, meaning that every pair of outcomes are comparable with each other. In this work, we consider the case where the preferences over possible outcomes are a partial order rather than a total order. We first introduce a variant of deterministic finite automaton, referred to as a preference DFA, for specifying the user's preferences over temporally extended goals. Based on the order theory, we translate the preference DFA to a preference relation over policies for probabilistic planning in a labeled MDP. In this treatment, a most preferred policy induces a weak-stochastic nondominated probability distribution over the finite paths in the MDP. The proposed planning algorithm hinges on the construction of a multi-objective MDP. We prove that a weak-stochastic nondominated policy given the preference specification is Pareto-optimal in the constructed multi-objective MDP, and vice versa. Throughout the paper, we employ a running example to demonstrate the proposed preference specification and solution approaches. We show the efficacy of our algorithm using the example with detailed analysis, and then discuss possible future directions." A Causal Decoupling Approach to Efficient Planning for Logistics Problems with Stateful Stochastic Demand,"Diptanil Chaudhuri, Dylan Shell",Texas A&M University,Planning under Uncertainty I,"Future conceptions of agile, just-in-time fabrication, lean and ""smart"" manufacturing, and a host of allied processes that exploit advanced automation, depend in part on realizing improvements in logistics planning. The present paper hypothesizes that the key to improving flexibility will be the inclusion of sophisticated, time-correlated stochastic models of demand---whether that be demand by end-user consumers directly, or by other down-stream processes. 
Such dynamic models of demand, unfortunately, can greatly increase the space in which planning occurs when treated, as is common for planning under uncertainty, via the Markov Decision Processes formulation. To tackle this challenge, we identify three aspects that we postulate appear as commonalities in many logistics settings. They lead to an approach for approximate reduction of the planning problem via causal decoupling, which gives a spectrum of solutions where weaker correlations in time result in quicker optimization. Empirical results on small case studies---in lean manufacturing, and commodity routing---show that retaining some limited (but non-zero) amount of temporal structure can provide a useful compromise between quality of the solution obtained and computation required." Stochastic Robustness Interval for Motion Planning with Signal Temporal Logic,"Roland Ilyes, Qi Heng Ho, Morteza Lahijanian",University of Colorado Boulder,Planning under Uncertainty I,"In this work, we present a novel robustness measure for continuous-time stochastic trajectories with respect to Signal Temporal Logic (STL) specifications. We show the soundness of the measure and develop a monitor for reasoning about partial trajectories. Using this monitor, we introduce an STL sampling-based motion planning algorithm for robots under uncertainty. Given a minimum robustness requirement, this algorithm finds satisfying motion plans; alternatively, the algorithm also optimizes for the measure. We prove probabilistic completeness and asymptotic optimality of the motion planner with respect to the measure, and demonstrate the effectiveness of our approach on several case studies." 
Planning with SiMBA: Motion Planning under Uncertainty for Temporal Goals Using Simplified Belief Guides,"Qi Heng Ho, Zachary Sunberg, Morteza Lahijanian","University of Colorado Boulder,University of Colorado",Planning under Uncertainty I,"This paper presents a new multi-layered algorithm for motion planning under motion and sensing uncertainties for Linear Temporal Logic specifications. We propose a technique to guide a sampling-based search tree in the combined task and belief space using trajectories from a simplified model of the system, to make the problem computationally tractable. Our method eliminates the need to construct fine and accurate finite abstractions. We prove correctness and probabilistic completeness of our algorithm, and illustrate the benefits of our approach on several case studies. Our results show that guidance with a simplified belief space model allows for significant speed-up in planning for complex specifications." RAMP: A Risk-Aware Mapping and Planning Pipeline for Fast Off-Road Ground Robot Navigation,"Lakshay Sharma, Michael Everett, Donggun Lee, Xiaoyi Cai, Philip Osteen, Jonathan Patrick How","Massachusetts Institute of Technology,Northeastern University,UC Berkeley,U.S. Army Research Laboratory",Planning under Uncertainty I,"A key challenge in fast ground robot navigation in 3D terrain is balancing robot speed and safety. Recent work has shown that 2.5D maps (2D representations with additional 3D information) are ideal for real-time safe and fast planning. However, the prevalent approach of generating 2D occupancy grids through raytracing makes the generated map unsafe to plan in, due to inaccurate representation of unknown space. Additionally, existing planners such as MPPI do not consider speeds in known free and unknown space separately, leading to slower overall plans. The RAMP pipeline proposed here solves these issues using new mapping and planning methods. 
This work first presents ground point inflation with persistent spatial memory as a way to generate accurate occupancy grid maps from classified pointclouds. Then we present an MPPI-based planner with embedded variability in horizon, to maximize speed in known free space while retaining cautionary penetration into unknown space. Finally, we integrate this mapping and planning pipeline with risk constraints arising from 3D terrain, and verify that it enables fast and safe navigation using simulations and hardware demonstrations." "Prioritized Robotic Exploration with Deadlines: A Comparison of Greedy, Orienteering, and Profitable Tour Approaches","Sayantan Datta, Srinivas Akella",University of North Carolina at Charlotte,Planning under Uncertainty I,"This paper addresses the problem of robotic exploration of unknown indoor environments with deadlines. Indoor exploration using mobile robots has typically focused on exploring the entire environment without considering deadlines. The objective of the prioritized exploration in this paper is to rapidly compute the geometric layout of an initially unknown environment by exploring key regions of the environment and returning to the home location within a deadline. This prioritized exploration is useful for time-critical and dangerous environments where rapid robot exploration can provide vital information for subsequent operations. For example, firefighters, for whom time is of the essence, can utilize the map generated by this robotic exploration to navigate a building on fire. In our previous work, we showed that a priority-based greedy algorithm can outperform a cost-based greedy algorithm for exploration under deadlines. This paper models the prioritized exploration problem as an Orienteering Problem (OP) and a Profitable Tour Problem (PTP) in an attempt to generate exploration strategies that can explore a greater percentage of the environment in a given amount of time. 
The paper presents simulation results on multiple graph-based and Gazebo environments. We found that in many cases the priority-based greedy algorithm performs on par or better than the OP and PTP-based algorithms. We analyze the potential reasons for this counterintuitive result." Epistemic Prediction and Planning with Implicit Coordination for Multi-Robot Teams in Communication Restricted Environments,"Lauren Bramblett, Shijie Gao, Nicola Bezzo",University of Virginia,Planning under Uncertainty I,"In communication restricted environments, a multi-robot system can be deployed to either: i) maintain constant communication but potentially sacrifice operational efficiency due to proximity constraints or ii) allow disconnections to increase environmental coverage efficiency, challenges on how, when, and where to reconnect (rendezvous problem). In this work we tackle the latter problem and notice that most state-of-the-art methods assume that robots will be able to execute a predetermined plan; however system failures and changes in environmental conditions can cause the robots to deviate from the plan with cascading effects across the multi-robot system. This paper proposes a coordinated epistemic prediction and planning framework to achieve consensus without communicating for exploration and coverage, task discovery and completion, and rendezvous applications. Dynamic epistemic logic is the principal component implemented to allow robots to propagate belief states and empathize with other agents. Propagation of belief states and subsequent coverage of the environment is achieved via a frontier-based method within an artificial physics-based framework. The proposed framework is validated with both simulations and experiments with unmanned ground vehicles in various cluttered environments." Uncertainty-Guided Active Reinforcement Learning with Bayesian Neural Networks,"Xinyang Wu, Mohamed El-shamouty, Christof Nitsche, Marco F. 
Huber","Fraunhofer IPA,University of Stuttgart",Planning under Uncertainty I,"Recent advances in Reinforcement Learning (RL) have made significant contributions in past years by offering intelligent solutions to solve robotic tasks. However, most RL algorithms, especially the model-free RL, are plagued by low learning efficiency and safety problems. In this paper, we propose using the Bayesian Neural Networks (BNNs) to guide the agent exploring actively to enhance the learning efficiency in RL and investigate the potential of recognizing safety risks in working environments with uncertainty information. We compare two types of uncertainty quantification methods in both action and state spaces. To validate our method, we visualize the quantified uncertainty in robot environments with or without safety hazards. Moreover, we evaluate the learning efficiency and safety performance of the RL agents learned with BNNs on different robotic tasks." Perturbation-Based Best Arm Identification for Efficient Task Planning with Monte-Carlo Tree Search,"Daejong Jin, Juhan Park, Kyungjae Lee","Chung-Ang university,Chung-ang University,Chung-Ang University",Task Planning,"Combining task and motion planning (TAMP) is crucial for intelligent robots to perform complex and long-horizon tasks. In TAMP, many approaches generally employ Monte-Carlo tree search (MCTS) with upper confidence bound (UCB) for task planning to handle exploration-exploitation trade-offs and find globally optimal solutions. However, since UCB basically considers the estimation error caused by noise, the error caused by insufficient optimization of the sub-tree is not represented. Hence, UCB-based approaches have the disadvantage of not exploring underestimated sub-trees. To alleviate this issue, we propose a novel tree search method using perturbation-based best-arm identification (PBAI). 
We theoretically prove the bound of the simple regret of our method and empirically verify that PBAI finds the optimal task plans faster and more efficiently than the existing algorithms. The source code of our proposed algorithm is available at https://github.com/jdj2261/pytamp." Contingency-Aware Task Assignment and Scheduling for Human-Robot Teams,"Neel Dhanaraj, Santosh Varadanahalli Narayan, Stefanos Nikolaidis, Satyandra K. Gupta","University of Southern California,UNIVERSITY OF SOUTHERN CALIFORNIA",Task Planning,"This paper considers the problem of task assignment and scheduling for human-robot teams to enable efficient completion of complex tasks like satellite assembly. In high-mix low volume settings, we must enable the human-robot team to handle uncertainty due to changing task requirements, potential failures, and delays to maintain task completion efficiency. Our approach introduces two ideas: (1) we account for the complex interaction of uncertainty that stems from the task and agents using a multi-agent concurrent MDP framework, and (2) we use Mixed Integer Linear Programs and contingency sampling to approximate action values for action selection for a state. Our results show that our anytime algorithm is computationally efficient while giving the optimal action selection compared to a value iteration baseline. This method is evaluated on a 24-task representative assembly and real-world 60-task satellite assembly, and our results show that we can find an assignment that gives close to optimal makespan." Extracting Generalizable Skills from a Single Plan Execution Using Abstraction-Critical State Detection,"Khen Elimelech, Lydia Kavraki, Vardi Moshe",Rice University,Task Planning,"Robotic task planning is computationally challenging. To reduce planning cost and support life-long operation, we must leverage prior planning experience. 
To this end, we address the problem of extracting reusable and generalizable abstract skills from successful plan executions. In previous work, we introduced a supporting framework, allowing us, theoretically, to extract an abstract skill from a single execution and later automatically adapt it and reuse it in new domains. We also proved that, given a library of such skills, we can significantly reduce the planning effort for new problems. Nevertheless, until now, abstract-skill extraction could only be performed manually. In this paper, we finally close the automation loop and explain how abstract skills can be practically and automatically extracted. We start by analyzing the desired qualities of an abstract skill and formulate skill extraction as an optimization problem. We then develop two extraction algorithms, based on the novel concept of abstraction-critical state detection. As we show experimentally, the approach is independent of any planning domain." Efficient Planning of Multi-Robot Collective Transport Using Graph Reinforcement Learning with Higher Order Topological Abstraction,"Steve Paul, Wenyuan Li, Brian Smyth, Yuzhou Chen, Yulia Gel, Souma Chowdhury","University at Buffalo,Temple University,University of Texas at Dallas,University at Buffalo, State University of New York",Task Planning,"Efficient multi-robot task allocation (MRTA) is fundamental to various time-sensitive applications such as disaster response, warehouse operations, and construction. This paper tackles a particular class of these problems that we call MRTA-collective transport or MRTA-CT -- here tasks present varying workloads and deadlines, and robots are subject to flight range, communication range, and payload constraints. For large instances of these problems involving 100s-1000's of tasks and 10s-100s of robots, traditional non-learning solvers are often time-inefficient, and emerging learning-based policies do not scale well to larger-sized problems without costly retraining. 
To address this gap, we use a recently proposed encoder-decoder graph neural network involving Capsule networks and multi-head attention mechanism, and innovatively add topological descriptors (TD) as new features to improve transferability to unseen problems of similar and larger size. Persistent homology is used to derive the TD, and proximal policy optimization is used to train our TD-augmented graph neural network. The resulting policy model compares favorably to state-of-the-art non-learning baselines while being much faster. The benefit of using TD is readily evident when scaling to test problems of size larger than those used in training." On the Utility of Buffers in Pick-N-Swap Based Lattice Rearrangement,"Kai Gao, Jingjin Yu",Rutgers University,Task Planning,"We investigate the utility of employing multiple buffers in solving a class of rearrangement problems with pick-n-swap manipulation primitives. In this problem, objects stored randomly in a lattice are to be sorted using a robot arm with k>=1 swap spaces or buffers, capable of holding up to k objects on its end-effector simultaneously. On the structural side, we show that the addition of each new buffer brings diminishing returns in saving the end-effector travel distance while holding the total number of pick-n-swap operations at the minimum. This is due to an interesting recursive cycle structure in random m-permutation, where the largest cycle covers over 60% of objects. On the algorithmic side, we propose fast algorithms for 1D and 2D lattice rearrangement problems that can effectively use multiple buffers to boost solution optimality. Numerical experiments demonstrate the efficiency and scalability of our methods, as well as confirm the diminishing return structure as more buffers are employed." 
On-Demand Multi-Agent Basket Picking for Shopping Stores,"Mattias Tiger, David Bergström, Simon Wijk Stranius, Evelina Holmgren, Daniel De Leng, Fredrik Heintz","AI and Integrated Computer Systems (AIICS), Linköping University,Linköping University",Task Planning,"Imagine placing an online order on your way to the grocery store, then being able to pick the collected basket upon arrival or shortly after. Likewise, imagine placing any online retail order, made ready for pickup in minutes instead of days. In order to realize such a low-latency automatic warehouse logistics system, solvers must be made to be basket-aware. That is, it is more important that the full order (the basket) is picked timely and fast, than that any single item in the order is picked quickly. Current state-of-the-art methods are not basket-aware. Nor are they optimized for a positive customer experience, that is; to prioritize customers based on queue place and the difficulty associated with picking their order. An example of the latter is that it is preferable to prioritize a customer ordering a pack of diapers over a customer shopping a larger order, but only as long as the second customer has not already been waiting for too long. In this work we formalize the problem outlined, propose a new method that significantly outperforms the state-of-the-art, and present a new realistic simulated benchmark. The proposed method is demonstrated to work in an on-line and real-time setting, and to solve the on-demand multi-agent basket picking problem for automated shopping stores under realistic conditions." 
Multi-Robot Coordination and Cooperation with Task Precedence Relationships,"Walker Gosrich, Siddharth Mayya, Saaketh Narayan, Matthew Malencia, Saurav Agarwal, Vijay Kumar","University of Pennsylvania,Amazon Robotics",Task Planning,"We propose a new formulation for the multi-robot task planning and allocation problem that incorporates (a) precedence relationships between tasks; (b) coordination for tasks allowing multiple robots to achieve increased efficiency; and (c) cooperation through the formation of robot coalitions for tasks that cannot be performed by individual robots alone. In our formulation, the tasks and the relationships between the tasks are specified by a task graph. We define a set of reward functions over the task graph’s nodes and edges. These functions model the effect of robot coalition size on task performance, while incorporating the influence of one task’s performance on a dependent task. Solving this problem optimally is NP-hard. However, using the task graph formulation allows us to leverage min-cost network flow approaches to obtain approximate solutions efficiently. Additionally, we explore a mixed integer programming approach, which gives optimal solutions for small instances of the problem but is computationally expensive. We also develop a greedy heuristic algorithm as a baseline. Our modeling and solution approaches result in task plans that leverage task precedence relationships and robot coordination and cooperation to achieve high mission performance, even in large missions with many agents." 
On the Programming Effort Required to Generate Behavior Trees and Finite State Machines for Robotic Applications,"Matteo Iovino, Julian Förster, Pietro Falco, Jen Jen Chung, Roland Siegwart, Claes Christian Smith","ABB Corporate Research,ETH Zurich,ABB, Corporate Research,The University of Queensland,KTH Royal Institute of Technology",Task Planning,"In this paper we provide a practical demonstration of how the modularity in a Behavior Tree (BT) decreases the effort in programming a robot task when compared to a Finite State Machine (FSM). In recent years the way to represent a task plan to control an autonomous agent has been shifting from the standard FSM towards BTs. Many works in the literature have highlighted and proven the benefits of such design compared to standard approaches, especially in terms of modularity, reactivity and human readability. However, these works have often failed in providing a tangible comparison in the implementation of those policies and the programming effort required to modify them. This is a relevant aspect in many robotic applications, where the design choice is dictated both by the robustness of the policy and by the time required to program it. In this work, we compare backward chained BTs with a fault-tolerant design of FSMs by evaluating the cost to modify them. We validate the analysis with a set of experiments in a simulation environment where a mobile manipulator solves an item fetching task." Train What You Know - Precise Pick-And-Place with Transporter Networks,"Gergely Sóti, Xi Huang, Christian Wurll, Björn Hein","Karlsruhe University of Applied Sciences,Karlsruhe Institute of Technology,University of Applied Sciences Karlsruhe",Deep Learning in Grasping and Manipulation,"Precise pick-and-place is essential in robotic applications. To this end, we define an exact training method and an iterative inference method that improve pick-and-place precision with Transporter Networks. 
We conduct a large-scale experiment on 8 simulated tasks. A systematic analysis shows that the proposed modifications have a significant positive effect on model performance. Considering picking and placing independently, our methods achieve up to 60% lower rotation and translation errors than baselines. For the whole pick-and-place process we observe 50% lower rotation errors for most tasks with slight improvements in terms of translation errors. Furthermore, we propose architectural changes that retain model performance and reduce computational costs and time. We validate our methods with an interactive teaching procedure on real hardware. Supplementary material is available at: https://gergely-soti.github.io/p3" Asking for Help: Failure Prediction in Behavioral Cloning through Value Approximation,"Cem Gokmen, Mohi Khansari, Daniel Ho","Stanford University,Google X",Deep Learning in Grasping and Manipulation,"Recent progress in end-to-end Imitation Learning approaches has shown promising results and generalization capabilities on mobile manipulation tasks. Such models are seeing increasing deployment in real-world settings, where scaling up requires robots to be able to operate with high autonomy, i.e. requiring as little human supervision as possible. In order to avoid the need for one-on-one human supervision, robots need to be able to detect and prevent policy failures ahead of time, and ask for help, allowing a remote operator to supervise multiple robots and help when needed. However, the black-box nature of end-to-end Imitation Learning models such as Behavioral Cloning, as well as the lack of an explicit state-value representation, make it difficult to predict failures. To this end, we introduce Behavioral Cloning Value Approximation (BCVA), an approach to learning a state value function based on and trained jointly with a Behavioral Cloning policy that can be used to predict failures. 
We demonstrate the effectiveness of BCVA by applying it to the challenging mobile manipulation task of latched-door opening, showing that we can identify failure scenarios with 86% precision and 81% recall, evaluated on over 2000 real world runs, improving upon the baseline of simple failure classification by 10 percentage points." Seq2Seq Imitation Learning for Tactile Feedback-Based Manipulation,"Wenyan Yang, Alexandre Angleraud, Roel S. Pieters, Joni Pajarinen, Joni-Kristian Kamarainen","Tampere university,Tampere University,Aalto University,Tampere University of Technology",Deep Learning in Grasping and Manipulation,"Robot control for tactile feedback-based manipulation can be difficult due to the modeling of physical contacts, partial observability of the environment, and noise in perception and control. This work focuses on solving partial observability of contact-rich manipulation tasks as a Sequence-to-Sequence (Seq2Seq) Imitation Learning (IL) problem. The proposed Seq2Seq model first produces a robot environment interaction sequence to estimate the partially observable environment state variables. Then, the observed interaction sequence is transformed into a control sequence for the task itself. The proposed Seq2Seq IL for tactile feedback based manipulation is experimentally validated on a door-open task in a simulated environment and a snap-on insertion task with a real robot. The model is able to learn both tasks from only 50 expert demonstrations, while state-of-the-art reinforcement learning and imitation learning methods fail." 
SGTM 2.0: Autonomously Untangling Long Cables Using Interactive Perception,"Kaushik Shivakumar, Vainavi Viswanath, Anrui Gu, Yahav Avigal, Justin Kerr, Jeffrey Ichnowski, Richard Cheng, Thomas Kollar, Ken Goldberg","University of California Berkeley,University of California, Berkeley,UC Berkeley,Carnegie Mellon University,California Institute of Technology,Toyota Research Institute",Deep Learning in Grasping and Manipulation,"Cables are commonplace in homes, hospitals, and industrial warehouses and are prone to tangling. This paper extends prior work on autonomously untangling long cables by introducing novel uncertainty quantification metrics and actions that interact with the cable to reduce perception uncertainty. We present Sliding and Grasping for Tangle Manipulation 2.0 (SGTM 2.0), a system that autonomously untangles cables approximately 3 meters in length with a bilateral robot using estimates of uncertainty at each step to inform actions. By interactively reducing uncertainty, SGTM 2.0 significantly reduces run-time. Physical experiments with 84 trials suggest that SGTM 2.0 can achieve 83% untangling success on cables with 1 or 2 overhand and figure-8 knots, and 70% termination detection success across these configurations, outperforming SGTM 1.0 by 43% in untangling accuracy and 200% in completion time. Supplementary material, visualizations, and videos can be found at sites.google.com/view/sgtm2." Online Tool Selection with Learned Grasp Prediction Models,"Rohanimanesh Khashayar, Jacob Metzger, William Richards, Aviv Tamar","Osaro Inc.,Osaro, Inc,Technion",Deep Learning in Grasping and Manipulation,"Deep learning-based grasp prediction models have become an industry standard for robotic bin-picking systems. To maximize pick success, production environments are often equipped with several end-effector tools that can be swapped on-the-fly, based on the target object. Tool-change, however, takes time. 
Choosing the order of grasps to perform, and corresponding tool-change actions, can improve system throughput; this is the topic of our work. The main challenge in planning tool change is uncertainty – we typically cannot see objects in the bin that are currently occluded. Inspired by queuing and admission control problems, we model the problem as a Markov Decision Process (MDP), where the goal is to maximize expected throughput, and we pursue an approximate solution based on model predictive control, where at each time step we plan based only on the currently visible objects. Special to our method is the idea of void zones, which are geometrical boundaries in which an unknown object will be present, and therefore cannot be accounted for during planning. Our planning problem can be solved using integer linear programming (ILP). However, we find that an approximate solution based on sparse tree search yields near optimal performance at a fraction of the time. Another question that we explore is how to measure the performance of tool-change planning: we find that throughput alone can fail to capture delicate and smooth behavior, and propose a principled alternative. Finally, we demonstrate our algorithms on both synthetic and real world bin picking tasks." FOGL: Federated Object Grasping Learning,"Seok-kyu Kang, Changhyun Choi","Korea Shipbuilding & Offshore Engineering Co. Ltd (KSOE), HD Hyundai Group,University of Minnesota, Twin Cities",Deep Learning in Grasping and Manipulation,"Federated learning is a promising technique for training global models in a data-decentralized environment. In this paper, we propose a federated learning approach for robotic object grasping. The main challenge is that the data collected by multiple robots deployed in different environments tends to form heterogeneous data distributions (i.e., non-IID) and that the existing federated learning methods on such data distributions show serious performance degradation. 
To tackle this problem, we propose federated object grasping learning (FOGL) that uses cross-evaluation in a general federated learning process to assess the training performance of robots. We cluster robots with similar training patterns and perform independent federated learning on each cluster. Finally, we integrate the global models for each cluster through an ensemble inference. We apply FOGL to various federated learning scenarios in robotic object grasping and show state-of-the-art performance on the Cornell grasping dataset." Goal-Image Conditioned Dynamic Cable Manipulation through Bayesian Inference and Multi-Objective Black-Box Optimization,"Kuniyuki Takahashi, Tadahiro Taniguchi","Preferred Networks, Inc.,Ritsumeikan University",Deep Learning in Grasping and Manipulation,"To perform dynamic cable manipulation to realize the configuration specified by a target image, we formulate dynamic cable manipulation as a stochastic forward model. Then, we propose a method to handle uncertainty by maximizing the expectation, which also considers estimation errors of the trained model. To avoid issues like multiple local minima and requirement of differentiability by gradient-based methods, we propose using a black-box optimization (BBO) to optimize joint angles to realize a goal image. Among BBO, we use the Tree-structured Parzen Estimator (TPE), a type of Bayesian optimization. By incorporating constraints into the TPE, the optimized joint angles are constrained within the range of motion. Since TPE is population-based, it is better able to detect multiple feasible configurations using the estimated inverse model. We evaluated image similarity between the target and cable images captured by executing the robot using optimal transport distance. The results show that the proposed method improves accuracy compared to conventional gradient-based approaches and methods that use deterministic models that do not consider uncertainty." 
Learning Generalizable Pivoting Skills,"Xiang Zhang, Siddarth Jain, Baichuan Huang, Masayoshi Tomizuka, Diego Romeres","University of California, Berkeley,Mitsubishi Electric Research Laboratories (MERL),Rutgers University,University of California,Mitsubishi Electric research laboratories",Deep Learning in Grasping and Manipulation,"The skill of pivoting an object with a robotic system is challenging for the external forces that act on the system, mainly given by contact interaction. The complexity increases when the same skills are required to generalize across different objects. This paper proposes a framework for learning robust and generalizable pivoting skills, which consists of three steps. First, we learn a pivoting policy on a “unitary” object using Reinforcement Learning (RL). Then, we obtain the object’s feature space by supervised learning to encode the kinematic properties of arbitrary objects. Finally, to adapt the unitary policy to multiple objects, we learn data-driven projections based on the object features to adjust the state and action space of the new pivoting task. The proposed approach is entirely trained in simulation. It requires only one depth image of the object and can zero-shot transfer to real-world objects. We demonstrate robustness to sim-to-real transfer and generalization to multiple objects." Cloth Funnels: Canonicalized-Alignment for Multi-Purpose Garment Manipulation,"Alper Canberk, Cheng Chi, Huy Ha, Benjamin Burchfiel, Eric Cousineau, Siyuan Feng, Shuran Song","Columbia University,Toyota Research Institute",Deep Learning in Grasping and Manipulation,"Automating garment manipulation is challenging due to extremely high variability in object configurations. To reduce this intrinsic variation, we introduce the task of ""canonicalized-alignment"" that simplifies downstream applications by reducing the possible garment configurations. 
This task can be considered as ""cloth state funnel"" that manipulates arbitrarily configured clothing items into a predefined deformable configuration (i.e. canonicalization) at an appropriate rigid pose (i.e. alignment). In the end, the cloth items will result in a compact set of structured and highly visible configurations -- which are desirable for downstream manipulation skills. To enable this task, we propose a novel canonicalized-alignment objective that effectively guides learning to avoid adverse local minima during learning. Using this objective, we learn a multi-arm, multi-primitive policy that strategically chooses between dynamic flings and quasi-static pick and place actions to achieve efficient canonicalized-alignment. We evaluate this approach on a real-world ironing and folding system that relies on this learned policy as the common first step. Empirically, we demonstrate that our task-agnostic canonicalized-alignment can enable even simple manually-designed policies to work well where they were previously inadequate, thus bridging the gap between automated non-deformable manufacturing and deformable manipulation." RLAfford: End-To-End Affordance Learning for Robotic Manipulation,"Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong","Peking University,South China University of Technology",Deep Learning in Grasping and Manipulation,"Learning to manipulate 3D objects in an interactive environment has been a challenging problem in Reinforcement Learning (RL). In particular, it is hard to train a policy that can generalize over objects with different semantic categories, diverse shape geometry and versatile functionality. In this study, we focused on the contact information in manipulation processes, and proposed a unified representation for critical interactions to describe different kinds of manipulation tasks. 
Specifically, we take advantage of the contact information generated during the RL training process and employ it as unified visual representation to predict contact map of interest. Such representation leads to an end-to-end learning framework that combined affordance based and RL based methods for the first time. Our unified framework can generalize over different types of manipulation tasks. Surprisingly, the effectiveness of such framework holds even under the multi-stage and multi-agent scenarios. We tested our method on eight types of manipulation tasks. Results showed that our methods outperform baseline algorithms, including visual affordance methods and RL methods, by a large margin on the success rate. The demonstration can be found at https://sites.google.com/view/rlafford/." Implementation and Optimization of Grasping Learning with Dual-Modal Soft Gripper,"Lei Zhao, Horeal Liu, Feihan Li, X.y. Ding, Yuhao Sun, Fuchun Sun, Jianhua Shan, Qi Ye, Lincheng Li, Bin Fang","anhui university of technology,Tsinghua University,Anhui University of Technology,Zhejiang University,NetEase Fuxi AI Lab,Tsinghua university",Deep Learning in Grasping and Manipulation,"Robust and efficient grasping of different objects is still an open problem due to the difficulty of integrating multidisciplinary knowledge such as gripper ontology design, perception, control, and learning. In recent years, learning-based methods have achieved excellent results in grasping various novel objects. However, current methods are usually limited to a single grasping mode or rely on different end effectors to grasp objects of different shapes. For human beings, our hands are capable of grasping various objects with the change in grasping methods and form of hands. In light of this, developing a gripper with similar performance could possibly improve the robot's gripping ability. 
In this paper, we design a dual-modal soft gripper (DSG) and propose a deep reinforcement learning (DRL) framework to implement the operations. Both of our grasping modes, namely enveloping and pinching, are achieved through the tendon drive system and the deformation of the spring steel plate, which enables the gripper to switch between the two grasping modes in real-time. We also combined the cutting-edge achievements of deep learning and reinforcement learning to design an autonomous grasping algorithm based on Q-learning and a deep Q network. Moreover, to fully utilize the visual input from the sensor, we added semantic embeddings of target objects to facilitate the learning, which is especially useful in deciding the grasping method for previously unseen objects. We also evaluate our DRL framework in different scenarios, offering a detailed comparison of each grasping mode and the mixed method (with or without semantic information). Our design has proved efficient in reducing the number of failing grasping actions and improving the success rate when facing novel and tricky objects." DefGraspNets: Grasp Planning on 3D Fields with Graph Neural Nets,"Isabella Huang, Yashraj Narang, Ruzena Bajcsy, Fabio Ramos, Tucker Hermans, Dieter Fox","UC Berkeley,NVIDIA,Univ of California, Berkeley,University of Sydney, NVIDIA,University of Utah,University of Washington",Deep Learning in Grasping and Manipulation,"Robotic grasping of 3D deformable objects is critical for real-world applications such as food handling and robotic surgery. Unlike rigid and articulated objects, 3D deformable objects have infinite degrees of freedom. Fully defining their state requires 3D deformation and stress fields, which are exceptionally difficult to analytically compute or experimentally measure. Thus, evaluating grasp candidates for grasp planning typically requires accurate, but slow 3D finite element method (FEM) simulation. 
Sampling-based grasp planning is often impractical, as it requires evaluation of a large number of grasp candidates. Gradient-based grasp planning can be more efficient, but requires a differentiable model to synthesize optimal grasps from initial candidates. Differentiable FEM simulators may fill this role, but are typically no faster than standard FEM. In this work, we propose learning a predictive graph neural network (GNN), DefGraspNets, to act as our differentiable model. We train DefGraspNets to predict 3D stress and deformation fields based on FEM-based grasp simulations. DefGraspNets not only runs up to 1500x faster than the FEM simulator, but also enables fast gradient-based grasp optimization over 3D stress and deformation metrics. We design DefGraspNets to align with real-world grasp planning practices and demonstrate generalization across multiple test sets, including real-world experiments." Option-Aware Adversarial Inverse Reinforcement Learning for Robotic Control,"Jiayu Chen, Tian Lan, Vaneet Aggarwal","Purdue University,George Washington University",Learning for Grasping and Manipulation III,"Hierarchical Imitation Learning (HIL) has been proposed to recover highly-complex behaviors in long-horizon tasks from expert demonstrations by modeling the task hierarchy with the option framework. Existing methods either overlook the causal relationship between the subtask and its corresponding policy or cannot learn the policy in an end-to-end fashion, which leads to suboptimality. In this work, we develop a novel HIL algorithm based on Adversarial Inverse Reinforcement Learning and adapt it with the Expectation-Maximization algorithm in order to directly recover a hierarchical policy from the unannotated demonstrations. Further, we introduce a directed information term to the objective function to enhance the causality and propose a Variational Autoencoder framework for learning with our objectives in an end-to-end fashion. 
Theoretical justifications and evaluations on challenging robotic control tasks are provided to show the superiority of our algorithm. The codes are available at https://github.com/LucasCJYSDL/HierAIRL." Efficiently Learning Small Policies for Locomotion and Manipulation,"Shashank Hegde, Gaurav Sukhatme",University of Southern California,Learning for Grasping and Manipulation III,"Neural control of memory-constrained, agile robots requires small, yet highly performant models. We leverage graph hyper networks to learn graph hyper policies trained with off-policy reinforcement learning resulting in networks that are two orders of magnitude smaller than commonly used networks yet encode policies comparable to those encoded by much larger networks trained on the same task. We show that our method can be appended to any off-policy reinforcement learning algorithm, without any change in hyperparameters, by showing results across locomotion and manipulation tasks. Further, we obtain an array of working policies, with differing numbers of parameters, allowing us to pick an optimal network for the memory constraints of a system. Training multiple policies with our method is as sample efficient as training a single policy. Finally, we provide a method to select the best architecture, given a constraint on the number of parameters. Project website: https://sites.google.com/usc.edu/graphhyperpolicy" Learning Agent-Aware Affordances for Closed-Loop Interaction with Articulated Objects,"Giulio Schiavi, Paula Wulkop, Giuseppe Maria Rizzi, Lionel Ott, Roland Siegwart, Jen Jen Chung","ETH Zürich,ETH Zurich,The University of Queensland",Learning for Grasping and Manipulation III,"Interactions with articulated objects are a challenging but important task for mobile robots. To tackle this challenge, we propose a novel closed-loop control pipeline, which integrates manipulation priors from affordance estimation with sampling-based whole-body control. 
We introduce the concept of agent-aware affordances which fully reflect the agent's capabilities and embodiment and we show that they outperform their state-of-the-art counterparts which are only conditioned on the end-effector geometry. Additionally, closed-loop affordance inference is found to allow the agent to divide a task into multiple non-continuous motions and recover from failure and unexpected states. Finally, the pipeline is able to perform long-horizon mobile manipulation tasks, i.e. opening and closing an oven, in the real world with high success rates (opening: 71%, closing: 72%)." SE(3)-DiffusionFields: Learning Smooth Cost Functions for Joint Grasp and Motion Optimization through Diffusion,"Julen Urain, Niklas Funk, Jan Peters, Georgia Chalvatzaki","TU Darmstadt,Technische Universität Darmstadt,Technische Universität Darmastadt",Learning for Grasping and Manipulation III,"Multi-objective optimization problems are ubiquitous in robotics, e.g., the optimization of a robot manipulation task requires a joint consideration of grasp pose configurations, collisions and joint limits. While some demands can be easily hand-designed, e.g., the smoothness of a trajectory, several task-specific objectives need to be learned from data. This work introduces a method for learning data-driven SE(3) cost functions as diffusion models. Diffusion models can represent highly-expressive multimodal distributions and exhibit proper gradients over the entire space due to their score-matching training objective. Learning costs as diffusion models allows their seamless integration with other costs into a single differentiable objective function, enabling joint gradient-based motion optimization. In this work, we focus on learning SE(3) diffusion models for 6DoF grasping, giving rise to a novel framework for joint grasp and motion optimization without needing to decouple grasp selection from trajectory generation. 
We evaluate the representation power of our SE(3) diffusion models w.r.t. classical generative models, and we showcase the superior performance of our proposed optimization framework in a series of simulated and real-world robotic manipulation tasks against representative baselines." Focused Adaptation of Dynamics Models for Deformable Object Manipulation,"Peter Mitrano, Alex Lagrassa, Oliver Kroemer, Dmitry Berenson","University of Michigan,Carnegie Mellon University",Learning for Grasping and Manipulation III,"In order to efficiently learn a dynamics model for a task in a new environment, one can adapt a model learned in a similar source environment. However, existing adaptation methods can fail when the target dataset contains transitions where the dynamics are very different from the source environment. For example, the source environment dynamics could be of a rope manipulated in free-space, whereas the target dynamics could involve collisions and deformation on obstacles. Our key insight is to improve data efficiency by focusing model adaptation on only the regions where the source and target dynamics are similar. In the rope example, adapting the free-space dynamics requires significantly fewer data than adapting the free-space dynamics while also learning collision dynamics. We propose a new method for adaptation that is effective in adapting to regions of similar dynamics. Additionally, we combine this adaptation method with prior work on planning with unreliable dynamics to make a method for data-efficient online adaptation, called FOCUS. We first demonstrate that the proposed adaptation method achieves statistically significantly lower prediction error in regions of similar dynamics on simulated rope manipulation and plant watering tasks. We then show on a bimanual rope manipulation task that FOCUS achieves data-efficient online learning, in simulation and in the real world." 
Dexterous Manipulation from Images: Autonomous Real-World RL Via Substep Guidance,"Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine","University of California, Berkeley,Meta AI,University of Washington,UC Berkeley",Learning for Grasping and Manipulation III,"Complex and contact-rich robotic manipulation tasks, particularly those that involve multi-fingered hands and underactuated object manipulation, present a significant challenge to any control method. Methods based on reinforcement learning offer an appealing choice for such settings, as they can enable robots to learn to delicately balance contact forces and dexterously reposition objects without strong modeling assumptions. However, running reinforcement learning on real-world dexterous manipulation systems often requires significant manual engineering. This negates the benefits of autonomous data collection and ease of use that reinforcement learning should in principle provide. In this paper, we describe a system for vision-based dexterous manipulation that provides a ""programming-free"" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction. The core principle underlying our system is that, in a vision-based setting, users should be able to provide high-level intermediate supervision that circumvents challenges in teleoperation or kinesthetic teaching which allows a robot to not only learn a task efficiently but also to autonomously practice. Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples, a reinforcement learning procedure that learns the task autonomously without interventions, and experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world, without simulation, manual modeling, or reward engineering." 
Predicting Motion Plans for Articulating Everyday Objects,"Arjun Gupta, Max Shepherd, Saurabh Gupta",UIUC,Learning for Grasping and Manipulation III,"Mobile manipulation tasks such as opening a door, pulling open a drawer, or lifting a toilet seat require constrained motion of the end-effector under environmental and task constraints. This, coupled with partial information in novel environments, makes it challenging to employ classical motion planning approaches at test time. Our key insight is to cast it as a learning problem to leverage past experience of solving similar planning problems to directly predict motion plans for mobile manipulation tasks in novel situations at test time. To enable this, we develop a simulator, ArtObjSim, that simulates articulated objects placed in real scenes. We then introduce a fast and flexible representation for motion plans. Finally, we learn models that use this representation to quickly predict motion plans for articulating novel objects at test time. Experimental evaluation shows improved speed and accuracy at generating motion plans than pure search-based methods." Dexterous Imitation Made Easy: A Learning-Based Framework for Efficient Dexterous Manipulation,"Sridhar Pandian Arunachalam, Sneha Silwal, Ben Evans, Lerrel Pinto",New York University,Learning for Grasping and Manipulation III,"Optimizing behaviors for dexterous manipulation has been a longstanding challenge in robotics, with a variety of methods from model-based control to model-free reinforcement learning having been previously explored in literature. Such prior work often require extensive trial-and-error training along with task-specific tuning of reward functions, which makes applying dexterous manipulation for general purpose problems quite impractical. A sample-efficient and practical alternate to trial-and-error learning is imitation learning. 
However, collecting and learning from demonstrations in dexterous manipulation is quite challenging due to the high-dimensional action-space involved with multi-finger control. In this work, we propose 'Dexterous Imitation Made Easy' (DIME) a new imitation learning framework for dexterous manipulation. DIME only requires a single RGB camera that observes a human operator to teleoperate a robotic hand. Once demonstrations are collected, DIME employs state-of-the-art imitation learning methods to train dexterous manipulation policies. On real robot benchmarks we demonstrate that DIME can be used to solve complex, in-hand manipulation tasks such as 'flipping', 'spinning', and 'rotating' objects with just 30 demonstrations and no additional robot training. Our code, pre-collected demonstrations, and robot videos are publicly available at: https://nyu-robot-learning.github.io/dime" Holo-Dex: Teaching Dexterity with Immersive Mixed Reality,"Sridhar Pandian Arunachalam, Irmak Guzey, Soumith Chintala, Lerrel Pinto","New York University,Facebook AI Research",Learning for Grasping and Manipulation III,"A fundamental challenge in teaching robots is to provide an effective interface for human teachers to demonstrate useful skills to a robot. This challenge is exacerbated in dexterous manipulation, where teaching high-dimensional, contact-rich behaviors often require esoteric teleoperation tools. In this work, we present Holo-Dex, a framework for dexterous manipulation that places a teacher in an immersive mixed reality through commodity VR headsets. The high-fidelity hand pose estimator onboard the headset is used to teleoperate the robot and collect demonstrations for a variety of general-purpose dexterous tasks. Given these demonstrations, we use powerful feature learning combined with non-parametric imitation to train dexterous skills. 
Our experiments on six common dexterous tasks, including in-hand rotation, spinning, and bottle opening, indicate that Holo-Dex can both collect high-quality demonstration data and train skills in a matter of hours. Finally, we find that our trained skills can exhibit generalization on objects not seen in training. Videos of Holo-Dex are available on https://holo-dex.github.io/." Online Augmentation of Learned Grasp Sequence Policies for More Adaptable and Data-Efficient In-Hand Manipulation,"Ethan K. Gordon, Rana Soltani Zarrin","University of Washington,Honda Research Institute - USA",Learning for Grasping and Manipulation III,"When using a tool, the grasps used for picking it up, reposing, and holding it in a suitable pose for the desired task could be distinct. Therefore, a key challenge for autonomous in-hand tool manipulation is finding a sequence of grasps that facilitates every step of the tool use process while continuously maintaining force closure and stability. Due to the complexity of modeling the contact dynamics, reinforcement learning (RL) techniques can provide a solution in this continuous space subject to highly parameterized physical models. However, these techniques impose a trade-off in adaptability and data efficiency. At test time the tool properties, desired trajectory, and desired application forces could differ substantially from training scenarios. Adapting to this necessitates more data or computationally expensive online policy updates. In this work, we apply the principles of discrete dynamic programming (DP) to augment RL performance with domain knowledge. Specifically, we first design a computationally simple approximation of our environment. We then demonstrate in physical simulation that performing tree searches (i.e., lookaheads) and policy rollouts with this approximation can improve an RL-derived grasp sequence policy with minimal additional online computation. 
Additionally, we show that pretraining a deep RL network with the DP-derived solution to the discretized problem can speed up policy training." DeXtreme: Transfer of Agile In-Hand Manipulation from Simulation to Reality,"Ankur Handa, Arthur Allshire, Viktor Makoviichuk, Aleksei Petrenko, Ritvik Singh, Jingzhou Liu, Denys Makoviichuk, Karl Van Wyk, Zhurkevich Alexander, Balakumar Sundaralingam, Yashraj Narang, Jean-francois Lafleche, Dieter Fox, Gavriel State","NVidia,University of Toronto,NVIDIA,USC,University of Toronto, NVIDIA,Snap,NVIDIA Corporation,University of Washington",Learning for Grasping and Manipulation III,"Recent work has demonstrated the ability of deep reinforcement learning (RL) algorithms to learn complex robotic behaviours in simulation, including in the domain of multi-fingered manipulation. However, such models can be challenging to transfer to the real world due to the gap between simulation and reality. In this paper, we present our techniques to train a) a policy that can perform robust dexterous manipulation on a low-cost anthropomorphic robot hand and b) a robust pose estimator suitable for providing real-time reliable information on the state of the object being manipulated. Our policies are trained to adapt to a wide range of conditions in simulation. Consequently, our vision-based policies significantly outperform the best vision policies in the literature and are competitive with policies that are given privileged state information via motion capture systems. Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups, and in our case, with the Allegro Hand and Isaac Gym GPU-based simulation. Furthermore, it opens up the possibilities for researchers to achieve such results with commonly-available, lower-cost hands and cameras. 
Videos of the resulting policy and supplementary information, including experiments and demos can be found at https://dextreme.org/." Meta-Reinforcement Learning Via Language Instructions,"Zhenshan Bing, Alexander Koch, Xiangtong Yao, Kai Huang, Alois Knoll","Technical University of Munich,Sun Yat-sen University,Tech. Univ. Muenchen TUM",Learning for Grasping and Manipulation III,"Although deep reinforcement learning has recently been very successful at learning complex behaviors, it requires a tremendous amount of data to learn a task. One of the fundamental reasons causing this limitation lies in the nature of the trial-and-error learning paradigm of reinforcement learning, where the agent communicates with the environment and progresses in the learning only relying on the reward signal. This is implicit and rather insufficient to learn a task well. On the contrary, humans are usually taught new skills via natural language instructions. However, utilizing language instructions for robotic motion control to improve the adaptability is a recently emerged topic and challenging as well. In this paper, we present a meta-RL algorithm that addresses the challenge of learning skills with language instructions in multiple manipulation tasks. On the one hand, our algorithm utilizes the language instructions to shape its interpretation of the task, on the other hand, it still learns to solve task in a trial-and-error process. We evaluate our algorithm on the robotic manipulation benchmark (Meta-World) and it significantly outperforms state-of-the-art methods in terms of training and testing task success rates. Codes are available at https://tumi6robot.wixsite.com/million." 
Improving Video Super-Resolution with Long-Term Self-Exemplars,"Guotao Meng, Yue Wu, Qifeng Chen","HKUST,Hong Kong University of Science and Technology",Machine Learning for Perception,"Existing video super-resolution methods often utilize a few neighboring frames to generate a higher-resolution image for each frame. However, the abundant information in distant frames has not been fully exploited in these methods: corresponding patches of the same instance appear across distant frames at different scales. Based on this observation, we propose to improve the video super-resolution quality with long-term cross-scale aggregation that leverages similar patches (self-exemplars) across distant frames. Our method can be implemented as a post processing for any super resolution methods to improve the performance. Our model consists of a multi-reference alignment module to fuse the features derived from similar patches: we fuse the features of distant references to perform high-quality super-resolution. We also propose a novel and practical training strategy for reference-based super-resolution. To evaluate the performance of our proposed method, we conduct extensive experiments on our collected CarCam dataset, the Waymo Open dataset, and the REDS dataset, and the results demonstrate our method outperforms state-of-the-art methods." Learning-Based Relational Object Matching across Views,"Cathrin Elich, Iro Armeni, Martin R. Oswald, Marc Pollefeys, Joerg Stueckler","Max Planck Institute for Intelligent Systems,ETH Zurich",Machine Learning for Perception,"Intelligent robots require object-level scene understanding to reason about possible tasks and interactions with the environment. Moreover, many perception tasks such as scene reconstruction, image retrieval, or place recognition can benefit from reasoning on the level of objects. 
While keypoint-based matching can yield strong results for finding correspondences for images with small to medium view point changes, for large view point changes, matching semantically on the object-level becomes advantageous. In this paper, we propose a learning-based approach which combines local keypoints with novel object-level features for matching object detections between RGB images. We train our object-level matching features based on appearance and inter-frame and cross-frame spatial relations between objects in an associative graph neural network. We demonstrate our approach in a large variety of views on realistically rendered synthetic images. Our approach compares favorably to previous state-of-the-art object-level matching approaches and achieves improved performance over a pure keypoint-based approach for large view-point changes." TransVisDrone: Spatio-Temporal Transformer for Vision-Based Drone-To-Drone Detection in Aerial Videos,"Tushar Bharat Sangam, Ishan Rajendrakumar Dave, Waqas Sultani, Mubarak Shah","University of Central Florida,Informational Technology University",Machine Learning for Perception,"Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones. However, existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices. In this work, we propose a simple yet effective framework, TransVisDrone, that provides an end-to-end solution with higher computational efficiency. We utilize CSPDarkNet-53 network to learn object-related spatial features and VideoSwin model to improve drone detection in challenging scenarios by learning spatio-temporal dependencies of drone motion. 
Our method achieves state-of-the-art performance on three challenging real-world datasets (Average Precision@0.5IoU): NPS 0.95, FLDrones 0.75, and AOT 0.80, and a higher throughput than previous methods. We also demonstrate its deployment capability on edge devices and its usefulness in detecting drone-collision (encounter). Project: https://tusharsangam.github.io/TransVisDrone-project-page/" Unsupervised RGB-To-Thermal Domain Adaptation Via Multi-Domain Attention Network,"Lu Gan, Connor Lee, Soon-Jo Chung","California Institute of Technology,Caltech",Machine Learning for Perception,"This work presents a new method for unsupervised thermal image classification and semantic segmentation by transferring knowledge from the RGB domain using a multi-domain attention network. Our method does not require any thermal annotations or co-registered RGB-thermal pairs, enabling robots to perform visual tasks at night and in adverse weather conditions without incurring additional costs of data labeling and registration. Current unsupervised domain adaptation methods look to align global images or features across domains. However, when the domain shift is significantly larger for cross-modal data, not all features can be transferred. We solve this problem by using a shared backbone network that promotes generalization, and domain-specific attention that reduces negative transfer by attending to domain-invariant and easily-transferable features. Our approach outperforms the state-of-the-art RGB-to-thermal adaptation method in classification benchmarks, and is successfully applied to thermal river scene segmentation using only synthetic RGB images. Our code is made publicly available at https://github.com/ganlumomo/thermal-uda-attention." 
Adaptive-SpikeNet: Event-Based Optical Flow Estimation Using Spiking Neural Networks with Learnable Neuronal Dynamics,"Adarsh Kosta, Kaushik Roy",Purdue University,Machine Learning for Perception,"Event-based cameras have recently shown great potential for high-speed motion estimation owing to their ability to capture temporally rich information asynchronously. Spiking Neural Networks (SNNs), with their neuro-inspired event-driven processing can efficiently handle such asynchronous data, while neuron models such as the leaky-integrate and fire (LIF) can keep track of the quintessential timing information contained in the inputs. SNNs achieve this by maintaining a dynamic state in the neuron memory, retaining important information while forgetting redundant data over time. Thus, we posit that SNNs would allow for better performance on sequential regression tasks compared to similarly sized Analog Neural Networks (ANNs). However, deep SNNs are difficult to train due to vanishing spikes at later layers. To that effect, we propose an adaptive fully-spiking framework with learnable neuronal dynamics to alleviate the spike vanishing problem. We utilize surrogate gradient-based backpropagation through time (BPTT) to train our deep SNNs from scratch. We validate our approach for the task of optical flow estimation on the Multi-Vehicle Stereo Event-Camera (MVSEC) dataset and the DSEC-Flow dataset. Our experiments on these datasets show an average reduction of ~13% in average endpoint error (AEE) compared to state-of-the-art ANNs. We also explore several down-scaled models and observe that our SNN models consistently outperform similarly sized ANNs offering ~10%-16% lower AEE. These results demonstrate the importance of SNNs for smaller models and their suitability at the edge. 
In terms of efficiency, our SNNs offer substantial savings in network parameters ~48x and computational energy ~10.2x while attaining 10% lower EPE compared to the state-of-the-art ANN implementations." Reinforced Learning for Label-Efficient 3D Face Reconstruction,"Hoda Mohaghegh, Hossein Rahmani, Hamid Laga, Farid Boussaid, Mohammed Bennamoun","University of Western Australia,Lancaster University,Murdoch University,The University of Western Australia,UWA",Machine Learning for Perception,"3D face reconstruction plays a major role in many human-robot interaction systems, from automatic face authentication to human-computer interface-based entertainment. To improve robustness against occlusions and noise, 3D face reconstruction networks are often trained on a set of in-the-wild face images preferably captured along different viewpoints of the subject. However, collecting the required large amounts of 3D annotated face data is expensive and time-consuming. To address the high annotation cost and due to the importance of training on a useful set, we propose an Active Learning (AL) framework that actively selects the most informative and representative samples to be labeled. To the best of our knowledge, this paper is the first work on tackling active learning for 3D face reconstruction to enable a label-efficient training strategy. In particular, we propose a Reinforcement Active Learning approach in conjunction with a clustering-based pooling strategy to select informative view-points of the subjects. Experimental results on 300W-LP and AFLW2000 datasets demonstrate that our proposed method is able to 1) efficiently select the most influencing view-points for labeling and outperforms several baseline AL techniques and 2) further improve the performance of a 3D Face Reconstruction network trained on the full dataset." 
Bridging the Domain Gap for Multi-Agent Perception,"Runsheng Xu, Jinlong Li, Xiaoyu Dong, Hongkai Yu, Jiaqi Ma","UCLA,cleveland state university,Northwestern University,Cleveland State University,University of California, Los Angeles",Machine Learning for Perception,"Existing multi-agent perception algorithms usually select to share deep neural features extracted from raw sensing data between agents, achieving a trade-off between accuracy and communication bandwidth limit. However, these methods assume all agents have identical neural networks, which might not be practical in the real world. The transmitted features can have a large domain gap when the models differ, leading to a dramatic performance drop in multi-agent perception. In this paper, we propose the first lightweight framework to bridge such domain gaps for multi-agent perception, which can be a plug-in module for most of the existing systems while maintaining confidentiality. Our framework consists of a learnable feature resizer to align features in multiple dimensions and a sparse cross-domain transformer for domain adaption. Extensive experiments on the public multi-agent perception dataset V2XSet have demonstrated that our method can effectively bridge the gap for features from different domains and outperform other baseline methods significantly by at least 8% for point-cloud-based 3D object detection." UPLIFT: Unsupervised Person Labeling and Identification Via Cooperative Learning with Mobile Robots,"Yu-chee Tseng, Ting-Yuan Ke, Fang-jing Wu","National Yang Ming Chiao Tung University,TU Dortmund University",Machine Learning for Perception,"As robots are widely used in assisting manual tasks, an interesting challenge is: Can mobile robots help create a labeled knowledge dataset that can be used for efficiently creating deep learning models for other sensors? 
This paper proposes an Unsupervised Person Labeling and Identification (UPLIFT) framework to automatically enlarge the labeled knowledge dataset. Typically, manual data labeling is very costly, especially when the user population is large and dynamic. To reduce the cost, we use a mobile robot to serve as a knowledge seed and to provide the pseudo-ground-truth for the system so that unlabeled images from other fixed surveillance cameras can be paired with the pseudo-ground-truth. Ultimately, the knowledge dataset can be generated via a system-to-system knowledge transfer process from the former to the latter and gradually expanded as the system operates longer. Experimental results in two environments indicate that UPLIFT achieves an accuracy of 94.1% on average to detect pedestrians’ IDs every 10 seconds." Learning to Explore Informative Trajectories and Samples for Embodied Perception,"Ya Jing, Tao Kong",ByteDance,Machine Learning for Perception,"We are witnessing significant progress on perception models, specifically those trained on large-scale internet images. However, efficiently generalizing these perception models to unseen embodied tasks is insufficiently studied, which will help various relevant applications (e.g., home robots). Unlike static perception methods trained on pre-collected images, the embodied agent can move around in the environment and obtain images of objects from any viewpoints. Therefore, efficiently learning the exploration policy and collection method to gather informative training samples is the key to this task. To do this, we first build a 3D semantic distribution map to train the exploration policy self-supervised by introducing the semantic distribution disagreement and the semantic distribution uncertainty rewards. Note that the map is generated from multi-view observations and can weaken the impact of misidentification from an unfamiliar viewpoint. 
Our agent is then encouraged to explore the objects with different semantic distributions across viewpoints, or uncertain semantic distributions. With the explored informative trajectories, we propose to select hard samples on trajectories based on the semantic distribution uncertainty to reduce unnecessary observations that can be correctly identified. Experiments show that the perception model fine-tuned with our method outperforms the baselines trained with other exploration policies. Further, we demonstrate the robustness of our method in real-robot experiments." Embodied Agents for Efficient Exploration and Smart Scene Description,"Roberto Bigazzi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara","University of Modena and Reggio Emilia,Università degli Studi di Modena e Reggio Emilia",Machine Learning for Perception,"The development of embodied agents that can communicate with humans in natural language has gained increasing interest over the last years, as it facilitates the diffusion of robotic platforms in human-populated environments. As a step towards this objective, in this work, we tackle a setting for visual navigation in which an autonomous agent needs to explore and map an unseen indoor environment while portraying interesting scenes with natural language descriptions. To this end, we propose and evaluate an approach that combines recent advances in visual robotic exploration and image captioning on images generated through agent-environment interaction. Our approach can generate smart scene descriptions that maximize semantic knowledge of the environment and avoid repetitions. Further, such descriptions offer user-understandable insights into the robot’s representation of the environment by highlighting the prominent objects and the correlation between them as encountered during the exploration. 
To quantitatively assess the performance of the proposed approach, we also devise a specific score that takes into account both exploration and description skills. The experiments carried out on both photorealistic simulated environments and real-world ones demonstrate that our approach can effectively describe the robot’s point of view during exploration, improving the human-friendly interpretability of its observations." Deep Neural Network Architecture Search for Accurate Visual Pose Estimation Aboard Nano-UAVs,"Elia Cereda, Luca Crupi, Matteo Risso, Alessio Burrello, Luca Benini, Alessandro Giusti, Daniele Jahier Pagliari, Daniele Palossi","IDSIA USI-SUPSI,Politecnico di Torino,Università di Bologna,University of Bologna,IDSIA Lugano, SUPSI,ETH Zurich",Machine Learning for Perception,"Miniaturized autonomous unmanned aerial vehicles (UAVs) are an emerging and trending topic. With their form factor as big as the palm of one hand, they can reach spots otherwise inaccessible to bigger robots and safely operate in human surroundings. The simple electronics aboard such robots (sub-100mW) make them particularly cheap and attractive but pose significant challenges in enabling onboard sophisticated intelligence. In this work, we leverage a novel neural architecture search (NAS) technique to automatically identify several Pareto-optimal convolutional neural networks (CNNs) for a visual pose estimation task. Our work demonstrates how real-life and field-tested robotics applications can concretely leverage NAS technologies to automatically and efficiently optimize CNNs for the specific hardware constraints of small UAVs. We deploy several NAS-optimized CNNs and run them in closed-loop aboard a 27-g Crazyflie nano-UAV equipped with a parallel ultra-low power System-on-Chip. 
Our results improve the State-of-the-Art by reducing the in-field control error by 32% while achieving a real-time onboard inference-rate of [email protected] and [email protected]" Reuse Your Features: Unifying Retrieval and Feature-Metric Alignment,"Javier Morlana, Jose M M Montiel","Universidad de Zaragoza,I,A. Universidad de Zaragoza",Machine Learning for Perception,"We propose a compact pipeline to unify all the steps of Visual Localization: image retrieval, candidate re-ranking and initial pose estimation, and camera pose refinement. Our key assumption is that the deep features used for these individual tasks share common characteristics, so we should reuse them in all the procedures of the pipeline. Our DRAN (Deep Retrieval and image Alignment Network) is able to extract global descriptors for efficient image retrieval, use intermediate hierarchical features to re-rank the retrieval list and produce an initial pose guess, which is finally refined by means of a feature-metric optimization based on learned deep multi-scale dense features. DRAN is the first single network able to produce the features for the three steps of visual localization. DRAN achieves competitive performance in terms of robustness and accuracy under challenging conditions in public benchmarks, outperforming other unified approaches and consuming lower computational and memory cost than its counterparts using multiple networks. Code and models will be publicly available at github.com/jmorlana/DRAN." FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions from Single Panoramas,"Bruno Berenguel-Baeta, Jesús Bermúdez, Josechu Guerrero","University of Zaragoza,Universidad de Zaragoza",Deep Learning for Visual Perception I,"In this work we present FreDSNet, a deep learning solution which obtains semantic 3D understanding of indoor environments from single panoramas. 
Omnidirectional images reveal task-specific advantages when addressing scene understanding problems due to the 360-degree contextual information about the entire environment they provide. However, the inherent characteristics of the omnidirectional images add additional problems to obtain an accurate detection and segmentation of objects or a good depth estimation. To overcome these problems, we exploit convolutions in the frequential domain obtaining a wider receptive field in each convolutional layer. These convolutions make it possible to leverage the whole context information from omnidirectional images. FreDSNet is the first network that jointly provides monocular depth estimation and semantic segmentation from a single panoramic image exploiting fast Fourier convolutions. Our experiments show that FreDSNet has slightly better performance than the sole state-of-the-art method that obtains both semantic segmentation and depth estimation from panoramas. FreDSNet code is publicly available at https://github.com/Sbrunoberenguel/FreDSNet" CAHIR: Co-Attentive Hierarchical Image Representations for Visual Place Recognition,"Guohao Peng, Heshan Li, Yifeng Huang, Jun Zhang, Mingxing Wen, Singh Rahul, Danwei Wang","Nanyang Technological University,Continental Automotive Singapore Pte Ltd",Deep Learning for Visual Perception I,"Robust visual place recognition (VPR) against significant appearance changes is crucial for the life-long operation of mobile robots. Focusing on this task, we propose a Co-Attentive Hierarchical Image Representations (CAHIR) framework for VPR, which unifies attention-sharing global and local descriptor generation into one encoding pipeline. The hierarchical descriptors are applied to a coarse-to-fine VPR system with global retrieval and local geometric verification. 
To explore high-quality local matches between task-relevant visual elements, a cross-attention mutual enhancement layer is introduced to strengthen the information interaction between the local descriptors. Through the proposed selective matching distillation, the mutual enhancement layer can learn from state-of-the-art local matchers in a distillation manner. After weighted cross-matching of the enhanced local descriptors, geometric verification is applied to evaluate the spatial consistency of the compared image pair. Experiments show CAHIR outperforms the existing global and local representations for VPR in terms of performance and efficiency. Quantitatively, it achieves state-of-the-art results on three city-scale benchmark datasets. Qualitatively, CAHIR proves to attach great importance to task-relevant visual elements and excels at finding local correspondences that are discriminative to the VPR task." Monocular Visual-Inertial Depth Estimation,"Diana Wofk, Rene Ranftl, Matthias Mueller, Vladlen Koltun","Intel,Intel Labs",Deep Learning for Visual Perception I,"We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry to produce dense depth estimates with metric scale. Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment. We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in inverse RMSE with dense scale alignment relative to performing just global alignment alone. Our approach is especially competitive at low density; with just 150 sparse metric depth points, our dense-to-dense depth alignment method achieves over 50% lower iRMSE over sparse-to-dense depth completion by KBNet, currently the state of the art on VOID. We demonstrate successful zero-shot transfer from synthetic TartanAir to real-world VOID data and perform generalization tests on NYUv2 and VCU-RVI. 
Our approach is modular and is compatible with a variety of monocular depth estimation models." KGNet: Knowledge-Guided Networks for Category-Level 6D Object Pose and Size Estimation,"Qiwei Meng, Jianjun Gu, Shiqiang Zhu, Jianfeng Liao, Tianlei Jin, Fangtai Guo, Wen Wang, Wei Song",Zhejiang Lab,Deep Learning for Visual Perception I,"Despite the giant leap made in object 6D pose estimation and robotic grasping under structured scenarios, most approaches depend heavily on the exact CAD models of target objects beforehand, thereby limiting their wide applications. To address this, we propose a novel knowledge-guided network - KGNet to estimate the pose and size of category-level unseen objects. This network includes three primary innovations: knowledge-guided categorical model generation, pointwise deformation probability matrix and synergetic RGBD feature fusion, with the former two leveraging categorical object knowledge for unseen object reconstruction and the latter one facilitating pose-sensitive feature extraction. Extensive experiments on CAMERA25 and REAL275 verify their effectiveness, and KGNet achieves the SOTA performance on these two acknowledged benchmarks. Additionally, a real-world robotic grasping experiment is conducted, and its results further qualitatively prove the practicability and robustness of KGNet." Online Consistent Video Depth with Gaussian Mixture Representation,"Chao Liu, Benjamin Eckart, Jan Kautz",NVIDIA,Deep Learning for Visual Perception I,"We demonstrate how off-the-shelf single-image depth estimation methods can be augmented with guidance from optical flow to achieve consistent and accurate online depth estimation using video sequences of static scenes. 
While previous work has successfully leveraged the complementary nature of optical flow and depth estimation, these techniques use computationally expensive test time optimization strategies that do not generalize beyond a single video sequence and also require knowledge of the future. In contrast, we present a computationally efficient feed-forward design that runs in an online fashion by utilizing learned data priors from previously seen video sequences. To accomplish this, we propose a continuous geometric scene representation that parametrically and compositionally represents the scene as a Gaussian Mixture Model (GMM). Based on this representation, our pipeline learns to estimate consistent depths and associated camera poses from video sequences of static scenes without direct supervision. Our online method achieves state-of-the-art results compared against offline methods that require all sequence frames." Deep Masked Graph Matching for Correspondence Identification in Collaborative Perception,"Peng Gao, Qingzhao Zhu, Hongsheng Lu, Chuang Gan, Hao Zhang","University of Maryland, College Park,Colorado School of Mines,Toyota Motor North America,IBM",Deep Learning for Visual Perception I,"Correspondence identification (CoID) is an essential component for collaborative perception in multi-robot systems, such as connected autonomous vehicles. The goal of CoID is to identify the correspondence of objects observed by multiple robots in their own field of view in order for robots to consistently refer to the same objects. CoID is challenging due to perceptual aliasing, object non-covisibility, and noisy sensing. In this paper, we introduce a novel deep masked graph matching approach to enable CoID and address the challenges. Our approach formulates CoID as a graph matching problem and we design a masked neural network to integrate the multimodal visual, spatial, and GPS information to perform CoID. 
In addition, we design a new technique to explicitly address object non-covisibility caused by occlusion and the vehicle's limited field of view. We evaluate our approach in a variety of street environments using a high-fidelity simulation that integrates the CARLA and SUMO simulators. The experimental results show that our approach outperforms the previous approaches and achieves state-of-the-art CoID performance in connected autonomous driving applications. Our work is available at: https://github.com/gaopeng5/DMGM.git." Operative Action Captioning for Estimating System Actions,"Taiki Nakamura, Seiya Kawano, Akishige Yuguchi, Yasutomo Kawanishi, Koichiro Yoshino","The University of Tokyo,RIKEN,Institute of Physical and Chemical Research (RIKEN)",Deep Learning for Visual Perception I,"Human-assistive systems, such as robots, need to correctly understand the surrounding situation based on observations and output the required support actions for humans. Language is one of the important channels to communicate with humans, and robots are required to have the ability to express their understanding and action-planning results. In this study, we propose a new task of operative action captioning that estimates and verbalizes the actions to be taken by the system in a human-assisting domain. We constructed a system that outputs a verbal description of a possible operative action that changes the current state to the given target state. We collected a dataset consisting of two images as observations, which express the current state and the state changed by actions and a caption that describes the actions that change the current state to the target state, by crowdsourcing in daily life situations. Then we constructed a system that estimates an operative action by a caption. 
Since the operative action's caption is expected to contain some state-changing actions, we use scene graph prediction as an auxiliary task because the events written in the scene graphs correspond to the state changes. Experimental results showed that our system successfully described the operative actions that should be conducted between the current and target states. The auxiliary tasks that predict the scene graphs improved the quality of the estimation results." Deep Unsupervised Visual Odometry Via Bundle Adjusted Pose Graph Optimization,Guoyu Lu,University of Georgia,Deep Learning for Visual Perception I,"Unsupervised visual odometry as an active topic has attracted extensive attention, benefiting from its label-free practical value and robustness in real-world scenarios. However, the performance of camera pose estimation and tracking through deep neural network is still not as ideal as most other tasks, such as detection, segmentation and depth estimation, due to the lack of drift correction in the estimated trajectory and map optimization in the recovered 3D scenes. In this work, we introduce pose graph and bundle adjustment optimization to our network training process, which iteratively updates both the motion and depth estimations from the deep learning network, and enforces the refined outputs to further meet the unsupervised photometric and geometric constraints. The integration of pose graph and bundle adjustment is easy to implement and significantly enhances the training effectiveness. Experiments on KITTI dataset demonstrate that the introduced method achieves a significant improvement in motion estimation compared with other recent unsupervised monocular visual odometry algorithms." 
Pose Relation Transformer : Refine Occlusions for Human Pose Estimation,"Hyung-gun Chi, Seunggeun Chi, Stanley Chan, Karthik Ramani","Purdue University,Purdue",Deep Learning for Visual Perception I,"Accurately estimating the human pose is an essential task for many applications in robotics. However, existing pose estimation methods suffer from poor performance when occlusion occurs. Recent advances in NLP have been very successful in predicting the missing words conditioned on visible words. We draw upon the sentence completion analogy in NLP to guide our model to address occlusions in the pose estimation problem. We propose a novel approach that can mitigate the effect of occlusions that is motivated by the sentence completion task of NLP. In an analogous manner, we designed our model to reconstruct occluded joints given the visible joints utilizing joint correlations by capturing the implicit joint occlusions. Our proposed POse Relation Transformer (PORT) captures the global context of the pose using self-attention and a local context by aggregating adjacent joint features. To train PORT to learn joint correlations, we guide PORT to reconstruct randomly masked joints, which we call Masked Joint Modeling (MJM). PORT trained with MJM adds to existing keypoint detection methods and successfully refines occlusions. Notably, PORT is a model-agnostic plug-and-play module for pose refinement under occlusion that can be plugged into any keypoint detector with substantially low computational costs. We conducted extensive experiments to demonstrate that PORT mitigates the occlusion effects on the hand and body pose estimation. Strikingly, PORT improves the pose estimation accuracy of existing human pose estimation methods by up to 16% with only 5% of additional parameters. The code is publicly available at https://github.com/stnoah1/PORT." 
Question Generation for Uncertainty Elimination of Referring Expression in 3D Environment,"Fumiya Matsuzawa, Yue Qiu, Kenji Iwata, Hirokatsu Kataoka, Yutaka Satoh","National Institute of Advanced Industrial Science and Technology,AIST",Deep Learning for Visual Perception I,"We introduce a new task of question generation to eliminate the uncertainty of referring expressions in 3D indoor environments (3D-REQ). Referring to an object using natural language is one of the most common occurrences in daily human conversations; therefore, instructing robots to identify a certain object using natural language could be an essential task in various robotic applications, such as room arrangement. However, human instructions are sometimes uncertain. Existing research on visual grounding using natural language in a 3D environment assumes that the referring expression can uniquely identify the object and does not consider that humans unconsciously give uncertain expressions. When faced with uncertainties, humans ask questions to gain further information. Inspired by the above observation, we propose a method that reduces uncertainty by asking questions when being given an obscure referring expression. The purpose of this method is to predict the positions of all candidate objects that satisfy the referring expressions in a 3D indoor environment and then to ask the appropriate questions to narrow down the target objects from them. To achieve this, we constructed a new 3D-REQ dataset, the input of which is a referring expression with uncertainties in the 3D environment and point clouds, and the output of which is the bounding boxes of all candidate objects satisfying the referring expression and a question to eliminate the uncertainty. To the best of our knowledge, 3D-REQ is the first effort to eliminate the uncertainty of referring expressions for object grounding in 3D environments." 
A New Efficient Eye Gaze Tracker for Robotic Applications,"Chaitanya Bandi, Ulrike Thomas",Chemnitz University of Technology,Deep Learning for Visual Perception I,"Gaze estimation provides insight into a person’s intent and engagement level, which is helpful in collaborative human-robot applications. With significant advancements in deep learning architectures, appearance-based gaze estimation has gained much attention. Appearance-based methods have shown significant improvement in gaze accuracy and, unlike traditional approaches, they function well in environments where there are no constraints. We present another convolution-based gaze estimation approach to further reduce the angular error. For estimating gaze under extreme conditions such as head variations and distances, full-face images have been shown to be efficient, so we rely on full-face and pay more attention to necessary features. With the proposed architecture, we achieve an accuracy of 3.75° on the MPIIFaceGaze dataset and 3.96° on the ETH-XGaze open-source dataset. In addition, we test eye gaze tracking in real-time robotic applications, such as attention detection, and pick-and-place." A Deep Learning Human Activity Recognition Framework for Socially Assistive Robots to Support Reablement of Older Adults,"Fraser Robinson, Goldie Nejat",University of Toronto,Deep Learning for Visual Perception I,"Many older adults prefer to stay in their own homes and age-in-place. However, physical and cognitive limitations in independently completing activities of daily living (ADLs) requires older adults to receive assistive support, often necessitating transitioning to care centers. In this paper, we present the development of a novel deep learning human activity recognition and classification architecture capable of autonomously identifying ADLs in home environments to enable long-term deployment of socially assistive robots to aid older adults. 
Our deep learning architecture is the first to use multimodal inputs to create an embedding vector approach for classifying and monitoring multiple ADLs. It uses spatial mid-fusion to combine geometric, motion and semantic features of users, environments, and objects to classify and track ADLs. We leverage transfer learning to extract generic features using the early layers of deep networks trained on large datasets to apply our architecture to various ADLs. The embedding vector enables identification of unseen ADLs and determines intra-class variance for monitoring user ADL performance. Our proposed unique architecture can be used by socially assistive robots to promote reablement in the home via autonomously supporting the assistance of varying ADLs. Extensive experiments show improved classification accuracy compared to unimodal/dual-modal models and the ADL embedding space also incorporates the ability to distinctly identify and track seen and unseen ADLs." FloorplanNet: Learning Topometric Floorplan Matching for Robot Localization,"Delin Feng, Zhenpeng He, Jiawei Hou, Soeren Schwertfeger, Liangjun Zhang","ShanghaiTech University,Baidu",Localization and Mapping III,"Given a building floorplan, humans can localize themselves by matching the observation of the environment with the floorplan using geometric, semantic, and topological clues. Inspired by this insight, this paper proposes a learning-based topometric robot localization method FloorplanNet, which implements a match between a metric robot map and the potentially inaccurate building floorplan in nonuniform scales and different shapes by semantic information. The method uses a novel Graph Neural Network to learn descriptors of nodes from topometric graphs generated from the input maps. We demonstrate that our method can match the 3D point cloud sub-map generated by the robot during the SLAM process with the 2D map. Furthermore, we apply our map-matching algorithm for real-world robot localization. 
We evaluate our method on several publicly available real-world datasets. Even though our network is solely trained using simulation data, our method demonstrates high robustness and effectiveness in real-world indoor environments and outperforms the existing SOTA map-matching algorithms. We further develop a simulator that automatically creates and annotates the required training data to train our neural networks. The method and simulator are released at: https://github.com/fengdelin/FloorplanNet.git" MOFT: Monocular Odometry Based on Deep Depth and Careful Feature Selection and Tracking,"Karlo Koledic, Igor Cvišsić, Ivan Markovic, Ivan Petrovic","University of Zagreb,University of Zagreb, Faculty of Electrical Engineering and Comp,University of Zagreb Faculty of Electrical Engineering and Computing",Localization and Mapping III,"Autonomous localization in unknown environments is a fundamental problem in many emerging fields and the monocular visual approach offers many advantages, due to being a rich source of information and avoiding comparatively more complicated setups and multisensor calibration. Deep learning opened new venues for monocular odometry yielding not only end-to-end approaches but also hybrid methods combining the well studied geometry with specific deep components. In this paper we propose a monocular odometry that leverages deep depth within a feature based geometrical framework yielding a lightweight frame-to-frame approach with metrically scaled trajectories and state-of-the-art accuracy. The front-end is based on a multihypothesis matcher with perspective correction coupled with deep depth predictions that enables careful feature selection and tracking; especially of ground plane features that are suitable for translation estimation. 
The back-end is based on point-to-epipolar line minimization for rotation and unit translation estimation, followed by deep depth aided reprojection error minimization for metrically correct translation estimation. Furthermore, we also present a domain shift adaptation approach that allows for generalization over different camera intrinsic and extrinsic setups. The proposed approach is evaluated on the KITTI and KITTI-360 datasets, showing competitive results and in most cases outperforming other state-of-the-art stereo and monocular methods." LGCNet: Feature Enhancement and Consistency Learning Based on Local and Global Coherence Network for Correspondence Selection,"Tzu-Han Wu, Kuan-Wen Chen",National Yang Ming Chiao Tung University,Localization and Mapping III,"Correspondence selection, a crucial step in many computer vision tasks, aims to distinguish between inliers and outliers from putative correspondences. The coherence of correspondences is often used for predicting inlier probability, but it is difficult for neural networks to extract coherence contexts based only on quadruple coordinates. To overcome this difficulty, we propose enhancing the preliminary features using local and global handcrafted coherent characteristics before model learning, which strengthens the discrimination of each correspondence and guides the model to prune obvious outliers. Furthermore, to fully utilize local information, neighbors are searched in coordinate space as well as feature space. These two kinds of neighbors provide complementary and plentiful contexts for inlier probability prediction. Finally, a novel neighbor representation and a fusion architecture are proposed to retain detailed features. Experiments demonstrate that our method achieves state-of-the-art performance on relative camera pose estimation and correspondence selection metrics on the outdoor YFCC100M and the indoor SUN3D datasets." 
Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors,"Hao Dong, Xieyuanli Chen, Mihai Dusmanu, Viktor Larsson, Marc Pollefeys, Cyrill Stachniss","ETH Zürich,National University of Defense Technology,ETH Zurich,Lund University,University of Bonn",Localization and Mapping III,"A distinctive representation of image patches in form of features is a key component of many computer vision and robotics tasks, such as image matching, image retrieval, and visual localization. State-of-the-art descriptors, from hand-crafted descriptors such as SIFT to learned ones such as HardNet, are usually high-dimensional; 128 dimensions or even more. The higher the dimensionality, the larger the memory consumption and computational time for approaches using such descriptors. In this paper, we investigate multi-layer perceptrons (MLPs) to extract low-dimensional but high-quality descriptors. We thoroughly analyze our method in unsupervised, self-supervised, and supervised settings, and evaluate the dimensionality reduction results on four representative descriptors. We consider different applications, including visual localization, patch verification, image matching and retrieval. The experiments show that our lightweight MLPs trained using a supervised method achieve better dimensionality reduction than PCA. The lower-dimensional descriptors generated by our approach outperform the original higher-dimensional descriptors in downstream tasks, especially for the hand-crafted ones. The code is available at https://github.com/PRBonn/descriptor-dr." 
Online Visual SLAM Adaptation against Catastrophic Forgetting with Cycle-Consistent Contrastive Learning,"Sangni Xu, Hao Xiong, Qiuxia Wu, Tingting Yao, Zhihui Wang, Zhiyong Wang","South China University of Technology,Macquarie University,Dalian Maritime University,Dalian University of Technology,The University of Sydney",Localization and Mapping III,"Visual SLAM (Simultaneous Localisation and Mapping) aims to simultaneously estimate camera poses and depth maps from captured navigation videos. While recent deep learning based methods have achieved great success on this task, they tend to work well on source domain data and suffer from performance degradation on unseen data from the target domain. Hence, we propose an online adaptation approach to continuously adapt a pre-trained visual SLAM model to changing environments in a self-supervised manner. To preserve pre-learned knowledge against catastrophic forgetting, we update a proposed novel adapter rather than fine-tuning the whole model for adaptation. The adapter includes a cross-domain feature translation module that translates pre-learned features into translated features suitable for adaptation. Ideally, the translated new features should not only contain pre-learned knowledge but also be substantially distinct from pre-learned features since these two features represent different domains. We thus introduce cycle-consistent contrastive learning to maximize the dissimilarity between these two features by enlarging the distance between them in the feature space. Besides, our contrastive learning method exploiting a cycle-consistency constraint enables the translated features to be transferred back to the pre-learned ones, which helps the translated features better preserve pre-learned knowledge." 
SLAMER: Simultaneous Localization and Map-Assisted Environment Recognition,Naoki Akai,Nagoya University,Localization and Mapping III,"This paper presents a simultaneous localization and map-assisted environment recognition (SLAMER) method. Mobile robots usually have an environment map and environment information can be assigned to the map. Important information such as no-entry zones can be predicted from the map if localization has succeeded. However, this prediction fails when localization does not work. The uncertainty of the pose estimate must be considered for robust map-based environmental object prediction. Robots also have external sensors and can recognize environmental objects; however, sensor-based recognition of course contains uncertainty. SLAMER fuses map-based prediction and sensor-based recognition while coping with these uncertainties and achieves accurate localization and environment recognition. In this paper, we demonstrate a LiDAR-based implementation of SLAMER in two cases. In the first case, we use the SemanticKITTI dataset and show that SLAMER achieves more accurate estimates than traditional methods. In the second case, we use an indoor mobile robot and show that unmeasurable environmental objects such as open doors and no-entry lines can be recognized." Descriptor Distillation for Efficient Multi-Robot SLAM,"Xiyue Guo, Junjie Hu, Hujun Bao, Guofeng Zhang","Zhejiang University,The Chinese University of Hong Kong, Shenzhen",Localization and Mapping III,"Performing accurate localization while maintaining the low-level communication bandwidth is an essential challenge of multi-robot simultaneous localization and mapping (MR-SLAM). In this paper, we tackle this problem by generating a compact yet discriminative feature descriptor with minimum inference time. We propose descriptor distillation that formulates the descriptor generation into a learning problem under the teacher-student framework. 
To achieve real-time descriptor generation, we design a compact student network and learn it by transferring the knowledge from a pre-trained large teacher model. To reduce the descriptor dimensions from the teacher to the student, we propose a novel loss function that enables the knowledge transfer between two different dimensional descriptors. The experimental results demonstrate that our model is 30% lighter than the state-of-the-art model and produces better descriptors in patch matching. Moreover, we build an MR-SLAM system based on the proposed method and show that our descriptor distillation can achieve higher localization performance for MR-SLAM with lower bandwidth." DS-K3DOM: 3-D Dynamic Occupancy Mapping with Kernel Inference and Dempster-Shafer Evidential Theory,"Juyeop Han, Youngjae Min, Hyeok-Joo Chae, Byeongmin Jeong, Han-Lim Choi","Korea Advanced Institute of Science and Technology,Massachusetts Institute of Technology,KAIST",Localization and Mapping III,"Occupancy mapping has been widely utilized to represent the surroundings for autonomous robots to perform tasks such as navigation and manipulation. While occupancy mapping in 2-D environments has been well-studied, there have been few approaches suitable for 3-D dynamic occupancy mapping which is essential for aerial robots. This paper presents a novel 3-D dynamic occupancy mapping algorithm called DS-K3DOM. We first establish a Bayesian method to sequentially update occupancy maps for a stream of measurements based on the random finite set theory. Then, we approximate it with particles in the Dempster-Shafer domain to enable real-time computation. Moreover, the algorithm applies kernel-based inference with Dirichlet basic belief assignment to enable dense mapping from sparse measurements. The efficacy of the proposed algorithm is demonstrated through simulations and real experiments." 
Monocular Visual-Inertial Odometry with Planar Regularities,"Chuchu Chen, Patrick Geneva, Yuxiang Peng, Woosik Lee, Guoquan Huang",University of Delaware,Localization and Mapping III,"State-of-the-art monocular visual-inertial odometry (VIO) approaches rely on sparse point features in part due to their efficiency, robustness, and prevalence, while ignoring high-level structural regularities such as planes that are common to man-made environments and can be exploited to further constrain motion. Generally, planes can be observed by a camera for significant periods of time due to their large spatial presence and thus, are amenable for long-term navigation. Therefore, in this paper, we design a novel real-time monocular VIO system that is fully regularized by planar features within a lightweight multi-state constraint Kalman filter (MSCKF). At the core of our method is an efficient robust monocular-based plane detection algorithm, which does not require additional sensing modalities such as a stereo or depth camera as commonly seen in the literature, while enabling real-time regularization of point features to environmental planes. Specifically, in the proposed MSCKF, long-lived planes are maintained in the state vector, while shorter ones are marginalized after use for efficiency. Planar regularities are applied to both in-state SLAM features and out-of-state MSCKF features, thus fully exploiting the environmental plane information to improve VIO performance. The proposed approach is evaluated with extensive Monte-Carlo simulations and different real-world experiments including an author-collected AR scenario, and shown to outperform the point-based VIO in structured environments." BAMF-SLAM: Bundle Adjusted Multi-Fisheye Visual-Inertial SLAM Using Recurrent Field Transforms,"Wei Zhang, Sen Wang, Xingliang Dong, Rongwei Guo, Norbert Haala","University of Stuttgart,Technische Universität München,Huawei Technologies, Co., Ltd., P. R. 
CHINA,Huawei,University of Stuttgart, Institute for Photogrammetry",Localization and Mapping III,"In this paper, we present BAMF-SLAM, a Bundle Adjustment (BA) based Multi-Fisheye visual-inertial SLAM system using recurrent field transforms (RFT), for accurate and robust state estimation under various challenging scenarios. First, based on fisheye camera model, the system works directly on raw fisheye images enabling the full utilization of their wide Field-of-View (FoV). Second, to overcome the difficulty of low-texture, we explore the tightly-coupled integration of multi-camera inputs and complementary inertial measurements via a unified factor graph and jointly optimize the poses and dense depth maps. Third, for global consistency, the wide FoV of the fisheye camera allows us to find more potential loop closures, and supported by the broad convergence basin of RFT, our system can perform very wide baseline loop closing with little overlap. Furthermore, we introduce semi-pose-graph BA to avoid expensive full global BA. By combining relative pose factors with loop closure factors, the global states can be adjusted efficiently with modest memory footprint while preserving the accuracy. Evaluations on TUM-VI, Hilti-Oxford and Newer College datasets show the superior performance of the proposed system over prior works. On the Hilti SLAM Challenge 2022, our VIO version finishes in second place, while our complete system, including the global BA, outperforms the winning approach in a subsequent submission." Improving the Performance of Local Bundle Adjustment for Visual-Inertial SLAM with Efficient Use of GPU Resources,"Shishir Gopinath, Karthik Dantu, Steve Ko","Simon Fraser University,University of Buffalo",Localization and Mapping III,"In this paper, we present our approach to efficiently leveraging GPU resources to improve the performance of local bundle adjustment for visual-inertial SLAM. 
We observe that for local bundle adjustment (i) the Schur complement method, a technique often used to speed up bundle adjustment, has the largest overhead when solving for the parameter update, and (ii) the workload consists of operations on small- to medium-sized matrices. Based on these observations, we develop and combine several techniques that efficiently handle small- to medium-sized matrices. We then implement these techniques as a drop-in replacement block solver for g2o, a library frequently used for bundle adjustment, and integrate it with ORB-SLAM3, a well-known open-source visual-inertial SLAM system. Our evaluation done with two popular datasets, EuRoC and TUM-VI, shows that we can reduce the time taken by local bundle adjustment by 13.81%-33.79% with our techniques across an embedded device and a desktop machine." Distributed Initialization for Visual-Inertial-Ranging Odometry with Position-Unknown UWB Network,"Shenhan Jia, Rong Xiong, Yue Wang",Zhejiang University,Localization and Mapping III,"In recent years, the visual-inertial-ranging (VIR) state estimator with a position unknown UWB network has become popular. However, most existing VIR methods leverage centralized algorithms to initialize the UWB anchors, which are difficult to apply to massive UWB networks. In this paper, we propose a distributed initialization method for consistent visual-inertial-ranging odometry with a position-unknown UWB network (DC-VIRO). For the position-unknown UWB anchors, we solve a Robot-aided Distributed Localization (RaDL) to initialize their positions. For robot state estimation, we fuse the ranging measurements of initialized anchors and visual-inertial measurements in a consistent filter. The RaDL is formulated as a consensus-based optimization problem and solved by the Distributed Alternating Direction Method of Multipliers (D-ADMM) algorithm. 
To identify the unobservable conditions, we propose a self-contained Fisher Information Matrix (FIM) based criterion which can be evaluated by each anchor directly with locally-preserved ranging measurements. We use Covariance Intersection (CI) to estimate the covariance of initialized anchors' positions for consistent data fusion. The proposed DC-VIRO is validated in both simulation and real-world experiments." Biomimetic Electric Sense Based Localization: A Solution for Small Underwater Robots in Large-Scale Environment,"Junzheng Zheng, Jingxian Wang, Xin Guo, Chayutpon Huntrakul, Chen Wang, Guangming Xie","Peking University,Northwestern University",Localisation 2,"This article presents a novel localization scheme for free-swimming small underwater robots in a large-scale environment. Accurate localization technology is always a challenge for small underwater robots, since the underwater lighting conditions could limit the vision while the cramped environments restrict the sonars. In nature, some fishes perceive their positions by sensing the electric field in their environment. Inspired by such electric sense behavior in fish, this article proposes a large-scale localization scheme based on electric sense for small underwater robots. Our scheme includes an electric sense based hardware solution and localization methods. Specifically, first, we design a hardware solution including the electric emitters placed in the underwater environment, and the electric receiver that can be carried by a small underwater robot. Then, we propose three distributed emitter architectures for large-scale localization. Finally, we propose three localization methods to estimate the position and orientation of the robot. We conducted four types of localization experiments for the small underwater robot, whose results fully demonstrate the robustness and effectivene" How Many Events Do You Need? 
Event-Based Visual Place Recognition Using Sparse but Varying Pixels,"Tobias Fischer, Michael J Milford",Queensland University of Technology,Localisation 2,"Event cameras continue to attract interest due to desirable characteristics such as high dynamic range, low latency, virtually no motion blur, and high energy efficiency. A potential application that would benefit from these characteristics lies in visual place recognition for robot localization, i.e. matching a query observation to the corresponding reference place in the database. This paper explores the distinctiveness of event streams from a small subset of pixels (in the tens or hundreds). We demonstrate that the absolute difference in the number of events at those pixel locations accumulated into event frames can be sufficient for place recognition when pixels that display large variations in the reference set are used. Using such sparse (over image coordinates) but varying (variance over the number of events per pixel location) pixels enables frequent and computationally cheap updates of the location estimates. Furthermore, when event frames contain a constant number of events, our method takes full advantage of the event-driven nature of the sensory stream and is robust to changes in velocity. We evaluate our proposed approach on the Brisbane-Event-VPR dataset in an outdoor driving scenario, as well as the newly contributed indoor QCR-Event-VPR dataset that was captured with a DAVIS346 camera mounted on a mobile robotic platform. Our results show that our approach achieves competitive performance when compared to several baseline methods on those datasets." Mitigating Shadows in LIDAR Scan Matching Using Spherical Voxels,"Matthew Mcdermott, Jason Rife",Tufts University,Localisation 2,"In this paper we propose an approach to mitigate shadowing errors in LIDAR scan matching, by introducing a preprocessing step based on spherical gridding. 
Because the grid aligns with the LIDAR beam, it is relatively easy to eliminate shadow edges which cause systematic errors in LIDAR scan matching. As we show through testing on real and synthetic data from a mechanically spinning multi-channel LIDAR unit, our proposed algorithm provides better results than ground plane removal, the most common existing strategy for shadow mitigation. Unlike ground plane removal, our method applies to arbitrary terrains (e.g. shadows on urban walls, shadows in hilly terrain) while retaining key LIDAR points on the ground that are critical for estimating changes in height, pitch, and roll. In our experiments, we demonstrate how our technique drastically reduces error in NDT scan registration (compared to a standard Cartesian voxel grid) on real LIDAR point cloud data, and then conduct Monte-Carlo trials in a simulated environment to demonstrate how our proposed technique eliminates the systemic bias introduced by range-shadowing." UWB-VIO Fusion for Accurate and Robust Relative Localization of Ground Robotic Teams,"Shuaikang Zheng, Zhitian Li, Yunfei Liu, Haifeng Zhang, Pengcheng Zheng, Xingdong Liang, Yanlei Li, Xiangxi Bu, Xudong Zou","University of Chinese Academy of Sciences,Aerospace Information Research Institute, Chinese Academy of Sci,National Key Laboratory of Microwave Imaging Technology, Aerospa",Localisation 2,"Relative pose estimation is one of the most fundamental components of multi-robot systems, yet it remains an open and challenging research topic in infrastructure-free environments. In this letter, we target improving the accuracy and robustness of relative pose estimation for ground robotic teams, and propose to fuse range and odometry measurements to estimate the relative pose using sliding window optimization. In the system, each robot is equipped with multiple UWB tags for ranging, and visual-inertial odometry is applied to estimate each robot's ego-motion. 
Aiming for simple and effective relative pose initialization, the triangulation uncertainty for multi-tag robots is analyzed, and an initialization method is designed. To cope with complex environments such as continuous NLOS conditions, an NLOS detection and range measurement filtering method is presented. We have conducted a series of experiments to demonstrate the performance of the proposed approach." Robust Self-Tuning Data Association for Geo-Referencing Using Lane Markings,"Miguel Ángel Muñoz-Bañón, Jan-Hendrik Pauls, Haohao Hu, Christoph Stiller, Francisco A. Candelas, Fernando Torres","University of Alicante,Karlsruhe Institute of Technology (KIT),Karlsruhe Institute of Technology,University of Alicante VAT ES-Q-,,,,,,,-G",Localisation 2,"Localization in aerial imagery-based maps offers many advantages, such as global consistency, geo-referenced maps, and the availability of publicly accessible data. However, the landmarks that can be observed from both aerial imagery and on-board sensors are limited. This leads to ambiguities or aliasing during the data association. Building upon a highly informative representation (that allows efficient data association), this paper presents a complete pipeline for resolving these ambiguities. Its core is a robust self-tuning data association that adapts the search area depending on a pseudo-entropy of the measurements. Additionally, to smooth the final result, we adjust the information matrix for the associated data as a function of the relative transform produced by the data association process. We evaluate our method on real data from urban and rural scenarios around the city of Karlsruhe in Germany. We compare state-of-the-art outlier mitigation methods with our self-tuning approach, demonstrating a considerable improvement, especially for outer-urban scenarios." 
Fast and Versatile Feature-Based LiDAR Odometry Via Efficient Local Quadratic Surface Approximation,"Seungwon Choi, Hee-Won Chae, Yunsuk Jeung, Seokjoon Kim, Kyusung Cho, Taewan Kim","Seoul National University,Korea University,MAXST",Localisation 2,"We present a fast and versatile feature-based LiDAR odometry method using local quadratic surface approximation and point-to-surface alignment. Unlike most feature-based methods, our approach approximates the local geometry of the LiDAR scan as a quadratic surface to mitigate performance degradation caused by the inconsistency between the feature class and the map’s local geometry. For computational efficiency, we leverage a symmetric objective function to align features on the local surface of the map without requiring time-consuming surface parameter evaluation. Evaluation on the KITTI and Newer College datasets demonstrates that the proposed method performs better than other feature-based methods. In particular, our approach exhibits robust performance in environments where the ambiguity of feature classifications is considerable. In addition, to demonstrate the robustness of the proposed method when LiDAR scans are relatively sparse, we evaluated the proposed method on datasets collected using LiDAR with a relatively small number of scan channels." KPPR: Exploiting Momentum Contrast for Point Cloud-Based Place Recognition,"Louis Wiesmann, Lucas Nunes, Jens Behley, Cyrill Stachniss",University of Bonn,Localisation 2,"Place recognition plays an important role in robot localization and SLAM. Being able to retrieve the current position in a given map allows, for instance, localizing without relying on GPS reception. In this paper, we address the problem of point cloud-based place recognition and focus especially on reducing the often significant training time needed by learning-based approaches. 
We propose a novel neural network architecture that first extracts local features using a pre-trained encoder network plus a stem architecture. The local features are aggregated to a global descriptor, which allows us to compute the similarity between locations. In line with several existing approaches, we target the generation of descriptors, which are similar for spatially near locations and dissimilar to other places. By exploiting the recent success of feature banks, we are able to bypass the computation of the negative examples, which enables faster training, bigger batch sizes, or the use of more sophisticated networks. As a key result, we are able to speed up the training process by a factor of 17 against the most common training procedure while also increasing the performance." Handling Constrained Optimization in Factor Graphs for Autonomous Navigation,"Barbara Bazzana, Tiziano Guadagnino, Giorgio Grisetti","Sapienza Univ. of Rome,Sapienza University of Rome",Localisation 2,"Factor graphs are graphical models used to represent a wide variety of problems across robotics, such as Structure from Motion (SfM), Simultaneous Localization and Mapping (SLAM) and calibration. Typically, at their core, they have an optimization problem whose terms only depend on a small subset of variables. Factor graph solvers exploit the locality of problems to drastically reduce the computational time of the Iterative Least-Squares (ILS) methodology. Although extremely powerful, their application is usually limited to unconstrained problems. In this paper, we model constraints over variables within factor graphs by introducing a factor graph version of the Augmented Lagrangian (AL) method. We show the potential of our method by presenting a full navigation stack based on factor graphs. Differently from standard navigation stacks, we can model both optimal control for local planning and localization with factor graphs, and solve the two problems using the standard ILS methodology. 
We validate our approach in real-world autonomous navigation scenarios, comparing it with the de facto standard navigation stack implemented in ROS. Comparative experiments show that for the application at hand our system outperforms the standard nonlinear programming solver Interior-Point Optimizer (IPOPT) in runtime, while achieving similar solutions." Long-Term Localization Using Semantic Cues in Floor Plan Maps,"Nicky Zimmerman, Tiziano Guadagnino, Xieyuanli Chen, Jens Behley, Cyrill Stachniss","University of Bonn,Sapienza University of Rome,National University of Defense Technology",Localisation 2,"Lifelong localization in a given map is an essential capability for autonomous service robots. In this paper, we consider the task of long-term localization in a changing indoor environment given sparse CAD floor plans. The commonly used pre-built maps from the robot sensors may increase the cost and time of deployment. Furthermore, their detailed nature requires that they are updated when significant changes occur. We address the difficulty of localization when the correspondence between the map and the observations is low due to the sparsity of the CAD map and the changing environment. To overcome both challenges, we propose to exploit semantic cues that are commonly present in human-oriented spaces. These semantic cues can be detected using RGB cameras by utilizing object detection, and are matched against an easy-to-update, abstract semantic map. The semantic information is integrated into a Monte Carlo localization framework using a particle filter that operates on 2D LiDAR scans and camera data. We provide a long-term localization solution and a semantic map format for environments that undergo changes to their interior structure and for which detailed geometric maps are not available. We evaluate our localization framework on multiple challenging indoor scenarios in an office environment, taken weeks apart. 
The experiments suggest that our approach is robust to structural changes and can run on an onboard computer. We released the open source implementation of our approach writ" COBRA: From Industrial to Medical Surgery with Slender Continuum Robots,"David Alatorre Troncoso, Jose A. Robles-linares, Matteo Russo, Mohamed A. Elbanna, Samuel Wild, Xin Dong, Abdelkhalick Mohammad, James Kell, Andy Norton, Dragos Axinte","University of Nottingham,University of Rome Tor Vergata,Rolls-Royce Plc",Medical Systems,"The maintenance of critical industrial components is often hindered by limited access, tortuous passages, and complex geometries. In highly constrained environments, inspection tasks are currently performed with borescopes, but even skilled operators struggle with hard-to-reach targets and the limited mobility prevents in-situ repair when defects are identified. Thanks to an active shape control, snake-like and continuum robots can outperform borescopes for short range inspection as well as enable intervention. However, their actuation technology limits their scalability in length, as longer bodies pose control challenges due to their intrinsically low stiffness and space constraints. To overcome the limitations of both borescopes and continuum robots, we here propose a modular design at their intersection, with both active tendon-driven and passively flexible segments. The main elements of the novel design, including actuation and control interface, are described, and the system is demonstrated in scenarios for aerospace assets, nuclear installations, and robot-assisted surgery." 
Assistive Robotic Technologies for Next-Generation Smart Wheelchairs,"Fabio Morbidi, Louise Devigne, Catalin Stefan Teodorescu, Bastien Fraudet, Emilie Leblong, Tom Carlson, Marie Babel, Guillaume Caron, Sarah Delmas, François Pasteau, Guillaume Vailland, Valérie Gouranton, Sylvain Guegan, Ronan Le Breton, Nicolas Ragot","Université de Picardie Jules Verne,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes - Rehabilitation Cente,The University of Manchester,Rehabilitation Center Pôle Saint Hélier,Rehabilitation Center Pôle Saint Hélier Rennes,University College London, UK,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes,CNRS,Universite de Picardie Jules Verne,,INSA Rennes / IRISA Rainbow Team,,IRISA UMR CNRS ,,,, - Inria - INSA Rennes,,INSA Rennes,,UNIV-RENNES - INSA Rennes,,CESI",Medical Systems,"This article describes the robotic assistive technologies developed for users of electrically-powered wheelchairs, within the framework of the European Union's Interreg ADAPT project. In particular, special attention is devoted to the integration of advanced sensing modalities and to the design of new shared control algorithms. In response to the clinical needs identified by our medical partners, two novel smart wheelchairs with complementary capabilities, and a virtual reality-based wheelchair simulator, have been developed. These systems have been validated via extensive experimental campaigns in France and in the United Kingdom." A-SEE: Active-Sensing End-Effector Enabled Probe Self-Normal-Positioning for Robotic Ultrasound Imaging Applications,"Xihan Ma, Wen-yi Kuo, Kehan Yang, Ashiqur Rahaman, Haichong Zhang",Worcester Polytechnic Institute,Medical Systems,"Conventional manual ultrasound (US) imaging is a physically demanding procedure for sonographers. A robotic US system (RUSS) has the potential to overcome this limitation by automating and standardizing the imaging procedure. 
It also extends ultrasound accessibility in resource-limited environments with the shortage of human operators by enabling remote diagnosis. During imaging, keeping the US probe normal to the skin surface largely benefits the US image quality. However, an autonomous, real-time, low-cost method to align the probe towards the direction orthogonal to the skin surface without pre-operative information is absent in RUSS. We propose a novel end-effector design to achieve self-normal-positioning of the US probe. The end-effector embeds four laser distance sensors to estimate the desired rotation towards the normal direction. We then integrate the proposed end-effector with a RUSS system which allows the probe to be automatically and dynamically kept to normal direction during US imaging. We evaluated the normal positioning accuracy and the US image quality using a flat surface phantom, an upper torso mannequin, and a lung ultrasound phantom. Results show that the normal positioning accuracy is 4.17 ± 2.24 degrees on the flat surface and 14.67 ± 8.46 degrees on the mannequin. The quality of the RUSS collected US images from the lung ultrasound phantom was equivalent to that of the manually collected ones." Hybrid Half-Gaussian Selectively Adaptive Fuzzy Control of an Actuated Ankle Foot-Orthosis,"Huiseok Moon, Roshni Maiti, Kaushik Das Sharma, Yacine Amirat, Patrick Siarry, Samer Mohammed","LISSI-lab, Universite de Paris-Est Creteil (UPEC),University of Calcutta,University of Paris Est Créteil (UPEC),Université Paris-Est Créteil,University of Paris Est Créteil - (UPEC)",Medical Systems,"To control an actuated ankle–foot orthosis (AAFO) during walking, a selectively adaptive hybrid fuzzy control employing particle-swarm optimization was used in conjunction with a Lyapunov-theory-based adaptive fuzzy-logic control. Adaptation (a computationally expensive process) was performed only when the tracking error exceeded a certain half-Gaussian function. 
The stability of the overall closed-loop system was proved using Lyapunov theory. The proposed control strategy was verified both by simulations and by experiments with five healthy subjects. The proposed control strategy significantly reduced both tracking error and required control torque when compared to other competing control schemes." Collaborative Magnetic Manipulation Via Two Robotically-Actuated Permanent Magnets,"Giovanni Pittiglio, Michael Brockdorff, Tomas Veiga, Josh Davy, James Henry Chandler, Pietro Valdastri","Harvard University,University of Leeds",Medical Systems,"Magnetically actuated robots have proven effective in several applications, specifically in medicine. However, generating high actuating fields with a high degree of manipulability is still a challenge, especially when the application needs a large workspace to suitably cover a patient. The presented work discusses a novel approach for the control of magnetic field and field gradients using two robotically actuated permanent magnets. In this case, permanent magnets - relative to coil-based systems - have the advantage of larger field density without energy consumption. We demonstrate that collaborative manipulation of the two permanent magnets can introduce up to three additional Degrees of Freedom (DOFs) when compared to single permanent magnet approaches (five DOFs). We characterized the dual-arm system through the measurement of the fields and gradients, and show accurate open loop control with 13.5% mean error. We then demonstrate how the magnetic DOFs can be employed in magneto-mechanical manipulation, by controlling and measuring the wrench on two orthogonal magnets within the workspace, observing a maximum cross-talk of 6.1% and mean error of 11.1%." 
Neuromechanical Model-Based Adaptive Control of Bi-Lateral Ankle Exoskeletons: Biological Joint Torque and Electromyogram Reduction across Walking Conditions,"Guillaume Durandau, Wolfgang Rampeltshammer, Herman Van Der Kooij, Massimo Sartori","McGill University,University Twente,Universtity of Twente,University of Twente",Medical Systems,"To enable the broad adoption of wearable robotic exoskeletons in medical and industrial settings, it is crucial they can effectively support large repertoires of movements. We propose a new human-machine interface to drive bilateral ankle exoskeletons during a range of “unseen” walking conditions that were not used for establishing the control interface. The proposed approach uses person-specific neuromechanical models of the human body to estimate biological ankle torques in real-time from electromyograms (EMGs) and joint angles. A low-level controller based on a disturbance observer translates biological torque estimates into exoskeleton commands. We call this “neuromechanical model-based control” (NMBC). NMBC enabled six individuals to voluntarily control exoskeletons across two walking speeds performed at three ground elevations. Furthermore, a single subject case study was carried out on a dexterous moonwalk task, showing reduction in muscular effort. NMBC enabled reducing biological ankle torques as well as eight ankle EMGs both within and between walking conditions when compared to non-assisted conditions." A Markov Chain Model for Workflow Analysis in Operating Rooms,"Hanyi Zheng, Qing Wang, Jingshan Li",Tsinghua University,Medical Systems,"Improving workflow efficacy in operating rooms (OR) is of significant importance for hospital management. Although extensive studies have been carried out in OR scheduling, efficient methods for workflow analysis are missing in current literature. To bridge this gap, in this paper, a Markov chain model is presented to evaluate the workflow performance in ORs. 
It is shown that such a model can provide efficient analysis with acceptable accuracy. In addition, a case study in a large public hospital in Beijing, China, illustrates the applicability of the method." On the Workspace of Electromagnetic Navigation Systems,"Quentin Boehler, Simone Gervasoni, Samuel L. Charreyron, Christophe Chautems, Bradley Nelson","ETH Zurich,Accelera AI",Medical Systems,"In remote magnetic navigation, a magnetic navigation system is used to generate magnetic fields to apply mechanical wrenches to steer a magnetic object. This technique can be applied to navigate untethered micro- and nanorobots, as well as tethered magnetic surgical tools for minimally invasive medicine. The design and characterization of these systems have been extensively investigated over the past decade. The determination of the region in space in which these systems can operate has yet to be formalized within the research community. This region is commonly referred to as the “workspace” and constitutes a central concept for any class of robotic system. We focus on magnetic navigation systems comprised of electromagnets and propose a first set of definitions for a magnetic workspace, a methodology to determine it, and evaluation metrics to analyse its characteristics. Our methodology and tools are illustrated with several examples of planar and spatial electromagnetic magnetic navigation systems for both didactic and realistic navigation scenarios." 
UVtac: Switchable UV Marker-Based Tactile Sensing Finger for Effective Force Estimation and Object Localization,"Woojong Kim, Won Dong Kim, Jeong-Jung Kim, Chang-Hyun Kim, Jung Kim","KAIST,Korea Advanced Institute of Science & Technology (KAIST),Korea Institute of Machinery & Materials (KIMM),Korea Institute of Machinery and Materials (KIMM)",Manipulation and Grasping II,"Vision-based tactile sensors provide diverse information of external tactile stimuli on the skin of sensors using both marker and reflective membrane images. However, when markers and reflective membranes are used concurrently, conventional opaque markers inevitably disturb the camera's view of the reflective membrane. Thus, simultaneously increasing the quality of tactile information extracted from each visual feature has remained a challenge. In this study, we present a tactile sensing finger, UVtac, that utilizes switchable ultraviolet (UV) markers to decouple the marker and reflective membrane images to offer three-axis force estimation and object localization, whose performances are unaffected by each other. Our UVtac showed improved force estimation performance by using larger-sized UV markers through quantitative evaluation. The UVtac with 1.2 mm diameter markers showed a root mean square error of 0.264 N in estimating normal forces up to 10 N and 0.219 N in estimating the shear forces up to 5 N, when indented with an 8 × 8 mm^2 square tooltip. Based on the object localization experiment, the UVtac was verified to have a 31 % lower root mean square error than the case using opaque black markers. Finally, we demonstrated object alignment and contact force-tracking tasks using the UVtac to emphasize its multifunctionality." 
Sparse-Dense Motion Modelling and Tracking for Manipulation without Prior Object Models,"Christian Rauch, Ran Long, Vladimir Ivan, Sethu Vijayakumar","Robert Bosch GmbH,University of Edinburgh,Touchlab Limited",Manipulation and Grasping II,"This work presents an approach for modelling and tracking previously unseen objects for robotic grasping tasks. Using the motion of objects in a scene, our approach segments rigid entities from the scene and continuously tracks them to create a dense and sparse model of the object and the environment. While the dense tracking enables interaction with these models, the sparse tracking makes this robust against fast movements and allows to redetect already modelled objects. The evaluation on a dual-arm grasping task demonstrates that our approach 1) enables a robot to detect new objects online without a prior model and to grasp these objects using only a simple parameterisable geometric representation, and 2) is much more robust compared to the state of the art methods." Enhanced GPIS Learning Based on Local and Global Focus Areas,"Zuka Murvanidze, Marc Peter Deisenroth, Yasemin Bekiroglu","University College London,Chalmers University of Technology, University College London",Manipulation and Grasping II,"Implicit surface learning is one of the most widely used methods for 3D surface reconstruction from raw point cloud data. Current approaches employ deep neural networks or Gaussian process models with the trade-offs across computational performance, object fidelity, and generalization capabilities. We propose a novel method based on Gaussian process regression to build implicit surfaces for 3D surface reconstruction (GPIS), which leads to better accuracy in comparison to the standard GPIS formulation. Our approach encodes local and global shape information from the data to maintain the correct topology of the underlying shape. 
The proposed pipeline works on dense, sparse, and noisy raw point clouds and can be parallelized to improve computational efficiency. We evaluate our approach on synthetic and real point cloud datasets including data from robot visual and tactile sensors. Results show that our approach leads to high accuracy compared to baselines." Ambiguity-Aware Multi-Object Pose Optimization for Visually-Assisted Robot Manipulation,"Myung-Hwan Jeon, Jeongyun Kim, Jee-Hwan Ryu, Ayoung Kim","Seoul National University,SNU,Korea Advanced Institute of Science and Technology",Manipulation and Grasping II,"6D object pose estimation aims to infer the relative pose between the object and the camera using a single or multiple images. Most existing works mainly focus on predicting the object pose without associated uncertainty under occlusion and structural ambiguity (symmetricity). However, these works demand prior information on shape attributes, and this condition is hardly satisfied in reality; even asymmetric objects may be symmetric under the viewpoint change. In addition, acquiring and fusing diverse sensor data is challenging when extending them to robotics applications. Tackling these limitations, we present an ambiguity-aware 6D object pose estimation network, PrimA6D++, as a generic uncertainty prediction method. The major challenges in pose estimation, such as occlusion and symmetry, can be handled in a generic manner based on the measured ambiguity of the prediction. Specifically, we devised a network to reconstruct the three rotation axis primitive images of a target object and predict the underlying uncertainty along each primitive axis. Leveraging the estimated uncertainty, we then optimized multiobject poses using visual measurements and camera poses by treating it as an object SLAM problem. The proposed method showed a significant performance improvement in T-LESS and YCB-Video datasets. 
We further demonstrated real-time scene recognition capability for visually-assisted robot manipulation. Our code and supplementary materials are available at https://github.com/r" Interaction Control of a Robotic Manipulator with the Surface of Deformable Object,"Athanasios Dometios, Costas S. Tzafestas","National Technical University of Athens (NTUA),ICCS - Inst of Communication and Computer Systems",Manipulation and Grasping II,"Robotic manipulation of deformable objects has drawn the attention of researchers over the past few years and is associated to a large spectrum of new application perspectives. In this paper, we present an efficient integrated motion planning framework to effectively and accurately control a robotic manipulator executing interactive tasks on the surface of a deformable object. The proposed interactive motion planning framework is based on a mesh representation of the object, integrating three efficient preprocessing algorithmic steps, including visual object segmentation, FEM deformation tracking and local mesh parameterization. The use of barycentric coordinates, defined on the mesh triangles, enables the establishment of bijective transformations between the deformable part of an object surface and its planar (static and dynamic) parameterized mapping. By merging these spatial transformations with the preprocessing steps, in combination with an active stiffness scheme for robot manipulator control, we are able to achieve accurate and reactive motion planning of interactive trajectories, even under large and persistent visual occlusions (such as due to the presence of the robot in the visual scene). An extensive experimental evaluation study is presented, involving a robotic manipulator in interaction with a hemispherical model of controllable periodic active deformation, which permits precise ground truth derivation." 
DiffSRL: Learning Dynamical State Representation for Deformable Object Manipulation with Differentiable Simulation,"Sirui Chen, Yunhao Liu, Shang Wen Yao, Jialong Li, Tingxiang Fan, Jia Pan","The University of Hong Kong,University of Hong Kong,The Univeristy of Hong Kong",Manipulation and Grasping II,"Dynamic state representation learning is essential for robot learning. A good latent space that can accurately describe dynamic transitions and constraints can significantly accelerate reinforcement learning training as well as reduce motion planning complexity. However, deformable objects have very complicated dynamics and are hard to represent directly by a neural network without any prior physics information. We propose DiffSRL, an end-to-end dynamic state representation learning pipeline that uses a differentiable physics engine to teach a neural network how to represent high-dimensional point cloud data collected from deformable objects. Our specially designed loss function can guide the neural network to be aware of physics constraints and feasibility. We benchmark the performance of our methods as well as other state representation algorithms with multiple downstream tasks on PlasticineLab. Our model demonstrates superior performance most of the time on all tasks. We also demonstrate our model's performance in a real hardware setting with two manipulation tasks on a UR-5 robot arm. The source code is available at https://github.com/Ericcsr/DiffSRL/ and our attached video." SymmetryGrasp: Symmetry-Aware Antipodal Grasp Detection from Single-View RGB-D Images,"Yifei Shi, Zixin Tang, Xiangting Cai, Hongjia Zhang, Dewen Hu, Xin Xu",National University of Defense Technology,Manipulation and Grasping II,"Symmetry is ubiquitous in everyday objects. Humans tend to grasp objects by recognizing the symmetric regions. In this paper, we investigate how symmetry could boost robotic grasp detection. 
To this end, we present a learning-based method for detecting grasp from single-view RGB-D images. The key insight is to explicitly incorporate symmetry estimation into grasp detection, improving the quality of the detected grasps. Specifically, we first introduce a new grasp parameterization in grasp detection for parallel grippers based on symmetry. Based on this representation, a symmetry-aware grasp detection network is presented to simultaneously estimate object symmetry and detect grasp. We find that the learning of grasp detection greatly benefits from symmetry estimation, improving the training efficiency and the grasp quality. Besides, to facilitate the cross-instance generality of grasping unseen objects, we propose Principal-directional scale-Invariant Feature Transformer (PIFT), a plug-and-play module, that allows spatial deformation of points during the feature aggregation. The module essentially learns feature invariance to anisotropic scaling along the shape principal directions. Extensive experiments demonstrate the effectiveness of the proposed method. In particular, it outperforms previous methods, achieving state-of-the-art performance in terms of grasp quality on GraspNet-1-Billion and success rate on a real robot grasping experiment." Hardware-Accelerated Mars Sample Localization Via Deep Transfer Learning from Photorealistic Simulations,"Raul Castilla-Arquillo, Carlos Perez-del-pulgar, Gonzalo Jesús Paz Delgado, Levin Gerdes","University of Málaga,Universidad de Málaga,ESA/ESTEC",Manipulation and Grasping II,"The goal of the Mars Sample Return campaign is to collect soil samples from the surface of Mars and return them to Earth for further study. The samples will be acquired and stored in metal tubes by the Perseverance rover and deposited on the Martian surface. As part of this campaign, it is expected the Sample Fetch Rover will be in charge of localizing and gathering up to 35 sample tubes over 150 Martian sols. 
Autonomous capabilities are critical for the success of the overall campaign and for the Sample Fetch Rover in particular. This work proposes a novel system architecture for the autonomous detection and pose estimation of the sample tubes. For the detection stage, a Deep Neural Network and transfer learning from a synthetic dataset are proposed. The dataset is created from photorealistic 3D simulations of Martian scenarios. Additionally, the sample tubes poses are estimated using Computer Vision techniques such as contour detection and line fitting on the detected area. Finally, laboratory tests of the Sample Localization procedure are performed using the ExoMars Testing Rover on a Mars-like testbed. These tests validate the proposed approach in different hardware architectures, providing promising results related to the sample detection and pose estimation." How AI and Robotics Can Build Furniture: A Case Study from the 2021 AI-Robot Assembly Challenge,"Seongseop Yun, Myoung-su Choi, Min-young Cho, Keunhwan Kim, Dong-Hyuk Lee, Sewoong Jun, Ji-Hun Bae, Dongjun Shin","Yonsei University,KITECH, UST,KOREA ELECTRONICS TECHNOLOGY INSTITUTE,Korea Electronics Technology Institute,Korea Institute of Industrial Technology (KITECH),Korea Institute of Industrial Technology",Manipulation and Grasping II,"The ""Furniture Assembly AI-Robot Challenge 2021"" is a competition that utilizes artificial intelligence (AI) and robots to assemble furniture and assess the quality of the assembly. To generate commands that a robot can execute for the assembly instructions, it is crucial to develop an AI-based algorithm to recognize and interpret the assembly process based on the provided instructions. The assembly robot must be dexterous and capable of safely executing assembly tasks without operator intervention. 
Before the assembly process, our team employed the Faster-region-based convolutional neural networks (Faster-RCNN) and the multi-object rectified attention network (MORAN) recognition methods to identify the assembly instructions, creating a connection relationship tree structure to interpret the recognized information. The robot utilized a developed multi-fingered gripper and a manipulation station to quickly and precisely complete the assembly task. Based on these exceptional results, our team was awarded first place, thus validating the adequacy of the proposed AI-robot system for complex furniture assembly tasks." A Robotic End-Effector for Screwing and Unscrewing Bolts from the Side,"Rui Tao, Junfeng Fan, Fengshui Jing, Jun Hou, Shiyu Xing, Yunkai Ma, Min Tan","Institute of Automation, Chinese Academy of Sciences,Institute of Automation,CAS,Chinese Academy of Sciences, Institute of Automation,Chinese Academy of Sciences,Institute of Automation, Chinese Academy of Sciences,,Institute of Automation,Chinese Academy of Sciences",Manipulation and Grasping II,"This letter presents a novel robotic end-effector for screwing and unscrewing bolts. In many industrial scenarios, it is required to manipulate bolts from the side using robots with end-effectors. Besides, the reaction torque during tightening needs to be balanced to the stability of the robot. To address these challenges, we design a robotic end-effector that can realize side screwing and torque counteraction. Specifically, through an open gear set and a width-adjustable screwing mouth, the end-effector can approach the bolt from the side in any direction and then screw and unscrew it. A gripper with force-magnification and self-locking drive is equipped to counteract the reaction torque during screwing and unscrewing. 
Considering the extensions for different application scenarios, a method to extend the end-effector by switching screwing mouths is given, and the extended end-effector is able to screw a variety of objects. After further analysis and optimization of the force and size, experiments were carried out. The results show that the end-effector can screw and unscrew the target bolt robustly and counteract the reaction torque. It is also applicable in different scenarios." Adaptive Cooperative Control for Human-Robot Load Manipulation,"Carlos de Cos, Dimos V. Dimarogonas","MathWorks AB,KTH Royal Institute of Technology",Human-Robot Interaction/Collaboration,"In this letter, we propose a control strategy for human-robot cooperative manipulation under the ambiguous collaboration of a human agent. To cope with this uncertainty, an adaptive update law inferring the human contribution to the system dynamics from basic perception feedback through the human arm stiffness is used. Furthermore, the robustness and accuracy of the approach are enhanced by redundantly tracking the shared load references and its associated end-effector position references. To validate the control strategy, both theoretical Lyapunov stability analysis and experimental results - employing two robot manipulators with 6 degrees of freedom under external disturbances- are provided." An Energy Based Control Architecture for Shared Autonomy,"Federico Benzi, Federica Ferraguti, Giuseppe Riggio, Cristian Secchi","University of Modena and Reggio Emilia,Università degli Studi di Modena e Reggio Emilia",Human-Robot Interaction/Collaboration,"In robotic applications where the autonomy is shared between the human and the robot, the autonomous behavior of the robotic system is determined considering mainly the task to be executed and the data collected from the environment using, e.g., formal methods and machine learning techniques. 
Nevertheless, it is important to correctly translate high-level decisions into low-level control inputs in order to avoid unstable behavior due to a naive implementation of the autonomy. In this paper, we propose an energy based architecture for shared autonomy that allows to reproduce as closely as possible the desired behavior while ensuring a robust stability of the robotic system. The proposed architecture is experimentally validated in two application scenarios: shared control of a multi-robot system and variable admittance control in human robot collaboration." Computational Model of Robot Trust in Human Co-Worker for Physical Human-Robot Collaboration,"Qiao Wang, Dikai Liu, Marc Garry Carmichael, Stefano Aldini, Chin-teng Lin","University of Technology Sydney,Centre for Autonomous Systems,UTS",Human-Robot Interaction/Collaboration,"Trust is key to achieving successful Human-Robot Interaction (HRI). Besides trust of the human co-worker in the robot, trust of the robot in its human co-worker should also be considered. A computational model of a robot’s trust in its human co-worker for physical human-robot collaboration (pHRC) is proposed. The trust model is a function of the human co-worker’s performance which can be characterized by factors including safety, robot singularity, smoothness, physical performance and cognitive performance. Experiments with a collaborative robot are conducted to verify the developed trust model." Robust Multi-User In-Hand Object Recognition in Human-Robot Collaboration Using a Wearable Force-Myography Device,"Eran Bamani, Nadav Dov Kahanowich, Inbar Meir, Avishai Sintov","Tel Aviv University,Tel-Aviv University",Human-Robot Interaction/Collaboration,"Applicable human-robot collaboration requires intuitive recognition of human intention during shared work. A grasped object such as a tool held by the human provides vital information about the upcoming task. 
In this paper, we explore the use of a wearable device to non-visually recognize objects within the human hand in various possible grasps. The device is based on Force-Myography (FMG) where simple and affordable force sensors measure perturbations of forearm muscles. We propose a novel Deep Neural-Network architecture termed Flip-U-Net inspired by the familiar U-Net architecture used for image segmentation. The Flip-U-Net is trained over data collected from several human participants and with multiple objects of each class. Data is collected while manipulating the objects between different grasps and arm postures. The data is also pre-processed with data augmentation and used to train a Variational Autoencoder for dimensionality reduction mapping. While prior work did not provide a transferable FMG-based model, we show that the proposed network can classify objects grasped by multiple new users without additional training efforts. Experiments with 12 test participants show classification accuracy of approximately 95% over multiple grasps and objects. Correlations between accuracy and various anthropometric measures are also presented. Furthermore, we show that the model can be fine-tuned to a particular user based on an anthropometric measure." CARE: Cooperation of AI-Robot Enablers to Create a Vibrant Society,"Ankit Ravankar, Amir Tafrishi, Jose Victorio Salazar Luces, Fumi Seto, Yasuhisa Hirata","Tohoku University,Cardiff University",Human-Robot Interaction/Collaboration,"Demographic changes in our society are putting a heavy burden on care facilities and healthcare infrastructure. While the elderly population is steadily increasing, there is an acute shortage of caregiving experts and professionals. This problem is becoming more severe in super-ageing societies, namely Japan. 
Hence, this urges new and practical solutions to welfare facilities to mitigate the burden on caregivers and human supporting partners by introducing robotics assistance through information and communication technology (ICT). In this work, we present a new multi-robot cooperation and coordination framework at different intellectual computation levels for care facilities. The framework is developed to have the healthcare 4.0 concept one step closer to reality under the ongoing project ""Moonshot R & D"" in Japan. Firstly we present an Internet of Things (IoT) integration system that is designed to include different passive and active assistive robots. Then, we re-design robot systems and develop a semi-autonomous platform that can perform tasks based on user/patient interaction in real-world care facility scenarios. Our framework provides human-robot interaction under shared autonomy between the user and assisting robots to improve the efficacy of the users in everyday tasks. Tohoku University's new state-of-the-art living lab facility is used to prepare a real-world scenario where we present our experimental results." Safety and Efficiency in Robotics: The Control Barrier Functions Approach,"Federica Ferraguti, Chiara Talignani Landi, Andrew Singletary, Hsien-Chung Lin, Aaron Ames, Cristian Secchi, Marcello Bonfe","Università degli Studi di Modena e Reggio Emilia,University of Modena and Reggio Emilia,California Institute of Technology,FANUC Corporation,Caltech,University of Ferrara",Human-Robot Interaction/Collaboration,an industrial setup for collaborative robotics. Encouraging Human Interaction with Robot Teams: Legible and Fair Subtask Allocations,"Soheil Habibian, Dylan Losey",Virginia Tech,Human-Robot Interaction/Collaboration,"Recent works explore collaboration between humans and teams of robots. 
These approaches make sense if the human is already working with the robot team; but how should robots encourage nearby humans to join their teams in the first place? Inspired by economics, we recognize that humans care about more than just team efficiency --- humans also have biases and expectations for team dynamics. Our hypothesis is that the way inclusive robots divide the task (i.e., how the robots split a larger task into subtask allocations) should be both legible and fair to the human partner. In this paper we introduce a bilevel optimization approach that enables robot teams to identify high-level subtask allocations and low-level trajectories that optimize for legibility, fairness, or a combination of both objectives. We then test our resulting algorithm across studies where humans watch or play with robot teams. We find that our approach to generating legible teams makes the human’s role clear, and that humans typically prefer to join and collaborate with legible teams instead of teams that only optimize for efficiency. Incorporating fairness alongside legibility further encourages participation: when humans play with robots, we find that they prefer (potentially inefficient) teams where the subtasks or effort are evenly divided." Autonomous Wristband Placement in a Moving Hand for Victims in SAR Scenarios with a Mobile Manipulator,"Francisco Pastor, Francisco Jesús Ruiz Ruiz, Jesus Manuel Gomez De Gabriel, Alfonso García-Cerezo","Universidad de Málaga,University of Málaga,Universidad de Malaga,University of Malaga",Human-Robot Interaction/Collaboration,"In this letter, we present an autonomous method for the placement of a sensorized wristband to victims in a Search-And-Rescue (SAR) scenario. For this purpose, an all-terrain mobile robot includes a mobile manipulator, whose End-Effector (EE) is equipped with a detachable sensorized wristband. The wristband consists of two links with a shared shaft and a spring. 
This configuration allows the wristband to remain fixed to the EE while moving and get placed around the victim's forearm once the contact is produced. The method has two differentiated phases: i) The visual moving hand tracking phase, where a 3D vision system detects the victim's hand pose. At the same time, the robotic manipulator tracks it with a Model Predictive Controller (MPC). ii) The haptic force-controlled phase, where the wristband gets placed around the victim's forearm controlling the forces exerted. The wristband design is also discussed, considering the magnitude of the force needed for the attachment and the torque the wristband exerts to the forearm. Two experiments are carried out, one in the laboratory to evaluate the performance of the method and the second one in a SAR scenario, with the robotic manipulator integrated with the all-terrain mobile robot. Results show a 97.4% success in the wristband placement procedure and a good performance of the whole system in a large scale disaster exercise." Recommending Fine-Grained Tool Consistent with Common Sense Knowledge for Robot,"Jianjia Xin, Lichun Wang, Shaofan Wang, Yukun Liu, Chao Yang, Baocai Yin",Beijing University of technology,Computer Vision and Visual Servoing,"When robots carry out a task, selecting an appropriate tool is necessary. Current research neglects the fine-grained characteristics of the task and mainly focuses on whether the task can be completed. Little consideration is given to the object being manipulated, which affects the task completion quality. To support fine-grained tool recommendation for tasks, this paper proposes a Fine-grained Tool-Task (FTT) dataset based on common sense knowledge. The FTT dataset defines multi-granularity semantics of the task, the tool, the object being manipulated and the relationships among them. 
A baseline method named Fine-grained Tool Recommendation Network (FTR-Net) is simultaneously proposed, which inferences coarse and fine-grained semantics of tools and objects being manipulated in image. The coarse and fine-grained semantic prediction make FTR-Net learn both common and special features of tool and object being manipulated. FTR-Net also constrains the feature distance between tool and object well matched for tasks smaller than those unmatched. The constraint and the special features ensure fine-grained tool recommendation. The constraint and the common features ensure coarse-grained tool recommendation while the fine-grained tool is not available. Experiment shows FTR-Net can recommend tools consistent with common sense both on the test dataset and realistic situation." Real-Time Hetero-Stereo Matching for Event and Frame Camera with Aligned Events Using Maximum Shift Distance,"Haram Kim, Sangil Lee, Junha Kim, H. Jin Kim","Seoul National University,Seoul National Univ.",Computer Vision and Visual Servoing,"Event cameras can show better performance than frame cameras in challenging scenarios such as fast-moving environments or high-dynamic-range scenes. However, it is still difficult for event cameras to replace frame cameras in non-challenging normal scenarios. In order to leverage the advantages of both cameras, we conduct a study for the heterogeneous stereo camera system which employs both an event and a frame camera. The proposed system estimates the semi-dense disparity in real-time by matching heterogeneous data of an event and a frame camera in stereo. We propose an accurate, intuitive and efficient way to align events with 6-DOF camera motion, by suggesting the maximum shift distance method. The aligned event image shows high similarity to the edge image of the frame camera. The proposed method can estimate poses of an event camera and depth of events in a few frames, which can speed up the initialization of the event camera system. 
We verified our algorithm in the DSEC dataset. The proposed hetero-stereo matching outperformed other methods. For real-time operation, we implemented our code using parallel computation with CUDA and release our code open source: https://github.com/Haram-kim/Hetero_Stereo_Matching" Toward Holistic Scene Understanding: A Transfer of Human Scene Perception to Mobile Robots,"Florenz Graf, Jochen Lindermayr, Cagatay Odabasi, Marco F. Huber","Fraunhofer IPA,University of Stuttgart",Computer Vision and Visual Servoing,"The long-term vision for robotics is to have fully autonomous mobile robots that perceive the environment as humans do or even better. This article transfers the core ideas from human scene perception to robot scene perception to contribute toward a holistic scene understanding of robots. The first contribution is to extensively survey and compare state-of-the-art robot scene perception approaches with neuroscience theories and studies of human perception. A step-by-step transfer of the perceptual process reveals similarities and differences between robots and humans. The second contribution represents an analysis of the status quo of holistic robot perception approaches to extract to what extent the perceptual capabilities of humans have been reached. Building on this, the gaps and potentials of robot perception are illustrated to address future research directions." Object Detection Using Sim2Real Domain Randomization for Robotic Applications,"Dániel Horváth, Gábor Erdos, Zoltán Istenes, Tomas Horvath, Sándor Földi","Institute for Computer Science and Control (SZTAKI) and Eötvös L,Institute for Computer Science and Control, Engineering and Mana,Eötvös Loránd University, Faculty of Informatics,Eötvös Loránd University,Centre of Excellence in Production Informatics and Control, Inst",Computer Vision and Visual Servoing,"Robots working in unstructured environments must be capable of sensing and interpreting their surroundings. 
One of the main obstacles of deep-learning-based models in the field of robotics is the lack of domain-specific labeled data for different industrial applications. In this article, we propose a sim2real transfer learning method based on domain randomization for object detection with which labeled synthetic datasets of arbitrary size and object types can be automatically generated. Subsequently, a state-of-the-art convolutional neural network, YOLOv4, is trained to detect the different types of industrial objects. With the proposed domain randomization method, we could shrink the reality gap to a satisfactory level, achieving 86.32% and 97.38% mAP50 scores, respectively, in the case of zero-shot and one-shot transfers, on our manually annotated dataset containing 190 real images. Our solution fits for industrial use as the data generation process takes less than 0.5 s per image and the training lasts only around 12 h, on a GeForce RTX 2080 Ti GPU. Furthermore, it can reliably differentiate similar classes of objects by having access to only one real image for training. To our best knowledge, this is the only work thus far satisfying these constraints." Continual Adaptation of Semantic Segmentation Using Complementary 2D-3D Data Representations,"Jonas Frey, Hermann Blum, Francesco Milano, Roland Siegwart, Cesar D. Cadena Lerma",ETH Zurich,Computer Vision and Visual Servoing,"Semantic segmentation networks are usually pre-trained once and not updated during deployment. As a consequence, misclassifications commonly occur if the distribution of the training data deviates from the one encountered during the robot's operation. We propose to mitigate this problem by adapting the neural network to the robot's environment during deployment, without any need for external supervision. Leveraging complementary data representations, we generate a supervision signal, by probabilistically accumulating consecutive 2D semantic predictions in a volumetric 3D map. 
We then train the network on renderings of the accumulated semantic map, effectively resolving ambiguities and enforcing multi-view consistency through the 3D representation. In contrast to scene adaptation methods, we aim to retain the previously-learned knowledge, and therefore employ a continual learning experience replay strategy to adapt the network. Through extensive experimental evaluation, we show successful adaptation to real-world indoor scenes both on the ScanNet dataset and on in-house data recorded with an RGB-D sensor. Our method increases the segmentation accuracy on average by 9.9% compared to the fixed pre-trained neural network, while retaining knowledge from the pre-training dataset." ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking,"Nicola Agostino Piga, Yuriy Onyshchuk, Giulia Pasquale, Ugo Pattacini, Lorenzo Natale","Istituto Italiano di Tecnologia,Italian Institute of Technology (IIT)",Computer Vision and Visual Servoing,"6D object pose tracking has been extensively studied in the robotics and computer vision communities. The most promising solutions, leveraging on deep neural networks and/or filtering and optimization, exhibit notable performance on standard benchmarks. However, to our best knowledge, these have not been tested thoroughly against fast object motions. Tracking performance in this scenario degrades significantly, especially for methods that do not achieve real-time performance and introduce non negligible delays. In this work, we introduce ROFT, a Kalman filtering approach for 6D object pose and velocity tracking from a stream of RGB-D images. By leveraging real-time optical flow, ROFT synchronizes delayed outputs of low frame rate Convolutional Neural Networks for instance segmentation and 6D object pose estimation with the RGB-D input stream to achieve fast and precise 6D object pose and velocity tracking. 
We test our method on a newly introduced photorealistic dataset, Fast-YCB, which comprises fast moving objects from the YCB model set, and on the dataset for object and hand pose estimation HO-3D. Results demonstrate that our approach outperforms state-of-the-art methods for 6D object pose tracking, while also providing 6D object velocity tracking. A video showing the experiments is provided as supplementary material." Stability and Convergence Analysis of 3D Feature-Based Visual Servoing,"Marco Costanzo, Giuseppe De Maria, Ciro Natale, Antonio Russo","Università degli Studi della Campania ""Luigi Vanvitelli"",Università degli Studi della Campania Luigi Vanvitelli",Computer Vision and Visual Servoing,"Visual control based on image features has received much attention for its inherent robustness against camera calibration errors, modelling uncertainties and the capability to keep the object in the Field of View (FoV) of the camera. Nevertheless, some drawbacks related to the basin of convergence and the existence of local minima, which make the camera to get stuck in undesired equilibrium points, are still worth being investigated. Nowadays, the availability of cheap and lightweight RGB-D cameras makes the use of three-dimensional features natural. By using an RGB-D camera in an eye-in-hand configuration, this letter proposes an in-depth stability and convergence analysis of 3D feature-based visual servoing. It will be proved that the visual control system is almost globally asymptotically stable in the sense that the only trajectories not converging to the desired equilibrium point are those belonging to a zero Lebesgue measure set in the feature space. Moreover, a sufficient condition guaranteeing that the feature trajectories remain in the camera FoV is derived and an algorithm to prevent feature loss caused by violation of the camera FoV constraint is proposed." 
A Robust Visual Servoing Controller for Anthropomorphic Manipulators with Field-Of-View Constraints and Swivel-Angle Motion,"Jiao Jiang, Yaonan Wang, Yiming Jiang, He Xie, Haoran Tan, Hui Zhang","Hunan University,Huazhong University of Science and Technology",Computer Vision and Visual Servoing,"Human-robot collaboration has attracted significant attention in the industry due to the flexibility of humans and the accuracy of robots. Humanoid control of anthropomorphic robotic arms combined with visual servoing will enhance the intelligence of industrial robots. However, the robotic manipulator will introduce psychological discomfort to nearby humans, and the loss of visual features will induce visual servoing task failure. Aiming at these problems, this paper proposes a humanoid control method based on visual servoing by utilizing the swivel angle derived from the human arm to realize the human-like behavior of anthropomorphic robot manipulators. To advance the visual servoing control performance, a constraint function is designed with the Barrier Lyapunov function (BLF) to ensure that image features stay within the field of view. The sliding mode control (SMC) is combined with image-based visual servoing (IBVS) to dispose of the uncertainties of a 7-DOF redundant robot manipulator. The proposed algorithm is substantiated through comparison experiments based on the Sawyer robot and constructed visual servoing physical platform." Formation Tracking and Obstacle Avoidance for Multiple Quadrotors with Static and Dynamic Obstacles,"Juntong Qi, Jinjin Guo, Mingming Wang, Chong Wu, Zhenwei Ma","Shanghai University,Tianjin University,EFY Intelligent Control (Tianjin) Technology Co., Ltd",Aerial Robotics,"This letter proposes a novel distributed cooperative control algorithm to address the problem of collision avoidance and obstacle avoidance for multiple quadrotors during the formation tracking process. 
The proposed algorithm couples collision avoidance and obstacle avoidance schemes into the control layer. To avoid collisions between quadrotors in time, a repulsion function based on Hooke's law with damping is proposed, which fully considers the relative position and relative velocity between quadrotors. In addition, based on the obstacle avoidance behavior of pigeons, a split-merge strategy is designed for multiple quadrotors to avoid static and dynamic obstacles. The split-merge strategy is driven by the relative position between the quadrotors and the obstacles, and it can calculate the optimal velocity to keep the quadrotors away from obstacles in the field of view. Several simulations and outdoor experiments for multiple quadrotors are presented to verify the effectiveness of the theoretical results." Deep Learning-Aided Synthetic Airspeed Estimation of UAVs for Analytical Redundancy with a Temporal Convolutional Network,"Hyungtae Lim, Han-seok Ryu, Matthew Rhudy, Dongjin Lee, Dongjin Jang, Changho Lee, Youngmin Park, Wonkeun Youn, Hyun Myung","Korea Advanced Institute of Science and Technology,Korea Aerospace Research Institute,Penn State University,Hanseo University,Chungnam National University,KAIST (Korea Advanced Institute of Science and Technology)",Aerial Robotics,"A synthetic air data system (SADS) is an analytical redundancy technique that is crucial for unmanned aerial vehicles (UAVs) and is used as a backup system during air data sensor failures. Unfortunately, the existing state-of-the-art approaches for SADS require GPS signals or high-fidelity dynamic UAV models. To address this problem, a novel synthetic airspeed estimation method that leverages deep learning and an unscented Kalman filter (UKF) for analytical redundancy is proposed. Our novel fusion-based method only requires an inertial measurement unit (IMU), elevator control input, and airflow angles while GPS, lift/drag coefficients, and complex aircraft dynamic models are not required. 
Additionally, we demonstrate that our proposed temporal convolutional network (TCN) is a more efficient model for airspeed estimation than the renowned models, such as ResNet or bidirectional long short-term memory (LSTM). Our deep learning-aided UKF was experimentally verified on long-duration real flight data and has promising performance compared with the state-of-the-art methods. In particular, it is confirmed that our proposed method robustly estimates the airspeed even under dynamic flight conditions where the performance of conventional methods is degraded." Reconfigurable Drone System for Transportation of Parcels with Variable Mass and Size,"Fabrizio Schiano, Przemyslaw Mariusz Kornatowski, Leonardo Cencetti, Dario Floreano","Leonardo S.p.a.,Ecole Polytechnique Federale de Lausanne (EPFL),Swiss Federal Institute of Technology Lausanne (EPFL),Ecole Polytechnique Federal, Lausanne",Aerial Robotics,"Cargo drones are designed to carry payloads with predefined shape, size, and/or mass. This lack of flexibility requires a fleet of diverse drones tailored to specific cargo dimensions. Here we propose a new reconfigurable drone based on a modular design that adapts to different cargo shapes, sizes, and mass. We also propose a method for the automatic generation of drone configurations and suitable parameters for the flight controller. The parcel becomes the drone’s body to which several individual propulsion modules are attached. We demonstrate the use of the reconfigurable hardware and the accompanying software by transporting parcels of different mass and sizes requiring various numbers and propulsion modules' positioning. The experiments are conducted indoors (with a motion capture system) and outdoors (with an RTK-GNSS sensor). The proposed design represents a cheaper and more versatile alternative to the solutions involving several drones for parcel transportation." 
Geometrically Constrained Trajectory Optimization for Multicopters,"Zhepei Wang, Xin Zhou, Chao Xu, Fei Gao","Zhejiang University,ZHEJIANG UNIVERSITY",Aerial Robotics,"We present an optimization-based framework for multicopter trajectory planning subject to geometrical configuration constraints and user-defined dynamic constraints. The basis of the framework is a novel trajectory representation built upon our novel optimality conditions for unconstrained control effort minimization. We design linear-complexity operations on this representation to conduct spatial-temporal deformation under various planning requirements. Smooth maps are utilized to exactly eliminate geometrical constraints in a lightweight fashion. A variety of state-input constraints are supported by the decoupling of dense constraint evaluation from sparse parameterization, and backward differentiation of flatness map. As a result, this framework transforms a generally constrained multicopter planning problem into an unconstrained optimization that can be solved reliably and efficiently. Our framework bridges the gaps among solution quality, planning efficiency, and constraint fidelity for a multicopter with limited resources and maneuvering capability. Its generality and robustness are both demonstrated by extensive benchmarks and extreme flight tasks." Parameter Estimation and Control of Multirotors,"Cheng-cheng Yang, Teng-Hu Cheng","National Chiao Tung University,National Yang Ming Chiao Tung University",Aerial Robotics,"A controller based on integral concurrent learning (ICL) has been developed for controlling a multirotor unmanned aerial vehicle with unknown mass and moment of inertia. To the best of our knowledge, this is the first study to estimate the mass and the moment of inertia of a multirotor, which are incorporated in the geometric tracking controller to provide feedback. 
Since the dynamics of a multirotor is globally defined, the developed ICL controller ensures almost global tracking and parameter estimation. The developed control architecture can be generalized to control any multirotor of unknown mass and moment of inertia with guaranteed system stability. A stability analysis is conducted to ensure that both the tracking errors and the estimation errors of the parameters asymptotically converge to zero. The performance and efficacy of the ICL controller have been verified in experiments." Indirect Force Control of a Cable-Suspended Aerial Multi-Robot Manipulator,"Dario Sanalitro, Marco Tognon, Antonio Jimenez-cano, Juan Cortes, Antonio Franchi","University of Catania,Inria Rennes-Bretagne Atlantique,Centre National de la Recherche Scientifique,LAAS-CNRS,University of Twente",Aerial Robotics,"We present the control in physical interaction with the environment of a Cable-suspended Aerial Multi-Robot Manipulator (CS-AMRM) called the Fly-Crane, composed of three aerial vehicles towed to a platform by means of six cables. The control strategy enables the system to accurately and safely perform tasks involving expected or unexpected interactions between the platform and the environment, in the absence of dedicated force/torque sensors. A previously developed Inverse Kinematic Controller (IKC) is enhanced with an admittance framework, and contacts are estimated through a generalized momentum-based observer. To assess the validity of our approach, and to provide practical insights into the method, we perform extensive experimental tests, comprehending the admittance property shaping to modulate stiffness, damping, and virtual mass, as well as experiments in a more realistic scenario involving contacts between the Fly-Crane and the environment." 
Accurate High-Maneuvering Trajectory Tracking for Quadrotors: A Drag Utilization Method,"Jindou Jia, Kexin Guo, Xiang Yu, Weihua Zhao, Lei Guo","Beihang University,Nanyang Technological University",Aerial Robotics,"The balance between the tracking performance and the aerodynamic drag treatment is of paramount importance, especially in the presence of aggressive quadrotor maneuvers. Different from standard approaches that achieve precise tracking by feedforward compensating the estimated drag, this work presents a scheme to appropriately utilize drag. By means of the proposed drag-utilization scheme, the disturbance absorption can be achieved. In addition to eliminating the adverse effect, the control gains are subtly enlarged with less noise. Moreover, an adaptive law for estimating the drag coefficients onboard is provided. Subsequently, the wind disturbance is explicitly considered at the control design stage. A wind speed observer (WSO) is designed to improve the tracking performance, based on current velocity and attitude. Compared with the traditional disturbance observer (DO), the proposed WSO can not only fully utilize the disturbance characteristic but also contribute to reducing the control conservativeness. In experiments, two types of quadrotors with different thrust-to-weight ratios (TWRs) are employed to evaluate the applicability of the presented scheme. Comparative results show that the proposed scheme outperforms several popular methods." A Comparative Study of Nonlinear MPC and Differential-Flatness-Based Control for Quadrotor Agile Flight,"Sihao Sun, Angel Romero, Philipp Foehn, Elia Kaufmann, Davide Scaramuzza","University of Twente,University of Zurich",Aerial Robotics,"Accurate trajectory tracking control for quadrotors is essential for safe navigation in cluttered environments. However, this is challenging in agile flights due to nonlinear dynamics, complex aerodynamics, and actuation constraints.
In this article, we empirically compare two state-of-the-art control frameworks: nonlinear MPC and the differential-flatness-based controller (DFBC) by tracking a wide variety of agile trajectories at speeds up to 20 m/s. The comparisons are performed in both simulation and real flights to systematically evaluate both methods from the aspect of tracking accuracy, robustness, and computational efficiency. We show the superiority of NMPC in tracking dynamically infeasible trajectories at the cost of higher computation time and risk of numerical convergence issues. We also quantitatively study the effect of an inner-loop controller using the incremental nonlinear dynamic inversion (INDI) method and the effect of adding an aerodynamic drag model. Real-world experiments show more than 78% tracking error reduction of both NMPC and DFBC, indicating the necessity of using an inner-loop controller and aerodynamic drag model for agile trajectory tracking." Model Predictive Contouring Control for Time-Optimal Quadrotor Flight,"Angel Romero, Sihao Sun, Philipp Foehn, Davide Scaramuzza","University of Zurich,University of Twente",Aerial Robotics,"We tackle the problem of flying time-optimal trajectories through multiple waypoints with quadrotors. Current solutions split the problem into a planning task, where a time-optimal trajectory is generated, and a control task, where this trajectory is accurately tracked. However, currently, generating a time-optimal trajectory for a quadrotor requires solving a difficult time allocation problem, which is computationally demanding (on the order of minutes or even hours). This is detrimental for replanning in the presence of disturbances. We overcome this issue by solving the time allocation and the control problems concurrently via Model Predictive Contouring Control (MPCC). Our MPCC optimally selects the future states at runtime, while maximizing the progress along the reference and minimizing the distance to it.
We show that, even when tracking simplified trajectories, the proposed MPCC results in a path that approaches the time-optimal one, and that can be generated in real time. We validate our approach in the real world, where our method outperforms both the current state of the art and a world-class human pilot in terms of lap time, achieving speeds of up to 60 km/h." Automating Vascular Shunt Insertion with the dVRK Surgical Robot,"Karthik Dharmarajan, William Panitch, Muyan Jiang, Kishore Srinivas, Baiyu Shi, Yahav Avigal, Huang Huang, Thomas Low, Danyal Fer, Ken Goldberg","UC Berkeley,University of California, Berkeley,University of California at Berkeley,SRI International,University of California, San Francisco East Bay",Medical Robotics II,"Vascular shunt insertion is a fundamental surgical procedure used to temporarily restore blood flow to tissues. It is often performed in the field after major trauma. We formulate a problem of automated vascular shunt insertion and propose a pipeline to perform Automated Vascular Shunt Insertion (AVSI) using a da Vinci Research Kit. The pipeline uses a learned visual model to estimate the locus of the vessel rim, plans a grasp on the rim, and moves to grasp at that point. The first robot gripper then pulls the rim to stretch open the vessel with a dilation motion. The second robot gripper then proceeds to insert a shunt into the vessel phantom (a model of the blood vessel) with a chamfer tilt followed by a screw motion. Results suggest that AVSI achieves a high success rate even with tight tolerances and varying vessel orientations up to 30°. Supplementary material, dataset, videos, and visualizations can be found at https://sites.google.com/berkeley.edu/autolab-avsi." CogniDaVinci: Towards Estimating Mental Workload Modulated by Visual Delays During Telerobotic Surgery -- an EEG-Based Analysis,"Satyam Kumar, Deland Hu Liu, Frigyes Samuel Racz, Manuel Retana, Susheela Sharma, Fumiaki Iwane, Braden Murphy, Rory O'keeffe, S.
Farokh Atashzar, Farshid Alambeigi, Jose del R. Millan","The University of Texas at Austin,University of Texas at Austin,UNIVERSITY OF TEXAS, AUSTIN,National Institutes of Health,New York University,New York University (NYU), US",Medical Robotics II,"Communication latency in any delicate telerobotic operation (such as remote surgery over distance) would impose a significant challenge due to the temporal degradation of visual perception and can substantially affect the outcomes. Less is known, however, about the neurophysiological basis of how operators adapt/react to delayed visual feedback. Identification of such neural markers might provide novel ways for future applications to monitor the mental workload (MW). In this study, we recorded electroencephalography (EEG) data from nine users while performing a peg transfer task using the da Vinci Research Kit with three levels of induced visual delay in the video feedback. Our results suggest that spectral EEG-based features can provide markers of the operator’s MW modulated by arbitrary visual delay. We also show that the exposure to different visual delays could be successfully classified/detected solely from EEG data, using a Riemannian geometry-based classifier, which highlights the utility of EEG signals for detecting the effect of visual delay on brain activity." Exploring an External Approach to Subretinal Drug Delivery Via Robot Assistance and B-Mode OCT,"Elan Ahronovich, Neel Shihora, Jin-Hui Shen, Karen Joos, Nabil Simaan","Vanderbilt ARMA,Vanderbilt University",Medical Robotics II,"Injections into specific retinal layers of the eye present a serious challenge to surgeons in terms of accuracy and perception. The emergence of new gene therapies further emphasizes the need for effective tools for localized drug delivery. 
Unlike the dominant approach of delivering drugs via a transvitreal intraocular pathway, this paper demonstrates the feasibility of delivering injections into the space between the choroid and the retina using an external approach. The design of a cooperative robotic system for enabling robot-assisted extraocular subretinal injections is presented. The system uses a distal micromanipulator that can serve as a hand-held tool for OCT-aided injection or attach to a six degree of freedom (DOF) serial robot arm for cooperative manipulation. The kinematics and control of the robot for constrained cooperative control motions to enable safe needle injection is presented and experimentally evaluated. These results suggest that the proposed external drug delivery approach is feasible, thereby enabling the advantages of preserving the integrity of the retina and omitting the necessity for vitrectomy." Towards Surgical Context Inference and Translation to Gestures,"Kay Hutchinson, Zongyu Li, Ian Reyes, Homa Alemzadeh","University of Virginia,The University of Virginia,IBM",Medical Robotics II,"Manual labeling of gestures in robot-assisted surgery is labor intensive, prone to errors, and requires expertise or training. We propose a method for automated and explainable generation of gesture transcripts that leverages the abundance of data for image segmentation. Surgical context is detected using segmentation masks by examining the distances and intersections between the tools and objects. Next, context labels are translated into gesture transcripts using knowledge-based Finite State Machine (FSM) and data-driven Long Short Term Memory (LSTM) models. We evaluate the performance of each stage of our method by comparing the results with the ground truth segmentation masks, the consensus context labels, and the gesture labels in the JIGSAWS dataset. 
Our results show that our segmentation models achieve state-of-the-art performance in recognizing needle and thread in Suturing and we can automatically detect important surgical states with high agreement with crowd-sourced labels (e.g., contact between graspers and objects in Suturing). We also find that the FSM models are more robust to poor segmentation and labeling performance than LSTMs. Our proposed method can significantly shorten the gesture labeling process (~2.8 times)." A Method to Use Haptic Feedback of Laryngoscope Force Vector for Endotracheal Intubation Training,"Haonan Zhou, Siyu Yang, Louis Halamek, Thrishantha Nanayakkara","Imperial College London,Stanford University",Medical Robotics II,"Endotracheal intubation is a mandatory competency for most medical staff. This procedure involves opening the entrance of the patient's upper windpipe using a laryngoscope and then inserting a tube into the windpipe to supply oxygen to the patient. This time-critical intervention requires careful control of the force vector on the tongue to lift it parallel to the jaw rather than to push the jaw to open the mouth. However, traditional intubation training methods in which novices practice intubation on prostheses lack haptic feedback to improve force control. We designed a sensorised intubation training phantom that can provide trainees with vibrotactile feedback reflecting the laryngoscope's force on the tongue. The critical component of this phantom is a silicon rubber tongue embedded with magnets and Hall effect sensors. We calibrated the Hall effect sensor readings to predict the force vector exerted on the tongue with errors less than 0.5 N in the lifting and pushing directions. We conducted a controlled experiment, mainly comparing the training results between participants with and without haptic feedback.
Results show a statistically significant drop in the undesired forces due to haptic feedback, and the skill is retained when tested after 24 hours without haptic feedback." A Hydraulic Soft Robotic Detrusor Based on an Origami Design,"Simone Onorati, Federica Semproni, Linda Paterno, Giada Casagrande, Veronica Iacovacci, Arianna Menciassi","The BioRobotics Institute - Scuola Superiore S. Anna,Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna - SSSA",Medical Robotics II,"As a permanent solution for patients who cannot contract their urinary bladder, an artificial detrusor muscle appears a higher outcome approach compared to current sacral neurostimulators featured by severe long-term side effects. In this paper, a novel soft robotic detrusor is presented to overcome the limitations of the state-of-the-art solutions. It is based on two identical origami-based hydraulic actuators, which completely surround the bladder and contract upon water aspiration. Design, manufacturing, and experimental characterization both in terms of contraction capabilities and voiding efficiency on ex vivo swine bladders are reported for two different origami geometries, as well as a proof-of-concept implementation of an autonomous driving circuit as control unit. Results from assisted urination tests outlined very good performances proving an active voiding efficiency of the hydraulic soft robotic detrusor equal to 84.8% ± 7.4% in simulated environment." Semi-Autonomous Robotic Control of a Self-Shaping Cochlear Implant,"Daniel Bautista-Salinas, Conor Kirby, Mohamed Essam Mohamed Kassem Abdel, Burak Temelkuran, Charlie T Huins, Ferdinando Rodriguez Y Baena","Imperial College London,Queen Elizabeth Hospital Birmingham,Imperial College, London, UK",Medical Robotics II,"Cochlear implants (CIs) can improve hearing in patients suffering from sensorineural hearing loss via an electrode array (EA) carefully inserted in the scala tympani. 
Current EAs can cause trauma during insertion, threatening hearing preservation; hence we proposed a pre-curved thermally drawn EA that curls into the cochlea under the influence of body temperature. However, the additional surgical skill required to insert pre-curved EAs usually produces worse surgical outcomes. Medical robots can offer an effective solution to assist surgeons in improving surgical outcomes and reducing outliers. This work proposes a collaborative approach to insert our EA where manageable tasks are automated using a vision-based system. The insertion strategy presented allowed us to insert our EA successfully. The feasibility study showed that we can insert EAs following the defined control strategy while keeping the exerted contact forces within safe levels. The teleoperated robotic system and robotic vision approach to control a self-shaping CI has thus shown potential to provide the tools for a more delicate and atraumatic approach." A Hybrid Steerable Robot with Magnetic Wrist for Minimally Invasive Epilepsy Surgery,"Changyan He, Robert Hideki Nguyen, Cameron Forbrigger, James Drake, Thomas Looi, Eric Diller","University of Toronto,The Hospital for Sick Children,Hospital for Sick Children, University of Toronto,Hospital for Sick Children",Medical Robotics II,"Dexterity is demanded for an endoscopic tool to handle complicated procedures in neurosurgery, e.g., removing diseased tissue from inside the deep brain along a tortuous path. Current robotic tools are either rigid or lack wristed motion ability at the tip, leading to limited usage in minimally invasive procedures. In this paper, a hybrid steerable robot with a magnetic wristed forceps is proposed to provide enhanced dexterity for endoscopic epilepsy surgery. A set of three precurved Nitinol tubes with concentric deployment, called a concentric tube robot (CTR), serves as a 6 degrees-of-freedom (DoF) robotic positioner. 
The magnetic end-effector is composed of a rotational wrist joint, and forceps at the tip, both of which are actuated remotely by magnetic fields. The magnetic wrist and forceps provide an extra rotational DoF and a gripping DoF on top of the CTR, respectively. The magnetic wrist and gripper are designed to have a hollow channel along their common axis, inside which a soft tube is deployed as a second functional tool for irrigation or suction. An electromagnetic navigation system (eMNS) with 8 coils is used to create the quasi-static magnetic fields. Experimental characterization of the robot kinematics is performed and the results show the mean motion error of CTR is 2.8 mm. The workspace is also analyzed and results indicate that the proposed hybrid robot has a significantly larger reachable area compared to the one of the CTR alone. Mock epilepsy procedures are performed on a brain phantom to validate the feasibility of the hybrid robot for neurosurgery applications." Induced Vertex Motion As a Performance Measure for Surgery in Confined Spaces,"Neel Shihora, Nabil Simaan",Vanderbilt University,Surgical Robotics,"While in the design phase of a robotic system for the procedures performed in surgical confined spaces or hard-to-reach-deep surgical fields, designers can leverage a systematic method to compare the design alternatives for tele-surgical manipulators quantitatively. Unlike most of the work in the literature, we propose an approach for comparing design alternatives by considering the spurious motions along the length of the manipulator in lieu of existing approaches looking at only the end-effector dexterity measures. We propose a performance measure quantifying these spurious motions while the end-effector executes the application-critical tasks such as suturing and tying a knot. A good manipulator design should yield minimal swept volume along its length portions within the confined space. 
If informed about these spurious motions, that design would lead to reduced force on the internal organs, reducing the pain and discomfort as well as occurrences of extracorporeal inter-manipulator collisions. To validate the proposed approach, we present two illustrative simulation case studies: (1) two planar rigid link serial robots performing the task of following a desired trajectory and (2) two different architectures of tele-surgical manipulators performing the task of passing a circular suture needle under the fulcrum constraints. The results show the applicability of the proposed performance measure in determining the suitability of a particular design alternative for a given task. Although results are promising, using this measure alone for design optimization may compromise overall device dexterity. Therefore, this measure needs to be incorporated into a weighted optimization framework for robot design." Foot Gestures to Control the Grasping of a Surgical Robot,"Yijun Cheng, Yanpei Huang, Ziwei Wang, Etienne Burdet","Imperial College London,Lancaster University,imperial college london",Surgical Robotics,"Many surgical tasks require three or more tools working together, where a hands-free interface could extend a surgeon's actions to control a third surgical tool. However, most current interfaces do not allow skilled control of grasping critical to robotic manipulation. Here we first present a systematic study to identify efficient and intuitive interaction strategies to control grasping of a surgical tool. A series of experiments were conducted to evaluate six foot pressure-based gestures. Based on the results, three modular novel foot-machine interfaces were developed, which can be integrated with other motion control interfaces. The identified interaction strategies were implemented to control a laparoscopic tool in a surgical simulator, and evaluated in a user study. 
The results illustrate how naive participants can operate grasping yielding smooth and pick & place operation." Design and Development of a Novel Force-Sensing Robotic System for the Transseptal Puncture in Left Atrial Catheter Ablation,"Aya Mutaz Zeidan, Zhouyang Xu, Christopher Edwin Mower, Honglei Wu, Quentin Walker, Oyinkansola Ayoade, Natalia Cotic, Jonathan Behar, Steven Williams, Aruna Arujuna, Yohan Noh, Richard James Housden, Kawal Rhode","King's College London,King’s College London,Brunel University London",Surgical Robotics,"Transseptal puncture (TSP) is a prerequisite for left atrial catheter ablation for atrial fibrillation, requiring access from the right side of the heart. It is a demanding procedural step associated with complications, including inadvertent puncturing and application of large forces on the tissue wall. Robotic systems have shown great potential to overcome such challenges by introducing force-sensing capabilities and increased precision and localization accuracy. Therefore, this work introduces the design and development of a novel robotic system developed to perform TSP. We integrated optoelectronic sensors into the tools’ fixtures, measuring tissue contact and puncture forces along one axis. The novelty of this design is in the system’s ability to manipulate a Brockenbrough (BRK) needle and dilator-sheath simultaneously and measure tissue contact and puncture forces. In performing puncture experiments on anthropomorphic tissue models, an average puncture force of 3.97 +/- 0.45 N (1SD) was established - similar to the force reported in literature on the manual procedure. This research highlights the potential for improving patient safety by enforcing force constraints, paving the way to more automated and safer TSP." 
Surgical-VQLA: Transformer with Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery,"Long Bai, Mobarakol Islam, Lalithkumar Seenivasan, Hongliang Ren","The Chinese University of Hong Kong,University College London,National University of Singapore,Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS)",Surgical Robotics,"Despite the availability of computer-aided simulators and recorded videos of surgical procedures, junior residents still heavily rely on experts to answer their queries. However, expert surgeons are often overloaded with clinical and academic workloads and limit their time in answering. For this purpose, we develop a surgical question-answering system to facilitate robot-assisted surgical scene and activity understanding from recorded videos. Most of the existing visual question answering (VQA) methods require an object detector and regions based feature extractor to extract visual features and fuse them with the embedded text of the question for answer generation. However, (i) surgical object detection model is scarce due to smaller datasets and lack of bounding box annotation; (ii) current fusion strategy of heterogeneous modalities like text and image is naive; (iii) the localized answering is missing, which is crucial in complex surgical scenarios. In this paper, we propose Visual Question Localized-Answering in Robotic Surgery (Surgical-VQLA) to localize the specific surgical area during the answer prediction. To deal with the fusion of the heterogeneous modalities, we design gated vision-language embedding (GVLE) to build input patches for the Language Vision Transformer (LViT) to predict the answer. To get localization, we add the detection head in parallel with the prediction head of the LViT. We also integrate generalized intersection over union (GIoU) loss to boost localization performance by preserving the accuracy of the question-answering model. 
We annotate two datasets of VQLA by utilizing publicly available surgical videos from EndoVis-17 and 18 of the MICCAI challenges. Our validation results suggest that Surgical-VQLA can better understand the surgical scene and localize the specific area related to the question-answering. GVLE presents an efficient language-vision embedding technique by showing superior performance over the existing benchmarks." Implicit Neural Field Guidance for Teleoperated Robot‐assisted Surgery,"Heng Zhang, Lifeng Zhu, Jiangwei Shen, Song Aiguo",Southeast University,Navigation,"Teleoperational techniques enable remote human-robot interaction and have been widely accepted in robot-assisted surgeries. However, it is still hard to guarantee the safety of teleoperated surgery due to the imperfect input commands limited by the remote perception, preventing teleoperated surgery from being widely used. We propose a new framework to avoid collision of surgery robots and human tissue caused by inaccurate inputs. We directly take the medical volume data and propose to use an implicit neural field to guide teleoperated robot-assisted surgery. With the guidance, the trajectory of the robot manipulator is optimized to safely work inside a narrow workspace. We evaluated our method in several aspects and conducted the real-world experiment on a head phantom. Experimental results show that our proposed method can effectively avoid the collision between the surgical tool and the human tissue during teleoperation." Bidirectional Generalised Rigid Point Set Registration,"Ang Zhang, Zhe Min, Li Liu, Max Qing Hu Meng","The Chinese University of Hong Kong,University College London",Navigation,"In medical robotics and image-guided surgery (IGS), registration is needed in order to align together the coordinate frames of robots, medical imaging modalities, surgical tools, and patients. 
Existing registration algorithms often assume one point set to be a noise-free model while the other to contain noise and outliers. However, in real scenarios, noise and outliers can exist in both point sets to be registered. To eliminate the above-mentioned challenge, in this paper, we formally formulate the Bi-directional Generalised Rigid Point Set Registration (Bi-GRPSR) problem where normal vectors are adopted, bi-directional probability density function (PDFs) and Hybrid Mixture Models (HMMs) are constructed to derive the objective function. Bi-GRPSR considering anisotropic positional noise is thus cast as a maximum likelihood estimation (MLE) problem, which is solved by the proposed Bi-directional Generalised Anisotropic Coherent Point Drift (Bi-AGCPD) where spatially nearby points are considered to move coherently and iterative expectation maximization (EM) steps are involved. Experimental results on two human bone point sets, under different settings of noise, outliers, and overlapping ratios, validate the effectiveness and improvements of Bi-AGCPD over existing probabilistic and learning-based methods." Finding the Optimal Incision Point in Robotic Assisted Surgery,"Kyriakos Almpanidis, Theodora Kastritsi, Zoe Doulgeri",Aristotle University of Thessaloniki,Navigation,"In robotic assisted surgeries, surgical tools are inserted into the human body via an incision point in the abdominal wall, which is imposed as a remote center of motion (RCM). The selection of the incision's point location in the human body is critical for the success of the surgical procedure. In this paper, we propose a simulation tool for finding the optimal incision point location, which can be utilized by the surgeon during the preoperative stage. The surgeon can plan the path/region of intervention as well as sensitive regions which should be protected from unintentional damage by the surgical tool on the preoperative images of internal organs. 
A target admittance model that enforces a candidate incision as a RCM is utilized in the simulation enhanced by a term for following the planned path. We propose a cost evaluation function taking into account metrics involving the distance of the tool from sensitive areas, the tool links maximum pressure on tumors and the robot’s dexterity measure. The example of a tumor resection task is used with the simulation tool to demonstrate its use in finding the incision points that ensures minimal intraoperative risks and accurate task execution." Development and Experimental Verification of a 3D Dynamic Absolute Nodal Coordinate Formulation Model of Flexible Prostate Biopsy/Brachytherapy Needles,"Athanasios Martsopoulos, Thomas Hill, Raj Persad, Stefanos Bolomytis, Antonia Tzemanaki","University of Bristol,Bristol Urological Institute, Southmead Hospital, Bristol,North Bristol NHS Trust",Navigation,"Robot-assisted percutaneous needle insertion is expected to significantly increase targeting accuracy in minimally invasive operations. For this, it is necessary to provide mathematical models that can accurately capture the underlying dynamics of medical needles. Here, we present a novel nonlinear mathematical model of flexible medical needles based on the Absolute Nodal Coordinate Formulation. The model allows the description of large needle deflections and arbitrarily large rigid body motions. Tailored to the requirements of transperineal prostate biopsy and brachytherapy, it can correlate both the translational and rotational coordinates of the needle’s base with its deflection, provide force feedback and accept arbitrary loading conditions. The model is optimised in terms of computational efficiency in order to allow real-time simulation and control. Experiments show that the proposed model allows for submillimeter precision in both static and dynamic needle deflection settings. 
Due to its accuracy and computational efficiency, it is expected to constitute a valuable tool for both real-time visual/haptic simulation and control of percutaneous needle insertion." Collaborative Robotic Biopsy with Trajectory Guidance and Needle Tip Force Feedback,"Robin Mieling, Maximilian Neidhardt, Sarah Latus, Carolin Stapper, Stefan Gerlach, Inga Kniep, Axel Heinemann, Benjamin Ondruschka, Alexander Schlaefer","Hamburg University of Technology,University Medical Center Hamburg-Eppendorf",Navigation,"The diagnostic value of biopsies is highly dependent on the placement of needles. Robotic trajectory guidance has been shown to improve needle positioning, but feedback for real-time navigation is limited. Haptic display of needle tip forces can provide rich feedback for needle navigation by enabling localization of tissue structures along the insertion path. We present a collaborative robotic biopsy system that combines trajectory guidance with kinesthetic feedback to assist the physician in needle placement. The robot aligns the needle while the insertion is performed in collaboration with a medical expert who controls the needle position on site. We present a needle design that senses forces at the needle tip based on optical coherence tomography and machine learning for real-time data processing. Our robotic setup allows operators to sense deep tissue interfaces independent of frictional forces to improve needle placement relative to a desired target structure. We first evaluate needle tip force sensing in ex-vivo tissue in a phantom study. We characterize the tip forces during insertions with constant velocity and demonstrate the ability to detect tissue interfaces in a collaborative user study. Participants are able to detect 91% of ex-vivo tissue interfaces based on needle tip force feedback alone. Finally, we demonstrate that even smaller, deep target structures can be accurately sampled by performing post-mortem in situ biopsies of the pancreas." 
Development and Evaluation of a Robotic Vessel Positioning System for Semi-Automatic Microvascular Anastomosis,"Jesse Haworth, Justin Opfermann, Michael Kam, Yaning Wang, Robin Yang, Jin Kang, Axel Krieger","Johns Hopkins University,Johns Hopkins Medicine,the Johns Hopkins University",Navigation,"This paper describes a novel tissue positioning system with an integrated suturing robot and demonstrates its ability to perform semi-automatic anastomoses of synthetic blood vessels. We began with a finite element analysis-based design consideration for achieving adequate grasping of blood vessels to demonstrate robust performance under expected clinical forces. We then conducted standardized positioning tests to measure the repeatability of the system and incorporated a high-resolution optical coherence tomography (OCT) fiber imaging sensor within the tip of the suturing tool to provide position feedback of the robot during a suturing task. Using the microvascular positioner and OCT sensor, the system performed semi-automatic suturing of synthetic 5 mm diameter blood vessels (N=4), and the suture quality was evaluated for consistency in spacing, bite depth, percent lumen reduction, and maximum suture strength. The system completed the task in an average time of 31.75 minutes. The samples had zero missed stitches, average spacing of 1.64 mm, an average bite depth of 2.14 mm, an average lumen reduction of 57.98%, and an average suture strength of 3.13 N." Robotic Sonographer: Autonomous Robotic Ultrasound Using Domain Expertise in Bayesian Optimization,"Deepak Raina, Sh Chandrashekhara, Richard Voyles, Juan Wachs, Subir Kumar Saha","Indian Institute of Technology Delhi and Purdue University USA,All India Institute of Medical Sciences, New Delhi,Purdue University,Indian Institute of Technology Delhi",Navigation,"Ultrasound is a vital imaging modality utilized for a variety of diagnostic and interventional procedures. 
However, an expert sonographer is required to make accurate maneuvers of the probe over the human body while making sense of the ultrasound images for diagnostic purposes. This procedure requires a substantial amount of training and up to a few years of experience. In this paper, we propose an autonomous robotic ultrasound system that uses Bayesian Optimization (BO) in combination with the domain expertise to predict and effectively scan the regions where diagnostic quality ultrasound images can be acquired. The quality map, which is a distribution of image quality in a scanning region, is estimated using Gaussian process in BO. This relies on a prior quality map modeled using expert's demonstration of the high-quality probing maneuvers. The ultrasound image quality feedback is provided to BO, which is estimated using a deep convolution neural network model. This model was previously trained on database of images labelled for diagnostic quality by expert radiologists. Experiments on three different urinary bladder phantoms validated that the proposed autonomous ultrasound system can acquire ultrasound images for diagnostic purposes with a probing position and force accuracy of 98.7% and 97.8%, respectively." Autonomous Intelligent Navigation for Flexible Endoscopy Using Monocular Depth Guidance and 3-D Shape Planning,"Yiang Lu, Ruofeng Wei, Bin Li, Wei Chen, Jianshu Zhou, Qi Dou, Dong Sun, Yunhui Liu","The Chinese University of Hong Kong,City University of Hong Kong,Chinese University of Hong Kong",Navigation,"Recent advancements toward perception and decision-making of flexible endoscopes have shown great potential in computer-aided surgical interventions. However, owing to modeling uncertainty and inter-patient anatomical variation in flexible endoscopy, the challenge remains for efficient and safe navigation in patient-specific scenarios. 
This paper presents a novel data-driven framework with self-contained visual-shape fusion for autonomous intelligent navigation of flexible endoscopes requiring no priori knowledge of system models and global environments. A learning-based adaptive visual servoing controller is proposed to online update the eye-in-hand vision-motor configuration and steer the endoscope, which is guided by monocular depth estimation via a vision transformer (ViT). To prevent unnecessary and excessive interactions with surrounding anatomy, an energy-motivated shape planning algorithm is introduced through entire endoscope 3-D proprioception from embedded fiber Bragg grating (FBG) sensors. Furthermore, a model predictive control (MPC) strategy is developed to minimize the elastic potential energy flow and simultaneously optimize the steering policy. Dedicated navigation experiments on a robotic-assisted flexible endoscope with an FBG fiber in several phantom environments demonstrate the effectiveness and adaptability of the proposed framework." A Probabilistic Rotation Representation for Symmetric Shapes with an Efficiently Computable Bingham Loss Function,"Hiroya Sato, Takuya Ikeda, Koichi Nishiwaki","The University of Tokyo,Woven Planet Holdings, Inc.,Woven Alpha",Probability and Statistical Methods,"In recent years, a deep learning framework has been widely used for object pose estimation. While quaternion is a common choice for rotation representation, it cannot represent the ambiguity of the observation. In order to handle the ambiguity, the Bingham distribution is one promising solution. However, it requires complicated calculation when yielding the negative log-likelihood (NLL) loss. An alternative easy-to-implement loss function has been proposed to avoid complex computations but has difficulty expressing symmetric distribution. In this paper, we introduce a fast-computable and easy-to-implement NLL loss function for Bingham distribution. 
We also create the inference network and show that our loss function can capture the symmetric property of target objects from their point clouds." Topological Trajectory Prediction with Homotopy Classes,"Jennifer Wakulicz, Ki Myung Brian Lee, Teresa A. Vidal-Calleja, Robert Fitch","University of Technology Sydney, Centre for Autonomous Systems,University of Technology Sydney",Probability and Statistical Methods,"Trajectory prediction in a cluttered environment is key to many important robotics tasks such as autonomous navigation. However, there are an infinite number of possible trajectories to consider. To simplify the space of trajectories under consideration, we utilise homotopy classes to partition the space into countably many mathematically equivalent classes. All members within a class demonstrate identical high-level motion with respect to the environment, i.e., travelling above or below an obstacle. This allows high-level prediction of a trajectory in terms of a sparse label identifying its homotopy class. We therefore present a light-weight learning framework based on variable-order Markov processes to learn and predict homotopy classes and thus high-level agent motion. By informing a GMM with our homotopy class predictions, we see great improvements in low-level trajectory prediction compared to a naive GMM on a real dataset." Information-Theoretic Abstraction of Semantic Octree Models for Integrated Perception and Planning,"Daniel Larsson, Arash Asgharivaskasi, Jaein Lim, Nikolay A. Atanasov, Panagiotis Tsiotras","Georgia Institute of Technology,University of California, San Diego,Georgia Tech",Probability and Statistical Methods,"In this paper, we develop an approach that enables autonomous robots to build and compress semantic environment representations from point-cloud data. 
Our approach builds a three-dimensional, semantic tree representation of the environment from raw sensor data which is then compressed by a novel information-theoretic tree-pruning approach. The proposed approach is probabilistic and incorporates the uncertainty in semantic classification inherent in real-world environments. Moreover, our approach allows robots to prioritize individual semantic classes when generating the compressed trees, so as to design multi-resolution representations that retain the relevant semantic information while simultaneously discarding unwanted semantic categories. We demonstrate the approach by compressing semantic octree models of a large outdoor, semantically rich, real-world environment. In addition, we show how the octree abstractions can be used to create semantically-informed graphs for motion planning, and provide a comparison of our approach with uninformed graph construction methods such as Halton sequences." BO-ICP: Initialization of Iterative Closest Point Based on Bayesian Optimization,"Harel Biggie, Andrew Beathard, Christoffer Heckman","University of Colorado Boulder,University of Colorado, Boulder,University of Colorado at Boulder",Probability and Statistical Methods,"Typical algorithms for point cloud registration such as Iterative Closest Point (ICP) require a favorable initial transform estimate between two point clouds in order to perform a successful registration. State-of-the-art methods for choosing this starting condition rely on stochastic sampling or global optimization techniques such as branch and bound. In this work, we present a new method based on Bayesian optimization for finding the critical initial ICP transform. We provide three different configurations for our method which highlights the versatility of the algorithm to both find rapid results and refine them in situations where more runtime is available such as offline map building. 
Experiments are run on popular data sets and we show that our approach outperforms state-of-the-art methods when given similar computation time. Furthermore, it is compatible with other improvements to ICP, as it focuses solely on the selection of an initial transform, a starting point for all ICP-based methods." DuEqNet: Dual-Equivariance Network in Outdoor 3D Object Detection for Autonomous Driving,"Xihao Wang, Jiaming Lei, Hai Lan, Arafat Al-Jawari, Xian Wei","Technical University of Munich,Fujian Institute of Research on the Structure of Matter,East China Normal University",Object Detection II,"Outdoor 3D object detection has played an essential role in the environment perception of autonomous driving. In complicated traffic situations, precise object recognition provides indispensable information for prediction and planning in the dynamic system, improving self-driving safety and reliability. However, with the vehicle's veering, the constant rotation of the surrounding scenario makes a challenge for the perception systems. Yet most existing methods have not focused on alleviating the detection accuracy impairment brought by the vehicle's rotation, especially in outdoor 3D detection. In this paper, we propose DuEqNet, which first introduces the concept of equivariance into 3D object detection network by leveraging a hierarchical embedded framework. The dual-equivariance of our model can extract the equivariant features at both local and global levels, respectively. For the local feature, we utilize the graph-based strategy to guarantee the equivariance of the feature in point cloud pillars. In terms of the global feature, the group equivariant convolution layers are adopted to aggregate the local feature to achieve the global equivariance. In the experiment part, we evaluate our approach with different baselines in 3D object detection tasks and obtain State-Of-The-Art performance. 
According to the results, our model presents higher accuracy on orientation and better prediction efficiency. Moreover, our dual-equivariance strategy exhibits satisfactory plug-and-play ability on various popular object detection frameworks to improve their performance." NVRadarNet: Real-Time Radar Obstacle and Free Space Detection for Autonomous Driving,"Alexander Popov, Patrik Gebhardt, Ke Chen, Ryan Oldja, Hee Seok Lee, Shane Murray, Ruchi Bhargava, Nikolai Smolyanskiy","NVIDIA,NVIDIA Corporation,Nvidia,nvidia",Object Detection II,"Detecting obstacles is crucial for safe and efficient autonomous driving. To this end, we present NVRadarNet, a deep neural network (DNN) that detects dynamic obstacles and drivable free space using automotive RADAR sensors. The network utilizes temporally accumulated data from multiple RADAR sensors to detect dynamic obstacles and compute their orientation in a top-down bird’s-eye view (BEV). The network also regresses drivable free space to detect unclassified obstacles. Our DNN is the first of its kind to utilize sparse RADAR signals in order to perform obstacle and free space detection in real time from RADAR data only. The network has been successfully used for perception on our autonomous vehicles in real self-driving scenarios. The network runs faster than real time on an embedded GPU and shows good generalization across geographic regions." TransRSS: Transformer-Based Radar Semantic Segmentation,"Hao Zou, Harry Xie, Jiarong Ou, Gao Yutao","Alibaba group,Alibaba Group,Alibaba",Object Detection II,"Radar semantic segmentation is a challenging task in environmental understanding, because the radar data is noisy and suffers from measurement ambiguities, which could lead to poor feature learning. To better tackle such difficulties, we present a novel and high-performance Transformer-based Radar Semantic Segmentation method, named TransRSS, to extract features for radar segmentation effectively and efficiently. 
Our approach first introduces the transformer into radar semantic segmentation and deeply integrates the merits of the Convolutional Neural Network (CNN) and transformer to extract more discriminative and global-level semantic features. On the one hand, it takes advantage of the CNN with flexible receptive fields to process images thanks to the shift convolution scheme. On the other hand, it takes advantage of the transformer to model long-range dependency with the self-attention mechanism. Meanwhile, we propose a Dual Position Attention module to aggregate rich context interdependencies between the multi-view features, which achieves an implicit mechanism for adaptive feature aggregation. Extensive experiments on the CARRADA dataset and RADIal dataset demonstrate that our TransRSS surpasses the state-of-the-art (SOTA) radar segmentation methods with remarkable margins." Source-Free Unsupervised Domain Adaptation for 3D Object Detection in Adverse Weather,"Deepti Hegde, Velat Kilic, Vishwanath Sindagi, A. Brinton Cooper, Mark Foster, Vishal M. Patel","Johns Hopkins University,The Johns Hopkins University",Object Detection II,"A domain shift exists between the distributions of large scale, outdoor lidar datasets due to being captured using different types of lidar sensors, in different locations, and under varying weather conditions. Inclement weather in particular affects the quality of lidar data, adding artifacts such as scattered and missed points, leading to a drop in performance of 3D object detection networks trained on standard lidar datasets. Domain adaptation methods seek to adapt source-trained neural networks to a target domain. Pseudo-label based self training approaches are popular methods for source-free unsupervised domain adaptation. However, their efficacy depends on the quality of the labels generated by the source trained model. These labels may be incorrect with high confidence, rendering thresholding methods ineffective. 
In order to avoid reinforcing errors caused by label noise, we propose an uncertainty-aware mean teacher framework which implicitly filters incorrect pseudo-labels during training. Leveraging model uncertainty allows the mean teacher network to perform implicit filtering by down-weighing losses corresponding to uncertain pseudo-labels. Effectively, we perform automatic soft-sampling of pseudo-labeled data while aligning predictions from the student and teacher networks. We demonstrate our domain adaptation method on an adverse weather dataset created by augmenting lidar scenes from KITTI with rain, snow, and fog and show that it out-performs current domain adaptation frameworks. We make our code publicly available." Bayesian Deep Learning for Affordance Segmentation in Images,"Lorenzo Mur Labadia, Ruben Martinez-Cantin, Josechu Guerrero","University of Zaragoza,Universidad de Zaragoza",Object Detection II,"Affordances are a fundamental concept in robotics since they relate available actions for an agent depending on its sensory-motor capabilities and the environment. We present a novel Bayesian deep network to detect affordances in images, at the same time that we quantify the distribution of the aleatoric and epistemic variance at the spatial level. We adapt the Mask-RCNN architecture to learn a probabilistic representation using Monte Carlo dropout. Our results outperform the state-of-the-art of deterministic networks. We attribute this improvement to a better probabilistic feature space representation on the encoder and the Bayesian variability induced at the mask generation, which adapts better to the object contours. We also introduce the new Probability-based Mask Quality measure that reveals the semantic and spatial differences on a probabilistic instance segmentation model. 
We modify the existing Probabilistic Detection Quality metric by comparing the binary masks rather than the predicted bounding boxes, achieving a finer-grained evaluation of the probabilistic segmentation. We find aleatoric variance in the contours of the objects due to the camera noise, while epistemic variance appears in visual challenging pixels." Multi-View Keypoints for Reliable 6D Object Pose Estimation,"Alan Li, Angela P. Schoellig","University of Toronto,TU Munich",Object Detection II,"6D Object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. 6D pose estimation is particularly challenging in bin-picking applications, where many objects are low-feature and reflective, and self-occlusion between objects of the same type is common. We propose a novel multi-view approach leveraging known camera transformations from an eye-in-hand setup to combine heatmap and keypoint estimates into a probability density map over 3D space. The result is a robust approach that is scalable in the number of views. It relies on a confidence score composed of keypoint probabilities and point-cloud alignment error, which allows reliable rejection of false positives. We demonstrate an average pose estimation error of approximately 0.5mm and 2 degrees across a variety of difficult low-feature and reflective objects in the ROBI dataset, while also surpassing the state-of-art correct detection rate, measured using the 10% object diameter threshold on ADD error." Towards Unsupervised Filtering of Millimetre-Wave Radar Returns for Autonomous Vehicle Road Following,"Dean Sacoransky, Joshua Marshall, Keyvan Hashtrudi-zaad",Queen's University,Object Detection II,"Path planning and localization in low-light and inclement weather conditions are critical problems facing autonomous vehicle systems. Our proposed method applies a single modality, millimetre-wave radar perception system for the detection of roadside retro-reflectors. 
Radar-based perception tasks can be challenging to perform due to the sparse and noisy nature of radar data. We propose the use of an unsupervised learning approach for filtering radar point clouds through Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The DBSCAN algorithm segments retro-reflector points from noise points, thus providing the autonomous vehicle with a predicted path for the road ahead. We tested the approach via indoor experiments that make use of Continental’s ARS 408 radar, a mobile Husky A2000 robot, and a Vicon motion capture system for ground truth validation. The experimental results of the proposed system demonstrated a classification accuracy of 84.13% and F1 score of 83.71%." Domain Generalised Fully Convolutional One Stage Detection,"Karthik Seemakurthy, Petra Bosilj, Erchan Aptoula, Charles W. Fox","University of Lincoln,Sabanci University",Object Detection II,"Real-time vision in robotics plays an important role in localising and recognising objects. Recently, deep learning approaches have been widely used in robotic vision. However, most of these approaches have assumed that training and test sets come from similar data distributions, which is not valid in many real world applications. This study proposes an approach to address domain generalisation (i.e. out-of-distribution generalisation (OODG)) where the goal is to train a model via one or more source domains, that will generalise well to unknown target domains using single stage detectors. All existing approaches which deal with OODG either use slow two stage detectors or operate under the covariate shift assumption which may not be useful for real-time robotics. This is the first paper to address domain generalisation in the context of single stage anchor free object detector FCOS without the covariate shift assumption.
We focus on improving the generalisation ability of object detection by proposing new regularisation terms to address the domain shift that arises due to both classification and bounding box regression. Also, we include an additional consistency regularisation term to align the local and global level predictions. The proposed approach is implemented as a Domain Generalised Fully Convolutional One Stage (DGFCOS) detection and evaluated using four object detection datasets which provide domain metadata (GWHD, Cityscapes, BDD100K, Sim10K) where it exhibits a consistent performance improvement over the baselines and is able to run in real-time for robotics." GNN-Based Point Cloud Maps Feature Extraction and Residual Feature Fusion for 3D Object Detection,"Wei-Hsiang Liao, Chieh-Chih (Bob) Wang, Wen-chieh Lin",National Yang Ming Chiao Tung University,Object Detection and Segmentation,"LiDAR detection of long-range vehicles is challenging because very few and sparse points are measured in long distances and vehicles with similar shapes of targets could lead to false positives easily. To tackle these challenges, taking the environment information (HD maps) into account could be beneficial to predetermine where targets are more or less likely to appear. Compared with semantic maps, HD maps formed by point clouds provide much richer information from surrounding static objects and scenes. In this work, we construct a GNN-based feature extraction of point cloud maps to increase the receptive fields of learning map features. Our work is based on PVRCNN, the state-of-the-art LiDAR object detection method. With point-wise and voxel-wise features obtained from PVRCNN, residual feature fusion is proposed to fuse the features from PVRCNN and the map features from GNN. Our approach is evaluated on NuScenes dataset. It achieves a 24.78% average precision improvement for long-range objects at 40-50 meters, the farthest areas with ground truth annotation. 
Our approach also has a 4.22% reduction of false positives in the entire sensing areas." Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos,"Shiyang Lu, Yunfu Deng, Abdeslam Boularias, Kostas E. Bekris","Rutgers University,Shenzhen Institutes of Advanced Technology, Chinese Academy of S,Rutgers, the State University of New Jersey",Object Detection and Segmentation,"This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot. A key feature of the self-supervised training process is a graph-matching algorithm that operates on the over-segmentation output of the point cloud that is reconstructed from each video. The graph matching, along with point cloud registration, is able to find reoccurring object patterns across videos and combine them into 3D object pseudo labels, even under occlusions or different viewing angles. Projected 2D object masks from 3D pseudo labels are used to train a pixel-wise feature extractor through contrastive learning. During online inference, a clustering method uses the learned features to cluster foreground pixels into object segments. Experiments highlight the method's effectiveness on both real and synthetic video datasets, which include cluttered scenes of tabletop objects. The proposed method outperforms existing unsupervised methods for object segmentation by a large margin." Depth Is All You Need for Monocular 3D Detection,"Dennis Park, Jie Li, Dian Chen, Vitor Guizilini, Adrien Gaidon",Toyota Research Institute,Object Detection and Segmentation,"A key contributor to recent progress in 3D detection from single images is monocular depth estimation. Existing methods focus on how to leverage depth explicitly, by generating pseudo-pointclouds or providing attention cues for image features. 
More recent works leverage depth prediction as a pretraining task and fine-tune the depth representation while training it for 3D detection. However, the adaptation is limited in scale by manual labels. In this work, we propose further aligning the depth representation with the target domain in an unsupervised fashion. Our methods leverage commonly available LiDAR or RGB videos during training time to fine-tune the depth representation, which leads to improved 3D detectors. Especially when using RGB videos, we show that our two-stage training by first generating depth pseudo-labels is critical, because of the inconsistency in loss distribution between the two tasks. With either type of reference data, our multi-task learning approach improves over the state of the art on both KITTI and NuScenes, while matching the test-time complexity of its single-task sub-network. Source code and pre-trained models are available on https://github.com/TRI-ML/DD3D." Towards Visual Classification under Class Ambiguity,"Viktor Kozák, Jan Mikula, Lukáš Bertl, Karel Kosnar, Libor Přeučil","Faculty of Electrical Engineering – Czech Technical University in Prague,Czech Technical University in Prague,Czech Technical University in Prague, CIIRC",Object Detection and Segmentation,"Visual classification under uncertainty is a complex computer vision problem. We present a thorough comparison of several variants of convolutional neural network (CNN) classification techniques in the context of ambiguous image data interpretation. We explore possible improvements in classification accuracy achieved by insertion of prior ambiguity information during the annotation process. This enables us to harness known similarities between individual classes and use them as probability distributions for soft ground-truth labels. 
We also present an approach based on Bayesian CNNs, offering the possibility of further interpretation of classification results in a problem where the neural network model is often considered as a black box. The presented techniques are verified on a practical spot weld inspection problem." LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations,"Zhaoqi Leng, Guowang Li, Chenxi Liu, Ekin Cubuk, Pei Sun, Tong He, Dragomir Anguelov, Mingxing Tan","Waymo LLC,Waymo,Google,Waymo Research",Object Detection and Segmentation,"Data augmentations are important for training high-performance 3D object detectors that use point clouds. Despite recent efforts on designing new data augmentations, perhaps surprisingly, most current state-of-the-art 3D detectors only rely on a few simple data augmentations. In particular, different from 2D image data augmentations, 3D data augmentations need to account for different representations of input data and require being customized for different models, which introduces significant overhead. In this paper, we propose LidarAugment, a practical and effective data augmentation strategy for 3D object detection. Unlike previous methods, which require tuning all augmentation policies in an exponentially large search space, we propose to factorize and align the search space of each data augmentation, which cuts down the 20+ hyperparameters to 2, and significantly reduces the search complexity. We show LidarAugment can be easily adapted to different model architectures with different input representations by a simple 2D grid search, and consistently improve a range of detectors including both convolution-based UPillars/StarNet/RSN and transformer-based SWFormer. Furthermore, LidarAugment mitigates overfitting and enables 3D detectors to scale up to larger capacities. When combined with the latest 3D detectors, LidarAugment achieves a new state-of-the-art 74.8 mAPH L2 on the Waymo Open Dataset." 
HFT: Lifting Perspective Representations Via Hybrid Feature Transformation for BEV Perception,"Jiayu Zou, Zheng Zhu, Junjie Huang, Tian Yang, Guan Huang, Xingang Wang","Institute of Automation, Chinese Academy of Sciences,Phigent Robotics,PhiGent Robotics,Research Center of Precision Sensing and Control, Institute of A",Object Detection and Segmentation,"Restoring an accurate Bird’s Eye View (BEV) map plays a crucial role in the perception of autonomous driving. The existing works of lifting representations from frontal view to BEV can be classified into two categories, i.e., Camera model-Based Feature Transformation (CBFT) and Camera model-Free Feature Transformation (CFFT). We empirically analyze the significant differences between CBFT and CFFT. The former method lifts perspective features based on the flat-world assumption, which often causes distortion of regions lying above the ground plane. The latter method is limited in the perception performance due to the absence of geometric priors and time-consuming computing. In this paper, we propose a novel framework with a Hybrid Feature Transformation module (HFT) to lift perspective representations. Furthermore, we design a mutual learning scheme to augment hybrid transformation. The deformable attention mechanism enables the model to pay more attention to relevant regions and capture features with more semantics. We illustrate the effectiveness of HFT in BEV perception tasks, such as segmentation and object detection. Notably, in the task of semantic segmentation, extensive experiments demonstrate that HFT outperforms the previous state-of-the-art method by relatively 17.9% on the Argoverse and 22.0% on the KITTI 3D Object dataset. With negligible computing budget, HFT outperforms existing image-based methods on 3D object detection. The code will be released soon."
Radar Velocity Transformer: Single-Scan Moving Object Segmentation in Noisy Radar Point Clouds,"Matthias Zeller, Vardeep Singh Sandhu, Benedikt Mersch, Jens Behley, Michael Heidingsfeld, Cyrill Stachniss","CARIAD SE,University of Bonn, CARIAD,University of Bonn",Object Detection and Segmentation,"The awareness about moving objects in the surroundings of a self-driving vehicle is essential for safe and reliable autonomous navigation. The interpretation of LiDAR and camera data achieves exceptional results but typically requires to accumulate and process temporal sequences of data in order to extract motion information. In contrast, radar sensors, which are already installed in most recent vehicles, can overcome this limitation as they directly provide the Doppler velocity of the detections and, hence incorporate instantaneous motion information within a single measurement. In this paper, we tackle the problem of moving object segmentation in noisy radar point clouds. We also consider differentiating parked from moving cars, to enhance scene understanding. Instead of exploiting temporal dependencies to identify moving objects, we develop a novel transformer-based approach to accurately perform single-scan moving object segmentation in sparse radar scans. The key to our Radar Velocity Transformer is to incorporate the valuable velocity information throughout each module of the network, thereby enabling the precise segmentation of moving and non-moving objects. Additionally, we propose a transformer-based upsampling, which enhances the performance by adaptively combining information and overcoming the limitation of interpolation of sparse point clouds. Finally, we create a new radar moving object segmentation benchmark based on the RadarScenes dataset and compare our approach to other state-of-the-art methods. Our network runs faster than the frame rate of the sensor and shows superior segmentation results using only single-scan radar data." 
CurveFormer: 3D Lane Detection by Curve Propagation with Curve Queries and Attention,"Yifeng Bai, Zhirong Chen, Zhangjie Fu, Lang Peng, Pengpeng Liang, Erkang Cheng","University of Science and Technology of China,Nullmax,Southeast university,Zhengzhou University,Nullmax Inc",Object Detection and Segmentation,"3D lane detection is an integral part of autonomous driving systems. Previous CNN and Transformer-based methods usually first generate a bird's-eye-view (BEV) feature map from the front view image, and then use a sub-network with BEV feature map as input to predict 3D lanes. Such approaches require an explicit view transformation between BEV and front view, which itself is still a challenging problem. In this paper, we propose CurveFormer, a single-stage Transformer-based method that directly calculates 3D lane parameters and can circumvent the difficult view transformation step. Specifically, we formulate 3D lane detection as a curve propagation problem by using curve queries. A 3D lane query is represented by a dynamic and ordered anchor point set. In this way, queries with curve representation in Transformer decoder iteratively refine the 3D lane detection results. Moreover, a curve cross-attention module is introduced to compute the similarities between curve queries and image features. Additionally, a context sampling module that can capture more relative image features of a curve query is provided to further boost the 3D lane detection performance. We evaluate our method for 3D lane detection on both synthetic and real-world datasets, and the experimental results show that our method achieves promising performance compared with the state-of-the-art approaches. The effectiveness of each component is validated via ablation studies as well." 
Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN,"YuXuan (Andrew) Liu, Nikhil Mishra, Pieter Abbeel, Xi Chen","Covariant.ai, UC Berkeley,UC Berkeley,covariant.ai,UC Berkeley,Embodied Intelligence, UC Berkeley",Object Detection and Segmentation,"Object recognition and instance segmentation are fundamental skills in any robotic or autonomous system. Existing state-of-the-art methods are often unable to capture meaningful uncertainty in challenging or ambiguous scenes, and as such can cause critical errors in high-performance applications. In this paper, we explore a class of distributional instance segmentation models using latent codes that can model uncertainty over plausible hypotheses of object masks. For robotic picking applications, we propose a confidence mask method to achieve the high precision necessary in industrial use cases. We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes in a robotic application. On a real-world apparel-picking robot, our method significantly reduces double pick errors while maintaining high performance." Bayesian Inference of Fog Visibility from LiDAR Point Clouds and Correlation with Probabilities of Detection,"Karl Montalban, Christophe Reymann, Dinesh Atchuthan, Paul-Édouard Dupouy, Nicolas Riviere, Simon Lacroix","easymile,EASYMILE SAS,EasyMile,ONERA,LAAS/CNRS",Object Detection and Segmentation,"Degraded visual environments have strong impacts on the quality of LiDAR data. Experiments in artificial fog conditions show that noise points caused by water particles present various distance distributions which depend on visibility. This article introduces a mathematical framework based on Bayesian inference and Markov Chain Monte-Carlo sampling to infer optical visibility from point clouds. 
The visibility estimation is cast as a classification problem based on the identification of the distance distributions. Contrary to deep learning methods, our approach is model-based and focuses on the design of a full probabilistic framework, more comprehensible, which is critical for autonomous driving. Ultimately, the impact of the optical visibility on the probability of detection of standard targets is assessed, which can yield improvements in autonomous vehicle performance in adverse weather conditions." GDIP: Gated Differentiable Image Processing for Object Detection in Adverse Conditions,"Sanket Kalwar, Dhruv Patel, Aakash Aanegola, Krishna Konda, Sourav Garg, Madhava Krishna","International Institute of Information Technology, Hyderabad,International Institute of Information Technology, Hyderabad, In,ZF TCI,Queensland University of Technology,IIIT Hyderabad",Object Detection and Segmentation,"Detecting objects under adverse weather and lighting conditions is crucial for the safe and continuous operation of an autonomous vehicle, and remains an unsolved problem. We present a Gated Differentiable Image Processing (GDIP) block, a domain-agnostic network architecture, which can be plugged into existing object detection networks (e.g., Yolo) and trained end-to-end with adverse condition images such as those captured under fog and low lighting. Our proposed GDIP block learns to enhance images directly through the downstream object detection loss. This is achieved by learning parameters of multiple image pre-processing (IP) techniques that operate concurrently, with their outputs combined using weights learned through a novel gating mechanism. We further improve GDIP through a multi-stage guidance procedure for progressive image enhancement.
Finally, trading off accuracy for speed, we propose a variant of GDIP that can be used as a regularizer for training Yolo, which eliminates the need for GDIP-based image enhancement during inference, resulting in higher throughput and plausible real-world deployment. We demonstrate significant improvement in detection performance over several state-of-the-art methods through quantitative and qualitative studies on synthetic datasets such as PascalVOC, and real-world foggy (RTTS) and low-lighting (ExDark) datasets." "Sample, Crop, Track: Self-Supervised Mobile 3D Object Detection for Urban Driving LiDAR","Sangyun Shin, Stuart Golodetz, Madhu Vankadari, Zhou Kaichen, Andrew Markham, Niki Trigoni","University of Oxford,Oxford University",Object Detection and Segmentation,"Deep learning has led to great progress in the detection of mobile (i.e. movement-capable) objects in urban driving scenes in recent years. Supervised approaches typically require the annotation of large training sets; there has thus been great interest in leveraging weakly, semi- or self-supervised methods to avoid this, with much success. Whilst weakly and semi-supervised methods require some annotation, self-supervised methods have used cues such as motion to relieve the need for annotation altogether. However, a complete absence of annotation typically degrades their performance, and ambiguities that arise during motion grouping can inhibit their ability to find accurate object boundaries. In this paper, we propose a new self-supervised mobile object detection approach called SCT. This uses both motion cues and expected object sizes to improve detection performance, and predicts a dense grid of 3D oriented bounding boxes to improve object discovery. 
We significantly outperform the state-of-the-art self-supervised mobile object detection method TCR on the KITTI tracking benchmark, and achieve performance that is within 30% of the fully supervised PV-RCNN++ method for IoUs" Topology Matching of Branched Deformable Linear Objects,"Manuel Zürn, Markus Wnuk, Armin Lechler, Alexander Verl","Institute for control engineering of machine tools and manufactu,University Stuttgart,University of Stuttgart",Perception of Deformable Objects,"This paper presents a new method for correspondence estimation between a previously known topology of a branched deformable linear object and an image representation from a 3D stereo camera. Although frequently encountered in production, robotic deformable linear object manipulation still lacks reliable sensor feedback. Especially for branched deformable linear objects, such as wire harnesses, correspondence estimation is very challenging. Due to their flexible nature, they have an infinite-dimensional configuration space, such that visual appearances of the same object can vary strongly. Knowing the correspondence is vital for various applications, e.g., estimating valid grasping positions for robotic wire routing or augmented reality support for workers. Therefore, this paper presents a method for matching the topology of a branched deformable linear object to camera sensor data. Asymmetries in the wire harness design reduce the solution space by comparing the known topology of a model to the topology extracted from sensor data. The problem of finding the most likely solution to the matching problem requires features extracted from camera images. These features are used to construct a graph-based topology representation, which can then be matched to a graph-based topology representation of the known branched deformable linear object. 
The presented method is evaluated using multiple different non-overlapping configurations of a wire harness, showing the effectiveness of a graph-based segment matching approach." DLOFTBs – Fast Tracking of Deformable Linear Objects with B-Splines,"Piotr Kicki, Amadeusz Szymko, Krzysztof Walas",Poznan University of Technology,Perception of Deformable Objects,"While the manipulation of rigid objects is an extensively explored research topic, deformable linear object (DLO) manipulation seems significantly underdeveloped. A potential reason for this is the inherent difficulty in describing and observing the state of the DLO as its geometry changes during manipulation. This paper proposes an algorithm for fast-tracking the shape of a DLO based on the masked image. Having no prior knowledge about the tracked object, the proposed method finds a reliable representation of the shape of the tracked object within tens of milliseconds. This algorithm's main idea is to first skeletonize the DLO mask image, walk through the parts of the DLO skeleton, arrange the segments into an ordered path, and finally fit a B-spline into it. Experiments show that our solution outperforms the State-of-the-Art approaches in DLO's shape reconstruction accuracy and algorithm running time and can handle challenging scenarios such as severe occlusions, self-intersections, and multiple DLOs in a single image." Self-Supervised Cloth Reconstruction Via Action-Conditioned Cloth Tracking,"Zixuan Huang, Xingyu Lin, David Held","University of Michigan,Carnegie Mellon University",Perception of Deformable Objects,"State estimation is one of the greatest challenges for cloth manipulation due to cloth's high dimensionality and self-occlusion. Prior works propose to identify the full state of crumpled clothes by training a mesh reconstruction model in simulation. However, such models are prone to suffer from a sim-to-real gap due to differences between cloth simulation and the real world. 
In this work, we propose a self-supervised method to finetune a mesh reconstruction model in the real world. Since the full mesh of crumpled cloth is difficult to obtain in the real world, we design a special data collection scheme and an action-conditioned model-based cloth tracking method to generate pseudo-labels for self-supervised learning. By finetuning the pretrained mesh reconstruction model on this pseudo-labeled dataset, we show that we can improve the quality of the reconstructed mesh without requiring human annotations, as well as the performance of downstream robot manipulation task." Learning to Estimate 3-D States of Deformable Linear Objects from Single-Frame Occluded Point Clouds,"Kangchen Lv, Mingrui Yu, Yifan Pu, Xin Jiang, Gao Huang, Xiang Li","Tsinghua University,Beijing Academy of Artificial Intelligence",Perception of Deformable Objects,"Accurately and robustly estimating the state of deformable linear objects (DLOs), such as ropes and wires, is crucial for DLO manipulation and other applications. However, it remains a challenging open issue due to the high dimensionality of the state space, frequent occlusions, and noises. This paper focuses on learning to robustly estimate the states of DLOs from single-frame point clouds in the presence of occlusions using a data-driven method. We propose a novel two-branch network architecture to exploit global and local information of input point cloud respectively and design a fusion module to effectively leverage the advantages of both methods. Simulation and real-world experimental results demonstrate that our method can generate globally smooth and locally precise DLO state estimation results even with heavily occluded point clouds, which can be directly applied to real-world robotic manipulation of DLOs in 3-D space." 
Feature Extraction for Effective and Efficient Deep Reinforcement Learning on Real Robotic Platforms,"Peter Bohm, Pauline Pounds, Archie Chapman",The University of Queensland,Reinforcement Learning I,"Deep reinforcement learning (DRL) methods can solve complex continuous control tasks in simulated environments by taking actions based solely on state observations at each decision point. Because of the dynamics involved, individual snapshots of real-world sensor measurements afford only partial state observability, so it is typical to use a history of observations to improve training and policy performance. Such intertemporal information can be further exploited using a recurrent neural network (RNN) to reduce the dimensionality of the dynamic state representation. However, using RNNs as an internal part of a DRL network presents challenges of its own; and even then, the improvements in resulting policies are usually limited. To address these shortcomings, we propose using gated feature extraction to improve DRL training of real-world robots. Specifically, we use an untrained gated recurrent unit (GRU) to encode a low-dimension representation of the state observation sequence before passing it to the DRL training procedure. In addition to dimensionality reduction, this allows us to unroll the RNN by encoding the observations cumulatively as they are collected, thereby avoiding same-length input requirements, and train the RL network on the raw observations at the current step combined with the GRU-encoding of the preceding steps. Our simulation experiments employ gated feature extraction with the TD3 algorithm. 
Our results show that the GRU-encoded state observations improve the training speed and execution performance of the TD3 algorithm, improving the learned policies in all 19 test cases, exceeding the maximum achieved reward by over 38% in 8 and doubling the maximum achieved reward in three, while also outperforming a baseline implementation of SAC in 17 out of 19 environments. Moreover, the greatest improvement is seen in real-world experiments, where our approach successfully learns to balance a pendulum as well as a complex quadrupedal locomotion task." Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation,"Luca Marzari, Enrico Marchesini, Alessandro Farinelli","University of Verona,Northeastern University",Reinforcement Learning I,"Safety is essential for deploying Deep Reinforcement Learning (DRL) algorithms in real-world scenarios. Recently, verification approaches have been proposed to allow quantifying the number of violations of a DRL policy over input-output relationships, called properties. However, such properties are hard-coded and require task-level knowledge, making their application intractable in challenging safety-critical tasks. To this end, we introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time. CROP employs a cost signal to identify unsafe interactions and use them to shape safety properties. Hence, we propose a refinement strategy to combine properties that model similar unsafe interactions. Our evaluation compares the benefits of computing the number of violations using standard hard-coded properties and the ones generated with CROP. We evaluate our approach in several robotic mapless navigation tasks and demonstrate that the violation metric computed with CROP allows higher returns and lower violations over previous Safe DRL approaches." 
Learning to View: Decision Transformers for Active Object Detection,"Wenhao Ding, Nathalie Majcherczyk, Mohit Deshpande, Xuewei Qi, Ding Zhao, Rajasimman Madhivanan, Arnab Sen","Carnegie Mellon University,Amazon LLC,Amazon Lab,,,,Toyota North America R&D Labs,Carnegie mellon university,Amazon.com,Amazon",Reinforcement Learning I,"Active perception describes a broad class of techniques that couple planning and perception systems to move the robot in a way to give the robot more information about the environment. In most robotic systems, perception is typically independent of motion planning. For example, traditional object detection is passive: it operates only on the images it receives. However, we have a chance to improve the results if we allow planning to consume detection signals and move the robot to collect views that maximize the quality of the results. In this paper, we use reinforcement learning (RL) methods to control the robot in order to obtain images that maximize the detection quality. Specifically, we propose using a Decision Transformer with online fine-tuning, which first optimizes the policy with a pre-collected expert dataset and then improves the learned policy by exploring better solutions in the environment. We evaluate the performance of the proposed method on an interactive dataset collected from an indoor scenario simulator. Experimental results demonstrate that our method outperforms all baselines, including expert policy and pure offline RL methods. We also provide exhaustive analyses of the reward distribution and observation space." 
Deep Reinforcement Learning for Autonomous Driving Using High-Level Heterogeneous Graph Representations,"Maximilian Schier, Christoph Reinders, Bodo Rosenhahn","Leibniz Universität Hannover,Leibniz University Hanover,Institute of Information Processing, Leibniz Universität Hannove",Reinforcement Learning I,"Graph networks have recently been used for decision making in automated driving tasks for their ability to capture a variable number of traffic participants. Current high-level graph-based approaches, however, do not model the entire road network and thus must rely on handcrafted features for vehicle-to-vehicle edges encompassing the road topology indirectly. We propose an entity-relation framework that intuitively models the road network and the traffic participants in a heterogeneous graph, representing all relevant information. Our novel architecture transforms the heterogeneous road-vehicle graph into a simpler graph of homogeneous node and edge types to allow effective training for deep reinforcement learning while introducing minimal prior knowledge. Unlike previous approaches, the vehicle-to-vehicle edges of this reduced graph are fully learnable and can therefore encode traffic rules without explicit feature design, an important step towards a holistic reinforcement learning model for automated driving. We show that our proposed method outperforms precomputed handcrafted features on intersection scenarios while also learning the semantics of right-of-way rules." Learning on the Job: Self-Rewarding Offline-To-Online Finetuning for Industrial Insertion of Novel Connectors from Vision,"Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine","UC Berkeley,University of California, Berkeley; Siemens,Worcester Polytechnic Institute,Siemens Corporation",Reinforcement Learning I,"Learning-based methods in robotics hold the promise of generalization, but what can be done if a learned policy does not generalize to a new situation? 
In principle, if an agent can at least evaluate its own success (i.e., with a reward classifier that generalizes well even when the policy does not), it could actively practice the task and finetune the policy in this situation. We study this problem in the setting of industrial insertion tasks, such as inserting connectors in sockets and setting screws. Existing algorithms rely on precise localization of the connector or socket and carefully managed physical setups, such as assembly lines, to succeed at the task. But in unstructured environments such as homes or even some industrial settings, robots cannot rely on precise localization and may be tasked with previously unseen connectors. Offline reinforcement learning on a variety of connector insertion tasks is a potential solution, but what if the robot is tasked with inserting a previously unseen connector? In such a scenario, we will still need methods that can robustly solve such tasks with online practice. One of the main observations we make in this work is that, with a suitable representation learning and domain generalization approach, it can be significantly easier for the reward function to generalize to a new but structurally similar task (e.g., inserting a new type of connector) than for the policy. This means that a learned reward function can be used to facilitate the finetuning of the robot's policy in situations where the policy fails to generalize in zero shot, but the reward function generalizes successfully. We show that such an approach can be instantiated in the real world, pretrained on 50 different connectors, and successfully finetuned to new connectors via the learned reward function." 
Multi-Alpha Soft Actor-Critic: Overcoming Stochastic Biases in Maximum Entropy Reinforcement Learning,"Conor Igoe, Swapnil Pande, Siddarth Venkatraman, Jeff Schneider","Carnegie Mellon University,Manipal Institute of Technology",Reinforcement Learning I,"The successful application of robotic control requires intelligent decision-making to handle the long tail of complex scenarios that arise in real-world environments. Recently, Deep Reinforcement Learning (DRL) has provided a data-driven framework to automatically learn effective policies in such complex settings. Since its introduction in 2018, Soft Actor-Critic (SAC) remains as one of the most popular off-policy DRL algorithms and has been used extensively to learn performant robotic control policies. However, in this paper we argue that by relying on the maximum entropy formalism to define learning objectives, previous work introduces a significant bias away from optimal decision making, which often requires near deterministic behaviour for high-precision tasks. Moreover, we show that when training with the original variants of SAC, overcoming this bias by reducing entropy budgets or entropy coefficients introduces separate issues that lead to slow or unstable learning. We address these shortcomings by treating the entropy coefficient α as a random variable and introduce Multi-Alpha Soft Actor-Critic (MAS). We show how MAS overcomes the stochastic bias of SAC in a variety of robotic control tasks including the CARLA urban-driving simulator, while maintaining the stability and sample efficiency of the original algorithms." 
Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning,"Zheng Wu, Yichen Xie, Wenzhao Lian, Changhao Wang, Yanjiang Guo, Jianyu Chen, Stefan Schaal, Masayoshi Tomizuka","University of California, Berkeley,Google X,Tsinghua university,Tsinghua University,University of California",Reinforcement Learning I,"Humans are capable of abstracting various tasks as different combinations of multiple attributes. This perspective of compositionality is vital for human rapid learning and adaption since previous experiences from related tasks can be combined to generalize across novel compositional settings. In this work, we aim to achieve zero-shot policy generalization of Reinforcement Learning (RL) agents by leveraging the task compositionality. Our proposed method is a meta-RL algorithm with disentangled task representation, explicitly encoding different aspects of the tasks. Policy generalization is then performed by inferring unseen compositional task representations via the obtained disentanglement without extra exploration. The evaluation is conducted on three simulated tasks and a challenging real-world robotic insertion task. Experimental results demonstrate that our proposed method achieves policy generalization to unseen compositional tasks in a zero-shot manner." Real World Offline Reinforcement Learning with Realistic Data Source,"Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind Rajeswaran, Vikash Kumar","Carnegie Mellon University,University of Washington,Meta AI",Reinforcement Learning I,"Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience. However, current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories, and thus hold limited relevance for real-world robotics. 
In this work (Real-ORL), we posit that data collected from safe operations of closely related tasks are more practical data sources for real-world robot learning. Under these settings, we perform an extensive (6500+ trajectories collected over 800+ robot hours and 270+ human labor hours) empirical study evaluating generalization and transfer capabilities of representative ORL methods on four real-world tabletop manipulation tasks. Our study finds that ORL and imitation learning prefer different action spaces, and that ORL algorithms can generalize from leveraging offline heterogeneous data sources and outperform imitation learning. We release our dataset and implementations at URL: https://sites.google.com/view/real-orl" Robotic Table Wiping Via Reinforcement Learning and Whole-Body Trajectory Optimization,"Thomas Lew, Sumeet Singh, Mario Prats, Jeffrey Bingham, Jonathan Weisz, Benjie Holson, Xiaohan Zhang, Vikas Sindhwani, Yao Lu, Fei Xia, Peng Xu, Tingnan Zhang, Jie Tan, Montse Gonzalez Arenas","Stanford University,Google,X,Everyday Robots,Binghamton University,Google Brain, NYC,Google Inc",Reinforcement Learning I,"We propose a framework to enable multipurpose assistive mobile robots to autonomously wipe tables to clean spills and crumbs. This problem is challenging, as it requires planning wiping actions while reasoning over uncertain latent dynamics of crumbs and spills captured via high-dimensional visual observations. Simultaneously, we must guarantee constraints satisfaction to enable safe deployment in unstructured cluttered environments. To tackle this problem, we first propose a stochastic differential equation to model crumbs and spill dynamics and absorption with a robot wiper. Using this model, we train a vision-based policy for planning wiping actions in simulation using reinforcement learning (RL). 
To enable zero-shot sim-to-real deployment, we dovetail the RL policy with a whole-body trajectory optimization framework to compute base and arm joint trajectories that execute the desired wiping motions while guaranteeing constraints satisfaction. We extensively validate our approach in simulation and on hardware." Towards True Lossless Sparse Communication in Multi-Agent Systems,"Seth Karten, Mycal Tucker, Siva Kailas, Katia Sycara","Carnegie Mellon University,Massachusetts Institute of Technology",Reinforcement Learning I,"Communication enables agents to cooperate to achieve their goals. Learning when to communicate, i.e., sparse (in time) communication, and whom to message is particularly important when bandwidth is limited. However, recent work in learning sparse individualized communication suffers from high variance during training, where decreasing communication comes at the cost of decreased reward, particularly in cooperative tasks. We use the information bottleneck to reframe sparsity as a representation learning problem, which we show naturally enables lossless sparse communication at lower budgets than prior art. In this paper, we propose a method for true lossless sparsity in communication via Information Maximizing Gated Sparse Multi-Agent Communication (IMGS-MAC). Our model uses two individualized regularization objectives, an information maximization autoencoder and sparse communication loss, to create informative and sparse communication. We evaluate the learned communication `language' through direct causal analysis of messages in non-sparse runs to determine the range of lossless sparse budgets, which allow zero-shot sparsity, and the range of sparse budgets that will incur a reward loss, which is minimized by our learned gating function with few-shot sparsity. To demonstrate the efficacy of our results, we experiment in cooperative multi-agent tasks where communication is essential for success. 
We evaluate our model with both continuous and discrete messages. We focus our analysis on a variety of ablations to show the effect of message representations, including their properties, and lossless performance of our model." Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning,"Cheng Liu, Erik-jan Van Kampen, Guido De Croon","Delft University of Technology,TU Delft",Reinforcement Learning I,"Enabling the capability of assessing risk and making risk-aware decisions is essential to applying reinforcement learning to safety-critical robots like drones. In this paper, we investigate a specific case where a nano quadcopter robot learns to navigate an apriori-unknown cluttered environment under partial observability. We present a distributional reinforcement learning framework to generate adaptive risk-tendency policies. Specifically, we propose to use lower tail conditional variance of the learnt return distribution as intrinsic uncertainty estimation, and use exponentially weighted average forecasting (EWAF) to adapt the risk-tendency in accordance with the estimated uncertainty. In simulation and real-world empirical results, we show that (1) the most effective risk-tendency varies across states, (2) the agent with adaptive risk-tendency achieves superior performance compared to risk-neutral policy or risk-averse policy baselines." Self-Adaptive Driving in Nonstationary Environments through Conjectural Online Lookahead Adaptation,"Tao Li, Haozhe Lei, Quanyan Zhu",New York University,Reinforcement Learning I,"Powered by deep representation learning, reinforcement learning (RL) provides an end-to-end learning framework capable of solving self-driving (SD) tasks without manual designs. However, time-varying nonstationary environments cause proficient but specialized RL policies to fail at execution time. For example, an RL-based SD policy trained under sunny days does not generalize well to rainy weather. 
Even though meta learning enables the RL agent to adapt to new tasks/environments, its offline operation fails to equip the agent with online adaptation ability when facing nonstationary environments. This work proposes an online meta reinforcement learning algorithm based on the conjectural online lookahead adaptation (COLA). COLA determines the online adaptation at every step by maximizing the agent's conjecture of the future performance in a lookahead horizon. Experimental results demonstrate that under dynamically changing weather and lighting conditions, the COLA-based self-adaptive driving outperforms the baseline policies in terms of online adaptability. A demo video, source code, and appendixes are available at https://github.com/Panshark/COLA." Sim-To-Real Policy and Reward Transfer with Adaptive Forward Dynamics Model,"Rongshun Juan, Hao Ju, Jie Huang, Randy Gomez, Keisuke Nakamura, Guangliang Li","Tianjin University,Ocean University of China,Honda Research Institute Japan Co., Ltd.",Transfer Learning,"Deep reinforcement learning has shown promise in learning robust skills for robot control, but typically requires a large amount of samples to achieve good performance. Sim-to-real transfer learning has been developed to solve this problem, but the policy trained in simulation usually has unsatisfactory performance in the real world because simulators inevitably model the dynamics of reality imperfectly. To enable sample-efficient learning in the real world, we proposed progressive policy transfer with adaptive dynamics model (PPTADM). PPTADM assumes the dynamics of simulation and real world do not match but the state space is the same, transfers policy from simulation via progressive neural network (PNN) and further improves the policy with a learned forward dynamics model in reality. 
In addition, for real-world tasks in which reward functions are difficult or even impossible to define and verify the effectiveness, PPTADM can learn in real world solely from a transferred reward function that is estimated from simulation even though their dynamics do not match. Our results in five simulated tasks and on a real robot arm show that with PPTADM, the robot’s learning efficiency and performance in the real world can be significantly improved." Safety-Constrained Policy Transfer with Successor Features,"Zeyu Feng, Bowen Zhang, Jianxin Bi, Harold Soh",National University of Singapore,Transfer Learning,"In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained exploration can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. We propose a Constrained Markov Decision Process (CMDP) formulation that simultaneously enables the transfer of policies and adherence to safety constraints. Our formulation cleanly separates task goals from safety considerations and permits the specification of a wide variety of constraints. Our approach relies on a novel extension of generalized policy improvement to constrained settings via a Lagrangian formulation. We devise a dual optimization algorithm that estimates the optimal dual variable of a target task, thus enabling safe transfer of policies derived from successor features learned on source tasks. Our experiments in simulated domains show that our approach is effective; it visits unsafe states less frequently and outperforms alternative state-of-the-art methods when taking safety constraints into account." 
GNM: A General Navigation Model to Drive Any Robot,"Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine","University of California, Berkeley,UC Berkeley,UC Berkeley / TOYOTA Motor North America",Transfer Learning,"Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data. If we could combine data from all available sources, including multiple kinds of robots, we could train more powerful navigation models. In this paper, we study how a general goal-conditioned model for vision-based navigation can be trained on data obtained from many distinct but structurally similar robots, and enable broad generalization across environments and embodiments. We analyze the necessary design decisions for effective data sharing across robots, including the use of temporal context and standardized action spaces, and demonstrate that an omnipolicy trained from heterogeneous datasets outperforms policies trained on any single dataset. We curate 60 hours of navigation trajectories from 6 distinct robots, and deploy the trained GNM on a range of new robots, including an underactuated quadrotor. We find that training on diverse data leads to robustness against degradation in sensing and actuation. Using a pre-trained navigation model with broad generalization capabilities can bootstrap applications on novel robots going forward, and we hope that the GNM represents a step in that direction. For more information on the datasets, code, and videos, please check out our project page." ViPFormer: Efficient Vision-And-Pointcloud Transformer for Unsupervised Pointcloud Understanding,"Hongyu Sun, Yongcai Wang, Xudong Cai, Xuewei Bai, Deying Li",Renmin University of China,Transfer Learning,"Recently, a growing number of works design unsupervised paradigms for point cloud processing to alleviate the limitation of expensive manual annotation and poor transferability of supervised methods. 
Among them, CrossPoint follows the contrastive learning framework and exploits image and point cloud data for unsupervised point cloud understanding. Although the promising performance is presented, the unbalanced architecture makes it unnecessarily complex and inefficient. For example, the image branch in CrossPoint is ~8.3x heavier than the point cloud branch leading to higher complexity and latency. To address this problem, in this paper, we propose a lightweight Vision-and-Pointcloud Transformer (ViPFormer) to unify image and point cloud processing in a single architecture. ViPFormer learns in an unsupervised manner by optimizing intra-modal and cross-modal contrastive objectives. Then the pretrained model is transferred to various downstream tasks, including 3D shape classification and semantic segmentation. Experiments on different datasets show ViPFormer surpasses previous state-of-the-art unsupervised methods with higher accuracy, lower model complexity and runtime latency. Finally, the effectiveness of each component in ViPFormer is validated by extensive ablation studies. The implementation of the proposed method is available at https://github.com/auniquesun/ViPFormer." Learning Exploration Strategies to Solve Real-World Marble Runs,"Alisa Allaire, Christopher Atkeson","Carnegie Mellon University,CMU",Learning Methods,"Tasks involving locally unstable or discontinuous dynamics (such as bifurcations and collisions) remain challenging in robotics, because small variations in the environment can have a significant impact on task outcomes. For such tasks, learning a robust deterministic policy is difficult. We focus on structuring exploration with multiple stochastic policies based on a mixture of experts (MoE) policy representation that can be efficiently adapted. 
The MoE policy is composed of stochastic sub-policies that allow exploration of multiple distinct regions of the action space (or strategies) and a high-level selection policy to guide exploration towards the most promising regions. We develop a robot system to evaluate our approach in a real-world physical problem solving domain. After training the MoE policy in simulation, online learning in the real world demonstrates efficient adaptation within just a few dozen attempts, with a minimal sim2real gap. Our results confirm that representing multiple strategies promotes efficient adaptation in new environments and strategies learned under different dynamics can still provide useful information about where to look for good strategies." Multi-Embodiment Legged Robot Control As a Sequence Modeling Problem,"Chen Yu, Weinan Zhang, Hang Lai, Zheng Tian, Laurent Kneip, Jun Wang","ShanghaiTech University,Shanghai Jiao Tong University,University College London",Learning Methods,"Robots are traditionally bounded by a fixed embodiment during their operational lifetime, which limits their ability to adapt to their surroundings. Co-optimizing control and morphology of a robot, however, is often inefficient due to the complex interplay between the controller and morphology. In this paper, we propose a learning-based control method that can inherently take morphology into consideration such that once the control policy is trained in the simulator, it can be easily deployed to robots with different embodiments in the real world. In particular, we present the Embodiment-aware Transformer (EAT), an architecture that casts this control problem as conditional sequence modeling. EAT outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired robot embodiment, past states, and actions, our EAT model can generate future actions that best fit the current robot embodiment. 
Experimental results show that EAT can outperform all other alternatives in embodiment-varying tasks, and succeed in an example of real-world evolution tasks: stepping down a stair through updating the morphology alone. We hope that EAT will inspire a new push toward real-world evolution across many domains, where algorithms like EAT can blaze a trail by bridging the field of evolutionary robotics and big data sequence modeling." Efficient Recovery Learning Using Model Predictive Meta-Reasoning,"Shivam Vats, Maxim Likhachev, Oliver Kroemer",Carnegie Mellon University,Learning Methods,"Operating under real world conditions is challenging due to the possibility of a wide range of failures induced by execution errors and state uncertainty. In relatively benign settings, such failures can be overcome by retrying or executing one of a small number of hand-engineered recovery strategies. By contrast, contact-rich sequential manipulation tasks, like opening doors and assembling furniture, are not amenable to exhaustive hand-engineering. To address this issue, we present a general approach for robustifying manipulation strategies in a sample-efficient manner. Our approach incrementally improves robustness by first discovering the failure modes of the current strategy via exploration in simulation and then learning additional recovery skills to handle these failures. To ensure efficient learning, we propose an online algorithm called Meta-Reasoning for Skill Learning (MetaReSkill) that monitors the progress of all recovery policies during training and allocates training resources to recoveries that are likely to improve the task performance the most. We use our approach to learn recovery skills for door-opening and evaluate them both in simulation and on a real robot with little fine-tuning. 
Compared to open-loop execution, our experiments show that even a limited amount of recovery learning improves task success substantially from 71% to 92.4% in simulation and from 75% to 90% on a real robot." Multi-Swarm Genetic Gray Wolf Optimizer with Embedded Autoencoders for High-Dimensional Expensive Problems,"Jing Bi, Jiahui Zhai, Haitao Yuan, Ziqi Wang, Junfei Qiao, Jia Zhang, Mengchu Zhou","Beijing University of Technology, Beijing ,,,,,,, China,Beijing University of Technology,Beihang University,Southern Methodist University,New Jersey Institute of Technology",Learning Methods,"High-dimensional expensive problems are often encountered in the design and optimization of complex robotic and automated systems and distributed computing systems, and they suffer from a time-consuming fitness evaluation process. It is extremely challenging and difficult to produce promising solutions in high-dimensional search space. This work proposes an evolutionary optimization framework with embedded autoencoders that effectively solve optimization problems with high-dimensional search spaces. Autoencoders provide strong dimension reduction and feature extraction abilities that compress a high-dimensional space to an informative low-dimensional one. Search operations are performed in a low-dimensional space, thereby guiding a whole population to converge to the optimal solution more efficiently. Multiple subpopulations coevolve iteratively in a distributed manner. One subpopulation is embedded by an autoencoder, and the other one is guided by a proposed Multi-swarm Gray wolf optimizer based on Genetic learning (MGG). Thus, the proposed multi-swarm framework is named Autoencoder-based MGG (AMGG). 
AMGG consists of three proposed strategies that well balance exploration and exploitation abilities, i.e., a Dynamic subgroup Number Strategy for reducing the number of subpopulations, a Subpopulation Reorganization Strategy for sharing useful information about each subpopulation, and a Purposeful Detection Strategy for jumping out of local optima and improving exploration ability. AMGG is compared with several widely used algorithms by using typical benchmark functions and a real-life optimization problem. Extensive experimental results prove that AMGG outperforms its peers in terms of search accuracy and convergence efficiency." "H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions","Kei Ota, Hsiao-yu Tung, Kevin Smith, Anoop Cherian, Tim K. Marks, Alan Sullivan, Asako Kanezaki, Joshua Tenenbaum","Tokyo Institute of Technology,CMU,Massachusetts Institute of Technology,Mitsubishi Electric Research Labs,Mitsubishi Electric Research Laboratories (MERL),Mitsubishi Electric Research Lab",Learning Methods,"The world is filled with articulated objects that are difficult to determine how to use from vision alone, e.g., a door might open inwards or outwards. Humans handle these objects with strategic trial-and-error: first pushing a door then pulling if that doesn’t work. We enable these capabilities in autonomous agents by proposing “Hypothesize, Simulate, Act, Update, and Repeat” (H-SAUR), a probabilistic generative framework that simultaneously generates a distribution of hypotheses about how objects articulate given input observations, captures certainty over hypotheses over time, and infers plausible actions for exploration and goal-conditioned manipulation. We compare our model with existing work in manipulating objects after a handful of exploration actions, on the PartNet-Mobility dataset. We further propose a novel PuzzleBoxes benchmark that contains locked boxes that require multiple steps to solve. 
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework, despite using zero training data. We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models." Self-Supervised Learning of Action Affordances As Interaction Modes,"Liquan Wang, Nikita Dvornik, Rafael Dubeau, Mayank Mittal, Animesh Garg","University of Toronto,Samsung,ETH Zurich",Learning Methods,"When humans perform a task with an articulated object, they interact with an object only in a handful of ways, while the space of all possible interactions is nearly endless. This is because humans have prior knowledge about what interactions are likely to be successful, i.e., to open a new door we first try the handle. While learning such priors without supervision is easy for humans, it is notoriously hard for machines. In this work, we tackle unsupervised learning of priors of useful interactions with articulated objects, that we call modes of interaction. In contrast to the prior art, we use no supervision or privileged information; we only assume access to the depth sensor in the simulator to learn the modes of interaction. More precisely, we define a successful interaction as the one changing the visual environment significantly and learn a generative model of such interactions, that can be conditioned on the desired goal state of the object. In our experiments, we show that our model covers most of the human modes of interaction, outperforms existing state-of-the-art methods for affordance learning, and can generalize to objects never seen during training. Additionally, we show promising results in the goal-conditional setup, where our model can be quickly fine-tuned to perform a given task. 
We show in the experiments that such affordance learning predicts interaction which covers most modes of interaction for the querying articulated object and can be fine-tuned to a goal-conditional model." LATTE: LAnguage Trajectory TransformEr,"Arthur Fender Coelho Bucker, Luis Felipe Cruz Figueredo, Sami Haddadin, Ashish Kapoor, Shuang Ma, Sai Vemprala, Rogerio Bonatti","Carnegie Mellon University,Technical University of Munich (TUM),Technical University of Munich,MicroSoft,Microsoft,Microsoft Corporation",Learning Methods,"Natural language is one of the most intuitive ways to express human intent. However, translating instructions and commands towards robotic motion generation and deployment in the real world is far from being an easy task. The challenge of combining a robot’s inherent low-level geometric and kinodynamic constraints with a human’s high-level semantic instructions traditionally is solved using task-specific solutions with little generalizability between hardware platforms, often with the use of static sets of target actions and commands. This work instead proposes a flexible language-based framework that allows a user to modify generic robotic trajectories. Our method leverages pre-trained language models (BERT and CLIP) to encode the user’s intent and target objects directly from a free-form text input and scene images, fuses geometrical features generated by a transformer encoder network, and finally outputs trajectories using a transformer decoder, without the need of priors related to the task or robot information. We significantly extend our own previous work presented in [1] by expanding the trajectory parametrization space to 3D and velocity as opposed to just XY movements. In addition, we now train the model to use actual images of the objects in the scene for context (as opposed to textual descriptions), and we evaluate the system in a diverse set of scenarios beyond manipulation, such as aerial and legged robots. 
Our simulated and real-life experiments demonstrate that our transformer model can successfully follow human intent, modifying the shape and speed of trajectories within multiple environments. Codebase available at: https://github.com/arthurfenderbucker/LaTTe-Language-Trajectory-TransformEr.git." Learning Visual Locomotion with Cross-Modal Supervision,"Antonio Loquercio, Ashish Kumar, Jitendra Malik",UC Berkeley,Learning Methods,"In this work, we show how to learn a visual walking policy that only uses an onboard RGB camera and proprioception to walk. Since simulating RGB is hard, we necessarily have to learn vision in the real world. We start with a blind walking policy trained in simulation. This policy can traverse some terrains in the real world but often struggles since it lacks knowledge of the upcoming geometry. This can be resolved with the use of vision. We train a visual module in the real world to predict the upcoming terrain with our proposed algorithm Cross Modal Supervision (CMS). CMS uses time-shifted proprioception to supervise vision and allows the policy to continually improve with more real-world experience. We evaluate our vision-based walking policy over a diverse set of terrains including stairs (up to 19cm high), slippery slopes (inclination of 35°), curbs and tall steps (up to 20cm), and complex discrete terrains. We achieve this performance with less than 30 minutes of real-world data." MMIC-I: A Robotic Platform for Assembly Integration and Internal Locomotion through Mechanical Meta-Material Structures,"Olivia Irene Formoso, Greenfield Trinh, Damiana Catanoso, In-won Park, Christine Gregg, Kenneth C. Cheung","NASA Ames Research Center,National Aeronautics and Space Administration (NASA)",Novel Actuation and Actuators,"In-space assembly is crucial to creating large-scale space structures and enabling long term space missions. 
Natural limitations in the size of transportation vehicles and ISRU production facilities necessitate an additive strategy with the size of the typical structural unit being essentially fixed and inversely proportional to the final assembly size. In prior robotic and space assembly examples, reversible mechanical integration of structural modules is typically achieved with actuated alignment and fastening mechanisms onboard every structural module. Additive assembly or manufacturing planning approaches often feature a ""build front"" that receives new materials or parts and progresses gradually across the target geometry. The system we describe here places much of the alignment and fastener actuation systems onboard a mobile robot that can operate at a build front while companion robots (Scaling Omni-directional Lattice Locomoting Explorer, SOLL-E) provide part or material transportation. The design and evaluation of this Mobile Meta-Material Interior Co-Integrator (MMIC-I), an inchworm-style locomoting robotic assembler, is described here with an emphasis on ease of assembly and a low number of unique parts for a simple design. It is designed to assist in alignment of cuboctahedron structural unit cells with captive fasteners, defining the build front in operation. Adjacent structural unit cells are locked together with specified axial and rotational actuation of the fasteners. Hardware prototypes show that the robot is able to successfully locomote to any indexed location within a lattice structure and bolt together each set of fasteners on any interface." Flow-Based Rendezvous and Docking for Marine Modular Robots in Gyre-Like Environments,"Gedaliah Knizhnik, Peihan Li, Mark Yim, M. Ani Hsieh","RRAI, University of Pennsylvania,Drexel University,University of Pennsylvania",Novel Actuation and Actuators,"Modular self-assembling systems typically assume that modules are present to assemble. 
But in sparsely observed ocean environments modules of an aquatic modular robotic system may be separated by distances they do not have the energy to cross, and the information needed for optimal path planning is often unavailable. In this work we present a flow-based rendezvous and docking controller that allows aquatic robots in gyre-like environments to rendezvous with and dock to a target by leveraging environmental forces. This approach does not require complete knowledge of the flow, but suffices with imperfect knowledge of the flow's center and shape. We validate the performance of this control approach in both simulations and experiments relative to naive rendezvous and docking strategies and show that energy efficiency improves as the scale of the gyre increases." Mobility Analysis of Screw-Based Locomotion and Propulsion in Various Media,"Jason Lim, Florian Richter, Dimitri Schreiber, Peter Gavrilov, Lizzie Peiros, Mingwei Yeoh, Calvin Joyce, Sara Wickenhiser, Michael Yip","University of Nevada, Reno,University of California, San Diego,University of California,University of California San Diego",Novel Actuation and Actuators,"Robots ``in-the-wild"" encounter and must traverse widely varying terrain, ranging from solid ground to granular materials like sand to full liquids. Numerous approaches exist, including wheeled and legged robots, each excelling in specific domains. Screw-based locomotion is a promising approach for multi-domain mobility, leveraged in exploratory robotic designs, including amphibious vehicles and snake robotics. However, unlike other forms of locomotion, there is a limited exploration of the models, parameter effects, and efficiency for multi-terrain Archimedes screw locomotion. In this work, we present work towards this missing component in understanding screw-based locomotion: comprehensive experimental results and performance analysis across different media. 
We designed a mobile test bed for indoor and outdoor experimentation to collect this data. Beyond quantitatively showing the multi-domain mobility of screw-based locomotion, we envision future researchers and engineers using the presented results to design effective screw-based locomotion systems." TJ-FlyingFish: Design and Implementation of an Aerial-Aquatic Quadrotor with Tiltable Propulsion Units,"Xuchen Liu, Minghao Dou, Dongyue Huang, Songqun Gao, Ruixin Yan, Biao Wang, Jinqiang Cui, Qinyuan Ren, Lihua Dou, Zhi Gao, Jie Chen, Ben M. Chen","The Chinese University of Hong Kong,Chinese University of Hong Kong,Nanjing University of Aeronautics and Astronautics,Peng Cheng Laboratory,Zhejiang University,Beijing Institute of Technology,Wuhan University,Tongji University",Novel Actuation and Actuators,"Aerial-aquatic vehicles are capable of moving in the two most dominant fluids, making them more promising for a wide range of applications. We propose a prototype with special designs for propulsion and thruster configuration to cope with the vast differences in the fluid properties of water and air. For propulsion, the operating range is switched for the different mediums by the dual-speed propulsion unit, providing sufficient thrust and also ensuring output efficiency. For thruster configuration, thrust vectoring is realized by the rotation of the propulsion unit around the mount arm, thus enhancing the underwater maneuverability. This paper presents a quadrotor prototype of this concept and the design details and realization in practice." 
Modular Multi-Axis Elastic Actuator with Torque Sensing Capable P-CFH for Highly Impact Resistive Robot Leg,"Youngrae Kim, Sunghyun Choi, Jinhyeok Song, Dongwon Yun","Daegu Gyeongbuk Institute of Science and Technology (DGIST), Dae,Daegu Gyeongbuk Institute of Science & Technology,DGIST,Daegu Gyeongbuk Institute of Science and Technology (DGIST)",Novel Actuation and Actuators,"This study proposes a modular Multi-axis Elastic Actuator (MAEA) for legged robots that can effectively cope with impacts that may occur during dynamic maneuvering. MAEA has multi-axis compliance and can measure the torque without additional encoders. Therefore, effective impact resistance is possible with less volume and weight than conventional Series Elastic Actuators (SEA). The 6-axis stiffness analysis of paired-Crossed Flexural Hinge (p-CFH) is extended from small deformation to large deformation, and the accuracy is verified through Finite Element Analysis (FEA) and experiments. Based on the analysis, the torque of p-CFH is measured, and feedback torque control is also performed. Finally, the robot leg was constructed with MAEA, and the multi-axis impact resistance performance of MAEA was demonstrated by analyzing the applied impact during landing experiments at various angles." Design and Mechanics of Cable-Driven Rolling Diaphragm Transmission for High-Transparency Robotic Motion,"Hoi Man Lam, Jared Walker, Lucas Jonasch, Dimitri Schreiber, Michael Yip","University of California San Diego,University of California,University of California, San Diego",Novel Actuation and Actuators,"Applications of rolling diaphragm transmissions for medical and teleoperated robotics are of great interest, due to the low friction of rolling diaphragms combined with the power density and stiffness of hydraulic transmissions. However, the stiffness-enabling pressure preloads can form a tradeoff against bearing loading in some rolling diaphragm layouts, and transmission setup can be difficult. 
Utilization of cable drives complements the rolling diaphragm transmission's advantages, but maintaining cable tension is crucial for optimal and consistent performance. In this paper, a coaxial opposed rolling diaphragm layout with cable drive and an electronic transmission control system are investigated, with a focus on system reliability and scalability. Mechanical features are proposed which enable force balancing, decoupling of transmission pressure from bearing loads, and maintenance of cable tension. Key considerations and procedures for automation of transmission setup, phasing, and operation are also presented. We also present an analysis of system stiffness to identify key compliance contributors, and conduct experiments to validate prototype design performance." Twist Snake: Plastic Table-Top Cable-Driven Robotic Arm with All Motors Located at the Base Link,"Kazutoshi Tanaka, Masashi Hamaya",OMRON SINIC X Corporation,Novel Actuation and Actuators,"Table-top robotic arms for education and research must be low-cost for availability, and lightweight and soft for safety. Therefore, as such a robot, this study focuses on designing a plastic table-top cable-driven robotic arm with all motors located at the base link. However, locating all motors at the base link results in a significant distance between a driving motor and driven joint, increases the number of parts for the force transmission, and increases the risk of a cable loosening and coming off of a pulley. To overcome these issues, this study proposed a novel cable-driven robotic arm named Twist Snake. We designed a joint composition of Twist Snake to minimize the number of parts for the force transmission. In addition, it has a compact cable-pretension/termination-mechanism and covering parts to prevent the cable from loosening and coming off of the pulley. The arm comprised 475 mm long moving links with a mass of 802 g. 
The feasibility of the arm was experimentally demonstrated by contact rich tasks, the insertion of a toy peg into a hole and swiping a whiteboard with a cleaner. The optimization of the proposed design and the development of a learning method for the arm that leverages contact will be investigated in future work." Strained Elastic Surfaces with Adjustable-Modulus Edges (SESAMEs) for Soft Robotic Actuation,"Christopher Kimmer, Michael Seokyoung Han, Cindy Harnett","Indiana University Southeast,University of Louisville",Novel Actuation and Actuators,"For robots to interact safely with humans and travel with minimal weight in aerial and space applications, packable and lightweight electronically driven actuators are sought. Active materials like shape memory wire and other artificial muscle fibers offer solutions, but these materials need a restoring force and if joint bending is required, the actuators must exert a bending moment around the joint. In this paper, we model the three-dimensional shapes of strained elastic surfaces with adjustable-modulus edges (SESAMEs), then implement SESAMEs by machine embroidering shape memory alloy wire onto stretched elastic fabric, showing a path to lightweight actuators that exert bending forces and have built-in restoring forces. SESAMEs start out planar, and upon release from the plane take on three-dimensional shapes thanks to the balance between bending energy in the boundary and strain energy in the elastic surface. The elastic creates both a restoring force to bring the boundary back to its original shape after actuation, and an out-of-plane structure for applying a bending moment. We demonstrate SESAMEs’ properties as soft robotic actuators individually and in arrays, and coupled to flexible plastic frames during the planar fabrication process to make bistable mechanical structures." 
Controllable Mechanical-Domain Energy Accumulators,"Sung Kim, David Braun",Vanderbilt University,Compliant Joints and Mechanisms,"Springs are efficient in storing and returning elastic potential energy but are unable to hold the energy they store in the absence of an external load. Lockable springs use clutches to hold elastic potential energy in the absence of an external load, but have not yet been widely adopted in applications, partly because clutches introduce design complexity, reduce energy efficiency, and typically do not afford high fidelity control over the energy stored by the spring. Here, we present the design of a novel lockable compression spring that uses a small capstan clutch to passively lock a mechanical spring. The capstan clutch can lock over 1000 N force at any arbitrary deflection, unlock the spring in less than 10 ms with a control force less than 1% of the maximal spring force, and provide an 80% energy storage and return efficiency (comparable to a highly efficient electric motor operated at constant nominal speed). By retaining the form factor of a regular spring while providing high-fidelity locking capability even under large spring forces, the proposed design could facilitate the development of energy-efficient spring-based actuators and robots." Concept Design of a New XY Compliant Parallel Manipulator with Spatial Configuration,"Zekui Lyu, Qingsong Xu",University of Macau,Compliant Joints and Mechanisms,"This paper proposes the concept design of a novel XY compliant parallel manipulator (CPM) with spatial configuration, which is beneficial to promote the performance of the XY CPM. Evolved from a planar configuration, a spatial compliant parallelogram flexure is devised as the basic module structure. Then, a mirror-symmetric XY CPM adopting spatial layout is proposed based on four-prismatic-prismatic (4-PP) parallel mechanism. The prototypes are fabricated by 3D printing for testing. 
The performance analysis and verification are conducted through theoretical modeling, finite element simulation, and experimental study. For comparison study, a planar XY CPM with similar mechanism is also developed. Results show that the proposed XY CPM with spatial configuration provides the benefits of smaller plane footprint, large working stroke, and enhanced load-bearing capacity as compared to the planar one. It is appropriate for precise positioning scenarios, like soft-contact lithography, which require high loading capacity and great compactness." Computational Design of 3D-Printable Compliant Mechanisms with Bio-Inspired Sliding Joints,"Felipe Velasquez, Bernhard Thomaszewski, Stelian Coros","ETH Zurich,Université de Montréal",Compliant Joints and Mechanisms,"We propose a computational approach for designing fully-integrated compliant mechanisms with bio-inspired joints that are stabilized and actuated by elastic elements. Similar to human knees or finger phalanges, our mechanisms leverage sliding between pairs of contacting surfaces to generate complex motions. Due to the vast design space, however, finding surface shapes that lead to ideal approximations of given target motion is a challenging and time-consuming task. To assist users in this process, our computational design tool combines forward and inverse simulation strategies that allow for guided and automated exploration of the parameter space. We demonstrate the potential of our method on a set of compliant mechanisms with different joint geometries and validate our simulation results on 3D-printed prototypes." Compliant Finger Joint with Controlled Variable Stiffness Based on Twisted Strings Actuation,"Mihai Dragusanu, Danilo Troisi, Domenico Prattichizzo, Monica Malvezzi",University of Siena,Compliant Joints and Mechanisms,"Underactuated tendon-driven fingers are a simple, yet effective solution, for realizing robotic grippers and hands. 
The lack of controllable degrees of actuation and precise sensing is compensated by the deformable structure of the finger, which is able to adapt to the objects to be grasped and manipulated, and also to implement grasping strategies based on environmental constraint exploitation. One of the main drawbacks of these robotic fingers is that, due to the limited number of actuators, they can only realize a limited number of movements. Finger closure motion realized by activating the tendon depends on finger mechanical properties, and in particular on elastic joint stiffness. In this paper, we introduce a passive elastic joint to be implemented in monolithic fingers in which the stiffness can be actively regulated by applying a pre-compression to the structure, controlled by a twisted-string actuator (TSA). The paper describes the working principle of the joint, investigates the relationship between pre-compression and flexural stiffness, and finally shows its application to a robotic finger composed of three phalanges." Design of a Variable Stiffness Spring with Human-Selectable Stiffness,"Chase Mathews, David Braun",Vanderbilt University,Compliant Joints and Mechanisms,"Springs are commonly used in wearable robotic devices to provide assistive joint torque without the need for motors and batteries. However, different tasks (such as walking or running) and different users (such as athletes with strong legs or the elderly with weak legs) necessitate different assistive joint torques, and therefore, springs with different stiffness. Variable stiffness springs are a special class of springs which can exert more or less torque upon the same deflection, provided that the user is able to change the stiffness of the spring. In this paper, we present a novel variable stiffness spring design in which the user can select a preferred spring stiffness similar to switching gears on a bicycle. 
Using a leg-swing experiment, we demonstrate that the user can increment and decrement spring stiffness in a large range to effectively assist the hip joint during leg oscillations. Variable stiffness springs with human-selectable stiffness could be key components of wearable devices which augment locomotion tasks, such as walking, running, and swimming." Novel Spring Mechanism Enables Iterative Energy Accumulation under Force and Deformation Constraints,"Cole Dempsey, David Braun",Vanderbilt University,Compliant Joints and Mechanisms,"Springs can provide force at zero net energy cost by recycling negative mechanical work to benefit motor-driven robots or spring-augmented humans. However, humans have limited force and range of motion, and motors have a limited ability to produce force. These limits constrain how much energy a conventional spring can store and, consequently, how much assistance a spring can provide. In this paper, we introduce an approach to accumulating negative work in assistive springs over several motion cycles. We show that, by utilizing a novel floating spring mechanism, the weight of a human or robot can be used to iteratively increase spring compression, irrespective of the potential energy stored by the spring. Decoupling the force required to compress a spring from the energy stored by a spring advances prior works, and could enable spring-driven robots and humans to perform physically demanding tasks without the use of large actuators." "Fast, Reliable Constrained Manipulation Using a VSA Driven Planar Robot","Andrew Bernhard, Joseph Schimmels","Argonne National Laboratory,Marquette University",Compliant Joints and Mechanisms,"This paper presents the design and performance of a planar 3R robot capable of dexterous constrained manipulation when interacting with a stiff environment. A novel variable stiffness actuator (VSA) having a stiffness ratio of approximately 500 is also described. 
Variable stiffness actuation, together with a combined position/compliance manipulation path, is used to: 1) allow the robot to passively comply with its environment along kinematically constrained directions despite model error in constraint locations, and 2) generate high stiffness for accurate motion control along kinematically unconstrained directions despite resisting forces. This manipulation strategy provides dexterity for cases in which mechanical work must be performed while complying with constraints. The manipulation strategy and robot performance were evaluated with the task of turning a steel crank to lift a weight. Results show that, when using passive compliance control, the robot completed the task 29 times faster with constraint forces 80% lower than when using traditional active compliance control (with VSAs at their highest stiffness)." A Stiffness-Changeable Soft Finger Based on Chain Mail Jamming,"Zhengtao Hu, Abdullah Ahmed, Weiwei Wan, Tetsuyou Watanabe, Kensuke Harada","Osaka University,Kanazawa University",Compliant Joints and Mechanisms,"This paper presents a stiffness-changeable soft finger using chain mail jamming. This finger can achieve adaptive grasping and in-hand manipulation by reshaping and exerting changeable gripping force. The jamming phenomenon happens when particles in a chamber get interlocked where confining pressure is exerted at their boundaries, which is widely used to construct mechanisms with changeable stiffness. Compared with traditional granular media, chain mail has a lower packing fraction and provides a stronger tensile force. In this paper, we proposed to apply chain mail jamming to the field of robotic finger design. Especially, we propose the design of the finger, the fabrication process, the method of predicting gripping force, and the grasping strategies. The experiments quantitatively verify the model of gripping force prediction. 
The demonstrations validate the advantages of adaptive grasp by picking a variety of items including foods, goods, and industrial components, and show the application of in-hand manipulation." Repetitive Twisting Durability of Synthetic Fiber Ropes,"Shinya Sadachika, Masahito Kanekiyo, Hiroyuki Nabae, Gen Endo",Tokyo Institute of Technology,Mechanism Design,"Synthetic fiber ropes are widely used for robots because of their advantages such as lightweight, high tensile strength, and flexibility. However, there is limited information on the physical properties of synthetic fiber ropes when used for robots. This study focuses on the repetitive twisting of synthetic fiber ropes and provides information for selecting them for robots based on durability. To this end, we conducted repetitive twisting experiments on five types of ropes made from different fibers; we revealed that Dyneema has higher durability against repetitive twisting than the other ropes when a single rope is twisted. In addition, we conducted experiments on Dyneema by applying torsion to two ropes in parallel like a twisted string actuator. The results indicated that two Dyneema ropes in parallel have higher durability than a single rope; however, we revealed that the tensile strength decreased sharply with an increase in the angle of twist." Computational Design of Closed-Chain Linkages: Hopping Robot Driven by Morphological Computation,"Kirill Nasonov, Dmitriy Ivolga, Ivan Borisov, Sergey Kolyubin","ITMO University,ITMO",Mechanism Design,"The main advantages of legged robots over wheeled ones are their abilities to traverse on uneven terrain due to the use of intermittent contacts and an ability to shift the center of mass relative to the contact location. A robot’s leg design can be implemented by using an open-chain mechanism actuated with high-density torque actuators though this solution needs a vast energy budget. 
An alternative way to design a leg mechanism is the application of morphological computation principle. According to the principle, most of the desired robot’s behavior can be delegated to the mechanics with minimum control effort needed to excite, stabilize or augment it. Within this paper, we have proposed a method to synthesize a leg for hopping robots. Due to optimization of mechanical structure, geometric parameters, mass distribution, and elasticity allocation, our method allows getting an energy-efficient robot with minimal control system complexity, which is accomplished via series elastic allocation and active variable length link. Based on this approach, we have designed a hopping robot with two low performance actuators that can achieve hopping, running, and, in the case of a biped or quadruped robot, walking motion. The paper describes a synthesized leg linkage and overviews prototype design, control strategy, and test results of a physical prototype." Trajectory Planning Issues in Cuspidal Commercial Robots,"Durgesh Haribhau Salunkhe, Damien Chablat, Philippe Wenger","CNRS-UMR,,,,-CD,,,,-LS,N,Laboratoire des Sciences du Numérique de Nantes,Ecole Centrale de Nantes - CNRS",Mechanism Design,"A cuspidal serial robot can travel from one inverse kinematic solution to another without crossing a singularity. Cuspidal robots ask for extra care and caution in trajectory planning, as identifying a unique aspect related to an inverse kinematic solution is not possible. The issues related to motion planning with cuspidal robots are related to the inherent property arising from the geometric design of the robot. The cuspidality property has not been considered in recent industrial 6R robots with a non-spherical wrist. In this work, cuspidality is illustrated with the JACO robot (gen 2, non-spherical wrist), a serial arm by Kinova Robotics which is deployed in various applications and is cuspidal in nature. 
A nonsingular change of solutions for the robot is provided to highlight the effect of cuspidal robots on the interference with the environment. The pose with multiple inverse kinematic solutions in an aspect is presented. Problems in choosing the initial solution of the path in cuspidal robots, and its consequence, is illustrated with an example path in the workspace of the JACO robot. The paper presents the importance of cuspidality analysis of 6R robots and the implications of neglecting it." Big Data Approach for Synthesizing a Spatial Linkage Mechanism,"Neung Hwan Yim, Jegyeong Ryu, Yoon Young Kim","Seoul National University,Korea Institute of Science and Technology",Mechanism Design,"This paper presents a novel two-step method for synthesizing spatial linkage mechanisms. Compared with planar mechanisms, the main challenge in synthesizing spatial mechanisms is that the generating motion varies depending on its mechanism topologies. Therefore, we propose a big data approach to determine the topology of spatial mechanisms. We adopt a three-dimensional (3D) spring-connected rigid block model to represent the topology of the spatial mechanism and project 3D motion onto three orthogonal planes to determine the mechanism topology with big data. In addition, a gradient-based dimension synthesis procedure was carried out to determine a detailed dimension using already determined mechanism topology by mechanism big data. Also, several successful case studies by the proposed approach are presented to support the effectiveness of the proposed synthesis method." Croche-Matic: A Robot for Crocheting 3D Cylindrical Geometry,"Gabriella Perry, Jose Luis Garcia Del Castillo Y Lopez, Nathan Melenbrink",Harvard University,Mechanism Design,"Crochet is a textile craft that has resisted mechanization and industrialization except for a select number of one-off crochet machines. These machines are only capable of producing a limited subset of common crochet stitches. 
Crochet machines are not used in the textile industry, yet mass-produced crochet objects and clothes sold in stores like Target and Zara are almost certainly the products of crochet sweatshops. The popularity of crochet and the existence of crochet products in major chain stores shows that there is both a clear demand for this craft as well as a need for it to be produced in a more ethical way. In this paper, we present Croche-Matic, a radial crochet machine for generating three-dimensional cylindrical geometry. The Croche-Matic is designed based on Magic Ring technique, a method for hand crocheting 3D cylindrical objects. The machine consists of nine mechanical axes that work in sequence to complete different types of crochet stitches, and includes a sensor component for measuring and regulating yarn tension within the mechanical system. Croche-Matic can complete the four main stitches used in Magic Ring technique. It has a success rate of 50.7% with single crochet stitches, and has demonstrated an ability to create three-dimensional objects." A Novel Platform to Control Biofouling in Pearl Oysters Cultivation,"Van-nhan Tran, Quan-dung Pham, Tan-sang Ha, Wong Yue Him, Sai-kit Yeung","Hong Kong University of Science and Technology,Shenzhen University",Mechanism Design,"This paper presents a simple yet effective design of a platform to automate the task of shellfish aquaculture, specifically pearl oysters. Compared to traditional methods, our platform can eliminate the tedious task of cleaning the pearl oysters due to fouling. Inspired by the low and high tide characteristics of the intertidal zone, our platform employs an air-water displacement mechanism to periodically float pearl oysters above the water’s surface, exposing fouling organisms to air and sunlight. While pearl oysters have developed the ability to stay alive during low tide, these fouling organisms cannot survive after prolonged exposure, thus preventing them from developing. 
Additionally, the platform provides an alternative approach to grow not only pearl oysters but also various types of shellfish, consequently benefiting the aquaculture industry. We introduce the design of the platform and provide a comprehensive analysis. We also demonstrate the practical deployment of the platform for cultivating pearl oysters." Embedded Active Stiffening Mechanisms to Modulate Kresling Tower Kinetostatic Properties,"John Berre, Lennart Rubbert, Francois Geiskopf, Pierre Renaud","INSA Strasbourg, University of Strasbourg, CNRS,INSA - Strasbourg,INSA de Strasbourg,ICube",Mechanism Design,"Non-rigidly foldable origamis are of great interest to build robotic components, as they are light, offer large deployability and can also be multistable. In this paper, we consider the Kresling tower, and propose an original way to actively modulate its kinetostatic properties. Actuated stiffening mechanisms are embedded on some folds of the origami. By adjusting the axial stiffness of the folds, modulation of the axial stiffness and the force required to switch between stable configurations are demonstrated. This adjustment can in addition be performed independently from the height of the stable configurations, which makes it simple to use. The interest of fold stiffening is outlined experimentally. Three actuation strategies are considered and implemented. Impact on Kresling tower properties are shown, with complementary performances of pneumatic, SMA-based and DC motor actuation." "A Compact, Two-Part Torsion Spring Architecture","Zachary Bons, Gray Thomas, Luke Mooney, Elliott Rouse","University of Michigan,Dephy, Inc.",Award Finalists 1,"Springs are essential mechanical elements that are used across a wide variety of industries and mechanisms. Common across many spring types and applications is the importance of compactness, low mass and customizability. In this paper, we present a novel rotary spring design that is lightweight, compact and customizable. 
In addition, we empirically validate the design by experimentally quantifying the performance of two test springs on a custom dynamometry testbed. Our two-part spring geometry is comprised of a central rotating gear-like cam shaft, and a disk that includes a circular array of radially-spaced tapered cantilevered beams. The two springs that we designed and tested matched desired performance specifications within 3-6%, confirming the efficacy of this unique design approach." "HREyes: Design, Development, and Evaluation of a Novel Method for AUVs to Communicate Information and Gaze Direction","Michael Fulton, Aditya Prabhu, Junaed Sattar","University of Minnesota,University of Minnesota, Twin Cities",Human-Robot Collaboration I,"We present the design, development, and evaluation of HREyes: biomimetic communication devices which use light to communicate information and, for the first time, gaze direction from AUVs to humans. First, we introduce two types of information displays using the HREye devices: active lucemes and ocular lucemes. Active lucemes communicate information explicitly through animations, while ocular lucemes communicate gaze direction implicitly by mimicking human eyes. We present a human study in which our system is compared to the use of an embedded digital display that explicitly communicates information to a diver by displaying text. Our results demonstrate accurate recognition of active lucemes for trained interactants, limited intuitive understanding of these lucemes for untrained interactants, and relatively accurate perception of gaze direction for all interactants. The results on active luceme recognition demonstrate more accurate recognition than previous light-based communication systems for AUVs (albeit with different phrase sets). Additionally, the ocular lucemes we introduce in this work represent the first method for communicating gaze direction from an AUV, a critical aspect of nonverbal communication used in collaborative work. 
With readily available hardware as well as open-source and easily re-configurable programming, HREyes can be easily integrated into any AUV with the physical space for the devices and used to communicate effectively with divers in any underwater environment with appropriate visibility." Dense Depth Completion Based on Multi-Scale Confidence and Self-Attention Mechanism for Intestinal Endoscopy,"Ruyu Liu, Zhengzhe Liu, Haoyu Zhang, Guodao Zhang, Zhigui Zuo, Weiguo Sheng","Hangzhou Normal University,Hangzhou Dianzi University,the First Affiliated Hospital of Wenzhou Medical University",Human-Robot Collaboration I,"Doctors perform limited one-way intestine endoscopy, in which advanced surgical robots with depth sensors, such as stereo and ToF endoscopes, can only provide sparse and incomplete depth information. However, dense, accurate and instant depth estimation during endoscopy is vital for doctors to judge the 3D location and shape of intestinal tissues, which affects the human-robot interaction between doctors and surgical robots, such as the operation on the subsequent moving of the probe. In this paper, we present a deep learning-based dense depth completion method for intestine endoscopy. We utilize the scattered depth information from depth sensors to make up for the deficiency of features in the intestine and design a multi-scale confidence prediction network to extract dense geometric depth features. Then, we introduce the structure awareness module based on the self-attention mechanism in the depth completion network to enhance the geometry and texture features of the intestine. We also present a virtual multi-modal RGBD intestine dataset and conduct comprehensive experiments on a total of three intestine datasets. The experimental results clearly demonstrate that our method achieves better results in all metrics in all intestinal environments compared to state-of-the-art methods." 
Design of an Energy-Aware Cartesian Impedance Controller for Collaborative Disassembly,"Sebastian Hjorth, Edoardo Lamon, Dimitrios Chrysostomou, Arash Ajoudani","Aalborg University,Istituto Italiano di Tecnologia",Human-Robot Collaboration I,"Human-robot collaborative disassembly is an emerging trend in the sustainable recycling process of electronic and mechanical products. It requires the use of advanced technologies to assist workers in repetitive physical tasks and deal with creaky and potentially damaged components. Nevertheless, when disassembling worn-out or damaged components, unexpected robot behaviors may emerge, so harmless and symbiotic physical interaction with humans and the environment becomes paramount. This work addresses this challenge at the control level by ensuring safe and passive behaviors in unplanned interactions and contact losses. The proposed algorithm capitalizes on an energy-aware Cartesian impedance controller, which features energy scaling and damping injection, and an augmented energy tank, which limits the power flow from the controller to the robot. The controller is evaluated in a real-world flawed unscrewing task with a Franka Emika Panda and is compared to a standard impedance controller and a hybrid force-impedance controller. The results demonstrate the high potential of the algorithm in human-robot collaborative disassembly tasks." Towards Robots That Influence Humans Over Long-Term Interaction,"Shahabedin Sagheb, Ye-ji Mun, Neema Ahmadian, Benjamin Christie, Andrea Bajcsy, Katherine Driggs-Campbell, Dylan Losey","Virginia Tech,University of Illinois at Urbana-Champaign,University of California Berkeley",Human-Robot Collaboration I,"When humans interact with robots influence is inevitable. Consider an autonomous car driving near a human: the speed and steering of the autonomous car will affect how the human drives. Prior works have developed frameworks that enable robots to influence humans towards desired behaviors. 
But while these approaches are effective in the short-term (i.e., the first few human-robot interactions), here we explore long-term influence (i.e., repeated interactions between the same human and robot). Our central insight is that humans are dynamic: people adapt to robots, and behaviors which are influential now may fall short once the human learns to anticipate the robot's actions. With this insight, we experimentally demonstrate that a prevalent game-theoretic formalism for generating influential robot behaviors becomes less effective over repeated interactions. Next, we propose three modifications to Stackelberg games that make the robot's policy both influential and unpredictable. We finally test these modifications across simulations and user studies: our results suggest that robots which purposely make their actions harder to anticipate are better able to maintain influence over long-term interaction." Carrying the Uncarriable: A Deformation-Agnostic and Human-Cooperative Framework for Unwieldy Objects Using Multiple Robots,"Doganay Sirintuna, Idil Özdamar, Arash Ajoudani","HRI, Lab., Istituto Italiano di Tecnologia. Dept. of Informatics,Istituto Italiano di Tecnologia",Human-Robot Collaboration I,"This manuscript introduces an object deformability-agnostic framework for co-carrying tasks that are shared between a person and multiple robots. Our approach allows the full control of the co-carrying trajectories by the person while sharing the load with multiple robots depending on the size and the weight of the object. This is achieved by merging the haptic information transferred through the object and the human motion information obtained from a motion capture system. One important advantage of the framework is that no strict internal communication is required between the robots, regardless of the object size and deformation characteristics. 
We validate the framework with two challenging real-world scenarios: co-transportation of a wooden rigid closet and a bulky box on top of forklift moving straps, with the latter characterizing deformable objects. In order to evaluate the generalizability of the proposed framework, a heterogeneous team of two mobile manipulators, each consisting of an omnidirectional mobile base and a collaborative robotic arm with different DoFs, is chosen for the experiments. The qualitative comparison between our controller and the baseline controller (i.e., an admittance controller) during these experiments demonstrated the effectiveness of the proposed framework especially when co-carrying deformable objects. Furthermore, we believe that the performance of our framework during the experiment with the lifting straps offers a promising solution for the co-transportation of bulky and ungraspable objects." A Control Approach for Human-Robot Ergonomic Payload Lifting,"Lorenzo Rapetti, Carlotta Sartore, Mohamed Elobaid, Yeshasvi Tirupachuri, Francesco Draicchio, Tomohiro Kawakami, Takahide Yoshiike, Daniele Pucci","IIT,Istituto Italiano di Tecnologia,Fondazione Istituto Italiano di Tecnologia,Italian Institute of Technology,INAIL, Department of Occupational & Environmental Medicine, Mont,Honda R&D Co., Ltd.,Honda Research Institute Japan",Award Finalists 3,"Collaborative robots can relieve human operators from excessive effort during payload lifting activities. Modelling the human partner allows the design of safe and efficient collaborative strategies. In this paper, we present a control approach for human-robot collaboration based on human monitoring through whole-body wearable sensors, and interaction modelling through coupled rigid-body dynamics. Moreover, a trajectory advancement strategy is proposed, allowing for online adaptation of the robot trajectory depending on the human motion. 
The resulting framework allows us to perform payload lifting tasks, taking into account the ergonomic requirements of the agents. Validation has been performed in an experimental scenario using the iCub3 humanoid robot and a human subject sensorized with the iFeel wearable system." Active Reward Learning from Online Preferences,"Vivek Myers, Erdem Bıyık, Dorsa Sadigh","UC Berkeley,Stanford University",Human-Robot Collaboration I,"Robot policies need to adapt to human preferences and/or new environments. Human experts may have the domain knowledge required to help robots achieve this adaptation. However, existing works often require costly offline re-training on human feedback, and this feedback usually needs to be frequent and is too complex for humans to provide reliably. To avoid placing undue burden on human experts and allow quick adaptation in critical real-world situations, we propose designing and sparingly presenting easy-to-answer pairwise action preference queries in an online fashion. Our approach designs queries and determines when to present them to maximize the expected value derived from the queries' information. We demonstrate our approach with experiments in simulation, human user studies, and real robot experiments. In these settings, our approach outperforms baseline techniques while presenting fewer queries to human experts." Supernumerary Robotic Limbs for Next Generation Space Suit Technology,"Erik Ballesteros, Brandon Man, Harry Asada","Massachusetts Institute of Technology,Cornell University,MIT",Human-Robot Collaboration I,"This paper discusses the incorporation of a pair of Supernumerary Robotic Limbs (SuperLimbs) onto the next generation of NASA space suits. The wearable robots attached to the space suit assist an astronaut in performing Extra-Vehicular Activities (EVAs). The SuperLimbs grab handrails fixed to the outside of a space vehicle to securely hold the astronaut body. 
The astronaut can use both hands for performing an EVA task, rather than using one hand for securing the body or operating a tether. The SuperLimbs can also assist an astronaut in repositioning the body and stabilizing it during an EVA mission. A control algorithm based on Admittance Control is developed for a) virtually reducing the inertial load of the entire body so that an astronaut can reposition his/her body with reduced effort, and b) bracing the body stably despite reaction forces and disturbances acting on the astronaut during an EVA operation. A full-scale prototype of Space Suit SuperLimbs was constructed and tested. Results from the experimentation indicated that with the aid of SuperLimbs, energy consumption during EVAs is reduced significantly." It Takes Two: Learning to Plan for Human-Robot Cooperative Carrying,"Eley Ng, Ziang Liu, Monroe Kennedy","Stanford University,University of Southern California",Human-Robot Collaboration I,"Cooperative table-carrying is a complex task due to the continuous nature of the action and state-spaces, multimodality of strategies, and the need for instantaneous adaptation to other agents. In this work, we present a method for predicting realistic motion plans for cooperative human-robot teams on the task. Using a Variational Recurrent Neural Network (VRNN) to model the variation in the trajectory of a human-robot team across time, we are able to capture the distribution over the team’s future states while leveraging information from interaction history. The key to our approach is leveraging human demonstration data to generate trajectories that synergize well with humans during test time in a receding horizon fashion. Comparison between a baseline, sampling-based planner RRT (Rapidly-exploring Random Trees) and the VRNN planner in centralized planning shows that the VRNN generates motion more similar to the distribution of human-human demonstrations than the RRT. 
Results in a human-in-the-loop user study show that the VRNN planner outperforms decentralized RRT on task-related metrics, and is significantly more likely to be perceived as human than the RRT planner. Finally, we demonstrate the VRNN planner on a real robot paired with a human teleoperating another robot." Collision Detection and Contact Point Estimation Using Virtual Joint Torque Sensing Applied to a Cobot,"Dario Zurlo, Tom Heitmann, Merlin Morlock, Alessandro De Luca","Sapienza Università di Roma,NEURA Robotics GmbH,Sapienza University of Rome",Human-Robot Collaboration I,"In physical human-robot interaction (pHRI) it is essential to reliably estimate and localize contact forces between the robot and the environment. In this paper, a complete contact detection, isolation, and reaction scheme is presented and tested on a new 6-dof industrial collaborative robot. We combine two popular methods, based on monitoring energy and generalized momentum, to detect and isolate collisions on the whole robot body in a more robust way. The experimental results show the effectiveness of our implementation on the LARA 5 cobot, which relies only on motor current and joint encoder measurements. For validation purposes, contact forces are also measured using an external GTE CoboSafe sensor. After a successful collision detection, the contact point location is isolated using a combination of the residual method based on the generalized momentum with a contact particle filter (CPF) scheme. We show for the first time a successful implementation of such a combination on a real robot, without relying on joint torque sensor measurements." The Human Gaze Helps Robots Run Bravely and Efficiently in Crowds,"Qianyi Zhang, Zhengxi Hu, Yinuo Song, Jiayi Pei, Jingtai Liu","Nankai University,NanKai University,NanKai University",Human-Robot Collaboration I,"In human-aware navigation, the robot tacitly games with humans, balancing safety and efficiency according to human intentions. 
Poor balance or bad intent recognition causes the robot to stop conservatively or advance rashly, resulting in a deadlock or even a collision respectively. To address the issue, this paper proposes an improved limit cycle for collaboratively parameterizing human intentions and planning robot motions. The human-robot interaction is modeled as a dynamic chicken game with incomplete information, where the human gaze is introduced to depict the unique characteristics of each person, allowing the robot to approach with different safety margins. Our method is tested in challenging indoor scenarios and outperforms traditional methods in both safety and efficiency. We enable robots to utilize human wisdom to solve problems that cannot be solved on their own. The robot bravely goes through oncoming crowds by getting closer to people with higher attention on it and has the foresight to stably cross in front or behind people." A Gaze-Speech System in Mixed Reality for Human-Robot Interaction,"John David Prieto Prada, Myung Ho Lee, Cheol Song",DGIST,Human-Robot Collaboration I,"Human-robot interaction (HRI) demands efficient time performance along the tasks. However, some interaction approaches may extend the time to complete such tasks. Thus, the time performance in HRI must be enhanced. This work presents an effective way to enhance the time performance in HRI tasks with a mixed reality (MR) method based on a gaze-speech system. In this paper, we design an MR world for pick-and-place tasks. The hardware system includes an MR headset, the Baxter robot, a table, and six cubes. In addition, the holographic MR scenario offers two modes of interaction: gesture mode (GM) and gaze-speech mode (GSM). The input actions during the GM and GSM methods are based on the pinch gesture and gaze with speech commands, respectively. The proposed GSM approach can improve the time performance in pick-and-place scenarios. The GSM system is 21.33 % faster than the traditional system, GM. 
Also, we evaluated the target-to-target time performance against a reference based on Fitts’ law. Our findings show a promising method for time reduction in HRI tasks through MR environments." ADAPT: Action-Aware Driving Caption Transformer,"Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu","Institute of Automation, Chinese Academy of Sciences,Xidian University,Institute of Automation,Chinese Academy of Sciences,Institute for AI Industry Research (AIR), Tsinghua University,Tsinghua University,Southern University of Science and Technology,Beihang University",Human-Robot Interaction,"End-to-end autonomous driving has great potential in the transportation industry. However, the lack of transparency and interpretability of the automatic decision-making process hinders its industrial adoption in practice. There have been some early attempts to use attention maps or cost volume for better model explainability which is difficult for ordinary passengers to understand. To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision making step of autonomous vehicular control and action. ADAPT jointly trains both the driving caption task and the vehicular control prediction task, through a shared video representation. Experiments on BDD-X (Berkeley DeepDrive eXplanation) dataset demonstrate state-of-the-art performance of the ADAPT framework on both automatic metrics and human evaluation. To illustrate the feasibility of the proposed framework in real-world applications, we build a novel deployable system that takes raw car videos as input and outputs the action narrations and reasoning in real time. The code, models and data are available at https://github.com/jxbbb/ADAPT." 
Aligning Human Preferences with Baseline Objectives in Reinforcement Learning,"Daniel Marta, Simon Holk, Christian Pek, Jana Tumova, Iolanda Leite",KTH Royal Institute of Technology,Human-Robot Interaction,"Practical implementations of deep reinforcement learning (deep RL) have been challenging due to a multitude of factors, such as designing reward functions that cover every possible interaction. To address the heavy burden of robot reward engineering, we aim to leverage subjective human preferences gathered in the context of human-robot interaction, while taking advantage of a baseline reward function when available. By considering baseline objectives to be designed beforehand, we are able to narrow down the policy space, solely requesting human attention when their input matters the most. To allow for control over the optimization of different objectives, our approach contemplates a multi-objective setting. We achieve human-compliant policies by sequentially training an optimal policy from a baseline specification and collecting queries on pairs of trajectories. These policies are obtained by training a reward estimator to generate Pareto optimal policies that include human preferred behaviours. Our approach ensures sample efficiency and we conducted a user study to collect real human preferences, which we utilized to obtain a policy on a social navigation environment." EWareNet: Emotion Aware Pedestrian Intent Prediction and Adaptive Spatial Profile Fusion for Social Robot Navigation,"Venkatraman Narayanan, Bala Murali Manoghar Sai Sudhakar, Rama Prashanth Ramasamy Vijayakumar, Aniket Bera","UMD,University of Maryland, College Park,University of Maryland,Purdue University",Human-Robot Interaction,"We present EWareNet, a novel intent and affect-aware social robot navigation algorithm among pedestrians. 
Our approach predicts the trajectory-based pedestrian intent from gait sequence, which is then used for intent-guided navigation taking into account social and proxemic constraints. We propose a transformer-based model that works on commodity RGB-D cameras mounted onto a moving robot. Our intent prediction routine is integrated into a mapless navigation scheme and makes no assumptions about the environment of pedestrian motion. Our navigation scheme consists of a novel obstacle profile representation methodology that is dynamically adjusted based on the pedestrian pose, intent, and affect. The navigation scheme is based on a reinforcement learning algorithm that takes pedestrian intent and robot's impact on pedestrian intent into consideration, in addition to the environmental configuration. We outperform current state-of-the-art algorithms for intent prediction from 3D gaits." SCAN: Socially-Aware Navigation Using Monte Carlo Tree Search,"Jeongwoo Oh, Jae Seok Heo, Junseo Lee, Gunmin Lee, Minjae Kang, Jeongho Park, Songhwai Oh","Seoul National University,Seoul National University (SNU),Seoul National University",Human-Robot Interaction,"Designing a socially-aware navigation method for crowded environments has become a critical issue in robotics. In order to perform navigation in a crowded environment without causing discomfort to nearby pedestrians, it is necessary to design a global planner that is able to consider both human-robot interaction (HRI) and prediction of future states. In this paper, we propose a socially-aware global planner called SCAN, which is a global planner that generates appropriate local goals considering HRI and prediction of future states. Our method simulates future states considering the effects of the robot's actions on the future intentions of pedestrians using Monte Carlo tree search (MCTS), which estimates the quality of local goals. 
For fast simulation, we execute pedestrian motion prediction using Y-net and future state simulation using MCTS in parallel. Neural networks are only used in Y-net and not in MCTS, which enables fast simulation and prediction of a long horizon of future states. We evaluate the proposed method based on the proposed socially-aware navigation metric using realistic pedestrian simulation and real-world experiments. The results show that the proposed method outperforms existing methods significantly, indicating the importance of considering human-robot interaction for socially-aware navigation." SGPT: The Secondary Path Guides the Primary Path in Transformers for HOI Detection,"Sixian Chan, Weixiang Wang, Zhanpeng Shao, Cong Bai","Zhejiang University of Technology,Hunan Normal University",Human-Robot Interaction,"HOI detection is essential for human-computer interaction, especially in behavior detection and robot manipulation. Existing mainstream transformer methods of HOI detection are focused on single-stream detection only, e.g., image → HOI(P1), or image → HO → I(P2). Both paths have their own characteristics of concern, so we propose a novel method in which the Secondary path (P2) Guides the Primary path (P1) in Transformers (SGPT). SGPT contains two core modules: the Dual-Path Consistency (DPC) module and the Instance Interaction Attention (IIA) module. DPC keeps human, object and interaction consistent on the dual-path and lets P2 guide P1 to learn more meaningful features. IIA fuses human and object to enhance interaction in P2, which allows instance to constrain interaction. The proposed dual paths are employed during training, and only the P1 path is used for inference. Hence, SGPT improves generalization without increasing model capacity on the HICO-DET and V-COCO datasets compared to the state of the art. The code of this work is available at https://github.com/visualVk/sgpt.git." 
Robot Person Following under Partial Occlusion,"Hanjing Ye, Jieting Zhao, Yaling Pan, Weinan Chen, Li He, Hong Zhang","Southern University of Science and Technology,Guangdong University of Technology,SUSTech",Human-Robot Interaction,"Robot person following (RPF) is a capability that supports many useful human-robot-interaction (HRI) applications. However, existing solutions to person following often assume full observation of the tracked person. As a consequence, they cannot track the person reliably under partial occlusion where the assumption of full observation is not satisfied. In this paper, we focus on the problem of robot person following under partial occlusion caused by a limited field of view of a monocular camera. Based on the key insight that it is possible to locate the target person when one or more of his/her joints are visible, we propose a method in which each visible joint contributes a location estimate of the followed person. Experiments on a public person-following dataset show that, even under partial occlusion, the proposed method can still locate the person more reliably than the existing SOTA methods. As well, the application of our method is demonstrated in real experiments on a mobile robot." A Little Bit Attention Is All You Need for Person Re-Identification,"Markus Eisenbach, Jannik Lübberstedt, Dustin Aganian, Horst-Michael Gross",Ilmenau University of Technology,Human-Robot Interaction,"Person re-identification plays a key role in applications where a mobile robot needs to track its users over a long period of time, even if they are partially unobserved for some time, in order to follow them or be available on demand. In this context, deep-learning-based real-time feature extraction on a mobile robot is often performed on special-purpose devices whose computational resources are shared for multiple tasks. Therefore, the inference speed has to be taken into account. 
In contrast, person re-identification is often improved by architectural changes that come at the cost of significantly slowing down inference. Attention blocks are one such example. We will show that some well-performing attention blocks used in the state of the art are subject to inference costs that are far too high to justify their use for mobile robotic applications. As a consequence, we propose an attention block that only slightly affects the inference speed while keeping up with much deeper networks or more complex attention blocks in terms of re-identification accuracy. We perform extensive neural architecture search to derive rules at which locations this attention block should be integrated into the architecture in order to achieve the best trade-off between speed and accuracy. Finally, we confirm that the best performing configuration on a re-identification benchmark also performs well on an indoor robotic dataset." Automatic Generation of Robot Facial Expressions with Preferences,"Bing Tang, Rongyun Cao, Rongya Chen, Bei Hua, Xiaoping Chen, Feng Wu","University of Science and Technology of China,Institute of Advanced Technology, University of Science and Tech,University of science and technology of China",Human-Robot Interaction,"The capability of humanoid robots to generate facial expressions is crucial for enhancing interactivity and emotional resonance in human-robot interaction. However, humanoid robots vary in mechanics, manufacturing, and appearance. The lack of consistent processing techniques and the complexity of generating facial expressions pose significant challenges in the field. To acquire solutions with high confidence, it is necessary to enable robots to explore the solution space automatically based on performance feedback. To this end, we designed a physical robot with a human-like appearance and developed a general framework for automatic expression generation using the MAP-Elites algorithm. 
The main advantage of our framework is that it does not only generate facial expressions automatically but can also be customized according to user preferences. The experimental results demonstrate that our framework can efficiently generate realistic facial expressions without hard coding or prior knowledge of the robot kinematics. Moreover, it can guide the solution-generation process in accordance with user preferences, which is desirable in many real-world applications." A Task Allocation Framework for Human Multi-Robot Collaborative Settings,"Martina Lippi, Paolo Augusto Di Lillo, Alessandro Marino","University of Roma Tre,University of Cassino and Southern Lazio",Human-Robot Interaction,"The requirements of modern production systems together with more advanced robotic technologies have fostered the integration of teams comprising humans and autonomous robots. While this integration has the potential to provide various benefits, it also raises questions about how to effectively manage these teams, taking into account the different characteristics of the agents involved. This paper presents a framework for task allocation in a human multi-robot collaborative scenario. The proposed solution combines an optimal offline allocation with an online reallocation strategy which accounts for inaccuracies of the offline plan and/or unforeseen events, human subjective preferences and cost of task switching. Experiments with two manipulators cooperating with a human operator in a box filling task are presented." TOP-JAM: A Bio-Inspired Topology-Based Model of Joint Attention for Human-Robot Interaction,"Hendry F. Chame, Aurélie Clodic, Alami Rachid","University of Lorraine / CNRS,LAAS - CNRS,CNRS",Human-Robot Interaction,"Coexisting with others and interacting in society implies sharing knowledge and attention about world objects, events, features, episodes, and even imagination or abstract ideas in time and space. 
Inspired by human phenomenological, cognitive and behavioral research, this work focuses on the study of joint attention (JA) for human-robot interaction (HRI), based on two main assumptions: a) the perception and representation of attention jointness constitute an isomorphic relation, and b) inspiration on dynamic neural fields (DNF) theory is a promising way to investigate contextual and non-linear spatio-temporal relations underlying attention and knowledge sharing in HRI. Taking into account the previous considerations, we propose a topology-based model for JA named TOP-JAM, which is able to represent and track in real-time JA states, from observations of behavioral data. More importantly, the model consists in a representation that can be directly understood by human beings, which conforms to robo-ethical principles in social robotics. This study evaluates computational properties of the model in simulation. Through a real experiment with the robot Pepper, the study shows that TOP-JAM is able to track JA in a triad interaction scenario." NOPA: Neurally-Guided Online Probabilistic Assistance for Building Socially Intelligent Home Assistants,"Xavier Puig, Tianmin Shu, Joshua Tenenbaum, Antonio Torralba","MIT,Massachusetts Institute of Technology",Human-Robot Interaction,"In this work, we study how to build socially intelligent robots to assist people in their homes. In particular, we focus on assistance with online goal inference, where robots must simultaneously infer humans' goals and how to help them achieve those goals. Prior assistance methods either lack the adaptivity to adjust helping strategies (i.e., when and how to help) in response to uncertainty about goals or the scalability to conduct fast inference in a large goal space. Our NOPA (Neurally-guided Online Probabilistic Assistance) method addresses both of these challenges. 
NOPA consists of (1) an online goal inference module combining neural goal proposals with inverse planning and particle filtering for robust inference under uncertainty, and (2) a helping planner that discovers valuable subgoals to help with and is aware of the uncertainty in goal inference. We compare NOPA against multiple baselines in a new embodied AI assistance challenge: Online Watch-And-Help, in which a helper agent needs to simultaneously watch a main agent's action, infer its goal, and help perform a common household task faster in realistic virtual home environments. Experiments show that our helper agent robustly updates its goal inference and adapts its helping plans to the changing level of uncertainty." Embodied Referring Expression for Manipulation Question Answering in Interactive Environment,"Qie Sima, Sinan Tan, Huaping Liu, Fuchun Sun",Tsinghua University,Human-Robot Interaction,"Embodied agents are expected to perform more complicated tasks in an interactive environment, with the progress of Embodied AI in recent years. Existing embodied tasks including Embodied Referring Expression (ERE) and other QA-form tasks mainly focus on interaction in terms of linguistic instruction. Therefore, enabling the agent to actively manipulate objects in the environment for exploration has become a challenging problem for the community. To solve this problem, we introduce a new embodied task: Remote Embodied Manipulation Question Answering (REMQA) to combine ERE with manipulation tasks. In the REMQA task, the agent needs to navigate to a remote position and perform manipulation with the target object to answer the question. We build a benchmark dataset for the REMQA task in the AI2-THOR simulator. To this end, a framework with 3D semantic reconstruction and modular network paradigms is proposed. The evaluation of the proposed framework on the REMQA dataset is presented to validate its effectiveness." 
Congestion Prediction for Large Fleets of Mobile Robots,"Ge Yu, Michael Wolf",Amazon,Multi-Robot Systems IV,"This paper introduces a deep learning (DL) approach to predicting congestion delays in large multi-robot systems. The problem is motivated by real-world problems in modern logistics automation, such as a warehouse with hundreds to thousands of coordinated mobile robots. Here, the large scale, the complexity of the control software, and the uncertainties of the robots' dynamics make direct (simulated) prediction of future robot states impractical. We propose predicting delays associated with future spatiotemporal locations, and we show this is useful for improving system performance via incorporating the predictions into path planning and travel time estimation. Our DL model uses convolutional long short-term memory (ConvLSTM) as the core structure, takes the historical congestion condition and planned paths as input, and generates the delays across all nodes in the spatial planning graph for a set of future time windows. When using predictions in a modified path planner, simulation experiments using production data show 4.4% average improvement in throughput performance versus without predictions." Decentralised Active Perception in Continuous Action Spaces for the Coordinated Escort Problem,"Rhett Hull, Ki Myung Brian Lee, Jennifer Wakulicz, Chanyeol Yoo, James Mcmahon, Bryan Clarke, Stuart Anstee, Jijoong Kim, Robert Fitch","University of Technology Sydney,University of Technology Sydney, Centre for Autonomous Systems,The Naval Research Laboratory,University of Sydney,Defence Science and Technology Group,Defence Science and Technology Organisation",Multi-Robot Systems IV,"We consider the coordinated escort problem, where a decentralised team of supporting robots implicitly assist the mission of higher-value principal robots. The defining challenge is how to evaluate the effect of supporting robots’ actions on the principal robots’ mission. 
To capture this effect, we define two novel auxiliary reward functions for supporting robots called satisfaction improvement and satisfaction entropy, which compute the improvement in probability of mission success, or the uncertainty thereof. Given these reward functions, we coordinate the entire team of principal and supporting robots using the decentralised cross entropy method (Dec-CEM), a new extension of CEM to multi-agent systems based on the product distribution approximation. In a simulated object avoidance scenario, our planning framework demonstrates up to two-fold improvement in task satisfaction against conventional decoupled information gathering. The significance of our results is to introduce a new family of algorithmic problems that will enable important new practical applications of heterogeneous multirobot systems." Socially Fair Coverage Control,"Matthew Malencia, George J. Pappas, Vijay Kumar",University of Pennsylvania,Multi-Robot Systems IV,"We investigate and develop algorithms for social fairness in coverage control problems. Existing coverage control methods are efficient, optimizing the average expected distance from any event to the nearest robot. However, in societal applications like disaster response or transportation, these conventional objectives lead to disparate coverage costs with respect to different groups within a population. We formulate social fairness for coverage control as the minimization of the maximum coverage cost among a set of groups within a population. Our approach uses Voronoi iteration to solve this novel problem by approximating the non-differentiable objective with the log-sum-exp and defining a gradient based controller that prioritizes fairness while also optimizing average performance when disparities between groups are low. 
We show convergence properties of this proposed control law and demonstrate the approach in simulations of randomly generated population densities as well as environments generated from U.S. census data on population rates and demographics. Our approach provides greater fairness than existing methods while maintaining similar computational time and convergence properties." Exploiting Trust for Resilient Hypothesis Testing with Malicious Robots,"Matthew Cavorsi, Orhan Akgun, Michal Yemini, Andrea Goldsmith, Stephanie Gil","Harvard University,Bar-Ilan University,Stanford University",Multi-Robot Systems IV,"We develop a resilient binary hypothesis testing framework for decision making in adversarial multi-robot crowdsensing tasks. This framework exploits stochastic trust observations between robots to arrive at tractable, resilient decision making at a centralized Fusion Center (FC) even when i) there exist malicious robots in the network and their number may be larger than the number of legitimate robots, and ii) the FC uses one-shot noisy measurements from all robots. We derive two algorithms to achieve this. The first is the Two Stage Approach (2SA) that estimates the legitimacy of robots based on received trust observations, and provably minimizes the probability of detection error in the worst-case malicious attack. Here, the proportion of malicious robots is known but arbitrary. For the case of an unknown proportion of malicious robots, we develop the Adversarial Generalized Likelihood Ratio Test (A-GLRT) that uses both the reported robot measurements and trust observations to estimate the trustworthiness of robots, their reporting strategy, and the correct hypothesis simultaneously. We exploit special problem structure to show that this approach remains computationally tractable despite several unknown problem parameters. 
We deploy both algorithms in a hardware experiment where a group of robots conducts crowdsensing of traffic conditions on a mock-up road network similar in spirit to Google Maps, subject to a Sybil attack. We extract the trust observations for each robot from actual communication signals which provide statistical information on the uniqueness of the sender. We show that even when the malicious robots are in the majority, the FC can reduce the probability of detection error to 30.5% and 29% for the 2SA and the A-GLRT respectively." Obscuring Objectives with Pareto-Optimal Privacy-Aware Trajectories in Multi-Robot Coverage,"Brennan Brodt, Alyssa Pierson",Boston University,Multi-Robot Systems IV,This paper proposes an algorithm for generating Pareto-optimal privacy-aware trajectories for multi-robot coverage. Our approach utilizes a genetic algorithm to generate a set of modified trajectories for a team of robots that wishes to obscure its goal from an observer. A novel velocity-constrained crossover algorithm ensures all child trajectories are feasible for a holonomic vehicle. The Pareto front of generated trajectories allows a team to select an allowable trade-off between privacy and coverage cost given within their task. Simulation results demonstrate the performance of our algorithm in Voronoi-based coverage control. We show our approach successfully obscures the objective from our proposed observer. Safe and Distributed Multi-Agent Motion Planning under Minimum Speed Constraints,"Inkyu Jang, Jungwon Park, H. Jin Kim",Seoul National University,Multi-Robot Systems IV,"The motion planning problem for multiple unstoppable agents is of interest in many robotics applications, for example, autonomous traffic management for multiple fixed-wing aircraft. Unfortunately, many of the existing algorithms cannot provide safety for such agents, because they require the agents to be able to brake to a complete stop for safety and feasibility insurance. 
In this paper, we present a distributed multi-agent motion planner that guarantees collision avoidance and persistent feasibility, which can be applied to a team of homogeneous mobile vehicles that cannot stop. The planner is built on top of the idea that a collision-free trajectory in the form of a loop can safely accommodate multiple unstoppable agents, while avoiding collisions among them and static obstacles. At every time step, in a distributed manner, the agents generate trajectory-manipulating actions that preserve the loop structure. Then, a deconfliction process selects a conflict-free subset of the generated actions, which are applied at the next time step. Through simulation using an unstoppable Dubins car model, we show that the proposed motion planner is able to provide persistent safety guarantees for such agents in obstacle-cluttered space in real-time." Minimally Constrained Multi-Robot Coordination with Line-Of-Sight Connectivity Maintenance,"Yupeng Yang, Yiwei Lyu, Wenhao Luo","University of North Carolina at Charlotte,Carnegie Mellon University",Multi-Robot Systems IV,"In this paper, we consider a team of mobile robots simultaneously executing multiple behaviors by different subgroups, while maintaining global and subgroup line-of-sight (LOS) network connectivity that minimally constrains the original multi-robot behaviors. The LOS connectivity between pairwise robots is preserved when two robots stay within the limited communication range and their LOS remains occlusion-free from static obstacles while moving. By using control barrier functions (CBF) and minimum volume enclosing ellipsoids (MVEE), we first introduce the LOS connectivity barrier certificate (LOS-CBC) to characterize the state-dependent admissible control space for pairwise robots, from which their resulting motion will keep the two robots LOS connected over time. 
We then propose the Minimum Line-of-Sight Connectivity Constraint Spanning Tree (MLCCST) as a step-wise bilevel optimization framework to jointly optimize (a) the minimum set of LOS edges to actively maintain, and (b) the control revision with respect to a nominal multi-robot controller due to LOS connectivity maintenance. As proved in the theoretical analysis, this allows the robots to improvise the optimal composition of LOS-CBC control constraints that are least constraining around the nominal controllers, and at the same time enforce the global and subgroup LOS connectivity through the resulting preserved set of pairwise LOS edges. The framework thus leads to robots staying as close as possible to their nominal behaviors, while exhibiting dynamically changing LOS-connected network topology that provides the greatest flexibility for the existing multi-robot tasks in real-time. We demonstrate the effectiveness of our approach through simulations with up to 64 robots." Relay Pursuit for Multirobot Target Tracking on Tile Graphs,"Shashwata Mandal, Sourabh Bhattacharya",Iowa State University,Multi-Robot Systems IV,"In this work, we address a visibility-based target tracking problem in a polygonal environment in which a group of mobile observers try to maintain a line-of-sight with a mobile intruder. We build a bridge between data mining and visibility-based tracking using a novel tiling scheme for the polygon. First, we propose a tracking strategy for a team of guards located on the tiles to dynamically track an intruder when complete coverage of the polygon cannot be ensured. Next, we propose a novel variant of the Voronoi Diagram to construct navigation strategies for a team of co-located guards to track an intruder from any initial position in the environment. We present empirical analysis to illustrate the efficacy of the proposed tiling scheme. Simulations and testbed demonstrations are presented in a video attachment." 
Passivity-Based Decentralized Control for Collaborative Grasping of Under-Actuated Aerial Manipulators,"Jinyeong Jeong, Min Jun Kim","Korea Advanced Institute of Science and Technology,KAIST",Multi-Robot Systems IV,"This paper proposes a decentralized passive impedance control scheme for collaborative grasping using under-actuated aerial manipulators (AMs). The AM system is formulated, using a proper coordinate transformation, as an inertially decoupled dynamics with which a passivity-based control design is conducted. Since the interaction for grasping can be interpreted as a feedback interconnection of passive systems, an arbitrary number of AMs can be modularly combined, leading to a decentralized control scheme. Another interesting consequence of the passivity property is that the AMs automatically converge to a certain configuration to accomplish the grasping. Collaborative grasping using 10 AMs is presented in simulation." Distributed Barrier Function-Enabled Human-In-The-Loop Control for Multi-Robot Systems,"Victor Nan Fernandez-Ayala, Xiao Tan, Dimos V. Dimarogonas","KTH Royal Institute of Technology,KTH Royal Institute of Technology, Sweden",Multi-Robot Systems IV,"In this work, we propose a distributed control scheme for multi-robot systems in the presence of multiple constraints using control barrier functions. The proposed scheme expands previous work where only one single constraint can be handled. Here we show how to transform multiple constraints to a collective one using a smoothly approximated minimum function. Additionally, human-in-the-loop control is also incorporated seamlessly to our control design, both through the nominal control in the optimization objective as well as a safety condition in the constraints. Possible failure regions are identified and a suitable fix is proposed. 
Two types of human-in-the-loop scenarios are tested on real multi-robot systems with multiple constraints, including collision avoidance, connectivity maintenance, and arena range limits." LEMURS: Learning Distributed Multi-Robot Interactions,"Eduardo Sebastián, Thai Duong, Nikolay A. Atanasov, Eduardo Montijano, Carlos Sagues","University of Zaragoza,University of California, San Diego,Universidad de Zaragoza",Multi-Robot Systems IV,"This paper presents LEMURS, an algorithm for learning scalable multi-robot control policies from cooperative task demonstrations. We propose a port-Hamiltonian description of the multi-robot system to exploit universal physical constraints in interconnected systems and achieve closed-loop stability. We represent a multi-robot control policy using an architecture that combines self-attention mechanisms and neural ordinary differential equations. The former handles time-varying communication in the robot team, while the latter respects the continuous-time robot dynamics. Our representation is distributed by construction, enabling the learned control policies to be deployed in robot teams of different sizes. We demonstrate that LEMURS can learn interactions and cooperative behaviors from demonstrations of multi-agent navigation and flocking tasks." Multi-Agent Active Search Using Detection and Location Uncertainty,"Arundhati Banerjee, Ramina Ghods, Jeff Schneider",Carnegie Mellon University,Multi-Robot Systems IV,"Active search, in applications like environment monitoring or disaster response missions, involves autonomous agents detecting targets in a search space using decision making algorithms that adapt to the history of their observations. Active search algorithms must contend with two types of uncertainty: detection uncertainty and location uncertainty. The more common approach in robotics is to focus on location uncertainty and remove detection uncertainty by thresholding the detection probability to zero or one. 
In contrast, it is common in the sparse signal processing literature to assume the target location is accurate and instead focus on the uncertainty of its detection. In this work, we first propose an inference method to jointly handle both target detection and location uncertainty. We then build a decision making algorithm on this inference method that uses Thompson sampling to enable decentralized multi-agent active search. We perform simulation experiments to show that our algorithms outperform competing baselines that only account for either target detection or location uncertainty. We finally demonstrate the real world transferability of our algorithms using a realistic simulation environment we created on the Unreal Engine 4 platform with an AirSim plugin." HMAAC: Hierarchical Multi-Agent Actor-Critic for Aerial Search with Explicit Coordination Modeling,"Chuanneng Sun, Songjun Huang, Dario Pompili",Rutgers University,"Search, Rescue, and Hazardous Field Robotics","Unmanned Aerial Vehicles (UAVs) have become prevalent in Search And Rescue (SAR) missions. However, existing solutions to the control and coordination of UAVs are mostly limited to specific environments and are not robust to handle unreliable communications. To deal with these challenges, the Hierarchical Multi-Agent Actor-Critic (HMAAC) framework is proposed, where a high-level policy is placed on top of individual low-level actor-critic policies to relax the inter-dependency among the agents. The low-level policies are considered conditionally independent given the coordination action, which is generated by the high-level policy. We do not consider Centralized Training Decentralized Execution (CTDE) because, practically, we cannot assume that communication is always perfect during training and the whole system can benefit from communication during deployment. 
The proposed framework is evaluated in AirSim, a realistic multi-UAV simulator, and is compared against existing models, such as Multi-Agent Actor Critic (MAAC), in two scenarios: (a) an unstable communication scenario where packet drop is modeled as a Bernoulli process; and (b) a shadow zone scenario where shadow zones are created in the search space and communication will be lost if the agents are in these zones. The results show that HMAAC is more scalable and robust to unreliable communication as it outperforms the other two models in terms of exploration and coordination when the number of agents is large and the communication condition is bad." GUTS: Generalized Uncertainty-Aware Thompson Sampling for Multi-Agent Active Search,"Nikhil Angad Bakshi, Tejus Gupta, Ramina Ghods, Jeff Schneider",Carnegie Mellon University,Award Finalists 3,"Robotic solutions for quick disaster response are essential to ensure minimal loss of life, especially when the search area is too dangerous or too vast for human rescuers. We model this problem as an asynchronous multi-agent active-search task where each robot aims to efficiently seek objects of interest (OOIs) in an unknown environment. This formulation addresses the requirement that search missions should focus on quick recovery of OOIs rather than full coverage of the search region. Previous approaches fail to accurately model sensing uncertainty, account for occlusions due to foliage or terrain, or consider the requirement for heterogeneous search teams and robustness to hardware and communication failures. We present the Generalized Uncertainty-aware Thompson Sampling (GUTS) algorithm, which addresses these issues and is suitable for deployment on heterogeneous multi-robot systems for active search in large unstructured environments. We show through simulation experiments that GUTS consistently outperforms existing methods such as parallelized Thompson Sampling and exhaustive search, recovering all OOIs in 80% of all runs. 
In contrast, existing approaches recover all OOIs in less than 40% of all runs. We conduct field tests using our multi-robot system in an unstructured environment with a search area of ~75,000 sq.m. Our system demonstrates robustness to various failure modes, achieving full recovery of OOIs (where feasible) in every field run, and significantly outperforming our baseline." CLIO: A Novel Robotic Solution for Exploration and Rescue Missions in Hostile Mountain Environments,"Michele Focchi, Mohamed Bensaadallah, Marco Frego, Angelika Peer, Daniele Fontanelli, Andrea Del Prete, Luigi Palopoli","Università di Trento,University of Batna ,,Free University of Bolzano,University of Trento","Search, Rescue, and Hazardous Field Robotics","Rescue missions in mountain environments are hardly achievable by standard legged robots—because of the high slopes—or by flying robots—because of limited payload capacity. We present a concept for a rope-aided climbing robot which can negotiate up-to-vertical slopes and carry heavy payloads. The robot is attached to the mountain through a rope, and it is equipped with a leg to push against the mountain and initiate jumping maneuvers. Between jumps, a hoist is used to wind/unwind the rope to move vertically and affect the lateral motion. This simple (yet effective) two-fold actuation allows the system to achieve high safety and energy efficiency. Indeed, the rope prevents the robot from falling while compensating for most of its weight, drastically reducing the effort required by the leg actuator. We also present an optimal control strategy to generate point-to-point trajectories overcoming an obstacle. 
We achieve fast computation time (" Towards Efficient Gas Leak Detection in Built Environments: Data-Driven Plume Modeling for Gas Sensing Robots,"Wanting Jin, Faezeh Rahbar, Chiara Ercolani, Alcherio Martinoli",EPFL,"Search, Rescue, and Hazardous Field Robotics","The deployment of robots for Gas Source Localization (GSL) tasks in hazardous scenarios significantly reduces the risk to humans and animals. Gas sensing using mobile robots focuses primarily on simplified scenarios, due to the complexity of gas dispersion, with a current trend towards tackling more complex environments. However, most state-of-art GSL algorithms for environments with obstacles only depend on local information, leading to low efficiency in large and more structured spaces. The efficiency of GSL can be improved dramatically by coupling it with a global knowledge of gas distribution in the environment. However, since gas dispersion in a built environment is difficult to model analytically, most previous work incorporating a gas dispersion model was tested under simplified assumptions, which do not take into consideration the impact of the presence of obstacles to the airflow and gas plume. In this paper, we propose a probabilistic algorithm that enables a robot to efficiently localize gas sources in built environments, by combining a state-of-the-art probabilistic GSL algorithm, Source Term Estimation (STE) with a learned plume model. The pipeline of generating gas dispersion datasets from realistic simulations, the training and validation of the model, as well as the integration of the learned model with the STE framework are presented. The performance of the algorithm is validated both in high-fidelity simulations and real experiments, with promising results obtained under various obstacle configurations." 
Image-To-Image Translation for Autonomous Driving from Coarsely-Aligned Image Pairs,"Youya Xia, Josephine Monica, Wei-Lun Chao, Bharath Hariharan, Kilian Weinberger, Mark Campbell",Cornell University,Self-Driving Cars II,"A self-driving car must be able to reliably handle adverse weather conditions (e.g., snowy) to operate safely. In this paper, we investigate the idea of turning sensor inputs (e.g., images) captured in an adverse condition into a benign one (e.g., sunny), upon which the downstream tasks (e.g., semantic segmentation) can attain high accuracy. Prior work primarily formulates this as an unpaired image-to-image translation problem due to the lack of paired images captured under the exact same camera poses and semantic layouts. While perfectly-aligned images are not available, one can easily obtain coarsely-paired images. For instance, many people drive the same routes daily in both good and adverse weather; thus, images captured at close-by GPS locations can form a pair. Though data from repeated traversals are unlikely to capture the same foreground objects, we posit that they provide rich contextual information to supervise the image translation model. To this end, we propose a novel training objective leveraging coarsely-aligned image pairs. We show that our coarsely-aligned training scheme leads to a better image translation quality and improved downstream tasks, such as semantic segmentation, monocular depth estimation, and visual localization." Small-Shot Multi-Modal Distillation for Vision-Based Autonomous Steering,"Yu Shen, Luyu Yang, Xijun Wang, Ming C. Lin","University of Maryland,University of Maryland, College Park,University of Maryland at College Park",Self-Driving Cars II,"In this paper, we propose a novel learning framework for autonomous systems that uses a small amount of ``auxiliary information'' that complements the learning of the main modality, called ``small-shot auxiliary modality distillation network (AMD-S-Net)''. 
The AMD-S-Net contains a two-stream framework design that can fully extract information from different types of data (i.e., paired/unpaired multi-modality data) to distill knowledge more effectively. We also propose a novel training paradigm based on the ``reset operation'' that enables the teacher to explore the local loss landscape near the student domain iteratively, providing local landscape information and potential directions to discover better solutions by the student, thus achieving higher learning performance. Our experiments show that AMD-S-Net and our training paradigm outperform other SOTA methods by up to 12.7% and 18.1% improvement in autonomous steering, respectively." SceneCalib: Automatic Targetless Calibration of Cameras and Lidars in Autonomous Driving,"Ayon Sen, Gang Pan, Anton Mitrokhin, Ashraful Islam",NVIDIA Corporation,Self-Driving Cars II,"Accurate camera-to-lidar calibration is a requirement for sensor data fusion in many 3D perception tasks. In this paper, we present SceneCalib, a novel method for simultaneous self-calibration of extrinsic and intrinsic parameters in a system containing multiple cameras and a lidar sensor. Existing methods typically require specially designed calibration targets and human operators, or they only attempt to solve for a subset of calibration parameters. We resolve these issues with a fully automatic method that requires no explicit correspondences between camera images and lidar point clouds, allowing for robustness to many outdoor environments. Furthermore, the full system is jointly calibrated with explicit cross-camera constraints to ensure that camera-to-camera and camera-to-lidar extrinsic parameters are consistent." 
Unsupervised Road Anomaly Detection with Language Anchors,"Beiwen Tian, Mingdao Liu, Huan-ang Gao, Pengfei Li, Hao Zhao, Guyue Zhou","Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University",Self-Driving Cars II,"Road anomaly detection is critical to safe autonomous driving, because current road scene understanding models are usually trained in a closed-set manner and fail to identify unknown objects. What's worse, it is difficult, if not impossible, to collect a large-scale dataset with anomaly annotations. So this paper studies unsupervised anomaly detection which finds out anomaly regions using scene parsing logits solely. While former methods depend on the weights learned from the closed training set as anchors for logit generation, we resort to language anchors that are learned from enormous paired vision and language data. Thanks to rich open-set semantic information contained in these language anchors, our method performs better than former unsupervised counterparts while maintaining the advantage of training without accessing any out-of-distribution data. We delve into this new paradigm and identify the superiority of using pair-wise binary logits, which we credit to a better understanding of the negation language anchor. Last but not least, we find that the former top-1 selection of semantic labels for uncertainty measurement is problematic in many cases and a new blended standardization strategy brings clear improvements to our solution. We report state-of-the-art performance on FS LostAndFound, LostAndFound and RoadAnomaly datasets among comparable methods. 
The codes are publicly available at https://github.com/TB5z035/URAD-LA.git" Expanding the Deployment Envelope of Behavior Prediction Via Adaptive Meta-Learning,"Boris Ivanovic, James Harrison, Marco Pavone","NVIDIA,Stanford University",Self-Driving Cars II,"Learning-based behavior prediction methods are increasingly being deployed in real-world autonomous systems, e.g., in fleets of self-driving vehicles, which are beginning to commercially operate in major cities across the world. Despite their advancements, however, the vast majority of prediction systems are specialized to a set of well-explored geographic regions or operational design domains, complicating deployment to additional cities, countries, or continents. Towards this end, we present a novel method for efficiently adapting behavior prediction models to new environments. Our approach leverages recent advances in meta-learning, specifically Bayesian regression, to augment existing behavior prediction models with an adaptive layer that enables efficient domain transfer via offline fine-tuning, online adaptation, or both. Experiments across multiple real-world datasets demonstrate that our method can efficiently adapt to a variety of unseen environments." Interaction-Aware Trajectory Planning for Autonomous Vehicles with Analytic Integration of Neural Networks into Model Predictive Control,"Piyush Gupta, David Isele, Donggun Lee, Sangjae Bae","Michigan State University,University of Pennsylvania, Honda Research Institute USA,UC Berkeley,Honda Research Institute, USA",Self-Driving Cars II,"Autonomous vehicles (AVs) must share the driving space with other drivers and often employ conservative motion planning strategies to ensure safety. These conservative strategies can negatively impact AV's performance and significantly slow traffic throughput. 
Therefore, to avoid conservatism, we design an interaction-aware motion planner for the ego vehicle (AV) that interacts with surrounding vehicles to perform complex maneuvers in a locally optimal manner. Our planner uses a neural network-based interactive trajectory predictor and analytically integrates it with model predictive control (MPC). We solve the MPC optimization using the alternating direction method of multipliers (ADMM) and prove the algorithm's convergence. We provide an empirical study and compare our method with a baseline heuristic method." GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting,"Alexander Cui, Sergio Casas Romero, Kelvin Wong, Shun Da Suo, Raquel Urtasun","University of Toronto, Waabi,University of Toronto",Award Finalists 3,"The task of motion forecasting is critical for self-driving vehicles (SDVs) to be able to plan a safe maneuver. Towards this goal, modern approaches reason about the map, the agents' past trajectories and their interactions in order to produce accurate forecasts. The predominant approach has been to encode the map and other agents in the reference frame of each target agent. However, this approach is computationally expensive for multi-agent prediction as inference needs to be run for each agent. To tackle the scaling challenge, the solution thus far has been to encode all agents and the map in a shared coordinate frame (e.g., the SDV frame). However, this is sample inefficient and vulnerable to domain shift (e.g., when the SDV visits uncommon states). In contrast, in this paper, we propose an efficient shared encoding for all agents and the map without sacrificing accuracy or generalization. Towards this goal, we leverage pair-wise relative positional encodings to represent geometric relationships between the agents and the map elements in a heterogeneous spatial graph. 
This parameterization allows us to be invariant to scene viewpoint, and save online computation by re-using map embeddings computed offline. Our decoder is also viewpoint agnostic, predicting agent goals on the lane graph to enable diverse and context-aware multimodal prediction. We demonstrate the effectiveness of our approach on the urban Argoverse 2 benchmark as well as a novel highway dataset." RGB-Event Fusion for Moving Object Detection in Autonomous Driving,"Zhuyun Zhou, Zongwei Wu, Rémi Boutteau, Fan Yang, Cédric Demonceaux, Dominique Ginhac","University of Burgundy (Université de Bourgogne), France,Université de Bourgogne, France,Université de Rouen Normandie,Univ. Bourgogne Franche-Comté,Université Bourgogne Franche-Comté,Univ Burgundy",Self-Driving Cars II,"Moving Object Detection (MOD) is a critical vision task for successfully achieving safe autonomous driving. Despite plausible results of deep learning methods, most existing approaches are only frame-based and may fail to reach reasonable performance when dealing with dynamic traffic participants. Recent advances in sensor technologies, especially the Event camera, can naturally complement the conventional camera approach to better model moving objects. However, event-based works often adopt a pre-defined time window for event representation, and simply integrate it to estimate image intensities from events, neglecting much of the rich temporal information from the available asynchronous events. Therefore, from a new perspective, we propose RENet, a novel RGB-Event fusion Network, that jointly exploits the two complementary modalities to achieve more robust MOD under challenging scenarios for autonomous driving. Specifically, we first design a temporal multi-scale aggregation module to fully leverage event frames from both the RGB exposure time and larger intervals. Then we introduce a bi-directional fusion module to attentively calibrate and fuse multi-modal features. 
To evaluate the performance of our network, we carefully select and annotate a sub-MOD dataset from the commonly used DSEC dataset. Extensive experiments demonstrate that our proposed method performs significantly better than the state-of-the-art RGB-Event fusion alternatives. The source code and dataset are publicly available at: https://github.com/ZZY-Zhou/RENet." Self-Entanglement-Free Tethered Path Planning for Non-Particle Differential-Driven Robot,"Tong Yang, Jiangpin Liu, Yue Wang, Rong Xiong",Zhejiang University,Motion and Path Planning IV,"A novel mechanism to derive self-entanglement-free paths for tethered robots is proposed in this work. The problem is tailored to the applications of tethered robots without an omni-directional tether retractor mounted. This scenario is often encountered if an omni-directional tether retracting mechanism cannot be jointly equipped with other geometrically complicated devices (e.g. a manipulator), for instance in disaster recovery, spatial exploration, etc., or when the robot is manufactured in emergencies. Without a special consideration on the spatial relation between the pose of the mobile base and the tether state, self-entanglement appears when the robot moves, resulting in unsafe motion of the robot and potential damage to the tether. In this paper, the self-entanglement-free constraint is modelled by the admissible orientation of tether anchoring on the robot with respect to the robot's heading orientation. A searching-based path planning algorithm is then proposed to generate a near optimal path solution that is guaranteed to be free of tether self-entanglement. The effectiveness of the proposed algorithm is compared with motions without considering the self-entanglement-free constraint, illustrated in challenging planning cases, and validated in real-world scenes. An open-source implementation has also been provided for the benefit of the robotics community." 
Operating with Inaccurate Models by Integrating Control-Level Discrepancy Information into Planning,"Ellis Ratner, Claire Tomlin, Maxim Likhachev","University of California, Berkeley,UC Berkeley,Carnegie Mellon University",Motion and Path Planning IV,"Typical robotic systems rely on models for planning. Therefore, the quality of the robot’s behavior is heavily dependent on how accurately the model can predict the outcome of the robot’s actions in the environment. A challenge, however, is that no model is perfect; moreover, we often do not know where discrepancies between the model’s prediction and the actual outcome occur prior to observing executions in the real-world. One way to address this is to bias the planner away from these discrepancies by inflating the cost of states and actions where we previously observed the model to be inaccurate. Making such decisions about where and how to bias purely at the planning-level, however, neglects valuable information from the control-level, which gives a more fine-grained understanding of where and how the model went wrong during execution. Based on this observation, our key idea is to first infer a statistical model over discrepancies in the control-level’s model. Then, we translate this model to the planning-level, where we use it to more informatively bias the planner away from states and actions where the model’s predicted outcome is likely to be inaccurate. We demonstrate that our framework enables a robot to complete tasks, despite an inaccurate planning model, with greater efficiency than existing approaches. We do so through an experimental evaluation in simulation and real-robot experiments on NASA’s Astrobee free-flyer." Approximation Algorithms for Robot Tours in Random Fields with Guaranteed Estimation Accuracy,"Shamak Dutta, Nils Wilde, Pratap Tokekar, Stephen L. 
Smith","University of Waterloo,TU Delft,University of Maryland",Motion and Path Planning IV,"We study the sample placement and shortest tour problem for robots tasked with mapping environmental phenomena modeled as stationary random fields. The objective is to minimize the resources used (samples or tour length) while guaranteeing estimation accuracy. We give approximation algorithms for both problems in convex environments. These improve previously known results, both in terms of theoretical guarantees and in simulations. In addition, we disprove an existing claim in the literature on a lower bound for a solution to the sample placement problem." Real-Time Fast Marching Tree for Mobile Robot Motion Planning in Dynamic Environments,"Jefferson Silveira, Kleber Cabral, Sidney Givigi, Joshua Marshall","Queen's University,Royal Military College of Canada",Motion and Path Planning IV,"This paper proposes the Real-Time Fast Marching Tree (RT-FMT), a real-time planning algorithm that features local and global path generation, multiple-query planning, and dynamic obstacle avoidance. During the search, RT-FMT quickly looks for the global solution and, in the meantime, generates local paths that can be used by the robot to start execution faster. In addition, our algorithm constantly rewires the tree to keep branches from forming inside the dynamic obstacles and to maintain the tree root near the robot, which allows the tree to be reused multiple times for different goals. Our algorithm is based on the planners Fast Marching Tree (FMT*) and Real-time Rapidly-Exploring Random Tree (RT-RRT*). We show via simulations that RT-FMT outperforms RT-RRT* in both execution cost and arrival time, in most cases. Moreover, we also demonstrate via simulation that it is worthwhile taking the local path before the global path is available in order to reduce arrival time, even though there is a small possibility of taking an inferior path." 
Efficient Optimal Planning in Non-FIFO Time-Dependent Flow Fields,"Ju Heon Lee, Chanyeol Yoo, Stuart Anstee, Robert Fitch","University of Technology Sydney,Defence Science and Technology Group",Motion and Path Planning IV,"We propose an algorithm for solving the time-dependent shortest path problem in flow fields where the FIFO (first-in-first-out) assumption is violated. This problem variant is important for autonomous vehicles in the ocean, for example, that cannot arbitrarily hover in a fixed position and that are strongly influenced by time-varying ocean currents. Although polynomial-time solutions are available for discrete-time problems, the continuous-time non-FIFO case is NP-hard with no known relevant special cases. Our main result is to show that this problem can be solved in polynomial time if the edge travel time functions are piecewise-constant, agreeing with existing worst-case bounds for FIFO problems with restricted slopes. We present a minimum-time algorithm for graphs that allows for paths with finite-length cycles, and then embed this algorithm within an asymptotically optimal sampling-based framework to find time-optimal paths in flows. The algorithm relies on an efficient data structure to represent and manipulate piecewise-constant functions and is straightforward to implement. We illustrate the behaviour of the algorithm in an example based on a common ocean vortex model." Human-Guided Planning for Complex Manipulation Tasks Using the Screw Geometry of Motion,"Dasharadhan Mahalingam, Nilanjan Chakraborty",Stony Brook University,Motion and Path Planning IV,"In this paper, we present a novel method of motion planning for performing complex manipulation tasks by using human demonstration and exploiting the screw geometry of motion. We consider complex manipulation tasks where there are constraints on the motion of the end effector of the robot. 
Examples of such tasks include opening a door, opening a drawer, transferring granular material from one container to another with a spoon, and loading dishes to a dishwasher. Our approach consists of two steps: First, using the fact that a motion in the task space of the robot can be approximated by using a sequence of constant screw motions, we segment a human demonstration into a sequence of constant screw motions. Second, we use the segmented screws to generate motion plans via screw-linear interpolation for other instances of the same task. The use of screw segmentation allows us to capture the invariants of the demonstrations in a coordinate-free fashion, thus allowing us to plan for different task instances from just one example. We present extensive experimental results on a variety of manipulation scenarios showing that our method can be used across a wide range of manipulation tasks." Towards Efficient Trajectory Generation for Ground Robots Beyond 2D Environment,"Jingping Wang, Long Xu, Haoran Fu, Chao Xu, Yanjun Cao, Ximin Lyu, Fei Gao","Zhejiang university,Zhejiang University,Sun Yat-sen University,Zhejiang University, Huzhou Institute of Zhejiang University,Sun Yat-Sen University",Motion and Path Planning IV,"With the development of robotics, ground robots are no longer limited to planar motion. Passive height variation due to complex terrain and active height control provided by special structures on robots require a more general navigation planning framework beyond 2D. Existing methods rarely consider both simultaneously, limiting the capabilities and applications of ground robots. In this paper, we propose an optimization-based planning framework for ground robots considering both active and passive height changes on the z-axis.
The proposed planner first constructs a penalty field for chassis motion constraints defined in R3 such that the optimal solution space of the trajectory is continuous, resulting in a high-quality smooth chassis trajectory. Also, by constructing custom constraints in the z-axis direction, it is possible to plan trajectories for different types of ground robots which have z-axis degree of freedom. We performed simulations and real-world experiments to verify the efficiency and trajectory quality of our algorithm." Concentration of Measure Phenomenon and Its Implications for Sample-Based Planning Algorithms in Very-High Dimensional Configuration Spaces,Joel Esposito,US Naval Academy,Motion and Path Planning IV,"In very high-dimensional (≫ 10D) spaces, a collection of points generated uniformly at random will concentrate very tightly about its expected value – defying intuition developed in low-dimensional spaces. This paper explores the implications of this for two major classes of sample-based robot motion planning algorithms: Rapidly Exploring Random Trees (RRTs) and Probabilistic Road Maps (PRMs). First we show that the graph vertices concentrate in a thin-shelled hypersphere, with almost none near the origin nor at the edges of the workspace. Next we examine how varying one of the algorithms’ parameters – the maximum edge length – can dramatically alter the algorithms’ complexity and the connectivity of the resulting graph. Finally, we explore how the position of the initial node, often placed arbitrarily, can impact the shape of the graph. While the contributions of this paper are largely theoretical, many robotic applications of practical interest have extremely high-dimensional configuration spaces including humanoids, swarms and soft (a.k.a. continuum) robotics."
Safeguarding Learning-Based Planners under Motion and Sensing Uncertainties Using Reachability Analysis,"Akshay Shetty, Adam Dai, Alexandros Tzikas, Grace Gao",Stanford University,Planning under Uncertainty II,"Learning-based trajectory planners in robotics have attracted growing interest given their ability to plan for complex tasks. These planners are typically trained in simulation under nominal conditions before being implemented on real robots. However, in real settings, the presence of motion and sensing uncertainties causes the robot to deviate from planned reference trajectories potentially leading to unsafe outcomes such as collisions. In this paper we present a reachability analysis to predict such deviations and to evaluate robot safety along reference trajectories. We then use the reachability analysis to safeguard a learning-based planner. Finally, we demonstrate the applicability of our safeguarding algorithm for learning-based planners via multiple simulations and real robot experiments." Risk-Aware Spatio-Temporal Logic Planning in Gaussian Belief Spaces,"Matti Vahs, Christian Pek, Jana Tumova","KTH Royal Institute of Technology, Stockholm,KTH Royal Institute of Technology",Planning under Uncertainty II,"In many real-world robotic scenarios, we cannot assume exact knowledge about a robot's state due to unmodeled dynamics or noisy sensors. Planning in belief space addresses this problem by tightly coupling perception and planning modules to obtain trajectories that take into account the environment's stochasticity. However, existing works are often limited to tasks such as the classic reach-avoid problem and do not provide risk awareness. We propose a risk-aware planning strategy in belief space that minimizes the risk of violating a given specification and enables a robot to actively gather information about its state. 
We use Risk Signal Temporal Logic (RiSTL) as a specification language in belief space to express complex spatio-temporal missions including predicates over Gaussian beliefs. We synthesize trajectories for challenging scenarios that cannot be expressed through classical reach-avoid properties and show that risk-aware objectives improve the uncertainty reduction in a robot's belief." Density Planner: Minimizing Collision Risk in Motion Planning with Dynamic Obstacles Using Density-Based Reachability,"Laura Lützow, Yue Meng, Andres Chavez Armijos, Chuchu Fan","Technical University of Munich,Massachusetts Institute of Technology,Boston University",Planning under Uncertainty II,"Uncertainty is prevalent in robotics. Due to measurement noise and complex dynamics, we cannot estimate the exact system and environment state. Since conservative motion planners are not guaranteed to find a safe control strategy in a crowded, uncertain environment, we propose a density-based method. Our approach uses a neural network and the Liouville equation to learn the density evolution for a system with an uncertain initial state. We can plan for feasible and probably safe trajectories by applying a gradient-based optimization procedure to minimize the collision risk. We conduct motion planning experiments on simulated environments and environments generated from real-world data and outperform baseline methods such as model predictive control and nonlinear programming. While our method requires offline planning, the online run time is 100 times smaller compared to model predictive control. The code and supplementary material can be found at https://mit-realm.github.io/density_planner/." 
Sequential Bayesian Optimization for Adaptive Informative Path Planning with Multimodal Sensing,"Joshua Ott, Edward Balaban, Mykel Kochenderfer","Stanford University,NASA Ames Research Center",Planning under Uncertainty II,"Adaptive Informative Path Planning with Multimodal Sensing (AIPPMS) considers the problem of an agent equipped with multiple sensors, each with different sensing accuracy and energy costs. The agent's goal is to explore the environment and gather information subject to its resource constraints in unknown, partially observable environments. Previous work has focused on the less general Adaptive Informative Path Planning (AIPP) problem, which considers only the effect of the agent's movement on received observations. The AIPPMS problem adds additional complexity by requiring that the agent reasons jointly about the effects of sensing and movement while balancing resource constraints with information objectives. We formulate the AIPPMS problem as a belief Markov decision process with Gaussian process beliefs and solve it using a sequential Bayesian optimization approach with online planning. Our approach consistently outperforms previous AIPPMS solutions by more than doubling the average reward received in almost every experiment while also reducing the root-mean-square error in the environment belief by 50%. We completely open-source our implementation to aid in further development and comparison." Tree-Structured Policy Planning with Learned Behavior Models,"Yuxiao Chen, Peter Karkus, Boris Ivanovic, Xinshuo Weng, Marco Pavone","Nvidia research,NVIDIA,Carnegie Mellon University,Stanford University",Planning under Uncertainty II,"Autonomous vehicles (AVs) need to reason about the multimodal behavior of neighboring agents while planning their own motion. 
Many existing trajectory planners seek a single trajectory that performs well under all plausible futures simultaneously, ignoring bi-directional interactions and thus leading to overly conservative plans. Policy planning, whereby the ego agent plans a policy that reacts to the environment's multimodal behavior, is a promising direction as it can account for the action-reaction interactions between the AV and the environment. However, most existing policy planners do not scale to the complexity of real autonomous vehicle applications: they are either not compatible with modern deep learning prediction models, not interpretable, or not able to generate high quality trajectories. To fill this gap, we propose Tree Policy Planning (TPP), a policy planner that is compatible with state-of-the-art deep learning prediction models, generates multistage motion plans, and accounts for the influence of ego agent on the environment behavior. The key idea of TPP is to reduce the continuous optimization problem into a tractable discrete Markov Decision Process (MDP) through the construction of two tree structures: an ego trajectory tree for ego trajectory options, and a scenario tree for multi-modal ego-conditioned environment predictions. We demonstrate the efficacy of TPP in closed-loop simulations based on real-world nuScenes dataset and results show that TPP scales to realistic AV scenarios and significantly outperforms non-policy baselines." Fast and Scalable Signal Inference for Active Robotic Source Seeking,"Christopher E. Denniston, Oriana Peltzer, Joshua Ott, Sangwoo Moon, Sung Kyun Kim, Gaurav Sukhatme, Mykel Kochenderfer, Mac Schwager, Ali-Akbar Agha-Mohammadi","University of Southern California,Stanford University,Jet Propulsion Laboratory, NASA,NASA Jet Propulsion Laboratory, Caltech,NASA-JPL, Caltech",Planning under Uncertainty II,"In active source seeking, a robot takes repeated measurements in order to locate a signal source in a cluttered and unknown environment. 
A key component of an active source seeking robot planner is a model that can produce estimates of the signal at unknown locations with uncertainty quantification. This model allows the robot to plan for future measurements in the environment. Traditionally, this model has been in the form of a Gaussian process, which has difficulty scaling and cannot represent obstacles. We propose a global and local factor graph model for active source seeking, which allows the model to scale to a large number of measurements and represent unknown obstacles in the environment. We combine this model with extensions to a highly scalable planner to form a system for large-scale active source seeking. We demonstrate that our approach outperforms baseline methods in both simulated and real robot experiments." Active Inference for Autonomous Decision-Making with Contextual Multi-Armed Bandits,"Shohei Wakayama, Nisar Ahmed",University of Colorado Boulder,Planning under Uncertainty II,"In autonomous robotic decision-making under uncertainty, the tradeoff between exploitation and exploration of available options must be considered. If secondary information associated with options can be utilized, such decision-making problems can often be formulated as contextual multi-armed bandits (CMABs). In this study, we apply active inference, which has been actively studied in the field of neuroscience in recent years, as an alternative action selection strategy for CMABs. Unlike conventional action selection strategies, it is possible to rigorously evaluate the uncertainty of each option when calculating the expected free energy (EFE) associated with the decision agent's probabilistic model, as derived from the free-energy principle. We specifically address the case where a categorical observation likelihood function is used, such that EFE values are analytically intractable. We introduce new approximation methods for computing the EFE based on variational and Laplace approximations. 
Extensive simulation study results demonstrate that, compared to other strategies, active inference generally requires far fewer iterations to identify optimal options and generally achieves superior cumulative regret, for relatively low extra computational cost." Covariance Steering for Uncertain Contact-Rich Systems,"Yuki Shirai, Devesh Jha, Arvind Raghunathan","University of California, Los Angeles,Mitsubishi Electric Research Laboratories",Planning under Uncertainty II,"Planning and control for uncertain contact systems is challenging as it is not clear how to propagate uncertainty for planning. Contact-rich tasks can be modeled efficiently using complementarity constraints among other techniques. In this paper, we present a stochastic optimization technique with chance constraints for systems with stochastic complementarity constraints. We use a particle filter-based approach to propagate moments for stochastic complementarity system. To circumvent the issues of open-loop chance constrained planning, we propose a contact-aware controller for covariance steering of the complementarity system. Our optimization problem is formulated as Non-Linear Programming (NLP) using bilevel optimization. We present an important-particle algorithm for numerical efficiency for the underlying control problem. We verify that our contact-aware closed-loop controller is able to steer the covariance of the states under stochastic contact-rich tasks." A Congestion-Aware Path Planning Method Considering Crowd Spatial-Temporal Anomalies for Long-Term Autonomy of Mobile Robots,"Zijian Ge, Jingjing Jiang, Matthew Coombes","Loughborough university,Loughborough University",Planning under Uncertainty II,"A congestion-aware path planning method is presented for mobile robots during long-term deployment in human occupied environments. With known spatial-temporal crowd patterns, the robot will navigate to its destination via less congested areas. 
Traditional traffic-aware routing methods do not consider spatial-temporal anomalies of macroscopic crowd behaviour that can deviate from the predicted crowd spatial distribution. The proposed method improves long-term path planning adaptivity by integrating a partially updated memory (PUM) model that utilizes observed anomalies to generate a multi-layer crowd density map to improve estimation accuracy. Using this map, we are able to generate a path that has less chance to encounter the crowded areas. Simulation results show that our method outperforms the benchmark congestion-aware routing method in terms of reducing the probability of robot's proximity to dense crowds." Risk-Aware Model Predictive Path Integral Control Using Conditional Value-At-Risk,"Ji Yin, Zhiyuan Zhang, Panagiotis Tsiotras","Georgia Institute of Technology,Georgia Tech",Planning under Uncertainty II,"In this paper, we present a novel Model Predictive Control method for autonomous robot planning and control subject to arbitrary forms of uncertainty. The proposed Risk-Aware Model Predictive Path Integral (RA-MPPI) control utilizes the Conditional Value-at-Risk (CVaR) measure to generate optimal control actions for safety-critical robotic applications. Different from most existing Stochastic MPCs and CVaR optimization methods that linearize the original dynamics and formulate control tasks as convex programs, the proposed method directly uses the original dynamics without restricting the form of the cost functions or the noise. We apply the novel RA-MPPI controller to an autonomous vehicle to perform aggressive driving maneuvers in cluttered environments. Our simulations and experiments show that the proposed RA-MPPI controller can achieve similar lap times with the baseline MPPI controller while encountering significantly fewer collisions. 
The proposed controller performs online computation at an update frequency of up to 80 Hz, utilizing modern Graphics Processing Units (GPUs) to multi-thread the generation of trajectories as well as the CVaR values." Chance-Constrained Motion Planning with Event-Triggered Estimation,"Anne Theurkauf, Qi Heng Ho, Roland Ilyes, Nisar Ahmed, Morteza Lahijanian",University of Colorado Boulder,Planning under Uncertainty II,"We consider the problem of autonomous navigation using limited information from a remote sensor network. Because the remote sensors are power and bandwidth limited, we use event-triggered (ET) estimation to manage communication costs. We introduce a fast and efficient sampling-based planner which computes motion plans coupled with ET communication strategies that minimize communication costs, while satisfying constraints on the probability of reaching the goal region and the point-wise probability of collision. We derive a novel method for offline propagation of the expected state distribution, and corresponding bounds on this distribution. These bounds are used to evaluate the chance constraints in the algorithm. Case studies establish the validity of our approach, demonstrating fast computation of optimal plans." STAP: Sequencing Task-Agnostic Policies,"Toki Migimatsu, Christopher Agia, Jiajun Wu, Jeannette Bohg",Stanford University,Integrated Planning and Learning,"Advances in robotic skill acquisition have made it possible to build general-purpose libraries of learned skills for downstream manipulation tasks. However, naively executing these skills one after the other is unlikely to succeed without accounting for dependencies between actions prevalent in long-horizon plans. We present Sequencing Task-Agnostic Policies (STAP), a scalable framework for training manipulation skills and coordinating their geometric dependencies at planning time to solve long-horizon tasks never seen by any skill during training.
Given that Q-functions encode a measure of skill feasibility, we formulate an optimization problem to maximize the joint success of all skills sequenced in a plan, which we estimate by the product of their Q-values. Our experiments indicate that this objective function approximates ground truth plan feasibility and, when used as a planning objective, reduces myopic behavior and thereby promotes long-horizon task success. We further demonstrate how STAP can be used for task and motion planning by estimating the geometric feasibility of skill sequences provided by a task planner. We evaluate our approach in simulation and on a real robot. Qualitative results and code are made available at https://sites.google.com/stanford.edu/stap." A Multi-Step Dynamics Modeling Framework for Autonomous Driving in Multiple Environments,"Jason Gibson, Bogdan Vlahov, David Fan, Patrick Spieler, Daniel Pastor, Ali-Akbar Agha-Mohammadi, Evangelos Theodorou","Georgia Institute of Technology,NASA Jet Propulsion Laboratory,JPL,Caltech,NASA-JPL, Caltech",Award Finalists 2,"Modeling dynamics is often the first step to making a vehicle autonomous. While on-road autonomous vehicles have been extensively studied, off-road vehicles pose many challenging modeling problems. An off-road vehicle encounters highly complex and difficult-to-model terrain/vehicle interactions, as well as having complex vehicle dynamics of its own. These complexities can create challenges for effective high-speed control and planning. In this paper, we introduce a framework for multistep dynamics prediction that explicitly handles the accumulation of modeling error and remains scalable for sampling-based controllers. Our method uses a specially-initialized LSTM over a limited time horizon as the learned component in a hybrid model to predict the dynamics of a 4-person seating all-terrain vehicle (Polaris S4 1000 RZR) in two distinct environments. 
By only having the LSTM predict over a fixed time horizon, we negate the need for long term stability that is often a challenge when training recurrent neural networks. Our framework is flexible as it only requires odometry information for labels. Through extensive experimentation, we show that our method is able to predict millions of possible trajectories in real-time, with a time horizon of five seconds in challenging off road driving scenarios." Self-Adaptive Teaching-Learning-Based Optimizer with Improved RBF and Sparse Autoencoder for Complex Optimization Problems,"Jing Bi, Ziqi Wang, Haitao Yuan, Junfei Qiao, Jia Zhang, Mengchu Zhou","Beijing University of Technology, Beijing ,,,,,,, China,Beijing University of Technology,Beihang University,Southern Methodist University,New Jersey Institute of Technology",Integrated Planning and Learning,"Evolutionary algorithms are commonly used to solve many complex optimization problems in such fields as robotics, industrial automation, and complex system design. Yet, their performance is limited when dealing with high-dimensional complex problems because they often require enormous computational resources to yield desired solutions, and they may easily become trapped in local optima. To solve this problem, this work proposes a Self-adaptive Teaching-learning-based Optimizer with an improved Radial basis function model and a sparse Autoencoder (STORA). In STORA, a Self-adaptive Teaching-learning-based Optimizer is designed to dynamically adjust parameters for balancing exploration and exploitation during its solution process. Then, a sparse autoencoder (SAE) is adopted as a dimension reduction method to compress the search space into a lower-dimensional one for more efficiently guiding the population to converge towards global optima. Besides, an Improved Radial Basis Function model (IRBF) is designed as a surrogate model to balance training time and prediction accuracy.
It is adopted to save computational resources for improving overall performance. In addition, a dynamic population allocation strategy is adopted to well integrate SAE and IRBF in STORA. We evaluate it by comparing it with several state-of-the-art algorithms through six benchmark functions. We further test it by applying it to solve a real-world computational offloading problem." Learning Neuro-Symbolic Programs for Language Guided Robot Manipulation,"Namasivayam Kalithasan, Himanshu Gaurav Singh, Vishal Bindal, Arnav Tuli, Vishwajeet Agrawal, Rahul Jain, Parag Singla, Rohan Paul","Indian Institute of Technology, Delhi,IIT DELHI,Indian Institute of Technology Delhi",Integrated Planning and Learning,"Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot. Prior approaches for this task possess one of the following limitations: (i) rely on hand-coded symbols for concepts limiting generalization beyond those seen during training [1], (ii) infer action sequences from instructions but require dense sub-goal supervision [2], or (iii) lack semantics required for deeper object-centric reasoning inherent in interpreting complex instructions [3]. In contrast, our approach can handle linguistic as well as perceptual variations, is end-to-end trainable, and requires no intermediate supervision. The proposed model uses symbolic reasoning constructs that operate on a latent neural object-centric representation, allowing for deeper reasoning over the input scene. Central to our approach is a modular structure consisting of a hierarchical instruction parser and an action simulator to learn disentangled action representations.
Our experiments on a simulated environment with a 7-DOF manipulator, consisting of instructions with varying number of steps and scenes with different number of objects, demonstrate that our model is robust to such variations and significantly outperforms baselines, particularly in the generalization settings. The code, dataset and experiment videos are available at https://nsrmp.github.io" Real-Time Generative Grasping with Spatio-Temporal Sparse Convolution,"Tim Player, Dongsik Chang, Fuxin Li, Geoffrey Hollinger","Oregon State University,Amazon",Grasping and Manipulation I,"Robots performing mobile manipulation in unstructured environments must identify grasp affordances quickly and with robustness to perception noise. Yet in domains such as underwater manipulation, where perception noise is severe, computation is constrained, and the environment is dynamic, existing techniques fail. They are too computationally demanding, or too sensitive to noise to allow for closed loop grasping or dynamic replanning, or do not consider 6-DOF grasps. We present a novel grasp synthesis network, TSGrasp, that uses spatio-temporal sparse convolution to process a streaming point cloud in real time. The network generates 6-DOF grasps at greater speed and with less memory than Contact GraspNet, a state-of-the-art algorithm based on PointNet++. By considering information from multiple successive frames of depth video, TSGrasp boosts robustness to noise or temporary self-occlusion and allows more grasps to be rapidly identified. Our grasp synthesis system was successfully demonstrated in an underwater environment with a Blueprint Labs Bravo robotic arm." Keypoint-GraspNet: Keypoint-Based 6-DoF Grasp Generation from the Monocular RGB-D Input,"Yiye Chen, Yunzhi Lin, Ruinian Xu, Patricio A. 
Vela","Georgia Institute of Technology,georgia institute of technology",Grasping and Manipulation I,"The success of 6-DoF grasp learning using point cloud input is tempered by the computational costs resulting from their unordered nature and the pre-processing requirements to reduce the point cloud to a manageable size. These properties lead to their failure on small objects with relatively low point cloud cardinality. As an alternative to point clouds, this manuscript explores grasp generation directly from the RGB-D image input. The approach, called Keypoint-GraspNet (KGN), operates in perception space by detecting projected gripper keypoints in the image, then recovering their SE(3) poses with a PnP algorithm. Training of the network involves a synthetic dataset derived from primitive shape objects with known continuous grasp families. Trained with only single-object synthetic data, KGN achieves superior results on our single-object dataset, comparable performance with state-of-the-art baselines on a multi-object test set, and outperforms the most competitive baseline on small objects. KGN is more than 3x faster than tested point cloud methods. Robot experiments show a high success rate, demonstrating its potential in actual, embodied application." Pick2Place: Task-Aware 6DoF Grasp Estimation Via Object-Centric Perspective Affordance,"Zhanpeng He, Nikhil Chavan-dafle, Jinwook Huh, Shuran Song, Volkan Isler","Columbia University,Samsung Research America,Samsung,University of Minnesota",Grasping and Manipulation I,"The choice of a grasp plays a critical role in the success of downstream manipulation tasks. Consider a task of placing an object in a cluttered scene; the majority of possible grasps may not be suitable for the desired placement. In this paper, we study the synergy between the picking and placing of an object in a cluttered scene to develop an algorithm for task-aware grasp estimation.
We present an object-centric action space that encodes the relationship between the geometry of the placement scene and the object to be placed in order to provide placement affordance maps directly from perspective views of the placement scene. This action space enables the computation of a one-to-one mapping between the placement and picking actions allowing the robot to generate a diverse set of pick-and-place proposals and to optimize for a grasp under other task constraints such as robot kinematics and collision avoidance. With experiments both in simulation and on a real robot we demonstrate that with our method, the robot is able to successfully complete the task of placement-aware grasping with over 89% accuracy in such a way that generalizes to novel objects and scenes." RGB-D Grasp Detection Via Depth Guided Learning with Cross-Modal Attention,"Ran Qin, Haoxiang Ma, Boyang Gao, Di Huang","Beihang University,Geometry Robotics Ltd. Harbin Institute of Technology",Grasping and Manipulation I,"Planar grasp detection is one of the most fundamental tasks to robotic manipulation, and the recent progress of consumer-grade RGB-D sensors enables delivering more comprehensive features from both the texture and shape modalities. However, depth maps are generally of a relatively lower quality with much stronger noise compared to RGB images, making it challenging to acquire grasp depth and fuse multi-modal clues. To address the two issues, this paper proposes a novel learning based approach to RGB-D grasp detection, namely Depth Guided Cross-modal Attention Network (DGCAN). To better leverage the geometry information recorded in the depth channel, a complete 6-dimensional rectangle representation is adopted with the grasp depth dedicatedly considered in addition to those defined in the common 5-dimensional one. The prediction of the extra grasp depth substantially strengthens feature learning, thereby leading to more accurate results. 
Moreover, to reduce the negative impact caused by the discrepancy of data quality in two modalities, a Local Cross-modal Attention (LCA) module is designed, where the depth features are refined according to cross-modal relations and concatenated to the RGB ones for more sufficient fusion. Extensive simulation and physical evaluations are conducted and the experimental results highlight the superiority of the proposed approach." Towards Generalized Robot Assembly through Compliance-Enabled Contact Formations,"Andrew Morgan, Quentin Bateux, Mei Hao, Aaron Dollar",Yale University,Grasping and Manipulation I,"Contact can be conceptualized as a set of constraints imposed on two bodies that are interacting with one another in some way. The nature of a contact, whether a point, line, or surface, dictates how these bodies are able to move with respect to one another given a force, and a set of contacts can provide either partial or full constraint on a body's motion. Decades of work have explored how to explicitly estimate the location of a contact and its dynamics, e.g., frictional properties, but investigated methods have been computationally expensive and there often exists significant uncertainty in the final calculation. This has affected further advancements in contact-rich tasks that are seemingly simple to humans, such as generalized peg-in-hole insertions. In this work, instead of explicitly estimating the individual contact dynamics between an object and its hole, we approach this problem by investigating compliance-enabled contact formations. More formally, contact formations are defined according to the constraints imposed on an object's available degrees-of-freedom. Rather than estimating individual contact positions, we abstract out this calculation to an implicit representation, allowing the robot to either acquire, maintain, or release constraints on the object during the insertion process, by monitoring forces enacted on the end effector through time. 
Using a compliant robot, our method is desirable in that we are able to complete industry-relevant insertion tasks of tolerances" Design of a Multimodal Fingertip Sensor for Dynamic Manipulation,"Andrew Saloutos, Elijah Stanger-jones, Menglong Guo, Hongmin Kim, Sangbae Kim","Massachusetts Institute of Technology,University of California Berkeley,Seoul National University",Grasping and Manipulation I,"We introduce a spherical fingertip sensor for dynamic manipulation. It is based on barometric pressure and time-of-flight proximity sensors and is low-latency, compact, and physically robust. The sensor uses a trained neural network to estimate the contact location and three-axis contact forces based on data from the pressure sensors, which are embedded within the sensor's sphere of polyurethane rubber. The time-of-flight sensors face in three different outward directions, and an integrated microcontroller samples each of the individual sensors at up to 200 Hz. To quantify the effect of system latency on dynamic manipulation performance, we develop and analyze a metric called the collision impulse ratio and characterize the end-to-end latency of our new sensor. We also present experimental demonstrations with the sensor, including measuring contact transitions, performing coarse mapping, maintaining a contact force with a moving object, and reacting to avoid collisions." TactoFind: A Tactile Only System for Object Retrieval,"Sameer Pai, Tao Chen, Megha Tippur, Edward Adelson, Abhishek Gupta, Pulkit Agrawal","Massachusetts Institute of Technology,MIT,University of Washington",Grasping and Manipulation I,"We study the problem of object retrieval in scenarios where visual sensing is absent, object shapes are unknown beforehand and objects can move freely, like grabbing objects out of a drawer. Successful solutions require localizing free objects, identifying specific object instances, and then grasping the identified objects, only using touch feedback. 
Unlike vision, where cameras can observe the entire scene, touch sensors are local and only observe parts of the scene that are in contact with the manipulator. Moreover, information gathering via touch sensors necessitates applying forces on the touched surface which may disturb the scene itself. Reasoning with touch, therefore, requires careful exploration and integration of information over time -- a challenge we tackle. We present a system capable of using sparse tactile feedback from fingertip touch sensors on a dexterous hand to localize, identify and grasp novel objects without any visual feedback. Videos are available at https://sites.google.com/view/tactofind." FingerSLAM: Closed-Loop Unknown Object Localization and Reconstruction from Visuo-Tactile Feedback,"Alan Zhao, Maria Bauza Villalonga, Edward Adelson","Massachusetts Institute of Technology,MIT",Grasping and Manipulation I,"In this paper, we address the problem of using visuo-tactile feedback for 6-DoF localization and 3D reconstruction of unknown in-hand objects. We propose FingerSLAM, a closed-loop factor graph-based pose estimator that combines local tactile sensing at finger-tip and global vision sensing from a wrist-mount camera. FingerSLAM is constructed with two constituent pose estimators: a multi-pass refined tactile-based pose estimator that captures movements from detailed local textures, and a single-pass vision-based pose estimator that predicts from a global view of the object. We also design a loop closure mechanism that actively matches current vision and tactile images to previously stored key-frames to reduce accumulated error. FingerSLAM incorporates the two sensing modalities of tactile and vision, as well as the loop closure mechanism with a factor graph-based optimization framework. Such a framework produces an optimized pose estimation solution that is more accurate than the standalone estimators. 
The estimated poses are then used to reconstruct the shape of the unknown object incrementally by stitching the local point clouds recovered from tactile images. We train our system on real-world data collected with 20 objects. We demonstrate reliable visuo-tactile pose estimation and shape reconstruction through quantitative and qualitative real-world evaluations on 6 objects that are unseen during training." Differential Dynamic Programming Based Hybrid Manipulation Strategy for Dynamic Grasping,"Cheng Zhou, Yanbo Long, Lei Shi, Longfei Zhao, Yu Zheng","Tencent,University of Bristol,Johns Hopkins University,TENCENT",Grasping and Manipulation I,"To fully explore the potential of robots for dexterous manipulation, this paper presents a whole dynamic grasping process to achieve fluent grasping of a target object by the robot end-effector. The process starts from the phase of approaching the object over the phases of colliding with the object and letting it roll about the colliding point to the final phase of catching it by the palm or grasping it by the fingers of the end-effector. We derive a unified model for this hybrid dynamic manipulation process embodied as approaching-colliding-rolling-catching/grasping from the spatial vector based articulated body dynamics. Then, the whole process is formulated as a free-terminal constrained multi-phase optimal control problem (OCP). We extend the traditional differential dynamic programming (DDP) to solving this free-terminal OCP, where the backward pass of DDP involves constrained quadratic programming (QP) problems and we solve them by the primal-dual Augmented Lagrangian (PDAL) method. Simulation and real experiments are conducted to show the effectiveness of the proposed method for robotic dynamic grasping." 
A Bioinspired Synthetic Nervous System Controller for Pick-And-Place Manipulation,"Yanjun Li, Ravesh Sukhnandan, Jeffrey Gill, Hillel Chiel, Victoria Webster-Wood, Roger Quinn","Case Western Reserve University,Carnegie Mellon University",Grasping and Manipulation I,"The Synthetic Nervous System (SNS) is a biologically inspired neural network (NN). Due to its capability of capturing complex mechanisms underlying neural computation, an SNS model is a candidate for building compact and interpretable NN controllers for robots. Previous work on SNSs has focused on applying the model to the control of legged robots and the design of functional subnetworks (FSNs) to realize dynamical systems. However, the FSN approach has previously relied on the analytical solution of the governing equations, which is difficult for designing more complex NN controllers. Incorporating plasticity into SNSs and using learning algorithms to tune the parameters offers a promising solution for systematic design in this situation. In this paper, we theoretically analyze the computational advantages of SNSs compared with other classical artificial neural networks. We then use learning algorithms to develop compact subnetworks for implementing addition, subtraction, division, and multiplication. We also combine the learning-based methodology with a bioinspired architecture to design an interpretable SNS for the pick-and-place control of a simulated gantry system. Finally, we show that the SNS controller is successfully transferred to a real-world robotic platform without further tuning of the parameters, verifying the effectiveness of our approach." SDF-Based Graph Convolutional Q-Networks for Rearrangement of Multiple Objects,"Hogun Kee, Minjae Kang, Dohyeong Kim, JaeGoo Choy, Songhwai Oh","Seoul National University,Seoul National University (SNU)",Grasping and Manipulation I,"In this paper, we propose a signed distance field (SDF)-based deep Q-learning framework for multi-object rearrangement. 
Our method learns to rearrange objects with non-prehensile manipulation, e.g., pushing, in unstructured environments. To reliably estimate Q-values in various scenes, we train the Q-network using an SDF-based scene graph as the state-goal representation. To this end, we introduce SDFGCN, a scalable Q-network structure which can estimate Q-values from a set of image inputs satisfying permutation invariance by using graph convolutional networks. In contrast to grasping-based rearrangement methods that rely on the performance of grasp predictive models for perception and movement, our approach enables rearrangements on unseen objects, including hard-to-grasp objects. Moreover, our method does not require any expert demonstrations. We observe that SDFGCN is capable of rearranging unseen objects in challenging configurations, both in simulation and in the real world." Towards Open-World Interactive Disambiguation for Robotic Grasping,"Yuchen Mo, Hanbo Zhang, Tao Kong","ByteDance AI Lab,Bytedance AI Lab,ByteDance",Grasping and Manipulation I,"Language-based communications are essential in human-robot interaction, especially for the majority of non-expert users. In this paper, we present SeeAsk, an open-world interactive visual grounding system to grasp specified targets with ambiguous natural language instructions. The main contribution of SeeAsk is that it can robustly handle open-world scenes in terms of both open-set objects and open-vocabulary interactions. Specifically, our SeeAsk is built upon modern large-scale vision-language pre-trained models and a traditional decision-making process, and shows promising results for deployment in real-world scenarios. SeeAsk outperforms previous state-of-the-art algorithms by a clear margin in terms of not only success rate but also asking smarter and more informative questions. User studies also demonstrate its advantages over previous works."
GenDexGrasp: Generalizable Dexterous Grasping,"Puhao Li, Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, Siyuan Huang","Tsinghua University,Beijing Institute for General Artificial Intelligence,Peking University",Grasping and Manipulation I,"Generating dexterous grasps has been a long-standing and challenging robotic task. Despite recent progress, existing methods primarily suffer from two issues. First, most prior art focuses on a specific type of robot hand, lacking the ability to generalize to unseen ones. Second, prior art often fails to rapidly generate diverse grasps with a high success rate. To jointly tackle these challenges with a unified solution, we propose GenDexGrasp, a novel hand-agnostic grasping algorithm for generalizable grasping. GenDexGrasp is trained on our proposed large-scale multi-hand grasping dataset MultiDex synthesized with force closure optimization. By leveraging the contact map as a hand-agnostic intermediate representation, GenDexGrasp efficiently generates diverse and plausible grasping poses with a high success rate and can transfer among diverse multi-fingered robotic hands. Compared with previous methods, GenDexGrasp achieves a three-way trade-off among success rate, inference speed, and diversity." Mechanical Intelligence for Prehensile In-Hand Manipulation of Spatial Trajectories,"Qiujie Lu, Zhongxue Gan, Xinran Wang, Guochao Bai, Zhuang Zhang, Nicolas Rojas","Fudan University,Imperial College London,Shanghai Jiao Tong University",Grasping and Manipulation I,"The application of mechanical and other physical properties to the development of robotic systems that can easily adapt to changing external situations is known as mechanical intelligence. Following this concept, many robot hand designs can produce self-adaptive and versatile grasps with simple underactuated fingers and open-loop control, while mechanical-intelligent strategies for dexterous manipulation are still limited.
This paper proposes a mechanical-intelligent technique to facilitate dexterous manipulation, in particular prehensile in-hand manipulation. The proposed strategy is based on the generation of complex spatial trajectories of the hand-object system, controlled in open loop with the minimum number of actuators and using simple low-level non-position modes. This approach is exemplified by the rigorous analysis and testing of a three-fingered two-actuator underactuated robot hand, called the helical hand, which is capable of generating helical prehensile in-hand manipulation of diversiform objects under error tolerance controlled by constant speed algorithm." Fast-Grasp'D: Dexterous Multi-Finger Grasp Generation through Differentiable Simulation,"Dylan Turpin, Tao Zhong, Shutong Zhang, Guanglei Zhu, Eric Heiden, Miles Macklin, Stavros Tsogkas, Sven Dickinson, Animesh Garg","University of Toronto,NVIDIA,University of Copenhagen, NVIDIA,Samsung",Grasping and Manipulation I,"Multi-finger grasping relies on high quality training data, which is hard to obtain: human data is hard to transfer and synthetic data relies on simplifying assumptions that reduce grasp quality. By making grasp simulation differentiable, and contact dynamics amenable to gradient-based optimization, we accelerate the search for high-quality grasps with fewer limiting assumptions. We present Grasp’D-1M: a large-scale dataset for multi-finger robotic grasping, synthesized with Fast-Grasp’D, a novel differentiable grasping simulator. Grasp’D-1M contains one million training examples for three robotic hands (three, four and five-fingered), each with multimodal visual inputs (RGB+depth+segmentation, available in mono and stereo). Grasp synthesis with Fast-Grasp’D is 10x faster than GraspIt! [1] and 20x faster than the prior Grasp’D differentiable simulator [2]. Generated grasps are more stable and contact-rich than GraspIt! grasps, regardless of the distance threshold used for contact generation.
We validate the usefulness of our dataset by retraining an existing vision-based grasping pipeline [3] on Grasp’D-1M, and showing a dramatic increase in model performance, predicting grasps with 30% more contact, a 33% higher epsilon metric, and 35% lower simulated displacement. Additional details at fast-graspd.github.io." An Analysis of Unified Manipulation with Robot Arms and Dexterous Hands Via Optimization-Based Motion Synthesis,"Vatsal Patel, Daniel Rakita, Aaron Dollar","Yale University,University of Wisconsin-Madison",Grasping and Manipulation I,"Robot manipulation today generally focuses on motions exclusively with a robot arm or a dexterous hand, but usually not a combination of both. However, complex manipulation tasks can require coordinating arm and hand motions that leverage capabilities of both, much like the coordinated arm and hand motions carried out by humans to perform everyday tasks. In this work, we evaluate unified manipulation with robot arms and dexterous hands, using a motion optimization framework that synthesizes a series of configuration states over the entire manipulation system. We characterize the possible benefits of unifying arm and dexterous hand capabilities within a single model via metrics such as pose accuracy, manipulability, joint-space smoothness, distance to joint-limits, distance to collisions, and more. Several arm-hand combinations are quantitatively compared in simulation on a variety of experiment tasks and performance measures. Our results suggest that combining motions from robot arms and dexterous hands indeed has compelling benefits, highlighting the exciting potential of continued progress in unified arm-hand motion synthesis for robotics applications." 
"Spherical Cubic Blends: C2-Continuous, Zero-Clamped, and Time-Optimized Interpolation of Quaternions","Jonas Wittmann, Lukas Cha, Marco Kappertz, Philipp Seiwald, Daniel Rixen","Technical University of Munich,Technische Universität München",Planning for Manipulation,"Modern collaborative robotic applications require robot motions that are predictable for human coworkers. Therefore, trajectories often need to be planned in task space rather than configuration space (C-space). While the interpolation of translations in Euclidean space is straightforward, the interpolation of rotations in SO(3) is more complex. Most approaches originating from computer graphics do not exhibit the often desired C2-continuity in robotics. Our main contribution is a C2-continuous, zero-clamped interpolation scheme for quaternions that computes a fast synchronized motion given a set of waypoints. As a second contribution, we present modifications to two state-of-the-art quaternion interpolation schemes, Spherical Quadrangle Interpolation (SQUAD) and Spherical Parabolic Blends (SPB), to enable them to compute C2-continuous, zero-clamped trajectories. In experiments, we demonstrate that for the time optimization of trajectories, our approach is computationally efficient and at the same time computes smooth trajectory profiles." Object Reconfiguration with Simulation-Derived Feasible Actions,"Yiyuan Lee, Wil Thomason, Zachary Kingston, Lydia Kavraki",Rice University,Planning for Manipulation,"3D object reconfiguration encompasses common robot manipulation tasks in which a set of objects must be moved through a series of physically feasible state changes into a desired final configuration. Object reconfiguration is challenging to solve in general, as it requires efficient reasoning about environment physics that determine action validity. This information is typically manually encoded in an explicit transition system. 
Constructing these explicit encodings is tedious and error-prone, and is often a bottleneck for planner use. In this work, we explore embedding a physics simulator within a motion planner to implicitly discover and specify the valid actions from any state, removing the need for manual specification of action semantics. Our experiments demonstrate that the resulting simulation-based planner can effectively produce physically valid rearrangement trajectories for a range of 3D object reconfiguration problems without requiring more than an environment description and start and goal arrangements." CuRobo: Parallelized Collision-Free Robot Motion Generation,"Balakumar Sundaralingam, Siva Kumar Sastry Hari, Adam Fishman, Caelan Garrett, Karl Van Wyk, Valts Blukis, Alexander James Millane, Helen Oleynikova, Ankur Handa, Fabio Ramos, Nathan Ratliff, Dieter Fox","NVIDIA Corporation,NVIDIA,University of Washington,Massachusetts Institute of Technology,ETH Zurich,Nvidia,NVidia,University of Sydney, NVIDIA",Planning for Manipulation,"This paper explores the problem of collision-free motion generation for manipulators by formulating it as a global motion optimization problem. We develop a parallel optimization technique to solve this problem and demonstrate its effectiveness on massively parallel GPUs. We show that combining simple optimization techniques with many parallel seeds leads to solving difficult motion generation problems within 53ms on average, 62x faster than SOTA trajectory optimization methods. We achieve SOTA performance by combining L-BFGS step direction estimation with a novel parallel noisy line search scheme and a particle-based optimization solver. To further aid trajectory optimization, we develop a parallel geometric planner that is at least 28x faster than SOTA RRTConnect implementations. We also introduce a collision-free IK solver that can solve 9000 queries/s.
We are releasing our GPU accelerated library CuRobo that contains core components for robot motion generation." Allowing Safe Contact in Robotic Goal-Reaching: Planning and Tracking in Operational and Null Spaces,"Xinghao Zhu, Wenzhao Lian, Bodi Yuan, Daniel Freeman, Masayoshi Tomizuka","University of California, Berkeley,Google X,UC Berkeley,Google LLC,University of California",Planning for Manipulation,"In recent years, impressive results have been achieved in robotic manipulation. While many efforts focus on generating collision-free reference signals, few allow safe contact between the robot bodies and the environment. However, in humans' daily manipulation, contact between arms and obstacles is prevalent and even necessary. This paper investigates the benefit of allowing safe contact during robotic manipulation and advocates generating and tracking compliance reference signals in both operational and null spaces. In addition, to optimize the collision-allowed trajectories, we present a hybrid solver that integrates sampling- and gradient-based approaches. We evaluate the proposed method on a goal-reaching task in five simulated and real-world environments with different collisional conditions. We show that allowing safe contact improves goal-reaching efficiency and provides feasible solutions in highly collisional scenarios where collision-free constraints cannot be enforced. Moreover, we demonstrate that planning in null space, in addition to operational space, improves trajectory safety." Kinodynamic Rapidly-Exploring Random Forest for Rearrangement-Based Nonprehensile Manipulation,"Kejia Ren, Podshara Chanrungmaneekul, Lydia Kavraki, Kaiyu Hang",Rice University,Planning for Manipulation,"Rearrangement-based nonprehensile manipulation remains a challenging problem due to the high-dimensional problem space and the complex physical uncertainties it entails.
We formulate this class of problems as a coupled problem of local rearrangement and global action optimization by incorporating free-space transit motions between constrained rearranging actions. We propose a forest-based kinodynamic planning framework to concurrently search in multiple problem regions, so as to enable global exploration of the most task-relevant subspaces, while facilitating effective switches between local rearranging actions. By interleaving dynamic horizon planning and action execution, our framework can adaptively handle real-world uncertainties. With extensive experiments, we show that our framework significantly improves the planning efficiency and manipulation effectiveness while being robust against various uncertainties." Trajectory Generation with Dynamic Programming for End-Effector Sway Damping of Forestry Machine,"Iman Jebellat, Inna Sharf",McGill University,Planning for Manipulation,"When a robot end-effector is attached to the arm via passive joints, undesirable end-effector sway will occur. In a forestry crane, such as the log-loading or harvesting machine, this sway is problematic as it hinders the efficiency and also can harm the machine and environment. Here, we tackle the sway problem of the forestry forwarder by proposing a methodology for generating anti-sway trajectories in fast maneuvers. We employ the dynamic programming algorithm, combined with a suitable linearization approach, the latter identified through a comparative study. The solution has low computational cost and provides excellent performance for residual sway damping. We demonstrate the dynamic programming solution on the virtual model of the forwarder by using a high-fidelity multibody-dynamics simulator to validate its performance. 
The results show our optimal trajectories can suppress the residual sway effectively to be, on average, less than 10% of the sway when using fifth order polynomial trajectories, in point-to-point maneuvers starting from rest or from initial sway conditions." Planning for Complex Non-Prehensile Manipulation among Movable Objects by Interleaving Multi-Agent Pathfinding and Physics-Based Simulation,"Dhruv Saxena, Maxim Likhachev","The Robotics Institute, Carnegie Mellon University,Carnegie Mellon University",Planning for Manipulation,"Real-world manipulation problems in heavy clutter require robots to reason about potential contacts with objects in the environment. We focus on pick-and-place style tasks to retrieve a target object from a shelf where some `movable' objects must be rearranged in order to solve the task. In particular, our motivation is to allow the robot to reason over and consider non-prehensile rearrangement actions that lead to complex robot-object and object-object interactions where multiple objects might be moved by the robot simultaneously, and objects might tilt, lean on each other, or topple. To support this, we query a physics-based simulator to forward simulate these interaction dynamics which makes action evaluation during planning computationally very expensive. To make the planner tractable, we establish a connection between the domain of Manipulation Among Movable Objects and Multi-Agent Pathfinding that lets us decompose the problem into two phases our M4M algorithm iterates over. First we solve a multi-agent planning problem that reasons about the configurations of movable objects but does not forward simulate a physics model. Next, an arm motion planning problem is solved that uses a physics-based simulator but does not search over possible configurations of movable objects. We run simulated and real-world experiments with the PR2 robot and compare against relevant baseline algorithms. 
Our results highlight that M4M generates complex 3D interactions, and solves at least twice as many problems as the baselines with competitive performance." Torque-Limited Manipulation Planning through Contact by Interleaving Graph Search and Trajectory Optimization,"Ramkumar Natarajan, Garrison Johnston, Nabil Simaan, Maxim Likhachev, Howie Choset","Robotics Institute, Carnegie Mellon University,Vanderbilt University,Carnegie Mellon University",Planning for Manipulation,"Robots often have to perform manipulation tasks in close proximity to people. As such, it is desirable to use a robot arm that has limited joint torques so as to not injure the nearby person. Unfortunately, these limited torques then limit the payload capability of the arm. By using contact with the environment, robots can expand their reachable workspace that, otherwise, would be inaccessible due to exceeding actuator torque limits. We adapt our recently developed INSAT algorithm to tackle the problem of torque-limited whole arm manipulation planning through contact. INSAT requires no prior over contact mode sequence and no initial template or seed for trajectory optimization. INSAT achieves this by interleaving graph search to explore the manipulator joint configuration space with incremental trajectory optimizations seeded by neighborhood solutions to find a dynamically feasible trajectory through contact. We demonstrate our results on a variety of manipulators and scenarios in simulation. We also experimentally show our planner exploiting robot-environment contact for the pick and place of a payload using a Kinova Gen3 robot. In comparison to the same trajectory running in free space, we experimentally show that the utilization of bracing contacts reduces the overall torque required to execute the trajectory." 
FDLNet: Boosting Real-Time Semantic Segmentation by Image-Size Convolution Via Frequency Domain Learning,"Qingqing Yan, Shu Li, Chengju Liu, Ming Liu, Qijun Chen","Tongji University,Hong Kong University of Science and Technology",Semantic Scene Understanding,"This paper proposes a novel real-time semantic segmentation network via frequency domain learning, called FDLNet, which revisits the segmentation task from two critical perspectives: spatial structure description and multilevel feature fusion. We first devise an image-size convolution (IS-Conv) as a global frequency-domain learning operator to capture long-range dependency in a single shot. To model spatial structure information, we construct the global structure representation path (GSRP) based on IS-Conv, which learns a unified edge-region representation with affordable complexity. For efficient and lightweight multi-level feature fusion, we propose the factorized stereoscopic attention (FSA) module, which alleviates semantic confusion and reduces feature redundancy by introducing level-wise attention before channel and spatial attention. Combining the above modules, we propose a concise semantic segmentation framework named FDLNet. We experimentally demonstrate the effectiveness and superiority of the proposed method. FDLNet achieves state-of-the-art performance on Cityscapes, reporting 76.32% mIoU at 150+ FPS and 79.0% mIoU at 41+ FPS on the benchmark. The code will be publicly available." SphNet: A Spherical Network for Semantic Pointcloud Segmentation,"Lukas Bernreiter, Lionel Ott, Roland Siegwart, Cesar D. Cadena Lerma","ETH Zurich, Autonomous Systems Lab,ETH Zurich",Semantic Scene Understanding,"Semantic segmentation for robotic systems can enable a wide range of applications, from self-driving cars and augmented reality systems to domestic robots. We argue that a spherical representation is a natural one for egocentric pointclouds.
Thus, in this work, we present a novel framework exploiting such a representation of LiDAR pointclouds for the task of semantic segmentation. Our approach is based on a spherical convolutional neural network that can seamlessly handle observations from various sensor systems (e.g., different LiDAR systems) and provides an accurate segmentation of the environment. We operate in two distinct stages: First, we encode the projected input pointclouds to spherical features. Second, we decode and back-project the spherical features to achieve an accurate semantic segmentation of the pointcloud. We evaluate our method with respect to state-of-the-art projection-based semantic segmentation approaches using well-known public datasets. We demonstrate that the spherical representation enables us to provide more accurate segmentation and to have a better generalization to sensors with different field-of-view and number of beams than what was seen during training." SRI-Graph: A Novel Scene-Robot Interaction Graph for Robust Scene Understanding,"Dong Yang, Xiao Xu, Mengchen Xiong, Edwin Babaians, Eckehard Steinbach",Technical University of Munich,Semantic Scene Understanding,"We propose a novel scene-robot interaction graph (SRI-Graph) that exploits the known position of a mobile manipulator for robust and accurate scene understanding. Compared to the state-of-the-art scene graph approaches, the proposed SRI-Graph captures not only the relationships between the objects, but also the relationships between the robot manipulator and objects with which it interacts. To improve the detection accuracy of spatial relationships, we leverage the 3D position of the mobile manipulator in addition to RGB images. The manipulator's ego information is crucial for a successful scene understanding when the relationships are visually uncertain. The proposed model is validated for a real-world 3D robot-assisted feeding task. We release a new dataset named 3DRF-Pos for training and validation. 
We also develop a tool, named LabelImg-Rel, as an extension of the open-sourced image annotation tool LabelImg for a convenient annotation in robot-environment interaction scenarios. Our experimental results using the Kinova Movo platform show that SRI-Graph outperforms the state-of-the-art approach and improves detection accuracy by up to 9.83%." 3D VSG: Long-Term Semantic Scene Change Prediction through 3D Variable Scene Graphs,"Samuel Looper, Javier Rodriguez-Puigvert, Roland Siegwart, Cesar D. Cadena Lerma, Lukas Maximilian Schmid","ETH Zurich,Universidad de Zaragoza,Massachusetts Institute of Technology",Semantic Scene Understanding,"Numerous applications require robots to operate in environments shared with other agents, such as humans or other robots. However, such shared scenes are typically subject to different kinds of long-term semantic scene changes. The ability to model and predict such changes is thus crucial for robot autonomy. In this work, we formalize the task of semantic scene variability estimation and identify three main varieties of semantic scene change: changes in the position of an object, its semantic state, or the composition of a scene as a whole. To represent this variability, we propose the Variable Scene Graph (VSG), which augments existing 3D Scene Graph (SG) representations with the variability attribute, representing the likelihood of discrete long-term change events. We present a novel method, DeltaVSG, to estimate the variability of VSGs in a supervised fashion. We evaluate our method on the 3RScan long-term dataset, showing notable improvements in this novel task over existing approaches. Our method DeltaVSG achieves an accuracy of 77.1% and a recall of 72.3%, often mimicking human intuition about how indoor scenes change over time. We further show the utility of VSG prediction in the task of active robotic change detection, speeding up task completion by 66.0% compared to a scene-change-unaware planner. 
We make our code available as open-source." Infrared Image Captioning with Wearable Device,"Chenjun Gao, Yanzhi Dong, Xiaohu Yuan, Huaping Liu","Yantai University,Tsinghua University,Tsinghua University",Semantic Scene Understanding,"Wearable devices have attracted wide attention as a mobile solution, and various intelligent modules based on wearable devices are also increasingly integrated. Additionally, image captioning is an important task in computer vision to map images to text. Existing image captioning work is based on high-quality visible images; higher target complexity and insufficient light can lead to reduced captioning performance and mistakes. In this paper, we design an infrared image captioning framework, which aims at solving the problem of invalid visible image captioning in special conditions. Remarkably, we integrate our infrared image captioning model on the wearable device. Volunteers perform offline and real-time environmental analysis tasks in the real world, and evaluate framework effectiveness in multiple scenarios. The results show that both the accuracy of the infrared image captioning and the feedback from volunteers of wearable devices are in an ideal state." External Camera-Based Mobile Robot Pose Estimation for Collaborative Perception with Smart Edge Sensors,"Simon Bultmann, Raphael Memmesheimer, Sven Behnke",University of Bonn,Semantic Scene Understanding,"We present an approach for estimating a mobile robot's pose w.r.t. the allocentric coordinates of a network of static cameras using multi-view RGB images. The images are processed online, locally on smart edge sensors by deep neural networks to detect the robot and estimate 2D keypoints defined at distinctive positions of the 3D robot model. Robot keypoint detections are synchronized and fused on a central backend, where the robot's pose is estimated via multi-view minimization of reprojection errors. 
Through the pose estimation from external cameras, the robot's localization can be initialized in an allocentric map from a completely unknown state (kidnapped robot problem) and robustly tracked over time. We conduct a series of experiments evaluating the accuracy and robustness of the camera-based pose estimation compared to the robot's internal navigation stack, showing that our camera-based method achieves pose errors below 3 cm and 1° and does not drift over time, as the robot is localized allocentrically. With the robot's pose precisely estimated, its observations can be fused into the allocentric scene model. We show a real-world application, where observations from mobile robot and static smart edge sensors are fused to collaboratively build a 3D semantic map of a ≈240 m2 indoor environment." "Feature-Realistic Neural Fusion for Real-Time, Open Set Scene Understanding","Kirill Mazur, Edgar Sucar, Andrew J Davison",Imperial College London,Semantic Scene Understanding,"General scene understanding for robotics requires flexible semantic representation, so that novel objects and structures which may not have been known at training time can be identified, segmented and grouped. We present an algorithm which fuses general learned features from a standard pre-trained network into a highly efficient 3D geometric neural field representation during real-time SLAM. The fused 3D feature maps inherit the coherence of the neural field's geometry representation. This means that tiny amounts of human labelling interacting at runtime enable objects or even parts of objects to be robustly and accurately segmented in an open set manner." 
Deep Learning on Home Drone: Searching for the Optimal Architecture,"Alaa Maalouf, Yotam Gurfinkel, Barak Diker, Oren Gal, Daniela Rus, Dan Feldman","MIT,University of Haifa,Technion - Israel Institute of Technology",Semantic Scene Understanding,"We suggest the first system that runs real-time semantic segmentation via deep learning on the weak micro-computer Raspberry Pi Zero v2 (whose price was $15) attached to a toy drone. In particular, since the Raspberry Pi weighs less than 16 grams, and its size is half of a credit card, we could easily attach it to the common commercial DJI Tello toy-drone (" Mask3D: Mask Transformer for 3D Semantic Instance Segmentation,"Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe","RWTH Aachen University,ETH Zurich,Nvidia,ETH Zürich",Semantic Scene Understanding,"Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques. Building on the successes of recent Transformer-based methods for object detection and image segmentation, we propose a Transformer-based approach for 3D semantic instance segmentation. We show that we can leverage generic Transformer building blocks to directly predict instance masks from 3D point clouds. In our model called Mask3D each object instance is represented as an instance query. Using Transformer decoders, the instance queries are learned by iteratively attending to point cloud features at multiple scales. Combined with point features, the instance queries directly yield all instance masks in parallel. Mask3D has several advantages over current state-of-the-art approaches, since it neither relies on (1) voting schemes which require hand-selected geometric properties (such as centers) nor (2) geometric grouping mechanisms requiring manually-tuned hyper-parameters (e.g. radii) and (3) enables a loss that directly optimizes instance masks. 
Mask3D sets a new state-of-the-art on ScanNet test (+6.2mAP), S3DIS 6-fold (+10.1mAP), STPLS3D (+10.9mAP) and ScanNet200 test (+12.4mAP)." Detecting Spatio-Temporal Relations by Combining a Semantic Map with a Stream Processing Engine,"Lennart Niecksch, Henning Deeken, Thomas Wiemann","German Research Centre for Artificial Intelligence (DFKI),Osnabrueck University,Fulda University of Applied Sciences",Semantic Scene Understanding,"Changes in topological spatial relations of objects are often strong indicators for state transitions in the underlying processes they are involved in. While various aspects of semantic mapping have been extensively researched, the reasoning about the temporal development of spatial relations of instances is often neglected. This paper presents a concept to combine a semantic map with a stream processing framework for live analysis of the spatio-temporal relation of objects, based on the map and information inferred from sensors streams. To demonstrate the functionality of our concept, we implemented a proof-of-concept system to track everyday events in an office environment. The presented application scenario clearly demonstrates the benefits of the proposed architecture for detecting and handling complex spatio-temporal events." Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs,"Jingyi Wang, Jinfa Huang, Can Zhang, Zhidong Deng","Tsinghua University,Peking University",Semantic Scene Understanding,"Dynamic scene graphs generated from video clips could help enhance the semantic visual understanding in a wide range of challenging tasks such as environmental perception, autonomous navigation, and task planning of self-driving vehicles and mobile robots. In the process of temporal and spatial modeling during dynamic scene graph generation, it is particularly intractable to learn time-variant relations in dynamic scene graphs among frames. 
In this paper, we propose a Time-variant Relation-aware TRansformer (TR^2), which aims to model the temporal change of relations in dynamic scene graphs. Explicitly, we leverage the difference of text embeddings of prompted sentences about relation labels as the supervision signal for relations. In this way, cross-modality feature guidance is realized for the learning of time-variant relations. Implicitly, we design a relation feature fusion module with a transformer and an additional message token that describes the difference between adjacent frames. Extensive experiments on the Action Genome dataset prove that our TR^2 can effectively model the time-variant relations. TR^2 significantly outperforms previous state-of-the-art methods under two different settings by 2.1% and 2.6% respectively." CPSeg: Cluster-Free Panoptic Segmentation of 3D LiDAR Point Clouds,"Thomas Enxu Li, Ryan Razani, Yixuan Xu, Bingbing Liu","University of Toronto,Huawei,Huawei Technologies Canada Co., Ltd.,Huawei Technologies",Semantic Scene Understanding,"A fast and accurate panoptic segmentation system for LiDAR point clouds is crucial for autonomous driving vehicles to understand the surrounding objects and scenes. Existing approaches usually rely on proposals or clustering to segment foreground instances. As a result, they struggle to achieve real-time performance. In this paper, we propose a novel real-time end-to-end panoptic segmentation network for LiDAR point clouds, called CPSeg. In particular, CPSeg comprises a shared encoder, a dual-decoder, and a cluster-free instance segmentation head, which is able to dynamically pillarize foreground points according to the learned embedding. Then, it acquires instance labels by finding connected pillars with a pairwise embedding comparison. Thus, the conventional proposal-based or clustering-based instance segmentation is transformed into a binary segmentation problem on the pairwise embedding comparison matrix. 
To help the network regress instance embedding, a fast and deterministic depth completion algorithm is proposed to calculate the surface normal of each point cloud in real-time. The proposed method is benchmarked on two large-scale autonomous driving datasets: SemanticKITTI and nuScenes. Notably, extensive experimental results show that CPSeg achieves state-of-the-art results among real-time approaches on both datasets." A Generic Diffusion-Based Approach for 3D Human Pose Prediction in the Wild,"Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi","EPFL,Independent Scholar,Sharif University of Technology",Deep Learning for Visual Perception II,"Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can predict given noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are considered as a single sequence containing missing elements (whether in the observation or prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long-term forecasting horizon, we present a temporal cascaded diffusion model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state-of-the-art. Additionally, we show that our framework is generic enough to improve any 3D pose prediction model as a pre-processing step to repair their inputs and a post-processing step to refine their outputs. The code is available online: https://github.com/vita-epfl/DePOSit" DifFAR: Differentiable Frequency-Based Disentanglement for Aerial Video Action Recognition,"Divya Kothandaraman, Ming C. 
Lin, Dinesh Manocha","University of Maryland College Park,University of Maryland at College Park,University of Maryland",Deep Learning for Visual Perception II,"We present a learning algorithm for human activity recognition in videos. Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras that contain a human actor along with background motion. Typically, the human actors occupy less than one-tenth of the spatial resolution. Our approach simultaneously harnesses the benefits of frequency domain representations, a classical analysis tool in signal processing, and data driven neural networks. We build a differentiable static-dynamic frequency mask prior to model the salient static and dynamic pixels in the video, crucial for the underlying task of action recognition. We use this differentiable mask prior to enable the neural network to intrinsically learn disentangled feature representations via an identity loss function. Our formulation empowers the network to inherently compute disentangled salient features within its layers. Further, we propose a cost-function encapsulating temporal relevance and spatial content to sample the most important frame within uniformly spaced video segments. We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset and demonstrate relative improvements of 5.72% - 13.00% over the state-of-the-art and 14.28% - 38.05% over the corresponding baseline model." ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence,"Dmitriy Rivkin, Gregory Dudek, Nikhil Rajiv Kakodkar, David Paul Meger, Oliver Limoyo, Michael Jenkin, Xue Liu, Francois Hogan","Samsung,McGill University,University of Toronto,York University,Massachusetts Institute of Technology",Deep Learning for Visual Perception II,"Our work examines the way in which large language models can be used for robotic planning and sampling in the context of automated photographic documentation. 
Specifically, we illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general purpose language (LM) and vision-language (VLM) models. Given a high-level description of an event we use an LM to generate a natural-language list of photo descriptions that one would expect a photographer to capture at the event. We then use a VLM to identify the best matches to these descriptions in the robot's video stream. The photo portfolios generated by our method are consistently rated as more appropriate to the event by human evaluators than those generated by existing methods." LODE: Locally Conditioned Eikonal Implicit Scene Completion from Sparse LiDAR,"Pengfei Li, Ruowen Zhao, Yongliang Shi, Hao Zhao, Jirui Yuan, Guyue Zhou, Ya-qin Zhang","Institute for AI Industry Research (AIR), Tsinghua University,University of Chinese Academy of Sciences,Tsinghua University,Institute for AI Industry Research(AIR), Tsinghua University",Deep Learning for Visual Perception II,"Scene completion refers to obtaining dense scene representation from an incomplete perception of complex 3D scenes. This helps robots detect multi-scale obstacles and analyse object occlusions in scenarios such as autonomous driving. Recent advances show that implicit representation learning can be leveraged for continuous scene completion and achieved through physical constraints like Eikonal equations. However, former Eikonal completion methods only demonstrate results on watertight meshes at a scale of tens of meshes. None of them are successfully done for non-watertight LiDAR point clouds of open large scenes at a scale of thousands of scenes. In this paper, we propose a novel Eikonal formulation that conditions the implicit representation on localized shape priors which function as dense boundary value constraints, and demonstrate it works on SemanticKITTI and SemanticPOSS. 
It can also be extended to semantic Eikonal scene completion with only small modifications to the network architecture. With extensive quantitative and qualitative results, we demonstrate the benefits and drawbacks of existing Eikonal methods, which naturally leads to the new locally conditioned formulation. Notably, we improve IoU from 31.7% to 51.2% on SemanticKITTI and from 40.5% to 48.7% on SemanticPOSS. We extensively ablate our methods and demonstrate that the proposed formulation is robust to a wide spectrum of implementation hyper-parameters. Code, data and models will be made publicly available." Uncertainty-Aware LiDAR Panoptic Segmentation,"Kshitij Sirohi, Mohammad Sajad Marvi, Daniel Büscher, Wolfram Burgard","University of Freiburg,Albert-Ludwigs-Universität Freiburg,University of Technology Nuremberg",Deep Learning for Visual Perception II,"Modern autonomous systems often rely on LiDAR scanners, in particular for autonomous driving scenarios. In this context, reliable scene understanding is indispensable. Conventional learning-based methods generally try to achieve maximum performance for this task, while neglecting a proper estimation of the associated uncertainties. In this work, we introduce a novel approach for solving the task of uncertainty-aware panoptic segmentation using LiDAR point clouds. Our proposed EvLPSNet network is the first to solve this task efficiently in a sampling-free manner. It aims to predict per-point semantic and instance segmentations, together with per-point uncertainty estimates. Moreover, it incorporates methods that utilize the uncertainties to improve the segmentation performance. We provide several strong baselines combining state-of-the-art LiDAR panoptic segmentation networks with sampling-free uncertainty estimation techniques. Extensive evaluations show that we achieve the best performance on uncertainty-aware panoptic segmentation quality and calibration compared to these baselines. 
We make our code available at: https://github.com/kshitij3112/EvLPSNet" E-VFIA : Event-Based Video Frame Interpolation with Attention,"Onur Selim Kilic, Ahmet Akman, A. Alatan","METU,Middle East Technical University",Deep Learning for Visual Perception II,"Video frame interpolation (VFI) is a fundamental vision task that aims to synthesize several frames between two consecutive original video images. Most algorithms aim to accomplish VFI by using only keyframes, which is an ill-posed problem since the keyframes usually do not yield accurate information about the trajectories of the objects in the scene. On the other hand, event-based cameras provide more precise information between the keyframes of a video. Some recent state-of-the-art event-based methods approach this problem by utilizing event data for better optical flow estimation to interpolate video frames by warping. Nonetheless, those methods heavily suffer from the ghosting effect. On the other hand, some kernel-based VFI methods that only use frames as input have shown that deformable convolutions, when backed up with transformers, can be a reliable way of dealing with long-range dependencies. We propose event-based video frame interpolation with attention (E-VFIA), as a lightweight kernel-based method. E-VFIA fuses event information with standard video frames by deformable convolutions to generate high quality interpolated frames. The proposed method represents events with high temporal resolution and uses a multi-head self-attention mechanism to better encode event-based information, while being less vulnerable to blurring and ghosting artifacts; thus, generating crisper frames. The simulation results show that the proposed technique outperforms current state-of-the-art methods (both frame and event-based) with a significantly smaller model size." 
Edge-Guided Multi-Domain RGB-To-TIR Image Translation for Training Vision Tasks with Challenging Labels,"DongGuw Lee, Myung-Hwan Jeon, Younggun Cho, Ayoung Kim","Seoul National University (SNU),Seoul National University,Inha University",Deep Learning for Visual Perception II,"The insufficient number of annotated thermal infrared (TIR) image datasets not only hinders comparable performances to that of RGB but limits the supervised learning of TIR image-based tasks with challenging labels. As a remedy, we propose an edge-guided multidomain RGB to TIR image translation model to employ annotated RGB images with challenging labels. Our proposed method not only depicts characteristics of TIR images but the key details in the original image are well preserved in the translated image on both synthetic and real world RGB images. Using our translation model, we have enabled the supervised learning of deep TIR image-based optical flow estimation and object detection. Using our proposed method has validated a performance amelioration in TIR optical flow by reduction in end point error by 56.5% on average, and our model achieved the best object detection mAP of 23.9%. Our code and supplementary materials are available at https://github.com/rpmsnu/sRGB-TIR." Weakly Supervised Referring Expression Grounding Via Target-Guided Knowledge Distillation,"Jinpeng Mi, Song Tang, Ma Zhiyuan, Dan Liu, Qingdu Li, Jianwei Zhang","USST,University of Hamburg,University of Shanghai for Science and Technology",Deep Learning for Visual Perception II,"Weakly supervised referring expression grounding aims to train a model without the manual labels between image regions and referring expressions during the training phase. Current predominant models often adopt deep structures to reconstruct the region-expression correspondence. A crucial deficiency of the existing approaches lies in that these models neglect to exploit potential valuable information to further improve their grounding performance. 
To address this issue, we leverage knowledge distillation as a unique scheme to excavate and transfer helpful information for acquiring a better model. Specifically, we propose a target-guided knowledge distillation framework that accounts for region-expression pairs reconstruction and matching. We reactivate the target-related prediction information learned by a pre-trained teacher model and transfer the target-related prediction knowledge from the teacher to guide the training process and boost the performance of the student model. We conduct extensive experiments on three benchmark datasets, i.e., RefCOCO, RefCOCO+, and RefCOCOg. Without bells and whistles, our approach achieves state-of-the-art results on several splits of benchmark datasets. The implementation codes and trained models are available at: https://github.com/dami23/WREG_KD." VQA-Based Robotic State Recognition Optimized with Genetic Algorithm,"Kento Kawaharazuka, Yoshiki Obinata, Naoaki Kanazawa, Kei Okada, Masayuki Inaba",The University of Tokyo,AI-Based Methods,"State recognition of objects and environment in robots has been conducted in various ways. In most cases, this is executed by processing point clouds, learning images with annotations, and using specialized sensors. In contrast, in this study, we propose a state recognition method that applies Visual Question Answering (VQA) in a Pre-Trained Vision-Language Model (PTVLM) trained from a large-scale dataset. By using VQA, it is possible to intuitively describe robotic state recognition in the spoken language. On the other hand, there are various possible ways to ask about the same event, and the performance of state recognition differs depending on the question. Therefore, in order to improve the performance of state recognition using VQA, we search for an appropriate combination of questions using a genetic algorithm. 
We show that our system can recognize not only the open/closed of a refrigerator door and the on/off of a display, but also the open/closed of a transparent door and the state of water, which have been difficult to recognize." Center Feature Fusion: Selective Multi-Sensor Fusion of Center-Based Objects,"Philip Jacobson, Yiyang Zhou, Wei Zhan, Masayoshi Tomizuka, Ming Wu","University of California, Berkeley,University of California, Berkeley,University of California",AI-Based Methods,"Leveraging multi-modal fusion, especially between camera and LiDAR, has become essential for building accurate and robust 3D object detection systems for autonomous vehicles. Until recently, point decorating approaches, in which point clouds are augmented with camera features, have been the dominant approach in the field. However, these approaches fail to utilize the higher resolution images from cameras. Recent works projecting camera features to the bird’s-eye-view (BEV) space for fusion have also been proposed; however, they require projecting millions of pixels, most of which only contain background information. In this work, we propose a novel approach Center Feature Fusion (CFF), in which we leverage center-based detection networks in both the camera and LiDAR streams to identify relevant object locations. We then use the center-based detection to identify the locations of pixel features relevant to object locations, a small fraction of the total number in the image. These are then projected and fused in the BEV frame. On the nuScenes dataset, we outperform the LiDAR-only baseline by 4.9% mAP while fusing up to 100x fewer features than other fusion methods." 
Towards Robust Reference System for Autonomous Driving: Rethinking 3D MOT,"Leichen Wang, Jiadi Zhang, Pei Cai, Xinrun Li","Robert Bosch CN,Tongji University,Nanyang Technological University,Bosch (China) Investment Co., Ltd.",AI-Based Methods,"With the rapid development of autonomous driving, the need for auto-labeling reference systems is becoming increasingly urgent. 3D MOT is one of the most critical components of the reference system. In this work, we reviewed and rethought the common failure sources and limitations of the SOTA 3D MOT methods. Based on these observations, we propose a set of innovative 3D MOT post-processing modules as a unified framework. First, we design a self-learning-based detector to eliminate the outliers in each tracklet. Then a novel post-processing module, GGTrajRec, will recover the breakpoints and ID switches in the trajectories. Finally, a confidence-guided trajectory optimizer is implemented to ensure each trajectory's consistency. Extensive experiments on KITTI and nuScenes show that our method can improve the SOTA methods on most evaluation metrics by a remarkable margin. Currently, our results rank second on the KITTI tracking leaderboard. Specifically, our method offers the lowest FPs, highest DetRe, and AssRe values among all methods, which can significantly contribute to a stable and robust reference system for ADAS." LATITUDE: Robotic Global Localization with Truncated Dynamic Low-Pass Filter in City-Scale NeRF,"Zhenxin Zhu, Yuantao Chen, Zirui Wu, Chao Hou, Yongliang Shi, Chuxuan Li, Pengfei Li, Guyue Zhou, Hao Zhao","Beihang University,Xi'an University of Architecture and Technology,Institute for AI Industry Research, Tsinghua University; Beijing,The University of Hong Kong,Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University",AI-Based Methods,"Neural Radiance Fields (NeRFs) have achieved great success in representing complex 3D scenes with high-resolution details and efficient memory. 
Nevertheless, current NeRF-based pose estimators have no initial pose prediction and are prone to local optima during optimization. In this paper, we present LATITUDE: Global Localization with Truncated Dynamic Low-pass Filter, which introduces a two-stage localization mechanism in city-scale NeRF. In the place recognition stage, we train a regressor through images generated from trained NeRFs, which provides an initial value for global localization. In the pose optimization stage, we minimize the residual between the observed image and rendered image by directly optimizing the pose on the tangent plane. To avoid falling into local optima, we introduce a Truncated Dynamic Low-pass Filter (TDLF) for coarse-to-fine pose registration. We evaluate our method on both synthetic and real-world data and show its potential applications for high-precision navigation in large-scale city scenes. Code and dataset will be publicly available at https://github.com/jike5/LATITUDE." 4DRadarSLAM: A 4D Imaging Radar SLAM System for Large-Scale Environments Based on Pose Graph Optimization,"Jun Zhang, Huayang Zhuge, Zhenyu Wu, Guohao Peng, Mingxing Wen, Yiyao Liu, Danwei Wang","Nanyang Technological University,NANYANG Technological University",Localization and Mapping IV,"LiDAR-based SLAM may easily fail in adverse weather (e.g., rain, snow, smoke, fog), while mmWave Radar remains unaffected. However, current research is primarily focused on 2D (x, y) or 3D (x, y, doppler) Radar and 3D LiDAR, while limited work can be found for 4D Radar (x, y, z, doppler). As a new entrant to the market with unique characteristics, 4D Radar outputs a 3D point cloud with added elevation information, rather than a 2D point cloud; compared with 3D LiDAR, 4D Radar has a noisier and sparser point cloud, making it more challenging to extract geometric features (edge and plane). 
In this paper, we propose a full system for 4D Radar SLAM consisting of three modules: 1) Front-end module performs scan-to-scan matching to calculate the odometry based on GICP, considering the probability distribution of each point; 2) Loop detection utilizes multiple rule-based loop pre-filtering steps, followed by an intensity scan context step to identify loop candidates, and odometry check to reject false loop; 3) Back-end builds a pose graph using front-end odometry, loop closure, and optional GPS data. Optimal pose is achieved through g2o. We conducted real experiments on two platforms and five datasets (ranging from 240m to 4.8km) and will make the code open-source to promote further research at: https://github.com/zhuge2333/4DRadarSLAM" A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation,"Lin Li, Wendong Ding, Yongkun Wen, Yufei Liang, Yong Liu, Guowei Wan","Zhejiang University,Baidu,China,Intelligent Driving Group,Baidu",Localization and Mapping IV,"Pairwise point cloud registration is a critical task for many applications, which heavily depends on finding correct correspondences from the two point clouds. However, the low overlap between input point clouds causes the registration to fail easily, leading to mistaken overlapping and mismatched correspondences, especially in scenes where non-overlapping regions contain similar structures. In this paper, we present a unified bird's-eye view (BEV) model for jointly learning of 3D local features and overlap estimation to fulfill pairwise registration and loop closure. Feature description is performed by a sparse UNet-like network based on BEV representation, and 3D keypoints are extracted by a detection head for 2D locations, and a regression head for heights. For overlap detection, a cross-attention module is applied for interacting contextual information of input point clouds, followed by a classification head to estimate the overlapping region. 
We evaluate our unified model extensively on the KITTI dataset and Apollo-SouthBay dataset. The experiments demonstrate that our method significantly outperforms existing methods on overlap estimation, especially in scenes with small overlaps. It also achieves top registration performance on both datasets in terms of translation and rotation errors." Data-Association-Free Landmark-Based SLAM,"Yihao Zhang, Odin Aleksander Severinsen, John Leonard, Luca Carlone, Kasra Khosoussi","Massachusetts Institute of Technology,MIT,The Commonwealth Scientific and Industrial Research (CSIRO)",Localization and Mapping IV,"We study landmark-based SLAM with unknown data association: our robot navigates in a completely unknown environment and has to simultaneously reason over its own trajectory, the positions of an unknown number of landmarks in the environment, and potential data associations between measurements and landmarks. This setup is interesting since: (i) it arises when recovering from data association failures or from SLAM with information-poor sensors, (ii) it sheds light on fundamental limits (and hardness) of landmark-based SLAM problems irrespective of the front-end data association method, and (iii) it generalizes existing approaches where data association is assumed to be known or partially known. We approach the problem by splitting it into an inner problem of estimating the trajectory, landmark positions and data associations and an outer problem of estimating the number of landmarks. Our approach creates useful and novel connections with existing techniques from discrete-continuous optimization (e.g., k-means clustering), which has the potential to trigger novel research. 
We demonstrate the proposed approaches in extensive simulations and on real datasets and show that the proposed techniques outperform typical data association baselines and are even competitive against an “oracle” baseline which has access to the number of landmarks and an initial guess for each landmark." Efficient Bundle Adjustment for Coplanar Points and Lines,"Lipu Zhou, Jiacheng Liu, Fengguang Zhai, Pan Ai, Kefei Ren, Yinian Mao, Guoquan Huang, Ziyang Meng, Michael Kaess","MeiTuan,Tsinghua University,Meituan,Meituan-Dianping Group,University of Delaware,Carnegie Mellon University",Localization and Mapping IV,"Bundle adjustment (BA) is a well-studied fundamental problem in the robotics and vision community. In man-made environments, coplanar points and lines are ubiquitous. However, the number of works on bundle adjustment with coplanar points and lines is relatively small. This paper focuses on this special BA problem, referred to as π-BA. For a point or a line on a plane, we derive a new constraint to describe the relationship among two poses and the plane, called π-constraint. We distribute π-constraints into different groups. Each group is called a π-factor. We prove that, with certain preprocessing, the computational complexity associated with a π-factor in the Levenberg-Marquardt (LM) algorithm is O(1), independent of the number of π-constraints packed into the π-factor. In π-BA, π-factors replace original reprojection errors. One problem is how to divide π-constraints into π-factors. Different strategies may result in different numbers of π-factors, which in turn affects the efficiency. It is difficult to get the optimal division. We present a greedy algorithm to overcome this problem. Experimental results verify that our algorithm can significantly accelerate the computation." 
Convolutional Bayesian Kernel Inference for 3D Semantic Mapping,"Joseph Wilson, Yuewei Fu, Arthur Zhang, Jingyu Song, Andrew Capodieci, Paramsothy Jayakumar, Kira Barton, Maani Ghaffari","University of Michigan,Neya Robotics,U.S. Army DEVCOM Ground Vehicle Systems Center,University of Michigan at Ann Arbor",Localization and Mapping IV,"Robotic perception is currently at a cross-roads between modern methods, which operate in an efficient latent space, and classical methods, which are mathematically founded and provide interpretable, trustworthy results. In this paper, we introduce a Convolutional Bayesian Kernel Inference (ConvBKI) layer which learns to perform explicit Bayesian inference within a depthwise separable convolution layer to maximize efficiency while maintaining reliability simultaneously. We apply our layer to the task of real-time 3D semantic mapping, where we learn semantic-geometric probability distributions for LiDAR sensor information and incorporate semantic predictions into a global map. We evaluate our network against state-of-the-art semantic mapping algorithms on the KITTI data set, demonstrating improved latency with comparable semantic label inference results." SHINE-Mapping: Large-Scale 3D Mapping Using Sparse Hierarchical Implicit Neural Representations,"Xingguang Zhong, Yue Pan, Jens Behley, Cyrill Stachniss",University of Bonn,Localization and Mapping IV,"Accurate mapping of large-scale environments is an essential building block of most outdoor autonomous systems. Challenges of traditional mapping methods include the balance between memory consumption and mapping accuracy. This paper addresses the problem of achieving large-scale 3D reconstruction using implicit representations built from 3D LiDAR measurements. We learn and store implicit features through an octree-based, hierarchical structure, which is sparse and extensible. The implicit features can be turned into signed distance values through a shallow neural network. 
We leverage binary cross entropy loss to optimize the local features with the 3D measurements as supervision. Based on our implicit representation, we design an incremental mapping system with regularization to tackle the issue of forgetting in continual learning. Our experiments show that our 3D reconstructions are more accurate, complete, and memory-efficient than current state-of-the-art 3D mapping methods." Efficient and Hybrid Decoder for Local Map Construction in Bird's-Eye-View,"Kun Tian, Yun Ye, Zheng Zhu, Peng Li, Guan Huang","phigent robotics,Company,Institute of Automation, Chinese Academy of Sciences,Phigent AI,Phigent Robotics",Localization and Mapping IV,"High-definition maps are crucial perception elements for autonomous robot navigation systems, which can provide accurate scene layout and environment information for downstream motion prediction and planning control tasks. Traditional methods based on manual annotation or SLAM algorithms require massive labor efforts and time costs, which hinders the deployment of practical applications. Online construction of local maps from on-board cameras offers an alternative solution. Aiming at the problems of unsatisfying precision and redundant computation of HDMapNet, we propose an efficient and hybrid decoder (EHD) that consists of a CNN-based segmentation (Seg) head and a query-based lane detection head (QLD). Specifically, the Seg head outputs pixel-level semantic maps, and QLD predicts instance mask for each lane object through learnable query embeddings. The designed decoding method eliminates the cumulative error caused by inaccurate semantic maps and does not require additional clustering algorithm for post-processing. Through combining with a variety of bird’s-eye-view (BEV) encoders, the effectiveness and efficiency of our EHD is demonstrated by extensive experiments. For segmentation task, the mIoU scores of semantic map can be improved by 1.3%∼2.9%. 
Additionally, the accuracy of lane detection is also significantly increased (more than 10.2% mAP) under all evaluation criteria. Since our method discards redundant post-processing, the inference speed is up to 22.71 FPS, which is 32 times faster than HDMapNet." Contour Context: Abstract Structural Distribution for 3D LiDAR Loop Detection and Metric Pose Estimation,"Binqian Jiang, Shaojie Shen",Hong Kong University of Science and Technology,Localization and Mapping IV,"This paper proposes Contour Context, a simple, effective, and efficient topological loop closure detection pipeline with accurate 3-DoF metric pose estimation, targeting the urban autonomous driving scenario. We interpret the Cartesian birds' eye view (BEV) image projected from 3D LiDAR points as layered distribution of structures. To recover elevation information from BEVs, we slice them at different heights, and connected pixels at each level will form contours. Each contour is parameterized by abstract information, e.g., pixel count, center position, covariance, and mean height. The similarity of two BEVs is calculated in sequential discrete and continuous steps. The first step considers the geometric consensus of graph-like constellations formed by contours in particular localities. The second step models the majority of contours as a 2.5D Gaussian mixture model, which is used to calculate correlation and optimize relative transform in continuous space. A retrieval key is designed to accelerate the search of a database indexed by layered KD-trees. We validate the efficacy of our method by comparing it with recent works on public datasets." The Reflectance Field Map: Mapping Glass and Specular Surfaces in Dynamic Environments,"Paul Foster, Collin Johnson, Benjamin Kuipers","University of Michigan,May Mobility",Localization and Mapping IV,"Common architectural features like plate glass or shiny metal doors result in specular reflections of lidar rays. 
These reflections create difficulties for common Bayesian mapping techniques like occupancy grids that accumulate evidence of structure over time. In this paper, we present the Reflectance Field Map, a reliable real-time method for detecting these shiny surfaces, like glass, metal, and mirrors with lidar. The Reflectance Field Map combines the theory developed for Light Field Mapping, common in computer graphics, with occupancy grid mapping. Like early methods for sonar-based robot mapping, we show how the addition of angular viewpoint information to a standard 2D grid map enables robust mapping in the presence of specular reflections. However unlike these previous approaches, our method works in dynamic environments. Additionally, unlike recent approaches for lidar-based mapping of specular surfaces, our approach is sensor-agnostic and has no reliance on either intensity or multi-return measurements. We demonstrate the ability of the Reflectance Field Map to accurately map varied campus environments containing numerous pedestrians and significant plate glass, both straight and curved. The algorithm does not require a GPU and runs in real-time on a single core of a standard desktop processor." Inverse Perspective Mapping-Based Neural Occupancy Grid Map for Visual Parking,"Xiangru Mu, Haoyang Ye, Daojun Zhu, Tongqing Chen, Tong Qin","Huawei,Huawei Technologies,Huawei Technology,Huawei Techonology",Localization and Mapping IV,"Sensing environmental obstacles and establishing an occupancy map of surroundings are critical to achieving automated parking for autonomous vehicles. This paper presents a method to obtain surrounding occupancy information from inverse perspective mapping (IPM) images. This method uses the easily-accessed pseudo-labels from LiDAR to supervise a visual network, which can detect occupied boundaries of obstacles. 
Fusing this visual occupancy with ego-motion information, we develop a multi-frame fusion approach to build a local OGM to realize online environment mapping. Compared with other learning-based occupancy approaches, our method does not require time-consuming and labor-intensive labeling for the environment. The ground truth of surrounding occupancy comes from LiDAR easily. The proposed method achieves LiDAR-like performance with pure visual inputs, which greatly decreases the cost of real products. Experiments on driving and parking environments prove that our method can accurately sense surrounding occupancy information and build a robust occupancy map of the environment." Efficient Implicit Neural Reconstruction Using LiDAR,"Dongyu Yan, Xiaoyang Lyu, Jieqi Shi, Yi Lin","Harbin Institute of Technology (ShenZhen),The University of Hong Kong,Hong Kong University of Technology and Science,Hong Kong University of Science and Technology",Localization and Mapping IV,"Modeling scene geometry using implicit neural representation has revealed its advantages in accuracy, flexibility, and low memory usage. Previous approaches have demonstrated impressive results using color or depth images but still have difficulty handling poor light conditions and large-scale scenes. Methods taking global point cloud as input require accurate registration and ground truth coordinate labels, which limits their application scenarios. In this paper, we propose a new method that uses sparse LiDAR point clouds and rough odometry to reconstruct fine-grained implicit occupancy field efficiently within a few minutes. We introduce a new loss function that supervises directly in 3D space without 2D rendering, avoiding information loss. We also manage to refine poses of input frames in an end-to-end manner, creating consistent geometry without global point cloud registration. As far as we know, our method is the first to reconstruct implicit scene representation from LiDAR-only input. 
Experiments on synthetic and real-world datasets, including indoor and outdoor scenes, prove that our method is effective, efficient, and accurate, obtaining comparable results with existing methods using dense input." Factor Graph Fusion of Raw GNSS Sensing with IMU and Lidar for Precise Robot Localization without a Base Station,"Jonas Beuchert, Marco Camurri, Maurice Fallon","University of Oxford,Free University of Bozen-Bolzano",Localization and Mapping IV,"Accurate localization is a core component of a robot’s navigation system. To this end, global navigation satellite systems (GNSS) can provide absolute measurements outdoors and, therefore, eliminate long-term drift. However, fusing GNSS data with other sensor data is not trivial, especially when a robot moves between areas with and without sky view. We propose a robust approach that tightly fuses raw GNSS receiver data with inertial measurements and, optionally, lidar observations for precise and smooth mobile robot localization. A factor graph with two types of GNSS factors is proposed. First, factors based on pseudoranges, which allow for global localization on Earth. Second, factors based on carrier phases, which enable highly accurate relative localization, which is useful when other sensing modalities are challenged. Unlike traditional differential GNSS, this approach does not require a connection to a base station. On a public urban driving dataset, our approach achieves accuracy comparable to a state-of-the-art algorithm that fuses visual inertial odometry with GNSS data—despite our approach not using the camera, just inertial and GNSS data. We also demonstrate the robustness of our approach using data from a car and a quadruped robot moving in environments with little sky visibility, such as a forest. The accuracy in the global Earth frame is still 1–2 m, while the estimated trajectories are discontinuity-free and smooth. We also show how lidar measurements can be tightly integrated. 
We believe this is the first system that fuses raw GNSS observations (as opposed to fixes) with lidar in a factor graph." "Continuous and Precise Positioning in Urban Environments by Tightly Coupled Integration of GNSS, INS and Vision","Xingxing Li, Shengyu Li, Yuxuan Zhou, Zhiheng Shen, Xuanbin Wang, Xin Li, Weisong Wen","Wuhan University,Wuhan university,Wuhan University, School of Geodesy and Geomatics,Hong Kong Polytechnic University",Localisation and Mapping,"Accurate, continuous and seamless state estimation is the fundamental module for intelligent navigation applications, such as self-driving cars and autonomous robots. However, it is often difficult for a standalone sensor to fulfill the demanding requirements of precise navigation in complex scenarios. To fill this gap, this paper proposes to exploit the complementariness of the GNSS, inertial measurement unit (IMU) and vision via a tightly coupled integration method, aiming to achieve continuous and accurate navigation in urban environments. Specifically, the raw GNSS carrier phase and pseudorange measurements, IMU data, and visual features are directly fused at the observation level through a centralized Extended Kalman Filter (EKF) to make full use of the multi-sensor information and reject potential outlier measurements. Furthermore, the widely used high-precision GNSS models including precise point positioning (PPP) and real-time kinematic (RTK) are unified in the proposed integrated system to increase usability and flexibility. We validate the performance of the proposed method on several challenging datasets collected in urban canyons and compare against the loosely coupled and state-of-the-art methods." 
360-DFPE: Leveraging Monocular 360-Layouts for Direct Floor Plan Estimation,"Bolivar Solarte, Yueh-Cheng Liu, Chin-hsuan Wu, Yi-hsuan Tsai, Min Sun","National Tsing Hua University,Technical University of Munich,NEC Labs America",Localisation and Mapping,"We present 360-DFPE, a sequential floor plan estimation method that directly takes 360-images as input without relying on active sensors or 3D information. Our approach leverages a loosely coupled integration between a monocular visual SLAM solution and a monocular 360-room layout approach, which estimate camera poses and layout geometries, respectively. Since our task is to sequentially capture the floor plan using monocular images, the entire scene structure, room instances, and room shapes are unknown. To tackle these challenges, we first handle the scale difference between visual odometry and layout geometry via formulating an entropy minimization process, which enables us to directly align 360-layouts without knowing the entire scene in advance. Second, to sequentially identify individual rooms, we propose a novel room identification algorithm that tracks every room along the camera exploration using geometry information. Lastly, to estimate the final shape of the room, we propose a shortest path algorithm with an iterative coarse-to-fine strategy, which improves prior formulations with higher accuracy and faster run-time. Moreover, we collect a new floor plan dataset with challenging large-scale scenes, providing both point clouds and sequential 360-image information. Experimental results show that our monocular solution achieves favorable performance against the current state-of-the-art algorithms that rely on active sensors and require the entire scene reconstru" Autonomous Navigation in Unknown Environments with Sparse Bayesian Kernel-Based Occupancy Mapping,"Thai Duong, Michael Yip, Nikolay A. 
Atanasov","University of California, San Diego",Localisation and Mapping,"This paper focuses on online occupancy mapping and real-time collision checking onboard an autonomous robot navigating in a large unknown environment. Commonly used voxel and octree map representations can be easily maintained in a small environment but have increasing memory requirements as the environment grows. We propose a fundamentally different approach for occupancy mapping, in which the boundary between occupied and free space is viewed as the decision boundary of a machine learning classifier. This work generalizes a kernel perceptron model which maintains a very sparse set of support vectors to represent the environment boundaries efficiently. We develop a probabilistic formulation based on Relevance Vector Machines, handling measurement noise, and probabilistic occupancy classification, supporting autonomous navigation. We provide an online training algorithm, updating the sparse Bayesian map incrementally from streaming range data, and an efficient collision-checking method for general curves, representing potential robot trajectories. The effectiveness of our mapping and collision checking algorithms is evaluated in tasks requiring autonomous robot navigation and active mapping in unknown environments." Multitask Learning for Scalable and Dense Multilayer Bayesian Map Inference,"Lu Gan, Youngji Kim, J.W Grizzle, Jeffrey Walls, Ayoung Kim, Ryan Eustice, Maani Ghaffari","California Institute of Technology,NAVER Labs,University of Michigan,Seoul National University",Localisation and Mapping,"This article presents a novel and flexible multitask multilayer Bayesian mapping framework with readily extendable attribute layers. The proposed framework goes beyond modern metric-semantic maps to provide even richer environmental information for robots in a single mapping formalism while exploiting intralayer and interlayer correlations. 
It removes the need for a robot to access and process information from many separate maps when performing a complex task, advancing the way robots interact with their environments. To this end, we design a multitask deep neural network with attention mechanisms as our front-end to provide heterogeneous observations for multiple map layers simultaneously. Our back-end runs a scalable closed-form Bayesian inference with only logarithmic time complexity. We apply the framework to build a dense robotic map including metric-semantic occupancy and traversability layers. Traversability ground truth labels are automatically generated from exteroceptive sensory data in a self-supervised manner. We present extensive experimental results on publicly available datasets and data collected by a 3D bipedal robot platform and show reliable mapping performance in different environments. Finally, we also discuss how the current framework can be extended to incorporate more information such as friction, signal strength, temperature, and physical quantity concentration using Gaussian map layers. The software for reproducing the presented results or running on" Sigma-FP: Robot Mapping of 3D Floor Plans with an RGB-D Camera under Uncertainty,"Jose Luis Matez-Bandera, Javier Monroy, Javier Gonzalez-jimenez","University of Malaga,University of Málaga",Localisation and Mapping,"This work presents Sigma-FP, a novel 3D reconstruction method to obtain the floor plan of a multi-room environment from a sequence of RGB-D images captured by a wheeled mobile robot. For each input image, the planar patches of visible walls are extracted and subsequently characterized by a multivariate Gaussian distribution in the convenient Plane Parameter Space. 
Then, accounting for the probabilistic nature of the robot localization, we transform and combine the planar patches from the camera frame into a 3D global model, where the planar patches include both the plane estimation uncertainty and the propagation of the robot pose uncertainty. Additionally, processing depth data, we detect openings (doors and windows) in the wall, which are also incorporated in the 3D global model to provide a more realistic representation. Experimental results, in both real-world and synthetic environments, demonstrate that our method outperforms state-of-the-art methods, both in time and accuracy, while just relying on Atlanta world assumption." Continuous-Time Trajectory Estimation for Differentially Flat Systems,"Jacob Johnson, Joshua Mangelson, Randal Beard",Brigham Young University,Localisation and Mapping,"Continuous-time estimation using splines on Lie groups has been gaining traction in the literature due to the ability to incorporate high-frequency sensor data without introducing new optimization parameters. However, evaluating time derivatives and Jacobians of Lie group splines is computationally expensive, limiting their use mainly to offline applications. Motivated by the trajectory planning literature, we develop a new estimation technique that leverages the differential flatness property of many dynamic systems to define the spline in the system’s flat output space, which is often Euclidean. Doing so has the benefit of providing a simple and effective way to include system inputs in the estimation process. We show an example of flatness-based estimation for the unicycle dynamic model. We then show that this new method can achieve similar performance as Lie group spline estimation with significantly less computation time, and validate its use in hardware using a differential-drive robot." 
"IC-GVINS: A Robust, Real-Time, INS-Centric GNSS-Visual-Inertial Navigation System","Xiaoji Niu, Hailiang Tang, Tisheng Zhang, Jing Fan, Liu Jingnan",Wuhan University,Localisation and Mapping,"Visual navigation systems are susceptible to complex environments, while inertial navigation systems (INS) are not affected by external factors. Hence, we present IC-GVINS, a robust, real-time, INS-centric global navigation satellite system (GNSS)-visual-inertial navigation system to fully utilize the INS advantages. The Earth rotation has been compensated in the INS to improve the accuracy of high-grade inertial measurement units (IMUs). To promote the system robustness in high-dynamic conditions, the precise INS information is employed to assist the feature tracking and landmark triangulation. With a GNSS-aided initialization, the IMU, visual, and GNSS measurements are tightly fused in a unified world frame within the factor graph optimization framework. Dedicated experiments were conducted in the public vehicle and private robot datasets to evaluate the proposed method. The results demonstrate that IC-GVINS exhibits superior robustness and accuracy in complex environments. The proposed method with the INS-centric architecture yields improved robustness and accuracy compared to the state-of-the-art methods. We open-source the proposed IC-GVINS and the multisensor datasets on GitHub (https://github.com/i2Nav-WHU/IC-GVINS)." Gyro-Net: IMU Gyroscopes Random Errors Compensation Method Based on Deep Learning,"Yunqi Gao, Dianxi Shi, Ruihao Li, Zhe Liu, Wen Sun","Defense Innovation Institute,National University of Defense Technology,Renmin University of China",Localisation and Mapping,"To solve the problem of inaccurate orientation estimation after long-term operations of the Inertial Measurement Unit (IMU), we present a learning-based method (called Gyro-Net) to estimate and compensate for IMU gyroscope random errors. 
We firstly introduce a semi-dense network structure, which extracts different scale features by IFES Block (IMU Feature Extraction & Selection Block), and adopts skip-connections and transition layers to adjust the feature pipeline. In this way, we can reuse features between different blocks before and after feature extraction, selection and compression. Driven by a proposed absolute and relative loss, the network can be trained and achieve the reduction of cumulative estimated orientation errors. The experimental results in public datasets show that our method can effectively and accurately estimate the orientation from raw IMU data. Moreover, we apply the network output directly to Open-VINS, and the results show that Gyro-Net can improve the accuracy of pose estimation for Open-VINS, especially in scenarios where camera-based estimation often struggles (e.g., fast motion, drastic lighting, viewpoint changes and motion blur)." Self-Supervised Feature Learning for Long-Term Metric Visual Localization,"Yuxuan Chen, Timothy Barfoot",University of Toronto,Localisation and Mapping,"Visual localization is the task of estimating camera pose in a known scene, which is an essential problem in robotics and computer vision. However, long-term visual localization is still a challenge due to the environmental appearance changes caused by lighting and seasons. While techniques exist to address appearance changes using neural networks, these methods typically require ground-truth pose information to generate accurate image correspondences or act as a supervisory signal during training. In this paper, we present a novel self-supervised feature learning framework for visual localization. We use a sequence-based image matching algorithm across different sequences of images (i.e., experiences) to generate image correspondences without ground-truth labels. 
We can then sample image pairs to train a deep neural network that learns sparse features with associated descriptors and scores without ground-truth pose supervision. The learned features can be used together with a classical pose estimator for visual localization. We validate the learned features by integrating with an existing Visual Teach & Repeat pipeline to perform closed-loop localization experiments under different lighting conditions for a total of 22.4 km." GraffMatch: Global Matching of 3D Lines and Planes for Wide Baseline LiDAR Registration,"Parker Lusk, Devarth Parikh, Jonathan Patrick How","Massachusetts Institute of Technology,Ford Motor Company",Localisation and Mapping,"Using geometric landmarks like lines and planes can increase navigation accuracy and decrease map storage requirements compared to commonly-used LiDAR point cloud maps. However, landmark-based registration for applications like loop closure detection is challenging because a reliable initial guess is not available. Global landmark matching has been investigated in the literature, but these methods typically use ad hoc representations of 3D line and plane landmarks that are not invariant to large viewpoint changes, resulting in incorrect matches and high registration error. To address this issue, we adopt the affine Grassmannian manifold to represent 3D lines and planes and prove that the distance between two landmarks is invariant to rotation and translation if a shift operation is performed before applying the Grassmannian metric. This invariance property enables the use of our graph-based data association framework for identifying landmark matches that can subsequently be used for registration in the least-squares sense. 
Evaluated on a challenging landmark matching and registration task using publicly-available LiDAR datasets, our approach yields a 1.7x and 3.5x improvement in successful registrations compared to methods that use viewpoint-dependent centroid and ""closest point"" representations, respectively." Model Learning with Backlash Compensation for a Tendon-Driven Surgical Robot,"Francesco Cursi, Weibang Bai, Eric Yeatman, Petar Kormushev",Imperial College London,Medical and Surgical Robotics,"Robots for minimally invasive surgery are becoming more and more complex, due to miniaturization and flexibility requirements. The vast majority of surgical robots are tendon-driven and this, along with the complex design, causes high nonlinearities in the system which are difficult to model analytically. In this work we analyse how incorporating a backlash model and compensation can improve model learning and control. We therefore propose a backlash compensation technique and a novel Feedforward Artificial Neural Network (ANN) model to learn the kinematics of highly articulated tendon-driven robots. Experimental results show that the proposed backlash compensation is effective in reducing nonlinearities in the system, that compensating for backlash improves model learning and control, and that our proposed ANN outperforms traditional ANN in terms of path tracking accuracy." Simultaneous Online Registration-Independent Stiffness Identification and Tip Localization of Surgical Instruments in Robot-Assisted Eye Surgery,"Ali Ebrahimi, Shahriar Sefati, Peter Gehlbach, Russell H. Taylor, Iulian Iordachita","Johns Hopkins University,Johns Hopkins Medical Institute,The Johns Hopkins University",Medical and Surgical Robotics,"Notable challenges during retinal surgery lend themselves to robotic assistance which has proven beneficial in providing a safe steady-hand manipulation. Efficient assistance from the robots heavily relies on accurate sensing of surgery states (e.g. 
instrument tip localization and tool-to-tissue interaction forces). Many of the existing tool tip localization methods require pre-operative frame registrations or instrument calibrations. In this study using an iterative method, we develop calibration- and registration-independent (RI) algorithms to provide online estimates of instrument stiffness (least squares and adaptive). The estimations are then combined with a state-space model based on the forward kinematics (FWK) of the Steady-Hand Eye Robot (SHER) and Fiber Bragg Grating (FBG) sensor measurements. This is accomplished using a Kalman Filtering (KF) approach to improve the deflected instrument tip position estimations during robot-assisted eye surgery. The conducted experiments demonstrate that when the online RI stiffness estimations are used, the instrument tip localization results surpass those obtained from pre-operative offline calibrations for stiffness." Robot-Assisted Retraction for Transoral Surgery,"Lifeng Zhu, Jiangwei Shen, Shuyan Yang, Song Aiguo",Southeast University,Medical and Surgical Robotics,"Tissue retraction is an important task in head and neck surgery to leave space for surgical operations. Because the contact between the retractor and soft tissue is not trivial to model, the retraction operation has not been well addressed by robots in modern robot-assisted surgery. We propose a human-robot collaboration approach to assist the retraction for transoral surgery. The surgeons only need to roughly place the retractors into the oral cavity and specify the recommended retraction force. Robot manipulators will automatically retract the tissues in a safe way. In order to keep the touching force safe, we employ a force-sensing system at the distal end of the retractor.
By analyzing the real-time force sensor data, we propose a control strategy which combines active retraction angle compensation and passive torque compensation to adjust the retractor, further reducing potential slippage during the retraction. The proposed method ensures that the retraction adapts to unknown perturbations of the human anatomy. Our system requires no extra recognition or calibration of the surgical scene or the human tissue models. The approach is experimentally validated with two robot manipulators and force-sensing retractors on a physical head phantom. We show that the robots stably retract the mouth with a safe retraction force under various configurations." HIFUSK – High Intensity Focused Ultrasound Surgery Based on KUKA Robot,"Andrea Mariani, Laura Morchi, Alessandro Diodato, Selene Tognarelli, Arianna Menciassi","Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna, The BioRobotics Institute,Scuola Superiore Sant'Anna - SSSA",Medical and Surgical Robotics,"HIFUSK (High Intensity Focused Ultrasound Surgery based on KUKA robot) can address unmet clinical needs in the treatment of pathological tissues, such as cancerous tissues. HIFUSK is the marriage of two technologies: robotics and focused ultrasound. Such a combination results in a surgical treatment which is precise thanks to robotics and non-invasive (no incisions, no anesthesia, no ionizing energy) thanks to focused ultrasound. Robotic control and machine learning algorithms ensure safety even in case of target motions during therapy. The platform has been validated in dry-lab/ex-vivo scenarios and it was preliminarily tested in terms of safety and efficacy during an in-vivo test on a large animal model."
Rethinking Feature Extraction: Gradient-Based Localized Feature Extraction for End-To-End Surgical Downstream Tasks,"Winnie Pang, Mobarakol Islam, Sai Mitheran, Lalithkumar Seenivasan, Mengya Xu, Hongliang Ren","National University of Singapore,University College London,Carnegie Mellon University,Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS)",Medical and Surgical Robotics,"Several approaches have been introduced to understand surgical scenes through downstream tasks like captioning and surgical scene graph generation. However, most of them heavily rely on an independent object detector and region-based feature extractor. Encompassing computationally expensive detection and feature extraction models, these multi-stage methods suffer from slow inference speed and inheriting errors from the earlier stages which limit the real-time applications and degrade the performance respectively. This work develops a detector-free gradient-based localized feature extraction approach that enables end-to-end model training for downstream surgical tasks such as report generation and tool-tissue interaction graph prediction. We eliminate the need for object detection or region proposal and feature extraction networks by extracting the features of interest from the discriminative regions using gradient-based localization techniques (e.g., Grad-CAM). We show that our proposed approaches enable the real-time deployment of end-to-end models for surgical downstream tasks. We extensively validate our approach on two surgical tasks: captioning and scene graph generation. The results prove that our gradient-based localized feature extraction methods effectively substitute the detector and feature extractor networks, allowing end-to-end model development with faster inference speed, essential for real-time surgical scene understanding tasks." 
Sim-To-Real Transfer for Visual Reinforcement Learning of Deformable Object Manipulation for Robot-Assisted Surgery,"Paul Maria Scheikl, Eleonora Tagliabue, Balazs Gyenes, Martin Wagner, Diego Dall'Alba, Paolo Fiorini, Franziska Mathis-Ullrich","Friedrich-Alexander-University Erlangen-Nürnberg (FAU),Carl Zeiss AG,Karlsruhe Institute of Technology,Heidelberg University Hospital,University of Verona",Medical and Surgical Robotics,"Automation holds the potential to assist surgeons in robotic interventions, shifting their mental work load from visuomotor control to high level decision making. Reinforcement learning has shown promising results in learning complex visuomotor policies, especially in simulation environments where many samples can be collected at low cost. A core challenge is learning policies in simulation that can be deployed in the real world, thereby overcoming the sim-to-real gap. In this work, we bridge the visual sim-to-real gap with an image-based reinforcement learning pipeline based on pixel-level domain adaptation and demonstrate its effectiveness on an image-based task in deformable object manipulation. We choose a tissue retraction task because of its importance in clinical reality of precise cancer surgery. After training in simulation on domain-translated images, our policy requires no retraining to perform tissue retraction with a 50% success rate on the real robotic system using raw RGB images. Furthermore, our sim-to-real transfer method makes no assumptions on the task itself and requires no paired images. This work introduces the first successful application of visual sim-to-real transfer for robotic manipulation of deformable objects in the surgical field, which represents a notable step towards the clinical translation of cognitive surgical robotics." 
Shape Tracking and Feedback Control of Cardiac Catheter Using MRI-Guided Robotic Platform - Validation with Pulmonary Vein Isolation Simulator in MRI,"Ziyang Dong, Xiaomei Wang, Ge Fang, Zhuoliang He, Justin Di-Lang Ho, Chim Lee Cheung, Wai Lun Tang, Xiaochen Xie, Liyuan Liang, Hing-chiu Chang, Chi Keong Ching, Ka-Wai Kwok","The University of Hong Kong,Harbin Institute of Technology, Shenzhen,National Heart Centre Singapore",Medical and Surgical Robotics,"Cardiac electrophysiology is an effective treatment for atrial fibrillation, in which a long, steerable catheter is inserted into the heart chamber to conduct radiofrequency ablation. Magnetic resonance imaging (MRI) can provide enhanced intra-operative (intra-op) monitoring of the ablation progress as well as the localization of catheter position. In this paper, we designed a shape tracking system that integrates a multi-core FBG fiber and tracking coils with a standard cardiac catheter. Both the shape and positional tracking of the bendable section could be achieved. A learning-based modeling method is developed for cardiac catheters, which uses FBG-reconstructed 3D curvatures for model initialization. The proposed modeling method was implemented on an MRI-guided robotic platform to achieve feedback control of a cardiac catheter. The shape tracking performance was experimentally verified, demonstrating 2.33° average error for each sensing segment and 1.53 mm positional accuracy at the catheter tip. The feedback control performance was tested by autonomous targeting and path following (average deviation of 0.62 mm) tasks. The overall performance of the integrated robotic system was validated by a pulmonary vein isolation (PVI) simulator with ex-vivo tissue ablation, which employed a left atrial (LA) phantom with pulsatile liquid flow. Catheter tracking and feedback control tests were conducted in an MRI scanner, demonstrating the capability of the proposed system under MRI." 
A Generalized Framework for Concentric Tube Robot Design Using Gradient-Based Optimization,"Jui-Te Lin, Cedric Girerd, Jiayao Yan, John T. Hwang, Tania Morimoto","University of California San Diego,University of California, San Diego",Medical and Surgical Robotics,"Concentric tube robots (CTRs) show particular promise for minimally invasive surgery due to their inherent compliance and ability to navigate in constrained environments. Due to variations in anatomy among patients and variations in task requirements among procedures, it is necessary to customize the design of these robots on a patient- or population-specific basis. However, the complex kinematics and large design space make the design problem challenging. Here we propose a computational framework that can efficiently optimize a robot design and a motion plan to enable safe navigation through the patient’s anatomy. The current framework is the first fully gradient-based method for CTR design optimization and motion planning, enabling an efficient and scalable solution for simultaneously optimizing continuous variables, even across multiple anatomies. The framework is demonstrated using two clinical examples, laryngoscopy and heart biopsy, where the optimization problems are solved for a single patient and across multiple patients, respectively." Magnetic Soft Continuum Robots with Braided Reinforcement,"Pete Lloyd, Onaizah Onaizah, Giovanni Pittiglio, Damith Suresh Chathuranga, James Henry Chandler, Pietro Valdastri","University of Leeds,McMaster University,Harvard University",Medical and Surgical Robotics,"Flexible catheters are used in a wide variety of surgical interventions including neurological, pancreatic and cardiovascular. In many cases a lack of dexterity and miniaturization along with excessive stiffness results in large regions of the anatomy being deemed inaccessible. Soft continuum robots have the potential to mitigate these issues. 
Due to its enormous potential for miniaturization, magnetic actuation is of particular interest in this field. Currently, flexible magnetic catheters often rely on forces of anatomical interaction to generate large deformations during navigation and for soft anatomical structures this could be considered potentially damaging. In this study we demonstrate the insertion of a high aspect ratio, 50 mm long by 2 mm diameter, soft magnetic catheter capable of navigating up to a 180 degree bend without the aid of interactive forces. This magnetic catheter is reinforced with a lengthwise braided structure and its magnetization allows it to shape form along tortuous paths. We demonstrate our innovation in a planar silicone pancreas phantom. We also compare our approach with a mechanically equivalent tip driven magnetic catheter and with an identically magnetized, unreinforced catheter." Shape Sensing of Flexible Robots Based on Deep Learning,"Xuan Thao Ha, Di Wu, Mouloud Ourak, Gianni Borghesan, Jenny Dankelman, Arianna Menciassi, Emmanuel B Vander Poorten","KU Leuven,University of Leuven,TU Delft,Scuola Superiore Sant'Anna - SSSA",Medical and Surgical Robotics,"In this paper, a deep learning method for shape sensing of continuum robots based on multi-core Fiber Bragg Grating (FBG) fiber is introduced. The proposed method, based on an Artificial Neural Network (ANN), differs from traditional approaches, where accurate shape reconstruction requires a tedious characterization of many characteristic parameters. A further limitation of traditional approaches is that they either require multiple fibers, whose location relative to the centerline must be precisely known (calibrated) or a single multi-core fiber whose position typically coincides with the neutral line. The proposed method addresses this limitation and thus allows shape sensing based on a single multi-core fiber placed off-center. 
This helps in miniaturizing and leaves the central channel available for other purposes. The proposed approach was compared to a recent state-of-the-art model-based shape sensing approach. A 2-DOF bench-top fluidics-driven catheter system was built to validate the proposed ANN. The proposed ANN-based shape sensing approach was evaluated on a 40 mm long steerable continuum robot in both 3D free-space and 2D constrained environments, yielding an average shape sensing error of 0.24 mm and 0.49 mm, respectively. With these results, the superiority of the proposed approach compared to the recent model-based shape sensing method was demonstrated." Multifingered Grasping Based on Multimodal Reinforcement Learning,"Hongzhuo Liang, Lin Cong, Norman Hendrich, Shuang Li, Fuchun Sun, Jianwei Zhang","University of Hamburg,Tsinghua University",Grasping and Micromanipulation,"In this work, we tackle the challenging problem of grasping novel objects using a high-DoF anthropomorphic hand-arm system. Combining fingertip tactile sensing, joint torques and proprioception, a multimodal agent is trained in simulation to learn the finger motions and to determine when to lift an object. Binary contact information and level-based joint torques simplify transferring the learned model to the real robot. To reduce the exploration space, we first generate postural synergies by collecting a dataset covering various grasp types and using principal component analysis. Curriculum learning is further applied to adjust and randomize the initial object pose based on the training performance. Simulation and real robot experiments with dedicated initial grasping poses show that our method outperforms two baseline models in the grasp success rate both for seen and unseen objects. This learning approach further serves as a fundamental technology for complex in-hand manipulations based on the multi-sensory system." 
Planning of Power Grasps Using Infinite Program under Complementary Constraints,"Zherong Pan, Duo Zhang, Changhe Tu, Xifeng Gao","Tencent America,New York University,Shandong University",Grasping and Micromanipulation,"We propose an optimization-based approach to plan power grasps. Central to our method is a reformulation of grasp planning as an infinite program under complementary constraints (IPCC), which allows contacts to happen between arbitrary pairs of points on the object and the robot gripper. We show that IPCC can be reduced to a conventional finite-dimensional nonlinear program (NLP) using a kernel-integral relaxation. Moreover, the values and Jacobian matrices of the kernel-integral can be evaluated efficiently using a modified Fast Multipole Method (FMM). We further guarantee that the planned grasps are collision-free using primal barrier penalties. We demonstrate the effectiveness, robustness, and efficiency of our grasp planner on a row of challenging 3D objects and high-DOF grippers, such as Barrett Hand and Shadow Hand, where our method achieves superior grasp qualities over competitors." A Soft Barometric Tactile Sensor to Simultaneously Localize Contact and Estimate Normal Force with Validation to Detect Slip in a Robotic Gripper,"Thomas De Clercq, Anatolii Sianov, Guillaume Crevecoeur","Ghent University,University of Gent, EELAB",Grasping and Micromanipulation,"Soft tactile sensing technologies provide the potential to endow a sense of touch to robots for grasping and manipulating objects. During the execution of such tasks, having accurate knowledge on the contact location and quantitative forces in broad exerted force ranges is key. For industrial adoption such sensor needs to be low-cost, robust with limited or no calibration procedures and easy to manufacture. 
In this work we present a microelectromechanical (MEMS) based barometric sensor array covered with an elastomer layer, with the sensor signals being interpreted in real-time on the basis of a parameterized Gaussian type of distribution. The contact location is determined by finding in real-time the corresponding parameters of the Gaussian distribution, which in turn is used for normal contact force estimation. Results show accuracies in terms of localization of 0.5 mm and normal force errors of 10% in force ranges up to 25 N and 15% in high force ranges of 25-50 N. The proposed soft tactile sensor has furthermore been validated to provide the ability to detect slip when gripping various objects." Learning Efficient Policies for Picking Entangled Wire Harnesses: An Approach to Industrial Bin Picking,"Xinyi Zhang, Yukiyasu Domae, Weiwei Wan, Kensuke Harada","Osaka University,The National Institute of Advanced Industrial Science and Techno",Grasping and Micromanipulation,"Wire harnesses are essential connecting components in the manufacturing industry but are challenging to automate in industrial tasks such as bin picking. They are long, flexible and tend to get entangled when randomly placed in a bin. This makes it difficult for the robot to pick a single one from the clutter. Besides, training or collecting data in simulation is challenging due to the difficulties in modeling wire harnesses containing both deformable cables and rigid components. In this work, instead of directly lifting wire harnesses, we propose to grasp and extract the target following circle-like trajectories until it is separated from the clutter. We learn a policy from real-world data to infer the optimal grasp and separation action from visual observation. Our policy enables the robot to efficiently pick and separate the entangled wire harnesses by maximizing success rates and reducing the execution time.
To evaluate our policy, we present a set of real-world experiments on picking wire harnesses. Results show a significant improvement in success rates from 49.2% to 84.6% over baseline. We also evaluate the effectiveness of our policy under different clutter scenarios using unseen types of wire harnesses. The proposed method is a feasible approach to utilize wire harnesses for industrial bin picking." A Novel Scaffold Reinforced Actuator with Tunable Attitude Ability for Grasping,"Pei Jiang, Ji Luo, Jiaxing Li, Michael Z. Q. Chen, Yonghua Chen, Yang Yang, Rui Chen","Chongqing University,Nanjing University of Science and Technology,The University of Hong Kong,Nanjing University of Information Science and Technology",Grasping and Micromanipulation,"Due to high compliance, adaptiveness, and easy controllability, soft actuators are widely adopted in soft grippers to grasp irregularly shaped or fragile objects. The specific motions can be pre-programmed into the flexible and constrained structures of the actuator, which provides an inexpensive and convenient method for desired motions. However, most preprogrammed structures cannot change the constraints on the actuator to achieve different kinds of deformations, which limits the motion diversities of actuators. This paper proposes a scaffold reinforcement mechanism where rotatable scaffolds distribute on the surface of the soft structure. The orientation adjustments of the scaffolds can change the deformation constraint of the actuator, which results in different kinds of motions. Based on the scaffold reinforcement mechanism, a scaffold-reinforced actuator is proposed, which can achieve bending motion and complex helical motion in the three-dimensional space by properly adjusting the orientation of the scaffolds. 
Additionally, both kinematic and mechanical models are proposed to forecast the behavior of the actuator when driven by cable displacement or tension." Deep Learning Reactive Robotic Grasping with a Versatile Vacuum Gripper,"Hui Zhang, Jef Peeters, Eric Demeester, Karel Kellens",KU Leuven,Grasping and Micromanipulation,"In this paper, a 6-step approach is proposed to simulate the grasp and evaluate the grasp quality for a versatile vacuum gripper by tracking the deformation and force-torque wrench of the gripping pad. Over 100 K synthetic grasps are generated for neural network training. Furthermore, a Gripping Attention Convolutional Neural Network (GA-CNN) is developed to predict the grasp quality for real-world grasps, running in 15 Hz closed-loop control with real-time robotic observation and force-torque feedback. Various experiments in both simulation and physical grasps indicate that our GA-CNN can focus on the crucial region of the soft gripping pad to predict grasp qualities and achieve a lower average error compared with a same-scale traditional CNN. Additionally, the complexity of grasping clutters is defined from Level 1 to Level 9. The proposed grasping method achieves an average success rate of 90.2% for the static clutters at Level 1 to Level 8 and an average success rate of >80.0% for the dynamic grasping at Level 1 to Level 7, which outperforms state-of-the-art grasping methods. Code and video demo are available at https://github.com/huikul/DexterousGrasp." An Unconstrained Convex Formulation of Compliant Contact,"Alejandro Castro, Frank Permenter, Xuchen Han",Toyota Research Institute,Grasping and Micromanipulation,"We present a convex formulation of compliant frictional contact and a robust, performant method to solve it in practice. By analytically eliminating contact constraints, we obtain an unconstrained convex problem. Our solver has proven global convergence and warm-starts effectively, enabling simulation at interactive rates.
We develop compact analytical expressions of contact forces allowing us to describe our model in clear physical terms and to rigorously characterize our approximations. Moreover, this enables us not only to model point contact, but also to incorporate sophisticated models of compliant contact patches. Our time stepping scheme includes the midpoint rule, which we demonstrate achieves second order accuracy even with frictional contact. We introduce a number of accuracy metrics and show our method outperforms existing commercial and open source alternatives without sacrificing accuracy. Finally, we demonstrate robust simulation of robotic manipulation tasks at interactive rates, with accurately resolved stiction and contact transitions, as required for meaningful sim-to-real transfer. Our method is implemented in the open source robotics toolkit Drake." Robotic Manipulation of Sperm As a Deformable Linear Object,"Changsheng Dai, Guanqiao Shan, Hang Liu, Changhai Ru, Yu Sun","Dalian University of Technology,University of Toronto,Soochow University",Grasping and Micromanipulation,"Robotic manipulation of deformable linear objects is a classic and challenging topic. Apart from synthetic objects such as wires and cables, linear objects are also commonly found in biological cells and organisms. Biomanipulation of such objects is hampered by difficulties such as limited degrees of freedom of micromanipulators and varied mechanical properties of the biological entities to manipulate. This paper presents robotic manipulation of human sperm, which are deformable cells with a linear shape. The shape and movement of the cell are recapitulated by our developed geometric and kinematic models. Under unfixed constraints between the end-effector and cell, path planning is designed to update the manipulation point to control cell deformation. A state transition function is formulated in path planning to handle stiffness variations of sperm without force sensing. 
A model predictive controller is designed to minimize the orientation error and manipulation path length. To detect sperm tail for visual feedback, an accuracy of 98% was achieved via deep neural networks. Robotic manipulation of human sperm was performed using a standard clinical setup of a glass micropipette to rotate a sperm to the target orientation. Experimental results showed that robotic sperm rotation achieved an orientation error of 0.8°, tail curvedness of 0.14 µm⁻¹ and operation time of 5.6 s, all significantly less than those of the manual approach." Robotic Rotational Positioning of End-Effectors for Micromanipulation,"Songlin Zhuang, Changsheng Dai, Guanqiao Shan, Changhai Ru, Zhuoran Zhang, Yu Sun","Yongjiang Laboratory,Dalian University of Technology,University of Toronto,Soochow University,The Chinese University of Hong Kong, Shenzhen",Grasping and Micromanipulation,"Precise rotational positioning of end-effectors under microscopy is crucial for robotic micromanipulation. However, the end-effector is presently limited to a fixed orientation which is manually set before a given micromanipulation task, lacking accuracy and versatility of in-situ re-orientation. In this paper, we present a unified framework for rotationally positioning the end-effector in three dimensions by establishing a general rotational model, developing a detection method within the limited field of view under microscopy, and designing a three-loop control strategy that adapts to different experimental requirements and model parameters. In experiments, a standard angled micropipette was used as the end-effector to verify the validity of the proposed methods. The performance was evaluated experimentally where the micropipette was robotically rotated to an arbitrarily desired orientation with an average orientation error less than 2°.
In the experiments of sperm manipulation, the in-situ micropipette orientation control capability improved the success rate of sperm immobilization and achieved dexterous robotic sperm orientation for facile aspiration." Comparing EMG Continuous Movement Decoding with Joints Unconstrained and Constrained,"Lizhi Pan, Zhongyi Ding, Jianmin Li",Tianjin University,"Prosthetics, Exoskeletons and Rehabilitation","decreased from 0.89 to 0.82 and 0.86 to 0.52, respectively. The average NRMSE values of the wrist and MCP flexion/extension increased from 0.18 to 0.21 and 0.22 to 0.31, respectively. The results demonstrated that the movements of the MCP and the wrist joints have a significant effect on the continuous movement decoding. Our study revealed a potential factor inducing the poor performance of continuous movement decoding in amputees." Design and Validation of a Polycentric Hybrid Knee Prosthesis with Electromagnet-Controlled Mode Transition,"Xu Wang, Haohua Xiu, Yao Zhang, Wei Liang, Wei Chen, Guowu Wei, Lei Ren, Luquan Ren","Jilin University,Ningbo University of Technology,Salford University,University of Manchester","Prosthetics, Exoskeletons and Rehabilitation","A hybrid knee prosthesis is proposed in this paper, which consists of a polycentric structure in passive mode for low-torque activities and a single-axis structure in active mode for high-torque activities. A novel mode transition mechanism controls self-holding electromagnets for switching modes between the four-bar linkage and single-axis structure. Compared with the conventional single-axis hybrid knee, the four-bar polycentric mechanism with varying instantaneous center of rotation (ICR) can enhance the geometric stability and increase the toe clearance in passive mode. For active mode, we developed a custom embedded electric system, employed torque control for stance and position control for swing. 
The results of bench tests indicated that the bandwidth of the controller was suitable for locomotion. The clinical test of level ground walking without sudden buckling and stumble was validated by three subjects. Regarding climbing stairs, a typical high-torque activity in daily locomotion, all subjects reach the maximum knee torque around 0.95 Nm/kg comparable to the able-bodied." "Powered Knee and Ankle Prosthesis with Adaptive Control Enables Climbing Stairs with Different Stair Heights, Cadences, and Gait Patterns","Sarah Hood, Lukas Gabert, Tommaso Lenzi",University of Utah,"Prosthetics, Exoskeletons and Rehabilitation","Powered prostheses can enable individuals with above-knee amputations to ascend stairs step-over-step. To accomplish this task, available stair ascent controllers impose a pre-defined joint impedance behavior or follow a pre-programmed position trajectory. These control approaches have proved successful in the laboratory. However, they are not robust to changes in stair height or cadence, which is essential for real-world ambulation. Here we present an adaptive stair ascent controller that enables individuals with above-knee amputations to climb stairs of varying stair heights at their preferred cadence and with their preferred gait pattern. We found that modulating the prosthesis knee and ankle position as a function of the user’s thigh in swing provides toe clearance for varying stair heights. In stance, modulating the torque-angle relationship as a function of the prosthesis knee position at foot contact provides sufficient torque assistance for climbing stairs of different heights. Furthermore, the proposed controller enables individuals to climb stairs at their preferred cadence and gait pattern, such as step-by-step, step-over-step, and two-steps." 
"Design, Control, and Experimental Evaluation of a Novel Robotic Glove System for Patients with Brachial Plexus Injuries","Wenda Xu, Yunfei Guo, Cesar Bravo, Pinhas Ben-Tzvi","Virginia Tech,Carilion Clinic Institute of Orthopaedics and Neurosciences","Prosthetics, Exoskeletons and Rehabilitation","This paper presents the development of an exoskeleton glove system for people who suffer from brachial plexus injuries, aiming to assist their lost grasping functionality. The robotic system consists of a portable glove system and an embedded controller. The glove system consists of Linear Series Elastic Actuators (LSEA), Rotary Series Elastic Actuators (RSEA), and optimized finger linkages to provide imitated human motion to each finger and a coupled motion of the hand. The design principles and optimization strategies were investigated to balance functionality, portability, and stability. The model-based force control strategy compensated with a backlash model and model-free force control strategy are presented and compared. Results show that our proposed model-free control method achieves the goal of accurate force control. Finally, experiments were conducted with the prototype of the developed integrated exoskeleton glove system. Results from 3 subjects with 150 trials show that our proposed exoskeleton glove system has the potential to be used as a rehabilitation device for patients." Data-Driven Variable Impedance Control of a Powered Knee-Ankle Prosthesis for Adaptive Speed and Incline Walking,"T. Kevin Best, Cara Gonzalez Welker, Elliott Rouse, Robert D. Gregg","University of Michigan,University of Colorado Boulder","Prosthetics, Exoskeletons and Rehabilitation","Most impedance-based walking controllers for powered knee-ankle prostheses use a finite state machine with dozens of user-specific parameters that require manual tuning by technical experts. These parameters are only appropriate near the task (e.g. 
walking speed and incline) at which they were tuned, necessitating many different parameter sets for variable-task walking. In contrast, this paper presents a data-driven, phase-based controller for variable-task walking that uses continuously-variable impedance control during stance and kinematic control during swing to enable biomimetic locomotion. After generating a data-driven model of variable joint impedance with convex optimization, we implement a novel task-invariant phase variable and real-time estimates of speed and incline to enable autonomous task adaptation. Experiments with above-knee amputee participants (N=2) show that our data-driven controller 1) features highly-linear phase estimates and accurate task estimates, 2) produces biomimetic kinematic and kinetic trends as task varies, leading to low errors relative to able-bodied references, and 3) produces biomimetic joint work and cadence trends as task va" NESM-Gamma: An Upper-Limb Exoskeleton with Compliant Actuators for Clinical Deployment,"Jun Pan, Davide Astarita, Andrea Baldoni, Filippo Dell'agnello, Simona Crea, Nicola Vitiello, Emilio Trigili","Zhejiang University of Technology,Scuola Superiore Sant'Anna,ISTITUTO DI BIOROBOTICA,Scuola Superiore Sant'Anna, The BioRobotics Institute,Scuola Superiore Sant Anna","Prosthetics, Exoskeletons and Rehabilitation","This letter describes the design and characterization of an upper-limb exoskeleton for post-stroke rehabilitation. The platform interacts with the shoulder and elbow of the user through four active joints, driven by series elastic actuators (SEAs) with custom springs to achieve compactness and ease of maintenance. The exoskeleton adopts a passive kinematic chain for aligning the user’s and robot joints’ rotation axes, and a quick flipping mechanism to enable dual-side use. The pole-placement method based on the dynamic model of the SEA was used to design the low-level controller, to guarantee torque control precision and stability. 
The joint load due to the robot's gravity is counteracted by using a feed-forward gravity compensation algorithm. Experimental characterization demonstrates the torque control bandwidth up to 10 Hz and highly transparent behavior of the joints (namely, close to null parasitic impedance) at least up to 2 Hz, showing suitability for rehabilitation purposes." "Design, Development, and Control of a Hand/Wrist Exoskeleton for Rehabilitation and Training","Mihai Dragusanu, Muhammad Zubair Iqbal, Tommaso Lisini Baldi, Domenico Prattichizzo, Monica Malvezzi",University of Siena,"Prosthetics, Exoskeletons and Rehabilitation","Robotic devices for rehabilitation and training is a promising and challenging research topic with a potentially huge social impact. The availability of tools for autonomously performing physiotherapy exercises increases their efficiency, provides supplementary information about results and progress, reduces physiotherapist’s efforts and the need of their physical presence during exercise sessions, and encourages autonomy and independence in people with disabilities. Nevertheless, supportive technologies developed without the inputs and feedback of the end-user throughout the design process, are less likely to be adopted for their intended purpose and use case. In this paper, we propose a modular hand/wrist exoskeleton, that actuates the wrist’s flexion/extension and adduction/abduction motions and hand fingers’ flexion/extension motions. It is designed to be wearable and easy to control and manage and can be used by the patient in collaboration with the therapist or autonomously. A user-centered design perspective has been employed in all the design and development phases. 
The paper introduces the main features of the device and presents some tests conducted with" Markovian Transparency Control of an Exoskeleton Robot,"Felix Mauricio Escalante, Leonardo Felipe Dos Santos, Yecid Moreno, Adriano A G Siqueira, Marco Henrique Terra, Thiago Boaventura","University of São Paulo,University of Sao Paulo","Prosthetics, Exoskeletons and Rehabilitation","In wearable robotics, certain applications require the robot to be transparent, i.e., imperceptible to the user. This is a very difficult cooperative control task due to the inherent coupling between human and robot, unpredictable human movements, and user-dependent behavior. In this letter, we propose a novel transparency controller based on discrete-time Markovian jump linear systems to minimize the human-robot interaction forces of an exoskeleton robot during walking. Our model-based stochastic control approach describes a gait cycle as an event-dependent Markov chain and uses a given transition matrix to switch between them. An IMU-based Kalman filter is used to perform real-time human state estimation and gait phase detection. The robustness and effectiveness of the proposed controller are demonstrated with experiments on a lower-limb exoskeleton driven by series elastic actuators." ArmAssist: A Telerehabilitation Solution for Upper-Limb Rehabilitation at Home,"Ainara Garzo, Je Hyung Jung, Javier Arcas Ruiz-ruano, Joel C. Perry, Thierry Keller","TECNALIA, Basque Research and Technology Alliance (BRTA),University of Idaho,FUNDACION TECNALIA Research & Innovation","Prosthetics, Exoskeletons and Rehabilitation","ArmAssist (AA), developed by TECNALIA, is a telerehabilitation platform aiming to help post-stroke subjects maintain the rehabilitation of the upper-limb at home. The AA includes robotic modules with multiple sensors to train and measure the users’ voluntary movements. 
An assessment platform based on serious games is also included to not only engage the user but also perform automated evaluation of arm and hand function and their evolution over time. Moreover, the AA facilitates at-home rehabilitation with limited remote supervision by the therapist. In the present paper, the technical specifications and developments of the AA are described. Additionally, a summary of the outcomes of a usability evaluation of the AA is presented." "A Soft, Wearable Skin-Brace for Assisting Forearm Pronation and Supination with a Low-Profile Design","Huimin Su, Kyoung-soub Lee, Yusung Kim, Hyung-Soon Park","Korea Advanced Institute of Science and Technology,Korea Advanced Institute of Science and Technology (KAIST)","Prosthetics, Exoskeletons and Rehabilitation","Neurological diseases affect more than 50 million people living with persistent upper limb impairments worldwide, and 40% of patients remain impaired even after intensive hospital care. Wearable robots have been developed for enhancing functionality in activities of daily living (ADL). The role of forearm movement is as crucial as the hand and wrist. However, the coincidence of the rotation axis with the forearm presents design challenges, and thus there are few studies on wearable devices for assisting forearm pronation/supination movement. In this study, we propose a bioinspired soft wearable skin-brace for the forearm pronation/supination to assist patients in ADL and perform rehabilitation training. The low-profile design that can fit the forearm like clothing combines features of a rail structure and tendon driving mechanism with high transfer efficiency. We conducted a biomechanical analysis to quantify the movement of wrist landmarks during forearm rotation, as well as a performance evaluation and user tests with our prototype. We demonstrate that it can provide patients with sufficient and compliant support in terms of range of motion and output torque. 
The user-friendly design allows patients to don and doff the device independently and quickly, as well as for individual customization. With the advantage of not restricting other joint movements, the device also shows the possibility of co-use with other rehabilitation devices." Teachers in Concordance for Pseudo-Labeling of 3D Sequential Data,"Awet Haileslassie Gebrehiwot, Patrik Vacek, David Hurych, Karel Zimmermann, Patrick Perez, Tomas Svoboda","Czech Technical University in Prague,Ceske vysoke uceni technicke v Praze - Fakulta elektrotechnicka,Valeo,Czech Technical University Prague,valeo,Faculty of Electrical Engineering, Czech Technical University in Prague",Optimal Control and Object Detection,"Automatic pseudo-labeling is a powerful tool to tap into large amounts of sequential unlabeled data. It is especially appealing in safety-critical applications of autonomous driving, where performance requirements are extreme, datasets are large, and manual labeling is very challenging. We propose to leverage sequences of point clouds to boost the pseudo-labeling technique in a teacher-student setup via training multiple teachers, each with access to different temporal information. This set of teachers, dubbed Concordance, provides higher quality pseudo-labels for student training than standard methods. The output of multiple teachers is combined via a novel pseudo-label confidence-guided criterion. Our experimental evaluation focuses on the 3D point cloud domain and urban driving scenarios. We show the performance of our method applied to 3D semantic segmentation and 3D object detection on three benchmark datasets. Our approach, which uses only 20% manual labels, outperforms some fully supervised methods. A notable performance boost is achieved for classes rarely appearing in training data. Our codes will be made publicly available."
Automatic Labeling to Generate Training Data for Online LiDAR-Based Moving Object Segmentation,"Xieyuanli Chen, Benedikt Mersch, Lucas Nunes, Rodrigo Marcuzzi, Ignacio Vizzo, Jens Behley, Cyrill Stachniss","National University of Defense Technology,University of Bonn",Optimal Control and Object Detection,"Understanding the scene is key for autonomously navigating vehicles, and the ability to segment the surroundings online into moving and non-moving objects is a central ingredient of this task. Often, deep learning-based methods are used to perform moving object segmentation (MOS). The performance of these networks, however, strongly depends on the diversity and amount of labeled training data, information that may be costly to obtain. In this paper, we propose an automatic data labeling pipeline for 3D LiDAR data to save the extensive manual labeling effort and to improve the performance of existing learning-based MOS systems by automatically annotating training data. Our proposed approach achieves this by processing the data offline in batches, i.e., it is not designed to run online on a vehicle. It labels the actually moving objects such as driving cars and pedestrians as moving. In contrast, the non-moving objects, e.g., parked cars, lamps, roads, or buildings, are labeled as static. We show that this approach allows us to label LiDAR data highly effectively and compare our results to those of other label generation methods. We also train a deep neural network with our automatically generated labels and achieve comparable performance to the one trained with manual labels on the same data, and an even better performance when using additional datasets with labels generated by our approach. 
Furthermore, we evaluate our method on multiple datasets using different sensors, and ou" Uncertainty for Identifying Open-Set Errors in Visual Object Detection,"Dimity Miller, Niko Suenderhauf, Michael J Milford, Feras Dayoub","Queensland University of Technology,The University of Adelaide",Optimal Control and Object Detection,"Deployed into an open world, object detectors are prone to open-set errors, false positive detections of object classes not present in the training dataset. We propose GMM-Det, a real-time method for extracting epistemic uncertainty from object detectors to identify and reject open-set errors. GMM-Det trains the detector to produce a structured logit space that is modelled with class-specific Gaussian Mixture Models. At test time, open-set errors are identified by their low log-probability under all Gaussian Mixture Models. We test two common detector architectures, Faster R-CNN and RetinaNet, across three varied datasets spanning robotics and computer vision. Our results show that GMM-Det consistently outperforms existing uncertainty techniques for identifying and rejecting open-set detections, especially at the low-error-rate operating point required for safety-critical applications. GMM-Det maintains object detection performance, and introduces only minimal computational overhead. We also introduce a methodology for converting existing object detection datasets into specific open-set datasets to evaluate open-set performance in object detection." Bounds on Optimal Revisit Times in Persistent Monitoring Missions with a Distinct & Remote Service Station,"Sai Krishna Kanth Hari, Sivakumar Rathinam, Swaroop Darbha, Krishna Kalyanam, Satyanarayana Gupta Manyam, David Casbeer","Los Alamos National Laboratory,TAMU,NASA Ames Research Center,Infoscitex corp.,AFRL",Optimal Control and Object Detection,"Persistent monitoring missions require up-to-date knowledge of the changing state of the underlying environment. 
UAVs can be gainfully employed to continually visit a set of targets representing tasks (and locations) in the environment and collect data therein for long time periods. The enduring nature of these missions requires the UAV to be regularly recharged at a service station. In this paper, we consider the case in which the service station is not co-located with any of the targets. Efficient monitoring requires the revisit time, defined as the maximum of the time elapsed between successive revisits to targets, to be minimized. Here, we consider the problem of determining UAV routes that lead to the minimum revisit time. The problem is NP-hard, and its computational difficulty increases with the fuel capacity of the UAV. We develop an algorithm to construct near-optimal solutions to the problem quickly, when the fuel capacity exceeds a threshold. We also develop lower bounds to the optimal revisit time and use these bounds to demonstrate (through numerical simulations) that the constructed solutions are, on an average, at most 0.01 % away from the optimum." Force Sharing Problem During Gait Using Inverse Optimal Control,"Filip Becanovic, Vincent Bonnet, Raphaël Dumas, Kosta Jovanovic, Samer Mohammed","Université Paris-Est Créteil, University of Belgrade,University Paul Sabatier,University Gustave Eiffel,University of Belgrade, Serbia,University of Paris Est Créteil - (UPEC)",Optimal Control and Object Detection,"Human gait patterns have been intensively studied, both from medical and engineering perspectives, to understand and compensate pathologies. However, the muscle-force sharing problem is still debated as acquiring individual muscle force measurements is challenging, requiring the use of invasive devices. Recent studies, using various objective functions, suggest muscle-force sharing may result from an optimization process. This study proposes using inverse optimal control to identify an objective function. 
Two popular methods of inverse optimal control, bilevel and inverse Karush-Kuhn-Tucker, were investigated. The identified objective functions were then used to predict muscle forces during gait, and their performances were compared to an exhaustive list of biological cost functions from the literature. The best prediction was achieved by the bilevel inverse optimal control method, with a root-mean-squared error of 176N (162N) and a correlation coefficient of 0.76 (0.68) for the stance (swing) phase of the gait cycle. These muscle force predictions were thereafter used to compute joint stiffness, exhibiting an average root-mean-square error of 42 Nm.rad−1 and a correlation coefficient of 0.90 when compared to the reference. The bilevel method’s prevalence in terms of robustness over inverse Karush-Kuhn-Tucker was demonstrated on human data and explained on a toy example." Data-Driven Iterative Optimal Control for Switched Dynamical Systems,"Yuqing Chen, Yangzhi Li, David Braun","Xi'an Jiaotong-Liverpool University,Singapore University of Technology and Design,Vanderbilt University",Optimal Control and Object Detection,This paper presents a data-driven algorithm to compute optimal control inputs for input constrained nonlinear optimal control problems with switched dynamics. We consider multi-stage optimal control problems where the control inputs and the switching instants are both unknown. Our key contribution is the new iterative online optimal control algorithm which mitigates sub-optimal control caused by model bias in the challenging class of under-actuated and intrinsically unstable switched dynamical systems. This is achieved by estimating the cost and computing the control inputs along measured trajectories of the controlled system instead of doing the same procedure along error-prone trajectories predicted by an inexact model. The algorithm is evaluated using an under-actuated and intrinsically unstable hopping robot in a simulation environment. 
The algorithm enables real-time data-driven optimal control using inaccurate models. BiConMP: A Nonlinear Model Predictive Control Framework for Whole Body Motion Planning,"Avadesh Meduri, Paarth Shah, Julian Viereck, Majid Khadiv, Ioannis Havoutis, Ludovic Righetti","New York University,University of Oxford,Max Planck Institute for Intelligent Systems",Optimal Control and Object Detection,"Online planning of whole-body motions for legged robots is challenging due to the inherent nonlinearity in the robot dynamics. In this work, we propose a nonlinear MPC framework, the BiConMP which can generate whole body trajectories online by efficiently exploiting the structure of the robot dynamics. BiConMP is used to generate various cyclic gaits on a real quadruped robot and its performance is evaluated on different terrain, countering unforeseen pushes and transitioning online between different gaits. Further, the ability of BiConMP to generate non-trivial acyclic whole-body dynamic motions on the robot is presented. The same approach is also used to generate various dynamic motions in MPC on a humanoid robot (Talos) and another quadruped robot (AnYmal) in simulation. Finally, an extensive empirical analysis on the effects of planning horizon and frequency on the nonlinear MPC framework is reported and discussed." Environment Warped Gait Trajectory Optimization for Complex Terrains,"Zherong Pan, Tan Chen, Xifeng Gao, Wu Kui","Tencent America,Michigan Technological University,Tencent",Optimal Control and Object Detection,"Contact-Rich gait trajectory optimization is a challenging non-convex programming problem, especially for complex terrain shapes, where prominent numerical algorithms can fail to find a solution or fall into local minima. We propose an environment warping technique to alleviate this issue. 
Given a terrain of some general shape, our method first generates a locally injective, as-conformal-as-possible function that maps the ambient space around the terrain to a warped space. We then formulate the trajectory optimization in the warped space by remapping all the decision variables. Our method frees the numerical optimizer from tuning the trajectories to fit changing terrain shapes, leading to better numerical conditioning and fewer local minima. Numerical results show that our method outperforms direct trajectory optimization in terms of both success rates and quality of solutions." Differential Dynamic Programming with Nonlinear Safety Constraints under System Uncertainties,"Gokhan Alcan, Ville Kyrki",Aalto University,Optimal Control and Object Detection,"Safe operation of systems such as robots requires them to plan and execute trajectories subject to safety constraints. When those systems are subject to uncertainties in their dynamics, it is challenging to ensure that the constraints are not violated. In this paper, we propose Safe-CDDP, a safe trajectory optimization and control approach for systems under additive uncertainties and non-linear safety constraints based on constrained differential dynamic programming (DDP). The safety of the robot during its motion is formulated as chance constraints with user-chosen probabilities of constraint satisfaction. The chance constraints are transformed into deterministic ones in DDP formulation by constraint tightening. To avoid over-conservatism during constraint tightening, linear control gains of the feedback policy derived from the constrained DDP are used in the approximation of closed-loop uncertainty propagation in prediction. The proposed algorithm is empirically evaluated on three different robot dynamics with up to 12 degrees of freedom in simulation. The computational feasibility and applicability of the approach are demonstrated with a physical hardware implementation." 
ViTAL: Vision-Based Terrain-Aware Locomotion for Legged Robots,"Shamel Fahmi, Victor Barasuol, Domingo Esteban, Octavio Antonio Villarreal Magaña, Claudio Semini","Massachusetts Institute of Technology,Istituto Italiano di Tecnologia,ANYbotics AG",Optimal Control and Object Detection,"This work is on vision-based planning strategies for legged robots that separate locomotion planning into foothold selection and pose adaptation. Current pose adaptation strategies optimize the robot’s body pose relative to given footholds. If these footholds are not reached, the robot may end up in a state with no reachable safe footholds. Therefore, we present a Vision-Based Terrain-Aware Locomotion (ViTAL) strategy that consists of novel pose adaptation and foothold selection algorithms. ViTAL introduces a different paradigm in pose adaptation that does not optimize the body pose relative to given footholds, but the body pose that maximizes the chances of the legs in reaching safe footholds. ViTAL plans footholds and poses based on skills that characterize the robot’s capabilities and its terrain-awareness. We use the 90 kg HyQ and 140 kg HyQReal quadruped robots to validate ViTAL, and show that they are able to climb various obstacles including stairs, gaps, and rough terrains at different speeds and gaits. We compare ViTAL with a baseline strategy that selects the robot pose based on given selected footholds, and show that ViTAL outperforms the baseline." Experimental Study on Accurate Calibration for Industrial Robot Via Integrated Extended Kalman and Beetle Antennae Search,"Zhibing Li, Shuai Li, Xin Luo","Chongqing Institute of Green and Intelligent Technology,Chinese ,Hong Kong Polytechnic University,Chongqing Institute of Green and Intelligent Technology, Chinese","Calibration, Identification, and Simulation","Over the past decades, industrial robots have been widely used in aerospace, electronics, medical and other fields. 
However, the absolute positioning error of an uncalibrated robot can reach several millimeters, which cannot meet the accuracy requirements of advanced industry. To address this issue, it is necessary to calibrate the robot. Therefore, based on the DH model, a new robot calibration method combining the extended Kalman filter with a quadratic interpolation beetle antennae search algorithm is proposed. Firstly, the DH model of the robot is built. Then the extended Kalman filter is adopted to preliminarily identify the robot kinematic parameters to address the issue of non-Gaussian noise. Meanwhile, the quadratic interpolation beetle antennae search algorithm is adopted to further calibrate the kinematic parameters. Lastly, extensive experiments are carried out on an ABB IRB 120 industrial robot. The experimental results show that the robot positioning error is significantly reduced by the proposed algorithm, which is appropriate for practical application occasions with high-precision requirements." Real-Time Model Predictive Control and System Identification Using Differentiable Physics Simulation,"Sirui Chen, Keenon Werling, Albert Wu, Karen Liu","The University of Hong Kong,Stanford University","Calibration, Identification, and Simulation","Transferring a controller from a simulated environment to a physical system is regarded as a challenging problem in robotics. We present a method for continuous improvement of modeling and control after deploying the robot to a dynamically-changing target environment. We develop a differentiable physics simulation framework that performs online system identification and optimal control simultaneously, using the incoming observations from the target environment in real time. To ensure robust system identification against noisy observations, we devise an algorithm to assess the confidence of our estimated parameters, using numerical analysis of the dynamic equations. 
To ensure real-time optimal control, we adapt the optimization window in the future so that the optimized actions can be replenished faster than consumption, while staying as up-to-date with new information as possible. The constant re-planning based on a constantly improved model allows the robot to swiftly adapt to the changing environment and utilize real-world data in the most sample-efficient way. Thanks to a fast differentiable physics simulator, the optimization for both system identification and control can be solved efficiently for robots operating in real time. We demonstrate our method on a set of examples in simulation and on a real robot and show that our results are favorable compared to baseline methods." PBACalib: Targetless Extrinsic Calibration for High-Resolution LiDAR-Camera System Based on Plane-Constrained Bundle Adjustment,"Feiyi Chen, Liang Li, Shuyang Zhang, Wu Jin, Lujia Wang","The Hong Kong University of Science and Technology,The University of Hong Kong,UESTC,The Hong Kong University of Technology","Calibration, Identification, and Simulation","The strategy of fusing multi-model data, especially from cameras, light detection and ranging sensors (LiDAR), is frequently considered in robotics to enhance the performance of the perception and navigation tasks. Extrinsic calibration, which spatially aligns different sources into a unified coordinate representation, directly determines the performance of the combined data. In this paper, we propose PBACalib, a novel targetless extrinsic calibration algorithm aiming at the dense LiDAR-camera system based on the plane-constrained bundle adjustment (PBA). The proposed method utilizes the feature points derived from a prominent plane in the scene and iteratively minimizes the reprojection error. A maximum likelihood estimator (MLE) is designed by considering the uncertainty information of the measurements. 
Furthermore, we explore the distribution of collected data and characterize the robustness and solvability of the extrinsic estimates using a confidence factor. Simulation and real-world experiments both qualitatively and quantitatively demonstrate the robustness and accuracy of our method. The comparison experiments show that the proposed method outperforms another targetless method. To benefit the community, Matlab code has been publicly released on Github" Probabilistic Framework for Hand-Eye and Robot-World Calibration AX=YB,Junhyoung Ha,Korea Institute of Science and Technology,"Calibration, Identification, and Simulation","Hand-eye and robot-world calibration is a problem in which the unknown homogeneous transformations X and Y must be estimated for a loop closure equation AX=YB for a set of transformation measurement pairs {(A_i, B_i)}. Previous studies on AX=YB have mainly relied on linear least-squares minimization followed by nonlinear iterative optimization for solution refinement to minimize the distances between A_i X and Y B_i. However, these methods have not been fully clarified, particularly in terms of the calibration dependence on the coordination of A, B, X, and Y along the system loop, as well as the underlying noise distributions of A_i and B_i. They also lack flexibility in the noise properties of individual measurements; thus, they cannot incorporate the relative reliability between measurements. To address these limitations, we propose a probabilistic framework for hand-eye and robot-world calibration. The proposed framework clarifies the unclear aspects of existing methods by revealing their underlying assumptions on system noise. 
Consequently, it identifies the applicability of distance minimization to a given calibration problem and provides the optimal coord" Multi-Kernel Maximum Correntropy Kalman Filter for Orientation Estimation,"Shilei Li, Lijing Li, Dawei Shi, Wulin Zou, Pu Duan, Ling Shi","The Hong Kong University of Science and Technology,China University of Mining and Technology,Beijing Institute of Technology,Hong Kong University of Science and Technology,Xeno Dynamics Co., Ltd","Calibration, Identification, and Simulation","Inertial measurement units (IMUs), composed of gyroscopes, accelerometers, and magnetometers, have been widely used in the field of human motion animation, rehabilitation, robotics, and aerospace. However, their performances degenerate remarkably with external acceleration and magnetic disturbance. To handle this issue, we employ a multi-kernel maximum correntropy Kalman filter (MKMCKF) to suppress the adversarial acceleration and magnetic disturbance and use Bayesian optimization (BO) to explore the optimal kernel bandwidths. We validate our algorithm in a set of experiments with different levels of disturbance. Results show that the proposed method is significantly better than the traditional error state Kalman filter (ESKF) and the gradient descent (GD) method, and its root mean square error (RMSE) is less than 0.4629 degrees on the roll and pitch even under the worst testing case." A4LidarTag: Depth-Based Fiducial Marker for Extrinsic Calibration of Solid-State Lidar and Camera,"Xie Yusen, Deng Lei, Sun Ting, Fu Yeyu, Zhixiang Chen, Chen Baohua, Li Jian, Cui Xinglong, Yin Hanxi, Deng Shuixin, Xiao Junwei","Beijing Information Science & Technology University,Tsinghua University,The University of Sheffield","Calibration, Identification, and Simulation","Visual-based simultaneous localization and mapping (SLAM) systems perform weakly in object tracking and map reconstruction due to the unreliable depth measurement originated from image-only data. 
Light Detection and Ranging (LiDAR) can be coupled to overcome the drawback of uncertain depth estimation. The prerequisite for performing data fusion is to align visual-Lidar sensors to a certain coordinate system with extrinsic pose by calibrating. The conventional extrinsic calibration frameworks either rely on markers in artificial large size calibration board or uncontrollable natural scenes, which limits the stability and convenience. In this paper, we have designed a novel marker pattern, A4LidarTag, which is composed of circular holes. The difference of depth measurement was used to encode location information. Based on A4LidarTag, automatic extrinsic calibration framework between solid-state Lidar (SSL) and camera was developed. Proposed framework can be implemented in close range (within 1 meter) and A4 size calibration board. The average reprojection error in the result of Lidar point cloud projection is about 0.12 pixels. Experiments show great efficiency and versatility on both indoor and outdoor scenes. Source code is available at Github https://github.com/xieyuser/A4LidarTag" A CoppeliaSim Dynamic Simulator for the Da Vinci Research Kit,"Marco Ferro, Alessandro Mirante, Fanny Ficuciello, Marilena Vendittelli","CNRS,Sapienza University of Rome,Università di Napoli Federico II",Standalone Videos,"The design of a physics-based dynamic simulator of a robot requires properly integrating the robot kinematic and dynamic properties in a virtual environment. Naturally, the closer the integrated information is to the real robot properties, the more accurately the simulator predicts the real robot behaviour. A reliable robot simulator is a valuable asset for developing new research ideas; its use dramatically reduces the costs and it is available to all researchers. This paper presents a dynamic simulator of the da Vinci Research Kit (dVRK) patient-side manipulator (PSM). 
The kinematic and dynamic properties of the simulator rely on the parameters identified in [1]. With respect to the kinematic simulator previously developed by some of the authors, this work: (i) redefines the kinematic architecture and the actuation model by modeling the double parallelogram and the counterweight mechanism, to reflect the structure of the real robot; (ii) integrates the identified dynamic parameters in the simulation model. The obtained simulator enables the design and validation of control strategies relying on the robot dynamic model, including interaction force estimation and control, that are fundamental to guarantee safety in many surgical tasks." Fast and Robust Inverse Kinematics of Serial Robots Using Halley's Method,"Steffan Lloyd, Rishad Irani, Mojtaba Ahmadi",Carleton University,"Calibration, Identification, and Simulation","This paper proposes a novel numerical inverse kinematics algorithm called the Quick Inverse Kinematics or QuIK method. The QuIK method is a third-order algorithm that uses both the first and second-order derivative information to iteratively converge to a solution. Numerical inverse kinematics methods are readily implemented on any serial robot and do not rely on joint alignment. However, they typically are slower and less robust. The second-order derivative term allows the QuIK algorithm to converge more rapidly and more robustly than existing algorithms. A damped extension to the QuIK method is also proposed to increase reliability near singularities. The QuIK methods are tested in terms of evaluation speed, reliability, and singularity robustness against the Newton-Raphson method and several other modern algorithms. The proposed QuIK methods outperform all other tested algorithms in terms of speed and robustness, and have strong performance near singularities. The QuIK algorithms are proposed as faster and more robust ""drop-in"" replacements to the Newton-Raphson methods in inverse kinematics. 
C++ and Matlab codebases are made available." Large-Dimensional Multibody Dynamics Simulation Using Contact Nodalization and Diagonalization,"Jeongmin Lee, Minji Lee, Dongjun Lee",Seoul National University,"Calibration, Identification, and Simulation","In this article, we propose a novel multibody dynamics simulation framework that can efficiently deal with large-dimensionality and complementarity multicontact conditions. Typical contact simulation approaches require performing contact impulse fixed-point iteration, which has high time-complexity from large-size matrix factorization and multiplication, as well as susceptibility to ill-conditioned contact situations. To circumvent this, we propose a novel framework based on velocity fixed-point iteration (V-FPI), which, by utilizing certain surrogate dynamics and contact nodalization (with virtual nodes), achieves not only intercontact decoupling but also interaxes decoupling (i.e., contact diagonalization) at each iteration step. This then enables us to one-shot/parallel-solve the contact problem during each V-FPI iteration-loop, while avoiding large-size/dense matrix inversion/multiplication, thereby significantly speeding up the simulation with improved convergence properties. We theoretically show that the solution of our framework is consistent with that of the original problem and, further, elucidate mathematical conditions for the convergence of our proposed solver. Performance and properties of our proposed simulation framework are also demonstrated and experimentally validated for various large-dimensional/multicontact scenarios including deformable objects." EMS®: A Massive Computational Experiment Management System towards Data-Driven Robotics,"Qinjie Lin, Guo Ye, Han Liu",Northwestern University,Software Tools I,"We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research.
Compared to existing systems, EMS® features a sky-based pipeline orchestrator which allows us to exploit heterogeneous computing environments painlessly (e.g., on-premise clusters, public clouds, edge devices) to optimally deploy large-scale computational jobs (e.g., with more than millions of computational hours) in an integrated fashion. Cornerstoned on this sky-based pipeline orchestrator, this paper introduces three abstraction layers of the EMS® software architecture: (i) Configuration management layer focusing on automatically enumerating experimental configurations; (ii) Dependency management layer focusing on managing the complex task dependencies within each experimental configuration; (iii) Computation management layer focusing on optimally executing the computational tasks using the given computing resource. Such an architectural design greatly increases the scalability and reproducibility of data-driven robotics research leading to much-improved productivity. To demonstrate this point, we compare EMS® with more traditional approaches on an offline reinforcement learning problem for training mobile robots. Our results show that EMS® outperforms more traditional approaches by two orders of magnitude (in terms of experimental throughput and cost) with only several lines of code change. We also exploit EMS® to develop mobile robot, robot arm, and bipedal applications, demonstrating its applicability to numerous robot applications." Rmagine: 3D Range Sensor Simulation in Polygonal Maps Via Ray Tracing for Embedded Hardware on Mobile Robots,"Alexander Mock, Thomas Wiemann, Joachim Hertzberg","University of Osnabrück,Fulda University of Applied Sciences,University of Osnabrueck",Software Tools I,"Sensor simulation has emerged as a promising and powerful technique to find solutions to many real-world robotic tasks like localization and pose tracking.
However, commonly used simulators have high hardware requirements and are therefore used mostly on high-end computers. In this paper, we present an approach to simulate range sensors directly on embedded hardware of mobile robots that use triangle meshes as environment map. This library, called Rmagine, allows a robot to simulate sensor data for arbitrary range sensors directly on board via ray tracing. Since robots typically only have limited computational resources, Rmagine aims at being flexible and lightweight, while scaling well even to large environment maps. It runs on several platforms like laptops or embedded computing boards like NVIDIA Jetson by putting a unified API over the specific proprietary libraries provided by the hardware manufacturers. This work is designed to support the future development of robotic applications depending on simulation of range data that could previously not be computed in reasonable time on mobile systems." A Framework for Fast Prototyping of Photo-Realistic Environments with Multiple Pedestrians,"Sara Casao, Andrés Otero, Álvaro Serra-gómez, Ana Cristina Murillo, Javier Alonso-Mora, Eduardo Montijano","Unversity of Zaragoza ESQ,,,,,,,G Department of Computer Science,Universidad de Zaragoza,Delft University of Technology,University of Zaragoza",Software Tools I,"Robotic applications involving people often require advanced perception systems to better understand complex real-world scenarios. To address this challenge, photo-realistic and physics simulators are gaining popularity as a means of generating accurate data labeling and designing scenarios for evaluating generalization capabilities, e.g., lighting changes, camera movements or different weather conditions. We develop a photo-realistic framework built on Unreal Engine and AirSim to easily generate scenarios with pedestrians and mobile robots.
The framework is capable of generating random and customized trajectories for each person and provides up to 50 ready-to-use people models along with an API for their metadata retrieval. We demonstrate the usefulness of the proposed framework with a use case of multi-target tracking, a popular problem in real pedestrian scenarios. The notable feature variability in the obtained perception data is presented and evaluated." RoboSC: A Domain-Specific Language for Supervisory Controller Synthesis of ROS Applications,"Bart Wesselink, Koen De Vos, Ivan Kurtev, Michel Reniers, Elena Torta","Eindhoven University of Technology,Eindhoven University of Technology",Software Tools I,"The paper presents a novel domain-specific language, RoboSC, for developing supervisory controllers for robotic applications. RoboSC supports concepts of ROS/ROS2 and supervisory control theory. It enables users to focus on the modeling and the synthesis process of supervisory controllers for ROS applications only because it generates all artifacts needed to connect such controllers to ROS applications and deploy them. Validation tests with actual and simulated robots show the approach's feasibility and indicate reduced coding effort." KubeROS: A Unified Platform for Automated and Scalable Deployment of ROS2-Based Multi-Robot Applications,"Yongzhou Zhang, Christian Wurll, Björn Hein","Karlsruhe University of Applied Sciences,University of Applied Sciences Karlsruhe",Software Tools I,"As advanced algorithms enable robots to handle more challenging tasks and operate more autonomously, the on-board computer cannot meet the increased demands regarding computing power and memory storage in an efficient way. Leveraging the massive computing power of the cloud and low-latency connectivity to the edge can compensate for this lack of computing resources. However, this introduces a new challenge related to the deployment of complex robotic software across multiple devices, especially in a large-scale system.
This paper presents KubeROS, a unified and fully managed platform for automated deployment of robotic applications developed on top of Robot Operating System 2 (ROS2), in a hybrid computing infrastructure with robots, edge and cloud. KubeROS uses Kubernetes from Cloud Native Computing as its underlying software orchestration framework. It aims to help researchers and developers with no prior cloud computing knowledge deploy their ROS2-based robotic applications at any scale. KubeROS eliminates the need for system configuration and network setup. We demonstrate the applicability of KubeROS by deploying a fleet of simulated mobile manipulators in a classical pick-and-place application. The experiments demonstrate the effects of different deployment strategies for vision-based motion planning under different fleet sizes and workloads. In addition, KubeROS improves task performance by using high-performance computing at the edge and in the cloud, and achieves high resource efficiency when using the shared deployment strategy." Domain-Specific Languages for Kinematic Chains and Their Solver Algorithms: Lessons Learned for Composable Models,"Sven Schneider, Nico Hochgeschwender, Herman Bruyninckx","Bonn-Rhein-Sieg University,University of Leuven",Software Tools I,"The Unified Robot Description Format (URDF) and, to a lesser extent, the COLLAborative Design Activity (COLLADA) format are two of the most popular domain-specific languages (DSLs) to represent kinematic chains in robotics with support in many tools including Gazebo, MoveIt!, KDL or IKFast. In this paper we analyse both DSLs with respect to their structure and semantics as seen by tools that produce or consume such representations. For the former, we notice a tight coupling of various unrelated domains like kinematics and dynamics with visualisation, control or even specific simulators. 
For the latter, a key insight is that both DSLs target human developers and leave important design decisions like the choice of joint attachment frames implicit or hidden in the documentation. The lessons learned from this analysis guide us to an improved interchange format by designing composable, loosely coupled models with complete metamodels that unambiguously define the model semantics. We substantiate our findings with concrete examples. Furthermore, we compose solver algorithms on top of the kinematic chain representation. As a consequence of the above analysis and decomposition we can systematically apply structure- and semantics-conserving model-to-code transformations to those algorithms." SIERRA: A Modular Framework for Accelerating Research and Improving Reproducibility,"John Harwell, Maria Gini",University of Minnesota,Software Tools I,"We present SIERRA, a novel framework for accelerating development and improving reproducibility of results in robotics research. SIERRA accelerates research by automating the process of generating experiments from queries over independent variables, executing experiments, and processing the results to generate deliverables such as graphs and videos. It shifts the paradigm for testing hypotheses from procedural (``Do these steps to answer the query'') to declarative (``Here is the query to test---GO!''), reducing the burden on researchers. It employs a modular architecture enabling easy customization and extension for the needs of individual researchers, thereby eliminating manual configuration and processing via throw-away scripts. SIERRA improves reproducibility of research by providing automation independent of the execution environment (HPC hardware, real robots, etc.) and targeted platform (simulator, real robots, etc.). 
This enables exact experiment replication, up to the limit of the execution environment and platform, as well as making it easy for researchers to test hypotheses in different computational environments. Though SIERRA is targeted at robotics research, its design makes it extendable to other fields." OpTaS: An Optimization-Based Task Specification Library for Trajectory Optimization and Model Predictive Control,"Christopher Edwin Mower, Joao Moura, Nazanin Zamani Behabadi, Sethu Vijayakumar, Tom Vercauteren, Christos Bergeles","King's College London,University of Edinburgh,Not Affiliated",Software Tools I,"This paper presents OpTaS, a task specification Python library for Trajectory Optimization (TO) and Model Predictive Control (MPC) in robotics. Both TO and MPC are increasingly receiving interest in optimal control and in particular handling dynamic environments. While a flurry of software libraries exists to handle such problems, they either provide interfaces that are limited to a specific problem formulation (e.g. TracIK, CHOMP), or are large and statically specify the problem in configuration files (e.g. EXOTica, eTaSL). OpTaS, on the other hand, allows a user to specify custom nonlinear constrained problem formulations in a single Python script allowing the controller parameters to be modified during execution. The library provides interface to several open source and commercial solvers (e.g. IPOPT, SNOPT, KNITRO, SciPy) to facilitate integration with established workflows in robotics. Further benefits of OpTaS are highlighted through a thorough comparison with common libraries. An additional key advantage of OpTaS is the ability to define optimal control tasks in the joint-space, task-space, or indeed simultaneously. The code for OpTaS is easily installed via pip, and the source code with examples can be found at github.com/cmower/optas." 
CMG-Net: An End-To-End Contact-Based Multi-Finger Dexterous Grasping Network,"Mingze Wei, Yaomin Huang, Zhiyuan Xu, Ning Liu, Zhengping Che, Xinyu Zhang, Chaomin Shen, Feifei Feng, Chun Shan, Jian Tang","east china normal university, midea,East China Normal University,Midea Group,Guangdong Polytechnic Normal University,Midea Group (Shanghai) Co., Ltd.",Data Sets I,"In this paper, we propose a novel representation for grasping using contacts between multi-finger robotic hands and objects to be manipulated. This representation significantly reduces the prediction dimensions and accelerates the learning process. We present an effective end-to-end network, CMG-Net, for grasping unknown objects in a cluttered environment by efficiently predicting multi-finger grasp poses and hand configurations from a single-shot point cloud. Moreover, we create a synthetic grasp dataset that consists of five thousand cluttered scenes, 80 object categories, and 20 million annotations. We perform a comprehensive empirical study and demonstrate the effectiveness of our grasping representation and CMG-Net. Our work significantly outperforms the state-of-the-art for three-finger robotic hands. We also demonstrate that the model trained using synthetic data performs very well for real robots." ARMBench: An Object-Centric Benchmark Dataset for Robotic Manipulation,"Chaitanya Mitash, Fan Wang, Shiyang Lu, Vikedo Terhuja, Tyler Garaas, Felipe Polido, Manikantan Nambi","Amazon Robotics,Rutgers University,Mitsubishi Electric Research Laboratories,Italian Institute of Technology",Data Sets I,"This paper introduces Amazon Robotic Manipulation Benchmark (ARMBench), a large-scale, object-centric benchmark dataset for robotic manipulation in the context of a warehouse. Automation of operations in modern warehouses requires a robotic manipulator to deal with a wide variety of objects, unstructured storage, and dynamically changing inventory.
Such settings pose challenges in perceiving the identity, physical characteristics, and state of objects during manipulation. Existing datasets for robotic manipulation consider a limited set of objects or utilize 3D models to generate synthetic scenes with limitations in capturing the variety of object properties, clutter, and interactions. We present a large-scale dataset collected in an Amazon warehouse using a robotic manipulator performing object singulation from containers with heterogeneous contents. ARMBench contains images, videos, and metadata that correspond to 235K+ pick-and-place activities on 190K+ unique objects. The data is captured at different stages of manipulation, i.e., pre-pick, during transfer, and after placement. Benchmark tasks are proposed by virtue of high-quality annotations, and baseline performance evaluations are presented on three visual perception challenges, namely 1) object segmentation in clutter, 2) object identification, and 3) defect detection. ARMBench can be accessed at http://armbench.com" FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments,"Jishnu Jaykumar P, Yu-Wei Chao, Yu Xiang","The University of Texas at Dallas,NVIDIA,University of Texas at Dallas",Data Sets I,"We introduce the Few-Shot Object Learning (FewSOL) dataset for object recognition with a few images per object. We captured 336 real-world objects with 9 RGB-D images per object from different views. FewSOL has object segmentation masks, poses, and attributes. In addition, synthetic images generated using 330 3D object models are used to augment the dataset. We investigated (i) few-shot object classification and (ii) joint object segmentation and few-shot classification with state-of-the-art methods for few-shot learning and meta-learning using our dataset.
The evaluation results show a large margin for improvement in few-shot object classification in robotic environments, and our dataset can be used to study and enhance few-shot object recognition for robot perception." WorldGen: A Large Scale Generative Simulator,"Chahat Deep Singh, Riya Kumari, Cornelia Fermuller, Nitin Sanket, Yiannis Aloimonos","University of Maryland, College Park,University of Maryland",Data Sets I,"In the era of deep learning, data is the critical determining factor in the performance of neural network models. Generating large datasets suffers from various difficulties such as scalability, cost efficiency and photorealism. To avoid expensive and strenuous dataset collection and annotations, researchers have inclined towards computer-generated datasets. However, a lack of photorealism and a limited amount of computer-aided data have bounded the accuracy of network predictions. To this end, we present WorldGen -- an open source framework to autonomously generate countless structured and unstructured 3D photorealistic scenes such as city view, object collection, and object fragmentation along with its rich ground truth annotation data. WorldGen, being a generative model, gives the user full access and control to features such as texture, object structure, motion, camera and lens properties for better generalizability by diminishing the data bias in the network. We demonstrate the effectiveness of WorldGen by presenting an evaluation on deep optical flow. We hope such a tool can open doors for future research in a myriad of domains related to robotics and computer vision by reducing manual labor and cost for acquiring rich and high quality data." Lossless SIMD Compression of LiDAR Range and Attribute Scan Sequences,"Jeff Ford, Jordan Ford","ComplexIQ,Carnegie Mellon University",Data Sets I,"As LiDAR sensors have become ubiquitous, the need for an efficient LiDAR data compression algorithm has increased.
Modern LiDARs produce gigabytes of scan data per hour and are often used in applications with limited compute, bandwidth, and storage resources. We present a fast, lossless compression algorithm for LiDAR range and attribute scan sequences including multiple-return range, signal, reflectivity, and ambient infrared. Our algorithm---dubbed ""Jiffy""---achieves substantial compression by exploiting spatiotemporal redundancy and sparsity. Speed is accomplished by maximizing use of single-instruction-multiple-data (SIMD) instructions. In autonomous driving, infrastructure monitoring, drone inspection, and handheld mapping benchmarks, the Jiffy algorithm consistently outcompresses competing lossless codecs while operating at speeds in excess of 65M points/sec on a single core. In a typical autonomous vehicle use case, single-threaded Jiffy achieves 6x compression of centimeter-precision range scans at 500+ scans per second. To ensure reproducibility and enable adoption, the software is freely available as an open source library." 3D-DAT: 3D-Dataset Annotation Toolkit for Robotic Vision,"Markus Suchi, Bernhard Neuberger, Amanzhol Salykov, Jean-baptiste Weibel, Timothy Patten, Markus Vincze","TU Wien,University of Technology Sydney,Vienna University of Technology",Data Sets I,"Robots operating in the real world are expected to detect, classify, segment, and estimate the pose of objects to accomplish their task. Modern approaches using deep learning not only require large volumes of data but also pixel-accurate annotations in order to evaluate the performance and therefore safety of these algorithms. At present, publicly available tools for annotating data are scarce and those that are available rely on depth sensors, which excludes their use for transparent, metallic, and general non-Lambertian objects. To address this issue, we present a novel method for creating valuable datasets that can be used in these more difficult cases. 
Our key contribution is a purely RGB-based scene-level annotation approach that uses a neural radiance field-based method to automatically align objects. A set of user studies demonstrates the accuracy and speed of our approach over a purely manual or depth sensor assisted pipeline. We provide an open-source implementation of each component and a ROS-based recorder for capturing data with an eye-in-hand robot system. Code will be made available at https://github.com/markus-suchi/3D-DAT." "METEOR: A Dense, Heterogeneous, and Unstructured Traffic Dataset with Rare Behaviors","Rohan Chandra, Xijun Wang, Mridul Mahajan, Rahul Kala, Rishitha Palugulla, Chandrababu Naidu Nallagopu, Alok Jain, Dinesh Manocha","University of Texas, Austin,University of Maryland, College Park,Indian Institute of Information Technology Allahabad,Indian Institute of Information Technology, Allahabad, India,navAjna Technologies Private Limited,University of Maryland",Data Sets I,"We present a new traffic dataset, METEOR, which captures traffic patterns and multi-agent driving behaviors in unstructured scenarios. METEOR consists of more than 1000 one-minute videos, over 2 million annotated frames with bounding boxes and GPS trajectories for 16 unique agent categories, and more than 13 million bounding boxes for traffic agents. METEOR is a dataset for rare and interesting, multi-agent driving behaviors that are grouped into traffic violations, atypical interactions, and diverse scenarios. Every video in METEOR is tagged using a diverse range of factors corresponding to weather, time of the day, road conditions, and traffic density. We use METEOR to benchmark perception methods for object detection and multi-agent behavior prediction. Our key finding is that state-of-the-art models for object detection and behavior prediction, which otherwise succeed on existing datasets such as Waymo, fail on the METEOR dataset.
METEOR is a step towards developing more sophisticated perception models for dense, heterogeneous, and unstructured scenarios." Kollagen: A Collaborative SLAM Pose Graph Generator,"Roberto C. Sundin, David Umsonst",Ericsson Research,Data Sets I,"In this paper, we address the lack of datasets for – and the issue of reproducibility in – collaborative SLAM pose graph optimizers by providing a novel pose graph generator. Our pose graph generator, kollagen, is based on a random walk in a planar grid world, similar to the popular M3500 dataset for single agent SLAM. It is simple to use and the user can set several parameters, e.g., the number of agents, the number of nodes, loop closure generation probabilities, and standard deviations of the measurement noise. Furthermore, a qualitative execution time analysis of our pose graph generator showcases the speed of the generator in the tunable parameters. In addition to the pose graph generator, our paper provides two example datasets that researchers can use out-of-the-box to evaluate their algorithms. One of the datasets has 8 agents, each with 3500 nodes, and 67645 constraints in the pose graphs, while the other has 5 agents, each with 10000 nodes, and 76134 constraints. In addition, we show that current state-of-the-art pose graph optimizers are able to process our generated datasets and perform pose graph optimization. The data generator can be found at https://github.com/EricssonResearch/kollagen" AvoidBench: A High-Fidelity Vision-Based Obstacle Avoidance Benchmarking Suite for Multi-Rotors,"Hang Yu, Guido De Croon, Christophe De Wagter","Delft University of technology,TU Delft,Delft University of Technology",Benchmarking,"Obstacle avoidance is an essential topic in the field of autonomous drone research. When choosing an avoidance algorithm, many different options are available, each with their advantages and disadvantages. 
As there is currently no consensus on testing methods, it is quite challenging to compare the performance between algorithms. In this paper, we propose AvoidBench, a benchmarking suite which can evaluate the performance of vision-based obstacle avoidance algorithms by subjecting them to a series of tasks. Thanks to the high fidelity of multi-rotors dynamics from RotorS and virtual scenes of Unity3D, AvoidBench can realize realistic simulated flight experiments. Compared to current drone simulators, we propose and implement both performance and environment metrics to reveal the suitability of obstacle avoidance algorithms for environments of different complexity. To illustrate AvoidBench's usage, we compare three algorithms: Ego-planner, MBPlanner, and Agile-autonomy. The trends observed are validated with real-world obstacle avoidance experiments." Generating a Terrain-Robustness Benchmark for Legged Locomotion: A Prototype Via Terrain Authoring and Active Learning,"Chong Zhang, Lizhi Yang","ETH Zurich,California Institute of Technology",Benchmarking,"Terrain-aware locomotion has become an emerging topic in legged robotics. However, it is hard to generate diverse, challenging, and realistic unstructured terrains in simulation, which limits the way researchers evaluate their locomotion policies. In this paper, we prototype the generation of a terrain dataset via terrain authoring and active learning, and the learned samplers can stably generate diverse high-quality terrains. We expect the generated dataset to make a terrain-robustness benchmark for legged locomotion. The dataset, the code implementation, and some policy evaluations are released at https://bit.ly/3bn4j7f." 
"Train Offline, Test Online: A Real Robot Learning Benchmark","Gaoyue Zhou, Victoria Dean, Mohan Kumar Srirama, Aravind Rajeswaran, Jyothish Pari, Kyle Beltran Hatch, Aryan Jain, Tianhe Yu, Pieter Abbeel, Lerrel Pinto, Chelsea Finn, Abhinav Gupta","Carnegie Mellon University,Meta AI,New York University,Stanford University,UC Berkeley",Benchmarking,"Three challenges limit the progress of robot learning research: robots are expensive (few labs can participate), everyone uses different robots (findings do not generalize across labs), and we lack internet-scale robotics data. We take on these challenges via a new benchmark: Train Offline, Test Online (TOTO). TOTO provides remote users with access to shared robots for evaluating methods on common tasks and an open-source dataset of these tasks for offline training. Its manipulation task suite requires challenging generalization to unseen objects, positions, and lighting. We present initial results on TOTO comparing five pretrained visual representations and four offline policy learning baselines, remotely contributed by five institutions. The real promise of TOTO, however, lies in the future: we release the benchmark for additional submissions from any user, enabling easy, direct comparison to several methods without the need to obtain hardware or collect data." Benchmarking Potential Based Rewards for Learning Humanoid Locomotion,"Se Hwan Jeon, Steve Heim, Charles Khazoom, Sangbae Kim",Massachusetts Institute of Technology,Benchmarking,"The main challenge in developing effective reinforcement learning (RL) pipelines is often the design and tuning the reward functions. Well-designed shaping reward can lead to significantly faster learning. Naively formulated rewards, however, can conflict with the desired behavior and result in overfitting or even erratic performance if not properly tuned. 
In theory, the broad class of potential based reward shaping (PBRS) can help guide the learning process without affecting the optimal policy. Although several studies have explored the use of potential based reward shaping to accelerate learning convergence, most have been limited to grid-worlds and low-dimensional systems, and RL in robotics has predominantly relied on standard forms of reward shaping. In this paper, we benchmark standard forms of shaping with PBRS for a humanoid robot. We find that in this high-dimensional system, PBRS has only marginal benefits in convergence speed. However, the PBRS reward terms are significantly more robust to scaling than typical reward shaping approaches, and thus easier to tune." Household Clothing Set and Benchmarks for Characterising End-Effector Cloth Manipulation,"Angus B. Clark, Luke Cramphorn, Michal Rachowiecki, Austin Gregg-smith","Imperial College London,Bristol University,Dyson,University of Bristol",Benchmarking,"The highly varied and deformable structure of clothing presents a challenging task in the area of robot manipulation. Recent literature has shown an increasing interest in this field, however limited information exists on the influence of end-effector selection, instead focusing on the perception, modelling, and methodology in handling fabrics. Here, we present a benchmark set of household clothing items, along with a framework for defining textile features in relation to how the objects can be grasped and manipulated. Alongside these, we present four example benchmarks for evaluating the performance of a robot end-effector in relation to the grasping and manipulation of common pieces of clothing: Edge drag accuracy, edge grasp resilience, grasp encapsulation, and grasp fold generation. 
We perform these benchmarks on several common robot end-effectors (Franka Emika (FE) Hand with standard and Fin Ray style fingers (Flex), Robotiq 2F-140, and the Openhand Model T42) and present and discuss their respective performances. Results show that the Robotiq scored highest across most benchmarks, closely followed by the FE hand. The T42 showed excellent encapsulation of items, while the FE (Flex) was particularly successful picking up flat edges." Parameter Optimization for Manipulator Motion Planning Using a Novel Benchmark Set,"Carl Gaebert, Sascha Kaden, Benjamin Fischer, Ulrike Thomas","Chemnitz University of Technology,Technische Universität Chemnitz",Benchmarking,"Sampling-based motion planning algorithms have been continuously developed for more than two decades. Apart from mobile robots, they are also widely used in manipulator motion planning. Hence, these methods play a key role in collaborative and shared workspaces. Despite numerous improvements, their performance can highly vary depending on the chosen parameter setting. The optimal parameters depend on numerous factors such as the start state, the goal state and the complexity of the environment. Practitioners usually choose these values using their experience and tedious trial and error experiments. To address this problem, recent works combine hyperparameter optimization methods with motion planning. They show that tuning the planner’s parameters can lead to shorter planning times and lower costs. It is not clear, however, how well such approaches generalize to a diverse set of planning problems that include narrow passages as well as barely cluttered environments. In this work, we analyze optimized planner settings for a large set of diverse planning problems. We then provide insights into the connection between the characteristics of the planning problem and the optimal parameters. As a result, we provide a list of recommended parameters for various use-cases.
Our experiments are based on a novel motion planning benchmark for manipulators which we provide at https://mytuc.org/rybj." Benchmarking Reinforcement Learning Techniques for Autonomous Navigation,"Zifan Xu, Bo Liu, Xuesu Xiao, Anirudh Nair, Peter Stone","University of Texas at Austin,George Mason University,The University of Texas at Austin",Benchmarking,"Deep reinforcement learning (RL) has brought many successes for autonomous robot navigation. However, there still exists important limitations that prevent real-world use of RL-based navigation systems. For example, most learning approaches lack safety guarantees; and learned navigation systems may not generalize well to unseen environments. Despite a variety of recent learning techniques to tackle these challenges in general, a lack of an open-source benchmark and reproducible learning methods specifically for autonomous navigation makes it difficult for roboticists to choose what learning methods to use for their mobile robots and for learning researchers to identify current shortcomings of general learning methods for autonomous navigation. In this paper, we identify four major desiderata of applying deep RL approaches for autonomous navigation: (D1) reasoning under uncertainty, (D2) safety, (D3) learning from limited trial-and-error data, and (D4) generalization to diverse and novel environments. Then, we explore four major classes of learning techniques with the purpose of achieving one or more of the four desiderata: memory-based neural network architectures (D1), safe RL (D2), model-based RL (D2, D3), and domain randomization (D4). By deploying these learning techniques in a new open-source large-scale navigation benchmark and real-world environments, we perform a comprehensive study aimed at establishing to what extent can these techniques achieve these desiderata for RL-based navigation systems." 
"A Benchmark for Multi-Robot Planning in Realistic, Complex and Cluttered Environments","Simon Schaefer, Luigi Palmieri, Lukas Heuer, Rüdiger Dillmann, Sven Koenig, Alexander Kleiner","Karlsruhe Institute of Technology (KIT),Robert Bosch GmbH,Örebro University, Robert Bosch GmbH,FZI - Forschungszentrum Informatik - Karlsruhe,University of Southern California,Bosch Central Research",Benchmarking,"Several successful approaches exist for solving the complex problem of multi-robot planning and coordination. Due to the lack of adequate benchmarking tools, comparing these approaches and judging their suitability for use in realistic scenarios is currently difficult. Therefore, we propose an open-source benchmark suite that aims to close this gap. Unlike existing benchmarks, our approach uses full-stack multi-robot navigation systems in realistic 3D simulated environments from the intralogistic and household domains. Using the open-source frameworks ROS 2, Gazebo and RMF allows the user to add other robot platforms easily. The framework provides easy-to-use abstractions, typical metrics and interfaces to several established planning libraries for multi-robot systems. With all these features, our framework successfully aids practitioners and researchers in comparing multi-robot planning and coordination systems to the state of the art. Our experiments show how the proposed benchmark simplifies gaining insights on relevant close to real-life robotics use cases." D-Align: Dual Query Co-Attention Network for 3D Object Detection Based on Multi-Frame Point Cloud Sequence,"Junhyung Lee, Junho Koh, Youngwoo Lee, Junwon Choi",Hanyang University,Object Detection III,"LiDAR sensors are widely used for 3D object detection in various mobile robotics applications. LiDAR sensors continuously generate point cloud data in real-time. Conventional 3D object detectors detect objects using a set of points acquired over a fixed duration. 
However, recent studies have shown that the performance of object detection can be further enhanced by utilizing spatio-temporal information obtained from point cloud sequences. In this paper, we propose a new 3D object detector, named D-Align, which can effectively produce strong bird's-eye-view (BEV) features by aligning and aggregating the features obtained from a sequence of point sets. The proposed method includes a novel dual-query co-attention network that uses two types of queries, including target query set (T-QS) and support query set (S-QS), to update the features of target and support frames, respectively. D-Align aligns S-QS to T-QS based on the temporal context features extracted from the adjacent feature maps and then aggregates S-QS with T-QS using a gated fusion mechanism. The dual queries are updated through multiple attention layers to progressively enhance the target frame features used to produce the detection results. Our experiments on the nuScenes dataset show that the proposed D-Align method greatly improved the performance of a single frame-based baseline method and significantly outperformed the latest 3D object detectors. Code is available at https://github.com/junhyung-SPALab/D-Align." DDS3D: Dense Pseudo-Labels with Dynamic Threshold for Semi-Supervised 3D Object Detection,"Dingkang Liang, Zhe Liu, Hou Jinghua, Jingyu Li",Huazhong University of Science and Technology,Object Detection III,"In this paper, we present a simple yet effective semi-supervised 3D object detector named DDS3D. Our main contributions have two-fold. On the one hand, different from previous works using Non-Maximal Suppression (NMS) or its variants for obtaining the sparse pseudo labels, we propose a dense pseudo-label generation strategy to get dense pseudo-labels, which can retain more potential supervision information for the student network. 
On the other hand, instead of traditional fixed thresholds, we propose a dynamic threshold manner to generate pseudo-labels, which can guarantee the quality and quantity of pseudo-labels during the whole training process. Benefiting from these two components, our DDS3D outperforms the state-of-the-art semi-supervised 3d object detection with mAP of 3.1% on the pedestrian and 2.1% on the cyclist under the same configure of 1% samples. Extensive ablation studies on the KITTI dataset demonstrate the effectiveness of our DDS3D. The code and models will be made publicly available at https://github.com/hust-jy/DDS3D" Fast Staircase Detection and Estimation Using 3D Point Clouds with Multi-Detection Merging for Heterogeneous Robots,"Prasanna Sriganesh, Namya Bagree, Bhaskar Vundurthy, Matthew Travers",Carnegie Mellon University,Object Detection III,"Robotic systems need advanced mobility capabilities to operate in complex, three-dimensional environments designed for human use, e.g., multi-level buildings. Incorporating some level of autonomy enables robots to operate robustly, reliably, and efficiently in such complex environments, e.g., automatically ""returning home"" if communication between an operator and robot is lost during deployment. This work presents a novel method that enables mobile robots to robustly operate in multi-level environments by making it possible to autonomously locate and climb a range of different staircases. We present results wherein a wheeled robot works together with a quadrupedal system to quickly detect different staircases and reliably climb them. The performance of this novel staircase detection algorithm that is able to run on the heterogeneous platforms is compared to the current state-of-the-art detection algorithm. We show that our approach significantly increases the accuracy and speed at which detections occur." 
Cost-Aware Evaluation and Model Scaling for LiDAR-Based 3D Object Detection,"Xiaofang Wang, Kris Kitani","Carnegie Mellon University,CMU",Object Detection III,"Considerable research effort has been devoted to LiDAR-based 3D object detection and empirical performance has been significantly improved. While progress has been encouraging, we observe an overlooked issue: it is not yet common practice to compare different 3D detectors under the same cost, e.g., inference latency. This makes it difficult to quantify the true performance gain brought by recently proposed architecture designs. The goal of this work is to conduct a cost-aware evaluation of LiDAR-based 3D object detectors. Specifically, we focus on SECOND, a simple grid-based one-stage detector, and analyze its performance under different costs by scaling its original architecture. Then we compare the family of scaled SECOND with recent 3D detection methods, such as Voxel R-CNN and PV-RCNN++. The results are surprising. We find that, if allowed to use the same latency, SECOND can match the performance of PV-RCNN++, the current state-of-the-art method on the Waymo Open Dataset. Scaled SECOND also easily outperforms many recent 3D detection methods published during the past year. We recommend future research control the inference cost in their empirical comparison and include the family of scaled SECOND as a strong baseline when presenting novel 3D detection methods." Zero-Shot Object Detection Based on Dynamic Semantic Vectors,"Haoyu Li, Jilin Mei, Jiancong Zhou, Yu Hu","University of Chinese Academy of Sciences,Institute of Computing Technology, Chinese Academy of Sciences,Institute of Computing Technology Chinese Academy of Sciences",Object Detection III,"Zero-shot object detection has shown its ability to overcome the problems of data scarcity and novel classes. Existing methods generally utilize static semantic vectors to classify objects and guide the network to map visual features to semantic vectors. 
However, the distribution of semantic vectors cannot adequately represent visual features, which makes migration from seen to unseen classes difficult. This work explores the dynamic semantic vector method to align the distributions of semantic vectors and visual features. The main challenge is to get a more reasonable distribution of semantic vectors. To address this issue, we proposed a two-way classification branch network and introduce N-pair loss into the dynamic semantic vector optimization process. Experiments on the MS-COCO dataset and real-world autonomous driving data demonstrate the effectiveness and generalization of our method." Road Anomaly Segmentation Based on Pixel-Wise Logit Variance with Iterative Background Highlighting,"Dongkun Lee, Han-gyu Kim, Ho-jin Choi","KAIST,NAVER Cloud",Object Detection III,"Anomaly segmentation on the urban landscape scene is an important task in autonomous driving. This process exploits a pre-trained semantic segmentation network to estimate anomalous regions. Anomaly segmentation approaches implemented with extra requirements such as out-of-domain data, extra network, or network retraining might increase the computational cost or degradation of segmentation performance. In this study, to exploit information from the segmentation network for more robust anomaly segmentation, we propose the use of pixel-wise logit variance, which tends to be small for anomalies as network outputs even logits without confidence. Additionally iterative background highlighting is proposed to robustly detect anomalous objects on the background, which is implemented by feeding the logits back into the linear classifier of the network. We achieved state-of-the-art performance among anomaly segmentation approaches without extra requirements, reaching relative average precision improvements of 21.7% on the Fishyscapes Lost&Found and 17.4% on the Fishyscapes Static compared to the state-of-the-art method. 
The code of this work is available at our Github repository (https://github.com/hagg30/LogitVar)." WEDGE: Web-Image Assisted Domain Generalization for Semantic Segmentation,"Namyup Kim, Taeyoung Son, Jaehyun Pahk, Cuiling Lan, Wenjun Zeng, Suha Kwak","POSTECH,Pohang University of Science and Technology,DGIST,Microsoft Research Asia,Eastern Institute for Advanced Study",Object Detection III,"Domain generalization for semantic segmentation is highly demanded in real applications, where a trained model is expected to work well in previously unseen domains. One challenge lies in the lack of data which could cover the diverse distributions of the possible unseen domains for training. In this paper, we propose a WEb-image assisted Domain GEneralization (WEDGE) scheme, which is the first to exploit the diversity of web-crawled images for generalizable semantic segmentation. To explore and exploit the real-world data distributions, we collect a web-crawled dataset which presents large diversity in terms of weather conditions, sites, lighting, camera styles, etc. We also present a method which injects the style representation of the web-crawled data into the source domain on-the-fly during training, which enables the network to experience images of diverse styles with reliable labels for effective training. Moreover, we use the web-crawled dataset with predicted pseudo labels for training to further enhance the capability of the network. Extensive experiments demonstrate that our method clearly outperforms existing domain generalization techniques." Incremental Few-Shot Object Detection Via Simple Fine-Tuning Approach,"Tae-min Choi, Jong-Hwan Kim","Korea Advanced Institute of Science and Technology,KAIST",Object Detection III,"In this paper, we explore incremental few-shot object detection (iFSD), which incrementally learns novel classes using only a few examples without revisiting base classes. Previous iFSD works achieved the desired results by applying meta-learning. 
However, meta-learning approaches show insufficient performance that is difficult to apply to practical problems. In this light, we propose a simple fine-tuning-based approach, the Incremental Two-stage Fine-tuning Approach (iTFA) for iFSD, which contains three steps: 1) base training using abundant base classes with the class-agnostic box regressor, 2) separation of the RoI feature extractor and classifier into the base and novel class branches for preserving base knowledge, and 3) fine-tuning the novel branch using only a few novel class examples. We evaluate our iTFA on the real-world datasets PASCAL VOC, COCO, and LVIS. iTFA achieves competitive performance in COCO and shows a 30% higher AP accuracy than meta-learning methods in the LVIS dataset. Experimental results show the effectiveness and applicability of our proposed method." Discriminative 3D Shape Modeling for Few-Shot Instance Segmentation,"Anoop Cherian, Siddarth Jain, Tim K. Marks, Alan Sullivan","Mitsubishi Electric Research Labs,Mitsubishi Electric Research Laboratories (MERL),Mitsubishi Electric Research Lab",Segmentation,"In this paper, we present a simple and efficient scheme for segmenting approximately-convex 3D object instances in depth images in a few-shot setting via discriminatively modeling the 3D shape of the object using a neural network. Our key idea is to select pairs of 3D points on the depth image between which we compute surface geodesics. As the number of such geodesics is quadratic in the number of image pixels, we can create a large training set of geodesics using only very limited ground truth instance annotations. These annotations are used to create a binary label for each geodesic, which indicates whether or not that geodesic belongs entirely to one instance segment. A neural network is then trained to classify the geodesics using these labels. 
During inference, we create geodesics from selected seed points in the test depth image, then produce a convex hull of the points that are classified by the neural network as belonging to the same instance, thereby achieving instance segmentation. We present experiments applying our method to segmenting instances of food items in real world depth images. Our results demonstrate promising performances compared to prior methods in both accuracy and computational efficiency." Multi-To-Single Knowledge Distillation for Point Cloud Semantic Segmentation,"Shoumeng Qiu, Feng Jiang, Haiqiang Zhang, Xiangyang Xue, Jian Pu","fudan,Fudan University,Beijing Institute of Technology",Segmentation,"3D point cloud semantic segmentation is one of the fundamental tasks for environmental understanding. Although significant progress has been made in recent years, the performance of classes with few examples or few points is still far from satisfactory. In this paper, we propose a novel multi-to-single knowledge distillation framework for the 3D point cloud semantic segmentation task to boost the performance of those hard classes. Instead of fusing all the points of multi-scans directly, only the instances that belong to the previously defined hard classes are fused. To effectively and sufficiently distill valuable knowledge from multi-scans, we leverage a multilevel distillation framework, i.e., feature representation distillation, logit distillation, and affinity distillation. We further develop a novel instance-aware affinity distillation algorithm for capturing high-level structural knowledge to enhance the distillation efficacy for hard classes. Finally, we conduct experiments on the SemanticKITTI dataset, and the results on both the validation and test sets demonstrate that our method yields substantial improvements compared with the baseline method. The code is available at https://github.com/skyshoumeng/M2SKD." 
On Improving Boundary Quality of Instance Segmentation in Cluttered and Chaotic Scenarios,"Biqi Yang, Xiaojie Gao, Xianzhi Li, Yunhui Liu, Chi-wing Fu, Pheng Ann Heng","The Chinese University of Hong Kong,Chinese University of Hong Kong",Segmentation,"Instance segmentation is a long-standing task for supporting robotic bin picking. However, objects of diverse classes can be closely packed with occlusions in cluttered and chaotic scenes, hence, even recent methods could have difficulty in locating clear and precise boundaries to distinguish nearby objects. In this work, we aim to improve the boundary quality of the instance masks for robust and precise instance segmentation in these challenging scenarios. Technical-wise, we first formulate an IoU-based Boundary-aware Mask head (IBM head) for predicting the instance-level mask, boundary, and their corresponding IoU scores. With this core module, we then follow the coarse-to-fine strategy and design our pipeline with two stages: an IoUNet to learn localization-based objectness cue and a hierarchical mask refiner to produce sharper and cleaner boundaries. We deploy the IBM head throughout the framework. Extensive experimental results on three grasping benchmarks manifest that our method attains the best instance segmentation performance, compared with the state-of-the-art approaches. Practically, we conduct real-world picking tests to show that with the objectness and boundary IoU scores as guidance, we are able to filter invalid (occluded) instances and select high-fidelity (exposed) instances for grasping." Real-Time Background Subtraction under Varying Lighting Conditions,"Sisi Liang, Darren Baker",CSIRO,Segmentation,"Background subtraction is an important topic in computer vision and video analysis. It is challenging to robustly segment foreground and background in complex scenarios. 
In the literature there are efforts to address some of the main challenges such as illumination change, dynamic backgrounds, hard shadows, and intermittent object motion. However, most of the research has focused on applying advanced mathematical and machine learning models rather than on improving performance in real-time applications. In this paper, we devise a method named EGMM to efficiently handle the illumination change problem and also operate at a real-time execution speed on commodity PC hardware. EGMM is an ensemble algorithm that fuses multiple Gaussian Mixture Models operating on gradient, texture and color features. Detection and removal of shadows is done using a chromaticity-based approach, and spatio-temporal history of foreground blobs is used to handle intermittent object motion. We benchmarked EGMM by creating datasets for two light change scenarios. The results demonstrate that EGMM achieves robust performance in complex illumination change cases, outperforms some state-of-the-art algorithms, and runs at 100 fps (GPU) at 1280 × 720 resolution. Moreover, experiments using the 2012 CDnet dataset show that EGMM achieves generally good performance in varying scenes with overall results better than conventional methods and runs at 1000 fps (GPU) at 320 × 240 resolution." Few-Shot 3D LiDAR Semantic Segmentation for Autonomous Driving,"Jilin Mei, Junbao Zhou, Yu Hu","Institute of Computing Technology, Chinese Academy of Sciences,Chinese Academy of Sciences,Institute of Computing Technology Chinese Academy of Sciences",Segmentation,"In autonomous driving, the novel objects and lack of annotations challenge the traditional 3D LiDAR semantic segmentation based on deep learning. Few-shot learning is a feasible way to solve these issues. However, currently few-shot semantic segmentation methods focus on camera data, and most of them only predict the novel classes without considering the base classes. 
This setting cannot be directly applied to autonomous driving due to safety concerns. Thus, we propose a few-shot 3D LiDAR semantic segmentation method that predicts both novel classes and base classes simultaneously. Our method tries to solve the background ambiguity problem in generalized few-shot semantic segmentation. We first review the original cross-entropy and knowledge distillation losses, then propose a new loss function that incorporates the background information to achieve 3D LiDAR few-shot semantic segmentation. Extensive experiments on SemanticKITTI demonstrate the effectiveness of our method." ERASE-Net: Efficient Segmentation Networks for Automotive Radar Signals,"Shihong Fang, Haoran Zhu, Devansh Bisla, Anna Choromanska, Satish Ravindran, Dongyin Ren, Ryan Wu","New York University,NYU,New York University Tandon School of Engineering,NXP,NXP Semiconductors",Segmentation,"Among various sensors for assisted and autonomous driving systems, automotive radar has been considered as a robust and low-cost solution even in adverse weather or lighting conditions. With the recent development of radar technologies and open-sourced annotated data sets, semantic segmentation with radar signals has become very promising. However, existing methods are either computationally expensive or discard significant amounts of valuable information from raw 3D radar signals by reducing them to 2D planes via averaging. In this work, we introduce ERASE-Net, an Efficient RAdar SEgmentation Network to segment the raw radar signals semantically. The core of our approach is the novel detect-then-segment method for raw radar signals. It first detects the center point of each object, then extracts a compact radar signal representation, and finally performs semantic segmentation. We show that our method can achieve superior performance on radar semantic segmentation task compared to the state-of-the-art (SOTA) technique. 
Furthermore, our approach requires up to 20× less computational resources. Finally, we show that the proposed ERASE-Net can be compressed by 40% without significant loss in performance, significantly more than the SOTA network, which makes it a more promising candidate for practical automotive applications." ConDA: Unsupervised Domain Adaptation for LiDAR Segmentation Via Regularized Domain Concatenation,"Lingdong Kong, Niamul Quader, Venice Erin Liong","National University of Singapore,Motional, Singapore,Motional",Segmentation,"Transferring knowledge learned from the labeled source domain to the raw target domain for unsupervised domain adaptation (UDA) is essential to the scalable deployment of autonomous driving systems. State-of-the-art methods in UDA often employ a key idea: utilizing joint supervision signals from both source and target domains for self-training. In this work, we improve and extend this aspect. We present ConDA, a concatenation-based domain adaptation framework for LiDAR segmentation that: 1) constructs an intermediate domain consisting of fine-grained interchange signals from both source and target domains without destabilizing the semantic coherency of objects and background around the ego-vehicle; and 2) utilizes the intermediate domain for self-training. To improve the network training on the source domain and self-training on the intermediate domain, we propose an anti-aliasing regularizer and an entropy aggregator to reduce the negative effect caused by the aliasing artifacts and noisy pseudo labels. Through extensive studies, we demonstrate that ConDA significantly outperforms prior arts in mitigating domain gaps." 
Viewer-Centred Surface Completion for Unsupervised Domain Adaptation in 3D Object Detection,"Darren Tsai, Julie Stephany Berrio Perez, Mao Shan, Eduardo Nebot, Stewart Worrall","University of Sydney, Australian Centre for Field Robotics,ACFR - The University of Sydney,The University of Sydney,Unversity of Sydney,University of Sydney",Segmentation,"Every autonomous driving dataset has a different configuration of sensors, originating from distinct geographic regions and covering various scenarios. As a result, 3D detectors tend to overfit the datasets they are trained on. This causes a drastic decrease in accuracy when the detectors are trained on one dataset and tested on another. We observe that lidar scan pattern differences form a large component of this reduction in performance. We address this in our approach, SEE-VCN, by designing a novel viewer-centred surface completion network (VCN) to complete the surfaces of objects of interest within an unsupervised domain adaptation framework, SEE. With SEE-VCN, we obtain a unified representation of objects across datasets, allowing the network to focus on learning geometry, rather than overfitting on scan patterns. By adopting a domain-invariant representation, SEE-VCN can be classed as a multi-target domain adaptation approach where no annotations or re-training is required to obtain 3D detections for new scan patterns. Through extensive experiments, we show that our approach outperforms previous domain adaptation methods in multiple domain adaptation settings. Our code and data are available at https://github.com/darrenjkt/SEE-VCN." 
Nerf2nerf: Pairwise Registration of Neural Radiance Fields,"Leili Goli, Daniel Rebain, Sara Sabour, Animesh Garg, Andrea Tagliasacchi","University of Toronto, Vector Institute,University of British Columbia,Google, University of Toronto,University of Toronto,Simon Fraser University",Radiance Fields,"We introduce a technique for pairwise registration of neural fields that extends classical optimization-based local registration (i.e. ICP) to operate on Neural Radiance Fields (NeRF) -- neural 3D scene representations trained from collections of calibrated images. NeRF does not decompose illumination and color, so to make registration invariant to illumination, we introduce the concept of a “surface field” -- a field distilled from a pre-trained NeRF model that measures the likelihood of a point being on the surface of an object. We then cast nerf2nerf registration as a robust optimization that iteratively seeks a rigid transformation that aligns the surface fields of the two scenes. We evaluate the effectiveness of our technique by introducing a dataset of pre-trained NeRF scenes -- our synthetic scenes enable quantitative evaluations and comparisons to classical registration techniques, while our real scenes demonstrate the validity of our technique in real-world scenarios. Additional results available at: https://nerf2nerf.github.io" NeRF2Real: Sim2real Transfer of Vision-Guided Bipedal Motion Skills Using Neural Radiance Fields,"Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, Tuomas Haarnoja, Ben Moran, Steven Bohez, Fereshteh Sadeghi, Bojan Vujatovic, Nicolas Heess","Google,DeepMind,Deepmind,University of Washington",Award Finalists 2,"We present a system for applying sim2real approaches to “in the wild” scenes with realistic visuals, and to policies which rely on active perception using RGB cameras. 
Given a short video of a static scene collected using a generic phone, we learn the scene's contact geometry and a function for novel view synthesis using a Neural Radiance Field (NeRF). We augment the NeRF rendering of the static scene by overlaying the rendering of other dynamic objects (e.g. the robot's own body, a ball). A simulation is then created using the rendering engine in a physics simulator which computes contact dynamics from the static scene geometry (estimated from the NeRF volume density) and the dynamic objects' geometry and physical properties (assumed known). We demonstrate that we can use this simulation to learn vision-based whole body navigation and ball pushing policies for a 20 degrees of freedom humanoid robot with an actuated head-mounted RGB camera, and we successfully transfer these policies to a real robot." Density-Aware NeRF Ensembles: Quantifying Predictive Uncertainty in Neural Radiance Fields,"Niko Suenderhauf, Dimity Miller, Jad Chakra",Queensland University of Technology,Radiance Fields,"We show that ensembling effectively quantifies model uncertainty in Neural Radiance Fields (NeRFs) if a density-aware epistemic uncertainty term is considered. The naive ensembles investigated in prior work simply average rendered RGB images to quantify the model uncertainty caused by conflicting explanations of the observed scene. In contrast, we additionally consider the termination probabilities along individual rays to identify epistemic model uncertainty due to a lack of knowledge about the parts of a scene unobserved during training. We achieve new state-of-the-art performance across established uncertainty quantification benchmarks for NeRFs, outperforming methods that require complex changes to the NeRF architecture and training regime. We furthermore demonstrate that NeRF uncertainty can be utilised for next-best view selection for model refinement." 
Parallel Inversion of Neural Radiance Fields for Robust Pose Estimation,"Yunzhi Lin, Thomas Müller, Jonathan Tremblay, Bowen Wen, Stephen Tyree, Alex Evans, Patricio A. Vela, Stan Birchfield","Georgia Institute of Technology,NVIDIA,Nvidia,NVIDIA Corporation",Radiance Fields,"We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we can predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fast NeRF model and pixels in the observed image. We integrate a momentum-based camera extrinsic optimization procedure into Instant Neural Graphics Primitives, a recent exceptionally fast NeRF implementation. By introducing parallel Monte Carlo sampling into the pose estimation task, our method overcomes local minima and improves efficiency in a more extensive search space. We also show the importance of adopting a more robust pixel-based loss function to reduce error. Experiments demonstrate that our method can achieve improved generalization and robustness on both synthetic and real-world benchmarks." NeRF-Loc: Visual Localization with Conditional Neural Radiance Field,"Jianlin Liu, Qiang Nie, Yong Liu, Chengjie Wang","Tencent,The Chinese University of Hong Kong,Tencent YouTuLab, Shanghai Jiao Tong University",Radiance Fields,"We propose a novel visual re-localization method based on direct matching between the implicit 3D descriptors and the 2D image with transformer. A conditional neural radiance field (NeRF) is chosen as the 3D scene representation in our pipeline, which supports continuous 3D descriptors generation and neural rendering. By unifying the feature matching and the scene coordinate regression to the same framework, our model learns both generalizable knowledge and scene prior respectively during two training stages. 
Furthermore, to improve the localization robustness when domain gap exists between training and testing phases, we propose an appearance adaptation layer to explicitly align styles between the 3D model and the query image. Experiments show that our method achieves higher localization accuracy than other learning-based approaches on multiple benchmarks. Codes will be released." Multimodal Neural Radiance Field,"Haidong Zhu, Yuyin Sun, Chi Liu, Lu Xia, Jiajia Luo, Nan Qiao, Ram Nevatia, Cheng-hao Kuo","University of Southern California,Amazon,University of Tennessee",Radiance Fields,"This paper addresses the challenge of reconstructing a scene with a neural radiance field (NeRF) for robot vision and scene understanding using multiple modalities. Researchers have introduced the use of NeRF to represent an object for synthesizing and rendering novel views of complex scenes by optimizing a 3-D radiance field for ray casting and rendering for 2-D RGB images. However, using RGB images alone introduces additional geometry ambiguities with transparent objects or complex scenes and cannot accurately depict the 3-D shapes. We discuss and solve this problem and use multiple modalities as input for the same NeRF model to build a multimodal NeRF by incorporating point clouds and infrared image supervision to prevent such bias. In contrast to RGB images, infrared images and point clouds are typically taken by separate cameras that cannot be aligned with the RGB camera. We further introduce the alignment of different modalities based on point cloud registration to estimate the relative transformation matrices between them before training a NeRF model with multiple modalities. We evaluate our model on chosen scenes from the ScanNet and M2DGR datasets and demonstrate that it outperforms existing state-of-the-art methods." 
Orbeez-SLAM: A Real-Time Monocular Visual SLAM with ORB Features and NeRF-Realized Mapping,"Chi-ming Chung, Yang-che Tseng, Ya-ching Hsu, Xiang-qian Shi, Yun-hung Hua, Jia-Fong Yeh, Yi-ting Chen, Wen-chin Chen, Winston Hsu","National Taiwan University,National Chiao Tung University",Radiance Fields,"A spatial AI that can perform complex tasks through visual signals and cooperate with humans is highly anticipated. To achieve this, we need a visual SLAM that easily adapts to new scenes without pre-training and generates dense maps for downstream tasks in real-time. None of the previous learning-based and non-learning-based visual SLAMs satisfy all needs due to the intrinsic limitations of their components. In this work, we develop a visual SLAM named Orbeez-SLAM, which successfully collaborates with implicit neural representation and visual odometry to achieve our goals. Moreover, Orbeez-SLAM can work with the monocular camera since it only needs RGB inputs, making it widely applicable to the real world. Results show that our SLAM is up to 800x faster than the strong baseline with superior rendering outcomes. Code link: https://github.com/MarvinChung/Orbeez-SLAM." NeRFing It: Offline Object Segmentation through Implicit Modeling,"Kenneth Blomqvist, Jen Jen Chung, Lionel Ott, Roland Siegwart","ETH Zurich,The University of Queensland",Radiance Fields,"Most recently proposed methods for robotic perception are based on deep learning, which require very large datasets to perform well. The accuracy of a learned model is mainly dependent on the data distribution it was trained on. Thus for deploying such models, it is crucial to use training data belonging to the robot’s environment. However, collecting and labeling data is a significant bottleneck, necessitating efficient data collection and labeling pipelines. This paper presents a method to compute high-quality object segmentation maps for RGB-D video sequences using minimal human labeling effort. 
We leverage the density learned by a Neural Radiance Field (NeRF) to infer the geometry of the scene, which we use to compute dense segmentation maps using a single 3D bounding box provided by a user. We study the accuracy of the computed segmentation maps and present a way to generate additional synthetic training examples observing the scene from novel viewpoints using the learned radiance fields. Our results show that our method is able to compute accurate segmentation maps, outperforming baseline and state-of-the-art methods. We also show that using the synthetic training examples improves performance on a downstream object detection task." Using Learning Curve Predictions to Learn from Incorrect Feedback,"Taylor Kessler Faulkner, Andrea Thomaz","University of Washington,University of Texas at Austin",Reinforcement Learning II,"Robots can incorporate data from human teachers when learning new tasks. However, this data can often be noisy, which can cause robots to learn slowly or not at all. One method for learning from human teachers is Human-in-the-loop Reinforcement Learning (HRL), which can combine information from both an environmental reward and external feedback from human teachers. However, many HRL methods assume near-perfect information from teachers or must know the skill level of each teacher before starting the learning process. Our algorithm, Classification for Learning Erroneous Assessments using Rewards (CLEAR), is a feedback filter for Reinforcement Learning (RL) algorithms, enabling learning agents to learn from imperfect teachers without prior modeling. CLEAR is able to determine whether human feedback is correct based on observations of the RL learning curve. Our results suggest that CLEAR improves the quality of human feedback --- from 57.5% to 65% correct in a human study --- and performs more reliably than baselines by matching or outperforming RL without human teachers in all tested cases." 
Conflict-Constrained Multi-Agent Reinforcement Learning Method for Parking Trajectory Planning,"Siyuan Chen, Meiling Wang, Yi Yang, Wenjie Song",Beijing Institute of Technology,Reinforcement Learning II,"Automated Valet Parking (AVP) has been extensively researched as an important application of autonomous driving. Considering the high dynamics and density of real parking lots, a system that considers multiple vehicles simultaneously is more robust and efficient than a single vehicle setting as in most studies. In this paper, we propose a distributed Multi-agent Reinforcement Learning (MARL) method for coordinating multiple vehicles in the framework of an AVP system. This method utilizes traditional trajectory planning to accelerate the learning process and introduces collision conflict constraints for policy optimization to mitigate the path conflict problem. In contrast to other centralized multi-agent path finding methods, the proposed approach is scalable, distributed, and adapts to dynamic stochastic scenarios. We train the models in random scenarios and validate in several artificially designed complex parking scenarios where vehicles are always disturbed by dynamic and static obstacles. Experimental results show that our approach mitigates path conflicts and excels in terms of success rate and efficiency." Improving Robot Navigation in Crowded Environments Using Intrinsic Rewards,"Diego Martinez Baselga, Luis Riazuelo, Luis Montano Gella","University of Zaragoza,Instituto de Investigación en Ingeniería de Aragón,University of Z,Universidad de Zaragoza",Reinforcement Learning II,"Autonomous navigation in crowded environments is an open problem with many applications, essential for the coexistence of robots and humans in the smart cities of the future. In recent years, deep reinforcement learning approaches have proven to outperform model-based algorithms. 
Nevertheless, even though the results provided are promising, the works are not able to take advantage of the capabilities that their models offer. They usually get trapped in local optima in the training process, that prevent them from learning the optimal policy. They are not able to visit and interact with every possible state appropriately, such as with the states near the goal or near the dynamic obstacles. In this work, we propose using intrinsic rewards to balance between exploration and exploitation and explore depending on the uncertainty of the states instead of on the time the agent has been trained, encouraging the agent to get more curious about unknown states. We explain the benefits of the approach and compare it with other exploration algorithms that may be used for crowd navigation. Many simulation experiments are performed modifying several algorithms of the state-of-the-art, showing that the use of intrinsic rewards makes the robot learn faster and reach higher rewards and success rates (fewer collisions) in shorter navigation times, outperforming the state-of-the-art." Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers,"Yan Wang, Gautham Vasan, Rupam Mahmood",University of Alberta,Reinforcement Learning II,"Real-time learning is crucial for robotic agents adapting to ever-changing, non-stationary environments. A common setup for a robotic agent is to have two different computers simultaneously: a resource-limited local computer tethered to the robot and a powerful remote computer connected wirelessly. Given such a setup, it is unclear to what extent the performance of a learning system can be affected by resource limitation and how to efficiently use the wirelessly connected workstation to compensate for any performance loss. 
In this paper, we implement a real-time learning system called the Remote-Local Distributed (ReLoD) system to distribute computations of two deep reinforcement learning (RL) algorithms, Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO), between a local and a remote computer. The performance of the system is evaluated on two vision-based control tasks developed using a robotic arm and a mobile robot. Our results show that the performance of SAC degrades heavily on a resource-limited local computer. To our surprise, deploying all computations of the learning system, instead, on a powerful remote workstation fails to improve its performance, which indicates that without careful consideration, using a powerful remote computer may not result in performance improvement. However, a carefully chosen distribution of computations of SAC consistently and substantially improves its performance on both tasks. On the other hand, the performance of PPO remains largely unaffected by the distribution of computations. In addition, when all computations happen solely on a powerful tethered computer, the performance of our system remains on par with an existing system that is well-tuned for using a single machine. ReLoD is the only publicly available system for real-time RL that applies to multiple robots for vision-based tasks. The source code can be found at https://github.com/rlai-lab/relod" Reinforcement Learning for Safe Robot Control Using Control Lyapunov Barrier Functions,"Desong Du, Shaohang Han, Naiming Qi, Haitham Bou Ammar, Jun Wang, Wei Pan","Harbin Institute of Technology,Delft University of Technology,Princeton University,University College London",Reinforcement Learning II,"Reinforcement learning (RL) exhibits impressive performance when managing complicated control tasks for robots. However, its wide application to physical robots is limited by the absence of strong safety guarantees. 
To overcome this challenge, this paper explores the control Lyapunov barrier function (CLBF) to analyze the safety and reachability solely based on data without explicitly employing a dynamic model. We also proposed the Lyapunov barrier actor-critic (LBAC), a model-free RL algorithm, to search for a controller that satisfies the data-based approximation of the safety and reachability conditions. The proposed approach is demonstrated through simulation and real-world robot control experiments, i.e., a 2D quadrotor navigation task. The experimental findings reveal this approach's effectiveness in reachability and safety, surpassing other model-free RL methods." "Safe Reinforcement Learning of Dynamic High-Dimensional Robotic Tasks: Navigation, Manipulation, Interaction","Puze Liu, Kuo Zhang, Davide Tateo, Snehal Jauhri, Zhiyuan Hu, Jan Peters, Georgia Chalvatzaki","Technische Universität Darmstadt,TU-Darmstadt,TU Darmstadt,Technical University of Darmstadt,Technische Universität Darmastadt",Reinforcement Learning II,"Safety is a fundamental property for the real-world deployment of robotic platforms. Any control policy should avoid dangerous actions that could harm the environment, humans, or the robot itself. In reinforcement learning (RL), safety is crucial when exploring a new environment to learn a new skill. This paper introduces a new formulation of safe exploration for robotic RL in the tangent space of the constraint manifold that effectively transforms the action space of the RL agent for always respecting safety constraints locally. We show how to apply this approach to a wide range of robotic platforms and how to define safety constraints that represent dynamic articulated objects like humans in the context of robotic RL. Our proposed approach achieves state-of-the-art performance in simulated high-dimensional and dynamic tasks while avoiding collisions with the environment. 
We show safe real-world deployment of our learned controller on a TIAGo++ robot, achieving remarkable performance in manipulation and human-robot interaction tasks." Robotic Control Using Model Based Meta Adaption,"Karam Daaboul, Joel Ikels, Johann Marius Zöllner","Karlsruhe Institut for Technology,Karlsruhe Insitute of Technology,FZI Forschungszentrum Informatik",Reinforcement Learning II,"In machine learning, meta-learning methods aim for fast adaptability to unknown tasks using prior knowledge. Model-based meta-reinforcement learning combines reinforcement learning via world models with Meta Reinforcement Learning (MRL) for increased sample efficiency. However, adaption to unknown tasks does not always result in preferable agent behavior. This paper introduces a new Meta Adaptation Controller (MAC) that employs MRL to apply a preferred robot behavior from one task to many similar tasks. To do this, MAC aims to find actions an agent has to take in a new task to reach a similar outcome as in a learned task. As a result, the agent will adapt quickly to the change in the dynamic and behave appropriately without the need to construct a reward function that enforces the preferred behavior." SACPlanner: Real-World Collision Avoidance with a Soft Actor Critic Local Planner and Polar State Representations,"Khaled Nakhleh, Minahil Raza, Mack Tang, Matthew Andrews, Rinu Boney, Ilija Hadzic, Jeongran Lee, Atefeh Mohajeri, Karina Palyutina","Texas A&M University,Nokia Bell Labs,Aalto University",Reinforcement Learning II,"We study the training performance of ROS local planners based on Reinforcement Learning (RL), and the trajectories they produce on real-world robots. We show that recent enhancements to the Soft Actor Critic (SAC) algorithm such as RAD and DrQ achieve almost perfect training after only 10000 episodes. We also observe that on real-world robots the resulting SACPlanner is more reactive to obstacles than traditional ROS local planners such as DWA." 
Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation,"Xingyu Zhu, Xin Wang, Jonathan Freer, Hyung Jin Chang, Yixing Gao","JiLin University,Jilin University,University of Birmingham",Deep Learning Methods,"Clothes grasping and unfolding is a core step in robotic-assisted dressing. Most existing works leverage depth images of clothes to train a deep learning-based model to recognize suitable grasping points. These methods often utilize physics engines to synthesize depth images to reduce the cost of real labeled data collection. However, the natural domain gap between synthetic and real images often leads to poor performance of these methods on real data. Furthermore, these approaches often struggle in scenarios where grasping points are occluded by the clothing item itself. To address the above challenges, we propose a novel Bi-directional Fractal Cross Fusion Network (BiFCNet) for semantic segmentation, enabling recognition of graspable regions in order to provide more possibilities for grasping. Instead of using depth images only, we also utilize RGB images with rich color features as input to our network in which the Fractal Cross Fusion (FCF) module fuses RGB and depth data by considering global complex features based on fractal geometry. To reduce the cost of real data collection, we further propose a data augmentation method based on an adversarial strategy, in which the color and geometric transformations simultaneously process RGB and depth data while maintaining the label correspondence. Finally, we present a pipeline for clothes grasping and unfolding from the perspective of semantic segmentation, through the addition of a strategy for grasp point selection from segmentation regions based on clothing flatness measures, while taking into account the grasping direction. We evaluate our BiFCNet on the public dataset NYUDv2 and obtained comparable performance to current state-of-the-art models. 
We also deploy our model on a Baxter robot, running extensive grasping and unfolding experiments as part of our ablation studies, achieving an 84% success rate." Privacy-Preserving Video Conferencing Via Thermal-Generative Images,"Sheng-yang Chiu, Yu-Ting Huang, Chieh-ting Lin, Yu-chee Tseng, Jen-jee Chen, Meng-hsuan Tu, Bo-chen Tung, Yujou Nieh","National Yang Ming Chiao Tung University,NYCU,NYCU, National Yang Ming Chiao Tung University",Deep Learning Methods,"Due to the COVID-19 epidemic, video conferencing has evolved as a new paradigm of communication and teamwork. However, private and personal information can be easily leaked through cameras during video conferencing. This includes leakage of a person's appearance as well as the contents in the background. This paper proposes a novel way of using online low-resolution thermal images as conditions to guide the synthesis of RGB images, bringing a promising solution for real-time video conferencing when privacy leakage is a concern. SPADE-SR (Spatially-Adaptive De-normalization with Self Resampling), a variant of SPADE, is adopted to incorporate the spatial property of a thermal heatmap and the non-thermal property of a normal, privacy-free pre-recorded RGB image provided in a form of latent code. We create a PAIR-LRT-Human (LRT = Low-Resolution Thermal) dataset to validate our claims. The result enables a convenient way of video conferencing where users no longer need to groom themselves and tidy up backgrounds for a short meeting. Additionally, it allows a user to switch to a different appearance and background during a conference." Streaming LifeLong Learning with Any-Time Inference,"Soumya Banerjee, Vinay Kumar Verma, Vinay Namboodiri","IIT Kanpur,University of Bath",Deep Learning Methods,"Despite rapid advancements in the lifelong learning (LL) research, a large body of research mainly focuses on improving the performance in the existing static continual learning (CL) setups. 
These methods lack the ability to succeed in a rapidly changing dynamic environment, where an AI agent needs to quickly learn new instances in a 'single pass' from the non-i.i.d (also possibly temporally contiguous/coherent) data streams without suffering from catastrophic forgetting. For practical applicability, we propose a novel lifelong learning approach, which is streaming, i.e., a single input sample arrives in each time step. Moreover, the proposed approach is single pass, class-incremental, and is subject to be evaluated at any moment. To address this challenging setup and various evaluation protocols, we propose a Bayesian framework, that enables fast parameter update, given a single training example, and enables any-time inference. We additionally propose an implicit regularizer in the form of snap-shot self-distillation, which effectively minimizes the forgetting further. We further propose an effective method that efficiently selects a subset of samples for online memory rehearsal and employs a new replay buffer management scheme that significantly boosts the overall performance. Our empirical evaluations and ablations demonstrate that the proposed method outperforms the prior works by large margins." Code as Policies: Language Model Programs for Embodied Control,"Jacky Liang, Wenlong Huang, Fei Xia, Peng Xu, Karol Hausman, Brian Ichter, Peter Florence, Andy Zeng","Google,UC Berkeley,Google Inc,Google Brain,MIT",Award Finalists 2,"Large language models (LLMs) trained on code completion have been shown to be capable of synthesizing simple Python programs from docstrings [1]. We find that these codewriting LLMs can be re-purposed to write robot policy code, given natural language commands. Specifically, policy code can express functions or feedback loops that process perception outputs (e.g., from object detectors [2], [3]) and parameterize control primitive APIs. 
When provided as input several example language commands (formatted as comments) followed by corresponding policy code (via few-shot prompting), LLMs can take in new commands and autonomously re-compose API calls to generate new policy code respectively. By chaining classic logic structures and referencing third-party libraries (e.g., NumPy, Shapely) to perform arithmetic, LLMs used in this way can write robot policies that (i) exhibit spatial-geometric reasoning, (ii) generalize to new instructions, and (iii) prescribe precise values (e.g., velocities) to ambiguous descriptions (“faster”) depending on context (i.e., behavioral commonsense). This paper presents code as policies: a robot-centric formulation of language model generated programs (LMPs) that can represent reactive policies (e.g., impedance controllers), as well as waypoint-based policies (vision-based pick and place, trajectory-based control), demonstrated across multiple real robot platforms. Central to our approach is prompting hierarchical code-gen (recursively defining undefined functions), which can write more complex code and also improves state-of-the-art to solve 39.8% of problems on the HumanEval [1] benchmark. Code and videos are available at https://code-as-policies.github.io" Learning Sim-To-Real Dense Object Descriptors for Robotic Manipulation,"Hoang-Giang Cao, Weihao Zeng, 毅成 吳","National Yang Ming Chiao Tung University,NYCU,National Chiao Tung University",Representation Learning,"It is crucial to address the following issues for ubiquitous robotics manipulation applications: (a) vision-based manipulation tasks require the robot to visually learn and understand the object with rich information like dense object descriptors; and (b) sim-to-real transfer in robotics aims to close the gap between simulated and real data. 
In this paper, we present Sim-to-Real Dense Object Nets (SRDONs), a dense object descriptor that not only understands the object via appropriate representation but also maps simulated and real data to a unified feature space with pixel consistency. We proposed an object-to-object matching method for image pairs from different scenes and different domains. This method helps reduce the effort of training data from real-world by taking advantage of public datasets, such as GraspNet. With sim-to-real object representation consistency, our SRDONs can serve as a building block for a variety of sim-to-real manipulation tasks. We demonstrate in experiments that pre-trained SRDONs significantly improve performances on unseen objects and unseen visual environments for various robotic tasks with zero real-world training." Learning Visual-Audio Representations for Voice-Controlled Robots,"Peixin Chang, Shuijing Liu, D. Livingston Mcpherson, Katherine Driggs-Campbell","University of Illinois at Urbana Champaign,University of Illinois,University of Illinois at Urbana-Champaign",Representation Learning,"Based on the recent advancements in representation learning, we propose a novel pipeline for task-oriented voice-controlled robots with raw sensor inputs. Previous methods rely on a large number of labels and task-specific reward functions. Not only can such an approach hardly be improved after the deployment, but also has limited generalization across robotic platforms and tasks. To address these problems, our pipeline first learns a visual-audio representation (VAR) that associates images and sound commands. Then the robot learns to fulfill the sound command via reinforcement learning using the reward generated by the VAR. We demonstrate our approach with various sound types, robots, and tasks. We show that our method outperforms previous work with much fewer labels. 
We show in both the simulated and real-world experiments that the system can self-improve in previously unseen scenarios given a reasonable number of newly labeled data." Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations,"Negin Heravi, Ayzaan Wahid, Corey Lynch, Peter Florence, Travis Armstrong, Jonathan Tompson, Pierre Sermanet, Jeannette Bohg, Debidatta Dwibedi","Stanford University,Google,Google Brain,MIT",Representation Learning,"Perceptual understanding of the scene and the relationship between its different components is important for successful completion of robotic tasks. Representation learning has been shown to be a powerful technique for this, but most of the current methodologies learn task specific representations that do not necessarily transfer well to other tasks. Furthermore, representations learned by supervised methods require large, labeled datasets for each task that are expensive to collect in the real-world. Using self-supervised learning to obtain representations from unlabeled data can mitigate this problem. However, current self-supervised representation learning methods are mostly object agnostic, and we demonstrate that the resulting representations are insufficient for general purpose robotics tasks as they fail to capture the complexity of scenes with many components. In this paper, we show the effectiveness of using object-aware representation learning techniques for robotic tasks. Our self-supervised representations are learned by observing the agent freely interacting with different parts of the environment and are queried in two different settings: (i) policy learning and (ii) object location prediction. We show that our model learns control policies in a sample-efficient manner and outperforms state-of-the-art object agnostic techniques as well as methods trained on raw RGB images. 
Our results show a 20% increase in performance in low data regimes (1000 trajectories) in policy training using implicit behavioral cloning (IBC). Furthermore, our method outperforms the baselines for the task of object localization in multi-object scenes." Sample-Efficient Goal-Conditioned Reinforcement Learning Via Predictive Information Bottleneck for Goal Representation Learning,"Qiming Zou, Einoshin Suzuki",Kyushu University,Representation Learning,"We propose Predictive Information bottleneck for Goal representation learning (PI-Goal), a self-supervised method for sample-efficient goal-conditioned reinforcement learning (RL). Goal-conditioned RL learns to reach commanded goals with reward signals. A goal could be given in a noisy or abstract form, and thus jeopardizes sample efficiency. Previous methods usually assume that the agent can map a state to an achievable goal. In this work, we consider a setting in which the goal space is unknown to the agent and the agent cannot recognize a goal in a specific state (referred to as a goal state) until the goal is commanded. Our PI-Goal learns a goal representation which contains only the predictive information of a goal state, i.e., the mutual information between a current state and a future state, and guarantees the optimality of the learned policy. Experimental results show that PI-Goal consistently outperforms the baseline methods in tasks with unknown goal spaces, e.g., object manipulation, object search, and embodied question answering." Context-Aware Robot Control Using Gesture Episodes,"Petr Vanc, Jan Kristof Behrens, Karla Stepanova","CIIRC, Czech Technical University in Prague,Czech Technical University,Czech Technical university",Learning from Experience,"Collaborative robots became a popular tool for increasing productivity in partly automated manufacturing plants. Intuitive robot teaching methods are required to quickly and flexibly adapt the robot programs to new tasks. 
Gestures have an essential role in human communication. However, in human-robot-interaction scenarios, gesture-based user interfaces are so far used rarely, and when they are, they employ a one-to-one mapping of gestures to robot control variables. In this paper, we propose a method that infers the user’s intent based on gesture episodes, the context of the situation, and common sense. The approach is evaluated in a simulated table-top manipulation setting. We conduct deterministic experiments with simulated users and show that the system can even handle the personal preferences of each user." Automated Action Evaluation for Robotic Imitation Learning Via Siamese Neural Networks,"Xiang Chang, Fei Chao, Changjing Shang, Qiang Shen","Aberystwyth University,Xiamen University",Learning from Experience,"Despite recent advances in video-guided robotic imitation learning, many methods still rely on human experts to provide sparse rewards that indicate whether robots have successfully completed tasks. The challenge of enabling robots to autonomously evaluate whether their actions can complete complex, multi-stage tasks remains unresolved. In this work, we propose an efficient few-shot robotic learning algorithm that centres around learning and evaluating from a third-person perspective to address the aforementioned challenge. We develop a novel Siamese neural network-based robotic action-state evaluation system, named “Behavior-Outcome Dual Assessment” (BODA), in our robotic imitation learning system, so as to replace artificial evaluations from human experts in multi-stage imitation learning processes and to improve learning efficiency. In this way, one video demonstration of a target task is divided into several stages. For each stage, we design two Siamese neural network-based evaluation modules in BODA: One module focuses on action changes, and the other handles working environment changes. 
The two modules work together to provide a comprehensive assessment of the robot's completion of each stage from the view of both the action and working environment changes. Then, BODA is integrated within a model-based reinforcement learning framework to enable the completion of our imitation learning cycle. Extensive experiments demonstrate that the evaluation processes of BODA can automatically and accurately evaluate task completion status without human intervention. In contrast to conventional methods, BODA is able to keep the accumulation of errors within acceptable limits through self-assessment in stages." Failure-Aware Policy Learning for Self-Assessable Robotics Tasks,"Kechun Xu, Runjian Chen, Shuqi Zhao, Zizhang Li, Hongxiang Yu, Ci Chen, Yue Wang, Rong Xiong",Zhejiang University,Learning from Experience,"Self-assessment rules play an essential role in safe and effective real-world robotic applications, which verify the feasibility of the selected action before actual execution. But how to utilize the self-assessment results to re-choose actions remains a challenge. Previous methods eliminate the selected action evaluated as failed by the self-assessment rules, and re-choose one with the next-highest affordance (i.e. process-of-elimination strategy [1]), which ignores the dependency between the self-assessment results and the remaining untried actions. However, this dependency is important since the previous failures might help trim the remaining over-estimated actions. In this paper, we set to investigate this dependency by learning a failure-aware policy. We propose two architectures for the failure-aware policy by representing the self-assessment results of previous failures as the variable state, and leveraging recurrent neural networks to implicitly memorize the previous failures. Experiments conducted on three tasks demonstrate that our method can achieve better performances with higher task success rates by less trials. 
Moreover, when the actions are correlated, learning a failure-aware policy can achieve better performance than the process-of-elimination strategy." Multimodal Time Series Learning of Robots Based on Distributed and Integrated Modalities: Verification with a Simulator and Actual Robots,"Hideyuki Ichiwara, Hiroshi Ito, Kenjiro Yamamoto, Hiroki Mori, Tetsuya Ogata","Hitachi, Ltd. / Waseda University,Hitachi, Ltd.,Waseda University",Learning from Experience,"We have developed an autonomous robot motion generation model based on distributed and integrated multimodal learning. Since each modality used as a robot's senses, such as image, joint angle, and torque, has a different physical meaning and time characteristic, the generation of autonomous motions using multimodal learning has sometimes failed due to overlearning in one of the modalities. Inspired by the sensory processing of the human brain, our model is based on the processing of each sense performed in the primary somatosensory cortex and the integrated processing of multiple senses in the association cortex and the primary motor cortex. Specifically, the proposed model utilizes two types of recurrent neural networks: sensory RNNs, which learn each sense in a time series, and a union RNN, which communicates with sensory RNNs and learns sensory integration. The simulation results of multiple tasks showed that our model processes multiple modalities appropriately and generates smoother motions with lower jerk than the conventional model. We also demonstrated a chair assembly task by combining fixed motions and autonomous motions with our model." Using Memory-Based Learning to Solve Tasks with State-Action Constraints,"Mrinal Verghese, Christopher Atkeson","Carnegie Mellon University,CMU",Learning from Experience,"Tasks where the set of possible actions depend discontinuously on the state pose a significant challenge for current reinforcement learning algorithms. 
For example, a locked door must be first unlocked, and then the handle turned before the door can be opened. The sequential nature of these tasks makes obtaining final rewards difficult, and transferring information between task variants using continuous learned values such as weights rather than discrete symbols can be inefficient. Our key insight is that agents that act and think symbolically are often more effective in dealing with these tasks." Structured Motion Generation with Predictive Learning: Proposing Subgoal for Long-Horizon Manipulation,"Namiko Saito, Joao Moura, Tetsuya Ogata, Marina Y Aoyama, Shingo Murata, Shigeki Sugano, Sethu Vijayakumar","University of Edinburgh,Waseda University,Keio University",Learning from Experience,"For assisting humans in their daily lives, robots need to perform long-horizon tasks, such as tidying up a room or preparing a meal. One effective strategy for handling a long-horizon task is to break it down into short-horizon subgoals, that the robot can execute sequentially. In this paper, we propose extending a predictive learning model using deep neural networks (DNN) with a Subgoal Proposal Module (SPM), with the goal of making such tasks realizable. We evaluate our proposed model in a case-study of a long-horizon task, consisting of cutting and arranging a pizza. This task requires the robot to consider: (1) the order of the subtasks, (2) multiple subtask selection, (3) coordination of dual-arm, and (4) variations within a subtask. The results confirm that the model is able to generalize motion generation to unseen tools and objects arrangement combinations. Furthermore, it significantly reduces the prediction error of the generated motions compared to without the proposed SPM. Finally, we validate the generated motions on the dual-arm robot Nextage Open." 
Sequence-Agnostic Multi-Object Navigation,"Gireesh Nandiraju, Ayush Agrawal, Ahana Datta, Snehasis Banerjee, Mohan Sridharan, Brojeshwar Bhowmick, Madhava Krishna","IIIT Hyderabad,Robotics Research Center, IIIT Hyderabad,International Institute of Information Technology, Hyderabad,TCS Research,University of Birmingham,Tata Consultancy Services",Learning from Experience,"The Multi-Object Navigation (MultiON) task requires a robot to localize an instance (each) of multiple object classes. It is a fundamental task for an assistive robot in a home or a factory. Existing methods for MultiON have viewed this as a direct extension of Object Navigation (ON), the task of localising an instance of one object class, and are pre-sequenced, i.e., the sequence in which the object classes are to be explored is provided in advance. This is a strong limitation in practical applications characterized by dynamic changes. This paper describes a deep reinforcement learning framework for sequence-agnostic MultiON based on an actor-critic architecture and a suitable reward specification. Our framework leverages past experiences and seeks to reward progress toward individual as well as multiple target object classes. We use photo-realistic scenes from the Gibson benchmark dataset in the AI Habitat 3D simulation environment to experimentally show that our method performs better than a pre-sequenced approach and a state of the art ON method extended to MultiON." Occlusion Reasoning for Skeleton Extraction of Self-Occluded Tree Canopies,"Chung Hee Kim, George Kantor",Carnegie Mellon University,Award Finalists 4,"In this work, we present a method to extract the skeleton of a self-occluded tree canopy by estimating the unobserved structures of the tree. A tree skeleton compactly describes the topological structure and contains useful information such as branch geometry, positions and hierarchy. 
This can be critical to planning contact interactions for agricultural manipulation, yet is difficult to gain due to occlusion by leaves, fruits and other branches. Our method uses an instance segmentation network to detect visible trunk, branches, and twigs. Then, based on the observed tree structures, we build a custom 3D likelihood map in the form of an occupancy grid to hypothesize on the presence of occluded skeletons through a series of minimum cost path searches. We show that our method outperforms baseline methods in highly occluded scenes, demonstrated through a set of experiments on a synthetic tree dataset. Qualitative results are also presented on a real tree dataset collected from the field." Statistical Shape Representations for Temporal Registration of Plant Components in 3D,"Karoline Heiwolt, Cengiz Öztireli, Grzegorz Cielniak","University of Lincoln,ETH Zurich",Agricultural Robotics and Automation I,"Plants are dynamic organisms and understanding temporal variations in vegetation is an essential problem for robots in the wild. However, associating repeated 3D scans of plants across time is challenging. A key step in this process is re-identifying and tracking the same individual plant components over time. Previously, this has been achieved by comparing their global spatial or topological location. In this work, we demonstrate how using shape features improves temporal organ matching. We present a landmark-free shape compression algorithm, which allows for the extraction of 3D shape features of leaves, characterises leaf shape and curvature efficiently in few parameters, and makes the association of individual leaves in feature space possible. The approach combines 3D contour extraction and further compression using Principal Component Analysis (PCA) to produce a shape space encoding, which is entirely learned from data and retains information about edge contours and 3D curvature. 
Our evaluation on temporal scan sequences of tomato plants shows that incorporating shape features improves temporal leaf-matching. A combination of shape, location, and rotation information proves most informative for recognition of leaves over time and yields a true positive rate of 75%, a 15% improvement on state-of-the-art methods. This is essential for robotic crop monitoring, which enables whole-of-lifecycle phenotyping." 3D Reconstruction-Based Seed Counting of Sorghum Panicles for Agricultural Inspection,"Harry Freeman, Eric Schneider, Chung Hee Kim, Moonyoung Lee, George Kantor",Carnegie Mellon University,Agricultural Robotics and Automation I,"In this paper, we present a method for creating high-quality 3D models of sorghum panicles for phenotyping in breeding experiments. This is achieved with a novel reconstruction approach that uses seeds as semantic landmarks in both 2D and 3D. To evaluate the performance, we develop a new metric for assessing the quality of reconstructed point clouds without ground-truth. Finally, a counting method is presented where the density of seed centers in the 3D model allows 2D counts from multiple views to be effectively combined into a whole-panicle count. We demonstrate that using this method to estimate seed count and weight for sorghum outperforms count extrapolation from 2D images, an approach used in most state of the art methods for seeds and grains of comparable size." "Hierarchical Approach for Joint Semantic, Plant Instance, and Leaf Instance Segmentation in the Agricultural Domain","Gianmarco Roggiolani, Matteo Sodano, Tiziano Guadagnino, Federico Magistri, Jens Behley, Cyrill Stachniss","University of Bonn,Photogrammetry and Robotics Lab, University of Bonn,Sapienza University of Rome",Agricultural Robotics and Automation I,"Plant phenotyping is a central task in agriculture, as it describes plants’ growth stage, development, and other relevant quantities. 
Robots can help automate this process by accurately estimating plant traits such as the number of leaves, leaf area, and the plant size. In this paper, we address the problem of joint semantic, plant instance, and leaf instance segmentation of crop fields from RGB data. We propose a single convolutional neural network that addresses the three tasks simultaneously, exploiting their underlying hierarchical structure. We introduce task-specific skip connections, which our experimental evaluation proves to be more beneficial than the usual schemes. We also propose a novel automatic post-processing, which explicitly addresses the problem of spatially close instances, common in the agricultural domain because of overlapping leaves. Our architecture simultaneously tackles these problems jointly in the agricultural context. Previous works either focus on plant or leaf segmentation, or do not optimise for semantic segmentation. Results show that our system has superior performance compared to state-of-the-art approaches, while having a reduced number of parameters and is operating at camera frame rate." Target-Aware Implicit Mapping for Agricultural Crop Inspection,"Shane Kelly, Alessandro Riccardi, Elias Ariel Marks, Federico Magistri, Tiziano Guadagnino, Margarita Chli, Cyrill Stachniss","ETH Zurich,University of Bonn,Sapienza University of Rome",Award Finalists 2,"Crop inspection is a critical part of modern agricultural practices that helps farmers assess the current status of a field and then make crop management decisions. Current crop inspection methods are labour-intensive tasks, which makes them rather slow and expensive to apply. In this paper, we exploit recent advancements in implicit mapping to tackle the challenging context of agricultural environments to create dense maps of crop rows with high enough fidelity to be useful for automated crop inspection. 
Specifically, we map strawberry and sweet pepper crop rows using RGB images captured by a wheeled mobile field robot inside a greenhouse and then use this data to build 3D maps to document the development of plants and fruits. Our Target-Aware Implicit Mapping system (TAIM) uses a SLAM-based pose initialization strategy for robust pose convergence, an efficient information-guided training sample selection framework for faster loss reduction, and focuses on exploiting training samples for fruit regions of the scene, which are critical for crop inspection tasks, to create more accurate maps in less time." Robust Plant Localization and Phenotyping in Dense 3D Point Clouds for Precision Agriculture,"Henry J. Nelson, Christopher Smith, Athanasios Bacharis, Nikos Papanikolopoulos","University of Minnesota,Lake Superior State University",Agricultural Robotics and Automation I,"The determination of a crop’s growth-stage is critical information for precision agriculture. Estimates of the growth stage are used to guide irrigation and the application of agrochemicals. Of particular importance is the use of fertilizers, however, growth-stage estimates may also suggest further investigation of potential crop infections and infestations. Traditionally, the growth-stage is based upon a manual random sample of a very small number of plants that are then analyzed to produce an estimate for the entire crop (up to thousands of acres). To increase the sample size (and thus accuracy) and to enable precision agriculture to address non-uniform crop development across a field, we present an analysis methodology that facilitates the automated growth stage analysis of dense point clouds that are derived from drone imagery. Our method utilizes a standard camera drone and does not use specialized sensors or geo-spatial tagging. We propose a multi-stage unsupervised method, which provides information about the individual plant locations in a field plot with a high probability. 
The method also produces a measure of individual plant heights, which along with their location are critical for later growth-stage estimation and necessary for robotic precision application. We confirm our method’s efficacy with experimental results on corn fields in Minnesota." Neural-Kalman GNSS/INS Navigation for Precision Agriculture,"Yayun Du, Swapnil Sayan Saha, Sandeep Sandha, Arthur Lovekin, Jason Wu, S. Siddharth, Mahesh Chowdhary, Mohammad Khalid Jawed, Mani Srivastava","University of California, Los Angeles,University of California - Los Angeles,STMicroelectronics,UCLA",Agricultural Robotics and Automation I,"Precision agricultural robots require high-resolution navigation solutions. In this paper, we introduce a robust neural-inertial sequence learning approach to track such robots with ultra-intermittent GNSS updates. First, we propose an ultra-lightweight neural-Kalman filter that can track agricultural robots within 1.4 m (1.4 - 5.8x better than competing techniques), while tracking within 2.75 m with 20 mins of GPS outage. Second, we introduce a user-friendly video-processing toolbox to generate high-resolution (+-5 cm) position data for fine-tuning pre-trained neural-inertial models in the field. Third, we introduce the first and largest (6.5 hours, 4.5 km, 3 phases) public neural-inertial navigation dataset for precision agricultural robots. The dataset, toolbox, and code are available at: https://github.com/nesl/agrobot." Fruit Tracking Over Time Using High-Precision Point Clouds,"Alessandro Riccardi, Shane Kelly, Elias Ariel Marks, Federico Magistri, Tiziano Guadagnino, Jens Behley, Maren Bennewitz, Cyrill Stachniss","University of Bonn,ETH Zurich,Sapienza University of Rome",Agricultural Robotics and Automation I,"Monitoring the traits of plants and fruits is a fundamental task in horticulture. 
With accurate measurements, farmers can predict the yield of their crops and use this information for making informed management decisions, and breeders can use it for variety selection. Agricultural robotic applications promise to automate this monitoring task. In this paper, we address the problem of monitoring fruit growth and investigate the matching of fruits recorded in commercial greenhouses at different growth stages based on data recorded from terrestrial laser scanners. This is challenging as fruits appear highly similar, change over time, and are subject to severe occlusions. We first propose a fruit descriptor, which captures the topology of the fruit surroundings to facilitate the matching between different points in time. We capture and describe the relationship between a fruit and its neighbors such that our descriptors are less affected by the growth over time. Furthermore, we define a matching cost function and use an optimal assignment algorithm to match the fruit observations taken in different weeks. The experiments show that our descriptor achieves a high spatio-temporal matching accuracy, which is superior to the commonly used geometric point cloud descriptors." A MySQL Database for the Systematic Configuration Selection of Redundant Manipulators When Path Planning in Confined Spaces,"Kat Styles Wood, Thomas. B Scott, Antonia Tzemanaki",University of Bristol,Redundant Robots,"Redundant manipulators offer a continuum of joint configurations which satisfy a specific end-effector pose, an advantage when operating within confined spaces. This, however, challenges a controller to select a single goal configuration from a wide range when path planning. This paper outlines the use of the MySQL database management system for systematic goal selection during redundant manipulator path planning in confined spaces. 
We outline a sampling method to envelope all configurations of a redundant manipulator and utilise it to generate a complete database of configurations. We demonstrate the application of this method to generate a large data-set of (1 billion) manipulator configurations for a KUKA LBR iiwa 14 equipped with a Robotiq 2F-85 gripper. With this database, the controller systematically selects goal configurations during 50 path planning scenarios within the confined space of a glovebox. We compare this to an iterative method using existing kinematic solvers to select goal configurations as a baseline. The database method achieves a 100% success rate in 42% of the scenarios attempted. In comparison, the baseline method achieves >50% success rate in just 6% of the scenarios attempted. Our proposed method also produces repeatable paths, which are similar in length and link swept area for each attempt of the same scenario, whereas the baseline method generates a different path in every attempt." Reinforcement Learning Control of a Reconfigurable Planar Cable Driven Parallel Manipulator,"Adhiti Raman, Ameya Salvi, Matthias Schmid, Venkat Krovi",Clemson University,Redundant Robots,"Cable driven parallel robots (CDPRs) are often challenging to model and to dynamically control due to the inherent flexibility and elasticity of the cables. The additional inclusion of online geometric reconfigurability to a CDPR results in a complex underdetermined system with highly non-linear dynamics. The necessary (numerical) redundancy resolution requires multiple layers of optimization rendering its application computationally prohibitive for real-time control. Here, deep reinforcement learning approaches can offer a model-free framework to overcome these challenges and can provide a real-time capable dynamic control. 
This study discusses three settings for a model-free DRL implementation in dynamic trajectory tracking: (i) for a standard non-redundant CDPR with a fixed workspace; (ii) in an end-to-end setting with redundancy resolution on a reconfigurable CDPR; and (iii) in a decoupled approach resolving kinematic and actuation redundancies individually." Intuitive Telemanipulation of Hyper-Redundant Snake Robots within Locomotion and Reorientation Using Task-Priority Inverse Kinematics,"Tim-Lukas Habich, Melvin Hueter, Moritz Schappler, Svenja Tappe","Leibniz University Hannover,Institute of Mechatronic Systems, Leibniz Universitaet Hannover,Leibniz Universität Hannover",Redundant Robots,"Snake robots offer considerable potential for endoscopic interventions due to their ability to follow curvilinear paths. Telemanipulation is an open problem due to hyper-redundancy, as input devices only allow a specification of six degrees of freedom. Our work addresses this by presenting a unified telemanipulation strategy which enables follow-the-leader locomotion and reorientation keeping the shape change as small as possible. The basis for this is a novel shape-fitting approach for solving the inverse kinematics in only a few milliseconds. Shape fitting is performed by maximizing the similarity of two curves using Fréchet distance while simultaneously specifying the position and orientation of the end effector. Telemanipulation performance is investigated in a study in which 14 participants controlled a simulated snake robot to locomote into the target area. In a final validation, pivot reorientation within the target area is addressed." An Equivalent Two Section Method for Calculating the Workspace of Multi-Segment Continuum Robots,"Yeman Fan, Dikai Liu",University of Technology Sydney,Redundant Robots,"Obtaining the shape and size of a robot’s workspace is essential for both its design and control. 
However, determining the accurate workspace of a multi-segment continuum robot by graphic or analytical methods is a challenging task due to its inherent flexibility and complex structure. Existing numerical methods have limitations when applied to a continuum robot. This paper presents an Equivalent Two Section (ETS) method for calculating the workspace of multi-segment continuum robots. This method is based on the forward kinematics and a piecewise constant curvature (PCC) model to determine the boundaries of the workspace. In order to verify the proposed method, simulation experiments are conducted using six different maximum bending angles and seven different numbers of segments. Results of the ETS method are compared to the true workspaces of these configurations estimated by an exhaustive approach. The results show that the proposed ETS method is both efficient and accurate, and has small estimation errors. Discussions on the advantages and limitations of the proposed ETS method are also presented." On Locally Optimal Redundancy Resolution Using the Basis of the Null Space,"Eugenio Monari, Yi Chen, Rocco Vertechy","University of Bologna,Università di Bologna",Redundant Robots,"This paper presents two methods for the computation of the null space velocity command in redundant robots. Both these methods resort to the solution of a constrained optimization problem. The first one is a formalization of the traditional Gradient Projection Method (GPM) which guarantees the respect of the joint bounds and a gradual activation/deactivation of the null space command. The second one, called Null Space Basis Optimal Linear Combination Method (NSBM), finds the optimal coefficients of a basis of the null space of the Jacobian, ensuring in turn that the joint bounds are respected and that the null space is activated and deactivated gradually. 
The two methods are applied to the case study of a welding application in which the null space command must avoid the collision between the robot and an obstacle. The comparison of the results of the case study shows that NSBM performs better than GPM. The proposed algorithms are also tested on a real robotic platform to demonstrate that their computational time is compatible with the real-time requirements of the robot." Optimal Parameterized Joints Selection to Improve Motion Planning Performance of Redundant Manipulators,"Bin Xie, Qingfeng Wang, Di Wu",Central South University,Redundant Robots,"The redundant manipulators' analytical solutions can be obtained by the parameterization method. Multiple parameterized joints and their corresponding parametric representations exist for a redundant manipulator. However, how to select the optimal parameterized joints has yet to be well-addressed. This paper delves into the mechanism of the parameterization method and proposes a method to select the optimal parametric representations to improve the motion planning performance of manipulators. We tested the proposed method on an 8-degree-of-freedom (DOF) manipulator. First, all feasible parametric representations are derived, followed by an approach to obtain solution manifolds. We then introduce a metric called the ""feasible rate,"" which characterizes the percentage of the solution manifold in the joint space. This metric is used to rapidly assess the influence of different parameterized joints on the manipulator's motion planning performance. To verify the proposed method's correctness, we evaluated the performance of different representations with the MOEA/D algorithm in solving the same path optimization problems based on the algorithm running time and overall motion magnitude of the manipulator. 
Our simulation results demonstrate that different selections of parameterized joints affect the motion planning performance, and the performance planned by the optimal parametric representation is up to four times greater than that of the worst one." A Kinematically Redundant (6+1)-Dof Hybrid Parallel Robot for Delicate Physical Environment and Robot Interaction (pERI),"Jehyeok Kim, Clement Gosselin",Université Laval,Redundant Robots,"A novel kinematically redundant (6+1)-degree-of-freedom (dof) spatial hybrid parallel robot is proposed. Each of the two legs of the robot has a fully parallel structure to minimize the moving inertia by mounting actuators on the base. The kinematic model of each leg and overall robot architecture is developed based on the constraint conditions of the robot geometry. The singularity analysis of legs 1 and 2 reveals that their serial and parallel singularities can be avoided by properly dimensioning the robot and sacrificing the edge of the workspace. In addition, it is shown that the type II (parallel) singularities can be completely avoided, resulting in a large orientational workspace. The gripping mechanism is then introduced which is operated by the redundant degree of freedom of the robot. A CAD model of the robot and a computer animation are provided to demonstrate the positioning and orientation of the robot and the gripping function." Learning-Based Initialization of Trajectory Optimization for Path-Following Problems of Redundant Manipulators,"Minsung Yoon, Mincheul Kang, Daehyung Park, Sung-Eui Yoon","Korea Advanced Institute of Science and Technology (KAIST),KAIST,Korea Advanced Institute of Science and Technology, KAIST",Award Finalists 2,"Trajectory optimization (TO) is an efficient tool to generate a redundant manipulator's joint trajectory following a 6-dimensional Cartesian path. The optimization performance largely depends on the quality of initial trajectories. 
However, the selection of a high-quality initial trajectory is non-trivial and requires a considerable time budget due to the extremely large space of the solution trajectories and the lack of prior knowledge about task constraints in configuration space. To alleviate the issue, we present a learning-based initial trajectory generation method that generates high-quality initial trajectories in a short time budget by adopting example-guided reinforcement learning. In addition, we suggest a null-space projected imitation reward to consider null-space constraints by efficiently learning kinematically feasible motion captured in expert demonstrations. Our statistical evaluation in simulation shows the improved optimality, efficiency, and applicability of TO when we plug in our method's output, compared with three other baselines. We also show the performance improvement and feasibility via real-world experiments with a seven-degree-of-freedom manipulator." Kinematic Analysis and Design of a Novel (6+3)-DoF Parallel Robot with Fixed Actuators,"Arda Yigit, David Breton, Zhou Zhou, Thierry Laliberte, Clement Gosselin","Laval University,University Laval,Universite Laval,Université Laval",Kinematics,"A novel kinematically redundant (6+3)-DoF parallel robot is presented in this paper. Three identical 3-DoF RU/2-RUS legs are attached to a configurable platform through spherical joints. With the selected leg mechanism, the motors are mounted at the base, reducing the reflected inertia. The robot is intended to be actuated with direct-drive motors in order to perform intuitive physical human-robot interaction. The design of the leg mechanism maximizes the workspace in which the end-effector of the leg can have a 2g acceleration in all directions. All singularities of the leg mechanism are identified under a simplifying assumption. A CAD model of the (6+3)-DoF robot is presented in order to illustrate the preliminary design of the robot." 
RangedIK: An Optimization-Based Robot Motion Generation Method for Ranged-Goal Tasks,"Yeping Wang, Pragathi Praveena, Daniel Rakita, Michael Gleicher","University of Wisconsin-Madison,University of Wisconsin - Madison",Kinematics,"Generating feasible robot motions in real-time requires achieving multiple tasks (i.e., kinematic requirements) simultaneously. These tasks can have a specific goal, a range of equally valid goals, or a range of acceptable goals with a preference toward a specific goal. To satisfy multiple and potentially competing tasks simultaneously, it is important to exploit the flexibility afforded by tasks with a range of goals. In this paper, we propose a real-time motion generation method that accommodates all three categories of tasks within a single, unified framework and leverages the flexibility of tasks with a range of goals to accommodate other tasks. Our method incorporates tasks in a weighted-sum multiple-objective optimization structure and uses barrier methods with novel loss functions to encode the valid range of a task. We demonstrate the effectiveness of our method through a simulation experiment that compares it to state-of-the-art alternative approaches, and by demonstrating it on a physical camera-in-hand robot that shows that our method enables the robot to achieve smooth and feasible camera motions." Contact Based Turning Gait of a Novel Legged-Wheeled Quadruped,"Alper Yeldan, Abhimanyu Arora, Gim Song Soh",Singapore University of Technology and Design,Kinematics,"How does a wheeled robot move and turn? The answer is straightforward for a conventional wheeled robot, but it is not so easy for a robot with a discrete wheel design. Regular wheeled robots always have four contact points, resulting in static stability during locomotion. However, QuadRunner's novel leg mechanism provides only a semi-circular wheel shape, and proper gait planning is needed to go straight or turn. 
Therefore, this paper presents a dual frequency gait planning method which controls the robot's gait cycle's duty factor and generates unique turning gait patterns for wheel locomotion. Describing requirements and limitations, we found sets of solutions that can achieve turning. Results show that the smallest turning radius QuadRunner achieved is 1.05m, and the biggest is 1.86m. In addition, detailed experiments were made to observe the performance and stability of straight and turning wheel behaviors. Finally, a gait verification is made using high-speed cameras." Computational Modeling in System with Non-Circular Timing Pulleys,"Renzo Caballero, Angelica Coronado Preciado, Eric Feron",King Abdullah University of Science and Technology,Kinematics,"We analyze and model a belt transmission system with non-circular timing pulleys. Using a 3D printer as a proof-of-concept device, experiments consisting of tracking the pose data of a printer nozzle and its pulleys are conducted. A computational model from our previous work is validated with the experimental data and expanded to model more complex systems with multiple non-circular timing pulleys as well as slippage and non-ideal tensions. Finally, an example with two non-circular timing pulleys is presented and simulated utilizing the proposed method." "The New Exhibition Blind Machines, a Large 3D Printing Machine","Jean-Pierre Merlet, Jean-Pierre Merlet",INRIA,Parallel Robots,"This paper presents the further developments and preliminary results of a large 3D printing machine based on a 3 d.o.f cable-driven parallel robot (CDPR) that is used for an artistic exhibition. The printing material is a powder constituted of glass micro-beads that is deposited on a fixed trajectory so that the resulting structure collapses with time. A first exhibition was held during the summer of 2019 and another one was scheduled to take place during ICRA 2020, which was canceled because of Covid. 
The current exhibition has started on 07/09/2022 and will end on 10/14/2019. We describe in this paper the improvements of the current prototype, both on hardware and software, compared to the 2019 and 2020 versions. Between 7/9/2022 and 16/10/2022 the CDPR has run for 126 hours and has traveled on a total distance of 9km. During the period 142 layers have been deposited, representing a mass of 2.56 tons of glass powder." New Bracket Polynomials Associated with the General Gough-Stewart Parallel Robot Singularities,Federico Thomas,CSIC-UPC,Award Finalists 1,"It is well known that the singularities of a Gough-Stewart platform arise when the determinant of the Plücker coordinates of the robot leg lines vanish. The direct expansion of this determinant in terms of the configuration of the moving platform leads to an intimidating algebraic expression which is difficult to organize in a manner that facilitates extracting geometric conditions for singularities to occur. The use of Grassmann-Cayley algebra has permitted expressing this determinant as a bracket polynomial which is easier to manipulate symbolically. Each monomial in this polynomial is the product of three brackets, 4x4 determinants involving the homogeneous coordinates of four leg attachments. In this paper, we show how to derive, using elementary linear algebra arguments, bracket polynomials where all brackets can be interpreted as reciprocal products between lines. Contrarily to what one might expect, these new bracket polynomials are simpler in general than those previously obtained using Grassmann-Cayley algebra." Output Mode Switching for Parallel Five-Bar Manipulators Using a Graph-Based Path Planner,"Parker Edwards, Aravind Baskar, Caroline Hills, Mark Plecnik, Jonathan Hauenstein",University of Notre Dame,Parallel Robots,"The configuration spaces of parallel manipulators exhibit more nonlinearity than serial manipulators. Qualitatively, they can be seen to possess extra folds. 
Projection onto smaller spaces of engineering relevance, such as an output workspace or an input actuator space, these folds cast edges that exhibit boundary behavior. For example, inside the global workspace bounds of a five-bar linkage appear several local workspace bounds that only constrain certain output modes of the mechanism. The presence of such boundaries, which manifest in both input and output projections, serve as a source of confusion when these projections are studied exclusively instead of the configuration space itself. Particularly, the design of nonsymmetric parallel manipulators has been confounded by the presence of exotic projections in their input and output spaces. In this paper, we represent the configuration space with a radius graph, then weight each edge by solving an optimization problem using homotopy continuation to quantify transmission quality. We then employ a graph path planner to approximate geodesics between configuration points that avoid regions of low transmission quality. Our methodology automatically generates paths capable of transitioning between non-neighboring output modes, a motion which involves osculating multiple workspace boundaries (local, global, or both). We apply our technique to two nonsymmetric five-bar examples that demonstrate how transmission properties and other characteristics of the workspace can be selected by switching output modes." 
Dimensional Optimization and Anti-Disturbance Analysis of an Upgraded Feed Mechanism in FAST,"Xiaoyan Wang, Bin Zhang, Zhaoyang Li, Gao Xinyu, Fei Zhang, Yifan Ma, Rui Yao, Jia-ning Yin, Hui Li, Qingge Yang, Qingwei Li, Weiwei Shang","University of Science and Technology of China,National Astronomical Observatories,Chinese Academy of Sciences,National Astronomical Observatories, Chinese Academy of Sciences",Parallel Robots,"Five-hundred-meter aperture spherical radio telescope (FAST) is a very famous large-scale scientific facility with excellent performance for astronomical observation in the world, but it currently fails to observe the center of the Milky Way Galaxy due to the limited observation angle that is affected by the heavy weight of the feed cabin. To improve this problem, an upgraded feed mechanism (UFM) with a lighter cable structure is designed and employed to replace the existing heavy rigid A-B rotator and Stewart platform in the feed cabin of FAST. The structural dimension of the UFM is analyzed and optimized under cable tension constraints to meet the requirements of the observation angle. Then, a novel disturbance increment method is proposed to analyze the anti-disturbance ability of the UFM, where a gradually increased disturbance wrench is applied to the UFM with the stiffness matrix iteratively updated. Through the dimensional optimization and further anti-disturbance analysis, the newly-designed UFM can indeed meet the higher demand for astronomical observation with the larger observation angle, which benefits from the lightweight cable structure. Besides, the UFM also has the appreciable anti-disturbance ability for long-term stable operation of FAST." 
"Online Social Robot Navigation in Indoor, Large and Crowded Environments","Steven Alexander Silva Mendoza, Nervo Xavier Verdezoto Dias, Dennys Paillacho, Samuel Millan-norman, Juan David Hernández","Cardiff University,Espol Polytechnic University",Human-Robot Collaboration II,"New robotics applications require robots to complete tasks in social spaces (i.e. environments shared with people), thus arising the necessity of enabling robots to operate in a socially acceptable manner. Some social spaces tend to be large and crowded (e.g. museums, shopping malls), which require robots to move around while showing appropriate social behaviors (e.g. not interfering with human's comfortable areas). Moving under such conditions is generally called social robot navigation, and there are different approaches to do so. Nonetheless, current approaches are mostly limited to navigate large and outdoor spaces, where both robots and people can easily avoid each other. Other approaches have been tested in indoor environments, however, the test environments tend to be small and largely empty. In this paper, we present an online social robot navigation framework, which allow robots to navigate indoor, large and crowded environments, while showing social behaviors. Our framework consists of 3 modules: 1) world modeling that incorporates a novel Social Heatmap (SH) to represent crowded areas, 2) multilayered path planning that uses sampling-based approaches, and 3) path following control. We extensively benchmark our approach against state-of-the-art approaches in challenging simulated scenarios, and we also demonstrate its feasibility with the Pepper robot in real-world trials." 
Learning Responsibility Allocations for Safe Human-Robot Interaction with Applications to Autonomous Driving,"Ryan Cosner, Yuxiao Chen, Karen Yan Ming Leung, Marco Pavone","California Institute of Technology,Nvidia research,Stanford University, NVIDIA Research, University of Washington,Stanford University",Human-Robot Collaboration II,"Drivers have a responsibility to exercise reasonable care to avoid collision with other road users. This assumed responsibility allows interacting agents to maintain safety without explicit coordination. Thus to enable safe autonomous vehicle (AV) interactions, AVs must understand what their responsibilities are to maintain safety and how they affect the safety of nearby agents. In this work we seek to understand how responsibility is shared in multi-agent settings where an autonomous agent is interacting with human counterparts. We introduce Responsibility-Aware Control Barrier Functions (RA-CBFs) and present a method to learn responsibility allocations from data. By combining safety-critical control and learning-based techniques, RA-CBFs allow us to account for scene-dependent responsibility allocations and synthesize safe and efficient driving behaviors without making worst-case assumptions that typically result in overly-conservative behaviors. We test our framework using real-world driving data and demonstrate its efficacy as a tool for both safe control and forensic analysis of unsafe driving." Efficient Inference of Temporal Task Specifications from Human Demonstrations Using Experiment Design,"Shlok Sobti, Rahul Shome, Lydia Kavraki","Diamond Age ,D,The Australian National University,Rice University",Human-Robot Collaboration II,"Robotic deployments in human environments have motivated the need for autonomous systems to be able to interact with humans and solve tasks effectively. Human demonstrations of tasks can be used to infer underlying task specifications, commonly modeled with temporal logic. 
State-of-the-art methods have developed Bayesian inference tools to estimate a temporal logic formula from a sequence of demonstrations. The current work proposes the use of experiment design to choose environments for humans to perform these demonstrations. This reduces the number of demonstrations needed to estimate the unknown ground truth formula with low error. A novel computationally efficient strategy is proposed to generate informative environments by using an optimal planner as the model for the demonstrator. Instead of evaluating all possible environments, the search space reduces to the placement of informative orderings of likely eventual goals along an optimal planner's solution. A human study with 600 demonstrations from 20 participants for 4 tasks on a 2D interface validates the proposed hypothesis and empirical performance benefit in terms of convergence and error over baselines. The human study dataset is also publicly shared." On the Impact of Interruptions During Multi-Robot Supervision Tasks,"Abhinav Dahiya, Yifan Cai, Oliver Schneider, Stephen L. Smith",University of Waterloo,Human-Robot Collaboration II,"Human supervisors in multi-robot systems are primarily responsible for monitoring robots, but can also be assigned with secondary tasks. These tasks can act as interruptions and can be categorized as either intrinsic, i.e., being directly related to the monitoring task, or extrinsic, i.e., being unrelated. In this paper, we investigate the impact of these two types of interruptions through a user study (N = 39), where participants monitor a number of remote mobile robots while intermittently being interrupted by either a robot fault correction task (intrinsic) or a messaging task (extrinsic). We find that task performance of participants does not change significantly with the interruptions but depends greatly on the number of robots. 
However, interruptions result in an increase in perceived workload, and extrinsic interruptions have a more negative effect on workload across all NASA-TLX scales. Participants also reported switching between extrinsic interruptions and the primary task to be more difficult compared to the intrinsic interruption case. Statistical significance of these results is confirmed using ANOVA and one-sample t-test. These findings suggest that when deciding task assignment in such supervision systems, one should limit interruptions from secondary tasks, especially extrinsic ones, in order to limit user workload." System Configuration and Navigation of a Guide Dog Robot: Toward Animal Guide Dog-Level Guiding Work,"Hochul Hwang, Tian Xia, Ibrahima Keita, Ken Suzuki, Joydeep Biswas, Sunghoon Ivan Lee, Donghyun Kim","University of Massachusetts Amherst,University of Massachusetts at Amherst,University of Massachusetts, Amherst,University of Texas at Austin,UMass Amherst",Human-Robot Collaboration II,"A robot guide dog has compelling advantages over animal guide dogs for its cost-effectiveness, the potential for mass production, and low maintenance burden. However, despite the long history of guide dog robot research, previous studies were conducted with little or no consideration of how the guide dog handler and the guide dog work as a team for navigation. To develop a robotic guiding system that genuinely benefits blind or visually impaired individuals, we performed qualitative research, including interviews with guide dog handlers, trainers, and first-hand blindfold walking experiences with various guide dogs. We build a collaborative indoor navigation scheme for a guide dog robot that includes preferred features such as speed and directional control. 
For collaborative navigation, we propose a semantic-aware local path planner that enables safe and efficient guiding work by utilizing semantic information about the environment and considering the handler's position and directional cues to determine the collision-free path. We evaluate our integrated robotic system by testing blindfolded walking in indoor settings and demonstrate guide dog-like navigation behavior by avoiding obstacles at typical gait speed (0.7m/s). The following demonstration video link includes an audio description: https://youtu.be/YxlcMeaL7GA." Human Non-Compliance with Robot Spatial Ownership Communicated Via Augmented Reality: Implications for Human-Robot Teaming Safety,"Christine T Chang, Matthew Luebbers, Mitchell Hebert, Bradley Hayes","University of Colorado Boulder,Draper",Human-Robot Collaboration II,"Ensuring the safety and efficiency of human workers in environments shared with autonomous robots is of paramount importance. In this work we examine the behavior and attitudes of participants performing tasks in a noisy environment collocated with an autonomous quadcopter robot. Visual communication of spatial ownership and nonverbal (deictic gesture) requests for changes in spatial ownership are facilitated using an augmented reality (AR) head-mounted device that renders a color-keyed grid on the floor. After a request, the robot can alter floor ownership to provide participants with a safe path to complete their work. Participants (n=20) in a between-subjects study took part in either a shared space condition (concurrently occupying the work floor with the robot, with obvious rationale for floor ownership) or a turn-taking condition (alternating excursions onto the grid with the robot, without apparent rationale for the floor grid colors). 
We find consistent evidence of potentially dangerous over-trust in the system that led to non-compliance; notably, 25% of participants intentionally walked across forbidden floor regions during the experiment. We identify design considerations and a variety of user-borne rationale for committing safety violations that designers will need to explicitly take measures to remedy in production AR safety systems." Robust Robot Planning for Human-Robot Collaboration,"Yang You, Vincent Thomas, Francis Colas, Alami Rachid, Olivier Buffet","Inria Nancy Grand Est,LORIA - Universite de Lorraine,CNRS,LORIA/INRIA",Human-Robot Collaboration II,"From the robot’s point of view, a major issue in human-robot collaboration is how to be robust against uncertain human objectives, and uncertain human behaviors given a known objective. A key preliminary question is then: How to derive realistic human behaviors given a known objective? Indeed, to allow for collaboration, such behaviors should also account for the robot behavior, while it is not known in the first place. In this paper, we rely on Markov decision models, representing the uncertainty over the human objective as a probability distribution over a finite set of reward functions (inducing a distribution over human behaviors). Based on this, we propose two contributions: 1) an approach to automatically generate an uncertain human behavior (a policy) for each provided reward function while accounting for possible robot behaviors; and 2) a robot planning algorithm that is robust to the above-mentioned uncertainties and relies on solving a partially observable Markov decision process (POMDP) obtained by reasoning on a distribution over human behaviors. A co-working scenario allows conducting experiments and presenting qualitative and quantitative results to evaluate our approach." 
Natural Language Instruction Understanding for Robotic Manipulation: A Multisensory Perception Approach,"Weihua Wang, Xiaofei Li, Yanzhi Dong, Jun Xie, Di Guo, Huaping Liu","Yantai University,Taiyuan University of Technology,Beijing University of Posts and Telecommunications,Tsinghua University",Human-Robot Collaboration II,"It has always been expected that the robot can understand the natural language instruction and thus a more natural human-robot interaction is achieved. Currently, the robot usually interprets the instruction by visually grounding the textual information to its surroundings, while it may be not enough for some complex situations with only visual perception. So it is reasonable for the robot to leverage its multisensory perception ability to better understand the instruction. In this paper, we propose a multisensory perception approach to tackle the task of natural language instruction understanding for robotic manipulation, in which the robot coordinates its visual, tactile and auditory perception to fully understand the instruction and then executes the manipulation task. Extensive experiments have been conducted demonstrating the superiority of the multisensory perception compared with single sensory perception for instruction understanding. Moreover, we establish a user-friendly human-robot interaction interface where the human sends instruction to the robot via a mobile APP." EgoHMR: Egocentric Human Mesh Recovery Via Hierarchical Latent Diffusion Model,"Yuxuan Liu, Jianxin Yang, Xiao Gu, Yao Guo, Guang-Zhong Yang","Shanghai Jiao Tong University,Imperial College London",Human-Robot Collaboration II,"Egocentric vision, playing an important role in social robotics, has demonstrated its capability in assistive healthcare and human-centric behavior analysis. 
Within this field, perception of the human body itself is one of the most significant prerequisites for downstream applications, e.g., action recognition and action anticipation. Many approaches have been proposed to perform human mesh recovery from exocentric images captured from a third-person view, while few studies explore this challenging task with respect to heavily distorted and occluded egocentric images. In this paper, we propose EgoHMR, a novel hierarchical network for Egocentric Human Mesh Recovery based on latent diffusion model. Our method takes a single egocentric frame as input and can be trained in an end-to-end manner without the supervision of any 2D pose information. The network is built upon the latent diffusion model by incorporating both global and local features in a hierarchical structure. To train the proposed network, we generate weak labels from the synchronized exocentric images. To the best of our knowledge, the proposed method is the first attempt to perform human mesh recovery directly from egocentric images. Quantitative and qualitative experiments have been conducted to demonstrate the effectiveness of our proposed EgoHMR." Telerobot Operators Can Account for Varying Transmission Dynamics in a Visuo-Haptic Object Tracking Task,"Mohit Singhala, Jeremy Brown",Johns Hopkins University,Human-Robot Collaboration II,"Humans possess an innate ability to incorporate tools into our body schema to perform a myriad of tasks not possible with our natural limbs. Human-in-the-loop telerobotic systems (HiLTS) are tools that extend human manipulation capabilities to remote and virtual environments. Unlike most hand-held tools, however, HiLTS often possess complex electromechanical architectures that introduce non-trivial transmission dynamics between the robot's leader and follower, which alter or obfuscate the environment's dynamics. 
While considerable research has focused on negating or circumventing these dynamics, it is not well understood how capable human operators are at incorporating these transmission dynamics into their sensorimotor control scheme. To begin answering this question, we recruited N=12 participants to use a novel reconfigurable teleoperator with varying transmission dynamics to perform a visuo-haptic tracking task. Contrary to our original hypothesis, our findings demonstrate that humans can account for substantial differences in teleoperator transmission dynamics and produce the compensatory strategies necessary to adequately control the teleoperator. These findings suggest that advances in transparency algorithms and haptic feedback approaches must be coupled with control designs that leverage the unique capabilities of the human operator in the loop." Hierarchical Intention Tracking for Robust Human-Robot Collaboration in Industrial Assembly Tasks,"Zhe Huang, Ye-ji Mun, Xiang Li, Yiqing Xie, Ninghan Zhong, Weihang Liang, Junyi Geng, Tan Chen, Katherine Driggs-Campbell","University of Illinois at Urbana-Champaign,University of Illinois Urbana-Champaign,Pennsylvania State University,Michigan Technological University",Human-Robot Collaboration II,"Collaborative robots require effective human intention estimation to safely and smoothly work with humans in less structured tasks such as industrial assembly, where human intention continuously changes. We propose the concept of intention tracking and introduce a collaborative robot system that concurrently tracks intentions at hierarchical levels. The high-level intention is tracked to estimate human's interaction pattern and enable robot to (1) avoid collision with human to minimize interruption and (2) assist human to correct failure. The low-level intention estimate provides robot with task-related information. 
We implement the system on a UR5e robot and demonstrate robust, seamless and ergonomic human-robot collaboration in an ablative pilot study of an assembly use case." CoGrasp: 6-DoF Grasp Generation for Human-Robot Collaboration,"Abhinav Keshari, Hanwen Ren, Ahmed H. Qureshi",Purdue University,Human-Robot Collaboration II,"Robot grasping is an actively studied area in robotics, mainly focusing on the quality of generated grasps for object manipulation. However, despite advancements, these methods do not consider the human-robot collaboration settings where robots and humans will have to grasp the same objects concurrently. Therefore, generating robot grasps compatible with human preferences of simultaneously holding an object becomes necessary to ensure a safe and natural collaboration experience. In this paper, we propose a novel, deep neural network-based method called CoGrasp that generates human-aware robot grasps by contextualizing human preference models of object grasping into the robot grasp selection process. We validate our approach against existing state-of-the-art robot grasping methods through simulated and real-robot experiments and user studies. In real robot experiments, our method achieves about 88% success rate in producing stable grasps that also allow humans to interact and grasp objects simultaneously in a socially compliant manner. Furthermore, our user study with 10 independent participants indicated our approach enables a safe, natural, and socially-aware human-robot objects' co-grasping experience compared to a standard robot grasping technique." 
Can We Use Diffusion Probabilistic Models for 3D Motion Prediction?,"Hyemin Ahn, Esteve Valls Mascaro, Dongheui Lee","Ulsan National Institute of Science and Technology,Technische Universitat Wien,Technische Universität Wien (TU Wien)",Intent Recognition,"After many researchers observed fruitfulness from the recent diffusion probabilistic model, its effectiveness in image generation is actively studied these days. In this paper, our objective is to evaluate the potential of diffusion probabilistic models for 3D human motion-related tasks. To this end, this paper presents a study of employing diffusion probabilistic models to predict future 3D human motion(s) from the previously observed motion. Based on the Human 3.6M and HumanEva-I datasets, our results show that diffusion probabilistic models are competitive for both single (deterministic) and multiple (stochastic) 3D motion prediction tasks, after finishing a single training process. In addition, we find out that diffusion probabilistic models can offer an attractive compromise, since they can strike the right balance between the likelihood and diversity of the predicted future motions. Our code is publicly available on the project website: https://sites.google.com/view/diffusion-motion-prediction." PedFormer: Pedestrian Behavior Prediction Via Cross-Modal Attention Modulation and Gated Multitask Learning,"Amir Rasouli, Iuliia Kotseruba","Huawei Technologies Canada,Lassonde School of Engineering",Intent Recognition,"Predicting pedestrian behavior is a crucial task for intelligent driving systems. Accurate predictions require a deep understanding of various contextual elements that could impact the way pedestrians behave. To address this challenge, we propose a novel framework that relies on different data modalities to predict future trajectories and crossing actions of pedestrians from an egocentric perspective. 
Specifically, our model utilizes a cross-modal Transformer architecture to capture dependencies between different data types. The output of the Transformer is augmented with representations of interactions between pedestrians and other traffic agents conditioned on the pedestrian and ego-vehicle dynamics that are generated via a semantic attentive interaction module. Lastly, the context encodings are fed into a multi-stream decoder framework using a gated-shared network. We evaluate our algorithm on public pedestrian behavior benchmarks, PIE and JAAD, and show that our model improves on the state of the art in trajectory and action prediction by up to 22% and 13% respectively on various metrics. The advantages of the proposed components are investigated via extensive ablation studies." Robot-Assisted Eye-Hand Coordination Training System by Estimating Motion Direction Using Smooth-Pursuit Eye Movements,"Xiao Li, Zeng Hong, Chenhua Yang, Song Aiguo","School of Instrument Science and Engineering,Southeast Universit,Southeast University",Intent Recognition,"Robot-assisted eye-hand coordination rehabilitation training urgently needs study, since recent evidence suggests that eye-hand coordination can be severely disturbed by stroke, with critical consequences for motor behavior. In this paper, we develop a robot-assisted eye-hand coordination training system by estimating motion direction using smooth-pursuit eye movements. Firstly, we design a Pong Game, which requires users to extrapolate the direction of a linearly moving ball and to predict whether this ball would be hit. Secondly, the motion direction of the ball is estimated via smooth-pursuit eye movements, allowing the robot to quickly establish an assistive force field to hit the ball. Thirdly, we add haptic feedback to the training system to make it more immersive for users. Finally, we conduct a feasibility study with eight healthy subjects to verify the effectiveness of the proposed system. 
The experimental results show that the mean success rate for hitting the pong ball of the experiment group (assistance turn-on) is 28.33% higher than that of the control group (assistance turn-off), and the mean interception time of the experiment group is 0.35s shorter than that of the control group. Therefore, the developed system may be promising for transferring to the robot-assisted eye-hand coordination rehabilitation training for post-stroke patients." Generalizable Movement Intention Recognition with Multiple Heterogeneous EEG Datasets,"Xiao Gu, Jinpei Han, Guang-Zhong Yang, Benny Lo","Imperial College London,Shanghai Jiao Tong University",Intent Recognition,"Human movement intention recognition is important for human-robot interaction. Existing work based on motor imagery electroencephalogram (EEG) provides a non-invasive and portable solution for intention detection. However, the data-driven methods may suffer from the limited scale and diversity of the training datasets, which result in poor generalization performance on new test subjects. It is practically difficult to directly aggregate data from multiple datasets for training, since they often employ different channels and collected data suffers from significant domain shifts caused by different devices, experiment setup, etc. On the other hand, the inter-subject heterogeneity is also substantial due to individual differences in EEG representations. In this work, we developed two networks to learn from both the shared and the complete channels across datasets, handling inter-subject and inter-dataset heterogeneity respectively. Based on both networks, we further developed an online knowledge co-distillation framework to collaboratively learn from both networks, achieving coherent performance boosts. Experimental results have shown that our proposed method can effectively aggregate knowledge from multiple datasets, demonstrating better generalization in the context of cross-subject validation." 
Bi-Manual Manipulation of Multi-Component Garments towards Robot-Assisted Dressing,"Stelios Kotsovolis, Yiannis Demiris",Imperial College London,Physical Human-Robot Interaction I,"In this paper, we propose a strategy for robot-assisted dressing with multi-component garments, such as gloves. Most studies in robot-assisted dressing usually experiment with single-component garments, such as sleeves, while multi-component tasks are often approached as sequential single-component problems. In dressing scenarios with more complex garments, robots should estimate the alignment of the human body to the manipulated garments, and revise their dressing strategy. In this paper, we focus on a glove dressing scenario and propose a decision process for selecting dressing action primitives on the different components of the garment, based on a hierarchical representation of the task and a set of environmental conditions. To complement this process, we propose a set of bi-manual control strategies, based on hybrid position, visual, and force feedback, in order to execute the dressing action primitives with the deformable object. The experimental results validate our method, enabling the Baxter robot to dress a mannequin’s hand with a gardening glove." Humans Need Augmented Feedback to Physically Track Non-Biological Robot Movements,"Mahdiar Edraki, Pauline Maurice, Dagmar Sternad","Northeastern University,CNRS - LORIA",Physical Human-Robot Interaction I,"An important component for the effective collaboration of humans with robots is the compatibility of their movements, especially when humans physically collaborate with a robot partner. Following previous findings that humans interact more seamlessly with a robot that moves with human-like or biological velocity profiles, this study examined whether humans can adapt to a robot that violates human signatures. The specific focus was on the role of extensive practice and real-time augmented feedback. 
Six groups of participants physically tracked a robot tracing an ellipse with profiles where velocity scaled with the curvature of the path in biological and nonbiological ways, while instructed to minimize the interaction force with the robot. Three of the 6 groups received real-time visual feedback about their force error. Results showed that with 3 daily practice sessions, when given feedback about their force errors, humans could decrease their interaction forces when the robot’s trajectory violated human-like velocity patterns. Conversely, when augmented feedback was not provided, there were no improvements despite this extensive practice. The biological profile showed no improvements, even with feedback, indicating that the (non-zero) force had already reached a floor level. These findings highlight the importance of biological robot trajectories and augmented feedback to guide humans to adapt to non-biological movements in physical human-robot interaction. These results have implications on various fields of robotics, such as surgical applications and collaborative robots for industry." Robot Mimicry Attack on Keystroke-Dynamics User Identification and Authentication System,"Rongyu Yu, Burak Kizilkaya, Zhen Meng, Liying Emma Li, Guodong Zhao, Muhammad Ali Imran","University of Glasgow,University of Glasgow, UK",Physical Human-Robot Interaction I,"Future robots will be very advanced with high flexibility and accurate control performance. They will have the ability to mimic human behaviours or even perform better, which raises the significant risk of robot attack. In this work, we study the robot mimic attack on the current keystroke-dynamic user authentication system. Specifically, we proposed a robot mimicry attack framework for keystroke-dynamics systems. We collected keyboard logging data and acoustical signal data from real users and extracted the timing pattern of keystrokes to understand victim's behaviour for robot imitation attacks. 
Furthermore, we develop a deep Q-Network (DQN) algorithm to control the velocity of the robot, which is one of the key challenges of forging the human typing timing features. We tested and evaluated our approach on a real-life robotic testbed. We present our results in terms of user identification and user authentication performance. We achieved a 90.3% user identification accuracy with genuine keyboard logging data samples and 89.6% accuracy with robot-forged data samples. Furthermore, we achieved 11.1% and 36.6% EER for user authentication performance with the zero-effort attack and the robot mimicry attack, respectively." In-Mouth Robotic Bite Transfer with Visual and Haptic Sensing,"Lorenzo Shaikewitz, Yilin Wu, Suneel Belkhale, Jennifer Grannen, Priya Sundaresan, Dorsa Sadigh","California Institute of Technology,Stanford University",Physical Human-Robot Interaction I,"Assistance during eating is essential for those with severe mobility issues or eating risks. However, dependence on traditional human caregivers is linked to malnutrition, weight loss, and low self-esteem. For those who require eating assistance, a semi-autonomous robotic platform can provide independence and a healthier lifestyle. We demonstrate an essential capability of this platform: safe, comfortable, and effective transfer of a bite-sized food item from a utensil directly to the inside of a person’s mouth. Our system uses a force-reactive controller to safely accommodate the user’s motions throughout the transfer, allowing full reactivity until bite detection then reducing reactivity in the direction of exit. Additionally, we introduce a novel dexterous wrist-like end effector capable of small, unimposing movements to reduce user discomfort. We conduct a user study with 11 participants covering 8 diverse food categories to evaluate our system end-to-end, and we find that users strongly prefer our method to a wide range of baselines. 
Appendices and videos are available at our website: https://tinyurl.com/btICRA." Robot Trust and Self-Confidence Based Role Arbitration Method for Physical Human-Robot Collaboration,"Qiao Wang, Dikai Liu, Marc Garry Carmichael, Chin-teng Lin","University of Technology Sydney,Centre for Autonomous Systems,UTS",Physical Human-Robot Interaction I,"Role arbitration in human-robot collaboration (HRC) is a dynamically changing process that is affected by many factors such as physical workload, environmental changes and trust. In order to address this dynamic process, a trust-based role arbitration method is studied in this research. A computational model of robot trust and self-confidence (TSC) in physical human-robot collaboration (pHRC) is proposed. The TSC model is defined as a function of objective robot and human co-worker performance. A role arbitration method is then proposed based on the TSC model presented. The human-in-the-loop experiments with a collaborative robot are conducted to verify the TSC-based role arbitration method. The results show that the proposed method could achieve superior human-robot combined performance, reduce human co-workers’ workload, and improve subjective preference." Design Optimization and Data-Driven Shallow Learning for Dynamic Modeling of a Smart Segmented Electroadhesive Clutch,"Navid Feizi, Zahra Bahrami, S. Farokh Atashzar, Mehrdad R. Kermani, Rajnikant V. Patel","University of Western Ontario,Institute of Geography, University of Erlangen-Nuremberg,New York University (NYU), US,The University of Western Ontario",Physical Human-Robot Interaction I,"Electroadhesive clutches have attracted a great deal of interest in the last decade as semi-active actuators for human-robot interaction due to their lightweight, low power consumption, and tunable high-torque output capability. However, because of the complexity of their dynamics, in most cases, they are utilized in an ON/OFF-control strategy. 
In this regard, the non-autonomous (time-dependent) degradation of electroadhesive behavior is an inherent challenge that injects unpredictability and uncertainty into the behavior of this family of semi-active clutches. We propose a novel approach to preventing degradation of electroadhesion using a segmented electrode design that modulates the electrical field on the dielectric surface while using a direct current signal and securing low power consumption. This paper, for the first time, presents an optimization process based on a novel analytic model of the proposed actuator. It also develops a data-driven model augmentation using a hybrid shallow learning approach composed of a long short-term memory (LSTM) architecture which is combined with the analytical model. The performance of the proposed semi-active clutch and the data-driven hybrid model is experimentally validated in this paper." Learning from Physical Human Feedback: An Object-Centric One-Shot Adaptation Method,"Alvin Shek, Bo Ying Su, Rui Chen, Changliu Liu","Carnegie Mellon University,Carnegie Mellon University; University of Michigan;",Award Finalists 3,"For robots to be effectively deployed in novel environments and tasks, they must be able to understand the feedback expressed by humans during intervention. This can either correct undesirable behavior or indicate additional preferences. Existing methods either require repeated episodes of interactions or assume prior known reward features, which is data-inefficient and can hardly transfer to new tasks. We relax these assumptions by describing human tasks in terms of object-centric sub-tasks and interpreting physical interventions in relation to specific objects. Our method, Object Preference Adaptation (OPA), is composed of two key stages: 1) pre-training a base policy to produce a wide variety of behaviors, and 2) online-updating according to human feedback. 
The key to our fast, yet simple adaptation is that general interaction dynamics between agents and objects are fixed, and only object-specific preferences are updated. Our adaptation occurs online, requires only one human intervention (one-shot), and produces new behaviors never seen during training. Trained on cheap synthetic data instead of expensive human demonstrations, our policy correctly adapts to human perturbations on realistic tasks in both simulation and on a physical 7DOF robot. Videos, code, and supplementary material: https://alvinosaur.github.io/AboutMe/projects/opa." Touch Classification on Robotic Skin Using Multimodal Tactile Sensing Modules,"Min Jin Yang, Junhwi Cho, Hyunjo Chung, Kyungseo Park, Jung Kim","Korea Advanced Institute of Science and Technology (KAIST),KAIST,University of Illinois at Urbana-Champaign",Physical Human-Robot Interaction I,"Humans employ different touch patterns to convey diverse social messages; for example, a stroke is an encouragement, whereas a hit is an offense. Various tactile sensors have been developed to grant an intuitive physical interaction with a robotic system, yet many have encountered limitations in achieving broad sensibility or in being fabricated into a large skin. This paper presents a robotic skin with multimodal tactile sensing modules to achieve broad spatiotemporal sensibility with a few sensing elements. The multimodal module is composed of a microphone and a vented screw installed on a conductive sensory domain. A multilayered fabric with a textured surface covers the sensory domain and forms a piezoresistive structure. High and low temporal components of touch elicit a micro-vibration and a conductivity change on the skin, where both are measured with multimodal modules. 
The measurements are each processed with short-time Fourier transform (STFT) and electrical resistance tomography (ERT) to encode two spatiotemporal feature maps, which are classified into ten touch classes using a convolutional neural network. Due to a sensibility to both high and low temporal components of touch, the skin classifies touches with an accuracy of 97.0%, whereas only 84.7% and 90.6% are achieved when one type of feature map is used. Also, the skin is robust and beneficial in power consumption and fabrication since the multimodal modules are not exposed to an external stimulus and are sparsely distributed." Distributed Data-Driven Predictive Control for Multi-Agent Collaborative Legged Locomotion,"Randall Fawcett, Leila Amanzadeh, Jeeseop Kim, Aaron Ames, Kaveh Akbari Hamed","Virginia Polytechnic Institute and State University,Virginia Tech University,Caltech,Virginia Tech",Award Finalists 3,"The aim of this work is to define a planner that enables robust legged locomotion for complex multi-agent systems consisting of several holonomically constrained quadrupeds. To this end, we employ a methodology based on behavioral systems theory to model the sophisticated and high-dimensional structure induced by the holonomic constraints. The resulting model is then used in tandem with distributed control techniques such that the computational burden is shared across agents while the coupling between agents is preserved. Finally, this distributed model is framed in the context of a predictive controller, resulting in a robustly stable method for trajectory planning. This methodology is tested in simulation with up to five agents and is further experimentally validated on three A1 quadrupedal robots subject to various uncertainties, including payloads, rough terrain, and push disturbances." 
On the Use of Torque Measurement in Centroidal State Estimation,"Shahram Khorshidi, Ahmad Gazar, Nicholas Rotella, Maximilien Naveau, Ludovic Righetti, Maren Bennewitz, Majid Khadiv","University of Bonn,Max-Planck Institute for Intelligent Systems,University of Southern California,LAAS/CNRS,New York University,Max Planck Institute for Intelligent Systems",Legged Motion Analysis and Synthesis,"State-of-the-art legged robots are either capable of measuring torque at the output of their drive systems, or have transparent drive systems which enable the computation of joint torques from motor currents. In either case, this sensor modality is seldom used in state estimation. In this paper, we propose to use joint torque measurements to estimate the centroidal states of legged robots. To do so, we project the whole-body dynamics of a legged robot into the nullspace of the contact constraints, allowing expression of the dynamics independent of the contact forces. Using the constrained dynamics and the centroidal momentum matrix, we are able to directly relate joint torques and centroidal states dynamics. Using the resulting model as the process model of an Extended Kalman Filter (EKF), we fuse the torque measurement in the centroidal state estimation problem. Through real-world experiments on a quadruped robot executing different gaits, we demonstrate that the estimated centroidal states from our torque-based EKF drastically improve the recovery of these quantities compared to direct computation." DMMGAN: Diverse Multi Motion Prediction of 3D Human Joints Using Attention-Based Generative Adversarial Network,"Payam Nikdel, Mohammad Mahdavian, Mo Chen","Simon Fraser University/Waymo,Simon Fraser University",Legged Motion Analysis and Synthesis,"Human body motion prediction is a fundamental part of many human-robot applications. 
Despite the recent progress in the area, most studies predict human body motion relative to a fixed joint and only limit their model to predict one possible future motion, or both. However, due to the complex nature of human motion, a single prediction cannot adequately reflect the many possible movements one can make. Also, for any robotics application, prediction of the full human body motion including the absolute 3D trajectory -- not just a 3D body pose relative to the hip joint -- is needed. In this paper, we try to address these two shortcomings by proposing a transformer-based generative model for forecasting multiple diverse human motions. Our model generates N future possible body motions given the human motion history. This is achieved by first predicting the pose of the body relative to the hip joint as was done in prior work. Then, our proposed Hip Prediction Module predicts the trajectory of the hip position relative to a global reference frame for each predicted pose frame, an aspect of human body motion neglected by previous work. To obtain a set of diverse predicted motions, we introduce a similarity loss that penalizes the pairwise sample distance. Our system not only outperforms the state-of-the-art in human motion prediction, but also is able to predict a diverse set of future human body motions, including the hip trajectory." Contact Optimization for Non-Prehensile Loco-Manipulation Via Hierarchical Model Predictive Control,"Alberto Rigo, Yiyu Chen, Satyandra K. Gupta, Quan Nguyen","USC,University of Southern California",Legged Motion Analysis and Synthesis,"Recent studies on quadruped robots have focused on either locomotion or mobile manipulation using a robotic arm. Legged robots can manipulate large objects using non-prehensile manipulation primitives, such as planar pushing, to drive the object to the desired location. This paper presents a novel hierarchical model predictive control (MPC) for contact optimization of the manipulation task. 
Using two cascading MPCs, we split the loco-manipulation problem into two parts: the first to optimize both contact force and contact location between the robot and the object, and the second to regulate the desired interaction force through the robot locomotion. Our method is successfully validated in both simulation and hardware experiments. While the baseline locomotion MPC fails to follow the desired trajectory of the object, our proposed approach can effectively control both object's position and orientation with minimal tracking error. This capability also allows us to perform obstacle avoidance for both the robot and the object during the loco-manipulation task." Optimal Scheduling of Models and Horizons for Model Hierarchy Predictive Control,"Charles Khazoom, Steve Heim, Daniel Gonzalez-Diaz, Sangbae Kim",Massachusetts Institute of Technology,Legged Motion Analysis and Synthesis,"Model predictive control (MPC) is a powerful tool to control systems with non-linear dynamics and constraints, but its computational demands impose limitations on the dynamics model used for planning. Instead of using a single complex model along the MPC horizon, model hierarchy predictive control (MHPC) reduces solve times by planning over a sequence of models of varying complexity within a single horizon. Choosing this model sequence can become intractable when considering all possible combinations of reduced order models and prediction horizons. We propose a framework to systematically optimize a model schedule for MHPC. We leverage trajectory optimization (TO) to approximate the accumulated cost of the closed-loop controller. We trade off performance and solve times by minimizing the number of decision variables of the MHPC problem along the horizon while keeping the approximate closed-loop cost near optimal. The framework is validated in simulation with a planar humanoid robot as a proof of concept. 
We find that the approximated closed-loop cost matches the simulated one for most of the model schedules, and show that the proposed approach finds optimal model schedules that transfer directly to simulation, and with total horizons that vary between 1.1 and 1.6 walking steps." STPOTR: Simultaneous Human Trajectory and Pose Prediction Using a Non-Autoregressive Transformer for Robot Follow-Ahead,"Mohammad Mahdavian, Payam Nikdel, Mahdi Taherahmadi, Mo Chen","Simon Fraser University,Simon Fraser University/Waymo",Legged Motion Analysis and Synthesis,"In this paper, we greatly expand the capability of robots to perform the follow-ahead task and variations of this task through development of a neural network model to predict future human motion from an observed human motion history. We propose a non-autoregressive transformer architecture to leverage its parallel nature for easier training and fast, accurate predictions at test time. The proposed architecture divides human motion prediction into two parts: 1) the human trajectory, which is the 3D positions of the hip joint over time, and 2) the human pose, which is the 3D positions of all other joints over time with respect to a fixed hip joint. We propose to make the two predictions simultaneously, as the shared representation can improve the model performance. Therefore, the model consists of two sets of encoders and decoders. First, a multi-head attention module applied to encoder outputs improves human trajectory. Second, another multi-head self-attention module applied to encoder outputs concatenated with decoder outputs facilitates the learning of temporal dependencies. Our model is well-suited for robotic applications in terms of test accuracy and speed, and compares favourably with respect to state-of-the-art methods. We demonstrate the real-world applicability of our work via the Robot Follow-Ahead task, a challenging yet practical case study for our proposed model. 
The human motion predicted by our model enables the robot follow-ahead in scenarios that require taking detailed human motion into account, such as sit-to-stand and stand-to-sit. It also enables simple control policies to trivially generalize to many different variations of human following, such as follow-beside. Our code and data are available at the following GitHub page: https://github.com/mmahdavian/STPOTR" Visual-Inertial and Leg Odometry Fusion for Dynamic Locomotion,"Victor Dhedin, Haolong Li, Shahram Khorshidi, Lukas Mack, Adithya Kumar Chinnakkonda Ravi, Avadesh Meduri, Paarth Shah, Felix Grimminger, Ludovic Righetti, Majid Khadiv, Joerg Stueckler","Max Planck Institute for Intelligent Systems,University of Bonn,New York University,University of Oxford",Legged Motion Analysis and Synthesis,"Implementing dynamic locomotion behaviors on legged robots requires a high-quality state estimation module. Especially when the motion includes flight phases, state-of-the-art approaches fail to produce reliable estimation of the robot posture, in particular base height. In this paper, we propose a novel approach for combining visual-inertial odometry (VIO) with leg odometry in an extended Kalman filter (EKF) based state estimator. The VIO module uses a stereo camera and IMU to yield low-drift 3D position and yaw orientation and drift-free pitch and roll orientation of the robot base link in the inertial frame. However, these values have a considerable amount of latency due to image processing and optimization, while the update rate is quite low, which is not suitable for low-level control. To reduce the latency, we predict the VIO state estimate at the rate of the IMU measurements of the VIO sensor. The EKF module uses the base pose and linear velocity predicted by VIO, fuses them further with a second high-rate IMU and leg odometry measurements, and produces robot state estimates with a high frequency and small latency suitable for control. 
We integrate this lightweight estimation framework with a nonlinear model predictive controller and show successful implementation of a set of agile locomotion behaviors, including trotting and jumping at varying horizontal speeds, on a torque-controlled quadruped robot." Getting Air: Modelling and Control of a Hybrid Pneumatic-Electric Legged Robot,"Christopher Mailer, Stacey Leigh Shield, Reuben Govender, Amir Patel","University of Cape Town,University of Cape Town,",Legged Motion Analysis and Synthesis,"With their combination of power and compliance, pneumatic actuators have great potential for enabling dynamic and agile behaviors in legged robots, but their complex dynamics impose control challenges that have hindered widespread use. In this paper, we describe the development of a tractable model and characterization procedure of an off-the-shelf double acting pneumatic cylinder controlled by on/off solenoid valves for use in trajectory optimization. With this we are able to generate motions which incorporate both the body and actuator dynamics of our robot Kemba: a novel quadrupedal robot prototype with a combination of electric and pneumatic actuators. We demonstrate both a 0.5m jump and land maneuver, and a maximal 1m jump, approximately 2.2 times its leg length, on the physical hardware with the proposed model and approach. The hardware matches the desired trajectory with a maximum height error of only 5 cm without any feedback on the pneumatic joints, demonstrating the utility of the model in high-level motion generation, and capability of the physical robot." Enhanced Balance for Legged Robots Using Reaction Wheels,"Chi Yen Lee, Shuo Yang, Benjamin Bokser, Zachary Manchester",Carnegie Mellon University,Legged Motion Analysis and Synthesis,"We introduce a reaction wheel system that enhances the balancing capabilities and stability of quadrupedal robots during challenging locomotion tasks. 
Inspired by both the standard centroidal dynamics model common in legged robotics and models of spacecraft commonly used in the aerospace community, we model the coupled quadruped-reaction-wheel system as a gyrostat, and simplify the dynamics to formulate the problem as a linear discrete-time trajectory optimization problem. Modifications are made to a standard centroidal model-predictive control (MPC) algorithm to solve for both stance foot ground reaction forces and reaction wheel torques simultaneously. The MPC problem is posed as a quadratic program and solved online at 1000 Hz. We demonstrate improved attitude stabilization both in simulation and on hardware compared to a quadruped without reaction wheels, and perform a challenging traversal of a narrow balance beam that would be impossible for a standard quadruped." Versatile Real-Time Motion Synthesis Via Kino-Dynamic MPC with Hybrid-Systems DDP,"He Li, Tingnan Zhang, Wenhao Yu, Patrick Wensing","University of Notre Dame,Google",Legged Motion Analysis and Synthesis,"Specialized motions such as jumping are often achieved on quadruped robots by solving a trajectory optimization problem once and executing the trajectory using a tracking controller. This approach is in parallel with Model Predictive Control (MPC) strategies that commonly control regular gaits via online re-planning. In this work, we present a nonlinear MPC (NMPC) technique that unlocks on-the-fly re-planning of specialized motion skills and regular locomotion within a unified framework. The NMPC reasons about a hybrid kinodynamic model, and is solved using a variant of a constrained Differential Dynamic Programming (DDP) solver. The proposed NMPC enables the robot to perform a variety of agile skills like jumping, bounding, and trotting, and the rapid transition between these skills. 
We evaluated the proposed algorithm with three challenging motion sequences that combine multiple agile skills, on two quadruped platforms, Unitree A1, and MIT Mini Cheetah, showing its effectiveness and generality." Distributed Model Predictive Formation Control with Gait Synchronization for Multiple Quadruped Robots,"Shaohang Xu, Wentao Zhang, Lijun Zhu, Chin Pang Ho","Huazhong University of Science and Technology,City University of Hong Kong",Legged Motion Analysis and Synthesis,"In this paper, we present a fully distributed framework for multiple quadruped robots in environments with obstacles. Our approach utilizes Model Predictive Control (MPC) and multi-robot consensus protocol to obtain the distributed control law. It ensures that all the robots are able to avoid obstacles, navigate to the desired positions, and meanwhile synchronize the gaits. In particular, via MPC and consensus, the robots compute the optimal trajectory and the contact profile of the legs. Then an MPC-based locomotion controller is implemented to achieve the gait, stabilize the locomotion and track the desired trajectory. We present experiments in simulation and with three real quadruped robots in an environment with a static obstacle." Video Waterdrop Removal Via Spatio-Temporal Fusion in Driving Scenes,"Qiang Wen, Yue Wu, Qifeng Chen","Hong Kong University of Science and Technology,HKUST",Autonomous Navigation,"The waterdrops on windshields during driving can cause severe visual obstructions, which may lead to car accidents. Meanwhile, the waterdrops can also degrade the performance of a computer vision system in autonomous driving. To address these issues, we propose an attention-based framework that fuses the spatio-temporal representations from multiple frames to restore visual information occluded by waterdrops. Due to the lack of training data for video waterdrop removal, we propose a large-scale synthetic dataset with simulated waterdrops in complex driving scenes on rainy days. 
To improve the generality of our proposed method, we adopt a cross-modality training strategy that combines synthetic videos and real-world images. Extensive experiments show that our proposed method can generalize well and achieve the best waterdrop removal performance in complex real-world driving scenes." Unsupervised Learning of Depth and Pose Based on Monocular Camera and Inertial Measurement Unit (IMU),"Yanbo Wang, Hanwen Yang, Jianwei Cai, Guangming Wang, Jingchuan Wang, Yi Huang","Shanghai Jiao Tong University,Shanghai JiaoTong University,Shanghai Weitong Vision Technology Co. , Ltd.",Autonomous Navigation,"The main content of the research in this paper is the estimation of depth and pose based on monocular vision and Inertial Measurement Unit (IMU). The usual depth estimation network and pose estimation network require depth ground truth or pose ground truth as a supervised signal for training, while the depth ground truth and pose ground truth are hard to obtain, and monocular vision based depth estimation cannot predict absolute depth. In this paper, with the help of IMU, which is inexpensive and widely used, we can obtain angular velocity and acceleration information. Two new supervision signals are proposed and the calculation expressions are given. Among them, the model trained with acceleration constraint shows a good ability to estimate the absolute depth during the test. It can be considered that the model can estimate the absolute depth. We also derive the method of estimating the scale factor during the test from the acceleration constraint, and also achieve good results as the acceleration constraint does. In addition, this paper also studies the method of using IMU information as pose network input and as selecting conditions. Moreover, it analyzes and discusses the experimental results. At the same time, we also evaluate the effect of the pose estimation of the relevant models. 
This article starts by reviewing the achievements and deficiencies of the work in this field, combines the use of IMU, puts forward three new methods such as a new loss function, and conducts a test analysis and discussion of relevant indicators on the KITTI data set." Self-Supervised Multi-Frame Monocular Depth Estimation with Pseudo-LiDAR Pose Augmentation,"Wenhua Wu, Guangming Wang, Jiquan Zhong, Hesheng Wang, Zhe Liu","Shang Hai Jiao Tong University,Shanghai Jiao Tong University,Shanghai Jiaotong University,University of Cambridge",Autonomous Navigation,"Depth estimation is one of the most important tasks in scene understanding. In the existing joint self-supervised learning approaches of depth-pose estimation, depth estimation and pose estimation networks are independent of each other. They only use the adjacent image frames for pose estimation and lack the use of the estimated geometric information. To enhance the depth-pose association, we propose a monocular multi-frame unsupervised depth estimation framework, named PLPE-Depth. There are a depth estimation network and two pose estimation networks with image input and pseudo-LiDAR input. The main idea of our approach is to use the pseudo-LiDAR reconstructed from the depth map to estimate the pose of adjacent frames. We propose depth re-estimation with a better pose between the image pose and the pseudo-LiDAR pose to improve the accuracy of estimation. Besides, we improve the reconstruction loss and design a pseudo-LiDAR pose enhancement loss to facilitate the joint learning. Our approach enhances the use of the estimated depth information and strengthens the coupling between depth estimation and pose estimation. Experiments on the KITTI dataset show that our depth estimation achieves state-of-the-art performance at low resolution. Our source codes will be released on https://github.com/IRMVLab/PLPE-Depth." 
Anomaly Detection Based Robust Autonomous Navigation,"Kefan Jin, Mu Fun, Xingyao Han, Guangming Wang, Zhe Liu","Shanghai Jiao Tong University,University of Cambridge",Autonomous Navigation,"Human drivers are remarkably robust against various unexpectedly occurring variations and corruptions by understanding temporal changes and traffic scenes. In contrast, neural-network-based autonomous navigation systems can be easily affected by sensor data anomalies, such as blocking or sensor noise. Such external disturbances are inevitable in practical driving applications. In this paper, we develop a semi-supervised anomaly detection module to detect the corrupted data while extracting the traffic scenario features. We further introduce an end-to-end robust autonomous navigation framework based on the idea that the consecutive frames of clean data depict a similar traffic scenario and the differences among the sequential data imply the dynamic state changes. By taking into consideration both spatial traffic scenario and temporal environmental variation, the model is able to achieve robust navigation against sensor data corruptions. We conduct experiments on the CARLA platform and the evaluation results show the effectiveness of the proposed method." Learning Perceptual Hallucination for Multi-Robot Navigation in Narrow Hallways,"JinSoo Park, Xuesu Xiao, Garrett Warnell, Harel Yedidsion, Peter Stone","The University of Texas at Austin,George Mason University,U.S. Army Research Laboratory,University of Texas at Austin",Autonomous Navigation,"While current systems for autonomous robot navigation can produce safe and efficient motion plans in static environments, they usually generate suboptimal behaviors when multiple robots must navigate together in confined spaces. For example, when two robots meet each other in a narrow hallway, they may either turn around to find an alternative route or collide with each other. 
This paper presents a new approach to navigation that allows two robots to pass each other in a narrow hallway without colliding, stopping, or waiting. Our approach, Perceptual Hallucination for Hallway Passing (PHHP), learns to synthetically generate virtual obstacles (i.e., perceptual hallucination) to facilitate passing in narrow hallways by multiple robots that utilize otherwise standard autonomous navigation systems. Our experiments on physical robots in a variety of hallways show improved performance compared to multiple baselines." Multi-Head Attention Machine Learning for Fault Classification in Mixed Autonomous and Human-Driven Vehicle Platoons,"Theodore Wu, Satvick Acharya, Abdelrahman Khalil, Ahmed Aljanaideh, Mohammad Al Janaideh, Deepa Kundur","University of Toronto,university of Toronto,Memorial University of Newfoundland,Bentley University,Memorial University &University of Toronto",Autonomous Navigation,"Connected Autonomous Vehicle (CAV) platoons have been extensively studied to protect against cyber and physical vulnerabilities. Faults can occur in all layers of the platoon system or could be introduced by impaired human drivers. Since different types of faults may require different fault resolution methods, identifying the fault class facilitates the selection of the best mitigation strategy. This paper introduces a Multi-Head Attention Machine Learning (MHA-ML) approach to classify a set of five different faults and abnormalities in mixed autonomous and human-driven vehicle platoons. Autonomous vehicles can face actuator faults, False Data Injection (FDI) attacks, and Denial-of-Service (DoS) attacks, while abnormalities such as drunk or distracted human drivers could occur. MHA-ML is developed to identify faulty vehicle behavior over long sequences of sensor measurements. MHA-ML is trained on a mixed platoon simulation model and then tested on mobile laboratory robots. 
The experiment classifies the five fault categories with 90% accuracy and outperforms a baseline recurrent neural network approach." GP-Frontier for Local Mapless Navigation,"Mahmoud Ali, Lantao Liu",Indiana University,Autonomous Navigation,"We propose a new frontier concept called the Gaussian Process Frontier (GP-Frontier) that can be used to locally navigate a robot towards a goal without building a map. The GP-Frontier is built on the uncertainty assessment of an efficient variant of Gaussian Process called Variational Sparse Gaussian Process (VSGP). Based only on local ranging sensing measurement, the GP-Frontier can be used for navigation in both known and unknown environments. The proposed method is validated through intensive evaluations, and the results show that the GP-Frontier can navigate the robot in a safe and persistent way, i.e., the robot moves in the most open space (thus reducing the risk of collision) without relying on a map or a path planner." Image Masking for Robust Self-Supervised Monocular Depth Estimation,"Hemang Chawla, Kishaan Jeeveswaran, Elahe Arani, Bahram Zonooz",Navinfo Europe,Autonomous Navigation,"Self-supervised monocular depth estimation is a salient task for 3D scene understanding. Learned jointly with monocular ego-motion estimation, several methods have been proposed to predict accurate pixel-wise depth without using labeled data. Nevertheless, these methods focus on improving performance under ideal conditions without natural or digital corruptions. The general absence of occlusions is assumed even for object-specific depth estimation. These methods are also vulnerable to adversarial attacks, which is a pertinent concern for their reliable deployment in robots and autonomous driving systems. We propose MIMDepth, a method that adapts masked image modeling (MIM) for self-supervised monocular depth estimation. 
While MIM has been used to learn generalizable features during pre-training, we show how it could be adapted for direct training of monocular depth estimation. Our experiments show that MIMDepth is more robust to noise, blur, weather conditions, digital artifacts, occlusions, as well as untargeted and targeted adversarial attacks." Learning-Based Uncertainty-Aware Navigation in 3D Off-Road Terrains,"Hojin Lee, Junsung Kwon, Cheolhyeon Kwon","Ulsan National Institute of Science and Technology,Ulsan National Institute of Science and Technology",Autonomous Navigation,"This paper presents a safe, efficient, and agile ground vehicle navigation algorithm for 3D off-road terrain environments. Off-road navigation is subject to uncertain vehicle-terrain interactions caused by different terrain conditions on top of 3D terrain topology. Existing works are limited to overly simplified vehicle-terrain models. The proposed algorithm learns the terrain-induced uncertainties from driving data and encodes the learned uncertainty distribution into the traversability cost for path evaluation. The navigation path is then designed to optimize the uncertainty-aware traversability cost, resulting in a safe and agile vehicle maneuver. To ensure real-time execution, the algorithm is further implemented within a parallel computation architecture running on Graphics Processing Units (GPU)." Safe Real-World Autonomous Driving by Learning to Predict and Plan with a Mixture of Experts,"Stefano Pini, Christian Perone, Aayush Ahuja, Ana Sofia Rufino Ferreira, Moritz Niendorf, Sergey Zagoruyko","Woven by Toyota U.K. Limited,Woven Planet UK,Woven Planet,Woven Planet Holdings, Inc",Autonomous Navigation,"The goal of autonomous vehicles is to navigate public roads safely and comfortably. To enforce safety, traditional planning approaches rely on handcrafted rules to generate trajectories. 
Machine learning-based systems, on the other hand, scale with data and are able to learn more complex behaviors. However, they often ignore that agents and self-driving vehicle trajectory distributions can be leveraged to improve safety. In this paper, we propose modeling a distribution over multiple future trajectories for both the self-driving vehicle and other road agents, using a unified neural network architecture for prediction and planning. During inference, we select the planning trajectory that minimizes a cost taking into account safety and the predicted probabilities. Our approach does not depend on any rule-based planners for trajectory generation or optimization, improves with more training data and is simple to implement. We extensively evaluate our method through a realistic simulator and show that the predicted trajectory distribution corresponds to different driving profiles. We also successfully deploy it on a self-driving vehicle on urban public roads, confirming that it drives safely without compromising comfort. The code for training and testing our model on a public prediction dataset and the video of the road test are available at https://woven.mobi/safepathnet." Interpretable and Flexible Target-Conditioned Neural Planners for Autonomous Vehicles,"Haolan Liu, Jishen Zhao, Liangjun Zhang","University of California San Diego,UC San Diego,Baidu",Autonomous Navigation,"Learning-based approaches to autonomous vehicle planners have the potential to scale to many complicated real-world driving scenarios by leveraging huge amounts of driver demonstrations. However, prior work only learns to estimate a single planning trajectory, while there may be multiple acceptable plans in real-world scenarios. To solve the problem, we propose an interpretable neural planner to regress a heatmap, which effectively represents multiple potential goals in the bird's-eye view of an autonomous vehicle. 
The planner employs an adaptive Gaussian kernel and relaxed hourglass loss to better capture the uncertainty of planning problems. We also use a negative Gaussian kernel to add supervision to the heatmap regression, enabling the model to learn collision avoidance effectively. Our systematic evaluation on the Lyft Open Dataset across a diverse range of real-world driving scenarios shows that our model achieves a safer and more flexible driving performance than prior works." Visibility-Aware Navigation among Movable Obstacles,"Jose Muguira Iturralde, Aidan Curtis, Yilun Du, Leslie Kaelbling, Tomas Lozano-Perez","Massachusetts Institute of Technology,MIT",Autonomous Navigation,"In this paper, we examine the problem of visibility-aware robot navigation among movable obstacles (VANAMO). A variant of the well-known NAMO robotic planning problem, VANAMO puts additional visibility constraints on robot motion and object movability. This new problem formulation lifts the restrictive assumption that the map is fully visible and the object positions are fully known. We provide a formal definition of the VANAMO problem and propose the Look and Manipulate Backchaining (LaMB) algorithm for solving such problems. LaMB has a simple API that makes it more easily transferable to real-world robot applications and scales to the large 3D environments that robots typically inhabit. To evaluate LaMB, we construct a set of tasks that illustrate the complex interplay between visibility and object movability that can arise in mobile base manipulation problems in unknown environments. We show that LaMB outperforms NAMO and visibility-aware motion planning approaches as well as simple combinations of them on complex manipulation problems with partial observability." 
Trajectory Error Compensation for Optimal Control of UMA-2 – a Climbing Robot Executing Maintenance Operation in Harsh Environment,"Diego Gitardi, Simone Sabbadini, Anna Valente","SUPSI - University of Applied Sciences and Arts of Southern Swit,SUPSI-ISTePS",Trajectory Optimization,"UMA-2 is a wheeled mobile platform equipped with a vacuum adhesion system, eight actuated joints and four passive ones, designed to climb vertical and curved surfaces. The platform can perform maintenance tasks such as corrosion removal and cleaning with grinding while climbing. The quality of the repairing process is largely affected by grinding process parameters including tool forces, toolpath and the robot trajectory accuracy. The current work introduces a trajectory analysis and adaptation model to control the UMA-2 platform to ensure specific surface quality KPIs and incorporating the effects of robot compliancy. The proposed trajectory analysis has been extensively validated through experimental campaigns representative of maintenance in wind power industry." Obstacle-Aware Topological Planning Over Polyhedral Representation for Quadrotors,"Junjie Gao, Fenghua He, Wei Zhang, Yu Yao",Harbin Institute of Technology,Award Finalists 2,"In this paper, we propose a novel mapping-planning framework for autonomous quadrotor navigation. First, a polyhedron-based mapping algorithm is presented to fully exploit the information of the onboard sensor data. Polyhedra are generated to approximate the segmented clusters of occupied voxels. Then, customized data structures are designed to extract information for motion planning in real time. With complete knowledge of the shape, position, and number of the observed obstacles, we can conveniently generate smooth trajectories with sufficient obstacle clearance along the most desired direction. Before searching for the initial path, a local topological graph is constructed to keep the path expanding in the most favorable topology class. 
The following path search is segmented based on the graph vertices, which allows fast convergence. The refined trajectory is obtained after smoothing, and large deviations are penalized in the formulated optimization problem to preserve the original clearance. Finally, we analyze and validate the proposed framework through extensive simulations and real-world quadrotor flights." Trajectory Optimization for 3D Shape-Changing Robots with Differential Mobile Base,"Mengke Zhang, Chao Xu, Fei Gao, Yanjun Cao","Zhejiang University,Zhejiang University, Huzhou Institute of Zhejiang University",Trajectory Optimization,"Service robots have attracted extensive attention due to specially designed functions, such as mobile manipulators or robots with extra structures. For robots that have changing shapes, autonomous navigation in the real world presents new challenges. In this paper, we propose a trajectory optimization method for differential-drive mobile robots with controllable changing shapes in dense 3D environments. We model the whole-body trajectory as a polynomial trajectory that satisfies the nonholonomic dynamics of the base and dynamics of the extra joints. These constraints are converted into soft constraints, and an activation function for dense sampling is applied to avoid nonlinear mutations. In addition, we guarantee the safety of full shape by limiting the system's distance from obstacles. To comprehensively simulate a large extent of height and width changes, we designed a novel Shape-Changing Robot with a Differential Base (SCR-DB). Our global trajectory optimization gives a smooth and collision-free trajectory for SCR-DB at a low computational cost. We present vast simulations and real-world experiments to validate our performance, including coupled whole-body and independent differential-driven vehicle motion planning." 
Trajectory Optimization for Distributed Manipulation by Shaping a Physical Field,"Adam Uchytil, Jiri Zemanek","Faculty of Electrical Engineering, Czech Technical University in Prague,Czech Technical University in Prague",Trajectory Optimization,"Trajectory optimization is used to solve various planning tasks. In this paper we present an optimization-based method that solves a planning problem for multiple independent objects manipulated by a spatially continuous physical field. The field is generated and controlled (shaped) in real time by an array of actuators. In the paper we first formulate a trajectory optimization problem and a related initialization scheme, and then we demonstrate the proposed method using an experimental platform for distributed magnetic manipulation. The demonstrated task is that of planar reconfiguration of an ensemble of multiple objects, which significantly benefits from the inherent parallelism of the manipulation enabled by the array of actuators shaping the physical field. We show that the system can rearrange up to eight objects simultaneously while avoiding collisions." Globally Guided Trajectory Planning in Dynamic Environments,"Oscar De Groot, Laura Ferranti, Dariu Gavrila, Javier Alonso-Mora",Delft University of Technology,Trajectory Optimization,"Navigating mobile robots through environments shared with humans is challenging. From the perspective of the robot, humans are dynamic obstacles that must be avoided. These obstacles make the collision-free space nonconvex, which leads to two distinct passing behaviors per obstacle (passing left or right). For local planners, such as receding-horizon trajectory optimization, each behavior presents a local optimum in which the planner can get stuck. This may result in slow or unsafe motion even when a better plan exists. In this work, we identify trajectories for multiple locally optimal driving behaviors, by considering their topology. 
This identification is made consistent over successive iterations by propagating the topology information. The most suitable high-level trajectory guides a local optimization-based planner, resulting in fast and safe motion plans. We validate the proposed planner on a mobile robot in simulation and real-world experiments." VP-STO: Via-Point-Based Stochastic Trajectory Optimization for Reactive Robot Behavior,"Julius Jankowski, Lara Brudermüller, Nick Hawes, Sylvain Calinon","Idiap Research Institute and EPFL,University of Oxford,Idiap Research Institute",Trajectory Optimization,"Achieving reactive robot behavior in complex dynamic environments is still challenging as it relies on being able to solve trajectory optimization problems quickly enough, such that we can replan the future motion at frequencies which are sufficiently high for the task at hand. We argue that current limitations in Model Predictive Control (MPC) for robot manipulators arise from inefficient, high-dimensional trajectory representations and the negligence of time-optimality in the trajectory optimization process. Therefore, we propose a motion optimization framework that optimizes jointly over space and time, generating smooth and timing-optimal robot trajectories in joint-space. While being task-agnostic, our formulation can incorporate additional task-specific requirements, such as collision avoidance, and yet maintain real-time control rates, demonstrated in simulation and real-world robot experiments on closed-loop manipulation. For additional material, please visit https://sites.google.com/oxfordrobotics.institute/vp-sto." Modular and Parallelizable Multibody Physics Simulation Via Subsystem-Based ADMM,"Jeongmin Lee, Minji Lee, Dongjun Lee",Seoul National University,Trajectory Optimization,"In this paper, we present a new multibody physics simulation framework that utilizes the subsystem-based structure and the Alternating Direction Method of Multiplier (ADMM). 
The major challenge in simulating complex high degree of freedom systems is a large number of coupled constraints and large-sized matrices. To address this challenge, we first split the multibody into several subsystems and reformulate the dynamics equation into a subsystem perspective based on the structure of their interconnection. Then we utilize ADMM with our novel subsystem-based variable splitting scheme to solve the equation, which allows parallelizable and modular architecture. The resulting algorithm is fast, scalable, versatile, and converges well while maintaining solution consistency. Several illustrative examples are implemented with performance evaluation results showing advantages over other state-of-the-art algorithms." Real-Time Unified Trajectory Planning and Optimal Control for Urban Autonomous Driving under Static and Dynamic Obstacle Constraints,"Rowan Dempster, Mohammad Alsharman, Derek Rayside, William Melek",University of Waterloo,Trajectory Optimization,"Trajectory planning and control have historically been separated into two modules in automated driving stacks. Trajectory planning focuses on higher-level tasks like avoiding obstacles and staying on the road surface, whereas the controller tries its best to follow an ever changing reference trajectory. We argue that this separation is (1) flawed due to the mismatch between planned trajectories and what the controller can feasibly execute, and (2) unnecessary due to the flexibility of the model predictive control (MPC) paradigm. Instead, in this paper, we present a unified MPC-based motion planning and control scheme that guarantees feasibility with respect to road boundaries, the static and dynamic environment, and enforces passenger comfort constraints. The scheme is evaluated rigorously in a variety of scenarios focused on proving the effectiveness of the optimal control problem (OCP) design and real-time solution methods. 
The prototype code will be released at https://github.com/WATonomous/control." A General Locomotion Approach for a Novel Multi-Legged Spherical Robot,"Dun Yang, Yunfei Liu, Yang Yu",Beihang University,Integrated Planning and Control,"As a kind of ground mobile robot, deformable robots have many advantages, such as strong terrain adaptability, light weight, and portability. Among these robots, the radial skeleton robot has better stability and controllability. However, because the friction between foot and ground is hard to predict, the accuracy of the gait generation algorithms that have been studied is very low. Furthermore, there is no closed-loop control scheme for this kind of robot at present. We designed a 12-legged radial skeleton robot with high extension ratio legs, proposed a high-precision gait generation algorithm for any multi-legged radial skeleton robot, and proposed the first closed-loop control scheme for this kind of robot. A dynamic model considering contact friction is established. The robot has the advantages of omnidirectional motion, high-precision trajectory tracking, and motion robustness. By conducting prototype experiments, it is verified that our method achieves the highest accuracy when tracking a trajectory and remains robust in an unknown environment." A Coarse-To-Fine Framework for Dual-Arm Manipulation of Deformable Linear Objects with Whole-Body Obstacle Avoidance,"Mingrui Yu, Kangchen Lv, Changhao Wang, Masayoshi Tomizuka, Xiang Li","Tsinghua University,University of California, Berkeley,University of California",Integrated Planning and Control,"Manipulating deformable linear objects (DLOs) to achieve desired shapes in constrained environments with obstacles is a meaningful but challenging task. 
Global planning is necessary for such a highly-constrained task; however, accurate models of DLOs required by planners are difficult to obtain owing to their deformable nature, and the inevitable modeling errors significantly affect the planning results, probably resulting in task failure if the robot simply executes the planned path in an open-loop manner. In this paper, we propose a coarse-to-fine framework to combine global planning and local control for dual-arm manipulation of DLOs, capable of precisely achieving desired configurations and avoiding potential collisions between the DLO, robot, and obstacles. Specifically, the global planner refers to a simple yet effective DLO energy model and computes a coarse path to find a feasible solution efficiently; then the local controller follows that path as guidance and further shapes it with closed-loop feedback to compensate for the planning errors and improve the task accuracy. Both simulations and real-world experiments demonstrate that our framework can robustly achieve desired DLO configurations in constrained environments with imprecise DLO models, which may not be reliably achieved by only planning or control." Adaptive Approximation of Dynamics Gradients Via Interpolation to Speed up Trajectory Optimisation,"David Mackenzie Charles Russell, Rafael Papallas, Mehmet Remzi Dogar",University of Leeds,Integrated Planning and Control,"Trajectory optimisation methods for robotic motion planning often require the use of first order derivatives of the dynamics of the system with respect to the states and controls of the system. Particularly when multi-contact dynamics are present, these derivatives are often numerically approximated by a method such as finite-differencing. Finite-differencing whilst using an expensive physics simulator is usually the bottleneck in these trajectory optimisation algorithms. 
Since these dynamics derivatives do not change substantially over certain time intervals, we propose that trajectory optimisers can compute the dynamics derivatives less often and then interpolate approximations to the derivatives in between calculated derivatives, gaining a significant speed up for overall optimisation time with no observable degradation in the generated behaviour. We investigate different methods of interpolating approximations as well as propose an adaptive method to detect when to compute the derivatives with finite-differencing. We find a speed-up of planning times on average by 60% in a contact-based manipulation task." "Learning Augmented, Multi-Robot Long-Horizon Navigation in Partially Mapped Environments","Abhish Khanal, Gregory Stein",George Mason University,Integrated Planning and Control,"We present a novel approach for efficient and reliable goal-directed long-horizon navigation for a multi-robot team in a structured, unknown environment by predicting statistics of unknown space. Building on recent work in learning-augmented model based planning under uncertainty, we introduce a high-level state and action abstraction that lets us approximate the challenging Dec-POMDP into a tractable stochastic MDP. Our Multi-Robot Learning over Subgoals Planner (MR-LSP) guides agents towards coordinated exploration of regions more likely to reach the unseen goal. We demonstrate improvement in cost against other multi-robot strategies; in simulated office-like environments, we show that our approach saves 13.29% (2 robot) and 4.6% (3 robot) average cost versus standard non-learned optimistic planning and a learning-informed baseline." 
Switching Attention in Time-Varying Environments Via Bayesian Inference of Abstractions,"Meghan Booker, Anirudha Majumdar",Princeton University,Integrated Planning and Control,"Motivated by the goal of endowing robots with a means for focusing attention in order to operate reliably in complex, uncertain, and time-varying environments, we consider how a robot can (i) determine which portions of its environment to pay attention to at any given point in time, (ii) infer changes in context (e.g., task or environment dynamics), and (iii) switch its attention accordingly. In this work, we tackle these questions by modeling context switches in a time-varying Markov decision process (MDP) framework. We utilize the theory of bisimulation-based state abstractions in order to synthesize mechanisms for paying attention to context-relevant information. We then present an algorithm based on Bayesian inference for detecting changes in the robot's context (task or environment dynamics) as it operates online, and use this to trigger switches between different abstraction-based attention mechanisms. Our approach is demonstrated on two examples: (i) an illustrative discrete-state tracking problem, and (ii) a continuous-state tracking problem implemented on a quadrupedal hardware platform. These examples demonstrate the ability of our approach to detect context switches online and robustly ignore task-irrelevant distractors by paying attention to context-relevant information." Hierarchical Policy Blending As Inference for Reactive Robot Control,"Kay Hansel, Julen Urain, Jan Peters, Georgia Chalvatzaki","Intelligent Autonomous Systems Group, Technical University Darmstadt,TU Darmstadt,Technische Universität Darmstadt,Technische Universität Darmstadt",Integrated Planning and Control,"Motion generation in cluttered, dense, and dynamic environments is a central topic in robotics, rendered as a multi-objective decision-making problem. 
Current approaches trade-off between safety and performance. On the one hand, reactive policies guarantee a fast response to environmental changes at the risk of suboptimal behavior. On the other hand, planning-based motion generation provides feasible trajectories, but the high computational cost may limit the control frequency and, thus, safety. To combine the benefits of reactive policies and planning, we propose a hierarchical motion generation method. Moreover, we employ probabilistic inference methods to formalize the hierarchical model and stochastic optimization. We realize this approach as a weighted product of stochastic, reactive expert policies, where planning is used to adaptively compute the optimal weights over the task horizon. This stochastic optimization avoids local optima and proposes feasible reactive plans that find paths in cluttered and dense environments. Our extensive experimental study in planar navigation and 7DoF manipulation shows that our proposed hierarchical motion generation method outperforms both myopic reactive controllers and online re-planning methods." Efficient Learning of High Level Plans from Play,"Núria Armengol Urpí, Marco Bagatella, Otmar Hilliges, Georg Martius, Stelian Coros","ETH Zurich,Max Planck Institute for Intelligent Systems",Integrated Planning and Control,"Real-world robotic manipulation tasks remain an elusive challenge, since they involve both fine-grained environment interaction, as well as the ability to plan for long-horizon goals. Although deep reinforcement learning (RL) methods have shown encouraging results when planning end-to-end in high-dimensional environments, they remain fundamentally limited by poor sample efficiency due to inefficient exploration, and by the complexity of credit assignment over long horizons. 
In this work, we present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL to achieve long-horizon complex manipulation tasks. We leverage task-agnostic play data to learn a discrete behavioral prior over object-centric primitives, modeling their feasibility given the current context. We then design a high-level goal-conditioned policy which (1) uses primitives as building blocks to scaffold complex long-horizon tasks and (2) leverages the behavioral prior to accelerate learning. We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks and learns policies that can be easily transferred to physical hardware." Multi-Objective Ergodic Search for Dynamic Information Maps,"Ananya Rao, Abigail Breitfeld, Alberto Candela, Benjamin Jensen, David Wettergreen, Howie Choset","Carnegie Mellon University,NASA Jet Propulsion Laboratory, Caltech",Learning for Motion and Path Planning,"Robotic explorers are essential tools for gathering information about regions that are inaccessible to humans. For applications like planetary exploration or search and rescue, robots use prior knowledge about the area to guide their search. Ergodic search methods find trajectories that effectively balance exploring unknown regions and exploiting prior information. In many search based problems, the robot must take into account multiple factors such as scientific information gain, risk, and energy, and update its belief about these dynamic objectives as they evolve over time. However, existing ergodic search methods either consider multiple static objectives or consider a single dynamic objective, but not multiple dynamic objectives. We address this gap in existing methods by presenting an algorithm called Dynamic Multi-Objective Ergodic Search (D-MO-ES) that efficiently plans an ergodic trajectory on multiple changing objectives. 
Our experiments show that our method requires up to nine times less compute time than a naive approach with comparable coverage of each objective." Safety-Critical Ergodic Exploration in Cluttered Environments Via Control Barrier Functions,"Cameron Lerch, Dayi Dong, Ian Abraham",Yale University,Learning for Motion and Path Planning,"In this paper, we address the problem of safe trajectory planning for autonomous search and exploration in constrained, cluttered environments. Guaranteeing safe (collision-free) trajectories is a challenging problem that has garnered significant attention due to its importance in the successful utilization of robots in search and exploration tasks. This work contributes a method that generates guaranteed safety-critical search trajectories in a cluttered environment. Our approach integrates safety-critical constraints using discrete control barrier functions (DCBFs) with ergodic trajectory optimization to enable safe exploration. Ergodic trajectory optimization plans continuous exploratory trajectories that guarantee complete coverage of a space. We demonstrate through simulated and experimental results on a drone that our approach is able to generate trajectories that enable safe and effective exploration. Furthermore, we show the efficacy of our approach for safe exploration using real-world single- and multi-drone platforms." GuILD: Guided Incremental Local Densification for Accelerated Sampling-Based Motion Planning,"Rosario Scalise, Aditya Mandalika, Brian Hou, Sanjiban Choudhury, Siddhartha Srinivasa","University of Washington,Cornell University",Learning for Motion and Path Planning,"Sampling-based motion planners rely on incremental densification to discover progressively shorter paths. After computing a feasible path ξ between start xs and goal xt, the Informed Set (IS) prunes the configuration space X by conservatively eliminating points that cannot yield shorter paths. 
Densification via sampling from this Informed Set retains asymptotic optimality of sampling from the entire configuration space. For path length c(ξ) and Euclidean heuristic h, IS = {x|x ∈ X , h(xs, x) + h(x, xt) ≤ c(ξ)}. Relying on the heuristic can render the IS especially conservative in high dimensions or complex environments. Furthermore, the IS only shrinks when shorter paths are discovered. Thus, the computational effort from each iteration of densification and planning is wasted if it fails to yield a shorter path, despite improving the cost-to-come for vertices in the search tree. Our key insight is that even in such a failure, shorter paths to vertices in the search tree (rather than just the goal) can immediately improve the planner’s sampling strategy. Guided Incremental Local Densification (GuILD) leverages this information to sample from Local Subsets of the IS. We show that GuILD significantly outperforms uniform sampling of the Informed Set in simulated R2, SE(2) environments and manipulation tasks in R7." ARiADNE: A Reinforcement Learning Approach Using Attention-Based Deep Networks for Exploration,"Yuhong Cao, Tianxiang Hou, Yizhuo Wang, Xian Yi, Guillaume Sartoretti","National University of Singapore,National University of Singapore (NUS)",Learning for Motion and Path Planning,"In autonomous robot exploration tasks, a mobile robot needs to actively explore and map an unknown environment as fast as possible. Since the environment is being revealed during exploration, the robot needs to frequently re-plan its path online, as new information is acquired by onboard sensors and used to update its partial map. While state-of-the-art exploration planners are frontier- and sampling-based, encouraged by the recent development in deep reinforcement learning (DRL), we propose ARiADNE, an attention-based neural approach to obtain real-time, non-myopic path planning for autonomous exploration. 
ARiADNE is able to learn dependencies at multiple spatial scales between areas of the agent's partial map, and implicitly predict potential gains associated with exploring those areas. This allows the agent to sequence movement actions that balance the natural trade-off between exploitation/refinement of the map in known areas and exploration of new areas. We experimentally demonstrate that our method outperforms both learning and non-learning state-of-the-art baselines in terms of average trajectory length to complete exploration in hundreds of simplified 2D indoor scenarios. We further validate our approach in high-fidelity Robot Operating System (ROS) simulations, where we consider a real sensor model, a standard Simultaneous Localization and Mapping (SLAM) algorithm, and a realistic low-level motion controller, towards deployment on real robots." On Shortest Arc-To-Arc Dubins Path,"Satyanarayana Gupta Manyam, David Casbeer","Infoscitex corp.,AFRL",Learning for Motion and Path Planning,"For a given set of orbits, the Orbiting Dubins Traveling Salesman Problem (ODTSP) involves finding a Dubins tour that is tangential to each orbit at some point. We consider a shortest Arc-to-Arc Dubins (ATAD) path problem that arises in computing a lower bound to the ODTSP. Given an initial and a final arc, the objective of ATAD is to find the shortest Dubins path such that the initial and final points lie on the two given arcs, and the path is tangential to the arcs. We analyze the six Dubins modes and the degenerate cases to find local minima. We present the optimal solution for the ATAD, along with an algorithm that uses this solution to compute tight lower bounds for the ODTSP. We test the lower bounding algorithm on several random instances and report the results. Using this algorithm, we show that the percent gap between upper and lower bounds is less than 10% for most instances." 
Robust Navigation with Cross-Modal Fusion and Knowledge Transfer,"Wenzhe Cai, Guangran Cheng, Lingyue Kong, Lu Dong, Changyin Sun",Southeast University,Learning for Motion and Path Planning,"Recently, learning-based approaches show promising results in navigation tasks. However, the poor generalization capability and the simulation-reality gap prevent a wide range of applications. We consider the problem of improving the generalization of mobile robots and achieving sim-to-real transfer for navigation skills. To that end, we propose a cross-modal fusion method and a knowledge transfer framework for better generalization. This is realized by a teacher-student distillation architecture. The teacher learns a discriminative representation and the near-perfect policy in an ideal environment. By imitating the behavior and representation of the teacher, the student is able to align the features from noisy multi-modal input and reduce the influence of variations on navigation policy. We evaluate our method in simulated and real-world environments. Experiments show that our method outperforms the baselines by a large margin and achieves robust navigation performance with varying working conditions." Contextual Multi-Objective Path Planning,"Anna Nickelson, Kagan Tumer, William Smart",Oregon State University,Learning for Motion and Path Planning,"Many critical robot environments, such as healthcare and security, require robots to account for context-dependent criteria when performing their functions (e.g., navigation). Such domains require decisions that balance multiple factors, making it difficult for robots to make contextually appropriate decisions. Multi-Objective Optimization (MOO) methods offer a potential solution by trading off between objectives; however concepts like Pareto fronts are not only expensive to compute but struggle with differentiating among solutions on the Pareto front. 
This work introduces the Contextual Multi-Objective Path Planning (CMOPP) algorithm, which enables the robot to trade off different complex costs dependent on context. The key insight of this work is to separate the path planning and path cost estimation into two independent steps, thus significantly reducing computation cost without impacting the quality of the resulting path. As a result, CMOPP is able to accurately model path costs, which provide meaningful trade-offs when choosing a path that best fits the context. We show the benefits of CMOPP on case studies that demonstrate its contextual path planning capabilities. CMOPP finds contextually appropriate paths by first reducing the search space up to 99.9% to a near-optimal set of paths. This reduction enables the generation of accurate path cost models, using up to 90% less computation than similar methods." A Continuous Off-Policy Reinforcement Learning Scheme for Optimal Motion Planning in Simply-Connected Workspaces,"Panagiotis Rousseas, Charalampos Bechlioulis, Kostas Kyriakopoulos","National Technical University of Athens,University of Patras,National Technical Univ. of Athens",Learning for Motion and Path Planning,"In this work, an Integral Reinforcement Learning (RL) framework is employed to provide provably safe, convergent and almost globally optimal policies in a novel Off-Policy Iterative method for simply-connected workspaces. This restriction stems from the impossibility of strictly global navigation in multiply connected manifolds, and is necessary for formulating continuous solutions. The current method generalizes and improves upon previous results, where parametrized controllers hindered the method in scope and results. Through enhancing the traditional reactive paradigm with RL, the proposed scheme is demonstrated to outperform both previous reactive methods as well as an RRT⋆ method in path length, cost function values and execution times, indicating almost global optimality." 
Towards Robust Autonomous Grasping with Reflexes Using High-Bandwidth Sensing and Actuation,"Andrew Saloutos, Hongmin Kim, Elijah Stanger-jones, Menglong Guo, Sangbae Kim","Massachusetts Institute of Technology,Seoul National University,University of California Berkeley",Grasping and Manipulation II,"Modern robotic manipulation systems fall short of human manipulation skills partly because they rely on closing feedback loops exclusively around vision data, which reduces system bandwidth and speed. By developing autonomous grasping reflexes that rely on high-bandwidth force, contact, and proximity data, the overall system speed and robustness can be increased while reducing reliance on vision data. We are developing a new system built around a low-inertia, high-speed arm with nimble fingers that combines a high-level trajectory planner operating at less than 1 Hz with low-level autonomous reflex controllers running upwards of 300 Hz. We characterize the reflex system by comparing the volume of the set of successful grasps for a naive baseline controller and variations of our reflexive grasping controller, finding that our controller expands the set of successful grasps by 55% relative to the baseline. We also deploy our reflexive grasping controller with a simple vision-based planner in an autonomous clutter clearing task, achieving a grasp success rate above 90% while clearing over 100 items." High-Speed Scooping: An Implementation through Stiffness Control and Direct-Drive Actuation,"Ka Hei Mak, Pu Xu, Jungwon Seo","The Hong Kong University of Science and Technology,Hong Kong University of Science and Technology,Pusan National University",Grasping and Manipulation II,"This study presents the technique of robotic high-speed scooping: rapidly picking an object lying on a support surface by making contact with the object’s open top face and the bottom face that is hidden in contact with the support surface. 
Essential to high-speed scooping is thus to make suitable dynamic, impactful interaction happen among the robot, object, and environment under errors and uncertainties. We propose a solution to this challenge based on stiffness control, an approach for indirect force control using the robot that is arranged to behave like a desired mechanical system. An implementation of the solution is then presented using a custom-built two-fingered direct-drive gripper. Our experiments verify that high-speed scooping operation is achievable, with the duration of dynamic interaction less than 0.3 s, and effective to various scooping situations featuring objects durable and fragile." GraspAda: Deep Grasp Adaptation through Domain Transfer,"Yiting Chen, Junnan Jiang, Ruiqi Lei, Yasemin Bekiroglu, Fei Chen, Miao Li","Wuhan University,Tsinghua University,Chalmers University of Technology, University College London,The Chinese University of Hong Kong",Grasping and Manipulation II,"Learning-based methods for robotic grasping have been shown to yield high performance. However, they rely on expensive-to-acquire and well-labeled datasets. In addition, how to generalize the learned grasping ability across different scenarios is still unsolved. In this paper, we present a novel grasp adaptation strategy to transfer the learned grasping ability to new domains based on visual data using a new grasp feature representation. We present a conditional generative model for visual data transformation. By leveraging the deep feature representational capacity from the well-trained grasp synthesis model, our approach utilizes feature-level contrastive representation learning and adopts adversarial learning on output space. This way we bridge the domain gap between the new domain and the training domain while keeping consistency during the adaptation process. Based on transformed input grasp data via the generator, our trained model can generalize to new domains without any fine-tuning. 
The proposed method is evaluated on benchmark datasets and based on real robot experiments. The results show that our approach leads to high performance in new scenarios." Task-Oriented Stiffness Setting for a Variable Stiffness Hand,"Ana Elvira Huezo Martin, Ashok Meenakshi Sundaram, Werner Friedl, Virginia Ruiz Garate, Maximo A. Roa","German Aerospace Center (DLR),German AerospaceCenter (DLR),University of Mondragon,DLR - German Aerospace Center",Grasping and Manipulation II,"The integration of variable stiffness actuators (VSA) in robotic systems endows the robot with intrinsic flexibility and therefore robustness to unknown disturbances. However, this characteristic presents a challenge: choosing the best intrinsic stiffness setting guaranteeing the required force application capability while keeping the system as adaptable to uncertainties as possible. This paper proposes a method to set the optimal stiffness for a multi-finger VSA hand to perform a desired manipulation task. The task is generically represented as a force (with unknown magnitude) applied along a reference direction. According to the force application’s direction and the hand’s kinematic state, the fingers assume a certain role to split the collective force application. We employ the endpoint stiffness ellipsoid to analyze the required finger stiffness to fulfill the task. We evaluate the optimized stiffness settings in a door opening application with an iterative adaption of the stiffness behavior to handle the unknown force requirement. The results show a successful collective behavior of the fingers, where the stiffness settings consider a task-oriented force-adaptability trade-off and effective use of independent VSA fingers." Flipbot: Learning Continuous Paper Flipping Via Coarse-To-Fine Exteroceptive-Proprioceptive Exploration,"Chao Zhao, Chunli Jiang, Junhao Cai, Hongyu Yu, Michael Y. 
Wang, Qifeng Chen","Hong Kong University of Science and Technology,The Hong Kong University of Science and Technology,Monash University,HKUST",Grasping and Manipulation II,"This paper tackles the task of singulating and grasping paper-like deformable objects. We refer to such tasks as paper-flipping. In contrast to manipulating deformable objects that lack compression strength (such as shirts and ropes), minor variations in the physical properties of the paper-like deformable objects significantly impact the results, making manipulation highly challenging. Here, we present Flipbot, a novel solution for flipping paper-like deformable objects. Flipbot allows the robot to capture object physical properties by integrating exteroceptive and proprioceptive perceptions that are indispensable for manipulating deformable objects. Furthermore, by incorporating a proposed coarse-to-fine exploration process, the system is capable of learning the optimal control parameters for effective paper-flipping through proprioceptive and exteroceptive inputs. We deploy our method on a real-world robot with a soft gripper and learn in a self-supervised manner. The resulting policy demonstrates the effectiveness of Flipbot on paper-flipping tasks with various settings beyond the reach of prior studies, including but not limited to flipping pages throughout a book and emptying paper sheets in a box." Anthropomorphic Robot Hand Using the Principle of Sweat and Fingerprints of Human Hands,"Donghyun Kim, Junmo Yang, Dongwon Yun","Daegu Gyeongbuk Institute of Science and Technology,Daegu Gyeongbuk Institute of Science and Technology (DGIST)",Grasping and Manipulation II,"In our daily life, when a small amount of sweat or water forms on a person's hand, we can empirically feel that the friction force of the hand increases, and the objects are gripped well. However, if sweat or water forms heavily, we can feel the friction decrease when holding an object. 
In this study, we analyzed the degree to which fingerprints and sweat present on a person's hand can affect the friction force between the hand and the gripping object. We fabricated an anthropomorphic robot hand with a fingerprint structure to set up an environment similar to that of the human hand, and performed object-holding and friction-change experiments by changing the amount of sweat to verify that this phenomenon can be applied to a robot hand. Furthermore, we for the first time proposed and developed a variable friction system using fluids and microstructures to solve the difficulty of anthropomorphic robot hand force control. By applying the manufactured variable friction system and performing an active friction control performance test and an object grip test of the robot hand, we validated that the fingerprint and sweat of a human hand can affect the grip of an actual object." In-Hand Manipulation in Power Grasp: Design of an Adaptive Robot Hand with Active Surfaces,"Yilin Cai, Shenli Yuan","Carnegie Mellon University,SRI International",Award Finalists 1,"This paper describes the development of BACH (Belt–Augmented Compliant Hand), a compliant robotic hand equipped with active surfaces. The hand can securely grasp an object using power grasp and simultaneously manipulate the grasped object. The hand consists of three identical fingers, each with an actuated timing belt wrapped around a Fin Ray based compliant finger backbone. Each finger is mounted on a compliant pivot joint allowing for further adaptability. The combination of compliant mechanisms and active surfaces allows the hand to perform dexterous in-hand manipulation with great robustness. Multiple analyses were conducted to optimize and validate the design of BACH. The hand was experimentally tested for grasping and manipulating objects of various geometries and sizes, and it demonstrated highly robust and efficient in-hand manipulation capabilities." 
Passive Robotic Gripper Using a Contact-Based Locking Mechanism,"Issei Nate, Zhongkui Wang, Shinichi Hirai","Ritsumeikan University,Ritsumeikan Univ.",Grasping and Manipulation II,"Various robotic end-effectors have been developed for various applications. Most of them are driven by one or more actuators, such as motors or pneumatic sources, which usually make the end-effector bulky and vulnerable due to the external cables and air tubes. In this study, we propose a novel passive robotic gripper with a locking mechanism that does not require any actuators. Locking and unlocking of the fingers are performed through contact with the environment, such as the ground, a table, or a conveyor. To facilitate gripper design, modeling of the deformed finger shape was conducted, and experimental validation was performed. Finally, a robotic gripper with eight such fingers was fabricated using a 3D printer. Experiments were then performed to investigate the grasping capacities in terms of size and weight. We found that the larger the object, the greater the weight capacity of the gripper, which increased significantly when the object exceeded a certain size. In addition, experiments on grasping various food products were performed, and the results suggested that the proposed gripper could grasp objects with complex shapes and soft, fragile properties, but that very fragile objects were damaged due to the rigid structure of the gripper." The New Dexterity Adaptive Humanlike Robot Hand: Employing a Reconfigurable Palm for Robust Grasping and Dexterous Manipulation,"Geng Gao, Anany Dwivedi, Minas Liarokapis","Acumino inc,University of Auckland,The University of Auckland",Grasping and Manipulation II,"Robots have predominantly been used in automating tasks in structured industrial environments; however, with advances in technology they are starting to take on roles in dynamic everyday life scenarios. As a result, the tasks executed by robotic systems will also grow in sophistication. 
Grasping and dexterous manipulation are critical aspects that allow humans to execute these sophisticated tasks, enabling them to interact with their environment. As such, emulating the human hand can be advantageous for interacting with a world designed for humans. However, directly replicating the anatomical structure of the hand produces designs that are fully actuated, expensive, and which require sophisticated controls and sensing to operate efficiently. In this paper, we present two different versions of the New Dexterity adaptive, human-like robot hand that is capable of executing robust caging grasps under a wide range of environmental uncertainties (e.g., object pose uncertainties). One of the versions has a classic, fixed thumb base while the second one incorporates an additional degree of freedom at the thumb base, which enables a translational motion for repositioning the thumb and adjusting the aperture. This design choice enhances the in-hand manipulation capabilities of the robot hand, improving also the power grasping capabilities for larger objects. The performances of the proposed robot hand designs are experimentally validated and compared through three different tests: i) grasping experiments involving everyday-life objects, ii) force experiments that evaluate their force exertion capabilities, and iii) in-hand manipulation experiments that demonstrate and compare their dexterity." 
Picking by Tilting: In-Hand Manipulation for Object Picking Using Effector with Curved Form,"Yanshu SONG, Abdullah Nazir, Darwin Lau, Yunhui Liu","CUHK(Chinese University of Hong Kong),Hong Kong Centre for Logistics Robotics,The Chinese University of Hong Kong,Chinese University of Hong Kong",Grasping and Manipulation II,"This paper presents a robotic in-hand manipulation technique that can be applied to pick an object too large to grasp in a prehensile manner, by taking advantage of its contact interactions with a curved, passive end-effector, and two flat support surfaces. First, the object is tilted up while being held between the end-effector and the supports. Then, the end-effector is tucked into the gap underneath the object, which is formed by tilting, in order to obtain a grasp against gravity. In this paper, we first examine the mechanics of tilting to understand the different ways in which the object can be initially tilted. We then present a strategy to tilt up the object in a secure manner. Finally, we demonstrate successful picking of objects of various size and geometry using our technique through a set of experiments performed with a custom-made robotic device and a conventional robot arm. Our experiment results show that object picking can be performed reliably with our method using simple hardware and control, and when possible, with appropriate fixture design." Linear Delta Arrays for Compliant Dexterous Distributed Manipulation,"Sarvesh Bipin Patil, Long Tao, Tess Hellebrekers, Zeynep Temel, Oliver Kroemer","Carnegie Mellon University School of Computer Science,Carnegie Mellon University,Meta AI Research",Grasping and Manipulation II,"This paper presents a new type of distributed dexterous manipulator: delta arrays. Our delta array setup consists of 64 linearly-actuated delta robots with 3D-printed compliant linkages. 
Through the design of the individual delta robots, the modular array structure, and distributed communication and control, we study a wide range of in-plane and out-of-plane manipulations, as well as prehensile manipulations among subsets of neighboring delta robots. We also demonstrate dexterous manipulation capabilities of the delta array using reinforcement learning while leveraging compliance. Our evaluations show that the resulting 192 DoF compliant robot is capable of performing various coordinated distributed manipulations of a variety of objects, including translation, alignment, prehensile squeezing, lifting, and grasping." A Tactile-Enabled Hybrid Rigid-Soft Continuum Manipulator for Forceful Enveloping Grasps Via Scale Invariant Design,"Ian Taylor, Maheera Bawa, Alberto Rodriguez","Massachusetts Institute of Technology,MIT",Grasping and Manipulation II,"This work presents a novel hybrid rigid-soft continuum manipulator, which integrates high-resolution tactile sensing in a form factor that is forceful, compliant, inherently safe, and easily controllable. We utilize a hybrid approach motivated by scale-invariant principles to fuse the rigid and soft design domains while addressing their respective challenges. We use Euler-Bernoulli beam theory and geometric inference to design and develop a novel variant of folded flexure hinge (FFH) compliant mechanism, the variable area moment of inertia folded flexure hinge (VAFFH), which deforms logarithmically along its length and thus yields first-order scale-invariant grasp behavior. Finally, we characterize the forcefulness of the manipulator and demonstrate its compliance, adaptability, and tactile sensing capabilities in selected tasks." 
Adaptive Optimal Electrical Resistance Tomography for Large-Area Tactile Sensing,"Wendong Zheng, Huaping Liu, Di Guo, Wuqiang Yang","Tsinghua University,Beijing University of Posts and Telecommunications,The University of Manchester",Force and Tactile Sensing I,"It is critical to perceive physical contact for intelligent robots to safely interact in dynamic environments. As physical contacts can occur at any location, a well-performing tactile sensing system should be able to deploy a large area on robotic surface. Some researchers have implemented large-area tactile sensors by using sensing arrays, but it is challenging to deploy many sensing elements. Electrical resistance tomography (ERT) has recently been introduced into tactile sensing to overcome some of the limitations with conventional tactile sensing arrays, and good results have been achieved for some robotic applications. However, a particular challenge is that spatial resolution is low. Although various attempts have been made to improve the performance of ERT-based tactile sensors, the intrinsic resolution issue remains unsolved. In this paper, we propose a novel adaptive optimal drive strategy for efficient ERT-based large-area tactile sensing for robotic applications, which can adaptively select the current injection and voltage measurement pattern for optimal tactile signal. In particular, regions of tactile contacts are preliminarily detected and localized by a base scanning pattern with only a few measurement data. According to this detected region, the adaptive strategy can select the optimal current injection and voltage measurement pattern to improve the sensing performance by maximizing the current density. To verify the effectiveness of the proposed strategy, the proposed method is comprehensively evaluated by simulation and experiments. The results revealed that the optimal strategy can effectively improve both spatial and temporal resolution." 
Towards Open-Set Material Recognition Using Robot Tactile Sensing,"Kun-hong Liu, Qianhui Yang, Yu Xie, Xiangyi Huang",Xiamen University,Award Finalists 2,"Texture recognition can provide clues for robots to interact with the external environment. The traditional tactile material recognition task is studied under the closed-set assumption, which means that all types of materials are included in the training set. However, open-set material recognition for robots is of much greater significance because in real-world applications, there is usually something that doesn’t belong to any known class. To date, no researchers have discussed this problem further. To cope with unknown classes, this study proposes the Open set Material Recognition (OpenMR) based on General Convolutional Prototype Learning (GCPL). To handle the open space risk for GCPL caused by the lack of unknown samples in the training stage, we use Generative Adversarial Networks (GAN) to synthesize open-set samples as unknowns. The proposed framework is implemented and tested on two batches of tactile data collected in different exploratory motions on 8 material textures using the electronic skin. Compared with other open-set classifiers, experiments reveal that the proposed framework achieves competitive performance in both known classification and unknown detection." "RobotSweater: Scalable, Generalizable, and Customizable Machine-Knitted Tactile Skins for Robots","Zilin Si, Tianhong Yu, Katrene Morozov, James Mccann, Wenzhen Yuan","Carnegie Mellon University,Cornell University,University of California, Santa Barbara",Force and Tactile Sensing I,"Tactile sensing is essential for robots to perceive and react to the environment. However, it remains a challenge to make large-scale and flexible tactile skins on robots. Industrial machine knitting provides solutions to manufacture customizable fabrics. 
Along with functional yarns, it can produce highly customizable circuits that can be made into tactile skins for robots. In this work, we present RobotSweater, a machine-knitted pressure-sensitive tactile skin that can be easily applied on robots. We design and fabricate a parameterized multi-layer tactile skin using off-the-shelf yarns, and characterize our sensor on both a flat testbed and a curved surface to show its robust contact detection, multi-contact localization, and pressure sensing capabilities. The sensor is fabricated using a well-established textile manufacturing process with a programmable industrial knitting machine, which makes it highly customizable and low-cost. The textile nature of the sensor also makes it easily fit curved surfaces of different robots and have a friendly appearance. Using our tactile skins, we conduct closed-loop control with tactile feedback for two applications: (1) human lead-through control of a robot arm, and (2) human-robot interaction with a mobile robot." DTact: A Vision-Based Tactile Sensor That Measures High-Resolution 3D Geometry Directly from Darkness,"Changyi Lin, Ziqi Lin, Shaoxiong Wang, Huazhe Xu","Shanghai Qi Zhi Institute,Tsinghua University,MIT",Force and Tactile Sensing I,"Vision-based tactile sensors that can measure 3D geometry of the contacting objects are crucial for robots to perform dexterous manipulation tasks. However, the existing sensors are usually complicated to fabricate and delicate to extend. In this work, we novelly take advantage of the reflection property of semitransparent elastomer to design a robust, low-cost, and easy-to-fabricate tactile sensor named DTact. DTact measures high-resolution 3D geometry accurately from the darkness shown in the captured tactile images with only a single image for calibration. In contrast to previous sensors, DTact is robust under various illumination conditions. 
Then, we build prototypes of DTact that have non-planar contact surfaces with minimal extra effort and cost. Finally, we perform two intelligent robotic tasks including pose estimation and object recognition using DTact, in which DTact shows large potential in applications." MagTac: Magnetic Six-Axis Force/Torque Fingertip Tactile Sensor for Robotic Hand Applications,"Sungwoo Park, Sang-Rok Oh, Donghyun Hwang","Korea university, KIST,KIST,Korea Institute of Science and Technology",Force and Tactile Sensing I,"We develop a Hall-effect-based six-axis force/torque (F/T) tactile sensor integrated into the fingertip of robotic hands. When a robotic hand performs grasping tasks in an unstructured environment, visual information plays a main role in sensing the external properties of the objects. However, the various intrinsic properties of the objects such as softness, roughness, mass distribution, and weight cannot be measured properly with visual information alone. To detect the various force information needed to perform diverse tasks, we aim to implement the six-axis F/T fingertip tactile sensor based on the Hall-effect principle. The experimental results demonstrate that the proposed sensor can measure the six-axis F/T with average errors of about 3.3%, and shield against stray magnetic fields using a soft magnetic shielding film." Tac-VGNN: A Voronoi Graph Neural Network for Pose-Based Tactile Servoing,"Wen Fan, Max Yang, Yifan Xing, Nathan Lepora, Dandan Zhang",University of Bristol,Force and Tactile Sensing I,"Tactile pose estimation and tactile servoing are fundamental capabilities of robot touch. Reliable and precise pose estimation can be provided by applying deep learning models to high-resolution optical tactile sensors. 
Given the recent successes of Graph Neural Network (GNN) and the effectiveness of Voronoi features, we developed a Tactile Voronoi Graph Neural Network (Tac-VGNN) to achieve reliable pose-based tactile servoing relying on a biomimetic optical tactile sensor (TacTip). The GNN is well suited to modeling the distribution relationship between shear motions of the tactile markers, while the Voronoi diagram supplements this with area-based tactile features related to contact depth. The experimental results showed that the Tac-VGNN model significantly enhances data interpretability during graph generation and model training efficiency compared with CNN-based methods. It also improved pose estimation accuracy along vertical depth by 28.57% over vanilla GNN without Voronoi features and achieved better performance on real surface-following tasks with smoother robot control trajectories." Safe Self-Supervised Learning in Real of Visuo-Tactile Feedback Policies for Industrial Insertion,"Letian Fu, Huang Huang, Lars Berscheid, Hui Li, Ken Goldberg, Sachin Chitta","UC Berkeley,University of California at Berkeley,Karlsruhe Institute of Technology,Autodesk Research,Autodesk Inc.",Force and Tactile Sensing I,"Industrial insertion tasks are often performed repetitively with parts that are subject to tight tolerances and prone to breakage. Learning an industrial insertion policy in real is challenging as the collision between the parts and the environment can cause slippage or breakage of the part. In this paper, we present a safe self-supervised method to learn a visuo-tactile insertion policy that is robust to grasp pose variations. The method reduces human input and collisions between the part and the receptacle. The method divides the insertion task into two phases. In the first align phase, a tactile-based grasp pose estimation model is learned to align the insertion part with the receptacle. 
In the second insert phase, a vision-based policy is learned to guide the part into the receptacle. The robot uses force-torque sensing to achieve a safe self-supervised data collection pipeline. Physical experiments on the USB insertion task from the NIST Assembly Taskboard suggest that the resulting policies can achieve 45/45 insertion successes on 45 different initial grasp poses, improving on two baselines: (1) a behavior cloning agent trained on 50 human insertion demonstrations (1/45) and (2) an online RL policy (TD3) trained in real (0/45)." In-Situ Mechanical Calibration for Vision-Based Tactile Sensors,"Can Zhao, Jieji Ren, Hexi Yu, Daolin Ma",Shanghai Jiao Tong University,Force and Tactile Sensing I,"This paper proposes a novel approach to conduct routine calibration for the changing mechanical parameters over time of a vision-based tactile sensor, without disassembling its overall structure, i.e., in-situ mechanical calibration. Calibration for mechanical parameters, Young's modulus and Poisson's ratio, of a tactile sensor's sensing elastomer, is crucial for its force perception capabilities. However, there are few methods that can retrieve values of these parameters both accurately and conveniently. To address this problem, we propose an in-situ approach to calibrate mechanical parameters other than the verbose traditional evaluation process. This method incorporates the deformation sensing capability of the sensor, the accurate force sensing capability of a force/torque sensor, and most importantly, the deformation-force relationship for an indentation with embedded mechanical parameters of the elastomers. We also present the indentation test setup and the complete pipeline to extract Young's modulus and Poisson's ratio from experimental results. We validate the method by comparing the indentation depths simulated through finite element analysis (FEA) using the calibrated parameters with the indentation depths measured in real experiments. 
Furthermore, superior contact force distribution can be achieved with the accurate mechanical parameters. The proposed method provides the theoretical basis for accurate, lifelong routine calibration, whether weekly or even daily, which can enhance the applications of tactile sensors in real manipulation scenarios." Tactile-Driven Gentle Grasping for Human-Robot Collaborative Tasks,"Christopher Ford, Haoran Li, John Lloyd, Manuel Giuseppe Catalano, Matteo Bianchi, Efi Psomopoulou, Nathan Lepora","University of Bristol,Istituto Italiano di Tecnologia,University of Pisa",Force and Tactile Sensing I,"This paper presents a control scheme for force sensitive, gentle grasping with a Pisa/IIT anthropomorphic SoftHand equipped with a miniaturised version of the TacTip optical tactile sensor on all five fingertips. The tactile sensors provide high-resolution information about a grasp and how the fingers interact with held objects. We first describe a series of hardware developments for performing asynchronous sensor data acquisition and processing, resulting in a fast control loop sufficient for real-time grasp control. We then develop a novel grasp controller that uses tactile feedback from all five fingertip sensors simultaneously to gently and stably grasp 43 objects of varying geometry and stiffness, which is then applied to a human-to-robot handover task. These developments open the door to more advanced manipulation with underactuated hands via fast reflexive control using high-resolution tactile sensing." TANDEM3D: Active Tactile Exploration for 3D Object Recognition,"Jingxi Xu, Han Lin, Shuran Song, Matei Ciocarlie",Columbia University,Force and Tactile Sensing I,"Tactile recognition of 3D objects remains a challenging task. Compared to 2D shapes, the complex geometry of 3D surfaces requires richer tactile signals, more dexterous actions, and more advanced encoding techniques. 
In this work, we propose TANDEM3D, a method that applies a co-training framework for exploration and decision making to 3D object recognition with tactile signals. Starting with our previous work, which introduced a co-training paradigm for 2D recognition problems, we introduce a number of advances that enable us to scale up to 3D. TANDEM3D is based on a novel encoder that builds 3D object representation from contact positions and normals using PointNet++. Furthermore, by enabling 6DOF movement, TANDEM3D explores and collects discriminative touch information with high efficiency. Our method is trained entirely in simulation and validated with real-world experiments. Compared to state-of-the-art baselines, TANDEM3D achieves higher accuracy and a lower number of actions in recognizing 3D objects and is also shown to be more robust to different types and amounts of sensor noise." Cable Routing and Assembly Using Tactile-Driven Motion Primitives,"Achu Wilson, Helen Jiang, Wenzhao Lian, Wenzhen Yuan","Carnegie Mellon University,Google X",Force and Tactile Sensing I,"In this paper, we propose to integrate tactile-guided low-level motion control with high-level vision-based task parsing for a challenging task: cable routing and assembly on a reconfigurable task board. Specifically, we build a library of tactile-guided motion primitives using a fingertip GelSight sensor, where each primitive reliably accomplishes an operation such as cable following and weaving. The overall task is inferred via visual perception given a goal configuration image, and then used to generate the primitive sequence. Experiments demonstrate the effectiveness of individual tactile-guided primitives and the integrated end-to-end solution, significantly outperforming the method without tactile sensing. Our reconfigurable task setup and proposed baselines provide a benchmark for future research in cable manipulation." 
A Tactile Feedback Insertion Strategy for Peg-In-Hole Tasks,"Oliver Gibbons, Alessandro Albini, Perla Maiolino",University of Oxford,Force and Tactile Sensing I,"The Peg-In-Hole (PiH) task performed under uncertain conditions still represents a challenge for autonomous robots. When the peg is not rigidly connected to the robot end-effector, the external forces generated by peg-environment interactions can change the in-hand pose of the peg. This aspect must be taken into account when performing an insertion. This paper deals with this problem and proposes an insertion strategy driven by tactile feedback. In particular, we consider holding the peg using a parallel gripper equipped with tactile sensors, whose measurements are processed to capture in-hand rotations of the peg pose. This information is fed back to the robot controller and used to compensate for changes in the peg orientation and end-point position occurring during the task execution. The approach is validated on a real robot using a two-finger gripper equipped with two capacitive-based tactile sensor arrays hosting 20 tactile elements each. We show that the proposed method achieves an insertion success rate of 38/40 with a 0.1 mm clearance between the peg and hole." "Coupled, Closed-System Fluidic Actuators for Use in Wearable Rehabilitation Devices","James Greig, Maria Elena Giannaccini, Edward Chadwick","University of Aberdeen,University of Bristol",Rehabilitation and Augmentation I,"This paper presents a novel closed-system, coupled soft actuator that aims to increase the applied bending moment that can be powered by a single pneumatic pump. The actuator incorporates both positive pressure and vacuum actuators of established design. The purpose of this development is to enable the design of an effective soft robotic wearable device for the rehabilitation of the revolute joints in post-stroke individuals. 
The design of a test rig to provide consistent, quantitative data on the output of the soft actuators is presented, allowing a comparison of the positive pressure, vacuum and combined (positive and vacuum) actuators. This combination demonstrates the ability to significantly increase the torque output when compared to a single actuator using the same pump for input, potentially reducing the weight of a wearable device. The closed-system, coupled soft actuator system shows opportunity for use in a wide range of applications due to this reduction in pump weight and isolation from environmental conditions." Emulating Human Kinematic Behavior on Lower-Limb Prostheses Via Multi-Contact Models and Force-Based Nonlinear Control,"Rachel Gehlhar, Aaron Ames",California Institute of Technology,Rehabilitation and Augmentation I,"Active lower-limb prostheses could enable more natural assisted locomotion by contributing net positive work through important gait events, such as ankle push-off. This paper uses multi-contact models of locomotion together with force-based nonlinear optimization-based controllers to achieve human-like kinematic behavior, including ankle push-off, on a powered transfemoral prosthesis. In particular, we leverage model-based control approaches for dynamic bipedal robotic walking to develop a systematic method to realize human-like walking on a powered prosthesis that does not require subject-specific tuning. The proposed controller is implemented on a prosthesis for 2 subjects without tuning between subjects, emulating subject-specific human kinematic trends on the prosthesis joints. These experimental results demonstrate that our force-based nonlinear control approach achieves better tracking of human-like kinematic trajectories, with an average RMSE of 0.0223 during weight-bearing, compared to 2 non-force-sensing methods with an average RMSE of 0.0411 and 0.0430." 
Simplified Motor Primitives for Gait Symmetrization: Pilot Study with an Active Hip Orthosis,"Henri Laloyaux, Chiara Livolsi, Andrea Pergolini, Simona Crea, Nicola Vitiello, Renaud Ronsse","Université catholique de Louvain,IUVO S.r.l, Scuola Superiore Sant'Anna of Pisa,Scuola Superiore Sant'Anna of Pisa,Scuola Superiore Sant'Anna, The BioRobotics Institute,Scuola Superiore Sant Anna",Rehabilitation and Augmentation I,"Lower-limb exoskeletons are wearable devices whose main purposes are human rehabilitation and bilateral locomotion assistance. In particular, there is a growing interest for their use to symmetrize the gait of hemiparetic patients. This often consists in using the kinematics of the less affected side as a reference for the most affected one. In this work, we followed this approach to design a symmetrization algorithm using the formalism of motor primitives, i.e. a low-dimensional set of signals that provide the desired assistance through their combination. The amount of variables to be stored in memory is thus intrinsically limited, and this framework is particularly adapted to include other modes of assistance and/or transitions between locomotion tasks. In this paper, we report the preliminary validation of this newly developed algorithm with a hip exoskeleton and a single participant replicating hemiparetic walking. Results show that the algorithm effectively managed to reduce both temporal and spatial gait asymmetry." A Preliminary Study of the Effects of Active Recovery Reflexes on Stumble Recovery in a Swing-Assist Knee Prosthesis,"Jantzen Lee, Shane King, Maura Eveld, Michael Goldfarb","Vanderbilt University,University of Twente",Rehabilitation and Augmentation I,"This paper explores the effects of a swing phase stumble recovery controller in a swing-assist prosthesis. 
The prosthesis detects a stumble event and employs either a lowering recovery response – wherein the user’s swing is truncated, and the leg is prepared for loading – or an elevating recovery response – wherein an amplified swing flexion is employed to step over the obstacle causing the perturbation. The controller described in this paper chooses which of these responses to use based on the perturbation timing within the gait cycle, where stumble events which occur prior to an estimated 35 percent of the way through swing trigger an elevating response, and later stumbles trigger a lowering response. The potential efficacy of this approach was assessed in a preliminary study with two participants with transfemoral amputation; wherein each participant’s walking was perturbed in early, mid, and late swing phase when wearing both their prescribed prosthesis and the swing-assist prosthesis prototype. When wearing the swing-assist device, 0 of the 13 perturbations resulted in falls, with none of the trials being classifiable as “near falls”. Conversely, when using their prescribed device, one participant had a fall rate of 3 out of 6 perturbations, with 1 of the 3 recoveries being classifiable as a “near fall”; the second participant had a fall rate of 0 of 3 trials, with 2 of the 3 recoveries being classifiable as “near falls”. For both participants, when recovery was achieved, it was accompanied by significantly longer periods of irregularity and asymmetry in gait when using their prescribed devices, as compared to the test device. These results suggest the possibility of substantial benefit provided by a low-power, reflex-based stumble recovery feature in knee prostheses." Exploring Multimodal Gait Rehabilitation and Assistance through an Adaptable Robotic Platform,"Sophia Otalora, Sergio D. Sierra M., Felipe Ballen-moreno, Marcela Múnera, Carlos A. 
Cifuentes","Federal University of Espírito Santo,University of Bristol - University of the West of England,Vrije Universiteit Brussel, R&MM, Brubotics, Flanders Make,Escuela Colombiana de Ingeniería Julio Garavito,University of the West of England, Bristol",Rehabilitation and Augmentation I,"Lower-limb exoskeletons and smart walkers are robotic devices to assist patients in regaining their autonomy after a stroke. The integration of these devices enables gait rehabilitation and functional compensation, promoting natural overground walking. This article presents the Adaptable Robotic Platform for Gait Rehabilitation and Assistance (AGoRA V2 platform), which integrates the new AGoRA V2 Smart Walker and the AGoRA V2 unilateral lower-limb exoskeleton. It was evaluated with 14 healthy subjects using physiological and kinematic variables and a perception assessment. The study entailed four conditions: Without exoskeleton (WOE), With Exoskeleton (WE&T), With Walker (WW), and With Platform (WP). Results indicate a reduction in the muscle activity of the Rectus Femoris (18%) and Vastus Lateralis (15%), comparing WE&T and WP, as well as walking without any device (WOE) and using any robotic device (WE&T, WW, WP). Results suggest the importance of combining the exoskeleton with the robotic walker and the assistance of each device independently. Moreover, using the complete platform induces slower gait patterns than the walker, as the mean impulse force and linear velocity decrease by 42% and 44%, respectively. These results demonstrate that the platform contributes to safety, and improvements in gait parameters and muscular activity, indicating the system's potential to act as a modular device according to users' conditions and therapeutic goals." 
Bilateral Asymmetric Hip Stiffness Applied by a Robotic Hip Exoskeleton Elicits Kinematic and Kinetic Adaptation,"Banu Abdikadirova, Mark Price, Jonaz Moreno Jaramillo, Wouter Hoogkamer, Meghan Huber","University of Massachusetts Amherst,University of Massachusetts, Amherst",Rehabilitation and Augmentation I,"Wearable robotic exoskeletons hold great promise for gait rehabilitation as portable, accessible tools. However, a better understanding of the potential for exoskeletons to elicit neural adaptation—a critical component of neurological gait rehabilitation—is needed. In this study, we investigated whether humans adapt to bilateral asymmetric stiffness perturbations applied by a hip exoskeleton, taking inspiration from the asymmetry augmentation strategies used in split-belt treadmill training. During walking, we applied torques about the hip joints to repel the thigh away from a neutral position on the left side and attract the thigh toward a neutral position on the right side. Six participants performed an adaptation walking trial on a treadmill while wearing the exoskeleton. The exoskeleton elicited time-varying changes and aftereffects in step length and propulsive/braking ground reaction forces, indicating behavioral signatures of neural adaptation. These responses resemble typical responses to split-belt treadmill training, suggesting that the proposed intervention with a robotic hip exoskeleton may be an effective approach to (re)training symmetric gait." Gait Event Detection with Proprioceptive Force Sensing in a Powered Knee-Ankle Prosthesis: Validation Over Walking Speeds and Slopes,"Emily Keller, Curt A. Laubscher, Robert D. Gregg",University of Michigan,Rehabilitation and Augmentation I,"Many powered prosthetic devices use load cells to detect ground interaction forces and gait events. These sensors introduce additional weight and cost in the device. 
Recent proprioceptive actuators enable an algebraic relationship between actuator torques and ground contact forces. This paper presents a proprioceptive force sensing paradigm which estimates ground reaction forces as a solution to detect gait events without a load cell. A floating body dynamic model is obtained with constraints at the center of pressure representing foot-ground interaction. Constraint forces are derived to estimate ground reaction forces and subsequently timing of gait events. A treadmill experiment is conducted with a powered knee-ankle prosthesis used by an able-bodied subject walking at various speeds and slopes. Results show accurate gait event timing, with pooled data showing heel strike detection lagging by only 6.7 ± 7.2 ms and toe off detection leading by 30.4 ± 11.0 ms compared to values obtained from the load cell. These results establish proof of concept for predicting gait events without a load cell in powered prostheses with proprioceptive actuators." Towards a Finned-Swimming Exoskeleton: A Robotic Flutter Kicking Testbed and Its Corresponding Thrust Generation,"Beau Johnson, Michael Goldfarb",Vanderbilt University,Rehabilitation and Augmentation I,"While lower limb exoskeletons for aboveground locomotion have been developed, few attempts have been made to develop an exoskeleton to augment human swimming. Such efforts are hindered by a lack of knowledge surrounding the kinematics and kinetics of human swimming. This work presents the design of a robotic platform to be used as a finned swimming testbed; describes a controller to generate finned swimming movement; and describes experiments conducted to explore thrust production resulting from a flutter kick swimming motion." 
"Continuous Prediction of Leg Kinematics During Walking Using Inertial Sensors, Smart Glasses, and Embedded Computing","Oleksii Tsepa, Roman Burakov, Brokoslaw Laschowski, Alex Mihailidis","Igor Sikorsky Kyiv Polytechnic Institute,National University of Kyiv-Mohyla Academy,University of Toronto",Rehabilitation and Augmentation I,"Unlike traditional hierarchical controllers for robotic leg prostheses and exoskeletons, continuous systems could allow persons with mobility impairments to walk more naturally in real-world environments without requiring high-level switching between locomotion modes. To support these next-generation controllers, we developed a new system called KIFNet (Kinematics and Image Fusing Network) that uses lightweight and efficient deep learning models to continuously predict the leg kinematics during walking. We tested different sensor fusion methods to combine kinematics data from inertial sensors and computer vision data from smart glasses and found that adaptive instance normalization achieved the lowest RMSE predictions for knee and ankle joint kinematics. We also deployed our model on an embedded device. Without inference optimization, our model was 20 times faster than the previous state-of-the-art and achieved 20% higher prediction accuracies, and during some locomotor activities like stair descent, decreased RMSE up to 300%. With inference optimization, our best model achieved 125 FPS on an NVIDIA Jetson Nano. These results demonstrate the potential to build fast and accurate deep learning models for continuous prediction of leg kinematics during walking based on sensor fusion and embedded computing, therein providing a foundation for real-time continuous controllers for robotic leg prostheses and exoskeletons." Trajectory and Sway Prediction towards Fall Prevention,"Weizhuo Wang, Michael Raitor, Steven H. 
Collins, Karen Liu, Monroe Kennedy",Stanford University,Rehabilitation and Augmentation I,"Falls are the leading cause of fatal and non-fatal injuries, particularly for older persons. Imbalance can result from the body’s internal causes (illness), or external causes (active or passive perturbation). Active perturbation results from applying an external force to a person, while passive perturbation results from human motion interacting with a static obstacle. This work proposes a metric that allows for the monitoring of the person’s torso and its correlation to active and passive perturbations. We show that large changes in the torso sway can be strongly correlated to active perturbations. We also show that we can reasonably predict the future path and expected change in torso sway by conditioning the expected path and torso sway on the past trajectory, torso motion, and the surrounding scene. This could have direct future applications to fall prevention. Results demonstrate that the torso sway is strongly correlated with perturbations, and that our model is able to make use of the visual cues presented in the panorama and condition the prediction accordingly." Multi-Modal Learning and Relaxation of Physical Conflict for an Exoskeleton Robot with Proprioceptive Perception,"Xuan Zhang, Yana Shu, Yu Chen, Gong Chen, Jing Ye, Xiang Li","Tsinghua University,Shenzhen MileBot Robotics,Shenzhen MileBot Robotics Co. Ltd.",Rehabilitation and Augmentation I,"Exoskeleton robots provide assistive forces to suit the human subject via physical human-robot interaction. During the closely-coupled interaction, a mismatch between the wearer and the robot may result in physical conflict, which could affect assistance efficiency or even compromise safety. Therefore, such conflicts should be accurately detected and then properly relaxed by adjusting the robot's action. This paper proposes a new learning scheme to detect physical conflicts between humans and robots. 
The constructed learning network receives multi-modal information from proprioceptive sensors and then outputs the anomaly score to specify the physical conflict; this score is further used to continuously adjust the robot impedance to ensure a safe and efficient interaction. Such a formulation allows the robot to explore the semantic information during the interaction (e.g., gait phases, imbalance, human fatigue) and hence react properly to the physical conflict. Experimental results and comparative studies on a lower-limb exoskeleton robot are presented to illustrate that the proposed learning scheme can deal with physical conflicts in a faster and more accurate manner." Learning Personalised Human Sit-To-Stand Motion Strategies Via Inverse Musculoskeletal Optimal Control,"Daniel F. N. Gordon, Andreas Christou, Theodoros Stouraitis, Michael Gienger, Sethu Vijayakumar","University of Edinburgh,Honda Research Institute EU and the University of Edinburgh,Honda Research Institute Europe",Rehabilitation and Augmentation I,"Physically assistive robots and exoskeletons have great potential to help humans with a wide variety of collaborative tasks. However, a challenging aspect of the control of such devices is to accurately model or predict human behaviour, which can be highly individual and personalised. In this work, we implement a framework for learning subject-specific models of underlying human motion strategies using inverse musculoskeletal optimal control. We apply this framework to a specific motion task: the sit-to-stand transition. By collecting sit-to-stand data from 4 subjects with and without perturbations, we show that humans modulate their sit-to-stand strategy in the presence of instability, and learn the corresponding models of these strategies. In the future, the personalised motion strategies resulting from this framework could be used to inform the design of real-time assistance strategies for human-robot collaboration problems." 
Robust Human Pose Estimation under Gaussian Noise,"Patrick Schlosser, Christoph Ledermann",Karlsruhe Institute of Technology,Safety and Trustworthy Robotics I,"Robustness against specific kinds of noise is of high importance for safety-critical components in industrial robot applications, as legal and normative regulations demand the identification and handling of all unacceptable risks. This includes risks from environmental conditions, like noisy data. One such component is human pose estimation, which is needed and crucial for human-robot collaboration tasks and applications. However, little research on human pose estimation under specific noise types has been performed. In our work, we focus on extensively evaluating human pose estimation under specific noise and propose potential countermeasures. We leverage Gaussian noise as specific noise type and the hourglass model as human pose estimator. We show that human pose estimation is already vulnerable to small amounts of Gaussian noise. As countermeasures we propose either denoising images upfront or training the hourglass model to be robust against Gaussian noise. All methods achieve a significantly higher robustness against Gaussian noise, typically at the cost of slightly worse performance on clean data. Three of our methods also achieved slight improvements on clean data." Enforcing Safety for Vision-Based Controllers Via Control Barrier Functions and Neural Radiance Fields,"Mukun Tong, Charles Dawson, Chuchu Fan","Tsinghua University,MIT,Massachusetts Institute of Technology",Safety and Trustworthy Robotics I,"To navigate complex environments, robots must increasingly use high-dimensional visual feedback (e.g. images) for control. However, relying on high-dimensional image data to make control decisions raises important questions; particularly, how might we prove the safety of a visual-feedback controller? 
Control barrier functions (CBFs) are powerful tools for certifying the safety of feedback controllers in the state-feedback setting, but CBFs have traditionally been poorly-suited to visual feedback control due to the need to predict future observations in order to evaluate the barrier function. In this work, we solve this issue by leveraging recent advances in neural radiance fields (NeRFs), which learn implicit representations of 3D scenes and can render images from previously-unseen camera perspectives, to provide single-step visual foresight for a CBF-based controller. This novel combination is able to filter out unsafe actions and intervene to preserve safety. We demonstrate the effect of our controller in real-time simulation experiments where it successfully prevents the robot from taking dangerous actions." Mimicking Real Forces on a Drone through a Haptic Suit to Enable Cost-Effective Validation,"Carl Hilderbrandt, Wen Ying, Seongkook Heo, Sebastian Elbaum",University of Virginia,Safety and Trustworthy Robotics I,"Robots operate under certain forces that affect their behavior. Consider a drone meant to deliver packages, which must hold its pose as long as it operates under its weight and wind limits. Validating that such a drone handles external forces correctly is key to ensuring its safety. Nevertheless, validating the system's behavior under the effect of such forces can be difficult and costly. For example, checking the effects of different wind magnitudes may require waiting for the matching outdoor conditions or access to wind tunnels. Checking the effects of different package sizes and shapes may require many slow and laborious iterations, and validating the combinations of wind gusts and package configurations is often hard to replicate. This work introduces a framework to overcome such challenges by mimicking external forces exerted on a drone with limited cost, setup, and space. 
The framework consists of a haptic suit device with directional propellers that can be mounted onto a drone, a synthesizer to transform intended forces into setpoints for the suit's directional propellers, and a controller for the suit to meet those setpoints. We conduct a study to assess the framework's capabilities under multiple scenarios involving various forces. Our findings show that the haptic suit framework can recreate real-world forces on the drone with acceptable precision." Generating Formal Safety Assurances for High-Dimensional Reachability,"Albert Lin, Somil Bansal","Princeton University,University of Southern California",Safety and Trustworthy Robotics I,"Providing formal safety and performance guarantees for autonomous systems is becoming increasingly important. Hamilton-Jacobi (HJ) reachability analysis is a popular formal verification tool for providing these guarantees, since it can handle general nonlinear system dynamics, bounded adversarial system disturbances, and state and input constraints. However, it involves solving a PDE, whose computational and memory complexity scales exponentially with respect to the state dimensionality, making its direct use on large-scale systems intractable. A recently proposed method called DeepReach overcomes this challenge by leveraging a sinusoidal neural PDE solver for high-dimensional reachability problems, whose computational requirements scale with the complexity of the underlying reachable tube rather than the state space dimension. Unfortunately, neural networks can make errors and thus the computed solution may not be safe, which falls short of achieving our overarching goal to provide formal safety assurances. In this work, we propose a method to compute an error bound for the DeepReach solution. This error bound can then be used for reachable tube correction, resulting in a safe approximation of the true reachable tube. 
We also propose a scenario-based optimization approach to compute a probabilistic bound on this error correction for general nonlinear dynamical systems. We demonstrate the efficacy of the proposed approach in obtaining probabilistically safe reachable tubes for high-dimensional rocket-landing and multi-vehicle collision-avoidance problems." Safety Evaluation of Robot Systems Via Uncertainty Quantification,"Woo-Jeong Baek, Torsten Kroeger","Karlsruhe Institute of Technology (KIT),Karlsruher Institut für Technologie (KIT)",Safety and Trustworthy Robotics I,"In this paper, we present an approach for quantifying the propagated uncertainty of robot systems in an online and data-driven manner. Especially in Human-Robot Collaboration, keeping track of the safety compliance during run time is essential: Misclassifying dangerous situations as safe might result in severe accidents. According to official regulations (e.g., ISO standards), safety in industrial robot applications depends on critical parameters, such as the distance and relative velocity between humans and robots. However, safety can only be assured given a measure for the reliability of these parameters. While different risk detection and mitigation approaches exist in the literature, a measure that can be used to evaluate safety limits online, and succinctly implies whether a situation is safe or dangerous, is missing to date. Motivated by this, we introduce a generalizable method for calculating the propagated measurement uncertainty of arbitrary parameters, that captures the accumulated uncertainty originating from sensory devices and environmental disturbances of the system. To show that our approach delivers correct results, we perform validation experiments in simulation. In addition, we employ our method in two real-world settings and demonstrate how quantifying the propagated uncertainty of critical parameters facilitates assessing safety online in Human-Robot Collaboration." 
Safety-Critical Controller Verification Via Sim2Real Gap Quantification,"Prithvi Akella, Wyatt Ubellacker, Aaron Ames","California Institute of Technology,Caltech",Safety and Trustworthy Robotics I,"The well-known quote from George Box states that: ""All models are wrong, but some are useful."" To develop more useful models, we quantify the inaccuracy with which a given model represents a system of interest, so that we may leverage this quantity to facilitate controller synthesis and verification. Specifically, we develop a procedure that identifies a sim2real gap that holds with a minimum probability. Augmenting the nominal model with our identified sim2real gap produces an uncertain model which we prove is an accurate representor of system behavior. We leverage this uncertain model to synthesize and verify a controller in simulation using a probabilistic verification approach. This pipeline produces controllers with an arbitrarily high probability of realizing desired safe behavior on system hardware without requiring hardware testing except for those required for sim2real gap identification. We also showcase our procedure working on two hardware platforms - the Robotarium and a quadruped." One-Shot Reachability Analysis of Neural Network Dynamical Systems,"Shaoru Chen, Victor M. Preciado, Mahyar Fazlyab","Microsoft Research, NYC,University of Pennsylvania,Johns Hopkins University",Safety and Trustworthy Robotics I,"The arising application of neural networks (NN) in robotic systems has driven the development of safety verification methods for neural network dynamical systems (NNDS). Recursive techniques for reachability analysis of dynamical systems in closed-loop with a NN controller, planner or perception can over-approximate the reachable sets of the NNDS by bounding the outputs of the NN and propagating these NN output bounds forward. 
However, this recursive reachability analysis may suffer from compounding errors, rapidly becoming overly conservative over a longer horizon. In this work, we prove that an alternative one-shot reachability analysis framework which directly verifies the unrolled NNDS can significantly mitigate the compounding errors, enabling the use of the rolling horizon as a design parameter for verification purposes. We characterize the performance gap between the recursive and one-shot frameworks for NNDS with general computational graphs. The applicability of one-shot analysis is demonstrated through numerical examples on a cart-pole system." Parameter-Conditioned Reachable Sets for Updating Safety Assurances Online,"Javier Borquez, Somil Bansal, Kensuke Nakamura","University of Southern California,Princeton University",Safety and Trustworthy Robotics I,"Hamilton-Jacobi (HJ) reachability analysis is a powerful tool for analyzing the safety of autonomous systems. However, the provided safety assurances are often predicated on the assumption that once deployed, the system or its environment does not evolve. Online, however, an autonomous system might experience changes in system dynamics, control authority, external disturbances, and/or the surrounding environment, requiring updated safety assurances. Rather than restarting the safety analysis from scratch, which can be time-consuming and often intractable to perform online, we propose to compute parameter-conditioned reachable sets. Assuming expected system and environment changes can be parameterized, we treat these parameters as virtual states in the system and leverage recent advances in high-dimensional reachability analysis to solve the corresponding reachability problem offline. This results in a family of reachable sets that is parameterized by the environment and system factors. 
Online, as these factors change, the system can simply query the corresponding safety function from this family to ensure system safety, enabling a real-time update of the safety assurances. Through various simulation studies, we demonstrate the capability of our approach in maintaining the system safety despite system and environment evolution." Hazard Analysis of Collaborative Automation Systems: A Two-Layer Approach Based on Supervisory Control and Simulation,"Tom P. Huck, Yuvaraj Selvaraj, Constantin Cronrath, Christoph Ledermann, Martin Fabian, Bengt Lennartson, Torsten Kroeger","Karlsruhe Institute of Technology,Zenseact,Chalmers University of Technology,Department of Electrical Engineering,Karlsruher Institut für Technologie (KIT)",Safety and Trustworthy Robotics I,"Safety critical systems are typically subjected to hazard analysis before commissioning to identify and analyse potentially hazardous system states that may arise during operation. Currently, hazard analysis is mainly based on human reasoning, past experiences, and simple tools such as checklists and spreadsheets. Increasing system complexity makes such tools decreasingly suitable. Furthermore, testing-based hazard analysis is often not suitable due to high costs or dangers of physical faults. A remedy for this is model-based hazard analysis methods, which rely either on formal models or on simulation models, each with their own benefits and drawbacks. This paper proposes a two-layer approach that combines the benefits of exhaustive analysis using formal methods with detailed analysis using simulation. Unsafe behaviours that lead to unsafe states are first synthesised from a formal model of the system using Supervisory Control Theory. The result is then input to the simulation where detailed analyses using domain-specific risk metrics are performed. 
Though the presented approach is generally applicable, this paper demonstrates the benefits of the approach on an industrial human-robot collaboration system." SmartRainNet: Uncertainty Estimation for Laser Measurement in Rain,"Chen Zhang, Zefan Huang, Beatrix Tung, Marcelo Ang, Daniela Rus","National University of Singapore,Singapore-MIT Alliance for Research and Technology,MIT",Award Finalists 4,"Adverse weather has raised a big challenge for autonomous vehicles. Unreliable measurements due to sensor degradation could seriously affect the performance of autonomous driving tasks, such as perception and localization. In this work, we study sensor degradation in rainy weather and present a novel method that evaluates the uncertainty for each laser measurement from a 3D LiDAR. With uncertainty estimation, downstream tasks that rely on LiDAR input (e.g., perception or localization) can increase their reliability by adjusting their reliance on laser measurements with varying fidelity. Alternatively, uncertainty estimation can be used for sensor performance evaluation. Our proposed method, SmartRainNet, uses an attention-based Mixture Density Network to model the dependence between neighboring laser measurements and then calculate the probability density for each laser measurement as an uncertainty score. We evaluate SmartRainNet on synthetic and naturalistic sensor degradation datasets and provide qualitative and quantitative results to demonstrate the effectiveness of our method in evaluating uncertainty. Finally, we demonstrate three practical applications of uncertainty estimation to address autonomous driving challenges in rainy weather." 
Data-Driven Optimal Control under Safety Constraints Using Sparse Koopman Approximation,"Hongzhe Yu, Joseph Moyalan, Umesh Vaidya, Yongxin Chen","Georgia Institute of Technology,Clemson University",Safety and Trustworthy Robotics I,"In this work we approach the dual optimal reach-safe control problem using sparse approximations of the Koopman operator. Matrix approximation of the Koopman operator requires solving a least-squares (LS) problem in the lifted function space, which is computationally intractable for fine discretizations and high dimensions. The state transitional physical meaning of the Koopman operator leads to a sparse LS problem in this space. Leveraging this sparsity, we propose an efficient method to solve the sparse LS problem where we reduce the problem dimension dramatically by formulating the problem using only the non-zero elements in the approximation matrix with known sparsity pattern. The obtained matrix approximation of the operators is then used in a dual optimal reach-safe problem formulation where a linear program with sparse linear constraints naturally appears. We validate our proposed method on various dynamical systems and show that the computation time for operator approximation is greatly reduced with high precision in the solutions." Predictive Runtime Verification of Skill-Based Robotic Systems Using Petri Nets,"Baptiste Pelletier, Charles Lesire, Christophe Grand, David Doose, Mathieu Rognant","ONERA/DTIS, University of Toulouse,ONERA,Onera - The French Aerospace Lab",Safety and Trustworthy Robotics I,"This work presents a novel approach for the online supervision of robotic systems assembled from multiple complex components with skillset-based architectures, using Petri nets (PN). Predictive runtime verification is performed, which warns the system user about actions that would lead to the violation of safety specifications, using online model-checking tools on the system PNs." 
CIOT: Constraint-Enhanced Inertial-Odometric Tracking for Articulated Dump Trucks in GNSS-Denied Mining Environments,"David Benz, Jonathan Thomas Weseloh, Dirk Abel, Heike Vallery","RWTH Aachen University,TU Delft",Localization and Mapping V,"The ongoing electrification in all domains relies on a strong increase in raw material extraction. Autonomous dump trucks are key to facilitating this. The automation requires the development of new localization approaches, as deep open-pit mines are challenging for satellite-based localization systems. Deep funnel-shaped mines reduce the sky-view angle from a certain position onward so that few to no satellites are visible. Therefore, we introduce a new wheel-odometry-aided navigation filter for articulated vehicles that fuses measurements from an inertial measurement unit (IMU), global navigation satellite systems (GNSS), and wheel encoders. Non-holonomic constraints are incorporated by assuming the lateral velocity of each wheel to be zero. We present two different measurement models that either use the wheel encoder signals of the rear wheels or all wheels of the articulated vehicle. This approach enables articulated vehicles to cope with the challenges of open-pit mines. The developed navigation filter is evaluated experimentally with an articulated dumper in two scenarios: a paved parking lot and a gravel pit. With the proposed method, we achieved a mean position error of 0.21 m during a 190 s test drive in the gravel pit with a simulated GNSS interruption of 90 s. This is an improvement of 64 m compared to a state-of-the-art navigation filter that fuses only inertial and GNSS measurements." 
Wide-Area Geolocalization with a Limited Field of View Camera,"Lena Downes, Ted Steiner, Rebecca Russell, Jonathan Patrick How","Massachusetts Institute of Technology,Draper",Localization and Mapping V,"Cross-view geolocalization, a supplement or replacement for GPS, localizes an agent within a search area by matching images taken from a ground-view camera to overhead images taken from satellites or aircraft. Although the viewpoint disparity between ground and overhead images makes cross-view geolocalization challenging, significant progress has been made assuming that the ground agent has access to a panoramic camera. For example, our prior work (WAG) introduced changes in search area discretization, training loss, and particle filter weighting that enabled city-scale panoramic cross-view geolocalization. However, panoramic cameras are not widely used in existing robotic platforms due to their complexity and cost. Non-panoramic cross-view geolocalization is more applicable for robotics, but is also more challenging. This paper presents Restricted FOV Wide-Area Geolocalization (ReWAG), a cross-view geolocalization approach that generalizes WAG for use with standard, non-panoramic ground cameras by creating pose-aware embeddings and providing a strategy to incorporate particle pose into the Siamese network. ReWAG is a neural network and particle filter system that is able to globally localize a mobile agent in a GPS-denied environment with only odometry and a 90 degree FOV camera, achieving similar localization accuracy as what WAG achieved with a panoramic camera and improving localization accuracy by a factor of 100 compared to a baseline vision transformer (ViT) approach." Probabilistic Plane Extraction and Modeling for Active Visual-Inertial Mapping,"Mitchell Usayiwevu, Fouad Sukkar, Teresa A. Vidal-Calleja",University of Technology Sydney,Localization and Mapping V,"This paper presents an active visual-inertial mapping framework with points and planes. 
The key aspect of the proposed framework is a novel probabilistic plane extraction with its associated model for estimation. The approach allows the extraction of plane parameters and their uncertainties based on a modified version of PlaneRCNN. Both model and data uncertainties are considered in the proposed deep neural network architecture. Moreover, the extracted probabilistic plane features are fused with point features in order to increase the robustness of the estimation system in texture-less environments, where algorithms based on points alone would struggle. A visual-inertial framework based on an Iterative Extended Kalman filter (IEKF) is used to demonstrate the approach. The IEKF equations are customized through a measurement extrapolation method, which enables the estimation to handle the delay introduced by the neural network inference time systematically. The system is encompassed within an active mapping framework, based on Informative Path Planning to find the most informative path for minimizing map uncertainty in visual-inertial systems. The results from the conducted experiments with a stereo/IMU system mounted on a robotic arm show that introducing planar features to the map, in order to complement the point features in the state estimation, improves robustness in texture-less environments." Visual Language Maps for Robot Navigation,"Chenguang Huang, Oier Mees, Andy Zeng, Wolfram Burgard","University of Freiburg,Google,University of Technology Nuremberg",Localization and Mapping V,"Grounding language to the visual observations of a navigating agent can be performed using off-the-shelf visual-language models pretrained on Internet-scale data (e.g., image captions) – while this is useful for matching images to natural language descriptions of object goals, this grounding in prior work remains disjoint from the mapping of the environment, which loses the spatial precision of classic geometric maps. 
To address this, we propose VLMaps, a spatial map representation that directly fuses pretrained visual-language features into a 3D reconstruction of the physical world. VLMaps can be autonomously built from video feed on robots using standard exploration and enables natural language indexing of the map without additional labeled data. Specifically, when combined with large language models (LLMs), VLMaps can be used to (i) translate natural language commands into a sequence of open-vocabulary navigation goals (which, beyond prior work, can be spatial by construction e.g., “in between the sofa and TV” or “3 meters to the right of the chair”) directly localized in the map, and (ii) can be shared among multiple robots with different embodiments to generate new obstacle maps on-the-fly (by using a list of obstacle categories). Extensive experiments in both simulated and real-world environments show that VLMaps enables navigating to more complex language instructions than existing methods, and can be improved in the future with better visual language models." Asynchronous State Estimation of Simultaneous Ego-Motion Estimation and Multiple Object Tracking for LiDAR-Inertial Odometry,"Yu-Kai Lin, Wen-chieh Lin, Chieh-Chih (Bob) Wang",National Yang Ming Chiao Tung University,Localization and Mapping V,"We propose LiDAR-Inertial Odometry via Simultaneous EGo-motion estimation and Multiple Object Tracking (LIO-SEGMOT), an optimization-based odometry approach targeted for dynamic environments. LIO-SEGMOT is formulated as a state estimation approach with asynchronous state update of the odometry and the object tracking. That is, LIO-SEGMOT can provide continuous object tracking results while preserving the keyframe selection mechanism in the odometry system. Meanwhile, a hierarchical criterion is designed to properly couple odometry and object tracking, preventing system instability due to poor detections. 
We compare LIO-SEGMOT against the baseline model LIO-SAM, a state-of-the-art LIO approach, under dynamic environments of the KITTI raw dataset and the self-collected Hsinchu dataset. The former experiment shows that LIO-SEGMOT obtains an average improvement of 1.61% and 5.41% in odometry accuracy in terms of absolute translational and rotational trajectory errors. The latter experiment also indicates that LIO-SEGMOT obtains an average improvement of 6.97% and 4.21% in odometry accuracy." Pose-Graph SLAM Using Multi-Order Ultrasonic Echoes and Beamforming for Long-Range Inspection Robots,"Othmane-Latif Ouabi, Neil Zeghidour, Nico F. Declercq, Matthieu Geist, Cedric Pradalier","UMI ,,,, GT-CNRS,Google Brain,Georgia Institute of Technology, Atlanta, Georgia ,,,,,–,,,,,Université de Lorraine,GeorgiaTech Lorraine",Localization and Mapping V,"This paper presents a Graph-based Simultaneous Localization And Mapping (GraphSLAM) approach for a robotic system relying on the reflections of ultrasonic guided waves to enable long-range inspection tasks on plate-based metal structures. A measurement model that can leverage multi-order acoustic echoes is introduced for accurate localization, and beamforming is used for mapping the boundaries of individual metal panels. These two elements are subsequently integrated within a nonlinear least squares optimizer to solve the full offline SLAM problem. We experimentally evaluate the potential of this approach in a laboratory environment. We observe the improved localization accuracy of the multi-order echo model compared to a second model, from previous works, that relies solely on first-order echoes. We also show that the proposed approach can yield accurate SLAM results, hence showcasing the standalone capability of ultrasonic-based GraphSLAM for envisioned long-range inspection applications." 
"EdgeVO: An Efficient, Accurate, and Robust Edge-Based Visual Odometry","Hui Zhao, Jianga Shang, Kai Liu, Chao Chen, Fuqiang Gu","College of Computer Science, China University of Geoscience,Chongqing University",Localization and Mapping V,"Visual odometry is important for many applications, such as autonomous vehicles and robot navigation. It is challenging to conduct visual odometry in textureless scenes or environments with sudden illumination changes, where popular feature-based methods or direct methods cannot work well. To address this challenge, some edge-based methods have been proposed, but they usually struggle to balance efficiency and accuracy. In this work, we propose a novel visual odometry approach called EdgeVO, which is accurate, efficient, and robust. By efficiently selecting a small set of edges with certain strategies, we significantly improve the computational efficiency without sacrificing the accuracy. Compared with existing edge-based methods, our method can significantly reduce the computational complexity while maintaining similar accuracy or even achieving better accuracy, since our method removes useless or noisy edges. Experimental results on the TUM datasets indicate that EdgeVO significantly outperforms other methods in terms of efficiency, accuracy and robustness." SCORE: A Second-Order Conic Initialization for Range-Aided SLAM,"Alan Papalia, Joseph Morales, Kevin Doherty, David Rosen, John Leonard","MIT,Massachusetts Institute of Technology,Northeastern University",Localization and Mapping V,"We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as non-convex optimization, which requires a good initialization to obtain quality results. 
The initialization technique proposed here relaxes the RA-SLAM problem to a convex problem which is then solved to determine an initialization for the original, non-convex problem. The relaxation is a second-order cone program (SOCP), which is derived from a quadratically constrained quadratic program (QCQP) formulation of the RA-SLAM problem. As a SOCP, the method is highly scalable. We name this relaxation Second-order COnic RElaxation for RA-SLAM (SCORE). To our knowledge, this work represents the first convex relaxation for RA-SLAM. We present simulated experiments which show SCORE initialization permits the efficient recovery of high-quality solutions for a variety of challenging single- and multi-robot RA-SLAM problems with thousands of poses and range measurements." A Real-Time Dynamic Obstacle Tracking and Mapping System for UAV Navigation and Collision Avoidance with an RGB-D Camera,"Zhefan Xu, Xiaoyang Zhan, Baihan Chen, Yumeng Xiu, Chenhao Yang, Kenji Shimada",Carnegie Mellon University,Mapping and Localization,"Real-time dynamic environment perception has become vital for autonomous robots in crowded spaces. Although the popular voxel-based mapping methods can efficiently represent 3D obstacles with arbitrarily complex shapes, they can hardly distinguish between static and dynamic obstacles, leading to limited obstacle avoidance performance. While plenty of sophisticated learning-based dynamic obstacle detection algorithms exist in autonomous driving, the quadcopter's limited computation resources cannot achieve real-time performance using those approaches. To address these issues, we propose a real-time dynamic obstacle tracking and mapping system for quadcopter obstacle avoidance using an RGB-D camera. The proposed system first utilizes a depth image with an occupancy voxel map to generate potential dynamic obstacle regions as proposals. 
With the obstacle region proposals, the Kalman filter and our continuity filter are applied to track each dynamic obstacle. Finally, an environment-aware trajectory prediction method is proposed based on the Markov chain using the states of tracked dynamic obstacles. We implemented the proposed system with our custom quadcopter and navigation planner. The simulation and physical experiments show that our methods can successfully track and represent obstacles in dynamic environments in real-time and safely avoid obstacles." Resilient Terrain Navigation with a 5 DOF Metal Detector Drone,"Patrick Pfreundschuh, Rik Marian Kai Bähnemann, Tim Kazik, Thomas Mantel, Roland Siegwart, Olov Andersson","ETH Zurich,ETH Zürich",Mapping and Localization,"Micro aerial vehicles (MAVs) hold the potential for performing autonomous and contactless land surveys for the detection of landmines and explosive remnants of war (ERW). Metal detectors are the standard detection tool but must be operated close to and parallel to the terrain. A successful combination of MAVs with metal detectors has not been presented yet, as it requires advanced flight capabilities. To this end, we present an autonomous system to survey challenging undulated terrain using a metal detector mounted on a 5 degrees of freedom (DOF) MAV. Based on an online estimate of the terrain, our receding-horizon planner efficiently covers the area, aligning the detector to the surface while considering the kinematic and visibility constraints of the platform. As the survey requires resilient and accurate localization in diverse terrain, we also propose a factor graph-based online fusion of GNSS, IMU, and LiDAR measurements. We validate the robustness of the solution to individual sensor degeneracy by flying under the canopy of trees and over featureless fields. A simulated ablation study shows that the proposed planner reduces coverage duration and improves trajectory smoothness. 
Real-world flight experiments showcase autonomous mapping of buried metallic objects in undulated and obstructed terrain." Efficient Visual-Inertial Navigation with Point-Plane Map,"Jiaxin Hu, Kefei Ren, Xiaoyu Xu, Lipu Zhou, Xiaoming Lang, Yinian Mao, Guoquan Huang","Meituan,University of Electronic Science and Technology of China,MeiTuan,Meituan-Dianping Group,University of Delaware",Mapping and Localization,"Accurate and real-time global pose estimation relative to a global prior map is indispensable in many applications, such as logistics with micro aerial vehicles and Augmented Reality. Supposing that a pure sparse 3D point map can provide a structureless representation of the environment, then generating a point–plane prior map can further model the environment topology and offer global constraints for accurate localization. To implement this, we propose a filter-based, large-scale visual-inertial odometry system, termed PPM-VIO, which utilizes a point–plane map to correct the cumulative drift. Our system, detecting coplanar information from sparse point clouds with semantic information, achieves accurate online plane matching via geometric constraints, semantic constraints, and descriptor constraints. To improve the localization performance, we effectively integrate and formulate the global planar measurements and point measurements in a filter-based estimator. The effectiveness of the proposed method is extensively validated on real-world datasets collected in different scenarios. Experimental results demonstrate that, rather than using the point map alone, leveraging the plane information in the prior map can yield better trajectory estimates and broaden the effective scope of the prior map in different scenes." 
CAROM Air - Vehicle Localization and Traffic Scene Reconstruction from Aerial Videos,"Duo Lu, Eric Eaton, Matt Weg, Wei Wang, Steven Como, Jeffrey Wishart, Hongbin Yu, Yezhou Yang","Rider University,Arizona State University",Mapping and Localization,"Road traffic scene reconstruction from videos has long been desired by road safety regulators, city planners, researchers, and autonomous driving technology developers. However, it is expensive and unnecessary to cover every mile of the road with cameras mounted on the road infrastructure. This paper presents a method that can process aerial videos into vehicle trajectory data so that a traffic scene can be automatically reconstructed and accurately re-simulated using computers. On average, the vehicle localization error is about 0.1 m to 0.3 m using a consumer-grade drone flying at 120 meters. This project also compiles a dataset of 50 reconstructed road traffic scenes from about 100 hours of aerial videos to enable various downstream traffic analysis applications and facilitate further road traffic related research. The dataset is available at https://github.com/duolu/CAROM." Control of Rough Terrain Vehicles Using Deep Reinforcement Learning,"Viktor Wiberg, Erik Wallin, Tomas Nordfjell, Martin Servin","Umeå univsersity,Umeå University,Swedish University of Agricultural Sciences",SLAM & Navigation,"We explore the potential to control terrain vehicles using deep reinforcement learning in scenarios where human operators and traditional control methods are inadequate. This letter presents a controller that perceives, plans, and successfully controls a 16-tonne forestry vehicle with two frame articulation joints, six wheels, and their actively articulated suspensions to traverse rough terrain. The carefully shaped reward signal promotes safe, environmental, and efficient driving, which leads to the emergence of unprecedented driving skills. 
We test learned skills in a virtual environment, including terrains reconstructed from high-density laser scans of forest sites. The controller displays the ability to handle obstructing obstacles, slopes up to 27 degrees, and a variety of natural terrains, all with limited wheel slip, smooth, and upright traversal with intelligent use of the active suspensions. The results confirm that deep reinforcement learning has the potential to enhance control of vehicles with complex dynamics and high-dimensional observation data compared to human operators or traditional control methods, especially in rough terrain." DynaVINS: A Visual-Inertial SLAM for Dynamic Environments,"Seungwon Song, Hyungtae Lim, Alex Lee, Hyun Myung","KAIST,Korea Advanced Institute of Science and Technology,Hyundai Motor Company,KAIST (Korea Advanced Institute of Science and Technology)",SLAM & Navigation,"Visual inertial odometry and SLAM algorithms are widely used in various fields, such as service robots, drones, and autonomous vehicles. Most SLAM algorithms are based on the assumption that landmarks are static. However, in the real world, various dynamic objects exist, and they degrade the pose estimation accuracy. In addition, temporarily static objects, which are static during observation but move when they are out of sight, trigger false positive loop closings. To overcome these problems, we propose a novel visual-inertial SLAM framework, called DynaVINS, which is robust against both dynamic objects and temporarily static objects. In our framework, we first present a robust bundle adjustment that could reject the features from dynamic objects by leveraging pose priors estimated by the IMU preintegration. Then, keyframe grouping and multi-hypothesis-based constraint grouping methods are proposed to reduce the effect of temporarily static objects in the loop closing. Subsequently, we evaluated our method on a public dataset that contains numerous dynamic objects. 
Finally, the experimental results corroborate that our DynaVINS has promising performance compared with other state-of-the-art methods by successfully rejecting the effect of dynamic and temporarily static objects. Our code is available at https://github.com/url-kaist/dynaVINS." Visual-Inertial Odometry with Online Calibration of Velocity-Control Based Kinematic Motion Models,"Haolong Li, Joerg Stueckler",Max Planck Institute for Intelligent Systems,SLAM & Navigation,"Visual-inertial odometry (VIO) is an important technology for autonomous robots with power and payload constraints. In this paper, we propose a novel approach for VIO with stereo cameras which integrates and calibrates the velocity-control based kinematic motion model of wheeled mobile robots online. Including such a motion model can help to improve the accuracy of VIO. Compared to several previous approaches proposed to integrate wheel odometer measurements for this purpose, our method does not require wheel encoders and can be applied when the robot motion can be modeled with a velocity-control based kinematic motion model. We use a radial basis function (RBF) kernel to compensate for the time delay and deviations between control commands and actual robot motion. The motion model is calibrated online by the VIO system and can be used as a forward model for motion control and planning. We evaluate our approach with data obtained in variously sized indoor environments, demonstrate improvements over a pure VIO method, and evaluate the prediction accuracy of the online calibrated model." Learning Setup Policies: Reliable Transition between Locomotion Behaviours,"Brendan Tidd, Jurgen Leitner, Akansel Cosgun, Nicolas Hudson","CSIRO,LYRO Robotics & Monash University,Monash University,X, The Moonshot Factory",SLAM & Navigation,"Dynamic platforms that operate over many unique terrain conditions typically require many behaviours. 
To transition safely, there must be an overlap of states between adjacent controllers. We develop a novel method for training setup policies that bridge the trajectories between pre-trained Deep Reinforcement Learning (DRL) policies. We demonstrate our method with a simulated biped traversing a difficult jump terrain, where a single policy fails to learn the task, and switching between pre-trained policies without setup policies also fails. We perform an ablation of key components of our system, and show that our method outperforms others that learn transition policies. We demonstrate our method with several difficult and diverse terrain types, and show that we can use setup policies as part of a modular control suite to successfully traverse a sequence of complex terrains. We show that using setup policies improves the success rate for traversing a single difficult jump terrain (from 51.3% success rate with the best comparative method to 82.2%), and traversing a random sequence of difficult obstacles (from 1.9% without setup policies to 71.2%)." MMDF: Multi-Modal Deep Feature Based Place Recognition of Mobile Robots with Applications on Cross-Scene Navigation,"Xiang Yu, Bo Zhou, Zeqing Chang, Kun Qian, Fang Fang","Southeast University,Southeast university",SLAM & Navigation,"Although the navigation of robots in urban environments has achieved great performance, there is still a problem of insufficient robustness in cross-scene (ground, water surface) applications. An intuitive remedy is to introduce multi-modal complementary data to improve the robustness of the algorithm. This paper presents an MMDF (multi-modal deep feature) based cross-scene place recognition framework, which consists of four kinds of modules: LiDAR module, image module, fusion module and NetVLAD module. 3D point clouds and images are first input to the network. The point cloud module uses PointNet to extract point cloud features. The image module uses a lightweight network to extract image features. 
The fusion module uses image semantic features to enhance point cloud features, and then the enhanced point cloud features are aggregated using NetVLAD to obtain the final enhanced descriptors. Extensive experiments on KITTI, Oxford RobotCar and USVInland datasets demonstrate MMDF outperforms PointNetVLAD, NetVLAD and a camera-LiDAR fused descriptor." Deep IMU Bias Inference for Robust Visual-Inertial Odometry with Factor Graphs,"Russell Buchanan, Varun Agrawal, Marco Camurri, Frank Dellaert, Maurice Fallon","University of Edinburgh,Georgia Institute of Technology,Free University of Bozen-Bolzano,University of Oxford",SLAM & Navigation,"Visual Inertial Odometry (VIO) is one of the most established state estimation methods for mobile platforms. However, when visual tracking fails, VIO algorithms quickly diverge due to rapid error accumulation during inertial data integration. This error is typically modeled as a combination of additive Gaussian noise and a slowly changing bias which evolves as a random walk. In this work, we propose to train a neural network to learn the true bias evolution. We implement and compare two common sequential deep learning architectures: LSTMs and Transformers. Our approach follows from recent learning-based inertial estimators, but, instead of learning a motion model, we target IMU bias explicitly, which allows us to generalize to locomotion patterns unseen in training. We show that our proposed method improves state estimation in visually challenging situations across a wide range of motions by quadrupedal robots, walking humans, and drones. Our experiments show an average 15% reduction in drift rate, with much larger reductions when there is total vision failure. Importantly, we also demonstrate that models trained with one locomotion pattern (human walking) can be applied to another (quadruped robot trotting) without retraining." 
Hierarchical Motion Planning for Autonomous Vehicles in Unstructured Dynamic Environments,"Yao Qi, Binbing He, Yang Tai, Rendong Wang, Le Wang, Youchun Xu","Army Military Transportation University,Institute of Military Transportation, Army Military Transportati,Tianjin Navigation Instruments Research Institute,Military Transportation University",SLAM & Navigation,"This paper presents a hierarchical motion planner for generating smooth and feasible trajectories for autonomous vehicles in unstructured environments with static and moving obstacles. The framework enables real-time computation by progressively shrinking the solution space. First, a graph searcher based on combined heuristic and partial motion planning is proposed for finding coarse trajectories in spatiotemporal space. To enable fast online planning, a time interval-based algorithm that considers obstacle prediction trajectories is proposed, which uses line segment intersection detection to check for collisions. Second, to practically smooth the coarse trajectory, a continuous optimizer is implemented in three layers, corresponding to the whole path, the near-future path and the speed profile. We use discrete points to represent the far-future path and parametric curves to represent the near-future path and the whole speed profile. The approach is validated in both simulations and real-world off-road environments based on representative scenarios, including the “wait and go” scenario. The experimental results show that the proposed method improves the success rate and travel efficiency while actively avoiding static and moving obstacles." 
SOFT2: Stereo Visual Odometry for Road Vehicles Based on a Point-To-Epipolar-Line Metric,"Igor Cvišsić, Ivan Markovic, Ivan Petrovic","University of Zagreb, Faculty of Electrical Engineering and Comp,University of Zagreb Faculty of Electrical Engineering and Computing,University of Zagreb",SLAM & Navigation,"Accurate localization constitutes a fundamental building block of any autonomous system. In this paper, we focus on stereo cameras and present a novel approach, dubbed SOFT2, that is currently the highest-ranking algorithm on the KITTI scoreboard. SOFT2 relies on the constraints imposed by the epipolar geometry and kinematics, i.e., it is developed for configurations that cannot exhibit pure rotation. We minimize point-to-epipolar-line distances, which makes the approach resilient to object depth uncertainty, and as the first step, we estimate motion up to scale using just a single camera. Then, we propose to jointly estimate the absolute scale and the extrinsic rotation of the second camera in order to alleviate the effects of varying stereo rig extrinsics. Finally, we smooth the motion estimates in a temporal window of frames by using the proposed epipolar line bundle adjustment procedure. We also introduce a multiple hypothesis feature matching approach for self-similar planar surfaces that accounts for appearance change due to perspective. We evaluate SOFT2 and compare it to ORB-SLAM2, OV2SLAM, and VINS-FUSION on the KITTI-360 dataset, KITTI train sequences, Málaga Urban datase" Winding Through: Crowd Navigation Via Topological Invariance,"Christoforos Mavrogiannis, Krishnan Balasubramanian, Sriyash Poddar, Anush Gandra, Siddhartha Srinivasa","University of Michigan,University of Washington,Indian Institute of Technology Kharagpur",SLAM & Navigation,"We focus on robot navigation in crowded environments. The challenge of predicting the motion of a crowd around a robot makes it hard to ensure human safety and comfort. 
Recent approaches often employ end-to-end techniques for robot control or deep architectures for high-fidelity human motion prediction. While these methods achieve important performance benchmarks in simulated domains, dataset limitations and high sample complexity tend to prevent them from transferring to real-world environments. Our key insight is that a low-dimensional representation that captures critical features of crowd-robot dynamics could be sufficient to enable a robot to wind through a crowd smoothly. To this end, we mathematically formalize the act of passing between two agents as a rotation, using a notion of topological invariance. Based on this formalism, we design a cost functional that favors robot trajectories contributing higher passing progress and penalizes switching between different sides of a human. We incorporate this functional into a model predictive controller that employs a simple constant-velocity model of human motion prediction. This results in robot motion that accomplishes statistically significantly higher clearances from the crowd compared to state-of-the-art baselines while maintaining competitive levels of efficiency, across extensive simulations and challenging real-world experiments on a self-balancing robot." Tactile-Based Task Description through Edge Contact Formation Setpoints for Object Exploration and Manipulation,"Zhanat Kappassov, Juan Antonio Corrales Ramon, Véronique Perdereau","Nazarbayev University,Universidade de Santiago de Compostela,Sorbonne University",Force and Tactile Sensing and Haptics and Haptic Interfaces,"In autonomous robot tasks involving physical contacts with the environment, it is still challenging to perform dexterous manipulation. Force control approaches and force sensors are usually used to control actions of a robot. 
However, the spatial resolution of the force sensors is too limited to explore and manipulate an object by tracking salient tactile features, such as edges, while touching the surface of the object. In fact, the exploration or manipulation can be implemented via tactile servoing approaches that use the parameters of those edges. These parameters, obtained by an array of tactile sensors, are used for generating setpoints that drive a robot arm to minimize the gap between the desired and current parameters of a given edge. This paper describes a new common strategy for defining tactile setpoint signals for tactile servoing approaches in order to implement different contact-based tasks. These setpoints represent artificial constraints and comply with natural constraints on force and position of the robot end-effector imposed by the physical contact between the robot and the object. The sequences of setpoints for three different tasks are given as examples: alignment with an object, exploration of a linear object with variable stiffness and manipulation by rolling of ellipsoidal objects. These tasks are validated with real experiments using a KUKA LWR 4+ robot arm and a Weiss WTS-0614 piezoresistive sensor. The arm runs at 1 kHz and control at 100 Hz." 3D Contact Point Cloud Reconstruction from Vision-Based Tactile Flow,"Yipai Du, Guanlan Zhang, Michael Y. Wang","Hong Kong University of Science and Technology,The Hong Kong University of Science and Technology,Monash University",Force and Tactile Sensing and Haptics and Haptic Interfaces,"With the growing interest in vision-based tactile sensors, various types of sensors that utilize digital imaging are being developed. Among them, a group of sensors captures the tactile flow resulting from the contact deformation using the optical flow algorithm from computer vision and achieves full resolution deformation tracking on the tactile surface. 
In this work, a novel 3D contact reconstruction algorithm is proposed and evaluated. It exploits the contact geometry and projection relationship in the tactile flow, which are versatile for vision-based tactile sensors, unique for tactile perception but not inherited from computer vision. The resulting 3D contact point cloud representation is consistent with the tactile flow constraint, scale estimation, and contact edge estimation. It can be directly manipulated in downstream applications such as contact force estimation and contact pose estimation. Experiments and examples are provided that indicate the potential for the proposed tactile processing algorithm to connect tactile perception to tactile enabled robotic manipulation tasks." Visuo-Tactile Recognition of Partial Point Clouds Using PointNet and Curriculum Learning,"Christopher Parsons, Alessandro Albini, Daniele De Martini, Perla Maiolino",University of Oxford,Force and Tactile Sensing and Haptics and Haptic Interfaces,"This paper is about recognising hand-held objects from incomplete tactile observations with a classifier trained only on visual representations. Our method is based on the Deep Learning (DL) architecture PointNet and a Curriculum Learning (CL) technique for fostering the learning of descriptors robust to partial representations of objects. The learning procedure gradually decomposes the visual point clouds to synthesise sparser and sparser input data for the model. In this manner we were able to use one-shot learning, using the decomposed visual point clouds as augmentations, and reduce the training requirement for data collection. The approach allows for a gradual improvement of prediction accuracy as more tactile data become available. 
We evaluated the effectiveness of the curriculum strategy on our generated visual and tactile datasets, experimentally showing that the proposed method improved the recognition accuracy by up to 23% on partial tactile data and boosted accuracy on full tactile data from 93% to 100%. The curriculum trained network recognised objects with an accuracy of 80% using only 20% of the tactile data representing the objects, increasing to 100% accuracy on clou" Bidirectional Sim-To-Real Transfer for GelSight Tactile Sensors with CycleGAN,"Weihang Chen, Yuan Xu, Zhenyang Chen, Peiyu Zeng, Renjun Dang, Rui Chen, Jing Xu","Tsinghua University,Southern University of Science and Technology",Force and Tactile Sensing and Haptics and Haptic Interfaces,"GelSight optical tactile sensors have high-resolution and low-cost advantages and have witnessed growing adoption in various contact-rich robotic applications. Sim2Real for GelSight sensors can reduce the time cost and sensor damage during data collection and is crucial for learning-based tactile perception and control. However, it remains difficult for existing simulation methods to resemble the complex and non-ideal light transmission of real sensors. In this paper, we propose to narrow the gap between simulation and real world using CycleGAN. Due to the bi-directional generators of CycleGAN, the proposed method can not only generate more realistic simulated tactile images, but also improve the deformation measurement accuracy of real sensors by transferring them to simulation domain. Experiments on a public dataset and our own GelSight sensors have validated the effectiveness of our method. Our code will be released upon acceptance." 
Development of a Novel 2-Dimensional Neck Haptic Device for Gait Balance Training,"Hosu Lee, Amre Eizad, Jiho Park, Yeongmi Kim, Sunwoo Hwang, Min-kyun Oh, Jungwon Yoon","Gwangju Institute of Science and Technology,GIST,MCI,Gyeongsang National University Hospital,Gwangju Institute of Science and Technology",Force and Tactile Sensing and Haptics and Haptic Interfaces,"Balance problems can be the major cause of falling. Existing gait balance rehabilitation devices usually have limited overground usability due to low portability. Therefore, there is a need for the development of a portable gait balance rehabilitation system. To fulfill this need, a wearable balance biofeedback system is proposed that utilizes a pair of novel devices to deliver 2-dimensional hybrid haptic biofeedback to the neck that is a combination of indentation and stretching of the skin corresponding to the mediolateral (ML) and anteroposterior (AP) directions, respectively. The system’s functionality is demonstrated through an experiment where 14 healthy subjects and 1 stroke patient performed stance and gait tasks under various feedback conditions. Provision of feedback to healthy subjects resulted in significant improvements in two-dimensional balance under all task conditions. It is also observed that provision of feedback during more difficult tasks resulted in more significant balance improvements. Furthermore, use of the system during gait balance evaluation trials did not cause any significant change in gait speed, meaning that it does not have any detrimental effect on the user’s gait. Results of the stroke subject pilot trial showed similar trends. We expect that use of the proposed system may help to improve the overground gait balance of people suffering from the after effects of diseases such as stroke." 
Communicating Inferred Goals with Passive Augmented Reality and Active Haptic Feedback,"James Mullen, Josh Mosier, Sounak Chakrabarti, Anqi Chen, Tyler White, Dylan Losey","University of Maryland,Virginia Tech,Virginia Polytechnic Institute and State University",Force and Tactile Sensing and Haptics and Haptic Interfaces,"Robots learn as they interact with humans. Consider a human teleoperating an assistive robot arm: as the human guides and corrects the arm’s motion, the robot gathers information about the human’s desired task. But how does the human know what their robot has inferred? Today’s approaches often focus on conveying intent: for instance, using legible motions or gestures to indicate what the robot is planning. However, closing the loop on robot inference requires more than just revealing the robot’s current policy: the robot should also display the alternatives it thinks are likely, and prompt the human teacher when additional guidance is necessary. In this paper we propose a multimodal approach for communicating robot inference that combines both passive and active feedback. Specifically, we leverage information-rich augmented reality to passively visualize what the robot has inferred, and attention-grabbing haptic wristbands to actively prompt and direct the human’s teaching. We apply our system to shared autonomy tasks where the robot must infer the human’s goal in real-time. Within this context, we integrate passive and active modalities into a single algorithmic framework that determines when and which type of feedback to provide. 
Combining both passive and active feedback experimentally outperforms single modality baselines; during an in-person user study, we demonstrate that our integrated approach increases how efficiently humans teach the robot while simultaneously decre" Touching the Sound: Audible Features Enable Haptics for Robot Control,"Hongshen Shi, Matteo Russo, Juan De La Torre, Abdelkhalick Mohammad, Xin Dong, Dragos Axinte","University of Nottingham,University of Rome Tor Vergata,University of Nottingham",Force and Tactile Sensing and Haptics and Haptic Interfaces,"Haptic control interfaces can significantly improve the quality of teleoperation in robotic and mechatronic systems. However, the required force feedback introduces new challenges when highly constrained environments and systems hinder the use of conventional force sensors. In this paper, we propose a haptic control method that provides feedback to the operator by extracting force data from a sound signal. As sound can be acquired by a microphone deployed outside the workspace, this method enables remote feedback in any process that can be characterized through acoustic emissions (e.g. machining), without the need for sensors embedded in the system or deployed in the work environment. The performance of the method is verified by an example application, with a teleoperated robot that executes a machining process (milling and grinding), whose sound is acquired with a microphone and processed to successfully estimate the machining force for haptic feedback." Haptify: A Measurement-Based Benchmarking System for Grounded Force-Feedback Devices,"Farimah Fazlollahi, Katherine J. Kuchenbecker",Max Planck Institute for Intelligent Systems,Force and Tactile Sensing and Haptics and Haptic Interfaces,"Grounded force-feedback (GFF) devices are an established and diverse class of haptic technology based on robotic arms. However, the number of designs and how they are specified make comparing devices difficult. 
We thus present Haptify, a benchmarking system that can thoroughly, fairly, and noninvasively evaluate GFF haptic devices. The user holds the instrumented device end-effector and moves it through a series of passive and active experiments. Haptify records the interaction between the hand, device, and ground with a seven-camera optical motion-capture system, a 60-cm-square custom force plate, and a customized sensing end-effector. We demonstrate six key ways to assess GFF device performance: workspace shape, global free-space forces, global free-space vibrations, local dynamic forces and torques, frictionless surface rendering, and stiffness rendering. We then use Haptify to benchmark two commercial haptic devices. With a smaller workspace than the 3D Systems Touch, the more expensive Touch X outputs smaller free-space forces and vibrations, smaller and more predictable dynamic forces and torques, and higher-quality renderings of a frictionless surface and high stiffness." Biomimetic Force and Impedance Adaptation Based on Broad Learning System in Stable and Unstable Tasks,"Zhenyu Lu, Ning Wang","Bristol Robotics Laboratory,University of the West of England",Bioinspiration and Biomimetics,"This article presents a novel biomimetic force and impedance adaption framework based on Broad Learning System (BLS) for robot control in stable and unstable environments. Different from iterative learning control, the adaptation process is realized by a neural network (NN)-based framework, similar to BLS, to realize a varying learning rate for the feedforward force and impedance factors. The connections of NN layers and the settings of the feature nodes are related to human motor control and learning principle that is described as a relationship between feedforward force, impedance, reflex and position errors, etc., to make the NN explainable. 
Some comparative simulations are created and tested in five force fields to verify the advantages of the proposed framework in terms of force and trajectory tracking efficiency and accuracy, robust responses to different force situations and continuity of force application in a mixed stable and unstable environment. Finally, an experiment is taken to verify the effectiveness of the proposed method." CPG-RL: Learning Central Pattern Generators for Quadruped Locomotion,"Guillaume Bellegarda, Auke Ijspeert",EPFL,Bioinspiration and Biomimetics,"In this paper, we present a method for integrating central pattern generators (CPGs), i.e. systems of coupled oscillators, into the deep reinforcement learning (DRL) framework to produce robust and omnidirectional quadruped locomotion. The agent learns to directly modulate the intrinsic oscillator setpoints (amplitude and frequency) and coordinate rhythmic behavior among different oscillators. This approach also allows the use of DRL to explore questions related to neuroscience, namely the role of descending pathways, interoscillator couplings, and sensory feedback in gait generation. We train our policies in simulation and perform a sim-to-real transfer to the Unitree A1 quadruped, where we observe robust behavior to disturbances unseen during training, most notably to a dynamically added 13.75 kg load representing 115% of the nominal quadruped mass. We test several different observation spaces based on proprioceptive sensing and show that our framework is deployable with no domain randomization and very little feedback, where along with the oscillator states, it is possible to provide only contact booleans in the observation space." 
Research on Target Tracking for Robotic Fish Based on Low-Cost Scarce Sensing Information Fusion,"Yong Zhong, Youdong Chen, Chengcai Wang, Qixing Wang, Jiawei Yang","South China University of Technology,Peking University",Bioinspiration and Biomimetics,"Target tracking for underwater robots is always challenging, due to low-quality sensing information, information interference, and environmental disturbances. Traditional sensing methods for target tracking include vision-based tracking, acoustic-based tracking, etc., which suffer from poor visibility, blurring, absorption, and scattering for vision-based methods, and limited bandwidth, noise, high delay, Doppler spread, and extremely expensive acoustic sensors for acoustic-based methods. In this paper, we investigate the possibility of utilizing low-cost scarce sensing information for underwater target tracking through a robotic fish platform. First, we introduce the design and control system of the robotic fish platform. Instead of using cameras and other expensive sensors, we adopt low-cost infrared sensors as the primary sensors for the robotic fish, which can only detect the appearance of objects within the sensing range of each sensor. Second, we present a target tracking strategy based on the scarce sensing information and fusion by analyzing and combining the information of two adjacent consecutive moments of the sensors. Then, a centralized detection decision tree with fewer branches, fast convergence, and high purity is proposed. Finally, to verify our method, several sets of guided target tracking experiments are conducted. 
The experimental results show the effectiveness and robustness of the proposed target tracking strategy based on low-cost scarce sensin" "An Anthropomorphic Robotic Finger with Innate Human-Finger-Like Biomechanical Advantages Part I: Design, Ligamentous Joint and Extensor Mechanism","Yingmin Zhu, Guowu Wei, Lei Ren, Zirong Luo, Jianzhong Shang","School of Mechano-electronic Engineering,Xidian University,Salford University,University of Manchester,National University of Defense Technology",Bioinspiration and Biomimetics,"Exploring fundamental biomechanical features of the human hand and exploiting them in robotic hands has been proven to be an effective approach to enhancing artificial hands’ performance, especially when interacting with various objects in dynamic unstructured environments. In this article, a bioinspired anthropomorphic robotic finger is first proposed, which embeds human finger musculoskeletal features in the design. Based on this design, three human-finger-like biomechanical advantages are systematically investigated and embodied in the bioinspired robotic finger. This article for the first time derives, presents, and experimentally verifies the mathematical models for the variable stiffness of finger ligamentous joints and self-adaptive morphing mechanism of finger flexible tendon sheaths, and validates and compares the influence of the reticular and linear extensor morphologies on fingertip feasible forces in three-dimensional (3-D) space. In this Part I of the article, two of the biomechanical properties, i.e., joint stiffness generated by the ligamentous joint of the finger, and fingertip feasible force space influenced by the reticular extensor mechanism are systematically investigated through theoretical modeling and experimental verification. 
Correspondingly, two biomechanical advantages were found, i.e., the ligamentous joint of the finger could provide anisotropic variable joint stiffness, enhancing the adaptivity, dexterity, and stability of fingers; and a reticular e" An Anthropomorphic Robotic Finger with Innate Human-Finger-Like Biomechanical Advantages Part II: Flexible Tendon Sheath and Grasping Demonstration,"Yiming Zhu, Guowu Wei, Lei Ren, Zirong Luo, Jianzhong Shang","The University of Manchester,Salford University,University of Manchester,National University of Defense Technology",Bioinspiration and Biomimetics,"The human hand has a fantastic ability to interact with various objects in the dynamic unstructured environment of our daily activities. We believe that this outstanding performance benefits a lot from the unique biological features of the hand musculoskeletal system. In Part I of this paper, a bio-inspired anthropomorphic robotic finger was developed, based on which, two human-finger-like biomechanical advantages were elaborately investigated, including the anisotropic variable stiffness associated with the ligamentous joints, and the enlarged feasible force space associated with the reticular extensor mechanisms. In Part II, the fingertip force-velocity characteristics resulting from the flexible tendon sheath are studied. It indicates that the fingertip force-velocity workspace can be greatly augmented owing to the self-adaptive morphing of the flexible tendon sheaths, showing the average improvement of 41.2% theoretically and 117.5% experimentally compared with the results of 2 mm, 4 mm and 6 mm size rigid tendon sheaths. 
Grasping tests and comparisons are then conducted with four three-fingered robotic hands (one with the robotic finger proposed in Part I, one with hinge joint" Sim-To-Real: Learning Energy-Efficient Slithering Gaits for a Snake-Like Robot,"Zhenshan Bing, Long Cheng, Kai Huang, Alois Knoll","Technical University of Munich,Wenzhou University,Sun Yat-sen University,Tech. Univ. Muenchen TUM",Bioinspiration and Biomimetics,"To resemble the body flexibility of biological snakes, snake-like robots are designed as a chain of body modules, which gives them many degrees of freedom on the one hand and leads to a challenging task to control them on the other hand. Compared with conventional model-based control methods, reinforcement learning based methods provide promising solutions to design agile and energy-efficient gaits for snake-like robots, since reinforcement learning based methods can fully exploit the hyper-redundant bodies of the robots. However, reinforcement learning based methods for snake-like robots have rarely been investigated even in simulations, let alone been deployed on real-world snake-like robots. In this work, we introduce a novel approach for designing energy-efficient gaits for a snake-like robot, which first learns a policy using a reinforcement learning algorithm in simulation and then transfers it to the real-world testing, thereby leveraging fast and economical gait generation process. 
We evaluate our reinforcement learning based approach in both simulations and real-world experiments to demonstrate that it can generate substantially more energy-efficient gaits than those gene" S2worm: A Fast-Moving Untethered Insect-Scale Robot with 2-DoF Transmission Mechanism,"Yide Liu, Yanhong Chen, Bo Feng, Dongqi Wang, Taishan Liu, Haofei Zhou, Hua Li, Shaoxing Qu, Wei Yang","zhejiang university,Zhejiang University",Bioinspiration and Biomimetics,"Designing terrestrial insect-scale robot with high maneuverability and autonomy is becoming an essential challenge in the field of robotics research. Previous work has indicated that compact transmission and integrated control devices can improve the application potential. In this work, an untethered insect-scale inchworm robot is designed and fabricated by using screw theory and smart composite microstructure method, and is named S2worm. The robot weighs 4.34 g and spans 4.1 cm in length. The robot is equipped with custom designed onboard control system and high voltage boost converter to provide the driving signal for the piezoelectric bending actuator. Attributing to the novel transmission mechanism with two degrees of freedom designed based on screw theory, the S2worm can be fabricated through the smart composite microstructure method easily and shows high mobility such as high crawling speed (27.4 cm/s, 6.7 body length/s) and small turning radius (1.7 cm, 0.4 body length/s). The S2worm holds the following advantages: small size, untethered, high mobility and low energy consumption. The robot is promising for application in planetary exploration, earthquake search and constructing insect-scale multi-robot system." 
Towards a Discrete Snake-Like Robot Based on SMA-Actuated Tristable Modules for Follow the Leader Control Strategy,"Beniamin Calmé, Lennart Rubbert, Yassine Haddab","LIRMM, Univ Montpellier, CNRS,INSA - Strasbourg,University of Montpellier",Bioinspiration and Biomimetics,"Snake-like robot applications include surveillance and exploration of confined environments where human presence is incompatible. The attractiveness of this type of mechatronic structure is notably linked to the modular character and the hyper-redundancy of its architecture, which gives it both mechanical robustness and great manoeuvrability. However, due to the large number of degrees-of-freedom, advanced mathematical models are needed to assess the motion patterns and to simulate them. A new snake-like robot architecture is introduced in this paper together with the development methodology of a replicable multistable module. The interest of this contribution lies in the combination of the mechanical stability of the modules with an easy-to-use direct kinematic model, thus avoiding the need for complex control strategies. The design of one module exploiting the three stable states of a tristable flexible mechanism so as to orient the articulation of the module at three distinct stable angles is first presented. Then a prototype of a modular snake-like robot is built and experimentally evaluated. The prototype consists of 4 modules that can be individually oriented by ±7.1 degrees. Each module measures 51.5x32mm and weighs 2.5g. Thereby, this work provides the first results on the feasibility of this robotic architecture which consists of several multistable modules. A good agreement between the estimated workspace and experimental results is obtained." 
Three-Dimensional Modeling and Kinematic Analysis of Human Elbow Joint Axis Based on Anatomy and Screw Theory,"Yongsheng Gao, Guodong Lang, Wenpeng Shen, Jie Zhao",Harbin Institute of Technology,Bioinspiration and Biomimetics,"Owing to the complex coupling of sliding and rolling motion in human joint tissue, it is hard to depict the dynamic state of the joint axis. Discovering the rationale of the joint axis can be significant in the process of studying the bionic motion of exoskeleton robots. We propose a method to build a biological joint model based on Computed Tomography (CT) and anatomy, and to find the biological joint axis based on screw theory. The bone models were reconstructed only from CT images of the elbow joint scanned in any single position. According to anatomy, the trochlea of the humerus and the trochlear notch of the models were set tangent to build the elbow joint motion model. Screw theory was used to study the elbow joint axis and establish the equation of motion of the axis under the joint’s compound motion in its physiological structure. The model, which follows anatomical principles, reflects human joints well. The pattern traced by the joint axis is similar in shape to a Möbius strip. This work provides a new method to study the human joint axis, and the axis results provide a reference for the design of exoskeleton joints and bionic robots." High-Performance Six-DOF Flight Control of the Bee++: An Inclined-Stroke-Plane Approach,"Ryan Bena, Xiufeng Yang, Ariel Calderon, Nestor O Perez-arancibia","University of Southern California,Washington State University (WSU)",Bioinspiration and Biomimetics,"We present a new method for synthesizing and implementing high-performance six-degree-of-freedom (6-DOF) flight controllers for the Bee++, an insect-scale flying robot driven by four independently-actuated flapping wings. 
Each wing of the Bee++ is installed with a preset orientation such that the stroke plane generated during flight is inclined, thus enabling reliable roll, pitch, and yaw torque generation. Leveraging this capability, we propose a Lyapunov-based nonlinear control architecture that enables closed-loop position and attitude regulation and tracking. The control algorithms presented in this article simultaneously stabilize position and attitude by independently varying the wingstroke amplitudes of the four flapping wings of the Bee++. We use this particular control architecture to exemplify the process of controller synthesis and real-time implementation; however, the aerodynamic design of the Bee++ is compatible with a great variety of control structures and performance objectives. As a main result, we present the first set of experimental data demonstrating sustained and robust high-performance tracking of a 6-DOF reference signal during flight at the insect scale, which has been a long-standing control problem in the field of flapping-wing microrobotics. Furthermore, using data obtained through a series of systematic flight tests, we show that the Bee++ can achieve the highest 6-DOF performance ever recorded for an insect-scale flapping-wing flying robot." Autonomous Dozer Sand Grading under Localization Uncertainties,"Yakov Miron, Yuval Goldfracht, Chana Ross, Dotan Di Castro, Itzik Klein","Bosch,BCAI,University of Haifa",Sensing and Control,"Surface grading, the process of leveling an uneven area containing pre-dumped sand piles, is an important task in the construction site pipeline. This labour-intensive process is often carried out by a dozer, a key machinery tool at any construction site. Current attempts to automate surface grading assume perfect localization. However, in real-world scenarios, this assumption fails, as agents are presented with imperfect perception, which leads to degraded performance.
In this work, we address the problem of autonomous grading under uncertainties. First, we implement a simulation and a scaled real-world prototype environment to enable rapid policy exploration and evaluation in this setting. Second, we formalize the problem as a partially observable Markov decision process and train an agent capable of handling such uncertainties. We show, through rigorous experiments, that an agent trained under perfect localization will suffer degraded performance when presented with localization uncertainties. However, an agent trained using our method will develop a more robust policy for addressing such errors and, consequently, exhibit a better grading performance." Self-Triggered Coverage Control for Mobile Sensors,"Erick J. Rodriguez-Seda, Xiaotian Xu, Josep M. Olm, Arnau Doria-cerezo, Yancy Diaz-Mercado","United States Naval Academy,University of Maryland, College Park,Universitat Politecnica de Catalunya,Polytechnic University of Catalonia,University of Maryland",Sensing and Control,"The deployment and coordination of mobile sensor networks for coverage control applications can present several practical challenges including how to efficiently share limited communication resources and how to reduce the use of localization devices (e.g., radars and lidars). One potential solution to these challenges is to reduce the frequency at which agents communicate or sample each other’s position. In this paper, we present a distributed, asynchronous self-triggered control policy for centroidal Voronoi coverage control that is shown to decrease the sampling or communication instants among agents without degrading the performance of the mobile sensor network. Each agent independently decides when to sample the position of nearby agents and uses outdated information of its neighbors until new information is required.
We prove that the locational cost function describing the distribution of agents monotonically decreases everywhere outside of a bounded neighborhood around the group’s optimal configuration and that the agents asymptotically converge to their Voronoi centroids if the data-sampled centroid errors approach zero. In addition, we show that the sampling intervals are always positive and lower bounded and, as illustrated by simulations and experiments, they tend to stabilize at a large value as the mobile sensor network comes to a steady-state." Constrained Gaussian Processes with Integrated Kernels for Long-Horizon Prediction of Dense Pedestrian Crowd Flows,"Stefan H. Kiss, Kavindie Katuwandeniya, Alen Alempijevic, Teresa A. Vidal-Calleja",University of Technology Sydney,Sensing and Control,"In this paper, we present a novel approach for predicting pedestrian crowd dynamics over longer time horizons (30s). In dense environments over long time horizons, the number of pedestrian interactions is high, leading to the degradation of traditional pedestrian trajectory estimation techniques. Alternatively, we consider the macroscopic properties of the crowd as a whole, focusing on the flow of density. This approach benefits from not considering pedestrians individually, and can probabilistically estimate the existence of previously unobserved individuals. We propose a novel approach to imposing a physical constraint on the crowd density flow. Initially, a coarse resolution prediction is generated by a Convolutional Recurrent Neural Network (ConvRNN), and subsequently smoothly interpolated by a Gaussian Process (GP). Using the linearity properties of GPs, a continuous representation of the crowd is produced that complies with both the ConvRNN’s prediction and a conservation of density constraint. The approach is trained and analysed on the dense ATC dataset, where we show the advantages of the approach and the improvements from our contributions." 
Large-Workspace Polyarticulated Micro-Structures Based-On Folded Silica for Tethered Nanorobotics,"Yuning Lei, Cédric Clévy, Jean-yves Rauch, Philippe Lutz","Carl von Ossietzky Universität Oldenburg,Franche-Comté University,FEMTO-ST institute,FEMTO-ST - UMR CNRS ,,,, - UFC/ENSMM/UTBM",Sensing and Control,"Origami structures have a wide range of applications in robotics and have been intensively investigated by researchers in recent years. However, enabling sub-millimeter structures is an open question especially because of the lack of small enough joints. In this paper, compliant joints made of Silica by Focused Ion Beam (FIB) folding are proposed to achieve continuous, highly repeatable large motions. A polyarticulated structure including 3 joints is especially studied following a series of robotic analyses and experimentations to quantify the performances. The size of the structure firstly appears disruptive because smaller than 50 µm in typical overall length, i.e. less than the radius of an optical fiber. Secondly, the structure can achieve a planar workspace of 57 µm squared, which is significantly large compared to the structure dimension. Thirdly, repetitive movements performed at randomly selected positions, demonstrate an excellent repeatability standard deviations of 227 nm and 216 nm in x and y directions, respectively. These results together state the interest of novel polyarticulated structures resulting from the FIB folding as a basis for the next tethered nanorobotics generation." Direction and Trajectory Tracking Control for Nonholonomic Spherical Robot by Combining Sliding Mode Controller and Model Prediction Controller,"Yifan Liu, Yixu Wang, Xiaoqing Guan, Tao Hu, Ziang Zhang, Song Jin, You Wang, Jie Hao, Guang Li","Zhejiang University,Luoteng Hangzhou Techonlogy Co.,Ltd.",Sensing and Control,"A spherical robot is a nonlinear, nonholonomic, and unstable system which increases the difficulty of the direction and trajectory tracking problem. 
In this study, we propose a new direction controller, the Hierarchical Terminal Sliding Mode Controller (HTSMC), an instruction planning controller called Model Prediction Control-based Planner (MPCBP), and a trajectory tracking framework named MPCBP-HTSMC-HSMC (MHH). The HTSMC is designed by integrating a fast terminal algorithm, a hierarchical method, the motion features of a spherical robot, and its dynamics. In addition, the new direction controller has an excellent control effect with a quick response speed and strong stability. MPCBP can obtain optimal commands that are then transmitted to the velocity and direction controller. Since the two torque controllers in MHH are both Lyapunov-based sliding mode controllers, the MHH framework may achieve optimal control performance while assuring stability. Furthermore, the two controllers eliminate the requirement for MPCBP's stability and dynamic constraints. Finally, hardware experiments demonstrate the efficacy of the HTSMC, MPCBP, and MHH." Advanced Manufacturing Configuration by Sample-Efficient Batch Bayesian Optimization,"Xavier Guidetti, Alisa Rupenyan, Lutz Fassl, Majid Nabavi, John Lygeros","ETH Zürich,Equipment Digitalization Team, Oerlikon Metco,ETH Zurich",Sensing and Control,"We propose a framework for the configuration and operation of expensive-to-evaluate advanced manufacturing methods, based on Bayesian optimization. The framework unifies a tailored acquisition function, a parallel acquisition procedure, and the integration of process information providing context to the optimization procedure. The novel acquisition function is demonstrated, analyzed and compared on state-of-the-art benchmarking problems. We apply the optimization approach to atmospheric plasma spraying and fused deposition modeling. Our results demonstrate that the proposed framework can efficiently find input parameters that produce the desired outcome and minimize the process cost."
Automatically Deployable Robust Control of Modular Reconfigurable Robot Manipulators,"Carlo Nainer, Andrea Giusti",Fraunhofer Italia Research,Sensing and Control,"We propose an automatically deployable robust control scheme for modular reconfigurable robot manipulators, which accounts for noisy velocity measurements, yet it maximizes the tracking performance while avoiding chattering effects. Our proposed control approach automatically adapts to any of the possible robot compositions of given sets of modules. The robust stability is guaranteed by exploiting the use of a recursive Newton-Euler scheme with interval arithmetic computations. Moreover, the robust performance is maximized via an online regulation of the control parameters by analyzing the power spectral density of the commanded torque signal. Being fully automatic, the proposed approach allows for a quick deployment and reconfiguration of modular reconfigurable manipulators. The algorithm is validated via simulations and experiments on a commercially available robot." Velocity Following Control of a Pseudo-Driven Wheel for Reducing Internal Forces between Wheels,"Huanan Qi, Liang Ding, Bo You, Lan Huang, Xin An, Shu Li, Guangjun Liu","Harbin Institute of Technology,Harbin University of Science and Technology,Tsinghua University,Ryerson University",Sensing and Control,"The coordination of multiple driving wheels is an important issue in wheeled mobile robot design, for both maximizing tractive capability and optimizing energy consumption. This study converts a driving wheel into a pseudo-driven wheel (PDW), which is controlled to follow the motion of the robot body, in accordance with the kinematic constraints. To eliminate the internal force conflicts between wheels, a velocity following control (VFC) method is proposed to control the PDW in the presence of force disturbance caused by soil distortion during wheel-terrain interaction. 
The feasibility of the PDW conversion and the proposed control method are experimentally demonstrated. Compared with the kinematic model-based control method, the rover using PDW with VFC is shown to be capable of more accurate straight and steering path following on soft terrain, reducing internal forces between wheels by more than 70% in the experiments." Adaptive Tracking Control with Uncertainty-Aware and State-Dependent Feedback Action Blending for Robot Manipulators,"Xuwei Wu, Annika Kirner, Gianluca Garofalo, Christian Ott, Paul Kotyczka, Alexander Dietrich","German Aerospace Center (DLR),TU Wien,ABB AB,Technische Universität München",Sensing and Control,"Adaptive control can significantly improve tracking performance of robot manipulators subject to modeling errors in dynamics. In this letter, we propose a new framework combining the composite adaptive controller using a natural adaptation law and an extension of the adaptive variance algorithm (AVA) for controller blending. The proposed approach not only automatically adjusts the feedback action to reduce the risk of violating actuator constraints but also anticipates substantial modeling errors by means of an uncertainty measure, thus preventing severe performance deterioration. A formal stability analysis of the closed-loop system is conducted. The control scheme is experimentally validated and directly compared with baseline methods on a torque-controlled KUKA LWR IV+." Kinetostatic Modeling of Tendon-Driven Parallel Continuum Robots,"Sven Lilge, Jessica Burgner-kahrs",University of Toronto,"Kinematics, Dynamics, and Motion Control","Tendon-driven parallel continuum robots consist of multiple individual continuous kinematic chains, that are actuated in bending utilizing tendons routed along their backbones. 
This work derives and proposes a Cosserat rod based kinetostatic modeling framework for such parallel structures that allows for efficiently solving the forward, inverse and velocity kinetostatic problems. Using this model, the kinematic properties such as reachable workspace, singularities, manipulability and compliance of tendon-driven parallel continuum robots are studied in detail. Experiments are conducted using a real robotic prototype to validate the derived modeling approach. Overall, a median pose accuracy of 4.9 mm, corresponding to 3.4% of the continuum robots’ lengths, and 6.2° is achieved. The median of the model’s computation time results in 0.51 s on standard computing hardware. Fast computations of below 100 ms can be achieved, if an appropriate initial guess for solving the kinetostatic model is available, making the model suitable for a range of different applications including optimization or control." Globally Optimal Solution to Inverse Kinematics of 7DOF Serial Manipulator,"Pavel Trutman, Mohab Safey El Din, Didier Henrion, Tomas Pajdla","Czech Technical University in Prague,Sorbonne Univ.,University of Toulouse","Kinematics, Dynamics, and Motion Control","The Inverse Kinematics (IK) problem is concerned with finding robot control parameters to bring the robot into a desired position under the kinematics and joint limit constraints. We present a globally optimal solution to the IK problem for a general serial 7DOF manipulator with revolute joints and a polynomial objective function. We show that the kinematic constraints due to rotations can be all generated by the second-degree polynomials. This is an important result since it significantly simplifies the further step where we find the optimal solution by Lasserre relaxations of nonconvex polynomial systems. We demonstrate that the second relaxation is sufficient to solve a general 7DOF IK problem. Our approach is certifiably globally optimal.
We demonstrate the method on the 7DOF KUKA LBR IIWA manipulator and show that we are, in practice, able to compute the optimal IK or certify infeasibility in 99.9 % tested poses. We also demonstrate that by the same approach, we are able to solve the IK problem for any generic (random) manipulator with seven revolute joints." Kinematic Redundancy Analysis for (2n+1)R Circular Manipulators,"Zijia Li, Mathias Brandstötter, Michael Hofbaur","Chinese Academy of Sciences,JOANNEUM RESEARCH Forschungsgesellschaft mbH - ROBOTICS,JOANNEUM RESEARCH Forschungsgesellschaft mbH","Kinematics, Dynamics, and Motion Control","The kinematic analysis of redundant serial manipulators with 2n+1 revolute joints (integer n>2), which we call circular manipulators, is presented in this paper. The structure of the kinematic chain of circular manipulators has special properties that can be seen in the Denavit-Hartenberg parameters: all orthogonal distances are zero, all even-numbered offsets are zeros, but odd-numbered offsets are not. Typical manipulators that fulfill these properties are redundant 7R serial chains (n=3) that mimic the human arm, e.g., the lightweight robot arm KUKA LBR iiwa. This 7R circular manipulator has self-motion as rotation around an axis that goes through two fixed points for a fixed pose. First, radical reparametrization is presented based on the swivel angle of the closed-form inverse kinematics solution for the 7R circular manipulator. Second, for a six-dimensional task, the inverse kinematics solution for redundant serial manipulators with 2n+1 revolute joints (n>2) is reparametrized by the swivel angle and other 2n-6 rotation parameters. From a geometric point of view, for a circular manipulator with 2n+1 revolute joints, one can have n(n-1)/2 choices of such circular rotations. Third, we conjecture numerical kinematic singularities for circular manipulators in a recursive formula, confirming n = 5,6,7." 
Adaptive Constrained Kinematic Control Using Partial or Complete Task-Space Measurements,"Murilo Marinho, Bruno Vilhena Adorno","The University of Tokyo,The University of Manchester","Kinematics, Dynamics, and Motion Control","Recent advancements in constrained kinematic control make it an attractive strategy for controlling robots with arbitrary geometry in challenging tasks. Most current works assume that the robot kinematic model is precise enough for the task at hand. However, with increasing demands and safety requirements in robotic applications, there is a need for a controller that compensates online for kinematic inaccuracies. We propose an adaptive constrained kinematic control strategy based on quadratic programming, which uses partial or complete task-space measurements to compensate online for calibration errors. Our method is validated in experiments and simulations that show increased accuracy and safety compared to a state-of-the-art kinematic control strategy." Connecting Gaits in Energetically Conservative Legged Systems,"Maximilian Raff, Nelson Rosa, C. David Remy",University of Stuttgart,"Kinematics, Dynamics, and Motion Control","In this work, we present a nonlinear dynamics perspective on generating and connecting gaits for energetically conservative models of legged systems. In particular, we show that the set of conservative gaits constitutes a connected space of locally defined 1D submanifolds in the gait space. These manifolds are coordinate-free parameterized by energy level. We present algorithms for identifying such families of gaits through the use of numerical continuation methods, generating sets and bifurcation points. To this end, we also introduce several details for the numerical implementation. Most importantly, we establish the necessary condition for the Delassus’ matrix to preserve energy across impacts. 
An important application of our work is with simple models of legged locomotion that are often able to capture the complexity of legged locomotion with just a few degrees of freedom and a small number of physical parameters. We demonstrate the efficacy of our framework on a one-legged hopper with four degrees of freedom." "Reduced Euler-Lagrange Equations of Floating-Base Robots: Computation, Properties & Applications","Hrishik Mishra, Gianluca Garofalo, Alessandro Massimo Giordano, Marco De Stefano, Christian Ott, Andreas Kugi","German Aerospace Center (DLR),ABB AB,DLR (German Aerospace Center),TU Wien","Kinematics, Dynamics, and Motion Control","At first glance, a Floating-base Robotic System is a kinematic chain, and its equations of motion are described by the inertia-coupled dynamics of its shape and movable base. However, the dynamics embody an additional structure due to the momentum evolution, which acts as a velocity constraint. In prior works of robot dynamics, matrix transformations of the dynamics revealed a block-diagonal inertia. However, the structure of the transformed matrix of Coriolis/Centrifugal (CC) terms was not examined, and is the primary contribution of this paper. To this end, we simplify the CC terms from robot dynamics and derive the analogous terms from geometric mechanics. Using this interdisciplinary link, we derive a two-part structure of the CC matrix, in which each partition is iteratively computed using a self-evident velocity dependency. Through this CC matrix, we reveal a commutative property, the velocity dependencies of the skew-symmetry property, the invariance of the shape dynamics to the basis of momentum, and the curvature as a matrix operator. Finally, we show the application of the proposed CC matrix structure through controller design and locomotion analysis." 
Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application,"Diego Romeres, Fabio Amadio, Alberto Dalla Libera, Riccardo Antonello, Daniel Nikovski, Ruggero Carli","Mitsubishi Electric research laboratories,Leonardo Labs - IIT,University of Padova,MERL","Kinematics, Dynamics, and Motion Control","In this paper, we present a Model-Based Reinforcement Learning (MBRL) algorithm named Monte Carlo Probabilistic Inference for Learning COntrol (MC-PILCO). The algorithm relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to estimate the policy gradient. This defines a framework in which we ablate the choice of the following components: (i) the selection of the cost function, (ii) the optimization of policies using dropout, (iii) an improved data efficiency through the use of structured kernels in the GP models. The combination of the aforementioned aspects affects dramatically the performance of MC-PILCO. Numerical comparisons in a simulated cart-pole environment show that MC-PILCO exhibits better data efficiency and control performance w.r.t. state-of-the-art GP-based MBRL algorithms. Finally, we apply MC-PILCO to real systems, considering in particular systems with partially measurable states. We discuss the importance of modeling both the measurement system and the state estimators during policy optimization. The effectiveness of the proposed solutions has been tested in simulation and on two real systems, a Furuta pendulum and a ball-and-plate rig." Hybrid Learning of Time-Series Inverse Dynamics Models for Locally Isotropic Robot Motion,"Tolga-Can Çallar, Sven Böttger","Universität zu Lübeck,University of Luebeck","Kinematics, Dynamics, and Motion Control","Applications of force control and motion planning often rely on an inverse dynamics model to represent the high-dimensional dynamic behavior of robots during motion. 
The widespread occurrence of low-velocity, small-scale, locally isotropic motion (LIMO) typically complicates the identification of appropriate models due to the exaggeration of dynamic effects and sensory perturbation caused by complex friction and phenomena of hysteresis, e.g., pertaining to joint elasticity. We propose a hybrid model learning base architecture combining a rigid body dynamics model identified by parametric regression and time-series neural network architectures based on multilayer-perceptron, LSTM, and Transformer topologies. Further, we introduce a novel joint-wise rotational history encoding, reinforcing temporal information to effectively model dynamic hysteresis. The models are evaluated on a KUKA iiwa 14 during algorithmically generated locally isotropic movements. Together with the rotational encoding, the proposed architectures outperform state-of-the-art baselines by a magnitude of 1000 yielding an RMSE of 0.14 Nm. Leveraging the hybrid structure and time-series encoding capabilities, our approach allows for accurate torque estimation, indicating its applicability in critically force-sensitive applications during motion sequences exceeding the capacity of conventional inverse dynamics models while retaining trainability in face of scarce data and explainability due to the physics prior." A Joint Acceleration Estimation Method Based on a High-Order Disturbance Observer,"Jiexin Zhang, Pingyun Nie, Yuhang Chen, Bo Zhang","Shanghaijiaotong university,Shanghai Jiao Tong University","Kinematics, Dynamics, and Motion Control","Joint acceleration feedback is widely used in the design of controllers and observers since joint accelerations reflect the joint dynamics of robots, especially in physical human-robot interaction. However, joint acceleration acquisition is a technical difficulty for robots. The dynamics-based methods can achieve joint acceleration estimation using only a nominal model.
Still, the performance of these methods is limited by the fast time-varying disturbances in the system. This letter proposes a joint acceleration estimation method based on a high-order disturbance observer. This method can observe and compensate for fast time-varying lumped disturbances in the observer while maintaining joint acceleration estimation performance at low frequencies. The finite-time stability of the proposed estimation method is proved using the Lyapunov theory. Simulations and experiments with a lower limb rehabilitation robot are implemented to verify the performance of the proposed method." A Sampling-Based Motion Assignment Strategy with Multi-Performance Optimization for Macro-Micro Robotic System,"Yaohua Zhou, Chin-Yin Chen, Guilin Yang, Yaonan Li","Ningbo Institute of Materials Technology and Engineering,Ningbo Institute of Material Technology and Engineering, CAS,Ningbo Institute of Material Technology and Engineering, Chines,Shenzhen Academy of Robotics","Kinematics, Dynamics, and Motion Control","This article proposes a sampling-based motion assignment strategy for coordinated motion planning of macro-micro robotic systems. It is used to achieve performance enhancements while solving joint trajectories. The sampling strategy is implemented by traversing a series of feasible sets generated by the trajectory constraints of the micro robot. Meanwhile, two kinds of performance index maps are introduced to achieve normalization and integration of multiple performance indices. They are used for iterative generation of trajectories and overall performance evaluation, respectively. Comparative numerical results prove the validity of the proposed strategy."
Offline Programming Guidance for Swarm Steering of Micro/Nano Magnetic Particles in a Dynamic Multichannel Vascular Model,"Myungjin Park, Le Tuan-anh, Jungwon Yoon","Gwangju institute of science and technology,Gwangju Institute of Science and Technology,Gwangju Institutue of Science and Technology",Swarms and Multi Agent Systems,"Magnetically targeted drug delivery (MTD) systems are used in the treatment of various diseases. However, few studies on the targeting of micro-/nano-sized magnetic particles (MPs) inside multi-bifurcations vessel with fluid flow have appeared. Here, we present a user-interface offline programming guidance (OLPG) scheme that controls MPs within a multi-channel dynamic vascular model. The OLPG scheme can simplify the guidance complexity for MTD and overcome the difficulties in real-time sensing of magnetic nanoparticles (MPs). Calibration between real and virtual environments minimizes OLPG errors due to the aggregation properties of the MPs. A Swarm of Aggregated MPs (SAMPs) can be defined experimentally as the equivalent diameter of a single MP. The joystick position is linearly related to the MP magnetic forces of a real electromagnetic actuator. SAMPs were controlled inside the MTD simulator using the joystick and their control commands can be downloaded to the real controller of the in vitro multi-channel vessel model. We performed both simulations and in vitro studies in the multi-channel vascular model. A user guided MPs to the desired locations in ~50% of simulations and ~49.5% of in vitro studies, in the absence of visual feedback. Also, a realistic 3D blood vessel model was simulated to evaluate the feasibility of the OLPG scheme. Our system has a potential to guide in vivo drug delivery." Mean Field Behaviour of Collaborative Multi-Agent Foragers,"Daniel Jarne Ornia, Pedro J. 
Zufiria, Manuel Mazo Jr.","Delft University of Technology,Universidad Politecnica de Madrid",Swarms and Multi Agent Systems,"Collaborative multi-agent robotic systems where agents coordinate by modifying a shared environment often result in undesired dynamical couplings that complicate the analysis and experiments when solving a specific problem or task. Simultaneously, biologically-inspired robotics rely on simplifying agents and increasing their number to obtain more efficient solutions to such problems, drawing similarities with natural processes. In this work we focus on the problem of a biologically-inspired multi-agent system solving collaborative foraging. We show how mean field techniques can be used to re-formulate such a stochastic multi-agent problem into a deterministic autonomous system. This de-couples agent dynamics, enabling the computation of limit behaviours and the analysis of optimality guarantees. Furthermore, we analyse how having finite number of agents affects the performance when compared to the mean field limit and we discuss the implications of such limit approximations in this multi-agent system, which have impact on more general collaborative stochastic problems." Closed-Loop Motion Control of Robotic Swarms – a Tether-Based Strategy,"Kasra Eshaghi, Andrew Rogers, Goldie Nejat, Beno Benhabib",University of Toronto,Swarms and Multi Agent Systems,"Swarm robots can achieve effective task execution via closed-loop motion control. However, such a goal can only be realized through accurate localization of the swarm. Past approaches have focused on addressing this issue using external sensors, static sensor networks, or through active localization – requirements that may restrict the motion of the swarm or may not be achievable in practice. We present a tether-based strategy that achieves closed-loop swarm-motion control by using a secondary team of mobile sensors. 
These sensors form a wireless tether that allows the swarm to indirectly sense a home base or a landmark, and to compensate for the accumulated motion errors via a closed-loop control strategy. The proposed strategy is the first to use a tether of mobile sensors that can dynamically re-shape and re-connect to various points in the environment to achieve closed-loop motion control. The novelty of the strategy is in its ability to adapt to any swarm motion considered, and to be applied to swarms with limited sensing capabilities and knowledge of their environment. The performance of the proposed strategy was validated through extensive experiments." Controlling Collision-Induced Aggregations in a Swarm of Micro Bristle-Robots,"Zhijian Hao, Siddharth Mayya, Gennaro Notomista, Seth Hutchinson, Magnus Egerstedt, Azadeh Ansari","Georgia Institute of Technology,Amazon Robotics,University of Waterloo,University of California, Irvine",Swarms and Multi Agent Systems,"Systematically designing local interaction rules to achieve collective behaviors in robot swarms is a challenging endeavor, especially in micro-robots where size restrictions imply severe sensing, communication, and computation limitations. In such robot swarms, performing useful functions is often preconditioned on the formation of high-density aggregations which can facilitate collective signaling and information sharing. In this paper, we present a systematic approach to control aggregation behaviors by leveraging the physical interactions in a swarm of 300 3-mm vibration-driven micro bristle-robots that we designed and fabricated. We demonstrate the ability to control the degree of aggregation by varying the motility characteristics of the robots through global vibration frequency and amplitude inputs, after comprehensive characterization, modeling and simulation of the locomotion dynamics and robot interactions. 
To quantify the degree of aggregation, we also introduce a new metric, the MIPS index (Motility-Induced Phase Separation index), which unlike many existing methods does not require a scenario-specific tuning of parameters. Our investigations reveal how physics-driven interaction mechanisms can be exploited to achieve desired behaviors in minimally equipped robot swarms and highlight the specific ways in which hardware and software developments aid in the achievement of collision-induced aggregations." Multi-Robot Pickup and Delivery Via Distributed Resource Allocation,"Andrea Camisa, Andrea Testa, Giuseppe Notarstefano",University of Bologna,Swarms and Multi Agent Systems,"In this paper, we consider a large-scale instance of the classical Pickup-and-Delivery Vehicle Routing Problem (PDVRP) that must be solved by a network of mobile cooperating robots. Robots must self-coordinate and self-allocate a set of pickup/delivery tasks while minimizing a given cost figure. This results in a large, challenging Mixed-Integer Linear Problem that must be cooperatively solved without a central coordinator. We propose a distributed algorithm based on a primal decomposition approach that provides a feasible solution to the problem in finite time. An interesting feature of the proposed scheme is that each robot computes only its own block of solution, thereby preserving privacy of sensitive information. The algorithm also exhibits attractive scalability properties that guarantee solvability of the problem even in large networks. To the best of our knowledge, this is the first attempt to provide a scalable distributed solution to the problem. The algorithm is first tested through Gazebo simulations on a ROS 2 platform, highlighting the effectiveness of the proposed solution. Finally, experiments on a real testbed with a team of ground and aerial robots are provided."
Deep Reinforcement Learning for Decentralized Multi-Robot Exploration with Macro Actions,"Aaron Tan, Federico Pizarro Bejarano, Yuhan Zhu, Richard Ren, Goldie Nejat",University of Toronto,Swarms and Multi Agent Systems,"Cooperative multi-robot teams need to be able to explore cluttered and unstructured environments while dealing with communication dropouts that prevent them from exchanging local information to maintain team coordination. Therefore, robots need to consider high-level teammate intentions during action selection. In this letter, we present the first Macro Action Decentralized Exploration Network (MADE-Net) using multi-agent deep reinforcement learning (DRL) to address the challenges of communication dropouts during multi-robot exploration in unseen, unstructured, and cluttered environments. Simulated robot team exploration experiments were conducted and compared against classical and DRL methods where MADE-Net outperformed all benchmark methods in terms of computation time, total travel distance, number of local interactions between robots, and exploration rate across various degrees of communication dropouts. A scalability study in 3D environments showed a decrease in exploration time with MADE-Net with increasing team and environment sizes. The experiments presented highlight the effectiveness and robustness of our method." Time-Inverted Kuramoto Model Meets Lissajous Curves: Multi-Robot Persistent Monitoring and Target Detection,"Manuel Boldrer, Lorenzo Lyons, Luigi Palopoli, Daniele Fontanelli, Laura Ferranti","Delft University of Technology,University of Trento",Swarms and Multi Agent Systems,"This work proposes a distributed strategy to achieve both persistent monitoring and target detection in a rectangular and obstacle-free environment. Each robot has to repeatedly follow a smooth trajectory and avoid collisions with other robots. To achieve this goal, we rely on the time-inverted Kuramoto dynamics and the use of Lissajous curves. 
We analyze the resiliency of the system to perturbations or temporary failures, and we validate our approach through both simulations and experiments on real robotic platforms. In the latter, we adopt Model Predictive Contouring Control as a low level controller to minimize the tracking error while accounting for the robots' dynamical constraints and the control inputs saturation. The results obtained in the experiments are in accordance with the simulations." A Decentralized Multi-Robot Spatio-Temporal Multi-Task Assignment Approach for Perimeter Defense,"Shridhar Velhal, Suresh Sundaram, Sundararajan Narasimman","Indian Institute of Science,Nanyang Technological University",Swarms and Multi Agent Systems,"This paper provides a new decentralized approach to a perimeter defense problem (PDP). In a typical perimeter defense problem, many intruders try to enter a territory, and a group of defenders protects the territory by capturing the intruders on the perimeter. The objective of the defenders is to detect and capture the intruders before they enter the territory. Defenders sense the intruders independently and compute their trajectories to capture all the intruders in a cooperative way. Each intruder is estimated to reach a specific location on the perimeter at a specific time, and this is considered as a spatio-temporal task to be handled by a defender. At any given time, the perimeter defense problem is converted into a Decentralized Multi-Robot Spatio-Temporal Multi-Task Assignment (DMRST-MTA) problem. The cost of executing a task for a defender is defined by a composite cost function that includes both the spatial and temporal cost components. In this paper, a modified decentralized consensus-based bundle algorithm is presented to solve the above spatio-temporal multi-task assignment problem. 
Performance evaluation of the proposed approach is presented based on Monte-Carlo studies." Reinforcement Learned Distributed Multi-Robot Navigation with Reciprocal Velocity Obstacle Shaped Rewards,"Ruihua Han, Shengduo Chen, Shuaijun Wang, Zeqing Zhang, Rui Gao, Qi Hao, Jia Pan","University of Hong Kong,Southern University of Science and Technology,The University of Hong Kong,SOUTHERN UNIVERSITY OF SCIENCE AND TECHNOLOGY",Swarms and Multi Agent Systems,"The challenges to solving the collision avoidance problem lie in adaptively choosing optimal robot velocities in complex scenarios full of interactive obstacles. In this paper, we propose a distributed approach for multi-robot navigation which combines the concept of reciprocal velocity obstacle (RVO) and the scheme of deep reinforcement learning (DRL) to solve the reciprocal collision avoidance problem under limited information. The novelty of this work is threefold: (1) using a set of sequential VO and RVO vectors to represent the interactive environmental states of static and dynamic obstacles, respectively; (2) developing a bidirectional recurrent module based neural network, which maps the states of a varying number of surrounding obstacles to the actions directly; (3) developing an RVO area and expected collision time based reward function to encourage reciprocal collision avoidance behaviors and trade off between collision risk and travel time. The proposed policy is trained through simulated scenarios and updated by the actor-critic based DRL algorithm. We validate the policy in complex environments with various numbers of differential drive robots and obstacles. The experiment results demonstrate that our approach outperforms the state-of-the-art methods and other learning based approaches in terms of the success rate, travel time, and average speed."
Chance-Constrained Iterative Linear-Quadratic Stochastic Games,"Hai Zhong, Yutaka Shimizu, Jianyu Chen","Tsinghua University,TIER IV",Swarms and Multi Agent Systems,"Dynamic game arises as a powerful paradigm for multi-robot planning, for which the safety constraints satisfaction is crucial. Constrained stochastic games are of particular interest, as real-world robots need to operate and satisfy constraints under uncertainty. Existing methods for solving stochastic games handle constraints using soft penalties with hand-tuned weights. However, finding a suitable penalty weight is non-trivial and requires trial and error. In this paper, we propose the chance-constrained iterative linear-quadratic stochastic games (CCILQGames) algorithm. CCILQGames solves chance-constrained stochastic games using the augmented Lagrangian method, with the merit of automatically finding a suitable penalty weight. We evaluate our algorithm in three autonomous driving scenarios, including merge, intersection, and roundabout. Experimental results and Monte Carlo tests show that CCILQGames could generate safe and interactive strategies in stochastic environments." The SLAM Hive Benchmarking Suite,"Yuanyuan Yang, Bowen Xu, Yinjie Li, Soeren Schwertfeger",ShanghaiTech University,Software Tools II,"Benchmarking Simultaneous Localization and Mapping (SLAM) algorithms is important to scientists and users of robotic systems alike. But through their many configuration options in hardware and software, SLAM systems feature a vast parameter space that scientists up to now were not able to explore. The proposed SLAM Hive Benchmarking Suite is able to analyze SLAM algorithms in 1000's of mapping runs, through its utilization of container technology and deployment in a cluster. This paper presents the architecture and open source implementation of SLAM Hive and compares it to existing efforts on SLAM evaluation. 
Furthermore, we highlight the function of SLAM Hive by exploring some open source algorithms on public datasets in terms of accuracy. We compare the algorithms against each other and evaluate how parameters affect not only accuracy but also CPU and memory usage. Through this we show that SLAM Hive can become an essential tool for proper comparisons and evaluations of SLAM algorithms and thus drive the scientific development in the research on SLAM." Discovering Multiple Algorithm Configurations,"Leonid Keselman, Martial Hebert","Carnegie Mellon University,CMU",Software Tools II,"Many practitioners in robotics regularly depend on classic, hand-designed algorithms. Often the performance of these algorithms is tuned across a dataset of annotated examples which represent typical deployment conditions. Automatic tuning of these settings is traditionally known as algorithm configuration. In this work, we extend algorithm configuration to automatically discover multiple modes in the tuning dataset. Unlike prior work, these configuration modes represent multiple dataset instances and are detected automatically during the course of optimization. We propose three methods for mode discovery: a post hoc method, a multi-stage method, and an online algorithm using a multi-armed bandit. Our results characterize these methods on synthetic test functions and in multiple robotics application domains: stereoscopic depth estimation, differentiable rendering, motion planning, and visual odometry. We show the clear benefits of detecting multiple modes in algorithm configuration space."
Aquarium: A Fully Differentiable Fluid-Structure Interaction Solver for Robotics Applications,"Jeong Hun Lee, Mike Yan Michelis, Robert Kevin Katzschmann, Zachary Manchester","Carnegie Mellon University,ETH Zurich",Software Tools II,"We present Aquarium, a differentiable fluid-structure interaction solver for robotics that offers stable simulation, accurately coupled fluid-robot physics in two dimensions, and full differentiability with respect to fluid and robot states and parameters. Aquarium achieves stable simulation with accurate flow physics by directly integrating over the incompressible Navier-Stokes equations using a fully implicit Crank-Nicolson scheme with a second-order finite-volume spatial discretization. The fluid and robot physics are coupled using the immersed-boundary method by formulating the no-slip condition as an equality constraint applied directly to the Navier-Stokes system. This choice of coupling allows the fluid-structure interaction to be posed and solved as a nonlinear optimization problem. This optimization-based formulation is then exploited using the implicit-function theorem to compute derivatives. Derivatives can then be passed to downstream gradient-based optimization or learning algorithms. We demonstrate Aquarium's ability to accurately simulate coupled fluid-robot physics with numerous 2D examples, including a cylinder in free stream and a soft robotic fish tail with hardware validation. We also demonstrate Aquarium's ability to provide analytical gradients by performing gradient-based shape-and-gait optimization of an oscillating diamond foil to maximize its generated thrust." 
Robust Co-Design of Robots Via Cascaded Optimisation,"Akhil Sathuluri, Anand Vazhapilli Sureshbabu, Markus Zimmermann","Technical University of Munich,Technische Universität München",Software Tools II,"Optimising mechanical, control and actuator design variables together as a co-design problem enables identifying novel and better-performing robot architectures. Typically, solving such problems using conventional optimization methods yields a single, point-based solution. Deviating from the computed optima may be necessary to ensure physical feasibility, typically associated with a performance loss. In this work, we present a two-step cascaded optimisation approach to identify non-intuitive designs and recover the loss in performance by constructing a solution space. The solution space provides robustness in the form of permissible ranges of design variable values and enables the selection of a physically feasible design. In our study, we observe (1) up to 20% of the lost performance is recovered and (2) an improvement of 30% on the task metric in comparison to an existing robot and (3) designs with cost savings of up to 10% can be identified." Autotuning Symbolic Optimization Fabrics for Trajectory Generation,"Max Spahn, Javier Alonso-Mora","TU Delft,Delft University of Technology",Software Tools II,"In this paper, we present an automated parameter optimization method for trajectory generation. We formulate parameter optimization as a constrained optimization problem that can be effectively solved using Bayesian optimization. While the approach is generic to any trajectory generation method, we showcase it using optimization fabrics. Optimization fabrics are a geometric trajectory generation method based on non-Riemannian geometry. By symbolically pre-solving the structure of the tree of fabrics, we obtain a parameterized trajectory generator, called symbolic fabrics. We show that autotuned symbolic fabrics reach expert-level performance in a few trials. 
Additionally, we show that tuning transfers across different robots, motion planning problems and between simulation and real world. Finally, we qualitatively showcase that the framework could be used for coupled mobile manipulation." Auto-Assembly: A Framework for Automated Robotic Assembly Directly from CAD,"Fedor Chervinskii, Sergei Zobov, Aleksandr Rybnikov, Danil Petrov, Komal Sai Reddy Vendidandi","Arrival,Micropsi Industries Gmbh,ARRIVAL",Software Tools II,"In this work, we propose a framework called Auto-Assembly for automated robotic assembly from design files and demonstrate a practical implementation on modular parts joined by fastening using a robotic cell consisting of two robots. We show the flexibility of the approach by testing it on different input designs. Auto-Assembly consists of several parts: design analysis, assembly sequence generation, bill-of-process (BOP) generation, conversion of the BOP to control code, path planning, simulation, and execution of the control code to assemble parts in the physical environment." "General, Single-Shot, Target-Less, and Automatic LiDAR-Camera Extrinsic Calibration Toolbox","Kenji Koide, Shuji Oishi, Masashi Yokozuka, Atsuhiko Banno","National Institute of Advanced Industrial Science and Technology,National Institute of Advanced Industrial Science and Technology (AIST),Nat. Inst. of Advanced Industrial Science and Technology,National Instisute of Advanced Industrial Science and Technology",Software Tools II,"This paper presents an open source LiDAR-camera calibration toolbox that is general to LiDAR and camera projection models, requires only one pairing of LiDAR and camera data without a calibration target, and is fully automatic (no manual initial guess is required). For automatic initial guess estimation, we employ the SuperGlue image matching pipeline to find 2D-3D correspondences between LiDAR and camera data and estimate the LiDAR-camera transformation via RANSAC. 
Given the initial guess, we refine the transformation estimate with direct LiDAR-camera registration based on the normalized information distance, a mutual information-based cross-modal distance metric. For a handy calibration process, we also present several assistance capabilities (e.g., dynamic LiDAR data integration and user interface for making 2D-3D correspondence manually). The experimental results show that the proposed toolbox enables calibration of any combination of spinning and non-repetitive scan LiDARs and pinhole and omnidirectional cameras, and shows better calibration accuracy and robustness than those of the state-of-the-art edge-alignment-based calibration method." GaPT: Gaussian Process Toolkit for Online Regression with Application to Learning Quadrotor Dynamics,"Francesco Crocetti, Jeffrey Mao, Alessandro Saviolo, Gabriele Costante, Giuseppe Loianno","University of Perugia,New York University",Software Tools II,"Gaussian Processes (GPs) are expressive models for capturing signal statistics and expressing prediction uncertainty. As a result, the robotics community has gathered interest in leveraging these methods for inference, planning, and control. Unfortunately, despite providing a closed-form inference solution, GPs are non-parametric models that typically scale cubically with the dataset size, hence making them difficult to be used especially on onboard Size, Weight, and Power (SWaP) constrained aerial robots. In addition, the integration of popular libraries with GPs for different kernels is not trivial. In this paper, we propose GaPT, a novel toolkit that converts GPs to their state space form and performs regression in linear time. GaPT is designed to be highly compatible with several optimizers popular in robotics. We thoroughly validate the proposed approach for learning quadrotor dynamics on both single and multiple input GP settings. 
GaPT accurately captures the system behavior in multiple flight regimes and operating conditions, including those producing highly nonlinear effects such as aerodynamic forces and rotor interactions. Moreover, the results demonstrate the superior computational performance of GaPT compared to a classical GP inference approach on both single and multi-input settings, enabling real-time inference speed on embedded platforms used on SWaP-constrained aerial robots, especially for inference on a large number of points." Transferring Implicit Knowledge of Non-Visual Object Properties across Heterogeneous Robot Morphologies,"Gyan Tatiya, Jonathan Francis, Jivko Sinapov","Tufts University,Bosch Center for Artificial Intelligence",Data Sets II,"Humans leverage multiple sensor modalities when interacting with objects and discovering their intrinsic properties. Using the visual modality alone is insufficient for deriving intuition behind object properties (e.g., which of two boxes is heavier), making it essential to consider non-visual modalities as well, such as the tactile and auditory. Whereas robots may leverage various modalities to obtain object property understanding via learned exploratory interactions with objects (e.g., grasping, lifting, and shaking behaviors), challenges remain: the implicit knowledge acquired by one robot via object exploration cannot be directly leveraged by another robot with different morphology, because the sensor models, observed data distributions, and interaction capabilities are different across these different robot configurations. To avoid the costly process of learning interactive object perception tasks from scratch, we propose a multi-stage projection framework for each new robot for transferring implicit knowledge of object properties across heterogeneous robot morphologies. 
We evaluate our approach on the object-property recognition and object-identity recognition tasks, using a dataset containing two heterogeneous robots that perform 7,600 object interactions. Results indicate that knowledge can be transferred across robots, such that a newly-deployed robot can bootstrap its recognition models without exhaustively exploring all objects. We also propose a data augmentation technique and show that this technique improves the generalization of models. We release code, datasets, and additional results, here: https://github.com/gtatiya/Implicit-Knowledge-Transfer." Wild-Places: A Large-Scale Dataset for Lidar Place Recognition in Unstructured Natural Environments,"Joshua Barton Knights, Kavisha Vidanapathirana, Milad Ramezani, Sridha Sridharan, Clinton Fookes, Peyman Moghadam","Queensland University of Technology,CSIRO",Data Sets II,"Many existing datasets for lidar place recognition are solely representative of structured urban environments, and have recently been saturated in performance by deep learning based approaches. Natural and unstructured environments present many additional challenges for the tasks of long-term localization but these environments are not represented in currently available datasets. To address this we introduce Wild-Places, a challenging large-scale dataset for lidar place recognition in unstructured, natural environments. Wild-Places contains eight lidar sequences collected with a handheld sensor payload over the course of fourteen months, containing a total of ∼63K undistorted lidar submaps along with accurate 6D ground truth. Our dataset contains multiple revisits both within and between sequences, allowing for both intra-sequence (i.e., loop closure detection) and inter-sequence (i.e., re-localisation) place recognition. 
We also benchmark several state-of-the-art approaches to demonstrate the challenges that this dataset introduces, particularly the case of long-term place recognition due to natural environments changing over time. Our dataset and code will be available at https://github.com/csiro-robotics/Wild-Places" "On Human Grasping and Manipulation in Kitchens: Automated Annotation, Insights, and Metrics for Effective Data Collection","Sivashanmuganathan Elangovan, Ricardo De Godoy, Felipe Sanches, Ke Wang, Tom White, Patrick Jarvis, Minas Liarokapis","University of Auckland,The University of Auckland,AI Data Innovations,Acumino",Data Sets II,"The advancement in robotic grasping and manipulation has elicited an increased research interest in the development of household robots capable of performing a plethora of complex tasks. These advancements require the shift of robotics research from a laboratory setting to dynamic and unstructured home environments. In this work, we focus on a comprehensive data collection and analysis of key attributes involved in the selection of grasping and manipulation strategies for the successful execution of kitchen tasks. An unprecedented dataset that comprises over 7 hours of high-definition videos that were analyzed to classify more than 10,000 kitchen activities annotated with 24 attributes each has been created. Machine learning techniques were employed to automate the annotation process partially by extracting grasp types, hand, and object information from the videos. The annotated dataset was analyzed using clustering algorithms to identify underlying patterns. This study also identifies key attributes and specific data that require focus during data collection based on inter-subject variability. The insights from this study can be used to improve the speed, quality, and effectiveness of data collection. 
It also helps identify the strategies employed by the humans for the execution of kitchen tasks and transfer the necessary skills to a robotic end-effector enabling it to complete the tasks autonomously or collaborate with humans." Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based Reinforcement Learning,"David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jacob Varley","New York University,Google,University of Pennsylvania",Data Sets II,"We consider how to most efficiently leverage teleoperator time to collect data for learning robust image-based value functions and policies for sparse reward robotic tasks. To accomplish this goal, we modify the process of data collection to include more than just successful demonstrations of the desired task. Instead we develop a novel protocol that we call Visual Backtracking Teleoperation (VBT), which deliberately collects a dataset of visually similar failures, recoveries, and successes. VBT data collection is particularly useful for efficiently learning accurate value functions from small datasets of image-based observations. We demonstrate VBT on a real robot to perform continuous control from image observations for the deformable manipulation task of T-shirt grasping. We find that by adjusting the data collection process we improve the quality of both the learned value functions and policies over a variety of baseline methods for data collection. Specifically, we find that offline reinforcement learning on VBT data outperforms standard behavior cloning on successful demonstration data by 13% when both methods are given equal-sized datasets of 60 minutes of data from the real robot." 
COLA: COarse LAbel Pre-Training for 3D Semantic Segmentation of Sparse LiDAR Datasets,"Jules Sanchez, François Goulette, Jean-Emmanuel Deschaud","Mines Paris - PSL University,MINES ParisTech",Data Sets II,"Transfer learning is a proven technique in 2D computer vision to leverage the large amount of data available and achieve high performance with datasets limited in size due to the cost of acquisition or annotation. In 3D, annotation is known to be a costly task; nevertheless, pre-training methods have only recently been investigated. Due to this cost, unsupervised pre-training has been heavily favored. In this work, we tackle the case of real-time 3D semantic segmentation of sparse autonomous driving LiDAR scans. Such datasets have been increasingly released, but each has a unique label set. We propose here an intermediate-level label set called coarse labels, which can easily be used on any existing and future autonomous driving datasets, thus allowing all the data available to be leveraged at once without any additional manual labeling. This way, we have access to a larger dataset, alongside a simple task of semantic segmentation. With it, we introduce a new pre-training task: coarse label pre-training, also called COLA. We thoroughly analyze the impact of COLA on various datasets and architectures and show that it yields a noticeable performance improvement, especially when only a small dataset is available for the finetuning task." Enhancing the Efficacy of Lower-Body Assistive Devices through the Understanding of Human Movement in the Real World,"Loubna Baroudi, Stephen Cain, Alex Shorter, Kira Barton","University of Michigan,University of Michigan at Ann Arbor",Data Sets II,"In previous studies, researchers have successfully measured walking in healthy able-bodied humans to create safe control strategies for lower body assistive devices. 
Measurements used to establish design requirements often come from testing and evaluation that takes place in laboratory settings during steady-state tasks, where participants often select movement strategies that minimize the cost of transport. However, human walking in these conditions does not necessarily represent the natural behavior of an individual in the real world. In this work, we conducted a study to characterize human walking in the real world. We combined week-scale free-living measurements of gait with in-lab data collection to: 1) quantify the proportion of steady-state walking in a population of healthy able-bodied adults, and 2) evaluate whether this population favors the selection of a range of walking speeds that minimize their cost of transport in the real world. We found that the majority of walking bouts contain mostly transient walking, suggesting that researchers should complement steady-state characterization with non-steady-state tasks. We also found that the most often used steady-state walking speeds for all participants were higher than the range that minimizes cost of transport, suggesting that individuals are influenced by more than energy economy when moving in the real world. Thus, when developing control strategies for these devices, researchers should consider a variety of optimization objectives to adapt for the multifarious situations of daily life." DexGraspNet: A Large-Scale Robotic Dexterous Grasp Dataset for General Objects Based on Simulation,"Ruicheng Wang, Jialiang Zhang, Jiayi Chen, Yinzhen Xu, Puhao Li, Tengyu Liu, He Wang","Peking University,Tsinghua University,Beijing Institute for General Artificial Intelligence",Award Finalists 1,"Robotic dexterous grasping is the first step to enable human-like dexterous object manipulation and thus a crucial robotic technology. 
However, dexterous grasping is much more under-explored than object grasping with parallel grippers, partially due to the lack of a large-scale dataset. In this work, we present a large-scale robotic dexterous grasp dataset, DexGraspNet, generated by our proposed highly efficient synthesis method that can be generally applied to any dexterous hand. Our method leverages a deeply accelerated differentiable force closure estimator and thus can efficiently and robustly synthesize stable and diverse grasps on a large scale. We choose ShadowHand and generate 1.32 million grasps for 5355 objects, covering more than 133 object categories and containing more than 200 diverse grasps for each object instance, with all grasps having been validated by the Isaac Gym simulator. Compared to the previous dataset from Liu et al. generated by GraspIt!, our dataset has not only more objects and grasps, but also higher diversity and quality. Via performing cross-dataset experiments, we show that training several algorithms of dexterous grasp synthesis on our dataset significantly outperforms training on the previous one. To access our data and code, including code for human and Allegro grasp synthesis, please visit our project page: https://pku-epic.github.io/DexGraspNet/." ATTACH Dataset: Annotated Two-Handed Assembly Actions for Human Action Understanding,"Dustin Aganian, Benedict Stephan, Markus Eisenbach, Corinna Stretz, Horst-Michael Gross","Ilmenau University of Technology,University of Technology Ilmenau",Data Sets II,"With the emergence of collaborative robots (cobots), human-robot collaboration in industrial manufacturing is coming into focus. For a cobot to act autonomously and as an assistant, it must understand human actions during assembly. To effectively train models for this task, a dataset containing suitable assembly actions in a realistic setting is crucial. 
For this purpose, we present the ATTACH dataset, which contains 51.6 hours of assembly with 95.2k annotated fine-grained actions monitored by three cameras, which represent potential viewpoints of a cobot. Since in an assembly context workers tend to perform different actions simultaneously with their two hands, we annotated the performed actions for each hand separately. Therefore, in the ATTACH dataset, more than 68% of annotations overlap with other annotations, which is many times more than in related datasets, typically featuring more simplistic assembly tasks. For better generalization with respect to the background of the working area, we did not only record color and depth images, but also used the Azure Kinect body tracking SDK for estimating 3D skeletons of the worker. To create a first baseline, we report the performance of state-of-the-art methods for action recognition as well as action detection on video and skeleton-sequence inputs. The dataset is available at https://www.tu-ilmenau.de/neurob/data-sets-code/attach-dataset." Synthetic-To-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances,"Arun Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil Katyal, Dinesh Manocha, Celso De Melo, Rama Chellappa","Johns Hopkins University,Johns Hopkins University Applied Physics Lab,Georgia Tech,University of Maryland,CCDC US Army Research Laboratory",Data Sets II,"Human action recognition is a challenging problem, particularly when there is high variability in factors such as subject appearance, backgrounds and viewpoint. While deep neural networks (DNNs) have been shown to perform well on action recognition tasks, they typically require large amounts of high-quality labeled data to achieve robust performance across a variety of conditions. 
Synthetic data has shown promise as a way to avoid the substantial costs and potential ethical concerns associated with collecting and labeling enormous amounts of data in the real-world. However, synthetic data may differ from real data in important ways. This phenomenon, known as domain shift, can limit the utility of synthetic data in robotics applications. To mitigate the effects of domain shift, substantial effort is being dedicated to the development of domain adaptation (DA) techniques. Yet, much remains to be understood about how best to develop these techniques. In this paper, we introduce a new dataset called Robot Control Gestures (RoCoG-v2). The dataset is composed of both real and synthetic videos from seven gesture classes, and is intended to support the study of synthetic-to-real domain shift for video-based action recognition. Our work expands upon existing datasets by focusing the action classes on gestures for human-robot teaming, as well as by enabling investigation of domain shift in both ground and aerial views. We present baseline results using state-of-the-art action recognition and domain adaptation algorithms and offer initial insight on tackling the synthetic-to-real and ground-to-air domain shifts. Instructions on accessing the dataset can be found at https://github.com/reddyav1/RoCoG-v2." Robotic Method and Instrument to Efficiently Synthesize Faulty Conditions and Mass-Produce Faulty-Conditioned Data for Rotary Machines,"Yip Fun Yeung, Fangzhou Xia, Juliana Covarrubias, Furokawa Mikio, Hirano Takayuki, Kamal Youcef-Toumi","MIT,Massachusetts Institute of Technology,Japan Steel Works",Data Sets II,"Condition synthesis is vital for generating data for fault detection and diagnosis studies. Traditional methods rely heavily on human labor. This study proposes a robotic method and its instrument to efficiently synthesize faulty conditions and mass-produce data to develop fault detection and diagnosis algorithms. 
The first contribution is the formalization of a new approach called Robotic Condition Synthesis, which shifts the traditionally labor-intensive task of condition synthesis to a robot-based force control task. The second contribution is developing a new robotic manipulator, which is more effective than current lab-grade robots for the tasks involved in the Robotic Condition Synthesis. The third contribution is empirical evidence of the superiority of this new robot in performing the Robotic Condition Synthesis tasks. This study also explores the potential of the new robot by conducting a three-dimensional system identification of a rotordynamic plant, which lays the foundation for more advanced Robotic Condition Synthesis policies in the future." FLYOVER: A Model-Driven Method to Generate Diverse Highway Interchanges for Autonomous Vehicle Testing,"Yuan Zhou, Gengjie Lin, Yun Tang, Kairui Yang, Wei Jing, Ping Zhang, Junbo Chen, Liang Gong, Yang Liu","Nanyang Technological University,Shanghai Jiao Tong University,NANYANG TECHNOLOGICAL UNIVERSITY,Damo Academy, Alibaba Group,Alibaba,Alibaba Group,Shanghai Jiao Tong University",Data Sets II,"It has become a consensus that autonomous vehicles (AVs) will first be widely deployed on highways. However, the complexity of highway interchanges becomes the bottleneck for their deployment. An AV should be sufficiently tested under different highway interchanges, which is still challenging due to the lack of available datasets containing diverse highway interchanges. In this paper, we propose a model-driven method, FLYOVER, to generate a dataset of diverse interchanges with measurable diversity coverage. First, FLYOVER uses a labeled digraph to model interchange topology. Second, FLYOVER takes real-world interchanges as input to guarantee topology practicality and extracts different topology equivalence classes by classifying corresponding topology models. 
Third, for each topology class, FLYOVER identifies the corresponding geometrical features for the ramps and generates concrete interchanges using k-way combinatorial coverage and differential evolution. To illustrate the diversity and applicability of the generated interchange dataset, we test the built-in traffic flow control algorithm in SUMO and the fuel-optimization trajectory tracking algorithm deployed to Alibaba's autonomous trucks on the dataset. The results show that except for the geometrical difference, the interchanges are diverse in throughput and fuel consumption under the traffic flow control and trajectory tracking algorithms, respectively." Towards Multi-Day Field Deployment Autonomy: A Long-Term Self-Sustainable Micro Aerial Vehicle Robot,"Stephen Carlson, Prateek Arora, Tolga Karakurt, Brandon Moore, Christos Papachristos","University of Nevada, Reno,University of Nevada Reno",Environmental Applications,"This work deals with the problem of long-term autonomy in the context of multi-day field deployments of Micro Aerial Vehicle (MAV) systems. To truly depart from the necessity for human intervention for the crucial task of providing battery recharging, and to liberate from the need to operate in a confined range around specially installed infrastructure such as recharging pods, the MAV robot is required to harvest power on its own, but equally importantly also sustain prolonged periods of ambient power scarcity. This implies being able to sustain the battery charge overnight when using solar recharging, or even during multiple days of illumination inadequacy (e.g., due to degraded atmospheric lucidity and heavy overcast). We address this by presenting a Self-Sustainable Autonomous System architecture for MAVs centered around a specially tailored Power Management Stack, which is capable of achieving deep system hibernation, a feature that facilitates the aforementioned functionalities. 
We present a) continuous, b) multi-day successive, and c) externally-powered recharging that uses a legged robot-mounted Mobile Recharging Station. We conclude by demonstrating a challenging zero-intervention multi-day field deployment mission in the N.Nevada region." Stable Station Keeping of Autonomous Sailing Robots Via the Switched Systems Approach for Ocean Observation,"Weimin Qi, Qinbo Sun, Yu Cao, Huihuan Qian","The Chinese University of Hong Kong, Shenzhen,The Chinese Univeristy of Hong Kong, Shenzhen,Huawei Technology,The Chinese University of Hong Kong, Shenzhen",Environmental Applications,"Ocean observation is an emerging field, and sailing robots have several promising features (e.g., long-range sailing, environmental friendliness, energy-saving and low-noise) to perform tasks. In this paper, we define an ocean observation mission in a restricted target area as a station keeping problem. Inspired by an orientation-restricted Dubins path method, the robot keeps sailing and collecting data in a smooth reciprocation, where the trajectories consist of sailing against wind segments and turning downwind parts divided by a goal area and an acceptable area. The upwind sailing segments are of interest for data acquisition. However, the system stability can not be guaranteed during the whole reciprocation especially for sailing outside the goal area. Hereby, we refer to a switched systems approach and propose a desired heading generation scheme to realize safe and stable control in both areas. The stability for subsystems is proved with Lyapunov-like functions. The stable station keeping scheme is verified in both simulation and real experiments. Finally, we completed continuous and effective observation within 50 minutes in the goal area with a radius of 50 meters by a catamaran robot named OceanVoy460." CUREE: A Curious Underwater Robot for Ecosystem Exploration,"Yogesh Girdhar, Nathan Mcguire, Levi Cai, Stewart Jamieson, Seth Mccammon, John E. 
San Soucie, Jessica Eve Todd, Brian Claus, T. Aran Mooney","Woods Hole Oceanographic Institution,Northeastern University,Massachusetts Institute of Technology,Woods Hole Oceanographic Instituttion,MIT",Environmental Applications,"The current approach to exploring and monitoring complex underwater ecosystems, such as coral reefs, is to conduct surveys using diver-held cameras, static cameras, or deploying sensor buoys. These approaches often fail to capture the full variation and complexity of interactions between different reef organisms and their habitat. The CUREE platform presented in this paper provides a unique set of capabilities in the form of robot behaviors and perception algorithms to enable scientists to explore different aspects of an ecosystem. Examples of these capabilities include low-altitude visual surveys, soundscape surveys, habitat characterization, and animal following. We demonstrate these capabilities by describing two field deployments on coral reefs in the US Virgin Islands. In the first deployment, we show that CUREE can identify the preferred habitat type of snapping shrimps in a reef through a combination of a visual survey, habitat characterization, and a soundscape survey. In the second deployment, we demonstrate CUREE's ability to follow arbitrary animals by separately following a barracuda and stingray for several minutes each in midwater and benthic environments, respectively." "Multi-Robot 3D Gas Distribution Mapping: Coordination, Information Sharing and Environmental Knowledge","Chiara Ercolani, Shashank Mahendra Deshmukh, Thomas Laurent Peeters, Alcherio Martinoli",EPFL,Environmental Applications,"Environmental monitoring and mapping operations are an essential tool to combat climate change. An important branch of this domain concerns the construction of reliable gas maps. 
Adaptive navigation strategies coupled with multi-robot systems improve the outcome of an environmental mapping mission by focusing more efficiently on informative areas. This direction is yet to be explored in the context of gas mapping, which presents peculiar challenges due to the hard-to-sense and expensive-to-model nature of the underlying phenomenon. In this paper, we introduce the application of a multi-robot system to a gas mission with severe time constraints. We study the impact of information-based navigation strategies, coupled with increasing levels of coordination among the robots, on information gathering and consequent map reconstruction performance. We also focus on proposing solutions that inject additional knowledge into the system to enhance the final mapping outcome. We tested the strategies through extensive high-fidelity simulation experiments, and we compared the proposed approaches to three relevant baseline methods." L2E: Lasers to Events for 6-DoF Extrinsic Calibration of Lidars and Event Cameras,"Kevin Ta, David Bruggemann, Tim Broedermann, Sakaridis Christos, Luc Van Gool","Waabi,ETH Zurich",Calibration and Identification,"As neuromorphic technology is maturing, its application to robotics and autonomous vehicle systems has become an area of active research. In particular, event cameras have emerged as a compelling alternative to frame-based cameras in low-power and latency-demanding applications. To enable event cameras to operate alongside staple sensors like lidar in perception tasks, we propose a direct, temporally-decoupled extrinsic calibration method between event cameras and lidars. The high dynamic range, high temporal resolution, and low-latency operation of event cameras are exploited to directly register lidar laser returns, allowing information-based correlation methods to optimize for the 6-DoF extrinsic calibration between the two sensors. 
This paper presents the first direct calibration method between event cameras and lidars, removing dependencies on frame-based camera intermediaries and/or highly-accurate hand measurements. Code: https://github.com/kev-in-ta/l2e" Experimental Evaluation of a Method for Improving Experiment Design in Robot Identification,"Stefanie Zimmermann, Martin Enqvist, Svante Gunnarsson, Stig Moberg, Mikael Norrlöf","Linköping University,ABB AB",Calibration and Identification,"The control system of industrial robots is often model-based, and the quality of the model is of high importance. Therefore, a fast and easy-to-use process for finding the model parameters from a combination of prior knowledge and measurement data is required. It has been shown that the experiment design can be improved in terms of short experiment times and an accurate parameter estimate if the robot configurations for the identification experiments are selected carefully. Estimates of the information matrix can be generated based on simulations for a number of candidate configurations, and an optimization problem can be solved for finding the optimal configurations. This work shows that the proposed method for improved experiment design works with a real manipulator, i.e., it is demonstrated that the experiment time is reduced significantly and the accuracy of the parameter estimate can be maintained or reduced if experiments are conducted only in the optimal manipulator configurations. It is also shown that the model improvement is relevant for realizing accurate control. Finally, the experimental data reveals that, in order to further improve the model accuracy, a more advanced model structure is needed for taking into account the commonly present nonlinear transmission stiffness of the robotic joints." 
DEdgeNet: Extrinsic Calibration of Camera and LiDAR with Depth-Discontinuous Edges,"Yiyang Hu, Hui Ma, Leiping Jie, Hui Zhang","Beijing Normal University - Hong Kong Baptist University United ,Hong Kong Baptist University,United International College, BNU-HKBU",Calibration and Identification,"This paper addresses the problem of calibrating extrinsic parameter matrix between an RGB camera and a LiDAR. Multimodal sensing systems are essential for fully autonomous navigation platforms. A key pre-requisite for such a system is calibration between different sensors. As the two most widely equipped sensors, calibration between RGB cameras and LiDARs remains challenging. Existing methods address this problem without using explicit geometric priors. In this paper, we propose a novel real-time network that utilizes depth-discontinuous edges extracted from a single image to calibrate cameras and LiDARs. Our network consists of two key components: (1) a self-supervised edge extraction network named DEdgeNet, which detects depth-discontinuous edges from a single image and extracts corresponding features; (2) prediction of the extrinsic parameter matrix between the camera and the LiDAR by matching fixed features in RGB images and updating depth features in a coarse-to-fine frame. Specifically, considering that edges are rich and common in natural scenes, DEdgeNet simplifies RGB image encoding and extracts fixed edges for feature matching. We conducted extensive experiments on the KITTI-odometry dataset. The results show that our method achieves an average rotation error of 0.028° and an average translation error of 0.247 cm, which demonstrates the superiority of our method." 
Joint Camera Intrinsic and LiDAR-Camera Extrinsic Calibration,"Guohang Yan, Feiyu He, Chunlei Shi, Pengjin Wei, Xinyu Cai, Yikang Li","Shanghai AI Laboratory,Shanghai AI Lab,Southeast University,SHANGHAI JIAO TONG UNIVERSITY,Sensetime Ltd.",Calibration and Identification,"Sensor-based environmental perception is a crucial step for autonomous driving systems, for which an accurate calibration between multiple sensors plays a critical role. For the calibration of LiDAR and camera, the existing method is generally to calibrate the intrinsic of the camera first and then calibrate the extrinsic of the LiDAR and camera. If the camera's intrinsic is not calibrated correctly in the first stage, it isn't easy to calibrate the LiDAR-camera extrinsic accurately. Due to the complex internal structure of the camera and the lack of an effective quantitative evaluation method for the camera's intrinsic calibration, in the actual calibration, the accuracy of extrinsic parameter calibration is often reduced due to the tiny error of the camera's intrinsic parameters. To this end, we propose a novel target-based joint calibration method of the camera intrinsic and LiDAR-camera extrinsic parameters. Firstly, we design a novel calibration board pattern, adding four circular holes around the checkerboard for locating the LiDAR pose. Subsequently, a cost function defined under the reprojection constraints of the checkerboard and circular holes features is designed to solve the camera's intrinsic parameters, distortion factor, and LiDAR-camera extrinsic parameter. In the end, quantitative and qualitative experiments are conducted in actual and simulated environments, and the result shows the proposed method can achieve accuracy and robust performance. The open-source code is available at https://github.com/OpenCalib/JointCalib." 
Online Hand-Eye Calibration with Decoupling by 3D Textureless Object Tracking,"Li Jin, Kang Xie, Wenxuan Chen, Xin Cao, Yuehua Li, Jiachen Li, Jiankai Qian, Xueying Xueying Qin","Shandong university,Shandong University,Zhejiang Lab,Zhejiang University,ShanDong University",Calibration and Identification,"Hand-eye calibration estimates the pose of a camera relative to a robot, which is a fundamental problem for visually guided robots, especially for dynamic object grasping. Most methods use 2D fiducial markers with distinctive visual features, but only a few use 3D objects. We propose a novel hand-eye calibration method using a 3D object on a work site, which can work online and automatically even if the object is textureless or weakly textured. Although the object-to-camera poses obtained by 3D tracking are pretty noisy due to the absence of visual features of objects, we propose an iterative optimization strategy to achieve robust and accurate calibration results without artificial initial values. First, we build a 3D convergence point constraint with multi-view lines of sight of object position to optimize its value. Then, we optimize the hand-eye pose by the closed-loop constraint with a fixed object position. Therefore, the hand-eye pose is disentangled from the object position by alternate iterations of those two constraints. The strategy can robustly estimate the hand-eye pose even with limited textureless object tracking. We also propose a Pose Refinement Network (PR-Net), which improves the accuracy of 3D object tracking, so that the accuracy of the hand-eye pose can be further enhanced. The experiments show that the average error of our hand-eye calibration method is 1.20 degrees and 23.18 mm. The results achieve state-of-the-art performance by using the working object to realize online hand-eye calibration." 
Using the Deflection Center to Auto-Calibrate the Pan-Tilt-Zoom Camera Linearly,"Liu Yu, Hui Zhang","United International College, BNU-HKBU",Calibration and Identification,"This paper addresses the linear auto-calibration problem of a pan-tilt-zoom (PTZ) camera. Unlike existing methods, we take full advantage of the offset of the camera center from the rotation center, which is usually non-negligible in bullet-type PTZ cameras. Without any prior assumption, we propose a linear method to recover all intrinsic parameters. First, we successively acquired at least four images using the zoom and rotation capabilities of the PTZ camera. Second, using the homography of two images at the same location but different scales, the principal point and zoom scalar can be linearly recovered. Finally, based on the unknown offset of the camera center and rotation center, we propose a linear method to solve the scale factor in the Kruppa equation and recover the remaining camera intrinsic parameters, namely focal lengths and skew. Synthetic and real experiments demonstrate the feasibility of our approach." Coordinate Calibration of a Dual-Arm Robot System by Visual Tool Tracking,"Junlei Hu, Dominic Jones, Pietro Valdastri",University of Leeds,Calibration and Identification,"The calibration of a vision-guided dual-arm robotic system, including the robot-robot and hand-eye calibration, requires the tracked positions of markers in different postures. However, in many cases, using markers to calibrate is impractical. Only some markerless features can be obtained rather than the rigid transform matrix; for example, the shaft of a markerless robotic tool can be tracked. Therefore, we proposed a Kronecker-Product-based method to calibrate the dual-arm system with a tracked robotic tool by decoupling the translation and rotation. 
The simulation and experiment results on a da Vinci Research Kit show that the proposed method is robust and accurate under different noise levels and various sample robot movements, compared with two state-of-the-art methods for dual-arm calibration with complete homogeneous transformations." A Graph-Based Optimization Framework for Hand-Eye Calibration for Multi-Camera Setups,"Daniele Evangelista, Emilio Olivastri, Davide Allegro, Emanuele Menegatti, Alberto Pretto","Università degli studi di Padova,University of Padua,University of Padova,The University of Padua",Calibration and Identification,"Hand-eye calibration is the problem of estimating the spatial transformation between a reference frame, usually the base of a robot arm or its gripper, and the reference frame of one or multiple cameras. Generally, this calibration is solved as a non-linear optimization problem, what instead is rarely done is to exploit the underlying graph structure of the problem itself. Actually, the problem of hand-eye calibration can be seen as an instance of the Simultaneous Localization and Mapping (SLAM) problem. Inspired by this fact, in this work we present a pose-graph approach to the hand-eye calibration problem that extends a recent state-of-the-art solution in two different ways: i) by formulating the solution to eye-on-base setups with one camera; ii) by covering multi-camera robotic setups. The proposed approach has been validated in simulation against standard hand-eye calibration methods. Moreover, a real application is shown. In both scenarios, the proposed approach overcomes all alternative methods. We release with this paper an open-source implementation of our graph-based optimization framework for multi-camera setups." 
Fast Extrinsic Calibration for Multiple Inertial Measurement Units in Visual-Inertial System,"Youwei Yu, Yanqing Liu, Fengjie Fu, Sihan He, Dongchen Zhu, Lei Wang, Xiaolin Zhang, Jiamao Li","Shanghai Institute of Microsystem and Information Technology,Shanghai Institute of Microsystem and Information Technology, Ch,Shanghai Institute of Microsystem Information and technology, Ch,Shanghai Institute of Microsystem and Information Technology,Chi,Shanghai Institute of Microsystem And Information Technology,Chi",Calibration and Identification,"In this paper, we propose a fast extrinsic calibration method for fusing multiple inertial measurement units (MIMU) to improve visual-inertial odometry (VIO) localization accuracy. Currently, data fusion algorithms for MIMU highly depend on the number of inertial sensors. Based on the assumption that extrinsic parameters between inertial sensors are perfectly calibrated, the fusion algorithm provides better localization accuracy with more IMUs, while neglecting the effect of extrinsic calibration error. Our method builds two non-linear least-squares problems to estimate the MIMU relative position and orientation separately, independent of external sensors and inertial noises online estimation. Then we give the general form of the virtual IMU (VIMU) method and propose its propagation on manifold. We perform our method on datasets, our self-made sensor board, and board with different IMUs, validating the superiority of our method over competing methods concerning speed, accuracy, and robustness. In the simulation experiment, we show that only fusing two IMUs with our calibration method to predict motion can rival nine IMUs. Real-world experiments demonstrate better localization accuracy of the VIO integrated with our calibration method and VIMU propagation on manifold." 
Completely Rational SO(n) Orthonormalization,"Wu Jin, Soheil Sarabandi, Jianhao Jiao, Huaiyang Huang, Bohuan Xue, Ruoyu Geng, Lujia Wang, Ming Liu","UESTC,IRI (CSIC-UPC),The Hong Kong University of Science and Technology,the Hong Kong University of Science and Technology,HKUST,HONG KONG UNIVERSITY OF SCIENCE AND TECHNOLOGY,The Hong Kong University of Technology,Hong Kong University of Science and Technology",Calibration and Identification,"The rotation orthonormalization on the special orthogonal group SO(n), also known as the high dimensional nearest rotation problem, has been revisited. A new generalized simple iterative formula has been proposed that solves this problem in a completely rational manner. Rational operations allow for efficient implementation on various platforms and also significantly simplify the synthesis of large-scale circuitization. The developed scheme is also capable of designing efficient fundamental rational algorithms, for example, quaternion normalization, which outperforms long-existing solvers. Furthermore, an SO(n) neural network has been developed for further learning purposes in the rotation group. Simulation results verify the effectiveness of the proposed scheme and show the superiority against existing representatives. Applications show that the proposed orthonormalizer is of potential in robotic pose estimation problems, e.g., hand-eye calibration." An Active Learning Based Robot Kinematic Calibration Framework Using Gaussian Processes,"Ersin Das, Joel Burdick","Caltech,California Institute of Technology",Calibration and Identification,"Future NASA lander missions to icy moons will require completely automated, accurate, and data efficient calibration methods for the robot manipulator arms that sample icy terrains in the lander's vicinity. To support this need, this paper presents a Gaussian Process (GP) approach to the classical manipulator kinematic calibration process. 
Instead of identifying a corrected set of Denavit-Hartenberg kinematic parameters, a set of GPs models the residual kinematic error of the arm over the workspace. More importantly, this modeling framework allows a Gaussian Process Upper Confident Bound (GP-UCB) algorithm to efficiently and adaptively select the calibration's measurement points so as to minimize the number of experiments, and therefore minimize the time needed for recalibration. The method is demonstrated in simulation on a simple 2-DOF arm, a 6 DOF arm whose geometry is a candidate for a future NASA mission, and a 7 DOF Barrett WAM arm." Identification of a Generalized Base Inertial Parameter Set of Robotic Manipulators Considering Mounting Configurations,"Mario Troebinger, Abdeldjallil Naceri, Xiao Chen, Hamid Sadeghian, Sami Haddadin",Technical University of Munich,Calibration and Identification,"Identifying the inertial parameters of real robotic manipulators is a fundamental step towards realistic modeling and better controller performances, which is crucial for safe human-robot interaction. Our work introduces a novel framework for identifying a generalized set of base inertial parameters of a serial link manipulator. This framework is designed to be adaptable to accommodate any new mounting configuration of the robot. Our theoretical analysis highlights the influence of the robot's mounting configuration on the emergence of new parameters that cannot be identified through the conventional vertical base-axis mounting approach studied previously. To validate our proposed framework, we carried out two main experiments: the first involved simulation to establish the feasibility of our concept, and in the second, our framework was employed on a Franka Emika Robot in a real-world scenario to demonstrate and validate our approach. 
Our simulation results confirmed the feasibility of our proposed framework, while our real-world experiment successfully identified the generalized base inertial parameter set and validated its applicability to a new robot mounting configuration." "Open-Vocabulary, Queryable Scene Representations for Real World Planning","Boyuan Chen, Fei Xia, Brian Ichter, Kanishka Rao, Keerthana Gopalakrishnan, Michael S Ryoo, Austin Stone, Daniel Kappler","Massachusetts Institute of Technology,Google Inc,Google Brain,Google,Google, Stony Brook University,X (Google)",AI-Enabled Robotics,"Large language models (LLMs) have unlocked new capabilities of task planning from human instructions. However, prior attempts to apply LLMs to real-world robotic tasks are limited by the lack of grounding in the surrounding scene. In this paper, we develop NLMap, an open-vocabulary and queryable scene representation to address this problem. NLMap serves as a framework to gather and integrate contextual information into LLM planners, allowing them to see and query available objects in the scene before generating a context-conditioned plan. NLMap first establishes a natural language queryable scene representation with Visual Language models (VLMs). An LLM based object proposal module parses instructions and proposes involved objects to query the scene representation for object availability and location. An LLM planner then plans with such information about the scene. NLMap allows robots to operate without a fixed list of objects nor executable options, enabling real robot operation that cannot be achieved by previous methods. 
Project website: https://nlmap-saycan.github.io" ProgPrompt: Generating Situated Robot Task Plans Using Large Language Models,"Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg","University of Southern California,NVIDIA,Stanford Univesity,Nvidia,University of Washington,USC Viterbi School of Engineering,University of Toronto",AI-Enabled Robotics,"Task planning can require defining myriad domain knowledge about the world in which a robot needs to act. To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even generate action sequences directly, given an instruction in natural language with no additional domain information. However, such methods either require enumerating all possible next steps for scoring, or generate free-form text that may contain actions not possible on a given robot in its current context. We present a programmatic LLM prompt structure that enables plan generation functional across situated environments, robot capabilities, and tasks. Our key insight is to prompt the LLM with program-like specifications of the available actions and objects in an environment, as well as with example programs that can be executed. We make concrete recommendations about prompt structure and generation constraints through ablation experiments, demonstrate state of the art success rates in VirtualHome household tasks, and deploy our method on a physical robot arm for tabletop tasks. 
Website at https://progprompt.github.io/" Guiding Reinforcement Learning with Shared Control Templates,"Abhishek Padalkar, Gabriel Quere, Franz Steinmetz, Antonin Raffin, Matthias Nieuwenhuisen, João Silvério, Freek Stulp","German Aerospace Center, Institute of Robotics and Mechatronics,,DLR,German Aerospace Center (DLR),Fraunhofer Institute for Communication, Information Processing a,German Aerospace Center,DLR - Deutsches Zentrum für Luft- und Raumfahrt e.V.",AI-Enabled Robotics,"Purposeful interaction with objects usually requires certain constraints to be respected, e.g. keeping a bottle upright to avoid spilling. In reinforcement learning, such constraints are typically encoded in the reward function. As a consequence, constraints can only be learned by violating them. This often precludes learning on the physical robot, as it may take many trials to learn the constraints, and the necessity to violate them during the trial-and-error learning may be unsafe. We have serendipitously discovered that constraint representations for shared control – in particular Shared Control Templates (SCTs) – are ideally suited for safely guiding RL. Representing constraints explicitly, rather than implicitly in the reward function, also simplifies the design of the reward function. The main advantage of the approach is safer, faster learning without constraint violations (even with sparse reward functions). We demonstrate this in a pouring task in simulation and on a real robot, where learning the task requires only 65 episodes in 16 minutes." Anticipatory Planning: Improving Long-Lived Planning by Estimating Expected Cost of Future Tasks,"Roshan Dhakal, Gregory Stein, Md Ridwan Hossain Talukder",George Mason University,AI-Enabled Robotics,"We consider a service robot in a household environment given a sequence of high-level tasks one at a time. 
Most existing task planners, lacking knowledge of what they may be asked to do next, solve each task in isolation and so may unwittingly introduce side effects that make subsequent tasks more costly. In order to reduce the overall cost of completing all tasks, we consider that the robot must anticipate the impact its actions could have on future tasks. Thus, we propose anticipatory planning: an approach in which estimates of the expected future cost, from a graph neural network, augment model-based task planning. Our approach guides the robot towards behaviors that encourage preparation and organization, reducing overall costs in long-lived planning scenarios. We evaluate our method on blockworld environments and show that our approach reduces the overall planning costs by 5% as compared to planning without anticipatory planning. Additionally, if given an opportunity to prepare the environment in advance (a special case of anticipatory planning), our planner improves overall cost by 11%." Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement,"Zirui Zhao, Wee Sun Lee, David Hsu",National University of Singapore,AI-Enabled Robotics,"We present a new method, PARsing And visual GrOuNding (ParaGon), for grounding natural language in object placement tasks. Natural language generally describes objects and spatial relations with compositionality and ambiguity, two major obstacles to effective language grounding. For compositionality, ParaGon parses a language instruction into an object-centric graph representation to ground objects individually. For ambiguity, ParaGon uses a novel particle-based graph neural network to reason about object placements with uncertainty. Essentially, ParaGon integrates a parsing algorithm into a probabilistic, data-driven learning framework. It is fully differentiable and trained end-to-end from data for robustness against complex, ambiguous language input." 
Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification,"Jiayi Pan, Glen Chou, Dmitry Berenson",University of Michigan,AI-Enabled Robotics,"To make robots accessible to a broad audience, it is critical to endow them with the ability to take universal modes of communication, like commands given in natural language, and extract a concrete desired task specification, defined using a formal language like linear temporal logic (LTL). In this paper, we present a learning-based approach for translating from natural language commands to LTL specifications with very limited human-labeled training data. This is in stark contrast to existing natural-language to LTL translators, which require large human-labeled datasets, often in the form of labeled pairs of LTL formulas and natural language commands, to train the translator. To reduce reliance on human data, our approach generates a large synthetic training dataset through algorithmic generation of LTL formulas, conversion to structured English, and then exploiting the paraphrasing capabilities of modern large language models (LLMs) to synthesize a diverse corpus of natural language commands corresponding to the LTL formulas. We use this generated data to finetune an LLM and apply a constrained decoding procedure at inference time to ensure the returned LTL formula is syntactically correct. We evaluate our approach on three existing LTL/natural language datasets and show that we can translate natural language commands at 75% accuracy with far less human data (" Improving the Generalizability of Trajectory Prediction Models with Frenét-Based Domain Normalization,"Luyao Ye, Zikang Zhou, Jianping Wang",City University of Hong Kong,AI-Enabled Robotics,"Predicting the future trajectories of robots' nearby objects plays a pivotal role in applications such as autonomous driving. 
While learning-based trajectory prediction methods have achieved remarkable performance on public benchmarks, the generalization ability of these approaches remains questionable. The poor generalizability on unseen domains, as a well-recognized defect of data-driven approaches, can potentially harm the real-world performance of trajectory prediction models. We are thus motivated to improve models’ generalization ability instead of merely pursuing high accuracy on average. Due to the lack of benchmarks for quantifying the generalization ability of trajectory predictors, we first construct a new benchmark called argoverse-shift, where the distributions of training and testing data are significantly different. Using this benchmark for evaluation, we identify that the domain shift problem seriously hinders the generalization of trajectory predictors since state-of-the-art approaches suffer from severe performance degradation when facing those out-of-distribution scenes. To enhance the robustness of models against domain shift, we propose a plug-and-play strategy for domain normalization in trajectory prediction. Our strategy utilizes the Frenét coordinate frame for modeling and can effectively narrow the domain gap of different scenes that is caused by the variety of road geometry and topology. Experiments show that our approach noticeably boosts the prediction performance of the state-of-the-art in domains that were previously unseen to the models and thereby improves the generalization ability of data-driven trajectory prediction methods." An Open Approach to Energy-Efficient Autonomous Mobile Robots,"Liangkai Liu, Ren Zhong, Aaron Willcock, Nathan Fisher, Weisong Shi","Wayne State University,wayne state university,University of Delaware",AI-Enabled Robotics,"Autonomous mobile robots (AMRs) have the capability to execute a wide range of tasks with minimal human intervention. 
However, one of the major limitations of AMRs is their limited battery life, which often results in interruptions to their task execution and the need to reach the nearest charging station. Optimizing energy consumption in AMRs has become a critical challenge in their deployment. Through empirical studies on real AMRs, we have identified a lack of coordination between computation and control as a major source of energy inefficiency. In this paper, we propose a comprehensive energy prediction model that provides real-time energy consumption for each component of the AMR. Additionally, we propose three path models to address the obstacle avoidance problem for AMRs. To evaluate the performance of our energy prediction and path models, we have developed a customized AMR called Donkey, which has the capability for fine-grained (millisecond-level) end-to-end power profiling. Our energy prediction model demonstrated an accuracy of over 90% in our evaluations. Finally, we applied our energy prediction model to obstacle avoidance and guided energy-efficient path selection, resulting in up to a 44.8% reduction in energy consumption compared to the baseline." Grounding Language with Visual Affordances Over Unstructured Data,"Oier Mees, Jessica Borja Diaz, Wolfram Burgard","University of Freiburg,University of Technology Nuremberg",Award Finalists 2,"Recent works have shown that Large Language Models (LLMs) can be applied to ground natural language to a wide variety of robot skills. However, in practice, learning multi-task, language-conditioned robotic skills typically requires large-scale data collection and frequent human intervention to reset the environment or help correcting the current policies. 
In this work, we propose a novel approach to efficiently learn general-purpose language-conditioned robot skills from unstructured, offline and reset-free data in the real world by exploiting a self-supervised visuo-lingual affordance model, which requires annotating as little as 1% of the total data with language. We evaluate our method in extensive experiments both in simulated and real-world robotic tasks, achieving state-of-the-art performance on the challenging CALVIN benchmark and learning over 25 distinct visuomotor manipulation tasks with a single policy in the real world. We find that when paired with LLMs to break down abstract natural language instructions into subgoals via few-shot prompting, our method is capable of completing long-horizon, multi-tier tasks in the real world, while requiring an order of magnitude less data than previous approaches. Code and videos are available at http://hulc2.cs.uni-freiburg.de" Gaka-Chu: A Self-Employed Autonomous Robot Artist,"Eduardo Castello, Ivan Berman, Aleksandr Kapitonov, Vadim Manaenko, Makar Cherniaev, Pavel Tarasov","Indra Digital Labs,M,M Economy, MerkleBot Inc.,M,M Economy, Merklebot Inc.,M,M Economy, Inc. (""Merklebot""), San Francisco, CA, USA",AI-Enabled Robotics,"The physical autonomy of robots is well understood both theoretically and practically. By contrast, there is almost no research exploring their potential economic autonomy. In this paper, we present the first economically autonomous robot---a robot able to produce marketable goods while having full control over the use of its generated income. Gaka-chu (""painter"" in Japanese) is a 6-axis robot arm that creates paintings of Japanese characters from an autoselected keyword. By using a blockchain-based smart contract, Gaka-chu can autonomously list a painting it made for sale in an online auction. In this transaction, the robot interacts with the human bidders as a peer not as a tool. 
Using the blockchain-based smart contract, Gaka-chu can then use its income from selling paintings to replenish its resources by autonomously ordering materials from an online art shop. We built the Gaka-chu prototype with an Ethereum-based smart contract and ran a 6-month long experiment, during which the robot created and sold four paintings, simultaneously using its income to purchase supplies and repay initial investors. In this work, we present the results of the experiments conducted and discuss the implications of economically autonomous robots." LEARNEST: LEARNing Enhanced Model-Based State ESTimation for Robots Using Knowledge-Based Neural Ordinary Differential Equations,"Kong Yao Chee, M. Ani Hsieh",University of Pennsylvania,AI-Enabled Robotics,"State estimation is an important aspect in many robotics applications. In this work, we consider the task of obtaining accurate state estimates for robotic systems by enhancing the dynamics model used in state estimation algorithms. Existing frameworks such as moving horizon estimation (MHE) and the unscented Kalman filter (UKF) provide the flexibility to incorporate nonlinear dynamics and measurement models. However, this implies that the dynamics model within these algorithms has to be sufficiently accurate in order to warrant the accuracy of the state estimates. To enhance the dynamics models and improve the estimation accuracy, we utilize a deep learning framework known as knowledge-based neural ordinary differential equations (KNODEs). The KNODE framework embeds prior knowledge into the training procedure and synthesizes an accurate hybrid model by fusing a prior first-principles model with a neural ordinary differential equation (NODE) model. In our proposed LEARNEST framework, we integrate the data-driven model into two novel model-based state estimation algorithms, which are denoted as KNODE-MHE and KNODE-UKF. 
These two algorithms are compared against their conventional counterparts across a number of robotic applications: state estimation for a cartpole system using partial measurements, localization for a ground robot, as well as state estimation for a quadrotor. Through simulations and tests using real-world experimental data, we demonstrate the versatility and efficacy of the proposed learning-enhanced state estimation framework." A Joint Modeling of Vision-Language-Action for Target-Oriented Grasping in Clutter,"Kechun Xu, Shuqi Zhao, Zhongxiang Zhou, Zizhang Li, Huaijin Pi, Yifeng Zhu, Yue Wang, Rong Xiong","Zhejiang University,The University of Texas at Austin",AI-Enabled Robotics,"We focus on the task of language-conditioned grasping in clutter, in which a robot is supposed to grasp the target object based on a language instruction. Previous works separately conduct visual grounding to localize the target object, and generate a grasp for that object. However, these works require object labels or visual attributes for grounding, which calls for handcrafted rules in the planner and restricts the range of language instructions. In this paper, we propose to jointly model vision, language and action with an object-centric representation. Our method is applicable under more flexible language instructions, and not limited by visual grounding error. Besides, by utilizing the powerful priors from the pre-trained multi-modal model and grasp model, sample efficiency is effectively improved and the sim2real problem is relieved without additional data for transfer. A series of experiments carried out in simulation and the real world indicate that our method can achieve a better task success rate with fewer motions under more flexible language instructions. Moreover, our method is capable of generalizing better to scenarios with unseen objects and language instructions." 
A Virtual Reality Framework for Fast Dataset Creation Applied to Cloth Manipulation with Automatic Semantic Labelling,"Julia Borras Sol, Arnau Boix-granell, Sergi Foix, Carme Torras","Institut de Robòtica i Informàtica Industrial (CSIC-UPC),CSIC-UPC,CSIC - UPC",Virtual Reality and Interfaces,"Teaching complex manipulation skills, such as folding garments, to a bi-manual robot is a very challenging task, which is often tackled through learning from demonstration. The few datasets of garment-folding demonstrations available nowadays to the robotics research community have been either gathered from human demonstrations or generated through simulation. The former have the great difficulty of perceiving both cloth state and human action as well as transferring them to the dynamic control of the robot, while the latter require coding human motion into the simulator in open loop, i.e., without incorporating the visual feedback naturally used by people, resulting in far-from-realistic movements. In this article, we present an accurate dataset of human cloth folding demonstrations. The dataset is collected through our novel virtual reality (VR) framework, based on Unity’s 3D platform and the use of an HTC Vive Pro system. The framework is capable of simulating realistic garments while allowing users to interact with them in real time through handheld controllers. By doing so, and thanks to the immersive experience, our framework permits exploiting human visual feedback in the demonstrations while at the same time getting rid of the difficulties of capturing the state of cloth, thus simplifying data acquisition and resulting in more realistic demonstrations. We create and make public a dataset of cloth manipulation sequences, whose cloth states are semantically labeled in an automatic way by using a novel low-dimensional cloth representation that yields a very good separation between different cloth configurations." 
Skill-Based Robot Programming in Mixed Reality with Ad-Hoc Validation Using a Force-Enabled Digital Twin,"Jan Krieglstein, Gesche Held, Balázs András Bálint, Frank Naegele, Werner Kraus",Fraunhofer IPA,Virtual Reality and Interfaces,"Skill-based programming has proven to be advantageous for assembly tasks, but still requires expert knowledge, especially for force-controlled applications. However, it is error-prone due to the multitude of parameters, e.g. different coordinate frames and either position-, velocity- or force-controlled motions on the axes of a frame. We propose a mixed reality based solution, which systematically visualizes the geometric constraints of advanced high-level skills directly in the real-world robotic environment and provides a user interface to create applications efficiently and safely in mixed reality. Therefore, state-machine information is also visualized, and a holographic digital twin allows the user to ad-hoc validate the program via force-enabled simulation. The approach is evaluated on a top hat rail mounting task, proving the capability of the system to handle advanced assembly programming tasks efficiently and tangibly." "A Virtual Reality Planning Environment for High-Risk, High-Latency Teleoperation","Will Pryor, Liam Wang, Arko Chatterjee, Balazs Vagvolgyi, Anton Deguet, Simon Leonard, Louis Whitcomb, Peter Kazanzides","Johns Hopkins University,The Johns Hopkins University",Virtual Reality and Interfaces,"Teleoperation of robots in space is challenging due to high latency and limited workspace visibility. Previously, the Interactive Planning and Supervised Execution (IPSE) and Augmented Virtuality systems were developed to reduce failure risk. These tools were visualized on a 3D da Vinci surgical console and operated using the da Vinci manipulators or visualized on conventional monitors and operated with a keyboard and mouse. Experimental studies indicated operator preference for the latter. 
In this work, we develop a 3D virtual reality (VR) interface for IPSE, implemented on a Meta Quest 2 head-mounted display (HMD), and evaluate it against the prior 2D, keyboard-and-mouse-based interface. The results demonstrate improved operator load with the 3D VR interface, with no decrease in task performance, while also providing cost and portability benefits compared to the conventional 2D interface." Avatarm: An Avatar with Manipulation Capabilities for the Physical Metaverse,"Alberto Villani, Giovanni Cortigiani, Bernardo Brogi, Nicole D'Aurizio, Tommaso Lisini Baldi, Domenico Prattichizzo","University of Siena,University of Siena, Istituto Italiano di Tecnologia",Virtual Reality and Interfaces,"The metaverse is an immersive shared space that remote users can access through virtual and augmented reality interfaces, enabling their avatars to interact with each other and their surroundings. Although digital objects can be manipulated, physical objects cannot be touched, grasped, or moved within the metaverse due to the lack of a suitable interface. This work proposes a solution to overcome this limitation by introducing the concept of a Physical Metaverse enabled by a new interface named “Avatarm”. The Avatarm consists of an avatar enhanced with a robotic arm that performs physical manipulation tasks while remaining entirely hidden in the metaverse. The users have the illusion that the avatar is directly manipulating objects without mediation by a robot. The Avatarm is the first step towards a new metaverse, the “Physical Metaverse,” where users can physically interact with each other and with the environment." Interacting with Multi-Robot Systems Via Mixed Reality,"Florian Kennel-Maushart, Roi Poranne, Stelian Coros","ETHZ,ETH Zurich",Virtual Reality and Interfaces,"Mobile robots are becoming safer and more affordable, and their presence in the workspace is increasing. 
However, many tasks that involve reasoning, long-term planning or human preferences are still hard to automate. While some solutions in specialised areas slowly emerge, an alternative to full autonomy can be to actively leverage intuition and experience of human operators. To do this, suitable interfaces and modes of interaction have to be explored. Inspired by Real-Time Strategy games, we implement a Mixed Reality interface that can be used with either a Microsoft HoloLens 2 headset or a tablet. The interface allows users to interact with multiple mobile robots simultaneously. We conduct a user study to compare the headset and tablet versions of the interface in different scenarios inspired by a real-world construction setting. We show that, while performance and preference of interface are dependent on the task and the complexity of the required interaction, users are able to solve non-trivial tasks on both platforms using our system." PointCloudLab: An Environment for 3D Point Cloud Annotation with Adapted Visual Aids and Levels of Immersion,"Achref Doula, Tobias Güdelhöfer, Andrii Matviienko, Max Mühlhäuser, Alejandro Sanchez Guinea","Technical University of Darmstadt,Technische Universität Darmstadt,TU Darmstadt",Virtual Reality and Interfaces,"The annotation of 3D point cloud datasets is an expensive and tedious task. To optimize the annotation process, recent works have proposed the use of environments with higher levels of immersion in combination with different types of visual aids. However, two problems remain unresolved. First, the proposed environments limit the user to a unique level of immersion and a fixed hardware setup. Second, their design overlooks the interaction effects between the level of immersion and the visual aids on the quality of the annotation process. 
To address these issues, we propose PointCloudLab, an environment for 3D point cloud annotation that allows the use of different levels of immersion that work in combination with visual aids. Using PointCloudLab, we conducted a controlled experiment (N=20) to investigate the effects of levels of immersion and visual aids on the annotation process. Our findings reveal that higher levels of immersion combined with object-based visual aids lead to a faster and more accurate annotation. Furthermore, we found significant interaction effects between the levels of immersion and the visual aids on the accuracy of the annotation." Augmented Reality-Assisted Robot Learning Framework for Minimally Invasive Surgery Task,"Junling Fu, Maria Chiara Palumbo, Elisa Iovene, Qingsheng Liu, Ilaria Burzo, Alberto Redaelli, Giancarlo Ferrigno, Elena De Momi","Politecnico di Milano,Ocean University of China",Virtual Reality and Interfaces,"This paper presents an Augmented Reality (AR)-assisted robot learning framework for a Minimally Invasive Surgery (MIS) task. The proposed framework exploits an external optical tracking system to collect human demonstration. Gaussian Mixture Model (GMM) and Gaussian Mixture Regression (GMR) are utilized to encode and generate a robust desired trajectory for transferring to the real robot for the MIS task. The HoloLens 2 Head-Mounted-Display (HMD) is integrated for intuitive visualization of the robot configuration under the constraint of a small incision on the patient’s abdominal cavity during the demonstration phase. Experiments are conducted to verify the feasibility and performance of the proposed framework and compared it with the kinesthetic teaching-based modality in a tumor resection MIS task. The results illustrate that the proposed AR-assisted robot learning framework requires lower workload demand, achieves higher performance and efficiency, and ensures the feasibility of the learned results for reproduction on a real robot for MIS tasks." 
Intuitive Robot Integration Via Virtual Reality Workspaces,"Minh Tram, Joseph Cloud, William Beksi","University of Texas at Arlington,University of Texas at Arlington, NASA Kennedy Space Center",Virtual Reality and Interfaces,"As robots become increasingly prominent in diverse industrial settings, the desire for an accessible and reliable system has correspondingly increased. Yet, the task of meaningfully assessing the feasibility of introducing a new robotic component, or adding more robots into an existing infrastructure, remains a challenge. This is due to both the logistics of acquiring a robot and the need for expert knowledge in setting it up. In this paper, we address these concerns by developing a purely virtual simulation of a robotic system. Our proposed framework enables natural human-robot interaction through a visually immersive representation of the workspace. The main advantages of our approach are the following: (i) independence from a physical system, (ii) flexibility in defining the workspace and robotic tasks, and (iii) an intuitive interaction between the operator and the simulated environment. Not only does our system provide an enhanced understanding of 3D space to the operator, but it also encourages a hands-on way to perform robot programming. We evaluate the effectiveness of our method in applying novel automation assignments by training a robot in virtual reality and then executing the task on a real robot." Reconstructing Objects In-The-Wild for Realistic Sensor Simulation,"Ze Yang, Siva Manivasagam, Yun Chen, Jingkang Wang, Rui Hu, Raquel Urtasun","University of Toronto,UBER ATG R&D,Uber",Simulation and Sim2Real,"Reconstructing objects from real world data and rendering them at novel views is critical to bringing realism, diversity and scale to simulation for robotics training and testing. 
In this work, we present NeuSim, a novel approach that estimates accurate geometry and realistic appearance from sparse in-the-wild data captured at distance and at limited viewpoints. Towards this goal, we represent the object surface as a neural signed distance function and leverage both LiDAR and camera sensor data to reconstruct smooth and accurate geometry and normals. We model the object appearance with a robust physics-inspired reflectance representation effective for in-the-wild data. Our experiments show that NeuSim has strong view synthesis performance on challenging scenarios with sparse training views. Furthermore, we showcase composing NeuSim assets into a virtual world and generating realistic multi-sensor data for evaluating self-driving perception models." Real-Time Event Simulation with Frame-Based Cameras,"Andreas Ziegler, Daniel Teigland, Jonas Tebbe, Thomas Gossard, Andreas Zell",University of Tübingen,Simulation and Sim2Real,"Event cameras are becoming increasingly popular in robotics and computer vision due to their beneficial properties, e.g., high temporal resolution, high bandwidth, almost no motion blur, and low power consumption. However, these cameras remain expensive and scarce in the market, making them inaccessible to the majority. Using event simulators minimizes the need for real event cameras to develop novel algorithms. However, due to the computational complexity of the simulation, the event streams of existing simulators cannot be generated in real-time but rather have to be pre-calculated from existing video sequences or pre-rendered and then simulated from a virtual 3D scene. Although these offline generated event streams can be used as training data for learning tasks, all response time dependent applications cannot benefit from these simulators yet, as they still require an actual event camera. 
This work proposes simulation methods that improve the performance of event simulation by two orders of magnitude (making them real-time capable) while remaining competitive in the quality assessment." PCGen: Point Cloud Generator for LiDAR Simulation,"Chenqi Li, Yuan Ren, Bingbing Liu","University of Toronto,Noah's Ark Lab, Huawei Technologies Canada Inc,Huawei Technologies",Simulation and Sim2Real,"Data is a fundamental building block for LiDAR perception systems. Unfortunately, real-world data collection and annotation is extremely costly & laborious. Recently, real data based LiDAR simulators have shown tremendous potential to complement real data, due to their scalability and high-fidelity compared to graphics engine based methods. Before simulation can be deployed in the real-world, two shortcomings need to be addressed. First, existing methods usually generate data which are more noisy and complete than the real point clouds, due to 3D reconstruction error and pure geometry-based raycasting method. Second, prior works on simulation for object detection focus solely on rigid objects, like cars, but Vulnerable Road User (VRU)s, like pedestrians, are important road participants. To tackle the first challenge, we propose First Peak Averaging (FPA) raycasting and surrogate model raydrop. FPA enables the simulation of both point cloud coordinates and sensor features, while taking into account reconstruction noise. The ray-wise surrogate raydrop model mimics the physical properties of LiDAR’s laser receiver to determine whether a simulated point would be recorded by a real LiDAR. With minimal training data, the surrogate model can generalize to different geographies and scenes, closing the domain gap between raycasted and real point clouds. To tackle the simulation of deformable VRU simulation, we employ Skinned Multi-Person Linear model (SMPL) dataset to provide a pedestrian simulation baseline and compare the domain gap between CAD and reconstructed objects. 
Applying our pipeline to perform novel sensor synthesis, results show that object detection models trained on simulated data can achieve results similar to those of a model trained on real data." Differentiable Dynamics Simulation Using Invariant Contact Mapping and Damped Contact Force,"Minji Lee, Jeongmin Lee, Dongjun Lee",Seoul National University,Simulation and Sim2Real,"The gradient of typical differentiable simulation is uninformative for two reasons: 1) non-smoothness in contact dynamics is not considered properly, and 2) excessive local minima are generated by the smoothing procedure. To tackle this issue, we first propose differentiable contact dynamics with an invariant contact set and coordinate differentiation using a signed distance function (SDF). Also, to eliminate the undesirable jittering caused by the smoothing procedure, which induces extra local minima, and to achieve a smooth and informative gradient, we further endow our framework with a novel damped contact model. Various optimization problems are implemented to demonstrate the usefulness and efficacy of our differentiable framework." M-EMBER: Tackling Long-Horizon Mobile Manipulation Via Factorized Domain Transfer,"Bohan Wu, Roberto Martín-martín, Fei-Fei Li","Stanford University,University of Texas at Austin",Award Finalists 1,"In this paper, we propose a novel method to create visuomotor mobile manipulation solutions to long-horizon activities. We propose to leverage the recent advances in robot simulation to train robust visual solutions in simulation that can transfer to the real world. While previous works have shown success applying this procedure to autonomous visual navigation and stationary manipulation, applying it to long-horizon visuomotor mobile manipulation is still an open challenge that demands both perceptual and compositional generalization of multiple skills. 
In this work, we develop M-EMBER, a factorized method that decomposes a long-horizon mobile manipulation activity into a repertoire of primitive visual skills, reinforcement-learns each skill in simulation, and composes these skills into a long-horizon mobile manipulation activity. On a real mobile manipulation robot, we find that M-EMBER completes a long-horizon household activity, cleaning_kitchen, achieving over 50% success rate. This requires successfully planning and executing five factorized, learned visual skills, in sequences up to 48 skills long." Sim2Real^2: Actively Building Explicit Physics Model for Precise Articulated Object Manipulation,"Liqian Ma, Jiaojiao Meng, Shuntao Liu, Weihang Chen, Jing Xu, Rui Chen","Tsinghua University,Beijing University of posts and Telecommunications,AVIC Chengdu Aircraft Industrial (Group) Co.",Simulation and Sim2Real,"Accurately manipulating articulated objects is a challenging yet important task for real robot applications. In this paper, we present a novel framework called Sim2Real^2 to enable the robot to manipulate an unseen articulated object to the desired state precisely in the real world with no human demonstrations. We leverage recent advances in physics simulation and learning-based perception to build the interactive explicit physics model of the object and use it to plan a long-horizon manipulation trajectory to accomplish the task. However, the interactive model cannot be correctly estimated from a static observation. Therefore, we learn to predict the object affordance from a single-frame point cloud, control the robot to actively interact with the object with a one-step action, and capture another point cloud. Further, the physics model is constructed from the two point clouds. 
Experimental results show that our framework achieves about 70% manipulations with" A Generic Power Wheelchair Lumped Model in the Sagittal Plane: Towards Realistic Self-Motion Perception in a Virtual Reality Simulator,"Fabien Grzeskowiak, Ronan Le Breton, Louise Devigne, François Pasteau, Marie Babel, Sylvain Guegan","INRIA - Rennes,UNIV-RENNES - INSA Rennes,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes - Rehabilitation Cente,INSA Rennes / IRISA Rainbow Team,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes,INSA Rennes",Simulation and Sim2Real,"This paper presents a generic power wheelchair dynamic model. As a first contribution, this paper proposes to use a generic model composed of a geometric model and a lumped model in order to be compliant with a wide range of existing commercially available wheelchairs. In this model, a set of essential parameters is enough to accurately replicate the dynamic behavior of a wheelchair. As a second contribution, this paper presents an identification method for an n-wheel type power wheelchair. The presented model is restricted to the sagittal plane only, which is sufficient to study the reliability of the identification and validation methods. Moreover, a Motion Cueing Algorithm based on the proposed model controls a simulator mechanical platform. The generic model was then validated through a user study with 18 able-bodied participants evaluating the self-motion perception with our multisensory power wheelchair driving simulator. Results show that the simplified model is sufficient to provide accurate sensations to the user with respect to their experience while driving a power wheelchair." 
"FRIDA: A Collaborative Robot Painter with a Differentiable, Real2Sim2Real Planning Environment","Peter Schaldenbrand, James Mccann, Jean Oh",Carnegie Mellon University,Award Finalists 4,"Painting is an artistic process of rendering visual content that achieves the high-level communication goals of an artist that may change dynamically throughout the creative process. In this paper, we present a Framework and Robotics Initiative for Developing Arts (FRIDA) that enables humans to produce paintings on canvases by collaborating with a painter robot using simple inputs such as language descriptions or images. FRIDA introduces several technical innovations for computationally modeling a creative painting process. First, we develop a fully differentiable simulation environment for painting, adopting the idea of real to simulation to real (real2sim2real). We show that our proposed simulated painting environment is higher fidelity to reality than existing simulation environments used for robot painting. Second, to model the evolving dynamics of a creative process, we develop a planning approach that can continuously optimize the painting plan based on the evolving canvas with respect to the high-level goals. In contrast to existing approaches where the content generation process and action planning are performed independently and sequentially, FRIDA adapts to the stochastic nature of using paint and a brush by continually re-planning and re-assessing its semantic goals based on its visual perception of the painting progress. We describe the details on the technical approach as well as the system integration. FRIDA software is open source." 
SAMLoc: Structure-Aware Constraints with Multi-Task Distillation for Long-Term Visual Localization,"Jian Ning, Yunzhou Zhang, Xinge Zhao, Sonya Coleman, Kunmo Li, Dermot Kerr","Northeastern University,University of Ulster",Localization and Learning,"Real-time and robust long-term visual localization is a crucial technology for autonomous driving. Season and illumination variance make this problem more challenging. At present, most high-performing visual localization algorithms cannot run in real-time on devices with limited computing resources. In this paper, we propose SAMLoc, a structure-aware and self-supervised visual localization system, for fast and robust 6-DoF localization. To obtain structural features in the scene, we propose local and global structure-aware constraints using edge information. Then, we integrate the structure-aware constraints into the hierarchical localization network of multi-task distillation, which significantly reduces the feature extraction time while ensuring localization accuracy. As a result, real-time and robust large-scale localization can be achieved on mobile devices. Experimental results on public datasets show that our system can achieve high localization accuracy and has satisfactory real-time performance. Compared with several state-of-the-art visual localization systems, our framework achieves a competitive localization performance." Energy-Based Models for Cross-Modal Localization Using Convolutional Transformers,"Alan Wu, Michael S Ryoo","Indiana University Bloomington, MIT Lincoln Laboratory,Google, Stony Brook University",Localization and Learning,"We present a novel framework using Energy-Based Models (EBMs) for localizing a ground vehicle mounted with a range sensor against satellite imagery in the absence of GPS. Lidar sensors have become ubiquitous on autonomous vehicles for describing their surrounding environment. Map priors are typically built using the same sensor modality for localization purposes. 
However, these map building endeavors using range sensors are often expensive and time-consuming. Alternatively, we leverage the use of satellite images as map priors, which are widely available, easily accessible, and provide comprehensive coverage. We propose a method using convolutional transformers that performs accurate metric-level localization in a cross-modal manner, which is challenging due to the drastic difference in appearance between the sparse range sensor readings and the rich satellite imagery. We train our model end-to-end and demonstrate our approach achieving higher accuracy than the state-of-the-art on KITTI, Pandaset, and a custom dataset." Boosting 3D Point Cloud Registration by Transferring Multi-Modality Knowledge,"Mingzhi Yuan, Xiaoshui Huang, Kexue Fu, Zhihao Li, Manning Wang","Fudan University,Shanghai AI Laboratory,Fudan university",Localization and Learning,"The recent multi-modality models have achieved great performance in many vision tasks because the extracted features contain the multi-modality knowledge. However, most of the current registration descriptors have only concentrated on local geometric structures. This paper proposes a method to boost point cloud registration accuracy by transferring the multi-modality knowledge of a pre-trained multi-modality model to a new descriptor neural network. Unlike previous multi-modality methods that require both modalities, the proposed method only requires point clouds during inference. Specifically, we propose an ensemble descriptor neural network combining a pre-trained sparse convolution branch and a new point-based convolution branch. By fine-tuning on single-modality data, the proposed method achieves new state-of-the-art results on 3DMatch and competitive accuracy on 3DLoMatch and KITTI." 
Local_INN: Implicit Map Representation and Localization with Invertible Neural Networks,"Zirui Zang, Hongrui Zheng, Johannes Betz, Rahul Mangharam","University of Pennsylvania,Technical University of Munich",Localization and Learning,"Robot localization is an inverse problem of finding a robot's pose using a map and sensor measurements. In recent years, Invertible Neural Networks (INNs) have been successfully applied to solving ambiguous inverse problems in various fields. This paper proposes a framework that solves the localization problem with INN. We design an INN that provides implicit map representation in the forward path and localization in the inverse path. By sampling the latent space in evaluation, Local_INN outputs robot poses with covariance, which can be used to estimate the uncertainty. We show that the localization performance of Local_INN is on par with current methods with much lower latency. We show detailed map reconstruction from Local_INN using poses exterior to the training set. We also provide a global localization algorithm using Local_INN to tackle the kidnapping problem." Combining Scene Coordinate Regression and Absolute Pose Regression for Visual Relocalization,"Jiahao Ruan, Li He, Yisheng Guan, Hong Zhang","Guangdong University of Technology,Southern University of Science and Technology,SUSTech",Localization and Learning,"Visual relocalization is a fundamental problem in computer vision and robotics. Recently, regression-based methods have become popular and can be categorized into two classes: absolute pose regression and scene coordinate regression. In this work, we present a combined regression network that jointly learns scene coordinate regression and absolute pose regression for single-image visual relocalization. The proposed network is composed of a feature encoder and two regression branches with uncertainty modeling. 
In particular, we design a deep feature conditioning module, aiming at propagating the coarse pose information in absolute pose regression to inform the predictions in scene coordinate regression. The proposed network is trained in an end-to-end fashion to learn both regression tasks. Moreover, we propose an uncertainty-driven RANSAC algorithm that incorporates the predicted scene coordinates and their uncertainties to solve the camera pose during inference. To the best of our knowledge, this work is the first to combine scene coordinate regression and pose regression in a hierarchical framework for visual relocalization. Experiments on indoor and outdoor benchmarks demonstrate the effectiveness and the superiority of the proposed method over the state-of-the-art methods." A Consistency-Based Loss for Deep Odometry through Uncertainty Propagation,"Hamed Damirchi, Rooholla Khorrambakht, Hamid D. Taghirad, Behzad Moshiri","K. N. Toosi University of Technology,New York University,K.N.Toosi University of Technology,University of Tehran",Localization and Learning,"Conventionally, deep odometry networks use objective functions that only penalize short-term deviations from the true path. Since such an objective does not impose any constraints on the long-term deviations from the path, a second consistency-based loss term may be added to lower long-term drift. However, maintaining a balance between the two loss terms is challenging and often treated as a design hyperparameter. To mitigate this balancing issue, we propose to use the uncertainty over both odometry and the long-term transformations in a maximum likelihood setting and allow the network to tune the weighting between the two loss terms. To this end, we derive the odometry uncertainty alongside the pose outputs using the network itself and to derive the covariance matrix over the integrated transformation, we propose to propagate the odometry uncertainty through each iteration. 
This formulation provides an adaptive and statistically consistent method to weigh the incremental and integrated loss terms against each other, noting the increase in uncertainty as more steps are integrated over. We show that our approach to consistency-based losses allows the network to surpass the accuracy of the state-of-the-art visual odometry approaches. Then, the efficacy of the derived uncertainty as a weighting medium is visualized and the performance benefits of uncertainty quantification are shown in a pose-graph based localization scenario." Slice Transformer and Self-Supervised Learning for 6DoF Localization in 3D Point Cloud Maps,"Muhammad Ibrahim, Naveed Akhtar, Saeed Anwar, Michael Wise, Ajmal Mian","University of Western Australia,KFUPM",Localization and Learning,"Precise localization is critical for autonomous vehicles. We present a self-supervised learning method that employs Transformers for the first time for the task of outdoor localization using LiDAR data. We propose a pre-text task that reorganizes the slices of a 360 degree LiDAR scan to leverage its axial properties. Our model, called Slice Transformer, employs multi-head attention while systematically processing the slices. To the best of our knowledge, this is the first instance of leveraging multi-head attention for outdoor point clouds. We additionally introduce the Perth-WA dataset, which provides a large-scale LiDAR map of Perth city in Western Australia, covering 4 km square area. Localization annotations are provided for Perth-WA. The proposed localization method is thoroughly evaluated on Perth-WA and Apollo-SouthBay datasets. We also establish the efficacy of our self-supervised learning approach for the common downstream task of object classification using ModelNet40 and ScanNN datasets." 
AANet: Aggregation and Alignment Network with Semi-Hard Positive Sample Mining for Hierarchical Place Recognition,"Feng Lu, Lijun Zhang, Shuting Dong, Baifan Chen, Chun Yuan","Tsinghua University,Chongqing Institute of Green and Intelligent Technology, CAS; Un,Central South University",Localization and Learning,"Visual place recognition (VPR) is one of the research hotspots in robotics, which uses visual information to locate robots. Recently, the hierarchical two-stage VPR methods have become popular in this field due to the trade-off between accuracy and efficiency. These methods retrieve the top-k candidate images using the global features in the first stage, then re-rank the candidates by matching the local features in the second stage. However, they usually require additional algorithms (e.g. RANSAC) for geometric consistency verification in re-ranking, which is time-consuming. Here we propose a Dynamically Aligning Local Features (DALF) algorithm to align the local features under spatial constraints. It is significantly more efficient than the methods that need geometric consistency verification. We present a unified network capable of extracting global features for retrieving candidates via an aggregation module and aligning local features for re-ranking via the DALF alignment module. We call this network AANet. Meanwhile, many works use the simplest positive samples in triplet for weakly supervised training, which limits the ability of the network to recognize harder positive pairs. To address this issue, we propose a Semi-hard Positive Sample Mining (ShPSM) strategy to select appropriate hard positive images for training more robust VPR networks. Extensive experiments on four benchmark VPR datasets show that the proposed AANet can outperform several state-of-the-art methods with less time consumption. The code is released at https://github.com/Lu-Feng/AANet." Can Machines Garden? Systematically Comparing the AlphaGarden vs. 
Professional Horticulturalists,"Simeon Oluwafunmilore Adebola, Rishi Parikh, Mark Presten, Satvik Sharma, Shrey Aeron, Ananth Rao, Sandeep Mukherjee, Tomson Qu, Tina Wistrom, Eugen Solowjow, Ken Goldberg","University of California, Berkeley,University of California Berkeley,University of California, Berkeley, Rausser College of Natural R,Siemens Corporation,UC Berkeley",Award Finalists 2,"The AlphaGarden is an automated testbed for indoor polyculture farming which combines a first-order plant simulator, a gantry robot, a seed planting algorithm, plant phenotyping and tracking algorithms, irrigation sensors and algorithms, and custom pruning tools and algorithms. In this paper, we systematically compare the performance of the AlphaGarden to professional horticulturalists on the staff of the UC Berkeley Oxford Tract Greenhouse. The humans and the machine tend side-by-side polyculture gardens with the same seed arrangement. We compare performance in terms of canopy coverage, plant diversity, and water consumption. Results from two 60-day cycles suggest that the automated AlphaGarden performs comparably to professional horticulturalists in terms of coverage and diversity, and reduces water consumption by as much as 44%. Code, videos, and datasets are available at: https://sites.google.com/berkeley.edu/systematiccomparison" On Domain-Specific Pre-Training for Effective Semantic Perception in Agricultural Robotics,"Gianmarco Roggiolani, Federico Magistri, Tiziano Guadagnino, Jan Weyler, Giorgio Grisetti, Cyrill Stachniss, Jens Behley","University of Bonn,Sapienza University of Rome",Agricultural Robotics and Automation II,"Agricultural robots have the prospect to enable more efficient and sustainable agricultural production of food, feed, and fiber. Perception of crop and weed is a central component of agricultural robots that aim to monitor fields and assess the plants as well as their growth stage in an automatic manner. 
Semantic perception mostly relies on deep learning using supervised approaches, which require time and qualified workers to label fairly large amounts of data. In this paper, we look into the problem of reducing the amount of labels without compromising the final segmentation performance. For robots operating in the field, pre-training networks in a supervised way is already a popular method to reduce the number of required labeled images. We investigate the possibility of pre-training in a self-supervised fashion using data from the target domain. To better exploit this data, we propose a set of domain-specific augmentation strategies. We evaluate our pre-training on semantic segmentation and leaf instance segmentation, two important tasks in our domain. The experimental results suggest that pre-training with domain-specific data paired with our data augmentation strategy leads to superior performance compared to commonly used pre-trainings. Furthermore, the pre-trained networks obtain similar performance to the fully supervised with less labeled data." Semantic Keypoint Extraction for Scanned Animals Using Multi-Depth-Camera Systems,"Raphael Falque, Teresa A. Vidal-Calleja, Alen Alempijevic",University of Technology Sydney,Agricultural Robotics and Automation II,"Keypoint annotation in pointclouds is an important task for 3D reconstruction, object tracking and alignment, in particular in deformable or moving scenes. In the context of agriculture robotics, it is a critical task for livestock automation to work toward condition assessment or behaviour recognition. In this work, we propose a novel approach for semantic keypoint annotation in pointclouds, by reformulating the keypoint extraction as a regression problem of the distance between the keypoints and the rest of the pointcloud. We use the distance on the pointcloud manifold mapped into a radial basis function (RBF), which is then learned using an encoder-decoder architecture. 
Special consideration is given to the data augmentation specific to multi-depth-camera systems by considering noise over the extrinsic calibration and camera frame dropout. Additionally, we investigate computationally efficient non-rigid deformation methods that can be applied to animal pointclouds. Our method is tested on data collected in the field, on moving beef cattle, with a calibrated system of multiple hardware-synchronised RGB-D cameras. Keywords: 3D deep learning, keypoints annotation, multi-depth-camera systems, livestock" Grasp Planning with CNN for Log-Loading Forestry Machine,"Elie Ayoub, Patrick Levesque, Inna Sharf","McGill University,FPInnovations",Agricultural Robotics and Automation II,"Log loading constitutes a key operation in timber harvesting, and despite the recent spike of interest in introducing automation to the forestry sector, efficient and intelligent grasping of logs remains unresolved. This paper presents a grasp planning pipeline that relies on the identification of logs' characteristics and pose in the environment of a log-loading machine, to generate high quality grasps. The proposed pipeline involves replicating identified logs in a virtual environment where grasp planning is carried out by using a convolutional neural network and a virtual depth camera. The network relies solely on depth information and the virtual camera can be positioned at a strategically selected location or to follow a certain trajectory to enhance exposure of the logs, all this without having to move the log-loader's crane. The grasp planning pipeline is evaluated through simulated grasping trials and experiments on a large-scale log-loading test-bed with several configurations of wood logs ranging from a single to multiple logs. The grasp planning pipeline proved to be successful with a grasping rate of 98.33% in the simulated trials and 96.67% in the experimental trials. 
The grasp planner was able to overcome log characterization and localization uncertainties, thus allowing the log-loader to pick individual logs, and multiple logs at once when possible." A Hybrid Cable-Driven Robot for Non-Destructive Leafy Plant Monitoring and Mass Estimation Using Structure from Motion,"Gerry Chen, Venkata Harsh Suhith Muriki, Andrew Sharkey, Cedric Pradalier, Yongsheng Chen, Frank Dellaert","Georgia Institute of Technology,GeorgiaTech Lorraine",Agricultural Robotics and Automation II,"We propose a novel hybrid cable-based robot with manipulator and camera for high-accuracy, medium-throughput plant monitoring in a vertical hydroponic farm and, as an example application, demonstrate non-destructive plant mass estimation. Plant monitoring with high temporal and spatial resolution is important to both farmers and researchers to detect anomalies and develop predictive models for plant growth. The availability of high-quality, off-the-shelf structure-from-motion (SfM) and photogrammetry packages has enabled a vibrant community of roboticists to apply computer vision for non-destructive plant monitoring. While existing approaches tend to focus on either high-throughput (e.g. satellite, unmanned aerial vehicle (UAV), vehicle-mounted, conveyor-belt imagery) or high-accuracy/robustness to occlusions (e.g. turn-table scanner or robot arm), we propose a middle-ground that achieves high accuracy with a medium-throughput, highly automated robot. Our design pairs the workspace scalability of a cable-driven parallel robot (CDPR) with the dexterity of a 4 degree-of-freedom (DoF) robot arm to autonomously image many plants from a variety of viewpoints. We describe our robot design and demonstrate it experimentally by collecting daily photographs of 54 plants from 64 viewpoints each. 
We show that our approach can produce scientifically useful measurements, operate fully autonomously after initial calibration, and produce better reconstructions and plant property estimates than those of over-canopy methods (e.g. UAV). As example applications, we show that our system can successfully estimate plant mass with a Mean Absolute Error (MAE) of 0.586g and, when used to perform hypothesis testing on the relationship between mass and age, produces p-values comparable to ground-truth data (p=0.0020 and p=0.0016, respectively)." Optimal Multi-Robot Coverage Path Planning for Agricultural Fields Using Motion Dynamics,"Jahid Chowdhury Choton, Pavithra Prabhakar",Kansas State University,Agricultural Robotics and Automation II,"Coverage path planning (CPP) is the task of computing an optimal path within a region to completely scan or survey the area of interest by using robotic sensor footprints. In this work, we propose a novel approach to find the multi-robot optimal coverage path of an agricultural field using motion dynamics while minimizing the mission time. Our approach consists of three steps: (i) divide the agricultural field into convex polygonal areas to optimally distribute them among the robots, (ii) generate an optimal coverage path to ensure minimum coverage time for each of the polygonal areas, and (iii) generate the trajectory for each coverage path using Dubins motion dynamics. Several experiments and simulations were performed to check the validity and feasibility of our approach, and the results and limitations are discussed." 
CropNav: A Framework for Autonomous Navigation in Real Farms,"Mateus Valverde Gasparino, Vitor Akihiro Hisano Higuti, Arun Narenthiran Sivakumar, Andres Eduardo Baquero Velasquez, Marcelo Becker, Girish Chowdhary","University of Illinois at Urbana-Champaign,EarthSense Inc.,University of Illinois at Urbana Champaign,Earthsense,USP",Agricultural Robotics and Automation II,"Small robots that can operate under the plant canopy can enable new possibilities in agriculture. However, unlike larger autonomous tractors, autonomous navigation for such under canopy robots remains an open challenge because Global Navigation Satellite System (GNSS) is unreliable under the plant canopy. We present a hybrid navigation system that autonomously switches between different sets of sensing modalities to enable full field navigation, both inside and outside of crop. By choosing the appropriate path reference source, the robot can accommodate loss of GNSS signal quality and leverage row-crop structure to autonomously navigate. However, such switching can be tricky and difficult to execute at scale. Our system provides a solution by automatically switching between an exteroceptive sensing based system, such as Light Detection And Ranging (LiDAR) row-following navigation and waypoints path tracking. In addition, we show how our system can detect when navigation fails and recover automatically, extending the autonomous time and mitigating the necessity of human intervention. Our system shows an improvement of about 750 m per intervention over GNSS-based navigation and 500 m over row following navigation." 
Tendon-Driven Soft Robotic Gripper with Integrated Ripeness Sensing for Blackberry Harvesting,"Alex Qiu, Claire Young, Anthony Gunderman, Milad Azizkhani, Yue Chen, Ai-Ping Hu","Georgia Institute of Technology,Georgia Institute of Technology,Georgia Tech Research Institute",Agricultural Robotics and Automation II,"Growing global demand for food, coupled with continuing labor shortages, motivates the need for automated agricultural harvesting. While some specialty crops (e.g., apples, peaches, blueberries) can be harvested via existing harvesting modalities, fruits such as blackberries and raspberries require delicate handling to mitigate fruit damage that could significantly impact marketability. This motivates the development of soft robotic solutions that enable efficient, delicate harvesting. This paper presents the design, fabrication and feasibility testing of a tendon-driven soft gripping system focused on blackberries, which are a fragile fruit susceptible to post-harvest damage. The gripper is both low-cost and small form factor, allowing for the integration of a micro-servo for tendon retraction, a near-infrared (NIR) based blackberry ripeness sensor utilizing the reflectance modality for identifying fully ripe blackberries, and an endoscopic camera for visual servoing. The gripper was used to harvest 139 berries with manual positioning in two separate field tests. Field testing found an average retention force of 2.06 N and 6.08 N for ripe and unripe blackberries, respectively. Sensor tests identified an average reflectance of 16.78 and 21.70 for ripe and unripe blackberries, respectively, indicating a clear distinction between the two ripeness levels. Finally, the soft robotic gripper was integrated onto a UR5 robot arm and successfully harvested fifteen artificial blackberries in a lab setting using visual servoing." Motion Planning for a Climbing Robot with Stochastic Grasps,"Stephanie Newdick, Nitin Ongole, Tony G. 
Chen, Edward Schmerling, Mark Cutkosky, Marco Pavone",Stanford University,Space Robotics,"ReachBot is a robot that uses extendable and retractable booms as limbs to move around unpredictable environments such as martian caves. Each boom is capped by a microspine gripper designed for grasping rocky surfaces. Motion planning for ReachBot must be versatile to accommodate variable terrain features and robust to mitigate risks from the stochastic nature of grasping with spines. In this paper, we introduce a graph traversal algorithm to select a discrete sequence of grasps based on available terrain features suitable for grasping. This discrete plan is complemented by a decoupled motion planner that considers the alternating phases of body movement and end-effector movement, using a combination of sampling-based planning and sequential convex programming to optimize individual phases. We use our motion planner to plan a trajectory across a simulated 2D cave environment with at least 90% probability of success and demonstrate improved robustness over a baseline trajectory. Finally, we use a simplified prototype to verify a body movement trajectory generated by our motion planning algorithm." RAMP: Reaction-Aware Motion Planning of Multi-Legged Robots for Locomotion in Microgravity,"Warley F. R. Ribeiro, Kentaro Uno, Masazumi Imai, Koki Murase, Kazuya Yoshida",Tohoku University,Award Finalists 3,"Robotic mobility in microgravity is necessary to expand human utilization and exploration of outer space. Bio-inspired multi-legged robots are a possible solution for safe and precise locomotion. However, a dynamic motion of a robot in microgravity can lead to failures due to gripper detachment caused by excessive motion reactions. We propose a novel Reaction-Aware Motion Planning (RAMP) to improve locomotion safety in microgravity, decreasing the risk of losing contact with the terrain surface by reducing the robot's momentum change. 
RAMP minimizes the swing momentum with a Low-Reaction Swing Trajectory (LRST) while distributing this momentum to the whole body, ensuring zero velocity for the supporting grippers and minimizing motion reactions. We verify the proposed approach with dynamic simulations indicating the capability of RAMP to generate a safe motion without detachment of the supporting grippers, resulting in the robot reaching its specified location. We further validate RAMP in experiments with an air-floating system, demonstrating a significant reduction in reaction forces and improved mobility in microgravity." Risk-Aware Path Planning Via Probabilistic Fusion of Traversability Prediction for Planetary Rovers on Heterogeneous Terrains,"Masafumi Endo, Tatsunori Taniai, Ryo Yonetani, Genya Ishigami","Keio University,OMRON SINIC X Corporation,OMRON SINIC X",Space Robotics,"Machine learning (ML) plays a crucial role in assessing traversability for autonomous rover operations on deformable terrains but suffers from inevitable prediction errors. Especially for heterogeneous terrains where the geological features vary from place to place, erroneous traversability prediction can become more apparent, increasing the risk of unrecoverable rover's wheel slip and immobilization. In this work, we propose a new path planning algorithm that explicitly accounts for such erroneous prediction. The key idea is the probabilistic fusion of distinctive ML models for terrain type classification and slip prediction into a single distribution. This gives us a multimodal slip distribution accounting for heterogeneous terrains and further allows statistical risk assessment to be applied to derive risk-aware traversing costs for path planning. Extensive simulation experiments have demonstrated that the proposed method is able to generate more feasible paths on heterogeneous terrains compared to existing methods." 
A Gravity Compensation Strategy for On-Ground Validation of Orbital Manipulators,"Marco De Stefano, Ria Vijayan, Andreas Stemmer, Ferdinand Elhardt, Christian Ott","German Aerospace Center (DLR),DLR - German Aerospace Center,Deutsches Zentrum für Luft- und Raumfahrt e. V. (DLR),TU Wien",Space Robotics,"The on-ground validation of orbital manipulators is a challenging task because the robot is designed for a gravity-free operational environment, but it is validated under the effect of gravity. As a consequence, joint torque limits can be easily reached in certain configurations when gravity is actively compensated by the joints. Hence, the workspace for on-ground testing is restricted. In this paper, an optimal strategy is proposed for achieving gravity compensation of an orbital manipulator arm on ground. The strategy minimizes the joint torques acting on the manipulator by solving an optimization problem and it computes the necessary forces to be tracked by an external carrier. Hence, full gravity compensation is achieved for the orbital manipulator. Experimental results validate the effectiveness of the method on the DLR CAESAR space robot, which uses a cable suspended system as external carrier to track the desired gravity compensation force, resulting from the proposed method." Towards Bridging the Space Domain Gap for Satellite Pose Estimation Using Event Sensing,"Mohsi Jawaid, Ethan Elms, Yasir Latif, Tat-Jun Chin","The University of Adelaide,University of Adelaide",Space Robotics,"Deep models trained using synthetic data require domain adaptation to bridge the gap between the simulation and target environments. State-of-the-art domain adaptation methods often demand sufficient amounts of (unlabelled) data from the target domain. However, this need is difficult to fulfil when the target domain is an extreme environment, such as space. 
In this paper, our target problem is close proximity satellite pose estimation, where it is costly to obtain images of satellites from actual rendezvous missions. We demonstrate that event sensing offers a promising solution to generalise from the simulation to the target domain under stark illumination differences. Our main contribution is an event-based satellite pose estimation technique, trained purely on synthetic event data with basic data augmentation to improve robustness against practical (noisy) event sensors. Underpinning our method is a novel dataset with carefully calibrated ground truth, comprising of real event data obtained by emulating satellite rendezvous scenarios in the lab under drastic lighting conditions. Results on the dataset showed that our event-based satellite pose estimation method, trained only on synthetic data without adaptation, could generalise to the target domain effectively." Hardware-In-The-Loop Simulator with Low-Thrust Actuator for Free-Flying Robot's Omni-Directional Control,"Daichi Hirano, Shinji Mitani, Taisei Nishishita, Tatsuhiko Saito","Japan Aerospace Exploration Agency,JAXA,Systems Engineering Consultants Co.,LTD.",Space Robotics,"Small free-flying robots to assist astronauts and perform experiments need a propulsion system to move freely in microgravity. Hardware-in-the-loop (HIL) simulators can simultaneously verify guidance, navigation, and control (GNC) systems, including flight hardware and software, in three dimensions. However, it is difficult to incorporate a small free-flying robot into the HIL simulator because of the low propulsive force and gravity compensation associated with its attitude changes. This paper proposes a HIL simulator with a propulsion subsystem mounted on a statically fixed force/torque sensor and a GNC subsystem mounted on a dynamically movable robotic arm. 
This simulator allows us to verify the GNC algorithms comprehensively using actual navigation sensors and propulsive actuators in an emulated flight environment. The actual capabilities of this simulator were successfully demonstrated in motion verifications of a free-flying robot, the Int-Ball2." Loitering and Trajectory Tracking of Suspended Payloads in Cable-Driven Balloons Using UGVs,"Julius Wanner, Eric Sihite, Alireza Ramezani, Gharib Morteza","ETH Zurich / California Institute of Technology,California Institute of Technology,Northeastern University,CALTECH",Space Robotics,"Investigations of unmanned aerial vehicles (UAVs) for planetary exploration and payload manipulation have become a strong focus of research within space robotics. Among possible solutions, balloon-based systems possess merits that make them extremely attractive, such as their simple operation mechanism and endured operation time. However, there are many hurdles to overcome to achieve robust trajectory tracking performance for balloon-based applications. In this work, in order to facilitate the control and versatile use of balloons for near-surface planetary payload manipulation, a novel robotic platform and control strategy featuring the coordinated servoing of multiple unmanned ground vehicles (UGVs) to actuate a cable-driven balloon and the suspended payload is proposed. An earthbound prototype and dynamic model of this system are designed to allow for the investigation of payload trajectory tracking performance using a tailored Model Predictive Controller in simulation and experiment." 
Design and Validation of a Multi-Arm Relocatable Manipulator for Space Applications,"Enrico Mingo Hoffman, Arturo Laurenzi, Francesco Ruscelli, Luca Rossini, Lorenzo Baccelliere, Davide Antonucci, Alessio Margan, Paolo Guria, Marco Migliorini, Stefano Cordasco, Gennaro Raiola, Luca Muratore, Joaquín Estremera, Andrea Rusconi, Guido Sangiovanni, Nikos Tsagarakis","Leonardo S.p.A.,Istituto Italiano di Tecnologia,Istituto italiano di tecnologia,Istituto Italiano di Tecnologia (IIT),Leonardo s.p.a.,GMV,Selex Galileo,Politecnico di Milano",Space Robotics,"This work presents the computational design and validation of the Multi-Arm Relocatable Manipulator (MARM), a three-limb robot for space applications, with particular reference to the MIRROR (i.e., the Multi-arm Installation Robot for Readying ORUs and Reflectors) use-case scenario as proposed by the European Space Agency. A holistic computational design and validation pipeline is proposed, with the aim of comparing different proposed limb designs, as well as ensuring that valid MARM candidates are able to perform the complex loco-manipulation tasks required by MIRROR. Motivated by the task complexity in terms of kinematic reachability, (self)-collision avoidance, contact wrench limits, and motor torque limits affecting Earth experiments, this work proposes the application of multiple state-of-art planning and control approaches to aid the robot design and validation. These include sampling-based planning on manifolds, non-linear trajectory optimization, and quadratic programs for inverse dynamics computations with constraints. Finally, we describe the attained MARM design and conduct preliminary tests for hardware validation through a set of lab experiments." 
Tentacle-Based Shape Shifting of Metamorphic Robots Using Fast Inverse Kinematics,"Jan Mrázek, Patrick Ondika, Ivana Černá, Jiri Barnat",Masaryk University,Modular and Reconfigurable Robots,"We present a new approach to tackle the problem of metamorphic robots' reconfiguration. Given the chain-type metamorphic robot's initial and target configuration, we compute a reconfiguration plan that is provably physically collision-free. Our solution employs a specific heuristic. The robot initially reconfigures to a shape that resembles an octopus with many tentacles. After that, the tentacles gradually reconnect to each other using inverse kinematics, separating one tentacle from the body and keeping the other one connected. This strategy eventually leads to a snake-like structure of the robot. For the target configuration, we compute the reconfiguration plan with the same procedure, however, we reverse the plan to reconfigure the robot from the snake-like structure to the target shape. According to our experimental evaluation, our newly introduced strategy for finding reconfiguration plans is successful. It efficiently finds collision free plans even for robots consisting of hundreds of modules." A Non-Planar Assembly of Modular Tetrahedral-Shaped Aerial Robots,"Obadah Wali, Mohamad Shahab, Eric Feron","KAUST,King Abdullah University of Science and Technology",Modular and Reconfigurable Robots,"This paper presents a new design of aerial vehicles with tetrahedral geometry. We call this design the TetraQuad. The TetraQuad is a fractal modular aerial robot. A characteristic of fractals is that they have a geometric shape that can be assembled to generate the same geometry on a larger scale. Therefore multiple TetraQuad modules can be assembled to produce a larger scaled tetrahedral shaped aerial vehicle. 
The advantage is to have modular aerial robots that assemble in the vertical direction; this increases the rigidity of the structure, as well as reduces the wake interaction of the elevated propellers in the assembly. This work presents a design and analysis of the TetraQuad module as well as assemblies of multiple modules. A modular controller strategy is discussed. The functionality of the controller is illustrated using simulations. We validate our design with experimental flight tests." Learning Modular Robot Visual-Motor Locomotion Policies,"Julian Whitman, Howie Choset",Carnegie Mellon University,Modular and Reconfigurable Robots,"Control policy learning for modular robot locomotion has previously been limited to proprioceptive feedback and flat terrain. This paper develops policies for modular systems with vision traversing more challenging environments. These modular robots can be reconfigured to form many different designs, where each design needs a controller to function. Though one could create a policy for individual designs and environments, such an approach is not scalable given the wide range of potential designs and environments. To address this challenge, we create a visual-motor policy that can generalize to both new designs and environments. The policy itself is modular, in that it is divided into components, each of which corresponds to a type of module (e.g., a leg, wheel, or body). The policy components can be recombined during training to learn to control multiple designs. We develop a deep reinforcement learning algorithm where visual observations are input to a modular policy interacting with multiple environments at once. We apply this algorithm to train robots with combinations of legs and wheels, then demonstrate the policy controlling real robots climbing stairs and curbs." 
DisCo: A Multiagent 3D Coordinate System for Lattice Based Modular Self-Reconfigurable Robots,"Benoît Piranda, Frédéric Lassabe, Julien Bourgeois","Université de Franche-Comté / FEMTO-ST Institute,FEMTO-ST Institute, Univ. Bourgogne Franche-Comté, CNRS,Institut FEMTO-ST",Modular and Reconfigurable Robots,"Localizing each module in a modular self-reconfigurable robot (MSR) is of paramount importance. In MSR, the communication graph is directly mapped to the real topology which makes the localization problem easy to solve. However, some types of connectors can lose the orientation of the modules, making the problem intractable. In this work, we propose to build a coordinate system for 3D lattice-based modular robots using a multiagent system. We present DisCo algorithm, that uses one agent per module which can only communicate with its connected neighbors and that does not need a central coordination system. We show that the agents can tackle any kinds of 3D lattice and we illustrate it with a Face Centered Cubic lattice (12 neighbors) and a cubic lattice (6 neighbors). Using communications and only four states, DisCo can also deduce the orientation of modules if the connectors do not provide this information." Finding Optimal Modular Robots for Aerial Tasks,"Jiawei Xu, David Saldana",Lehigh University,Modular and Reconfigurable Robots,"Traditional aerial vehicles have limitations in their capabilities due to actuator constraints, such as motor saturation. The hardware components and their arrangement are designed to satisfy specific requirements and are difficult to modify during operation. To address this problem, we introduce a versatile modular multi-rotor vehicle that can change its capabilities by reconfiguration. Our modular robot consists of homogeneous cuboid modules, propelled by quadrotors with tilted rotors. Depending on the number of modules and their configuration, the robot can expand its actuation capabilities. 
In this paper, we build a mathematical model for the actuation capability of a modular multi-rotor vehicle and develop methods to determine if a vehicle is capable of satisfying a task requirement. Based on this result, we find the optimal configurations for a given task. Our approach is validated in realistic 3D simulations, showing that our modular system can adapt to tasks with varying requirements." Coaxial Modular Aerial System and the Reconfiguration Applications,"José Baca, Syed Izzat Ullah, Pablo Rangel","Texas A&M University-Corpus Christi,Texas A&M University - Corpus Christi",Modular and Reconfigurable Robots,This paper presents a coaxial modular aerial system (CMAS) formed by homogeneous modules driven by their center of mass. CMAS is designed to perform independent and cooperative flight with or without payload. Properties of the modularity concept allow the system to adapt to different situations and/or tasks by adding/removing modules to/from a configuration. The CMAS module is based on a coaxial motor and a two degree-of-freedom mechanism that transfers its center of mass from one side to another to make the module navigate around. The magnetic-based connector mechanism allows the module to be attached to other modules and to different metallic surfaces. A decentralized and asynchronous 3D path planning algorithm is implemented to avoid the trajectories of other modules/obstacles and ensures safe reconfiguration of the modules. Simulations within various environments show the applicability of the reconfiguration algorithm. ADAPT: A 3 Degrees of Freedom Reconfigurable Force Balanced Parallel Manipulator for Aerial Applications,"Kartik Suryavanshi, Salua Hamaza, Volkert Van Der Wijk, Just Herder","TU Delft,Delft University of Technology",Modular and Reconfigurable Robots,"In this paper, we present the ADAPT, a novel reconfigurable force-balanced parallel manipulator for spatial motions and interaction capabilities underneath a drone. 
The reconfigurable aspect allows different motion-based 3-DoF operation modes like translational, rotational, planar, and so on, without the need for disassembly. For the purpose of this study, the manipulator is used in translation mode only. A kinematic model is developed and validated for the manipulator. The design and motion capabilities are also validated both by conducting dynamics simulations of a simplified model on MSC ADAMS, and experiments on the physical setup." Rearrange Indoor Scenes for Human-Robot Co-Activity,"Weiqi Wang, Zihang Zhao, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu, Hangxin Liu","University of California, Los Angeles,Beijing Institute for General Artificial Intelligence,Beijing Institute for General Artificial Intelligence (BIGAI),Peking University,UCLA",Human-Centered Robotics,"We present an optimization-based framework for rearranging indoor furniture to accommodate human-robot co-activities better. The rearrangement aims to afford sufficient accessible space for robot activities without compromising everyday human activities. To retain human activities, our algorithm preserves the functional relations among furniture by integrating spatial and semantic co-occurrence extracted from SUNCG and ConceptNet, respectively. By defining the robot’s accessible space by the amount of open space it can traverse and the number of objects it can reach, we formulate the rearrangement for human-robot co-activity as an optimization problem, solved by adaptive simulated annealing (ASA) and covariance matrix adaptation evolution strategy (CMA-ES). Our experiments on the SUNCG dataset quantitatively show that rearranged scenes provide an average of 14% more accessible space and 30% more objects to interact with. The quality of the rearranged scenes is qualitatively validated by a human study, indicating the efficacy of the proposed strategy." 
Design and Evaluation of an Augmented Reality Head-Mounted Display User Interface for Controlling Legged Manipulators,"Rodrigo Chacon Quesada, Yiannis Demiris",Imperial College London,Human-Centered Robotics,"Designing an intuitive User Interface (UI) for controlling assistive robots remains challenging. Most existing UIs leverage traditional control interfaces such as joysticks, hand-held controllers, and 2D UIs. Thus, users have limited availability to use their hands for other tasks. Furthermore, although there is extensive research regarding legged manipulators, comparatively little is on their UIs. Towards extending the state-of-art in this domain, we provide a user study comparing an Augmented Reality (AR) Head-Mounted Display (HMD) UI we developed for controlling a legged manipulator against off-the-shelf control methods for such robots. We made this comparison baseline across multiple factors relevant to a successful interaction. The results from our user study (N = 17) show that although the AR UI increases immersion, off-the-shelf control methods outperformed the AR UI in terms of time performance and cognitive workload. Nonetheless, a follow-up pilot study incorporating the lessons learned shows that AR UIs can outpace hand-held-based control methods and reduce the cognitive requirements when designers include hands-free interactions and cognitive offloading principles into the UI." Exploiting Intrinsic Kinematic Null Space for Supernumerary Robotic Limbs Control,"Tommaso Lisini Baldi, Nicole D'Aurizio, Sergio Gurgone, Daniele Borzelli, Andrea D'Avella, Domenico Prattichizzo","University of Siena,University of Siena, Istituto Italiano di Tecnologia,University of Messina,Fondazione Santa Lucia,IRCCS Fondazione Santa Lucia",Human-Centered Robotics,"Supernumerary robotic limbs (SRLs) gained increasing interest in the last years for their applicability as healthcare and assistive technologies. 
These devices can either support or augment human sensorimotor capabilities, allowing users to complete tasks that are more complex than those feasible for their natural limbs. However, for a successful coordination between natural and artificial limbs, intuitiveness of interaction and perception of autonomy are key enabling features, especially for people suffering from motor disorders and impairments. The development of suitable human-robot interfaces is thus fundamental to foster the adoption of SRLs. With this work, we describe how to control an extra degree of freedom by taking advantage of what we defined as the Intrinsic Kinematic Null Space, i.e. the redundancy of the human kinematic chain involved in the ongoing task. Obtained results demonstrated that the proposed control strategy is effective for performing complex tasks with a supernumerary robotic finger, and that practice improves users’ control ability." Robot Explanatory Narratives of Collaborative and Adaptive Experiences,"Alberto Olivares-Alarcos, Antonio Andriella, Sergi Foix, Guillem Alenyà","Institut de Robòtica i Informàtica Industrial (CSIC-UPC),Pal Robotics,CSIC-UPC",Human-Centered Robotics,"In the future, robots are expected to autonomously interact and/or collaborate with humans, who will increase the uncertainty during the execution of tasks, provoking online adaptations of robots' plans. Hence, trustworthy robots must be able to store, retrieve and narrate important knowledge about their collaborations and adaptations. In this article, we propose a sound methodology that integrates three main elements. First, an ontology for collaborative robotics and adaptation to model the domain knowledge. Second, an episodic memory for time-indexed knowledge storage and retrieval. Third, a novel algorithm to extract the relevant knowledge and generate textual explanatory narratives. The algorithm produces three different types of outputs, varying the specificity, for diverse uses and preferences. 
A pilot study was conducted to evaluate the usefulness of the narratives, obtaining promising results. Finally, we discuss how the methodology can be generalized to other ontologies and experiences. This work boosts robot explainability, especially in cases where robots need to narrate the details of their short and long-term past experiences." Evaluating Immersive Teleoperation Interfaces: Coordinating Robot Radiation Monitoring Tasks in Nuclear Facilities,"Harvey Stedman, Başaran Bahadır Koçer, Nejra Van Zalk, Mirko Kovac, Vijay Pawar","University College London,Imperial College London",Human-Centered Robotics,"We present a virtual reality (VR) teleoperation interface for a ground-based robot, featuring dense 3D environment reconstruction and a low latency video stream, with which operators can immersively explore remote environments. At the UK Atomic Energy Authority's (UKAEA) Remote Applications in Challenging Environments (RACE) facility, we applied the interface in a user study where trained robotics operators completed simulated nuclear monitoring and decommissioning style tasks to compare VR and traditional teleoperation interface designs. We found that operators in the VR condition took longer to complete the experiment, had reduced collisions, and rated the generated 3D map with higher importance when compared to non-VR operators. Additional physiological data suggested that VR operators had a lower objective cognitive workload during the experiment but also experienced increased physical demand. Overall the presented results show that VR interfaces may benefit work patterns in teleoperation tasks within the nuclear industry, but further work is needed to investigate how such interfaces can be integrated into real world decommissioning workflows." 
A Social Referencing Disambiguation Framework for Domestic Service Robots,"Kevin Fan, Melanie Jouaiti, Ali Noormohammadi Asl, Kerstin Dautenhahn, Chrystopher Nehaniv","University of Waterloo,Imperial College London",Human-Centered Robotics,"The successful integration of domestic service robots into home environments can bring significant services and convenience to the general population and possibly mitigate important societal issues, such as care provision for older adults. However, home environments are complex, dynamic and object-rich. It is, thus, very probable that service robots will encounter ambiguity while interacting with household items. To enable service robots to be more adaptive, we propose a learning social referencing computational framework and experimentally evaluated the framework on a mobile manipulator robot, Fetch, in object selection scenarios. The framework allows the robot to (1) detect and analyze the ambiguity level based on the robot's view and user's command, (2) assess the human's attention level and attract their attention, (3) disambiguate references to objects using human feedback and (4) learn novel objects after clarification from the user. System evaluation results are presented. The framework is modular and can be applied to different robotic platforms." Ex(plainable) Machina: How Social-Implicit XAI Affects Complex Human-Robot Teaming Tasks,"Marco Matarese, Francesca Cocchella, Francesco Rea, Alessandra Sciutti","Italian Institute of Technology,Istituto Italiano di Tecnologia",Human-Centered Robotics,"In this paper, we investigated how shared experience-based counterfactual explanations affected people's performance and robots' persuasiveness during a decision-making task in a social HRI context. We used the Connect 4 game as a complex decision-making task where participants and the robot had to play as a team against the computer. 
We compared two strategies of explanation generation (classical vs shared experience-based) and investigated their differences in terms of team performance, the robot's persuasive power, and participants' perception of the robot and self. Our results showed that the two explanation strategies led to comparable performances. Moreover, shared experience-based explanations gave higher persuasiveness to the robot's suggestions than classical ones. Finally, we noted that low-performers tend to follow the robot more than high-performers, providing insights into the potential danger for non-expert users interacting with expert explainable robots." Towards Safe Remote Manipulation: User Command Adjustment Based on Risk Prediction for Dynamic Obstacles,"Mincheul Kang, Minsung Yoon, Sung-Eui Yoon","KAIST,Korea Advanced Institute of Science and Technology (KAIST)",Human-Centered Robotics,"Real-time remote manipulation requires careful operations by a user to ensure the safety of a robot, which is designed to follow user’s commands, against dynamic obstacles. However, a user may give commands to a robot at the risk of collision with dynamic obstacles due to a user’s unfamiliar control ability or unexpected situations. In this paper, we propose a risk-aware user command adjustment method to avoid potential collision with dynamic obstacles. Our method consists of a network that predicts the risk of dynamic obstacles and another network that synthesizes commands to avoid obstacles. Based on the predicted risk, our method decides an adjusted command between a user command and a command to avoid collisions. We evaluate our method in problems that face collisions with dynamic obstacles when following given commands and in problems with static obstacles. We show that our method improves safety against the risk of dynamic obstacles or follows user commands when there is no risk. 
We also demonstrate the feasibility of our method using the real Fetch manipulator with seven degrees of freedom." Computational Methods to Support Prototyping of an Adaptive Robot Joystick Controller for Children with Upper Limb Impairments,"Melanie Jouaiti, Negin Azizi, Kerstin Dautenhahn","Imperial College London,University of Waterloo",Human-Centered Robotics,"Between 2% and 5% of children are affected by Developmental Coordination Disorders in Canada and have been diagnosed with upper limb impairments, which affect their daily lives and reduce their autonomy. Motor impairments can be part of progressive disorders, so despite regular therapy, progress remains fleeting. Affected individuals therefore consistently face many barriers, including entertainment opportunities, as availability of off-the-shelf inclusive technology is very limited. Our long-term goal is to develop a play-mediator robot, which would facilitate play between children with motor impairments and their peers or family members. Here, games that the robot can play are remotely controlled by the participants, using appropriate interfaces (e.g. joysticks). In this paper, we take the first step towards that goal and develop an adaptive joystick controller that can compensate for individual deficits. We monitor movement statistics to determine if re-calibration of the controller is necessary. Moreover, we propose a computational model of data 'distortion', as a tool for developers to test their technology in the very early stages of prototype development, without requiring access to participants. This work is validated with data from healthy adults and children with upper limb impairments." Ethical Assessment of a Hospital Disinfection Robot,"Conor Mcginn, Robert Scott, Niamh Donnelly, Michael F. 
Cullinan, Alan Winfield, Patricia Treusch","Trinity College Dublin,Akara Robotics,University of the West of England, Bristol,TU Berlin",Human-Centered Robotics,"Robots have the potential to deliver very positive impacts for society, however, it’s critical that in preparing for real-world deployments, we recognize and take steps to mitigate against the potential harms, both direct and indirect, that they may cause. In this paper, we explore how the ethics canvas (EC) and the ethical risk assessment (ERA) methodology defined in British Standard 8611 can be combined to better align robot technologies with ethics and their socio-cultural context of operation. We illustrate this through a practical case-study involving the real-world introduction of a disinfection robot to a radiology department in a European hospital. Using the EC, we identified 49 distinct ways that the technology was likely to impact key stakeholders and 11 ways that failure or misuse of the technology was likely to impact service provision. From this data, 8 mitigating measures were identified. Then, using the ERA tool, 9 risks were identified that were considered to represent a high likelihood of occurrence. From these insights, a further 8 mitigation measures were proposed. The combined use of both tools was found to be complementary, since the EC fostered a bottom-up, subjective critical thinking process whereas the ERA provided a broader, more top-down objective view. This example provides a practical template for robotics practitioners to better understand and manage the ethical and socio-cultural dimensions of their work, and contributes towards the standardization of ethical assessments in robotics with an emphasis on the move from principles to practice." Intention Aware Robot Crowd Navigation with Attention-Based Interaction Graph,"Shuijing Liu, Peixin Chang, Zhe Huang, Neeloy Chakraborty, Kaiwen Hong, Weihang Liang, D. 
Livingston Mcpherson, Junyi Geng, Katherine Driggs-Campbell","University of Illinois at Urbana Champaign,University of Illinois at Urbana-Champaign,University of Illinois,Pennsylvania State University",Human-Centered Robotics,"We study the problem of safe and intention-aware robot navigation in dense and interactive crowds. Most previous reinforcement learning (RL) based methods fail to consider different types of interactions among all agents or ignore the intentions of people, which results in performance degradation. In this paper, we propose a novel recurrent graph neural network with attention mechanisms to capture heterogeneous interactions among agents through space and time. To encourage longsighted robot behaviors, we infer the intentions of dynamic agents by predicting their future trajectories for several timesteps. The predictions are incorporated into a model-free RL framework to prevent the robot from intruding into the intended paths of other agents. We demonstrate that our method enables the robot to achieve good navigation performance and non-invasiveness in challenging crowd navigation scenarios. We successfully transfer the policy learned in simulation to a real-world TurtleBot 2i. Our code and videos are available at https://sites.google.com/view/intention-aware-crowdnav/home." A Study into Understanding User Requirements to Inform the Design of Customisable Robotic Pain Management Devices,"Angela Higgins, Alison Llewellyn, Emma Dures, Praminda Caleb-Solly","University of Nottingham,University of the West of England",Human-Centered Robotics,"Previous research into using robots for pain management has shown promise. However to date, there seems to have been little research investigating user requirements for robotic pain management devices which could be used by adults living with chronic pain, and how these might be translated into custom products. 
We carried out a user study comprising online surveys and interviews with people who have lived experience of chronic pain to investigate their perspectives. We had a total of 44 participants in our study. Our research revealed a preference for robotic devices for pain management which have an abstract or animal-like form, noting that contact points with the body should feel soft, warm, and light. Study participants also felt that the user should initiate the interaction and should have control of the robot, as well as the type and intensity of touch. Favored touch types included massaging, rubbing, and stroking. From the emerging requirements, given the diversity of experiences, design-related attributes identified could be used for a form-customization application, such as interactive evolutionary computation (IEC), as a means to personalize the embodiment of robotic devices. Prioritized form factors for customization through included size, weight, and feel." Occlusion-Aware Crowd Navigation Using People As Sensors,"Ye-ji Mun, Masha Itkina, Shuijing Liu, Katherine Driggs-Campbell","University of Illinois at Urbana-Champaign,Stanford University,University of Illinois at Urbana Champaign",Human-Aware Motion Planning,"Autonomous navigation in crowded spaces poses a challenge for mobile robots due to the highly dynamic, partially observable environment. Occlusions are highly prevalent in such settings due to a limited sensor field of view and obstructing human agents. Previous work has shown that observed interactive behaviors of human agents can be used to estimate potential obstacles despite occlusions. We propose integrating such social inference techniques into the planning pipeline. We use a variational autoencoder with a specially designed loss function to learn representations that are meaningful for occlusion inference. This work adopts a deep reinforcement learning approach to incorporate the learned representation into occlusion-aware planning. 
In simulation, our occlusion-aware policy achieves comparable collision avoidance performance to fully observable navigation by estimating agents in occluded spaces. We demonstrate successful policy transfer from simulation to the real-world Turtlebot 2i. To the best of our knowledge, this work is the first to use social occlusion inference for crowd navigation." Efficiently Approaching Groups of People in a Socially Acceptable Manner in Environments with Obstacles,"Aline Silva, Luciano Almeida, Douglas Guimarães Macharet","Universidade Federal de Minas Gerais - Brazil,Universidade Federal de Minas Gerais",Human-Aware Motion Planning,"Advancements in mobile robotics have allowed humans and robots to interact in different environments and ways. A problem of great interest in Human-Robot Interaction is how to approach individuals, e.g., to gather information, in a socially acceptable manner. We present a new method for planning sequential visits to various groups of people in cluttered environments. The problem is formulated as a Set Orienteering Problem, where each group denotes a cluster with a set of possible approaching points considering different F-formations. We use the concept of a social probabilistic roadmap to determine safe paths between groups. Simulations considering different cases show that the methodology produces efficient tours that maximize the number of approached individuals while respecting social norms of distance and a limited budget." SoLo T-DIRL: Socially-Aware Dynamic Local Planner Based on Trajectory-Ranked Deep Inverse Reinforcement Learning,"Yifan Xu, Theodor Chakhachiro, Tribhi Kathuria, Maani Ghaffari","University of Michigan,American University of Beirut,University of Michigan, Ann Arbor",Human-Aware Motion Planning,"This work proposes a novel framework for socially-aware robot navigation in dynamic, crowded environments using Deep Inverse Reinforcement Learning. 
To address the social navigation problem, our multi-modal learning based planner explicitly considers social interaction factors, as well as social-awareness factors, into the DIRL pipeline to learn a reward function from human demonstrations. Moreover, we propose a novel trajectory ranking score using the sudden velocity change of pedestrians around the robot to address the sub-optimality in human demonstrations. Our evaluation shows that this method can successfully make a robot navigate in a crowded social environment and outperforms the state-of-art social navigation methods in terms of the success rate, navigation time, and invasion rate." Noise and Environmental Justice in Drone Fleet Delivery Paths: A Simulation-Based Audit and Algorithm for Fairer Impact Distribution,"Zewei Zhou, Martim Brandao",King's College London,Human-Aware Motion Planning,"Despite the growing interest in the use of drone fleets for delivery of food and parcels, the negative impact of such technology is still poorly understood. In this paper we investigate the impact of such fleets in terms of noise pollution and environmental justice. We use simulation with real population data to analyze the spatial distribution of noise, and find that: 1) noise increases rapidly with fleet size; and 2) drone fleets can produce noise hotspots that extend far beyond warehouses or charging stations, at levels that lead to annoyance and interference of human activities. This, we will show, leads to concerns of fairness of noise distribution. We then propose an algorithm that successfully balances the spatial distribution of noise across the city, and discuss the limitations of such purely technical approaches. We complement the work with a discussion of environmental justice, showing how careless UAV fleet development and regulation can lead to reinforcing well-being deficiencies of poor and marginalized communities." 
Actuator Capabilities Aware Limitation for TDPA Passivity Controller Action,"Francesco Porcini, Alessandro Filippeschi, Massimiliano Solazzi, Carlo Alberto Avizzano, Antonio Frisoli","PERCRO Laboratory, TeCIP Institute, Sant’Anna School of Advanced,Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna, TeCIP Institute",Physical Human-Robot Interaction II,"Haptic interaction often requires stabilizing controllers for safety. The Time-Domain Passivity Approach guarantees passivity (then stability) by observing and dissipating energy generated from active elements in a network. The dissipating action is performed by a Passivity Controller, whose action is commanded to the physically limited robot actuators. Thus, the controller stabilizing action should be in turn limited in order to command displayable references to the actuators. This problem is rarely taken into account in the literature and when it is, the limitation is neither directly related to the actuator power limits, nor to the robot's current configuration. The limits of the currently adopted strategies leave room for improvement. In this paper, a new strategy to limit the Passivity Controller action is proposed taking into account both the physical limits of the actuators and the robot configuration. This new strategy is experimentally tested against the classical one based on the sampling time. In the experiment, a human interacts with a virtual wall in a Virtual Environment through a haptic interface. The wall induces an unstable behavior passivated with the two limiting strategies. The results clearly state the benefits introduced by the proposed strategy in two relevant cases." Upper-Limb Geometric MyoPassivity Map for Physical Human-Robot Interaction,"Xingyuan Zhou, Peter Paik, S. 
Farokh Atashzar","New York University,New York University (NYU), US",Physical Human-Robot Interaction II,"The intrinsic biomechanical characteristic of the human upper limb plays a central role in absorbing the interactive energy during physical human-robot interaction (pHRI). We have recently shown that based on the concept of ""Excess of Passivity (EoP),"" from nonlinear control theory, it is possible to decode such energetic behavior for both upper and lower limbs. The extracted knowledge can be used in the design of controllers for optimizing the transparency and fidelity of force fields in human-robot interaction and in haptic systems. In this paper, for the first time, we investigate the frequency behavior of the passivity map for the upper limb when the muscle co-activation was controlled in real-time through visual electromyographic feedback. Five healthy subjects (age: 27 ± 5) were included in this study. The energetic behavior was evaluated at two stimulation frequencies at eight interaction directions over two controlled muscle co-activation levels. Electromyography (EMG) was captured using the Delsys Wireless Trigno system. Results showed a correlation between EMG and EoP, which was further amplified by decreasing the frequency. The proposed energetic behavior is named the Geometric MyoPassivity (GMP) map. The findings indicate that the GMP map has the potential to be used in real-time to quantify the absorbable energy, thus passivity margin of stability for upper limb interaction during pHRI." Learning and Blending Robot Hugging Behaviors in Time and Space,"Drolet Michael, Joseph Campbell, Heni Ben Amor","Arizona State University,Carnegie Mellon University",Physical Human-Robot Interaction II,"We introduce an imitation learning-based physical human-robot interaction algorithm capable of predicting appropriate robot responses in complex interactions involving a superposition of multiple interactions. 
Our proposed algorithm, Blending Bayesian Interaction Primitives (B-BIP) allows us to achieve responsive interactions in complex hugging scenarios, capable of reciprocating and adapting to a hug's motion and timing. We show that this algorithm is a generalization of prior work, for which the original formulation reduces to the particular case of a single interaction, and evaluate our method through both an extensive user study and empirical experiments. Our algorithm yields significantly better quantitative prediction error and more-favorable participant responses with respect to accuracy, responsiveness, and timing, when compared to existing state-of-the-art methods." Quadruped Guidance Robot for the Visually Impaired: A Comfort-Based Approach,"Yanbo Chen, Zhengzhe Xu, Zhuozhu Jian, Gengpan Tang, Yunong Yangli, Anxing Xiao, Xueqian Wang, Bin Liang","Harbin Institute of Technology, Shenzhen,Tsinghua University,National University of Singapore,Center for Artificial Intelligence and Robotics, Graduate School",Physical Human-Robot Interaction II,"Guidance robots that can guide people and avoid various obstacles, could potentially be owned by more visually impaired people at a fairly low cost. Most of the previous guidance robots for the visually impaired ignored the human response behavior and comfort, treating the human as an appendage dragged by the robot, which can lead to imprecise guidance of the human and sudden changes in the traction force experienced by the human. In this paper, we propose a novel quadruped guidance robot system with a comfort-based concept. We design a controllable traction device that can adjust the length and force between human and robot to ensure comfort. To allow the human to be guided safely and comfortably to the target position in complex environments, our proposed human motion planner can plan the traction force with the force-based human motion model. 
To track the planned force, we also propose a robot motion planner that can generate the specific robot motion command and design the force control device. Our system has been deployed on Unitree Laikago quadrupedal platform and validated in real-world scenarios." Online Learning and Suppression of Vibration in Collaborative Robots with Power Tools,"Gokhan Solak, Arash Ajoudani","Italian Institute of Technology, Genoa,Istituto Italiano di Tecnologia",Physical Human-Robot Interaction II,"Vibration suppression is an important skill for future robots that will collaborate with humans in industrial settings. The vibration through physical interaction is a common problem in such settings, especially in operations involving hand-held vibrating tools. The existing human-robot collaboration (HRC) works addressing this problem mostly focus on the oscillations caused by the human operator, and suppress them by adapting the admittance parameters. This, however, usually results in stiffer robot behavior and contributes to reducing the overall performance of the task, in particular when impedance planning is a requirement. In this work, we focus on the vibration coming from external sources such as power tools and suppress it actively. We learn the vibration using the bandlimited multiple Fourier linear combiner (BMFLC) algorithm and apply it as a feedforward Cartesian force to cancel the vibration. We combine the feedforward force control with variable impedance learning and show that it improves the vibration suppression performance in simulation and real-world experiments. The feedforward approach can suppress the vibration better while keeping a more compliant set of impedance parameters, which is crucial in HRC." 
"Towards Human-Robot Collaboration with Parallel Robots by Kinetostatic Analysis, Impedance Control and Contact Detection","Aran Mohammad, Moritz Schappler, Tobias Ortmaier","Leibniz University Hannover,Institute of Mechatronic Systems, Leibniz Universitaet Hannover,Leibniz University Hanover",Physical Human-Robot Interaction II,"Parallel robots provide the potential to be leveraged for human-robot collaboration (HRC) due to low collision energies even at high speeds resulting from their reduced moving masses. However, the risk of unintended contact with the leg chains increases compared to the structure of serial robots. As a first step towards HRC, contact cases on the whole parallel robot structure are investigated and a disturbance observer based on generalized momenta and measurements of motor current is applied. In addition, a Kalman filter and a second-order sliding-mode observer based on generalized momenta are compared in terms of error and detection time. Gearless direct drives with low friction improve external force estimation and enable low impedance. The experimental validation is performed with two force-torque sensors and a kinetostatic model. This allows a new identification method of the motor torque constant of an assembled parallel robot to estimate external forces from the motor current and via a dynamics model. A Cartesian impedance control scheme for compliant robot-environmental dynamics with stiffness from 0.1-2 N/mm and the force observation for low forces over the entire structure are validated. The observers are used for collisions and clamping at velocities of 0.4-0.9 m/s for detection within 9-58 ms and a reaction in the form of a zero-g mode." 
Proprioceptive Sensor-Based Simultaneous Multi-Contact Point Localization and Force Identification for Robotic Arms,"Seowook Han, Min Jun Kim","Korean Advanced Institute of Science and Technology,KAIST",Physical Human-Robot Interaction II,"In this paper, we propose an algorithm that estimates contact point and force simultaneously. We consider a collaborative robot equipped with proprioceptive sensors, in particular, joint torque sensors (JTSs) and a base force/torque (F/T) sensor. The proposed method has the following advantages. First, fast computation is achieved by proper preprocessing of robot meshes. Second, multi-contact can be identified with the aid of the base F/T sensor, while this is challenging when the robot is equipped with only JTSs. The proposed method is a modification of the standard particle filter to cope with mesh preprocessing and with available sensor data. In simulation validation, for a 7 degree-of-freedom robot, the algorithm runs at 2200Hz with 99.96% success rate for the single-contact case. In terms of the run-time, the proposed method was >=3.5X faster compared to the existing methods. Dual and triple contacts are also reported in the manuscript." Nonlinear Model Predictive Control of a 3D Hopping Robot: Leveraging Lie Group Integrators for Dynamically Stable Behaviors,"Noel Csomay-Shanklin, Victor Dorobantu, Aaron Ames",California Institute of Technology,Award Finalists 4,"Achieving stable hopping has been a hallmark challenge in the field of dynamic legged locomotion. Controlled hopping is notably difficult due to extended periods of underactuation, combined with very short ground phases wherein ground interactions must be modulated to regulate global state. In this work, we explore the use of hybrid nonlinear model predictive control, paired with a low-level feedback controller in a multi-rate hierarchy, to achieve dynamically stable motions on a novel 3D hopping robot. 
In order to demonstrate richer behaviors on the manifold of rotations, both the planning and feedback layers must be done in a geometrically consistent fashion; therefore, we develop the necessary tools to employ Lie group integrators and an appropriate feedback controller. We experimentally demonstrate stable 3D hopping on a novel robot, as well as trajectory tracking and flipping in simulation." Anchoring Sagittal Plane Templates in a Spatial Quadruped,"Timothy Greco, Daniel Koditschek",University of Pennsylvania,Legged Robots,"This paper introduces a new controller that stabilizes the motion of a spatial quadruped around sagittal-plane templates. It enables highly dynamic gaits and transitional maneuvers formed from parallel and sequential compositions of such planar templates in settings that require significant out-of-plane reactivity. The controller admits formal guarantees of stability with some modest assumptions. Experimental results validate the reliable execution of those planar template-based maneuvers, even in the face of large lateral, yaw, and roll incurring disturbances. This spatial anchor, fixed in parallel composition with a variety of different parallel and sequential compositions of sagittal plane templates, illustrates the robust portability of provably interoperable modular control components across a variety of hardware platforms and behaviors." External Force Estimation of Legged Robots Via a Factor Graph Framework with a Disturbance Observer,"Jeonguk Kang, Hyun-bin Kim, Keun Ha Choi, Kyung-Soo Kim","KAIST,Korea Advanced Institute of Science and Technology,KAIST(Korea Advanced Institute of Science and Technology)",Legged Robots,"Recently, legged robots have been used for various purposes, such as exploring unknown terrain or interacting with the world. For control and planning legged systems during interactive operations, it is essential to estimate and respond to external forces. 
However, in legged systems, it becomes difficult to estimate forces due to highly dynamic situations. There are several studies that use a force sensor on the foot and end effector, but these approaches have disadvantages in terms of cost and sustainability. Therefore, in this paper, we propose an improved method for estimating external forces without a force sensor. First, each leg force was obtained using the system dynamics of the robot with a disturbance observer. Then, by preintegration, it was tightly coupled with other sensors to estimate the pose and external force simultaneously. Despite the impact and slip, we estimate external forces accurately in standing and walking motions. Moreover, we compared pose estimation performance with VINS-Mono [1], and there is no significant accuracy degradation in spite of highly dynamic force residual." Morphological Characteristics That Enable Stable and Efficient Walking in Hexapod Robot Driven by Reflex-Based Intra-Limb Coordination,"Wataru Sato, Jun Nishii, Mitsuhiro Hayashibe, Dai Owaki","Tohoku University,Yamaguchi University",Legged Robots,"Insects exhibit adaptive walking behavior in an unstructured environment, despite having only an extremely small number of neurons (10^5 to 10^6). This suggests that not only the brain nervous system but also properties of the physical body, such as the morphological characteristics, play an essential role in generating such adaptive behavior. Our study aims at investigating the effect of body morphological characteristics on the walking performance in a robot model, which is designed to mimic an insect. To this end, we constructed an insect-like hexapod model in a simulation environment that implements a reflex-based intra-limb coordination control. 
Herein, for a set of walking parameters, which were optimized to maximize the energy efficiency at the target speed, we investigated the effects of changes in the standard posture of the two leg joints on the walking success rate for various initial conditions and cost of transport (CoT) as an index of energy efficiency. Simulation results indicated that robots with specific morphological characteristics similar to those of insects exhibited high gait stability and energetic efficiency. Because only the reflex-based control was employed, the inter-leg coordination occurred spontaneously, suggesting that our approach would lead to a useful design methodology from the perspective of computational cost in generating the walking locomotion." Efficient Learning of Locomotion Skills through the Discovery of Diverse Environmental Trajectory Generator Priors,"Shikha Surana, Bryan Wei Tern Lim, Antoine Cully",Imperial College London,Legged Robots,"Data-driven learning based methods have recently been particularly successful at learning robust locomotion controllers for a variety of unstructured terrains. Prior work has shown that incorporating good locomotion priors in the form of trajectory generators (TGs) is effective at efficiently learning complex locomotion skills. However, defining a good single TG as tasks and environments become increasingly more complex remains challenging requiring extensive tuning or risks reducing the effectiveness of the prior. In this paper, we present EETG, a method that learns a diverse set of specialised locomotion priors using Quality-Diversity algorithms while maintaining a single policy within the Policies Modulating TG (PMTG) architecture. The results demonstrate that EETG enables a quadruped robot to successfully traverse a wide range of environments, such as slopes, stairs, rough terrain, and balance beams. 
Our experiments show that learning a diverse set of specialized TG priors is significantly more efficient than using a single, fixed prior when dealing with a wide range of environments." Robust Locomotion on Legged Robots through Planning on Motion Primitive Graphs,"Wyatt Ubellacker, Aaron Ames",California Institute of Technology,Award Finalists 3,"The functional demands of robotic systems often require completing various tasks or behaviors under the effect of disturbances or uncertain environments. Of increasing interest is the autonomy for dynamic robots, such as multirotors, motor vehicles, and legged platforms. Here, disturbances and environmental conditions can have significant impact on the successful performance of the individual dynamic behaviors, referred to as ``motion primitives''. Despite this, robustness can be achieved by switching to and transitioning through suitable motion primitives. This paper contributes such a method by presenting an abstraction of the motion primitive dynamics and a corresponding ``motion primitive transfer function''. From this, a mixed discrete and continuous ``motion primitive graph'' is constructed, and an algorithm capable of online search of this graph is detailed. The result is a framework capable of realizing holistic robustness on dynamic systems. This is experimentally demonstrated for a set of motion primitives on a quadrupedal robot, subject to various environmental and intentional disturbances." Learning Arm-Assisted Fall Damage Reduction and Recovery for Legged Mobile Manipulators,"Yuntao Ma, Farbod Farshidian, Marco Hutter","ETH Zürich,ETH Zurich",Legged Robots,"Adaptive falling and recovery skills greatly extend the applicability of robot deployments. In the case of legged mobile manipulators, the robot arm could adaptively stop the fall and assist the recovery. 
Prior works on falling and recovery strategies for legged mobile manipulators usually rely on assumptions such as inelastic collisions and falling in defined directions to enable real-time computation. This paper presents a learning-based approach to reducing fall damage and recovery. An asymmetric actor-critic training structure is used to train a time-invariant policy with time-varying reward functions. In simulated experiments, the policy recovers from 98.9% of initial falling configurations. It reduces base contact impulse, peak joint internal forces, and base acceleration during the fall compared to the baseline methods. The trained control policy is deployed and extensively tested on the ALMA robot hardware. A video summarizing the proposed method and the hardware tests is available at https://youtu.be/avwg2HqGi8s." Hierarchical Adaptive Loco-Manipulation Control for Quadruped Robots,"Mohsen Sombolestan, Quan Nguyen",University of Southern California,Legged Robots,"Legged robots have shown remarkable advantages in navigating uneven terrain. However, realizing effective locomotion and manipulation tasks on quadruped robots is still challenging. In addition, object and terrain parameters are generally unknown to the robot in these problems. Therefore, this paper proposes a hierarchical adaptive control framework that enables legged robots to perform loco-manipulation tasks without any given assumption on the object's mass, the friction coefficient, or the slope of the terrain. In our approach, we first present an adaptive manipulation control to regulate the contact force to manipulate an unknown object on unknown terrain. We then introduce a unified model predictive control (MPC) for loco-manipulation that takes into account the manipulation force in our robot dynamics. The proposed MPC framework thus can effectively regulate the interaction force between the robot and the object while keeping the robot balance. 
Experimental validation of our proposed approach is successfully conducted on a Unitree A1 robot, allowing it to manipulate an unknown time-varying load up to 7 kg (60% of the robot's weight). Moreover, our framework enables fast adaptation to unknown slopes or different surfaces with different friction coefficients." Probabilistic Contact State Estimation for Legged Robots Using Inertial Information,"Michael Maravgakis, Despina-ekaterini Argiropoulos, Stylianos Piperakis, Panos Trahanias","Institute of Computer Science, Foundation for Research and Techn,(a) Institute of Computer Science Foundation for Research and T,Agility Robotics Inc,,Foundation for Research and Technology – Hellas (FORTH)",Legged Robots,"Legged robot navigation in unstructured and slippery terrains depends heavily on the ability to accurately identify the quality of contact between the robot's feet and the ground. Contact state estimation is regarded as a challenging problem and is typically addressed by exploiting force measurements, joint encoders and/or robot kinematics and dynamics. In contrast to most state of the art approaches, the current work introduces a novel probabilistic method for estimating the contact state based solely on proprioceptive sensing, as it is readily available by Inertial Measurement Units (IMUs) mounted on the robot's end effectors. Capitalizing on the uncertainty of IMU measurements, our method estimates the probability of stable contact. This is accomplished by approximating the multimodal probability density function over a batch of data points for each axis of the IMU with Kernel Density Estimation. The proposed method has been extensively assessed against both real and simulated scenarios on bipedal and quadrupedal robotic platforms such as ATLAS, TALOS and Unitree's GO1." 
Learning an Efficient Terrain Representation for Haptic Localization of a Legged Robot,"Damian Sójka, Michał Nowicki, Piotr Skrzypczynski",Poznan University of Technology,Legged Robots,"Although haptic sensing has recently been used for legged robot localization in extreme environments where a camera or LiDAR might fail, the problem of efficiently representing the haptic signatures in a learned prior map is still open. This paper introduces an approach to terrain representation for haptic localization inspired by recent trends in machine learning. It combines this approach with the proven Monte Carlo algorithm to obtain an accurate, computation-efficient, and practical method for localizing legged robots under adversarial environmental conditions. We apply the triplet loss concept to learn highly descriptive embeddings in a transformer-based neural network. As the training haptic data are not labeled, the positive and negative examples are discriminated by their geometric locations discovered while training. We demonstrate experimentally that the proposed approach outperforms by a large margin the previous solutions to haptic localization of legged robots concerning the accuracy, inference time, and the amount of data stored in the map. As far as we know, this is the first approach that completely removes the need to use a dense terrain map for accurate haptic localization, thus paving the way to practical applications." Event-Based Agile Object Catching with a Quadrupedal Robot,"Benedek Forrai, Takahiro Miki, Daniel Gehrig, Marco Hutter, Davide Scaramuzza","ETH Zürich,ETH Zurich,University of Zurich / ETH,University of Zurich",Legged Robots,"Quadrupedal robots are conquering search-and-rescue applications due to their capability to navigate challenging uneven terrains. Exteroceptive information greatly enhances this capability since perceiving their surroundings allows them to adapt their controller and thus achieve higher levels of robustness. 
However, sensors such as LiDARs and RGB cameras do not provide sufficient information to quickly and precisely react in a highly dynamic environment since they suffer from a bandwidth-latency tradeoff. They require significant bandwidth at high frame rates while featuring significant perceptual latency at lower frame rates, thereby limiting their versatility on resource constrained platforms. In this work, we tackle this problem by equipping our quadruped with an event camera, which does not suffer from this tradeoff due to its asynchronous and sparse operation. By leveraging the low latency of the events, we push the limits of quadruped agility and demonstrate high-speed ball catching with a net for the first time. We show that our quadruped equipped with an event camera can catch objects at maximum speeds of 15 m/s from 4 meters, with a success rate of 83%. With a VGA event camera, our method runs at 100 Hz on an NVIDIA Jetson Orin." Evaluation of Legged Robot Landing Capability under Aggressive Linear and Angular Velocities,"Keran Ye, Konstantinos Karydis","University of California, Riverside",Legged Robots,"This paper proposes a method to evaluate the capability of aggressive legged robot landing under significant touchdown linear and angular velocities upon impact. Our approach builds upon the Planar Inverted Pendulum with Flywheel (PIPF) model and introduces a landing framework for the first stance step on a non-dimensional basis. We develop a nonlinear framework with iterative constrained trajectory optimization to stabilize the first stance step prior to N-step Capturability analysis. Performance maps across many different initial conditions reveal approximately linear boundaries as well as the effect of inertia, body incidence angle and leg attacking angle on the boundary shape. 
Our method also yields the engineering insight that body inertia affects the performance map the most, hence its optimization can be prioritized when the target is to improve robot landing efficacy." Bipedal Robot Walking Control Using Human Whole-Body Dynamic Telelocomotion,"Guillermo Colin Navarro, Youngwoo Sim, Joao Ramos",University of Illinois at Urbana-Champaign,Humanoids and Bipedal Locomotion,"For humanoids to be deployed in demanding situations, such as search and rescue, highly intelligent decision making and proficient sensorimotor skill is expected. A promising solution is to leverage human prowess by interconnecting robot and human via teleoperation. Towards creating seamless operation, this paper presents a dynamic telelocomotion framework that synchronizes the gait of a human pilot with the walking of a bipedal robot. First, we introduce a method to generate a virtual human walking model from the stepping behavior of a human pilot which serves as a reference for the robot to walk. Second, the dynamics of the walking reference and robot walking are synchronized by applying forces to the human pilot and the robot to achieve dynamic similarity between the two systems. This enables the human pilot to continuously perceive and cancel any asynchrony between the walking reference and robot. A consistent step placement strategy for the robot is derived to maintain dynamic similarity through step transitions. Using our human-machine-interface, we demonstrate that the human pilot can achieve stable and synchronous teleoperation of a simulated robot through stepping-in-place, walking, and disturbance rejection experiments. This work provides a fundamental step towards transferring human intelligence and reflexes to humanoid robots." 
Foot Stepping Algorithm of Humanoids with Double Support Time Adjustment Based on Capture Point Control,"Myeong-Ju Kim, Daegyu Lim, Gyeongjae Park, Jaeheung Park",Seoul National University,Humanoids and Bipedal Locomotion,"Recently, foot stepping strategies of humanoid robots have been actively developed for robust balancing of humanoids against disturbances. In this paper, a novel stepping algorithm adjusting double support phase (DSP) time is proposed. First, the stepping algorithm is proposed based on a model predictive control (MPC) framework for capture point (CP) control and footstep adjustment. Next, when the remaining step time is not enough to adjust the footstep, the DSP scaling method brings the next swing phase forward by reducing the DSP time, which enables the robot to maintain the balance robustly. The robust balance control performance of the proposed method is validated through simulations and experiments when the robot is walking in the presence of external pushes. A more stable balancing performance is realized compared to state-of-the-art stepping controllers." Optimizing Bipedal Locomotion for the 100m Dash with Comparison to Human Running,"Devin Crowley, Jeremy Dao, Helei Duan, Kevin Green, Jonathan Hurst, Alan Fern",Oregon State University,Humanoids and Bipedal Locomotion,"In this paper, we explore the space of running gaits for the bipedal robot Cassie. Our first contribution is to present an approach for optimizing gait efficiency across a spectrum of speeds with the aim of enabling extremely high-speed running on hardware. This raises the question of how the resulting gaits compare to human running mechanics, which are known to be highly efficient in comparison to quadrupeds. Our second contribution is to conduct this comparison based on established human biomechanical studies. We find that despite morphological differences between Cassie and humans, key properties of the gaits are highly similar across a wide range of speeds. 
Finally, our third contribution is to integrate the optimized running gaits into a full controller that satisfies the rules of the real-world task of the 100m dash, including starting and stopping from a standing position. We demonstrate this controller on hardware to establish the Guinness World Record for Fastest 100m by a Bipedal Robot." Effect of the Dynamics of a Horizontally Wobbling Mass on Biped Walking Performance,"Tomoya Kamimura, Akihito Sano",Nagoya Institute of Technology,Humanoids and Bipedal Locomotion,"We have developed biped robots with a passive dynamic walking mechanism. This study proposes a compass model with a wobbling mass connected to the upper body and oscillating in the horizontal direction to clarify the influence of the horizontal dynamics of the upper body on bipedal walking. The limit cycles of the model were numerically searched, and their stability and energy efficiency was investigated. Several qualitatively different limit cycles were obtained depending mainly on the spring constant that supports the wobbling mass. Specific types of solutions decreased the stability while reducing the risk of accidental falling and improving the energy efficiency. The obtained results were attributed to the wobbling mass moving in the opposite direction to the upper body, thereby preventing large changes in acceleration and deceleration while walking. The relationship between the locomotion of the proposed model and the actual biped robot and human gaits was investigated." Robust Bipedal Locomotion: Leveraging Saltation Matrices for Gait Optimization,"Maegan Tucker, Noel Csomay-Shanklin, Aaron Ames","California Institute of Technology,Caltech",Humanoids and Bipedal Locomotion,"The ability to generate robust walking gaits on bipedal robots is key to their successful realization on hardware. 
To this end, this work extends the method of Hybrid Zero Dynamics (HZD) -- which traditionally only accounts for locomotive stability via periodicity constraints under perfect impact events -- through the inclusion of the saltation matrix with a view toward synthesizing robust walking gaits. By jointly minimizing the norm of the extended saltation matrix and the torque of the robot directly in the gait generation process, we demonstrate that the synthesized gaits are more robust than gaits generated with either term alone; these results are shown in simulation and on hardware for the AMBER-3M planar biped and the Atalante lower-body exoskeleton (both with and without a human subject). The end result is experimental validation that combining salt" Topology-Based MPC for Automatic Footstep Placement and Contact Surface Selection,"Jaehyun Shim, Carlos Mastalli, Thomas Corbères, Steve Tonneau, Vladimir Ivan, Sethu Vijayakumar","University of Edinburgh,Heriot-Watt University,LAAS-CNRS,The University of Edinburgh,Touchlab Limited",Humanoids and Bipedal Locomotion,"State-of-the-art approaches to footstep planning assume reduced-order dynamics when solving the combinatorial problem of selecting contact surfaces in real time. However, in exchange for computational efficiency, these approaches ignore joint torque limits and limb dynamics. In this work, we address these limitations by presenting a topology-based approach that enables model predictive control (MPC) to simultaneously plan full-body motions, torque commands, footstep placements, and contact surfaces in real time. To determine if a robot’s foot is inside a contact surface, we borrow the winding number concept from topology. We then use this winding number and potential field to create a contact-surface penalty function. By using this penalty function, MPC can select a contact surface from all candidate surfaces in the vicinity and determine footstep placements within it. 
We demonstrate the benefits of our approach by showing the impact of considering full-body dynamics, which includes joint torque limits and limb dynamics, on the selection of footstep placements and contact surfaces. Furthermore, we validate the feasibility of deploying our topology-based approach in an MPC scheme and explore its potential capabilities through a series of experimental and simulation trials." Online Non-Linear Centroidal MPC for Humanoid Robots Payload Carrying with Contact-Stable Force Parametrization,"Mohamed Elobaid, Giulio Romualdi, Gabriele Nava, Lorenzo Rapetti, Hosameldin Awadalla Omer Mohamed, Daniele Pucci","Fondazione Istituto Italiano di Tecnologia,Istituto Italiano di Tecnologia,IIT,Italian Institute of Technology",Humanoids and Bipedal Locomotion,"In this paper we consider the problem of allowing a humanoid robot that is subject to a persistent disturbance, in the form of a payload-carrying task, to follow given planned footsteps. To solve this problem, we combine an online nonlinear centroidal Model Predictive Controller - MPC with a contact stable force parametrization. The cost function of the MPC is augmented with terms handling the disturbance and regularizing the parameter. The performance of the resulting controller is validated both in simulations and on the humanoid robot iCub. Finally, the effect of using the parametrization on the computational time of the controller is briefly studied." Holistic View of Inverse Optimal Control by Introducing Projections on Singularity Curves,"Jessica Colombel, David Daney, François Charpillet","Université de Lorraine, CNRS, Inria, LORIA, F-,,,,, Nancy, Franc,Inria centre at the university of Bordeaux, F-,,,,, Talence, Fra",Humanoids and Bipedal Locomotion,"Inverse optimal control (IOC) is a framework used in many fields, especially in robotics and human motion analysis. In this context, various methods of resolution have been proposed in the literature. 
This article presents Projected Inverse Optimal Control (PIOC), an approach that offers a simple and comprehensive view of IOC methods. Especially, we explain how uncertainties can be properly addressed in our view. Thus, this article highlights how classical methods can be understood as projections of trajectories in the solution space of the underlying Direct Optimal Control (DOC) problem. This perspective allows for an examination of projections other than the classical methods, which can be fruitful for researchers in the field. As an example, we propose a projection that allows us to choose the underlying cost functions of an IOC problem from a set. The IOC's sub-problems are also addressed, such as modelling observed trajectories, noise measurement and the reliability of solutions obtained by IOC. Our proposal is supported by a simple and canonical example throughout the document." The Role of Symmetry in Constructing Geometric Flat Outputs for Free-Flying Robotic Systems,"Jake Welde, Matthew Kvalheim, Vijay Kumar","University of Pennsylvania,University of Michigan",Underactuated Systems,"Mechanical systems naturally evolve on principal bundles describing their inherent symmetries. The ensuing factorization of the configuration manifold into a symmetry group and an internal shape space has provided deep insights into the locomotion of many robotic and biological systems. On the other hand, the property of differential flatness has enabled efficient, effective planning and control algorithms for various robotic systems. Yet, a practical means of finding a flat output for an arbitrary robotic system remains an open question. In this work, we demonstrate surprising new connections between these two domains, for the first time employing symmetry directly to construct a flat output. We provide sufficient conditions for the existence of a trivialization of the bundle in which the group variables themselves are a flat output. 
We call this a geometric flat output, since it is equivariant (i.e. it preserves the symmetry) and often global or almost global, properties not typically enjoyed by other flat outputs. In such a trivialization, the motion planning problem is easily solved, since a given trajectory for the group variables will fully determine the trajectory for the shape variables that exactly achieves this motion. We provide a partial catalog of robotic systems with geometric flat outputs and worked examples for the planar rocket, planar aerial manipulator, and quadrotor." On the Learned Balance Manifold of Underactuated Balance Robots,"Feng Han, Jingang Yi",Rutgers University,Underactuated Systems,"Tracking control of underactuated balance robots needs to estimate balance profiles, that is, balance equilibrium manifold (BEM) of the unactuated subsystems. We present a learning-based approach to obtain the balance manifold for underactuated balance robots. We first establish the relationship between the BEM and the zero dynamics of the underactuated balance robots. The analysis shows that the BEM is a close approximation of the equilibria of the zero dynamics under perfectly tracking control. A Gaussian process learning-based method is proposed to estimate and obtain the BEM and zero dynamics, avoiding the direct inversion of the physics-based robot dynamic model. We demonstrate the analysis and applications experimentally on a rotary inverted pendulum and a bipedal robot." Controlling an Underactuated AUV As an Inverted Pendulum Using Nonlinear Model Predictive Control and Behavior Trees,"Sriharsha Bhat, Ivan Stenius","KTH Royal Institute of Technology,KTH",Underactuated Systems,"Agile and hydrobatic maneuvering capabilities can enhance AUV operations in increasingly challenging scenarios. In this paper, we explore the ability of an underactuated AUV to transition to and hold a pitch angle close to 90 degrees at a particular depth, like an inverted pendulum. 
Holding such an orientation can be valuable in observing a calving glacier, under-ice launch and recovery, underwater docking, inspecting vertical structures, and observing targets above the water surface. However, such control is challenging because of underactuation, rapid response times and varying stability in different configurations. To address this, a control policy is derived offline using nonlinear MPC in a high-fidelity simulation environment in Simulink. For real-time control, a hybrid controller using a behavior tree (BT) is developed based on the optimal MPC policy and applied on the AUV system. The BT controller considers Safety, Transit and Stabilize behaviors. The control algorithm is validated with simulations in Simulink and Stonefish-ROS as well as field experiments with the hydrobatic SAM AUV, showing repeatable performance in the inverted pendulum maneuver." Towards Exact Interaction Force Control for Underactuated Quadrupedal Systems with Orthogonal Projection and Quadratic Programming,"Shengzhi Wang, Xiangyu Chu, Samuel Au",The Chinese University of Hong Kong,Underactuated Systems,"Projected Inverse Dynamics Control (PIDC) is commonly used in robots subject to contact, especially in quadrupedal systems. Many methods based on such dynamics have been developed for quadrupedal locomotion tasks, and only a few works studied simple interactions between the robot and environment, such as pressing an E-stop button. To facilitate the interaction requiring exact force control for safety, we propose a novel interaction force control scheme for underactuated quadrupedal systems relying on projection techniques and Quadratic Programming (QP). This algorithm allows the robot to apply a desired interaction force to the environment without using force sensors while satisfying physical constraints and inducing minimal base motion. 
Unlike previous projection-based methods, the QP design uses two selection matrices in its hierarchical structure, facilitating the decoupling between force and motion control. The proposed algorithm is verified with a quadrupedal robot in a high-fidelity simulator. Compared to the QP designs without the strategy of using two selection matrices and the PIDC method for contact force control, our method provided more accurate contact force tracking performance with minimal base movement, paving the way to approach the exact interaction force control for underactuated quadrupedal systems." Reinforcement Learning for Laser Welding Speed Control Minimizing Bead Width Error,"Toshimitsu Kaneko, Gaku Minamoto, Yusuke Hirose, Tetsuo Sakai","Toshiba Corporation,TOSHIBA/RIKEN",Industrial Robotics and Automation,"In this paper, we propose a method for reinforcement learning-based laser welding control. Conventional methods apply standard reinforcement learning formulations to welding tasks, but we show that this formulation can minimize bead width or penetration depth errors only when the welding speed is constant. Therefore, conventional methods are suboptimal for training control parameters including the welding speed. The proposed method discounts future rewards with respect to the welding length instead of time steps to solve this issue. This is easily implemented by (1) modifying the discount factor used for $Q$-function updates in existing reinforcement learning algorithms and (2) using an appropriate reward function. Experimental results using simulators show that the proposed method achieves performance that is superior to conventional methods." 
Real-Time Model Predictive Control for Industrial Manipulators with Singularity-Tolerant Hierarchical Task Control,"Jaemin Lee, Mingyo Seo, Andrew Bylard, Zhouwen Sun, Luis Sentis","California Institute of Technology,The University of Texas at Austin,Stanford University,Dexterity Inc",Industrial Robotics and Automation,"This paper proposes a real-time model predictive control (MPC) strategy for accomplishing multiple tasks using robots within a finite-time horizon. In industrial robotic applications, it is crucial to consider various constraints to ensure that joint position, velocity, and torque limits are not exceeded. In addition, singularity-free and smooth motions require executing tasks continuously and safely. Instead of formulating nonlinear MPC problems, we devise linear MPC problems using kinematic and dynamic models linearized along nominal trajectories produced by hierarchical controllers. These linear MPC problems are solvable via the use of Quadratic Programming; therefore, we significantly reduce the computation time of the proposed MPC framework so the resulting update frequency is higher than 1 kHz. Our proposed MPC framework is more efficient in reducing task tracking errors than a baseline based on operational space control (OSC). We validate our approach in numerical simulations and in real experiments using an industrial manipulator. More specifically, we deploy our method in two practical scenarios for robotic logistics: 1) controlling a robot carrying heavy payloads while accounting for torque limits, and 2) controlling the end-effector while avoiding singularities." 
High-Speed High-Accuracy Spatial Curve Tracking Using Motion Primitives in Industrial Robots,"Honglu He, Chen-lung Lu, Yunshi Wen, Glenn Saunders, Pinghai Yang, Jeffrey Schoonover, John Wason, Agung Julius, John Wen","Rensselaer Polytechnic Institute,GE Research,Wason Technology, LLC",Industrial Robotics and Automation,"Industrial robots are increasingly deployed in applications requiring an end effector tool to closely track a specified path, such as in spraying and welding. Performance and productivity present possibly conflicting objectives: tracking accuracy, path speed, and motion uniformity. Industrial robots are programmed through motion primitives consisting of waypoints connected by pre-defined motion segments, with specified parameters such as path speed and blending zone. The actual executed robot motion depends on the robot joint servo controller and joint motion constraints (velocity, acceleration, etc.) which are largely unknown to the users. Programming a robot to achieve the desired performance today is time-consuming and mostly manual, requiring tuning a large number of coupled parameters in the motion primitives. The performance also depends on the choice of additional parameters: possible redundant degrees of freedom, location of the target curve, and the robot configuration. This paper presents a systematic approach to optimize the robot motion primitives for performance. The approach first selects the static parameters, then the motion primitives, and finally iteratively updates the waypoints to minimize the tracking error. The ultimate performance objective is to maximize the path speed subject to the tracking accuracy and speed uniformity constraints over the entire path. We have demonstrated the effectiveness of this approach both in simulation and physically for ABB and FANUC robots for two challenging example curves. 
Compared with the baseline using current industry practice, the optimized performance shows over 200% performance improvement." A New Robust Control Framework for Robot Manipulators without Velocity Measurements: A Modified Dual-Loop Control Scheme,"Hae Yeon Park, Jung Hoon Kim","POSTECH,Pohang University of Science and Technology",Industrial Robotics and Automation,"This paper proposes a new framework for the computed torque method (CTM) of robot manipulators without velocity measurements. We first introduce the Luenberger observer-based CTM with only position measurements. We then clarify that the external disturbance affects not only the tracking performances with respect to the plant but also the estimation accuracies relevant to the state observer. To address this problem, we establish a new architecture for the so-called dual-loop control scheme, by which both the tracking performances and estimation accuracies can be simultaneously improved, in contrast to its existing structure. A guideline for taking control parameters corresponding to the proposed control structure is also provided with respect to the stabilization of the overall closed-loop systems. Finally, simulation and experimental results are provided to demonstrate the validity and practical feasibility of the developed structure." "Optimal Workpiece Placement Based on Robot Reach, Manipulability and Joint Torques","Baris Balci, Jared Donovan, Jonathan Roberts, Peter Corke",Queensland University of Technology,Industrial Robotics and Automation,Workpiece placement with respect to an industrial robot plays an important role in robotic manufacturing due to its influence on the configuration-dependent properties of industrial robots. Suboptimal placements of the workpiece may increase the required joint torques and decrease the dexterity of the robot. 
The focus of this work is to identify an optimal workpiece pose that enables a robot to carry out surface finishing with configurations that require the lowest possible joint torques while having maximum possible manipulability. We present a non-linear optimization-based algorithm to solve this problem and demonstrate the algorithm's capability on different workpieces which we share to facilitate further research in this area. Experimental Workflow Implementation for Automatic Detection of Filament Deviation in 3D Robotic Printing Process,"Xinrui Yang, Othman Lakhal, Abdelkader Belarouci, Kamal Youcef-Toumi, Rochdi Merzouki","University of Lille,University Lille, CRIStAL, CNRS-UMR ,,,,,University of Lille - CRIStAL Lab,Massachusetts Institute of Technology,CRIStAL, CNRS UMR ,,,,, University of Lille,",Industrial Robotics and Automation,"Robotic 3D Concrete Printing (3DCP) is a process of additive manufacturing using building materials. The system that performs 3DCP is a complex system consisting of multiple parts that are independent of each other. However, conventional 3DCP workflows usually lack automatic monitoring of print quality which can be easily affected for various reasons. This paper proposes an integrated workflow of automatic detection of filament deviation in a 3DCP process. The deformation of the filament is adopted as the criterion for print quality evaluation. A Deep Learning-morphology-based filament width estimation method is developed, and a filament deviation detection algorithm with presence of parametric uncertainties is proposed. This workflow allows to detect width deviations in the printed filament by considering several parameters of the printing system. The integrated workflow is implemented and tested through on-site printing tests." 
Neuro-Adaptive Dynamic Control with Edge-Computing for Collaborative Digital Twin of an Industrial Robotic Manipulator,"Sumit Kumar Das, Mohammad Helal Uddin, Dan Popa, Sabur Hassan Baidya",University of Louisville,Industrial Robotics and Automation,"With the advancement of industrial manufacturing and an increase in introduction of robots in the workspace, the need for safe operation, communication and information sharing is paramount. The work presented here focuses on cyber-physical system integration through Digital Twin (DT) technology. Our novel DT architecture is based on a model-free Neuro-Adaptive controller (NAC), and an edge-computing scheme for scene monitoring. The NAC can account for varying robot dynamics in both real and virtual environments, and allows for the DT system to expand the realm of cyber-physical integration without expensive model tuning. The edge-computing device introduced in our architecture observes the robot's workspace from a distance with a wider field of view. This wide viewpoint enhances the detection and mitigation of any obstacles entering the robot's workspace during operation. We experimentally evaluated the performance of our proposed architecture by introducing dynamic obstacles during a pick-and-place task that both the physical robot and its digital twin had to avoid. Results show that the proposed DT architecture successfully integrates the novel controller and edge-computing elements and successfully performs the given navigation task. The results also show that NAC outperforms a PD controller with more than 70% improvement in joint tracking error between the physical and virtual robots. It was observed that the latency experienced while using NAC is about 48% lower than when the Proportional-Derivative (PD) controller was operational." 
Contact-Based Pose Estimation of Workpieces for Robotic Setups,"Yitaek Kim, Aljaz Kramberger, Anders Glent Buch, Christoffer Sloth",University of Southern Denmark,Industrial Robotics and Automation,"This paper presents a method for contact-based pose estimation of workpieces using a collaborative robot. The proposed pose estimation exploits positions and surface normal vectors along an arbitrary path on an object with known geometry, where surface normal vectors are estimated based on contact forces measured by the robot. When data is only available along a single path, it is difficult to find initial correspondences between source data (recorded points and normal vectors) and target data (CAD of an object); hence, a novel weighted incremental spatial search approach for generating correspondences based on point pair features is proposed. Subsequently, robust pose estimation is employed to reduce the effect of erroneous correspondences. The proposed pose estimation is verified in simulation on three paths on two objects and with different levels of noise on the source data to quantify the robustness of the algorithm. Finally, the method is experimentally validated to provide an average pose rotation and translation accuracy of 0.55° and 0.51 mm, respectively, when using the robust estimation cost function Geman-McClure." 
Local Layer Splitting: An Additive Manufacturing Method to Define the Mechanical Properties of Soft Pneumatic Actuators During Fabrication,"Brice Parilusyan, Marc Teyssier, Zacharie Guillaume, Thibault Charlet, Clément Duhart, Marcos Serrano","Léonard de Vinci Pôle Universitaire , Research Center,Saarland University, Saarland Informatics Campus,De Vinci Innovation Center, ESPCI, ENAC,École supérieur d’ingénierie Léonard de Vinci,Léonard de Vinci Pôle Universitaire, Research center, ,, ,,, Par,IRIT - University of Toulouse",Additive Manufacturing,"Additive manufacturing of silicone is increasingly being explored to complement the traditional molding fabrication technique for Soft Pneumatic Actuators (SPAs). However, the mechanical behavior of SPAs is defined by their 3D form, which leads to prioritizing the SPAs mechanical properties over their aspect. In this paper, we propose a novel SPA fabrication method where the mechanical properties of a silicone part are defined during the fabrication phase rather than the 3D modeling phase, leading to the object’s mechanical properties being independent of the object’s aspect. This novel SPA fabrication method, named Local Layer Splitting (LLS), consists of local modifications of the printing layer height to integrate stiffness variation, thus generating controlled mechanical deformation when pressured. We discovered that silicone printing layer height impacts the final stiffness of the material, and it could be used to program bending deformation to actuators during printing. We first characterize the effect of the layer height parameters on 3D-printed silicone stiffness with tensile tests. Then, we present a custom slicer we developed to generate G-codes with local layer height variations depending on the x and y positions. We then characterize the bending and force achievable by SPAs made with the LLS process and find that they match those of state-of-the-art SPAs. 
Finally, we present and discuss how the LLS method impacts the SPAs design by shifting the bending behavior integration from the SPAs 3D conception to their fabrication phase." Support Generation for Robot-Assisted 3D Printing with Curved Layers,"Tianyu Zhang, Yuming Huang, Piotr Tomasz Kukulski, Neelotpal Dutta, Guoxin Fang, Charlie C.l. Wang","The University of Manchester,University of Manchester",Additive Manufacturing,"Robot-assisted 3D printing has drawn a lot of attention by its capability to fabricate curved layers that are optimized according to different objectives. However, the support generation algorithm based on a fixed printing direction for planar layers cannot be directly applied for curved layers as the orientation of material accumulation is dynamically varied. In this paper, we propose a skeleton-based support generation method for robot-assisted 3D printing with curved layers. The support is represented as an implicit solid so that the problems of numerical robustness can be effectively avoided. The effectiveness of our algorithm is verified on a dual-material printing platform that consists of a robotic arm and a newly designed dual-material extruder. Experiments have been successfully conducted on our system to fabricate a variety of freeform models." Learning Deposition Policies for Fused Multi-Material 3D Printing,"Kang Liao, Thibault Tricard, Michal Piovarci, Hans-peter Seidel, Vahid Babaei","Beijing Jiaotong University,INRIA,Institute of Science and Technology Austria,Max Planck Institute for Informatics",Additive Manufacturing,"3D printing based on continuous deposition of materials, such as filament-based 3D printing, has seen widespread adoption thanks to its versatility in working with a wide range of materials. An important shortcoming of this type of technology is its limited multi-material capabilities. 
While there are simple hardware designs that enable multi-material printing in principle, the required software is heavily underdeveloped. A typical hardware design fuses together individual materials fed into a single chamber from multiple inlets before they are deposited. This design, however, introduces a time delay between the intended material mixture and its actual deposition. In this work, inspired by diverse path planning research in robotics, we show that this mechanical challenge can be addressed via improved printer control. We propose to formulate the search for optimal multi-material printing policies in a reinforcement learning setup. We put forward a simple numerical deposition model that takes into account the non-linear material mixing and delayed material deposition. To validate our system we focus on color fabrication, a problem known for its strict requirements for varying material mixtures at a high spatial frequency. We demonstrate that our learned control policy outperforms state-of-the-art hand-crafted algorithms." Transparent Objects: A Corner Case in Stereo Matching,"Zhiyuan Wu, Shuai Su, Qijun Chen, Rui Fan","Tongji University,Tongji University, China",Logistics,"Stereo matching is a common technique used in 3D perception, but transparent objects such as reflective and penetrable glass pose a challenge as their disparities are often estimated inaccurately. In this paper, we propose transparency-aware stereo (TA-Stereo), an effective solution to tackle this issue. TA-Stereo first utilizes a semantic segmentation or salient object detection network to identify transparent objects, and then homogenizes them to enable stereo matching algorithms to handle them as non-transparent objects. To validate the effectiveness of our proposed TA-Stereo strategy, we collect 260 images containing transparent objects from the KITTI Stereo 2012 and 2015 datasets and manually label pixel-level ground truth. 
We evaluate our strategy with six deep stereo networks and two types of transparent object detection methods. Our experiments demonstrate that TA-Stereo significantly improves the disparity accuracy of transparent objects. Our project webpage can be accessed at mias.group/TA-Stereo." D2NT: A High-Performing Depth-To-Normal Translator,"Yi Feng, Bohuan Xue, Ming Liu, Qijun Chen, Rui Fan","Tongji University,HKUST,Hong Kong University of Science and Technology",Logistics,"Surface normal holds significant importance in visual environmental perception, serving as a source of rich geometric information. However, the state-of-the-art (SoTA) surface normal estimators (SNEs) generally suffer from an unsatisfactory trade-off between efficiency and accuracy. To resolve this dilemma, this paper first presents a superfast depth-to-normal translator (D2NT), which can directly translate depth images into surface normal maps without calculating 3D coordinates. We then propose a discontinuity-aware gradient (DAG) filter, which adaptively generates gradient convolution kernels to improve depth gradient estimation. Finally, we propose a surface normal refinement module that can easily be integrated into any depth-to-normal SNEs, substantially improving the surface normal estimation accuracy. Our proposed algorithm demonstrates the best accuracy among all other existing real-time SNEs and achieves the SoTA trade-off between efficiency and accuracy." Security-Aware Reinforcement Learning under Linear Temporal Logic Specifications,"Bohan Cui, Keyi Zhu, Shaoyuan Li, Xiang Yin","Shanghai Jiao Tong University,Shanghai Jiao Tong Univ",Logistics,"In this paper, we investigate the problem of reinforcement learning under linear temporal logic (LTL) specifications for Markov decision processes (MDPs) with security constraints. We consider an outside passive intruder (observer) that can observe the external output behavior of the system through an output projection. 
We assume the secret of the system is a subset of the initial states. The security constraint requires that the observer can never infer for sure that the agent was released from a secret state. To solve the problem of shaping the reward for reinforcement learning to achieve the LTL task while ensuring security, we propose a standard approach here, which is based on the initial-state estimator and the limit deterministic Büchi automata. The approach is also evaluated by a case study for a robot moving example." Global Localization in Repetitive and Ambiguous Environments,"Zhenyu Wu, Wei Wang, Jun Zhang, Qiyang Lyu, Haoyuan Zhang, Danwei Wang",Nanyang Technological University,Logistics,"Accurate global localization is an essential ingredient for autonomous mobile robots (AMRs) operating in enclosed or partially enclosed repetitive environments (e.g., office corridors, industrial warehouses, transportation centers). In such environments, the Global Navigation Satellite System (GNSS) signals are unreliable or severely degraded. The highly ambiguous structures in such challenging scenarios would also lead the ordinary geometric feature-based LiDAR/visual localization methods to fail. The ambient magnetic field (MF) has exhibited high distinctiveness at different locations, which makes it a viable alternative for infrastructure-free AMR localization. However, little previous research has focused on the orientation-dependency and similar-sequential-route limitations of MF-based localization. Thus, this paper proposes a novel probabilistic global localization system with 2-D LiDAR and rotation-invariant magnetic field for AMRs operating in challenging repetitive and ambiguous environments. The proposed localization system mainly consists of: 1) Two-step Initialization: laser distance and MF sequence based matching, and 2) MF-based Pose Tracking: recursive multi-dimensional MF sequence based matching. 
Extensive experimental results demonstrate the advantageous localization performances of the proposed localization system over the existing methods." Grey-Box Learning of Adaptive Manipulation Primitives for Robotic Assembly,"Marco Braun, Sebastian Wrede",Bielefeld University,Assembly,"Autonomous learning of robotic manipulation tasks is a promising approach to reduce manual engineering effort and increase flexibility in the future of industrial manufacturing. Although a lot of research has been done, especially robotic assembly tasks requiring contact-rich compliant interaction remain a challenge for learning-based methods, since large amounts of interaction data are required. Incorporation of prior knowledge has long been seen as a possibility to make learning-based approaches tractable. The question is how can we enable process experts to encode their prior knowledge in grey-box models so that it can be used for learning robotic manipulation tasks? For that reason we propose a new grey-box learning approach, ""Adaptive Manipulation Primitives"" (AMP), introduced in this paper. AMPs combine compliant manipulation task specifications based on Manipulation Primitives Nets with Policy Gradient Reinforcement Learning. Our framework is evaluated in a real-world robotic assembly task. It is shown that learning to assemble industrial connector modules is possible with comparatively few real-world trials." Speeding up Assembly Sequence Planning through Learning Removability Probabilities,"Alexander Cebulla, Tamim Asfour, Torsten Kroeger","Karlsruhe Institute of Technology (KIT),Karlsruher Institut für Technologie (KIT)",Assembly,"Industry 4.0 facilitates a high number of product variants, posing significant challenges for modern manufacturing. One of them is the automatic creation of assembly sequences. This can be achieved with the assembly-by-disassembly (AbD) approach, which is currently highly inefficient. We aim at speeding up AbD by leveraging deep learning. 
AbD relies on iteratively testing parts for removal, which makes the order in which parts are tested highly relevant for the run-time. We optimize this order by training a graph neural network (GNN) based on the shape of parts and the shape of local part connections. For each part, it predicts a removability probability. We use these probabilities to optimize the order in which parts are tested for removal. This reduces the number of parts tested by approximately 64%--90%, depending on the tested product. Further improvements are achieved by combining our approach with bookkeeping, another approach for speeding up AbD. Finally, we separately analyze the impact of the parts and the connections on the removability probabilities predicted by the GNN. We found that most of the important information regarding a part's removability can be derived from its connections alone." Planning Assembly Sequence with Graph Transformer,"Lin Ma, Jiangtao Gong, Hao Xu, Hao Chen, Hao Zhao, Wenbing Huang, Guyue Zhou","Southwestern University of Finance and Economics,Tsinghua University,Qianzhi Technology,Qianzhi Technology Inc.,Renmin University of China",Assembly,"Assembly sequence planning (ASP) is the essential process for modern manufacturing, proven to be NP-complete; thus, its effective and efficient solution has been a challenge for researchers in the field. In this paper, we present a graph-transformer based framework for the ASP problem which is trained and demonstrated on a self-collected ASP database. The ASP database contains a self-collected set of LEGO models. The LEGO model is abstracted to a heterogeneous graph structure after a thorough analysis of the original structure and feature extraction. The ground truth assembly sequence is first generated by brute-force search and then adjusted manually to align with human rational habits. 
Based on this self-collected ASP dataset, we propose a heterogeneous graph-transformer framework to learn the latent rules for assembly planning. We evaluated the proposed framework in a series of experiments. The results show that the similarity of the predicted and ground truth sequences can reach 0.44, a medium correlation measured by Kendall’s τ. Meanwhile, we compared the different effects of node features and edge features and generated a feasible and reasonable assembly sequence as a benchmark for further research. Our data set and code will be available on GitHub personal homepage." CFVS: Coarse-To-Fine Visual Servoing for 6-DoF Object-Agnostic Peg-In-Hole Assembly,"Bo-Siang Lu, Tung-i Chen, Hsin-ying Lee, Winston Hsu",National Taiwan University,Assembly,"Robotic peg-in-hole assembly remains a challenging task due to its high accuracy demand. Previous work tends to simplify the problem by restricting the degrees of freedom of the end-effector, or limiting the distance between the target and the initial pose position, which prevents them from being deployed in real-world manufacturing. Thus, we present a Coarse-to-Fine Visual Servoing (CFVS) peg-in-hole method, achieving 6-DoF end-effector motion control based on 3D visual feedback. CFVS can handle arbitrary tilt angles and large initial alignment errors through a fast pose estimation before refinement. Furthermore, by introducing a confidence map to ignore the irrelevant contour of objects, CFVS is robust against noise and can deal with various targets beyond training data. Extensive experiments show CFVS outperforms state-of-the-art methods and obtains 100%, 91%, and 82% average success rates in 3-DoF, 4-DoF, and 6-DoF peg-in-hole, respectively." 
Probabilistic Rare-Event Verification for Temporal Logic Robot Tasks,"Guy Scher, Sadra Sadraddini, Hadas Kress-Gazit","Cornell University,MIT",Formal Methods,"We present a method for calculating the probability that a robot successfully performs a task described using Signal Temporal Logic (STL). We focus on cases where the failure probability is very small, hence a traditional Monte-Carlo method becomes inefficient due to the large number of samples required to observe failures. Using elliptical sliced sampling, normalizing flows, and Bayesian optimization, we develop an algorithm that, under mild assumptions, is applicable to black-box systems, and can be applied to uncertainty sources with non-Gaussian probabilities. We demonstrate the application of our method on multiple robot simulations." Safe Model-Based Control from Signal Temporal Logic Specifications Using Recurrent Neural Networks,"Wenliang Liu, Mirai Duintjer Tebbens Nishioka, Calin Belta","Boston University,Commonwealth School",Formal Methods,"We propose a policy search approach to learn controllers from specifications given as Signal Temporal Logic (STL) formulae. The system model, which is unknown but assumed to be an affine control system, is learned together with the control policy. The model is implemented as two feedforward neural networks (FNNs) - one for the drift, and one for the control directions. To capture the history dependency of STL specifications, we use a recurrent neural network (RNN) to implement the control policy. In contrast to prevalent model-free methods, the learning approach proposed here takes advantage of the learned model and is more efficient. We use control barrier functions (CBFs) with the learned model to improve the safety of the system. We validate our algorithm via simulations and experiments. The results show that our approach can satisfy the given specification within very few system runs, and can be used for on-line control." 
Temporal Logic Swarm Control with Splitting and Merging,"Gustavo Andres Cardona, Kevin Leahy, Cristian Ioan Vasile","Lehigh University,MIT Lincoln Laboratory",Formal Methods,"This paper presents an agent-agnostic framework to control swarms of robots tasked with temporal and logical missions expressed as Metric Temporal Logic (MTL) formulas. We consider agents that can receive global commands from a high-level planner but no inter-agent communication. Moreover, agents are grouped into sub-swarms whose number can vary over the mission time horizon due to splitting and merging. However, a strict upper bound on the maximum number of sub-swarms is imposed to ensure their safe operation in the environment. We propose a two-phase approach. In the first phase, we compute the trajectories of the sub-swarms, splitting, and merging actions using a Mixed Integer Linear Programming approach that ensures the satisfaction of the MTL specification with minimal swarm division over the mission time horizon. Moreover, it enforces the upper bound on the number of sub-swarms. In the second phase, splitting fractions for sub-swarms resulting from splitting actions are computed. A distributed randomized protocol with no inter-agent communication ensures agent assignments matching the splitting fractions. Finally, we show the operation and performance of the approach in simulations with multiple tasks that require swarm splitting or merging." Synthesizing Reactive Test Environments for Autonomous Systems: Testing Reach-Avoid Specifications with Multi-Commodity Flows,"Apurva Badithela, Josefine Graebener, Wyatt Ubellacker, Eric Mazumdar, Aaron Ames, Richard M. Murray","Caltech,California Institute of Technology",Formal Methods,"We study automated test generation for testing discrete decision-making modules in autonomous systems. 
Linear temporal logic is used to encode the system specification --- requirements of the system under test --- and the test specification, which is unknown to the system and describes the desired test behavior. The reactive test synthesis problem is to find constraints on system actions such that both the system and test specifications are satisfied. To do this, we first use the specifications and their corresponding Büchi automata to construct the specification product automaton. Second, a virtual product graph representing all possible test executions of the system is constructed from the transition system and the specification product automaton. The main result of this paper is framing the test synthesis problem as a multi-commodity network flow optimization. This optimization is used to derive reactive constraints on system actions, which constitute the test environment. The resulting test environment ensures that the system meets the test specification while also satisfying the system specification. We illustrate this framework in simulation using grid world examples and demonstrate it on hardware with the Unitree A1 quadruped, where we test dynamic locomotion behaviors reactively." HaPPArray: Haptic Pneumatic Pouch Array for Feedback in Hand-Held Robots,"Xiaolei Luo, Jui-Te Lin, Tania Morimoto",University of California San Diego,Haptics and Haptic Interfaces,"Haptic feedback can provide operators of hand-held robots with active guidance during challenging tasks and with critical information on environment interactions. Yet for such haptic feedback to be effective, it must be lightweight, capable of integration into a hand-held form factor, and capable of displaying easily discernible cues. We present the design and evaluation of HaPPArray — a haptic pneumatic pouch array — where the pneumatic pouches can be actuated alone or in sequence to provide information to the user. 
A 3x3 array of pouches was integrated into a handle, representative of an interface for a hand-held robot. When actuated individually, users were able to correctly identify the pouch being actuated with 86% accuracy, and when actuated in sequence, users were able to correctly identify the associated direction cue with 89% accuracy. These results, along with a demonstration of how the direction cues can be used for haptic guidance of a medical robot, suggest that HaPPArray can be an effective approach for providing haptic feedback for hand-held robots." Vis2Hap: Vision-Based Haptic Rendering by Cross-Modal Generation,"Guanqun Cao, Jiaqi Jiang, Ningtao Mao, Danushka Bollegala, Min Li, Shan Luo","University of Liverpool,King's College London,School of Design, University of Leeds,Xi'an Jiaotong University",Haptics and Haptic Interfaces,"To assist robots in teleoperation tasks, haptic rendering, which allows human operators to access a virtual touch feeling, has been developed in recent years. Most previous haptic rendering methods strongly rely on data collected by tactile sensors. However, tactile data is not widely available for robots due to their limited reachable space and the restrictions of tactile sensors. To eliminate the need for tactile data, in this paper we propose a novel method named Vis2Hap to generate haptic rendering from visual inputs that can be obtained from a distance without physical interaction. We take the surface texture of objects as key cues to be conveyed to the human operator. To this end, a generative model is designed to simulate the roughness and slipperiness of the object's surface. To embed haptic cues in Vis2Hap, we use height maps from tactile sensors and spectrograms from friction coefficients as the intermediate outputs of the generative model. Once Vis2Hap is trained, it can be used to generate height maps and spectrograms of new surface textures, from which a friction image can be obtained and displayed on a haptic display. 
The user study demonstrates that our proposed Vis2Hap method enables users to access a realistic haptic feeling similar to that of physical objects. The proposed vision-based haptic rendering has the potential to enhance human operators' perception of the remote environment and facilitate robotic manipulation." A Plug-In Weight-Shifting Module That Adds Emotional Expressiveness to Inanimate Objects in Handheld Interaction,"Yohei Noguchi, Yijie Guo, Fumihide Tanaka",University of Tsukuba,Haptics and Haptic Interfaces,"A plug-in weight-shifting module that can be inserted into a variety of objects is presented. The module is equipped with a movable weight inside its body. Three-dimensional weight shifts are presented by controlling one-dimensional translational and two-dimensional rotational movements. To explore the use case of this weight-shifting module, eight weight shift patterns expressing certain emotions were created through a workshop and a qualitative analysis. User tests, to which three different embodiments and scenarios were applied, examined the following three cases: the weight shift patterns were presented to the user by a) a stuffed toy-style robot that mediated human messaging, b) a cushion that made the user relax, and c) a container that enhanced the user's movie-watching experience. User interviews revealed the feasibility of the module and its weight shift patterns for the user's perception of emotions." 
Model-Mediated Teleoperation for Remote Haptic Texture Sharing: Initial Study of Online Texture Modeling and Rendering,"Mudassir Ibrahim Awan, Tatyana Ogay, Waseem Hassan, Dongbeom Ko, Sungjoo Kang, Seokhee Jeon","Kyung Hee university,Kyung Hee University,ETRI (Electronics and Telecommunications Research Institute),Electronics and Telecommunications Research Institute (ETRI)",Haptics and Haptic Interfaces,"While model-mediated teleoperation (MMT) is an effective alternative for ensuring both transparency and stability, its potential in transmitting surface haptic texture is not yet explored. This paper introduces the first MMT framework capable of sharing surface haptic texture. The slave side collects physical signals contributing to haptic texture perception, e.g., high frequency acceleration, and streams them to the master side. The master side uses the signals to build and update a local measurement-based texture simulation model that reflects the remote surface. At the same time, the master runs local simulation using the model, resulting in non-delayed, stable, and accurate feedback of texture. Considering that rendering haptic texture needs tougher real-time requirements, e.g., higher update rate and lower action-feedback latency, MMT can be a perfect platform for remote texture sharing. An initial proof-of-concept system supporting single and homogeneous surface is implemented and evaluated, demonstrating the potential of the approach." 
Using a Collaborative Robotic Arm As Human-Machine Interface: System Setup and Application to Pose Control Tasks,"Christian Braun, Ludwig Haide, Lars Fischer, Sean Kille, Balint Varga, Simon Rothfuß, Soeren Hohmann","Karlsruhe Institute of Technology (KIT),Karlsruhe Institute of Technology,Karlsruhe Institute of Technology (KIT), Campus South,Institute of Control Systems, Karlsruhe Institute of Technology",Haptics and Haptic Interfaces,"While robotic arms have been used in a vast range of application areas, so far no extensive reports on the utilization as human-machine interface exist. Compared to HMI devices from literature, the robotic arm used in this work (KUKA LBR iiwa 14 R820) features a relatively large workspace and is able to generate force and torque feedback that surpasses the capabilities of literature devices. We describe the setup allowing to use the robotic arm as HMI and analytically determine the optimal initial pose of it based on the manipulability measure of Yoshikawa. To demonstrate that the robotic arm is able to serve as HMI, we report on a comparative study with a state of the art haptic HMI featuring 20 participants. Additionally, two applications from the context of planetary exploration are presented: The first considers the teleoperation of the pan-tilt unit of a lightweight rover unit and illustrates how the large workspace of the HMI benefits the precision of the teleoperation compared to a setup with a smaller workspace. The second experiment showcases the use of the force feedback of the HMI to enable a cooperation between the operator and a supporting path-following automation in a shared control of a simulated ground robot. Both the study and the applications highlight the performance, precision and reliability of our proposed system." 
Disturbance Observer Based Contact Detection for Motorized Hydraulic Actuators,"Chunpeng Wang, John Peter Whitney",Northeastern University,Haptics and Haptic Interfaces,"Contact detection without endpoint tactile sensing is challenging; friction and inertia obscure the sensing of low amplitude and high frequency forces. In this work we explore fluidic transmissions as series-elastic actuators, coupled to remotely-located direct-drive brushless motors, in a bid to maximize low-impedance sensitivity to contact while maintaining high bandwidth. We employ a disturbance observer to remove motor friction and further reduce minimum impedance. Using a 2-DOF remotely-actuated hydraulically-coupled robotic gripper, we demonstrate a maximum endpoint Z-width of 40dB and a robust contact detection threshold of 0.2N, without endpoint tactile sensing or joint position sensing. These results enable wiring-free and joint sensor-free arm and end-effector design, which are of particular interest for human-robot interaction, harsh-environment, magnetically-sensitive, and low-cost robotic manipulators that must maintain high bandwidth and high contact sensitivity." A Framework for Active Haptic Guidance Using Robotic Haptic Proxies,"Niall L. Williams, Nicholas Rewkowski, Jiasheng Li, Ming C. Lin","University of Maryland, College Park,UMD College Park,University of Maryland at College Park",Haptics and Haptic Interfaces,"Haptic feedback is an important component of creating an immersive mixed reality experience. Traditionally, haptic forces are rendered in response to the user's interactions with the virtual environment. In this work, we explore the idea of rendering haptic forces in a proactive manner, with the explicit intention to influence the user's behavior through compelling haptic forces. 
To this end, we present a framework for active haptic guidance in mixed reality, using one or more robotic haptic proxies to influence user behavior and deliver a safer and more immersive virtual experience. We provide details on common challenges that need to be overcome when implementing active haptic guidance, and discuss example applications that show how active haptic guidance can be used to influence the user's behavior. Finally, we apply active haptic guidance to a virtual reality navigation problem, and conduct a user study that demonstrates how active haptic guidance creates a safer and more immersive experience for users." An Optimized Portable Cable-Driven Haptic Robot Enables Free Motion and Hard Contact,"Changqi Zhang, Cui Wang, Qingkai Yang, Mingming Zhang","Southern University of Science and Technology,Southern University of Science And Technology",Haptics and Haptic Interfaces,"Task-oriented training with haptic rendering can boost robot-aided motor learning to tasks with similar dynamics. Although multi-DOF robots better match the rendering of real task scenarios, single-DOF haptic robots show great potential for home use with enhanced task rendering performance. This study presents our attempts to optimize and develop a single-DOF cable-driven robot with appropriate workspace and force rendering capacity. The core technologies consist of two aspects: 1) a multi-objective optimization method was adopted to obtain optimal configuration of the haptic robot; and 2) a slider-crank-mechanism-based portable cable-driven robot was developed. Performance evaluation experiments demonstrated that 1) the robot has a workspace larger than 300 mm; 2) the robot can achieve 40 N force output and 40 N·mm^(-1) stiffness for hard contact; 3) the root mean square of the resistance during free motion is 0.93 N; 4) in the purely passive case (without motor compensation), the average resistance to back drive the motor is 2.5 N. 
These lead us to believe that the developed robot holds the promise to serve as a robotic rehabilitation training platform for home use by neurologically impaired patients." Enable Natural Tactile Interaction for Robot Dog Based on Large-Format Distributed Flexible Pressure Sensors,"Lishuang Zhan, Yancheng Cao, Qitai Chen, Haole Guo, Jiasi Gao, Yiyue Luo, Shihui Guo, Guyue Zhou, Jiangtao Gong","Xiamen University,Institute for AI Industry Research (AIR), Tsinghua University, C,Guangzhou Maritime University,Tsinghua University,Massachusetts Institute of Technology",Haptics and Haptic Interfaces,"Touch is an important channel for human-robot interaction, while it is challenging for robots to recognize human touch accurately and make appropriate responses. In this paper, we design and implement a set of large-format distributed flexible pressure sensors on a robot dog to enable natural human-robot tactile interaction. Through a heuristic study, we sorted out 81 tactile gestures commonly used when humans interact with real dogs and 44 dog reactions. A gesture classification algorithm based on ResNet is proposed to recognize these 81 human gestures, and the classification accuracy reaches 98.7%. In addition, an action prediction algorithm based on Transformer is proposed to predict dog actions from human gestures, reaching a 1-gram BLEU score of 0.87. Finally, we compare the tactile interaction with the voice interaction during a free human-robot-dog interactive playing study. The results show that tactile interaction plays a more significant role in alleviating user anxiety, stimulating user excitement and improving the acceptability of robot dogs." 
Multi-Modal Interactive Perception in Human Control of Complex Objects,"Rashida Nayeem, Salah Bazzi, Mohsen Sadeghi, Reza Sharif Razavian, Dagmar Sternad",Northeastern University,Haptics and Haptic Interfaces,"Tactile sensing has been increasingly utilized in robot control of unknown objects to infer physical properties and optimize manipulation. However, there is limited understanding about the contribution of different sensory modalities during interactive perception in complex interaction both in robots and in humans. This study investigated the effect of visual and haptic information on humans' exploratory interactions with a 'cup of coffee', an object with nonlinear internal dynamics. Subjects were instructed to rhythmically transport a virtual cup with a rolling ball inside between two targets at a specified frequency, using a robotic interface. The cup and targets were displayed on a screen, and force feedback from the cup-and-ball dynamics was provided via the robotic manipulandum. Subjects were encouraged to explore and prepare the dynamics by ""shaking"" the cup-and-ball system to find the best initial conditions prior to the task. Two groups of subjects received the full haptic feedback about the cup-and-ball movement during the task; however, for one group the ball movement was visually occluded. Visual information about the ball movement had two distinctive effects on the performance: it reduced preparation time needed to understand the dynamics and, importantly, it led to simpler, more linear input-output interactions between hand and object. The results highlight how visual and haptic information regarding nonlinear internal dynamics have distinct roles for the interactive perception of complex objects." 
Soft Sensing Skins for Arbitrary Objects: An Automatic Framework,"Sonja Groß, Diego Xavier Hidalgo Carvajal, Silija Breimann, Nicolai Stein, Amartya Ganguly, Abdeldjallil Naceri, Sami Haddadin","Technical University of Munich,Technical University Munich,Technische Universität München",Haptics and Haptic Interfaces,"Tactile sensing is increasingly applied in various fields such as robotics, human-object interaction, and ergonomics. With the recent surge in customizing soft tactile skin for diverse applications, the need to automate design and manufacturing processes based on user requirements is steadily growing. In this work, we propose a partially automated framework, in which silicone-based, skin-like sensors are designed and customized for arbitrary objects. We evaluate the performance of stretch and contact sensors with custom sensor patterns on complex surfaces, which are tested in position control and manipulation scenarios. The results of our study serve as a proof of concept that such skin-like sensors can be fabricated effectively in an automated framework to be used in different environments and applications." Error-Domain Conservativity Control to Transparently Increase the Stability Range of Time-Discretized Controllers,"Michael Rothammer, Jee-Hwan Ryu","TUM, Munich,Korea Advanced Institute of Science and Technology",Haptics and Haptic Interfaces,"Time-discretization introduces an explicit time dependency for control laws that were originally designed to depend exclusively on an error variable: At different times, the control actions at the same error value might differ. Integrating the control action over the error reveals that this time dependency translates into the energy. It can directly cause active behavior when energy values at given error values decrease over time, potentially destabilizing the system. In this work, we aim to prevent energy values at given error values from decreasing over time. 
To this end, energies are recorded when error values are encountered for the first time. Linear interpolation of the recorded energy values provides a lower limit for energy as a function of the error value. This limit is enforced using an adaptive damping. The main contributions of this work include increasing the stability range with minimal amplitude control modifications, while promoting a symmetric behavior of control actions and energy. The approach's characteristics are shown in simulation and validated in experiments." A Digital Twin for Teleoperation of Vehicles in Urban Environments,"Philipp Kremer, Navid Nourani-Vatani, Sangyoung Park","Technische Universität Berlin,Imperium Drive Ltd,Technical University of Berlin",Teleoperation,"Teleoperated driving (ToD) is increasingly considered as a fallback solution for autonomous driving. As of now, ToD requires a very reliable high-throughput network capable of transmitting multiple video streams with low latency. Recently, significant progress has been made in vehicular sensors and perception algorithms, which, we believe, has huge implications for ToD. We envisage that a real-time digital twin that tracks the remote vehicle’s environment will play a crucial role in reducing the required communication bandwidth and providing a better teleoperator interface. Furthermore, it would allow various degrees of cooperation between automated driving functionalities and human teleoperators. In this paper, the concept of digital twin for ToD is outlined and a proof of concept is implemented using a real-world vehicle simulator and a teleoperator hardware setup. A significant reduction in required bandwidth is reported by transmitting less video data and reconstructing the scene from the digital twin." 
WE-Filter: Adaptive Acceptance Criteria for Filter-Based Shared Autonomy,"Michael Bowman, Xiaoli Zhang",Colorado School of Mines,Teleoperation,"Filter-based shared control aims to accept and augment an operator's ability to control a robot. Current solutions accept actions based on their direction aligning with the robot's optimal policy. These strategies reject a human’s small corrective actions if they conflict with the robot’s direction and accept too aggressive actions as long as they are consistent with the robot's direction. Such strategies may cause task failures and the operator’s feeling of loss of control. To close the gap, we propose WE-Filter, which has flexible, adaptive criteria allowing the operator’s small corrective actions and tempering too aggressive ones. Inspired by classical work-energy impact problems between two dynamic, interactive bodies, both inputs' properties (direction and magnitude) are inherently considered, creating intuitive, adaptive bounds to accept sensible actions. The model identifies behaviors before and after impact. The rationale is that each timestep of shared control acts as an impact between the operator’s and the robot’s policies, where post-impact behaviors depend on their previous behaviors. As time continues, a series of impacts occur. The aim is to minimize impacts that occur to reach an agreement faster and reduce strong reactionary behaviors. Our model determines flexible acceptance criteria to bound a mismatch of magnitude and finds a replacement action for conflicting policies. The WE-Filter achieves better task performance, the ratio of accepted actions, and action similarity than the existing methods." 
Monocular Reactive Collision Avoidance for MAV Teleoperation with Deep Reinforcement Learning,"Raffaele Brilli, Marco Legittimo, Francesco Crocetti, Mirko Leomanni, Mario Luca Fravolini, Gabriele Costante",University of Perugia,Teleoperation,"Enabling Micro Aerial Vehicles (MAVs) with semiautonomous capabilities to assist their teleoperation is crucial in several applications. Remote human operators do not have, in general, the situational awareness to perceive obstacles near the drone, nor the readiness to provide commands to avoid collisions. In this work, we devise a novel teleoperation setting that asks the operator to provide a simple high-level signal encoding the speed and the direction they expect the drone to follow. We then endow the MAV with an end-to-end Deep Reinforcement Learning (DRL) model that computes control commands to track the desired trajectory while performing collision avoidance. Differently from State-of-the-Art (SotA) works, it allows the robot to move freely in the 3D space, requires only the current RGB image captured by a monocular camera and the current robot position, and does not make any assumption about obstacle shape and size. We prove the effectiveness and the generalization capabilities of our strategy by comparing it against a SotA baseline in photorealistic simulated environments." HAT: Head-Worn Assistive Teleoperation of Mobile Manipulators,"Akhil Padmanabha, Qin Wang, Daphne Han, Jashkumar Rasikbhai Diyora, Kriti Kacker, Hamza Khaild, Liang-jung Chen, Carmel Majidi, Zackory Erickson",Carnegie Mellon University,Teleoperation,"Mobile manipulators in the home can provide increased autonomy to individuals with severe motor impairments, who often cannot complete activities of daily living (ADLs) without the help of a caregiver. 
Teleoperation of an assistive mobile manipulator could enable an individual with motor impairments to independently perform self-care and household tasks, yet limited motor function can impede one's ability to interface with a robot. In this work, we present a unique inertial-based wearable assistive interface, embedded in a familiar head-worn garment, for individuals with severe motor impairments to teleoperate and perform physical tasks with a mobile manipulator. We evaluate this wearable interface with both able-bodied (N = 16) and individuals with motor impairments (N = 2) for performing ADLs and everyday household tasks. Our results show that the wearable interface enabled participants to complete physical tasks with low error rates, high perceived ease of use, and low workload measures. Overall, this inertial-based wearable serves as a new assistive interface option for control of mobile manipulators in the home." DenseTact 2.0: Optical Tactile Sensor for Shape and Force Reconstruction,"Won Kyung Do, Bianca Jurewicz, Monroe Kennedy",Stanford University,Force and Tactile Sensing II,"Collaborative robots stand to have an immense impact on both human welfare in domestic service applications and industrial superiority in advanced manufacturing with dexterous assembly. The outstanding challenge is providing robotic fingertips with a physical design that makes them adept at performing dexterous tasks that require high-resolution, calibrated shape reconstruction and force sensing. In this work, we present DenseTact 2.0, an optical-tactile sensor capable of visualizing the deformed surface of a soft fingertip and using that image in a neural network to perform both calibrated shape reconstruction and 6-axis wrench estimation. 
We demonstrate the sensor accuracy of 0.3633mm per pixel for shape reconstruction, 0.410N for forces, 0.387Nmm for torques, and the ability to calibrate new fingers through transfer learning, which achieves comparable performance with only 12% of the non-transfer learning dataset size." SonicFinger: Pre-Touch and Contact Detection Tactile Sensor for Reactive Pregrasping,"Siddharth Rupavatharam, Caleb Escobedo, Daewon Lee, Colin Prepscius, Lawrence Jackel, Richard Howard, Volkan Isler","Samsung AI Center,University of Colorado - Boulder,Samsung AI Center New York,Samsung,North-C Technologies Inc,University of Minnesota",Force and Tactile Sensing II,"Sensing systems with proximity detection and contact sensing capabilities can reactively reshape their grasp. Reactive preshaping helps align objects to ensure successful grasps. In this work, we introduce SonicFinger, a sensing system capable of full-surface pre-touch and contact sensing. A single piezoelectric transducer embedded within a novel 3D printed finger is used to create an acoustic aura encompassing the finger. The acoustic aura enables pre-touch sensing, and gripper alignment, while changes in finger-transducer acoustic coupling indicate contact. SonicFinger is low-cost, compact, and easy to manufacture and assemble. The system is evaluated using a set of objects with various physical properties such as optical reflectivity, dielectric constants, mechanical properties, and acoustic absorption. A dataset with over 8,000 proximity and contact events is collected and evaluated. Our system shows a pre-touch detection true positive rate (TPR) of 92.4% and a true negative rate (TNR) of 95.3%. Contact detection experiments show a TPR of 93.7% and a TNR of 98.7%. Furthermore, pre-touch detection information from SonicFinger is used to adjust the robot gripper's pose to align a target object at the center of both fingers." 
Simultaneous Tactile Estimation and Control of Extrinsic Contact,"Sangwoon Kim, Devesh Jha, Diego Romeres, Parag Patre, Alberto Rodriguez","Massachusetts Institute of Technology,Mitsubishi Electric Research Laboratories,Mitsubishi Electric research laboratories,University of Florida",Force and Tactile Sensing II,"We propose a method that simultaneously estimates and controls extrinsic contact with tactile feedback. The method enables challenging manipulation tasks that require controlling light forces and accurate motions in contact, such as balancing an unknown object on a thin rod standing upright. A factor graph-based framework fuses a sequence of tactile and kinematic measurements to estimate and control the interaction between gripper-object-environment, including the location and wrench at the extrinsic contact between the grasped object and the environment and the grasp wrench transferred from the gripper to the object. The same framework simultaneously plans the gripper motions that make it possible to estimate the state while satisfying regularizing control objectives to prevent slip, such as minimizing the grasp wrench and minimizing frictional force at the extrinsic contact. We show results with sub-millimeter contact localization error and good slip prevention even on slippery environments, for multiple contact formations (point, line, patch contact) and transitions between them. See supplementary video and results at https://sites.google.com/view/sim-tact." 
A Miniaturised Camera-Based Multi-Modal Tactile Sensor,"Kaspar Althoefer, Yonggen Ling, Wanlin Li, Xinyuan Qian, Wang Wei Lee, Peng Qi","Queen Mary University of London,Tencent,Beijing Institute for General Artificial Intelligence (BIGAI),University of Science and Technology Beijing,Tongji University",Force and Tactile Sensing II,"In conjunction with huge recent progress in camera and computer vision technology, camera-based sensors have increasingly shown considerable promise in relation to tactile sensing. In comparison to competing technologies (be they resistive, capacitive or magnetic based), they offer super-high-resolution, while suffering from fewer wiring problems. The human tactile system is composed of various types of mechanoreceptors, each able to perceive and process distinct information such as force, pressure, texture, etc. Camera-based tactile sensors such as GelSight mainly focus on high-resolution geometric sensing on a flat surface, and their force measurement capabilities are limited by the hysteresis of the silicone material. In this paper, we present a miniaturised dome-shaped camera-based tactile sensor that allows accurate force and tactile sensing in a single coherent system. We demonstrate how to build a smooth silicone hemispheric sensing medium with uniform markers on its curved surface. We also enhance the illumination of the rounded silicone with diffused LEDs and construct a force-sensitive mechanical structure within the body of the sensor to accurately perceive forces. Our multi-modal sensor is able to acquire tactile information from multi-axis forces, local force distribution, and contact geometry, all in real-time. We apply an end-to-end deep learning method to process all the information." 
Neural Contact Fields: Tracking Extrinsic Contact with Tactile Sensing,"Carolina Higuera, Siyuan Dong, Byron Boots, Mustafa Mukadam","University of Washington,MIT,Facebook AI Research",Force and Tactile Sensing II,"We present Neural Contact Fields, a method that brings together neural fields and tactile sensing to address the problem of tracking extrinsic contact between object and environment. Knowing where the external contact occurs is a first step towards methods that can actively control it in facilitating downstream manipulation tasks. Prior work for localizing environmental contacts typically assumes a contact type (e.g. point or line), does not capture contact/no-contact transitions, and only works with basic geometric-shaped objects. Neural Contact Fields are the first method that can track arbitrary multi-modal extrinsic contacts without making any assumptions about the contact type. Our key insight is to estimate the probability of contact for any 3D point in the latent space of object’s shapes, given vision-based tactile inputs that sense the local motion resulting from the external contact. In experiments, we find that Neural Contact Fields are able to localize multiple contact patches without making any assumptions about the geometry of the contact, and capture contact/no-contact transitions for known categories of objects with unseen shapes in unseen environment configurations. In addition to Neural Contact Fields, we also release our YCB-Extrinsic-Contact dataset of simulated extrinsic contact interactions to enable further research in this area. 
Project page: https://github.com/carolinahiguera/NCF" Estimating Tactile Models of Heterogeneous Deformable Objects in Real Time,"Shaoxiong Yao, Kris Hauser","University of Illinois Urbana-Champaign,University of Illinois at Urbana-Champaign",Force and Tactile Sensing II,"This paper introduces a method for learning the force response of heterogeneous, deformable objects directly from robot sensor data without prior knowledge. The method estimates an object's force response given robot force or torque measurements using a novel volumetric stiffness field representation and point-based contact simulator. The stiffness of each point colliding with the robot is estimated independently and is updated upon each observed measurement using a projected diagonal Kalman filter. Experiments show that this method can update a stiffness field over 100,000 points at 23Hz or higher, and is more accurate than learning-based methods in predicting robot torque readings while touching artificial plants. The method can also be augmented with visual information to help extrapolate stiffness fields to distant parts of the touched object using only a small number of touches." Tactile Identification of Object Shapes Via In-Hand Manipulation with a Minimalistic Barometric Tactile Sensor Array,"Xin Zhou, Ad Spiers",Imperial College London,Force and Tactile Sensing II,"With the goal of providing an alternative to optical and other tactile sensors, we set out to stress test the object shape identification capabilities of barometric tactile arrays in robotic manipulation tasks. These sensors are superior to optical devices in terms of form factor, ease of fabrication, and data reading/processing speeds, but lack the necessary spatial resolution to identify surface shapes via a single contact. To compensate, we utilize in-hand-manipulation, specifically in-hand-rolling to identify object shapes via a spatiotemporal approach. 
To increase task difficulty, we only use three neighboring barometric sensors and designed strict experiment requirements with the purpose of creating a set of extremely confusable test objects. The E-TRoll robotic hand, equipped with a barometric tactile array on one finger, was used to roll test objects within its grasp, taking just under 3.4 seconds for data collection under the fastest tested speed setting, compared to 33 seconds in our previous work. We also designed and implemented a feature extraction algorithm, based and improved upon our recently published algorithm. This captures enough information from the collected spatiotemporal data samples for successful classification with only 13 features. Finally, a bagged tree classification algorithm was trained and optimized with data from 1,164 trials of rolling 9 prismatic test objects, leading to a five-fold cross validation accuracy of 90.5% for identifying the 9 object classes." Tactile Tool Manipulation,"Yuki Shirai, Devesh Jha, Arvind Raghunathan, Dennis Hong","University of California, Los Angeles,Mitsubishi Electric Research Laboratories,UCLA",Force and Tactile Sensing II,"Humans can effortlessly perform very complex, dexterous manipulation tasks by reacting to sensor observations. Robots, on the other hand, can not perform reactive manipulation and they mostly operate in open-loop while interacting with their environment. Consequently, the current manipulation algorithms either are very inefficient in performance or can only work in highly structured environments. In this paper, we present closed-loop control of a complex manipulation task where a robot uses a tool to interact with objects. Manipulation using a tool leads to complex kinematics and contact constraints that need to be satisfied for generating feasible manipulation trajectories. We first present an open-loop controller design for tool manipulation using Non-Linear Programming (NLP) that satisfies these constraints. 
In order to design a closed-loop controller, we present a pose estimator of objects and tools using tactile sensors. Using our tactile estimator, we design a closed-loop controller based on MPC. The proposed algorithm is verified using a 6 DoF manipulator on tasks using a variety of objects and tools. We verify that our closed-loop controller can successfully perform tool manipulation under several unexpected contacts." Preliminary Evaluation of a Wearable Thruster for Arresting Backwards Falls,"Michael Finn-henry, Jose Leonardo Brenes, Almaskhan Baimyshev, Michael Goldfarb","Vanderbilt,Vanderbilt University",Rehabilitation and Augmentation II,"This paper presents preliminary results assessing the efficacy of a backpack-worn cold-gas thruster to potentially arrest impending backwards falls. Specifically, a nitrogen-based cold gas thruster system was integrated into a backpack-worn prototype device, and experiments were conducted to assess the effect of the wearable device on backwards falls. Although the device is eventually intended for individuals at fall risk, these preliminary experiments were conducted on three healthy subjects. The experiments compared each subject’s ability to recover from an impending fall with and without assistance from the thruster. Results suggest that the likelihood of a fall was substantially reduced with the thruster assistance." A Method for Selecting Stumble Recovery Response in a Knee Exoskeleton,"Maura Eveld, Shane King, Karl Zelik, Michael Goldfarb","University of Twente,Vanderbilt University",Rehabilitation and Augmentation II,"Powered lower-limb exoskeletons have been shown to assist and augment walking, but most such devices do not currently have the ability to explicitly accommodate a stumble perturbation. 
A major challenge in doing so is identifying a stumble event and selecting in real-time which recovery strategy (elevating or lowering) to employ, particularly since the exoskeleton should ideally select the same strategy selected by the user. In order to do so, the authors conducted experiments involving five young, healthy adults wearing a knee exoskeleton. Each participant underwent a stumble experiment in order to collect an exoskeleton sensor dataset of stumbles throughout swing phase, which was used for stumble detection and recovery strategy identification algorithm development and testing. Overall, the proposed detection and identification algorithms provide improved accuracy with fewer required sensors relative to previous works, and were tested on the largest exoskeleton sensor stumble dataset to date, showing the feasibility of such algorithms for real-time implementation, which is an essential first step in developing lower-limb assistive devices that are robust to stumbles." A Dual-Arm Participated Human-Robot Collaboration Method for Upper Limb Rehabilitation of Hemiplegic Patients,"Lufeng Chen, Jing Qiu, Xuan Zou, Hong Cheng","University of Electronic Science and Technology of China,University of Electronic Science and Technology",Rehabilitation and Augmentation II,"Upper limb rehabilitation robots are mainly used as a physical therapy method to passively or actively train the affected side. However, they are rarely implemented in accordance with the occupational therapy theory, which is dedicated to improving the sensorimotor coordination of hemiplegic patients by considering both healthy and affected limbs. To realize the occupational therapy concept in robot-assisted upper limb rehabilitation, we propose a new human-robot collaboration framework for hemiplegic patients that integrates healthy/affected limbs and robot. 
The strategy aims at achieving patient-specific movement capabilities and improving the participation of the affected limb during rehabilitation. To accomplish this task, we have addressed two essential issues: accurate motion estimation of the healthy limb and the rehabilitation trajectory learning technique. The posture estimation is achieved by introducing the calibration model to reduce static and time dependent errors during the measurement. We also introduce a force term to the conventional imitation learning method to improve the adaptability in integrating the affected side in cooperation with the robot. Various experiments have been conducted to validate the feasibility and effectiveness of our proposed dual-arm collaboration strategy." A Force-Sensitive Exoskeleton for Teleoperation: An Application in Elderly Care Robotics,"Alexander Toedtheide, Xiao Chen, Hamid Sadeghian, Abdeldjallil Naceri, Sami Haddadin",Technical University of Munich,Rehabilitation and Augmentation II,"With the increasing demand for new healthcare solutions and technologies, such as those resulting from the COVID-19 crisis, and the growing elderly population, exoskeletons for teleoperation are a promising solution for many future medical applications. In this context, we propose two force-sensitive upper-limb exoskeletons for teleoperation, that are characterized by: i) torque-controlled robotic actuators, ii) rigid-body model compensations, and iii) a lightweight design achieved through the use of Bowden cable transmissions and remotely placed actuators. Specifically, we present a semi-active upper-limb exoskeleton for which we demonstrate human-device interaction control and bilateral teleoperation with force-feedback, evaluated via simulation, in the lab and over the internet. 
We also introduce a design for a future fully-active upper-limb exoskeleton with two contact force/torque sensors, for a dual-arm device, which features a novel 3-degrees-of-freedom exoskeleton shoulder design and a contact wrench mitigation controller, as demonstrated through simulation. With this work, we propose the essential technical steps towards a novel teleoperation system for elderly care." A Model-Based Analysis of the Effect of Repeated Unilateral Low Stiffness Perturbations on Human Gait: Toward Robot-Assisted Rehabilitation,"Vaughn Chambers, Panagiotis Artemiadis",University of Delaware,Rehabilitation and Augmentation II,"Human gait is quite complex, especially when considering the irregular and uncertain environments that humans are able to walk in. While unperturbed gait in a controlled environment is understood to a large degree, gait in more unique environments, such as asymmetric compliant terrain, is not understood to the same degree. In this study, we build upon a neuromuscular gait model and extend it to allow for walking on unilaterally compliant (soft) surfaces. This model is then compared to and verified by experimental human data. The model can successfully walk with step length trends similar to human data. Additionally, the model shows similar behaviors with respect to kinematics and muscle activity. We believe this work contributes significantly to a better understanding of the control of human gait and could lead to model-informed, patient-specific rehabilitation strategies that can advance the field of rehabilitation robotics, as well as the development of bio-inspired controllers for bipedal robots that would be able to traverse through dynamic and compliant terrains." Shared Control of Assistive Robots through User-Intent Prediction and Hyperdimensional Recall of Reactive Behavior,"Alisha Menon, Laura I. Galindez Olascoaga, Vamshi Balanaga, Anirudh Natarajan, Jennifer Ruffing, Ryan Ardalan, Jan M. 
Rabaey",University of California: Berkeley,Rehabilitation and Augmentation II,"There is increasing interest in shared control for assistive robotics with adaptable levels of supervised autonomy. In this work, we present a user-adaptive multi-layer shared control scheme for control of assistive devices. The system leverages the advantages of brain-inspired hyperdimensional computing (HDC) for classification & recall of reactive robotic behavior including high performance, computational efficiency and intelligent sensor fusion, to execute actuation based on the user’s goal while alleviating the burden of fine control. Using a multi-modal dataset of activities of daily living, we first recognize the user’s most recent behaviors, then predict the user’s next action based on their habitual action sequences, and finally, determine actuation through HDC recall-based shared control which intelligently deliberates between the predicted action and sensor feedback-based autonomy. In this work, we independently implement each layer to achieve >92% accuracy and then integrate all the layers and discuss the combined performance and methods to reduce accumulated error." Towards Predicting Fine Finger Motions from Ultrasound Images Via Kinematic Representation,"Dean Zadok, Oren Salzman, Alon Wolf, Alexander Bronstein",Technion,Rehabilitation and Augmentation II,"A central challenge in building robotic prostheses is the creation of a sensor-based system able to read physiological signals from the lower limb and instruct a robotic hand to perform various tasks. Existing systems typically perform discrete gestures such as pointing or grasping, by employing electromyography (EMG) or ultrasound (US) technologies to analyze muscle states. While estimating finger gestures has been done in the past by detecting prominent gestures, we are interested in detection, or inference, done in the context of fine motions that evolve over time. 
Examples include motions occurring when performing fine and dexterous tasks such as keyboard typing or piano playing. We consider this task as an important step towards higher adoption rates of robotic prostheses among arm amputees, as it has the potential to dramatically increase functionality in performing daily tasks. To this end, we present an end-to-end robotic system, which can successfully infer fine finger motions. This is achieved by modeling the hand as a robotic manipulator and using it as an intermediate representation to encode muscles' dynamics from a sequence of US images. We evaluated our method by collecting data from a group of subjects and demonstrating how it can be used to replay music played or text typed. To the best of our knowledge, this is the first study demonstrating these downstream tasks within an end-to-end system." Enabling Safe Walking Rehabilitation on the Exoskeleton Atalante: Experimental Results,"Maxime Brunet, Marine Pétriaux, Florent Di Meglio, Nicolas Petit","MINES Paristech,Wandercraft,MINES ParisTech, PSL Research University,MINES ParisTech, PSL",Rehabilitation and Augmentation II,"This paper exposes a control architecture enabling rehabilitation of walking impaired patients with the lower-limb exoskeleton Atalante. Atalante’s control system is modified to allow the patient to contribute to the walking motion through their efforts. Only the swing leg degree of freedom along the nominal path is relaxed. An online trajectory optimization checks that the muscle forces do not jeopardize stability. The optimization generates reference trajectories that satisfy several key constraints from the current point to the end of the step. One of the constraints requires that the center of pressure remains inside the support polygon, which ensures that the support leg subsystem successfully tracks the reference trajectory. 
As a result of the presented works, the robot provides a non-zero force in the direction of motion only when required, helping the patient go fast enough to maintain balance (or preventing him from going too fast). Experimental results are reported. They illustrate that variations of +/-50% of the duration of the step can be achieved in response to the patient's efforts and that many steps are achieved without falling." A Probabilistic Model of Activity Recognition with Loose Clothing,"Tianchen Shen, Irene Di Giulio, Matthew Howard",King's College London,Rehabilitation and Augmentation II,"Human activity recognition has become an attractive research area with the development of on-body wearable sensing technology. With comfortable electronic-textiles, sensors can be embedded into clothing so that it is possible to record human movement outside the laboratory for long periods. However, a long-standing issue is how to deal with motion artefacts introduced by movement of clothing with respect to the body. Surprisingly, recent empirical findings suggest that cloth-attached sensor can actually achieve higher accuracy of activity recognition than rigid-attached sensor, particularly when predicting from short time-windows. In this work, a probabilistic model is introduced in which this improved accuracy and responsiveness is explained by the increased statistical distance between movements recorded via fabric sensing. The predictions of the model are verified in simulated and real human motion capture experiments, where it is evident that this counterintuitive effect is closely captured." 
Real-Time Estimation of Walking Speed and Stride Length Using an IMU Embedded in a Robotic Hip Exoskeleton,Keehong Seo,"Samsung Research/Samsung Electronics Co., Ltd.",Rehabilitation and Augmentation II,"Gait parameters, including walking speed and stride length, are crucial indicators of health status and rehabilitation progress for individuals using wearable robots for exercise or rehabilitation. These metrics play a crucial role in monitoring progress and adjusting training programs, thereby fostering greater engagement in the training. In this paper, we present methods for estimating walking speed and stride length using sensors in wearable hip exoskeleton GEMS-H. Our study collected data from 79 middle-aged healthy individuals walking on a treadmill while wearing GEMS-H under various assistance conditions. To estimate walking speed, we evaluated linear regression models, deep neural networks, and ensemble models using different combinations of joint encoders and an IMU in the GEMS-H hip exoskeleton to form various sets of features. The ensemble of deep neural networks using only 6-DOF IMU signals as features achieved the lowest root-mean-square error (RMSE) for walking speed estimation, which was 0.066 m/s. We also present an algorithm for real-time stride length estimation, building on one of the speed estimation models. The speed and stride length estimation model was tested on 12 middle-aged healthy subjects walking in GEMS-H overground, yielding an RMSE of 0.060 m/s for speed and 7.1 cm for stride length." Adaptive Based Assist-As-Needed Control Strategy for Ankle Movement Assistance,"Rami Jradi, Hala Rifai, Yacine Amirat, Samer Mohammed","UPEC,University of Paris Est Créteil,University of Paris Est Créteil (UPEC),University of Paris Est Créteil - (UPEC)",Rehabilitation and Augmentation II,"Stroke affects a large number of people every year. One consequence is the weakness of ambulatory muscles resulting in a paretic gait. 
Actuated ankle foot orthoses can be a solution to assist paretic patients to dorsiflex and/or plantar flex their ankle joint during the gait phases. To assist the wearer following a predefined ankle joint desired trajectory, an adaptive active disturbance rejection controller is proposed in this study. The human muscular torque and estimation errors are estimated through a nonlinear disturbance observer based on the estimated model. This estimated torque is compensated within the proposed projection based adaptive controller combined with a saturated proportional derivative term. The purposes of using this controller are: i) no need for prior identification of the system’s parameters, owing to the adaptive structure, ii) the assistance-as-needed of the wearer through the rejection term and iii) the avoidance of the actuator saturation by including projection and saturation functions. This controller is tested in real time using an actuated ankle-foot-orthosis (AAFO) in lab environment with three healthy subjects to show its effectiveness." Anticipation and Delayed Estimation of Sagittal Plane Human Hip Moments Using Deep Learning and a Robotic Hip Exoskeleton,"Dean Molinaro, Ethan Park, Aaron Young","Georgia Institute of Technology,University of Illinois Urbana-Champaign,Georgia Tech",Rehabilitation and Augmentation II,"Estimating human joint moments using wearable sensors has utility for personalized health monitoring and generalized exoskeleton control. Data-driven models have potential to map wearable sensor data to human joint moments, even with a reduced sensor suite and without subject-specific calibration. In this study, we quantified the RMSE and R2 of a temporal convolutional network (TCN), trained to estimate human hip moments in the sagittal plane using exoskeleton sensor data (i.e., a hip encoder and thigh- and pelvis-mounted inertial measurement units). 
We conducted three analyses in which we iteratively retrained the network while: 1) varying the input sequence length of the model, 2) incorporating noncausal data into the input sequence, thus delaying the network estimates, and 3) time shifting the labels to train the model to anticipate (i.e., predict) human hip moments. We found that 930 ms of causal input data maintained model performance while minimizing input sequence length (validation RMSE and R2 of 0.141±0.014 Nm/kg and 0.883±0.025, respectively). Further, delaying the model estimate by up to 200 ms significantly improved model performance compared to the best causal estimators (p" Safety under Uncertainty: Tight Bounds with Risk-Aware Control Barrier Functions,"Mitchell Black, Georgios Fainekos, Bardh Hoxha, Danil Prokhorov, Dimitra Panagou","University of Michigan,Toyota NA-R&D,Southern Illinois University,Toyota Tech Center,University of Michigan, Ann Arbor",Safety and Trustworthy Robotics II,"We propose a novel class of risk-aware control barrier functions (RA-CBFs) for the control of stochastic safety-critical systems. Leveraging a result from the stochastic level-crossing literature, we deviate from the martingale theory that is currently used in stochastic CBF techniques and prove that a RA-CBF based control synthesis confers a tighter upper bound on the probability of the system becoming unsafe within a finite time interval than existing approaches. We highlight the advantages of our proposed approach over the state-of-the-art via a comparative study on a mobile-robot example, and further demonstrate its viability on an autonomous vehicle highway merging problem in dense traffic." 
Distributionally Robust RRT with Risk Allocation,"Kajsa Ekenberg, Venkatraman Renganathan, Bjorn Olofsson",Lund University,Safety and Trustworthy Robotics II,"An integration of distributionally robust risk allocation into sampling-based motion planning algorithms for robots operating in uncertain environments is proposed. We perform non-uniform risk allocation by decomposing the distributionally robust joint risk constraints defined over the entire planning horizon into individual risk constraints given the total risk budget. Specifically, the deterministic tightening defined using the individual risk constraints is leveraged to define our proposed exact risk allocation procedure. Embedding the risk allocation technique into sampling-based motion planning algorithms realises guaranteed conservative, yet increasingly more risk-feasible trajectories for efficient state-space exploration." Statistical Safety and Robustness Guarantees for Feedback Motion Planning of Unknown Underactuated Stochastic Systems,"Craig Knuth, Glen Chou, Jamie Reese, Joseph Moore","Johns Hopkins University Applied Physics Lab,University of Michigan,Johns Hopkins Applied Physics Lab",Safety and Trustworthy Robotics II,"We present a method for providing statistical guarantees on runtime safety and goal reachability for integrated planning and control of a class of systems with unknown nonlinear stochastic underactuated dynamics. Specifically, given a dynamics dataset, our method jointly learns a mean dynamics model, a spatially-varying disturbance bound that captures the effect of noise and model mismatch, and a feedback controller based on contraction theory that stabilizes the learned dynamics. We propose a sampling-based planner that uses the mean dynamics model and simultaneously bounds the closed-loop tracking error via a learned disturbance bound. 
We employ techniques from Extreme Value Theory (EVT) to estimate, to a specified level of confidence, several constants which characterize the learned components and govern the size of the tracking error bound. This ensures plans are guaranteed to be safely tracked at runtime. We validate that our guarantees translate to empirical safety in simulation on a 10D quadrotor, and in the real world on a physical CrazyFlie quadrotor and Clearpath Jackal robot, whereas baselines that ignore the model error and stochasticity are unsafe." A Sensitivity-Aware Motion Planner (SAMP) to Generate Intrinsically-Robust Trajectories,"Simon Wasiela, Paolo Robuffo Giordano, Juan Cortes, Thierry Simeon","LAAS-CNRS,IRISA CNRS UMR,,,,",Safety and Trustworthy Robotics II,"Closed-loop state sensitivity is a recently introduced notion that can be used to quantify deviations of the closed-loop trajectory of a robot/controller pair against variations of uncertain parameters in the robot model. While local optimization techniques can be used to generate reference trajectories minimizing a sensitivity-based cost, no global planning algorithm considering this metric to compute collision-free motions robust to parametric uncertainties has yet been proposed. The contribution of this paper is to propose a global control-aware motion planner for optimizing a state sensitivity metric and producing collision-free reference motions that are robust against parametric uncertainties for a large class of complex dynamical systems. Given the prohibitively high computational cost of directly minimizing the state sensitivity using asymptotically optimal sampling-based tree planners, the proposed RRT*-based SAMP planner uses an appropriate steering method to first compute a (near) time-optimal and kinodynamically feasible trajectory that is then locally deformed to improve robustness and decrease its sensitivity to uncertainties. 
The evaluation performed on planar/full-3D quadrotor UAV models shows that the SAMP method produces low sensitivity robust solutions with a much higher performance than a planner directly optimizing the sensitivity." Proficiency Self-Assessment without Breaking the Robot: Anomaly Detection Using Assumption-Alignment Tracking from Safe Experiments,"Xuan Cao, Jacob Crandall, Ethan Pedersen, Alvika Gautam, Michael A. Goodrich","Brigham Young University,Texas A & M University",Award Finalists 1,"Proficiency self-assessment (PSA), the ability to assess how well one can carry out a task, is a desirable capability of autonomous robot systems. Prior work has proposed assumption-alignment tracking (AAT) for performing PSA, and has shown that it can accurately predict robot performance in real-time given a dataset obtained from both normal and abnormal training runs. Obtaining data in abnormal conditions (i.e., conditions in which the robot is not prepared to operate) is difficult and is often not possible. As a result, many realistic datasets contain very few data points for abnormal conditions, making it difficult to apply AAT. This paper hypothesizes that a one-class classifier can be built to detect anomalies using only data collected under normal conditions. Two metrics, difference and separation, are proposed and used to demonstrate that AAT feature vectors from different running conditions tend to form distinct clusters that are identifiable by mainstream one-class classification algorithms. Thus, one-class classifiers trained on AAT feature vectors from normal data can detect anomalous conditions. Furthermore, preliminary results suggest that a few abnormal data points, if available, can be used to classify the abnormality type and, in turn, the degree to which the anomalies will likely impact robot performance. Empirical results from both a simulated navigation robot and a Sawyer robot manipulating blocks show the efficacy of the approach." 
Failure Detection for Motion Prediction of Autonomous Driving: An Uncertainty Perspective,"Wenbo Shao, Yanchao Xu, Liang Peng, Jun Li, Hong Wang","Tsinghua University,Beijing Institute of Technology",Safety and Trustworthy Robotics II,"Motion prediction is essential for safe and efficient autonomous driving. However, the inexplicability and uncertainty of complex artificial intelligence models may lead to unpredictable failures of the motion prediction module, which may mislead the system to make unsafe decisions. Therefore, it is necessary to develop methods to guarantee reliable autonomous driving, where failure detection is a potential direction. Uncertainty estimates can be used to quantify the degree of confidence a model has in its predictions and may be valuable for failure detection. We propose a framework of failure detection for motion prediction from the uncertainty perspective, considering both motion uncertainty and model uncertainty, and formulate various uncertainty scores according to different prediction stages. The proposed approach is evaluated based on different motion prediction algorithms, uncertainty estimation methods, uncertainty scores, etc., and the results show that uncertainty is promising for failure detection for motion prediction but should be used with caution." Analysing the Safety and Security of a UV-C Disinfection Robot,"Desiana Nurchalifah, Sebastian Blumenthal, Luigi Lo Iacono, Nico Hochgeschwender","Hochschule Bonn-Rhein-Sieg,Locomotec,Hochschule Bonn-Rhein-Sieg University of Applied Sciences,Bonn-Rhein-Sieg University",Safety and Trustworthy Robotics II,"Safety is paramount for robots used in environments they share with humans. In such scenarios, security is also growing in importance. However, conventional approaches to analysing safety requirements are aimed at identifying hazards only. 
Security-related aspects such as cyber threats, cyber attacks and vulnerabilities have hardly been integrated into analysis and design methods to date. The methods available so far for the joint analysis of safety and security are based on established methods of safety engineering, where the amount of information is very large and usually stored in text- and table-based documents. This makes it challenging for engineers to systematically assess and maintain safety and security information. Thus, adequate tool support for robot engineers is required to cope with the increased complexity and to manage the safety and security risks. In this paper, we demonstrate that robot's safety and security information can be expressed, stored, analysed and queried in a knowledge graph representation paving the way to automated analysis. More specifically, we apply an integrated, systems-oriented safety and security co-analysis approach, namely STPA-Safesec, to a robot performing disinfection tasks in domestic environments. By querying the resulting graph of safety and security artefacts, we automatically retrieve hazardous scenarios, identify gaps in the analysis and increase our understanding of the overall risks of the robot." Failure Detection and Fault Tolerant Control of a Jet-Powered Flying Humanoid Robot,"Gabriele Nava, Daniele Pucci","Istituto Italiano di Tecnologia,Italian Institute of Technology",Safety and Trustworthy Robotics II,"Failure detection and fault tolerant control are fundamental safety features of any aerial vehicle. With the emergence of complex, multi-body flying systems such as jet-powered humanoid robots, it becomes of crucial importance to design fault detection and control strategies for these systems, too. In this paper we propose a fault detection and control framework for the flying humanoid robot iRonCub in case of loss of one turbine. 
The framework is composed of a failure detector based on turbines rotational speed, a momentum-based flight control for fault response, and an offline reference generator that produces far-from-singularities configurations and accounts for self and jet exhausts collision avoidance. Simulation results with Gazebo and MATLAB prove the effectiveness of the proposed control strategy." Testing Rare Downstream Safety Violations Via Upstream Adaptive Sampling of Perception Error Models,"Craig Innes, Subramanian Ramamoorthy","University of Edinburgh,The University of Edinburgh",Safety and Trustworthy Robotics II,"Testing black-box perceptual-control systems in simulation has two difficulties. First, perceptual inputs in simulation lack the fidelity of real-world sensor inputs. Second, for a reasonably accurate perception system, encountering a rare failure trajectory may require running infeasibly many simulations. This paper combines perception error models---surrogates for a sensor-based detection system---with state-dependent adaptive importance sampling. This allows us to efficiently assess the rare failure probabilities for real-world perceptual control systems within simulation. Our experiments on an autonomous braking system with an RGB obstacle-detector show our method can calculate accurate failure probabilities with an inexpensive number of simulations. Further, we show how choice of safety metric can influence the process of learning proposal distributions capable of reliably sampling high-probability failures." Learning to Forecast Aleatoric and Epistemic Uncertainties Over Long Horizon Trajectories,"Aastha Acharya, Rebecca Russell, Nisar Ahmed","University of Colorado Boulder; Draper,Draper,University of Colorado Boulder",Safety and Trustworthy Robotics II,Giving autonomous agents the ability to forecast their own outcomes and uncertainty will allow them to communicate their competencies and be used more safely. 
We accomplish this by using a learned world model of the agent system to forecast full agent trajectories over long time horizons. Real world systems involve significant sources of both aleatoric and epistemic uncertainty that compound and interact over time in the trajectory forecasts. We develop a deep generative world model that quantifies aleatoric uncertainty while incorporating the effects of epistemic uncertainty during the learning process. We show on two reinforcement learning problems that our uncertainty model produces calibrated outcome uncertainty estimates over the full trajectory horizon. S∗: On Safe and Time Efficient Robot Motion Planning,"Riddhiman Laha, Wenxi Wu, Ruiai Sun, Nico Mansfeld, Luis Felipe Cruz Figueredo, Sami Haddadin","Technical University of Munich,Franka Emika GmbH,Technical University of Munich (TUM)",Safety and Trustworthy Robotics II,"As robots and humans increasingly share the same workspace, the development of safe motion plans becomes paramount. For real-world applications, nonetheless, it is critical that safety solutions are achieved without compromising performance. The computation of safe, time-efficient trajectories, however, usually requires rather complex often decoupled planning and optimization methods which degrades the nominal performance. In this work, instead, we cast the problem as a graph search-based scheme that enables us to solve the problem efficiently. The graph search is guided by an informed cost balance criterion. In this context we present the S∗ algorithm which minimizes the total planning time by equilibrising shortest time-efficient paths and paths with higher safe velocities. The approach is compatible with standards and validated both in rigorous simulation trials on a 6 DoF UR5 robot as well as real world experiments on a Franka Emika 7 DoF research robot." 
Online Update of Safety Assurances Using Confidence-Based Predictions,"Kensuke Nakamura, Somil Bansal","Princeton University,University of Southern California",Safety and Trustworthy Robotics II,"Robots such as autonomous vehicles and assistive manipulators are increasingly operating in dynamic environments and close physical proximity to people. In such scenarios, the robot can leverage a human motion predictor to predict their future states and plan safe and efficient trajectories. However, no model is ever perfect – when the observed human behavior deviates from the model predictions, the robot might plan unsafe maneuvers. Recent works have explored maintaining a confidence parameter in the human model to overcome this challenge, wherein the predicted human actions are tempered online based on the likelihood of the observed human action under the prediction model. This has opened up a new research challenge, i.e., how to compute the future human states online as the confidence parameter changes? In this work, we propose a Hamilton-Jacobi (HJ) reachability-based approach to overcome this challenge. Treating the confidence parameter as a virtual state in the system, we compute a parameter-conditioned forward reachable tube (FRT) that provides the future human states as a function of the confidence parameter. Online, as the confidence parameter changes, we can simply query the corresponding FRT, and use it to update the robot plan. Computing parameter-conditioned FRT corresponds to an (offline) high-dimensional reachability problem, which we solve by leveraging recent advances in data-driven reachability analysis. Overall, our framework enables an online maintenance and updates of safety assurances in human-robot interaction scenarios, even when the human prediction model is incorrect. We demonstrate our approach in several safety-critical autonomous driving scenarios, involving a state-of-the-art deep learning-based prediction model." 
Self-Supervised Point Cloud Understanding Via Mask Transformer and Contrastive Learning,"Di Wang, Zhi-Xin Yang",University of Macau,Deep Learning for Visual Perception,"Self-supervised point cloud understanding can pre-train the point cloud learning network on a large dataset, which helps boost fine-tuning performance on other smaller datasets in downstream tasks. Motivated to design an efficient self-supervised pre-training strategy and capture useful and discriminative representations of the 3D point cloud, we propose ContrastMPCT, a self-reconstruction scheme with the contrastive learning principle. Specifically, two contrastive loss functions are designed for 3D point clouds to maximize the dependence between the input tokens and output tokens of the encoder and fasten the convergence of the model. Extensive experiments show that our pre-training strategy of ContrastMPCT can effectively improve the fine-tuning performance on the downstream tasks, including object classification and part segmentation. Moreover, compared with both CNN-based and Transformer-based existing works, the superior results indicate the efficacy of the proposed method. The source code will be available at: https://github.com/wendydidi/ContrastMPCT.git" ================================================ FILE: README.md ================================================ # ICRA2023 Paper List This repo contains a list of all the papers being presented at ICRA2023, along with the session in which each paper is presented. A CSV file with abstracts is also available. There is also a Google Sheets version with abstracts [here](https://docs.google.com/spreadsheets/d/1kdHs53v0YS-RhM_NOU1hGIxqcZhudUPdHBBRac0ST3Y/edit?usp=sharing). Note: This list is likely imperfect; some papers may be missing or duplicated.
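Since the note above warns that entries may be missed or duplicated, a quick check of the CSV can help before relying on it. The following is a minimal sketch, not part of this repo: the function names `load_papers`, `duplicate_titles`, and `session_counts` are hypothetical, and the sample rows are made-up placeholders. It assumes only the column names from the CSV header (`Title`, `Authors`, `Organisation`, `Session`, `Abstract`).

```python
import csv
import io
from collections import Counter

# Column names come from the CSV header: Title, Authors, Organisation, Session, Abstract.
# The function names below are hypothetical helpers, not part of this repo.

def load_papers(lines):
    """Parse CSV text (an open file or any iterable of lines) into a list of dicts."""
    return list(csv.DictReader(lines))

def duplicate_titles(papers):
    """Return normalised titles that occur more than once (case and outer whitespace ignored)."""
    counts = Counter(p["Title"].strip().lower() for p in papers)
    return sorted(title for title, n in counts.items() if n > 1)

def session_counts(papers):
    """Count how many rows belong to each session."""
    return Counter(p["Session"].strip() for p in papers)

# Made-up placeholder rows, used only to demonstrate the functions.
SAMPLE = (
    "Title,Authors,Organisation,Session,Abstract\n"
    'Paper A,"Author One, Author Two",Org X,SLAM 1,First abstract.\n'
    "Paper B,Author Three,Org Y,Planning,Second abstract.\n"
    "paper a ,Author One,Org X,SLAM 1,First abstract again.\n"
)

papers = load_papers(io.StringIO(SAMPLE))
print(duplicate_titles(papers))
print(dict(session_counts(papers)))
```

To run this against the real file, pass an open handle to `load_papers`, for example `open("ImgRG-ICRA-2023-full-paper-list.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption. Matching on normalised titles only catches exact repeats, so near-duplicates with different wording would still need a manual look.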
# Paper List | Title | Authors | Organisation | Session | |---------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------| | Picking up Speed: Continuous-Time Lidar-Only Odometry Using Doppler Velocity Measurements | Yuchen Wu, David Juny Yoon, Keenan Burnett, Sören Kammel, Yi Chen, Heethesh Vhavle, Timothy Barfoot | University of Toronto,Aeva Inc,Aeva,Aeva, Inc | SLAM 1 | | Stein ICP for Uncertainty Estimation in Point Cloud Matching | Fahira Afzal Maken, Fabio Ramos, Lionel Ott | Data,,, CSIRO,University of Sydney, NVIDIA,ETH Zurich | SLAM 1 | | Direct and Sparse Deformable Tracking | Jose Lamarca, Juan Jose Gomez Rodriguez, Juan D. Tardos, Jose M M Montiel | Apple Inc.,Universidad de Zaragoza,I,A. 
Universidad de Zaragoza | SLAM 1 | | ASRO-DIO: Active Subspace Random Optimization Based Depth Inertial Odometry | Jiazhao Zhang, Yijie Tang, He Wang, Kai Xu | National University of Defense Technology,Peking University | SLAM 1 | | Discrete-Continuous Smoothing and Mapping | Kevin Doherty, Ziqi Lu, Kurran Singh, John Leonard | Massachusetts Institute of Technology,MIT | SLAM 1 | | Anderson Acceleration for On-Manifold Iterated Error State Kalman Filters | Xiang Gao, Tao Xiao, Chunge Bai, Dezhao Zhang, Fang Zhang | idriverplus.com,Beijing Idriverplus Technology Co. Ltd.,Tsinghua University,Beijing Idriverplus Technology Co., Ltd. | SLAM 1 | | Generalized LOAM: LiDAR Odometry Estimation with Trainable Local Geometric Features | Kohei Honda, Kenji Koide, Masashi Yokozuka, Shuji Oishi, Atsuhiko Banno | Nagoya University Graduate School,National Institute of Advanced Industrial Science and Technology,Nat. Inst. of Advanced Industrial Science and Technology,National Institute of Advanced Industrial Science and Technology (AIST),National Instisute of Advanced Industrial Science and Technology | SLAM 1 | | BoW3D: Bag of Words for Real-Time Loop Closing in 3D LiDAR SLAM | Yunge Cui, Xieyuanli Chen, Yinlong Zhang, Jiahua Dong, Qingxiao Wu, Feng Zhu | Shenyang Institute of Automation Chinese Academy of Sciences,National University of Defense Technology,Shenyang Institute of Automation, Chinese Academy of Sciences,Shenyang Institute of Automation,Chinese Academy of Scien | SLAM 1 | | Gaussian Mixture Midway-Merge for Object SLAM with Pose Ambiguity | Jae Hyung Jung, Chan Gook Park | Seoul National University | SLAM 1 | | Design and Characterization of a 3D-Printed Pneumatically-Driven Bistable Valve with Tunable Characteristics | Sihan Wang, Liang He, Perla Maiolino | University of Oxford | Soft Robot Applications | | Design of Fully Controllable and Continuous Programmable Surface Based on Machine Learning | Jue Wang, Jiaqi Suo, Alex Chortos | Purdue University,Gensler 
Baltimore,Purdue | Soft Robot Applications | | On the Use of Magnets to Robustify the Motion Control of Soft Hands | Sara Marullo, Gionata Salvietti, Domenico Prattichizzo | University of Siena | Soft Robot Applications | | Kinegami: Algorithmic Design of Compliant Kinematic Chains from Tubular Origami | Wei-Hsi Chen, Woohyeok Yang, Lucien Peach, Daniel Koditschek, Cynthia Sung | University of Pennsylvania | Soft Robot Applications | | Entrainment During Human Locomotion Using a Lightweight Soft Robotic Hip Exosuit (SR-HExo) | Lily C. Baye-wallace, Carly Thalman, Hyunglae Lee | Southwest Research Institute; Arizona State University,Arizona State University | Soft Robot Applications | | SOPHIE: SOft and Flexible Aerial Vehicle for PHysical Interaction with the Environment | Fernando Ruiz Vincueria, Begoña C. Arrue, Aníbal Ollero | UNIVERSIDAD DE SEVILLA,Universidad de Sevilla,University of Seville | Soft Robot Applications | | A Tensegrity-Based Inchworm-Like Robot for Crawling in Pipes with Varying Diameters | Yixiang Liu, Xiaolin Dai, Zhe Wang, Qing Bi, Rui Song, Jie Zhao, Yibin Li | Shandong University,Volvo Construction Equipment Technology (China) Co., Ltd,shandong university,Harbin Institute of Technology | Soft Robot Applications | | Untethered Robotic Millipede Driven by Low-Pressure Microfluidic Actuators for Multi-Terrain Exploration | Qi Shao, Xuguang Dong, Zhonghan Lin, Chao Tang, Hao Sun, Xin-Jun Liu, Huichan Zhao | Tsinghua University | Soft Robot Applications | | FEA-Based Soft Robotic Modeling: Simulating a Soft-Actuator in SOFA | Pasquale Ferrentino, Ellen Roels, Joost Brancart, Seppe Terryn, Guy Van Assche, Bram Vanderborght | Vrije Universiteit Brussels,Vrije Universiteit Brussel,Vrije Universiteit Brussel (VUB) | Soft Robot Applications | | Inflated Bendable Eversion Cantilever Mechanism with Inner Skeleton for Increased Stiffness | Tomoya Takahashi, Masahiro Watanabe, Kazuki Abe, Kenjiro Tadakuma, Naoto Saiki, Masashi Konyo, Satoshi Tadokoro | 
Tohoku University | Soft Robot Applications | | Energy-Based Design Optimization of a Miniature Wave-Like Robot Inside Curved Compliant Tubes | Rotem Katz, Dan Shachaf, David Zarrouk | Ben Gurion University of the Negev,BGU,Ben Gurion University | Design of Mechanisms | | A Palm-Sized Omnidirectional Mobile Robot Driven by 2-DOF Torus Wheels | Yunosuke Sato, Ayato Kanada, Tomoaki Mashimo | Toyohashi University of Technology,Kyushu University,Okayama University | Design of Mechanisms | | Flipper-Style Locomotion through Strong Expanding Modular Robots | Lillian Chin, Max Burns, Gregory Xie, Daniela Rus | Massachusetts Institute of Technology,MIT | Design of Mechanisms | | Simplified Configuration Design of Anthropomorphic Hand Imitating Specific Human Hand Grasps | Xinyang Tian, Qiang Zhan, Yin Zhang, Junyi Zou, Lingxiao Jiang, Qinhuan Xu | Beihang university,Beihang University | Design of Mechanisms | | Meta Reinforcement Learning for Optimal Design of Legged Robots | Alvaro Belmonte-baeza, Joonho Lee, Giorgio Valsecchi, Marco Hutter | University of Alicante,ETH Zurich Robotic Systems Laboratory,Robotic System Lab, ETH,ETH Zurich | Design of Mechanisms | | Advanced 2-DOF Counterbalance Mechanism Based on Gear Units and Springs to Minimize Required Torques of Robot Arm | Hwi-Su Kim, Jongwoo Park, Myeongsu Bae, Dongil Park, Chanhun Park, Hyunmin Do, Taeyong Choi, Doo-hyeong Kim, Jinho Kyung | Korea Institute of Machinery & Materials,Korea Institue of Machinery & Materials,Dyence tech,Korea Institute of Machinery and Materials (KIMM),KIMM,Korea Institute of Machinery and Materials,Korea Institute of Machinery & Materials (KIMM) | Design of Mechanisms | | Permanent-Magnetically Amplified Robotic Gripper with Less Clamping Width Influence on Compensation Realized by a Stepless Width Adjustment Mechanism | Tori Shimizu, Kenjiro Tadakuma, Masahiro Watanabe, Kazuki Abe, Masashi Konyo, Satoshi Tadokoro | Tohoku University | Design of Mechanisms | | Design of a New 
Bio-Inspired Dual-Axis Compliant Micromanipulator with Millimeter Strokes | Zekui Lyu, Qingsong Xu | University of Macau | Design of Mechanisms | | Optimal Elastic Wing for Flapping-Wing Robots through Passive Morphing | Cristina Ruiz Paez, Jose Angel Acosta, Aníbal Ollero | University of Seville | Design of Mechanisms | | Robust Multi-Robot Trajectory Optimization Using Alternating Direction Method of Multiplier | Ruiqi Ni, Zherong Pan, Xifeng Gao | Florida State University,Tencent America | Planning | | Autonomous Exploration in a Cluttered Environment for a Mobile Robot with 2D-Map Segmentation and Object Detection | Hyung Seok Kim, Hyeongjin Kim, Seon-il Lee, Hyeonbeom Lee | Kyungpook National University | Planning | | Distributionally Safe Path Planning: Wasserstein Safe RRT | Paul Lathrop, Beth Boardman, Sonia Martinez | University of California, San Diego,Los Alamos National Laboratory,UC San Diego | Planning | | Sim2Real Learning of Obstacle Avoidance for Robotic Manipulators in Uncertain Environments | Tan Zhang, Kefang Zhang, Jiatao Lin, Wing-yue Geoffrey Louie, Hui Huang | Shenzhen Techonology University,Shenzhen University,Oakland University | Planning | | Bidirectional Sampling-Based Motion Planning without Two-Point Boundary Value Solution | Sharan Nayak, Michael W. Otte | University of Maryland, College Park,University of Maryland | Planning | | Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly | Valentin Hartmann, Andreas Orthey, Danny Driess, Ozgur S. 
Oguz, Marc Toussaint | University of Stuttgart,TU Berlin,Bilkent University | Planning | | A Reachability-Based Spatio-Temporal Sampling Strategy for Kinodynamic Motion Planning | Yongxing Tang, Zhanxia Zhu, Hongwen Zhang | Northwestern Polytechnical University,Zhejiang Lab | Planning | | Efficient Speed Planning for Autonomous Driving in Dynamic Environment with Interaction Point Model | Yingbing Chen, Ren Xin, Jie Cheng, Qingwen Zhang, Xiaodong Mei, Ming Liu, Lujia Wang | The Hongkokng University of Science and Technology,the Hong Kong University of Science and Technology,Hong Kong University of Science and Technology,KTH Royal Institute of Technology,HKUST,The Hong Kong University of Technology | Planning | | Efficient Anytime CLF Reactive Planning System for a Bipedal Robot on Undulating Terrain | Bruce Jk Huang, J.W Grizzle | University of Michigan | Planning | | A Framework to Co-Optimize Robot Exploration and Task Planning in Unknown Environments | Yuanfan Xu, Zhaoliang Zhang, Yu Jincheng, Yuan Shen, Yu Wang | Tsinghua University | Planning | | Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA | Yuki Kadokawa, Yoshihisa Tsurumine, Takamitsu Matsubara | Nara Institute of Science and Technology | Reinforcement Learning | | Automating Reinforcement Learning with Example-Based Resets | Jigang Kim, J. Hyeon Park, Daesol Cho, H. 
Jin Kim | Seoul National University | Reinforcement Learning | | Improving the Robustness of Reinforcement Learning Policies with L1 Adaptive Control | Yikun Cheng, Pan Zhao, Fanxin Wang, Daniel Block, Naira Hovakimyan | University of Illinois at Urbana-Champaign,University of Illinois Urbana-Champaign,University of Illinois | Reinforcement Learning | | Developing Cooperative Policies for Multi-Stage Reinforcement Learning Tasks | Jordan Erskine, Christopher Lehnert | Queensland University of Technology | Reinforcement Learning | | Learning Performance Graphs from Demonstrations Via Task-Based Evaluations | Aniruddh Gopinath Puranic, Jyotirmoy Deshmukh, Stefanos Nikolaidis | University of Southern California,UNIVERSITY OF SOUTHERN CALIFORNIA | Reinforcement Learning | | Tumbling Robot Control Using Reinforcement Learning | Andrew Schwartzwald, Matthew Tlachac, Luis Guzman, Athanasios Bacharis, Nikos Papanikolopoulos | CSE, UMN,CSE, University of Minnesota,University of Minnesota | Reinforcement Learning | | Guided Reinforcement Learning – a Review and Evaluation for Efficient and Effective Real-World Robotics | Julian Eßer, Nicolas Bach, Christian Jestel, Oliver Urbann, Sören Kerner | Fraunhofer IML | Reinforcement Learning | | Robust Adaptive Ensemble Adversary Reinforcement Learning | Peng Zhai, Taixian Hou, Xiaopeng Ji, Zhiyan Dong, Lihua Zhang | Fudan University,FuDan University,Zhejiang University | Reinforcement Learning | | GIN: Graph-Based Interaction-Aware Constraint Policy Optimization for Autonomous Driving | Se-Wook Yoo, Chan Kim, Jinwoo Choi, Seong-woo Kim, Seung-Woo Seo | Seoul National University | Reinforcement Learning | | Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning | Nicolai Dorka, Tim Welschehold, Joschka Boedecker, Wolfram Burgard | University of Freiburg,Albert-Ludwigs-Universität Freiburg,University of Technology Nuremberg | Reinforcement Learning | | An Investigation on the Effect of Actuation Pattern on the Power 
Consumption of Legged Robots for Extraterrestrial Exploration | Yuan Hu, Weizhong Guo, Rongfu Lin | University of Shanghai for Science and Technology,Shanghai Jiao Tong University,ShangHai JiaoTong university | Marine and Field Robotics | | Intent Inference-Based Ship Collision Avoidance in Encounters with Rule-Violating Vessels | Yonghoon Cho, Jonghwi Kim, Jinwhan Kim | Agency for Defense Development,KAIST | Marine and Field Robotics | | Nezha-Mini: Design and Locomotion of a Miniature Low-Cost Hybrid Aerial Underwater Vehicle | Yuanbo Bi, Yufei Jin, Chenxin Lyu, Zheng Zeng, Lian Lian | Shanghai jiao tong University,Shanghai Jiao Tong University,Shanghai Jiaotong University | Marine and Field Robotics | | CPG-Based Motion Planning of Hybrid Underwater Hexapod Robot for Wall Climbing and Transition | Feiyu Ma, Weisheng Yan, Lepeng Chen, Rongxin Cui | Northwestern Polytechnical University | Marine and Field Robotics | | Improving Self-Consistency in Underwater Mapping through Laser-Based Loop Closure | Thomas Hitchcox, James Richard Forbes | McGill University | Marine and Field Robotics | | Passive Inverted Ultra-Short Baseline Positioning for a Disc-Shaped Autonomous Underwater Vehicle: Design and Field Experiments | Yingqiang Wang, Ruoyu Hu, S. H. 
Huang, Zhikun Wang, Peizhou Du, Wencheng Yang, Ying Chen | Zhejiang University,Zhejiang Univ.,China | Marine and Field Robotics | | The Robustness of Tether Friction in Non-Idealized Terrains | Justin Page, Laura Treers, Steven Jens Jorgensen, Ronald Fearing, Hannah Stuart | UC Berkeley Mechanical Engineering,University of California Berkeley,Apptronik,University of California at Berkeley,UC Berkeley | Marine and Field Robotics | | Reconfigurable Inflated Soft Arms | Nam Gyun Kim, Jee-Hwan Ryu | Korea Advanced Institute of Science and Technology | Soft Robots I | | A Soft Hybrid-Actuated Continuum Robot Based on Dual Origami Structures | Jian Tao, Qiqiang Hu, Tianzhi Luo, Erbao Dong | University of Science and Technology of China,City University of Hong Kong | Soft Robots I | | Direct and Inverse Modeling of Soft Robots by Learning a Condensed FEM Model | Etienne Ménager, Tanguy Navez, Olivier Goury, Christian Duriez | Univ. Lille, Inria, CNRS, Centrale Lille, UMR , CRIStAL,University of Lille - INRIA,Inria - Lille Nord Europe,INRIA | Soft Robots I | | Limit Cycle Generation with Pneumatically Driven Physical Reservoir Computing | Hiroaki Shinkawa, Toshihiro Kawase, Tetsuro Miyazaki, Takahiro Kanno, Maina Sogabe, Kenji Kawashima | The University of Tokyo,Tokyo Denki University,Riverfield Inc.,the University of Tokyo | Soft Robots I | | Toward Zero-Shot Sim-To-Real Transfer Learning for Pneumatic Soft Robot 3D Proprioception Sensing | Uksang Yoo, Hanwen Zhao, Alvaro Altamirano, Wenzhen Yuan, Chen Feng | Carnegie Mellon University,New York University | Soft Robots I | | Cross-Domain Transfer Learning and State Inference for Soft Robots Via a Semi-Supervised Sequential Variational Bayes Framework | Shageenderan Sapai, Junn Yong Loo, Ze Yang Ding, Chee Pin Tan, Raphael Phan, Vishnu Monn Baskaran, Surya G. 
Nurzaman | Monash University,Monash Malaysia,Monash University Malaysia | Soft Robots I | | Image-Based Pose Estimation and Shape Reconstruction for Robot Manipulators and Soft, Continuum Robots Via Differentiable Rendering | Jingpei Lu, Fei Liu, Cedric Girerd, Michael Yip | University of California San Diego,UCSD,University of California, San Diego | Soft Robots I | | Discrete-Time Model Based Control of Soft Manipulator with FBG Sensing | Enrico Franco, Ayhan Aktas, Shen Treratanakulchai, Arnau Garriga-casanovas, Abdulhamit Donder, Ferdinando Rodriguez Y Baena | Imperial College London,Imperial College,Imperial College, London, UK | Soft Robots I | | A Soft Robot with Three Dimensional Shape Sensing and Contact Recognition Multi-Modal Sensing Via Tunable Soft Optical Sensors | Max Mccandless, Frank Juliá Wise, Sheila Russo | Boston University | Soft and Flexible Sensors | | A Flexible 3D Force Sensor with Tunable Sensitivity | James J. Davies, Mai Thanh Thai, Trung Thien Hoang, Nguyen Chi Cong, Phuoc Thien Phan, Kefan Zhu, Dang Bao Nhi Tran, Van Ho, Hung La, Q P Ha, Nigel Lovell, Thanh Nho Do | University of New South Wales,UNSW Sydney,RMIT,Japan Advanced Institute of Science and Technology,University of Nevada at Reno,University of Technology Sydney | Soft and Flexible Sensors | | STEV: Stretchable Triboelectric E-Skin Enabled Proprioceptive Vibration Sensing for Soft Robot | Zihan Wang, Kai-chong Lei, Tang Huaze, Shoujie Li, Yuan Dai, Wenbo Ding, Xiao-Ping (Steven) Zhang | Tsinghua University,Tsinghua Shenzhen International Graduate School,Tencent,Ryerson University | Soft and Flexible Sensors | | Design and Development of a Hydrogel-Based Soft Sensor for Multi-Axis Force Control | Yichen Cai, David Hardman, Fumiya Iida, Thomas George Thuruthel | University of Cambridge,University College London | Soft and Flexible Sensors | | Design and Characterization of a Low Mechanical Loss, High-Resolution Wearable Strain Gauge | Addison Liu, Oluwaseun Adelowo Araromi, 
Conor James Walsh, Robert Wood | Harvard University,Harvard University Science and Engineering Building | Soft and Flexible Sensors | | Identifying Contact Distance Uncertainty in Whisker Sensing with Tapered, Flexible Whiskers | Teresa Kent, Hannah Emnett, Mahnoush Babaei, Mitra Hartmann, Sarah Bergbreiter | Carnegie Mellon University,Northwestern University,The University of Texas at Austin | Soft and Flexible Sensors | | Learning Decoupled Multi-Touch Force Estimation, Localization and Stretch for Soft Capacitive E-Skin | Abu Bakar Dawood, Claudio Coppola, Kaspar Althoefer | Queen Mary University of London | Soft and Flexible Sensors | | OptiGap: A Modular Optical Sensor System for Bend Localization | Jr. Bupe, Cindy Harnett | University of Louisville | Soft and Flexible Sensors | | A Silicone-Sponge-Based Variable-Stiffness Device | Tianqi Yue, Tsam Lung You, Hemma Philamore, Hermes Gadelha, Jonathan Rossiter | University of Bristol,Kyoto University,Department of engineering, University of Bristol, UK | Actuation | | Design and Control of a Tunable-Stiffness Coiled-Spring Actuator | Shivangi Misra, Mason Mitchell, Rongqian Chen, Cynthia Sung | University of Pennsylvania,Worcester Polytechnic Institute | Actuation | | Wirelessly-Controlled Untethered Piezoelectric Planar Soft Robot Capable of Bidirectional Crawling and Rotation | Zhiwu Zheng, Hsin Cheng, Prakhar Kumar, Sigurd Wagner, Minjie Chen, Naveen Verma, James C. 
Sturm | Princeton University | Actuation | | Origami Folding Enhances Modularity and Mechanical Efficiency of Soft Actuators | Zheng Wang, Yazhou Song, Zhongkui Wang, Hongying Zhang | National University of Singapore,Ritsumeikan University | Actuation | | Characterisation of Antagonistically Actuated, Stiffness-Controllable Joint-Link Units for Cobots | Wenlong Gaozhang, Jialei Shi, Yue Li, Agostino Stilli, Helge Wurdemann | University College London,King's College London | Actuation | | A Fluidic Actuator with an Internal Stiffening Structure Inspired by Mammalian Erectile Tissue | Jan Fras, Kaspar Althoefer | Queen Mary University of London | Actuation | | On Tendon Driven Continuum Robots with Compressible Backbones | Manu Srivastava, Ian Walker | Clemson University | Actuation | | FourStr: When Multi-Sensor Fusion Meets Semi-Supervised Learning | Bangquan Xie, Liang Yang, Zongming Yang, Ailin Wei, Xiaoxiong Weng, Bing Li | South China University of Technology,Apple Inc,Clemson University,Clemson University | Sensor Fusion I | | Combining Motion and Appearance for Robust Probabilistic Object Segmentation in Real Time | Vito Mengers, Aravind Battaje, Manuel Baum, Oliver Brock | Technische Universität Berlin,TU Berlin | Sensor Fusion I | | Event-Based Real-Time Moving Object Detection Based on IMU Ego-Motion Compensation | Chunhui Zhao, Yakun Li, Yang Lyu | Northwestern Polytechnical University | Sensor Fusion I | | Estimating the Motion of Drawers from Sound | Manuel Baum, Amelie Froessl, Aravind Battaje, Oliver Brock | TU Berlin,Technische Universitaet Berlin,Technische Universität Berlin | Sensor Fusion I | | Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents That See and Hear | Ruohan Gao, Hao Li, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Fei-Fei Li, Jiajun Wu | Stanford University,Google Inc | Sensor Fusion I | | LAPTNet-FPN: Multi-Scale LiDAR-Aided Projective Transform Network for Real Time Semantic Grid 
Prediction | Manuel Diaz Zapata, David Sierra Gonzalez, Ozgur Erkent, Christian Laugier, Jilles Dibangoye | Inria Grenoble,Inria Grenoble Rhône-Alpes,Hacettepe University,INRIA,Univ Lyon | Sensor Fusion I | | Collision-Aware In-Hand 6D Object Pose Estimation Using Multiple Vision-Based Tactile Sensors | Gabriele Mario Caddeo, Nicola Agostino Piga, Fabrizio Bottarel, Lorenzo Natale | Istituto Italiano di Tecnologia | Sensor Fusion I | | CalibDepth: Unifying Depth Map Representation for Iterative LiDAR-Camera Online Calibration | Jiangtong Zhu, Jianru Xue, Pu Zhang | Xi'an Jiaotong University | Sensor Fusion I | | Shape Visual Servoing of a Tether Cable from Parabolic Features | Lev Smolentsev, Alexandre Krupa, Francois Chaumette | INRIA Rennes - Bretagne Atlantique,Centre Inria de l'Université de Rennes,Inria center at University of Rennes | Visual Servoing | | Deep Metric Learning for Visual Servoing: When Pose and Image Meet in Latent Space | Samuel Felton, Elisa Fromont, Eric Marchand | Université de Rennes ,, IRISA,Université of Rennes ,-- IRISA/Inria rba,Univ Rennes, Inria, CNRS, IRISA | Visual Servoing | | CNN-Based Visual Servoing for Simultaneous Positioning and Flattening of Soft Fabric Parts | Fuyuki Tokuda, Akira Seino, Akinari Kobayashi, Kazuhiro Kosuge | Centre for Transformative Garment Production,Tohoku University,The University of Hong Kong | Visual Servoing | | Dynamical System-Based Imitation Learning for Visual Servoing Using the Large Projection Formulation | Antonio Paolillo, Paolo Robuffo Giordano, Matteo Saveriano | IDSIA USI-SUPSI,IRISA CNRS UMR,,,,,University of Trento | Visual Servoing | | Constant Distance and Orientation Following of an Unknown Surface with a Cable-Driven Parallel Robot | Thomas Rousseau, Nicolo Pedemonte, Stephane Caro, Francois Chaumette | Nantes Université, LS,N, IRT Jules Verne,IRT Jules Verne,CNRS/LS,N,Inria center at University of Rennes | Visual Servoing | | 3D Spectral Domain Registration-Based Visual Servoing | 
Komlan Adjigble, Brahim Tamadazte, Cristiana De Farias, Rustam Stolkin, Naresh Marturi | University of Birmingham,CNRS | Visual Servoing | | Autonomous Endoscope Control Algorithm with Visibility and Joint Limits Avoidance Constraints for Da Vinci Research Kit Robot | Rocco Moccia, Fanny Ficuciello | Università degli Studi di Napoli Federico II,Università di Napoli Federico II | Visual Servoing | | Safe Control Using Vision-Based Control Barrier Function (V-CBF) | Hossein Abdi, Golnaz Raja, Reza Ghabcheloo | Tampere University | Visual Servoing | | DC-MOT: Motion Deblurring and Compensation for Multi-Object Tracking in UAV Videos | Song Cheng, Meibao Yao, Xueming Xiao | Jilin University,Changchun University of Science and Technology | Visual Tracking | | Fast Event-Based Double Integral for Real-Time Robotics | Shijie Lin, Yinqiang Zhang, Dongyue Huang, Bin Zhou, Xiaowei Luo, Jia Pan | The University of Hong Kong,The Chinese University of Hong Kong,Beihang University,City University, HONG KONG,University of Hong Kong | Visual Tracking | | Continuous-Time Gaussian Process Motion-Compensation for Event-Vision Pattern Tracking with Distance Fields | Cedric Le Gentil, Ignacio Alzugaray, Teresa A. 
Vidal-Calleja | University of Technology Sydney,Imperial College London | Visual Tracking | | EXOT: Exit-Aware Object Tracker for Safe Robotic Manipulation of Moving Object | Hyunseo Kim, Hye Jung Yoon, Minji Kim, Dong-sig Han, Byoung-Tak Zhang | Seoul National University | Visual Tracking | | Mono-STAR: Mono-Camera Scene-Level Tracking and Reconstruction | Haonan Chang, Dhruv Metha Ramesh, Shijie Geng, Yuqiu Gan, Abdeslam Boularias | Rutgers University,Columbia University | Visual Tracking | | DFR-FastMOT: Detection Failure Resistant Tracker for Fast Multi-Object Tracking Based on Sensor Fusion | Mohamed Nagy, Majid Khonji, Jorge Dias, Sajid Javed | Khalifa University | Visual Tracking | | Fusion of Events and Frames Using 8-DOF Warping Model for Robust Feature Tracking | Min Seok Lee, Ye Jun Kim, Jae Hyung Jung, Chan Gook Park | Seoul National University,Hyundai motor group | Visual Tracking | | 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds | Jyoti Kini, Ajmal Mian, Mubarak Shah | University of Central Florida,University of Western Australia | Visual Tracking | | Inverse Reinforcement Learning Framework for Transferring Task Sequencing Policies from Humans to Robots in Manufacturing Applications | Omey Mohan Manyar, Zachary Mcnulty, Stefanos Nikolaidis, Satyandra K. 
Gupta | University of Southern California,UNIVERSITY OF SOUTHERN CALIFORNIA | Robot Learning | | Learning State Conditioned Linear Mappings for Low-Dimensional Control of Robotic Manipulators | Michael Przystupa, Kerrick Johnstonbaugh, Zichen(Vincent) Zhang, Laura Petrich, Masood Dehghan, Faezeh Haghverd, Martin Jagersand | University of Alberta,University of Alberta, Canada | Robot Learning | | Decoupling Skill Learning from Robotic Control for Generalizable Object Manipulation | Kai Lu, Bo Yang, Bing Wang, Andrew Markham | University of Oxford,The Hong Kong Polytechnic University,Oxford University | Robot Learning | | Comparison of Model-Based and Model-Free Reinforcement Learning for Real-World Dexterous Robotic Manipulation Tasks | David Patricio Valencia Redrovan, John Jia, Raymond Li, Alex Hayashi, Reuel Terezakis, Trevor Gee, Minas Liarokapis, Bruce Macdonald, Henry Williams | The University of Auckland,University of AUCKLAND,University of Auckland | Robot Learning | | Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control | Murad Elnagdi, Nils Dengler, Jorge De Heuvel, Maren Bennewitz | University of Bonn | Robot Learning | | Task-Driven Graph Attention for Hierarchical Relational Object Navigation | Michael Lingelbach, Chengshu Li, Minjune Hwang, Andrey Kurenkov, Alan Lou, Roberto Martín-martín, Ruohan Zhang, Fei-Fei Li, Jiajun Wu | Stanford University,University of Texas at Austin,Stanford University | Robot Learning | | Safety-Guaranteed Skill Discovery for Robot Manipulation Tasks | Sunin Kim, Jaewoon Kwon, Taeyoon Lee, Younghyo Park, Julien Perez | NAVER LABS,Naver labs,MIT,NAVER LABS EUROPE | Robot Learning | | A Framework for the Unsupervised Inference of Relations between Sensed Object Spatial Distributions and Robot Behaviors | Christopher Morse, Lu Feng, Matthew Dwyer, Sebastian Elbaum | University of Virginia | Robot Learning | | Learning Video-Conditioned Policies for Unseen Manipulation Tasks | Elliot Chane-sane, 
Cordelia Schmid, Ivan Laptev | Inria PARIS,Inria,INRIA | Robot Learning | | Learning Food Picking without Food: Fracture Anticipation by Breaking Reusable Fragile Objects | Rinto Yagawa, Reina Ishikawa, Masashi Hamaya, Kazutoshi Tanaka, Atsushi Hashimoto, Hideo Saito | Keio University,OMRON SINIC X Corporation,OMRON SINIC X | Robot Learning | | Learning Risk-Aware Costmaps Via Inverse Reinforcement Learning for Off-Road Navigation | Samuel Triest, Mateo Guaman Castro, Parv Maheshwari, Matthew Sivaprakasam, Wenshan Wang, Sebastian Scherer | Carnegie Mellon University,Indian Institute of Technology Kharagpur | Robot Learning | | How Does It Feel? Self-Supervised Costmap Learning for Off-Road Vehicle Traversability | Mateo Guaman Castro, Samuel Triest, Wenshan Wang, Jason M. Gregory, Felix Sanchez, John G. Rogers Iii, Sebastian Scherer | Carnegie Mellon University,US Army Research Laboratory,Booz Allen Hamilton | Robot Learning | | Global and Reactive Motion Generation with Geometric Fabric Command Sequences | Weiming Zhi, Iretiayo Akinola, Karl Van Wyk, Nathan Ratliff, Fabio Ramos | Carnegie Mellon University, University of Sydney,Columbia University,NVIDIA,University of Sydney, NVIDIA | Learning for Control I | | Enforcing the Consensus between Trajectory Optimization and Policy Learning for Precise Robot Control | Quentin Le Lidec, Wilson Jallet, Ivan Laptev, Cordelia Schmid, Justin Carpentier | INRIA-ENS-PSL,LAAS-CNRS,INRIA,Inria | Learning for Control I | | Neural Optimal Control Using Learned System Dynamics | Kazim Selim Engin, Volkan Isler | University of Minnesota | Learning for Control I | | Learned Risk Metric Maps for Kinodynamic Systems | Ross Allen, Wei Xiao, Daniela Rus | MIT Lincoln Laboratory,MIT | Learning for Control I | | Autonomous Drifting with 3 Minutes of Data Via Learned Tire Models | Franck Djeumou, Jonathan Goh, Ufuk Topcu, Avinash Balachandran | University of Texas at Austin,Toyota Research Institute,The University of Texas at Austin,Toyota 
Research Institute | Learning for Control I | | DDK: A Deep Koopman Approach for Longitudinal and Lateral Control of Autonomous Ground Vehicles | Yongqian Xiao, Xinglong Zhang, Xin Xu, Lu Yang, Junxiang Li | National University of Defense Technology,National university of defense technology | Learning for Control I | | Meta-Learning-Based Optimal Control for Soft Robotic Manipulators to Interact with Unknown Environments | Zhiqiang Tang, Peiyi Wang, Wenci Xin, Zhexin Xie, Longxin Kan, Muralidharan Mohanakrishnan, Cecilia Laschi | National University of Singapore,Beijing Jiaotong University | Learning for Control I | | Dealing with Sparse Rewards in Continuous Control Robotics Via Heavy-Tailed Policy Optimization | Souradip Chakraborty, Amrit Bedi, Kasun Weerakoon, Prithvi Poddar, Alec Koppel, Pratap Tokekar, Dinesh Manocha | UNIVERSITY OF MARYLAND,University of Maryland, College Park,IISER Bhopal,JP Morgan Chase,University of Maryland | Learning for Control I | | MPC with Sensor-Based Online Cost Adaptation | Avadesh Meduri, Huaijiang Zhu, Armand Jordana, Ludovic Righetti | New York University,NYU | Learning for Control I | | ReachLipBnB: A Branch-And-Bound Method for Reachability Analysis of Neural Network Autonomous Systems Using Lipschitz Bounds | Taha Entesari, Sina Sharifi, Mahyar Fazlyab | Johns Hopkins University | Learning for Control I | | Gradient-Based Trajectory Optimization with Learned Dynamics | Bhavya Sukhija, Nathanael Köhler, Miguel Zamora, Simon Zimmermann, Sebastian Curi, Stelian Coros, Andreas Krause | ETH Zürich,ETH Zurich | Learning for Control I | | RAMP-Net: A Robust Adaptive MPC for Quadrotors Via Physics-Informed Neural Network | Sourav Sanyal, Kaushik Roy | Purdue University | Learning for Control I | | 3-D Reconstruction Using Monocular Camera and Lights: Multi-View Photometric Stereo for Non-Stationary Robots | Monika Roznere, Philippos Mordohai, Ioannis Rekleitis, Alberto Quattrini Li | Dartmouth College,Stevens Institute of 
Technology,University of South Carolina | Marine Robotics I | | GMM Registration: A Probabilistic Scan Matching Approach for Sonar-Based AUV Navigation | Pau Vial, Miguel Malagón Pedrosa, Ricard Segura, Narcís Palomeras, Marc Carreras | Universitat de Girona ESQ,,,,,,,E,Universitat de Girona | Marine Robotics I | | Neural Implicit Surface Reconstruction Using Imaging Sonar | Mohamad Qadri, Michael Kaess, Ioannis Gkioulekas | Carnegie Mellon University | Marine Robotics I | | Conditional GANs for Sonar Image Filtering with Applications to Underwater Occupancy Mapping | Tianxiang Lin, Akshay Hinduja, Mohamad Qadri, Michael Kaess | Carnegie Mellon University | Marine Robotics I | | Stochastic Planning for ASV Navigation Using Satellite Images | Yizhou Huang, Hamza Dugmag, Florian Shkurti, Timothy Barfoot | University of Toronto | Marine Robotics I | | Autonomous Underwater Docking Using Flow State Estimation and Model Predictive Control | Rakesh Vivekanandan, Geoffrey Hollinger, Dongsik Chang | Oregon State University,Amazon | Marine Robotics I | | Real-Time Navigation for Autonomous Surface Vehicles in Ice-Covered Waters | Rodrigue De Schaetzen, Alexander Botros, Robert Gash, Kevin Murrant, Stephen L. 
Smith | University of Waterloo,National Research Council of Canada | Marine Robotics I | | Experiments in Underwater Feature Tracking with Performance Guarantees Using a Small AUV | Benjamin Adams Biggs, Hans He, James Mcmahon, Daniel Stilwell | Virginia Polytechnic Institute and State University,Virginia Tech,The Naval Research Laboratory | Marine Robotics I | | Robust Imaging Sonar-Based Place Recognition and Localization in Underwater Environments | Hogyun Kim, Kang Gilhwan, Seokhwan Jeong, Seungjun Ma, Younggun Cho | Inha University,Inha university | Marine Robotics I | | Deep Underwater Monocular Depth Estimation with Single-Beam Echosounder | Haowen Liu, Monika Roznere, Alberto Quattrini Li | Dartmouth College | Marine Robotics I | | Self-Supervised Monocular Depth Underwater | Shlomi Amitai, Itzik Klein, Tali Treibitz | University of Haifa | Marine Robotics I | | Performance Evaluation of 3D Keypoint Detectors and Descriptors on Coloured Point Clouds in Subsea Environments | Kyungmin Jung, Thomas Hitchcox, James Richard Forbes | McGill University | Marine Robotics I | | Puppeteer and Marionette: Learning Anticipatory Quadrupedal Locomotion Based on Interactions of a Central Pattern Generator and Supraspinal Drive | Milad Shafiee, Guillaume Bellegarda, Auke Ijspeert | EPFL | Biomimetic Systems | | A Performance Optimization Strategy Based on Improved NSGA-II for a Flexible Robotic Fish | Ben Lu, Jian Wang, Xiaocun Liao, Qianqian Zou, Min Tan, Chao Zhou | Institute of Automation, Chinese Academy of Sciences,Institution of Automation, Chinese Academy of sciences,Institute of Automation,Chinese Academy of Sciences,Chinese Academy of Sciences | Biomimetic Systems | | Swarm Robotics Search and Rescue: A Bee-Inspired Swarm Cooperation Approach without Information Exchange | Yue Li, Yan Gao, Sijie Yang, Quan Quan | Beihang University,School of Automation Science and Electrical Engineering, Beihang | Biomimetic Systems | | Achieving Extensive Trajectory Variation in 
Impulsive Robotic Systems | Luis Viornery, Chloe Goode, Gregory Sutton, Sarah Bergbreiter | Carnegie Mellon University,University of Lincoln | Biomimetic Systems | | Towards Safe Landing of Falling Quadruped Robots Using a 3-DoF Morphable Inertial Tail | Yunxi Tang, Jiajun An, Xiangyu Chu, Shengzhi Wang, Ching Yan Wong, Samuel Au | The Chinese University of Hong Kong | Biomimetic Systems | | Bioinspired Tearing Manipulation with a Robotic Fish | Stanley Wang, Juan Romero, Monica Li, Peter Wainwright, Hannah Stuart | University of California, Berkeley,UC Berkeley,University of California, Davis | Biomimetic Systems | | Learnable Tegotae-Based Feedback in CPGs with Sparse Observation Produces Efficient and Adaptive Locomotion | Christopher Herneth, Mitsuhiro Hayashibe, Dai Owaki | Technical University Munich,Tohoku University | Biomimetic Systems | | Multi-Segmented, Adaptive Feet for Versatile Legged Locomotion in Natural Terrain | Abhishek Chatterjee, An Mo, Bernadett Kiss, Emre Cemal Gonen, Alexander Badri-Spröwitz | Max Planck Institute for Intelligent Systems, Stuttgart,MPI IS Stuttgart,Max Planck Institute for Intelligent Systems | Biomimetic Systems | | Burst Stimulation for Enhanced Locomotion Control of Terrestrial Cyborg Insects | Huu Duoc Nguyen, Hirotaka Sato, Tat Thang Vo Doan | Nanyang Technological University,University of Freiburg | Biomimetic Systems | | Twisting Spine or Rigid Torso: Exploring Quadrupedal Morphology Via Trajectory Optimization | J. 
Diego Caporale, Zeyuan Feng, Shane Rozen-levy, Aja Carter, Daniel Koditschek | University of Pennsylvania | Biomimetic Systems | | Dynamic Locomotion of a Quadruped Robot with Active Spine Via Model Predictive Control | Wanyue Li, Zida Zhou, Hui Cheng | Sun Yat-sen University | Biomimetic Systems | | Scalable Task-Driven Robotic Swarm Control Via Collision Avoidance and Learning Mean-Field Control | Kai Cui, Mengguang Li, Christian Fabian, Heinz Koeppl | Technische Universität Darmstadt | Aerial Robotics I | | STD-Trees: Spatio-Temporal Deformable Trees for Multirotors Kinodynamic Planning | Hongkai Ye, Chao Xu, Fei Gao | Zhejiang University | Aerial Robotics I | | PredRecon: A Prediction-Boosted Planning Framework for Fast and High-Quality Autonomous Aerial Reconstruction | Chen Feng, Haojia Li, Fei Gao, Boyu Zhou, Shaojie Shen | The Hong Kong University of Science and Technology,Zhejiang University,Sun Yat-sen University,Hong Kong University of Science and Technology | Aerial Robotics I | | Vision-Aided UAV Navigation and Dynamic Obstacle Avoidance Using Gradient-Based B-Spline Trajectory Optimization | Zhefan Xu, Yumeng Xiu, Xiaoyang Zhan, Baihan Chen, Kenji Shimada | Carnegie Mellon University | Aerial Robotics I | | Multi-Agent Spatial Predictive Control with Application to Drone Flocking | Andreas Brandstätter, Scott Smolka, Scott Stoller, Ashish Tiwari, Radu Grosu | Technische Universität Wien,Stony Brook University,Microsoft Corp,TU Wien | Aerial Robotics I | | Multimodal Image Registration for GPS-Denied UAV Navigation Based on Disentangled Representations | Huandong Li, Zhunga Liu, Yanyi Lyu, Feiyan Wu | Northwestern Polytechnical University | Aerial Robotics I | | SEER: Safe Efficient Exploration for Aerial Robots Using Learning to Predict Information Gain | Yuezhan Tao, Yuwei Wu, Beiming Li, Fernando Cladera, Alex Zhou, Dinesh Thakur, Vijay Kumar | University of Pennsylvania | Aerial Robotics I | | Trajectory Planning for the Bidirectional Quadrotor As 
a Differentially Flat Hybrid System | Katherine Mao, Jake Welde, M. Ani Hsieh, Vijay Kumar | University of Pennsylvania | Aerial Robotics I | | Fisher Information Based Active Planning for Aerial Photogrammetry | Jaeyoung Lim, Nicholas Lawrance, Florian Achermann, Thomas Stastny, Rik Marian Kai Bähnemann, Roland Siegwart | ETH Zurich,CSIRO Data,,,ETH Zurich, ASL,Swiss Federal Institute of Technology (ETH Zurich),ETH Zürich | Aerial Robotics I | | Integrated Vector Field and Backstepping Control for Quadcopters | Arthur Henrique Dias Nunes, Guilherme Vianna Raffo, Luciano Pimenta | Universidade Federal de Minas Gerais | Aerial Robotics I | | Learning a Single Near-Hover Position Controller for Vastly Different Quadcopters | Dingqi Zhang, Antonio Loquercio, Xiangyu Wu, Ashish Kumar, Jitendra Malik, Mark Wilfried Mueller | University of California, Berkeley,UC Berkeley | Aerial Robotics I | | Forming and Controlling Hitches in Midair Using Aerial Robots | Diego Salazar-Dantonio, Subhrajit Bhattacharya, David Saldana | Lehigh University | Aerial Robotics I | | AirTrack: Onboard Deep Learning Framework for Long-Range Aircraft Detection and Tracking | Sourish Ghosh, Jay Patrikar, Brady Moon, Milad Moghassem Hamidi, Sebastian Scherer | Carnegie Mellon University | Aerial Robot Learning | | Towards a Reliable and Lightweight Onboard Fault Detection in Autonomous Unmanned Aerial Vehicles | Sai Srinadhu Katta, Eduardo Viegas | TII,Pontifícia Universidade Catolica do Paraná (PUCPR), Brazil | Aerial Robot Learning | | Variable Admittance Interaction Control of UAVs Via Deep Reinforcement Learning | Yuting Feng, Chuanbeibei Shi, Jianrui Du, Yushu Yu, Fuchun Sun, Yixu Song | Beijing Institute of Technology,University of Toronto,Tsinghua University,Tsinghua university | Aerial Robot Learning | | Learning Tethered Perching for Aerial Robots | Fabian Hauf, Başaran Bahadır Koçer, Hai-nguyen Nguyen, Oscar Kwong Fai Pang, Ronald Clark, Edward Johns, Mirko Kovac | Imperial College 
London,CNRS,University of Oxford | Aerial Robot Learning | | Credible Online Dynamics Learning for Hybrid UAVs | David Rohr, Nicholas Lawrance, Olov Andersson, Roland Siegwart | ETH Zurich,CSIRO Data,,,ETH Zürich | Aerial Robot Learning | | AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning | Xijun Wang, Ruiqi Xian, Tianrui Guan, Celso De Melo, Stephen Nogar, Aniket Bera, Dinesh Manocha | University of Maryland, College Park,University of Maryland-College Park,University of Maryland,CCDC US Army Research Laboratory,CCDC U.S. Army Research Laboratory,Purdue University | Aerial Robot Learning | | Follow the Rules: Online Signal Temporal Logic Tree Search for Guided Imitation Learning in Stochastic Domains | Jasmine Jerry Aloor, Jay Patrikar, Parv Kapoor, Jean Oh, Sebastian Scherer | Massachusetts Institute of Technology,Carnegie Mellon University | Aerial Robot Learning | | Continuity-Aware Latent Interframe Information Mining for Reliable UAV Tracking | Changhong Fu, Mutian Cai, Sihang Li, Kunhan Lu, Haobo Zuo, Chongjun Liu | Tongji University,Harbin Engineering University | Aerial Robot Learning | | Weighted Maximum Likelihood for Controller Tuning | Angel Romero, Shreedhar Govil, Gonca Yilmaz, Yunlong Song, Davide Scaramuzza | University of Zurich | Aerial Robot Learning | | User-Conditioned Neural Control Policies for Mobile Robotics | Leonard Bauersfeld, Elia Kaufmann, Davide Scaramuzza | University of Zurich (UZH),,University of Zurich | Aerial Robot Learning | | Training Efficient Controllers Via Analytic Policy Gradient | Nina Wiedemann, Valentin Wueest, Antonio Loquercio, Matthias Mueller, Dario Floreano, Davide Scaramuzza | Robotics and Perception Group, University of Zürich,EPFL,UC Berkeley,Intel,Ecole Polytechnique Federal, Lausanne,University of Zurich | Aerial Robot Learning | | Parallel Reinforcement Learning Simulation for Visual Quadrotor Navigation | Jack Saunders, Sajad Saeedi, Wenbin Li | University of Bath,Toronto 
Metropolitan University | Aerial Robot Learning | | Toward Efficient Physical and Algorithmic Design of Automated Garages | Teng Guo, Jingjin Yu | Rutgers University | Multi-Robot Systems I | | Chronos and CRS: Design of a Miniature Car-Like Robot and a Software Framework for Single and Multi-Agent Robotics and Control | Andrea Carron, Bodmer Sabrina, Lukas Vogel, René Zurbruegg, David Helm, Rahel Rickenbach, Simon Muntwiler, Jerome Sieber, Melanie N. Zeilinger | ETH Zurich,ETH Zürich | Multi-Robot Systems I | | Multi-Agent Path Integral Control for Interaction-Aware Motion Planning in Urban Canals | Lucas Michael Streichenberg, Elia Trevisan, Jen Jen Chung, Roland Siegwart, Javier Alonso-Mora | ETH Zurich,Delft University of Technology,The University of Queensland | Multi-Robot Systems I | | Mixed Observable RRT: Multi-Agent Mission-Planning in Partially Observable Environments | Kasper Johansson, Ugo Rosolia, Wyatt Ubellacker, Andrew Singletary, Aaron Ames | Stanford University,Caltech,California Institute of Technology | Multi-Robot Systems I | | RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments | Aakriti Agrawal, Amrit Bedi, Dinesh Manocha | University of Maryland, College Park,University of Maryland | Multi-Robot Systems I | | Hybrid SUSD-Based Task Allocation for Heterogeneous Multi-Robot Teams | Shengkang Chen, Tony Lin, Said Al-abri, Ronald Arkin, Fumin Zhang | Georgia Tech,Georgia Institute of Technology | Multi-Robot Systems I | | Search Algorithms for Multi-Agent Teamwise Cooperative Path Finding | Zhongqiang Ren, Chaoran Zhang, Sivakumar Rathinam, Howie Choset | Carnegie Mellon University,TAMU | Multi-Robot Systems I | | Collaborative Scheduling with Adaptation to Failure for Heterogeneous Robot Teams | Peng Gao, Sriram Siva, Anthony Micciche, Hao Zhang | University of Maryland, College Park,Colorado School of Mines,University of Massachusetts Amherst | Multi-Robot Systems I | | AMSwarm: 
An Alternating Minimization Approach for Safe Motion Planning of Quadrotor Swarms in Cluttered Environments | Vivek Kantilal Adajania, Siqi Zhou, Arun Singh, Angela P. Schoellig | University of Toronto,Technical University of Munich,University of Tartu,TU Munich | Multi-Robot Systems I | | Decentralized Deadlock-Free Trajectory Planning for Quadrotor Swarm in Obstacle-Rich Environments | Jungwon Park, Inkyu Jang, H. Jin Kim | Seoul National University | Multi-Robot Systems I | | A Negative Imaginary Theory-Based Time-Varying Group Formation Tracking Scheme for Multi-Robot Systems: Applications to Quadcopters | Yu-Hsiang Su, Parijat Bhowmick, Alexander Lanzon | The University of Manchester,Indian Institute of Technology Guwahati | Multi-Robot Systems I | | Data-Driven Risk-Sensitive Model Predictive Control for Safe Navigation in Multi-Robot Systems | Atharva Navsalkar, Ashish Hota | Indian Institute of Technology Kharagpur,Indian Institute of Technology (IIT) Kharagpur | Multi-Robot Systems I | | Multi-Modal Hierarchical Transformer for Occupancy Flow Field Prediction in Autonomous Driving | Haochen Liu, Zhiyu Huang, Chen Lv | Nanyang Technological University | Intelligent Transportation Systems I | | Annotating Covert Hazardous Driving Scenarios Online: Utilizing the Driver's Electroencephalography (EEG) Signals | Chen Zheng, Muxiao Zi, Wenjie Jiang, Mengdi Chu, Yan Zhang, Jirui Yuan, Guyue Zhou, Jiangtao Gong | Institute for AI Industry Research, Tsinghua University,Tsinghua University | Intelligent Transportation Systems I | | Pedestrian Crossing Action Recognition and Trajectory Prediction with 3D Human Keypoints | Jiachen Li, Xinwei Shi, Feiyu Chen, Jonathan Stroud, Zhishuai Zhang, Tian Lan, Junhua Mao, Jeonhyung Kang, Khaled Refaat, Weilong Yang, Eugene Ie, Congcong Li | Stanford University,Waymo LLC,Waymo,Google,Waymo Inc. 
| Intelligent Transportation Systems I | | Model-Agnostic Multi-Agent Perception Framework | Runsheng Xu, Weizhe Chen, Hao Xiang, Xia Xin, Lantao Liu, Jiaqi Ma | UCLA,Indiana University Bloomington,University of California, Los Angeles,Indiana University | Intelligent Transportation Systems I | | Explainable Action Prediction through Self-Supervision on Scene Graphs | Pawit Kochakarn, Daniele De Martini, Daniel Omeiza, Lars Kunze | University of Oxford | Intelligent Transportation Systems I | | CueCAn: Cue-Driven Contextual Attention for Identifying Missing Traffic Signs on Unconstrained Roads | Varun Gupta, Anbumani Subramanian, C.V. Jawahar, Rohit Saluja | IIIT, Hyderabad,Intel,IIIT Hyderabad | Intelligent Transportation Systems I | | Tackling Clutter in Radar Data - Label Generation and Detection Using PointNet++ | Johannes Kopp, Dominik Kellner, Aldi Piroli, Klaus Dietmayer | Ulm University, Germany,BMW AG,Universität Ulm,University of Ulm | Intelligent Transportation Systems I | | Effective Combination of Vertical, Longitudinal and Lateral Data for Vehicle Mass Estimation | Younesse EL MRHASLI, Bruno Monsuez, Xavier Mouton | ENSTA PARIS,ENSTA-ParisTech,Groupe Renault | Intelligent Transportation Systems I | | Receding Horizon Planning with Rule Hierarchies for Autonomous Vehicles | Sushant Veer, Karen Yan Ming Leung, Ryan Cosner, Yuxiao Chen, Peter Karkus, Marco Pavone | NVIDIA,Stanford University, NVIDIA Research, University of Washington,California Institute of Technology,Nvidia research,Stanford University | Intelligent Transportation Systems I | | Active Probing and Influencing Human Behaviors Via Autonomous Agents | Shuangge Wang, Yiwei Lyu, John Dolan | University of Southern California,Carnegie Mellon University | Intelligent Transportation Systems I | | TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction | Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc Van Gool | ETH Zurich,ETH Zürich | Intelligent 
Transportation Systems I | | SHAIL: Safety-Aware Hierarchical Adversarial Imitation Learning for Autonomous Driving in Urban Environments | Arec Jamgochian, Etienne Buehrle, Johannes Fischer, Mykel Kochenderfer | Stanford University,Karlsruhe Institute of Technology | Intelligent Transportation Systems I | | Reinforcement Learning-Based Optimal Multiple Waypoint Navigation | Christos Vlachos, Panagiotis Rousseas, Charalampos Bechlioulis, Kostas Kyriakopoulos | National Technical University of Athens,University of Patras,National Technical Univ. of Athens | Motion and Path Planning I | | DriveIRL: Drive in Real Life with Inverse Reinforcement Learning | Tung Phan-minh, Forbes Howington, Ting-sheng Chu, Momchil Tomov, Robert Beaudoin, Sang Uk Lee, Nanxiang Li, Caglayan Dicle, Samuel Findler, Francisco Suárez-Ruiz, Bo Yang, Sammy Omari, Eric Wolff | Motional AD,Motional,University of Michigan,Bosch Research and Technology Center,Senior Software Engineer at Motional,Nanyang Technological University,ETH Zurich,California Institute of Technology | Motion and Path Planning I | | LES: Locally Exploitative Sampling for Robot Path Planning | Sagar Joshi, Seth Hutchinson, Panagiotis Tsiotras | Aurora Innovation,Georgia Institute of Technology,Georgia Tech | Motion and Path Planning I | | Boundary Conditions in Geodesic Motion Planning for Manipulators | Mario Laux, Andreas Zell | University of Tübingen | Motion and Path Planning I | | TOFG: A Unified and Fine-Grained Environment Representation in Autonomous Driving | Zihao Wen, Yifan Zhang, Xinhong Chen, Jianping Wang | City University of Hong Kong | Motion and Path Planning I | | Unidirectional-Road-Network-Based Global Path Planning for Cleaning Robots in Semi-Structured Environments | Yong Li, Hui Cheng | Guangzhou Shiyuan Electronic Technology Co., Ltd,Sun Yat-sen University | Motion and Path Planning I | | A Hierarchical Decoupling Approach for Fast Temporal Logic Motion Planning | Ziyang Chen, Zhangli Zhou, Shaochen 
Wang, Zhen Kan | University of Science and Technology of China | Motion and Path Planning I | | A Fast Two-Stage Approach for Multi-Goal Path Planning in a Fruit Tree | Werner Kroneman, João Valente, Frank Van Der Stappen | University College Roosevelt,Wageningen University & Research,Utrecht University | Motion and Path Planning I | | Online Whole-Body Motion Planning for Quadrotor Using Multi-Resolution Search | Yunfan Ren, Siqi Liang, Fangcheng Zhu, Guozheng Lu, Fu Zhang | The University of Hong Kong,Harbin Institute of Technology, Shenzhen,University of Hong Kong | Motion and Path Planning I | | Intermittent Diffusion Based Path Planning for Heterogeneous Groups of Mobile Sensors in Cluttered Environments | Christina Frederick, Haomin Zhou, Frank Crosby | NJIT,Georgia Institute of Technology,USNWC PC | Motion and Path Planning I | | GANet: Goal Area Network for Motion Forecasting | Mingkun Wang, Xinge Zhu, Changqian Yu, Wei Li, Yuexin Ma, Ruochun Jin, Xiaoguang Ren, Dongchun Ren, Mingxu Wang, Wenjing Yang | Peking University,CUHK,Meituan,Inceptio,ShanghaiTech University,National University of Defense Technology,Academy of Military Sciences,Fudan University,State Key Laboratory of High Performance Computing (HPCL), Schoo | Motion and Path Planning I | | FlowMap: Path Generation for Automated Vehicles in Open Space Using Traffic Flow | Wenchao Ding, Jieru Zhao, Yubin Chu, Haihui Huang, Tong Qin, Chunjing Xu, Yuxiang Guan, Zhongxue Gan | Fudan University,Shanghai Jiao Tong University,Dalian University of Technology,Zhejiang University,Huawei Technology,Huawei Technologies | Motion and Path Planning I | | An Architecture for Reactive Mobile Manipulation On-The-Move | Ben Burgess-Limerick, Christopher Lehnert, Jurgen Leitner, Peter Corke | Queensland University of Technology,LYRO Robotics & Monash University | Reactive and Sensor-Based Planning | | Multi-Robot Mission Planning in Dynamic Semantic Environments | Samarth Kalluraya, George J. 
Pappas, Yiannis Kantaros | Washington University in St. Louis,University of Pennsylvania | Reactive and Sensor-Based Planning | | A System for Generalized 3D Multi-Object Search | Kaiyu Zheng, Anirudha Paul, Stefanie Tellex | Brown University,Brown | Reactive and Sensor-Based Planning | | A General Class of Combinatorial Filters That Can Be Minimized Efficiently | Yulin Zhang, Dylan Shell | Amazon,Texas A&M University | Reactive and Sensor-Based Planning | | Cautious Planning with Incremental Symbolic Perception: Designing Verified Reactive Driving Maneuvers | Disha Kamale, Sofie Haesaert, Cristian Ioan Vasile | Lehigh University,Eindhoven University of Technology | Reactive and Sensor-Based Planning | | Decision Diagrams As Plans: Answering Observation-Grounded Queries | Dylan Shell, Jason O'kane | Texas A&M University | Reactive and Sensor-Based Planning | | Obstacle Avoidance Using Raycasting and Riemannian Motion Policies at kHz Rates for MAVs | Michael Pantic, Isar Meijer, Rik Marian Kai Bähnemann, Nikhilesh Alatur, Olov Andersson, Cesar D. 
Cadena Lerma, Roland Siegwart, Lionel Ott | ETH Zürich,ETH Zurich | Reactive and Sensor-Based Planning | | Adaptive and Explainable Deployment of Navigation Skills Via Hierarchical Deep Reinforcement Learning | Kyowoon Lee, Seongun Kim, Jaesik Choi | Ulsan National Institute of Science and Technology,Korea Advanced Institute of Science and Technology | Reactive and Sensor-Based Planning | | Learning Agile Flight Maneuvers: Deep SE(3) Motion Planning and Control for Quadrotors | Yixiao Wang, Bingheng Wang, Shenning Zhang, Han Wei Sia, Lin Zhao | National University of Singapore,ST Engineering | Collision Avoidance | | Robust MADER: Decentralized and Asynchronous Multiagent Trajectory Planner Robust to Communication Delay | Kota Kondo, Jesus Tordesillas Torres, Reinaldo Figueroa, Juan Rached, Joseph Merkel, Parker Lusk, Jonathan Patrick How | Massachusetts Institute of Technology,MIT Aerospace Controls Lab | Collision Avoidance | | Obstacle Identification and Ellipsoidal Decomposition for Fast Motion Planning in Unknown Dynamic Environments | Mehmetcan Kaymaz, Nazim Ure | Istanbul Technical University | Collision Avoidance | | Safe Operations of an Aerial Swarm Via a Cobot Human Swarm Interface | Sydrak Abdi, Derek Paley | University of Maryland | Collision Avoidance | | MonoGraspNet: 6-DoF Grasping with a Single RGB Image | Guangyao Zhai, Dianye Huang, Shun-cheng Wu, Hyunjun Jung, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam | Technical University of Munich,Google,Technische Universität München,TU Munich | Perception for Grasping and Manipulation I | | USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable Manipulation | Zhengrong Xue, Zhecheng Yuan, Jiashun Wang, Xueqian Wang, Yang Gao, Huazhe Xu | Shanghai Jiao Tong University,Tsinghua University,Carnegie Mellon University,Center for Artificial Intelligence and Robotics, Graduate School | Perception for Grasping and Manipulation I | | Semantic Mapping with Confidence Scores 
through Metric Embeddings and Gaussian Process Classification | Jungseok Hong, Suveer Garg, Volkan Isler | University of Minnesota,University of Pennsylvania | Perception for Grasping and Manipulation I | | The Third Generation (G3) Dual-Modal and Dual Sensing Mechanisms (DMDSM) Pretouch Sensor for Robotic Grasping | Cheng Fang, Shuangliang Li, Di Wang, Fengzhi Guo, Dezhen Song, Jun Zou | Texas A&M University | Perception for Grasping and Manipulation I | | Learning Height for Top-Down Grasps with the DIGIT Sensor | Thais Bernardi, Yoann Fleytoux, Jean-Baptiste Mouret, Serena Ivaldi | Inria,INRIA | Perception for Grasping and Manipulation I | | Instance-Wise Grasp Synthesis for Robotic Grasping | Yucheng Xu, Mohammadreza Kasaei, Hamidreza Kasaei, Zhibin Li | University of Edinburgh,University of Groningen,University College London | Perception for Grasping and Manipulation I | | Joint Segmentation and Grasp Pose Detection with Multi-Modal Feature Fusion Network | Xiaozheng Liu, Yunzhou Zhang, He Cao, Shan Dexing, Jiaqi Zhao | Northeastern University | Perception for Grasping and Manipulation I | | GraspNeRF: Multiview-Based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF | Qiyu Dai, Yan Zhu, Yiran Geng, Ciyu Ruan, Jiazhao Zhang, He Wang | Peking University,National University of Defense Technology | Perception for Grasping and Manipulation I | | Elastic Context: Encoding Elasticity for Data-Driven Models of Textiles | Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael Welle, Alexander Kravberg, Yufei Wang, David Held, Zackory Erickson, Danica Kragic | KTH Royal Institute of Technology,Carnegie Mellon University,KTH | Perception for Grasping and Manipulation I | | Vision-Based Six-Dimensional Peg-In-Hole for Practical Connector Insertion | Kun Zhang, Chen Wang, Hua Chen, Jia Pan, Michael Y. 
Wang, Wei Zhang | Hong Kong University of Science and Technology,The University of Hong Kong,Southern University of Science and Technology,University of Hong Kong,Monash University | Perception for Grasping and Manipulation I | | RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control | Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen, Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield | University of Illinois Urbana-Champaign,NVIDIA Corporation,Nvidia,NVIDIA,Carnegie Mellon University,University of Illinois at Urbana-Champaign | Perception for Grasping and Manipulation I | | Multi-View Object Pose Estimation from Correspondence Distributions and Epipolar Geometry | Rasmus Haugaard, Thorbjørn Mosekjær Iversen | University of Southern Denmark,The Maersk Mc-Kinney Moller Institute, University of Southern De | Perception for Grasping and Manipulation I | | FSG-Net: A Deep Learning Model for Semantic Robot Grasping through Few-Shot Learning | Leonardo Barcellona, Alberto Bacchin, Alberto Gottardi, Emanuele Menegatti, Stefano Ghidoni | University of Padova,University of Padua,The University of Padua | Learning for Grasping and Manipulation I | | Learning Pre-Grasp Manipulation of Flat Objects in Cluttered Environments Using Sliding Primitives | Jiaxi Wu, Haoran Wu, Shanlin Zhong, Quqin Sun, Yinlin Li | Peking University,University of Science and Technology of China,Institute of Automation, Chinese Academy of Sciences,Wuhan Second.Ship Design.and Research Institute | Learning for Grasping and Manipulation I | | Learning Category-Level Manipulation Tasks from Point Clouds with Dynamic Graph CNNs | Junchi Liang, Abdeslam Boularias | Rutgers University | Learning for Grasping and Manipulation I | | Neural Grasp Distance Fields for Robot Manipulation | Thomas Weng, David Held, Franziska Meier, Mustafa Mukadam | Carnegie Mellon University,Facebook,Facebook AI Research | Learning for Grasping and Manipulation 
I | | Planning for Multi-Object Manipulation with Graph Neural Network Relational Classifiers | Yixuan Huang, Adam Conkey, Tucker Hermans | University of Utah | Learning for Grasping and Manipulation I | | Local Neural Descriptor Fields: Locally Conditioned Object Representations for Manipulation | Ethan Chun, Yilun Du, Anthony Simeonov, Tomas Lozano-Perez, Leslie Kaelbling | Massachusetts Institute of Technology,MIT | Learning for Grasping and Manipulation I | | Practical Visual Deep Imitation Learning Via Task-Level Domain Consistency | Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang | Google X,UC Berkeley,Everyday Robots,X, The Moonshot Factory,Halodi Robotics | Learning for Grasping and Manipulation I | | SEIL: Simulation-Augmented Equivariant Imitation Learning | Mingxi Jia, Dian Wang, Guanang Su, David Klee, Xupeng Zhu, Robin Walters, Robert Platt | Northeastern University | Learning for Grasping and Manipulation I | | Dextrous Tactile In-Hand Manipulation Using a Modular Reinforcement Learning Architecture | Johannes Pitz, Lennart Röstel, Leon Sievers, Berthold Bäuml | German Aerospace Center,German Aerospace Center (DLR) | Learning for Grasping and Manipulation I | | Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation | Mengxi Li, Rika Antonova, Dorsa Sadigh, Jeannette Bohg | Stanford University | Learning for Grasping and Manipulation I | | CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation | Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Adam Fishman, Dieter Fox | Nvidia Corporation,NVIDIA,University of Washington | Learning for Grasping and Manipulation I | | NIFT: Neural Interaction Field and Template for Object Manipulation | Zeyu Huang, Juzhan Xu, Sisi Dai, Kai Xu, Hao Zhang, Hui Huang, Ruizhen Hu | Shenzhen University,National University of Defense Technology,Simon Fraser 
University | Learning for Grasping and Manipulation I | | Place Recognition under Occlusion and Changing Appearance Via Disentangled Representations | Yue Chen, Xingyu Chen, Yicen Li | Xi'an Jiaotong University,Laboratory of Visual Cognitive Computing and Intelligent Vehicle,McMaster University | Localization I | | GIDP: Learning a Good Initialization and Inducing Descriptor Post-Enhancing for Large-Scale Place Recognition | Zhaoxin Fan, Zhenbo Song, Jun He, Hongyan Liu | Renmin University of China,Nanjing University of Science and Technology,Tsinghua University | Localization I | | STD: Stable Triangle Descriptor for 3D Place Recognition | Yuan Chongjian, Jiarong Lin, Zuhao Zou, Xiaoping Hong, Fu Zhang | The University of Hong Kong,HongKong University,Southern University of Science and Technology,University of Hong Kong | Localization I | | DeepRING: Learning Roto-Translation Invariant Representation for LiDAR Based Place Recognition | Sha Lu, Xuecheng Xu, Li Tang, Rong Xiong, Yue Wang | Zhejiang University | Localization I | | Sensor Localization by Few Distance Measurements Via the Intersection of Implicit Manifolds | Michael Moshe Bilevich, Steven M Lavalle, Dan Halperin | Tel Aviv University,University of Oulu | Localization I | | Boosting Performance of a Baseline Visual Place Recognition Technique by Predicting the Maximally Complementary Technique | Connor Malone, Stephen Hausler, Tobias Fischer, Michael J Milford | Queensland University of Technology,CSIRO | Localization I | | Loosely-Coupled Localization Fusion System Based on Track-To-Track Fusion with Bias Alignment | Soyeong Kim, Kichun Jo, Benazouz Bradai, Paulo Resende, Jaeyoung Jo | Konkuk University,Valeo,Konkuk university, Smart vehicle engineering | Localization I | | Portable Multi-Hypothesis Monte Carlo Localization for Mobile Robots | Alberto García, Francisco Martin Rico, Jose Miguel Guerrero, Francisco Javier Rodríguez Lera, Vicente Matellan | Universidad Rey Juan Carlos,Carnegie Mellon 
University,Rey Juan Carlos University,Universidad de León,Universidad de Leon | Localization I | | CPnP: Consistent Pose Estimator for Perspective-N-Point Problem with Bias Elimination | Guangyang Zeng, Shiyu Chen, Biqiang Mu, Guodong Shi, Junfeng Wu | The Chinese University of Hong Kong, Shenzhen,Chinese Academy of Sciences,The University of Sydney | Localization I | | LiDAR-Based Indoor Localization with Optimal Particle Filters Using Surface Normal Constraints | Heruka Andradi, Sebastian Blumenthal, Erwin Prassler, Paul G. Plöger | Hochschule Bonn Rhein Sieg,Locomotec,Bonn-Rhein-Sieg Univ. of Applied Sciences | Localization I | | Efficient Planar Pose Estimation Via UWB Measurements | Haodong Jiang, Wentao Wang, Yuan Shen, Xinghan Li, Xiaoqiang Ren, Biqiang Mu, Junfeng Wu | The Chinese University of Hong Kong, Shenzhen,Zhejiang University,Nanjing University of Science and Technology,Zhejiang university,Shanghai University,Chinese Academy of Sciences | Localization I | | Visual Pitch and Roll Estimation for Inland Water Vessels | Dennis Griesser, Georg Umlauf, Matthias Franz | University of Applied Sciences Konstanz, Institute for Optical S | Vision-Based Navigation I | | GPF-BG: A Hierarchical Vision-Based Planning Framework for Safe Quadrupedal Navigation | Shiyu Feng, Ziyi Zhou, Justin Smith, Maxwell Asselmeier, Ye Zhao, Patricio A. 
Vela | Georgia Institute of Technology | Vision-Based Navigation I | | Direct Angular Rate Estimation without Event Motion-Compensation at High Angular Rates | Matthew Ng, Xinyu Cai, Shaohui Foong | Singapore University of Technology and Design | Vision-Based Navigation I | | StereoVAE: A Lightweight Stereo-Matching System Using Embedded GPUs | Qiong Chang, Li Xiang, Xu Xin, Xin Liu, Yun Li, Jun Miyazaki | Tokyo Institute of Technology,NanJing University,National Institute of Advanced Industrial Science and Technology,Tokyo Institute of Technology School of Computing | Vision-Based Navigation I | | Learning Perception-Aware Agile Flight in Cluttered Environments | Yunlong Song, Kexin Shi, Robert Pěnička, Davide Scaramuzza | University of Zurich,Universität Zürich,Czech Technical University in Prague | Vision-Based Navigation I | | NanoFlowNet: Real-Time Dense Optical Flow on a Nano Quadcopter | Rik Jan Bouwmeester, Federico Paredes-valles, Guido De Croon | Delft University of Technology,TU Delft | Vision-Based Navigation I | | Zero-Shot Active Visual Search (ZAVIS): Intelligent Object Search for Robotic Assistants | Jeongeun Park, Taerim Yoon, Jejoon Hong, Youngjae Yu, Matthew Pan, Sungjoon Choi | Korea University,Yonsei University,Queen's University | Vision-Based Navigation I | | Memory-Based Exploration-Value Evaluation Model for Visual Navigation | Yongquan Feng, Liyang Xu, Minglong Li, Ruochun Jin, Da Huang, Shaowu Yang, Wenjing Yang | National University of Defense Technology,NUDT,the State Key Laboratory of High Performance Computing (HPCL) &,State Key Laboratory of High Performance Computing (HPCL), Schoo | Vision-Based Navigation I | | ViNL: Visual Navigation and Locomotion Over Obstacles | Simar Kareer, Naoki Yokoyama, Dhruv Batra, Sehoon Ha, Joanne Truong | Georgia Tech,Georgia Institute of Technology,Georgia Tech / Facebook AI Research,The Georgia Institute of Technology | Vision-Based Navigation I | | Zero-Shot Object Goal Visual Navigation | Qianfan 
Zhao, Lu Zhang, Bin He, Hong Qiao, Zhiyong Liu | State Key Laboratory of Management and Control for Complex Syste,Institute of Automation, Chinese Academy of Science,Tongji University,Institute of Automation, Chinese Academy of Sciences,Institute of Automation Chinese Academy of Sciences | Vision-Based Navigation I | | Monocular Simultaneous Localization and Mapping Using Ground Textures | Kyle Hart, Brendan Englot, Ryan O'shea, John Kelly, David Martinez | Stevens Institute of Technology,Naval Air Warfare Center Aircraft Division,RISE Laboratory at Naval Air Warfare Center,Pennsylvania State University | Vision-Based Navigation I | | WAVN: Wide Area Visual Navigation for Large-Scale, GPS-Denied Environments | Damian Lyons, Mohamed Rahouti | Fordham University | Vision-Based Navigation I | | ORORA: Outlier-Robust Radar Odometry | Hyungtae Lim, Kawon Han, Gunhee Shin, Giseop Kim, Songcheol Hong, Hyun Myung | Korea Advanced Institute of Science and Technology,Inha University,NAVER LABS,KAIST (Korea Advanced Institute of Science and Technology) | Localization and Mapping I | | AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion | Yu Chen, Zihao Yu, Shu Song, Jianming Li, Tianning Yu, Gim Hee Lee | National University of Singapore,Beihang University,Nreal,Segway Ninebot,Willand Company | Localization and Mapping I | | Robust Map Fusion with Visual Attention Utilizing Multi-Agent Rendezvous | Jaein Kim, Dong-sig Han, Byoung-Tak Zhang | Seoul National University | Localization and Mapping I | | Wi-Closure: Reliable and Efficient Search of Inter-Robot Loop Closures Using Wireless Sensing | Weiying Wang, Anne Kemmeren, Daniel Son, Javier Alonso-Mora, Stephanie Gil | Harvard University,Delft University,Delft University of Technology | Localization and Mapping I | | COVINS-G: A Generic Back-End for Collaborative Visual-Inertial SLAM | Manthan Patel, Marco Karrer, Philipp Baenninger, Margarita Chli | ETH Zurich | Localization and Mapping I | | 
PIEKF-VIWO: Visual-Inertial-Wheel Odometry Using Partial Invariant Extended Kalman Filter | Tong Hua, Tao Li, Ling Pei | Shanghai Jiao Tong University | Localization and Mapping I | | Observability-Aware Active Extrinsic Calibration of Multiple Sensors | Shida Xu, Jonatan Scharff Willners, Ziyang Hong, Kaicheng Zhang, Y. R. Petillot, Sen Wang | Imperial College London,Heriot-Watt University | Localization and Mapping I | | Learning Continuous Control Policies for Information-Theoretic Active Perception | Pengzhi Yang, Yuhan Liu, Shumon Koga, Arash Asgharivaskasi, Nikolay A. Atanasov | University of Electronic Science and Technology of China,University of California, San Diego,University of California San Diego | Localization and Mapping I | | Structure PLP-SLAM: Efficient Sparse Mapping and Localization Using Point, Line and Plane for Monocular, RGB-D and Stereo Cameras | Fangwen Shu, Jiaxuan Wang, Alain Pagani, Stricker Didier | DFKI,German Research Center for Artificial Intelligence | Localization and Mapping I | | Rotation Synchronization Via Deep Matrix Factorization | Tejus Gk, Giacomo Zara, Paolo Rota, Andrea Fusiello, Elisa Ricci, Federica Arrigoni | Indian Institute of Technology (ISM) Dhanbad,University of Trento,University of Udine,Politecnico di Milano | Localization and Mapping I | | Object-Based SLAM Utilizing Unambiguous Pose Parameters Considering General Symmetry Types | Taekbeom Lee, Youngseok Jang, H. 
Jin Kim | Seoul National University | Localization and Mapping I | | Towards View-Invariant and Accurate Loop Detection Based on Scene Graph | Chuhao Liu, Shaojie Shen | Hong Kong University of Science and Technology | Localization and Mapping I | | ViViD++: Vision for Visibility Dataset | Alex Lee, Younggun Cho, Young-Sik Shin, Ayoung Kim, Hyun Myung | Hyundai Motor Company,Inha University,KIMM,Seoul National University,KAIST (Korea Advanced Institute of Science and Technology) | SLAM 2 | | CamMap: Extrinsic Calibration of Non-Overlapping Cameras Based on SLAM Map Alignment | Jie Xu, Ruifeng Li, Lijun Zhao, Wenlu Yu, Zhiheng Liu, Bo Zhang, Yuchen Li | Harbin Institute of Technology,harbin institute of technology | SLAM 2 | | Hybrid Visual SLAM for Underwater Vehicle Manipulator Systems | Gideon Billings, Richard Camilli, Matthew Johnson-Roberson | University of Sydney, Australian Center for Field Robotics,Woods Hole Oceanographic Institution,University of Michigan | SLAM 2 | | WOLF: A Modular Estimation Framework for Robotics Based on Factor Graphs | Joan Solà, Joan Vallvé, Joaquim Casals, Jeremie Deray, Mederic Fourmy, Dinesh Atchuthan, Andreu Corominas-murtra, Juan Andrade-Cetto | Institut de Robòtica i Informàtica Industrial,CSIC-UPC,Institut de Robòtica i Informàtica Industrial, CSIC-UPC,LAAS, CNRS,EasyMile,Beta Robots SL | SLAM 2 | | Point Cloud Change Detection with Stereo V-SLAM: Dataset, Metrics and Baseline | Zihan Lin, Yu Jincheng, Lipu Zhou, Xudong Zhang, Jian Wang, Yu Wang | Tsinghua University,MeiTuan,Tsinghua Univ. 
| SLAM 2 | | Hilti-Oxford Dataset: A Millimeter-Accurate Benchmark for Simultaneous Localization and Mapping | Lintong Zhang, Michael Helmberger, Lanke Frank Tarimo Fu, David Wisth, Marco Camurri, Davide Scaramuzza, Maurice Fallon | University of Oxford,HILTI AG,Free University of Bozen-Bolzano,University of Zurich | SLAM 2 | | Long-Term Visual SLAM with Bayesian Persistence Filter Based Global Map Prediction | Tianchen Deng, Hongle Xie, Jingchuan Wang, Weidong Chen | Shanghai Jiao Tong University | SLAM 2 | | Wheel-SLAM: Simultaneous Localization and Terrain Mapping Using One Wheel-Mounted IMU | Yibin Wu, Jian Kuang, Xiaoji Niu, Jens Behley, Lasse Klingbeil, Heiner Kuhlmann | University of Bonn,Wuhan University | SLAM 2 | | Maplab 2.0 - a Modular and Multi-Modal Mapping Framework | Andrei Cramariuc, Lukas Bernreiter, Florian Tschopp, Marius Fehr, Victor Reijgwart, Juan Nieto, Roland Siegwart, Cesar D. Cadena Lerma | ETHZ,ETH Zurich, Autonomous Systems Lab,Arrival Ltd,Voliro AG,ETH Zurich,Microsoft | SLAM 2 | | Simulation Data Driven Design Optimization for Reconfigurable Soft Gripper System | Jun LIU, Jin Huat Low, Qian Qian Han, Marisa Lim, Dingjie Lu, Yangfan Li, Chen-Hua Yeow, Zhuangjian Liu | IHPC, A*STAR,National University of Singapore,IHPC, ASTAR,Institute of High Performance Computing, A*Star,INSTITUTE OF HIGH PERFORMANCE COMPUTING | Modeling, Control, and Learning for Soft Robots | | Research on Design and Experiment of a Wearable Hand Rehabilitation Device Driven by Fiber-Reinforced Soft Actuator | Kaiwei Ma, Zhenjiang Jiang, Shuang Gao, Guoping Jiang, Fengyu Xu | Nanjing University of Posts and Telecommunications,southeast university | Modeling, Control, and Learning for Soft Robots | | DNN-Based Predictive Model for a Batoid-Inspired Soft Robot | Guangtong Li, Thileepan Stalin, Truong Van Tien, Pablo Valdivia | Singapore University of Technology and Design,Singapore University of Technology and Design, MIT | Modeling, Control, and Learning for Soft 
Robots | | Modeling the Locomotion of Articulated Soft Robots in Granular Medium | Yayun Du, Jacqueline Lam, Karunesh Sachanandani, Mohammad Khalid Jawed | University of California, Los Angeles,UCLA | Modeling, Control, and Learning for Soft Robots | | SoRoSim: A MATLAB Toolbox for Hybrid Rigid-Soft Robots Based on the Geometric Variable-Strain Approach | Anup Teejo Mathew, Ikhlas Mohamed Ben Hmida, Costanza Armanini, Frédéric Boyer, Federico Renda | Khalifa University,IMT atlantique,Khalifa University of Science and Technology | Modeling, Control, and Learning for Soft Robots | | A Geometrically-Exact Assumed Strain Modes Approach for the Geometrico and Kinemato-Static Modellings of Continuum Parallel Robots | Sébastien Briot, Frédéric Boyer | LS,N,IMT atlantique | Modeling, Control, and Learning for Soft Robots | | Towards a Physics-Based Model for Steerable Eversion Growing Robots | Zicong Wu, Mikel De Iturrate Reyzabal, S.M.Hadi Sadati, Hongbin Liu, Sebastien Ourselin, Daniel Richard Leff, Robert Kevin Katzschmann, Kawal Rhode, Christos Bergeles | King's College London,Hong Kong Institute of Science & Innovation, Chinese Academy of ,University College London,Imperial College London,ETH Zurich | Modeling, Control, and Learning for Soft Robots | | P-satI-D Shape Regulation of Soft Robots | Pietro Pustina, Pablo Borja, Cosimo Della Santina, Alessandro De Luca | Sapienza University of Rome,University of Plymouth,TU Delft | Modeling, Control, and Learning for Soft Robots | | Statics and Dynamics of Continuum Robots Based on Cosserat Rods and Optimal Control Theories | Frédéric Boyer, Vincent Lebastard, Fabien Candelier, Federico Renda, Mazen Alamir | IMT atlantique,Université Aix Marseille,Khalifa University of Science and Technology,LAG | Modeling, Control, and Learning for Soft Robots | | Robotic Fiber Threading from a Gel-Like Substance Based on Impedance Control with Force Tracking | Houari Bettahar, P. A. 
Diluka Harischandra, Quan Zhou | Aalto university,Aalto University | Modeling, Control, and Learning for Soft Robots | | Overload Clutch with Integrated Torque Sensing and Decoupling Detection for Collision Tolerant Hybrid High-Speed Industrial Cobots | Frederik Ostyn, Bram Vanderborght, Guillaume Crevecoeur | Ghent University,VUB | Compliant Mechanisms | | A Micro Aircraft with Passive Variable-Sweep Wings | Songnan Bai, Runze Ding, Pakpong Chirarattananon | City University of Hong Kong,CITY UNIVERSITY OF HONGKONG | Compliant Mechanisms | | Design and Voluntary Control of Variable Stiffness Exoskeleton Based on sEMG Driven Model | Yanghui Zhu, Qingcong Wu, Bai Chen, Ziyue Zhao | Nanjing University of Aeronautics and Astronautics | Compliant Mechanisms | | A Robotic Torso Joint with Adjustable Linear Spring Mechanism for Natural Dynamic Motions in a Differential-Elastic Arrangement | Jens Reinecke, Alexander Dietrich, Anton Shu, Bastian Deutschmann, Marco Hutter | DLR,German Aerospace Center (DLR),German Aerospace Center,ETH Zurich | Compliant Mechanisms | | Requirements on the Spatial Distribution of Elastic Components Used in Compliance Realization | Shuguang Huang, Joseph Schimmels | Marquette University | Compliant Mechanisms | | A Novel Metamorphic Foot Mechanism with Toe Joints Based on Spring-Loaded Linkages | Jianwei Sun, Zhenyu Wang, Meiling Zhang, Songyu Zhang, Zhihui Qian, Jinkui Chu | Changchun University of Technology,Jilin University,Dalian University of Technology | Compliant Mechanisms | | Haptic-Based and SE(3)-Aware Object Insertion Using Compliant Hands | Osher Azulay, Maxim Monastirsky, Avishai Sintov | Tel Aviv University,Tel-Aviv University | Compliant Mechanisms | | Dynamic Modeling and Performance Analysis for a Wire-Driven Elastic Robotic Fish | Xiaocun Liao, Chao Zhou, Qianqian Zou, Jian Wang, Ben Lu | Institute of Automation, Chinese Academy of Sciences,Chinese Academy of Sciences,Institution of Automation, Chinese Academy of sciences | 
Compliant Mechanisms | | A 2-Degree-Of-Freedom Quasi-Passive Prosthetic Wrist with Two Levels of Compliance | Leonardo Cappello, Daniele D'accolti, Marta Gherardini, Marco Controzzi, Christian Cipriani | Scuola Superiore Sant'Anna,The Biorobotics Institute, Sant'Anna School of Advanced Studies | Compliant Mechanisms | | DiffCo: Auto-Differentiable Proxy Collision Detection with Multi-Class Labels for Safety-Aware Trajectory Optimization | Yuheng Zhi, Nikhil Das, Michael Yip | University of California, San Diego,UCSD | Path Planning and Collision Avoidance | | Risk-Aware Submodular Optimization for Multi-Robot Coordination | Lifeng Zhou, Pratap Tokekar | Drexel University,University of Maryland | Path Planning and Collision Avoidance | | Risk-Aware Fast Trajectory Planner for Uncertain Environments Based on Probabilistic Surrogate Reliability and Risk Contours | Guobiao Wang | Southeast university | Path Planning and Collision Avoidance | | Collision Avoidance among Dense Heterogeneous Agents Using Deep Reinforcement Learning | Kai Zhu, Bin Li, Wen Ming Zhe, Tao Zhang | Tsinghua University,JD | Path Planning and Collision Avoidance | | Maximum-Entropy Multi-Agent Dynamic Games: Forward and Inverse Solutions | Negar Mehr, Mingyu Wang, Maulik Bhatt, Mac Schwager | University of Illinois Urbana-Champaign,Stanford University | Path Planning and Collision Avoidance | | Distributing Collaborative Multi-Robot Planning with Gaussian Belief Propagation | Aalok Patwardhan, Riku Murai, Andrew J Davison | Imperial College London | Path Planning and Collision Avoidance | | Interactive Multi-Modal Motion Planning with Branch Model Predictive Control | Yuxiao Chen, Ugo Rosolia, Wyatt Ubellacker, Noel Csomay-Shanklin, Aaron Ames | Nvidia research,Caltech,California Institute of Technology | Path Planning and Collision Avoidance | | A Sequential MPC Approach to Reactive Planning for Bipedal Robots Using Safe Corridors in Highly Cluttered Environments | Kunal Sanjay Narkhede, 
Abhijeet Kulkarni, Dhruv Ashwinkumar Thanki, Ioannis Poulakakis | University of Delaware | Path Planning and Collision Avoidance | | Towards a Continuous Solution of the D-Visibility Watchman Route Problem in a Polygon with Holes | Jan Mikula, Miroslav Kulich | Faculty of Electrical Engineering – Czech Technical University in Prague,Czech Technical University in Prague | Path Planning and Collision Avoidance | | Learning Deep Neural Network Controller for Path Following of Unicycle Robots | Priyabrata Saha, Luis Guerrero-bonilla, Magnus Egerstedt, Saibal Mukhopadhyay | Georgia Institute of Technology,Instituto Tecnologico y de Estudios Superiores de Monterrey,University of California, Irvine | Deep Learning and Neural Networks in Robotics | | ViewBirdiformer: Learning to Recover Ground-Plane Crowd Trajectories and Ego-Motion from a Single Ego-Centric View | Mai Nishimura, Shohei Nobuhara, Ko Nishino | OMRON SINIC X,Kyoto University | Deep Learning and Neural Networks in Robotics | | Closing the Planning-Learning Loop with Application to Autonomous Driving | Panpan Cai, David Hsu | Shanghai Jiao Tong University,National University of Singapore | Deep Learning and Neural Networks in Robotics | | Learning from Demonstrations Via Multi-Level and Multi-Attention Domain-Adaptive Meta-Learning | Ziye Hu, Zhongxue Gan, Wei Li, Weikun Guo, Xiang Gao, Jiwei Zhu | Fudan University,Jihua Lab | Deep Learning and Neural Networks in Robotics | | Learning Stable Vector Fields on Lie Groups | Julen Urain, Davide Tateo, Jan Peters | TU Darmstadt,Technische Universität Darmstadt | Deep Learning and Neural Networks in Robotics | | Learning to Play Table Tennis from Scratch Using Muscular Robots | Dieter Buechler, Simon Guist, Roberto Calandra, Vincent Berenz, Bernhard Schölkopf, Jan Peters | Max Planck Institute for Intelligent Systems Tübingen,Max Planck Institute for Intelligent Systems,Meta AI,Technische Universität Darmstadt | Deep Learning and Neural Networks in Robotics | | 
Particle Filters in Latent Space for Robust Deformable Linear Object Tracking | Yuxuan Yang, Johannes A. Stork, Todor Stoyanov | Örebro University,Orebro University | Deep Learning and Neural Networks in Robotics | | Multi-Scale Interaction for Real-Time LiDAR Data Segmentation on an Embedded Platform | Shijie Li, Xieyuanli Chen, Yun Liu, Dengxin Dai, Cyrill Stachniss, Juergen Gall | Bonn University,National University of Defense Technology,Agency for Science, Technology and Research (A*STAR),ETH Zurich,University of Bonn | Deep Learning and Neural Networks in Robotics | | Stable Neural Adaptive Filters for Teleoperations with Uncertain Delays | Parham Kebria, Abbas Khosravi, Saeid Nahavandi | Deakin University | Deep Learning and Neural Networks in Robotics | | Compliant Microgripper Using Soft Polymer Actuator | Jung-Hwan Youn, Je-Sung Koh, Ki-Uk Kyung | Electronics and Telecommunications Research Institute (ETRI),Ajou University,Korea Advanced Institute of Science & Technology (KAIST) | Soft Robots II | | Development of Hydraulically-Driven Soft Hand for Handling Heavy Vegetables and Its Experimental Evaluation | Osamu Azami, Kyosuke Ishibashi, Mitsuo Komagata, Ko Yamamoto | Tokyo University,The University of Tokyo,University of Tokyo | Soft Robots II | | Two-Stage Grasping: A New Bin Picking Framework for Small Objects | Hanwen Cao, Jianshu Zhou, Yichuan Li, Rui Cao, Qi Dou, Yunhui Liu | The Chinese University of Hong Kong,Chinese University of Hong Kong | Soft Robots II | | Electroadhesive Auxetics As Programmable Layer Jamming Skins for Formable Crust Shape Displays | Ahad Rauf, John Settimio Bernardo, Sean Follmer | Stanford University | Soft Robots II | | Navigating Soft Robots through Wireless Heating | Yiwen Song, Mason Zadan, Kushaan Misra, Zefang Li, Jingxian Wang, Carmel Majidi, Swarun Kumar | Carnegie Mellon University,Microsoft & National University of Singapore | Soft Robots II | | Fast Untethered Soft Robotic Crawler with Elastic Instability | 
Zechen Xiong, Yufeng Su, Hod Lipson | Columbia University,Columbia university | Soft Robots II | | An Underwater Jet-Propulsion Soft Robot with High Flexibility Driven by Water Hydraulics | Siqing Chen, He Xu, Xiong Xiao, Ben Lu | Harbin Engineering University,College of Mechanical and Electrical Engineering, Harbin Enginee,Institute of Automation, Chinese Academy of Sciences | Soft Robots II | | Force/Torque Sensing for Soft Grippers Using an External Camera | Jeremy Collins, Patrick Grady, Charlie Kemp | Georgia Institute of Technology | Soft Robots II | | Data-Driven Spectral Submanifold Reduction for Nonlinear Optimal Control of High-Dimensional Robots | John Irvin Alora, Mattia Cenedese, Edward Schmerling, George Haller, Marco Pavone | Stanford University,ETH Zürich,ETH Zurich | Modelling and Control | | Control of Shape Memory Alloy Actuator Via Electrostatic Capacitive Sensor for Meso-Scale Mirror Tilting System | Baekgyeom Kim, Doohoe Lee, Dongjin Kim, Seungyong Han, Daeshik Kang, Uikyum Kim, Je-Sung Koh | Ajou University | Modelling and Control | | Data-Efficient Non-Parametric Modelling and Control of an Extensible Soft Manipulator | Mohammadreza Kasaei, Keyhan Kouhkiloui Babarahmati, Zhibin Li, Mohsen Khadem | University of Edinburgh,University College London | Modelling and Control | | Analytical Approach to Inverse Kinematics of Single Section Mobile Continuum Manipulators | Audrey Hyacinthe Bouyom Boutchouang, Achille Melingui, Joseph Jean-baptiste Mvogo Ahanda, Xinrui Yang, Othman Lakhal, Frederic Biya Motto, Rochdi Merzouki | University of Yaounde I,Higher Technical Teacher Training collage, University of Bame,University of Lille,University Lille, CRIStAL, CNRS-UMR ,,,,,CRIStAL, CNRS UMR ,,,,, University of Lille, | Modelling and Control | | A Fast Geometric Framework for Dynamic Cosserat Rods with Discrete Actuated Joints | Hossain Samei, Robin Chhabra | Carleton University | Modelling and Control | | Data-Driven Estimation of Forces Along the 
Backbone of Concentric Tube Continuum Robots | Heiko Donat, Pouya Mohammadi, Jochen Steil | Technische Universität Braunschweig | Modelling and Control | | Bootstrapping the Dynamic Gait Controller of the Soft Robot Arm | Rudolf Szadkowski, Muhammad Sunny Nazeer, Matteo Cianchetti, Egidio Falotico, Jan Faigl | Czech Technical University in Prague,The BioRobotics Institute, Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna | Modelling and Control | | Model Based Position Control of Soft Hydraulic Actuators | Mark Runciman, Enrico Franco, James Avery, Ferdinando Rodriguez Y Baena, George Mylonas | Imperial College London,Imperial College, London, UK | Modelling and Control | | Multiple Surgical Instruments Tracking-By-Prediction with Graph Hierarchy | Rui Guo, Xi Liu, Ziheng Wang, Tony Jarc | Intuitive Surgical | Medical Imaging and Perception I | | Fully Robotized 3D Ultrasound Image Acquisition for Artery | Mingcong Chen, Yuanrui Huang, Jian Chen, Tongxi Zhou, Jiuan Chen, Hongbin Liu | Institute of Automation Chinese Academy of Sciences,University of Chinese Academy of Sciences,Institute of Automation, Chinese Academy of Sciences,institute of Automation, Chinese Academy of Sciences,Institute of Automation,Chinese Academy of Sciences | Medical Imaging and Perception I | | Depth Estimation for Oral Cavity by Shape from Shading with Endoscope | Xi Wu, Gangtie Zheng | Tsinghua University | Medical Imaging and Perception I | | Dynamic Interactive Relation Capturing Via Scene Graph Learning for Robotic Surgical Report Generation | Hongqiu Wang, Yueming Jin, Lei Zhu | Hong Kong University of Science and Technology (Guangzhou),University College London,The Hong Kong University of Science and Technology (Guangzhou) | Medical Imaging and Perception I | | Reslicing Ultrasound Images for Data Augmentation and Vessel Reconstruction | Cecilia Morales, Jason Yao, Tejas Rane, Robert Edman, Howie Choset, Artur Dubrawski | Carnegie Mellon University | 
Medical Imaging and Perception I | | Expert-Agnostic Ultrasound Image Quality Assessment Using Deep Variational Clustering | Deepak Raina, Dimitrios Ntentia, Sh Chandrashekhara, Richard Voyles, Subir Kumar Saha | Indian Institute of Technology Delhi and Purdue University USA,Purdue university,All India Institute of Medical Sciences, New Delhi,Purdue University,Indian Institute of Technology Delhi | Medical Imaging and Perception I | | A Curvature and Trajectory Optimization-Based 3D Surface Reconstruction Pipeline for Ultrasound Trajectory Generation | Ananya Bal, Ashutosh Gupta, Fnu Abhimanyu, John Galeotti, Howie Choset | Carnegie Mellon University,BITS Pilani KK Birla Goa campus | Medical Imaging and Perception I | | Graph-Based Pose Estimation of Texture-Less Surgical Tools for Autonomous Robot Control | HAOZHENG XU, Mark Runciman, João Cartucho, Chi Xu, Stamatia Giannarou | Imperial college london,Imperial College London | Medical Imaging and Perception I | | Adaptive Sampling-Based Particle Filter for Visual-Inertial Gimbal in the Wild | Xueyang Kang, Ariel Herrera, Henry Lema, Esteban Valencia, Patrick Vandewalle | KU Leuven,Escuela Politécnica Nacional,Escuela Politecnica Nacional | Sensor Fusion II | | DAMS-LIO: A Degeneration-Aware and Modular Sensor-Fusion LiDAR-Inertial Odometry | Fuzhang Han, Han Zheng, Wenjun Huang, Rong Xiong, Yue Wang, Yanmei Jiao | Zhejiang University,Hangzhou Normal University | Sensor Fusion II | | ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions | Anjun Chen, Xiangyu Wang, Kun Shi, Shaohao Zhu, Bin Fang, Yingfeng Chen, Jiming Chen, Yuchi Huo, Qi Ye | Zhejiang University,Tsinghua university,Netease Inc | Sensor Fusion II | | Simple-BEV: What Really Matters for Multi-Sensor BEV Perception? 
| Adam Harley, Zhaoyuan Fang, Jie Li, Rares Ambrus, Aikaterini Fragkiadaki | Stanford University,Carnegie Mellon University,Toyota Research Institute | Sensor Fusion II | | MVFusion: Multi-View 3D Object Detection with Semantic-Aligned Radar and Camera Fusion | Zizhang Wu, Guilian Chen, Yuanzhu Gan, Wang Robin, Jian Pu | Zongmu Technology,Fudan University | Sensor Fusion II | | BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation | Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela Rus, Song Han | MIT,Massachusetts Institute of Technology,Shanghai Jiao Tong University,OmniML | Sensor Fusion II | | Fusing Event-Based Camera and Radar for SLAM Using Spiking Neural Networks with Continual STDP Learning | Ali Safa, Tim Verbelen, Ilja Ocket, André Bourdoux, Hichem Sahli, Catthoor Francky, Georges Gielen | KU Leuven - IMEC,Ghent University - imec,imec - KU Leuven,imec,Vrije Universiteit Brussel | Sensor Fusion II | | AI-Based Multi-Object Relative State Estimation with Self-Calibration Capabilities | Thomas Jantos, Christian Brommer, Eren Allak, Stephan Weiss, Jan Steinbrener | University of Klagenfurt,Universität Klagenfurt | Sensor Fusion II | | Are All Point Clouds Suitable for Completion? 
Weakly Supervised Quality Evaluation Network for Point Cloud Completion | Jieqi Shi, Peiliang Li, Xiaozhi Chen, Shaojie Shen | Hong Kong University of Technology and Science,HKUST, Robotics Institute,DJI,Hong Kong University of Science and Technology | Point Clouds | | From Semi-Supervised to Omni-Supervised Room Layout Estimation Using Point Clouds | Huan-ang Gao, Beiwen Tian, Pengfei Li, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Yurong Chen, Hongbin Zha | Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University,Intel,Peking University | Point Clouds | | Few-Shot Point Cloud Semantic Segmentation Via Contrastive Self-Supervision and Multi-Resolution Attention | Jiahui Wang, Haiyue Zhu, Haoren Guo, Abdullah Al Mamun, Cheng Xiang, Tong Heng Lee | National University of Singapore,Agency for Science, Technology and Research (A*STAR) | Point Clouds | | Scene-Level Point Cloud Colorization with Semantics-And-Geometry-Aware Networks | Rongrong Gao, Tian-zhu Xiang, Chenyang Lei, Jaesik Park, Qifeng Chen | HongKong university of science and engineering,Inception Institute of Artificial Intelligence,HKUST,POSTECH | Point Clouds | | Deep Interactive Full Transformer Framework for Point Cloud Registration | Guangyan Chen, Meiling Wang, Qingxiang Zhang, Li Yuan, Tong Liu, Yufeng Yue | Beijing Institute of technology,Beijing Institute of Technology,Peking University | Point Clouds | | Coarse-To-Fine Point Cloud Registration with SE(3)-Equivariant Representations | Cheng-wei Lin, Tung-i Chen, Hsin-ying Lee, Wen-chin Chen, Winston Hsu | National Taiwan University | Point Clouds | | LiDAR-SGM: Semi-Global Matching on LiDAR Point Clouds and Their Cost-Based Fusion into Stereo Matching | Bianca Forkel, Hans J Wuensche | Universität der Bundeswehr München | Point Clouds | | Segregator: Global Point Cloud Registration with Semantic and Geometric Cues | Pengyu Yin, Shenghai Yuan, Cao Haozhi, Xingyu Ji, Shuyang Zhang, Lihua Xie | Nanyang Technological 
University,NANYANG TECHNOLOGICAL UNIVERSITY,The Hong Kong University of Science and Technology,NanyangTechnological University | Point Clouds | | StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images Via Back-View NOCS | Kai Chen, Stephen James, Congying Sui, Yunhui Liu, Pieter Abbeel, Qi Dou | The Chinese University of Hong Kong,Dyson,Chinese University of Hong Kong,UC Berkeley | Pose Estimation | | Non-Minimal Solvers for Relative Pose Estimation with a Known Relative Rotation Angle | Deshun Hu | Harbin Institute of Technology | Pose Estimation | | Generalizable Pose Estimation Using Implicit Scene Representations | Vaibhav Saxena, Kamal Rahimi Malekshan, Linh Tran, Yotto Koga | Georgia Institute of Technology,Autodesk | Pose Estimation | | RFFCE: Residual Feature Fusion and Confidence Evaluation Network for 6DoF Pose Estimation | Qiwei Meng, Shanshan Ji, Shiqiang Zhu, Tianlei Jin, Te Li, Jianjun Gu, Wei Song | Zhejiang Lab,zhejiang lab | Pose Estimation | | Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation of In-Hand Objects | Alireza Rezazadeh, Snehal Dikhale, Soshi Iba, Nawid Jamali | University of Minnesota,Honda Research Institute USA | Pose Estimation | | Interactive Object Segmentation in 3D Point Clouds | Theodora Kontogianni, Ekin Celikkan, Siyu Tang, Konrad Schindler | ETH Zurich,RWTH Aachen University,ETH Zürich | Pose Estimation | | GSNet: Model Reconstruction Network for Category-Level 6D Object Pose and Size Estimation | Penglei Liu, Qieshi Zhang, Jun Cheng | Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences,Shenzhen Institutes of Advanced Technology, Chinese Academy of S,Shenzhen Institutes of Advanced Technology | Pose Estimation | | 6D Pose Estimation for Textureless Objects on RGB Frames Using Multi-View Optimization | Jun Yang, Wenjie Xue, Sahar Ghavidel, Steven Lake Waslander | University of Toronto,Epson Canada | Pose Estimation | | Learning Stabilization 
Control from Observations by Learning Lyapunov-Like Proxy Models | Milan Ganai, Chiaki Hirayama, Ya-Chien Chang, Sicun Gao | University of California San Diego,UCSD | Imitation Learning | | Efficient Preference-Based Reinforcement Learning Using Learned Dynamics Models | Yi Liu, Gaurav Datta, Ellen Novoseller, Daniel Brown | UC Berkeley,University of California, Berkeley,University of Utah | Imitation Learning | | BITS: Bi-Level Imitation for Traffic Simulation | Danfei Xu, Yuxiao Chen, Boris Ivanovic, Marco Pavone | Stanford University,Nvidia research,NVIDIA | Imitation Learning | | Off-Policy Imitation Learning from Visual Inputs | Zhihao Cheng, Li Shen, Dacheng Tao | The University of Sydney,JD Explore Academy | Imitation Learning | | Versatile Skill Control Via Self-Supervised Adversarial Imitation of Unlabeled Mixed Motions | Chenhao Li, Sebastian Blaes, Pavel Kolev, Marin Vlastelica, Jonas Frey, Georg Martius | ETH Zürich,Max Planck Institute for Intelligent Systems,ETH Zurich | Imitation Learning | | Curriculum-Based Imitation of Versatile Skills | Maximilian Xiling Li, Onur Celik, Philipp Becker, Denis Blessing, Rudolf Lioutikov, Gerhard Neumann | Karlsruhe Institute of Technology,KIT,Karlsruhe Institute of Technology (KIT) | Imitation Learning | | Learning Stable Dynamics Via Iterative Quadratic Programming | Paul Gesel, Momotaz Begum | University of New Hampshire | Imitation Learning | | Holistic Graph-Based Motion Prediction | Daniel Grimm, Philip Schörner, Moritz Dreßler, Johann Marius Zöllner | FZI Research Center for Information Technology,Karlsruhe Institute of Technology (KIT),FZI Forschungszentrum Informatik | Imitation Learning | | Extraneousness-Aware Imitation Learning | Ray Chen Zheng, Kaizhe Hu, Zhecheng Yuan, Boyuan Chen, Huazhe Xu | Tsinghua University,Massachusetts Institute of Technology | Imitation Learning | | Wayformer: Motion Forecasting Via Simple & Efficient Attention Networks | Nigamaa Nayakanti, Rami Al-rfou, 
Aurick Zhou, Kratarth Goel, Khaled Refaat, Benjamin Sapp | Waymo | Imitation Learning | | A Non-Parametric Skill Representation with Soft Null Space Projectors for Fast Generalization | João Silvério, Yanlong Huang | German Aerospace Center,University of Leeds | Imitation Learning | | Sample Efficient Dynamics Learning for Symmetrical Legged Robots: Leveraging Physics Invariance and Geometric Symmetries | Jee-Eun Lee, Jaemin Lee, Tirthankar Bandyopadhyay, Luis Sentis | The University of Texas at Austin,California Institute of Technology,CSIRO | Learning for Control II | | Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion | Lev Grossman, Brian Plancher | Berkshire Grey,Barnard College, Columbia University | Learning for Control II | | Causal Inference for De-Biasing Motion Estimation from Robotic Observational Data | Junhong Xu, Kai Yin, Jason M. Gregory, Lantao Liu | Indiana University,Expedia Group,US Army Research Laboratory | Learning for Control II | | Active Predictive Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems | Alex Ororbia, Ankur Mali | Rochester Institute of Technology,University of South Florida | Learning for Control II | | Approximating Discontinuous Nash Equilibrial Values of Two-Player General-Sum Differential Games | Lei Zhang, Mukesh Ghimire, Wenlong Zhang, Zhe Xu, Yi Ren | Arizona State University | Learning for Control II | | Visual Affordance Prediction for Guiding Robot Exploration | Homanga Bharadhwaj, Abhinav Gupta, Shubham Tulsiani | Carnegie Mellon University | Learning for Control II | | Generating Stable and Collision-Free Policies through Lyapunov Function Learning | Alexandre Coulombe, Hsiu-Chin Lin | McGill University | Learning for Control II | | ALAN: Autonomously Exploring Robotic Agents in the Real World | Russell Mendonca, Shikhar Bahl, Deepak Pathak | Carnegie Mellon University,UC Berkeley | Learning for Control II | | Throwing Objects into 
a Moving Basket While Avoiding Obstacles | Hamidreza Kasaei, Mohammadreza Kasaei | University of Groningen,University of Edinburgh | Learning for Control II | | AIMY: An Open-Source Table Tennis Ball Launcher for Versatile and High-Fidelity Trajectory Generation | Alexander Dittrich, Jan Schneider, Simon Guist, Nico Gürtler, Heiko Ott, Thomas Steinbrenner, Bernhard Schölkopf, Dieter Buechler | Max Planck Institute for Intelligent Systems, Tübingen, Germany,Max Planck Institute for Intelligent Systems,Max Planck Institute for Intelligent Systems Tübingen,MPI for Intelligent Systems | Learning for Control II | | Data-Efficient Characterization of the Global Dynamics of Robot Controllers with Confidence Guarantees | Ewerton Vieira, Aravind Sivaramakrishnan, Yao Song, Edgar Granados, Marcio Gameiro, Konstantin Mischaikow, Ying Hung, Kostas E. Bekris | Rutgers University,Rutgers,Rutgers, the State University of New Jersey | Learning for Control II | | Modeling and Inertial Parameter Estimation of Cart-Like Nonholonomic Systems Using a Mobile Manipulator | Sergio Aguilera, Muhammad Ali Murtaza, Jonathan Rogers, Seth Hutchinson | Georgia Institute of Technology | Learning for Control II | | Using Registration with Fourier-SOFT in 2D (FS2D) for Robust Scan Matching of Sonar Range Data | Tim Hansen, Andreas Birk | Constructor University,Jacobs University | Marine Robotics II | | A Robotic Cooperative Network for Localising a Submarine in Distress: Results from REPMUS21 | Gabriele Ferri, Alessandro Faggiani, Roberto Petroccia, Pietro Stinco, Alessandra Tesei | NATO Centre for Maritime Research and Experimentation,CMRE,NATO Ctr. 
on Maritime Research and Experimentation (CMRE),NATO STO CMRE | Marine Robotics II | | DeepSeeColor: Realtime Adaptive Color Correction for Autonomous Underwater Vehicles Via Deep Learning Methods | Stewart Jamieson, Jonathan Patrick How, Yogesh Girdhar | Massachusetts Institute of Technology,Woods Hole Oceanographic Institution | Marine Robotics II | | From Concept to Field Tests: Accelerated Development of Multi-AUV Missions Using a High-Fidelity Faster-Than-Real-Time Simulator | Tim Player, Arjo Chakravarty, Mabel Zhang, Ben Yair Raanan, Brian Kieft, Yanwu Zhang, Brett Hobson | Oregon State University,Open Robotics, Singapore University of Science and Technology,Open Robotics team at Intrinsic,Monterey Bay Aquarium Research Institute,MBARI | Marine Robotics II | | Deep Reinforcement Learning Based Tracking Control of an Autonomous Surface Vessel in Natural Waters | Wei Wang, Xiaojing Cao, Alejandro Gonzalez-garcia, Lianhao Yin, Niklas Hagemann, Yuanyuan Qiao, Carlo Ratti, Daniela Rus | Massachusetts Institute of Technology,Beijing University of Posts and Telecommunications,KU Leuven,MIT | Marine Robotics II | | UDepth: Fast Monocular Depth Estimation for Visually-Guided Underwater Robots | Boxiao Yu, Jiayi Wu, Md Jahidul Islam | University of Florida | Marine Robotics II | | Improved Benthic Classification Using Resolution Scaling and SymmNet Unsupervised Domain Adaptation | Heather Doig, Oscar Pizarro, Stefan Bernard Williams | University of Sydney,Australian Centre for Field Robotics | Marine Robotics II | | Data-Driven Loop Closure Detection in Bathymetric Point Clouds for Underwater SLAM | Jiarui Tan, Ignacio Torroba Balmori, Yiping Xie, John Folkesson | KTH Royal Institute of Technology,KTH | Marine Robotics II | | ResiPlan: Closing the Planning-Acting Loop for Safe Underwater Navigation | Marios Xanthidis, Eleni Kelasidi, Kostas Alexis | SINTEF Ocean,NTNU - Norwegian University of Science and Technology | Marine Robotics II | | Diver Interest Via Pointing: 
Human-Directed Object Inspection for AUVs | Chelsey Edge, Junaed Sattar | University of Minnesota-Twin Cities,University of Minnesota | Marine Robotics II | | Robust Uncertainty Estimation for Classification of Maritime Objects | Jonathan Becktor, Frederik Scholler, Evangelos Boukas, Lazaros Nalpantidis | Technical University of Denmark | Marine Robotics II | | Adaptive Heading for Perception-Aware Trajectory Following | Jonatan Scharff Willners, Sean Katagiri, Shida Xu, Tomasz Luczynski, Joshua Roe, Y. R. Petillot | Heriot-Watt University,Imperial College London | Marine Robotics II | | An Optimal Open-Loop Strategy for Handling a Flexible Beam with a Robot Manipulator | Shamil Mamedov, Alejandro Astudillo, Daniele Ronzani, Wilm Decré, Jean-philippe Noël, Jan Swevers | KU Leuven,Katholieke Universiteit Leuven | Optimization and Optimal Control | | Constraint Manifolds for Robotic Inference and Planning | Yetong Zhang, Fan Jiang, Gerry Chen, Varun Agrawal, Adam Rutkowski, Frank Dellaert | Georgia Institute of Technology,Air Force Research Laboratory | Optimization and Optimal Control | | Model Predictive Optimized Path Integral Strategies | Dylan M. 
Asmar, Ransalu Senanayake, Shawn Manuel, Mykel Kochenderfer | Stanford University | Optimization and Optimal Control | | Real-Time Solutions to Multimodal Partially Observable Dynamic Games | Oswin So, Paul Drews, Thomas Balch, Velin Dimitrov, Guy Rosman, Evangelos Theodorou | Massachusetts Institute of Technology,Toyota Research Institute,Georgia Institute of Technology | Optimization and Optimal Control | | Autonomous Drone Racing: Time-Optimal Spatial Iterative Learning Control within a Virtual Tube | Shuli Lv, Yan Gao, Jiaxing Che, Quan Quan | Beihang University,School of Automation Science and Electrical Engineering, Beihang | Optimization and Optimal Control | | Curvature-Aware Model Predictive Contouring Control | Lorenzo Lyons, Laura Ferranti | Delft University of Technology | Optimization and Optimal Control | | A Sequential Quadratic Programming Approach to the Solution of Open-Loop Generalized Nash Equilibria | Edward Zhu, Francesco Borrelli | University of California, Berkeley | Optimization and Optimal Control | | RPGD: A Small-Batch Parallel Gradient Descent Optimizer with Explorative Resampling for Nonlinear Model Predictive Control | Frederik Heetmeyer, Marcin Paluch, Diego Bolliger, Florian Bolli, Xiang Deng, Ennio Filicicchia, Tobi Delbruck | ETH Zurich,University of Zurich,Univ. 
of Zurich & ETH Zurich | Optimization and Optimal Control | | Distributionally Robust Optimization with Unscented Transform for Learning-Based Motion Control in Dynamic Environments | Astghik Hakobyan, Insoon Yang | Seoul National University | Optimization and Optimal Control | | Event-Triggered Optimal Formation Tracking Control Using Reinforcement Learning for Large-Scale UAV Systems | Ziwei Yan, Liang Han, Xiaoduo Li, Jinjie Li, Zhang Ren | Beihang University,Shanghai Jiao Tong University | Optimization and Optimal Control | | Differentiable Collision Detection: A Randomized Smoothing Approach | Louis Montaut, Quentin Le Lidec, Antoine Bambade, Vladimír Petrík, Josef Sivic, Justin Carpentier | INRIA (Paris) - CIIRC (Prague),INRIA-ENS-PSL,INRIA Paris, ENPC France,Czech Technical University in Prague,Czech Technical University,INRIA | Optimization and Optimal Control | | Start State Selection for Control Policy Learning from Optimal Trajectories | Christoph Zelch, Jan Peters, Oskar Von Stryk | Technische Universität Darmstadt | Optimization and Optimal Control | | Swarm-LIO: Decentralized Swarm LiDAR-Inertial Odometry | Fangcheng Zhu, Yunfan Ren, Fanze Kong, Huajie Wu, Siqi Liang, Nan Chen, Wei Xu, Fu Zhang | The University of Hong Kong,Hong Kong University,Harbin Institute of Technology, Shenzhen,University of Hong Kong | Aerial Robotics II | | HALO: Hazard-Aware Landing Optimization for Autonomous Systems | Christopher Hayner, Samuel Buckner, Daniel Broyles, Evelyn Madewell, Karen Yan Ming Leung, Behcet Acikmese | University of Washington,Stanford University, NVIDIA Research, University of Washington | Aerial Robotics II | | Onboard Controller Design for Nano UAV Swarm in Operator-Guided Collective Behaviors | Tugay Alperen Karagüzel, Victor Retamal Guiberteau, Eliseo Ferrante | Vrije Universiteit Amsterdam | Aerial Robotics II | | EFTrack: A Lightweight Siamese Network for Aerial Object Tracking | Wenqi Zhang, Yuan Yao, Xincheng Liu, Kai Kou, 
Gang Yang | Northwestern Polytechnical University | Aerial Robotics II | | Active Metric-Semantic Mapping by Multiple Aerial Robots | Xu Liu, Ankit Prabhu, Fernando Cladera, Ian Douglas Miller, Lifeng Zhou, Camillo Jose Taylor, Vijay Kumar | University of Pennsylvania,Drexel University | Aerial Robotics II | | Multi-Target Pursuit by a Decentralized Heterogeneous UAV Swarm Using Deep Multi-Agent Reinforcement Learning | Maryam Kouzehgar, Youngbin Song, Malika Meghjani, Roland Bouffanais | Singapore University of Technology and Design,University of Ottawa | Aerial Robotics II | | A Moving Target Tracking System of Quadrotors with Visual-Inertial Localization | Ziyue Lin, Wenbo Xu, Wei Wang | Institute of Automation, Chinese Academy of Sciences | Aerial Robotics II | | BogieCopter: A Multi-Modal Aerial-Ground Vehicle for Long-Endurance Inspection Applications | Teodoro Dias, Meysam Basiri | Instituto Superior Técnico | Aerial Robotics II | | Towards Autonomous UAV Railway DC Line Recharging: Design and Simulation | Frederik Falk Nyboe, Nicolaj Malle, Gerd Vom Bögel, Linda Cousin, Thomas Heckel, Konstantin Troidl, Anders Schack Madsen, Emad Samuel Malki Ebeid | University of Southern Denmark,Fraunhofer IMS,Fraunhofer IISB | Perception | | Fast Region of Interest Proposals on Maritime UAVs | Benjamin Kiefer, Andreas Zell | University of Tuebingen,University of Tübingen | Perception | | TRADE: Object Tracking with 3D Trajectory and Ground Depth Estimates for UAVs | Pedro Proença, Patrick Spieler, Robert Hewitt, Jeff Delaune | NASA-JPL,JPL,Jet Propulsion Laboratory | Perception | | Adaptive Keyframe Generation Based LiDAR Inertial Odometry for Complex Underground Environments | Boseong Kim, Chanyoung Jung, David Hyunchul Shim, Ali-Akbar Agha-Mohammadi | KAIST,NASA-JPL, Caltech | Perception | | Finding Things in the Unknown: Semantic Object-Centric Exploration with an MAV | Sotirios Papatheodorou, Nils Funk, Dimos Tzoumanikas, Christopher Choi, Binbin Xu, Stefan 
Leutenegger | Imperial College London,University of Toronto,Technical University of Munich | Perception | | Stealthy Perception-Based Attacks on Unmanned Aerial Vehicles | Amir Khazraei, Haocheng Meng, Miroslav Pajic | Duke university,Duke University | Perception | | SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking | Liangliang Yao, Changhong Fu, Sihang Li, Guangze Zheng, Junjie Ye | Tongji University | Perception | | Semantics-Aware Exploration and Inspection Path Planning | Mihir Rahul Dharmadhikari, Kostas Alexis | NTNU - Norwegian University of Science and Technology | Perception | | Inverted Landing in a Small Aerial Robot Via Deep Reinforcement Learning for Triggering and Control of Rotational Maneuvers | Bryan Habas, Jack W. Langelaan, Bo Cheng | The Pennsylvania State University,Penn State University,Pennsylvania State University | Micro Aerial Robots | | Heading Control of a Long-Endurance Insect-Scale Aerial Robot Powered by Soft Artificial Muscles | Yi-Hsuan Hsiao, Suhan Kim, Zhijian Ren, Yufeng Chen | Massachusetts Institute of Technology,Massachusetts Institute of Technology (MIT) | Micro Aerial Robots | | Robust, High-Rate Trajectory Tracking on Insect-Scale Soft-Actuated Aerial Robots with Deep-Learned Tube MPC | Andrea Tagliabue, Yi-Hsuan Hsiao, Urban Fasel, J. Nathan Kutz, Steven L. 
Brunton, Yufeng Chen, Jonathan Patrick How | Massachusetts Institute of Technology,Imperial College London,University of Washington | Micro Aerial Robots | | A New Sensation: Digital Strain Sensing for Disturbance Detection in Flapping Wing Micro Aerial Vehicles | Regan Kubicek, Mahnoush Babaei, Alison Weber, Sarah Bergbreiter | Carnegie Mellon University,The University of Texas at Austin,University of Washington | Micro Aerial Robots | | A Lightweight High-Voltage Boost Circuit for Soft-Actuated Micro-Aerial-Robots | Zhijian Ren, Jiahui Yang, Suhan Kim, Yi-Hsuan Hsiao, Jeffrey Lang, Yufeng Chen | Massachusetts Institute of Technology,Southern University of Science and Technology,Massachusetts Institute of Technology (MIT),MIT | Micro Aerial Robots | | Hummingbird-Bat Hybrid Wing by 3-D Printing | Tomoya Fujii, Jinqiang Dang, Hiroto Tanaka | Tokyo institute of technology,Tokyo Institute of Technology | Micro Aerial Robots | | Ultra-Low Power Deep Learning-Based Monocular Relative Localization Onboard Nano-Quadrotors | Stefano Bonato, Stefano Carlo Lambertenghi, Elia Cereda, Alessandro Giusti, Daniele Palossi | USI and SUPSI,USI, SUPSI,IDSIA USI-SUPSI,IDSIA Lugano, SUPSI,ETH Zurich | Micro Aerial Robots | | A Hybrid Quadratic Programming Framework for Real-Time Embedded Safety-Critical Control | Ryan Bena, Sushmit Hossain, Buyun Chen, Wei Wu, Quan Nguyen | University of Southern California | Micro Aerial Robots | | D2CoPlan: A Differentiable Decentralized Planner for Multi-Robot Coverage | Vishnu Sharma, Lifeng Zhou, Pratap Tokekar | University of Maryland,Drexel University | Multi-Robot Systems II | | Accelerating Multi-Agent Planning Using Graph Transformers with Bounded Suboptimality | Chenning Yu, Qingbiao Li, Sicun Gao, Amanda Prorok | University of California San Diego,The University of Cambridge,UCSD,University of Cambridge | Multi-Robot Systems II | | Environment Optimization for Multi-Agent Navigation | Zhan Gao, Amanda Prorok | University of Cambridge | 
Multi-Robot Systems II | | Heterogeneous Coverage and Multi-Resource Allocation in Supply-Constrained Teams | Mela Coffey, Alyssa Pierson | Boston University | Multi-Robot Systems II | | Sequential Stochastic Multi-Task Assignment for Multi-Robot Deployment Planning | Colin Mitchell, Graeme Best, Geoffrey Hollinger | Oregon State University,University of Technology Sydney | Multi-Robot Systems II | | Path Planning under Uncertainty to Localize mmWave Sources | Kai Pfeiffer, Yuze Jia, Mingsheng Yin, Akshaj Kumar Veldanda, Yaqi Hu, Amee Trivedi, Jeff Jun Zhang, Siddharth Garg, Elza Erkip, Sundeep Rangan, Ludovic Righetti | Nanyang Technological University,NYU,UBC,Yale,New York University | Multi-Robot Systems II | | Communication-Critical Planning Via Multi-Agent Trajectory Exchange | Nathaniel Glaser, Zsolt Kira | Georgia Institute of Technology | Multi-Robot Systems II | | Distributed Potential iLQR: Scalable Game-Theoretic Trajectory Planning for Multi-Agent Interactions | Zach Williams, Jushan Chen, Negar Mehr | University of Illinois Urbana-Champaign | Multi-Robot Systems II | | FRAME: Fast and Robust Autonomous 3D Point Cloud Map-Merging for Egocentric Multi-Robot Exploration | Nikolaos Stathoulopoulos, Anton Koval, Ali-Akbar Agha-Mohammadi, George Nikolakopoulos | Luleå University of Technology, Robotics and AI Group,Luleå University of Technology,NASA-JPL, Caltech | Multi-Robot Systems II | | Autonomous Task Planning for Heterogeneous Multi-Agent Systems | Anatoli Tziola, Savvas Loizou | Cyprus University of Technology | Multi-Robot Systems II | | Graph Neural Networks for Multi-Robot Active Information Acquisition | Mariliza Tzes, Nikolaos Bousias, Evangelos Chatzipantazis, George J. 
Pappas | University of Pennsylvania | Multi-Robot Systems II | | Balancing Efficiency and Unpredictability in Multi-Robot Patrolling: A MARL-Based Approach | Lingxiao Guo, Haoxuan Pan, Xiaoming Duan, Jianping He | Shanghai Jiao Tong University,Department of Automation, Shanghai Jiao Tong University | Multi-Robot Systems II | | Learning to Influence Vehicles' Routing in Mixed-Autonomy Networks by Dynamically Controlling the Headway of Autonomous Cars | Xiaoyu Ma, Negar Mehr | University of Illinois at Urbana-Champaign,University of Illinois Urbana-Champaign | Intelligent Transportation Systems II | | Traffic-Aware Autonomous Driving with Differentiable Traffic Simulation | Laura Zheng, Sanghyun Son, Ming C. Lin | University of Maryland, College Park,University of Maryland,University of Maryland at College Park | Intelligent Transportation Systems II | | Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand | Daniel Garces, Sushmita Bhattacharya, Stephanie Gil, Dimitri Bertsekas | Harvard University,MIT | Intelligent Transportation Systems II | | Cooperative Driving in Mixed Traffic of Manned and Unmanned Vehicles Based on Human Driving Behavior Understanding | Jiaxing Lu, Sanzida Hossain, Weihua Sheng, He Bai | Oklahoma State University | Intelligent Transportation Systems II | | Exploring Navigation Maps for Learning-Based Motion Prediction | Julian Schmidt, Julian Jordan, Franz Gritschneder, Thomas Monninger, Klaus Dietmayer | Mercedes-Benz AG, Ulm University,Mercedes-Benz AG,Ulm University,Mercedes-Benz AG, University of Stuttgart,University of Ulm | Intelligent Transportation Systems II | | SLAMesh: Real-Time LiDAR Simultaneous Localization and Meshing | Jianyuan Ruan, Bo Li, Yibo Wang, Yuxiang Sun | Hong Kong Polytechnic University,Zhejiang University,The Hong Kong Polytechnic University | Intelligent Transportation Systems II | | CenterLineDet: CenterLine Graph Detection for Road Lanes with 
Vehicle-Mounted Sensors by Transformer for HD Map Generation | Zhenhua Xu, Yuxuan Liu, Yuxiang Sun, Ming Liu, Lujia Wang | the Hong Kong University of Science and Technology,Hong Kong University of Science and Technology,The Hong Kong Polytechnic University,The Hong Kong University of Technology | Intelligent Transportation Systems II | | Guided Conditional Diffusion for Controllable Traffic Simulation | Ziyuan Zhong, Davis Rempe, Danfei Xu, Yuxiao Chen, Sushant Veer, Tong Che, Baishakhi Ray, Marco Pavone | Columbia University,Stanford University,Stanford Univesity,Nvidia research,NVIDIA,Columbia University in the City of New York | Intelligent Transportation Systems II | | TrafficGen: Learning to Generate Diverse and Realistic Traffic Scenarios | Lan Feng, Quanyi Li, Zhenghao Peng, Shuhan Tan, Bolei Zhou | ETH ZURICH,University of Edinburgh,University of California, Los Angeles,UT Austin | Intelligent Transportation Systems II | | Infrastructure-Based End-To-End Learning and Prevention of Driver Failure | Noam Buckman, Shiva Sreeram, Mathias Lechner, Yutong Ban, Ramin Hasani, Sertac Karaman, Daniela Rus | Massachusetts Institute of Technology,MIT,Massachusetts Institute of Technology (MIT) | Intelligent Transportation Systems II | | V2XP-ASG: Generating Adversarial Scenes for Vehicle-To-Everything Perception | Hao Xiang, Runsheng Xu, Xia Xin, Zhaoliang Zheng, Bolei Zhou, Jiaqi Ma | University of California, Los Angeles,UCLA | Intelligent Transportation Systems II | | Satellite Image Based Cross-View Localization for Autonomous Vehicle | Shan Wang, Yanhao Zhang, Ankit Vora, Akhil Perincherry, Hongdong Li | The Australian National University,Australian National University,Ford Motor Company,Australian National university and NICTA | Intelligent Transportation Systems II | | Collision-Free Coverage Path Planning for the Variable-Speed Curvature-Constrained Robot | Lin Li, Dianxi Shi, Songchang Jin, Yixuan Sun, Xing Zhou, Shaowu Yang, Hengzhu Liu | National University 
of Defense Technology,Defense Innovation Institute | Motion and Path Planning II | | Stochastic Traveling Salesperson Problem with Neighborhoods for Object Detection | Cheng Peng, Minghan Wei, Volkan Isler | University of Minnesota, Twin Cities,University of Minnesota | Motion and Path Planning II | | Optimal Allocation of Many Robot Guards for Sweep-Line Coverage | Si Wei Feng, Teng Guo, Jingjin Yu | Rutgers University | Motion and Path Planning II | | A Linear and Exact Algorithm for Whole-Body Collision Evaluation Via Scale Optimization | Qianhao Wang, Zhepei Wang, Liuao Pei, Chao Xu, Fei Gao | Zhejiang University | Motion and Path Planning II | | Probabilistic Risk Assessment for Chance-Constrained Collision Avoidance in Uncertain Dynamic Environments | Khaled Alaaeldin Abdelfattah Mustafa, Oscar De Groot, Xinwei Wang, Jens Kober, Javier Alonso-Mora | TU Delft,Delft University of Technology | Motion and Path Planning II | | Computational Tradeoff in Minimum Obstacle Displacement Planning for Robot Navigation | Antony Thomas, Giulio Ferro, Fulvio Mastrogiovanni, Michela Robba | University of Genoa | Motion and Path Planning II | | A Trajectory Planner for Mobile Robots Steering Non-Holonomic Wheelchairs in Dynamic Environments | Martin Schulze, Friedrich Graaf, Lea Steffen, Arne Roennau, Rüdiger Dillmann | FZI Research Center for Information Technology,FZI Research Center for Information Technology, Karlsruhe,FZI Forschungszentrum Informatik, Karlsruhe,FZI - Forschungszentrum Informatik - Karlsruhe | Motion and Path Planning II | | Safe Bipedal Path Planning Via Control Barrier Functions for Polynomial Shape Obstacles Estimated Using Logistic Regression | Chengyang Peng, Octavian Donca, Guillermo Castillo, Ayonga Hereid | The Ohio State University,Ohio State University | Motion and Path Planning II | | Real-Time Decentralized Navigation of Nonholonomic Agents Using Shifted Yielding Areas | He Liang, Zherong Pan, Dinesh Manocha | University of North 
Carolina at Chapel Hill,Tencent America,University of Maryland | Motion and Path Planning II | | Differentiable Collision Detection for a Set of Convex Primitives | Kevin Tracy, Taylor Howell, Zachary Manchester | Carnegie Mellon University,Stanford University | Motion and Path Planning II | | Shunted Collision Avoidance for Multi-UAV Motion Planning with Posture Constraints | Gang Xu, Deye Zhu, Junjie Cao, Yong Liu, Jian Yang | Zhejiang University,Institute of Cyber Systems and Control, Zhejiang University,China Research and Development Academy of Machinery Equipment | Motion and Path Planning II | | Dynamic Control Barrier Function-Based Model Predictive Control to Safety-Critical Obstacle-Avoidance of Mobile Robot | Zhuozhu Jian, Zihong Yan, Xuanang Lei, Zihong Lu, Bin Lan, Xueqian Wang, Bin Liang | Tsinghua University,Tsinghua university,ETH Zurich,Harbin Institute of Technology, Shenzhen,Center for Artificial Intelligence and Robotics, Graduate School | Motion and Path Planning II | | A Minimum Swept-Volume Metric Structure for Configuration Space | Yann Dubois De Mont-marin, Jean Ponce, Jean-Paul Laumond | Inria, DI ENS,Ecole Normale Supérieure,Inria, DI ENS PSL | Task and Motion Planning | | Task-Space Clustering for Mobile Manipulator Task Sequencing | Quang-Nam Nguyen, Nicholas Adrian, Quang-Cuong Pham | Nanyang Technological University,NTU Singapore | Task and Motion Planning | | Sampling-Based Path Planning under Temporal Logic Constraints with Real-Time Adaptation | Yizhou Chen, Ruoyu Wang, Xinyi Wang, Ben M. 
Chen | Chinese University of Hong Kong,The Chinese University of Hong Kong | Task and Motion Planning | | Optimal Grasps and Placements for Task and Motion Planning in Clutter | Carlos Quintero-Pena, Zachary Kingston, Tianyang Pan, Rahul Shome, Anastasios Kyrillidis, Lydia Kavraki | Rice University,The Australian National University | Task and Motion Planning | | Resolution Complete In-Place Object Retrieval Given Known Object Models | Daniel Nakhimovich, Yinglong Miao, Kostas E. Bekris | Rutgers, the State University of New Jersey,Rutgers University | Task and Motion Planning | | Task-Directed Exploration in Continuous POMDPs for Robotic Manipulation of Articulated Objects | Aidan Curtis, Leslie Kaelbling, Siddarth Jain | MIT,Mitsubishi Electric Research Laboratories (MERL) | Task and Motion Planning | | Learning Feasibility of Factored Nonlinear Programs in Robotic Manipulation Planning | Joaquim Ortiz De Haro, Jung-su Ha, Danny Driess, Erez Karpas, Marc Toussaint | University of Stuttgart,TU Berlin,Technion | Task and Motion Planning | | Learning to Predict Action Feasibility for Task and Motion Planning in 3D Environments | Smail Ait Bouhsain, Alami Rachid, Thierry Simeon | LAAS-CNRS,CNRS | Task and Motion Planning | | Policy Guided Lazy Search with Feedback for Task and Motion Planning | Mohamed Khodeir, Atharv Sonwane, Ruthrash Hari, Florian Shkurti | University of Toronto,Microsoft Research | Task and Motion Planning | | A Reachability Tree-Based Algorithm for Robot Task and Motion Planning | Kanghyun Kim, Daehyung Park, Min Jun Kim | Korea Advanced Institute of Science and Technology (KAIST),Korea Advanced Institute of Science and Technology, KAIST,KAIST | Task and Motion Planning | | Dual Quaternion Based Dynamic Movement Primitives to Learn Industrial Tasks Using Teleoperation | Rohit CHANDRA, Victor Henri Giraud, Mohammad Alkhatib, Youcef Mezouar | SIGMA, UCA Clermont-Ferrand, France,SIGMA-Clermont / Institut Pascal,Université Clermont Auvergne,Clermont 
Auvergne INP - SIGMA Clermont | Task and Motion Planning | | Multi-Contact Task and Motion Planning Guided by Video Demonstration | Kateryna Zorina, David Kovar, Florent Lamiraux, Nicolas Mansard, Justin Carpentier, Josef Sivic, Vladimír Petrík | CIIRC,Czech Technical University in Prague,CNRS,INRIA,Czech Technical University | Task and Motion Planning | | MVTrans: Multi-View Perception of Transparent Objects | Yi Ru Wang, Yuchi Zhao, Haoping Xu, Sagi Eppel, Alan Aspuru-guzik, Florian Shkurti, Animesh Garg | University of Toronto, University of Washington,University of Waterloo,University of Toronto | Perception for Grasping and Manipulation II | | The Sum of Its Parts: Visual Part Segmentation for Inertial Parameter Identification of Manipulated Objects | Philippe Nadeau, Matthew Giamou, Jonathan Kelly | University of Toronto | Perception for Grasping and Manipulation II | | SLURP! Spectroscopy of Liquids Using Robot Pre-Touch Sensing | Nathaniel Hanson, Wesley Lewis, Kavya Puthuveetil, Donelle Furline Jr, Akhil Padmanabha, Taskin Padir, Zackory Erickson | Northeastern University,Carnegie Mellon University | Perception for Grasping and Manipulation II | | Tactile Based Robotic Skills for Cable Routing Operations | Andrea Monguzzi, Martina Pelosi, Andrea Maria Zanchettin, Paolo Rocco | Politecnico di Milano | Perception for Grasping and Manipulation II | | Category-Level Global Camera Pose Estimation with Multi-Hypothesis Point Cloud Correspondences | Jun-Jee Chao, Kazim Selim Engin, Nicolai Häni, Volkan Isler | University of Minnesota | Perception for Grasping and Manipulation II | | GSMR-CNN: An End-To-End Trainable Architecture for Grasping Target Objects from Multi-Object Scenes | Valerija Holomjova, Andrew Joe Starkey, Pascal Meißner | University of Aberdeen | Perception for Grasping and Manipulation II | | 3DSGrasp: 3D Shape-Completion for Robotic Grasp | Seyed Saber Mohammadi, Nuno Ferreira Duarte, Plinio Moreno, Atabak Dehban, Dimitrios Dimou, Pietro 
Morerio, Matteo Taiana, Yiming Wang, Alexandre Bernardino, Alessio Del Bue, José Santos-Victor | Istituto Italiano di Tecnologia (IIT),IST-ID,IST-ID ,,, ,,, ,,,,Instituto Superior Tecnico, University of Lisbon,Istituto Italiano di Tecnologia,Italian Institute of Technology (IIT),Fondazione Bruno Kessler,IST - Técnico Lisboa,Instituto Superior Técnico - Lisbon | Perception for Grasping and Manipulation II | | Goal-Conditioned Action Space Reduction for Deformable Object Manipulation | Shengyin Wang, Rafael Papallas, Matteo Leonetti, Mehmet Remzi Dogar | University of Leeds,King's College London | Perception for Grasping and Manipulation II | | MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes | Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nan-Ning Zheng | Xi'an Jiaotong University,Xi'an Jiaotong Univ. | Perception for Grasping and Manipulation II | | SCARP: 3D Shape Completion in ARbitrary Poses for Improved Grasping | Bipasha Sen, Aditya Agarwal, Gaurav Singh, Brojeshwar Bhowmick, Srinath Sridhar, Madhava Krishna | International Institute of Information Technology,IIIT Hyderabad,Tata Consultancy Services,Brown University | Perception for Grasping and Manipulation II | | Category-Level Shape Estimation for Densely Cluttered Objects | Zhenyu Wu, Ziwei Wang, Jiwen Lu, Haibin Yan | Beijing University of Posts and Telecommunications,Tsinghua University | Perception for Grasping and Manipulation II | | Counter-Hypothetical Particle Filters for Single Object Pose Tracking | Elizabeth Olson, Jana Pavlasek, Jasmine Berry, Odest Chadwicke Jenkins | University of Michigan | Perception for Grasping and Manipulation II | | Reinforcement Learning Based Pushing and Grasping Objects from Ungraspable Poses | Hao Zhang, Hongzhuo Liang, Lin Cong, Jianzhi Lyu, Long Zeng, Pingfa Feng, Jianwei Zhang | Tsinghua University,University of Hamburg | Learning for Grasping and Manipulation II | | Efficient Bimanual 
Handover and Rearrangement Via Symmetry-Aware Actor-Critic Learning | Yunfei Li, Chaoyi Pan, Huazhe Xu, Xiaolong Wang, Yi Wu | Tsinghua University,UC San Diego | Learning for Grasping and Manipulation II | | EDO-Net: Learning Elastic Properties of Deformable Objects from Graph Dynamics | Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael Welle, David Held, Zackory Erickson, Danica Kragic | KTH Royal Institute of Technology,Carnegie Mellon University,KTH | Learning for Grasping and Manipulation II | | Edge Grasp Network: A Graph-Based SE(3)-Invariant Approach to Grasp Detection | Haojie Huang, Dian Wang, Xupeng Zhu, Robin Walters, Robert Platt | Northeastern University | Learning for Grasping and Manipulation II | | Learning Dexterous Manipulation from Exemplar Object Trajectories and Pre-Grasps | Sudeep Dasari, Abhinav Gupta, Vikash Kumar | Carnegie Mellon University,Meta AI | Learning for Grasping and Manipulation II | | A Multi-Agent Approach for Adaptive Finger Cooperation in Learning-Based In-Hand Manipulation | Lingfeng Tao, Jiucai Zhang, Michael Bowman, Xiaoli Zhang | Colorado School of Mines,Guangzhou Automotive Group R&D Center, Silicon Valley | Learning for Grasping and Manipulation II | | Bimanual Rope Manipulation Skill Synthesis through Context Dependent Correction Policy Learning from Human Demonstration | Baturhan Akbulut, Tuba Girgin, Arash Mehrabi, Minoru Asada, Emre Ugur, Erhan Oztop | Boğaziçi University,Bogazici University,Ozyegin University,Open and Transdisciplinary Research Initiatives, Osaka Universit,Osaka University / Ozyegin University | Learning for Grasping and Manipulation II | | Sim-And-Real Reinforcement Learning for Manipulation: A Consensus-Based Approach | Wenxing Liu, Hanlin Niu, Wei Pan, Guido Herrmann, Joaquin Carrasco | United Kingdom Atomic Energy Authority,Delft University of Technology,The University of Manchester | Learning for Grasping and Manipulation II | | AutoBag: Learning to Open Plastic Bags and Insert 
Objects | Lawrence Yunliang Chen, Baiyu Shi, Daniel Seita, Richard Cheng, Thomas Kollar, David Held, Ken Goldberg | UC Berkeley,Carnegie Mellon University,California Institute of Technology,Toyota Research Institute | Learning for Grasping and Manipulation II | | Toward Fine Contact Interactions: Learning to Control Normal Contact Force with Limited Information | Jinda Cui, Jiawei Xu, David Saldana, Jeffrey Trinkle | Honda Research Institute USA, Inc.,Lehigh University | Learning for Grasping and Manipulation II | | Ditto in the House: Building Articulation Models of Indoor Scenes through Interactive Perception | Cheng-chun Hsu, Zhenyu Jiang, Yuke Zhu | The University of Texas at Austin,The Unversity of Texas at Austin | Learning for Grasping and Manipulation II | | Zero-Shot Transfer of Haptics-Based Object Insertion Policies | Samarth Brahmbhatt, Ankur Deka, Andrew Spielberg, Matthias Mueller | Intel Corporation,Intel Labs,Harvard University, MIT,Intel | Learning for Grasping and Manipulation II | | Moment-Based Kalman Filter: Nonlinear Kalman Filtering with Exact Moment Propagation | Yutaka Shimizu, Ashkan Jasour, Maani Ghaffari, Shinpei Kato | TIER IV,MIT,University of Michigan,The University of Tokyo | Localization II | | Unsupervised Quality Prediction for Improved Single-Frame and Weighted Sequential Visual Place Recognition | Helen Carson, Jason Ford, Michael J Milford | Queensland University of Technology | Localization II | | Towards Consistent Batch State Estimation Using a Time-Correlated Measurement Noise Model | David Juny Yoon, Timothy Barfoot | University of Toronto | Localization II | | A Probabilistic Framework for Visual Localization in Ambiguous Scenes | Fereidoon Zangeneh, Leonard Bruns, Amit Dekel, Alessandro Pieropan, Patric Jensfelt | KTH Royal Institute of Technology,Univrses AB,KTH,KTH - Royal Institute of Technology | Localization II | | RoLM: Radar on LiDAR Map Localization | Yukai Ma, Xiangrui Zhao, Han Li, Yaqing Gu, Xiaolei Lang, Yong 
Liu | zhejiang university,Zhejiang University | Localization II | | Direct LiDAR-Inertial Odometry: Lightweight LIO with Continuous-Time Motion Correction | Kenny Chen, Ryan Nemiroff, Brett Lopez | University of California, Los Angeles | Localization II | | Large-Scale Radar Localization Using Online Public Maps | Ziyang Hong, Y. R. Petillot, Kaicheng Zhang, Shida Xu, Sen Wang | Heriot-Watt University,Imperial College London | Localization II | | Continuous-Time LiDAR-Inertial-Vehicle Odometry Method with Lateral Acceleration Constraint | Bin He, Weichen Dai, Zeyu Wan, Hong Zhang, Yu Zhang | Zhejiang University,Hangzhou Dianzi University | Localization II | | Cross-Modal Monocular Localization in Prior LiDAR Maps Utilizing Semantic Consistency | Zhang Chi, Hengwang Zhao, Chunxiang Wang, Xuanlai Tang, Ming Yang | Shanghai Jiao Tong University,Shanghai Jiaotong University,KEENON Robotics Co., Ltd | Localization III | | Multi-State Tightly-Coupled EKF-Based Radar-Inertial Odometry with Persistent Landmarks | Jan Michalczyk, Roland Jung, Christian Brommer, Stephan Weiss | University of Klagenfurt,Universität Klagenfurt | Localization III | | Loc-NeRF: Monte Carlo Localization Using Neural Radiance Fields | Dominic Maggio, Marcus Abate, Jingnan Shi, Courtney Mario, Luca Carlone | Massachusetts Institute of Technology,MIT,Draper | Localization III | | RoSS: Rotation-Induced Aliasing for Audio Source Separation | Hyungjoo Seo, Sahil Bhandary Karnoor, Romit Roy Choudhury | University of Illinois at Urbana-Champaign | Localization III | | L-C*: Visual-Inertial Loose Coupling for Resilient and Lightweight Direct Visual Localization | Shuji Oishi, Kenji Koide, Masashi Yokozuka, Atsuhiko Banno | National Institute of Advanced Industrial Science and Technology (AIST),National Institute of Advanced Industrial Science and Technology,Nat. Inst. 
of Advanced Industrial Science and Technology,National Instisute of Advanced Industrial Science and Technology | Localization III | | GRM: Gradient Rectification Module for Visual Place Retrieval | Boshu Lei, Wenjie Ding, Limeng Qiao, Xi Qiu | University of Pennsylvania,MEGVII Inc,Megvii Inc.,Megvii | Localization III | | DytanVO: Joint Refinement of Visual Odometry and Motion Segmentation in Dynamic Environments | Shihao Shen, Yilin Cai, Wenshan Wang, Sebastian Scherer | Carnegie Mellon University | Localization III | | NOCaL: Calibration-Free Semi-Supervised Learning of Odometry and Camera Intrinsics | Ryan Griffiths, Jack Naylor, Donald G Dansereau | University of Sydney | Localization III | | Efficient View Path Planning for Autonomous Implicit Reconstruction | Jing Zeng, Yanxu Li, Yunlong Ran, Shuo Li, Shibo He, Fei Gao, Lincheng Li, Jiming Chen, Qi Ye | Zhejiang University,NetEase Fuxi AI Lab | Vision-Based Navigation II | | Lighthouses and Global Graph Stabilization: Active SLAM for Low-Compute, Narrow-FoV Robots | Mohit Deshpande, Richard Kim, Dhruva Kumar, Jong Jin Park, James Zamiska | Amazon Lab,,,,Amazon, Lab,,,,Amazon | Vision-Based Navigation II | | ExAug: Robot-Conditioned Navigation Policies Via Geometric Experience Augmentation | Noriaki Hirose, Dhruv Shah, Ajay Sridhar, Sergey Levine | UC Berkeley / TOYOTA Motor North America,University of California, Berkeley,UC Berkeley | Vision-Based Navigation II | | Multi-Object Navigation in Real Environments Using Hybrid Policies | Assem Sadek, Guillaume Bono, Boris Chidlovskii, Atilla Baskurt, Christian Wolf | Naver Labs Europe,Naverlabs Europe,INSA Lyon | Vision-Based Navigation II | | AeriaLPiPS: A Local Planner for Aerial Vehicles with Geometric Collision Checking | Justin Smith, Patricio A. 
Vela | Georgia Institute of Technology | Vision-Based Navigation II | | Frontier Semantic Exploration for Visual Target Navigation | Bangguo Yu, Hamidreza Kasaei, Ming Cao | University of Groningen | Vision-Based Navigation II | | VINet: Visual and Inertial-Based Terrain Classification and Adaptive Navigation Over Unknown Terrain | Tianrui Guan, Ruitao Song, Zhixian Ye, Liangjun Zhang | University of Maryland,Aptiv Corporation,Baidu | Vision-Based Navigation II | | Ground Then Navigate: Language-Guided Navigation in Dynamic Scenes | Kanishk Jain, Varun Chhangani, Amogh Tiwari, Madhava Krishna, Vineet Gandhi | IIIT Hyderabad | Vision-Based Navigation II | | 3-Dimensional Sonic Phase-Invariant Echo Localization | Christopher Hahne | University of Bern | Localization and Mapping II | | Calibration and Uncertainty Characterization for Ultra-Wideband Two-Way-Ranging Measurements | Mohammed A. Shalaby, Charles Champagne Cossette, James Richard Forbes, Jerome Le Ny | McGill University,Polytechnique Montreal | Localization and Mapping II | | High Resolution Point Clouds from mmWave Radar | Akarsh Prabhakara, Tao Jin, Arnav Das, Gantavya Bhatt, Lilly Kumari, Elahe Soltanaghai, Jeff Bilmes, Swarun Kumar, Anthony Rowe | Carnegie Mellon University,University of Washington,University of Illinois Urbana-Champaign | Localization and Mapping II | | Pyramid Learnable Tokens for 3D LiDAR Place Recognition | Congcong Wen, Hao Huang, Yu-shen Liu, Yi Fang | New York University Abu Dhabi,New York University,Tsinghua University | Localization and Mapping II | | A Decoupled and Linear Framework for Global Outlier Rejection Over Planar Pose Graph | Tianyue Wu, Fei Gao | Zhejiang University | Localization and Mapping II | | Robust Incremental Smoothing and Mapping (riSAM) | Daniel Mcgann, John G. 
Rogers Iii, Michael Kaess | Carnegie Mellon University,US Army Research Laboratory | Localization and Mapping II | | Real-Time Simultaneous Localization and Mapping with LiDAR Intensity | Wenqiang Du, Giovanni Beltrame | Polytechnique Montreal,Ecole Polytechnique de Montreal | Localization and Mapping II | | iMODE: Real-Time Incremental Monocular Dense Mapping Using Neural Field | Hidenobu Matsuki, Edgar Sucar, Tristan Laidlow, Kentaro Wada, Raluca Scona, Andrew J Davison | Imperial College London,Mujin, Inc.,Ocado Technology | Localization and Mapping II | | Probabilistic Uncertainty Quantification of Prediction Models with Application to Visual Localization | Junan Chen, Josephine Monica, Wei-Lun Chao, Mark Campbell | Cornell University | Localization and Mapping II | | Extrinsic Calibration for Highly Accurate Trajectories Reconstruction | Maxime Vaidis, William Dubois, Alexandre Guénette, Johann Laconte, Vladimir Kubelka, Francois Pomerleau | Université Laval,University of Toronto,Örebro University | Localization and Mapping II | | Cerberus: Low-Drift Visual-Inertial-Leg Odometry for Agile Locomotion | Shuo Yang, Zixin Zhang, Zhengyu Fu, Zachary Manchester | Carnegie Mellon University,The Hong Kong University of Science and Technology | Localization and Mapping II | | Ensembles of Compact, Region-Specific & Regularized Spiking Neural Networks for Scalable Place Recognition | Somayeh Hussaini, Michael J Milford, Tobias Fischer | Queensland University of Technology | Localization and Mapping II | | Line As a Visual Sentence: Context-Aware Line Descriptor for Visual Localization | Sungho Yoon, Ayoung Kim | NAVER LABS,Seoul National University | Localisation 1 | | Robust Visual Localization of a UAV Over a Pipe-Rack Based on the Lie Group SE(3) | Vincenzo Lippiello, Jonathan Cacace | University of Naples FEDERICO II,University of Naples | Localisation 1 | | Finding the Right Place: Sensor Placement for UWB Time Difference of Arrival Localization in Cluttered Indoor 
Environments | Zhao Wenda, Abhishek Goudar, Angela P. Schoellig | University of Toronto,TU Munich | Localisation 1 | | EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale | Jacek Komorowski, Monika Wysoczanska, Tomasz Trzcinski | Warsaw University of Technology | Localisation 1 | | Stein Particle Filter for Nonlinear, Non-Gaussian State Estimation | Fahira Afzal Maken, Fabio Ramos, Lionel Ott | Data,,, CSIRO,University of Sydney, NVIDIA,ETH Zurich | Localisation 1 | | Faster-LIO: Lightweight Tightly Coupled Lidar-Inertial Odometry Using Parallel Sparse Incremental Voxels | Chunge Bai, Tao Xiao, Yajie Chen, Haoqian Wang, Fang Zhang, Xiang Gao | Tsinghua University,Beijing Idriverplus Technology Co. Ltd.,IDRIVERPLUS,Beijing Idriverplus Technology Co., Ltd.,idriverplus.com | Localisation 1 | | Homography-Based Loss Function for Camera Pose Regression | Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Vincent Hugel | Ifremer,Université de Toulon, Aix Marseille Univ, CNRS, LIS,Université de Toulon,University of Toulon | Localisation 1 | | Broadband Sound Source Localization Via Non-Synchronous Measurements for Service Robots: A Tensor Completion Approach | Long Chen, Weize Sun, Lei Huang, Liang Yu | Northwestern Polytechnical University,Shenzhen University,Shanghai Jiao Tong University | Localisation 1 | | Proprioceptive Soft Pneumatic Gripper for Extreme Environments Using Hybrid Optical Fibers | Babar Jamil, Gyeongjae Yoo, Youngjin Choi, Hugo Rodrigue | Sungkyunkwan University,University of Rochester,Hanyang University | Soft Sensors and Actuators | | Modeling and Characterizing Two Dielectric Elastomer Folding Actuators for Origami-Inspired Robot | Li Yang, Ting Zhang | Soochow University | Soft Sensors and Actuators | | Deployable Soft Pneumatic Networks (D-PneuNets) Actuator with Dual-Morphing Origami Chambers for High Compactness | Woongbae Kim, Bada Seo, Sung Yol Yu, Kyu-Jin Cho | Korea Institute of 
Science and Technology,Seoul National University,Seoul National University, Biorobotics Laboratory | Soft Sensors and Actuators | | Soft Fluidic Actuator for Locomotion in Multi-Phase Environments | Roza Gkliva, Maarja Kruusmaa | Tallinn University of Technology | Soft Sensors and Actuators | | Contact Surface and Pose Recognition: Utilizing Multipole Magnetic Tactile Sensor with Meta Learning Model | Ziwei Xia, Bin Fang, Fuchun Sun, Huaping Liu, Wei Feng Xu, Ling Fu, Yiyong Yang | China University of Geosciences, Haidian District, Beijing, Chin,Tsinghua university,Tsinghua University,Siemens Ltd., China,School of Engineering and Technology, China University of Geosci | Soft Sensors and Actuators | | Force/Torque-Sensorless Joint Stiffness Estimation in Articulated Soft Robots | Maja Trumic, Giorgio Grioli, Kosta Jovanovic, Adriano Fagiolini | University of Belgrade,Istituto Italiano di Tecnologia,University of Belgrade, Serbia,University of Palermo | Soft Sensors and Actuators | | Retractable Locking System Driven by Shape Memory Alloy Actuator for Lightweight Soft Robotic Application | Young Jin Gong, Seong Taek Hwang, Sang Yul Yang, Kihyeon Kim, Jae Hyeong Park, Hosang Jung, Dongsu Shin, Hyouk Ryeol Choi | SungKyunKwan university(SKKU),Sungkyunkwan University(SKKU),Sungkyunkwan university,Sungkyunkwan University,Sungkwunkwan University | Soft Sensors and Actuators | | Elastic-Actuation Mechanism for Repetitive Hopping Based on Power Modulation and Cyclic Trajectory Generation | Won Dong Shin, William Stewart, Matthew Estrada, Auke Ijspeert, Dario Floreano | EPFL,Ecole Polytechnique Federale de Lausanne,École polytechnique fédérale de Lausanne,Ecole Polytechnique Federal, Lausanne | Soft Sensors and Actuators | | Learning-Based Fabric Folding and Box Wrapping | Xiaoman Wang, Jie Zhao, Xin Jiang, Yunhui Liu | Harbin Institute of Technology, Shenzhen,Chinese University of Hong Kong | Manipulation and Grasping I | | Few-Shot Instance Grasping of Novel Objects in 
Clutter | Weikun Guo, Wei Li, Ziye Hu, Zhongxue Gan | Fudan University,ENN Group | Manipulation and Grasping I | | TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and a Grasping Baseline | Hongjie Fang, Hao-shu Fang, Sheng Xu, Cewu Lu | Shanghai Jiao Tong University,ShangHai Jiao Tong University | Manipulation and Grasping I | | Dual-Arm Control for Coordinated Fast Grabbing and Tossing of an Object | Michael Bombile, Aude G. Billard | Ecole Polytechnique Federale de Lausanne (EPFL),EPFL | Manipulation and Grasping I | | RBO Hand 3 - a Platform for Soft Dexterous Manipulation | Steffen Puhlmann, Jason Harris, Oliver Brock | TU Berlin,Technische Universitaet Berlin,Technische Universität Berlin | Manipulation and Grasping I | | A Multi-DoF Exoskeleton Haptic Device for the Grasping of a Compliant Object Adapting to a User's Motion Using Jamming Transitions | Ryohei Michikawa, Takahiro Endo, Fumitoshi Matsuno | Kyoto university,Kyoto University | Manipulation and Grasping I | | Peg-In-Hole Assembly with Dual-Arm Robot and Dexterous Robot Hands | Dong-Hyuk Lee, Myoung-su Choi, Hyeonjun Park, Ga-ram Jang, Jae-Han Park, Ji-Hun Bae | Korea Institute of Industrial Technology (KITECH),KITECH, UST,Korea Institute of Robotics & Technology Convergence,Korea Institute of Industrial Technology | Manipulation and Grasping I | | Manipulation Planning Using Wave Variables | Phongsaen Pitakwatchara, Jetnipit Arunrat | Chulalongkorn University,Chula university | Manipulation and Grasping I | | Active Inference and Behavior Trees for Reactive Action Planning and Execution in Robotics | Corrado Pezzato, Carlos Hernandez Corbato, Stefan Bonhof, Martijn Wisse | Delft University of Technology,TU Delft | Manipulation and Grasping I | | Physically Consistent Preferential Bayesian Optimization for Food Arrangement | Yuhwan Kwon, Yoshihisa Tsurumine, Takeshi Shimmura, Sadao Kawamura, Takamitsu Matsubara | Nara Institute of Science and Technology,Ritsumeikan 
University | Human Centered and Inspired Robotics | | Multi-Objective Trajectory Optimization to Improve Ergonomics in Human Motion | Waldez Gomes, Pauline Maurice, Eloise Dalin, Jean-Baptiste Mouret, Serena Ivaldi | Université Paris-Saclay,CNRS - LORIA,INRIA,Inria | Human Centered and Inspired Robotics | | Interactive Dynamic Walking: Learning Gait Switching Policies with Generalization Guarantees | Prem Chand, Sushant Veer, Ioannis Poulakakis | University of Delaware,NVIDIA | Human Centered and Inspired Robotics | | Deep Predictive Model Learning with Parametric Bias: Handling Modeling Difficulties and Temporal Model Changes | Kento Kawaharazuka, Kei Okada, Masayuki Inaba | The University of Tokyo | Human Centered and Inspired Robotics | | Power-Based Velocity-Domain Variable Structure Passivity Signature Control for Physical Human-(Tele)Robot Interaction | Peter Paik, Smrithi Thudi, S. Farokh Atashzar | New York University,New York University (NYU), US | Human Centered and Inspired Robotics | | Human-Multirobot Collaborative Mobile Manipulation: The Omnid Mocobots | Matthew Elwin, Billie Strong, Randy Freeman, Kevin Lynch | Northwestern University | Human Centered and Inspired Robotics | | TransDSSL: Transformer Based Depth Estimation Via Self-Supervised Learning | Daechan Han, Jeongmin Shin, Namil Kim, Soonmin Hwang, Yukyung Choi | Sejong university,Sejong University,NAVER LABS,Carnegie Mellon University | Deep Learning for Visual Perception | | Stereo Plane R-CNN: Accurate Scene Geometry Reconstruction Using Planar Segments and Camera-Agnostic Representation | Jan Wietrzykowski, Dominik Belter | Poznan University of Technology | Deep Learning for Visual Perception | | Object-Aware Monocular Depth Prediction with Instance Convolutions | Enis Simsar, Evin Pınar Örnek, Fabian Manhardt, Helisa Dhamo, Nassir Navab, Federico Tombari | ETH Zurich,Technical University of Munich,Google,Technische Universität München,TU Munich | Deep Learning for Visual Perception | | 
Uncertainty Guided Policy for Active Robotic 3D Reconstruction Using Neural Radiance Fields | Soomin Lee, Le Chen, Jiahao Wang, Alexander Liniger, Suryansh Kumar, Fisher Yu | Oracle,ETH Zurich,ETH Zürich | Deep Learning for Visual Perception | | Detaching and Boosting: Dual Engine for Scale-Invariant Self-Supervised Monocular Depth Estimation | Peizhe Jiang, Wei Yang, Xiaoqing Ye, Xiao Tan, Meng Wu | Northwestern Polytechnical University,Baidu,Baidu Inc. | Deep Learning for Visual Perception | | Lidar Upsampling with Sliced Wasserstein Distance | Artem Savkin, Yida Wang, Sebastian Wirkert, Nassir Navab, Federico Tombari | TUM,Technical University of Munich,German Cancer Research Center,TU Munich,Technische Universität München | Deep Learning for Visual Perception | | Accurate 3D Single Object Tracker with Local-To-Global Feature Refinement | Baojie Fan, Kai Wang, Wuyang Zhou, Yu Shi Yang, Kaiwei Ma, Guoping Jiang | Nanjing University of Posts and Telecommunications,Nanjing University of Posts and Telecommunications | Deep Learning for Visual Perception | | Aggregation Functions for Simultaneous Attitude and Image Estimation with Event Cameras at High Angular Rates | Matthew Ng, Zi Min Er, Gim Song Soh, Shaohui Foong | Singapore University of Technology and Design | Aerial Robots and Autonomous Agents | | RAST: Risk-Aware Spatio-Temporal Safety Corridors for MAV Navigation in Dynamic Uncertain Environments | Gang Chen, Siyuan Wu, Moji Shi, Wei Dong, Hai Zhu, Javier Alonso-Mora | Delft University of Technology,Shanghai Jiao Tong University,Chinese Academy of Military Sciences | Aerial Robots and Autonomous Agents | | Energy Aware Impedance Control of a Flying End-Effector in the Port-Hamiltonian Framework | Ramy Rashad, Davide Bicego, Jelle Zult, Santiago Sanchez-escalonilla, Ran Jiao, Antonio Franchi, Stefano Stramigioli | University of Twente,Bond High Performance ,D Technology,Beihang University | Aerial Robots and Autonomous Agents | | Momentum-Based Extended 
Kalman Filter for Thrust Estimation on Flying Multibody Robots | Hosameldin Awadalla Omer Mohamed, Gabriele Nava, Giuseppe L'erario, Silvio Traversaro, Fabio Bergonti, Luca Fiorio, Punith Reddy Vanteddu, Francesco Braghin, Daniele Pucci | Italian Institute of Technology,Istituto Italiano di Tecnologia,Politecnico di Milano | Aerial Robots and Autonomous Agents | | Overcoming Bias: Equivariant Filter Design for Biased Attitude Estimation with Online Calibration | Alessandro Fornasier, Yonhon Ng, Christian Brommer, Christoph Böhm, Robert Mahony, Stephan Weiss | University of Klagenfurt,Australian National University,University Klagenfurt,Universität Klagenfurt | Aerial Robots and Autonomous Agents | | DIDER: Discovering Interpretable Dynamically Evolving Relations | Enna Sachdeva, Chiho Choi | Honda Research Institute | Aerial Robots and Autonomous Agents | | A Global Max-Flow-Based Multi-Resolution Next-Best-View Method for Reconstruction of 3D Unknown Objects | Sicong Pan, Hui Wei | Fudan University | Aerial Robots and Autonomous Agents | | A Stack-Of-Tasks Approach Combined with Behavior Trees: A New Framework for Robot Control | David Caceres Dominguez, Marco Iannotta, Johannes A. 
Stork, Erik Schaffernicht, Todor Stoyanov | Örebro University,Orebro University,Örebro University, AASS Research Center | Aerial Robots and Autonomous Agents | | Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot | Tao Huang, Kai Chen, Bin Li, Yunhui Liu, Qi Dou | The Chinese University of Hong Kong,Chinese University of Hong Kong | Medical Robotics I | | Dual-Robot Collaborative System for Autonomous Venous Access Based on Ultrasound and Bioimpedance Sensing Technology | Maria Koskinopoulou, Alperen Acemoglu, Veronica Penza, Leonardo Mattos | Heriot Watt University,Istituto Italiano di Tecnologia | Medical Robotics I | | Vitreoretinal Surgical Robotic System with Autonomous Orbital Manipulation Using Vector-Field Inequalities | Yuki Koyama, Murilo Marinho, Kanako Harada | The University of Tokyo | Medical Robotics I | | Autonomous Needle Navigation in Retinal Microsurgery: Evaluation in Ex Vivo Porcine Eyes | Peiyao Zhang, Ji Woong Kim, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov | Johns Hopkins University,Johns Hopkins Medical Institute | Medical Robotics I | | Dynamic Modeling and Identification of a Robotic Intracardiac Echo Catheter | Mohammad Salehizadeh, Filipe Pedrosa, Harmanpreet Bassan, Rajnikant V. Patel, Jagadeesan Jayender | Harvard Medical School, Brigham and Women's Hospital,Western University,The University of Western Ontario | Medical Robotics I | | Modeling of a Robotic Transcatheter Delivery System | Namrata Unnikrishnan Nayar, Ronghuai Qi, Jaydev Desai | Georgia Institute of Technology, RoboMed Lab,Georgia Institute of Technology | Medical Robotics I | | A Handheld Hydraulic Cardiac Catheter with Omnidirectional Manipulator and Touch Sensing | Nguyen Chi Cong, James J. 
Davies, Mai Thanh Thai, Trung Thien Hoang, Phuoc Thien Phan, Kefan Zhu, Dang Bao Nhi Tran, Van Ho, Hung La, Hoang Phuong Phan, Nigel Lovell, Thanh Nho Do | University of New South Wales,UNSW Sydney,RMIT,Japan Advanced Institute of Science and Technology,University of Nevada at Reno,The University of Tokyo | Medical Robotics I | | Optimized Design and Analysis of Active Propeller-Driven Capsule Endoscopic Robot for Gastric Examination | Yi Zhang, Weihao Wang, Wende Ke, Chengzhi Hu | Southern University of Science and Technology | Medical Robotics I | | QuadMag: A Mobile-Coil System with Enhanced Magnetic Actuation Efficiency and Dexterity | Lidong Yang, Moqiu Zhang, Zhengxin Yang, Haojin Yang, Li Zhang | The Hong Kong Polytechnic University,The Chinese University of Hong Kong,The Chinese Univeristy of HongKong | Medical Robotics I | | Evaluating the Feasibility of Magnetic Tools for the Minimum Dynamic Requirements of Microneurosurgery | Cameron Forbrigger, Erik Fredin, Eric Diller | University of Toronto | Medical Robotics I | | A Novel Concentric Tube Steerable Drilling Robot for Minimally Invasive Treatment of Spinal Tumors Using Cavity and U-Shape Drilling Techniques | Susheela Sharma, Ji Hwan Park, Jordan P. Amadio, Mohsen Khadem, Farshid Alambeigi | University of Texas at Austin,The University of Texas at Austin,University of Texas Dell Medical School,University of Edinburgh | Medical Robotics I | | Magnetic Ball Chain Robots for Endoluminal Interventions | Giovanni Pittiglio, Margherita Mencattelli, Pierre Dupont | Harvard University,Boston Children's Hospital, Harvard Medical School,Children's Hospital Boston, Harvard Medical School | Medical Robotics I | | Robotic Navigation Autonomy for Subretinal Injection Via Intelligent Real-Time Virtual iOCT Volume Slicing | Shervin Dehghani, Michael Sommersperger, Peiyao Zhang, Alejandro Martin-gomez, Benjamin Busam, Peter Gehlbach, Nassir Navab, M. 
Ali Nasseri, Iulian Iordachita | TUM,Technical University of Munich,Johns Hopkins University,Johns Hopkins Medical Institute,TU Munich,Technische Universitaet Muenchen | Medical Imaging and Perception II | | 3D Reconstruction of Tibia and Fibula Using One General Model and Two X-Ray Images | Kai Pan, Shuai Zhang, Liang Zhao, Shoudong Huang, Yanhao Zhang, Hua Wang, Qi Luo | University of Technology Sydney,University of Technology, Sydney,Australian National University,Osteoarthropathy surgery department, Shenzhen People's Hospital | Medical Imaging and Perception II | | Semantic-SuPer: A Semantic-Aware Surgical Perception Framework for Endoscopic Tissue Classification, Reconstruction, and Tracking | Shan Lin, Albert Miao, Jingpei Lu, Shunkai Yu, Zih-Yun Chiu, Florian Richter, Michael Yip | University of California, San Diego,University of California San Diego,UC San Diego | Medical Imaging and Perception II | | Suture Thread Spline Reconstruction from Endoscopic Images for Robotic Surgery with Reliability-Driven Keypoint Detection | Neelay Joglekar, Fei Liu, Ryan Orosco, Michael Yip | University of California, San Diego,UCSD | Medical Imaging and Perception II | | CDFI: Cross Domain Feature Interaction for Robust Bronchi Lumen Detection | Jiasheng Xu, Tianyi Zhang, Yangqian Wu, Jie Yang, Guang-Zhong Yang, Yun Gu | Shanghai Jiao Tong University,Shanghai Jiaotong University,SJTU | Medical Imaging and Perception II | | Real-Time Constrained 6D Object-Pose Tracking of an In-Hand Suture Needle for Minimally Invasive Robotic Surgery | Zih-Yun Chiu, Florian Richter, Michael Yip | University of California, San Diego | Award Finalists 1 | | Exploring Robot-Assisted Optical Coherence Elastography for Surgical Palpation | Yeon Hee Chang, Elan Ahronovich, Nabil Simaan, Cheol Song | DGIST,Vanderbilt ARMA,Vanderbilt University | Award Finalists 1 | | Locate before Segment: Topology-Guided Retinal Layer Segmentation in Optical Coherence Tomography Images | Ye Lu, Yutian Shen, 
Xiaohan Xing, Max Qing Hu Meng | The Chinese University of Hong Kong | Medical Imaging and Perception II | | Visual Tracking of Needle Tip in 2D Ultrasound Based on Global Features in a Siamese Architecture | Wanquan Yan, Qingpeng Ding, Jianghua Chen, Kim Yan, Raymond Shing-yan Tang, Shing Shin Cheng | The Chinese University of HongKong,The Chinese University of Hong Kong,The Chinese University of Hong Kong, Department of Medicine and | Medical Imaging and Perception III | | Model-Based Pose Estimation of Steerable Catheters under Bi-Plane Image Feedback | Jared Lawson, Rohan Chitale, Nabil Simaan | Vanderbilt University,Vanderbilt University Medical Center | Medical Imaging and Perception III | | Pose Quality Prediction for Vision Guided Robotic Shoulder Arthroplasty | Morgan Windsor, Jing Peng, Ashish Gupta, Peter Pivonka, Michael J Milford | Queensland University of Technology,Queensland University of Technology (QUT) | Medical Imaging and Perception III | | Image Segmentation for Continuum Robots from a Kinematic Prior | Connor Watson, Anna Nguyen, Tania Morimoto | Morimoto Lab, UCSD,University of California San Diego | Medical Imaging and Perception III | | Robust Collaborative 3D Object Detection in Presence of Pose Errors | Yifan Lu, Quanhao Li, Baoan Liu, Mehrdad Dianati, Chen Feng, Siheng Chen, Yanfeng Wang | Shanghai Jiao Tong University,Nanjing University,Meta,University of Warwick,New York University | Object Detection I | | Joint Semi-Supervised and Active Learning Via 3D Consistency for 3D Object Detection | Sihwan Hwang, Sanmin Kim, YoungSeok Kim, Dongsuk Kum | Korea Advanced Institute of Science and Technology,KAIST | Object Detection I | | StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks | Hongyu Li, Zhengang Li, Neset Unver Akmandor, Huaizu Jiang, Yanzhi Wang, Taskin Padir | Northeastern University | Object Detection I | | Perceiving Unseen 3D Objects by Poking the Objects | 
Linghao Chen, Yunzhou Song, Hujun Bao, Xiaowei Zhou | Zhejiang University | Object Detection I | | MonoPGC: Monocular 3D Object Detection with Pixel Geometry Contexts | Zizhang Wu, Yuanzhu Gan, Wang Robin, Guilian Chen, Jian Pu | Zongmu Technology,Fudan University | Object Detection I | | CrossDTR: Cross-View and Depth-Guided Transformers for 3D Object Detection | Ching-yu Tseng, Yi-rong Chen, Hsin-ying Lee, Tsung-han Wu, Wen-chin Chen, Winston Hsu | National Taiwan University | Object Detection I | | DOTIE - Detecting Objects through Temporal Isolation of Events Using a Spiking Architecture | Manish Nagaraj, Chamika Mihiranga Liyanagedera, Kaushik Roy | Purdue University | Object Detection I | | CEAFFOD: Cross-Ensemble Attention-Based Feature Fusion Architecture towards a Robust and Real-Time UAV-Based Object Detection in Complex Scenarios | Ahmed Elhagry, Hang Dai, Abdulmotaleb El Saddik, Wail Gueaieb, Giulia De Masi | MBZUAI,Mohamed bin Zayed University of Artificial Intelligence,University of Ottawa,Technology Innovation Institute | Object Detection I | | Test Time Domain Adaptation for Monocular Depth Estimation | Zhi Li, Shaoshuai Shi, Bernt Schiele, Dengxin Dai | Max Planck Institute for Informatics,Max Planck,ETH Zurich | Depth Estimation and RGB-D Sensing | | TODE-Trans: Transparent Object Depth Estimation with Transformer | Kang Chen, Shaochen Wang, Beihao Xia, Dongxu Li, Zhen Kan, Bin Li | University of Science and Technology of China,Huazhong University of Science and Technology | Depth Estimation and RGB-D Sensing | | Learning Depth Completion of Transparent Objects Using Augmented Unpaired Data | Floris Marc Arden Erich, Bruno Leme, Noriaki Ando, Ryo Hanai, Yukiyasu Domae | National Institute of Advanced Industrial Science and Technology,University of Florida,National Institute of Industrial Science and Technology(AIST),The National Institute of Advanced Industrial Science and Techno | Depth Estimation and RGB-D Sensing | | Lightweight Monocular Depth 
Estimation Via Token-Sharing Transformer | Dong-jae Lee, Jae Young Lee, Hyounguk Shon, Eojindl Yi, Yeong-hun Park, Sung-sik Cho, Junmo Kim | Korea Advanced Institute of Science & Technology (KAIST),Korea Advanced Institute of Science and Technology,KAIST,Hyundai Mobis | Depth Estimation and RGB-D Sensing | | Improved Event-Based Dense Depth Estimation Via Optical Flow Compensation | Dianxi Shi, Luoxi Jing, Ruihao Li, Zhe Liu, Huachi Xu, Lin Wang, Yi Zhang | Defense Innovation Institute,Peking University,National University of Defense Technology | Depth Estimation and RGB-D Sensing | | TTCDist: Fast Distance Estimation from an Active Monocular Camera Using Time-To-Contact | Levi Burner, Nitin Sanket, Cornelia Fermuller, Yiannis Aloimonos | University of Maryland, College Park,University of Maryland | Depth Estimation and RGB-D Sensing | | STEPS: Joint Self-Supervised Nighttime Image Enhancement and Depth Estimation | Yupeng Zheng, Chengliang Zhong, Pengfei Li, Huan-ang Gao, Yuhang Zheng, Bu Jin, Ling Wang, Hao Zhao, Guyue Zhou, Qichao Zhang, Dongbin Zhao | Institute of Automation,Chinese Academy of Sciences,Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University,Beihang University,Institute of Automation, Chinese Academy of Sciences,Xi’an Research Institute of High-Tech,Chinese Academy of Sciences | Depth Estimation and RGB-D Sensing | | FG-Depth: Flow-Guided Unsupervised Monocular Depth Estimation | Junyu Zhu, Lina Liu, Yong Liu, Wanlong Li, Feng Wen, Hongbo Zhang | zhejiang University,Zhejiang University,Beijing Huawei Digital Technologies Co., Ltd.,Huawei Technologies Co., Ltd,Huawei Technologies | Depth Estimation and RGB-D Sensing | | Light-Weight Pointcloud Representation with Sparse Gaussian Process | Mahmoud Ali, Lantao Liu | Indiana University | Depth Estimation and RGB-D Sensing | | Test-Time Synthetic-To-Real Adaptive Depth Estimation | Eojindl Yi, Junmo Kim | KAIST | Depth Estimation and RGB-D Sensing | | Unseen Object Instance 
Segmentation with Fully Test-Time RGB-D Embeddings Adaptation | Lu Zhang, Siqi Zhang, Xu Yang, Hong Qiao, Zhiyong Liu | Institute of Automation, Chinese Academy of Science,Chinese Academy of Sciences, Institute of Automation,Institute of Automation, Chinese Academy of Sciences,Institute of Automation Chinese Academy of Sciences | Depth Estimation and RGB-D Sensing | | Robust Double-Encoder Network for RGB-D Panoptic Segmentation | Matteo Sodano, Federico Magistri, Tiziano Guadagnino, Jens Behley, Cyrill Stachniss | Photogrammetry and Robotics Lab, University of Bonn,University of Bonn,Sapienza University of Rome | Depth Estimation and RGB-D Sensing | | Explain What You See: Open-Ended Segmentation and Recognition of Occluded 3D Objects | Hamed Ayoobi, Hamidreza Kasaei, Ming Cao, Rineke Verbrugge, Bart Verheij | Imperial College London,University of Groningen | 3D Vision | | GMCR: Graph-Based Maximum Consensus Estimation for Point Cloud Registration | Michael Gentner, Prajval Kumar Murali, Mohsen Kaboli | BMW Group and Technical University of Munich,BMW Group and University of Glasgow,BMW & Radboud University | 3D Vision | | Toward Cooperative 3D Object Reconstruction with Multi-Agent | Xiong Li, Zhenyu Wen, Zhou Leiqiang, Chenwei Li, Yejian Zhou, Taotao Li, Zhen Hong | Zhejiang University of Technology,Zhejiang University of technology,Zhejiang | 3D Vision | | SwinDepth: Unsupervised Depth Estimation Using Monocular Sequences Via Swin Transformer and Densely Cascaded Network | Dongseok Shim, H. Jin Kim | Seoul National University | 3D Vision | | GAN-Based Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback | Jie Huang, Jiangshan Hao, Rongshun Juan, Randy Gomez, Keisuke Nakamura, Guangliang Li | Ocean University of China,Tianjin University,Honda Research Institute Japan Co., Ltd. 
| Learning from Demonstration | | Demonstration-Guided Optimal Control for Long-Term Non-Prehensile Planar Manipulation | Teng Xue, Hakan Girgin, Teguh Santoso Lembono, Sylvain Calinon | Idiap Research Institute, EPFL,Idiap Research Institute | Learning from Demonstration | | Learning Reward Functions for Robotic Manipulation by Observing Humans | Minttu Alakuijala, Gabriel Dulac-arnold, Julien Mairal, Jean Ponce, Cordelia Schmid | Inria,Google,INRIA,Ecole Normale Supérieure | Learning from Demonstration | | Data-Driven Stochastic Motion Evaluation and Optimization with Image by Spatially-Aligned Temporal Encoding | Takeru Oba, Norimichi Ukita | Toyota Technological Institute | Learning from Demonstration | | Demonstration-Bootstrapped Autonomous Practicing Via Multi-Task Reinforcement Learning | Abhishek Gupta, Corey Lynch, Brandon Kinman, Garrett Peake, Sergey Levine, Karol Hausman | University of Washington,Google Brain,Google LLC,Google Inc,UC Berkeley | Learning from Demonstration | | Minimizing Human Assistance: Augmenting a Single Demonstration for Deep Reinforcement Learning | Abraham George, Alison Bartsch, Amir Barati Farimani | Carnegie Mellon University | Learning from Demonstration | | Learning Robotic Cutting from Demonstration: Non-Holonomic DMPs Using the Udwadia-Kalaba Method | Artūras Straižys, Michael Burke, Subramanian Ramamoorthy | University of Edinburgh,Monash University,The University of Edinburgh | Learning from Demonstration | | KRIS: A Novel Device for Kinesthetic Corrective Feedback During Robot Motion | Jorn Verhggen, Kim Baraka | Vrije universiteit,Vrije Universiteit Amsterdam | Learning from Demonstration | | Guided Learning from Demonstration for Robust Transferability | Fouad Sukkar, Victor Hernandez Moreno, Teresa A. 
Vidal-Calleja, Jochen Deuse | University of Technology Sydney | Learning from Demonstration | | One-Shot Visual Imitation Via Attributed Waypoints and Demonstration Augmentation | Matthew Chang, Saurabh Gupta | University of Illinois at Urbana-Champaign,UIUC | Learning from Demonstration | | Show Me What You Want: Inverse Reinforcement Learning to Automatically Design Robot Swarms by Demonstration | Ilyes Gharbi, Jonas Kuckling, David Garzon Ramos, Mauro Birattari | Université libre de Bruxelles,Université Libre de Bruxelles | Learning from Demonstration | | Immersive Demonstrations Are the Key to Imitation Learning | Kelin Li, Digby Chappell, Nicolas Rojas | Imperial College London | Learning from Demonstration | | DreamWaQ: Learning Robust Quadrupedal Locomotion with Implicit Terrain Imagination Via Deep Reinforcement Learning | I Made Aswin Nahrendra, Byeongho Yu, Hyun Myung | KAIST,KAIST (Korea Advanced Institute of Science and Technology) | Learning for Locomotion | | Learning Low-Frequency Motion Control for Robust and Dynamic Robot Locomotion | Siddhant Gangapurwala, Luigi Campanaro, Ioannis Havoutis | Sony AI,University of Oxford | Learning for Locomotion | | OPT-Mimic: Imitation of Optimized Trajectories for Dynamic Quadruped Behaviors | Yuni Fuchioka, Zhaoming Xie, Michiel Van De Panne | University of British Columbia,Stanford University | Learning for Locomotion | | Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments | Mingyo Seo, Ryan Gupta, Yifeng Zhu, Alexy Skoutnev, Luis Sentis, Yuke Zhu | The University of Texas at Austin,University of Texas at Austin | Learning for Locomotion | | Legs As Manipulator: Pushing Quadrupedal Agility Beyond Locomotion | Xuxin Cheng, Ashish Kumar, Deepak Pathak | Carnegie Mellon University,UC Berkeley | Learning for Locomotion | | Force Control for Robust Quadruped Locomotion: A Linear Policy Approach | Aditya Shirwatkar, Vamshi Kumar Kurva, Devaraju Vinoda, Aman Singh, Aditya Varma 
Sagi, Himanshu Lodha, Bhavya Giri Goswami, Shivam Sood, Ketan Nehete, Shishir Kolathaya | Indian Institute of Science Bengaluru,IISc,Indian Institute of Science, Bengaluru,Indian Institute of Science,Stoch Lab, Indian Institute of Science, Bengaluru,Indian Institute of Science (IISc), Bengaluru,Indian Institute of Technology Kharagpur | Learning for Locomotion | | Advanced Skills through Multiple Adversarial Motion Priors in Reinforcement Learning | Eric Vollenweider, Marko Bjelonic, Victor Klemm, Nikita Rudin, Joonho Lee, Marco Hutter | ETH, Microsoft,ETH Zurich,ETH Zurich, NVIDIA,ETH Zurich Robotic Systems Laboratory | Learning for Locomotion | | Deep Reinforcement Learning Based Personalized Locomotion Planning for Lower-Limb Exoskeletons | Javad K. Mehr, Edward Guo, Mojtaba Akbari, Vivian K. Mushahwar, Mahdi Tavakoli | University of Alberta,University of Calgary | Learning for Locomotion | | Expanding Versatility of Agile Locomotion through Policy Transitions Using Latent State Representation | Guilherme Christmann, Jonathan Hans Soeseno, Ying-sheng Luo, Wei-chao Chen | Inventec Corporation,Inventec Inc. | Learning for Locomotion | | Sim-To-Real Transfer for Quadrupedal Locomotion Via Terrain Transformer | Hang Lai, Weinan Zhang, Xialin He, Chen Yu, Zheng Tian, Yong Yu, Jun Wang | Shanghai Jiao Tong University,ShanghaiTech University,University College London | Learning for Locomotion | | Agile and Versatile Robot Locomotion Via Kernel-Based Residual Learning | Milo Carroll, Zhaocheng Liu, Mohammadreza Kasaei, Zhibin Li | Entreprenuer First,The University of Edinburgh,University of Edinburgh,University College London | Learning for Locomotion | | DribbleBot: Dynamic Legged Manipulation in the Wild | Yandong Ji, Gabriel Margolis, Pulkit Agrawal | MIT,Massachusetts Institute of Technology | Learning for Locomotion | | Knowledge Distillation for Feature Extraction in Underwater VSLAM | Jinghe Yang, Mingming Gong, Girish N. 
Nair, Jung Hoon Lee, Jason Monty, Ye Pu | The University of Melbourne,University of Melbourne | Marine Robotics III | | OysterNet: Enhanced Oyster Detection Using Simulation | Xiaomin Lin, Nitin Sanket, Nare Karapetyan, Yiannis Aloimonos | University of Maryland,University of Maryland, College Park | Marine Robotics III | | SyreaNet: A Physically Guided Underwater Image Enhancement Framework Integrating Synthetic and Real Images | Junjie Wen, Jinqiang Cui, Zhenjun Zhao, Ruixin Yan, Zhi Gao, Lihua Dou, Ben M. Chen | The Chinese University of Hong Kong,Peng Cheng Laboratory,Temasek Laboratories @ NUS,Beijing Institute of Technology,Chinese University of Hong Kong | Marine Robotics III | | Real-Time Dense 3D Mapping of Underwater Environments | Weihan Wang, Bharat Joshi, Nathaniel Burgdorfer, Konstantinos Batsos, Alberto Quattrini Li, Philippos Mordohai, Ioannis Rekleitis | Stevens Institute of Technology,University of South Carolina,Stevens Institute of Technology,Dartmouth College | Marine Robotics III | | SM/VIO: Robust Underwater State Estimation Switching between Model-Based and Visual Inertial Odometry | Bharat Joshi, Hunter Damron, Sharmin Rahman, Ioannis Rekleitis | University of South Carolina,Amazon | Marine Robotics III | | Image-Based Visual Servoing Switchable Leader-Follower Control of Heterogeneous Multi-Agent Underwater Robot System | Kanzhong Yao, Nathalie Bauschmann, Thies Lennart Alff, Wei Cheah, Daniel Andre Duecker, Keir Groves, Ognjen Marjanovic, Simon Watson | University of Manchester,Hamburg University of Technology,Technische Universität Hamburg,The University of Manchester,Technical University of Munich (TUM) | Marine Robotics III | | Buoyancy Enabled Autonomous Underwater Construction with Cement Blocks | Samuel Lensgraf, Devin Balkcom, Alberto Quattrini Li | Dartmouth College | Marine Robotics III | | Mapping Waves with an Uncrewed Surface Vessel Via Gaussian Process Regression | Thomas Sears, Michael Riley Cooper, Joshua Marshall | Queen's 
University | Marine Robotics III | | Enforcing Constraints for Dynamic Obstacle Avoidance by Compliant Robots | Leonidas Koutras, Konstantinos Vlachos, George Kanakis, Fotios Dimeas, Zoe Doulgeri, George Rovithakis | Aristotle University of Thessaloniki,Aristotel University of Thessaloniki | Compliance and Impedance Control | | Increasing Admittance of Industrial Robots by Velocity Feedback Inner-Loop Shaping | Kangwagye Samuel, Kevin Haninger, Sehoon Oh | DGIST,Fraunhofer IPK | Compliance and Impedance Control | | Bounded Compensation with Friction Estimation for Accurate Motion Tracking and Compliant Behavior of Industrial Manipulators | Dongwoo Ko, Donghyeon Lee, Wan Kyun Chung, Keehoon Kim | POSTECH,Pohang University of Science and Technology(POSTECH),POSTECH, Pohang University of Science and Technology | Compliance and Impedance Control | | A Passivity-Based Approach on Relocating High-Frequency Robot Controller to the Edge Cloud | Xiao Chen, Hamid Sadeghian, Lingyun Chen, Mario Troebinger, Abdalla Swikir, Abdeldjallil Naceri, Sami Haddadin | Technical University of Munich | Compliance and Impedance Control | | A Framework for Simultaneous Workpiece Registration in Robotic Machining Applications | Steffan Lloyd, Rishad Irani, Mojtaba Ahmadi | Carleton University | Compliance and Impedance Control | | Contact Force Control with Continuously Compliant Robotic Legs | Robin Bendfeld, C. David Remy | University of Stuttgart | Award Finalists 1 | | Generalization of Impact Response Factors for Proprioceptive Collaborative Robots | Carlos Relaño, Daniel Sanz-merodio, Miguel López Estévez, Concepción A. 
Monje | University Carlos III of Madrid,Arquimea Research Center | Compliance and Impedance Control | | Robotic Fastening with a Manual Screwdriver | Ling Tang, Yan-Bin Jia | Iowa State University | Compliance and Impedance Control | | Model and Acceleration-Based Pursuit Controller for High Performance Autonomous Racing | Jonathan Becker, Nadine Imholz, Luca Schwarzenbach, Edoardo Ghignone, Nicolas Baumann, Michele Magno | ETH Zurich,ETH | Robot Control | | Extremum Seeking-Based Adaptive Sliding Mode Control with Sliding Perturbation Observer for Robot Manipulators | Muhammad Hamza Khan, Min Cheol Lee | Pusan National University.,Pusan National University | Robot Control | | Experimental Validation of Functional Iterative Learning Control on a One-Link Flexible Arm | Sjoerd Drost, Pietro Pustina, Franco Angelini, Alessandro De Luca, Gerwin Smit, Cosimo Della Santina | Delft University of Technology, Delft, The Netherlands,Sapienza University of Rome,University of Pisa,Delft University of Technology,TU Delft | Robot Control | | Robust Output Feedback Controller for a Serial Robotic Manipulator with Unknown Nonlinearities and External Disturbances | Mohammad Al Saaideh, Almuatazbellah Boker, Mohammad Al Janaideh | Memorial University of Newfoundland,Virginia Tech,Memorial University &University of Toronto | Robot Control | | Collaborative Control Based on Payload Leading for Multi-Quadrotors Transportation Systems | Yuan Ping, Mingming Wang, Juntong Qi, Chong Wu, Jinjin Guo | Tianjin University,Shanghai University,EFY Intelligent Control (Tianjin) Technology Co., Ltd | Robot Control | | Torque Control with Joints Position and Velocity Limits Avoidance | Venus Pasandi, Daniele Pucci | Femto-st Institute,Italian Institute of Technology | Robot Control | | Low-Level Controller in Response to Changes in Quadrotor Dynamics | Jaekyung Cho, Chan Kim, Mohamed Khalid M Jaffar, Michael W. 
Otte, Seong-woo Kim | Seoul national university,Seoul National University,University of Maryland, College Park,University of Maryland | Robot Control | | Biodegradable Origami Gripper Actuated with Gelatin Hydrogel for Aerial Sensor Attachment to Tree Branches | Christian Geckeler, Benito Armas Pizzani, Stefano Mintchev | ETH Zürich,ETH Zurich | Manipulation and Control | | PARSEC: An Aerial Platform for Autonomous Deployment of Self-Anchoring Payloads on Natural Vertical Surfaces | Patrick Spieler, Skylar Wei, Monica Li, Andrew Galassi, Kyle Uckert, Arash Kalantari, Joel Burdick | JPL,Caltech,UC Berkeley,Jet Propulsion Laboratory,NASA JPL,California Institute of Technology | Manipulation and Control | | Autonomous Control for Orographic Soaring of Fixed-Wing UAVs | Tom Suys, Sunyou Hwang, Guido De Croon, Bart Remes | Delft University of Technology,TU Delft | Manipulation and Control | | Stable Contact Guaranteeing Motion/Force Control for an Aerial Manipulator on an Arbitrarily Tilted Surface | Jeonghyun Byun, Byeongjun Kim, Changhyeon Kim, Donggeon David Oh, H. 
Jin Kim | Seoul National University | Manipulation and Control | | Design and Control of a Micro Overactuated Aerial Robot with an Origami Delta Manipulator | Eugenio Cuniato, Christian Geckeler, Maximilian Brunner, Dario Strübin, Elia Bähler, Fabian Ospelt, Marco Tognon, Stefano Mintchev, Roland Siegwart | ETH Zurich,ETH Zürich,Inria Rennes-Bretagne Atlantique | Manipulation and Control | | Simplifying Aerial Manipulation Using Intentional Collisions | Mark Nail, Nicholas Janne, Olivia Ma, Gabriel Arellano, Ella Atkins, Brent Gillespie | University of Michigan | Manipulation and Control | | Hierarchical Whole-Body Control of the Cable-Suspended Aerial Manipulator Endowed with Winch-Based Actuation | Yuri Sarkisov, Andre Coelho, Maihara Gabrieli Santos, Min Jun Kim, Dzmitry Tsetserukou, Christian Ott, Konstantin Kondak | SberAutoTech,German Aerospace Center (DLR),Instituto Tecnologico de Aeronautica,KAIST,Toyohashi University of Technology,TU Wien,German Aerospace Center | Manipulation and Control | | Heading for the Abyss: Control Strategies for Exploiting Swinging of a Descending Tethered Aerial Robot | Max Polzin, Frank Centamori, Josie Hughes | EPFL | Manipulation and Control | | Vector Field Aided Trajectory Tracking by a 10-Gram Flapping-Wing Micro Aerial Vehicle | Abdoullah Ndoye, José De Jesús Castillo Zamora, Sabrine Samorah-laki, Romain Miot, Edwin Van Ruymbeke, Franck Ruffier | Aix Marseille Université, CNRS, ISM and Gipsa-Lab,Aix-Marseille Universite, ISM CNRS,Aix Marseille Université, CNRS, ISM,XTIM - Bionic Bird,CNRS / Aix-Marseille Univ. | Manipulation and Control | | Globally Defined Dynamic Modelling and Geometric Tracking Controller Design for Aerial Manipulator | Byeongjun Kim, Dongjae Lee, Jeonghyun Byun, H. Jin Kim | Seoul National University | Manipulation and Control | | FlowDrone: Wind Estimation and Gust Rejection on UAVs Using Fast-Response Hot-Wire Flow Sensors | Nate Simon, Allen Z. 
Ren, Alex Pique, David Snyder, Daphne Barretto, Marcus Hultmark, Anirudha Majumdar | Princeton University | Manipulation and Control | | AutoCharge: Autonomous Charging for Perpetual Quadrotor Missions | Alessandro Saviolo, Jeffrey Mao, Roshan Balu Thalaivirithan Margabandu Balakr, Vivek Radhakrishnan, Giuseppe Loianno | New York University,Technology Innovation Institute, New York University | Manipulation and Control | | DQN-Based On-Line Path Planning Method for Automatic Navigation of Miniature Robots | Jialin Jiang, Lidong Yang, Li Zhang | The Chinese University of HONG KONG,The Hong Kong Polytechnic University,The Chinese University of Hong Kong | Micro Robotics | | Rendezvous and Docking of Magnetic Helical Microrobots Along Arc Orbits for Field-Directed Assembly and Disassembly | Shuideng Wang, Zejie Yu, Chaojian Hou, Kun Wang, Lixin Dong | City University of Hongkong,City University of Hong Kong | Micro Robotics | | MRI-Powered Magnetic Miniature Capsule Robot with HIFU-Controlled On-Demand Drug Delivery | Mehmet Efe Tiryaki, Fatih Doğangün, Cem Balda Dayan, Paul Wrede, Metin Sitti | Max Plank Institute for Intelligent Systems,Max Planck Institute for Intelligent Systems,Max Planck Institute for Intelligent Systems Stuttgart,Max-Planck Institute for Intelligent Systems | Award Finalists 1 | | Structural Design and Frequency Tuning of Piezoelectric Energy Harvesters Based on Topology Optimization | Abbas Homayouni-amlashi, Micky Rakotondrabe, Abdenbi Mohand-Ousaid | FEMTO-ST Institute, Université Bourgogne Franche,Laboratoire Génie de Production (LGP),University of Franche-Comte | Micro Robotics | | Input-Output Boundedness of a Magnetically-Actuated Helical Device | Leendert-Jan Wouter Ligtenberg, Islam S. M. 
Khalil | University of Twente | Micro Robotics | | Atomic-Level Tracking and Analyzing of Quantum-Dot Motion Steered by an Electrostatic Field Positioned by a Nanorobotic Manipulation Tip | Zhi Qu, Wenqi Zhang, Lixin Dong | City University of Hong Kong,City University of HongKong | Micro Robotics | | 3D-Printed Adaptive Microgripper Driven by Thin-Film NiTi Actuators | Sukjun Kim, Sarah Bergbreiter | Carnegie Mellon University | Micro Robotics | | Automatic Cell Rotation Method Based on Deep Reinforcement Learning | Huiying Gong, Yujie Zhang, Yaowei Liu, Qili Zhao, Xin Zhao, Mingzhu Sun | Nankai University | Micro Robotics | | Noncontact Particle Manipulation on Water Surface with Ultrasonic Phased Array System and Microscopic Vision | Yexin Zhang, Jiaqi Li, Yuyu Jia, Teng Li, Hu Su, Song Liu, David C. Jeong, Yang Wang | ShanghaiTech University,Tsinghua University,Institute of Automation, Chinese Academy of Science,Santa Clara University,Shanghaitech University | Micro Robotics | | Real-Time Acoustic Holography with Iterative Unsupervised Learning for Acoustic Robotic Manipulation | Chengxi Zhong, Zhenhuan Sun, Teng Li, Hu Su, Song Liu | ShanghaiTech University,Shanghaitech University,Tsinghua University,Institute of Automation, Chinese Academy of Science | Micro Robotics | | ROSMC: A High-Level Mission Operation Framework for Heterogeneous Robotic Teams | Ryo Sakagami, Sebastian Georg Brunner, Andreas Dömel, Armin Wedler, Freek Stulp | German Aerospace Center (DLR),DLR German Aerospace Center, Robotics and Mechatronics Center,DLR - German Aerospace Center,DLR - Deutsches Zentrum für Luft- und Raumfahrt e.V. 
| Multi-Robot Systems III | | Non-Cooperative Stochastic Target Encirclement by Anti-Synchronization Control Via Range-Only Measurement | Fen Liu, Shenghai Yuan, Wei Meng, Rong Su, Lihua Xie | Guangdong University of Technology,NANYANG TECHNOLOGICAL UNIVERSITY,Nanyang Technological University,NanyangTechnological University | Multi-Robot Systems III | | Estimation of Continuous Environments by Robot Swarms: Correlated Networks and Decision-Making | Mohsen Raoufi, Pawel Romanczuk, Heiko Hamann | Technical University of Berlin,Humboldt-Unviersity Berkin,University of Konstanz | Multi-Robot Systems III | | FogROS2: An Adaptive Platform for Cloud and Fog Robotics Using ROS 2 | Jeffrey Ichnowski, Kaiyuan Chen, Karthik Dharmarajan, Simeon Oluwafunmilore Adebola, Michael Danielczuk, Victor Mayoral-Vilches, Nikhil Jha, Hugo Zhan, Edith Llontop, Derek Xu, Camilo Buscaron, John Kubiatowicz, Ion Stoica, Joseph E. Gonzalez, Ken Goldberg | Carnegie Mellon University,University of California, Berkeley,UC Berkeley,Klagenfurt University,University of California, Berkely,Anytime.ai | Multi-Robot Systems III | | Stackelberg Games for Learning Emergent Behaviors During Competitive Autocurricula | Boling Yang, Liyuan Zheng, Lillian J. Ratliff, Byron Boots, Joshua R. Smith | University of Washington | Multi-Robot Systems III | | On Legible and Predictable Robot Navigation in Multi-Agent Environments | Jean-Luc Bastarache, Christopher Nielsen, Stephen L. 
Smith | University of Waterloo | Multi-Robot Systems III | | Explainable Action Advising for Multi-Agent Reinforcement Learning | Yue (Sophie) Guo, Joseph Campbell, Simon Stepputtis, Ruiyu Li, Dana Hughes, Fei Fang, Katia Sycara | Carnegie Mellon University | Multi-Robot Systems III | | A Complete Set of Connectivity-Aware Local Topology Manipulation Operations for Robot Swarms | Karthik Soma, Koresh Khateri, Mahdi Pourgholi, Mohsen Montazeri, Lorenzo Sabattini, Giovanni Beltrame | École Polytechnique de Montréal,Shahid Beheshti University,University of Modena and Reggio Emilia,Ecole Polytechnique de Montreal | Multi-Robot Systems III | | Decentralized Multi-Agent Exploration with Limited Inter-Agent Communications | Hans He, Alec Koppel, Amrit Bedi, Daniel Stilwell, Mazen Farhood, Benjamin Adams Biggs | Virginia Tech,JP Morgan Chase,University of Maryland, College Park,Virginia Polytechnic Institute and State University | Multi-Robot Systems III | | A Distributed Online Optimization Strategy for Cooperative Robotic Surveillance | Lorenzo Pichierri, Guido Carnevale, Lorenzo Sforni, Andrea Testa, Giuseppe Notarstefano | University of Bologna,Alma Mater Studiorum - Università di Bologna | Multi-Robot Systems III | | Risk-Aware Recharging Rendezvous for a Collaborative Team of UAVs and UGVs | Ahmad Bilal Asghar, Guangyao Shi, Nare Karapetyan, James Humann, Jean-paul Reddinger, James Dotterweich, Pratap Tokekar | University of Maryland,DEVCOM Army Research Laboratory,,Engility Corp. | Multi-Robot Systems III | | Cross-Agent Relocalization for Decentralized Collaborative SLAM | Philipp Baenninger, Ignacio Alzugaray, Marco Karrer, Margarita Chli | ETH Zurich,Imperial College London | Multi-Robot Systems III | | Planning with Occluded Traffic Agents Using Bi-Level Variational Occlusion Models | Filippos Christianos, Peter Karkus, Boris Ivanovic, Stefano V. 
Albrecht, Marco Pavone | University of Edinburgh,NVIDIA,Stanford University | Intelligent Transportation Systems III | | Robust Forecasting for Robotic Control: A Game-Theoretic Approach | Shubhankar Agarwal, David Fridovich-Keil, Sandeep Chinchali | The University of Texas at Austin | Intelligent Transportation Systems III | | Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios | Zhili Zhang, Songyang Han, Jiangwei Wang, Fei Miao | University of Connecticut | Intelligent Transportation Systems III | | Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library | Xinyu Cai, Wentao Jiang, Runsheng Xu, Wenquan Zhao, Jiaqi Ma, Si Liu, Yikang Li | Shanghai AI Laboratory,Beihang University,UCLA,Harbin Institute of Technology,University of California, Los Angeles,Sensetime Ltd. | Intelligent Transportation Systems III | | Uncertainty Quantification of Collaborative Detection for Self-Driving | Sanbao Su, Yiming Li, Sihong He, Songyang Han, Chen Feng, Caiwen Ding, Fei Miao | University of Connecticut,New York University | Self-Driving Cars I | | WS-3D-Lane: Weakly Supervised 3D Lane Detection with 2D Lane Labels | Jianyong Ai, Wenbo Ding, Jiuhua Zhao, Jiachen Zhong | SAIC AI Lab | Self-Driving Cars I | | One Training for Multiple Deployments: Polar-Based Adaptive BEV Perception for Autonomous Driving | Huitong Yang, Xuyang Bai, Xinge Zhu, Yuexin Ma | ShanghaiTech University,Hong Kong University of Science and Technology,CUHK | Self-Driving Cars I | | Deep Occupancy-Predictive Representations for Autonomous Driving | Eivind Meyer, Lars Frederik Peiss, Matthias Althoff | Technische Universität München | Self-Driving Cars I | | PriorLane: A Prior Knowledge Enhanced Lane Detection Approach Based on Transformer | Qibo Qiu, Haiming Gao, Wei Hua, Gang Huang, Xiaofei He | Zhejiang Lab,Zhejiang University | Self-Driving Cars I | | Reinforcement Learning with Probabilistically Safe Control 
Barrier Functions for Ramp Merging | Soumith Udatha, Yiwei Lyu, John Dolan | Carnegie Mellon University | Self-Driving Cars I | | Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms | Resul Dagdanov, Halil Durmuş, Nazim Ure | Eatron Yazilim ve Muhendislik Teknolojileri A.S.,İstanbul Technical University,Istanbul Technical University | Self-Driving Cars I | | Multi-Source Domain Adaptation for Unsupervised Road Defect Segmentation | JONGMIN YU, Hyeontaek Oh, Sebastiano Fichera, Paolo Paoletti, Shan Luo | King's College London,Korea Advanced Institute of Science and Technology,University of Liverpool | Self-Driving Cars I | | A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations | Sohan Rudra, Saksham Goel, Anirban Santara, Claudio Gentile, Laurent Perron, Fei Xia, Vikas Sindhwani, Carolina Parada, Gaurav Aggarwal | Google,Google Inc,Google Brain, NYC | Motion and Path Planning III | | Safe and Efficient Navigation in Extreme Environments Using Semantic Belief Graphs | M. 
Fadhil Ginting, Sung Kyun Kim, Oriana Peltzer, Joshua Ott, Sunggoo Jung, Mykel Kochenderfer, Ali-Akbar Agha-Mohammadi | Stanford University,NASA Jet Propulsion Laboratory, Caltech,JPL,NASA-JPL, Caltech | Motion and Path Planning III | | Risk-Aware Neural Navigation from BEV Input for Interactive Driving | Suzanna Jiwani, Xiao Li, Sertac Karaman, Daniela Rus | Massachusetts Institute of Technology,MIT | Motion and Path Planning III | | Informable Multi-Objective and Multi-Directional RRT* System for Robot Path Planning | Bruce Jk Huang, Yingwen Tan, Dongmyeong Lee, Vishnu Desaraju, J.W Grizzle | University of Michigan,Woven Planet North America | Motion and Path Planning III | | Leveraging Scene Embeddings for Gradient-Based Motion Planning in Latent Space | Jun Yamada, Chia-Man Hung, Jack Collins, Ioannis Havoutis, Ingmar Posner | University of Oxford,Oxford University | Motion and Path Planning III | | Sample-Driven Connectivity Learning for Motion Planning | Sihui Li, Neil Dantam | Colorado School of Mines | Motion and Path Planning III | | Online Coverage Path Planning Scheme for a Size-Variable Robot | M. A. Viraj J. 
Muthugala, Bhagya Samarakoon, Rajesh Elara Mohan | Singapore University of Technology and Design | Motion and Path Planning III | | Navigation with Polytopes and B-Spline Path Planner | Ngoc Thinh Nguyen, Pranav Tej Gangavarapu, Arne Sahrhage, Georg Schildbach, Floris Ernst | University of Lübeck | Motion and Path Planning III | | Probabilistic Planning with Partially Ordered Preferences Over Temporal Goals | Hazhar Rahmani, Abhishek Kulkarni, Jie Fu | University of Florida,University of Florida, Gainesville | Planning under Uncertainty I | | A Causal Decoupling Approach to Efficient Planning for Logistics Problems with Stateful Stochastic Demand | Diptanil Chaudhuri, Dylan Shell | Texas A&M University | Planning under Uncertainty I | | Stochastic Robustness Interval for Motion Planning with Signal Temporal Logic | Roland Ilyes, Qi Heng Ho, Morteza Lahijanian | University of Colorado Boulder | Planning under Uncertainty I | | Planning with SiMBA: Motion Planning under Uncertainty for Temporal Goals Using Simplified Belief Guides | Qi Heng Ho, Zachary Sunberg, Morteza Lahijanian | University of Colorado Boulder,University of Colorado | Planning under Uncertainty I | | RAMP: A Risk-Aware Mapping and Planning Pipeline for Fast Off-Road Ground Robot Navigation | Lakshay Sharma, Michael Everett, Donggun Lee, Xiaoyi Cai, Philip Osteen, Jonathan Patrick How | Massachusetts Institute of Technology,Northeastern University,UC Berkeley,U.S. 
Army Research Laboratory | Planning under Uncertainty I | | Prioritized Robotic Exploration with Deadlines: A Comparison of Greedy, Orienteering, and Profitable Tour Approaches | Sayantan Datta, Srinivas Akella | University of North Carolina at Charlotte | Planning under Uncertainty I | | Epistemic Prediction and Planning with Implicit Coordination for Multi-Robot Teams in Communication Restricted Environments | Lauren Bramblett, Shijie Gao, Nicola Bezzo | University of Virginia | Planning under Uncertainty I | | Uncertainty-Guided Active Reinforcement Learning with Bayesian Neural Networks | Xinyang Wu, Mohamed El-shamouty, Christof Nitsche, Marco F. Huber | Fraunhofer IPA,University of Stuttgart | Planning under Uncertainty I | | Perturbation-Based Best Arm Identification for Efficient Task Planning with Monte-Carlo Tree Search | Daejong Jin, Juhan Park, Kyungjae Lee | Chung-Ang university,Chung-ang University,Chung-Ang University | Task Planning | | Contingency-Aware Task Assignment and Scheduling for Human-Robot Teams | Neel Dhanaraj, Santosh Varadanahalli Narayan, Stefanos Nikolaidis, Satyandra K. 
Gupta | University of Southern California,UNIVERSITY OF SOUTHERN CALIFORNIA | Task Planning | | Extracting Generalizable Skills from a Single Plan Execution Using Abstraction-Critical State Detection | Khen Elimelech, Lydia Kavraki, Vardi Moshe | Rice University | Task Planning | | Efficient Planning of Multi-Robot Collective Transport Using Graph Reinforcement Learning with Higher Order Topological Abstraction | Steve Paul, Wenyuan Li, Brian Smyth, Yuzhou Chen, Yulia Gel, Souma Chowdhury | University at Buffalo,Temple University,University of Texas at Dallas,University at Buffalo, State University of New York | Task Planning | | On the Utility of Buffers in Pick-N-Swap Based Lattice Rearrangement | Kai Gao, Jingjin Yu | Rutgers University | Task Planning | | On-Demand Multi-Agent Basket Picking for Shopping Stores | Mattias Tiger, David Bergström, Simon Wijk Stranius, Evelina Holmgren, Daniel De Leng, Fredrik Heintz | AI and Integrated Computer Systems (AIICS), Linköping University,Linköping University | Task Planning | | Multi-Robot Coordination and Cooperation with Task Precedence Relationships | Walker Gosrich, Siddharth Mayya, Saaketh Narayan, Matthew Malencia, Saurav Agarwal, Vijay Kumar | University of Pennsylvania,Amazon Robotics | Task Planning | | On the Programming Effort Required to Generate Behavior Trees and Finite State Machines for Robotic Applications | Matteo Iovino, Julian Förster, Pietro Falco, Jen Jen Chung, Roland Siegwart, Claes Christian Smith | ABB Corporate Research,ETH Zurich,ABB, Corporate Research,The University of Queensland,KTH Royal Institute of Technology | Task Planning | | Train What You Know - Precise Pick-And-Place with Transporter Networks | Gergely Sóti, Xi Huang, Christian Wurll, Björn Hein | Karlsruhe University of Applied Sciences,Karlsruhe Institute of Technology,University of Applied Sciences Karlsruhe | Deep Learning in Grasping and Manipulation | | Asking for Help: Failure Prediction in Behavioral Cloning through Value 
Approximation | Cem Gokmen, Mohi Khansari, Daniel Ho | Stanford University,Google X | Deep Learning in Grasping and Manipulation | | Seq2Seq Imitation Learning for Tactile Feedback-Based Manipulation | Wenyan Yang, Alexandre Angleraud, Roel S. Pieters, Joni Pajarinen, Joni-Kristian Kamarainen | Tampere university,Tampere University,Aalto University,Tampere University of Technology | Deep Learning in Grasping and Manipulation | | SGTM 2.0: Autonomously Untangling Long Cables Using Interactive Perception | Kaushik Shivakumar, Vainavi Viswanath, Anrui Gu, Yahav Avigal, Justin Kerr, Jeffrey Ichnowski, Richard Cheng, Thomas Kollar, Ken Goldberg | University of California Berkeley,University of California, Berkeley,UC Berkeley,Carnegie Mellon University,California Institute of Technology,Toyota Research Institute | Deep Learning in Grasping and Manipulation | | Online Tool Selection with Learned Grasp Prediction Models | Rohanimanesh Khashayar, Jacob Metzger, William Richards, Aviv Tamar | Osaro Inc.,Osaro, Inc,Technion | Deep Learning in Grasping and Manipulation | | FOGL: Federated Object Grasping Learning | Seok-kyu Kang, Changhyun Choi | Korea Shipbuilding & Offshore Engineering Co. 
Ltd (KSOE), HD Hyundai Group,University of Minnesota, Twin Cities | Deep Learning in Grasping and Manipulation | | Goal-Image Conditioned Dynamic Cable Manipulation through Bayesian Inference and Multi-Objective Black-Box Optimization | Kuniyuki Takahashi, Tadahiro Taniguchi | Preferred Networks, Inc.,Ritsumeikan University | Deep Learning in Grasping and Manipulation | | Learning Generalizable Pivoting Skills | Xiang Zhang, Siddarth Jain, Baichuan Huang, Masayoshi Tomizuka, Diego Romeres | University of California, Berkeley,Mitsubishi Electric Research Laboratories (MERL),Rutgers University,University of California,Mitsubishi Electric research laboratories | Deep Learning in Grasping and Manipulation | | Cloth Funnels: Canonicalized-Alignment for Multi-Purpose Garment Manipulation | Alper Canberk, Cheng Chi, Huy Ha, Benjamin Burchfiel, Eric Cousineau, Siyuan Feng, Shuran Song | Columbia University,Toyota Research Institute | Deep Learning in Grasping and Manipulation | | RLAfford: End-To-End Affordance Learning for Robotic Manipulation | Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong | Peking University,South China University of Technology | Deep Learning in Grasping and Manipulation | | Implementation and Optimization of Grasping Learning with Dual-Modal Soft Gripper | Lei Zhao, Horeal Liu, Feihan Li, X.y. 
Ding, Yuhao Sun, Fuchun Sun, Jianhua Shan, Qi Ye, Lincheng Li, Bin Fang | anhui university of technology,Tsinghua University,Anhui University of Technology,Zhejiang University,NetEase Fuxi AI Lab,Tsinghua university | Deep Learning in Grasping and Manipulation | | DefGraspNets: Grasp Planning on 3D Fields with Graph Neural Nets | Isabella Huang, Yashraj Narang, Ruzena Bajcsy, Fabio Ramos, Tucker Hermans, Dieter Fox | UC Berkeley,NVIDIA,Univ of California, Berkeley,University of Sydney, NVIDIA,University of Utah,University of Washington | Deep Learning in Grasping and Manipulation | | Option-Aware Adversarial Inverse Reinforcement Learning for Robotic Control | Jiayu Chen, Tian Lan, Vaneet Aggarwal | Purdue University,George Washington University | Learning for Grasping and Manipulation III | | Efficiently Learning Small Policies for Locomotion and Manipulation | Shashank Hegde, Gaurav Sukhatme | University of Southern California | Learning for Grasping and Manipulation III | | Learning Agent-Aware Affordances for Closed-Loop Interaction with Articulated Objects | Giulio Schiavi, Paula Wulkop, Giuseppe Maria Rizzi, Lionel Ott, Roland Siegwart, Jen Jen Chung | ETH Zürich,ETH Zurich,The University of Queensland | Learning for Grasping and Manipulation III | | SE(3)-DiffusionFields: Learning Smooth Cost Functions for Joint Grasp and Motion Optimization through Diffusion | Julen Urain, Niklas Funk, Jan Peters, Georgia Chalvatzaki | TU Darmstadt,Technische Universität Darmstadt,Technische Universität Darmastadt | Learning for Grasping and Manipulation III | | Focused Adaptation of Dynamics Models for Deformable Object Manipulation | Peter Mitrano, Alex Lagrassa, Oliver Kroemer, Dmitry Berenson | University of Michigan,Carnegie Mellon University | Learning for Grasping and Manipulation III | | Dexterous Manipulation from Images: Autonomous Real-World RL Via Substep Guidance | Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine | 
University of California, Berkeley,Meta AI,University of Washington,UC Berkeley | Learning for Grasping and Manipulation III | | Predicting Motion Plans for Articulating Everyday Objects | Arjun Gupta, Max Shepherd, Saurabh Gupta | UIUC | Learning for Grasping and Manipulation III | | Dexterous Imitation Made Easy: A Learning-Based Framework for Efficient Dexterous Manipulation | Sridhar Pandian Arunachalam, Sneha Silwal, Ben Evans, Lerrel Pinto | New York University | Learning for Grasping and Manipulation III | | Holo-Dex: Teaching Dexterity with Immersive Mixed Reality | Sridhar Pandian Arunachalam, Irmak Guzey, Soumith Chintala, Lerrel Pinto | New York University,Facebook AI Research | Learning for Grasping and Manipulation III | | Online Augmentation of Learned Grasp Sequence Policies for More Adaptable and Data-Efficient In-Hand Manipulation | Ethan K. Gordon, Rana Soltani Zarrin | University of Washington,Honda Research Institute - USA | Learning for Grasping and Manipulation III | | DeXtreme: Transfer of Agile In-Hand Manipulation from Simulation to Reality | Ankur Handa, Arthur Allshire, Viktor Makoviichuk, Aleksei Petrenko, Ritvik Singh, Jingzhou Liu, Denys Makoviichuk, Karl Van Wyk, Zhurkevich Alexander, Balakumar Sundaralingam, Yashraj Narang, Jean-francois Lafleche, Dieter Fox, Gavriel State | NVidia,University of Toronto,NVIDIA,USC,University of Toronto, NVIDIA,Snap,NVIDIA Corporation,University of Washington | Learning for Grasping and Manipulation III | | Meta-Reinforcement Learning Via Language Instructions | Zhenshan Bing, Alexander Koch, Xiangtong Yao, Kai Huang, Alois Knoll | Technical University of Munich,Sun Yat-sen University,Tech. Univ. 
Muenchen TUM | Learning for Grasping and Manipulation III | | Improving Video Super-Resolution with Long-Term Self-Exemplars | Guotao Meng, Yue Wu, Qifeng Chen | HKUST,Hong Kong University of Science and Technology | Machine Learning for Perception | | Learning-Based Relational Object Matching across Views | Cathrin Elich, Iro Armeni, Martin R. Oswald, Marc Pollefeys, Joerg Stueckler | Max Planck Institute for Intelligent Systems,ETH Zurich | Machine Learning for Perception | | TransVisDrone: Spatio-Temporal Transformer for Vision-Based Drone-To-Drone Detection in Aerial Videos | Tushar Bharat Sangam, Ishan Rajendrakumar Dave, Waqas Sultani, Mubarak Shah | University of Central Florida,Informational Technology University | Machine Learning for Perception | | Unsupervised RGB-To-Thermal Domain Adaptation Via Multi-Domain Attention Network | Lu Gan, Connor Lee, Soon-Jo Chung | California Institute of Technology,Caltech | Machine Learning for Perception | | Adaptive-SpikeNet: Event-Based Optical Flow Estimation Using Spiking Neural Networks with Learnable Neuronal Dynamics | Adarsh Kosta, Kaushik Roy | Purdue University | Machine Learning for Perception | | Reinforced Learning for Label-Efficient 3D Face Reconstruction | Hoda Mohaghegh, Hossein Rahmani, Hamid Laga, Farid Boussaid, Mohammed Bennamoun | University of Western Australia,Lancaster University,Murdoch University,The University of Western Australia,UWA | Machine Learning for Perception | | Bridging the Domain Gap for Multi-Agent Perception | Runsheng Xu, Jinlong Li, Xiaoyu Dong, Hongkai Yu, Jiaqi Ma | UCLA,cleveland state university,Northwestern University,Cleveland State University,University of California, Los Angeles | Machine Learning for Perception | | UPLIFT: Unsupervised Person Labeling and Identification Via Cooperative Learning with Mobile Robots | Yu-chee Tseng, Ting-Yuan Ke, Fang-jing Wu | National Yang Ming Chiao Tung University,TU Dortmund University | Machine Learning for Perception | | Learning 
to Explore Informative Trajectories and Samples for Embodied Perception | Ya Jing, Tao Kong | ByteDance | Machine Learning for Perception | | Embodied Agents for Efficient Exploration and Smart Scene Description | Roberto Bigazzi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara | University of Modena and Reggio Emilia,Università degli Studi di Modena e Reggio Emilia | Machine Learning for Perception | | Deep Neural Network Architecture Search for Accurate Visual Pose Estimation Aboard Nano-UAVs | Elia Cereda, Luca Crupi, Matteo Risso, Alessio Burrello, Luca Benini, Alessandro Giusti, Daniele Jahier Pagliari, Daniele Palossi | IDSIA USI-SUPSI,Politecnico di Torino,Università di Bologna,University of Bologna,IDSIA Lugano, SUPSI,ETH Zurich | Machine Learning for Perception | | Reuse Your Features: Unifying Retrieval and Feature-Metric Alignment | Javier Morlana, Jose M M Montiel | Universidad de Zaragoza,I,A. Universidad de Zaragoza | Machine Learning for Perception | | FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions from Single Panoramas | Bruno Berenguel-Baeta, Jesús Bermúdez, Josechu Guerrero | University of Zaragoza,Universidad de Zaragoza | Deep Learning for Visual Perception I | | CAHIR: Co-Attentive Hierarchical Image Representations for Visual Place Recognition | Guohao Peng, Heshan Li, Yifeng Huang, Jun Zhang, Mingxing Wen, Singh Rahul, Danwei Wang | Nanyang Technological University,Continental Automotive Singapore Pte Ltd | Deep Learning for Visual Perception I | | Monocular Visual-Inertial Depth Estimation | Diana Wofk, Rene Ranftl, Matthias Mueller, Vladlen Koltun | Intel,Intel Labs | Deep Learning for Visual Perception I | | KGNet: Knowledge-Guided Networks for Category-Level 6D Object Pose and Size Estimation | Qiwei Meng, Jianjun Gu, Shiqiang Zhu, Jianfeng Liao, Tianlei Jin, Fangtai Guo, Wen Wang, Wei Song | Zhejiang Lab | Deep Learning for Visual Perception I | | Online Consistent Video 
Depth with Gaussian Mixture Representation | Chao Liu, Benjamin Eckart, Jan Kautz | NVIDIA | Deep Learning for Visual Perception I | | Deep Masked Graph Matching for Correspondence Identification in Collaborative Perception | Peng Gao, Qingzhao Zhu, Hongsheng Lu, Chuang Gan, Hao Zhang | University of Maryland, College Park,Colorado School of Mines,Toyota Motor North America,IBM | Deep Learning for Visual Perception I | | Operative Action Captioning for Estimating System Actions | Taiki Nakamura, Seiya Kawano, Akishige Yuguchi, Yasutomo Kawanishi, Koichiro Yoshino | The University of Tokyo,RIKEN,Institute of Physical and Chemical Research (RIKEN) | Deep Learning for Visual Perception I | | Deep Unsupervised Visual Odometry Via Bundle Adjusted Pose Graph Optimization | Guoyu Lu | University of Georgia | Deep Learning for Visual Perception I | | Pose Relation Transformer : Refine Occlusions for Human Pose Estimation | Hyung-gun Chi, Seunggeun Chi, Stanley Chan, Karthik Ramani | Purdue University,Purdue | Deep Learning for Visual Perception I | | Question Generation for Uncertainty Elimination of Referring Expression in 3D Environment | Fumiya Matsuzawa, Yue Qiu, Kenji Iwata, Hirokatsu Kataoka, Yutaka Satoh | National Institute of Advanced Industrial Science and Technology,AIST | Deep Learning for Visual Perception I | | A New Efficient Eye Gaze Tracker for Robotic Applications | Chaitanya Bandi, Ulrike Thomas | Chemnitz University of Technology | Deep Learning for Visual Perception I | | A Deep Learning Human Activity Recognition Framework for Socially Assistive Robots to Support Reablement of Older Adults | Fraser Robinson, Goldie Nejat | University of Toronto | Deep Learning for Visual Perception I | | FloorplanNet: Learning Topometric Floorplan Matching for Robot Localization | Delin Feng, Zhenpeng He, Jiawei Hou, Soeren Schwertfeger, Liangjun Zhang | ShanghaiTech University,Baidu | Localization and Mapping III | | MOFT: Monocular Odometry Based on Deep Depth and 
Careful Feature Selection and Tracking | Karlo Koledic, Igor Cvišić, Ivan Markovic, Ivan Petrovic | University of Zagreb,University of Zagreb, Faculty of Electrical Engineering and Comp,University of Zagreb Faculty of Electrical Engineering and Computing | Localization and Mapping III | | LGCNet: Feature Enhancement and Consistency Learning Based on Local and Global Coherence Network for Correspondence Selection | Tzu-Han Wu, Kuan-Wen Chen | National Yang Ming Chiao Tung University | Localization and Mapping III | | Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors | Hao Dong, Xieyuanli Chen, Mihai Dusmanu, Viktor Larsson, Marc Pollefeys, Cyrill Stachniss | ETH Zürich,National University of Defense Technology,ETH Zurich,Lund University,University of Bonn | Localization and Mapping III | | Online Visual SLAM Adaptation against Catastrophic Forgetting with Cycle-Consistent Contrastive Learning | Sangni Xu, Hao Xiong, Qiuxia Wu, Tingting Yao, Zhihui Wang, Zhiyong Wang | South China University of Technology,Macquarie University,Dalian Maritime University,Dalian University of Technology,The University of Sydney | Localization and Mapping III | | SLAMER: Simultaneous Localization and Map-Assisted Environment Recognition | Naoki Akai | Nagoya University | Localization and Mapping III | | Descriptor Distillation for Efficient Multi-Robot SLAM | Xiyue Guo, Junjie Hu, Hujun Bao, Guofeng Zhang | Zhejiang University,The Chinese University of Hong Kong, Shenzhen | Localization and Mapping III | | DS-K3DOM: 3-D Dynamic Occupancy Mapping with Kernel Inference and Dempster-Shafer Evidential Theory | Juyeop Han, Youngjae Min, Hyeok-Joo Chae, Byeongmin Jeong, Han-Lim Choi | Korea Advanced Institute of Science and Technology,Massachusetts Institute of Technology,KAIST | Localization and Mapping III | | Monocular Visual-Inertial Odometry with Planar Regularities | Chuchu Chen, Patrick Geneva, Yuxiang Peng, Woosik Lee, Guoquan Huang 
| University of Delaware | Localization and Mapping III | | BAMF-SLAM: Bundle Adjusted Multi-Fisheye Visual-Inertial SLAM Using Recurrent Field Transforms | Wei Zhang, Sen Wang, Xingliang Dong, Rongwei Guo, Norbert Haala | University of Stuttgart,Technische Universität München,Huawei Technologies, Co., Ltd., P. R. CHINA,Huawei,University of Stuttgart, Institute for Photogrammetry | Localization and Mapping III | | Improving the Performance of Local Bundle Adjustment for Visual-Inertial SLAM with Efficient Use of GPU Resources | Shishir Gopinath, Karthik Dantu, Steve Ko | Simon Fraser University,University of Buffalo | Localization and Mapping III | | Distributed Initialization for Visual-Inertial-Ranging Odometry with Position-Unknown UWB Network | Shenhan Jia, Rong Xiong, Yue Wang | Zhejiang University | Localization and Mapping III | | Biomimetic Electric Sense Based Localization: A Solution for Small Underwater Robots in Large-Scale Environment | Junzheng Zheng, Jingxian Wang, Xin Guo, Chayutpon Huntrakul, Chen Wang, Guangming Xie | Peking University,Northwestern University | Localisation 2 | | How Many Events Do You Need? 
Event-Based Visual Place Recognition Using Sparse but Varying Pixels | Tobias Fischer, Michael J Milford | Queensland University of Technology | Localisation 2 | | Mitigating Shadows in LIDAR Scan Matching Using Spherical Voxels | Matthew Mcdermott, Jason Rife | Tufts University | Localisation 2 | | UWB-VIO Fusion for Accurate and Robust Relative Localization of Ground Robotic Teams | Shuaikang Zheng, Zhitian Li, Yunfei Liu, Haifeng Zhang, Pengcheng Zheng, Xingdong Liang, Yanlei Li, Xiangxi Bu, Xudong Zou | University of Chinese Academy of Sciences,Aerospace Information Research Institute, Chinese Academy of Sci,National Key Laboratory of Microwave Imaging Technology, Aerospa | Localisation 2 | | Robust Self-Tuning Data Association for Geo-Referencing Using Lane Markings | Miguel Ángel Muñoz-Bañón, Jan-Hendrik Pauls, Haohao Hu, Christoph Stiller, Francisco A. Candelas, Fernando Torres | University of Alicante,Karlsruhe Institute of Technology (KIT),Karlsruhe Institute of Technology,University of Alicante VAT ES-Q-,,,,,,,-G | Localisation 2 | | Fast and Versatile Feature-Based LiDAR Odometry Via Efficient Local Quadratic Surface Approximation | Seungwon Choi, Hee-Won Chae, Yunsuk Jeung, Seokjoon Kim, Kyusung Cho, Taewan Kim | Seoul National University,Korea University,MAXST | Localisation 2 | | KPPR: Exploiting Momentum Contrast for Point Cloud-Based Place Recognition | Louis Wiesmann, Lucas Nunes, Jens Behley, Cyrill Stachniss | University of Bonn | Localisation 2 | | Handling Constrained Optimization in Factor Graphs for Autonomous Navigation | Barbara Bazzana, Tiziano Guadagnino, Giorgio Grisetti | Sapienza Univ. 
of Rome,Sapienza University of Rome | Localisation 2 | | Long-Term Localization Using Semantic Cues in Floor Plan Maps | Nicky Zimmerman, Tiziano Guadagnino, Xieyuanli Chen, Jens Behley, Cyrill Stachniss | University of Bonn,Sapienza University of Rome,National University of Defense Technology | Localisation 2 | | COBRA: From Industrial to Medical Surgery with Slender Continuum Robots | David Alatorre Troncoso, Jose A. Robles-linares, Matteo Russo, Mohamed A. Elbanna, Samuel Wild, Xin Dong, Abdelkhalick Mohammad, James Kell, Andy Norton, Dragos Axinte | University of Nottingham,University of Rome Tor Vergata,Rolls-Royce Plc | Medical Systems | | Assistive Robotic Technologies for Next-Generation Smart Wheelchairs | Fabio Morbidi, Louise Devigne, Catalin Stefan Teodorescu, Bastien Fraudet, Emilie Leblong, Tom Carlson, Marie Babel, Guillaume Caron, Sarah Delmas, François Pasteau, Guillaume Vailland, Valérie Gouranton, Sylvain Guegan, Ronan Le Breton, Nicolas Ragot | Université de Picardie Jules Verne,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes - Rehabilitation Cente,The University of Manchester,Rehabilitation Center Pôle Saint Hélier,Rehabilitation Center Pôle Saint Hélier Rennes,University College London, UK,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes,CNRS,Universite de Picardie Jules Verne,,INSA Rennes / IRISA Rainbow Team,,IRISA UMR CNRS ,,,, - Inria - INSA Rennes,,INSA Rennes,,UNIV-RENNES - INSA Rennes,,CESI | Medical Systems | | A-SEE: Active-Sensing End-Effector Enabled Probe Self-Normal-Positioning for Robotic Ultrasound Imaging Applications | Xihan Ma, Wen-yi Kuo, Kehan Yang, Ashiqur Rahaman, Haichong Zhang | Worcester Polytechnic Institute | Medical Systems | | Hybrid Half-Gaussian Selectively Adaptive Fuzzy Control of an Actuated Ankle Foot-Orthosis | Huiseok Moon, Roshni Maiti, Kaushik Das Sharma, Yacine Amirat, Patrick Siarry, Samer Mohammed | LISSI-lab, Universite de Paris-Est Creteil (UPEC),University of Calcutta,University of Paris Est Créteil 
(UPEC),Université Paris-Est Créteil,University of Paris Est Créteil - (UPEC) | Medical Systems | | Collaborative Magnetic Manipulation Via Two Robotically-Actuated Permanent Magnets | Giovanni Pittiglio, Michael Brockdorff, Tomas Veiga, Josh Davy, James Henry Chandler, Pietro Valdastri | Harvard University,University of Leeds | Medical Systems | | Neuromechanical Model-Based Adaptive Control of Bi-Lateral Ankle Exoskeletons: Biological Joint Torque and Electromyogram Reduction across Walking Conditions | Guillaume Durandau, Wolfgang Rampeltshammer, Herman Van Der Kooij, Massimo Sartori | McGill University,University Twente,Universtity of Twente,University of Twente | Medical Systems | | A Markov Chain Model for Workflow Analysis in Operating Rooms | Hanyi Zheng, Qing Wang, Jingshan Li | Tsinghua University | Medical Systems | | On the Workspace of Electromagnetic Navigation Systems | Quentin Boehler, Simone Gervasoni, Samuel L. Charreyron, Christophe Chautems, Bradley Nelson | ETH Zurich,Accelera AI | Medical Systems | | UVtac: Switchable UV Marker-Based Tactile Sensing Finger for Effective Force Estimation and Object Localization | Woojong Kim, Won Dong Kim, Jeong-Jung Kim, Chang-Hyun Kim, Jung Kim | KAIST,Korea Advanced Institute of Science & Technology (KAIST),Korea Institute of Machinery & Materials (KIMM),Korea Institute of Machinery and Materials (KIMM) | Manipulation and Grasping II | | Sparse-Dense Motion Modelling and Tracking for Manipulation without Prior Object Models | Christian Rauch, Ran Long, Vladimir Ivan, Sethu Vijayakumar | Robert Bosch GmbH,University of Edinburgh,Touchlab Limited | Manipulation and Grasping II | | Enhanced GPIS Learning Based on Local and Global Focus Areas | Zuka Murvanidze, Marc Peter Deisenroth, Yasemin Bekiroglu | University College London,Chalmers University of Technology, University College London | Manipulation and Grasping II | | Ambiguity-Aware Multi-Object Pose Optimization for Visually-Assisted Robot Manipulation | 
Myung-Hwan Jeon, Jeongyun Kim, Jee-Hwan Ryu, Ayoung Kim | Seoul National University,SNU,Korea Advanced Institute of Science and Technology | Manipulation and Grasping II | | Interaction Control of a Robotic Manipulator with the Surface of Deformable Object | Athanasios Dometios, Costas S. Tzafestas | National Technical University of Athens (NTUA),ICCS - Inst of Communication and Computer Systems | Manipulation and Grasping II | | DiffSRL: Learning Dynamical State Representation for Deformable Object Manipulation with Differentiable Simulation | Sirui Chen, Yunhao Liu, Shang Wen Yao, Jialong Li, Tingxiang Fan, Jia Pan | The University of Hong Kong,University of Hong Kong,The Univeristy of Hong Kong | Manipulation and Grasping II | | SymmetryGrasp: Symmetry-Aware Antipodal Grasp Detection from Single-View RGB-D Images | Yifei Shi, Zixin Tang, Xiangting Cai, Hongjia Zhang, Dewen Hu, Xin Xu | National University of Defense Technology | Manipulation and Grasping II | | Hardware-Accelerated Mars Sample Localization Via Deep Transfer Learning from Photorealistic Simulations | Raul Castilla-Arquillo, Carlos Perez-del-pulgar, Gonzalo Jesús Paz Delgado, Levin Gerdes | University of Málaga,Universidad de Málaga,ESA/ESTEC | Manipulation and Grasping II | | How AI and Robotics Can Build Furniture: A Case Study from the 2021 AI-Robot Assembly Challenge | Seongseop Yun, Myoung-su Choi, Min-young Cho, Keunhwan Kim, Dong-Hyuk Lee, Sewoong Jun, Ji-Hun Bae, Dongjun Shin | Yonsei University,KITECH, UST,KOREA ELECTRONICS TECHNOLOGY INSTITUTE,Korea Electronics Technology Institute,Korea Institute of Industrial Technology (KITECH),Korea Institute of Industrial Technology | Manipulation and Grasping II | | A Robotic End-Effector for Screwing and Unscrewing Bolts from the Side | Rui Tao, Junfeng Fan, Fengshui Jing, Jun Hou, Shiyu Xing, Yunkai Ma, Min Tan | Institute of Automation, Chinese Academy of Sciences,Institute of Automation,CAS,Chinese Academy of Sciences, Institute of 
Automation,Chinese Academy of Sciences,Institute of Automation, Chinese Academy of Sciences,,Institute of Automation,Chinese Academy of Sciences | Manipulation and Grasping II | | Adaptive Cooperative Control for Human-Robot Load Manipulation | Carlos de Cos, Dimos V. Dimarogonas | MathWorks AB,KTH Royal Institute of Technology | Human-Robot Interaction/Collaboration | | An Energy Based Control Architecture for Shared Autonomy | Federico Benzi, Federica Ferraguti, Giuseppe Riggio, Cristian Secchi | University of Modena and Reggio Emilia,Università degli Studi di Modena e Reggio Emilia | Human-Robot Interaction/Collaboration | | Computational Model of Robot Trust in Human Co-Worker for Physical Human-Robot Collaboration | Qiao Wang, Dikai Liu, Marc Garry Carmichael, Stefano Aldini, Chin-teng Lin | University of Technology Sydney,Centre for Autonomous Systems,UTS | Human-Robot Interaction/Collaboration | | Robust Multi-User In-Hand Object Recognition in Human-Robot Collaboration Using a Wearable Force-Myography Device | Eran Bamani, Nadav Dov Kahanowich, Inbar Meir, Avishai Sintov | Tel Aviv University,Tel-Aviv University | Human-Robot Interaction/Collaboration | | CARE: Cooperation of AI-Robot Enablers to Create a Vibrant Society | Ankit Ravankar, Amir Tafrishi, Jose Victorio Salazar Luces, Fumi Seto, Yasuhisa Hirata | Tohoku University,Cardiff University | Human-Robot Interaction/Collaboration | | Safety and Efficiency in Robotics: The Control Barrier Functions Approach | Federica Ferraguti, Chiara Talignani Landi, Andrew Singletary, Hsien-Chung Lin, Aaron Ames, Cristian Secchi, Marcello Bonfe | Università degli Studi di Modena e Reggio Emilia,University of Modena and Reggio Emilia,California Institute of Technology,FANUC Corporation,Caltech,University of Ferrara | Human-Robot Interaction/Collaboration | | Encouraging Human Interaction with Robot Teams: Legible and Fair Subtask Allocations | Soheil Habibian, Dylan Losey | Virginia Tech | Human-Robot 
Interaction/Collaboration | | Autonomous Wristband Placement in a Moving Hand for Victims in SAR Scenarios with a Mobile Manipulator | Francisco Pastor, Francisco Jesús Ruiz Ruiz, Jesus Manuel Gomez De Gabriel, Alfonso García-Cerezo | Universidad de Málaga,University of Málaga,Universidad de Malaga,University of Malaga | Human-Robot Interaction/Collaboration | | Recommending Fine-Grained Tool Consistent with Common Sense Knowledge for Robot | Jianjia Xin, Lichun Wang, Shaofan Wang, Yukun Liu, Chao Yang, Baocai Yin | Beijing University of technology | Computer Vision and Visual Servoing | | Real-Time Hetero-Stereo Matching for Event and Frame Camera with Aligned Events Using Maximum Shift Distance | Haram Kim, Sangil Lee, Junha Kim, H. Jin Kim | Seoul National University,Seoul National Univ. | Computer Vision and Visual Servoing | | Toward Holistic Scene Understanding: A Transfer of Human Scene Perception to Mobile Robots | Florenz Graf, Jochen Lindermayr, Cagatay Odabasi, Marco F. Huber | Fraunhofer IPA,University of Stuttgart | Computer Vision and Visual Servoing | | Object Detection Using Sim2Real Domain Randomization for Robotic Applications | Dániel Horváth, Gábor Erdos, Zoltán Istenes, Tomas Horvath, Sándor Földi | Institute for Computer Science and Control (SZTAKI) and Eötvös L,Institute for Computer Science and Control, Engineering and Mana,Eötvös Loránd University, Faculty of Informatics,Eötvös Loránd University,Centre of Excellence in Production Informatics and Control, Inst | Computer Vision and Visual Servoing | | Continual Adaptation of Semantic Segmentation Using Complementary 2D-3D Data Representations | Jonas Frey, Hermann Blum, Francesco Milano, Roland Siegwart, Cesar D. 
Cadena Lerma | ETH Zurich | Computer Vision and Visual Servoing | | ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking | Nicola Agostino Piga, Yuriy Onyshchuk, Giulia Pasquale, Ugo Pattacini, Lorenzo Natale | Istituto Italiano di Tecnologia,Italian Institute of Technology (IIT) | Computer Vision and Visual Servoing | | Stability and Convergence Analysis of 3D Feature-Based Visual Servoing | Marco Costanzo, Giuseppe De Maria, Ciro Natale, Antonio Russo | "Università degli Studi della Campania ""Luigi Vanvitelli"",Università degli Studi della Campania Luigi Vanvitelli" | Computer Vision and Visual Servoing | | A Robust Visual Servoing Controller for Anthropomorphic Manipulators with Field-Of-View Constraints and Swivel-Angle Motion | Jiao Jiang, Yaonan Wang, Yiming Jiang, He Xie, Haoran Tan, Hui Zhang | Hunan University,Huazhong University of Science and Technology | Computer Vision and Visual Servoing | | Formation Tracking and Obstacle Avoidance for Multiple Quadrotors with Static and Dynamic Obstacles | Juntong Qi, Jinjin Guo, Mingming Wang, Chong Wu, Zhenwei Ma | Shanghai University,Tianjin University,EFY Intelligent Control (Tianjin) Technology Co., Ltd | Aerial Robotics | | Deep Learning-Aided Synthetic Airspeed Estimation of UAVs for Analytical Redundancy with a Temporal Convolutional Network | Hyungtae Lim, Han-seok Ryu, Matthew Rhudy, Dongjin Lee, Dongjin Jang, Changho Lee, Youngmin Park, Wonkeun Youn, Hyun Myung | Korea Advanced Institute of Science and Technology,Korea Aerospace Research Institute,Penn State University,Hanseo University,Chungnam National University,KAIST (Korea Advanced Institute of Science and Technology) | Aerial Robotics | | Reconfigurable Drone System for Transportation of Parcels with Variable Mass and Size | Fabrizio Schiano, Przemyslaw Mariusz Kornatowski, Leonardo Cencetti, Dario Floreano | Leonardo S.p.a.,Ecole Polytechnique Federale de Lausanne (EPFL),Swiss Federal Institute of Technology Lausanne 
(EPFL),Ecole Polytechnique Federal, Lausanne | Aerial Robotics | | Geometrically Constrained Trajectory Optimization for Multicopters | Zhepei Wang, Xin Zhou, Chao Xu, Fei Gao | Zhejiang University,ZHEJIANG UNIVERSITY | Aerial Robotics | | Parameter Estimation and Control of Multirotors | Cheng-cheng Yang, Teng-Hu Cheng | National Chiao Tung University,National Yang Ming Chiao Tung University | Aerial Robotics | | Indirect Force Control of a Cable-Suspended Aerial Multi-Robot Manipulator | Dario Sanalitro, Marco Tognon, Antonio Jimenez-cano, Juan Cortes, Antonio Franchi | University of Catania,Inria Rennes-Bretagne Atlantique,Centre National de la Recherche Scientifique,LAAS-CNRS,University of Twente | Aerial Robotics | | Accurate High-Maneuvering Trajectory Tracking for Quadrotors: A Drag Utilization Method | Jindou Jia, Kexin Guo, Xiang Yu, Weihua Zhao, Lei Guo | Beihang University,Nanyang Technological University | Aerial Robotics | | A Comparative Study of Nonlinear MPC and Differential-Flatness-Based Control for Quadrotor Agile Flight | Sihao Sun, Angel Romero, Philipp Foehn, Elia Kaufmann, Davide Scaramuzza | University of Twente,University of Zurich | Aerial Robotics | | Model Predictive Contouring Control for Time-Optimal Quadrotor Flight | Angel Romero, Sihao Sun, Philipp Foehn, Davide Scaramuzza | University of Zurich,University of Twente | Aerial Robotics | | Automating Vascular Shunt Insertion with the dVRK Surgical Robot | Karthik Dharmarajan, William Panitch, Muyan Jiang, Kishore Srinivas, Baiyu Shi, Yahav Avigal, Huang Huang, Thomas Low, Danyal Fer, Ken Goldberg | UC Berkeley,University of California, Berkeley,University of California at Berkeley,SRI International,University of California, San Francisco East Bay | Medical Robotics II | | CogniDaVinci: Towards Estimating Mental Workload Modulated by Visual Delays During Telerobotic Surgery -- an EEG-Based Analysis | Satyam Kumar, Deland Hu Liu, Frigyes Samuel Racz, Manuel Retana, Susheela Sharma, Fumiaki 
Iwane, Braden Murphy, Rory O'keeffe, S. Farokh Atashzar, Farshid Alambeigi, Jose del R. Millan | The University of Texas at Austin,University of Texas at Austin,UNIVERSITY OF TEXAS, AUSTIN,National Institutes of Health,New York University,New York University (NYU), US | Medical Robotics II | | Exploring an External Approach to Subretinal Drug Delivery Via Robot Assistance and B-Mode OCT | Elan Ahronovich, Neel Shihora, Jin-Hui Shen, Karen Joos, Nabil Simaan | Vanderbilt ARMA,Vanderbilt University | Medical Robotics II | | Towards Surgical Context Inference and Translation to Gestures | Kay Hutchinson, Zongyu Li, Ian Reyes, Homa Alemzadeh | University of Virginia,The University of Virginia,IBM | Medical Robotics II | | A Method to Use Haptic Feedback of Laryngoscope Force Vector for Endotracheal Intubation Training | Haonan Zhou, Siyu Yang, Louis Halamek, Thrishantha Nanayakkara | Imperial College London,Stanford University | Medical Robotics II | | A Hydraulic Soft Robotic Detrusor Based on an Origami Design | Simone Onorati, Federica Semproni, Linda Paterno, Giada Casagrande, Veronica Iacovacci, Arianna Menciassi | The BioRobotics Institute - Scuola Superiore S. 
Anna,Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna - SSSA | Medical Robotics II | | Semi-Autonomous Robotic Control of a Self-Shaping Cochlear Implant | Daniel Bautista-Salinas, Conor Kirby, Mohamed Essam Mohamed Kassem Abdel, Burak Temelkuran, Charlie T Huins, Ferdinando Rodriguez Y Baena | Imperial College London,Queen Elizabeth Hospital Birmingham,Imperial College, London, UK | Medical Robotics II | | A Hybrid Steerable Robot with Magnetic Wrist for Minimally Invasive Epilepsy Surgery | Changyan He, Robert Hideki Nguyen, Cameron Forbrigger, James Drake, Thomas Looi, Eric Diller | University of Toronto,The Hospital for Sick Children,Hospital for Sick Children, University of Toronto,Hospital for Sick Children | Medical Robotics II | | Induced Vertex Motion As a Performance Measure for Surgery in Confined Spaces | Neel Shihora, Nabil Simaan | Vanderbilt University | Surgical Robotics | | Foot Gestures to Control the Grasping of a Surgical Robot | Yijun Cheng, Yanpei Huang, Ziwei Wang, Etienne Burdet | Imperial College London,Lancaster University,imperial college london | Surgical Robotics | | Design and Development of a Novel Force-Sensing Robotic System for the Transseptal Puncture in Left Atrial Catheter Ablation | Aya Mutaz Zeidan, Zhouyang Xu, Christopher Edwin Mower, Honglei Wu, Quentin Walker, Oyinkansola Ayoade, Natalia Cotic, Jonathan Behar, Steven Williams, Aruna Arujuna, Yohan Noh, Richard James Housden, Kawal Rhode | King's College London,King’s College London,Brunel University London | Surgical Robotics | | Surgical-VQLA: Transformer with Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery | Long Bai, Mobarakol Islam, Lalithkumar Seenivasan, Hongliang Ren | The Chinese University of Hong Kong,University College London,National University of Singapore,Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS) | Surgical Robotics | | Implicit Neural Field Guidance for Teleoperated Robot‐assisted Surgery | 
Heng Zhang, Lifeng Zhu, Jiangwei Shen, Song Aiguo | Southeast University | Navigation | | Bidirectional Generalised Rigid Point Set Registration | Ang Zhang, Zhe Min, Li Liu, Max Qing Hu Meng | The Chinese University of Hong Kong,University College London | Navigation | | Finding the Optimal Incision Point in Robotic Assisted Surgery | Kyriakos Almpanidis, Theodora Kastritsi, Zoe Doulgeri | Aristotle University of Thessaloniki | Navigation | | Development and Experimental Verification of a 3D Dynamic Absolute Nodal Coordinate Formulation Model of Flexible Prostate Biopsy/Brachytherapy Needles | Athanasios Martsopoulos, Thomas Hill, Raj Persad, Stefanos Bolomytis, Antonia Tzemanaki | University of Bristol,Bristol Urological Institute, Southmead Hospital, Bristol,North Bristol NHS Trust | Navigation | | Collaborative Robotic Biopsy with Trajectory Guidance and Needle Tip Force Feedback | Robin Mieling, Maximilian Neidhardt, Sarah Latus, Carolin Stapper, Stefan Gerlach, Inga Kniep, Axel Heinemann, Benjamin Ondruschka, Alexander Schlaefer | Hamburg University of Technology,University Medical Center Hamburg-Eppendorf | Navigation | | Development and Evaluation of a Robotic Vessel Positioning System for Semi-Automatic Microvascular Anastomosis | Jesse Haworth, Justin Opfermann, Michael Kam, Yaning Wang, Robin Yang, Jin Kang, Axel Krieger | Johns Hopkins University,Johns Hopkins Medicine,the Johns Hopkins University | Navigation | | Robotic Sonographer: Autonomous Robotic Ultrasound Using Domain Expertise in Bayesian Optimization | Deepak Raina, Sh Chandrashekhara, Richard Voyles, Juan Wachs, Subir Kumar Saha | Indian Institute of Technology Delhi and Purdue University USA,All India Institute of Medical Sciences, New Delhi,Purdue University,Indian Institute of Technology Delhi | Navigation | | Autonomous Intelligent Navigation for Flexible Endoscopy Using Monocular Depth Guidance and 3-D Shape Planning | Yiang Lu, Ruofeng Wei, Bin Li, Wei Chen, Jianshu Zhou, Qi Dou, Dong 
Sun, Yunhui Liu | The Chinese University of Hong Kong,City University of Hong Kong,Chinese University of Hong Kong | Navigation | | A Probabilistic Rotation Representation for Symmetric Shapes with an Efficiently Computable Bingham Loss Function | Hiroya Sato, Takuya Ikeda, Koichi Nishiwaki | The University of Tokyo,Woven Planet Holdings, Inc.,Woven Alpha | Probability and Statistical Methods | | Topological Trajectory Prediction with Homotopy Classes | Jennifer Wakulicz, Ki Myung Brian Lee, Teresa A. Vidal-Calleja, Robert Fitch | University of Technology Sydney, Centre for Autonomous Systems,University of Technology Sydney | Probability and Statistical Methods | | Information-Theoretic Abstraction of Semantic Octree Models for Integrated Perception and Planning | Daniel Larsson, Arash Asgharivaskasi, Jaein Lim, Nikolay A. Atanasov, Panagiotis Tsiotras | Georgia Institute of Technology,University of California, San Diego,Georgia Tech | Probability and Statistical Methods | | BO-ICP: Initialization of Iterative Closest Point Based on Bayesian Optimization | Harel Biggie, Andrew Beathard, Christoffer Heckman | University of Colorado Boulder,University of Colorado, Boulder,University of Colorado at Boulder | Probability and Statistical Methods | | DuEqNet: Dual-Equivariance Network in Outdoor 3D Object Detection for Autonomous Driving | Xihao Wang, Jiaming Lei, Hai Lan, Arafat Al-Jawari, Xian Wei | Technical University of Munich,Fujian Institute of Research on the Structure of Matter,East China Normal University | Object Detection II | | NVRadarNet: Real-Time Radar Obstacle and Free Space Detection for Autonomous Driving | Alexander Popov, Patrik Gebhardt, Ke Chen, Ryan Oldja, Hee Seok Lee, Shane Murray, Ruchi Bhargava, Nikolai Smolyanskiy | NVIDIA,NVIDIA Corporation,Nvidia,nvidia | Object Detection II | | TransRSS: Transformer-Based Radar Semantic Segmentation | Hao Zou, Harry Xie, Jiarong Ou, Gao Yutao | Alibaba group,Alibaba Group,Alibaba | Object Detection II | | 
Source-Free Unsupervised Domain Adaptation for 3D Object Detection in Adverse Weather | Deepti Hegde, Velat Kilic, Vishwanath Sindagi, A. Brinton Cooper, Mark Foster, Vishal M. Patel | Johns Hopkins University,The Johns Hopkins University | Object Detection II | | Bayesian Deep Learning for Affordance Segmentation in Images | Lorenzo Mur Labadia, Ruben Martinez-Cantin, Josechu Guerrero | University of Zaragoza,Universidad de Zaragoza | Object Detection II | | Multi-View Keypoints for Reliable 6D Object Pose Estimation | Alan Li, Angela P. Schoellig | University of Toronto,TU Munich | Object Detection II | | Towards Unsupervised Filtering of Millimetre-Wave Radar Returns for Autonomous Vehicle Road Following | Dean Sacoransky, Joshua Marshall, Keyvan Hashtrudi-zaad | Queen's University | Object Detection II | | Domain Generalised Fully Convolutional One Stage Detection | Karthik Seemakurthy, Petra Bosilj, Erchan Aptoula, Charles W. Fox | University of Lincoln,Sabanci University | Object Detection II | | GNN-Based Point Cloud Maps Feature Extraction and Residual Feature Fusion for 3D Object Detection | Wei-Hsiang Liao, Chieh-Chih (Bob) Wang, Wen-chieh Lin | National Yang Ming Chiao Tung University | Object Detection and Segmentation | | Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos | Shiyang Lu, Yunfu Deng, Abdeslam Boularias, Kostas E. 
Bekris | Rutgers University,Shenzhen Institutes of Advanced Technology, Chinese Academy of S,Rutgers, the State University of New Jersey | Object Detection and Segmentation | | Depth Is All You Need for Monocular 3D Detection | Dennis Park, Jie Li, Dian Chen, Vitor Guizilini, Adrien Gaidon | Toyota Research Institute | Object Detection and Segmentation | | Towards Visual Classification under Class Ambiguity | Viktor Kozák, Jan Mikula, Lukáš Bertl, Karel Kosnar, Libor Přeučil | Faculty of Electrical Engineering – Czech Technical University in Prague,Czech Technical University in Prague,Czech Technical University in Prague, CIIRC | Object Detection and Segmentation | | LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations | Zhaoqi Leng, Guowang Li, Chenxi Liu, Ekin Cubuk, Pei Sun, Tong He, Dragomir Anguelov, Mingxing Tan | Waymo LLC,Waymo,Google,Waymo Research | Object Detection and Segmentation | | HFT: Lifting Perspective Representations Via Hybrid Feature Transformation for BEV Perception | Jiayu Zou, Zheng Zhu, Junjie Huang, Tian Yang, Guan Huang, Xingang Wang | Institute of Automation, Chinese Academy of Sciences,Phigent Robotics,PhiGent Robotics,Research Center of Precision Sensing and Control, Institute of A | Object Detection and Segmentation | | Radar Velocity Transformer: Single-Scan Moving Object Segmentation in Noisy Radar Point Clouds | Matthias Zeller, Vardeep Singh Sandhu, Benedikt Mersch, Jens Behley, Michael Heidingsfeld, Cyrill Stachniss | CARIAD SE,University of Bonn, CARIAD,University of Bonn | Object Detection and Segmentation | | CurveFormer: 3D Lane Detection by Curve Propagation with Curve Queries and Attention | Yifeng Bai, Zhirong Chen, Zhangjie Fu, Lang Peng, Pengpeng Liang, Erkang Cheng | University of Science and Technology of China,Nullmax,Southeast university,Zhengzhou University,Nullmax Inc | Object Detection and Segmentation | | Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with 
Latent-MaskRCNN | YuXuan (Andrew) Liu, Nikhil Mishra, Pieter Abbeel, Xi Chen | Covariant.ai, UC Berkeley,UC Berkeley,covariant.ai,UC Berkeley,Embodied Intelligence, UC Berkeley | Object Detection and Segmentation | | Bayesian Inference of Fog Visibility from LiDAR Point Clouds and Correlation with Probabilities of Detection | Karl Montalban, Christophe Reymann, Dinesh Atchuthan, Paul-Édouard Dupouy, Nicolas Riviere, Simon Lacroix | easymile,EASYMILE SAS,EasyMile,ONERA,LAAS/CNRS | Object Detection and Segmentation | | GDIP: Gated Differentiable Image Processing for Object Detection in Adverse Conditions | Sanket Kalwar, Dhruv Patel, Aakash Aanegola, Krishna Konda, Sourav Garg, Madhava Krishna | International Institute of Information Technology, Hyderabad,International Institute of Information Technology, Hyderabad, In,ZF TCI,Queensland University of Technology,IIIT Hyderabad | Object Detection and Segmentation | | Sample, Crop, Track: Self-Supervised Mobile 3D Object Detection for Urban Driving LiDAR | Sangyun Shin, Stuart Golodetz, Madhu Vankadari, Zhou Kaichen, Andrew Markham, Niki Trigoni | University of Oxford,Oxford University | Object Detection and Segmentation | | Topology Matching of Branched Deformable Linear Objects | Manuel Zürn, Markus Wnuk, Armin Lechler, Alexander Verl | Institute for control engineering of machine tools and manufactu,University Stuttgart,University of Stuttgart | Perception of Deformable Objects | | DLOFTBs – Fast Tracking of Deformable Linear Objects with B-Splines | Piotr Kicki, Amadeusz Szymko, Krzysztof Walas | Poznan University of Technology | Perception of Deformable Objects | | Self-Supervised Cloth Reconstruction Via Action-Conditioned Cloth Tracking | Zixuan Huang, Xingyu Lin, David Held | University of Michigan,Carnegie Mellon University | Perception of Deformable Objects | | Learning to Estimate 3-D States of Deformable Linear Objects from Single-Frame Occluded Point Clouds | Kangchen Lv, Mingrui Yu, Yifan Pu, Xin Jiang, 
Gao Huang, Xiang Li | Tsinghua University,Beijing Academy of Artificial Intelligence | Perception of Deformable Objects | | Feature Extraction for Effective and Efficient Deep Reinforcement Learning on Real Robotic Platforms | Peter Bohm, Pauline Pounds, Archie Chapman | The University of Queensland | Reinforcement Learning I | | Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation | Luca Marzari, Enrico Marchesini, Alessandro Farinelli | University of Verona,Northeastern University | Reinforcement Learning I | | Learning to View: Decision Transformers for Active Object Detection | Wenhao Ding, Nathalie Majcherczyk, Mohit Deshpande, Xuewei Qi, Ding Zhao, Rajasimman Madhivanan, Arnab Sen | Carnegie Mellon University,Amazon LLC,Amazon Lab,,,,Toyota North America R&D Labs,Carnegie mellon university,Amazon.com,Amazon | Reinforcement Learning I | | Deep Reinforcement Learning for Autonomous Driving Using High-Level Heterogeneous Graph Representations | Maximilian Schier, Christoph Reinders, Bodo Rosenhahn | Leibniz Universität Hannover,Leibniz University Hanover,Institute of Information Processing, Leibniz Universität Hannove | Reinforcement Learning I | | Learning on the Job: Self-Rewarding Offline-To-Online Finetuning for Industrial Insertion of Novel Connectors from Vision | Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine | UC Berkeley,University of California, Berkeley; Siemens,Worcester Polytechnic Institute,Siemens Corporation | Reinforcement Learning I | | Multi-Alpha Soft Actor-Critic: Overcoming Stochastic Biases in Maximum Entropy Reinforcement Learning | Conor Igoe, Swapnil Pande, Siddarth Venkatraman, Jeff Schneider | Carnegie Mellon University,Manipal Institute of Technology | Reinforcement Learning I | | Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning | Zheng Wu, Yichen Xie, Wenzhao Lian, Changhao Wang, Yanjiang Guo, Jianyu Chen, 
Stefan Schaal, Masayoshi Tomizuka | University of California, Berkeley,Google X,Tsinghua university,Tsinghua University,University of California | Reinforcement Learning I | | Real World Offline Reinforcement Learning with Realistic Data Source | Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind Rajeswaran, Vikash Kumar | Carnegie Mellon University,University of Washington,Meta AI | Reinforcement Learning I | | Robotic Table Wiping Via Reinforcement Learning and Whole-Body Trajectory Optimization | Thomas Lew, Sumeet Singh, Mario Prats, Jeffrey Bingham, Jonathan Weisz, Benjie Holson, Xiaohan Zhang, Vikas Sindhwani, Yao Lu, Fei Xia, Peng Xu, Tingnan Zhang, Jie Tan, Montse Gonzalez Arenas | Stanford University,Google,X,Everyday Robots,Binghamton University,Google Brain, NYC,Google Inc | Reinforcement Learning I | | Towards True Lossless Sparse Communication in Multi-Agent Systems | Seth Karten, Mycal Tucker, Siva Kailas, Katia Sycara | Carnegie Mellon University,Massachusetts Institute of Technology | Reinforcement Learning I | | Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning | Cheng Liu, Erik-jan Van Kampen, Guido De Croon | Delft University of Technology,TU Delft | Reinforcement Learning I | | Self-Adaptive Driving in Nonstationary Environments through Conjectural Online Lookahead Adaptation | Tao Li, Haozhe Lei, Quanyan Zhu | New York University | Reinforcement Learning I | | Sim-To-Real Policy and Reward Transfer with Adaptive Forward Dynamics Model | Rongshun Juan, Hao Ju, Jie Huang, Randy Gomez, Keisuke Nakamura, Guangliang Li | Tianjin University,Ocean University of China,Honda Research Institute Japan Co., Ltd. 
| Transfer Learning | | Safety-Constrained Policy Transfer with Successor Features | Zeyu Feng, Bowen Zhang, Jianxin Bi, Harold Soh | National University of Singapore | Transfer Learning | | GNM: A General Navigation Model to Drive Any Robot | Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine | University of California, Berkeley,UC Berkeley,UC Berkeley / TOYOTA Motor North America | Transfer Learning | | ViPFormer: Efficient Vision-And-Pointcloud Transformer for Unsupervised Pointcloud Understanding | Hongyu Sun, Yongcai Wang, Xudong Cai, Xuewei Bai, Deying Li | Renmin University of China | Transfer Learning | | Learning Exploration Strategies to Solve Real-World Marble Runs | Alisa Allaire, Christopher Atkeson | Carnegie Mellon University,CMU | Learning Methods | | Multi-Embodiment Legged Robot Control As a Sequence Modeling Problem | Chen Yu, Weinan Zhang, Hang Lai, Zheng Tian, Laurent Kneip, Jun Wang | ShanghaiTech University,Shanghai Jiao Tong University,University College London | Learning Methods | | Efficient Recovery Learning Using Model Predictive Meta-Reasoning | Shivam Vats, Maxim Likhachev, Oliver Kroemer | Carnegie Mellon University | Learning Methods | | Multi-Swarm Genetic Gray Wolf Optimizer with Embedded Autoencoders for High-Dimensional Expensive Problems | Jing Bi, Jiahui Zhai, Haitao Yuan, Ziqi Wang, Junfei Qiao, Jia Zhang, Mengchu Zhou | Beijing University of Technology, Beijing ,,,,,,, China,Beijing University of Technology,Beihang University,Southern Methodist University,New Jersey Institute of Technology | Learning Methods | | H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions | Kei Ota, Hsiao-yu Tung, Kevin Smith, Anoop Cherian, Tim K. 
Marks, Alan Sullivan, Asako Kanezaki, Joshua Tenenbaum | Tokyo Institute of Technology,CMU,Massachusetts Institute of Technology,Mitsubishi Electric Research Labs,Mitsubishi Electric Research Laboratories (MERL),Mitsubishi Electric Research Lab | Learning Methods | | Self-Supervised Learning of Action Affordances As Interaction Modes | Liquan Wang, Nikita Dvornik, Rafael Dubeau, Mayank Mittal, Animesh Garg | University of Toronto,Samsung,ETH Zurich | Learning Methods | | LATTE: LAnguage Trajectory TransformEr | Arthur Fender Coelho Bucker, Luis Felipe Cruz Figueredo, Sami Haddadin, Ashish Kapoor, Shuang Ma, Sai Vemprala, Rogerio Bonatti | Carnegie Mellon University,Technical University of Munich (TUM),Technical University of Munich,MicroSoft,Microsoft,Microsoft Corporation | Learning Methods | | Learning Visual Locomotion with Cross-Modal Supervision | Antonio Loquercio, Ashish Kumar, Jitendra Malik | UC Berkeley | Learning Methods | | MMIC-I: A Robotic Platform for Assembly Integration and Internal Locomotion through Mechanical Meta-Material Structures | Olivia Irene Formoso, Greenfield Trinh, Damiana Catanoso, In-won Park, Christine Gregg, Kenneth C. Cheung | NASA Ames Research Center,National Aeronautics and Space Administration (NASA) | Novel Actuation and Actuators | | Flow-Based Rendezvous and Docking for Marine Modular Robots in Gyre-Like Environments | Gedaliah Knizhnik, Peihan Li, Mark Yim, M. 
Ani Hsieh | RRAI, University of Pennsylvania,Drexel University,University of Pennsylvania | Novel Actuation and Actuators | | Mobility Analysis of Screw-Based Locomotion and Propulsion in Various Media | Jason Lim, Florian Richter, Dimitri Schreiber, Peter Gavrilov, Lizzie Peiros, Mingwei Yeoh, Calvin Joyce, Sara Wickenhiser, Michael Yip | University of Nevada, Reno,University of California, San Diego,University of California,University of California San Diego | Novel Actuation and Actuators | | TJ-FlyingFish: Design and Implementation of an Aerial-Aquatic Quadrotor with Tiltable Propulsion Units | Xuchen Liu, Minghao Dou, Dongyue Huang, Songqun Gao, Ruixin Yan, Biao Wang, Jinqiang Cui, Qinyuan Ren, Lihua Dou, Zhi Gao, Jie Chen, Ben M. Chen | The Chinese University of Hong Kong,Chinese University of Hong Kong,Nanjing University of Aeronautics and Astronautics,Peng Cheng Laboratory,Zhejiang University,Beijing Institute of Technology,Wuhan University,Tongji University | Novel Actuation and Actuators | | Modular Multi-Axis Elastic Actuator with Torque Sensing Capable P-CFH for Highly Impact Resistive Robot Leg | Youngrae Kim, Sunghyun Choi, Jinhyeok Song, Dongwon Yun | Daegu Gyeongbuk Institute of Science and Technology (DGIST), Dae,Daegu Gyeongbuk Institute of Science & Technology,DGIST,Daegu Gyeongbuk Institute of Science and Technology (DGIST) | Novel Actuation and Actuators | | Design and Mechanics of Cable-Driven Rolling Diaphragm Transmission for High-Transparency Robotic Motion | Hoi Man Lam, Jared Walker, Lucas Jonasch, Dimitri Schreiber, Michael Yip | University of California San Diego,University of California,University of California, San Diego | Novel Actuation and Actuators | | Twist Snake: Plastic Table-Top Cable-Driven Robotic Arm with All Motors Located at the Base Link | Kazutoshi Tanaka, Masashi Hamaya | OMRON SINIC X Corporation | Novel Actuation and Actuators | | Strained Elastic Surfaces with Adjustable-Modulus Edges (SESAMEs) for Soft Robotic 
Actuation | Christopher Kimmer, Michael Seokyoung Han, Cindy Harnett | Indiana University Southeast,University of Louisville | Novel Actuation and Actuators | | Controllable Mechanical-Domain Energy Accumulators | Sung Kim, David Braun | Vanderbilt University | Compliant Joints and Mechanisms | | Concept Design of a New XY Compliant Parallel Manipulator with Spatial Configuration | Zekui Lyu, Qingsong Xu | University of Macau | Compliant Joints and Mechanisms | | Computational Design of 3D-Printable Compliant Mechanisms with Bio-Inspired Sliding Joints | Felipe Velasquez, Bernhard Thomaszewski, Stelian Coros | ETH Zurich,Université de Montréal | Compliant Joints and Mechanisms | | Compliant Finger Joint with Controlled Variable Stiffness Based on Twisted Strings Actuation | Mihai Dragusanu, Danilo Troisi, Domenico Prattichizzo, Monica Malvezzi | University of Siena | Compliant Joints and Mechanisms | | Design of a Variable Stiffness Spring with Human-Selectable Stiffness | Chase Mathews, David Braun | Vanderbilt University | Compliant Joints and Mechanisms | | Novel Spring Mechanism Enables Iterative Energy Accumulation under Force and Deformation Constraints | Cole Dempsey, David Braun | Vanderbilt University | Compliant Joints and Mechanisms | | Fast, Reliable Constrained Manipulation Using a VSA Driven Planar Robot | Andrew Bernhard, Joseph Schimmels | Argonne National Laboratory,Marquette University | Compliant Joints and Mechanisms | | A Stiffness-Changeable Soft Finger Based on Chain Mail Jamming | Zhengtao Hu, Abdullah Ahmed, Weiwei Wan, Tetsuyou Watanabe, Kensuke Harada | Osaka University,Kanazawa University | Compliant Joints and Mechanisms | | Repetitive Twisting Durability of Synthetic Fiber Ropes | Shinya Sadachika, Masahito Kanekiyo, Hiroyuki Nabae, Gen Endo | Tokyo Institute of Technology | Mechanism Design | | Computational Design of Closed-Chain Linkages: Hopping Robot Driven by Morphological Computation | Kirill Nasonov, Dmitriy Ivolga, Ivan 
Borisov, Sergey Kolyubin | ITMO University,ITMO | Mechanism Design | | Trajectory Planning Issues in Cuspidal Commercial Robots | Durgesh Haribhau Salunkhe, Damien Chablat, Philippe Wenger | CNRS-UMR,,,,-CD,,,,-LS,N,Laboratoire des Sciences du Numérique de Nantes,Ecole Centrale de Nantes - CNRS | Mechanism Design | | Big Data Approach for Synthesizing a Spatial Linkage Mechanism | Neung Hwan Yim, Jegyeong Ryu, Yoon Young Kim | Seoul National University,Korea Institute of Science and Technology | Mechanism Design | | Croche-Matic: A Robot for Crocheting 3D Cylindrical Geometry | Gabriella Perry, Jose Luis Garcia Del Castillo Y Lopez, Nathan Melenbrink | Harvard University | Mechanism Design | | A Novel Platform to Control Biofouling in Pearl Oysters Cultivation | Van-nhan Tran, Quan-dung Pham, Tan-sang Ha, Wong Yue Him, Sai-kit Yeung | Hong Kong University of Science and Technology,Shenzhen University | Mechanism Design | | Embedded Active Stiffening Mechanisms to Modulate Kresling Tower Kinetostatic Properties | John Berre, Lennart Rubbert, Francois Geiskopf, Pierre Renaud | INSA Strasbourg, University of Strasbourg, CNRS,INSA - Strasbourg,INSA de Strasbourg,ICube | Mechanism Design | | A Compact, Two-Part Torsion Spring Architecture | Zachary Bons, Gray Thomas, Luke Mooney, Elliott Rouse | University of Michigan,Dephy, Inc. 
| Award Finalists 1 | | HREyes: Design, Development, and Evaluation of a Novel Method for AUVs to Communicate Information and Gaze Direction | Michael Fulton, Aditya Prabhu, Junaed Sattar | University of Minnesota,University of Minnesota, Twin Cities | Human-Robot Collaboration I | | Dense Depth Completion Based on Multi-Scale Confidence and Self-Attention Mechanism for Intestinal Endoscopy | Ruyu Liu, Zhengzhe Liu, Haoyu Zhang, Guodao Zhang, Zhigui Zuo, Weiguo Sheng | Hangzhou Normal University,Hangzhou Dianzi University,the First Affiliated Hospital of Wenzhou Medical University | Human-Robot Collaboration I | | Design of an Energy-Aware Cartesian Impedance Controller for Collaborative Disassembly | Sebastian Hjorth, Edoardo Lamon, Dimitrios Chrysostomou, Arash Ajoudani | Aalborg University,Istituto Italiano di Tecnologia | Human-Robot Collaboration I | | Towards Robots That Influence Humans Over Long-Term Interaction | Shahabedin Sagheb, Ye-ji Mun, Neema Ahmadian, Benjamin Christie, Andrea Bajcsy, Katherine Driggs-Campbell, Dylan Losey | Virginia Tech,University of Illinois at Urbana-Champaign,University of California Berkeley | Human-Robot Collaboration I | | Carrying the Uncarriable: A Deformation-Agnostic and Human-Cooperative Framework for Unwieldy Objects Using Multiple Robots | Doganay Sirintuna, Idil Özdamar, Arash Ajoudani | HRI, Lab., Istituto Italiano di Tecnologia. Dept. 
of Informatics,Istituto Italiano di Tecnologia | Human-Robot Collaboration I | | A Control Approach for Human-Robot Ergonomic Payload Lifting | Lorenzo Rapetti, Carlotta Sartore, Mohamed Elobaid, Yeshasvi Tirupachuri, Francesco Draicchio, Tomohiro Kawakami, Takahide Yoshiike, Daniele Pucci | IIT,Istituto Italiano di Tecnologia,Fondazione Istituto Italiano di Tecnologia,Italian Institute of Technology,INAIL, Department of Occupational & Environmental Medicine, Mont,Honda R&D Co., Ltd.,Honda Research Institute Japan | Award Finalists 3 | | Active Reward Learning from Online Preferences | Vivek Myers, Erdem Bıyık, Dorsa Sadigh | UC Berkeley,Stanford University | Human-Robot Collaboration I | | Supernumerary Robotic Limbs for Next Generation Space Suit Technology | Erik Ballesteros, Brandon Man, Harry Asada | Massachusetts Institute of Technology,Cornell University,MIT | Human-Robot Collaboration I | | It Takes Two: Learning to Plan for Human-Robot Cooperative Carrying | Eley Ng, Ziang Liu, Monroe Kennedy | Stanford University,University of Southern California | Human-Robot Collaboration I | | Collision Detection and Contact Point Estimation Using Virtual Joint Torque Sensing Applied to a Cobot | Dario Zurlo, Tom Heitmann, Merlin Morlock, Alessandro De Luca | Sapienza Università di Roma,NEURA Robotics GmbH,Sapienza University of Rome | Human-Robot Collaboration I | | The Human Gaze Helps Robots Run Bravely and Efficiently in Crowds | Qianyi Zhang, Zhengxi Hu, Yinuo Song, Jiayi Pei, Jingtai Liu | Nankai University,NanKai University,NanKai University | Human-Robot Collaboration I | | A Gaze-Speech System in Mixed Reality for Human-Robot Interaction | John David Prieto Prada, Myung Ho Lee, Cheol Song | DGIST | Human-Robot Collaboration I | | ADAPT: Action-Aware Driving Caption Transformer | Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu | Institute of Automation, Chinese Academy of Sciences,Xidian 
University,Institute of Automation,Chinese Academy of Sciences,Institute for AI Industry Research (AIR), Tsinghua University,Tsinghua University,Southern University of Science and Technology,Beihang University | Human-Robot Interaction | | Aligning Human Preferences with Baseline Objectives in Reinforcement Learning | Daniel Marta, Simon Holk, Christian Pek, Jana Tumova, Iolanda Leite | KTH Royal Institute of Technology | Human-Robot Interaction | | EWareNet: Emotion Aware Pedestrian Intent Prediction and Adaptive Spatial Profile Fusion for Social Robot Navigation | Venkatraman Narayanan, Bala Murali Manoghar Sai Sudhakar, Rama Prashanth Ramasamy Vijayakumar, Aniket Bera | UMD,University of Maryland, College Park,University of Maryland,Purdue University | Human-Robot Interaction | | SCAN: Socially-Aware Navigation Using Monte Carlo Tree Search | Jeongwoo Oh, Jae Seok Heo, Junseo Lee, Gunmin Lee, Minjae Kang, Jeongho Park, Songhwai Oh | Seoul National University,Seoul National University (SNU),Seoul National University | Human-Robot Interaction | | SGPT: The Secondary Path Guides the Primary Path in Transformers for HOI Detection | Sixian Chan, Weixiang Wang, Zhanpeng Shao, Cong Bai | Zhejiang University of Technology,Hunan Normal University | Human-Robot Interaction | | Robot Person Following under Partial Occlusion | Hanjing Ye, Jieting Zhao, Yaling Pan, Weinan Chen, Li He, Hong Zhang | Southern University of Science and Technology,Guangdong University of Technology,SUSTech | Human-Robot Interaction | | A Little Bit Attention Is All You Need for Person Re-Identification | Markus Eisenbach, Jannik Lübberstedt, Dustin Aganian, Horst-Michael Gross | Ilmenau University of Technology | Human-Robot Interaction | | Automatic Generation of Robot Facial Expressions with Preferences | Bing Tang, Rongyun Cao, Rongya Chen, Bei Hua, Xiaoping Chen, Feng Wu | University of Science and Technology of China,Institute of Advanced Technology, University of Science and Tech,University of science and 
technology of China | Human-Robot Interaction | | A Task Allocation Framework for Human Multi-Robot Collaborative Settings | Martina Lippi, Paolo Augusto Di Lillo, Alessandro Marino | University of Roma Tre,University of Cassino and Southern Lazio | Human-Robot Interaction | | TOP-JAM: A Bio-Inspired Topology-Based Model of Joint Attention for Human-Robot Interaction | Hendry F. Chame, Aurélie Clodic, Alami Rachid | University of Lorraine / CNRS,LAAS - CNRS,CNRS | Human-Robot Interaction | | NOPA: Neurally-Guided Online Probabilistic Assistance for Building Socially Intelligent Home Assistants | Xavier Puig, Tianmin Shu, Joshua Tenenbaum, Antonio Torralba | MIT,Massachusetts Institute of Technology | Human-Robot Interaction | | Embodied Referring Expression for Manipulation Question Answering in Interactive Environment | Qie Sima, Sinan Tan, Huaping Liu, Fuchun Sun | Tsinghua University | Human-Robot Interaction | | Congestion Prediction for Large Fleets of Mobile Robots | Ge Yu, Michael Wolf | Amazon | Multi-Robot Systems IV | | Decentralised Active Perception in Continuous Action Spaces for the Coordinated Escort Problem | Rhett Hull, Ki Myung Brian Lee, Jennifer Wakulicz, Chanyeol Yoo, James Mcmahon, Bryan Clarke, Stuart Anstee, Jijoong Kim, Robert Fitch | University of Technology Sydney,University of Technology Sydney, Centre for Autonomous Systems,The Naval Research Laboratory,University of Sydney,Defence Science and Technology Group,Defence Science and Technology Organisation | Multi-Robot Systems IV | | Socially Fair Coverage Control | Matthew Malencia, George J. 
Pappas, Vijay Kumar | University of Pennsylvania | Multi-Robot Systems IV | | Exploiting Trust for Resilient Hypothesis Testing with Malicious Robots | Matthew Cavorsi, Orhan Akgun, Michal Yemini, Andrea Goldsmith, Stephanie Gil | Harvard University,Bar-Ilan University,Stanford University | Multi-Robot Systems IV | | Obscuring Objectives with Pareto-Optimal Privacy-Aware Trajectories in Multi-Robot Coverage | Brennan Brodt, Alyssa Pierson | Boston University | Multi-Robot Systems IV | | Safe and Distributed Multi-Agent Motion Planning under Minimum Speed Constraints | Inkyu Jang, Jungwon Park, H. Jin Kim | Seoul National University | Multi-Robot Systems IV | | Minimally Constrained Multi-Robot Coordination with Line-Of-Sight Connectivity Maintenance | Yupeng Yang, Yiwei Lyu, Wenhao Luo | University of North Carolina at Charlotte,Carnegie Mellon University | Multi-Robot Systems IV | | Relay Pursuit for Multirobot Target Tracking on Tile Graphs | Shashwata Mandal, Sourabh Bhattacharya | Iowa State University | Multi-Robot Systems IV | | Passivity-Based Decentralized Control for Collaborative Grasping of Under-Actuated Aerial Manipulators | Jinyeong Jeong, Min Jun Kim | Korea Advanced Institute of Science and Technology,KAIST | Multi-Robot Systems IV | | Distributed Barrier Function-Enabled Human-In-The-Loop Control for Multi-Robot Systems | Victor Nan Fernandez-Ayala, Xiao Tan, Dimos V. Dimarogonas | KTH Royal Institute of Technology,KTH Royal Institute of Technology, Sweden | Multi-Robot Systems IV | | LEMURS: Learning Distributed Multi-Robot Interactions | Eduardo Sebastián, Thai Duong, Nikolay A. 
Atanasov, Eduardo Montijano, Carlos Sagues | University of Zaragoza,University of California, San Diego,Universidad de Zaragoza | Multi-Robot Systems IV | | Multi-Agent Active Search Using Detection and Location Uncertainty | Arundhati Banerjee, Ramina Ghods, Jeff Schneider | Carnegie Mellon University | Multi-Robot Systems IV | | HMAAC: Hierarchical Multi-Agent Actor-Critic for Aerial Search with Explicit Coordination Modeling | Chuanneng Sun, Songjun Huang, Dario Pompili | Rutgers University | Search, Rescue, and Hazardous Field Robotics | | GUTS: Generalized Uncertainty-Aware Thompson Sampling for Multi-Agent Active Search | Nikhil Angad Bakshi, Tejus Gupta, Ramina Ghods, Jeff Schneider | Carnegie Mellon University | Award Finalists 3 | | CLIO: A Novel Robotic Solution for Exploration and Rescue Missions in Hostile Mountain Environments | Michele Focchi, Mohamed Bensaadallah, Marco Frego, Angelika Peer, Daniele Fontanelli, Andrea Del Prete, Luigi Palopoli | Università di Trento,University of Batna ,,Free University of Bolzano,University of Trento | Search, Rescue, and Hazardous Field Robotics | | Towards Efficient Gas Leak Detection in Built Environments: Data-Driven Plume Modeling for Gas Sensing Robots | Wanting Jin, Faezeh Rahbar, Chiara Ercolani, Alcherio Martinoli | EPFL | Search, Rescue, and Hazardous Field Robotics | | Image-To-Image Translation for Autonomous Driving from Coarsely-Aligned Image Pairs | Youya Xia, Josephine Monica, Wei-Lun Chao, Bharath Hariharan, Kilian Weinberger, Mark Campbell | Cornell University | Self-Driving Cars II | | Small-Shot Multi-Modal Distillation for Vision-Based Autonomous Steering | Yu Shen, Luyu Yang, Xijun Wang, Ming C. 
Lin | University of Maryland,University of Maryland, College Park,University of Maryland at College Park | Self-Driving Cars II | | SceneCalib: Automatic Targetless Calibration of Cameras and Lidars in Autonomous Driving | Ayon Sen, Gang Pan, Anton Mitrokhin, Ashraful Islam | NVIDIA Corporation | Self-Driving Cars II | | Unsupervised Road Anomaly Detection with Language Anchors | Beiwen Tian, Mingdao Liu, Huan-ang Gao, Pengfei Li, Hao Zhao, Guyue Zhou | Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University | Self-Driving Cars II | | Expanding the Deployment Envelope of Behavior Prediction Via Adaptive Meta-Learning | Boris Ivanovic, James Harrison, Marco Pavone | NVIDIA,Stanford University | Self-Driving Cars II | | Interaction-Aware Trajectory Planning for Autonomous Vehicles with Analytic Integration of Neural Networks into Model Predictive Control | Piyush Gupta, David Isele, Donggun Lee, Sangjae Bae | Michigan State University,University of Pennsylvania, Honda Research Institute USA,UC Berkeley,Honda Research Institute, USA | Self-Driving Cars II | | GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting | Alexander Cui, Sergio Casas Romero, Kelvin Wong, Shun Da Suo, Raquel Urtasun | University of Toronto, Waabi,University of Toronto | Award Finalists 3 | | RGB-Event Fusion for Moving Object Detection in Autonomous Driving | Zhuyun Zhou, Zongwei Wu, Rémi Boutteau, Fan Yang, Cédric Demonceaux, Dominique Ginhac | University of Burgundy (Université de Bourgogne), France,Université de Bourgogne, France,Université de Rouen Normandie,Univ. 
Bourgogne Franche-Comté,Université Bourgogne Franche-Comté,Univ Burgundy | Self-Driving Cars II | | Self-Entanglement-Free Tethered Path Planning for Non-Particle Differential-Driven Robot | Tong Yang, Jiangpin Liu, Yue Wang, Rong Xiong | Zhejiang University | Motion and Path Planning IV | | Operating with Inaccurate Models by Integrating Control-Level Discrepancy Information into Planning | Ellis Ratner, Claire Tomlin, Maxim Likhachev | University of California, Berkeley,UC Berkeley,Carnegie Mellon University | Motion and Path Planning IV | | Approximation Algorithms for Robot Tours in Random Fields with Guaranteed Estimation Accuracy | Shamak Dutta, Nils Wilde, Pratap Tokekar, Stephen L. Smith | University of Waterloo,TU Delft,University of Maryland | Motion and Path Planning IV | | Real-Time Fast Marching Tree for Mobile Robot Motion Planning in Dynamic Environments | Jefferson Silveira, Kleber Cabral, Sidney Givigi, Joshua Marshall | Queen's University,Royal Military College of Canada | Motion and Path Planning IV | | Efficient Optimal Planning in Non-FIFO Time-Dependent Flow Fields | Ju Heon Lee, Chanyeol Yoo, Stuart Anstee, Robert Fitch | University of Technology Sydney,Defence Science and Technology Group | Motion and Path Planning IV | | Human-Guided Planning for Complex Manipulation Tasks Using the Screw Geometry of Motion | Dasharadhan Mahalingam, Nilanjan Chakraborty | Stony Brook University | Motion and Path Planning IV | | Towards Efficient Trajectory Generation for Ground Robots Beyond 2D Environment | Jingping Wang, Long Xu, Haoran Fu, Chao Xu, Yanjun Cao, Ximin Lyu, Fei Gao | Zhejiang university,Zhejiang University,Sun Yat-sen University,Zhejiang University, Huzhou Institute of Zhejiang University,Sun Yat-Sen University | Motion and Path Planning IV | | Concentration of Measure Phenomenon and Its Implications for Sample-Based Planning Algorithms in Very-High Dimensional Configuration Spaces | Joel Esposito | US Naval Academy | Motion and Path 
Planning IV | | Safeguarding Learning-Based Planners under Motion and Sensing Uncertainties Using Reachability Analysis | Akshay Shetty, Adam Dai, Alexandros Tzikas, Grace Gao | Stanford University | Planning under Uncertainty II | | Risk-Aware Spatio-Temporal Logic Planning in Gaussian Belief Spaces | Matti Vahs, Christian Pek, Jana Tumova | KTH Royal Institute of Technology, Stockholm,KTH Royal Institute of Technology | Planning under Uncertainty II | | Density Planner: Minimizing Collision Risk in Motion Planning with Dynamic Obstacles Using Density-Based Reachability | Laura Lützow, Yue Meng, Andres Chavez Armijos, Chuchu Fan | Technical University of Munich,Massachusetts Institute of Technology,Boston University | Planning under Uncertainty II | | Sequential Bayesian Optimization for Adaptive Informative Path Planning with Multimodal Sensing | Joshua Ott, Edward Balaban, Mykel Kochenderfer | Stanford University,NASA Ames Research Center | Planning under Uncertainty II | | Tree-Structured Policy Planning with Learned Behavior Models | Yuxiao Chen, Peter Karkus, Boris Ivanovic, Xinshuo Weng, Marco Pavone | Nvidia research,NVIDIA,Carnegie Mellon University,Stanford University | Planning under Uncertainty II | | Fast and Scalable Signal Inference for Active Robotic Source Seeking | Christopher E. 
Denniston, Oriana Peltzer, Joshua Ott, Sangwoo Moon, Sung Kyun Kim, Gaurav Sukhatme, Mykel Kochenderfer, Mac Schwager, Ali-Akbar Agha-Mohammadi | University of Southern California,Stanford University,Jet Propulsion Laboratory, NASA,NASA Jet Propulsion Laboratory, Caltech,NASA-JPL, Caltech | Planning under Uncertainty II | | Active Inference for Autonomous Decision-Making with Contextual Multi-Armed Bandits | Shohei Wakayama, Nisar Ahmed | University of Colorado Boulder | Planning under Uncertainty II | | Covariance Steering for Uncertain Contact-Rich Systems | Yuki Shirai, Devesh Jha, Arvind Raghunathan | University of California, Los Angeles,Mitsubishi Electric Research Laboratories | Planning under Uncertainty II | | A Congestion-Aware Path Planning Method Considering Crowd Spatial-Temporal Anomalies for Long-Term Autonomy of Mobile Robots | Zijian Ge, Jingjing Jiang, Matthew Coombes | Loughborough university,Loughborough University | Planning under Uncertainty II | | Risk-Aware Model Predictive Path Integral Control Using Conditional Value-At-Risk | Ji Yin, Zhiyuan Zhang, Panagiotis Tsiotras | Georgia Institute of Technology,Georgia Tech | Planning under Uncertainty II | | Chance-Constrained Motion Planning with Event-Triggered Estimation | Anne Theurkauf, Qi Heng Ho, Roland Ilyes, Nisar Ahmed, Morteza Lahijanian | University of Colorado Boulder | Planning under Uncertainty II | | STAP: Sequencing Task-Agnostic Policies | Toki Migimatsu, Christopher Agia, Jiajun Wu, Jeannette Bohg | Stanford University | Integrated Planning and Learning | | A Multi-Step Dynamics Modeling Framework for Autonomous Driving in Multiple Environments | Jason Gibson, Bogdan Vlahov, David Fan, Patrick Spieler, Daniel Pastor, Ali-Akbar Agha-Mohammadi, Evangelos Theodorou | Georgia Institute of Technology,NASA Jet Propulsion Laboratory,JPL,Caltech,NASA-JPL, Caltech | Award Finalists 2 | | Self-Adaptive Teaching-Learning-Based Optimizer with Improved RBF and Sparse Autoencoder for Complex 
Optimization Problems | Jing Bi, Ziqi Wang, Haitao Yuan, Junfei Qiao, Jia Zhang, Mengchu Zhou | Beijing University of Technology, Beijing, China,Beijing University of Technology,Beihang University,Southern Methodist University,New Jersey Institute of Technology | Integrated Planning and Learning | | Learning Neuro-Symbolic Programs for Language Guided Robot Manipulation | Namasivayam Kalithasan, Himanshu Gaurav Singh, Vishal Bindal, Arnav Tuli, Vishwajeet Agrawal, Rahul Jain, Parag Singla, Rohan Paul | Indian Institute of Technology, Delhi,IIT DELHI,Indian Institute of Technology Delhi | Integrated Planning and Learning | | Real-Time Generative Grasping with Spatio-Temporal Sparse Convolution | Tim Player, Dongsik Chang, Fuxin Li, Geoffrey Hollinger | Oregon State University,Amazon | Grasping and Manipulation I | | Keypoint-GraspNet: Keypoint-Based 6-DoF Grasp Generation from the Monocular RGB-D Input | Yiye Chen, Yunzhi Lin, Ruinian Xu, Patricio A. Vela | Georgia Institute of Technology,Georgia Institute of Technology | Grasping and Manipulation I | | Pick2Place: Task-Aware 6DoF Grasp Estimation Via Object-Centric Perspective Affordance | Zhanpeng He, Nikhil Chavan-dafle, Jinwook Huh, Shuran Song, Volkan Isler | Columbia University,Samsung Research America,Samsung,University of Minnesota | Grasping and Manipulation I | | RGB-D Grasp Detection Via Depth Guided Learning with Cross-Modal Attention | Ran Qin, Haoxiang Ma, Boyang Gao, Di Huang | Beihang University,Geometry Robotics Ltd. 
Harbin Institute of Technology | Grasping and Manipulation I | | Towards Generalized Robot Assembly through Compliance-Enabled Contact Formations | Andrew Morgan, Quentin Bateux, Mei Hao, Aaron Dollar | Yale University | Grasping and Manipulation I | | Design of a Multimodal Fingertip Sensor for Dynamic Manipulation | Andrew Saloutos, Elijah Stanger-jones, Menglong Guo, Hongmin Kim, Sangbae Kim | Massachusetts Institute of Technology,University of California Berkeley,Seoul National University | Grasping and Manipulation I | | TactoFind: A Tactile Only System for Object Retrieval | Sameer Pai, Tao Chen, Megha Tippur, Edward Adelson, Abhishek Gupta, Pulkit Agrawal | Massachusetts Institute of Technology,MIT,University of Washington | Grasping and Manipulation I | | FingerSLAM: Closed-Loop Unknown Object Localization and Reconstruction from Visuo-Tactile Feedback | Alan Zhao, Maria Bauza Villalonga, Edward Adelson | Massachusetts Institute of Technology,MIT | Grasping and Manipulation I | | Differential Dynamic Programming Based Hybrid Manipulation Strategy for Dynamic Grasping | Cheng Zhou, Yanbo Long, Lei Shi, Longfei Zhao, Yu Zheng | Tencent,University of Bristol,Johns Hopkins University,TENCENT | Grasping and Manipulation I | | A Bioinspired Synthetic Nervous System Controller for Pick-And-Place Manipulation | Yanjun Li, Ravesh Sukhnandan, Jeffrey Gill, Hillel Chiel, Victoria Webster-Wood, Roger Quinn | Case Western Reserve University,Carnegie Mellon University | Grasping and Manipulation I | | SDF-Based Graph Convolutional Q-Networks for Rearrangement of Multiple Objects | Hogun Kee, Minjae Kang, Dohyeong Kim, JaeGoo Choy, Songhwai Oh | Seoul National University,Seoul National University (SNU) | Grasping and Manipulation I | | Towards Open-World Interactive Disambiguation for Robotic Grasping | Yuchen Mo, Hanbo Zhang, Tao Kong | ByteDance AI Lab,Bytedance AI Lab,ByteDance | Grasping and Manipulation I | | GenDexGrasp: Generalizable Dexterous Grasping | Puhao Li, 
Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, Siyuan Huang | Tsinghua University,Beijing Institute for General Artificial Intelligence,Peking University | Grasping and Manipulation I | | Mechanical Intelligence for Prehensile In-Hand Manipulation of Spatial Trajectories | Qiujie Lu, Zhongxue Gan, Xinran Wang, Guochao Bai, Zhuang Zhang, Nicolas Rojas | Fudan University,Imperial College London,Shanghai Jiao Tong University | Grasping and Manipulation I | | Fast-Grasp'D: Dexterous Multi-Finger Grasp Generation through Differentiable Simulation | Dylan Turpin, Tao Zhong, Shutong Zhang, Guanglei Zhu, Eric Heiden, Miles Macklin, Stavros Tsogkas, Sven Dickinson, Animesh Garg | University of Toronto,NVIDIA,University of Copenhagen, NVIDIA,Samsung | Grasping and Manipulation I | | An Analysis of Unified Manipulation with Robot Arms and Dexterous Hands Via Optimization-Based Motion Synthesis | Vatsal Patel, Daniel Rakita, Aaron Dollar | Yale University,University of Wisconsin-Madison | Grasping and Manipulation I | | Spherical Cubic Blends: C2-Continuous, Zero-Clamped, and Time-Optimized Interpolation of Quaternions | Jonas Wittmann, Lukas Cha, Marco Kappertz, Philipp Seiwald, Daniel Rixen | Technical University of Munich,Technische Universität München | Planning for Manipulation | | Object Reconfiguration with Simulation-Derived Feasible Actions | Yiyuan Lee, Wil Thomason, Zachary Kingston, Lydia Kavraki | Rice University | Planning for Manipulation | | CuRobo: Parallelized Collision-Free Robot Motion Generation | Balakumar Sundaralingam, Siva Kumar Sastry Hari, Adam Fishman, Caelan Garrett, Karl Van Wyk, Valts Blukis, Alexander James Millane, Helen Oleynikova, Ankur Handa, Fabio Ramos, Nathan Ratliff, Dieter Fox | NVIDIA Corporation,NVIDIA,University of Washington,Massachusetts Institute of Technology,ETH Zurich,Nvidia,NVidia,University of Sydney, NVIDIA | Planning for Manipulation | | Allowing Safe Contact in Robotic Goal-Reaching: Planning and Tracking in 
Operational and Null Spaces | Xinghao Zhu, Wenzhao Lian, Bodi Yuan, Daniel Freeman, Masayoshi Tomizuka | University of California, Berkeley,Google X,UC Berkeley,Google LLC,University of California | Planning for Manipulation | | Kinodynamic Rapidly-Exploring Random Forest for Rearrangement-Based Nonprehensile Manipulation | Kejia Ren, Podshara Chanrungmaneekul, Lydia Kavraki, Kaiyu Hang | Rice University | Planning for Manipulation | | Trajectory Generation with Dynamic Programming for End-Effector Sway Damping of Forestry Machine | Iman Jebellat, Inna Sharf | McGill University | Planning for Manipulation | | Planning for Complex Non-Prehensile Manipulation among Movable Objects by Interleaving Multi-Agent Pathfinding and Physics-Based Simulation | Dhruv Saxena, Maxim Likhachev | The Robotics Institute, Carnegie Mellon University,Carnegie Mellon University | Planning for Manipulation | | Torque-Limited Manipulation Planning through Contact by Interleaving Graph Search and Trajectory Optimization | Ramkumar Natarajan, Garrison Johnston, Nabil Simaan, Maxim Likhachev, Howie Choset | Robotics Institute, Carnegie Mellon University,Vanderbilt University,Carnegie Mellon University | Planning for Manipulation | | FDLNet: Boosting Real-Time Semantic Segmentation by Image-Size Convolution Via Frequency Domain Learning | Qingqing Yan, Shu Li, Chengju Liu, Ming Liu, Qijun Chen | Tongji University,Hong Kong University of Science and Technology | Semantic Scene Understanding | | SphNet: A Spherical Network for Semantic Pointcloud Segmentation | Lukas Bernreiter, Lionel Ott, Roland Siegwart, Cesar D. 
Cadena Lerma | ETH Zurich, Autonomous Systems Lab,ETH Zurich | Semantic Scene Understanding | | SRI-Graph: A Novel Scene-Robot Interaction Graph for Robust Scene Understanding | Dong Yang, Xiao Xu, Mengchen Xiong, Edwin Babaians, Eckehard Steinbach | Technical University of Munich | Semantic Scene Understanding | | 3D VSG: Long-Term Semantic Scene Change Prediction through 3D Variable Scene Graphs | Samuel Looper, Javier Rodriguez-Puigvert, Roland Siegwart, Cesar D. Cadena Lerma, Lukas Maximilian Schmid | ETH Zurich,Universidad de Zaragoza,Massachusetts Institute of Technology | Semantic Scene Understanding | | Infrared Image Captioning with Wearable Device | Chenjun Gao, Yanzhi Dong, Xiaohu Yuan, Huaping Liu | Yantai University,Tsinghua University,Tsinghua University | Semantic Scene Understanding | | External Camera-Based Mobile Robot Pose Estimation for Collaborative Perception with Smart Edge Sensors | Simon Bultmann, Raphael Memmesheimer, Sven Behnke | University of Bonn | Semantic Scene Understanding | | Feature-Realistic Neural Fusion for Real-Time, Open Set Scene Understanding | Kirill Mazur, Edgar Sucar, Andrew J Davison | Imperial College London | Semantic Scene Understanding | | Deep Learning on Home Drone: Searching for the Optimal Architecture | Alaa Maalouf, Yotam Gurfinkel, Barak Diker, Oren Gal, Daniela Rus, Dan Feldman | MIT,University of Haifa,Technion - Israel Institute of Technology | Semantic Scene Understanding | | Mask3D: Mask Transformer for 3D Semantic Instance Segmentation | Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe | RWTH Aachen University,ETH Zurich,Nvidia,ETH Zürich | Semantic Scene Understanding | | Detecting Spatio-Temporal Relations by Combining a Semantic Map with a Stream Processing Engine | Lennart Niecksch, Henning Deeken, Thomas Wiemann | German Research Centre for Artificial Intelligence (DFKI),Osnabrueck University,Fulda University of Applied Sciences | Semantic Scene 
Understanding | | Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs | Jingyi Wang, Jinfa Huang, Can Zhang, Zhidong Deng | Tsinghua University,Peking University | Semantic Scene Understanding | | CPSeg: Cluster-Free Panoptic Segmentation of 3D LiDAR Point Clouds | Thomas Enxu Li, Ryan Razani, Yixuan Xu, Bingbing Liu | University of Toronto,Huawei,Huawei Technologies Canada Co., Ltd.,Huawei Technologies | Semantic Scene Understanding | | A Generic Diffusion-Based Approach for 3D Human Pose Prediction in the Wild | Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi | EPFL,Independent Scholar,Sharif University of Technology | Deep Learning for Visual Perception II | | DifFAR: Differentiable Frequency-Based Disentanglement for Aerial Video Action Recognition | Divya Kothandaraman, Ming C. Lin, Dinesh Manocha | University of Maryland College Park,University of Maryland at College Park,University of Maryland | Deep Learning for Visual Perception II | | ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence | Dmitriy Rivkin, Gregory Dudek, Nikhil Rajiv Kakodkar, David Paul Meger, Oliver Limoyo, Michael Jenkin, Xue Liu, Francois Hogan | Samsung,McGill University,University of Toronto,York University,Massachusetts Institute of Technology | Deep Learning for Visual Perception II | | LODE: Locally Conditioned Eikonal Implicit Scene Completion from Sparse LiDAR | Pengfei Li, Ruowen Zhao, Yongliang Shi, Hao Zhao, Jirui Yuan, Guyue Zhou, Ya-qin Zhang | Institute for AI Industry Research (AIR), Tsinghua University,University of Chinese Academy of Sciences,Tsinghua University,Institute for AI Industry Research(AIR), Tsinghua University | Deep Learning for Visual Perception II | | Uncertainty-Aware LiDAR Panoptic Segmentation | Kshitij Sirohi, Mohammad Sajad Marvi, Daniel Büscher, Wolfram Burgard | University of Freiburg,Albert-Ludwigs-Universität Freiburg,University 
of Technology Nuremberg | Deep Learning for Visual Perception II | | E-VFIA : Event-Based Video Frame Interpolation with Attention | Onur Selim Kilic, Ahmet Akman, A. Alatan | METU,Middle East Technical University | Deep Learning for Visual Perception II | | Edge-Guided Multi-Domain RGB-To-TIR Image Translation for Training Vision Tasks with Challenging Labels | DongGuw Lee, Myung-Hwan Jeon, Younggun Cho, Ayoung Kim | Seoul National University (SNU),Seoul National University,Inha University | Deep Learning for Visual Perception II | | Weakly Supervised Referring Expression Grounding Via Target-Guided Knowledge Distillation | Jinpeng Mi, Song Tang, Ma Zhiyuan, Dan Liu, Qingdu Li, Jianwei Zhang | USST,University of Hamburg,University of Shanghai for Science and Technology | Deep Learning for Visual Perception II | | VQA-Based Robotic State Recognition Optimized with Genetic Algorithm | Kento Kawaharazuka, Yoshiki Obinata, Naoaki Kanazawa, Kei Okada, Masayuki Inaba | The University of Tokyo | AI-Based Methods | | Center Feature Fusion: Selective Multi-Sensor Fusion of Center-Based Objects | Philip Jacobson, Yiyang Zhou, Wei Zhan, Masayoshi Tomizuka, Ming Wu | University of California, Berkeley,University of California, Berkeley,University of California | AI-Based Methods | | Towards Robust Reference System for Autonomous Driving: Rethinking 3D MOT | Leichen Wang, Jiadi Zhang, Pei Cai, Xinrun Li | Robert Bosch CN,Tongji University,Nanyang Technological University,Bosch (China) Investment Co., Ltd. 
| AI-Based Methods | | LATITUDE: Robotic Global Localization with Truncated Dynamic Low-Pass Filter in City-Scale NeRF | Zhenxin Zhu, Yuantao Chen, Zirui Wu, Chao Hou, Yongliang Shi, Chuxuan Li, Pengfei Li, Guyue Zhou, Hao Zhao | Beihang University,Xi'an University of Architecture and Technology,Institute for AI Industry Research, Tsinghua University; Beijing,The University of Hong Kong,Tsinghua University,Institute for AI Industry Research (AIR), Tsinghua University | AI-Based Methods | | 4DRadarSLAM: A 4D Imaging Radar SLAM System for Large-Scale Environments Based on Pose Graph Optimization | Jun Zhang, Huayang Zhuge, Zhenyu Wu, Guohao Peng, Mingxing Wen, Yiyao Liu, Danwei Wang | Nanyang Technological University,NANYANG Technological University | Localization and Mapping IV | | A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation | Lin Li, Wendong Ding, Yongkun Wen, Yufei Liang, Yong Liu, Guowei Wan | Zhejiang University,Baidu,China,Intelligent Driving Group,Baidu | Localization and Mapping IV | | Data-Association-Free Landmark-Based SLAM | Yihao Zhang, Odin Aleksander Severinsen, John Leonard, Luca Carlone, Kasra Khosoussi | Massachusetts Institute of Technology,MIT,The Commonwealth Scientific and Industrial Research (CSIRO) | Localization and Mapping IV | | Efficient Bundle Adjustment for Coplanar Points and Lines | Lipu Zhou, Jiacheng Liu, Fengguang Zhai, Pan Ai, Kefei Ren, Yinian Mao, Guoquan Huang, Ziyang Meng, Michael Kaess | MeiTuan,Tsinghua University,Meituan,Meituan-Dianping Group,University of Delaware,Carnegie Mellon University | Localization and Mapping IV | | Convolutional Bayesian Kernel Inference for 3D Semantic Mapping | Joseph Wilson, Yuewei Fu, Arthur Zhang, Jingyu Song, Andrew Capodieci, Paramsothy Jayakumar, Kira Barton, Maani Ghaffari | University of Michigan,Neya Robotics,U.S. 
Army DEVCOM Ground Vehicle Systems Center,University of Michigan at Ann Arbor | Localization and Mapping IV | | SHINE-Mapping: Large-Scale 3D Mapping Using Sparse Hierarchical Implicit Neural Representations | Xingguang Zhong, Yue Pan, Jens Behley, Cyrill Stachniss | University of Bonn | Localization and Mapping IV | | Efficient and Hybrid Decoder for Local Map Construction in Bird's-Eye-View | Kun Tian, Yun Ye, Zheng Zhu, Peng Li, Guan Huang | phigent robotics,Company,Institute of Automation, Chinese Academy of Sciences,Phigent AI,Phigent Robotics | Localization and Mapping IV | | Contour Context: Abstract Structural Distribution for 3D LiDAR Loop Detection and Metric Pose Estimation | Binqian Jiang, Shaojie Shen | Hong Kong University of Science and Technology | Localization and Mapping IV | | The Reflectance Field Map: Mapping Glass and Specular Surfaces in Dynamic Environments | Paul Foster, Collin Johnson, Benjamin Kuipers | University of Michigan,May Mobility | Localization and Mapping IV | | Inverse Perspective Mapping-Based Neural Occupancy Grid Map for Visual Parking | Xiangru Mu, Haoyang Ye, Daojun Zhu, Tongqing Chen, Tong Qin | Huawei,Huawei Technologies,Huawei Technology,Huawei Technology | Localization and Mapping IV | | Efficient Implicit Neural Reconstruction Using LiDAR | Dongyu Yan, Xiaoyang Lyu, Jieqi Shi, Yi Lin | Harbin Institute of Technology (ShenZhen),The University of Hong Kong,Hong Kong University of Technology and Science,Hong Kong University of Science and Technology | Localization and Mapping IV | | Factor Graph Fusion of Raw GNSS Sensing with IMU and Lidar for Precise Robot Localization without a Base Station | Jonas Beuchert, Marco Camurri, Maurice Fallon | University of Oxford,Free University of Bozen-Bolzano | Localization and Mapping IV | | Continuous and Precise Positioning in Urban Environments by Tightly Coupled Integration of GNSS, INS and Vision | Xingxing Li, Shengyu Li, Yuxuan Zhou, Zhiheng Shen, Xuanbin Wang, Xin Li, 
Weisong Wen | Wuhan University,Wuhan university,Wuhan University, School of Geodesy and Geomatics,Hong Kong Polytechnic University | Localisation and Mapping | | 360-DFPE: Leveraging Monocular 360-Layouts for Direct Floor Plan Estimation | Bolivar Solarte, Yueh-Cheng Liu, Chin-hsuan Wu, Yi-hsuan Tsai, Min Sun | National Tsing Hua University,Technical University of Munich,NEC Labs America | Localisation and Mapping | | Autonomous Navigation in Unknown Environments with Sparse Bayesian Kernel-Based Occupancy Mapping | Thai Duong, Michael Yip, Nikolay A. Atanasov | University of California, San Diego | Localisation and Mapping | | Multitask Learning for Scalable and Dense Multilayer Bayesian Map Inference | Lu Gan, Youngji Kim, J.W Grizzle, Jeffrey Walls, Ayoung Kim, Ryan Eustice, Maani Ghaffari | California Institute of Technology,NAVER Labs,University of Michigan,Seoul National University | Localisation and Mapping | | Sigma-FP: Robot Mapping of 3D Floor Plans with an RGB-D Camera under Uncertainty | Jose Luis Matez-Bandera, Javier Monroy, Javier Gonzalez-jimenez | University of Malaga,University of Málaga | Localisation and Mapping | | Continuous-Time Trajectory Estimation for Differentially Flat Systems | Jacob Johnson, Joshua Mangelson, Randal Beard | Brigham Young University | Localisation and Mapping | | IC-GVINS: A Robust, Real-Time, INS-Centric GNSS-Visual-Inertial Navigation System | Xiaoji Niu, Hailiang Tang, Tisheng Zhang, Jing Fan, Liu Jingnan | Wuhan University | Localisation and Mapping | | Gyro-Net: IMU Gyroscopes Random Errors Compensation Method Based on Deep Learning | Yunqi Gao, Dianxi Shi, Ruihao Li, Zhe Liu, Wen Sun | Defense Innovation Institute,National University of Defense Technology,Renmin University of China | Localisation and Mapping | | Self-Supervised Feature Learning for Long-Term Metric Visual Localization | Yuxuan Chen, Timothy Barfoot | University of Toronto | Localisation and Mapping | | GraffMatch: Global Matching of 3D Lines and 
Planes for Wide Baseline LiDAR Registration | Parker Lusk, Devarth Parikh, Jonathan Patrick How | Massachusetts Institute of Technology,Ford Motor Company | Localisation and Mapping | | Model Learning with Backlash Compensation for a Tendon-Driven Surgical Robot | Francesco Cursi, Weibang Bai, Eric Yeatman, Petar Kormushev | Imperial College London | Medical and Surgical Robotics | | Simultaneous Online Registration-Independent Stiffness Identification and Tip Localization of Surgical Instruments in Robot-Assisted Eye Surgery | Ali Ebrahimi, Shahriar Sefati, Peter Gehlbach, Russell H. Taylor, Iulian Iordachita | Johns Hopkins University,Johns Hopkins Medical Institute,The Johns Hopkins University | Medical and Surgical Robotics | | Robot-Assisted Retraction for Transoral Surgery | Lifeng Zhu, Jiangwei Shen, Shuyan Yang, Song Aiguo | Southeast University | Medical and Surgical Robotics | | HIFUSK – High Intensity Focused Ultrasound Surgery Based on KUKA Robot | Andrea Mariani, Laura Morchi, Alessandro Diodato, Selene Tognarelli, Arianna Menciassi | Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna, The BioRobotics Institute,Scuola Superiore Sant'Anna - SSSA | Medical and Surgical Robotics | | Rethinking Feature Extraction: Gradient-Based Localized Feature Extraction for End-To-End Surgical Downstream Tasks | Winnie Pang, Mobarakol Islam, Sai Mitheran, Lalithkumar Seenivasan, Mengya Xu, Hongliang Ren | National University of Singapore,University College London,Carnegie Mellon University,Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS) | Medical and Surgical Robotics | | Sim-To-Real Transfer for Visual Reinforcement Learning of Deformable Object Manipulation for Robot-Assisted Surgery | Paul Maria Scheikl, Eleonora Tagliabue, Balazs Gyenes, Martin Wagner, Diego Dall'Alba, Paolo Fiorini, Franziska Mathis-Ullrich | Friedrich-Alexander-University Erlangen-Nürnberg (FAU),Carl Zeiss AG,Karlsruhe Institute of Technology,Heidelberg University 
Hospital,University of Verona | Medical and Surgical Robotics | | Shape Tracking and Feedback Control of Cardiac Catheter Using MRI-Guided Robotic Platform - Validation with Pulmonary Vein Isolation Simulator in MRI | Ziyang Dong, Xiaomei Wang, Ge Fang, Zhuoliang He, Justin Di-Lang Ho, Chim Lee Cheung, Wai Lun Tang, Xiaochen Xie, Liyuan Liang, Hing-chiu Chang, Chi Keong Ching, Ka-Wai Kwok | The University of Hong Kong,Harbin Institute of Technology, Shenzhen,National Heart Centre Singapore | Medical and Surgical Robotics | | A Generalized Framework for Concentric Tube Robot Design Using Gradient-Based Optimization | Jui-Te Lin, Cedric Girerd, Jiayao Yan, John T. Hwang, Tania Morimoto | University of California San Diego,University of California, San Diego | Medical and Surgical Robotics | | Magnetic Soft Continuum Robots with Braided Reinforcement | Pete Lloyd, Onaizah Onaizah, Giovanni Pittiglio, Damith Suresh Chathuranga, James Henry Chandler, Pietro Valdastri | University of Leeds,McMaster University,Harvard University | Medical and Surgical Robotics | | Shape Sensing of Flexible Robots Based on Deep Learning | Xuan Thao Ha, Di Wu, Mouloud Ourak, Gianni Borghesan, Jenny Dankelman, Arianna Menciassi, Emmanuel B Vander Poorten | KU Leuven,University of Leuven,TU Delft,Scuola Superiore Sant'Anna - SSSA | Medical and Surgical Robotics | | Multifingered Grasping Based on Multimodal Reinforcement Learning | Hongzhuo Liang, Lin Cong, Norman Hendrich, Shuang Li, Fuchun Sun, Jianwei Zhang | University of Hamburg,Tsinghua University | Grasping and Micromanipulation | | Planning of Power Grasps Using Infinite Program under Complementary Constraints | Zherong Pan, Duo Zhang, Changhe Tu, Xifeng Gao | Tencent America,New York University,Shandong University | Grasping and Micromanipulation | | A Soft Barometric Tactile Sensor to Simultaneously Localize Contact and Estimate Normal Force with Validation to Detect Slip in a Robotic Gripper | Thomas De Clercq, Anatolii Sianov, 
Guillaume Crevecoeur | Ghent University,University of Gent, EELAB | Grasping and Micromanipulation | | Learning Efficient Policies for Picking Entangled Wire Harnesses: An Approach to Industrial Bin Picking | Xinyi Zhang, Yukiyasu Domae, Weiwei Wan, Kensuke Harada | Osaka University,The National Institute of Advanced Industrial Science and Techno | Grasping and Micromanipulation | | A Novel Scaffold Reinforced Actuator with Tunable Attitude Ability for Grasping | Pei Jiang, Ji Luo, Jiaxing Li, Michael Z. Q. Chen, Yonghua Chen, Yang Yang, Rui Chen | Chongqing University,Nanjing University of Science and Technology,The University of Hong Kong,Nanjing University of Information Science and Technology | Grasping and Micromanipulation | | Deep Learning Reactive Robotic Grasping with a Versatile Vacuum Gripper | Hui Zhang, Jef Peeters, Eric Demeester, Karel Kellens | KU Leuven | Grasping and Micromanipulation | | An Unconstrained Convex Formulation of Compliant Contact | Alejandro Castro, Frank Permenter, Xuchen Han | Toyota Research Institute | Grasping and Micromanipulation | | Robotic Manipulation of Sperm As a Deformable Linear Object | Changsheng Dai, Guanqiao Shan, Hang Liu, Changhai Ru, Yu Sun | Dalian University of Technology,University of Toronto,Soochow University | Grasping and Micromanipulation | | Robotic Rotational Positioning of End-Effectors for Micromanipulation | Songlin Zhuang, Changsheng Dai, Guanqiao Shan, Changhai Ru, Zhuoran Zhang, Yu Sun | Yongjiang Laboratory,Dalian University of Technology,University of Toronto,Soochow University,The Chinese University of Hong Kong, Shenzhen | Grasping and Micromanipulation | | Comparing EMG Continuous Movement Decoding with Joints Unconstrained and Constrained | Lizhi Pan, Zhongyi Ding, Jianmin Li | Tianjin University | Prosthetics, Exoskeletons and Rehabilitation | | Design and Validation of a Polycentric Hybrid Knee Prosthesis with Electromagnet-Controlled Mode Transition | Xu Wang, Haohua Xiu, Yao Zhang, Wei 
Liang, Wei Chen, Guowu Wei, Lei Ren, Luquan Ren | Jilin University,Ningbo University of Technology,Salford University,University of Manchester | Prosthetics, Exoskeletons and Rehabilitation | | Powered Knee and Ankle Prosthesis with Adaptive Control Enables Climbing Stairs with Different Stair Heights, Cadences, and Gait Patterns | Sarah Hood, Lukas Gabert, Tommaso Lenzi | University of Utah | Prosthetics, Exoskeletons and Rehabilitation | | Design, Control, and Experimental Evaluation of a Novel Robotic Glove System for Patients with Brachial Plexus Injuries | Wenda Xu, Yunfei Guo, Cesar Bravo, Pinhas Ben-Tzvi | Virginia Tech,Carilion Clinic Institute of Orthopaedics and Neurosciences | Prosthetics, Exoskeletons and Rehabilitation | | Data-Driven Variable Impedance Control of a Powered Knee-Ankle Prosthesis for Adaptive Speed and Incline Walking | T. Kevin Best, Cara Gonzalez Welker, Elliott Rouse, Robert D. Gregg | University of Michigan,University of Colorado Boulder | Prosthetics, Exoskeletons and Rehabilitation | | NESM-Gamma: An Upper-Limb Exoskeleton with Compliant Actuators for Clinical Deployment | Jun Pan, Davide Astarita, Andrea Baldoni, Filippo Dell'agnello, Simona Crea, Nicola Vitiello, Emilio Trigili | Zhejiang University of Technology,Scuola Superiore Sant'Anna,ISTITUTO DI BIOROBOTICA,Scuola Superiore Sant'Anna, The BioRobotics Institute,Scuola Superiore Sant Anna | Prosthetics, Exoskeletons and Rehabilitation | | Design, Development, and Control of a Hand/Wrist Exoskeleton for Rehabilitation and Training | Mihai Dragusanu, Muhammad Zubair Iqbal, Tommaso Lisini Baldi, Domenico Prattichizzo, Monica Malvezzi | University of Siena | Prosthetics, Exoskeletons and Rehabilitation | | Markovian Transparency Control of an Exoskeleton Robot | Felix Mauricio Escalante, Leonardo Felipe Dos Santos, Yecid Moreno, Adriano A G Siqueira, Marco Henrique Terra, Thiago Boaventura | University of São Paulo,University of Sao Paulo | Prosthetics, Exoskeletons and 
Rehabilitation | | ArmAssist: A Telerehabilitation Solution for Upper-Limb Rehabilitation at Home | Ainara Garzo, Je Hyung Jung, Javier Arcas Ruiz-ruano, Joel C. Perry, Thierry Keller | TECNALIA, Basque Research and Technology Alliance (BRTA),University of Idaho,FUNDACION TECNALIA Research & Innovation | Prosthetics, Exoskeletons and Rehabilitation | | A Soft, Wearable Skin-Brace for Assisting Forearm Pronation and Supination with a Low-Profile Design | Huimin Su, Kyoung-soub Lee, Yusung Kim, Hyung-Soon Park | Korea Advanced Institute of Science and Technology,Korea Advanced Institute of Science and Technology (KAIST) | Prosthetics, Exoskeletons and Rehabilitation | | Teachers in Concordance for Pseudo-Labeling of 3D Sequential Data | Awet Haileslassie Gebrehiwot, Patrik Vacek, David Hurych, Karel Zimmermann, Patrick Perez, Tomas Svoboda | Czech Technical University in Prague,Ceske vysoke uceni technicke v Praze - Fakulta elektrotechnicka,Valeo,Czech Technical University Prague,valeo,Faculty of Electrical Engineering, Czech Technical University in Prague | Optimal Control and Object Detection | | Automatic Labeling to Generate Training Data for Online LiDAR-Based Moving Object Segmentation | Xieyuanli Chen, Benedikt Mersch, Lucas Nunes, Rodrigo Marcuzzi, Ignacio Vizzo, Jens Behley, Cyrill Stachniss | National University of Defense Technology,University of Bonn | Optimal Control and Object Detection | | Uncertainty for Identifying Open-Set Errors in Visual Object Detection | Dimity Miller, Niko Suenderhauf, Michael J Milford, Feras Dayoub | Queensland University of Technology,The University of Adelaide | Optimal Control and Object Detection | | Bounds on Optimal Revisit Times in Persistent Monitoring Missions with a Distinct & Remote Service Station | Sai Krishna Kanth Hari, Sivakumar Rathinam, Swaroop Darbha, Krishna Kalyanam, Satyanarayana Gupta Manyam, David Casbeer | Los Alamos National Laboratory,TAMU,NASA Ames Research Center,Infoscitex corp.,AFRL | Optimal 
Control and Object Detection | | Force Sharing Problem During Gait Using Inverse Optimal Control | Filip Becanovic, Vincent Bonnet, Raphaël Dumas, Kosta Jovanovic, Samer Mohammed | Université Paris-Est Créteil, University of Belgrade,University Paul Sabatier,University Gustave Eiffel,University of Belgrade, Serbia,University of Paris Est Créteil - (UPEC) | Optimal Control and Object Detection | | Data-Driven Iterative Optimal Control for Switched Dynamical Systems | Yuqing Chen, Yangzhi Li, David Braun | Xi'an Jiaotong-Liverpool University,Singapore University of Technology and Design,Vanderbilt University | Optimal Control and Object Detection | | BiConMP: A Nonlinear Model Predictive Control Framework for Whole Body Motion Planning | Avadesh Meduri, Paarth Shah, Julian Viereck, Majid Khadiv, Ioannis Havoutis, Ludovic Righetti | New York University,University of Oxford,Max Planck Institute for Intelligent Systems | Optimal Control and Object Detection | | Environment Warped Gait Trajectory Optimization for Complex Terrains | Zherong Pan, Tan Chen, Xifeng Gao, Wu Kui | Tencent America,Michigan Technological University,Tencent | Optimal Control and Object Detection | | Differential Dynamic Programming with Nonlinear Safety Constraints under System Uncertainties | Gokhan Alcan, Ville Kyrki | Aalto University | Optimal Control and Object Detection | | ViTAL: Vision-Based Terrain-Aware Locomotion for Legged Robots | Shamel Fahmi, Victor Barasuol, Domingo Esteban, Octavio Antonio Villarreal Magaña, Claudio Semini | Massachusetts Institute of Technology,Istituto Italiano di Tecnologia,ANYbotics AG | Optimal Control and Object Detection | | Experimental Study on Accurate Calibration for Industrial Robot Via Integrated Extended Kalman and Beetle Antennae Search | Zhibing Li, Shuai Li, Xin Luo | Chongqing Institute of Green and Intelligent Technology,Chinese ,Hong Kong Polytechnic University,Chongqing Institute of Green and Intelligent Technology, Chinese | Calibration, 
Identification, and Simulation | | Real-Time Model Predictive Control and System Identification Using Differentiable Physics Simulation | Sirui Chen, Keenon Werling, Albert Wu, Karen Liu | The University of Hong Kong,Stanford University | Calibration, Identification, and Simulation | | PBACalib: Targetless Extrinsic Calibration for High-Resolution LiDAR-Camera System Based on Plane-Constrained Bundle Adjustment | Feiyi Chen, Liang Li, Shuyang Zhang, Wu Jin, Lujia Wang | The Hong Kong University of Science and Technology,The University of Hong Kong,UESTC,The Hong Kong University of Technology | Calibration, Identification, and Simulation | | Probabilistic Framework for Hand-Eye and Robot-World Calibration AX=YB | Junhyoung Ha | Korea Institute of Science and Technology | Calibration, Identification, and Simulation | | Multi-Kernel Maximum Correntropy Kalman Filter for Orientation Estimation | Shilei Li, Lijing Li, Dawei Shi, Wulin Zou, Pu Duan, Ling Shi | The Hong Kong University of Science and Technology,China University of Mining and Technology,Beijing Institute of Technology,Hong Kong University of Science and Technology,Xeno Dynamics Co., Ltd | Calibration, Identification, and Simulation | | A4LidarTag: Depth-Based Fiducial Marker for Extrinsic Calibration of Solid-State Lidar and Camera | Xie Yusen`, Deng Lei, Sun Ting, Fu Yeyu, Zhixiang Chen, Chen Baohua, Li Jian, Cui Xinglong, Yin Hanxi, Deng Shuixin, Xiao Junwei | Beijing Information Science & Technology University,Tsinghua University,The University of Sheffield | Calibration, Identification, and Simulation | | A CoppeliaSim Dynamic Simulator for the Da Vinci Research Kit | Marco Ferro, Alessandro Mirante, Fanny Ficuciello, Marilena Vendittelli | CNRS,Sapienza University of Rome,Università di Napoli Federico II | Standalone Videos | | Fast and Robust Inverse Kinematics of Serial Robots Using Halley's Method | Steffan Lloyd, Rishad Irani, Mojtaba Ahmadi | Carleton University | Calibration, Identification, and 
Simulation | | Large-Dimensional Multibody Dynamics Simulation Using Contact Nodalization and Diagonalization | Jeongmin Lee, Minji Lee, Dongjun Lee | Seoul National University | Calibration, Identification, and Simulation | | EMS®: A Massive Computational Experiment Management System towards Data-Driven Robotics | Qinjie Lin, Guo Ye, Han Liu | Northwestern University | Software Tools I | | Rmagine: 3D Range Sensor Simulation in Polygonal Maps Via Ray Tracing for Embedded Hardware on Mobile Robots | Alexander Mock, Thomas Wiemann, Joachim Hertzberg | University of Osnabrück,Fulda University of Applied Sciences,University of Osnabrueck | Software Tools I | | A Framework for Fast Prototyping of Photo-Realistic Environments with Multiple Pedestrians | Sara Casao, Andrés Otero, Álvaro Serra-gómez, Ana Cristina Murillo, Javier Alonso-Mora, Eduardo Montijano | Unversity of Zaragoza ESQ,,,,,,,G Department of Computer Science,Universidad de Zaragoza,Delft University of Technology,University of Zaragoza | Software Tools I | | RoboSC: A Domain-Specific Language for Supervisory Controller Synthesis of ROS Applications | Bart Wesselink, Koen De Vos, Ivan Kurtev, Michel Reniers, Elena Torta | Eindhoven University of Technology,Eindhoven Univeristy of Technology | Software Tools I | | KubeROS: A Unified Platform for Automated and Scalable Deployment of ROS2-Based Multi-Robot Applications | Yongzhou Zhang, Christian Wurll, Björn Hein | Karlsruhe University of Applied Sciences,University of Applied Sciences Karlsruhe | Software Tools I | | Domain-Specific Languages for Kinematic Chains and Their Solver Algorithms: Lessons Learned for Composable Models | Sven Schneider, Nico Hochgeschwender, Herman Bruyninckx | Bonn-Rhein-Sieg University,University of Leuven | Software Tools I | | SIERRA: A Modular Framework for Accelerating Research and Improving Reproducibility | John Harwell, Maria Gini | University of Minnesota | Software Tools I | | OpTaS: An Optimization-Based Task 
Specification Library for Trajectory Optimization and Model Predictive Control | Christopher Edwin Mower, Joao Moura, Nazanin Zamani Behabadi, Sethu Vijayakumar, Tom Vercauteren, Christos Bergeles | King's College London,University of Edinburgh,Not Affiliated | Software Tools I | | CMG-Net: An End-To-End Contact-Based Multi-Finger Dexterous Grasping Network | Mingze Wei, Yaomin Huang, Zhiyuan Xu, Ning Liu, Zhengping Che, Xinyu Zhang, Chaomin Shen, Feifei Feng, Chun Shan, Jian Tang | east china normal university, midea,East China Normal University,Midea Group,Guangdong Polytechnic Normal University,Midea Group (Shanghai) Co., Ltd. | Data Sets I | | ARMBench: An Object-Centric Benchmark Dataset for Robotic Manipulation | Chaitanya Mitash, Fan Wang, Shiyang Lu, Vikedo Terhuja, Tyler Garaas, Felipe Polido, Manikantan Nambi | Amazon Robotics,Rutgers University,Mitsubishi Electric Research Laboratories,Italian Institute of Technology | Data Sets I | | FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments | Jishnu Jaykumar P, Yu-Wei Chao, Yu Xiang | The University of Texas at Dallas,NVIDIA,University of Texas at Dallas | Data Sets I | | WorldGen: A Large Scale Generative Simulator | Chahat Deep Singh, Riya Kumari, Cornelia Fermuller, Nitin Sanket, Yiannis Aloimonos | University of Maryland, College Park,University of Maryland | Data Sets I | | Lossless SIMD Compression of LiDAR Range and Attribute Scan Sequences | Jeff Ford, Jordan Ford | ComplexIQ,Carnegie Mellon University | Data Sets I | | 3D-DAT: 3D-Dataset Annotation Toolkit for Robotic Vision | Markus Suchi, Bernhard Neuberger, Amanzhol Salykov, Jean-baptiste Weibel, Timothy Patten, Markus Vincze | TU Wien,University of Technology Sydney,Vienna University of Technology | Data Sets I | | METEOR: A Dense, Heterogeneous, and Unstructured Traffic Dataset with Rare Behaviors | Rohan Chandra, Xijun Wang, Mridul Mahajan, Rahul Kala, Rishitha Palugulla, Chandrababu Naidu Nallagopu, Alok Jain, Dinesh Manocha 
| University of Texas, Austin,University of Maryland, College Park,Indian Institute of Information Technology Allahabad,Indian Institute of Information Technology, Allahabad, India,navAjna Technologies Private Limited,University of Maryland | Data Sets I | | Kollagen: A Collaborative SLAM Pose Graph Generator | Roberto C. Sundin, David Umsonst | Ericsson Research | Data Sets I | | AvoidBench: A High-Fidelity Vision-Based Obstacle Avoidance Benchmarking Suite for Multi-Rotors | Hang Yu, Guido De Croon, Christophe De Wagter | Delft University of technology,TU Delft,Delft University of Technology | Benchmarking | | Generating a Terrain-Robustness Benchmark for Legged Locomotion: A Prototype Via Terrain Authoring and Active Learning | Chong Zhang, Lizhi Yang | ETH Zurich,California Institute of Technology | Benchmarking | | Train Offline, Test Online: A Real Robot Learning Benchmark | Gaoyue Zhou, Victoria Dean, Mohan Kumar Srirama, Aravind Rajeswaran, Jyothish Pari, Kyle Beltran Hatch, Aryan Jain, Tianhe Yu, Pieter Abbeel, Lerrel Pinto, Chelsea Finn, Abhinav Gupta | Carnegie Mellon University,Meta AI,New York University,Stanford University,UC Berkeley | Benchmarking | | Benchmarking Potential Based Rewards for Learning Humanoid Locomotion | Se Hwan Jeon, Steve Heim, Charles Khazoom, Sangbae Kim | Massachusetts Institute of Technology | Benchmarking | | Household Clothing Set and Benchmarks for Characterising End-Effector Cloth Manipulation | Angus B. 
Clark, Luke Cramphorn, Michal Rachowiecki, Austin Gregg-smith | Imperial College London,Bristol University,Dyson,University of Bristol | Benchmarking | | Parameter Optimization for Manipulator Motion Planning Using a Novel Benchmark Set | Carl Gaebert, Sascha Kaden, Benjamin Fischer, Ulrike Thomas | Chemnitz University of Technology,Technische Universität Chemnitz | Benchmarking | | Benchmarking Reinforcement Learning Techniques for Autonomous Navigation | Zifan Xu, Bo Liu, Xuesu Xiao, Anirudh Nair, Peter Stone | University of Texas at Austin,George Mason University,The University of Texas at Austin | Benchmarking | | A Benchmark for Multi-Robot Planning in Realistic, Complex and Cluttered Environments | Simon Schaefer, Luigi Palmieri, Lukas Heuer, Rüdiger Dillmann, Sven Koenig, Alexander Kleiner | Karlsruhe Institute of Technology (KIT),Robert Bosch GmbH,Örebro University, Robert Bosch GmbH,FZI - Forschungszentrum Informatik - Karlsruhe,University of Southern California,Bosch Central Research | Benchmarking | | D-Align: Dual Query Co-Attention Network for 3D Object Detection Based on Multi-Frame Point Cloud Sequence | Junhyung Lee, Junho Koh, Youngwoo Lee, Junwon Choi | Hanyang University | Object Detection III | | DDS3D: Dense Pseudo-Labels with Dynamic Threshold for Semi-Supervised 3D Object Detection | Dingkang Liang, Zhe Liu, Hou Jinghua, Jingyu Li | Huazhong University of Science and Technology | Object Detection III | | Fast Staircase Detection and Estimation Using 3D Point Clouds with Multi-Detection Merging for Heterogeneous Robots | Prasanna Sriganesh, Namya Bagree, Bhaskar Vundurthy, Matthew Travers | Carnegie Mellon University | Object Detection III | | Cost-Aware Evaluation and Model Scaling for LiDAR-Based 3D Object Detection | Xiaofang Wang, Kris Kitani | Carnegie Mellon University,CMU | Object Detection III | | Zero-Shot Object Detection Based on Dynamic Semantic Vectors | Haoyu Li, Jilin Mei, Jiancong Zhou, Yu Hu | University of Chinese Academy of 
Sciences,Institute of Computing Technology, Chinese Academy of Sciences,Institute of Computing Technology Chinese Academy of Sciences | Object Detection III | | Road Anomaly Segmentation Based on Pixel-Wise Logit Variance with Iterative Background Highlighting | Dongkun Lee, Han-gyu Kim, Ho-jin Choi | KAIST,NAVER Cloud | Object Detection III | | WEDGE: Web-Image Assisted Domain Generalization for Semantic Segmentation | Namyup Kim, Taeyoung Son, Jaehyun Pahk, Cuiling Lan, Wenjun Zeng, Suha Kwak | POSTECH,Pohang University of Science and Technology,DGIST,Microsoft Research Asia,Eastern Institute for Advanced Study | Object Detection III | | Incremental Few-Shot Object Detection Via Simple Fine-Tuning Approach | Tae-min Choi, Jong-Hwan Kim | Korea Advanced Institute of Science and Technology,KAIST | Object Detection III | | Discriminative 3D Shape Modeling for Few-Shot Instance Segmentation | Anoop Cherian, Siddarth Jain, Tim K. Marks, Alan Sullivan | Mitsubishi Electric Research Labs,Mitsubishi Electric Research Laboratories (MERL),Mitsubishi Electric Research Lab | Segmentation | | Multi-To-Single Knowledge Distillation for Point Cloud Semantic Segmentation | Shoumeng Qiu, Feng Jiang, Haiqiang Zhang, Xiangyang Xue, Jian Pu | fudan,Fudan University,Beijing Institute of Technology | Segmentation | | On Improving Boundary Quality of Instance Segmentation in Cluttered and Chaotic Scenarios | Biqi Yang, Xiaojie Gao, Xianzhi Li, Yunhui Liu, Chi-wing Fu, Pheng Ann Heng | The Chinese University of Hong Kong,Chinese University of Hong Kong | Segmentation | | Real-Time Background Subtraction under Varying Lighting Conditions | Sisi Liang, Darren Baker | CSIRO | Segmentation | | Few-Shot 3D LiDAR Semantic Segmentation for Autonomous Driving | Jilin Mei, Junbao Zhou, Yu Hu | Institute of Computing Technology, Chinese Academy of Sciences,Chinese Academy of Sciences,Institute of Computing Technology Chinese Academy of Sciences | Segmentation | | ERASE-Net: Efficient Segmentation 
Networks for Automotive Radar Signals | Shihong Fang, Haoran Zhu, Devansh Bisla, Anna Choromanska, Satish Ravindran, Dongyin Ren, Ryan Wu | New York University,NYU,New York University Tandon School of Engineering,NXP,NXP Semiconductors | Segmentation | | ConDA: Unsupervised Domain Adaptation for LiDAR Segmentation Via Regularized Domain Concatenation | Lingdong Kong, Niamul Quader, Venice Erin Liong | National University of Singapore,Motional, Singapore,Motional | Segmentation | | Viewer-Centred Surface Completion for Unsupervised Domain Adaptation in 3D Object Detection | Darren Tsai, Julie Stephany Berrio Perez, Mao Shan, Eduardo Nebot, Stewart Worrall | University of Sydney, Australian Centre for Field Robotics,ACFR - The University of Sydney,The University of Sydney,Unversity of Sydney,University of Sydney | Segmentation | | Nerf2nerf: Pairwise Registration of Neural Radiance Fields | Leili Goli, Daniel Rebain, Sara Sabour, Animesh Garg, Andrea Tagliasacchi | University of Toronto, Vector Institute,University of British Columbia,Google, University of Toronto,University of Toronto,Simon Fraser University | Radiance Fields | | NeRF2Real: Sim2real Transfer of Vision-Guided Bipedal Motion Skills Using Neural Radiance Fields | Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, Tuomas Haarnoja, Ben Moran, Steven Bohez, Fereshteh Sadeghi, Bojan Vujatovic, Nicolas Heess | Google,DeepMind,Deepmind,University of Washington | Award Finalists 2 | | Density-Aware NeRF Ensembles: Quantifying Predictive Uncertainty in Neural Radiance Fields | Niko Suenderhauf, Dimity Miller, Jad Chakra | Queensland University of Technology | Radiance Fields | | Parallel Inversion of Neural Radiance Fields for Robust Pose Estimation | Yunzhi Lin, Thomas Müller, Jonathan Tremblay, Bowen Wen, Stephen Tyree, Alex Evans, Patricio A. 
Vela, Stan Birchfield | Georgia Institute of Technology,NVIDIA,Nvidia,NVIDIA Corporation | Radiance Fields | | NeRF-Loc: Visual Localization with Conditional Neural Radiance Field | Jianlin Liu, Qiang Nie, Yong Liu, Chengjie Wang | Tencent,The Chinese University of Hong Kong,Tencent YouTuLab, Shanghai Jiao Tong University | Radiance Fields | | Multimodal Neural Radiance Field | Haidong Zhu, Yuyin Sun, Chi Liu, Lu Xia, Jiajia Luo, Nan Qiao, Ram Nevatia, Cheng-hao Kuo | University of Southern California,Amazon,University of Tennessee | Radiance Fields | | Orbeez-SLAM: A Real-Time Monocular Visual SLAM with ORB Features and NeRF-Realized Mapping | Chi-ming Chung, Yang-che Tseng, Ya-ching Hsu, Xiang-qian Shi, Yun-hung Hua, Jia-Fong Yeh, Yi-ting Chen, Wen-chin Chen, Winston Hsu | National Taiwan University,National Chiao Tung University | Radiance Fields | | NeRFing It: Offline Object Segmentation through Implicit Modeling | Kenneth Blomqvist, Jen Jen Chung, Lionel Ott, Roland Siegwart | ETH Zurich,The University of Queensland | Radiance Fields | | Using Learning Curve Predictions to Learn from Incorrect Feedback | Taylor Kessler Faulkner, Andrea Thomaz | University of Washington,University of Texas at Austin | Reinforcement Learning II | | Conflict-Constrained Multi-Agent Reinforcement Learning Method for Parking Trajectory Planning | Siyuan Chen, Meiling Wang, Yi Yang, Wenjie Song | Beijing Institute of Technology | Reinforcement Learning II | | Improving Robot Navigation in Crowded Environments Using Intrinsic Rewards | Diego Martinez Baselga, Luis Riazuelo, Luis Montano Gella | University of Zaragoza,Instituto de Investigación en IngenieríadeAragón,University of Z,Universidad de Zaragoza | Reinforcement Learning II | | Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers | Yan Wang, Gautham Vasan, Rupam Mahmood | University of Alberta | Reinforcement Learning II | | Reinforcement Learning for Safe Robot Control Using 
Control Lyapunov Barrier Functions | Desong Du, Shaohang Han, Naiming Qi, Haitham Bou Ammar, Jun Wang, Wei Pan | Harbin Institute of Technology,Delft University of Technology,Princeton University,University College London | Reinforcement Learning II | | Safe Reinforcement Learning of Dynamic High-Dimensional Robotic Tasks: Navigation, Manipulation, Interaction | Puze Liu, Kuo Zhang, Davide Tateo, Snehal Jauhri, Zhiyuan Hu, Jan Peters, Georgia Chalvatzaki | Technische Universität Darmstadt,TU-Darmstadt,TU Darmstadt,Technical University of Darmstadt,Technische Universität Darmastadt | Reinforcement Learning II | | Robotic Control Using Model Based Meta Adaption | Karam Daaboul, Joel Ikels, Johann Marius Zöllner | Karlsruhe Institut for Technology,Karlsruhe Insitute of Technology,FZI Forschungszentrum Informatik | Reinforcement Learning II | | SACPlanner: Real-World Collision Avoidance with a Soft Actor Critic Local Planner and Polar State Representations | Khaled Nakhleh, Minahil Raza, Mack Tang, Matthew Andrews, Rinu Boney, Ilija Hadzic, Jeongran Lee, Atefeh Mohajeri, Karina Palyutina | Texas A&M University,Nokia Bell Labs,Aalto University | Reinforcement Learning II | | Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation | Xingyu Zhu, Xin Wang, Jonathan Freer, Hyung Jin Chang, Yixing Gao | JiLin University,Jilin University,University of Birmingham | Deep Learning Methods | | Privacy-Preserving Video Conferencing Via Thermal-Generative Images | Sheng-yang Chiu, Yu-Ting Huang, Chieh-ting Lin, Yu-chee Tseng, Jen-jee Chen, Meng-hsuan Tu, Bo-chen Tung, Yujou Nieh | National Yang Ming Chiao Tung University,NYCU,NYCU, National Yang Ming Chiao Tung University | Deep Learning Methods | | Streaming LifeLong Learning with Any-Time Inference | Soumya Banerjee, Vinay Kumar Verma, Vinay Namboodiri | IIT Kanpur,University of Bath | Deep Learning Methods | | Code as Policies: Language Model Programs for Embodied Control | Jacky Liang, Wenlong Huang, Fei Xia, Peng 
Xu, Karol Hausman, Brian Ichter, Peter Florence, Andy Zeng | Google,UC Berkeley,Google Inc,Google Brain,MIT | Award Finalists 2 | | Learning Sim-To-Real Dense Object Descriptors for Robotic Manipulation | Hoang-Giang Cao, Weihao Zeng, 毅成 吳 | National Yang Ming Chiao Tung University,NYCU,National Chiao Tung University | Representation Learning | | Learning Visual-Audio Representations for Voice-Controlled Robots | Peixin Chang, Shuijing Liu, D. Livingston Mcpherson, Katherine Driggs-Campbell | University of Illinois at Urbana Champaign,University of Illinois,University of Illinois at Urbana-Champaign | Representation Learning | | Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations | Negin Heravi, Ayzaan Wahid, Corey Lynch, Peter Florence, Travis Armstrong, Jonathan Tompson, Pierre Sermanet, Jeannette Bohg, Debidatta Dwibedi | Stanford University,Google,Google Brain,MIT | Representation Learning | | Sample-Efficient Goal-Conditioned Reinforcement Learning Via Predictive Information Bottleneck for Goal Representation Learning | Qiming Zou, Einoshin Suzuki | Kyushu University | Representation Learning | | Context-Aware Robot Control Using Gesture Episodes | Petr Vanc, Jan Kristof Behrens, Karla Stepanova | CIIRC, Czech Technical University in Prague,Czech Technical University,Czech Technical university | Learning from Experience | | Automated Action Evaluation for Robotic Imitation Learning Via Siamese Neural Networks | Xiang Chang, Fei Chao, Changjing Shang, Qiang Shen | Aberystwyth University,Xiamen University | Learning from Experience | | Failure-Aware Policy Learning for Self-Assessable Robotics Tasks | Kechun Xu, Runjian Chen, Shuqi Zhao, Zizhang Li, Hongxiang Yu, Ci Chen, Yue Wang, Rong Xiong | Zhejiang University | Learning from Experience | | Multimodal Time Series Learning of Robots Based on Distributed and Integrated Modalities: Verification with a Simulator and Actual Robots | Hideyuki Ichiwara, Hiroshi Ito, Kenjiro Yamamoto, Hiroki 
Mori, Tetsuya Ogata | Hitachi, Ltd. / Waseda University,Hitachi, Ltd.,Waseda University | Learning from Experience | | Using Memory-Based Learning to Solve Tasks with State-Action Constraints | Mrinal Verghese, Christopher Atkeson | Carnegie Mellon University,CMU | Learning from Experience | | Structured Motion Generation with Predictive Learning: Proposing Subgoal for Long-Horizon Manipulation | Namiko Saito, Joao Moura, Tetsuya Ogata, Marina Y Aoyama, Shingo Murata, Shigeki Sugano, Sethu Vijayakumar | University of Edinburgh,Waseda University,Keio University | Learning from Experience | | Sequence-Agnostic Multi-Object Navigation | Gireesh Nandiraju, Ayush Agrawal, Ahana Datta, Snehasis Banerjee, Mohan Sridharan, Brojeshwar Bhowmick, Madhava Krishna | IIIT Hyderabad,Robotics Research Center, IIIT Hyderabad,International Institute of Information Technology, Hyderabad,TCS Research,University of Birmingham,Tata Consultancy Services | Learning from Experience | | Occlusion Reasoning for Skeleton Extraction of Self-Occluded Tree Canopies | Chung Hee Kim, George Kantor | Carnegie Mellon University | Award Finalists 4 | | Statistical Shape Representations for Temporal Registration of Plant Components in 3D | Karoline Heiwolt, Cengiz Öztireli, Grzegorz Cielniak | University of Lincoln,ETH Zurich | Agricultural Robotics and Automation I | | 3D Reconstruction-Based Seed Counting of Sorghum Panicles for Agricultural Inspection | Harry Freeman, Eric Schneider, Chung Hee Kim, Moonyoung Lee, George Kantor | Carnegie Mellon University | Agricultural Robotics and Automation I | | Hierarchical Approach for Joint Semantic, Plant Instance, and Leaf Instance Segmentation in the Agricultural Domain | Gianmarco Roggiolani, Matteo Sodano, Tiziano Guadagnino, Federico Magistri, Jens Behley, Cyrill Stachniss | University of Bonn,Photogrammetry and Robotics Lab, University of Bonn,Sapienza University of Rome | Agricultural Robotics and Automation I | | Target-Aware Implicit Mapping for 
Agricultural Crop Inspection | Shane Kelly, Alessandro Riccardi, Elias Ariel Marks, Federico Magistri, Tiziano Guadagnino, Margarita Chli, Cyrill Stachniss | ETH Zurich,University of Bonn,Sapienza University of Rome | Award Finalists 2 | | Robust Plant Localization and Phenotyping in Dense 3D Point Clouds for Precision Agriculture | Henry J. Nelson, Christopher Smith, Athanasios Bacharis, Nikos Papanikolopoulos | University of Minnesota,Lake Superior State University | Agricultural Robotics and Automation I | | Neural-Kalman GNSS/INS Navigation for Precision Agriculture | Yayun Du, Swapnil Sayan Saha, Sandeep Sandha, Arthur Lovekin, Jason Wu, S. Siddharth, Mahesh Chowdhary, Mohammad Khalid Jawed, Mani Srivastava | University of California, Los Angeles,University of California - Los Angeles,STMicroelectronics,UCLA | Agricultural Robotics and Automation I | | Fruit Tracking Over Time Using High-Precision Point Clouds | Alessandro Riccardi, Shane Kelly, Elias Ariel Marks, Federico Magistri, Tiziano Guadagnino, Jens Behley, Maren Bennewitz, Cyrill Stachniss | University of Bonn,ETH Zurich,Sapienza University of Rome | Agricultural Robotics and Automation I | | A MySQL Database for the Systematic Configuration Selection of Redundant Manipulators When Path Planning in Confined Spaces | Kat Styles Wood, Thomas. 
B Scott, Antonia Tzemanaki | University of Bristol | Redundant Robots | | Reinforcement Learning Control of a Reconfigurable Planar Cable Driven Parallel Manipulator | Adhiti Raman, Ameya Salvi, Matthias Schmid, Venkat Krovi | Clemson University | Redundant Robots | | Intuitive Telemanipulation of Hyper-Redundant Snake Robots within Locomotion and Reorientation Using Task-Priority Inverse Kinematics | Tim-Lukas Habich, Melvin Hueter, Moritz Schappler, Svenja Tappe | Leibniz University Hannover,Institute of Mechatronic Systems, Leibniz Universitaet Hannover,Leibniz Universität Hannover | Redundant Robots | | An Equivalent Two Section Method for Calculating the Workspace of Multi-Segment Continuum Robots | Yeman Fan, Dikai Liu | University of Technology Sydney | Redundant Robots | | On Locally Optimal Redundancy Resolution Using the Basis of the Null Space | Eugenio Monari, Yi Chen, Rocco Vertechy | University of Bologna,Università di Bologna | Redundant Robots | | Optimal Parameterized Joints Selection to Improve Motion Planning Performance of Redundant Manipulators | Bin Xie, Qingfeng Wang, Di Wu | Central South University | Redundant Robots | | A Kinematically Redundant (6+1)-Dof Hybrid Parallel Robot for Delicate Physical Environment and Robot Interaction (pERI) | Jehyeok Kim, Clement Gosselin | Université Laval | Redundant Robots | | Learning-Based Initialization of Trajectory Optimization for Path-Following Problems of Redundant Manipulators | Minsung Yoon, Mincheul Kang, Daehyung Park, Sung-Eui Yoon | Korea Advanced Institute of Science and Technology (KAIST),KAIST,Korea Advanced Institute of Science and Technology, KAIST | Award Finalists 2 | | Kinematic Analysis and Design of a Novel (6+3)-DoF Parallel Robot with Fixed Actuators | Arda Yigit, David Breton, Zhou Zhou, Thierry Laliberte, Clement Gosselin | Laval University,University Laval,Universite Laval,Université Laval | Kinematics | | RangedIK: An Optimization-Based Robot Motion Generation Method for 
Ranged-Goal Tasks | Yeping Wang, Pragathi Praveena, Daniel Rakita, Michael Gleicher | University of Wisconsin-Madison,University of Wisconsin - Madison | Kinematics | | Contact Based Turning Gait of a Novel Legged-Wheeled Quadruped | Alper Yeldan, Abhimanyu Arora, Gim Song Soh | Singapore University of Technology and Design | Kinematics | | Computational Modeling in System with Non-Circular Timing Pulleys | Renzo Caballero, Angelica Coronado Preciado, Eric Feron | King Abdullah University of Science and Technology | Kinematics | | The New Exhibition *Blind Machines*, a Large 3D Printing Machine | Jean-Pierre Merlet | INRIA | Parallel Robots | | New Bracket Polynomials Associated with the General Gough-Stewart Parallel Robot Singularities | Federico Thomas | CSIC-UPC | Award Finalists 1 | | Output Mode Switching for Parallel Five-Bar Manipulators Using a Graph-Based Path Planner | Parker Edwards, Aravind Baskar, Caroline Hills, Mark Plecnik, Jonathan Hauenstein | University of Notre Dame | Parallel Robots | | Dimensional Optimization and Anti-Disturbance Analysis of an Upgraded Feed Mechanism in FAST | Xiaoyan Wang, Bin Zhang, Zhaoyang Li, Gao Xinyu, Fei Zhang, Yifan Ma, Rui Yao, Jia-ning Yin, Hui Li, Qingge Yang, Qingwei Li, Weiwei Shang | University of Science and Technology of China,National Astronomical Observatories,Chinese Academy of Sciences,National Astronomical Observatories, Chinese Academy of Sciences | Parallel Robots | | Online Social Robot Navigation in Indoor, Large and Crowded Environments | Steven Alexander Silva Mendoza, Nervo Xavier Verdezoto Dias, Dennys Paillacho, Samuel Millan-norman, Juan David Hernández | Cardiff University,Espol Polytechnic University | Human-Robot Collaboration II | | Learning Responsibility Allocations for Safe Human-Robot Interaction with Applications to Autonomous Driving | Ryan Cosner, Yuxiao Chen, Karen Yan Ming Leung, Marco Pavone | California Institute of Technology,Nvidia research,Stanford
University, NVIDIA Research, University of Washington,Stanford University | Human-Robot Collaboration II | | Efficient Inference of Temporal Task Specifications from Human Demonstrations Using Experiment Design | Shlok Sobti, Rahul Shome, Lydia Kavraki | Diamond Age ,D,The Australian National University,Rice University | Human-Robot Collaboration II | | On the Impact of Interruptions During Multi-Robot Supervision Tasks | Abhinav Dahiya, Yifan Cai, Oliver Schneider, Stephen L. Smith | University of Waterloo | Human-Robot Collaboration II | | System Configuration and Navigation of a Guide Dog Robot: Toward Animal Guide Dog-Level Guiding Work | Hochul Hwang, Tian Xia, Ibrahima Keita, Ken Suzuki, Joydeep Biswas, Sunghoon Ivan Lee, Donghyun Kim | University of Massachusetts Amherst,University of Massachusetts at Amherst,University of Massachusetts, Amherst,University of Texas at Austin,UMass Amherst | Human-Robot Collaboration II | | Human Non-Compliance with Robot Spatial Ownership Communicated Via Augmented Reality: Implications for Human-Robot Teaming Safety | Christine T Chang, Matthew Luebbers, Mitchell Hebert, Bradley Hayes | University of Colorado Boulder,Draper | Human-Robot Collaboration II | | Robust Robot Planning for Human-Robot Collaboration | Yang You, Vincent Thomas, Francis Colas, Alami Rachid, Olivier Buffet | Inria Nancy Grand Est,LORIA - Universite de Lorraine,CNRS,LORIA/INRIA | Human-Robot Collaboration II | | Natural Language Instruction Understanding for Robotic Manipulation: A Multisensory Perception Approach | Weihua Wang, Xiaofei Li, Yanzhi Dong, Jun Xie, Di Guo, Huaping Liu | Yantai University,Taiyuan University of Technology,Beijing University of Posts and Telecommunications,Tsinghua University | Human-Robot Collaboration II | | EgoHMR: Egocentric Human Mesh Recovery Via Hierarchical Latent Diffusion Model | Yuxuan Liu, Jianxin Yang, Xiao Gu, Yao Guo, Guang-Zhong Yang | Shanghai Jiao Tong University,Imperial College London | Human-Robot 
Collaboration II | | Telerobot Operators Can Account for Varying Transmission Dynamics in a Visuo-Haptic Object Tracking Task | Mohit Singhala, Jeremy Brown | Johns Hopkins University | Human-Robot Collaboration II | | Hierarchical Intention Tracking for Robust Human-Robot Collaboration in Industrial Assembly Tasks | Zhe Huang, Ye-ji Mun, Xiang Li, Yiqing Xie, Ninghan Zhong, Weihang Liang, Junyi Geng, Tan Chen, Katherine Driggs-Campbell | University of Illinois at Urbana-Champaign,University of Illinois Urbana-Champaign,Pennsylvania State University,Michigan Technological University | Human-Robot Collaboration II | | CoGrasp: 6-DoF Grasp Generation for Human-Robot Collaboration | Abhinav Keshari, Hanwen Ren, Ahmed H. Qureshi | Purdue University | Human-Robot Collaboration II | | Can We Use Diffusion Probabilistic Models for 3D Motion Prediction? | Hyemin Ahn, Esteve Valls Mascaro, Dongheui Lee | Ulsan National Institute of Science and Technology,Technische Universitat Wien,Technische Universität Wien (TU Wien) | Intent Recognition | | PedFormer: Pedestrian Behavior Prediction Via Cross-Modal Attention Modulation and Gated Multitask Learning | Amir Rasouli, Iuliia Kotseruba | Huawei Technologies Canada,Lassonde School of Engineering | Intent Recognition | | Robot-Assisted Eye-Hand Coordination Training System by Estimating Motion Direction Using Smooth-Pursuit Eye Movements | Xiao Li, Zeng Hong, Chenhua Yang, Song Aiguo | School of Instrument Science and Engineering,Southeast Universit,Southeast University | Intent Recognition | | Generalizable Movement Intention Recognition with Multiple Heterogeneous EEG Datasets | Xiao Gu, Jinpei Han, Guang-Zhong Yang, Benny Lo | Imperial College London,Shanghai Jiao Tong University | Intent Recognition | | Bi-Manual Manipulation of Multi-Component Garments towards Robot-Assisted Dressing | Stelios Kotsovolis, Yiannis Demiris | Imperial College London | Physical Human-Robot Interaction I | | Humans Need Augmented Feedback to 
Physically Track Non-Biological Robot Movements | Mahdiar Edraki, Pauline Maurice, Dagmar Sternad | Northeastern University,CNRS - LORIA | Physical Human-Robot Interaction I | | Robot Mimicry Attack on Keystroke-Dynamics User Identification and Authentication System | Rongyu Yu, Burak Kizilkaya, Zhen Meng, Liying Emma Li, Guodong Zhao, Muhammad Ali Imran | University of Glasgow,University of Glasgow, UK | Physical Human-Robot Interaction I | | In-Mouth Robotic Bite Transfer with Visual and Haptic Sensing | Lorenzo Shaikewitz, Yilin Wu, Suneel Belkhale, Jennifer Grannen, Priya Sundaresan, Dorsa Sadigh | California Institute of Technology,Stanford University | Physical Human-Robot Interaction I | | Robot Trust and Self-Confidence Based Role Arbitration Method for Physical Human-Robot Collaboration | Qiao Wang, Dikai Liu, Marc Garry Carmichael, Chin-teng Lin | University of Technology Sydney,Centre for Autonomous Systems,UTS | Physical Human-Robot Interaction I | | Design Optimization and Data-Driven Shallow Learning for Dynamic Modeling of a Smart Segmented Electroadhesive Clutch | Navid Feizi, Zahra Bahrami, S. Farokh Atashzar, Mehrdad R. Kermani, Rajnikant V. 
Patel | University of Western Ontario,Institute of Geography, University of Erlangen-Nuremberg,New York University (NYU), US,The University of Western Ontario | Physical Human-Robot Interaction I | | Learning from Physical Human Feedback: An Object-Centric One-Shot Adaptation Method | Alvin Shek, Bo Ying Su, Rui Chen, Changliu Liu | Carnegie Mellon University,Carnegie Mellon University; University of Michigan; | Award Finalists 3 | | Touch Classification on Robotic Skin Using Multimodal Tactile Sensing Modules | Min Jin Yang, Junhwi Cho, Hyunjo Chung, Kyungseo Park, Jung Kim | Korea Advanced Institute of Science and Technology (KAIST),KAIST,University of Illinois at Urbana-Champaign | Physical Human-Robot Interaction I | | Distributed Data-Driven Predictive Control for Multi-Agent Collaborative Legged Locomotion | Randall Fawcett, Leila Amanzadeh, Jeeseop Kim, Aaron Ames, Kaveh Akbari Hamed | Virginia Polytechnic Institute and State University,Virginia Tech University,Caltech,Virginia Tech | Award Finalists 3 | | On the Use of Torque Measurement in Centroidal State Estimation | Shahram Khorshidi, Ahmad Gazar, Nicholas Rotella, Maximilien Naveau, Ludovic Righetti, Maren Bennewitz, Majid Khadiv | University of Bonn,Max-Planck Institute for Intelligent Systems,University of Southern California,LAAS/CNRS,New York University,Max Planck Institute for Intelligent Systems | Legged Motion Analysis and Synthesis | | DMMGAN: Diverse Multi Motion Prediction of 3D Human Joints Using Attention-Based Generative Adversarial Network | Payam Nikdel, Mohammad Mahdavian, Mo Chen | Simon Fraser University/Waymo,Simon Fraser University | Legged Motion Analysis and Synthesis | | Contact Optimization for Non-Prehensile Loco-Manipulation Via Hierarchical Model Predictive Control | Alberto Rigo, Yiyu Chen, Satyandra K. 
Gupta, Quan Nguyen | USC,University of Southern California | Legged Motion Analysis and Synthesis | | Optimal Scheduling of Models and Horizons for Model Hierarchy Predictive Control | Charles Khazoom, Steve Heim, Daniel Gonzalez-Diaz, Sangbae Kim | Massachusetts Institute of Technology | Legged Motion Analysis and Synthesis | | STPOTR: Simultaneous Human Trajectory and Pose Prediction Using a Non-Autoregressive Transformer for Robot Follow-Ahead | Mohammad Mahdavian, Payam Nikdel, Mahdi Taherahmadi, Mo Chen | Simon Fraser University,Simon Fraser University/Waymo | Legged Motion Analysis and Synthesis | | Visual-Inertial and Leg Odometry Fusion for Dynamic Locomotion | Victor Dhedin, Haolong Li, Shahram Khorshidi, Lukas Mack, Adithya Kumar Chinnakkonda Ravi, Avadesh Meduri, Paarth Shah, Felix Grimminger, Ludovic Righetti, Majid Khadiv, Joerg Stueckler | Max Planck Institute for Intelligent Systems,University of Bonn,New York University,University of Oxford | Legged Motion Analysis and Synthesis | | Getting Air: Modelling and Control of a Hybrid Pneumatic-Electric Legged Robot | Christopher Mailer, Stacey Leigh Shield, Reuben Govender, Amir Patel | University of Cape Town,University of Cape Town, | Legged Motion Analysis and Synthesis | | Enhanced Balance for Legged Robots Using Reaction Wheels | Chi Yen Lee, Shuo Yang, Benjamin Bokser, Zachary Manchester | Carnegie Mellon University | Legged Motion Analysis and Synthesis | | Versatile Real-Time Motion Synthesis Via Kino-Dynamic MPC with Hybrid-Systems DDP | He Li, Tingnan Zhang, Wenhao Yu, Patrick Wensing | University of Notre Dame,Google | Legged Motion Analysis and Synthesis | | Distributed Model Predictive Formation Control with Gait Synchronization for Multiple Quadruped Robots | Shaohang Xu, Wentao Zhang, Lijun Zhu, Chin Pang Ho | Huazhong University of Science and Technology,City University of Hong Kong | Legged Motion Analysis and Synthesis | | Video Waterdrop Removal Via Spatio-Temporal Fusion in Driving 
Scenes | Qiang Wen, Yue Wu, Qifeng Chen | Hong Kong University of Science and Technology,HKUST | Autonomous Navigation | | Unsupervised Learning of Depth and Pose Based on Monocular Camera and Inertial Measurement Unit (IMU) | Yanbo Wang, Hanwen Yang, Jianwei Cai, Guangming Wang, Jingchuan Wang, Yi Huang | Shanghai Jiao Tong University,Shanghai JiaoTong University,Shanghai Weitong Vision Technology Co. , Ltd. | Autonomous Navigation | | Self-Supervised Multi-Frame Monocular Depth Estimation with Pseudo-LiDAR Pose Augmentation | Wenhua Wu, Guangming Wang, Jiquan Zhong, Hesheng Wang, Zhe Liu | Shang Hai Jiao Tong University,Shanghai Jiao Tong University,Shanghai Jiaotong University,University of Cambridge | Autonomous Navigation | | Anomaly Detection Based Robust Autonomous Navigation | Kefan Jin, Mu Fun, Xingyao Han, Guangming Wang, Zhe Liu | Shanghai Jiao Tong University,University of Cambridge | Autonomous Navigation | | Learning Perceptual Hallucination for Multi-Robot Navigation in Narrow Hallways | JinSoo Park, Xuesu Xiao, Garrett Warnell, Harel Yedidsion, Peter Stone | The University of Texas at Austin,George Mason University,U.S. 
Army Research Laboratory,University of Texas at Austin | Autonomous Navigation | | Multi-Head Attention Machine Learning for Fault Classification in Mixed Autonomous and Human-Driven Vehicle Platoons | Theodore Wu, Satvick Acharya, Abdelrahman Khalil, Ahmed Aljanaideh, Mohammad Al Janaideh, Deepa Kundur | University of Toronto,university of Toronto,Memorial University of Newfoundland,Bentley University,Memorial University &University of Toronto | Autonomous Navigation | | GP-Frontier for Local Mapless Navigation | Mahmoud Ali, Lantao Liu | Indiana University | Autonomous Navigation | | Image Masking for Robust Self-Supervised Monocular Depth Estimation | Hemang Chawla, Kishaan Jeeveswaran, Elahe Arani, Bahram Zonooz | Navinfo Europe | Autonomous Navigation | | Learning-Based Uncertainty-Aware Navigation in 3D Off-Road Terrains | Hojin Lee, Junsung Kwon, Cheolhyeon Kwon | Ulsan National Institute of Science and Technology,Ulsan National Institute of Sience and Technology | Autonomous Navigation | | Safe Real-World Autonomous Driving by Learning to Predict and Plan with a Mixture of Experts | Stefano Pini, Christian Perone, Aayush Ahuja, Ana Sofia Rufino Ferreira, Moritz Niendorf, Sergey Zagoruyko | Woven by Toyota U.K. 
Limited,Woven Planet UK,Woven Planet,Woven Planet Holdings, Inc | Autonomous Navigation | | Interpretable and Flexible Target-Conditioned Neural Planners for Autonomous Vehicles | Haolan Liu, Jishen Zhao, Liangjun Zhang | University of California San Diego,UC San Diego,Baidu | Autonomous Navigation | | Visibility-Aware Navigation among Movable Obstacles | Jose Muguira Iturralde, Aidan Curtis, Yilun Du, Leslie Kaelbling, Tomas Lozano-Perez | Massachusetts Institute of Technology,MIT | Autonomous Navigation | | Trajectory Error Compensation for Optimal Control of UMA-2 – a Climbing Robot Executing Maintenance Operation in Harsh Environment | Diego Gitardi, Simone Sabbadini, Anna Valente | SUPSI - University of Applied Sciences and Arts of Southern Swit,SUPSI-ISTePS | Trajectory Optimization | | Obstacle-Aware Topological Planning Over Polyhedral Representation for Quadrotors | Junjie Gao, Fenghua He, Wei Zhang, Yu Yao | Harbin Institute of Technology | Award Finalists 2 | | Trajectory Optimization for 3D Shape-Changing Robots with Differential Mobile Base | Mengke Zhang, Chao Xu, Fei Gao, Yanjun Cao | Zhejiang University,Zhejiang University, Huzhou Institute of Zhejiang University | Trajectory Optimization | | Trajectory Optimization for Distributed Manipulation by Shaping a Physical Field | Adam Uchytil, Jiri Zemanek | Faculty of Electrical Engineering, Czech Technical University in Prague,Czech Technical University in Prague | Trajectory Optimization | | Globally Guided Trajectory Planning in Dynamic Environments | Oscar De Groot, Laura Ferranti, Dariu Gavrila, Javier Alonso-Mora | Delft University of Technology | Trajectory Optimization | | VP-STO: Via-Point-Based Stochastic Trajectory Optimization for Reactive Robot Behavior | Julius Jankowski, Lara Brudermüller, Nick Hawes, Sylvain Calinon | Idiap Research Institute and EPFL,University of Oxford,Idiap Research Institute | Trajectory Optimization | | Modular and Parallelizable Multibody Physics Simulation Via 
Subsystem-Based ADMM | Jeongmin Lee, Minji Lee, Dongjun Lee | Seoul National University | Trajectory Optimization | | Real-Time Unified Trajectory Planning and Optimal Control for Urban Autonomous Driving under Static and Dynamic Obstacle Constraints | Rowan Dempster, Mohammad Alsharman, Derek Rayside, William Melek | University of Waterloo | Trajectory Optimization | | A General Locomotion Approach for a Novel Multi-Legged Spherical Robot | Dun Yang, Yunfei Liu, Yang Yu | Beihang University | Integrated Planning and Control | | A Coarse-To-Fine Framework for Dual-Arm Manipulation of Deformable Linear Objects with Whole-Body Obstacle Avoidance | Mingrui Yu, Kangchen Lv, Changhao Wang, Masayoshi Tomizuka, Xiang Li | Tsinghua University,University of California, Berkeley,University of California | Integrated Planning and Control | | Adaptive Approximation of Dynamics Gradients Via Interpolation to Speed up Trajectory Optimisation | David Mackenzie Charles Russell, Rafael Papallas, Mehmet Remzi Dogar | University of Leeds | Integrated Planning and Control | | Learning Augmented, Multi-Robot Long-Horizon Navigation in Partially Mapped Environments | Abhish Khanal, Gregory Stein | George Mason University | Integrated Planning and Control | | Switching Attention in Time-Varying Environments Via Bayesian Inference of Abstractions | Meghan Booker, Anirudha Majumdar | Princeton University | Integrated Planning and Control | | Hierarchical Policy Blending As Inference for Reactive Robot Control | Kay Hansel, Julen Urain, Jan Peters, Georgia Chalvatzaki | Intelligent Autonomous Systems Group, Technical University Darmstadt,TU Darmstadt,Technische Universität Darmstadt,Technische Universität Darmastadt | Integrated Planning and Control | | Efficient Learning of High Level Plans from Play | Núria Armengol Urpí, Marco Bagatella, Otmar Hilliges, Georg Martius, Stelian Coros | ETH Zurich,Max Planck Institute for Intelligent Systems | Integrated Planning and Control | | 
Multi-Objective Ergodic Search for Dynamic Information Maps | Ananya Rao, Abigail Breitfeld, Alberto Candela, Benjamin Jensen, David Wettergreen, Howie Choset | Carnegie Mellon University,NASA Jet Propulsion Laboratory, Caltech | Learning for Motion and Path Planning | | Safety-Critical Ergodic Exploration in Cluttered Environments Via Control Barrier Functions | Cameron Lerch, Dayi Dong, Ian Abraham | Yale University | Learning for Motion and Path Planning | | GuILD: Guided Incremental Local Densification for Accelerated Sampling-Based Motion Planning | Rosario Scalise, Aditya Mandalika, Brian Hou, Sanjiban Choudhury, Siddhartha Srinivasa | University of Washington,Cornell University | Learning for Motion and Path Planning | | ARiADNE: A Reinforcement Learning Approach Using Attention-Based Deep Networks for Exploration | Yuhong Cao, Tianxiang Hou, Yizhuo Wang, Xian Yi, Guillaume Sartoretti | National University of Singapore,National University of Singapore (NUS) | Learning for Motion and Path Planning | | On Shortest Arc-To-Arc Dubins Path | Satyanarayana Gupta Manyam, David Casbeer | Infoscitex corp.,AFRL | Learning for Motion and Path Planning | | Robust Navigation with Cross-Modal Fusion and Knowledge Transfer | Wenzhe Cai, Guangran Cheng, Lingyue Kong, Lu Dong, Changyin Sun | Southeast University | Learning for Motion and Path Planning | | Contextual Multi-Objective Path Planning | Anna Nickelson, Kagan Tumer, William Smart | Oregon State University | Learning for Motion and Path Planning | | A Continuous Off-Policy Reinforcement Learning Scheme for Optimal Motion Planning in Simply-Connected Workspaces | Panagiotis Rousseas, Charalampos Bechlioulis, Kostas Kyriakopoulos | National Technical University of Athens,University of Patras,National Technical Univ. 
of Athens | Learning for Motion and Path Planning | | Towards Robust Autonomous Grasping with Reflexes Using High-Bandwidth Sensing and Actuation | Andrew Saloutos, Hongmin Kim, Elijah Stanger-jones, Menglong Guo, Sangbae Kim | Massachusetts Institute of Technology,Seoul National University,University of California Berkeley | Grasping and Manipulation II | | High-Speed Scooping: An Implementation through Stiffness Control and Direct-Drive Actuation | Ka Hei Mak, Pu Xu, Jungwon Seo | The Hong Kong University of Science and Technology,Hong Kong University of Science and Technology,Pusan National University | Grasping and Manipulation II | | GraspAda: Deep Grasp Adaptation through Domain Transfer | Yiting Chen, Junnan Jiang, Ruiqi Lei, Yasemin Bekiroglu, Fei Chen, Miao Li | Wuhan University,Tsinghua University,Chalmers University of Technology, University College London,The Chinese University of Hong Kong | Grasping and Manipulation II | | Task-Oriented Stiffness Setting for a Variable Stiffness Hand | Ana Elvira Huezo Martin, Ashok Meenakshi Sundaram, Werner Friedl, Virginia Ruiz Garate, Maximo A. Roa | German Aerospace Center (DLR),German AerospaceCenter (DLR),University of Mondragon,DLR - German Aerospace Center | Grasping and Manipulation II | | Flipbot: Learning Continuous Paper Flipping Via Coarse-To-Fine Exteroceptive-Proprioceptive Exploration | Chao Zhao, Chunli Jiang, Junhao Cai, Hongyu Yu, Michael Y. 
Wang, Qifeng Chen | Hong Kong University of Science and Technology,The Hong Kong University of Science and Technology,Monash University,HKUST | Grasping and Manipulation II | | Anthropomorphic Robot Hand Using the Principle of Sweat and Fingerprints of Human Hands | Donghyun Kim, Junmo Yang, Dongwon Yun | Daegu Gyeongbuk Institute of Science and Technology,Daegu Gyeongbuk Institute of Science and Technology (DGIST) | Grasping and Manipulation II | | In-Hand Manipulation in Power Grasp: Design of an Adaptive Robot Hand with Active Surfaces | Yilin Cai, Shenli Yuan | Carnegie Mellon University,SRI International | Award Finalists 1 | | Passive Robotic Gripper Using a Contact-Based Locking Mechanism | Issei Nate, Zhongkui Wang, Shinichi Hirai | Ritsumeikan University,Ritsumeikan Univ. | Grasping and Manipulation II | | The New Dexterity Adaptive Humanlike Robot Hand: Employing a Reconfigurable Palm for Robust Grasping and Dexterous Manipulation | Geng Gao, Anany Dwivedi, Minas Liarokapis | Acumino inc,University of Auckland,The University of Auckland | Grasping and Manipulation II | | Picking by Tilting: In-Hand Manipulation for Object Picking Using Effector with Curved Form | Yanshu SONG, Abdullah Nazir, Darwin Lau, Yunhui Liu | CUHK(Chinese University of Hong Kong),Hong Kong Centre for Logistics Robotics,The Chinese University of Hong Kong,Chinese University of Hong Kong | Grasping and Manipulation II | | Linear Delta Arrays for Compliant Dexterous Distributed Manipulation | Sarvesh Bipin Patil, Long Tao, Tess Hellebrekers, Zeynep Temel, Oliver Kroemer | Carnegie Mellon University School of Computer Science,Carnegie Mellon University,Meta AI Research | Grasping and Manipulation II | | A Tactile-Enabled Hybrid Rigid-Soft Continuum Manipulator for Forceful Enveloping Grasps Via Scale Invariant Design | Ian Taylor, Maheera Bawa, Alberto Rodriguez | Massachusetts Institute of Technology,MIT | Grasping and Manipulation II | | Adaptive Optimal Electrical Resistance
Tomography for Large-Area Tactile Sensing | Wendong Zheng, Huaping Liu, Di Guo, Wuqiang Yang | Tsinghua University,Beijing University of Posts and Telecommunications,The University of Manchester | Force and Tactile Sensing I | | Towards Open-Set Material Recognition Using Robot Tactile Sensing | Kun-hong Liu, Qianhui Yang, Yu Xie, Xiangyi Huang | Xiamen University | Award Finalists 2 | | RobotSweater: Scalable, Generalizable, and Customizable Machine-Knitted Tactile Skins for Robots | Zilin Si, Tianhong Yu, Katrene Morozov, James Mccann, Wenzhen Yuan | Carnegie Mellon University,Cornell University,University of California, Santa Barbara | Force and Tactile Sensing I | | DTact: A Vision-Based Tactile Sensor That Measures High-Resolution 3D Geometry Directly from Darkness | Changyi Lin, Ziqi Lin, Shaoxiong Wang, Huazhe Xu | Shanghai Qi Zhi Institute,Tsinghua University,MIT | Force and Tactile Sensing I | | MagTac: Magnetic Six-Axis Force/Torque Fingertip Tactile Sensor for Robotic Hand Applications | Sungwoo Park, Sang-Rok Oh, Donghyun Hwang | Korea university, KIST,KIST,Korea Institute of Science and Technology | Force and Tactile Sensing I | | Tac-VGNN: A Voronoi Graph Neural Network for Pose-Based Tactile Servoing | Wen Fan, Max Yang, Yifan Xing, Nathan Lepora, Dandan Zhang | University of Bristol | Force and Tactile Sensing I | | Safe Self-Supervised Learning in Real of Visuo-Tactile Feedback Policies for Industrial Insertion | Letian Fu, Huang Huang, Lars Berscheid, Hui Li, Ken Goldberg, Sachin Chitta | UC Berkeley,University of California at Berkeley,Karlsruhe Institute of Technology,Autodesk Research,Autodesk Inc. 
| Force and Tactile Sensing I | | In-Situ Mechanical Calibration for Vision-Based Tactile Sensors | Can Zhao, Jieji Ren, Hexi Yu, Daolin Ma | Shanghai Jiao Tong University | Force and Tactile Sensing I | | Tactile-Driven Gentle Grasping for Human-Robot Collaborative Tasks | Christopher Ford, Haoran Li, John Lloyd, Manuel Giuseppe Catalano, Matteo Bianchi, Efi Psomopoulou, Nathan Lepora | University of Bristol,Istituto Italiano di Tecnologia,University of Pisa | Force and Tactile Sensing I | | TANDEM3D: Active Tactile Exploration for 3D Object Recognition | Jingxi Xu, Han Lin, Shuran Song, Matei Ciocarlie | Columbia University | Force and Tactile Sensing I | | Cable Routing and Assembly Using Tactile-Driven Motion Primitives | Achu Wilson, Helen Jiang, Wenzhao Lian, Wenzhen Yuan | Carnegie Mellon University,Google X | Force and Tactile Sensing I | | A Tactile Feedback Insertion Strategy for Peg-In-Hole Tasks | Oliver Gibbons, Alessandro Albini, Perla Maiolino | University of Oxford | Force and Tactile Sensing I | | Coupled, Closed-System Fluidic Actuators for Use in Wearable Rehabilitation Devices | James Greig, Maria Elena Giannaccini, Edward Chadwick | University of Aberdeen,University of Bristol | Rehabilitation and Augmentation I | | Emulating Human Kinematic Behavior on Lower-Limb Prostheses Via Multi-Contact Models and Force-Based Nonlinear Control | Rachel Gehlhar, Aaron Ames | California Institute of Technology | Rehabilitation and Augmentation I | | Simplified Motor Primitives for Gait Symmetrization: Pilot Study with an Active Hip Orthosis | Henri Laloyaux, Chiara Livolsi, Andrea Pergolini, Simona Crea, Nicola Vitiello, Renaud Ronsse | Université catholique de Louvain,IUVO S.r.l, Scuola Superiore Sant'Anna of Pisa,Scuola Superiore Sant'Anna of Pisa,Scuola Superiore Sant'Anna, The BioRobotics Institute,Scuola Superiore Sant Anna | Rehabilitation and Augmentation I | | A Preliminary Study of the Effects of Active Recovery Reflexes on Stumble Recovery in a 
Swing-Assist Knee Prosthesis | Jantzen Lee, Shane King, Maura Eveld, Michael Goldfarb | Vanderbilt University,University of Twente | Rehabilitation and Augmentation I | | Exploring Multimodal Gait Rehabilitation and Assistance through an Adaptable Robotic Platform | Sophia Otalora, Sergio D. Sierra M., Felipe Ballen-moreno, Marcela Múnera, Carlos A. Cifuentes | Federal University of Espírito Santo,University of Bristol - University of the West of England,Vrije Universiteit Brussel, R&MM, Brubotics, Flanders Make,Escuela Colombiana de Ingeniería Julio Garavito,University of the West of England, Bristol | Rehabilitation and Augmentation I | | Bilateral Asymmetric Hip Stiffness Applied by a Robotic Hip Exoskeleton Elicits Kinematic and Kinetic Adaptation | Banu Abdikadirova, Mark Price, Jonaz Moreno Jaramillo, Wouter Hoogkamer, Meghan Huber | University of Massachusetts Amherst,University of Massachusetts, Amherst | Rehabilitation and Augmentation I | | Gait Event Detection with Proprioceptive Force Sensing in a Powered Knee-Ankle Prosthesis: Validation Over Walking Speeds and Slopes | Emily Keller, Curt A. Laubscher, Robert D. Gregg | University of Michigan | Rehabilitation and Augmentation I | | Towards a Finned-Swimming Exoskeleton: A Robotic Flutter Kicking Testbed and Its Corresponding Thrust Generation | Beau Johnson, Michael Goldfarb | Vanderbilt University | Rehabilitation and Augmentation I | | Continuous Prediction of Leg Kinematics During Walking Using Inertial Sensors, Smart Glasses, and Embedded Computing | Oleksii Tsepa, Roman Burakov, Brokoslaw Laschowski, Alex Mihailidis | Igor Sikorsky Kyiv Polytechnic Institute,National University of Kyiv-Mohyla Academy,University of Toronto | Rehabilitation and Augmentation I | | Trajectory and Sway Prediction towards Fall Prevention | Weizhuo Wang, Michael Raitor, Steven H. 
Collins, Karen Liu, Monroe Kennedy | Stanford University | Rehabilitation and Augmentation I | | Multi-Modal Learning and Relaxation of Physical Conflict for an Exoskeleton Robot with Proprioceptive Perception | Xuan Zhang, Yana Shu, Yu Chen, Gong Chen, Jing Ye, Xiang Li | Tsinghua University,Shenzhen MileBot Robotics,Shenzhen MileBot Robotics Co. Ltd. | Rehabilitation and Augmentation I | | Learning Personalised Human Sit-To-Stand Motion Strategies Via Inverse Musculoskeletal Optimal Control | Daniel F. N. Gordon, Andreas Christou, Theodoros Stouraitis, Michael Gienger, Sethu Vijayakumar | University of Edinburgh,Honda Research Institute EU and the University of Edinburgh,Honda Research Institute Europe | Rehabilitation and Augmentation I | | Robust Human Pose Estimation under Gaussian Noise | Patrick Schlosser, Christoph Ledermann | Karlsruhe Institute of Technology | Safety and Trustworthy Robotics I | | Enforcing Safety for Vision-Based Controllers Via Control Barrier Functions and Neural Radiance Fields | Mukun Tong, Charles Dawson, Chuchu Fan | Tsinghua University,MIT,Massachusetts Institute of Technology | Safety and Trustworthy Robotics I | | Mimicking Real Forces on a Drone through a Haptic Suit to Enable Cost-Effective Validation | Carl Hilderbrandt, Wen Ying, Seongkook Heo, Sebastian Elbaum | University of Virginia | Safety and Trustworthy Robotics I | | Generating Formal Safety Assurances for High-Dimensional Reachability | Albert Lin, Somil Bansal | Princeton University,University of Southern California | Safety and Trustworthy Robotics I | | Safety Evaluation of Robot Systems Via Uncertainty Quantification | Woo-Jeong Baek, Torsten Kroeger | Karlsruhe Institute of Technology (KIT),Karlsruher Institut für Technologie (KIT) | Safety and Trustworthy Robotics I | | Safety-Critical Controller Verification Via Sim2Real Gap Quantification | Prithvi Akella, Wyatt Ubellacker, Aaron Ames | California Institute of Technology,Caltech | Safety and Trustworthy 
Robotics I | | One-Shot Reachability Analysis of Neural Network Dynamical Systems | Shaoru Chen, Victor M. Preciado, Mahyar Fazlyab | Microsoft Research, NYC,University of Pennsylvania,Johns Hopkins University | Safety and Trustworthy Robotics I | | Parameter-Conditioned Reachable Sets for Updating Safety Assurances Online | Javier Borquez, Somil Bansal, Kensuke Nakamura | University of Southern California,Princeton University | Safety and Trustworthy Robotics I | | Hazard Analysis of Collaborative Automation Systems: A Two-Layer Approach Based on Supervisory Control and Simulation | Tom P. Huck, Yuvaraj Selvaraj, Constantin Cronrath, Christoph Ledermann, Martin Fabian, Bengt Lennartson, Torsten Kroeger | Karlsruhe Institute of Technology,Zenseact,Chalmers University of Technology,Department of Electrical Engineering,Karlsruher Institut für Technologie (KIT) | Safety and Trustworthy Robotics I | | SmartRainNet: Uncertainty Estimation for Laser Measurement in Rain | Chen Zhang, Zefan Huang, Beatrix Tung, Marcelo Ang, Daniela Rus | National University of Singapore,Singapore-MIT Alliance for Research and Technology,MIT | Award Finalists 4 | | Data-Driven Optimal Control under Safety Constraints Using Sparse Koopman Approximation | Hongzhe Yu, Joseph Moyalan, Umesh Vaidya, Yongxin Chen | Georgia Institute of Technology,Clemson University | Safety and Trustworthy Robotics I | | Predictive Runtime Verification of Skill-Based Robotic Systems Using Petri Nets | Baptiste Pelletier, Charles Lesire, Christophe Grand, David Doose, Mathieu Rognant | ONERA/DTIS, University of Toulouse,ONERA,Onera - The French Aerospace Lab | Safety and Trustworthy Robotics I | | CIOT: Constraint-Enhanced Inertial-Odometric Tracking for Articulated Dump Trucks in GNSS-Denied Mining Environments | David Benz, Jonathan Thomas Weseloh, Dirk Abel, Heike Vallery | RWTH Aachen University,TU Delft | Localization and Mapping V | | Wide-Area Geolocalization with a Limited Field of View Camera | Lena 
Downes, Ted Steiner, Rebecca Russell, Jonathan Patrick How | Massachusetts Institute of Technology,Draper | Localization and Mapping V | | Probabilistic Plane Extraction and Modeling for Active Visual-Inertial Mapping | Mitchell Usayiwevu, Fouad Sukkar, Teresa A. Vidal-Calleja | University of Technology Sydney | Localization and Mapping V | | Visual Language Maps for Robot Navigation | Chenguang Huang, Oier Mees, Andy Zeng, Wolfram Burgard | University of Freiburg,Google,University of Technology Nuremberg | Localization and Mapping V | | Asynchronous State Estimation of Simultaneous Ego-Motion Estimation and Multiple Object Tracking for LiDAR-Inertial Odometry | Yu-Kai Lin, Wen-chieh Lin, Chieh-Chih (Bob) Wang | National Yang Ming Chiao Tung University | Localization and Mapping V | | Pose-Graph SLAM Using Multi-Order Ultrasonic Echoes and Beamforming for Long-Range Inspection Robots | Othmane-Latif Ouabi, Neil Zeghidour, Nico F. Declercq, Matthieu Geist, Cedric Pradalier | UMI ,,,, GT-CNRS,Google Brain,Georgia Institute of Technology, Atlanta, Georgia ,,,,,–,,,,,Université de Lorraine,GeorgiaTech Lorraine | Localization and Mapping V | | EdgeVO: An Efficient, Accurate, and Robust Edge-Based Visual Odometry | Hui Zhao, Jianga Shang, Kai Liu, Chao Chen, Fuqiang Gu | College of Computer Science, China University of Geoscience,Chongqing University | Localization and Mapping V | | SCORE: A Second-Order Conic Initialization for Range-Aided SLAM | Alan Papalia, Joseph Morales, Kevin Doherty, David Rosen, John Leonard | MIT,Massachusetts Institute of Technology,Northeastern University | Localization and Mapping V | | A Real-Time Dynamic Obstacle Tracking and Mapping System for UAV Navigation and Collision Avoidance with an RGB-D Camera | Zhefan Xu, Xiaoyang Zhan, Baihan Chen, Yumeng Xiu, Chenhao Yang, Kenji Shimada | Carnegie Mellon University | Mapping and Localization | | Resilient Terrain Navigation with a 5 DOF Metal Detector Drone | Patrick Pfreundschuh, Rik Marian 
Kai Bähnemann, Tim Kazik, Thomas Mantel, Roland Siegwart, Olov Andersson | ETH Zurich,ETH Zürich | Mapping and Localization | | Efficient Visual-Inertial Navigation with Point-Plane Map | Jiaxin Hu, Kefei Ren, Xiaoyu Xu, Lipu Zhou, Xiaoming Lang, Yinian Mao, Guoquan Huang | Meituan,University of Electronic Science and Technology of China,MeiTuan,Meituan-Dianping Group,University of Delaware | Mapping and Localization | | CAROM Air - Vehicle Localization and Traffic Scene Reconstruction from Aerial Videos | Duo Lu, Eric Eaton, Matt Weg, Wei Wang, Steven Como, Jeffrey Wishart, Hongbin Yu, Yezhou Yang | Rider University,Arizona State University | Mapping and Localization | | Control of Rough Terrain Vehicles Using Deep Reinforcement Learning | Viktor Wiberg, Erik Wallin, Tomas Nordfjell, Martin Servin | Umeå University,Umeå University,Swedish University of Agricultural Sciences | SLAM & Navigation | | DynaVINS: A Visual-Inertial SLAM for Dynamic Environments | Seungwon Song, Hyungtae Lim, Alex Lee, Hyun Myung | KAIST,Korea Advanced Institute of Science and Technology,Hyundai Motor Company,KAIST (Korea Advanced Institute of Science and Technology) | SLAM & Navigation | | Visual-Inertial Odometry with Online Calibration of Velocity-Control Based Kinematic Motion Models | Haolong Li, Joerg Stueckler | Max Planck Institute for Intelligent Systems | SLAM & Navigation | | Learning Setup Policies: Reliable Transition between Locomotion Behaviours | Brendan Tidd, Jurgen Leitner, Akansel Cosgun, Nicolas Hudson | CSIRO,LYRO Robotics & Monash University,Monash University,X, The Moonshot Factory | SLAM & Navigation | | MMDF: Multi-Modal Deep Feature Based Place Recognition of Mobile Robots with Applications on Cross-Scene Navigation | Xiang Yu, Bo Zhou, Zeqing Chang, Kun Qian, Fang Fang | Southeast University,Southeast university | SLAM & Navigation | | Deep IMU Bias Inference for Robust Visual-Inertial Odometry with Factor Graphs | Russell Buchanan, Varun Agrawal, Marco 
Camurri, Frank Dellaert, Maurice Fallon | University of Edinburgh,Georgia Institute of Technology,Free University of Bozen-Bolzano,University of Oxford | SLAM & Navigation | | Hierarchical Motion Planning for Autonomous Vehicles in Unstructured Dynamic Environments | Yao Qi, Binbing He, Yang Tai, Rendong Wang, Le Wang, Youchun Xu | Army Military Transportation University,Institute of Military Transportation, Army Military Transportati,Tianjin Navigation Instruments Research Institute,Military Transportation University | SLAM & Navigation | | SOFT2: Stereo Visual Odometry for Road Vehicles Based on a Point-To-Epipolar-Line Metric | Igor Cvišić, Ivan Markovic, Ivan Petrovic | University of Zagreb, Faculty of Electrical Engineering and Comp,University of Zagreb Faculty of Electrical Engineering and Computing,University of Zagreb | SLAM & Navigation | | Winding Through: Crowd Navigation Via Topological Invariance | Christoforos Mavrogiannis, Krishnan Balasubramanian, Sriyash Poddar, Anush Gandra, Siddhartha Srinivasa | University of Michigan,University of Washington,Indian Institute of Technology Kharagpur | SLAM & Navigation | | Tactile-Based Task Description through Edge Contact Formation Setpoints for Object Exploration and Manipulation | Zhanat Kappassov, Juan Antonio Corrales Ramon, Véronique Perdereau | Nazarbayev University,Universidade de Santiago de Compostela,Sorbonne University | Force and Tactile Sensing and Haptics and Haptic Interfaces | | 3D Contact Point Cloud Reconstruction from Vision-Based Tactile Flow | Yipai Du, Guanlan Zhang, Michael Y. 
Wang | Hong Kong University of Science and Technology,The Hong Kong University of Science and Technology,Monash University | Force and Tactile Sensing and Haptics and Haptic Interfaces | | Visuo-Tactile Recognition of Partial Point Clouds Using PointNet and Curriculum Learning | Christopher Parsons, Alessandro Albini, Daniele De Martini, Perla Maiolino | University of Oxford | Force and Tactile Sensing and Haptics and Haptic Interfaces | | Bidirectional Sim-To-Real Transfer for GelSight Tactile Sensors with CycleGAN | Weihang Chen, Yuan Xu, Zhenyang Chen, Peiyu Zeng, Renjun Dang, Rui Chen, Jing Xu | Tsinghua University,Southern University of Science and Technology | Force and Tactile Sensing and Haptics and Haptic Interfaces | | Development of a Novel 2-Dimensional Neck Haptic Device for Gait Balance Training | Hosu Lee, Amre Eizad, Jiho Park, Yeongmi Kim, Sunwoo Hwang, Min-kyun Oh, Jungwon Yoon | Gwangju Institute of Science and Technology,GIST,MCI,Gyeongsang National University Hospital,Gwangju Institute of Science and Technology | Force and Tactile Sensing and Haptics and Haptic Interfaces | | Communicating Inferred Goals with Passive Augmented Reality and Active Haptic Feedback | James Mullen, Josh Mosier, Sounak Chakrabarti, Anqi Chen, Tyler White, Dylan Losey | University of Maryland,Virginia Tech,Virginia Polytechnic Institute and State University | Force and Tactile Sensing and Haptics and Haptic Interfaces | | Touching the Sound: Audible Features Enable Haptics for Robot Control | Hongshen Shi, Matteo Russo, Juan De La Torre, Abdelkhalick Mohammad, Xin Dong, Dragos Axinte | University of Nottingham,University of Rome Tor Vergata,University of Nottingham | Force and Tactile Sensing and Haptics and Haptic Interfaces | | Haptify: A Measurement-Based Benchmarking System for Grounded Force-Feedback Devices | Farimah Fazlollahi, Katherine J. 
Kuchenbecker | Max Planck Institute for Intelligent Systems | Force and Tactile Sensing and Haptics and Haptic Interfaces | | Biomimetic Force and Impedance Adaptation Based on Broad Learning System in Stable and Unstable Tasks | Zhenyu Lu, Ning Wang | Bristol Robotics Laboratory,University of the West of England | Bioinspiration and Biomimetics | | CPG-RL: Learning Central Pattern Generators for Quadruped Locomotion | Guillaume Bellegarda, Auke Ijspeert | EPFL | Bioinspiration and Biomimetics | | Research on Target Tracking for Robotic Fish Based on Low-Cost Scarce Sensing Information Fusion | Yong Zhong, Youdong Chen, Chengcai Wang, Qixing Wang, Jiawei Yang | South China University of Technology,Peking University | Bioinspiration and Biomimetics | | An Anthropomorphic Robotic Finger with Innate Human-Finger-Like Biomechanical Advantages Part I: Design, Ligamentous Joint and Extensor Mechanism | Yingmin Zhu, Guowu Wei, Lei Ren, Zirong Luo, Jianzhong Shang | School of Mechano-electronic Engineering,Xidian University,Salford University,University of Manchester,National University of Defense Technology | Bioinspiration and Biomimetics | | An Anthropomorphic Robotic Finger with Innate Human-Finger-Like Biomechanical Advantages Part II: Flexible Tendon Sheath and Grasping Demonstration | Yiming Zhu, Guowu Wei, Lei Ren, Zirong Luo, Jianzhong Shang | The University of Manchester,Salford University,University of Manchester,National University of Defense Technology | Bioinspiration and Biomimetics | | Sim-To-Real: Learning Energy-Efficient Slithering Gaits for a Snake-Like Robot | Zhenshan Bing, Long Cheng, Kai Huang, Alois Knoll | Technical University of Munich,Wenzhou University,Sun Yat-sen University,Tech. Univ. 
Muenchen TUM | Bioinspiration and Biomimetics | | S2worm: A Fast-Moving Untethered Insect-Scale Robot with 2-DoF Transmission Mechanism | Yide Liu, Yanhong Chen, Bo Feng, Dongqi Wang, Taishan Liu, Haofei Zhou, Hua Li, Shaoxing Qu, Wei Yang | zhejiang university,Zhejiang University | Bioinspiration and Biomimetics | | Towards a Discrete Snake-Like Robot Based on SMA-Actuated Tristable Modules for Follow the Leader Control Strategy | Beniamin Calmé, Lennart Rubbert, Yassine Haddab | LIRMM, Univ Montpellier, CNRS,INSA - Strasbourg,University of Montpellier | Bioinspiration and Biomimetics | | Three-Dimensional Modeling and Kinematic Analysis of Human Elbow Joint Axis Based on Anatomy and Screw Theory | Yongsheng Gao, Guodong Lang, Wenpeng Shen, Jie Zhao | Harbin Institute of Technology | Bioinspiration and Biomimetics | | High-Performance Six-DOF Flight Control of the Bee++: An Inclined-Stroke-Plane Approach | Ryan Bena, Xiufeng Yang, Ariel Calderon, Nestor O Perez-arancibia | University of Southern California,Washington State University (WSU) | Bioinspiration and Biomimetics | | Autonomous Dozer Sand Grading under Localization Uncertainties | Yakov Miron, Yuval Goldfracht, Chana Ross, Dotan Di Castro, Itzik Klein | Bosch,BCAI,University of Haifa | Sensing and Control | | Self-Triggered Coverage Control for Mobile Sensors | Erick J. Rodriguez-Seda, Xiaotian Xu, Josep M. Olm, Arnau Doria-cerezo, Yancy Diaz-Mercado | United States Naval Academy,University of Maryland, College Park,Universitat Politecnica de Catalunya,Polytechnic University of Catalonia,University of Maryland | Sensing and Control | | Constrained Gaussian Processes with Integrated Kernels for Long-Horizon Prediction of Dense Pedestrian Crowd Flows | Stefan H. Kiss, Kavindie Katuwandeniya, Alen Alempijevic, Teresa A. 
Vidal-Calleja | University of Technology Sydney | Sensing and Control | | Large-Workspace Polyarticulated Micro-Structures Based-On Folded Silica for Tethered Nanorobotics | Yuning Lei, Cédric Clévy, Jean-yves Rauch, Philippe Lutz | Carl von Ossietzky Universität Oldenburg,Franche-Comté University,FEMTO-ST institute,FEMTO-ST - UMR CNRS ,,,, - UFC/ENSMM/UTBM | Sensing and Control | | Direction and Trajectory Tracking Control for Nonholonomic Spherical Robot by Combining Sliding Mode Controller and Model Prediction Controller | Yifan Liu, Yixu Wang, Xiaoqing Guan, Tao Hu, Ziang Zhang, Song Jin, You Wang, Jie Hao, Guang Li | Zhejiang University,Luoteng Hangzhou Technology Co.,Ltd. | Sensing and Control | | Advanced Manufacturing Configuration by Sample-Efficient Batch Bayesian Optimization | Xavier Guidetti, Alisa Rupenyan, Lutz Fassl, Majid Nabavi, John Lygeros | ETH Zürich,Equipment Digitalization Team, Oerlikon Metco,ETH Zurich | Sensing and Control | | Automatically Deployable Robust Control of Modular Reconfigurable Robot Manipulators | Carlo Nainer, Andrea Giusti | Fraunhofer Italia Research | Sensing and Control | | Velocity Following Control of a Pseudo-Driven Wheel for Reducing Internal Forces between Wheels | Huanan Qi, Liang Ding, Bo You, Lan Huang, Xin An, Shu Li, Guangjun Liu | Harbin Institute of Technology,Harbin University of Science and Technology,Tsinghua University,Ryerson University | Sensing and Control | | Adaptive Tracking Control with Uncertainty-Aware and State-Dependent Feedback Action Blending for Robot Manipulators | Xuwei Wu, Annika Kirner, Gianluca Garofalo, Christian Ott, Paul Kotyczka, Alexander Dietrich | German Aerospace Center (DLR),TU Wien,ABB AB,Technische Universität München | Sensing and Control | | Kinetostatic Modeling of Tendon-Driven Parallel Continuum Robots | Sven Lilge, Jessica Burgner-kahrs | University of Toronto | Kinematics, Dynamics, and Motion Control | | Globally Optimal Solution to Inverse Kinematics of 7DOF Serial 
Manipulator | Pavel Trutman, Mohab Safey El Din, Didier Henrion, Tomas Pajdla | Czech Technical University in Prague,Sorbonne Univ.,University of Toulouse | Kinematics, Dynamics, and Motion Control | | Kinematic Redundancy Analysis for (2n+1)R Circular Manipulators | Zijia Li, Mathias Brandstötter, Michael Hofbaur | Chinese Academy of Sciences,JOANNEUM RESEARCH Forschungsgesellschaft mbH - ROBOTICS,JOANNEUM RESEARCH Forschungsgesellschaft mbH | Kinematics, Dynamics, and Motion Control | | Adaptive Constrained Kinematic Control Using Partial or Complete Task-Space Measurements | Murilo Marinho, Bruno Vilhena Adorno | The University of Tokyo,The University of Manchester | Kinematics, Dynamics, and Motion Control | | Connecting Gaits in Energetically Conservative Legged Systems | Maximilian Raff, Nelson Rosa, C. David Remy | University of Stuttgart | Kinematics, Dynamics, and Motion Control | | Reduced Euler-Lagrange Equations of Floating-Base Robots: Computation, Properties & Applications | Hrishik Mishra, Gianluca Garofalo, Alessandro Massimo Giordano, Marco De Stefano, Christian Ott, Andreas Kugi | German Aerospace Center (DLR),ABB AB,DLR (German Aerospace Center),TU Wien | Kinematics, Dynamics, and Motion Control | | Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application | Diego Romeres, Fabio Amadio, Alberto Dalla Libera, Riccardo Antonello, Daniel Nikovski, Ruggero Carli | Mitsubishi Electric research laboratories,Leonardo Labs - IIT,University of Padova,MERL | Kinematics, Dynamics, and Motion Control | | Hybrid Learning of Time-Series Inverse Dynamics Models for Locally Isotropic Robot Motion | Tolga-Can Çallar, Sven Böttger | Universität zu Lübeck,University of Luebeck | Kinematics, Dynamics, and Motion Control | | A Joint Acceleration Estimation Method Based on a High-Order Disturbance Observer | Jiexin Zhang, Pingyun Nie, Yuhang Chen, Bo Zhang | Shanghaijiaotong university,Shanghai Jiao Tong University | Kinematics, 
Dynamics, and Motion Control | | A Sampling-Based Motion Assignment Strategy with Multi-Performance Optimization for Macro-Micro Robotic System | Yaohua Zhou, Chin-Yin Chen, Guilin Yang, Yaonan Li | Ningbo Institute of Materials Technology and Engineering,Ningbo Institute of Material Technology and Engineering, CAS,Ningbo Institute of Material Technology and Engineering, Chines,Shenzhen Academy of Robotics | Kinematics, Dynamics, and Motion Control | | Offline Programming Guidance for Swarm Steering of Micro/Nano Magnetic Particles in a Dynamic Multichannel Vascular Model | Myungjin Park, Le Tuan-anh, Jungwon Yoon | Gwangju institute of science and technology,Gwangju Institute of Science and Technology,Gwangju Institute of Science and Technology | Swarms and Multi Agent Systems | | Mean Field Behaviour of Collaborative Multi-Agent Foragers | Daniel Jarne Ornia, Pedro J. Zufiria, Manuel Mazo Jr. | Delft University of Technology,Universidad Politecnica de Madrid | Swarms and Multi Agent Systems | | Closed-Loop Motion Control of Robotic Swarms – a Tether-Based Strategy | Kasra Eshaghi, Andrew Rogers, Goldie Nejat, Beno Benhabib | University of Toronto | Swarms and Multi Agent Systems | | Controlling Collision-Induced Aggregations in a Swarm of Micro Bristle-Robots | Zhijian Hao, Siddharth Mayya, Gennaro Notomista, Seth Hutchinson, Magnus Egerstedt, Azadeh Ansari | Georgia Institute of Technology,Amazon Robotics,University of Waterloo,University of California, Irvine | Swarms and Multi Agent Systems | | Multi-Robot Pickup and Delivery Via Distributed Resource Allocation | Andrea Camisa, Andrea Testa, Giuseppe Notarstefano | University of Bologna | Swarms and Multi Agent Systems | | Deep Reinforcement Learning for Decentralized Multi-Robot Exploration with Macro Actions | Aaron Tan, Federico Pizarro Bejarano, Yuhan Zhu, Richard Ren, Goldie Nejat | University of Toronto | Swarms and Multi Agent Systems | | Time-Inverted Kuramoto Model Meets Lissajous Curves: Multi-Robot 
Persistent Monitoring and Target Detection | Manuel Boldrer, Lorenzo Lyons, Luigi Palopoli, Daniele Fontanelli, Laura Ferranti | Delft University of Technology,University of Trento | Swarms and Multi Agent Systems | | A Decentralized Multi-Robot Spatio-Temporal Multi-Task Assignment Approach for Perimeter Defense | Shridhar Velhal, Suresh Sundaram, Sundararajan Narasimman | Indian Institute of Science,Nanyang Technological University | Swarms and Multi Agent Systems | | Reinforcement Learned Distributed Multi-Robot Navigation with Reciprocal Velocity Obstacle Shaped Rewards | Ruihua Han, Shengduo Chen, Shuaijun Wang, Zeqing Zhang, Rui Gao, Qi Hao, Jia Pan | University of Hong Kong,Southern University of Science and Technology,The University of Hong Kong,SOUTHERN UNIVERSITY OF SCIENCE AND TECHNOLOGY | Swarms and Multi Agent Systems | | Chance-Constrained Iterative Linear-Quadratic Stochastic Games | Hai Zhong, Yutaka Shimizu, Jianyu Chen | Tsinghua University,TIER IV | Swarms and Multi Agent Systems | | The SLAM Hive Benchmarking Suite | Yuanyuan Yang, Bowen Xu, Yinjie Li, Soeren Schwertfeger | ShanghaiTech University | Software Tools II | | Discovering Multiple Algorithm Configurations | Leonid Keselman, Martial Hebert | Carnegie Mellon University,CMU | Software Tools II | | Aquarium: A Fully Differentiable Fluid-Structure Interaction Solver for Robotics Applications | Jeong Hun Lee, Mike Yan Michelis, Robert Kevin Katzschmann, Zachary Manchester | Carnegie Mellon University,ETH Zurich | Software Tools II | | Robust Co-Design of Robots Via Cascaded Optimisation | Akhil Sathuluri, Anand Vazhapilli Sureshbabu, Markus Zimmermann | Technical University of Munich,Technische Universität München | Software Tools II | | Autotuning Symbolic Optimization Fabrics for Trajectory Generation | Max Spahn, Javier Alonso-Mora | TU Delft,Delft University of Technology | Software Tools II | | Auto-Assembly: A Framework for Automated Robotic Assembly Directly from CAD | Fedor 
Chervinskii, Sergei Zobov, Aleksandr Rybnikov, Danil Petrov, Komal Sai Reddy Vendidandi | Arrival,Micropsi Industries Gmbh,ARRIVAL | Software Tools II | | General, Single-Shot, Target-Less, and Automatic LiDAR-Camera Extrinsic Calibration Toolbox | Kenji Koide, Shuji Oishi, Masashi Yokozuka, Atsuhiko Banno | National Institute of Advanced Industrial Science and Technology,National Institute of Advanced Industrial Science and Technology (AIST),Nat. Inst. of Advanced Industrial Science and Technology,National Institute of Advanced Industrial Science and Technology | Software Tools II | | GaPT: Gaussian Process Toolkit for Online Regression with Application to Learning Quadrotor Dynamics | Francesco Crocetti, Jeffrey Mao, Alessandro Saviolo, Gabriele Costante, Giuseppe Loianno | University of Perugia,New York University | Software Tools II | | Transferring Implicit Knowledge of Non-Visual Object Properties across Heterogeneous Robot Morphologies | Gyan Tatiya, Jonathan Francis, Jivko Sinapov | Tufts University,Bosch Center for Artificial Intelligence | Data Sets II | | Wild-Places: A Large-Scale Dataset for Lidar Place Recognition in Unstructured Natural Environments | Joshua Barton Knights, Kavisha Vidanapathirana, Milad Ramezani, Sridha Sridharan, Clinton Fookes, Peyman Moghadam | Queensland University of Technology,CSIRO | Data Sets II | | On Human Grasping and Manipulation in Kitchens: Automated Annotation, Insights, and Metrics for Effective Data Collection | Sivashanmuganathan Elangovan, Ricardo De Godoy, Felipe Sanches, Ke Wang, Tom White, Patrick Jarvis, Minas Liarokapis | University of Auckland,The University of Auckland,AI Data Innovations,Acumino | Data Sets II | | Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based Reinforcement Learning | David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jacob Varley | New York University,Google,University of Pennsylvania | Data Sets II | | COLA: 
COarse LAbel Pre-Training for 3D Semantic Segmentation of Sparse LiDAR Datasets | Jules Sanchez, François Goulette, Jean-Emmanuel Deschaud | Mines Paris - PSL University,MINES ParisTech | Data Sets II | | Enhancing the Efficacy of Lower-Body Assistive Devices through the Understanding of Human Movement in the Real World | Loubna Baroudi, Stephen Cain, Alex Shorter, Kira Barton | University of Michigan,University of Michigan at Ann Arbor | Data Sets II | | DexGraspNet: A Large-Scale Robotic Dexterous Grasp Dataset for General Objects Based on Simulation | Ruicheng Wang, Jialiang Zhang, Jiayi Chen, Yinzhen Xu, Puhao Li, Tengyu Liu, He Wang | Peking University,Tsinghua University,Beijing Institute for General Artificial Intelligence | Award Finalists 1 | | ATTACH Dataset: Annotated Two-Handed Assembly Actions for Human Action Understanding | Dustin Aganian, Benedict Stephan, Markus Eisenbach, Corinna Stretz, Horst-Michael Gross | Ilmenau University of Technology,University of Technology Ilmenau | Data Sets II | | Synthetic-To-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances | Arun Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil Katyal, Dinesh Manocha, Celso De Melo, Rama Chellappa | Johns Hopkins University,Johns Hopkins University Applied Physics Lab,Georgia Tech,University of Maryland,CCDC US Army Research Laboratory | Data Sets II | | Robotic Method and Instrument to Efficiently Synthesize Faulty Conditions and Mass-Produce Faulty-Conditioned Data for Rotary Machines | Yip Fun Yeung, Fangzhou Xia, Juliana Covarrubias, Furokawa Mikio, Hirano Takayuki, Kamal Youcef-Toumi | MIT,Massachusetts Institute of Technology,Japan Steel Works | Data Sets II | | FLYOVER: A Model-Driven Method to Generate Diverse Highway Interchanges for Autonomous Vehicle Testing | Yuan Zhou, Gengjie Lin, Yun Tang, Kairui Yang, Wei Jing, Ping Zhang, Junbo Chen, Liang Gong, Yang Liu | Nanyang Technological University,Shanghai Jiao Tong 
University,NANYANG TECHNOLOGICAL UNIVERSITY,Damo Academy, Alibaba Group,Alibaba,Alibaba Group,Shanghai Jiao Tong University | Data Sets II | | Towards Multi-Day Field Deployment Autonomy: A Long-Term Self-Sustainable Micro Aerial Vehicle Robot | Stephen Carlson, Prateek Arora, Tolga Karakurt, Brandon Moore, Christos Papachristos | University of Nevada, Reno,University of Nevada Reno | Environmental Applications | | Stable Station Keeping of Autonomous Sailing Robots Via the Switched Systems Approach for Ocean Observation | Weimin Qi, Qinbo Sun, Yu Cao, Huihuan Qian | The Chinese University of Hong Kong, Shenzhen,The Chinese University of Hong Kong, Shenzhen,Huawei Technology,The Chinese University of Hong Kong, Shenzhen | Environmental Applications | | CUREE: A Curious Underwater Robot for Ecosystem Exploration | Yogesh Girdhar, Nathan Mcguire, Levi Cai, Stewart Jamieson, Seth Mccammon, John E. San Soucie, Jessica Eve Todd, Brian Claus, T. Aran Mooney | Woods Hole Oceanographic Institution,Northeastern University,Massachusetts Institute of Technology,Woods Hole Oceanographic Institution,MIT | Environmental Applications | | Multi-Robot 3D Gas Distribution Mapping: Coordination, Information Sharing and Environmental Knowledge | Chiara Ercolani, Shashank Mahendra Deshmukh, Thomas Laurent Peeters, Alcherio Martinoli | EPFL | Environmental Applications | | L2E: Lasers to Events for 6-DoF Extrinsic Calibration of Lidars and Event Cameras | Kevin Ta, David Bruggemann, Tim Broedermann, Sakaridis Christos, Luc Van Gool | Waabi,ETH Zurich | Calibration and Identification | | Experimental Evaluation of a Method for Improving Experiment Design in Robot Identification | Stefanie Zimmermann, Martin Enqvist, Svante Gunnarsson, Stig Moberg, Mikael Norrlöf | Linköping University,ABB AB | Calibration and Identification | | DEdgeNet: Extrinsic Calibration of Camera and LiDAR with Depth-Discontinuous Edges | Yiyang Hu, Hui Ma, Leiping Jie, Hui Zhang | Beijing Normal University - Hong 
Kong Baptist University United ,Hong Kong Baptist University,United International College, BNU-HKBU | Calibration and Identification | | Joint Camera Intrinsic and LiDAR-Camera Extrinsic Calibration | Guohang Yan, Feiyu He, Chunlei Shi, Pengjin Wei, Xinyu Cai, Yikang Li | Shanghai AI Laboratory,Shanghai AI Lab,Southeast University,SHANGHAI JIAO TONG UNIVERSITY,Sensetime Ltd. | Calibration and Identification | | Online Hand-Eye Calibration with Decoupling by 3D Textureless Object Tracking | Li Jin, Kang Xie, Wenxuan Chen, Xin Cao, Yuehua Li, Jiachen Li, Jiankai Qian, Xueying Qin | Shandong university,Shandong University,Zhejiang Lab,Zhejiang University,ShanDong University | Calibration and Identification | | Using the Deflection Center to Auto-Calibrate the Pan-Tilt-Zoom Camera Linearly | Liu Yu, Hui Zhang | United International College, BNU-HKBU | Calibration and Identification | | Coordinate Calibration of a Dual-Arm Robot System by Visual Tool Tracking | Junlei Hu, Dominic Jones, Pietro Valdastri | University of Leeds | Calibration and Identification | | A Graph-Based Optimization Framework for Hand-Eye Calibration for Multi-Camera Setups | Daniele Evangelista, Emilio Olivastri, Davide Allegro, Emanuele Menegatti, Alberto Pretto | Università degli studi di Padova,University of Padua,University of Padova,The University of Padua | Calibration and Identification | | Fast Extrinsic Calibration for Multiple Inertial Measurement Units in Visual-Inertial System | Youwei Yu, Yanqing Liu, Fengjie Fu, Sihan He, Dongchen Zhu, Lei Wang, Xiaolin Zhang, Jiamao Li | Shanghai Institute of Microsystem and Information Technology,Shanghai Institute of Microsystem and Information Technology, Ch,Shanghai Institute of Microsystem Information and technology, Ch,Shanghai Institute of Microsystem and Information Technology,Chi,Shanghai Institute of Microsystem And Information Technology,Chi | Calibration and Identification | | Completely Rational SO(n) Orthonormalization | Wu 
Jin, Soheil Sarabandi, Jianhao Jiao, Huaiyang Huang, Bohuan Xue, Ruoyu Geng, Lujia Wang, Ming Liu | UESTC,IRI (CSIC-UPC),The Hong Kong University of Science and Technology,the Hong Kong University of Science and Technology,HKUST,HONG KONG UNIVERSITY OF SCIENCE AND TECHNOLOGY,The Hong Kong University of Technology,Hong Kong University of Science and Technology | Calibration and Identification | | An Active Learning Based Robot Kinematic Calibration Framework Using Gaussian Processes | Ersin Das, Joel Burdick | Caltech,California Institute of Technology | Calibration and Identification | | Identification of a Generalized Base Inertial Parameter Set of Robotic Manipulators Considering Mounting Configurations | Mario Troebinger, Abdeldjallil Naceri, Xiao Chen, Hamid Sadeghian, Sami Haddadin | Technical University of Munich | Calibration and Identification | | Open-Vocabulary, Queryable Scene Representations for Real World Planning | Boyuan Chen, Fei Xia, Brian Ichter, Kanishka Rao, Keerthana Gopalakrishnan, Michael S Ryoo, Austin Stone, Daniel Kappler | Massachusetts Institute of Technology,Google Inc,Google Brain,Google,Google, Stony Brook University,X (Google) | AI-Enabled Robotics | | ProgPrompt: Generating Situated Robot Task Plans Using Large Language Models | Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg | University of Southern California,NVIDIA,Stanford University,Nvidia,University of Washington,USC Viterbi School of Engineering,University of Toronto | AI-Enabled Robotics | | Guiding Reinforcement Learning with Shared Control Templates | Abhishek Padalkar, Gabriel Quere, Franz Steinmetz, Antonin Raffin, Matthias Nieuwenhuisen, João Silvério, Freek Stulp | German Aerospace Center, Institute of Robotics and Mechatronics,,DLR,German Aerospace Center (DLR),Fraunhofer Institute for Communication, Information Processing a,German Aerospace Center,DLR - Deutsches Zentrum für Luft- und 
Raumfahrt e.V. | AI-Enabled Robotics | | Anticipatory Planning: Improving Long-Lived Planning by Estimating Expected Cost of Future Tasks | Roshan Dhakal, Gregory Stein, Md Ridwan Hossain Talukder | George Mason University | AI-Enabled Robotics | | Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement | Zirui Zhao, Wee Sun Lee, David Hsu | National University of Singapore | AI-Enabled Robotics | | Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification | Jiayi Pan, Glen Chou, Dmitry Berenson | University of Michigan | AI-Enabled Robotics | | Improving the Generalizability of Trajectory Prediction Models with Frenét-Based Domain Normalization | Luyao Ye, Zikang Zhou, Jianping Wang | City University of Hong Kong | AI-Enabled Robotics | | An Open Approach to Energy-Efficient Autonomous Mobile Robots | Liangkai Liu, Ren Zhong, Aaron Willcock, Nathan Fisher, Weisong Shi | Wayne State University,wayne state university,University of Delaware | AI-Enabled Robotics | | Grounding Language with Visual Affordances Over Unstructured Data | Oier Mees, Jessica Borja Diaz, Wolfram Burgard | University of Freiburg,University of Technology Nuremberg | Award Finalists 2 | | Gaka-Chu: A Self-Employed Autonomous Robot Artist | Eduardo Castello, Ivan Berman, Aleksandr Kapitonov, Vadim Manaenko, Makar Cherniaev, Pavel Tarasov | "Indra Digital Labs,M,M Economy, MerkleBot Inc.,M,M Economy, Merklebot Inc.,M,M Economy, Inc. (""Merklebot""), San Francisco, CA, USA" | AI-Enabled Robotics | | LEARNEST: LEARNing Enhanced Model-Based State ESTimation for Robots Using Knowledge-Based Neural Ordinary Differential Equations | Kong Yao Chee, M. 
Ani Hsieh | University of Pennsylvania | AI-Enabled Robotics | | A Joint Modeling of Vision-Language-Action for Target-Oriented Grasping in Clutter | Kechun Xu, Shuqi Zhao, Zhongxiang Zhou, Zizhang Li, Huaijin Pi, Yifeng Zhu, Yue Wang, Rong Xiong | Zhejiang University,The University of Texas at Austin | AI-Enabled Robotics | | A Virtual Reality Framework for Fast Dataset Creation Applied to Cloth Manipulation with Automatic Semantic Labelling | Julia Borras Sol, Arnau Boix-granell, Sergi Foix, Carme Torras | Institut de Robòtica i Informàtica Industrial (CSIC-UPC),CSIC-UPC,CSIC - UPC | Virtual Reality and Interfaces | | Skill-Based Robot Programming in Mixed Reality with Ad-Hoc Validation Using a Force-Enabled Digital Twin | Jan Krieglstein, Gesche Held, Balázs András Bálint, Frank Naegele, Werner Kraus | Fraunhofer IPA | Virtual Reality and Interfaces | | A Virtual Reality Planning Environment for High-Risk, High-Latency Teleoperation | Will Pryor, Liam Wang, Arko Chatterjee, Balazs Vagvolgyi, Anton Deguet, Simon Leonard, Louis Whitcomb, Peter Kazanzides | Johns Hopkins University,The Johns Hopkins University | Virtual Reality and Interfaces | | Avatarm: An Avatar with Manipulation Capabilities for the Physical Metaverse | Alberto Villani, Giovanni Cortigiani, Bernardo Brogi, Nicole D'Aurizio, Tommaso Lisini Baldi, Domenico Prattichizzo | University of Siena,University of Siena, Istituto Italiano di Tecnologia | Virtual Reality and Interfaces | | Interacting with Multi-Robot Systems Via Mixed Reality | Florian Kennel-Maushart, Roi Poranne, Stelian Coros | ETHZ,ETH Zurich | Virtual Reality and Interfaces | | PointCloudLab: An Environment for 3D Point Cloud Annotation with Adapted Visual Aids and Levels of Immersion | Achref Doula, Tobias Güdelhöfer, Andrii Matviienko, Max Mühlhäuser, Alejandro Sanchez Guinea | Technical University of Darmstadt,Technische Universität Darmstadt,TU Darmstadt | Virtual Reality and Interfaces | | Augmented Reality-Assisted Robot 
Learning Framework for Minimally Invasive Surgery Task | Junling Fu, Maria Chiara Palumbo, Elisa Iovene, Qingsheng Liu, Ilaria Burzo, Alberto Redaelli, Giancarlo Ferrigno, Elena De Momi | Politecnico di Milano,Ocean University of China | Virtual Reality and Interfaces | | Intuitive Robot Integration Via Virtual Reality Workspaces | Minh Tram, Joseph Cloud, William Beksi | University of Texas at Arlington,University of Texas at Arlington, NASA Kennedy Space Center | Virtual Reality and Interfaces | | Reconstructing Objects In-The-Wild for Realistic Sensor Simulation | Ze Yang, Siva Manivasagam, Yun Chen, Jingkang Wang, Rui Hu, Raquel Urtasun | University of Toronto,UBER ATG R&D,Uber | Simulation and Sim2Real | | Real-Time Event Simulation with Frame-Based Cameras | Andreas Ziegler, Daniel Teigland, Jonas Tebbe, Thomas Gossard, Andreas Zell | University of Tübingen | Simulation and Sim2Real | | PCGen: Point Cloud Generator for LiDAR Simulation | Chenqi Li, Yuan Ren, Bingbing Liu | University of Toronto,Noah's Ark Lab, Huawei Technologies Canada Inc,Huawei Technologies | Simulation and Sim2Real | | Differentiable Dynamics Simulation Using Invariant Contact Mapping and Damped Contact Force | Minji Lee, Jeongmin Lee, Dongjun Lee | Seoul National University | Simulation and Sim2Real | | M-EMBER: Tackling Long-Horizon Mobile Manipulation Via Factorized Domain Transfer | Bohan Wu, Roberto Martín-martín, Fei-Fei Li | Stanford University,University of Texas at Austin | Award Finalists 1 | | Sim2Real^2: Actively Building Explicit Physics Model for Precise Articulated Object Manipulation | Liqian Ma, Jiaojiao Meng, Shuntao Liu, Weihang Chen, Jing Xu, Rui Chen | Tsinghua University,Beijing University of posts and Telecommunications,AVIC Chengdu Aircraft Industrial (Group) Co. 
| Simulation and Sim2Real | | A Generic Power Wheelchair Lumped Model in the Sagittal Plane: Towards Realistic Self-Motion Perception in a Virtual Reality Simulator | Fabien Grzeskowiak, Ronan Le Breton, Louise Devigne, François Pasteau, Marie Babel, Sylvain Guegan | INRIA - Rennes,UNIV-RENNES - INSA Rennes,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes - Rehabilitation Cente,INSA Rennes / IRISA Rainbow Team,IRISA UMR CNRS ,,,, - INRIA - INSA Rennes,INSA Rennes | Simulation and Sim2Real | | FRIDA: A Collaborative Robot Painter with a Differentiable, Real2Sim2Real Planning Environment | Peter Schaldenbrand, James Mccann, Jean Oh | Carnegie Mellon University | Award Finalists 4 | | SAMLoc: Structure-Aware Constraints with Multi-Task Distillation for Long-Term Visual Localization | Jian Ning, Yunzhou Zhang, Xinge Zhao, Sonya Coleman, Kunmo Li, Dermot Kerr | Northeastern University,University of Ulster | Localization and Learning | | Energy-Based Models for Cross-Modal Localization Using Convolutional Transformers | Alan Wu, Michael S Ryoo | Indiana University Bloomington, MIT Lincoln Laboratory,Google, Stony Brook University | Localization and Learning | | Boosting 3D Point Cloud Registration by Transferring Multi-Modality Knowledge | Mingzhi Yuan, Xiaoshui Huang, Kexue Fu, Zhihao Li, Manning Wang | Fudan University,Shanghai AI Laboratory,Fudan university | Localization and Learning | | Local_INN: Implicit Map Representation and Localization with Invertible Neural Networks | Zirui Zang, Hongrui Zheng, Johannes Betz, Rahul Mangharam | University of Pennsylvania,Technical University of Munich | Localization and Learning | | Combining Scene Coordinate Regression and Absolute Pose Regression for Visual Relocalization | Jiahao Ruan, Li He, Yisheng Guan, Hong Zhang | Guangdong University of Technology,Southern University of Science and Technology,SUSTech | Localization and Learning | | A Consistency-Based Loss for Deep Odometry through Uncertainty Propagation | Hamed Damirchi, 
Rooholla Khorrambakht, Hamid D. Taghirad, Behzad Moshiri | K. N. Toosi University of Technology,New York University,K.N.Toosi University of Technology,University of Tehran | Localization and Learning | | Slice Transformer and Self-Supervised Learning for 6DoF Localization in 3D Point Cloud Maps | Muhammad Ibrahim, Naveed Akhtar, Saeed Anwar, Michael Wise, Ajmal Mian | University of Western Australia,KFUPM | Localization and Learning | | AANet: Aggregation and Alignment Network with Semi-Hard Positive Sample Mining for Hierarchical Place Recognition | Feng Lu, Lijun Zhang, Shuting Dong, Baifan Chen, Chun Yuan | Tsinghua University,Chongqing Institute of Green and Intelligent Technology, CAS; Un,Central South University | Localization and Learning | | Can Machines Garden? Systematically Comparing the AlphaGarden vs. Professional Horticulturalists | Simeon Oluwafunmilore Adebola, Rishi Parikh, Mark Presten, Satvik Sharma, Shrey Aeron, Ananth Rao, Sandeep Mukherjee, Tomson Qu, Tina Wistrom, Eugen Solowjow, Ken Goldberg | University of California, Berkeley,University of California Berkeley,University of California, Berkeley, Rausser College of Natural R,Siemens Corporation,UC Berkeley | Award Finalists 2 | | On Domain-Specific Pre-Training for Effective Semantic Perception in Agricultural Robotics | Gianmarco Roggiolani, Federico Magistri, Tiziano Guadagnino, Jan Weyler, Giorgio Grisetti, Cyrill Stachniss, Jens Behley | University of Bonn,Sapienza University of Rome | Agricultural Robotics and Automation II | | Semantic Keypoint Extraction for Scanned Animals Using Multi-Depth-Camera Systems | Raphael Falque, Teresa A. 
Vidal-Calleja, Alen Alempijevic | University of Technology Sydney | Agricultural Robotics and Automation II | | Grasp Planning with CNN for Log-Loading Forestry Machine | Elie Ayoub, Patrick Levesque, Inna Sharf | McGill University,FPInnovations | Agricultural Robotics and Automation II | | A Hybrid Cable-Driven Robot for Non-Destructive Leafy Plant Monitoring and Mass Estimation Using Structure from Motion | Gerry Chen, Venkata Harsh Suhith Muriki, Andrew Sharkey, Cedric Pradalier, Yongsheng Chen, Frank Dellaert | Georgia Institute of Technology,GeorgiaTech Lorraine | Agricultural Robotics and Automation II | | Optimal Multi-Robot Coverage Path Planning for Agricultural Fields Using Motion Dynamics | Jahid Chowdhury Choton, Pavithra Prabhakar | Kansas State University | Agricultural Robotics and Automation II | | CropNav: A Framework for Autonomous Navigation in Real Farms | Mateus Valverde Gasparino, Vitor Akihiro Hisano Higuti, Arun Narenthiran Sivakumar, Andres Eduardo Baquero Velasquez, Marcelo Becker, Girish Chowdhary | University of Illinois at Urbana-Champaign,EarthSense Inc.,University of Illinois at Urbana Champaign,Earthsense,USP | Agricultural Robotics and Automation II | | Tendon-Driven Soft Robotic Gripper with Integrated Ripeness Sensing for Blackberry Harvesting | Alex Qiu, Claire Young, Anthony Gunderman, Milad Azizkhani, Yue Chen, Ai-Ping Hu | Georgia Institute of Technology,Georgia Institute of Technology,Georgia Tech Research Institute | Agricultural Robotics and Automation II | | Motion Planning for a Climbing Robot with Stochastic Grasps | Stephanie Newdick, Nitin Ongole, Tony G. Chen, Edward Schmerling, Mark Cutkosky, Marco Pavone | Stanford University | Space Robotics | | RAMP: Reaction-Aware Motion Planning of Multi-Legged Robots for Locomotion in Microgravity | Warley F. R. 
Ribeiro, Kentaro Uno, Masazumi Imai, Koki Murase, Kazuya Yoshida | Tohoku University | Award Finalists 3 | | Risk-Aware Path Planning Via Probabilistic Fusion of Traversability Prediction for Planetary Rovers on Heterogeneous Terrains | Masafumi Endo, Tatsunori Taniai, Ryo Yonetani, Genya Ishigami | Keio University,OMRON SINIC X Corporation,OMRON SINIC X | Space Robotics | | A Gravity Compensation Strategy for On-Ground Validation of Orbital Manipulators | Marco De Stefano, Ria Vijayan, Andreas Stemmer, Ferdinand Elhardt, Christian Ott | German Aerospace Center (DLR),DLR - German Aerospace Center,Deutsches Zentrum für Luft- und Raumfahrt e. V. (DLR),TU Wien | Space Robotics | | Towards Bridging the Space Domain Gap for Satellite Pose Estimation Using Event Sensing | Mohsi Jawaid, Ethan Elms, Yasir Latif, Tat-Jun Chin | The University of Adelaide,University of Adelaide | Space Robotics | | Hardware-In-The-Loop Simulator with Low-Thrust Actuator for Free-Flying Robot's Omni-Directional Control | Daichi Hirano, Shinji Mitani, Taisei Nishishita, Tatsuhiko Saito | Japan Aerospace Exploration Agency,JAXA,Systems Engineering Consultants Co.,LTD. 
| Space Robotics | | Loitering and Trajectory Tracking of Suspended Payloads in Cable-Driven Balloons Using UGVs | Julius Wanner, Eric Sihite, Alireza Ramezani, Gharib Morteza | ETH Zurich / California Institute of Technology,California Institute of Technology,Northeastern University,CALTECH | Space Robotics | | Design and Validation of a Multi-Arm Relocatable Manipulator for Space Applications | Enrico Mingo Hoffman, Arturo Laurenzi, Francesco Ruscelli, Luca Rossini, Lorenzo Baccelliere, Davide Antonucci, Alessio Margan, Paolo Guria, Marco Migliorini, Stefano Cordasco, Gennaro Raiola, Luca Muratore, Joaquín Estremera, Andrea Rusconi, Guido Sangiovanni, Nikos Tsagarakis | Leonardo S.p.A.,Istituto Italiano di Tecnologia,Istituto italiano di tecnologia,Istituto Italiano di Tecnologia (IIT),Leonardo s.p.a.,GMV,Selex Galileo,Politecnico di Milano | Space Robotics | | Tentacle-Based Shape Shifting of Metamorphic Robots Using Fast Inverse Kinematics | Jan Mrázek, Patrick Ondika, Ivana Černá, Jiri Barnat | Masaryk University | Modular and Reconfigurable Robots | | A Non-Planar Assembly of Modular Tetrahedral-Shaped Aerial Robots | Obadah Wali, Mohamad Shahab, Eric Feron | KAUST,King Abdullah University of Science and Technology | Modular and Reconfigurable Robots | | Learning Modular Robot Visual-Motor Locomotion Policies | Julian Whitman, Howie Choset | Carnegie Mellon University | Modular and Reconfigurable Robots | | DisCo: A Multiagent 3D Coordinate System for Lattice Based Modular Self-Reconfigurable Robots | Benoît Piranda, Frédéric Lassabe, Julien Bourgeois | Université de Franche-Comté / FEMTO-ST Institute,FEMTO-ST Institute, Univ. 
Bourgogne Franche-Comté, CNRS,Institut FEMTO-ST | Modular and Reconfigurable Robots | | Finding Optimal Modular Robots for Aerial Tasks | Jiawei Xu, David Saldana | Lehigh University | Modular and Reconfigurable Robots | | Coaxial Modular Aerial System and the Reconfiguration Applications | José Baca, Syed Izzat Ullah, Pablo Rangel | Texas A&M University-Corpus Christi,Texas A&M University - Corpus Christi | Modular and Reconfigurable Robots | | ADAPT: A 3 Degrees of Freedom Reconfigurable Force Balanced Parallel Manipulator for Aerial Applications | Kartik Suryavanshi, Salua Hamaza, Volkert Van Der Wijk, Just Herder | TU Delft,Delft University of Technology | Modular and Reconfigurable Robots | | Rearrange Indoor Scenes for Human-Robot Co-Activity | Weiqi Wang, Zihang Zhao, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu, Hangxin Liu | University of California, Los Angeles,Beijing Institute for General Artificial Intelligence,Beijing Institute for General Artificial Intelligence (BIGAI),Peking University,UCLA | Human-Centered Robotics | | Design and Evaluation of an Augmented Reality Head-Mounted Display User Interface for Controlling Legged Manipulators | Rodrigo Chacon Quesada, Yiannis Demiris | Imperial College London | Human-Centered Robotics | | Exploiting Intrinsic Kinematic Null Space for Supernumerary Robotic Limbs Control | Tommaso Lisini Baldi, Nicole D'Aurizio, Sergio Gurgone, Daniele Borzelli, Andrea D'Avella, Domenico Prattichizzo | University of Siena,University of Siena, Istituto Italiano di Tecnologia,University of Messina,Fondazione Santa Lucia,IRCCS Fondazione Santa Lucia | Human-Centered Robotics | | Robot Explanatory Narratives of Collaborative and Adaptive Experiences | Alberto Olivares-Alarcos, Antonio Andriella, Sergi Foix, Guillem Alenyà | Institut de Robòtica i Informàtica Industrial (CSIC-UPC),Pal Robotics,CSIC-UPC | Human-Centered Robotics | | Evaluating Immersive Teleoperation Interfaces: Coordinating Robot Radiation Monitoring Tasks in Nuclear 
Facilities | Harvey Stedman, Başaran Bahadır Koçer, Nejra Van Zalk, Mirko Kovac, Vijay Pawar | University College London,Imperial College London | Human-Centered Robotics | | A Social Referencing Disambiguation Framework for Domestic Service Robots | Kevin Fan, Melanie Jouaiti, Ali Noormohammadi Asl, Kerstin Dautenhahn, Chrystopher Nehaniv | University of Waterloo,Imperial College London | Human-Centered Robotics | | Ex(plainable) Machina: How Social-Implicit XAI Affects Complex Human-Robot Teaming Tasks | Marco Matarese, Francesca Cocchella, Francesco Rea, Alessandra Sciutti | Italian Institute of Technology,Istituto Italiano di Tecnologia | Human-Centered Robotics | | Towards Safe Remote Manipulation: User Command Adjustment Based on Risk Prediction for Dynamic Obstacles | Mincheul Kang, Minsung Yoon, Sung-Eui Yoon | KAIST,Korea Advanced Institute of Science and Technology (KAIST) | Human-Centered Robotics | | Computational Methods to Support Prototyping of an Adaptive Robot Joystick Controller for Children with Upper Limb Impairments | Melanie Jouaiti, Negin Azizi, Kerstin Dautenhahn | Imperial College London,University of Waterloo | Human-Centered Robotics | | Ethical Assessment of a Hospital Disinfection Robot | Conor Mcginn, Robert Scott, Niamh Donnelly, Michael F. Cullinan, Alan Winfield, Patricia Treusch | Trinity College Dublin,Akara Robotics,University of the West of England, Bristol,TU Berlin | Human-Centered Robotics | | Intention Aware Robot Crowd Navigation with Attention-Based Interaction Graph | Shuijing Liu, Peixin Chang, Zhe Huang, Neeloy Chakraborty, Kaiwen Hong, Weihang Liang, D. 
Livingston Mcpherson, Junyi Geng, Katherine Driggs-Campbell | University of Illinois at Urbana Champaign,University of Illinois at Urbana-Champaign,University of Illinois,Pennsylvania State University | Human-Centered Robotics | | A Study into Understanding User Requirements to Inform the Design of Customisable Robotic Pain Management Devices | Angela Higgins, Alison Llewellyn, Emma Dures, Praminda Caleb-Solly | University of Nottingham,University of the West of England | Human-Centered Robotics | | Occlusion-Aware Crowd Navigation Using People As Sensors | Ye-ji Mun, Masha Itkina, Shuijing Liu, Katherine Driggs-Campbell | University of Illinois at Urbana-Champaign,Stanford University,University of Illinois at Urbana Champaign | Human-Aware Motion Planning | | Efficiently Approaching Groups of People in a Socially Acceptable Manner in Environments with Obstacles | Aline Silva, Luciano Almeida, Douglas Guimarães Macharet | Universidade Federal de Minas Gerais - Brazil,Universidade Federal de Minas Gerais | Human-Aware Motion Planning | | SoLo T-DIRL: Socially-Aware Dynamic Local Planner Based on Trajectory-Ranked Deep Inverse Reinforcement Learning | Yifan Xu, Theodor Chakhachiro, Tribhi Kathuria, Maani Ghaffari | University of Michigan,American University of Beirut,University of Michigan, Ann Arbor | Human-Aware Motion Planning | | Noise and Environmental Justice in Drone Fleet Delivery Paths: A Simulation-Based Audit and Algorithm for Fairer Impact Distribution | Zewei Zhou, Martim Brandao | King's College London | Human-Aware Motion Planning | | Actuator Capabilities Aware Limitation for TDPA Passivity Controller Action | Francesco Porcini, Alessandro Filippeschi, Massimiliano Solazzi, Carlo Alberto Avizzano, Antonio Frisoli | PERCRO Laboratory, TeCIP Institute, Sant’Anna School of Advanced,Scuola Superiore Sant'Anna,Scuola Superiore Sant'Anna, TeCIP Institute | Physical Human-Robot Interaction II | | Upper-Limb Geometric MyoPassivity Map for Physical Human-Robot 
Interaction | Xingyuan Zhou, Peter Paik, S. Farokh Atashzar | New York University,New York University (NYU), US | Physical Human-Robot Interaction II | | Learning and Blending Robot Hugging Behaviors in Time and Space | Drolet Michael, Joseph Campbell, Heni Ben Amor | Arizona State University,Carnegie Mellon University | Physical Human-Robot Interaction II | | Quadruped Guidance Robot for the Visually Impaired: A Comfort-Based Approach | Yanbo Chen, Zhengzhe Xu, Zhuozhu Jian, Gengpan Tang, Yunong Yangli, Anxing Xiao, Xueqian Wang, Bin Liang | Harbin Institute of Technology, Shenzhen,Tsinghua University,National University of Singapore,Center for Artificial Intelligence and Robotics, Graduate School | Physical Human-Robot Interaction II | | Online Learning and Suppression of Vibration in Collaborative Robots with Power Tools | Gokhan Solak, Arash Ajoudani | Italian Institute of Technology, Genoa,Istituto Italiano di Tecnologia | Physical Human-Robot Interaction II | | Towards Human-Robot Collaboration with Parallel Robots by Kinetostatic Analysis, Impedance Control and Contact Detection | Aran Mohammad, Moritz Schappler, Tobias Ortmaier | Leibniz University Hannover,Institute of Mechatronic Systems, Leibniz Universitaet Hannover,Leibniz University Hanover | Physical Human-Robot Interaction II | | Proprioceptive Sensor-Based Simultaneous Multi-Contact Point Localization and Force Identification for Robotic Arms | Seowook Han, Min Jun Kim | Korean Advanced Institute of Science and Technology,KAIST | Physical Human-Robot Interaction II | | Nonlinear Model Predictive Control of a 3D Hopping Robot: Leveraging Lie Group Integrators for Dynamically Stable Behaviors | Noel Csomay-Shanklin, Victor Dorobantu, Aaron Ames | California Institute of Technology | Award Finalists 4 | | Anchoring Sagittal Plane Templates in a Spatial Quadruped | Timothy Greco, Daniel Koditschek | University of Pennsylvania | Legged Robots | | External Force Estimation of Legged Robots Via a Factor 
Graph Framework with a Disturbance Observer | Jeonguk Kang, Hyun-bin Kim, Keun Ha Choi, Kyung-Soo Kim | KAIST,Korea Advanced Institute of Science and Technology,KAIST(Korea Advanced Institute of Science and Technology) | Legged Robots | | Morphological Characteristics That Enable Stable and Efficient Walking in Hexapod Robot Driven by Reflex-Based Intra-Limb Coordination | Wataru Sato, Jun Nishii, Mitsuhiro Hayashibe, Dai Owaki | Tohoku University,Yamaguchi University | Legged Robots | | Efficient Learning of Locomotion Skills through the Discovery of Diverse Environmental Trajectory Generator Priors | Shikha Surana, Bryan Wei Tern Lim, Antoine Cully | Imperial College London | Legged Robots | | Robust Locomotion on Legged Robots through Planning on Motion Primitive Graphs | Wyatt Ubellacker, Aaron Ames | California Institute of Technology | Award Finalists 3 | | Learning Arm-Assisted Fall Damage Reduction and Recovery for Legged Mobile Manipulators | Yuntao Ma, Farbod Farshidian, Marco Hutter | ETH Zürich,ETH Zurich | Legged Robots | | Hierarchical Adaptive Loco-Manipulation Control for Quadruped Robots | Mohsen Sombolestan, Quan Nguyen | University of Southern California | Legged Robots | | Probabilistic Contact State Estimation for Legged Robots Using Inertial Information | Michael Maravgakis, Despina-ekaterini Argiropoulos, Stylianos Piperakis, Panos Trahanias | Institute of Computer Science, Foundation for Research and Techn,(a) Institute of Computer Science Foundation for Research and T,Agility Robotics Inc,,Foundation for Research and Technology – Hellas (FORTH) | Legged Robots | | Learning an Efficient Terrain Representation for Haptic Localization of a Legged Robot | Damian Sójka, Michał Nowicki, Piotr Skrzypczynski | Poznan University of Technology | Legged Robots | | Event-Based Agile Object Catching with a Quadrupedal Robot | Benedek Forrai, Takahiro Miki, Daniel Gehrig, Marco Hutter, Davide Scaramuzza | ETH Zürich,ETH Zurich,University of Zurich / 
ETH,University of Zurich | Legged Robots | | Evaluation of Legged Robot Landing Capability under Aggressive Linear and Angular Velocities | Keran Ye, Konstantinos Karydis | University of California, Riverside | Legged Robots | | Bipedal Robot Walking Control Using Human Whole-Body Dynamic Telelocomotion | Guillermo Colin Navarro, Youngwoo Sim, Joao Ramos | University of Illinois at Urbana-Champaign | Humanoids and Bipedal Locomotion | | Foot Stepping Algorithm of Humanoids with Double Support Time Adjustment Based on Capture Point Control | Myeong-Ju Kim, Daegyu Lim, Gyeongjae Park, Jaeheung Park | Seoul National University | Humanoids and Bipedal Locomotion | | Optimizing Bipedal Locomotion for the 100m Dash with Comparison to Human Running | Devin Crowley, Jeremy Dao, Helei Duan, Kevin Green, Jonathan Hurst, Alan Fern | Oregon State University | Humanoids and Bipedal Locomotion | | Effect of the Dynamics of a Horizontally Wobbling Mass on Biped Walking Performance | Tomoya Kamimura, Akihito Sano | Nagoya Institute of Technology | Humanoids and Bipedal Locomotion | | Robust Bipedal Locomotion: Leveraging Saltation Matrices for Gait Optimization | Maegan Tucker, Noel Csomay-Shanklin, Aaron Ames | California Institute of Technology,Caltech | Humanoids and Bipedal Locomotion | | Topology-Based MPC for Automatic Footstep Placement and Contact Surface Selection | Jaehyun Shim, Carlos Mastalli, Thomas Corbères, Steve Tonneau, Vladimir Ivan, Sethu Vijayakumar | University of Edinburgh,Heriot-Watt University,LAAS-CNRS,The University of Edinburgh,Touchlab Limited | Humanoids and Bipedal Locomotion | | Online Non-Linear Centroidal MPC for Humanoid Robots Payload Carrying with Contact-Stable Force Parametrization | Mohamed Elobaid, Giulio Romualdi, Gabriele Nava, Lorenzo Rapetti, Hosameldin Awadalla Omer Mohamed, Daniele Pucci | Fondazione Istituto Italiano di Tecnologia,Istituto Italiano di Tecnologia,IIT,Italian Institute of Technology | Humanoids and Bipedal Locomotion | 
| Holistic View of Inverse Optimal Control by Introducing Projections on Singularity Curves | Jessica Colombel, David Daney, François Charpillet | Université de Lorraine, CNRS, Inria, LORIA, F-,,,,, Nancy, Franc,Inria centre at the university of Bordeaux, F-,,,,, Talence, Fra | Humanoids and Bipedal Locomotion | | The Role of Symmetry in Constructing Geometric Flat Outputs for Free-Flying Robotic Systems | Jake Welde, Matthew Kvalheim, Vijay Kumar | University of Pennsylvania,University of Michigan | Underactuated Systems | | On the Learned Balance Manifold of Underactuated Balance Robots | Feng Han, Jingang Yi | Rutgers University | Underactuated Systems | | Controlling an Underactuated AUV As an Inverted Pendulum Using Nonlinear Model Predictive Control and Behavior Trees | Sriharsha Bhat, Ivan Stenius | KTH Royal Institute of Technology,KTH | Underactuated Systems | | Towards Exact Interaction Force Control for Underactuated Quadrupedal Systems with Orthogonal Projection and Quadratic Programming | Shengzhi Wang, Xiangyu Chu, Samuel Au | The Chinese University of Hong Kong | Underactuated Systems | | Reinforcement Learning for Laser Welding Speed Control Minimizing Bead Width Error | Toshimitsu Kaneko, Gaku Minamoto, Yusuke Hirose, Tetsuo Sakai | Toshiba Corporation,TOSHIBA/RIKEN | Industrial Robotics and Automation | | Real-Time Model Predictive Control for Industrial Manipulators with Singularity-Tolerant Hierarchical Task Control | Jaemin Lee, Mingyo Seo, Andrew Bylard, Zhouwen Sun, Luis Sentis | California Institute of Technology,The University of Texas at Austin,Stanford University,Dexterity Inc | Industrial Robotics and Automation | | High-Speed High-Accuracy Spatial Curve Tracking Using Motion Primitives in Industrial Robots | Honglu He, Chen-lung Lu, Yunshi Wen, Glenn Saunders, Pinghai Yang, Jeffrey Schoonover, John Wason, Agung Julius, John Wen | Rensselaer Polytechnic Institute,GE Research,Wason Technology, LLC | Industrial Robotics and Automation | | 
A New Robust Control Framework for Robot Manipulators without Velocity Measurements: A Modified Dual-Loop Control Scheme | Hae Yeon Park, Jung Hoon Kim | POSTECH,Pohang University of Science and Technology | Industrial Robotics and Automation | | Optimal Workpiece Placement Based on Robot Reach, Manipulability and Joint Torques | Baris Balci, Jared Donovan, Jonathan Roberts, Peter Corke | Queensland University of Technology | Industrial Robotics and Automation | | Experimental Workflow Implementation for Automatic Detection of Filament Deviation in 3D Robotic Printing Process | Xinrui Yang, Othman Lakhal, Abdelkader Belarouci, Kamal Youcef-Toumi, Rochdi Merzouki | University of Lille,University Lille, CRIStAL, CNRS-UMR ,,,,,University of Lille - CRIStAL Lab,Massachusetts Institute of Technology,CRIStAL, CNRS UMR ,,,,, University of Lille, | Industrial Robotics and Automation | | Neuro-Adaptive Dynamic Control with Edge-Computing for Collaborative Digital Twin of an Industrial Robotic Manipulator | Sumit Kumar Das, Mohammad Helal Uddin, Dan Popa, Sabur Hassan Baidya | University of Louisville | Industrial Robotics and Automation | | Contact-Based Pose Estimation of Workpieces for Robotic Setups | Yitaek Kim, Aljaz Kramberger, Anders Glent Buch, Christoffer Sloth | University of Southern Denmark | Industrial Robotics and Automation | | Local Layer Splitting: An Additive Manufacturing Method to Define the Mechanical Properties of Soft Pneumatic Actuators During Fabrication | Brice Parilusyan, Marc Teyssier, Zacharie Guillaume, Thibault Charlet, Clément Duhart, Marcos Serrano | Léonard de Vinci Pôle Universitaire , Research Center,Saarland University, Saarland Informatics Campus,De Vinci Innovation Center, ESPCI, ENAC,École supérieur d’ingénierie Léonard de Vinci,Léonard de Vinci Pôle Universitaire, Research center, ,, ,,, Par,IRIT - University of Toulouse | Additive Manufacturing | | Support Generation for Robot-Assisted 3D Printing with Curved Layers | Tianyu Zhang, 
Yuming Huang, Piotr Tomasz Kukulski, Neelotpal Dutta, Guoxin Fang, Charlie C.l. Wang | The University of Manchester,University of Manchester | Additive Manufacturing | | Learning Deposition Policies for Fused Multi-Material 3D Printing | Kang Liao, Thibault Tricard, Michal Piovarci, Hans-peter Seidel, Vahid Babaei | Beijing Jiaotong University,INRIA,Institute of Science and Technology Austria,Max Planck Institute for Informatics | Additive Manufacturing | | Transparent Objects: A Corner Case in Stereo Matching | Zhiyuan Wu, Shuai Su, Qijun Chen, Rui Fan | Tongji University,Tongji University, China | Logistics | | D2NT: A High-Performing Depth-To-Normal Translator | Yi Feng, Bohuan Xue, Ming Liu, Qijun Chen, Rui Fan | Tongji University,HKUST,Hong Kong University of Science and Technology | Logistics | | Security-Aware Reinforcement Learning under Linear Temporal Logic Specifications | Bohan Cui, Keyi Zhu, Shaoyuan Li, Xiang Yin | Shanghai Jiao Tong University,Shanghai Jiao Tong Univ | Logistics | | Global Localization in Repetitive and Ambiguous Environments | Zhenyu Wu, Wei Wang, Jun Zhang, Qiyang Lyu, Haoyuan Zhang, Danwei Wang | Nanyang Technological University | Logistics | | Grey-Box Learning of Adaptive Manipulation Primitives for Robotic Assembly | Marco Braun, Sebastian Wrede | Bielefeld University | Assembly | | Speeding up Assembly Sequence Planning through Learning Removability Probabilities | Alexander Cebulla, Tamim Asfour, Torsten Kroeger | Karlsruhe Institute of Technology (KIT),Karlsruher Institut für Technologie (KIT) | Assembly | | Planning Assembly Sequence with Graph Transformer | Lin Ma, Jiangtao Gong, Hao Xu, Hao Chen, Hao Zhao, Wenbing Huang, Guyue Zhou | Southwestern University of Finance and Economics,Tsinghua University,Qianzhi Technology,Qianzhi Technology Inc.,Renmin University of China | Assembly | | CFVS: Coarse-To-Fine Visual Servoing for 6-DoF Object-Agnostic Peg-In-Hole Assembly | Bo-Siang Lu, Tung-i Chen, Hsin-ying Lee, Winston Hsu 
| National Taiwan University | Assembly |
| Probabilistic Rare-Event Verification for Temporal Logic Robot Tasks | Guy Scher, Sadra Sadraddini, Hadas Kress-Gazit | Cornell University,MIT | Formal Methods |
| Safe Model-Based Control from Signal Temporal Logic Specifications Using Recurrent Neural Networks | Wenliang Liu, Mirai Duintjer Tebbens Nishioka, Calin Belta | Boston University,Commonwealth School | Formal Methods |
| Temporal Logic Swarm Control with Splitting and Merging | Gustavo Andres Cardona, Kevin Leahy, Cristian Ioan Vasile | Lehigh University,MIT Lincoln Laboratory | Formal Methods |
| Synthesizing Reactive Test Environments for Autonomous Systems: Testing Reach-Avoid Specifications with Multi-Commodity Flows | Apurva Badithela, Josefine Graebener, Wyatt Ubellacker, Eric Mazumdar, Aaron Ames, Richard M. Murray | Caltech,California Institute of Technology | Formal Methods |
| HaPPArray: Haptic Pneumatic Pouch Array for Feedback in Hand-Held Robots | Xiaolei Luo, Jui-Te Lin, Tania Morimoto | University of California San Diego | Haptics and Haptic Interfaces |
| Vis2Hap: Vision-Based Haptic Rendering by Cross-Modal Generation | Guanqun Cao, Jiaqi Jiang, Ningtao Mao, Danushka Bollegala, Min Li, Shan Luo | University of Liverpool,King's College London,School of Design, University of Leeds,Xi'an Jiaotong University | Haptics and Haptic Interfaces |
| A Plug-In Weight-Shifting Module That Adds Emotional Expressiveness to Inanimate Objects in Handheld Interaction | Yohei Noguchi, Yijie Guo, Fumihide Tanaka | University of Tsukuba | Haptics and Haptic Interfaces |
| Model-Mediated Teleoperation for Remote Haptic Texture Sharing: Initial Study of Online Texture Modeling and Rendering | Mudassir Ibrahim Awan, Tatyana Ogay, Waseem Hassan, Dongbeom Ko, Sungjoo Kang, Seokhee Jeon | Kyung Hee university,Kyung Hee University,ETRI (Electronics and Telecommunications Research Institute),Electronics and Telecommunications Research Institute (ETRI) | Haptics and Haptic Interfaces |
| Using a Collaborative Robotic Arm As Human-Machine Interface: System Setup and Application to Pose Control Tasks | Christian Braun, Ludwig Haide, Lars Fischer, Sean Kille, Balint Varga, Simon Rothfuß, Soeren Hohmann | Karlsruhe Institute of Technology (KIT),Karlsruhe Institute of Technology,Karlsruhe Institute of Technology (KIT), Campus South,Institute of Control Systems, Karlsruhe Institute of Technology | Haptics and Haptic Interfaces |
| Disturbance Observer Based Contact Detection for Motorized Hydraulic Actuators | Chunpeng Wang, John Peter Whitney | Northeastern University | Haptics and Haptic Interfaces |
| A Framework for Active Haptic Guidance Using Robotic Haptic Proxies | Niall L. Williams, Nicholas Rewkowski, Jiasheng Li, Ming C. Lin | University of Maryland, College Park,UMD College Park,University of Maryland at College Park | Haptics and Haptic Interfaces |
| An Optimized Portable Cable-Driven Haptic Robot Enables Free Motion and Hard Contact | Changqi Zhang, Cui Wang, Qingkai Yang, Mingming Zhang | Southern University of Science and Technology,Southern University of Science And Technology | Haptics and Haptic Interfaces |
| Enable Natural Tactile Interaction for Robot Dog Based on Large-Format Distributed Flexible Pressure Sensors | Lishuang Zhan, Yancheng Cao, Qitai Chen, Haole Guo, Jiasi Gao, Yiyue Luo, Shihui Guo, Guyue Zhou, Jiangtao Gong | Xiamen University,Institute for AI Industry Research (AIR), Tsinghua University, C,Guangzhou Maritime University,Tsinghua University,Massachusetts Institute of Technology | Haptics and Haptic Interfaces |
| Multi-Modal Interactive Perception in Human Control of Complex Objects | Rashida Nayeem, Salah Bazzi, Mohsen Sadeghi, Reza Sharif Razavian, Dagmar Sternad | Northeastern University | Haptics and Haptic Interfaces |
| Soft Sensing Skins for Arbitrary Objects: An Automatic Framework | Sonja Groß, Diego Xavier Hidalgo Carvajal, Silija Breimann, Nicolai Stein, Amartya Ganguly, Abdeldjallil Naceri, Sami Haddadin | Technical University of Munich,Technical University Munich,Technische Universität München | Haptics and Haptic Interfaces |
| Error-Domain Conservativity Control to Transparently Increase the Stability Range of Time-Discretized Controllers | Michael Rothammer, Jee-Hwan Ryu | TUM, Munich,Korea Advanced Institute of Science and Technology | Haptics and Haptic Interfaces |
| A Digital Twin for Teleoperation of Vehicles in Urban Environments | Philipp Kremer, Navid Nourani-Vatani, Sangyoung Park | Technische Universität Berlin,Imperium Drive Ltd,Technical University of Berlin | Teleoperation |
| WE-Filter: Adaptive Acceptance Criteria for Filter-Based Shared Autonomy | Michael Bowman, Xiaoli Zhang | Colorado School of Mines | Teleoperation |
| Monocular Reactive Collision Avoidance for MAV Teleoperation with Deep Reinforcement Learning | Raffaele Brilli, Marco Legittimo, Francesco Crocetti, Mirko Leomanni, Mario Luca Fravolini, Gabriele Costante | University of Perugia | Teleoperation |
| HAT: Head-Worn Assistive Teleoperation of Mobile Manipulators | Akhil Padmanabha, Qin Wang, Daphne Han, Jashkumar Rasikbhai Diyora, Kriti Kacker, Hamza Khaild, Liang-jung Chen, Carmel Majidi, Zackory Erickson | Carnegie Mellon University | Teleoperation |
| DenseTact 2.0: Optical Tactile Sensor for Shape and Force Reconstruction | Won Kyung Do, Bianca Jurewicz, Monroe Kennedy | Stanford University | Force and Tactile Sensing II |
| SonicFinger: Pre-Touch and Contact Detection Tactile Sensor for Reactive Pregrasping | Siddharth Rupavatharam, Caleb Escobedo, Daewon Lee, Colin Prepscius, Lawrence Jackel, Richard Howard, Volkan Isler | Samsung AI Center,University of Colorado - Boulder,Samsung AI Center New York,Samsung,North-C Technologies Inc,University of Minnesota | Force and Tactile Sensing II |
| Simultaneous Tactile Estimation and Control of Extrinsic Contact | Sangwoon Kim, Devesh Jha, Diego Romeres, Parag Patre, Alberto Rodriguez | Massachusetts Institute of Technology,Mitsubishi Electric Research Laboratories,Mitsubishi Electric research laboratories,University of Florida | Force and Tactile Sensing II |
| A Miniaturised Camera-Based Multi-Modal Tactile Sensor | Kaspar Althoefer, Yonggen Ling, Wanlin Li, Xinyuan Qian, Wang Wei Lee, Peng Qi | Queen Mary University of London,Tencent,Beijing Institute for General Artificial Intelligence (BIGAI),University of Science and Technology Beijing,Tongji University | Force and Tactile Sensing II |
| Neural Contact Fields: Tracking Extrinsic Contact with Tactile Sensing | Carolina Higuera, Siyuan Dong, Byron Boots, Mustafa Mukadam | University of Washington,MIT,Facebook AI Research | Force and Tactile Sensing II |
| Estimating Tactile Models of Heterogeneous Deformable Objects in Real Time | Shaoxiong Yao, Kris Hauser | University of Illinois Urbana-Champaign,University of Illinois at Urbana-Champaign | Force and Tactile Sensing II |
| Tactile Identification of Object Shapes Via In-Hand Manipulation with a Minimalistic Barometric Tactile Sensor Array | Xin Zhou, Ad Spiers | Imperial College London | Force and Tactile Sensing II |
| Tactile Tool Manipulation | Yuki Shirai, Devesh Jha, Arvind Raghunathan, Dennis Hong | University of California, Los Angeles,Mitsubishi Electric Research Laboratories,UCLA | Force and Tactile Sensing II |
| Preliminary Evaluation of a Wearable Thruster for Arresting Backwards Falls | Michael Finn-henry, Jose Leonardo Brenes, Almaskhan Baimyshev, Michael Goldfarb | Vanderbilt,Vanderbilt University | Rehabilitation and Augmentation II |
| A Method for Selecting Stumble Recovery Response in a Knee Exoskeleton | Maura Eveld, Shane King, Karl Zelik, Michael Goldfarb | University of Twente,Vanderbilt University | Rehabilitation and Augmentation II |
| A Dual-Arm Participated Human-Robot Collaboration Method for Upper Limb Rehabilitation of Hemiplegic Patients | Lufeng Chen, Jing Qiu, Xuan Zou, Hong Cheng | University of Electronic Science and Technology of China,University of Electronic Science and Technology | Rehabilitation and Augmentation II |
| A Force-Sensitive Exoskeleton for Teleoperation: An Application in Elderly Care Robotics | Alexander Toedtheide, Xiao Chen, Hamid Sadeghian, Abdeldjallil Naceri, Sami Haddadin | Technical University of Munich | Rehabilitation and Augmentation II |
| A Model-Based Analysis of the Effect of Repeated Unilateral Low Stiffness Perturbations on Human Gait: Toward Robot-Assisted Rehabilitation | Vaughn Chambers, Panagiotis Artemiadis | University of Delaware | Rehabilitation and Augmentation II |
| Shared Control of Assistive Robots through User-Intent Prediction and Hyperdimensional Recall of Reactive Behavior | Alisha Menon, Laura I. Galindez Olascoaga, Vamshi Balanaga, Anirudh Natarajan, Jennifer Ruffing, Ryan Ardalan, Jan M. Rabaey | University of California: Berkeley | Rehabilitation and Augmentation II |
| Towards Predicting Fine Finger Motions from Ultrasound Images Via Kinematic Representation | Dean Zadok, Oren Salzman, Alon Wolf, Alexander Bronstein | Technion | Rehabilitation and Augmentation II |
| Enabling Safe Walking Rehabilitation on the Exoskeleton Atalante: Experimental Results | Maxime Brunet, Marine Pétriaux, Florent Di Meglio, Nicolas Petit | MINES Paristech,Wandercraft,MINES ParisTech, PSL Research University,MINES ParisTech, PSL | Rehabilitation and Augmentation II |
| A Probabilistic Model of Activity Recognition with Loose Clothing | Tianchen Shen, Irene Di Giulio, Matthew Howard | King's College London | Rehabilitation and Augmentation II |
| Real-Time Estimation of Walking Speed and Stride Length Using an IMU Embedded in a Robotic Hip Exoskeleton | Keehong Seo | Samsung Research/Samsung Electronics Co., Ltd. | Rehabilitation and Augmentation II |
| Adaptive Based Assist-As-Needed Control Strategy for Ankle Movement Assistance | Rami Jradi, Hala Rifai, Yacine Amirat, Samer Mohammed | UPEC,University of Paris Est Créteil,University of Paris Est Créteil (UPEC),University of Paris Est Créteil - (UPEC) | Rehabilitation and Augmentation II |
| Anticipation and Delayed Estimation of Sagittal Plane Human Hip Moments Using Deep Learning and a Robotic Hip Exoskeleton | Dean Molinaro, Ethan Park, Aaron Young | Georgia Institute of Technology,University of Illinois Urbana-Champaign,Georgia Tech | Rehabilitation and Augmentation II |
| Safety under Uncertainty: Tight Bounds with Risk-Aware Control Barrier Functions | Mitchell Black, Georgios Fainekos, Bardh Hoxha, Danil Prokhorov, Dimitra Panagou | University of Michigan,Toyota NA-R&D,Southern Illinois University,Toyota Tech Center,University of Michigan, Ann Arbor | Safety and Trustworthy Robotics II |
| Distributionally Robust RRT with Risk Allocation | Kajsa Ekenberg, Venkatraman Renganathan, Bjorn Olofsson | Lund University | Safety and Trustworthy Robotics II |
| Statistical Safety and Robustness Guarantees for Feedback Motion Planning of Unknown Underactuated Stochastic Systems | Craig Knuth, Glen Chou, Jamie Reese, Joseph Moore | Johns Hopkins University Applied Physics Lab,University of Michigan,Johns Hopkins Applied Physics Lab | Safety and Trustworthy Robotics II |
| A Sensitivity-Aware Motion Planner (SAMP) to Generate Intrinsically-Robust Trajectories | Simon Wasiela, Paolo Robuffo Giordano, Juan Cortes, Thierry Simeon | LAAS-CNRS,IRISA CNRS UMR,,,, | Safety and Trustworthy Robotics II |
| Proficiency Self-Assessment without Breaking the Robot: Anomaly Detection Using Assumption-Alignment Tracking from Safe Experiments | Xuan Cao, Jacob Crandall, Ethan Pedersen, Alvika Gautam, Michael A. Goodrich | Brigham Young University,Texas A & M University | Award Finalists 1 |
| Failure Detection for Motion Prediction of Autonomous Driving: An Uncertainty Perspective | Wenbo Shao, Yanchao Xu, Liang Peng, Jun Li, Hong Wang | Tsinghua University,Beijing Institute of Technology | Safety and Trustworthy Robotics II |
| Analysing the Safety and Security of a UV-C Disinfection Robot | Desiana Nurchalifah, Sebastian Blumenthal, Luigi Lo Iacono, Nico Hochgeschwender | Hochschule Bonn-Rhein-Sieg,Locomotec,Hochschule Bonn-Rhein-Sieg University of Applied Sciences,Bonn-Rhein-Sieg University | Safety and Trustworthy Robotics II |
| Failure Detection and Fault Tolerant Control of a Jet-Powered Flying Humanoid Robot | Gabriele Nava, Daniele Pucci | Istituto Italiano di Tecnologia,Italian Institute of Technology | Safety and Trustworthy Robotics II |
| Testing Rare Downstream Safety Violations Via Upstream Adaptive Sampling of Perception Error Models | Craig Innes, Subramanian Ramamoorthy | University of Edinburgh,The University of Edinburgh | Safety and Trustworthy Robotics II |
| Learning to Forecast Aleatoric and Epistemic Uncertainties Over Long Horizon Trajectories | Aastha Acharya, Rebecca Russell, Nisar Ahmed | University of Colorado Boulder; Draper,Draper,University of Colorado Boulder | Safety and Trustworthy Robotics II |
| S∗: On Safe and Time Efficient Robot Motion Planning | Riddhiman Laha, Wenxi Wu, Ruiai Sun, Nico Mansfeld, Luis Felipe Cruz Figueredo, Sami Haddadin | Technical University of Munich,Franka Emika GmbH,Technical University of Munich (TUM) | Safety and Trustworthy Robotics II |
| Online Update of Safety Assurances Using Confidence-Based Predictions | Kensuke Nakamura, Somil Bansal | Princeton University,University of Southern California | Safety and Trustworthy Robotics II |
| Self-Supervised Point Cloud Understanding Via Mask Transformer and Contrastive Learning | Di Wang, Zhi-Xin Yang | University of Macau | Deep Learning for Visual Perception |