% 2025
@conference{metha2025probe,
  title = {PROBE: Proprioceptive Obstacle Detection and Estimation while Navigating in Clutter},
  author = {D Ramesh and S Keskar and A Sivaramakrishnan and K Bekris and J Yu and A Boularias},
  url = {https://dhruvmetha.github.io/legged-probe/},
  year = {2025},
  date = {2025-05-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  abstract = {In critical applications, including search-and-rescue in degraded environments, blockages can be prevalent and prevent the effective deployment of certain sensing modalities, particularly vision, due to occlusion and the constrained range of view of onboard camera sensors. To enable robots to tackle these challenges, we propose a new approach, Proprioceptive Obstacle Detection and Estimation while navigating in clutter (PROBE), which instead relies only on the robot's proprioception to infer the presence or absence of occluded rectangular obstacles while predicting their dimensions and poses in SE(2). The approach is a Transformer neural network that receives as input a history of applied torques and sensed whole-body movements of the robot and returns a parameterized representation of the obstacles in the environment. The effectiveness of PROBE is evaluated in simulated environments in Isaac Gym and with a real Unitree Go1 quadruped robot.},
  keywords = {Perception},
  pubstate = {published},
  tppubtype = {conference}
}
@conference{marougkas2025integration,
  title = {Integrating Model-based Control and RL for Sim2Real Transfer of Tight Insertion Policies},
  author = {I Marougkas and D Ramesh and J Doerr and E Granados and A Sivaramakrishnan and A Boularias and K Bekris},
  year = {2025},
  date = {2025-05-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  abstract = {Object insertion under tight tolerances (<1mm) is an important but challenging assembly task, as even slight errors can result in undesirable contacts. Recent efforts have focused on using Reinforcement Learning (RL) and often depend on the careful definition of dense reward functions. This work proposes an effective strategy for such tasks that integrates traditional model-based control with RL to achieve improved accuracy, given training of the policy exclusively in simulation and zero-shot transfer to the real system. It employs a potential field-based controller to acquire a model-based policy for inserting a plug into a socket, given full observability in simulation. This policy is then integrated with a residual RL one, which is trained in simulation given only a sparse, goal-reaching reward. A curriculum scheme over observation noise and action magnitude is used for training the residual RL policy. Both policy components use as input the SE(3) poses of both the plug and the socket and return the plug's SE(3) pose transform, which is executed by a robotic arm using a controller. The integrated policy is deployed on the real system without further training or fine-tuning, given a visual SE(3) object tracker. The proposed solution and alternatives are evaluated across a variety of objects and conditions in simulation and reality. The proposed approach outperforms recent RL methods in this domain and prior efforts for hybrid policies. Ablations highlight the impact of each component of the approach.},
  keywords = {Manipulation, Planning},
  pubstate = {published},
  tppubtype = {conference}
}
% 2024
@inproceedings{Bekris:2024aa,
  title = {The State of Robot Motion Generation},
  author = {K Bekris and J Doerr and P Meng and S Tangirala},
  url = {https://arxiv.org/abs/2410.12172 https://pracsys.cs.rutgers.edu/papers/the-state-of-robot-motion-generation/},
  year = {2024},
  date = {2024-12-01},
  booktitle = {International Symposium on Robotics Research (ISRR)},
  address = {Long Beach, California},
  abstract = {This paper first reviews the large spectrum of methods for generating robot motion proposed over the 50 years of robotics research, culminating in recent developments. It crosses the boundaries of methodologies, which are typically not surveyed together, from those that operate over explicit models to those that learn implicit ones. The paper concludes with a discussion of the current state of the art and the properties of the varying methodologies, highlighting opportunities for integration.},
  keywords = {Dynamics, Learning, Planning},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@inproceedings{Chen:2024aa,
  title = {Learning Differentiable Tensegrity Dynamics Using Graph Neural Networks},
  author = {N Chen and K Wang and W Johnson and R Kramer-Bottiglio and K Bekris and M Aanjaneya},
  url = {https://openreview.net/pdf?id=5Awumz1VKU},
  year = {2024},
  date = {2024-11-01},
  booktitle = {Conference on Robot Learning (CoRL)},
  address = {Munich, Germany},
  abstract = {Tensegrity robots are composed of rigid struts and flexible cables and constitute an emerging class of hybrid rigid-soft robotic systems. They are promising systems for a wide array of applications, ranging from locomotion to assembly. They are difficult to control and model accurately, however, due to their high number of degrees of freedom and compliance. To address this issue, prior works have introduced a differentiable physics engine designed for tensegrity robots based on first principles. In contrast, this work proposes the use of graph neural networks to model contact dynamics over a graph representation of tensegrity robots, which leverages the natural graph-like cable connectivity between the rod end caps. This learned simulator can accurately model 3-bar and 6-bar tensegrity robot dynamics in simulation-to-simulation experiments, where MuJoCo is used as the ground truth. It can also achieve higher accuracy than the previous differentiable engine for a real 3-bar tensegrity robot, where the robot state is only partially observable. When compared against direct applications of recent graph neural network simulators, the proposed approach is computationally more efficient both for training and inference, while achieving higher accuracy.},
  keywords = {Soft-Robots},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@inproceedings{Sivaramakrishnan:2024aa,
  title = {Roadmaps with Gaps Over Controllers: Achieving Efficiency in Planning under Dynamics},
  author = {A Sivaramakrishnan and S Tangirala and E Granados and N Carver and K Bekris},
  url = {https://arxiv.org/abs/2310.03239},
  year = {2024},
  date = {2024-10-01},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  address = {Abu Dhabi, United Arab Emirates},
  abstract = {This paper aims to improve the computational efficiency of motion planning for mobile robots with non-trivial dynamics through the use of learned controllers. It adopts a decoupled strategy, where a system-specific controller is first trained offline in an empty environment to deal with the robot's dynamics. For a target environment, the proposed approach constructs offline a data structure, a ``Roadmap with Gaps,'' to approximately learn how to solve planning queries in this environment using the learned controller. The nodes of the roadmap correspond to local regions. Edges correspond to applications of the learned control policy that approximately connect these regions. Gaps arise because the controller does not perfectly connect pairs of individual states along edges. Online, given a query, a tree sampling-based motion planner uses the roadmap so that the tree's expansion is informed towards the goal region. The tree expansion selects local subgoals given a wavefront on the roadmap that guides towards the goal. When the controller cannot reach a subgoal region, the planner resorts to random exploration to maintain probabilistic completeness and asymptotic optimality. The accompanying experimental evaluation shows that the approach significantly improves the computational efficiency of motion planning on various benchmarks, including physics-based vehicular models on uneven and varying friction terrains as well as a quadrotor under air pressure effects.},
  keywords = {Dynamics, Planning},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@conference{Vieira:2024aa,
  title = {MORALS: Analysis of High-Dimensional Robot Controllers Via Topological Tools in a Latent Space},
  author = {E Vieira and A Sivaramakrishnan and S Tangirala and E Granados and K Mischaikow and K Bekris},
  url = {https://arxiv.org/abs/2310.03246},
  year = {2024},
  date = {2024-05-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  address = {Yokohama, Japan (Nominated for Best Paper Award in Automation)},
  abstract = {Estimating the region of attraction (RoA) for a robotic system's controller is essential for safe application and controller composition. Many existing methods require access to a closed-form expression, which limits applicability to data-driven controllers. Methods that operate only over trajectory rollouts tend to be data-hungry. In prior work, we have demonstrated that topological tools based on Morse Graphs offer data-efficient RoA estimation without needing an analytical model. They struggle, however, with high-dimensional systems, as they operate over a discretization of the state space. This paper presents Morse Graph-aided discovery of Regions of Attraction in a learned Latent Space (MORALS). The approach combines autoencoding neural networks with Morse Graphs. MORALS shows promising predictive capabilities in estimating attractors and their RoAs for data-driven controllers operating over high-dimensional systems, including a 67-dim humanoid robot and a 96-dim 3-fingered manipulator. It first projects the dynamics of the controlled system into a learned latent space. Then, it constructs a reduced form of Morse Graphs representing the bistability of the underlying dynamics, i.e., detecting when the controller results in a desired versus an undesired behavior. The evaluation on high-dimensional robotic datasets indicates the data efficiency of the approach in RoA estimation.},
  keywords = {Dynamics, Planning, Verification},
  pubstate = {published},
  tppubtype = {conference}
}
% 2023
@inproceedings{Vieira:2023ab,
  title = {Persistent Homology Guided Monte-Carlo Tree Search for Effective Non-Prehensile Manipulation},
  author = {E Vieira and K Gao and D Nakhimovich and K Bekris and J Yu},
  url = {http://arxiv.org/abs/2210.01283},
  year = {2023},
  date = {2023-12-01},
  booktitle = {International Symposium on Experimental Robotics (ISER)},
  abstract = {Performing object retrieval in real-world workspaces must tackle challenges including uncertainty and clutter. One option is to apply prehensile operations, which can be time-consuming in highly cluttered scenarios. On the other hand, non-prehensile actions, such as simultaneously pushing multiple objects, can help to quickly clear a cluttered workspace and retrieve a target object. Such actions, however, can also lead to increased uncertainty, as it is difficult to estimate the outcome of pushing operations. The proposed framework in this work integrates topological tools and Monte-Carlo Tree Search (MCTS) to achieve effective and robust pushing for object retrieval. It employs persistent homology to automatically identify manageable clusters of blocking objects without the need for manually adjusting hyper-parameters. Then, MCTS uses this information to explore feasible actions to push groups of objects, aiming to minimize the number of operations needed to clear the path to the target. Real-world experiments using a Baxter robot, which involve some noise in actuation, show that the proposed framework achieves a higher success rate in solving retrieval tasks in dense clutter than alternatives. Moreover, it produces solutions with few pushing actions, improving the overall execution time. More critically, it is robust enough to allow one to plan the sequence of actions offline and then execute them reliably on a Baxter robot.},
  keywords = {Manipulation},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@inproceedings{Lu:2023aa,
  title = {OVIR-3D: Open-Vocabulary 3D Instance Retrieval without Training on 3D Data},
  author = {S Lu and H Chang and E Jing and A Boularias and K Bekris},
  url = {https://proceedings.mlr.press/v229/lu23a/lu23a.pdf},
  year = {2023},
  date = {2023-11-01},
  booktitle = {Conference on Robot Learning (CoRL)},
  address = {Atlanta, GA},
  abstract = {This work presents OVIR-3D, a straightforward yet effective method for open-vocabulary 3D object instance retrieval without using any 3D data for training. Given a language query, the proposed method is able to return a ranked set of 3D object instance segments based on the feature similarity of the instance and the text query. This is achieved by a multi-view fusion of text-aligned 2D region proposals into 3D space, where the 2D region proposal network can leverage 2D datasets, which are more accessible and typically larger than 3D datasets. The proposed fusion process is efficient as it can be performed in real-time for most indoor 3D scenes and does not require additional training in 3D space. Experiments on public datasets and a real robot show the effectiveness of the method and its potential for applications in robot navigation and manipulation.},
  keywords = {Perception},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@conference{Chang_Corl23b,
  title = {Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs},
  author = {H Chang and K Boyalakuntla and S Lu and S Cai and E Jing and S Keskar and S Geng and A Abbas and L Zhou and K Bekris and A Boularias},
  url = {https://arxiv.org/abs/2309.15940},
  year = {2023},
  date = {2023-11-01},
  booktitle = {Conference on Robot Learning (CoRL)},
  address = {Atlanta, GA},
  abstract = {We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as ``pick up a cup on a kitchen table'' or ``navigate to a sofa on which someone is sitting.'' In contrast to existing research on 3D scene graphs, OVSG supports free-form text input and open-vocabulary querying. Through a series of comparative experiments using the ScanNet dataset and a self-collected dataset, we demonstrate that our proposed approach significantly surpasses the performance of previous semantic-based localization techniques. Moreover, we highlight the practical application of OVSG in real-world robot navigation and manipulation experiments.},
  keywords = {Perception},
  pubstate = {published},
  tppubtype = {conference}
}
@inproceedings{Wang:2023aa,
  title = {Real2Sim2Real Transfer for Control of Cable-Driven Robots via a Differentiable Physics Engine},
  author = {K Wang and W Johnson and S Lu and X Huang and J Booth and R Kramer-Bottiglio and M Aanjaneya and K Bekris},
  url = {https://arxiv.org/abs/2209.06261},
  year = {2023},
  date = {2023-10-01},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  address = {Detroit, MI},
  abstract = {Tensegrity robots, composed of rigid rods and flexible cables, exhibit high strength-to-weight ratios and significant deformations, which enable them to navigate unstructured terrains and survive harsh impacts. They are hard to control, however, due to high dimensionality, complex dynamics, and a coupled architecture. Physics-based simulation is a promising avenue for developing locomotion policies that can be transferred to real robots. Nevertheless, modeling tensegrity robots is a complex task due to a substantial sim2real gap. To address this issue, this paper describes a Real2Sim2Real (R2S2R) strategy for tensegrity robots. This strategy is based on a differentiable physics engine that can be trained given limited data from a real robot. These data include offline measurements of physical properties, such as mass and geometry for various robot components, and the observation of a trajectory using a random control policy. With the data from the real robot, the engine can be iteratively refined and used to discover locomotion policies that are directly transferable to the real robot. Beyond the R2S2R pipeline, key contributions of this work include computing non-zero gradients at contact points, a loss function for matching tensegrity locomotion gaits, and a trajectory segmentation technique that avoids conflicts in gradient evaluation during training. Multiple iterations of the R2S2R process are demonstrated and evaluated on a real 3-bar tensegrity robot.},
  keywords = {Dynamics, Soft-Robots},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@article{Zhao:2023aa,
  title = {StarBlocks: Soft Actuated Self-Connecting Blocks for Building Deformable Lattice Structures},
  author = {L Zhao and Y Wu and W Yan and W Zhan and X Huang and J Booth and A Mehta and K Bekris and R Kramer-Bottiglio and D Balkcom},
  url = {https://ieeexplore.ieee.org/document/10146508},
  year = {2023},
  date = {2023-08-01},
  journal = {IEEE Robotics and Automation Letters},
  volume = {8},
  number = {8},
  pages = {4521--4528},
  abstract = {In this paper, we present a soft modular block inspired by tensegrity structures that can form load-bearing structures through self-assembly. The block comprises a stellated compliant skeleton, shape memory alloy muscles, and permanent magnet connectors. We classify five deformation primitives for individual blocks: bend, compress, stretch, stand, and shrink, which can be combined across modules to reason about full-lattice deformation. Hierarchical function is abundant in nature and in human-designed systems. Using multiple self-assembled lattices, we demonstrate the formation and actuation of 3-dimensional shapes, including a load-bearing pop-up tent, a self-assembled wheel, a quadruped, a block-based robotic arm with gripper, and non-prehensile manipulation. To our knowledge, this is the first example of active deformable modules (blocks) that can reconfigure into different load-bearing structures on demand.},
  keywords = {Soft-Robots},
  pubstate = {published},
  tppubtype = {article}
}
@inproceedings{Li:2023aa,
  title = {Demonstrating Large-Scale Package Manipulation Via Learned Metrics of Pick Success},
  author = {S Li and A Keipour and K Jamieson and N Hudson and C Swan and K Bekris},
  year = {2023},
  date = {2023-07-01},
  booktitle = {Robotics: Science and Systems (RSS)},
  address = {Daegu, Korea},
  abstract = {Automating warehouse operations can reduce logistics overhead costs, ultimately driving down the final price for consumers, increasing the speed of delivery, and enhancing resiliency to workforce fluctuations. The past few years have seen increased interest in automating such repeated tasks, but mostly in controlled settings. Tasks such as picking objects from unstructured, cluttered piles have only recently become robust enough for large-scale deployment with minimal human intervention. This paper demonstrates large-scale package manipulation from unstructured piles in Amazon Robotics' Robot Induction (Robin) fleet, which utilizes a pick success predictor trained on real production data. Specifically, the system was trained on over 394K picks. It is used for singulating up to 5~million packages per day and has manipulated over 200~million packages during this paper's evaluation period. The developed learned pick quality measure ranks various pick alternatives in real-time and prioritizes the most promising ones for execution. The pick success predictor aims to estimate from prior experience the success probability of a desired pick by the deployed industrial robotic arms in cluttered scenes containing deformable and rigid objects with partially known properties. It is a shallow machine learning model, which allows us to evaluate which features are most important for the prediction. An online pick ranker leverages the learned success predictor to prioritize the most promising picks for the robotic arm, which are then assessed for collision avoidance. This learned ranking process is demonstrated to overcome the limitations of, and outperform, manually engineered and heuristic alternatives. To the best of the authors' knowledge, this paper presents the first large-scale deployment of learned pick quality estimation methods in a real production system.},
  keywords = {Manipulation},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@inproceedings{Lu:2023ab,
  title = {Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos},
  author = {S Lu and Y Deng and A Boularias and K Bekris},
  url = {https://arxiv.org/abs/2304.04325},
  year = {2023},
  date = {2023-05-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  address = {London, UK},
  abstract = {This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot. A key feature of the self-supervised training process is a graph-matching algorithm that operates on the over-segmentation output of the point cloud that is reconstructed from each video. The graph matching, along with point cloud registration, is able to find recurring object patterns across videos and combine them into 3D object pseudo-labels, even under occlusions or different viewing angles. Projected 2D object masks from 3D pseudo-labels are used to train a pixel-wise feature extractor through contrastive learning. During online inference, a clustering method uses the learned features to cluster foreground pixels into object segments. Experiments highlight the method's effectiveness on both real and synthetic video datasets, which include cluttered scenes of tabletop objects. The proposed method outperforms existing unsupervised methods for object segmentation by a large margin.},
  keywords = {Learning, Perception},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@inproceedings{Vieira:2023aa,
  title = {Data-Efficient Characterization of the Global Dynamics of Robot Controllers with Confidence Guarantees},
  author = {E Vieira and A Sivaramakrishnan and Y Song and E Granados and M Gameiro and K Mischaikow and Y Hung and K Bekris},
  year = {2023},
  date = {2023-05-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  address = {London, UK},
  abstract = {This paper proposes an integration of surrogate modeling and topology to significantly reduce the amount of data required to describe the underlying global dynamics of robot controllers, including closed-box ones. A Gaussian Process (GP), trained with randomized short trajectories over the state space, acts as a surrogate model for the underlying dynamical system. Then, a combinatorial representation is built and used to describe the dynamics in the form of a directed acyclic graph, known as a Morse graph. The Morse graph is able to describe the system's attractors and their corresponding regions of attraction (RoA). Furthermore, a pointwise confidence level of the global dynamics estimation over the entire state space is provided. In contrast to alternatives, the framework does not require estimation of Lyapunov functions, alleviating the need for high prediction accuracy of the GP. The framework is suitable for data-driven controllers that do not expose an analytical model, as long as Lipschitz continuity is satisfied. The method is compared against established analytical and recent machine learning alternatives for estimating RoAs, outperforming them in data efficiency without sacrificing accuracy.},
  keywords = {Dynamics, Verification},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@inproceedings{Nakhimovich:2023aa,
  title = {Resolution Complete In-Place Object Retrieval Given Known Object Models},
  author = {D Nakhimovich and Y Miao and K Bekris},
  url = {https://arxiv.org/abs/2303.14562},
  year = {2023},
  date = {2023-01-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  address = {London, UK},
  abstract = {This work proposes a robot task planning framework for retrieving a target object in a confined workspace among multiple stacked objects that obstruct the target. The robot can use prehensile picking and in-workspace placing actions. The method assumes access to 3D models for the visible objects in the scene. The key contribution is in achieving desirable properties, i.e., to provide (a) safety, by avoiding collisions with sensed obstacles, objects, and occluded regions, and (b) resolution completeness (RC), or probabilistic completeness (PC) depending on implementation, which indicates that a solution will eventually be found (if one exists) as the resolution of algorithmic parameters increases. A heuristic variant of the basic RC algorithm is also proposed to solve the task more efficiently while retaining the desirable properties. Simulation results compare using random picking and placing operations against the basic RC algorithm that reasons about object dependency, as well as its heuristic variant. The success rate is higher for the RC approaches given the same amount of time. The heuristic variant is able to solve the problem even more efficiently than the basic approach. The integration of the RC algorithm with perception, where an RGB-D sensor detects the objects as they are being moved, enables real robot demonstrations of safely retrieving target objects from a cluttered shelf.},
  keywords = {Manipulation, Perception, Planning},
  pubstate = {published},
  tppubtype = {inproceedings}
}
2022
@inproceedings{Lu:2022aa,
  title = {6N-DoF Pose Tracking for Tensegrity Robots},
  author = {S Lu and W Johnson and K Wang and X Huang and J Booth and R Kramer-Bottiglio and K Bekris},
  year = {2022},
  date = {2022-10-01},
  booktitle = {International Symposium on Robotics Research (ISRR)},
  abstract = {Tensegrity robots, which are composed of rigid compressive elements (rods) and flexible tensile elements (e.g., cables), have a variety of advantages, including flexibility, light weight, and resistance to mechanical impact. Nevertheless, the hybrid soft-rigid nature of these robots also complicates the ability to localize and track their state. This work aims to address what has been recognized as a grand challenge in this domain, i.e., the pose tracking of tensegrity robots through a markerless, vision-based method, as well as novel, onboard sensors that can measure the length of the robot's cables. In particular, an iterative optimization process is proposed to estimate the 6-DoF poses of each rigid element of a tensegrity robot from an RGB-D video as well as endcap distance measurements from the cable sensors. To ensure the pose estimates of rigid elements are physically feasible, i.e., that they do not result in collisions between rods or with the environment, physical constraints are introduced during the optimization. Real-world experiments are performed with a 3-bar tensegrity robot, which performs locomotion gaits. Given ground truth data from a motion capture system, the proposed method achieves less than 1 cm translation error and 3 degrees rotation error, which significantly outperforms alternatives. At the same time, the approach can provide pose estimates throughout the robot's motion, while motion capture often fails due to occlusions.},
  keywords = {Soft-Robots},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@inproceedings{Wang:2022aa,
  title = {A Recurrent Differentiable Engine for Modeling Tensegrity Robots Trainable with Low-Frequency Data},
  author = {K Wang and M Aanjaneya and K Bekris},
  year = {2022},
  date = {2022-07-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  abstract = {Tensegrity robots, composed of rigid rods and flexible cables, are difficult to accurately model and control given the presence of complex dynamics and a high number of DoFs. Differentiable physics engines have been recently proposed as a data-driven approach for model identification of such complex robotic systems. These engines are often executed at a high frequency to achieve accurate simulation. Ground truth trajectories for training differentiable engines, however, are not typically available at such high frequencies due to limitations of real-world sensors. The present work focuses on this frequency mismatch, which impacts modeling accuracy. We propose a recurrent structure for a differentiable physics engine of tensegrity robots, which can be trained effectively even with low-frequency trajectories. To train this new recurrent engine in a robust way, this work introduces, relative to prior work: (i) a new implicit integration scheme, (ii) a progressive training pipeline, and (iii) a differentiable collision checker. A model of NASA's icosahedron SUPERballBot on MuJoCo is used as the ground truth system to collect training data. Simulated experiments show that, once the recurrent differentiable engine has been trained on the low-frequency trajectories from MuJoCo, it is able to match the behavior of MuJoCo's system. The criterion for success is whether a locomotion strategy learned using the differentiable engine can be transferred back to the ground-truth system and result in a similar motion. Notably, the amount of ground truth data needed to train the differentiable engine, such that the policy is transferable to the ground-truth system, is 1% of the data needed to train the policy directly on the ground-truth system.},
  keywords = {Soft-Robots},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@article{McMahon:2022aa,
  title = {A Survey on the Integration of Machine Learning with Sampling-Based Motion Planning},
  author = {T McMahon and A Sivaramakrishnan and E Granados and K Bekris},
  url = {https://arxiv.org/abs/2211.08368},
  year = {2022},
  date = {2022-06-01},
  journal = {Foundations and Trends in Robotics},
  abstract = {Sampling-based methods are widely adopted solutions for robot motion planning. The methods are straightforward to implement and effective in practice for many robotic systems, and it is often possible to prove that they have desirable properties, such as probabilistic completeness and asymptotic optimality. Nevertheless, they still face challenges as the complexity of the underlying planning problem increases, especially under tight computation time constraints, which impact the quality of returned solutions, or given inaccurate models. This has motivated machine learning to improve the computational efficiency and applicability of Sampling-Based Motion Planners (SBMPs). This survey reviews such integrative efforts and aims to provide a classification of the alternative directions that have been explored in the literature. It first discusses how learning has been used to enhance key components of SBMPs, such as node sampling, collision detection, distance or nearest neighbor computation, local planning, and termination conditions. Then, it highlights planners that use learning to adaptively select between different implementations of such primitives in response to the underlying problem's features. It also covers emerging methods, which build complete machine learning pipelines that reflect the traditional structure of SBMPs, and discusses how machine learning has been used to provide data-driven models of robots, which can then be used by an SBMP. Finally, it provides a comparative discussion of the advantages and disadvantages of the approaches covered, and insights on possible future directions of research. An online version of this survey can be found at: https://prx-kinodynamic.github.io},
  keywords = {Planning},
  pubstate = {forthcoming},
  tppubtype = {article}
}
@inproceedings{McMahon:2022ab,
  title = {Terrain-Aware Learned Controllers for Sampling-Based Kinodynamic Planning Over Physically Simulated Terrains},
  author = {T McMahon and A Sivaramakrishnan and K Kedia and E Granados and K Bekris},
  url = {https://ieeexplore.ieee.org/document/9982136},
  year = {2022},
  date = {2022-06-01},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  abstract = {This paper explores learning an effective controller for improving the efficiency of kinodynamic planning for vehicular systems navigating uneven terrains. It describes the pipeline for training the corresponding controller and using it for motion planning purposes. The training process uses a soft actor-critic approach with hindsight experience replay to train a model, which is parameterized by the incline of the robot's local terrain. This trained model is then used during the expansion process of an asymptotically optimal kinodynamic planner to generate controls that allow the robot to reach desired local states. It is also used to define a heuristic cost-to-go function for the planner via a wavefront operation that estimates the cost of reaching the global goal. The cost-to-go function is used both for selecting nodes for expansion and for generating local goals for the controller to expand towards. The accompanying experimental section applies the integrated planning solution to models of all-terrain robots in a variety of physically simulated terrains. It shows that the proposed terrain-aware controller and the proposed wavefront function based on the cost-to-go model enable motion planners to find solutions in less time and with lower cost than alternatives. An ablation study emphasizes the benefits of a learned controller that is parameterized by the incline of the robot's local terrain, as well as of an incremental training process for the controller.},
  keywords = {Dynamics, Learning, Planning},
  pubstate = {published},
  tppubtype = {inproceedings}
}
Wen, B; Lian, W; Bekris, K; Schaal, S You Only Demonstrate Once: Category-Level Manipulation from Single Visual Demonstration Inproceedings Robotics: Science and Systems (RSS), 2022, (Nomination for Best Paper Award). Abstract | Links | BibTeX | Tags: Learning, Manipulation, Perception @inproceedings{Wen:2022ab, title = {You Only Demonstrate Once: Category-Level Manipulation from Single Visual Demonstration}, author = {B Wen and W Lian and K Bekris and S Schaal}, url = {https://www.roboticsproceedings.org/rss18/p044.pdf}, year = {2022}, date = {2022-06-01}, booktitle = {Robotics: Science and Systems (RSS)}, abstract = {Promising results have been achieved recently in category-level manipulation that generalizes across object instances. Nevertheless, it often requires expensive real-world data collection and manual specification of semantic keypoints for each object category and task. Additionally, coarse keypoint predictions and ignoring intermediate action sequences hinder adoption in complex manipulation tasks beyond pick-and-place. This work proposes a novel, category-level manipulation framework that leverages an object-centric, category-level representation and model-free 6 DoF motion tracking. The canonical object representation is learned solely in simulation and then used to parse a category-level, task trajectory from a single demonstration video. The demonstration is reprojected to a target trajectory tailored to a novel object via the canonical representation. During execution, the manipulation horizon is decomposed into long range, collision-free motion and last-inch manipulation. For the latter part, a category-level behavior cloning (CatBC) method leverages motion tracking to perform closed-loop control. CatBC follows the target trajectory, projected from the demonstration and anchored to a dynamically selected category-level coordinate frame. The frame is automatically selected along the manipulation horizon by a local attention mechanism. 
This framework makes it possible to teach different manipulation strategies solely by providing a single demonstration, without complicated manual programming. Extensive experiments demonstrate its efficacy in a range of challenging industrial tasks in high-precision assembly, which involve learning complex, long-horizon policies. The process exhibits robustness against uncertainty due to dynamics as well as generalization across object instances and scene configurations.}, note = {Nomination for Best Paper Award}, keywords = {Learning, Manipulation, Perception}, pubstate = {published}, tppubtype = {inproceedings} } |
Wang, R; Gao, K; Yu, J; Bekris, K Lazy Rearrangement Planning in Confined Spaces Inproceedings International Conference on Automated Planning and Scheduling (ICAPS), 2022. Abstract | Links | BibTeX | Tags: Manipulation @inproceedings{Wang:2022ac, title = {Lazy Rearrangement Planning in Confined Spaces}, author = {R Wang and K Gao and J Yu and K Bekris}, url = {https://arxiv.org/abs/2203.10379}, year = {2022}, date = {2022-06-01}, booktitle = {International Conference on Automated Planning and Scheduling (ICAPS)}, abstract = {Object rearrangement is important for many applications but remains challenging, especially in confined spaces, such as shelves, where objects cannot be accessed from above and they block reachability to each other. Such constraints require many motion planning and collision checking calls, which are computationally expensive. In addition, the arrangement space grows exponentially with the number of objects. To address these issues, this work introduces a lazy evaluation framework with a local monotone solver and a global planner. Monotone instances are those that can be solved by moving each object at most once. A key insight is that reachability constraints at the grasps for objects' starts and goals can quickly reveal dependencies between objects without having to execute expensive motion planning queries. Given that, the local solver lazily builds a search tree that respects these reachability constraints without verifying that the arm paths are collision free. It performs collision checking only when a promising solution is found. If a monotone solution is not found, the non-monotone planner loads the lazy search tree and explores ways to move objects to intermediate locations from where monotone solutions to the goal can be found. Results show that the proposed framework can solve difficult instances in confined spaces with up to 16 objects, which state-of-the-art methods fail to solve. 
It also solves problems faster than alternatives, when the alternatives find a solution. It also achieves high-quality solutions, i.e., only 1.8 additional actions on average are needed for non-monotone instances.}, keywords = {Manipulation}, pubstate = {published}, tppubtype = {inproceedings} } |
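The key pruning idea above, that grasp reachability constraints reveal object-to-object dependencies before any expensive motion planning runs, can be sketched as a toy ordering check. The `deps` encoding and the Kahn-style elimination below are our own hypothetical illustration, not the paper's solver: a monotone move order exists exactly when the dependency graph is acyclic.

```python
def monotone_order(deps):
    """deps[obj] = set of objects that must move before obj
    (e.g., they block the grasp at obj's start or goal pose).
    Returns a move order if one exists (dependency graph is acyclic),
    else None, signalling that a non-monotone plan is needed."""
    deps = {o: set(d) for o, d in deps.items()}  # work on a copy
    order = []
    ready = [o for o, d in deps.items() if not d]
    while ready:
        o = ready.pop()
        order.append(o)
        for other, d in deps.items():
            if o in d:  # o has moved, so it no longer blocks `other`
                d.discard(o)
                if not d and other not in order and other not in ready:
                    ready.append(other)
    return order if len(order) == len(deps) else None

# Hypothetical shelf scene: the box blocks the cup's goal grasp.
order = monotone_order({"cup": {"box"}, "box": set()})  # -> ['box', 'cup']
```

A lazy solver would verify arm paths with collision checking only after such a candidate order is found.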
Vieira, E; Granados, E; Sivaramakrishnan, A; Gameiro, M; Mischaikow, K; Bekris, K Morse Graphs: Topological Tools for Analyzing the Global Dynamics of Robot Controllers Inproceedings Workshop on the Algorithmic Foundations of Robotics (WAFR), 2022. Abstract | Links | BibTeX | Tags: Dynamics, Planning, Verification @inproceedings{Vieira:2022aa, title = {Morse Graphs: Topological Tools for Analyzing the Global Dynamics of Robot Controllers}, author = {E Vieira and E Granados and A Sivaramakrishnan and M Gameiro and K Mischaikow and K Bekris}, url = {https://arxiv.org/abs/2202.08383}, year = {2022}, date = {2022-06-01}, booktitle = {Workshop on the Algorithmic Foundations of Robotics (WAFR)}, abstract = {Understanding the global dynamics of a robot controller, such as identifying attractors and their regions of attraction (RoA), is important for safe deployment and synthesizing more effective hybrid controllers. This paper proposes a topological framework to analyze the global dynamics of robot controllers, even data-driven ones, in an effective and explainable way. It builds a combinatorial representation of the underlying system's state space and non-linear dynamics, which is summarized in a directed acyclic graph, the Morse graph. The approach only probes the dynamics locally by forward propagating short trajectories over a state-space discretization, which needs to be a Lipschitz-continuous function. The framework is evaluated given either numerical or data-driven controllers for classical robotic benchmarks. It is compared against established analytical and recent machine learning alternatives for estimating the RoAs of such controllers. It is shown to outperform them in accuracy and efficiency. It also provides deeper insights as it describes the global dynamics up to the discretization's resolution. 
This makes it possible to use the Morse graph to identify how to synthesize controllers to form improved hybrid solutions or how to identify the physical limitations of a robotic system.}, keywords = {Dynamics, Planning, Verification}, pubstate = {published}, tppubtype = {inproceedings} } |
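The combinatorial pipeline described in the abstract (discretize the state space, map each region forward under short rollouts, and extract the recurrent sets that become Morse graph nodes) can be sketched on a toy one-dimensional system. The dynamics x' = x - x^3, the grid resolution, and all names below are our own illustrative choices, not the paper's benchmarks or code.

```python
def flow(x, dt=0.1, steps=20):
    # Short forward rollout (Euler) of the toy dynamics x' = x - x^3,
    # which has attractors at x = +/-1 and a repeller at x = 0.
    for _ in range(steps):
        x += dt * (x - x**3)
    return x

def recurrent_cells(xmin=-2.0, xmax=2.0, n=8):
    """Combinatorial picture of the dynamics: each cell maps to the cell
    its center reaches after a short rollout; cells lying on a cycle of
    this map are recurrent, and in a Morse-graph-style decomposition the
    recurrent components become the nodes of the resulting DAG."""
    width = (xmax - xmin) / n
    center = lambda i: xmin + (i + 0.5) * width
    cell = lambda x: min(n - 1, max(0, int((x - xmin) / width)))
    succ = [cell(flow(center(i))) for i in range(n)]
    recurrent = set()
    for i in range(n):
        j = i
        for _ in range(n):          # walk into the terminal cycle
            j = succ[j]
        cycle = {j}
        k = succ[j]
        while k != j:               # collect the whole cycle
            cycle.add(k)
            k = succ[k]
        if i in cycle:
            recurrent.add(i)
    return succ, recurrent

succ, rec = recurrent_cells()
```

The non-recurrent cells would then be ordered by reachability to form the directed acyclic Morse graph; here they simply drain into the cells surrounding the equilibria at -1 and +1.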
Lu, S; Wang, R; Miao, Y; Mitash, C; Bekris, K Online Object Model Reconstruction and Reuse for Lifelong Improvement of Robot Manipulation Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022, (Nomination for Best Paper Award in Manipulation). Abstract | Links | BibTeX | Tags: Manipulation, Perception @inproceedings{Lu:2022ab, title = {Online Object Model Reconstruction and Reuse for Lifelong Improvement of Robot Manipulation}, author = {S Lu and R Wang and Y Miao and C Mitash and K Bekris}, url = {https://arxiv.org/abs/2109.13910}, year = {2022}, date = {2022-05-01}, booktitle = {IEEE International Conference on Robotics and Automation (ICRA)}, abstract = {This work proposes a robotic pipeline for picking and constrained placement of objects without geometric shape priors. Compared to recent efforts developed for similar tasks, where every object was assumed to be novel, the proposed system recognizes previously manipulated objects and performs online model reconstruction and reuse. Over a lifelong manipulation process, the system keeps learning features of objects it has interacted with and updates their reconstructed models. Whenever an instance of a previously manipulated object reappears, the system aims to first recognize it and then register its previously reconstructed model given the current observation. This step greatly reduces object shape uncertainty, allowing the system to reason even about parts of objects that are currently not observable. This also results in better manipulation efficiency as it reduces the need for active perception of the target object during manipulation. To get a reusable reconstructed model, the proposed pipeline adopts: i) TSDF for object representation, and ii) a variant of the standard particle filter algorithm for pose estimation and tracking of the partial object model. Furthermore, an effective way to construct and maintain a dataset of manipulated objects is presented. 
A sequence of real-world manipulation experiments is performed. They show how future manipulation tasks become more effective and efficient by reusing reconstructed models of previously manipulated objects, which were generated during their prior manipulation, instead of treating objects as novel every time.}, note = {Nomination for Best Paper Award in Manipulation}, keywords = {Manipulation, Perception}, pubstate = {published}, tppubtype = {inproceedings} } |
Vieira, E; Nakhimovich, D; Gao, K; Wang, R; Yu, J; Bekris, K Persistent Homology for Effective Non-Prehensile Manipulation Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. Abstract | Links | BibTeX | Tags: Manipulation @inproceedings{Vieira:2022ab, title = {Persistent Homology for Effective Non-Prehensile Manipulation}, author = {E Vieira and D Nakhimovich and K Gao and R Wang and J Yu and K Bekris}, url = {https://arxiv.org/abs/2202.02937}, year = {2022}, date = {2022-05-01}, booktitle = {IEEE International Conference on Robotics and Automation (ICRA)}, abstract = {This work explores the use of topological tools for achieving effective non-prehensile manipulation in cluttered, constrained workspaces. In particular, it proposes the use of persistent homology as a guiding principle in identifying the appropriate non-prehensile actions, such as pushing, to clean a cluttered space with a robotic arm so as to allow the retrieval of a target object. Persistent homology enables the automatic identification of connected components of blocking objects in the space without the need for manual input or tuning of parameters. The proposed algorithm uses this information to push groups of cylindrical objects together and aims to minimize the number of pushing actions needed to reach the target. Simulated experiments in a physics engine using a model of the Baxter robot show that the proposed topology-driven solution achieves a significantly higher success rate in solving such constrained problems relative to state-of-the-art alternatives from the literature. 
It manages to keep the number of pushing actions low, is computationally efficient, and the resulting decisions and motions appear natural for effectively solving such tasks.}, keywords = {Manipulation}, pubstate = {published}, tppubtype = {inproceedings} } |
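In dimension zero, persistent homology reduces to tracking when components of a point set merge as a distance threshold grows. The sketch below is our own minimal illustration of that idea with a union-find structure (the scene, radius, and function name are hypothetical): cylinders whose circular footprints overlap at a given radius form the connected blocking components that a pusher can treat as one group.

```python
import math

def blocking_components(centers, radius):
    """Group cylindrical obstacles whose footprints (disks of the given
    radius) overlap. This is the dimension-0 persistent homology
    picture: components merge once the distance threshold reaches the
    pairwise center distance, and at threshold 2*radius the surviving
    components are the connected groups of blocking objects."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) <= 2 * radius:
                parent[find(i)] = find(j)  # union overlapping disks

    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical cluttered scene: two touching cylinders and one isolated.
comps = blocking_components([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)], radius=0.6)
```

No cluster count or merging tolerance is hand-tuned; the grouping falls out of the object geometry alone, which is the property the abstract highlights.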
Mitash, C; Boularias, A; Bekris, K Physics-Based Scene-Level Reasoning for Object Pose Estimation in Clutter Journal Article International Journal of Robotics Research (IJRR), 2022. Abstract | Links | BibTeX | Tags: Perception @article{Mitash:2022aa, title = {Physics-Based Scene-Level Reasoning for Object Pose Estimation in Clutter}, author = {C Mitash and A Boularias and K Bekris}, url = {https://arxiv.org/pdf/1806.10457.pdf}, year = {2022}, date = {2022-05-01}, journal = {International Journal of Robotics Research (IJRR)}, abstract = {This paper focuses on vision-based pose estimation for multiple rigid objects placed in clutter, especially in cases involving occlusions and objects resting on each other. Progress has been achieved recently in object recognition given advancements in deep learning. Nevertheless, such tools typically require a large amount of training data and significant manual effort to label objects. This limits their applicability in robotics, where solutions must scale to a large number of objects and variety of conditions. Moreover, the combinatorial nature of the scenes that could arise from the placement of multiple objects is hard to capture in the training dataset. Thus, the learned models might not produce the desired level of precision required for tasks, such as robotic manipulation. This work proposes an autonomous process for pose estimation that spans from data generation, to scene-level reasoning and self-learning. In particular, the proposed framework first generates a labeled dataset for training a Convolutional Neural Network (CNN) for object detection in clutter. These detections are used to guide a scene-level optimization process, which considers the interactions between the different objects present in the clutter to output pose estimates of high precision. Furthermore, confident estimates are used to label online real images from multiple views and re-train the process in a self-learning pipeline. 
Experimental results indicate that this process is quickly able to identify in cluttered scenes physically-consistent object poses that are more precise than the ones found by reasoning over individual instances of objects. Furthermore, the quality of pose estimates increases over time given the self-learning process.}, keywords = {Perception}, pubstate = {published}, tppubtype = {article} } |
Liang, J; Wen, B; Bekris, K; Boularias, A Learning Sensorimotor Primitives of Sequential Manipulation Tasks from Visual Demonstrations Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. Abstract | Links | BibTeX | Tags: Manipulation @inproceedings{Liang:2022aa, title = {Learning Sensorimotor Primitives of Sequential Manipulation Tasks from Visual Demonstrations}, author = {J Liang and B Wen and K Bekris and A Boularias}, url = {https://arxiv.org/abs/2203.03797}, year = {2022}, date = {2022-05-01}, booktitle = {IEEE International Conference on Robotics and Automation (ICRA)}, abstract = {This work aims to learn how to perform complex robot manipulation tasks that are composed of several, consecutively executed low-level sub-tasks, given as input a few visual demonstrations of the tasks performed by a person. The sub-tasks consist of moving the robot's end-effector until it reaches a sub-goal region in the task space, performing an action, and triggering the next sub-task when a pre-condition is met. Most prior work in this domain has been concerned with learning only low-level tasks, such as hitting a ball or reaching an object and grasping it. This paper describes a new neural network-based framework for learning simultaneously low-level policies as well as high-level policies, such as deciding which object to pick next or where to place it relative to other objects in the scene. A key feature of the proposed approach is that the policies are learned directly from raw videos of task demonstrations, without any manual annotation or post-processing of the data. 
Empirical results on object manipulation tasks with a robotic arm show that the proposed network can efficiently learn from real visual demonstrations to perform the tasks, and outperforms popular imitation learning algorithms.}, keywords = {Manipulation}, pubstate = {published}, tppubtype = {inproceedings} } |
Granados, E; Boularias, A; Bekris, K; Aanjaneya, M Model Identification and Control of a Mobile Robot with Omnidirectional Wheels Using Differentiable Physics Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. Abstract | Links | BibTeX | Tags: Dynamics, Planning @inproceedings{Granados:2022aa, title = {Model Identification and Control of a Mobile Robot with Omnidirectional Wheels Using Differentiable Physics}, author = {E Granados and A Boularias and K Bekris and M Aanjaneya}, url = {https://orionquest.github.io/papers/MICLCMR/paper.html}, year = {2022}, date = {2022-05-01}, booktitle = {IEEE International Conference on Robotics and Automation (ICRA)}, abstract = {We present a new data-driven technique for predicting the motion of a low-cost omnidirectional mobile robot under the influence of motor torques and friction forces. Our method utilizes a novel differentiable physics engine for analytically computing the gradient of the deviation between predicted motion trajectories and real-world trajectories. This makes it possible to automatically learn and fine-tune the unknown friction coefficients on the fly, by minimizing a carefully designed loss function using gradient descent. Experiments show that the predicted trajectories are in excellent agreement with their real-world counterparts. Our proposed approach is computationally superior to existing black-box optimization methods, requiring very few real-world samples for accurate trajectory prediction compared to physics-agnostic techniques, such as neural networks. Experiments also demonstrate that the proposed method allows the robot to quickly adapt to changes in the terrain. 
Our proposed approach combines the data-efficiency of classical analytical models that are derived from first principles with the flexibility of data-driven methods, which makes it appropriate for low-cost mobile robots.}, keywords = {Dynamics, Planning}, pubstate = {published}, tppubtype = {inproceedings} } |
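The identification loop behind such methods (roll the model forward, compare against an observed trajectory, and descend along an analytically computed gradient) can be sketched on a one-dimensional toy model. The Coulomb-friction block, every constant, and all names below are our own assumptions, not the paper's differentiable physics engine or its omnidirectional-wheel model.

```python
G, DT, T = 9.81, 0.01, 10   # gravity, time step, horizon (toy values)

def rollout(mu, v0=2.0):
    # Velocity trajectory of a block sliding under Coulomb friction:
    # v_{t+1} = v_t - mu * G * DT (the block keeps sliding over this horizon).
    vs, v = [], v0
    for _ in range(T):
        v = v - mu * G * DT
        vs.append(v)
    return vs

def loss_and_grad(mu, observed):
    # Squared trajectory deviation and its analytic gradient:
    # d v_t / d mu = -(t + 1) * G * DT while the block slides.
    loss, grad = 0.0, 0.0
    for t, (v, obs) in enumerate(zip(rollout(mu), observed)):
        err = v - obs
        loss += err * err
        grad += 2.0 * err * (-(t + 1) * G * DT)
    return loss, grad

observed = rollout(0.30)    # stand-in for real-world data, true mu = 0.30
mu = 0.10                   # initial guess for the friction coefficient
for _ in range(200):
    _, g = loss_and_grad(mu, observed)
    mu -= 0.1 * g           # gradient descent recovers mu ~= 0.30
```

The same structure scales up when the analytic gradient is supplied by a differentiable simulator instead of a hand-derived formula, which is what makes the approach sample-efficient compared to black-box search.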
Gao, K; Lau, D; Huang, B; Bekris, K; Yu, J Fast High-Quality Tabletop Rearrangement in Bounded Workspace Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. Abstract | Links | BibTeX | Tags: Manipulation @inproceedings{Gao:2022aa, title = {Fast High-Quality Tabletop Rearrangement in Bounded Workspace}, author = {K Gao and D Lau and B Huang and K Bekris and J Yu}, url = {https://arxiv.org/abs/2110.12325}, year = {2022}, date = {2022-05-01}, booktitle = {IEEE International Conference on Robotics and Automation (ICRA)}, abstract = {In this paper, we examine the problem of rearranging many objects on a tabletop in a cluttered setting using overhand grasps. Efficient solutions for the problem, which capture a common task that we solve on a daily basis, are essential in enabling truly intelligent robotic manipulation. In a given instance, objects may need to be placed at temporary positions (buffers) to complete the rearrangement, but allocating these buffer locations can be highly challenging in a cluttered environment. To tackle the challenge, a two-step baseline planner is first developed, which generates a primitive plan based on inherent combinatorial constraints induced by start and goal poses of the objects and then selects buffer locations assisted by the primitive plan. We then employ the "lazy" planner in a tree search framework which is further sped up by adapting a novel preprocessing routine. Simulation experiments show our methods can quickly generate high-quality solutions and are more robust in solving large-scale instances than existing state-of-the-art approaches.}, keywords = {Manipulation}, pubstate = {published}, tppubtype = {inproceedings} } |
Wang, R; Miao, Y; Bekris, K Efficient and High-Quality Prehensile Rearrangement in Cluttered and Confined Spaces Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. Abstract | Links | BibTeX | Tags: Manipulation, Planning @inproceedings{Wang:2022ab, title = {Efficient and High-Quality Prehensile Rearrangement in Cluttered and Confined Spaces}, author = {R Wang and Y Miao and K Bekris}, url = {https://arxiv.org/abs/2110.02814}, year = {2022}, date = {2022-05-01}, booktitle = {IEEE International Conference on Robotics and Automation (ICRA)}, abstract = {Prehensile object rearrangement in cluttered and confined spaces has broad applications but is also challenging. For instance, rearranging products in a grocery shelf means that the robot cannot directly access all objects and has limited free space. This is harder than tabletop rearrangement where objects are easily accessible with top-down grasps, which simplifies robot-object interactions. This work focuses on problems where such interactions are critical for completing tasks. It proposes a new efficient and complete solver under general constraints for monotone instances, which can be solved by moving each object at most once. The monotone solver reasons about robot-object constraints and uses them to effectively prune the search space. The new monotone solver is integrated with a global planner to solve non-monotone instances with high-quality solutions fast. Furthermore, this work contributes an effective pre-processing tool to significantly speed up online motion planning queries for rearrangement in confined spaces. Experiments further demonstrate that the proposed monotone solver, equipped with the pre-processing tool, results in 57.3% faster computation and 3 times higher success rate than state-of-the-art methods. 
Similarly, the resulting global planner is computationally more efficient and has a higher success rate, while producing high-quality solutions for non-monotone instances (i.e., only 1.3 additional actions are needed on average).}, keywords = {Manipulation, Planning}, pubstate = {published}, tppubtype = {inproceedings} } |
Wen, B; Lian, W; Bekris, K; Schaal, S CatGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. Abstract | Links | BibTeX | Tags: Manipulation, Perception @inproceedings{Wen:2022aa, title = {CatGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation}, author = {B Wen and W Lian and K Bekris and S Schaal}, url = {https://arxiv.org/abs/2109.09163}, year = {2022}, date = {2022-05-01}, booktitle = {IEEE International Conference on Robotics and Automation (ICRA)}, abstract = {Task-relevant grasping is critical for industrial assembly, where downstream manipulation tasks constrain the set of valid grasps. Learning how to perform this task, however, is challenging, since task-relevant grasp labels are hard to define and annotate. There is also no consensus yet on proper representations for modeling or off-the-shelf tools for performing task-relevant grasps. This work proposes a framework to learn task-relevant grasping for industrial objects without the need for time-consuming real-world data collection or manual annotation. To achieve this, the entire framework is trained solely in simulation, including supervised training with synthetic label generation and self-supervised, hand-object interaction. In the context of this framework, this paper proposes a novel, object-centric canonical representation at the category level, which allows establishing dense correspondence across object instances and transferring task-relevant grasps to novel instances. Extensive experiments on task-relevant grasping of densely-cluttered industrial objects are conducted in both simulation and real-world setups, demonstrating the effectiveness of the proposed framework. 
Code and data are released at https://sites.google.com/view/catgrasp.}, keywords = {Manipulation, Perception}, pubstate = {published}, tppubtype = {inproceedings} } Task-relevant grasping is critical for industrial assembly, where downstream manipulation tasks constrain the set of valid grasps. Learning how to perform this task, however, is challenging, since task-relevant grasp labels are hard to define and annotate. There is also no consensus yet on proper representations for modeling or off-the-shelf tools for performing task-relevant grasps. This work proposes a framework to learn task-relevant grasping for industrial objects without the need for time-consuming real-world data collection or manual annotation. To achieve this, the entire framework is trained solely in simulation, including supervised training with synthetic label generation and self-supervised, hand-object interaction. In the context of this framework, this paper proposes a novel, object-centric canonical representation at the category level, which allows establishing dense correspondence across object instances and transferring task-relevant grasps to novel instances. Extensive experiments on task-relevant grasping of densely-cluttered industrial objects are conducted in both simulation and real-world setups, demonstrating the effectiveness of the proposed framework. Code and data are released at https://sites.google.com/view/catgrasp. |
Morgan, A; Hang, K; Wen, B; Bekris, K; Dollar, A Complex In-Hand Manipulation Via Compliance-Enabled Finger Gaiting and Multi-Modal Planning Journal Article IEEE Robotics and Automation Letters (also at ICRA), 2022. Abstract | Links | BibTeX | Tags: Manipulation, Planning @article{Morgan:2022aa, title = {Complex In-Hand Manipulation Via Compliance-Enabled Finger Gaiting and Multi-Modal Planning}, author = {A Morgan and K Hang and B Wen and K Bekris and A Dollar}, url = {https://arxiv.org/abs/2201.07928}, year = {2022}, date = {2022-05-01}, journal = {IEEE Robotics and Automation Letters (also at ICRA)}, abstract = {Constraining contacts to remain fixed on an object during manipulation limits the potential workspace size, as motion is subject to the hand's kinematic topology. Finger gaiting is one way to alleviate such restraints. It allows contacts to be freely broken and remade so as to operate on different manipulation manifolds. This capability, however, has traditionally been difficult or impossible to practically realize. A finger gaiting system must simultaneously plan for and control forces on the object while maintaining stability during contact switching. This work alleviates the traditional requirement by taking advantage of system compliance, allowing the hand to more easily switch contacts while maintaining a stable grasp. Our method achieves complete SO(3) finger gaiting control of grasped objects against gravity by developing a manipulation planner that operates via orthogonal safe modes of a compliant, underactuated hand absent of tactile sensors or joint encoders. During manipulation, a low-latency 6D pose object tracker provides feedback via vision, allowing the planner to update its plan online so as to adaptively recover from trajectory deviations. The efficacy of this method is showcased by manipulating both convex and non-convex objects on a real robot. Its robustness is evaluated via perturbation rejection and long trajectory goals. 
To the best of the authors' knowledge, this is the first work that has autonomously achieved full SO(3) control of objects within-hand via finger gaiting and without a support surface, elucidating a valuable step towards realizing true robot in-hand manipulation capabilities.}, keywords = {Manipulation, Planning}, pubstate = {published}, tppubtype = {article} } Constraining contacts to remain fixed on an object during manipulation limits the potential workspace size, as motion is subject to the hand's kinematic topology. Finger gaiting is one way to alleviate such restraints. It allows contacts to be freely broken and remade so as to operate on different manipulation manifolds. This capability, however, has traditionally been difficult or impossible to practically realize. A finger gaiting system must simultaneously plan for and control forces on the object while maintaining stability during contact switching. This work alleviates the traditional requirement by taking advantage of system compliance, allowing the hand to more easily switch contacts while maintaining a stable grasp. Our method achieves complete SO(3) finger gaiting control of grasped objects against gravity by developing a manipulation planner that operates via orthogonal safe modes of a compliant, underactuated hand absent of tactile sensors or joint encoders. During manipulation, a low-latency 6D pose object tracker provides feedback via vision, allowing the planner to update its plan online so as to adaptively recover from trajectory deviations. The efficacy of this method is showcased by manipulating both convex and non-convex objects on a real robot. Its robustness is evaluated via perturbation rejection and long trajectory goals. 
To the best of the authors' knowledge, this is the first work that has autonomously achieved full SO(3) control of objects within-hand via finger gaiting and without a support surface, elucidating a valuable step towards realizing true robot in-hand manipulation capabilities. |
Miao, Y; Wang, R; Bekris, K Safe, Occlusion-Aware Manipulation for Online Object Reconstruction in Confined Space Inproceedings International Symposium on Robotics Research (ISRR), 2022. Abstract | Links | BibTeX | Tags: Manipulation, Planning @inproceedings{Miao:2022aa, title = {Safe, Occlusion-Aware Manipulation for Online Object Reconstruction in Confined Space}, author = {Y Miao and R Wang and K Bekris}, url = {https://arxiv.org/abs/2205.11719}, year = {2022}, date = {2022-01-01}, booktitle = {International Symposium on Robotics Research (ISRR)}, abstract = {Recent work in robotic manipulation focuses on object retrieval in cluttered space under occlusion. Nevertheless, the majority of efforts either lack an analysis of the conditions under which the approaches are complete, or apply only when objects can be removed from the workspace. This work formulates the general, occlusion-aware manipulation task, and focuses on safe object reconstruction in a confined space with in-place relocation. A framework that ensures safety with completeness guarantees is proposed. Furthermore, an algorithm, which is an instantiation of this framework for monotone instances, is developed and evaluated empirically by comparing against a random and a greedy baseline on randomly generated experiments in simulation. Even for cluttered scenes with realistic objects, the proposed algorithm significantly outperforms the baselines and maintains a high success rate across experimental conditions.}, keywords = {Manipulation, Planning}, pubstate = {published}, tppubtype = {inproceedings} } Recent work in robotic manipulation focuses on object retrieval in cluttered space under occlusion. Nevertheless, the majority of efforts either lack an analysis of the conditions under which the approaches are complete, or apply only when objects can be removed from the workspace. 
This work formulates the general, occlusion-aware manipulation task, and focuses on safe object reconstruction in a confined space with in-place relocation. A framework that ensures safety with completeness guarantees is proposed. Furthermore, an algorithm, which is an instantiation of this framework for monotone instances, is developed and evaluated empirically by comparing against a random and a greedy baseline on randomly generated experiments in simulation. Even for cluttered scenes with realistic objects, the proposed algorithm significantly outperforms the baselines and maintains a high success rate across experimental conditions. |
2021 |
Shah, D; Booth, J; Baines, R; Wang, K; Vespignani, M; Bekris, K; Kramer-Bottiglio, R Tensegrity Robotics Journal Article Soft Robotics, 2021. Abstract | Links | BibTeX | Tags: Soft-Robots @article{Shah:2021aa, title = {Tensegrity Robotics}, author = {D Shah and J Booth and R Baines and K Wang and M Vespignani and K Bekris and R Kramer-Bottiglio}, doi = {10.1089/soro.2020.0170}, year = {2021}, date = {2021-12-01}, journal = {Soft Robotics}, abstract = {Numerous recent advances in robotics have been inspired by the biological principle of tensile integrity --- or ``tensegrity''--- to achieve remarkable feats of dexterity and resilience. Tensegrity robots contain compliant networks of rigid struts and soft cables, allowing them to change their shape by adjusting their internal tension. Local rigidity along the struts provides support to carry electronics and scientific payloads, while global compliance enabled by the flexible interconnections of struts and cables allows a tensegrity to distribute impacts and prevent damage. Numerous techniques have been proposed for designing and simulating tensegrity robots, giving rise to a wide range of locomotion modes including rolling, vibrating, hopping, and crawling. Here, we review progress in the burgeoning field of tensegrity robotics, highlighting several emerging challenges, including automated design, state sensing, and kinodynamic motion planning.}, keywords = {Soft-Robots}, pubstate = {published}, tppubtype = {article} } Numerous recent advances in robotics have been inspired by the biological principle of tensile integrity --- or ``tensegrity''--- to achieve remarkable feats of dexterity and resilience. Tensegrity robots contain compliant networks of rigid struts and soft cables, allowing them to change their shape by adjusting their internal tension. 
Local rigidity along the struts provides support to carry electronics and scientific payloads, while global compliance enabled by the flexible interconnections of struts and cables allows a tensegrity to distribute impacts and prevent damage. Numerous techniques have been proposed for designing and simulating tensegrity robots, giving rise to a wide range of locomotion modes including rolling, vibrating, hopping, and crawling. Here, we review progress in the burgeoning field of tensegrity robotics, highlighting several emerging challenges, including automated design, state sensing, and kinodynamic motion planning. |
Meng, P; Wang, W; Balkcom, D; Bekris, K Proof-Of-Concept Designs for the Assembly of Modular, Dynamic Tensegrities into Easily Deployable Structures Conference ASCE Earth and Space Conference 2021, Seattle, WA, 2021. Abstract | Links | BibTeX | Tags: Soft-Robots @conference{Meng:2021aa, title = {Proof-Of-Concept Designs for the Assembly of Modular, Dynamic Tensegrities into Easily Deployable Structures}, author = {P Meng and W Wang and D Balkcom and K Bekris}, url = {https://par.nsf.gov/servlets/purl/10294210}, year = {2021}, date = {2021-10-01}, booktitle = {ASCE Earth and Space Conference 2021}, address = {Seattle, WA}, abstract = {Dynamic tensegrity robots are inspired by tensegrity structures in architecture; arrangements of rigid rods and flexible elements allow the robots to deform. This work proposes the use of multiple, modular, tensegrity robots that can move and compliantly connect to assemble larger, compliant, lightweight, strong structures and scaffolding. The focus is on proof-of-concept designs for the modular robots themselves and their docking mechanisms, which can allow the easy deployment of structures in unstructured environments. These mechanisms include (electro)magnets to allow each individual robot to connect and disconnect on cue. An exciting direction is the design of specific module and structure designs to fit the mission at hand. For example, this work highlights how the considered three bar structures could stack to form a column or deform on one side to create an arch. A critical component of future work will involve the development of algorithms for automatic design and layout of modules in structures.}, keywords = {Soft-Robots}, pubstate = {published}, tppubtype = {conference} } Dynamic tensegrity robots are inspired by tensegrity structures in architecture; arrangements of rigid rods and flexible elements allow the robots to deform. This work proposes the use of multiple, modular, tensegrity robots that can move and compliantly connect to assemble larger, compliant, lightweight, strong structures and scaffolding. 
The focus is on proof-of-concept designs for the modular robots themselves and their docking mechanisms, which can allow the easy deployment of structures in unstructured environments. These mechanisms include (electro)magnets to allow each individual robot to connect and disconnect on cue. An exciting direction is the design of specific module and structure designs to fit the mission at hand. For example, this work highlights how the considered three bar structures could stack to form a column or deform on one side to create an arch. A critical component of future work will involve the development of algorithms for automatic design and layout of modules in structures. |
Wang, K; Aanjaneya, M; Bekris, K Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. Abstract | Links | BibTeX | Tags: Dynamics, Learning, Soft-Robots @inproceedings{Wang:2021ab, title = {Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots}, author = {K Wang and M Aanjaneya and K Bekris}, url = {https://arxiv.org/abs/2011.04929}, year = {2021}, date = {2021-09-01}, booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, abstract = {Learning policies in simulation is promising for reducing human effort when training robot controllers. This is especially true for soft robots that are more adaptive and safe but also more difficult to accurately model and control. The sim2real gap is the main barrier to successfully transfer policies from simulation to a real robot. System identification can be applied to reduce this gap but traditional identification methods require a lot of manual tuning. Data-driven alternatives can tune dynamical models directly from data but are often data hungry, which also incorporates human effort in collecting data. This work proposes a data-driven, end-to-end differentiable simulator focused on the exciting but challenging domain of tensegrity robots. To the best of the authors' knowledge, this is the first differentiable physics engine for tensegrity robots that supports cable, contact, and actuation modeling. The aim is to develop a reasonably simplified, data-driven simulation, which can learn approximate dynamics with limited ground truth data. The dynamics must be accurate enough to generate policies that can be transferred back to the ground-truth system. As a first step in this direction, the current work demonstrates sim2sim transfer, where the unknown physical model of MuJoCo acts as a ground truth system. 
Two different tensegrity robots are used for evaluation and learning of locomotion policies, a 6-bar and a 3-bar tensegrity. The results indicate that only 0.25% of ground truth data are needed to train a policy that works on the ground truth system when the differentiable engine is used for training, compared to training the policy directly on the ground truth system.}, keywords = {Dynamics, Learning, Soft-Robots}, pubstate = {published}, tppubtype = {inproceedings} } Learning policies in simulation is promising for reducing human effort when training robot controllers. This is especially true for soft robots that are more adaptive and safe but also more difficult to accurately model and control. The sim2real gap is the main barrier to successfully transfer policies from simulation to a real robot. System identification can be applied to reduce this gap but traditional identification methods require a lot of manual tuning. Data-driven alternatives can tune dynamical models directly from data but are often data hungry, which also incorporates human effort in collecting data. This work proposes a data-driven, end-to-end differentiable simulator focused on the exciting but challenging domain of tensegrity robots. To the best of the authors' knowledge, this is the first differentiable physics engine for tensegrity robots that supports cable, contact, and actuation modeling. The aim is to develop a reasonably simplified, data-driven simulation, which can learn approximate dynamics with limited ground truth data. The dynamics must be accurate enough to generate policies that can be transferred back to the ground-truth system. As a first step in this direction, the current work demonstrates sim2sim transfer, where the unknown physical model of MuJoCo acts as a ground truth system. Two different tensegrity robots are used for evaluation and learning of locomotion policies, a 6-bar and a 3-bar tensegrity. 
The results indicate that only 0.25% of ground truth data are needed to train a policy that works on the ground truth system when the differentiable engine is used for training, compared to training the policy directly on the ground truth system. |
Sivaramakrishnan, A; Granados, E; Karten, S; McMahon, T; Bekris, K Improving Kinodynamic Planners for Vehicular Navigation with Learned Goal-Reaching Controllers Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. Abstract | Links | BibTeX | Tags: Dynamics, Planning @inproceedings{Sivaramakrishnan:2021aa, title = {Improving Kinodynamic Planners for Vehicular Navigation with Learned Goal-Reaching Controllers}, author = {A Sivaramakrishnan and E Granados and S Karten and T McMahon and K Bekris}, url = {https://arxiv.org/pdf/2110.04238}, year = {2021}, date = {2021-09-01}, booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, abstract = {This paper aims to improve the path quality and computational efficiency of sampling-based kinodynamic planners for vehicular navigation. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based planners. Given a dynamics model, a reinforcement learning process is trained offline to return a low-cost control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles. By focusing on the system's dynamics and not knowing the environment, this process is data-efficient and takes place once for a robotic system. In this way, it can be reused in different environments. The planner generates online local goal states for the learned controller in an informed manner to bias towards the goal and consecutively in an exploratory, random manner. For the informed expansion, local goal states are generated either via (a) medial axis information in environments with obstacles, or (b) wavefront information for setups with traversability costs. The learning process and the resulting planning framework are evaluated for a first and second-order differential drive system, as well as a physically simulated Segway robot. 
The results show that the proposed integration of learning and planning can produce higher quality paths than sampling-based kinodynamic planning with random controls in fewer iterations and computation time.}, keywords = {Dynamics, Planning}, pubstate = {published}, tppubtype = {inproceedings} } This paper aims to improve the path quality and computational efficiency of sampling-based kinodynamic planners for vehicular navigation. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based planners. Given a dynamics model, a reinforcement learning process is trained offline to return a low-cost control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles. By focusing on the system's dynamics and not knowing the environment, this process is data-efficient and takes place once for a robotic system. In this way, it can be reused in different environments. The planner generates online local goal states for the learned controller in an informed manner to bias towards the goal and consecutively in an exploratory, random manner. For the informed expansion, local goal states are generated either via (a) medial axis information in environments with obstacles, or (b) wavefront information for setups with traversability costs. The learning process and the resulting planning framework are evaluated for a first and second-order differential drive system, as well as a physically simulated Segway robot. The results show that the proposed integration of learning and planning can produce higher quality paths than sampling-based kinodynamic planning with random controls in fewer iterations and computation time. |
Wen, B; Bekris, K BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. Abstract | Links | BibTeX | Tags: Perception @inproceedings{Wen:2021aa, title = {BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models}, author = {B Wen and K Bekris}, url = {https://arxiv.org/abs/2108.00516}, year = {2021}, date = {2021-09-01}, booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, abstract = {Tracking the 6D pose of objects in video sequences is important for robot manipulation. Prior efforts, however, often assume that the target object's CAD model, at least at a category-level, is available for offline training or during online template matching. This work proposes BundleTrack, a general framework for 6D pose tracking of novel objects, which does not depend upon instance or category-level 3D models. It leverages the complementary attributes of recent advances in deep learning for segmentation and robust feature extraction, as well as memory-augmented pose-graph optimization for achieving spatiotemporal consistency. This enables long-term, low-drift tracking under various challenging scenarios, including significant occlusions and object motions. Comprehensive experiments given two public benchmarks demonstrate that the proposed approach significantly outperforms state-of-the-art category-level 6D tracking or dynamic-SLAM methods. When compared against state-of-the-art methods that rely on an object instance CAD model, comparable performance is achieved, despite the proposed method's reduced information requirements. 
An efficient implementation in CUDA provides real-time performance of 10 Hz for the entire framework.}, keywords = {Perception}, pubstate = {published}, tppubtype = {inproceedings} } Tracking the 6D pose of objects in video sequences is important for robot manipulation. Prior efforts, however, often assume that the target object's CAD model, at least at a category-level, is available for offline training or during online template matching. This work proposes BundleTrack, a general framework for 6D pose tracking of novel objects, which does not depend upon instance or category-level 3D models. It leverages the complementary attributes of recent advances in deep learning for segmentation and robust feature extraction, as well as memory-augmented pose-graph optimization for achieving spatiotemporal consistency. This enables long-term, low-drift tracking under various challenging scenarios, including significant occlusions and object motions. Comprehensive experiments given two public benchmarks demonstrate that the proposed approach significantly outperforms state-of-the-art category-level 6D tracking or dynamic-SLAM methods. When compared against state-of-the-art methods that rely on an object instance CAD model, comparable performance is achieved, despite the proposed method's reduced information requirements. An efficient implementation in CUDA provides real-time performance of 10 Hz for the entire framework. |
Morgan, A; Wen, B; Liang, J; Boularias, A; Dollar, A; Bekris, K Vision-Driven Compliant Manipulation for Reliable, High-Precision Assembly Tasks Conference Robotics: Science and Systems, 2021. Abstract | BibTeX | Tags: Manipulation, Perception @conference{Morgan:2021aa, title = {Vision-Driven Compliant Manipulation for Reliable, High-Precision Assembly Tasks}, author = {A Morgan and B Wen and J Liang and A Boularias and A Dollar and K Bekris}, year = {2021}, date = {2021-07-01}, booktitle = {Robotics: Science and Systems}, abstract = {Highly constrained manipulation tasks continue to be challenging for autonomous robots as they require high levels of precision, typically less than 1mm, which is often incompatible with what can be achieved by traditional perception systems. This paper demonstrates that the combination of state-of-the-art object tracking with passively adaptive mechanical hardware can be leveraged to complete precision manipulation tasks with tight, industrially-relevant tolerances (0.25mm). The proposed control method closes the loop through vision by tracking the relative 6D pose of objects in the relevant workspace. It adjusts the control reference of both the compliant manipulator and the hand to complete object insertion tasks via within-hand manipulation. Contrary to previous efforts for insertion, our method does not require expensive force sensors, precision manipulators, or time-consuming, online learning, which is data hungry. Instead, this effort leverages mechanical compliance and utilizes an object-agnostic manipulation model of the hand learned offline, off-the-shelf motion planning, and an RGBD-based object tracker trained solely with synthetic data. These features allow the proposed system to easily generalize and transfer to new tasks and environments. 
This paper describes in detail the system components and showcases its efficacy with extensive experiments involving tight tolerance peg-in-hole insertion tasks of various geometries as well as open-world constrained placement tasks.}, keywords = {Manipulation, Perception}, pubstate = {published}, tppubtype = {conference} } Highly constrained manipulation tasks continue to be challenging for autonomous robots as they require high levels of precision, typically less than 1mm, which is often incompatible with what can be achieved by traditional perception systems. This paper demonstrates that the combination of state-of-the-art object tracking with passively adaptive mechanical hardware can be leveraged to complete precision manipulation tasks with tight, industrially-relevant tolerances (0.25mm). The proposed control method closes the loop through vision by tracking the relative 6D pose of objects in the relevant workspace. It adjusts the control reference of both the compliant manipulator and the hand to complete object insertion tasks via within-hand manipulation. Contrary to previous efforts for insertion, our method does not require expensive force sensors, precision manipulators, or time-consuming, online learning, which is data hungry. Instead, this effort leverages mechanical compliance and utilizes an object-agnostic manipulation model of the hand learned offline, off-the-shelf motion planning, and an RGBD-based object tracker trained solely with synthetic data. These features allow the proposed system to easily generalize and transfer to new tasks and environments. This paper describes in detail the system components and showcases its efficacy with extensive experiments involving tight tolerance peg-in-hole insertion tasks of various geometries as well as open-world constrained placement tasks. |
Wang, R; Gao, K; Nakhimovich, D; Yu, J; Bekris, K Uniform Object Rearrangement: From Complete Monotone Primitives to Efficient Non-Monotone Informed Search Inproceedings International Conference on Robotics and Automation (ICRA) 2021, 2021. Abstract | Links | BibTeX | Tags: Manipulation @inproceedings{Wang:2021ac, title = {Uniform Object Rearrangement: From Complete Monotone Primitives to Efficient Non-Monotone Informed Search}, author = {R Wang and K Gao and D Nakhimovich and J Yu and K Bekris}, url = {https://ieeexplore.ieee.org/document/9561716}, year = {2021}, date = {2021-05-01}, booktitle = {International Conference on Robotics and Automation (ICRA) 2021}, abstract = {Object rearrangement is a widely-applicable and challenging task for robots. Geometric constraints must be carefully examined to avoid collisions and combinatorial issues arise as the number of objects increases. This work studies the algorithmic structure of rearranging uniform objects, where robot-object collisions do not occur but object-object collisions have to be avoided. The objective is minimizing the number of object transfers under the assumption that the robot can manipulate one object at a time. An efficiently computable decomposition of the configuration space is used to create a "region graph", which classifies all continuous paths of equivalent collision possibilities. Based on this compact but rich representation, a complete dynamic programming primitive DFSDP performs a recursive depth first search to solve monotone problems quickly, i.e., those instances that do not require objects to be moved first to an intermediate buffer. DFSDP is extended to solve single-buffer, non-monotone instances, given a choice of an object and a buffer. This work utilizes these primitives as local planners in an informed search framework for more general, non-monotone instances. The search utilizes partial solutions from the primitives to identify the most promising choice of objects and buffers. 
Experiments demonstrate that the proposed solution returns near-optimal paths with higher success rate, even for challenging non-monotone instances, than other leading alternatives.}, keywords = {Manipulation}, pubstate = {published}, tppubtype = {inproceedings} } Object rearrangement is a widely-applicable and challenging task for robots. Geometric constraints must be carefully examined to avoid collisions and combinatorial issues arise as the number of objects increases. This work studies the algorithmic structure of rearranging uniform objects, where robot-object collisions do not occur but object-object collisions have to be avoided. The objective is minimizing the number of object transfers under the assumption that the robot can manipulate one object at a time. An efficiently computable decomposition of the configuration space is used to create a "region graph", which classifies all continuous paths of equivalent collision possibilities. Based on this compact but rich representation, a complete dynamic programming primitive DFSDP performs a recursive depth first search to solve monotone problems quickly, i.e., those instances that do not require objects to be moved first to an intermediate buffer. DFSDP is extended to solve single-buffer, non-monotone instances, given a choice of an object and a buffer. This work utilizes these primitives as local planners in an informed search framework for more general, non-monotone instances. The search utilizes partial solutions from the primitives to identify the most promising choice of objects and buffers. Experiments demonstrate that the proposed solution returns near-optimal paths with higher success rate, even for challenging non-monotone instances, than other leading alternatives. |
Shome, R; Solovey, K; Yu, J; Bekris, K; Halperin, D: Fast, High-Quality Two-Arm Rearrangement in Synchronous, Monotone Tabletop Setups. IEEE Transactions on Automation Science and Engineering, 2021.

@article{Shome:2021aa,
  title = {Fast, High-Quality Two-Arm Rearrangement in Synchronous, Monotone Tabletop Setups},
  author = {R Shome and K Solovey and J Yu and K Bekris and D Halperin},
  url = {https://arxiv.org/abs/1810.12202},
  year = {2021},
  date = {2021-03-01},
  journal = {IEEE Transactions on Automation Science and Engineering},
  abstract = {Rearranging objects on a planar surface arises in a variety of robotic applications, such as product packaging. Using two arms can improve efficiency but introduces new computational challenges. This paper studies the problem structure of object rearrangement using two arms in synchronous, monotone tabletop setups and develops an optimal mixed integer model. It then describes an efficient and scalable algorithm, which first minimizes the cost of object transfers and then of moves between objects. This is motivated by the fact that, asymptotically, object transfers dominate the cost of solutions. Moreover, a lazy strategy minimizes the number of motion planning calls and results in significant speedups. Theoretical arguments support the benefits of using two arms and indicate that synchronous execution, in which the two arms perform together either transfers or moves, introduces only a small overhead. Experiments support these claims and show that the scalable method can quickly compute solutions close to the optimal for the considered setup.},
  keywords = {Manipulation, Planning},
  pubstate = {published},
  tppubtype = {article}
}
Surovik, D; Wang, K; Vespignani, M; Bruce, J; Bekris, K: Adaptive Tensegrity Locomotion: Controlling a Compliant Icosahedron with Symmetry-Reduced Reinforcement Learning. International Journal of Robotics Research (IJRR), 2021.

@article{Surovik:2021aa,
  title = {Adaptive Tensegrity Locomotion: Controlling a Compliant Icosahedron with Symmetry-Reduced Reinforcement Learning},
  author = {D Surovik and K Wang and M Vespignani and J Bruce and K Bekris},
  url = {https://www.cs.rutgers.edu/~kb572/pubs/reinf_learning_tensegrity_locomotion.pdf},
  year = {2021},
  date = {2021-01-01},
  journal = {International Journal of Robotics Research (IJRR)},
  abstract = {Tensegrity robots, which are prototypical examples of hybrid soft-rigid robots, exhibit dynamical properties that provide ruggedness and adaptability. They also bring about, however, major challenges for locomotion control. Due to high dimensionality and the complex evolution of contact states, data-driven approaches are appropriate for producing viable feedback policies for tensegrities. Guided Policy Search (GPS), a sample-efficient hybrid framework for optimization and reinforcement learning, has previously been applied to generate periodic, axis-constrained locomotion by an icosahedral tensegrity on flat ground. Varying environments and tasks, however, create a need for more adaptive and general locomotion control that actively utilizes an expanded space of robot states. This implies significantly higher needs in terms of sample data and setup effort. This work mitigates such requirements by proposing a new GPS-based reinforcement learning pipeline, which exploits the vehicle's high degree of symmetry and appropriately learns contextual behaviors that are sustainable without periodicity. Newly achieved capabilities include axially-unconstrained rolling, rough terrain traversal, and rough incline ascent. These tasks are evaluated for a small variety of key model parameters in simulation and tested on the NASA hardware prototype, SUPERball. Results confirm the utility of symmetry exploitation and the adaptability of the vehicle. They also shed light on numerous strengths and limitations of the GPS framework for policy design and transfer to real hybrid soft-rigid robots.},
  keywords = {Soft-Robots},
  pubstate = {published},
  tppubtype = {article}
}
Wang, R; Nakhimovich, D; Roberts, F; Bekris, K: Robotics As an Enabler of Resiliency to Disasters: Promises and Pitfalls. Lecture Notes in Computer Science, vol. 12660, pp. 75-101, Springer, 2021.

@inbook{Wang:2021aa,
  title = {Robotics As an Enabler of Resiliency to Disasters: Promises and Pitfalls},
  author = {R Wang and D Nakhimovich and F Roberts and K Bekris},
  url = {http://www.cs.rutgers.edu/~kb572/pubs/Robotics_Enabler_Resiliency_Disasters.pdf},
  year = {2021},
  date = {2021-01-01},
  volume = {12660},
  pages = {75--101},
  publisher = {Springer},
  series = {Lecture Notes in Computer Science},
  abstract = {The COVID-19 pandemic is a reminder that modern society is still susceptible to multiple types of natural or man-made disasters, which motivates the need to improve resiliency through technological advancement. This article focuses on robotics and the role it can play towards providing resiliency to disasters. The progress in this domain brings the promise of effectively deploying robots in response to life-threatening disasters, which include highly unstructured setups and hazardous spaces inaccessible or harmful to humans. This article discusses the maturity of robotics technology and explores the needed advances that will allow robots to become more capable and robust in disaster response measures. It also explores how robots can help in making human and natural environments preemptively more resilient without compromising long-term prospects for economic development. Despite its promise, there are also concerns that arise from the deployment of robots. Those discussed relate to safety considerations, privacy infringement, cyber-security, and financial aspects, such as the cost of development and maintenance as well as the impact on employment.},
  keywords = {Other},
  pubstate = {published},
  tppubtype = {inbook}
}
Feng, S; Guo, T; Bekris, K; Yu, J: Team RuBot's Experiences and Lessons from the ARIAC. Robotics and Computer-Integrated Manufacturing, vol. 70, 2021.

@article{Feng:2021aa,
  title = {Team RuBot's Experiences and Lessons from the ARIAC},
  author = {S Feng and T Guo and K Bekris and J Yu},
  editor = {Erez Karpas},
  url = {https://www.sciencedirect.com/science/article/abs/pii/S0736584521000120},
  year = {2021},
  date = {2021-01-01},
  journal = {Robotics and Computer-Integrated Manufacturing},
  volume = {70},
  abstract = {We share experiences and lessons learned in participating in the annual Agile Robotics for Industrial Automation Competition (ARIAC). ARIAC is a simulation-based competition focused on pushing the agility of robotic systems for handling industrial pick-and-place challenges. Team RuBot started competing in 2019, placing 2nd in ARIAC 2019 and 3rd in ARIAC 2020. The article also discusses the difficulties we faced during the contest and our strategies for tackling them.},
  keywords = {Manipulation},
  pubstate = {published},
  tppubtype = {article}
}
2020
Wen, B; Mitash, C; Ren, B; Bekris, K: se(3)-TrackNet: Data-Driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, 2020.

@conference{Wen:2020ab,
  title = {se(3)-TrackNet: Data-Driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains},
  author = {B Wen and C Mitash and B Ren and K Bekris},
  url = {http://arxiv.org/abs/2007.13866},
  year = {2020},
  date = {2020-10-01},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  address = {Las Vegas, NV},
  abstract = {Tracking the 6D pose of objects in video sequences is important for robot manipulation. This task, however, introduces multiple challenges: (i) robot manipulation involves significant occlusions; (ii) data and annotations are troublesome and difficult to collect for 6D poses, which complicates machine learning solutions; and (iii) incremental error drift often accumulates in long-term tracking, necessitating re-initialization of the object's pose. This work proposes a data-driven optimization approach for long-term, 6D pose tracking. It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model. The key contribution in this context is a novel neural network architecture, which appropriately disentangles the feature encoding to help reduce domain shift, and an effective 3D orientation representation via Lie algebra. Consequently, the network, even when trained only with synthetic data, can work effectively over real images. Comprehensive experiments over benchmarks - existing ones as well as a new dataset with significant occlusions related to object manipulation - show that the proposed approach achieves consistently robust estimates and outperforms alternatives, even though they have been trained with real images. The approach is also the most computationally efficient among the alternatives and achieves a tracking frequency of 90.9 Hz.},
  keywords = {Learning, Perception},
  pubstate = {published},
  tppubtype = {conference}
}
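The "3D orientation representation via Lie algebra" mentioned above is commonly realized by predicting an axis-angle vector in so(3) and mapping it to a rotation matrix in SO(3) with the exponential map (Rodrigues' formula). The following is a minimal, pure-Python illustration of that mapping, not the paper's network code:

```python
# Exponential map so(3) -> SO(3) via Rodrigues' formula:
# R = I + sin(t) K + (1 - cos(t)) K^2, where t = ||w|| and K = [w/t]_x.
import math

def so3_exp(w):
    """Map an axis-angle vector w (a 3-list, in so(3)) to a 3x3 rotation matrix."""
    t = math.sqrt(sum(c * c for c in w))
    if t < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / t for c in w)
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]          # skew-symmetric [k]_x
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3)) for j in range(3)] for i in range(3)]
    I = [[float(i == j) for j in range(3)] for i in range(3)]
    s, c = math.sin(t), math.cos(t)
    return [[I[i][j] + s * K[i][j] + (1.0 - c) * K2[i][j] for j in range(3)] for i in range(3)]
```

This parameterization is minimal (3 numbers) and singularity-free near the identity, which is one reason it is popular for regressing small relative rotations between consecutive frames.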
Mitash, C; Shome, R; Wen, B; Boularias, A; Bekris, K: Task-Driven Perception and Manipulation for Constrained Placement of Unknown Objects. IEEE Robotics and Automation Letters (RA-L) (also appearing at IEEE/RSJ IROS 2020), 2020.

@article{Mitash:2020ab,
  title = {Task-Driven Perception and Manipulation for Constrained Placement of Unknown Objects},
  author = {C Mitash and R Shome and B Wen and A Boularias and K Bekris},
  url = {https://arxiv.org/abs/2006.15503},
  year = {2020},
  date = {2020-10-01},
  journal = {IEEE Robotics and Automation Letters (RA-L) (also appearing at IEEE/RSJ IROS 2020)},
  abstract = {Recent progress in robotic manipulation has dealt with the case of no prior object models in the context of relatively simple tasks, such as bin-picking. Existing methods for more constrained problems, however, such as deliberate placement in a tight region, depend more critically on shape information to achieve safe execution. This work introduces a possibilistic object representation for solving constrained placement tasks without shape priors. A perception method is proposed to track and update the object representation during motion execution, which respects physical and geometric constraints. The method operates directly over sensor data, modeling the seen and unseen parts of the object given observations. It results in a dynamically updated conservative representation, which can be used to plan safe manipulation actions. This task-driven perception process is integrated with a manipulation task planning architecture for a dual-arm manipulator to discover efficient solutions for the constrained placement task with minimal sensing. The planning process can make use of handoff operations when necessary for safe placement given the conservative representation. The pipeline is evaluated with data from over 240 real-world experiments involving constrained placement of various unknown objects using a dual-arm manipulator. While straightforward pick-sense-and-place architectures frequently fail to solve these problems, the proposed integrated pipeline achieves more than 95% success and faster execution times.},
  keywords = {Manipulation, Perception, Planning},
  pubstate = {published},
  tppubtype = {article}
}
Mitash, C: Scalable, Physics-Aware 6D Pose Estimation for Robot Manipulation. PhD Thesis, Rutgers University, 2020.

@phdthesis{Mitash:2020aa,
  title = {Scalable, Physics-Aware 6D Pose Estimation for Robot Manipulation},
  author = {C Mitash},
  url = {https://rucore.libraries.rutgers.edu/rutgers-lib/64961/},
  year = {2020},
  date = {2020-09-01},
  school = {Rutgers University},
  abstract = {Robot manipulation often depends on some form of pose estimation to represent the state of the world and allow decision making both at the task level and for motion or grasp planning. Recent progress in deep learning gives hope for a pose estimation solution that could generalize over textured and texture-less objects, objects with or without distinctive shape properties, and under different lighting conditions and clutter scenarios. Nevertheless, it gives rise to a new set of challenges, such as the painful task of acquiring large-scale labeled training datasets and of dealing with their stochastic output over unforeseen scenarios that are not captured by the training. This restricts the scalability of such pose estimation solutions in robot manipulation tasks that often deal with a variety of objects and changing environments. The thesis first describes an automatic data generation and learning framework to address the scalability challenge. Learning is bootstrapped by generating labeled data via physics simulation and rendering. Then it self-improves over time by acquiring and labeling real-world images via a search-based pose estimation process. The thesis proposes algorithms to generate and validate object poses online based on the objects' geometry and on the physical consistency of their scene-level interactions. These algorithms provide robustness even when there exists a domain gap between the synthetic training and the real test scenarios. Finally, the thesis proposes a manipulation planning framework that goes beyond model-based pose estimation. By utilizing a dynamic object representation, this integrated perception and manipulation framework can efficiently solve the task of picking unknown objects and placing them in a constrained space. The algorithms are evaluated over real-world robot manipulation experiments and over large-scale public datasets. The results indicate the usefulness of physical constraints in both the training and the online estimation phase. Moreover, the proposed framework, while utilizing only simulated data, can obtain robust estimates in challenging scenarios such as densely-packed bins and clutter, where other approaches suffer as a result of large occlusions and ambiguities due to similar-looking texture-less surfaces.},
  keywords = {Manipulation, Perception, Planning},
  pubstate = {published},
  tppubtype = {phdthesis}
}
Kleinbort, M; Solovey, K; Bonalli, R; Granados, E; Bekris, K; Halperin, D: Refined Analysis of Asymptotically-Optimal Kinodynamic Planning in the State-Cost Space. IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 2020.

@conference{Kleinbort:2020aa,
  title = {Refined Analysis of Asymptotically-Optimal Kinodynamic Planning in the State-Cost Space},
  author = {M Kleinbort and K Solovey and R Bonalli and E Granados and K Bekris and D Halperin},
  url = {https://arxiv.org/abs/1909.05569},
  year = {2020},
  date = {2020-06-01},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  address = {Paris, France},
  abstract = {We present a novel analysis of AO-RRT: a tree-based planner for motion planning with kinodynamic constraints, originally described by Hauser and Zhou (AO-X, 2016). AO-RRT explores the state-cost space and has been shown to efficiently obtain high-quality solutions in practice without relying on the availability of a computationally-intensive two-point boundary-value solver. Our main contribution is an optimality proof for the single-tree version of the algorithm---a variant that was not analyzed before. Our proof only requires a mild and easily-verifiable set of assumptions on the problem and system: Lipschitz-continuity of the cost function and the dynamics. In particular, we prove that for any system satisfying these assumptions, any trajectory having a piecewise-constant control function and positive clearance from the obstacles can be approximated arbitrarily well by a trajectory found by AO-RRT. We also discuss practical aspects of AO-RRT and present experimental comparisons of variants of the algorithm.},
  keywords = {Dynamics, Planning},
  pubstate = {published},
  tppubtype = {conference}
}
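The core idea behind AO-RRT, as described in the abstract above, is to grow a random tree in the augmented state-cost space and prune against the best solution found so far. A heavily simplified, hypothetical sketch (a 1D single integrator with time as the cost, nearest-neighbor selection under an ad hoc metric; not the analyzed implementation):

```python
# Toy AO-RRT-style search: states are (x, c) pairs with c = elapsed time;
# extensions apply a random piecewise-constant control, and nodes whose
# cost cannot beat the incumbent solution are pruned.
import random

def ao_rrt(x0, goal_lo, goal_hi, iters=6000, dt=0.1, seed=0):
    rng = random.Random(seed)
    nodes = [(x0, 0.0)]                    # (state, accumulated cost)
    best = float("inf")
    for _ in range(iters):
        # Sample in state-cost space; the cost axis shrinks as solutions improve.
        c_max = best if best < float("inf") else 10.0
        x_rand, c_rand = rng.uniform(-5.0, 5.0), rng.uniform(0.0, c_max)
        # Nearest node under a simple metric on the augmented space.
        near = min(nodes, key=lambda n: abs(n[0] - x_rand) + abs(n[1] - c_rand))
        # Propagate a random control u in [-1, 1] for one time step.
        u = rng.uniform(-1.0, 1.0)
        x_new, c_new = near[0] + u * dt, near[1] + dt
        if c_new >= best:
            continue                       # pruned: cannot improve the incumbent
        nodes.append((x_new, c_new))
        if goal_lo <= x_new <= goal_hi:
            best = c_new                   # tighter bound for future pruning
    return best
```

Note that no two-point boundary-value solver is needed: the planner only forward-propagates random controls, which is exactly the practical appeal highlighted in the entry.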
Sintov, A; Kimmel, A; Wen, B; Boularias, A; Bekris, K: Tools for Data-Driven Modeling of Within-Hand Manipulation with Underactuated Adaptive Hands. Learning for Dynamics & Control (L4DC), Berkeley, CA, 2020.

@conference{Sintov:2020ab,
  title = {Tools for Data-Driven Modeling of Within-Hand Manipulation with Underactuated Adaptive Hands},
  author = {A Sintov and A Kimmel and B Wen and A Boularias and K Bekris},
  url = {https://proceedings.mlr.press/v120/sintov20a.html},
  year = {2020},
  date = {2020-06-01},
  booktitle = {Learning for Dynamics \& Control (L4DC)},
  address = {Berkeley, CA},
  abstract = {Precise in-hand manipulation is an important skill for a robot to perform tasks in human environments. Practical robotic hands must be low-cost, easy to control and capable. 3D-printed underactuated adaptive hands provide such properties, as they are cheap to fabricate and adapt to objects of uncertain geometry with stable grasps. Challenges still remain, however, before such hands can attain human-like performance due to complex dynamics and contacts. In particular, useful models for planning, control or model-based reinforcement learning are still lacking. Recently, data-driven approaches for such models have shown promise. This work provides the first large public dataset of real within-hand manipulation that facilitates building such models, along with baseline data-driven modeling results. Furthermore, it contributes a ROS-based physics-engine model of such hands for independent data collection, experimentation and sim-to-reality transfer work.},
  keywords = {Dynamics},
  pubstate = {published},
  tppubtype = {conference}
}
Shome, R; Bekris, K: Synchronized Multi-Arm Rearrangement Guided by Mode Graphs with Capacity Constraints. Workshop on the Algorithmic Foundations of Robotics (WAFR), Oulu, Finland, 2020.

@conference{Shome:2020ac,
  title = {Synchronized Multi-Arm Rearrangement Guided by Mode Graphs with Capacity Constraints},
  author = {R Shome and K Bekris},
  url = {https://arxiv.org/abs/2005.09127},
  year = {2020},
  date = {2020-06-01},
  booktitle = {Workshop on the Algorithmic Foundations of Robotics (WAFR)},
  address = {Oulu, Finland},
  abstract = {Solving task planning problems involving multiple objects and multiple robotic arms poses scalability challenges. Such problems involve not only coordinating multiple high-DoF arms, but also searching through possible sequences of actions, including object placements and handoffs. The current work identifies a useful connection between multi-arm rearrangement and recent results in multi-body path planning on graphs with vertex capacity constraints. Solving a synchronized multi-arm rearrangement at a high level involves reasoning over a modal graph, where nodes correspond to stable object placements and object transfer states by the arms. Edges of this graph correspond to pick, placement and handoff operations. The objects can be viewed as pebbles moving over this graph, which has capacity constraints. For instance, each arm can carry a single object but placement locations can accumulate many objects. Efficient integer linear programming-based solvers have been proposed for the corresponding pebble problem. The current work proposes a heuristic to guide the task planning process for synchronized multi-arm rearrangement. Results indicate good scalability to multiple arms and objects, and an algorithm that can find high-quality solutions fast while exhibiting desirable anytime behavior.},
  keywords = {Manipulation, Planning},
  pubstate = {published},
  tppubtype = {conference}
}
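The pebble abstraction in the abstract above can be made concrete with a small, hypothetical helper (not the paper's ILP formulation): objects occupy vertices of a mode graph (placements, arms in transfer), each vertex with a maximum occupancy, and a move sequence is valid only if no vertex ever exceeds its capacity.

```python
# Toy validity check for capacity-constrained pebble motion on a mode graph.
# Assumed simplification: moves are given as direct vertex-to-vertex hops.

def simulate_moves(positions, capacity, moves):
    """positions: {pebble: vertex}; capacity: {vertex: max occupancy};
    moves: list of (pebble, target_vertex) pick/place/handoff steps.
    Returns the final positions, or raises ValueError on a violation."""
    occupancy = {}
    for v in positions.values():
        occupancy[v] = occupancy.get(v, 0) + 1
    for pebble, target in moves:
        source = positions[pebble]
        if occupancy.get(target, 0) + 1 > capacity[target]:
            raise ValueError("capacity exceeded at " + str(target))
        occupancy[source] -= 1
        occupancy[target] = occupancy.get(target, 0) + 1
        positions[pebble] = target
    return positions
```

With an arm vertex of capacity 1 and a table vertex of larger capacity, this captures the constraint cited in the entry: each arm carries a single object, while placement locations can accumulate many.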
Shome, R; Nakhimovich, D; Bekris, K Pushing the Boundaries of Asymptotic Optimality in Integrated Task and Motion Planning Conference Workshop on the Algorithmic Foundations of Robotics (WAFR), Oulu, Finland, 2020. Abstract | Links | BibTeX | Tags: Planning @conference{Shome:2020ab, title = {Pushing the Boundaries of Asymptotic Optimality in Integrated Task and Motion Planning}, author = {R Shome and D Nakhimovich and K Bekris}, url = {http://www.cs.rutgers.edu/~kb572/pubs/asymptotic_optimality_task_motion_planning.pdf}, year = {2020}, date = {2020-06-01}, booktitle = {Workshop on the Algorithmic Foundations of Robotics (WAFR)}, address = {Oulu, Finland}, abstract = {Integrated task and motion planning problems describe a multi-modal state space, which is often abstracted as a set of smooth manifolds that are connected via sets of transition states. One approach to solving such problems is to sample reachable states in each of the manifolds, while simultaneously sampling transition states. Prior work has shown that in order to achieve asymptotically optimal (AO) solutions for such piecewise-smooth task planning problems, it is sufficient to double the connection radius required for AO sampling-based motion planning. This was shown under the assumption that the transition sets themselves are smooth. The current work builds upon this result and demonstrates that it is sufficient to use the same connection radius as for standard AO motion planning. Furthermore, the current work studies the case where the transition sets are non-smooth boundary points of the valid state space, which is frequently the case in practice, such as when a gripper grasps an object. This paper generalizes the notion of clearance that is typically assumed in motion and task planning to include such individual, potentially non-smooth transition states. It is shown that asymptotic optimality is retained under this generalized regime.}, keywords = {Planning}, pubstate = {published}, tppubtype = {conference} } |
2025 |
PROBE: Proprioceptive Obstacle Detection and Estimation while Navigating in Clutter Conference IEEE International Conference on Robotics and Automation (ICRA), 2025. |
Integrating Model-based Control and RL for Sim2Real Transfer of Tight Insertion Policies Conference IEEE International Conference on Robotics and Automation (ICRA), 2025. |
2024 |
The State of Robot Motion Generation Inproceedings International Symposium on Robotics Research (ISRR), Long Beach, California, 2024. |
Learning Differentiable Tensegrity Dynamics Using Graph Neural Networks Inproceedings Conference on Robot Learning (CoRL), Munich, Germany, 2024. |
Roadmaps with Gaps Over Controllers: Achieving Efficiency in Planning under Dynamics Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Abu Dhabi, United Arab Emirates, 2024. |
MORALS: Analysis of High-Dimensional Robot Controllers Via Topological Tools in a Latent Space Conference IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan (Nominated for Best Paper Award in Automation), 2024. |
2023 |
Persistent Homology Guided Monte-Carlo Tree Search for Effective Non-Prehensile Manipulation Inproceedings International Symposium on Experimental Robotics (ISER), 2023. |
OVIR-3D: Open-Vocabulary 3D Instance Retrieval without Training on 3D Data Inproceedings Conference on Robot Learning (CoRL), Atlanta, GA, 2023. |
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs Conference Conference on Robot Learning (CoRL), Atlanta, GA, 2023. |
Real2Sim2Real Transfer for Control of Cable-Driven Robots Via a Differentiable Physics Engine Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, 2023. |
StarBlocks: Soft Actuated Self-Connecting Blocks for Building Deformable Lattice Structures Journal Article IEEE Robotics and Automation Letters, 8 (8), pp. 4521–4528, 2023. |
Demonstrating Large-Scale Package Manipulation Via Learned Metrics of Pick Success Inproceedings Robotics: Science and Systems (RSS), Daegu, Korea, 2023. |
Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos Inproceedings IEEE International Conference on Robotics and Automation (ICRA), London, UK, 2023. |
Data-Efficient Characterization of the Global Dynamics of Robot Controllers with Confidence Guarantees Inproceedings IEEE International Conference on Robotics and Automation (ICRA), London, UK, 2023. |
Resolution Complete In-Place Object Retrieval Given Known Object Models Inproceedings IEEE International Conference on Robotics and Automation (ICRA), London, UK, 2023. |
2022 |
6N-DoF Pose Tracking for Tensegrity Robots Inproceedings International Symposium on Robotics Research (ISRR), 2022. |
A Recurrent Differentiable Engine for Modeling Tensegrity Robots Trainable with Low-Frequency Data Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. |
A Survey on the Integration of Machine Learning with Sampling-Based Motion Planning Journal Article Forthcoming Foundations and Trends in Robotics, Forthcoming. |
Terrain-Aware Learned Controllers for Sampling-Based Kinodynamic Planning Over Physically Simulated Terrains Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022. |
You Only Demonstrate Once: Category-Level Manipulation from Single Visual Demonstration Inproceedings Robotics: Science and Systems (RSS), 2022, (Nomination for Best Paper Award). |
Lazy Rearrangement Planning in Confined Spaces Inproceedings International Conference on Automated Planning and Scheduling (ICAPS), 2022. |
Morse Graphs: Topological Tools for Analyzing the Global Dynamics of Robot Controllers Inproceedings Workshop on the Algorithmic Foundations of Robotics (WAFR), 2022. |
Online Object Model Reconstruction and Reuse for Lifelong Improvement of Robot Manipulation Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022, (Nomination for Best Paper Award in Manipulation). |
Persistent Homology for Effective Non-Prehensile Manipulation Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. |
Physics-Based Scene-Level Reasoning for Object Pose Estimation in Clutter Journal Article International Journal of Robotics Research (IJRR), 2022. |
Learning Sensorimotor Primitives of Sequential Manipulation Tasks from Visual Demonstrations Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. |
Model Identification and Control of a Mobile Robot with Omnidirectional Wheels Using Differentiable Physics Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. |
Fast High-Quality Tabletop Rearrangement in Bounded Workspace Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. |
Efficient and High-Quality Prehensile Rearrangement in Cluttered and Confined Spaces Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. |
CatGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2022. |
Complex In-Hand Manipulation Via Compliance-Enabled Finger Gaiting and Multi-Modal Planning Journal Article IEEE Robotics and Automation Letters (also at ICRA), 2022. |
Safe, Occlusion-Aware Manipulation for Online Object Reconstruction in Confined Space Inproceedings International Symposium on Robotics Research (ISRR), 2022. |
2021 |
Tensegrity Robotics Journal Article Soft Robotics, 2021. |
ASCE Earth and Space Conference 2021, Seattle, WA, 2021. |
Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. |
Improving Kinodynamic Planners for Vehicular Navigation with Learned Goal-Reaching Controllers Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. |
BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models Inproceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. |
Vision-Driven Compliant Manipulation for Reliable, High-Precision Assembly Tasks Conference Robotics: Science and Systems (RSS), 2021. |
Uniform Object Rearrangement: From Complete Monotone Primitives to Efficient Non-Monotone Informed Search Inproceedings IEEE International Conference on Robotics and Automation (ICRA), 2021. |
Fast, High-Quality Two-Arm Rearrangement in Synchronous, Monotone Tabletop Setups Journal Article IEEE Transactions on Automation Science and Engineering, 2021. |
Adaptive Tensegrity Locomotion: Controlling a Compliant Icosahedron with Symmetry-Reduced Reinforcement Learning Journal Article International Journal of Robotics Research (IJRR), 2021. |
Robotics As an Enabler of Resiliency to Disasters: Promises and Pitfalls Book Chapter 12660, pp. 75–101, Springer, 2021. |
Team RuBot's Experiences and Lessons from the ARIAC Journal Article Robotics and Computer-Integrated Manufacturing, 70, 2021. |
2020 |
se(3)-TrackNet: Data-Driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains Conference IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, 2020. |
Task-Driven Perception and Manipulation for Constrained Placement of Unknown Objects Journal Article IEEE Robotics and Automation Letters (RA-L) (also appearing at IEEE/RSJ IROS 2020), 2020. |
Scalable, Physics-Aware 6D Pose Estimation for Robot Manipulation PhD Thesis Rutgers University, 2020. |
Refined Analysis of Asymptotically-Optimal Kinodynamic Planning in the State-Cost Space Conference IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 2020. |
Tools for Data-Driven Modeling of Within-Hand Manipulation with Underactuated Adaptive Hands Conference Learning for Dynamics &amp; Control (L4DC), Berkeley, CA, 2020. |
Synchronized Multi-Arm Rearrangement Guided by Mode Graphs with Capacity Constraints Conference Workshop on the Algorithmic Foundations of Robotics (WAFR), Oulu, Finland, 2020. |
Pushing the Boundaries of Asymptotic Optimality in Integrated Task and Motion Planning Conference Workshop on the Algorithmic Foundations of Robotics (WAFR), Oulu, Finland, 2020. |