2025
Li, S; Keipour, A; Zhao, S; Rajagopalan, S; Swan, C; Bekris, K. "Learning to Optimize Package Picking for Large-Scale, Real-World Robot Induction." 19th International Symposium on Experimental Robotics (ISER), 2025. https://arxiv.org/abs/2506.09765. Tags: Learning, Manipulation.
Abstract: Warehouse automation plays a pivotal role in enhancing operational efficiency, minimizing costs, and improving resilience to workforce variability. While prior research has demonstrated the potential of machine learning (ML) models to increase picking success rates in large-scale robotic fleets by prioritizing high-probability picks and packages, these efforts primarily focused on predicting success probabilities for picks sampled using heuristic methods. Limited attention has been given, however, to leveraging data-driven approaches to directly optimize sampled picks for better performance at scale. In this study, we propose an ML-based framework that predicts transform adjustments and improves the selection of suction cups for multi-suction end effectors, so as to enhance the success probabilities of sampled picks. The framework was integrated and evaluated in test workcells that resemble the operations of Amazon Robotics' Robot Induction (Robin) fleet, which is used for package manipulation. Evaluated on over 2 million picks, the proposed method achieves a 20% reduction in pick failure rates compared to a heuristic-based pick sampling baseline, demonstrating its effectiveness in large-scale warehouse automation scenarios.
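As a rough illustration of the interface the abstract describes (a model that refines heuristically sampled picks by predicting a pose adjustment and per-cup activations), here is a minimal sketch. This is not the paper's implementation; the network size, feature inputs, and output parameterization are assumptions made for the example.

```python
# Hypothetical sketch: given features of a sampled pick, predict (i) a small pose adjustment
# and (ii) per-cup scores for a multi-suction end effector. Dimensions are illustrative.
import torch
import torch.nn as nn

class PickRefiner(nn.Module):
    def __init__(self, feat_dim=256, num_cups=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        # 3 translation offsets + 3 rotation offsets (axis-angle), in the end-effector frame
        self.delta_pose = nn.Linear(128, 6)
        # one logit per suction cup; sigmoid gives an activation probability
        self.cup_logits = nn.Linear(128, num_cups)

    def forward(self, pick_features):
        h = self.backbone(pick_features)
        return self.delta_pose(h), torch.sigmoid(self.cup_logits(h))

# Usage: refine a batch of candidate picks sampled by a heuristic.
model = PickRefiner()
features = torch.randn(16, 256)        # placeholder features for 16 candidate picks
delta, cup_probs = model(features)     # (16, 6) pose offsets, (16, 8) cup probabilities
```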
Wang, C; Vanbaar, J; Mitash, C; Li, S; Randle, D; Keipour, A; Hussein, M; Bekris, K; Katyal, K. "Demonstrating Item Picking at Scale via Effective Learning of Multimodal Representations." Proceedings of Robotics: Science and Systems (RSS), 2025. https://arxiv.org/abs/2506.10359. Tags: Learning, Manipulation, Planning.
Abstract: This work demonstrates how autonomously learning aspects of robotic operation from sparsely labeled, real-world data of deployed, engineered solutions at industrial scale can yield solutions with improved performance. Specifically, it focuses on multi-suction robot picking and performs a comprehensive study of multimodal visual encoders for predicting the success of candidate robotic picks. Picking diverse items from unstructured piles is an important and challenging task for robot manipulation in real-world settings, such as warehouses. Methods for picking from clutter must work for an open set of items while simultaneously meeting latency constraints to achieve high throughput. The demonstrated approach uses multiple input modalities, such as RGB, depth, and semantic segmentation, to estimate the quality of candidate multi-suction picks. The strategy is trained on real-world item-picking data with a combination of multimodal pretraining and finetuning. The manuscript provides a comprehensive experimental evaluation over a large item-picking dataset, an item-picking dataset targeted to include partial occlusions, and a package-picking dataset that focuses on containers, such as boxes and envelopes, instead of unpackaged items. The evaluation measures performance for different item configurations, pick scenes, and object types. Ablations help to understand the effects of in-domain pretraining, the impact of different modalities, and the importance of finetuning. These ablations reveal both the importance of training over multiple modalities and the ability of models to learn, during pretraining, the relationships between modalities, so that only a subset of them needs to be used as input during finetuning and inference.
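To make the kind of architecture the abstract describes concrete, here is a minimal sketch of per-modality encoders fused into a pick-success score. This is not the authors' model; the channel counts, crop sizes, and fusion-by-concatenation choice are assumptions for illustration.

```python
# Illustrative only: small CNN encoders per modality, concatenated and scored by an MLP.
import torch
import torch.nn as nn

def encoder(in_ch, out_dim=64):
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, out_dim),
    )

class PickSuccessModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb = encoder(3)       # RGB crop around the candidate pick
        self.depth = encoder(1)     # depth crop
        self.seg = encoder(1)       # segmentation-mask crop
        self.head = nn.Sequential(nn.Linear(3 * 64, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, rgb, depth, seg):
        z = torch.cat([self.rgb(rgb), self.depth(depth), self.seg(seg)], dim=-1)
        return torch.sigmoid(self.head(z))   # estimated probability that the pick succeeds

scores = PickSuccessModel()(torch.randn(4, 3, 96, 96),
                            torch.randn(4, 1, 96, 96),
                            torch.randn(4, 1, 96, 96))
```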
2024
Bekris, K; Doerr, J; Meng, P; Tangirala, S. "The State of Robot Motion Generation." International Symposium of Robotics Research (ISRR), Long Beach, California, 2024. https://arxiv.org/abs/2410.12172. Tags: Dynamics, Learning, Planning.
Abstract: This paper first reviews the large spectrum of methods for generating robot motion proposed over the 50 years of robotics research, culminating in recent developments. It crosses the boundaries of methodologies that are typically not surveyed together, from those that operate over explicit models to those that learn implicit ones. The paper concludes with a discussion of the current state of the art and the properties of the varying methodologies, highlighting opportunities for integration.
2023
Lu, S; Deng, Y; Boularias, A; Bekris, K. "Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos." IEEE International Conference on Robotics and Automation (ICRA), London, UK, 2023. https://arxiv.org/abs/2304.04325. Tags: Learning, Perception.
Abstract: This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot. A key feature of the self-supervised training process is a graph-matching algorithm that operates on the over-segmentation output of the point cloud reconstructed from each video. The graph matching, along with point cloud registration, is able to find recurring object patterns across videos and combine them into 3D object pseudo-labels, even under occlusions or different viewing angles. Projected 2D object masks from the 3D pseudo-labels are used to train a pixel-wise feature extractor through contrastive learning. During online inference, a clustering method uses the learned features to cluster foreground pixels into object segments. Experiments highlight the method's effectiveness on both real and synthetic video datasets, which include cluttered scenes of tabletop objects. The proposed method outperforms existing unsupervised methods for object segmentation by a large margin.
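The pixel-wise feature extractor is trained with contrastive learning over the projected pseudo-labels. Below is a minimal sketch of one such pixel-level contrastive objective (a supervised-contrastive-style loss over sampled pixels); the actual loss used in the paper may differ, and all shapes are assumptions.

```python
# Sketch: pixels sharing a pseudo-object label are pulled together, others pushed apart.
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(features, pseudo_labels, temperature=0.1):
    """features: (N, D) pixel embeddings sampled from an image.
    pseudo_labels: (N,) integer object ids projected from the 3D pseudo-labels."""
    features = F.normalize(features, dim=1)
    sim = features @ features.t() / temperature                 # (N, N) cosine similarities
    same = pseudo_labels.unsqueeze(0) == pseudo_labels.unsqueeze(1)
    eye = torch.eye(len(features), dtype=torch.bool)
    pos = same & ~eye                                            # positives: same pseudo-object
    # log-softmax over all other pixels, averaged over positive pairs
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float('-inf')), dim=1, keepdim=True)
    return -(log_prob[pos]).mean()

loss = pixel_contrastive_loss(torch.randn(128, 32), torch.randint(0, 5, (128,)))
```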
2022
McMahon, T; Sivaramakrishnan, A; Kedia, K; Granados, E; Bekris, K. "Terrain-Aware Learned Controllers for Sampling-Based Kinodynamic Planning Over Physically Simulated Terrains." IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022. https://ieeexplore.ieee.org/document/9982136. Tags: Dynamics, Learning, Planning.
Abstract: This paper explores learning an effective controller for improving the efficiency of kinodynamic planning for vehicular systems navigating uneven terrains. It describes the pipeline for training the corresponding controller and using it for motion planning purposes. The training process uses a soft actor-critic approach with hindsight experience replay to train a model, which is parameterized by the incline of the robot's local terrain. This trained model is then used during the expansion process of an asymptotically optimal kinodynamic planner to generate controls that allow the robot to reach desired local states. It is also used to define a heuristic cost-to-go function for the planner via a wavefront operation that estimates the cost of reaching the global goal. The cost-to-go function is used both for selecting nodes for expansion and for generating local goals for the controller to expand towards. The accompanying experimental section applies the integrated planning solution to models of all-terrain robots in a variety of physically simulated terrains. It shows that the proposed terrain-aware controller and the proposed wavefront function based on the cost-to-go model enable motion planners to find solutions in less time and with lower cost than alternatives. An ablation study emphasizes the benefits of a learned controller parameterized by the incline of the robot's local terrain, as well as of an incremental training process for the controller.
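The wavefront cost-to-go described in the abstract can be illustrated with a simple propagation from the goal over a grid. The sketch below is assumption-laden (a 2D grid, eight-connected neighbors, per-cell traversal costs) and is not the paper's implementation; the resulting values can serve both as a planner heuristic and to pick local goals for a learned controller.

```python
# Dijkstra-style wavefront: costs propagate outward from the goal cell.
import heapq
import math
import numpy as np

def wavefront_cost_to_go(traversal_cost, goal):
    """traversal_cost: (H, W) per-cell cost (e.g., higher on steep inclines); goal: (row, col)."""
    H, W = traversal_cost.shape
    cost_to_go = np.full((H, W), np.inf)
    cost_to_go[goal] = 0.0
    frontier = [(0.0, goal)]
    while frontier:
        c, (r, col) = heapq.heappop(frontier)
        if c > cost_to_go[r, col]:
            continue                      # stale queue entry
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (-1, 1), (1, -1), (1, 1)]:
            nr, nc = r + dr, col + dc
            if 0 <= nr < H and 0 <= nc < W:
                step = math.hypot(dr, dc) * traversal_cost[nr, nc]
                if c + step < cost_to_go[nr, nc]:
                    cost_to_go[nr, nc] = c + step
                    heapq.heappush(frontier, (c + step, (nr, nc)))
    return cost_to_go

grid = np.ones((50, 50))                  # placeholder terrain costs
ctg = wavefront_cost_to_go(grid, goal=(45, 45))
```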
Wen, B; Lian, W; Bekris, K; Schaal, S. "You Only Demonstrate Once: Category-Level Manipulation from Single Visual Demonstration." Robotics: Science and Systems (RSS), 2022. Nominated for Best Paper Award. https://www.roboticsproceedings.org/rss18/p044.pdf. Tags: Learning, Manipulation, Perception.
Abstract: Promising results have been achieved recently in category-level manipulation that generalizes across object instances. Nevertheless, it often requires expensive real-world data collection and manual specification of semantic keypoints for each object category and task. Additionally, coarse keypoint predictions and ignoring intermediate action sequences hinder adoption in complex manipulation tasks beyond pick-and-place. This work proposes a novel, category-level manipulation framework that leverages an object-centric, category-level representation and model-free 6-DoF motion tracking. The canonical object representation is learned solely in simulation and then used to parse a category-level task trajectory from a single demonstration video. The demonstration is reprojected to a target trajectory tailored to a novel object via the canonical representation. During execution, the manipulation horizon is decomposed into long-range, collision-free motion and last-inch manipulation. For the latter, a category-level behavior cloning (CatBC) method leverages motion tracking to perform closed-loop control. CatBC follows the target trajectory, projected from the demonstration and anchored to a dynamically selected category-level coordinate frame. The frame is automatically selected along the manipulation horizon by a local attention mechanism. This framework makes it possible to teach different manipulation strategies by providing only a single demonstration, without complicated manual programming. Extensive experiments demonstrate its efficacy in a range of challenging industrial tasks in high-precision assembly, which involve learning complex, long-horizon policies. The process exhibits robustness against uncertainty due to dynamics, as well as generalization across object instances and scene configurations.
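The closed-loop idea of re-anchoring a demonstrated trajectory to the tracked object frame can be sketched with basic homogeneous-transform composition. All names and shapes below are illustrative assumptions, not the paper's code.

```python
# Sketch: express demonstrated waypoints in the category-level object frame, then re-anchor
# them at every control step using the latest 6-DoF track of the object.
import numpy as np

def next_waypoint(T_world_object, trajectory_in_object_frame, step):
    """T_world_object: 4x4 tracked object pose; trajectory_in_object_frame: (N, 4, 4) waypoints."""
    return T_world_object @ trajectory_in_object_frame[step]

T_obj = np.eye(4)                              # placeholder pose from the 6-DoF tracker
demo = np.tile(np.eye(4), (10, 1, 1))          # placeholder demonstrated waypoints
target = next_waypoint(T_obj, demo, step=3)    # 4x4 target end-effector pose in the world frame
```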
2021
Wang, K; Aanjaneya, M; Bekris, K. "Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots." IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. https://arxiv.org/abs/2011.04929. Tags: Dynamics, Learning, Soft-Robots.
Abstract: Learning policies in simulation is promising for reducing the human effort needed to train robot controllers. This is especially true for soft robots, which are more adaptive and safe but also more difficult to accurately model and control. The sim2real gap is the main barrier to successfully transferring policies from simulation to a real robot. System identification can be applied to reduce this gap, but traditional identification methods require a lot of manual tuning. Data-driven alternatives can tune dynamical models directly from data but are often data hungry, which again demands human effort for data collection. This work proposes a data-driven, end-to-end differentiable simulator focused on the exciting but challenging domain of tensegrity robots. To the best of the authors' knowledge, this is the first differentiable physics engine for tensegrity robots that supports cable, contact, and actuation modeling. The aim is to develop a reasonably simplified, data-driven simulation that can learn approximate dynamics from limited ground-truth data. The dynamics must be accurate enough to generate policies that can be transferred back to the ground-truth system. As a first step in this direction, the current work demonstrates sim2sim transfer, where the unknown physical model of MuJoCo acts as the ground-truth system. Two different tensegrity robots, a 6-bar and a 3-bar tensegrity, are used for evaluating and learning locomotion policies. The results indicate that, when the differentiable engine is used for training, only 0.25% of the ground-truth data is needed to obtain a policy that works on the ground-truth system, compared to training the policy directly on that system.
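The value of a differentiable engine for data-efficient system identification can be conveyed with a toy example: fit an unknown parameter of a simulated system by backpropagating through the rollout. The stand-in below uses a spring-damper rather than a tensegrity model and is only a sketch of the underlying idea, not the paper's engine.

```python
# Toy sketch: gradients flow through a simple simulator to identify an unknown stiffness.
import torch

def rollout(k, x0=1.0, v0=0.0, dt=0.01, steps=200, damping=0.1):
    x, v, traj = torch.tensor(x0), torch.tensor(v0), []
    for _ in range(steps):
        a = -k * x - damping * v          # spring-damper dynamics (stand-in for cables)
        v = v + dt * a
        x = x + dt * v
        traj.append(x)
    return torch.stack(traj)

observed = rollout(torch.tensor(5.0)).detach()      # "ground-truth" system (unknown k = 5)
k_hat = torch.tensor(1.0, requires_grad=True)       # initial guess for the parameter
opt = torch.optim.Adam([k_hat], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = ((rollout(k_hat) - observed) ** 2).mean()
    loss.backward()                                  # gradients flow through the simulator
    opt.step()
```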
2020
Wen, B; Mitash, C; Ren, B; Bekris, K. "se(3)-TrackNet: Data-Driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains." IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, 2020. http://arxiv.org/abs/2007.13866. Tags: Learning, Perception.
Abstract: Tracking the 6D pose of objects in video sequences is important for robot manipulation. This task, however, introduces multiple challenges: (i) robot manipulation involves significant occlusions; (ii) data and annotations for 6D poses are troublesome and difficult to collect, which complicates machine learning solutions; and (iii) incremental error drift often accumulates in long-term tracking, necessitating re-initialization of the object's pose. This work proposes a data-driven optimization approach for long-term, 6D pose tracking. It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model. The key contribution in this context is a novel neural network architecture, which appropriately disentangles the feature encoding to help reduce domain shift, together with an effective 3D orientation representation via Lie algebra. Consequently, even when trained only with synthetic data, the network works effectively over real images. Comprehensive experiments over benchmarks - existing ones as well as a new dataset with significant occlusions related to object manipulation - show that the proposed approach achieves consistently robust estimates and outperforms alternatives, even those trained with real images. The approach is also the most computationally efficient among the alternatives and achieves a tracking frequency of 90.9 Hz.
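The per-frame pose update that such a tracker performs can be sketched as follows: a small relative transform predicted as a Lie-algebra vector (axis-angle rotation plus translation) is mapped to a rigid transform and composed with the previous estimate. This is an assumed wiring of the update step, not the paper's code.

```python
# Sketch: apply a predicted relative pose increment to the previous 6D estimate.
import numpy as np

def axis_angle_to_matrix(w):
    theta = np.linalg.norm(w)
    if theta < 1e-9:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)   # Rodrigues formula

def apply_relative_pose(T_prev, rot_vec, trans):
    dT = np.eye(4)
    dT[:3, :3] = axis_angle_to_matrix(rot_vec)
    dT[:3, 3] = trans
    return T_prev @ dT                      # compose the increment with the last pose estimate

T = np.eye(4)                               # previous best pose estimate
T = apply_relative_pose(T, rot_vec=np.array([0.0, 0.01, 0.0]), trans=np.array([0.001, 0.0, 0.0]))
```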
2019
Mitash, C; Wen, B; Bekris, K; Boularias, A. "Scene-Level Pose Estimation for Multiple Instances of Densely Packed Objects." Conference on Robot Learning (CoRL), Osaka, Japan, 2019. https://arxiv.org/pdf/1910.04953.pdf. Tags: Learning, Perception.
Abstract: This paper introduces key machine learning operations that allow the realization of robust, joint 6D pose estimation of multiple instances of objects either densely packed or in unstructured piles from RGB-D data. The first objective is to learn semantic and instance-boundary detectors without manual labeling. An adversarial training framework in conjunction with physics-based simulation is used to achieve detectors that behave similarly in synthetic and real data. Given the stochastic output of such detectors, candidates for object poses are sampled. The second objective is to automatically learn a single score for each pose candidate that represents its quality in terms of explaining the entire scene via a gradient boosted tree. The proposed method uses features derived from surface and boundary alignment between the observed scene and the object model placed at hypothesized poses. Scene-level, multi-instance pose estimation is then achieved by an integer linear programming process that selects hypotheses that maximize the sum of the learned individual scores, while respecting constraints, such as avoiding collisions. To evaluate this method, a dataset of densely packed objects with challenging setups for state-of-the-art approaches is collected. Experiments on this dataset and a public one show that the method significantly outperforms alternatives in terms of 6D pose accuracy while trained only with synthetic datasets.
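The hypothesis-selection step can be written as a small integer linear program: maximize the sum of learned scores subject to collision and one-pose-per-instance constraints. The sketch below uses the PuLP library as one possible solver interface, which is our choice rather than the paper's; the scores, collision pairs, and instance groups are placeholder assumptions.

```python
# Sketch of scene-level hypothesis selection as a binary ILP.
import pulp

scores = [0.9, 0.7, 0.6, 0.4, 0.3]          # learned quality score per pose hypothesis
colliding_pairs = [(0, 3), (1, 2)]           # hypotheses whose placements would collide
instance_groups = [[0, 1], [2, 3, 4]]        # hypotheses explaining the same detected instance

prob = pulp.LpProblem("scene_level_selection", pulp.LpMaximize)
x = [pulp.LpVariable(f"h{i}", cat="Binary") for i in range(len(scores))]
prob += pulp.lpSum(s * xi for s, xi in zip(scores, x))       # maximize total learned score
for i, j in colliding_pairs:
    prob += x[i] + x[j] <= 1                                  # selected poses must not collide
for group in instance_groups:
    prob += pulp.lpSum(x[i] for i in group) <= 1              # at most one pose per instance
prob.solve(pulp.PULP_CBC_CMD(msg=False))
selected = [i for i, xi in enumerate(x) if pulp.value(xi) == 1]
```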