Preprints

DiMSam: Diffusion Models as Samplers for Task and Motion Planning under Partial Observability
Under Review. June 2023.
[Abstract] [BibTex] [PDF]

@unpublished{fang2023dimsam,
  title = {DiMSam: Diffusion Models as Samplers for Task and Motion Planning under Partial Observability},
  author = {Fang, Xiaolin and Garrett, Caelan and Eppner, Clemens and Lozano-Pérez, Tomás and Kaelbling, Leslie and Fox, Dieter},
  note = {Under Review},
  year = {2023}
}
Task and Motion Planning (TAMP) approaches are effective at planning long-horizon autonomous robot manipulation. However, because they require a planning model, it can be difficult to apply them to domains where the environment and its dynamics are not fully known. We propose to overcome these limitations by leveraging deep generative modeling, specifically diffusion models, to learn constraints and samplers that capture these difficult-to-engineer aspects of the planning model. These learned samplers are composed and combined within a TAMP solver to jointly find action parameter values that satisfy the constraints along a plan. To tractably make predictions for unseen objects in the environment, we define these samplers on low-dimensional learned latent embeddings of changing object state. We evaluate our approach in an articulated object manipulation domain and show how the combination of classical TAMP, generative learning, and latent embeddings enables long-horizon constraint-based reasoning.
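As an illustration of how such learned samplers can be chained, here is a minimal Python sketch; the denoiser module, latent dimension, and update rule are simplifying assumptions for exposition, not the paper's implementation:

import torch

LATENT_DIM = 32  # assumed size of the learned object-state embedding

class LatentDiffusionSampler:
    """One learned constraint sampler: denoises Gaussian noise into a latent
    sample, conditioned on the latent produced by the previous action."""
    def __init__(self, denoiser: torch.nn.Module, steps: int = 50):
        self.denoiser = denoiser
        self.steps = steps

    @torch.no_grad()
    def sample(self, cond: torch.Tensor) -> torch.Tensor:
        x = torch.randn(1, LATENT_DIM)
        for t in reversed(range(self.steps)):
            eps = self.denoiser(x, cond, torch.full((1,), t))
            x = x - eps / self.steps  # schematic reverse-diffusion step, not a tuned schedule
        return x

def sample_plan_parameters(samplers, init_latent):
    # Chain the samplers along a candidate plan skeleton: each sampler's
    # output latent conditions the next, as the TAMP solver composes them.
    latents, cur = [], init_latent
    for s in samplers:
        cur = s.sample(cur)
        latents.append(cur)
    return latents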

Refereed Conference & Journal Publications

Deep Learning Approaches to Grasp Synthesis: A Review
IEEE Transactions on Robotics (T-RO). IEEE. June 2023.
[Abstract] [BibTex] [PDF] [Website]

@article{newbury2022review,
  title = {Deep Learning Approaches to Grasp Synthesis: A Review},
  author = {Newbury, Rhys and Gu, Morris and Chumbley, Lachlan and Mousavian, Arsalan and Eppner, Clemens and Leitner, Jürgen and Bohg, Jeannette and Morales, Antonio and Asfour, Tamim and Kragic, Danica and Fox, Dieter and Cosgun, Akansel},
  journal = {IEEE Transactions on Robotics (T-RO)},
  year = {2023}
}
Grasping is the process of picking up an object by applying forces and torques at a set of contacts. Recent advances in deep learning methods have allowed rapid progress in robotic object grasping. In this systematic review, we surveyed the publications over the last decade, with a particular interest in grasping an object using all six degrees of freedom of the end-effector pose. Our review found four common methodologies for robotic grasping: sampling-based approaches, direct regression, reinforcement learning, and exemplar approaches. In addition, we found two “supporting methods” that use deep learning to support the grasping process: shape approximation and affordances. We have distilled the publications found in this systematic review (85 papers) into ten key takeaways we consider crucial for future robotic grasping and manipulation research.
CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2023.
[Abstract] [BibTex] [PDF] [Website]

@inproceedings{murali2022cabinet,
  title = {CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation},
  author = {Murali, Adithya and Mousavian, Arsalan and Eppner, Clemens and Fishman, Adam and Fox, Dieter},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2023}
}
We address the important problem of generalizing robotic rearrangement to clutter without any explicit object models. We first generate over 650K cluttered scenes (orders of magnitude more than prior work) in diverse everyday environments, such as cabinets and shelves. We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture. CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation, and predicts collisions for SE(3) object poses in the scene. Our representation has a fast inference speed of 7μs/query with nearly 20% higher performance than baseline approaches in challenging environments. We use this collision model in conjunction with a Model Predictive Path Integral (MPPI) planner to generate collision-free trajectories for picking and placing in clutter. CabiNet also predicts waypoints, computed from the scene’s signed distance field (SDF), that allow the robot to navigate tight spaces during rearrangement. This improves rearrangement performance by nearly 35% compared to baselines. We systematically evaluate our approach in procedurally generated simulated experiments and demonstrate that it transfers directly to the real world, despite being trained exclusively in simulation. Robot experiments in completely unknown scenes with unknown objects are shown in the supplementary video.
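A hedged sketch of the collision-query pattern the abstract describes; the model signature and the rejection threshold are assumptions for illustration, not the released CabiNet interface:

import torch

def query_collisions(model, scene_pts, obj_pts, poses):
    """Score a batch of SE(3) object poses against one scene point cloud.
    scene_pts: (M, 3), obj_pts: (N, 3), poses: (B, 4, 4) -> (B,) probabilities."""
    R, t = poses[:, :3, :3], poses[:, :3, 3]
    obj = obj_pts @ R.transpose(1, 2) + t[:, None]  # transform object cloud per pose
    with torch.no_grad():
        return model(scene_pts, obj)

def collision_free(model, scene_pts, obj_pts, candidate_poses, thresh=0.5):
    # Reject MPPI placement samples the network predicts to be in collision.
    p = query_collisions(model, scene_pts, obj_pts, candidate_poses)
    return candidate_poses[p < thresh]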
Motion Policy Networks
Conference on Robot Learning (CoRL). December 2022.
[Abstract] [BibTex] [PDF] [Video] [Website] [Code ★93]

@inproceedings{fishman2022mpinets,
  title = {Motion Policy Networks},
  author = {Fishman, Adam and Murali, Adithya and Eppner, Clemens and Peele, Bryan and Boots, Byron and Fox, Dieter},
  booktitle = {Conference on Robot Learning (CoRL)},
  year = {2022}
}
Collision-free motion generation in unknown environments is a core building block for robot manipulation. Generating such motions is challenging due to multiple objectives: not only should the solutions be optimal, but the motion generator itself must also be fast enough for real-time performance and reliable enough for practical deployment. A wide variety of methods have been proposed, ranging from local controllers to global planners, which are often combined to offset their shortcomings. We present an end-to-end neural model called Motion Policy Networks (MπNets) to generate collision-free, smooth motion from just a single depth camera observation. MπNets are trained on over 3 million motion planning problems in over 500,000 environments. Our experiments show that MπNets are significantly faster than global planners while exhibiting the reactivity needed to deal with dynamic scenes. They are 46% better than prior neural planners and more robust than local control policies. Despite being trained only in simulation, MπNets transfer well to the real robot with noisy partial point clouds.
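The closed-loop use of such a policy can be sketched as follows; the exact inputs and outputs assumed here (point cloud, joint state, goal, predicted joint displacement) are illustrative, not the released model's API:

import torch

def rollout(policy, get_point_cloud, q_init, q_goal, steps=300, tol=1e-2):
    """Re-query the policy every control tick so it can react to scene changes."""
    q = q_init.clone()
    for _ in range(steps):
        pc = get_point_cloud()           # fresh single-view depth observation
        dq = policy(pc, q, q_goal)       # network predicts a joint-space step
        q = q + dq
        if torch.norm(q - q_goal) < tol: # reached the goal configuration
            break
    return q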
DefGraspSim: Simulation-based Grasping of 3D Deformable Objects
R:SS 2021 Workshop on Deformable Object Simulation in Robotics. July 2021.
[Abstract] [BibTex] [PDF] [Website] [Code ★99] [Press: TechXplore]
[Best Workshop Paper Award]

@inproceedings{huang2021defgraspsim,
  title = {DefGraspSim: Simulation-based Grasping of 3D Deformable Objects},
  author = {Huang, Isabella and Narang, Yashraj and Eppner, Clemens and Sundaralingam, Balakumar and Macklin, Miles and Hermans, Tucker and Fox, Dieter},
  booktitle = {R:SS 2021 Workshop on Deformable Object Simulation in Robotics},
  year = {2021}
}
Robotic grasping of 3D deformable objects (e.g., fruits/vegetables, internal organs, bottles/boxes) is critical for real-world applications such as food processing, robotic surgery, and household automation. However, developing grasp strategies for such objects is uniquely challenging. In this work, we efficiently simulate grasps on a wide range of 3D deformable objects using a GPU-based implementation of the corotational finite element method (FEM). To facilitate future research, we open-source our simulated dataset (34 objects, 1e5 Pa elasticity range, 6800 grasp evaluations, 1.1M grasp measurements), as well as a code repository that allows researchers to run our full FEM-based grasp evaluation pipeline on arbitrary 3D object models of their choice. We also provide a detailed analysis of 6 object primitives. For each primitive, we methodically describe the effects of different grasp strategies, compute a set of performance metrics (e.g., deformation, stress) that fully capture the object response, and identify simple grasp features (e.g., gripper displacement, contact area) that are measurable by robots prior to pickup and predictive of these performance metrics. Finally, we demonstrate good correspondence between grasps on simulated objects and their real-world counterparts.
Alternative Paths Planner (APP) for Provably Fixed-time Manipulation Planning in Semi-structured Environments
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2021.
[Abstract] [BibTex] [PDF]

@inproceedings{islam2020alternative,
  title = {Alternative Paths Planner (APP) for Provably Fixed-time Manipulation Planning in Semi-structured Environments},
  author = {Islam, Fahad and Paxton, Christopher and Eppner, Clemens and Peele, Bryan and Likhachev, Maxim and Fox, Dieter},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2021}
}
In many applications, including logistics and manufacturing, robot manipulators operate in semi-structured environments alongside humans or other robots. These environments are largely static, but they may contain some movable obstacles that the robot must avoid. Manipulation tasks in these applications are often highly repetitive, but require fast and reliable motion planning capabilities, often under strict time constraints. Existing preprocessing-based approaches are beneficial when the environments are highly structured, but their performance degrades in the presence of movable obstacles, since these are not modeled a priori. We propose a novel preprocessing-based method called Alternative Paths Planner (APP) that provides provably fixed-time planning guarantees in semi-structured environments. APP plans a set of alternative paths offline such that, for any configuration of the movable obstacles, at least one of the paths from this set is collision-free. During online execution, a collision-free path can be looked up efficiently within a few microseconds. We evaluate APP on a 7 DoF robot arm in semi-structured domains of varying complexity and demonstrate that APP is several orders of magnitude faster than state-of-the-art motion planners for each domain. We further validate this approach with real-time experiments on a robotic manipulator.
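The offline/online split can be made concrete with a short sketch; plan_path and is_collision_free are assumed helpers standing in for a full planner and collision checker, not APP's actual API:

def plan_offline(plan_path, is_collision_free, obstacle_configurations):
    """Greedily grow a path set until every obstacle configuration is covered."""
    paths = []
    for cfg in obstacle_configurations:
        if not any(is_collision_free(p, cfg) for p in paths):
            paths.append(plan_path(cfg))  # add a path avoiding this configuration
    return paths

def lookup_online(paths, is_collision_free, observed_cfg):
    """A first-hit scan over a small precomputed set, hence the fast lookup."""
    for p in paths:
        if is_collision_free(p, observed_cfg):
            return p
    raise RuntimeError("unreachable if offline coverage was complete")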
ACRONYM: A Large-Scale Grasp Dataset Based on Simulation
Clemens Eppner, Arsalan Mousavian and Dieter Fox
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2021.
[Abstract] [BibTex] [PDF] [Data] [Code ★82]

@inproceedings{eppner2020acronym,
  title = {ACRONYM: A Large-Scale Grasp Dataset Based on Simulation},
  author = {Eppner, Clemens and Mousavian, Arsalan and Fox, Dieter},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2021}
}
We introduce ACRONYM, a dataset for robot grasp planning based on physics simulation. The dataset contains 17.7M parallel-jaw grasps, spanning 8872 objects from 262 different categories, each labeled with the grasp result obtained from a physics simulator. We show the value of this large and diverse dataset by using it to train two state-of-the-art learning-based grasp planning algorithms. Grasp performance improves significantly when compared to the original smaller dataset.
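For readers who want to use the data, here is a minimal loading sketch with h5py; the HDF5 key names and the filename below are assumptions based on the released format and should be checked against the repository:

import h5py
import numpy as np

def load_grasps(path):
    with h5py.File(path, "r") as f:
        T = np.asarray(f["grasps/transforms"])                         # (N, 4, 4) gripper poses
        ok = np.asarray(f["grasps/qualities/flex/object_in_gripper"])  # (N,) simulated outcomes
    return T, ok.astype(bool)

poses, success = load_grasps("Mug_1234_0.01.h5")  # hypothetical filename
print(f"{success.mean():.1%} of {len(poses)} grasps succeeded in simulation")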
Object Rearrangement Using Learned Implicit Collision Functions
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2021.
[Abstract] [BibTex] [PDF] [Website] [Code ★44]

@inproceedings{danielczuk2020rearrangement,
  title = {Object Rearrangement Using Learned Implicit Collision Functions},
  author = {Danielczuk, Michael and Mousavian, Arsalan and Eppner, Clemens and Fox, Dieter},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2021}
}
Robotic object rearrangement combines the skills of picking and placing objects. When object models are unavailable, typical collision-checking models may be unable to predict collisions in partial point clouds with occlusions, making generation of collision-free grasping or placement trajectories challenging. We propose a learned collision model that accepts scene and query object point clouds and predicts collisions for 6DOF object poses within the scene. We train the model on a synthetic set of 1 million scene/object point cloud pairs and 2 billion collision queries. We leverage the learned collision model as part of a model predictive path integral (MPPI) policy in a tabletop rearrangement task and show that the policy can plan collision-free grasps and placements for objects unseen in training in both simulated and physical cluttered scenes with a Franka Panda robot. The learned model outperforms both traditional pipelines and learned ablations by 9.8% in accuracy on a dataset of simulated collision queries and is 75x faster than the best-performing baseline.
6-DOF Grasping for Target-driven Object Manipulation in Clutter
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Paris, France. May 2020.
[Abstract] [BibTex] [PDF] [Video] [Talk]
[ICRA 2020 Best Student & Best Manipulation Paper Award Finalist]

@inproceedings{murali2019cluttergrasping,
  title = {6-DOF Grasping for Target-driven Object Manipulation in Clutter},
  author = {Murali, Adithya and Mousavian, Arsalan and Eppner, Clemens and Paxton, Christopher and Fox, Dieter},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2020}
}
Grasping in cluttered environments is a fundamental but challenging robotic skill. It requires reasoning about both unseen object parts and potential collisions with the manipulator. Most existing data-driven approaches avoid this problem by limiting themselves to top-down planar grasps, which is insufficient for many real-world scenarios and greatly limits possible grasps. We present a method that plans 6-DOF grasps for any desired object in a cluttered scene from partial point cloud observations. Our method achieves a grasp success rate of 80.3%, outperforming baseline approaches by 17.6%, and clears 9 cluttered table scenes (containing 23 unknown objects and 51 picks in total) on a real robotic platform. By using our learned collision checking module, we can even reason about effective grasp sequences to retrieve objects that are not immediately accessible.
Self-supervised 6D Object Pose Estimation for Robot Manipulation
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Paris, France. May 2020.
[Abstract] [BibTex] [PDF] [Video]

@inproceedings{deng2019selfsupervised,
  title = {Self-supervised 6D Object Pose Estimation for Robot Manipulation},
  author = {Deng, Xinke and Xiang, Yu and Mousavian, Arsalan and Eppner, Clemens and Bretl, Timothy and Fox, Dieter},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2020}
}
To teach robots skills, it is crucial to obtain data with supervision. Since annotating real-world data is time-consuming and expensive, enabling robots to learn in a self-supervised way is important. In this work, we introduce a robot system for self-supervised 6D object pose estimation. Starting from modules trained in simulation, our system is able to label real-world images with accurate 6D object poses for self-supervised learning. In addition, the robot interacts with objects in the environment to change the object configuration by grasping or pushing objects. In this way, our system is able to continuously collect data and improve its pose estimation modules. We show that the self-supervised learning improves object segmentation and 6D pose estimation performance, and consequently enables the system to grasp objects more reliably.
Representing Robot Task Plans as Robust Logical-Dynamical Systems
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Macau, China. November 2019.
[Abstract] [BibTex] [PDF]

@inproceedings{paxton2019logicaldynamicalsystems,
  title = {Representing Robot Task Plans as Robust Logical-Dynamical Systems},
  author = {Christopher Paxton and Nathan Ratliff and Clemens Eppner and Dieter Fox},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2019}
}
It is difficult to create robust, reusable, and reactive behaviors for robots that can be easily extended and combined. Frameworks such as Behavior Trees are flexible but difficult to characterize, especially when designing reactions and recovery behaviors to consistently converge to a desired goal condition. We propose a framework called Robust Logical-Dynamical Systems (RLDS), which combines the advantages of task representations like behavior trees with theoretical guarantees on performance. RLDS can also be constructed automatically from simple sequential task plans and will still achieve robust, reactive behavior in dynamic real-world environments. In this work, we describe both our proposed framework and a case study on a simple household manipulation task, with examples of how specific pieces can be implemented to achieve robust behavior. Finally, we show how, in the context of these manipulation tasks, a combination of an RLDS with planning can achieve better results under adversarial conditions.
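The core mechanism lends itself to a toy sketch: an ordered list of (precondition, controller) pairs where each tick runs the highest-priority operator whose precondition holds, so recovery falls out of continuous re-evaluation. The names below are illustrative, not the paper's implementation:

def rlds_step(operators, state):
    for precondition, controller in operators:  # highest priority first
        if precondition(state):
            return controller(state)
    raise RuntimeError("no applicable operator; the plan is ill-formed")

# A pick task that re-approaches automatically if the grasp slips:
operators = [
    (lambda s: s["holding"], lambda s: "lift"),
    (lambda s: s["aligned"], lambda s: "close_gripper"),
    (lambda s: True,         lambda s: "approach"),
]
print(rlds_step(operators, {"holding": False, "aligned": False}))  # -> approach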
A Billion Ways to Grasp: An Evaluation of Grasp Sampling Schemes on a Dense, Physics-Based Grasp Data Set
Clemens Eppner, Arsalan Mousavian and Dieter Fox
Springer Proceedings of the 19th International Symposium of Robotics Research (ISRR). Hanoi, Vietnam. October 2019.
[Abstract] [BibTex] [PDF] [Website]

@inproceedings{eppner2019graspsampling,
  title = {A Billion Ways to Grasp: An Evaluation of Grasp Sampling Schemes on a Dense, Physics-Based Grasp Data Set},
  author = {Clemens Eppner and Arsalan Mousavian and Dieter Fox},
  booktitle = {Springer Proceedings of the 19th International Symposium of Robotics Research (ISRR)},
  year = {2019}
}
Robot grasping is often formulated as a learning problem. With the increasing speed and quality of physics simulations, generating large-scale grasping data sets that feed learning algorithms is becoming more and more popular. An often overlooked question is how to generate the grasps that make up these data sets. In this paper, we review, classify, and compare different grasp sampling strategies. Our evaluation is based on a fine-grained discretization of SE(3) and uses physics-based simulation to evaluate the quality and robustness of the corresponding parallel-jaw grasps. Specifically, we consider more than 1 billion grasps for each of the 21 objects from the YCB data set. This dense data set lets us evaluate existing sampling schemes w.r.t. their bias and efficiency. Our experiments show that some popular sampling schemes contain significant bias and do not cover all possible ways an object can be grasped.
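To make "sampling scheme" concrete, the sketch below implements one simple surface-normal scheme of the family the paper evaluates (an illustration, not one of the paper's exact implementations); note how the free roll angle is an unconstrained choice of the kind that can bias a scheme's coverage of SE(3):

import numpy as np

def sample_grasp(points, normals, standoff=0.1, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    i = rng.integers(len(points))
    approach = -normals[i] / np.linalg.norm(normals[i])  # approach along inward normal
    roll = rng.uniform(0, 2 * np.pi)                     # free parameter of the scheme
    tmp = np.array([1.0, 0, 0]) if abs(approach[0]) < 0.9 else np.array([0, 1.0, 0])
    x = np.cross(approach, tmp); x /= np.linalg.norm(x)
    y = np.cross(approach, x)
    R = np.stack([np.cos(roll) * x + np.sin(roll) * y,
                  -np.sin(roll) * x + np.cos(roll) * y,
                  approach], axis=1)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = points[i] - standoff * approach           # stand off along the outward normal
    return T                                             # parallel-jaw gripper pose in SE(3)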
6-DOF GraspNet: Variational Grasp Generation for Object Manipulation
Arsalan Mousavian, Clemens Eppner and Dieter Fox
Proceedings of the International Conference on Computer Vision (ICCV). Seoul, Korea. October 2019.
[Abstract] [BibTex] [PDF] [Video] [Website] [Code ★175] [Press: Neowin]

@inproceedings{mousavian2019graspnet,
  title = {6-DOF GraspNet: Variational Grasp Generation for Object Manipulation},
  author = {Arsalan Mousavian and Clemens Eppner and Dieter Fox},
  booktitle = {Proceedings of the International Conference on Computer Vision (ICCV)},
  year = {2019}
}
Generating grasp poses is a crucial component for any robot object manipulation task. In this work, we formulate the problem of grasp generation as sampling a set of grasps using a variational autoencoder, and we assess and refine the sampled grasps using a grasp evaluator model. Both the Grasp Sampler and the Grasp Refinement networks take 3D point clouds observed by a depth camera as input. We evaluate our approach in simulation and real-world robot experiments. Our approach achieves an 88% success rate on various commonly used objects with diverse appearances, scales, and weights. Our model is trained purely in simulation and works in the real world without any extra steps.
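The sample / evaluate / refine loop can be sketched as follows; the sampler and evaluator interfaces (latent_dim attribute, decode signature) are assumed for illustration:

import torch

def generate_grasps(sampler_vae, evaluator, pc, n=200, refine_steps=10, lr=1e-2):
    z = torch.randn(n, sampler_vae.latent_dim)
    grasps = sampler_vae.decode(z, pc).detach().requires_grad_(True)  # propose
    for _ in range(refine_steps):
        score = evaluator(grasps, pc).sum()  # higher = more likely to succeed
        score.backward()
        with torch.no_grad():
            grasps += lr * grasps.grad       # gradient ascent on the success score
            grasps.grad.zero_()
    return grasps.detach()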
The RBO Dataset of Articulated Objects and Interactions
The International Journal of Robotics Research. SAGE Publications. August 2019.
[Abstract] [BibTex] [PDF] [Data]

@article{martin2018dataset,
  title = {The RBO Dataset of Articulated Objects and Interactions},
  author = {Mart\'{\i}n-Mart\'{\i}n, Roberto and Eppner, Clemens and Brock, Oliver},
  journal = {The International Journal of Robotics Research},
  year = {2019}
}
We present a dataset containing models of 14 articulated objects commonly found in human environments, together with RGB-D video sequences and wrench recordings of human interactions with them. The 358 interaction sequences total 67 minutes of human manipulation under varying experimental conditions (type of interaction, lighting, perspective, and background). Each interaction with an object is annotated with the ground-truth poses of its rigid parts and its kinematic state, obtained from a motion-capture system. For a subset of 78 sequences (25 minutes), we also measured the interaction wrenches. The object models contain textured three-dimensional triangle meshes of each link and their motion constraints. We provide Python scripts to download and visualize the data. The data are available at https://tu-rbo.github.io/articulated-objects/ and hosted at https://zenodo.org/record/1036660/.
Physics-Based Selection of Informative Actions for Interactive Perception
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Brisbane, Australia. May 2018.
[Abstract] [BibTex] [PDF] [Video]

@inproceedings{eppner2018physics,
  title = {Physics-Based Selection of Informative Actions for Interactive Perception},
  author = {Clemens Eppner and Roberto Mart\'{\i}n-Mart\'{\i}n and Oliver Brock},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2018}
}
Interactive perception exploits the correlation between forceful interactions and changes in the observed signals to extract task-relevant information from the sensor stream. Finding the most informative interactions to perceive complex objects, like articulated mechanisms, is challenging because the outcome of the interaction is difficult to predict. We propose a method to select the most informative action while deriving a model of articulated mechanisms that includes kinematic, geometric, and dynamic properties. Our method addresses the complexity of the action selection task based on two insights. First, we show that for a class of interactive perception methods, information gain can be approximated by the amount of motion induced in the mechanism. Second, we resort to physics simulations grounded in the real world through interactive perception to predict possible action outcomes. Our method enables the robot to autonomously select actions for interactive perception that reveal the most information, given the current knowledge of the world. This leads to improved perception and more accurate world models, finally enabling robust manipulation.
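The first insight reduces action selection to a short scoring loop; simulate and joint_displacement below are stand-ins for the paper's grounded physics simulation and its induced-motion measure:

def select_action(candidate_actions, simulate, joint_displacement):
    # Information gain ~ induced motion, so pick the action whose simulated
    # outcome moves the mechanism's joints the most.
    return max(candidate_actions, key=lambda a: joint_displacement(simulate(a)))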
Visual Detection of Opportunities to Exploit Contact in Grasping Using Contextual Multi-Armed Bandits
Clemens Eppner and Oliver Brock
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Vancouver, Canada. September 2017.
[Abstract] [BibTex] [PDF] [Video]

@inproceedings{eppner2017visual,
  title = {Visual Detection of Opportunities to Exploit Contact in Grasping Using Contextual Multi-Armed Bandits},
  author = {Clemens Eppner and Oliver Brock},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2017}
}
Environment-constrained grasping exploits beneficial interactions between hand, object, and environment to increase grasp success. Instead of focusing on the final static relationship between hand posture and object pose, this view of grasping emphasizes the need and the opportunity to select the most appropriate, contact-rich grasping motion leading up to a final static grasp configuration. This view changes the nature of the underlying planning problem: instead of planning for static contact points, we need to decide which environmental constraint (EC) to use during the grasping motion. We propose a method to make these decisions based on depth measurements so as to generate robust grasps for a large variety of objects. Our planner exploits the advantages of a soft robot hand and learns a hand-specific classifier for edge-, surface-, and wall-grasps, each exploiting a different EC. Additionally, we show how the model can be continuously improved in a contextual multi-armed bandit setting without an explicit training and test phase, enabling a robot to improve its grasping skills throughout its lifetime.
Interleaving Motion in Contact and in Free Space for Planning Under Uncertainty
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Vancouver, Canada. September 2017.
[Abstract] [BibTex] [PDF] [Video]

@inproceedings{sieverling2017interleaving,
  title = {Interleaving Motion in Contact and in Free Space for Planning Under Uncertainty},
  author = {Arne Sieverling and Clemens Eppner and Felix Wolff and Oliver Brock},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2017}
}
In this paper, we present a planner that interleaves free-space motion with motion in contact to reduce uncertainty. The planner finds such motions by growing a search tree in the combined space of collision-free and contact configurations. It reasons efficiently about the accumulated uncertainty by factoring the state into a belief over the configuration and a fully observable contact state. We show the uncertainty-reducing capabilities of the planner on a manipulation benchmark from the POMDP literature. The planner scales up to more complex problems, such as manipulation under uncertainty in a seven-dimensional configuration space. We validate our planner in simulation and on a real robot.
Learning Dexterous Manipulation for a Soft Robotic Hand from Human Demonstrations
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Daejeon, South Korea. October 2016.
[Abstract] [BibTex] [PDF] [Video]

@inproceedings{gupta2016learning,
  title = {Learning Dexterous Manipulation for a Soft Robotic Hand from Human Demonstrations},
  author = {Gupta, Abhishek and Eppner, Clemens and Levine, Sergey and Abbeel, Pieter},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2016}
}
Dexterous multi-fingered hands can accomplish fine manipulation behaviors that are infeasible with simple robotic grippers. However, sophisticated multi-fingered hands are often expensive and fragile. Low-cost soft hands offer an appealing alternative to more conventional devices, but present considerable challenges in sensing and actuation, making them difficult to apply to more complex manipulation tasks. In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks. Our method uses object-centric demonstrations, where a human demonstrates the desired motion of manipulated objects with their own hands, and the robot autonomously learns to imitate these demonstrations using reinforcement learning. We propose a novel algorithm that blends and selects a subset of the most feasible demonstrations to imitate on the hardware, and combine it with an extension of the guided policy search framework that leverages multiple demonstrations to learn generalizable neural network policies. We demonstrate our approach on the RBO Hand 2, with learned motor skills for turning a valve, manipulating an abacus, and grasping.
Probabilistic Multi-Class Segmentation for the Amazon Picking Challenge
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Daejeon, South Korea. October 2016.
[Abstract] [BibTex] [PDF] [Video]
[IROS 2016 Best Paper Award Finalist]

@inproceedings{jonschkowski2016probabilistic,
  title = {Probabilistic Multi-Class Segmentation for the Amazon Picking Challenge},
  author = {Rico Jonschkowski and Clemens Eppner and Sebastian H{\"o}fer and Roberto Mart\'{\i}n-Mart\'{\i}n and Oliver Brock},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2016}
}
We present a method for multi-class segmentation from RGB-D data in a realistic warehouse picking setting. The method computes pixel-wise probabilities and combines them to find a coherent object segmentation. It reliably segments objects in cluttered scenarios, even when objects are translucent, reflective, highly deformable, have fuzzy surfaces, or consist of loosely coupled components. The robust performance results from the exploitation of problem structure inherent to the warehouse setting. The proposed method proved its capabilities as part of our winning entry to the 2015 Amazon Picking Challenge. We present a detailed experimental analysis of the contribution of different information sources, compare our method to standard segmentation techniques, and assess possible extensions that further enhance the algorithm's capabilities. We release our software and data sets as open source.
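The probability-combination step admits a compact sketch (a naive-Bayes-style fusion, assumed here for illustration rather than taken from the released code):

import numpy as np

def segment(prob_maps):
    """prob_maps: list of (H, W, C) per-cue class probability maps."""
    combined = np.prod(np.stack(prob_maps), axis=0)   # multiply independent cues
    combined /= combined.sum(axis=-1, keepdims=True)  # renormalize per pixel
    return combined.argmax(axis=-1)                   # (H, W) class labels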
Lessons from the Amazon Picking Challenge: Four Aspects of Building Robotic Systems
Proceedings of Robotics: Science and Systems (RSS). Ann Arbor, Michigan, USA. June 2016.
[Abstract] [BibTex] [PDF] [Video] [Press: Engadget, RoboHub]
[RSS 2016 Best Systems Paper Award]

@inproceedings{eppner2016lessons,
  title = {Lessons from the Amazon Picking Challenge: Four Aspects of Building Robotic Systems},
  author = {Clemens Eppner and Sebastian H{\"o}fer and Rico Jonschkowski and Roberto Mart\'{\i}n-Mart\'{\i}n and Arne Sieverling and Vincent Wall and Oliver Brock},
  booktitle = {Proceedings of Robotics: Science and Systems (RSS)},
  year = {2016}
}
We describe the winning entry to the Amazon Picking Challenge. From the experience of building this system and competing in the Amazon Picking Challenge, we derive several conclusions: 1) We suggest characterizing robotic system building along four key aspects, each of them spanning a spectrum of solutions: modularity vs. integration, generality vs. assumptions, computation vs. embodiment, and planning vs. feedback. 2) To understand which region of each spectrum most adequately addresses which robotic problem, we must explore the full spectrum of possible approaches. To achieve this, our community should agree on key aspects that characterize the solution space of robotic systems. 3) For manipulation problems in unstructured environments, certain regions of each spectrum match the problem most adequately, and should be exploited further. This is supported by the fact that our solution deviated from the majority of the other challenge entries along each of the spectra.
Combining Model-Based Policy Search with Online Model Learning for Control of Physical Humanoids
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Stockholm, Sweden. May 2016.
[Abstract] [BibTex] [PDF] [Press: MIT Technology Review]

@inproceedings{mordatch2016combining,
  title = {Combining Model-Based Policy Search with Online Model Learning for Control of Physical Humanoids},
  author = {Igor Mordatch and Nikhil Mishra and Clemens Eppner and Pieter Abbeel},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2016}
}
We present an automatic method for interactive control of physical humanoid robots based on high-level tasks that does not require manual specification of motion trajectories or specially designed control policies. The method combines a model-based policy, trained offline in simulation, that sends high-level commands to a model-free controller, which executes these commands on the physical robot. This low-level controller simultaneously learns and adapts a local model of the dynamics online and computes optimal controls under the learned model. The high-level policy is trained using a combination of trajectory optimization and neural network learning, while considering physical limitations such as limited sensors and communication delays. The entire system runs in real time on the robot's computer and uses only on-board sensors. We demonstrate successful policy execution on a range of tasks such as leaning, hand reaching, and robust balancing behaviors atop a tilting base on the physical robot and in simulation.
Planning Grasp Strategies That Exploit Environmental Constraints
Clemens Eppner and Oliver Brock
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Seattle, USA. May 2015.
[Abstract] [BibTex] [PDF] [Video] [Code ★7]

@inproceedings{eppner2015planning,
  title = {Planning Grasp Strategies That Exploit Environmental Constraints},
  author = {Clemens Eppner and Oliver Brock},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2015}
}
There is strong evidence that robustness in human and robotic grasping can be achieved through the deliberate exploitation of contact with the environment. In contrast to this, traditional grasp planners generally disregard the opportunity to interact with the environment during grasping. In this paper, we propose a novel view of grasp planning that centers on the exploitation of environmental contact. In this view, grasps are sequences of constraint exploitations, i.e. consecutive motions constrained by features in the environment, ending in a grasp. To be able to generate such grasp plans, it becomes necessary to consider planning, perception, and control as tightly integrated components. As a result, each of these components can be simplified while still yielding reliable grasping performance. We propose a first implementation of a grasp planner based on this view and demonstrate in real-world experiments the robustness and versatility of the resulting grasp plans.
A Taxonomy of Human Grasping Behavior Suitable for Transfer to Robotic Hands
Fabian Heinemann, Steffen Puhlmann, Clemens Eppner, José Álvarez-Ruiz, Marianne Maertens and Oliver Brock
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Seattle, USA. May 2015.
[Abstract] [BibTex] [PDF]

@inproceedings{heinemann2015taxonomy,
  title = {A Taxonomy of Human Grasping Behavior Suitable for Transfer to Robotic Hands},
  author = {Heinemann, Fabian and Puhlmann, Steffen and Eppner, Clemens and {\'A}lvarez-Ruiz, Jos{\'e} and Maertens, Marianne and Brock, Oliver},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2015}
}
As a first step towards transferring human grasping capabilities to robots, we analyzed the grasping behavior of human subjects. We derived a taxonomy in order to adequately represent the observed strategies. During the analysis of the recorded data, this classification scheme helped us to obtain a better understanding of human grasping behavior. We provide support for our hypothesis that humans exploit compliant contact between the hand and the environment to compensate for uncertainty. We also show a realization of the resulting grasping strategies on a real robot. It is our belief that the detailed analysis of human grasping behavior will ultimately lead to significant advances in robot manipulation and dexterity.
Exploitation of Environmental Constraints in Human and Robotic Grasping
The International Journal of Robotics Research. SAGE Publications. April 2015.
[Abstract] [BibTex] [PDF] [Video]

@article{eppner2015exploitation,
  title = {Exploitation of Environmental Constraints in Human and Robotic Grasping},
  author = {Eppner, Clemens and Deimel, Raphael and {\'A}lvarez-Ruiz, Jos{\'e} and Maertens, Marianne and Brock, Oliver},
  journal = {The International Journal of Robotics Research},
  year = {2015}
}
We investigate the premise that robust grasping performance is enabled by exploiting constraints present in the environment. These constraints, leveraged through motion in contact, counteract uncertainty in state variables relevant to grasp success. Given this premise, grasping becomes a process of successive exploitation of environmental constraints, until a successful grasp has been established. We present support for this view found through the analysis of human grasp behavior and by showing robust robotic grasping based on constraint-exploiting grasp strategies. Furthermore, we show that it is possible to design robotic hands with inherent capabilities for the exploitation of environmental constraints.
Exploitation of Environmental Constraints in Human and Robotic Grasping
Proceedings of the 16th International Symposium on Robotics Research (ISRR). Singapore. December 2013.
[Abstract] [BibTex] [PDF]

@inproceedings{deimel2013exploitation,
  title = {Exploitation of Environmental Constraints in Human and Robotic Grasping},
  author = {Raphael Deimel and Clemens Eppner and Jos{\'e} {\'A}lvarez-Ruiz and Marianne Maertens and Oliver Brock},
  booktitle = {Proceedings of the 16th International Symposium on Robotics Research (ISRR)},
  year = {2013}
}
We investigate the premise that robust grasping performance is enabled by exploiting constraints present in the environment. These constraints, leveraged through motion in contact, counteract uncertainty in state variables relevant to grasp success. Given this premise, grasping becomes a process of successive exploitation of environmental constraints, until a successful grasp has been established. We present support for this view by analyzing human grasp behavior and by showing robust robotic grasping based on constraint-exploiting grasp strategies. Furthermore, we show that it is possible to design robotic hands with inherent capabilities for the exploitation of environmental constraints.
Grasping Unknown Objects by Exploiting Shape Adaptability and Environmental Constraints
Clemens Eppner and Oliver Brock
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Tokyo, Japan. November 2013.
[Abstract] [BibTex] [PDF]

@inproceedings{eppner2013grasping,
  title = {Grasping Unknown Objects by Exploiting Shape Adaptability and Environmental Constraints},
  author = {Eppner, Clemens and Brock, Oliver},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2013}
}
In grasping, shape adaptation between hand and object has a major influence on grasp success. In this paper, we present an approach to grasping unknown objects that explicitly considers the effect of shape adaptability to simplify perception. Shape adaptation also occurs between the hand and the environment, for example, when fingers slide across the surface of the table to pick up a small object. Our approach to grasping also considers environmental shape adaptability to select grasps with high probability of success. We validate the proposed shape-adaptability-aware grasping approach in 880 real-world grasping trials with 30 objects. Our experiments show that the explicit consideration of shape adaptability of the hand leads to robust grasping of unknown objects. Simple perception suffices to achieve this robust grasping behavior.
The Humanoid Museum Tour Guide Robotinho
Proceedings of the 18th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). Toyama, Japan. September 2009.
[Abstract] [BibTex] [PDF] [Video]

@inproceedings{faber2009humanoid,
  title = {The Humanoid Museum Tour Guide Robotinho},
  author = {Faber, Felix and Bennewitz, Maren and Eppner, Clemens and Gorog, Attila and Gonsior, Christoph and Joho, Dominik and Schreiber, Michael and Behnke, Sven},
  booktitle = {Proceedings of the 18th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)},
  year = {2009}
}
Wheeled tour guide robots have already been deployed in various museums and fairs worldwide. A key requirement for successful tour guide robots is to interact with people and to entertain them. Most previous tour guide robots, however, focused more on the navigation task involved than on natural interaction with humans. Humanoid robots, on the other hand, offer great potential for investigating intuitive, multimodal interaction between humans and machines. In this paper, we present our mobile full-body humanoid tour guide robot Robotinho. We provide mechanical and electrical details and cover perception, the integration of multiple modalities for interaction, navigation control, and system integration aspects. The multimodal interaction capabilities of Robotinho have been designed and enhanced according to questionnaires filled out by people who interacted with the robot at previous public demonstrations. We report on experiments in which untrained users interacted with the robot.
Imitation Learning with Generalized Task Descriptions
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Kobe, Japan. May 2009.
[Abstract] [BibTex] [PDF] [Video]

@inproceedings{eppner2009imitation,
  title = {Imitation Learning with Generalized Task Descriptions},
  author = {Eppner, Clemens and Sturm, J{\"u}rgen and Bennewitz, Maren and Stachniss, Cyrill and Burgard, Wolfram},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2009}
}
In this paper, we present an approach that allows a robot to observe, generalize, and reproduce tasks observed from multiple demonstrations. Motion capture data is recorded in which a human instructor manipulates a set of objects. In our approach, we learn relations between body parts of the demonstrator and objects in the scene. These relations result in a generalized task description. The problem of learning and reproducing human actions is formulated using a dynamic Bayesian network (DBN). The posteriors corresponding to the nodes of the DBN are estimated by observing objects in the scene and body parts of the demonstrator. To reproduce a task, we seek the maximum-likelihood action sequence according to the DBN. We additionally show how further constraints can be incorporated online, for example, to robustly deal with unforeseen obstacles. Experiments carried out with a real 6-DoF robotic manipulator as well as in simulation show that our approach enables a robot to reproduce a task carried out by a human demonstrator. Our approach yields a high degree of generalization illustrated by performing a pick-and-place and a whiteboard cleaning task.

Theses

Robot Grasping by Exploiting Compliance and Environmental Constraints
Clemens Eppner
Doctoral Thesis. Technische Universität Berlin. February 2018.
[Abstract] [BibTex] [PDF] [Video]

@phdthesis{eppner2018thesis,
  title = {Robot Grasping by Exploiting Compliance and Environmental Constraints},
  author = {Clemens Eppner},
  school = {Technische Universität Berlin},
  year = {2018}
}
Grasping is a crucial skill for any autonomous system that needs to alter the physical world. The complexity of robot grasping stems from the fact that any solution comprises various components: Hand design, control, perception, and planning all affect the success of a grasp. Apart from picking solutions in well-defined industrial scenarios, general grasping in unstructured environments is still an open problem. In this thesis, we exploit two general properties to devise grasp planning algorithms: the compliance of robot hands and the stiffness of the environment that surrounds an object. We view hand compliance as an enabler for local adaptability in the grasping process that does not require explicit reasoning or planning. As a result, we study compliance-aware algorithms to synthesize grasps. Exploiting hand compliance also simplifies perception, since precise geometric object models are not needed. Complementary to hand compliance is the idea of exploiting the stiffness of the environment. In real-world scenarios, objects never occur in isolation. They are situated in an environmental context: on a table, in a shelf, inside a drawer, etc. Robotic grasp strategies can benefit from contact with the environment by pulling objects to edges, pushing them against surfaces, etc. We call this principle the exploitation of environmental constraints. We present grasp planning algorithms which detect and sequence environmental constraint exploitations. We study the two ideas by focusing on the relationships between the three main constituents of the grasping problem: hand, object, and environment. We show that the interactions between adaptable hands and objects lend themselves to low-dimensional grasp actions. Based on this insight, we devise two grasp planning algorithms which map compliance modes to raw sensor signals using minimal prior knowledge. Next, we focus on the interactions between hand and environment. We show that contacting the environment can improve success in motion and grasping tasks. We conclude our investigations by considering interactions between all three factors: hand, object, and environment. We extend our grasping approach to select the most appropriate environmental constraint exploitation based on the shape of an object. Finally, we consider simple manipulation tasks that require individual finger movements. Although compliant hands pose challenges due to the difficulty in modeling and limitations in sensing, we propose an approach to learn feedback control strategies that solve these tasks. We evaluate all algorithms presented in this thesis in extensive real-world experiments, compare their assumptions, and discuss their limitations. The investigations and planning algorithms show that exploiting compliance in hands and stiffness in the environment leads to improved grasp performance.
Techniques for the Imitation of Manipulative Actions by Robots
Clemens Eppner
Diploma Thesis. Albert-Ludwigs-Universität Freiburg. November 2008.
[Abstract] [BibTex] [PDF] [Video]

@mastersthesis{eppner2008thesis,
  title = {Techniques for the Imitation of Manipulative Actions by Robots},
  author = {Clemens Eppner},
  type = {Diploma Thesis},
  school = {Albert-Ludwigs-Universität Freiburg},
  year = {2008}
}
In this thesis, we present an approach that allows a robot to observe, generalize, and reproduce tasks observed from multiple demonstrations. Motion capture data is recorded in which a human instructor manipulates a set of objects. In our approach, we learn relations between body parts of the demonstrator and objects in the scene. These relations result in a generalized task description. The problem of learning and reproducing human actions is formulated using a dynamic Bayesian network (DBN). The posteriors corresponding to the nodes of the DBN are estimated by observing objects in the scene and body parts of the demonstrator. To reproduce a task, we seek the maximum-likelihood action sequence according to the DBN. We additionally show how further constraints can be incorporated online, for example, to robustly deal with unforeseen obstacles. Experiments carried out with a real 6-DoF robotic manipulator as well as in simulation show that our approach enables a robot to reproduce a task carried out by a human demonstrator. Our approach yields a high degree of generalization illustrated by performing a pick-and-place, a pouring, and a whiteboard cleaning task.
Simulation eines Prallluftschiffs (Simulation of a Non-Rigid Airship)
Clemens Eppner
Studienarbeit (student research project). Albert-Ludwigs-Universität Freiburg. October 2007.
[Abstract] [BibTex] [PDF]

@mastersthesis{eppner2007thesis,
  title = {Simulation eines Prallluftschiffs},
  author = {Clemens Eppner},
  type = {Studienarbeit},
  school = {Albert-Ludwigs-Universität Freiburg},
  year = {2007}
}

Patents

Method for Assessing the Quality of a Robotic Grasp on 3D Deformable Objects
Isabella Huang, Yashraj Shyam Narang, Clemens Eppner, Balakumar Sundaralingam, Miles Macklin, Tucker Ryer Hermans and Dieter Fox
US20220297297. United States. September 2022.
Object Rearrangement Using Learned Implicit Collision Functions
US20220152826A1. United States. May 2022.
Grasp Determination for an Object in Clutter
Arsalan Mousavian, Clemens Eppner and Dieter Fox
US20210138655A1. United States. May 2021.
Grasp Generation Using a Variational Autoencoder
Arsalan Mousavian, Clemens Eppner and Dieter Fox
US20200361083A1. United States. November 2020.