Motivation

CCGE is a general-purpose exploration reward for dexterous manipulation that encourages diverse finger–object contact patterns by tracking state-conditioned contact coverage. It improves training efficiency and task success rate.

Simulation Tasks

Cluttered Object Singulation
Constrained Object Retrieval
In-Hand Reorientation (Elephant)
In-Hand Reorientation (Mug)
Bimanual Manipulation (Waffle)
Bimanual Manipulation (Box)
In-Hand Reorientation (Cube)
In-Hand Reorientation (Bunny)

Real-World Results

We distill the simulation oracle policy into a vision-based policy and evaluate real-world object singulation with randomized target objects. Videos are shown at 1x speed.


Abstract

Deep reinforcement learning (DRL) has achieved remarkable success in domains with well-defined reward structures, such as Atari games and locomotion. In contrast, dexterous manipulation lacks general-purpose reward formulations and typically depends on task-specific, handcrafted priors to guide hand–object interactions. We propose Contact Coverage-Guided Exploration (CCGE), a general exploration method for dexterous manipulation tasks. CCGE represents the contact state as the intersection between object surface points and predefined hand keypoints, encouraging dexterous hands to discover diverse and novel contact patterns, namely which fingers contact which object regions. It maintains a contact counter conditioned on discretized object states obtained via learned hash codes, capturing how frequently each finger interacts with different object regions. This counter is leveraged in two complementary ways: (1) to assign a count-based contact coverage reward that promotes exploration of novel contact patterns, and (2) to define an energy-based reaching reward that guides the agent toward under-explored contact regions. We evaluate CCGE on a diverse set of dexterous manipulation tasks, including cluttered object singulation, constrained object retrieval, in-hand reorientation, and bimanual manipulation. Experimental results show that CCGE substantially improves training efficiency and success rates over existing exploration methods, and that the contact patterns learned with CCGE transfer robustly to real-world robotic systems.

Method

Contact Coverage-Guided Exploration (CCGE) explicitly models hand–object interactions and consists of three key components: a learned state hashing module that discretizes continuous object states into compact state clusters, a contact coverage counter that records state-conditioned finger–region interactions, and a structured exploration reward.
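The first two components can be illustrated with a minimal sketch. It is not the authors' implementation: the learned hashing module is replaced here by a fixed random-projection sign hash (a SimHash-style stand-in), and the names `SimHashStandIn`, `ContactCoverageCounter`, the integer finger and region IDs, and the 3-dimensional object state are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class SimHashStandIn:
    """Stand-in for the learned state hashing module (assumption).

    Maps a continuous object state to a discrete code using the signs
    of fixed random projections; the real module is learned.
    """

    def __init__(self, state_dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(state_dim)]
            for _ in range(n_bits)
        ]

    def __call__(self, state):
        code = 0
        for plane in self.planes:
            dot = sum(w * x for w, x in zip(plane, state))
            code = (code << 1) | (1 if dot >= 0.0 else 0)
        return code


class ContactCoverageCounter:
    """Counts (finger, region) contacts separately for each state code."""

    def __init__(self, hasher):
        self.hasher = hasher
        self.counts = defaultdict(int)

    def update(self, object_state, contacts):
        """Record the (finger_id, region_id) pairs currently in contact."""
        code = self.hasher(object_state)
        for finger, region in contacts:
            self.counts[(code, finger, region)] += 1
        return code

    def count(self, object_state, finger, region):
        """Visit count of a (finger, region) pair under this state's code."""
        key = (self.hasher(object_state), finger, region)
        return self.counts.get(key, 0)


# Usage: two steps in the same object state.
counter = ContactCoverageCounter(SimHashStandIn(state_dim=3))
state = [0.1, -0.4, 0.9]
counter.update(state, [(0, 5), (1, 5)])  # fingers 0 and 1 touch region 5
counter.update(state, [(0, 5)])          # finger 0 touches region 5 again
```

Keying the counter on the state code as well as the finger and region is what makes the coverage state-conditioned: the same finger–region contact counts as novel again once the object has moved into a different state cluster.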

The exploration reward is decomposed into two terms. The contact coverage reward applies after contact occurs and encourages exploration of under-explored contact regions. The energy-based reaching reward applies before contact and guides the policy toward unexplored object regions, so that contact is discovered efficiently. In the overview figure, the current object state and the goal state are visualized as colored point clouds, with colors indicating different object surface regions.
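A rough sketch of how such a two-part reward could be computed from the contact counts follows. The exact formulas are not given in this section, so everything below is an assumption: the `1 / sqrt(n + 1)` bonus is a common count-based choice, the reaching term is a hypothetical novelty-weighted proximity potential with an assumed length scale `sigma`, and switching between the two terms on whether any contact exists is a simplification.

```python
import math


def novelty(count):
    """Assumed count-based bonus; shrinks as a cell is revisited."""
    return 1.0 / math.sqrt(count + 1.0)


def coverage_reward(counts, contacts):
    """Post-contact term: bonus summed over (finger, region) pairs in contact.

    counts maps (finger, region) -> visit count for the current state cluster.
    """
    return sum(novelty(counts.get(pair, 0)) for pair in contacts)


def reaching_reward(counts, keypoints, region_centers, sigma=0.05):
    """Pre-contact term (hypothetical form).

    Each hand keypoint is rewarded for being close to object regions,
    with rarely contacted regions weighted more heavily.
    """
    total = 0.0
    for finger, position in keypoints.items():
        for region, center in region_centers.items():
            weight = novelty(counts.get((finger, region), 0))
            total += weight * math.exp(-math.dist(position, center) / sigma)
    return total / max(len(keypoints), 1)


def exploration_reward(counts, contacts, keypoints, region_centers):
    """Use the coverage term once in contact, the reaching term otherwise."""
    if contacts:
        return coverage_reward(counts, contacts)
    return reaching_reward(counts, keypoints, region_centers)


# Usage: region "a" was already touched 8 times by finger 0, "b" never.
counts = {(0, "a"): 8}
centers = {"a": (0.0, 0.0, 0.0), "b": (1.0, 0.0, 0.0)}
near_a = exploration_reward(counts, [], {0: (0.1, 0.0, 0.0)}, centers)
near_b = exploration_reward(counts, [], {0: (0.9, 0.0, 0.0)}, centers)
touch_b = exploration_reward(counts, [(0, "b")], {0: (1.0, 0.0, 0.0)}, centers)
```

In this toy setup, approaching the untouched region `"b"` yields a larger reaching reward than approaching the well-explored region `"a"` at the same distance, which is the qualitative behavior the paragraph above describes.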

Figure. Overview of CCGE.

Experiment

Figure. Training curves across tasks.

We compare CCGE against baselines on a range of challenging manipulation tasks. CCGE shows (i) more stable training across random seeds, (ii) improved sample efficiency, and (iii) higher overall task performance.