Obtaining real-world data for robotics tasks is harder than for other modalities such as vision and text. The data currently available for robot learning is mostly set in static scenes and involves only a single robot. Multiple robots introduce additional difficulties compared to single-robot settings: motion planning for each agent must take the movement of the other robots into account, and task planning must decide which robot a task is assigned to, in addition to when the task should be done.
In this work, we present TAPAS, a simulated dataset containing task and motion plans for multiple robots acting asynchronously in the same workspace and modifying the same environment. The dataset covers prehensile manipulation, with a focus on various pick-and-place tasks. We demonstrate that a model trained on this data to predict the makespan of a task sequence (the total time until all tasks are completed) speeds up the search for low-makespan sequences, because candidate sequences can be ranked before the full motion plan is computed.
The TAPAS dataset consists of 204k task and motion plans on 7,000 different randomized scenarios containing up to 4 robot arms. The goal in all scenes is to move the objects to their corresponding goals using (possibly collaborative) pick and place.
We currently have 4 base scenes (from left to right, illustrated above): random, with up to 4 arms with random base orientation and pose; husky, with two arms on a husky base; conveyor, a conveyor-like setting with 4 arms where objects have to be moved from the middle to the outside; and shelf, a setting with a shelf and 2 arms.
We randomize the size of the objects and the start and the goal pose of the objects differently for each scene:
For each scene, we generate (multiple) possible task sequences for the robots, and generate a full motion plan from each sequence using our multi-agent task and motion planner (described in the paper). Each task and motion plan contains:
We use the dataset to accelerate the search for a good task plan by learning to predict the makespan of a candidate sequence. We then use the predicted makespan to rank candidate sequences, and compute the full plans in order of the ranking.
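The ranking step can be illustrated with a minimal Python sketch. It is not the planner or the learned model from the paper: the `(robot_id, task_id)` representation of a sequence, the function names, the planning `budget`, and both toy functions are assumptions made purely for illustration. The only idea taken from the text is that candidates are sorted by predicted makespan and the expensive full plan is computed in that order.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical representation: a task sequence is a list of (robot_id, task_id) pairs.
TaskSequence = Sequence[Tuple[int, int]]


def ranked_search(
    candidates: List[TaskSequence],
    predict_makespan: Callable[[TaskSequence], float],
    plan_makespan: Callable[[TaskSequence], float],
    budget: int,
) -> Tuple[Optional[TaskSequence], float, List[float]]:
    """Plan candidates in order of predicted makespan and keep the best one.

    `predict_makespan` is the cheap estimate, `plan_makespan` stands in for
    the expensive full task and motion planner. At most `budget` candidates
    are fully planned.
    """
    ordered = sorted(candidates, key=predict_makespan)
    best_seq: Optional[TaskSequence] = None
    best_makespan = float("inf")
    history: List[float] = []  # best makespan found after each full plan
    for seq in ordered[:budget]:
        makespan = plan_makespan(seq)
        if makespan < best_makespan:
            best_seq, best_makespan = seq, makespan
        history.append(best_makespan)
    return best_seq, best_makespan, history


def toy_plan_makespan(seq: TaskSequence) -> float:
    """Toy stand-in for the planner: every task takes one time unit and
    robots work in parallel, so the makespan is the largest per-robot load."""
    load = {}
    for robot, _task in seq:
        load[robot] = load.get(robot, 0) + 1
    return float(max(load.values()))


def toy_predict_makespan(seq: TaskSequence) -> float:
    """Toy stand-in for the learned predictor: sequences that use more
    robots are assumed to finish earlier."""
    return -float(len({robot for robot, _task in seq}))


one_arm = [(0, 0), (0, 1), (0, 2), (0, 3)]
two_arms = [(0, 0), (1, 1), (0, 2), (1, 3)]
four_arms = [(0, 0), (1, 1), (2, 2), (3, 3)]

best, best_makespan, history = ranked_search(
    [one_arm, two_arms, four_arms],
    toy_predict_makespan,
    toy_plan_makespan,
    budget=2,
)
```

With a useful predictor, the first few fully planned candidates already have a low makespan, so the search can stop after a small budget instead of planning candidates in random order.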
We use a transformer encoder as the backbone, followed by an MLP that predicts the makespan for a given sequence and scene.
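As a rough sketch of this kind of architecture, the PyTorch module below encodes a sequence of per-task feature vectors with a transformer encoder, pools the result, and regresses a scalar with an MLP. The token layout, feature dimension, learned positional embedding, mean pooling, and all layer sizes are assumptions chosen for illustration; the model used in the paper may differ in each of these respects.

```python
import torch
from torch import nn


class MakespanPredictor(nn.Module):
    """Transformer encoder backbone with an MLP regression head.

    Hypothetical input: `tokens` of shape (batch, seq_len, token_dim), one
    feature vector per task in the sequence (for example robot id, object
    pose, and goal pose, concatenated with scene features).
    """

    def __init__(self, token_dim: int, d_model: int = 64, nhead: int = 4,
                 num_layers: int = 2, max_len: int = 64) -> None:
        super().__init__()
        self.embed = nn.Linear(token_dim, d_model)
        # Learned positional embedding so the order of tasks is visible.
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=128, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Sequential(
            nn.Linear(d_model, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        positions = torch.arange(tokens.shape[1], device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)
        x = self.encoder(x)
        pooled = x.mean(dim=1)  # average over the sequence (padding not handled)
        return self.head(pooled).squeeze(-1)  # one makespan estimate per sequence


model = MakespanPredictor(token_dim=8)
dummy_batch = torch.randn(3, 5, 8)  # 3 sequences, 5 tasks each, 8 features per task
predicted = model(dummy_batch)
```

Such a model would typically be trained with a regression loss (for example `nn.MSELoss`) against the makespans stored with each plan in the dataset. Variable-length sequences would additionally need a padding mask, which is omitted here for brevity.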
Below, we show the resulting makespan for a scenario over time for the policy using the predicted makespan (in green) and the baseline of a random search (blue).
@article{authors,
author = {Zamora, Miguel and Hartmann, Valentin N. and Coros, Stelian},
title = {TAPAS: A Dataset for Task Assignment and Planning for Multi Agent Systems},
year = {2024},
journal = {Workshop on Data Generation for Robotics at Robotics: Science and Systems '24}
}