A Lean Ecosystem for Robot Learning at Scale

Tobias Jülg*¹, Pierre Krack*¹, Seongjin Bien*¹, Yannik Blei¹, Khaled Gamal¹, Ken Nakahara², Johannes Hechtl³, Roberto Calandra², Wolfram Burgard¹ and Florian Walter¹,⁴

¹University of Technology Nuremberg, ²TU Dresden, ³Siemens AG, ⁴Technical University of Munich
*Equal Contribution

Abstract

Vision-Language-Action models (VLAs) mark a major shift in robot learning. They replace the specialized architectures and task-tailored components of expert policies with large-scale data collection and setup-specific fine-tuning. In this machine-learning-focused workflow, which centers on models and scalable training, traditional robotics software frameworks become a bottleneck, while robot simulations offer only limited support for transitioning to and from real-world experiments. In this work, we close this gap by introducing Robot Control Stack (RCS), a lean ecosystem designed from the ground up to support research in robot learning with large-scale generalist policies. At its core, RCS features a modular and easily extensible layered architecture with a unified interface for simulated and physical robots, facilitating sim-to-real transfer. Despite its minimal footprint and few dependencies, it offers a complete feature set, enabling both real-world experiments and large-scale training in simulation. Our contribution is twofold. First, we introduce the architecture of RCS and explain its design principles. Second, we evaluate its usability and performance along the development cycle of VLA and RL policies. Our experiments also provide an extensive evaluation of Octo, OpenVLA, and Pi Zero on multiple robots and shed light on how simulation data can improve real-world policy performance.

BibTeX

TODO