🏆 IROS 2025 Challenge
Welcome to the IROS 2025 Challenge of Multimodal Robot Learning in InternUtopia and Real World! InternManip provides the official baseline and evaluation toolkit for Track: Vision-Language Manipulation in Open Tabletop Environments, featured at the IROS 2025 Workshop.
🚀 Challenge Overview
In this challenge, participants will develop end-to-end policies that fuse vision and language to control robots in a simulated, physics-based environment. Models are trained using the InternManip framework and the GenManip dataset, and are evaluated in a closed-loop benchmark on unseen private scenes.
This repository serves as the starter kit and evaluation toolkit. You can use it to:
Implement your own policy models
Train them on GenManip public data
Submit them via Docker for final evaluation
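To make the phrase "closed-loop" concrete, here is a minimal, self-contained sketch of what such an evaluation loop looks like in general. It does not use the InternManip API: `Observation`, `ToyEnv`, `ProportionalPolicy`, and `run_episode` are all hypothetical names invented for this illustration, and the one-dimensional "environment" is only a stand-in for a real simulator.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass
class Observation:
    """Hypothetical observation: a flattened 'image' plus a language instruction."""
    image: List[float]
    instruction: str


class Policy(Protocol):
    """Hypothetical policy interface (not the InternManip one)."""
    def act(self, obs: Observation) -> List[float]: ...


class ToyEnv:
    """Hypothetical 1-D stand-in for a simulator: move a gripper to a target."""

    def __init__(self, target: float = 1.0, max_steps: int = 50) -> None:
        self.target = target
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def reset(self) -> Observation:
        self.position = 0.0
        self.steps = 0
        return self._observe()

    def _observe(self) -> Observation:
        # The "image" here is just the signed distance to the target.
        return Observation(image=[self.target - self.position],
                           instruction="move to the target")

    def step(self, action: List[float]) -> Tuple[Observation, bool, bool]:
        self.position += action[0]
        self.steps += 1
        success = abs(self.target - self.position) < 0.05
        done = success or self.steps >= self.max_steps
        return self._observe(), done, success


class ProportionalPolicy:
    """Toy policy: moves a fixed fraction of the remaining distance each step.

    It ignores the instruction; a real policy would condition on it.
    """

    def __init__(self, gain: float = 0.5) -> None:
        self.gain = gain

    def act(self, obs: Observation) -> List[float]:
        return [self.gain * obs.image[0]]


def run_episode(env: ToyEnv, policy: Policy) -> bool:
    """Closed loop: every action is computed from the observation that the
    previous action produced, until the episode ends."""
    obs = env.reset()
    done, success = False, False
    while not done:
        action = policy.act(obs)
        obs, done, success = env.step(action)
    return success
```

Calling `run_episode(ToyEnv(), ProportionalPolicy())` runs one episode and returns whether the toy task succeeded. The point of the sketch is only the structure of the loop; the actual observation format, action space, and success criteria are defined by the challenge toolkit.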
📚 More Information
Details about the competition, including resources, timeline, and rewards, are available here.
🛠️ Guided Tutorial
We’ve provided a concise guided tutorial for participants, divided into three parts: Environment Setup, Local Development & Testing, and Packaging & Submission.
😄 Good luck, and we look forward to your innovations!