# 🥇 Add a New Benchmark
This guide walks you through adding a custom agent and custom evaluation benchmark to the InternManip framework.
## 1. Implement Your Model Agent
To support a new model in InternManip, define a subclass of `BaseAgent`. You must implement two core methods:

- `step()`: given an observation, returns an action.
- `reset()`: resets internal state, if needed.
### Example: Define a Custom Agent
```python
from internmanip.agent.base import BaseAgent
from internmanip.configs import AgentCfg


class MyCustomAgent(BaseAgent):
    def __init__(self, config: AgentCfg):
        super().__init__(config)
        # Custom model initialization here

    def step(self, obs):
        # Implement forward logic here
        return action

    def reset(self):
        # Optional: reset internal state
        pass
```
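Once defined, the agent can be instantiated from its config for a quick sanity check. The sketch below is illustrative only: the `agent_type` and `model_name_or_path` values mirror the config fields shown in step 4, and the observation is a placeholder whose real contents depend on your environment.

```python
from internmanip.configs import AgentCfg
from internmanip.agent.my_custom_agent import MyCustomAgent

# Illustrative config values; replace with your actual model path and settings.
config = AgentCfg(
    agent_type="custom_agent",
    model_name_or_path="path/to/model",
)

agent = MyCustomAgent(config)
agent.reset()

obs = {}  # placeholder; a real observation comes from your environment wrapper
action = agent.step(obs)
```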
### Register Your Agent

In `internmanip/agent/base.py`, register your agent in the `AgentRegistry`:
```python
class AgentRegistry(Enum):
    ...
    CUSTOM = "MyCustomAgent"

    @property
    def value(self):
        if self.name == "CUSTOM":
            from internmanip.agent.my_custom_agent import MyCustomAgent
            return MyCustomAgent
        ...
```
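Because `value` is a property, the import of `MyCustomAgent` only happens when the `CUSTOM` entry is actually accessed, which keeps your model's dependencies out of the framework's import path. A minimal sketch of the lookup, assuming the enum pattern above:

```python
# Resolve the agent class lazily through the registry and instantiate it.
agent_cls = AgentRegistry.CUSTOM.value  # triggers the deferred import
agent = agent_cls(config)               # config is an AgentCfg instance
```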
## 2. Creating a New Evaluator

To add support for a new evaluation environment, inherit from the `Evaluator` base class and implement the required methods:
```python
from typing import Any, List

from internmanip.evaluator.base import Evaluator
from internmanip.configs import EvalCfg


class CustomEvaluator(Evaluator):
    def __init__(self, config: EvalCfg):
        super().__init__(config)
        # Custom initialization logic
        ...

    @classmethod
    def _get_all_episodes_setting_data(cls, episodes_config_path) -> List[Any]:
        """Get all episodes setting data from the given path."""
        ...

    def eval(self):
        """The default entrypoint of the evaluation pipeline."""
        ...
```
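What `_get_all_episodes_setting_data()` returns is up to your benchmark. As one hedged illustration, assuming episode settings are stored as individual JSON files under `episodes_config_path` (a hypothetical layout, not one InternManip prescribes):

```python
import json
from pathlib import Path
from typing import Any, List

from internmanip.evaluator.base import Evaluator


class CustomEvaluator(Evaluator):
    @classmethod
    def _get_all_episodes_setting_data(cls, episodes_config_path) -> List[Any]:
        """Load every episode config from a directory of JSON files."""
        episode_files = sorted(Path(episodes_config_path).glob("*.json"))
        episodes = []
        for episode_file in episode_files:
            with episode_file.open() as f:
                episodes.append(json.load(f))
        return episodes
```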
## 3. Registering the Evaluator

Register the new evaluator in the `EvaluatorRegistry` under `internmanip/evaluator/base.py`:
```python
# In internmanip/evaluator/base.py
class EvaluatorRegistry(Enum):
    ...
    CUSTOM = "CustomEvaluator"  # Add new evaluator

    @property
    def value(self):
        if self.name == "CUSTOM":
            from internmanip.evaluator.custom_evaluator import CustomEvaluator
            return CustomEvaluator
        ...
```
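The exact mapping from a config's `eval_type` string to a registry entry is handled inside the framework; the sketch below only illustrates how the enum itself resolves and runs an evaluator, assuming an `eval_cfg` like the one built in step 4:

```python
# Illustrative only: resolve the evaluator class and run the pipeline.
evaluator_cls = EvaluatorRegistry["CUSTOM"].value  # deferred import, as above
evaluator = evaluator_cls(eval_cfg)
evaluator.eval()
```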
## 4. Creating Configuration Files

Create a configuration file for the new evaluator:
```python
# scripts/eval/configs/custom_agent_on_custom_bench.py
from pathlib import Path

from internmanip.configs import *

eval_cfg = EvalCfg(
    eval_type="custom_bench",  # Corresponds to the name registered in EvaluatorRegistry
    agent=AgentCfg(
        agent_type="custom_agent",  # Corresponds to the name registered in AgentRegistry
        model_name_or_path="path/to/model",
        model_kwargs={...},
        server_cfg=ServerCfg(  # Optional server configuration
            server_host="localhost",
            server_port=5000,
        ),
    ),
    env=EnvCfg(
        env_type="custom_env",  # Corresponds to the name registered in EnvWrapperRegistry
        config_path="path/to/env_config.yaml",
        env_settings=CustomEnvSettings(...),
    ),
    logging_dir="logs/eval/custom",
    distributed_cfg=DistributedCfg(  # Optional distributed configuration
        num_workers=4,
        ray_head_ip="auto",  # Use "auto" for local machine
        include_dashboard=True,
        dashboard_port=8265,
    ),
)
```
## 5. Launch the Evaluator

```bash
python scripts/eval/start_evaluator.py \
    --config scripts/eval/configs/custom_agent_on_custom_bench.py
```

Use `--distributed` for Ray-based multi-GPU evaluation, and `--server` for client-server mode.
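For example, a Ray-based multi-GPU run might be launched as below, assuming `--distributed` combines with `--config` as shown; check the script's help output for the authoritative options:

```bash
# Illustrative: distributed evaluation with Ray across multiple GPUs
python scripts/eval/start_evaluator.py \
    --config scripts/eval/configs/custom_agent_on_custom_bench.py \
    --distributed
```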