# 🥇 Add a New Benchmark

This guide walks you through adding a custom agent and a custom evaluation benchmark to the InternManip framework.

### 1. Implement Your Model Agent

To support a new model in InternManip, define a subclass of [`BaseAgent`](../../internmanip/agent/base.py). You must implement two core methods:

- `step()`: given an observation, returns an action.
- `reset()`: resets internal state, if needed.

**Example: Define a Custom Agent**

```python
from internmanip.agent.base import BaseAgent
from internmanip.configs import AgentCfg


class MyCustomAgent(BaseAgent):
    def __init__(self, config: AgentCfg):
        super().__init__(config)
        # Custom model initialization here

    def step(self, obs):
        # Implement the forward logic here and return an action
        action = ...
        return action

    def reset(self):
        # Optional: reset internal state
        pass
```

**Register Your Agent**

In `internmanip/agent/base.py`, register your agent in the `AgentRegistry`:

```python
class AgentRegistry(Enum):
    ...
    CUSTOM = "MyCustomAgent"

    @property
    def value(self):
        if self.name == "CUSTOM":
            from internmanip.agent.my_custom_agent import MyCustomAgent
            return MyCustomAgent
        ...
```

### 2. Create a New Evaluator

To support a new evaluation environment, inherit from the `Evaluator` base class and implement the required methods:

```python
from typing import Any, List

from internmanip.evaluator.base import Evaluator
from internmanip.configs import EvalCfg


class CustomEvaluator(Evaluator):
    def __init__(self, config: EvalCfg):
        super().__init__(config)
        # Custom initialization logic
        ...

    @classmethod
    def _get_all_episodes_setting_data(cls, episodes_config_path) -> List[Any]:
        """Get all episodes setting data from the given path."""
        ...

    def eval(self):
        """The default entrypoint of the evaluation pipeline."""
        ...
```

### 3. Register the Evaluator

Register the new evaluator in the `EvaluatorRegistry` in `internmanip/evaluator/base.py`:

```python
# In internmanip/evaluator/base.py
class EvaluatorRegistry(Enum):
    ...
    CUSTOM = "CustomEvaluator"  # Add the new evaluator

    @property
    def value(self):
        if self.name == "CUSTOM":
            from internmanip.evaluator.custom_evaluator import CustomEvaluator
            return CustomEvaluator
        ...
```

### 4. Create Configuration Files

Create a configuration file for the new evaluator. The `env_type` below refers to an environment wrapper registered in `EnvWrapperRegistry` (see the note at the end of this guide):

```python
# scripts/eval/configs/custom_agent_on_custom_bench.py
from internmanip.configs import *
from pathlib import Path

eval_cfg = EvalCfg(
    eval_type="custom_bench",  # Corresponds to the name registered in EvaluatorRegistry
    agent=AgentCfg(
        agent_type="custom_agent",  # Corresponds to the name registered in AgentRegistry
        model_name_or_path="path/to/model",
        model_kwargs={...},
        server_cfg=ServerCfg(  # Optional server configuration
            server_host="localhost",
            server_port=5000,
        ),
    ),
    env=EnvCfg(
        env_type="custom_env",  # Corresponds to the name registered in EnvWrapperRegistry
        config_path="path/to/env_config.yaml",
        env_settings=CustomEnvSettings(...),
    ),
    logging_dir="logs/eval/custom",
    distributed_cfg=DistributedCfg(  # Optional distributed configuration
        num_workers=4,
        ray_head_ip="auto",  # Use "auto" for a local machine
        include_dashboard=True,
        dashboard_port=8265,
    ),
)
```

### 5. Launch the Evaluator

```bash
python scripts/eval/start_evaluator.py \
    --config scripts/eval/configs/custom_agent_on_custom_bench.py
```

Use `--distributed` for Ray-based multi-GPU evaluation, and `--server` for client-server mode; both variants are sketched below.
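For reference, here is a sketch of the two launch variants mentioned above. It assumes `--distributed` and `--server` are boolean flags of `start_evaluator.py` that pick up the `distributed_cfg` and `server_cfg` blocks from the evaluation config; check the script's argument parser for the exact semantics.

```bash
# Ray-based multi-GPU evaluation (assumed to use distributed_cfg from the config above)
python scripts/eval/start_evaluator.py \
    --config scripts/eval/configs/custom_agent_on_custom_bench.py \
    --distributed

# Client-server mode (assumed to use server_cfg from the config above)
python scripts/eval/start_evaluator.py \
    --config scripts/eval/configs/custom_agent_on_custom_bench.py \
    --server
```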
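**Note: Custom Environment Wrappers**

If your benchmark also needs a custom environment wrapper (the `env_type` field in step 4), it has to be registered in `EnvWrapperRegistry` as well. The snippet below is a hypothetical sketch only: it assumes `EnvWrapperRegistry` follows the same lazy-import enum pattern as `AgentRegistry` and `EvaluatorRegistry`, and the member name, module path (`internmanip.env.custom_env`), and class name (`CustomEnvWrapper`) are placeholders rather than the actual API.

```python
# Hypothetical sketch mirroring the AgentRegistry/EvaluatorRegistry pattern above.
# Registry location, member name, module path, and class name are assumptions.
class EnvWrapperRegistry(Enum):
    ...
    CUSTOM_ENV = "CustomEnvWrapper"

    @property
    def value(self):
        if self.name == "CUSTOM_ENV":
            from internmanip.env.custom_env import CustomEnvWrapper
            return CustomEnvWrapper
        ...
```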