Robots
InternDataEngine supports multiple robot platforms for manipulation tasks. Each robot has a dedicated wrapper class for articulation control and state management.
Supported Robots
| Robot | Type | DOF | Gripper (DOF) | Arm Model |
|---|---|---|---|---|
| ARX Lift-2 | Dual-arm | 6+6 | Parallel (2) | R5a |
| Agilex Split Aloha | Dual-arm | 6+6 | Parallel (2) | Piper-100 |
| Genie-1 | Dual-arm | 7+7 | Parallel (2) | G1-120s |
| Franka FR3 | Single-arm | 7 | Panda (1) | Franka |
| Franka Robotiq85 | Single-arm | 7 | Robotiq 2F-85 (2) | Franka |
Robot End Effector and TCP Frame Visualizations
FR3 (Single-arm, Franka Panda Gripper)
FR3 end-effector head frame visualization
FR3 TCP (Tool Center Point) head frame visualization
FR3 TCP hand frame visualization
Franka Robotiq85 (Single-arm, Robotiq 2F-85 Gripper)
Franka Robotiq85 end-effector head frame visualization
Franka Robotiq85 TCP (Tool Center Point) head frame visualization
Franka Robotiq85 TCP hand frame visualization
Genie-1 (Dual-arm, G1-120s)
Genie-1 left arm end-effector head frame visualization
Genie-1 left arm TCP head frame visualization
Genie-1 left arm TCP hand frame visualization
Genie-1 right arm end-effector head frame visualization
Genie-1 right arm TCP head frame visualization
Genie-1 right arm TCP hand frame visualization
ARX Lift-2 (Dual-arm, R5a)
Lift-2 left arm end-effector head frame visualization
Lift-2 left arm TCP head frame visualization
Lift-2 left arm TCP hand frame visualization
Lift-2 right arm end-effector head frame visualization
Lift-2 right arm TCP head frame visualization
Lift-2 right arm TCP hand frame visualization
Agilex Split Aloha (Dual-arm, Piper-100)
Split Aloha right arm end-effector head frame visualization
Split Aloha right arm TCP head frame visualization
Split Aloha right arm TCP hand frame visualization
Robot Configuration
Robot configuration is split into two parts: task-level configuration (in task YAML) and robot-specific parameters (in separate robot YAML files).
Part 1: Task-Level Configuration (Task YAML)
Configure robot instance in task YAML files:
robots:
- name: "lift2" # Robot identifier
robot_config_file: workflows/simbox/core/configs/robots/lift2.yaml # Path to robot-specific config
euler: [0.0, 0.0, 90.0] # Initial orientation [roll, pitch, yaw] in degrees
ignore_substring: ["material", "table", "gso_box"] # Collision filter substringsPart 2: Robot-Specific Parameters (Robot YAML)
Define robot hardware parameters in a separate YAML file (e.g., workflows/simbox/core/configs/robots/lift2.yaml):
# Robot info
target_class: Lift2 # Python class for robot wrapper
path: "lift2/robot_invisible.usd" # USD file path relative to asset root
# CuRobo kinematics config files (one per arm for dual-arm robots)
robot_file:
- workflows/simbox/curobo/src/curobo/content/configs/robot/r5a_left_arm.yml
- workflows/simbox/curobo/src/curobo/content/configs/robot/r5a_right_arm.yml
# Gripper parameters
gripper_max_width: 0.088 # Maximum gripper opening width (meters)
gripper_min_width: 0.0 # Minimum gripper closing width (meters)
tcp_offset: 0.125 # Tool center point offset from end-effector (meters)
# Solver parameters for physics simulation
solver_position_iteration_count: 128 # Position solver iterations
solver_velocity_iteration_count: 4 # Velocity solver iterations
stabilization_threshold: 0.005 # Stabilization threshold
# Joint indices in articulation
left_joint_indices: [10, 12, 14, 16, 18, 20] # Left arm joint indices
right_joint_indices: [9, 11, 13, 15, 17, 19] # Right arm joint indices
left_gripper_indices: [23] # Left end-effector (gripper) joint index
right_gripper_indices: [21] # Right end-effector (gripper) joint index
lift_indices: [6] # Lift joint index
# End-effector paths
fl_ee_path: "lift2/lift2/fl/link6" # Front-left end-effector prim path
fr_ee_path: "lift2/lift2/fr/link6" # Front-right end-effector prim path
fl_base_path: "lift2/lift2/fl/base_link" # Front-left base prim path
fr_base_path: "lift2/lift2/fr/base_link" # Front-right base prim path
# Gripper keypoints for visualization
fl_gripper_keypoints:
tool_head: [0.135, 0.0, 0.0, 1]
tool_tail: [0.085, 0.0, 0.0, 1]
tool_side: [0.135, -0.044, 0.0, 1]
fr_gripper_keypoints:
tool_head: [0.135, 0.0, 0.0, 1]
tool_tail: [0.085, 0.0, 0.0, 1]
tool_side: [0.135, -0.044, 0.0, 1]
# Collision filter paths
fl_filter_paths: # Paths to filter from collision (gripper fingers)
- "lift2/lift2/fl/link7"
- "lift2/lift2/fl/link8"
fr_filter_paths:
- "lift2/lift2/fr/link7"
- "lift2/lift2/fr/link8"
fl_forbid_collision_paths: # Paths forbidden for self-collision
- "lift2/lift2/fl/link2"
- "lift2/lift2/fl/link3"
- "lift2/lift2/fl/link4"
- "lift2/lift2/fl/link5"
fr_forbid_collision_paths:
- "lift2/lift2/fr/link2"
- "lift2/lift2/fr/link3"
- "lift2/lift2/fr/link4"
- "lift2/lift2/fr/link5"
# Pose processing parameters
R_ee_graspnet: [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]] # Grasp rotation correction
ee_axis: "x" # End-effector approach axis (x/y/z)
# Default joint home positions (radians)
left_joint_home: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
right_joint_home: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
left_joint_home_std: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0] # Randomization std for left arm
right_joint_home_std: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0] # Randomization std for right arm
left_gripper_home: [0.044] # Left gripper default width (meters)
right_gripper_home: [0.044] # Right gripper default width (meters)
lift_home: [0.46] # Lift joint default position (meters)Understanding Robot Parameters
Task-Level Fields
- name (str): Robot identifier used in skills and cameras.
- robot_config_file (str): Path to robot-specific YAML configuration file.
- euler (list): Initial robot orientation in degrees [roll, pitch, yaw].
- ignore_substring (list): Collision filter substrings to ignore during simulation.
Robot-Specific Fields
Some fields require detailed explanation due to their importance in grasp pose processing and robot kinematics:
R_ee_graspnet
A 3×3 rotation matrix that transforms the grasp pose orientation from the pre-defined graspnet frame to the robot's end-effector frame. Our generated grasp pose follows the frame definition from GraspNet.
GraspNet gripper frame definition (source: graspnetAPI)
Source: graspnet/graspnetAPI
The following examples illustrate how R_ee_graspnet is configured for different robots. You can verify these values by comparing the GraspNet frame definition with each robot's end-effector orientation shown in the visualizations above.
Config Example:
# FR3
R_ee_graspnet: [[0.0, 0.0, -1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
# Genie-1
R_ee_graspnet: [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
# ARX Lift-2
R_ee_graspnet: [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
# Agilex Split Aloha
R_ee_graspnet: [[0.0, -1.0, 0.0], [0.0, 0.0, -1.0], [1.0, 0.0, 0.0]]fl_ee_path / fr_ee_path
The USD prim path to the end-effector link (typically the last link of the arm before the TCP).
Warning
This link frame has a fixed transformation to the TCP frame. This path should be aligned with the ee_link in the CuRobo config file.
Usage:
- Compute the end-effector pose relative to the robot base via
get_relative_transform - Define where gripper keypoints are attached
- Serve as the reference frame for TCP (Tool Center Point) calculations
Config Example:
# FR3
fl_ee_path: "fr3/panda_hand" # relative to root robot prim path
# Franka Robotiq
fl_ee_path: "arm/panda_link8"
# Agilex Split Aloha
fl_ee_path: "split_aloha_mid_360_with_piper/split_aloha_mid_360_with_piper/fl/link6"
fr_ee_path: "split_aloha_mid_360_with_piper/split_aloha_mid_360_with_piper/fr/link6"ee_axis
The axis along from end-effector origin to gripper TCP. Valid values are "x", "y", or "z".
Usage:
- Used in
pose_post_process_fnto calculate the actual EE position from TCP - Affects the 180° rotation variant generation for grasp pose diversity
Config Example:
# FR3
ee_axis: "z"
# ARX Lift-2
ee_axis: "x"
# Agilex Split Aloha
ee_axis: "z"tcp_offset
The distance from the end-effector frame origin to the Tool Center Point (TCP). The TCP is the point where the gripper fingertips meet when closed, which is the actual grasping point.
Usage:
- During grasp pose processing, the EE position is calculated as:
ee_center = tcp_center + approach_axis * (depth - tcp_offset) - This offset accounts for the physical distance between the robot's end-effector frame and the actual grasping point
fl_gripper_keypoints / fr_gripper_keypoints
3D keypoints defined in the end-effector frame for articulated object manipulation planning.
Assume the gripper is oriented upright facing the user. The keypoints are defined as follows:
tool_head: The point at the midpoint of the gripper fingertips, along the approach direction.tool_tail: The point at the midpoint of the gripper finger bases, along the same approach direction.tool_side: A point on the side of the right fingertip, used to indicate the gripper width.
Gripper keypoints visualization showing tool_head, tool_tail, and tool_side
Config Example:
# ARX Lift-2
fl_gripper_keypoints:
tool_head: [0.135, 0.0, 0.0, 1] # Gripper fingertip (approach direction)
tool_tail: [0.085, 0.0, 0.0, 1] # Gripper base
tool_side: [0.135, -0.044, 0.0, 1] # Side point for width visualizationOther Fields
- target_class (str): Python class for robot wrapper (e.g.,
Lift2,FR3). - path (str): USD file path relative to asset root.
- robot_file (list): CuRobo kinematics config file(s) - one per arm.
- gripper_max_width (float): Maximum gripper opening width (meters).
- gripper_min_width (float): Minimum gripper closing width (meters).
- solver_position_iteration_count (int): Physics solver position iterations.
- solver_velocity_iteration_count (int): Physics solver velocity iterations.
- stabilization_threshold (float): Physics stabilization threshold.
- left_joint_indices (list): Joint indices for left arm in articulation.
- right_joint_indices (list): Joint indices for right arm in articulation.
- left_gripper_indices (list): Gripper joint index for left arm.
- right_gripper_indices (list): Gripper joint index for right arm.
- lift_indices (list): Lift joint indices (for robots with lift mechanism).
- fl_base_path (str): Left arm base prim path.
- fr_base_path (str): Right arm base prim path.
- fl_filter_paths (list): Collision filter prims' paths for left arm.
- fr_filter_paths (list): Collision filter prims' paths for right arm.
- fl_forbid_collision_paths (list): Forbidden collision prims' paths for left arm.
- fr_forbid_collision_paths (list): Forbidden collision prims' paths for right arm.
- left_joint_home (list): Default joint positions for left arm (radians).
- right_joint_home (list): Default joint positions for right arm (radians).
- left_joint_home_std (list): Standard deviation for randomizing left arm home position.
- right_joint_home_std (list): Standard deviation for randomizing right arm home position.
- left_gripper_home (list): Default gripper joint value for left gripper (Isaac).
- right_gripper_home (list): Default gripper joint value for right gripper (Isaac).
- lift_home (list): Default lift joint position (meters).
Robot Wrappers
Robot wrappers (workflows/simbox/core/robots/) provide a unified interface for:
- Articulation control
- Gripper interface
- State / observation management
- Grasp pose post-processing
All concrete robots (e.g., FR3, FrankaRobotiq, Genie1, Lift2, SplitAloha) share the same TemplateRobot implementation, with differences configured through YAML files and minimal subclass code.
Template Code Example:
from copy import deepcopy
import numpy as np
from core.robots.base_robot import register_robot
from omni.isaac.core.robots.robot import Robot
from omni.isaac.core.utils.prims import create_prim, get_prim_at_path
from omni.isaac.core.utils.transformations import (
get_relative_transform,
tf_matrix_from_pose,
)
from scipy.interpolate import interp1d
@register_robot
class TemplateRobot(Robot):
"""
Template class for manipulator robots.
All important parameters should be prepared in cfg before instantiation.
The cfg is merged from: robot_config_file -> task_config_robots
"""
def __init__(self, asset_root: str, root_prim_path: str, cfg: dict, *args, **kwargs):
self.asset_root = asset_root
self.cfg = cfg
# Create prim from USD
usd_path = f"{asset_root}/{cfg['path']}"
prim_path = f"{root_prim_path}/{cfg['name']}"
create_prim(usd_path=usd_path, prim_path=prim_path)
super().__init__(prim_path, cfg["name"], *args, **kwargs)
self.robot_prim_path = prim_path
# Gripper parameters (from cfg)
self.gripper_max_width = cfg["gripper_max_width"]
self.gripper_min_width = cfg["gripper_min_width"]
# Solver parameters
self.set_solver_position_iteration_count(cfg["solver_position_iteration_count"])
self.set_stabilization_threshold(cfg["stabilization_threshold"])
self.set_solver_velocity_iteration_count(cfg["solver_velocity_iteration_count"])
# Setup from config
self._setup_joint_indices()
self._setup_paths()
self._setup_gripper_keypoints()
self._setup_collision_paths()
self._load_extra_depth(usd_path)
def initialize(self, *args, **kwargs):
super().initialize()
self._articulation_view.initialize()
self._setup_joint_velocities()
self._setup_joint_homes()
self._set_initial_positions()
def apply_action(self, joint_positions, joint_indices, *args, **kwargs):
self._articulation_view.set_joint_position_targets(joint_positions, joint_indices=joint_indices)
def get_observations(self) -> dict:
joint_state = self.get_joints_state()
qpos, qvel = joint_state.positions, joint_state.velocities
T_base_ee_fl = get_relative_transform(
get_prim_at_path(self.fl_ee_path), get_prim_at_path(self.fl_base_path)
)
T_world_base = tf_matrix_from_pose(*self.get_local_pose())
obs = self._build_observations(qpos, qvel, T_base_ee_fl, T_world_base)
return obs
def pose_post_process_fn(
self, poses, *args, lr_arm="left", grasp_scale=1, tcp_offset=None, constraints=None, **kwargs
):
if poses.shape[-2:] == (4, 4):
return poses
R_ee_graspnet = self._get_R_ee_graspnet()
n_grasps = poses.shape[0]
T_obj_tcp = np.repeat(np.eye(4)[np.newaxis, :, :], n_grasps, axis=0)
R_ee_graspnet = np.array(R_ee_graspnet)
T_obj_tcp[:, :3, :3] = np.matmul(poses[:, 4:13].reshape(-1, 3, 3), R_ee_graspnet.T)
T_obj_tcp[:, :3, 3] = poses[:, 13:16] * grasp_scale
scores = poses[:, 0]
widths = np.clip(poses[:, 1:2], self.gripper_min_width, self.gripper_max_width)
depths = poses[:, 3:4]
if tcp_offset is None:
tcp_offset = self.tcp_offset
if self._gripper_ed_func is not None:
depths = depths + self._gripper_ed_func(widths)
# ... see full implementation in workflows/simbox/core/robots/template_robot.py__init__(self, asset_root: str, root_prim_path: str, cfg: dict, *args, **kwargs)
Create a robot instance from USD and initialize all geometry and dynamics-related paths/parameters based on configuration.
Parameters:
- asset_root (str): Root directory for robot assets.
- root_prim_path (str): Root prim path where the robot is mounted in the USD stage.
- cfg (dict): Merged robot configuration (from robot YAML + task YAML).
- ***args, kwargs: Passed to the
Robotbase class.
initialize(self, *args, **kwargs)
Perform one-time initialization after the physics engine is ready, including joint velocity limits, home poses, and initial joint positions.
apply_action(self, joint_positions, joint_indices, *args, **kwargs)
Send joint position targets to the Isaac articulation. This is one of the main interfaces between upper-level controllers/policies and the robot.
Parameters:
- joint_positions (np.ndarray): Target joint positions.
- joint_indices (np.ndarray): Joint indices to control.
- ***args, kwargs: Reserved for future extensions (e.g., velocity, torque control).
get_observations(self)
Collect the robot's current state for use as observation input by upper-level policies/planning modules.
Returns:
- dict: Observation dictionary for policy/planning.
pose_post_process_fn(self, poses, *args, lr_arm="left", grasp_scale=1, tcp_offset=None, constraints=None, **kwargs)
Convert grasp poses from a graspnet-style annotations (e.g., (score, width, depth, R, t)) to robot-specific end-effector pose sets, including TCP offset, grasp width clipping, and optional spatial constraints.
Core Logic:
- If
posesis already a transformation matrix of shape(N, 4, 4), return directly. - Use
R_ee_graspnet = self._get_R_ee_graspnet()to correct the grasp rotation and constructT_obj_tcp:- Rotation:
R_tcp = poses_rot @ R_ee_graspnet^T. - Translation:
t_tcp = poses[:, 13:16] * grasp_scale.
- Rotation:
- Clip grasp widths
widthsto[gripper_min_width, gripper_max_width], and optionally correct insertion depthdepthsusing_gripper_ed_func. - Use
tcp_offsetand the end-effector axis (determined byee_axis) to transform TCP to actual EE transformT_obj_ee. - If
constraints = [axis, min_ratio, max_ratio]is provided, filter grasps along the specified axis, keeping only those within the given range. - Call
_apply_rotation_variantto generate a 180° rotation variant around the grasp axis, returning two sets of candidate grasp poses and their scores.
Parameters:
- poses (np.ndarray): Raw grasp annotations, typically shape
(N, 16). - lr_arm (str): Left/right arm marker ("left" or "right", currently mainly used by upper layers).
- grasp_scale (float): Grasp position scale factor.
- tcp_offset (float, optional): TCP offset relative to EE; defaults to
self.tcp_offsetfrom config. - constraints (list, optional): Spatial filtering constraint of the form
[axis, min_ratio, max_ratio]. - ***args, kwargs: Reserved for extensions.
Returns:
- tuple:
- First item:
np.ndarrayof EE poses, shape approximately(2N_filtered, 4, 4). - Second item:
np.ndarrayof corresponding scores, shape approximately(2N_filtered,).
- First item:






















