Installation Guide#
This page provides detailed instructions for installing InternNav in inference-only mode, such as when deploying InternVLA-N1 on your own robot or with a custom dataset. Follow the steps below to set up the environment and run inference with the model.
To reproduce the results presented in the technical report, follow this page and then complete the sections on Simulation Environments Setup, Dataset Preparation, and Training and Evaluation.
For more advanced examples, refer to these demos:
Prerequisites#
InternNav works across most hardware setups. Just note the following exceptions:
Benchmarks based on Isaac Sim, such as the VN and VLN-PE benchmarks, must run on NVIDIA RTX series GPUs (e.g., RTX 4090).
Simulation Requirements#
OS: Ubuntu 20.04/22.04
GPU Compatibility:
| GPU | Model Training & Inference | Simulation: VLN-CE | Simulation: VN | Simulation: VLN-PE |
| NVIDIA RTX Series (Driver: 535.216.01+) | ✅ | ✅ | ✅ | ✅ |
| NVIDIA V/A/H100 | ✅ | ✅ | ❌ | ❌ |
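The driver requirement in the table can be checked before installing anything else. The following is a minimal sketch, not part of InternNav: the helper names are hypothetical, and it assumes `nvidia-smi` is on the PATH and reports the driver version in the usual dotted form.

```python
import subprocess

# Minimum driver version from the table above (535.216.01), as integers.
MIN_DRIVER = (535, 216, 1)


def parse_driver_version(text):
    """Turn a dotted version string such as '535.216.01' into a tuple of ints."""
    parts = [int(part) for part in text.strip().split(".")]
    # Pad short versions (e.g. '470.57') so tuples compare component by component.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def driver_meets_minimum(version, minimum=MIN_DRIVER):
    """Return True if the given driver version is at least the minimum."""
    return parse_driver_version(version) >= minimum


def installed_driver_version():
    """Ask nvidia-smi for the driver version of the first GPU (assumes nvidia-smi exists)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0].strip()


# Example usage on a machine with an NVIDIA driver installed:
# print(driver_meets_minimum(installed_driver_version()))
```

Comparing integer tuples avoids the pitfalls of comparing version strings lexically.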
Note
We provide a flexible installation tool for users who want to use InternNav for different purposes. Users can install the training and inference environment and each simulation environment independently.
Model-Specific Requirements#
| Models | Minimum GPU Requirement (Training) | Minimum GPU Requirement (Inference) | System RAM (Train/Inference) |
| StreamVLN & InternVLA-N1 | A100 | RTX 4090 / A100 | 80GB / 24GB |
| NavDP (VN Models) | RTX 4090 / A100 | RTX 3060 / A100 | 16GB / 2GB |
| CMA (VLN-PE Small Models) | RTX 4090 / A100 | RTX 3060 / A100 | 8GB / 1GB |
Quick Installation#
Download Checkpoints#
InternVLA-N1 pretrained Checkpoints
Download our latest pretrained InternVLA-N1 checkpoint and run the following script to perform inference with visualized results. Move the checkpoint to the checkpoints directory.
DepthAnything v2 Checkpoints
Download the DepthAnything v2 pretrained checkpoint. Move the checkpoint to the checkpoints directory.
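After moving the files, a quick check can confirm that everything is where the later steps expect it. This is a minimal sketch with a hypothetical helper. The InternVLA-N1 entry name is taken from the model_path used later on this page. The DepthAnything v2 file name is not given here, so it is left for you to supply.

```python
from pathlib import Path


def missing_checkpoints(expected, root="checkpoints"):
    """Return the entries from `expected` that do not exist under `root`."""
    root_path = Path(root)
    return [name for name in expected if not (root_path / name).exists()]


# Example usage from the repository root. Add the DepthAnything v2 file name
# you downloaded to the list; it is not specified on this page.
# print(missing_checkpoints(["InternVLA-N1"]))
```

An empty list means every expected checkpoint was found.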
Verification#
InternNav adopts a client–server design to simplify model deployment and prediction.
To verify the installation of InternNav, start the model server first.
python scripts/eval/start_server.py --port 8087
The output should be:
Starting Agent Server...
Registering agents...
INFO: Started server process [18877]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:8087 (Press CTRL+C to quit)
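If the log is not visible, for example when the server runs in another terminal or container, a plain TCP connection attempt can confirm that something is listening on the port. This is a minimal sketch using only the Python standard library. The function name is hypothetical, and it only checks that the port accepts connections, not that the agent server is healthy.

```python
import socket


def server_is_listening(host="localhost", port=8087, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Example usage after starting the server with --port 8087:
# print(server_is_listening("localhost", 8087))
```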
To verify the installation of InternVLA-N1, initialize the internvla-n1 agent:
from internnav.configs.agent import AgentCfg
from internnav.utils import AgentClient

# Agent configuration: server address plus the policy settings.
agent_cfg = AgentCfg(
    server_host='localhost',
    server_port=8087,
    model_name='internvla_n1',
    ckpt_path='',
    model_settings={
        'policy_name': "InternVLAN1_Policy",
        'state_encoder': None,
        'env_num': 1,
        'sim_num': 1,
        'model_path': "checkpoints/InternVLA-N1",
        'camera_intrinsic': [[585.0, 0.0, 320.0], [0.0, 585.0, 240.0], [0.0, 0.0, 1.0]],
        'width': 640,
        'height': 480,
        'hfov': 79,
        'resize_w': 384,
        'resize_h': 384,
        'max_new_tokens': 1024,
        'num_frames': 32,
        'num_history': 8,
        'num_future_steps': 4,
        'device': 'cuda:0',
        'predict_step_nums': 32,
        'continuous_traj': True,
    }
)
# Connect to the running server and initialize the agent on it.
agent = AgentClient(agent_cfg)
The output should be something like:
Loading navdp model: NavDP_Policy_DPT_CriticSum_DAT
Pretrained: None
No pretrained weights provided, initializing randomly.
Loading checkpoint shards: 100%|██████████| 4/4 [00:03<00:00, 1.06it/s]
INFO: ::1:38332 - "POST /agent/init HTTP/1.1" 201 Created
Load a captured frame from the RealSense DS455 camera:
from scripts.iros_challenge.onsite_competition.sdk.save_obs import load_obs_from_meta
rs_meta_path = '/root/InternNav/scripts/iros_challenge/onsite_competition/captures/rs_meta.json'
fake_obs_640 = load_obs_from_meta(rs_meta_path)
fake_obs_640['instruction'] = 'go to the red car'
print(fake_obs_640['rgb'].shape, fake_obs_640['depth'].shape)
The output should be:
(480, 640, 3) (480, 640)
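When you replace the sample frame with observations from your own camera, the same shape expectations apply. The following is a minimal sketch of a sanity check. The helper is hypothetical and not part of InternNav. It assumes the observation is a dict with rgb, depth, and instruction keys, as in the example above, and that the image entries expose a shape attribute.

```python
def check_obs(obs, width=640, height=480):
    """Return a list of problems found in an observation dict (empty if none)."""
    problems = []
    rgb_shape = tuple(obs["rgb"].shape)
    depth_shape = tuple(obs["depth"].shape)
    # RGB is expected as (height, width, 3), depth as (height, width).
    if rgb_shape != (height, width, 3):
        problems.append(f"rgb shape {rgb_shape}, expected {(height, width, 3)}")
    if depth_shape != (height, width):
        problems.append(f"depth shape {depth_shape}, expected {(height, width)}")
    if not str(obs.get("instruction", "")).strip():
        problems.append("instruction is missing or empty")
    return problems


# Example usage with the frame loaded above:
# print(check_obs(fake_obs_640))
```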
Test model inference:
action = agent.step([fake_obs_640])[0]['action'][0]
print(f"Action taken: {action}")
The output should be:
============ output 1 ←←←←
s2 infer finish!!
get s2 output lock
=============== [2, 2, 2, 2] =================
Output discretized traj: [2] 0
INFO: ::1:46114 - "POST /agent/internvla_n1/step HTTP/1.1" 200 OK
Action taken: 2
Congratulations, you have now made one prediction. In this task, the agent converts its trajectory output to a discrete action. Apply this action, “turn left” (2), to the real robot controller using internnav.env.real_world_env.
Check out the real-world deployment demo video:
For more details, check out the Internvla_n1 Inference-only Demo.
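For logging, it can be convenient to turn the integer action into a readable name. The sketch below is illustrative only. This page confirms just that 2 means "turn left". The other entries follow the common Habitat/VLN-CE convention and are assumptions that should be checked against your InternNav version. The helper name is hypothetical.

```python
# Only 2 -> "turn left" is stated on this page. The remaining entries are
# ASSUMED from the common Habitat/VLN-CE action convention.
ACTION_NAMES = {
    0: "stop",          # assumption
    1: "move forward",  # assumption
    2: "turn left",     # stated on this page
    3: "turn right",    # assumption
}


def describe_action(action):
    """Return a readable name for a discrete action, or a marker if unknown."""
    return ACTION_NAMES.get(int(action), f"unknown action {action}")


# Example usage with the prediction above:
# print(describe_action(action))
```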