# Robots Sim
Strands Robots Sim is a Python library for controlling robots in simulated environments with natural language through Strands Agents. It lets you develop and test robot control strategies without physical hardware, using the same policy abstraction as Strands Robots.
The library provides two execution modes as Strands agent tools: SimEnv for full episode execution where the agent specifies a task and the policy runs to completion, and SteppedSimEnv for iterative control where the agent observes camera feedback after each batch of steps and adapts its instructions accordingly. This enables a dual-system pattern where the agent handles high-level reasoning and planning while a VLA policy handles low-level motor control.
## Getting started

### Installation

```bash
pip install strands-robots-sim

# For simulation environment dependencies (e.g. Libero)
pip install strands-robots-sim[sim]
```

### Basic usage
Section titled “Basic usage”from strands import Agentfrom strands_robots_sim import SimEnv, gr00t_inference
sim_env = SimEnv( tool_name="my_sim", env_type="libero", task_suite="libero_10", data_config="libero_10",)
agent = Agent(tools=[sim_env, gr00t_inference])
# Start inference serviceagent.tool.gr00t_inference( action="start", checkpoint_path="/data/checkpoints/model", port=8000, data_config="examples.Libero.custom_data_config:LiberoDataConfig",)
# Run a taskagent("Run the task 'pick up the red block' for 5 episodes with video recording")How it works
```mermaid
graph TD
    A[Natural Language<br/>'Pick up the red block'] --> B[Strands Agent]
    B --> C[SimEnv / SteppedSimEnv]
    C --> D[Policy Provider]
    C --> G[Simulation Environment]
    D --> F[Action Chunk]
    F --> G
    G -.->|Observation| C
    G -.->|Visual Feedback + State<br/>SteppedSimEnv only| B

    classDef input fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef agent fill:#0969da,stroke:#044289,color:#fff
    classDef policy fill:#8250df,stroke:#5a32a3,color:#fff
    classDef simulation fill:#bf8700,stroke:#875e00,color:#fff

    class A input
    class B,C agent
    class D,F policy
    class G simulation
```

The agent receives a natural language instruction and routes it to a simulation tool. The tool coordinates with a policy provider to generate action chunks, which are executed in the simulation environment. Observations flow back for the next inference cycle. In SteppedSimEnv mode, camera images and state are also returned to the agent so it can reason about progress and adapt.
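The cycle above can be sketched in plain Python. `MockPolicy` and `MockEnv` below are illustrative stand-ins, not the library's actual classes:

```python
# Minimal sketch of the inference cycle: the policy produces an action
# chunk per inference call, the environment executes it step by step, and
# the resulting observation feeds the next inference.

class MockPolicy:
    """Stand-in for a VLA policy server."""
    def get_action_chunk(self, observation, instruction, chunk_size=8):
        # A real policy would run model inference here.
        return [f"action_{i}" for i in range(chunk_size)]

class MockEnv:
    """Stand-in for the simulation environment."""
    def __init__(self):
        self.steps = 0
    def reset(self):
        self.steps = 0
        return {"camera": "frame_0", "state": [0.0] * 7}
    def step(self, action):
        self.steps += 1
        done = self.steps >= 16  # pretend the task finishes after 16 steps
        return {"camera": f"frame_{self.steps}", "state": [0.0] * 7}, done

policy, env = MockPolicy(), MockEnv()
obs, done = env.reset(), False
while not done:
    # One inference cycle: observation + instruction -> action chunk ...
    chunk = policy.get_action_chunk(obs, "pick up the red block")
    # ... which the environment consumes one step at a time.
    for action in chunk:
        obs, done = env.step(action)
        if done:
            break
print(env.steps)  # 16
```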
## Architecture

```mermaid
flowchart TB
    subgraph Agent["🤖 Strands Agent"]
        NL[Natural Language Input]
        Tools[Tool Registry]
    end

    subgraph SimTool["🦾 Simulation Tool"]
        direction TB
        SE[SimEnv:<br/>Full Episode Execution]
        SSE[SteppedSimEnv:<br/>Iterative Control]
        TM[Task Manager]
        AS[Async Executor]
    end

    subgraph Policy["🧠 Policy Layer"]
        direction TB
        PA[Policy Abstraction]
        GP[GR00T Policy]
        MP[Mock Policy]
        CP[Custom Policy]
    end

    subgraph SimLayer["🔧 Simulation Layer"]
        direction TB
        ENV[Environment Abstraction]
        SUITES[Task Suites]
        CAM[Camera Interfaces]
        STATE[State Management]
    end

    NL --> Tools
    Tools --> SE
    Tools --> SSE
    SE --> TM
    SSE --> TM
    TM --> AS
    AS --> PA
    PA --> GP
    PA --> MP
    PA --> CP
    AS --> ENV
    ENV --> SUITES
    ENV --> CAM
    ENV --> STATE

    classDef agentStyle fill:#0969da,stroke:#044289,color:#fff
    classDef toolStyle fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef policyStyle fill:#8250df,stroke:#5a32a3,color:#fff
    classDef simStyle fill:#d73a49,stroke:#a72b3a,color:#fff

    class NL,Tools agentStyle
    class SE,SSE,TM,AS toolStyle
    class PA,GP,MP,CP policyStyle
    class ENV,SUITES,CAM,STATE simStyle
```

## Execution modes

### SimEnv - full episode execution

The agent specifies a task once and the policy runs the full episode autonomously. This is the simpler mode, suited for benchmarking and well-defined tasks.
```python
from strands_robots_sim import SimEnv

sim_env = SimEnv(
    tool_name="my_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
)

agent = Agent(tools=[sim_env, gr00t_inference])

# Blocking execution
agent.tool.my_sim(
    action="execute",
    instruction="pick up the red block",
    policy_port=8000,
    max_episodes=5,
    max_steps_per_episode=200,
    record_video=True,
)

# Or async execution with status monitoring
agent.tool.my_sim(
    action="start",
    instruction="stack the blocks",
    policy_port=8000,
    max_episodes=10,
)
agent.tool.my_sim(action="status")
agent.tool.my_sim(action="stop")
```

### SteppedSimEnv - iterative agent control

The agent acts as a planner, executing a limited number of steps per call and receiving camera images and state back. It can then reason about progress, decompose complex tasks into subtasks, and adapt instructions based on what it observes.
```python
from strands_robots_sim import SteppedSimEnv

stepped_sim = SteppedSimEnv(
    tool_name="my_stepped_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
    steps_per_call=10,
    max_steps_per_episode=500,
)

agent = Agent(tools=[stepped_sim, gr00t_inference])

# Reset to a specific task
agent.tool.my_stepped_sim(
    action="reset_episode",
    task_name="KITCHEN_SCENE1_put_the_black_bowl_on_top_of_the_cabinet",
)

# Execute steps - returns camera images, state, reward, done status
agent.tool.my_stepped_sim(
    action="execute_steps",
    instruction="move gripper toward the bowl",
    policy_port=8000,
    num_steps=10,
)

# Agent observes the result and decides what to do next
agent.tool.my_stepped_sim(action="get_state")
```

In practice, you hand the full loop to the agent with a planning prompt. The agent decomposes a complex task like "pick up the block and place it in the drawer" into subtasks (locate block, grasp, lift, move to drawer, place), executes each with `execute_steps`, observes camera feedback, and adapts if something goes wrong.
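That planner loop can be sketched in plain Python. Here `execute_steps` is a stand-in for the `my_stepped_sim` tool call, not the real API:

```python
# Sketch of the subtask loop an agent runs in stepped mode.

subtasks = [
    "locate the block",
    "grasp the block",
    "lift the block",
    "move to the drawer",
    "place the block in the drawer",
]

def execute_steps(instruction, num_steps=10):
    # Stand-in: the real tool runs the policy for num_steps and returns
    # camera images, state, reward, and a done flag.
    done = instruction == subtasks[-1]
    return {"reward": 1.0 if done else 0.0, "done": done}

completed = []
for subtask in subtasks:
    result = execute_steps(subtask)
    completed.append(subtask)  # the agent would inspect camera images here
    if result["done"] or result["reward"] >= 1.0:
        break

print(len(completed))  # 5
```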
### Comparing the modes

| Feature | SimEnv | SteppedSimEnv |
|---|---|---|
| Control flow | One-shot execution | Step-by-step iteration |
| Agent feedback | Final reward only | Camera images + state per batch |
| Use case | Known tasks, benchmarking | Complex tasks requiring adaptation |
| Error recovery | None | Agent can retry with different instructions |
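The error-recovery row can be illustrated with a minimal retry loop; `execute_steps` is again a stand-in for the tool call, and the failure/success behavior is contrived for the example:

```python
# In stepped mode, a failed batch lets the agent re-issue an adapted
# instruction instead of aborting the episode.

def execute_steps(instruction):
    # Contrived stand-in: pretend the first phrasing fails and the
    # adapted one succeeds.
    return {"success": "slowly" in instruction}

instruction = "grasp the bowl"
result = {"success": False}
for attempt in range(3):
    result = execute_steps(instruction)
    if result["success"]:
        break
    # Adapt the instruction based on what the camera feedback showed.
    instruction = "slowly " + instruction

print(instruction)  # slowly grasp the bowl
```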
## Dual-system architecture

The framework implements a pattern inspired by System 1 / System 2 thinking. The Strands Agent serves as the deliberate planner (System 2): it reasons about goals, decomposes tasks, and adapts strategy based on observations. The VLA policy serves as the fast executor (System 1): it maps visual observations and language instructions to motor actions with low latency.
In SimEnv mode, System 2 fires once to specify the task and System 1 handles the rest. In SteppedSimEnv mode, the two systems collaborate iteratively: System 2 observes, plans, and issues instructions every N steps while System 1 executes the low-level control between each planning cycle.
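With the SteppedSimEnv settings used in the examples above, the planning cadence works out as follows:

```python
# How often System 2 (the agent) intervenes per episode, given the
# configuration from the SteppedSimEnv example.
steps_per_call = 10            # System 1 actions between planning cycles
max_steps_per_episode = 500

# Upper bound on planning cycles in one episode.
planning_cycles = max_steps_per_episode // steps_per_call
print(planning_cycles)  # 50
```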
## Policy and environment abstraction

The library uses the same `Policy` abstract class as Strands Robots. It ships with GR00T and mock providers, and you can add custom VLA models by subclassing `Policy`.

```python
from strands_robots_sim import create_policy

policy = create_policy(provider="groot", data_config="libero", host="localhost", port=8000)
policy = create_policy(provider="mock")
```

Simulation environments are similarly abstracted through a `SimulationEnvironment` base class. The library ships with a Libero integration, and the factory supports adding new backends:

```python
from strands_robots_sim.envs import create_simulation_environment

env = create_simulation_environment(env_type="libero", task_suite="libero_10")
```
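The factory pattern behind both abstractions can be illustrated with a small registry; the classes and method names below are illustrative stand-ins, not the library's actual `SimulationEnvironment` interface:

```python
# Registry-based factory: new backends register a class under a string
# key, and the factory instantiates by name.

class SimulationEnvironment:
    """Illustrative base class for simulation backends."""
    def reset(self): ...
    def step(self, action): ...

class LiberoEnv(SimulationEnvironment):
    def __init__(self, task_suite):
        self.task_suite = task_suite

_BACKENDS = {"libero": LiberoEnv}  # new backends would be added here

def create_env(env_type, **kwargs):
    try:
        return _BACKENDS[env_type](**kwargs)
    except KeyError:
        raise ValueError(f"Unknown env_type: {env_type!r}") from None

env = create_env("libero", task_suite="libero_10")
print(env.task_suite)  # libero_10
```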
## Supported task suites

The current Libero integration includes:

| Suite | Tasks | Description |
|---|---|---|
| `libero_spatial` | 10 | Spatial reasoning tasks |
| `libero_object` | 10 | Object-centric tasks |
| `libero_goal` | 10 | Goal-conditioned manipulation |
| `libero_10` | 10 | Standard benchmark |
| `libero_90` | 90 | Extended benchmark for comprehensive evaluation |
## Complete example

This example shows the stepped execution mode, where the agent plans and adapts:

```python
from strands import Agent
from strands_robots_sim import SteppedSimEnv, gr00t_inference

stepped_sim = SteppedSimEnv(
    tool_name="my_stepped_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
    steps_per_call=10,
    max_steps_per_episode=500,
)

agent = Agent(tools=[stepped_sim, gr00t_inference])

agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=8000,
    data_config="examples.Libero.custom_data_config:LiberoDataConfig",
)

agent("""Task: open the top drawer

You are a robot task planner. Decompose this task into subtasks and execute
them step-by-step using the my_stepped_sim tool.

1. Reset the episode with action="reset_episode"
2. For each subtask, call action="execute_steps" with the subtask as instruction
3. Observe camera images and state after each batch
4. Adapt your approach based on what you see
5. Continue until reward reaches 1.0 or the episode ends""")

agent.tool.gr00t_inference(action="stop", port=8000)
```

- GitHub repository
- PyPI package
- Strands Robots - physical robot control
- Libero
- NVIDIA Isaac GR00T