# Robots
Strands Robots is a Python library for controlling physical robots with natural language. It provides a policy abstraction layer for vision-language-action (VLA) models and a hardware abstraction layer for robot control, letting you tell a robot what to do without programming it.
The library provides a set of Strands Agents tools that handle several components of the robotics stack - from camera capture and servo calibration to policy inference and real-time control loops. An agent equipped with these tools can interpret instructions like “pick up the red block” and translate them into coordinated motor actions.
## Getting started

### Installation

```shell
pip install strands-robots
```

### Basic usage
```python
from strands import Agent
from strands_robots import Robot, gr00t_inference

robot = Robot(
    tool_name="my_arm",
    robot="so101_follower",
    cameras={
        "front": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 30},
        "wrist": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 30},
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam",
)

agent = Agent(tools=[robot, gr00t_inference])

# Start the inference service
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=5555,
    data_config="so100_dualcam",
)

# Control the robot with natural language
agent("Use my_arm to pick up the red block using GR00T policy on port 5555")
```

The Robot class is a Strands `AgentTool` that the agent can invoke directly. When the agent decides to use the robot, it calls the tool with an instruction and a policy port, and the tool handles the entire observation-inference-action loop internally.
## How it works

The system chains together three layers: a Strands Agent that interprets natural language, a policy provider that maps camera observations and instructions to action chunks, and a hardware abstraction layer that sends those actions to the physical actuators.
```mermaid
graph LR
    A[Natural Language<br/>'Pick up the red block'] --> B[Strands Agent]
    B --> C[Robot class]
    C --> D[Policy Provider]
    C --> E[Hardware Abstraction]
    D --> F[Action Chunk]
    F --> E
    E --> G[Robot Hardware]

    classDef input fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef agent fill:#0969da,stroke:#044289,color:#fff
    classDef policy fill:#8250df,stroke:#5a32a3,color:#fff
    classDef hardware fill:#bf8700,stroke:#875e00,color:#fff

    class A input
    class B,C agent
    class D,F policy
    class E,G hardware
```

Each control cycle, the Robot class captures observations (camera frames and joint states), sends them to the policy for inference, receives an action chunk, and executes those actions on the hardware.
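The cycle described above can be sketched in a few lines. `FakeHardware` and `FakePolicy` below are stand-ins so the loop structure runs on its own; the method names follow the description above, but their exact signatures in the library are assumptions:

```python
import time

# Stand-ins for the hardware and policy layers. Real implementations
# come from the library; these are illustrative only.
class FakeHardware:
    def get_observation(self):
        return {"cameras": {"front": "frame"}, "joint_states": {"j1": 0.0}}

    def send_action(self, action):
        pass  # would command the servos

class FakePolicy:
    def get_actions(self, obs, instruction):
        # A VLA policy returns a chunk of several actions per inference call
        return [{"j1": 0.1 * i} for i in range(8)]

def control_cycle(hw, policy, instruction, control_frequency=50.0):
    obs = hw.get_observation()                    # capture cameras + joint states
    chunk = policy.get_actions(obs, instruction)  # one inference call
    for action in chunk:                          # execute the whole chunk
        hw.send_action(action)
        time.sleep(1.0 / control_frequency)       # hold the loop rate
    return len(chunk)

n = control_cycle(FakeHardware(), FakePolicy(), "pick up the red block")
print(n)  # → 8
```

One inference call thus covers several control ticks, which is why the policy does not need to run at the full control frequency.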
## Architecture

```mermaid
flowchart TB
    subgraph Agent["🤖 Strands Agent"]
        NL[Natural Language Input]
        Tools[Tool Registry]
    end

    subgraph RobotTool["🦾 Robot Class"]
        direction TB
        RT[Robot Class]
        TM[Task Manager]
        AS[Async Executor]
    end

    subgraph Policy["🧠 Policy Layer"]
        direction TB
        PA[Policy Abstraction]
        GP[GR00T Policy]
        MP[Mock Policy]
        CP[Custom Policy]
    end

    subgraph Inference["⚡ Inference Service"]
        direction TB
        DC[Docker Container]
        ZMQ[ZMQ Server :5555]
        TRT[TensorRT Engine]
    end

    subgraph Hardware["🔧 Hardware Layer"]
        direction TB
        LR[LeRobot]
        CAM[Cameras]
        SERVO[Feetech Servos]
    end

    NL --> Tools
    Tools --> RT
    RT --> TM
    TM --> AS
    AS --> PA
    PA --> GP
    PA --> MP
    PA --> CP
    GP --> ZMQ
    ZMQ --> TRT
    TRT --> DC
    AS --> LR
    LR --> CAM
    LR --> SERVO

    classDef agentStyle fill:#0969da,stroke:#044289,color:#fff
    classDef robotStyle fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef policyStyle fill:#8250df,stroke:#5a32a3,color:#fff
    classDef infraStyle fill:#bf8700,stroke:#875e00,color:#fff
    classDef hwStyle fill:#d73a49,stroke:#a72b3a,color:#fff

    class NL,Tools agentStyle
    class RT,TM,AS robotStyle
    class PA,GP,MP,CP policyStyle
    class DC,ZMQ,TRT infraStyle
    class LR,CAM,SERVO hwStyle
```

### Control flow
```mermaid
sequenceDiagram
    participant User
    participant Agent as Strands Agent
    participant Robot as Robot Class
    participant Policy as Policy Provider
    participant HW as Hardware

    User->>Agent: "Pick up the red block"
    Agent->>Robot: execute(instruction, policy_port)

    loop Control Loop
        Robot->>HW: get_observation()
        HW-->>Robot: {cameras, joint_states}
        Robot->>Policy: get_actions(obs, instruction)
        Policy-->>Robot: action_chunk

        loop Action Horizon
            Robot->>HW: send_action(action)
            Note over Robot,HW: sleep
        end
    end

    Robot-->>Agent: Task completed
    Agent-->>User: "Picked up red block"
```

## Core concepts
### Robot class

The Robot class wraps a physical robot and exposes it as a Strands agent tool with four actions:
| Action | Behavior | Use case |
|---|---|---|
| `execute` | Blocks until the task completes or times out | Single-step tasks |
| `start` | Returns immediately and runs the task in the background | Long-running tasks |
| `status` | Reports current task progress | Monitoring async tasks |
| `stop` | Interrupts a running task | Emergency stop |
```python
# Blocking - agent waits for completion
agent("Use my_arm to pick up the red block using GR00T policy on port 5555")

# Async - agent can check status or do other work
agent("Start my_arm waving using GR00T on port 5555, then check status")

# Stop
agent("Stop my_arm immediately")
```

Constructor parameters:
| Parameter | Type | Description |
|---|---|---|
| `tool_name` | `str` | Name the agent uses to reference this robot |
| `robot` | `str`, `RobotConfig`, or `Robot` | Robot type string (e.g. `"so101_follower"`), a config object, or a pre-built robot instance |
| `cameras` | `dict` | Camera configuration mapping names to settings |
| `port` | `str` | Serial port for the robot (e.g. `"/dev/ttyACM0"`) |
| `data_config` | `str` | Policy data configuration name |
| `control_frequency` | `float` | Control loop frequency in Hz (default: 50) |
| `action_horizon` | `int` | Number of actions to execute per inference step (default: 8) |
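Taken together, the last two defaults set the inference cadence: at 50 Hz with an action horizon of 8, each inference result is played out over 8 × 20 ms = 160 ms, so the policy is queried roughly 6.25 times per second. A quick check:

```python
control_frequency = 50.0   # Hz (default)
action_horizon = 8         # actions executed per inference step (default)

chunk_duration = action_horizon / control_frequency  # seconds per action chunk
inference_rate = 1.0 / chunk_duration                # policy calls per second

print(chunk_duration)  # → 0.16
print(inference_rate)  # → 6.25
```

Raising `action_horizon` reduces how often the policy must run, at the cost of reacting more slowly to new observations.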
### Policy abstraction

Policies are the bridge between observations and actions. The library defines a Policy abstract class that any VLA model can implement:
```python
from strands_robots import Policy, create_policy

# GR00T policy (ships with the library)
policy = create_policy(
    provider="groot",
    data_config="so100_dualcam",
    host="localhost",
    port=5555,
)

# Mock policy (for testing without hardware)
policy = create_policy(provider="mock")
```

The `create_policy` factory ships with `"groot"` and `"mock"` providers. You can integrate additional VLA models by subclassing `Policy` and implementing `get_actions()` and `set_robot_state_keys()`.
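As a sketch of what a custom provider might look like, the following mirrors that interface with a stand-in base class. The real `Policy` lives in `strands_robots`, and the method signatures used here are assumptions for illustration:

```python
from abc import ABC, abstractmethod

# Stand-in for strands_robots.Policy, mirroring the interface described
# above. Exact signatures in the library may differ.
class Policy(ABC):
    @abstractmethod
    def get_actions(self, observation: dict, instruction: str) -> list:
        """Map an observation and instruction to a chunk of actions."""

    @abstractmethod
    def set_robot_state_keys(self, keys: list) -> None:
        """Tell the policy which joint-state keys the robot reports."""

class ConstantPolicy(Policy):
    """Toy policy that always returns the same action chunk."""

    def __init__(self, action: dict, horizon: int = 8):
        self.action = action
        self.horizon = horizon
        self.state_keys: list = []

    def set_robot_state_keys(self, keys):
        self.state_keys = list(keys)

    def get_actions(self, observation, instruction):
        return [dict(self.action) for _ in range(self.horizon)]

policy = ConstantPolicy({"shoulder_pan.pos": 0.0}, horizon=4)
policy.set_robot_state_keys(["shoulder_pan.pos"])
chunk = policy.get_actions({"front": None}, "hold still")
print(len(chunk))  # → 4
```

A real provider would run model inference inside `get_actions()` and use the state keys to assemble the policy's input vector.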
### Inference management

The `gr00t_inference` tool manages policy inference services running in Docker containers.
```python
# Start with TensorRT acceleration
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=5555,
    data_config="so100_dualcam",
    use_tensorrt=True,
)

# Check status
agent.tool.gr00t_inference(action="status", port=5555)

# Stop
agent.tool.gr00t_inference(action="stop", port=5555)
```

Available actions: `start`, `stop`, `status`, `list`, `restart`, and `find_containers`.
## Additional tools

Beyond the core robot and inference tools, the library includes several utilities that the agent can use for setup, calibration, and data collection.
### Camera tool

A camera management tool supporting OpenCV and RealSense cameras.

```python
from strands import Agent
from strands_robots import lerobot_camera

agent = Agent(tools=[lerobot_camera])

agent("Discover all connected cameras")
agent("Capture images from front and wrist cameras")
agent("Record 30 seconds of video from the front camera")
```

Actions: `discover`, `capture`, `capture_batch`, `record`, `preview`, `test`.
### Teleoperation tool

Record demonstrations for imitation learning using a leader-follower setup.

```python
from strands_robots import lerobot_teleoperate

agent.tool.lerobot_teleoperate(
    action="start",
    robot_type="so101_follower",
    robot_port="/dev/ttyACM0",
    teleop_type="so101_leader",
    teleop_port="/dev/ttyACM1",
    dataset_repo_id="my_user/cube_picking",
    dataset_single_task="Pick up the red cube",
    dataset_num_episodes=50,
)
```

Actions: `start`, `stop`, `list`, `replay`.
### Pose tool

Store, retrieve, and execute named robot poses for repeatable positioning.

```python
from strands_robots import pose_tool

agent = Agent(tools=[robot, pose_tool])

agent("Save the current position as 'home'")
agent("Go to the home pose")
agent("Move the gripper to 50%")
```

Actions: `store_pose`, `load_pose`, `list_poses`, `move_motor`, `incremental_move`, `reset_to_home`.
### Serial tool

Low-level serial communication for servos and custom protocols.

Actions: `list_ports`, `feetech_position`, `feetech_ping`, `send`, `monitor`.
## Complete example

```python
from strands import Agent
from strands_robots import Robot, gr00t_inference, lerobot_camera, pose_tool

robot = Robot(
    tool_name="orange_arm",
    robot="so101_follower",
    cameras={
        "wrist": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 15},
        "front": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 15},
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam",
)

agent = Agent(tools=[robot, gr00t_inference, lerobot_camera, pose_tool])

agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/gr00t-wave/checkpoint-300000",
    port=5555,
    data_config="so100_dualcam",
)

while True:
    user_input = input("\n> ")
    if user_input.lower() in ["exit", "quit"]:
        break
    agent(user_input)

agent.tool.gr00t_inference(action="stop", port=5555)
```

This gives you an interactive loop where you can issue natural language commands to the robot, check camera feeds, save poses, and manage inference services - all through conversation with the agent.