Project Background

Embodied intelligence is currently at the forefront of the artificial intelligence field and is widely regarded as an essential pathway toward artificial general intelligence. Developed countries, led by the United States, along with major technology players such as OpenAI and NVIDIA, have identified embodied intelligence as a critical strategic direction and are investing substantial resources in its advancement. At the same time, China's related industries and leading research institutions are also devoting significant effort to embodied intelligence. For example, AgileX Robotics, which provides hardware support for Stanford's Mobile ALOHA project, actively participates in and promotes Chinese research initiatives and the application of their findings in related areas.

In the era of large-scale models, improving model performance depends on increasing data volume and expanding model scale. Laying the groundwork for large-scale embodied intelligence models, and building an extensible platform for embodied intelligence applications, requires an open-source, large-scale, high-quality dataset for robot perception and manipulation. Much as ImageNet from Stanford University has catalyzed research in computer vision, a field integral to national strategic technological strength, we aim to leverage the "China Computing Power Network" and the "Openi" open-source ecosystem to spearhead the development of an equally impactful open-source dataset for embodied intelligence, named ARIO (All Robots In One).

Cases

Data collection

The overall data file hierarchy is collection - series - task - episode. A collection is the dataset sample submitted and uploaded in one batch; it may contain different scenes and robot types. A series is a set of data collected in the same scene with the same robot, for example a series collected by a dual-arm robot in a kitchen, and may include different tasks. A task is a specific task, such as grasping an apple; the same task can be collected repeatedly multiple times. An episode is one complete collection run of a task. Within an episode, each sensor may record data at its own frequency, but all sensors must share the same timestamp base. The sample file structure is as follows:

collection (sample of data set submitted at one time)
│  commit.yaml (Author information and declaration)
│
├─series-1 (Same scene, same robot)
│  │  calibration_1.yaml (Camera1 calibration parameters)
│  │  calibration_cam1_lidar1.yaml (Camera1 and Lidar1 calibration parameters)
│  │  IMU.pdf (IMU instruction)
│  │  information.yaml (Scene description, robot information, and other sensor numbers and information)
│  │  touch.pdf (Touch sensor instruction)
│  │  AgileX Robots instruction.pdf
│  │
│  ├─task-1 (A task. For example: pick up an apple)
│  │  │  description.yaml (Task instruction)
│  │  │  task_record.mp4 (Video of each task)
│  │  │
│  │  ├─episode-1 (A whole data collection process)
│  │  │  │  audio-1-1709554382234.aac (Audio data)
│  │  │  │  base.txt (Mobile base motion data)
│  │  │  │  IMU-1.txt (IMU data)
│  │  │  │  left_master_arm_joint-0.txt (Data of left master arm joint-0)
│  │  │  │  left_master_gripper.txt (Data of left master gripper)
│  │  │  │  left_slave_arm_joint-0.txt (Data of left slave arm joint-0)
│  │  │  │  left_slave_gripper.txt (Data of left slave gripper)
│  │  │  │  pan_tilt.txt (Data of pan-tilt unit)
│  │  │  │  right_master_arm_joint-5.txt (Data of right master arm joint-5)
│  │  │  │  right_master_gripper.txt (Data of right master gripper)
│  │  │  │  right_slave_arm_joint-5.txt (Data of right slave arm joint-5)
│  │  │  │  right_slave_gripper.txt (Data of right slave gripper)
│  │  │  │
│  │  │  ├─cam-1 (Images recorded by camera 1. Capture frame rate should be >= 30 FPS)
│  │  │  │    1709554382234.png
│  │  │  │    1709554383638.png
│  │  │  │
│  │  │  ├─cam-2
│  │  │  │    1709554382234.png
│  │  │  │    1709554383638.png
│  │  │  │
│  │  │  ├─lidar-1 (Point clouds collected by Lidar1. xyz unit: m)
│  │  │  │    1709554382234.ply
│  │  │  │    1709554382334.ply
│  │  │  │
│  │  │  ├─lidar-2
│  │  │  │    1709554382235.ply
│  │  │  │    1709554382354.ply
│  │  │  │
│  │  │  ├─rgbd-1 (Point cloud data collected by RGB-D camera 1)
│  │  │  │    1709554382234.ply
│  │  │  │    1709554383630.ply
│  │  │  │
│  │  │  └─touch-1 (Touch sensor 1 data)
│  │  │       1709554382234.txt
│  │  │
│  │  └─episode-2
│  └─task-2
│     │  description.yaml
│     │  task_record.mp4
│     │
│     └─episode-1
│
└─series-2
   │  information.yaml
   │  AgileX robot2 instruction.pdf
   │
   └─task-1
      │  description.yaml
      │  task_record.mp4
      │
      └─episode-1
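Since each sensor records at its own frequency but all files are named by a shared millisecond timestamp (e.g. 1709554382234.png), consumers of an episode typically need to pair frames from different sensors by nearest timestamp. The sketch below illustrates one way to do this; the helper names are our own and not part of the dataset specification, and it assumes the 13-digit millisecond file-name convention shown in the tree above.

```python
import re
from bisect import bisect_left
from pathlib import Path

# Assumes data files are named <13-digit ms timestamp>.<ext>, as in the
# sample tree above (e.g. cam-1/1709554382234.png).

def sensor_timestamps(sensor_dir: Path) -> list[int]:
    """Collect the millisecond timestamps encoded in a sensor folder's
    file names, returned sorted ascending; other files are ignored."""
    stamps = []
    for f in sensor_dir.iterdir():
        m = re.fullmatch(r"(\d{13})\.\w+", f.name)
        if m:
            stamps.append(int(m.group(1)))
    return sorted(stamps)

def nearest(stamps: list[int], t: int) -> int:
    """Return the timestamp in the sorted list closest to t, so that
    sensors running at different rates can be paired on the shared clock."""
    i = bisect_left(stamps, t)
    candidates = stamps[max(0, i - 1):i + 1]
    return min(candidates, key=lambda s: abs(s - t))
```

For example, a cam-1 frame at 1709554382234 could be paired with the lidar-1 cloud whose timestamp `nearest(lidar_stamps, 1709554382234)` returns, giving a cross-sensor sample on the episode's common time base.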