LiDAR Data Processing Tutorial
This notebook provides a comprehensive guide on preparing and managing LiDAR data on the Dataloop platform using its Python SDK. LiDAR (Light Detection and Ranging) data processing is essential for autonomous vehicles, robotics, and 3D mapping applications.
You'll learn how to structure LiDAR data, upload it to Dataloop, construct LiDAR video items, and add 3D annotations for comprehensive spatial analysis.
Prerequisites:
- Dataloop Account: You should have access to a Dataloop platform account.
- Python Environment: Ensure you have Python 3.7+ installed with pip.
- LiDAR Data: Access to LiDAR data files including PCD files, camera images, and calibration data.
- Local Data Structure: Your LiDAR data should be organized as described in this tutorial.
Navigate through the following sections:
- Dependencies & Setup
- Environment Setup
- LiDAR Data Preparation
- Construct LiDAR Video Item
- Upload Annotations
- Conclusion and Next Steps
For more detailed information, refer to the official Dataloop documentation:
1. Dependencies & Setup
First, let's ensure all required Python packages are installed. The cell below will install dtlpy for Dataloop SDK interaction and dtlpylidar for LiDAR-specific functionalities. It is recommended to run this to ensure you have the latest compatible versions.
!pip install dtlpy git+https://github.com/dataloop-ai-apps/dtlpy-lidar.git --upgrade --quiet2. Environment Setup
Import Required Libraries
With the dependencies installed, we will now import the necessary libraries for this tutorial.
import dtlpy as dl
from dtlpylidar.parsers.base_parser import LidarFileMappingParserConnect to Dataloop Platform
To begin, we need to connect to the Dataloop platform. If you're not already logged in, running the cell below will prompt you to do so. We will then create a new project or retrieve an existing one to work in.
Action Required: In the cell below, replace
"your-lidar-project-name"with your Dataloop project name. If a project with this name already exists, it will be retrieved; otherwise, a new one will be created.
if dl.token_expired():
dl.login()
PROJECT_NAME = "your-lidar-project-name"
# Check if the project exists, if not, create it
try:
project = dl.projects.get(project_name=PROJECT_NAME)
print(f"Successfully retrieved project: '{project.name}' (ID: {project.id})")
except dl.exceptions.NotFound:
project = dl.projects.create(project_name=PROJECT_NAME)
print(f"Successfully created project: '{project.name}' (ID: {project.id})")3. LiDAR Data Preparation
To construct a LiDAR video item, your dataset must contain three key components: PCD files for 3D scenes, camera images for 2D views, and a mapping.json file for sensor calibration. This section explains how to structure these files locally before uploading them to Dataloop.
3.1 Local File Structure
Before uploading to the Dataloop platform, your LiDAR data must be arranged in a specific local directory structure. This ensures that all components (point clouds, images) are correctly associated.
Required Directory Structure:
lidarfolder: Contains all PCD files, named numerically in sequence (e.g.,0.pcd,1.pcd, ...).framesfolder: Contains subfolders for each frame's camera images.- Subfolders must be named numerically to match the PCD files (e.g.,
0,1, ...). - Image files within each subfolder must be named sequentially (e.g.,
0.jpg,1.jpg, ...).
- Subfolders must be named numerically to match the PCD files (e.g.,
mapping.jsonfile: This file, located at the root of your data directory, contains the calibration data that links the 3D point clouds to the 2D images.
3.2 Mapping.json Schema
The mapping.json file must follow a specific schema. Below is a template demonstrating the required structure, including paths, timestamps, and camera calibration parameters (intrinsics and extrinsics).
{
"frames": {
"0": {
"path": "lidar/0.pcd", // Relative path from the mapping.json file
"timestamp": 1234567890,
"position": { // LiDAR sensor location (used as the center of the world)
"x": 0.0,
"y": 0.0,
"z": 0.0
},
"heading": { // LiDAR sensor rotation (Quaternion)
"x": 0.0,
"y": 0.0,
"z": 0.0,
"w": 1.0
},
"images": { // if no images are provided, add an empty dict
"0": {
"image_path": "frames/0/0.jpg", // Relative path from the mapping.json file
"timestamp": 1234567890,
"intrinsics": { // camera intrinsic parameters
"fx": 525.0, // Focal length in pixels
"fy": 525.0,
"cx": 319.5, // Optical center (the principal point), in pixels
"cy": 239.5
},
"extrinsics": { // camera extrinsic parameters
"translation": { // camera location in world coordinates (in relation to the lidar sensor)
"x": 0.0,
"y": 0.0,
"z": 0.0
},
"rotation": { // rotation of the camera (Quaternion)
"w": 1.0,
"x": 0.0,
"y": 0.0,
"z": 0.0
}
},
"distortion": { // distortion parameters
"k1": 0.0,
"k2": 0.0,
"p1": 0.0,
"p2": 0.0,
"k3": 0.0,
"k4": 0.0
}
}
}
}
}
}3.3 Upload Data to Dataloop
You can use your own data structured as described above, or use sample data from OSDaR23 for this tutorial.
Action Required: In the cell below, update
DATASET_NAMEwith your desired dataset name andDATA_PATHwith the local path to your data directory.
DATASET_NAME = "My-LiDAR-Dataset"
# Check if the dataset exists within the project
# if not, create it. Dataloop automatically creates a default Recipe and Ontology.
try:
dataset = project.datasets.get(dataset_name=DATASET_NAME)
print(f"Successfully retrieved dataset: '{dataset.name}' (ID: {dataset.id})")
except dl.exceptions.NotFound:
dataset = project.datasets.create(dataset_name=DATASET_NAME)
print(f"Successfully created dataset: '{dataset.name}' (ID: {dataset.id}) in project '{project.name}'.")3.4 Upload Items to Dataset
Now that we created or retrieved the dataset, we can upload items to it.
# Upload the data to the dataset
DATA_PATH = "./data/*"
dataset.items.upload(local_path=DATA_PATH)
print(f"Data uploaded successfully to dataset '{dataset.name}'")3.5 View Dataset in Platform
We can take a look at the items in the platform using the following command:
# Open the dataset in the Dataloop web interface
dataset.open_in_web()4. Construct LiDAR Video Item
After uploading your raw data files (.pcd, .jpg, mapping.json), we use the Dataloop LiDAR SDK to create a single, composite LiDAR video item. The LidarFileMappingParser reads the mapping.json file and generates a frames.json item, which represents the synchronized 3D video sequence in the Dataloop studio.
4.1 Parse Mapping File
Use the LiDAR parser to process the mapping file and create the LiDAR video item.
# Get the mapping item using filters
filters = dl.Filters(field=dl.FiltersKnownFields.NAME, values="mapping.json")
mapping_item = dataset.items.get_all_items(filters=filters)[0]
frames_item = LidarFileMappingParser().parse_data(mapping_item=mapping_item)
print(f"LiDAR Video File: ItemID: {frames_item.id}, ItemName: {frames_item.name}")5. Upload Annotations
With the frames.json video item created, you can now add annotations. The process is similar to annotating standard videos in Dataloop, but with support for 3D annotation types like cuboids. First, we need to retrieve the frames.json item we just created.
5.1 Retrieve Frames Item
Get the created frames item for annotation.
# Get the created frames item using filters
filters = dl.Filters(field=dl.FiltersKnownFields.NAME, values="frames.json")
frames_item = dataset.items.get_all_items(filters=filters)[0]
print(f"Retrieved frames item: {frames_item.name} (ID: {frames_item.id})")5.2 Define and Upload a 3D Cuboid
Define the Annotation
To create a 3D cuboid annotation, we define its geometric properties and any additional metadata. The key parameters are:
- Position: The 3D coordinates of the cuboid's center (x, y, z).
- Rotation: Euler angles for the cuboid's orientation (roll, pitch, yaw).
- Scale: The dimensions of the cuboid (width, height, depth).
The code below defines a dl.Cube3d annotation for a "person".
# Define Your 3D Annotation Parameters
label = "person"
position = [
73.9, # x coordinate
-5.9, # y coordinate
0.79 # z coordinate
]
rotation = [
0.0, # roll
0.0, # pitch
0.0 # yaw
]
scale = [
1.2, # width
0.8, # height
1.8 # depth
]
# Add Some Extra Details
attributes = {"age": "adult"}
description = "An adult person standing on the station"
# Create the 3D Cuboid Annotation
annotation_definition = dl.Cube3d(
label=label,
position=position,
scale=scale,
rotation=rotation,
attributes=attributes,
description=description
)
print(f"Created 3D cuboid annotation for label '{label}'")
print(f"Position: {position}")
print(f"Scale: {scale}")
print(f"Rotation: {rotation}")Upload the Annotation
Now that the dl.Cube3d object is defined, we use an annotation builder to add it to the frames.json item. We specify the frame range (frame_num to end_frame_num) where the annotation should appear and then upload it to the platform.
# Create annotation builder for the frames item
builder = frames_item.annotations.builder()
# Define annotation parameters
frame_num = 0
end_frame_num = 1
object_id = "0"
metadata = {"user": {"isDistracted": True}}
# Add the annotation to the builder
builder.add(
annotation_definition=annotation_definition,
frame_num=frame_num,
end_frame_num=end_frame_num,
object_id=object_id,
metadata=metadata
)
# Upload the annotations
builder.upload()
print(f"Successfully uploaded 3D cuboid annotation to frames {frame_num}-{end_frame_num}")5.3 View in Dataloop Studio
Once the annotation is uploaded, you can open the LiDAR video item in the Dataloop platform to view it in the LiDAR Studio. The following command will open the item directly in your web browser.
# Open the LiDAR video item in the Dataloop LiDAR Studio
frames_item.open_in_web()6. Conclusion and Next Steps
Congratulations! You have successfully walked through the entire process of preparing a LiDAR dataset for the Dataloop platform.
Summary of What You've Accomplished:
- Set up your Dataloop environment and project for LiDAR data processing
- Learned the required file structure for LiDAR data (PCDs, images, and mapping file)
- Understood the
mapping.jsonschema for sensor calibration and synchronization - Uploaded and structured raw LiDAR data files to Dataloop
- Constructed a composite LiDAR video item using the Dataloop LiDAR SDK
- Created and uploaded 3D cuboid annotations to the video item
- Visualized the results in the Dataloop LiDAR Studio
Next Steps:
From here, you can explore more advanced functionalities:
- Advanced Annotations: Experiment with different 3D annotation types (polylines, polygons, points)
- Custom LiDAR Parser: Try constructing a LiDAR video item using a Custom LiDAR Parser for data that doesn't fit the standard structure
- Automated Pipelines: Create automated pipelines for LiDAR data processing and annotation workflows
- Machine Learning Integration: Integrate LiDAR workflows with machine learning models for autonomous systems and 3D object detection
- Multi-Modal Analysis: Combine LiDAR data with other sensor modalities for comprehensive scene understanding
- Quality Assurance: Set up QA workflows for 3D annotation validation and consensus
For more advanced LiDAR processing capabilities, explore the full Dataloop LiDAR SDK documentation and the LiDAR tutorials in the Dataloop developer documentation.