Datasets for automotive
1. A2D2 dataset by Audi
Our dataset includes over 40,000 frames with semantic image segmentation and point cloud labels, of which over 12,000 frames also have 3D bounding-box annotations. In addition, we provide unannotated sensor data (approx. 390,000 frames) for sequences with multiple loops, recorded in three cities.
Semantic segmentation
The dataset contains 41,280 frames with semantic segmentation into 38 categories. Each pixel in the image is assigned a label describing the type of object it represents: pedestrian, car, vegetation, and so on.
Point cloud segmentation
Point cloud segmentation is produced by fusing the semantic pixel labels with the lidar point clouds, so that each 3D point receives an object-class label. This relies on accurate lidar-camera registration.
3D bounding boxes
3D bounding boxes are provided for 12,499 frames. Lidar points within the field of view of the front camera are annotated with 3D boxes. We annotate 14 driving-related classes, such as cars, pedestrians, and buses.
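The fusion of pixel labels and lidar points described above can be sketched as a simple projection step: transform each 3D point into the image, then look up the semantic label at the pixel it lands on. The camera intrinsics and label ids below are illustrative placeholders, not the actual A2D2 calibration or class map.

```python
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy) -- illustrative values only.
K = np.array([[1000.0,    0.0,  960.0],
              [   0.0, 1000.0,  604.0],
              [   0.0,    0.0,    1.0]])

# A toy semantic label image (H x W); each pixel holds a class id.
H, W = 1208, 1920
labels = np.zeros((H, W), dtype=np.uint8)
labels[:, :] = 3        # e.g. "road" everywhere
labels[0:600, :] = 7    # e.g. "sky" in the upper half

# Synthetic lidar points already in the camera frame (x right, y down, z forward).
points = np.array([[ 1.0,  0.5, 10.0],
                   [-2.0,  0.3, 15.0],
                   [ 0.0, -1.0,  8.0]])

def label_points(points, K, labels):
    """Project 3D points into the image and look up a semantic label per point."""
    uvw = (K @ points.T).T            # homogeneous image coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]     # perspective divide
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    in_img = ((u >= 0) & (u < labels.shape[1]) &
              (v >= 0) & (v < labels.shape[0]) & (points[:, 2] > 0))
    point_labels = np.full(len(points), -1, dtype=int)  # -1 = outside the image
    point_labels[in_img] = labels[v[in_img], u[in_img]]
    return point_labels

print(label_points(points, K, labels))  # [3 3 7]
```

The registration caveat from the text shows up directly here: an error in K (or in the lidar-to-camera extrinsics) shifts u and v, so points would pick up labels from neighboring objects.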
2. Ford Autonomous Vehicle Dataset
We present a challenging multi-agent seasonal dataset collected by Ford’s fleet of autonomous vehicles on different days and at different times during 2017-18. The cars were driven manually along a route in Michigan that included a variety of driving scenarios: the trip to Detroit Airport, freeways, city centers, a university campus, and suburban neighborhoods.
The dataset captures seasonal variation in weather, lighting, construction, and traffic observed in dynamic urban environments, and can help in developing robust algorithms for autonomous vehicles and multi-agent systems. Each log in the dataset is timestamped and contains raw data from all sensors, calibration values, pose trajectory, ground-truth pose, and 3D maps. All data is available in Rosbag format and can be visualized, modified, and processed with the open-source Robot Operating System (ROS).
The dataset contains data with a full resolution time stamp from the following sensors:
- Four Velodyne HDL-32E 3D lidars
- Six grayscale 1.3 MP cameras
- One grayscale 5 MP dash camera
- One Applanix POS-LV IMU
The dataset also includes:
- 3D Ground Reflectivity Maps
- 3D Point Cloud Maps
- 6 DoF Ground-truth Pose
- 3 DoF Localized Pose
- Transformation and calibration of sensors
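The sensor transformations and calibrations in the list above boil down to rigid-body transforms: a rotation plus a translation that maps points from one sensor's frame into the vehicle body frame. A minimal sketch, using an invented lidar mounting pose rather than Ford's actual calibration values:

```python
import numpy as np

# Hypothetical lidar-to-body extrinsic: a 90-degree yaw rotation plus a
# translation. The numbers are illustrative, not Ford's calibration.
yaw = np.pi / 2
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])
t = np.array([1.5, 0.0, 2.0])  # lidar mounted 1.5 m forward, 2 m above the body origin

def lidar_to_body(points):
    """Apply the rigid-body calibration transform to lidar points (N x 3)."""
    return points @ R.T + t

pts = np.array([[1.0, 0.0, 0.0]])
print(lidar_to_body(pts))  # [[1.5 1.  2. ]] -- the point rotates to +y, then shifts by t
```

Chaining such transforms (lidar to body, body to world via the ground-truth pose) is what lets the logs be registered against the 3D maps listed above.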
3. Waymo open dataset
The Waymo Open Dataset currently contains 1,950 segments, and we plan to grow it over time. Here is what is currently included:
1,950 segments of 20 seconds each, collected at 10 Hz (200,000 frames) across a variety of geographies and conditions
Sensor Data
- 1 mid-range lidar
- 4 short-range lidars
- 5 cameras (front and sides)
- Synchronized lidar and camera data
- Lidar to camera projections
- Calibration of sensors and vehicle positions
Labeled data
- Labels for 4 object classes: vehicles, pedestrians, cyclists, signs
- High-quality labels for lidar data in 1,200 segments
- 12.6M 3D bounding boxes with tracking identifiers on lidar data
- High-quality labels for camera data in 1,000 segments
- 11.8M 2D bounding boxes with tracking identifiers on camera data
Source code
github.com/waymo-research/waymo-open-dataset
4. nuScenes by nuTonomy
The nuScenes dataset is a large-scale public dataset for autonomous driving. It has the following features:
- Full sensor suite (1x lidar, 5x radar, 6x camera, IMU, GPS)
- 1000 scenes of 20 seconds each
- 1,400,000 camera images
- 390,000 lidar sweeps
- Two different cities: Boston and Singapore
- Left-hand and right-hand traffic
- Detailed map information
- Manual annotations for 23 object classes
- 1.4M 3D bounding boxes annotated at 2 Hz
- Attributes such as visibility, activity, and pose
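The headline numbers in the list above are easy to sanity-check against one another, assuming nuScenes' published sensor rates (cameras at 12 Hz, lidar at 20 Hz, keyframes annotated at 2 Hz):

```python
# Back-of-envelope check of the stated nuScenes figures.
scenes = 1000
seconds_per_scene = 20

camera_images = scenes * seconds_per_scene * 6 * 12  # 6 cameras at 12 Hz
lidar_sweeps = scenes * seconds_per_scene * 20       # 1 lidar at 20 Hz
keyframes = scenes * seconds_per_scene * 2           # keyframes annotated at 2 Hz

print(camera_images)  # 1440000 -- consistent with the ~1.4M images above
print(lidar_sweeps)   # 400000  -- close to the ~390,000 sweeps above
print(keyframes)      # 40000 annotated keyframes
```

With 1.4M boxes spread over 40,000 annotated keyframes, that averages roughly 35 labeled objects per keyframe.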
5. Lyft Level 5 Dataset
All data was collected by a fleet of Ford Fusion vehicles. There are two vehicle versions, identified in the calibration data as BETA_V0 and BETA_PLUS_PLUS. Depending on the version, each vehicle is equipped with the following sensors:
BETA_V0 LiDARS:
- One 40-beam lidar on the roof and two 40-beam lidars on the bumper.
- Each lidar has an azimuth resolution of 0.2 degrees.
- All three lidars together produce ~216,000 points per sweep at 10 Hz.
- The sensing directions of all lidars are synchronized to be the same at any given time.
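The ~216,000 points-per-sweep figure follows directly from the specs in the list above: a lidar with 0.2-degree azimuth resolution fires 360 / 0.2 = 1,800 azimuth columns per revolution, each column returning one point per beam.

```python
# Deriving the stated BETA_V0 point count from the sensor specs.
beams = 40                    # beams per lidar
azimuth_columns = 360 / 0.2   # 1800 columns per revolution at 0.2-degree resolution
lidars = 3                    # one roof lidar plus two bumper lidars

points_per_sweep = beams * azimuth_columns * lidars
print(int(points_per_sweep))  # 216000, matching the figure above
```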
BETA_V0 Cameras:
- Six wide-field-of-view (WFOV) cameras evenly cover a 360-degree field of view. Each camera has a resolution of 1224×1024 and a FOV of 70°×60°.
- One long-focal-length camera is mounted pointing slightly upward, primarily for detecting traffic lights. It has a resolution of 2048×864 and a FOV of 35°×15°.
- Each camera is synchronized with the lidar so that the lidar beam is in the center of the camera’s field of view when the camera captures an image.
BETA_PLUS_PLUS LiDARS:
The only lidar difference between BETA_V0 and BETA_PLUS_PLUS is the roof lidar, which is 64-beam on BETA_PLUS_PLUS.
Lidar synchronization is the same as on BETA_V0.
BETA_PLUS_PLUS Cameras:
- Six wide-field-of-view, high-dynamic-range cameras evenly cover a 360-degree field of view. Each camera has a resolution of 1920×1080 and a FOV of 82°×52°.
- One long-focal-length camera is mounted pointing slightly upward, primarily for detecting traffic lights. It has a resolution of 1920×1080 and a FOV of 27°×17°.
- Each camera is synchronized with the lidar so that the lidar beam is in the center of the camera’s field of view when the camera captures an image.
This dataset includes a high-quality semantic map, which provides context for reasoning about the presence and movement of agents in the scenes. The map contains over 4,000 lane segments (2,000 road segments and about 2,000 junction segments), 197 pedestrian crosswalks, 60 stop signs, 54 parking zones, 8 speed bumps, and 11 speed humps.
All map elements are registered to an underlying geometric map, which serves as a common frame of reference for all scenes in the dataset.
6. BDD100K by UC Berkeley
Video data
Explore 100,000 HD video sequences covering over 1,100 hours of driving across different times of day, weather conditions, and driving scenarios. The video sequences also include GPS location, IMU data, and timestamps.
Road object detection
2D bounding boxes are annotated on 100,000 images for bus, traffic light, traffic sign, person, bicycle, truck, motorcycle, car, train, and rider.
Instance Segmentation
Explore over 10,000 diverse images with pixel-level and rich instance-level annotations.
Drivable area
Drivable-area annotations on 100,000 images capture complex driving decisions.
Lane markings
Multiple types of lane marking annotations on 100,000 images for driving guidance.
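The detection annotations above are distributed as JSON, with one record per frame holding a list of labels. A minimal sketch of reading them, following the published BDD100K label schema (frame "name", "labels" entries with "category" and "box2d"); the sample record itself is synthetic:

```python
import json

# A synthetic BDD100K-style label file: one frame with two labeled objects.
sample = json.dumps([
    {
        "name": "example.jpg",
        "labels": [
            {"category": "car",
             "box2d": {"x1": 100.0, "y1": 200.0, "x2": 300.0, "y2": 400.0}},
            {"category": "person",
             "box2d": {"x1": 50.0, "y1": 60.0, "x2": 80.0, "y2": 160.0}},
        ],
    }
])

def boxes_by_category(label_json, category):
    """Collect 2D boxes of one category as (x1, y1, x2, y2) tuples."""
    out = []
    for frame in json.loads(label_json):
        for label in frame.get("labels", []):
            if label.get("category") == category and "box2d" in label:
                b = label["box2d"]
                out.append((b["x1"], b["y1"], b["x2"], b["y2"]))
    return out

print(boxes_by_category(sample, "car"))  # [(100.0, 200.0, 300.0, 400.0)]
```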
7. The Cityscapes Dataset by Daimler
The Cityscapes Dataset focuses on semantic understanding of urban street scenes. In the following, we give an overview on the design choices that were made to target the dataset’s focus.
Polygonal annotations
- Dense semantic segmentation
- Instance segmentation for vehicles and people
Complexity
- 30 classes
- See the class definitions for a list of all classes and the applicable labeling policy.
Diversity
- 50 cities
- Several months (spring, summer, autumn)
- Daytime
- Good / medium weather
- Manually selected frames
- A large number of dynamic objects
- Varied scene layouts
- Varied backgrounds
Size
- 5,000 images with fine annotations
- 20,000 images with coarse annotations
Metadata
- Preceding and trailing video frames: each annotated image is the 20th frame of a 30-frame video snippet (1.8 s)
- Corresponding right stereo views
- GPS coordinates
- Ego-motion data from vehicle odometry
- Outside temperature from the vehicle sensor
Extensions made by other researchers
- Bounding-box annotations of people
- Images augmented with fog and rain
Benchmark suite and evaluation server
- Pixel-level semantic labeling
- Instance-level semantic labeling
- Panoptic semantic labeling
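Cityscapes ships its dense semantic annotations as label-id images, where each pixel value is a class id from the label table (the full table comes with the cityscapesScripts tools). A minimal sketch of decoding one, using a small excerpt of the id-to-name mapping and a synthetic annotation in place of a real 1024×2048 PNG:

```python
import numpy as np

# Excerpt of the Cityscapes label table (id -> class name).
ID_TO_NAME = {0: "unlabeled", 7: "road", 23: "sky", 24: "person", 26: "car"}

# A tiny synthetic labelIds image standing in for a real annotation PNG.
label_ids = np.array([[23, 23, 23],
                      [26, 24,  7],
                      [ 7,  7,  7]], dtype=np.uint8)

def class_histogram(label_ids):
    """Count pixels per class name in a labelIds image."""
    ids, counts = np.unique(label_ids, return_counts=True)
    return {ID_TO_NAME.get(int(i), f"id_{i}"): int(c) for i, c in zip(ids, counts)}

print(class_histogram(label_ids))  # {'road': 4, 'sky': 3, 'person': 1, 'car': 1}
```

The same per-pixel decoding underlies the pixel-level benchmark above; the instance-level task additionally uses separate instance-id images to distinguish individual vehicles and people.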
8. The KITTI Vision Benchmark Suite
Our recording platform is equipped with four high-resolution video cameras, a Velodyne laser scanner, and a state-of-the-art localization system. Our benchmarks comprise 389 stereo and optical flow image pairs, stereo visual odometry sequences covering 39.2 km, and over 200,000 3D object annotations captured in cluttered scenes (up to 15 cars and 30 pedestrians per image).
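KITTI stores each Velodyne scan as a flat binary file of float32 values, four per point: x, y, z, and reflectance. A minimal loader, demonstrated on a synthetic scan written in the same layout (no actual KITTI files are assumed):

```python
import os
import tempfile
import numpy as np

def load_velodyne(path):
    """Read a KITTI Velodyne .bin file into an (N, 4) array of x, y, z, reflectance."""
    return np.fromfile(path, dtype=np.float32).reshape(-1, 4)

# Write a synthetic two-point scan in the same binary layout, then read it back.
scan = np.array([[1.0, 2.0, 3.0, 0.5],
                 [4.0, 5.0, 6.0, 0.9]], dtype=np.float32)
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as f:
    scan.tofile(f)
    path = f.name

points = load_velodyne(path)
os.remove(path)
print(points.shape)  # (2, 4)
```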
We are perhaps the strongest competence center for automotive electronics development in Russia. We are actively growing and have opened about 30 vacancies (including in the regions), such as software engineer, design engineer, and lead development engineer (DSP programmer).
We have many interesting challenges from the automakers and concerns driving the industry. If you want to grow as a specialist and learn from the best, we will be glad to see you on our team. We are also ready to share our expertise on the most important developments in automotive. Ask us any questions and we will be happy to answer and discuss.
Read more useful articles:
- Free Online Courses in Automotive, Aerospace, Robotics, and Engineering (50+)
- [Forecast] Transport of the future (short-term, medium-term, and long-term horizons)
- The best materials for hacking cars with DEF CON 2019-2020
- [Forecast] Motornet: a data exchange network for robotic vehicles
- Companies spend $16 billion on drones to capture an $8 trillion market
- Cameras or lasers
- Autonomous cars on open source
- McKinsey: Rethinking Software and Electronics Architecture in Automotive
- Another OS war is already under the hood of cars
- Program code in the car
- In a modern car, there are more lines of code than …