Datasets for automotive

1. A2D2 dataset by Audi


Our dataset includes over 40,000 frames with semantic image segmentation and point cloud labels, of which more than 12,000 frames also have 3D bounding box annotations. In addition, we provide unannotated sensor data (approx. 390,000 frames) for sequences recorded in three cities.

Semantic segmentation

The data set contains 41,280 frames with semantic segmentation into 38 categories. Each pixel in the image is assigned a label that describes the type of object it represents, such as a pedestrian, car, vegetation, etc.
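
As a rough illustration of how such labels are typically consumed: assuming the segmentation is distributed as color-coded PNGs plus a JSON file mapping hex colors to class names (the file names below are hypothetical), a per-pixel class index map can be built like this:

    import json

    import numpy as np
    from PIL import Image

    # Hypothetical file names; check the actual A2D2 download layout.
    label_img = np.array(Image.open("camera_front_center_label.png").convert("RGB"))
    with open("class_list.json") as f:
        color_to_name = json.load(f)  # assumed mapping, e.g. {"#ff0000": "Car 1", ...}

    # Pack each pixel's RGB into one integer for fast comparison.
    packed = ((label_img[..., 0].astype(np.uint32) << 16)
              | (label_img[..., 1].astype(np.uint32) << 8)
              | label_img[..., 2].astype(np.uint32))

    class_ids = np.full(packed.shape, -1, dtype=np.int32)
    for idx, hex_color in enumerate(sorted(color_to_name)):
        class_ids[packed == int(hex_color.lstrip("#"), 16)] = idx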

Point cloud

Point cloud segmentation is obtained by fusing the semantic pixel labels with the lidar point clouds, so that each 3D point is assigned an object-type label. This relies on accurate camera-lidar registration.
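
The fusion step can be sketched as follows: project each lidar point into the image with the camera intrinsics and extrinsics, then copy the pixel's class to the point. This is a minimal sketch; the matrix and variable names are placeholders, not the dataset's actual calibration fields.

    import numpy as np

    def label_points(points_xyz, class_ids, K, T_cam_from_lidar):
        """Assign per-pixel class ids to lidar points (simplified sketch).

        points_xyz: (N, 3) lidar points, class_ids: (H, W) label image,
        K: 3x3 camera intrinsics, T_cam_from_lidar: 4x4 extrinsic transform.
        """
        pts_h = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
        cam = (T_cam_from_lidar @ pts_h.T)[:3]      # points in the camera frame
        in_front = cam[2] > 0.1                     # keep points ahead of the camera
        uv = K @ cam[:, in_front]
        uv = (uv[:2] / uv[2]).round().astype(int)   # perspective projection to pixels
        h, w = class_ids.shape
        valid = (uv[0] >= 0) & (uv[0] < w) & (uv[1] >= 0) & (uv[1] < h)
        labels = np.full(len(points_xyz), -1, dtype=np.int32)
        idx = np.where(in_front)[0][valid]
        labels[idx] = class_ids[uv[1, valid], uv[0, valid]]
        return labels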

3D bounding boxes

3D bounding boxes are provided for 12,499 frames. Lidar points within the field of view of the front camera are annotated with 3D bounding boxes. We annotate 14 driving-related classes, such as cars, pedestrians, and buses.

2. Ford Autonomous Vehicle Dataset


We present a challenging multi-agent seasonal dataset collected by Ford’s fleet of autonomous vehicles on different days and times during 2017-18. The cars were driven manually along routes in Michigan that included a variety of driving scenarios, including the Detroit Airport, freeways, city centers, a university campus, and suburban neighborhoods.

The dataset captures seasonal variation in weather, lighting, construction, and traffic observed in dynamic urban environments, and can help develop robust algorithms for autonomous vehicles and multi-agent systems. Each log in the dataset is time-stamped and contains raw data from all sensors, calibration values, the pose trajectory, ground-truth pose, and 3D maps. All data is available in rosbag format and can be visualized, modified, and processed with the open-source Robot Operating System (ROS).
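
As a minimal sketch of working with such a log, the standard rosbag Python API (part of a ROS 1 installation) can iterate over messages; the file name and topic below are placeholders, not the dataset's actual topic names.

    import rosbag  # available with a ROS 1 installation

    # Placeholder file and topic names; see the dataset documentation for the real ones.
    with rosbag.Bag("ford_av_log.bag") as bag:
        for topic, msg, t in bag.read_messages(topics=["/imu/data"]):
            # Each message comes with its full-resolution timestamp `t`.
            print(t.to_sec(), topic, type(msg).__name__)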

The dataset contains time-stamped, full-resolution data from the following sensors:

  • 4 Velodyne HDL-32E 3D lidars
  • 6 grayscale 1.3 MP cameras
  • 1 grayscale 5 MP dash camera
  • 1 Applanix POS-LV IMU

The dataset also includes:

  • 3D Ground Reflectivity Maps
  • 3D Point Cloud Maps
  • 6 DoF Ground-truth Pose
  • 3 DoF Localized Pose
  • Sensor transformations and calibrations

3. Waymo Open Dataset


The Waymo Open Dataset currently contains 1,950 segments, and we plan to grow it over time. Here is what is currently included:

1,950 segments of 20 seconds each, collected at 10 Hz (200,000 frames) across diverse geographies and conditions

Sensor Data

  • 1 mid-range lidar
  • 4 short-range lidars
  • 5 cameras (front and side)
  • Synchronized lidar and camera data
  • Lidar to camera projections
  • Sensor calibrations and vehicle poses

Labeled data

  • Labels for 4 object classes – vehicles, pedestrians, cyclists, signs
  • High-quality labels for lidar data in 1,200 segments
  • 12.6M 3D bounding boxes with tracking IDs on lidar data
  • High-quality labels for camera data in 1,000 segments
  • 11.8M 2D bounding boxes with tracking IDs on camera data

Source code
github.com/waymo-research/waymo-open-dataset
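
A minimal sketch of reading one segment with the package from the repository above (the file name is a placeholder; TensorFlow and the waymo-open-dataset package must be installed):

    import tensorflow as tf
    from waymo_open_dataset import dataset_pb2 as open_dataset

    # Each segment is stored as one TFRecord file; the name below is a placeholder.
    dataset = tf.data.TFRecordDataset("segment-XXXX.tfrecord", compression_type="")
    for data in dataset:
        frame = open_dataset.Frame()
        frame.ParseFromString(bytearray(data.numpy()))
        # Each frame carries synchronized lidar and camera data plus labels.
        print(frame.context.name, len(frame.images), len(frame.laser_labels))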

4. nuScenes by nuTonomy


The nuScenes dataset is a large-scale autonomous driving dataset. It has the following features (a short loading sketch follows the list):

  • Full sensor suite (1x lidar, 5x radar, 6x cameras, IMU, GPS)
  • 1000 scenes of 20 seconds each
  • 1,400,000 camera images
  • 390,000 lidar sweeps
  • Two different cities: Boston and Singapore
  • Left and right traffic
  • Detailed map information
  • Manual annotations for 23 object classes
  • 1.4M 3D boxes annotated at 2 Hz
  • Attributes such as visibility, activity, and pose
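
A minimal loading sketch with the official nuscenes-devkit (the dataroot path and the v1.0-mini split are placeholders for a local download):

    from nuscenes.nuscenes import NuScenes

    # Placeholder version and path; requires `pip install nuscenes-devkit` and the data.
    nusc = NuScenes(version="v1.0-mini", dataroot="/data/sets/nuscenes", verbose=True)

    scene = nusc.scene[0]
    sample = nusc.get("sample", scene["first_sample_token"])
    for ann_token in sample["anns"]:
        ann = nusc.get("sample_annotation", ann_token)
        # Each 3D box carries a category, size and a visibility attribute.
        print(ann["category_name"], ann["size"], ann["visibility_token"])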

5. Lyft Level 5 Dataset


All data is collected by a fleet of Ford Fusion vehicles. There are two vehicle versions, identified in their calibration data as BETA_V0 and BETA_PLUS_PLUS. Each vehicle is equipped with the following sensors, depending on its version:

BETA_V0 LiDARS:

  • One 40-beam lidar on the roof and two 40-beam lidars on the bumper.
  • Each lidar has an azimuth resolution of 0.2 degrees.
  • All three lidars together produce ~216,000 points at a frequency of 10 Hz.
  • The sensing directions of all lidars are synchronized to be the same at any given time.

BETA_V0 Cameras:

  • Six wide-field-of-view (WFOV) cameras evenly cover a 360-degree field of view. Each camera has a resolution of 1224×1024 and a FOV of 70°×60°.
  • A single camera with a longer focal length is tilted slightly upward, primarily for detecting traffic lights. It has a resolution of 2048×864 and a FOV of 35°×15°.
  • Each camera is synchronized with the lidars so that a lidar beam is at the center of the camera’s field of view when the camera captures an image.


BETA_PLUS_PLUS LiDARS:

The only lidar difference between BETA_V0 and BETA_PLUS_PLUS is the roof lidar, which has 64 beams on BETA_PLUS_PLUS.
Lidar synchronization is the same as on BETA_V0.

BETA_PLUS_PLUS Cameras:

  • Six high-dynamic-range, wide-field-of-view cameras evenly cover a 360-degree field of view. Each camera has a resolution of 1920×1080 and a FOV of 82°×52°.
  • A single camera with a longer focal length is tilted slightly upward, primarily for detecting traffic lights. It has a resolution of 1920×1080 and a FOV of 27°×17°.
  • Each camera is synchronized with the lidars so that a lidar beam is at the center of the camera’s field of view when the camera captures an image.

This dataset includes a high-quality semantic map. A semantic map provides context for reasoning about the presence and movement of agents in the scenes. The map contains more than 4,000 lane segments (2,000 road segments and about 2,000 intersections), 197 pedestrian crossings, 60 stop signs, 54 parking zones, 8 speed bumps, and 11 speed humps.

All map elements are registered to an underlying geometric map, which provides the same frame of reference for all scenes in the dataset.
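
The release follows a nuScenes-style table format, so it can be explored with Lyft's SDK; this is only a sketch under the assumption that the lyft_dataset_sdk package is installed and that the paths below point at a local copy of the data.

    from lyft_dataset_sdk.lyftdataset import LyftDataset

    # Placeholder paths; json_path is the folder holding the scene/sample JSON tables.
    level5 = LyftDataset(data_path="/data/lyft_level5",
                         json_path="/data/lyft_level5/train_data",
                         verbose=True)

    scene = level5.scene[0]
    sample = level5.get("sample", scene["first_sample_token"])
    print(len(sample["anns"]), "annotated boxes in the first sample")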

6. UC Berkeley open-sources self-driving dataset


Video data

Explore 100,000 HD video sequences covering over 1,100 hours of driving across different times of day, weather conditions, and driving scenarios. The video sequences also include GPS locations, IMU data, and timestamps.

Road object detection

2D boxes are annotated on 100,000 images for buses, traffic lights, road signs, persons, bicycles, trucks, motorcycles, cars, trains, and riders.
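
The detection labels are distributed as JSON; as a rough sketch (the file name and field names below are assumptions about the published label format, not verified here), boxes per category can be counted like this:

    import json
    from collections import Counter

    # Placeholder label file name.
    with open("bdd100k_labels_images_train.json") as f:
        frames = json.load(f)

    counts = Counter()
    for frame in frames:
        for label in frame.get("labels", []):
            if "box2d" in label:              # keep only 2D box annotations
                counts[label["category"]] += 1

    print(counts.most_common(10))  # e.g. car, person, traffic sign, ...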

Instance segmentation

Explore over 10,000 diverse images with rich pixel-level and instance-level annotations.

Drivable area

Drivable-area annotations on 100,000 images for learning complex driving decisions.

Lane markings

Multiple types of lane marking annotations on 100,000 images for driving guidance.

7. The Cityscapes Dataset by Daimler


The Cityscapes Dataset focuses on semantic understanding of urban street scenes. In the following, we give an overview on the design choices that were made to target the dataset’s focus.

Polygonal annotations

  • Dense semantic segmentation
  • Instance segmentation for vehicles and people (see the reading sketch below)
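
A small reading sketch for one of these polygon files (the *_gtFine_polygons.json naming and the field names are assumptions based on the published annotation format):

    import json

    # Placeholder file name following the dataset's usual naming scheme.
    with open("aachen_000000_000019_gtFine_polygons.json") as f:
        ann = json.load(f)

    print(ann["imgWidth"], "x", ann["imgHeight"])
    for obj in ann["objects"]:
        # Each object is a labeled polygon; vehicle and person classes can be
        # separated out for instance segmentation.
        print(obj["label"], len(obj["polygon"]), "vertices")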

Complexity

Diversity

  • 50 cities
  • Several months (spring, summer, autumn)
  • Daytime
  • Good / medium weather
  • Manually selected frames
    • A large number of dynamic objects
    • Varying scene layouts
    • Varying backgrounds

Size

  • 5,000 annotated images with fine annotations
  • 20,000 annotated images with coarse annotations

Metadata

  • Preceding and trailing video frames. Each annotated image is the 20th image of a 30-frame video snippet (1.8 s)
  • Corresponding right stereo views
  • GPS coordinates
  • Ego-motion data from vehicle odometry
  • Outside temperature from vehicle sensor

Extensions made by other researchers

  • Bounding box annotations for people
  • Images augmented with fog and rain

Benchmark suite and evaluation server

  • Pixel-level semantic labeling
  • Instance-level semantic labeling
  • Panoptic semantic labeling

8. The KITTI Vision Benchmark Suite


Our recording platform is equipped with four high-resolution video cameras, a Velodyne laser scanner, and a state-of-the-art localization system. Our benchmarks comprise 389 stereo and optical flow image pairs, stereo visual odometry sequences covering 39.2 km, and more than 200,000 3D object annotations captured in cluttered scenarios (up to 15 cars and 30 pedestrians per image).
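
A minimal sketch of reading one velodyne scan and its object labels from the object detection benchmark (file names are placeholders; the point format of x, y, z, reflectance and the space-separated label layout follow the devkit's documentation):

    import numpy as np

    # Placeholder file names from the object detection benchmark.
    points = np.fromfile("000000.bin", dtype=np.float32).reshape(-1, 4)  # x, y, z, reflectance
    print(points.shape[0], "lidar points")

    with open("000000.txt") as f:
        for line in f:
            fields = line.split()
            # type, truncation, occlusion, alpha, 2D bbox (4), dims (3), location (3), rotation_y
            obj_type, bbox = fields[0], [float(v) for v in fields[4:8]]
            print(obj_type, bbox)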


About ITELMA

We are a large developer of automotive components. The company employs about 2,500 people, including 650 engineers.

We are perhaps the strongest competence center in Russia for the development of automotive electronics. We are now actively growing and have opened many vacancies (about 30, including in the regions), such as software engineer, design engineer, and lead development engineer (DSP programmer).

We have many interesting challenges from the automakers and suppliers driving the industry. If you want to grow as a specialist and learn from the best, we would be glad to see you on our team. We are also ready to share our expertise on the most important things happening in automotive. Ask us any questions and we will answer and discuss them.

Read more useful articles:

  • Free Online Courses in Automotive, Aerospace, Robotics, and Engineering (50+)
  • [Forecast] Transport of the future (short-term, medium-term, long-term horizons)
  • The best materials for hacking cars with DEF CON 2019-2020
  • [Forecast] Motornet – a data exchange network for robotic vehicles
  • Companies spend $16 billion on drones to capture an $8 trillion market
  • Cameras or lasers
  • Autonomous cars on open source
  • McKinsey: Rethinking Software and Electronics Architecture in Automotive
  • Another OS war is already under the hood of cars
  • Program code in the car
  • In a modern car, there are more lines of code than …
