Published on
Friday, January 29, 2021

[MOOC] Apollo Lesson 4: Perception

    Viet Anh

My notes for lesson 4 of the MOOC course Self-Driving Fundamentals: Featuring Apollo. Content: identify different perception tasks such as classification, detection, and segmentation.


The perception module is much like our brain. It receives data from car sensors such as cameras, LiDARs, and radars, and uses AI models and algorithms to recognize traffic lights and 3D objects along with their type, distance, and velocity. Other self-driving modules use these outputs to control the car. Below is the perception module of Apollo 6.0.

Perception module of Apollo 6.0.

The perception module uses computer vision to analyze images.


BGR image

Camera images are often represented in the RGB color space, although some libraries (notably OpenCV) store the channels in BGR order instead.
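The BGR/RGB distinction is easy to get wrong in practice. A minimal sketch of the conversion, using plain NumPy channel reversal (this mirrors what `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` does):

```python
import numpy as np

# A tiny 2x2 "image" in BGR channel order (as OpenCV would load it).
bgr = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Reversing the last axis swaps BGR -> RGB (and vice versa).
rgb = bgr[..., ::-1]
```

The pixel that was pure blue in BGR (`[255, 0, 0]`) becomes `[0, 0, 255]` in RGB, as expected.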

LiDAR image

LiDAR (/ˈlaɪdɑːr/, also LIDAR, LiDAR, and LADAR) is a method for measuring distances (ranging) by illuminating the target with laser light and measuring the reflection with a sensor. Differences in laser return times and wavelengths can then be used to make digital 3-D representations of the target. It has terrestrial, airborne, and mobile applications.
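The range computation behind LiDAR is a simple time-of-flight calculation: the pulse travels to the target and back, so the distance is half the round-trip time multiplied by the speed of light. A minimal sketch (function name is illustrative):

```python
# Speed of light in a vacuum, m/s.
C = 299_792_458.0

def lidar_range(round_trip_s: float) -> float:
    """Distance to target from the laser pulse's round-trip time."""
    # The pulse covers the distance twice (out and back), hence the / 2.
    return C * round_trip_s / 2.0

# A return after ~667 nanoseconds corresponds to a target ~100 m away.
distance = lidar_range(667e-9)
```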


Computer vision techniques

  • Neural Network, Convolutional Neural Network
  • Image Classification
  • Object Detection
  • Object Tracking
  • Segmentation
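The building block shared by most of these techniques is the 2D convolution. A minimal NumPy sketch of valid-mode cross-correlation (what deep-learning libraries call "convolution"), applied as a vertical-edge detector:

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode 2D cross-correlation: slide the kernel over the image."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Elementwise product of the window and kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A step image (dark left half, bright right half) and a
# Sobel-like kernel that responds to vertical edges.
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)
edges = conv2d(image, kernel)  # strong response where intensity changes
```

A CNN learns kernels like this from data instead of hand-designing them.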

Apollo Perception

The Apollo open software stack perceives obstacles, traffic lights and lanes.

The Region of Interest (ROI) filter focuses perception on the relevant areas of the HD map. Apollo applies the ROI filter to both point-cloud and image data to narrow the search scope and accelerate perception.
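A hypothetical sketch of the idea (not Apollo's actual implementation): keep only LiDAR points that fall inside an axis-aligned region derived from the map, discarding everything else before running detection.

```python
import numpy as np

def roi_filter(points: np.ndarray,
               x_lim=(0.0, 60.0),
               y_lim=(-10.0, 10.0)) -> np.ndarray:
    """Keep points inside a rectangular ROI in the vehicle frame.

    points: (N, 3) array of x (forward), y (left), z (up) coordinates.
    The limits here are illustrative; in Apollo the ROI comes from the HD map.
    """
    mask = ((points[:, 0] >= x_lim[0]) & (points[:, 0] <= x_lim[1]) &
            (points[:, 1] >= y_lim[0]) & (points[:, 1] <= y_lim[1]))
    return points[mask]

cloud = np.array([[10.0, 0.0, 0.5],    # on the road  -> kept
                  [10.0, 50.0, 0.5],   # far off-road -> dropped
                  [-5.0, 0.0, 0.5]])   # behind car   -> dropped
roi = roi_filter(cloud)
```

Downstream detectors then only process the (usually much smaller) filtered cloud.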

ROI filter on HD Map

Sensor data comparison

Cameras, LiDAR, and radar perform differently depending on the task and the weather conditions. Below is a comparison between them. We need to fuse the outputs from these sensors to achieve the best performance.

Sensor data comparison - Camera vs LiDAR vs Radar vs Sensor fusion

Sensor fusion

Two-step estimation:

  • Predict State
  • Update Measurement

The measurement update can be done in two ways: synchronously or asynchronously.

  • Synchronous fusion updates all the measurements from different sensors at the same time.
  • Asynchronous fusion updates the sensor measurements one at a time when they arrive.
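The two-step predict/update loop is exactly the structure of a Kalman filter, which is the standard tool for this kind of fusion. A minimal 1D constant-velocity sketch (class and parameter names are illustrative, not Apollo's API); asynchronous fusion simply calls `update()` with each sensor's noise variance whenever that sensor's measurement arrives:

```python
import numpy as np

class KalmanFilter1D:
    """Tracks [position, velocity] from noisy position measurements."""

    def __init__(self):
        self.x = np.zeros(2)             # state estimate
        self.P = np.eye(2) * 500.0       # covariance: very uncertain at start
        self.Q = np.eye(2) * 0.1         # process noise
        self.H = np.array([[1.0, 0.0]])  # we observe position only

    def predict(self, dt: float):
        # Predict State: propagate with a constant-velocity model.
        F = np.array([[1.0, dt], [0.0, 1.0]])
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + self.Q

    def update(self, z: float, r: float):
        # Update Measurement: fold in measurement z with noise variance r.
        # Different sensors (LiDAR, radar, ...) pass different r values.
        S = self.H @ self.P @ self.H.T + r       # innovation covariance
        K = self.P @ self.H.T / S                # Kalman gain
        self.x = self.x + (K * (z - self.H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P

kf = KalmanFilter1D()
# Measurements arriving one at a time, each with its own noise variance,
# as in asynchronous fusion.
for z, r in [(1.0, 1.0), (2.1, 4.0), (2.9, 1.0)]:
    kf.predict(dt=1.0)
    kf.update(z, r)
```

After a few cycles the estimate converges toward the measurements and the covariance shrinks, reflecting growing confidence in the fused state.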