Image Segmentation for Self Driving Car using UNET/CANET

3 min read · Jun 29, 2022

Photo by DS stories: https://www.pexels.com/photo/blue-car-on-white-background-10215974/

Business Objective

Given a raw image from the car's camera, produce a segmented version of it (which another ML model can then use to predict the steering angle of a self-driving car).

Machine Learning Formulation

Framed as a machine learning problem, the aim is to build a model that accepts an input image and predicts the corresponding segmented image. Segmentation models such as UNET and CANET can be used.

Constraints

The metric iou_score is used in model compilation.

The IoU (intersection over union) score is the ratio of the number of pixels shared by the target and predicted masks to the number of pixels present in either of them. It is also called the Jaccard index.

https://en.wikipedia.org/wiki/Jaccard_index
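
As a rough illustration of the definition above, here is a minimal NumPy sketch of IoU for a single binary mask. The function name binary_iou and the toy arrays are made up for this example; the iou_score metric used in training comes from the segmentation_models library and may differ in details such as smoothing and per-class averaging.

```python
import numpy as np

def binary_iou(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-7) -> float:
    """IoU (Jaccard index) between two binary masks of the same shape."""
    y_true = y_true.astype(bool)
    y_pred = y_pred.astype(bool)
    # Pixels marked in both masks
    intersection = np.logical_and(y_true, y_pred).sum()
    # Pixels marked in at least one mask
    union = np.logical_or(y_true, y_pred).sum()
    # eps avoids division by zero when both masks are empty
    return float((intersection + eps) / (union + eps))

# Toy 2x3 masks (hypothetical data)
target = np.array([[1, 1, 0],
                   [0, 1, 0]])
predicted = np.array([[1, 0, 0],
                      [0, 1, 1]])

# 2 shared pixels out of 4 pixels in the union, so the score is close to 0.5
print(binary_iou(target, predicted))
```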

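The loss function used in this scenario is built on the Dice loss (the model is compiled with cce_dice_loss, which adds categorical cross-entropy to it). The Dice loss is derived from the Sørensen-Dice coefficient: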

https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient
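
For two masks A and B, the Dice coefficient is 2|A ∩ B| / (|A| + |B|), and the Dice loss is one minus that value. The sketch below is a simplified "soft" version in NumPy that also accepts probabilities. The name soft_dice_loss and the toy arrays are assumptions for illustration, not the library's implementation.

```python
import numpy as np

def soft_dice_loss(y_true, y_pred, eps: float = 1e-7) -> float:
    """1 - Dice coefficient; y_pred may hold probabilities in [0, 1]."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # For binary masks this product counts the overlapping pixels
    intersection = (y_true * y_pred).sum()
    dice = (2.0 * intersection + eps) / (y_true.sum() + y_pred.sum() + eps)
    return float(1.0 - dice)

# Toy masks (hypothetical data): 2 overlapping pixels, 3 pixels in each mask
dice_target = np.array([[1, 1, 0],
                        [0, 1, 0]])
dice_predicted = np.array([[1, 0, 0],
                           [0, 1, 1]])

# Dice = 2*2 / (3+3) = 2/3, so the loss is close to 1/3
print(soft_dice_loss(dice_target, dice_predicted))
```

A perfect prediction gives a loss near 0, and a prediction with no overlap gives a loss near 1.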

Training Data

The training data consists of two parts:

The actual image from the car dashboard camera (not the segmented image)

A JSON file for each image, which contains the segment information: polygon coordinates for each object in the image.

Preprocessing

So far we only have the raw images. To train the model we also need a segmented image for each raw image.

To create the segmented image, we can use the polygon information in the JSON file. PIL's ImageDraw module can draw a polygon from a given list of vertices.

The segmented image is then converted to grayscale, and the output will look something like this
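
The polygon-drawing step could be sketched as follows. The JSON layout (an "objects" list with "label" and "vertices" keys), the LABEL_TO_ID mapping and the function name polygons_to_mask are all hypothetical; the real annotation files may be organised differently. The mask is drawn directly in PIL's single-channel "L" mode, so each pixel stores a class id.

```python
import json
import numpy as np
from PIL import Image, ImageDraw

# Hypothetical label-to-id mapping; 0 is reserved for background
LABEL_TO_ID = {"road": 1, "car": 2}

def polygons_to_mask(annotation: dict, width: int, height: int) -> Image.Image:
    """Draw every annotated polygon onto a single-channel mask."""
    mask = Image.new("L", (width, height), 0)  # 8-bit grayscale, all background
    draw = ImageDraw.Draw(mask)
    for obj in annotation["objects"]:
        class_id = LABEL_TO_ID.get(obj["label"], 0)
        points = [(float(x), float(y)) for x, y in obj["vertices"]]
        if len(points) >= 3:  # a polygon needs at least three vertices
            draw.polygon(points, fill=class_id)
    return mask

# Assumed annotation structure, for illustration only
annotation = json.loads("""
{"objects": [
  {"label": "road", "vertices": [[0, 8], [15, 8], [15, 15], [0, 15]]},
  {"label": "car",  "vertices": [[4, 10], [8, 10], [8, 13], [4, 13]]}
]}
""")

mask_array = np.array(polygons_to_mask(annotation, width=16, height=16))
print(np.unique(mask_array))
```

Polygons drawn later overwrite earlier ones, so the order of objects in the file decides which class wins where shapes overlap.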

Data Augmentation

Data augmentation helps the model generalize. It can be applied randomly to each image inside the dataloader.

import imgaug.augmenters as iaa
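
One detail that matters for segmentation is that any geometric augmentation has to be applied identically to the image and to its mask, otherwise the labels no longer line up with the pixels. The sketch below shows the idea with a random horizontal flip written in plain NumPy rather than imgaug, to keep it dependency-free. The function name random_flip and the toy arrays are assumptions for illustration.

```python
import numpy as np

def random_flip(image: np.ndarray, mask: np.ndarray,
                rng: np.random.Generator, p: float = 0.5):
    """Flip image and mask left-right together with probability p."""
    if rng.random() < p:
        # Reverse the width axis of both arrays so they stay aligned
        image = image[:, ::-1].copy()
        mask = mask[:, ::-1].copy()
    return image, mask

rng = np.random.default_rng(0)

# Toy 2x4 RGB image and its mask (hypothetical data)
image = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
mask = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1]], dtype=np.uint8)

# p=1.0 forces the flip so the effect is visible
aug_image, aug_mask = random_flip(image, mask, rng, p=1.0)
print(aug_mask)
```

Pixel-level augmentations such as brightness or noise would normally be applied to the image only, since they do not move any object.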

DataLoader

Create a dataloader that returns the raw image as the features and the segmented image as the label.
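
A framework-agnostic sketch of such a dataloader is shown below. It keeps everything in memory and exposes only __len__ and __getitem__; in a Keras pipeline the same two methods would typically live in a subclass of tf.keras.utils.Sequence. The class name SegmentationBatches, the division by 255 and the random toy data are assumptions. The 21 classes match the model configuration used in this article, and the masks are one-hot encoded so that they have one channel per class.

```python
import numpy as np

class SegmentationBatches:
    """Yields (images, one-hot masks) batches from in-memory arrays."""

    def __init__(self, images, masks, n_classes: int, batch_size: int):
        self.images = images          # list of (H, W, 3) uint8 arrays
        self.masks = masks            # list of (H, W) integer class-id arrays
        self.n_classes = n_classes
        self.batch_size = batch_size

    def __len__(self) -> int:
        # Number of batches per epoch; the last batch may be smaller
        return int(np.ceil(len(self.images) / self.batch_size))

    def __getitem__(self, idx: int):
        start = idx * self.batch_size
        stop = start + self.batch_size
        # Assumed preprocessing: scale pixel values to [0, 1]
        x = np.stack(self.images[start:stop]).astype(np.float32) / 255.0
        m = np.stack(self.masks[start:stop])
        # Index an identity matrix with the class ids to one-hot encode them
        y = np.eye(self.n_classes, dtype=np.float32)[m]
        return x, y

# Toy data: five 4x4 images with random class ids in [0, 21)
data_rng = np.random.default_rng(0)
toy_images = [data_rng.integers(0, 256, (4, 4, 3), dtype=np.uint8) for _ in range(5)]
toy_masks = [data_rng.integers(0, 21, (4, 4)) for _ in range(5)]

loader = SegmentationBatches(toy_images, toy_masks, n_classes=21, batch_size=2)
x_batch, y_batch = loader[0]
print(len(loader), x_batch.shape, y_batch.shape)
```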

UNET Model

The UNET segmentation model is available in the segmentation_models library, and the model can be compiled as below

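import tensorflow as tf
import segmentation_models as sm
from segmentation_models import Unet
from segmentation_models.metrics import iou_score

# imagesize_obj is the side length of the square input images, defined earlier in the pipeline
model = Unet('resnet34', encoder_weights='imagenet', classes=21, activation='sigmoid', input_shape=(imagesize_obj, imagesize_obj, 3))

loss = sm.losses.cce_dice_loss

adam = tf.keras.optimizers.Adam()

model.compile(loss=loss, optimizer=adam, metrics=[iou_score])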

The next part of this article is continued here - Image Segmentation : Attention-guided Chained Context Aggregation (CANET)

Conclusion

Image segmentation is widely used across domains such as medicine and self-driving vehicles, and the performance of the models keeps improving. CANET is another segmentation model; it uses modules such as global flow and context flow to capture both high-level context information and surrounding-pixel information, and it can be tried on the same dataset.