Image Segmentation for Self Driving Car using UNET/CANET


Business Objective

Given a raw image from the car's camera, produce a segmented version of that image, which another ML model can then use to predict the steering angle of a self-driving car.

Machine Learning Formulation

Framed as a machine learning problem, our aim is to build a model that takes a raw image as input and predicts its segmented image. Segmentation models such as UNET and CANET can be used.


The metric iou_score is used in model compilation.

The IOU (intersection over union) score is the ratio of the number of pixels that the predicted and target segments have in common to the number of pixels that belong to either of them. It is also called the Jaccard index.
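To make the definition concrete, here is a minimal NumPy sketch of IOU for a single pair of binary masks. The helper name binary_iou is ours, chosen for illustration; it is not the iou_score implementation from the library, which works on batched tensors.

```python
import numpy as np

def binary_iou(y_true, y_pred):
    """Intersection over union (Jaccard index) of two binary masks."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    # Two empty masks agree perfectly, so we define the score as 1.0 there.
    return 1.0 if union == 0 else float(intersection / union)

target = np.array([[1, 1, 0],
                   [0, 1, 0]])
predicted = np.array([[1, 0, 0],
                      [0, 1, 1]])
score = binary_iou(target, predicted)  # 2 shared pixels / 4 pixels in the union
```

For a multi-class mask, one common choice is to compute this per class and average the results.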

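The loss function used in this scenario is the Dice loss, which is one minus the Dice coefficient: twice the overlap between the predicted and target segments, divided by the sum of their sizes.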
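Written out, Dice loss = 1 - 2|A ∩ B| / (|A| + |B|), where A is the target mask and B the prediction. The sketch below is a plain NumPy version for illustration only: the name soft_dice_loss and the smoothing term eps are our assumptions, not the library's code. Note also that the loss compiled later in this article is cce_dice_loss, which, as the name suggests, combines categorical cross-entropy with the Dice loss.

```python
import numpy as np

def soft_dice_loss(y_true, y_pred, eps=1e-7):
    """1 - Dice coefficient; y_pred may hold probabilities in [0, 1]."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    overlap = (y_true * y_pred).sum()       # soft version of |A ∩ B|
    total = y_true.sum() + y_pred.sum()     # |A| + |B|
    # eps (an assumption) avoids division by zero when both masks are empty.
    dice = (2.0 * overlap + eps) / (total + eps)
    return 1.0 - dice

target = np.array([[1, 1, 0],
                   [0, 1, 0]])
predicted = np.array([[1, 0, 0],
                      [0, 1, 1]])
loss = soft_dice_loss(target, predicted)  # overlap 2, sizes 3 + 3
```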

Training Data

The training data consists of two parts:

  1. The actual image from the car dashboard camera (not the segmented image).
  2. A JSON file for each image containing the segment information: the polygon coordinates of each object in the image.


At this point we only have the raw images. To train the model, we also need a segmented image for each raw image.

To create the segmented image, we can use the polygon information in the JSON file. The Pillow (PIL) library can draw polygons from a given list of vertices.
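A minimal sketch of this step with Pillow's ImageDraw is shown here. The JSON layout (an "objects" list with "label" and "polygon" keys) and the LABEL_IDS mapping are hypothetical, since the article does not show the real file format; adapt the key names to your dataset. Drawing directly into a single-channel "L" image is our shortcut for getting a grayscale mask.

```python
import json
from PIL import Image, ImageDraw

# Hypothetical label-to-id mapping; the real dataset defines its own classes.
LABEL_IDS = {"road": 1, "car": 2}

def polygons_to_mask(annotation, width, height):
    """Draw every annotated polygon into a grayscale mask of class ids."""
    mask = Image.new("L", (width, height), 0)  # 0 = background
    draw = ImageDraw.Draw(mask)
    for obj in annotation["objects"]:
        vertices = [(x, y) for x, y in obj["polygon"]]
        # Later polygons overwrite earlier ones where they overlap.
        draw.polygon(vertices, fill=LABEL_IDS.get(obj["label"], 0))
    return mask

# Inline stand-in for the contents of one annotation file.
annotation = json.loads("""
{"objects": [
  {"label": "road", "polygon": [[0, 5], [9, 5], [9, 9], [0, 9]]},
  {"label": "car",  "polygon": [[3, 6], [6, 6], [6, 8], [3, 8]]}
]}
""")
mask = polygons_to_mask(annotation, width=10, height=10)
```

In practice the annotation would come from json.load on each image's file, and the mask can be saved with mask.save(...) next to the raw image.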

The segmented image is then converted to grayscale.

Data Augmentation

Data augmentation can be applied inside the dataloader, randomly for each image, to add variety to the training data.

import imgaug.augmenters as iaa
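Whichever augmenter is chosen, a geometric change has to be applied to the image and its mask together, otherwise the labels no longer line up with the pixels. The snippet below is a NumPy-only stand-in that illustrates this with a random horizontal flip; it is not imgaug code, and random_flip_pair is a hypothetical helper.

```python
import numpy as np

def random_flip_pair(image, mask, rng, p=0.5):
    """Flip image and mask left-right together with probability p."""
    if rng.random() < p:
        image = image[:, ::-1]  # reverse the width axis of H x W x C
        mask = mask[:, ::-1]    # same flip on the H x W mask
    return image, mask

rng = np.random.default_rng(0)
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)  # tiny H x W x C "image"
mask = np.array([[1, 0, 0],
                 [1, 0, 0]])
# p=1.0 forces the flip so the effect is easy to inspect.
aug_image, aug_mask = random_flip_pair(image, mask, rng, p=1.0)
```

Colour-only changes such as brightness or contrast would be applied to the image alone, since they do not move any pixels.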


Create a dataloader that returns the raw image as the features and the segmented image as the label.
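One possible shape for such a dataloader is sketched below in plain NumPy. The class name SegmentationBatches is hypothetical; its __len__ and __getitem__ methods mirror the interface of keras.utils.Sequence so the idea carries over. Scaling pixels to [0, 1] and one-hot encoding the mask are our assumptions, made because the model is built with classes=21 and a categorical loss.

```python
import numpy as np

class SegmentationBatches:
    """Indexable batch loader returning (images, one-hot masks)."""

    def __init__(self, images, masks, n_classes, batch_size):
        self.images = images          # list of H x W x 3 uint8 arrays
        self.masks = masks            # list of H x W integer class-id arrays
        self.n_classes = n_classes
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch, counting a final partial batch.
        return int(np.ceil(len(self.images) / self.batch_size))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        stop = start + self.batch_size
        x = np.stack(self.images[start:stop]).astype("float32") / 255.0
        y_ids = np.stack(self.masks[start:stop])
        # One-hot encode class ids: (B, H, W) -> (B, H, W, n_classes).
        y = np.eye(self.n_classes, dtype="float32")[y_ids]
        return x, y

# Dummy data standing in for the real images and masks.
images = [np.zeros((4, 4, 3), dtype=np.uint8) for _ in range(5)]
masks = [np.full((4, 4), i, dtype=np.int64) for i in range(5)]
loader = SegmentationBatches(images, masks, n_classes=21, batch_size=2)
x, y = loader[0]
```

Random augmentation would be called inside __getitem__, before the mask is one-hot encoded.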

UNET Model

The UNET segmentation model is available in the segmentation_models library, and the model can be compiled as shown below.

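import segmentation_models as sm
from tensorflow.keras.optimizers import Adam

sm.set_framework('tf.keras')  # keep the model on tf.keras so it matches the optimizer import

# imagesize_obj is the input height/width, defined elsewhere
model = sm.Unet('resnet34', encoder_weights='imagenet', classes=21, activation='sigmoid', input_shape=(imagesize_obj, imagesize_obj, 3))

loss = sm.losses.cce_dice_loss
adam = Adam()  # default settings; the optimizer configuration is not given here

model.compile(loss=loss, optimizer=adam, metrics=[sm.metrics.iou_score])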

This article continues in the next part: Image Segmentation: Attention-guided Chained Context Aggregation (CANET)


Image segmentation is widely used across domains such as medicine and self-driving vehicles, and the performance of the models keeps improving. CANET is another segmentation model. It uses modules such as global flows and context flows to capture both high-level context and the information in surrounding pixels, and it can be tried on the same dataset.






