Business Objective
Given a raw image from the car's camera, produce a segmented version of it (which another ML model can then use to predict the steering angle of a self-driving car).
Machine Learning Formulation
Framed as a machine learning problem, the aim is to build a model that accepts a raw image as input and predicts its segmented image. Segmentation models such as UNET or CANET can be used.
Constraints
The metric iou_score is used in model compilation.
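The IoU score is defined as the ratio of the number of pixels common to the target and predicted images (the intersection) to the number of pixels present in either of them (the union). It is also called the Jaccard index.
The loss function used in this scenario is the Dice loss, which is one minus the Dice coefficient: twice the number of overlapping pixels divided by the total number of pixels in the target and predicted images. The model compiled later in this article combines it with categorical cross-entropy (cce_dice_loss).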
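To make the two definitions concrete, here is a minimal NumPy sketch for binary masks. The function names `binary_iou` and `binary_dice_loss` and the smoothing term `eps` are illustrative assumptions, not the segmentation_models implementation, which works on batched tensors.

```python
import numpy as np

def binary_iou(y_true, y_pred, eps=1e-7):
    """Intersection over union (Jaccard index) of two binary masks."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    # eps (an assumption) avoids division by zero when both masks are empty.
    return (intersection + eps) / (union + eps)

def binary_dice_loss(y_true, y_pred, eps=1e-7):
    """One minus the Dice coefficient, 2 * overlap / (size of A + size of B)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    intersection = (y_true * y_pred).sum()
    dice = (2.0 * intersection + eps) / (y_true.sum() + y_pred.sum() + eps)
    return 1.0 - dice

# Toy masks: 3 foreground pixels each, 2 of them shared, 4 in the union.
target = np.array([[1, 1, 0], [0, 1, 0]])
predicted = np.array([[1, 0, 0], [0, 1, 1]])

print(binary_iou(target, predicted))        # about 2 / 4
print(binary_dice_loss(target, predicted))  # about 1 - 4 / 6
```

For a multi-class mask, one common approach is to compute these per class on one-hot channels and average the results.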
Training Data
The training data consists of two parts:
- The actual image from the car dashboard camera (not the segmented image)
- A JSON file for each image containing the segment information: polygon coordinates for each object in the image.
Preprocessing
At this point we only have the raw images; model training also needs the segmented image that corresponds to each of them.
To create the segmented image, we can use the polygon information in the JSON file. The PIL (Pillow) library can draw polygons from a given list of vertices.
The segmented image is then converted to greyscale, so the output is a single-channel mask.
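The polygon-drawing step can be sketched with Pillow's `ImageDraw`. The JSON layout, the label names, and the label-to-grey-level mapping below are all hypothetical, since the article does not show the actual annotation schema. For brevity the sketch draws straight onto a single-channel ("L" mode) image instead of converting afterwards.

```python
import json
from PIL import Image, ImageDraw

# Hypothetical annotation layout; the real JSON schema is not shown in this article.
annotation_json = """
{
  "width": 8, "height": 6,
  "objects": [
    {"label": "road", "polygon": [[0, 3], [7, 3], [7, 5], [0, 5]]},
    {"label": "car",  "polygon": [[2, 1], [4, 1], [4, 3], [2, 3]]}
  ]
}
"""

# Hypothetical mapping from label name to grey level (class id); 0 is background.
LABEL_TO_ID = {"road": 1, "car": 2}

def polygons_to_mask(annotation, label_to_id):
    """Draw each annotated polygon onto a single-channel ("L" mode) mask."""
    mask = Image.new("L", (annotation["width"], annotation["height"]), 0)
    draw = ImageDraw.Draw(mask)
    for obj in annotation["objects"]:
        vertices = [tuple(point) for point in obj["polygon"]]
        # Later polygons overwrite earlier ones where they overlap.
        draw.polygon(vertices, fill=label_to_id[obj["label"]])
    return mask

mask = polygons_to_mask(json.loads(annotation_json), LABEL_TO_ID)
print(mask.size, mask.getpixel((1, 4)), mask.getpixel((3, 2)), mask.getpixel((0, 0)))
```

Because overlapping polygons are painted in order, the order of objects in the file decides which class wins where two objects overlap.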
Data Augmentation
Data augmentation can be applied inside the dataloader, randomly for each image as it is loaded.
import imgaug.augmenters as iaa
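The augmenters actually used are not listed here, so the sketch below is a NumPy-only stand-in rather than an imgaug pipeline. It illustrates the one point that matters for segmentation: any geometric transform has to be applied to the image and its mask together. The function name `random_horizontal_flip` and the probability `p` are assumptions for illustration.

```python
import numpy as np

def random_horizontal_flip(image, mask, rng, p=0.5):
    """Flip the image and its mask together with probability p."""
    if rng.random() < p:
        # Axis 1 is the width axis for both (H, W, C) images and (H, W) masks.
        return image[:, ::-1].copy(), mask[:, ::-1].copy()
    return image, mask

rng = np.random.default_rng(seed=0)
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)  # toy 2x3 RGB image
mask = np.array([[0, 1, 2], [0, 0, 2]], dtype=np.uint8)        # toy class-id mask

# p=1.0 forces the flip so the effect is visible; a dataloader would use p < 1.
aug_image, aug_mask = random_horizontal_flip(image, mask, rng, p=1.0)
print(aug_mask)
```

Colour-only augmentations (brightness, contrast, noise) are different: they would be applied to the image alone, leaving the mask untouched.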
DataLoader
Create a dataloader that returns the raw image as the features and the segmented image as the label.
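A minimal sketch of such a loader is shown below. It assumes the images and masks are already in memory as NumPy arrays, scales pixels to [0, 1], and one-hot encodes the masks. The class name `SegmentationLoader` and all parameters are hypothetical. In a Keras pipeline the same `__len__`/`__getitem__` pair would typically live in a subclass of `tf.keras.utils.Sequence`.

```python
import numpy as np

class SegmentationLoader:
    """Batch loader exposing the __len__/__getitem__ protocol used by Keras Sequence."""

    def __init__(self, images, masks, batch_size, num_classes):
        self.images = images          # array of shape (N, H, W, 3)
        self.masks = masks            # array of shape (N, H, W) with integer class ids
        self.batch_size = batch_size
        self.num_classes = num_classes

    def __len__(self):
        # Number of batches per epoch, counting a final partial batch.
        return int(np.ceil(len(self.images) / self.batch_size))

    def __getitem__(self, index):
        start = index * self.batch_size
        stop = start + self.batch_size
        features = self.images[start:stop].astype(np.float32) / 255.0
        # One-hot encode the class ids: (B, H, W) -> (B, H, W, num_classes).
        labels = np.eye(self.num_classes, dtype=np.float32)[self.masks[start:stop]]
        return features, labels

# Synthetic data standing in for real images and masks.
images = np.zeros((5, 4, 4, 3), dtype=np.uint8)
masks = np.random.default_rng(0).integers(0, 21, size=(5, 4, 4))

loader = SegmentationLoader(images, masks, batch_size=2, num_classes=21)
features, labels = loader[0]
print(len(loader), features.shape, labels.shape)
```

Random augmentation would slot into `__getitem__`, applied to each image and mask pair before the one-hot encoding.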
UNET Model
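The Unet segmentation model is available in the segmentation_models library, and the model can be compiled as below
import tensorflow as tf
import segmentation_models as sm
from segmentation_models import Unet
from segmentation_models.metrics import iou_score
model = Unet('resnet34', encoder_weights='imagenet', classes=21, activation='sigmoid', input_shape=(imagesize_obj, imagesize_obj, 3))  # imagesize_obj: input height/width, defined elsewhere
loss = sm.losses.cce_dice_loss  # categorical cross-entropy + Dice loss
adam = tf.keras.optimizers.Adam()
model.compile(loss=loss, optimizer=adam, metrics=[iou_score])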
This article continues in the next part: Image Segmentation: Attention-guided Chained Context Aggregation (CANET)
Conclusion
Image segmentation is widely used across domains such as medicine and self-driving vehicles, and the performance of the models keeps improving. CANET is another segmentation model; it uses modules such as global flow and context flow to capture both high-level context information and surrounding-pixel information, and it can be tried on the same dataset.