Non Maximum Suppression (NMS)

Non maximum suppression is a technique used in object detection to filter bounding boxes generated by object detection algorithms. If we don’t use NMS, we will get an image with dense frames.

By Wayne
29/05/2024

Non Maximum Suppression (NMS)
Intersection Over Union (IoU)
NMS Algorithm
YOLOv8 Example
Conclusion
Reference

Non Maximum Suppression (NMS)

Non maximum suppression (NMS) is a post-processing technique used in object detection. Generally speaking, an object detection algorithm generates many bounding boxes for a single object, but we only need one bounding box per object. NMS filters out the redundant and irrelevant bounding boxes and retains only the best one.

NMS filters out redundant and irrelevant bounding boxes.

Intersection Over Union (IoU)

Before going deeper into the NMS algorithm, let’s first understand what intersection over union (IoU) is. IoU is a metric that measures how much two bounding boxes overlap. It is the area of the intersection of the two boxes divided by the area of their union:

IoU(A, B) = area(A ∩ B) / area(A ∪ B)

From the formula, we can see that the larger the overlapping area of the two bounding boxes, the larger the IoU, and vice versa. IoU ranges from 0 (no overlap) to 1 (identical boxes).
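As a minimal sketch of the IoU calculation, the following Python function assumes that both boxes are given as corner coordinates (x1, y1, x2, y2). The function name iou and the tuple layout are chosen for illustration only and are not taken from any particular library.

```python
def iou(box_a, box_b):
    """Return the IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (0 if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the part counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate boxes with zero area.
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7, so about 0.143
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes: 1.0
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes: 0.0
```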

NMS Algorithm

Before we dive into the algorithm, we must first understand the input format that NMS expects. Each bounding box generated by an object detection algorithm typically contains six values.

Four values represent the extent of a bounding box:
- They might be the center point (x, y), width, and height.
- They may also be the upper-left corner (x1, y1) and the lower-right corner (x2, y2).
A confidence score represents how likely it is that an object exists in the box. The higher the score, the higher the probability that the box contains an object, and vice versa.
A class ID identifies the class of the object in the box. When the object detection algorithm can detect multiple classes of objects in an image, it outputs a class ID for each detected object.
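Because the two box formats above describe the same rectangle, it is easy to convert between them. The following is a small sketch; the helper names xywh_to_xyxy and xyxy_to_xywh and the example detection tuple are hypothetical and only illustrate the six-value layout described above.

```python
def xywh_to_xyxy(box):
    """Convert (center_x, center_y, width, height) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


# A made-up detection with six values: four for the box, then score and class ID.
detection = (50, 40, 20, 10, 0.87, 0)
corners = xywh_to_xyxy(detection[:4])
print(corners)                # (40.0, 35.0, 60.0, 45.0)
print(xyxy_to_xywh(corners))  # (50.0, 40.0, 20.0, 10.0)
```

The corner format is usually more convenient for computing IoU, since the intersection rectangle can be read directly from the corner coordinates.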

Now we can start to understand the NMS algorithm.

NMS Algorithm, source: A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS.

The algorithm looks a bit complicated, but it isn’t. It works roughly as follows.

Line 1: F will contain the bounding boxes selected by NMS, so it starts out empty.
Line 2: First, remove the bounding boxes whose confidence score is lower than the confidence threshold T. Boxes with very low confidence scores most likely do not contain an object, that is, they are irrelevant bounding boxes, so we do not consider them at all.
Lines 5-7: Select the box b with the highest confidence score from B, add b to F, and remove b from B.
Lines 8-13: For each remaining box r in B, calculate the IoU of b and r. If the IoU is greater than or equal to the IoU threshold τ, remove r from B. A high IoU means the two boxes overlap so much that they most likely cover the same object, so we only need one of them, and we keep b because it has the higher confidence score.
Lines 4-14: Repeat these steps until B is empty. In the end, the filtered boxes are in F.
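The steps above can be sketched in plain Python as follows. This is a simplified illustration, not the implementation used by YOLOv8 or TorchVision: it assumes boxes are tuples of (x1, y1, x2, y2, score), it ignores the class ID, and the function names and sample boxes are made up for the example.

```python
def iou(a, b):
    """IoU of two boxes whose first four values are (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, conf_threshold, iou_threshold):
    """Greedy NMS over boxes given as (x1, y1, x2, y2, score)."""
    selected = []  # F: starts out empty
    # Drop boxes whose score is below the confidence threshold T.
    remaining = [b for b in boxes if b[4] >= conf_threshold]
    # Sort once so the highest-scoring box is always first.
    remaining.sort(key=lambda b: b[4], reverse=True)
    while remaining:
        best = remaining.pop(0)  # b: highest score left in B
        selected.append(best)
        # Remove every box r whose IoU with b reaches the IoU threshold.
        remaining = [r for r in remaining if iou(best, r) < iou_threshold]
    return selected


boxes = [
    (10, 10, 50, 50, 0.9),
    (12, 12, 52, 52, 0.8),      # overlaps heavily with the first box
    (100, 100, 140, 140, 0.7),  # a separate object
    (0, 0, 5, 5, 0.1),          # low confidence
]
print(nms(boxes, conf_threshold=0.25, iou_threshold=0.5))
# [(10, 10, 50, 50, 0.9), (100, 100, 140, 140, 0.7)]
```

To make this class-aware, one option is to run the same function separately on the boxes of each class ID, so that boxes of different classes never suppress each other.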

YOLOv8 Example

YOLOv8 is a very popular object detection model. If you are not familiar with YOLOv8, you can refer to the following article first.

YOLOv8 Object Detection Tutorial

In the NMS source code of YOLOv8, the default confidence score threshold is 0.25 and the default IoU threshold is 0.45.

We can use the two parameters conf and iou to adjust the confidence score threshold and the IoU threshold of YOLOv8. Both values are floating point numbers between 0 and 1. The higher conf is set, the more bounding boxes are filtered out, because boxes whose confidence score is less than or equal to conf are removed. iou works the opposite way: the lower iou is set, the more bounding boxes are filtered out, because boxes whose IoU with an already selected box is greater than or equal to iou are removed. Taking the baby penguin detection model trained in the above article as an example, the following is the YOLOv8 command with the conf and iou parameters.

YOLOv8Example % .venv/bin/yolo detect mode=predict model=./runs/train/weights/best.pt source=image.jpg project=runs name=predict show_labels=False conf=0.05 iou=0.5
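To see the opposite effects of the two thresholds without running a model, here is a small self-contained sketch. It does not call YOLOv8; it applies a simplified greedy NMS to four made-up boxes of the form (x1, y1, x2, y2, score) and counts how many survive. The function count_kept and the sample data are assumptions made for illustration.

```python
def iou(a, b):
    """IoU of two boxes whose first four values are (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_kept(boxes, conf, iou_thr):
    """Count the boxes that survive a simplified greedy NMS."""
    # Boxes with a score less than or equal to conf are dropped first.
    pending = sorted((b for b in boxes if b[4] > conf),
                     key=lambda b: b[4], reverse=True)
    kept = 0
    while pending:
        best = pending.pop(0)
        kept += 1
        # Boxes overlapping the selected box by iou_thr or more are dropped.
        pending = [r for r in pending if iou(best, r) < iou_thr]
    return kept


boxes = [
    (10, 10, 50, 50, 0.90),
    (12, 12, 52, 52, 0.60),      # IoU with the first box is about 0.82
    (20, 20, 60, 60, 0.30),      # IoU with the first box is about 0.39
    (100, 100, 140, 140, 0.08),  # isolated, low confidence
]

# Raising conf removes more boxes.
print([count_kept(boxes, c, 0.7) for c in (0.05, 0.25, 0.5)])  # [3, 2, 1]
# Lowering iou removes more boxes.
print([count_kept(boxes, 0.05, t) for t in (0.9, 0.7, 0.3)])   # [4, 3, 2]
```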

Set iou to 0.7 instead of the default 0.45 to retain more bounding boxes. Then, let’s compare the impact of different conf values on the prediction results.

NMS with different confidence score thresholds.

Set conf to 0.05 instead of the default 0.25 to retain more bounding boxes. Then, let’s compare how different iou values affect the prediction results.

We can see that the default conf and iou values of YOLOv8 are quite appropriate.

Conclusion

We don’t necessarily need to implement NMS ourselves. YOLOv8 already has NMS implemented, and we can easily set the confidence score and IoU thresholds. In addition, TorchVision also provides NMS functions. However, understanding NMS allows us to better understand how to adjust these thresholds.

Reference

Non Maximum Suppression: Theory and Implementation in PyTorch, LearnOpenCV.
J. Terven and D. Cordova-Esparza, 2023. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS, In Machine Learning and Knowledge Extraction.
