YOLO (You Only Look Once) is a popular object detection model. Its speed and accuracy quickly made it popular. This article introduces how to use YOLOv8 for object detection.
The complete code for this chapter can be found in .
YOLOv8
YOLO is a model published in 2015 by Joseph Redmon and Ali Farhadi of the University of Washington. After its publication, it quickly became popular due to its high efficiency and accuracy. Joseph Redmon went on to publish YOLOv2 in 2017 and YOLOv3 in 2018. After that, he announced that he was leaving computer vision research because he found his work was being used in military applications and privacy violations. In 2020, Alexey Bochkovskiy published YOLOv4, and in the same year Ultralytics released YOLOv5. In 2022, Meituan's Vision AI Department released YOLOv6, and in the same year Alexey Bochkovskiy published YOLOv7. In 2023, Ultralytics published YOLOv8. The history of YOLO is quite special and interesting: different versions were released by different people or groups. For a more detailed historical introduction, please refer here.
Project Setup
Next, we will introduce how to use YOLOv8 by building a baby penguin detection model.
First, create a folder called YOLOv8Example
and set up a virtual environment for Python3. Then, install ultralytics.
% mkdir YOLOv8Example
% cd YOLOv8Example
YOLOv8Example % python3 -m venv .venv
YOLOv8Example % .venv/bin/pip install ultralytics
Labeling
Before starting to train the model, we must first prepare the training dataset. In this project, we prepared 9 photos of baby penguins: 7 were used to train the model and 2 were used to validate it. It should be noted that in real projects, such a small dataset is far from sufficient.
After the dataset is ready, we need to label it. Labeling data means marking the areas of a photo that contain baby penguins. We can also label more than one object or class in a photo; for example, we can label one area of a photo that contains baby penguins and another area that contains polar bears. In this project, we have a single class called Baby Penguin.
Below is a photo with its labeling. The labeling format of YOLOv8 is:

- Each line represents one labeled object.
- The first number is the index of the class. 0 represents the first class, which is Baby Penguin.
- The second and third numbers are the x and y coordinates of the center of the labeled range. They are expressed as fractions of the image width and height (values between 0 and 1), not pixels.
- The fourth and fifth numbers are the width and height of the labeled range, also expressed as fractions, not pixels.
0 0.482720 0.527601 0.517280 0.749469
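As a sketch of how these fractions map back to pixels, the label line above can be converted to a pixel bounding box like this (the 1000×800 image size is an assumption for illustration, not the actual photo size):

```python
def yolo_to_pixels(label_line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    parts = label_line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:])
    # The stored values are fractions of the image size, so scale by width/height.
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return class_id, x_min, y_min, x_max, y_max

label = "0 0.482720 0.527601 0.517280 0.749469"
print(yolo_to_pixels(label, 1000, 800))
```

The center/width/height form used by YOLO is converted here to the more familiar corner coordinates.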
There are two software tools that can help us label objects.
Roboflow
Roboflow is a free online labeling software. The interface is nice and easy to use.
Yolo Label
Yolo Label is a free labeling software. Its interface is not as polished as Roboflow's, and while using it the displayed image proportions may be distorted. However, this does not affect the labeling result, since the labeled range is expressed as fractions of the image size, not pixels. Despite these shortcomings, it can be used locally.
When using it, Yolo Label will ask you to enter the paths of the image folder and the class file you want to label.
We create a file called classes.txt, and each line in it is a class.
Baby Penguin
Training
Training Dataset
Before starting training, we first place the images and labels into the project. Training images go in datasets/train/images and their labels in datasets/train/labels. Images and labels used for validation go under datasets/val.
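The folder layout above can be created with a short script. Here is a minimal sketch using only the standard library (the directory names match the layout this project uses):

```python
from pathlib import Path

def create_dataset_dirs(root):
    """Create the datasets/{train,val}/{images,labels} layout described above."""
    root = Path(root)
    for split in ("train", "val"):
        for kind in ("images", "labels"):
            (root / "datasets" / split / kind).mkdir(parents=True, exist_ok=True)
    return root / "datasets"

create_dataset_dirs(".")
```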
Training Settings
During training, we need to provide a configuration file. Create a file called data.yaml with the following contents. In data.yaml:

- path specifies the root path of the dataset.
- train specifies the folder of training images, relative to path.
- val specifies the folder of validation images, relative to path.
- nc specifies the number of classes.
- names specifies the names of the classes.
path: /path/to/waynestalk/YOLOv8Example/datasets
train: train/images
val: val/images
nc: 1
names: [ "Baby Penguin" ]
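As a quick sanity check before training, we can verify that nc matches the number of entries in names. This is a sketch, not part of YOLOv8 itself, and the naive parser only handles the flat inline-list style shown above:

```python
import re
from pathlib import Path

def check_data_yaml(path):
    """Return True if nc equals the number of class names in a simple data.yaml."""
    text = Path(path).read_text()
    # Find the "nc: <number>" line.
    nc = int(re.search(r"^nc:\s*(\d+)", text, re.MULTILINE).group(1))
    # Find the inline list after "names:" and split it into entries.
    names_part = re.search(r"^names:\s*\[(.*)\]", text, re.MULTILINE).group(1)
    names = [n.strip().strip("'\"") for n in names_part.split(",") if n.strip()]
    return nc == len(names)
```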
YOLOv8 Pretrained Detect Models
Finally, before starting training, we need to choose a YOLOv8 pretrained model. Starting from this pretrained model, we train it on the new classes we want to predict. The further down the table a model is, the higher its accuracy, but the slower its speed and the larger its file size. Decide which model to use based on your needs. In the examples in this article, we will use YOLOv8s.
| Model | size (pixels) | mAP val 50-95 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|
| YOLOv8n | 640 | 37.3 | 80.4 | 0.99 | 3.2 | 8.7 |
| YOLOv8s | 640 | 44.9 | 128.4 | 1.20 | 11.2 | 28.6 |
| YOLOv8m | 640 | 50.2 | 234.7 | 1.83 | 25.9 | 78.9 |
| YOLOv8l | 640 | 52.9 | 375.2 | 2.39 | 43.7 | 165.2 |
| YOLOv8x | 640 | 53.9 | 479.1 | 3.53 | 64.2 | 257.8 |
CLI
We can train our model through the following command. We will briefly explain each argument; for details, please refer to the official documentation.
- mode: YOLOv8 provides 5 modes. We are currently using train mode.
- data: refers to the data.yaml we created previously.
- imgsz: the size of the training images, which will be the input image size of the final model. When an image's size differs from imgsz, YOLOv8 will automatically resize it to imgsz first.
- epochs: an epoch refers to training over the entire dataset once.
- batch: the number of images in each training batch. This depends on the available memory.
- project, name: YOLOv8 does not provide a single argument to specify the output path. Instead, use project and name; the output path will be project/name.
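To make the relationship between epochs and batch concrete, here is a small arithmetic sketch (the 7 training images come from this project's dataset):

```python
import math

def iterations_per_epoch(num_images, batch):
    """Each epoch covers the whole dataset once, batch images at a time."""
    return math.ceil(num_images / batch)

# 7 training images with batch=1 means 7 iterations per epoch,
# so 5 epochs perform 35 weight-update iterations in total.
print(iterations_per_epoch(7, 1) * 5)  # → 35
```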
YOLOv8Example % .venv/bin/yolo detect mode=train model=yolov8s.pt data=data.yaml imgsz=640 epochs=5 batch=1 project=runs name=train
After training, we will get the following output data. Among them, best.pt is the trained model.
Python
In addition to CLI, we can also use YOLOv8 in Python, as follows.
from ultralytics import YOLO

# Load a pretrained YOLOv8s model
model = YOLO('yolov8s.pt')

# Train and validate the model
train_results = model.train(data='data.yaml', imgsz=640, batch=1, epochs=5, project='runs', name='train', save=True)
Prediction
Next, we need to use the model we just trained to make predictions, which is object detection.
CLI
We can use the following command to detect whether there are baby penguins in an image. We don't need to worry about whether the image to be predicted is 640×640; YOLOv8 will automatically resize it for us.
YOLOv8Example % .venv/bin/yolo detect mode=predict model=./runs/train/weights/best.pt source=image.jpg project=runs name=predict
After the prediction is completed, we can find the predicted results in the following folder.
YOLOv8 will automatically help us frame the detected range and label it with the class name.
YOLOv8 can predict not only images but also videos. The usage is the same, as follows.
YOLOv8Example % .venv/bin/yolo detect mode=predict model=./runs/train/weights/best.pt source=video.mp4 project=runs name=predict
Python
We can also run prediction with YOLOv8 in Python.
from ultralytics import YOLO

# Load the trained model
model = YOLO('./runs/train/weights/best.pt')

# Predict an image
model.predict(source='image.jpg', project='runs', name='predict', save=True)

# Predict a video
model.predict(source='video.mp4', project='runs', name='predict', save=True)
Conclusion
YOLOv8 is very convenient and simple to use; most of the effort in training a model goes into collecting and labeling the dataset. YOLOv8 provides 5 tasks: detect, segment, classify, pose, and OBB. This article only introduced how to use detect.
Reference
- The History of YOLO Object Detection Models from YOLOv1 to YOLOv8.
- Ultralytics YOLOv8 Docs.
- YOLOv8 Github.