Over the last decade, object detection and computer vision have become almost synonymous with artificial intelligence and machine learning. Their applications, such as self-driving cars and facial recognition, have been at the forefront of the media for many years.
There are many different options for implementing object detection in Python. Packages including OpenCV and YOLO already have pre-trained machine learning models for carrying out object detection. However, these packages are often explained through examples that use already labelled datasets. As useful as these examples are, data preparation is a crucial step when it comes to implementing these algorithms, and ensuring your training data is in the right format can be tricky.
In this blog we will see:
- How to prepare machine learning training datasets
- How LabelImg compares with other labelling tools
Image labelling lets us annotate the objects in our images that we want a model to go on to detect in either images or videos.
Throughout this tutorial, we will be using a dataset downloaded from Kaggle. This dataset did originally come with labels but for the purpose of this tutorial they were deleted. This leaves only a folder containing images of different fruits.
We are going to use LabelImg, which is a graphical annotation tool. More details about this can be found here. For the labels to be compatible with the YOLO algorithm, ensure the outputted annotations are saved as .txt files.
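To make the .txt format more concrete, here is a minimal sketch of reading one annotation line. It assumes the usual YOLO layout of one line per box, written as a class index followed by the box centre, width and height, each given as a fraction of the image size. The function name and the example values are made up for illustration.

```python
def yolo_to_pixel_box(line, image_width, image_height):
    """Convert one YOLO annotation line to (class_id, x_min, y_min, x_max, y_max) in pixels.

    Assumes the line looks like: "<class_id> <x_center> <y_center> <width> <height>",
    with the four box values normalised to the range 0..1.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")

    class_id = int(parts[0])
    x_center, y_center, width, height = (float(p) for p in parts[1:])

    # The values are fractions of the image size, so scale them back to pixels.
    x_min = (x_center - width / 2) * image_width
    y_min = (y_center - height / 2) * image_height
    x_max = (x_center + width / 2) * image_width
    y_max = (y_center + height / 2) * image_height
    return class_id, x_min, y_min, x_max, y_max


# Hypothetical annotation: class 0, centred in a 640x480 image.
example_line = "0 0.5 0.5 0.25 0.5"
print(yolo_to_pixel_box(example_line, image_width=640, image_height=480))
# (0, 240.0, 120.0, 400.0, 360.0)
```

Converting back to pixel corners like this is a handy way to sanity-check a label file by eye before training.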
To run LabelImg, make sure you have conda installed on your computer and open up the Anaconda Powershell Prompt from the Start menu. Now, enter the following command:
conda create --name yolo -c conda-forge python=3.9 pyqt=5 labelimg
Note that this command only has to be run once. The next two commands should be run each time you want to start up the tool:
conda activate yolo
labelimg
The following window should now pop up.
Here, you want to choose Open Dir and select the folder containing the images you want to train on, then choose Change Save Dir and select the folder you want to save your labels to.
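With an image folder and a separate label folder, it is easy to lose track of which images still need labelling. The sketch below lists the images that have no label file yet. It assumes each label file shares its image's file name with a .txt extension. The folder names in the usage comment and the list of image extensions are placeholders, not something taken from the dataset.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unlabelled_images(image_dir, label_dir):
    """Return the sorted names of images with no matching .txt file in label_dir."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)

    # Collect label file names without their extension, e.g. "apple_01".
    labelled_stems = {path.stem for path in label_dir.glob("*.txt")}

    return sorted(
        path.name
        for path in image_dir.iterdir()
        if path.suffix.lower() in IMAGE_SUFFIXES and path.stem not in labelled_stems
    )


# Example usage with hypothetical folder names:
# for name in find_unlabelled_images("fruit/images", "fruit/labels"):
#     print("still to label:", name)
```

Running a check like this at the end of a labelling session saves you from clicking back through every image in the tool.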
Once you have your folders set up, you can click through your images one by one, drawing bounding boxes around each object and labelling as you go. A key piece of advice here is to make sure you have a mouse at the ready! As tedious as this task may seem, it becomes a whole lot worse when you are dealing with a trackpad.
To label the bounding boxes, you will want to add the annotation into the Use default label box before drawing the rectangle.
There is an example below showing it in practice on an apple and some bananas. These labels can then be used in an object detection machine learning model which should detect apples and bananas from a range of images. (Keep your eyes peeled for the next blog!)
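Before handing the labels to a model, it can be worth checking how many boxes you have drawn for each class. The following is a rough sketch, assuming the labels are in the YOLO .txt format and that the save folder also contains a classes.txt file listing one class name per line in index order, which is how LabelImg normally saves YOLO labels. The function name and folder name are illustrative.

```python
from collections import Counter
from pathlib import Path


def count_boxes_per_class(label_dir, classes_file="classes.txt"):
    """Count bounding boxes per class name across all YOLO label files in label_dir."""
    label_dir = Path(label_dir)

    # Assumption: classes.txt holds one class name per line, in class-index order.
    class_text = (label_dir / classes_file).read_text()
    class_names = [name.strip() for name in class_text.splitlines() if name.strip()]

    counts = Counter()
    for label_path in label_dir.glob("*.txt"):
        if label_path.name == classes_file:
            continue  # skip the class list itself
        for line in label_path.read_text().splitlines():
            if line.strip():
                class_id = int(line.split()[0])  # first field is the class index
                counts[class_names[class_id]] += 1
    return dict(counts)


# Example usage with a hypothetical folder name:
# print(count_boxes_per_class("fruit/labels"))
```

A very lopsided count, for instance far more apples than bananas, is a hint that more images of the rarer class may be needed.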
This is just one example of an image labelling tool; there are many more out there. For example, AWS (Amazon Web Services) have their own image labelling tool, which is priced per bounding box. For fewer than 50,000 objects, it costs $0.08 per label, so labelling 5,000 images with one object each would cost $400! More details on this can be found here.
In my opinion, this is all the more reason for using LabelImg: in addition to being very easy to use, it is completely free!
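The cost arithmetic can be sketched in a few lines. It uses the $0.08 per-box price quoted above and assumes one bounding box per image unless told otherwise. The function and its parameters are illustrative, and real pricing tiers may differ.

```python
PRICE_PER_BOX_USD = 0.08  # per-box price quoted above for fewer than 50,000 objects


def labelling_cost(num_images, boxes_per_image=1, price_per_box=PRICE_PER_BOX_USD):
    """Estimate the cost in USD of a labelling service priced per bounding box."""
    return num_images * boxes_per_image * price_per_box


# 5,000 images with one object each
print(f"${labelling_cost(5_000):,.2f}")
# $400.00

# The same images with three objects each
print(f"${labelling_cost(5_000, boxes_per_image=3):,.2f}")
# $1,200.00
```

Because the price is per box rather than per image, images containing several fruits push the total up quickly.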
Part 1 of 2 blogs on object detection