Object detection is a computer vision task that involves identifying and locating objects in images and videos. YOLO (You Only Look Once) is a state-of-the-art object detection model that is widely used within the computer vision field. It uses a Convolutional Neural Network (CNN) that takes an image and predicts bounding boxes around objects, along with their corresponding class labels.
At the time of writing, YOLOv8 is the newest model in the YOLO series, and it will be used throughout this blog.
Hyperparameter tuning is an essential part of training any machine learning model. Hyperparameters are settings, such as the learning rate or the optimizer, that are fixed before training starts and influence the learning process. To produce the best possible predictions from a model, we must find a good set of hyperparameters.
In this blog, we will describe how to train a custom YOLOv8 model on Amazon SageMaker and use a hyperparameter tuning job to find the best hyperparameter configuration.
For the purposes of this blog, we will assume the following:
- We have a set of training images and labels saved in an S3 bucket
- We have a train.py file that contains the YOLO model
- We have a .yaml file that contains the paths to the training and validation images, along with the class indices and label names
To run a hyperparameter tuning job, we first need to set up an Estimator. Before we can do that, we must import all the necessary libraries. An example Estimator is shown below, and more details on each input can be found here.
# importing the libraries
import sagemaker
from sagemaker import get_execution_role
from sagemaker.estimator import Estimator
from sagemaker.pytorch import PyTorch
from sagemaker.tuner import (
    CategoricalParameter,
    ContinuousParameter,
    IntegerParameter,  # needed for the 'epochs' range defined later
    HyperparameterTuner,
    HyperbandStrategyConfig,
    StrategyConfig,
)

sagemaker_session = sagemaker.Session()
role = get_execution_role()
# setting the metric definitions for the YOLO model
# (raw strings keep the escaped parentheses intact)
metric_definitions = [
    {"Name": "precision", "Regex": r"YOLO Metric metrics/precision\(B\): (.*)"},
    {"Name": "recall", "Regex": r"YOLO Metric metrics/recall\(B\): (.*)"},
    {"Name": "mAP50", "Regex": r"YOLO Metric metrics/mAP50\(B\): (.*)"},
    {"Name": "mAP50-95", "Regex": r"YOLO Metric metrics/mAP50-95\(B\): (.*)"},
    {"Name": "box_loss", "Regex": r"YOLO Metric val/box_loss: (.*)"},
    {"Name": "cls_loss", "Regex": r"YOLO Metric val/cls_loss: (.*)"},
    {"Name": "dfl_loss", "Regex": r"YOLO Metric val/dfl_loss: (.*)"},
]

estimator = PyTorch(
    entry_point="train.py",
    role=role,
    image_uri="your/image",  # your image
    source_dir="./src",
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    framework_version="1.12.1",
    py_version="py38",
    sagemaker_session=sagemaker_session,
    hyperparameters={},
    use_spot_instances=True,
    input_mode="File",  # FastFile causes an issue with writing the label cache
    debugger_hook_config=False,
    max_wait=360000 + 3600,  # seconds; must be at least max_run for spot training
    max_run=360000,  # seconds (100 hours)
    output_path="path/to/output",
    enable_sagemaker_metrics=True,
    metric_definitions=metric_definitions,
)
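SageMaker applies each Regex to the training job's log output and records the captured group as the metric value, so train.py has to print lines in exactly this shape for the metrics to be picked up. The sketch below illustrates that round trip using only the standard library. The helper `format_metric_line` and the sample value are hypothetical and are not part of the original train.py.

```python
import re

# Same pattern as the "mAP50-95" entry in metric_definitions, as a raw string.
MAP_REGEX = r"YOLO Metric metrics/mAP50-95\(B\): (.*)"


def format_metric_line(name: str, value: float) -> str:
    """Hypothetical helper: build a log line in the shape the regexes expect."""
    return f"YOLO Metric {name}: {value}"


def extract_metric(pattern: str, log_line: str):
    """Return the captured value as a float, or None if the line does not match."""
    match = re.search(pattern, log_line)
    return float(match.group(1)) if match else None


# Illustrative value only, not a real training result.
line = format_metric_line("metrics/mAP50-95(B)", 0.4132)
print(line)
print(extract_metric(MAP_REGEX, line))
print(extract_metric(MAP_REGEX, "epoch 3 finished"))
```

Running a check like this locally before launching a job is a cheap way to catch a regex that silently never matches.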
The estimator defined above takes your train.py (source_dir must be the directory where this file is saved), sets the instance type, uses spot instances, and has a max_run time of 360,000 seconds (100 hours). This means that after 100 hours Amazon SageMaker terminates the job, regardless of how far it has progressed.
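Both time limits are given in seconds, which makes them easy to misread. The arithmetic below is a small sanity check of the two values used in the estimator; the variable names are only for illustration.

```python
SECONDS_PER_HOUR = 3600

max_run = 360000          # same value as passed to the estimator
max_wait = 360000 + 3600  # same value as passed to the estimator

max_run_hours = max_run / SECONDS_PER_HOUR
max_wait_hours = max_wait / SECONDS_PER_HOUR

# Extra time allowed on top of the run time, e.g. while waiting for spot capacity.
extra_wait_hours = max_wait_hours - max_run_hours

print(max_run_hours, max_wait_hours, extra_wait_hours)
```

With these values the job may run for up to 100 hours, and the total limit, including any time spent waiting for spot capacity, is 101 hours.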
Any hyperparameters that should keep the same value across all training jobs can also be set as constants here, through the hyperparameters argument. Again, more details on these can be found here.
The train.py file should include code similar to the following, with the hyperparameters that you want to tune added to the parser:
# train.py
import argparse
import sys
import os
import shutil

from ultralytics import YOLO

# Each tuned hyperparameter arrives as a command-line argument
parser = argparse.ArgumentParser()
parser.add_argument('--epochs', help='number of training epochs')
parser.add_argument('--optimizer', help='optimizer to use')
parser.add_argument('--lr0', help='initial learning rate')
parser.add_argument('--lrf', help='final learning rate')
parser.add_argument('--momentum', help='momentum')
parser.add_argument('--weight_decay', help='optimizer weight decay')
args = parser.parse_args()

print('---------------Debug injected environment and arguments--------------------')
print(sys.argv)
print(os.environ)
print('---------------End debug----------------------')

model = YOLO("yolov8n.yaml")
# Arguments are parsed as strings, so cast them before passing them on
model.train(
    data='./blaa.yaml',
    epochs=int(args.epochs),
    batch=64,
    optimizer=args.optimizer,
    lr0=float(args.lr0),
    lrf=float(args.lrf),
    momentum=float(args.momentum),
    weight_decay=float(args.weight_decay),
)
model.export()
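As a possible variation on the parser above, argparse can do the type conversion itself, which removes the manual int() and float() casts and rejects malformed values early. This is a minimal sketch, assuming the hyperparameters reach the script as `--name value` arguments as the original parser implies; `build_parser` is a hypothetical helper name, and the argument values in the example are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: same arguments as train.py, with types declared."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, help="number of training epochs")
    parser.add_argument(
        "--optimizer",
        choices=["SGD", "Adam", "AdamW", "RMSProp"],  # same options as the tuning range
        help="optimizer to use",
    )
    parser.add_argument("--lr0", type=float, help="initial learning rate")
    parser.add_argument("--lrf", type=float, help="final learning rate")
    parser.add_argument("--momentum", type=float, help="momentum")
    parser.add_argument("--weight_decay", type=float, help="optimizer weight decay")
    return parser


# Simulate the command line a training job might receive (illustrative values).
example_argv = [
    "--epochs", "150",
    "--optimizer", "AdamW",
    "--lr0", "0.001",
    "--lrf", "0.01",
    "--momentum", "0.937",
    "--weight_decay", "0.0005",
]
args = build_parser().parse_args(example_argv)
print(args.epochs, args.optimizer, args.lr0)
```

With typed arguments, the values can be passed straight to model.train without further casting.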
As mentioned at the start, we need a .yaml file to run the YOLOv8 model. This should contain the following details:
# .yaml file
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: /opt/ml/input/your/s3/bucket  # dataset root dir
train: images/train  # train images (relative to 'path')
val: images/train  # val images (relative to 'path')
test:  # test/images  # test images (optional)

# Classes
names:
  0: 'label1'
  1: 'label2'
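The dataset root in this file has to match where SageMaker places the data inside the container. In File mode, each input channel is normally made available under /opt/ml/input/data/<channel_name>, and when a plain S3 path is passed to fit, the SageMaker Python SDK names the channel "training". The sketch below builds the file contents from that location. The helper names are hypothetical, and the SM_CHANNEL_TRAINING environment variable is an assumption that only holds if your image includes the SageMaker training toolkit.

```python
import os


def dataset_root(channel: str = "training", env=None) -> str:
    """Hypothetical helper: resolve the local directory of an input channel."""
    env = os.environ if env is None else env
    # Assumption: SM_CHANNEL_<NAME> is set by the SageMaker training toolkit.
    return env.get(f"SM_CHANNEL_{channel.upper()}", f"/opt/ml/input/data/{channel}")


def render_dataset_yaml(root: str, names: list,
                        train: str = "images/train",
                        val: str = "images/train") -> str:
    """Hypothetical helper: render the dataset .yaml as text."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    lines += [f"  {index}: '{name}'" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# An empty mapping simulates an environment without the toolkit variable.
yaml_text = render_dataset_yaml(dataset_root(env={}), ["label1", "label2"])
print(yaml_text)
```

Writing the file at the start of train.py, rather than hard-coding the path, keeps it correct if the channel name or mount point changes.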
Now, we need to define the ranges of the hyperparameters we want to tune. This is shown below, where each hyperparameter is an IntegerParameter, a CategoricalParameter, or a ContinuousParameter.
hyperparameter_ranges = {
    'epochs': IntegerParameter(100, 300),
    'optimizer': CategoricalParameter(['SGD', 'Adam', 'AdamW', 'RMSProp']),
    'lr0': ContinuousParameter(0.00001, 0.01),
    'lrf': ContinuousParameter(0.00001, 0.01),
    'momentum': ContinuousParameter(0.9, 0.9999),
    'weight_decay': ContinuousParameter(0.0003, 0.00099),
}
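The learning-rate ranges above span three orders of magnitude, so the scale on which they are searched matters. ContinuousParameter accepts a scaling_type argument (for example "Logarithmic"), and its default of "Auto" leaves the choice to SageMaker. The sketch below is a standard-library illustration, not SageMaker's actual search procedure: it computes how much of the lr0 range lies below 0.001 on a linear scale compared with a logarithmic one. `fraction_below` is a hypothetical helper.

```python
import math


def fraction_below(threshold: float, low: float, high: float, scale: str = "linear") -> float:
    """Share of the [low, high] range that lies below threshold on the given scale."""
    if scale == "log":
        threshold, low, high = (math.log10(v) for v in (threshold, low, high))
    return (threshold - low) / (high - low)


LR_MIN, LR_MAX = 0.00001, 0.01  # same bounds as the lr0 range

linear_share = fraction_below(0.001, LR_MIN, LR_MAX, scale="linear")
log_share = fraction_below(0.001, LR_MIN, LR_MAX, scale="log")

print(f"linear: {linear_share:.3f}, log: {log_share:.3f}")
```

On a linear scale only about a tenth of the range covers learning rates below 0.001, whereas on a logarithmic scale about two thirds does, which is why a logarithmic scale is often preferred for learning rates.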
To create a tuner we use HyperparameterTuner, which takes the following inputs:
- Our estimator
- The objective metric and the metric definitions (set above)
  - Here we have chosen to maximise the mean average precision, mAP50-95
- Hyperparameter ranges
- The maximum number of training jobs (max_jobs)
- Strategy
  - We have set the strategy to be Hyperband. More details on these options here
tuner = HyperparameterTuner(
    estimator,
    objective_metric_name="mAP50-95",
    metric_definitions=metric_definitions,
    hyperparameter_ranges=hyperparameter_ranges,
    strategy='Hyperband',
    max_jobs=50,
    strategy_config=StrategyConfig(
        hyperband_strategy_config=HyperbandStrategyConfig(max_resource=10, min_resource=1)
    ),
)
Finally, we fit the tuner by passing in the S3 path to the training data:
tuner.fit('S3/path/to/training-data')
This starts up a hyperparameter tuning job.
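Once the job has been launched, the tuner object can also be used to collect the outcome. The sketch below wraps two HyperparameterTuner methods, wait() and best_training_job(), in a hypothetical helper. So that it runs without AWS access, it is demonstrated with a stand-in object; in a notebook you would pass the real tuner instead.

```python
def wait_for_best_job(tuner) -> str:
    """Hypothetical helper: block until tuning finishes, then return the best job name."""
    tuner.wait()                      # blocks until the tuning job has completed
    return tuner.best_training_job()  # name of the job with the best objective value


class FakeTuner:
    """Stand-in for a real HyperparameterTuner, used only for this demonstration."""

    def __init__(self):
        self.waited = False

    def wait(self):
        self.waited = True

    def best_training_job(self):
        return "example-training-job-001"  # illustrative name, not a real job


fake = FakeTuner()
best_job_name = wait_for_best_job(fake)
print(best_job_name)
```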
Your work here is done. Time to sit back and wait for the results!