GluonCV SSD Mobilenet training and optimizing using SageMaker Neo¶
Introduction¶
This is an end-to-end example of training a GluonCV SSD model in a SageMaker notebook and then compiling the trained model with SageMaker Neo. In this demo, we show how to train a MobileNet model on the Pascal VOC dataset using the Single Shot MultiBox Detector (SSD) algorithm. We also show how to optimize the trained model with SageMaker Neo and host it.
*This notebook is for demonstration purposes only. Please tune the training parameters for your own dataset.*
Setup¶
To train the SSD MobileNet model on Amazon SageMaker, we need to set up and authenticate the use of AWS services.
To start, we upgrade the SageMaker SDK for Python to the latest version, if it is not already up to date, and verify the version before proceeding.
[ ]:
!~/anaconda3/envs/mxnet_p36/bin/pip install --upgrade sagemaker
[ ]:
import sagemaker
if sagemaker.__version__.split('.')[0] == '1':
    raise Exception("Please upgrade sagemaker SDK by running the above cell while ensuring kernel name is the same as the one being used. Restart the kernel after upgrade.")
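The check above compares the major version as a string. A slightly more defensive variant, sketched here with hypothetical helpers `major_version` and `needs_upgrade` (neither is part of the SageMaker SDK), parses the major component as an integer so the comparison is numeric:

```python
def major_version(version: str) -> int:
    """Return the integer major component of a dotted version string."""
    return int(version.split(".")[0])


def needs_upgrade(version: str, minimum_major: int = 2) -> bool:
    """True when the major version is below the required one (2 is assumed here)."""
    return major_version(version) < minimum_major
```

In the notebook this could be called as `needs_upgrade(sagemaker.__version__)`.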
Then we need an AWS account role with SageMaker access. This role is used to give SageMaker access to your data in S3. We also create a session.
[ ]:
from sagemaker import get_execution_role
role = get_execution_role()
sess = sagemaker.Session()
We then need an S3 bucket to store the training data, the custom code, and the model artifacts generated by training and compilation.
[ ]:
# S3 bucket and folders for saving code and model artifacts.
# Feel free to specify different bucket/folders here if you wish.
bucket = sess.default_bucket()
folder = 'DEMO-ObjectDetection'
custom_code_sub_folder = folder + '/custom-code'
training_data_sub_folder = folder + '/training-data'
training_output_sub_folder = folder + '/training-output'
compilation_output_sub_folder = folder + '/compilation-output'
To easily visualize the detection outputs, we also define the following function. It draws bounding boxes for the high-confidence predictions and filters out low-confidence detections.
[ ]:
%matplotlib inline
def visualize_detection(img_file, dets, classes=(), thresh=0.6):
    """
    Visualize detections in one image.

    Parameters:
    ----------
    img_file : str
        path to the image file
    dets : list
        endpoint response [class_ids, scores, bboxes], each with a leading
        batch axis; boxes are (x0, y0, x1, y1) in the 512x512 input frame
    classes : tuple or list of str
        class names
    thresh : float
        score threshold
    """
    import random
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    from matplotlib.patches import Rectangle

    img = mpimg.imread(img_file)
    plt.imshow(img)
    height = img.shape[0]
    width = img.shape[1]
    colors = dict()
    klasses = dets[0][0]
    scores = dets[1][0]
    bbox = dets[2][0]
    # Iterate over the detections, not over the class names.
    for i in range(len(klasses)):
        klass = klasses[i][0]
        score = scores[i][0]
        x0, y0, x1, y1 = bbox[i]
        if score < thresh:
            continue
        cls_id = int(klass)
        if cls_id not in colors:
            colors[cls_id] = (random.random(), random.random(), random.random())
        # Rescale box corners from the 512x512 input frame to the image size.
        xmin = int(x0 * width / 512)
        ymin = int(y0 * height / 512)
        xmax = int(x1 * width / 512)
        ymax = int(y1 * height / 512)
        rect = Rectangle((xmin, ymin), xmax - xmin,
                         ymax - ymin, fill=False,
                         edgecolor=colors[cls_id],
                         linewidth=3.5)
        plt.gca().add_patch(rect)
        class_name = str(cls_id)
        if classes and len(classes) > cls_id:
            class_name = classes[cls_id]
        plt.gca().text(xmin, ymin - 2,
                       '{:s} {:.3f}'.format(class_name, score),
                       bbox=dict(facecolor=colors[cls_id], alpha=0.5),
                       fontsize=12, color='white')
    plt.tight_layout(rect=[0, 0, 2, 2])
    plt.show()
[ ]:
# Initializing object categories
object_categories = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']
# Setting a threshold 0.20 will only plot detection results that have a confidence score greater than 0.20.
threshold = 0.20
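The filtering and rescaling that visualize_detection performs can be separated from the plotting. The following is a minimal sketch in pure Python. The helper name `filter_detections` is hypothetical, and it assumes the same response layout that visualize_detection indexes into: class ids, scores, and boxes, each with a leading batch axis, with boxes expressed in the 512×512 input frame.

```python
def filter_detections(dets, width, height, thresh=0.6, input_size=512):
    """Keep detections scoring at least `thresh` and rescale boxes to the image size.

    Assumed layout: dets = [class_ids, scores, bboxes], each with a leading
    batch axis, i.e. class_ids[0][i][0], scores[0][i][0] and
    bboxes[0][i] = (x0, y0, x1, y1).
    """
    class_ids, scores, bboxes = dets[0][0], dets[1][0], dets[2][0]
    kept = []
    for klass, score, box in zip(class_ids, scores, bboxes):
        score = float(score[0])
        if score < thresh:
            continue
        x0, y0, x1, y1 = box
        kept.append((
            int(klass[0]),
            score,
            (
                int(x0 * width / input_size),
                int(y0 * height / input_size),
                int(x1 * width / input_size),
                int(y1 * height / input_size),
            ),
        ))
    return kept


# Toy response with two detections; only the first passes the 0.20 threshold.
toy = [
    [[[6.0], [14.0]]],
    [[[0.9], [0.1]]],
    [[[0.0, 0.0, 256.0, 256.0], [10.0, 10.0, 20.0, 20.0]]],
]
result = filter_detections(toy, width=1024, height=768, thresh=0.20)
```

Keeping this step free of matplotlib makes it easy to check the box arithmetic on a toy response before an endpoint exists.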
Finally, we load the test image into memory. The test image used in this notebook is from Pexels and remains unseen by the model until prediction time.
[ ]:
import PIL.Image
import numpy as np
test_file = 'test.jpg'
test_image = PIL.Image.open(test_file)
test_image = np.asarray(test_image.resize((512, 512)))
Data Preparation¶
Pascal VOC was a popular computer vision challenge that released annual datasets for object detection from 2005 to 2012. In this notebook, we use the datasets from 2007 and 2012, named VOC07 and VOC12 respectively. Together they contain more than 20,000 images with about 50,000 annotated objects, grouped into 20 categories.
*Notes:*
1. While using the Pascal VOC dataset, please be aware of the database usage rights. The VOC data includes images obtained from Flickr’s website. Use of these images must respect the corresponding terms of use: https://www.flickr.com/help/terms
2. The default EBS volume size for SageMaker Notebook instances is 5GB. If you run out of storage while performing this step, consider increasing the volume size. One way to do so is by using AWS CLI as documented here.
Download data¶
Download the Pascal VOC datasets from 2007 and 2012 from Oxford University’s website.
*Following is an alternative link to download the dataset if there is some connection problem: https://course.fast.ai/datasets#image-localization*
[ ]:
%%time
# Download the dataset
!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
!tar -xf /tmp/VOCtrainval_11-May-2012.tar && rm /tmp/VOCtrainval_11-May-2012.tar
!tar -xf /tmp/VOCtrainval_06-Nov-2007.tar && rm /tmp/VOCtrainval_06-Nov-2007.tar
!tar -xf /tmp/VOCtest_06-Nov-2007.tar && rm /tmp/VOCtest_06-Nov-2007.tar
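The extraction step can also be done from Python with the standard library instead of shelling out to tar. This is only a sketch: the helper name `extract_archive` is hypothetical, and unlike the cell above it does not delete the archive afterwards.

```python
import tarfile


def extract_archive(archive_path, dest_dir="."):
    """Extract a tar archive into dest_dir and return the member names."""
    with tarfile.open(archive_path) as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Python versions with extraction filters can reject unsafe member paths.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return names
```

For example, `extract_archive('/tmp/VOCtrainval_06-Nov-2007.tar')` would unpack into the current directory, as the tar command above does.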
Convert data into RecordIO¶
RecordIO is a highly efficient binary data format from MXNet. With this format, the dataset is simple to prepare and transfer to the instance that runs the training job. Please refer to object_detection_recordio_format for more information about how to prepare a RecordIO dataset.
[ ]:
!python tools/prepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target VOCdevkit/train.lst
!rm -rf VOCdevkit/VOC2012
!python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target VOCdevkit/val.lst --no-shuffle
!rm -rf VOCdevkit/VOC2007
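The .lst files named above are plain-text listings with one image per line, from which the RecordIO files are built. The sketch below shows how one such line could be assembled. It assumes a header width of 2 and five values per object (class id followed by xmin, ymin, xmax, ymax normalized to [0, 1]). The helper `make_lst_line` and the example path are hypothetical, and the actual output of tools/prepare_dataset.py may carry additional per-object fields.

```python
def make_lst_line(index, objects, image_path):
    """Build one tab-separated .lst line for a detection dataset.

    Assumed layout: index, header width (2), label width (5), then for each
    object [class_id, xmin, ymin, xmax, ymax] with coordinates normalized
    to [0, 1], and finally the image path.
    """
    header_width, label_width = 2, 5
    fields = [str(index), str(header_width), str(label_width)]
    for obj in objects:
        if len(obj) != label_width:
            raise ValueError("each object needs class_id, xmin, ymin, xmax, ymax")
        class_id, *coords = obj
        fields.append(str(int(class_id)))
        fields.extend("{:.4f}".format(c) for c in coords)
    fields.append(image_path)
    return "\t".join(fields)


# One image containing a single object of class 11 ('dog' in the list above).
line = make_lst_line(0, [(11, 0.1, 0.2, 0.5, 0.9)], "VOC2007/JPEGImages/000001.jpg")
```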
Upload data to S3¶
Upload the converted data to the S3 bucket.
[ ]:
# Upload the training RecordIO files (.rec and .idx) to the training data location
sess.upload_data(path='VOCdevkit/train.rec', bucket=bucket, key_prefix=training_data_sub_folder)
sess.upload_data(path='VOCdevkit/train.idx', bucket=bucket, key_prefix=training_data_sub_folder)
Next, we set up the training and compilation output locations in S3, where the respective model artifacts will be stored. We also set up the S3 location for the custom code.
[ ]:
# S3 Location where the training data is stored in the previous step
s3_training_data_location = 's3://{}/{}'.format(bucket, training_data_sub_folder)
# S3 Location to save the model artifact after training
s3_training_output_location = 's3://{}/{}'.format(bucket, training_output_sub_folder)
# S3 Location to save the model artifact after compilation
s3_compilation_output_location = 's3://{}/{}'.format(bucket, compilation_output_sub_folder)
# S3 Location to save your custom code in tar.gz format
s3_custom_code_upload_location = 's3://{}/{}'.format(bucket, custom_code_sub_folder)
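The four locations above follow the same bucket-plus-prefix pattern. As a small sketch, a hypothetical helper `s3_uri` (not part of the SageMaker SDK) could build them while normalizing stray slashes. The bucket name in the example is a placeholder.

```python
def s3_uri(bucket, *key_parts):
    """Join a bucket name and key components into an s3:// URI."""
    key = "/".join(part.strip("/") for part in key_parts if part)
    return "s3://{}/{}".format(bucket, key) if key else "s3://{}".format(bucket)


# Placeholder bucket name; in the notebook this would be sess.default_bucket().
uri = s3_uri("my-example-bucket", "DEMO-ObjectDetection", "training-data")
```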
Train the model¶
Now that we are done with all the setup that is needed, we are ready to train our object detector. To begin, we will create a SageMaker MXNet estimator object which allows us to run single machine or distributed training in SageMaker, using CPU or GPU-based instances. After creating the estimator, training is started by calling fit() on this estimator. When we create the estimator, we pass:
- entry_point: filename of the Python script that defines the training and hosting methods. Here we use ssd_entry_point.py.
- role: name of our IAM execution role.
- output_path: S3 path where the training artifacts will be stored. We defined this in the previous step.
- code_location: S3 path where the custom code, including the entry_point script, will be stored. We defined this in the previous step.
- instance_count & instance_type: the number and type of SageMaker instances used for the training job. For this example, we choose one ml.p3.2xlarge instance.
- framework_version & py_version: the MXNet and Python versions to use.
- distribution: dict with information on how to run distributed training. Here we use distributed training with parameter_server.
- hyperparameters: dict of values that will be passed to the entry_point script.
[ ]:
from sagemaker.mxnet import MXNet
ssd_estimator = MXNet(entry_point='ssd_entry_point.py',
                      role=role,
                      output_path=s3_training_output_location,
                      code_location=s3_custom_code_upload_location,
                      instance_count=1,
                      instance_type='ml.p3.2xlarge',
                      framework_version='1.7.0',
                      py_version='py3',
                      distribution={'parameter_server': {'enabled': True}},
                      hyperparameters={'epochs': 1,
                                       'data-shape': 512,
                                       }
                      )
[ ]:
ssd_estimator.fit({'train': s3_training_data_location})
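On the training instance, SageMaker passes the hyperparameters to the entry point script as command-line arguments and exposes each input channel through an environment variable (SM_CHANNEL_TRAIN for the 'train' channel used here). The contents of ssd_entry_point.py are not shown in this notebook, so the following is only a sketch of how such a script might read these values. The function name `parse_args` and the defaults are assumptions.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters the way a script-mode entry point typically does."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--data-shape", type=int, default=512)
    # Data and model directories come from environment variables set by SageMaker.
    parser.add_argument("--train", type=str, default=os.environ.get("SM_CHANNEL_TRAIN", ""))
    parser.add_argument("--model-dir", type=str, default=os.environ.get("SM_MODEL_DIR", ""))
    # parse_known_args tolerates arguments that this sketch does not declare.
    args, _ = parser.parse_known_args(argv)
    return args


# Simulate the arguments produced by hyperparameters={'epochs': 1, 'data-shape': 512}.
args = parse_args(["--epochs", "1", "--data-shape", "512"])
```

Note that argparse turns the hyphen in `--data-shape` into an underscore, so the value is read as `args.data_shape`.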
Compile the trained model using SageMaker Neo¶
After training, we can use SageMaker Neo’s compile_model() API to compile the trained model. When calling compile_model(), the user is expected to provide the correct input shapes required by the model for compilation to succeed. We also specify the target instance family, the name of our IAM execution role, and the S3 location where the compiled model will be stored, and we set the MMS_DEFAULT_RESPONSE_TIMEOUT environment variable to 500.
For this example, we will choose ml_p3 as the target instance family while compiling the trained model.
[ ]:
compiled_model = ssd_estimator.compile_model(target_instance_family='ml_p3',
                                             input_shape={'data': [1, 3, 512, 512]},
                                             role=role,
                                             output_path=s3_compilation_output_location,
                                             framework='mxnet',
                                             framework_version='1.7',
                                             env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'})
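In this notebook, the input shape given to compile_model() uses the same spatial size as the data-shape hyperparameter used for training (512). A small sketch with a hypothetical helper, `neo_input_shape`, derives one from the other. It assumes a single square, three-channel input in NCHW layout named data, as in the call above.

```python
def neo_input_shape(data_shape, batch_size=1, channels=3, input_name="data"):
    """Build the input_shape dict for a square NCHW image input."""
    return {input_name: [batch_size, channels, data_shape, data_shape]}


shape = neo_input_shape(512)
```

Deriving the dict from one value keeps the training and compilation settings from drifting apart if the data shape is changed later.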
Deploy the compiled model and request Inferences¶
We have to deploy the compiled model on an instance from the family it was compiled for. Since we compiled for ml_p3, we can deploy to any ml.p3 instance type. For this example, we choose ml.p3.2xlarge.
[ ]:
neo_object_detector = compiled_model.deploy(initial_instance_count=1, instance_type='ml.p3.2xlarge')
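The relationship between the compilation target (ml_p3) and the deployment instance type (ml.p3.2xlarge) can be expressed as a simple string mapping. The helper `target_family` below is hypothetical, and the naming rule is inferred only from the pair used in this example. It does not check whether a family is actually supported as a Neo target.

```python
def target_family(instance_type):
    """Map an instance type such as 'ml.p3.2xlarge' to a family such as 'ml_p3'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError("expected an instance type like 'ml.p3.2xlarge'")
    return "{}_{}".format(parts[0], parts[1])


family = target_family("ml.p3.2xlarge")
```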
[ ]:
%%time
response = neo_object_detector.predict(test_image)
[ ]:
# Visualize the detections.
visualize_detection(test_file, response, object_categories, threshold)
Delete the Endpoint¶
A running endpoint incurs costs. As an optional clean-up step, you can delete it.
[ ]:
print("Endpoint name: " + neo_object_detector.endpoint_name)
neo_object_detector.delete_endpoint()