GluonCV SSD Mobilenet training and optimizing using SageMaker Neo

  1. Introduction

  2. Setup

  3. Data Preparation

  4. Download data

  5. Convert data into RecordIO

  6. Upload data to S3

  7. Train the model

  8. Compile the trained model using SageMaker Neo

  9. Deploy the compiled model and request Inferences

  10. Delete the Endpoint

Introduction

This is an end-to-end example of training a GluonCV SSD model inside a SageMaker notebook and then compiling the trained model with SageMaker Neo. In this demo, we train a MobileNet model on the Pascal VOC dataset using the Single Shot MultiBox Detector (SSD) algorithm. We also show how to optimize the trained model with SageMaker Neo and host it.

*This notebook is for demonstration purposes only. Please fine-tune the training parameters based on your own dataset.*

Setup

To train the SSD MobileNet model on Amazon SageMaker, we need to set up and authenticate the use of AWS services.

To start, upgrade the SageMaker SDK for Python to the latest version if it is not already up to date, and verify the version before proceeding.

[ ]:
!~/anaconda3/envs/mxnet_p36/bin/pip install --upgrade sagemaker
[ ]:
import sagemaker
if sagemaker.__version__.split('.')[0] == '1':
    raise Exception("Please upgrade sagemaker SDK by running the above cell while ensuring kernel name is the same as the one being used. Restart the kernel after upgrade.")

Then we need an AWS account role with SageMaker access. This role is used to give SageMaker access to your data in S3. We also create a session.

[ ]:
from sagemaker import get_execution_role

role = get_execution_role()
sess = sagemaker.Session()

We then need an S3 bucket to store the training data, the custom code, and the model artifacts generated by training and compilation.

[ ]:
# S3 bucket and folders for saving code and model artifacts.
# Feel free to specify different bucket/folders here if you wish.
bucket = sess.default_bucket()
folder = 'DEMO-ObjectDetection'
custom_code_sub_folder = folder + '/custom-code'
training_data_sub_folder = folder + '/training-data'
training_output_sub_folder = folder + '/training-output'
compilation_output_sub_folder = folder + '/compilation-output'

To easily visualize the detection outputs, we also define the following function. It draws bounding boxes for the high-confidence predictions and filters out low-confidence detections.

[ ]:
%matplotlib inline
def visualize_detection(img_file, dets, classes=[], thresh=0.6):
        """
        Visualize the detections for one image.

        Parameters:
        ----------
        img_file : str
            path to the image file, read with matplotlib.image.imread
        dets : list
            SSD detections as [class_ids, scores, bboxes], each with a
            leading batch dimension; every bbox row is [x0, y0, x1, y1]
            in the coordinates of the 512x512 model input
        classes : tuple or list of str
            class names
        thresh : float
            score threshold
        """
        import random
        import matplotlib.pyplot as plt
        import matplotlib.image as mpimg
        from matplotlib.patches import Rectangle

        img=mpimg.imread(img_file)
        plt.imshow(img)
        height = img.shape[0]
        width = img.shape[1]
        colors = dict()
        klasses = dets[0][0]
        scores = dets[1][0]
        bbox = dets[2][0]
        # Iterate over the detections, not over the class names.
        for i in range(len(klasses)):
            klass = klasses[i][0]
            score = scores[i][0]
            x0, y0, x1, y1 = bbox[i]
            if score < thresh:
                continue
            cls_id = int(klass)
            if cls_id not in colors:
                colors[cls_id] = (random.random(), random.random(), random.random())
            xmin = int(x0 * width / 512)
            ymin = int(y0 * height / 512)
            xmax = int(x1 * width / 512)
            ymax = int(y1 * height / 512)
            rect = Rectangle((xmin, ymin), xmax - xmin,
                                 ymax - ymin, fill=False,
                                 edgecolor=colors[cls_id],
                                 linewidth=3.5)
            plt.gca().add_patch(rect)
            class_name = str(cls_id)
            if classes and len(classes) > cls_id:
                class_name = classes[cls_id]
            plt.gca().text(xmin, ymin-2,
                            '{:s} {:.3f}'.format(class_name, score),
                            bbox=dict(facecolor=colors[cls_id], alpha=0.5),
                                    fontsize=12, color='white')
        plt.tight_layout(rect=[0, 0, 2, 2])
        plt.show()
[ ]:
# Initializing object categories
object_categories = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
                     'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
                     'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

# Setting a threshold 0.20 will only plot detection results that have a confidence score greater than 0.20.
threshold = 0.20
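
To make the effect of the threshold concrete, here is a minimal sketch of the filtering step on synthetic data. It assumes the detections arrive as three arrays (class ids, scores, boxes) with a leading batch dimension, which is the layout that visualize_detection indexes into. The helper filter_detections and the example values are hypothetical and are not part of the notebook or of any library.

```python
import numpy as np

def filter_detections(dets, thresh):
    """Return (class_id, score, box) tuples whose score is at least `thresh`.

    Assumes `dets` is [class_ids, scores, boxes], each with a leading
    batch dimension of 1 (a hypothetical stand-in for the endpoint output).
    """
    class_ids, scores, boxes = (np.asarray(d)[0] for d in dets)
    kept = []
    for class_id, score, box in zip(class_ids[:, 0], scores[:, 0], boxes):
        if score < thresh:
            continue  # same rule as in visualize_detection
        kept.append((int(class_id), float(score), [float(v) for v in box]))
    return kept

# Synthetic example: three candidate detections for a single image.
example_dets = [
    np.array([[[14.0], [11.0], [-1.0]]]),            # class ids, shape (1, 3, 1)
    np.array([[[0.91], [0.15], [-1.0]]]),            # scores, shape (1, 3, 1)
    np.array([[[10, 20, 200, 400],
               [30, 40, 100, 120],
               [-1, -1, -1, -1]]], dtype=float),     # boxes, shape (1, 3, 4)
]
kept = filter_detections(example_dets, thresh=0.20)
for class_id, score, box in kept:
    print(class_id, round(score, 2), box)
```

With a threshold of 0.20, only the first synthetic detection survives; the low-score row and the negative placeholder row are dropped.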

Finally, we load the test image into memory. The test image used in this notebook is from PEXELS and remains unseen by the model until prediction time.

[ ]:
import PIL.Image
import numpy as np

test_file = 'test.jpg'
test_image = PIL.Image.open(test_file)
test_image = np.asarray(test_image.resize((512, 512)))
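
The array produced above has height-width-channel (HWC) layout, while the model is later compiled for an input of shape [1, 3, 512, 512]. The notebook sends the HWC array to the endpoint as is, so any layout conversion presumably happens in the hosting code, which is not shown here. As an illustration only, the following sketch shows how such a conversion could look in NumPy, using a synthetic array in place of the test image.

```python
import numpy as np

# Synthetic stand-in for the resized test image: 512 x 512 pixels, 3 channels.
hwc = np.zeros((512, 512, 3), dtype=np.uint8)
hwc[..., 0] = 255  # fill the first (red) channel

# Move channels first, then add a batch dimension: HWC -> CHW -> NCHW.
nchw = np.expand_dims(hwc.transpose(2, 0, 1), axis=0)

print(hwc.shape, "->", nchw.shape)
```

This assumes a three-channel image; a grayscale or RGBA file would need to be converted to RGB before the transpose.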

Data Preparation

Pascal VOC was a popular computer vision challenge that released annual datasets for object detection from 2005 to 2012. In this notebook, we use the datasets from 2007 and 2012, referred to as VOC07 and VOC12 respectively. Together they contain more than 20,000 images with about 50,000 annotated objects, grouped into 20 categories.

*Notes:*

  1. While using the Pascal VOC dataset, please be aware of the database usage rights. The VOC data includes images obtained from flickr’s website. Use of these images must respect the corresponding terms of use: https://www.flickr.com/help/terms

  2. The default EBS volume size for SageMaker Notebook instances is 5GB. If you run out of storage while performing this step, consider increasing the volume size. One way to do so is with the AWS CLI.
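
Before downloading, it can help to check how much disk space is free. The sketch below uses only the Python standard library. The REQUIRED_GIB value is a hypothetical placeholder, not a measured requirement for this dataset, so adjust it to your own estimate.

```python
import shutil

def free_gib(path="."):
    """Return the free disk space, in GiB, of the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024 ** 3

REQUIRED_GIB = 10  # hypothetical headroom, adjust for your setup

available = free_gib(".")
if available < REQUIRED_GIB:
    print("Only {:.1f} GiB free; consider increasing the volume size.".format(available))
else:
    print("{:.1f} GiB free.".format(available))
```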

Download data

Download the Pascal VOC datasets from 2007 and 2012 from Oxford University’s website.

*The following is an alternative link for downloading the dataset if you run into connection problems: https://course.fast.ai/datasets#image-localization*

[ ]:
%%time

# Download the dataset
!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

# Extract the data.
!tar -xf /tmp/VOCtrainval_11-May-2012.tar && rm /tmp/VOCtrainval_11-May-2012.tar
!tar -xf /tmp/VOCtrainval_06-Nov-2007.tar && rm /tmp/VOCtrainval_06-Nov-2007.tar
!tar -xf /tmp/VOCtest_06-Nov-2007.tar && rm /tmp/VOCtest_06-Nov-2007.tar
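
If you prefer to stay in Python rather than shelling out to tar, the extraction step can be sketched with the standard tarfile module. The helper extract_tar is hypothetical, and the tiny archive built here is only a stand-in so that the example runs without the real downloads.

```python
import os
import tarfile
import tempfile

def extract_tar(archive_path, dest_dir="."):
    """Extract `archive_path` into `dest_dir` and return the member names."""
    with tarfile.open(archive_path) as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names

# Self-contained demo on a small throwaway archive.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "sample.txt")
    with open(sample, "w") as f:
        f.write("demo")
    archive_path = os.path.join(workdir, "sample.tar")
    with tarfile.open(archive_path, "w") as archive:
        archive.add(sample, arcname="VOCdevkit/sample.txt")
    out_dir = os.path.join(workdir, "out")
    os.makedirs(out_dir)
    extracted = extract_tar(archive_path, out_dir)
    extracted_exists = os.path.isfile(os.path.join(out_dir, "VOCdevkit", "sample.txt"))

print(extracted)
```

Only extract archives from sources you trust, since tar members can carry arbitrary paths.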

Convert data into RecordIO

RecordIO is a highly efficient binary data format from MXNet. With this format, the dataset is simple to prepare and to transfer to the instance that runs the training job. Please refer to object_detection_recordio_format for more information about how to prepare a RecordIO dataset.

[ ]:
!python tools/prepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target VOCdevkit/train.lst
!rm -rf VOCdevkit/VOC2012
!python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target VOCdevkit/val.lst --no-shuffle
!rm -rf VOCdevkit/VOC2007

Upload data to S3

Upload the converted data to the S3 bucket.

[ ]:
# Upload the training RecordIO file and its index to the training data location
sess.upload_data(path='VOCdevkit/train.rec', bucket=bucket, key_prefix=training_data_sub_folder)
sess.upload_data(path='VOCdevkit/train.idx', bucket=bucket, key_prefix=training_data_sub_folder)
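
Because the upload depends on files produced by the conversion step, a quick existence check can save a failed cell. This is a minimal sketch; the helper missing_files is hypothetical, and the paths are the ones used in the upload cell above.

```python
import os

def missing_files(paths):
    """Return the subset of `paths` that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]

expected = ["VOCdevkit/train.rec", "VOCdevkit/train.idx"]
missing = missing_files(expected)
if missing:
    print("Not found, rerun the conversion step:", missing)
else:
    print("All RecordIO files are present.")
```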

Next, we need to set up the training and compilation output locations in S3, where the respective model artifacts will be stored. We also set up the S3 location for the custom code.

[ ]:
# S3 Location where the training data is stored in the previous step
s3_training_data_location = 's3://{}/{}'.format(bucket, training_data_sub_folder)

# S3 Location to save the model artifact after training
s3_training_output_location = 's3://{}/{}'.format(bucket, training_output_sub_folder)

# S3 Location to save the model artifact after compilation
s3_compilation_output_location = 's3://{}/{}'.format(bucket, compilation_output_sub_folder)

# S3 Location to save your custom code in tar.gz format
s3_custom_code_upload_location = 's3://{}/{}'.format(bucket, custom_code_sub_folder)
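
The four locations above all follow the same s3://bucket/prefix pattern. The sketch below shows that pattern in isolation and how to split such a URI back into its parts with the standard library. The helpers s3_uri and split_s3_uri and the bucket name example-bucket are hypothetical and are not part of the SageMaker SDK.

```python
from urllib.parse import urlparse

def s3_uri(bucket, key_prefix):
    """Build an S3 URI from a bucket name and a key prefix."""
    return "s3://{}/{}".format(bucket, key_prefix.strip("/"))

def split_s3_uri(uri):
    """Split an S3 URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError("Not an S3 URI: {}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")

example_bucket = "example-bucket"  # hypothetical bucket name
uri = s3_uri(example_bucket, "DEMO-ObjectDetection/training-data")
bucket_name, prefix = split_s3_uri(uri)
print(uri)
print(bucket_name, prefix)
```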

Train the model

Now that all the required setup is done, we are ready to train our object detector. To begin, we create a SageMaker MXNet estimator object, which allows us to run single-machine or distributed training in SageMaker on CPU or GPU-based instances. After creating the estimator, training is started by calling fit() on it. When we create the estimator, we pass:

- entry_point: filename of the Python script that defines the training and hosting methods. Here we use ssd_entry_point.py.
- role: name of our IAM execution role.
- output_path: S3 path where the training artifacts will be stored. We defined this in the previous step.
- code_location: S3 path where the custom code, including the entry_point script, will be stored. We defined this in the previous step.
- instance_count & instance_type: the number and type of SageMaker instances used for the training job. For this example, we choose one ml.p3.2xlarge instance.
- framework_version & py_version: the MXNet version and Python version used for the training job.
- distribution: dict with information on how to run distributed training. Here we use distributed training with parameter_server.
- hyperparameters: dict of values that will be passed to the entry_point script.

[ ]:
from sagemaker.mxnet import MXNet

ssd_estimator = MXNet(entry_point='ssd_entry_point.py',
                      role=role,
                      output_path=s3_training_output_location,
                      code_location=s3_custom_code_upload_location,
                      instance_count=1,
                      instance_type='ml.p3.2xlarge',
                      framework_version='1.7.0',
                      py_version='py3',
                      distribution={'parameter_server': {'enabled': True}},
                      hyperparameters={'epochs': 1,
                                       'data-shape': 512,
                                      }
                     )
[ ]:
ssd_estimator.fit({'train': s3_training_data_location})
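
The contents of ssd_entry_point.py are not shown in this notebook. As a rough illustration of how a script-mode entry point typically receives its inputs, the sketch below parses the two hyperparameters as command-line arguments and reads the location of the train channel from the SM_CHANNEL_TRAIN environment variable. The function parse_args and the default values are assumptions made for this example, not the actual script.

```python
import argparse
import os

def parse_args(argv=None):
    """Parse hyperparameters and the training data location (illustrative only)."""
    parser = argparse.ArgumentParser()
    # Hyperparameters from the estimator arrive as command-line arguments.
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--data-shape", type=int, default=512)
    # The 'train' channel passed to fit() is exposed through SM_CHANNEL_TRAIN.
    parser.add_argument(
        "--train",
        type=str,
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    # Ignore any additional arguments this sketch does not know about.
    args, _unknown = parser.parse_known_args(argv)
    return args

args = parse_args(["--epochs", "1", "--data-shape", "512"])
print(args.epochs, args.data_shape, args.train)
```

Note that argparse exposes the data-shape hyperparameter as args.data_shape, with the hyphen turned into an underscore.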

Compile the trained model using SageMaker Neo

After training the model, we can use SageMaker Neo’s compile_model() API to compile it. When calling compile_model(), the user is expected to provide the correct input shapes required by the model so that compilation succeeds. We also specify the target instance family, the name of our IAM execution role, and the S3 location where the compiled model will be stored, and we set the MMS_DEFAULT_RESPONSE_TIMEOUT environment variable to 500.

For this example, we will choose ml_p3 as the target instance family while compiling the trained model.

[ ]:
compiled_model = ssd_estimator.compile_model(target_instance_family='ml_p3',
                                             input_shape={'data':[1, 3, 512, 512]},
                                             role=role,
                                             output_path=s3_compilation_output_location,
                                             framework='mxnet',
                                             framework_version='1.7',
                                             env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'})
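
The target family string and the instance type used at deployment have to agree. The sketch below encodes the naming pattern visible in this notebook, where ml.p3.2xlarge belongs to ml_p3. The helpers target_family and is_compatible are hypothetical and are not part of the SageMaker SDK; treat the pattern as an assumption rather than a documented rule.

```python
def target_family(instance_type):
    """Derive a family string such as 'ml_p3' from 'ml.p3.2xlarge'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError("Unexpected instance type: {}".format(instance_type))
    return "{}_{}".format(parts[0], parts[1])

def is_compatible(target_instance_family, instance_type):
    """Check whether an instance type belongs to the compiled target family."""
    return target_family(instance_type) == target_instance_family

print(target_family("ml.p3.2xlarge"))
print(is_compatible("ml_p3", "ml.p3.8xlarge"))
print(is_compatible("ml_p3", "ml.c5.xlarge"))
```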

Deploy the compiled model and request Inferences

We have to deploy the compiled model on an instance from the family for which it was compiled. Since we compiled for ml_p3, we can deploy to any ml.p3 instance type. For this example, we choose ml.p3.2xlarge.

[ ]:
neo_object_detector = compiled_model.deploy(initial_instance_count = 1, instance_type = 'ml.p3.2xlarge')
[ ]:
%%time
response = neo_object_detector.predict(test_image)
[ ]:
# Visualize the detections.
visualize_detection(test_file, response, object_categories, threshold)
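
The visualization function maps each bounding box from the 512x512 model input back to the original image size before drawing it. The sketch below isolates that rescaling step. The helper scale_box and the example numbers are hypothetical, while the arithmetic mirrors the one used in visualize_detection.

```python
def scale_box(box, width, height, model_size=512):
    """Map [x0, y0, x1, y1] from model input coordinates to image pixels."""
    x0, y0, x1, y1 = box
    return (
        int(x0 * width / model_size),
        int(y0 * height / model_size),
        int(x1 * width / model_size),
        int(y1 * height / model_size),
    )

# A box in 512x512 coordinates, mapped onto a 1024x768 image.
scaled = scale_box([128, 256, 384, 512], width=1024, height=768)
print(scaled)
```

Horizontal coordinates scale by width / 512 and vertical ones by height / 512, so boxes stay aligned even when the original image is not square.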

Delete the Endpoint

A running endpoint incurs costs. Therefore, as an optional clean-up step, you can delete it.

[ ]:
print("Endpoint name: " + neo_object_detector.endpoint_name)
neo_object_detector.delete_endpoint()