Fleet Predictive Maintenance: Part 1. Introduction
Using SageMaker Studio to Predict Fault Classification
## Background
The purpose of this notebook is to demonstrate a Predictive Maintenance (PrM) solution for automobile fleet maintenance using Amazon SageMaker Studio, so that business users have a quick path towards a PrM proof of concept (POC). In this notebook, we focus on preprocessing engine sensor data before feature engineering and building an initial model with SageMaker’s built-in algorithms. This notebook will cover the following:
Setup for using SageMaker
Basic data cleaning, analysis and preprocessing
Converting datasets to format used by the Amazon SageMaker algorithms and uploading to S3
Training SageMaker’s linear learner on the dataset
Hyperparameter tuning using SageMaker Automatic Model Tuning
Deploying and getting predictions using Batch Transform
Important Notes:
Due to cost considerations, the goal of this example is to show you how to use some of SageMaker Studio’s features, not necessarily to achieve the best result.
We use the built-in classification algorithm in this example, and a Python 3 (Data Science) Kernel is required.
Predictive maintenance solutions, by their nature, require a domain expert for the system or machinery involved. With this in mind, we make assumptions here for certain elements of this solution, with the acknowledgment that these assumptions should be informed by a domain expert and a main business stakeholder.
Please see the README.md for more information about this use case.
## Set up
Let’s start by:
Setting up or refreshing storemagic variables
Installing and importing any dependencies
Instantiating a SageMaker session
Specifying the S3 bucket and prefix that you want to use for your training and model data. These should be in the same region as SageMaker training.
Defining the IAM role used to give training access to your data
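The session, bucket, prefix, and role steps above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual setup cell: the helper names `s3_uri` and `init_sagemaker` and the prefix value `fleet-predmaint` are hypothetical, chosen only for illustration. The `sagemaker` import is deferred into the function because it needs the SageMaker Python SDK and valid AWS credentials.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key parts into an s3:// URI."""
    cleaned = [p.strip("/") for p in parts]
    key = "/".join(c for c in cleaned if c)
    return f"s3://{bucket}/{key}" if key else f"s3://{bucket}"


def init_sagemaker(prefix="fleet-predmaint"):  # hypothetical prefix
    """Create a session and look up the default bucket, region, and IAM role."""
    # Imported lazily so that s3_uri() can be used without the SDK installed.
    import sagemaker

    session = sagemaker.Session()
    bucket = session.default_bucket()       # default bucket in the session's region
    role = sagemaker.get_execution_role()   # resolves inside Studio or a notebook instance
    return {
        "session": session,
        "region": session.boto_region_name,
        "bucket": bucket,
        "prefix": prefix,
        "role": role,
        "train_uri": s3_uri(bucket, prefix, "train"),
    }
```

Inside SageMaker Studio, calling `init_sagemaker()` would return the objects the later notebooks need. Using the session's default bucket is one simple way to keep the data in the same region as training; a bucket you name yourself works too, provided it is in that region.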
View stored variables from previous session
If you ran this notebook before, you may want to reuse the resources you already created with AWS. Run the cell below to load any previously created variables. You should see a printout of the existing variables. If you don’t see anything, you may need to create them again, or this may be your first time running this notebook.
As you run the notebooks in succession, each one adds to the set of stored variables. Once all of them have run, the output looks like this:
Stored variables and their in-db values:
create_date -> ‘2021-03-16-06-42-12’
dw_output_path_prm -> ‘s3://sagemaker-us-east-2-1234567890/export-flow
exp_prefix -> ‘sagemaker-experiments/linear-learner-2021-03-16-0
experiment_name -> ‘ll-failure-classification-2021-03-16-06-42-12’
features_created_prm -> True
path_to_test_data_prm -> ‘s3://sagemaker-us-east-2-1234567890/test/test.c
path_to_test_x_data_prm -> ‘s3://sagemaker-us-east-2-1234567890/test/test_x
path_to_train_data_prm -> ‘s3://sagemaker-us-east-2-1234567890/train/train
path_to_valid_data_prm -> ‘s3://sagemaker-us-east-2-1234567890/validation/
trial_name_1 -> ‘linear-learner-lr-training-job-2021-03-16-06-42-1
trial_name_2 -> ‘linear-learner-svm-2021-03-16-06-00-37’
trial_name_3 -> ‘linear-learner-svm-thresh-2021-03-16-06-00-37’
trial_name_4 -> ‘linear-learner-svm-balanced-2021-03-16-06-00-37’
tune_trial_name -> ‘ll-svm-tuning-job-trial’
tuning_job_name -> ‘ll-svm-tuning-job’
%store -r
%store
Note: The output above will be empty on the first run. On subsequent runs, you will see the stored variables.
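Because `%store -r` restores variables into the notebook's namespace, later cells can check whether a value from a previous run is available before relying on it. The following is a minimal sketch of that pattern; the helper name `restored` is hypothetical, and a plain dictionary stands in for the notebook namespace so the example runs outside IPython.

```python
def restored(name, namespace, default=None):
    """Return a variable restored by `%store -r`, or `default` if it is absent."""
    return namespace.get(name, default)


# Stand-in for the notebook namespace after `%store -r` (illustration only).
fake_namespace = {"features_created_prm": True}

print(restored("features_created_prm", fake_namespace, False))  # True
print(restored("tuning_job_name", fake_namespace))              # None

# In a real notebook cell you would pass globals() instead:
#   features_created = restored("features_created_prm", globals(), False)
# and persist a value for later notebooks with the IPython magic:
#   %store features_created_prm
```

Falling back to a default on the first run avoids a `NameError` when nothing has been stored yet.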
## Architecture
## Next Notebook: Data Prep with Data Wrangler