Get started with RL¶
- Mountain Car with Amazon SageMaker RL
- PatientMountainCar
- PatientContinuousMountainCar
- Imports
- Setup S3 bucket
- Define Variables
- Configure settings
- Create an IAM role
- Install docker for local mode
- Plot metrics for training job
- Visualize the rendered gifs
- Load checkpointed model
- Run the evaluation step
- Visualize the output
- Clean up endpoint
- Cart-pole Balancing Model with Amazon SageMaker and Ray
- Model deployment
- Roboschool simulations of physical robotics with Amazon SageMaker
- Model deployment
AWS DeepRacer¶
Use RL to train an autonomous AWS DeepRacer. This notebook is a jailbreaker for the AWS DeepRacer: it gives a glimpse of the architecture used to get the DeepRacer working.
- Distributed DeepRacer RL training with SageMaker and RoboMaker
- How does it work?
- Prerequisites
- Run these commands if you wish to modify the SageMaker and RoboMaker code
- Imports
- Initializing basic parameters
- Setup S3 bucket
- Create an IAM role
- Permission setup for invoking AWS RoboMaker from this notebook
- Permission setup for SageMaker to S3 bucket
- Permission setup for SageMaker to create KinesisVideoStreams
- Build and push docker image
- Clean the docker images
- Configure VPC
- Create Route Table
- Setup the environment
- Configure the preset for RL algorithm
- Copy custom files to S3 bucket so that SageMaker and RoboMaker can pick them up
- Train the RL model using the Python SDK Script mode
- Create the Kinesis video stream
- Start the RoboMaker job
- Create Simulation Application
- Launch the Simulation job on RoboMaker
- Visualizing the simulations in RoboMaker
- Creating temporary folder to plot metrics
- Plot metrics for training job
- Clean up RoboMaker and SageMaker training job
- Evaluation (Time trial, Object avoidance, Head-to-bot)
- Head-to-head Evaluation
Cart pole¶
A cart pole simulation models the task of balancing a broom upright on your hand. The broom is the “pole” and your hand is replaced with a “cart” moving back and forth on a linear track. This simplified example works in two dimensions, so the cart can only move back and forth along a line, and the pole can only fall forwards or backwards, not to the sides. These examples use PyTorch or TensorFlow with SageMaker RL to solve a cart pole problem.
- Cart-pole Balancing Model with Amazon SageMaker and Coach library
- Model deployment
- Training Batch Reinforcement Learning Policies with Amazon SageMaker RL and Coach library
- Cart-pole Balancing Model with Amazon SageMaker on SageMaker Managed Spot Training
- Model deployment
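To make the cart pole description above concrete, here is a minimal sketch of the two-dimensional dynamics in plain Python. It is not taken from the notebooks listed here: the physical constants, the failure thresholds, and the function names (`step`, `failed`, `run_episode`) are assumptions that follow the commonly used textbook formulation of the problem. The two policies are hand-written placeholders for the behaviour an RL agent would learn.

```python
import math

# Assumed constants (textbook cart pole values, not from the notebooks).
GRAVITY = 9.8            # m/s^2
CART_MASS = 1.0          # kg
POLE_MASS = 0.1          # kg
POLE_HALF_LENGTH = 0.5   # m
FORCE = 10.0             # N applied to the cart on each step
DT = 0.02                # seconds per simulation step
X_LIMIT = 2.4            # cart leaves the track beyond this position
THETA_LIMIT = math.radians(12)  # pole counts as fallen beyond this angle


def step(state, push_right):
    """Advance (x, x_dot, theta, theta_dot) by one Euler step."""
    x, x_dot, theta, theta_dot = state
    force = FORCE if push_right else -FORCE
    total_mass = CART_MASS + POLE_MASS
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + POLE_MASS * POLE_HALF_LENGTH * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass)
    )
    x_acc = temp - POLE_MASS * POLE_HALF_LENGTH * theta_acc * cos_t / total_mass
    return (
        x + DT * x_dot,
        x_dot + DT * x_acc,
        theta + DT * theta_dot,
        theta_dot + DT * theta_acc,
    )


def failed(state):
    """True once the cart runs off the track or the pole tips too far."""
    x, _, theta, _ = state
    return abs(x) > X_LIMIT or abs(theta) > THETA_LIMIT


def run_episode(policy, max_steps=200):
    """Return how many steps the policy survives, capped at max_steps."""
    state = (0.0, 0.0, 0.0, 0.0)
    for t in range(max_steps):
        state = step(state, policy(state))
        if failed(state):
            return t + 1
    return max_steps


def always_right(state):
    """Naive policy: push the cart to the right on every step."""
    return True


def lean_following(state):
    """Heuristic policy: push toward the side the pole is leaning or moving."""
    _, _, theta, theta_dot = state
    return theta + theta_dot > 0


if __name__ == "__main__":
    print("always_right survived", run_episode(always_right), "steps")
    print("lean_following survived", run_episode(lean_following), "steps")
```

An RL agent replaces the hand-written policy with one learned from the reward it collects for each step the pole stays up.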
Contextual bandits¶
Explore a number of actions with contextual bandits algorithms in SageMaker.
- Contextual Bandits with Parametric Actions – Experimentation Mode
- What is Experimentation Mode?
- Imports
- Setup S3 bucket
- Configure where training happens
- Create an IAM role
- Simulation environment (from MovieLens data)
- Create a SageMaker model for inference
- 1. Batch Transform
- Generating test dataset for inference
- Download batch transform results
- 2. Real-time inference
- Clean Up endpoint
- Contextual Bandits with Amazon SageMaker RL
Roboschool¶
Roboschool is a physics simulator that is commonly used to train RL policies for robotic systems.
- Tune hyperparameters for your RL training job
- Pick which Roboschool problem to solve
- Pre-requisites
- Build docker container
- Configure Tuning
- Prepare to launch the tuning job.
- Monitor progress
- Training Roboschool agents using distributed RL training across multiple nodes with Amazon SageMaker
- Roboschool simulations training with stable baselines on Amazon SageMaker RL
Use cases¶
Autoscaling¶
This example demonstrates how to use RL to address scaling a production service by adding and removing resources (e.g. servers or EC2 instances) in reaction to a dynamic load.
- Autoscaling a service with Amazon SageMaker
- Problem Statement
- Using Amazon SageMaker for RL
- Pre-requisites
- Set up the environment
- Configure the presets for RL algorithm
- Write the Training Code
- Train the RL model using the Python SDK Script mode
- Store intermediate training output and model checkpoints
- Visualization
- Evaluation of RL models
- Hosting
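As a rough illustration of the autoscaling setup described above, the sketch below models the problem as a tiny environment: the agent sees the current load and server count, chooses to remove, keep, or add a server, and is penalised both for running servers and for load it cannot serve. The class `AutoscaleEnv`, its parameters, and the reward shape are hypothetical and chosen for illustration; the notebook's actual simulator may differ.

```python
from dataclasses import dataclass


@dataclass
class AutoscaleEnv:
    """Toy autoscaling environment (hypothetical, for illustration only)."""

    capacity_per_server: float = 100.0  # requests one server can handle
    server_cost: float = 1.0            # cost per running server per step
    drop_penalty: float = 0.05          # cost per unserved request
    min_servers: int = 1
    max_servers: int = 20
    servers: int = 1

    def step(self, action, load):
        """Apply action (-1, 0 or +1 servers) under the given load.

        Returns the new state (load, servers) and the reward.
        """
        if action not in (-1, 0, 1):
            raise ValueError("action must be -1, 0 or +1")
        self.servers = min(self.max_servers, max(self.min_servers, self.servers + action))
        capacity = self.servers * self.capacity_per_server
        dropped = max(0.0, load - capacity)
        reward = -self.server_cost * self.servers - self.drop_penalty * dropped
        return (load, self.servers), reward


def threshold_policy(load, servers, capacity_per_server=100.0):
    """Simple baseline: scale up when overloaded, down when a server is spare."""
    if load > servers * capacity_per_server:
        return 1
    if servers > 1 and load <= (servers - 1) * capacity_per_server:
        return -1
    return 0


if __name__ == "__main__":
    env = AutoscaleEnv()
    total = 0.0
    for load in [80.0, 150.0, 320.0, 310.0, 120.0, 60.0]:
        action = threshold_policy(load, env.servers)
        _, reward = env.step(action, load)
        total += reward
    print("total reward of threshold baseline:", total)
```

A learned policy can be compared against the threshold baseline by running both over the same load trace and comparing total reward.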
Energy¶
Training an RL algorithm in a real HVAC system can take a long time to converge and can lead to hazardous settings as the agent explores its state space. This example uses the EnergyPlus simulator to showcase how you can train an HVAC optimization RL model with Amazon SageMaker.
- HVAC with Amazon SageMaker RL
- Model deployment
Game play¶
Use RL to train an agent to play in a Unity3D environment.
Game server¶
A reinforcement learning-based system using SageMaker Autopilot and SageMaker RL that learns to allocate resources in response to player usage patterns.
Knapsack problem¶
Use SageMaker RL to address the knapsack problem, a canonical operations research problem.
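For orientation, the classic 0/1 variant of the knapsack problem asks which items to pack, each with a value and a weight, so that total value is maximised without exceeding a weight capacity. The sketch below solves that variant exactly with dynamic programming. It is not the RL approach used in the example, and the function name `knapsack_max_value` is an assumption; an exact solver like this is one possible baseline for checking an RL policy on small instances.

```python
def knapsack_max_value(values, weights, capacity):
    """Best total value for the 0/1 knapsack with integer weights."""
    # best[c] holds the best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


if __name__ == "__main__":
    # Picking the items worth 100 and 120 (weights 20 + 30) is optimal here.
    print(knapsack_max_value([60, 100, 120], [10, 20, 30], 50))  # prints 220
```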
Object tracker¶
Use RL to train a TurtleBot object tracker using Amazon SageMaker Reinforcement Learning and AWS RoboMaker.
Network compression¶
Network-to-network compression via policy gradient reinforcement learning.
Portfolio management¶
Use SageMaker RL to manage a stock portfolio by continuously reallocating several stocks.
- Portfolio Management with Amazon SageMaker RL
- Problem Statement
- Dataset
- Using reinforcement learning on Amazon SageMaker RL
- Pre-requisites
- Set up the environment
- Configure the presets for RL algorithm
- Write the Training Code
- Train the RL model using the Python SDK Script mode
- Store intermediate training output and model checkpoints
- Visualization
- Load the checkpointed models for evaluation
- Risk Disclaimer (for live-trading)
Resource allocation¶
Solve resource allocation problems with SageMaker RL.
- Solving Bin Packing Problem with Amazon SageMaker RL
- Solving Multi-Period Newsvendor Problem with Amazon SageMaker RL
- Solving Vehicle Routing Problem with Amazon SageMaker RL
Tic-tac-toe¶
Play global thermonuclear war with a computer.
Traveling salesman problem¶
Use SageMaker RL to solve this classic problem with a twist: a restaurant delivery service on a 2D gridworld.
- Traveling Salesman Problem with Reinforcement Learning
- Description of Problem
- Why Reinforcement Learning?
- Easy Version of TSP
- Using AWS SageMaker for RL
- Medium version of TSP
- Using AWS SageMaker for RL
- Visualize, Compare with Baseline and Evaluate
- Vehicle Routing Problem with Reinforcement Learning