Examples of how to use distributed training on SageMaker.
Apache MXNet
- Training and hosting SageMaker Models using the Apache MXNet Module API
  - (Optional) Delete the Endpoint
- Reduce Training Time with Apache MXNet and Horovod on Amazon SageMaker
  - Distributed Training
  - Horovod Overview
  - Test Problem and Dataset
  - Training Script with Horovod Support
  - Results
- MNIST Training using MXNet
In addition to the notebook, this topic is covered in the workshop topic "Parallelized data distribution (sharding)".