Amazon SageMaker AI endpoints - AWS Prescriptive Guidance


Amazon SageMaker AI is a managed ML service that helps you build and train models and then deploy them into a production-ready hosted environment. Unlike Amazon SageMaker AI Canvas, SageMaker AI does not offer ready-to-use models; you are responsible for providing the sample data and training the model. This gives you more control, but also more operational overhead and responsibility.

You can deploy a custom model in SageMaker AI as either a real-time or serverless endpoint. Alternatively, you can use batch transform, depending on your application's demands. Even if a model will not be deployed as a SageMaker AI endpoint, the model artifact that SageMaker AI produces can be used for a customized deployment. For examples of SageMaker AI image classification models, see the following resources on GitHub:
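As a rough illustration of the two hosting options above, the following sketch builds the request payloads that the boto3 `create_endpoint_config` API expects for a real-time endpoint (a fixed instance) and a serverless endpoint (scales with demand). The model names, instance type, and capacity values are placeholder assumptions, and the actual API calls are shown only in comments because they require AWS credentials and an existing model.

```python
def real_time_endpoint_config(model_name: str, instance_type: str = "ml.m5.large") -> dict:
    """Payload for a real-time endpoint backed by a fixed instance type.

    You choose the instance type and count; the endpoint stays provisioned.
    """
    return {
        "EndpointConfigName": f"{model_name}-realtime-config",  # placeholder name
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    }


def serverless_endpoint_config(model_name: str, memory_mb: int = 2048) -> dict:
    """Payload for a serverless endpoint.

    No instance type is specified; capacity is expressed as memory size
    and maximum concurrency instead.
    """
    return {
        "EndpointConfigName": f"{model_name}-serverless-config",  # placeholder name
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "ServerlessConfig": {
                "MemorySizeInMB": memory_mb,   # assumed capacity values
                "MaxConcurrency": 5,
            },
        }],
    }


# With credentials and a registered model in place, the payload would be
# passed to boto3 like this:
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_endpoint_config(**real_time_endpoint_config("my-model"))
# sm.create_endpoint("EndpointName"="my-endpoint",
#                    "EndpointConfigName"="my-model-realtime-config")
```

Batch transform, by contrast, is submitted as a one-off `create_transform_job` rather than a long-lived endpoint, which suits offline scoring of large datasets.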

After a model is trained, you can use SageMaker AI Neo to compile the model and make it more computationally efficient. Neo automatically optimizes Gluon, Keras, MXNet, PyTorch, TensorFlow, TensorFlow-Lite, and ONNX models for inference on Android, Linux, and Windows machines. For more information, see Optimize model performance using Neo.
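To make the Neo step concrete, the sketch below assembles the payload for the boto3 `create_compilation_job` API. The job name, S3 locations, role ARN, input shape, and target device are placeholder assumptions for an image classification model; the call itself is commented out because it requires AWS credentials and a trained model artifact in S3.

```python
def neo_compilation_job(model_s3_uri: str, output_s3_uri: str, role_arn: str,
                        framework: str = "PYTORCH") -> dict:
    """Payload describing a SageMaker Neo compilation job.

    Neo reads the trained model artifact from S3, optimizes it for the
    target device, and writes the compiled model back to S3.
    """
    return {
        "CompilationJobName": "image-classifier-neo",  # placeholder name
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # Assumed input shape for a 224x224 RGB image classifier.
            "DataInputConfig": '{"input0": [1, 3, 224, 224]}',
            "Framework": framework,
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": "ml_c5",  # assumed target; could be an edge device
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }


# With credentials in place:
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_compilation_job(**neo_compilation_job(
#     "s3://my-bucket/model.tar.gz",      # placeholder artifact location
#     "s3://my-bucket/compiled/",
#     "arn:aws:iam::111122223333:role/SageMakerRole"))
```

The compiled artifact that the job writes to `S3OutputLocation` can then be deployed to an endpoint or copied to the target device.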

The following are the advantages of SageMaker AI:

  • Full control of model architecture, objective, and training procedure

  • Ability to select the instance type for your endpoint deployments

  • Ability to compile a model with SageMaker AI Neo for efficient deployment

The following are the disadvantages of SageMaker AI:

  • Manual setup requires more labor than automated approaches

For more information about SageMaker AI, see the following: