Deploy models from JumpStart using Amazon SageMaker Studio
The following steps walk you through deploying models from JumpStart using Amazon SageMaker Studio.
Prerequisites
Verify that you've set up inference capabilities on your Amazon SageMaker HyperPod clusters. For more information, see Setting up your HyperPod clusters for model deployment.
Create a HyperPod deployment
- In Amazon SageMaker Studio, open the JumpStart landing page from the left navigation pane.
- Under All public models, choose the model you want to deploy. Note: If you select a gated model, you must accept the End User License Agreement (EULA).
- Choose SageMaker HyperPod.
- Under Deployment settings, JumpStart recommends an instance for the deployment. You can modify these settings if necessary.
  - If you modify Instance type, ensure that it’s compatible with the chosen HyperPod cluster. If there aren’t any compatible instances, select a different HyperPod cluster or contact your admin to add compatible instances to the cluster.
  - To prioritize the model deployment, install the task governance add-on, create compute allocations, and set up task rankings for the cluster policy. After you do this, an option appears for selecting a priority for the model deployment. The priority can be used to preempt other deployments and tasks on the cluster.
  - Enter the namespace that your admin has given you access to. You might have to contact your admin directly to get the exact namespace. After you provide a valid namespace, the Deploy button is enabled.
- Choose Deploy and wait for the Endpoint to be created.
- After the Endpoint has been created, select Test inference.
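Test inference exercises the endpoint from the Studio UI. A programmatic equivalent might look like the following minimal sketch. It assumes the deployment is reachable as a SageMaker endpoint through the boto3 `sagemaker-runtime` client's `invoke_endpoint` call. The helper names and the `inputs`/`parameters` payload shape are hypothetical; the request format depends on the model you deployed.

```python
import json


def build_payload(prompt: str, max_new_tokens: int = 64) -> bytes:
    """Serialize a request body.

    The {"inputs": ..., "parameters": ...} shape is an assumption; check the
    model's documentation for the format it actually accepts.
    """
    body = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return json.dumps(body).encode("utf-8")


def invoke(endpoint_name: str, prompt: str, client=None) -> dict:
    """Send one request to the endpoint and return the decoded JSON response."""
    if client is None:
        # Imported lazily so build_payload can be used without AWS access.
        import boto3

        client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())
```

To try it, call `invoke` with the endpoint name shown in Studio and a prompt of your choice. The optional `client` argument lets you pass a preconfigured client, for example one created for a specific Region.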
Edit a HyperPod deployment
- In Amazon SageMaker Studio, select Compute and then HyperPod clusters from the left navigation pane.
- Under Deployments, choose the HyperPod cluster deployment you want to modify.
- From the vertical ellipsis icon (⋮), choose Edit.
- Under Deployment settings, you can enable or disable Auto-scaling and change the number of Max replicas.
- Select Save.
- The Status changes to Updating. When it changes back to In service, your changes are complete and a confirmation message appears.
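The status transition in the last step (Updating, then back to In service) can also be waited on from a script. The sketch below is generic polling logic, not a SageMaker API. `get_status` is a hypothetical callable that you would supply to return the deployment's current status string, and the status names are taken from the Studio labels above.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "In service",
    in_progress: frozenset = frozenset({"Updating"}),
    poll_seconds: float = 15.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns `target`.

    Raises RuntimeError on a status that is neither the target nor an
    expected in-progress value, and TimeoutError if the target is not
    reached within timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status == target:
            return status
        if status not in in_progress:
            raise RuntimeError(f"Unexpected deployment status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)
```

The `sleep` parameter is injectable so that the loop can be exercised in tests without real delays.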
Delete a HyperPod deployment
- In Amazon SageMaker Studio, select Compute and then HyperPod clusters from the left navigation pane.
- Under Deployments, choose the HyperPod cluster deployment you want to delete.
- From the vertical ellipsis icon (⋮), choose Delete.
- In the Delete HyperPod deployment window, select the checkbox.
- Choose Delete.
- The Status changes to Deleting. After the HyperPod deployment has been deleted, a confirmation message appears.