Use a deployment for on-demand inference
After
you deploy your custom model, you use the deployment's Amazon Resource Name (ARN) as the modelId
parameter when you submit prompts and generate responses with model inference.
For information about making inference requests, see the following topics: