Copy Amazon ECR container images across AWS accounts and AWS Regions - AWS Prescriptive Guidance

Copy Amazon ECR container images across AWS accounts and AWS Regions

Faisal Shahdad, Amazon Web Services

Summary

This pattern shows you how to use a serverless approach to replicate tagged images from existing Amazon Elastic Container Registry (Amazon ECR) repositories to other AWS accounts and AWS Regions. The solution uses AWS Step Functions to manage the replication workflow and AWS Lambda functions to copy large container images.

Amazon ECR provides native cross-Region and cross-account replication features. However, these features replicate only images that are pushed after replication is turned on. There is no built-in mechanism to replicate images that already exist in a repository to other Regions and accounts.

This pattern helps artificial intelligence (AI) teams distribute containerized machine learning (ML) models, frameworks (for example, PyTorch, TensorFlow, and Hugging Face), and dependencies to other accounts and Regions. This can help you overcome service limits and optimize GPU compute resources. You can also selectively replicate Amazon ECR repositories from specific source accounts and Regions. For more information, see Cross-Region replication in Amazon ECR has landed.

Prerequisites and limitations

Prerequisites

  • Two or more active AWS accounts (at minimum, one source account and one destination account)

  • Appropriate AWS Identity and Access Management (IAM) permissions in all accounts

  • Docker for building the Lambda container image

  • AWS Command Line Interface (AWS CLI) configured for all accounts

Limitations

  • Untagged image exclusion – The solution copies only container images that have explicit tags. It skips untagged images, which are identified only by their SHA256 digests.

  • Lambda execution timeout constraints – AWS Lambda has a maximum execution timeout of 15 minutes, which might be insufficient for copying very large container images or repositories.

  • Manual container image management – Any change to the crane-app.py Python code requires you to rebuild and redeploy the Lambda container image manually.

  • Limited parallel processing capacity – The MaxConcurrency state setting limits how many repositories you can copy at the same time. However, you can modify this setting in the source account’s AWS CloudFormation template. Note that higher concurrency values can cause you to exceed service rate limits and account-level Lambda execution quotas.
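
Because untagged images are skipped, it can help to check how many a repository holds before you run the copy. The following is a minimal sketch and is not part of the pattern's code. The function name count_untagged_images is hypothetical, and the sketch assumes the source-account CLI profile and the SOURCE_REGION variable that are set up in the Epics section.

```shell
# Hypothetical helper (not part of the sample-ecr-copy repository).
# Counts images in a repository that have a digest but no tag.
# Assumes the source-account CLI profile and SOURCE_REGION are configured.
count_untagged_images() {
  aws ecr list-images \
    --repository-name "$1" \
    --filter tagStatus=UNTAGGED \
    --profile source-account \
    --region "$SOURCE_REGION" \
    --query 'imageIds[].imageDigest' \
    --output text |
    tr '\t' '\n' |
    awk 'NF { n++ } END { print n + 0 }'
}
```

For example, `count_untagged_images app-frontend` prints the number of images that the copy would skip in a repository named app-frontend (a placeholder name).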

Architecture

Target stack

The pattern has four main components:

  • Source account infrastructure – CloudFormation template that creates the orchestration components

  • Destination account infrastructure – CloudFormation template that creates cross-account access roles

  • Lambda function – Python-based function that uses Crane for efficient image copying

  • Container image – Docker container that packages the Lambda function with required tools
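
The copy step can be pictured as a single registry-to-registry transfer. The following is a minimal sketch that assumes Crane here is the crane CLI from the go-containerregistry project and that authentication to both registries is already configured. The function name copy_image and its arguments are hypothetical and are not taken from crane-app.py.

```shell
# Hypothetical sketch; crane-app.py in the repository may work differently.
# Assumes the crane CLI (go-containerregistry) is installed and already
# authenticated against both the source and destination Amazon ECR registries.
# Arguments: source account, source Region, destination account,
#            destination Region, image reference (repository:tag).
copy_image() {
  src_account="$1" src_region="$2" dst_account="$3" dst_region="$4" image="$5"
  crane copy \
    "${src_account}.dkr.ecr.${src_region}.amazonaws.com/${image}" \
    "${dst_account}.dkr.ecr.${dst_region}.amazonaws.com/${image}"
}
```

A call might look like `copy_image 111111111111 us-east-1 222222222222 us-east-2 app-frontend:1.0`, where the account IDs and the image reference are placeholders.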

Target architecture

Step Functions workflow

The Step Functions state machine orchestrates the following, as shown in the following diagram:

  • PopulateRepositoryList – Scans Amazon ECR repositories and populates Amazon DynamoDB

  • GetRepositoryList – Retrieves the unique repository list from DynamoDB

  • DeduplicateRepositories – Ensures that no repository is processed more than once

  • CopyRepositories – Handles parallel copying of repositories

  • NotifySuccess/NotifyFailure – Sends Amazon Simple Notification Service (Amazon SNS) notifications based on the execution outcome
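
The deduplication idea can be illustrated outside the state machine. The following sketch is an illustration only and is not the pattern's implementation; dedupe_repos is a hypothetical name. It removes repeated names from a comma-separated list while keeping the order in which names first appear.

```shell
# Illustration only: the pattern performs deduplication inside the state machine.
# Removes duplicate and empty entries from a comma-separated repository list,
# preserving the order in which names first appear.
dedupe_repos() {
  printf '%s\n' "$1" |
    tr ',' '\n' |
    awk 'NF && !seen[$0]++' |
    paste -sd, -
}
```

A helper like this could also be used to clean up a repository list before you pass it to the source stack.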

Tools

Amazon tools

  • Amazon CloudWatch helps you monitor the metrics of your AWS resources and the applications you run on AWS in real time.

  • Amazon DynamoDB is a fully managed NoSQL database service that provides fast, predictable, and scalable performance.

  • Amazon Simple Notification Service (Amazon SNS) helps you coordinate and manage the exchange of messages between publishers and clients, including web servers and email addresses.

  • AWS Identity and Access Management (IAM) helps you securely manage access to your AWS resources by controlling who is authenticated and authorized to use them.

  • AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use.

  • AWS Step Functions is a serverless orchestration service that helps you combine Lambda functions and other AWS services to build business-critical applications.

Other tools

  • Crane is a command line tool for interacting with remote container images and registries. In this pattern, the Lambda function uses it to copy images between registries.

  • Docker is a set of platform as a service (PaaS) products that use virtualization at the operating system level to deliver software in containers.

Code repository

  • The code for this pattern is available in the GitHub sample-ecr-copy repository. You can use the CloudFormation template from the repository to create the underlying resources.

Best practices

Follow the principle of least privilege and grant the minimum permissions required to perform a task. For more information, see Grant least privilege and Security best practices in the IAM documentation.

Epics

Task | Description | Skills required

Configure AWS CLI profiles.

  1. Configure the source account profile:

    aws configure --profile source-account
    # Enter: Access Key ID, Secret Access Key, Default region, Output format (json)
  2. Configure the destination account profile:

    aws configure --profile destination-account
    # Enter: Access Key ID, Secret Access Key, Default region, Output format (json)
  3. Verify the configurations:

    aws sts get-caller-identity --profile source-account
    aws sts get-caller-identity --profile destination-account
Skills required: DevOps engineer, Data engineer, ML engineer

Gather required information.

  1. Get the source account ID:

    export SOURCE_ACCOUNT_ID=$(aws sts get-caller-identity --profile source-account --query Account --output text)
    echo "Source Account ID: $SOURCE_ACCOUNT_ID"
  2. Get the destination account ID:

    export DEST_ACCOUNT_ID=$(aws sts get-caller-identity --profile destination-account --query Account --output text)
    echo "Destination Account ID: $DEST_ACCOUNT_ID"
  3. Set the AWS Regions. Modify these values for your Regions:

    export SOURCE_REGION="us-east-1"
    export DEST_REGION="us-east-2"
  4. List the existing Amazon ECR repositories in the source account:

    aws ecr describe-repositories \
      --profile source-account \
      --region $SOURCE_REGION \
      --query 'repositories[].repositoryName' \
      --output table
Skills required: DevOps engineer, Data engineer, ML engineer

Clone the repository.

Clone the pattern’s repository to your local workstation:

git clone https://github.com/aws-samples/sample-ecr-copy
Skills required: DevOps engineer, Data engineer, ML engineer
Task | Description | Skills required

Validate the template.

Validate the CloudFormation template:

aws cloudformation validate-template \
  --template-body file://"Destination Account cf_template.yml" \
  --profile destination-account
Skills required: DevOps engineer, ML engineer, Data engineer

Deploy the destination infrastructure.

  1. Deploy the destination account stack:

    aws cloudformation deploy \
      --template-file "Destination Account cf_template.yml" \
      --stack-name ecr-copy-destination \
      --parameter-overrides \
        SourceAccountId=$SOURCE_ACCOUNT_ID \
        SourceRoleName=ECRContainerLambdaRole \
      --capabilities CAPABILITY_NAMED_IAM \
      --profile destination-account \
      --region $DEST_REGION
  2. Wait for the stack to complete:

    aws cloudformation wait stack-create-complete \
      --stack-name ecr-copy-destination \
      --profile destination-account \
      --region $DEST_REGION
Skills required: Data engineer, ML engineer, DevOps engineer

Verify the deployment.

  1. Get the stack outputs:

    aws cloudformation describe-stacks \
      --stack-name ecr-copy-destination \
      --profile destination-account \
      --region $DEST_REGION \
      --query 'Stacks[0].Outputs' \
      --output table
  2. Store the ARN of the cross-account IAM role:

    export CROSS_ACCOUNT_ROLE_ARN=$(aws cloudformation describe-stacks \
      --stack-name ecr-copy-destination \
      --profile destination-account \
      --region $DEST_REGION \
      --query 'Stacks[0].Outputs[?OutputKey==`CrossAccountRoleArn`].OutputValue' \
      --output text)
    echo "Cross-Account Role ARN: $CROSS_ACCOUNT_ROLE_ARN"
Skills required: DevOps engineer, ML engineer, Data engineer
Task | Description | Skills required

Prepare the container build.

  1. Verify that Docker is running:

    docker --version
    docker info
  2. Ensure that crane-app.py and Dockerfile are in the current directory:

    ls -la crane-app.py Dockerfile
Skills required: Data engineer, ML engineer, DevOps engineer

Build the container image.

  1. Build the Lambda container image:

    docker build -t ecr-copy-lambda . --no-cache
  2. Verify that the image was created:

    docker images ecr-copy-lambda
  3. (Optional) Test the container locally:

    docker run --rm --entrypoint python ecr-copy-lambda -c "import boto3; print('Container working')"
Skills required: Data engineer, ML engineer, DevOps engineer

Create a repository and upload the image.

  1. Create an Amazon ECR repository in the source account:

    aws ecr create-repository \
      --repository-name ecr-copy-lambda \
      --profile source-account \
      --region $SOURCE_REGION
  2. Get an Amazon ECR login token and authenticate Docker:

    aws ecr get-login-password \
      --profile source-account \
      --region $SOURCE_REGION | \
      docker login --username AWS --password-stdin \
      $SOURCE_ACCOUNT_ID.dkr.ecr.$SOURCE_REGION.amazonaws.com
  3. Tag the image for Amazon ECR:

    docker tag ecr-copy-lambda:latest \
      $SOURCE_ACCOUNT_ID.dkr.ecr.$SOURCE_REGION.amazonaws.com/ecr-copy-lambda:latest
  4. Upload the image to Amazon ECR:

    docker push $SOURCE_ACCOUNT_ID.dkr.ecr.$SOURCE_REGION.amazonaws.com/ecr-copy-lambda:latest
  5. Store the image URI for later use:

    export LAMBDA_IMAGE_URI="$SOURCE_ACCOUNT_ID.dkr.ecr.$SOURCE_REGION.amazonaws.com/ecr-copy-lambda:latest"
    echo "Lambda Image URI: $LAMBDA_IMAGE_URI"
Skills required: Data engineer, ML engineer, DevOps engineer

Verify the image.

  1. List the images in the repository:

    aws ecr list-images \
      --repository-name ecr-copy-lambda \
      --profile source-account \
      --region $SOURCE_REGION
  2. Get the image details:

    aws ecr describe-images \
      --repository-name ecr-copy-lambda \
      --profile source-account \
      --region $SOURCE_REGION
Skills required: Data engineer, ML engineer, DevOps engineer
Task | Description | Skills required

Prepare deployment parameters.

  1. Set the notification email:

    export NOTIFICATION_EMAIL="your-email@company.com"
  2. Define the repositories to copy (comma-separated):

    export REPOSITORY_LIST="app-frontend,app-backend,database-migrations"
  3. Set the environment and review the deployment parameters:

    export ENVIRONMENT="dev"
    echo "Deployment Parameters:"
    echo "Source Account: $SOURCE_ACCOUNT_ID"
    echo "Destination Account: $DEST_ACCOUNT_ID"
    echo "Source Region: $SOURCE_REGION"
    echo "Destination Region: $DEST_REGION"
    echo "Lambda Image: $LAMBDA_IMAGE_URI"
    echo "Notification Email: $NOTIFICATION_EMAIL"
    echo "Repositories: $REPOSITORY_LIST"
Skills required: Data engineer, DevOps engineer, ML engineer
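
Before deploying, it can be useful to confirm that every name in the repository list exists in the source account. The following is a minimal sketch of such a pre-flight check. The function name find_missing_repos is hypothetical and is not part of the repository, and the sketch assumes the source-account profile and SOURCE_REGION from the earlier steps.

```shell
# Hypothetical pre-flight check (not part of the sample repository).
# Prints each repository from a comma-separated list that cannot be found
# in the source account. Assumes the source-account profile and SOURCE_REGION.
find_missing_repos() {
  for repo in $(printf '%s' "$1" | tr ',' ' '); do
    # describe-repositories exits with a non-zero status if the name is unknown
    if ! aws ecr describe-repositories \
      --repository-names "$repo" \
      --profile source-account \
      --region "$SOURCE_REGION" >/dev/null 2>&1; then
      echo "$repo"
    fi
  done
}
```

Running `find_missing_repos "$REPOSITORY_LIST"` prints nothing when all repositories are found. Note that a permissions or network error is reported the same way as a missing repository in this sketch.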

Validate the source template.

Validate the source CloudFormation template:

aws cloudformation validate-template \
  --template-body file://"Source Account Cf template.yml" \
  --profile source-account
Skills required: Data engineer, ML engineer, DevOps engineer

Deploy the source infrastructure.

  1. Deploy the source account stack:

    aws cloudformation deploy \
      --template-file "Source Account Cf template.yml" \
      --stack-name ecr-copy-source \
      --parameter-overrides \
        SourceAccountId=$SOURCE_ACCOUNT_ID \
        DestinationAccountId=$DEST_ACCOUNT_ID \
        DestinationRegion=$DEST_REGION \
        SourceRegion=$SOURCE_REGION \
        NotificationEmail=$NOTIFICATION_EMAIL \
        RepositoryList="$REPOSITORY_LIST" \
        LambdaImageUri=$LAMBDA_IMAGE_URI \
        Environment=$ENVIRONMENT \
      --capabilities CAPABILITY_NAMED_IAM \
      --profile source-account \
      --region $SOURCE_REGION
  2. Wait for the stack to complete (this might take up to 10 minutes):

    aws cloudformation wait stack-create-complete \
      --stack-name ecr-copy-source \
      --profile source-account \
      --region $SOURCE_REGION
Skills required: Data engineer, ML engineer, DevOps engineer

Verify the deployment and collect outputs.

  1. Get the stack outputs:

    aws cloudformation describe-stacks \
      --stack-name ecr-copy-source \
      --profile source-account \
      --region $SOURCE_REGION \
      --query 'Stacks[0].Outputs' \
      --output table
  2. Store the Amazon Resource Names (ARNs) for the state machine and SNS topic:

    export STATE_MACHINE_ARN=$(aws cloudformation describe-stacks \
      --stack-name ecr-copy-source \
      --profile source-account \
      --region $SOURCE_REGION \
      --query 'Stacks[0].Outputs[?OutputKey==`StateMachineArn`].OutputValue' \
      --output text)
    export SNS_TOPIC_ARN=$(aws cloudformation describe-stacks \
      --stack-name ecr-copy-source \
      --profile source-account \
      --region $SOURCE_REGION \
      --query 'Stacks[0].Outputs[?OutputKey==`SNSTopicArn`].OutputValue' \
      --output text)
    echo "State Machine ARN: $STATE_MACHINE_ARN"
    echo "SNS Topic ARN: $SNS_TOPIC_ARN"
Skills required: DevOps engineer, ML engineer, Data engineer
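
The describe-stacks query for a single output value recurs several times in this pattern. As a convenience, it could be wrapped in a small function. The following is a sketch only; get_stack_output is a hypothetical name and is not part of the repository.

```shell
# Hypothetical helper that wraps the describe-stacks query used in this pattern.
# Arguments: stack name, output key, CLI profile, Region.
get_stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --profile "$3" \
    --region "$4" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}
```

For example, `export STATE_MACHINE_ARN=$(get_stack_output ecr-copy-source StateMachineArn source-account "$SOURCE_REGION")` stores the same value as the longer command.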

Confirm your email subscription.

  1. Check your email for confirmation of your SNS subscription.

  2. Choose the confirmation link in the email.

  3. Verify the subscription status.

    aws sns list-subscriptions-by-topic \
      --topic-arn $SNS_TOPIC_ARN \
      --profile source-account \
      --region $SOURCE_REGION
Skills required: Data engineer, ML engineer, DevOps engineer
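
Until the recipient confirms, Amazon SNS reports PendingConfirmation in place of the subscription ARN. The following sketch turns that into a scriptable check. The function name all_subscriptions_confirmed is hypothetical, and the sketch assumes the source-account profile and SOURCE_REGION from the earlier steps.

```shell
# Hypothetical helper. Returns 0 when no subscription on the topic is still
# awaiting confirmation, and 1 otherwise. A topic with no subscriptions at
# all also returns 0, so check the list output if you expect at least one.
all_subscriptions_confirmed() {
  arns=$(aws sns list-subscriptions-by-topic \
    --topic-arn "$1" \
    --profile source-account \
    --region "$SOURCE_REGION" \
    --query 'Subscriptions[].SubscriptionArn' \
    --output text)
  case "$arns" in
    *PendingConfirmation*) return 1 ;;
    *) return 0 ;;
  esac
}
```

A call such as `all_subscriptions_confirmed "$SNS_TOPIC_ARN" || echo "Confirm the email subscription first"` can guard the start of an execution.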
Task | Description | Skills required

Run and monitor the copy process.

  1. Sign in to the AWS Management Console, and open the Step Functions console.

  2. Locate the state machine.

  3. Choose Start execution.

    When it finishes, results are displayed on the Execution input and output tab.

  4. (Optional) To start and monitor the execution by using the AWS CLI instead of the console, follow the remaining steps in this epic.

Skills required: DevOps engineer, ML engineer, Data engineer

Run the step function.

  1. Generate a unique execution name:

    export EXECUTION_NAME="ecr-copy-$(date +%Y%m%d-%H%M%S)"
  2. Start the state machine execution:

    export EXECUTION_ARN=$(aws stepfunctions start-execution \
      --state-machine-arn $STATE_MACHINE_ARN \
      --name $EXECUTION_NAME \
      --profile source-account \
      --region $SOURCE_REGION \
      --query 'executionArn' \
      --output text)
    echo "Execution started: $EXECUTION_ARN"
    echo "Execution Name: $EXECUTION_NAME"
Skills required: DevOps engineer, ML engineer, Data engineer

Monitor progress.

  1. Check the status:

    aws stepfunctions describe-execution \
      --execution-arn $EXECUTION_ARN \
      --profile source-account \
      --region $SOURCE_REGION \
      --query '{Status:status,StartDate:startDate,StopDate:stopDate}' \
      --output table
  2. Get the history:

    aws stepfunctions get-execution-history \
      --execution-arn $EXECUTION_ARN \
      --profile source-account \
      --region $SOURCE_REGION \
      --query 'events[?type==`TaskStateEntered` || type==`TaskSucceeded` || type==`TaskFailed`].{Type:type,Timestamp:timestamp,Details:stateEnteredEventDetails.name}' \
      --output table
Skills required: DevOps engineer, ML engineer, Data engineer

Check the results.

Wait for the process to complete (the loop checks the status every 30 seconds):

while true; do
  STATUS=$(aws stepfunctions describe-execution \
    --execution-arn $EXECUTION_ARN \
    --profile source-account \
    --region $SOURCE_REGION \
    --query 'status' \
    --output text)
  echo "Current status: $STATUS"
  if [[ "$STATUS" == "SUCCEEDED" || "$STATUS" == "FAILED" || "$STATUS" == "TIMED_OUT" || "$STATUS" == "ABORTED" ]]; then
    break
  fi
  sleep 30
done
echo "Final execution status: $STATUS"
Skills required: DevOps engineer, ML engineer, Data engineer
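
A polling loop without an upper bound keeps running if the status call never returns a terminal state. The following is a sketch of a bounded variant. The function name wait_for_execution and its arguments are hypothetical, and the sketch assumes the source-account profile and SOURCE_REGION from the earlier steps.

```shell
# Hypothetical variant of the polling loop with an upper bound on attempts.
# Arguments: execution ARN, maximum attempts, seconds between attempts.
# Prints the last observed status; returns 0 only for SUCCEEDED.
wait_for_execution() {
  attempt=0
  status="UNKNOWN"
  while [ "$attempt" -lt "$2" ]; do
    status=$(aws stepfunctions describe-execution \
      --execution-arn "$1" \
      --profile source-account \
      --region "$SOURCE_REGION" \
      --query 'status' \
      --output text)
    case "$status" in
      SUCCEEDED|FAILED|TIMED_OUT|ABORTED) break ;;
    esac
    attempt=$((attempt + 1))
    sleep "$3"
  done
  echo "$status"
  [ "$status" = "SUCCEEDED" ]
}
```

For example, `wait_for_execution "$EXECUTION_ARN" 60 30` checks for up to about 30 minutes and can be used directly in an `if` statement because of its return code.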

Verify the images.

  1. List the repositories in the destination account:

    aws ecr describe-repositories \
      --profile destination-account \
      --region $DEST_REGION \
      --query 'repositories[].repositoryName' \
      --output table
  2. Check the repository images:

    for repo in $(echo $REPOSITORY_LIST | tr ',' ' '); do
      # printf is used because plain echo does not expand \n in bash
      printf '\nImages in repository: %s\n' "$repo"
      aws ecr list-images \
        --repository-name $repo \
        --profile destination-account \
        --region $DEST_REGION \
        --query 'imageIds[].imageTag' \
        --output table 2>/dev/null || echo "Repository $repo not found or no images"
    done
Skills required: DevOps engineer, Data engineer, ML engineer
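
Listing the destination tags shows what arrived, but not what is missing. The following sketch compares the tagged images in the source and destination repositories. The function names list_tags and missing_tags are hypothetical, and the sketch assumes both CLI profiles plus the SOURCE_REGION and DEST_REGION variables from the earlier steps.

```shell
# Hypothetical verification helpers (not part of the sample repository).
# list_tags prints the sorted, unique tags of one repository.
# Arguments: repository name, CLI profile, Region.
list_tags() {
  aws ecr list-images \
    --repository-name "$1" \
    --filter tagStatus=TAGGED \
    --profile "$2" \
    --region "$3" \
    --query 'imageIds[].imageTag' \
    --output text | tr '\t' '\n' | awk 'NF' | sort -u
}

# missing_tags prints tags that exist in the source repository
# but not in the destination repository of the same name.
missing_tags() {
  src_file=$(mktemp)
  dst_file=$(mktemp)
  list_tags "$1" source-account "$SOURCE_REGION" >"$src_file"
  list_tags "$1" destination-account "$DEST_REGION" >"$dst_file"
  # comm -23 keeps lines that appear only in the first (source) file
  comm -23 "$src_file" "$dst_file"
  rm -f "$src_file" "$dst_file"
}
```

An empty result from `missing_tags app-frontend` (a placeholder repository name) suggests that every source tag is present in the destination. The sketch compares tag names only and does not compare image digests.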

Troubleshooting

Issue | Solution

The Step Functions execution fails.

  1. To retrieve detailed failure events from the execution history, run the following AWS CLI command:

    if [[ "$STATUS" == "FAILED" ]]; then
      echo "Getting failure details..."
      aws stepfunctions get-execution-history \
        --execution-arn $EXECUTION_ARN \
        --profile source-account \
        --region $SOURCE_REGION \
        --query 'events[?type==`TaskFailed`]' \
        --output json
    fi
  2. To list the Amazon CloudWatch log groups of the Lambda functions so that you can inspect their logs, run the following AWS CLI command:

    # Check Lambda function logs
    printf '\nLambda function logs:\n'
    aws logs describe-log-groups \
      --log-group-name-prefix "/aws/lambda/ecr-copy-source" \
      --profile source-account \
      --region $SOURCE_REGION \
      --query 'logGroups[].logGroupName' \
      --output table
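
After you know the log group name, recent error messages can be pulled directly from CloudWatch Logs. The following is a minimal sketch. The function name recent_errors is hypothetical, and the "ERROR" filter pattern and one-hour window are assumptions about what the Lambda function logs, not details taken from the repository.

```shell
# Hypothetical helper. Prints messages that contain "ERROR" from the last hour
# of a CloudWatch Logs log group. The filter pattern and the time window are
# assumptions; adjust them to match what the function actually logs.
# Argument: log group name.
recent_errors() {
  # --start-time expects milliseconds since the Unix epoch
  start_ms=$(( ($(date +%s) - 3600) * 1000 ))
  aws logs filter-log-events \
    --log-group-name "$1" \
    --start-time "$start_ms" \
    --filter-pattern "ERROR" \
    --profile source-account \
    --region "$SOURCE_REGION" \
    --query 'events[].message' \
    --output text
}
```

Pass one of the log group names returned by the describe-log-groups command as the argument.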

Related resources

Additional information

Configuration parameters

Parameter | Description | Example
SourceAccountId | Source AWS account ID | 111111111111
DestinationAccountId | Destination AWS account ID | 222222222222
DestinationRegion | Target AWS Region | us-east-2
SourceRegion | Source AWS Region | us-east-1
NotificationEmail | Email for notifications | abc@xyz.com
RepositoryList | Repositories to copy (comma-separated) | repo1,repo2,repo3
LambdaImageUri | Lambda container image URI | ${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/ecr-copy-lambda:latest
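
A quick check of parameter values before deployment can catch typos early. AWS account IDs are 12-digit numbers, so the two account ID parameters can be validated with a small function. The following is a sketch only; is_valid_account_id is a hypothetical name.

```shell
# Hypothetical pre-flight check: an AWS account ID is exactly 12 digits.
# Returns 0 for a valid ID and 1 otherwise.
is_valid_account_id() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "${#1}" -eq 12 ]
}
```

For example, `is_valid_account_id "$SOURCE_ACCOUNT_ID" || echo "Check SOURCE_ACCOUNT_ID"` flags a value that is too short or that contains stray characters.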