

# Ground Truth Security and Permissions
<a name="sms-security-general"></a>

Use the topics on this page to learn about Ground Truth security features and how to configure AWS Identity and Access Management (IAM) permissions to allow a user or role to create a labeling job. Additionally, learn how to create an *execution role*. An execution role is the role that you specify when you create a labeling job. This role is used to start your labeling job.

If you are a new user and want to get started quickly, or if you do not require granular permissions, see [Use IAM Managed Policies with Ground Truth](sms-security-permissions-get-started.md).

For more information about IAM users and roles, see [Identities (Users, Groups, and Roles)](https://docs.aws.amazon.com/IAM/latest/UserGuide/id.html) in the IAM User Guide. 

To learn more about using IAM with SageMaker AI, see [AWS Identity and Access Management for Amazon SageMaker AI](security-iam.md).

**Topics**
+ [CORS Requirement for Input Image Data](sms-cors-update.md)
+ [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md)
+ [Using Amazon SageMaker Ground Truth in an Amazon Virtual Private Cloud](sms-vpc.md)
+ [Output Data and Storage Volume Encryption](sms-security.md)
+ [Workforce Authentication and Restrictions](sms-security-workforce-authentication.md)

# CORS Requirement for Input Image Data
<a name="sms-cors-update"></a>

Earlier in 2020, widely used browsers like Chrome and Firefox changed their default behavior for rotating images based on image metadata, referred to as [EXIF data](https://en.wikipedia.org/wiki/Exif). Previously, browsers would always display images in exactly the manner in which they are stored on disk, which is typically unrotated. After the change, images now rotate according to a piece of image metadata called *orientation value*. This has important implications for the entire machine learning (ML) community. For example, if applications that annotate images do not consider the EXIF orientation, they may display images in unexpected orientations, resulting in incorrect labels. 

Starting with Chrome 89, AWS can no longer automatically prevent the rotation of images because the web standards group W3C has decided that the ability to control rotation of images violates the web’s Same-origin Policy. Therefore, to ensure human workers annotate your input images in a predictable orientation when you submit requests to create a labeling job, you must add a CORS header policy to the Amazon S3 buckets that contain your input images.

**Important**  
If you do not add a CORS configuration to the Amazon S3 buckets that contain your input data, labeling tasks for those input data objects will fail.

If you create a job through the Ground Truth console, CORS is enabled by default. If any of your input data is located in an Amazon S3 bucket other than the one that contains your input manifest file, you must add a CORS configuration to every Amazon S3 bucket that contains input data using the following instructions.

If you are using the `CreateLabelingJob` API to create a Ground Truth labeling job, you can add a CORS policy to an Amazon S3 bucket that contains input data in the S3 console. To set the required CORS headers on the Amazon S3 buckets that contain your input images, follow the directions in [How do I add cross-domain resource sharing with CORS?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-cors-configuration.html). Use the following CORS configuration for the buckets that host your images. If you use the Amazon S3 console to add the policy to your bucket, you must use the JSON format.

**Important**  
If you create a 3D point cloud or video frame labeling job, you must add additional rules to your CORS configuration. To learn more, see [3D point cloud labeling job permission requirements](sms-security-permission-3d-point-cloud.md) and [Video frame job permission requirements](sms-video-overview.md#sms-security-permission-video-frame) respectively. 

**JSON**

```
[{
   "AllowedHeaders": [],
   "AllowedMethods": ["GET"],
   "AllowedOrigins": ["*"],
   "ExposeHeaders": ["Access-Control-Allow-Origin"]
}]
```

**XML**

```
<CORSConfiguration>
 <CORSRule>
   <AllowedOrigin>*</AllowedOrigin>
   <AllowedMethod>GET</AllowedMethod>
   <ExposeHeader>Access-Control-Allow-Origin</ExposeHeader>
 </CORSRule>
</CORSConfiguration>
```
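If you would rather set the configuration programmatically than through the console, the same rule can be expressed as a Python dictionary. The following is a minimal sketch, assuming the AWS SDK for Python (Boto3) is installed and credentials are configured. The helper names `build_cors_configuration` and `apply_cors` are hypothetical and are not part of Ground Truth.

```python
import json


def build_cors_configuration():
    """Return the CORS rule shown above in the shape Boto3 expects."""
    return {
        "CORSRules": [
            {
                "AllowedHeaders": [],
                "AllowedMethods": ["GET"],
                "AllowedOrigins": ["*"],
                "ExposeHeaders": ["Access-Control-Allow-Origin"],
            }
        ]
    }


def apply_cors(bucket_name):
    """Apply the rule to one bucket (hypothetical helper; requires Boto3)."""
    import boto3  # imported lazily so the builder works without the SDK

    s3 = boto3.client("s3")
    # put_bucket_cors replaces any CORS configuration already on the bucket.
    s3.put_bucket_cors(
        Bucket=bucket_name,
        CORSConfiguration=build_cors_configuration(),
    )


if __name__ == "__main__":
    # Print the rules in the JSON form that the Amazon S3 console accepts.
    print(json.dumps(build_cors_configuration()["CORSRules"], indent=3))
```

Because the call replaces the existing configuration, merge this rule with any rules a bucket already has before you apply it.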

The following GIF demonstrates the instructions found in the Amazon S3 documentation to add a CORS header policy using the Amazon S3 console. For written instructions, see **Using the Amazon S3 console** on the documentation page [How do I add cross-domain resource sharing with CORS?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-cors-configuration.html) in the Amazon Simple Storage Service User Guide.

![\[Gif on how to add a CORS header policy using the Amazon S3 console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/cors-config.gif)


# Assign IAM Permissions to Use Ground Truth
<a name="sms-security-permission"></a>

Use the topics in this section to learn how to use AWS Identity and Access Management (IAM) managed and custom policies to manage access to Ground Truth and associated resources. 

You can use the sections on this page to learn the following: 
+ How to create IAM policies that grant a user or role permission to create a labeling job. Administrators can use IAM policies to restrict access to Amazon SageMaker AI and other AWS services that are specific to Ground Truth.
+ How to create a SageMaker AI *execution role*. An execution role is the role that you specify when you create a labeling job. The role is used to start and manage your labeling job.

The following is an overview of the topics you'll find on this page: 
+ If you are getting started using Ground Truth, or you do not require granular permissions for your use case, it is recommended that you use the IAM managed policies described in [Use IAM Managed Policies with Ground Truth](sms-security-permissions-get-started.md).
+ Learn about the permissions required to use the Ground Truth console in [Grant IAM Permission to Use the Amazon SageMaker Ground Truth Console](sms-security-permission-console-access.md). This section includes policy examples that grant an IAM entity permission to create and modify private work teams, subscribe to vendor work teams, and create custom labeling workflows.
+ When you create a labeling job, you must provide an execution role. Use [Create a SageMaker AI Execution Role for a Ground Truth Labeling Job](sms-security-permission-execution-role.md) to learn about the permissions required for this role.

# Use IAM Managed Policies with Ground Truth
<a name="sms-security-permissions-get-started"></a>

SageMaker AI and Ground Truth provide AWS managed policies that you can use to create a labeling job. If you are getting started using Ground Truth and you do not require granular permissions for your use case, it is recommended that you use the following policies:
+ `[AmazonSageMakerFullAccess](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerFullAccess)` – Use this policy to give a user or role permission to create a labeling job. This is a broad policy that grants an entity permission to use SageMaker AI features, as well as features of necessary AWS services through the console and API. This policy gives the entity permission to create a labeling job and to create and manage workforces using Amazon Cognito. To learn more, see [AmazonSageMakerFullAccess Policy](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonSageMakerFullAccess).
+ `[AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution)` – To create an *execution role*, you can attach the policy `[AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution)` to a role. An execution role is the role that you specify when you create a labeling job and it is used to start your labeling job. This policy allows you to create both streaming and non-streaming labeling jobs, and to create a labeling job using any task type. Note the following limits of this managed policy.
  + **Amazon S3 permissions**: This policy grants an execution role permission to access Amazon S3 buckets with the following strings in the name: `GroundTruth`, `Groundtruth`, `groundtruth`, `SageMaker`, `Sagemaker`, and `sagemaker` or a bucket with an [object tag](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html) that includes `SageMaker` in the name (case insensitive). Make sure your input and output bucket names include these strings, or add additional permissions to your execution role to [grant it permission to access your Amazon S3 buckets](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_rw-bucket.html). You must give this role permission to perform the following actions on your Amazon S3 buckets: `AbortMultipartUpload`, `GetObject`, and `PutObject`.
  + **Custom Workflows**: When you create a [custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html), this execution role is restricted to invoking AWS Lambda functions with one of the following strings as part of the function name: `GtRecipe`, `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`. This applies to both your pre-annotation and post-annotation Lambda functions. If you choose to use names without those strings, you must explicitly provide `lambda:InvokeFunction` permission to the execution role used to create the labeling job.

To learn how to attach an AWS managed policy to a user or role, refer to [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console) in the IAM User Guide.
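The naming limits of `AmazonSageMakerGroundTruthExecution` described above can be checked before you create a job. The following is a minimal sketch that only tests names against the listed strings. It does not evaluate IAM, and it does not cover the object tag route for Amazon S3 access. The function names are hypothetical.

```python
# Substrings taken from the managed policy limits described above.
BUCKET_NAME_SUBSTRINGS = (
    "GroundTruth", "Groundtruth", "groundtruth",
    "SageMaker", "Sagemaker", "sagemaker",
)
LAMBDA_NAME_SUBSTRINGS = (
    "GtRecipe", "SageMaker", "Sagemaker", "sagemaker", "LabelingFunction",
)


def bucket_name_is_covered(bucket_name):
    """True if the bucket name contains one of the listed strings."""
    return any(s in bucket_name for s in BUCKET_NAME_SUBSTRINGS)


def lambda_name_is_covered(function_name):
    """True if the Lambda function name contains one of the listed strings."""
    return any(s in function_name for s in LAMBDA_NAME_SUBSTRINGS)


if __name__ == "__main__":
    for name in ("my-sagemaker-input", "my-input-data"):
        print(name, bucket_name_is_covered(name))
```

A name that fails this check signals that the execution role needs additional Amazon S3 or `lambda:InvokeFunction` permissions.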

# Grant IAM Permission to Use the Amazon SageMaker Ground Truth Console
<a name="sms-security-permission-console-access"></a>

To use the Ground Truth area of the SageMaker AI console, you need to grant permission to an entity to access SageMaker AI and other AWS services that Ground Truth interacts with. The permissions required to access other AWS services depend on your use case: 
+ Amazon S3 permissions are required for all use cases. These permissions must grant access to the Amazon S3 buckets that contain input and output data. 
+ AWS Marketplace permissions are required to use a vendor workforce.
+ Amazon Cognito permissions are required for private work team setup.
+ AWS KMS permissions are required to view available AWS KMS keys that can be used for output data encryption.
+ IAM permissions are required to either list pre-existing execution roles, or to create a new one. Additionally, you must add a `PassRole` permission to allow SageMaker AI to use the execution role chosen to start the labeling job.

The following sections list policies you may want to grant to a role to use one or more functions of Ground Truth. 

**Topics**
+ [Ground Truth Console Permissions](#sms-security-permissions-console-all)
+ [Custom Labeling Workflow Permissions](#sms-security-permissions-custom-workflow)
+ [Private Workforce Permissions](#sms-security-permission-workforce-creation)
+ [Vendor Workforce Permissions](#sms-security-permissions-workforce-creation-vendor)

## Ground Truth Console Permissions
<a name="sms-security-permissions-console-all"></a>

To grant permission to a user or role to use the Ground Truth area of the SageMaker AI console to create a labeling job, attach the following policy to the user or role. The following policy gives an IAM role permission to create a labeling job using a [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html). If you want to create a custom labeling workflow, add the policy in [Custom Labeling Workflow Permissions](#sms-security-permissions-custom-workflow) to the following policy. Each `Statement` included in the following policy is described below this code block.

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SageMakerApis",
            "Effect": "Allow",
            "Action": [
                "sagemaker:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "KmsKeysForCreateForms",
            "Effect": "Allow",
            "Action": [
                "kms:DescribeKey",
                "kms:ListAliases"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AccessAwsMarketplaceSubscriptions",
            "Effect": "Allow",
            "Action": [
                "aws-marketplace:ViewSubscriptions"
            ],
            "Resource": "*"
        },
        {
            "Sid": "SecretsManager",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:CreateSecret",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecrets"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ListAndCreateExecutionRoles",
            "Effect": "Allow",
            "Action": [
                "iam:ListRoles",
                "iam:CreateRole",
                "iam:CreatePolicy",
                "iam:AttachRolePolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "PassRoleForExecutionRoles",
            "Effect": "Allow",
            "Action": [
                "iam:PassRole"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": "sagemaker.amazonaws.com"
                }
            }
        },
        {
            "Sid": "GroundTruthConsole",
            "Effect": "Allow",
            "Action": [
                "groundtruthlabeling:*",
                "lambda:InvokeFunction",
                "lambda:ListFunctions",
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetBucketCors",
                "s3:PutBucketCors",
                "s3:ListAllMyBuckets",
                "cognito-idp:AdminAddUserToGroup",
                "cognito-idp:AdminCreateUser",
                "cognito-idp:AdminDeleteUser",
                "cognito-idp:AdminDisableUser",
                "cognito-idp:AdminEnableUser",
                "cognito-idp:AdminRemoveUserFromGroup",
                "cognito-idp:CreateGroup",
                "cognito-idp:CreateUserPool",
                "cognito-idp:CreateUserPoolClient",
                "cognito-idp:CreateUserPoolDomain",
                "cognito-idp:DescribeUserPool",
                "cognito-idp:DescribeUserPoolClient",
                "cognito-idp:ListGroups",
                "cognito-idp:ListIdentityProviders",
                "cognito-idp:ListUsers",
                "cognito-idp:ListUsersInGroup",
                "cognito-idp:ListUserPoolClients",
                "cognito-idp:ListUserPools",
                "cognito-idp:UpdateUserPool",
                "cognito-idp:UpdateUserPoolClient"
            ],
            "Resource": "*"
        }
    ]
}
```

------

This policy includes the following statements. You can scope down any of these statements by adding specific resources to the `Resource` list for that statement.

**`SageMakerApis`**

This statement includes `sagemaker:*`, which allows the user to perform all [SageMaker AI API actions](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_Operations.html). You can reduce the scope of this policy by restricting users from performing actions that are not used to create and monitor a labeling job. 

**`KmsKeysForCreateForms`**

You only need to include this statement if you want to grant a user permission to list and select AWS KMS keys in the Ground Truth console to use for output data encryption. The policy above grants a user permission to list and select any key in the account in AWS KMS. To restrict the keys that a user can list and select, specify those key ARNs in `Resource`.

**`SecretsManager`**

This statement gives the user permission to describe, list, and create resources in AWS Secrets Manager required to create the labeling job.

**`ListAndCreateExecutionRoles`**

This statement gives a user permission to list (`ListRoles`) and create (`CreateRole`) IAM roles in your account. It also grants the user permission to create (`CreatePolicy`) policies and attach (`AttachRolePolicy`) policies to entities. These are required to list, select, and if required, create an execution role in the console. 

If you have already created an execution role, and want to narrow the scope of this statement so that users can only select that role in the console, specify the ARNs of the roles you want the user to have permission to view in `Resource` and remove the actions `CreateRole`, `CreatePolicy`, and `AttachRolePolicy`.

**`AccessAwsMarketplaceSubscriptions`**

These permissions are required to view and choose vendor work teams that you are already subscribed to when creating a labeling job. To give the user permission to *subscribe* to vendor work teams, add the statement in [Vendor Workforce Permissions](#sms-security-permissions-workforce-creation-vendor) to the policy above.

**`PassRoleForExecutionRoles`**

This is required to give the labeling job creator permission to preview the worker UI and verify that input data, labels, and instructions display correctly. This statement gives an entity permissions to pass the IAM execution role used to create the labeling job to SageMaker AI to render and preview the worker UI. To narrow the scope of this policy, add the role ARN of the execution role used to create the labeling job under `Resource`.

**`GroundTruthConsole`**
+ `groundtruthlabeling` – This allows a user to perform actions required to use certain features of the Ground Truth console. These include permissions to describe the labeling job status (`DescribeConsoleJob`), list all dataset objects in the input manifest file (`ListDatasetObjects`), filter the dataset if dataset sampling is selected (`RunFilterOrSampleDatasetJob`), and to generate input manifest files if automated data labeling is used (`RunGenerateManifestByCrawlingJob`). These actions are only available when using the Ground Truth console and cannot be called directly using an API.
+ `lambda:InvokeFunction` and `lambda:ListFunctions` – These actions give users permission to list and invoke Lambda functions that are used to run a custom labeling workflow.
+ `s3` – The Amazon S3 permissions included in this statement are used to view Amazon S3 buckets for [automated data setup](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-console-create-manifest-file.html) (`ListAllMyBuckets`), access input data in Amazon S3 (`ListBucket`, `GetObject`), check for and create a CORS policy in Amazon S3 if needed (`GetBucketCors` and `PutBucketCors`), and write labeling job output files to S3 (`PutObject`).
+ `cognito-idp` – These permissions are used to create, view, and manage a private workforce using Amazon Cognito. To learn more about these actions, refer to the [Amazon Cognito API References](https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-reference.html).
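Scoping down a statement, as described above, amounts to replacing its `Resource` value with specific ARNs. The following is a minimal sketch of that edit on a policy document loaded as a Python dictionary. The function name `scope_statement` and the role ARN are hypothetical examples.

```python
import copy


def scope_statement(policy, sid, resources):
    """Return a copy of the policy with one statement limited to the given ARNs."""
    scoped = copy.deepcopy(policy)  # leave the caller's document unchanged
    for statement in scoped["Statement"]:
        if statement.get("Sid") == sid:
            statement["Resource"] = list(resources)
            return scoped
    raise KeyError(f"No statement with Sid {sid!r}")


if __name__ == "__main__":
    # A one-statement excerpt of the console policy shown above.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PassRoleForExecutionRoles",
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {"iam:PassedToService": "sagemaker.amazonaws.com"}
                },
            }
        ],
    }
    # Hypothetical execution role ARN used only for illustration.
    role_arn = "arn:aws:iam::111122223333:role/ExampleExecutionRole"
    print(scope_statement(policy, "PassRoleForExecutionRoles", [role_arn]))
```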

## Custom Labeling Workflow Permissions
<a name="sms-security-permissions-custom-workflow"></a>

Add the following statement to a policy similar to the one in [Ground Truth Console Permissions](#sms-security-permissions-console-all) to give a user permission to select pre-existing pre-annotation and post-annotation Lambda functions while [creating a custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html).

```
{
    "Sid": "GroundTruthConsoleCustomWorkflow",
    "Effect": "Allow",
    "Action": [
        "lambda:InvokeFunction",
        "lambda:ListFunctions"
    ],
    "Resource": "*"
}
```

To learn how to give an entity permission to create and test pre-annotation and post-annotation Lambda functions, see [Required Permissions To Use Lambda With Ground Truth](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates-step3-lambda-permissions.html).

## Private Workforce Permissions
<a name="sms-security-permission-workforce-creation"></a>

When added to a permissions policy, the following permission grants access to create and manage a private workforce and work team using Amazon Cognito. These permissions are not required to use an [OIDC IdP workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-create-private-oidc.html#sms-workforce-create-private-oidc-next-steps).

```
{
    "Effect": "Allow",
    "Action": [
        "cognito-idp:AdminAddUserToGroup",
        "cognito-idp:AdminCreateUser",
        "cognito-idp:AdminDeleteUser",
        "cognito-idp:AdminDisableUser",
        "cognito-idp:AdminEnableUser",
        "cognito-idp:AdminRemoveUserFromGroup",
        "cognito-idp:CreateGroup",
        "cognito-idp:CreateUserPool",
        "cognito-idp:CreateUserPoolClient",
        "cognito-idp:CreateUserPoolDomain",
        "cognito-idp:DescribeUserPool",
        "cognito-idp:DescribeUserPoolClient",
        "cognito-idp:ListGroups",
        "cognito-idp:ListIdentityProviders",
        "cognito-idp:ListUsers",
        "cognito-idp:ListUsersInGroup",
        "cognito-idp:ListUserPoolClients",
        "cognito-idp:ListUserPools",
        "cognito-idp:UpdateUserPool",
        "cognito-idp:UpdateUserPoolClient"
    ],
    "Resource": "*"
}
```

To learn more about creating a private workforce using Amazon Cognito, see [Amazon Cognito Workforces](sms-workforce-private-use-cognito.md). 

## Vendor Workforce Permissions
<a name="sms-security-permissions-workforce-creation-vendor"></a>

You can add the following statement to the policy in [Grant IAM Permission to Use the Amazon SageMaker Ground Truth Console](#sms-security-permission-console-access) to grant an entity permission to subscribe to a [vendor workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management-vendor.html).

```
{
    "Sid": "AccessAwsMarketplaceSubscriptions",
    "Effect": "Allow",
    "Action": [
        "aws-marketplace:Subscribe",
        "aws-marketplace:Unsubscribe",
        "aws-marketplace:ViewSubscriptions"
    ],
    "Resource": "*"
}
```

# Create a SageMaker AI Execution Role for a Ground Truth Labeling Job
<a name="sms-security-permission-execution-role"></a>

When you configure your labeling job, you need to provide an *execution role*, which is a role that SageMaker AI has permission to assume to start and run your labeling job.

This role must give Ground Truth permission to access the following: 
+ Amazon S3 to retrieve your input data and write output data to an Amazon S3 bucket. You can either grant permission for an IAM role to access an entire bucket by providing the bucket ARN, or you can grant the role access to specific resources in a bucket. For example, the ARN for a bucket may look similar to `arn:aws:s3:::amzn-s3-demo-bucket1` and the ARN of a resource in an Amazon S3 bucket may look similar to `arn:aws:s3:::amzn-s3-demo-bucket1/prefix/file-name.png`. To apply an action to all resources under a prefix in an Amazon S3 bucket, you can use the wildcard `*`. For example, `arn:aws:s3:::amzn-s3-demo-bucket1/prefix/*`. For more information, see [Amazon S3 Resources](https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-arn-format.html) in the Amazon Simple Storage Service User Guide.
+ CloudWatch to log worker metrics and labeling job statuses.
+ AWS KMS for data encryption. (Optional)
+ AWS Lambda for processing input and output data when you create a custom workflow. 
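The Amazon S3 ARN formats described in the first item above can be generated with two small helpers. This is a minimal sketch, assuming the standard `aws` partition. The function names are hypothetical.

```python
def bucket_arn(bucket_name):
    """ARN for the bucket itself (used for bucket-level actions)."""
    return f"arn:aws:s3:::{bucket_name}"


def object_arn(bucket_name, key_pattern="*"):
    """ARN for one object, or for a set of objects when the key uses a wildcard."""
    return f"arn:aws:s3:::{bucket_name}/{key_pattern}"


if __name__ == "__main__":
    print(bucket_arn("amzn-s3-demo-bucket1"))
    print(object_arn("amzn-s3-demo-bucket1", "prefix/file-name.png"))
    print(object_arn("amzn-s3-demo-bucket1", "prefix/*"))
```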

Additionally, if you create a [streaming labeling job](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-streaming-labeling-job.html), this role must have permission to access:
+ Amazon SQS to create and interact with an SQS queue used to [manage labeling requests](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-streaming-labeling-job.html#sms-streaming-how-it-works-sqs).
+ Amazon SNS to subscribe to and retrieve messages from your Amazon SNS input topic and to send messages to your Amazon SNS output topic.

All of these permissions can be granted with the `[AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution)` managed policy *except*:
+ Data and storage volume encryption of your Amazon S3 buckets. To learn how to configure these permissions, see [Encrypt Output Data and Storage Volume with AWS KMS](sms-security-kms-permissions.md).
+ Permission to select and invoke Lambda functions that do not include `GtRecipe`, `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction` in the function name.
+ Amazon S3 buckets that do not include `GroundTruth`, `Groundtruth`, `groundtruth`, `SageMaker`, `Sagemaker`, or `sagemaker` in the prefix or bucket name, and that do not have an [object tag](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html) that includes `SageMaker` in the name (case insensitive).

If you require more granular permissions than the ones provided in `AmazonSageMakerGroundTruthExecution`, use the following policy examples to create an execution role that fits your specific use case.

**Topics**
+ [Built-In Task Types (Non-streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt)
+ [Built-In Task Types (Streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt-streaming)
+ [Execution Role Requirements for Custom Task Types](#sms-security-permission-execution-role-custom-tt)
+ [Automated Data Labeling Permission Requirements](#sms-security-permission-execution-role-custom-auto-labeling)

## Built-In Task Types (Non-streaming) Execution Role Requirements
<a name="sms-security-permission-execution-role-built-in-tt"></a>

The following policy grants permission to create a labeling job for a [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html). This execution policy does not include permissions for AWS KMS data encryption or decryption. Replace the placeholder bucket names `<input-bucket-name>` and `<output-bucket-name>` in each ARN with the names of your own Amazon S3 buckets.

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ViewBuckets",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::<input-bucket-name>",
                "arn:aws:s3:::<output-bucket-name>"
            ]
        },
        {
            "Sid": "S3GetPutObjects",
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<input-bucket-name>/*",
                "arn:aws:s3:::<output-bucket-name>/*"
            ]
        },
        {
            "Sid": "CloudWatch",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "logs:CreateLogStream",
                "logs:CreateLogGroup",
                "logs:DescribeLogStreams",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        }
    ]
}
```

------
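Filling in the bucket placeholders by hand is error prone when you manage several jobs. The following is a minimal sketch that generates the policy above from two bucket names. The function name `build_execution_policy` is hypothetical, and the output still needs to be attached to a role through IAM.

```python
import json


def build_execution_policy(input_bucket, output_bucket):
    """Build the non-streaming execution policy shown above for two buckets."""
    # Remove duplicates while keeping order, in case one bucket serves both purposes.
    buckets = list(dict.fromkeys([input_bucket, output_bucket]))
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "S3ViewBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{b}" for b in buckets],
            },
            {
                "Sid": "S3GetPutObjects",
                "Effect": "Allow",
                "Action": ["s3:AbortMultipartUpload", "s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{b}/*" for b in buckets],
            },
            {
                "Sid": "CloudWatch",
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:PutMetricData",
                    "logs:CreateLogStream",
                    "logs:CreateLogGroup",
                    "logs:DescribeLogStreams",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    policy = build_execution_policy("amzn-s3-demo-bucket", "amzn-s3-demo-bucket2")
    print(json.dumps(policy, indent=4))
```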

## Built-In Task Types (Streaming) Execution Role Requirements
<a name="sms-security-permission-execution-role-built-in-tt-streaming"></a>

If you create a streaming labeling job, you must add a policy similar to the following to the execution role you use to create the labeling job. To narrow the scope of the policy, replace the `*` in `Resource` with specific AWS resources that you want to grant the IAM role permission to access and use.

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket/*",
                "arn:aws:s3:::amzn-s3-demo-bucket2/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "*",
            "Condition": {
                "StringEqualsIgnoreCase": {
                    "s3:ExistingObjectTag/SageMaker": "true"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket2"
            ]
        },
        {
            "Sid": "CloudWatch",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "logs:CreateLogStream",
                "logs:CreateLogGroup",
                "logs:DescribeLogStreams",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        },
        {
            "Sid": "StreamingQueue",
            "Effect": "Allow",
            "Action": [
                "sqs:CreateQueue",
                "sqs:DeleteMessage",
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl",
                "sqs:ReceiveMessage",
                "sqs:SendMessage",
                "sqs:SetQueueAttributes"
            ],
            "Resource": "arn:aws:sqs:*:*:*GroundTruth*"
        },
        {
            "Sid": "StreamingTopicSubscribe",
            "Effect": "Allow",
            "Action": "sns:Subscribe",
            "Resource": [
                "arn:aws:sns:us-east-1:111122223333:input-topic-name",
                "arn:aws:sns:us-east-1:111122223333:output-topic-name"
            ],
            "Condition": {
                "StringEquals": {
                    "sns:Protocol": "sqs"
                },
                "StringLike": {
                    "sns:Endpoint": "arn:aws:sqs:us-east-1:111122223333:*GroundTruth*"
                }
            }
        },
        {
            "Sid": "StreamingTopic",
            "Effect": "Allow",
            "Action": [
                "sns:Publish"
            ],
            "Resource": [
                "arn:aws:sns:us-east-1:111122223333:input-topic-name",
                "arn:aws:sns:us-east-1:111122223333:output-topic-name"
            ]
        },
        {
            "Sid": "StreamingTopicUnsubscribe",
            "Effect": "Allow",
            "Action": [
                "sns:Unsubscribe"
            ],
            "Resource": [
                "arn:aws:sns:us-east-1:111122223333:input-topic-name",
                "arn:aws:sns:us-east-1:111122223333:output-topic-name"
            ]
        }
    ]
}
```

------
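One way to narrow the `StreamingQueue` statement is to pin the Region and account in the queue ARN while keeping the `*GroundTruth*` name pattern. The following is a minimal sketch, assuming the queue is created in the same Region and account as the labeling job. The function name is hypothetical.

```python
def streaming_queue_statement(region, account_id):
    """Return the StreamingQueue statement limited to one Region and account."""
    return {
        "Sid": "StreamingQueue",
        "Effect": "Allow",
        "Action": [
            "sqs:CreateQueue",
            "sqs:DeleteMessage",
            "sqs:GetQueueAttributes",
            "sqs:GetQueueUrl",
            "sqs:ReceiveMessage",
            "sqs:SendMessage",
            "sqs:SetQueueAttributes",
        ],
        # Same name pattern as the policy above, without the Region and account wildcards.
        "Resource": f"arn:aws:sqs:{region}:{account_id}:*GroundTruth*",
    }


if __name__ == "__main__":
    print(streaming_queue_statement("us-east-1", "111122223333")["Resource"])
```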

## Execution Role Requirements for Custom Task Types
<a name="sms-security-permission-execution-role-custom-tt"></a>

If you want to create a [custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html), add the following statement to an execution role policy like the ones found in [Built-In Task Types (Non-streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt) or [Built-In Task Types (Streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt-streaming).

This policy gives the execution role permission to `Invoke` your pre-annotation and post-annotation Lambda functions.

```
{
    "Sid": "LambdaFunctions",
    "Effect": "Allow",
    "Action": [
        "lambda:InvokeFunction"
    ],
    "Resource": [
        "arn:aws:lambda:<region>:<account-id>:function:<pre-annotation-lambda-name>",
        "arn:aws:lambda:<region>:<account-id>:function:<post-annotation-lambda-name>"
    ]
}
```
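As a minimal sketch, the statement above could be generated and merged into an existing execution role policy document with a small helper. The function name `add_lambda_invoke_statement`, the empty starting policy, and the example function ARNs are hypothetical and only serve as illustration.

```python
import copy
import json


def add_lambda_invoke_statement(policy, pre_lambda_arn, post_lambda_arn):
    """Return a copy of `policy` with a statement allowing lambda:InvokeFunction."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    updated.setdefault("Statement", []).append({
        "Sid": "LambdaFunctions",
        "Effect": "Allow",
        "Action": ["lambda:InvokeFunction"],
        "Resource": [pre_lambda_arn, post_lambda_arn],
    })
    return updated


# Hypothetical starting policy and function ARNs, for illustration only.
base_policy = {"Version": "2012-10-17", "Statement": []}
policy = add_lambda_invoke_statement(
    base_policy,
    "arn:aws:lambda:us-east-1:111122223333:function:example-pre-annotation",
    "arn:aws:lambda:us-east-1:111122223333:function:example-post-annotation",
)
print(json.dumps(policy, indent=4))
```

In practice, `base_policy` would be one of the built-in task type execution role policies referenced above rather than an empty document.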

## Automated Data Labeling Permission Requirements
<a name="sms-security-permission-execution-role-custom-auto-labeling"></a>

If you want to create a labeling job with [automated data labeling](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html) enabled, you must 1) add one statement to the IAM policy attached to the execution role and 2) update the trust policy of the execution role. 

The following statement allows the IAM execution role to be passed to SageMaker AI so that it can be used to run the training and inference jobs used for active learning and automated data labeling, respectively. Add this statement to an execution role policy like the ones found in [Built-In Task Types (Non-streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt) or [Built-In Task Types (Streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt-streaming). Replace `arn:aws:iam::<account-number>:role/<execution-role-name>` with the execution role ARN. You can find your IAM role ARN in the IAM console under **Roles**. 

```
{
    "Effect": "Allow",
    "Action": [
        "iam:PassRole"
    ],
    "Resource": "arn:aws:iam::<account-number>:role/<execution-role-name>",
    "Condition": {
        "StringEquals": {
            "iam:PassedToService": [
                "sagemaker.amazonaws.com"
            ]
        }
    }
}
```

The following statement allows SageMaker AI to assume the execution role to create and manage the SageMaker training and inference jobs. This policy must be added to the trust relationship of the execution role. To learn how to add or modify an IAM role trust policy, see [Modifying a role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_manage_modify.html) in the IAM User Guide.

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Principal": {"Service": "sagemaker.amazonaws.com" },
        "Action": "sts:AssumeRole"
    }
}
```

------
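The two documents in this section can also be produced programmatically. The following is a sketch under stated assumptions: the helper names are hypothetical, and `apply_trust_policy` expects an IAM client such as `boto3.client("iam")` to be passed in. Note that `update_assume_role_policy` replaces the role's entire trust policy, so merge any existing trusted principals before calling it.

```python
import json


def pass_role_statement(execution_role_arn):
    """Statement that lets the execution role be passed to SageMaker AI."""
    return {
        "Effect": "Allow",
        "Action": ["iam:PassRole"],
        "Resource": execution_role_arn,
        "Condition": {
            "StringEquals": {"iam:PassedToService": ["sagemaker.amazonaws.com"]}
        },
    }


def sagemaker_trust_policy():
    """Trust policy that lets SageMaker AI assume the execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        },
    }


def apply_trust_policy(iam_client, role_name):
    """Overwrite the trust policy of `role_name` (assumes an IAM client is supplied)."""
    iam_client.update_assume_role_policy(
        RoleName=role_name,
        PolicyDocument=json.dumps(sagemaker_trust_policy()),
    )
```

For example, `apply_trust_policy(boto3.client("iam"), "example-role")` would update a role named `example-role`, which is a placeholder name.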



# Encrypt Output Data and Storage Volume with AWS KMS
<a name="sms-security-kms-permissions"></a>

You can use AWS Key Management Service (AWS KMS) to encrypt output data from a labeling job by specifying a [customer managed key](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#master_keys) when you create the labeling job. If you use the API operation `CreateLabelingJob` to create a labeling job that uses automated data labeling, you can also use a customer managed key to encrypt the storage volume attached to the ML compute instances to run the training and inference jobs.

This section describes the IAM policies you must attach to your customer managed key to enable output data encryption and the policies you must attach to your customer managed key and execution role to use storage volume encryption. To learn more about these options, see [Output Data and Storage Volume Encryption](sms-security.md).

## Encrypt Output Data using KMS
<a name="sms-security-kms-permissions-output-data"></a>

If you specify an AWS KMS customer managed key to encrypt output data, you must add an IAM policy similar to the following to that key. This policy gives the IAM execution role that you use to create your labeling job permission to use this key to perform all of the actions listed in `"Action"`. To learn more about these actions, see [AWS KMS permissions](https://docs.aws.amazon.com/kms/latest/developerguide/kms-api-permissions-reference.html) in the AWS Key Management Service Developer Guide.

To use this policy, replace the IAM service-role ARN in `"Principal"` with the ARN of the execution role you use to create the labeling job. When you create a labeling job in the console, this is the role you specify for **IAM Role** under the **Job overview** section. When you create a labeling job using `CreateLabelingJob`, this is the ARN you specify for [`RoleArn`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-RoleArn).

```
{
    "Sid": "AllowUseOfKmsKey",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/service-role/example-role"
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
    ],
    "Resource": "*"
}
```
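A key policy usually already contains other statements, so the statement above is typically merged into the existing document rather than written on its own. The helper below is a hedged sketch of that merge; the name `allow_role_to_use_key` and the empty starting policy are assumptions for illustration. With Boto3, the existing document could be read with the KMS client's `get_key_policy` and written back with `put_key_policy`.

```python
import json

# Actions copied from the statement above.
KMS_ACTIONS = [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey",
]


def allow_role_to_use_key(key_policy_json, execution_role_arn):
    """Return the key policy JSON with an AllowUseOfKmsKey statement for the role."""
    policy = json.loads(key_policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    # Drop any earlier copy of this statement so the helper can be re-run safely.
    statements = [s for s in statements if s.get("Sid") != "AllowUseOfKmsKey"]
    statements.append({
        "Sid": "AllowUseOfKmsKey",
        "Effect": "Allow",
        "Principal": {"AWS": execution_role_arn},
        "Action": KMS_ACTIONS,
        "Resource": "*",
    })
    policy["Statement"] = statements
    return json.dumps(policy, indent=4)


# Hypothetical existing key policy and role ARN.
existing_policy = json.dumps({"Version": "2012-10-17", "Statement": []})
updated_policy = allow_role_to_use_key(
    existing_policy, "arn:aws:iam::111122223333:role/service-role/example-role"
)
print(updated_policy)
```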

## Encrypt Automated Data Labeling ML Compute Instance Storage Volume
<a name="sms-security-kms-permissions-storage-volume"></a>

If you specify a [`VolumeKmsKeyId`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobResourceConfig.html#sagemaker-Type-LabelingJobResourceConfig-VolumeKmsKeyId) to encrypt the storage volume attached to the ML compute instance used for automated data labeling training and inference, you must do the following:
+ Attach permissions described in [Encrypt Output Data using KMS](#sms-security-kms-permissions-output-data) to the customer managed key.
+ Attach a policy similar to the following to the IAM execution role you use to create your labeling job. This is the IAM role you specify for [`RoleArn`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-RoleArn) in `CreateLabelingJob`. To learn more about the `"kms:CreateGrant"` action that this policy permits, see [`CreateGrant`](https://docs.aws.amazon.com/kms/latest/APIReference/API_CreateGrant.html) in the AWS Key Management Service API Reference.

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kms:CreateGrant"
            ],
            "Resource": "*"
        }
    ]
}
```

------

To learn more about Ground Truth storage volume encryption, see [Use Your KMS Key to Encrypt Automated Data Labeling Storage Volume (API Only)](sms-security.md#sms-security-kms-storage-volume).

# Using Amazon SageMaker Ground Truth in an Amazon Virtual Private Cloud
<a name="sms-vpc"></a>

With [Amazon Virtual Private Cloud](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html) (Amazon VPC) you can launch AWS resources in a logically isolated virtual network that you define. Ground Truth supports running labeling jobs inside an Amazon VPC instead of connecting over the internet. When you launch a labeling job in an Amazon VPC, communication between your VPC and Ground Truth is conducted entirely and securely within the AWS network.

This guide shows how you can use Ground Truth in an Amazon VPC in the following ways:

1. [Run an Amazon SageMaker Ground Truth Labeling Job in an Amazon Virtual Private Cloud](samurai-vpc-labeling-job.md)

1. [Use Amazon VPC Mode from a Private Worker Portal](samurai-vpc-worker-portal.md)

# Run an Amazon SageMaker Ground Truth Labeling Job in an Amazon Virtual Private Cloud
<a name="samurai-vpc-labeling-job"></a>

Ground Truth supports the following functionalities in Amazon VPC.
+ You can use Amazon S3 bucket policies to control access to buckets from specific Amazon VPC endpoints, or specific VPCs. If you launch a labeling job and your input data is located in an Amazon S3 bucket that is restricted to users in your VPC, you can add a bucket policy to also grant a Ground Truth endpoint permission to access the bucket. To learn more, see [Allow Ground Truth to Access VPC Restricted Amazon S3 Buckets](#sms-vpc-permissions-s3).
+ You can launch an [automated data labeling job](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html) in your VPC. You use a VPC configuration to specify VPC subnets and security groups. SageMaker AI uses this configuration to launch the training and inference jobs used for automated data labeling in your VPC. To learn more, see [Create an Automated Data Labeling Job in a VPC](#sms-vpc-permissions-automated-labeling).

You may want to use these options in any of the following ways.
+ You can use both of these methods to launch a labeling job using a VPC-protected Amazon S3 bucket with automated data labeling enabled.
+ You can launch a labeling job using any [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html) using a VPC-protected bucket.
+ You can launch a [custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html) using a VPC-protected bucket. Ground Truth interacts with your pre-annotation and post-annotation Lambda functions using an [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/endpoint-services-overview.html) endpoint.

We recommend that you review [Prerequisites for running a Ground Truth labeling job in a VPC](#sms-vpc-gt-prereq) before you create a labeling job in an Amazon VPC.

## Prerequisites for running a Ground Truth labeling job in a VPC
<a name="sms-vpc-gt-prereq"></a>

Review the following prerequisites before you create a Ground Truth labeling job in an Amazon VPC. 
+ If you are a new user of Ground Truth, review [Getting started](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-getting-started.html) to learn how to create a labeling job.
+ If your input data is located in a VPC-protected Amazon S3 bucket, your workers must access the worker portal from your VPC. VPC-based labeling jobs require the use of a private work team. To learn more about creating a private work team, see [Use a Private Workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-private.html).
+ The following prerequisites are specific to launching an automated data labeling job in your VPC.
  + Use the instructions in [Create an Amazon S3 VPC Endpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/train-vpc.html#train-vpc-s3) to create an endpoint. Training and inference containers used in the automated data labeling workflow use this endpoint to communicate with your buckets in Amazon S3.
  + Review [Automate Data Labeling](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html) to learn more about this feature. Note that automated data labeling is supported for the following [built-in task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html): [Image Classification (Single Label)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-image-classification.html), [Image Semantic Segmentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-semantic-segmentation.html), [Bounding Box](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-bounding-box.html), and [Text Classification (Single Label)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-text-classification.html). Streaming labeling jobs do not support automated data labeling.
+ Review the [Ground Truth Security and Permissions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-security-general.html) section and ensure that you have met the following conditions.
  + The user creating the labeling job has all necessary permissions.
  + You have created an IAM execution role with the required permissions. If you do not require fine-tuned permissions for your use case, we recommend you use the IAM managed policies described in [Grant General Permissions To Get Started Using Ground Truth](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-security-permission.html#sms-security-permissions-get-started).
  + Your VPC has access to the `sagemaker-labeling-data-region` and `sm-bxcb-region-saved-task-states` S3 buckets. These are system-owned, regionalized S3 buckets that the worker portal accesses while a worker is working on a task. Ground Truth uses these buckets to interact with system-managed data.

## Allow Ground Truth to Access VPC Restricted Amazon S3 Buckets
<a name="sms-vpc-permissions-s3"></a>

The following sections provide details about the permissions Ground Truth requires to launch labeling jobs using Amazon S3 buckets that have access restricted to your VPC and VPC endpoints. To learn how to restrict access to an Amazon S3 bucket to a VPC, see [Controlling access from VPC endpoints with bucket policies](https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies-vpc-endpoint.html) in the Amazon Simple Storage Service User Guide. To learn how to add a policy to an S3 bucket, see [Adding a bucket policy using the Amazon S3 console](https://docs.aws.amazon.com/AmazonS3/latest/userguide/add-bucket-policy.html).

**Note**  
Modifying policies on existing buckets can cause `IN_PROGRESS` Ground Truth jobs to fail. We recommend you start new jobs using a new bucket. If you want to continue using the same bucket, you can do one of the following.  
+ Wait for an `IN_PROGRESS` job to finish.
+ Terminate the job using the console or the AWS CLI.

You can restrict Amazon S3 bucket access to users in your VPC using an [AWS PrivateLink](https://aws.amazon.com/privatelink/) endpoint. For example, the following S3 bucket policy allows `GetObject` and `PutObject` access to a specific bucket, `amzn-s3-demo-bucket`, only from the listed VPC endpoints. When you modify this policy, replace the example bucket name and VPC endpoint IDs with your own resources.

**Note**  
The following policy *denies* all entities *other than* users within a VPC to perform the actions listed in `Action`. If you do not include actions in this list, they are still accessible to any entity that has access to this bucket and permission to perform those actions. For example, if a user has permission to perform `GetBucketLocation` on your Amazon S3 bucket, the policy below does not restrict the user from performing this action outside of your VPC.

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Id": "Policy1415115909152",
    "Statement": [
        {
            "Sid": "AccessToSpecificVPCEOnly",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Effect": "Deny",
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:sourceVpce": [
                        "vpce-12345678",
                        "vpce-12345678901234567"
                    ]
                }
            }
        }
    ]
}
```

------

Ground Truth must be able to perform the following Amazon S3 actions on the S3 buckets you use to configure the labeling job.

```
"s3:AbortMultipartUpload",
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
"s3:GetBucketLocation"
```

You can do this by adding a Ground Truth endpoint to the bucket policy like the one previously mentioned. The following table includes Ground Truth service endpoints for each AWS Region. Add an endpoint in the same [AWS Region](https://docs.aws.amazon.com/general/latest/gr/rande.html) you use to run your labeling job to your bucket policy.


| AWS Region | Ground Truth endpoint | 
| --- | --- | 
| us-east-2 | vpce-02569ba1c40aad0bc | 
| us-east-1 | vpce-08408e335ebf95b40 | 
| us-west-2 | vpce-0ea07aa498eb78469 | 
| ca-central-1 | vpce-0d46ea4c9ff55e1b7 | 
| eu-central-1 | vpce-0865e7194a099183d | 
| eu-west-2 | vpce-0bccd56798f4c5df0 | 
| eu-west-1 | vpce-0788e7ed8628e595d | 
| ap-south-1 | vpce-0d7fcda14e1783f11 | 
| ap-southeast-2 | vpce-0b7609e6f305a77d4 | 
| ap-southeast-1 | vpce-0e7e67b32e9efed27 | 
| ap-northeast-2 | vpce-007893f89e05f2bbf | 
| ap-northeast-1 | vpce-0247996a1a1807dbd | 

For example, the following policy restricts `GetObject` and `PutObject` actions on an Amazon S3 bucket to requests that come from one of the following:
+ Users in a VPC (`<vpc>`)
+ A VPC endpoint (`<vpc-endpoint>`)
+ A Ground Truth service endpoint (`<ground-truth-endpoint>`)

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Id": "1",
    "Statement": [
        {
            "Sid": "DenyAccessFromNonGTandCustomerVPC",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::bucket-name",
                "arn:aws:s3:::bucket-name/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:SourceVpc": "vpc-12345678",
                    "aws:sourceVpce": [
                        "vpce-12345678",
                        "vpce-12345678"
                    ]
                }
            }
        }
    ]
}
```

------
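As an illustration, the following sketch assembles a bucket policy of this shape and looks up the Ground Truth service endpoint for the job's Region from the table above. The function name `build_vpc_restricted_bucket_policy` and the example bucket, VPC, and endpoint IDs are hypothetical.

```python
import json

# Ground Truth service endpoints, copied from the table above.
GROUND_TRUTH_ENDPOINTS = {
    "us-east-2": "vpce-02569ba1c40aad0bc",
    "us-east-1": "vpce-08408e335ebf95b40",
    "us-west-2": "vpce-0ea07aa498eb78469",
    "ca-central-1": "vpce-0d46ea4c9ff55e1b7",
    "eu-central-1": "vpce-0865e7194a099183d",
    "eu-west-2": "vpce-0bccd56798f4c5df0",
    "eu-west-1": "vpce-0788e7ed8628e595d",
    "ap-south-1": "vpce-0d7fcda14e1783f11",
    "ap-southeast-2": "vpce-0b7609e6f305a77d4",
    "ap-southeast-1": "vpce-0e7e67b32e9efed27",
    "ap-northeast-2": "vpce-007893f89e05f2bbf",
    "ap-northeast-1": "vpce-0247996a1a1807dbd",
}


def build_vpc_restricted_bucket_policy(bucket, vpc_id, vpc_endpoint_id, region):
    """Deny GetObject/PutObject unless the request comes from the VPC or a listed endpoint."""
    ground_truth_endpoint = GROUND_TRUTH_ENDPOINTS[region]  # KeyError if the Region is not in the table
    return {
        "Version": "2012-10-17",
        "Id": "1",
        "Statement": [
            {
                "Sid": "DenyAccessFromNonGTandCustomerVPC",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {
                    "StringNotEquals": {
                        "aws:SourceVpc": vpc_id,
                        "aws:sourceVpce": [vpc_endpoint_id, ground_truth_endpoint],
                    }
                },
            }
        ],
    }


# Placeholder resource names, for illustration only.
bucket_policy = build_vpc_restricted_bucket_policy(
    "bucket-name", "vpc-12345678", "vpce-12345678", "us-east-1"
)
print(json.dumps(bucket_policy, indent=4))
```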

If you want a user to have permission to launch a labeling job using the Ground Truth console, you must also add the user's ARN to the bucket policy using the `aws:PrincipalArn` condition. This user must also have permission to perform the following Amazon S3 actions on the bucket you use to launch the labeling job.

```
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
"s3:GetBucketCors",
"s3:PutBucketCors",
"s3:ListAllMyBuckets"
```

The following code is an example of a bucket policy that restricts permission to perform the actions listed in `Action` on the S3 bucket `<bucket-name>` to the following.
+ *<role-name>*
+ The VPC endpoints listed in `aws:sourceVpce`
+ Users within the VPC named *<vpc>*

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Id": "1",
    "Statement": [
        {
            "Sid": "DenyAccessFromNonGTandCustomerVPC",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::bucket-name/*",
                "arn:aws:s3:::bucket-name"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:SourceVpc": "vpc-12345678",
                    "aws:PrincipalArn": "arn:aws:iam::111122223333:role/role-name",
                    "aws:sourceVpce": [
                        "vpce-12345678",
                        "vpce-12345678"
                    ]
                }
            }
        }
    ]
}
```

------

**Note**  
The Amazon VPC interface endpoints and the protected Amazon S3 buckets you use for input and output data must be located in the same AWS Region that you use to create the labeling job.

After you have granted Ground Truth permission to access your Amazon S3 buckets, you can use one of the topics in [Create a Labeling Job](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job.html) to launch a labeling job. Specify the VPC-restricted Amazon S3 buckets for your input and output data buckets.

## Create an Automated Data Labeling Job in a VPC
<a name="sms-vpc-permissions-automated-labeling"></a>

To create an automated data labeling job using an Amazon VPC, you provide a VPC configuration using the Ground Truth console or the `CreateLabelingJob` API operation. SageMaker AI uses the subnets and security groups you provide to launch the training and inference jobs used for automated labeling. 

**Important**  
Before you launch an automated data labeling job with a VPC configuration, make sure you have created an Amazon S3 VPC endpoint using the VPC you want to use for the labeling job. To learn how, see [Create an Amazon S3 VPC Endpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/train-vpc.html#train-vpc-s3).  
Additionally, if you create an automated data labeling job using a VPC-restricted Amazon S3 bucket, you must follow the instructions in [Allow Ground Truth to Access VPC Restricted Amazon S3 Buckets](#sms-vpc-permissions-s3) to give Ground Truth permission to access the bucket.

Use the following procedures to learn how to add a VPC configuration to your labeling job request.

**Add a VPC configuration to an automated data labeling job (console):**

1. Follow the instructions in [Create a Labeling Job (Console)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job-console.html) and complete each step in the procedure, up to step 15.

1. In the **Workers** section, select the checkbox next to **Enable automated data labeling**.

1. Maximize the **VPC configuration** section of the console by selecting the arrow.

1. Specify the **Virtual private cloud (VPC)** that you want to use for your automated data labeling job.

1. Choose the dropdown list under **Subnets** and select one or more subnets.

1. Choose the dropdown list under **Security groups** and select one or more groups.

1. Complete all remaining steps of the procedure in [Create a Labeling Job (Console)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job-console.html).

**Add a VPC configuration to an automated data labeling job (API):**  
To configure a labeling job using the Ground Truth API operation, `CreateLabelingJob`, follow the instructions in [Create an Automated Data Labeling Job (API)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html#sms-create-automated-labeling-api) to configure your request. In addition to the parameters described in this documentation, you must include a `VpcConfig` parameter in `LabelingJobResourceConfig` to specify one or more subnets and security groups using the following schema.

```
"LabelingJobAlgorithmsConfig": { 
      "InitialActiveLearningModelArn": "string",
      "LabelingJobAlgorithmSpecificationArn": "string",
      "LabelingJobResourceConfig": { 
         "VolumeKmsKeyId": "string",
         "VpcConfig": { 
            "SecurityGroupIds": [ "string" ],
            "Subnets": [ "string" ]
         }
      }
}
```
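The schema above can be assembled with a small helper before it is passed to `create_labeling_job`. This is a minimal sketch: the function name `build_algorithms_config` is hypothetical, and the check for at least one subnet and one security group reflects the requirement stated above rather than the full set of API constraints.

```python
def build_algorithms_config(algorithm_spec_arn, subnets, security_group_ids,
                            volume_kms_key_id=None):
    """Build a LabelingJobAlgorithmsConfig dict that includes a VpcConfig."""
    if not subnets or not security_group_ids:
        raise ValueError("Specify at least one subnet and one security group.")
    resource_config = {
        "VpcConfig": {
            "SecurityGroupIds": list(security_group_ids),
            "Subnets": list(subnets),
        }
    }
    if volume_kms_key_id:  # optional storage volume encryption key
        resource_config["VolumeKmsKeyId"] = volume_kms_key_id
    return {
        "LabelingJobAlgorithmSpecificationArn": algorithm_spec_arn,
        "LabelingJobResourceConfig": resource_config,
    }


# Placeholder identifiers, for illustration only.
algorithms_config = build_algorithms_config(
    "arn:aws:sagemaker:us-east-1:027400017018:labeling-job-algorithm-specification/tasktype",
    subnets=["subnet-e0123456"],
    security_group_ids=["sg-01233456789"],
)
```

The resulting dictionary would be passed as the `LabelingJobAlgorithmsConfig` argument of the request.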

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create an automated data labeling job in the US East (N. Virginia) Region using a private workforce. Replace all placeholder values with your labeling job resources and specifications. To learn more about the `CreateLabelingJob` operation, see the [Create a Labeling Job (API)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job-api.html) tutorial and [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) API documentation.

```
import boto3

# Create the client in the Region where the labeling job runs.
client = boto3.client(service_name='sagemaker', region_name='us-east-1')

response = client.create_labeling_job(
    LabelingJobName="example-labeling-job",
    LabelAttributeName="label",
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': "s3://bucket/path/manifest-with-input-data.json"
            }
        }
    },
    # Automated data labeling settings, including the VPC configuration.
    LabelingJobAlgorithmsConfig={
        'LabelingJobAlgorithmSpecificationArn': "arn:aws:sagemaker:us-east-1:027400017018:labeling-job-algorithm-specification/tasktype",
        'LabelingJobResourceConfig': {
            'VpcConfig': {
                'SecurityGroupIds': ["sg-01233456789", "sg-987654321"],
                'Subnets': ["subnet-e0123456", "subnet-e7891011"]
            }
        }
    },
    OutputConfig={
        'S3OutputPath': "s3://bucket/path/file-to-store-output-data",
        'KmsKeyId': "string"
    },
    RoleArn="arn:aws:iam::*:role/*",
    LabelCategoryConfigS3Uri="s3://bucket/path/label-categories.json",
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        # Percentage of the input dataset; valid values are 1 through 100.
        'MaxPercentageOfInputDatasetLabeled': 100
    },
    HumanTaskConfig={
        'WorkteamArn': "arn:aws:sagemaker:region:*:workteam/private-crowd/*",
        'UiConfig': {
            'UiTemplateS3Uri': "s3://bucket/path/custom-worker-task-template.html"
        },
        'PreHumanTaskLambdaArn': "arn:aws:lambda:us-east-1:432418664414:function:PRE-tasktype",
        'TaskKeywords': [
            "Images",
            "Classification",
            "Multi-label"
        ],
        'TaskTitle': "Add task title here",
        'TaskDescription': "Add description of task here for workers",
        'NumberOfHumanWorkersPerDataObject': 1,
        'TaskTimeLimitInSeconds': 3600,
        'TaskAvailabilityLifetimeInSeconds': 21600,
        'MaxConcurrentTaskCount': 1000,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': "arn:aws:lambda:us-east-1:432418664414:function:ACS-tasktype"
        }
    },
    Tags=[
        {
            'Key': "string",
            'Value': "string"
        },
    ]
)
```

# Use Amazon VPC Mode from a Private Worker Portal
<a name="samurai-vpc-worker-portal"></a>

To restrict worker portal access to labelers working inside of your Amazon VPC, you can add a VPC configuration when you create a Ground Truth private workforce. You can also add a VPC configuration to an existing private workforce. Ground Truth automatically creates VPC interface endpoints in your VPC and sets up AWS PrivateLink between your VPC endpoint and the Ground Truth services. The worker portal URL associated with the workforce can then be accessed from your VPC. The worker portal URL also remains accessible from the public internet until you restrict public internet access. When you delete the workforce or remove the VPC configuration from your workforce, Ground Truth automatically deletes the VPC endpoints associated with the workforce.

**Note**  
A workforce supports only one VPC.

[Point Cloud](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud.html) and [video](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-video.html) tasks do not support loading through a VPC.

This guide demonstrates how to satisfy the prerequisites and how to add an Amazon VPC configuration to your workforce or remove one from it.

## Prerequisites
<a name="samurai-vpc-getting-started-prerequisites"></a>

To run a Ground Truth labeling job in Amazon VPC, review the following prerequisites.
+ You have an Amazon VPC configured that you can use. If you have not configured a VPC, follow these instructions for [creating a VPC](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html#interface-endpoint-shared-subnets).
+ Depending on how a [Worker Task Template](https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-instructions-overview.html) is written, labeling data stored in an Amazon S3 bucket may be accessed directly from Amazon S3 during labeling tasks. In these cases, the VPC network must be configured to allow traffic from the device used by the human labeler to the S3 bucket containing labeling data.
+ Follow [View and update DNS attributes for your VPC](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html#vpc-dns-updating) to enable DNS hostnames and DNS resolution for your VPC.

**Note**  
There are two ways to configure your VPC for your workforce. You can do this through the [console](https://console.aws.amazon.com/sagemaker) or the [AWS CLI](https://aws.amazon.com/cli/).

# Using the SageMaker AI console to manage a VPC config
<a name="samurai-vpc-workforce-console"></a>

You can use the [SageMaker AI console](https://console.aws.amazon.com/sagemaker) to add or remove a VPC configuration. You can also delete an existing workforce.

## Adding a VPC configuration to your workforce
<a name="samurai-add-vpc-workforce"></a>

### Create a private workforce
<a name="samurai-vpc-create-workforce"></a>
+ [Create a private workforce using Amazon Cognito](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-private-use-cognito.html)
+ [Create a private workforce using OpenID Connect (OIDC) Identity Provider (IdP)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-private-use-oidc.html)

After you have created your private workforce, add a VPC configuration to it.

1. Navigate to the [SageMaker AI console](https://console.aws.amazon.com/sagemaker).

1. Select **Labeling workforces** in the left panel.

1. Select **Private** to access your private workforce. After your **Workforce status** is **Active**, select **Add** next to **VPC**.

1. When you are prompted to configure your VPC, provide the following:

   1. Your **VPC**

   1. **Subnets**

      1. Ensure that your VPC has an existing subnet.

   1. **Security groups**

      1. You cannot select more than 5 security groups.

   1. After filling in this information, choose **Confirm**.

1. After you choose **Confirm**, you are redirected back to the **Private** page under **Labeling workforces**. You should see a green banner at the top that reads **Your private workforce update with VPC configuration was successfully initialized.** The workforce status is **Updating**. Next to the **Delete workforce** button is the **Refresh** button, which can be used to retrieve the latest **Workforce status**. After the workforce status has changed to **Active**, the VPC endpoint ID is updated as well.

## Removing a VPC configuration from your workforce
<a name="samurai-remove-vpc-workforce"></a>

Use the following information to remove a VPC configuration from your workforce using the console.

1. Navigate to the [SageMaker AI console](https://console.aws.amazon.com/sagemaker).

1. Select **Labeling workforces** in the left panel.

1. Find and select your workforce.

1. Under **Private workforce summary**, find **VPC** and choose **Remove** next to it.

1. Select **Remove**.

## Deleting a workforce through the console
<a name="samurai-delete-vpc-workforce"></a>

Before you delete a workforce, make sure that no work teams are associated with it. You can delete a workforce only if the workforce status is **Active** or **Failed**.

Use the following information to delete a workforce using the console.

1. Navigate to the [SageMaker AI console](https://console.aws.amazon.com/sagemaker).

1. Select **Labeling workforces** in the left panel.

1. Find and select your workforce.

1. Choose **Delete workforce**.

1. Choose **Delete**.

# Using the SageMaker AI AWS API to manage a VPC config
<a name="samurai-vpc-workforce-cli"></a>

Use the following sections to learn more about managing a VPC configuration while maintaining the right level of access to the work team.

## Create a workforce with a VPC configuration
<a name="samurai-create-vpc-cli"></a>

If the account already has a workforce, you must delete it before you can create a new one. Alternatively, you can update the existing workforce with a VPC configuration.

```
aws sagemaker create-workforce --cognito-config '{"ClientId": "app-client-id", "UserPool": "Pool_ID"}' --workforce-vpc-config \
"{\"VpcId\": \"vpc-id\", \"SecurityGroupIds\": [\"sg-0123456789abcdef0\"], \"Subnets\": [\"subnet-0123456789abcdef0\"]}" --workforce-name workforce-name
{
    "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name"
}
```

Describe the workforce and make sure the status is `Initializing`.

```
aws sagemaker describe-workforce --workforce-name workforce-name
{
    "Workforce": {
        "WorkforceName": "workforce-name",
        "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name",
        "LastUpdatedDate": 1622151252.451,
        "SourceIpConfig": {
            "Cidrs": []
        },
        "SubDomain": "subdomain.us-west-2.sagemaker.aws.com",
        "CognitoConfig": {
            "UserPool": "Pool_ID",
            "ClientId": "app-client-id"
        },
        "CreateDate": 1622151252.451,
        "WorkforceVpcConfig": {
            "VpcId": "vpc-id",
            "SecurityGroupIds": [
                "sg-0123456789abcdef0"
            ],
            "Subnets": [
                "subnet-0123456789abcdef0"
            ]
        },
        "Status": "Initializing"
    }
}
```

Navigate to the Amazon VPC console. Select **Endpoints** from the left panel. There should be two VPC endpoints created in your account.
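
Hand-escaping the JSON passed to `--workforce-vpc-config` is error-prone. As a minimal sketch, the arguments for the `create-workforce` command above could be generated with Python's standard `json` module. The helper name `build_create_workforce_args` is hypothetical, and the IDs are the same placeholders used above. The function only assembles an argument list; it does not call AWS.

```python
import json
import shlex


def build_create_workforce_args(workforce_name, client_id, user_pool,
                                vpc_id, security_group_ids, subnets):
    """Assemble (but do not run) an `aws sagemaker create-workforce` command.

    Hypothetical helper: json.dumps produces valid JSON for the two
    structured options, so no manual quote escaping is needed.
    """
    cognito_config = {"ClientId": client_id, "UserPool": user_pool}
    vpc_config = {
        "VpcId": vpc_id,
        "SecurityGroupIds": list(security_group_ids),
        "Subnets": list(subnets),
    }
    return [
        "aws", "sagemaker", "create-workforce",
        "--workforce-name", workforce_name,
        "--cognito-config", json.dumps(cognito_config),
        "--workforce-vpc-config", json.dumps(vpc_config),
    ]


args = build_create_workforce_args(
    "workforce-name", "app-client-id", "Pool_ID",
    "vpc-id", ["sg-0123456789abcdef0"], ["subnet-0123456789abcdef0"],
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(args))
```

Passing the list directly to `subprocess.run` would avoid shell quoting altogether; printing it first makes the command easy to review.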

## Adding a VPC configuration to your workforce
<a name="samurai-add-vpc-cli"></a>

Update a non-VPC private workforce with a VPC configuration using the following command.

```
aws sagemaker update-workforce --workforce-name workforce-name \
--workforce-vpc-config "{\"VpcId\": \"vpc-id\", \"SecurityGroupIds\": [\"sg-0123456789abcdef0\"], \"Subnets\": [\"subnet-0123456789abcdef0\"]}"
```

Describe the workforce and make sure the status is `Updating`.

```
aws sagemaker describe-workforce --workforce-name workforce-name
{
    "Workforce": {
        "WorkforceName": "workforce-name",
        "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name",
        "LastUpdatedDate": 1622151252.451,
        "SourceIpConfig": {
            "Cidrs": []
        },
        "SubDomain": "subdomain.us-west-2.sagemaker.aws.com",
        "CognitoConfig": {
            "UserPool": "Pool_ID",
            "ClientId": "app-client-id"
        },
        "CreateDate": 1622151252.451,
        "WorkforceVpcConfig": {
            "VpcId": "vpc-id",
            "SecurityGroupIds": [
                "sg-0123456789abcdef0"
            ],
            "Subnets": [
                "subnet-0123456789abcdef0"
            ]
        },
        "Status": "Updating"
    }
}
```

Navigate to your Amazon VPC console. Select **Endpoints** from the left panel. There should be two VPC endpoints created in your account.

## Removing a VPC configuration from your workforce
<a name="samurai-remove-vpc-cli"></a>

Update a VPC private workforce with an empty VPC configuration to remove VPC resources.

```
aws sagemaker update-workforce --workforce-name workforce-name \
--workforce-vpc-config "{}"
```

Describe the workforce and make sure the status is `Updating`.

```
aws sagemaker describe-workforce --workforce-name workforce-name
{
    "Workforce": {
        "WorkforceName": "workforce-name",
        "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name",
        "LastUpdatedDate": 1622151252.451,
        "SourceIpConfig": {
            "Cidrs": []
        },
        "SubDomain": "subdomain.us-west-2.sagemaker.aws.com",
        "CognitoConfig": {
            "UserPool": "Pool_ID",
            "ClientId": "app-client-id"
        },
        "CreateDate": 1622151252.451,
        "Status": "Updating"
    }
}
```

Navigate to your Amazon VPC console. Select **Endpoints** from the left panel. The two VPC endpoints should be deleted.
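
The describe-and-check steps in the preceding sections can be scripted. The following sketch assumes that the JSON printed by `describe-workforce` has been captured as a string. It reports the status and whether a `WorkforceVpcConfig` is still attached. The function name `summarize_workforce` is hypothetical, and the sample document is an abbreviated copy of the output shown above.

```python
import json


def summarize_workforce(describe_output):
    """Return (status, has_vpc_config) from `describe-workforce` JSON text."""
    workforce = json.loads(describe_output)["Workforce"]
    # In the sample output above, WorkforceVpcConfig is absent after the
    # VPC configuration has been removed.
    return workforce.get("Status"), "WorkforceVpcConfig" in workforce


sample = """
{
    "Workforce": {
        "WorkforceName": "workforce-name",
        "Status": "Updating"
    }
}
"""
status, has_vpc = summarize_workforce(sample)
print(status, has_vpc)
```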

## Restrict public access to the worker portal while maintaining access through a VPC
<a name="public-access-vpc"></a>

Workers in a VPC or non-VPC worker portal can see the labeling job tasks assigned to them. Tasks are assigned by adding workers to a work team through OIDC groups. You are responsible for restricting access to your public worker portal by setting the `sourceIpConfig` in your workforce.

**Note**  
You can restrict access to the worker portal only through the SageMaker API. This cannot be done through the console.

Use the following command to restrict public access to the worker portal.

```
aws sagemaker update-workforce --region us-west-2 \
--workforce-name workforce-demo --source-ip-config '{"Cidrs":["10.0.0.0/16"]}'
```

After the `sourceIpConfig` is set on the workforce, workers can access the worker portal through the VPC but not through the public internet.

**Note**  
You cannot set the `sourceIP` restriction for the worker portal in a VPC.
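
To illustrate what a CIDR allow list means in practice, the following sketch uses Python's standard `ipaddress` module to test whether a source address falls inside any block of a `Cidrs` list. It only illustrates CIDR matching and is not a description of how SageMaker AI evaluates requests. The function name `ip_allowed` is hypothetical.

```python
import ipaddress


def ip_allowed(source_ip, cidrs):
    """Return True if source_ip falls inside at least one CIDR block."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)


cidrs = ["10.0.0.0/16"]  # same value as the update-workforce example above
print(ip_allowed("10.0.42.7", cidrs))    # address inside 10.0.0.0/16
print(ip_allowed("203.0.113.9", cidrs))  # address outside the block
```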

# Output Data and Storage Volume Encryption
<a name="sms-security"></a>

With Amazon SageMaker Ground Truth, you can label highly sensitive data, stay in control of your data, and employ security best practices. While your labeling job is running, Ground Truth encrypts data in transit and at rest. Additionally, you can use AWS Key Management Service (AWS KMS) with Ground Truth to do the following:
+ Use a [customer managed key](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#master_keys) to encrypt your output data. 
+ Use an AWS KMS customer managed key with your automated data labeling job to encrypt the storage volume attached to the compute instance used for model training and inference. 

Use the topics on this page to learn more about these Ground Truth security features.

## Use Your KMS Key to Encrypt Output Data
<a name="sms-security-kms-output-data"></a>

Optionally, you can provide an AWS KMS customer managed key when you create a labeling job, which Ground Truth uses to encrypt your output data. 

If you don't provide a customer managed key, Amazon SageMaker AI uses the default AWS managed key for Amazon S3 for your role's account to encrypt your output data.

If you provide a customer managed key, you must add the required permissions to the key described in [Encrypt Output Data and Storage Volume with AWS KMS](sms-security-kms-permissions.md). When you use the API operation `CreateLabelingJob`, you can specify your customer managed key ID using the parameter `[KmsKeyId](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobOutputConfig.html#sagemaker-Type-LabelingJobOutputConfig-KmsKeyId)`. See the following procedure to learn how to add a customer managed key when you create a labeling job using the console.

**To add an AWS KMS key to encrypt output data (console):**

1. Complete the first 7 steps in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md).

1. In step 8, select the arrow next to **Additional configuration** to expand this section.

1. For **Encryption key**, select the AWS KMS key that you want to use to encrypt output data.

1. Complete the rest of the steps in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to create a labeling job.

## Use Your KMS Key to Encrypt Automated Data Labeling Storage Volume (API Only)
<a name="sms-security-kms-storage-volume"></a>

When you create a labeling job with automated data labeling using the `CreateLabelingJob` API operation, you have the option to encrypt the storage volume attached to the ML compute instances that run the training and inference jobs. To add encryption to your storage volume, use the parameter `VolumeKmsKeyId` to input an AWS KMS customer managed key. For more information about this parameter, see `[LabelingJobResourceConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobResourceConfig.html#sagemaker-Type-LabelingJobResourceConfig-VolumeKmsKeyId)`.

If you specify a key ID or ARN for `VolumeKmsKeyId`, your SageMaker AI execution role must include permissions to call `kms:CreateGrant`. To learn how to add this permission to an execution role, see [Create a SageMaker AI Execution Role for a Ground Truth Labeling Job](sms-security-permission-execution-role.md).

**Note**  
If you specify an AWS KMS customer managed key when you create a labeling job in the console, that key is *only* used to encrypt your output data. It is not used to encrypt the storage volume attached to the ML compute instances used for automated data labeling.
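
As a rough illustration of where the two keys go in a `CreateLabelingJob` request, the sketch below builds only the encryption-related fragments as a Python dictionary. The nesting of `VolumeKmsKeyId` under `LabelingJobAlgorithmsConfig` and `LabelingJobResourceConfig` is an assumption based on the API reference pages linked above, so verify it there before use. The bucket path, key IDs, and algorithm ARN are placeholders, the helper name is hypothetical, and a real request needs other required fields that are omitted here.

```python
import json


def encryption_fragment(s3_output_path, output_kms_key_id,
                        algorithm_spec_arn, volume_kms_key_id):
    """Build only the encryption-related parts of a CreateLabelingJob request.

    Hypothetical helper; other required request fields are omitted.
    """
    return {
        "OutputConfig": {
            "S3OutputPath": s3_output_path,
            "KmsKeyId": output_kms_key_id,  # encrypts output data
        },
        "LabelingJobAlgorithmsConfig": {
            "LabelingJobAlgorithmSpecificationArn": algorithm_spec_arn,
            "LabelingJobResourceConfig": {
                # encrypts the storage volume used for automated data labeling
                "VolumeKmsKeyId": volume_kms_key_id,
            },
        },
    }


fragment = encryption_fragment(
    "s3://amzn-s3-demo-bucket/output/",  # placeholder bucket
    "output-key-id",                     # placeholder key ID
    "algorithm-specification-arn",       # placeholder ARN
    "volume-key-id",                     # placeholder key ID
)
print(json.dumps(fragment, indent=4))
```

Keeping the two keys as separate arguments mirrors the note above: the output key and the volume key are configured independently.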

# Workforce Authentication and Restrictions
<a name="sms-security-workforce-authentication"></a>

Ground Truth enables you to use your own private workforce to work on labeling jobs. A *private workforce* is an abstract concept which refers to a set of people who work for you. Each labeling job is created using a work team, composed of workers in your workforce. Ground Truth supports private workforce creation using Amazon Cognito. 

A Ground Truth workforce maps to an Amazon Cognito user pool. A Ground Truth work team maps to an Amazon Cognito user group. Amazon Cognito manages the worker authentication. Amazon Cognito supports OpenID Connect (OIDC), and customers can set up Amazon Cognito federation with their own identity provider (IdP). 

Ground Truth only allows one workforce per account per AWS Region. Each workforce has a dedicated Ground Truth work portal login URL. 

You can also restrict workers to a Classless Inter-Domain Routing (CIDR) block/IP address range. This means annotators must be on a specific network to access the annotation site. You can add up to ten CIDR blocks for one workforce. To learn more, see [Private workforce management using the Amazon SageMaker API](sms-workforce-management-private-api.md).
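
A quick local check before calling the API can catch malformed input. The sketch below enforces the ten-block limit stated above and verifies that each entry parses as a CIDR block with Python's standard `ipaddress` module. The function name `validate_cidrs` is hypothetical, and the service may apply additional rules that are not modeled here.

```python
import ipaddress

MAX_CIDRS = 10  # limit stated in the text above


def validate_cidrs(cidrs):
    """Return the list unchanged if valid, otherwise raise ValueError."""
    if len(cidrs) > MAX_CIDRS:
        raise ValueError(
            f"at most {MAX_CIDRS} CIDR blocks are allowed, got {len(cidrs)}")
    for cidr in cidrs:
        # ip_network raises ValueError for malformed blocks or set host bits.
        ipaddress.ip_network(cidr)
    return cidrs


print(validate_cidrs(["10.0.0.0/16", "192.0.2.0/24"]))
```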

To learn how you can create a private workforce, see [Create a Private Workforce (Amazon Cognito)](sms-workforce-create-private.md).

## Restrict Access to Workforce Types
<a name="sms-security-permission-condition-keys"></a>

Amazon SageMaker Ground Truth work teams fall into one of three [workforce types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management.html): public (with Amazon Mechanical Turk), private, and vendor. To restrict user access to a specific work team using one of these types or the work team ARN, use the `sagemaker:WorkteamType` and/or the `sagemaker:WorkteamArn` condition keys. For the `sagemaker:WorkteamType` condition key, use [string condition operators](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_condition_operators.html#Conditions_String). For the `sagemaker:WorkteamArn` condition key, use [Amazon Resource Name (ARN) condition operators](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_condition_operators.html#Conditions_ARN). If the user attempts to create a labeling job with a restricted work team, SageMaker AI returns an access denied error. 

The policies below demonstrate different ways to use the `sagemaker:WorkteamType` and `sagemaker:WorkteamArn` condition keys with appropriate condition operators and valid condition values. 

The following example uses the `sagemaker:WorkteamType` condition key with the `StringEquals` condition operator to restrict access to a public work team. It accepts condition values in the following format: `workforcetype-crowd`, where *workforcetype* can equal `public`, `private`, or `vendor`.

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictWorkteamType",
            "Effect": "Deny",
            "Action": "sagemaker:CreateLabelingJob",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "sagemaker:WorkteamType": "public-crowd"
                }
            }
        }
    ]
}
```

------

The following policies show how to restrict access to a public work team using the `sagemaker:WorkteamArn` condition key. The first uses the `ArnLike` condition operator with a wildcard pattern that matches the work team ARN. The second uses the `ArnEquals` condition operator with the full work team ARN.

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictWorkteamType",
            "Effect": "Deny",
            "Action": "sagemaker:CreateLabelingJob",
            "Resource": "*",
            "Condition": {
                "ArnLike": {
                    "sagemaker:WorkteamArn": "arn:aws:sagemaker:*:*:workteam/public-crowd/*"
                }
            }
        }
    ]
}
```

------

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictWorkteamType",
            "Effect": "Deny",
            "Action": "sagemaker:CreateLabelingJob",
            "Resource": "*",
            "Condition": {
                "ArnEquals": {
                    "sagemaker:WorkteamArn": "arn:aws:sagemaker:us-west-2:394669845002:workteam/public-crowd/default"
                }
            }
        }
    ]
}
```

------
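
Because the three policies above differ only in the condition operator, key, and value, they can be generated from one template. The following sketch shows one way to do that in Python. The helper name `deny_labeling_job_policy` is hypothetical, and the output should be reviewed before it is attached to any user or role.

```python
import json


def deny_labeling_job_policy(operator, condition_key, value):
    """Build a Deny policy for sagemaker:CreateLabelingJob with one condition."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RestrictWorkteamType",
                "Effect": "Deny",
                "Action": "sagemaker:CreateLabelingJob",
                "Resource": "*",
                "Condition": {operator: {condition_key: value}},
            }
        ],
    }


by_type = deny_labeling_job_policy(
    "StringEquals", "sagemaker:WorkteamType", "public-crowd")
by_arn = deny_labeling_job_policy(
    "ArnLike", "sagemaker:WorkteamArn",
    "arn:aws:sagemaker:*:*:workteam/public-crowd/*")
print(json.dumps(by_type, indent=4))
```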