

# Create an Amazon S3 transfer task
<a name="create-an-amazon-s3-transfer-task"></a>

 The Guidance allows you to create an Amazon S3 transfer task in the following ways: 
+  [using the web console](#using-the-web-console) 
+  [using the S3 plugin](#using-the-s3-plugin) 
+  [using AWS CLI](#using-aws-cli-to-create-s3-transfer-task) 

 You can make your choice according to your needs. 
+  The web console provides an intuitive user interface where you can start, clone, or stop a data transfer task with a single click. The frontend also provides metric monitoring and a logging view, so you do not need to switch between different pages. 
+  The S3 plugin is a standalone CloudFormation template that you can easily integrate into your workflows. Because this option allows deployment without the frontend, it is useful if you want to deploy in AWS China Regions but do not have an ICP licensed domain. 
+  The AWS CLI lets you quickly start data transfer tasks. Select this option if you want to use Data Transfer Hub in your automation scripts. 

## Using the web console
<a name="using-the-web-console"></a>

 You can use the web console to create an Amazon S3 transfer task. For more information about how to launch the web console, see [Deploy the Guidance](deploy-the-solution.md#deployment-overview). 

1.  From the **Create Transfer Task** page, select **Start a New Task**, and then select **Next**. 

1.  From the **Engine options** page, under engine, select **Amazon S3**, and then choose **Next Step**. 

1.  Specify the transfer task details. 
   +  Under **Source Type**, select the data source, for example, **Amazon S3**. 

1.  Enter the bucket name, and choose whether to sync the full bucket, objects with a specific prefix, or objects with different prefixes. 
   +  If the data source bucket is in the account where Data Transfer Hub was deployed, select **Yes**. 
     +  For real-time incremental data synchronization, choose whether to enable S3 event notification. This option is only available when Data Transfer Hub and your data source are deployed in the same Region of the same account. 
     +  If you do not enable S3 event notification, Data Transfer Hub periodically synchronizes incremental data according to the scheduling frequency that you configure later. 
   +  If the source bucket is not in the account where Data Transfer Hub was deployed, select **No**, and then specify the credentials for the source bucket. 
   +  To synchronize objects with multiple prefixes, upload a prefix list file with one prefix per line to the root directory of the bucket where Data Transfer Hub is deployed, and then enter the name of the file. For details, refer to [Multi-Prefix List Configuration Tutorial](https://github.com/awslabs/data-transfer-hub/blob/main/docs/USING_PREFIX_LIST.md). 

1.  To create the credential, open the AWS Secrets Manager console for the current Region at [https://console.aws.amazon.com/secretsmanager/home](https://console.aws.amazon.com/secretsmanager/home). 

   1.  From the left menu, select **Secrets**, choose **Store a new secret**, and then select **Other type of secrets** as the secret type. 

   1.  In the **Plaintext** input box, enter the `access_key_id` and `secret_access_key` values in the following format. For more information, refer to [IAM features](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) in the *IAM User Guide*. Choose **Next**. 

      ```
      {
          "access_key_id": "<Your Access Key ID>",
          "secret_access_key": "<Your Access Key Secret>"
      }
      ```

   1.  (Optional) Enter the secret name and description. Choose **Next**. 

   1.  Under the automatic rotation configuration, select **Disable automatic rotation**. Choose **Next**. 

   1. Keep the default values and choose **Save** to finish creating the secret. 

   1. Return to the Data Transfer Hub task creation page and refresh it. Your new secret appears in the drop-down list. 

   1. Select the secret. 

1.  Provide destination settings for the S3 buckets. 
**Note**  
If the source S3 bucket is in the same account where Data Transfer Hub was deployed, then in **destination settings**, you must create or provide credential information for the S3 destination bucket. Otherwise, no credential information is needed. Use the following steps to update the destination settings. 

1.  From **Engine settings**, verify the values and modify them if necessary. We recommend setting **minimum capacity** to at least 1 for incremental data transfer. 

1.  At **Task Scheduling Settings**, select your task scheduling configuration. 
   +  To compare the source and destination data at a fixed interval, select **Fixed Rate**. 
   +  To schedule the comparison of source and destination data with a [Cron Expression](https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html#CronExpressions), select **Cron Expression**. 
   +  To run the data synchronization task only once, select **One Time Transfer**. 

1.  From **Advanced Options**, keep the default values. 

1.  At **Need Data Comparison before Transfer**, select your task configuration. 
   +  To skip the data comparison process and transfer all files, select **No**. 
   +  To synchronize only the files that differ, select **Yes**. 

1.  Enter an email address in **Alarm Email**. 

1.  Choose **Next** and review your task parameter details. 

1.  Choose **Create Task**. 

 After the task is created successfully, it will appear on the **Tasks** page. 
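
For the multi-prefix option in the bucket step above, the prefix list is a plain text file with one prefix per line. The following is a minimal sketch of preparing and uploading such a file. The prefixes, the `DTH_BUCKET` placeholder, and the `DRY_RUN` switch are assumptions for illustration; the upload command is only printed unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical values for illustration; replace them with your own.
PREFIX_LIST_FILE="my_prefix_list.txt"
DTH_BUCKET="${DTH_BUCKET:-<your-dth-bucket>}"

# Write one prefix per line.
printf '%s\n' \
  "project1/case1" \
  "project2/case2" \
  "logs/2023" > "$PREFIX_LIST_FILE"

# Count the prefixes to confirm that the file looks as expected.
PREFIX_COUNT=$(wc -l < "$PREFIX_LIST_FILE" | tr -d ' ')
echo "Wrote $PREFIX_COUNT prefixes to $PREFIX_LIST_FILE"

# Upload the file to the bucket root. Printed only, unless DRY_RUN=0.
if [ "${DRY_RUN:-1}" = "0" ]; then
  aws s3 cp "$PREFIX_LIST_FILE" "s3://$DTH_BUCKET/$PREFIX_LIST_FILE"
else
  echo "aws s3 cp $PREFIX_LIST_FILE s3://$DTH_BUCKET/$PREFIX_LIST_FILE"
fi
```

After the upload, enter the file name (`my_prefix_list.txt` in this sketch) in the task creation form.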

**Note**  
If your destination bucket in Amazon S3 requires all uploads to be encrypted with Amazon S3 managed keys, use the following configuration. 

 **Destination bucket encrypted with Amazon S3 managed keys** 

 In the destination configuration, select **SSE-S3 AES256** from the **Destination bucket policy check** dropdown menu. For more information, refer to [Using server-side encryption with Amazon S3 managed keys (SSE-S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingServerSideEncryption.html). 

 If your destination bucket requires objects to be encrypted with SSE-KMS (server-side encryption with AWS Key Management Service) only, as described in [Specifying server-side encryption with AWS KMS (SSE-KMS)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-kms-encryption.html), your bucket policy might look like the following example: 

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Id": "PutObjectPolicy",
    "Statement": [
        {
            "Sid": "DenyIncorrectEncryptionHeader",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws-cn:s3:::dth-sse-debug-cn-north-1/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "aws:kms"
                }
            }
        },
        {
            "Sid": "DenyUnencryptedObjectUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws-cn:s3:::dth-sse-debug-cn-north-1/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption-aws-kms-key-id": "arn:aws-cn:kms:cn-north-1:123456789012:key/7c54749e-eb6a-42cc-894e-93143b32e7c0"
                }
            }
        }
    ]
}
```

------

In this case, select **SSE-KMS** from the **Destination bucket policy check** dropdown menu in the destination configuration, and provide the KMS key ID, which is `7c54749e-eb6a-42cc-894e-93143b32e7c0` in this example.
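
The bucket policy stores the full key ARN, while the console field expects only the key ID, which is the part after `key/`. As a small sketch, the ID can be extracted from the ARN with shell parameter expansion. The variable names are chosen for illustration only.

```shell
#!/bin/sh
# Key ARN as it appears in the example bucket policy.
KMS_KEY_ARN="arn:aws-cn:kms:cn-north-1:123456789012:key/7c54749e-eb6a-42cc-894e-93143b32e7c0"

# Remove everything up to and including the last "/" to keep only the key ID.
KMS_KEY_ID="${KMS_KEY_ARN##*/}"

echo "$KMS_KEY_ID"   # prints 7c54749e-eb6a-42cc-894e-93143b32e7c0
```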

## Using the S3 plugin
<a name="using-the-s3-plugin"></a>

**Note**  
This tutorial covers the backend-only version. For more details, refer to [S3 Plugin Introduction](https://github.com/awslabs/data-transfer-hub/blob/main/docs/S3_PLUGIN.md). 

 **Step 1. Prepare VPC** 

 This Guidance can be deployed in both public and private subnets. Using public subnets is recommended. 
+  To use an existing VPC, make sure that the VPC has at least two subnets and that both subnets have public internet access (either public subnets with an internet gateway or private subnets with a NAT gateway). 
+  To create a new default VPC for this Guidance, go to Step 2 and make sure that **Create a new VPC for this cluster** is selected when you create the cluster. 

 **Step 2. Configure credentials** 

 You need to provide an `AccessKeyID` and `SecretAccessKey` (AK/SK) to read from or write to an S3 bucket in another AWS account or a bucket in another cloud storage service. The credential is stored in AWS Secrets Manager. You do not need to create a credential for a bucket in the account where you are deploying the Guidance. 

1. Go to AWS Management Console > Secrets Manager. 

1. From Secrets Manager home page, choose **Store a new secret**. 

1. For secret type, select **Other type of secrets**.

1. For key/value pairs, copy and paste the following JSON text into the **Plaintext** section, and replace the values with your AK/SK. 

   ```
   {
     "access_key_id": "<Your Access Key ID>",
     "secret_access_key": "<Your Access Key Secret>"
   }
   ```  
![\[AWS Secrets Manager interface for storing a new secret, with options for secret type and key/value pairs.\]](http://docs.aws.amazon.com/solutions/latest/data-transfer-hub/images/secret.png)

1. Choose **Next** to specify a secret name, and choose **Create**. 

 If the AK/SK is for the source bucket, READ access to the bucket is required; if it is for the destination bucket, READ and WRITE access to the bucket is required. For Amazon S3, refer to [Set up credentials for Amazon S3](set-up-credentials-for-amazon-s3.md) for more information. 
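
If you prefer the command line to the console, the same secret can be created with the AWS CLI. The following is a minimal sketch, not part of the Guidance itself: the secret name, the file name, and the `DRY_RUN` switch are assumptions for illustration, and the `aws secretsmanager create-secret` command is only printed unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical secret name and file name; replace them with your own.
SECRET_NAME="dth-src-credentials"
SECRET_FILE="dth-secret.json"

# Restrict permissions so that only the current user can read the file.
umask 077
cat > "$SECRET_FILE" <<'EOF'
{
  "access_key_id": "<Your Access Key ID>",
  "secret_access_key": "<Your Access Key Secret>"
}
EOF

# Create the secret from the file. Printed only, unless DRY_RUN=0.
if [ "${DRY_RUN:-1}" = "0" ]; then
  aws secretsmanager create-secret \
    --name "$SECRET_NAME" \
    --secret-string "file://$SECRET_FILE"
  rm -f "$SECRET_FILE"   # do not leave the keys on disk
else
  echo "aws secretsmanager create-secret --name $SECRET_NAME --secret-string file://$SECRET_FILE"
fi
```

Replace the placeholder values in the JSON with your AK/SK before running the command for real, and use the secret name as the credentials parameter when you launch the stack.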

 **Step 3. Launch AWS CloudFormation Stack** 

 Follow these steps to deploy this Guidance with AWS CloudFormation. 

1.  Sign in to the AWS Management Console, and switch to the Region where you want to deploy the CloudFormation stack. 

1.  Choose one of the following links to launch the CloudFormation stack. 
   +  For AWS China Regions 

      [https://console.amazonaws.cn/cloudformation/home#/stacks/create/template?stackName=DTHS3Stack&templateURL=https:%2F%2Fsolutions-reference.s3.amazonaws.com/data-transfer-hub/latest/DataTransferS3Stack.template](https://console.amazonaws.cn/cloudformation/home#/stacks/create/template?stackName=DTHS3Stack&templateURL=https:%2F%2Fsolutions-reference.s3.amazonaws.com/data-transfer-hub/latest/DataTransferS3Stack.template) 
   +  For AWS Global Regions 

      [https://console.aws.amazon.com/cloudformation/home#/stacks/create/template?stackName=DTHS3Stack&templateURL=https://solutions-reference.s3.amazonaws.com/data-transfer-hub/latest/DataTransferS3Stack.template](https://console.aws.amazon.com/cloudformation/home#/stacks/create/template?stackName=DTHS3Stack&templateURL=https://solutions-reference.s3.amazonaws.com/data-transfer-hub/latest/DataTransferS3Stack.template) 

1. Choose **Next**. Specify parameter values as needed. Change the stack name if required. 

1. Choose **Next**. Configure additional stack options such as tags if needed. 

1. Choose **Next**. Review the settings, confirm the acknowledgement, and then choose **Create Stack** to start the deployment. 

 The deployment will take approximately 3 to 5 minutes. 
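
To follow the deployment from a terminal instead of the console, one option is to wait for the stack and then print its status. This is a sketch under stated assumptions: `DTHS3Stack` is the stack name used in the launch links, while the `check_stack` helper and the `DRY_RUN` switch are hypothetical names introduced here. Nothing is sent to AWS unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Stack name from the launch links; adjust it if you changed the name.
STACK_NAME="${STACK_NAME:-DTHS3Stack}"

# Block until stack creation finishes, then print the final stack status.
check_stack() {
  aws cloudformation wait stack-create-complete --stack-name "$1" &&
  aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].StackStatus' --output text
}

if [ "${DRY_RUN:-1}" = "0" ]; then
  check_stack "$STACK_NAME"
else
  echo "Would wait for stack: $STACK_NAME"
fi
```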

## Using AWS CLI
<a name="using-aws-cli-to-create-s3-transfer-task"></a>

 You can use the [AWS CLI](https://aws.amazon.com/cli/) to create an Amazon S3 transfer task. Note that if you have deployed the Data Transfer Hub Portal at the same time, the tasks started through the CLI will not appear in the Task List on your Portal. 

1.  Create an Amazon VPC with two public subnets or two private subnets with [NAT gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html). 

1.  In the command in the next step, replace `CLOUDFORMATION_URL` with the following template URL.

   ```
   https://solutions-reference.s3.amazonaws.com/data-transfer-hub/latest/DataTransferS3Stack.template
   ```

1.  Go to your terminal and enter the following command. For the parameter details, refer to the Parameters table.

   ```
   aws cloudformation create-stack --stack-name dth-s3-task --template-url CLOUDFORMATION_URL \
   --capabilities CAPABILITY_NAMED_IAM \
   --parameters \
   ParameterKey=alarmEmail,ParameterValue=your_email@example.com \
   ParameterKey=destBucket,ParameterValue=dth-receive-cn-north-1 \
   ParameterKey=destPrefix,ParameterValue=test-prefix \
   ParameterKey=destCredentials,ParameterValue=drh-cn-secret-key \
   ParameterKey=destInCurrentAccount,ParameterValue=false \
   ParameterKey=destRegion,ParameterValue=cn-north-1 \
   ParameterKey=destStorageClass,ParameterValue=STANDARD \
   ParameterKey=destPutObjectSSEType,ParameterValue=None \
   ParameterKey=destPutObjectSSEKmsKeyId,ParameterValue= \
   ParameterKey=srcBucket,ParameterValue=dth-us-west-2 \
   ParameterKey=srcInCurrentAccount,ParameterValue=true \
   ParameterKey=srcCredentials,ParameterValue= \
   ParameterKey=srcRegion,ParameterValue=us-west-2 \
   ParameterKey=srcPrefix,ParameterValue=case1 \
   ParameterKey=srcType,ParameterValue=Amazon_S3 \
   ParameterKey=ec2VpcId,ParameterValue=vpc-040bbab85f0e4e088 \
   ParameterKey=ec2Subnets,ParameterValue=subnet-0d1bf2725ab8e94ee\\,subnet-06d17b2b3286be40e \
   ParameterKey=finderEc2Memory,ParameterValue=8 \
   ParameterKey=ec2CronExpression,ParameterValue="0/60 * * * ? *" \
   ParameterKey=includeMetadata,ParameterValue=false \
   ParameterKey=srcEvent,ParameterValue=No \
   ParameterKey=maxCapacity,ParameterValue=20 \
   ParameterKey=minCapacity,ParameterValue=1 \
   ParameterKey=desiredCapacity,ParameterValue=1
   ```


|  Parameter  |  Allowed Value  |  Default Value  |  Description  | 
| --- | --- | --- | --- | 
|  alarmEmail  |   |   |  An email to which errors will be sent  | 
|  desiredCapacity  |   |  1  |  Desired capacity for Auto Scaling Group  | 
|  destAcl  |  private  public-read public-read-write authenticated-read aws-exec-read bucket-owner-read bucket-owner-full-control  |  bucket-owner-full-control  |  Destination access control list  | 
|  destBucket  |   |   |  Destination bucket name  | 
|  destCredentials  |   |   |  Secret name in Secrets Manager used to keep AK/SK credentials for destination bucket. Leave it blank if the destination bucket is in the current account  | 
|  destInCurrentAccount  |  true  false   |  true  |  Indicates whether the destination bucket is in current account. If not, you should provide a credential with read and write access  | 
|  destPrefix  |   |   |  Destination prefix (Optional)  | 
|  destRegion  |   |   |  Destination region name  | 
|  destStorageClass  |  STANDARD STANDARD_IA ONEZONE_IA INTELLIGENT_TIERING  |  INTELLIGENT_TIERING  |  Destination storage class, which defaults to INTELLIGENT_TIERING  | 
|  destPutObjectSSEType  |  None AES256 AWS_KMS  |  None  |  Specifies the server-side encryption algorithm used for storing objects in Amazon S3. 'AES256' applies AES256 encryption, 'AWS_KMS' uses AWS Key Management Service encryption, and 'None' indicates that no encryption is applied.  | 
|  destPutObjectSSEKmsKeyId  |   |   |  Specifies the ID of the symmetric customer managed AWS KMS Customer Master Key (CMK) used for object encryption. Set this parameter only when destPutObjectSSEType is set to 'AWS_KMS'. For any other value of destPutObjectSSEType, leave this parameter empty. The default value is not set.  | 
|  isPayerRequest  |  true  false   |  false  |  Indicates whether to enable payer request. If true, it will get object in payer request mode.  | 
|  ec2CronExpression  |   |  `0/60 * * * ? *`  |  Cron expression for the EC2 Finder task. Use "" for a one-time transfer.  | 
|  finderEc2Memory  |  8 16 32 64 128 256   |  8 GB  |  The amount of memory (in GB) used by the Finder task.  | 
|  ec2Subnets  |   |   |  Two public subnets or two private subnets with [NAT gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html)  | 
|  ec2VpcId  |   |   |  VPC ID to run EC2 task, for example, vpc-bef13dc7  | 
|  finderDepth  |   |  0  |  Depth of sub folders to compare in parallel. 0 means comparing all objects in sequence  | 
|  finderNumber  |   |  1  |  The number of finder threads to run in parallel  | 
|  includeMetadata  |  true  false   |  false  |  Indicates whether to add replication of object metadata. If true, there will be additional API calls.  | 
|  maxCapacity  |   |  20  |  Maximum capacity for Auto Scaling Group  | 
|  minCapacity  |   |  1  |  Minimum capacity for Auto Scaling Group  | 
|  srcBucket  |   |   |  Source bucket name  | 
|  srcCredentials  |   |   |  Secret name in Secrets Manager used to keep AK/SK credentials for Source Bucket. Leave it blank if source bucket is in the current account or source is open data  | 
|  srcEndpoint  |   |   |  Source Endpoint URL (Optional). Leave it blank unless you want to provide a custom Endpoint URL  | 
|  srcEvent  |  No Create CreateAndDelete   |  No  |  Whether to enable S3 Event to trigger the replication. Note that S3Event is only applicable if source is in the current account  | 
|  srcInCurrentAccount  |  true  false   |  false  |  Indicates whether the source bucket is in the current account. If not, you should provide a credential with read access  | 
|  srcPrefix  |   |   |  Source prefix (Optional)  | 
|  srcPrefixListBucket  |   |   |  Source prefix list file S3 bucket name (Optional). It is used to store the source prefix list file. The specified bucket must be located in the same AWS Region and under the same account as the DTH deployment. If your prefix list file is stored in the source bucket, leave this parameter empty.  | 
|  srcPrefixsListFile  |   |   |  Source prefix list file S3 path (Optional). It supports the txt type, for example, my_prefix_list.txt, and the maximum number of lines is 10 million  | 
|  srcRegion  |   |   |  Source region name  | 
|  srcSkipCompare  |  true  false   |  false  |  Indicates whether to skip the data comparison in task finding process. If yes, all data in the source will be sent to the destination  | 
|  srcType  |  Amazon_S3 Aliyun_OSS Qiniu_Kodo Tencent_COS  |  Amazon_S3  |  If you choose to use the Endpoint mode, select Amazon_S3.  | 
|  workerNumber  |  1 to 10  |  4  |  The number of worker threads to run in one worker node/instance. For small files (size < 1MB), you can increase the number of workers to improve the transfer performance.  | 
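
The table states that `destPutObjectSSEKmsKeyId` should be set only when `destPutObjectSSEType` is `AWS_KMS`. A script that builds the `create-stack` command could check this before calling the AWS CLI. The following is a minimal sketch; the `check_sse_params` function is a hypothetical helper, not part of Data Transfer Hub.

```shell
#!/bin/sh
# check_sse_params TYPE KEY_ID
# Returns 0 when the combination follows the rule in the parameters table,
# and 1 otherwise. A key ID is accepted only together with AWS_KMS.
check_sse_params() {
  sse_type="$1"
  kms_key_id="$2"
  case "$sse_type" in
    AWS_KMS)
      ;;
    None|AES256)
      if [ -n "$kms_key_id" ]; then
        echo "Leave destPutObjectSSEKmsKeyId empty for $sse_type" >&2
        return 1
      fi
      ;;
    *)
      echo "Unknown destPutObjectSSEType: $sse_type" >&2
      return 1
      ;;
  esac
  return 0
}

# Example: a KMS key ID together with AWS_KMS passes the check.
check_sse_params "AWS_KMS" "7c54749e-eb6a-42cc-894e-93143b32e7c0" && echo "ok"
```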

## How to transfer S3 objects from a KMS-encrypted Amazon S3 bucket
<a name="how-to-transfer-s3-object-from-kms-encrypted-amazon-s3"></a>

 By default, Data Transfer Hub supports source buckets that use SSE-S3 and SSE-KMS. 

 If your source bucket has SSE-CMK enabled, create an IAM policy and attach it to the roles of the DTH worker and finder nodes. To find the roles, go to the [IAM Roles](https://us-east-1.console.aws.amazon.com/iamv2/home#/roles) console and search for `<StackName>-FinderStackFinderRole<random suffix>` and `<StackName>-EC2WorkerStackWorkerAsgRole<random suffix>`. 

 Pay attention to the following: 
+  Change the `Resource` in the KMS statement to the Amazon Resource Name (ARN) of your own KMS key. 
+  For S3 buckets in AWS China Regions, make sure to use `arn:aws-cn:kms:::` instead of `arn:aws:kms:::`. 

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*",
                "kms:DescribeKey"
            ],
            "Resource": [
                "arn:aws:kms:us-west-2:123456789012:key/f5cd8cb7-476c-4322-ac9b-0c94a687700d"
            ]
        }
    ]
}
```

------
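
Because the ARN partition differs between AWS China Regions and other Regions, it can help to build the `Resource` value from the Region name. The following is a minimal sketch; the `kms_key_arn` function is a hypothetical helper introduced for illustration, and the account ID and key ID are the example values used on this page.

```shell
#!/bin/sh
# kms_key_arn REGION ACCOUNT_ID KEY_ID
# Builds a KMS key ARN with the partition that matches the Region.
kms_key_arn() {
  case "$1" in
    cn-*)     partition="aws-cn" ;;
    us-gov-*) partition="aws-us-gov" ;;
    *)        partition="aws" ;;
  esac
  echo "arn:${partition}:kms:$1:$2:key/$3"
}

# Example with a key in an AWS China Region.
kms_key_arn "cn-north-1" "123456789012" "7c54749e-eb6a-42cc-894e-93143b32e7c0"
```

After you save the policy to a file, one way to attach it as an inline policy is `aws iam put-role-policy` with `--policy-document file://<your-policy-file>`, run once for the finder role and once for the worker role.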