

# Using data repositories with Amazon FSx for Lustre

Amazon FSx for Lustre provides high-performance file systems optimized for fast workload processing. It can support workloads such as machine learning, high performance computing (HPC), video processing, financial modeling, and electronic design automation (EDA). These workloads commonly require data to be presented using a scalable, high-speed file system interface for data access. Often, the datasets used for these workloads are stored in long-term data repositories in Amazon S3. FSx for Lustre is natively integrated with Amazon S3, making it easier to process datasets with the Lustre file system.

**Note**  
File system backups aren't supported on file systems that are linked to an Amazon S3 data repository. For more information, see [Protecting your data with backups](using-backups-fsx.md).
Intelligent-Tiering file systems don't support linking to Amazon S3 data repositories.

**Topics**
+ [Overview of data repositories](overview-dra-data-repo.md)
+ [POSIX metadata support for data repositories](posix-metadata-support.md)
+ [Linking your file system to an Amazon S3 bucket](create-dra-linked-data-repo.md)
+ [Importing changes from your data repository](importing-files-dra.md)
+ [Exporting changes to the data repository](export-changed-data-meta-dra.md)
+ [Data repository tasks](data-repository-tasks.md)
+ [Releasing files](file-release.md)
+ [Using Amazon FSx with your on-premises data](fsx-on-premises.md)
+ [Data repository event logs](data-repo-event-logs.md)
+ [Working with older deployment types](older-deployment-types.md)

# Overview of data repositories


When you use Amazon FSx for Lustre with data repositories, you can ingest and process large volumes of file data in a high-performance file system by using automatic import and import data repository tasks. At the same time, you can write results to your data repositories by using automatic export or export data repository tasks. With these features, you can restart your workload at any time using the latest data stored in your data repository.

**Note**  
 Data repository associations, automatic export, and support for multiple data repositories aren't available on FSx for Lustre 2.10 file systems or `Scratch 1` file systems. 

FSx for Lustre is deeply integrated with Amazon S3. This integration means that you can seamlessly access the objects stored in your Amazon S3 buckets from applications that mount your FSx for Lustre file system. You can also run your compute-intensive workloads on Amazon EC2 instances in the AWS Cloud and export the results to your data repository after your workload is complete.

In order to access objects in the Amazon S3 data repository as files and directories on the file system, file and directory metadata must be loaded into the file system. You can load metadata from a linked data repository when you create a data repository association.

Additionally, you can import file and directory metadata from your linked data repositories to the file system using automatic import or an import data repository task. When you turn on automatic import for a data repository association, your file system automatically imports file metadata as files are created, modified, or deleted in the S3 data repository. Alternatively, you can import metadata for new or changed files and directories using an import data repository task.

**Note**  
Automatic import and import data repository tasks can be used simultaneously on a file system.

You can also export files and their associated metadata in your file system to your data repository using automatic export or using an export data repository task. When you turn on automatic export on a data repository association, your file system automatically exports file data and metadata as files are created, modified, or deleted. Alternatively, you can export files or directories using an export data repository task. When you use an export data repository task, file data and metadata that were created or modified since the last such task are exported.

**Note**  
Automatic export and export data repository tasks can't be used simultaneously on a file system.
Data repository associations export only regular files, symlinks, and directories. All other file types (FIFO special, block special, character special, and socket) aren't exported by automatic export or by export data repository tasks.
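The file-type rule in the note above can be sketched with the standard `stat` module's type tests. This is an illustrative Python sketch, not FSx code; the function name is hypothetical:

```python
import stat

def is_exportable(mode: int) -> bool:
    """True if an st_mode value describes a file type that a DRA exports:
    regular file, symlink, or directory. FIFO special, block special,
    character special, and socket files return False."""
    return stat.S_ISREG(mode) or stat.S_ISLNK(mode) or stat.S_ISDIR(mode)

# Usage against a real path (os.lstat so symlinks aren't followed):
# is_exportable(os.lstat("/mnt/fsx/myfile").st_mode)
```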

FSx for Lustre also supports cloud bursting workloads with on-premises file systems by enabling you to copy data from on-premises clients using Direct Connect or VPN.

**Important**  
If you have linked one or more FSx for Lustre file systems to a data repository on Amazon S3, don't delete the Amazon S3 bucket until you have deleted or unlinked all linked file systems.

## Region and account support for linked S3 buckets


When you create links to S3 buckets, keep in mind the following Region and account support limitations:
+ Automatic export supports cross-Region configurations. The Amazon FSx file system and the linked S3 bucket can be located in the same AWS Region or in different AWS Regions.
+ Automatic import doesn't support cross-Region configurations. Both the Amazon FSx file system and the linked S3 bucket must be located in the same AWS Region.
+ Both automatic export and automatic import support cross-account configurations. The Amazon FSx file system and the linked S3 bucket can belong to the same AWS account or to different AWS accounts.

# POSIX metadata support for data repositories

Amazon FSx for Lustre automatically transfers Portable Operating System Interface (POSIX) metadata for files, directories, and symbolic links (symlinks) when importing and exporting data to and from a linked data repository on Amazon S3. When you export changes in your file system to its linked data repository, FSx for Lustre also exports POSIX metadata changes as S3 object metadata. This means that if another FSx for Lustre file system imports the same files from S3, the files will have the same POSIX metadata in that file system, including ownership and permissions.

FSx for Lustre imports only S3 objects that have POSIX-compliant object keys, such as the following.

```
mydir/
mydir/myfile1
mydir/mysubdir/
mydir/mysubdir/myfile2.txt
```

FSx for Lustre stores directories and symlinks as separate objects in the linked data repository on S3. For directories, FSx for Lustre creates an S3 object with a key name that ends with a slash ("/"), as follows:
+ The S3 object key `mydir/` maps to the FSx for Lustre directory `mydir/`.
+ The S3 object key `mydir/mysubdir/` maps to the FSx for Lustre directory `mydir/mysubdir/`.
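This trailing-slash convention can be sketched in Python. The sketch is illustrative only (the helper names are hypothetical); the key names come from the example above:

```python
def key_maps_to_directory(s3_key: str) -> bool:
    """An object key ending in '/' maps to a directory on the
    file system; any other key maps to a file."""
    return s3_key.endswith("/")

def implied_directories(s3_key: str) -> list:
    """Directory entries a key implies on the file system, e.g.
    'mydir/mysubdir/myfile2.txt' implies 'mydir/' and 'mydir/mysubdir/'."""
    parts = s3_key.split("/")[:-1]
    return ["/".join(parts[:i + 1]) + "/" for i in range(len(parts))]
```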

For symlinks, FSx for Lustre uses the following Amazon S3 schema:
+ **S3 object key** – The path to the link, relative to the FSx for Lustre mount directory
+ **S3 object data** – The target path of this symlink
+ **S3 object metadata** – The metadata for the symlink

FSx for Lustre stores POSIX metadata, including ownership, permissions, and timestamps for files, directories, and symbolic links, in S3 objects as follows:
+ `Content-Type` – The HTTP entity header used to indicate the media type of the resource for web browsers.
+ `x-amz-meta-file-permissions` – The file type and permissions in the format `<octal file type><octal permission mask>`, consistent with `st_mode` in the [Linux stat(2) man page](https://man7.org/linux/man-pages/man2/lstat.2.html).
**Note**  
FSx for Lustre doesn't import or retain `setuid` information.
+ `x-amz-meta-file-owner` – The owner user ID (UID) expressed as an integer.
+ `x-amz-meta-file-group` – The group ID (GID) expressed as an integer.
+ `x-amz-meta-file-atime` – The last-accessed time in nanoseconds since the beginning of the Unix epoch. Terminate the time value with `ns`; otherwise FSx for Lustre interprets the value as milliseconds.
+ `x-amz-meta-file-mtime` – The last-modified time in nanoseconds since the beginning of the Unix epoch. Terminate the time value with `ns`; otherwise, FSx for Lustre interprets the value as milliseconds.
+ `x-amz-meta-user-agent` – The user agent, ignored during FSx for Lustre import. During export, FSx for Lustre sets this value to `aws-fsx-lustre`.

When importing objects from S3 that don't have associated POSIX permissions, the default POSIX permission that FSx for Lustre assigns to a file is `755`. This permission allows read and execute access for all users and write access for the owner of the file.
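The mapping above can be sketched as a small Python helper that formats a file's stat information into the metadata key/value pairs listed earlier (S3 adds the `x-amz-meta-` prefix when the values are stored as object metadata). This is an illustrative sketch with a hypothetical function name, not FSx code:

```python
def posix_metadata(mode: int, uid: int, gid: int,
                   atime_ns: int, mtime_ns: int) -> dict:
    """Format POSIX stat values as the S3 object metadata
    fields described above."""
    return {
        "user-agent": "aws-fsx-lustre",
        # Octal file type + permission mask, e.g. 0100664 for a
        # regular file with rw-rw-r-- permissions (st_mode format).
        "file-permissions": "0{:o}".format(mode),
        "file-owner": str(uid),
        "file-group": str(gid),
        # The trailing "ns" is required; without it, the value is
        # interpreted as milliseconds.
        "file-atime": "{}ns".format(atime_ns),
        "file-mtime": "{}ns".format(mtime_ns),
    }

# Usage against a local file (os.lstat so symlinks aren't followed):
# st = os.lstat("myfile1")
# posix_metadata(st.st_mode, st.st_uid, st.st_gid,
#                st.st_atime_ns, st.st_mtime_ns)
```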

**Note**  
FSx for Lustre doesn't retain any user-defined custom metadata on S3 objects.

# Hard links and exporting to Amazon S3

If automatic export (with NEW and CHANGED policies) is enabled on a DRA in your file system, each hard link contained within the DRA is exported to Amazon S3 as a separate S3 object. If a file with multiple hard links is modified on the file system, all of the copies in S3 are updated, regardless of which hard link was used when changing the file.

If hard links are exported to S3 using data repository tasks (DRTs), each hard link contained within the paths specified for the DRT is exported to S3 as a separate S3 object. If a file with multiple hard links is modified on the file system, each copy in S3 is updated at the time the respective hard link is exported, regardless of which hard link was used when changing the file.

**Important**  
When a new FSx for Lustre file system is linked to an S3 bucket to which hard links were previously exported by another FSx for Lustre file system, AWS DataSync, or Amazon FSx File Gateway, the hard links are subsequently imported as separate files on the new file system.

## Hard links and released files


A released file is a file whose metadata is present in the file system, but whose content is only stored in S3. For more information on released files, see [Releasing files](file-release.md).

**Important**  
The use of hard links in a file system that has data repository associations (DRAs) is subject to the following limitations:  
Deleting and recreating a released file that has multiple hard links may cause the content of all hard links to be overwritten.
Deleting a released file will delete content from all hard links that reside outside of a data repository association.
Creating a hard link to a released file whose corresponding S3 object is in either of the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes will not create a new object in S3 for the hard link.

# Walkthrough: Attaching POSIX permissions when uploading objects into an Amazon S3 bucket

The following procedure walks you through the process of uploading objects into Amazon S3 with POSIX permissions. Doing so allows you to import the POSIX permissions when you create an Amazon FSx file system that is linked to that S3 bucket.

**To upload objects with POSIX permissions to Amazon S3**

1. From your local computer or machine, use the following example commands to create a test directory (`s3cptestdir`) and file (`s3cptest.txt`) that will be uploaded to the S3 bucket.

   ```
   $ mkdir s3cptestdir
   $ echo "S3cp metadata import test" >> s3cptestdir/s3cptest.txt
   $ ls -ld s3cptestdir/ s3cptestdir/s3cptest.txt
   drwxr-xr-x 3 500 500 96 Jan 8 11:29 s3cptestdir/
   -rw-r--r-- 1 500 500 26 Jan 8 11:29 s3cptestdir/s3cptest.txt
   ```

   The newly created file and directory have a file owner user ID (UID) and group ID (GID) of 500 and permissions as shown in the preceding example.

1. Call the Amazon S3 API to create the directory `s3cptestdir` with metadata permissions. You must specify the directory name with a trailing slash (`/`). For information about supported POSIX metadata, see [POSIX metadata support for data repositories](posix-metadata-support.md).

   Replace `bucket_name` with the actual name of your S3 bucket.

   ```
   $ aws s3api put-object --bucket bucket_name --key s3cptestdir/ --metadata '{"user-agent":"aws-fsx-lustre" , \
         "file-atime":"1595002920000000000ns" , "file-owner":"500" , "file-permissions":"0100664","file-group":"500" , \
         "file-mtime":"1595002920000000000ns"}'
   ```

1. Verify the POSIX permissions are tagged to S3 object metadata.

   ```
   $ aws s3api head-object --bucket bucket_name --key s3cptestdir/
   {
       "AcceptRanges": "bytes",
       "LastModified": "Fri, 08 Jan 2021 17:32:27 GMT",
       "ContentLength": 0,
       "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
       "VersionId": "bAlhCoWq7aIEjc3R6Myc6UOb8sHHtJkR",
       "ContentType": "binary/octet-stream",
       "Metadata": {
           "user-agent": "aws-fsx-lustre",
           "file-atime": "1595002920000000000ns",
           "file-owner": "500",
           "file-permissions": "0100664",
           "file-group": "500",
           "file-mtime": "1595002920000000000ns"
       }
   }
   ```

1. Upload the test file (created in step 1) from your computer to the S3 bucket with metadata permissions.

   ```
   $ aws s3 cp s3cptestdir/s3cptest.txt s3://bucket_name/s3cptestdir/s3cptest.txt \
         --metadata '{"user-agent":"aws-fsx-lustre" , "file-atime":"1595002920000000000ns" , \
         "file-owner":"500" , "file-permissions":"0100664","file-group":"500" , "file-mtime":"1595002920000000000ns"}'
   ```

1. Verify the POSIX permissions are tagged to S3 object metadata.

   ```
   $ aws s3api head-object --bucket bucket_name --key s3cptestdir/s3cptest.txt
   {
       "AcceptRanges": "bytes",
       "LastModified": "Fri, 08 Jan 2021 17:33:35 GMT",
       "ContentLength": 26,
       "ETag": "\"eb33f7e1f44a14a8e2f9475ae3fc45d3\"",
       "VersionId": "w9ztRoEhB832m8NC3a_JTlTyIx7Uzql6",
       "ContentType": "text/plain",
       "Metadata": {
           "user-agent": "aws-fsx-lustre",
           "file-atime": "1595002920000000000ns",
           "file-owner": "500",
           "file-permissions": "0100664",
           "file-group": "500",
           "file-mtime": "1595002920000000000ns"
       }
   }
   ```

1. Verify permissions on the Amazon FSx file system linked to the S3 bucket.

   ```
   $ sudo lfs df -h /fsx
   UUID                       bytes        Used   Available Use% Mounted on
   3rnxfbmv-MDT0000_UUID       34.4G        6.1M       34.4G   0% /fsx[MDT:0]
   3rnxfbmv-OST0000_UUID        1.1T        4.5M        1.1T   0% /fsx[OST:0]
    
   filesystem_summary:         1.1T        4.5M        1.1T   0% /fsx
    
   $ cd /fsx/s3cptestdir/
   $ ls -ld s3cptestdir/
   drw-rw-r-- 2 500 500 25600 Jan  8 17:33 s3cptestdir/
   
   $ ls -ld s3cptestdir/s3cptest.txt
   -rw-rw-r-- 1 500 500 26 Jan 8 17:33 s3cptestdir/s3cptest.txt
   ```

Both the `s3cptestdir` directory and the `s3cptest.txt` file have POSIX permissions imported.

# Linking your file system to an Amazon S3 bucket

You can link your Amazon FSx for Lustre file system to data repositories in Amazon S3. You can create the link when creating the file system or at any time after the file system has been created.

A link between a directory on the file system and an S3 bucket or prefix is called a *data repository association (DRA)*. You can configure a maximum of 8 data repository associations on an FSx for Lustre file system. A maximum of 8 DRA requests can be queued, but only one request can be worked on at a time for the file system. Each DRA must have a unique FSx for Lustre file system directory and a unique S3 bucket or prefix associated with it.

**Note**  
 Data repository associations, automatic export, and support for multiple data repositories aren't available on FSx for Lustre 2.10 file systems or `Scratch 1` file systems. 

To access objects on the S3 data repository as files and directories on the file system, file and directory metadata must be loaded into the file system. You can load metadata in several ways: from a linked data repository when you create the DRA, at a later time for batches of files and directories that you want to access using an import data repository task, or automatically with automatic import as objects are added to, changed in, or deleted from the data repository.

You can configure a DRA for automatic import only, for automatic export only, or for both. A data repository association configured with both automatic import and automatic export propagates data in both directions between the file system and the linked S3 bucket. As you make changes to data in your S3 data repository, FSx for Lustre detects the changes and then automatically imports the changes to your file system. As you create, modify, or delete files, FSx for Lustre automatically exports the changes to Amazon S3 asynchronously once your application finishes modifying the file.

**Important**  
If you modify the same file in both the file system and the S3 bucket, you should ensure application-level coordination to prevent conflicts. FSx for Lustre doesn't prevent conflicting writes in multiple locations.
For files marked with an immutable attribute, FSx for Lustre is unable to synchronize changes between your FSx for Lustre file system and an S3 bucket linked to the file system. Setting an immutable flag for an extended period of time can cause the performance of data movement between Amazon FSx and S3 to degrade.

When you create a data repository association, you can configure the following properties:
+ **File system path** – Enter a local path on the file system that points to a directory (such as `/ns1/`) or subdirectory (such as `/ns1/subdir/`) that will be mapped one-to-one with the specified data repository path below. The leading forward slash in the name is required. Two data repository associations cannot have overlapping file system paths. For example, if a data repository is associated with file system path `/ns1`, then you cannot link another data repository with file system path `/ns1/ns2`.
**Note**  
If you specify only a forward slash (`/`) as the file system path, you can link only one data repository to the file system. You can only specify "/" as the file system path for the first data repository associated with a file system.
+ **Data repository path** – Enter a path in the S3 data repository. The path can be an S3 bucket or prefix in the format `s3://bucket-name/prefix/`. This property specifies where in the S3 data repository files will be imported from or exported to. FSx for Lustre will append a trailing "/" to your data repository path if you don't provide one. For example, if you provide a data repository path of `s3://amzn-s3-demo-bucket/my-prefix`, FSx for Lustre will interpret it as `s3://amzn-s3-demo-bucket/my-prefix/`.

  Two data repository associations cannot have overlapping data repository paths. For example, if a data repository with path `s3://amzn-s3-demo-bucket/my-prefix/` is linked to the file system, then you cannot create another data repository association with data repository path `s3://amzn-s3-demo-bucket/my-prefix/my-sub-prefix`.
+ **Import metadata from repository** – You can select this option to import metadata from the entire data repository immediately after creating the data repository association. Alternatively, you can run an import data repository task to load all or a subset of the metadata from the linked data repository into the file system at any time after the data repository association is created.
+ **Import settings** – Choose an import policy that specifies the type of updated objects (any combination of new, changed, and deleted) that will be automatically imported from the linked S3 bucket to your file system. Automatic import (new, changed, deleted) is turned on by default when you add a data repository from the console, but is disabled by default when using the AWS CLI or Amazon FSx API.
+ **Export settings** – Choose an export policy that specifies the type of updated objects (any combination of new, changed, and deleted) that will be automatically exported to the S3 bucket. Automatic export (new, changed, deleted) is turned on by default when you add a data repository from the console, but is disabled by default when using the AWS CLI or Amazon FSx API.

The **File system path** and **Data repository path** settings provide a 1:1 mapping between paths in Amazon FSx and object keys in S3.
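The no-overlap rule for both settings can be sketched as a prefix check at a path-component boundary. This is an illustrative Python sketch (the function name is hypothetical), not how Amazon FSx validates requests:

```python
def paths_overlap(path_a: str, path_b: str) -> bool:
    """True if one path is nested under (or equal to) the other at a
    path-component boundary -- the condition that prevents two DRAs
    from coexisting. Works for file system paths and for
    s3://bucket/prefix data repository paths alike."""
    # Normalize to a single trailing slash so that "/ns1" doesn't
    # falsely match "/ns10".
    a = path_a.rstrip("/") + "/"
    b = path_b.rstrip("/") + "/"
    return a.startswith(b) or b.startswith(a)
```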

**Topics**
+ [Creating a link to an S3 bucket](create-linked-dra.md)
+ [Updating data repository association settings](update-dra-settings.md)
+ [Deleting an association to an S3 bucket](delete-linked-dra.md)
+ [Viewing data repository association details](view-dra-details.md)
+ [Data repository association lifecycle state](dra-lifecycles.md)
+ [Working with server-side encrypted Amazon S3 buckets](s3-server-side-encryption-support.md)

# Creating a link to an S3 bucket


The following procedures walk you through the process of creating a data repository association for an FSx for Lustre file system to an existing S3 bucket, using the AWS Management Console and AWS Command Line Interface (AWS CLI). For information on adding permissions to an S3 bucket in order to link it to your file system, see [Adding permissions to use data repositories in Amazon S3](setting-up.md#fsx-adding-permissions-s3).

**Note**  
Data repositories cannot be linked to file systems that have file system backups enabled. Disable backups before linking to a data repository.

## To link an S3 bucket while creating a file system (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. Follow the procedure for creating a new file system described in [Step 1: Create your FSx for Lustre file system](getting-started.md#getting-started-step1) in the Getting Started section. 

1. Open the **Data Repository Import/Export - *optional*** section. The feature is disabled by default.

1. Choose **Import data from and export data to S3**.

1. In the **Data repository association information** dialog, provide information for the following fields.
   + **File system path**: Enter the name of a high-level directory (such as `/ns1`) or subdirectory (such as `/ns1/subdir`) within the Amazon FSx file system that will be associated with the S3 data repository. The leading forward slash in the path is required. Two data repository associations cannot have overlapping file system paths. For example, if a data repository is associated with file system path `/ns1`, then you cannot link another data repository with file system path `/ns1/ns2`. The **File system path** setting must be unique across all the data repository associations for the file system.
   + **Data repository path**: Enter the path of an existing S3 bucket or prefix to associate with your file system (for example, `s3://amzn-s3-demo-bucket/my-prefix`). Two data repository associations cannot have overlapping data repository paths. The **Data repository path** setting must be unique across all the data repository associations for the file system.
   + **Import metadata from repository**: Select this property to optionally run an import data repository task to import metadata immediately after the link is created.

1. For **Import settings - optional**, set an **Import Policy** that determines how your file and directory listings are kept up to date as you add, change, or delete objects in your S3 bucket. For example, choose **New** to import metadata to your file system for new objects created in the S3 bucket. For more information on import policies, see [Automatically import updates from your S3 bucket](autoimport-data-repo-dra.md).

1. For **Export policy**, set an export policy that determines how your files are exported to your linked S3 bucket as you add, change, or delete objects in your file system. For example, choose **Changed** to export objects whose content or metadata has been changed on your file system. For more information about export policies, see [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md).

1. Continue with the next section of the file system creation wizard.

## To link an S3 bucket to an existing file system (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the dashboard, choose **File systems** and then select the file system that you want to create a data repository association for. 

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose **Create data repository association**.

1. In the **Data repository association information** dialog, provide information for the following fields.
   + **File system path**: Enter the name of a high-level directory (such as `/ns1`) or subdirectory (such as `/ns1/subdir`) within the Amazon FSx file system that will be associated with the S3 data repository. The leading forward slash in the path is required. Two data repository associations cannot have overlapping file system paths. For example, if a data repository is associated with file system path `/ns1`, then you cannot link another data repository with file system path `/ns1/ns2`. The **File system path** setting must be unique across all the data repository associations for the file system.
   + **Data repository path**: Enter the path of an existing S3 bucket or prefix to associate with your file system (for example, `s3://amzn-s3-demo-bucket/my-prefix`). Two data repository associations cannot have overlapping data repository paths. The **Data repository path** setting must be unique across all the data repository associations for the file system.
   + **Import metadata from repository**: Select this property to optionally run an import data repository task to import metadata immediately after the link is created.

1. For **Import settings - optional**, set an **Import Policy** that determines how your file and directory listings are kept up to date as you add, change, or delete objects in your S3 bucket. For example, choose **New** to import metadata to your file system for new objects created in the S3 bucket. For more information about import policies, see [Automatically import updates from your S3 bucket](autoimport-data-repo-dra.md).

1. For **Export policy**, set an export policy that determines how your files are exported to your linked S3 bucket as you add, change, or delete objects in your file system. For example, choose **Changed** to export objects whose content or metadata has been changed on your file system. For more information about export policies, see [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md).

1. Choose **Create**.

## To link a file system to an S3 bucket (AWS CLI)


The following example creates a data repository association that links an Amazon FSx file system to an S3 bucket, with an import policy that imports any new or changed files to the file system and an export policy that exports new, changed, or deleted files to the linked S3 bucket.
+ To create a data repository association, use the Amazon FSx CLI command `create-data-repository-association`, as shown following.

  ```
  $ aws fsx create-data-repository-association \
        --file-system-id fs-0123456789abcdef0 \
        --file-system-path /ns1/path1/ \
        --data-repository-path s3://amzn-s3-demo-bucket/myprefix/ \
        --s3 "AutoImportPolicy={Events=[NEW,CHANGED,DELETED]},AutoExportPolicy={Events=[NEW,CHANGED,DELETED]}"
  ```

Amazon FSx returns the JSON description of the DRA immediately. The DRA is created asynchronously.

You can use this command to create a data repository association even before the file system has finished creating. The request will be queued and the data repository association will be created after the file system is available.

# Updating data repository association settings


You can update an existing data repository association's settings using the AWS Management Console, the AWS CLI, and the Amazon FSx API, as shown in the following procedures.

**Note**  
You cannot update the `File system path` or `Data repository path` of a DRA after it's created. If you want to change the `File system path` or `Data repository path`, you must delete the DRA and create it again.

## To update settings for an existing data repository association (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the dashboard, choose **File systems**, and then select the file system that you want to manage.

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose the data repository association you want to change.

1. Choose **Update**. An edit dialog appears for the data repository association.

1. For **Import settings - optional**, you can update your **Import Policy**. For more information on import policies, see [Automatically import updates from your S3 bucket](autoimport-data-repo-dra.md).

1. For **Export settings - optional**, you can update your export policy. For more information on export policies, see [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md).

1. Choose **Update**.

## To update settings for an existing data repository association (CLI)

+ To update a data repository association, use the Amazon FSx CLI command `update-data-repository-association`, as shown following.

  ```
  $ aws fsx update-data-repository-association \
        --association-id 'dra-872abab4b4503bfc2' \
        --s3 "AutoImportPolicy={Events=[NEW,CHANGED,DELETED]},AutoExportPolicy={Events=[NEW,CHANGED,DELETED]}"
  ```

After successfully updating the data repository association's import and export policies, Amazon FSx returns the description of the updated data repository association as JSON.

# Deleting an association to an S3 bucket


The following procedures walk you through the process of deleting a data repository association from an existing Amazon FSx file system to an existing S3 bucket, using the AWS Management Console and AWS Command Line Interface (AWS CLI). Deleting the data repository association unlinks the file system from the S3 bucket.

## To delete a link from a file system to an S3 bucket (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the dashboard, choose **File systems** and then select the file system that you want to delete a data repository association from. 

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose the data repository association that you want to delete.

1. For **Actions**, choose **Delete association**.

1. In the **Delete** dialog, you can choose **Delete data in file system** to physically delete the data in the file system that corresponds to the data repository association.

   Choose this option if you plan to create a new data repository association using the same file system path but pointing to a different S3 bucket prefix, or if you no longer need the data in your file system.

1. Choose **Delete** to remove the data repository association from the file system.

## To delete a link from a file system to an S3 bucket (AWS CLI)


The following example deletes a data repository association that links an Amazon FSx file system to an S3 bucket. The `--association-id` parameter specifies the ID of the data repository association to be deleted.
+ To delete a data repository association, use the Amazon FSx CLI command `delete-data-repository-association`, as shown following.

  ```
  $ aws fsx delete-data-repository-association \
        --association-id dra-872abab4b4503bfc \
        --no-delete-data-in-file-system
  ```

After successfully deleting the data repository association, Amazon FSx returns its description as JSON.

**Recreating DRAs with the same file system path**  
We do not recommend deleting and recreating data repository associations that use the same file system path. If you delete a DRA and later create a new DRA using the same file system path, some files may retain HSM state from the previously deleted DRA.  
If you need to export files from a recreated DRA that were managed by a previously deleted DRA, you need to mark those files as dirty using the command below and then run an export data repository task:  

```
sudo lfs hsm_set --dirty file_path
```
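
If many files under the recreated DRA's path need their HSM state reset, you can script the `hsm_set` step. The following is a minimal sketch, not part of the FSx tooling: it reads file paths (one per line) from a list file, and with `DRY_RUN=1` it prints the commands instead of running them so that you can review the list first.

```shell
# Illustrative sketch: mark a list of files dirty so that the next
# export data repository task re-exports them. Reads one path per
# line from the list file given as the first argument. Set DRY_RUN=1
# to print the commands instead of executing them.
set -euo pipefail

mark_dirty() {
    local list_file="$1"
    local path
    while IFS= read -r path; do
        [ -n "$path" ] || continue          # skip blank lines
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo lfs hsm_set --dirty $path"
        else
            sudo lfs hsm_set --dirty "$path"
        fi
    done < "$list_file"
}
```

For example, `DRY_RUN=1 mark_dirty files.txt` prints the commands for review; run `mark_dirty files.txt` afterward to apply them, then start the export data repository task. The list-file format and the `DRY_RUN` switch are conventions of this sketch only.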

# Viewing data repository association details


You can view the details of a data repository association using the FSx for Lustre console, the AWS CLI, and the API. The details include the DRA's association ID, file system path, data repository path, import settings, export settings, status, and the ID of its associated file system.

## To view DRA details (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the dashboard, choose **File systems** and then select the file system that you want to view a data repository association's details for.

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose the data repository association that you want to view. The **Summary** page appears, showing the DRA details.  
![\[Amazon FSx Details page of a data repository association.\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/images/dra-describe.png)

## To view DRA details (CLI)

+ To view the details of a specific data repository association, use the Amazon FSx CLI command `describe-data-repository-associations`, as shown following.

  ```
  $ aws fsx describe-data-repository-associations \
        --association-ids dra-872abab4b4503bfc2
  ```

  Amazon FSx returns the description of the data repository association as JSON.

# Data repository association lifecycle state


The data repository association lifecycle state provides status information about a specific DRA. A data repository association can have the following **Lifecycle states**:
+ **Creating** – Amazon FSx is creating the data repository association between the file system and the linked data repository. The data repository is unavailable.
+ **Available** – The data repository association is available for use.
+ **Updating** – The data repository association is undergoing a customer-initiated update that might affect its availability.
+ **Deleting** – The data repository association is undergoing a customer-initiated deletion.
+ **Misconfigured** – Amazon FSx cannot automatically import updates from the S3 bucket or automatically export updates to the S3 bucket until the data repository association configuration is corrected.

  A DRA can become **Misconfigured** due to the following:
  + Amazon FSx lacks necessary IAM permissions to access the S3 bucket.
  + The FSx event notification configuration on the S3 bucket is deleted or modified.
  + The S3 bucket has existing event notifications that overlap with FSx event types.

  After resolving the underlying issue, the DRA automatically returns to the **Available** state within 15 minutes, or you can immediately trigger the state change using the AWS CLI command [update-data-repository-association](https://docs.aws.amazon.com/cli/latest/reference/fsx/update-data-repository-association.html). 
+ **Failed** – The data repository association is in a terminal state that cannot be recovered (for example, because its file system path is deleted or the S3 bucket is deleted).

 You can view a data repository association’s lifecycle state using the Amazon FSx console, the AWS Command Line Interface, and the Amazon FSx API. For more information, see [Viewing data repository association details](view-dra-details.md).

# Working with server-side encrypted Amazon S3 buckets


 FSx for Lustre supports Amazon S3 buckets that use server-side encryption with S3-managed keys (SSE-S3), and with AWS KMS keys stored in AWS Key Management Service (SSE-KMS). 

If you want Amazon FSx to encrypt data when writing to your S3 bucket, you need to set the default encryption on your S3 bucket to either SSE-S3 or SSE-KMS. For more information, see [Configuring default encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/default-bucket-encryption.html) in the *Amazon S3 User Guide*. When writing files to your S3 bucket, Amazon FSx follows the default encryption policy of your S3 bucket.

By default, Amazon FSx supports S3 buckets encrypted using SSE-S3. If you want to link your Amazon FSx file system to an S3 bucket encrypted using SSE-KMS encryption, you need to add a statement to your customer managed key policy that allows Amazon FSx to encrypt and decrypt objects in your S3 bucket using your KMS key.

The following statement allows a specific Amazon FSx file system to encrypt and decrypt objects for a specific S3 bucket, *bucket_name*.

```
{
    "Sid": "Allow access through S3 for the FSx SLR to use the KMS key on the objects in the given S3 bucket",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::aws_account_id:role/aws-service-role/s3.data-source.lustre.fsx.amazonaws.com/AWSServiceRoleForFSxS3Access_fsx_file_system_id"
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
    ],
    "Resource": "*",
    "Condition": {
        "StringEquals": {
            "kms:CallerAccount": "aws_account_id",
            "kms:ViaService": "s3.bucket-region.amazonaws.com"
        },
        "StringLike": {
            "kms:EncryptionContext:aws:s3:arn": "arn:aws:s3:::bucket_name/*"
        }
    }
}
```

**Note**  
 If you're using an AWS KMS customer managed key (CMK) to encrypt your S3 bucket with S3 Bucket Keys enabled, set the `EncryptionContext` to the bucket ARN, not the object ARN, as in this example:  

```
"StringLike": {
    "kms:EncryptionContext:aws:s3:arn": "arn:aws:s3:::bucket_name"
}
```

The following policy statement allows all Amazon FSx file systems in your account to link to a specific S3 bucket.

```
{
      "Sid": "Allow access through S3 for the FSx SLR to use the KMS key on the objects in the given S3 bucket",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "kms:ViaService": "s3.bucket-region.amazonaws.com",
          "kms:CallerAccount": "aws_account_id"
        },
        "StringLike": {
            "kms:EncryptionContext:aws:s3:arn": "arn:aws:s3:::bucket_name/*"
        },
        "ArnLike": {
          "aws:PrincipalArn": "arn:aws_partition:iam::aws_account_id:role/aws-service-role/s3.data-source.lustre.fsx.amazonaws.com/AWSServiceRoleForFSxS3Access_fs-*"
        }
      }
}
```

## Accessing server-side encrypted Amazon S3 buckets in a different AWS account or from a Shared VPC


After you create an FSx for Lustre file system linked to an encrypted Amazon S3 bucket, you must grant the `AWSServiceRoleForFSxS3Access_fs-01234567890` service-linked role (SLR) access to the KMS key used to encrypt the S3 bucket before you can read or write data from the linked S3 bucket. You can use an IAM role that already has permissions to the KMS key.

**Note**  
This IAM role must be in the account that the FSx for Lustre file system was created in (which is the same account as the S3 SLR), not the account that the KMS key/S3 bucket belong to.

Use the IAM role to call the AWS KMS `create-grant` API, as shown following, to create a grant for the S3 SLR so that the SLR gains permission to the S3 objects. To find the ARN associated with your SLR, search your IAM roles using your file system ID as the search string.

```
$ aws kms create-grant --region fs_account_region \
      --key-id arn:aws:kms:s3_bucket_account_region:s3_bucket_account:key/key_id \
      --grantee-principal arn:aws:iam::fs_account_id:role/aws-service-role/s3.data-source.lustre.fsx.amazonaws.com/AWSServiceRoleForFSxS3Access_file-system-id \
      --operations "Decrypt" "Encrypt" "GenerateDataKey" "GenerateDataKeyWithoutPlaintext" "CreateGrant" "DescribeKey" "ReEncryptFrom" "ReEncryptTo"
```

For more information about service-linked roles, see [Using service-linked roles for Amazon FSx](using-service-linked-roles.md).

# Importing changes from your data repository


You can import changes to data and POSIX metadata from a linked data repository to your Amazon FSx file system. Associated POSIX metadata includes ownership, permissions, and timestamps.

To import changes to the file system, use one of the following methods:
+ Configure your file system to automatically import new, changed, or deleted files from your linked data repository. For more information, see [Automatically import updates from your S3 bucket](autoimport-data-repo-dra.md).
+ Select the option to import metadata when you create a data repository association. This will initiate an import data repository task immediately after creating the data repository association.
+ Use an on-demand import data repository task. For more information, see [Using data repository tasks to import changes](import-data-repo-task-dra.md).

Automatic import and import data repository tasks can run at the same time.

When you turn on automatic import for a data repository association, your file system automatically updates file metadata as objects are created, modified, or deleted in S3. When you select the option to import metadata while creating a data repository association, your file system imports metadata for all objects in the data repository. When you import using an import data repository task, your file system imports only metadata for objects that were created or modified since the last import.

FSx for Lustre automatically copies the content of a file from your data repository and loads it into the ﬁle system when your application first accesses the file in the file system. This data movement is managed by FSx for Lustre and is transparent to your applications. Subsequent reads of these files are served directly from the file system with sub-millisecond latencies.

You can also preload your whole ﬁle system or a directory within your ﬁle system. For more information, see [Preloading files into your file system](preload-file-contents-hsm-dra.md). If you request the preloading of multiple ﬁles simultaneously, FSx for Lustre loads ﬁles from your Amazon S3 data repository in parallel.

FSx for Lustre only imports S3 objects that have POSIX-compliant object keys. Both automatic import and import data repository tasks import POSIX metadata. For more information, see [POSIX metadata support for data repositories](posix-metadata-support.md).
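
If you're unsure whether your object keys will import, you can screen them before running an import. The checks in the sketch below (no leading slash, no empty path components, no `.` or `..` components) are plausible examples only, not the authoritative rules, which are described in [POSIX metadata support for data repositories](posix-metadata-support.md).

```shell
# Illustrative sketch: flag S3 object keys that clearly can't map to
# a POSIX path. The specific rules here are assumptions for
# illustration; see the POSIX metadata support topic for the
# authoritative list.
is_posix_like_key() {
    local key="$1"
    case "$key" in
        ''|/*|*//*|.|..|./*|../*|*/.|*/..|*/./*|*/../*)
            echo "bad" ;;   # empty key, leading slash, empty or dot component
        *)
            echo "ok" ;;
    esac
}
```

For example, `is_posix_like_key 'test/mydir/2020-04-01.jpg'` prints `ok`, while `is_posix_like_key '/abs/path'` prints `bad`.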

**Note**  
FSx for Lustre doesn't support importing metadata for symbolic links (symlinks) from S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage classes. Metadata for S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive objects that aren't symlinks can be imported (that is, an inode is created on the FSx for Lustre file system with the correct metadata). However, to read this data from the file system, you must first restore the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive object. Importing file data directly from Amazon S3 objects in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage class into FSx for Lustre isn't supported.

# Automatically import updates from your S3 bucket


You can configure FSx for Lustre to automatically update metadata in the file system as objects are added to, changed in, or deleted from your S3 bucket. FSx for Lustre creates, updates, or deletes the file and directory listing corresponding to the change in S3. If the changed object in the S3 bucket no longer contains its metadata, FSx for Lustre maintains the current metadata values of the file, including the current permissions.

**Note**  
The FSx for Lustre file system and the linked S3 bucket must be located in the same AWS Region to automatically import updates.

You can configure automatic import when you create the data repository association, and you can update the automatic import settings at any time using the FSx management console, the AWS CLI, or the AWS API.

**Note**  
You can configure both automatic import and automatic export on the same data repository association. This topic describes only the automatic import feature.

**Important**  
If an object is modified in S3 with all automatic import policies enabled and automatic export disabled, the content of that object is always imported to a corresponding file in the file system. If a file already exists in the target location, the file is overwritten.
If a file is modified in both the file system and S3, with all automatic import and automatic export policies enabled, either the file in the file system or the object in S3 could be overwritten by the other. It isn't guaranteed that a later edit in one location will overwrite an earlier edit in another location. If you modify the same file in both the file system and the S3 bucket, you should ensure application-level coordination to prevent such conflicts. FSx for Lustre doesn't prevent conflicting writes in multiple locations.

The import policy specifies how you want FSx for Lustre to update your file system as the contents change in the linked S3 bucket. A data repository association can have one of the following import policies:
+ **New** – FSx for Lustre automatically updates file and directory metadata only when new objects are added to the linked S3 data repository.
+ **Changed** – FSx for Lustre automatically updates file and directory metadata only when an existing object in the data repository is changed.
+ **Deleted** – FSx for Lustre automatically updates file and directory metadata only when an object in the data repository is deleted.
+ **Any combination of New, Changed, and Deleted** – FSx for Lustre automatically updates file and directory metadata when any of the specified actions occur in the S3 data repository. For example, you can specify that the file system is updated when an object is added to (**New**) or removed from (**Deleted**) the S3 repository, but not updated when an object is changed.
+ **No policy configured** – FSx for Lustre doesn't update file and directory metadata on the file system when objects are added to, changed in, or deleted from the S3 data repository. If you don't configure an import policy, automatic import is disabled for the data repository association. You can still manually import metadata changes by using an import data repository task, as described in [Using data repository tasks to import changes](import-data-repo-task-dra.md).
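
In the AWS CLI, the import policy is set through the `AutoImportPolicy` field of the `--s3` parameter on `create-data-repository-association` and `update-data-repository-association`. The following fragment sketches the recommended **New**, **Changed**, and **Deleted** policy:

```
{
    "AutoImportPolicy": {
        "Events": ["NEW", "CHANGED", "DELETED"]
    }
}
```

You would pass this JSON (on one line) as the value of `--s3`, alongside any other S3 configuration such as `AutoExportPolicy`.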

**Important**  
Automatic import will not synchronize the following S3 actions with your linked FSx for Lustre file system:  
Deleting an object using S3 object lifecycle expirations
Permanently deleting the current object version in a versioning-enabled bucket
Undeleting an object in a versioning-enabled bucket

For most use cases, we recommend that you configure an import policy of **New**, **Changed**, and **Deleted**. This policy ensures that all updates made in your linked S3 data repository are automatically imported to your file system.

When you set an import policy to update your file system file and directory metadata based on changes in the linked S3 data repository, FSx for Lustre creates an event notification configuration on the linked S3 bucket. The event notification configuration is named `FSx`. Don't modify or delete the `FSx` event notification configuration on the S3 bucket – doing so will prevent the automatic import of updated file and directory metadata to your file system.
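
You can confirm that the configuration is still in place with `aws s3api get-bucket-notification-configuration --bucket your-bucket`. The helper below is a sketch that scans saved output of that command for a notification configuration whose `Id` is `FSx`; the JSON shape it greps for is an assumption for illustration.

```shell
# Illustrative sketch: given the JSON printed by
#   aws s3api get-bucket-notification-configuration --bucket your-bucket
# on stdin, report whether a notification configuration named "FSx"
# is present. A plain grep is used here; jq would be more robust.
has_fsx_notification() {
    if grep -q '"Id":[[:space:]]*"FSx"'; then
        echo "present"
    else
        echo "missing"
    fi
}
```

Usage: `aws s3api get-bucket-notification-configuration --bucket your-bucket | has_fsx_notification`.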

When FSx for Lustre updates a file listing that has changed on the linked S3 data repository, it overwrites the local file with the updated version, even if the file is write-locked.

FSx for Lustre makes a best effort to update your file system. FSx for Lustre cannot update the file system in the following situations:
+ If FSx for Lustre doesn't have permission to open the changed or new S3 object. In this case, FSx for Lustre skips the object and continues. The DRA lifecycle state isn't affected.
+ If FSx for Lustre doesn't have bucket-level permissions, such as for `GetBucketAcl`. This will cause the data repository lifecycle state to become **Misconfigured**. For more information, see [Data repository association lifecycle state](dra-lifecycles.md).
+ If the `FSx` event notification configuration on the linked S3 bucket is deleted or changed. This will cause the data repository lifecycle state to become **Misconfigured**. For more information, see [Data repository association lifecycle state](dra-lifecycles.md).

We recommend that you [turn on logging](cw-event-logging.md#manage-logging) to CloudWatch Logs to log information about any files or directories that couldn't be imported automatically. Warnings and errors in the log contain information about the failure reason. For more information, see [Data repository event logs](data-repo-event-logs.md).

## Prerequisites


The following conditions are required for FSx for Lustre to automatically import new, changed, or deleted files from the linked S3 bucket:
+ The file system and its linked S3 bucket are located in the same AWS Region.
+ The S3 bucket doesn't have a misconfigured **Lifecycle state**. For more information, see [Data repository association lifecycle state](dra-lifecycles.md).
+ Your account has the permissions required to configure and receive event notifications on the linked S3 bucket.

## Types of file changes supported


FSx for Lustre supports importing the following changes to files and directories that occur in the linked S3 bucket:
+ Changes to file contents.
+ Changes to file or directory metadata.
+ Changes to symlink target or metadata.
+ Deletions of files and directories. If you delete an object in the linked S3 bucket which corresponds to a directory in the file system (that is, an object with a key name that ends with a slash), FSx for Lustre deletes the corresponding directory on the file system only if it is empty.

## Updating import settings


You can set a file system's import settings for a linked S3 bucket when you create the data repository association. For more information, see [Creating a link to an S3 bucket](create-linked-dra.md).

You can also update the import settings at any time, including the import policy. For more information, see [Updating data repository association settings](update-dra-settings.md).

## Monitoring automatic import


If the rate of change in your S3 bucket exceeds the rate at which automatic import can process these changes, the corresponding metadata changes being imported to your FSx for Lustre file system are delayed. If this occurs, you can use the `AgeOfOldestQueuedMessage` metric to monitor the age of the oldest change waiting to be processed by automatic import. For more information on this metric, see [FSx for Lustre S3 repository metrics](fs-metrics.md#auto-import-export-metrics).

If the delay in importing metadata changes exceeds 14 days (as measured using the `AgeOfOldestQueuedMessage` metric), changes in your S3 bucket that haven't been processed by automatic import aren't imported into your file system. Additionally, your data repository association lifecycle is marked as **MISCONFIGURED** and automatic import is stopped. If you have automatic export enabled, automatic export continues monitoring your FSx for Lustre file system for changes. However, additional changes aren't synchronized from your FSx for Lustre file system to S3.

To return your data repository association from the **MISCONFIGURED** lifecycle state to the **AVAILABLE** lifecycle state, you must update your data repository association. You can update your data repository association using the [update-data-repository-association](https://docs.aws.amazon.com/cli/latest/reference/fsx/update-data-repository-association.html) CLI command (or the corresponding [UpdateDataRepositoryAssociation](https://docs.aws.amazon.com/fsx/latest/APIReference/API_UpdateDataRepositoryAssociation.html) API operation). The only request parameter that you need is the `AssociationID` of the data repository association that you want to update.

After the data repository association lifecycle state changes to **AVAILABLE**, automatic import (and automatic export if enabled) restarts. Upon restarting, automatic export resumes synchronizing file system changes to S3. To synchronize the metadata of new and changed objects in S3 with your FSx for Lustre file system that weren't imported or are from when the data repository association was in a misconfigured state, run an [import data repository task](import-data-repo-task-dra.md). Import data repository tasks don't synchronize deletes in your S3 bucket with your FSx for Lustre file system. If you want to fully synchronize S3 with your file system (including deletes), you must re-create your file system.

To ensure that delays to importing metadata changes don't exceed 14 days, we recommend that you set an alarm on the `AgeOfOldestQueuedMessage` metric and reduce activity in your S3 bucket if the `AgeOfOldestQueuedMessage` metric grows beyond your alarm threshold. For an FSx for Lustre file system connected to an S3 bucket with a single shard continuously sending the maximum number of possible changes from S3, with only automatic import running on the FSx for Lustre file system, automatic import can process a 7-hour backlog of S3 changes within 14 days.
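
One way to stay ahead of the 14-day cutoff is to alarm well before it. The sketch below compares a sampled `AgeOfOldestQueuedMessage` value (the metric is reported in seconds) against a hypothetical 7-day threshold; retrieving the metric itself (for example, with `aws cloudwatch get-metric-statistics`) is not shown.

```shell
# Illustrative sketch: flag when a sampled AgeOfOldestQueuedMessage
# value (in seconds) crosses an alarm threshold below the 14-day
# cutoff at which automatic import stops. The 7-day threshold is an
# assumption of this sketch, not an FSx default.
check_queue_age() {
    local age_seconds="$1"
    local threshold=$((7 * 24 * 3600))   # 7 days, in seconds
    if [ "$age_seconds" -ge "$threshold" ]; then
        echo "ALARM"
    else
        echo "OK"
    fi
}
```

In practice you would attach this threshold to a CloudWatch alarm rather than polling it yourself.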

Additionally, with a single S3 action, you can generate more changes than automatic import will ever process in 14 days. Examples of these types of actions include, but are not limited to, AWS Snowball uploads to S3 and large-scale deletions. If you make a large-scale change to your S3 bucket that you want synchronized with your FSx for Lustre file system, to prevent automatic import changes from exceeding 14 days, you should delete your file system and re-create it once the S3 change has completed.

If your `AgeOfOldestQueuedMessage` metric is growing, review your S3 bucket `GetRequests`, `PutRequests`, `PostRequests`, and `DeleteRequests` metrics for activity changes that would cause an increase in the rate and/or number of changes being sent to automatic import. For information about available S3 metrics, see [Monitoring Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/monitoring-overview.html) in the *Amazon S3 User Guide*.

For a list of all available FSx for Lustre metrics, see [Monitoring with Amazon CloudWatch](monitoring-cloudwatch.md).

# Using data repository tasks to import changes


The import data repository task imports metadata of objects that are new or changed in your S3 data repository, creating a new file or directory listing for any new object in the S3 data repository. For any object that has been changed in the data repository, the corresponding file or directory listing is updated with the new metadata. No action is taken for objects that have been deleted from the data repository.

Use the following procedures to import metadata changes by using the Amazon FSx console and CLI. Note that you can use one data repository task for multiple DRAs.

## To import metadata changes (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. On the navigation pane, choose **File systems**, then choose your Lustre file system.

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose the data repository associations you want to create the import task for.

1. From the **Actions** menu, choose **Import task**. This choice isn't available if the file system isn't linked to a data repository. The **Create import data repository task** page appears.

1. (Optional) Specify up to 32 directories or files to import from your linked S3 buckets by providing the paths to those directories or files in **Data repository paths to import**.
**Note**  
If a path that you provide isn't valid, the task fails.

1. (Optional) Choose **Enable** under **Completion report** to generate a task completion report after the task completes. A *task completion report* provides details about the files processed by the task that meet the scope provided in **Report scope**. To specify the location for Amazon FSx to deliver the report, enter a relative path on a linked S3 data repository for **Report path**.

1. Choose **Create**. 

   A notification at the top of the **File systems** page shows the task that you just created in progress. 

To view the task status and details, scroll down to the **Data repository tasks** pane in the **Data repository** tab for the file system. The default sort order shows the most recent task at the top of the list.

To view a task summary from this page, choose **Task ID** for the task you just created. The **Summary** page for the task appears. 

## To import metadata changes (CLI)

+ Use the [create-data-repository-task](https://docs.aws.amazon.com/cli/latest/reference/fsx/create-data-repository-task.html) CLI command to import metadata changes on your FSx for Lustre file system. The corresponding API operation is [CreateDataRepositoryTask](https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateDataRepositoryTask.html).

  ```
  $ aws fsx create-data-repository-task \
      --file-system-id fs-0123456789abcdef0 \
      --type IMPORT_METADATA_FROM_REPOSITORY \
      --paths s3://bucketname1/dir1/path1 \
      --report Enabled=true,Path=s3://bucketname1/dir1/path1,Format=REPORT_CSV_20191124,Scope=FAILED_FILES_ONLY
  ```

  After successfully creating the data repository task, Amazon FSx returns the task description as JSON.

After creating the task to import metadata from the linked data repository, you can check the status of the import data repository task. For more information about viewing data repository tasks, see [Accessing data repository tasks](view-data-repo-tasks.md).

# Preloading files into your file system


You can optionally preload the contents of individual files or directories into your file system.

## Importing files using HSM commands


Amazon FSx copies data from your Amazon S3 data repository when a file is first accessed. Because of this approach, the initial read or write to a file incurs a small amount of latency. If your application is sensitive to this latency, and you know which files or directories your application needs to access, you can optionally preload contents of individual files or directories. You do so using the `hsm_restore` command, as follows.

You can use the `hsm_action` command (issued with the `lfs` user utility) to verify that the file's contents have finished loading into the file system. A return value of `NOOP` indicates that the file has successfully been loaded. Run the following commands from a compute instance with the file system mounted. Replace *path/to/file* with the path of the file you're preloading into your file system.

```
sudo lfs hsm_restore path/to/file
sudo lfs hsm_action path/to/file
```
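
To block until a preload finishes, you can poll `hsm_action` until it reports `NOOP`. The sketch below separates the output check into a small helper; the polling loop and the one-second interval are illustrative choices, not FSx requirements.

```shell
# Illustrative sketch: wait until `lfs hsm_action` reports NOOP for a
# file, meaning no HSM operation (such as a restore) is still running.
hsm_idle() {
    # Takes one line of `lfs hsm_action` output; prints "idle" or "busy".
    case "$1" in
        *NOOP*) echo "idle" ;;
        *)      echo "busy" ;;
    esac
}

wait_for_restore() {
    local path="$1"
    until [ "$(hsm_idle "$(sudo lfs hsm_action "$path")")" = "idle" ]; do
        sleep 1   # polling interval is an arbitrary choice
    done
}
```

For example, `wait_for_restore path/to/file` returns once the file's contents have finished loading.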

You can preload your whole file system or an entire directory within your file system by using the following command. (The trailing ampersand makes the command run as a background process.) If you request the preloading of multiple files simultaneously, Amazon FSx loads your files from your Amazon S3 data repository in parallel. If a file has already been loaded to the file system, the `hsm_restore` command doesn't reload it.

```
nohup find local/directory -type f -print0 | xargs -0 -n 1 -P 8 sudo lfs hsm_restore &
```

**Note**  
If your linked S3 bucket is larger than your file system, you should be able to import all the file metadata into your file system. However, you can load only as much actual file data as will fit into the file system's remaining storage space. You'll receive an error if you attempt to access file data when there is no more storage left on the file system. If this occurs, you can increase the amount of storage capacity as needed. For more information, see [Managing storage capacity](managing-storage-capacity.md).
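
Before preloading a large directory, it can help to confirm that the data will fit in the file system's remaining storage. The sketch below performs only the comparison; how you obtain the two byte counts (for example, from `lfs df` for free capacity and `du -sb` for the dataset size) is up to you and is an assumption of this sketch.

```shell
# Illustrative sketch: decide whether preloading `needed` bytes fits
# into `available` bytes of remaining file system capacity. Both
# numbers are inputs you supply from tools such as `lfs df` and `du`.
preload_fits() {
    local needed="$1" available="$2"
    if [ "$needed" -le "$available" ]; then
        echo "fits"
    else
        echo "does not fit"
    fi
}
```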

## Validation step


You can run the bash script listed below to help you discover how many files or objects are in an archived (released) state.

To improve the script's performance, especially across file systems with a large number of files, the number of CPU threads is determined automatically from `nproc` or, if `nproc` isn't available, from the `/proc/cpuinfo` file. That is, you will see faster performance on an Amazon EC2 instance with a higher vCPU count.

1. Set up the bash script.

   ```
   #!/bin/bash
   
   # Check if a directory argument is provided
   if [ $# -ne 1 ]; then
       echo "Usage: $0 /path/to/lustre/mount"
       exit 1
   fi
   
   # Set the root directory from the argument
   ROOT_DIR="$1"
   
   # Check if the provided directory exists
   if [ ! -d "$ROOT_DIR" ]; then
       echo "Error: Directory $ROOT_DIR does not exist."
       exit 1
   fi
   
   # Automatically detect number of CPUs and set threads
   if command -v nproc &> /dev/null; then
       THREADS=$(nproc)
   elif [ -f /proc/cpuinfo ]; then
       THREADS=$(grep -c ^processor /proc/cpuinfo)
   else
       echo "Unable to determine number of CPUs. Defaulting to 1 thread."
       THREADS=1
   fi
   
   # Output file
   OUTPUT_FILE="released_objects_$(date +%Y%m%d_%H%M%S).txt"
   
   echo "Searching in $ROOT_DIR for all released objects using $THREADS threads"
   echo "This may take a while depending on the size of the filesystem..."
   
   # Find all released files in the specified lustre directory using parallel
   # If you get false positives for file names/paths that include the word 'released',
   # you can grep 'released exists archived' instead of just 'released'
   time sudo lfs find "$ROOT_DIR" -type f | \
   parallel --will-cite -j "$THREADS" -n 1000 "sudo lfs hsm_state {} | grep released" > "$OUTPUT_FILE"
   
   echo "Search complete. Released objects are listed in $OUTPUT_FILE"
   echo "Total number of released objects: $(wc -l <"$OUTPUT_FILE")"
   ```

1. Make the script executable:

   ```
   $ chmod +x find_lustre_released_files.sh
   ```

1. Run the script, as in the following example:

   ```
   $ ./find_lustre_released_files.sh /fsxl/sample
   Searching in /fsxl/sample for all released objects using 16 threads
   This may take a while depending on the size of the filesystem...
   real 0m9.906s
   user 0m1.502s
   sys 0m5.653s
   Search complete. Released objects are listed in released_objects_20241121_184537.txt
   Total number of released objects: 30000
   ```

If there are released objects present, then perform a bulk restore on the desired directories to bring the files into FSx for Lustre from S3, as in the following example:

```
$ DIR=/path/to/lustre/mount
$ nohup find $DIR -type f -print0 | xargs -0 -n 1 -P 8 sudo lfs hsm_restore &
```

Note that `hsm_restore` can take a long time when there are millions of files.
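To check whether the bulk restore has finished, you can count the files that are still in the released state. The following is a sketch; the directory path is a hypothetical placeholder, and as noted in the script above, matching on `released` alone can produce false positives for paths that contain that word:

```shell
# Count files under DIR that are still in the "released" HSM state.
# DIR is a hypothetical Lustre path; substitute your restored directory.
# When this count reaches zero, the bulk restore has completed.
DIR=/path/to/lustre/mount
remaining=$({ find "$DIR" -type f -print0 2>/dev/null || true; } \
    | xargs -0 -r -n 50 -P 8 sudo lfs hsm_state 2>/dev/null \
    | awk '/released/ {n++} END {print n+0}')
echo "Files still released: $remaining"
```

The `awk` counter is used instead of `grep -c` so that the pipeline exits cleanly when no files remain released.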

# Exporting changes to the data repository


You can export data changes and POSIX metadata changes from your FSx for Lustre file system to a linked data repository. The associated POSIX metadata includes ownership, permissions, and timestamps.

To export changes from the file system, use one of the following methods.
+ Configure your file system to automatically export new, changed, or deleted files to your linked data repository. For more information, see [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md).
+ Use an on-demand export data repository task. For more information, see [Using data repository tasks to export changes](export-data-repo-task-dra.md).

Automatic export and export data repository tasks cannot run at the same time.

**Important**  
Automatic export will not synchronize the following metadata operations on your file system with S3 if the corresponding objects are stored in S3 Glacier Flexible Retrieval:  
+ `chmod`
+ `chown`
+ `rename`

When you turn on automatic export for a data repository association, your file system automatically exports file data and metadata changes as files are created, modified, or deleted. When you export files or directories using an export data repository task, your file system exports only data files and metadata that were created or modified since the last export.

Both automatic export and export data repository tasks export POSIX metadata. For more information, see [POSIX metadata support for data repositories](posix-metadata-support.md). 

**Important**  
To ensure that FSx for Lustre can export your data to your S3 bucket, the data must be stored in a UTF-8 compatible format.
S3 object keys have a maximum length of 1,024 bytes. FSx for Lustre will not export files whose corresponding S3 object key would be longer than 1,024 bytes.

**Note**  
All objects created by automatic export and export data repository tasks are written using the S3 Standard storage class.

**Topics**
+ [

# Automatically export updates to your S3 bucket
](autoexport-data-repo-dra.md)
+ [

# Using data repository tasks to export changes
](export-data-repo-task-dra.md)
+ [

# Exporting files using HSM commands
](exporting-files-hsm.md)

# Automatically export updates to your S3 bucket


You can configure your FSx for Lustre file system to automatically update the contents of a linked S3 bucket as files are added, changed, or deleted on the file system. FSx for Lustre creates, updates, or deletes the corresponding object in S3 to reflect each change in the file system.

**Note**  
Automatic export isn't available on FSx for Lustre 2.10 file systems or `Scratch 1` file systems.

You can export to a data repository that is in the same AWS Region as the file system or in a different AWS Region.

You can configure automatic export when you create the data repository association, and you can update the automatic export settings at any time using the Amazon FSx console, the AWS CLI, or the AWS API.

**Important**  
If a file is modified in the file system with all automatic export policies enabled and automatic import disabled, the content of that file is always exported to a corresponding object in S3. If an object already exists in the target location, the object is overwritten.
If a file is modified in both the file system and S3, with all automatic import and automatic export policies enabled, either the file in the file system or the object in S3 could be overwritten by the other. It isn't guaranteed that a later edit in one location will overwrite an earlier edit in another location. If you modify the same file in both the file system and the S3 bucket, you should ensure application-level coordination to prevent such conflicts. FSx for Lustre doesn't prevent conflicting writes in multiple locations.

The export policy specifies how you want FSx for Lustre to update your linked S3 bucket as the contents change in the file system. A data repository association can have one of the following automatic export policies:
+ **New** – FSx for Lustre automatically updates the S3 data repository only when a new file, directory, or symlink is created on the file system.
+ **Changed** – FSx for Lustre automatically updates the S3 data repository only when an existing file in the file system is changed. For file content changes, the file must be closed before it's propagated to the S3 repository. Metadata changes (rename, ownership, permissions, and timestamps) are propagated when the operation is done. For renaming changes (including moves), the existing (pre-renamed) S3 object is deleted and a new S3 object is created with the new name.
+ **Deleted** – FSx for Lustre automatically updates the S3 data repository only when a file, directory, or symlink is deleted in the file system.
+ **Any combination of New, Changed, and Deleted** – FSx for Lustre automatically updates the S3 data repository when any of the specified actions occur in the file system. For example, you can specify that the S3 repository is updated when a file is added to (**New**) or removed from (**Deleted**) the file system, but not when a file is changed.
+ **No policy configured** – FSx for Lustre doesn't automatically update the S3 data repository when files are added to, changed in, or deleted from the file system. If you don't configure an export policy, automatic export is disabled. You can still manually export changes by using an export data repository task, as described in [Using data repository tasks to export changes](export-data-repo-task-dra.md).

For most use cases, we recommend that you configure an export policy of **New**, **Changed**, and **Deleted**. This policy ensures that all updates made on your file system are automatically exported to your linked S3 data repository.
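You can set this recommended policy when you create the data repository association. The following is a minimal AWS CLI sketch; the file system ID, file system path, and bucket name are placeholders:

```
$ aws fsx create-data-repository-association \
    --file-system-id fs-0123456789abcdef0 \
    --file-system-path /ns1 \
    --data-repository-path s3://amzn-s3-demo-bucket \
    --s3 "AutoExportPolicy={Events=[NEW,CHANGED,DELETED]}"
```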

We recommend that you [turn on logging](cw-event-logging.md#manage-logging) to CloudWatch Logs to log information about any files or directories that couldn't be exported automatically. Warnings and errors in the log contain information about the failure reason. For more information, see [Data repository event logs](data-repo-event-logs.md).

**Note**  
While access time (`atime`) and modification time (`mtime`) are synchronized with S3 during export operations, changes to these timestamps alone do not trigger automatic export. Only changes to file content or other metadata (such as ownership or permissions) will trigger an automatic export to S3.

## Updating export settings


You can set a file system's export settings to a linked S3 bucket when you create the data repository association. For more information, see [Creating a link to an S3 bucket](create-linked-dra.md).

You can also update the export settings at any time, including the export policy. For more information, see [Updating data repository association settings](update-dra-settings.md).
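For example, you can change the export policy on an existing association with the AWS CLI; the association ID shown is a placeholder:

```
$ aws fsx update-data-repository-association \
    --association-id dra-0123456789abcdef0 \
    --s3 "AutoExportPolicy={Events=[NEW,CHANGED,DELETED]}"
```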

## Monitoring automatic export


You can monitor data repository associations that have automatic export enabled by using a set of metrics published to Amazon CloudWatch. The `AgeOfOldestQueuedMessage` metric represents the age of the oldest update made to the file system that has not yet been exported to S3. If `AgeOfOldestQueuedMessage` is greater than zero for an extended period of time, we recommend temporarily reducing the number of changes (directory renames in particular) that are actively being made to the file system until the message queue has been reduced. For more information, see [FSx for Lustre S3 repository metrics](fs-metrics.md#auto-import-export-metrics).
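You can retrieve recent values of this metric with the CloudWatch CLI. The following sketch uses a placeholder file system ID; check the FSx metrics reference linked above for the exact dimension set published for your file system:

```
$ aws cloudwatch get-metric-statistics \
    --namespace AWS/FSx \
    --metric-name AgeOfOldestQueuedMessage \
    --dimensions Name=FileSystemId,Value=fs-0123456789abcdef0 \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 60 \
    --statistics Maximum
```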

**Important**  
When deleting a data repository association or file system with automatic export enabled, you should first make sure that `AgeOfOldestQueuedMessage` is zero, meaning that there are no changes that have not yet been exported. If `AgeOfOldestQueuedMessage` is greater than zero when you delete your data repository association or file system, the changes that had not yet been exported will not reach your linked S3 bucket. To avoid this, wait for `AgeOfOldestQueuedMessage` to reach zero before deleting your data repository association or file system.

# Using data repository tasks to export changes


The export data repository task exports files that are new or changed in your file system. It creates a new object in S3 for any new file on the file system. For any file that has been modified on the file system, or whose metadata has been modified, the corresponding object in S3 is replaced with a new object with the new data and metadata. No action is taken for files that have been deleted from the file system.

**Note**  
Keep the following in mind when using export data repository tasks:  
The use of wildcards to include or exclude files for export isn't supported.
When you perform an `mv` operation, the moved (target) file is exported to S3 even if its UID, GID, permissions, and content are unchanged.

Use the following procedures to export data and metadata changes on the file system to linked S3 buckets by using the Amazon FSx console and CLI. Note that you can use one data repository task for multiple data repository associations (DRAs).

## To export changes (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. On the navigation pane, choose **File systems**, then choose your Lustre file system.

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose the data repository association you want to create the export task for.

1. For **Actions**, choose **Export task**. This choice isn't available if the file system isn't linked to a data repository on S3. The **Create export data repository task** dialog appears.

1. (Optional) Specify up to 32 directories or files to export from your Amazon FSx file system by providing the paths to those directories or files in **File system paths to export**. The paths you provide need to be relative to the mount point of the file system. If the mount point is `/mnt/fsx` and `/mnt/fsx/path1` is a directory or file on the file system you want to export, then the path to provide is `path1`.
**Note**  
If a path that you provide isn't valid, the task fails.

1. (Optional) Choose **Enable** under **Completion report** to generate a task completion report after the task completes. A *task completion report* provides details about the files processed by the task that meet the scope provided in **Report scope**. To specify the location for Amazon FSx to deliver the report, enter a relative path on the file system's linked S3 data repository for **Report path**.

1. Choose **Create**.

   A notification at the top of the **File systems** page shows the task that you just created in progress. 

To view the task status and details, scroll down to the **Data Repository Tasks** pane in the **Data Repository** tab for the file system. The default sort order shows the most recent task at the top of the list.

To view a task summary from this page, choose **Task ID** for the task you just created. The **Summary** page for the task appears.

## To export changes (CLI)

+ Use the [https://docs.aws.amazon.com/cli/latest/reference/fsx/create-data-repository-task.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/create-data-repository-task.html) CLI command to export data and metadata changes on your FSx for Lustre file system. The corresponding API operation is [https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateDataRepositoryTask.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateDataRepositoryTask.html).

  ```
  $ aws fsx create-data-repository-task \
      --file-system-id fs-0123456789abcdef0 \
      --type EXPORT_TO_REPOSITORY \
      --paths path1,path2/file1 \
      --report Enabled=true
  ```

  After successfully creating the data repository task, Amazon FSx returns the task description as JSON, as shown in the following example.

  ```
  {
      "Task": {
          "TaskId": "task-123f8cd8e330c1321",
          "Type": "EXPORT_TO_REPOSITORY",
          "Lifecycle": "PENDING",
          "FileSystemId": "fs-0123456789abcdef0",
          "Paths": ["path1", "path2/file1"],
          "Report": {
              "Path":"s3://dataset-01/reports",
              "Format":"REPORT_CSV_20191124",
              "Enabled":true,
              "Scope":"FAILED_FILES_ONLY"
          },
          "CreationTime": "1545070680.120",
          "ClientRequestToken": "10192019-drt-12",
          "ResourceARN": "arn:aws:fsx:us-east-1:123456789012:task:task-123f8cd8e330c1321"
      }
  }
  ```

After creating the task to export data to the linked data repository, you can check the status of the export data repository task. For more information about viewing data repository tasks, see [Accessing data repository tasks](view-data-repo-tasks.md).

# Exporting files using HSM commands


**Note**  
To export changes in your FSx for Lustre file system's data and metadata to a durable data repository on Amazon S3, use the automatic export feature described in [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md). You can also use export data repository tasks, described in [Using data repository tasks to export changes](export-data-repo-task-dra.md).

To export an individual file to your data repository and verify that the file has successfully been exported to your data repository, you can run the commands shown following. A return value of `states: (0x00000009) exists archived` indicates that the file has successfully been exported.

```
sudo lfs hsm_archive path/to/export/file
sudo lfs hsm_state path/to/export/file
```

**Note**  
You must run the HSM commands (such as `hsm_archive`) as the root user or using `sudo`.

To export your entire file system or an entire directory in your file system, run the following command. If you export multiple files simultaneously, Amazon FSx for Lustre exports your files to your Amazon S3 data repository in parallel.

```
nohup find local/directory -type f -print0 | xargs -0 -n 1 sudo lfs hsm_archive &
```

To determine whether the export has completed, run the following command.

```
find path/to/export/file -type f -print0 | xargs -0 -n 1 -P 8 sudo lfs hsm_state | awk '!/\<archived\>/ || /\<dirty\>/' | wc -l
```

If the command returns a count of `0`, meaning no files remain to be archived, the export is complete.

# Data repository tasks
Data repository tasks

By using import and export data repository tasks, you can manage the transfer of data and metadata between your FSx for Lustre file system and any of its durable data repositories on Amazon S3.

*Data repository tasks* optimize data and metadata transfers between your FSx for Lustre file system and a data repository on S3. One way that they do this is by tracking changes between your Amazon FSx file system and its linked data repository. They also do this by using parallel transfer techniques to transfer data at speeds up to hundreds of GBps. You create and view data repository tasks using the Amazon FSx console, the AWS CLI, and the Amazon FSx API. 

Data repository tasks maintain the file system's Portable Operating System Interface (POSIX) metadata, including ownership, permissions, and timestamps. Because the tasks maintain this metadata, you can implement and maintain access controls between your FSx for Lustre file system and its linked data repositories.

You can use a release data repository task to free up file system space for new files by releasing files exported to Amazon S3. The released file's content is removed, but the metadata of the released file remains on the file system. Users and applications can still access a released file by reading the file again. When the user or application reads the released file, FSx for Lustre transparently retrieves the file content from Amazon S3.

## Types of data repository tasks


There are three types of data repository tasks:
+ **Export** data repository tasks export from your Lustre file system to a linked S3 bucket.
+ **Import** data repository tasks import from a linked S3 bucket to your Lustre file system.
+ **Release** data repository tasks release files exported to a linked S3 bucket from your Lustre file system.

For more information, see [Creating a data repository task](creating-data-repo-task.md).

**Topics**
+ [

## Types of data repository tasks
](#data-repo-task-types)
+ [

# Understanding a task's status and details
](data-repo-task-status.md)
+ [

# Using data repository tasks
](managing-data-repo-task.md)
+ [

# Working with task completion reports
](task-completion-report.md)
+ [

# Troubleshooting data repository task failures
](failed-tasks.md)

# Understanding a task's status and details
Task status and details

 A data repository task has descriptive information and a lifecycle status.

After a task is created, you can view the following detailed information for a data repository task using the Amazon FSx console, CLI, or API:
+ The task type: 
  + `EXPORT_TO_REPOSITORY` indicates an export task.
  + `IMPORT_METADATA_FROM_REPOSITORY` indicates an import task.
  + `RELEASE_DATA_FROM_FILESYSTEM` indicates a release task.
+ The file system that the task ran on.
+ The task creation time.
+ The task status.
+ The total number of files that the task processed.
+ The total number of files that the task successfully processed.
+ The total number of files that the task failed to process. This value is greater than zero when the task status is FAILED. Detailed information about files that failed is available in a task completion report. For more information, see [Working with task completion reports](task-completion-report.md).
+ The time that the task started.
+ The time that the task status was last updated. Task status is updated every 30 seconds.

 A data repository task can have one of the following statuses:
+ **PENDING** indicates that Amazon FSx has not started the task.
+ **EXECUTING** indicates that Amazon FSx is processing the task.
+ **FAILED** indicates that Amazon FSx didn't successfully process the task. For example, there might be files that the task failed to process. The task details provide more information about the failure. For more information about failed tasks, see [Troubleshooting data repository task failures](failed-tasks.md).
+ **SUCCEEDED** indicates that Amazon FSx completed the task successfully.
+ **CANCELED** indicates that the task was canceled and not completed.
+ **CANCELING** indicates that Amazon FSx is in the process of canceling the task.

Data repository task information is kept for 14 days after the task finishes. For more information about accessing existing data repository tasks, see [Accessing data repository tasks](view-data-repo-tasks.md).

# Using data repository tasks


In the following sections, you can find detailed information about managing data repository tasks. You can create, duplicate, view the details of, and cancel data repository tasks using the Amazon FSx console, CLI, or API.

**Topics**
+ [

# Creating a data repository task
](creating-data-repo-task.md)
+ [

# Duplicating a task
](recreate-task.md)
+ [

# Accessing data repository tasks
](view-data-repo-tasks.md)
+ [

# Canceling a data repository task
](cancel-data-repo-task.md)

# Creating a data repository task


You can create a data repository task by using the Amazon FSx console, CLI, or API. After you create a task, you can view the task's progress and status by using the console, CLI, or API.

You can create three types of data repository tasks:
+ The **Export** data repository task exports from your Lustre file system to a linked S3 bucket. For more information, see [Using data repository tasks to export changes](export-data-repo-task-dra.md).
+ The **Import** data repository task imports from a linked S3 bucket to your Lustre file system. For more information, see [Using data repository tasks to import changes](import-data-repo-task-dra.md).
+ The **Release** data repository task releases files from your Lustre file system that have been exported to a linked S3 bucket. For more information, see [Using data repository tasks to release files](release-files-task.md).

# Duplicating a task


You can duplicate an existing data repository task in the Amazon FSx console. When you duplicate a task, an exact copy of the existing task is displayed in the **Create import data repository task** or **Create export data repository task** page. You can make changes to the paths to export or import, as needed, before creating and running the new task.

**Note**  
A request to run a duplicate task fails if an exact copy of that task is already running. A task counts as an exact copy of a running task if it specifies the same file system path or paths (for an export task) or the same data repository paths (for an import task).

You can duplicate a task from the task details view, the **Data Repository Tasks** pane in the **Data Repository** tab for the file system, or from the **Data repository tasks** page.

**To duplicate an existing task**

1. Choose a task on the **Data Repository Tasks** pane in the **Data Repository** tab for the file system.

1. Choose **Duplicate task**. Depending on which type of task you chose, the **Create import data repository task** or **Create export data repository task** page appears. All settings for the new task are identical to those for the task that you're duplicating.

1. Change or add the paths that you want to import from or export to.

1. Choose **Create**.

# Accessing data repository tasks


After you create a data repository task, you can access the task, and all existing tasks in your account, using the Amazon FSx console, CLI, and API. Amazon FSx provides the following detailed task information: 
+ All existing tasks.
+ All tasks for a specific file system.
+ All tasks for a specific data repository association.
+ All tasks with a specific lifecycle status. For more information about task lifecycle status values, see [Understanding a task's status and details](data-repo-task-status.md).

You can access all existing data repository tasks in your account by using the Amazon FSx console, CLI, or API, as described following.
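For example, to list only the tasks that are still queued or running, you can filter on lifecycle status with the CLI:

```
$ aws fsx describe-data-repository-tasks \
    --filters Name=task-lifecycle,Values=PENDING,EXECUTING
```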

## To view data repository tasks and task details (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. On the navigation pane, choose the file system that you want to view data repository tasks for. The file system details page appears.

1. On the file system details page, choose the **Data repository** tab. Any tasks for this file system appear on the **Data repository tasks** panel.

1. To see a task's details, choose **Task ID** or **Task name** in the **Data repository tasks** panel. The task detail page appears.  
![\[Data repository tasks page\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/images/task-details-rprt.png)

## To retrieve data repository tasks and task details (CLI)


Using the Amazon FSx [https://docs.aws.amazon.com/cli/latest/reference/fsx/describe-data-repository-tasks.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/describe-data-repository-tasks.html) CLI command, you can view all the data repository tasks, and their details, in your account. [https://docs.aws.amazon.com/fsx/latest/APIReference/API_DescribeDataRepositoryTasks.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_DescribeDataRepositoryTasks.html) is the equivalent API command.
+ Use the following command to view all data repository task objects in your account.

  ```
  aws fsx describe-data-repository-tasks
  ```

  If the command is successful, Amazon FSx returns the response in JSON format.

  ```
  {
      "DataRepositoryTasks": [
          {
              "Lifecycle": "EXECUTING",
              "Paths": [],
              "Report": {
                  "Path":"s3://dataset-01/reports",
                  "Format":"REPORT_CSV_20191124",
                  "Enabled":true,
                  "Scope":"FAILED_FILES_ONLY"
              },
              "StartTime": 1591863862.288,
              "EndTime": ,
              "Type": "EXPORT_TO_REPOSITORY",
              "Tags": [],
              "TaskId": "task-0123456789abcdef3",
              "Status": {
                  "SucceededCount": 4255,
                  "TotalCount": 4200,
                  "FailedCount": 55,
                  "LastUpdatedTime": 1571863875.289
              },
              "FileSystemId": "fs-0123456789a7",
              "CreationTime": 1571863850.075,
              "ResourceARN": "arn:aws:fsx:us-east-1:1234567890:task/task-0123456789abcdef3"
          },
          {
              "Lifecycle": "FAILED",
              "Paths": [],
              "Report": {
                  "Enabled": false,
              },
              "StartTime": 1571863862.288,
              "EndTime": 1571863905.292,
              "Type": "EXPORT_TO_REPOSITORY",
              "Tags": [],
              "TaskId": "task-0123456789abcdef1",
              "Status": {
                  "SucceededCount": 1153,
                  "TotalCount": 1156,
                  "FailedCount": 3,
                  "LastUpdatedTime": 1571863875.289
              },
              "FileSystemId": "fs-0123456789abcdef0",
              "CreationTime": 1571863850.075,
              "ResourceARN": "arn:aws:fsx:us-east-1:1234567890:task/task-0123456789abcdef1"
          },
          {
              "Lifecycle": "SUCCEEDED",
              "Paths": [],
              "Report": {
                  "Path":"s3://dataset-04/reports",
                  "Format":"REPORT_CSV_20191124",
                  "Enabled":true,
                  "Scope":"FAILED_FILES_ONLY"
              },
              "StartTime": 1571863862.288,
              "EndTime": 1571863905.292,
              "Type": "EXPORT_TO_REPOSITORY",
              "Tags": [],
              "TaskId": "task-04299453935122318",
              "Status": {
                  "SucceededCount": 258,
                  "TotalCount": 258,
                  "FailedCount": 0,
                  "LastUpdatedTime": 1771848950.012,
              },
              "FileSystemId": "fs-0123456789abcdef0",
              "CreationTime": 1771848950.012,
              "ResourceARN": "arn:aws:fsx:us-east-1:1234567890:task/task-0123456789abcdef0"
          }
      ]
  }
  ```

## Viewing tasks by file system


You can view all tasks for a specific file system using the Amazon FSx console, CLI, or API, as described following.

### To view tasks by file system (console)


1. Choose **File systems** on the navigation pane. The **File systems** page appears.

1. Choose the file system that you want to view data repository tasks for. The file system details page appears.

1. On the file system details page, choose the **Data repository** tab. Any tasks for this file system appear on the **Data repository tasks** panel.

### To retrieve tasks by file system (CLI)

+ Use the following command to view all data repository tasks for file system `fs-0123456789abcdef0`.

  ```
  aws fsx describe-data-repository-tasks \
      --filters Name=file-system-id,Values=fs-0123456789abcdef0
  ```

  If the command is successful, Amazon FSx returns the response in JSON format.

  ```
  {
      "DataRepositoryTasks": [
          {
              "Lifecycle": "FAILED",
              "Paths": [],
              "Report": {
                  "Path":"s3://dataset-04/reports",
                  "Format":"REPORT_CSV_20191124",
                  "Enabled":true,
                  "Scope":"FAILED_FILES_ONLY"
              },
              "StartTime": 1571863862.288,
              "EndTime": 1571863905.292,
              "Type": "EXPORT_TO_REPOSITORY",
              "Tags": [],
              "TaskId": "task-0123456789abcdef1",
              "Status": {
                  "SucceededCount": 1153,
                  "TotalCount": 1156,
                  "FailedCount": 3,
                  "LastUpdatedTime": 1571863875.289
              },
              "FileSystemId": "fs-0123456789abcdef0",
              "CreationTime": 1571863850.075,
              "ResourceARN": "arn:aws:fsx:us-east-1:1234567890:task/task-0123456789abcdef1"
          },
          {
              "Lifecycle": "SUCCEEDED",
              "Paths": [],
              "Report": {
                  "Enabled": false,
              },
              "StartTime": 1571863862.288,
              "EndTime": 1571863905.292,
              "Type": "EXPORT_TO_REPOSITORY",
              "Tags": [],
              "TaskId": "task-0123456789abcdef0",
              "Status": {
                  "SucceededCount": 258,
                  "TotalCount": 258,
                  "FailedCount": 0,
                  "LastUpdatedTime": 1771848950.012,
              },
              "FileSystemId": "fs-0123456789abcdef0",
              "CreationTime": 1771848950.012,
              "ResourceARN": "arn:aws:fsx:us-east-1:1234567890:task/task-0123456789abcdef0"
          }
      ]
  }
  ```

# Canceling a data repository task


You can cancel a data repository task while it's in either the PENDING or EXECUTING state. When you cancel a task, the following occurs:
+ Amazon FSx doesn't process any files that are in the queue to be processed.
+ Amazon FSx continues processing any files that are currently in process.
+ Amazon FSx doesn't revert any files that the task already processed.

## To cancel a data repository task (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. Choose the file system for which you want to cancel a data repository task.

1. Open the **Data Repository** tab and scroll down to view the **Data Repository Tasks** panel.

1. Choose **Task ID** or **Task name** for the task that you want to cancel.

1. Choose **Cancel task** to cancel the task.

1. Enter the task ID to confirm the cancellation request.

## To cancel a data repository task (CLI)


Use the Amazon FSx [https://docs.aws.amazon.com/cli/latest/reference/fsx/cancel-data-repository-task.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/cancel-data-repository-task.html) CLI command to cancel a task. [https://docs.aws.amazon.com/fsx/latest/APIReference/API_CancelDataRepositoryTask.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_CancelDataRepositoryTask.html) is the equivalent API command.
+ Use the following command to cancel a data repository task.

  ```
  aws fsx cancel-data-repository-task \
      --task-id task-0123456789abcdef0
  ```

  If the command is successful, Amazon FSx returns the response in JSON format.

  ```
  {
      "Status": "CANCELING",
      "TaskId": "task-0123456789abcdef0"
  }
  ```

# Working with task completion reports


A *task completion report* provides details about the results of an export, import, or release data repository task. The report includes results for the files processed by the task that match the scope of the report. You can specify whether to generate a report for a task by using the `Enabled` parameter. 

Amazon FSx delivers the report to the file system's linked data repository in Amazon S3, using the path that you specify when you enable the report for a task. The report's file name is `report.csv` for import tasks and `failures.csv` for export or release tasks.

The report format is a comma-separated value (CSV) file that has three fields: `FilePath`, `FileStatus`, and `ErrorCode`.

Reports are encoded using RFC-4180-format encoding as follows:
+ Paths starting with any of the following characters are contained in single quotation marks: `@ + - =` 
+ Strings that contain at least one of the following characters are contained in double quotation marks: `" ,`
+ All double quotation marks are escaped with an additional double quotation mark.

Following are a few examples of the report encoding:
+ `@filename.txt` becomes `"""@filename.txt"""`
+ `+filename.txt` becomes `"""+filename.txt"""`
+ `file,name.txt` becomes `"file,name.txt"`
+ `file"name.txt` becomes `"file""name.txt"`

For more information about RFC-4180 encoding, see [RFC-4180 - Common Format and MIME Type for Comma-Separated Values (CSV) Files](https://tools.ietf.org/html/rfc4180) on the IETF website.
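To make the encoding rules concrete, the following bash sketch reproduces the worked examples above. The function name `encode_field` is illustrative and not part of any AWS tooling; it implements the quoting behavior shown in the examples, not an official encoder.

```shell
# encode_field: a sketch (bash; the function name is illustrative) that
# reproduces the report encoding shown in the examples above. Paths starting
# with @ + - = are first wrapped in quotation marks; any field that then
# contains a double quotation mark or comma has its quotation marks doubled
# and is wrapped in double quotation marks.
encode_field() {
  local f=$1
  case $f in
    [@+=-]*) f="\"$f\"" ;;
  esac
  case $f in
    *[\",]*)
      f=${f//\"/\"\"}   # double every embedded quotation mark
      f="\"$f\""
      ;;
  esac
  printf '%s\n' "$f"
}

encode_field '@filename.txt'   # "" "@filename.txt" "" run together
encode_field 'file,name.txt'   # "file,name.txt"
```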

The following is an example of the information provided in a task completion report that includes only failed files.

```
myRestrictedFile,failed,S3AccessDenied
dir1/myLargeFile,failed,FileSizeTooLarge
dir2/anotherLargeFile,failed,FileSizeTooLarge
```
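Because the report is plain CSV, you can summarize it with standard tools once you download it. The bucket and report path in the comment below are hypothetical placeholders; the inline report content recreates the sample above, and the simple `awk` split assumes paths without embedded commas.

```shell
# In practice, first download the report from your report path, for example:
#   aws s3 cp s3://amzn-s3-demo-bucket/my-report-path/failures.csv .
# (bucket and path above are placeholders). Here we recreate the sample
# report from this section instead.
cat > failures.csv <<'EOF'
myRestrictedFile,failed,S3AccessDenied
dir1/myLargeFile,failed,FileSizeTooLarge
dir2/anotherLargeFile,failed,FileSizeTooLarge
EOF

# Tally failed files per error code (the third CSV field). This naive split
# on commas assumes unquoted paths; quoted paths containing commas would
# need a CSV-aware parser.
awk -F',' '{count[$3]++} END {for (code in count) print code, count[code]}' failures.csv
```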

For more information about task failures and how to resolve them, see [Troubleshooting data repository task failures](failed-tasks.md).

# Troubleshooting data repository task failures

You can [turn on logging](cw-event-logging.md) to CloudWatch Logs to log information about any failures experienced while importing or exporting files using data repository tasks. For information about CloudWatch Logs event logs, see [Data repository event logs](data-repo-event-logs.md).

When a data repository task fails, you can find the number of files that Amazon FSx failed to process in **Files failed to export** on the console's **Task status** page. Or you can use the CLI or API and view the task's `Status: FailedCount` property. For information about accessing this information, see [Accessing data repository tasks](view-data-repo-tasks.md). 

For data repository tasks, Amazon FSx also optionally provides information about the specific files and directories that failed in a completion report. The task completion report contains the file or directory path on the Lustre file system that failed, its status, and the failure reason. For more information, see [Working with task completion reports](task-completion-report.md).

A data repository task can fail for several reasons, including those listed in the following table.


| Error Code | Explanation | 
| --- | --- | 
|  `FileSizeTooLarge`  |  The maximum object size supported by Amazon S3 is 5 TiB.  | 
|  `InternalError`  |  An error occurred within the Amazon FSx file system for an import, export, or release task. Generally, this error code means that the Amazon FSx file system that the failed task ran on is in a FAILED lifecycle state. When this occurs, the affected files might not be recoverable due to data loss. Otherwise, you can use hierarchical storage management (HSM) commands to export the files and directories to the data repository on S3. For more information, see [Exporting files using HSM commands](exporting-files-hsm.md).  | 
|  `OperationNotPermitted`  | Amazon FSx was unable to release the file because it has not been exported to a linked S3 bucket. You must use automatic export or export data repository tasks to ensure that your files are first exported to your linked Amazon S3 bucket.  | 
|  `PathSizeTooLong`  |  The export path is too long. The maximum object key length supported by S3 is 1,024 characters.  | 
|  `ResourceBusy`  | Amazon FSx was unable to export or release the file because it was being accessed by another client on the file system. You can retry the DataRepositoryTask after your workflow has finished writing to the file.  | 
|  `S3AccessDenied`  |  Access was denied to Amazon S3 for a data repository export or import task. For export tasks, the Amazon FSx file system must have permission to perform the `s3:PutObject` operation to export to a linked data repository on S3. This permission is granted in the `AWSServiceRoleForFSxS3Access_fs-0123456789abcdef0` service-linked role. For more information, see [Using service-linked roles for Amazon FSx](using-service-linked-roles.md). For export tasks, because the export task requires data to flow outside a file system's VPC, this error can occur if the target repository has a bucket policy that contains one of the `aws:SourceVpc` or `aws:SourceVpce` IAM global condition keys. For import tasks, the Amazon FSx file system must have permission to perform the `s3:HeadObject` and `s3:GetObject` operations to import from a linked data repository on S3. For import tasks, if your S3 bucket uses server-side encryption with customer managed keys stored in AWS Key Management Service (SSE-KMS), you must follow the policy configurations in [Working with server-side encrypted Amazon S3 buckets](s3-server-side-encryption-support.md). If your S3 bucket contains objects uploaded from a different AWS account than your file system's linked S3 bucket account, we recommend that you enable the S3 Object Ownership feature for your S3 bucket to ensure that your data repository tasks can modify S3 metadata or overwrite S3 objects regardless of which account uploaded them. This feature enables you to take ownership of new objects that other AWS accounts upload to your bucket, by forcing uploads to provide the `--acl bucket-owner-full-control` canned ACL. You enable S3 Object Ownership by choosing the **Bucket owner preferred** option in your S3 bucket. 
For more information, see [Controlling ownership of uploaded objects using S3 Object Ownership](https://docs.aws.amazon.com/AmazonS3/latest/userguide/about-object-ownership.html) in the *Amazon S3 User Guide*.  | 
|  `S3Error`  |  Amazon FSx encountered an S3-related error that wasn't `S3AccessDenied`.  | 
|  `S3FileDeleted`  | Amazon FSx was unable to export a hard link file because the source file doesn't exist in the data repository. | 
|  `S3ObjectInUnsupportedTier`  | Amazon FSx successfully imported a non-symlink object from an S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage class. The `FileStatus` will be `succeeded with warning` in the task completion report. The warning indicates that to retrieve the data, you must restore the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive object first and then use an `hsm_restore` command to import the object.  | 
|  `S3ObjectNotFound`  | Amazon FSx was unable to import or export the file because it doesn't exist in the data repository. | 
|  `S3ObjectPathNotPosixCompliant`  |  The Amazon S3 object exists but can't be imported because it isn't a POSIX-compliant object. For information about supported POSIX metadata, see [POSIX metadata support for data repositories](posix-metadata-support.md).  | 
|  `S3ObjectUpdateInProgressFromFileRename`  | Amazon FSx was unable to release the file because automatic export is processing a rename of the file. The automatic export rename process must finish before the file can be released.  | 
|  `S3SymlinkInUnsupportedTier`  | Amazon FSx was unable to import a symlink object because it's in an Amazon S3 storage class that is not supported, such as an S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage class. The `FileStatus` will be `failed` in the task completion report. | 
|  `SourceObjectDeletedBeforeReleasing`  | Amazon FSx was unable to release the file from the file system because the file was deleted from the data repository before it could be released. | 

# Releasing files


Release data repository tasks release file data from your FSx for Lustre file system to free up space for new files. Releasing a file retains the file listing and metadata, but removes the local copy of that file's contents. If a user or application accesses a released file, the data is automatically and transparently loaded back onto your file system from your linked Amazon S3 bucket.

**Note**  
Release data repository tasks are not available on FSx for Lustre 2.10 file systems.

The parameters **File system paths to release** and **Minimum duration since last access** determine which files will be released.
+ **File system paths to release**: Specifies the path from which files will be released.
+ **Minimum duration since last access**: Specifies the duration, in days, such that any file not accessed in that duration should be released. The duration since a file was last accessed is calculated by taking the difference between the release task create time and the last time a file was accessed (maximum value of `atime`, `mtime`, and `ctime`).

Files will only be released along the file path if they have been exported to S3 and have a duration since last access that is greater than the minimum duration since last access value. Providing a minimum duration since last access of `0` days will release files independent of their duration since last access.
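To make the eligibility rule concrete, here is a small bash sketch of that check. The function name and its arguments are ours for illustration only; times are Unix epoch seconds.

```shell
# eligible_for_release: a sketch (bash) of the release rule described above.
# A file qualifies only if it is already exported to S3 and the time between
# the task creation time and its last access (the newest of atime, mtime,
# and ctime) exceeds the minimum duration. All times are epoch seconds.
eligible_for_release() {
  local exported=$1 task_create=$2 atime=$3 mtime=$4 ctime=$5 min_days=$6
  local last=$atime
  [ "$mtime" -gt "$last" ] && last=$mtime
  [ "$ctime" -gt "$last" ] && last=$ctime
  [ "$exported" = "yes" ] && [ $(( task_create - last )) -gt $(( min_days * 86400 )) ]
}
```

For example, with a 10-day minimum, a file last accessed 23 days before task creation is released, while one accessed 2 days before is not.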

**Note**  
The use of wildcards to include or exclude files for release is not supported.

Release data repository tasks will only release data from files that have already been exported to a linked S3 data repository. You can export data to S3 using either the automatic export feature, an export data repository task, or HSM commands. To verify that a file has been exported to your data repository, you can run the following command. A return value of `states: (0x00000009) exists archived` indicates that the file has successfully been exported.

```
sudo lfs hsm_state path/to/export/file
```

**Note**  
You must run the HSM command as the root user or using `sudo`.
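A script can check for the `archived` flag in the command's output before releasing a file. The function name below is illustrative; in practice, you would pass it the output of `sudo lfs hsm_state` as shown in the usage comment.

```shell
# is_exported: a sketch (bash) that checks `lfs hsm_state` output for the
# "archived" flag, which indicates the file's contents have been exported
# to the linked S3 bucket.
# Usage in practice: is_exported "$(sudo lfs hsm_state /mnt/fsx/myfile)"
is_exported() {
  case $1 in
    *archived*) return 0 ;;
    *)          return 1 ;;
  esac
}
```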

To release file data on a regular interval, you can schedule a recurring release data repository task using Amazon EventBridge Scheduler. For more information, see [Getting started with EventBridge Scheduler](https://docs.aws.amazon.com/scheduler/latest/UserGuide/getting-started.html) in the *Amazon EventBridge Scheduler User Guide*.

**Topics**
+ [

# Using data repository tasks to release files
](release-files-task.md)

# Using data repository tasks to release files


Use the following procedures to create tasks that release files from the file system by using the Amazon FSx console and CLI. Releasing a file retains the file listing and metadata, but removes the local copy of that file's contents.

## To release files (console)


1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. In the left navigation pane, choose **File systems**, then choose your Lustre file system.

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose the data repository association that you want to create the release task for.

1. For **Actions**, choose **Create release task**. This choice is available only if the file system is linked to a data repository on S3. The **Create release data repository task** dialog appears.

1. In **File system paths to release**, specify up to 32 directories or files to release from your Amazon FSx file system by providing the paths to those directories or files. The paths that you provide must be relative to the mount point of the file system. For example, if the mount point is `/mnt/fsx` and `/mnt/fsx/path1` is a file on the file system that you want to release, then the path to provide is `path1`. To release all files in the file system, specify a forward slash (/) as the path.
**Note**  
If a path that you provide isn't valid, the task fails.

1. For **Minimum duration since last access**, specify the duration, in days, such that any file not accessed in that duration should be released. Last access time is calculated using the maximum value of `atime`, `mtime`, and `ctime`. Files with a last access duration period greater than the minimum duration since last access (relative to the task create time) will be released. Files with a last access duration period less than this number of days won't be released, even if they are in the **File system paths to release** field. Provide a duration of `0` days to release files independent of duration since last access.

1. (Optional) Under **Completion report**, choose **Enable** to generate a task completion report that provides details about the files that meet the scope provided in **Report scope**. To specify a location for Amazon FSx to deliver the report, enter a relative path on the file system's linked S3 data repository for **Report path**.

1. Choose **Create data repository task**.

   A notification at the top of the **File systems** page shows the task that you just created in progress. 

To view the task status and details, in the **Data Repository** tab, scroll down to **Data Repository Tasks**. The default sort order shows the most recent task at the top of the list.

To view a task summary from this page, choose **Task ID** for the task you just created.

## To release files (CLI)

+ Use the [https://docs.aws.amazon.com/cli/latest/reference/fsx/create-data-repository-task.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/create-data-repository-task.html) CLI command to create a task that releases files on your FSx for Lustre file system. The corresponding API operation is [https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateDataRepositoryTask.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateDataRepositoryTask.html).

  Set the following parameters:
  + Set `--file-system-id` to the ID of the file system that you are releasing files from.
  + Set `--paths` to the paths on the file system from which the data will be released. If a directory is specified, files within the directory are released. If a file path is specified, only that file is released. To release all files in the file system that have been exported to a linked S3 bucket, specify a forward slash (/) for the path.
  + Set `--type` to `RELEASE_DATA_FROM_FILESYSTEM`.
  + Set the `--release-configuration DurationSinceLastAccess` options as follows:
    + `Unit` – Set to `DAYS`.
    + `Value` – Specify an integer that represents the duration, in days, such that any file not accessed in that duration should be released. Files that were accessed during a period less than this number of days won't be released, even if they are in the `--paths` parameter. Provide a duration of `0` days to release files independent of duration since last access.

  This sample command specifies that files that have been exported to a linked S3 bucket and meet the `--release-configuration` criteria will be released from the directories in the specified paths.

  ```
  $ aws fsx create-data-repository-task \
      --file-system-id fs-0123456789abcdef0 \
      --type RELEASE_DATA_FROM_FILESYSTEM \
      --paths path1,path2/file1 \
      --release-configuration '{"DurationSinceLastAccess":{"Unit":"DAYS","Value":10}}' \
      --report Enabled=false
  ```

  After successfully creating the data repository task, Amazon FSx returns the task description as JSON.

After creating the task to release files, you can check the status of the task. For more information about viewing data repository tasks, see [Accessing data repository tasks](view-data-repo-tasks.md).

# Using Amazon FSx with your on-premises data


You can use FSx for Lustre to process your on-premises data with in-cloud compute instances. FSx for Lustre supports access over AWS Direct Connect and VPN connections, enabling you to mount your file systems from on-premises clients.

**To use FSx for Lustre with your on-premises data**

1. Create a file system. For more information, see [Step 1: Create your FSx for Lustre file system](getting-started.md#getting-started-step1) in the getting started exercise.

1. Mount the file system from on-premises clients. For more information, see [Mounting Amazon FSx file systems from on-premises or a peered Amazon VPC](mounting-on-premises.md).

1. Copy the data that you want to process into your FSx for Lustre file system.

1. Run your compute-intensive workload on in-cloud Amazon EC2 instances mounting your file system.

1. When you're finished, copy the final results from your file system back to your on-premises data location, and delete your FSx for Lustre file system.

# Data repository event logs

You can turn on logging to CloudWatch Logs to log information about any failures experienced during import, export, and HSM restore events, including failures from data repository tasks. For more information, see [Logging with Amazon CloudWatch Logs](cw-event-logging.md).

**Note**  
When a data repository task fails, Amazon FSx also writes failure information to the task completion report. For more information about failure information in completion reports, see [Troubleshooting data repository task failures](failed-tasks.md).

**Topics**
+ [

## Import events
](#import-event-logging)
+ [

## Export events
](#export-event-logging)
+ [

## HSM restore events
](#hsm-restore-event-logging)

## Import events



| Error type | Log level | Log message | Root cause | Error code in completion report | 
| --- | --- | --- | --- | --- | 
| List objects failure | ERROR | Failed to list S3 objects in S3 bucket bucket_name with prefix prefix. | Amazon FSx failed to list S3 objects in the S3 bucket. This can happen if the S3 bucket policy does not provide sufficient permissions to Amazon FSx. | N/A | 
| Unsupported S3 storage class | WARN | Failed to import S3 object with key key_value in S3 bucket bucket_name due to an S3 object in an unsupported tier S3_tier_name. | Amazon FSx was unable to import an S3 object because it's in an Amazon S3 storage class that is not supported, such as S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage class. | S3ObjectInUnsupportedTier | 
| Unsupported symlink storage class | ERROR | Failed to import S3 object with key key_value in S3 bucket bucket_name due to an S3 symlink object in an unsupported tier S3_tier_name. | Amazon FSx was unable to import a symlink object because it's in an Amazon S3 storage class that is not supported, such as S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage class. | S3SymlinkInUnsupportedTier | 
| S3 access denied | ERROR | Failed to import S3 object with key key_value in S3 bucket bucket_name because access to the S3 object was denied. | Access was denied to Amazon S3 for a data repository import task. For import tasks, the Amazon FSx file system must have permission to perform the `s3:HeadObject` and `s3:GetObject` operations to import from a linked data repository on S3. For import tasks, if your S3 bucket uses server-side encryption with customer managed keys stored in AWS Key Management Service (SSE-KMS), you must follow the policy configurations in [Working with server-side encrypted Amazon S3 buckets](s3-server-side-encryption-support.md).  | S3AccessDenied | 
| Delete access denied | ERROR | Failed to delete local file for S3 object with key key_value in S3 bucket bucket_name because access to the S3 object was denied. | Automatic import was denied access to an S3 object. | N/A | 
| Non-POSIX compliant object | ERROR | Failed to import S3 object with key key_value in S3 bucket bucket_name because S3 object is not POSIX compliant. |  The Amazon S3 object exists but can't be imported because it isn't a POSIX-compliant object. For information about supported POSIX metadata, see [POSIX metadata support for data repositories](posix-metadata-support.md).  | S3ObjectPathNotPosixCompliant | 
| Object type mismatch | ERROR | Failed to import S3 object with key key_value in S3 bucket bucket_name because an S3 object with the same name has already been imported into the file system. | The S3 object being imported is of a different type (file or directory) than an existing object with the same name in the file system. | S3ObjectTypeMismatch | 
| Directory metadata update failure | ERROR | Failed to update local directory metadata due to an internal error. | Directory metadata could not be imported due to an internal error. | N/A | 
| S3 object not found | ERROR | Failed to import S3 object with key key_value because it was not found in S3 bucket bucket_name. | Amazon FSx was unable to import file metadata because the corresponding object doesn't exist in the data repository. | S3FileDeleted | 
| S3 bucket not found | ERROR | Failed to import S3 object with key key_value in S3 bucket bucket_name due to bucket does not exist. | Amazon FSx cannot automatically import an S3 object to the file system because the S3 bucket no longer exists. | N/A | 
| S3 bucket not found | ERROR | Failed to delete local file for S3 object with key key_value in S3 bucket bucket_name due to bucket does not exist. | Amazon FSx cannot delete a file linked to an S3 object on the file system because the S3 bucket no longer exists. | N/A | 
| Directory creation failure | ERROR | Failed to create local directory due to an internal error. | Amazon FSx failed to automatically import a directory creation on the file system due to an internal error. | N/A | 
| Disk space full | ERROR | Failed to import S3 object with key key_value in S3 bucket bucket_name because the file system is full. | The file system ran out of disk space on the metadata server(s) while creating a file or directory. | N/A | 

## Export events



| Error type | Log level | Log message | Root cause | Error code in completion report | 
| --- | --- | --- | --- | --- | 
| Access denied | ERROR | Failed to export file because access was denied to S3 object with key key_value in S3 bucket bucket_name. | Access was denied to Amazon S3 for a data repository export task. For export tasks, the Amazon FSx file system must have permission to perform the `s3:PutObject` operation to export to a linked data repository on S3. This permission is granted in the `AWSServiceRoleForFSxS3Access_fs-0123456789abcdef0` service-linked role. For more information, see [Using service-linked roles for Amazon FSx](using-service-linked-roles.md). Because the export task requires data to flow outside a file system's VPC, this error can occur if the target repository has a bucket policy that contains one of the `aws:SourceVpc` or `aws:SourceVpce` IAM global condition keys. If your S3 bucket contains objects uploaded from a different AWS account than your file system's linked S3 bucket account, we recommend that you enable the S3 Object Ownership feature for your S3 bucket to ensure that your data repository tasks can modify S3 metadata or overwrite S3 objects regardless of which account uploaded them. This feature enables you to take ownership of new objects that other AWS accounts upload to your bucket, by forcing uploads to provide the `--acl bucket-owner-full-control` canned ACL. You enable S3 Object Ownership by choosing the **Bucket owner preferred** option in your S3 bucket. For more information, see [Controlling ownership of uploaded objects using S3 Object Ownership](https://docs.aws.amazon.com/AmazonS3/latest/userguide/about-object-ownership.html) in the *Amazon S3 User Guide*.  | S3AccessDenied | 
| Export path too long | ERROR | Failed to export file because the local file path size exceeds the maximum object key length supported by S3. | The export path is too long. The maximum object key length supported by S3 is 1,024 characters.  | PathSizeTooLong | 
| File too large | ERROR | Failed to export file because the file size exceeds the maximum supported S3 objects size. | The maximum object size supported by Amazon S3 is 5 TiB. | FileSizeTooLarge | 
| KMS key not found | ERROR | Failed to export file for S3 object with key key_value in S3 bucket bucket_name because the bucket's KMS key was not found. | Amazon FSx was unable to export the file because the AWS KMS key couldn't be found. Be sure to use a key that's in the same AWS Region as the S3 bucket. For more information on creating KMS keys, see [Creating keys](https://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html) in the *AWS Key Management Service Developer Guide*. | N/A | 
| Resource busy | ERROR | Failed to export file because it is being used by another process. | Amazon FSx was unable to export the file because it was being modified by another client on the file system. You can retry the task after your workflow has finished writing to the file. | ResourceBusy | 
| File released | WARN | Export skipped: Local file is in released state and a linked S3 object with key key_value was not found in bucket bucket_name. | Amazon FSx was unable to export the file because it was in a released state on the file system. | N/A | 
| Data repository path mismatch | WARN | Export skipped: local file does not belong to a data repository linked file system path. | Amazon FSx was unable to export because the object doesn't belong to a file system path that is linked to a data repository. | N/A | 
| Internal failure | ERROR | Automatic export encountered an internal error while exporting a file system object | The export failed because of an internal (automatic export or Lustre-level) error. | N/A | 
| Completion report upload failure | ERROR | Failed to upload data repository task completion report into bucket_name | Amazon FSx was unable to upload the completion report. | N/A | 
| Completion report validation failure | ERROR | Failed to upload data repository task completion report into bucket bucket_name because the completion report path report_path does not belong to a data repository associated with this file system | Amazon FSx was unable to upload the completion report because the customer-provided S3 path does not belong to a linked data repository. | N/A | 

## HSM restore events



| Error type | Log level | Log message | Root cause | 
| --- | --- | --- | --- | 
| Access denied | ERROR | Failed to restore file because access was denied to S3 object object_name in S3 bucket bucket_name. | Access was denied to Amazon S3 when attempting to restore a file using HSM commands. The file system must have permission to perform the `s3:HeadObject` and `s3:GetObject` operations to restore from the linked data repository on S3. | 
| Unsupported S3 storage class | WARN | Failed to restore file because S3 object object_name in bucket bucket_name was in an unsupported S3_storage_class_name. | Amazon FSx was unable to restore the file because the corresponding S3 object is in an unsupported S3 storage class, such as S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive. You must first restore the object from the Glacier storage class before using `hsm_restore`. | 
| S3 object not found | ERROR | Failed to restore file because S3 object with key key_value was not found in S3 bucket bucket_name. | Amazon FSx was unable to restore the file because the corresponding S3 object doesn't exist in the data repository. | 
| S3 bucket not found | ERROR | Failed to restore file because S3 bucket bucket_name does not exist. | Amazon FSx cannot restore the file because the linked S3 bucket no longer exists. | 
| Disk space full | ERROR | Failed to restore file because there was no available storage space on the file system. | The file system ran out of available storage space while attempting to restore the file data from S3. Consider increasing the file system's storage capacity or releasing files to free up space. | 

# Working with older deployment types


This section applies to file systems with the `Scratch 1` deployment type, and also to file systems with `Scratch 2` or `Persistent 1` deployment types that do not use data repository associations. Note that automatic export and support for multiple data repositories aren't available on FSx for Lustre file systems that don't use data repository associations.

**Topics**
+ [

## Link your file system to an Amazon S3 bucket
](#legacy-link-to-S3)
+ [

## Automatically import updates from your S3 bucket
](#legacy-auto-import-from-s3)

## Link your file system to an Amazon S3 bucket


When you create an Amazon FSx for Lustre file system, you can link it to a durable data repository in Amazon S3. Before you create your file system, make sure that you have already created the Amazon S3 bucket to which you are linking. In the **Create file system** wizard, you set the following data repository configuration properties in the optional **Data Repository Import/Export** pane.
+ Choose how Amazon FSx keeps your file and directory listing up to date as you add or modify objects in your S3 bucket after the file system is created. For more information, see [Automatically import updates from your S3 bucket](#legacy-auto-import-from-s3).
+ **Import bucket**: Enter the name of the S3 bucket that you are using for the linked repository.
+ **Import prefix**: Enter an optional import prefix if you want to import only some file and directory listings of data in your S3 bucket into your file system. The import prefix defines where in your S3 bucket to import data from.
+ **Export prefix**: Defines where Amazon FSx exports the contents of your file system to your linked S3 bucket.

You can have a 1:1 mapping where Amazon FSx exports data from your FSx for Lustre file system back to the same directories on the S3 bucket that it was imported from. To have a 1:1 mapping, specify an export path to the S3 bucket without any prefixes when you create your file system.
+ When you create a file system using the console, choose the **Export prefix > A prefix you specify** option, and keep the prefix field blank.
+ When you create a file system using the AWS CLI or API, specify the export path as the name of the S3 bucket without any additional prefixes, for example, `ExportPath=s3://amzn-s3-demo-bucket/`.

Using this method, you can include an import prefix when you specify the import path, and it doesn't impact a 1:1 mapping for exports.

### Creating file systems linked to an S3 bucket


The following procedures walk you through the process of creating an Amazon FSx file system linked to an S3 bucket using the AWS Management Console and AWS Command Line Interface (AWS CLI).

------
#### [ Console ]

1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the dashboard, choose **Create file system**.

1. For the file system type, choose **FSx for Lustre**, and then choose **Next**.

1. Provide the information required for the **File system details** and **Network and security** sections. For more information, see [Step 1: Create your FSx for Lustre file system](getting-started.md#getting-started-step1).

1. You use the **Data repository import/export** panel to configure a linked data repository in Amazon S3. Select **Import data from and export data to S3** to expand the **Data Repository Import/Export** section and configure the data repository settings.  
![\[The Data repository import and export panel for configuring a linked data repository in Amazon S3.\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/images/legacy-data-repository-import-export.png)

1. Choose how Amazon FSx keeps your file and directory listing up to date as you add or modify objects in your S3 bucket. When you create your file system, your existing S3 objects appear as file and directory listings.
   + **Update my file and directory listing as objects are added to my S3 bucket**: (Default) Amazon FSx automatically updates file and directory listings of any new objects added to the linked S3 bucket that do not currently exist in the FSx file system. Amazon FSx does not update listings for objects that have changed in the S3 bucket. Amazon FSx does not delete listings of objects that are deleted in the S3 bucket.
**Note**  
The default import preferences setting for importing data from a linked S3 bucket using the CLI and API is `NONE`. The default import preferences setting when using the console is to update Lustre as new objects are added to the S3 bucket.
   + **Update my file and directory listing as objects are added to or changed in my S3 bucket**: Amazon FSx automatically updates file and directory listings of any new objects added to the S3 bucket and any existing objects that are changed in the S3 bucket after you choose this option. Amazon FSx does not delete listings of objects that are deleted in the S3 bucket.
   + **Update my file and directory listing as objects are added to, changed in, or deleted from my S3 bucket**: Amazon FSx automatically updates file and directory listings of any new objects added to the S3 bucket, any existing objects that are changed in the S3 bucket, and any existing objects that are deleted in the S3 bucket after you choose this option.
   + **Do not update my file and directory listing when objects are added to, changed in, or deleted from my S3 bucket**: Amazon FSx only updates file and directory listings from the linked S3 bucket when the file system is created. FSx does not update file and directory listings for any new, changed, or deleted objects after you choose this option.

1. Enter an optional **Import prefix** if you want to import only some of the file and directory listings of data in your S3 bucket into your file system. The import prefix defines where in your S3 bucket to import data from. For more information, see [Automatically import updates from your S3 bucket](autoimport-data-repo-dra.md).

1. Choose one of the available **Export prefix** options:
   + **A unique prefix that Amazon FSx creates in your bucket**: Choose this option to export new and changed objects using a prefix that FSx for Lustre generates. The prefix combines `/FSxLustre` with the file system creation timestamp in UTC format, for example `/FSxLustre20181105T222312Z`.
   + **The same prefix that you imported from (replace existing objects with updated ones)**: Choose this option to replace existing objects with updated ones.
   + **A prefix you specify**: Choose this option to preserve your imported data and to export new and changed objects using a prefix that you specify. To achieve a 1:1 mapping when exporting data to your S3 bucket, choose this option and leave the prefix field blank. FSx exports data to the same directories that it was imported from.

1. (Optional) Set **Maintenance preferences**, or use the system defaults.

1. Choose **Next**, and review the file system settings. Make any changes if needed.

1. Choose **Create file system**.

------
#### [ AWS CLI ]

The following example creates an Amazon FSx file system linked to the `amzn-s3-demo-bucket` S3 bucket, with an import preference that imports any new, changed, or deleted files in the linked data repository after the file system is created.

**Note**  
The default import preferences setting for importing data from a linked S3 bucket using the CLI and API is `NONE`, which is different from the default behavior when using the console.

To create an FSx for Lustre file system, use the Amazon FSx CLI command [https://docs.aws.amazon.com/cli/latest/reference/fsx/create-file-system.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/create-file-system.html), as shown below. The corresponding API operation is [https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateFileSystem.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateFileSystem.html).

```
$ aws fsx create-file-system \
--client-request-token CRT1234 \
--file-system-type LUSTRE \
--file-system-type-version 2.10 \
--lustre-configuration AutoImportPolicy=NEW_CHANGED_DELETED,DeploymentType=PERSISTENT_1,ImportPath=s3://amzn-s3-demo-bucket/,ExportPath=s3://amzn-s3-demo-bucket/export,PerUnitStorageThroughput=50 \
--storage-capacity 2400 \
--subnet-ids subnet-123456 \
--tags Key=Name,Value=Lustre-TEST-1 \
--region us-east-2
```
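For scripting, the same request can be sketched with boto3. This is a sketch only: parameter names follow the `CreateFileSystem` API reference, and every value is a placeholder carried over from the CLI example (a persistent deployment type is shown because `PerUnitStorageThroughput` applies only to persistent deployment types).

```python
# Sketch only: the CLI example above expressed as boto3 CreateFileSystem
# parameters. All values are placeholders from the example, not real resources.
params = {
    "ClientRequestToken": "CRT1234",
    "FileSystemType": "LUSTRE",
    "FileSystemTypeVersion": "2.10",
    "StorageCapacity": 2400,
    "SubnetIds": ["subnet-123456"],
    "Tags": [{"Key": "Name", "Value": "Lustre-TEST-1"}],
    "LustreConfiguration": {
        "AutoImportPolicy": "NEW_CHANGED_DELETED",
        "DeploymentType": "PERSISTENT_1",
        "ImportPath": "s3://amzn-s3-demo-bucket/",
        "ExportPath": "s3://amzn-s3-demo-bucket/export",
        "PerUnitStorageThroughput": 50,
    },
}

# With AWS credentials configured, the call would be:
#   import boto3
#   fsx = boto3.client("fsx", region_name="us-east-2")
#   response = fsx.create_file_system(**params)
```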

After you successfully create the file system, Amazon FSx returns the file system description as JSON, as shown in the following example.

```
{
    "FileSystems": [
        {
            "OwnerId": "owner-id-string",
            "CreationTime": 1549310341.483,
            "FileSystemId": "fs-0123456789abcdef0",
            "FileSystemType": "LUSTRE",
            "FileSystemTypeVersion": "2.10",
            "Lifecycle": "CREATING",
            "StorageCapacity": 2400,
            "VpcId": "vpc-123456",
            "SubnetIds": [
                "subnet-123456"
            ],
            "NetworkInterfaceIds": [
                "eni-039fcf55123456789"
            ],
            "DNSName": "fs-0123456789abcdef0.fsx.us-east-2.amazonaws.com",
            "ResourceARN": "arn:aws:fsx:us-east-2:123456:file-system/fs-0123456789abcdef0",
            "Tags": [
                {
                    "Key": "Name",
                    "Value": "Lustre-TEST-1"
                }
            ],
            "LustreConfiguration": {
                "DeploymentType": "PERSISTENT_1",
                "DataRepositoryConfiguration": {
                    "AutoImportPolicy": "NEW_CHANGED_DELETED",
                    "Lifecycle": "UPDATING",
                    "ImportPath": "s3://amzn-s3-demo-bucket/",
                    "ExportPath": "s3://amzn-s3-demo-bucket/export",
                    "ImportedFileChunkSize": 1024
                },
                "PerUnitStorageThroughput": 50
            }
        }
    ]
}
```

------

### Viewing a file system's export path


You can view a file system's export path using the FSx for Lustre console, the AWS CLI, and the API.

------
#### [ Console ]

1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. Choose **File system name** or **File system ID** for the FSx for Lustre file system for which you want to view the export path.

    The file system details page appears for that file system.

1. Choose the **Data repository** tab.

   The **Data repository integration** panel appears, showing the import and export paths.

![\[The Data repository import and export paths in the Data repository integration panel.\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/images/legacy-view-export-path.png)


------
#### [ CLI ]

To determine the export path for your file system, use the [https://docs.aws.amazon.com/cli/latest/reference/fsx/describe-file-systems.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/describe-file-systems.html) AWS CLI command.

```
aws fsx describe-file-systems
```

Look for the `ExportPath` property under `LustreConfiguration` in the response.

```
{
    "OwnerId": "111122223333",
    "CreationTime": 1563382847.014,
    "FileSystemId": "",
    "FileSystemType": "LUSTRE",
    "Lifecycle": "AVAILABLE",
    "StorageCapacity": 2400,
    "VpcId": "vpc-6296a00a",
    "SubnetIds": [
        "subnet-1111111"
    ],
    "NetworkInterfaceIds": [
        "eni-0c288d5b8cc06c82d",
        "eni-0f38b702442c6918c"
    ],
    "DNSName": "fs-0123456789abcdef0.fsx.us-east-2.amazonaws.com",
    "ResourceARN": "arn:aws:fsx:us-east-2:267731178466:file-system/fs-0123456789abcdef0",
    "Tags": [
        {
          "Key": "Name",
          "Value": "Lustre System"
        }
    ],
	"LustreConfiguration": {
    "DeploymentType": "SCRATCH_1",
    "DataRepositoryConfiguration": {
    "AutoImportPolicy": " NEW_CHANGED_DELETED",
    "Lifecycle": "AVAILABLE",
    "ImportPath": "s3://amzn-s3-demo-bucket/",
    "ExportPath": "s3://amzn-s3-demo-bucket/FSxLustre20190717T164753Z",
    "ImportedFileChunkSize": 1024
    }
  },
  "PerUnitStorageThroughput": 50,
  "WeeklyMaintenanceStartTime": "6:09:30"
}
```
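If you are scripting, the export path can also be pulled out of the `DescribeFileSystems` response programmatically. The following is a minimal sketch that assumes the response has already been parsed from JSON; field names follow the API reference. (The AWS CLI can do equivalent filtering natively with `--query 'FileSystems[].LustreConfiguration.DataRepositoryConfiguration.ExportPath'`.)

```python
# Sketch: extract (FileSystemId, ExportPath) pairs from a parsed
# DescribeFileSystems response. Field names follow the API reference.
def export_paths(response):
    pairs = []
    for fs in response.get("FileSystems", []):
        drc = fs.get("LustreConfiguration", {}).get("DataRepositoryConfiguration", {})
        pairs.append((fs.get("FileSystemId"), drc.get("ExportPath")))
    return pairs

# Abbreviated sample response using the placeholder values from this page.
sample = {
    "FileSystems": [{
        "FileSystemId": "fs-0123456789abcdef0",
        "LustreConfiguration": {
            "DataRepositoryConfiguration": {
                "ExportPath": "s3://amzn-s3-demo-bucket/FSxLustre20190717T164753Z"
            }
        },
    }]
}

print(export_paths(sample))
# → [('fs-0123456789abcdef0', 's3://amzn-s3-demo-bucket/FSxLustre20190717T164753Z')]
```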

------

### Data repository lifecycle state


The data repository lifecycle state provides status information about the file system's linked data repository. A data repository can have the following lifecycle states.
+ **Creating**: Amazon FSx is creating the data repository configuration between the file system and the linked data repository. The data repository is unavailable.
+ **Available**: The data repository is available for use.
+ **Updating**: The data repository configuration is undergoing a customer-initiated update that might affect its availability.
+ **Misconfigured**: Amazon FSx cannot automatically import updates from the S3 bucket until the data repository configuration is corrected. For more information, see [Troubleshooting a misconfigured linked S3 bucket](troubleshooting-misconfigured-data-repository.md).

You can view a file system's linked data repository lifecycle state using the Amazon FSx console, the AWS Command Line Interface, and the Amazon FSx API. In the Amazon FSx console, you can access the data repository **Lifecycle state** in the **Data Repository Integration** pane of the **Data Repository** tab for the file system. The `Lifecycle` property is located in the `DataRepositoryConfiguration` object in the response of a [https://docs.aws.amazon.com/cli/latest/reference/fsx/describe-file-systems.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/describe-file-systems.html) CLI command (the equivalent API action is [https://docs.aws.amazon.com/fsx/latest/APIReference/API_DescribeFileSystems.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_DescribeFileSystems.html)).
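For monitoring scripts, the lifecycle state can be read from the same file system description object. The following is a minimal sketch, assuming the field names from the API reference:

```python
# Sketch: read the data repository lifecycle state from one file system's
# description and flag the state that blocks automatic import.
def repository_lifecycle(fs_description):
    return (fs_description.get("LustreConfiguration", {})
            .get("DataRepositoryConfiguration", {})
            .get("Lifecycle"))

# Abbreviated sample description.
fs = {"LustreConfiguration": {"DataRepositoryConfiguration": {"Lifecycle": "MISCONFIGURED"}}}

if repository_lifecycle(fs) == "MISCONFIGURED":
    print("Automatic import is blocked; correct the data repository configuration.")
```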

## Automatically import updates from your S3 bucket


By default, when you create a new file system, Amazon FSx imports the file metadata (name, ownership, timestamps, and permissions) of the objects in the linked S3 bucket at file system creation. You can configure your FSx for Lustre file system to automatically import metadata of objects that are added to, changed in, or deleted from your S3 bucket after file system creation. FSx for Lustre updates the file and directory listing of a changed object in the same manner as it imports file metadata at file system creation. If a changed object in the S3 bucket no longer contains its metadata, Amazon FSx retains the file's current metadata values rather than applying default permissions.

**Note**  
Import settings are available on FSx for Lustre file systems created after 3:00 pm EDT, July 23, 2020.

You can set import preferences when you create a new file system, and you can update the setting on existing file systems using the FSx management console, the AWS CLI, and the AWS API. When you create your file system, your existing S3 objects appear as file and directory listings. After that, your import preference determines how the file system is updated as the contents of your S3 bucket change. A file system can have one of the following import preferences:

**Note**  
The FSx for Lustre file system and its linked S3 bucket must be located in the same AWS Region to automatically import updates.
+ **Update my file and directory listing as objects are added to my S3 bucket**: (Default) Amazon FSx automatically updates file and directory listings of any new objects added to the linked S3 bucket that do not currently exist in the FSx file system. Amazon FSx does not update listings for objects that have changed in the S3 bucket. Amazon FSx does not delete listings of objects that are deleted in the S3 bucket.
**Note**  
The default import preferences setting for importing data from a linked S3 bucket using the CLI and API is `NONE`. The default import preferences setting when using the console is to update Lustre as new objects are added to the S3 bucket.
+ **Update my file and directory listing as objects are added to or changed in my S3 bucket**: Amazon FSx automatically updates file and directory listings of any new objects added to the S3 bucket and any existing objects that are changed in the S3 bucket after you choose this option. Amazon FSx does not delete listings of objects that are deleted in the S3 bucket.
+ **Update my file and directory listing as objects are added to, changed in, or deleted from my S3 bucket**: Amazon FSx automatically updates file and directory listings of any new objects added to the S3 bucket, any existing objects that are changed in the S3 bucket, and any existing objects that are deleted in the S3 bucket after you choose this option.
+ **Do not update my file and directory listing when objects are added to, changed in, or deleted from my S3 bucket**: Amazon FSx only updates file and directory listings from the linked S3 bucket when the file system is created. FSx does not update file and directory listings for any new, changed, or deleted objects after you choose this option.
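For reference, these console choices correspond to `AutoImportPolicy` values in the CLI and API. The enum names below are taken from the Amazon FSx API reference; the short keys are informal labels for the console options above.

```python
# Console import preference (informal label) -> AutoImportPolicy value
# accepted by the CLI and API.
IMPORT_PREFERENCES = {
    "update as objects are added": "NEW",
    "update as objects are added or changed": "NEW_CHANGED",
    "update as objects are added, changed, or deleted": "NEW_CHANGED_DELETED",
    "do not update": "NONE",
}
```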

When you set the import preferences to update your file system file and directory listings based on changes in the linked S3 bucket, Amazon FSx creates an event notification configuration on the linked S3 bucket named `FSx`. Do not modify or delete the `FSx` event notification configuration on the S3 bucket; doing so prevents the automatic import of new or changed file and directory listings to your file system.
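If automatic import stops working, one thing a script can check is whether the `FSx` notification configuration is still present on the bucket. The sketch below uses the S3 `GetBucketNotificationConfiguration` response shape; because this page doesn't specify which configuration type FSx uses, the sketch checks all of them.

```python
# Sketch: check a GetBucketNotificationConfiguration response for a
# configuration whose Id starts with "FSx". Checks every configuration type
# because the type FSx uses isn't specified on this page.
def has_fsx_notification(config):
    kinds = ("TopicConfigurations", "QueueConfigurations",
             "LambdaFunctionConfigurations")
    return any(c.get("Id", "").startswith("FSx")
               for kind in kinds
               for c in config.get(kind, []))

# With AWS credentials configured, the response would come from:
#   import boto3
#   config = boto3.client("s3").get_bucket_notification_configuration(
#       Bucket="amzn-s3-demo-bucket")
sample = {"TopicConfigurations": [{"Id": "FSx", "Events": ["s3:ObjectCreated:*"]}]}
print(has_fsx_notification(sample))  # True
```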

When Amazon FSx updates a file listing that has changed on the linked S3 bucket, it overwrites the local file with the updated version, even if the file is write-locked. Similarly, when the corresponding object has been deleted from the linked S3 bucket, Amazon FSx deletes the local file, even if the file is write-locked.

Amazon FSx makes a best effort to update your file system. Amazon FSx cannot update the file system with changes in the following situations:
+ When Amazon FSx does not have permission to open the changed or new S3 object.
+ When the `FSx` event notification configuration on the linked S3 bucket is deleted or changed.

Either of these conditions causes the data repository lifecycle state to become **Misconfigured**. For more information, see [Data repository lifecycle state](#legacy-data-lifecycle).

### Prerequisites


The following conditions are required for Amazon FSx to automatically import new, changed, or deleted files from the linked S3 bucket:
+ The file system and its linked S3 bucket must be located in the same AWS Region.
+ The linked data repository does not have a **Misconfigured** lifecycle state. For more information, see [Data repository lifecycle state](#legacy-data-lifecycle).
+ Your account must have the permissions required to configure and receive event notifications on the linked S3 bucket.

### Types of file changes supported


Amazon FSx supports importing the following changes to files and folders that occur in the linked S3 bucket:
+ Changes to file contents
+ Changes to file or folder metadata
+ Changes to symlink target or metadata

### Updating import preferences


You can set a file system's import preferences when you create a new file system. For more information, see [Linking your file system to an Amazon S3 bucket](create-dra-linked-data-repo.md).

You can also update a file system's import preferences after it is created using the AWS Management Console, the AWS CLI, and the Amazon FSx API, as shown in the following procedure.

------
#### [ Console ]

1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the dashboard, choose **File systems**.

1. Select the file system that you want to manage to display the file system details.

1. Choose **Data repository** to view the data repository settings. You can modify the import preferences if the lifecycle state is **AVAILABLE** or **MISCONFIGURED**. For more information, see [Data repository lifecycle state](#legacy-data-lifecycle).

1. Choose **Actions**, and then choose **Update import preferences** to display the **Update import preferences** dialog box.

1. Select the new setting, and then choose **Update** to make the change.

------
#### [ CLI ]

To update import preferences, use the [https://docs.aws.amazon.com/cli/latest/reference/fsx/update-file-system.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/update-file-system.html) CLI command. The corresponding API operation is [https://docs.aws.amazon.com/fsx/latest/APIReference/API_UpdateFileSystem.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_UpdateFileSystem.html). 
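A sketch of the request follows, using the placeholder file system ID from earlier examples; parameter names are taken from the `UpdateFileSystem` API reference. The equivalent CLI invocation would be `aws fsx update-file-system --file-system-id fs-0123456789abcdef0 --lustre-configuration AutoImportPolicy=NEW_CHANGED_DELETED`.

```python
# Sketch: UpdateFileSystem parameters that change the import preference to
# import new, changed, and deleted objects. The file system ID is a placeholder.
params = {
    "FileSystemId": "fs-0123456789abcdef0",
    "LustreConfiguration": {"AutoImportPolicy": "NEW_CHANGED_DELETED"},
}

# With AWS credentials configured, the call would be:
#   import boto3
#   fsx = boto3.client("fsx", region_name="us-east-2")
#   response = fsx.update_file_system(**params)
```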

After you successfully update the file system's `AutoImportPolicy`, Amazon FSx returns the description of the updated file system as JSON, as shown here:

```
{
    "FileSystems": [
        {
            "OwnerId": "111122223333",
            "CreationTime": 1549310341.483,
            "FileSystemId": "fs-0123456789abcdef0",
            "FileSystemType": "LUSTRE",
            "Lifecycle": "UPDATING",
            "StorageCapacity": 2400,
            "VpcId": "vpc-123456",
            "SubnetIds": [
                "subnet-123456"
            ],
            "NetworkInterfaceIds": [
                "eni-039fcf55123456789"
            ],
            "DNSName": "fs-0123456789abcdef0.fsx.us-east-2.amazonaws.com",
            "ResourceARN": "arn:aws:fsx:us-east-2:123456:file-system/fs-0123456789abcdef0",
            "Tags": [
                {
                    "Key": "Name",
                    "Value": "Lustre-TEST-1"
                }
            ],
            "LustreConfiguration": {
                "DeploymentType": "SCRATCH_1",
                "DataRepositoryConfiguration": {
                    "AutoImportPolicy": "NEW_CHANGED_DELETED",
                    "Lifecycle": "UPDATING",
                    "ImportPath": "s3://amzn-s3-demo-bucket/",
                    "ExportPath": "s3://amzn-s3-demo-bucket/export",
                    "ImportedFileChunkSize": 1024
                },
                "PerUnitStorageThroughput": 50,
                "WeeklyMaintenanceStartTime": "2:04:30"
            }
        }
    ]
}
```

------