

# Exporting changes to the data repository
<a name="export-changed-data-meta-dra"></a>

You can export changes to data and POSIX metadata changes from your FSx for Lustre file system to a linked data repository. Associated POSIX metadata includes ownership, permissions, and timestamps.

To export changes from the file system, use one of the following methods.
+ Configure your file system to automatically export new, changed, or deleted files to your linked data repository. For more information, see [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md).
+ Use an on-demand export data repository task. For more information, see [Using data repository tasks to export changes](export-data-repo-task-dra.md).

Automatic export and export data repository tasks cannot run at the same time.

**Important**  
Automatic export doesn't synchronize the following metadata operations on your file system with S3 if the corresponding objects are stored in the S3 Glacier Flexible Retrieval storage class:
+ `chmod`
+ `chown`
+ `rename`

When you turn on automatic export for a data repository association, your file system automatically exports file data and metadata changes as files are created, modified, or deleted. When you export files or directories using an export data repository task, your file system exports only data files and metadata that were created or modified since the last export.

Both automatic export and export data repository tasks export POSIX metadata. For more information, see [POSIX metadata support for data repositories](posix-metadata-support.md). 

**Important**  
To ensure that FSx for Lustre can export your data to your S3 bucket, your data must be stored in a UTF-8 compatible format.
S3 object keys have a maximum length of 1,024 bytes. FSx for Lustre will not export files whose corresponding S3 object key would be longer than 1,024 bytes.

**Note**  
All objects created by automatic export and export data repository tasks are written using the S3 Standard storage class.
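
You can spot-check the storage class of an exported object with the AWS CLI; the bucket name and object key below are placeholders. Note that `head-object` omits the `StorageClass` field for objects in the S3 Standard class, so a `null` result here indicates the Standard storage class.

```
$ aws s3api head-object \
    --bucket amzn-s3-demo-bucket \
    --key path1/file1.txt \
    --query StorageClass
```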

**Topics**
+ [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md)
+ [Using data repository tasks to export changes](export-data-repo-task-dra.md)
+ [Exporting files using HSM commands](exporting-files-hsm.md)

# Automatically export updates to your S3 bucket
<a name="autoexport-data-repo-dra"></a>

You can configure your FSx for Lustre file system to automatically update the contents of a linked S3 bucket as files are added, changed, or deleted on the file system. FSx for Lustre creates, updates, or deletes the corresponding object in S3 to reflect each change in the file system.

**Note**  
Automatic export isn't available on FSx for Lustre 2.10 file systems or `Scratch 1` file systems.

You can export to a data repository that is in the same AWS Region as the file system or in a different AWS Region.

You can configure automatic export when you create the data repository association, and you can update the automatic export settings at any time by using the Amazon FSx console, the AWS CLI, or the AWS API.

**Important**  
If a file is modified in the file system with all automatic export policies enabled and automatic import disabled, the content of that file is always exported to a corresponding object in S3. If an object already exists in the target location, the object is overwritten.
If a file is modified in both the file system and S3, with all automatic import and automatic export policies enabled, either the file in the file system or the object in S3 could be overwritten by the other. It isn't guaranteed that a later edit in one location will overwrite an earlier edit in another location. If you modify the same file in both the file system and the S3 bucket, you should ensure application-level coordination to prevent such conflicts. FSx for Lustre doesn't prevent conflicting writes in multiple locations.

The export policy specifies how you want FSx for Lustre to update your linked S3 bucket as the contents change in the file system. A data repository association can have one of the following automatic export policies:
+ **New** – FSx for Lustre automatically updates the S3 data repository only when a new file, directory, or symlink is created on the file system.
+ **Changed** – FSx for Lustre automatically updates the S3 data repository only when an existing file in the file system is changed. For file content changes, the file must be closed before it's propagated to the S3 repository. Metadata changes (rename, ownership, permissions, and timestamps) are propagated when the operation is done. For renaming changes (including moves), the existing (pre-renamed) S3 object is deleted and a new S3 object is created with the new name.
+ **Deleted** – FSx for Lustre automatically updates the S3 data repository only when a file, directory, or symlink is deleted in the file system.
+ **Any combination of New, Changed, and Deleted** – FSx for Lustre automatically updates the S3 data repository when any of the specified actions occur in the file system. For example, you can specify that the S3 repository is updated when a file is added to (**New**) or removed from (**Deleted**) the file system, but not when a file is changed.
+ **No policy configured** – FSx for Lustre doesn't automatically update the S3 data repository when files are added to, changed in, or deleted from the file system. If you don't configure an export policy, automatic export is disabled. You can still manually export changes by using an export data repository task, as described in [Using data repository tasks to export changes](export-data-repo-task-dra.md).
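
As a sketch, the following AWS CLI command creates a data repository association with all three export policy events enabled. The file system ID, file system path, and bucket prefix are placeholders.

```
$ aws fsx create-data-repository-association \
    --file-system-id fs-0123456789abcdef0 \
    --file-system-path /ns1 \
    --data-repository-path s3://amzn-s3-demo-bucket/myprefix/ \
    --s3 "AutoExportPolicy={Events=[NEW,CHANGED,DELETED]}"
```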

For most use cases, we recommend that you configure an export policy of **New**, **Changed**, and **Deleted**. This policy ensures that all updates made on your file system are automatically exported to your linked S3 data repository.

We recommend that you [turn on logging](cw-event-logging.md#manage-logging) to CloudWatch Logs to log information about any files or directories that couldn't be exported automatically. Warnings and errors in the log contain information about the failure reason. For more information, see [Data repository event logs](data-repo-event-logs.md).

**Note**  
While access time (`atime`) and modification time (`mtime`) are synchronized with S3 during export operations, changes to these timestamps alone do not trigger automatic export. Only changes to file content or other metadata (such as ownership or permissions) will trigger an automatic export to S3.

## Updating export settings
<a name="manage-autoexport-dra"></a>

You can set a file system's export settings to a linked S3 bucket when you create the data repository association. For more information, see [Creating a link to an S3 bucket](create-linked-dra.md).

You can also update the export settings at any time, including the export policy. For more information, see [Updating data repository association settings](update-dra-settings.md).
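
For example, the following AWS CLI command changes an existing association's export policy so that only new and deleted files are exported; the association ID is a placeholder.

```
$ aws fsx update-data-repository-association \
    --association-id dra-0123456789abcdef0 \
    --s3 "AutoExportPolicy={Events=[NEW,DELETED]}"
```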

## Monitoring automatic export
<a name="monitoring-autoexport"></a>

You can monitor data repository associations that have automatic export enabled by using a set of metrics published to Amazon CloudWatch. The `AgeOfOldestQueuedMessage` metric represents the age of the oldest update made to the file system that has not yet been exported to S3. If `AgeOfOldestQueuedMessage` is greater than zero for an extended period of time, we recommend temporarily reducing the number of changes (directory renames in particular) being made to the file system until the message queue has drained. For more information, see [FSx for Lustre S3 repository metrics](fs-metrics.md#auto-import-export-metrics).

**Important**  
When you delete a data repository association or file system that has automatic export enabled, first make sure that `AgeOfOldestQueuedMessage` is zero, meaning that no changes are still waiting to be exported. If `AgeOfOldestQueuedMessage` is greater than zero when you delete the data repository association or file system, the unexported changes never reach your linked S3 bucket. To avoid losing those changes, wait for `AgeOfOldestQueuedMessage` to reach zero before deleting.
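
As a quick check before deleting, you can read recent values of this metric with the AWS CLI. This is a sketch with a placeholder file system ID and time range; it assumes the metric is dimensioned by `FileSystemId`, so verify the exact dimensions on the metrics page linked above.

```
$ aws cloudwatch get-metric-statistics \
    --namespace AWS/FSx \
    --metric-name AgeOfOldestQueuedMessage \
    --dimensions Name=FileSystemId,Value=fs-0123456789abcdef0 \
    --statistics Maximum \
    --period 60 \
    --start-time 2024-01-01T00:00:00Z \
    --end-time 2024-01-01T00:10:00Z \
    --query "Datapoints[].Maximum"
```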

# Using data repository tasks to export changes
<a name="export-data-repo-task-dra"></a>

The export data repository task exports files that are new or changed in your file system. It creates a new object in S3 for any new file on the file system. For any file that has been modified on the file system, or whose metadata has been modified, the corresponding object in S3 is replaced with a new object with the new data and metadata. No action is taken for files that have been deleted from the file system.

**Note**  
Keep the following in mind when using export data repository tasks:  
The use of wildcards to include or exclude files for export isn't supported.
When you perform `mv` operations, the moved file is exported to S3 even if its UID, GID, permissions, and content are unchanged.

Use the following procedures to export data and metadata changes on the file system to linked S3 buckets by using the Amazon FSx console and CLI. You can use a single data repository task to export changes for multiple data repository associations (DRAs).

## To export changes (console)
<a name="create-dra-repo-task-console"></a>

1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. On the navigation pane, choose **File systems**, then choose your Lustre file system.

1. Choose the **Data repository** tab.

1. In the **Data repository associations** pane, choose the data repository association you want to create the export task for.

1. For **Actions**, choose **Export task**. This choice isn't available if the file system isn't linked to a data repository on S3. The **Create export data repository task** dialog appears.

1. (Optional) Specify up to 32 directories or files to export from your Amazon FSx file system by providing the paths to those directories or files in **File system paths to export**. The paths you provide need to be relative to the mount point of the file system. If the mount point is `/mnt/fsx` and `/mnt/fsx/path1` is a directory or file on the file system you want to export, then the path to provide is `path1`.
**Note**  
If a path that you provide isn't valid, the task fails.

1. (Optional) Choose **Enable** under **Completion report** to generate a task completion report after the task completes. A *task completion report* provides details about the files processed by the task that meet the scope provided in **Report scope**. To specify the location for Amazon FSx to deliver the report, enter a relative path on the file system's linked S3 data repository for **Report path**.

1. Choose **Create**.

   A notification at the top of the **File systems** page shows the task that you just created in progress. 

To view the task status and details, scroll down to the **Data repository tasks** pane in the **Data repository** tab for the file system. The default sort order shows the most recent task at the top of the list.

To view a task summary from this page, choose **Task ID** for the task you just created. The **Summary** page for the task appears.

## To export changes (CLI)
<a name="create-data-repo-task-cli"></a>
+ Use the [https://docs.aws.amazon.com/cli/latest/reference/fsx/create-data-repository-task.html](https://docs.aws.amazon.com/cli/latest/reference/fsx/create-data-repository-task.html) CLI command to export data and metadata changes on your FSx for Lustre file system. The corresponding API operation is [https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateDataRepositoryTask.html](https://docs.aws.amazon.com/fsx/latest/APIReference/API_CreateDataRepositoryTask.html).

  ```
  $ aws fsx create-data-repository-task \
      --file-system-id fs-0123456789abcdef0 \
      --type EXPORT_TO_REPOSITORY \
      --paths path1,path2/file1 \
      --report Enabled=true
  ```

  After successfully creating the data repository task, Amazon FSx returns the task description as JSON, as shown in the following example.

  ```
  {
      "Task": {
          "TaskId": "task-123f8cd8e330c1321",
          "Type": "EXPORT_TO_REPOSITORY",
          "Lifecycle": "PENDING",
          "FileSystemId": "fs-0123456789abcdef0",
          "Paths": ["path1", "path2/file1"],
          "Report": {
              "Path":"s3://dataset-01/reports",
              "Format":"REPORT_CSV_20191124",
              "Enabled":true,
              "Scope":"FAILED_FILES_ONLY"
          },
          "CreationTime": "1545070680.120",
          "ClientRequestToken": "10192019-drt-12",
          "ResourceARN": "arn:aws:fsx:us-east-1:123456789012:task:task-123f8cd8e330c1321"
      }
  }
  ```

After creating the task to export data to the linked data repository, you can check the status of the export data repository task. For more information about viewing data repository tasks, see [Accessing data repository tasks](view-data-repo-tasks.md).
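
For example, you can poll the task's lifecycle state from the CLI; the task ID below is a placeholder.

```
$ aws fsx describe-data-repository-tasks \
    --task-ids task-123f8cd8e330c1321 \
    --query "DataRepositoryTasks[0].Lifecycle"
```

The task is finished when `Lifecycle` reaches `SUCCEEDED`, `FAILED`, or `CANCELED`.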

# Exporting files using HSM commands
<a name="exporting-files-hsm"></a>

**Note**  
To export changes in your FSx for Lustre file system's data and metadata to a durable data repository on Amazon S3, use the automatic export feature described in [Automatically export updates to your S3 bucket](autoexport-data-repo-dra.md). You can also use export data repository tasks, described in [Using data repository tasks to export changes](export-data-repo-task-dra.md).

To export an individual file to your data repository and verify that the file has successfully been exported, you can run the following commands. A return value of `states: (0x00000009) exists archived` indicates that the file has successfully been exported.

```
sudo lfs hsm_archive path/to/export/file
sudo lfs hsm_state path/to/export/file
```

**Note**  
You must run the HSM commands (such as `hsm_archive`) as the root user or using `sudo`.

To export your entire file system or an entire directory in your file system, run the following command. If you export multiple files simultaneously, Amazon FSx for Lustre exports your files to your Amazon S3 data repository in parallel.

```
nohup find local/directory -type f -print0 | xargs -0 -n 1 sudo lfs hsm_archive &
```

To determine whether the export has completed, run the following command.

```
find path/to/export/file -type f -print0 | xargs -0 -n 1 -P 8 sudo lfs hsm_state | awk '!/\<archived\>/ || /\<dirty\>/' | wc -l
```

If the command outputs `0`, no files remain to be archived and the export is complete.
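
The completion check above can be wrapped in a simple polling loop. This is a sketch that reuses the same placeholder directory as the export command; adjust the path and polling interval for your environment.

```
#!/bin/bash
# Poll until every file under the directory reports the archived state.
DIR=local/directory
while true; do
    remaining=$(find "$DIR" -type f -print0 \
        | xargs -0 -n 1 -P 8 sudo lfs hsm_state \
        | awk '!/\<archived\>/ || /\<dirty\>/' | wc -l)
    if [ "$remaining" -eq 0 ]; then
        echo "Export complete."
        break
    fi
    echo "$remaining files remaining..."
    sleep 30
done
```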