

# Migrating your existing file storage to Amazon FSx for OpenZFS

The following sections provide information on how to migrate your existing file storage to Amazon FSx for OpenZFS using AWS DataSync, `rsync` or `Robocopy`.
+ **AWS DataSync** – An online data transfer service designed to simplify, automate, and accelerate copying large amounts of data to and from AWS storage services.
+ **rsync** – Remote sync is an open source utility for efficiently transferring and synchronizing files. It is available on most Linux and other Unix-based operating systems.
+ **Robocopy** – Robust File Copy is a command line directory and file replication command set for Microsoft Windows.

Before you begin using the procedures described in the following sections, be sure that the following prerequisites are met:
+ You have created a destination FSx for OpenZFS file system. For more information, see [Creating an Amazon FSx for OpenZFS file system](creating-file-systems.md).
+ The source and destination file systems have network connectivity to each other. The source file system can be located on-premises or in another Amazon VPC, AWS account, or AWS Region, but its network must be connected to the virtual private cloud (VPC) of the destination file system using Amazon VPC peering, AWS Transit Gateway, AWS Direct Connect, or AWS Site-to-Site VPN. For more information, see [Access from a different VPC](access-within-aws.md#vpc-peering) and [What is VPC peering?](https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html) in the *Amazon VPC Peering Guide*.

**Topics**
+ [Migrating files with AWS DataSync](migrate-files-to-fsx-datasync.md)
+ [Migrating files with rsync](fsx-migrate-rsync.md)
+ [Migrating files with Robocopy](fsx-migrate-robocopy.md)
+ [Cutting over to your file system](cutover.md)

# Migrating files to Amazon FSx for OpenZFS using AWS DataSync

We recommend using AWS DataSync to transfer data between FSx for OpenZFS file systems. DataSync is a data transfer service that simplifies, automates, and accelerates moving and replicating data between self-managed storage systems and AWS storage services over the internet or Direct Connect. DataSync can transfer your file system data and metadata, such as ownership, timestamps, and access permissions.

You can use DataSync to transfer files between two FSx for OpenZFS file systems, and also move data to a file system in a different AWS Region or AWS account. You can also use DataSync with FSx for OpenZFS file systems for other tasks. For example, you can perform one-time data migrations, periodically ingest data for distributed workloads, and schedule replication for data protection and recovery.

In DataSync, a *location* is an endpoint for an FSx for OpenZFS file system. For information about specific transfer scenarios, see [Working with locations](https://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html) in the *AWS DataSync User Guide*.

## Prerequisites


To migrate data into your FSx for OpenZFS setup, you need a server and network that meet the DataSync requirements. To learn more, see [Requirements for DataSync](https://docs.aws.amazon.com/datasync/latest/userguide/requirements.html) in the *AWS DataSync User Guide*.

## Basic steps for migrating files using DataSync

Transferring files from a source to a destination using DataSync involves the following basic steps:
+ Download and deploy an agent in your environment and activate it (not required if transferring between AWS services).
+ Create a source and destination location.
+ Create a task.
+ Run the task to transfer files from the source to the destination.
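
The last three steps above can be sketched with the AWS CLI. This is a minimal sketch, not a definitive procedure. It assumes that the AWS CLI is installed and configured, that the source is an NFS export reached through a DataSync agent you have already deployed and activated, and that the destination is reached over NFS. The function name `datasync_migrate`, the task name, and every ARN, hostname, and path you pass in are placeholders for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: all arguments are placeholders that you supply.
# Usage: datasync_migrate AGENT_ARN NFS_HOST NFS_PATH FSX_FILESYSTEM_ARN SECURITY_GROUP_ARN
datasync_migrate() {
  local agent_arn=$1 nfs_host=$2 nfs_path=$3 fsx_arn=$4 sg_arn=$5
  local src_arn dst_arn task_arn

  # Create the source location: an NFS export reached through the agent.
  src_arn=$(aws datasync create-location-nfs \
    --server-hostname "$nfs_host" \
    --subdirectory "$nfs_path" \
    --on-prem-config "AgentArns=$agent_arn" \
    --query LocationArn --output text) || return 1

  # Create the destination location: the FSx for OpenZFS file system.
  dst_arn=$(aws datasync create-location-fsx-open-zfs \
    --fsx-filesystem-arn "$fsx_arn" \
    --protocol 'NFS={MountOptions={Version=AUTOMATIC}}' \
    --security-group-arns "$sg_arn" \
    --query LocationArn --output text) || return 1

  # Create a task that links the source to the destination.
  task_arn=$(aws datasync create-task \
    --source-location-arn "$src_arn" \
    --destination-location-arn "$dst_arn" \
    --name fsx-openzfs-migration \
    --query TaskArn --output text) || return 1

  # Run the task and print the ARN of the task execution.
  aws datasync start-task-execution \
    --task-arn "$task_arn" \
    --query TaskExecutionArn --output text
}
```

Each call stops the function on failure, so a task is never created against a location that was not registered. Deploying and activating the agent is not covered by this sketch.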

For more information, see the following topics in the *AWS DataSync User Guide*:
+ [Data transfer between self-managed storage and AWS](https://docs.aws.amazon.com/datasync/latest/userguide/how-datasync-works.html#onprem-aws)
+ [Creating a location for Amazon FSx for OpenZFS](https://docs.aws.amazon.com/datasync/latest/userguide/create-openzfs-location.html)
+ [Deploy your agent as an Amazon EC2 instance](https://docs.aws.amazon.com/datasync/latest/userguide/deploy-agents.html#ec2-deploy-agent)

# Migrating files to Amazon FSx for OpenZFS using rsync

With **rsync**, you can replicate data between any source and destination, but at least one must be locally accessible to the client instance.

------
#### [ Amazon EC2 instance ]

**To migrate existing files to Amazon FSx from a Linux-based Amazon EC2 instance**

The following procedure configures your FSx for OpenZFS destination volume as a local NFS mount on a Linux-based EC2 instance and uses the **rsync** command to synchronize data from your source file system or existing directory on your EC2 instance.

1. Launch a Linux-based Amazon EC2 instance or connect to an existing EC2 instance that contains your desired source data.

1. Mount your destination FSx for OpenZFS volume. For more information, see [Step 2: Mount your file system from an Amazon EC2 instance](getting-started.md#getting-started-step2). The following step assumes that you have mounted the desired destination on your OpenZFS volume to `/fsx/destination_path` on your EC2 instance.

1. Run **rsync** from this EC2 instance to synchronize data from your source. If your source data is already on the EC2 instance, use the following command:

   ```
   sudo rsync -avR /source_path /fsx/destination_path
   ```
**Note**  
You can also run **rsync** with GNU parallel to maximize performance. The following instructions apply to EC2 Linux instances running Amazon Linux 2.  

   ```
   sudo amazon-linux-extras install epel 
   sudo yum install nload sysstat parallel -y
   time sudo find -L /source_path -type f | sudo parallel rsync -avR {} /fsx/destination_path
   ```

   If your source is a directory on a remote host, use the following command:

   ```
   sudo rsync -avR username@source_dns_or_ip:/source_path /fsx/destination_path
   ```

   Use the following variant if you need to use .pem key-based authentication:

   ```
   sudo rsync -avR -e "ssh -i key.pem" username@source_dns_or_ip:/source_path /fsx/destination_path
   ```
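
The options in the commands above are `-a` (archive mode, which preserves permissions, timestamps, and ownership where the user has the privilege to set it), `-v` (verbose output), and `-R` (relative paths, so `/source_path` is recreated beneath `/fsx/destination_path`). As a minimal sketch, you could preview a transfer with the `--dry-run` option of **rsync** before copying anything. The wrapper function `sync_to_fsx` and its `preview` and `copy` modes are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: sync_to_fsx is a hypothetical helper, not part of rsync.
# Usage: sync_to_fsx preview|copy SOURCE_PATH DESTINATION_PATH
sync_to_fsx() {
  local mode=$1 src=$2 dest=$3
  case "$mode" in
    preview)
      # --dry-run lists what would be transferred without writing any data.
      rsync -avR --dry-run "$src" "$dest"
      ;;
    copy)
      # Same options as the procedure above; this performs the transfer.
      rsync -avR "$src" "$dest"
      ;;
    *)
      echo "usage: sync_to_fsx preview|copy SOURCE_PATH DESTINATION_PATH" >&2
      return 2
      ;;
  esac
}

# Example (not run here):
#   sync_to_fsx preview /source_path /fsx/destination_path
#   sync_to_fsx copy    /source_path /fsx/destination_path
```

Run the function from a root shell if file ownership must be preserved, because **rsync** can set the owner of a file only when it runs with superuser privileges on the receiving side.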

------
#### [ On-premises ]

**To migrate existing files to Amazon FSx from your on-premises Linux-based source**

The following procedure configures your FSx for OpenZFS destination volume as a local NFS mount on a Linux-based EC2 instance. Then, you use **rsync** from your source to connect to this EC2 instance and synchronize files to the path where the destination Amazon FSx volume is mounted.

1. Launch a Linux-based Amazon EC2 instance and mount your destination FSx for OpenZFS volume. For more information, see [Step 2: Mount your file system from an Amazon EC2 instance](getting-started.md#getting-started-step2). The following step assumes that you have mounted the desired destination on your OpenZFS volume to `/fsx/destination_path` on your EC2 instance.

1. From your on-premises Linux-based source, run **rsync** to connect to this EC2 instance and synchronize data from any locally accessible path. For example, `source_path` can refer to a locally accessible directory or a path on another shared file system.

   ```
   sudo rsync -avR -e "ssh -i key.pem" /source_path ec2-user@ec2_dns_name.amazonaws.com:/fsx/destination_path
   ```

------

# Migrating files to Amazon FSx for OpenZFS using Robocopy

Robocopy is designed to replicate data between two locations that are locally accessible on the same host. To use Robocopy to migrate data to your FSx for OpenZFS file system, you need to mount the source file system and the destination OpenZFS volume on the same Windows-based EC2 client instance. The following procedure outlines the necessary steps to perform this migration using a new EC2 instance.

**To migrate existing files to Amazon FSx**

1. Launch a Windows Server 2016 Amazon EC2 instance in the same Amazon VPC as that of your Amazon FSx file system.

1. Connect to your Amazon EC2 instance. For more information, see [Connect to Your Windows instance](https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/connecting_to_windows_instance.html) in the *Amazon EC2 User Guide for Windows Instances*.

1. Open **Command Prompt** and map the source file share on your existing file server (on-premises or in AWS) to a drive letter (for example, *Y*:) as follows. As part of this, you provide credentials for a member of your on-premises Active Directory's **Domain Administrators** group.

   ```
   C:\>net use Y: \\fileserver1.mydata.com\localdata /user:mydata.com\Administrator
   Enter the password for 'fileserver1.mydata.com': _
   
   Drive Y: is now connected to \\fileserver1.mydata.com\localdata.
   
   The command completed successfully.
   ```

1. Map the target file share on your Amazon FSx file system to a different drive letter (for example, *Z*:) on your Amazon EC2 instance using the [Windows client](getting-started.md#getting-started-step2) instructions.

1. Open **Command Prompt** or **Windows PowerShell** as an administrator (choose **Run as Administrator** from the context menu), and run the following Robocopy command to copy all the files on the source share to the target share. The example command uses the following elements and options:
   + Y – Refers to the source share located in the on-premises Active Directory forest mydata.com.
   + Z – Refers to the target share `\\amznfsxabcdef1.mydata.com\share` on Amazon FSx.
   + /copy – Specifies the following file properties to be copied: 
     + D – data
     + A – attributes
     + T – timestamps
   + /e – Copies subdirectories, including empty ones.
   + /b – Uses the backup and restore privilege in Windows to copy files even if their NTFS ACLs deny permissions to the current user.
   + /MT:8 – Specifies how many threads to use for performing multithreaded copies.

   ```
   robocopy Y:\ Z:\ /copy:DAT /e /b /MT:8
   ```

**Note**  
If you are copying large files over a slow or unreliable connection, you can enable restartable mode by using the **/zb** option in place of the **/b** option. With restartable mode, if the transfer of a large file is interrupted, a subsequent Robocopy operation can pick up in the middle of the transfer instead of having to re-copy the entire file from the beginning. Using the restartable mode can reduce data transfer speeds.

# Cutting over to your Amazon FSx for OpenZFS file system

To cut over to your FSx for OpenZFS file system, do the following:
+ Disconnect all clients that write to the source file system.
+ Perform an **rsync** or **Robocopy** final file sync to ensure there is no data loss when cutting over.
+ Connect all clients to your FSx for OpenZFS file system.
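
After the final sync, one way to check that nothing was missed is to compare checksums of the source tree and the copied tree. The following is a minimal sketch, assuming GNU `find`, `sort`, `xargs`, and `sha256sum` are available and that both trees are mounted on the same Linux host. The function names `tree_manifest` and `verify_trees` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and GNU tools are assumed.

# Print "checksum  ./relative/path" for every regular file under DIR, sorted by path.
tree_manifest() {
  ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum )
}

# Usage: verify_trees SOURCE_DIR DESTINATION_DIR
# Returns 0 if both trees hold the same files with the same contents.
verify_trees() {
  local src=$1 dest=$2 src_list dest_list status
  src_list=$(mktemp) || return 2
  dest_list=$(mktemp) || { rm -f "$src_list"; return 2; }

  if ! tree_manifest "$src" > "$src_list" || ! tree_manifest "$dest" > "$dest_list"; then
    rm -f "$src_list" "$dest_list"
    return 2
  fi

  if diff "$src_list" "$dest_list" > /dev/null; then
    echo "Trees match"
    status=0
  else
    echo "Trees differ:"
    # Lines starting with "<" exist only in the source manifest, ">" only in the destination.
    diff "$src_list" "$dest_list"
    status=1
  fi

  rm -f "$src_list" "$dest_list"
  return "$status"
}
```

If you copied with the `-R` option of **rsync**, the source path is recreated beneath the destination, so the comparison would be `verify_trees /source_path /fsx/destination_path/source_path`. This check reads every file in both trees, so it can take a long time on large datasets.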

Your FSx for OpenZFS file system now contains the data from the source file system, and clients can read from and write to it. To make this data accessible to clients and applications, see [Accessing your data](accessing-your-data.md).