

# Step-by-Step BigQuery project to Amazon Redshift migration walkthrough


In the following sections, you can find step-by-step instructions for migrating your BigQuery project to Amazon Redshift. These steps assume that you have already prepared your source and target databases as described in preceding sections.

**Topics**
+ [Step 1: Create a BigQuery Service Account Key File](bigquery-redshift-migration-step-1.md)
+ [Step 2: Create an Amazon Redshift Cluster](bigquery-redshift-migration-step-2.md)
+ [Step 3: Create Buckets to Store Your Temporary Data](bigquery-redshift-migration-step-3.md)
+ [Step 4: Install AWS SCT on Your Local Computer](bigquery-redshift-migration-step-4.md)
+ [Step 5: Create an AWS SCT Project](bigquery-redshift-migration-step-5.md)
+ [Step 6: Convert Database Schemas](bigquery-redshift-migration-step-6.md)
+ [Step 7: Install and Configure Data Extraction Agents](bigquery-redshift-migration-step-7.md)
+ [Step 8: Run Your Migration Task](bigquery-redshift-migration-step-8.md)
+ [Step 9: Delete Walkthrough Resources](bigquery-redshift-migration-step-9.md)

# Step 1: Create a BigQuery Service Account Key File


You can connect to BigQuery with a user account or a service account. A *service account* is a special kind of account designed to be used by applications or compute workloads, rather than a person.

Service accounts don’t have passwords and use a unique email address for identification. You can associate each service account with a service account key, which is a public/private RSA key pair. In this walkthrough, we use a service account key in AWS SCT to access your BigQuery project.

 **To create a BigQuery service account key** 

1. Sign in to the [Google Cloud management console](https://console.cloud.google.com/).

1. Make sure that the BigQuery API is enabled on the [BigQuery API](https://console.cloud.google.com/apis/library/bigquery.googleapis.com) page. If you don’t see **API Enabled**, choose **Enable**.

1. On the [Service accounts](https://console.cloud.google.com/iam-admin/serviceaccounts) page, choose your BigQuery project, and then choose **Create service account**.

1. On the **Service account details** page, enter a descriptive value for **Service account name**. Choose **Create and continue**. The **Grant this service account access to the project** page opens.

1. For **Select a role**, choose **BigQuery**, and then choose **BigQuery Admin**. AWS SCT uses this role’s permissions to manage resources within the project and load your BigQuery metadata into the migration project.

1. Choose **Add another role**. For **Select a role**, choose **Cloud Storage**, and then choose **Storage Admin**. AWS SCT uses this role’s full control of data objects and buckets to extract your data from BigQuery and then load it into Amazon Redshift.

1. Choose **Continue**, and then choose **Done**.

1. On the [Service accounts](https://console.cloud.google.com/iam-admin/serviceaccounts) page, choose the service account that you created.

1. Choose **Keys**, **Add key**, **Create new key**.

1. Choose **JSON**, and then choose **Create**. Choose the folder where you want to save your private key, or check your browser’s default downloads folder.
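After the key file downloads, you can optionally confirm that it contains the fields AWS SCT needs. The following Python sketch checks for the standard fields of a Google Cloud service account key file; the sample key body below is a made-up stand-in, not a real key:

```python
import json

# Fields present in a standard Google Cloud service account key file.
REQUIRED_FIELDS = {"type", "project_id", "private_key_id", "private_key", "client_email"}

def validate_key_file_text(text):
    """Parse a service account key and report any missing fields."""
    key = json.loads(text)
    if key.get("type") != "service_account":
        raise ValueError("not a service account key: type=%r" % key.get("type"))
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError("missing fields: %s" % sorted(missing))
    return key["project_id"]

# Minimal stand-in key body (a real key is much longer):
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-bigquery-project",
    "private_key_id": "abc123",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "sct-migration@my-bigquery-project.iam.gserviceaccount.com",
})
print(validate_key_file_text(sample))  # my-bigquery-project
```

In practice you would read the downloaded `.json` file from disk instead of building the sample inline.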

# Step 2: Create an Amazon Redshift Cluster


To store your data in the AWS Cloud, you can use your existing Amazon Redshift cluster or create a new one. You don’t need to create any tables because AWS SCT automates this process.

If you don’t plan to migrate data as part of this walkthrough, you can skip this step. To see how AWS SCT converts your database code objects, use a virtual Amazon Redshift target in your project. For more information, see [Using virtual targets](https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_Mapping.VirtualTargets.html).

 **To create an Amazon Redshift cluster** 

1. Sign in to the AWS Management Console and open the Amazon Redshift console at [https://console.aws.amazon.com/redshift/](https://console.aws.amazon.com/redshift/).

1. On the navigation menu, choose **Clusters**.

1. Choose **Create cluster**.

1. For **Cluster identifier**, enter the unique name of your Amazon Redshift cluster.

1. Choose **Free trial**.

1. For **Admin user name**, enter the login for the admin user of your Amazon Redshift cluster.

1. For **Admin user password**, enter the password for the admin user.

1. Choose **Create cluster**.

After you create your Amazon Redshift database, configure a new database user. Then, use the credentials of this user in AWS SCT to access your Amazon Redshift cluster. We don’t recommend using the admin user for the migration.

Make sure that you grant the following privileges to this new user to complete the migration:
+  `CREATE ON DATABASE` — allows the user to create new schemas in the database.
+  `GRANT USAGE ON LANGUAGE` — allows the user to create new functions and procedures in the database.
+  `GRANT SELECT ON ALL TABLES IN SCHEMA pg_catalog` — provides the user with system information about the Amazon Redshift cluster.
+  `GRANT SELECT ON pg_class_info` — provides the user with information about the distribution style of tables.

You can use the following code example to create a database user and grant the privileges.

```
CREATE USER user_name PASSWORD 'your_password';
GRANT CREATE ON DATABASE db_name TO user_name;
GRANT USAGE ON LANGUAGE plpythonu TO user_name;
GRANT USAGE ON LANGUAGE plpgsql TO user_name;
GRANT SELECT ON ALL TABLES IN SCHEMA pg_catalog TO user_name;
GRANT SELECT ON pg_class_info TO user_name;
GRANT SELECT ON sys_serverless_usage TO user_name;
GRANT SELECT ON pg_database_info TO user_name;
GRANT SELECT ON pg_statistic TO user_name;
```

In the preceding example, replace *user\_name* with the name of your user. Then, replace *db\_name* with the name of your target Amazon Redshift database. Finally, replace *your\_password* with a secure password.
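If you script this step, you can fill in the placeholders programmatically to avoid typos. The following sketch renders the same statements; the identifier check is a conservative assumption for illustration, not the full Amazon Redshift identifier grammar:

```python
import re

SQL_TEMPLATE = """\
CREATE USER {user} PASSWORD '{password}';
GRANT CREATE ON DATABASE {db} TO {user};
GRANT USAGE ON LANGUAGE plpythonu TO {user};
GRANT USAGE ON LANGUAGE plpgsql TO {user};
GRANT SELECT ON ALL TABLES IN SCHEMA pg_catalog TO {user};
GRANT SELECT ON pg_class_info TO {user};
GRANT SELECT ON sys_serverless_usage TO {user};
GRANT SELECT ON pg_database_info TO {user};
GRANT SELECT ON pg_statistic TO {user};
"""

def render_grants(user, db, password):
    """Substitute the user, database, and password into the grant script."""
    # Conservative check: accept plain lowercase identifiers only.
    for name in (user, db):
        if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
            raise ValueError("unexpected identifier: %r" % name)
    return SQL_TEMPLATE.format(user=user, db=db, password=password)

sql = render_grants("sct_migration", "dev", "ExamplePassw0rd!")
print(sql.splitlines()[0])  # CREATE USER sct_migration PASSWORD 'ExamplePassw0rd!';
```

The user name `sct_migration` and the password are placeholder examples; run the rendered script with your SQL client of choice.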

# Step 3: Create Buckets to Store Your Temporary Data


Data migration from BigQuery to Amazon Redshift includes the following steps:

1. Export data from BigQuery to a Cloud Storage bucket.

1. Extract data from a Cloud Storage bucket.

1. Upload data to an Amazon Simple Storage Service (Amazon S3) bucket.

1. Copy data from an S3 bucket to Amazon Redshift.

You need all four steps because you can’t access data directly in BigQuery and you can’t upload data directly to Amazon Redshift. Because of these limitations, you need to create buckets to store your data during migration.

 **To create a Cloud Storage bucket** 

1. Sign in to the [Google Cloud management console](https://console.cloud.google.com/).

1. Open the [Cloud Storage Browser](https://console.cloud.google.com/storage/browser) page.

1. Choose **Create bucket**.

1. For **Name your bucket**, enter a name for your Cloud Storage bucket.

1. On the **Choose where to store your data** page, choose **Region** for **Location type** and then choose your region for **Location**.

1. Leave the default values for other options, and choose **Create**.

 **To create an Amazon S3 bucket** 

1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. Choose **Create bucket**.

1. For **Bucket name**, enter the globally unique name of your Amazon S3 bucket.

1. For **AWS Region**, choose the AWS Region where you want the bucket to reside. Choose a Region close to you to minimize latency and costs.

1. Leave the default values for other options, and choose **Create bucket**.
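Amazon S3 bucket names must be globally unique and follow the S3 naming rules. As a quick local sanity check, the following sketch validates the most common rules only, not every restriction in the S3 documentation:

```python
import re

def is_valid_bucket_name(name):
    """Check common Amazon S3 bucket naming rules:
    3-63 characters; lowercase letters, numbers, dots, hyphens;
    must begin and end with a letter or number; not formatted like an IP address."""
    if not 3 <= len(name) <= 63:
        return False
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
        return False
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):  # looks like an IP address
        return False
    if ".." in name:  # no adjacent periods
        return False
    return True

print(is_valid_bucket_name("my-sct-migration-bucket"))  # True
print(is_valid_bucket_name("My_Bucket"))                # False
```

A valid name can still be rejected at creation time if another AWS account already owns it, because uniqueness is global.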

# Step 4: Install AWS SCT on Your Local Computer


In this step, you install and configure the AWS Schema Conversion Tool. In this walkthrough, we run AWS SCT and the data extraction agent on Windows. However, you can use AWS SCT and data extraction agents on other supported operating systems. For more information, see [Installing the schema conversion tool](https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_Installing.html#CHAP_Installing.Procedure) and [Installing extraction agents](https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/agents.dw.html#agents.Installing).

**To install AWS SCT**

1. Download the compressed file that contains the AWS SCT installer for Microsoft Windows from [https://s3.amazonaws.com/publicsctdownload/Windows/aws-schema-conversion-tool-1.0.latest.zip](https://s3.amazonaws.com/publicsctdownload/Windows/aws-schema-conversion-tool-1.0.latest.zip).

1. Extract the AWS SCT installer file.

1. Run the AWS SCT installer file that you extracted in the previous step.

1. Choose **Next**, accept the terms of the License Agreement, and choose **Next** again.

1. Enter the path to the folder where you want to install AWS SCT, and choose **Next**.

1. Choose **Install**.

1. Choose **Finish** to close the installation wizard.

Now you can run AWS SCT. Before you create a new project, make sure that you add the path to an Amazon Redshift JDBC driver in the application settings. You don’t need a JDBC driver to connect to your BigQuery project.

**To configure driver settings in AWS SCT**

1. Download an Amazon Redshift JDBC driver version 2.1.0.9 or later from [https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-download-driver.html](https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-download-driver.html).

1. Extract the JDBC driver from the compressed file that you downloaded.

1. Open AWS SCT, and choose **Global settings** from **Settings**.

1. Choose **Drivers**.

1. For **Amazon Redshift driver path**, choose **Browse**, and then choose the `redshift-jdbc42-2.1.0.9.jar` file that you extracted.

1. Choose **Apply**, and then choose **OK** to close the settings window.

To access AWS services such as Amazon S3 from AWS SCT, you configure an AWS service profile. An AWS service profile is a set of AWS credentials that includes your AWS access key, AWS secret access key, AWS Region, and Amazon S3 bucket.

**To create an AWS service profile in AWS SCT**

1. Open AWS SCT, and choose **Global settings** from **Settings**.

1. Choose **AWS service profiles**.

1. Choose **Add a new AWS service profile**.

1. For **Profile name**, enter a descriptive name for your profile.

1. For **AWS access key**, enter your AWS access key.

1. For **AWS secret key**, enter your AWS secret access key. For more information about AWS access keys, see [Programmatic access](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys).

1. For **Region**, choose the AWS Region where you created your Amazon S3 bucket in the previous step.

1. For **Amazon S3 bucket folder**, choose the Amazon S3 bucket that you created in the previous step.

1. Choose **Apply**, and then choose **OK** to close the settings window.

# Step 5: Create an AWS SCT Project


After you configure AWS SCT, create a new migration project.

1. In AWS SCT, choose **File**, then choose **New Project**.

1. For **Project name**, enter a descriptive name of your project, and then choose **OK**.

1. Choose **Add source** to add a source BigQuery data warehouse to your project, then choose **BigQuery**, and choose **Next**.

1. For **Connection name**, enter a name for your source data warehouse. AWS SCT displays this name in the tree in the left panel.

1. For **Key path**, choose **Browse** and then choose the BigQuery service account key file that you created in step 1.

1. Choose **Connect** to close the dialog box and to connect to your BigQuery data warehouse.

1. Choose **Add target** to add a target Amazon Redshift database to your project, then choose **Amazon Redshift**, and choose **Next**.

1. If you store your database credentials in AWS Secrets Manager, choose your secret and then choose **Populate**. For more information, see [Using Secrets Manager](https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_UserInterface.html#CHAP_UserInterface.SecretsManager).

   If you don’t use Secrets Manager, enter your database credentials manually.
   + For **Connection name**, enter a name for your target data warehouse. AWS SCT displays this name in the tree in the right panel.
   + For **Server name**, enter the server name of the Amazon Redshift cluster that you created in step 2. You can copy the server name from the **JDBC URL** value in the **General information** section for your Amazon Redshift cluster. Remove `jdbc:redshift://` and the trailing `:port/database` portion from the URL that you copied.
   + For **Server port**, enter 5439.
   + For **User name**, enter the name of the user that you created in step 2.
   + For **Password**, enter the password for the user that you created in step 2.

1. Turn off **Use AWS Glue** and choose **Connect**.

1. In the tree in the left panel, choose your BigQuery dataset. In the tree in the right panel, choose your target Amazon Redshift database. Choose **Create mapping**. You can add multiple mapping rules to a single AWS SCT project. For more information about mapping rules, see [Creating mapping rules](https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_Mapping.html).

1. Choose **Main view**.
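The server name extraction described in the connection steps can be sketched as follows. The sample URL is a made-up example of the typical `jdbc:redshift://host:port/database` form, not a real cluster:

```python
def server_name_from_jdbc_url(jdbc_url):
    """Strip the jdbc:redshift:// prefix and the :port/database suffix,
    leaving only the cluster host name."""
    prefix = "jdbc:redshift://"
    if not jdbc_url.startswith(prefix):
        raise ValueError("not a Redshift JDBC URL: %r" % jdbc_url)
    host_port_db = jdbc_url[len(prefix):]
    return host_port_db.split(":", 1)[0]

url = "jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev"
print(server_name_from_jdbc_url(url))
# examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com
```

The port that you removed (5439 by default) goes into the **Server port** field instead.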

# Step 6: Convert Database Schemas


After you create a new AWS SCT project, convert your source database schemas and apply converted code to your target database.

1. In the tree in the left panel, choose your source dataset. Open the context (right-click) menu, and choose **Convert schema**.

1. Choose **Yes** for the confirmation message. AWS SCT then converts your schema to the target database format.

1.  AWS SCT also generates the assessment report. This report includes database objects that require manual conversion. To view this report, choose **View**, and then choose **Assessment report view**.

1. On the **Action items** tab, AWS SCT provides you with the recommended actions for each conversion issue.

1. Check the report and make changes in your source or converted code where necessary. You can optionally save the report as a CSV or PDF file for later analysis.

1. Choose **Action Items**, and review any recommendations that you see.

1. In the tree in the right panel, choose the converted schema. Open the context (right-click) menu, and choose **Apply to database** to apply the schema scripts to the target Amazon Redshift cluster.

# Step 7: Install and Configure Data Extraction Agents


 AWS SCT uses a data extraction agent to migrate data from BigQuery to Amazon Redshift. The .zip file that you downloaded to install AWS SCT includes the extraction agent installer file. In this walkthrough, we install the data extraction agent on Windows. However, you can install data extraction agents on Red Hat Enterprise Linux or Ubuntu. For more information, see [Installing extraction agents](https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/agents.dw.html#agents.Installing).

 **To install and configure a data extraction agent** 

1. Find the `aws-schema-conversion-tool-extractor-2.0.1.<version>.msi` file in the `agents` folder. The *<version>* number in the file name depends on the version of AWS SCT that you use. To migrate data from BigQuery to Amazon Redshift, make sure that you use extraction agent version 665 or later.

1. Run the file.

1. Choose **Next**, accept the terms of the License Agreement, and choose **Next** again.

1. Enter the path to the folder where you want to install the data extraction agent, and choose **Next**.

1. Choose **Install**.

1. On Windows, the data extraction agent installer launches the configuration wizard in the command prompt window. On Linux, run the `sct-extractor-setup.sh` file from the location where you installed the agent.

1. For **Listening port**, enter `8192`. This is the default value. You can choose another port.

1. For **Add a source vendor**, enter `no`. You don’t need to configure the data extraction agent to work with your BigQuery data warehouse because you don’t need a driver to connect to BigQuery.

1. For **Add the Amazon Redshift driver**, enter `yes` and then enter the path to the Amazon Redshift JDBC driver that you downloaded in [Step 4](bigquery-redshift-migration-step-4.md).

1. For **Working folder**, enter the folder where the data extraction agent can store its data. Choose a folder that you can write to without admin rights.

1. For **Enable SSL communication**, enter `no`. Then enter `yes` to confirm your choice. In this walkthrough, we don’t use SSL to connect to databases. If you use SSL, configure the agent settings accordingly.

# Step 8: Run Your Migration Task


After you install and configure the data extraction agent, register it in AWS SCT.

 **To register a data extraction agent** 

1. In AWS SCT, for **View** choose **Data migration view (other)**, and then choose **Register**.

1. For **Description**, enter a name for your data extraction agent.

1. For **Host name**, enter `0.0.0.0` because you run the data extraction agent on the same computer as AWS SCT. If you install the data extraction agent on another computer, enter the IP address of this computer.

1. For **Port**, enter `8192`. If you configured another listening port in the previous step, use the value that you configured.

1. Choose **Register**.

 AWS SCT can now use the data extraction agent for data migration tasks.

When you migrate large tables, you can split the data into virtual partitions in AWS SCT. AWS SCT then creates a subtask for each virtual partition.

**To create virtual partitions for your table in AWS SCT**

1. In the tree in the left panel, choose your source table. Open the context (right-click) menu, and choose **Add virtual partitioning**.

1. For **Partition type**, choose **Range**.

1. For **Column name**, choose the column of your table. AWS SCT partitions data based on a range of column values. This partition type creates a `WHERE` clause, and you provide the range of values for each partition.

1. For **Values**, enter a list of values for the partitioned column.

1. Choose **OK** to create virtual partitions for your table.
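Conceptually, range partitioning turns the list of boundary values into one `WHERE` clause per partition. This illustrative sketch mimics the behavior described above; the exact clauses that AWS SCT generates may differ, and the column name and values are made-up examples:

```python
def range_partition_clauses(column, boundaries):
    """Build one WHERE clause per virtual partition from sorted boundary values."""
    boundaries = sorted(boundaries)
    # First partition: everything up to the lowest boundary.
    clauses = ["{0} <= {1}".format(column, boundaries[0])]
    # Middle partitions: one half-open range per adjacent pair of boundaries.
    for low, high in zip(boundaries, boundaries[1:]):
        clauses.append("{0} > {1} AND {0} <= {2}".format(column, low, high))
    # Last partition: everything above the highest boundary.
    clauses.append("{0} > {1}".format(column, boundaries[-1]))
    return clauses

for clause in range_partition_clauses("order_id", [1000, 2000, 3000]):
    print(clause)
# order_id <= 1000
# order_id > 1000 AND order_id <= 2000
# order_id > 2000 AND order_id <= 3000
# order_id > 3000
```

Three boundary values produce four partitions, so AWS SCT would run four extraction subtasks for this table.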

Now, you can start the data migration.

 **To create and run a migration task** 

1. In the tree in the left panel, choose your source table. Open the context (right-click) menu, and choose **Create local task**.

1. For **Task name**, enter a descriptive name for your data migration task.

1. For **Migration mode**, choose **Extract, upload, and copy**.

1. Choose **Advanced**. For **Google CS bucket folder**, enter the name of the Cloud Storage bucket that you created in [Step 3](bigquery-redshift-migration-step-3.md).

1. Choose **Amazon S3 settings**. For **Amazon S3 bucket folder**, enter the name of your Amazon S3 bucket that you created in [Step 3](bigquery-redshift-migration-step-3.md).

1. Choose **Create** and then choose **Start**.

The AWS SCT data extraction agent migrates data from your BigQuery dataset to Amazon Redshift. You can manage the migration process in AWS SCT. After the data extraction agent completes the migration, check your data in Amazon Redshift. Make sure that all your source data was migrated to the new target database.
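One simple way to spot-check the result is to compare per-table row counts from both sides. The sketch below compares two count dictionaries that you would collect yourself, for example with a `SELECT COUNT(*)` per table against BigQuery and Amazon Redshift; the table names and counts here are made up:

```python
def compare_row_counts(source_counts, target_counts):
    """Return tables whose row counts differ or that are missing in the target,
    mapped to (source_count, target_count)."""
    mismatches = {}
    for table, count in source_counts.items():
        target = target_counts.get(table)  # None if the table is missing
        if target != count:
            mismatches[table] = (count, target)
    return mismatches

source = {"orders": 120000, "customers": 5000, "line_items": 480000}
target = {"orders": 120000, "customers": 5000, "line_items": 479998}
print(compare_row_counts(source, target))  # {'line_items': (480000, 479998)}
```

Matching counts don’t prove the contents are identical, but a mismatch is a reliable signal that a table needs a closer look.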

# Step 9: Delete Walkthrough Resources


After you complete this step-by-step guide, make sure that you delete your Amazon Redshift cluster to avoid additional charges.

 **To delete an Amazon Redshift cluster** 

1. Sign in to the AWS Management Console and open the Amazon Redshift console at [https://console.aws.amazon.com/redshift/](https://console.aws.amazon.com/redshift/).

1. On the navigation menu, choose **Clusters**.

1. Choose the cluster to delete.

1. For **Actions**, choose **Delete**. The **Delete cluster** page appears.

1. Choose **Delete cluster**.