

# Splitting a shard in a DB shard group
Splitting a shard

You can split a shard in a DB shard group manually into two smaller shards. This is called a *user-initiated* shard split.

Aurora PostgreSQL Limitless Database can also split shards when they have very large amounts of data or very high usage. This is called a *system-initiated* shard split.

**Topics**
+ [

## Prerequisites
](#limitless-shard-split.prereqs)
+ [

## Splitting a shard
](#limitless-shard-split.proc)
+ [

## Tracking shard splits
](#limitless-shard-split.track)
+ [

## Finalizing shard splits
](#limitless-shard-split.finalize)
+ [

## Canceling a shard split
](#limitless-shard-split.cancel)

## Prerequisites


User-initiated shard splits have the following prerequisites:
+ You must have a DB shard group.
+ The DB shard group can't be empty: it must contain at least one sharded table.
+ A user must have the `rds_aurora_limitless_cluster_admin` privilege. The `rds_superuser` has this privilege; therefore the master user also has it. The `rds_superuser` can grant the privilege to other users:

  ```
  /* Logged in as the master user or a user with rds_superuser privileges */
  CREATE USER username;
  GRANT rds_aurora_limitless_cluster_admin to username;
  ```
+ You must know the subcluster (node) ID of the shard that you want to split. You can obtain the ID by using the following query:

  ```
  SELECT * FROM rds_aurora.limitless_subclusters;
  
   subcluster_id | subcluster_type
  ---------------+-----------------
   1             | router
   2             | router
   3             | shard
   4             | shard
   5             | shard
   6             | shard
  ```

To enable system-initiated shard splits, set the following DB cluster parameters in a custom DB cluster parameter group associated with your DB cluster:


| Parameter | Value | 
| --- | --- | 
|  `rds_aurora.limitless_enable_auto_scale`  |  `on`  | 
|  `rds_aurora.limitless_auto_scale_options`  |  Either `split_shard`or `add_router,split_shard`  | 
|  `rds_aurora.limitless_finalize_split_shard_mode`  |  This parameter determines how *system-initiated* shard splits are finalized. The value can be one of the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/limitless-shard-split.html) For more information, see [Finalizing shard splits](#limitless-shard-split.finalize).  This parameter applies only to system-initiated shard splits.   | 

For more information, see [DB cluster parameter groups for Amazon Aurora DB clusters](USER_WorkingWithDBClusterParamGroups.md).

## Splitting a shard


To split a shard in a DB shard group, use the `rds_aurora.limitless_split_shard` function. This function starts a shard-split job that runs asynchronously.

```
SELECT rds_aurora.limitless_split_shard('subcluster_id');
```

Wait for the return of a job ID upon successful submission of the job, for example:

```
SELECT rds_aurora.limitless_split_shard('3');

    job_id
---------------
 1691300000000
(1 row)
```

**Note**  
Concurrent shard split operations are not supported. Execute each operation sequentially and complete each operation before initiating another addition operation.

## Tracking shard splits


You can use the job ID to track a shard-split job. To describe a particular job and get more details about it, run the following query:

```
SELECT * FROM rds_aurora.limitless_list_shard_scale_jobs(job_id);
```

For example:

```
SELECT * FROM rds_aurora.limitless_list_shard_scale_jobs(1691300000000);

    job_id     |    action   |      job_details      | status  |    submission_time     |                  message                  
---------------+-------------+-----------------------+---------+------------------------+-------------------------------------------
 1691300000000 | SPLIT_SHARD | Split Shard 3 by User | SUCCESS | 2023-08-06 05:33:20+00 | Scaling job succeeded.                 +
               |             |                       |         |                        | New shard instance with ID 7 was created.
(1 row)
```

The query returns an error when you pass a nonexistent job as the input.

```
SELECT * from rds_aurora.limitless_list_shard_scale_jobs(1691300000001);

ERROR:  no job found with the job ID provided
```

You can track the status of all shard-split jobs by using the same query without a job ID, for example:

```
SELECT * FROM rds_aurora.limitless_list_shard_scale_jobs();

    job_id     |   action    |  job_details          |   status    |    submission_time     |                  message                 
---------------+-------------+-----------------------+-------------+------------------------+--------------------------------------------------------------
 1691200000000 | SPLIT_SHARD | Split Shard 3 by User | IN_PROGRESS | 2023-08-05 01:46:40+00 | 
 1691300000000 | SPLIT_SHARD | Split Shard 4 by User | SUCCESS     | 2023-08-06 05:33:20+00 | Scaling job succeeded. +
               |             |                       |             |                        | New shard instance with ID 7 was created.
 1691400000000 | SPLIT_SHARD | Split Shard 5 by User | FAILED      | 2023-08-07 09:20:00+00 | Error occurred for the add shard job 1691400000000.
               |             |                       |             |                        | Retry the command. If the issue persists, contact AWS Support.
 1691500000000 | SPLIT_SHARD | Split Shard 5 by User | CANCELED    | 2023-08-07 09:20:00+00 | Scaling job was cancelled.
(4 rows)
```

The job status can be one of the following:
+ `IN_PROGRESS` – The shard-split job has been submitted and is in progress. You can have only one job in progress at a time.
+ `PENDING` – The shard-split job is waiting for you to finalize it. For more information, see [Finalizing shard splits](#limitless-shard-split.finalize).
+ `CANCELLATION_IN_PROGRESS` – The shard-split job is being canceled by the user.
+ `CANCELED` – The shard-split job has been successfully canceled by the user or by the system.
+ `SUCCESS` – The shard-split job completed successfully. The `message` field contains the instance ID of the new shard.
+ `FAILED` – The shard-split job failed. The `message` field contains the details of the failure and any actions that can be taken as a followup to the failed job.

## Finalizing shard splits


Finalizing is the last step of the shard-split process. It causes some downtime. If you start a shard-split job, then finalizing happens immediately after the job completes successfully.

Sometimes the system splits shards based on workload, when you've enabled system-initiated shard splits by using the `rds_aurora.limitless_enable_auto_scale` parameter.

In this case, you can choose whether finalizing happens immediately, or at a time that you choose. You use the `rds_aurora.limitless_finalize_split_shard_mode` DB cluster parameter to choose when it happens:
+ If you set the value to `immediate`, it happens immediately.
+ If you set the value to `user_initiated`, you have to finalize the shard-split job manually.

  An RDS event is sent to you, and the status of the shard-split job is set to `PENDING`.

When set to `user_initiated`, you use the `rds_aurora.limitless_finalize_split_shard` function to finalize the shard-split job:

```
SELECT * FROM rds_aurora.limitless_finalize_split_shard(job_id);
```

**Note**  
This function applies only to shard splits that are initiated by the system, not by you.

## Canceling a shard split


You can cancel a user-initiated or system-initiated shard split that's `IN_PROGRESS` or `PENDING`. You need the job ID to cancel it.

```
SELECT * from rds_aurora.limitless_cancel_shard_scale_jobs(job_id);
```

No output is returned unless there's an error. You can track the cancellation using a job-tracking query.