

# Working with Amazon OpenSearch Service direct queries
<a name="direct-query"></a>

Use Amazon OpenSearch Service direct query to analyze data in Amazon CloudWatch Logs, Amazon S3, Amazon Security Lake, and Amazon Managed Service for Prometheus without building ingestion pipelines. This zero-ETL integration lets you query data in place using PromQL, PPL, or SQL, and explore it in **Discover**.

To get started with Amazon Managed Service for Prometheus, CloudWatch Logs, or Security Lake, configure your data source in the [AWS Management Console](https://console.aws.amazon.com/aos/home#opensearch/data-sources). For Amazon S3, use domain connections and create tables with SQL in Query Workbench. CloudWatch Logs and Security Lake use preconfigured data sources and schema. For Amazon S3 and Security Lake, data is cataloged using AWS Glue Data Catalog tables: Amazon S3 requires you to create these tables manually, while Security Lake configures them automatically as part of the ingestion process.

# Directly querying Amazon S3 data in OpenSearch Service
<a name="direct-query-s3-overview"></a>

This section walks you through creating and configuring a data source integration in Amazon OpenSearch Service so that you can efficiently query and analyze your data stored in Amazon S3.

In the following pages, you'll learn how to set up an Amazon S3 direct-query data source, meet the necessary prerequisites, and follow step-by-step procedures using both the AWS Management Console and the OpenSearch Service API. This section also covers important next steps, including mapping AWS Glue Data Catalog roles and configuring access controls in OpenSearch Dashboards.

**Topics**
+ [Creating an Amazon S3 data source integration in OpenSearch Service](direct-query-s3-creating.md)
+ [Configuring and querying an S3 data source in OpenSearch Dashboards](direct-query-s3-configure.md)
+ [Pricing](#direct-query-s3-pricing)
+ [Limitations](#direct-query-s3-limitations)
+ [Recommendations](#direct-query-s3-recommendations)
+ [Quotas](#direct-query-s3-quotas)
+ [Supported AWS Regions](#direct-query-s3-regions)

# Creating an Amazon S3 data source integration in OpenSearch Service
<a name="direct-query-s3-creating"></a>

You can create a new Amazon S3 direct-query data source for OpenSearch Service through the AWS Management Console or the API. Each new data source uses the AWS Glue Data Catalog to manage tables that represent Amazon S3 buckets. 

**Topics**
+ [Prerequisites](#direct-query-s3-prereq)
+ [Procedure](#direct-query-s3-create)
+ [Next steps](#direct-query-s3-next-steps)
+ [Map the AWS Glue Data Catalog role](#direct-query-s3-permissions)
+ [Additional resources](#direct-query-s3-additional-resources)

## Prerequisites
<a name="direct-query-s3-prereq"></a>

Before you get started, make sure that you have reviewed the following documentation:
+ [Limitations](direct-query-s3-overview.md#direct-query-s3-limitations)
+ [Recommendations](direct-query-s3-overview.md#direct-query-s3-recommendations)
+ [Quotas](direct-query-s3-overview.md#direct-query-s3-quotas)

Before you can create a data source, you must have the following resources in your AWS account:
+ **An OpenSearch domain with version 2.13 or later.** This is the foundation for setting up the direct query integration. For instructions on setting this up, see [Creating OpenSearch Service domains](createupdatedomains.md#createdomains).
+ **One or more S3 buckets.** You’ll need to specify the buckets containing data that you want to query, and a bucket to store your query checkpoints in. For instructions on creating an S3 bucket, see [Creating a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) in the *Amazon S3 User Guide*.
+ **(Optional) One or more AWS Glue tables.** Querying data in Amazon S3 requires tables set up in AWS Glue Data Catalog that point to the S3 data. You must create the tables using OpenSearch Query Workbench. Existing Hive tables are not compatible.

  If this is the first time you're setting up an Amazon S3 data source, you must create an admin data source to configure all of your AWS Glue Data Catalog tables. You can do this by installing OpenSearch out-of-the-box integrations or by using OpenSearch Query Workbench to create custom SQL tables for advanced use cases. For examples of creating tables for VPC, CloudTrail, and AWS WAF logs, see the documentation on GitHub for [VPC](https://github.com/opensearch-project/opensearch-catalog/blob/main/integrations/observability/amazon_vpc_flow/assets/create_table_vpc_schema-1.0.0.sql), [CloudTrail](https://github.com/opensearch-project/opensearch-catalog/blob/main/integrations/observability/aws_cloudtrail/assets/create_table_cloud-trail-records-1.0.0.sql), and [AWS WAF](https://github.com/opensearch-project/opensearch-catalog/blob/main/integrations/observability/aws_waf/assets/create_table-1.0.0.sql). After you create your tables, you can create new Amazon S3 data sources and restrict access to specific tables.
+ **(Optional) A manually created IAM role.** You can use this role to manage access to your data source. Alternatively, OpenSearch Service can automatically create a role for you with the required permissions. If you choose to use a manually created IAM role, follow the guidance in [Required permissions for manually created IAM roles](#direct-query-s3-additional-resources-required-permissions).

## Procedure
<a name="direct-query-s3-create"></a>

You can set up a direct-query data source on a domain with the AWS Management Console or the OpenSearch Service API.

### To set up a data source using the AWS Management Console
<a name="creating-direct-query-s3-console-create"></a>

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. In the left navigation pane, choose **Domains**. 

1. Select the domain that you want to set up a new data source for. This opens the domain details page. 

1. Choose the **Connections** tab below the general domain details and find the **Direct query** section.

1. Choose **Configure data source**.

1. Enter a name and an optional description for your new data source. 

1. Choose **Amazon S3 with AWS Glue Data Catalog**. 

1. Under **IAM permission access settings**, choose how to manage access.

   1. If you want to automatically create a role for this data source, follow these steps:

      1. Select **Create a new role**.

      1. Enter a name for the IAM role.

      1. Select one or more S3 buckets that contain data you want to query.

      1. Select a checkpoint S3 bucket to store your query checkpoints in.

      1. Select one or more AWS Glue databases or tables to define which data can be queried. If tables haven't been created yet, provide access to the default database.

   1. If you want to use an existing role that you manage yourself, follow these steps:

      1. Select **Use an existing role.**

      1. Select an existing role from the drop-down menu.
**Note**  
When using your own role, you must ensure it has all necessary permissions by attaching required policies from the IAM console. For more information, refer to the sample policy in [Required permissions for manually created IAM roles](#direct-query-s3-additional-resources-required-permissions).

1. Choose **Configure**. This opens the data source details screen with an OpenSearch Dashboards URL. You can navigate to this URL to complete the next steps.

### To set up a data source using the OpenSearch Service API
<a name="creating-direct-query-s3-api-create"></a>

Use the [AddDataSource](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_AddDataSource.html) API operation to create a new data source in your domain.

```
POST https://es.region.amazonaws.com/2021-01-01/opensearch/domain/domain-name/dataSource

{
   "DataSourceType": {
        "S3GlueDataCatalog": {
            "RoleArn": "arn:aws:iam::account-id:role/role-name"
        }
    },
   "Description": "data-source-description",
   "Name": "my-data-source"
}
```
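
If you use an AWS SDK, the same operation is available there; for example, the AWS SDK for Python (Boto3) exposes it as `add_data_source` on the OpenSearch Service client. The following sketch builds the request from the example above; the domain name, data source name, and role ARN are placeholder values.

```python
import json

# Placeholder values -- replace with your own domain, data source name, and role ARN.
DOMAIN_NAME = "my-domain"
DATA_SOURCE_NAME = "my-data-source"
ROLE_ARN = "arn:aws:iam::111122223333:role/my-data-source-role"

# The same parameters that the raw HTTP example sends.
request = {
    "DomainName": DOMAIN_NAME,
    "Name": DATA_SOURCE_NAME,
    "DataSourceType": {"S3GlueDataCatalog": {"RoleArn": ROLE_ARN}},
    "Description": "Example S3 direct-query data source",
}

print(json.dumps(request, indent=2))

# With AWS credentials configured, you could send the request with:
#   import boto3
#   boto3.client("opensearch").add_data_source(**request)
```

The SDK call is left commented out because it requires credentials and an existing domain; the printed payload mirrors the HTTP request body above.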

## Next steps
<a name="direct-query-s3-next-steps"></a>

### Visit OpenSearch Dashboards
<a name="direct-query-s3-next-steps-dashboard"></a>

After you create a data source, OpenSearch Service provides you with an OpenSearch Dashboards link. You can use this to configure access control, define tables, install out-of-the-box integrations, and query your data.

For more information, see [Configuring and querying an S3 data source in OpenSearch Dashboards](direct-query-s3-configure.md).

## Map the AWS Glue Data Catalog role
<a name="direct-query-s3-permissions"></a>

If your domain uses [fine-grained access control](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/fgac.html), then after you create a data source you must map non-admin users to an IAM role with AWS Glue Data Catalog access in order to run direct queries. To manually create a back-end `glue_access` role that you can map to the IAM role, perform the following steps:

**Note**  
Indexes are used for any queries against the data source. A user with read access to the request index for a given data source can read *all* queries against that data source. A user with read access to the result index can read results for *all* queries against that data source.

1. From the main menu in OpenSearch Dashboards, choose **Security**, **Roles**, and **Create roles**.

1. Name the role **glue_access**.

1. For **Cluster permissions**, select `indices:data/write/bulk*`, `indices:data/read/scroll`, `indices:data/read/scroll/clear`.

1. For **Index**, enter the following indexes to grant the role access to:
   + `.query_execution_request_<name of data source>`
   + `query_execution_result_<name of data source>`
   + `.async-query-scheduler`
   + `flint_*`

1. For **Index permissions**, select `indices_all`. 

1. Choose **Create**.

1. Choose **Mapped users**, **Manage mapping**. 

1. Under **Backend roles**, add the ARN of the AWS Glue role that needs permission to call your domain.

   ```
   arn:aws:iam::account-id:role/role-name
   ```

1. Select **Map** and confirm the role shows up under **Mapped users**.

For more information on mapping roles, see [Mapping roles to users](fgac.md#fgac-mapping).
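
If you prefer to script these steps, the OpenSearch security plugin's REST API accepts an equivalent role definition. The following sketch assumes a data source named `mydatasource`; substitute your own data source name in the index patterns.

```json
PUT _plugins/_security/api/roles/glue_access
{
  "cluster_permissions": [
    "indices:data/write/bulk*",
    "indices:data/read/scroll",
    "indices:data/read/scroll/clear"
  ],
  "index_permissions": [
    {
      "index_patterns": [
        ".query_execution_request_mydatasource",
        "query_execution_result_mydatasource",
        ".async-query-scheduler",
        "flint_*"
      ],
      "allowed_actions": [ "indices_all" ]
    }
  ]
}
```

You can then map the IAM role by sending a `PUT` request to `_plugins/_security/api/rolesmapping/glue_access` with the role ARN in the `backend_roles` array.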

## Additional resources
<a name="direct-query-s3-additional-resources"></a>

### Required permissions for manually created IAM roles
<a name="direct-query-s3-additional-resources-required-permissions"></a>

When creating a data source for your domain, you choose an IAM role to manage access to your data. You have two options:

1. Create a new IAM role automatically

1. Use an existing IAM role that you created manually

If you use a manually created role, you need to attach the correct permissions to it. The permissions must allow access to the specific data source and allow OpenSearch Service to assume the role, so that OpenSearch Service can securely access and interact with your data.

The following sample policy demonstrates the least-privilege permissions required to create and manage a data source. Broader permissions, such as `s3:*` or the `AdministratorAccess` policy, encompass the least-privilege permissions in the sample policy.

In the following sample policy, replace the *placeholder text* with your own information.

```
{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Sid":"HttpActionsForOpenSearchDomain",
         "Effect":"Allow",
         "Action":"es:ESHttp*",
         "Resource":"arn:aws:es:us-east-1:111122223333:domain/example.com/*"
      },
      {
         "Sid":"AmazonOpenSearchS3GlueDirectQueryReadAllS3Buckets",
         "Effect":"Allow",
         "Action":[
            "s3:GetObject",
            "s3:GetObjectVersion",
            "s3:ListBucket"
         ],
         "Condition":{
            "StringEquals":{
               "aws:ResourceAccount":"111122223333"
            }
         },
         "Resource":"*"
      },
      {
         "Sid":"AmazonOpenSearchDirectQueryGlueCreateAccess",
         "Effect":"Allow",
         "Action":[
            "glue:CreateDatabase",
            "glue:CreatePartition",
            "glue:CreateTable",
            "glue:BatchCreatePartition"
         ],
         "Resource":"*"
      },
      {
         "Sid":"AmazonOpenSearchS3GlueDirectQueryModifyAllGlueResources",
         "Effect":"Allow",
         "Action":[
            "glue:DeleteDatabase",
            "glue:DeletePartition",
            "glue:DeleteTable",
            "glue:GetDatabase",
            "glue:GetDatabases",
            "glue:GetPartition",
            "glue:GetPartitions",
            "glue:GetTable",
            "glue:GetTableVersions",
            "glue:GetTables",
            "glue:UpdateDatabase",
            "glue:UpdatePartition",
            "glue:UpdateTable",
            "glue:BatchGetPartition",
            "glue:BatchDeletePartition",
            "glue:BatchDeleteTable"
         ],
         "Resource":[
            "arn:aws:glue:us-east-1:111122223333:table/*",
            "arn:aws:glue:us-east-1:111122223333:database/*",
            "arn:aws:glue:us-east-1:111122223333:catalog",
            "arn:aws:es:us-east-1:111122223333:domain/domain_name"
         ],
         "Condition":{
            "StringEquals":{
               "aws:ResourceAccount":"111122223333"
            }
         }
      },
      {
         "Sid":"ReadAndWriteActionsForS3CheckpointBucket",
         "Effect":"Allow",
         "Action":[
            "s3:ListMultipartUploadParts",
            "s3:DeleteObject",
            "s3:GetObject",
            "s3:PutObject",
            "s3:GetBucketLocation",
            "s3:ListBucket"
         ],
         "Condition":{
            "StringEquals":{
               "aws:ResourceAccount":"111122223333"
            }
         },
         "Resource":[
            "arn:aws:s3:::amzn-s3-demo-bucket",
            "arn:aws:s3:::amzn-s3-demo-bucket/*"
         ]
      }
   ]
}
```


To support Amazon S3 buckets in different accounts, add a condition to the Amazon S3 statements in your policy that specifies the appropriate account IDs.

In the following sample condition, replace the *placeholder text* with your own information.

```
"Condition": {
    "StringEquals": {
        "aws:ResourceAccount": "{{accountId}}"
    }
}
```
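
For example, extending the read statement from the sample policy to also allow buckets owned by a second account (both account IDs are placeholders) could look like the following. With `StringEquals`, listing multiple values matches any one of them.

```json
{
   "Sid":"AmazonOpenSearchS3GlueDirectQueryReadAllS3Buckets",
   "Effect":"Allow",
   "Action":[
      "s3:GetObject",
      "s3:GetObjectVersion",
      "s3:ListBucket"
   ],
   "Condition":{
      "StringEquals":{
         "aws:ResourceAccount":[
            "111122223333",
            "444455556666"
         ]
      }
   },
   "Resource":"*"
}
```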

The role must also have the following trust policy, which allows the OpenSearch Service direct query service principal to assume the role.

```
{
    "Version":"2012-10-17",
    "Statement":[
       {
          "Effect":"Allow",
          "Principal":{
             "Service": "directquery.opensearchservice.amazonaws.com"
          },
          "Action":"sts:AssumeRole"
       }
     ]
}
```


For instructions on creating the role, see [Creating a role using custom trust policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-custom.html).

If you have fine-grained access control enabled in OpenSearch Service, a new OpenSearch fine-grained access control role will automatically be created for your data source. The name of the new fine-grained access control role will be `AWSOpenSearchDirectQuery <name of data source>`.

By default, the role has access to direct query data source indexes only. Although you can configure the role to limit or grant access to your data source, we recommend that you not adjust this role's access. **If you delete the data source, this role is deleted**, which removes access for any other users mapped to it.

# Configuring and querying an S3 data source in OpenSearch Dashboards
<a name="direct-query-s3-configure"></a>

Now that you've created your data source, you can configure security settings, define your Amazon S3 tables, or set up accelerated data indexing. This section walks you through various use cases with your data source in OpenSearch Dashboards before you query your data.

To configure the following sections, you must first navigate to your data source in OpenSearch Dashboards. In the left-hand navigation, under **Management**, choose **Data sources**. Under **Manage data sources**, select the name of the data source that you created in the console. 

## Create Spark tables using Query Workbench
<a name="direct-query-s3-configure-tables"></a>

Direct queries from OpenSearch Service to Amazon S3 use Spark tables within the AWS Glue Data Catalog. You can create tables from within the Query Workbench without having to leave OpenSearch Dashboards. 

To manage existing databases and tables in your data source, or to create new tables that you want to use direct queries on, choose **Query Workbench** from the left navigation and select the Amazon S3 data source from the data source drop-down.

To set up a table for VPC Flow logs stored in S3 in Parquet format, run the following query: 

```
CREATE TABLE datasourcename.gluedatabasename.vpclogstable (
  version INT, account_id STRING, interface_id STRING,
  srcaddr STRING, dstaddr STRING, srcport INT, dstport INT, protocol INT,
  packets BIGINT, bytes BIGINT, start BIGINT, `end` BIGINT, action STRING,
  log_status STRING, `aws-account-id` STRING, `aws-service` STRING,
  `aws-region` STRING, year STRING, month STRING, day STRING, hour STRING)
USING parquet
PARTITIONED BY (`aws-account-id`, `aws-service`, `aws-region`, year, month, day, hour)
LOCATION "s3://accountnum-vpcflow/AWSLogs"
```

After creating the table, run the following query to ensure that it's compatible with direct queries:

```
MSCK REPAIR TABLE datasourcename.gluedatabasename.vpclogstable
```
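
After the repair completes, you can confirm that the table returns data with a simple query in Query Workbench. The table and column names below match the hypothetical VPC flow log example above.

```sql
SELECT srcaddr, dstaddr, action, bytes
FROM datasourcename.gluedatabasename.vpclogstable
LIMIT 10
```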

## Set up integrations for popular AWS log types
<a name="direct-query-s3-setup-integration"></a>

You can integrate AWS log types stored in Amazon S3 with OpenSearch Service. Use OpenSearch Dashboards to install integrations that create AWS Glue Data Catalog tables, saved queries, and dashboards. These integrations use indexed views to keep dashboards updated.

For instructions to install an integration, see [Installing an integration asset](https://opensearch.org/docs/latest/integrations/#installing-an-integration-asset) in the OpenSearch documentation.

When you select an integration, make sure it has the `S3 Glue` tag. 

When you set up the integration, specify **S3 Connection** for the connection type. Then, select the data source for the integration, the Amazon S3 location of the data, the checkpoint to manage acceleration indexing, and the assets required for your use case.

**Note**  
Make sure that your data source's IAM role has write permissions for the S3 checkpoint location. Without these permissions, the integration's accelerations will fail.

## Set up access control
<a name="direct-query-s3-configure-ac"></a>

On the details page for your data source, find the **Access controls** section and choose **Edit**. If the domain has fine-grained access control enabled, choose **Restricted** and select which roles you want to provide with access to the new data source. You can also choose **Admin only** if you only want the administrator to have access to the data source.

**Important**  
Indexes are used for any queries against the data source. A user with read access to the request index for a given data source can read *all* queries against that data source. A user with read access to the result index can read results for *all* queries against that data source.

## Querying S3 data in OpenSearch Discover
<a name="direct-querying-s3-query"></a>

After you set up your tables and configure any optional query acceleration, you can start analyzing your data. To query your data, go to **Discover** in OpenSearch Dashboards and select your data source name from the drop-down menu.

If you're using a skipping index or haven't created an index, you can use SQL or PPL to query your data. If you've configured a materialized view or a covering index, you already have an index and can use Dashboards Query Language (DQL) throughout Dashboards. Currently, only the Observability plugin supports PPL and only the Query Workbench plugin supports SQL. To query data using the OpenSearch Service API, see the [async API documentation](https://github.com/opensearch-project/sql/blob/main/docs/user/interfaces/asyncqueryinterface.rst).

**Note**  
Not all SQL and PPL statements, commands and functions are supported. For a list of supported commands, see [Supported SQL and PPL commands](direct-query-supported-commands.md).  
If you’ve created a materialized view or covering index, you can use DQL to query the data that you’ve indexed.

## Troubleshooting
<a name="s3-troubleshooting"></a>

There might be instances when results don’t return as expected. If you experience any issues, make sure that you're following the [Recommendations](direct-query-s3-overview.md#direct-query-s3-recommendations).

## Pricing
<a name="direct-query-s3-pricing"></a>

Amazon OpenSearch Service offers OpenSearch Compute Unit (OCU) pricing for Amazon S3 direct queries. As you run direct queries, you incur charges per OCU-hour, listed as the `DirectQuery` OCU usage type on your bill. You also incur separate charges from Amazon S3 for data storage.

There are two types of direct queries: interactive queries and indexed view queries.
+ *Interactive queries* populate the data selector and perform analytics on your data in Amazon S3. When you run a new query from Discover, OpenSearch Service starts a session that lasts a minimum of three minutes and keeps it active so that subsequent queries run quickly.
+ *Indexed view queries* use compute to maintain indexed views in OpenSearch Service. These queries usually take longer because they ingest a varying amount of data into a named index. For Amazon S3 data sources, the indexed data is stored on your domain and billed based on the instance type you purchase.

For more information, see the Direct Query and Serverless sections within [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Limitations
<a name="direct-query-s3-limitations"></a>

The following limitations apply to direct queries in Amazon S3:
+ Direct query for S3 is only available on OpenSearch Service domains running OpenSearch version 2.13 or later, and requires access to AWS Glue Data Catalog. Existing AWS Glue Data Catalog tables must be recreated using SQL in OpenSearch Query Workbench.
+ Direct query for S3 requires you to specify a checkpoint bucket on Amazon S3. This bucket maintains the state of your indexed views, including the last refresh time and the most recently ingested data.
+ Your OpenSearch domain and AWS Glue Data Catalog must be in the same AWS account. Your S3 bucket can be in a different account (this requires adding a condition to your IAM policy), but it must be in the same AWS Region as your domain.
+ OpenSearch Service direct queries with S3 only support Spark tables generated from Query Workbench. Tables generated within AWS Glue Data Catalog or Athena are not supported by Spark streaming, which is needed to maintain indexed views.
+ OpenSearch instance types have network payload limits of either 10 MiB or 100 MiB, depending on the specific instance type you choose.
+ Supported data formats are limited to Parquet, CSV, and JSON.
+ If the structure of your data changes over time, you will need to update your indexed views or out-of-the-box integrations to account for the data structure changes. 
+ AWS CloudFormation templates aren't supported yet.
+ OpenSearch SQL and OpenSearch PPL statements have different limitations when working with OpenSearch indexes compared to using direct query. Direct query supports advanced commands such as JOINs, subqueries, and lookups, while support for these commands on OpenSearch indexes is limited or nonexistent. For more information, see [Supported SQL and PPL commands](direct-query-supported-commands.md).

## Recommendations
<a name="direct-query-s3-recommendations"></a>

We recommend the following when using direct queries in Amazon S3:
+ Ingest data into Amazon S3 partitioned by year, month, day, and hour to speed up queries.
+ When you build skipping indexes, use Bloom filters for fields with high cardinality and min/max indexes for fields with large value ranges. For high-cardinality fields, consider using a value-based approach to improve query efficiency.
+ Use Index State Management to maintain storage for materialized views and covering indexes.
+ Use the SQL `COALESCE` function to handle missing columns and ensure that results are returned.
+ Use limits on your queries to make sure you aren't pulling too much data back.
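
As an example of the `COALESCE` recommendation, the following query (against the hypothetical VPC flow log table from earlier in this section) substitutes a default value when `log_status` is missing, so rows are still returned:

```sql
SELECT interface_id,
       COALESCE(log_status, 'NODATA') AS log_status
FROM datasourcename.gluedatabasename.vpclogstable
LIMIT 100
```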

## Quotas
<a name="direct-query-s3-quotas"></a>

Each time you initiate a query to an Amazon S3 data source, OpenSearch Service opens a *session* and keeps it alive for at least three minutes. This reduces query latency by removing session start-up time in subsequent queries.


| Description | Maximum | Can override | 
| --- | --- | --- | 
| Connections per domain | 10 | Yes | 
| Data sources per domain | 20 | Yes | 
| Indexes per domain | 5 | Yes | 
| Concurrent sessions per data source | 10 | Yes | 
| Maximum OCU per query | 60 | Yes | 
| Maximum query execution time (minutes) | 30 | Yes | 
| Maximum OCUs per acceleration | 20 | Yes | 
| Maximum ephemeral storage | 20 | Yes | 

## Supported AWS Regions
<a name="direct-query-s3-regions"></a>

The following AWS Regions are supported for direct queries in Amazon S3:
+ Asia Pacific (Hong Kong)
+ Asia Pacific (Mumbai)
+ Asia Pacific (Seoul) 
+ Asia Pacific (Singapore)
+ Asia Pacific (Sydney)
+ Asia Pacific (Tokyo)
+ Canada (Central)
+ Europe (Frankfurt)
+ Europe (Ireland)
+ Europe (Stockholm)
+ US East (N. Virginia)
+ US East (Ohio)
+ US West (Oregon)

# Directly querying Amazon CloudWatch Logs data in OpenSearch Service
<a name="direct-query-cloudwatch-logs-overview"></a>

This section walks you through creating and configuring a data source integration in Amazon OpenSearch Service so that you can efficiently query and analyze your data stored in CloudWatch Logs.

In the following pages, you'll learn how to set up a CloudWatch Logs direct-query data source, meet the necessary prerequisites, and follow step-by-step procedures using the AWS Management Console.

**Topics**
+ [Creating an Amazon CloudWatch Logs data source integration in OpenSearch Service](direct-query-cloudwatch-logs-creating.md)
+ [Configuring and querying a CloudWatch Logs data source in OpenSearch Dashboards](direct-query-cloudwatch-logs-configure.md)
+ [Pricing](#direct-query-cloudwatch-logs-pricing)
+ [Limitations](#direct-query-cloudwatch-logs-limitations)
+ [Recommendations](#direct-query-cloudwatch-logs-recommendations)
+ [Quotas](#direct-query-cloudwatch-logs-quotas)
+ [Supported AWS Regions](#direct-query-cloudwatch-logs-regions)

# Creating an Amazon CloudWatch Logs data source integration in OpenSearch Service
<a name="direct-query-cloudwatch-logs-creating"></a>

You can analyze your Amazon CloudWatch Logs data without copying or ingesting it into OpenSearch Service. This capability uses direct query, similar to analyzing data in Amazon S3 from OpenSearch Service. You can get started by creating a new connected data source from the AWS Management Console.

Creating a CloudWatch Logs data source lets you directly query operational logs that rest outside of OpenSearch Service, without having to set up Amazon OpenSearch Serverless resources yourself. By querying across OpenSearch Service and CloudWatch Logs, you can analyze logs in CloudWatch Logs and then move back to monitoring data sources in OpenSearch without switching tools.

To use this feature, you create a CloudWatch Logs direct query data source for OpenSearch Service through the AWS Management Console. 

**Topics**
+ [Prerequisites](#direct-query-cloudwatch-logs-prereq)
+ [Procedure](#direct-query-cloudwatch-logs-create)
+ [Next steps](#direct-query-cloudwatch-logs-next-steps)
+ [Additional resources](#direct-query-cloudwatch-logs-additional-resources)

## Prerequisites
<a name="direct-query-cloudwatch-logs-prereq"></a>

Before you get started, make sure that you have reviewed the following documentation:
+ [Limitations](direct-query-cloudwatch-logs-overview.md#direct-query-cloudwatch-logs-limitations)
+ [Recommendations](direct-query-cloudwatch-logs-overview.md#direct-query-cloudwatch-logs-recommendations)
+ [Quotas](direct-query-cloudwatch-logs-overview.md#direct-query-cloudwatch-logs-quotas)

Before you can create a data source, you must have the following resources in your AWS account:
+ **Enable CloudWatch Logs.** Configure CloudWatch Logs to collect logs in the same AWS account as your OpenSearch resource. For instructions, see [Getting started with CloudWatch Logs](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_GettingStarted.html) in the *Amazon CloudWatch Logs User Guide*.
+ **One or more CloudWatch log groups.** You can specify the log groups containing data that you want to query. For instructions on creating a log group, see [Create a log group in CloudWatch Logs](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html) in the *Amazon CloudWatch Logs User Guide*.
+ **(Optional) A manually created IAM role.** You can use this role to manage access to your data source. Alternatively, OpenSearch Service can automatically create a role for you with the required permissions. If you choose to use a manually created IAM role, follow the guidance in [Required permissions for manually created IAM roles](#direct-query-cloudwatch-logs-additional-resources-required-permissions).

## Procedure
<a name="direct-query-cloudwatch-logs-create"></a>

You can set up a collection-level query data source with the AWS Management Console.

### To set up a collection-level data source using the AWS Management Console
<a name="creating-direct-query-cloudwatch-logs-console-create"></a>

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. In the left navigation pane, go to **Central management** and choose **Connected data sources**. 

1. Choose **Connect**.

1. Choose **CloudWatch** as the data source type. 

1. Choose **Next**.

1. Under **Data connection details**, enter a name and an optional description. 

1. Under **IAM roles**, choose how to manage access to the log groups.

   1. If you want to automatically create a role for this data source, follow these steps:

      1. Select **Create a new role**.

      1. Enter a name for the IAM role.

      1. Select one or more log groups to define which data can be queried.

   1. If you want to use an existing role that you manage yourself, follow these steps:

      1. Select **Use an existing role.**

      1. Select an existing role from the drop-down menu.
**Note**  
When using your own role, you must ensure it has all necessary permissions by attaching required policies from the IAM console. For more information, see [Required permissions for manually created IAM roles](#direct-query-cloudwatch-logs-additional-resources-required-permissions).

1. (Optional) Under **Access policy**, configure an access policy for the data source. Access policies control whether a request to the OpenSearch Service direct query data source is accepted or rejected. If you don't configure an access policy, only the data source owner has access. You can configure the access policy to enable cross-account access, allowing principals in other AWS accounts to access the data source.

   You can create an access policy using the visual editor or by providing a JSON policy document. With the visual editor, you can allow or deny access by specifying a principal AWS account ID, account ARN, IAM user ARN, IAM role ARN, source IP address, or CIDR block. The visual editor supports up to 10 elements. To define a policy with more than 10 elements, use the JSON editor.

   You can also choose **Import policy** to import an existing access policy from another data source.

1. (Optional) Under **Tags**, add tags to your data source.

1. Choose **Next**.

1. Under **Set up OpenSearch**, choose how to set up OpenSearch.

   1. Use the default settings:

      1. Review the default resource names and data retention settings. We suggest that you replace the defaults with custom names.

        When you use the default settings, a new OpenSearch application and Essentials workspace are created for you at no additional cost. OpenSearch enables you to analyze multiple data sources. It includes workspaces, which provide a tailored experience for popular use cases. Workspaces support access control, enabling you to create private spaces for your use cases and share them only with your collaborators.

   1. Use customized settings:

      1. Choose **Customize**.

      1. Edit the collection name and the data retention settings as needed.

      1. Select the OpenSearch application and workspace that you want to use.

1. Choose **Next**.

1. Review your choices and choose **Edit** if you need to make any changes.

1. Choose **Connect** to set up the data source. Stay on this page while your data source is created. When it’s ready, you’ll be taken to the data source details page. 

## Next steps
<a name="direct-query-cloudwatch-logs-next-steps"></a>

### Visit OpenSearch Dashboards
<a name="direct-query-cloudwatch-logs-next-steps-dashboard"></a>

After you create a data source, OpenSearch Service provides you with an OpenSearch Dashboards URL. You use this to configure access control, define tables, set up log-type based dashboards for popular log types, and query your data using SQL or PPL.

For more information, see [Configuring and querying a CloudWatch Logs data source in OpenSearch Dashboards](direct-query-cloudwatch-logs-configure.md).

## Additional resources
<a name="direct-query-cloudwatch-logs-additional-resources"></a>

### Required permissions for manually created IAM roles
<a name="direct-query-cloudwatch-logs-additional-resources-required-permissions"></a>

 When creating a data source, you choose an IAM role to manage access to your data. You have two options:

1. Create a new IAM role automatically

1. Use an existing IAM role that you created manually

If you use a manually created role, you need to attach the correct permissions to the role. The permissions must allow access to the specific data source and allow OpenSearch Service to assume the role. This is required so that OpenSearch Service can securely access and interact with your data.

The following sample policy demonstrates the least-privilege permissions required to create and manage a data source. If you have broader permissions, such as `logs:*` or the `AdministratorAccess` policy, those permissions encompass the least-privilege permissions in the sample policy.

In the following sample policy, replace the *placeholder text* with your own information.

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonOpenSearchDirectQueryAllLogsAccess",
            "Effect": "Allow",
            "Action": [
                "logs:DescribeLogGroups",
                "logs:StartQuery",
                "logs:GetLogGroupFields"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceAccount": "111122223333"
                }
            },
            "Resource": [
                "arn:aws:logs:us-east-1:111122223333:log-group:*"
            ]
        }
    ]
}
```

------

The role must also have the following trust policy, which allows OpenSearch Service to assume the role and restricts that access to the specified data source ARN.

------
#### [ JSON ]


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "TrustPolicyForAmazonOpenSearchDirectQueryService",
            "Effect": "Allow",
            "Principal": {
                "Service": "directquery.opensearchservice.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:opensearch:us-east-1:111122223333:datasource/rolename"
                }
            }
        }
    ]
}
```

------

For instructions to create the role, see [Creating a role using custom trust policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-custom.html).

By default, the role has access to direct query data source indexes only. Although you can configure the role to limit or grant access to your data source, it is recommended you not adjust the access of this role. **If you delete the data source, this role will be deleted**. This will remove access for any other users if they are mapped to the role.
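If you create the role with a script or template, you can assemble the same trust policy programmatically before passing it to IAM. The following Python sketch builds the trust policy document shown above; the function name and the placeholder values (Region, account ID, data source name) are illustrative, not part of any AWS SDK:

```python
import json

def build_trust_policy(region: str, account_id: str, data_source_name: str) -> dict:
    """Assemble the trust policy that lets OpenSearch Service direct query
    assume a manually created IAM role, scoped to a single data source ARN."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "TrustPolicyForAmazonOpenSearchDirectQueryService",
                "Effect": "Allow",
                "Principal": {
                    "Service": "directquery.opensearchservice.amazonaws.com"
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:opensearch:{region}:{account_id}:datasource/{data_source_name}"
                    }
                },
            }
        ],
    }

# Placeholder values; replace with your own information.
policy = build_trust_policy("us-east-1", "111122223333", "my-datasource")
print(json.dumps(policy, indent=4))
```

The resulting JSON string can then be supplied as the `--assume-role-policy-document` when you create the role.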

### Sample access policy for a direct query data source
<a name="direct-query-cloudwatch-logs-additional-resources-access-policy"></a>

Access policies for direct query data sources follow IAM policy syntax. The policy document must be in valid JSON format. The following example policy grants a specific AWS account access to the direct query data source.

In the following sample policy, replace the *placeholder text* with your own information.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::account-id:root"
            },
            "Action": [
                "opensearch:StartDirectQuery",
                "opensearch:GetDirectQuery",
                "opensearch:CancelDirectQuery",
                "opensearch:GetDirectQueryResult"
            ],
            "Resource": "arn:aws:opensearch:region:account-id:datasource/data-source-name"
        }
    ]
}
```

If you don't configure an access policy, only the data source owner has access to the data source.

# Configuring and querying a CloudWatch Logs data source in OpenSearch Dashboards
<a name="direct-query-cloudwatch-logs-configure"></a>

Now that you've created your data source, you can get started with it in OpenSearch Dashboards. This section walks you through various use cases for your data source in OpenSearch Dashboards.

## Query log groups from the Discover page
<a name="direct-query-cloudwatch-logs-query-from-discover"></a>

On the OpenSearch Discover page, you can use the direct query data source you configured to query your CloudWatch Logs log groups. To do this, choose **Explore logs**, then use the search bar to build your query in SQL or PPL. You can filter, sort, and visualize the data returned from your log groups. To understand which statements, commands, and limitations apply to the CloudWatch Logs integration, see [Supported SQL and PPL commands](direct-query-supported-commands.md).

## Create a dashboard view for your data source
<a name="direct-query-cloudwatch-logs-setup-integration"></a>

When you use OpenSearch Service, you can quickly analyze popular AWS log types using pre-built dashboard templates. For CloudWatch Logs, there are templates for VPC, CloudTrail, AWS WAF, and Network Firewall logs. Each template creates a dashboard tailored to its specific log type, so you can quickly get up and running with analyzing these popular AWS log sources without having to build everything from scratch.

**Note**  
Dashboards use indexed views, which ingest data from CloudWatch Logs using direct query OpenSearch Compute Units (OCUs) as well as serverless collection indexing OCUs, search OCUs, and storage.

Follow these steps to create a dashboard using one of these pre-built templates, so you can start exploring and analyzing your data right away.

**To create a dashboard view**

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. From the left navigation pane, choose **Central management**, then **Connected data sources**. 

1. Select the data source to open the details page. 

1. Choose **Create dashboard**.

1. Choose which type of dashboard you want to create.

1. Enter a name for your dashboard.

1. Enter an optional description for your dashboard.

1. Select one or more log groups to view on your dashboard.

1. Choose how often you want to refresh the data in your dashboard.

1. Choose which OpenSearch workspace you want to use. 

   1. To create a new workspace, select **Create new workspace** and enter a name.

   1. To use an existing workspace, select **Select existing workspace**.

1. Choose **Create dashboard**.

## Querying CloudWatch Logs data in OpenSearch Discover
<a name="direct-querying-cloudwatch-logs-query"></a>

To query your data, select your data source from the drop-down menu. For CloudWatch Logs, navigate to Discover from your Essentials workspace and start querying your data using OpenSearch SQL or Piped Processing Language (PPL). For a list of supported commands, see [Supported SQL and PPL commands](direct-query-supported-commands.md).

**Note**  
If you’ve created a materialized view, you can use DQL to query the data that you’ve indexed in it.
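For example, assuming a hypothetical log group named `LogGroup-A`, a query that returns recent log events might look like the following in SQL:

```
SELECT `@timestamp`, `@message` FROM `LogGroup-A`
LIMIT 10;
```

And the equivalent in PPL:

```
source = `LogGroup-A` | fields `@timestamp`, `@message` | head 10
```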

### Troubleshooting
<a name="cloudwatch-logs-troubleshooting"></a>

There might be instances when results don’t return as expected. If you experience any issues, make sure that you're following the [Recommendations](direct-query-cloudwatch-logs-overview.md#direct-query-cloudwatch-logs-recommendations).

## Pricing
<a name="direct-query-cloudwatch-logs-pricing"></a>

Amazon OpenSearch Service offers OpenSearch Compute Unit (OCU) pricing for CloudWatch Logs direct queries. As you run direct queries, you incur charges for OCUs per hour, listed as DirectQuery OCU usage type on your bill. You will also incur separate charges from Amazon CloudWatch Logs.

Direct queries come in two types: interactive queries and indexed view queries.
+ *Interactive queries* are used to populate the data selector and perform analytics on your data in CloudWatch Logs. OpenSearch Service handles each query with a separate pre-warmed job, without maintaining an extended session.
+ *Indexed view queries* use compute to maintain indexed views in OpenSearch Service. These queries usually take longer because they ingest a varying amount of data into a named index. For CloudWatch Logs connected data sources, the indexed data is stored in an OpenSearch Serverless collection where you are charged for data indexed (IndexingOCU), data searched (SearchOCU), and data stored in GB.

For more information, see the Direct Query and Serverless sections within [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Limitations
<a name="direct-query-cloudwatch-logs-limitations"></a>

The following limitations apply to direct queries in CloudWatch Logs:
+ The direct query integration with CloudWatch Logs is only available on OpenSearch Service collections and the OpenSearch user interface.
+ OpenSearch Serverless collections have a network payload limit of 100 MiB.
+ CloudWatch Logs supports VPC Flow Logs, CloudTrail, and AWS WAF dashboard integrations installed from the console.
+ AWS CloudFormation templates aren't supported yet.
+ OpenSearch SQL and OpenSearch PPL statements have different limitations when working with OpenSearch indexes compared to using direct query. Direct query supports advanced commands such as JOINs, subqueries, and lookups, while support for these commands on OpenSearch indexes is limited or nonexistent. For more information, see [Supported SQL and PPL commands](direct-query-supported-commands.md).

## Recommendations
<a name="direct-query-cloudwatch-logs-recommendations"></a>

We recommend the following when using direct queries in CloudWatch Logs:
+ When searching multiple log groups in one query, use the appropriate syntax. For more information, see [Multi-log group functions](supported-directquery-sql.md#multi-log-queries).
+ When using SQL or PPL commands, enclose certain fields in backticks to query them successfully. Backticks are needed for fields that contain special characters (anything non-alphabetic and non-numeric). For example, enclose `@message`, `Operation.Export`, and `Test::Field` in backticks. You don't need to enclose purely alphabetic column names in backticks.

  Example query with simple fields:

  ```
  SELECT SessionToken, Operation, StartTime FROM `LogGroup-A`
  LIMIT 1000;
  ```

  A similar query with fields that require backticks:

  ```
  SELECT `@SessionToken`, `@Operation`, `@StartTime` FROM `LogGroup-A`
  LIMIT 1000;
  ```
+ Use limits on your queries to make sure you aren't pulling too much data back.
+ Queries containing field names that are identical but differ only in case (such as `field1` and `FIELD1`) are not supported.

  For example, the following queries are not supported:

  ```
  Select AWSAccountId, AwsAccountId from LogGroup
  Select a.@LogStream, b.@logStream from Table A INNER JOIN Table B on a.id = b.id
  ```

  However, the following query is supported because the field name (@logStream) is identical in both log groups:

  ```
  Select a.@logStream, b.@logStream from Table A INNER JOIN Table B on a.id = b.id
  ```
+ Functions and expressions must operate on field names and be part of a `SELECT` statement with a log group specified in the `FROM` clause.

  For example, this query is not supported:

  ```
  SELECT cos(10) FROM LogGroup
  ```

  This query is supported:

  ```
  SELECT cos(field1) FROM LogGroup
  ```

## Quotas
<a name="direct-query-cloudwatch-logs-quotas"></a>

**Note**  
If you're looking to perform direct queries using CloudWatch Logs Insights, make sure that you refer to [Additional information for CloudWatch Logs Insights users using OpenSearch SQL](supported-directquery-sql.md#supported-sql-for-multi-log-queries).


| Description | Value | Soft limit? | Notes | 
| --- | --- | --- | --- | 
| Account-level TPS limit across direct query APIs | 3 TPS | Yes |  | 
| Maximum number of data sources | 20 | Yes | Limit is per AWS account. | 
| Maximum auto-refreshing indexes or materialized views | 30 | Yes | Limit is per data source. | 
| Maximum concurrent queries | 30 | Yes |  Limit is per data source and applies to queries in pending or running state.  Includes interactive queries (for example, data retrieval commands like `SELECT`) and index queries (for example, operations like `CREATE`/`ALTER`).   | 
| Maximum concurrent OCU per query | 512 | Yes |  OpenSearch Compute Units (OCU). Limit based on 15 executors and 1 driver, each with 16 vCPU and 32 GB memory. Represents concurrent processing power.  | 
| Maximum query execution time in minutes | 60 | No | Limit applies to OpenSearch PPL/SQL queries in CloudWatch Logs Insights. | 
| Period for purging stale query IDs | 90 days | Yes | This is the time period after which OpenSearch Service purges query metadata for older entries. For example, calling `GetDirectQuery` or `GetDirectQueryResult` fails for queries older than 90 days. | 

## Supported AWS Regions
<a name="direct-query-cloudwatch-logs-regions"></a>

The following AWS Regions are supported for direct queries in CloudWatch Logs:
+ Asia Pacific (Hong Kong)
+ Asia Pacific (Mumbai)
+ Asia Pacific (Osaka)
+ Asia Pacific (Seoul)
+ Asia Pacific (Singapore)
+ Asia Pacific (Sydney)
+ Asia Pacific (Tokyo)
+ Canada (Central)
+ Europe (Frankfurt)
+ Europe (Ireland)
+ Europe (London)
+ Europe (Milan)
+ Europe (Paris)
+ Europe (Spain)
+ Europe (Stockholm)
+ South America (São Paulo)
+ US East (N. Virginia)
+ US East (Ohio)
+ US West (N. California)
+ US West (Oregon)

# Directly querying Amazon Security Lake data in OpenSearch Service
<a name="direct-query-security-lake-overview"></a>

This section will walk you through the process of creating and configuring a data source integration in Amazon OpenSearch Service, enabling you to efficiently query and analyze your data stored in Security Lake.

In the following pages, you'll learn how to set up a Security Lake direct-query data source, navigate the necessary prerequisites, and follow step-by-step procedures using the AWS Management Console. 

**Topics**
+ [Creating an Amazon Security Lake data source integration in OpenSearch Service](direct-query-security-lake-creating.md)
+ [Configuring and querying a Security Lake data source in OpenSearch Dashboards](direct-query-security-lake-configure.md)
+ [Pricing](#direct-query-security-lake-pricing)
+ [Limitations](#direct-query-security-lake-limitations)
+ [Recommendations](#direct-query-security-lake-recommendations)
+ [Quotas](#direct-query-security-lake-quotas)
+ [Supported AWS Regions](#direct-query-security-lake-regions)

# Creating an Amazon Security Lake data source integration in OpenSearch Service
<a name="direct-query-security-lake-creating"></a>

You can use Amazon OpenSearch Serverless to directly query security data in Amazon Security Lake. To do this, you create a data source that enables you to use OpenSearch zero-ETL capabilities on Security Lake data. After you create a data source, you can directly search, gain insights from, and analyze data stored in Security Lake. You can accelerate query performance and use advanced OpenSearch analytics on select Security Lake data sets using on-demand indexing.

**Topics**
+ [Prerequisites](#direct-query-s3security-lake-prereq)
+ [Procedure](#direct-query-security-lake-create)
+ [Next steps](#direct-query-security-lake-next-steps)
+ [Additional resources](#direct-query-security-lake-additional-resources)

## Prerequisites
<a name="direct-query-s3security-lake-prereq"></a>

Before you get started, make sure that you have reviewed the following documentation:
+ [Limitations](direct-query-security-lake-overview.md#direct-query-security-lake-limitations)
+ [Recommendations](direct-query-security-lake-overview.md#direct-query-security-lake-recommendations)
+ [Quotas](direct-query-security-lake-overview.md#direct-query-security-lake-quotas)

Before you can create a data source, take the following actions in Security Lake:
+ **Enable Security Lake**. Configure Security Lake to collect logs on the same AWS Region as your OpenSearch resource. For instructions, see [Getting started with Amazon Security Lake](https://docs.aws.amazon.com/security-lake/latest/userguide/getting-started.html) in the Amazon Security Lake user guide.
+ **Set up Security Lake permissions**. Make sure that you have accepted the service-linked role permissions for resource management and that the console shows no issues on the **Issues** page. For more information, see [Service-linked role for Security Lake](https://docs.aws.amazon.com/security-lake/latest/userguide/using-service-linked-roles.html) in the Amazon Security Lake User Guide.
+ **Share Security Lake data sources**. When accessing OpenSearch within the same account as Security Lake, ensure that there is no message to register your Security Lake buckets with Lake Formation in the Security Lake console. For cross-account OpenSearch access, set up a Lake Formation query subscriber in the Security Lake console. Use the account associated with your OpenSearch resource as the subscriber. For more information, see [Subscriber management in Security Lake](https://docs.aws.amazon.com/security-lake/latest/userguide/create-query-subscriber-procedures.html) in the Amazon Security Lake user guide.

In addition, you must have the following resources in your AWS account:
+ **(Optional) A manually created IAM role.** You can use this role to manage access to your data source. Alternatively, you can have OpenSearch Service create a role for you automatically with the required permissions. If you choose to use a manually created IAM role, follow the guidance in [Required permissions for manually created IAM roles](#direct-query-security-lake-additional-resources-required-permissions).

## Procedure
<a name="direct-query-security-lake-create"></a>

You can set up a data source to connect with a Security Lake database from within the AWS Management Console.

### To set up a data source using the AWS Management Console
<a name="creating-direct-query-security-lake-console-create"></a>

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. In the left navigation pane, go to **Central management** and choose **Connected data sources**. 

1. Choose **Connect**.

1. Choose **Security Lake** as the data source type. 

1. Choose **Next**.

1. Under **Data connection details**, enter a name and an optional description. 

1. Under **IAM permission access settings**, choose how to manage access to your data source.

   1. If you want to automatically create a role for this data source, follow these steps:

      1. Select **Create a new role**.

      1. Enter a name for the IAM role.

      1. Select one or more AWS Glue tables to define which data can be queried.

   1. If you want to use an existing role that you manage yourself, follow these steps:

      1. Select **Use an existing role.**

      1. Select an existing role from the drop-down menu.
**Note**  
When using your own role, you must ensure it has all necessary permissions by attaching required policies from the IAM console. For more information, see [Required permissions for manually created IAM roles](#direct-query-security-lake-additional-resources-required-permissions).

1. (Optional) Under **Tags**, add tags to your data source.

1. Choose **Next**.

1. Under **Set up OpenSearch**, choose how to set up OpenSearch.

   1. Use the default settings:

      1. Review the default resource names and data retention settings.

        When you use the default settings, a new OpenSearch application and Essentials workspace are created for you at no additional cost. OpenSearch enables you to analyze multiple data sources. It includes workspaces, which provide a tailored experience for popular use cases. Workspaces support access control, enabling you to create private spaces for your use cases and share them only with your collaborators.

   1. Use customized settings:

      1. Choose **Customize**.

      1. Edit the collection name and the data retention settings as needed.

      1. Select the OpenSearch application and workspace that you want to use.

1. Choose **Next**.

1. Review your choices and choose **Edit** if you need to make any changes.

1. Choose **Connect** to set up the data source. Stay on this page while your data source is created. When it’s ready, you’ll be taken to the data source details page. 

## Next steps
<a name="direct-query-security-lake-next-steps"></a>

### Visit OpenSearch Dashboards and create a dashboard
<a name="direct-query-security-lake-next-steps-dashboard"></a>

After you create a data source, OpenSearch Service provides you with an OpenSearch Dashboards URL. You use this to query your data using SQL or PPL. The Security Lake integration comes with pre-packaged query templates for SQL and PPL to get you started analyzing your logs.

For more information, see [Configuring and querying a Security Lake data source in OpenSearch Dashboards](direct-query-security-lake-configure.md).

## Additional resources
<a name="direct-query-security-lake-additional-resources"></a>

### Required permissions for manually created IAM roles
<a name="direct-query-security-lake-additional-resources-required-permissions"></a>

When creating a data source, you choose an IAM role to manage access to your data. You have two options:

1. Create a new IAM role automatically

1. Use an existing IAM role that you created manually

If you use a manually created role, you need to attach the correct permissions to the role. The permissions must allow access to the specific data source and allow OpenSearch Service to assume the role so that it can securely access and interact with your data. Additionally, grant Lake Formation permissions to the role for any databases and tables that you want to query: grant `DESCRIBE` on the Security Lake databases that you want to query from the direct query connection, and grant at least `SELECT` and `DESCRIBE` on the tables within those databases.
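If you manage these Lake Formation grants with the AWS SDK for Python (boto3), the `grant_permissions` call takes request parameters shaped like the following sketch. The role ARN and the database and table names here are placeholder examples, and the code only builds the parameter dictionary rather than calling AWS:

```python
def lake_formation_grant_params(role_arn: str, database: str, table: str) -> dict:
    """Build the request parameters for lakeformation:GrantPermissions,
    granting the direct query role SELECT and DESCRIBE on one table."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "Table": {"DatabaseName": database, "Name": table}
        },
        "Permissions": ["SELECT", "DESCRIBE"],
    }

# Placeholder role ARN and example Security Lake database/table names.
params = lake_formation_grant_params(
    "arn:aws:iam::111122223333:role/my-direct-query-role",
    "example_security_lake_database",
    "example_security_lake_table",
)
print(params)
```

The resulting dictionary could then be passed as keyword arguments to `boto3.client("lakeformation").grant_permissions(**params)`, once per table that the data source role needs to query.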

The following sample policy demonstrates the least-privilege permissions required to create and manage a data source. If you have broader permissions, such as the `AdministratorAccess` policy, those permissions encompass the least-privilege permissions in the sample policy.

In the following sample policy, replace the *placeholder text* with your own information.

------
#### [ JSON ]


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "AmazonOpenSearchDirectQueryServerlessAccess",
            "Effect": "Allow",
            "Action": [
                "aoss:APIAccessAll",
                "aoss:DashboardsAccessAll"
            ],
            "Resource": "arn:aws:aoss:us-east-1:111122223333:collection/collectionname/*"
        },
        {
            "Sid": "AmazonOpenSearchDirectQueryGlueAccess",
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:GetPartition",
                "glue:GetPartitions",
                "glue:GetTable",
                "glue:GetTableVersions",
                "glue:GetTables",
                "glue:SearchTables",
                "glue:BatchGetPartition"
            ],
            "Resource": [
                "arn:aws:glue:us-east-1:111122223333:table/databasename/*",
                "arn:aws:glue:us-east-1:111122223333:database/databasename",
                "arn:aws:glue:us-east-1:111122223333:catalog",
                "arn:aws:glue:us-east-1:111122223333:database/default"
            ]
        },
        {
            "Sid": "AmazonOpenSearchDirectQueryLakeFormationAccess",
            "Effect": "Allow",
            "Action": [
                "lakeformation:GetDataAccess"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
```

------

The role must also have the following trust policy, which allows OpenSearch Service to assume the role.

------
#### [ JSON ]


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "directquery.opensearchservice.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```

------

For instructions to create the role, see [Creating a role using custom trust policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-custom.html).

By default, the role has access to direct query data source indexes only. Although you can configure the role to limit or grant access to your data source, it is recommended you not adjust the access of this role. **If you delete the data source, this role will be deleted**. This will remove access for any other users if they are mapped to the role.

### Querying Security Lake data that's encrypted with a customer managed key
<a name="querying-data-in-cmk-lake"></a>

If the Security Lake bucket associated with the data connection is encrypted using server-side encryption with a customer managed AWS KMS key, you must add the Lake Formation service-linked role to the key policy. This allows the service to access and read the data for your queries.

In the following sample policy, replace the *placeholder text* with your own information.

```
{
    "Sid": "Allow LakeFormation to access the key",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::account:role/aws-service-role/lakeformation.amazonaws.com/AWSServiceRoleForLakeFormationDataAccess"
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
    ],
    "Resource": "*"
}
```
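Because a KMS key policy is a single JSON document, you add this statement to the existing `Statement` array rather than replacing the whole policy. The following Python sketch shows one way to perform that merge; the function name and the sample existing policy are illustrative, and the account ID is a placeholder:

```python
import json

LAKE_FORMATION_STATEMENT = {
    "Sid": "Allow LakeFormation to access the key",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/aws-service-role/lakeformation.amazonaws.com/AWSServiceRoleForLakeFormationDataAccess"
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey",
    ],
    "Resource": "*",
}

def add_lake_formation_statement(key_policy_json: str) -> str:
    """Append the Lake Formation statement to an existing key policy,
    skipping the append if a statement with the same Sid is already present."""
    policy = json.loads(key_policy_json)
    existing_sids = {s.get("Sid") for s in policy["Statement"]}
    if LAKE_FORMATION_STATEMENT["Sid"] not in existing_sids:
        policy["Statement"].append(LAKE_FORMATION_STATEMENT)
    return json.dumps(policy, indent=4)

# A minimal sample key policy with only the default root statement.
existing = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "EnableRoot",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
        "Action": "kms:*",
        "Resource": "*",
    }],
})
print(add_lake_formation_statement(existing))
```

Checking the `Sid` first makes the merge safe to run more than once against the same key policy.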

# Configuring and querying a Security Lake data source in OpenSearch Dashboards
<a name="direct-query-security-lake-configure"></a>

Now that you've created your data source, you can set it up in OpenSearch Dashboards. 

This section walks you through various use cases with your data source in OpenSearch Dashboards before you query your data. To get started, you need to navigate to your data source in OpenSearch Dashboards. In the left-hand menu, under **Management**, choose **Data sources**. Then, select the name of the data source that you created earlier in the OpenSearch Service console.

## Query Security Lake tables from Discover
<a name="direct-query-security-lake-query-from-discover"></a>

If you have created tables based on your Security Lake logs, you can now query those tables directly from OpenSearch Discover. This enables you to seamlessly access and analyze data stored in Security Lake, directly from the familiar Discover interface. By querying Security Lake directly from Discover, you can avoid the need to manually extract, transform, and load the data into a separate search index. To quickly get started analyzing your logs, Discover includes a set of PPL and SQL saved queries.

Start by selecting the data source that you configured. Select the associated database and table you want to query, then use the search bar to write queries against your tables. To understand what statements, commands, and limitations are supported for the Security Lake integration, see [Supported SQL and PPL commands](direct-query-supported-commands.md). 

To take advantage of the pre-built queries that are available for Security Lake, go to **...** at the top right-hand side of Discover, choose **Open Query**, and then choose **Templates**. Many pre-built queries are available for log sources supported in Security Lake. Search for the templates that match your use case, copy a query into the search bar, and replace templated fields (such as Region and action) with your own information.

## Accelerate data from Discover
<a name="accelerate-security-lake-data-from-discover"></a>

To enhance performance and enable faster subsequent queries and analysis in OpenSearch, you can ingest the results of your query from Discover into an OpenSearch indexed view. 

**To create an indexed view**

1. From Discover, choose **Create Indexed View**. 

1. In the query editor, enter your desired query. You can create a new query here or use an existing one from your previous searches.

1. Specify a name for your new indexed view. Choose a descriptive name that will help you identify the view later.

1. Configure the data retention settings for your indexed view. You can specify how long the data should be kept in the index, allowing you to balance performance with storage costs.

1. Create the indexed view. After it's created, your indexed view will be available for faster querying and analysis.

If you've previously created indexed views, you can access them from Discover.

**To use an existing index view**

1. From Discover, choose **Select Indexed View** to see a list of your existing indexed views for Security Lake.

1. Choose the indexed view you want to use. This applies the view to your current query, which can significantly speed up data retrieval and analysis.

## Create a dashboard view for your data source
<a name="direct-query-security-lake-create-dashboard"></a>

When you use OpenSearch Service, you can analyze popular AWS log types using pre-built dashboard templates. For Security Lake, templates are available for VPC, CloudTrail, and WAF logs. Each template includes pre-built queries and dashboards for its specific log type, so you can quickly start analyzing these popular AWS log sources without building everything from scratch.

**Note**  
Dashboards use indexed views, which ingest data from Security Lake and contribute to direct query and collection compute.

Follow these steps to create a dashboard using one of these pre-built templates, so you can start exploring and analyzing your data right away.

**To create a dashboard view**

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. From the left navigation pane, choose **Central management**, then **Connected data sources**. 

1. Select the data source to open the details page. 

1. Choose **Create dashboard**.

1. Choose which type of dashboard you want to create.

1. Enter a name for your dashboard.

1. Enter an optional description for your dashboard.

1. Select one or more AWS Glue tables to view on your dashboard.

1. Choose how often you want to refresh the data in your dashboard.

1. Choose which OpenSearch workspace you want to use. 

   1. To create a new workspace, select **Create new workspace**.

   1. To use an existing workspace, select **Select existing workspace**.

1. Enter a name for your workspace.

1. Choose **Create dashboard**.

## Troubleshooting
<a name="security-lake-troubleshooting"></a>

Results might not always return as expected. If you experience any issues, make sure that you're following the [Recommendations](direct-query-security-lake-overview.md#direct-query-security-lake-recommendations).

## Pricing
<a name="direct-query-security-lake-pricing"></a>

Amazon OpenSearch Service offers OpenSearch Compute Unit (OCU) pricing for Security Lake direct queries. As you run direct queries, you incur charges for OCUs per hour, listed as DirectQuery OCU usage type on your bill. You will also incur separate charges from Amazon Security Lake.

Direct queries come in two types: interactive queries and indexed view queries.
+ *Interactive queries* are used to populate the data selector and perform analytics on your data in Security Lake. OpenSearch Service handles each query with a separate pre-warmed job, without maintaining an extended session.
+ *Indexed view queries* use compute to maintain indexed views in OpenSearch Service. These queries usually take longer because they ingest a varying amount of data into a named index. For Security Lake connected data sources, the indexed data is stored in an OpenSearch Serverless collection where you are charged for data indexed (IndexingOCU), data searched (SearchOCU), and data stored in GB.

For more information, see the Direct Query and Serverless sections within [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Limitations
<a name="direct-query-security-lake-limitations"></a>

The following limitations apply to direct queries in Security Lake:
+ The direct query integration with Security Lake is only available on OpenSearch Serverless collections and the OpenSearch user interface.
+ OpenSearch Serverless collections have a network payload limit of 100 MiB.
+ Table management for Security Lake is performed in Lake Formation.
+ Security Lake only supports materialized views as indexed views. Covering indexes are not supported.
+ AWS CloudFormation templates aren't supported yet.
+ OpenSearch SQL and OpenSearch PPL statements have different limitations when working with OpenSearch indexes compared to using direct query. Direct query supports advanced commands such as JOINs, subqueries, and lookups, while support for these commands on OpenSearch indexes is limited or nonexistent. For more information, see [Supported SQL and PPL commands](direct-query-supported-commands.md).

## Recommendations
<a name="direct-query-security-lake-recommendations"></a>

We recommend the following when using direct queries in Security Lake:
+ Check your Security Lake status and ensure that it's running smoothly without any problems. For detailed troubleshooting steps, see [Troubleshooting data lake status](https://docs.aws.amazon.com/security-lake/latest/userguide/securitylake-data-lake-troubleshoot.html) in the Amazon Security Lake User Guide.
+ Verify your query access:
  + If you're querying Security Lake from a different account than the Security Lake delegated administrator account, [set up a subscriber with query access in Security Lake](https://docs.aws.amazon.com/security-lake/latest/userguide/subscriber-query-access.html). 
  + If you're querying Security Lake from the same account, check for any messages in Security Lake about registering your managed S3 buckets with Lake Formation.
+ Explore the query templates and pre-built dashboards to jumpstart your analysis.
+ Familiarize yourself with Open Cybersecurity Schema Framework (OCSF) and Security Lake:
  + Review schema mapping examples for AWS sources in the [OCSF GitHub repository](https://github.com/ocsf/examples/tree/main/mappings/markdown/AWS/v1.1.0/CloudTrail)
  + Learn how to query Security Lake effectively by visiting [Security Lake queries for AWS source version 2 (OCSF 1.1.0)](https://docs.aws.amazon.com/security-lake/latest/userguide/subscriber-query-examples2.html)
  + Improve query performance by using partitions: `accountid`, `region`, and `time_dt`
+ Get comfortable with SQL syntax, which Security Lake supports for querying. For more information, see [Supported OpenSearch SQL commands and functions](supported-directquery-sql.md).
+ Use limits on your queries to make sure you aren't pulling too much data back.
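
The partition and limit recommendations above can be sketched as a query builder (the table name and selected columns here are hypothetical; substitute your own AWS Glue table and OCSF fields):

```python
# Build a Security Lake SQL query that filters on the partition columns
# (accountid, region, time_dt) and caps the number of rows returned.
def build_partitioned_query(table, account_id, region, start, end, limit=100):
    return (
        f"SELECT time_dt, api.operation, actor.user.uid "
        f"FROM {table} "
        f"WHERE accountid = '{account_id}' "
        f"AND region = '{region}' "
        f"AND time_dt BETWEEN TIMESTAMP '{start}' AND TIMESTAMP '{end}' "
        f"LIMIT {limit}"
    )

query = build_partitioned_query(
    "amazon_security_lake_glue_db.cloudtrail_table",  # hypothetical table
    "111122223333",
    "us-east-1",
    "2024-06-01 00:00:00",
    "2024-06-02 00:00:00",
)
print(query)
```

Filtering on all three partition columns lets the query engine prune data files instead of scanning the whole table.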

## Quotas
<a name="direct-query-security-lake-quotas"></a>


| Description | Value | Soft limit? | Notes | 
| --- | --- | --- | --- | 
| Account-level TPS limit across direct query APIs | 3 TPS | Yes |  | 
| Maximum number of data sources | 20 | Yes | Limit is per AWS account. | 
| Maximum auto-refreshing indexes or materialized views | 30 | Yes |  Limit applies per data source.  Only includes indices and materialized views (MVs) with auto-refresh set to true.  | 
| Maximum concurrent queries | 30 | Yes |  Limit applies to queries in pending or running state.  Includes interactive queries (for example, data retrieval commands like `SELECT`) and index queries (for example, operations like `CREATE`/`ALTER`/`DROP`).   | 
| Maximum concurrent OCU per query | 512 | Yes |  OpenSearch Compute Units (OCU). Limit based on 15 executors and 1 driver, each with 16 vCPU and 32 GB memory. Represents concurrent processing power.  | 
| Maximum query execution time in minutes | 30 | No | Applies only to interactive queries (for example, data retrieval commands like `SELECT`). For `REFRESH` queries, the limit is 6 hours. | 
| Period for purging stale query IDs | 90 days | Yes |  This is the time period after which OpenSearch Service purges query metadata for older entries. For example, calling GetDirectQuery or GetDirectQueryResult fails for queries older than 90 days.  | 
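
Because the account-level limit across direct query APIs is low (3 TPS by default), a client that polls GetDirectQuery or GetDirectQueryResult should pace its calls. A minimal client-side pacing sketch, assuming you perform the sleep yourself:

```python
import time

class RatePacer:
    """Spaces API calls at least 1/max_tps seconds apart."""

    def __init__(self, max_tps, clock=time.monotonic):
        self.min_interval = 1.0 / max_tps
        self.clock = clock
        self.last_call = None

    def wait_time(self):
        """Return how many seconds to sleep before the next call."""
        now = self.clock()
        if self.last_call is None:
            self.last_call = now
            return 0.0
        delay = max(0.0, self.min_interval - (now - self.last_call))
        self.last_call = now + delay
        return delay

pacer = RatePacer(max_tps=3)
# Before each direct query API call: time.sleep(pacer.wait_time())
```

For sustained bursty workloads, request a soft-limit increase rather than relying on pacing alone.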

## Supported AWS Regions
<a name="direct-query-security-lake-regions"></a>

The following AWS Regions are supported for direct queries in Security Lake:
+ Asia Pacific (Mumbai)
+ Asia Pacific (Singapore)
+ Asia Pacific (Sydney)
+ Asia Pacific (Tokyo)
+ Canada (Central)
+ Europe (Frankfurt)
+ Europe (Ireland)
+ Europe (London)
+ Europe (Paris)
+ Europe (Stockholm)
+ South America (São Paulo)
+ US East (N. Virginia)
+ US East (Ohio)
+ US West (Oregon)

# Direct queries in Amazon Managed Service for Prometheus
<a name="direct-query-prometheus-overview"></a>

You can use Amazon OpenSearch Service to directly query operational metrics stored in Amazon Managed Service for Prometheus. This integration allows you to analyze and visualize your Prometheus time-series data alongside your logs and traces within OpenSearch UI, enabling a unified observability experience.

Unlike storage-based direct queries (such as Amazon S3 or CloudWatch Logs), the Prometheus integration uses a live-call architecture. OpenSearch Service acts as a direct client, translating your queries and making live API calls to your Prometheus workspace. Because OpenSearch Service does not provision temporary compute to scan data, you do not incur OpenSearch Compute Unit (OCU) charges for these queries.

To set up and use this integration, you must first create the data source and then configure your workspaces to query the data.

**Topics**
+ [Creating an Amazon Managed Service for Prometheus data source](direct-query-prometheus-creating.md)
+ [Querying Prometheus metrics](direct-query-prometheus-configure.md)
+ [Pricing](#direct-query-prometheus-pricing)
+ [Limitations](#direct-query-prometheus-limitations)
+ [Recommendations](#direct-query-prometheus-recommendations)
+ [Quotas](#direct-query-prometheus-quotas)
+ [Supported AWS Regions](#direct-query-prometheus-regions)

# Creating an Amazon Managed Service for Prometheus data source
<a name="direct-query-prometheus-creating"></a>

To create an Amazon Managed Service for Prometheus data source, you need an active workspace and an IAM role that grants OpenSearch Service the necessary permissions to query your metrics.

## Prerequisites
<a name="direct-query-prometheus-prereq"></a>

Before you connect the data source, make sure you have the following:
+ **Prometheus workspace** – An active Amazon Managed Service for Prometheus workspace. Note your Workspace ID and the AWS Region it resides in.
+ **IAM role** – An AWS Identity and Access Management role with a trust policy that allows the `directquery.opensearchservice.amazonaws.com` service principal to assume it.

## Connecting the data source
<a name="direct-query-prometheus-connect"></a>

After your prerequisites are met, you can connect the data source using the OpenSearch Service console.

**To set up an Amazon Managed Service for Prometheus data source**

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. In the left navigation pane, go to **Central management** and choose **Connected data sources**.

1. Choose **Connect new data source**.

1. Choose **Amazon Managed Service for Prometheus** as the data source type.

1. Choose **Next**.

1. Under **Data connection details**, enter a name and an optional description.

1. Under **IAM roles**, choose how to manage access:
   + To automatically create a role for this data source:

     1. Select **Create a new role**.

     1. Enter a name for the IAM role.

     1. Select one or more workspaces to define which data can be queried.
   + To use an existing role that you manage yourself:

     1. Select **Use an existing role**.

     1. Select an existing role from the drop-down menu.
**Note**  
When using your own role, make sure it has all necessary permissions by attaching required policies from the IAM console. For more information, see [Required permissions for manually created IAM roles](#direct-query-prometheus-manual-role-permissions).

1. (Optional) Under **Access policy**, configure an access policy for the data source. Access policies control whether a request to the OpenSearch Service direct query data source is accepted or rejected. If you don't configure an access policy, only the data source owner has access. You can configure the access policy to enable cross-account access, allowing principals in other AWS accounts to access the data source.

   You can create an access policy using the visual editor or by providing a JSON policy document. With the visual editor, you can allow or deny access by specifying a principal AWS account ID, account ARN, IAM user ARN, IAM role ARN, source IP address, or CIDR block. The visual editor supports up to 10 elements. To define a policy with more than 10 elements, use the JSON editor.

   You can also choose **Import policy** to import an existing access policy from another data source.

1. (Optional) Under **Tags**, add tags to your data source.

1. Choose **Next**.

1. Under **Set up OpenSearch**, choose how to set up OpenSearch UI:

   1. If no OpenSearch UI application exists in your account, create a new application. If an application already exists, select it.

   1. If you created a new application, create a new observability workspace. If you selected an existing application, create a new observability workspace or select an existing one. Amazon Managed Service for Prometheus is available only in the observability workspace.

1. Choose **Next**.

1. Review your choices and choose **Edit** if you need to make any changes.

1. Choose **Connect** to set up the data source. Stay on this page while your data source is created. When it's ready, you're taken to the data source details page.

### Next steps
<a name="direct-query-prometheus-next-steps"></a>

**Visit OpenSearch UI**  
After you create a data source, OpenSearch Service provides you with an OpenSearch UI application URL. You use this to configure who has access to OpenSearch UI and analyze your Amazon Managed Service for Prometheus data using Discover Metrics with PromQL.

### Additional resources
<a name="direct-query-prometheus-additional-resources"></a>

#### Required permissions for manually created IAM roles
<a name="direct-query-prometheus-manual-role-permissions"></a>

When creating a data source, you choose an IAM role to manage access to your data. You have two options:
+ Create a new IAM role automatically
+ Use an existing IAM role that you created manually

If you use a manually created role, you need to attach the correct permissions to the role. The permissions must allow access to the specific data source and allow OpenSearch Service to assume the role. This is required so that OpenSearch Service can securely access and interact with your data.

The following sample policy demonstrates the least-privilege permissions required to create and manage a data source. If you have broader permissions, such as `aps:*` or the `AdministratorAccess` policy, these permissions encompass the least-privilege permissions in the sample policy.

In the following sample policy, replace the *placeholder* text with your own information.

**Sample IAM policy**  
Attach the following permissions to your IAM role to allow OpenSearch Service to fetch metric metadata and execute queries:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonOpenSearchDirectQueryPrometheusAccess",
            "Effect": "Allow",
            "Action": [
                "aps:DeleteAlertManagerSilence",
                "aps:GetAlertManagerSilence",
                "aps:GetAlertManagerStatus",
                "aps:GetLabels",
                "aps:GetMetricMetadata",
                "aps:GetSeries",
                "aps:ListAlertManagerAlertGroups",
                "aps:ListAlertManagerAlerts",
                "aps:ListAlertManagerReceivers",
                "aps:ListAlertManagerSilences",
                "aps:ListAlerts",
                "aps:QueryMetrics",
                "aps:PutAlertManagerSilences",
                "aps:DescribeAlertManagerDefinition",
                "aps:CreateRuleGroupsNamespace",
                "aps:DeleteRuleGroupsNamespace",
                "aps:ListRuleGroupsNamespaces",
                "aps:DescribeRuleGroupsNamespace",
                "aps:PutRuleGroupsNamespace"
            ],
            "Resource": "arn:aws:aps:region:account-id:workspace/workspace-id",
            "Condition": {
                "ForAnyValue:StringEquals": {
                    "aws:CalledVia": [
                        "directquery.opensearchservice.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Sid": "AmazonOpenSearchDirectQueryPrometheusListAccess",
            "Effect": "Allow",
            "Action": [
                "aps:ListWorkspaces"
            ],
            "Resource": "*",
            "Condition": {
                "ForAnyValue:StringEquals": {
                    "aws:CalledVia": [
                        "directquery.opensearchservice.amazonaws.com"
                    ]
                }
            }
        }
    ]
}
```

**Sample trust policy**  
Attach the following trust policy to your IAM role:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "TrustPolicyForAmazonOpenSearchDirectQueryService",
            "Effect": "Allow",
            "Principal": {
                "Service": "directquery.opensearchservice.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "ArnEquals": {
                    "aws:SourceArn": "arn:aws:opensearch:region:account-id:datasource/data-source-name"
                },
                "StringEquals": {
                    "aws:SourceAccount": "account-id"
                }
            }
        }
    ]
}
```
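
If you script role creation, you can build the same trust policy programmatically and pass it to the IAM CreateRole operation. A sketch (the role and data source names are placeholders; the policy follows the sample above):

```python
import json

def build_trust_policy(region, account_id, data_source_name):
    # Mirrors the sample trust policy for the direct query service principal.
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "TrustPolicyForAmazonOpenSearchDirectQueryService",
                "Effect": "Allow",
                "Principal": {
                    "Service": "directquery.opensearchservice.amazonaws.com"
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "ArnEquals": {
                        "aws:SourceArn": f"arn:aws:opensearch:{region}:{account_id}:datasource/{data_source_name}"
                    },
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
            }
        ],
    }

policy = build_trust_policy("us-east-1", "111122223333", "my-prometheus-source")
document = json.dumps(policy)
# import boto3
# boto3.client("iam").create_role(
#     RoleName="OpenSearchDirectQueryPrometheusRole",  # placeholder name
#     AssumeRolePolicyDocument=document,
# )
```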

# Querying Prometheus metrics
<a name="direct-query-prometheus-configure"></a>

Amazon OpenSearch Service allows you to query your Prometheus data using PromQL (Prometheus Query Language) directly from the Observability interface. When you execute a PromQL query against the Prometheus data source, OpenSearch Service passes the query directly to your workspace API for execution.

## Running a PromQL query
<a name="direct-query-prometheus-query"></a>

To run a query:

1. Open your OpenSearch UI application and observability workspace.

1. Navigate to **Observability** and select **Discover Metrics**.

1. In the data source drop-down, select your Prometheus data source.

1. Enter your PromQL query in the query bar.

For example, to find the average per-second CPU usage over a 5-minute window for a specific pod:

```
avg(rate(container_cpu_usage_seconds_total{pod="payment-service-pod"}[5m])) by (pod)
```

**Note**  
Set your time picker to a narrow, relevant window (for example, the last 1 hour) to optimize API performance and prevent timeouts.
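
Behind the time picker, a Prometheus range query is parameterized by a start time, an end time, and a step. A sketch of the narrow-window guidance above, using the standard Prometheus HTTP API field names (OpenSearch UI sets these for you when you adjust the time picker):

```python
from datetime import datetime, timedelta, timezone

def range_params(query, window=timedelta(hours=1), step="60s", now=None):
    """Build range-query parameters for the trailing `window` of data."""
    end = now or datetime.now(timezone.utc)
    start = end - window
    return {
        "query": query,
        "start": start.isoformat(),
        "end": end.isoformat(),
        "step": step,
    }

params = range_params(
    'avg(rate(container_cpu_usage_seconds_total{pod="payment-service-pod"}[5m])) by (pod)'
)
```

A one-hour window at a 60-second step returns at most 61 points per series, which keeps response payloads small.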

## Visualizing metrics in dashboards
<a name="direct-query-prometheus-dashboards"></a>

You can add PromQL-driven metric visualizations to your existing observability dashboards to correlate them with your log and trace data.

1. Navigate to **Discover Metrics**, select your Prometheus workspace from the data source drop-down, and run your PromQL query.

1. Use the visualization tab in **Discover Metrics** to create a visualization and define your visualization type.

1. Save the visualization to your dashboard.

**Note**  
Metric visualizations can only be added from **Discover Metrics**. Visualizations on the visualizations tab are optimized for logs only.

## Pricing
<a name="direct-query-prometheus-pricing"></a>

For Amazon Managed Service for Prometheus, OpenSearch Service makes live calls directly to your Prometheus workspace to fetch data. Because OpenSearch Service does not provision separate compute resources to execute these queries, you do not incur OpenSearch Compute Unit (OCU) charges. You are only responsible for the standard querying costs associated with Amazon Managed Service for Prometheus.

For more information, see the Direct Query and Serverless sections within [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Limitations
<a name="direct-query-prometheus-limitations"></a>

The following limitations apply to direct queries in Amazon Managed Service for Prometheus:
+ **Time range constraints** – Live queries are optimized for operational dashboards. Querying highly granular, non-downsampled metrics over very long time horizons (for example, spanning multiple months) might hit payload size limits or result in timeouts from the Prometheus API.
+ **Prometheus API quotas** – Your queries are subject to the standard Amazon Managed Service for Prometheus service quotas, including limits on Query Samples Processed (QSP) and concurrent query execution.
+ **Query timeout** – Long-running queries time out after 30 seconds.

## Recommendations
<a name="direct-query-prometheus-recommendations"></a>

We recommend the following when using direct queries in Amazon Managed Service for Prometheus:
+ **Use recording rules for long time ranges** – Because direct queries make live API calls to your workspace, querying highly granular data over long periods (such as multiple months) can result in API timeouts or payload limits. Use Amazon Managed Service for Prometheus recording rules to create downsampled metrics for historical analysis.
+ **Apply narrow time filters** – Always specify a targeted time range in your OpenSearch UI queries to minimize the volume of data fetched dynamically from the Prometheus API.
+ **Monitor your Prometheus quotas** – Because these live queries do not incur OpenSearch Compute Unit (OCU) charges, monitor your Amazon Managed Service for Prometheus usage instead. Monitor your Query Samples Processed (QSP) and concurrent query limits to avoid throttling at the workspace level.

## Quotas
<a name="direct-query-prometheus-quotas"></a>

Unlike other direct query data sources, each time you initiate a query to Amazon Managed Service for Prometheus, OpenSearch Service makes a live call to Amazon Managed Service for Prometheus. No sessions are created. The following quotas are applicable per account and Region. For example, an account is allotted 20 data sources in us-east-1 and 20 in us-east-2.


| Description | Maximum | Can override | 
| --- | --- | --- | 
| Data sources | 20 | Yes | 
| Execute queries – instant and range queries (TPS) | 500 | Yes | 
| Read resources – labels, metrics, alerts, rules, alert manager (TPS) | 50 | Yes | 
| Write resources – create/update silence, create/update rule (TPS) | 50 | Yes | 
| Maximum query execution time (seconds) | 30 | No | 

## Supported AWS Regions
<a name="direct-query-prometheus-regions"></a>

The following AWS Regions are supported for direct queries in Amazon Managed Service for Prometheus:
+ Asia Pacific (Hong Kong)
+ Asia Pacific (Mumbai)
+ Asia Pacific (Osaka)
+ Asia Pacific (Seoul)
+ Asia Pacific (Singapore)
+ Asia Pacific (Sydney)
+ Asia Pacific (Tokyo)
+ Canada (Central)
+ Europe (Frankfurt)
+ Europe (Ireland)
+ Europe (London)
+ Europe (Milan)
+ Europe (Paris)
+ Europe (Spain)
+ Europe (Stockholm)
+ South America (São Paulo)
+ US East (N. Virginia)
+ US East (Ohio)
+ US West (N. California)
+ US West (Oregon)

# Managing a data source in Amazon OpenSearch Service
<a name="direct-query-managing-data-sources"></a>

Managing your data sources is an important part of maintaining the reliability, availability, and performance of direct query data sources and your other AWS solutions. AWS provides the following tools to monitor your data sources, report when something is wrong, and take automatic actions when appropriate.

**Topics**
+ [Monitoring with CloudWatch metrics data sources](#monitoring-cloudwatch-metrics)
+ [Enabling and disabling data sources](#direct-query-s3-enabling-disabling-data)
+ [Monitoring with AWS Budget](#direct-query-s3-enabling-budget)
+ [Deleting a data source](#direct-query-s3-delete)

## Monitoring with CloudWatch metrics data sources
<a name="monitoring-cloudwatch-metrics"></a>

You can monitor direct query using CloudWatch. CloudWatch collects raw data and processes it into readable, near real-time metrics. These statistics are kept for 15 months, so that you can access historical information and gain a better perspective on how your web application or service is performing.

You can also set alarms to monitor certain thresholds, and send notifications or take actions when those thresholds are met. For more information, see [What is Amazon CloudWatch?](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html)

Amazon S3 reports the following metrics:


| Metric | Description | 
| --- | --- | 
| AsyncQueryCreateAPI |  The total number of requests made to the API for creating asynchronous queries. **Relevant statistics**: Average, Maximum, Sum **Dimensions**: `ClientId`,`DomainName` **Frequency**: 60 seconds  | 
|  AsyncQueryGetApiRequestCount  |  The total number of requests made to the API for retrieving asynchronous query results. **Relevant statistics**: Average, Maximum, Sum **Dimensions**: `ClientId`, `DomainName` **Frequency**: 60 seconds  | 
|  AsyncQueryCancelApiRequestCount  |  The total number of requests made to the API for canceling asynchronous queries. **Relevant statistics**: Average, Maximum, Sum **Dimensions**: `ClientId`, `DomainName` **Frequency**: 60 seconds  | 
|  AsyncQueryGetApiFailedRequestCusErrCount  |  The number of failed requests when retrieving asynchronous query results due to customer-related errors (for example, an invalid query ID). **Relevant statistics**: Average, Maximum, Sum **Dimensions**: `ClientId`, `DomainName` **Frequency**: 60 seconds  | 
|  AsyncQueryCancelApiFailedRequestCusErrCount  |  The number of failed requests when canceling asynchronous queries due to customer-related errors (for example, an invalid query ID). **Relevant statistics**: Average, Maximum, Sum **Dimensions**: `ClientId`,`DomainName` **Frequency**: 60 seconds  | 
|  AsyncQueryCancelApiFailedRequestSysErrCount  |  The number of failed requests when canceling asynchronous queries due to system-related errors. **Relevant statistics**: Average, Maximum, Sum **Dimensions**: `ClientId`, `DomainName` **Frequency**: 60 seconds  | 
|  AsyncQueryGetApiFailedRequestSysErrCount  |  The number of failed requests when retrieving asynchronous query results due to system-related errors. **Relevant statistics**: Average, Maximum, Sum **Dimensions**: `ClientId`, `DomainName` **Frequency**: 60 seconds  | 

CloudWatch Logs and Security Lake report the following metrics:


| Metric | Description | 
| --- | --- | 
|  DirectQueryRate  |  The rate of requests made against the data sources. **Relevant statistics**: Sum, Maximum, Minimum, Average **Dimensions**: `DataSourceName` **Frequency**: 60 seconds  | 
|  DirectQueryLatency  |  The latency observed for running queries on the data sources. **Relevant statistics**: Average, P90, P99, Sum, Minimum, Maximum **Dimensions**: `DataSourceName` **Frequency**: 60 seconds  | 
|  FailedDirectQueries  |  The total number of query failures that are observed on the data source queries. **Relevant statistics**: Sum, Maximum, Minimum, Average **Dimensions**: `DataSourceName` **Frequency**: 60 seconds  | 
|  DirectQueryConsumedOCU  |  The number of OCUs that are consumed for running the queries on the data sources. **Relevant statistics**: Average, P90, P99, Sum, Minimum, Maximum **Dimensions**: `DataSourceName` **Frequency**: 60 seconds  | 
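
To track OCU consumption programmatically, you can retrieve the DirectQueryConsumedOCU metric above with the CloudWatch GetMetricData API. A sketch of the metric query (the namespace below is an assumption; confirm the actual namespace for your data source in the CloudWatch console):

```python
def ocu_metric_query(data_source_name, namespace="AWS/ES"):
    # namespace is an assumption -- verify it in the CloudWatch console.
    return {
        "Id": "consumed_ocu",
        "MetricStat": {
            "Metric": {
                "Namespace": namespace,
                "MetricName": "DirectQueryConsumedOCU",
                "Dimensions": [
                    {"Name": "DataSourceName", "Value": data_source_name}
                ],
            },
            "Period": 60,  # the metric is emitted at 60-second frequency
            "Stat": "Sum",
        },
    }

metric_query = ocu_metric_query("my-security-lake-source")  # placeholder name
# import boto3
# from datetime import datetime, timedelta, timezone
# cw = boto3.client("cloudwatch")
# end = datetime.now(timezone.utc)
# resp = cw.get_metric_data(
#     MetricDataQueries=[metric_query],
#     StartTime=end - timedelta(hours=1),
#     EndTime=end,
# )
```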

## Enabling and disabling data sources
<a name="direct-query-s3-enabling-disabling-data"></a>

**Note**  
The following information is only applicable to Amazon S3 data sources.

For circumstances when you want to halt direct query usage for a data source, you can disable the data source. When you disable a data source, existing queries finish executing, but no new queries can run.

Accelerations that you set up to boost query performance, such as skipping indexes, materialized views, and covering indexes, are set to manual refresh when a data source is disabled. After you set the data source back to active, user queries run as expected. Accelerations that were previously set to manual must be manually configured to run on a schedule again.

## Monitoring with AWS Budget
<a name="direct-query-s3-enabling-budget"></a>

Amazon OpenSearch Service populates OCU usage data at the account level in Billing and Cost Management's Cost Explorer. You can track OCU usage at the account level and set thresholds and alerts for when those thresholds are crossed.

The usage type to filter on in Cost Explorer has the format RegionCode-DirectQueryOCU (OCU-hours). To be notified when DirectQueryOCU (OCU-hours) usage meets your threshold, create a budget in AWS Budgets and configure an alert based on the threshold you set. Optionally, for Amazon S3, you can set up an Amazon SNS topic that turns off a data source when a threshold criterion is met.

**Note**  
Usage data in AWS Budgets is not real-time and might be delayed by up to 8 hours.

## Deleting a data source
<a name="direct-query-s3-delete"></a>

When you delete a data source, Amazon OpenSearch Service removes it from your domain or your collection. OpenSearch Service also removes indexes associated with the data source. Your transactional data isn't deleted from the other AWS service, but the other AWS service doesn't send new data to OpenSearch Service.

You can delete a data source integration using the AWS Management Console or the OpenSearch Service API.

### AWS Management Console
<a name="direct-query-s3-console-delete"></a>

**To delete an Amazon S3 data source**

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. From the left navigation pane, choose **Domains**. 

1. Select the domain that you want to delete a data source for. This opens the domain details page. Choose the **Connections** tab below the general information and find the **Direct query** section.

1. Select the data source you want to delete, choose **Delete**, and confirm deletion. 

**To delete a CloudWatch Logs or Security Lake data source**

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. From the left navigation pane, choose **Central management**, then **Connected data sources**. 

1. Select the data source you want to delete, choose **Delete**, and confirm deletion. 

### OpenSearch Service API
<a name="creating-direct-query-s3-api-delete"></a>

To delete an Amazon S3 data source, use the [DeleteDataSource](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_DeleteDataSource.html) API operation.

```
DELETE https://es.region.amazonaws.com/2021-01-01/opensearch/domain/domain-name/dataSource/data-source-name
```

To delete a CloudWatch Logs or Security Lake data source, use the [DeleteDirectQueryDataSource](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_DeleteDirectQueryDataSource.html) API operation.
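
With a recent AWS SDK, the same deletions can be scripted. A sketch using boto3 (the domain and data source names are placeholders; the API calls are commented out so the block is safe to run as-is):

```python
# Parameters for the two delete operations.
s3_delete_params = {"DomainName": "my-domain", "Name": "my-s3-source"}
direct_query_delete_params = {"DataSourceName": "my-security-lake-source"}

# import boto3
# client = boto3.client("opensearch")
# client.delete_data_source(**s3_delete_params)  # Amazon S3 data source
# client.delete_direct_query_data_source(**direct_query_delete_params)  # CloudWatch Logs or Security Lake
```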

# Supported SQL and PPL commands
<a name="direct-query-supported-commands"></a>

OpenSearch SQL and OpenSearch Pipeline Processing Language (PPL) are languages for querying, analyzing, and processing data in OpenSearch, CloudWatch Logs Insights, and Security Lake. You can use OpenSearch SQL and OpenSearch PPL in OpenSearch Discover to query data within CloudWatch Logs, Amazon S3, or Security Lake. CloudWatch Logs Insights also supports both OpenSearch PPL and OpenSearch SQL query languages, in addition to Logs Insights QL, a purpose-built query language for analyzing CloudWatch Logs.
+ **OpenSearch SQL**: OpenSearch SQL provides a familiar option if you're used to working with relational databases. It offers a subset of SQL functionality, making it a good choice for ad-hoc queries and data analysis tasks. With OpenSearch SQL, you can use commands such as SELECT, FROM, WHERE, GROUP BY, and HAVING, along with many other standard SQL commands and functions. You can execute JOINs across tables (or log groups), correlate data across tables (or log groups) using subqueries, and use the rich set of JSON, mathematical, string, conditional, and other SQL functions to perform powerful analysis on log and security data.
+ **OpenSearch PPL (Piped Processing Language):** With OpenSearch PPL, you can retrieve, query, and analyze data using piped-together commands, making it easier to understand and compose complex queries. Its syntax is based on Unix pipes, and enables chaining of commands to transform and process data. With PPL, you can filter and aggregate data, and use commands such as JOINs, subqueries, LOOKUP, and a rich set of math, string, date, conditional, and other functions for analysis.
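For a quick sense of how the two languages compare, the following equivalent queries count requests per HTTP method. The `method` field and the `<tableName/logGroup>` placeholder are illustrative; replace them with names from your own data source. In OpenSearch SQL:

```
SELECT method, COUNT(*) AS request_count
FROM <tableName/logGroup>
GROUP BY method
```

The same aggregation expressed with piped commands in OpenSearch PPL:

```
source = <tableName/logGroup> | stats count() as request_count by method
```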

Although most commands in the OpenSearch PPL and OpenSearch SQL query languages are common to both CloudWatch Logs and OpenSearch, each service supports a slightly different set of commands and functions. For details, see the tables on the following pages. 

**Topics**
+ [Supported OpenSearch SQL commands and functions](supported-directquery-sql.md)
  + [Additional information for CloudWatch Logs Insights users using OpenSearch SQL](supported-directquery-sql.md#supported-sql-for-multi-log-queries)
  + [General SQL restrictions](supported-directquery-sql.md#general-sql-restrictions)
+ [Supported PPL commands](supported-ppl.md)
  + [Additional information for CloudWatch Logs Insights users using OpenSearch PPL](supported-ppl.md#supported-ppl-for-cloudwatch-users)

# Supported OpenSearch SQL commands and functions
<a name="supported-directquery-sql"></a>

The following reference tables show which SQL commands are supported in OpenSearch Discover for querying data in Amazon S3, Security Lake, or CloudWatch Logs, and which SQL commands are supported in CloudWatch Logs Insights. The SQL syntax supported in CloudWatch Logs Insights is the same as the syntax supported in OpenSearch Discover for querying CloudWatch Logs; both are referred to as CloudWatch Logs in the following tables.

**Note**  
OpenSearch also has SQL support for querying data that is ingested in OpenSearch and stored in indexes. This SQL dialect is different from the SQL used in direct query and is referred to as [OpenSearch SQL on indexes](https://opensearch.org/docs/latest/search-plugins/sql/sql/index/).

**Topics**
+ [Commands](#supported-sql-data-retrieval)
+ [Functions](#supported-sql-functions)
+ [General SQL restrictions](#general-sql-restrictions)
+ [Additional information for CloudWatch Logs Insights users using OpenSearch SQL](#supported-sql-for-multi-log-queries)

## Commands
<a name="supported-sql-data-retrieval"></a>

**Note**  
In the example commands column, replace `<tableName/logGroup>` as needed depending on which data source you're querying.   
Example command: `SELECT Body , Operation FROM <tableName/logGroup>` 
If you're querying Amazon S3 or Security Lake, use: `SELECT Body , Operation FROM table_name` 
If you're querying CloudWatch Logs, use: ``SELECT Body , Operation FROM `LogGroupA` `` 


| Command | Description | CloudWatch Logs | Amazon S3 | Security Lake | Example command | 
| --- | --- | --- | --- | --- | --- | 
|  [SELECT clause](#supported-sql-select)  |  Displays projected values.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    method,<br />    status <br />FROM <br />    <tableName/logGroup></pre>  | 
| [WHERE clause](#supported-sql-where) |  Filters log events based on the provided field criteria.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    * <br />FROM <br />    <tableName/logGroup><br />WHERE <br />    status = 100</pre>  | 
| [GROUP BY clause](#supported-sql-group-by) |  Groups log events based on category and finds the average based on stats.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    method,<br />    status,<br />    COUNT(*) AS request_count,<br />    SUM(bytes) AS total_bytes <br />FROM <br />    <tableName/logGroup> <br />GROUP BY <br />    method, <br />    status</pre>  | 
| [HAVING clause](#supported-sql-having) |  Filters the results based on grouping conditions.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    method,<br />    status,<br />    COUNT(*) AS request_count,<br />    SUM(bytes) AS total_bytes <br />FROM <br />    <tableName/logGroup> <br />GROUP BY <br />    method,<br />    status<br />HAVING <br />    COUNT(*) > 5</pre>  | 
| [ORDER BY clause](#supported-sql-order-by) |  Orders the results based on fields in the order clause. You can sort in either descending or ascending order.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    * <br />FROM <br />    <tableName/logGroup> <br />ORDER BY <br />    status DESC</pre>  | 
|  [JOIN clause](#supported-sql-join)  ( `INNER` \| `CROSS` \| `LEFT OUTER` )  |  Joins the results for two tables based on common fields.  |  ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported (must use `INNER` and `LEFT OUTER` keywords for join; only one JOIN operation is supported in a SELECT statement)  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported (must use `INNER`, `LEFT OUTER`, or `CROSS` keywords for join) | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported (must use `INNER`, `LEFT OUTER`, or `CROSS` keywords for join) |  <pre>SELECT <br />    A.Body,<br />    B.Timestamp<br />FROM <br />    <tableNameA/logGroupA> AS A <br />INNER JOIN <br />    <tableNameB/logGroupB> AS B <br />    ON A.`requestId` = B.`requestId`</pre>  | 
| [LIMIT clause](#supported-sql-limit) |  Restricts the results to first N rows.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    * <br />FROM <br />    <tableName/logGroup> <br />LIMIT <br />    10</pre>  | 
| [CASE clause](#supported-sql-case) | Evaluates conditions and returns a value when the first condition is met. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT<br />    method,<br />    status,<br />    CASE<br />        WHEN status BETWEEN 100 AND 199 THEN 'Informational'<br />        WHEN status BETWEEN 200 AND 299 THEN 'Success'<br />        WHEN status BETWEEN 300 AND 399 THEN 'Redirection'<br />        WHEN status BETWEEN 400 AND 499 THEN 'Client Error'<br />        WHEN status BETWEEN 500 AND 599 THEN 'Server Error'<br />        ELSE 'Unknown Status'<br />    END AS status_category,<br />    CASE method<br />        WHEN 'GET' THEN 'Read Operation'<br />        WHEN 'POST' THEN 'Create Operation'<br />        WHEN 'PUT' THEN 'Update Operation'<br />        WHEN 'PATCH' THEN 'Partial Update Operation'<br />        WHEN 'DELETE' THEN 'Delete Operation'<br />        ELSE 'Other Operation'<br />    END AS operation_type,<br />    bytes,<br />    datetime<br />FROM <tableName/logGroup>                         </pre>  | 
| [Common table expression](#supported-sql-cte) | Creates a named temporary result set within a SELECT, INSERT, UPDATE, DELETE, or MERGE statement. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>WITH RequestStats AS (<br />    SELECT <br />        method,<br />        status,<br />        bytes,<br />        COUNT(*) AS request_count<br />    FROM <br />        tableName<br />    GROUP BY <br />        method,<br />        status,<br />        bytes<br />)<br />SELECT <br />    method,<br />    status,<br />    bytes,<br />    request_count <br />FROM <br />    RequestStats <br />WHERE <br />    bytes > 1000</pre>  | 
| [EXPLAIN](#supported-sql-explain) | Displays the execution plan of a SQL statement without actually executing it. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>EXPLAIN<br />SELECT <br />    k,<br />    SUM(v)<br />FROM <br />    VALUES <br />        (1, 2),<br />        (1, 3) AS t(k, v)<br />GROUP BY <br />    k</pre>  | 
| [LATERAL SUBQUERY clause](#supported-sql-lateral-subquery) | Allows a subquery in the FROM clause to reference columns from preceding items in the same FROM clause. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> SELECT <br />    * <br />FROM <br />    tableName<br />LATERAL (<br />    SELECT <br />        * <br />    FROM <br />        t2 <br />    WHERE <br />        t1.c1 = t2.c1<br />)</pre>  | 
| [LATERAL VIEW clause](#supported-sql-lateral-view) | Generates a virtual table by applying a table-generating function to each row of a base table. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    * <br />FROM <br />    tableName<br />LATERAL VIEW <br />    EXPLODE(ARRAY(30, 60)) tableName AS c_age<br />LATERAL VIEW <br />    EXPLODE(ARRAY(40, 80)) AS d_age</pre>  | 
| [LIKE predicate](#supported-sql-like-predicate) | Matches a string against a pattern using wildcard characters. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> SELECT <br />    method,<br />    status,<br />    request,<br />    host <br />FROM <br />    <tableName/logGroup> <br />WHERE <br />    method LIKE 'D%'</pre>  | 
| [OFFSET](#supported-sql-offset) | Specifies the number of rows to skip before starting to return rows from the query. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported when used together with a LIMIT clause in a query | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> SELECT <br />    method,<br />    status,<br />    bytes,<br />    datetime <br />FROM <br />    <tableName/logGroup> <br />ORDER BY <br />    datetime<br />OFFSET <br />    10 </pre>  | 
| [PIVOT clause](#supported-sql-pivot) | Transforms rows into columns, rotating data from a row-based format to a column-based format. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    * <br />FROM <br />    (<br />        SELECT <br />            method,<br />            status,<br />            bytes<br />        FROM <br />            <tableName/logGroup><br />    ) AS SourceTable <br />PIVOT <br />(<br />    SUM(bytes) <br />    FOR method IN ('GET', 'POST', 'PATCH', 'PUT', 'DELETE')<br />) AS PivotTable</pre>  | 
| [Set operators](#supported-sql-set) | Combines the results of two or more SELECT statements (e.g., UNION, INTERSECT, EXCEPT). | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    method,<br />    status,<br />    bytes<br />FROM <br />    <tableName/logGroup><br />WHERE <br />    status = '416'<br /><br />UNION<br /><br />SELECT <br />    method,<br />    status,<br />    bytes<br />FROM <br />    <tableName/logGroup><br />WHERE <br />    bytes > 20000</pre>  | 
| [SORT BY clause](#supported-sql-sort-by) | Specifies the order in which to return the query results. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    method,<br />    status,<br />    bytes<br />FROM <br />    <tableName/logGroup><br />SORT BY <br />    bytes DESC</pre>  | 
| [UNPIVOT](#supported-sql-unpivot) | Transforms columns into rows, rotating data from a column-based format to a row-based format. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> SELECT <br />    status,<br />    REPLACE(method, '_bytes', '') AS request_method,<br />    bytes,<br />    datetime <br />FROM <br />    PivotedData <br />UNPIVOT <br />(<br />    bytes <br />    FOR method IN <br />    (<br />        GET_bytes,<br />        POST_bytes,<br />        PATCH_bytes,<br />        PUT_bytes,<br />        DELETE_bytes<br />    )<br />) AS UnpivotedData</pre>  | 

## Functions
<a name="supported-sql-functions"></a>

**Note**  
In the example commands column, replace `<tableName/logGroup>` as needed depending on which data source you're querying.   
Example command: `SELECT Body , Operation FROM <tableName/logGroup>` 
If you're querying Amazon S3 or Security Lake, use: `SELECT Body , Operation FROM table_name` 
If you're querying CloudWatch Logs, use: ``SELECT Body , Operation FROM `LogGroupA` `` 


| Available SQL Grammar | Description | CloudWatch Logs | Amazon S3 | Security Lake | Example command | 
| --- | --- | --- | --- | --- | --- | 
| [String functions](#supported-sql-string) |  Built-in functions that can manipulate and transform string and text data within SQL queries. For example, converting case, combining strings, extracting parts, and cleaning text.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    UPPER(method) AS upper_method,<br />    LOWER(host) AS lower_host <br />FROM <br />    <tableName/logGroup></pre>  | 
| [Date and time functions](#supported-sql-date-time) |  Built-in functions for handling and transforming date and timestamp data in queries. For example, **date\_add**, **date\_format**, **datediff**, and **current\_date**.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    TO_TIMESTAMP(datetime) AS timestamp,<br />    TIMESTAMP_SECONDS(UNIX_TIMESTAMP(datetime)) AS from_seconds,<br />    UNIX_TIMESTAMP(datetime) AS to_unix,<br />    FROM_UTC_TIMESTAMP(datetime, 'PST') AS to_pst,<br />    TO_UTC_TIMESTAMP(datetime, 'EST') AS from_est <br />FROM <br />    <tableName/logGroup></pre>  | 
| [Aggregate functions](#supported-sql-aggregate) |  Built-in functions that perform calculations on multiple rows to produce a single summarized value. For example, **sum**, **count**, **avg**, **max**, and **min**.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png)Supported |  ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    COUNT(*) AS total_records,<br />    COUNT(DISTINCT method) AS unique_methods,<br />    SUM(bytes) AS total_bytes,<br />    AVG(bytes) AS avg_bytes,<br />    MIN(bytes) AS min_bytes,<br />    MAX(bytes) AS max_bytes <br />FROM <br />    <tableName/logGroup></pre>  | 
| [Conditional functions](#supported-sql-conditional) |  Built-in functions that perform actions based on specified conditions, or that evaluate expressions conditionally. For example, **CASE** and **IF**.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    CASE <br />        WHEN method = 'GET' AND bytes < 1000 THEN 'Small Read'<br />        WHEN method = 'POST' AND bytes > 10000 THEN 'Large Write'<br />        WHEN status >= 400 OR bytes = 0 THEN 'Problem'<br />        ELSE 'Normal'<br />    END AS request_type <br />FROM <br />    <tableName/logGroup></pre>  | 
| [JSON functions](#supported-sql-json) |  Built-in functions for parsing, extracting, modifying, and querying JSON-formatted data within SQL queries (e.g., from\_json, to\_json, get\_json\_object, json\_tuple) allowing manipulation of JSON structures in datasets.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    FROM_JSON(<br />        @message, <br />        'STRUCT<<br />            host: STRING,<br />            user-identifier: STRING,<br />            datetime: STRING,<br />            method: STRING,<br />            status: INT,<br />            bytes: INT<br />        >'<br />    ) AS parsed_json <br />FROM <br />    <tableName/logGroup> </pre>  | 
| [Array functions](#supported-sql-array) |  Built-in functions for working with array-type columns in SQL queries, allowing operations like accessing, modifying, and analyzing array data (e.g., size, explode, array\_contains).  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    scores,<br />    size(scores) AS length,<br />    array_contains(scores, 90) AS has_90 <br />FROM <br />    <tableName/logGroup></pre>  | 
| [Window functions](#supported-sql-window) | Built-in functions that perform calculations across a specified set of rows related to the current row (window), enabling operations like ranking, running totals, and moving averages (e.g., ROW\_NUMBER, RANK, LAG, LEAD) | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> SELECT <br />    field1,<br />    field2,<br />    RANK() OVER (ORDER BY field2 DESC) AS field2Rank <br />FROM <br />    <tableName/logGroup></pre>  | 
| [Conversion functions](#supported-sql-conversion) |  Built-in functions for converting data from one type to another within SQL queries, enabling data type transformations and format conversions (e.g., CAST, TO\_DATE, TO\_TIMESTAMP, BINARY)  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    CAST('123' AS INT) AS converted_number,<br />    CAST(123 AS STRING) AS converted_string <br />FROM <br />    <tableName/logGroup></pre>  | 
| [Predicate functions](#supported-sql-predicate) |  Built-in functions that evaluate conditions and return boolean values (true/false) based on specified criteria or patterns (e.g., IN, LIKE, BETWEEN, IS NULL, EXISTS)  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    * <br />FROM <br />    <tableName/logGroup> <br />WHERE <br />    id BETWEEN 50000 AND 75000</pre>  | 
| [Map functions](#supported-sql-map) | Applies a specified function to each element in a collection, transforming the data into a new set of values. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    MAP_FILTER(<br />        MAP(<br />            'method', method,<br />            'status', CAST(status AS STRING),<br />            'bytes', CAST(bytes AS STRING)<br />        ),<br />        (k, v) -> k IN ('method', 'status') AND v != 'null'<br />    ) AS filtered_map <br />FROM <br />    <tableName/logGroup> <br />WHERE <br />    status = 100</pre>  | 
| [Mathematical functions](#supported-sql-math) | Performs mathematical operations on numeric data, such as calculating averages, sums, or trigonometric values. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    bytes,<br />    bytes + 1000 AS added,<br />    bytes - 1000 AS subtracted,<br />    bytes * 2 AS doubled,<br />    bytes / 1024 AS kilobytes,<br />    bytes % 1000 AS remainder <br />FROM <br />    <tableName/logGroup></pre>  | 
| [Multi-log group functions](#multi-log-queries) |  Enables users to specify multiple log groups in a SQL SELECT statement  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | Not applicable | Not applicable |  <pre>SELECT <br />    lg1.Column1,<br />    lg1.Column2 <br />FROM <br />    `logGroups(logGroupIdentifier: ['LogGroup1', 'LogGroup2'])` AS lg1 <br />WHERE <br />    lg1.Column3 = "Success"<br /></pre>  | 
| [Generator functions](#supported-sql-generator) | Creates an iterator object that yields a sequence of values, allowing for efficient memory usage in large data sets. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>SELECT <br />    explode(array(10, 20)) </pre>  | 

## General SQL restrictions
<a name="general-sql-restrictions"></a>

The following restrictions apply when using OpenSearch SQL with CloudWatch Logs, Amazon S3, and Security Lake.

1. You can only use one JOIN operation in a SELECT statement.

1. Only one level of nested subqueries is supported.

1. Multiple statement queries separated by semi-colons aren't supported.

1. Queries containing field names that are identical but differ only in case (such as field1 and FIELD1) are not supported.

   For example, the following queries are not supported:

   ```
   Select AWSAccountId, awsaccountid from LogGroup
   ```

   However, the following query is supported because the field name (@logStream) is identical in both tables:

   ```
   SELECT a.`@logStream`, b.`@logStream` FROM TableA a INNER JOIN TableB b ON a.id = b.id
   ```

1. Functions and expressions must operate on field names and be part of a SELECT statement with a log group specified in the FROM clause.

   For example, this query is not supported:

   ```
   SELECT cos(10) FROM LogGroup
   ```

   This query is supported:

   ```
   SELECT cos(field1) FROM LogGroup
   ```
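As a sketch of a query that stays within these restrictions (table and field names are hypothetical), the following statement uses a single JOIN and only one level of subquery nesting:

```
SELECT a.method, a.status
FROM TableA a
INNER JOIN TableB b ON a.requestId = b.requestId
WHERE a.bytes > (SELECT AVG(bytes) FROM TableA)
```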

## Additional information for CloudWatch Logs Insights users using OpenSearch SQL
<a name="supported-sql-for-multi-log-queries"></a>

CloudWatch Logs supports OpenSearch SQL queries in the Logs Insights console, API, and CLI. It supports most commands, including SELECT, FROM, WHERE, GROUP BY, HAVING, JOINs, and nested queries, along with JSON, math, string, and conditional functions. However, CloudWatch Logs supports only read operations, so it doesn't allow DDL or DML statements. See the tables in the previous sections for a full list of supported commands and functions.

### Multi-log group functions
<a name="multi-log-queries"></a>

CloudWatch Logs Insights supports the ability to query multiple log groups. To address this use case in SQL, you can use the `logGroups` command. This command is specific to querying data in CloudWatch Logs Insights involving one or more log groups. Use this syntax to query multiple log groups by specifying them in the command, instead of writing a query for each of the log groups and combining them with a `UNION` command. 

Syntax:

```
logGroups(
    logGroupIdentifier: ['LogGroup1','LogGroup2', ...'LogGroupn']
)
```

In this syntax, you can specify up to 50 log groups in the `logGroupIdentifier` parameter. To reference log groups in a monitoring account, use ARNs instead of `LogGroup` names.

Example query:

```
SELECT LG1.Column1, LG1.Column2 from `logGroups(
    logGroupIdentifier: ['LogGroup1', 'LogGroup2']
)` as LG1 
WHERE LG1.Column1 = 'ABC'
```

The following syntax involving multiple log groups after the `FROM` statement is not supported when querying CloudWatch Logs:

```
SELECT Column1, Column2 FROM 'LogGroup1', 'LogGroup2', ...'LogGroupn' 
WHERE Column1 = 'ABC'
```

### Restrictions
<a name="restrictions"></a>

When you use SQL or PPL commands, enclose certain fields in backticks to query them. Backticks are required for field names that contain special (non-alphanumeric) characters. For example, enclose `@message`, `Operation.Export`, and `Test::Field` in backticks. You don't need to enclose columns with purely alphabetic names in backticks.

Example query with simple fields:

```
SELECT SessionToken, Operation, StartTime  FROM `LogGroup-A`
LIMIT 1000;
```

The same query with backticks added:

```
SELECT `SessionToken`, `Operation`, `StartTime`  FROM `LogGroup-A`
LIMIT 1000;
```
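As another sketch, a query can mix simple and special-character field names; only the latter need backticks (the field names other than `@message` are illustrative):

```
SELECT Operation, `@message`, `Test::Field` FROM `LogGroup-A`
WHERE Operation = 'Export'
LIMIT 1000;
```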

For additional general restrictions that aren't specific to CloudWatch Logs, see [General SQL restrictions](#general-sql-restrictions).

### Sample queries and quotas
<a name="samples"></a>

**Note**  
The following applies to both CloudWatch Logs Insights users and OpenSearch users querying CloudWatch data.

For sample SQL queries that you can use with CloudWatch Logs, see **Saved and sample queries** in the Amazon CloudWatch Logs Insights console.

For information about the limits that apply when querying CloudWatch Logs from OpenSearch Service, see [CloudWatch Logs quotas](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html) in the Amazon CloudWatch Logs User Guide. The limits cover the number of log groups you can query, the maximum number of concurrent queries you can run, the maximum query execution time, and the maximum number of rows returned in results. The limits are the same regardless of which language you use to query CloudWatch Logs (OpenSearch PPL, OpenSearch SQL, or Logs Insights QL). 

### SQL commands
<a name="supported-sql-commands-details"></a>

**Topics**
+ [String functions](#supported-sql-string)
+ [Date and time functions](#supported-sql-date-time)
+ [Aggregate functions](#supported-sql-aggregate)
+ [Conditional functions](#supported-sql-conditional)
+ [JSON functions](#supported-sql-json)
+ [Array functions](#supported-sql-array)
+ [Window functions](#supported-sql-window)
+ [Conversion functions](#supported-sql-conversion)
+ [Predicate functions](#supported-sql-predicate)
+ [Map functions](#supported-sql-map)
+ [Mathematical functions](#supported-sql-math)
+ [Generator functions](#supported-sql-generator)
+ [SELECT clause](#supported-sql-select)
+ [WHERE clause](#supported-sql-where)
+ [GROUP BY clause](#supported-sql-group-by)
+ [HAVING clause](#supported-sql-having)
+ [ORDER BY clause](#supported-sql-order-by)
+ [JOIN clause](#supported-sql-join)
+ [LIMIT clause](#supported-sql-limit)
+ [CASE clause](#supported-sql-case)
+ [Common table expression](#supported-sql-cte)
+ [EXPLAIN](#supported-sql-explain)
+ [LATERAL SUBQUERY clause](#supported-sql-lateral-subquery)
+ [LATERAL VIEW clause](#supported-sql-lateral-view)
+ [LIKE predicate](#supported-sql-like-predicate)
+ [OFFSET](#supported-sql-offset)
+ [PIVOT clause](#supported-sql-pivot)
+ [Set operators](#supported-sql-set)
+ [SORT BY clause](#supported-sql-sort-by)
+ [UNPIVOT](#supported-sql-unpivot)

#### String functions
<a name="supported-sql-string"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| ascii(str) | Returns the numeric value of the first character of str. | 
| base64(bin) | Converts the argument from a binary bin to a base 64 string. | 
| bit\_length(expr) | Returns the bit length of string data or number of bits of binary data. | 
| btrim(str) | Removes the leading and trailing space characters from str. | 
| btrim(str, trimStr) | Removes the leading and trailing trimStr characters from str. | 
| char(expr) | Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to chr(expr % 256). | 
| char\_length(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. | 
| character\_length(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. | 
| chr(expr) | Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to chr(expr % 256). | 
| concat\_ws(sep[, str \| array(str)]+) | Returns the concatenation of the strings separated by sep, skipping null values. | 
| contains(left, right) | Returns a boolean. The value is True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left or right must be of STRING or BINARY type. | 
| decode(bin, charset) | Decodes the first argument using the second argument character set. | 
| decode(expr, search, result [, search, result ] ... [, default]) | Compares expr to each search value in order. If expr is equal to a search value, decode returns the corresponding result. If no match is found, then it returns default. If default is omitted, it returns null. | 
| elt(n, input1, input2, ...) | Returns the n-th input, e.g., returns input2 when n is 2.  | 
| encode(str, charset) | Encodes the first argument using the second argument character set. | 
| endswith(left, right) | Returns a boolean. The value is True if left ends with right. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left or right must be of STRING or BINARY type. | 
| find\_in\_set(str, str\_array) | Returns the index (1-based) of the given string (str) in the comma-delimited list (str\_array). Returns 0 if the string is not found or if the given string (str) contains a comma. | 
| format\_number(expr1, expr2) | Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. expr2 also accepts a user-specified format. This is intended to function like MySQL's FORMAT. | 
| format\_string(strfmt, obj, ...) | Returns a formatted string from printf-style format strings. | 
| initcap(str) | Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space. | 
| instr(str, substr) | Returns the (1-based) index of the first occurrence of substr in str. | 
| lcase(str) | Returns str with all characters changed to lowercase. | 
| left(str, len) | Returns the leftmost len characters from the string str (len can be string type). If len is less than or equal to 0, the result is an empty string. | 
| len(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. | 
| length(expr) | Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. | 
| levenshtein(str1, str2[, threshold]) | Returns the Levenshtein distance between the two given strings. If threshold is set and distance more than it, return -1. | 
| locate(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based. | 
| lower(str) | Returns str with all characters changed to lowercase. | 
| lpad(str, len[, pad]) | Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters or bytes. If pad is not specified, str will be padded to the left with space characters if it is a character string, and with zeros if it is a byte sequence. | 
| ltrim(str) | Removes the leading space characters from str. | 
| luhn\_check(str) | Checks that a string of digits is valid according to the Luhn algorithm. This checksum function is widely applied to credit card numbers and government identification numbers to distinguish valid numbers from mistyped, incorrect numbers. | 
| mask(input[, upperChar, lowerChar, digitChar, otherChar]) | Masks the given string value. The function replaces characters with 'X' or 'x', and numbers with 'n'. This can be useful for creating copies of tables with sensitive information removed. | 
| octet\_length(expr) | Returns the byte length of string data or number of bytes of binary data. | 
| overlay(input, replace, pos[, len]) | Replace input with replace that starts at pos and is of length len. | 
| position(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based. | 
| printf(strfmt, obj, ...) | Returns a formatted string from printf-style format strings. | 
| regexp\_count(str, regexp) | Returns a count of the number of times that the regular expression pattern regexp is matched in the string str. | 
| regexp\_extract(str, regexp[, idx]) | Extracts the first string in str that matches the regexp expression and corresponds to the regex group index. | 
| regexp\_extract\_all(str, regexp[, idx]) | Extracts all strings in str that match the regexp expression and correspond to the regex group index. | 
| regexp\_instr(str, regexp) | Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring. Positions are 1-based, not 0-based. If no match is found, returns 0. | 
| regexp\_replace(str, regexp, rep[, position]) | Replaces all substrings of str that match regexp with rep. | 
| regexp\_substr(str, regexp) | Returns the substring that matches the regular expression regexp within the string str. If the regular expression is not found, the result is null. | 
| repeat(str, n) | Returns the string which repeats the given string value n times. | 
| replace(str, search[, replace]) | Replaces all occurrences of search with replace. | 
| right(str, len) | Returns the rightmost len characters from the string str (len can be string type). If len is less than or equal to 0, the result is an empty string. | 
| rpad(str, len[, pad]) | Returns str, right-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters. If pad is not specified, str will be padded to the right with space characters if it is a character string, and with zeros if it is a binary string. | 
| rtrim(str) | Removes the trailing space characters from str. | 
| sentences(str[, lang, country]) | Splits str into an array of array of words. | 
| soundex(str) | Returns Soundex code of the string. | 
| space(n) | Returns a string consisting of n spaces. | 
| split(str, regex, limit) | Splits str around occurrences that match regex and returns an array with a length of at most limit | 
| split\_part(str, delimiter, partNum) | Splits str by delimiter and returns the requested part of the split (1-based). If any input is null, returns null. If partNum is out of range of split parts, returns an empty string. If partNum is 0, throws an error. If partNum is negative, the parts are counted backward from the end of the string. If the delimiter is an empty string, str is not split. | 
| startswith(left, right) | Returns a boolean. The value is True if left starts with right. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left or right must be of STRING or BINARY type. | 
| substr(str, pos[, len]) | Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len. | 
| substr(str FROM pos[ FOR len]]) | Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len. | 
| substring(str, pos[, len]) | Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len. | 
| substring(str FROM pos[ FOR len]]) | Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len. | 
| substring\_index(str, delim, count) | Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring\_index performs a case-sensitive match when searching for delim. | 
| to\_binary(str[, fmt]) | Converts the input str to a binary value based on the supplied fmt. fmt can be a case-insensitive string literal of "hex", "utf-8", "utf8", or "base64". By default, the binary format for conversion is "hex" if fmt is omitted. The function returns NULL if at least one of the input parameters is NULL. | 
| to\_char(numberExpr, formatExpr) | Converts numberExpr to a string based on the formatExpr. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive: '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces. '.' or 'D': Specifies the position of the decimal point (optional, only allowed once). ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. | 
| to\_number(expr, fmt) | Converts string 'expr' to a number based on the string format 'fmt'. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive: '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input string. If the 0/9 sequence starts with 0 and is before the decimal point, it can only match a digit sequence of the same size. Otherwise, if the sequence starts with 9 or is after the decimal point, it can match a digit sequence that has the same or smaller size. '.' or 'D': Specifies the position of the decimal point (optional, only allowed once). ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. 'expr' must match the grouping separator relevant for the size of the number. | 
| to\_varchar(numberExpr, formatExpr) | Converts numberExpr to a string based on the formatExpr. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive: '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces. '.' or 'D': Specifies the position of the decimal point (optional, only allowed once). ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. | 
| translate(input, from, to) | Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string. | 
| trim(str) | Removes the leading and trailing space characters from str. | 
| trim(BOTH FROM str) | Removes the leading and trailing space characters from str. | 
| trim(LEADING FROM str) | Removes the leading space characters from str. | 
| trim(TRAILING FROM str) | Removes the trailing space characters from str. | 
| trim(trimStr FROM str) | Removes the leading and trailing trimStr characters from str. | 
| trim(BOTH trimStr FROM str) | Removes the leading and trailing trimStr characters from str. | 
| trim(LEADING trimStr FROM str) | Removes the leading trimStr characters from str. | 
| trim(TRAILING trimStr FROM str) | Removes the trailing trimStr characters from str. | 
| try\_to\_binary(str[, fmt]) | This is a special version of to\_binary that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed. | 
| try\_to\_number(expr, fmt) | Converts string 'expr' to a number based on the string format fmt. Returns NULL if the string 'expr' does not match the expected format. The format follows the same semantics as the to\_number function. | 
| ucase(str) | Returns str with all characters changed to uppercase. | 
| unbase64(str) | Converts the argument from a base 64 string str to a binary. | 
| upper(str) | Returns str with all characters changed to uppercase. | 



**Examples**

```
-- ascii
SELECT ascii('222');
+----------+
|ascii(222)|
+----------+
|        50|
+----------+
SELECT ascii(2);
+--------+
|ascii(2)|
+--------+
|      50|
+--------+
-- base64
SELECT base64('Feathers');
+-----------------+
|base64(Feathers)|
+-----------------+
|     RmVhdGhlcnM=|
+-----------------+
SELECT base64(x'537061726b2053514c');
+-----------------------------+
|base64(X'537061726B2053514C')|
+-----------------------------+
|                 U3BhcmsgU1FM|
+-----------------------------+
-- bit_length
SELECT bit_length('Feathers');
+---------------------+
|bit_length(Feathers)|
+---------------------+
|                   64|
+---------------------+
SELECT bit_length(x'537061726b2053514c');
+---------------------------------+
|bit_length(X'537061726B2053514C')|
+---------------------------------+
|                               72|
+---------------------------------+
-- btrim
SELECT btrim('    Feathers   ');
+----------------------+
|btrim(    Feathers   )|
+----------------------+
|              Feathers|
+----------------------+
SELECT btrim(encode('    Feathers   ', 'utf-8'));
+-------------------------------------+
|btrim(encode(    Feathers   , utf-8))|
+-------------------------------------+
|                             Feathers|
+-------------------------------------+
SELECT btrim('Feathers', 'Fe');
+-------------------+
|btrim(Feathers, Fe)|
+-------------------+
|             athers|
+-------------------+
SELECT btrim(encode('Feathers', 'utf-8'), encode('Fe', 'utf-8'));
+-------------------------------------------------+
|btrim(encode(Feathers, utf-8), encode(Fe, utf-8))|
+-------------------------------------------------+
|                                           athers|
+-------------------------------------------------+
-- char
SELECT char(65);
+--------+
|char(65)|
+--------+
|       A|
+--------+
-- char_length
SELECT char_length('Feathers ');
+-----------------------+
|char_length(Feathers )|
+-----------------------+
|                     9 |
+-----------------------+
SELECT char_length(x'537061726b2053514c');
+----------------------------------+
|char_length(X'537061726B2053514C')|
+----------------------------------+
|                                 9|
+----------------------------------+
SELECT CHAR_LENGTH('Feathers ');
+-----------------------+
|char_length(Feathers )|
+-----------------------+
|                     9|
+-----------------------+
SELECT CHARACTER_LENGTH('Feathers ');
+----------------------------+
|character_length(Feathers )|
+----------------------------+
|                          9|
+----------------------------+
-- character_length
SELECT character_length('Feathers ');
+----------------------------+
|character_length(Feathers )|
+----------------------------+
|                          9|
+----------------------------+
SELECT character_length(x'537061726b2053514c');
+---------------------------------------+
|character_length(X'537061726B2053514C')|
+---------------------------------------+
|                                      9|
+---------------------------------------+
SELECT CHAR_LENGTH('Feathers ');
+-----------------------+
|char_length(Feathers )|
+-----------------------+
|                     9|
+-----------------------+
SELECT CHARACTER_LENGTH('Feathers ');
+----------------------------+
|character_length(Feathers )|
+----------------------------+
|                          9|
+----------------------------+
-- chr
SELECT chr(65);
+-------+
|chr(65)|
+-------+
|      A|
+-------+
-- concat_ws
SELECT concat_ws(' ', 'Fea', 'thers');
+------------------------+
|concat_ws( , Fea, thers)|
+------------------------+
|               Fea thers|
+------------------------+
SELECT concat_ws('s');
+------------+
|concat_ws(s)|
+------------+
|            |
+------------+
SELECT concat_ws('/', 'foo', null, 'bar');
+----------------------------+
|concat_ws(/, foo, NULL, bar)|
+----------------------------+
|                     foo/bar|
+----------------------------+
SELECT concat_ws(null, 'Fea', 'thers');
+---------------------------+
|concat_ws(NULL, Fea, thers)|
+---------------------------+
|                       NULL|
+---------------------------+
-- contains
SELECT contains('Feathers', 'Fea');
+--------------------------+
|contains(Feathers, Fea)|
+--------------------------+
|                      true|
+--------------------------+
SELECT contains('Feathers', 'SQL');
+--------------------------+
|contains(Feathers, SQL)|
+--------------------------+
|                     false|
+--------------------------+
SELECT contains('Feathers', null);
+-------------------------+
|contains(Feathers, NULL)|
+-------------------------+
|                     NULL|
+-------------------------+
SELECT contains(x'537061726b2053514c', x'537061726b');
+----------------------------------------------+
|contains(X'537061726B2053514C', X'537061726B')|
+----------------------------------------------+
|                                          true|
+----------------------------------------------+
-- decode
SELECT decode(encode('abc', 'utf-8'), 'utf-8');
+---------------------------------+
|decode(encode(abc, utf-8), utf-8)|
+---------------------------------+
|                              abc|
+---------------------------------+
SELECT decode(2, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
+----------------------------------------------------------------------------------+
|decode(2, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle, Non domestic)|
+----------------------------------------------------------------------------------+
|                                                                     San Francisco|
+----------------------------------------------------------------------------------+
SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
+----------------------------------------------------------------------------------+
|decode(6, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle, Non domestic)|
+----------------------------------------------------------------------------------+
|                                                                      Non domestic|
+----------------------------------------------------------------------------------+
SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle');
+--------------------------------------------------------------------+
|decode(6, 1, Southlake, 2, San Francisco, 3, New Jersey, 4, Seattle)|
+--------------------------------------------------------------------+
|                                                                NULL|
+--------------------------------------------------------------------+
SELECT decode(null, 6, 'Fea', NULL, 'thers', 4, 'rock');
+-------------------------------------------+
|decode(NULL, 6, Fea, NULL, thers, 4, rock)|
+-------------------------------------------+
|                                      thers|
+-------------------------------------------+
-- elt
SELECT elt(1, 'scala', 'java');
+-------------------+
|elt(1, scala, java)|
+-------------------+
|              scala|
+-------------------+
SELECT elt(2, 'a', 1);
+------------+
|elt(2, a, 1)|
+------------+
|           1|
+------------+
-- encode
SELECT encode('abc', 'utf-8');
+------------------+
|encode(abc, utf-8)|
+------------------+
|        [61 62 63]|
+------------------+
-- endswith
SELECT endswith('Feathers', 'ers');
+------------------------+
|endswith(Feathers, ers)|
+------------------------+
|                    true|
+------------------------+
SELECT endswith('Feathers', 'SQL');
+--------------------------+
|endswith(Feathers, SQL)|
+--------------------------+
|                     false|
+--------------------------+
SELECT endswith('Feathers', null);
+-------------------------+
|endswith(Feathers, NULL)|
+-------------------------+
|                     NULL|
+-------------------------+
SELECT endswith(x'537061726b2053514c', x'537061726b');
+----------------------------------------------+
|endswith(X'537061726B2053514C', X'537061726B')|
+----------------------------------------------+
|                                         false|
+----------------------------------------------+
SELECT endswith(x'537061726b2053514c', x'53514c');
+------------------------------------------+
|endswith(X'537061726B2053514C', X'53514C')|
+------------------------------------------+
|                                      true|
+------------------------------------------+
-- find_in_set
SELECT find_in_set('ab','abc,b,ab,c,def');
+-------------------------------+
|find_in_set(ab, abc,b,ab,c,def)|
+-------------------------------+
|                              3|
+-------------------------------+
-- format_number
SELECT format_number(12332.123456, 4);
+------------------------------+
|format_number(12332.123456, 4)|
+------------------------------+
|                   12,332.1235|
+------------------------------+
SELECT format_number(12332.123456, '##################.###');
+---------------------------------------------------+
|format_number(12332.123456, ##################.###)|
+---------------------------------------------------+
|                                          12332.123|
+---------------------------------------------------+
-- format_string
SELECT format_string("Hello World %d %s", 100, "days");
+-------------------------------------------+
|format_string(Hello World %d %s, 100, days)|
+-------------------------------------------+
|                       Hello World 100 days|
+-------------------------------------------+
-- initcap
SELECT initcap('Feathers');
+------------------+
|initcap(Feathers)|
+------------------+
|         Feathers|
+------------------+
-- instr
SELECT instr('Feathers', 'ers');
+--------------------+
|instr(Feathers, ers)|
+--------------------+
|                   6|
+--------------------+
-- lcase
SELECT lcase('Feathers');
+---------------+
|lcase(Feathers)|
+---------------+
|       feathers|
+---------------+
-- left
SELECT left('Feathers', 3);
+------------------+
|left(Feathers, 3)|
+------------------+
|               Fea|
+------------------+
SELECT left(encode('Feathers', 'utf-8'), 3);
+--------------------------------+
|left(encode(Feathers, utf-8), 3)|
+--------------------------------+
|                      [46 65 61]|
+--------------------------------+
-- len
SELECT len('Feathers ');
+---------------+
|len(Feathers )|
+---------------+
|             9|
+---------------+
SELECT len(x'537061726b2053514c');
+--------------------------+
|len(X'537061726B2053514C')|
+--------------------------+
|                         9|
+--------------------------+
SELECT CHAR_LENGTH('Feathers ');
+-----------------------+
|char_length(Feathers )|
+-----------------------+
|                     9|
+-----------------------+
SELECT CHARACTER_LENGTH('Feathers ');
+----------------------------+
|character_length(Feathers )|
+----------------------------+
|                          9|
+----------------------------+
-- length
SELECT length('Feathers ');
+------------------+
|length(Feathers )|
+------------------+
|                9|
+------------------+
SELECT length(x'537061726b2053514c');
+-----------------------------+
|length(X'537061726B2053514C')|
+-----------------------------+
|                            9|
+-----------------------------+
SELECT CHAR_LENGTH('Feathers ');
+-----------------------+
|char_length(Feathers )|
+-----------------------+
|                     9|
+-----------------------+
SELECT CHARACTER_LENGTH('Feathers ');
+----------------------------+
|character_length(Feathers )|
+----------------------------+
|                          9|
+----------------------------+
-- levenshtein
SELECT levenshtein('kitten', 'sitting');
+----------------------------+
|levenshtein(kitten, sitting)|
+----------------------------+
|                           3|
+----------------------------+
SELECT levenshtein('kitten', 'sitting', 2);
+-------------------------------+
|levenshtein(kitten, sitting, 2)|
+-------------------------------+
|                             -1|
+-------------------------------+
-- locate
SELECT locate('bar', 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
|                        4|
+-------------------------+
SELECT locate('bar', 'foobarbar', 5);
+-------------------------+
|locate(bar, foobarbar, 5)|
+-------------------------+
|                        7|
+-------------------------+
SELECT POSITION('bar' IN 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
|                        4|
+-------------------------+
-- lower
SELECT lower('Feathers');
+---------------+
|lower(Feathers)|
+---------------+
|       feathers|
+---------------+
-- lpad
SELECT lpad('hi', 5, '??');
+---------------+
|lpad(hi, 5, ??)|
+---------------+
|          ???hi|
+---------------+
SELECT lpad('hi', 1, '??');
+---------------+
|lpad(hi, 1, ??)|
+---------------+
|              h|
+---------------+
SELECT lpad('hi', 5);
+--------------+
|lpad(hi, 5,  )|
+--------------+
|            hi|
+--------------+
SELECT hex(lpad(unhex('aabb'), 5));
+--------------------------------+
|hex(lpad(unhex(aabb), 5, X'00'))|
+--------------------------------+
|                      000000AABB|
+--------------------------------+
SELECT hex(lpad(unhex('aabb'), 5, unhex('1122')));
+--------------------------------------+
|hex(lpad(unhex(aabb), 5, unhex(1122)))|
+--------------------------------------+
|                            112211AABB|
+--------------------------------------+
-- ltrim
SELECT ltrim('    Feathers   ');
+----------------------+
|ltrim(    Feathers   )|
+----------------------+
|           Feathers   |
+----------------------+
-- luhn_check
SELECT luhn_check('8112189876');
+----------------------+
|luhn_check(8112189876)|
+----------------------+
|                  true|
+----------------------+
SELECT luhn_check('79927398713');
+-----------------------+
|luhn_check(79927398713)|
+-----------------------+
|                   true|
+-----------------------+
SELECT luhn_check('79927398714');
+-----------------------+
|luhn_check(79927398714)|
+-----------------------+
|                  false|
+-----------------------+
-- mask
SELECT mask('abcd-EFGH-8765-4321');
+----------------------------------------+
|mask(abcd-EFGH-8765-4321, X, x, n, NULL)|
+----------------------------------------+
|                     xxxx-XXXX-nnnn-nnnn|
+----------------------------------------+
SELECT mask('abcd-EFGH-8765-4321', 'Q');
+----------------------------------------+
|mask(abcd-EFGH-8765-4321, Q, x, n, NULL)|
+----------------------------------------+
|                     xxxx-QQQQ-nnnn-nnnn|
+----------------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, n, NULL)|
+--------------------------------+
|                     QqQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#');
+--------------------------------+
|mask(AbCD123-@$#, X, x, n, NULL)|
+--------------------------------+
|                     XxXXnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q');
+--------------------------------+
|mask(AbCD123-@$#, Q, x, n, NULL)|
+--------------------------------+
|                     QxQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, n, NULL)|
+--------------------------------+
|                     QqQQnnn-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q', 'd');
+--------------------------------+
|mask(AbCD123-@$#, Q, q, d, NULL)|
+--------------------------------+
|                     QqQQddd-@$#|
+--------------------------------+
SELECT mask('AbCD123-@$#', 'Q', 'q', 'd', 'o');
+-----------------------------+
|mask(AbCD123-@$#, Q, q, d, o)|
+-----------------------------+
|                  QqQQdddoooo|
+-----------------------------+
SELECT mask('AbCD123-@$#', NULL, 'q', 'd', 'o');
+--------------------------------+
|mask(AbCD123-@$#, NULL, q, d, o)|
+--------------------------------+
|                     AqCDdddoooo|
+--------------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, 'd', 'o');
+-----------------------------------+
|mask(AbCD123-@$#, NULL, NULL, d, o)|
+-----------------------------------+
|                        AbCDdddoooo|
+-----------------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, NULL, 'o');
+--------------------------------------+
|mask(AbCD123-@$#, NULL, NULL, NULL, o)|
+--------------------------------------+
|                           AbCD123oooo|
+--------------------------------------+
SELECT mask(NULL, NULL, NULL, NULL, 'o');
+-------------------------------+
|mask(NULL, NULL, NULL, NULL, o)|
+-------------------------------+
|                           NULL|
+-------------------------------+
SELECT mask(NULL);
+-------------------------+
|mask(NULL, X, x, n, NULL)|
+-------------------------+
|                     NULL|
+-------------------------+
SELECT mask('AbCD123-@$#', NULL, NULL, NULL, NULL);
+-----------------------------------------+
|mask(AbCD123-@$#, NULL, NULL, NULL, NULL)|
+-----------------------------------------+
|                              AbCD123-@$#|
+-----------------------------------------+
-- octet_length
SELECT octet_length('Feathers');
+----------------------+
|octet_length(Feathers)|
+----------------------+
|                     8|
+----------------------+
SELECT octet_length(x'537061726b2053514c');
+-----------------------------------+
|octet_length(X'537061726B2053514C')|
+-----------------------------------+
|                                  9|
+-----------------------------------+
-- overlay
SELECT overlay('Feathers' PLACING '_' FROM 6);
+---------------------------+
|overlay(Feathers, _, 6, -1)|
+---------------------------+
|                   Feath_rs|
+---------------------------+
SELECT overlay('Feathers' PLACING 'ures' FROM 5);
+------------------------------+
|overlay(Feathers, ures, 5, -1)|
+------------------------------+
|                      Features|
+------------------------------+
-- position
SELECT position('bar', 'foobarbar');
+---------------------------+
|position(bar, foobarbar, 1)|
+---------------------------+
|                          4|
+---------------------------+
SELECT position('bar', 'foobarbar', 5);
+---------------------------+
|position(bar, foobarbar, 5)|
+---------------------------+
|                          7|
+---------------------------+
SELECT POSITION('bar' IN 'foobarbar');
+-------------------------+
|locate(bar, foobarbar, 1)|
+-------------------------+
|                        4|
+-------------------------+
-- printf
SELECT printf("Hello World %d %s", 100, "days");
+------------------------------------+
|printf(Hello World %d %s, 100, days)|
+------------------------------------+
|                Hello World 100 days|
+------------------------------------+
-- regexp_count
SELECT regexp_count('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
+------------------------------------------------------------------------------+
|regexp_count(Steven Jones and Stephen Smith are the best players, Ste(v|ph)en)|
+------------------------------------------------------------------------------+
|                                                                             2|
+------------------------------------------------------------------------------+
SELECT regexp_count('abcdefghijklmnopqrstuvwxyz', '[a-z]{3}');
+--------------------------------------------------+
|regexp_count(abcdefghijklmnopqrstuvwxyz, [a-z]{3})|
+--------------------------------------------------+
|                                                 8|
+--------------------------------------------------+
-- regexp_extract
SELECT regexp_extract('100-200', '(\\d+)-(\\d+)', 1);
+---------------------------------------+
|regexp_extract(100-200, (\d+)-(\d+), 1)|
+---------------------------------------+
|                                    100|
+---------------------------------------+
-- regexp_extract_all
SELECT regexp_extract_all('100-200, 300-400', '(\\d+)-(\\d+)', 1);
+----------------------------------------------------+
|regexp_extract_all(100-200, 300-400, (\d+)-(\d+), 1)|
+----------------------------------------------------+
|                                          [100, 300]|
+----------------------------------------------------+
-- regexp_instr
SELECT regexp_instr('user@opensearch.org', '@[^.]*');
+--------------------------------------------+
|regexp_instr(user@opensearch.org, @[^.]*, 0)|
+--------------------------------------------+
|                                           5|
+--------------------------------------------+
-- regexp_replace
SELECT regexp_replace('100-200', '(\\d+)', 'num');
+--------------------------------------+
|regexp_replace(100-200, (\d+), num, 1)|
+--------------------------------------+
|                               num-num|
+--------------------------------------+
-- regexp_substr
SELECT regexp_substr('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
+-------------------------------------------------------------------------------+
|regexp_substr(Steven Jones and Stephen Smith are the best players, Ste(v|ph)en)|
+-------------------------------------------------------------------------------+
|                                                                         Steven|
+-------------------------------------------------------------------------------+
SELECT regexp_substr('Steven Jones and Stephen Smith are the best players', 'Jeck');
+------------------------------------------------------------------------+
|regexp_substr(Steven Jones and Stephen Smith are the best players, Jeck)|
+------------------------------------------------------------------------+
|                                                                    NULL|
+------------------------------------------------------------------------+
-- repeat
SELECT repeat('123', 2);
+--------------+
|repeat(123, 2)|
+--------------+
|        123123|
+--------------+
-- replace
SELECT replace('ABCabc', 'abc', 'DEF');
+-------------------------+
|replace(ABCabc, abc, DEF)|
+-------------------------+
|                   ABCDEF|
+-------------------------+
-- right
SELECT right('Feathers', 3);
+-------------------+
|right(Feathers, 3)|
+-------------------+
|                ers|
+-------------------+
-- rpad
SELECT rpad('hi', 5, '??');
+---------------+
|rpad(hi, 5, ??)|
+---------------+
|          hi???|
+---------------+
SELECT rpad('hi', 1, '??');
+---------------+
|rpad(hi, 1, ??)|
+---------------+
|              h|
+---------------+
SELECT rpad('hi', 5);
+--------------+
|rpad(hi, 5,  )|
+--------------+
|         hi   |
+--------------+
SELECT hex(rpad(unhex('aabb'), 5));
+--------------------------------+
|hex(rpad(unhex(aabb), 5, X'00'))|
+--------------------------------+
|                      AABB000000|
+--------------------------------+
SELECT hex(rpad(unhex('aabb'), 5, unhex('1122')));
+--------------------------------------+
|hex(rpad(unhex(aabb), 5, unhex(1122)))|
+--------------------------------------+
|                            AABB112211|
+--------------------------------------+
-- rtrim
SELECT rtrim('    Feathers   ');
+----------------------+
|rtrim(    Feathers   )|
+----------------------+
|              Feathers|
+----------------------+
-- sentences
SELECT sentences('Hi there! Good morning.');
+--------------------------------------+
|sentences(Hi there! Good morning., , )|
+--------------------------------------+
|                  [[Hi, there], [Go...|
+--------------------------------------+
-- soundex
SELECT soundex('Miller');
+---------------+
|soundex(Miller)|
+---------------+
|           M460|
+---------------+
-- space
SELECT concat(space(2), '1');
+-------------------+
|concat(space(2), 1)|
+-------------------+
|                  1|
+-------------------+
-- split
SELECT split('oneAtwoBthreeC', '[ABC]');
+--------------------------------+
|split(oneAtwoBthreeC, [ABC], -1)|
+--------------------------------+
|             [one, two, three, ]|
+--------------------------------+
SELECT split('oneAtwoBthreeC', '[ABC]', -1);
+--------------------------------+
|split(oneAtwoBthreeC, [ABC], -1)|
+--------------------------------+
|             [one, two, three, ]|
+--------------------------------+
SELECT split('oneAtwoBthreeC', '[ABC]', 2);
+-------------------------------+
|split(oneAtwoBthreeC, [ABC], 2)|
+-------------------------------+
|              [one, twoBthreeC]|
+-------------------------------+
-- split_part
SELECT split_part('11.12.13', '.', 3);
+--------------------------+
|split_part(11.12.13, ., 3)|
+--------------------------+
|                        13|
+--------------------------+
-- startswith
SELECT startswith('Feathers', 'Fea');
+-------------------------+
|startswith(Feathers, Fea)|
+-------------------------+
|                     true|
+-------------------------+
SELECT startswith('Feathers', 'SQL');
+-------------------------+
|startswith(Feathers, SQL)|
+-------------------------+
|                    false|
+-------------------------+
SELECT startswith('Feathers', null);
+--------------------------+
|startswith(Feathers, NULL)|
+--------------------------+
|                      NULL|
+--------------------------+
SELECT startswith(x'537061726b2053514c', x'537061726b');
+------------------------------------------------+
|startswith(X'537061726B2053514C', X'537061726B')|
+------------------------------------------------+
|                                            true|
+------------------------------------------------+
SELECT startswith(x'537061726b2053514c', x'53514c');
+--------------------------------------------+
|startswith(X'537061726B2053514C', X'53514C')|
+--------------------------------------------+
|                                       false|
+--------------------------------------------+
-- substr
SELECT substr('Feathers', 5);
+-------------------------------+
|substr(Feathers, 5, 2147483647)|
+-------------------------------+
|                           hers|
+-------------------------------+
SELECT substr('Feathers', -3);
+--------------------------------+
|substr(Feathers, -3, 2147483647)|
+--------------------------------+
|                             ers|
+--------------------------------+
SELECT substr('Feathers', 5, 1);
+----------------------+
|substr(Feathers, 5, 1)|
+----------------------+
|                     h|
+----------------------+
SELECT substr('Feathers' FROM 5);
+----------------------------------+
|substring(Feathers, 5, 2147483647)|
+----------------------------------+
|                              hers|
+----------------------------------+
SELECT substr('Feathers' FROM -3);
+-----------------------------------+
|substring(Feathers, -3, 2147483647)|
+-----------------------------------+
|                                ers|
+-----------------------------------+
SELECT substr('Feathers' FROM 5 FOR 1);
+-------------------------+
|substring(Feathers, 5, 1)|
+-------------------------+
|                        h|
+-------------------------+
-- substring
SELECT substring('Feathers', 5);
+----------------------------------+
|substring(Feathers, 5, 2147483647)|
+----------------------------------+
|                              hers|
+----------------------------------+
SELECT substring('Feathers', -3);
+-----------------------------------+
|substring(Feathers, -3, 2147483647)|
+-----------------------------------+
|                                ers|
+-----------------------------------+
SELECT substring('Feathers', 5, 1);
+-------------------------+
|substring(Feathers, 5, 1)|
+-------------------------+
|                        h|
+-------------------------+
SELECT substring('Feathers' FROM 5);
+----------------------------------+
|substring(Feathers, 5, 2147483647)|
+----------------------------------+
|                              hers|
+----------------------------------+
SELECT substring('Feathers' FROM -3);
+-----------------------------------+
|substring(Feathers, -3, 2147483647)|
+-----------------------------------+
|                                ers|
+-----------------------------------+
SELECT substring('Feathers' FROM 5 FOR 1);
+-------------------------+
|substring(Feathers, 5, 1)|
+-------------------------+
|                        h|
+-------------------------+
-- substring_index
SELECT substring_index('www.apache.org', '.', 2);
+-------------------------------------+
|substring_index(www.apache.org, ., 2)|
+-------------------------------------+
|                           www.apache|
+-------------------------------------+
-- to_binary
SELECT to_binary('abc', 'utf-8');
+---------------------+
|to_binary(abc, utf-8)|
+---------------------+
|           [61 62 63]|
+---------------------+
-- to_char
SELECT to_char(454, '999');
+-----------------+
|to_char(454, 999)|
+-----------------+
|              454|
+-----------------+
SELECT to_char(454.00, '000D00');
+-----------------------+
|to_char(454.00, 000D00)|
+-----------------------+
|                 454.00|
+-----------------------+
SELECT to_char(12454, '99G999');
+----------------------+
|to_char(12454, 99G999)|
+----------------------+
|                12,454|
+----------------------+
SELECT to_char(78.12, '$99.99');
+----------------------+
|to_char(78.12, $99.99)|
+----------------------+
|                $78.12|
+----------------------+
SELECT to_char(-12454.8, '99G999D9S');
+----------------------------+
|to_char(-12454.8, 99G999D9S)|
+----------------------------+
|                   12,454.8-|
+----------------------------+
-- to_number
SELECT to_number('454', '999');
+-------------------+
|to_number(454, 999)|
+-------------------+
|                454|
+-------------------+
SELECT to_number('454.00', '000.00');
+-------------------------+
|to_number(454.00, 000.00)|
+-------------------------+
|                   454.00|
+-------------------------+
SELECT to_number('12,454', '99,999');
+-------------------------+
|to_number(12,454, 99,999)|
+-------------------------+
|                    12454|
+-------------------------+
SELECT to_number('$78.12', '$99.99');
+-------------------------+
|to_number($78.12, $99.99)|
+-------------------------+
|                    78.12|
+-------------------------+
SELECT to_number('12,454.8-', '99,999.9S');
+-------------------------------+
|to_number(12,454.8-, 99,999.9S)|
+-------------------------------+
|                       -12454.8|
+-------------------------------+
-- to_varchar
SELECT to_varchar(454, '999');
+-----------------+
|to_char(454, 999)|
+-----------------+
|              454|
+-----------------+
SELECT to_varchar(454.00, '000D00');
+-----------------------+
|to_char(454.00, 000D00)|
+-----------------------+
|                 454.00|
+-----------------------+
SELECT to_varchar(12454, '99G999');
+----------------------+
|to_char(12454, 99G999)|
+----------------------+
|                12,454|
+----------------------+
SELECT to_varchar(78.12, '$99.99');
+----------------------+
|to_char(78.12, $99.99)|
+----------------------+
|                $78.12|
+----------------------+
SELECT to_varchar(-12454.8, '99G999D9S');
+----------------------------+
|to_char(-12454.8, 99G999D9S)|
+----------------------------+
|                   12,454.8-|
+----------------------------+
-- translate
SELECT translate('AaBbCc', 'abc', '123');
+---------------------------+
|translate(AaBbCc, abc, 123)|
+---------------------------+
|                     A1B2C3|
+---------------------------+
-- try_to_binary
SELECT try_to_binary('abc', 'utf-8');
+-------------------------+
|try_to_binary(abc, utf-8)|
+-------------------------+
|               [61 62 63]|
+-------------------------+
select try_to_binary('a!', 'base64');
+-------------------------+
|try_to_binary(a!, base64)|
+-------------------------+
|                     NULL|
+-------------------------+
select try_to_binary('abc', 'invalidFormat');
+---------------------------------+
|try_to_binary(abc, invalidFormat)|
+---------------------------------+
|                             NULL|
+---------------------------------+
-- try_to_number
SELECT try_to_number('454', '999');
+-----------------------+
|try_to_number(454, 999)|
+-----------------------+
|                    454|
+-----------------------+
SELECT try_to_number('454.00', '000.00');
+-----------------------------+
|try_to_number(454.00, 000.00)|
+-----------------------------+
|                       454.00|
+-----------------------------+
SELECT try_to_number('12,454', '99,999');
+-----------------------------+
|try_to_number(12,454, 99,999)|
+-----------------------------+
|                        12454|
+-----------------------------+
SELECT try_to_number('$78.12', '$99.99');
+-----------------------------+
|try_to_number($78.12, $99.99)|
+-----------------------------+
|                        78.12|
+-----------------------------+
SELECT try_to_number('12,454.8-', '99,999.9S');
+-----------------------------------+
|try_to_number(12,454.8-, 99,999.9S)|
+-----------------------------------+
|                           -12454.8|
+-----------------------------------+
-- ucase
SELECT ucase('Feathers');
+---------------+
|ucase(Feathers)|
+---------------+
|       FEATHERS|
+---------------+
-- unbase64
SELECT unbase64('U3BhcmsgU1FM');
+----------------------+
|unbase64(U3BhcmsgU1FM)|
+----------------------+
|  [53 70 61 72 6B 2...|
+----------------------+
-- upper
SELECT upper('Feathers');
+---------------+
|upper(Feathers)|
+---------------+
|       FEATHERS|
+---------------+
```

#### Date and time functions
<a name="supported-sql-date-time"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| add\_months(start\_date, num\_months) | Returns the date that is num\_months after start\_date. | 
| convert\_timezone([sourceTz, ]targetTz, sourceTs) | Converts the timestamp without time zone sourceTs from the sourceTz time zone to targetTz. | 
| curdate() | Returns the current date at the start of query evaluation. All calls of curdate within the same query return the same value. | 
| current\_date() | Returns the current date at the start of query evaluation. All calls of current\_date within the same query return the same value. | 
| current\_date | Returns the current date at the start of query evaluation. | 
| current\_timestamp() | Returns the current timestamp at the start of query evaluation. All calls of current\_timestamp within the same query return the same value. | 
| current\_timestamp | Returns the current timestamp at the start of query evaluation. | 
| current\_timezone() | Returns the current session local timezone. | 
| date\_add(start\_date, num\_days) | Returns the date that is num\_days after start\_date. | 
| date\_diff(endDate, startDate) | Returns the number of days from startDate to endDate. | 
| date\_format(timestamp, fmt) | Converts timestamp to a value of string in the format specified by the date format fmt. | 
| date\_from\_unix\_date(days) | Creates a date from the number of days since 1970-01-01. | 
| date\_part(field, source) | Extracts a part of the date/timestamp or interval source. | 
| date\_sub(start\_date, num\_days) | Returns the date that is num\_days before start\_date. | 
| date\_trunc(fmt, ts) | Returns timestamp ts truncated to the unit specified by the format model fmt. | 
| dateadd(start\_date, num\_days) | Returns the date that is num\_days after start\_date. | 
| datediff(endDate, startDate) | Returns the number of days from startDate to endDate. | 
| datepart(field, source) | Extracts a part of the date/timestamp or interval source. | 
| day(date) | Returns the day of month of the date/timestamp. | 
| dayofmonth(date) | Returns the day of month of the date/timestamp. | 
| dayofweek(date) | Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday). | 
| dayofyear(date) | Returns the day of year of the date/timestamp. | 
| extract(field FROM source) | Extracts a part of the date/timestamp or interval source. | 
| from\_unixtime(unix\_time[, fmt]) | Returns unix\_time in the specified fmt. | 
| from\_utc\_timestamp(timestamp, timezone) | Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'. | 
| hour(timestamp) | Returns the hour component of the string/timestamp. | 
| last\_day(date) | Returns the last day of the month which the date belongs to. | 
| localtimestamp() | Returns the current timestamp without time zone at the start of query evaluation. All calls of localtimestamp within the same query return the same value. | 
| localtimestamp | Returns the current local date-time at the session time zone at the start of query evaluation. | 
| make\_date(year, month, day) | Creates a date from year, month and day fields.  | 
| make\_dt\_interval([days[, hours[, mins[, secs]]]]) | Makes a DayTimeIntervalType duration from days, hours, mins and secs. | 
| make\_interval([years[, months[, weeks[, days[, hours[, mins[, secs]]]]]]]) | Makes an interval from years, months, weeks, days, hours, mins and secs. | 
| make\_timestamp(year, month, day, hour, min, sec[, timezone]) | Creates a timestamp from year, month, day, hour, min, sec and timezone fields.  | 
| make\_timestamp\_ltz(year, month, day, hour, min, sec[, timezone]) | Creates the current timestamp with local time zone from year, month, day, hour, min, sec and timezone fields. | 
| make\_timestamp\_ntz(year, month, day, hour, min, sec) | Creates a local date-time from year, month, day, hour, min, sec fields.  | 
| make\_ym\_interval([years[, months]]) | Makes a year-month interval from years, months. | 
| minute(timestamp) | Returns the minute component of the string/timestamp. | 
| month(date) | Returns the month component of the date/timestamp. | 
| months\_between(timestamp1, timestamp2[, roundOff]) | If timestamp1 is later than timestamp2, then the result is positive. If timestamp1 and timestamp2 are on the same day of month, or both are the last day of month, time of day will be ignored. Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits unless roundOff=false. | 
| next\_day(start\_date, day\_of\_week) | Returns the first date which is later than start\_date and named as indicated. The function returns NULL if at least one of the input parameters is NULL.  | 
| now() | Returns the current timestamp at the start of query evaluation. | 
| quarter(date) | Returns the quarter of the year for date, in the range 1 to 4. | 
| second(timestamp) | Returns the second component of the string/timestamp. | 
| session\_window(time\_column, gap\_duration) | Generates a session window given a timestamp specifying column and gap duration. See 'Types of time windows' in the Structured Streaming guide doc for detailed explanation and examples. | 
| timestamp\_micros(microseconds) | Creates a timestamp from the number of microseconds since UTC epoch. | 
| timestamp\_millis(milliseconds) | Creates a timestamp from the number of milliseconds since UTC epoch. | 
| timestamp\_seconds(seconds) | Creates a timestamp from the number of seconds (can be fractional) since UTC epoch. | 
| to\_date(date\_str[, fmt]) | Parses the date\_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted. | 
| to\_timestamp(timestamp\_str[, fmt]) | Parses the timestamp\_str expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted.  | 
| to\_timestamp\_ltz(timestamp\_str[, fmt]) | Parses the timestamp\_str expression with the fmt expression to a timestamp with local time zone. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted. | 
| to\_timestamp\_ntz(timestamp\_str[, fmt]) | Parses the timestamp\_str expression with the fmt expression to a timestamp without time zone. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted. | 
| to\_unix\_timestamp(timeExp[, fmt]) | Returns the UNIX timestamp of the given time. | 
| to\_utc\_timestamp(timestamp, timezone) | Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'. | 
| trunc(date, fmt) | Returns date with the time portion of the day truncated to the unit specified by the format model fmt. | 
| try\_to\_timestamp(timestamp\_str[, fmt]) | Parses the timestamp\_str expression with the fmt expression to a timestamp.  | 
| unix\_date(date) | Returns the number of days since 1970-01-01. | 
| unix\_micros(timestamp) | Returns the number of microseconds since 1970-01-01 00:00:00 UTC. | 
| unix\_millis(timestamp) | Returns the number of milliseconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision. | 
| unix\_seconds(timestamp) | Returns the number of seconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision. | 
| unix\_timestamp([timeExp[, fmt]]) | Returns the UNIX timestamp of current or specified time. | 
| weekday(date) | Returns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday). | 
| weekofyear(date) | Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days. | 
| window(time\_column, window\_duration[, slide\_duration[, start\_time]]) | Bucketizes rows into one or more time windows given a timestamp specifying column. Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in the order of months are not supported. See 'Window Operations on Event Time' in the Structured Streaming guide doc for detailed explanation and examples. | 
| window\_time(window\_column) | Extracts the time value from a time/session window column, which can be used for the event time value of the window. The extracted time is (window.end - 1), which reflects the fact that aggregating windows have an exclusive upper bound: [start, end). See 'Window Operations on Event Time' in the Structured Streaming guide doc for detailed explanation and examples. | 
| year(date) | Returns the year component of the date/timestamp. | 
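
Note that the table lists two weekday-numbering functions with different conventions: dayofweek counts 1 (Sunday) through 7 (Saturday), while weekday counts 0 (Monday) through 6 (Sunday). As an illustrative sketch (results inferred from the descriptions above; 2009-07-30 was a Thursday):

```
-- dayofweek('2009-07-30') returns 5 (Thursday, with 1 = Sunday)
-- weekday('2009-07-30')   returns 3 (Thursday, with 0 = Monday)
SELECT dayofweek('2009-07-30'), weekday('2009-07-30');
```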

**Examples**

```
-- add_months
SELECT add_months('2016-08-31', 1);
+-------------------------+
|add_months(2016-08-31, 1)|
+-------------------------+
|               2016-09-30|
+-------------------------+
-- convert_timezone
SELECT convert_timezone('Europe/Brussels', 'America/Los_Angeles', timestamp_ntz'2021-12-06 00:00:00');
+-------------------------------------------------------------------------------------------+
|convert_timezone(Europe/Brussels, America/Los_Angeles, TIMESTAMP_NTZ '2021-12-06 00:00:00')|
+-------------------------------------------------------------------------------------------+
|                                                                        2021-12-05 15:00:00|
+-------------------------------------------------------------------------------------------+
SELECT convert_timezone('Europe/Brussels', timestamp_ntz'2021-12-05 15:00:00');
+------------------------------------------------------------------------------------------+
|convert_timezone(current_timezone(), Europe/Brussels, TIMESTAMP_NTZ '2021-12-05 15:00:00')|
+------------------------------------------------------------------------------------------+
|                                                                       2021-12-05 07:00:00|
+------------------------------------------------------------------------------------------+
-- curdate
SELECT curdate();
+--------------+
|current_date()|
+--------------+
|    2024-02-24|
+--------------+
-- current_date
SELECT current_date();
+--------------+
|current_date()|
+--------------+
|    2024-02-24|
+--------------+
SELECT current_date;
+--------------+
|current_date()|
+--------------+
|    2024-02-24|
+--------------+
-- current_timestamp
SELECT current_timestamp();
+--------------------+
| current_timestamp()|
+--------------------+
|2024-02-24 16:36:...|
+--------------------+
SELECT current_timestamp;
+--------------------+
| current_timestamp()|
+--------------------+
|2024-02-24 16:36:...|
+--------------------+
-- current_timezone
SELECT current_timezone();
+------------------+
|current_timezone()|
+------------------+
|        Asia/Seoul|
+------------------+
-- date_add
SELECT date_add('2016-07-30', 1);
+-----------------------+
|date_add(2016-07-30, 1)|
+-----------------------+
|             2016-07-31|
+-----------------------+
-- date_diff
SELECT date_diff('2009-07-31', '2009-07-30');
+---------------------------------+
|date_diff(2009-07-31, 2009-07-30)|
+---------------------------------+
|                                1|
+---------------------------------+
SELECT date_diff('2009-07-30', '2009-07-31');
+---------------------------------+
|date_diff(2009-07-30, 2009-07-31)|
+---------------------------------+
|                               -1|
+---------------------------------+
-- date_format
SELECT date_format('2016-04-08', 'y');
+--------------------------+
|date_format(2016-04-08, y)|
+--------------------------+
|                      2016|
+--------------------------+
-- date_from_unix_date
SELECT date_from_unix_date(1);
+----------------------+
|date_from_unix_date(1)|
+----------------------+
|            1970-01-02|
+----------------------+
-- date_part
SELECT date_part('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+-------------------------------------------------------+
|date_part(YEAR, TIMESTAMP '2019-08-12 01:00:00.123456')|
+-------------------------------------------------------+
|                                                   2019|
+-------------------------------------------------------+
SELECT date_part('week', timestamp'2019-08-12 01:00:00.123456');
+-------------------------------------------------------+
|date_part(week, TIMESTAMP '2019-08-12 01:00:00.123456')|
+-------------------------------------------------------+
|                                                     33|
+-------------------------------------------------------+
SELECT date_part('doy', DATE'2019-08-12');
+---------------------------------+
|date_part(doy, DATE '2019-08-12')|
+---------------------------------+
|                              224|
+---------------------------------+
SELECT date_part('SECONDS', timestamp'2019-10-01 00:00:01.000001');
+----------------------------------------------------------+
|date_part(SECONDS, TIMESTAMP '2019-10-01 00:00:01.000001')|
+----------------------------------------------------------+
|                                                  1.000001|
+----------------------------------------------------------+
SELECT date_part('days', interval 5 days 3 hours 7 minutes);
+-------------------------------------------------+
|date_part(days, INTERVAL '5 03:07' DAY TO MINUTE)|
+-------------------------------------------------+
|                                                5|
+-------------------------------------------------+
SELECT date_part('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+-------------------------------------------------------------+
|date_part(seconds, INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+-------------------------------------------------------------+
|                                                    30.001001|
+-------------------------------------------------------------+
SELECT date_part('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
+--------------------------------------------------+
|date_part(MONTH, INTERVAL '2021-11' YEAR TO MONTH)|
+--------------------------------------------------+
|                                                11|
+--------------------------------------------------+
SELECT date_part('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+---------------------------------------------------------------+
|date_part(MINUTE, INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+---------------------------------------------------------------+
|                                                             55|
+---------------------------------------------------------------+
-- date_sub
SELECT date_sub('2016-07-30', 1);
+-----------------------+
|date_sub(2016-07-30, 1)|
+-----------------------+
|             2016-07-29|
+-----------------------+
-- date_trunc
SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359');
+-----------------------------------------+
|date_trunc(YEAR, 2015-03-05T09:32:05.359)|
+-----------------------------------------+
|                      2015-01-01 00:00:00|
+-----------------------------------------+
SELECT date_trunc('MM', '2015-03-05T09:32:05.359');
+---------------------------------------+
|date_trunc(MM, 2015-03-05T09:32:05.359)|
+---------------------------------------+
|                    2015-03-01 00:00:00|
+---------------------------------------+
SELECT date_trunc('DD', '2015-03-05T09:32:05.359');
+---------------------------------------+
|date_trunc(DD, 2015-03-05T09:32:05.359)|
+---------------------------------------+
|                    2015-03-05 00:00:00|
+---------------------------------------+
SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');
+-----------------------------------------+
|date_trunc(HOUR, 2015-03-05T09:32:05.359)|
+-----------------------------------------+
|                      2015-03-05 09:00:00|
+-----------------------------------------+
SELECT date_trunc('MILLISECOND', '2015-03-05T09:32:05.123456');
+---------------------------------------------------+
|date_trunc(MILLISECOND, 2015-03-05T09:32:05.123456)|
+---------------------------------------------------+
|                               2015-03-05 09:32:...|
+---------------------------------------------------+
-- dateadd
SELECT dateadd('2016-07-30', 1);
+-----------------------+
|date_add(2016-07-30, 1)|
+-----------------------+
|             2016-07-31|
+-----------------------+
-- datediff
SELECT datediff('2009-07-31', '2009-07-30');
+--------------------------------+
|datediff(2009-07-31, 2009-07-30)|
+--------------------------------+
|                               1|
+--------------------------------+
SELECT datediff('2009-07-30', '2009-07-31');
+--------------------------------+
|datediff(2009-07-30, 2009-07-31)|
+--------------------------------+
|                              -1|
+--------------------------------+
-- datepart
SELECT datepart('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+----------------------------------------------------------+
|datepart(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+----------------------------------------------------------+
|                                                      2019|
+----------------------------------------------------------+
SELECT datepart('week', timestamp'2019-08-12 01:00:00.123456');
+----------------------------------------------------------+
|datepart(week FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+----------------------------------------------------------+
|                                                        33|
+----------------------------------------------------------+
SELECT datepart('doy', DATE'2019-08-12');
+------------------------------------+
|datepart(doy FROM DATE '2019-08-12')|
+------------------------------------+
|                                 224|
+------------------------------------+
SELECT datepart('SECONDS', timestamp'2019-10-01 00:00:01.000001');
+-------------------------------------------------------------+
|datepart(SECONDS FROM TIMESTAMP '2019-10-01 00:00:01.000001')|
+-------------------------------------------------------------+
|                                                     1.000001|
+-------------------------------------------------------------+
SELECT datepart('days', interval 5 days 3 hours 7 minutes);
+----------------------------------------------------+
|datepart(days FROM INTERVAL '5 03:07' DAY TO MINUTE)|
+----------------------------------------------------+
|                                                   5|
+----------------------------------------------------+
SELECT datepart('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+----------------------------------------------------------------+
|datepart(seconds FROM INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+----------------------------------------------------------------+
|                                                       30.001001|
+----------------------------------------------------------------+
SELECT datepart('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
+-----------------------------------------------------+
|datepart(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH)|
+-----------------------------------------------------+
|                                                   11|
+-----------------------------------------------------+
SELECT datepart('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+------------------------------------------------------------------+
|datepart(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+------------------------------------------------------------------+
|                                                                55|
+------------------------------------------------------------------+
-- day
SELECT day('2009-07-30');
+---------------+
|day(2009-07-30)|
+---------------+
|             30|
+---------------+
-- dayofmonth
SELECT dayofmonth('2009-07-30');
+----------------------+
|dayofmonth(2009-07-30)|
+----------------------+
|                    30|
+----------------------+
-- dayofweek
SELECT dayofweek('2009-07-30');
+---------------------+
|dayofweek(2009-07-30)|
+---------------------+
|                    5|
+---------------------+
-- dayofyear
SELECT dayofyear('2016-04-09');
+---------------------+
|dayofyear(2016-04-09)|
+---------------------+
|                  100|
+---------------------+
-- extract
SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
|                                                     2019|
+---------------------------------------------------------+
SELECT extract(week FROM timestamp'2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|extract(week FROM TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
|                                                       33|
+---------------------------------------------------------+
SELECT extract(doy FROM DATE'2019-08-12');
+-----------------------------------+
|extract(doy FROM DATE '2019-08-12')|
+-----------------------------------+
|                                224|
+-----------------------------------+
SELECT extract(SECONDS FROM timestamp'2019-10-01 00:00:01.000001');
+------------------------------------------------------------+
|extract(SECONDS FROM TIMESTAMP '2019-10-01 00:00:01.000001')|
+------------------------------------------------------------+
|                                                    1.000001|
+------------------------------------------------------------+
SELECT extract(days FROM interval 5 days 3 hours 7 minutes);
+---------------------------------------------------+
|extract(days FROM INTERVAL '5 03:07' DAY TO MINUTE)|
+---------------------------------------------------+
|                                                  5|
+---------------------------------------------------+
SELECT extract(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+---------------------------------------------------------------+
|extract(seconds FROM INTERVAL '05:00:30.001001' HOUR TO SECOND)|
+---------------------------------------------------------------+
|                                                      30.001001|
+---------------------------------------------------------------+
SELECT extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH);
+----------------------------------------------------+
|extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH)|
+----------------------------------------------------+
|                                                  11|
+----------------------------------------------------+
SELECT extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND);
+-----------------------------------------------------------------+
|extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND)|
+-----------------------------------------------------------------+
|                                                               55|
+-----------------------------------------------------------------+
-- from_unixtime
SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
+-------------------------------------+
|from_unixtime(0, yyyy-MM-dd HH:mm:ss)|
+-------------------------------------+
|                  1970-01-01 09:00:00|
+-------------------------------------+
SELECT from_unixtime(0);
+-------------------------------------+
|from_unixtime(0, yyyy-MM-dd HH:mm:ss)|
+-------------------------------------+
|                  1970-01-01 09:00:00|
+-------------------------------------+
-- from_utc_timestamp
SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
+------------------------------------------+
|from_utc_timestamp(2016-08-31, Asia/Seoul)|
+------------------------------------------+
|                       2016-08-31 09:00:00|
+------------------------------------------+
-- hour
SELECT hour('2009-07-30 12:58:59');
+-------------------------+
|hour(2009-07-30 12:58:59)|
+-------------------------+
|                       12|
+-------------------------+
-- last_day
SELECT last_day('2009-01-12');
+--------------------+
|last_day(2009-01-12)|
+--------------------+
|          2009-01-31|
+--------------------+
-- localtimestamp
SELECT localtimestamp();
+--------------------+
|    localtimestamp()|
+--------------------+
|2024-02-24 16:36:...|
+--------------------+
-- make_date
SELECT make_date(2013, 7, 15);
+----------------------+
|make_date(2013, 7, 15)|
+----------------------+
|            2013-07-15|
+----------------------+
SELECT make_date(2019, 7, NULL);
+------------------------+
|make_date(2019, 7, NULL)|
+------------------------+
|                    NULL|
+------------------------+
-- make_dt_interval
SELECT make_dt_interval(1, 12, 30, 01.001001);
+-------------------------------------+
|make_dt_interval(1, 12, 30, 1.001001)|
+-------------------------------------+
|                 INTERVAL '1 12:30...|
+-------------------------------------+
SELECT make_dt_interval(2);
+-----------------------------------+
|make_dt_interval(2, 0, 0, 0.000000)|
+-----------------------------------+
|               INTERVAL '2 00:00...|
+-----------------------------------+
SELECT make_dt_interval(100, null, 3);
+----------------------------------------+
|make_dt_interval(100, NULL, 3, 0.000000)|
+----------------------------------------+
|                                    NULL|
+----------------------------------------+
-- make_interval
SELECT make_interval(100, 11, 1, 1, 12, 30, 01.001001);
+----------------------------------------------+
|make_interval(100, 11, 1, 1, 12, 30, 1.001001)|
+----------------------------------------------+
|                          100 years 11 mont...|
+----------------------------------------------+
SELECT make_interval(100, null, 3);
+----------------------------------------------+
|make_interval(100, NULL, 3, 0, 0, 0, 0.000000)|
+----------------------------------------------+
|                                          NULL|
+----------------------------------------------+
SELECT make_interval(0, 1, 0, 1, 0, 0, 100.000001);
+-------------------------------------------+
|make_interval(0, 1, 0, 1, 0, 0, 100.000001)|
+-------------------------------------------+
|                       1 months 1 days 1...|
+-------------------------------------------+
-- make_timestamp
SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887);
+-------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, 45.887)|
+-------------------------------------------+
|                       2014-12-28 06:30:...|
+-------------------------------------------+
SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887, 'CET');
+------------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, 45.887, CET)|
+------------------------------------------------+
|                            2014-12-28 14:30:...|
+------------------------------------------------+
SELECT make_timestamp(2019, 6, 30, 23, 59, 60);
+---------------------------------------+
|make_timestamp(2019, 6, 30, 23, 59, 60)|
+---------------------------------------+
|                    2019-07-01 00:00:00|
+---------------------------------------+
SELECT make_timestamp(2019, 6, 30, 23, 59, 1);
+--------------------------------------+
|make_timestamp(2019, 6, 30, 23, 59, 1)|
+--------------------------------------+
|                   2019-06-30 23:59:01|
+--------------------------------------+
SELECT make_timestamp(null, 7, 22, 15, 30, 0);
+--------------------------------------+
|make_timestamp(NULL, 7, 22, 15, 30, 0)|
+--------------------------------------+
|                                  NULL|
+--------------------------------------+
-- make_timestamp_ltz
SELECT make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887);
+-----------------------------------------------+
|make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887)|
+-----------------------------------------------+
|                           2014-12-28 06:30:...|
+-----------------------------------------------+
SELECT make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887, 'CET');
+----------------------------------------------------+
|make_timestamp_ltz(2014, 12, 28, 6, 30, 45.887, CET)|
+----------------------------------------------------+
|                                2014-12-28 14:30:...|
+----------------------------------------------------+
SELECT make_timestamp_ltz(2019, 6, 30, 23, 59, 60);
+-------------------------------------------+
|make_timestamp_ltz(2019, 6, 30, 23, 59, 60)|
+-------------------------------------------+
|                        2019-07-01 00:00:00|
+-------------------------------------------+
SELECT make_timestamp_ltz(null, 7, 22, 15, 30, 0);
+------------------------------------------+
|make_timestamp_ltz(NULL, 7, 22, 15, 30, 0)|
+------------------------------------------+
|                                      NULL|
+------------------------------------------+
-- make_timestamp_ntz
SELECT make_timestamp_ntz(2014, 12, 28, 6, 30, 45.887);
+-----------------------------------------------+
|make_timestamp_ntz(2014, 12, 28, 6, 30, 45.887)|
+-----------------------------------------------+
|                           2014-12-28 06:30:...|
+-----------------------------------------------+
SELECT make_timestamp_ntz(2019, 6, 30, 23, 59, 60);
+-------------------------------------------+
|make_timestamp_ntz(2019, 6, 30, 23, 59, 60)|
+-------------------------------------------+
|                        2019-07-01 00:00:00|
+-------------------------------------------+
SELECT make_timestamp_ntz(null, 7, 22, 15, 30, 0);
+------------------------------------------+
|make_timestamp_ntz(NULL, 7, 22, 15, 30, 0)|
+------------------------------------------+
|                                      NULL|
+------------------------------------------+
-- make_ym_interval
SELECT make_ym_interval(1, 2);
+----------------------+
|make_ym_interval(1, 2)|
+----------------------+
|  INTERVAL '1-2' YE...|
+----------------------+
SELECT make_ym_interval(1, 0);
+----------------------+
|make_ym_interval(1, 0)|
+----------------------+
|  INTERVAL '1-0' YE...|
+----------------------+
SELECT make_ym_interval(-1, 1);
+-----------------------+
|make_ym_interval(-1, 1)|
+-----------------------+
|   INTERVAL '-0-11' ...|
+-----------------------+
SELECT make_ym_interval(2);
+----------------------+
|make_ym_interval(2, 0)|
+----------------------+
|  INTERVAL '2-0' YE...|
+----------------------+
-- minute
SELECT minute('2009-07-30 12:58:59');
+---------------------------+
|minute(2009-07-30 12:58:59)|
+---------------------------+
|                         58|
+---------------------------+
-- month
SELECT month('2016-07-30');
+-----------------+
|month(2016-07-30)|
+-----------------+
|                7|
+-----------------+
-- months_between
SELECT months_between('1997-02-28 10:30:00', '1996-10-30');
+-----------------------------------------------------+
|months_between(1997-02-28 10:30:00, 1996-10-30, true)|
+-----------------------------------------------------+
|                                           3.94959677|
+-----------------------------------------------------+
SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false);
+------------------------------------------------------+
|months_between(1997-02-28 10:30:00, 1996-10-30, false)|
+------------------------------------------------------+
|                                    3.9495967741935485|
+------------------------------------------------------+
-- next_day
SELECT next_day('2015-01-14', 'TU');
+------------------------+
|next_day(2015-01-14, TU)|
+------------------------+
|              2015-01-20|
+------------------------+
-- now
SELECT now();
+--------------------+
|               now()|
+--------------------+
|2024-02-24 16:36:...|
+--------------------+
-- quarter
SELECT quarter('2016-08-31');
+-------------------+
|quarter(2016-08-31)|
+-------------------+
|                  3|
+-------------------+
-- second
SELECT second('2009-07-30 12:58:59');
+---------------------------+
|second(2009-07-30 12:58:59)|
+---------------------------+
|                         59|
+---------------------------+
-- session_window
SELECT a, session_window.start, session_window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:10:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, session_window(b, '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
|  a|              start|                end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:09:30|  2|
| A1|2021-01-01 00:10:00|2021-01-01 00:15:00|  1|
| A2|2021-01-01 00:01:00|2021-01-01 00:06:00|  1|
+---+-------------------+-------------------+---+
SELECT a, session_window.start, session_window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:10:00'), ('A2', '2021-01-01 00:01:00'), ('A2', '2021-01-01 00:04:30') AS tab(a, b) GROUP by a, session_window(b, CASE WHEN a = 'A1' THEN '5 minutes' WHEN a = 'A2' THEN '1 minute' ELSE '10 minutes' END) ORDER BY a, start;
+---+-------------------+-------------------+---+
|  a|              start|                end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:09:30|  2|
| A1|2021-01-01 00:10:00|2021-01-01 00:15:00|  1|
| A2|2021-01-01 00:01:00|2021-01-01 00:02:00|  1|
| A2|2021-01-01 00:04:30|2021-01-01 00:05:30|  1|
+---+-------------------+-------------------+---+
-- timestamp_micros
SELECT timestamp_micros(1230219000123123);
+----------------------------------+
|timestamp_micros(1230219000123123)|
+----------------------------------+
|              2008-12-26 00:30:...|
+----------------------------------+
-- timestamp_millis
SELECT timestamp_millis(1230219000123);
+-------------------------------+
|timestamp_millis(1230219000123)|
+-------------------------------+
|           2008-12-26 00:30:...|
+-------------------------------+
-- timestamp_seconds
SELECT timestamp_seconds(1230219000);
+-----------------------------+
|timestamp_seconds(1230219000)|
+-----------------------------+
|          2008-12-26 00:30:00|
+-----------------------------+
SELECT timestamp_seconds(1230219000.123);
+---------------------------------+
|timestamp_seconds(1230219000.123)|
+---------------------------------+
|             2008-12-26 00:30:...|
+---------------------------------+
-- to_date
SELECT to_date('2009-07-30 04:17:52');
+----------------------------+
|to_date(2009-07-30 04:17:52)|
+----------------------------+
|                  2009-07-30|
+----------------------------+
SELECT to_date('2016-12-31', 'yyyy-MM-dd');
+-------------------------------+
|to_date(2016-12-31, yyyy-MM-dd)|
+-------------------------------+
|                     2016-12-31|
+-------------------------------+
-- to_timestamp
SELECT to_timestamp('2016-12-31 00:12:00');
+---------------------------------+
|to_timestamp(2016-12-31 00:12:00)|
+---------------------------------+
|              2016-12-31 00:12:00|
+---------------------------------+
SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd');
+------------------------------------+
|to_timestamp(2016-12-31, yyyy-MM-dd)|
+------------------------------------+
|                 2016-12-31 00:00:00|
+------------------------------------+
-- to_timestamp_ltz
SELECT to_timestamp_ltz('2016-12-31 00:12:00');
+-------------------------------------+
|to_timestamp_ltz(2016-12-31 00:12:00)|
+-------------------------------------+
|                  2016-12-31 00:12:00|
+-------------------------------------+
SELECT to_timestamp_ltz('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|to_timestamp_ltz(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
|                     2016-12-31 00:00:00|
+----------------------------------------+
-- to_timestamp_ntz
SELECT to_timestamp_ntz('2016-12-31 00:12:00');
+-------------------------------------+
|to_timestamp_ntz(2016-12-31 00:12:00)|
+-------------------------------------+
|                  2016-12-31 00:12:00|
+-------------------------------------+
SELECT to_timestamp_ntz('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|to_timestamp_ntz(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
|                     2016-12-31 00:00:00|
+----------------------------------------+
-- to_unix_timestamp
SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+-----------------------------------------+
|to_unix_timestamp(2016-04-08, yyyy-MM-dd)|
+-----------------------------------------+
|                               1460041200|
+-----------------------------------------+
-- to_utc_timestamp
SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');
+----------------------------------------+
|to_utc_timestamp(2016-08-31, Asia/Seoul)|
+----------------------------------------+
|                     2016-08-30 15:00:00|
+----------------------------------------+
-- trunc
SELECT trunc('2019-08-04', 'week');
+-----------------------+
|trunc(2019-08-04, week)|
+-----------------------+
|             2019-07-29|
+-----------------------+
SELECT trunc('2019-08-04', 'quarter');
+--------------------------+
|trunc(2019-08-04, quarter)|
+--------------------------+
|                2019-07-01|
+--------------------------+
SELECT trunc('2009-02-12', 'MM');
+---------------------+
|trunc(2009-02-12, MM)|
+---------------------+
|           2009-02-01|
+---------------------+
SELECT trunc('2015-10-27', 'YEAR');
+-----------------------+
|trunc(2015-10-27, YEAR)|
+-----------------------+
|             2015-01-01|
+-----------------------+
-- try_to_timestamp
SELECT try_to_timestamp('2016-12-31 00:12:00');
+-------------------------------------+
|try_to_timestamp(2016-12-31 00:12:00)|
+-------------------------------------+
|                  2016-12-31 00:12:00|
+-------------------------------------+
SELECT try_to_timestamp('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|try_to_timestamp(2016-12-31, yyyy-MM-dd)|
+----------------------------------------+
|                     2016-12-31 00:00:00|
+----------------------------------------+
SELECT try_to_timestamp('foo', 'yyyy-MM-dd');
+---------------------------------+
|try_to_timestamp(foo, yyyy-MM-dd)|
+---------------------------------+
|                             NULL|
+---------------------------------+
-- unix_date
SELECT unix_date(DATE("1970-01-02"));
+---------------------+
|unix_date(1970-01-02)|
+---------------------+
|                    1|
+---------------------+
-- unix_micros
SELECT unix_micros(TIMESTAMP('1970-01-01 00:00:01Z'));
+---------------------------------+
|unix_micros(1970-01-01 00:00:01Z)|
+---------------------------------+
|                          1000000|
+---------------------------------+
-- unix_millis
SELECT unix_millis(TIMESTAMP('1970-01-01 00:00:01Z'));
+---------------------------------+
|unix_millis(1970-01-01 00:00:01Z)|
+---------------------------------+
|                             1000|
+---------------------------------+
-- unix_seconds
SELECT unix_seconds(TIMESTAMP('1970-01-01 00:00:01Z'));
+----------------------------------+
|unix_seconds(1970-01-01 00:00:01Z)|
+----------------------------------+
|                                 1|
+----------------------------------+
-- unix_timestamp
SELECT unix_timestamp();
+--------------------------------------------------------+
|unix_timestamp(current_timestamp(), yyyy-MM-dd HH:mm:ss)|
+--------------------------------------------------------+
|                                              1708760216|
+--------------------------------------------------------+
SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+--------------------------------------+
|unix_timestamp(2016-04-08, yyyy-MM-dd)|
+--------------------------------------+
|                            1460041200|
+--------------------------------------+
-- weekday
SELECT weekday('2009-07-30');
+-------------------+
|weekday(2009-07-30)|
+-------------------+
|                  3|
+-------------------+
-- weekofyear
SELECT weekofyear('2008-02-20');
+----------------------+
|weekofyear(2008-02-20)|
+----------------------+
|                     8|
+----------------------+
-- window
SELECT a, window.start, window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
|  a|              start|                end|cnt|
+---+-------------------+-------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:05:00|  2|
| A1|2021-01-01 00:05:00|2021-01-01 00:10:00|  1|
| A2|2021-01-01 00:00:00|2021-01-01 00:05:00|  1|
+---+-------------------+-------------------+---+
SELECT a, window.start, window.end, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '10 minutes', '5 minutes') ORDER BY a, start;
+---+-------------------+-------------------+---+
|  a|              start|                end|cnt|
+---+-------------------+-------------------+---+
| A1|2020-12-31 23:55:00|2021-01-01 00:05:00|  2|
| A1|2021-01-01 00:00:00|2021-01-01 00:10:00|  3|
| A1|2021-01-01 00:05:00|2021-01-01 00:15:00|  1|
| A2|2020-12-31 23:55:00|2021-01-01 00:05:00|  1|
| A2|2021-01-01 00:00:00|2021-01-01 00:10:00|  1|
+---+-------------------+-------------------+---+
-- window_time
SELECT a, window.start as start, window.end as end, window_time(window), cnt FROM (SELECT a, window, count(*) as cnt FROM VALUES ('A1', '2021-01-01 00:00:00'), ('A1', '2021-01-01 00:04:30'), ('A1', '2021-01-01 00:06:00'), ('A2', '2021-01-01 00:01:00') AS tab(a, b) GROUP by a, window(b, '5 minutes') ORDER BY a, window.start);
+---+-------------------+-------------------+--------------------+---+
|  a|              start|                end| window_time(window)|cnt|
+---+-------------------+-------------------+--------------------+---+
| A1|2021-01-01 00:00:00|2021-01-01 00:05:00|2021-01-01 00:04:...|  2|
| A1|2021-01-01 00:05:00|2021-01-01 00:10:00|2021-01-01 00:09:...|  1|
| A2|2021-01-01 00:00:00|2021-01-01 00:05:00|2021-01-01 00:04:...|  1|
+---+-------------------+-------------------+--------------------+---+
-- year
SELECT year('2016-07-30');
+----------------+
|year(2016-07-30)|
+----------------+
|            2016|
+----------------+
```

#### Aggregate functions
<a name="supported-sql-aggregate"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

Aggregate functions operate on values across rows to perform mathematical calculations such as sums, averages, counts, minimum and maximum values, standard deviation, and estimation, as well as some non-mathematical operations.

**Syntax**

```
aggregate_function(input1 [, input2, ...]) FILTER (WHERE boolean_expression) 
```

**Parameters**
+ `boolean_expression` - Specifies any expression that evaluates to a Boolean result. Two or more expressions can be combined using the logical operators (`AND`, `OR`).
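
As a quick illustration of the optional `FILTER` clause, the following sketch counts all rows and, in the same pass, only the rows matching the predicate. The inline `VALUES` table and column name `v` are hypothetical:

```
SELECT count(*) AS total,
       count(*) FILTER (WHERE v > 5) AS above_5
FROM VALUES (3), (7), (9) AS tab(v);
+-----+-------+
|total|above_5|
+-----+-------+
|    3|      2|
+-----+-------+
```

The filtered aggregate considers only the rows where `v > 5`, while `total` still counts every row.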

**Ordered-set aggregate functions**

These aggregate functions use a different syntax than the other aggregate functions so that an expression (typically a column name) can be specified by which to order the values.

**Syntax**

```
{ PERCENTILE_CONT | PERCENTILE_DISC }(percentile) WITHIN GROUP (ORDER BY { order_by_expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] }) FILTER (WHERE boolean_expression) 
```

**Parameters **
+ `percentile` - The percentile of the value that you want to find. The percentile must be a constant between 0.0 and 1.0. 
+ `order_by_expression` - The expression (typically a column name) by which to order the values before aggregating them. 
+ `boolean_expression` - Specifies any expression that evaluates to a Boolean result. Two or more expressions can be combined using the logical operators AND and OR. 

**Examples**

```
CREATE OR REPLACE TEMPORARY VIEW basic_pays AS SELECT * FROM VALUES
('Jane Doe','Accounting',8435),
('Akua Mansa','Accounting',9998),
('John Doe','Accounting',8992),
('Juan Li','Accounting',8870),
('Carlos Salazar','Accounting',11472),
('Arnav Desai','Accounting',6627),
('Saanvi Sarkar','IT',8113),
('Shirley Rodriguez','IT',5186),
('Nikki Wolf','Sales',9181),
('Alejandro Rosalez','Sales',9441),
('Nikhil Jayashankar','Sales',6660),
('Richard Roe','Sales',10563),
('Pat Candella','SCM',10449),
('Gerard Hernandez','SCM',6949),
('Pamela Castillo','SCM',11303),
('Paulo Santos','SCM',11798),
('Jorge Souza','SCM',10586)
AS basic_pays(employee_name, department, salary);
SELECT * FROM basic_pays;
+-------------------+----------+------+
|    employee_name  |department|salary|
+-------------------+----------+------+
| Arnav Desai       |Accounting|  6627|
| Jorge Souza       |       SCM| 10586|
| Jane Doe          |Accounting|  8435|
| Nikhil Jayashankar|     Sales|  6660|
| Richard Roe       |     Sales| 10563|
| Carlos Salazar    |Accounting| 11472|
| Gerard Hernandez  |       SCM|  6949|
| John Doe          |Accounting|  8992|
| Nikki Wolf        |     Sales|  9181|
| Paulo Santos      |       SCM| 11798|
| Saanvi Sarkar     |        IT|  8113|
| Shirley Rodriguez |        IT|  5186|
| Pat Candella      |       SCM| 10449|
| Akua Mansa        |Accounting|  9998|
| Pamela Castillo   |       SCM| 11303|
| Alejandro Rosalez |     Sales|  9441|
| Juan Li           |Accounting|  8870|
+-------------------+----------+------+
SELECT
department,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary) AS pc1,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary) FILTER (WHERE employee_name LIKE '%Bo%') AS pc2,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary DESC) AS pc3,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary DESC) FILTER (WHERE employee_name LIKE '%Bo%') AS pc4,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary) AS pd1,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary) FILTER (WHERE employee_name LIKE '%Bo%') AS pd2,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary DESC) AS pd3,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary DESC) FILTER (WHERE employee_name LIKE '%Bo%') AS pd4
FROM basic_pays
GROUP BY department
ORDER BY department;
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|department|    pc1|     pc2|    pc3|     pc4|  pd1|  pd2|  pd3|  pd4|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|Accounting|8543.75| 7838.25| 9746.5|10260.75| 8435| 6627| 9998|11472|
|        IT|5917.75|    NULL|7381.25|    NULL| 5186| NULL| 8113| NULL|
|     Sales|8550.75|    NULL| 9721.5|    NULL| 6660| NULL|10563| NULL|
|       SCM|10449.0|10786.25|11303.0|11460.75|10449|10449|11303|11798|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
```

#### Conditional functions
<a name="supported-sql-conditional"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| coalesce(expr1, expr2, ...) | Returns the first non-null argument, if one exists; otherwise, null. | 
| if(expr1, expr2, expr3) | If expr1 evaluates to true, returns expr2; otherwise, returns expr3. | 
| ifnull(expr1, expr2) | Returns expr2 if expr1 is null, or expr1 otherwise. | 
| nanvl(expr1, expr2) | Returns expr1 if it's not NaN, or expr2 otherwise. | 
| nullif(expr1, expr2) | Returns null if expr1 equals expr2, or expr1 otherwise. | 
| nvl(expr1, expr2) | Returns expr2 if expr1 is null, or expr1 otherwise. | 
| nvl2(expr1, expr2, expr3) | Returns expr2 if expr1 is not null, or expr3 otherwise. | 
| CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END | When expr1 = true, returns expr2; else when expr3 = true, returns expr4; else returns expr5. | 

**Examples**

```
-- coalesce
SELECT coalesce(NULL, 1, NULL);
+-----------------------+
|coalesce(NULL, 1, NULL)|
+-----------------------+
|                      1|
+-----------------------+
-- if
SELECT if(1 < 2, 'a', 'b');
+-------------------+
|(IF((1 < 2), a, b))|
+-------------------+
|                  a|
+-------------------+
-- ifnull
SELECT ifnull(NULL, array('2'));
+----------------------+
|ifnull(NULL, array(2))|
+----------------------+
|                   [2]|
+----------------------+
-- nanvl
SELECT nanvl(cast('NaN' as double), 123);
+-------------------------------+
|nanvl(CAST(NaN AS DOUBLE), 123)|
+-------------------------------+
|                          123.0|
+-------------------------------+
-- nullif
SELECT nullif(2, 2);
+------------+
|nullif(2, 2)|
+------------+
|        NULL|
+------------+
-- nvl
SELECT nvl(NULL, array('2'));
+-------------------+
|nvl(NULL, array(2))|
+-------------------+
|                [2]|
+-------------------+
-- nvl2
SELECT nvl2(NULL, 2, 1);
+----------------+
|nvl2(NULL, 2, 1)|
+----------------+
|               1|
+----------------+
-- when
SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
+-----------------------------------------------------------+
|CASE WHEN (1 > 0) THEN 1 WHEN (2 > 0) THEN 2.0 ELSE 1.2 END|
+-----------------------------------------------------------+
|                                                        1.0|
+-----------------------------------------------------------+
SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
+-----------------------------------------------------------+
|CASE WHEN (1 < 0) THEN 1 WHEN (2 > 0) THEN 2.0 ELSE 1.2 END|
+-----------------------------------------------------------+
|                                                        2.0|
+-----------------------------------------------------------+
SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 END;
+--------------------------------------------------+
|CASE WHEN (1 < 0) THEN 1 WHEN (2 < 0) THEN 2.0 END|
+--------------------------------------------------+
|                                              NULL|
+--------------------------------------------------+
```

#### JSON functions
<a name="supported-sql-json"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).




****  

| Function | Description | 
| --- | --- | 
| from_json(jsonStr, schema[, options]) | Returns a struct value with the given `jsonStr` and `schema`. | 
| get_json_object(json_txt, path) | Extracts a JSON object from `path`. | 
| json_array_length(jsonArray) | Returns the number of elements in the outermost JSON array. | 
| json_object_keys(json_object) | Returns all the keys of the outermost JSON object as an array. | 
| json_tuple(jsonStr, p1, p2, ..., pn) | Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string. | 
| schema_of_json(json[, options]) | Returns the schema of a JSON string in DDL format. | 
| to_json(expr[, options]) | Returns a JSON string with a given struct value. | 

**Examples**

```
-- from_json
SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
+---------------------------+
| from_json({"a":1, "b":0.8}) |
+---------------------------+
| {1, 0.8}                  |
+---------------------------+

SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
+--------------------------------+
| from_json({"time":"26/08/2015"}) |
+--------------------------------+
| {2015-08-26 00:00...           |
+--------------------------------+

SELECT from_json('{"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]}', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
+--------------------------------------------------------------------------------------------------------+
| from_json({"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]}) |
+--------------------------------------------------------------------------------------------------------+
| {Alice, [{Bob, 1}...                                                                                   |
+--------------------------------------------------------------------------------------------------------+

-- get_json_object
SELECT get_json_object('{"a":"b"}', '$.a');
+-------------------------------+
| get_json_object({"a":"b"}, $.a) |
+-------------------------------+
| b                             |
+-------------------------------+

-- json_array_length
SELECT json_array_length('[1,2,3,4]');
+----------------------------+
| json_array_length([1,2,3,4]) |
+----------------------------+
| 4                          |
+----------------------------+

SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+------------------------------------------------+
| json_array_length([1,2,3,{"f1":1,"f2":[5,6]},4]) |
+------------------------------------------------+
| 5                                              |
+------------------------------------------------+

SELECT json_array_length('[1,2');
+-----------------------+
| json_array_length([1,2) |
+-----------------------+
| NULL                  |
+-----------------------+

-- json_object_keys
SELECT json_object_keys('{}');
+--------------------+
| json_object_keys({}) |
+--------------------+
| []                 |
+--------------------+

SELECT json_object_keys('{"key": "value"}');
+----------------------------------+
| json_object_keys({"key": "value"}) |
+----------------------------------+
| [key]                            |
+----------------------------------+

SELECT json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
+--------------------------------------------------------+
| json_object_keys({"f1":"abc","f2":{"f3":"a", "f4":"b"}}) |
+--------------------------------------------------------+
| [f1, f2]                                               |
+--------------------------------------------------------+

-- json_tuple
SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
+---+---+
| c0| c1|
+---+---+
|  1|  2|
+---+---+

-- schema_of_json
SELECT schema_of_json('[{"col":0}]');
+---------------------------+
| schema_of_json([{"col":0}]) |
+---------------------------+
| ARRAY<STRUCT<col:...      |
+---------------------------+

SELECT schema_of_json('[{"col":01}]', map('allowNumericLeadingZeros', 'true'));
+----------------------------+
| schema_of_json([{"col":01}]) |
+----------------------------+
| ARRAY<STRUCT<col:...       |
+----------------------------+

-- to_json
SELECT to_json(named_struct('a', 1, 'b', 2));
+---------------------------------+
| to_json(named_struct(a, 1, b, 2)) |
+---------------------------------+
| {"a":1,"b":2}                   |
+---------------------------------+

SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-----------------------------------------------------------------+
| to_json(named_struct(time, to_timestamp(2015-08-26, yyyy-MM-dd))) |
+-----------------------------------------------------------------+
| {"time":"26/08/20...                                            |
+-----------------------------------------------------------------+

SELECT to_json(array(named_struct('a', 1, 'b', 2)));
+----------------------------------------+
| to_json(array(named_struct(a, 1, b, 2))) |
+----------------------------------------+
| [{"a":1,"b":2}]                        |
+----------------------------------------+

SELECT to_json(map('a', named_struct('b', 1)));
+-----------------------------------+
| to_json(map(a, named_struct(b, 1))) |
+-----------------------------------+
| {"a":{"b":1}}                     |
+-----------------------------------+

SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));
+----------------------------------------------------+
| to_json(map(named_struct(a, 1), named_struct(b, 2))) |
+----------------------------------------------------+
| {"[1]":{"b":2}}                                    |
+----------------------------------------------------+

SELECT to_json(map('a', 1));
+------------------+
| to_json(map(a, 1)) |
+------------------+
| {"a":1}          |
+------------------+

SELECT to_json(array(map('a', 1)));
+-------------------------+
| to_json(array(map(a, 1))) |
+-------------------------+
| [{"a":1}]               |
+-------------------------+
```

#### Array functions
<a name="supported-sql-array"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| array(expr, ...) | Returns an array with the given elements. | 
| array_append(array, element) | Adds the element to the end of the array passed as the first argument. The type of element should match the type of the array's elements. A null element is appended to the array, but if the array itself is NULL, the output is NULL. | 
| array_compact(array) | Removes null values from the array. | 
| array_contains(array, value) | Returns true if the array contains the value. | 
| array_distinct(array) | Removes duplicate values from the array. | 
| array_except(array1, array2) | Returns an array of the elements in array1 but not in array2, without duplicates. | 
| array_insert(x, pos, val) | Places val into index pos of array x. Array indices start at 1. The maximum negative index is -1, for which the function inserts the new element after the current last element. An index beyond the array size pads the array with 'null' elements (appending for a positive index, prepending for a negative one). | 
| array_intersect(array1, array2) | Returns an array of the elements in the intersection of array1 and array2, without duplicates. | 
| array_join(array, delimiter[, nullReplacement]) | Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. If no value is set for nullReplacement, any null value is filtered out. | 
| array_max(array) | Returns the maximum value in the array. NaN is greater than any non-NaN element for double/float types. NULL elements are skipped. | 
| array_min(array) | Returns the minimum value in the array. NaN is greater than any non-NaN element for double/float types. NULL elements are skipped. | 
| array_position(array, element) | Returns the (1-based) index of the first matching element of the array as a long, or 0 if no match is found. | 
| array_prepend(array, element) | Adds the element to the beginning of the array passed as the first argument. The type of element should match the type of the array's elements. A null element is prepended to the array, but if the array itself is NULL, the output is NULL. | 
| array_remove(array, element) | Removes all elements equal to element from the array. | 
| array_repeat(element, count) | Returns an array containing element count times. | 
| array_union(array1, array2) | Returns an array of the elements in the union of array1 and array2, without duplicates. | 
| arrays_overlap(a1, a2) | Returns true if a1 contains at least one non-null element that is also present in a2. If the arrays have no common element, both are non-empty, and either of them contains a null element, returns null; otherwise, returns false. | 
| arrays_zip(a1, a2, ...) | Returns a merged array of structs in which the N-th struct contains all N-th values of the input arrays. | 
| flatten(arrayOfArrays) | Transforms an array of arrays into a single array. | 
| get(array, index) | Returns element of array at given (0-based) index. If the index points outside of the array boundaries, then this function returns NULL. | 
| sequence(start, stop, step) | Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions. Supported types are: byte, short, integer, long, date, timestamp. The start and stop expressions must resolve to the same type. If start and stop expressions resolve to the 'date' or 'timestamp' type then the step expression must resolve to the 'interval' or 'year-month interval' or 'day-time interval' type, otherwise to the same type as the start and stop expressions. | 
| shuffle(array) | Returns a random permutation of the given array. | 
| slice(x, start, length) | Subsets array x starting from index start (array indices start at 1, or starting from the end if start is negative) with the specified length. | 
| sort_array(array[, ascendingOrder]) | Sorts the input array in ascending or descending order according to the natural ordering of the array elements. NaN is greater than any non-NaN element for double/float types. Null elements are placed at the beginning of the returned array in ascending order, or at the end in descending order. | 

**Examples**

```
-- array
SELECT array(1, 2, 3);
+--------------+
|array(1, 2, 3)|
+--------------+
|     [1, 2, 3]|
+--------------+
-- array_append
SELECT array_append(array('b', 'd', 'c', 'a'), 'd');
+----------------------------------+
|array_append(array(b, d, c, a), d)|
+----------------------------------+
|                   [b, d, c, a, d]|
+----------------------------------+
SELECT array_append(array(1, 2, 3, null), null);
+----------------------------------------+
|array_append(array(1, 2, 3, NULL), NULL)|
+----------------------------------------+
|                    [1, 2, 3, NULL, N...|
+----------------------------------------+
SELECT array_append(CAST(null as Array<Int>), 2);
+---------------------+
|array_append(NULL, 2)|
+---------------------+
|                 NULL|
+---------------------+
-- array_compact
SELECT array_compact(array(1, 2, 3, null));
+-----------------------------------+
|array_compact(array(1, 2, 3, NULL))|
+-----------------------------------+
|                          [1, 2, 3]|
+-----------------------------------+
SELECT array_compact(array("a", "b", "c"));
+-----------------------------+
|array_compact(array(a, b, c))|
+-----------------------------+
|                    [a, b, c]|
+-----------------------------+
-- array_contains
SELECT array_contains(array(1, 2, 3), 2);
+---------------------------------+
|array_contains(array(1, 2, 3), 2)|
+---------------------------------+
|                             true|
+---------------------------------+
-- array_distinct
SELECT array_distinct(array(1, 2, 3, null, 3));
+---------------------------------------+
|array_distinct(array(1, 2, 3, NULL, 3))|
+---------------------------------------+
|                        [1, 2, 3, NULL]|
+---------------------------------------+
-- array_except
SELECT array_except(array(1, 2, 3), array(1, 3, 5));
+--------------------------------------------+
|array_except(array(1, 2, 3), array(1, 3, 5))|
+--------------------------------------------+
|                                         [2]|
+--------------------------------------------+
-- array_insert
SELECT array_insert(array(1, 2, 3, 4), 5, 5);
+-------------------------------------+
|array_insert(array(1, 2, 3, 4), 5, 5)|
+-------------------------------------+
|                      [1, 2, 3, 4, 5]|
+-------------------------------------+
SELECT array_insert(array(5, 4, 3, 2), -1, 1);
+--------------------------------------+
|array_insert(array(5, 4, 3, 2), -1, 1)|
+--------------------------------------+
|                       [5, 4, 3, 2, 1]|
+--------------------------------------+
SELECT array_insert(array(5, 3, 2, 1), -4, 4);
+--------------------------------------+
|array_insert(array(5, 3, 2, 1), -4, 4)|
+--------------------------------------+
|                       [5, 4, 3, 2, 1]|
+--------------------------------------+
-- array_intersect
SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
+-----------------------------------------------+
|array_intersect(array(1, 2, 3), array(1, 3, 5))|
+-----------------------------------------------+
|                                         [1, 3]|
+-----------------------------------------------+
-- array_join
SELECT array_join(array('hello', 'world'), ' ');
+----------------------------------+
|array_join(array(hello, world),  )|
+----------------------------------+
|                       hello world|
+----------------------------------+
SELECT array_join(array('hello', null ,'world'), ' ');
+----------------------------------------+
|array_join(array(hello, NULL, world),  )|
+----------------------------------------+
|                             hello world|
+----------------------------------------+
SELECT array_join(array('hello', null ,'world'), ' ', ',');
+-------------------------------------------+
|array_join(array(hello, NULL, world),  , ,)|
+-------------------------------------------+
|                              hello , world|
+-------------------------------------------+
-- array_max
SELECT array_max(array(1, 20, null, 3));
+--------------------------------+
|array_max(array(1, 20, NULL, 3))|
+--------------------------------+
|                              20|
+--------------------------------+
-- array_min
SELECT array_min(array(1, 20, null, 3));
+--------------------------------+
|array_min(array(1, 20, NULL, 3))|
+--------------------------------+
|                               1|
+--------------------------------+
-- array_position
SELECT array_position(array(312, 773, 708, 708), 708);
+----------------------------------------------+
|array_position(array(312, 773, 708, 708), 708)|
+----------------------------------------------+
|                                             3|
+----------------------------------------------+
SELECT array_position(array(312, 773, 708, 708), 414);
+----------------------------------------------+
|array_position(array(312, 773, 708, 708), 414)|
+----------------------------------------------+
|                                             0|
+----------------------------------------------+
-- array_prepend
SELECT array_prepend(array('b', 'd', 'c', 'a'), 'd');
+-----------------------------------+
|array_prepend(array(b, d, c, a), d)|
+-----------------------------------+
|                    [d, b, d, c, a]|
+-----------------------------------+
SELECT array_prepend(array(1, 2, 3, null), null);
+-----------------------------------------+
|array_prepend(array(1, 2, 3, NULL), NULL)|
+-----------------------------------------+
|                     [NULL, 1, 2, 3, N...|
+-----------------------------------------+
SELECT array_prepend(CAST(null as Array<Int>), 2);
+----------------------+
|array_prepend(NULL, 2)|
+----------------------+
|                  NULL|
+----------------------+
-- array_remove
SELECT array_remove(array(1, 2, 3, null, 3), 3);
+----------------------------------------+
|array_remove(array(1, 2, 3, NULL, 3), 3)|
+----------------------------------------+
|                            [1, 2, NULL]|
+----------------------------------------+
-- array_repeat
SELECT array_repeat('123', 2);
+--------------------+
|array_repeat(123, 2)|
+--------------------+
|          [123, 123]|
+--------------------+
-- array_union
SELECT array_union(array(1, 2, 3), array(1, 3, 5));
+-------------------------------------------+
|array_union(array(1, 2, 3), array(1, 3, 5))|
+-------------------------------------------+
|                               [1, 2, 3, 5]|
+-------------------------------------------+
-- arrays_overlap
SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
+----------------------------------------------+
|arrays_overlap(array(1, 2, 3), array(3, 4, 5))|
+----------------------------------------------+
|                                          true|
+----------------------------------------------+
-- arrays_zip
SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
+------------------------------------------+
|arrays_zip(array(1, 2, 3), array(2, 3, 4))|
+------------------------------------------+
|                      [{1, 2}, {2, 3}, ...|
+------------------------------------------+
SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
+-------------------------------------------------+
|arrays_zip(array(1, 2), array(2, 3), array(3, 4))|
+-------------------------------------------------+
|                             [{1, 2, 3}, {2, 3...|
+-------------------------------------------------+
-- flatten
SELECT flatten(array(array(1, 2), array(3, 4)));
+----------------------------------------+
|flatten(array(array(1, 2), array(3, 4)))|
+----------------------------------------+
|                            [1, 2, 3, 4]|
+----------------------------------------+
-- get
SELECT get(array(1, 2, 3), 0);
+----------------------+
|get(array(1, 2, 3), 0)|
+----------------------+
|                     1|
+----------------------+
SELECT get(array(1, 2, 3), 3);
+----------------------+
|get(array(1, 2, 3), 3)|
+----------------------+
|                  NULL|
+----------------------+
SELECT get(array(1, 2, 3), -1);
+-----------------------+
|get(array(1, 2, 3), -1)|
+-----------------------+
|                   NULL|
+-----------------------+
-- sequence
SELECT sequence(1, 5);
+---------------+
| sequence(1, 5)|
+---------------+
|[1, 2, 3, 4, 5]|
+---------------+
SELECT sequence(5, 1);
+---------------+
| sequence(5, 1)|
+---------------+
|[5, 4, 3, 2, 1]|
+---------------+
SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);
+----------------------------------------------------------------------+
|sequence(to_date(2018-01-01), to_date(2018-03-01), INTERVAL '1' MONTH)|
+----------------------------------------------------------------------+
|                                                  [2018-01-01, 2018...|
+----------------------------------------------------------------------+
SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval '0-1' year to month);
+--------------------------------------------------------------------------------+
|sequence(to_date(2018-01-01), to_date(2018-03-01), INTERVAL '0-1' YEAR TO MONTH)|
+--------------------------------------------------------------------------------+
|                                                            [2018-01-01, 2018...|
+--------------------------------------------------------------------------------+
-- shuffle
SELECT shuffle(array(1, 20, 3, 5));
+---------------------------+
|shuffle(array(1, 20, 3, 5))|
+---------------------------+
|              [5, 1, 20, 3]|
+---------------------------+
SELECT shuffle(array(1, 20, null, 3));
+------------------------------+
|shuffle(array(1, 20, NULL, 3))|
+------------------------------+
|              [1, NULL, 20, 3]|
+------------------------------+
-- slice
SELECT slice(array(1, 2, 3, 4), 2, 2);
+------------------------------+
|slice(array(1, 2, 3, 4), 2, 2)|
+------------------------------+
|                        [2, 3]|
+------------------------------+
SELECT slice(array(1, 2, 3, 4), -2, 2);
+-------------------------------+
|slice(array(1, 2, 3, 4), -2, 2)|
+-------------------------------+
|                         [3, 4]|
+-------------------------------+
-- sort_array
SELECT sort_array(array('b', 'd', null, 'c', 'a'), true);
+-----------------------------------------+
|sort_array(array(b, d, NULL, c, a), true)|
+-----------------------------------------+
|                       [NULL, a, b, c, d]|
+-----------------------------------------+
```
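
As a supplement to the examples above, the following sketch shows `sequence` with timestamp bounds and a day-time interval step, as described in the table (output omitted, since it depends on the session time zone):

```
-- sequence with timestamp bounds and a day-time interval step
SELECT sequence(to_timestamp('2018-01-01 00:00:00'),
                to_timestamp('2018-01-01 12:00:00'),
                interval 6 hours);
```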

#### Window functions
<a name="supported-sql-window"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

Window functions operate on a group of rows, referred to as a window, and calculate a return value for each row based on the group of rows. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative statistic, or accessing the value of rows given the relative position of the current row. 

**Syntax** 

```
window_function [ nulls_option ] OVER ( [ { PARTITION | DISTRIBUTE } BY partition_col_name = partition_col_val ( [ , ... ] ) ] { ORDER | SORT } BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] [ window_frame ] ) 
```

**Parameters** 
+ `window_function` - One of the following:

  Ranking functions 

  Syntax: `RANK | DENSE_RANK | PERCENT_RANK | NTILE | ROW_NUMBER `

  Analytic functions 

  Syntax: `CUME_DIST | LAG | LEAD | NTH_VALUE | FIRST_VALUE | LAST_VALUE `

  Aggregate functions 

  Syntax: `MAX | MIN | COUNT | SUM | AVG | ... `
+ `nulls_option` - Specifies whether or not to skip null values when evaluating the window function. RESPECT NULLS means not skipping null values, while IGNORE NULLS means skipping. If not specified, the default is RESPECT NULLS. 

  Syntax: `{ IGNORE | RESPECT } NULLS `

  Note: Only `LAG`, `LEAD`, `NTH_VALUE`, `FIRST_VALUE`, and `LAST_VALUE` can be used with `IGNORE NULLS`. 
+ `window_frame` - Specifies which row to start the window on and where to end it. 

  Syntax: `{ RANGE | ROWS } { frame_start | BETWEEN frame_start AND frame_end }` 

  `frame_start` and `frame_end` have the following syntax: 

  Syntax: `UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW | offset FOLLOWING | UNBOUNDED FOLLOWING `

  offset: specifies the offset from the position of the current row. 

  **Note** If `frame_end` is omitted, it defaults to CURRENT ROW. 
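
To illustrate `nulls_option` together with an explicit `window_frame`, the following is a minimal sketch assuming a hypothetical `sensor_readings` table with `ts` and `reading` columns:

```
-- Carry the most recent non-null reading forward over time
SELECT ts,
       reading,
       LAST_VALUE(reading) IGNORE NULLS OVER (
         ORDER BY ts
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS last_known_reading
FROM sensor_readings;
```

Because `LAST_VALUE` supports `IGNORE NULLS`, null readings are skipped within the frame, producing a forward-fill of the last known value.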

**Examples**

```
CREATE TABLE employees (name STRING, dept STRING, salary INT, age INT);
INSERT INTO employees VALUES ("Lisa", "Sales", 10000, 35);
INSERT INTO employees VALUES ("Evan", "Sales", 32000, 38);
INSERT INTO employees VALUES ("Fred", "Engineering", 21000, 28);
INSERT INTO employees VALUES ("Alex", "Sales", 30000, 33);
INSERT INTO employees VALUES ("Tom", "Engineering", 23000, 33);
INSERT INTO employees VALUES ("Jane", "Marketing", 29000, 28);
INSERT INTO employees VALUES ("Jeff", "Marketing", 35000, 38);
INSERT INTO employees VALUES ("Paul", "Engineering", 29000, 23);
INSERT INTO employees VALUES ("Chloe", "Engineering", 23000, 25);
INSERT INTO employees VALUES ("Helen", "Marketing", 29000, 40);
SELECT * FROM employees;
+-----+-----------+------+-----+
| name|       dept|salary|  age|
+-----+-----------+------+-----+
|Chloe|Engineering| 23000|   25|
| Fred|Engineering| 21000|   28|
| Paul|Engineering| 29000|   23|
|Helen|  Marketing| 29000|   40|
|  Tom|Engineering| 23000|   33|
| Jane|  Marketing| 29000|   28|
| Jeff|  Marketing| 35000|   38|
| Evan|      Sales| 32000|   38|
| Lisa|      Sales| 10000|   35|
| Alex|      Sales| 30000|   33|
+-----+-----------+------+-----+
SELECT name, dept, salary, RANK() OVER (PARTITION BY dept ORDER BY salary) AS rank FROM employees;
+-----+-----------+------+----+
| name|       dept|salary|rank|
+-----+-----------+------+----+
| Lisa|      Sales| 10000|   1|
| Alex|      Sales| 30000|   2|
| Evan|      Sales| 32000|   3|
| Fred|Engineering| 21000|   1|
|  Tom|Engineering| 23000|   2|
|Chloe|Engineering| 23000|   2|
| Paul|Engineering| 29000|   4|
|Helen|  Marketing| 29000|   1|
| Jane|  Marketing| 29000|   1|
| Jeff|  Marketing| 35000|   3|
+-----+-----------+------+----+
SELECT name, dept, salary, DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW) AS dense_rank FROM employees;
+-----+-----------+------+----------+
| name|       dept|salary|dense_rank|
+-----+-----------+------+----------+
| Lisa|      Sales| 10000|         1|
| Alex|      Sales| 30000|         2|
| Evan|      Sales| 32000|         3|
| Fred|Engineering| 21000|         1|
|  Tom|Engineering| 23000|         2|
|Chloe|Engineering| 23000|         2|
| Paul|Engineering| 29000|         3|
|Helen|  Marketing| 29000|         1|
| Jane|  Marketing| 29000|         1|
| Jeff|  Marketing| 35000|         2|
+-----+-----------+------+----------+
SELECT name, dept, age, CUME_DIST() OVER (PARTITION BY dept ORDER BY age
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cume_dist FROM employees;
+-----+-----------+------+------------------+
| name|       dept|age   |         cume_dist|
+-----+-----------+------+------------------+
| Alex|      Sales|    33|0.3333333333333333|
| Lisa|      Sales|    35|0.6666666666666666|
| Evan|      Sales|    38|               1.0|
| Paul|Engineering|    23|              0.25|
|Chloe|Engineering|    25|               0.5|
| Fred|Engineering|    28|              0.75|
|  Tom|Engineering|    33|               1.0|
| Jane|  Marketing|    28|0.3333333333333333|
| Jeff|  Marketing|    38|0.6666666666666666|
|Helen|  Marketing|    40|               1.0|
+-----+-----------+------+------------------+
SELECT name, dept, salary, MIN(salary) OVER (PARTITION BY dept ORDER BY salary) AS min
FROM employees;
+-----+-----------+------+-----+
| name|       dept|salary|  min|
+-----+-----------+------+-----+
| Lisa|      Sales| 10000|10000|
| Alex|      Sales| 30000|10000|
| Evan|      Sales| 32000|10000|
|Helen|  Marketing| 29000|29000|
| Jane|  Marketing| 29000|29000|
| Jeff|  Marketing| 35000|29000|
| Fred|Engineering| 21000|21000|
|  Tom|Engineering| 23000|21000|
|Chloe|Engineering| 23000|21000|
| Paul|Engineering| 29000|21000|
+-----+-----------+------+-----+
SELECT name, dept, salary,
LAG(salary) OVER (PARTITION BY dept ORDER BY salary) AS lag,
LEAD(salary, 1, 0) OVER (PARTITION BY dept ORDER BY salary) AS lead
FROM employees;
+-----+-----------+------+-----+-----+
| name|       dept|salary|  lag| lead|
+-----+-----------+------+-----+-----+
| Lisa|      Sales| 10000| NULL|30000|
| Alex|      Sales| 30000|10000|32000|
| Evan|      Sales| 32000|30000|    0|
| Fred|Engineering| 21000| NULL|23000|
|Chloe|Engineering| 23000|21000|23000|
|  Tom|Engineering| 23000|23000|29000|
| Paul|Engineering| 29000|23000|    0|
|Helen|  Marketing| 29000| NULL|29000|
| Jane|  Marketing| 29000|29000|35000|
| Jeff|  Marketing| 35000|29000|    0|
+-----+-----------+------+-----+-----+
SELECT id, v,
LEAD(v, 0) IGNORE NULLS OVER w lead,
LAG(v, 0) IGNORE NULLS OVER w lag,
NTH_VALUE(v, 2) IGNORE NULLS OVER w nth_value,
FIRST_VALUE(v) IGNORE NULLS OVER w first_value,
LAST_VALUE(v) IGNORE NULLS OVER w last_value
FROM test_ignore_null
WINDOW w AS (ORDER BY id)
ORDER BY id;
+--+----+----+----+---------+-----------+----------+
|id|   v|lead| lag|nth_value|first_value|last_value|
+--+----+----+----+---------+-----------+----------+
| 0|NULL|NULL|NULL|     NULL|       NULL|      NULL|
| 1|   x|   x|   x|     NULL|          x|         x|
| 2|NULL|NULL|NULL|     NULL|          x|         x|
| 3|NULL|NULL|NULL|     NULL|          x|         x|
| 4|   y|   y|   y|        y|          x|         y|
| 5|NULL|NULL|NULL|        y|          x|         y|
| 6|   z|   z|   z|        y|          x|         z|
| 7|   v|   v|   v|        y|          x|         v|
| 8|NULL|NULL|NULL|        y|          x|         v|
+--+----+----+----+---------+-----------+----------+
```
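The difference between `RANK`, `DENSE_RANK`, and `CUME_DIST` in the examples above can be sketched in plain Python. This is an illustrative single-partition model, not the query engine's implementation:

```python
# Illustrative sketch (not the OpenSearch/Spark implementation): how
# RANK, DENSE_RANK, and CUME_DIST assign values within one partition.
def rank(values):
    """RANK: ties share a rank; the next distinct value skips ranks."""
    return [1 + sum(1 for w in values if w < v) for v in values]

def dense_rank(values):
    """DENSE_RANK: ties share a rank; no gaps after ties."""
    distinct = sorted(set(values))
    return [1 + distinct.index(v) for v in values]

def cume_dist(values):
    """CUME_DIST: fraction of rows with a value <= the current row's."""
    n = len(values)
    return [sum(1 for w in values if w <= v) / n for v in values]

# Engineering salaries from the example, sorted ascending
salaries = [21000, 23000, 23000, 29000]
print(rank(salaries))        # [1, 2, 2, 4]
print(dense_rank(salaries))  # [1, 2, 2, 3]
```

Note how the tied 23000 salaries get rank 2 in both functions, but only `RANK` jumps to 4 for the next row.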

#### Conversion functions
<a name="supported-sql-conversion"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| bigint(expr) | Casts the value `expr` to the target data type `bigint`. | 
| binary(expr) | Casts the value `expr` to the target data type `binary`. | 
| boolean(expr) | Casts the value `expr` to the target data type `boolean`. | 
| cast(expr AS type) | Casts the value `expr` to the target data type `type`. | 
| date(expr) | Casts the value `expr` to the target data type `date`. | 
| decimal(expr) | Casts the value `expr` to the target data type `decimal`. | 
| double(expr) | Casts the value `expr` to the target data type `double`. | 
| float(expr) | Casts the value `expr` to the target data type `float`. | 
| int(expr) | Casts the value `expr` to the target data type `int`. | 
| smallint(expr) | Casts the value `expr` to the target data type `smallint`. | 
| string(expr) | Casts the value `expr` to the target data type `string`. | 
| timestamp(expr) | Casts the value `expr` to the target data type `timestamp`. | 
| tinyint(expr) | Casts the value `expr` to the target data type `tinyint`. | 

**Examples**

```
-- cast
SELECT cast('10' as int);
+---------------+
|CAST(10 AS INT)|
+---------------+
|             10|
+---------------+
```
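The behavior of the type-specific cast functions can be approximated in Python. This is a minimal sketch assuming non-ANSI behavior, where a failed cast yields NULL rather than raising an error; the helper name is illustrative, not an API:

```python
# Illustrative sketch of int(expr) / cast(expr AS int) on string input.
# Assumption: non-ANSI mode, so an invalid cast returns None (SQL NULL)
# instead of raising an error.
def cast_int(expr):
    try:
        return int(str(expr).strip())
    except ValueError:
        return None

print(cast_int("10"))   # 10
print(cast_int("abc"))  # None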

#### Predicate functions
<a name="supported-sql-predicate"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| \$1 expr | Logical not. | 
| expr1 < expr2 | Returns true if `expr1` is less than `expr2`. | 
| expr1 <= expr2 | Returns true if `expr1` is less than or equal to `expr2`. | 
| expr1 <=> expr2 | Returns same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of the them is null. | 
| expr1 = expr2 | Returns true if `expr1` equals `expr2`, or false otherwise. | 
| expr1 == expr2 | Returns true if `expr1` equals `expr2`, or false otherwise. | 
| expr1 > expr2 | Returns true if `expr1` is greater than `expr2`. | 
| expr1 >= expr2 | Returns true if `expr1` is greater than or equal to `expr2`. | 
| expr1 and expr2 | Logical AND. | 
| str ilike pattern[ ESCAPE escape] | Returns true if str matches `pattern` with `escape` case-insensitively, null if any arguments are null, false otherwise. | 
| expr1 in(expr2, expr3, ...) | Returns true if `expr` equals to any valN. | 
| isnan(expr) | Returns true if `expr` is NaN, or false otherwise. | 
| isnotnull(expr) | Returns true if `expr` is not null, or false otherwise. | 
| isnull(expr) | Returns true if `expr` is null, or false otherwise. | 
| str like pattern[ ESCAPE escape] | Returns true if str matches `pattern` with `escape`, null if any arguments are null, false otherwise. | 
| not expr | Logical not. | 
| expr1 or expr2 | Logical OR. | 
| regexp(str, regexp) | Returns true if `str` matches `regexp`, or false otherwise. | 
| regexp\$1like(str, regexp) | Returns true if `str` matches `regexp`, or false otherwise. | 
| rlike(str, regexp) | Returns true if `str` matches `regexp`, or false otherwise. | 

**Examples**

```
-- !
SELECT ! true;
+----------+
|(NOT true)|
+----------+
|     false|
+----------+
SELECT ! false;
+-----------+
|(NOT false)|
+-----------+
|       true|
+-----------+
SELECT ! NULL;
+----------+
|(NOT NULL)|
+----------+
|      NULL|
+----------+
-- <
SELECT to_date('2009-07-30 04:17:52') < to_date('2009-07-30 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) < to_date(2009-07-30 04:17:52))|
+-------------------------------------------------------------+
|                                                        false|
+-------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') < to_date('2009-08-01 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) < to_date(2009-08-01 04:17:52))|
+-------------------------------------------------------------+
|                                                         true|
+-------------------------------------------------------------+
SELECT 1 < NULL;
+----------+
|(1 < NULL)|
+----------+
|      NULL|
+----------+
-- <=
SELECT 2 <= 2;
+--------+
|(2 <= 2)|
+--------+
|    true|
+--------+
SELECT 1.0 <= '1';
+----------+
|(1.0 <= 1)|
+----------+
|      true|
+----------+
SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-07-30 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) <= to_date(2009-07-30 04:17:52))|
+--------------------------------------------------------------+
|                                                          true|
+--------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-08-01 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) <= to_date(2009-08-01 04:17:52))|
+--------------------------------------------------------------+
|                                                          true|
+--------------------------------------------------------------+
SELECT 1 <= NULL;
+-----------+
|(1 <= NULL)|
+-----------+
|       NULL|
+-----------+
-- <=>
SELECT 2 <=> 2;
+---------+
|(2 <=> 2)|
+---------+
|     true|
+---------+
SELECT 1 <=> '1';
+---------+
|(1 <=> 1)|
+---------+
|     true|
+---------+
SELECT true <=> NULL;
+---------------+
|(true <=> NULL)|
+---------------+
|          false|
+---------------+
SELECT NULL <=> NULL;
+---------------+
|(NULL <=> NULL)|
+---------------+
|           true|
+---------------+
-- =
SELECT 2 = 2;
+-------+
|(2 = 2)|
+-------+
|   true|
+-------+
SELECT 1 = '1';
+-------+
|(1 = 1)|
+-------+
|   true|
+-------+
SELECT true = NULL;
+-------------+
|(true = NULL)|
+-------------+
|         NULL|
+-------------+
SELECT NULL = NULL;
+-------------+
|(NULL = NULL)|
+-------------+
|         NULL|
+-------------+
-- ==
SELECT 2 == 2;
+-------+
|(2 = 2)|
+-------+
|   true|
+-------+
SELECT 1 == '1';
+-------+
|(1 = 1)|
+-------+
|   true|
+-------+
SELECT true == NULL;
+-------------+
|(true = NULL)|
+-------------+
|         NULL|
+-------------+
SELECT NULL == NULL;
+-------------+
|(NULL = NULL)|
+-------------+
|         NULL|
+-------------+
-- >
SELECT 2 > 1;
+-------+
|(2 > 1)|
+-------+
|   true|
+-------+
SELECT 2 > 1.1;
+-------+
|(2 > 1)|
+-------+
|   true|
+-------+
SELECT to_date('2009-07-30 04:17:52') > to_date('2009-07-30 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) > to_date(2009-07-30 04:17:52))|
+-------------------------------------------------------------+
|                                                        false|
+-------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') > to_date('2009-08-01 04:17:52');
+-------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) > to_date(2009-08-01 04:17:52))|
+-------------------------------------------------------------+
|                                                        false|
+-------------------------------------------------------------+
SELECT 1 > NULL;
+----------+
|(1 > NULL)|
+----------+
|      NULL|
+----------+
-- >=
SELECT 2 >= 1;
+--------+
|(2 >= 1)|
+--------+
|    true|
+--------+
SELECT 2.0 >= '2.1';
+------------+
|(2.0 >= 2.1)|
+------------+
|       false|
+------------+
SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-07-30 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) >= to_date(2009-07-30 04:17:52))|
+--------------------------------------------------------------+
|                                                          true|
+--------------------------------------------------------------+
SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-08-01 04:17:52');
+--------------------------------------------------------------+
|(to_date(2009-07-30 04:17:52) >= to_date(2009-08-01 04:17:52))|
+--------------------------------------------------------------+
|                                                         false|
+--------------------------------------------------------------+
SELECT 1 >= NULL;
+-----------+
|(1 >= NULL)|
+-----------+
|       NULL|
+-----------+
-- and
SELECT true and true;
+---------------+
|(true AND true)|
+---------------+
|           true|
+---------------+
SELECT true and false;
+----------------+
|(true AND false)|
+----------------+
|           false|
+----------------+
SELECT true and NULL;
+---------------+
|(true AND NULL)|
+---------------+
|           NULL|
+---------------+
SELECT false and NULL;
+----------------+
|(false AND NULL)|
+----------------+
|           false|
+----------------+
-- ilike
SELECT ilike('Wagon', '_Agon');
+-------------------+
|ilike(Wagon, _Agon)|
+-------------------+
|               true|
+-------------------+
SELECT '%SystemDrive%\Users\John' ilike '\%SystemDrive\%\\users%';
+--------------------------------------------------------+
|ilike(%SystemDrive%\Users\John, \%SystemDrive\%\\users%)|
+--------------------------------------------------------+
|                                                    true|
+--------------------------------------------------------+
SELECT '%SystemDrive%\\USERS\\John' ilike '\%SystemDrive\%\\\\Users%';
+--------------------------------------------------------+
|ilike(%SystemDrive%\USERS\John, \%SystemDrive\%\\Users%)|
+--------------------------------------------------------+
|                                                    true|
+--------------------------------------------------------+
SELECT '%SystemDrive%/Users/John' ilike '/%SYSTEMDrive/%//Users%' ESCAPE '/';
+--------------------------------------------------------+
|ilike(%SystemDrive%/Users/John, /%SYSTEMDrive/%//Users%)|
+--------------------------------------------------------+
|                                                    true|
+--------------------------------------------------------+
-- in
SELECT 1 in(1, 2, 3);
+----------------+
|(1 IN (1, 2, 3))|
+----------------+
|            true|
+----------------+
SELECT 1 in(2, 3, 4);
+----------------+
|(1 IN (2, 3, 4))|
+----------------+
|           false|
+----------------+
SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3));
+----------------------------------------------------------------------------------+
|(named_struct(a, 1, b, 2) IN (named_struct(a, 1, b, 1), named_struct(a, 1, b, 3)))|
+----------------------------------------------------------------------------------+
|                                                                             false|
+----------------------------------------------------------------------------------+
SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3));
+----------------------------------------------------------------------------------+
|(named_struct(a, 1, b, 2) IN (named_struct(a, 1, b, 2), named_struct(a, 1, b, 3)))|
+----------------------------------------------------------------------------------+
|                                                                              true|
+----------------------------------------------------------------------------------+
-- isnan
SELECT isnan(cast('NaN' as double));
+--------------------------+
|isnan(CAST(NaN AS DOUBLE))|
+--------------------------+
|                      true|
+--------------------------+
-- isnotnull
SELECT isnotnull(1);
+---------------+
|(1 IS NOT NULL)|
+---------------+
|           true|
+---------------+
-- isnull
SELECT isnull(1);
+-----------+
|(1 IS NULL)|
+-----------+
|      false|
+-----------+
-- like
SELECT like('Wagon', '_Agon');
+----------------+
|Wagon LIKE _Agon|
+----------------+
|            true|
+----------------+
-- not
SELECT not true;
+----------+
|(NOT true)|
+----------+
|     false|
+----------+
SELECT not false;
+-----------+
|(NOT false)|
+-----------+
|       true|
+-----------+
SELECT not NULL;
+----------+
|(NOT NULL)|
+----------+
|      NULL|
+----------+
-- or
SELECT true or false;
+---------------+
|(true OR false)|
+---------------+
|           true|
+---------------+
SELECT false or false;
+----------------+
|(false OR false)|
+----------------+
|           false|
+----------------+
SELECT true or NULL;
+--------------+
|(true OR NULL)|
+--------------+
|          true|
+--------------+
SELECT false or NULL;
+---------------+
|(false OR NULL)|
+---------------+
|           NULL|
+---------------+
```

#### Map functions
<a name="supported-sql-map"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| element\$1at(array, index) | Returns element of array at given (1-based) index. | 
| element\$1at(map, key) | Returns value for given key. The function returns NULL if the key is not contained in the map. | 
| map(key0, value0, key1, value1, ...) | Creates a map with the given key/value pairs. | 
| map\$1concat(map, ...) | Returns the union of all the given maps | 
| map\$1contains\$1key(map, key) | Returns true if the map contains the key. | 
| map\$1entries(map) | Returns an unordered array of all entries in the given map. | 
| map\$1from\$1arrays(keys, values) | Creates a map with a pair of the given key/value arrays. All elements in keys should not be null | 
| map\$1from\$1entries(arrayOfEntries) | Returns a map created from the given array of entries. | 
| map\$1keys(map) | Returns an unordered array containing the keys of the map. | 
| map\$1values(map) | Returns an unordered array containing the values of the map. | 
| str\$1to\$1map(text[, pairDelim[, keyValueDelim]]) | Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for `pairDelim` and ':' for `keyValueDelim`. Both `pairDelim` and `keyValueDelim` are treated as regular expressions. | 
| try\$1element\$1at(array, index) | Returns element of array at given (1-based) index. If Index is 0, the system will throw an error. If index < 0, accesses elements from the last to the first. The function always returns NULL if the index exceeds the length of the array. | 
| try\$1element\$1at(map, key) | Returns value for given key. The function always returns NULL if the key is not contained in the map. | 

**Examples**

```
-- element_at
SELECT element_at(array(1, 2, 3), 2);
+-----------------------------+
|element_at(array(1, 2, 3), 2)|
+-----------------------------+
|                            2|
+-----------------------------+
SELECT element_at(map(1, 'a', 2, 'b'), 2);
+------------------------------+
|element_at(map(1, a, 2, b), 2)|
+------------------------------+
|                             b|
+------------------------------+
-- map
SELECT map(1.0, '2', 3.0, '4');
+--------------------+
| map(1.0, 2, 3.0, 4)|
+--------------------+
|{1.0 -> 2, 3.0 -> 4}|
+--------------------+
-- map_concat
SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
+--------------------------------------+
|map_concat(map(1, a, 2, b), map(3, c))|
+--------------------------------------+
|                  {1 -> a, 2 -> b, ...|
+--------------------------------------+
-- map_contains_key
SELECT map_contains_key(map(1, 'a', 2, 'b'), 1);
+------------------------------------+
|map_contains_key(map(1, a, 2, b), 1)|
+------------------------------------+
|                                true|
+------------------------------------+
SELECT map_contains_key(map(1, 'a', 2, 'b'), 3);
+------------------------------------+
|map_contains_key(map(1, a, 2, b), 3)|
+------------------------------------+
|                               false|
+------------------------------------+
-- map_entries
SELECT map_entries(map(1, 'a', 2, 'b'));
+----------------------------+
|map_entries(map(1, a, 2, b))|
+----------------------------+
|            [{1, a}, {2, b}]|
+----------------------------+
-- map_from_arrays
SELECT map_from_arrays(array(1.0, 3.0), array('2', '4'));
+---------------------------------------------+
|map_from_arrays(array(1.0, 3.0), array(2, 4))|
+---------------------------------------------+
|                         {1.0 -> 2, 3.0 -> 4}|
+---------------------------------------------+
-- map_from_entries
SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
+---------------------------------------------------+
|map_from_entries(array(struct(1, a), struct(2, b)))|
+---------------------------------------------------+
|                                   {1 -> a, 2 -> b}|
+---------------------------------------------------+
-- map_keys
SELECT map_keys(map(1, 'a', 2, 'b'));
+-------------------------+
|map_keys(map(1, a, 2, b))|
+-------------------------+
|                   [1, 2]|
+-------------------------+
-- map_values
SELECT map_values(map(1, 'a', 2, 'b'));
+---------------------------+
|map_values(map(1, a, 2, b))|
+---------------------------+
|                     [a, b]|
+---------------------------+
-- str_to_map
SELECT str_to_map('a:1,b:2,c:3', ',', ':');
+-----------------------------+
|str_to_map(a:1,b:2,c:3, ,, :)|
+-----------------------------+
|         {a -> 1, b -> 2, ...|
+-----------------------------+
SELECT str_to_map('a');
+-------------------+
|str_to_map(a, ,, :)|
+-------------------+
|        {a -> NULL}|
+-------------------+
-- try_element_at
SELECT try_element_at(array(1, 2, 3), 2);
+---------------------------------+
|try_element_at(array(1, 2, 3), 2)|
+---------------------------------+
|                                2|
+---------------------------------+
SELECT try_element_at(map(1, 'a', 2, 'b'), 2);
+----------------------------------+
|try_element_at(map(1, a, 2, b), 2)|
+----------------------------------+
|                                 b|
+----------------------------------+
```

#### Mathematical functions
<a name="supported-sql-math"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).


****  

| Function | Description | 
| --- | --- | 
| expr1 % expr2 | Returns the remainder after `expr1`/`expr2`. | 
| expr1 \$1 expr2 | Returns `expr1`\$1`expr2`. | 
| expr1 \$1 expr2 | Returns `expr1`\$1`expr2`. | 
| expr1 - expr2 | Returns `expr1`-`expr2`. | 
| expr1 / expr2 | Returns `expr1`/`expr2`. It always performs floating point division. | 
| abs(expr) | Returns the absolute value of the numeric or interval value. | 
| acos(expr) | Returns the inverse cosine (a.k.a. arc cosine) of `expr`, as if computed by `java.lang.Math.acos`. | 
| acosh(expr) | Returns inverse hyperbolic cosine of `expr`. | 
| asin(expr) | Returns the inverse sine (a.k.a. arc sine) the arc sin of `expr`, as if computed by `java.lang.Math.asin`. | 
| asinh(expr) | Returns inverse hyperbolic sine of `expr`. | 
| atan(expr) | Returns the inverse tangent (a.k.a. arc tangent) of `expr`, as if computed by `java.lang.Math.atan` | 
| atan2(exprY, exprX) | Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates (`exprX`, `exprY`), as if computed by `java.lang.Math.atan2`. | 
| atanh(expr) | Returns inverse hyperbolic tangent of `expr`. | 
| bin(expr) | Returns the string representation of the long value `expr` represented in binary. | 
| bround(expr, d) | Returns `expr` rounded to `d` decimal places using HALF\$1EVEN rounding mode. | 
| cbrt(expr) | Returns the cube root of `expr`. | 
| ceil(expr[, scale]) | Returns the smallest number after rounding up that is not smaller than `expr`. An optional `scale` parameter can be specified to control the rounding behavior. | 
| ceiling(expr[, scale]) | Returns the smallest number after rounding up that is not smaller than `expr`. An optional `scale` parameter can be specified to control the rounding behavior. | 
| conv(num, from\$1base, to\$1base) | Convert `num` from `from\$1base` to `to\$1base`. | 
| cos(expr) | Returns the cosine of `expr`, as if computed by `java.lang.Math.cos`. | 
| cosh(expr) | Returns the hyperbolic cosine of `expr`, as if computed by `java.lang.Math.cosh`. | 
| cot(expr) | Returns the cotangent of `expr`, as if computed by `1/java.lang.Math.tan`. | 
| csc(expr) | Returns the cosecant of `expr`, as if computed by `1/java.lang.Math.sin`. | 
| degrees(expr) | Converts radians to degrees. | 
| expr1 div expr2 | Divide `expr1` by `expr2`. It returns NULL if an operand is NULL or `expr2` is 0. The result is casted to long. | 
| e() | Returns Euler's number, e. | 
| exp(expr) | Returns e to the power of `expr`. | 
| expm1(expr) - Returns exp(`expr`) | 1 | 
| factorial(expr) | Returns the factorial of `expr`. `expr` is [0..20]. Otherwise, null. | 
| floor(expr[, scale]) | Returns the largest number after rounding down that is not greater than `expr`. An optional `scale` parameter can be specified to control the rounding behavior. | 
| greatest(expr, ...) | Returns the greatest value of all parameters, skipping null values. | 
| hex(expr) | Converts `expr` to hexadecimal. | 
| hypot(expr1, expr2) | Returns sqrt(`expr1`\$1\$12 \$1 `expr2`\$1\$12). | 
| least(expr, ...) | Returns the least value of all parameters, skipping null values. | 
| ln(expr) | Returns the natural logarithm (base e) of `expr`. | 
| log(base, expr) | Returns the logarithm of `expr` with `base`. | 
| log10(expr) | Returns the logarithm of `expr` with base 10. | 
| log1p(expr) | Returns log(1 \$1 `expr`). | 
| log2(expr) | Returns the logarithm of `expr` with base 2. | 
| expr1 mod expr2 | Returns the remainder after `expr1`/`expr2`. | 
| negative(expr) | Returns the negated value of `expr`. | 
| pi() | Returns pi. | 
| pmod(expr1, expr2) | Returns the positive value of `expr1` mod `expr2`. | 
| positive(expr) | Returns the value of `expr`. | 
| pow(expr1, expr2) | Raises `expr1` to the power of `expr2`. | 
| power(expr1, expr2) | Raises `expr1` to the power of `expr2`. | 
| radians(expr) | Converts degrees to radians. | 
| rand([seed]) | Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1). | 
| randn([seed]) | Returns a random value with independent and identically distributed (i.i.d.) values drawn from the standard normal distribution. | 
| random([seed]) | Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1). | 
| rint(expr) | Returns the double value that is closest in value to the argument and is equal to a mathematical integer. | 
| round(expr, d) | Returns `expr` rounded to `d` decimal places using HALF\$1UP rounding mode. | 
| sec(expr) | Returns the secant of `expr`, as if computed by `1/java.lang.Math.cos`. | 
| shiftleft(base, expr) | Bitwise left shift. | 
| sign(expr) | Returns -1.0, 0.0 or 1.0 as `expr` is negative, 0 or positive. | 
| signum(expr) | Returns -1.0, 0.0 or 1.0 as `expr` is negative, 0 or positive. | 
| sin(expr) | Returns the sine of `expr`, as if computed by `java.lang.Math.sin`. | 
| sinh(expr) | Returns hyperbolic sine of `expr`, as if computed by `java.lang.Math.sinh`. | 
| sqrt(expr) | Returns the square root of `expr`. | 
| tan(expr) | Returns the tangent of `expr`, as if computed by `java.lang.Math.tan`. | 
| tanh(expr) | Returns the hyperbolic tangent of `expr`, as if computed by `java.lang.Math.tanh`. | 
| try\$1add(expr1, expr2) | Returns the sum of `expr1`and `expr2` and the result is null on overflow. The acceptable input types are the same with the `\$1` operator. | 
| try\$1divide(dividend, divisor) | Returns `dividend`/`divisor`. It always performs floating point division. Its result is always null if `expr2` is 0. `dividend` must be a numeric or an interval. `divisor` must be a numeric. | 
| try\$1multiply(expr1, expr2) | Returns `expr1`\$1`expr2` and the result is null on overflow. The acceptable input types are the same with the `\$1` operator. | 
| try\$1subtract(expr1, expr2) | Returns `expr1`-`expr2` and the result is null on overflow. The acceptable input types are the same with the `-` operator. | 
| unhex(expr) | Converts hexadecimal `expr` to binary. | 
| width\$1bucket(value, min\$1value, max\$1value, num\$1bucket) | Returns the bucket number to which `value` would be assigned in an equiwidth histogram with `num\$1bucket` buckets, in the range `min\$1value` to `max\$1value`." | 

**Examples**

```
-- %
SELECT 2 % 1.8;
+---------+
|(2 % 1.8)|
+---------+
|      0.2|
+---------+
SELECT MOD(2, 1.8);
+-----------+
|mod(2, 1.8)|
+-----------+
|        0.2|
+-----------+
-- *
SELECT 2 * 3;
+-------+
|(2 * 3)|
+-------+
|      6|
+-------+
-- +
SELECT 1 + 2;
+-------+
|(1 + 2)|
+-------+
|      3|
+-------+
-- -
SELECT 2 - 1;
+-------+
|(2 - 1)|
+-------+
|      1|
+-------+
-- /
SELECT 3 / 2;
+-------+
|(3 / 2)|
+-------+
|    1.5|
+-------+
SELECT 2L / 2L;
+-------+
|(2 / 2)|
+-------+
|    1.0|
+-------+
-- abs
SELECT abs(-1);
+-------+
|abs(-1)|
+-------+
|      1|
+-------+
SELECT abs(INTERVAL -'1-1' YEAR TO MONTH);
+----------------------------------+
|abs(INTERVAL '-1-1' YEAR TO MONTH)|
+----------------------------------+
|              INTERVAL '1-1' YE...|
+----------------------------------+
-- acos
SELECT acos(1);
+-------+
|ACOS(1)|
+-------+
|    0.0|
+-------+
SELECT acos(2);
+-------+
|ACOS(2)|
+-------+
|    NaN|
+-------+
-- acosh
SELECT acosh(1);
+--------+
|ACOSH(1)|
+--------+
|     0.0|
+--------+
SELECT acosh(0);
+--------+
|ACOSH(0)|
+--------+
|     NaN|
+--------+
-- asin
SELECT asin(0);
+-------+
|ASIN(0)|
+-------+
|    0.0|
+-------+
SELECT asin(2);
+-------+
|ASIN(2)|
+-------+
|    NaN|
+-------+
-- asinh
SELECT asinh(0);
+--------+
|ASINH(0)|
+--------+
|     0.0|
+--------+
-- atan
SELECT atan(0);
+-------+
|ATAN(0)|
+-------+
|    0.0|
+-------+
-- atan2
SELECT atan2(0, 0);
+-----------+
|ATAN2(0, 0)|
+-----------+
|        0.0|
+-----------+
-- atanh
SELECT atanh(0);
+--------+
|ATANH(0)|
+--------+
|     0.0|
+--------+
SELECT atanh(2);
+--------+
|ATANH(2)|
+--------+
|     NaN|
+--------+
-- bin
SELECT bin(13);
+-------+
|bin(13)|
+-------+
|   1101|
+-------+
SELECT bin(-13);
+--------------------+
|            bin(-13)|
+--------------------+
|11111111111111111...|
+--------------------+
SELECT bin(13.3);
+---------+
|bin(13.3)|
+---------+
|     1101|
+---------+
-- bround
SELECT bround(2.5, 0);
+--------------+
|bround(2.5, 0)|
+--------------+
|             2|
+--------------+
SELECT bround(25, -1);
+--------------+
|bround(25, -1)|
+--------------+
|            20|
+--------------+
-- cbrt
SELECT cbrt(27.0);
+----------+
|CBRT(27.0)|
+----------+
|       3.0|
+----------+
-- ceil
SELECT ceil(-0.1);
+----------+
|CEIL(-0.1)|
+----------+
|         0|
+----------+
SELECT ceil(5);
+-------+
|CEIL(5)|
+-------+
|      5|
+-------+
SELECT ceil(3.1411, 3);
+---------------+
|ceil(3.1411, 3)|
+---------------+
|          3.142|
+---------------+
SELECT ceil(3.1411, -3);
+----------------+
|ceil(3.1411, -3)|
+----------------+
|            1000|
+----------------+
-- ceiling
SELECT ceiling(-0.1);
+-------------+
|ceiling(-0.1)|
+-------------+
|            0|
+-------------+
SELECT ceiling(5);
+----------+
|ceiling(5)|
+----------+
|         5|
+----------+
SELECT ceiling(3.1411, 3);
+------------------+
|ceiling(3.1411, 3)|
+------------------+
|             3.142|
+------------------+
SELECT ceiling(3.1411, -3);
+-------------------+
|ceiling(3.1411, -3)|
+-------------------+
|               1000|
+-------------------+
-- conv
SELECT conv('100', 2, 10);
+----------------+
|conv(100, 2, 10)|
+----------------+
|               4|
+----------------+
SELECT conv(-10, 16, -10);
+------------------+
|conv(-10, 16, -10)|
+------------------+
|               -16|
+------------------+
-- cos
SELECT cos(0);
+------+
|COS(0)|
+------+
|   1.0|
+------+
-- cosh
SELECT cosh(0);
+-------+
|COSH(0)|
+-------+
|    1.0|
+-------+
-- cot
SELECT cot(1);
+------------------+
|            COT(1)|
+------------------+
|0.6420926159343306|
+------------------+
-- csc
SELECT csc(1);
+------------------+
|            CSC(1)|
+------------------+
|1.1883951057781212|
+------------------+
-- degrees
SELECT degrees(3.141592653589793);
+--------------------------+
|DEGREES(3.141592653589793)|
+--------------------------+
|                     180.0|
+--------------------------+
-- div
SELECT 3 div 2;
+---------+
|(3 div 2)|
+---------+
|        1|
+---------+
SELECT INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH;
+------------------------------------------------------+
|(INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH)|
+------------------------------------------------------+
|                                                   -13|
+------------------------------------------------------+
-- e
SELECT e();
+-----------------+
|              E()|
+-----------------+
|2.718281828459045|
+-----------------+
-- exp
SELECT exp(0);
+------+
|EXP(0)|
+------+
|   1.0|
+------+
-- expm1
SELECT expm1(0);
+--------+
|EXPM1(0)|
+--------+
|     0.0|
+--------+
-- factorial
SELECT factorial(5);
+------------+
|factorial(5)|
+------------+
|         120|
+------------+
-- floor
SELECT floor(-0.1);
+-----------+
|FLOOR(-0.1)|
+-----------+
|         -1|
+-----------+
SELECT floor(5);
+--------+
|FLOOR(5)|
+--------+
|       5|
+--------+
SELECT floor(3.1411, 3);
+----------------+
|floor(3.1411, 3)|
+----------------+
|           3.141|
+----------------+
SELECT floor(3.1411, -3);
+-----------------+
|floor(3.1411, -3)|
+-----------------+
|                0|
+-----------------+
-- greatest
SELECT greatest(10, 9, 2, 4, 3);
+------------------------+
|greatest(10, 9, 2, 4, 3)|
+------------------------+
|                      10|
+------------------------+
-- hex
SELECT hex(17);
+-------+
|hex(17)|
+-------+
|     11|
+-------+
SELECT hex('SQL');
+--------+
|hex(SQL)|
+--------+
|  53514C|
+--------+
-- hypot
SELECT hypot(3, 4);
+-----------+
|HYPOT(3, 4)|
+-----------+
|        5.0|
+-----------+
-- least
SELECT least(10, 9, 2, 4, 3);
+---------------------+
|least(10, 9, 2, 4, 3)|
+---------------------+
|                    2|
+---------------------+
-- ln
SELECT ln(1);
+-----+
|ln(1)|
+-----+
|  0.0|
+-----+
-- log
SELECT log(10, 100);
+------------+
|LOG(10, 100)|
+------------+
|         2.0|
+------------+
-- log10
SELECT log10(10);
+---------+
|LOG10(10)|
+---------+
|      1.0|
+---------+
-- log1p
SELECT log1p(0);
+--------+
|LOG1P(0)|
+--------+
|     0.0|
+--------+
-- log2
SELECT log2(2);
+-------+
|LOG2(2)|
+-------+
|    1.0|
+-------+
-- mod
SELECT 2 % 1.8;
+---------+
|(2 % 1.8)|
+---------+
|      0.2|
+---------+
SELECT MOD(2, 1.8);
+-----------+
|mod(2, 1.8)|
+-----------+
|        0.2|
+-----------+
-- negative
SELECT negative(1);
+-----------+
|negative(1)|
+-----------+
|         -1|
+-----------+
-- pi
SELECT pi();
+-----------------+
|             PI()|
+-----------------+
|3.141592653589793|
+-----------------+
-- pmod
SELECT pmod(10, 3);
+-----------+
|pmod(10, 3)|
+-----------+
|          1|
+-----------+
SELECT pmod(-10, 3);
+------------+
|pmod(-10, 3)|
+------------+
|           2|
+------------+
-- positive
SELECT positive(1);
+-----+
|(+ 1)|
+-----+
|    1|
+-----+
-- pow
SELECT pow(2, 3);
+---------+
|pow(2, 3)|
+---------+
|      8.0|
+---------+
-- power
SELECT power(2, 3);
+-----------+
|POWER(2, 3)|
+-----------+
|        8.0|
+-----------+
-- radians
SELECT radians(180);
+-----------------+
|     RADIANS(180)|
+-----------------+
|3.141592653589793|
+-----------------+
-- rand
SELECT rand();
+------------------+
|            rand()|
+------------------+
|0.7211420708112387|
+------------------+
SELECT rand(0);
+------------------+
|           rand(0)|
+------------------+
|0.7604953758285915|
+------------------+
SELECT rand(null);
+------------------+
|        rand(NULL)|
+------------------+
|0.7604953758285915|
+------------------+
-- randn
SELECT randn();
+-------------------+
|            randn()|
+-------------------+
|-0.8175603217732732|
+-------------------+
SELECT randn(0);
+------------------+
|          randn(0)|
+------------------+
|1.6034991609278433|
+------------------+
SELECT randn(null);
+------------------+
|       randn(NULL)|
+------------------+
|1.6034991609278433|
+------------------+
-- random
SELECT random();
+-----------------+
|           rand()|
+-----------------+
|0.394205008255365|
+-----------------+
SELECT random(0);
+------------------+
|           rand(0)|
+------------------+
|0.7604953758285915|
+------------------+
SELECT random(null);
+------------------+
|        rand(NULL)|
+------------------+
|0.7604953758285915|
+------------------+
-- rint
SELECT rint(12.3456);
+-------------+
|rint(12.3456)|
+-------------+
|         12.0|
+-------------+
-- round
SELECT round(2.5, 0);
+-------------+
|round(2.5, 0)|
+-------------+
|            3|
+-------------+
-- sec
SELECT sec(0);
+------+
|SEC(0)|
+------+
|   1.0|
+------+
-- shiftleft
SELECT shiftleft(2, 1);
+---------------+
|shiftleft(2, 1)|
+---------------+
|              4|
+---------------+
-- sign
SELECT sign(40);
+--------+
|sign(40)|
+--------+
|     1.0|
+--------+
SELECT sign(INTERVAL -'100' YEAR);
+--------------------------+
|sign(INTERVAL '-100' YEAR)|
+--------------------------+
|                      -1.0|
+--------------------------+
-- signum
SELECT signum(40);
+----------+
|SIGNUM(40)|
+----------+
|       1.0|
+----------+
SELECT signum(INTERVAL -'100' YEAR);
+----------------------------+
|SIGNUM(INTERVAL '-100' YEAR)|
+----------------------------+
|                        -1.0|
+----------------------------+
-- sin
SELECT sin(0);
+------+
|SIN(0)|
+------+
|   0.0|
+------+
-- sinh
SELECT sinh(0);
+-------+
|SINH(0)|
+-------+
|    0.0|
+-------+
-- sqrt
SELECT sqrt(4);
+-------+
|SQRT(4)|
+-------+
|    2.0|
+-------+
-- tan
SELECT tan(0);
+------+
|TAN(0)|
+------+
|   0.0|
+------+
-- tanh
SELECT tanh(0);
+-------+
|TANH(0)|
+-------+
|    0.0|
+-------+
-- try_add
SELECT try_add(1, 2);
+-------------+
|try_add(1, 2)|
+-------------+
|            3|
+-------------+
SELECT try_add(2147483647, 1);
+----------------------+
|try_add(2147483647, 1)|
+----------------------+
|                  NULL|
+----------------------+
SELECT try_add(date'2021-01-01', 1);
+-----------------------------+
|try_add(DATE '2021-01-01', 1)|
+-----------------------------+
|                   2021-01-02|
+-----------------------------+
SELECT try_add(date'2021-01-01', interval 1 year);
+---------------------------------------------+
|try_add(DATE '2021-01-01', INTERVAL '1' YEAR)|
+---------------------------------------------+
|                                   2022-01-01|
+---------------------------------------------+
SELECT try_add(timestamp'2021-01-01 00:00:00', interval 1 day);
+----------------------------------------------------------+
|try_add(TIMESTAMP '2021-01-01 00:00:00', INTERVAL '1' DAY)|
+----------------------------------------------------------+
|                                       2021-01-02 00:00:00|
+----------------------------------------------------------+
SELECT try_add(interval 1 year, interval 2 year);
+---------------------------------------------+
|try_add(INTERVAL '1' YEAR, INTERVAL '2' YEAR)|
+---------------------------------------------+
|                            INTERVAL '3' YEAR|
+---------------------------------------------+
-- try_divide
SELECT try_divide(3, 2);
+----------------+
|try_divide(3, 2)|
+----------------+
|             1.5|
+----------------+
SELECT try_divide(2L, 2L);
+----------------+
|try_divide(2, 2)|
+----------------+
|             1.0|
+----------------+
SELECT try_divide(1, 0);
+----------------+
|try_divide(1, 0)|
+----------------+
|            NULL|
+----------------+
SELECT try_divide(interval 2 month, 2);
+---------------------------------+
|try_divide(INTERVAL '2' MONTH, 2)|
+---------------------------------+
|             INTERVAL '0-1' YE...|
+---------------------------------+
SELECT try_divide(interval 2 month, 0);
+---------------------------------+
|try_divide(INTERVAL '2' MONTH, 0)|
+---------------------------------+
|                             NULL|
+---------------------------------+
-- try_multiply
SELECT try_multiply(2, 3);
+------------------+
|try_multiply(2, 3)|
+------------------+
|                 6|
+------------------+
SELECT try_multiply(-2147483648, 10);
+-----------------------------+
|try_multiply(-2147483648, 10)|
+-----------------------------+
|                         NULL|
+-----------------------------+
SELECT try_multiply(interval 2 year, 3);
+----------------------------------+
|try_multiply(INTERVAL '2' YEAR, 3)|
+----------------------------------+
|              INTERVAL '6-0' YE...|
+----------------------------------+
-- try_subtract
SELECT try_subtract(2, 1);
+------------------+
|try_subtract(2, 1)|
+------------------+
|                 1|
+------------------+
SELECT try_subtract(-2147483648, 1);
+----------------------------+
|try_subtract(-2147483648, 1)|
+----------------------------+
|                        NULL|
+----------------------------+
SELECT try_subtract(date'2021-01-02', 1);
+----------------------------------+
|try_subtract(DATE '2021-01-02', 1)|
+----------------------------------+
|                        2021-01-01|
+----------------------------------+
SELECT try_subtract(date'2021-01-01', interval 1 year);
+--------------------------------------------------+
|try_subtract(DATE '2021-01-01', INTERVAL '1' YEAR)|
+--------------------------------------------------+
|                                        2020-01-01|
+--------------------------------------------------+
SELECT try_subtract(timestamp'2021-01-02 00:00:00', interval 1 day);
+---------------------------------------------------------------+
|try_subtract(TIMESTAMP '2021-01-02 00:00:00', INTERVAL '1' DAY)|
+---------------------------------------------------------------+
|                                            2021-01-01 00:00:00|
+---------------------------------------------------------------+
SELECT try_subtract(interval 2 year, interval 1 year);
+--------------------------------------------------+
|try_subtract(INTERVAL '2' YEAR, INTERVAL '1' YEAR)|
+--------------------------------------------------+
|                                 INTERVAL '1' YEAR|
+--------------------------------------------------+
-- unhex
SELECT decode(unhex('53514C'), 'UTF-8');
+----------------------------+
|decode(unhex(53514C), UTF-8)|
+----------------------------+
|                         SQL|
+----------------------------+
-- width_bucket
SELECT width_bucket(5.3, 0.2, 10.6, 5);
+-------------------------------+
|width_bucket(5.3, 0.2, 10.6, 5)|
+-------------------------------+
|                              3|
+-------------------------------+
SELECT width_bucket(-2.1, 1.3, 3.4, 3);
+-------------------------------+
|width_bucket(-2.1, 1.3, 3.4, 3)|
+-------------------------------+
|                              0|
+-------------------------------+
SELECT width_bucket(8.1, 0.0, 5.7, 4);
+------------------------------+
|width_bucket(8.1, 0.0, 5.7, 4)|
+------------------------------+
|                             5|
+------------------------------+
SELECT width_bucket(-0.9, 5.2, 0.5, 2);
+-------------------------------+
|width_bucket(-0.9, 5.2, 0.5, 2)|
+-------------------------------+
|                              3|
+-------------------------------+
SELECT width_bucket(INTERVAL '0' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10);
+--------------------------------------------------------------------------+
|width_bucket(INTERVAL '0' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10)|
+--------------------------------------------------------------------------+
|                                                                         1|
+--------------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '1' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10);
+--------------------------------------------------------------------------+
|width_bucket(INTERVAL '1' YEAR, INTERVAL '0' YEAR, INTERVAL '10' YEAR, 10)|
+--------------------------------------------------------------------------+
|                                                                         2|
+--------------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '0' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10);
+-----------------------------------------------------------------------+
|width_bucket(INTERVAL '0' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10)|
+-----------------------------------------------------------------------+
|                                                                      1|
+-----------------------------------------------------------------------+
SELECT width_bucket(INTERVAL '1' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10);
+-----------------------------------------------------------------------+
|width_bucket(INTERVAL '1' DAY, INTERVAL '0' DAY, INTERVAL '10' DAY, 10)|
+-----------------------------------------------------------------------+
|                                                                      2|
+-----------------------------------------------------------------------+
```

#### Generator functions
<a name="supported-sql-generator"></a>

**Note**  
To see which AWS data source integrations support these SQL functions, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).



| Function | Description | 
| --- | --- | 
| explode(expr) | Separates the elements of array `expr` into multiple rows, or the elements of map `expr` into multiple rows and columns. Unless specified otherwise, uses the default column name `col` for elements of the array or `key` and `value` for the elements of the map. | 
| explode\_outer(expr) | Separates the elements of array `expr` into multiple rows, or the elements of map `expr` into multiple rows and columns. Unlike `explode`, produces a single row of nulls when the array or map is null or empty. Unless specified otherwise, uses the default column name `col` for elements of the array or `key` and `value` for the elements of the map. | 
| inline(expr) | Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise. | 
| inline\_outer(expr) | Explodes an array of structs into a table. Unlike `inline`, produces a single row of nulls when the array is null or empty. Uses column names col1, col2, etc. by default unless specified otherwise. | 
| posexplode(expr) | Separates the elements of array `expr` into multiple rows with positions, or the elements of map `expr` into multiple rows and columns with positions. Unless specified otherwise, uses the column name `pos` for position, `col` for elements of the array or `key` and `value` for elements of the map. | 
| posexplode\_outer(expr) | Separates the elements of array `expr` into multiple rows with positions, or the elements of map `expr` into multiple rows and columns with positions. Unlike `posexplode`, produces a single row of nulls when the array or map is null or empty. Unless specified otherwise, uses the column name `pos` for position, `col` for elements of the array or `key` and `value` for elements of the map. | 
| stack(n, expr1, ..., exprk) | Separates `expr1`, ..., `exprk` into `n` rows. Uses column names col0, col1, etc. by default unless specified otherwise. | 

**Examples**

```
-- explode
SELECT explode(array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+

SELECT explode(collection => array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+

SELECT * FROM explode(collection => array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+

-- explode_outer
SELECT explode_outer(array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+

SELECT explode_outer(collection => array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+

SELECT * FROM explode_outer(collection => array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+

-- inline
SELECT inline(array(struct(1, 'a'), struct(2, 'b')));
+----+----+
|col1|col2|
+----+----+
|   1|   a|
|   2|   b|
+----+----+

-- inline_outer
SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')));
+----+----+
|col1|col2|
+----+----+
|   1|   a|
|   2|   b|
+----+----+

-- posexplode
SELECT posexplode(array(10,20));
+---+---+
|pos|col|
+---+---+
|  0| 10|
|  1| 20|
+---+---+

SELECT * FROM posexplode(array(10,20));
+---+---+
|pos|col|
+---+---+
|  0| 10|
|  1| 20|
+---+---+

-- posexplode_outer
SELECT posexplode_outer(array(10,20));
+---+---+
|pos|col|
+---+---+
|  0| 10|
|  1| 20|
+---+---+

SELECT * FROM posexplode_outer(array(10,20));
+---+---+
|pos|col|
+---+---+
|  0| 10|
|  1| 20|
+---+---+

-- stack
SELECT stack(2, 1, 2, 3);
+----+----+
|col0|col1|
+----+----+
|   1|   2|
|   3|NULL|
+----+----+
```

#### SELECT clause
<a name="supported-sql-select"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

OpenSearch SQL supports a `SELECT` statement used for retrieving result sets from one or more tables. The following section describes the overall query syntax and the different constructs of a query.

**Syntax** 

```
select_statement 
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_statement, ... ]
[ ORDER BY 
    { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] 
    [ , ... ] 
    } 
]
[ SORT BY 
    { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] 
    [ , ... ] 
    } 
]
[ WINDOW { named_window [ , WINDOW named_window, ... ] } ]
[ LIMIT { ALL | expression } ]
```

While `select_statement` is defined as:

```
SELECT [ ALL | DISTINCT ] { [ [ named_expression ] [ , ... ] ] }
FROM { from_item [ , ... ] }
[ PIVOT clause ]
[ UNPIVOT clause ]
[ LATERAL VIEW clause ] [ ... ]
[ WHERE boolean_expression ]
[ GROUP BY expression [ , ... ] ]
[ HAVING boolean_expression ]
```

 **Parameters** 
+ **ALL** 

  Selects all matching rows from the relation and is enabled by default. 
+ **DISTINCT** 

  Selects all matching rows from the relation after removing duplicates in results. 
+ **named\_expression**

  An expression with an assigned name. In general, it denotes a column expression. 

  Syntax: `expression [[AS] alias]` 
+ **from\_item**

  Specifies a source of input for the query. It can be one of the following:
  + Table relation
  + Join relation
  + Pivot relation
  + Unpivot relation
  + Table-value function
  + Inline table
  + `[ LATERAL ] ( Subquery )`
+ **PIVOT** 

  The `PIVOT` clause is used for data perspective. It turns the distinct values of a specified column into multiple columns and computes aggregated values for each. 
+ **UNPIVOT** 

  The `UNPIVOT` clause transforms columns into rows. It is the reverse of `PIVOT`, except for aggregation of values. 
+ **LATERAL VIEW **

  The `LATERAL VIEW` clause is used in conjunction with generator functions such as `EXPLODE`, which will generate a virtual table containing one or more rows.

  `LATERAL VIEW` will apply the rows to each original output row. 
+ **WHERE** 

  Filters the result of the `FROM` clause based on the supplied predicates. 
+ **GROUP BY **

  Specifies the expressions that are used to group the rows. 

  This is used in conjunction with aggregate functions (`MIN`, `MAX`, `COUNT`, `SUM`, `AVG`, and so on) to group rows based on the grouping expressions and aggregate values in each group. 

  When a `FILTER` clause is attached to an aggregate function, only the matching rows are passed to that function. 
+ **HAVING** 

  Specifies the predicates by which the rows produced by `GROUP BY` are filtered. 

  The `HAVING` clause is used to filter rows after the grouping is performed. 

  If `HAVING` is specified without `GROUP BY`, it indicates a `GROUP BY` without grouping expressions (global aggregate). 
+ **ORDER BY **

  Specifies an ordering of the rows of the complete result set of the query. 

  The output rows are ordered across the partitions. 

  This parameter is mutually exclusive with `SORT BY` and `DISTRIBUTE BY`; they cannot be specified together. 
+ **SORT BY **

  Specifies an ordering by which the rows are ordered within each partition. 

  This parameter is mutually exclusive with `ORDER BY`; they cannot be specified together. 
+ **LIMIT** 

  Specifies the maximum number of rows that can be returned by a statement or subquery. 

  This clause is mostly used in conjunction with `ORDER BY` to produce a deterministic result. 
+ **boolean\_expression**

  Specifies any expression that evaluates to a result type boolean. 

  Two or more expressions may be combined together using the logical operators ( `AND`, `OR` ). 
+ **expression** 

  Specifies a combination of one or more values, operators, and SQL functions that evaluates to a value. 
+ **named\_window**

  Specifies aliases for one or more source window specifications. 

  The source window specifications can be referenced in the window definitions in the query. 
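
**Examples**

The following brief example combines several of the clauses above (`DISTINCT`, `WHERE`, `ORDER BY`, and `LIMIT`). It uses a hypothetical `person` table with the same shape as the one in the `WHERE` clause examples; exact output formatting can vary by engine version.

```
CREATE TABLE person (id INT, name STRING, age INT);
INSERT INTO person VALUES
(100, 'John', 30),
(200, 'Mary', NULL),
(300, 'Mike', 80),
(400, 'Dan',  50);

-- The two oldest people with a known age, without duplicates.
SELECT DISTINCT name, age FROM person
WHERE age IS NOT NULL
ORDER BY age DESC
LIMIT 2;
+----+---+
|name|age|
+----+---+
|Mike| 80|
| Dan| 50|
+----+---+
```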

#### WHERE clause
<a name="supported-sql-where"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `WHERE` clause is used to limit the results of the `FROM` clause of a query or a subquery based on the specified condition. 

**Syntax** 

```
WHERE boolean_expression
```

**Parameters**
+ **boolean\_expression**

  Specifies any expression that evaluates to a result type boolean. 

  Two or more expressions may be combined together using the logical operators ( `AND`, `OR` ). 

**Examples**

```
CREATE TABLE person (id INT, name STRING, age INT);
INSERT INTO person VALUES
(100, 'John', 30),
(200, 'Mary', NULL),
(300, 'Mike', 80),
(400, 'Dan',  50);

-- Comparison operator in `WHERE` clause.
SELECT * FROM person WHERE id > 200 ORDER BY id;
+---+----+---+
| id|name|age|
+---+----+---+
|300|Mike| 80|
|400| Dan| 50|
+---+----+---+

-- Comparison and logical operators in `WHERE` clause.
SELECT * FROM person WHERE id = 200 OR id = 300 ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|200|Mary|null|
|300|Mike|  80|
+---+----+----+

-- IS NULL expression in `WHERE` clause.
SELECT * FROM person WHERE id > 300 OR age IS NULL ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|200|Mary|null|
|400| Dan|  50|
+---+----+----+

-- Function expression in `WHERE` clause.
SELECT * FROM person WHERE length(name) > 3 ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|100|John|  30|
|200|Mary|null|
|300|Mike|  80|
+---+----+----+

-- `BETWEEN` expression in `WHERE` clause.
SELECT * FROM person WHERE id BETWEEN 200 AND 300 ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|200|Mary|null|
|300|Mike|  80|
+---+----+----+

-- Scalar Subquery in `WHERE` clause.
SELECT * FROM person WHERE age > (SELECT avg(age) FROM person);
+---+----+---+
| id|name|age|
+---+----+---+
|300|Mike| 80|
+---+----+---+
 
-- Correlated Subquery in `WHERE` clause.
SELECT * FROM person AS parent
WHERE exists (SELECT 1 FROM person AS child
              WHERE parent.id = child.id AND child.age IS NULL);
+---+----+----+
|id |name|age |
+---+----+----+
|200|Mary|null|
+---+----+----+
```

#### GROUP BY clause
<a name="supported-sql-group-by"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `GROUP BY` clause is used to group the rows based on a set of specified grouping expressions and compute aggregations on the group of rows based on one or more specified aggregate functions. 

The system can also compute multiple aggregations for the same input record set via the `GROUPING SETS`, `CUBE`, and `ROLLUP` clauses. The grouping expressions and advanced aggregations can be mixed in the `GROUP BY` clause and nested in a `GROUPING SETS` clause. See the `Mixed/Nested Grouping Analytics` section for more details. 

When a `FILTER` clause is attached to an aggregate function, only the matching rows are passed to that function. 

**Syntax** 

```
GROUP BY group_expression [ , group_expression [ , ... ] ] [ WITH { ROLLUP | CUBE } ]
GROUP BY { group_expression | { ROLLUP | CUBE | GROUPING SETS } (grouping_set [ , ...]) } [ , ... ]
```

While aggregate functions are defined as: 

```
aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_expression ) ]
```

**Parameters**
+ **group\_expression**

  Specifies the criteria based on which the rows are grouped together. The grouping of rows is performed based on result values of the grouping expressions. 

  A grouping expression may be a column name like `GROUP BY a`, a column position like `GROUP BY 1`, or an expression like `GROUP BY a + b`. 
+ **grouping\_set**

  A grouping set is specified by zero or more comma-separated expressions in parentheses. When the grouping set has only one element, parentheses can be omitted. 

  For example, `GROUPING SETS ((a), (b))` is the same as `GROUPING SETS (a, b)`. 

  Syntax: `{ ( [ expression [ , ... ] ] ) | expression }` 
+ **GROUPING SETS **

  Groups the rows for each grouping set specified after `GROUPING SETS`. 

  For example, `GROUP BY GROUPING SETS ((warehouse), (product))` is semantically equivalent to the union of results of `GROUP BY warehouse` and `GROUP BY product`. This clause is a shorthand for a `UNION ALL` where each leg of the `UNION ALL` operator performs aggregation of each grouping set specified in the `GROUPING SETS` clause. 

  Similarly, `GROUP BY GROUPING SETS ((warehouse, product), (product), ())` is semantically equivalent to the union of results of `GROUP BY warehouse, product`, `GROUP BY product`, and the global aggregate. 
+ **ROLLUP** 

  Specifies multiple levels of aggregations in a single statement. This clause is used to compute aggregations based on multiple grouping sets. `ROLLUP` is a shorthand for `GROUPING SETS`. 

  For example, `GROUP BY warehouse, product WITH ROLLUP` or `GROUP BY ROLLUP(warehouse, product)` is equivalent to `GROUP BY GROUPING SETS((warehouse, product), (warehouse), ())`. 

  `GROUP BY ROLLUP(warehouse, product, (warehouse, location))` is equivalent to `GROUP BY GROUPING SETS((warehouse, product, location), (warehouse, product), (warehouse), ())`.

  The N elements of a `ROLLUP` specification result in N+1 `GROUPING SETS`. 
+ **CUBE** 

  The `CUBE` clause is used to perform aggregations based on combinations of the grouping columns specified in the `GROUP BY` clause. `CUBE` is a shorthand for `GROUPING SETS`. 

  For example, `GROUP BY warehouse, product WITH CUBE` or `GROUP BY CUBE(warehouse, product)` is equivalent to `GROUP BY GROUPING SETS((warehouse, product), (warehouse), (product), ())`. 

  `GROUP BY CUBE(warehouse, product, (warehouse, location))` is equivalent to `GROUP BY GROUPING SETS((warehouse, product, location), (warehouse, product), (warehouse, location), (product, warehouse, location), (warehouse), (product), (warehouse, product), ())`. The N elements of a `CUBE` specification results in 2^N `GROUPING SETS`. 
+ **Mixed/Nested Grouping Analytics **

  A `GROUP BY` clause can include multiple group\_expressions and multiple `CUBE|ROLLUP|GROUPING SETS`. `GROUPING SETS` can also have nested `CUBE|ROLLUP|GROUPING SETS` clauses, such as `GROUPING SETS(ROLLUP(warehouse, location), CUBE(warehouse, location))` or `GROUPING SETS(warehouse, GROUPING SETS(location, GROUPING SETS(ROLLUP(warehouse, location), CUBE(warehouse, location))))`. 

  `CUBE|ROLLUP` is just syntactic sugar for `GROUPING SETS`. Refer to the sections above for how to translate `CUBE|ROLLUP` to `GROUPING SETS`. `group_expression` can be treated as a single-group `GROUPING SETS` in this context. 

  For multiple `GROUPING SETS` in the `GROUP BY` clause, we generate a single `GROUPING SETS` by taking a cross-product of the original `GROUPING SETS`. For nested `GROUPING SETS` in the `GROUPING SETS` clause, we simply take their grouping sets and strip the nesting. 

  For example, `GROUP BY warehouse, GROUPING SETS((product), ()), GROUPING SETS((location, size), (location), (size), ())` and `GROUP BY warehouse, ROLLUP(product), CUBE(location, size)` are equivalent to `GROUP BY GROUPING SETS( (warehouse, product, location, size), (warehouse, product, location), (warehouse, product, size), (warehouse, product), (warehouse, location, size), (warehouse, location), (warehouse, size), (warehouse))`. 

  `GROUP BY GROUPING SETS(GROUPING SETS(warehouse), GROUPING SETS((warehouse, product)))` is equivalent to `GROUP BY GROUPING SETS((warehouse), (warehouse, product))`. 
+ **aggregate\_name**

  Specifies an aggregate function name (`MIN`, `MAX`, `COUNT`, `SUM`, `AVG`, and so on). 
+ **DISTINCT** 

  Removes duplicates in input rows before they are passed to aggregate functions. 
+ **FILTER** 

  Filters the input rows: only rows for which the `boolean_expression` in the `WHERE` clause evaluates to true are passed to the aggregate function; other rows are discarded. 

**Examples**

```
CREATE TABLE dealer (id INT, city STRING, car_model STRING, quantity INT);
INSERT INTO dealer VALUES
(100, 'Fremont', 'Honda Civic', 10),
(100, 'Fremont', 'Honda Accord', 15),
(100, 'Fremont', 'Honda CRV', 7),
(200, 'Dublin', 'Honda Civic', 20),
(200, 'Dublin', 'Honda Accord', 10),
(200, 'Dublin', 'Honda CRV', 3),
(300, 'San Jose', 'Honda Civic', 5),
(300, 'San Jose', 'Honda Accord', 8);

-- Sum of quantity per dealership. Group by `id`.
SELECT id, sum(quantity) FROM dealer GROUP BY id ORDER BY id;
+---+-------------+
| id|sum(quantity)|
+---+-------------+
|100|           32|
|200|           33|
|300|           13|
+---+-------------+

-- Use column position in GROUP by clause.
SELECT id, sum(quantity) FROM dealer GROUP BY 1 ORDER BY 1;
+---+-------------+
| id|sum(quantity)|
+---+-------------+
|100|           32|
|200|           33|
|300|           13|
+---+-------------+

-- Multiple aggregations.
-- 1. Sum of quantity per dealership.
-- 2. Max quantity per dealership.
SELECT id, sum(quantity) AS sum, max(quantity) AS max FROM dealer GROUP BY id ORDER BY id;
+---+---+---+
| id|sum|max|
+---+---+---+
|100| 32| 15|
|200| 33| 20|
|300| 13|  8|
+---+---+---+

-- Count the number of distinct dealer cities per car_model.
SELECT car_model, count(DISTINCT city) AS count FROM dealer GROUP BY car_model;
+------------+-----+
|   car_model|count|
+------------+-----+
| Honda Civic|    3|
|   Honda CRV|    2|
|Honda Accord|    3|
+------------+-----+

-- Sum of only 'Honda Civic' and 'Honda CRV' quantities per dealership.
SELECT id, sum(quantity) FILTER (
WHERE car_model IN ('Honda Civic', 'Honda CRV')
) AS `sum(quantity)` FROM dealer
GROUP BY id ORDER BY id;
+---+-------------+
| id|sum(quantity)|
+---+-------------+
|100|           17|
|200|           23|
|300|            5|
+---+-------------+

-- Aggregations using multiple sets of grouping columns in a single statement.
-- Following performs aggregations based on four sets of grouping columns.
-- 1. city, car_model
-- 2. city
-- 3. car_model
-- 4. Empty grouping set. Returns quantities for all city and car models.
SELECT city, car_model, sum(quantity) AS sum FROM dealer
GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ())
ORDER BY city;
+---------+------------+---+
|     city|   car_model|sum|
+---------+------------+---+
|     null|        null| 78|
|     null|Honda Accord| 33|
|     null|   Honda CRV| 10|
|     null| Honda Civic| 35|
|   Dublin|        null| 33|
|   Dublin|Honda Accord| 10|
|   Dublin|   Honda CRV|  3|
|   Dublin| Honda Civic| 20|
|  Fremont|        null| 32|
|  Fremont|Honda Accord| 15|
|  Fremont|   Honda CRV|  7|
|  Fremont| Honda Civic| 10|
| San Jose|        null| 13|
| San Jose|Honda Accord|  8|
| San Jose| Honda Civic|  5|
+---------+------------+---+

-- Group by processing with `ROLLUP` clause.
-- Equivalent GROUP BY GROUPING SETS ((city, car_model), (city), ())
SELECT city, car_model, sum(quantity) AS sum FROM dealer
GROUP BY city, car_model WITH ROLLUP
ORDER BY city, car_model;
+---------+------------+---+
|     city|   car_model|sum|
+---------+------------+---+
|     null|        null| 78|
|   Dublin|        null| 33|
|   Dublin|Honda Accord| 10|
|   Dublin|   Honda CRV|  3|
|   Dublin| Honda Civic| 20|
|  Fremont|        null| 32|
|  Fremont|Honda Accord| 15|
|  Fremont|   Honda CRV|  7|
|  Fremont| Honda Civic| 10|
| San Jose|        null| 13|
| San Jose|Honda Accord|  8|
| San Jose| Honda Civic|  5|
+---------+------------+---+

-- Group by processing with `CUBE` clause.
-- Equivalent GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ())
SELECT city, car_model, sum(quantity) AS sum FROM dealer
GROUP BY city, car_model WITH CUBE
ORDER BY city, car_model;
+---------+------------+---+
|     city|   car_model|sum|
+---------+------------+---+
|     null|        null| 78|
|     null|Honda Accord| 33|
|     null|   Honda CRV| 10|
|     null| Honda Civic| 35|
|   Dublin|        null| 33|
|   Dublin|Honda Accord| 10|
|   Dublin|   Honda CRV|  3|
|   Dublin| Honda Civic| 20|
|  Fremont|        null| 32|
|  Fremont|Honda Accord| 15|
|  Fremont|   Honda CRV|  7|
|  Fremont| Honda Civic| 10|
| San Jose|        null| 13|
| San Jose|Honda Accord|  8|
| San Jose| Honda Civic|  5|
+---------+------------+---+

--Prepare data for ignore nulls example
CREATE TABLE person (id INT, name STRING, age INT);
INSERT INTO person VALUES
(100, 'Mary', NULL),
(200, 'John', 30),
(300, 'Mike', 80),
(400, 'Dan', 50);

--Select the first row in column age
SELECT FIRST(age) FROM person;
+--------------------+
| first(age, false)  |
+--------------------+
| NULL               |
+--------------------+

--Get the first row in column `age` ignoring nulls, the last row in column `id`, and the sum of column `id`.
SELECT FIRST(age IGNORE NULLS), LAST(id), SUM(id) FROM person;
+-------------------+------------------+----------+
| first(age, true)  | last(id, false)  | sum(id)  |
+-------------------+------------------+----------+
| 30                | 400              | 1000     |
+-------------------+------------------+----------+
```

#### HAVING clause
<a name="supported-sql-having"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `HAVING` clause is used to filter the results produced by `GROUP BY` based on the specified condition. It is often used in conjunction with a `GROUP BY` clause. 

**Syntax** 

```
HAVING boolean_expression
```

**Parameters**
+ **boolean\_expression**

  Specifies any expression that evaluates to a result type boolean. Two or more expressions may be combined together using the logical operators ( `AND`, `OR` ). 

  **Note** The expressions specified in the `HAVING` clause can only refer to: 

  1. Constants 

  1. Expressions that appear in `GROUP BY` 

  1. Aggregate functions 

**Examples**

```
CREATE TABLE dealer (id INT, city STRING, car_model STRING, quantity INT);
INSERT INTO dealer VALUES
(100, 'Fremont', 'Honda Civic', 10),
(100, 'Fremont', 'Honda Accord', 15),
(100, 'Fremont', 'Honda CRV', 7),
(200, 'Dublin', 'Honda Civic', 20),
(200, 'Dublin', 'Honda Accord', 10),
(200, 'Dublin', 'Honda CRV', 3),
(300, 'San Jose', 'Honda Civic', 5),
(300, 'San Jose', 'Honda Accord', 8);

-- `HAVING` clause referring to column in `GROUP BY`.
SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING city = 'Fremont';
+-------+---+
|   city|sum|
+-------+---+
|Fremont| 32|
+-------+---+

-- `HAVING` clause referring to aggregate function.
SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING sum(quantity) > 15;
+-------+---+
|   city|sum|
+-------+---+
| Dublin| 33|
|Fremont| 32|
+-------+---+

-- `HAVING` clause referring to aggregate function by its alias.
SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING sum > 15;
+-------+---+
|   city|sum|
+-------+---+
| Dublin| 33|
|Fremont| 32|
+-------+---+

-- `HAVING` clause referring to a different aggregate function than what is present in
-- `SELECT` list.
SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING max(quantity) > 15;
+------+---+
|  city|sum|
+------+---+
|Dublin| 33|
+------+---+

-- `HAVING` clause referring to constant expression.
SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING 1 > 0 ORDER BY city;
+--------+---+
|    city|sum|
+--------+---+
|  Dublin| 33|
| Fremont| 32|
|San Jose| 13|
+--------+---+

-- `HAVING` clause without a `GROUP BY` clause.
SELECT sum(quantity) AS sum FROM dealer HAVING sum(quantity) > 10;
+---+
|sum|
+---+
| 78|
+---+
```

#### ORDER BY clause
<a name="supported-sql-order-by"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `ORDER BY` clause returns the result rows sorted in the user-specified order. Unlike the `SORT BY` clause, this clause guarantees a total order in the output. 

**Syntax** 

```
ORDER BY { expression [ sort_direction | nulls_sort_order ] [ , ... ] }
```

**Parameters**
+ **ORDER BY **

  Specifies a comma-separated list of expressions along with optional parameters `sort_direction` and `nulls_sort_order` which are used to sort the rows. 
+ **sort\_direction**

  Optionally specifies whether to sort the rows in ascending or descending order. 

  The valid values for the sort direction are `ASC` for ascending and `DESC` for descending. 

  If sort direction is not explicitly specified, then by default rows are sorted ascending. 

  Syntax: `[ ASC | DESC ] `
+ **nulls\_sort\_order**

  Optionally specifies whether `NULL` values are returned before/after non-NULL values. 

  If `nulls_sort_order` is not specified, then NULL values sort first if the sort order is `ASC` and last if the sort order is `DESC`. 

  1. If `NULLS FIRST` is specified, then NULL values are returned first regardless of the sort order. 

  2. If `NULLS LAST` is specified, then NULL values are returned last regardless of the sort order. 

  Syntax: `[ NULLS { FIRST | LAST } ]` 

**Examples**

```
CREATE TABLE person (id INT, name STRING, age INT);
INSERT INTO person VALUES
(100, 'John', 30),
(200, 'Mary', NULL),
(300, 'Mike', 80),
(400, 'Jerry', NULL),
(500, 'Dan',  50);

-- Sort rows by age. By default rows are sorted in ascending manner with NULLS FIRST.
SELECT name, age FROM person ORDER BY age;
+-----+----+
| name| age|
+-----+----+
|Jerry|null|
| Mary|null|
| John|  30|
|  Dan|  50|
| Mike|  80|
+-----+----+

-- Sort rows in ascending manner keeping null values last.
SELECT name, age FROM person ORDER BY age NULLS LAST;
+-----+----+
| name| age|
+-----+----+
| John|  30|
|  Dan|  50|
| Mike|  80|
| Mary|null|
|Jerry|null|
+-----+----+

-- Sort rows by age in descending manner, which defaults to NULLS LAST.
SELECT name, age FROM person ORDER BY age DESC;
+-----+----+
| name| age|
+-----+----+
| Mike|  80|
|  Dan|  50|
| John|  30|
|Jerry|null|
| Mary|null|
+-----+----+

-- Sort rows in descending manner keeping null values first.
SELECT name, age FROM person ORDER BY age DESC NULLS FIRST;
+-----+----+
| name| age|
+-----+----+
|Jerry|null|
| Mary|null|
| Mike|  80|
|  Dan|  50|
| John|  30|
+-----+----+

-- Sort rows based on more than one column with each column having different
-- sort direction.
SELECT * FROM person ORDER BY name ASC, age DESC;
+---+-----+----+
| id| name| age|
+---+-----+----+
|500|  Dan|  50|
|400|Jerry|null|
|100| John|  30|
|200| Mary|null|
|300| Mike|  80|
+---+-----+----+
```

#### JOIN clause
<a name="supported-sql-join"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

A SQL join is used to combine rows from two relations based on join criteria. The following section describes the overall join syntax and the different types of joins along with examples. 

**Syntax** 

```
relation [ join_type ] JOIN relation [ join_criteria ]
```

**Parameters**
+ **relation **

  Specifies the relation to be joined. 
+ **join\_type**

  Specifies the join type. 

  Syntax: `INNER | CROSS | LEFT OUTER`
+ **join\_criteria**

  Specifies how the rows from one relation will be combined with the rows of another relation. 

  Syntax: `ON boolean_expression | USING ( column_name [ , ... ] ) `
+ **boolean\_expression**

  Specifies an expression with a return type of boolean. 

**Join types**
+ **Inner Join**

  The inner join is the default join type. It selects rows that have matching values in both relations.

  Syntax: `relation INNER JOIN relation [ join_criteria ] `
+ **Left Join **

  A left join returns all values from the left relation and the matched values from the right relation, or appends NULL if there is no match. It is also referred to as a left outer join. 

  Syntax: `relation LEFT OUTER JOIN relation [ join_criteria ]` 
+ **Cross Join **

  A cross join returns the Cartesian product of two relations. 

  Syntax: `relation CROSS JOIN relation [ join_criteria ]` 

**Examples**

```
-- Use employee and department tables to demonstrate different type of joins.
SELECT * FROM employee;
+---+-----+------+
| id| name|deptno|
+---+-----+------+
|105|Chloe|     5|
|103| Paul|     3|
|101| John|     1|
|102| Lisa|     2|
|104| Evan|     4|
|106|  Amy|     6|
+---+-----+------+
SELECT * FROM department;
+------+-----------+
|deptno|   deptname|
+------+-----------+
|     3|Engineering|
|     2|      Sales|
|     1|  Marketing|
+------+-----------+

-- Use employee and department tables to demonstrate inner join.
SELECT id, name, employee.deptno, deptname
FROM employee INNER JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|103| Paul|     3|Engineering|
|101| John|     1|  Marketing|
|102| Lisa|     2|      Sales|
+---+-----+------+-----------+

-- Use employee and department tables to demonstrate left join.
SELECT id, name, employee.deptno, deptname
FROM employee LEFT JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|105|Chloe|     5|       NULL|
|103| Paul|     3|Engineering|
|101| John|     1|  Marketing|
|102| Lisa|     2|      Sales|
|104| Evan|     4|       NULL|
|106|  Amy|     6|       NULL|
+---+-----+------+-----------+

-- Use employee and department tables to demonstrate cross join.
SELECT id, name, employee.deptno, deptname FROM employee CROSS JOIN department;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|105|Chloe|     5|Engineering|
|105|Chloe|     5|  Marketing|
|105|Chloe|     5|      Sales|
|103| Paul|     3|Engineering|
|103| Paul|     3|  Marketing|
|103| Paul|     3|      Sales|
|101| John|     1|Engineering|
|101| John|     1|  Marketing|
|101| John|     1|      Sales|
|102| Lisa|     2|Engineering|
|102| Lisa|     2|  Marketing|
|102| Lisa|     2|      Sales|
|104| Evan|     4|Engineering|
|104| Evan|     4|  Marketing|
|104| Evan|     4|      Sales|
|106|  Amy|     6|Engineering|
|106|  Amy|     6|  Marketing|
|106|  Amy|     6|      Sales|
+---+-----+------+-----------+
```

#### LIMIT clause
<a name="supported-sql-limit"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `LIMIT` clause is used to constrain the number of rows returned by the `SELECT` statement. In general, this clause is used in conjunction with `ORDER BY` to ensure that the results are deterministic. 

**Syntax** 

```
LIMIT { ALL | integer_expression }
```

**Parameters**
+ **ALL**

  If specified, the query returns all the rows. In other words, no limit is applied if this option is specified. 
+ **integer\_expression**

  Specifies a foldable expression that returns an integer. 

**Examples**

```
CREATE TABLE person (name STRING, age INT);
INSERT INTO person VALUES
('Jane Doe', 25),
('Pat C', 18),
('Nikki W', 16),
('John D', 25),
('Juan L', 18),
('Jorge S', 16);

-- Select the first two rows.
SELECT name, age FROM person ORDER BY name LIMIT 2;
+--------+---+
|    name|age|
+--------+---+
|Jane Doe| 25|
|  John D| 25|
+--------+---+

-- Specifying ALL option on LIMIT returns all the rows.
SELECT name, age FROM person ORDER BY name LIMIT ALL;
+--------+---+
|    name|age|
+--------+---+
|Jane Doe| 25|
|  John D| 25|
| Jorge S| 16|
|  Juan L| 18|
| Nikki W| 16|
|   Pat C| 18|
+--------+---+

-- A function expression as an input to LIMIT. length('OPENSEARCH')
-- evaluates to 10, which exceeds the row count, so all rows are returned.
SELECT name, age FROM person ORDER BY name LIMIT length('OPENSEARCH');
+--------+---+
|    name|age|
+--------+---+
|Jane Doe| 25|
|  John D| 25|
| Jorge S| 16|
|  Juan L| 18|
| Nikki W| 16|
|   Pat C| 18|
+--------+---+
```

#### CASE clause
<a name="supported-sql-case"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `CASE` clause uses a rule to return a specific result based on the specified condition, similar to if/else statements in other programming languages. 

**Syntax** 

```
CASE [ expression ] { WHEN boolean_expression THEN then_expression } [ ... ]
[ ELSE else_expression ]
END
```

**Parameters**
+ **boolean\_expression**

  Specifies any expression that evaluates to a result type boolean. 

  Two or more expressions may be combined together using the logical operators ( `AND`, `OR` ). 
+ **then\_expression**

  Specifies the then expression based on the `boolean_expression` condition.

  `then_expression` and `else_expression` should all be the same type or coercible to a common type. 
+ **else\_expression**

  Specifies the default expression.

  `then_expression` and `else_expression` should all be the same type or coercible to a common type. 

**Examples**

```
CREATE TABLE person (id INT, name STRING, age INT);
INSERT INTO person VALUES
(100, 'John', 30),
(200, 'Mary', NULL),
(300, 'Mike', 80),
(400, 'Dan', 50);
SELECT id, CASE WHEN id > 200 THEN 'bigger' ELSE 'small' END FROM person;
+------+--------------------------------------------------+
|  id  | CASE WHEN (id > 200) THEN bigger ELSE small END  |
+------+--------------------------------------------------+
| 100  | small                                            |
| 200  | small                                            |
| 300  | bigger                                           |
| 400  | bigger                                           |
+------+--------------------------------------------------+
SELECT id, CASE id WHEN 100 THEN 'bigger' WHEN id > 300 THEN '300' ELSE 'small' END FROM person;
+------+-----------------------------------------------------------------------------------------------+
|  id  | CASE WHEN (id = 100) THEN bigger WHEN (id = CAST((id > 300) AS INT)) THEN 300 ELSE small END  |
+------+-----------------------------------------------------------------------------------------------+
| 100  | bigger                                                                                        |
| 200  | small                                                                                         |
| 300  | small                                                                                         |
| 400  | small                                                                                         |
+------+-----------------------------------------------------------------------------------------------+
```

#### Common table expression
<a name="supported-sql-cte"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

A common table expression (CTE) defines a temporary result set that a user can reference possibly multiple times within the scope of a SQL statement. A CTE is used mainly in a `SELECT` statement. 

**Syntax** 

```
WITH common_table_expression [ , ... ]
```

where `common_table_expression` is defined as:

```
expression_name [ ( column_name [ , ... ] ) ] [ AS ] ( query )
```

**Parameters** 
+ **expression\_name**

  Specifies a name for the common table expression. 
+ **query** 

  A `SELECT` statement. 

**Examples**

```
-- CTE with multiple column aliases
WITH t(x, y) AS (SELECT 1, 2)
SELECT * FROM t WHERE x = 1 AND y = 2;
+---+---+
|  x|  y|
+---+---+
|  1|  2|
+---+---+

-- CTE in CTE definition
WITH t AS (
WITH t2 AS (SELECT 1)
SELECT * FROM t2
)
SELECT * FROM t;
+---+
|  1|
+---+
|  1|
+---+

-- CTE in subquery
SELECT max(c) FROM (
WITH t(c) AS (SELECT 1)
SELECT * FROM t
);
+------+
|max(c)|
+------+
|     1|
+------+

-- CTE in subquery expression
SELECT (
WITH t AS (SELECT 1)
SELECT * FROM t
);
+----------------+
|scalarsubquery()|
+----------------+
|               1|
+----------------+

-- CTE in CREATE VIEW statement
CREATE VIEW v AS
WITH t(a, b, c, d) AS (SELECT 1, 2, 3, 4)
SELECT * FROM t;
SELECT * FROM v;
+---+---+---+---+
|  a|  b|  c|  d|
+---+---+---+---+
|  1|  2|  3|  4|
+---+---+---+---+
```

#### EXPLAIN
<a name="supported-sql-explain"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `EXPLAIN` statement is used to provide logical/physical plans for an input statement. By default, this clause provides information about a physical plan only. 

**Syntax** 

```
EXPLAIN [ EXTENDED | CODEGEN | COST | FORMATTED ] statement
```

**Parameters**
+ **EXTENDED** 

  Generates the parsed logical plan, the analyzed logical plan, the optimized logical plan, and the physical plan. 

  The parsed logical plan is an unresolved plan extracted from the query. 

  The analyzed logical plan resolves the parsed plan by translating `unresolvedAttribute` and `unresolvedRelation` references into fully typed objects. 

  The optimized logical plan is produced by applying a set of optimization rules to the analyzed plan; it is then converted into the physical plan. 
+ **CODEGEN** 

  Generates the physical plan and, where applicable, generated code for the statement. 
+ **COST** 

  If plan node statistics are available, generates a logical plan and the statistics. 
+ **FORMATTED** 

  Generates two sections: a physical plan outline and node details. 
+ **statement** 

  Specifies a SQL statement to be explained. 

**Examples**

```
-- Default Output
EXPLAIN select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;
+----------------------------------------------------+
|                                                plan|
+----------------------------------------------------+
| == Physical Plan ==
*(2) HashAggregate(keys=[k#33], functions=[sum(cast(v#34 as bigint))])
+- Exchange hashpartitioning(k#33, 200), true, [id=#59]
+- *(1) HashAggregate(keys=[k#33], functions=[partial_sum(cast(v#34 as bigint))])
+- *(1) LocalTableScan [k#33, v#34]
|
+----------------------------------------------------+

-- Using Extended
EXPLAIN EXTENDED select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;
+----------------------------------------------------+
|                                                plan|
+----------------------------------------------------+
| == Parsed Logical Plan ==
'Aggregate ['k], ['k, unresolvedalias('sum('v), None)]
 +- 'SubqueryAlias `t`
+- 'UnresolvedInlineTable [k, v], [List(1, 2), List(1, 3)]
   
 == Analyzed Logical Plan ==
 k: int, sum(v): bigint
 Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L]
 +- SubqueryAlias `t`
    +- LocalRelation [k#47, v#48]
   
 == Optimized Logical Plan ==
 Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L]
 +- LocalRelation [k#47, v#48]
   
 == Physical Plan ==
 *(2) HashAggregate(keys=[k#47], functions=[sum(cast(v#48 as bigint))], output=[k#47, sum(v)#50L])
+- Exchange hashpartitioning(k#47, 200), true, [id=#79]
   +- *(1) HashAggregate(keys=[k#47], functions=[partial_sum(cast(v#48 as bigint))], output=[k#47, sum#52L])
    +- *(1) LocalTableScan [k#47, v#48]
|
+----------------------------------------------------+

-- Using Formatted
EXPLAIN FORMATTED select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;
+----------------------------------------------------+
|                                                plan|
+----------------------------------------------------+
| == Physical Plan ==
 * HashAggregate (4)
 +- Exchange (3)
    +- * HashAggregate (2)
       +- * LocalTableScan (1)
   
   
 (1) LocalTableScan [codegen id : 1]
 Output: [k#19, v#20]
        
 (2) HashAggregate [codegen id : 1]
 Input: [k#19, v#20]
        
 (3) Exchange
 Input: [k#19, sum#24L]
        
 (4) HashAggregate [codegen id : 2]
 Input: [k#19, sum#24L]
|
+----------------------------------------------------+
```

#### LATERAL SUBQUERY clause
<a name="supported-sql-lateral-subquery"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

A `LATERAL SUBQUERY` is a subquery preceded by the keyword `LATERAL`. It provides a way to reference columns in the preceding `FROM` clause. Without the `LATERAL` keyword, a subquery can refer only to columns in the outer query, not in the preceding `FROM` clause. `LATERAL SUBQUERY` makes complicated queries simpler and more efficient. 

**Syntax** 

```
[ LATERAL ] primary_relation [ join_relation ]
```

**Parameters**
+ **primary\_relation**

  Specifies the primary relation. It can be one of the following: 

  1. Table relation 

  1. Aliased query 

     Syntax: `( query ) [ [ AS ] alias ] `

  1. Aliased relation 

     Syntax: `( relation ) [ [ AS ] alias ]` 

**Examples**

```
CREATE TABLE t1 (c1 INT, c2 INT);
INSERT INTO t1 VALUES (0, 1), (1, 2);
CREATE TABLE t2 (c1 INT, c2 INT);
INSERT INTO t2 VALUES (0, 2), (0, 3);
SELECT * FROM t1,
LATERAL (SELECT * FROM t2 WHERE t1.c1 = t2.c1);
+-------+-------+-------+-------+
| t1.c1 | t1.c2 | t2.c1 | t2.c2 |
+-------+-------+-------+-------+
|   0   |   1   |   0   |   3   |
|   0   |   1   |   0   |   2   |
+-------+-------+-------+-------+
SELECT a, b, c FROM t1,
LATERAL (SELECT c1 + c2 AS a),
LATERAL (SELECT c1 - c2 AS b),
LATERAL (SELECT a * b AS c);
+-------+-------+-------+
|   a   |   b   |   c   |
+-------+-------+-------+
|   3   |  -1   |  -3   |
|   1   |  -1   |  -1   |
+-------+-------+-------+
```

#### LATERAL VIEW clause
<a name="supported-sql-lateral-view"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `LATERAL VIEW` clause is used in conjunction with generator functions such as `EXPLODE`, which will generate a virtual table containing one or more rows. `LATERAL VIEW` will apply the rows to each original output row. 

**Syntax** 

```
LATERAL VIEW [ OUTER ] generator_function ( expression [ , ... ] ) [ table_alias ] AS column_alias [ , ... ]
```

**Parameters**
+ **OUTER**

  If `OUTER` is specified, returns null if an input array/map is empty or null. 
+ **generator\_function**

  Specifies a generator function (`EXPLODE`, `INLINE`, and so on). 
+ **table\_alias**

  The alias for `generator_function`, which is optional. 
+ **column\_alias**

  Lists the column aliases of `generator_function`, which may be used in output rows. 

  You can have multiple aliases if `generator_function` has multiple output columns. 

**Examples**

```
CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
SELECT * FROM person
LATERAL VIEW EXPLODE(ARRAY(30, 60)) tableName AS c_age
LATERAL VIEW EXPLODE(ARRAY(40, 80)) AS d_age;
+------+-------+-------+--------+-----------+--------+--------+
|  id  | name  |  age  | class  |  address  | c_age  | d_age  |
+------+-------+-------+--------+-----------+--------+--------+
| 100  | John  | 30    | 1      | Street 1  | 30     | 40     |
| 100  | John  | 30    | 1      | Street 1  | 30     | 80     |
| 100  | John  | 30    | 1      | Street 1  | 60     | 40     |
| 100  | John  | 30    | 1      | Street 1  | 60     | 80     |
| 200  | Mary  | NULL  | 1      | Street 2  | 30     | 40     |
| 200  | Mary  | NULL  | 1      | Street 2  | 30     | 80     |
| 200  | Mary  | NULL  | 1      | Street 2  | 60     | 40     |
| 200  | Mary  | NULL  | 1      | Street 2  | 60     | 80     |
| 300  | Mike  | 80    | 3      | Street 3  | 30     | 40     |
| 300  | Mike  | 80    | 3      | Street 3  | 30     | 80     |
| 300  | Mike  | 80    | 3      | Street 3  | 60     | 40     |
| 300  | Mike  | 80    | 3      | Street 3  | 60     | 80     |
| 400  | Dan   | 50    | 4      | Street 4  | 30     | 40     |
| 400  | Dan   | 50    | 4      | Street 4  | 30     | 80     |
| 400  | Dan   | 50    | 4      | Street 4  | 60     | 40     |
| 400  | Dan   | 50    | 4      | Street 4  | 60     | 80     |
+------+-------+-------+--------+-----------+--------+--------+
SELECT c_age, COUNT(1) FROM person
LATERAL VIEW EXPLODE(ARRAY(30, 60)) AS c_age
LATERAL VIEW EXPLODE(ARRAY(40, 80)) AS d_age
GROUP BY c_age;
+--------+-----------+
| c_age  | count(1)  |
+--------+-----------+
| 60     | 8         |
| 30     | 8         |
+--------+-----------+
SELECT * FROM person
LATERAL VIEW EXPLODE(ARRAY()) tableName AS c_age;
+-----+-------+------+--------+----------+--------+
| id  | name  | age  | class  | address  | c_age  |
+-----+-------+------+--------+----------+--------+
+-----+-------+------+--------+----------+--------+
SELECT * FROM person
LATERAL VIEW OUTER EXPLODE(ARRAY()) tableName AS c_age;
+------+-------+-------+--------+-----------+--------+
|  id  | name  |  age  | class  |  address  | c_age  |
+------+-------+-------+--------+-----------+--------+
| 100  | John  | 30    | 1      | Street 1  | NULL   |
| 200  | Mary  | NULL  | 1      | Street 2  | NULL   |
| 300  | Mike  | 80    | 3      | Street 3  | NULL   |
| 400  | Dan   | 50    | 4      | Street 4  | NULL   |
+------+-------+-------+--------+-----------+--------+
```

#### LIKE predicate
<a name="supported-sql-like-predicate"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

A `LIKE` predicate is used to search for a specific pattern. This predicate also supports matching multiple patterns using the quantifiers `ANY`, `SOME`, and `ALL`. 

**Syntax** 

```
[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | [ RLIKE | REGEXP ] regex_pattern }
[ NOT ] { LIKE quantifiers ( search_pattern [ , ... ]) }
```

**Parameters**
+ **search\_pattern**

  Specifies a string pattern to be searched by the `LIKE` clause. It can contain special pattern-matching characters: 
  + `%` matches zero or more characters. 
  + `_` matches exactly one character. 
+ **esc\_char**

  Specifies the escape character. The default escape character is `\`. 
+ **regex\_pattern**

  Specifies a regular expression search pattern to be searched by the `RLIKE` or `REGEXP` clause. 
+ **quantifiers** 

  Specifies one of the predicate quantifiers `ANY`, `SOME`, and `ALL`. 

  `ANY` or `SOME` returns true if at least one of the patterns matches the input.

  `ALL` returns true only if all of the patterns match the input. 

**Examples**

```
CREATE TABLE person (id INT, name STRING, age INT);
INSERT INTO person VALUES
(100, 'John', 30),
(200, 'Mary', NULL),
(300, 'Mike', 80),
(400, 'Dan',  50),
(500, 'Evan_W', 16);
SELECT * FROM person WHERE name LIKE 'M%';
+---+----+----+
| id|name| age|
+---+----+----+
|300|Mike|  80|
|200|Mary|null|
+---+----+----+
SELECT * FROM person WHERE name LIKE 'M_ry';
+---+----+----+
| id|name| age|
+---+----+----+
|200|Mary|null|
+---+----+----+
SELECT * FROM person WHERE name NOT LIKE 'M_ry';
+---+------+---+
| id|  name|age|
+---+------+---+
|500|Evan_W| 16|
|300|  Mike| 80|
|100|  John| 30|
|400|   Dan| 50|
+---+------+---+
SELECT * FROM person WHERE name RLIKE 'M+';
+---+----+----+
| id|name| age|
+---+----+----+
|300|Mike|  80|
|200|Mary|null|
+---+----+----+
SELECT * FROM person WHERE name REGEXP 'M+';
+---+----+----+
| id|name| age|
+---+----+----+
|300|Mike|  80|
|200|Mary|null|
+---+----+----+
SELECT * FROM person WHERE name LIKE '%\_%';
+---+------+---+
| id|  name|age|
+---+------+---+
|500|Evan_W| 16|
+---+------+---+
SELECT * FROM person WHERE name LIKE '%$_%' ESCAPE '$';
+---+------+---+
| id|  name|age|
+---+------+---+
|500|Evan_W| 16|
+---+------+---+
SELECT * FROM person WHERE name LIKE ALL ('%an%', '%an');
+---+----+----+
| id|name| age|
+---+----+----+
|400| Dan|  50|
+---+----+----+
SELECT * FROM person WHERE name LIKE ANY ('%an%', '%an');
+---+------+---+
| id|  name|age|
+---+------+---+
|400|   Dan| 50|
|500|Evan_W| 16|
+---+------+---+
SELECT * FROM person WHERE name LIKE SOME ('%an%', '%an');
+---+------+---+
| id|  name|age|
+---+------+---+
|400|   Dan| 50|
|500|Evan_W| 16|
+---+------+---+
SELECT * FROM person WHERE name NOT LIKE ALL ('%an%', '%an');
+---+----+----+
| id|name| age|
+---+----+----+
|100|John|  30|
|200|Mary|null|
|300|Mike|  80|
+---+----+----+
SELECT * FROM person WHERE name NOT LIKE ANY ('%an%', '%an');
+---+------+----+
| id|  name| age|
+---+------+----+
|100|  John|  30|
|200|  Mary|null|
|300|  Mike|  80|
|500|Evan_W|  16|
+---+------+----+
SELECT * FROM person WHERE name NOT LIKE SOME ('%an%', '%an');
+---+------+----+
| id|  name| age|
+---+------+----+
|100|  John|  30|
|200|  Mary|null|
|300|  Mike|  80|
|500|Evan_W|  16|
+---+------+----+
```

#### OFFSET
<a name="supported-sql-offset"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `OFFSET` clause specifies the number of rows to skip before the `SELECT` statement begins to return rows. In general, this clause is used in conjunction with `ORDER BY` to ensure that the results are deterministic. 

**Syntax** 

```
OFFSET integer_expression
```

**Parameters**
+ **integer\_expression**

  Specifies a foldable expression that returns an integer. 

**Examples**

```
CREATE TABLE person (name STRING, age INT);
INSERT INTO person VALUES
('Jane Doe', 25),
('Pat C', 18),
('Nikki W', 16),
('Juan L', 25),
('John D', 18),
('Jorge S', 16);

-- Skip the first two rows.
SELECT name, age FROM person ORDER BY name OFFSET 2;
+-------+---+
|   name|age|
+-------+---+
|Jorge S| 16|
| Juan L| 25|
|Nikki W| 16|
|  Pat C| 18|
+-------+---+

-- Skip the first two rows and return the next three rows.
SELECT name, age FROM person ORDER BY name LIMIT 3 OFFSET 2;
+-------+---+
|   name|age|
+-------+---+
|Jorge S| 16|
| Juan L| 25|
|Nikki W| 16|
+-------+---+

-- A function expression as an input to OFFSET. length('WAGON')
-- evaluates to 5, so the first five rows are skipped.
SELECT name, age FROM person ORDER BY name OFFSET length('WAGON');
+-------+---+
|   name|age|
+-------+---+
|  Pat C| 18|
+-------+---+
```

#### PIVOT clause
<a name="supported-sql-pivot"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `PIVOT` clause rotates rows into columns: it returns aggregated values based on specific column values, which are turned into multiple columns used in the `SELECT` clause. The `PIVOT` clause can be specified after the table name or a subquery. 

**Syntax** 

```
PIVOT ( { aggregate_expression [ AS aggregate_expression_alias ] } [ , ... ] FOR column_list IN ( expression_list ) ) 
```

**Parameters** 
+ **aggregate\_expression**

  Specifies an aggregate expression (`SUM(a)`, `COUNT(DISTINCT b)`, and so on). 
+ **aggregate\_expression\_alias**

  Specifies an alias for the aggregate expression. 
+ **column\_list**

  Contains columns in the `FROM` clause, which specifies the columns you want to replace with new columns. You can use brackets to surround the columns, such as `(c1, c2)`. 
+ **expression\_list**

  Specifies new columns, which are used to match values in `column_list` as the aggregating condition. You can also add aliases for them. 

**Examples**

```
CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
SELECT * FROM person
PIVOT (
SUM(age) AS a, AVG(class) AS c
FOR name IN ('John' AS john, 'Mike' AS mike)
);
+------+-----------+---------+---------+---------+---------+
|  id  |  address  | john_a  | john_c  | mike_a  | mike_c  |
+------+-----------+---------+---------+---------+---------+
| 200  | Street 2  | NULL    | NULL    | NULL    | NULL    |
| 100  | Street 1  | 30      | 1.0     | NULL    | NULL    |
| 300  | Street 3  | NULL    | NULL    | 80      | 3.0     |
| 400  | Street 4  | NULL    | NULL    | NULL    | NULL    |
+------+-----------+---------+---------+---------+---------+
SELECT * FROM person
PIVOT (
SUM(age) AS a, AVG(class) AS c
FOR (name, age) IN (('John', 30) AS c1, ('Mike', 40) AS c2)
);
+------+-----------+-------+-------+-------+-------+
|  id  |  address  | c1_a  | c1_c  | c2_a  | c2_c  |
+------+-----------+-------+-------+-------+-------+
| 200  | Street 2  | NULL  | NULL  | NULL  | NULL  |
| 100  | Street 1  | 30    | 1.0   | NULL  | NULL  |
| 300  | Street 3  | NULL  | NULL  | NULL  | NULL  |
| 400  | Street 4  | NULL  | NULL  | NULL  | NULL  |
+------+-----------+-------+-------+-------+-------+
```
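Conceptually, `PIVOT` groups rows by the non-pivoted columns and spreads each aggregate across one output column per value in the `IN` list. This Python sketch mirrors the first example above; because each pivot value matches at most one row here, each `SUM` and `AVG` reduces to that row's own value:

```python
# A sketch of PIVOT semantics for the first example above.
rows = [
    {"id": 100, "name": "John", "age": 30, "class": 1, "address": "Street 1"},
    {"id": 200, "name": "Mary", "age": None, "class": 1, "address": "Street 2"},
    {"id": 300, "name": "Mike", "age": 80, "class": 3, "address": "Street 3"},
    {"id": 400, "name": "Dan", "age": 50, "class": 4, "address": "Street 4"},
]

# FOR name IN ('John' AS john, 'Mike' AS mike), with SUM(age) AS a, AVG(class) AS c.
pivot_values = {"John": "john", "Mike": "mike"}
result = {}
for r in rows:
    # Group by the columns that are neither aggregated nor pivoted.
    key = (r["id"], r["address"])
    out = result.setdefault(key, {f"{alias}_{agg}": None
                                  for alias in pivot_values.values()
                                  for agg in ("a", "c")})
    alias = pivot_values.get(r["name"])
    if alias is not None:
        out[f"{alias}_a"] = r["age"]            # SUM over the single matching row
        out[f"{alias}_c"] = float(r["class"])   # AVG over the single matching row
```

Rows whose `name` is outside the `IN` list (Mary, Dan) still produce an output group, but every pivoted column stays NULL.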

#### Set operators
<a name="supported-sql-set"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

Set operators are used to combine two input relations into a single one. OpenSearch SQL supports three types of set operators: 
+ `EXCEPT` or `MINUS`
+ `INTERSECT` 
+ `UNION` 

Input relations must have the same number of columns and compatible data types for the respective columns. 

**EXCEPT** 

`EXCEPT` and `EXCEPT ALL` return the rows that are found in one relation but not the other. `EXCEPT` (alternatively, `EXCEPT DISTINCT`) takes only distinct rows while `EXCEPT ALL` does not remove duplicates from the result rows. Note that `MINUS` is an alias for `EXCEPT`. 

**Syntax** 

```
 [ ( ] relation [ ) ] EXCEPT | MINUS [ ALL | DISTINCT ] [ ( ] relation [ ) ] 
```

**Examples**

```
-- The following examples use table1 and table2 to demonstrate the set operators.
SELECT * FROM table1;
+---+
|  c|
+---+
|  3|
|  1|
|  2|
|  2|
|  3|
|  4|
+---+
SELECT * FROM table2;
+---+
|  c|
+---+
|  5|
|  1|
|  2|
|  2|
+---+
SELECT c FROM table1 EXCEPT SELECT c FROM table2;
+---+
|  c|
+---+
|  3|
|  4|
+---+
SELECT c FROM table1 MINUS SELECT c FROM table2;
+---+
|  c|
+---+
|  3|
|  4|
+---+
SELECT c FROM table1 EXCEPT ALL (SELECT c FROM table2);
+---+
|  c|
+---+
|  3|
|  3|
|  4|
+---+
SELECT c FROM table1 MINUS ALL (SELECT c FROM table2);
+---+
|  c|
+---+
|  3|
|  3|
|  4|
+---+
```
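`EXCEPT` versus `EXCEPT ALL` comes down to set difference versus multiset difference. A minimal Python sketch of the same `table1`/`table2` data:

```python
from collections import Counter

table1 = [3, 1, 2, 2, 3, 4]
table2 = [5, 1, 2, 2]

# EXCEPT (DISTINCT): distinct values in table1 that are absent from table2.
except_distinct = sorted(set(table1) - set(table2))  # [3, 4]

# EXCEPT ALL: multiset difference -- each occurrence in table2
# cancels one occurrence in table1.
except_all = sorted((Counter(table1) - Counter(table2)).elements())  # [3, 3, 4]
```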

**INTERSECT** 

`INTERSECT` and `INTERSECT ALL` return the rows that are found in both relations. `INTERSECT` (alternatively, `INTERSECT DISTINCT`) takes only distinct rows while `INTERSECT ALL` does not remove duplicates from the result rows.

**Syntax** 

```
 [ ( ] relation [ ) ] INTERSECT [ ALL | DISTINCT ] [ ( ] relation [ ) ]
```

**Examples**

```
(SELECT c FROM table1) INTERSECT (SELECT c FROM table2);
+---+
|  c|
+---+
|  1|
|  2|
+---+
(SELECT c FROM table1) INTERSECT DISTINCT (SELECT c FROM table2);
+---+
|  c|
+---+
|  1|
|  2|
+---+
(SELECT c FROM table1) INTERSECT ALL (SELECT c FROM table2);
+---+
|  c|
+---+
|  1|
|  2|
|  2|
+---+
```

**UNION** 

`UNION` and `UNION ALL` return the rows that are found in either relation. `UNION` (alternatively, `UNION DISTINCT`) takes only distinct rows while `UNION ALL` does not remove duplicates from the result rows.

**Syntax** 

```
 [ ( ] relation [ ) ] UNION [ ALL | DISTINCT ] [ ( ] relation [ ) ]
```

**Examples**

```
(SELECT c FROM table1) UNION (SELECT c FROM table2);
+---+
|  c|
+---+
|  1|
|  3|
|  5|
|  4|
|  2|
+---+
(SELECT c FROM table1) UNION DISTINCT (SELECT c FROM table2);
+---+
|  c|
+---+
|  1|
|  3|
|  5|
|  4|
|  2|
+---+
SELECT c FROM table1 UNION ALL (SELECT c FROM table2);
+---+
|  c|
+---+
|  3|
|  1|
|  2|
|  2|
|  3|
|  4|
|  5|
|  1|
|  2|
|  2|
+---+
```
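`INTERSECT ALL` and `UNION ALL` follow the same multiset logic. A Python sketch with the same data:

```python
from collections import Counter

table1 = [3, 1, 2, 2, 3, 4]
table2 = [5, 1, 2, 2]

# INTERSECT ALL: multiset intersection keeps the minimum count per value.
intersect_all = sorted((Counter(table1) & Counter(table2)).elements())  # [1, 2, 2]

# UNION removes duplicates; UNION ALL simply concatenates both inputs.
union_distinct = sorted(set(table1) | set(table2))  # [1, 2, 3, 4, 5]
union_all = table1 + table2                         # 10 rows, duplicates kept
```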

#### SORT BY clause
<a name="supported-sql-sort-by"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `SORT BY` clause returns the result rows sorted within each partition in the user-specified order. When there is more than one partition, `SORT BY` may return a result that is only partially ordered. This differs from the `ORDER BY` clause, which guarantees a total order of the output. 

**Syntax** 

```
SORT BY { expression [ sort_direction | nulls_sort_order ] [ , ... ] }
```

**Parameters**
+ **SORT BY**

  Specifies a comma-separated list of expressions along with the optional parameters sort_direction and nulls_sort_order, which are used to sort the rows within each partition. 
+ **sort_direction**

  Optionally specifies whether to sort the rows in ascending or descending order. 

  The valid values for the sort direction are `ASC` for ascending and `DESC` for descending. 

  If a sort direction is not explicitly specified, rows are sorted in ascending order by default. 

  Syntax: `[ ASC | DESC ]` 
+ **nulls_sort_order**

  Optionally specifies whether NULL values are returned before or after non-NULL values. 

  If `nulls_sort_order` is not specified, NULLs sort first if the sort order is `ASC` and last if the sort order is `DESC`. 

  1. If `NULLS FIRST` is specified, then NULL values are returned first regardless of the sort order. 

  2. If `NULLS LAST` is specified, then NULL values are returned last regardless of the sort order. 

  Syntax: `[ NULLS { FIRST | LAST } ]`

**Examples**

```
CREATE TABLE person (zip_code INT, name STRING, age INT);
INSERT INTO person VALUES
(94588, 'Shirley Rodriguez', 50),
(94588, 'Juan Li', 18),
(94588, 'Anil K', 27),
(94588, 'John D', NULL),
(94511, 'David K', 42),
(94511, 'Aryan B.', 18),
(94511, 'Lalit B.', NULL);
-- Sort rows by `name` within each partition in ascending manner
SELECT name, age, zip_code FROM person SORT BY name;
+------------------+----+--------+
|              name| age|zip_code|
+------------------+----+--------+
|            Anil K|  27|   94588|
|            John D|null|   94588|
|           Juan Li|  18|   94588|
| Shirley Rodriguez|  50|   94588|
|          Aryan B.|  18|   94511|
|           David K|  42|   94511|
|          Lalit B.|null|   94511|
+------------------+----+--------+
-- Sort rows within each partition using column position.
SELECT name, age, zip_code FROM person SORT BY 1;
+------------------+----+--------+
|              name| age|zip_code|
+------------------+----+--------+
|            Anil K|  27|   94588|
|            John D|null|   94588|
|           Juan Li|  18|   94588|
| Shirley Rodriguez|  50|   94588|
|          Aryan B.|  18|   94511|
|           David K|  42|   94511|
|          Lalit B.|null|   94511|
+------------------+----+--------+

-- Sort rows within partition in ascending manner keeping null values to be last.
SELECT age, name, zip_code FROM person SORT BY age NULLS LAST;
+----+------------------+--------+
| age|              name|zip_code|
+----+------------------+--------+
|  18|           Juan Li|   94588|
|  27|            Anil K|   94588|
|  50| Shirley Rodriguez|   94588|
|null|            John D|   94588|
|  18|          Aryan B.|   94511|
|  42|           David K|   94511|
|null|          Lalit B.|   94511|
+----+------------------+--------+

-- Sort rows by age within each partition in descending manner, which defaults to NULLS LAST.
SELECT age, name, zip_code FROM person SORT BY age DESC;
+----+------------------+--------+
| age|              name|zip_code|
+----+------------------+--------+
|  50| Shirley Rodriguez|   94588|
|  27|            Anil K|   94588|
|  18|           Juan Li|   94588|
|null|            John D|   94588|
|  42|           David K|   94511|
|  18|          Aryan B.|   94511|
|null|          Lalit B.|   94511|
+----+------------------+--------+

-- Sort rows by age within each partition in descending manner keeping null values to be first.
SELECT age, name, zip_code FROM person SORT BY age DESC NULLS FIRST;
+----+------------------+--------+
| age|              name|zip_code|
+----+------------------+--------+
|null|            John D|   94588|
|  50| Shirley Rodriguez|   94588|
|  27|            Anil K|   94588|
|  18|           Juan Li|   94588|
|null|          Lalit B.|   94511|
|  42|           David K|   94511|
|  18|          Aryan B.|   94511|
+----+------------------+--------+

-- Sort rows within each partition based on more than one column with each column having
-- different sort direction.
SELECT name, age, zip_code FROM person
SORT BY name ASC, age DESC;
+------------------+----+--------+
|              name| age|zip_code|
+------------------+----+--------+
|            Anil K|  27|   94588|
|            John D|null|   94588|
|           Juan Li|  18|   94588|
| Shirley Rodriguez|  50|   94588|
|          Aryan B.|  18|   94511|
|           David K|  42|   94511|
|          Lalit B.|null|   94511|
+------------------+----+--------+
```
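The partition-local behavior of `SORT BY` can be sketched in Python. Here `zip_code` stands in for the partition key purely for illustration; the engine's actual partitioning is physical, not column-based:

```python
# SORT BY orders rows only within each partition; ORDER BY sorts globally.
rows = [
    (94588, "Shirley Rodriguez", 50),
    (94588, "Juan Li", 18),
    (94511, "David K", 42),
    (94511, "Aryan B.", 18),
]

partitions = {}
for zip_code, name, age in rows:
    partitions.setdefault(zip_code, []).append((name, age))

# SORT BY name: each partition is sorted independently, then emitted,
# so the overall output is only partially ordered.
sort_by = [row for zc in partitions for row in sorted(partitions[zc])]

# ORDER BY name: one total order over all rows.
order_by = sorted((name, age) for _, name, age in rows)
```

In `sort_by`, "Aryan B." appears after "Shirley Rodriguez" because the two names live in different partitions; `order_by` puts it first.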

#### UNPIVOT
<a name="supported-sql-unpivot"></a>

**Note**  
To see which AWS data source integrations support this SQL command, see [Supported OpenSearch SQL commands and functions](#supported-directquery-sql).

The `UNPIVOT` clause transforms multiple columns into multiple rows in the `SELECT` output. The `UNPIVOT` clause can be specified after the table name or subquery. 

**Syntax** 

```
UNPIVOT [ { INCLUDE | EXCLUDE } NULLS ] (
    { single_value_column_unpivot | multi_value_column_unpivot }
) [[AS] alias]

single_value_column_unpivot:
    values_column
    FOR name_column
    IN (unpivot_column [[AS] alias] [, ...])

multi_value_column_unpivot:
    (values_column [, ...])
    FOR name_column
    IN ((unpivot_column [, ...]) [[AS] alias] [, ...])
```

**Parameters**
+ **unpivot_column**

  Contains columns in the `FROM` clause, which specifies the columns you want to unpivot. 
+ **name_column**

  The name for the column that holds the names of the unpivoted columns. 
+ **values_column**

  The name for the column that holds the values of the unpivoted columns. 

**Examples**

```
CREATE TABLE sales_quarterly (year INT, q1 INT, q2 INT, q3 INT, q4 INT);
INSERT INTO sales_quarterly VALUES
(2020, null, 1000, 2000, 2500),
(2021, 2250, 3200, 4200, 5900),
(2022, 4200, 3100, null, null);
-- column names are used as unpivot columns
SELECT * FROM sales_quarterly
UNPIVOT (
sales FOR quarter IN (q1, q2, q3, q4)
);
+------+---------+-------+
| year | quarter | sales |
+------+---------+-------+
| 2020 | q2      | 1000  |
| 2020 | q3      | 2000  |
| 2020 | q4      | 2500  |
| 2021 | q1      | 2250  |
| 2021 | q2      | 3200  |
| 2021 | q3      | 4200  |
| 2021 | q4      | 5900  |
| 2022 | q1      | 4200  |
| 2022 | q2      | 3100  |
+------+---------+-------+
-- NULL values are excluded by default, they can be included
-- unpivot columns can be alias
-- unpivot result can be referenced via its alias
SELECT up.* FROM sales_quarterly
UNPIVOT INCLUDE NULLS (
sales FOR quarter IN (q1 AS Q1, q2 AS Q2, q3 AS Q3, q4 AS Q4)
) AS up;
+------+---------+-------+
| year | quarter | sales |
+------+---------+-------+
| 2020 | Q1      | NULL  |
| 2020 | Q2      | 1000  |
| 2020 | Q3      | 2000  |
| 2020 | Q4      | 2500  |
| 2021 | Q1      | 2250  |
| 2021 | Q2      | 3200  |
| 2021 | Q3      | 4200  |
| 2021 | Q4      | 5900  |
| 2022 | Q1      | 4200  |
| 2022 | Q2      | 3100  |
| 2022 | Q3      | NULL  |
| 2022 | Q4      | NULL  |
+------+---------+-------+
-- multiple value columns can be unpivoted per row
SELECT * FROM sales_quarterly
UNPIVOT EXCLUDE NULLS (
(first_quarter, second_quarter)
FOR half_of_the_year IN (
(q1, q2) AS H1,
(q3, q4) AS H2
)
);
+------+------------------+---------------+----------------+
| year | half_of_the_year | first_quarter | second_quarter |
+------+------------------+---------------+----------------+
| 2020 | H1               | NULL          | 1000           |
| 2020 | H2               | 2000          | 2500           |
| 2021 | H1               | 2250          | 3200           |
| 2021 | H2               | 4200          | 5900           |
| 2022 | H1               | 4200          | 3100           |
+------+------------------+---------------+----------------+
```
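Conceptually, `UNPIVOT` rotates a list of columns into `(name, value)` rows, optionally dropping NULLs. A minimal Python sketch of a single row from the example above:

```python
# UNPIVOT semantics: rotate a set of columns into (name, value) rows.
row = {"year": 2020, "q1": None, "q2": 1000, "q3": 2000, "q4": 2500}
unpivot_cols = ["q1", "q2", "q3", "q4"]

# EXCLUDE NULLS (the default): drop output rows whose value is NULL.
exclude_nulls = [(row["year"], q, row[q]) for q in unpivot_cols if row[q] is not None]

# INCLUDE NULLS: keep every unpivoted column as a row.
include_nulls = [(row["year"], q, row[q]) for q in unpivot_cols]
```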

# Supported PPL commands
<a name="supported-ppl"></a>

The following tables show which PPL commands OpenSearch Dashboards supports for querying CloudWatch Logs, Amazon S3, or Security Lake, and which commands CloudWatch Logs Insights supports. CloudWatch Logs Insights uses the same PPL syntax as OpenSearch Dashboards when querying CloudWatch Logs, and the tables refer to both as CloudWatch Logs. 

**Note**  
When you analyze data outside of OpenSearch Service, commands might execute differently than they do on OpenSearch indexes.

**Topics**
+ [Commands](#supported-ppl-commands)
+ [Functions](#supported-ppl-functions)
+ [Additional information for CloudWatch Logs Insights users using OpenSearch PPL](#supported-ppl-for-cloudwatch-users)

## Commands
<a name="supported-ppl-commands"></a>


| PPL command | Description | CloudWatch Logs | Amazon S3 | Security Lake | Example command | 
| --- | --- | --- | --- | --- | --- | 
| [fields command](#supported-ppl-fields-command) | Displays a set of fields that needs projection. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> fields field1, field2</pre>  | 
| [where command](#supported-ppl-where-command) |  Filters the data based on the conditions that you specify.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> where field1="success"<br />| where field2 != "i -023fe0a90929d8822"<br />| fields field3, col4, col5, col6<br />| head 1000</pre>  | 
| [stats command](#supported-ppl-stats-command) |  Performs aggregations and calculations.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>stats count(),<br />      count(`field1`),<br />      min(`field1`),<br />      max(`field1`),<br />      avg(`field1`)<br />by field2<br />| head 1000</pre>  | 
| [parse command](#supported-ppl-parse-command) |  Extracts a regular expression (regex) pattern from a string and displays the extracted pattern. The extracted pattern can be further used to create new fields or filter data.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>parse `field1` ".*/(?<field2>[^/]+$)"<br />| where field2 = "requestId"<br />| fields field2, `field2`<br />| head 1000</pre>  | 
| [patterns command](#supported-ppl-patterns-command) |  Extracts log patterns from a text field and appends the results to the search result. Grouping logs by their patterns makes it easier to aggregate stats from large volumes of log data for analysis and troubleshooting.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>patterns new_field='no_numbers' pattern='[0-9]' message<br />| fields message, no_numbers</pre>  | 
| [sort command](#supported-ppl-sort-command) |  Sorts the displayed results by a field name. Use **sort -FieldName** to sort in descending order.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>stats count(),<br />      count(`field1`),<br />      min(`field1`) as field1Alias,<br />      max(`field1`),<br />      avg(`field1`)<br />by field2<br />| sort -field1Alias<br />| head 1000</pre>  | 
| [eval command](#supported-ppl-eval-command) |  Modifies or processes the value of a field and stores it in a different field. This is useful to mathematically modify a column, apply string functions to a column, or apply date functions to a column.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval field2 = `field1` * 2<br />| fields field1, field2<br />| head 20</pre>  | 
| [rename command](#supported-ppl-rename-command) |  Renames one or more fields in the search result.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>rename field2 as field1<br />| fields field1</pre>  | 
| [head command](#supported-ppl-head-command) |  Limits the displayed query results to the first N rows.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> fields `@message`<br />| head 20</pre>  | 
| [grok command](#supported-ppl-grok-command) |  Parses a text field with a grok pattern based on regular expression, and appends the results to the search result.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> grok email '.+@%{HOSTNAME:host}'<br />| fields email</pre>  | 
| [top command](#supported-ppl-top-command) |  Finds the most frequent values for a field.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> top 2 Field1 by Field2</pre>  | 
| [dedup command](#supported-ppl-dedup-command) |  Removes duplicate entries based on the fields that you specify.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>dedup field1<br />| fields field1, field2, field3</pre>  | 
| [join command](#supported-ppl-join-commands) |  Joins two datasets together.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>source=customer<br />| join ON c_custkey = o_custkey orders<br />| head 10</pre>  | 
| [lookup command](#supported-ppl-lookup-commands) |  Enriches your search data by adding or replacing data from a lookup index (dimension table). You can extend the fields of an index with values from a dimension table, and append or replace values when the lookup condition is matched.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>where orderType = 'Cancelled'<br />| lookup account_list, mkt_id AS mkt_code<br />  replace amount, account_name as name<br />| stats count(mkt_code), avg(amount)<br />  by name</pre>  | 
| [subquery command](#supported-ppl-subquery-commands) | Performs complex, nested queries within your Piped Processing Language (PPL) statements. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>where id in [<br />  subquery source=users<br />  | where user in [<br />    subquery source=actions<br />    | where action="login"<br />    | fields user<br />  ]<br />  | fields uid<br />]</pre>  | 
| [rare command](#supported-ppl-rare-command) |  Finds the least frequent values of all fields in the field list.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> rare Field1 by Field2</pre>  | 
| [trendline command](#supported-ppl-trendline-commands) | Calculates the moving averages of fields. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> trendline sma(2, field1) as field1Alias</pre>  | 
| [eventstats command](#supported-ppl-eventstats-command) | Enriches your event data with calculated summary statistics. It analyzes specified fields within your events, computes various statistical measures, and then appends these results to each original event as new fields. |  ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported (except `count()`)  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> eventstats sum(field1) by field2</pre>  | 
| [flatten command](#supported-ppl-flatten-command) |  Flattens a field. The field must be of type `struct<?,?>` or `array<struct<?,?>>`.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> source=table | flatten field1</pre>  | 
| [field summary](#supported-ppl-field-summary-command) | Calculates basic statistics for each field (count, distinct count, min, max, avg, stddev, and mean). | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported (one field per query) | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>where field1 != 200<br />| fieldsummary includefields=field1 nulls=true</pre>  | 
| [fillnull command](#supported-ppl-fillnull-command) | Fills null fields with the value that you provide. It can be used in one or more fields. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>fields field1<br />| eval field2=field1<br />| fillnull value=0 field1</pre>  | 
| [expand command](#supported-ppl-expand-command) | Breaks down a field containing multiple values into separate rows, creating a new row for each value in the specified field. | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>expand employee<br />| stats max(salary) as max<br />  by state, company</pre>  | 
| [describe command](#supported-ppl-describe-command) |  Gets detailed information about the structure and metadata of tables, schemas, and catalogs  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre> describe schema.table</pre>  | 

## Functions
<a name="supported-ppl-functions"></a>


| PPL function | Description | CloudWatch Logs | Amazon S3 | Security Lake | Example command | 
| --- | --- | --- | --- | --- | --- | 
|  [PPL string functions](#supported-ppl-string-functions) (`CONCAT`, `CONCAT_WS`, `LENGTH`, `LOWER`, `LTRIM`, `POSITION`, `REVERSE`, `RIGHT`, `RTRIM`, `SUBSTRING`, `TRIM`, `UPPER`)  |  Built-in functions in PPL that can manipulate and transform string and text data within PPL queries. For example, converting case, combining strings, extracting parts, and cleaning text.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval col1Len = LENGTH(col1)<br />| fields col1Len</pre>  | 
|  [PPL date and time functions](#supported-ppl-date-time-functions) (`DAY`, `DAYOFMONTH`, `DAY_OF_MONTH`,`DAYOFWEEK`, `DAY_OF_WEEK`, `DAYOFYEAR`, `DAY_OF_YEAR`, `DAYNAME`, `FROM_UNIXTIME`, `HOUR`, `HOUR_OF_DAY`, `LAST_DAY`, `LOCALTIMESTAMP`, `LOCALTIME`, `MAKE_DATE`, `MINUTE`, `MINUTE_OF_HOUR`, `MONTH`, `MONTHNAME`, `MONTH_OF_YEAR`, `NOW`, `QUARTER`, `SECOND`, `SECOND_OF_MINUTE`, `SUBDATE`, `SYSDATE`, `TIMESTAMP`, `UNIX_TIMESTAMP`, `WEEK`, `WEEKDAY`, `WEEK_OF_YEAR`, `DATE_ADD`, `DATE_SUB`, `TIMESTAMPADD`, `TIMESTAMPDIFF`, `UTC_TIMESTAMP`, `CURRENT_TIMEZONE`)  |  Built-in functions for handling and transforming date and timestamp data in PPL queries. For example, **date_add**, **date_format**, **datediff**, and **current_date**.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval newDate = ADDDATE(DATE('2020-08-26'), 1)<br />| fields newDate</pre>  | 
|  [PPL condition functions](#supported-ppl-condition-functions) (`EXISTS`, `IF`, `IFNULL`, `ISNOTNULL`, `ISNULL`, `NULLIF`)  |  Built-in functions that evaluate conditions and handle null values in PPL queries. For example, **if**, **ifnull**, and **nullif**.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval field2 = isnull(col1)<br />| fields field2, col1, field3  </pre>  | 
|  [PPL mathematical functions](#supported-ppl-math-functions) (`ABS`, `ACOS`, `ASIN`, `ATAN`, `ATAN2`, `CEIL`, `CEILING`, `CONV`, `COS`, `COT`, `CRC32`, `DEGREES`, `E`, `EXP`, `FLOOR`, `LN`, `LOG`, `LOG2`, `LOG10`, `MOD`, `PI`, `POW`, `POWER`, `RADIANS`, `RAND`, `ROUND`, `SIGN`, `SIN`, `SQRT`, `CBRT`)  |  Built-in functions for performing mathematical calculations and transformations in PPL queries. For example: **abs** (absolute value), **round** (rounds numbers), **sqrt** (square root), **pow** (power calculation), and **ceil** (rounds up to nearest integer).  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval field2 = ACOS(col1)<br />| fields col1</pre>  | 
|  [PPL expressions](#supported-ppl-expressions) (Arithmetic operators (`+`, `-`, `*`), Predicate operators (`>`, `<`, `IN`))  |  Expressions, particularly value expressions, return a scalar value. Expressions have different types and forms.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>where age > (25 + 5)<br />| fields age  </pre>  | 
|  [PPL IP address functions](#supported-ppl-ip-address-functions) (`CIDRMATCH`)  |  Built-in functions for handling IP addresses such as CIDR.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>where cidrmatch(ip, '***********/24')<br />| fields ip </pre>  | 
|  [PPL JSON functions](#supported-ppl-json-functions) (`ARRAY_LENGTH`, `JSON`, `JSON_ARRAY`, `JSON_EXTRACT`, `JSON_KEYS`, `JSON_OBJECT`, `JSON_VALID`, `TO_JSON_STRING`)  |  Built-in functions for handling JSON, including arrays, extraction, and validation.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval `json_extract('{"a":"b"}', '$.a')` = json_extract('{"a":"b"}', '$.a')</pre>  | 
|  [PPL Lambda functions](#supported-ppl-lambda-functions) (`EXISTS`, `FILTER`, `REDUCE`, `TRANSFORM`)  |  Built-in higher-order functions that apply a lambda expression to array elements, for example to filter, transform, or reduce them.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/negative_icon.png) Not supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval array = json_array(1, -1, 2),<br />     result = filter(array, x -> x > 0)<br />| fields result</pre>  | 
|  [PPL cryptographic hash functions](#supported-ppl-cryptographic-functions) (`MD5`, `SHA1`, `SHA2`)  |  Built-in functions that allow you to generate unique fingerprints of data, which can be used for verification, comparison, or as part of more complex security protocols.  | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported | ![\[alt text not found\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/images/success_icon.png) Supported |  <pre>eval `MD5('hello')` = MD5('hello')<br />| fields `MD5('hello')`</pre>  | 

## Additional information for CloudWatch Logs Insights users using OpenSearch PPL
<a name="supported-ppl-for-cloudwatch-users"></a>

Although CloudWatch Logs Insights supports most OpenSearch PPL commands and functions, some commands and functions aren't currently supported. For example, it doesn't currently support the `lookup` command in PPL. As of June 2, 2025, CloudWatch Logs Insights also supports JOIN, subqueries, and the `flatten`, `fillnull`, `expand`, `cidrmatch`, and JSON functions in PPL. For a complete list of supported query commands and functions, see the Amazon CloudWatch Logs columns in the preceding tables.

### Sample queries and quotas
<a name="sample-queries"></a>

The following applies to both CloudWatch Logs Insights users and OpenSearch users querying CloudWatch data.

For information about the limits that apply when querying CloudWatch Logs from OpenSearch Service, see [CloudWatch Logs quotas](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html) in the Amazon CloudWatch Logs User Guide. Limits include the number of CloudWatch Logs log groups you can query, the maximum number of concurrent queries that you can run, the maximum query execution time, and the maximum number of rows returned in results. The limits are the same regardless of which language you use to query CloudWatch Logs (OpenSearch PPL, SQL, or Logs Insights QL).

### PPL commands
<a name="supported-ppl-commands-details"></a>

**Topics**
+ [comment](#supported-ppl-comment)
+ [correlation command](#supported-ppl-correlation-commands)
+ [dedup command](#supported-ppl-dedup-command)
+ [describe command](#supported-ppl-describe-command)
+ [eval command](#supported-ppl-eval-command)
+ [eventstats command](#supported-ppl-eventstats-command)
+ [expand command](#supported-ppl-expand-commands)
+ [explain command](#supported-ppl-explain-command)
+ [fillnull command](#supported-ppl-fillnull-command)
+ [fields command](#supported-ppl-fields-command)
+ [flatten command](#supported-ppl-flatten-command)
+ [grok command](#supported-ppl-grok-command)
+ [head command](#supported-ppl-head-command)
+ [join command](#supported-ppl-join-commands)
+ [lookup command](#supported-ppl-lookup-commands)
+ [parse command](#supported-ppl-parse-command)
+ [patterns command](#supported-ppl-patterns-command)
+ [rare command](#supported-ppl-rare-command)
+ [rename command](#supported-ppl-rename-command)
+ [search command](#supported-ppl-search-command)
+ [sort command](#supported-ppl-sort-command)
+ [stats command](#supported-ppl-stats-command)
+ [subquery command](#supported-ppl-subquery-commands)
+ [top command](#supported-ppl-top-command)
+ [trendline command](#supported-ppl-trendline-commands)
+ [where command](#supported-ppl-where-command)
+ [field summary](#supported-ppl-field-summary-command)
+ [PPL functions](#supported-ppl-functions-details)

#### comment
<a name="supported-ppl-comment"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

PPL supports both line comments and block comments. The system doesn't evaluate comment text.

**Line comments**  
Line comments begin with two slashes // and end with a new line. 

Example: 

```
os> source=accounts | top gender // finds most common gender of all the accounts
fetched rows / total rows = 2/2
+----------+
| gender   |
|----------|
| M        |
| F        |
+----------+
```

**Block Comments**  
Block comments begin with a slash followed by an asterisk (`/*`) and end with an asterisk followed by a slash (`*/`).

Example:

```
os> source=accounts | dedup 2 gender /* dedup the document with gender field keep 2 duplication */ | fields account_number, gender
fetched rows / total rows = 3/3
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 6                | M        |
| 13               | F        |
+------------------+----------+
```

#### correlation command
<a name="supported-ppl-correlation-commands"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

You can correlate different data sources according to common dimensions and timeframes. 

This correlation is crucial when you're dealing with large amounts of data from various verticals that share the same time periods but aren't formally synchronized.

By correlating these different data sources based on timeframes and similar dimensions, you can enrich your data and uncover valuable insights.

**Example**  
The observability domain has three distinct data sources:
+ Logs
+ Metrics
+ Traces

These data sources might share common dimensions. To transition from one data source to another, you need to correlate them correctly. Using semantic naming conventions, you can identify shared elements across logs, traces, and metrics.

Example:

```
{
  "@timestamp": "2018-07-02T22:23:00.186Z",
  "aws": {
    "elb": {
      "backend": {
        "http": {
          "response": {
            "status_code": 500
          }
        },
        "ip": "********",
        "port": "80"
      },
      ...
     "target_port": [
        "10.0.0.1:80"
      ],
      "target_status_code": [
        "500"
      ],
      "traceId": "Root=1-58337262-36d228ad5d99923122bbe354",
      "type": "http"
    }
  },
  "cloud": {
    "provider": "aws"
  },
  "http": {
    "request": {
    ...
  },
  "communication": {
    "source": {
      "address": "**************",
      "ip": "**************",
      "port": 2817
    }
  },
  "traceId": "Root=1-58337262-36d228ad5d99923122bbe354"
}
```

This example shows an AWS ELB log arriving from a service residing on AWS. It shows a backend HTTP response with a status code of 500, indicating an error. This could trigger an alert or be part of your regular monitoring process. Your next step is to gather relevant data around this event for a thorough investigation.

While you might be tempted to query all data related to the timeframe, this approach can be overwhelming. You could end up with too much information, spending more time filtering out irrelevant data than identifying the root cause. 

Instead, you can use a more targeted approach by correlating data from different sources. You can use these dimensions for correlation:
+ **IP** - `"ip": "10.0.0.1" | "ip": "**************"`
+ **Port** - `"port": 2817 | "target_port": "10.0.0.1:80"`

Assuming you have access to additional traces and metrics indices, and you're familiar with your schema structure, you can create a more precise correlation query.

Here's an example of a trace index document containing HTTP information you might want to correlate:

```
{
  "traceId": "c1d985bd02e1dbb85b444011f19a1ecc",
  "spanId": "55a698828fe06a42",
  "traceState": [],
  "parentSpanId": "",
  "name": "mysql",
  "kind": "CLIENT",
  "@timestamp": "2021-11-13T20:20:39+00:00",
  "events": [
    {
      "@timestamp": "2021-03-25T17:21:03+00:00",
       ...
    }
  ],
  "links": [
    {
      "traceId": "c1d985bd02e1dbb85b444011f19a1ecc",
      "spanId": "55a698828fe06a42w2",
      },
      "droppedAttributesCount": 0
    }
  ],
  "resource": {
    "service@name": "database",
    "telemetry@sdk@name": "opentelemetry",
    "host@hostname": "ip-172-31-10-8.us-west-2.compute.internal"
  },
  "status": {
    ...
  },
  "attributes": {
    "http": {
      "user_agent": {
        "original": "Mozilla/5.0"
      },
      "network": {
         ...
        }
      },
      "request": {
         ...
        }
      },
      "response": {
        "status_code": "200",
        "body": {
          "size": 500
        }
      },
      "client": {
        "server": {
          "socket": {
            "address": "***********",
            "domain": "example.com",
            "port": 80
          },
          "address": "***********",
          "port": 80
        },
        "resend_count": 0,
        "url": {
          "full": "http://example.com"
        }
      },
      "server": {
        "route": "/index",
        "address": "***********",
        "port": 8080,
        "socket": {
         ...
        },
        "client": {
         ...
         }
        },
        "url": {
         ...
        }
      }
    }
  }
}
```

In this approach, you can see the `traceId` and the HTTP client/server `ip`, which can be correlated with the ELB logs to better understand the system's behavior and condition.

**New correlation query command**  
Here is the new command that would allow this type of investigation:

```
source alb_logs, traces | where alb_logs.ip="10.0.0.1" AND alb_logs.cloud.provider="aws"| 
correlate exact fields(traceId, ip) scope(@timestamp, 1D) mapping(alb_logs.ip = traces.attributes.http.server.address, alb_logs.traceId = traces.traceId )
```

Here's what each part of the command does:

1. `source alb_logs, traces` - This selects the data sources that you want to correlate.

1. `where ip="10.0.0.1" AND cloud.provider="aws"` - This narrows down the scope of your search.

1. `correlate exact fields(traceId, ip)` - This tells the system to correlate data based on exact matches of the following fields:
   + The `ip` field has an explicit filter condition, so it will be used in the correlation for all data sources.
   + The `traceId` field has no explicit filter, so it will match the same traceIds across all data sources.

The field names indicate the logical meaning of the function within the correlation command. The actual join condition relies on the mapping statement you provide.

The term `exact` means that the correlation statements will require all fields to match in order to fulfill the query statement.

The term `approximate` will attempt to match on a best case scenario and will not reject rows with partial matches.

**Addressing different field mapping**  
In cases where the same logical field (such as `ip`) has different names across your data sources, you need to provide the explicit mapping of path fields. To address this, you can extend your correlation conditions to match different field names with similar logical meanings. Here's how you might do this:

```
alb_logs.ip = traces.attributes.http.server.address, alb_logs.traceId = traces.traceId    
```

For each field participating in the correlation join, you should provide a relevant mapping statement that includes all tables to be joined by this correlation command.

**Example**  
In this example, there are 2 sources: `alb_logs, traces`

There are 2 fields: `traceId, ip`

There are 2 mapping statements: `alb_logs.ip = traces.attributes.http.server.address, alb_logs.traceId = traces.traceId`

**Scoping the correlation timeframes**  
To simplify the work done by the execution engine (driver), you can add the scope statement. This explicitly directs the join query to the time range it should use for this search.

`scope(@timestamp, 1D)`

In this example, the search scope focuses on a daily basis, so correlations appearing on the same day are grouped together. This scoping mechanism simplifies the search and allows better control over results, enabling incremental search resolution based on your needs.

**Supporting drivers**  
The new correlation command is actually a 'hidden' join command. Therefore, only the following PPL drivers support this command. In these drivers, the correlation command will be directly translated into the appropriate Catalyst Join logical plan.

**Example**  
`source alb_logs, traces, metrics | where ip="10.0.0.1" AND cloud.provider="aws"| correlate exact on (ip, port) scope(@timestamp, 2018-07-02T22:23:00, 1 D)`

**Logical Plan:**

```
'Project [*]
+- 'Join Inner, ('ip && 'port)
   :- 'Filter (('ip === "10.0.0.1" & 'cloud.provider === "aws") & inTimeScope('@timestamp, "2018-07-02T22:23:00", "1 D"))
      +- 'UnresolvedRelation [alb_logs]
   +- 'Join Inner, ('ip & 'port)
      :- 'Filter (('ip === "10.0.0.1" & 'cloud.provider === "aws") & inTimeScope('@timestamp, "2018-07-02T22:23:00", "1 D"))
         +- 'UnresolvedRelation [traces]
      +- 'Filter (('ip === "10.0.0.1" & 'cloud.provider === "aws") & inTimeScope('@timestamp, "2018-07-02T22:23:00", "1 D"))
         +- 'UnresolvedRelation [metrics]
```

The Catalyst engine optimizes this query according to the most efficient join ordering.

#### dedup command
<a name="supported-ppl-dedup-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `dedup` command to remove identical documents from your search results based on specified fields.

**Syntax**  
Use the following syntax:

```
dedup [int] <field-list> [keepempty=<bool>] [consecutive=<bool>] 
```

**`int`**
+ Optional. 
+ The `dedup` command retains multiple events for each combination when you specify <int>. The number for <int> must be greater than 0. If you don't specify a number, only the first occurring event is kept. All other duplicates are removed from the results. 
+ Default: 1

**`keepempty`**
+ Optional. 
+ If true, keeps documents where any field in the field-list has a NULL value or is MISSING.
+ Default: false

**`consecutive`**
+ Optional.
+ If true, removes only events with consecutive duplicate combinations of values.
+ Default: false

**`field-list`**
+ Mandatory. 
+ A comma-delimited list of fields. At least one field is required.

**Example 1: Dedup by one field**  
This example shows how to dedup documents using the gender field.

PPL query:

```
os> source=accounts | dedup gender | fields account_number, gender;
fetched rows / total rows = 2/2
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 13               | F        |
+------------------+----------+
```

**Example 2: Keep 2 duplicate documents**  
This example shows how to dedup documents using the gender field, keeping two duplicates.

PPL query:

```
os> source=accounts | dedup 2 gender | fields account_number, gender;
fetched rows / total rows = 3/3
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 6                | M        |
| 13               | F        |
+------------------+----------+
```

**Example 3: Keep or ignore empty fields**  
This example shows how to dedup documents while keeping the null value field.

PPL query:

```
os> source=accounts | dedup email keepempty=true | fields account_number, email;
fetched rows / total rows = 4/4
+------------------+-----------------------+
| account_number   | email                 |
+------------------+-----------------------+
| 1                | john_doe@example.com  |
| 6                | jane_doe@example.com  |
| 13               | null                  |
| 18               | juan_li@example.com   |
+------------------+-----------------------+
```

This example shows how to dedup documents while ignoring the empty value field.

PPL query:

```
os> source=accounts | dedup email | fields account_number, email;
fetched rows / total rows = 3/3
+------------------+-----------------------+
| account_number   | email                 |
+------------------+-----------------------+
| 1                | john_doe@example.com  |
| 6                | jane_doe@example.com  |
| 18               | juan_li@example.com   |
+------------------+-----------------------+
```

**Example 4: Dedup in consecutive documents**  
This example shows how to dedup consecutive duplicate documents.

PPL query:

```
os> source=accounts | dedup gender consecutive=true | fields account_number, gender;
fetched rows / total rows = 3/3
+------------------+----------+
| account_number   | gender   |
+------------------+----------+
| 1                | M        |
| 13               | F        |
| 18               | M        |
+------------------+----------+
```

**Additional examples**
+ `source = table | dedup a | fields a,b,c`
+ `source = table | dedup a,b | fields a,b,c`
+ `source = table | dedup a keepempty=true | fields a,b,c`
+ `source = table | dedup a,b keepempty=true | fields a,b,c`
+ `source = table | dedup 1 a | fields a,b,c`
+ `source = table | dedup 1 a,b | fields a,b,c`
+ `source = table | dedup 1 a keepempty=true | fields a,b,c`
+ `source = table | dedup 1 a,b keepempty=true | fields a,b,c`
+ `source = table | dedup 2 a | fields a,b,c`
+ `source = table | dedup 2 a,b | fields a,b,c`
+ `source = table | dedup 2 a keepempty=true | fields a,b,c`
+ `source = table | dedup 2 a,b keepempty=true | fields a,b,c`
+ `source = table | dedup 1 a consecutive=true | fields a,b,c` (consecutive deduplication is unsupported)

**Limitation**
+ For `| dedup 2 a, b keepempty=false`

  ```
  DataFrameDropColumns('_row_number_)
  +- Filter ('_row_number_ <= 2) // allowed duplication = 2
     +- Window [row_number() windowspecdefinition('a, 'b, 'a ASC NULLS FIRST, 'b ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS _row_number_], ['a, 'b], ['a ASC NULLS FIRST, 'b ASC NULLS FIRST]
         +- Filter (isnotnull('a) AND isnotnull('b)) // keepempty=false
            +- Project
               +- UnresolvedRelation
  ```
+ For `| dedup 2 a, b keepempty=true`

  ```
  Union
  :- DataFrameDropColumns('_row_number_)
  :  +- Filter ('_row_number_ <= 2)
  :     +- Window [row_number() windowspecdefinition('a, 'b, 'a ASC NULLS FIRST, 'b ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS _row_number_], ['a, 'b], ['a ASC NULLS FIRST, 'b ASC NULLS FIRST]
  :        +- Filter (isnotnull('a) AND isnotnull('b))
  :           +- Project
  :              +- UnresolvedRelation
  +- Filter (isnull('a) OR isnull('b))
     +- Project
        +- UnresolvedRelation
  ```

#### describe command
<a name="supported-ppl-describe-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `describe` command to get detailed information about the structure and metadata of tables, schemas, and catalogs. Here are various examples and use cases of the `describe` command.

**Describe**
+ `describe table` This command is equivalent to the `DESCRIBE EXTENDED table` SQL command.
+ `describe schema.table`
+ ``describe schema.`table` ``
+ `describe catalog.schema.table`
+ ``describe catalog.schema.`table` ``
+ ``describe `catalog`.`schema`.`table` ``
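
For example, with a hypothetical AWS Glue data source named `my_glue` that contains a `default` database and an `http_logs` table, a describe query might look like the following. The output lists each column with its data type, along with additional table metadata, similar to the SQL `DESCRIBE EXTENDED` output.

```
os> describe my_glue.default.http_logs
```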

#### eval command
<a name="supported-ppl-eval-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

The `eval` command evaluates the expression and appends the result to the search result.

**Syntax**  
Use the following syntax:

```
eval <field>=<expression> ["," <field>=<expression> ]...    
```
+ `field`: Mandatory. If the field name doesn't exist, a new field is added. If the field name already exists, it will be overridden.
+  `expression`: Mandatory. Any expression supported by the system.

**Example 1: Create the new field**  
This example shows how to create a new `doubleAge` field for each document. The new `doubleAge` is the evaluation result of age multiplied by 2.

PPL query:

```
os> source=accounts | eval doubleAge = age * 2 | fields age, doubleAge ;
fetched rows / total rows = 4/4
+-------+-------------+
| age   | doubleAge   |
|-------+-------------|
| 32    | 64          |
| 36    | 72          |
| 28    | 56          |
| 33    | 66          |
+-------+-------------+
```

**Example 2: Override the existing field**  
This example shows how to override the existing age field with age plus 1.

PPL query:

```
os> source=accounts | eval age = age + 1 | fields age ;
fetched rows / total rows = 4/4
+-------+
| age   |
|-------|
| 33    |
| 37    |
| 29    |
| 34    |
+-------+
```

**Example 3: Create the new field with field defined in eval**  
This example shows how to create a new `ddAge` field with a field defined in the eval command. The new field `ddAge` is the evaluation result of `doubleAge` multiplied by 2, where `doubleAge` is defined in the eval command.

PPL query:

```
os> source=accounts | eval doubleAge = age * 2, ddAge = doubleAge * 2 | fields age, doubleAge, ddAge ;
fetched rows / total rows = 4/4
+-------+-------------+---------+
| age   | doubleAge   | ddAge   |
|-------+-------------+---------|
| 32    | 64          | 128     |
| 36    | 72          | 144     |
| 28    | 56          | 112     |
| 33    | 66          | 132     |
+-------+-------------+---------+
```

Assumptions: `a`, `b`, `c` are existing fields in `table`

**Additional examples**
+ `source = table | eval f = 1 | fields a,b,c,f`
+ `source = table | eval f = 1` (output a,b,c,f fields)
+ `source = table | eval n = now() | eval t = unix_timestamp(a) | fields n,t`
+ `source = table | eval f = a | where f > 1 | sort f | fields a,b,c | head 5`
+ `source = table | eval f = a * 2 | eval h = f * 2 | fields a,f,h`
+ `source = table | eval f = a * 2, h = f * 2 | fields a,f,h`
+ `source = table | eval f = a * 2, h = b | stats avg(f) by h`
+ `source = table | eval f = ispresent(a)`
+ `source = table | eval r = coalesce(a, b, c) | fields r`
+ `source = table | eval e = isempty(a) | fields e`
+ `source = table | eval e = isblank(a) | fields e`
+ `source = table | eval f = case(a = 0, 'zero', a = 1, 'one', a = 2, 'two', a = 3, 'three', a = 4, 'four', a = 5, 'five', a = 6, 'six', a = 7, 'se7en', a = 8, 'eight', a = 9, 'nine')`
+ `source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else 'unknown')`
+ `source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else concat(a, ' is an incorrect binary digit'))`
+ `source = table | eval f = a in ('foo', 'bar') | fields f`
+ `source = table | eval f = a not in ('foo', 'bar') | fields f`

**Eval with case example:**  


```
source = table | eval status_category =
case(a >= 200 AND a < 300, 'Success',
a >= 300 AND a < 400, 'Redirection',
a >= 400 AND a < 500, 'Client Error',
a >= 500, 'Server Error'
else 'Unknown')
```


**Eval with another case example:**  


```
source = table |  where ispresent(a) |
eval status_category =
 case(a >= 200 AND a < 300, 'Success',
  a >= 300 AND a < 400, 'Redirection',
  a >= 400 AND a < 500, 'Client Error',
  a >= 500, 'Server Error'
  else 'Incorrect HTTP status code'
 )
 | stats count() by status_category
```

**Limitations**
+ Overriding existing fields is unsupported. Queries attempting to do so will throw exceptions with the message "Reference 'a' is ambiguous".

  ```
  - source = table | eval a = 10 | fields a,b,c
  - source = table | eval a = a * 2 | stats avg(a)
  - source = table | eval a = abs(a) | where a > 0
  - source = table | eval a = signum(a) | where a < 0
  ```

#### eventstats command
<a name="supported-ppl-eventstats-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `eventstats` command to enrich your event data with calculated summary statistics. It operates by analyzing specified fields within your events, computing various statistical measures, and then appending these results as new fields to each original event.

**Key aspects of eventstats**

1. It performs calculations across the entire result set or within defined groups.

1. The original events remain intact, with new fields added to contain the statistical results.

1. The command is particularly useful for comparative analysis, identifying outliers, or providing additional context to individual events.

**Difference between stats and eventstats**  
The `stats` and `eventstats` commands are both used for calculating statistics, but they have some key differences in how they operate and what they produce.

**Output format**
+ `stats`: Produces a summary table with only the calculated statistics.
+ `eventstats`: Adds the calculated statistics as new fields to the existing events, preserving the original data.

**Event retention**
+ `stats`: Reduces the result set to only the statistical summary, discarding individual events.
+ `eventstats`: Retains all original events and adds new fields with the calculated statistics.

**Use cases**
+ `stats`: Best for creating summary reports or dashboards. Often used as a final command to summarize results.
+ `eventstats`: Useful when you need to enrich events with statistical context for further analysis or filtering. Can be used mid-search to add statistics that can be used in subsequent commands.
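
To illustrate the mid-search use case, the following sketch (using the sample `accounts` data from the later examples, and assuming aliasing with `as`) enriches each event with the overall average age, then keeps only the events above that average:

```
os> source=accounts | eventstats avg(age) as avg_age | where age > avg_age | fields firstname, age, avg_age
```

A plain `stats avg(age)` would instead collapse the results to a single summary row, discarding the individual accounts.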

**Syntax**  
Use the following syntax:

```
eventstats <aggregation>... [by-clause]    
```

**aggregation**
+ Mandatory. 
+ An aggregation function. 
+ The argument of aggregation must be a field.

**by-clause**
+ Optional.
+ Syntax: `by [span-expression,] [field,]...`
+ The by clause can include fields and expressions such as scalar functions and aggregation functions. You can also use the span clause to split a specific field into buckets of equal intervals. The eventstats command then performs aggregation based on these span buckets.
+ Default: If you don't specify a by clause, the eventstats command aggregates over the entire result set.

**span-expression**
+ Optional, at most one.
+ Syntax: `span(field_expr, interval_expr)`
+ The unit of the interval expression is the natural unit by default. However, for date and time type fields, you need to specify the unit in the interval expression when using date/time units.

  For example, to split the field `age` into buckets by 10 years, use `span(age, 10)`. For time-based fields, you can split a `timestamp` field into hourly intervals using `span(timestamp, 1h)`.
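
For example, the following hypothetical query buckets the sample `accounts` data into 10-year age spans and appends a per-bucket event count to each document:

```
os> source=accounts | eventstats count() by span(age, 10)
```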


**Available time units**  

| Span interval units | 
| --- | 
| millisecond (ms) | 
| second (s) | 
| minute (m, case sensitive) | 
| hour (h) | 
| day (d) | 
| week (w) | 
| month (M, case sensitive) | 
| quarter (q) | 
| year (y) | 

**Aggregation functions**  


**`COUNT`**  
`COUNT` returns a count of the number of `expr` values in the rows retrieved.

`COUNT` is not supported for CloudWatch Logs queries.

Example:

```
os> source=accounts | eventstats count();
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+--------------------+------------+--------------------------+--------+-------+---------+
| account_number | balance  | firstname | lastname | age | gender | address            | employer   | email                    | city   | state | count() |
+----------------+----------+-----------+----------+-----+--------+--------------------+------------+--------------------------+--------+-------+---------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane       | AnyCorp    | janedoe@anycorp.com      | Brogan | IL    | 4       |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street | AnyCompany | marymajor@anycompany.com | Dante  | TN    | 4       |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street     | AnyOrg     |                          | Nogal  | VA    | 4       |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court  |            | juanli@exampleorg.com    | Orick  | MD    | 4       |
+----------------+----------+-----------+----------+-----+--------+--------------------+------------+--------------------------+--------+-------+---------+
```

**`SUM`**  
`SUM(expr)` returns the sum of expr.

Example:

```
os> source=accounts | eventstats sum(age) by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+--------------------------+--------+-------+--------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer   | email                    | city   | state | sum(age) by gender |
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+--------------------------+--------+-------+--------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp    | janedoe@anycorp.com      | Brogan | IL    | 101                |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | AnyCompany | marymajor@anycompany.com | Dante  | TN    | 101                |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg     |                          | Nogal  | VA    | 28                 |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |            | juanli@exampleorg.com    | Orick  | MD    | 101                |
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+--------------------------+--------+-------+--------------------+
```

**`AVG`**  
`AVG(expr)` returns the average value of expr.

Example:

```
os> source=accounts | eventstats avg(age) by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+---------------------------+--------+-------+--------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | avg(age) by gender |
+----------------+----------+-----------+----------+-----+--------+-----------------------+------------+---------------------------+--------+-------+--------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 33.67              |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 33.67              |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 28.00              |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 33.67              |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------+
```

**`MAX`**  
`MAX(expr)` returns the maximum value of expr.

Example:

```
os> source=accounts | eventstats max(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | max(age)  |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 36        |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 36        |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 36        |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 36        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
```

**`MIN`**  
`MIN(expr)` returns the minimum value of expr.

Example:

```
os> source=accounts | eventstats min(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | min(age)  |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 28        |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 28        |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | *** Any Street        | AnyOrg      |                          | Nogal  | VA    | 28        |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 28        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+
```

**`STDDEV_SAMP`**  
`STDDEV_SAMP(expr)` returns the sample standard deviation of expr.

Example:

```
os> source=accounts | eventstats stddev_samp(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | stddev_samp(age)       |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 3.304037933599835      |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 3.304037933599835      |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 3.304037933599835      |
| 18             | 4180     | Juan      | Li       | 33  | M      | 467 Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 3.304037933599835      |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
```

**`STDDEV_POP`**  
`STDDEV_POP(expr)` returns the population standard deviation of expr.

Example:

```
os> source=accounts | eventstats stddev_pop(age);
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | stddev_pop(age)        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | 880 Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 2.****************     |
| 6              | 5686     | Mary      | Major    | 36  | M      | *** Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 2.****************     |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | *** Any Street        | AnyOrg      |                          | Nogal  | VA    | 2.****************     |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 2.****************     |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+------------------------+
```

**`PERCENTILE` or `PERCENTILE_APPROX`**  
`PERCENTILE(expr, percent)` or `PERCENTILE_APPROX(expr, percent)` returns the approximate percentile value of expr at the specified percentage.

**percent**
+ The number must be a constant between 0 and 100.

Example:

```
os> source=accounts | eventstats percentile(age, 90) by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------------------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | percentile(age, 90) by gender  |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------------------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 36                             |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 36                             |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 28                             |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 36                             |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+--------------------------------+
```

**Example 1: Calculate the average, sum, and count of a field by group**  
This example calculates the average age, the sum of ages, and the count of events for all accounts, grouped by gender.

```
os> source=accounts | eventstats avg(age) as avg_age, sum(age) as sum_age, count() as count by gender;
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+-----------+-------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | avg_age   | sum_age   | count |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+-----------+-------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 33.666667 | 101       | 3     |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 33.666667 | 101       | 3     |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 28.000000 | 28        | 1     |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 33.666667 | 101       | 3     |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+-----------+-----------+-------+
```

**Example 2: Calculate the count by a span**  
This example gets the count of ages in 10-year intervals.

```
os> source=accounts | eventstats count(age) by span(age, 10) as age_span
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+----------+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                    | city   | state | age_span |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+----------+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane          | AnyCorp     | janedoe@anycorp.com      | Brogan | IL    | 3        |
| 6              | 5686     | Mary      | Major    | 36  | M      | 671 Example Street    | Any Company | marymajor@anycompany.com | Dante  | TN    | 3        |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | 789 Any Street        | AnyOrg      |                          | Nogal  | VA    | 1        |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com    | Orick  | MD    | 3        |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+--------------------------+--------+-------+----------+
```

**Example 3: Calculate the count by gender and span**  
This example gets the count of ages in 5-year intervals, grouped by gender.

```
os> source=accounts | eventstats count() as cnt by span(age, 5) as age_span, gender
fetched rows / total rows = 4/4
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+---------------------------+--------+-------+-----+
| account_number | balance  | firstname | lastname | age | gender | address               | employer    | email                     | city   | state | cnt |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+---------------------------+--------+-------+-----+
| 1              | 39225    | Jane      | Doe      | 32  | M      | *** Any Lane          | AnyCorp     | janedoe@anycorp.com       | Brogan | IL    | 2   |
| 6              | 5686     | Mary      | Majo     | 36  | M      | 671 Example Street    | Any Company | hattiebond@anycompany.com | Dante  | TN    | 1   |
| 13             | 32838    | Nikki     | Wolf     | 28  | F      | *** Any Street        | AnyOrg      |                           | Nogal  | VA    | 1   |
| 18             | 4180     | Juan      | Li       | 33  | M      | *** Example Court     |             | juanli@exampleorg.com     | Orick  | MD    | 2   |
+----------------+----------+-----------+----------+-----+--------+-----------------------+-------------+---------------------------+--------+-------+-----+
```

**Usage**
+ `source = table | eventstats avg(a)`
+ `source = table | where a < 50 | eventstats avg(c)`
+ `source = table | eventstats max(c) by b`
+ `source = table | eventstats count(c) by b | head 5`
+ `source = table | eventstats distinct_count(c)`
+ `source = table | eventstats stddev_samp(c)`
+ `source = table | eventstats stddev_pop(c)`
+ `source = table | eventstats percentile(c, 90)`
+ `source = table | eventstats percentile_approx(c, 99)`

**Aggregations with span**  

+ `source = table | eventstats count(a) by span(a, 10) as a_span`
+ `source = table | eventstats sum(age) by span(age, 5) as age_span | head 2`
+ `source = table | eventstats avg(age) by span(age, 20) as age_span, country | sort - age_span | head 2`

**Aggregations with time window span (tumble windowing function)**  

+ `source = table | eventstats sum(productsAmount) by span(transactionDate, 1d) as age_date | sort age_date`
+ `source = table | eventstats sum(productsAmount) by span(transactionDate, 1w) as age_date, productId`

**Aggregations group by multiple levels**  

+ `source = table | eventstats avg(age) as avg_state_age by country, state | eventstats avg(avg_state_age) as avg_country_age by country`
+ `source = table | eventstats avg(age) as avg_city_age by country, state, city | eval new_avg_city_age = avg_city_age - 1 | eventstats avg(new_avg_city_age) as avg_state_age by country, state | where avg_state_age > 18 | eventstats avg(avg_state_age) as avg_adult_country_age by country`
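To summarize the behavior shown above: unlike `stats`, which collapses rows, `eventstats` attaches each group's aggregate to every input row. The following Python sketch illustrates that assumed two-pass semantics for `eventstats avg(age) by gender` (an illustration, not the engine's implementation):

```python
from collections import defaultdict

# Illustration (not the engine implementation): eventstats computes an
# aggregate per group, then attaches it to every input row.
def eventstats_avg(rows, value_field, by_field):
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:                      # first pass: group totals
        sums[row[by_field]] += row[value_field]
        counts[row[by_field]] += 1
    return [dict(row, avg=sums[row[by_field]] / counts[row[by_field]])
            for row in rows]              # second pass: annotate rows

accounts = [
    {"age": 32, "gender": "M"},
    {"age": 36, "gender": "M"},
    {"age": 28, "gender": "F"},
    {"age": 33, "gender": "M"},
]
result = eventstats_avg(accounts, "age", "gender")
# Every row is kept; the three M rows carry avg 101/3, the F row 28.0.
```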

#### expand command
<a name="supported-ppl-expand-commands"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `expand` command to flatten a field of type:
+ `Array<Any>`
+ `Map<Any>`

**Syntax**  
Use the following syntax:

```
expand <field> [As alias]
```

**field**
+ The field to be expanded (exploded). Must be of a supported type.

**alias**
+ Optional. The name to be used instead of the original field name.

**Usage**  
The `expand` command produces a row for each element in the specified array or map field, where:
+ Array elements become individual rows.
+ Map key-value pairs are broken into separate rows, with each key-value represented as a row.
+ When an alias is provided, the exploded values are represented under the alias instead of the original field name.
+ This can be used in combination with other commands, such as `stats`, `eval`, and `parse` to manipulate or extract data post-expansion.

**Examples**
+ `source = table | expand employee | stats max(salary) as max by state, company`
+ `source = table | expand employee as worker | stats max(salary) as max by state, company`
+ `source = table | expand employee as worker | eval bonus = salary * 3 | fields worker, bonus`
+ `source = table | expand employee | parse description '(?<email>.+@.+)' | fields employee, email`
+ `source = table | eval array=json_array(1, 2, 3) | expand array as uid | fields name, occupation, uid`
+ `source = table | expand multi_valueA as multiA | expand multi_valueB as multiB`
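The row-per-element behavior for array fields can be sketched in Python (an illustration of the assumed semantics; map fields, which split into key-value rows, are omitted for brevity):

```python
# Illustration of the assumed semantics for array fields: one output row
# per element, with the other fields copied into each row.
def expand(rows, field, alias=None):
    out_name = alias or field
    out = []
    for row in rows:
        for element in row[field]:
            new_row = {k: v for k, v in row.items() if k != field}
            new_row[out_name] = element
            out.append(new_row)
    return out

rows = [{"name": "t1", "array": [1, 2, 3]}]
expanded = expand(rows, "array", alias="uid")
# -> three rows, uid = 1, 2, 3, each keeping name = "t1"
```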

#### explain command
<a name="supported-ppl-explain-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

The `explain` command shows the query execution plan so that you can analyze and optimize your queries for better performance.

**Comment**
+ `source=accounts | top gender // finds most common gender of all the accounts` (line comment)
+ `source=accounts | dedup 2 gender /* dedup the document with gender field keep 2 duplication */ | fields account_number, gender` (block comment)

**Describe**
+ `describe table` (equivalent to the `DESCRIBE EXTENDED table` SQL command)
+ `describe schema.table`
+ `describe schema.`table``
+ `describe catalog.schema.table`
+ `describe catalog.schema.`table``
+ `describe `catalog`.`schema`.`table``

**Explain**
+ `explain simple | source = table | where a = 1 | fields a,b,c`
+ `explain extended | source = table`
+ `explain codegen | source = table | dedup a | fields a,b,c`
+ `explain cost | source = table | sort a | fields a,b,c`
+ `explain formatted | source = table | fields - a`
+ `explain simple | describe table`

**Fields**
+ `source = table`
+ `source = table | fields a,b,c`
+ `source = table | fields + a,b,c`
+ `source = table | fields - b,c`
+ `source = table | eval b1 = b | fields - b1,c`

**Field summary**
+ `source = t | fieldsummary includefields=status_code nulls=false`
+ `source = t | fieldsummary includefields= id, status_code, request_path nulls=true`
+ `source = t | where status_code != 200 | fieldsummary includefields= status_code nulls=true`

**Nested field**
+ `source = catalog.schema.table1, catalog.schema.table2 | fields A.nested1, B.nested1`
+ `source = catalog.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields int_col, struct_col.field1.subfield, struct_col2.field1.subfield`
+ `source = catalog.schema.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields int_col, struct_col.field1.subfield, struct_col2.field1.subfield`

**Filters**
+ `source = table | where a = 1 | fields a,b,c`
+ `source = table | where a >= 1 | fields a,b,c`
+ `source = table | where a < 1 | fields a,b,c`
+ `source = table | where b != 'test' | fields a,b,c`
+ `source = table | where c = 'test' | fields a,b,c | head 3`
+ `source = table | where ispresent(b)`
+ `source = table | where isnull(coalesce(a, b)) | fields a,b,c | head 3`
+ `source = table | where isempty(a)`
+ `source = table | where isblank(a)`
+ `source = table | where case(length(a) > 6, 'True' else 'False') = 'True'`
+ `source = table | where a not in (1, 2, 3) | fields a,b,c`
+ `source = table | where a between 1 and 4` - Note: This returns rows where a >= 1 and a <= 4, that is, the inclusive range [1, 4]
+ `source = table | where b not between '2024-09-10' and '2025-09-10'` - Note: This returns rows where b < '2024-09-10' or b > '2025-09-10'
+ `source = table | where cidrmatch(ip, '***********/24')`
+ `source = table | where cidrmatch(ipv6, '2003:db8::/32')`
+ `source = table | trendline sma(2, temperature) as temp_trend`
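As the notes above indicate, `between` is inclusive on both ends, and `not between` matches the complement. A minimal Python sketch of that logic:

```python
# between is inclusive on both ends; not between matches the complement.
def between(value, low, high):
    return low <= value <= high

values = [0, 1, 2, 4, 5]
inside = [v for v in values if between(v, 1, 4)]       # [1, 2, 4]
outside = [v for v in values if not between(v, 1, 4)]  # [0, 5]
```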

**IP related queries**
+ `source = table | where cidrmatch(ip, '**************')`
+ `source = table | where isV6 = false and isValid = true and cidrmatch(ipAddress, '**************')`
+ `source = table | where isV6 = true | eval inRange = case(cidrmatch(ipAddress, '2003:***::/32'), 'in' else 'out') | fields ip, inRange`
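`cidrmatch` corresponds to standard CIDR containment, which the following Python sketch reproduces with the standard library's `ipaddress` module (an illustration, not the PPL implementation; the addresses shown are hypothetical examples):

```python
import ipaddress

# cidrmatch(ip, block) as standard CIDR containment.
def cidr_match(ip, block):
    return ipaddress.ip_address(ip) in ipaddress.ip_network(block)

cidr_match("10.0.0.42", "10.0.0.0/24")      # True
cidr_match("2003:db8::1", "2003:db8::/32")  # True
cidr_match("192.168.1.1", "10.0.0.0/24")    # False
```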

**Complex filters**  


```
source = table | eval status_category =
case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
else 'Incorrect HTTP status code')
| where case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
else 'Incorrect HTTP status code'
) = 'Incorrect HTTP status code'
```

```
source = table
| eval factor = case(a > 15, a - 14, isnull(b), a - 7, a < 3, a + 1 else 1)
| where case(factor = 2, 'even', factor = 4, 'even', factor = 6, 'even', factor = 8, 'even' else 'odd') = 'even'
| stats count() by factor
```
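The `case` expression used above evaluates its condition-value pairs in order and returns the first matching value, falling back to the `else` value. A Python sketch of the first `status_category` example:

```python
# Illustration: case() evaluates condition-value pairs in order and
# returns the first matching value, else the fallback.
def status_category(a):
    if 200 <= a < 300:
        return 'Success'
    if 300 <= a < 400:
        return 'Redirection'
    if 400 <= a < 500:
        return 'Client Error'
    if a >= 500:
        return 'Server Error'
    return 'Incorrect HTTP status code'

status_category(204)   # 'Success'
status_category(42)    # 'Incorrect HTTP status code'
```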

**Filters with logical conditions**
+ `source = table | where c = 'test' AND a = 1 | fields a,b,c`
+ `source = table | where c != 'test' OR a > 1 | fields a,b,c | head 1`
+ `source = table | where c = 'test' NOT a > 1 | fields a,b,c`

**Eval**  
Assumptions: `a`, `b`, `c` are existing fields in `table`
+ `source = table | eval f = 1 | fields a,b,c,f`
+ `source = table | eval f = 1` (output a,b,c,f fields)
+ `source = table | eval n = now() | eval t = unix_timestamp(a) | fields n,t`
+ `source = table | eval f = a | where f > 1 | sort f | fields a,b,c | head 5`
+ `source = table | eval f = a * 2 | eval h = f * 2 | fields a,f,h`
+ `source = table | eval f = a * 2, h = f * 2 | fields a,f,h`
+ `source = table | eval f = a * 2, h = b | stats avg(f) by h`
+ `source = table | eval f = ispresent(a)`
+ `source = table | eval r = coalesce(a, b, c) | fields r`
+ `source = table | eval e = isempty(a) | fields e`
+ `source = table | eval e = isblank(a) | fields e`
+ `source = table | eval f = case(a = 0, 'zero', a = 1, 'one', a = 2, 'two', a = 3, 'three', a = 4, 'four', a = 5, 'five', a = 6, 'six', a = 7, 'se7en', a = 8, 'eight', a = 9, 'nine')`
+ `source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else 'unknown')`
+ `source = table | eval f = case(a = 0, 'zero', a = 1, 'one' else concat(a, ' is an incorrect binary digit'))`
+ `source = table | eval digest = md5(fieldName) | fields digest`
+ `source = table | eval digest = sha1(fieldName) | fields digest`
+ `source = table | eval digest = sha2(fieldName,256) | fields digest`
+ `source = table | eval digest = sha2(fieldName,512) | fields digest`
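The digest functions at the end of this list correspond to the standard MD5, SHA-1, and SHA-2 algorithms. The following Python sketch shows the equivalent standard-library calls (`fieldValue` is a placeholder input, not a value from the examples above):

```python
import hashlib

# Equivalent stdlib digests; "fieldValue" stands in for a field's value.
value = "fieldValue".encode()
md5_digest = hashlib.md5(value).hexdigest()        # md5(fieldName)
sha1_digest = hashlib.sha1(value).hexdigest()      # sha1(fieldName)
sha256_digest = hashlib.sha256(value).hexdigest()  # sha2(fieldName, 256)
sha512_digest = hashlib.sha512(value).hexdigest()  # sha2(fieldName, 512)
```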

#### fillnull command
<a name="supported-ppl-fillnull-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

**Description**  
Use the `fillnull` command to replace null values with a specified value in one or more fields of your search results. 

**Syntax**  
Use the following syntax:

```
fillnull [with <null-replacement> in <nullable-field>["," <nullable-field>]] | [using <source-field> = <null-replacement> [","<source-field> = <null-replacement>]]
```
+ null-replacement: Mandatory. The value used to replace null values.
+ nullable-field: Mandatory. Field reference. The null values in this field will be replaced with the value specified in null-replacement.
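A Python sketch of the assumed replacement semantics, covering both the single-value `with ... in` form and the per-field `using` form (an illustration, not the engine's implementation):

```python
# Illustration of the assumed semantics: nulls (None) in the listed
# fields are replaced; the `using` form maps each field to its own value.
def fillnull(rows, replacements):
    """replacements: dict of field name -> replacement value."""
    return [{k: (replacements[k] if v is None and k in replacements else v)
             for k, v in row.items()}
            for row in rows]

logs = [{"status_code": 403}, {"status_code": None}, {"status_code": 200}]
filled = fillnull(logs, {"status_code": 0})   # fillnull with 0 in status_code
# The None value becomes 0; non-null values are unchanged.
```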

**Example 1: Fillnull one field**  
The example shows how to use fillnull on a single field:

```
os> source=logs | fields status_code | eval input=status_code | fillnull with 0 in status_code;
| input | status_code |
|-------|-------------|
| 403   | 403         |
| 403   | 403         |
| NULL  | 0           |
| NULL  | 0           |
| 200   | 200         |
| 404   | 404         |
| 500   | 500         |
| NULL  | 0           |
| 500   | 500         |
| 404   | 404         |
| 200   | 200         |
| 500   | 500         |
| NULL  | 0           |
| NULL  | 0           |
| 404   | 404         |
```

**Example 2: Fillnull applied to multiple fields**  
The example shows fillnull applied to multiple fields.

```
os> source=logs | fields request_path, timestamp | eval input_request_path=request_path, input_timestamp = timestamp | fillnull with '???' in request_path, timestamp;
| input_request_path | input_timestamp       | request_path | timestamp              |
|------------------------------------------------------------------------------------|
| /contact           | NULL                  | /contact     | ???                    |
| /home              | NULL                  | /home        | ???                    |
| /about             | 2023-10-01 10:30:00   | /about       | 2023-10-01 10:30:00    |
| /home              | 2023-10-01 10:15:00   | /home        | 2023-10-01 10:15:00    |
| NULL               | 2023-10-01 10:20:00   | ???          | 2023-10-01 10:20:00    |
| NULL               | 2023-10-01 11:05:00   | ???          | 2023-10-01 11:05:00    |
| /about             | NULL                  | /about       | ???                    |
| /home              | 2023-10-01 10:00:00   | /home        | 2023-10-01 10:00:00    |
| /contact           | NULL                  | /contact     | ???                    |
| NULL               | 2023-10-01 10:05:00   | ???          | 2023-10-01 10:05:00    |
| NULL               | 2023-10-01 10:50:00   | ???          | 2023-10-01 10:50:00    |
| /services          | NULL                  | /services    | ???                    |
| /home              | 2023-10-01 10:45:00   | /home        | 2023-10-01 10:45:00    |
| /services          | 2023-10-01 11:00:00   | /services    | 2023-10-01 11:00:00    |
| NULL               | 2023-10-01 10:35:00   | ???          | 2023-10-01 10:35:00    |
```

**Example 3: Fillnull applied to multiple fields with various null replacement values**  
This example shows fillnull with a different replacement value for each field:
+ `/error` in `request_path` field
+ `1970-01-01 00:00:00` in `timestamp` field

```
os> source=logs | fields request_path, timestamp | eval input_request_path=request_path, input_timestamp = timestamp | fillnull using request_path = '/error', timestamp='1970-01-01 00:00:00';

| input_request_path | input_timestamp       | request_path | timestamp              |
|------------------------------------------------------------------------------------|
| /contact           | NULL                  | /contact     | 1970-01-01 00:00:00    |
| /home              | NULL                  | /home        | 1970-01-01 00:00:00    |
| /about             | 2023-10-01 10:30:00   | /about       | 2023-10-01 10:30:00    |
| /home              | 2023-10-01 10:15:00   | /home        | 2023-10-01 10:15:00    |
| NULL               | 2023-10-01 10:20:00   | /error       | 2023-10-01 10:20:00    |
| NULL               | 2023-10-01 11:05:00   | /error       | 2023-10-01 11:05:00    |
| /about             | NULL                  | /about       | 1970-01-01 00:00:00    |
| /home              | 2023-10-01 10:00:00   | /home        | 2023-10-01 10:00:00    |
| /contact           | NULL                  | /contact     | 1970-01-01 00:00:00    |
| NULL               | 2023-10-01 10:05:00   | /error       | 2023-10-01 10:05:00    |
| NULL               | 2023-10-01 10:50:00   | /error       | 2023-10-01 10:50:00    |
| /services          | NULL                  | /services    | 1970-01-01 00:00:00    |
| /home              | 2023-10-01 10:45:00   | /home        | 2023-10-01 10:45:00    |
| /services          | 2023-10-01 11:00:00   | /services    | 2023-10-01 11:00:00    |
| NULL               | 2023-10-01 10:35:00   | /error       | 2023-10-01 10:35:00    |
```

#### fields command
<a name="supported-ppl-fields-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `fields` command to keep or remove fields from the search result.

**Syntax**  
Use the following syntax:

```
fields [+|-] <field-list> 
```
+ `+|-`: Optional. 

  If the plus (+) is used, only the fields specified in the field list are kept. 

  If the minus (-) is used, all the fields specified in the field list are removed. 

  *Default*: +
+ `field list`: Mandatory. A comma-delimited list of fields to keep or remove.
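The keep/remove behavior can be sketched in Python (an illustration of the assumed semantics, not the engine's implementation):

```python
# Illustration of the assumed semantics: with +, only the listed fields
# are kept; with -, the listed fields are removed.
def project(rows, field_list, keep=True):
    if keep:
        return [{k: row[k] for k in field_list if k in row} for row in rows]
    return [{k: v for k, v in row.items() if k not in field_list}
            for row in rows]

rows = [{"account_number": 1, "firstname": "Jane",
         "lastname": "Doe", "age": 32}]
project(rows, ["firstname", "lastname"])  # like: fields + firstname, lastname
project(rows, ["age"], keep=False)        # like: fields - age
```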

**Example 1: Select specified fields from result**  
This example shows how to fetch `account_number`, `firstname`, and `lastname` fields from search results.

PPL query:

```
os> source=accounts | fields account_number, firstname, lastname;
fetched rows / total rows = 4/4
+------------------+-------------+------------+
| account_number   | firstname   | lastname   |
|------------------+-------------+------------|
| 1                | Jane        | Doe        |
| 6                | John        | Doe        |
| 13               | Jorge       | Souza      |
| 18               | Juan        | Li         |
+------------------+-------------+------------+
```

**Example 2: Remove specified fields from result**  
This example shows how to remove the `account_number` field from search results.

PPL query:

```
os> source=accounts | fields account_number, firstname, lastname | fields - account_number ;
fetched rows / total rows = 4/4
+-------------+------------+
| firstname   | lastname   |
|-------------+------------|
| Jane        | Doe        |
| John        | Doe        |
| Jorge       | Souza      |
| Juan        | Li         |
+-------------+------------+
```

**Additional examples**
+ `source = table`
+ `source = table | fields a,b,c`
+ `source = table | fields + a,b,c`
+ `source = table | fields - b,c`
+ `source = table | eval b1 = b | fields - b1,c`

Nested-fields example:

```
`source = catalog.schema.table1, catalog.schema.table2 | fields A.nested1, B.nested1`
`source = catalog.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields  int_col, struct_col.field1.subfield, struct_col2.field1.subfield`
`source = catalog.schema.table | where struct_col2.field1.subfield > 'valueA' | sort int_col | fields  int_col, struct_col.field1.subfield, struct_col2.field1.subfield`
```

#### flatten command
<a name="supported-ppl-flatten-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the flatten command to expand fields of the following types:
+ `struct<?,?>`
+ `array<struct<?,?>>`

**Syntax**  
Use the following syntax:

```
flatten <field>
```
+ *field*: The field to be flattened. The field must be of a supported type.

**Schema**


| col_name | data_type | 
| --- | --- | 
| _time | string | 
| bridges | array<struct<length:bigint,name:string>> | 
| city | string | 
| coor | struct<alt:bigint,lat:double,long:double> | 
| country | string | 

**Data**  



| _time | bridges | city | coor | country | 
| --- | --- | --- | --- | --- | 
| 2024-09-13T12:00:00 | [{801, Tower Bridge}, {928, London Bridge}] | London | {35, 51.5074, -0.1278} | England | 
| 2024-09-13T12:00:00 | [{232, Pont Neuf}, {160, Pont Alexandre III}] | Paris | {35, 48.8566, 2.3522} | France | 
| 2024-09-13T12:00:00 | [{48, Rialto Bridge}, {11, Bridge of Sighs}] | Venice | {2, 45.4408, 12.3155} | Italy | 
| 2024-09-13T12:00:00 | [{516, Charles Bridge}, {343, Legion Bridge}] | Prague | {200, 50.0755, 14.4378} | Czech Republic | 
| 2024-09-13T12:00:00 | [{375, Chain Bridge}, {333, Liberty Bridge}] | Budapest | {96, 47.4979, 19.0402} | Hungary | 
| 1990-09-13T12:00:00 | NULL | Warsaw | NULL | Poland | 

**Example 1: flatten struct**  
This example shows how to flatten a struct field.

PPL query:

```
source=table | flatten coor
```


| _time | bridges | city | country | alt | lat | long | 
| --- | --- | --- | --- | --- | --- | --- | 
| 2024-09-13T12:00:00 | [{801, Tower Bridge}, {928, London Bridge}] | London | England | 35 | 51.5074 | -0.1278 | 
| 2024-09-13T12:00:00 | [{232, Pont Neuf}, {160, Pont Alexandre III}] | Paris | France | 35 | 48.8566 | 2.3522 | 
| 2024-09-13T12:00:00 | [{48, Rialto Bridge}, {11, Bridge of Sighs}] | Venice | Italy | 2 | 45.4408 | 12.3155 | 
| 2024-09-13T12:00:00 | [{516, Charles Bridge}, {343, Legion Bridge}] | Prague | Czech Republic | 200 | 50.0755 | 14.4378 | 
| 2024-09-13T12:00:00 | [{375, Chain Bridge}, {333, Liberty Bridge}] | Budapest | Hungary | 96 | 47.4979 | 19.0402 | 
| 1990-09-13T12:00:00 | NULL | Warsaw | Poland | NULL | NULL | NULL | 

**Example 2: flatten array**  
The example shows how to flatten an array of struct fields.

PPL query:

```
source=table | flatten bridges
```


| _time | city | coor | country | length | name | 
| --- | --- | --- | --- | --- | --- | 
| 2024-09-13T12:00:00 | London | {35, 51.5074, -0.1278} | England | 801 | Tower Bridge | 
| 2024-09-13T12:00:00 | London | {35, 51.5074, -0.1278} | England | 928 | London Bridge | 
| 2024-09-13T12:00:00 | Paris | {35, 48.8566, 2.3522} | France | 232 | Pont Neuf | 
| 2024-09-13T12:00:00 | Paris | {35, 48.8566, 2.3522} | France | 160 | Pont Alexandre III | 
| 2024-09-13T12:00:00 | Venice | {2, 45.4408, 12.3155} | Italy | 48 | Rialto Bridge | 
| 2024-09-13T12:00:00 | Venice | {2, 45.4408, 12.3155} | Italy | 11 | Bridge of Sighs | 
| 2024-09-13T12:00:00 | Prague | {200, 50.0755, 14.4378} | Czech Republic | 516 | Charles Bridge | 
| 2024-09-13T12:00:00 | Prague | {200, 50.0755, 14.4378} | Czech Republic | 343 | Legion Bridge | 
| 2024-09-13T12:00:00 | Budapest | {96, 47.4979, 19.0402} | Hungary | 375 | Chain Bridge | 
| 2024-09-13T12:00:00 | Budapest | {96, 47.4979, 19.0402} | Hungary | 333 | Liberty Bridge | 
| 1990-09-13T12:00:00 | Warsaw | NULL | Poland | NULL | NULL | 

**Example 3: flatten array and struct**  
This example shows how to flatten multiple fields.

PPL query:

```
source=table | flatten bridges | flatten coor
```


| _time | city | country | length | name | alt | lat | long | 
| --- | --- | --- | --- | --- | --- | --- | --- | 
| 2024-09-13T12:00:00 | London | England | 801 | Tower Bridge | 35 | 51.5074 | -0.1278 | 
| 2024-09-13T12:00:00 | London | England | 928 | London Bridge | 35 | 51.5074 | -0.1278 | 
| 2024-09-13T12:00:00 | Paris | France | 232 | Pont Neuf | 35 | 48.8566 | 2.3522 | 
| 2024-09-13T12:00:00 | Paris | France | 160 | Pont Alexandre III | 35 | 48.8566 | 2.3522 | 
| 2024-09-13T12:00:00 | Venice | Italy | 48 | Rialto Bridge | 2 | 45.4408 | 12.3155 | 
| 2024-09-13T12:00:00 | Venice | Italy | 11 | Bridge of Sighs | 2 | 45.4408 | 12.3155 | 
| 2024-09-13T12:00:00 | Prague | Czech Republic | 516 | Charles Bridge | 200 | 50.0755 | 14.4378 | 
| 2024-09-13T12:00:00 | Prague | Czech Republic | 343 | Legion Bridge | 200 | 50.0755 | 14.4378 | 
| 2024-09-13T12:00:00 | Budapest | Hungary | 375 | Chain Bridge | 96 | 47.4979 | 19.0402 | 
| 2024-09-13T12:00:00 | Budapest | Hungary | 333 | Liberty Bridge | 96 | 47.4979 | 19.0402 | 
| 1990-09-13T12:00:00 | Warsaw | Poland | NULL | NULL | NULL | NULL | NULL | 

#### grok command
<a name="supported-ppl-grok-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

The `grok` command parses a text field with a grok pattern and appends the results to the search result.

**Syntax**  
Use the following syntax:

```
grok <field> <pattern>
```

**field**
+ Mandatory. 
+ The field must be a text field.

**pattern**
+ Mandatory. 
+ The grok pattern used to extract new fields from the given text field. 
+ If a new field name already exists, it will replace the original field.

**Grok pattern**  
The grok pattern is matched against the text field of each document to extract new fields. A pattern element takes the form `%{PATTERN_NAME:field_name}`, where `PATTERN_NAME` is a predefined pattern (such as `HOSTNAME` or `NUMBER`) and `field_name` is the name of the new field to create.

**Example 1: Create the new field**  
This example shows how to create a new field `host` for each document. `host` will be the host name after `@` in the `email` field. Parsing a null field will return an empty string.

```
os> source=accounts | grok email '.+@%{HOSTNAME:host}' | fields email, host ;
fetched rows / total rows = 4/4
+-------------------------+-------------+
| email                   | host        |
|-------------------------+-------------|
| jane_doe@example.com    | example.com |
| arnav_desai@example.net | example.net |
| null                    |             |
| juan_li@example.org     | example.org |
+-------------------------+-------------+
```

**Example 2: Override the existing field**  
This example shows how to override the existing `address` field with the street number removed.

```
os> source=accounts | grok address '%{NUMBER} %{GREEDYDATA:address}' | fields address ;
fetched rows / total rows = 4/4
+------------------+
| address          |
|------------------|
| Example Lane     |
| Any Street       |
| Main Street      |
| Example Court    |
+------------------+
```

**Example 3: Using grok to parse logs**  
This example shows how to use grok to parse raw logs.

```
os> source=apache | grok message '%{COMMONAPACHELOG}' | fields COMMONAPACHELOG, timestamp, response, bytes ;
fetched rows / total rows = 4/4
+-----------------------------------------------------------------------------------------------------------------------------+----------------------------+------------+---------+
| COMMONAPACHELOG                                                                                                             | timestamp                  | response   | bytes   |
|-----------------------------------------------------------------------------------------------------------------------------+----------------------------+------------+---------|
| 177.95.8.74 - upton5450 [28/Sep/2022:10:15:57 -0700] "HEAD /e-business/mindshare HTTP/1.0" 404 19927                        | 28/Sep/2022:10:15:57 -0700 | 404        | 19927   |
| 127.45.152.6 - pouros8756 [28/Sep/2022:10:15:57 -0700] "GET /architectures/convergence/niches/mindshare HTTP/1.0" 100 28722 | 28/Sep/2022:10:15:57 -0700 | 100        | 28722   |
| *************** - - [28/Sep/2022:10:15:57 -0700] "PATCH /strategize/out-of-the-box HTTP/1.0" 401 27439                      | 28/Sep/2022:10:15:57 -0700 | 401        | 27439   |
| ************** - - [28/Sep/2022:10:15:57 -0700] "POST /users HTTP/1.1" 301 9481                                             | 28/Sep/2022:10:15:57 -0700 | 301        | 9481    |
+-----------------------------------------------------------------------------------------------------------------------------+----------------------------+------------+---------+
```

**Limitations**  
The grok command has the same limitations as the parse command.

#### head command
<a name="supported-ppl-head-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `head` command to return the first N results, after an optional offset, in search order.

**Syntax**  
Use the following syntax:

```
head [<size>] [from <offset>]
```

**<size>**
+ Optional integer. 
+ The number of results to return. 
+ Default: 10

**<offset>**
+ Integer after optional `from`. 
+ The number of results to skip. 
+ Default: 0

**Example 1: Get first 10 results**  
This example shows how to retrieve a maximum of 10 results from the accounts index.

PPL query:

```
os> source=accounts | fields firstname, age | head;
fetched rows / total rows = 4/4
+-------------+-------+
| firstname   | age   |
|-------------+-------|
| Jane        | 32    |
| John        | 36    |
| Jorge       | 28    |
| Juan        | 33    |
+-------------+-------+
```

**Example 2: Get first N results**  
This example shows how to retrieve the first N results from the accounts index.

PPL query:

```
os> source=accounts | fields firstname, age | head 3;
fetched rows / total rows = 3/3
+-------------+-------+
| firstname   | age   |
|-------------+-------|
| Jane        | 32    |
| John        | 36    |
| Jorge       | 28    |
+-------------+-------+
```

**Example 3: Get first N results after offset M**  
This example shows how to retrieve the first N results after skipping M results from the accounts index.

PPL query:

```
os> source=accounts | fields firstname, age | head 3 from 1;
fetched rows / total rows = 3/3
+-------------+-------+
| firstname   | age   |
|-------------+-------|
| John        | 36    |
| Jorge       | 28    |
| Juan        | 33    |
+-------------+-------+
```

#### join command
<a name="supported-ppl-join-commands"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

The join command allows you to combine data from multiple sources based on common fields, enabling you to perform complex analyses and gain deeper insights from your distributed datasets.

**Schema**  
There are at least two indices: `otel-v1-apm-span-*` (large) and `otel-v1-apm-service-map` (small).

Relevant fields from indices:

**`otel-v1-apm-span-*`**
+ traceId - A unique identifier for a trace. All spans from the same trace share the same traceId.
+ spanId - A unique identifier for a span within a trace, assigned when the span is created.
+ parentSpanId - The spanId of this span's parent span. If this is a root span, then this field must be empty.
+ durationInNanos - The difference in nanoseconds between startTime and endTime. (this is `latency` in UI)
+ serviceName - The resource from which the span originates.
+ traceGroup - The name of the trace's root span.

**`otel-v1-apm-service-map`**
+ serviceName - The name of the service that emitted the span.
+ destination.domain - The serviceName of the service being called by this client.
+ destination.resource - The span name (API, operation, and so on) being called by this client.
+ target.domain - The serviceName of the service being called by a client.
+ target.resource - The span name (API, operation, and so on) being called by a client.
+ traceGroupName - The top-level span name that started the request chain.

**Requirement**  
Support **join** to calculate the following:

For each service, join the span index on the service map index to calculate metrics under different types of filters.

This sample query calculates latency when filtered by trace group `client_cancel_order` for the `order` service. 

```
SELECT avg(durationInNanos)
FROM `otel-v1-apm-span-000001` t1
WHERE t1.serviceName = 'order'
  AND ((t1.name in
          (SELECT target.resource
           FROM `otel-v1-apm-service-map`
           WHERE serviceName = 'order'
             AND traceGroupName = 'client_cancel_order')
        AND t1.parentSpanId IS NOT NULL)
       OR (t1.parentSpanId IS NULL
           AND t1.name = 'client_cancel_order'))
  AND t1.traceId in
    (SELECT traceId
     FROM `otel-v1-apm-span-000001`
     WHERE serviceName = 'order')
```

**Migrate to PPL**  
Syntax of the join command

```
SEARCH source=<left-table>
| <other piped command>
| [joinType] JOIN
    [leftAlias]
    ON joinCriteria
    <right-table>
| <other piped command>
```

**Rewriting**  


```
SEARCH source=otel-v1-apm-span-000001
| WHERE serviceName = 'order'
| JOIN left=t1 right=t2
    ON t1.traceId = t2.traceId AND t2.serviceName = 'order'
    otel-v1-apm-span-000001 -- self inner join
| EVAL s_name = t1.name -- rename to avoid ambiguity
| EVAL s_parentSpanId = t1.parentSpanId -- RENAME command would be better when it is supported
| EVAL s_durationInNanos = t1.durationInNanos 
| FIELDS s_name, s_parentSpanId, s_durationInNanos -- reduce columns in join
| LEFT JOIN left=s1 right=t3
    ON s_name = t3.target.resource AND t3.serviceName = 'order' AND t3.traceGroupName = 'client_cancel_order'
    otel-v1-apm-service-map
| WHERE (s_parentSpanId IS NOT NULL OR (s_parentSpanId IS NULL AND s_name = 'client_cancel_order'))
| STATS avg(s_durationInNanos) -- no alias needed when there is no ambiguity
```

**joinType**
+ Syntax: `INNER | LEFT OUTER | CROSS`
+ Optional
+ The type of join to perform. The default is `INNER` if not specified.

**leftAlias**
+ Syntax: `left = <leftAlias>`
+ Optional
+ The subquery alias to use with the left join side, to avoid ambiguous naming.

**joinCriteria**
+ Syntax: `<expression>`
+ Required
+ The syntax starts with `ON`. It could be any comparison expression. Generally, the join criteria looks like `<leftAlias>.<leftField>=<rightAlias>.<rightField>`. 

  For example: `l.id = r.id`. If the join criteria contains multiple conditions, you can specify `AND` and `OR` operator between each comparison expression. For example, `l.id = r.id AND l.email = r.email AND (r.age > 65 OR r.age < 18)`.
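
Putting these pieces together, a minimal inner join might look like the following sketch, which assumes hypothetical `orders` and `customers` indices joined on a `customer_id`/`id` key pair:

```
SEARCH source=orders
| INNER JOIN left=o right=c
    ON o.customer_id = c.id
    customers
| FIELDS o.order_id, c.name
```

Each `orders` row is paired with its matching `customers` row, and only the order ID and customer name are kept in the output.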

**More examples**  
Migration from SQL query (TPC-H Q13):

```
SELECT c_count, COUNT(*) AS custdist
FROM
  ( SELECT c_custkey, COUNT(o_orderkey) c_count
    FROM customer LEFT OUTER JOIN orders ON c_custkey = o_custkey
        AND o_comment NOT LIKE '%unusual%packages%'
    GROUP BY c_custkey
  ) AS c_orders
GROUP BY c_count
ORDER BY custdist DESC, c_count DESC;
```

Rewritten by PPL join query:

```
SEARCH source=customer
| FIELDS c_custkey
| LEFT OUTER JOIN
    ON c_custkey = o_custkey AND o_comment NOT LIKE '%unusual%packages%'
    orders
| STATS count(o_orderkey) AS c_count BY c_custkey
| STATS count() AS custdist BY c_count
| SORT - custdist, - c_count
```

Limitation: subsearches are unsupported on the right side of a join.

If subsearches were supported, you could rewrite the preceding PPL query as follows:

```
SEARCH source=customer
| FIELDS c_custkey
| LEFT OUTER JOIN
   ON c_custkey = o_custkey
   [
      SEARCH source=orders
      | WHERE o_comment NOT LIKE '%unusual%packages%'
      | FIELDS o_orderkey, o_custkey
   ]
| STATS count(o_orderkey) AS c_count BY c_custkey
| STATS count() AS custdist BY c_count
| SORT - custdist, - c_count
```

#### lookup command
<a name="supported-ppl-lookup-commands"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `lookup` command to enrich your search data by adding or replacing data from a lookup index (dimension table). This command allows you to extend fields of an index with values from a dimension table. You can also use it to append or replace values when lookup conditions are met. The `lookup` command is more suitable than the `Join` command for enriching source data with a static dataset.

**Syntax**  
Use the following syntax:

```
SEARCH source=<sourceIndex>
| <other piped command>
| LOOKUP <lookupIndex> (<lookupMappingField> [AS <sourceMappingField>])...
    [(REPLACE | APPEND) (<inputField> [AS <outputField>])...]
| <other piped command>
```

**lookupIndex**
+ Required.
+ The name of the lookup index (dimension table).

**lookupMappingField**
+ Required.
+ A mapping key in the lookup index, analogous to a join key from the right table. You can specify multiple fields, separated by commas.

**sourceMappingField**
+ Optional.
+ Default: <lookupMappingField>.
+ A mapping key from the source query, analogous to a join key from the left side.

**inputField**
+ Optional.
+ Default: All fields of the lookup index where matched values are found.
+ A field in the lookup index where matched values are applied to the result output. You can specify multiple fields, separated by commas.

**outputField**
+ Optional.
+ Default: `<inputField>`.
+ A field in the output. You can specify multiple output fields. If you specify an existing field name from the source query, its values will be replaced or appended by matched values from inputField. If you specify a new field name, it will be added to the results.

**REPLACE | APPEND**
+ Optional.
+ Default: `REPLACE`
+ Specifies how to handle matched values. If you specify `REPLACE`, matched values from the lookup index overwrite the values in the result. If you specify `APPEND`, matched values only fill in values that are missing from the result.

**Usage**
+ LOOKUP <lookupIndex> id AS cid REPLACE mail AS email
+ LOOKUP <lookupIndex> name REPLACE mail AS email
+ LOOKUP <lookupIndex> id AS cid, name APPEND address, mail AS email
+ LOOKUP <lookupIndex> id
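
To sketch the difference between the two modes, assume a hypothetical `customers` lookup index with `id` and `email` fields: `REPLACE` overwrites existing `email` values in the result with matched lookup values, while `APPEND` only fills in rows where `email` is missing:

```
SEARCH source=orders
| LOOKUP customers id AS customer_id REPLACE email

SEARCH source=orders
| LOOKUP customers id AS customer_id APPEND email
```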

**Example**  
See the following examples.

```
SEARCH source=<sourceIndex>
| WHERE orderType = 'Cancelled'
| LOOKUP account_list, mkt_id AS mkt_code REPLACE amount, account_name AS name
| STATS count(mkt_code), avg(amount) BY name
```

```
SEARCH source=<sourceIndex>
| DEDUP market_id
| EVAL category=replace(category, "-", ".")
| EVAL category=ltrim(category, "dvp.")
| LOOKUP bounce_category category AS category APPEND classification
```

```
SEARCH source=<sourceIndex>
| LOOKUP bounce_category category
```

#### parse command
<a name="supported-ppl-parse-command"></a>

The `parse` command parses a text field with a regular expression and appends the result to the search result.

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

**Syntax**  
Use the following syntax:

```
parse <field> <pattern>    
```

**`field`**
+ Mandatory. 
+ The field must be a text field.

**`pattern`**
+ Mandatory string. 
+ This is the regular expression pattern used to extract new fields from the given text field. 
+ If a new field name already exists, it will replace the original field.

**Regular expression**  
The regular expression pattern is matched against the whole text field of each document using the Java regex engine. Each named capture group in the expression becomes a new `STRING` field.

**Example 1: Create a new field**  
The example shows how to create a new field `host` for each document. `host` will be the host name after `@` in the `email` field. Parsing a null field will return an empty string.

PPL query:

```
os> source=accounts | parse email '.+@(?<host>.+)' | fields email, host ;
fetched rows / total rows = 4/4
+-----------------------+-------------+
| email                 | host        |
|-----------------------+-------------|
| jane_doe@example.com  | example.com |
| john_doe@example.net  | example.net |
| null                  |             |
| juan_li@example.org   | example.org |
+-----------------------+-------------+
```

**Example 2: Override an existing field**  
The example shows how to override the existing `address` field with the street number removed.

PPL query:

```
os> source=accounts | parse address '\d+ (?<address>.+)' | fields address ;
fetched rows / total rows = 4/4
+------------------+
| address          |
|------------------|
| Example Lane     |
| Example Street   |
| Example Avenue   |
| Example Court    |
+------------------+
```

**Example 3: Filter and sort by casted parsed field**  
The example shows how to sort street numbers that are higher than 500 in the `address` field.

PPL query:

```
os> source=accounts | parse address '(?<streetNumber>\d+) (?<street>.+)' | where cast(streetNumber as int) > 500 | sort num(streetNumber) | fields streetNumber, street ;
fetched rows / total rows = 3/3
+----------------+----------------+
| streetNumber   | street         |
|----------------+----------------|
| ***            | Example Street |
| ***            | Example Avenue |
| 880            | Example Lane   |
+----------------+----------------+
```

**Limitations**  
There are a few limitations with the parse command:
+ Fields defined by parse cannot be parsed again.

  The following command will not work:

  ```
  source=accounts | parse address '\d+ (?<street>.+)' | parse street '\w+ (?<road>\w+)'
  ```
+ Fields defined by parse cannot be overridden with other commands.

  `where` will not match any documents since `street` cannot be overridden:

  ```
  source=accounts | parse address '\d+ (?<street>.+)' | eval street='1' | where street='1' ;        
  ```
+ The text field used by parse cannot be overridden.

  `street` will not be successfully parsed since `address` is overridden:

  ```
  source=accounts | parse address '\d+ (?<street>.+)' | eval address='1' ;        
  ```
+ Fields defined by parse cannot be filtered or sorted after using them in the `stats` command.

  `where` in the following command will not work:

  ```
  source=accounts | parse email '.+@(?<host>.+)' | stats avg(age) by host | where host=pyrami.com ;        
  ```

#### patterns command
<a name="supported-ppl-patterns-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

The `patterns` command extracts log patterns from a text field and appends the results to the search result. Grouping logs by their patterns makes it easier to aggregate stats from large volumes of log data for analysis and troubleshooting.

**Syntax**  
Use the following syntax:

```
patterns [new_field=<new-field-name>] [pattern=<pattern>] <field>    
```

**new-field-name**
+ Optional string. 
+ This is the name of the new field for extracted patterns.
+ The default is `patterns_field`. 
+ If the name already exists, it will replace the original field.

**pattern**
+ Optional string. 
+ This is the regex pattern of characters that should be filtered out from the text field. 
+ If absent, the default pattern is alphanumeric characters (`[a-zA-Z\d]`).

**field**
+ Mandatory. 
+ The field must be a text field.

**Example 1: Create the new field**  
This example shows how to extract punctuation from the `email` field for each document. Parsing a null field will return an empty string.

PPL query:

```
os> source=accounts | patterns email | fields email, patterns_field ;
fetched rows / total rows = 4/4
+-----------------------+------------------+
| email                 | patterns_field   |
|-----------------------+------------------|
| jane_doe@example.com  | @.               |
| john_doe@example.net  | @.               |
| null                  |                  |
| juan_li@example.org   | @.               |
+-----------------------+------------------+
```

**Example 2: Extract log patterns**  
This example shows how to extract punctuation from a raw log field using the default pattern.

PPL query:

```
os> source=apache | patterns message | fields message, patterns_field ;
fetched rows / total rows = 4/4
+-----------------------------------------------------------------------------------------------------------------------------+---------------------------------+
| message                                                                                                                     | patterns_field                  |
|-----------------------------------------------------------------------------------------------------------------------------+---------------------------------|
| 177.95.8.74 - upton5450 [28/Sep/2022:10:15:57 -0700] "HEAD /e-business/mindshare HTTP/1.0" 404 19927                        | ... -  [//::: -] " /-/ /."      |
| ************ - pouros8756 [28/Sep/2022:10:15:57 -0700] "GET /architectures/convergence/niches/mindshare HTTP/1.0" 100 28722 | ... -  [//::: -] " //// /."     |
| *************** - - [28/Sep/2022:10:15:57 -0700] "PATCH /strategize/out-of-the-box HTTP/1.0" 401 27439                      | ... - - [//::: -] " //--- /."   |
| ************** - - [28/Sep/2022:10:15:57 -0700] "POST /users HTTP/1.1" 301 9481                                             | ... - - [//::: -] " / /."       |
+-----------------------------------------------------------------------------------------------------------------------------+---------------------------------+
```

**Example 3: Extract log patterns with custom regex pattern**  
This example shows how to extract punctuation from a raw log field using a user-defined pattern.

PPL query:

```
os> source=apache | patterns new_field='no_numbers' pattern='[0-9]' message | fields message, no_numbers ;
fetched rows / total rows = 4/4
+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| message                                                                                                                     | no_numbers                                                                           |
|-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------|
| 177.95.8.74 - upton5450 [28/Sep/2022:10:15:57 -0700] "HEAD /e-business/mindshare HTTP/1.0" 404 19927                        | ... - upton [/Sep/::: -] "HEAD /e-business/mindshare HTTP/."                         |
| 127.45.152.6 - pouros8756 [28/Sep/2022:10:15:57 -0700] "GET /architectures/convergence/niches/mindshare HTTP/1.0" 100 28722 | ... - pouros [/Sep/::: -] "GET /architectures/convergence/niches/mindshare HTTP/."   |
| *************** - - [28/Sep/2022:10:15:57 -0700] "PATCH /strategize/out-of-the-box HTTP/1.0" 401 27439                      | ... - - [/Sep/::: -] "PATCH /strategize/out-of-the-box HTTP/."                       |
| ************** - - [28/Sep/2022:10:15:57 -0700] "POST /users HTTP/1.1" 301 9481                                             | ... - - [/Sep/::: -] "POST /users HTTP/."                                            |
+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
```

**Limitation**  
The patterns command has the same limitations as the parse command.

#### rare command
<a name="supported-ppl-rare-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `rare` command to find the least common tuple of values of all fields in the field list.

**Note**  
A maximum of 10 results is returned for each distinct tuple of values of the group-by fields.

**Syntax**  
Use the following syntax:

```
rare [N] <field-list> [by-clause]
rare_approx [N] <field-list> [by-clause]
```

**field-list**
+ Mandatory. 
+ A comma-delimited list of field names.

**by-clause**
+ Optional. 
+ One or more fields to group the results by.

**N**
+ The number of results to return. 
+ Default: 10

**rare_approx**
+ The approximate count of the rare (N) fields, using estimated cardinality with the [HyperLogLog++ algorithm](https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html).

**Example 1: Find the least common values in a field**  
The example finds the least common gender of all the accounts.

PPL query:

```
os> source=accounts | rare gender;
os> source=accounts | rare_approx 10 gender;
os> source=accounts | rare_approx gender;
fetched rows / total rows = 2/2
+----------+
| gender   |
|----------|
| F        |
| M        |
+----------+
```

**Example 2: Find the least common values organized by gender**  
This example finds the least common age of all the accounts, grouped by gender.

PPL query:

```
os> source=accounts | rare 5 age by gender;
os> source=accounts | rare_approx 5 age by gender;
fetched rows / total rows = 4/4
+----------+-------+
| gender   | age   |
|----------+-------|
| F        | 28    |
| M        | 32    |
| M        | 33    |
| M        | 36    |
+----------+-------+
```

#### rename command
<a name="supported-ppl-rename-command"></a>

Use the `rename` command to change the names of one or more fields in the search result.

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

**Syntax**  
Use the following syntax:

```
rename <source-field> AS <target-field>["," <source-field> AS <target-field>]...    
```

**source-field**
+ Mandatory. 
+ This is the name of the field you want to rename.

**target-field**
+ Mandatory. 
+ This is the name you want to rename to.

**Example 1: Rename one field**  
This example shows how to rename a single field.

PPL query:

```
os> source=accounts | rename account_number as an | fields an;
fetched rows / total rows = 4/4
+------+
| an   |
|------|
| 1    |
| 6    |
| 13   |
| 18   |
+------+
```

**Example 2: Rename multiple fields**  
This example shows how to rename multiple fields.

PPL query:

```
os> source=accounts | rename account_number as an, employer as emp | fields an, emp;
fetched rows / total rows = 4/4
+------+---------+
| an   | emp     |
|------+---------|
| 1    | Pyrami  |
| 6    | Netagy  |
| 13   | Quility |
| 18   | null    |
+------+---------+
```

**Limitations**
+ Overriding an existing field is unsupported. The following query is not allowed because it renames `account_number` to the existing `age` field:

  ```
  source=accounts | rename account_number as age        
  ```

#### search command
<a name="supported-ppl-search-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `search` command to retrieve documents from an index. The `search` command can only be used as the first command in a PPL query.

**Syntax**  
Use the following syntax:

```
search source=[<remote-cluster>:]<index> [boolean-expression]    
```

**search**
+ Optional.
+ Search keywords, which can be omitted.

**index**
+ Mandatory.
+ The search command must specify which index to query from. 
+ The index name can be prefixed by `<cluster name>:` for cross-cluster searches.

**bool-expression**
+ Optional. 
+ Any expression that evaluates to a boolean value.
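
For example, to query an index on a connected remote cluster, prefix the index name with the cluster connection name. The following sketch assumes a hypothetical connection named `my_remote`:

```
search source=my_remote:accounts age > 30
```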

**Example 1: Fetch all the data**  
This example fetches all documents from the accounts index.

PPL query:

```
os> source=accounts;
+------------------+-------------+----------------------+-----------+----------+--------+----------------+---------+-------+-----------------------+------------+
| account_number   | firstname   | address              | balance   | gender   | city   | employer       | state   | age   | email                 | lastname   |
|------------------+-------------+----------------------+-----------+----------+--------+----------------+---------+-------+-----------------------+------------|
| 1                | Jorge       | *** Any Lane         | 39225     | M        | Brogan | ExampleCorp    | IL      | 32    | jane_doe@example.com  | Souza      |
| 6                | John        | *** Example Street   | 5686      | M        | Dante  | AnyCorp        | TN      | 36    | john_doe@example.com  | Doe        |
| 13               | Jane        | *** Any Street       | *****     | F        | Nogal  | ExampleCompany | VA      | 28    | null                  | Doe        |
| 18               | Juan        | *** Example Court    | 4180      | M        | Orick  | null           | MD      | 33    | juan_li@example.org   | Li         |
+------------------+-------------+----------------------+-----------+----------+--------+----------------+---------+-------+-----------------------+------------+
```

**Example 2: Fetch data with condition**  
This example fetches documents from the accounts index that match the given condition.

PPL query:

```
os> SEARCH source=accounts account_number=1 or gender="F";
+------------------+-------------+--------------------+-----------+----------+--------+----------------+---------+-------+-------------------------+------------+
| account_number   | firstname   | address            | balance   | gender   | city   | employer       | state   | age   | email                   | lastname   |
|------------------+-------------+--------------------+-----------+----------+--------+----------------+---------+-------+-------------------------+------------|
| 1                | Jorge       | *** Any Lane       | *****     | M        | Brogan | ExampleCorp    | IL      | 32    | jorge_souza@example.com | Souza      |
| 13               | Jane        | *** Any Street     | *****     | F        | Nogal  | ExampleCompany | VA      | 28    | null                    | Doe        |
+------------------+-------------+--------------------+-----------+----------+--------+----------------+---------+-------+-------------------------+------------+
```

#### sort command
<a name="supported-ppl-sort-command"></a>

Use the `sort` command to sort search results by specified fields.

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

**Syntax**  
Use the following syntax:

```
sort <[+|-] sort-field>...
```

**[+|-]**
+ Optional. 
+ The plus [+] stands for ascending order with NULL/MISSING values first.
+ The minus [-] stands for descending order with NULL/MISSING values last.
+ Default: Ascending order with NULL/MISSING values first.

**sort-field**
+ Mandatory. 
+ The field used for sorting.
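The null-ordering rules above can be sketched in Python. This is an illustration of the ordering semantics only, not how OpenSearch evaluates the command:

```python
# Ascending order with NULL values first: sort on an (is-not-null, value) key
rows = [{"employer": "AnyCorp"}, {"employer": None}, {"employer": "AnyCompany"}]
ordered = sorted(rows, key=lambda r: (r["employer"] is not None, r["employer"] or ""))
# employers in order: None, AnyCompany, AnyCorp
```

Passing `reverse=True` to `sorted` models the `-` prefix: descending order with NULL values last.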

**Example 1: Sort by one field**  
This example sorts documents by the `age` field in ascending order.

PPL query:

```
os> source=accounts | sort age | fields account_number, age;
fetched rows / total rows = 4/4
+------------------+-------+
| account_number   | age   |
|------------------+-------|
| 13               | 28    |
| 1                | 32    |
| 18               | 33    |
| 6                | 36    |
+------------------+-------+
```

**Example 2: Sort by one field in descending order**  
This example sorts documents by the `age` field in descending order.

PPL query:

```
os> source=accounts | sort - age | fields account_number, age;
fetched rows / total rows = 4/4
+------------------+-------+
| account_number   | age   |
|------------------+-------|
| 6                | 36    |
| 18               | 33    |
| 1                | 32    |
| 13               | 28    |
+------------------+-------+
```

**Example 3: Sort by multiple fields**  
This example sorts documents by the `gender` field in ascending order and the `age` field in descending order.

PPL query:

```
os> source=accounts | sort + gender, - age | fields account_number, gender, age;
fetched rows / total rows = 4/4
+------------------+----------+-------+
| account_number   | gender   | age   |
|------------------+----------+-------|
| 13               | F        | 28    |
| 6                | M        | 36    |
| 18               | M        | 33    |
| 1                | M        | 32    |
+------------------+----------+-------+
```

**Example 4: Sort by a field that includes null values**  
This example sorts the employer field using the default option (ascending order with null first). The result shows that the null value is in the first row.

PPL query:

```
os> source=accounts | sort employer | fields employer;
fetched rows / total rows = 4/4
+------------+
| employer   |
|------------|
| null       |
| AnyCompany |
| AnyCorp    |
| AnyOrgty   |
+------------+
```

#### stats command
<a name="supported-ppl-stats-command"></a>

Use the `stats` command to calculate aggregations from search results.

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

**NULL/MISSING values handling**  

| Function | NULL | MISSING | 
| --- | --- | --- | 
| COUNT | Not counted | Not counted | 
| SUM | Ignore | Ignore | 
| AVG | Ignore | Ignore | 
| MAX | Ignore | Ignore | 
| MIN | Ignore | Ignore | 
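The table above can be illustrated with a small Python sketch. This is for intuition only, not how OpenSearch executes the aggregation:

```python
# NULL values are skipped before aggregating, matching the table above
ages = [32, None, 28, None, 36]
non_null = [a for a in ages if a is not None]

count_age = len(non_null)      # COUNT: NULL values are not counted
sum_age = sum(non_null)        # SUM ignores NULL
avg_age = sum_age / count_age  # AVG ignores NULL
max_age = max(non_null)        # MAX ignores NULL
min_age = min(non_null)        # MIN ignores NULL
```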

**Syntax**  
Use the following syntax:

```
stats <aggregation>... [by-clause]    
```

**aggregation**
+ Mandatory. 
+ An aggregation function applied to a field.

**by-clause**
+ Optional.
+ Syntax: `by [span-expression,] [field,]...`
+ Specifies fields and expressions for grouping the aggregation results. You can use scalar functions, aggregation functions, and span expressions to split specific fields into buckets of equal intervals. 
+ Default: If no `<by-clause>` is specified, the stats command returns a single row representing the aggregation over the entire result set.

**span-expression**  

+ Optional, at most one.
+ Syntax: `span(field_expr, interval_expr)`
+ By default, the interval is interpreted in the field's natural unit. If the field is a date or time type and the interval uses date/time units, specify the unit in the interval expression.
+ For example, to split the `age` field into 10-year buckets, use `span(age, 10)`. To split a timestamp field into hourly intervals, use `span(timestamp, 1h)`.


**Available time units**  

| Span interval units | 
| --- | 
| millisecond (ms) | 
| second (s) | 
| minute (m, case sensitive) | 
| hour (h) | 
| day (d) | 
| week (w) | 
| month (M, case sensitive) | 
| quarter (q) | 
| year (y) | 
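For intuition, `span(age, 10)` assigns each value to the interval that starts at the nearest lower multiple of 10. The following Python sketch (illustrative only) reproduces that bucketing for the sample account ages:

```python
# span(age, 10): floor each value to the start of its 10-year interval
ages = [32, 36, 28, 33]
buckets = {}
for age in ages:
    bucket = (age // 10) * 10
    buckets[bucket] = buckets.get(bucket, 0) + 1
# buckets: {30: 3, 20: 1} -- one account in the 20s, three in the 30s
```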

**Aggregation functions**  


**`COUNT`**  
Use `COUNT(expr)` to return a count of the rows retrieved that contain non-null values of expr.

Example:

```
os> source=accounts | stats count();
fetched rows / total rows = 1/1
+-----------+
| count()   |
|-----------|
| 4         |
+-----------+
```

**`SUM`**  
Use `SUM(expr)` to return the sum of expr.

Example

```
os> source=accounts | stats sum(age) by gender;
fetched rows / total rows = 2/2
+------------+----------+
| sum(age)   | gender   |
|------------+----------|
| 28         | F        |
| 101        | M        |
+------------+----------+
```

**`AVG`**  
Use `AVG(expr)` to return the average value of expr.

Example

```
os> source=accounts | stats avg(age) by gender;
fetched rows / total rows = 2/2
+--------------------+----------+
| avg(age)           | gender   |
|--------------------+----------|
| 28.0               | F        |
| 33.666666666666664 | M        |
+--------------------+----------+
```

**`MAX`**  
Use `MAX(expr)` to return the maximum value of expr.

Example

```
os> source=accounts | stats max(age);
fetched rows / total rows = 1/1
+------------+
| max(age)   |
|------------|
| 36         |
+------------+
```

**`MIN`**  
Use `MIN(expr)` to return the minimum value of expr.

Example

```
os> source=accounts | stats min(age);
fetched rows / total rows = 1/1
+------------+
| min(age)   |
|------------|
| 28         |
+------------+
```

**`STDDEV_SAMP`**  
Use `STDDEV_SAMP(expr)` to return the sample standard deviation of expr.

Example:

```
os> source=accounts | stats stddev_samp(age);
fetched rows / total rows = 1/1
+--------------------+
| stddev_samp(age)   |
|--------------------|
| 3.304037933599835  |
+--------------------+
```

**STDDEV_POP**  
Use `STDDEV_POP(expr)` to return the population standard deviation of expr.

Example:

```
os> source=accounts | stats stddev_pop(age);
fetched rows / total rows = 1/1
+--------------------+
| stddev_pop(age)    |
|--------------------|
| 2.**************** |
+--------------------+
```

**TAKE**  
Use `TAKE(field [, size])` to return the original values of a field. It does not guarantee the order of the values.

**field**
+ Mandatory. 
+ The field must be a text field.

**size**
+ Optional integer. 
+ The number of values to return. 
+ Default is 10.

**Example**  


```
os> source=accounts | stats take(firstname);
fetched rows / total rows = 1/1
+-----------------------------+
| take(firstname)             |
|-----------------------------|
| [Jane, Mary, Nikki, Juan]   |
+-----------------------------+
```

**PERCENTILE or PERCENTILE_APPROX**  
Use `PERCENTILE(expr, percent)` or `PERCENTILE_APPROX(expr, percent)` to return the approximate percentile value of expr at the specified percentage.

**percent**
+ The number must be a constant between 0 and 100.

**Example**  


```
os> source=accounts | stats percentile(age, 90) by gender;
fetched rows / total rows = 2/2
+-----------------------+----------+
| percentile(age, 90)   | gender   |
|-----------------------+----------|
| 28                    | F        |
| 36                    | M        |
+-----------------------+----------+
```

**Example 1: Calculate the count of events**  
The example shows how to calculate the count of events in the accounts.

```
os> source=accounts | stats count();
fetched rows / total rows = 1/1
+-----------+
| count()   |
|-----------|
| 4         |
+-----------+
```

**Example 2: Calculate the average of a field**  
The example shows how to calculate the average age for all accounts.

```
os> source=accounts | stats avg(age);
fetched rows / total rows = 1/1
+------------+
| avg(age)   |
|------------|
| 32.25      |
+------------+
```

**Example 3: Calculate the average of a field by group**  
The example shows how to calculate the average age for all accounts, grouped by gender.

```
os> source=accounts | stats avg(age) by gender;
fetched rows / total rows = 2/2
+--------------------+----------+
| avg(age)           | gender   |
|--------------------+----------|
| 28.0               | F        |
| 33.666666666666664 | M        |
+--------------------+----------+
```

**Example 4: Calculate the average, sum, and count of a field by group**  
The example shows how to calculate the average age, sum age, and count of events for all the accounts, grouped by gender.

```
os> source=accounts | stats avg(age), sum(age), count() by gender;
fetched rows / total rows = 2/2
+--------------------+------------+-----------+----------+
| avg(age)           | sum(age)   | count()   | gender   |
|--------------------+------------+-----------+----------|
| 28.0               | 28         | 1         | F        |
| 33.666666666666664 | 101        | 3         | M        |
+--------------------+------------+-----------+----------+
```

**Example 5: Calculate the maximum of a field**  
The example calculates the maximum age for all accounts.

```
os> source=accounts | stats max(age);
fetched rows / total rows = 1/1
+------------+
| max(age)   |
|------------|
| 36         |
+------------+
```

**Example 6: Calculate the maximum and minimum of a field by group**  
The example calculates the maximum and minimum age values for all accounts, grouped by gender.

```
os> source=accounts | stats max(age), min(age) by gender;
fetched rows / total rows = 2/2
+------------+------------+----------+
| max(age)   | min(age)   | gender   |
|------------+------------+----------|
| 28         | 28         | F        |
| 36         | 32         | M        |
+------------+------------+----------+
```

**Example 7: Calculate the distinct count of a field**  
To get the count of distinct values of a field, you can use the `DISTINCT_COUNT` (or `DC`) function instead of `COUNT`. The example calculates both the count and the distinct count of gender field of all the accounts.

```
os> source=accounts | stats count(gender), distinct_count(gender);
fetched rows / total rows = 1/1
+-----------------+--------------------------+
| count(gender)   | distinct_count(gender)   |
|-----------------+--------------------------|
| 4               | 2                        |
+-----------------+--------------------------+
```
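For intuition, the difference between `COUNT` and `DISTINCT_COUNT` can be sketched in Python (illustrative only):

```python
# COUNT counts non-null occurrences; DISTINCT_COUNT counts unique non-null values
genders = ["M", "M", "F", "M"]
count_gender = sum(1 for g in genders if g is not None)
distinct_count_gender = len({g for g in genders if g is not None})
# count_gender = 4, distinct_count_gender = 2
```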

**Example 8: Calculate the count by a span**  
The example gets the count of age by the interval of 10 years.

```
os> source=accounts | stats count(age) by span(age, 10) as age_span
fetched rows / total rows = 2/2
+--------------+------------+
| count(age)   | age_span   |
|--------------+------------|
| 1            | 20         |
| 3            | 30         |
+--------------+------------+
```

**Example 9: Calculate the count by a gender and span**  
This example counts records grouped by gender and age spans of 5 years.

```
os> source=accounts | stats count() as cnt by span(age, 5) as age_span, gender
fetched rows / total rows = 3/3
+-------+------------+----------+
| cnt   | age_span   | gender   |
|-------+------------+----------|
| 1     | 25         | F        |
| 2     | 30         | M        |
| 1     | 35         | M        |
+-------+------------+----------+
```

The span expression always appears as the first grouping key, regardless of the order specified in the command.

```
os> source=accounts | stats count() as cnt by gender, span(age, 5) as age_span
fetched rows / total rows = 3/3
+-------+------------+----------+
| cnt   | age_span   | gender   |
|-------+------------+----------|
| 1     | 25         | F        |
| 2     | 30         | M        |
| 1     | 35         | M        |
+-------+------------+----------+
```

**Example 10: Calculate the count and get email list by a gender and span**  
This example counts records in 5-year age intervals, grouped by gender; additionally, for each row, it gets a list of at most 5 emails.

```
os> source=accounts | stats count() as cnt, take(email, 5) by span(age, 5) as age_span, gender
fetched rows / total rows = 3/3
+-------+----------------------------------------------------+------------+----------+
| cnt   | take(email, 5)                                     | age_span   | gender   |
|-------+----------------------------------------------------+------------+----------|
| 1     | []                                                 | 25         | F        |
| 2     | [janedoe@anycompany.com,juanli@examplecompany.org] | 30         | M        |
| 1     | [marymajor@examplecorp.com]                        | 35         | M        |
+-------+----------------------------------------------------+------------+----------+
```

**Example 11: Calculate the percentile of a field**  
This example calculates the 90th percentile of age across all accounts.

```
os> source=accounts | stats percentile(age, 90);
fetched rows / total rows = 1/1
+-----------------------+
| percentile(age, 90)   |
|-----------------------|
| 36                    |
+-----------------------+
```

**Example 12: Calculate the percentile of a field by group**  
This example calculates the 90th percentile of age across all accounts, grouped by gender.

```
os> source=accounts | stats percentile(age, 90) by gender;
fetched rows / total rows = 2/2
+-----------------------+----------+
| percentile(age, 90)   | gender   |
|-----------------------+----------|
| 28                    | F        |
| 36                    | M        |
+-----------------------+----------+
```

**Example 13: Calculate the percentile by a gender and span**  
This example calculates the 90th percentile of age in 10-year intervals, grouped by gender.

```
os> source=accounts | stats percentile(age, 90) as p90 by span(age, 10) as age_span, gender
fetched rows / total rows = 2/2
+-------+------------+----------+
| p90   | age_span   | gender   |
|-------+------------+----------|
| 28    | 20         | F        |
| 36    | 30         | M        |
+-------+------------+----------+
```

**Additional examples**  

+ `source = table | stats avg(a)`
+ `source = table | where a < 50 | stats avg(c)`
+ `source = table | stats max(c) by b`
+ `source = table | stats count(c) by b | head 5`
+ `source = table | stats distinct_count(c)`
+ `source = table | stats stddev_samp(c)`
+ `source = table | stats stddev_pop(c)`
+ `source = table | stats percentile(c, 90)`
+ `source = table | stats percentile_approx(c, 99)`

**Aggregations with span**  


+ `source = table | stats count(a) by span(a, 10) as a_span`
+ `source = table | stats sum(age) by span(age, 5) as age_span | head 2`
+ `source = table | stats avg(age) by span(age, 20) as age_span, country | sort - age_span | head 2`

**Aggregations with timewindow span (tumble windowing function)**  


+ `source = table | stats sum(productsAmount) by span(transactionDate, 1d) as age_date | sort age_date`
+ `source = table | stats sum(productsAmount) by span(transactionDate, 1w) as age_date, productId`

**Aggregations group by multiple levels**  


+ `source = table | stats avg(age) as avg_state_age by country, state | stats avg(avg_state_age) as avg_country_age by country`
+ `source = table | stats avg(age) as avg_city_age by country, state, city | eval new_avg_city_age = avg_city_age - 1 | stats avg(new_avg_city_age) as avg_state_age by country, state | where avg_state_age > 18 | stats avg(avg_state_age) as avg_adult_country_age by country`

#### subquery command
<a name="supported-ppl-subquery-commands"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `subquery` command to perform complex, nested queries within your Piped Processing Language (PPL) statements.

```
source=logs | where field in [ subquery source=events | where condition | fields field ]
```

In this example, the primary search (`source=logs`) is filtered by results from the subquery (`source=events`).

The subquery command supports multiple levels of nesting for complex data analysis.

**Nested Subquery Example**  


```
source=logs | where id in [ subquery source=users | where user in [ subquery source=actions | where action="login" | fields user] | fields uid ]  
```

**InSubquery Usage**
+ `source = outer | where a in [ source = inner | fields b ]`
+ `source = outer | where (a) in [ source = inner | fields b ]`
+ `source = outer | where (a,b,c) in [ source = inner | fields d,e,f ]`
+ `source = outer | where a not in [ source = inner | fields b ]`
+ `source = outer | where (a) not in [ source = inner | fields b ]`
+ `source = outer | where (a,b,c) not in [ source = inner | fields d,e,f ]`
+ `source = outer a in [ source = inner | fields b ]` (search filtering with subquery)
+ `source = outer a not in [ source = inner | fields b ]` (search filtering with subquery)
+ `source = outer | where a in [ source = inner1 | where b not in [ source = inner2 | fields c ] | fields b ]` (nested)
+ `source = table1 | inner join left = l right = r on l.a = r.a AND r.a in [ source = inner | fields d ] | fields l.a, r.a, b, c` (as join filter)

**SQL Migration Examples with IN-Subquery PPL**  
TPC-H Q4 (in-subquery with aggregation)

```
select
  o_orderpriority,
  count(*) as order_count
from
  orders
where
  o_orderdate >= date '1993-07-01'
  and o_orderdate < date '1993-07-01' + interval '3' month
  and o_orderkey in (
    select
      l_orderkey
    from
      lineitem
    where l_commitdate < l_receiptdate
  )
group by
  o_orderpriority
order by
  o_orderpriority
```

Rewritten by PPL InSubquery query:

```
source = orders
| where o_orderdate >= "1993-07-01" and o_orderdate < "1993-10-01" and o_orderkey IN
  [ source = lineitem
    | where l_commitdate < l_receiptdate
    | fields l_orderkey
  ]
| stats count(1) as order_count by o_orderpriority
| sort o_orderpriority
| fields o_orderpriority, order_count
```

TPC-H Q20 (nested in-subquery)

```
select
  s_name,
  s_address
from
  supplier,
  nation
where
  s_suppkey in (
    select
      ps_suppkey
    from
      partsupp
    where
      ps_partkey in (
        select
          p_partkey
        from
          part
        where
          p_name like 'forest%'
      )
  )
  and s_nationkey = n_nationkey
  and n_name = 'CANADA'
order by
  s_name
```

Rewritten by PPL InSubquery query:

```
source = supplier
| where s_suppkey IN [
    source = partsupp
    | where ps_partkey IN [
        source = part
        | where like(p_name, "forest%")
        | fields p_partkey
      ]
    | fields ps_suppkey
  ]
| inner join left=l right=r on s_nationkey = n_nationkey and n_name = 'CANADA'
  nation
| sort s_name
```

**ExistsSubquery usage**  
Assumptions: `a`, `b` are fields of table outer, `c`, `d` are fields of table inner, `e`, `f` are fields of table inner2.
+ `source = outer | where exists [ source = inner | where a = c ]`
+ `source = outer | where not exists [ source = inner | where a = c ]`
+ `source = outer | where exists [ source = inner | where a = c and b = d ]`
+ `source = outer | where not exists [ source = inner | where a = c and b = d ]`
+ `source = outer exists [ source = inner | where a = c ]` (search filtering with subquery)
+ `source = outer not exists [ source = inner | where a = c ]` (search filtering with subquery)
+ `source = table as t1 exists [ source = table as t2 | where t1.a = t2.a ]` (table alias is useful in exists subquery)
+ `source = outer | where exists [ source = inner1 | where a = c and exists [ source = inner2 | where c = e ] ]` (nested)
+ `source = outer | where exists [ source = inner1 | where a = c | where exists [ source = inner2 | where c = e ] ]` (nested)
+ `source = outer | where exists [ source = inner | where c > 10 ]` (uncorrelated exists)
+ `source = outer | where not exists [ source = inner | where c > 10 ]` (uncorrelated exists)
+ `source = outer | where exists [ source = inner ] | eval l = "nonEmpty" | fields l` (special uncorrelated exists)

**ScalarSubquery usage**  
Assumptions: `a`, `b` are fields of table outer, `c`, `d` are fields of table inner, `e`, `f` are fields of table nested

**Uncorrelated scalar subquery**  
In Select:
+ `source = outer | eval m = [ source = inner | stats max(c) ] | fields m, a`
+ `source = outer | eval m = [ source = inner | stats max(c) ] + b | fields m, a`

In Where:
+ `source = outer | where a > [ source = inner | stats min(c) ] | fields a`

In Search filter:
+ `source = outer a > [ source = inner | stats min(c) ] | fields a`

**Correlated scalar subquery**  
In Select:
+ `source = outer | eval m = [ source = inner | where outer.b = inner.d | stats max(c) ] | fields m, a`
+ `source = outer | eval m = [ source = inner | where b = d | stats max(c) ] | fields m, a`
+ `source = outer | eval m = [ source = inner | where outer.b > inner.d | stats max(c) ] | fields m, a`

In Where:
+ `source = outer | where a = [ source = inner | where outer.b = inner.d | stats max(c) ]`
+ `source = outer | where a = [ source = inner | where b = d | stats max(c) ]`
+ `source = outer | where [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a`

In Search filter:
+ `source = outer a = [ source = inner | where b = d | stats max(c) ]`
+ `source = outer [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a`

**Nested scalar subquery**  

+ `source = outer | where a = [ source = inner | stats max(c) | sort c ] OR b = [ source = inner | where c = 1 | stats min(d) | sort d ]`
+ `source = outer | where a = [ source = inner | where c = [ source = nested | stats max(e) by f | sort f ] | stats max(d) by c | sort c | head 1 ]`

**(Relation) Subquery**  
`InSubquery`, `ExistsSubquery`, and `ScalarSubquery` are all subquery expressions. By contrast, `RelationSubquery` is not a subquery expression; it is a subquery plan, commonly used in a Join or From clause.
+ `source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ]` (subquery in join right side)
+ `source = [ source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ] | stats count(a) by b ] as outer | head 1`

**Additional Context**  
`InSubquery`, `ExistsSubquery`, and `ScalarSubquery` are subquery expressions commonly used in `where` clauses and search filters.

Where command:

```
| where <boolean expression> | ...    
```

Search filter:

```
search source=* <boolean expression> | ...    
```

A subquery expression could be used in a boolean expression:

```
| where orders.order_id in [ source=returns | where return_reason="damaged" | fields order_id ]    
```

The `orders.order_id in [ source=... ]` is a `<boolean expression>`.

In general, we name this kind of subquery clause the `InSubquery` expression. It is a `<boolean expression>`.
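For intuition, the `InSubquery` expression behaves like a membership test against the values produced by the inner query. The following Python sketch illustrates those semantics only (the tables and fields are hypothetical), not how the query engine actually plans the subquery:

```python
# Keep outer rows whose key appears in the inner query's result set
orders = [{"order_id": 1}, {"order_id": 2}, {"order_id": 3}]
returns = [{"order_id": 2, "return_reason": "damaged"}]

# Inner query: source=returns | where return_reason="damaged" | fields order_id
damaged_ids = {r["order_id"] for r in returns if r["return_reason"] == "damaged"}

# Outer query: source=orders | where order_id in [ ... ]
filtered = [o for o in orders if o["order_id"] in damaged_ids]
# filtered contains only the order with order_id 2
```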

**Subquery with different join types**  
Example using a `ScalarSubquery`:

```
source=employees
| join source=sales on employees.employee_id = sales.employee_id
| where sales.sale_amount > [ source=targets | where target_met="true" | fields target_value ]
```

Unlike InSubquery, ExistsSubquery, and ScalarSubquery, a RelationSubquery is not a subquery expression. Instead, it's a subquery plan.

```
SEARCH source=customer
| FIELDS c_custkey
| LEFT OUTER JOIN left = c, right = o ON c.c_custkey = o.o_custkey
   [
      SEARCH source=orders
      | WHERE o_comment NOT LIKE '%unusual%packages%'
      | FIELDS o_orderkey, o_custkey
   ]
| STATS ...
```

#### top command
<a name="supported-ppl-top-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `top` command to find the most common tuple of values of all fields in the field list.

**Syntax**  
Use the following syntax:

```
top [N] <field-list> [by-clause]
top_approx [N] <field-list> [by-clause]
```

**N**
+ The number of results to return. 
+ Default: 10

**field-list**
+ Mandatory. 
+ A comma-delimited list of field names.

**by-clause**
+ Optional. 
+ One or more fields to group the results by.

**top_approx**
+ An approximate variant of `top` that counts the top N fields using cardinality estimated by the [HyperLogLog++ algorithm](https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html).
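For intuition, the exact `top` command behaves like a frequency count over the field list, returning the most frequent values first. A Python sketch of those semantics (illustrative only; `top_approx` instead estimates the counts):

```python
from collections import Counter

# top 2 gender: the two most common values, most frequent first
genders = ["M", "M", "F", "M"]
top_values = [value for value, _ in Counter(genders).most_common(2)]
# top_values: ['M', 'F']
```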

**Example 1: Find the most common values in a field**  
The example finds the most common gender for all accounts.

PPL query:

```
os> source=accounts | top gender;
os> source=accounts | top_approx gender;
fetched rows / total rows = 2/2
+----------+
| gender   |
|----------|
| M        |
| F        |
+----------+
```

**Example 2: Find the most common values in a field (limited to 1)**  
The example finds the single most common gender for all accounts.

PPL query:

```
os> source=accounts | top_approx 1 gender;
fetched rows / total rows = 1/1
+----------+
| gender   |
|----------|
| M        |
+----------+
```

**Example 3: Find the most common values, grouped by gender**  
The example finds the most common age for all accounts, grouped by gender.

PPL query:

```
os> source=accounts | top 1 age by gender;
os> source=accounts | top_approx 1 age by gender;
fetched rows / total rows = 2/2
+----------+-------+
| gender   | age   |
|----------+-------|
| F        | 28    |
| M        | 32    |
+----------+-------+
```

#### trendline command
<a name="supported-ppl-trendline-commands"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `trendline` command to calculate moving averages of fields.

**Syntax**  
Use the following syntax:

```
TRENDLINE [sort <[+|-] sort-field>] SMA(number-of-datapoints, field) [AS alias] [SMA(number-of-datapoints, field) [AS alias]]... 
```

**[+|-]**
+ Optional. 
+ The plus [+] stands for ascending order with NULL/MISSING values first.
+ The minus [-] stands for descending order with NULL/MISSING values last. 
+ Default: Ascending order with NULL/MISSING values first.

**sort-field**
+ Mandatory when sorting is used. 
+ The field used for sorting.

**number-of-datapoints**
+ Mandatory. 
+ The number of datapoints used to calculate the moving average.
+ Must be greater than zero.

**field**
+ Mandatory. 
+ The name of the field the moving average should be calculated for.

**alias**
+ Optional. 
+ The name of the resulting column containing the moving average.

Only the Simple Moving Average (SMA) type is supported. It is calculated like this:

```
f[i]: The value of field 'f' in the i-th data-point
n: The number of data-points in the moving window (period)
t: The current time index

SMA(t) = (1/n) * Σ(f[i]), where i = t-n+1 to t
```
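The formula above can be checked with a short Python sketch that reproduces the numbers from the temperature example that follows (illustrative only, not how OpenSearch computes the trendline):

```python
def sma(values, n):
    """Simple moving average over a window of n datapoints; None until the window fills."""
    out = []
    for t in range(len(values)):
        if t + 1 < n:
            out.append(None)  # not enough datapoints yet
        else:
            window = values[t - n + 1 : t + 1]
            out.append(sum(window) / n)
    return out

temperatures = [12, 12, 13, 14, 15]
trend = sma(temperatures, 2)
# trend: [None, 12.0, 12.5, 13.5, 14.5]
```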

**Example 1: Calculate simple moving average for a timeseries of temperatures**  
The example calculates the simple moving average over temperatures using two datapoints.

PPL query:

```
os> source=t | trendline sma(2, temperature) as temp_trend;
fetched rows / total rows = 5/5
+-----------+---------+--------------------+----------+
|temperature|device-id|           timestamp|temp_trend|
+-----------+---------+--------------------+----------+
|         12|     1492|2023-04-06 17:07:...|      NULL|
|         12|     1492|2023-04-06 17:07:...|      12.0|
|         13|      256|2023-04-06 17:07:...|      12.5|
|         14|      257|2023-04-06 17:07:...|      13.5|
|         15|      258|2023-04-06 17:07:...|      14.5|
+-----------+---------+--------------------+----------+
```

**Example 2: Calculate simple moving averages for a timeseries of temperatures with sorting**  
This example calculates two simple moving averages over temperatures, using two and three datapoints, sorted descending by device-id.

PPL query:

```
os> source=t | trendline sort - device-id sma(2, temperature) as temp_trend_2 sma(3, temperature) as temp_trend_3;
fetched rows / total rows = 5/5
+-----------+---------+--------------------+------------+------------------+
|temperature|device-id|           timestamp|temp_trend_2|      temp_trend_3|
+-----------+---------+--------------------+------------+------------------+
|         15|      258|2023-04-06 17:07:...|        NULL|              NULL|
|         14|      257|2023-04-06 17:07:...|        14.5|              NULL|
|         13|      256|2023-04-06 17:07:...|        13.5|              14.0|
|         12|     1492|2023-04-06 17:07:...|        12.5|              13.0|
|         12|     1492|2023-04-06 17:07:...|        12.0|12.333333333333334|
+-----------+---------+--------------------+------------+------------------+
```

#### where command
<a name="supported-ppl-where-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

The `where` command uses a boolean expression to filter search results, returning only the rows for which the expression evaluates to true.

**Syntax**  
Use the following syntax:

```
where <boolean-expression>    
```

**bool-expression**
+ Optional. 
+ Any expression that can be evaluated to a boolean value.

**Example 1: Filter result set with condition**  
The example shows how to fetch documents from the accounts index that meet specific conditions.

PPL query:

```
os> source=accounts | where account_number=1 or gender="F" | fields account_number, gender;
fetched rows / total rows = 2/2
+------------------+----------+
| account_number   | gender   |
|------------------+----------|
| 1                | M        |
| 13               | F        |
+------------------+----------+
```

**Additional examples**  


**Filters with logical conditions**
+ `source = table | where c = 'test' AND a = 1 | fields a,b,c`
+ `source = table | where c != 'test' OR a > 1 | fields a,b,c | head 1`
+ `source = table | where c = 'test' NOT a > 1 | fields a,b,c`
+ `source = table | where a = 1 | fields a,b,c`
+ `source = table | where a >= 1 | fields a,b,c`
+ `source = table | where a < 1 | fields a,b,c`
+ `source = table | where b != 'test' | fields a,b,c`
+ `source = table | where c = 'test' | fields a,b,c | head 3`
+ `source = table | where ispresent(b)`
+ `source = table | where isnull(coalesce(a, b)) | fields a,b,c | head 3`
+ `source = table | where isempty(a)`
+ `source = table | where isblank(a)`
+ `source = table | where case(length(a) > 6, 'True' else 'False') = 'True'`
+ `source = table | where a between 1 and 4` - Note: This returns a >= 1 and a <= 4, i.e. [1, 4]
+ `source = table | where b not between '2024-09-10' and '2025-09-10'` - Note: This excludes rows where b >= '2024-09-10' and b <= '2025-09-10'
+ `source = table | where cidrmatch(ip, '***********/24')`
+ `source = table | where cidrmatch(ipv6, '2003:db8::/32')`

```
source = table | eval status_category =
    case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
    else 'Incorrect HTTP status code')
    | where case(a >= 200 AND a < 300, 'Success',
    a >= 300 AND a < 400, 'Redirection',
    a >= 400 AND a < 500, 'Client Error',
    a >= 500, 'Server Error'
    else 'Incorrect HTTP status code'
    ) = 'Incorrect HTTP status code'
```

```
source = table
    | eval factor = case(a > 15, a - 14, isnull(b), a - 7, a < 3, a + 1 else 1)
    | where case(factor = 2, 'even', factor = 4, 'even', factor = 6, 'even', factor = 8, 'even' else 'odd') = 'even'
    |  stats count() by factor
```
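The `between` and `cidrmatch` predicates shown above can be modeled in plain Python (a behavioral sketch, not the PPL engine): `between` is inclusive on both ends, and `cidrmatch` is a network-membership test, which the stdlib `ipaddress` module handles for both IPv4 and IPv6.

```python
import ipaddress

def between(value, low, high):
    # PPL `between` is inclusive on both ends: [low, high]
    return low <= value <= high

def cidrmatch(ip, cidr):
    # membership test works the same way for IPv4 and IPv6
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

print(between(4, 1, 4))                           # True: upper bound included
print(cidrmatch('2003:db8::1', '2003:db8::/32'))  # True
```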

#### fieldsummary command
<a name="supported-ppl-field-summary-command"></a>

**Note**  
To see which AWS data source integrations support this PPL command, see [Commands](#supported-ppl-commands).

Use the `fieldsummary` command to calculate basic statistics for each field (count, distinct count, min, max, avg, stddev, mean) and determine the data type of each field. This command can be used with any preceding pipe and will take them into account.

**Syntax**  
Use the following syntax. For CloudWatch Logs use cases, only one field in a query is supported.

```
... | fieldsummary includefields=<field-list> (nulls=true/false)
```

**includefields**
+ List of all the columns to be collected with statistics into a unified result set.

**Nulls**
+ Optional. 
+  If set to true, include null values in the aggregation calculations (replace null with zero for numeric values).

**Example 1**  
PPL query:

```
os> source = t | where status_code != 200 | fieldsummary includefields= status_code nulls=true
+---------------+-------+----------------+-----+-----+-------+-------+-------------------+-------+--------+
| Fields        | COUNT | COUNT_DISTINCT | MIN | MAX | AVG   | MEAN  | STDDEV            | NULLS | TYPEOF |
|---------------+-------+----------------+-----+-----+-------+-------+-------------------+-------+--------|
| "status_code" | 2     | 2              | 301 | 403 | 352.0 | 352.0 | 72.12489168102785 | 0     | "int"  |
+---------------+-------+----------------+-----+-----+-------+-------+-------------------+-------+--------+
```

**Example 2**  
PPL query:

```
os> source = t | fieldsummary includefields= id, status_code, request_path nulls=true
+----------------+-------+----------------+--------+-------+-------+-------+--------------------+-------+----------+
| Fields         | COUNT | COUNT_DISTINCT | MIN    | MAX   | AVG   | MEAN  | STDDEV             | NULLS | TYPEOF   |
|----------------+-------+----------------+--------+-------+-------+-------+--------------------+-------+----------|
| "id"           | 6     | 6              | 1      | 6     | 3.5   | 3.5   | 1.8708286933869707 | 0     | "int"    |
| "status_code"  | 4     | 3              | 200    | 403   | 184.0 | 184.0 | 161.16699413961905 | 2     | "int"    |
| "request_path" | 2     | 2              | /about | /home | 0.0   | 0.0   | 0                  | 2     | "string" |
+----------------+-------+----------------+--------+-------+-------+-------+--------------------+-------+----------+
```
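The statistics that `fieldsummary` reports can be reproduced for a single numeric column with Python's stdlib `statistics` module (a sketch; the column values come from Example 1, and PPL's STDDEV corresponds to the sample standard deviation, `statistics.stdev`):

```python
import statistics

def field_summary(values):
    """Mimic fieldsummary for one numeric column (nulls already dropped)."""
    return {
        "COUNT": len(values),
        "COUNT_DISTINCT": len(set(values)),
        "MIN": min(values),
        "MAX": max(values),
        "AVG": statistics.mean(values),
        "STDDEV": statistics.stdev(values),  # sample standard deviation
    }

status_codes = [301, 403]  # the non-200 rows from Example 1
print(field_summary(status_codes))
```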

#### expand command
<a name="supported-ppl-expand-command"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

Use the `expand` command to flatten a field of type Array<Any> or Map<Any>, producing individual rows for each element or key-value pair.

**Syntax**  
Use the following syntax:

```
expand <field> [as alias]
```

**field**
+ The field to be expanded (exploded). 
+ The field must be of a supported type.

**alias**
+ Optional.
+ The name to be used instead of the original field name.

**Usage guidelines**  
The expand command produces a row for each element in the specified array or map field, where:
+ Array elements become individual rows. 
+ Map key-value pairs are broken into separate rows, with each key-value represented as a row. 
+ When an alias is provided, the exploded values are represented under the alias instead of the original field name. 

You can use this command in combination with other commands, such as stats, eval, and parse, to manipulate or extract data post-expansion.

**Examples**
+ `source = table | expand employee | stats max(salary) as max by state, company `
+ `source = table | expand employee as worker | stats max(salary) as max by state, company `
+ `source = table | expand employee as worker | eval bonus = salary * 3 | fields worker, bonus` 
+ `source = table | expand employee | parse description '(?<email>.+@.+)' | fields employee, email` 
+ `source = table | eval array=json_array(1, 2, 3) | expand array as uid | fields name, occupation, uid `
+ `source = table | expand multi_valueA as multiA | expand multi_valueB as multiB` 

Using multiple expand commands creates a Cartesian product of all the internal elements within each composite array or map.

**Effective SQL push-down query**  
The expand command is translated into an equivalent SQL operation using LATERAL VIEW explode, allowing for efficient exploding of arrays or maps at the SQL query level.

```
SELECT customer, exploded_productId
FROM table
LATERAL VIEW explode(productId) AS exploded_productId
```

The expand command offers the following functionality: 
+ It is a column operation that returns a new column. 
+ It creates a new row for every element in the exploded column. 
+ Internal nulls are ignored as part of the exploded field (no row is created/exploded for null).
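A sketch of these semantics in Python (hypothetical rows and field names; internal nulls are skipped, and chaining two expands yields the Cartesian product):

```python
def expand(rows, field, alias=None):
    """One output row per element of rows[field]; None elements are skipped."""
    out_field = alias or field
    out = []
    for row in rows:
        for element in row.get(field) or []:
            if element is None:
                continue  # no row is created for internal nulls
            new_row = {k: v for k, v in row.items() if k != field}
            new_row[out_field] = element
            out.append(new_row)
    return out

rows = [{"name": "n1", "a": [1, None, 2], "b": ["x", "y"]}]
step1 = expand(rows, "a", alias="uid")  # 2 rows: the null element is skipped
step2 = expand(step1, "b")              # 4 rows: Cartesian product of a and b
print(len(step1), len(step2))
```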

#### PPL functions
<a name="supported-ppl-functions-details"></a>

**Topics**
+ [PPL condition functions](#supported-ppl-condition-functions)
+ [PPL cryptographic hash functions](#supported-ppl-cryptographic-functions)
+ [PPL date and time functions](#supported-ppl-date-time-functions)
+ [PPL expressions](#supported-ppl-expressions)
+ [PPL IP address functions](#supported-ppl-ip-address-functions)
+ [PPL JSON functions](#supported-ppl-json-functions)
+ [PPL Lambda functions](#supported-ppl-lambda-functions)
+ [PPL mathematical functions](#supported-ppl-math-functions)
+ [PPL string functions](#supported-ppl-string-functions)
+ [PPL type conversion functions](#supported-ppl-type-conversion-functions)

##### PPL condition functions
<a name="supported-ppl-condition-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### ISNULL
<a name="supported-ppl-condition-functions-isnull"></a>

**Description**: `isnull(field)` returns true if the field is null.

**Argument type:**
+ All supported data types.

**Return type:**
+ BOOLEAN

**Example**:

```
os> source=accounts | eval result = isnull(employer) | fields result, employer, firstname
fetched rows / total rows = 4/4
+----------+-------------+-------------+
| result   | employer    | firstname   |
|----------+-------------+-------------|
| False    | AnyCompany  | Mary        |
| False    | ExampleCorp | Jane        |
| False    | ExampleOrg  | Nikki       |
| True     | null        | Juan        |
+----------+-------------+-------------+
```

##### ISNOTNULL
<a name="supported-ppl-condition-functions-isnotnull"></a>

**Description**: `isnotnull(field)` returns true if the field is not null.

**Argument type:**
+ All supported data types.

**Return type:**
+ BOOLEAN

**Example**:

```
os> source=accounts | where not isnotnull(employer) | fields account_number, employer
fetched rows / total rows = 1/1
+------------------+------------+
| account_number   | employer   |
|------------------+------------|
| 18               | null       |
+------------------+------------+
```

##### EXISTS
<a name="supported-ppl-condition-functions-exists"></a>

**Description**: `exists(field)` returns true if the field has a value. Because OpenSearch doesn't differentiate between null and missing fields, this behaves like `isnotnull(field)`.

**Example**:

```
os> source=accounts | where exists(email) | fields account_number, email
fetched rows / total rows = 1/1
```

##### IFNULL
<a name="supported-ppl-condition-functions-ifnull"></a>

**Description**: `ifnull(field1, field2)` returns `field2` if `field1` is null.

**Argument type:**
+ All supported data types. 
+ If the two parameters have different types, the function will fail the semantic check.

**Return type:**
+ Any

**Example**:

```
os> source=accounts | eval result = ifnull(employer, 'default') | fields result, employer, firstname
fetched rows / total rows = 4/4
+-------------+-------------+-------------+
| result      | employer    | firstname   |
|-------------+-------------+-------------|
| AnyCompany  | AnyCompany  | Mary        |
| ExampleCorp | ExampleCorp | Jane        |
| ExampleOrg  | ExampleOrg  | Nikki       |
| default     | null        | Juan        |
+-------------+-------------+-------------+
```

##### NULLIF
<a name="supported-ppl-condition-functions-nullif"></a>

**Description**: `nullif(field1, field2)` returns null if the two parameters are the same; otherwise, it returns `field1`.

**Argument type:**
+ All supported data types. 
+ If the two parameters have different types, the function will fail the semantic check.

**Return type:**
+ Any

**Example**:

```
os> source=accounts | eval result = nullif(employer, 'AnyCompany') | fields result, employer, firstname
fetched rows / total rows = 4/4
+----------------+----------------+-------------+
| result         | employer       | firstname   |
|----------------+----------------+-------------|
| null           | AnyCompany     | Mary        |
| ExampleCorp    | ExampleCorp    | Jane        |
| ExampleOrg     | ExampleOrg     | Nikki       |
| null           | null           | Juan        |
+----------------+----------------+-------------+
```

##### IF
<a name="supported-ppl-condition-functions-if"></a>

**Description**: `if(condition, expr1, expr2)` returns `expr1` if the condition is true, otherwise it returns `expr2`.

**Argument type:**
+ All supported data types. 
+ If the two parameters have different types, the function will fail the semantic check.

**Return type:**
+ Any

**Example**:

```
os> source=accounts | eval result = if(true, firstname, lastname) | fields result, firstname, lastname
fetched rows / total rows = 4/4
+----------+-------------+------------+
| result   | firstname   | lastname   |
|----------+-------------+------------|
| Jane     | Jane        | Doe        |
| Mary     | Mary        | Major      |
| Pat      | Pat         | Candella   |
| Jorge    | Jorge       | Souza      |
+----------+-------------+------------+

os> source=accounts | eval result = if(false, firstname, lastname) | fields result, firstname, lastname
fetched rows / total rows = 4/4
+----------+-------------+------------+
| result   | firstname   | lastname   |
|----------+-------------+------------|
| Doe      | Jane        | Doe        |
| Major    | Mary        | Major      |
| Candella | Pat         | Candella   |
| Souza    | Jorge       | Souza      |
+----------+-------------+------------+

os> source=accounts | eval is_vip = if(age > 30 AND isnotnull(employer), true, false) | fields is_vip, firstname, lastname
fetched rows / total rows = 4/4
+----------+-------------+------------+
| is_vip   | firstname   | lastname   |
|----------+-------------+------------|
| True     | Jane        | Doe        |
| True     | Mary        | Major      |
| False    | Pat         | Candella   |
| False    | Jorge       | Souza      |
+----------+-------------+------------+
```
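The null-handling functions in this section map naturally onto Python's `None` (a behavioral sketch, not the PPL implementation; `if_` is named with a trailing underscore to avoid the Python keyword):

```python
def isnull(x):
    return x is None

def ifnull(a, b):
    # return b when a is null, otherwise a
    return b if a is None else a

def nullif(a, b):
    # null when the two arguments are equal, otherwise the first
    return None if a == b else a

def if_(cond, expr1, expr2):
    return expr1 if cond else expr2

print(ifnull(None, "default"))              # default
print(nullif("AnyCompany", "AnyCompany"))   # None
print(if_(False, "firstname", "lastname"))  # lastname
```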

##### PPL cryptographic hash functions
<a name="supported-ppl-cryptographic-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### MD5
<a name="supported-ppl-cryptographic-functions-md5"></a>

MD5 calculates the MD5 digest and returns the value as a 32 character hex string.

**Usage**: `md5('hello')`

**Argument type:**
+ STRING

**Return type:**
+ STRING

**Example:**

```
os> source=people | eval `MD5('hello')` = MD5('hello') | fields `MD5('hello')`
fetched rows / total rows = 1/1
+----------------------------------+
| MD5('hello')                     |
|----------------------------------|
| <32 character hex string>        |
+----------------------------------+
```

##### SHA1
<a name="supported-ppl-cryptographic-functions-sha1"></a>

SHA1 returns the hex string result of SHA-1.

**Usage**: `sha1('hello')`

**Argument type:**
+ STRING

**Return type:**
+ STRING

**Example:**

```
os> source=people | eval `SHA1('hello')` = SHA1('hello') | fields `SHA1('hello')`
fetched rows / total rows = 1/1
+------------------------------------------+
| SHA1('hello')                            |
|------------------------------------------|
| <40-character SHA-1 hash result>         |
+------------------------------------------+
```

##### SHA2
<a name="supported-ppl-cryptographic-functions-sha2"></a>

SHA2 returns the hex string result of the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). The numBits argument indicates the desired bit length of the result, and must be 224, 256, 384, or 512.

**Usage:**
+ `sha2('hello',256)`
+ `sha2('hello',512)`

**Argument type:**
+ STRING, INTEGER

**Return type:**
+ STRING

**Example:**

```
os> source=people | eval `SHA2('hello',256)` = SHA2('hello',256) | fields `SHA2('hello',256)`
fetched rows / total rows = 1/1
+------------------------------------------------------------------+
| SHA2('hello',256)                                                |
|------------------------------------------------------------------|
| <64-character SHA-256 hash result>                               |
+------------------------------------------------------------------+

os> source=people | eval `SHA2('hello',512)` = SHA2('hello',512) | fields `SHA2('hello',512)`
fetched rows / total rows = 1/1
+------------------------------------------------------------------+
| SHA2('hello',512)                                                |
|------------------------------------------------------------------|
| <128-character SHA-512 hash result>                              |
+------------------------------------------------------------------+
```
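These digests can be cross-checked with Python's stdlib `hashlib`, which implements the same algorithms (a sketch; `num_bits` selects the SHA-2 variant, mirroring the numBits argument above):

```python
import hashlib

def sha2(text, num_bits):
    # num_bits must be 224, 256, 384, or 512
    algo = {224: hashlib.sha224, 256: hashlib.sha256,
            384: hashlib.sha384, 512: hashlib.sha512}[num_bits]
    return algo(text.encode()).hexdigest()

print(hashlib.md5(b"hello").hexdigest())   # 32 hex characters
print(hashlib.sha1(b"hello").hexdigest())  # 40 hex characters
print(len(sha2("hello", 256)), len(sha2("hello", 512)))  # 64 128
```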

##### PPL date and time functions
<a name="supported-ppl-date-time-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### `DAY`
<a name="supported-ppl-date-time-functions-day"></a>

**Usage**: `DAY(date)` extracts the day of the month for a date, in the range 1 to 31.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `DAYOFMONTH`, `DAY_OF_MONTH`

**Example**:

```
os> source=people | eval `DAY(DATE('2020-08-26'))` = DAY(DATE('2020-08-26')) | fields `DAY(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------+
| DAY(DATE('2020-08-26'))   |
|---------------------------|
| 26                        |
+---------------------------+
```

##### `DAYOFMONTH`
<a name="supported-ppl-date-time-functions-dayofmonth"></a>

**Usage**: `DAYOFMONTH(date)` extracts the day of the month for a date, in the range 1 to 31.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `DAY`, `DAY_OF_MONTH`

**Example**:

```
os> source=people | eval `DAYOFMONTH(DATE('2020-08-26'))` = DAYOFMONTH(DATE('2020-08-26')) | fields `DAYOFMONTH(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+----------------------------------+
| DAYOFMONTH(DATE('2020-08-26'))   |
|----------------------------------|
| 26                               |
+----------------------------------+
```

##### `DAY_OF_MONTH`
<a name="supported-ppl-date-time-functions-day-of-month"></a>

**Usage**: `DAY_OF_MONTH(DATE)` extracts the day of the month for a date, in the range 1 to 31.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `DAY`, `DAYOFMONTH`

**Example**:

```
os> source=people | eval `DAY_OF_MONTH(DATE('2020-08-26'))` = DAY_OF_MONTH(DATE('2020-08-26')) | fields `DAY_OF_MONTH(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+------------------------------------+
| DAY_OF_MONTH(DATE('2020-08-26'))   |
|------------------------------------|
| 26                                 |
+------------------------------------+
```

##### `DAYOFWEEK`
<a name="supported-ppl-date-time-functions-dayofweek"></a>

**Usage**: `DAYOFWEEK(DATE)` returns the weekday index for a date (1 = Sunday, 2 = Monday, ..., 7 = Saturday).

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `DAY_OF_WEEK`

**Example**:

```
os> source=people | eval `DAYOFWEEK(DATE('2020-08-26'))` = DAYOFWEEK(DATE('2020-08-26')) | fields `DAYOFWEEK(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------------+
| DAYOFWEEK(DATE('2020-08-26'))   |
|---------------------------------|
| 4                               |
+---------------------------------+
```

##### `DAY_OF_WEEK`
<a name="supported-ppl-date-time-functions-day-of-week"></a>

**Usage**: `DAY_OF_WEEK(DATE)` returns the weekday index for a date (1 = Sunday, 2 = Monday, ..., 7 = Saturday).

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `DAYOFWEEK`

**Example**:

```
os> source=people | eval `DAY_OF_WEEK(DATE('2020-08-26'))` = DAY_OF_WEEK(DATE('2020-08-26')) | fields `DAY_OF_WEEK(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-----------------------------------+
| DAY_OF_WEEK(DATE('2020-08-26'))   |
|-----------------------------------|
| 4                                 |
+-----------------------------------+
```

##### `DAYOFYEAR`
<a name="supported-ppl-date-time-functions-dayofyear"></a>

**Usage**: `DAYOFYEAR(DATE)` returns the day of the year for a date, in the range 1 to 366.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `DAY_OF_YEAR`

**Example**:

```
os> source=people | eval `DAYOFYEAR(DATE('2020-08-26'))` = DAYOFYEAR(DATE('2020-08-26')) | fields `DAYOFYEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------------+
| DAYOFYEAR(DATE('2020-08-26'))   |
|---------------------------------|
| 239                             |
+---------------------------------+
```

##### `DAY_OF_YEAR`
<a name="supported-ppl-date-time-functions-day-of-year"></a>

**Usage**: `DAY_OF_YEAR(DATE)` returns the day of the year for a date, in the range 1 to 366.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `DAYOFYEAR`

**Example**:

```
os> source=people | eval `DAY_OF_YEAR(DATE('2020-08-26'))` = DAY_OF_YEAR(DATE('2020-08-26')) | fields `DAY_OF_YEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-----------------------------------+
| DAY_OF_YEAR(DATE('2020-08-26'))   |
|-----------------------------------|
| 239                               |
+-----------------------------------+
```

##### `DAYNAME`
<a name="supported-ppl-date-time-functions-dayname"></a>

**Usage**: `DAYNAME(DATE)` returns the name of the weekday for a date, including Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: STRING

**Example**:

```
os> source=people | eval `DAYNAME(DATE('2020-08-26'))` = DAYNAME(DATE('2020-08-26')) | fields `DAYNAME(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-------------------------------+
| DAYNAME(DATE('2020-08-26'))   |
|-------------------------------|
| Wednesday                     |
+-------------------------------+
```
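The day-extraction functions above can be cross-checked with Python's stdlib `datetime` (a sketch; note that PPL's `DAYOFWEEK` uses 1 = Sunday while `date.isoweekday()` uses 1 = Monday, hence the remapping):

```python
from datetime import date

d = date(2020, 8, 26)
day = d.day                           # DAY / DAYOFMONTH -> 26
day_of_week = d.isoweekday() % 7 + 1  # DAYOFWEEK, 1 = Sunday -> 4
day_of_year = d.timetuple().tm_yday   # DAYOFYEAR -> 239
day_name = d.strftime("%A")           # DAYNAME -> Wednesday
print(day, day_of_week, day_of_year, day_name)
```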

##### `FROM_UNIXTIME`
<a name="supported-ppl-date-time-functions-from-unixtime"></a>

**Usage**: `FROM_UNIXTIME` returns a representation of the argument given as a timestamp or character string value. This function performs a reverse conversion of the `UNIX_TIMESTAMP` function. 

If you provide a second argument, `FROM_UNIXTIME` uses it to format the result similar to the `DATE_FORMAT` function. 

If the timestamp is outside of the range 1970-01-01 00:00:00 to 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 epoch time), the function returns `NULL`.

**Argument type**: DOUBLE, STRING

**Return type map**:

DOUBLE -> TIMESTAMP

DOUBLE, STRING -> STRING

**Examples**:

```
os> source=people | eval `FROM_UNIXTIME(1220249547)` = FROM_UNIXTIME(1220249547) | fields `FROM_UNIXTIME(1220249547)`
fetched rows / total rows = 1/1
+-----------------------------+
| FROM_UNIXTIME(1220249547)   |
|-----------------------------|
| 2008-09-01 06:12:27         |
+-----------------------------+

os> source=people | eval `FROM_UNIXTIME(1220249547, 'HH:mm:ss')` = FROM_UNIXTIME(1220249547, 'HH:mm:ss') | fields `FROM_UNIXTIME(1220249547, 'HH:mm:ss')`
fetched rows / total rows = 1/1
+-----------------------------------------+
| FROM_UNIXTIME(1220249547, 'HH:mm:ss')   |
|-----------------------------------------|
| 06:12:27                                |
+-----------------------------------------+
```
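The same conversion can be sketched with Python's `datetime` (using UTC here; the PPL result depends on the cluster time zone, so the expected output assumes UTC):

```python
from datetime import datetime, timezone

ts = datetime.fromtimestamp(1220249547, tz=timezone.utc)
print(ts.strftime("%Y-%m-%d %H:%M:%S"))  # 2008-09-01 06:12:27
print(ts.strftime("%H:%M:%S"))           # 06:12:27
```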

##### `HOUR`
<a name="supported-ppl-date-time-functions-hour"></a>

**Usage**: `HOUR(TIME)` extracts the hour value for time. 

Unlike a standard time of day, the time value in this function can have a range larger than 23. As a result, the return value of `HOUR(TIME)` can be greater than 23.

**Argument type**: STRING/TIME/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `HOUR_OF_DAY`

**Example**:

```
os> source=people | eval `HOUR(TIME('01:02:03'))` = HOUR(TIME('01:02:03')) | fields `HOUR(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+--------------------------+
| HOUR(TIME('01:02:03'))   |
|--------------------------|
| 1                        |
+--------------------------+
```

##### `HOUR_OF_DAY`
<a name="supported-ppl-date-time-functions-hour-of-day"></a>

**Usage**: `HOUR_OF_DAY(TIME)` extracts the hour value from the given time. 

Unlike a standard time of day, the time value in this function can have a range larger than 23. As a result, the return value of `HOUR_OF_DAY(TIME)` can be greater than 23.

**Argument type**: STRING/TIME/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `HOUR`

**Example**:

```
os> source=people | eval `HOUR_OF_DAY(TIME('01:02:03'))` = HOUR_OF_DAY(TIME('01:02:03')) | fields `HOUR_OF_DAY(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+---------------------------------+
| HOUR_OF_DAY(TIME('01:02:03'))   |
|---------------------------------|
| 1                               |
+---------------------------------+
```

##### `LAST_DAY`
<a name="supported-ppl-date-time-functions-last-day"></a>

**Usage**: `LAST_DAY` returns the last day of the month as a DATE value for the given date argument.

**Argument type**: DATE/STRING/TIMESTAMP/TIME

**Return type**: DATE

**Example**:

```
os> source=people | eval `last_day('2023-02-06')` = last_day('2023-02-06') | fields `last_day('2023-02-06')`
fetched rows / total rows = 1/1
+--------------------------+
| last_day('2023-02-06')   |
|--------------------------|
| 2023-02-28               |
+--------------------------+
```
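The same result can be computed with the stdlib `calendar` module (a sketch with a hypothetical helper named `last_day`):

```python
import calendar
from datetime import date

def last_day(d):
    # monthrange returns (weekday of first day, number of days in month)
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])

print(last_day(date(2023, 2, 6)))  # 2023-02-28
```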

##### `LOCALTIMESTAMP`
<a name="supported-ppl-date-time-functions-localtimestamp"></a>

**Usage**: `LOCALTIMESTAMP()` is a synonym for `NOW()`.

**Example**:

```
> source=people | eval `LOCALTIMESTAMP()` = LOCALTIMESTAMP() | fields `LOCALTIMESTAMP()`
fetched rows / total rows = 1/1
+---------------------+
| LOCALTIMESTAMP()    |
|---------------------|
| 2022-08-02 15:54:19 |
+---------------------+
```

##### `LOCALTIME`
<a name="supported-ppl-date-time-functions-localtime"></a>

**Usage**: `LOCALTIME()` is a synonym for `NOW()`.

**Example**:

```
> source=people | eval `LOCALTIME()` = LOCALTIME() | fields `LOCALTIME()`
fetched rows / total rows = 1/1
+---------------------+
| LOCALTIME()         |
|---------------------|
| 2022-08-02 15:54:19 |
+---------------------+
```

##### `MAKE_DATE`
<a name="supported-ppl-date-time-functions-make-date"></a>

**Usage**: `MAKE_DATE` returns a date value based on the given year, month, and day values. All arguments are rounded to integers.

**Specification**: MAKE_DATE(INTEGER, INTEGER, INTEGER) -> DATE

**Argument type**: INTEGER, INTEGER, INTEGER

**Return type**: DATE

**Example**:

```
os> source=people | eval `MAKE_DATE(1945, 5, 9)` = MAKE_DATE(1945, 5, 9) | fields `MAKE_DATE(1945, 5, 9)`
fetched rows / total rows = 1/1
+-------------------------+
| MAKE_DATE(1945, 5, 9)   |
|-------------------------|
| 1945-05-09              |
+-------------------------+
```

##### `MINUTE`
<a name="supported-ppl-date-time-functions-minute"></a>

**Usage**: `MINUTE(TIME)` returns the minute component of the given time, as an integer in the range 0 to 59.

**Argument type**: STRING/TIME/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `MINUTE_OF_HOUR`

**Example**:

```
os> source=people | eval `MINUTE(TIME('01:02:03'))` =  MINUTE(TIME('01:02:03')) | fields `MINUTE(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+----------------------------+
| MINUTE(TIME('01:02:03'))   |
|----------------------------|
| 2                          |
+----------------------------+
```

##### `MINUTE_OF_HOUR`
<a name="supported-ppl-date-time-functions-minute-of-hour"></a>

**Usage**: `MINUTE_OF_HOUR(TIME)` returns the minute component of the given time, as an integer in the range 0 to 59.

**Argument type**: STRING/TIME/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `MINUTE`

**Example**:

```
os> source=people | eval `MINUTE_OF_HOUR(TIME('01:02:03'))` =  MINUTE_OF_HOUR(TIME('01:02:03')) | fields `MINUTE_OF_HOUR(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+------------------------------------+
| MINUTE_OF_HOUR(TIME('01:02:03'))   |
|------------------------------------|
| 2                                  |
+------------------------------------+
```

##### `MONTH`
<a name="supported-ppl-date-time-functions-month"></a>

**Usage**: `MONTH(DATE)` returns the month of the given date as an integer, in the range 1 to 12 (where 1 represents January and 12 represents December).

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `MONTH_OF_YEAR`

**Example**:

```
os> source=people | eval `MONTH(DATE('2020-08-26'))` =  MONTH(DATE('2020-08-26')) | fields `MONTH(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-----------------------------+
| MONTH(DATE('2020-08-26'))   |
|-----------------------------|
| 8                           |
+-----------------------------+
```

##### `MONTHNAME`
<a name="supported-ppl-date-time-functions-monthname"></a>

**Usage**: `MONTHNAME(DATE)` returns the name of the month for the given date, from January through December.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: STRING

**Example**:

```
os> source=people | eval `MONTHNAME(DATE('2020-08-26'))` = MONTHNAME(DATE('2020-08-26')) | fields `MONTHNAME(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+---------------------------------+
| MONTHNAME(DATE('2020-08-26'))   |
|---------------------------------|
| August                          |
+---------------------------------+
```

##### `MONTH_OF_YEAR`
<a name="supported-ppl-date-time-functions-month-of-year"></a>

**Usage**: `MONTH_OF_YEAR(DATE)`returns the month of the given date as an integer, in the range 1 to 12 (where 1 represents January and 12 represents December).

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `MONTH`

**Example**:

```
os> source=people | eval `MONTH_OF_YEAR(DATE('2020-08-26'))` =  MONTH_OF_YEAR(DATE('2020-08-26')) | fields `MONTH_OF_YEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-------------------------------------+
| MONTH_OF_YEAR(DATE('2020-08-26'))   |
|-------------------------------------|
| 8                                   |
+-------------------------------------+
```

##### `NOW`
<a name="supported-ppl-date-time-functions-now"></a>

**Usage**: `NOW` returns the current date and time as a `TIMESTAMP` value in the 'YYYY-MM-DD hh:mm:ss' format. The value is expressed in the cluster time zone. 

**Note**  
`NOW()` returns a constant time that indicates when the statement began to execute. This differs from `SYSDATE()`, which returns the exact time of execution.

**Return type**: TIMESTAMP

**Specification**: NOW() -> TIMESTAMP

**Example**:

```
os> source=people | eval `value_1` = NOW(), `value_2` = NOW() | fields `value_1`, `value_2`
fetched rows / total rows = 1/1
+---------------------+---------------------+
| value_1             | value_2             |
|---------------------+---------------------|
| 2022-08-02 15:39:05 | 2022-08-02 15:39:05 |
+---------------------+---------------------+
```

##### `QUARTER`
<a name="supported-ppl-date-time-functions-quarter"></a>

**Usage**: `QUARTER(DATE)` returns the quarter of the year for the given date as an integer, in the range 1 to 4.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Example**:

```
os> source=people | eval `QUARTER(DATE('2020-08-26'))` = QUARTER(DATE('2020-08-26')) | fields `QUARTER(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+-------------------------------+
| QUARTER(DATE('2020-08-26'))   |
|-------------------------------|
| 3                             |
+-------------------------------+
```
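The quarter computation is simple integer arithmetic over the month number (a sketch with a hypothetical helper named `quarter`):

```python
def quarter(month):
    # months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
    return (month - 1) // 3 + 1

print(quarter(8))  # 3, matching QUARTER(DATE('2020-08-26'))
```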

##### `SECOND`
<a name="supported-ppl-date-time-functions-second"></a>

**Usage**: `SECOND(TIME)` returns the second component of the given time as an integer, in the range 0 to 59.

**Argument type**: STRING/TIME/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `SECOND_OF_MINUTE`

**Example**:

```
os> source=people | eval `SECOND(TIME('01:02:03'))` = SECOND(TIME('01:02:03')) | fields `SECOND(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+----------------------------+
| SECOND(TIME('01:02:03'))   |
|----------------------------|
| 3                          |
+----------------------------+
```

##### `SECOND_OF_MINUTE`
<a name="supported-ppl-date-time-functions-second-of-minute"></a>

**Usage**: `SECOND_OF_MINUTE(TIME)` returns the second component of the given time as an integer, in the range 0 to 59.

**Argument type**: STRING/TIME/TIMESTAMP

**Return type**: INTEGER

**Synonyms**: `SECOND`

**Example**:

```
os> source=people | eval `SECOND_OF_MINUTE(TIME('01:02:03'))` = SECOND_OF_MINUTE(TIME('01:02:03')) | fields `SECOND_OF_MINUTE(TIME('01:02:03'))`
fetched rows / total rows = 1/1
+--------------------------------------+
| SECOND_OF_MINUTE(TIME('01:02:03'))   |
|--------------------------------------|
| 3                                    |
+--------------------------------------+
```

##### `SUBDATE`
<a name="supported-ppl-date-time-functions-subdate"></a>

**Usage**: `SUBDATE(DATE, DAYS)` subtracts the specified number of days from the given date.

**Argument type**: DATE/TIMESTAMP, LONG

**Return type map**: (DATE, LONG) -> DATE; (TIMESTAMP, LONG) -> TIMESTAMP

**Antonyms**: `ADDDATE`

**Example**:

```
os> source=people | eval `'2008-01-02' - 31d` = SUBDATE(DATE('2008-01-02'), 31), `'2020-08-26' - 1` = SUBDATE(DATE('2020-08-26'), 1), `ts '2020-08-26 01:01:01' - 1` = SUBDATE(TIMESTAMP('2020-08-26 01:01:01'), 1) | fields `'2008-01-02' - 31d`, `'2020-08-26' - 1`, `ts '2020-08-26 01:01:01' - 1`
fetched rows / total rows = 1/1
+----------------------+--------------------+--------------------------------+
| '2008-01-02' - 31d   | '2020-08-26' - 1   | ts '2020-08-26 01:01:01' - 1   |
|----------------------+--------------------+--------------------------------|
| 2007-12-02 00:00:00  | 2020-08-25         | 2020-08-25 01:01:01            |
+----------------------+--------------------+--------------------------------+
```

##### `SYSDATE`
<a name="supported-ppl-date-time-functions-sysdate"></a>

**Usage**: `SYSDATE()` returns the current date and time as a `TIMESTAMP` value in the 'YYYY-MM-DD hh:mm:ss.nnnnnn' format. 

`SYSDATE()` returns the exact time at which it executes. This differs from `NOW()`, which returns a constant time indicating when the statement began to execute. 

**Optional argument type**: INTEGER (0 to 6) - Specifies the number of digits for fractional seconds in the return value.

**Return type**: TIMESTAMP

**Example**:

```
os> source=people | eval `SYSDATE()` = SYSDATE() | fields `SYSDATE()`
fetched rows / total rows = 1/1
+----------------------------+
| SYSDATE()                  |
|----------------------------|
| 2022-08-02 15:39:05.123456 |
+----------------------------+
```
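
The difference between `NOW()` and `SYSDATE()` comes down to when the clock is read. A minimal Python sketch of the two evaluation strategies (the `now`/`sysdate` helpers are illustrative, not PPL internals):

```python
import time
from datetime import datetime

# NOW() is fixed when the statement starts executing; SYSDATE() reads the
# clock at every call.
statement_start = datetime.now()

def now() -> datetime:
    return statement_start      # constant for the whole statement

def sysdate() -> datetime:
    return datetime.now()       # re-read the clock on each call

first = now()
time.sleep(0.01)
assert now() == first           # NOW() never moves within a statement
assert sysdate() >= first       # SYSDATE() keeps advancing
```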

##### `TIMESTAMP`
<a name="supported-ppl-date-time-functions-timestamp"></a>

**Usage**: `TIMESTAMP(EXPR)` constructs a timestamp value from the given expression. 

With a single argument, `TIMESTAMP(expr)` constructs a timestamp from the input. If `expr` is a string, it's interpreted as a timestamp. For non-string arguments, the function casts `expr` to a timestamp using the UTC timezone. When `expr` is a `TIME` value, the function applies today's date before casting.

When used with two arguments, `TIMESTAMP(expr1, expr2)` adds the time expression (`expr2`) to the date or timestamp expression (`expr1`) and returns the result as a timestamp value.

**Argument type**: STRING/DATE/TIME/TIMESTAMP

**Return type map**:

(STRING/DATE/TIME/TIMESTAMP) -> TIMESTAMP

(STRING/DATE/TIME/TIMESTAMP, STRING/DATE/TIME/TIMESTAMP) -> TIMESTAMP

**Example**:

```
os> source=people | eval `TIMESTAMP('2020-08-26 13:49:00')` = TIMESTAMP('2020-08-26 13:49:00'), `TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42'))` = TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42')) | fields `TIMESTAMP('2020-08-26 13:49:00')`, `TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42'))`
fetched rows / total rows = 1/1
+------------------------------------+------------------------------------------------------+
| TIMESTAMP('2020-08-26 13:49:00')   | TIMESTAMP('2020-08-26 13:49:00', TIME('12:15:42'))   |
|------------------------------------+------------------------------------------------------|
| 2020-08-26 13:49:00                | 2020-08-27 02:04:42                                  |
+------------------------------------+------------------------------------------------------+
```
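
The two-argument form treats the `TIME` operand as a duration added to the timestamp, which is why the example above rolls over into the next day. A Python sketch of the same arithmetic:

```python
from datetime import datetime, timedelta, time

# Adding 12:15:42 to 2020-08-26 13:49:00 exceeds 24:00, so the result
# lands on the following day.
base = datetime(2020, 8, 26, 13, 49, 0)
t = time(12, 15, 42)
result = base + timedelta(hours=t.hour, minutes=t.minute, seconds=t.second)
print(result)  # 2020-08-27 02:04:42
```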

##### `UNIX_TIMESTAMP`
<a name="supported-ppl-date-time-functions-unix-timestamp"></a>

**Usage**: `UNIX_TIMESTAMP` converts a given date argument to Unix time (seconds since the Epoch, which began at the start of 1970). If no argument is provided, it returns the current Unix time. 

The date argument can be a `DATE`, a `TIMESTAMP` string, or a number in one of these formats: `YYMMDD`, `YYMMDDhhmmss`, `YYYYMMDD`, or `YYYYMMDDhhmmss`. If the argument includes a time component, it may optionally include fractional seconds.

If the argument is in an invalid format or falls outside the range of 1970-01-01 00:00:00 to 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 in epoch time), the function returns `NULL`.

The function accepts `DATE`, `TIMESTAMP`, or `DOUBLE` as argument types, or no argument. It always returns a `DOUBLE` value representing the Unix timestamp.

For the reverse conversion, you can use the `FROM_UNIXTIME` function.

**Argument type**: <NONE>/DOUBLE/DATE/TIMESTAMP

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `UNIX_TIMESTAMP(double)` = UNIX_TIMESTAMP(20771122143845), `UNIX_TIMESTAMP(timestamp)` = UNIX_TIMESTAMP(TIMESTAMP('1996-11-15 17:05:42')) | fields `UNIX_TIMESTAMP(double)`, `UNIX_TIMESTAMP(timestamp)`
fetched rows / total rows = 1/1
+--------------------------+-----------------------------+
| UNIX_TIMESTAMP(double)   | UNIX_TIMESTAMP(timestamp)   |
|--------------------------+-----------------------------|
| 3404817525.0             | 848077542.0                 |
+--------------------------+-----------------------------+
```
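
The numeric-argument behavior can be reproduced in Python by unpacking the digit groups and converting to epoch seconds in UTC. This sketch (`unix_ts` is an illustrative helper) handles only the 14-digit `YYYYMMDDhhmmss` form, not the shorter two-digit-year formats:

```python
from datetime import datetime, timezone

def unix_ts(value) -> float:
    # Interpret a YYYYMMDDhhmmss number as a UTC timestamp and return
    # seconds since the epoch as a DOUBLE-like float.
    s = f"{int(value):014d}"
    dt = datetime.strptime(s, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return dt.timestamp()

print(unix_ts(20771122143845))  # 3404817525.0, matching the example above
```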

##### `WEEK`
<a name="supported-ppl-date-time-functions-week"></a>

**Usage**: `WEEK(DATE)` returns the week number for a given date.

**Argument type**: DATE/TIMESTAMP/STRING

**Return type**: INTEGER

**Synonyms**: `WEEK_OF_YEAR`

**Example**:

```
os> source=people | eval `WEEK(DATE('2008-02-20'))` = WEEK(DATE('2008-02-20')) | fields `WEEK(DATE('2008-02-20'))`
fetched rows / total rows = 1/1
+----------------------------+
| WEEK(DATE('2008-02-20'))   |
|----------------------------|
| 8                          |
+----------------------------+
```

##### `WEEKDAY`
<a name="supported-ppl-date-time-functions-weekday"></a>

**Usage**: `WEEKDAY(DATE)` returns the weekday index for date (0 = Monday, 1 = Tuesday, ..., 6 = Sunday).

It is similar to the `dayofweek` function, but returns a different index for each day.

**Argument type**: STRING/DATE/TIME/TIMESTAMP

**Return type**: INTEGER

**Example**:

```
os> source=people | eval `weekday(DATE('2020-08-26'))` = weekday(DATE('2020-08-26')) | eval `weekday(DATE('2020-08-27'))` = weekday(DATE('2020-08-27')) | fields `weekday(DATE('2020-08-26'))`, `weekday(DATE('2020-08-27'))`
fetched rows / total rows = 1/1
+-------------------------------+-------------------------------+
| weekday(DATE('2020-08-26'))   | weekday(DATE('2020-08-27'))   |
|-------------------------------+-------------------------------|
| 2                             | 3                             |
+-------------------------------+-------------------------------+
```
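
The 0-for-Monday indexing matches Python's `date.weekday()`, which makes the example easy to cross-check:

```python
from datetime import date

# PPL WEEKDAY and Python's date.weekday() both count Monday as 0;
# DAYOFWEEK-style functions typically count Sunday as 1 instead.
print(date(2020, 8, 26).weekday())  # Wednesday -> 2
print(date(2020, 8, 27).weekday())  # Thursday  -> 3
```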

##### `WEEK_OF_YEAR`
<a name="supported-ppl-date-time-functions-week-of-year"></a>

**Usage**: `WEEK_OF_YEAR(DATE)` returns the week number for the given date.

**Argument type**: DATE/TIMESTAMP/STRING

**Return type**: INTEGER

**Synonyms**: `WEEK`

**Example**:

```
os> source=people | eval `WEEK_OF_YEAR(DATE('2008-02-20'))` = WEEK(DATE('2008-02-20'))| fields `WEEK_OF_YEAR(DATE('2008-02-20'))`
fetched rows / total rows = 1/1
+------------------------------------+
| WEEK_OF_YEAR(DATE('2008-02-20'))   |
|------------------------------------|
| 8                                  |
+------------------------------------+
```

##### `YEAR`
<a name="supported-ppl-date-time-functions-year"></a>

**Usage**: `YEAR(DATE)` returns the year for date, in the range 1000 to 9999, or 0 for the "zero" date.

**Argument type**: STRING/DATE/TIMESTAMP

**Return type**: INTEGER

**Example**:

```
os> source=people | eval `YEAR(DATE('2020-08-26'))` = YEAR(DATE('2020-08-26')) | fields `YEAR(DATE('2020-08-26'))`
fetched rows / total rows = 1/1
+----------------------------+
| YEAR(DATE('2020-08-26'))   |
|----------------------------|
| 2020                       |
+----------------------------+
```

##### `DATE_ADD`
<a name="supported-ppl-date-time-functions-date-add"></a>

**Usage**: `DATE_ADD(date, INTERVAL expr unit)` adds the specified interval to the given date.

**Argument type**: DATE, INTERVAL

**Return type**: DATE

**Antonyms**: `DATE_SUB`

**Example**:

```
os> source=people | eval `'2020-08-26' + 1d` = DATE_ADD(DATE('2020-08-26'), INTERVAL 1 DAY) | fields `'2020-08-26' + 1d`
fetched rows / total rows = 1/1
+---------------------+
| '2020-08-26' + 1d   |
|---------------------|
| 2020-08-27          |
+---------------------+
```

##### `DATE_SUB`
<a name="supported-ppl-date-time-functions-date-sub"></a>

**Usage**: `DATE_SUB(date, INTERVAL expr unit)` subtracts the interval expr from date.

**Argument type**: DATE, INTERVAL

**Return type**: DATE

**Antonyms**: `DATE_ADD`

**Example**:

```
os> source=people | eval `'2008-01-02' - 31d` = DATE_SUB(DATE('2008-01-02'), INTERVAL 31 DAY) | fields `'2008-01-02' - 31d`
fetched rows / total rows = 1/1
+---------------------+
| '2008-01-02' - 31d  |
|---------------------|
| 2007-12-02          |
+---------------------+
```

##### `TIMESTAMPADD`
<a name="supported-ppl-date-time-functions-timestampadd"></a>

**Usage**: Returns a `TIMESTAMP` value after adding a specified time interval to a given date.

**Arguments**: 
+ interval: INTERVAL (SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR) 
+ integer: INTEGER 
+ date: DATE, TIMESTAMP, or STRING

If you provide a `STRING` as the date argument, format it as a valid `TIMESTAMP`. The function automatically converts a `DATE` argument to a `TIMESTAMP`.

**Examples**:

```
os> source=people | eval `TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00')` = TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00') | eval `TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00')` = TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00') | fields `TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00')`, `TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00')`
fetched rows / total rows = 1/1
+----------------------------------------------+--------------------------------------------------+
| TIMESTAMPADD(DAY, 17, '2000-01-01 00:00:00') | TIMESTAMPADD(QUARTER, -1, '2000-01-01 00:00:00') |
|----------------------------------------------+--------------------------------------------------|
| 2000-01-18 00:00:00                          | 1999-10-01 00:00:00                              |
+----------------------------------------------+--------------------------------------------------+
```
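
Day-sized units are plain durations, while month-sized units such as `QUARTER` require calendar arithmetic on the month field. A minimal Python sketch covering the two units from the example (`timestamp_add` is an illustrative helper, not the PPL implementation, and it does not clamp month-end days such as January 31):

```python
from datetime import datetime, timedelta

def timestamp_add(unit: str, n: int, ts: datetime) -> datetime:
    if unit == "DAY":
        return ts + timedelta(days=n)
    if unit == "QUARTER":
        # One quarter is three calendar months; carry overflow into the year.
        months = ts.month - 1 + 3 * n
        return ts.replace(year=ts.year + months // 12, month=months % 12 + 1)
    raise ValueError(f"unit {unit!r} not covered by this sketch")

print(timestamp_add("DAY", 17, datetime(2000, 1, 1)))      # 2000-01-18 00:00:00
print(timestamp_add("QUARTER", -1, datetime(2000, 1, 1)))  # 1999-10-01 00:00:00
```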

##### `TIMESTAMPDIFF`
<a name="supported-ppl-date-time-functions-timestampdiff"></a>

**Usage**: `TIMESTAMPDIFF(interval, start, end)` returns the difference between the start and end date/times in specified interval units.

**Arguments**: 
+ interval: INTERVAL (SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR) 
+ start: DATE, TIMESTAMP, or STRING 
+ end: DATE, TIMESTAMP, or STRING

The function automatically converts arguments to `TIMESTAMP` when appropriate. Format `STRING` arguments as valid `TIMESTAMP`s.

**Examples**:

```
os> source=people | eval `TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00')` = TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00') | eval `TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00'))` = TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00')) | fields `TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00')`, `TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00'))`
fetched rows / total rows = 1/1
+-------------------------------------------------------------------+-------------------------------------------------------------------------------------------+
| TIMESTAMPDIFF(YEAR, '1997-01-01 00:00:00', '2001-03-06 00:00:00') | TIMESTAMPDIFF(SECOND, timestamp('1997-01-01 00:00:23'), timestamp('1997-01-01 00:00:00')) |
|-------------------------------------------------------------------+-------------------------------------------------------------------------------------------|
| 4                                                                 | -23                                                                                       |
+-------------------------------------------------------------------+-------------------------------------------------------------------------------------------+
```
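
The example values can be reproduced in Python: the `YEAR` result counts complete calendar years, and the `SECOND` result is negative because the end precedes the start. The `diff_years` helper is illustrative, not the PPL implementation:

```python
from datetime import datetime

def diff_years(start: datetime, end: datetime) -> int:
    # Count complete calendar years between start and end.
    years = end.year - start.year
    if (end.month, end.day, end.time()) < (start.month, start.day, start.time()):
        years -= 1
    return years

print(diff_years(datetime(1997, 1, 1), datetime(2001, 3, 6)))  # 4 complete years
seconds = (datetime(1997, 1, 1, 0, 0, 0)
           - datetime(1997, 1, 1, 0, 0, 23)).total_seconds()
print(int(seconds))  # -23: the end is 23 seconds before the start
```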

##### `UTC_TIMESTAMP`
<a name="supported-ppl-date-time-functions-utc-timestamp"></a>

**Usage**: `UTC_TIMESTAMP` returns the current UTC timestamp as a value in 'YYYY-MM-DD hh:mm:ss'.

**Return type**: TIMESTAMP

**Specification**: UTC_TIMESTAMP() -> TIMESTAMP

**Example**:

```
> source=people | eval `UTC_TIMESTAMP()` = UTC_TIMESTAMP() | fields `UTC_TIMESTAMP()`
fetched rows / total rows = 1/1
+---------------------+
| UTC_TIMESTAMP()     |
|---------------------|
| 2022-10-03 17:54:28 |
+---------------------+
```

##### `CURRENT_TIMEZONE`
<a name="supported-ppl-date-time-functions-current-timezone"></a>

**Usage**: `CURRENT_TIMEZONE` returns the current local timezone.

**Return type**: STRING

**Example**:

```
> source=people | eval `CURRENT_TIMEZONE()` = CURRENT_TIMEZONE() | fields `CURRENT_TIMEZONE()`
fetched rows / total rows = 1/1
+------------------------+
| CURRENT_TIMEZONE()     |
|------------------------|
| America/Chicago        |
+------------------------+
```

##### PPL expressions
<a name="supported-ppl-expressions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

Expressions, particularly value expressions, return a scalar value. Expressions have different types and forms. For example, there are literal values as atom expressions and arithmetic, predicate and function expressions built on top of them. You can use expressions in different clauses, such as using arithmetic expressions in `Filter` and `Stats` commands.

**Operators**

An arithmetic expression is an expression formed by numeric literals and binary arithmetic operators as follows:

1. `+`: Add.

1. `-`: Subtract.

1. `*`: Multiply.

1. `/`: Divide (For integers, the result is an integer with the fractional part discarded)

1. `%`: Modulo (Use with integers only; the result is the remainder of the division)

**Precedence**

Use parentheses to control the precedence of arithmetic operators. Otherwise, operators of higher precedence are performed first.

**Type conversion**

Implicit type conversion is performed when looking up operator signatures. For example, an integer `+` a real number matches signature `+(double,double)` which results in a real number. This rule also applies to function calls.

Example of different types of arithmetic expressions:

```
os> source=accounts | where age > (25 + 5) | fields age ;
fetched rows / total rows = 3/3
+-------+
| age   |
|-------|
| 32    |
| 36    |
| 33    |
+-------+
```

**Predicate operators**  
A predicate operator is an expression that evaluates to true or false. `MISSING` and `NULL` value comparisons follow these rules: 
+ A `MISSING` value only equals a `MISSING` value and is less than other values. 
+ A `NULL` value equals a `NULL` value, is larger than a `MISSING` value, but is less than all other values.
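
The ordering rules above can be sketched as a sort key in Python; the `MISSING` and `NULL` sentinels here are illustrative stand-ins for the PPL values:

```python
# MISSING sorts below NULL, and NULL sorts below every concrete value.
MISSING, NULL = object(), object()

def sort_key(v):
    if v is MISSING:
        return (0, 0)
    if v is NULL:
        return (1, 0)
    return (2, v)

values = [3, NULL, 1, MISSING, 2]
ordered = sorted(values, key=sort_key)  # MISSING, NULL, 1, 2, 3
```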

**Predicate operators**  

| Name | Description | 
| --- | --- | 
| > | Greater than operator | 
| >= | Greater than or equal operator | 
| < | Less than operator | 
| != | Not equal operator | 
| <= | Less than or equal operator | 
| = | Equal operator | 
| LIKE | Simple pattern matching | 
| IN | Value list membership test | 
| AND | AND operator | 
| OR | OR operator | 
| XOR | XOR operator | 
| NOT | Logical negation | 

You can compare datetimes. When comparing different datetime types (for example `DATE` and `TIME`), both convert to `DATETIME`. The following rules apply to conversion:
+  `TIME` applies to today's date.
+ `DATE` is interpreted at midnight.

**Basic predicate operator**  
Example for comparison operators:

```
os> source=accounts | where age > 33 | fields age ;
fetched rows / total rows = 1/1
+-------+
| age   |
|-------|
| 36    |
+-------+
```

**`IN`**  
Example of the `IN` operator testing whether a field's value appears in a list:

```
os> source=accounts | where age in (32, 33) | fields age ;
fetched rows / total rows = 2/2
+-------+
| age   |
|-------|
| 32    |
| 33    |
+-------+
```

**`OR`**  
Example of the `OR` operator:

```
os> source=accounts | where age = 32 OR age = 33 | fields age ;
fetched rows / total rows = 2/2
+-------+
| age   |
|-------|
| 32    |
| 33    |
+-------+
```

**`NOT`**  
Example of the `NOT` operator:

```
os> source=accounts | where age not in (32, 33) | fields age ;
fetched rows / total rows = 2/2
+-------+
| age   |
|-------|
| 36    |
| 28    |
+-------+
```

##### PPL IP address functions
<a name="supported-ppl-ip-address-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### `CIDRMATCH`
<a name="supported-ppl-address-functions-cidrmatch"></a>

**Usage**: `CIDRMATCH(ip, cidr)` checks whether the specified IP address is within the given CIDR block.

**Argument type**: STRING, STRING

**Return type**: BOOLEAN

**Example**:

```
os> source=ips | where cidrmatch(ip, '***********/24') | fields ip
fetched rows / total rows = 1/1
+--------------+
| ip           |
|--------------|
| ***********  |
+--------------+

os> source=ipsv6 | where cidrmatch(ip, '2003:db8::/32') | fields ip
fetched rows / total rows = 1/1
+-----------------------------------------+
| ip                                      |
|-----------------------------------------|
| 2003:0db8:****:****:****:****:****:0000 |
+-----------------------------------------+
```

**Note**  
`ip` can be an IPv4 or an IPv6 address.
`cidr` can be an IPv4 or an IPv6 block.
`ip` and `cidr` must be either both IPv4 or both IPv6.
`ip` and `cidr` must both be valid and non-empty/non-null.
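
The same membership test can be expressed with Python's standard `ipaddress` module. The addresses below are reserved documentation examples (RFC 5737 and RFC 3849), not values from the tables above:

```python
import ipaddress

# An address is "in" a network when it falls inside the CIDR block.
assert ipaddress.ip_address("192.0.2.5") in ipaddress.ip_network("192.0.2.0/24")
assert ipaddress.ip_address("2001:db8::1") in ipaddress.ip_network("2001:db8::/32")

# Mixing IPv4 and IPv6 is simply False here; PPL requires both arguments
# to be the same address family.
assert ipaddress.ip_address("192.0.2.5") not in ipaddress.ip_network("2001:db8::/32")
```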

##### PPL JSON functions
<a name="supported-ppl-json-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### `JSON`
<a name="supported-ppl-json-functions-json"></a>

**Usage**: `json(value)` evaluates whether a string can be parsed as JSON format. The function returns the original string if it's valid JSON, or null if it's invalid.

**Argument type**: STRING

**Return type**: STRING/NULL. A STRING expression of a valid JSON object format.

**Examples**:

```
os> source=people | eval `valid_json` = json('[1,2,3,{"f1":1,"f2":[5,6]},4]') | fields valid_json
fetched rows / total rows = 1/1
+---------------------------------+
| valid_json                      |
+---------------------------------+
| [1,2,3,{"f1":1,"f2":[5,6]},4]   |
+---------------------------------+

os> source=people | eval `invalid_json` = json('{"invalid": "json"') | fields invalid_json
fetched rows / total rows = 1/1
+----------------+
| invalid_json   |
+----------------+
| null           |
+----------------+
```

##### `JSON_OBJECT`
<a name="supported-ppl-json-functions-json-object"></a>

**Usage**: `json_object(<key>, <value>[, <key>, <value>]...)` constructs a JSON object from the given key-value pairs.

**Argument type:**
+ A <key> must be STRING.
+ A <value> can be any data type.

**Return type**: JSON_OBJECT. A StructType expression of a valid JSON object.

**Examples**:

```
os> source=people | eval result = json_object('key', 123.45) | fields result
fetched rows / total rows = 1/1
+------------------+
| result           |
+------------------+
| {"key":123.45}   |
+------------------+

os> source=people | eval result = json_object('outer', json_object('inner', 123.45)) | fields result
fetched rows / total rows = 1/1
+------------------------------+
| result                       |
+------------------------------+
| {"outer":{"inner":123.45}}   |
+------------------------------+
```

##### `JSON_ARRAY`
<a name="supported-ppl-json-functions-json-array"></a>

**Usage**: `json_array(<value>...)` creates a JSON ARRAY using a list of values.

**Argument type**: A `<value>` can be any kind of value such as string, number, or boolean.

**Return type**: ARRAY. An array of any supported data type for a valid JSON array.

**Examples**:

```
os> source=people | eval `json_array` = json_array(1, 2, 0, -1, 1.1, -0.11)
fetched rows / total rows = 1/1
+------------------------------+
| json_array                   |
+------------------------------+
| [1.0,2.0,0.0,-1.0,1.1,-0.11] |
+------------------------------+

os> source=people | eval `json_array_object` = json_object("array", json_array(1, 2, 0, -1, 1.1, -0.11))
fetched rows / total rows = 1/1
+----------------------------------------+
| json_array_object                      |
+----------------------------------------+
| {"array":[1.0,2.0,0.0,-1.0,1.1,-0.11]} |
+----------------------------------------+
```

##### `TO_JSON_STRING`
<a name="supported-ppl-json-functions-to-json-string"></a>

**Usage**: `to_json_string(jsonObject)` returns the JSON string representation of the given JSON object.

**Argument type**: JSON_OBJECT

**Return type**: STRING

**Examples**:

```
os> source=people | eval `json_string` = to_json_string(json_array(1, 2, 0, -1, 1.1, -0.11)) | fields json_string
fetched rows / total rows = 1/1
+--------------------------------+
| json_string                    |
+--------------------------------+
| [1.0,2.0,0.0,-1.0,1.1,-0.11]   |
+--------------------------------+

os> source=people | eval `json_string` = to_json_string(json_object('key', 123.45)) | fields json_string
fetched rows / total rows = 1/1
+-----------------+
| json_string     |
+-----------------+
| {"key":123.45}  |
+-----------------+
```

##### `ARRAY_LENGTH`
<a name="supported-ppl-json-functions-array-length"></a>

**Usage**: `array_length(jsonArray)` returns the number of elements in the outermost array.

**Argument type**: ARRAY. An ARRAY or JSON_ARRAY object.

**Return type**: INTEGER

**Example**:

```
os> source=people | eval `json_array` = json_array_length(json_array(1,2,3,4)), `empty_array` = json_array_length(json_array())
fetched rows / total rows = 1/1
+--------------+---------------+
| json_array   | empty_array   |
+--------------+---------------+
| 4            | 0             |
+--------------+---------------+
```

##### `JSON_EXTRACT`
<a name="supported-ppl-json-functions-json-extract"></a>

**Usage**: `json_extract(jsonStr, path)` extracts a JSON object from a JSON string based on the specified JSON path. The function returns null if the input JSON string is invalid.

**Argument type**: STRING, STRING

**Return type**: STRING
+ A STRING expression of a valid JSON object format.
+ `NULL` is returned in case of an invalid JSON.

**Examples**:

```
os> source=people | eval `json_extract('{"a":"b"}', '$.a')` = json_extract('{"a":"b"}', '$.a')
fetched rows / total rows = 1/1
+------------------------------------+
| json_extract('{"a":"b"}', '$.a')   |
+------------------------------------+
| b                                  |
+------------------------------------+

os> source=people | eval `json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[1].b')` = json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[1].b')
fetched rows / total rows = 1/1
+-----------------------------------------------------------+
| json_extract('{"a":[{"b":1.0},{"b":2.0}]}', '$.a[1].b')   |
+-----------------------------------------------------------+
| 2.0                                                       |
+-----------------------------------------------------------+

os> source=people | eval `json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[*].b')` = json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[*].b')
fetched rows / total rows = 1/1
+-----------------------------------------------------------+
| json_extract('{"a":[{"b":1.0},{"b":2.0}]}', '$.a[*].b')   |
+-----------------------------------------------------------+
| [1.0,2.0]                                                 |
+-----------------------------------------------------------+

os> source=people | eval `invalid_json` = json_extract('{"invalid": "json"')
fetched rows / total rows = 1/1
+----------------+
| invalid_json   |
+----------------+
| null           |
+----------------+
```
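
A minimal Python walker shows how dotted paths with `[index]` steps resolve against a parsed document. This is an illustrative sketch, not the PPL implementation: it covers only the path shapes in the examples above and does not handle the `[*]` wildcard:

```python
import json

def json_extract(json_str: str, path: str):
    # Return NULL (None) for invalid JSON, mirroring the function's contract.
    try:
        node = json.loads(json_str)
    except json.JSONDecodeError:
        return None
    # '$.a[1].b' -> steps ['a', '[1]', 'b']
    for step in path.lstrip("$.").replace("[", ".[").split("."):
        if step.startswith("["):
            node = node[int(step[1:-1])]   # array index step
        elif step:
            node = node[step]              # object key step
    return node

print(json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[1].b'))  # 2
print(json_extract('{"invalid": "json"', '$.a'))            # None
```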

##### `JSON_KEYS`
<a name="supported-ppl-json-functions-json-keys"></a>

**Usage**: `json_keys(jsonStr)` returns all the keys of the outermost JSON object as an array.

**Argument type**: STRING. A STRING expression of a valid JSON object format.

**Return type**: ARRAY[STRING]. The function returns `NULL` if the argument is a valid JSON value that is not an object (such as an array or scalar), an empty string, or invalid JSON.

**Examples**:

```
os> source=people | eval `keys` = json_keys('{"f1":"abc","f2":{"f3":"a","f4":"b"}}')
fetched rows / total rows = 1/1
+------------+
| keys       |
+------------+
| [f1, f2]   |
+------------+

os> source=people | eval `keys` = json_keys('[1,2,3,{"f1":1,"f2":[5,6]},4]')
fetched rows / total rows = 1/1
+--------+
| keys   |
+--------+
| null   |
+--------+
```

##### `JSON_VALID`
<a name="supported-ppl-json-functions-json-valid"></a>

**Usage**: `json_valid(jsonStr)` evaluates whether a JSON string uses valid JSON syntax and returns TRUE or FALSE.

**Argument type**: STRING

**Return type**: BOOLEAN

**Examples**:

```
os> source=people | eval `valid_json` = json_valid('[1,2,3,4]'), `invalid_json` = json_valid('{"invalid": "json"') | fields `valid_json`, `invalid_json`
fetched rows / total rows = 1/1
+--------------+----------------+
| valid_json   | invalid_json   |
+--------------+----------------+
| True         | False          |
+--------------+----------------+

os> source=accounts | where json_valid('[1,2,3,4]') and isnull(email) | fields account_number, email
fetched rows / total rows = 1/1
+------------------+---------+
| account_number   | email   |
|------------------+---------|
| 13               | null    |
+------------------+---------+
```

##### PPL Lambda functions
<a name="supported-ppl-lambda-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### `EXISTS`
<a name="supported-ppl-lambda-functions-exists"></a>

**Usage**: `exists(array, lambda)` evaluates whether a Lambda predicate holds for one or more elements in the array.

**Argument type**: ARRAY, LAMBDA

**Return type**: BOOLEAN. Returns `TRUE` if at least one element in the array satisfies the Lambda predicate, otherwise `FALSE`.

**Examples**:

```
 os> source=people | eval array = json_array(1, -1, 2), result = exists(array, x -> x > 0) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| true      |
+-----------+

os> source=people | eval array = json_array(-1, -3, -2), result = exists(array, x -> x > 0) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| false     |
+-----------+
```

##### `FILTER`
<a name="supported-ppl-lambda-functions-filter"></a>

**Usage**: `filter(array, lambda)` filters the input array using the given Lambda function.

**Argument type**: ARRAY, LAMBDA

**Return type**: ARRAY. An ARRAY that contains all elements in the input array that satisfy the lambda predicate.

**Examples**:

```
 os> source=people | eval array = json_array(1, -1, 2), result = filter(array, x -> x > 0) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| [1, 2]    |
+-----------+

os> source=people | eval array = json_array(-1, -3, -2), result = filter(array, x -> x > 0) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| []        |
+-----------+
```

##### `TRANSFORM`
<a name="supported-ppl-lambda-functions-transform"></a>

**Usage**: `transform(array, lambda)` transforms elements in an array using the Lambda transform function. If a binary Lambda function is used, the second argument receives the element's index. This is similar to `map` in functional programming.

**Argument type**: ARRAY, LAMBDA

**Return type**: ARRAY. An ARRAY that contains the result of applying the lambda transform function to each element in the input array.

**Examples**:

```
os> source=people | eval array = json_array(1, 2, 3), result = transform(array, x -> x + 1) | fields result
fetched rows / total rows = 1/1
+--------------+
| result       |
+--------------+
| [2, 3, 4]    |
+--------------+

os> source=people | eval array = json_array(1, 2, 3), result = transform(array, (x, i) -> x + i) | fields result
fetched rows / total rows = 1/1
+--------------+
| result       |
+--------------+
| [1, 3, 5]    |
+--------------+
```
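
The unary and binary lambda forms correspond to a plain map and a map over (element, index) pairs, which Python expresses as comprehensions:

```python
array = [1, 2, 3]

# Unary lambda: apply the function to each element.
plus_one = [x + 1 for x in array]               # [2, 3, 4]

# Binary lambda: the second parameter is the element's index.
plus_index = [x + i for i, x in enumerate(array)]  # [1, 3, 5]
print(plus_one, plus_index)
```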

##### `REDUCE`
<a name="supported-ppl-lambda-functions-reduce"></a>

**Usage**: `reduce(array, start, merge_lambda, finish_lambda)` reduces an array to a single value by applying Lambda functions. The function applies the `merge_lambda` to the start value and each array element in turn, then applies the optional `finish_lambda` to the result.

**Argument type**: ARRAY, ANY, LAMBDA, LAMBDA

**Return type**: ANY. The final result of applying the Lambda functions to the start value and the input array.

**Examples**:

```
 os> source=people | eval array = json_array(1, 2, 3), result = reduce(array, 0, (acc, x) -> acc + x) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| 6         |
+-----------+

os> source=people | eval array = json_array(1, 2, 3), result = reduce(array, 10, (acc, x) -> acc + x) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| 16        |
+-----------+

os> source=people | eval array = json_array(1, 2, 3), result = reduce(array, 0, (acc, x) -> acc + x, acc -> acc * 10) | fields result
fetched rows / total rows = 1/1
+-----------+
| result    |
+-----------+
| 60        |
+-----------+
```
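
The three examples above map directly onto a fold followed by an optional finishing step; in Python that is `functools.reduce` plus one extra call. The `ppl_reduce` helper is an illustrative sketch, not the PPL implementation:

```python
from functools import reduce

def ppl_reduce(array, start, merge, finish=lambda acc: acc):
    # Fold the array with the merge function, then apply the optional
    # finish function to the final accumulator.
    return finish(reduce(merge, array, start))

print(ppl_reduce([1, 2, 3], 0, lambda acc, x: acc + x))        # 6
print(ppl_reduce([1, 2, 3], 10, lambda acc, x: acc + x))       # 16
print(ppl_reduce([1, 2, 3], 0, lambda acc, x: acc + x,
                 lambda acc: acc * 10))                        # 60
```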

##### PPL mathematical functions
<a name="supported-ppl-math-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### `ABS`
<a name="supported-ppl-math-functions-abs"></a>

**Usage**: `ABS(x)` calculates the absolute value of x.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: INTEGER/LONG/FLOAT/DOUBLE

**Example**:

```
os> source=people | eval `ABS(-1)` = ABS(-1) | fields `ABS(-1)`
fetched rows / total rows = 1/1
+-----------+
| ABS(-1)   |
|-----------|
| 1         |
+-----------+
```

##### `ACOS`
<a name="supported-ppl-math-functions-acos"></a>

**Usage**: `ACOS(x)` calculates the arc cosine of x. It returns `NULL` if x is not in the range -1 to 1.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `ACOS(0)` = ACOS(0) | fields `ACOS(0)`
fetched rows / total rows = 1/1
+--------------------+
| ACOS(0)            |
|--------------------|
| 1.5707963267948966 |
+--------------------+
```

##### `ASIN`
<a name="supported-ppl-math-functions-asin"></a>

**Usage**: `asin(x)` calculates the arc sine of x. It returns `NULL` if x is not in the range -1 to 1.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `ASIN(0)` = ASIN(0) | fields `ASIN(0)`
fetched rows / total rows = 1/1
+-----------+
| ASIN(0)   |
|-----------|
| 0.0       |
+-----------+
```

##### `ATAN`
<a name="supported-ppl-math-functions-atan"></a>

**Usage**: `ATAN(x)` calculates the arc tangent of x. `ATAN(y, x)` calculates the arc tangent of y / x, except that the signs of both arguments determine the quadrant of the result.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `ATAN(2)` = ATAN(2), `ATAN(2, 3)` = ATAN(2, 3) | fields `ATAN(2)`, `ATAN(2, 3)`
fetched rows / total rows = 1/1
+--------------------+--------------------+
| ATAN(2)            | ATAN(2, 3)         |
|--------------------+--------------------|
| 1.1071487177940904 | 0.5880026035475675 |
+--------------------+--------------------+
```

##### `ATAN2`
<a name="supported-ppl-math-functions-atan2"></a>

**Usage**: `ATAN2(y, x)` calculates the arc tangent of y / x, except that the signs of both arguments determine the quadrant of the result.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `ATAN2(2, 3)` = ATAN2(2, 3) | fields `ATAN2(2, 3)`
fetched rows / total rows = 1/1
+--------------------+
| ATAN2(2, 3)        |
|--------------------|
| 0.5880026035475675 |
+--------------------+
```
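Because both signs are considered, `ATAN2` can distinguish quadrants that the ratio y / x alone cannot. A quick Python sketch of the same behavior, using the standard library's `math.atan2`:

```python
import math

# y/x is identical for (2, 3) and (-2, -3), but atan2 uses the signs
# of both arguments to place the result in the correct quadrant.
print(math.atan2(2, 3))    # first quadrant, ~0.588
print(math.atan2(-2, -3))  # third quadrant, ~-2.554
```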

##### `CBRT`
<a name="supported-ppl-math-functions-cbrt"></a>

**Usage**: `CBRT(x)` calculates the cube root of x.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
opensearchsql> source=location | eval `CBRT(8)` = CBRT(8), `CBRT(9.261)` = CBRT(9.261), `CBRT(-27)` = CBRT(-27) | fields `CBRT(8)`, `CBRT(9.261)`, `CBRT(-27)`;
fetched rows / total rows = 2/2
+-----------+---------------+-------------+
| CBRT(8)   | CBRT(9.261)   | CBRT(-27)   |
|-----------+---------------+-------------|
| 2.0       | 2.1           | -3.0        |
| 2.0       | 2.1           | -3.0        |
+-----------+---------------+-------------+
```

##### `CEIL`
<a name="supported-ppl-math-functions-ceil"></a>

**Usage**: An alias for the `CEILING` function. `CEILING(T)` takes the ceiling of value T.

**Limitation**: `CEILING` only works as expected when the IEEE 754 double value stores a decimal component; very large values whose decimal part can't be represented are returned unchanged.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: LONG

**Example**:

```
os> source=people | eval `CEILING(0)` = CEILING(0), `CEILING(50.00005)` = CEILING(50.00005), `CEILING(-50.00005)` = CEILING(-50.00005) | fields `CEILING(0)`, `CEILING(50.00005)`, `CEILING(-50.00005)`
fetched rows / total rows = 1/1
+--------------+---------------------+----------------------+
| CEILING(0)   | CEILING(50.00005)   | CEILING(-50.00005)   |
|--------------+---------------------+----------------------|
| 0            | 51                  | -50                  |
+--------------+---------------------+----------------------+

os> source=people | eval `CEILING(3147483647.12345)` = CEILING(3147483647.12345), `CEILING(113147483647.12345)` = CEILING(113147483647.12345), `CEILING(3147483647.00001)` = CEILING(3147483647.00001) | fields `CEILING(3147483647.12345)`, `CEILING(113147483647.12345)`, `CEILING(3147483647.00001)`
fetched rows / total rows = 1/1
+-----------------------------+-------------------------------+-----------------------------+
| CEILING(3147483647.12345)   | CEILING(113147483647.12345)   | CEILING(3147483647.00001)   |
|-----------------------------+-------------------------------+-----------------------------|
| 3147483648                  | 113147483648                  | 3147483648                  |
+-----------------------------+-------------------------------+-----------------------------+
```

##### `CONV`
<a name="supported-ppl-math-functions-conv"></a>

**Usage**: `CONV(x, a, b)` converts the number x from base a to base b.

**Argument type**: x: STRING, a: INTEGER, b: INTEGER

**Return type**: STRING

**Example**:

```
os> source=people | eval `CONV('12', 10, 16)` = CONV('12', 10, 16), `CONV('2C', 16, 10)` = CONV('2C', 16, 10), `CONV(12, 10, 2)` = CONV(12, 10, 2), `CONV(1111, 2, 10)` = CONV(1111, 2, 10) | fields `CONV('12', 10, 16)`, `CONV('2C', 16, 10)`, `CONV(12, 10, 2)`, `CONV(1111, 2, 10)`
fetched rows / total rows = 1/1
+----------------------+----------------------+-------------------+---------------------+
| CONV('12', 10, 16)   | CONV('2C', 16, 10)   | CONV(12, 10, 2)   | CONV(1111, 2, 10)   |
|----------------------+----------------------+-------------------+---------------------|
| c                    | 44                   | 1100              | 15                  |
+----------------------+----------------------+-------------------+---------------------+
```
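The same base conversion can be sketched in Python: `int(s, base)` parses the input in the source base, and a short loop re-encodes it in the target base. `conv` here is an illustrative helper (positive values only), not the OpenSearch implementation:

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def conv(x, from_base, to_base):
    """Convert the number x (str or int) from from_base to to_base."""
    n = int(str(x), from_base)   # parse in the source base
    if n == 0:
        return "0"
    out = ""
    while n:
        out = DIGITS[n % to_base] + out
        n //= to_base
    return out

print(conv('12', 10, 16))  # c
print(conv('2C', 16, 10))  # 44
print(conv(12, 10, 2))     # 1100
print(conv(1111, 2, 10))   # 15
```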

##### `COS`
<a name="supported-ppl-math-functions-cos"></a>

**Usage**: `COS(x)` calculates the cosine of x, where x is given in radians.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type:** DOUBLE

**Example**:

```
os> source=people | eval `COS(0)` = COS(0) | fields `COS(0)`
fetched rows / total rows = 1/1
+----------+
| COS(0)   |
|----------|
| 1.0      |
+----------+
```

##### `COT`
<a name="supported-ppl-math-functions-cot"></a>

**Usage**: `COT(x)` calculates the cotangent of x. It returns an out-of-range error if x equals 0.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `COT(1)` = COT(1) | fields `COT(1)`
fetched rows / total rows = 1/1
+--------------------+
| COT(1)             |
|--------------------|
| 0.6420926159343306 |
+--------------------+
```

##### `CRC32`
<a name="supported-ppl-math-functions-crc32"></a>

**Usage**: `CRC32` calculates a cyclic redundancy check value and returns a 32-bit unsigned value.

**Argument type**: STRING

**Return type**: LONG

**Example**:

```
os> source=people | eval `CRC32('MySQL')` = CRC32('MySQL') | fields `CRC32('MySQL')`
fetched rows / total rows = 1/1
+------------------+
| CRC32('MySQL')   |
|------------------|
| 3259397556       |
+------------------+
```
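Python's standard library computes the same checksum, which is a convenient way to cross-check results; note the mask to keep the value a 32-bit unsigned integer:

```python
import zlib

# CRC32 of the ASCII bytes of 'MySQL', masked to a 32-bit unsigned value
checksum = zlib.crc32(b'MySQL') & 0xFFFFFFFF
print(checksum)  # 3259397556
```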

##### `DEGREES`
<a name="supported-ppl-math-functions-degrees"></a>

**Usage**: `DEGREES(x)` converts x from radians to degrees.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `DEGREES(1.57)` = DEGREES(1.57) | fields `DEGREES(1.57)`
fetched rows / total rows  = 1/1
+-------------------+
| DEGREES(1.57)     |
|-------------------|
| 89.95437383553924 |
+-------------------+
```

##### `E`
<a name="supported-ppl-math-functions-e"></a>

**Usage**: `E()` returns Euler's number.

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `E()` = E() | fields `E()`
fetched rows / total rows = 1/1
+-------------------+
| E()               |
|-------------------|
| 2.718281828459045 |
+-------------------+
```

##### `EXP`
<a name="supported-ppl-math-functions-exp"></a>

**Usage**: `EXP(x)` returns e raised to the power of x.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `EXP(2)` = EXP(2) | fields `EXP(2)`
fetched rows / total rows = 1/1
+------------------+
| EXP(2)           |
|------------------|
| 7.38905609893065 |
+------------------+
```

##### `FLOOR`
<a name="supported-ppl-math-functions-floor"></a>

**Usage**: `FLOOR(T)` takes the floor of value T.

**Limitation**: `FLOOR` only works as expected when the IEEE 754 double value stores a decimal component; very large values whose decimal part can't be represented are returned unchanged.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: LONG

**Example**:

```
os> source=people | eval `FLOOR(0)` = FLOOR(0), `FLOOR(50.00005)` = FLOOR(50.00005), `FLOOR(-50.00005)` = FLOOR(-50.00005) | fields `FLOOR(0)`, `FLOOR(50.00005)`, `FLOOR(-50.00005)`
fetched rows / total rows = 1/1
+------------+-------------------+--------------------+
| FLOOR(0)   | FLOOR(50.00005)   | FLOOR(-50.00005)   |
|------------+-------------------+--------------------|
| 0          | 50                | -51                |
+------------+-------------------+--------------------+

os> source=people | eval `FLOOR(3147483647.12345)` = FLOOR(3147483647.12345), `FLOOR(113147483647.12345)` = FLOOR(113147483647.12345), `FLOOR(3147483647.00001)` = FLOOR(3147483647.00001) | fields `FLOOR(3147483647.12345)`, `FLOOR(113147483647.12345)`, `FLOOR(3147483647.00001)`
fetched rows / total rows = 1/1
+---------------------------+-----------------------------+---------------------------+
| FLOOR(3147483647.12345)   | FLOOR(113147483647.12345)   | FLOOR(3147483647.00001)   |
|---------------------------+-----------------------------+---------------------------|
| 3147483647                | 113147483647                | 3147483647                |
+---------------------------+-----------------------------+---------------------------+

os> source=people | eval `FLOOR(282474973688888.022)` = FLOOR(282474973688888.022), `FLOOR(9223372036854775807.022)` = FLOOR(9223372036854775807.022), `FLOOR(9223372036854775807.0000001)` = FLOOR(9223372036854775807.0000001) | fields `FLOOR(282474973688888.022)`, `FLOOR(9223372036854775807.022)`, `FLOOR(9223372036854775807.0000001)`
fetched rows / total rows = 1/1
+------------------------------+----------------------------------+--------------------------------------+
| FLOOR(282474973688888.022)   | FLOOR(9223372036854775807.022)   | FLOOR(9223372036854775807.0000001)   |
|------------------------------+----------------------------------+--------------------------------------|
| 282474973688888              | 9223372036854775807              | 9223372036854775807                  |
+------------------------------+----------------------------------+--------------------------------------+
```
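The limitation noted above comes from IEEE 754 precision: once a value is too large for a double to store its decimal part, the stored value is already an integer before `FLOOR` ever runs. This can be sketched in Python, whose floats are also IEEE 754 doubles (the PPL engine may parse such literals differently, so treat this as an illustration of the precision effect, not of engine output):

```python
import math

# Small values keep their decimal part, so floor behaves as expected.
print(math.floor(50.00005))   # 50

# 9223372036854775807.022 can't be stored exactly as a double; the
# nearest representable value is 2**63, so the decimal part is lost
# before floor is applied.
stored = float("9223372036854775807.022")
print(stored == 2.0 ** 63)    # True
print(math.floor(stored))     # 9223372036854775808
```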

##### `LN`
<a name="supported-ppl-math-functions-ln"></a>

**Usage**: `LN(x)` returns the natural logarithm of x.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `LN(2)` = LN(2) | fields `LN(2)`
fetched rows / total rows = 1/1
+--------------------+
| LN(2)              |
|--------------------|
| 0.6931471805599453 |
+--------------------+
```

##### `LOG`
<a name="supported-ppl-math-functions-log"></a>

**Usage**: `LOG(x)` returns the natural logarithm of x (the base-e logarithm). The two-argument form `LOG(B, x)` returns the base-B logarithm of x and is equivalent to `log(x)/log(B)`.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `LOG(2)` = LOG(2), `LOG(2, 8)` = LOG(2, 8) | fields `LOG(2)`, `LOG(2, 8)`
fetched rows / total rows = 1/1
+--------------------+-------------+
| LOG(2)             | LOG(2, 8)   |
|--------------------+-------------|
| 0.6931471805599453 | 3.0         |
+--------------------+-------------+
```
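The two-argument form only changes the base. Python's `math.log` has the same shape, which makes the stated identity easy to check:

```python
import math

print(math.log(2))     # natural log, ~0.693
print(math.log(8, 2))  # base-2 log of 8, ~3.0

# The identity stated above: log(B, x) == log(x) / log(B)
print(math.isclose(math.log(8, 2), math.log(8) / math.log(2)))  # True
```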

##### `LOG2`
<a name="supported-ppl-math-functions-log2"></a>

**Usage**: `LOG2(x)` calculates the base-2 logarithm of x. It is equivalent to `log(x)/log(2)`.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `LOG2(8)` = LOG2(8) | fields `LOG2(8)`
fetched rows / total rows = 1/1
+-----------+
| LOG2(8)   |
|-----------|
| 3.0       |
+-----------+
```

##### `LOG10`
<a name="supported-ppl-math-functions-log10"></a>

**Usage**: `LOG10(x)` calculates the base-10 logarithm of x. It is equivalent to `log(x)/log(10)`.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `LOG10(100)` = LOG10(100) | fields `LOG10(100)`
fetched rows / total rows = 1/1
+--------------+
| LOG10(100)   |
|--------------|
| 2.0          |
+--------------+
```

##### `MOD`
<a name="supported-ppl-math-functions-mod"></a>

**Usage**: `MOD(n, m)` calculates the remainder of the number n divided by m.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: The wider of the types of n and m if m is nonzero. If m equals 0, the function returns NULL.

**Example**:

```
os> source=people | eval `MOD(3, 2)` = MOD(3, 2), `MOD(3.1, 2)` = MOD(3.1, 2) | fields `MOD(3, 2)`, `MOD(3.1, 2)`
fetched rows / total rows = 1/1
+-------------+---------------+
| MOD(3, 2)   | MOD(3.1, 2)   |
|-------------+---------------|
| 1           | 1.1           |
+-------------+---------------+
```
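The same semantics can be sketched in Python with `math.fmod`, which keeps the sign of n; the guard for a zero divisor mirrors the NULL return (`None` stands in for NULL in this hypothetical helper):

```python
import math

def mod(n, m):
    """Remainder of n divided by m; None (standing in for NULL) when m is 0."""
    if m == 0:
        return None
    return math.fmod(n, m)

print(mod(3, 2))    # 1.0
print(mod(3.1, 2))  # ~1.1
print(mod(3, 0))    # None
```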

##### `PI`
<a name="supported-ppl-math-functions-pi"></a>

**Usage**: `PI()` returns the constant pi.

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `PI()` = PI() | fields `PI()`
fetched rows / total rows = 1/1
+-------------------+
| PI()              |
|-------------------|
| 3.141592653589793 |
+-------------------+
```

##### `POW`
<a name="supported-ppl-math-functions-pow"></a>

**Usage**: `POW(x, y)` calculates the value of x raised to the power of y. Bad inputs return a `NULL` result.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Synonyms**: `POWER(_, _)`

**Example**:

```
os> source=people | eval `POW(3, 2)` = POW(3, 2), `POW(-3, 2)` = POW(-3, 2), `POW(3, -2)` = POW(3, -2) | fields `POW(3, 2)`, `POW(-3, 2)`, `POW(3, -2)`
fetched rows / total rows = 1/1
+-------------+--------------+--------------------+
| POW(3, 2)   | POW(-3, 2)   | POW(3, -2)         |
|-------------+--------------+--------------------|
| 9.0         | 9.0          | 0.1111111111111111 |
+-------------+--------------+--------------------+
```

##### `POWER`
<a name="supported-ppl-math-functions-power"></a>

**Usage**: `POWER(x, y)` calculates the value of x raised to the power of y. Bad inputs return a `NULL` result.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Synonyms**: `POW(_, _)`

**Example**:

```
os> source=people | eval `POWER(3, 2)` = POWER(3, 2), `POWER(-3, 2)` = POWER(-3, 2), `POWER(3, -2)` = POWER(3, -2) | fields `POWER(3, 2)`, `POWER(-3, 2)`, `POWER(3, -2)`
fetched rows / total rows = 1/1
+---------------+----------------+--------------------+
| POWER(3, 2)   | POWER(-3, 2)   | POWER(3, -2)       |
|---------------+----------------+--------------------|
| 9.0           | 9.0            | 0.1111111111111111 |
+---------------+----------------+--------------------+
```

##### `RADIANS`
<a name="supported-ppl-math-functions-radians"></a>

**Usage**: `RADIANS(x)` converts x from degrees to radians.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `RADIANS(90)` = RADIANS(90) | fields `RADIANS(90)`
fetched rows / total rows  = 1/1
+--------------------+
| RADIANS(90)        |
|--------------------|
| 1.5707963267948966 |
+--------------------+
```

##### `RAND`
<a name="supported-ppl-math-functions-rand"></a>

**Usage**: `RAND()`/`RAND(N)` returns a random floating-point value in the range 0 <= value < 1.0. If you specify integer N, the function initializes the seed before execution. One implication of this behavior is that with an identical argument N, `RAND(N)` returns the same value each time, producing a repeatable sequence of column values.

**Argument type**: INTEGER

**Return type**: FLOAT

**Example**:

```
os> source=people | eval `RAND(3)` = RAND(3) | fields `RAND(3)`
fetched rows / total rows = 1/1
+------------+
| RAND(3)    |
|------------|
| 0.73105735 |
+------------+
```
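The seeding behavior described above is the usual pseudo-random generator contract; the same property can be demonstrated with Python's `random` module:

```python
import random

random.seed(3)
first = random.random()
random.seed(3)
second = random.random()

# Re-seeding with the same N restarts the sequence, so the values
# repeat -- the same repeatability RAND(N) provides in PPL.
print(first == second)  # True
```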

##### `ROUND`
<a name="supported-ppl-math-functions-round"></a>

**Usage**: `ROUND(x, d)` rounds the argument x to d decimal places. If you don't specify d, it defaults to 0.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type map**:
+ (INTEGER/LONG [,INTEGER]) -> LONG
+ (FLOAT/DOUBLE [,INTEGER]) -> DOUBLE

**Example**:

```
os> source=people | eval `ROUND(12.34)` = ROUND(12.34), `ROUND(12.34, 1)` = ROUND(12.34, 1), `ROUND(12.34, -1)` = ROUND(12.34, -1), `ROUND(12, 1)` = ROUND(12, 1) | fields `ROUND(12.34)`, `ROUND(12.34, 1)`, `ROUND(12.34, -1)`, `ROUND(12, 1)`
fetched rows / total rows = 1/1
+----------------+-------------------+--------------------+----------------+
| ROUND(12.34)   | ROUND(12.34, 1)   | ROUND(12.34, -1)   | ROUND(12, 1)   |
|----------------+-------------------+--------------------+----------------|
| 12.0           | 12.3              | 10.0               | 12             |
+----------------+-------------------+--------------------+----------------+
```

##### `SIGN`
<a name="supported-ppl-math-functions-sign"></a>

**Usage**: `SIGN` returns the sign of the argument as -1, 0, or 1, depending on whether the number is negative, zero, or positive.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type:** INTEGER

**Example**:

```
os> source=people | eval `SIGN(1)` = SIGN(1), `SIGN(0)` = SIGN(0), `SIGN(-1.1)` = SIGN(-1.1) | fields `SIGN(1)`, `SIGN(0)`, `SIGN(-1.1)`
fetched rows / total rows = 1/1
+-----------+-----------+--------------+
| SIGN(1)   | SIGN(0)   | SIGN(-1.1)   |
|-----------+-----------+--------------|
| 1         | 0         | -1           |
+-----------+-----------+--------------+
```

##### `SIN`
<a name="supported-ppl-math-functions-sin"></a>

**Usage**: `SIN(x)` calculates the sine of x, where x is given in radians.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type**: DOUBLE

**Example**:

```
os> source=people | eval `SIN(0)` = SIN(0) | fields `SIN(0)`
fetched rows / total rows = 1/1
+----------+
| SIN(0)   |
|----------|
| 0.0      |
+----------+
```

##### `SQRT`
<a name="supported-ppl-math-functions-sqrt"></a>

**Usage**: `SQRT` calculates the square root of a non-negative number.

**Argument type**: INTEGER/LONG/FLOAT/DOUBLE

**Return type map**:
+ (Non-negative) INTEGER/LONG/FLOAT/DOUBLE -> DOUBLE
+ (Negative) INTEGER/LONG/FLOAT/DOUBLE -> NULL

**Example**:

```
os> source=people | eval `SQRT(4)` = SQRT(4), `SQRT(4.41)` = SQRT(4.41) | fields `SQRT(4)`, `SQRT(4.41)`
fetched rows / total rows = 1/1
+-----------+--------------+
| SQRT(4)   | SQRT(4.41)   |
|-----------+--------------|
| 2.0       | 2.1          |
+-----------+--------------+
```

##### PPL string functions
<a name="supported-ppl-string-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### `CONCAT`
<a name="supported-ppl-string-functions-concat"></a>

**Usage**: `CONCAT(str1, str2, ...., str_9)` concatenates up to 9 strings.

**Argument type:**
+ STRING, STRING, ...., STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `CONCAT('hello', 'world')` = CONCAT('hello', 'world'), `CONCAT('hello ', 'whole ', 'world', '!')` = CONCAT('hello ', 'whole ', 'world', '!') | fields `CONCAT('hello', 'world')`, `CONCAT('hello ', 'whole ', 'world', '!')`
fetched rows / total rows = 1/1
+----------------------------+--------------------------------------------+
| CONCAT('hello', 'world')   | CONCAT('hello ', 'whole ', 'world', '!')   |
|----------------------------+--------------------------------------------|
| helloworld                 | hello whole world!                         |
+----------------------------+--------------------------------------------+
```

##### `CONCAT_WS`
<a name="supported-ppl-string-functions-concat-ws"></a>

**Usage**: `CONCAT_WS(sep, str1, str2)` concatenates two or more strings using a specified separator between them.

**Argument type:**
+ STRING, STRING, ...., STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `CONCAT_WS(',', 'hello', 'world')` = CONCAT_WS(',', 'hello', 'world') | fields `CONCAT_WS(',', 'hello', 'world')`
fetched rows / total rows = 1/1
+------------------------------------+
| CONCAT_WS(',', 'hello', 'world')   |
|------------------------------------|
| hello,world                        |
+------------------------------------+
```

##### `LENGTH`
<a name="supported-ppl-string-functions-length"></a>

**Usage**: `length(str)` returns the length of the input string measured in bytes.

**Argument type:**
+ STRING
+ Return type: INTEGER

**Example**:

```
os> source=people | eval `LENGTH('helloworld')` = LENGTH('helloworld') | fields `LENGTH('helloworld')`
fetched rows / total rows = 1/1
+------------------------+
| LENGTH('helloworld')   |
|------------------------|
| 10                     |
+------------------------+
```

##### `LOWER`
<a name="supported-ppl-string-functions-lower"></a>

**Usage**: `lower(string)` converts the input string to lowercase.

**Argument type:**
+ STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `LOWER('helloworld')` = LOWER('helloworld'), `LOWER('HELLOWORLD')` = LOWER('HELLOWORLD') | fields `LOWER('helloworld')`, `LOWER('HELLOWORLD')`
fetched rows / total rows = 1/1
+-----------------------+-----------------------+
| LOWER('helloworld')   | LOWER('HELLOWORLD')   |
|-----------------------+-----------------------|
| helloworld            | helloworld            |
+-----------------------+-----------------------+
```

##### `LTRIM`
<a name="supported-ppl-string-functions-ltrim"></a>

**Usage**: `ltrim(str)` removes leading space characters from the input string.

**Argument type:**
+ STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `LTRIM('   hello')` = LTRIM('   hello'), `LTRIM('hello   ')` = LTRIM('hello   ') | fields `LTRIM('   hello')`, `LTRIM('hello   ')`
fetched rows / total rows = 1/1
+---------------------+---------------------+
| LTRIM('   hello')   | LTRIM('hello   ')   |
|---------------------+---------------------|
| hello               | hello               |
+---------------------+---------------------+
```

##### `POSITION`
<a name="supported-ppl-string-functions-position"></a>

**Usage**: `POSITION(substr IN str)` returns the position of the first occurrence of substring in string. It returns 0 if the substring is not in the string. It returns NULL if any argument is NULL.

**Argument type:**
+ STRING, STRING
+ Return type INTEGER

**Example**:

```
os> source=people | eval `POSITION('world' IN 'helloworld')` = POSITION('world' IN 'helloworld'), `POSITION('invalid' IN 'helloworld')`= POSITION('invalid' IN 'helloworld')  | fields `POSITION('world' IN 'helloworld')`, `POSITION('invalid' IN 'helloworld')`
fetched rows / total rows = 1/1
+-------------------------------------+---------------------------------------+
| POSITION('world' IN 'helloworld')   | POSITION('invalid' IN 'helloworld')   |
|-------------------------------------+---------------------------------------|
| 6                                   | 0                                     |
+-------------------------------------+---------------------------------------+
```
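PPL string positions are 1-based, with 0 meaning "not found". Python's `str.find` is 0-based and returns -1 when absent, so a one-line shift reproduces the behavior (`position` is a hypothetical helper; `None` stands in for NULL):

```python
def position(substr, s):
    """1-based index of substr in s; 0 if absent (POSITION semantics)."""
    if substr is None or s is None:
        return None  # stands in for NULL
    return s.find(substr) + 1

print(position('world', 'helloworld'))    # 6
print(position('invalid', 'helloworld'))  # 0
```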

##### `REVERSE`
<a name="supported-ppl-string-functions-reverse"></a>

**Usage**: `REVERSE(str)` returns the reversed string of the input string.

**Argument type:**
+ STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `REVERSE('abcde')` = REVERSE('abcde') | fields `REVERSE('abcde')`
fetched rows / total rows = 1/1
+--------------------+
| REVERSE('abcde')   |
|--------------------|
| edcba              |
+--------------------+
```

##### `RIGHT`
<a name="supported-ppl-string-functions-right"></a>

**Usage**: `RIGHT(str, len)` returns the rightmost `len` characters from the input string. It returns NULL if any argument is NULL.

**Argument type:**
+ STRING, INTEGER
+ Return type: STRING

**Example**:

```
os> source=people | eval `RIGHT('helloworld', 5)` = RIGHT('helloworld', 5), `RIGHT('HELLOWORLD', 0)` = RIGHT('HELLOWORLD', 0) | fields `RIGHT('helloworld', 5)`, `RIGHT('HELLOWORLD', 0)`
fetched rows / total rows = 1/1
+--------------------------+--------------------------+
| RIGHT('helloworld', 5)   | RIGHT('HELLOWORLD', 0)   |
|--------------------------+--------------------------|
| world                    |                          |
+--------------------------+--------------------------+
```

##### `RTRIM`
<a name="supported-ppl-string-functions-rtrim"></a>

**Usage**: `rtrim(str)` trims trailing space characters from the input string.

**Argument type:**
+ STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `RTRIM('   hello')` = RTRIM('   hello'), `RTRIM('hello   ')` = RTRIM('hello   ') | fields `RTRIM('   hello')`, `RTRIM('hello   ')`
fetched rows / total rows = 1/1
+---------------------+---------------------+
| RTRIM('   hello')   | RTRIM('hello   ')   |
|---------------------+---------------------|
|    hello            | hello               |
+---------------------+---------------------+
```

##### `SUBSTRING`
<a name="supported-ppl-string-functions-substring"></a>

**Usage**: `substring(str, start)` or `substring(str, start, length)` returns a substring of the input string. With no length specified, it returns the entire string from the start position.

**Argument type:**
+ STRING, INTEGER, INTEGER
+ Return type: STRING

**Example**:

```
os> source=people | eval `SUBSTRING('helloworld', 5)` = SUBSTRING('helloworld', 5), `SUBSTRING('helloworld', 5, 3)` = SUBSTRING('helloworld', 5, 3) | fields `SUBSTRING('helloworld', 5)`, `SUBSTRING('helloworld', 5, 3)`
fetched rows / total rows = 1/1
+------------------------------+---------------------------------+
| SUBSTRING('helloworld', 5)   | SUBSTRING('helloworld', 5, 3)   |
|------------------------------+---------------------------------|
| oworld                       | owo                             |
+------------------------------+---------------------------------+
```
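Start positions are 1-based, as the example shows. A Python slice with the index shifted by one reproduces both forms (`substring` is an illustrative helper, not the OpenSearch implementation):

```python
def substring(s, start, length=None):
    """PPL-style substring: 1-based start, optional length."""
    i = start - 1
    return s[i:] if length is None else s[i:i + length]

print(substring('helloworld', 5))     # oworld
print(substring('helloworld', 5, 3))  # owo
```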

##### `TRIM`
<a name="supported-ppl-string-functions-trim"></a>

**Usage**: `trim(string)` removes leading and trailing whitespace from the input string.

**Argument type:**
+ STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `TRIM('   hello')` = TRIM('   hello'), `TRIM('hello   ')` = TRIM('hello   ') | fields `TRIM('   hello')`, `TRIM('hello   ')`
fetched rows / total rows = 1/1
+--------------------+--------------------+
| TRIM('   hello')   | TRIM('hello   ')   |
|--------------------+--------------------|
| hello              | hello              |
+--------------------+--------------------+
```

##### `UPPER`
<a name="supported-ppl-string-functions-upper"></a>

**Usage**: `upper(string)` converts the input string to uppercase.

**Argument type:**
+ STRING
+ Return type: STRING

**Example**:

```
os> source=people | eval `UPPER('helloworld')` = UPPER('helloworld'), `UPPER('HELLOWORLD')` = UPPER('HELLOWORLD') | fields `UPPER('helloworld')`, `UPPER('HELLOWORLD')`
fetched rows / total rows = 1/1
+-----------------------+-----------------------+
| UPPER('helloworld')   | UPPER('HELLOWORLD')   |
|-----------------------+-----------------------|
| HELLOWORLD            | HELLOWORLD            |
+-----------------------+-----------------------+
```

##### PPL type conversion functions
<a name="supported-ppl-type-conversion-functions"></a>

**Note**  
To see which AWS data source integrations support this PPL function, see [Functions](#supported-ppl-functions).

##### `CAST`
<a name="supported-ppl-conversion-functions-cast"></a>

**Usage**: `cast(expr as dataType)` casts `expr` to `dataType` and returns the converted value.

The following conversion rules apply:


**Type conversion rules**  

| Src/Target | STRING | NUMBER | BOOLEAN | TIMESTAMP | DATE | TIME | 
| --- | --- | --- | --- | --- | --- | --- | 
| STRING |  | Note1 | Note1 | TIMESTAMP() | DATE() | TIME() | 
| NUMBER | Note1 |  | v!=0 | N/A | N/A | N/A | 
| BOOLEAN | Note1 | v?1:0 |  | N/A | N/A | N/A | 
| TIMESTAMP | Note1 | N/A | N/A |  | DATE() | TIME() | 
| DATE | Note1 | N/A | N/A | N/A |  | N/A | 
| TIME | Note1 | N/A | N/A | N/A | N/A |  | 

**Cast to string example:**

```
os> source=people | eval `cbool` = CAST(true as string), `cint` = CAST(1 as string), `cdate` = CAST(CAST('2012-08-07' as date) as string) | fields `cbool`, `cint`, `cdate`
fetched rows / total rows = 1/1
+---------+--------+------------+
| cbool   | cint   | cdate      |
|---------+--------+------------|
| true    | 1      | 2012-08-07 |
+---------+--------+------------+
```

**Cast to number example:**

```
os> source=people | eval `cbool` = CAST(true as int), `cstring` = CAST('1' as int) | fields `cbool`, `cstring`
fetched rows / total rows = 1/1
+---------+-----------+
| cbool   | cstring   |
|---------+-----------|
| 1       | 1         |
+---------+-----------+
```

**Cast to date example:**

```
os> source=people | eval `cdate` = CAST('2012-08-07' as date), `ctime` = CAST('01:01:01' as time), `ctimestamp` = CAST('2012-08-07 01:01:01' as timestamp) | fields `cdate`, `ctime`, `ctimestamp`
fetched rows / total rows = 1/1
+------------+----------+---------------------+
| cdate      | ctime    | ctimestamp          |
|------------+----------+---------------------|
| 2012-08-07 | 01:01:01 | 2012-08-07 01:01:01 |
+------------+----------+---------------------+
```

**Chained cast example:**

```
os> source=people | eval `cbool` = CAST(CAST(true as string) as boolean) | fields `cbool`
fetched rows / total rows = 1/1
+---------+
| cbool   |
|---------|
| True    |
+---------+
```