Configuring Amazon Athena Configuring Amazon Redshift Recommended SQL queries for processing CUR 2.0 Recommended SQL queries for processing carbon emissions data exports

Processing data exports

In the following sections, you'll find information about processing your data exports.

Configuring Amazon Athena

To build a table and partitions for Athena using an AWS Glue crawler

Create an export of CUR 2.0 or Carbon emissions with the following data export delivery options:
- Compression type and file format: Parquet - Parquet
- File versioning: Overwrite existing data export file
In Athena, use the notebook editor with Trino SQL and choose Create to create a table with "AWS Glue crawler". Using the Glue crawler workflow, point the Glue crawler to run on the s3://<bucket-name>/<prefix>/<export-name>/data folder to automatically load all of the delivered partitions for the specified export to Athena.
After the Glue crawler is complete, you can use Athena to write queries on the table created by the Glue crawler.

Configuring Amazon Redshift

Amazon Redshift is a cloud data warehouse that can be accessed either in a provisioned capacity or serverless model. Amazon Redshift offers fast query performance for processing your data from Data Exports.

For information on setting up Redshift, see the Amazon Redshift Getting Started Guide.

Recommended SQL queries for processing CUR 2.0

After loading your CUR 2.0 export data into a data analytics tool such as Amazon Athena or Amazon Redshift, you can process it in order to gain cost and usage insights. AWS Well-Architected Labs provides a CUR query library that you can use to process CUR. For more information, see AWS CUR Query Library.

Note the following two pieces of information about SQL queries:

The Well-Architected Labs SQL queries won't work in the data export query field, because Data Exports doesn't support aggregations and some of the other SQL syntax used in these queries.
The Well-Architected Labs SQL queries will only work if you haven’t renamed your columns from the default names. Depending on the query, you may need to query some of the product columns as separate columns using the dot operator. For more information, see Data query–SQL query and table configurations.

Recommended SQL queries for processing carbon emissions data exports

To get the total carbon emissions per payer_account_id:


SELECT payer_account_id, SUM(total_mbm_emissions_value) AS total_emissions
FROM "ccft-data-exports"."ccft-data-exports-data" -- change to your table name
GROUP BY payer_account_id
ORDER BY total_emissions DESC;

To get the total carbon emissions per payer_account_id and per product_code:


SELECT payer_account_id, product_code, SUM(total_mbm_emissions_value) AS total_emissions
FROM "ccft-data-exports"."ccft-data-exports-data"-- change to your table name
GROUP BY payer_account_id, product_code
ORDER BY total_emissions DESC;

To get the total carbon emissions per payer_account_id and per region_code:


SELECT payer_account_id, region_code, SUM(total_mbm_emissions_value) AS total_emissions
FROM "ccft-data-exports"."ccft-data-exports-data" -- change to your table name
GROUP BY payer_account_id, region_code
ORDER BY total_emissions DESC;

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Carbon emissions columns

Understanding the Cost and Usage Dashboard