使用 Step Functions 运行 Athena 查询 - AWS Step Functions

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

使用 Step Functions 运行 Athena 查询

您可以 AWS Step Functions 与 Amazon Athena 集成,启动和停止查询执行,并通过 Step Functions 获取查询结果。使用 Step Functions,您可以运行临时或计划数据查询,并检索针对 S3 数据湖的结果。Athena 没有服务器,因此您无需设置或管理任何基础设施,且只需为您运行的查询付费。本页列出了支持的 A APIs thena,并提供了启动 Athena Task 查询的示例状态。

要了解如何在 Step Functions 中与 AWS 服务集成,请参阅集成 服务在 Step Functions 中将参数传递给服务 API

经优化的 Athena 集成的主要功能

要 AWS Step Functions 与亚马逊 Athena 集成,您可以使用所提供的 Athena 服务集成。 APIs

服务集成与相应 APIs 的 Athena APIs 相同。并非所有集成模式都 APIs支持所有集成模式,如下表所示。

API 请求响应 运行作业 (.sync)
StartQueryExecution 支持 支持
StopQueryExecution 支持 不支持
GetQueryExecution 支持 不支持
GetQueryResults 支持 不支持

下面包含一个启动 Athena 查询作业的 Task 状态。

"Start an Athena query": { "Type": "Task", "Resource": "arn:aws:states:::athena:startQueryExecution.sync", "Arguments": { "QueryString": "SELECT * FROM \"myDatabase\".\"myTable\" limit 1", "WorkGroup": "primary", "ResultConfiguration": { "OutputLocation": "s3://amzn-s3-demo-bucket" } }, "Next": "Get results of the query" }

优化的亚马逊 Athena APIs:

输入或结果数据的配额

在服务之间发送或接收数据时,任务的最大输入或结果为 256 KiB 的数据作为 UTF-8 编码字符串。请参阅与状态机执行相关的配额

用于调用 Amazon Athena 的 IAM 策略

以下示例模板展示了如何根据状态机定义中的资源 AWS Step Functions 生成 IAM 策略。有关更多信息,请参阅Step Functions 如何为集成服务生成 IAM 策略探索 Step Functions 中的服务集成模式

注意

除了 IAM 策略外,您可能需要使用 AWS Lake Formation 来授予对服务(例如 Amazon S3 和)中数据的访问权限 AWS Glue Data Catalog。有关更多信息,请参阅使用 Athena 查询注册到的数据。 AWS Lake Formation

StartQueryExecution

静态资源

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement":[ { "Effect": "Allow", "Action": [ "athena:startQueryExecution", "athena:stopQueryExecution", "athena:getQueryExecution", "athena:getDataCatalog", "athena:GetWorkGroup", "athena:BatchGetQueryExecution", "athena:GetQueryResults", "athena:ListQueryExecutions" ], "Resource": [ "arn:aws:athena:region:account-id:workgroup/workGroup", "arn:aws:athena:region:account-id:datacatalog/*" ] }, { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket", "s3:ListBucketMultipartUploads", "s3:ListMultipartUploadParts", "s3:AbortMultipartUpload", "s3:CreateBucket", "s3:PutObject" ], "Resource": [ "arn:aws:s3:::*" ] }, { "Effect": "Allow", "Action": [ "glue:CreateDatabase", "glue:GetDatabase", "glue:GetDatabases", "glue:UpdateDatabase", "glue:DeleteDatabase", "glue:CreateTable", "glue:UpdateTable", "glue:GetTable", "glue:GetTables", "glue:DeleteTable", "glue:BatchDeleteTable", "glue:BatchCreatePartition", "glue:CreatePartition", "glue:UpdatePartition", "glue:GetPartition", "glue:GetPartitions", "glue:BatchGetPartition", "glue:DeletePartition", "glue:BatchDeletePartition" ], "Resource": [ "arn:aws:glue:region:account-id:catalog", "arn:aws:glue:region:account-id:database/*", "arn:aws:glue:region:account-id:table/*", "arn:aws:glue:region:account-id:userDefinedFunction/*" ] }, { "Effect": "Allow", "Action": [ "lakeformation:GetDataAccess" ], "Resource": [ "*" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement":[ { "Effect": "Allow", "Action": [ "athena:startQueryExecution", "athena:getDataCatalog" ], "Resource": [ "arn:aws:athena:region:account-id:workgroup/workGroup", "arn:aws:athena:region:account-id:datacatalog/*" ] }, { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket", "s3:ListBucketMultipartUploads", "s3:ListMultipartUploadParts", "s3:AbortMultipartUpload", "s3:CreateBucket", "s3:PutObject" ], "Resource": [ "arn:aws:s3:::*" ] }, { "Effect": "Allow", "Action": [ "glue:CreateDatabase", "glue:GetDatabase", "glue:GetDatabases", "glue:UpdateDatabase", "glue:DeleteDatabase", "glue:CreateTable", "glue:UpdateTable", "glue:GetTable", "glue:GetTables", "glue:DeleteTable", "glue:BatchDeleteTable", "glue:BatchCreatePartition", "glue:CreatePartition", "glue:UpdatePartition", "glue:GetPartition", "glue:GetPartitions", "glue:BatchGetPartition", "glue:DeletePartition", "glue:BatchDeletePartition" ], "Resource": [ "arn:aws:glue:region:account-id:catalog", "arn:aws:glue:region:account-id:database/*", "arn:aws:glue:region:account-id:table/*", "arn:aws:glue:region:account-id:userDefinedFunction/*" ] }, { "Effect": "Allow", "Action": [ "lakeformation:GetDataAccess" ], "Resource": [ "*" ] } ] }

动态资源

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement":[ { "Effect": "Allow", "Action": [ "athena:startQueryExecution", "athena:stopQueryExecution", "athena:getQueryExecution", "athena:getDataCatalog", "athena:GetWorkGroup", "athena:BatchGetQueryExecution", "athena:GetQueryResults", "athena:ListQueryExecutions" ], "Resource": [ "arn:aws:athena:region:account-id:workgroup/*", "arn:aws:athena:region:account-id:datacatalog/*" ] }, { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket", "s3:ListBucketMultipartUploads", "s3:ListMultipartUploadParts", "s3:AbortMultipartUpload", "s3:CreateBucket", "s3:PutObject" ], "Resource": [ "arn:aws:s3:::*" ] }, { "Effect": "Allow", "Action": [ "glue:CreateDatabase", "glue:GetDatabase", "glue:GetDatabases", "glue:UpdateDatabase", "glue:DeleteDatabase", "glue:CreateTable", "glue:UpdateTable", "glue:GetTable", "glue:GetTables", "glue:DeleteTable", "glue:BatchDeleteTable", "glue:BatchCreatePartition", "glue:CreatePartition", "glue:UpdatePartition", "glue:GetPartition", "glue:GetPartitions", "glue:BatchGetPartition", "glue:DeletePartition", "glue:BatchDeletePartition" ], "Resource": [ "arn:aws:glue:region:account-id:catalog", "arn:aws:glue:region:account-id:database/*", "arn:aws:glue:region:account-id:table/*", "arn:aws:glue:region:account-id:userDefinedFunction/*" ] }, { "Effect": "Allow", "Action": [ "lakeformation:GetDataAccess" ], "Resource": [ "*" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement":[ { "Effect": "Allow", "Action": [ "athena:startQueryExecution", "athena:getDataCatalog" ], "Resource": [ "arn:aws:athena:region:account-id:workgroup/*", "arn:aws:athena:region:account-id:datacatalog/*" ] }, { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket", "s3:ListBucketMultipartUploads", "s3:ListMultipartUploadParts", "s3:AbortMultipartUpload", "s3:CreateBucket", "s3:PutObject" ], "Resource": [ "arn:aws:s3:::*" ] }, { "Effect": "Allow", "Action": [ "glue:CreateDatabase", "glue:GetDatabase", "glue:GetDatabases", "glue:UpdateDatabase", "glue:DeleteDatabase", "glue:CreateTable", "glue:UpdateTable", "glue:GetTable", "glue:GetTables", "glue:DeleteTable", "glue:BatchDeleteTable", "glue:BatchCreatePartition", "glue:CreatePartition", "glue:UpdatePartition", "glue:GetPartition", "glue:GetPartitions", "glue:BatchGetPartition", "glue:DeletePartition", "glue:BatchDeletePartition" ], "Resource": [ "arn:aws:glue:region:account-id:catalog", "arn:aws:glue:region:account-id:database/*", "arn:aws:glue:region:account-id:table/*", "arn:aws:glue:region:account-id:userDefinedFunction/*" ] }, { "Effect": "Allow", "Action": [ "lakeformation:GetDataAccess" ], "Resource": [ "*" ] } ] }

StopQueryExecution

资源

{ "Version": "2012-10-17", "Statement":[ { "Effect": "Allow", "Action": [ "athena:stopQueryExecution" ], "Resource": [ "arn:aws:athena:region:account-id:workgroup/*" ] } ] }

GetQueryExecution

资源

{ "Version": "2012-10-17", "Statement":[ { "Effect": "Allow", "Action": [ "athena:getQueryExecution" ], "Resource": [ "arn:aws:athena:region:account-id:workgroup/*" ] } ] }

GetQueryResults

资源

{ "Version": "2012-10-17", "Statement":[ { "Effect": "Allow", "Action": [ "athena:getQueryResults" ], "Resource": [ "arn:aws:athena:region:account-id:workgroup/*" ] }, { "Effect": "Allow", "Action": [ "s3:GetObject" ], "Resource": [ "arn:aws:s3:::*" ] } ] }