
Configuring Custom S3 Bucket Sources

With CloudWatch pipelines, you can process arbitrary logs stored in S3 buckets.

Prerequisites

To use Amazon S3 as the source for a pipeline, first create an S3 bucket. For instructions, see Creating a general purpose bucket in the Amazon S3 User Guide.

Note

If the S3 bucket used as a source in the pipeline is in a different AWS account, you also need to enable cross-account read permissions on the bucket. This allows the pipeline to read and process the data. To enable cross-account permissions, see Bucket owner granting cross-account bucket permissions in the Amazon S3 User Guide.

If your S3 buckets are in multiple accounts, use a bucket_owners map. For an example, see Cross-account S3 access in the OpenSearch documentation.

To set up S3-SQS processing, you also need to perform the following steps:

Configure the pipeline role

Unlike other source plugins that push data to a pipeline, the S3 source plugin has a read-based architecture in which the pipeline pulls data from the source. For a pipeline to read from S3, you must therefore specify a role in the pipeline's S3 source configuration that has access to both the S3 bucket and the Amazon SQS queue. The pipeline assumes this role to read data from the queue.

You can find example role permissions in Source-specific IAM policies. Note that this role must have a trust relationship with the CloudWatch pipelines service principal. You can find an example trust policy configuration for your pipeline role in Trust relationships.
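For illustration, the following is a minimal sketch of a permissions policy for the pipeline role; the bucket name, queue ARN, account ID, and Region are placeholders, and the authoritative reference is Source-specific IAM policies.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadFromSourceBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ]
    },
    {
      "Sid": "ConsumeS3EventNotifications",
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes"
      ],
      "Resource": "arn:aws:sqs:us-east-1:123456789012:MyS3EventQueue"
    }
  ]
}
```

The trust policy takes the usual sts:AssumeRole shape. The service principal below is a placeholder, not the actual principal name; substitute the exact CloudWatch pipelines principal shown in Trust relationships.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "<cloudwatch-pipelines-service-principal>"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```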

Create the pipeline

After you've set up your permissions, you can configure a pipeline depending on your Amazon S3 use case.

Select Create pipeline on the Pipelines tab under Ingestion in the CloudWatch console. Follow the pipeline wizard steps and provide the SQS queue ARN and the required pipeline IAM role when prompted. Optionally, provide a data source name and type to attach to the log group destination in CloudWatch Logs.
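Conceptually, the source you configure in the wizard corresponds to an S3-SQS source definition. The following is a sketch in the Data Prepper s3 source syntax used by the OpenSearch documentation referenced above; the queue URL, Region, and role ARN are placeholders.

```yaml
version: "2"
s3-log-pipeline:
  source:
    s3:
      # Read objects in response to S3 event notifications delivered to SQS
      notification_type: "sqs"
      codec:
        newline:
      sqs:
        queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/MyS3EventQueue"
      aws:
        region: "us-east-1"
        # The pipeline role configured in the previous step
        sts_role_arn: "arn:aws:iam::123456789012:role/MyPipelineRole"
```

The sts_role_arn is the pipeline role you configured earlier; the pipeline assumes it to poll the queue and fetch the referenced objects.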

Be sure to configure a CloudWatch Logs resource policy for the destination log group if one isn't already in place, and then select Create pipeline in the Review and create step. If creation succeeds, the pipeline is created and data begins to flow within 5 minutes.
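The resource policy must allow the pipeline's delivery principal to write to the destination log group. The following is a sketch of the shape such a policy takes; the service principal is a placeholder (use the principal shown in the console), and the log group ARN is an example. You can attach the policy with aws logs put-resource-policy.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPipelineLogDelivery",
      "Effect": "Allow",
      "Principal": {
        "Service": "<cloudwatch-pipelines-service-principal>"
      },
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:us-east-1:123456789012:log-group:my-destination-log-group:*"
    }
  ]
}
```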

Amazon S3 cross account as a source

You can grant cross-account access with Amazon S3 so that CloudWatch pipelines can read from S3 buckets in another account as a source. To enable cross-account access, see Bucket owner granting cross-account bucket permissions in the Amazon S3 User Guide. After you have granted access, ensure that your pipeline role has the required permissions.
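As an illustration, a bucket policy along the following lines in the bucket-owning account grants the pipeline role in the pipeline account read access; the role ARN and bucket name are placeholders.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCrossAccountPipelineRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/MyPipelineRole"
      },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ]
    }
  ]
}
```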

Then, you can create a pipeline using bucket_owners to enable cross-account access to an Amazon S3 bucket as a source.
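The following sketch uses the bucket_owners map from the Data Prepper s3 source referenced in the OpenSearch documentation; the bucket names and account IDs are placeholders. Each entry maps a bucket name to the 12-digit account ID expected to own it, so reads fail if bucket ownership ever changes.

```yaml
source:
  s3:
    notification_type: "sqs"
    sqs:
      queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/MyS3EventQueue"
    aws:
      region: "us-east-1"
      sts_role_arn: "arn:aws:iam::123456789012:role/MyPipelineRole"
    # Fallback owner for buckets not listed below
    default_bucket_owner: "123456789012"
    # Map each source bucket to the account that must own it
    bucket_owners:
      amzn-s3-demo-bucket1: "123456789012"
      amzn-s3-demo-bucket2: "444455556666"
```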

Custom source configuration

When creating a pipeline for custom sources:

  • A parser must be the first processor in the pipeline (see the sketch after this list)

  • You can specify any supported processor for custom log pipelines
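For example, the following processor chain, sketched in the Data Prepper syntax used by the OpenSearch documentation referenced above, puts a grok parser first and then applies further processing; the grok pattern and key names are placeholders.

```yaml
processor:
  # The parser must come first: turn each raw line into structured fields
  - grok:
      match:
        message: ['%{COMMONAPACHELOG}']
  # Any supported processor can follow the parser
  - date:
      match:
        - key: "timestamp"
          patterns: ["dd/MMM/yyyy:HH:mm:ss Z"]
      destination: "@timestamp"
  - delete_entries:
      with_keys: ["message"]
```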