Configuring Amazon MSK event sources for Lambda
To use an Amazon MSK cluster as an event source for your Lambda function, you create an event source mapping that connects the two resources. This page describes how to create an event source mapping for Amazon MSK.
This page assumes that you've already properly configured your MSK cluster and the Amazon Virtual Private Cloud (VPC) it resides in. If you need to set up your cluster or VPC, see Configuring your Amazon MSK cluster and Amazon VPC network for Lambda.
Using an Amazon MSK cluster as an event source
When you add your Apache Kafka or Amazon MSK cluster as a trigger for your Lambda function, the cluster is used as an event source.
Lambda reads event data from the Kafka topics that you specify as Topics in a CreateEventSourceMapping request, based on the starting position that you specify. After successful processing, Lambda commits the processed offsets to your Kafka cluster.
Lambda reads messages sequentially for each Kafka topic partition. A single Lambda payload can contain messages from multiple partitions. When more records are available, Lambda continues processing records in batches, based on the BatchSize value that you specify in a CreateEventSourceMapping request, until your function catches up with the topic.
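As a rough illustration of the request described above, the following Python sketch assembles the parameters for a CreateEventSourceMapping call. The helper name build_msk_mapping_request, the placeholder ARN, the function and topic names, and the default values are assumptions made for this example; the parameter names (EventSourceArn, FunctionName, Topics, StartingPosition, BatchSize) are the ones the API accepts.

```python
from typing import Any, Dict, List


def build_msk_mapping_request(
    cluster_arn: str,
    function_name: str,
    topics: List[str],
    batch_size: int = 100,
    starting_position: str = "LATEST",
) -> Dict[str, Any]:
    """Assemble keyword arguments for a CreateEventSourceMapping request.

    Hypothetical helper for illustration; the defaults are arbitrary choices.
    """
    if not topics:
        raise ValueError("at least one Kafka topic is required")
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return {
        "EventSourceArn": cluster_arn,          # the MSK cluster to read from
        "FunctionName": function_name,          # the Lambda function to invoke
        "Topics": list(topics),                 # Kafka topics that Lambda reads
        "StartingPosition": starting_position,  # where Lambda starts reading
        "BatchSize": batch_size,                # upper bound on records per batch
    }


if __name__ == "__main__":
    # Placeholder values; replace them with your own cluster ARN and names.
    request = build_msk_mapping_request(
        cluster_arn="arn:aws:kafka:us-east-1:111122223333:cluster/example/abcd1234",
        function_name="my-kafka-function",
        topics=["orders"],
        batch_size=200,
        starting_position="TRIM_HORIZON",
    )
    print(request)
    # With the AWS SDK for Python installed and credentials configured,
    # the request could then be sent like this:
    #   import boto3
    #   boto3.client("lambda").create_event_source_mapping(**request)
```

Keeping the request assembly separate from the SDK call makes it possible to inspect or test the parameters without touching an AWS account.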
After Lambda processes each batch, it commits the offsets of the messages in that batch. If your function returns an error for any of the messages in a batch, Lambda retries the whole batch of messages until processing succeeds or the messages expire. You can send records that fail all retry attempts to an on-failure destination for later processing.
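The retry behavior above can be sketched with a minimal Python handler. It assumes the Kafka event shape in which event["records"] maps a topic-partition key to a list of records whose value field is base64-encoded. The process_message function is a hypothetical stand-in for your own business logic, and treating an empty message as an error is an arbitrary choice made only to show the failure path.

```python
import base64
from typing import Any, Dict, List


def decode_records(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the per-partition record lists and decode each message value."""
    messages = []
    for records in event.get("records", {}).values():
        for record in records:
            messages.append({
                "topic": record["topic"],
                "partition": record["partition"],
                "offset": record["offset"],
                # Values arrive base64-encoded; this sketch assumes UTF-8 text.
                "value": base64.b64decode(record["value"]).decode("utf-8"),
            })
    return messages


def process_message(message: Dict[str, Any]) -> None:
    """Hypothetical business logic: reject empty messages."""
    if not message["value"]:
        raise ValueError(f"empty message at offset {message['offset']}")


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    messages = decode_records(event)
    for message in messages:
        # An uncaught exception fails the invocation, so Lambda retries
        # the whole batch rather than only the failing message.
        process_message(message)
    return {"processed": len(messages)}
```

Because a single failure causes the entire batch to be retried, the processing logic in a handler like this should be idempotent.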
Note
While Lambda functions typically have a maximum timeout limit of 15 minutes, event source mappings for Amazon MSK, self-managed Apache Kafka, Amazon DocumentDB, and Amazon MQ for ActiveMQ and RabbitMQ only support functions with maximum timeout limits of 14 minutes.
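The limits in this note can be expressed as a small pre-deployment check. This is only a sketch: the constant and function names are hypothetical, and the check simply encodes the 14-minute limit stated above.

```python
# Limits taken from the note above, expressed in seconds.
MAX_LAMBDA_TIMEOUT_SECONDS = 15 * 60
MAX_MSK_MAPPING_TIMEOUT_SECONDS = 14 * 60


def timeout_ok_for_msk_mapping(timeout_seconds: int) -> bool:
    """Return True if a function timeout fits within the MSK mapping limit."""
    return 0 < timeout_seconds <= MAX_MSK_MAPPING_TIMEOUT_SECONDS
```

For example, a function configured with a 900-second timeout would fail this check, while one configured with 840 seconds or less would pass.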