Retry Strategies

Retry suspends invocation

When a step throws an exception, the SDK calls the step's retry strategy to decide whether to retry and how long to wait. If the strategy requests a retry, the SDK checkpoints the error and the scheduled resume time, then ends the Lambda invocation. At the scheduled resume time, the backend starts a new invocation for the execution and the SDK replays the step body.

Because the invocation ends between attempts, waiting for the next retry consumes no Lambda execution time.

When a step exhausts all retry attempts, the SDK checkpoints the final error and throws it to your handler. If you configure no retry strategy on a step, the SDK applies a default strategy with up to 5 retries (6 total attempts). See Retry presets.
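The suspend-and-resume cycle described above can be pictured with a toy model. This is only a sketch for building intuition, not the SDK's or the backend's implementation: ToyBackend, Checkpoint, and run_execution are hypothetical names, and a virtual clock stands in for the scheduled resume time.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical record of a failed attempt and when to resume."""
    attempt: int
    error: str
    resume_at: float


@dataclass
class ToyBackend:
    """Toy stand-in for the durable backend: checkpoints plus a virtual clock."""
    clock: float = 0.0
    checkpoints: list[Checkpoint] = field(default_factory=list)
    invocations: int = 0


def run_execution(backend, step_body, max_attempts, delay_seconds):
    """Run step_body once per invocation until it succeeds or attempts run out."""
    attempt = 0
    while True:
        backend.invocations += 1  # a new invocation starts
        attempt += 1
        try:
            return step_body()  # the step body runs again from the top
        except Exception as error:
            if attempt >= max_attempts:
                raise  # retries exhausted: surface the final error
            # Record the error and the resume time, then end this invocation.
            resume_at = backend.clock + delay_seconds
            backend.checkpoints.append(Checkpoint(attempt, str(error), resume_at))
            backend.clock = resume_at  # no code runs while the clock advances


calls = {"count": 0}


def flaky_step() -> str:
    """Fail twice, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporarily unavailable")
    return "ok"


backend = ToyBackend()
result = run_execution(backend, flaky_step, max_attempts=6, delay_seconds=5.0)
print(result, backend.invocations, backend.clock)  # ok 3 10.0
```

In this model the step needs three invocations and two checkpoints, and the 10 seconds of waiting pass entirely between invocations.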

Configure a retry strategy

A retry strategy is a function that takes the error and the current attempt number and returns a decision: either retry after a given delay, or stop. You can write a strategy yourself or use a built-in helper to build one from configuration. The SDK ships helpers for exponential backoff and linear backoff.

Exponential backoff

Use createRetryStrategy() to build a strategy, then pass it as retryStrategy in StepConfig.

import {
  withDurableExecution,
  createRetryStrategy,
  StepConfig,
} from "@aws/durable-execution-sdk-js";

const retryStrategy = createRetryStrategy({
  maxAttempts: 5,
  initialDelay: { seconds: 2 },
  maxDelay: { minutes: 1 },
  backoffRate: 2,
});

const stepConfig: StepConfig<string> = { retryStrategy };

export const handler = withDurableExecution(async (event, context) => {
  const result = await context.step(
    "call-external-api",
    async () => callExternalApi(),
    stepConfig,
  );
  return result;
});

async function callExternalApi(): Promise<string> {
  return "ok";
}

Use create_retry_strategy() with a RetryStrategyConfig, then pass it as retry_strategy in StepConfig.

from aws_durable_execution_sdk_python.config import Duration, StepConfig
from aws_durable_execution_sdk_python.context import DurableContext, StepContext, durable_step
from aws_durable_execution_sdk_python.execution import durable_execution
from aws_durable_execution_sdk_python.retries import RetryStrategyConfig, create_retry_strategy

retry_strategy = create_retry_strategy(
    RetryStrategyConfig(
        max_attempts=5,
        initial_delay=Duration.from_seconds(2),
        max_delay=Duration.from_minutes(1),
        backoff_rate=2.0,
    )
)

step_config = StepConfig(retry_strategy=retry_strategy)


@durable_step
def call_external_api(step_context: StepContext) -> str:
    return "ok"


@durable_execution
def lambda_handler(event: dict, context: DurableContext) -> str:
    result = context.step(call_external_api(), config=step_config)
    return result

Use RetryStrategies.exponentialBackoff() to build a strategy, then pass it to StepConfig.builder().retryStrategy().

import java.time.Duration;
import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.StepContext;
import software.amazon.lambda.durable.config.StepConfig;
import software.amazon.lambda.durable.retry.JitterStrategy;
import software.amazon.lambda.durable.retry.RetryStrategies;

public class ExponentialBackoffExample extends DurableHandler<Object, String> {
    @Override
    public String handleRequest(Object input, DurableContext context) {
        StepConfig config = StepConfig.builder()
            .retryStrategy(RetryStrategies.exponentialBackoff(
                3,
                Duration.ofSeconds(1),
                Duration.ofSeconds(10),
                2.0,
                JitterStrategy.FULL))
            .build();

        String result = context.step("retry_step", String.class,
            (StepContext ctx) -> "Step with exponential backoff",
            config);

        return "Result: " + result;
    }
}

RetryStrategyConfig signature

import { JitterStrategy, Duration } from "@aws/durable-execution-sdk-js";

export interface RetryStrategyConfig {
  /** Default: 3 */
  maxAttempts?: number;
  /** Default: { seconds: 5 } */
  initialDelay?: Duration;
  /** Default: { minutes: 5 } */
  maxDelay?: Duration;
  /** Default: 2 */
  backoffRate?: number;
  /** Default: JitterStrategy.FULL */
  jitter?: JitterStrategy;
  retryableErrors?: (string | RegExp)[];
  retryableErrorTypes?: (new () => Error)[];
}

Parameters:

  • maxAttempts (optional) Total attempts including the initial attempt. Default: 3.
  • initialDelay (optional) Delay before the first retry. Default: { seconds: 5 }.
  • maxDelay (optional) Maximum delay between retries. Default: { minutes: 5 }.
  • backoffRate (optional) Multiplier applied to the delay on each retry. Default: 2.
  • jitter (optional) A JitterStrategy value. Default: JitterStrategy.FULL.
  • retryableErrors (optional) Array of strings or RegExp patterns matched against the error message. The SDK retries all errors when you set neither retryableErrors nor retryableErrorTypes.
  • retryableErrorTypes (optional) Array of error classes. The SDK retries only errors that are instances of these classes. When you set both filters, the SDK retries an error if it matches either (OR logic).

import re
from dataclasses import dataclass
from aws_durable_execution_sdk_python.config import Duration, JitterStrategy
from aws_durable_execution_sdk_python.retries import RetryDecision


@dataclass
class RetryStrategyConfig:
    max_attempts: int = 3
    initial_delay: Duration = Duration.from_seconds(5)
    max_delay: Duration = Duration.from_minutes(5)
    backoff_rate: float = 2.0
    jitter_strategy: JitterStrategy = JitterStrategy.FULL
    retryable_errors: list[str | re.Pattern] | None = None
    retryable_error_types: list[type[Exception]] | None = None

Parameters:

  • max_attempts (optional) Total attempts including the initial attempt. Default: 3.
  • initial_delay (optional) A Duration. Default: Duration.from_seconds(5).
  • max_delay (optional) A Duration. Default: Duration.from_minutes(5).
  • backoff_rate (optional) Multiplier applied to the delay on each retry. Default: 2.0.
  • jitter_strategy (optional) A JitterStrategy value. Default: JitterStrategy.FULL.
  • retryable_errors (optional) List of strings or compiled re.Pattern objects matched against the error message. The SDK retries all errors when you set neither retryable_errors nor retryable_error_types.
  • retryable_error_types (optional) List of exception classes. The SDK retries only exceptions that are instances of these classes. When you set both filters, the SDK retries an error if it matches either (OR logic).

RetryStrategy RetryStrategies.exponentialBackoff(
    int maxAttempts,
    Duration initialDelay,
    Duration maxDelay,
    double backoffRate,
    JitterStrategy jitter
)

RetryStrategy RetryStrategies.fixedDelay(
    int maxAttempts,
    Duration fixedDelay
)

Parameters:

  • maxAttempts Total attempts including the initial attempt.
  • initialDelay A java.time.Duration. Minimum 1 second.
  • maxDelay A java.time.Duration. Minimum 1 second.
  • backoffRate Multiplier applied to the delay on each retry.
  • jitter A JitterStrategy value: FULL, HALF, or NONE.
  • fixedDelay A java.time.Duration. The constant delay between retries (fixedDelay() only).

The Java SDK does not have built-in error type filtering. Filter by error type manually inside the RetryStrategy lambda. See Retrying specific errors.
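The filter rules described for the TypeScript and Python configurations (no filters means every error is retried; with both filters set, a match on either is enough) can be sketched in a few lines of Python. This is an illustration rather than the SDK's code: is_retryable is a hypothetical helper, and two details are assumptions the text does not confirm, namely that plain strings match as substrings of the message and that patterns are applied with search rather than a full match.

```python
import re


def is_retryable(error, retryable_errors=None, retryable_error_types=None):
    """Return True if the error passes the message or type filter."""
    # Neither filter configured: every error is retryable.
    if retryable_errors is None and retryable_error_types is None:
        return True

    message = str(error)
    for pattern in retryable_errors or []:
        if isinstance(pattern, re.Pattern):
            if pattern.search(message):  # assumption: search, not fullmatch
                return True
        elif pattern in message:  # assumption: substring match for plain strings
            return True

    # OR logic: a type match is enough even if no message pattern matched.
    return isinstance(error, tuple(retryable_error_types or ()))


class RateLimitError(Exception):
    pass


print(is_retryable(ValueError("boom")))  # True
print(is_retryable(ValueError("boom"), retryable_error_types=[RateLimitError]))  # False
```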

JitterStrategy

import { JitterStrategy } from "@aws/durable-execution-sdk-js";

enum JitterStrategy {
  NONE = "NONE", // exact calculated delay
  FULL = "FULL", // random between 0 and base_delay
  HALF = "HALF", // random between 50% and 100% of base_delay
}

from aws_durable_execution_sdk_python.config import JitterStrategy

class JitterStrategy(StrEnum):
    NONE = "NONE"  # exact calculated delay
    FULL = "FULL"  # random between 0 and base_delay
    HALF = "HALF"  # random between 50% and 100% of base_delay

import software.amazon.lambda.durable.retry.JitterStrategy;

enum JitterStrategy {
    NONE, // exact calculated delay
    FULL, // random between 0 and base_delay
    HALF  // random between 50% and 100% of base_delay
}

Delay calculation

The SDK calculates the delay before each retry using exponential backoff with jitter:

base_delay = min(initial_delay × backoff_rate ^ (attempt - 1), max_delay)
final_delay = jitter(base_delay), minimum 1 second

  • JitterStrategy.FULL randomizes the delay between 0 and base_delay. This spreads retries across time and avoids many clients retrying simultaneously after a shared failure.
  • JitterStrategy.HALF randomizes between 50% and 100% of base_delay.
  • JitterStrategy.NONE uses the exact calculated delay.
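As a worked example of the formula above, the following standalone Python sketch computes the base delay for each retry and applies the three jitter modes. It does not call the SDK; base_delay and apply_jitter are hypothetical helper names, and delays are plain numbers of seconds.

```python
import random


def base_delay(attempt: int, initial: float, rate: float, max_delay: float) -> float:
    """base_delay = min(initial_delay * backoff_rate ** (attempt - 1), max_delay)."""
    return min(initial * rate ** (attempt - 1), max_delay)


def apply_jitter(delay: float, jitter: str, rng: random.Random) -> float:
    """Apply FULL, HALF, or NONE jitter, then enforce the 1-second minimum."""
    if jitter == "FULL":
        jittered = rng.uniform(0, delay)
    elif jitter == "HALF":
        jittered = rng.uniform(delay / 2, delay)
    elif jitter == "NONE":
        jittered = delay
    else:
        raise ValueError(f"unknown jitter strategy: {jitter}")
    return max(1.0, jittered)


# Base delays for retries 1-6 with a 2 s initial delay, 2x rate, and 1 min cap.
schedule = [base_delay(n, initial=2, rate=2, max_delay=60) for n in range(1, 7)]
print(schedule)  # [2, 4, 8, 16, 32, 60]

rng = random.Random()
for delay in schedule:
    print(delay, round(apply_jitter(delay, "FULL", rng), 2))
```

The sixth value would be 64 seconds without the cap, so max_delay clamps it to 60. With FULL jitter, the printed final delays vary from run to run but always fall between 1 second and the base delay.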

Linear backoff

Linear backoff grows the delay by a fixed increment on each attempt instead of multiplying by a backoff rate. Use it when you want predictable, bounded growth between retries rather than the rapid expansion of exponential backoff.

Use createLinearRetryStrategy() to build a strategy, then pass it as retryStrategy in StepConfig.

import {
  withDurableExecution,
  createLinearRetryStrategy,
  StepConfig,
} from "@aws/durable-execution-sdk-js";

const retryStrategy = createLinearRetryStrategy({
  maxAttempts: 5,
  initialDelay: { seconds: 2 },
  increment: { seconds: 3 },
  maxDelay: { seconds: 30 },
});

const stepConfig: StepConfig<string> = { retryStrategy };

export const handler = withDurableExecution(async (event, context) => {
  return context.step(
    "call-external-api",
    async () => callExternalApi(),
    stepConfig,
  );
});

async function callExternalApi(): Promise<string> {
  return "ok";
}

Use create_linear_retry_strategy() with a LinearRetryStrategyConfig, then pass it as retry_strategy in StepConfig.

from aws_durable_execution_sdk_python.config import Duration, StepConfig
from aws_durable_execution_sdk_python.context import DurableContext, StepContext, durable_step
from aws_durable_execution_sdk_python.execution import durable_execution
from aws_durable_execution_sdk_python.retries import (
    LinearRetryStrategyConfig,
    create_linear_retry_strategy,
)

retry_strategy = create_linear_retry_strategy(
    LinearRetryStrategyConfig(
        max_attempts=5,
        initial_delay=Duration.from_seconds(2),
        increment=Duration.from_seconds(3),
        max_delay=Duration.from_seconds(30),
    )
)

step_config = StepConfig(retry_strategy=retry_strategy)


@durable_step
def call_external_api(step_context: StepContext) -> str:
    return "ok"


@durable_execution
def lambda_handler(event: dict, context: DurableContext) -> str:
    return context.step(call_external_api(), config=step_config)

Use RetryStrategies.linearBackoff() to build a strategy, then pass it to StepConfig.builder().retryStrategy().

import java.time.Duration;
import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.StepContext;
import software.amazon.lambda.durable.config.StepConfig;
import software.amazon.lambda.durable.retry.JitterStrategy;
import software.amazon.lambda.durable.retry.RetryStrategies;

public class LinearRetryStrategyExample extends DurableHandler<Object, String> {
    @Override
    public String handleRequest(Object input, DurableContext context) {
        StepConfig config = StepConfig.builder()
            .retryStrategy(RetryStrategies.linearBackoff(
                5,                       // maxAttempts
                Duration.ofSeconds(2),   // initialDelay
                Duration.ofSeconds(30),  // maxDelay
                Duration.ofSeconds(3),   // increment
                JitterStrategy.FULL))    // jitter
            .build();

        return context.step("call-external-api", String.class,
            (StepContext ctx) -> "ok",
            config);
    }
}

LinearRetryStrategyConfig signature

import { JitterStrategy, Duration } from "@aws/durable-execution-sdk-js";

export interface LinearRetryStrategyConfig {
  /** Default: 6 */
  maxAttempts?: number;
  /** Default: { seconds: 1 } */
  initialDelay?: Duration;
  /** Default: { seconds: 1 } */
  increment?: Duration;
  /** Default: { minutes: 5 } */
  maxDelay?: Duration;
  /** Default: JitterStrategy.FULL */
  jitter?: JitterStrategy;
  retryableErrors?: (string | RegExp)[];
  retryableErrorTypes?: (new () => Error)[];
}

Parameters:

  • maxAttempts (optional) Total attempts including the initial attempt. Default: 6.
  • initialDelay (optional) Delay before the first retry. Default: { seconds: 1 }.
  • increment (optional) Amount added to the delay on each retry. Default: { seconds: 1 }.
  • maxDelay (optional) Maximum delay between retries. Default: { minutes: 5 }.
  • jitter (optional) A JitterStrategy value. Default: JitterStrategy.FULL.
  • retryableErrors (optional) Array of strings or RegExp patterns matched against the error message. The SDK retries all errors when you set neither retryableErrors nor retryableErrorTypes.
  • retryableErrorTypes (optional) Array of error classes. The SDK retries only errors that are instances of these classes. When you set both filters, the SDK retries an error if it matches either (OR logic).

import re
from dataclasses import dataclass
from aws_durable_execution_sdk_python.config import Duration, JitterStrategy


@dataclass
class LinearRetryStrategyConfig:
    max_attempts: int = 6
    initial_delay: Duration = Duration.from_seconds(1)
    increment: Duration = Duration.from_seconds(1)
    max_delay: Duration = Duration.from_minutes(5)
    jitter_strategy: JitterStrategy = JitterStrategy.FULL
    retryable_errors: list[str | re.Pattern] | None = None
    retryable_error_types: list[type[Exception]] | None = None

Parameters:

  • max_attempts (optional) Total attempts including the initial attempt. Default: 6.
  • initial_delay (optional) A Duration. Default: Duration.from_seconds(1).
  • increment (optional) Amount added to the delay on each retry. Default: Duration.from_seconds(1).
  • max_delay (optional) A Duration. Default: Duration.from_minutes(5).
  • jitter_strategy (optional) A JitterStrategy value. Default: JitterStrategy.FULL.
  • retryable_errors (optional) List of strings or compiled re.Pattern objects matched against the error message. The SDK retries all errors when you set neither retryable_errors nor retryable_error_types.
  • retryable_error_types (optional) List of exception classes. The SDK retries only exceptions that are instances of these classes. When you set both filters, the SDK retries an error if it matches either (OR logic).

RetryStrategy RetryStrategies.linearBackoff(
    int maxAttempts,
    Duration initialDelay,
    Duration increment
)

RetryStrategy RetryStrategies.linearBackoff(
    int maxAttempts,
    Duration initialDelay,
    Duration maxDelay,
    Duration increment,
    JitterStrategy jitter
)

Parameters:

  • maxAttempts Total attempts including the initial attempt.
  • initialDelay A java.time.Duration. Minimum 1 second.
  • maxDelay A java.time.Duration. Minimum 1 second. Caps the calculated delay.
  • increment A java.time.Duration added to the delay on each retry.
  • jitter A JitterStrategy value. The three-argument overload omits both maxDelay and jitter.

Delay calculation

Linear backoff calculates the delay before each retry as:

base_delay = min(initial_delay + increment × (attempt - 1), max_delay)
final_delay = jitter(base_delay), minimum 1 second

The same JitterStrategy values apply: FULL, HALF, and NONE.
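To see how the linear formula behaves next to exponential backoff, this standalone Python sketch evaluates both base-delay formulas for the same 2-second initial delay and 30-second cap. It does not use the SDK; linear_base_delay and exponential_base_delay are hypothetical helper names, and jitter is left out so the numbers are deterministic.

```python
def linear_base_delay(attempt: int, initial: float, increment: float, max_delay: float) -> float:
    """base_delay = min(initial_delay + increment * (attempt - 1), max_delay)."""
    return min(initial + increment * (attempt - 1), max_delay)


def exponential_base_delay(attempt: int, initial: float, rate: float, max_delay: float) -> float:
    """base_delay = min(initial_delay * backoff_rate ** (attempt - 1), max_delay)."""
    return min(initial * rate ** (attempt - 1), max_delay)


retries = range(1, 12)
linear = [linear_base_delay(n, initial=2, increment=3, max_delay=30) for n in retries]
exponential = [exponential_base_delay(n, initial=2, rate=2, max_delay=30) for n in retries]

print(linear)       # [2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 30]
print(exponential)  # [2, 4, 8, 16, 30, 30, 30, 30, 30, 30, 30]
```

With these values the exponential schedule reaches the cap on the fifth retry, while the linear schedule takes until the eleventh.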

Write a custom strategy

You can write your own retry strategy directly. The SDK calls it with the error and the current attempt number after each failure. The attempt number is one-indexed.

RetryStrategy signature

import { RetryDecision } from "@aws/durable-execution-sdk-js";

// (error: Error, attemptCount: number) => RetryDecision
// attemptCount is one-indexed: 1 on the first retry, 2 on the second, etc.

type RetryStrategy = (error: Error, attemptCount: number) => RetryDecision;

from collections.abc import Callable
from aws_durable_execution_sdk_python.retries import RetryDecision

# retry_strategy: Callable[[Exception, int], RetryDecision]
# attempt_count is one-indexed: 1 on the first retry, 2 on the second, etc.

import software.amazon.lambda.durable.retry.RetryDecision;
import software.amazon.lambda.durable.retry.RetryStrategy;

// @FunctionalInterface
// interface RetryStrategy {
//     RetryDecision makeRetryDecision(Throwable error, int attempt);
// }
// attempt is one-indexed: 1 on the first retry, 2 on the second, etc.

Example

Return { shouldRetry: false } to stop, or { shouldRetry: true, delay: { seconds: N } } to retry.

import {
  withDurableExecution,
  StepConfig,
  RetryDecision,
} from "@aws/durable-execution-sdk-js";

// A retry strategy is a plain function: (error: Error, attemptCount: number) => RetryDecision
// attemptCount is 1-based: 1 on the first retry, 2 on the second, etc.
const customRetryStrategy = (error: Error, attemptCount: number): RetryDecision => {
  if (attemptCount >= 4) {
    return { shouldRetry: false };
  }
  // Fixed 2-second delay regardless of attempt number
  return { shouldRetry: true, delay: { seconds: 2 } };
};

const stepConfig: StepConfig<string> = { retryStrategy: customRetryStrategy };

export const handler = withDurableExecution(async (event, context) => {
  const result = await context.step(
    "call-api",
    async () => callApi(),
    stepConfig,
  );
  return result;
});

async function callApi(): Promise<string> {
  return "ok";
}

Use RetryDecision.retry(Duration) or RetryDecision.no_retry().

from aws_durable_execution_sdk_python.config import Duration, StepConfig
from aws_durable_execution_sdk_python.context import DurableContext, StepContext, durable_step
from aws_durable_execution_sdk_python.execution import durable_execution
from aws_durable_execution_sdk_python.retries import RetryDecision


# A retry strategy is a plain callable: (Exception, int) -> RetryDecision
# attempt_count is 1-based: 1 on the first retry, 2 on the second, etc.
def custom_retry_strategy(error: Exception, attempt_count: int) -> RetryDecision:
    if attempt_count >= 4:
        return RetryDecision.no_retry()
    # Fixed 2-second delay regardless of attempt number
    return RetryDecision.retry(Duration.from_seconds(2))


step_config = StepConfig(retry_strategy=custom_retry_strategy)


@durable_step
def call_api(step_context: StepContext) -> str:
    return "ok"


@durable_execution
def lambda_handler(event: dict, context: DurableContext) -> str:
    return context.step(call_api(), config=step_config)

Use RetryDecision.retry(Duration) or RetryDecision.fail().

import java.time.Duration;
import java.util.Map;
import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.config.StepConfig;
import software.amazon.lambda.durable.retry.RetryDecision;
import software.amazon.lambda.durable.retry.RetryStrategy;

public class CustomRetryStrategyExample extends DurableHandler<Map<String, Object>, String> {

    // RetryStrategy is a functional interface: (Throwable error, int attempt) -> RetryDecision
    // attempt is 1-based: 1 on the first retry, 2 on the second, etc.
    private static final RetryStrategy customRetryStrategy = (error, attempt) -> {
        if (attempt >= 4) {
            return RetryDecision.fail();
        }
        // Fixed 2-second delay regardless of attempt number
        return RetryDecision.retry(Duration.ofSeconds(2));
    };

    @Override
    public String handleRequest(Map<String, Object> event, DurableContext context) {
        return context.step("call-api", String.class,
                stepCtx -> callApi(),
                StepConfig.builder().retryStrategy(customRetryStrategy).build());
    }

    private String callApi() {
        return "ok";
    }
}
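Because a strategy is a plain function, you can check its policy locally before deploying. The sketch below traces the same "stop at attempt 4" policy as the examples above. It uses no SDK types: Decision is a hypothetical stand-in for RetryDecision, trace is a hypothetical helper, and it assumes the strategy is called once per failure with attempt numbers 1, 2, 3, and so on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Hypothetical stand-in for the SDK's RetryDecision, for local checks only."""
    should_retry: bool
    delay_seconds: float = 0.0


def custom_retry_strategy(error: Exception, attempt_count: int) -> Decision:
    # Same policy as the examples above: stop at attempt 4, else wait 2 seconds.
    if attempt_count >= 4:
        return Decision(should_retry=False)
    return Decision(should_retry=True, delay_seconds=2)


def trace(strategy, max_calls: int = 10) -> list[Decision]:
    """Call the strategy with attempt numbers 1, 2, 3, ... until it stops."""
    decisions = []
    for attempt in range(1, max_calls + 1):
        decision = strategy(RuntimeError("boom"), attempt)
        decisions.append(decision)
        if not decision.should_retry:
            break
    return decisions


decisions = trace(custom_retry_strategy)
retries = sum(d.should_retry for d in decisions)
print(len(decisions), retries)  # 4 3
```

Under that assumption, the strategy is consulted four times and grants three retries, each after a fixed 2-second delay.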

Retry presets

The SDK ships with preset strategies for common cases:

import { withDurableExecution, retryPresets } from "@aws/durable-execution-sdk-js";

export const handler = withDurableExecution(async (event, context) => {
  // Default: 6 attempts, 5s initial delay, 60s max, 2x backoff, full jitter
  const result = await context.step(
    "call-api",
    async () => callApi(),
    { retryStrategy: retryPresets.default },
  );

  // Linear: 6 attempts, delays of 1s, 2s, 3s, 4s, 5s
  const audit = await context.step(
    "audit-log",
    async () => writeAuditLog(),
    { retryStrategy: retryPresets.linear },
  );

  // No retry: fail immediately on first error
  const critical = await context.step(
    "charge-payment",
    async () => chargePayment(),
    { retryStrategy: retryPresets.noRetry },
  );

  return { result, audit, critical };
});

async function callApi(): Promise<string> { return "ok"; }
async function writeAuditLog(): Promise<string> { return "logged"; }
async function chargePayment(): Promise<string> { return "charged"; }

  • retryPresets.default 6 attempts, 5s initial delay, 60s max, 2x backoff, full jitter.
  • retryPresets.linear 6 attempts with linear delays of 1s, 2s, 3s, 4s, 5s and no jitter.
  • retryPresets.noRetry 1 attempt, fails immediately on error.

from aws_durable_execution_sdk_python.config import Duration, StepConfig
from aws_durable_execution_sdk_python.retries import RetryPresets

# No retries
no_retry_config = StepConfig(retry_strategy=RetryPresets.none())

# Default retries (6 attempts, 5s initial delay, 60s max, 2x backoff)
default_config = StepConfig(retry_strategy=RetryPresets.default())

# Quick retries for transient errors (3 attempts)
transient_config = StepConfig(retry_strategy=RetryPresets.transient())

# Longer retries for resource availability (5 attempts, up to 5 minutes)
resource_config = StepConfig(retry_strategy=RetryPresets.resource_availability())

# Aggressive retries for critical operations (10 attempts)
critical_config = StepConfig(retry_strategy=RetryPresets.critical())

# Linear backoff (6 attempts, delays of 1s, 2s, 3s, 4s, 5s)
linear_config = StepConfig(retry_strategy=RetryPresets.linear())

# Fixed delay (5 attempts, 5 second interval). Pass an interval to customize.
fixed_config = StepConfig(retry_strategy=RetryPresets.fixed())
fixed_2s_config = StepConfig(
    retry_strategy=RetryPresets.fixed(interval=Duration.from_seconds(2))
)

  • RetryPresets.default() 6 attempts, 5s initial delay, 60s max, 2x backoff, full jitter.
  • RetryPresets.none() 1 attempt, fails immediately on error.
  • RetryPresets.transient() 3 attempts, 2x backoff, half jitter.
  • RetryPresets.resource_availability() 5 attempts, 5s initial delay, 5 min max, 2x backoff.
  • RetryPresets.critical() 10 attempts, 1s initial delay, 60s max, 1.5x backoff, no jitter.
  • RetryPresets.linear() 6 attempts with linear delays of 1s, 2s, 3s, 4s, 5s and no jitter.
  • RetryPresets.fixed(interval) 5 attempts at a constant interval. Defaults to a 5 second interval. Pass a Duration to override.

import java.util.Map;
import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.config.StepConfig;
import software.amazon.lambda.durable.retry.RetryStrategies;

public class RetryPresetsExample extends DurableHandler<Map<String, Object>, Map<String, Object>> {

    @Override
    public Map<String, Object> handleRequest(Map<String, Object> event, DurableContext context) {
        // Default: 6 attempts, 5s initial delay, 60s max, 2x backoff, full jitter
        String result = context.step("call-api", String.class,
                stepCtx -> callApi(),
                StepConfig.builder().retryStrategy(RetryStrategies.Presets.DEFAULT).build());

        // Linear: 6 attempts, delays of 1s, 2s, 3s, 4s, 5s
        String audit = context.step("audit-log", String.class,
                stepCtx -> writeAuditLog(),
                StepConfig.builder().retryStrategy(RetryStrategies.Presets.LINEAR).build());

        // No retry: fail immediately on first error
        String critical = context.step("charge-payment", String.class,
                stepCtx -> chargePayment(),
                StepConfig.builder().retryStrategy(RetryStrategies.Presets.NO_RETRY).build());

        return Map.of("result", result, "audit", audit, "critical", critical);
    }

    private String callApi() { return "ok"; }
    private String writeAuditLog() { return "logged"; }
    private String chargePayment() { return "charged"; }
}

  • RetryStrategies.Presets.DEFAULT 6 attempts, 5s initial delay, 60s max, 2x backoff, full jitter.
  • RetryStrategies.Presets.LINEAR 6 attempts with linear delays capped at 5 seconds and no jitter.
  • RetryStrategies.Presets.NO_RETRY Fails immediately on first error.
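When choosing a preset, it can help to know how long a step might wait in total before it gives up. The following standalone Python sketch derives the base delays of the default and linear presets from the parameters listed above and sums them. It does not call the SDK: exponential_delays and linear_delays are hypothetical helpers, and the 300-second cap passed for the linear preset is an assumption borrowed from the LinearRetryStrategyConfig default (it is never reached here).

```python
def exponential_delays(max_attempts, initial, rate, max_delay):
    """Base delay before each retry; max_attempts includes the first try."""
    return [min(initial * rate ** (n - 1), max_delay) for n in range(1, max_attempts)]


def linear_delays(max_attempts, initial, increment, max_delay):
    """Base delay before each retry for linear backoff."""
    return [min(initial + increment * (n - 1), max_delay) for n in range(1, max_attempts)]


# Default preset: 6 attempts, 5s initial delay, 60s max, 2x backoff.
default_preset = exponential_delays(6, initial=5, rate=2, max_delay=60)
# Linear preset: 6 attempts, delays of 1s, 2s, 3s, 4s, 5s.
linear_preset = linear_delays(6, initial=1, increment=1, max_delay=300)

print(default_preset, sum(default_preset))  # [5, 10, 20, 40, 60] 135
print(linear_preset, sum(linear_preset))    # [1, 2, 3, 4, 5] 15
```

Six attempts mean five waits. Because the default preset applies full jitter, 135 seconds is an upper bound on its total wait rather than the expected value. The linear preset has no jitter, so its 15 seconds is exact.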

Retry any durable operation

Use the withRetry helper to wrap any durable operation in a replay-safe retry loop. It brings the RetryStrategy configuration that step supports to other operations, such as invoke, waitForCallback, and waitForCondition.

withRetry(context, name?, func, config) runs func and retries it on failure. The function receives the durable context and the 1-based attempt number. By default the loop is wrapped in runInChildContext so all attempts group under one operation in execution history.

import {
  withDurableExecution,
  withRetry,
  createRetryStrategy,
} from "@aws/durable-execution-sdk-js";

const retryStrategy = createRetryStrategy({
  maxAttempts: 3,
  initialDelay: { seconds: 2 },
  backoffRate: 2,
});

export const handler = withDurableExecution(async (event, context) => {
  // invoke does not accept a retryStrategy, so wrap it with withRetry to
  // apply backoff between failed attempts.
  const receipt = await withRetry(
    context,
    "charge-payment",
    (ctx, attempt) =>
      ctx.invoke<{ orderId: string }, string>(
        `charge-${attempt}`,
        "process-payment",
        { orderId: (event as { orderId: string }).orderId },
      ),
    { retryStrategy },
  );
  return receipt;
});

with_retry(context, func, config, name=None) runs func and retries it on failure. The function receives the durable context and the 1-based attempt number. By default the loop is wrapped in run_in_child_context so all attempts group under one operation in execution history.

from aws_durable_execution_sdk_python.config import Duration
from aws_durable_execution_sdk_python.context import DurableContext
from aws_durable_execution_sdk_python.execution import durable_execution
from aws_durable_execution_sdk_python.retries import (
    RetryStrategyConfig,
    WithRetryConfig,
    create_retry_strategy,
    with_retry,
)


def charge_flow(ctx: DurableContext, attempt: int) -> str:
    # invoke does not accept a retry strategy, so with_retry handles backoff.
    return ctx.invoke(
        function_name="process-payment",
        payload={"order_id": "abc"},
        name=f"charge-{attempt}",
    )


retry_config = WithRetryConfig(
    retry_strategy=create_retry_strategy(
        RetryStrategyConfig(
            max_attempts=3,
            initial_delay=Duration.from_seconds(2),
            backoff_rate=2.0,
        )
    ),
)


@durable_execution
def lambda_handler(event: dict, context: DurableContext) -> str:
    return with_retry(
        context,
        func=charge_flow,
        config=retry_config,
        name="charge-payment",
    )

DurableContext.withRetry(name, operation, config) runs operation and retries it on failure. The BiFunction receives the 1-based attempt number first and the durable context second. An async overload, withRetryAsync, returns a DurableFuture<T> for parallel use.

import java.time.Duration;
import java.util.Map;
import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.config.WithRetryConfig;
import software.amazon.lambda.durable.retry.JitterStrategy;
import software.amazon.lambda.durable.retry.RetryStrategies;

public class WithRetryHelperExample extends DurableHandler<Map<String, Object>, String> {

    @Override
    public String handleRequest(Map<String, Object> event, DurableContext context) {
        WithRetryConfig retryConfig = WithRetryConfig.builder()
                .retryStrategy(RetryStrategies.exponentialBackoff(
                        3,                       // maxAttempts
                        Duration.ofSeconds(2),   // initialDelay
                        Duration.ofMinutes(1),   // maxDelay
                        2.0,                     // backoffRate
                        JitterStrategy.FULL))    // jitter
                .build();

        // invoke does not accept a retry strategy, so withRetry applies backoff
        // between failed attempts.
        return context.withRetry(
                "charge-payment",
                (attempt, ctx) -> ctx.invoke(
                        "charge-" + attempt,
                        "process-payment",
                        Map.of("orderId", event.get("orderId")),
                        String.class),
                retryConfig);
    }
}

The withRetry helper wraps the retry loop in a child context and uses context.wait between attempts to suspend the invocation while waiting for the retry interval. The child context, the wait operations, and any operations inside each attempt count toward the durable operations the execution consumes. See AWS Lambda service quotas.
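For a rough sense of how many durable operations one wrapped call could consume, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not the service's accounting: estimate_operations is a hypothetical helper, and it assumes the child context and each wait count as exactly one operation.

```python
def estimate_operations(attempts: int, operations_per_attempt: int = 1) -> int:
    """Estimate operations used by a retry loop that runs `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    child_context = 1       # assumption: the wrapping child context counts once
    waits = attempts - 1    # one wait between each pair of consecutive attempts
    return child_context + attempts * operations_per_attempt + waits


# Worst case for maxAttempts = 3 with a single invoke per attempt.
print(estimate_operations(3))  # 6
```

Passing the strategy's maximum attempt count gives the worst case. A call that succeeds on the first attempt uses only the child context and the operations inside that attempt.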

Retry only specific errors

You can retry only certain error types and fail immediately on others.

Use retryableErrorTypes to specify which error classes to retry.

import {
  withDurableExecution,
  createRetryStrategy,
} from "@aws/durable-execution-sdk-js";

class RateLimitError extends Error {}
class ServiceUnavailableError extends Error {}

const retryStrategy = createRetryStrategy({
  maxAttempts: 5,
  initialDelay: { seconds: 2 },
  // Only retry these specific error types; all other errors fail immediately
  retryableErrorTypes: [RateLimitError, ServiceUnavailableError],
});

export const handler = withDurableExecution(async (event, context) => {
  const result = await context.step(
    "call-api",
    async () => {
      // Throws RateLimitError or ServiceUnavailableError on transient failures
      return callApi();
    },
    { retryStrategy },
  );
  return result;
});

async function callApi(): Promise<string> {
  return "ok";
}

Use retryable_error_types to specify which exception classes to retry.

from aws_durable_execution_sdk_python.config import StepConfig
from aws_durable_execution_sdk_python.context import DurableContext, StepContext, durable_step
from aws_durable_execution_sdk_python.execution import durable_execution
from aws_durable_execution_sdk_python.retries import RetryStrategyConfig, create_retry_strategy


class RateLimitError(Exception):
    pass


class ServiceUnavailableError(Exception):
    pass


retry_strategy = create_retry_strategy(
    RetryStrategyConfig(
        max_attempts=5,
        # Only retry these specific error types; all other errors fail immediately
        retryable_error_types=[RateLimitError, ServiceUnavailableError],
    )
)

step_config = StepConfig(retry_strategy=retry_strategy)


@durable_step
def call_api(step_context: StepContext) -> str:
    # Raises RateLimitError or ServiceUnavailableError on transient failures
    return "ok"


@durable_execution
def lambda_handler(event: dict, context: DurableContext) -> str:
    return context.step(call_api(), config=step_config)

RetryStrategy is a functional interface. Check the error type in the lambda and return RetryDecision.fail() for errors you do not want to retry.

import java.time.Duration;
import java.util.Map;
import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.config.StepConfig;
import software.amazon.lambda.durable.retry.JitterStrategy;
import software.amazon.lambda.durable.retry.RetryDecision;
import software.amazon.lambda.durable.retry.RetryStrategies;
import software.amazon.lambda.durable.retry.RetryStrategy;

public class RetrySpecificErrorsExample extends DurableHandler<Map<String, Object>, String> {

    static class RateLimitException extends RuntimeException {
        public RateLimitException(String message) { super(message); }
    }

    static class ServiceUnavailableException extends RuntimeException {
        public ServiceUnavailableException(String message) { super(message); }
    }

    // Build the backoff strategy once instead of on every retry decision.
    private static final RetryStrategy backoff = RetryStrategies.exponentialBackoff(
            5, Duration.ofSeconds(2), Duration.ofMinutes(1), 2.0, JitterStrategy.FULL);

    // RetryStrategy is a functional interface: (Throwable error, int attempt) -> RetryDecision
    // Filter by error type manually, then delegate to exponential backoff for the delay.
    private static final RetryStrategy retryStrategy = (error, attempt) -> {
        if (!(error instanceof RateLimitException) && !(error instanceof ServiceUnavailableException)) {
            return RetryDecision.fail(); // all other errors fail immediately
        }
        return backoff.makeRetryDecision(error, attempt);
    };

    @Override
    public String handleRequest(Map<String, Object> event, DurableContext context) {
        return context.step("call-api", String.class,
                stepCtx -> callApi(),
                StepConfig.builder().retryStrategy(retryStrategy).build());
    }

    private String callApi() { return "ok"; }
}

See also