

# Architecture details
<a name="architecture-details"></a>

This section describes the components and [AWS services that make up this solution](#aws-services-in-this-solution) and explains how these components work together.

The Distributed Load Testing on AWS solution consists of three high-level components: a [front end](front-end.md), a [backend](back-end.md), and an optional [MCP Server](MCP-Server.md).

# Front end
<a name="front-end"></a>

The front end provides the interfaces for interacting with the solution and includes:
+ A load testing API for programmatic access
+ A web console for creating, scheduling, and running performance tests
+ An optional MCP Server for AI-assisted analysis of test results and errors

## Load testing API
<a name="load-testing-api"></a>

Distributed Load Testing on AWS configures Amazon API Gateway to host the solution’s RESTful API. Users can interact with the load testing system securely through the included web console, RESTful API, and optional MCP Server. The API acts as a "front door" for access to testing data stored in Amazon DynamoDB. You can also use the APIs to access any extended functionality you build into the solution.

This solution takes advantage of the user authentication features of Amazon Cognito user pools. After successfully authenticating a user, Amazon Cognito issues a JSON web token that is used to allow the console to submit requests to the solution’s APIs (Amazon API Gateway endpoints). HTTPS requests are sent by the console to the APIs with the authorization header that includes the token.
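As a sketch of this flow, the request a client sends looks like the following; the endpoint URL, resource path, and token are placeholders for illustration, not values from an actual deployment:

```python
import urllib.request

def build_api_request(api_url: str, id_token: str, path: str = "/scenarios") -> urllib.request.Request:
    """Build an authenticated GET request for the solution's API.

    The Cognito-issued JSON web token travels in the Authorization header,
    where it is validated before the backing Lambda function is invoked.
    """
    req = urllib.request.Request(api_url.rstrip("/") + path, method="GET")
    req.add_header("Authorization", id_token)
    return req

# Placeholder endpoint and token; send with urllib.request.urlopen(req)
# against a real deployment.
req = build_api_request(
    "https://abc123.execute-api.us-east-1.amazonaws.com/prod",
    "eyJraWQiOi...example-token",
)
```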

Based on the request, API Gateway invokes the appropriate AWS Lambda function to perform the necessary tasks: operating on the data stored in the DynamoDB tables, storing test scenarios as JSON objects in Amazon S3, retrieving Amazon CloudWatch metrics images, and submitting test scenarios to the AWS Step Functions state machine.

For more information on the solution’s API, refer to the [Distributed load testing API](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/distributed-load-testing-api.html) section of this guide.

## Web console
<a name="web-console"></a>

This solution includes a web console that you can use to configure and run tests, monitor running tests, and view detailed test results. The console is a ReactJS application built with [Cloudscape](https://cloudscape.design/), an open-source design system for building intuitive web applications. The console is hosted in Amazon S3 and accessed through Amazon CloudFront. The application leverages AWS Amplify to integrate with Amazon Cognito to authenticate users. The web console also provides an option to view live data for a running test by subscribing to the corresponding topic in AWS IoT Core.

The web console URL is the CloudFront distribution domain name, which you can find in the CloudFormation stack outputs as **Console**. After you launch the CloudFormation template, you also receive an email that contains the web console URL and a one-time password to log in to it.

## MCP Server (Optional)
<a name="mcp-server-front-end"></a>

The optional Model Context Protocol (MCP) Server provides an additional interface for AI development tools to access and analyze load testing data through natural language interactions. This component is only deployed if you select the MCP Server option during solution deployment.

The MCP Server enables AI agents to query test results, analyze performance metrics, and gain insights into your load testing data using tools like Amazon Q, Claude, and other MCP-compatible AI assistants. For detailed information about the MCP Server architecture and configuration, refer to [MCP Server](MCP-Server.md) in this section.

# Backend
<a name="back-end"></a>

The backend consists of a container image pipeline and a load testing engine that generates load for the tests. You interact with the backend through the front end. Additionally, the Amazon ECS on AWS Fargate tasks launched for each test are tagged with a unique test identifier (ID). You can use these test ID tags to help monitor costs for this solution. For additional information, refer to [User-Defined Cost Allocation Tags](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/custom-tags.html) in the *AWS Billing and Cost Management User Guide*.

## Container image pipeline
<a name="container-image-pipeline"></a>

This solution uses a container image built with [Amazon Linux 2023](https://aws.amazon.com/linux/amazon-linux-2023/) as the base image with the [Taurus](https://gettaurus.org/) load testing framework installed. Taurus is an open-source test automation framework that supports JMeter, K6, Locust, and other testing tools. AWS hosts this image in an Amazon Elastic Container Registry (Amazon ECR) public repository. The solution uses this image to run tasks in the Amazon ECS on AWS Fargate cluster.

For more information, refer to the [Container image customization](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/container-image.html) section of this guide.

## Testing infrastructure
<a name="testing-infrastructure"></a>

In addition to the main CloudFormation template, the solution provides a regional template to launch the required resources for running tests in multiple Regions. The solution stores this template in Amazon S3 and provides a link to it in the web console. Each regional stack includes a VPC, an AWS Fargate cluster, and a Lambda function for processing live data.

For more information on how to deploy testing infrastructure in additional Regions, refer to the [Multi-Region deployment](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/multi-region-deployment.html) section of this guide.

## Load testing engine
<a name="load-testing-engine"></a>

The Distributed Load Testing solution uses Amazon Elastic Container Service (Amazon ECS) and AWS Fargate to simulate thousands of concurrent users across multiple Regions, generating HTTP requests at a sustained rate.

You define the test parameters using the included web console. The solution uses these parameters to generate a JSON test scenario and stores it in Amazon S3. For more information about test scripts and testing parameters, refer to [Test types](design-considerations.md#test-types) in this section.
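To make the shape of such a scenario concrete, here is an illustrative sketch; the field names and S3 key layout are assumptions for illustration, not the solution's exact schema:

```python
import json

TEST_ID = "example-test-id"

# Field names below are illustrative assumptions, not the exact schema.
scenario = {
    "testId": TEST_ID,
    "testName": "checkout-api-smoke",
    "testType": "simple",
    "taskCount": 5,
    "concurrency": 50,   # virtual users per task
    "rampUp": "2m",
    "holdFor": "10m",
    "endpoint": "https://example.com/api/checkout",
    "method": "GET",
}

# Assumed S3 key layout for the stored scenario object:
s3_key = f"test-scenarios/{TEST_ID}/{TEST_ID}.json"
body = json.dumps(scenario)
```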

An AWS Step Functions state machine runs and monitors Amazon ECS tasks in an AWS Fargate cluster. The state machine includes five AWS Lambda functions: ecr-checker, task-status-checker, task-runner, task-canceler, and results-parser. For more information, refer to the [Test workflow](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/test-workflow.html), [Test results](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/test-results.html), and [Test cancellation workflow](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/test-cancel-workflow.html) sections of this guide.

If you select the live data option, the CloudWatch logs for the Fargate tasks in each Region invoke a real-time-data-publisher Lambda function in that Region. The function processes the data and publishes it to a topic in AWS IoT Core within the Region where you launched the main stack. For more information, refer to the [Live data](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/live-data.html) section of this guide.
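The payload a log-subscribed Lambda function receives follows the documented CloudWatch Logs subscription format: base64-encoded, gzip-compressed JSON under `awslogs.data`. The following sketch decodes a simulated batch the way the real-time-data-publisher function would have to before publishing:

```python
import base64
import gzip
import json

def decode_log_events(event: dict) -> list:
    """Decode a CloudWatch Logs subscription-filter payload.

    CloudWatch delivers log batches to a subscribed Lambda function as
    base64-encoded, gzip-compressed JSON under event["awslogs"]["data"].
    """
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    return payload["logEvents"]

# Simulated payload for illustration:
raw = {"logGroup": "/ecs/dlt", "logEvents": [{"id": "1", "message": "live-data sample"}]}
event = {"awslogs": {"data": base64.b64encode(gzip.compress(json.dumps(raw).encode())).decode()}}
events = decode_log_events(event)
```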

# MCP Server
<a name="MCP-Server"></a>

The optional Model Context Protocol (MCP) Server integration enables AI agents to programmatically access and analyze your load testing data through natural language interactions. This component is only deployed if you select the MCP Server option during solution deployment.

The MCP Server acts as a bridge between AI development tools and your DLT deployment, providing a standardized interface for intelligent analysis of performance testing results. The architecture integrates several AWS services to create a secure, scalable interface for AI agent interactions:

## AWS AgentCore Gateway
<a name="AWS-AgentCore-Gateway"></a>

AWS AgentCore Gateway is a fully managed service that provides standardized hosting and protocol management for MCP servers. In this solution, AgentCore Gateway serves as the public endpoint that AI agents connect to when requesting access to your load testing data.

The service handles all MCP protocol communication, including tool discovery, authentication token validation, and request routing. AgentCore Gateway operates as a multi-tenant service with built-in security protections against common threats to public endpoints, while validating Cognito token signatures and claims for each request.

## DLT MCP Server Lambda
<a name="MCP-Server-Lambda"></a>

The DLT MCP Server Lambda function is a custom serverless component that processes MCP requests from AI agents and translates them into queries against your DLT resources.

This Lambda function acts as the intelligence layer of the MCP integration, retrieving test results from DynamoDB tables, accessing performance artifacts stored in S3 buckets, and querying CloudWatch logs for detailed execution information. The Lambda function implements read-only access patterns and transforms raw DLT data into structured, AI-friendly formats that agents can easily interpret and analyze.

## Authentication integration
<a name="MCP-Auth-Integration"></a>

The authentication system leverages your existing Cognito user pool infrastructure to maintain consistent access controls across both the web console and MCP Server interfaces.

This integration uses OAuth 2.0 token-based authentication. Users authenticate once through the Cognito login process and receive tokens that work for both UI interactions and MCP Server access. The system maintains the same permission boundaries and access controls as the web interface, ensuring that users can access through AI agents only the same load testing data they can access through the console.

## AWS services in this solution
<a name="aws-services-in-this-solution"></a>

The following AWS services are included in this solution:


| AWS service | Description | 
| --- | --- | 
|   [Amazon API Gateway](https://aws.amazon.com/api-gateway/)   |   **Core.** Hosts REST API endpoints in the solution.  | 
|   [AWS CloudFormation](https://aws.amazon.com/cloudformation/)   |   **Core.** Manages deployments for the solution infrastructure.  | 
|   [Amazon CloudFront](https://aws.amazon.com/cloudfront/)   |   **Core.** Serves the web content hosted in Amazon S3.  | 
|   [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/)   |   **Core.** Stores the solution logs and metrics.  | 
|   [Amazon Cognito](https://aws.amazon.com/cognito/)   |   **Core.** Handles user management and authentication for the API.  | 
|   [Amazon DynamoDB](https://aws.amazon.com/dynamodb/)   |   **Core.** Stores deployment information, test scenario details, and test results.  | 
|   [Amazon Elastic Container Service](https://aws.amazon.com/ecs/)   |   **Core.** Deploys and manages independent Amazon ECS tasks on AWS Fargate containers.  | 
|   [AWS Fargate](https://aws.amazon.com/fargate/)   |   **Core.** Hosts the solution's Amazon ECS containers.  | 
|   [AWS Identity and Access Management](https://aws.amazon.com/iam/)   |   **Core.** Handles user role and permissions management.  | 
|   [AWS Lambda](https://aws.amazon.com/lambda/)   |   **Core.** Provides the logic for the API implementation, test results parsing, and launching worker/leader tasks.  | 
|   [AWS Step Functions](https://aws.amazon.com/step-functions/)   |   **Core.** Orchestrates the provisioning of Amazon ECS containers on AWS Fargate tasks in the specified Regions.  | 
|   [AWS Amplify](https://aws.amazon.com/amplify/)   |   **Supporting.** Integrates the web console with Amazon Cognito for user authentication.  | 
|   [Amazon CloudWatch Events](https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/WhatIsCloudWatchEvents.html)   |   **Supporting**. Schedules tests to automatically begin at a specified date or on recurring dates.  | 
|   [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/)   |   **Supporting**. Hosts the container image in a public ECR repository.  | 
|   [AWS IoT Core](https://aws.amazon.com/iot-core/)   |   **Supporting.** Enables viewing live data for a running test by subscribing to the corresponding topic in AWS IoT Core.  | 
|   [AWS Systems Manager](https://aws.amazon.com/systems-manager/)   |   **Supporting.** Provides application-level resource monitoring and visualization of resource operations and cost data.  | 
|   [Amazon S3](https://aws.amazon.com/s3/)   |   **Supporting.** Hosts the static web content, logs, metrics, and tests data.  | 
|   [Amazon Virtual Private Cloud](https://aws.amazon.com/vpc/)   |   **Supporting.** Contains the solution's Amazon ECS containers running on AWS Fargate.  | 
|   [Amazon Bedrock AgentCore](https://aws.amazon.com/bedrock/agentcore/)   |   **Supporting, Optional.** Hosts the solution's optional Remote Model Context Protocol (MCP) Server for AI agent integration with API.  | 

# How Distributed Load Testing on AWS works
<a name="how-distributed-load-testing-on-aws-works"></a>

The following detailed breakdown shows the steps involved in running a test scenario.

**Test workflow**  
 ![Test workflow architecture](http://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/images/image3.png) 

1. You use the web console to submit a test scenario, including the configuration details, to the solution’s API.

1. The test scenario configuration is uploaded to Amazon Simple Storage Service (Amazon S3) as a JSON file (`s3://<bucket-name>/test-scenarios/<$TEST_ID>/<$TEST_ID>.json`).

1. An AWS Step Functions state machine runs with the test ID, task count, test type, and file type as input. If the test is scheduled, the solution first creates a CloudWatch Events rule, which starts the state machine on the specified date. For more details on the scheduling workflow, refer to the [Test scheduling workflow](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/test-scheduling-workflow.html) section of this guide.

1. Configuration details are stored in the scenarios Amazon DynamoDB table.

1. In the AWS Step Functions task runner workflow, the task-status-checker AWS Lambda function checks whether Amazon Elastic Container Service (Amazon ECS) tasks are already running for the same test ID. If tasks with the same test ID are found running, the function returns an error. If no Amazon ECS tasks are running in the AWS Fargate cluster, the function returns the test ID, task count, and test type.

1. The task-runner AWS Lambda function gets the task details from the previous step and runs the Amazon ECS worker tasks in the AWS Fargate cluster using the RunTask API action. These worker tasks launch and then wait for a start message from the leader task before beginning the test. The RunTask action is limited to 10 tasks per call, so if your task count is more than 10, the function calls RunTask repeatedly until all worker tasks have started. The function also generates a prefix to distinguish the current test in the results-parser AWS Lambda function.

1. The task-status-checker AWS Lambda function checks whether all the Amazon ECS worker tasks with the same test ID are running. If tasks are still provisioning, it waits for one minute and checks again. Once all Amazon ECS tasks are running, it returns the test ID, task count, test type, all task IDs, and the prefix, and passes them to the task-runner function.

1. The task-runner AWS Lambda function runs again, this time launching a single Amazon ECS task to act as the leader node. This ECS task sends a start test message to each of the worker tasks in order to start the tests simultaneously.

1. The task-status-checker AWS Lambda function again checks if Amazon ECS tasks are running with the same test ID. If tasks are still running, it waits for one minute and checks again. Once there are no running Amazon ECS tasks, it returns the test ID, task count, test type, and prefix.

1. When the task-runner AWS Lambda function runs the Amazon ECS tasks in the AWS Fargate cluster, each task downloads the test configuration from Amazon S3 and starts the test.

1. Once the tests are running, the average response time, number of concurrent users, number of successful requests, and number of failed requests for each task are logged in Amazon CloudWatch and can be viewed in a CloudWatch dashboard.

1. If you included live data in the test, the solution filters real-time test results in CloudWatch using a subscription filter. Then the solution passes the data to a Lambda function.

1. The Lambda function then structures the data received and publishes it to an AWS IoT Core topic.

1. The web console subscribes to the AWS IoT Core topic for the test and receives the data published to the topic to graph the real-time data while the test is running.

1. When the test is complete, the container images export a detailed report as an XML file to Amazon S3. Each file is given a UUID for the filename, for example, `s3://dlte-bucket/test-scenarios/<TEST_ID>/results/<UUID>.xml`.

1. When the XML files are uploaded to Amazon S3, the results-parser AWS Lambda function reads the XML files that start with the prefix, then parses and aggregates all the results into one summarized result.

1. The results-parser AWS Lambda function writes the aggregate result to an Amazon DynamoDB table.
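The RunTask limit of 10 tasks per call, noted in the workflow above, implies a simple batching loop in the task-runner function. The following sketch (not the solution's actual code) shows how a worker count splits into successive RunTask calls:

```python
RUN_TASK_MAX = 10  # RunTask launches at most 10 tasks per API call

def run_task_batches(task_count: int) -> list:
    """Split a worker task count into the batch sizes that successive
    RunTask calls would launch (a sketch, not the solution's code)."""
    batches = []
    remaining = task_count
    while remaining > 0:
        n = min(RUN_TASK_MAX, remaining)
        batches.append(n)
        remaining -= n
    return batches

batches = run_task_batches(23)  # e.g. 23 worker tasks need three calls
```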

## MCP Server workflow (Optional)
<a name="mcp-server-workflow"></a>

If you deploy the optional MCP Server integration, AI agents can access and analyze your load testing data through the following workflow:

 **MCP Server architecture** 

![MCP Server architecture showing integration with DLT components](http://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/images/mcp-server-architecture.png)


1.  **Customer interaction** - The customer interacts with DLT’s MCP Server via the MCP endpoint hosted by AWS AgentCore Gateway. AI agents connect to this endpoint to request access to load testing data.

1.  **Authorization** - AgentCore Gateway handles authorization against the solution’s Cognito user pool application client. The gateway validates the user’s Cognito token to ensure they have permission to access the DLT MCP Server. Authorized users are granted access, with agent tool use limited to read-only operations.

1.  **Tool specification** - AgentCore Gateway connects to the DLT MCP Server Lambda function. A tool specification defines the available tools that AI agents can use to interact with your load testing data.

1.  **Read-only API access** - The Lambda function is scoped to read-only API access through the existing DLT API Gateway endpoints. The function provides four primary operations:
   +  **List scenarios** - Retrieve a list of test scenarios from the DynamoDB scenarios table
   +  **Get scenario test results** - Access detailed test results for specific scenarios from DynamoDB and S3
   +  **Get Fargate load test runners** - Query information about running Fargate tasks in the ECS cluster
   +  **Get available Regional stacks** - Retrieve information about deployed regional infrastructure from CloudFormation

The MCP Server integration leverages the existing DLT infrastructure (API Gateway, Cognito, DynamoDB, S3) to provide secure, read-only access to test data for AI-powered analysis and insights.
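A tool specification of the kind described above could look like the following sketch; the tool names and input schemas are illustrative assumptions derived from the four operations listed, not the solution's published specification:

```python
# Tool names and schemas are illustrative assumptions, not the
# solution's published tool specification.
TOOL_SPEC = [
    {
        "name": "list_scenarios",
        "description": "Retrieve test scenarios from the DynamoDB scenarios table",
        "inputSchema": {"type": "object", "properties": {}},
    },
    {
        "name": "get_scenario_test_results",
        "description": "Fetch detailed results for one scenario from DynamoDB and S3",
        "inputSchema": {
            "type": "object",
            "properties": {"testId": {"type": "string"}},
            "required": ["testId"],
        },
    },
    {
        "name": "get_fargate_load_test_runners",
        "description": "Query running Fargate tasks in the ECS cluster",
        "inputSchema": {"type": "object", "properties": {}},
    },
    {
        "name": "get_available_regional_stacks",
        "description": "List deployed regional infrastructure from CloudFormation",
        "inputSchema": {"type": "object", "properties": {}},
    },
]
```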

# Design considerations
<a name="design-considerations"></a>

This section describes important design decisions and configuration options for the Distributed Load Testing on AWS solution, including supported applications, test types, scheduling options, and deployment considerations.

## Supported applications
<a name="supported-applications"></a>

This solution supports testing cloud-based applications and on-premises applications as long as you have network connectivity from your AWS account to your application. The solution supports APIs that use HTTP or HTTPS protocols.

## Test types
<a name="test-types"></a>

Distributed Load Testing on AWS supports multiple test types: simple HTTP endpoint tests, JMeter, K6, and Locust.

### Simple HTTP endpoint tests
<a name="single-http-support"></a>

The web console provides an HTTP Endpoint Configuration interface that allows you to test any HTTP or HTTPS endpoint without writing custom scripts. You define the endpoint URL, select the HTTP method (GET, POST, PUT, DELETE, etc.) from a dropdown menu, and optionally add custom request headers and body payloads. This configuration enables you to test APIs with custom authorization tokens, content types, or any other HTTP headers and request bodies required by your application.
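As an illustration, the values this screen collects for a POST test might be represented as follows; the key names are assumptions for illustration, not the solution's exact schema:

```python
import json

# Illustrative sketch of an HTTP endpoint test configuration; key names
# are assumptions, and the URL and token are placeholders.
endpoint_test = {
    "url": "https://api.example.com/orders",
    "method": "POST",
    "headers": {
        "Authorization": "Bearer <token>",
        "Content-Type": "application/json",
    },
    "body": json.dumps({"sku": "A-100", "qty": 2}),
}
```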

### JMeter tests
<a name="jmeter-script-support"></a>

When creating a test scenario using the web console, you can upload a JMeter test script. The solution uploads the script to the scenarios S3 bucket. When Amazon ECS tasks run, they download the JMeter script from S3 and execute the test.

**Important**  
Although your JMeter script may define concurrency (virtual users), transaction rates (TPS), ramp-up times, and other load parameters, the solution will override these configurations with the values you specify in the Traffic Shape screen during test creation. The Traffic Shape configuration controls the task count, concurrency (virtual users per task), ramp-up duration, and hold duration for the test execution.

If you have JMeter input files, you can zip the input files together with the JMeter script. You can choose the zip file when you create a test scenario.

If you would like to include plugins, any .jar files that are included in a /plugins subdirectory in the bundled zip file will be copied to the JMeter extensions directory and be available for load testing.

**Note**  
If you include JMeter input files with your JMeter script file, you must include the relative path of the input files in your JMeter script file. In addition, the input files must be at that relative path. For example, when your JMeter input files and script file are in the `/home/user` directory and you refer to the input files in the JMeter script file, the path of the input files must be `./INPUT_FILES`. If you use `/home/user/INPUT_FILES` instead, the test will fail because it will not be able to find the input files.

If you include JMeter plugins, the .jar files must be bundled in a subdirectory named /plugins within the root of the zip file. Relative to the root of the zip file, the path to the .jar files must be `./plugins/BUNDLED_PLUGIN.jar`.
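The path rules above can be checked with a small sketch using Python's standard `zipfile` module; the file names are examples only:

```python
import io
import zipfile

# Build an in-memory zip bundle following the layout rules above;
# file names and contents are examples only.
bundle = io.BytesIO()
with zipfile.ZipFile(bundle, "w") as zf:
    zf.writestr("test-plan.jmx", "<jmeterTestPlan/>")       # script at the zip root
    zf.writestr("INPUT_FILES/users.csv", "user,password\n")  # referenced as ./INPUT_FILES/users.csv
    zf.writestr("plugins/BUNDLED_PLUGIN.jar", b"")           # copied to the JMeter extensions dir

with zipfile.ZipFile(bundle) as zf:
    names = zf.namelist()
```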

For more information about how to use JMeter scripts, refer to [JMeter User’s Manual](https://jmeter.apache.org/usermanual/index.html).

### K6 tests
<a name="k6-script-support"></a>

The solution supports K6 framework-based testing. K6 is released under the [AGPL-3.0 license](https://github.com/grafana/k6/blob/master/LICENSE.md). The solution displays a license acknowledgment message when creating a new K6 test. You can upload the K6 test file along with any necessary input files in an archive file.

**Important**  
Although your K6 script may define concurrency (virtual users), stages, thresholds, and other load parameters, the solution will override these configurations with the values you specify in the Traffic Shape screen during test creation. The Traffic Shape configuration controls the task count, concurrency (virtual users per task), ramp-up duration, and hold duration for the test execution.

### Locust tests
<a name="locust-script-support"></a>

The solution supports Locust framework-based testing. You can upload the Locust test file along with any necessary input files in an archive file.

**Important**  
Although your Locust script may define concurrency (user count), spawn rate, and other load parameters, the solution will override these configurations with the values you specify in the Traffic Shape screen during test creation. The Traffic Shape configuration controls the task count, concurrency (virtual users per task), ramp-up duration, and hold duration for the test execution.

## Scheduling tests
<a name="scheduling-tests"></a>

The solution provides three execution timing options for running load tests:
+  **Run Now** - Execute the load test immediately after creation
+  **Run Once** - Execute the test on a specific date and time in the future
+  **Run on a Schedule** - Create recurring tests using cron expressions to define the schedule

When you select **Run Once**, you specify the run time in 24-hour format and the run date when the load test should start running.

When you select **Run on a Schedule**, you can either manually enter a cron expression or select from common cron patterns (such as every hour, daily at a specific time, weekdays, or monthly). The cron expression uses a fine-grained schedule format with fields for minutes, hours, day of month, month, day of week, and year. You must also specify an expiry date, which defines when the scheduled test should stop running. For more information on how scheduling works, refer to the [Test scheduling workflow](https://docs.aws.amazon.com/solutions/latest/distributed-load-testing-on-aws/test-scheduling-workflow.html) section of this guide.
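As a sketch, a six-field cron expression of the format described above splits into its named fields like this; the example expression is illustrative:

```python
CRON_FIELDS = ["minutes", "hours", "day-of-month", "month", "day-of-week", "year"]

def parse_cron(expression: str) -> dict:
    """Split a six-field cron expression into its named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))

# Every weekday at 09:30:
schedule = parse_cron("30 9 ? * MON-FRI *")
```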

**Note**  
Consider the following when scheduling tests:
+  **Test duration** - Consider the total duration of tests when scheduling. For example, a test with a 10-minute ramp-up time and 40-minute hold time will take approximately 80 minutes to complete.
+  **Minimum interval** - Ensure the interval between scheduled tests is longer than the estimated test duration. For example, if the test takes about 80 minutes, schedule it to run no more frequently than every 3 hours.
+  **Hourly limitation** - The system does not allow tests to be scheduled with only a one-hour difference even if the estimated test duration is less than an hour.
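A rough planning helper for this guidance might look like the following sketch. The 30-minute overhead constant is a guess chosen so that the guide's example (a 10-minute ramp-up plus 40-minute hold taking roughly 80 minutes) works out; actual overhead varies with task count and Region:

```python
# Assumed provisioning/teardown overhead, in minutes; a guess for
# illustration, not a documented value.
OVERHEAD_MINUTES = 30

def estimated_duration_minutes(ramp_up: int, hold_for: int) -> int:
    """Estimate total test duration including assumed overhead."""
    return ramp_up + hold_for + OVERHEAD_MINUTES

def interval_is_safe(interval_minutes: int, ramp_up: int, hold_for: int) -> bool:
    """Check a schedule interval against the estimated duration and the
    one-hour minimum-spacing rule described above."""
    est = estimated_duration_minutes(ramp_up, hold_for)
    return interval_minutes > max(est, 60)

ok = interval_is_safe(180, ramp_up=10, hold_for=40)  # every 3 hours
```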

## Concurrent tests
<a name="concurrent-tests"></a>

This solution creates an Amazon CloudWatch dashboard for each test that displays the combined output of all tasks running in the Amazon ECS cluster in real time. The CloudWatch dashboard shows average response time, number of concurrent users, number of successful requests, and number of failed requests. The solution aggregates each metric by the second and updates the dashboard every minute.

## User management
<a name="user-management"></a>

During initial configuration, you provide a username and email address that Amazon Cognito uses to grant you access to the solution’s web console. The console does not provide user administration. To add additional users, you must use the Amazon Cognito console. For more information, refer to [Managing Users in User Pools](https://docs.aws.amazon.com/cognito/latest/developerguide/managing-users.html) in the *Amazon Cognito Developer Guide*.

For migrating existing users to Amazon Cognito user pools, refer to the AWS blog [Approaches for migrating users to Amazon Cognito user pools](https://aws.amazon.com/blogs/security/approaches-for-migrating-users-to-amazon-cognito-user-pools).

## Regional deployment
<a name="regional-deployment"></a>

This solution uses Amazon Cognito, which is available in specific AWS Regions only. Therefore, you must deploy this solution in a Region where Amazon Cognito is available. For the most current service availability by Region, refer to the [AWS Regional Services List](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/).