
Plan your deployment

This section describes the prerequisites, supported Regions, cost, security, quota, deployment sizing, and disaster recovery considerations to review before deploying the guidance.

Prerequisites

Before deploying the guidance, ensure you have the following prerequisites:

Software requirements

Required software:

AWS CLI v2 - Command line tool for AWS
Node.js 18.x or later - JavaScript runtime
Python 3.9 or later - Python runtime
AWS CDK v2.100.0 or later - Infrastructure as code framework
Make - Build automation tool
Git - Version control system
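
As a quick pre-flight check, the following minimal sketch reports which of the required tools are missing from your PATH. The executable names (aws, node, python3, cdk, make, git) are assumptions about a typical installation, and the script does not verify version numbers.

```python
import shutil

# Executable names are assumptions about a typical install; adjust as needed.
REQUIRED_TOOLS = ["aws", "node", "python3", "cdk", "make", "git"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be located on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Confirm the versions listed above separately, for example by running each tool with its version flag.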

AWS account requirements

Account setup:

An AWS account with appropriate IAM permissions
AWS credentials configured (via aws configure or environment variables)
Sufficient service quotas for the resources being deployed
CDK bootstrap completed in the target account and Region

Required IAM permissions:

The IAM user or role deploying the guidance needs permissions to create and manage:

AWS CloudFormation stacks
IAM roles and policies
Amazon VPC and networking resources
Amazon DynamoDB tables
Amazon S3 buckets
AWS IoT Core resources
Amazon MSK clusters
Amazon Managed Service for Apache Flink applications (formerly Amazon Kinesis Data Analytics)
AWS Lambda functions
Amazon API Gateway APIs
Amazon Cognito user pools
Amazon Location Service resources
Amazon CloudFront distributions
Amazon ElastiCache clusters

We recommend using the AdministratorAccess managed policy for initial deployment, then creating a custom policy with least-privilege permissions for production deployments.

Network requirements

VPC considerations:

The solution creates a new VPC by default with the following configuration:

CIDR block: 10.0.0.0/16 (customizable)
Public subnets: 2 (across 2 Availability Zones)
Private subnets: 2 (across 2 Availability Zones)
NAT Gateway: 1 per Availability Zone
Internet Gateway: 1
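
To illustrate how the default address space could be divided, the following sketch splits the VPC CIDR block into two public and two private subnets with Python's ipaddress module. The /24 subnet size and the allocation order are assumptions for illustration; the guidance does not state the subnet masks it uses.

```python
import ipaddress


def plan_subnets(vpc_cidr="10.0.0.0/16", new_prefix=24, az_count=2):
    """Carve public and private subnets (one of each per AZ) out of a VPC CIDR.

    new_prefix=24 is an assumed subnet size, not a value from the guidance.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)
    public = [next(blocks) for _ in range(az_count)]
    private = [next(blocks) for _ in range(az_count)]
    return {"public": public, "private": private}


plan = plan_subnets()
for tier, networks in plan.items():
    for az_index, network in enumerate(networks, start=1):
        print(f"{tier} subnet, AZ {az_index}: {network} ({network.num_addresses} addresses)")
```

AWS reserves five addresses in every subnet, so the usable count is slightly lower than the number printed.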

Existing VPC:

To use an existing VPC, set the VPC_ID environment variable before deployment. The VPC must have:

At least 2 private subnets across 2 Availability Zones
Internet connectivity via NAT Gateway or NAT instance
Sufficient IP address space for MSK and ElastiCache
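
The subnet requirement above can be checked before deployment with a sketch like the following. The subnet records and their field names (id, az, private) are hypothetical; in practice you would build them from your own inventory of the VPC.

```python
import os


def check_private_subnets(subnets, min_subnets=2, min_azs=2):
    """Return a list of problems; an empty list means the requirement is met."""
    private = [s for s in subnets if s["private"]]
    zones = {s["az"] for s in private}
    problems = []
    if len(private) < min_subnets:
        problems.append(f"need {min_subnets} private subnets, found {len(private)}")
    if len(zones) < min_azs:
        problems.append(f"need {min_azs} Availability Zones, found {len(zones)}")
    return problems


# VPC_ID is the variable named in this section; when it is unset, the
# solution creates a new VPC instead.
vpc_id = os.environ.get("VPC_ID")

# Hypothetical inventory of the existing VPC.
example_subnets = [
    {"id": "subnet-a", "az": "us-east-1a", "private": True},
    {"id": "subnet-b", "az": "us-east-1b", "private": True},
    {"id": "subnet-c", "az": "us-east-1a", "private": False},
]
print(vpc_id or "VPC_ID not set", check_private_subnets(example_subnets))
```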

Supported AWS Regions

For the most current availability of AWS services by Region, see the AWS Regional Services List.

Guidance for Connected Mobility on AWS is supported in the following AWS Regions:

Region name	Region code
US East (Ohio)	us-east-2
US East (N. Virginia)	us-east-1
US West (Oregon)	us-west-2
Europe (Ireland)	eu-west-1
Europe (Frankfurt)	eu-central-1
Asia Pacific (Tokyo)	ap-northeast-1
Asia Pacific (Sydney)	ap-southeast-2

Note

Not all AWS services are available in all Regions. Verify that Amazon MSK, Amazon Managed Service for Apache Flink (formerly Amazon Kinesis Data Analytics), and Amazon Location Service are available in your target Region before deployment.

Cost

You are responsible for the cost of the AWS services used while running this solution. Prices are subject to change. For full details, see the pricing webpage for each AWS service used in this solution.

We recommend creating a budget through AWS Cost Explorer to help manage costs.

Cost model assumptions

The cost estimates below are based on the following telemetry profile per vehicle:

Telemetry frequency: 1 message every 2 seconds while driving (260 signals per message)
Average driving time: 2 hours per day per vehicle
Messages per vehicle per day: ~3,600
Message size: ~2 KB (compressed JSON with 260 signals)
Data per vehicle per month: ~216 MB
Active fleet percentage: 30% of vehicles driving at any given time during peak hours
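
The per-vehicle figures in this profile follow from simple arithmetic, sketched below. The 30-day month and the decimal conversion (1 MB = 1,000 KB) are assumptions chosen to reproduce the stated values.

```python
SECONDS_PER_MESSAGE = 2      # 1 message every 2 seconds while driving
DRIVING_HOURS_PER_DAY = 2
MESSAGE_SIZE_KB = 2          # compressed JSON
DAYS_PER_MONTH = 30          # assumption


def messages_per_day(hours=DRIVING_HOURS_PER_DAY, period_s=SECONDS_PER_MESSAGE):
    """Messages one vehicle sends per day."""
    return hours * 3600 // period_s


def megabytes_per_month(size_kb=MESSAGE_SIZE_KB, days=DAYS_PER_MONTH):
    """Data one vehicle sends per month, using 1 MB = 1,000 KB."""
    return messages_per_day() * size_kb * days / 1000


print(messages_per_day(), "messages per vehicle per day")
print(megabytes_per_month(), "MB per vehicle per month")
```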

All estimates use US East (N. Virginia) pricing as of March 2026.

Cost by fleet size

Fleet Size	MSK	Flink	Infrastructure	Application	Total/Month
100 vehicles	$67	$108	$58	$32	~$265
500 vehicles	$67	$108	$58	$35	~$268
1,000 vehicles	$194	$108	$58	$40	~$400
5,000 vehicles	$194	$324	$58	$55	~$631
10,000 vehicles	$389	$540	$71	$75	~$1,075
25,000 vehicles	$583	$864	$84	$120	~$1,651
50,000 vehicles	$972	$1,296	$97	$200	~$2,565
100,000 vehicles	$1,944	$2,160	$130	$350	~$4,584

Cost per vehicle per month:

Fleet Size	Total/Month	Per Vehicle
100	$265	$2.65
1,000	$400	$0.40
10,000	$1,075	$0.11
50,000	$2,565	$0.05
100,000	$4,584	$0.05

The per-vehicle cost falls from $2.65 at 100 vehicles to about $0.05 at 50,000 vehicles because the fixed infrastructure costs (MSK cluster, NAT Gateway, ElastiCache) are amortized across more vehicles. The breakpoint for cost efficiency is around 5,000–10,000 vehicles, where per-vehicle cost drops below $0.15/month.
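
The per-vehicle figures can be reproduced from the monthly totals in the fleet-size table. The following sketch divides each total by the fleet size and finds the smallest listed fleet whose per-vehicle cost falls below a threshold; the function names are illustrative.

```python
# Monthly totals (USD) copied from the "Cost by fleet size" table.
MONTHLY_TOTAL_USD = {
    100: 265,
    500: 268,
    1_000: 400,
    5_000: 631,
    10_000: 1_075,
    25_000: 1_651,
    50_000: 2_565,
    100_000: 4_584,
}


def per_vehicle_cost(fleet_size):
    """Monthly cost per vehicle for a fleet size listed in the table."""
    return MONTHLY_TOTAL_USD[fleet_size] / fleet_size


def smallest_fleet_below(threshold_usd):
    """Smallest listed fleet size whose per-vehicle cost is under the threshold."""
    for size in sorted(MONTHLY_TOTAL_USD):
        if per_vehicle_cost(size) < threshold_usd:
            return size
    return None


for size in sorted(MONTHLY_TOTAL_USD):
    print(f"{size:>7,} vehicles: ${per_vehicle_cost(size):.2f} per vehicle")
print("First fleet size under $0.15:", smallest_fleet_below(0.15))
```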

Detailed cost breakdown

Amazon MSK (message streaming)

MSK is one of the two largest cost drivers (roughly 25-50% of the total in the fleet-size table above). Cost scales with broker count and instance size.

Fleet Size	Configuration	Messages/sec (peak)	Monthly Cost
Up to 500	3 × kafka.t3.small (2 vCPU, 2 GB)	~250	$67
500–5,000	3 × kafka.m5.large (2 vCPU, 8 GB)	~2,500	$194
5,000–25,000	3 × kafka.m5.xlarge (4 vCPU, 16 GB)	~12,500	$389–583
25,000–100,000	6 × kafka.m5.xlarge	~50,000	$972–1,944

Scaling trigger: Upgrade when average broker CPU exceeds 60% or when consumer lag exceeds 30 seconds.

Storage: 100 GB per broker included. Telemetry retention is 7 days. At 1,000 vehicles, daily ingest is ~7 GB, so 100 GB per broker provides comfortable headroom.
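
The storage headroom can be estimated with a sketch like the following, which multiplies daily ingest by the retention period. The replication factor is an assumption (this section does not state one); a factor of 3 is shown only to illustrate how replication affects the total.

```python
def retained_gb(vehicles, messages_per_vehicle_per_day=3600, message_kb=2,
                retention_days=7, replication_factor=1):
    """Approximate cluster-wide storage (GB) held at steady state.

    Uses 1 GB = 1,000,000 KB. replication_factor is an assumed parameter.
    """
    daily_gb = vehicles * messages_per_vehicle_per_day * message_kb / 1_000_000
    return daily_gb * retention_days * replication_factor


CLUSTER_STORAGE_GB = 3 * 100  # 3 brokers x 100 GB each

unreplicated = retained_gb(1_000)
replicated = retained_gb(1_000, replication_factor=3)  # assumed factor
print(f"Retained without replication: {unreplicated:.1f} GB")
print(f"Retained with 3x replication: {replicated:.1f} GB of {CLUSTER_STORAGE_GB} GB")
```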

Amazon Managed Service for Apache Flink (stream processing)

Flink is the other major cost driver (roughly 25-50% of the total in the fleet-size table above). Cost scales with KPU count. Each KPU provides 1 vCPU and 4 GB memory at $0.15/hour ($108/month).

The solution runs 7-10 Flink applications. In development, each application uses 1 KPU. In production, data-path processors (SimulatorPreprocessor, EventDrivenTelemetryProcessor, TripProcessor) may need 2-4 KPUs each.

Fleet Size	KPU Allocation	Monthly Cost
Up to 1,000	1 KPU × 10 apps = 10 KPUs	$108 (min billing)
1,000–5,000	2 KPU × 3 critical + 1 KPU × 7 = 13 KPUs	$324
5,000–10,000	3 KPU × 3 critical + 1 KPU × 7 = 16 KPUs	$540
10,000–25,000	4 KPU × 3 critical + 2 KPU × 7 = 26 KPUs	$864
25,000–50,000	4 KPU × 5 critical + 2 KPU × 5 = 30 KPUs	$1,296
50,000–100,000	6 KPU × 5 critical + 3 KPU × 5 = 45 KPUs	$2,160

Scaling trigger: Add KPUs when millisBehindLatest exceeds 5,000ms or when checkpoint duration exceeds 50% of the checkpoint interval.
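
The scaling trigger above can be expressed as a small predicate. This is a sketch only: the function and parameter names are illustrative, and the metric values would come from your own monitoring.

```python
def should_add_kpus(millis_behind_latest, checkpoint_duration_ms,
                    checkpoint_interval_ms, lag_threshold_ms=5_000,
                    checkpoint_ratio=0.5):
    """Return True when either scaling condition from this section is met."""
    lagging = millis_behind_latest > lag_threshold_ms
    slow_checkpoints = checkpoint_duration_ms > checkpoint_ratio * checkpoint_interval_ms
    return lagging or slow_checkpoints


# Example: lag is fine, but a 40 s checkpoint exceeds 50% of a 60 s interval.
print(should_add_kpus(1_000, 40_000, 60_000))
```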

Infrastructure (VPC, ElastiCache, NAT Gateway)

These are mostly fixed costs that do not scale linearly with fleet size.

Service	Configuration	Monthly Cost
NAT Gateway	2 AZs × $0.045/hour + data processing	$32–65
ElastiCache for Redis	cache.t3.micro (dev) to cache.r6g.large (prod)	$12–130
VPC endpoints	DynamoDB + S3 (gateway, free) + IoT Core (interface)	$14

ElastiCache scaling: The Redis node must hold the Last Known State for all active vehicles. Each vehicle uses ~5 KB in Redis (signals hash + timestamps hash + meta hash + stream). At 10,000 vehicles, that is ~50 MB — well within a cache.t3.micro (0.5 GB). Upgrade to cache.t3.small at 50,000+ vehicles or if geospatial query latency exceeds 5ms.
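
The Redis sizing estimate is a straightforward multiplication, sketched below. It uses decimal units (1 MB = 1,000 KB) and ignores Redis overhead such as replication buffers, so treat the result as a lower bound.

```python
def redis_memory_mb(vehicles, kb_per_vehicle=5):
    """Approximate Last Known State footprint in MB (1 MB = 1,000 KB)."""
    return vehicles * kb_per_vehicle / 1000


NODE_MEMORY_MB = 500  # cache.t3.micro, ~0.5 GB as stated above

for fleet in (10_000, 50_000, 100_000):
    used = redis_memory_mb(fleet)
    print(f"{fleet:>7,} vehicles: {used:.0f} MB ({used / NODE_MEMORY_MB:.0%} of node memory)")
```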

Application layer (Lambda, API Gateway, IoT Core, DynamoDB, S3)

These costs scale with usage but remain a small percentage of total cost.

Service	Cost Driver	Monthly Cost (1K vehicles)
AWS IoT Core	$1.00 per million messages	$3.24
Amazon DynamoDB	On-demand read/write capacity	$3.50
AWS Lambda	$0.20 per million invocations	$20.00
Amazon API Gateway	$3.50 per million calls	$3.50
Amazon CloudFront	$0.085 per GB transfer	$8.50
Amazon Location Service	$0.04 per 1K map tiles	$8.00
Amazon S3	$0.023 per GB storage	$1.50
Amazon Cognito	Free tier (50K MAU)	$0.00

IoT Core message cost detail: At 1,000 vehicles × 3,600 messages/day = 3.6M messages/day = 108M messages/month. IoT Core charges $1.00 per million messages (first 1B) and meters messages in 5 KB increments. Compressed telemetry is ~2 KB, so each message counts as 1 unit, and 108M × $1.00/M = $108/month. For 100 vehicles, this drops to $10.80/month.
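
The calculation above can be sketched as follows, including the 5 KB metering increment. The rate and message profile are the values quoted in this section; the 30-day month is an assumption.

```python
import math


def iot_core_monthly_cost(vehicles, messages_per_vehicle_per_day=3600,
                          message_kb=2.0, days=30, usd_per_million=1.00,
                          metering_increment_kb=5):
    """Estimate monthly IoT Core messaging cost in USD."""
    # Each message is billed in whole 5 KB units, rounded up.
    units_per_message = math.ceil(message_kb / metering_increment_kb)
    billable_units = vehicles * messages_per_vehicle_per_day * days * units_per_message
    return billable_units / 1_000_000 * usd_per_million


print(iot_core_monthly_cost(1_000))                 # ~2 KB compressed messages
print(iot_core_monthly_cost(1_000, message_kb=8))   # ~8 KB messages need 2 units each
```

Under these assumptions, an 8 KB message is billed as two units, so compressing it to 2 KB halves the messaging charge.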

DynamoDB cost detail: On-demand pricing is $1.25 per million write request units and $0.25 per million read request units. The stateful TripProcessor design reduces writes by 80% compared to a stateless approach (see Trip lifecycle).
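
To see what an 80% write reduction means in cost terms, consider the following sketch. The baseline of 10 million writes per month is a hypothetical figure, and the calculation assumes each write is 1 KB or smaller, so that it consumes one write request unit.

```python
USD_PER_MILLION_WRITES = 1.25  # on-demand write request units, as quoted above


def monthly_write_cost(write_requests):
    """On-demand write cost in USD, assuming one write request unit per write."""
    return write_requests / 1_000_000 * USD_PER_MILLION_WRITES


def reduced_writes(baseline_writes, reduction_percent=80):
    """Writes remaining after the stated percentage reduction."""
    return baseline_writes * (100 - reduction_percent) // 100


baseline = 10_000_000  # hypothetical stateless write volume per month
print(f"Stateless: ${monthly_write_cost(baseline):.2f}")
print(f"Stateful:  ${monthly_write_cost(reduced_writes(baseline)):.2f}")
```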

Cost breakpoints and optimization

The cost curve has three distinct regions:

Under 500 vehicles (~$265/month): Fixed infrastructure dominates. MSK and Flink minimum billing account for 66% of cost. Per-vehicle cost is high ($0.53–$2.65) but total cost is low. Use kafka.t3.small brokers and 1 KPU per Flink app.

500–10,000 vehicles (~$400–$1,075/month): The sweet spot. Infrastructure costs are amortized, and usage-based costs (IoT Core, DynamoDB) are still modest. Per-vehicle cost drops to $0.11–$0.40. This is where the architecture is most cost-efficient relative to capability.

Over 10,000 vehicles (~$1,075+/month): Usage-based costs begin to dominate. MSK and Flink need to scale horizontally. Per-vehicle cost flattens at ~$0.05. At this scale, consider:

MSK Serverless instead of provisioned — eliminates broker sizing decisions and can reduce cost for bursty workloads
Provisioned DynamoDB capacity with auto-scaling instead of on-demand — 5-10x cheaper for predictable write patterns
S3 Intelligent-Tiering for telemetry archives — automatically moves cold data to cheaper storage classes
Reserved capacity for Flink KPUs if available — reduces hourly rate

Cost optimization strategies

Development environment:

Change	Impact	Savings
kafka.t3.small instead of m5.large	Sufficient for <500 vehicles	$127/month
1 KPU per Flink app (minimum)	Sufficient for <1,000 vehicles	$0 (already minimum)
cache.t3.micro for Redis	Sufficient for <10,000 vehicles	$0 (already minimum)
Single NAT Gateway (1 AZ)	Reduced availability	$16/month
Total development savings		~$143/month

Production optimizations:

DynamoDB TTL: Enable TTL on telemetry records (30 days), safety events (90 days), and commands (7 days) to automatically delete old data and reduce storage costs.
S3 lifecycle policies: Transition telemetry archives to S3 Glacier after 90 days (saves ~$0.02/GB/month).
CloudWatch log retention: Set log retention to 30 days for development, 90 days for production (default is indefinite).
Flink checkpointing: Increase checkpoint interval from 60s to 120s for non-critical processors to reduce state backend I/O.
IoT Core message compression: The simulator compresses telemetry with gzip, reducing message size from ~8 KB to ~2 KB (a 75% size reduction). Because IoT Core meters messages in 5 KB increments, each message drops from 2 billable units to 1.
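
DynamoDB TTL expects a numeric attribute holding the expiry time in Unix epoch seconds. The following sketch computes that value for the retention periods listed above; the record type keys and the function name are hypothetical.

```python
import time

SECONDS_PER_DAY = 86_400

# Retention periods from the list above; the keys are hypothetical labels.
TTL_DAYS = {"telemetry": 30, "safety_event": 90, "command": 7}


def expires_at(record_type, now=None):
    """Return the epoch-seconds value to store in the item's TTL attribute."""
    current = time.time() if now is None else now
    return int(current) + TTL_DAYS[record_type] * SECONDS_PER_DAY


print(expires_at("telemetry"))
```

Store the returned value in whichever attribute you configure as the table's TTL attribute.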

Security

Data protection

Encryption at rest:

All DynamoDB tables use AWS-managed encryption keys
All S3 buckets use AES-256 encryption
MSK cluster uses encryption at rest
ElastiCache uses encryption at rest

Encryption in transit:

All API calls use TLS 1.2 or higher
MSK client connections use TLS
IoT Core connections use TLS with X.509 certificates
CloudFront uses TLS 1.2 minimum

Identity and access management

Authentication:

Amazon Cognito manages user authentication for Fleet Manager UI
AWS IoT Core uses X.509 certificates for vehicle authentication
IAM roles control service-to-service communication

Authorization:

IAM policies follow least-privilege principles
IoT policies restrict device access to specific topics
API Gateway uses Cognito authorizers
Lambda functions have minimal required permissions
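
As an illustration of restricting a device to specific topics, the following sketch builds an IoT policy document that uses the iot:Connection.Thing.ThingName policy variable. The topic layout (vehicles/<thing name>/telemetry), the account ID, and the function name are assumptions, not the policy that the guidance deploys.

```python
import json


def vehicle_policy(region, account_id, topic_prefix="vehicles"):
    """Build a policy that lets a device connect and publish only as itself.

    topic_prefix and the topic layout are hypothetical.
    """
    thing = "${iot:Connection.Thing.ThingName}"  # resolved by AWS IoT at connect time
    base = f"arn:aws:iot:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iot:Connect",
                "Resource": f"{base}:client/{thing}",
            },
            {
                "Effect": "Allow",
                "Action": "iot:Publish",
                "Resource": f"{base}:topic/{topic_prefix}/{thing}/telemetry",
            },
        ],
    }


policy = vehicle_policy("us-east-1", "123456789012")  # placeholder account ID
print(json.dumps(policy, indent=2))
```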

Network security

VPC isolation:

MSK cluster runs in private subnets
ElastiCache runs in private subnets
Security groups restrict traffic between components
No direct internet access to data stores

API security:

API Gateway endpoints require authentication
CloudFront uses signed URLs for sensitive content
CORS policies restrict cross-origin requests

Monitoring and logging

CloudWatch Logs:

All Lambda functions log to CloudWatch
Flink applications log to CloudWatch
API Gateway logs all requests
Solution default log retention: 90 days

CloudTrail:

All API calls are logged to CloudTrail
CloudTrail logs stored in S3 with encryption
Log file integrity validation enabled

Compliance

This solution uses AWS services that support various compliance programs:

SOC 1, 2, 3
PCI DSS Level 1
ISO 27001, 27017, 27018
HIPAA eligible services
GDPR compliant

For the most current compliance information, see AWS Services in Scope by Compliance Program.

Quotas

Service quotas, also referred to as limits, are the maximum number of service resources or operations for your AWS account.

Service quotas to verify

Before deploying the guidance, verify you have sufficient quotas for the following services:

Amazon MSK:

Clusters per Region: Default 20 (need 1)
Brokers per cluster: Default 30 (need 3-6, depending on fleet size)
Configuration revisions: Default 50

Amazon Managed Service for Apache Flink (formerly Amazon Kinesis Data Analytics):

Applications per Region: Default 50 (need 7-10)
KPUs per application: Default 32 (need 1-6)

AWS IoT Core:

Things per account: Default 500,000
Certificates per account: Default 500,000
Policies per account: Default 1,000
Message broker connections: Default 500,000

Amazon DynamoDB:

Tables per Region: Default 2,500 (need 4)
On-demand read/write capacity: No limit

AWS Lambda:

Concurrent executions: Default 1,000
Function storage: Default 75 GB

Amazon VPC:

VPCs per Region: Default 5 (need 1)
Subnets per VPC: Default 200 (need 4)
Security groups per VPC: Default 2,500
NAT gateways per AZ: Default 5 (need 1 per AZ, 2 in total)

Amazon ElastiCache:

Nodes per Region: Default 300 (need 1)
Clusters per Region: Default 300 (need 1)
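
A quota check reduces to comparing what the deployment needs against what remains in the account. The sketch below uses a subset of the default values listed above with the smallest deployment configuration; the dictionary keys are hypothetical names, and your account's applied quotas and current usage may differ from the defaults.

```python
# Default quotas and needs copied from the lists above (subset).
DEFAULT_QUOTAS = {
    "msk_clusters_per_region": 20,
    "msk_brokers_per_cluster": 30,
    "dynamodb_tables_per_region": 2_500,
    "vpcs_per_region": 5,
    "subnets_per_vpc": 200,
    "elasticache_nodes_per_region": 300,
}
REQUIRED = {
    "msk_clusters_per_region": 1,
    "msk_brokers_per_cluster": 3,
    "dynamodb_tables_per_region": 4,
    "vpcs_per_region": 1,
    "subnets_per_vpc": 4,
    "elasticache_nodes_per_region": 1,
}


def quota_shortfalls(required, quotas, in_use=None):
    """Return {quota_name: missing_amount} for quotas that are too low."""
    in_use = in_use or {}
    shortfalls = {}
    for name, needed in required.items():
        remaining = quotas.get(name, 0) - in_use.get(name, 0)
        if needed > remaining:
            shortfalls[name] = needed - remaining
    return shortfalls


# Example: an account that already has 5 VPCs in the Region.
print(quota_shortfalls(REQUIRED, DEFAULT_QUOTAS, in_use={"vpcs_per_region": 5}))
```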

Requesting quota increases

If you need to increase service quotas:

Open the Service Quotas console
Select the service
Select the quota
Choose Request quota increase
Enter the new quota value
Submit the request

Most quota increases are processed within 24-48 hours.

Deployment sizing

Small fleet (100-1,000 vehicles)

Recommended configuration:

MSK: 3 × kafka.t3.small brokers
ElastiCache: cache.t3.micro
Flink: 1 KPU per application
DynamoDB: On-demand billing

Expected cost: ~$250-300/month

Telemetry capacity:

Messages per second: ~100-500
Daily messages: ~8-40 million
Storage per month: ~10-50 GB

Medium fleet (1,000-10,000 vehicles)

Recommended configuration:

MSK: 3 × kafka.m5.large brokers
ElastiCache: cache.t3.small
Flink: 2 KPUs per application
DynamoDB: On-demand billing

Expected cost: ~$410-600/month

Telemetry capacity:

Messages per second: ~500-2,000
Daily messages: ~40-170 million
Storage per month: ~50-200 GB

Large fleet (10,000+ vehicles)

Recommended configuration:

MSK: 6 × kafka.m5.xlarge brokers
ElastiCache: cache.r6g.large (cluster mode)
Flink: 4 KPUs per application
DynamoDB: Provisioned capacity with auto-scaling

Expected cost: ~$1,200-2,000/month

Telemetry capacity:

Messages per second: ~2,000-10,000
Daily messages: ~170-860 million
Storage per month: ~200-1,000 GB
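
The three tiers above can be captured in a small lookup, sketched below. The tier ranges share their boundary values, so this sketch assumes that a fleet of exactly 1,000 vehicles is medium and a fleet of exactly 10,000 is large; adjust the cutoffs to suit your own planning.

```python
# Recommended configurations copied from the three tiers above.
SIZING_TIERS = {
    "small": {"msk": "3 x kafka.t3.small", "elasticache": "cache.t3.micro",
              "flink_kpus_per_app": 1, "dynamodb": "on-demand"},
    "medium": {"msk": "3 x kafka.m5.large", "elasticache": "cache.t3.small",
               "flink_kpus_per_app": 2, "dynamodb": "on-demand"},
    "large": {"msk": "6 x kafka.m5.xlarge", "elasticache": "cache.r6g.large (cluster mode)",
              "flink_kpus_per_app": 4, "dynamodb": "provisioned with auto-scaling"},
}


def recommend_tier(vehicles):
    """Map a fleet size to a tier name (boundary handling is an assumption)."""
    if vehicles < 1_000:
        return "small"
    if vehicles < 10_000:
        return "medium"
    return "large"


tier = recommend_tier(4_000)
print(tier, SIZING_TIERS[tier])
```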

Performance considerations

Message throughput:

Each MSK broker handles ~1,000 messages/second
Flink applications process ~2,000 messages/second per KPU
DynamoDB on-demand scales automatically
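
Using the rule-of-thumb figures above, the following sketch estimates end-to-end throughput and identifies which component limits it. It treats the per-broker and per-KPU rates as fixed and considers only a single data-path Flink application, both of which are simplifying assumptions; actual throughput depends on instance size and workload.

```python
MESSAGES_PER_SEC_PER_BROKER = 1_000
MESSAGES_PER_SEC_PER_KPU = 2_000


def pipeline_capacity(brokers, kpus):
    """Return (messages_per_second, bottleneck) for a broker and KPU count."""
    msk_capacity = brokers * MESSAGES_PER_SEC_PER_BROKER
    flink_capacity = kpus * MESSAGES_PER_SEC_PER_KPU
    if flink_capacity < msk_capacity:
        return flink_capacity, "flink"
    return msk_capacity, "msk"


print(pipeline_capacity(3, 1))  # one KPU limits a 3-broker cluster
print(pipeline_capacity(3, 2))  # with two KPUs, the brokers become the limit
```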

Latency targets:

IoT Core to MSK: <100ms
MSK to Flink: <500ms
Flink to DynamoDB: <200ms
API response time: <500ms
ElastiCache lookup: <10ms

Scaling triggers:

MSK CPU > 70%: Add brokers
Flink lag > 60 seconds: Add KPUs
DynamoDB throttling: Increase capacity
ElastiCache CPU > 75%: Upgrade node type

Disaster recovery

Backup strategy

Automated backups:

DynamoDB: Point-in-time recovery enabled (35 days)
S3: Versioning enabled on all buckets
MSK: Automatic snapshots (not exposed to users)

Manual backups:

Export DynamoDB tables to S3 for long-term retention
Backup IoT certificates and policies
Export Cognito user pool configuration

Recovery time objectives

RTO (Recovery Time Objective):

Phase 1-2 recovery: 10-15 minutes
Phase 3 recovery (MSK): 15-20 minutes
Phase 5 recovery (Flink): 5-10 minutes
Full stack recovery: 40-60 minutes

RPO (Recovery Point Objective):

DynamoDB: Up to 5 minutes (PITR)
S3: Zero data loss (versioning)
Telemetry in-flight: Up to 5 minutes

Multi-region considerations

For high availability across Regions:

Deploy the solution in multiple Regions
Use Route 53 for DNS failover
Replicate DynamoDB tables with Global Tables
Use S3 Cross-Region Replication for archives
Configure IoT Core custom domains for failover

Note

Multi-region deployment increases costs by 2-3x but provides geographic redundancy and lower latency for global fleets.
