LSPERF19-BP02 Establish comprehensive performance metrics and evidence collection

Establish measurable performance metrics specific to scientific workflows before load testing. Capture traditional indicators alongside domain-specific measurements for research and clinical data processing. Implement automated evidence collection to create audit trails, inform optimization decisions, and verify system reliability. Centralize test results to enable trend analysis and detect regressions across cycles.

Desired outcome: Implement a comprehensive scientific workflow performance measurement framework that establishes metrics before testing, captures domain-specific indicators, automates evidence collection, and centralizes results for trend analysis and regression detection.

Level of risk exposed if this best practice is not established: Medium

Implementation guidance

Effective performance testing for generative AI systems requires establishing comprehensive, measurable metrics specifically tailored to scientific workflows before initiating load testing activities. Organizations should develop a balanced measurement framework that captures traditional technical indicators (such as latency, throughput, and resource utilization) alongside domain-specific scientific measurements that reflect research quality and clinical data processing integrity.

Implement automated evidence collection mechanisms that systematically document test results, creating detailed audit trails that satisfy regulatory requirements while simultaneously providing valuable data for optimization decisions. These automated systems should capture performance data at regular intervals during testing, correlating system behaviors with specific workloads to identify optimization opportunities and potential bottlenecks.

Centralize test results in a structured repository that enables sophisticated trend analysis across multiple test cycles, facilitating the early detection of performance regressions that might impact scientific outcomes.

Configure dashboards to visualize both point-in-time performance and longitudinal trends, making complex performance data accessible to both technical and scientific stakeholders.

Implementation steps

Define metrics in Amazon CloudWatch using custom dimensions for scientific workflows.
Deploy AWS X-Ray traces to capture both technical and domain-specific indicators.
Implement AWS Audit Manager for automated evidence collection.
Use Amazon S3 and Amazon Athena to centralize and query performance test results.
Create Quick dashboards for trend analysis across test cycles.

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

LSPERF19-BP01 Implement infrastructure as code for consistent test environments

LSPERF19-BP03 Schedule regular performance tests with production-representative data loads