Content Domain 3: Continuous Improvement for Existing Solutions - AWS Certification

Task 3.1: Determine a strategy to improve overall operational excellence.

Knowledge of:

  • Alerting and automatic remediation strategies

  • Disaster recovery planning

  • Monitoring and logging solutions (for example, Amazon CloudWatch)

  • CI/CD pipelines and deployment strategies (for example, blue/green, all-at-once, rolling)

  • Configuration management tools (for example, AWS Systems Manager)

Skills in:

  • Determining the most appropriate logging and monitoring strategy

  • Evaluating current deployment processes for improvement opportunities

  • Prioritizing opportunities for automation within a solution stack

  • Recommending the appropriate AWS solution to enable configuration management automation

  • Engineering failure scenario activities to support and exercise an understanding of recovery actions

Task 3.2: Determine a strategy to improve security.

Knowledge of:

  • Data retention, data sensitivity, and data regulatory requirements

  • Automated monitoring and remediation strategies (for example, AWS Config rules)

  • Secrets management (for example, Systems Manager, AWS Secrets Manager)

  • Principle of least privilege access

  • Security-specific AWS solutions

  • Patching practices

  • Backup practices and methods

Skills in:

  • Evaluating a strategy for the secure management of secrets and credentials

  • Auditing an environment for least privilege access

  • Reviewing implemented solutions to ensure security at every layer

  • Reviewing comprehensive traceability of users and services

  • Prioritizing automated responses to the detection of vulnerabilities

  • Designing and implementing a patch and update process

  • Designing and implementing a backup process

  • Employing remediation techniques

Task 3.3: Determine a strategy to improve performance.

Knowledge of:

  • High-performing systems architectures (for example, auto scaling, instance fleets, placement groups)

  • Global service offerings (for example, AWS Global Accelerator, Amazon CloudFront, edge computing services)

  • Monitoring tool sets and services (for example, CloudWatch)

  • Service level agreements (SLAs) and key performance indicators (KPIs)

Skills in:

  • Translating business requirements to measurable metrics

  • Testing potential remediation solutions and making recommendations

  • Proposing opportunities for the adoption of new technologies and managed services

  • Assessing solutions and applying rightsizing based on requirements

  • Identifying and examining performance bottlenecks
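
As an illustration of translating a business requirement into a measurable metric, the sketch below converts an availability target into a downtime budget. The function name and the 30-day period are assumptions made for this example; they do not come from any AWS service or SLA.

```python
def downtime_budget_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Return the minutes of downtime an availability target allows in a period.

    Hypothetical helper for illustration; the 30-day default is an assumption.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        budget = downtime_budget_minutes(target)
        print(f"{target}% over 30 days allows about {budget:.1f} minutes of downtime")
```

A budget expressed in minutes can then be compared against measured outage time, which is easier to track than a bare percentage.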

Task 3.4: Determine a strategy to improve reliability.

Knowledge of:

  • AWS Global Infrastructure

  • Data replication methods

  • Scaling methodologies (for example, load balancing, auto scaling)

  • High availability and resiliency

  • Disaster recovery methods and tools

  • Service quotas and limits

Skills in:

  • Understanding application growth and usage trends

  • Evaluating existing architecture to determine areas that are not sufficiently reliable

  • Remediating single points of failure

  • Enabling data replication, self-healing, and elastic features and services
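
One way to reason about remediating a single point of failure is to compare estimated availability before and after adding redundancy. The sketch below is a simplified model that assumes component failures are independent, which real deployments only approximate; the function names and figures are hypothetical.

```python
from math import prod


def serial_availability(components: list[float]) -> float:
    """Availability of a chain in which every component must be up.

    Assumes independent failures (a simplifying assumption).
    """
    return prod(components)


def redundant_availability(component: float, copies: int) -> float:
    """Availability of N interchangeable copies when any one copy is enough.

    Assumes independent failures (a simplifying assumption).
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1 - (1 - component) ** copies


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one instance
    print(f"one instance:  {redundant_availability(single, 1):.4f}")
    print(f"two instances: {redundant_availability(single, 2):.4f}")
    print(f"two tiers in series: {serial_availability([single, single]):.4f}")
```

Under these assumptions, duplicating a component raises its estimated availability, while chaining dependent tiers lowers the overall figure.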

Task 3.5: Identify opportunities for cost optimizations.

Knowledge of:

  • Cost-conscious architecture choices (for example, using Spot Instances, scaling policies, and rightsizing resources)

  • Pricing model adoption (for example, Reserved Instances, AWS Savings Plans)

  • Networking and data transfer costs

  • Cost management, alerting, and reporting

Skills in:

  • Analyzing usage reports to identify underutilized and overutilized resources

  • Using AWS solutions to identify unused resources

  • Designing billing alarms based on expected usage patterns

  • Investigating AWS Cost and Usage Reports at a granular level

  • Using tagging for cost allocation and reporting
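
To illustrate how tags support cost allocation, the sketch below groups cost line items by the value of one tag and reports untagged spend separately. The record layout, resource IDs, and tag names are hypothetical; they are not the AWS Cost and Usage Report schema.

```python
from collections import defaultdict

UNTAGGED = "(untagged)"


def cost_by_tag(line_items: list[dict], tag_key: str) -> dict[str, float]:
    """Sum the cost of each line item under the value of the given tag key.

    Items without the tag are collected under UNTAGGED so that
    unallocated spend stays visible.
    """
    totals: dict[str, float] = defaultdict(float)
    for item in line_items:
        tag_value = item.get("tags", {}).get(tag_key, UNTAGGED)
        totals[tag_value] += item["cost"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical line items, for illustration only.
    items = [
        {"resource": "i-aaa", "cost": 12.5, "tags": {"team": "web"}},
        {"resource": "i-bbb", "cost": 7.5, "tags": {"team": "web"}},
        {"resource": "db-ccc", "cost": 30.0, "tags": {"team": "data"}},
        {"resource": "vol-ddd", "cost": 3.0, "tags": {}},
    ]
    for team, total in sorted(cost_by_tag(items, "team").items()):
        print(f"{team}: {total:.2f}")
```

Reporting the untagged total alongside the tagged groups makes gaps in tagging coverage easy to spot.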