Content Domain 3: Data Operations and Support - AWS Certification

Task 3.1: Automate data processing by using AWS services

  • Skill 3.1.1: Orchestrate data pipelines (for example, Amazon Managed Workflows for Apache Airflow [Amazon MWAA], AWS Step Functions).

  • Skill 3.1.2: Troubleshoot Amazon managed workflows.

  • Skill 3.1.3: Call SDKs to access Amazon features from code.

  • Skill 3.1.4: Use the features of AWS services to process data (for example, Amazon EMR, Amazon Redshift, AWS Glue).

  • Skill 3.1.5: Consume and maintain data APIs.

  • Skill 3.1.6: Prepare data for transformation (for example, AWS Glue DataBrew and Amazon SageMaker Unified Studio).

  • Skill 3.1.7: Query data (for example, Amazon Athena).

  • Skill 3.1.8: Use AWS Lambda to automate data processing.

  • Skill 3.1.9: Manage events and schedulers (for example, Amazon EventBridge).
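
As an illustration of the orchestration skill above, here is a minimal sketch of a three-step pipeline written in Amazon States Language, the JSON format that AWS Step Functions uses for state machine definitions. The state names, the Lambda ARNs, the account ID, and the helper functions `task` and `execution_order` are hypothetical and exist only for this example. The retry values are arbitrary.

```python
import json

# Hypothetical ARN prefix; a real pipeline would point at its own functions.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"


def task(function_name, next_state=None):
    """Build a Task state that retries twice on task failure."""
    state = {
        "Type": "Task",
        "Resource": ARN_PREFIX + function_name,
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 30,
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }
        ],
    }
    # A state either names its successor or ends the execution.
    if next_state is not None:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "Comment": "Illustrative extract-transform-load pipeline",
    "StartAt": "Extract",
    "States": {
        "Extract": task("extract", "Transform"),
        "Transform": task("transform", "Load"),
        "Load": task("load"),
    },
}


def execution_order(defn):
    """Follow the Next links from StartAt to list states in run order."""
    order = []
    name = defn["StartAt"]
    while name is not None:
        order.append(name)
        name = defn["States"][name].get("Next")
    return order


definition_json = json.dumps(definition, indent=2)
print(execution_order(definition))
```

The resulting `definition_json` string is the kind of value that could be supplied as the `definition` argument when creating a state machine through an AWS SDK. That call is omitted here so the sketch runs without AWS credentials.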

Task 3.2: Analyze data by using AWS services

  • Skill 3.2.1: Visualize data by using AWS services and tools (for example, DataBrew, Amazon QuickSight).

  • Skill 3.2.2: Verify and clean data (for example, Lambda, Athena, QuickSight, Jupyter Notebooks, Amazon SageMaker Data Wrangler).

  • Skill 3.2.3: Use SQL in Amazon Redshift and Athena to query data or to create views.

  • Skill 3.2.4: Use Athena notebooks that use Apache Spark to explore data.

  • Skill 3.2.5: Describe tradeoffs between provisioned services and serverless services.

  • Skill 3.2.6: Define data aggregation, rolling average, grouping, and pivoting.
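
The four terms in Skill 3.2.6 can be made concrete with a small example. The sketch below uses plain Python rather than any AWS service; the `sales` records and the helper names are invented for illustration. In Amazon Redshift or Athena the same ideas would normally be expressed in SQL with `GROUP BY`, aggregate functions, and window functions.

```python
from collections import defaultdict

# Hypothetical records: (day, region, amount).
sales = [
    ("2024-01-01", "east", 10),
    ("2024-01-01", "west", 20),
    ("2024-01-02", "east", 30),
    ("2024-01-02", "west", 40),
    ("2024-01-03", "east", 50),
    ("2024-01-03", "west", 60),
]


def group_sum(rows, key_index, value_index):
    """Grouping plus aggregation: sum one column per distinct key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[value_index]
    return dict(totals)


def rolling_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def pivot(rows):
    """Pivoting: turn region values into columns, one row per day."""
    table = defaultdict(dict)
    for day, region, amount in rows:
        table[day][region] = amount
    return dict(table)


by_region = group_sum(sales, key_index=1, value_index=2)
by_day = group_sum(sales, key_index=0, value_index=2)
daily_totals = [by_day[day] for day in sorted(by_day)]
two_day_average = rolling_average(daily_totals, window=2)
wide = pivot(sales)

print(by_region)        # {'east': 90, 'west': 120}
print(two_day_average)  # [50.0, 90.0]
print(wide["2024-01-02"])  # {'east': 30, 'west': 40}
```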

Task 3.3: Maintain and monitor data pipelines

  • Skill 3.3.1: Extract logs for audits.

  • Skill 3.3.2: Deploy logging and monitoring solutions to facilitate auditing and traceability.

  • Skill 3.3.3: Use notifications during monitoring to send alerts.

  • Skill 3.3.4: Troubleshoot performance issues.

  • Skill 3.3.5: Use AWS CloudTrail to track API calls.

  • Skill 3.3.6: Troubleshoot and maintain pipelines (for example, AWS Glue, Amazon EMR).

  • Skill 3.3.7: Use Amazon CloudWatch Logs to log application data (with a focus on configuration and automation).

  • Skill 3.3.8: Analyze logs with AWS services (for example, Athena, Amazon EMR, Amazon OpenSearch Service, CloudWatch Logs Insights, big data application logs).
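
One common way to make application logs easier to audit and query is to emit each entry as a single JSON object. The sketch below shows that pattern with Python's standard `logging` module. It assumes an environment such as AWS Lambda where standard output is forwarded to Amazon CloudWatch Logs; here an in-memory stream stands in for that destination so the example runs anywhere. `JsonFormatter`, `build_logger`, the `context` key, and the field names are hypothetical choices for this example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge optional structured fields passed via extra={"context": {...}}.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


def build_logger(name, stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# In-memory stand-in for stdout, which a managed runtime would ship to a log group.
stream = io.StringIO()
log = build_logger("pipeline", stream)
log.info("rows loaded", extra={"context": {"job": "daily_load", "rows": 1200}})
log.error("load failed", extra={"context": {"job": "daily_load", "rows": 0}})

# Local equivalent of filtering the log on a structured field.
entries = [json.loads(line) for line in stream.getvalue().splitlines()]
errors = [entry for entry in entries if entry["level"] == "ERROR"]
print(errors)
```

Because every field is a named JSON key, a log analysis tool can filter or aggregate on `level`, `job`, or `rows` without custom parsing. The list comprehension at the end mimics that kind of filter locally.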

Task 3.4: Ensure data quality

  • Skill 3.4.1: Run data quality checks while processing the data (for example, checking for empty fields).

  • Skill 3.4.2: Define data quality rules (for example, DataBrew).

  • Skill 3.4.3: Investigate data consistency (for example, DataBrew).

  • Skill 3.4.4: Describe data sampling techniques.

  • Skill 3.4.5: Implement mechanisms to handle data skew.
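
Several of the data quality skills above can be sketched in a few lines of plain Python: an empty-field check, two sampling techniques (systematic and stratified), and key salting, which is one common way to spread a skewed key across partitions. All function names, records, and parameters below are hypothetical. Services such as AWS Glue DataBrew provide their own rule and sampling features, which this sketch does not reproduce.

```python
import random
from collections import Counter


def find_empty_fields(record, required):
    """Return required field names that are missing, None, or blank."""
    return [
        field
        for field in required
        if record.get(field) is None or str(record[field]).strip() == ""
    ]


def systematic_sample(items, step, start=0):
    """Systematic sampling: take every `step`-th item from `start`."""
    return items[start::step]


def stratified_sample(items, key, fraction, seed=0):
    """Stratified sampling: sample the same fraction from each group."""
    rng = random.Random(seed)
    strata = {}
    for item in items:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample


def salted_key(key, num_salts, rng):
    """Key salting: append a random suffix so one hot key maps to several."""
    return f"{key}#{rng.randrange(num_salts)}"


record = {"id": 1, "name": " ", "email": None}
missing = find_empty_fields(record, ["id", "name", "email", "phone"])

every_third = systematic_sample(list(range(10)), step=3)

labelled = [("a", i) for i in range(8)] + [("b", i) for i in range(2)]
half = stratified_sample(labelled, key=lambda item: item[0], fraction=0.5)

rng = random.Random(0)
salted = Counter(salted_key("hot", 4, rng) for _ in range(1000))

print(missing)      # ['name', 'email', 'phone']
print(every_third)  # [0, 3, 6, 9]
print(Counter(label for label, _ in half))
print(sorted(salted))
```

With salting, a join or aggregation on the salted key has to be followed by a second step that strips the suffix and combines the partial results. That step is left out of this sketch.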