

本文為英文版的機器翻譯版本，如內容有任何歧義或不一致之處，概以英文版為準。

# Amazon Managed Workflows for Apache Airflow 的快速入門教學課程
<a name="quick-start"></a>

此快速入門教學課程使用 AWS CloudFormation 範本來建立 Amazon VPC 基礎設施、具有`dags`資料夾的 Amazon S3 儲存貯體，以及 Amazon Managed Workflows for Apache Airflow 環境。

**Topics**
+ [於本教學課程中](#quick-start-overview)
+ [先決條件](#quick-start-before)
+ [步驟一：在本機儲存 CloudFormation 範本](#quick-start-template)
+ [步驟二：使用 建立堆疊 AWS CLI](#quick-start-createstack)
+ [步驟三：將 DAG 上傳至 Amazon S3，並在 Apache Airflow UI 中執行](#quick-start-upload-dag)
+ [步驟四：存取 CloudWatch Logs 中的日誌](#quick-start-logs)
+ [後續步驟？](#quick-start-next-up)

## 於本教學課程中
<a name="quick-start-overview"></a>

使用此教學課程將 DAG 上傳至 Amazon S3、在 Apache Airflow 中執行 DAG，並使用三 AWS Command Line Interface (AWS CLI) 命令存取 CloudWatch 中的日誌。最後，您將學習如何為 Apache Airflow 開發團隊建立 IAM 政策。

**注意**  
此頁面上的 CloudFormation 範本會為 中可用的最新版本 Apache Airflow 建立 Amazon Managed Workflows for Apache Airflow 環境 CloudFormation。最新的可用版本是 Apache Airflow v3.0.6。

 CloudFormation 範本會建立下列項目：
+ **VPC 基礎設施**。範本使用 [透過網際網路的公有路由](networking-about.md#networking-about-overview-public)。它會[公有網路存取模式](configuring-networking.md#access-overview-public)針對 中的 Apache Airflow Web 伺服器使用 `WebserverAccessMode: PUBLIC_ONLY`。
+ **Amazon S3 儲存貯**體。範本會使用`dags`資料夾建立 Amazon S3 儲存貯體。其設定為**封鎖所有公有存取**，並啟用**儲存貯體版本控制**，如 中所定義[為 Amazon MWAA 建立 Amazon S3 儲存貯體](mwaa-s3-bucket.md)。
+ **Amazon MWAA 環境**。範本會建立與 Amazon S3 儲存貯體上 `dags` 資料夾相關聯的 Amazon MWAA 環境、具有 Amazon MWAA 所使用 AWS 服務許可的執行角色，以及使用 [AWS擁有的金鑰](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#aws-owned-cmk)進行加密的預設值，如 中所定義[建立 Amazon MWAA 環境](create-environment.md)。
+ **CloudWatch Logs**。範本會在 CloudWatch 中於 **INFO** 層級開啟 Apache Airflow 日誌，並為 *Airflow 排程器日誌群組*、*Airflow Web 伺服器日誌群組*、*Airflow 工作者日誌群組*、*Airflow DAG 處理日誌群組*和 *Airflow 任務日誌群組*啟動，如 中所定義[在 Amazon CloudWatch 中存取 Airflow 日誌](monitoring-airflow.md)。

在本教學課程中，您將完成下列任務：
+ **上傳並執行 DAG**。將最新 Amazon MWAA 支援的 Apache Airflow 版本的 Apache Airflow 教學 DAG 上傳至 Amazon S3，然後在 Apache Airflow UI 中執行，如 中所定義[新增或更新 DAGs](configuring-dag-folder.md)。
+ **存取日誌**。存取 CloudWatch Logs 中的 *Airflow Web 伺服器日誌群組*，如 中所定義[在 Amazon CloudWatch 中存取 Airflow 日誌](monitoring-airflow.md)。
+ **建立存取控制政策**。在 IAM 中為您的 Apache Airflow 開發團隊建立存取控制政策，如 中所定義[存取 Amazon MWAA 環境](access-policies.md)。

**注意**  
在託管 Amazon MWAA 環境的 VPC 中，`true`針對所有連接的子網路`assignIpv6AddressOnCreation`將 設定為 。此設定可確保自動指派網際網路通訊協定第 6 版 (IPv6) 地址給這些子網路內的資源。

## 先決條件
<a name="quick-start-before"></a>

 AWS Command Line Interface (AWS CLI) 是一種開放原始碼工具，您可以使用命令列 shell 中的 命令與 AWS 服務互動。若要完成此頁面上的步驟，您需要下列項目：
+ [AWS CLI – 安裝第 2 版](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html)。
+ [AWS CLI – 使用 的快速組態`aws configure`](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html)。

## 步驟一：在本機儲存 CloudFormation 範本
<a name="quick-start-template"></a>
+ 複製下列範本的內容，並在本機儲存為 `mwaa-public-network.yml`。您也可以[下載 範本](./samples/mwaa-public-network.zip)。

  ```
  AWSTemplateFormatVersion: "2010-09-09"
  
  Parameters:
  
    EnvironmentName:
      Description: An environment name that is prefixed to resource names
      Type: String
      Default: MWAAEnvironment
  
    VpcCIDR:
      Description: The IP range (CIDR notation) for this VPC
      Type: String
      Default: 10.192.0.0/16
  
    PublicSubnet1CIDR:
      Description: The IP range (CIDR notation) for the public subnet in the first Availability Zone
      Type: String
      Default: 10.192.10.0/24
  
    PublicSubnet2CIDR:
      Description: The IP range (CIDR notation) for the public subnet in the second Availability Zone
      Type: String
      Default: 10.192.11.0/24
  
    PrivateSubnet1CIDR:
      Description: The IP range (CIDR notation) for the private subnet in the first Availability Zone
      Type: String
      Default: 10.192.20.0/24
    PrivateSubnet2CIDR:
      Description: The IP range (CIDR notation) for the private subnet in the second Availability Zone
      Type: String
      Default: 10.192.21.0/24
    MaxWorkerNodes:
      Description: The maximum number of workers that can run in the environment
      Type: Number
      Default: 2
    DagProcessingLogs:
      Description: Log level for DagProcessing
      Type: String
      Default: INFO
    SchedulerLogsLevel:
      Description: Log level for SchedulerLogs
      Type: String
      Default: INFO
    TaskLogsLevel:
      Description: Log level for TaskLogs
      Type: String
      Default: INFO
    WorkerLogsLevel:
      Description: Log level for WorkerLogs
      Type: String
      Default: INFO
    WebserverLogsLevel:
      Description: Log level for WebserverLogs
      Type: String
      Default: INFO
  
  Resources:
    #####################################################################################################################
    # CREATE VPC
    #####################################################################################################################
  
    VPC:
      Type: AWS::EC2::VPC
      Properties:
        CidrBlock: !Ref VpcCIDR
        EnableDnsSupport: true
        EnableDnsHostnames: true
        Tags:
          - Key: Name
            Value: MWAAEnvironment
  
    InternetGateway:
      Type: AWS::EC2::InternetGateway
      Properties:
        Tags:
          - Key: Name
            Value: MWAAEnvironment
  
    InternetGatewayAttachment:
      Type: AWS::EC2::VPCGatewayAttachment
      Properties:
        InternetGatewayId: !Ref InternetGateway
        VpcId: !Ref VPC
  
    PublicSubnet1:
      Type: AWS::EC2::Subnet
      Properties:
        VpcId: !Ref VPC
        AvailabilityZone: !Select [ 0, !GetAZs '' ]
        CidrBlock: !Ref PublicSubnet1CIDR
        MapPublicIpOnLaunch: true
        Tags:
          - Key: Name
            Value: !Sub ${EnvironmentName} Public Subnet (AZ1)
  
    PublicSubnet2:
      Type: AWS::EC2::Subnet
      Properties:
        VpcId: !Ref VPC
        AvailabilityZone: !Select [ 1, !GetAZs  '' ]
        CidrBlock: !Ref PublicSubnet2CIDR
        MapPublicIpOnLaunch: true
        Tags:
          - Key: Name
            Value: !Sub ${EnvironmentName} Public Subnet (AZ2)
  
    PrivateSubnet1:
      Type: AWS::EC2::Subnet
      Properties:
        VpcId: !Ref VPC
        AvailabilityZone: !Select [ 0, !GetAZs  '' ]
        CidrBlock: !Ref PrivateSubnet1CIDR
        MapPublicIpOnLaunch: false
        Tags:
          - Key: Name
            Value: !Sub ${EnvironmentName} Private Subnet (AZ1)
  
    PrivateSubnet2:
      Type: AWS::EC2::Subnet
      Properties:
        VpcId: !Ref VPC
        AvailabilityZone: !Select [ 1, !GetAZs  '' ]
        CidrBlock: !Ref PrivateSubnet2CIDR
        MapPublicIpOnLaunch: false
        Tags:
          - Key: Name
            Value: !Sub ${EnvironmentName} Private Subnet (AZ2)
  
    NatGateway1EIP:
      Type: AWS::EC2::EIP
      DependsOn: InternetGatewayAttachment
      Properties:
        Domain: vpc
  
    NatGateway2EIP:
      Type: AWS::EC2::EIP
      DependsOn: InternetGatewayAttachment
      Properties:
        Domain: vpc
  
    NatGateway1:
      Type: AWS::EC2::NatGateway
      Properties:
        AllocationId: !GetAtt NatGateway1EIP.AllocationId
        SubnetId: !Ref PublicSubnet1
  
    NatGateway2:
      Type: AWS::EC2::NatGateway
      Properties:
        AllocationId: !GetAtt NatGateway2EIP.AllocationId
        SubnetId: !Ref PublicSubnet2
  
    PublicRouteTable:
      Type: AWS::EC2::RouteTable
      Properties:
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub ${EnvironmentName} Public Routes
  
    DefaultPublicRoute:
      Type: AWS::EC2::Route
      DependsOn: InternetGatewayAttachment
      Properties:
        RouteTableId: !Ref PublicRouteTable
        DestinationCidrBlock: 0.0.0.0/0
        GatewayId: !Ref InternetGateway
  
    PublicSubnet1RouteTableAssociation:
      Type: AWS::EC2::SubnetRouteTableAssociation
      Properties:
        RouteTableId: !Ref PublicRouteTable
        SubnetId: !Ref PublicSubnet1
  
    PublicSubnet2RouteTableAssociation:
      Type: AWS::EC2::SubnetRouteTableAssociation
      Properties:
        RouteTableId: !Ref PublicRouteTable
        SubnetId: !Ref PublicSubnet2
  
  
    PrivateRouteTable1:
      Type: AWS::EC2::RouteTable
      Properties:
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub ${EnvironmentName} Private Routes (AZ1)
  
    DefaultPrivateRoute1:
      Type: AWS::EC2::Route
      Properties:
        RouteTableId: !Ref PrivateRouteTable1
        DestinationCidrBlock: 0.0.0.0/0
        NatGatewayId: !Ref NatGateway1
  
    PrivateSubnet1RouteTableAssociation:
      Type: AWS::EC2::SubnetRouteTableAssociation
      Properties:
        RouteTableId: !Ref PrivateRouteTable1
        SubnetId: !Ref PrivateSubnet1
  
    PrivateRouteTable2:
      Type: AWS::EC2::RouteTable
      Properties:
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub ${EnvironmentName} Private Routes (AZ2)
  
    DefaultPrivateRoute2:
      Type: AWS::EC2::Route
      Properties:
        RouteTableId: !Ref PrivateRouteTable2
        DestinationCidrBlock: 0.0.0.0/0
        NatGatewayId: !Ref NatGateway2
  
    PrivateSubnet2RouteTableAssociation:
      Type: AWS::EC2::SubnetRouteTableAssociation
      Properties:
        RouteTableId: !Ref PrivateRouteTable2
        SubnetId: !Ref PrivateSubnet2
  
    SecurityGroup:
      Type: AWS::EC2::SecurityGroup
      Properties:
        GroupName: "mwaa-security-group"
        GroupDescription: "Security group with a self-referencing inbound rule."
        VpcId: !Ref VPC
  
    SecurityGroupIngress:
      Type: AWS::EC2::SecurityGroupIngress
      Properties:
        GroupId: !Ref SecurityGroup
        IpProtocol: "-1"
        SourceSecurityGroupId: !Ref SecurityGroup
  
    EnvironmentBucket:
      Type: AWS::S3::Bucket
      Properties:
        VersioningConfiguration:
          Status: Enabled
        PublicAccessBlockConfiguration: 
          BlockPublicAcls: true
          BlockPublicPolicy: true
          IgnorePublicAcls: true
          RestrictPublicBuckets: true
  
    #####################################################################################################################
    # CREATE MWAA
    #####################################################################################################################
  
    MwaaEnvironment:
      Type: AWS::MWAA::Environment
      DependsOn: MwaaExecutionPolicy
      Properties:
        Name: !Sub "${AWS::StackName}-MwaaEnvironment"
        SourceBucketArn: !GetAtt EnvironmentBucket.Arn
        ExecutionRoleArn: !GetAtt MwaaExecutionRole.Arn
        DagS3Path: dags/  
        NetworkConfiguration:
          SecurityGroupIds:
            - !GetAtt SecurityGroup.GroupId
          SubnetIds:
            - !Ref PrivateSubnet1
            - !Ref PrivateSubnet2
        WebserverAccessMode: PUBLIC_ONLY
        MaxWorkers: !Ref MaxWorkerNodes
        LoggingConfiguration:
          DagProcessingLogs:
            LogLevel: !Ref DagProcessingLogs
            Enabled: true
          SchedulerLogs:
            LogLevel: !Ref SchedulerLogsLevel
            Enabled: true
          TaskLogs:
            LogLevel: !Ref TaskLogsLevel
            Enabled: true
          WorkerLogs:
            LogLevel: !Ref WorkerLogsLevel
            Enabled: true
          WebserverLogs:
            LogLevel: !Ref WebserverLogsLevel
            Enabled: true
  
    MwaaExecutionRole:
      Type: AWS::IAM::Role
      Properties:
        AssumeRolePolicyDocument:
          Version: 2012-10-17&TCX5-2025-waiver;
          Statement:
            - Effect: Allow
              Principal:
                Service:
                  - airflow-env.amazonaws.com
                  - airflow.amazonaws.com
              Action:
               - "sts:AssumeRole"
        Path: "/service-role/"
  
    MwaaExecutionPolicy:
      DependsOn: EnvironmentBucket
      Type: AWS::IAM::ManagedPolicy
      Properties:
        Roles:
          - !Ref MwaaExecutionRole
        PolicyDocument:
          Version: 2012-10-17&TCX5-2025-waiver;
          Statement:
            - Effect: Allow
              Action: airflow:PublishMetrics
              Resource:
                - !Sub "arn:aws:airflow:${AWS::Region}:${AWS::AccountId}:environment/${EnvironmentName}"
            - Effect: Deny
              Action: s3:ListAllMyBuckets
              Resource:
                - !Sub "${EnvironmentBucket.Arn}"
                - !Sub "${EnvironmentBucket.Arn}/*"
  
            - Effect: Allow
              Action:
                - "s3:GetObject*"
                - "s3:GetBucket*"
                - "s3:List*"
              Resource:
                - !Sub "${EnvironmentBucket.Arn}"
                - !Sub "${EnvironmentBucket.Arn}/*"
            - Effect: Allow
              Action:
                - logs:DescribeLogGroups
              Resource: "*"
  
            - Effect: Allow
              Action:
                - logs:CreateLogStream
                - logs:CreateLogGroup
                - logs:PutLogEvents
                - logs:GetLogEvents
                - logs:GetLogRecord
                - logs:GetLogGroupFields
                - logs:GetQueryResults
                - logs:DescribeLogGroups
              Resource:
                - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:airflow-${AWS::StackName}*"
            - Effect: Allow
              Action: cloudwatch:PutMetricData
              Resource: "*"
            - Effect: Allow
              Action:
                - sqs:ChangeMessageVisibility
                - sqs:DeleteMessage
                - sqs:GetQueueAttributes
                - sqs:GetQueueUrl
                - sqs:ReceiveMessage
                - sqs:SendMessage
              Resource:
                - !Sub "arn:aws:sqs:${AWS::Region}:*:airflow-celery-*"
            - Effect: Allow
              Action:
                - kms:Decrypt
                - kms:DescribeKey
                - "kms:GenerateDataKey*"
                - kms:Encrypt
              NotResource: !Sub "arn:aws:kms:*:${AWS::AccountId}:key/*"
              Condition:
                StringLike:
                  "kms:ViaService":
                    - !Sub "sqs.${AWS::Region}.amazonaws.com"
  Outputs:
    VPC:
      Description: A reference to the created VPC
      Value: !Ref VPC
  
    PublicSubnets:
      Description: A list of the public subnets
      Value: !Join [ ",", [ !Ref PublicSubnet1, !Ref PublicSubnet2 ]]
  
    PrivateSubnets:
      Description: A list of the private subnets
      Value: !Join [ ",", [ !Ref PrivateSubnet1, !Ref PrivateSubnet2 ]]
  
    PublicSubnet1:
      Description: A reference to the public subnet in the 1st Availability Zone
      Value: !Ref PublicSubnet1
  
    PublicSubnet2:
      Description: A reference to the public subnet in the 2nd Availability Zone
      Value: !Ref PublicSubnet2
  
    PrivateSubnet1:
      Description: A reference to the private subnet in the 1st Availability Zone
      Value: !Ref PrivateSubnet1
  
    PrivateSubnet2:
      Description: A reference to the private subnet in the 2nd Availability Zone
      Value: !Ref PrivateSubnet2
  
    SecurityGroupIngress:
      Description: Security group with self-referencing inbound rule
      Value: !Ref SecurityGroupIngress
  
    MwaaApacheAirflowUI:
      Description: MWAA Environment
      Value: !Sub  "https://${MwaaEnvironment.WebserverUrl}"
  ```

## 步驟二：使用 建立堆疊 AWS CLI
<a name="quick-start-createstack"></a>

1. 在命令提示字元中，導覽至`mwaa-public-network.yml`存放 的目錄。例如：

   ```
   cd mwaaproject
   ```

1. 使用 [https://docs.aws.amazon.com/cli/latest/reference/cloudformation/create-stack.html](https://docs.aws.amazon.com/cli/latest/reference/cloudformation/create-stack.html)命令來使用 建立堆疊 AWS CLI。

   ```
   aws cloudformation create-stack --stack-name mwaa-environment-public-network --template-body file://mwaa-public-network.yml --capabilities CAPABILITY_IAM
   ```
**注意**  
建立 Amazon VPC 基礎設施、Amazon S3 儲存貯體和 Amazon MWAA 環境需要 30 分鐘以上。

## 步驟三：將 DAG 上傳至 Amazon S3，並在 Apache Airflow UI 中執行
<a name="quick-start-upload-dag"></a>

1. 複製[最新支援的 Apache Airflow 版本的](https://airflow.apache.org/docs/apache-airflow/2.2.2/tutorial.html) `tutorial.py` 檔案內容，並在本機儲存為 `tutorial.py`。

1. 在您的命令提示字元中，導覽至`tutorial.py`存放 的目錄。例如：

   ```
   cd mwaaproject
   ```

1. 使用下列命令列出所有 Amazon S3 儲存貯體。

   ```
   aws s3 ls
   ```

1. 使用下列命令列出您環境的 Amazon S3 儲存貯體中的檔案和資料夾。

   ```
   aws s3 ls s3://YOUR_S3_BUCKET_NAME
   ```

1. 使用下列指令碼將`tutorial.py`檔案上傳至您的`dags`資料夾。以 *amzn-s3-demo-bucket *取代範例值。

   ```
   aws s3 cp tutorial.py s3://amzn-s3-demo-bucket/dags/
   ```

1. 在 Amazon MWAA 主控台上開啟[環境](https://console.aws.amazon.com/mwaa/home#/environments)頁面。

1. 選擇環境。

1. 選擇**開啟氣流使用者介面**。

1. 在 Apache Airflow UI 上，從可用的 DAGs清單中，選擇**教學**課程 DAG。

1. 在 DAG 詳細資訊頁面上，選擇 **DAG 名稱旁的暫停/取消暫停** DAG 切換，以取消暫停 DAG。

1. 選擇**觸發 DAG**。

## 步驟四：存取 CloudWatch Logs 中的日誌
<a name="quick-start-logs"></a>

您可以在 CloudWatch 主控台中存取 CloudFormation 由堆疊開啟的所有 Apache Airflow 日誌的 Apache Airflow 日誌。下一節說明如何存取 *Airflow Web 伺服器日誌群組的日誌*。

1. 在 Amazon MWAA 主控台上開啟[環境](https://console.aws.amazon.com/mwaa/home#/environments)頁面。

1. 選擇環境。

1. 在**監控**窗格中選擇 **Airflow Web 伺服器日誌群組**。

1. 在`webserver_console_ip`日誌**串流中選擇日誌**。

## 後續步驟？
<a name="quick-start-next-up"></a>
+ 進一步了解如何上傳 DAGs、在 中指定 Python 相依性，`requirements.txt`以及在 `plugins.zip` 中指定自訂外掛程式[在 Amazon MWAA 上使用 DAG](working-dags.md)。
+ 進一步了解建議您在 中調整環境效能的最佳實務[Amazon MWAA 上 Apache Airflow 的效能調校](best-practices-tuning.md)。
+ 在 中為您的環境建立監控儀表板[在 Amazon MWAA 上監控儀表板和警示](monitoring-dashboard.md)。
+ 在 中執行一些 DAG 程式碼範例[Amazon Managed Workflows for Apache Airflow 的程式碼範例](sample-code.md)。