Create a table for CloudFront logs in Athena using manual partitioning with JSON

To create a table for CloudFront standard log file fields using the JSON format

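Copy the following example DDL statement and paste it into the query editor in the Athena console. The example statement uses the log file fields described in the Standard log file fields section of the Amazon CloudFront Developer Guide. Change LOCATION to point to the Amazon S3 bucket that stores your logs.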

This query uses the OpenX JSON SerDe with the following SerDe properties so that Athena reads the JSON fields correctly.


CREATE EXTERNAL TABLE `cf_logs_manual_partition_json`(
  `date` string , 
  `time` string , 
  `x-edge-location` string , 
  `sc-bytes` string , 
  `c-ip` string , 
  `cs-method` string , 
  `cs(host)` string , 
  `cs-uri-stem` string , 
  `sc-status` string , 
  `cs(referer)` string , 
  `cs(user-agent)` string , 
  `cs-uri-query` string , 
  `cs(cookie)` string , 
  `x-edge-result-type` string , 
  `x-edge-request-id` string , 
  `x-host-header` string , 
  `cs-protocol` string , 
  `cs-bytes` string , 
  `time-taken` string , 
  `x-forwarded-for` string , 
  `ssl-protocol` string , 
  `ssl-cipher` string , 
  `x-edge-response-result-type` string , 
  `cs-protocol-version` string , 
  `fle-status` string , 
  `fle-encrypted-fields` string , 
  `c-port` string , 
  `time-to-first-byte` string , 
  `x-edge-detailed-result-type` string , 
  `sc-content-type` string , 
  `sc-content-len` string , 
  `sc-range-start` string , 
  `sc-range-end` string )
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
WITH SERDEPROPERTIES ( 
  'paths'='c-ip,c-port,cs(Cookie),cs(Host),cs(Referer),cs(User-Agent),cs-bytes,cs-method,cs-protocol,cs-protocol-version,cs-uri-query,cs-uri-stem,date,fle-encrypted-fields,fle-status,sc-bytes,sc-content-len,sc-content-type,sc-range-end,sc-range-start,sc-status,ssl-cipher,ssl-protocol,time,time-taken,time-to-first-byte,x-edge-detailed-result-type,x-edge-location,x-edge-request-id,x-edge-response-result-type,x-edge-result-type,x-forwarded-for,x-host-header') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://amzn-s3-demo-bucket/'

Run the query in the Athena console. After the query completes, Athena registers the cf_logs_manual_partition_json table, and the data in it becomes available for you to query.
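
As a quick check that the table was registered and that the SerDe is mapping the JSON keys onto columns, you could preview a few rows. This is a minimal sketch, not part of the original procedure; the choice of columns and the row limit are arbitrary. Column names that contain hyphens must be enclosed in double quotes in Athena queries.

```sql
-- Preview a handful of rows to confirm that the table is readable.
-- The selected columns and LIMIT value are illustrative only.
SELECT "date", "time", "c-ip", "cs-method", "cs-uri-stem", "sc-status"
FROM cf_logs_manual_partition_json
LIMIT 10
```

If the query returns no rows, a likely first thing to check is whether LOCATION in the DDL statement points to the bucket or prefix that actually contains the log files.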

Example queries

The following query adds up the number of bytes served by CloudFront on January 15, 2025.


SELECT sum(cast("sc-bytes" as BIGINT)) as sc
FROM cf_logs_manual_partition_json
WHERE "date"='2025-01-15'
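
Because every column in this table is declared as string, numeric fields such as sc-bytes have to be cast before they are aggregated, as in the query above. The same pattern extends to grouped aggregates. The following sketch, which is an illustration and not part of the original examples, counts requests and sums the bytes served for each HTTP status code on the same date; the aliases requests and bytes_served are arbitrary names.

```sql
-- Break down requests and bytes served by HTTP status code for one day.
-- "sc-bytes" is stored as a string, so it is cast to BIGINT before summing.
SELECT "sc-status",
       count(*) AS requests,
       sum(cast("sc-bytes" AS BIGINT)) AS bytes_served
FROM cf_logs_manual_partition_json
WHERE "date" = '2025-01-15'
GROUP BY "sc-status"
ORDER BY requests DESC
```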

To remove duplicate rows (for example, duplicate empty rows) from the query results, you can use the SELECT DISTINCT statement, as in the following example.


SELECT DISTINCT * FROM cf_logs_manual_partition_json
