

本文属于机器翻译版本。若本译文内容与英语原文存在差异，则一律以英文原文为准。

# 提取数据目录中的数据以用于 Amazon Chime S AWS Glue DK 调用分析
<a name="ca-data-model-queries"></a>

使用这些示例查询来提取和组织您的 Amazon Chime SDK 呼叫分析 Glue 数据目录中的数据。

**注意**  
有关连接 Amazon Athena 并查询您的 Glue 数据目录的信息，请参阅[使用 ODBC 连接 Amazon Athena](https://docs.aws.amazon.com/athena/latest/ug/connect-with-odbc.html)。

根据需要展开每个部分。

## 从 call\$1analytics\$1metadata 表中的元数据（字符串数据类型）中提取值
<a name="qry-insights-metadata"></a>

`call_analytics_metadata` 具有采用 JSON 字符串格式的 `metadata` 字段。使用 Athena 中的 [json\$1extract\$1scalar 函数](https://docs.aws.amazon.com/athena/latest/ug/extracting-data-from-JSON.html)来查询此字符串中的元素。

```
SELECT
    json_extract_scalar(metadata,'$.voiceConnectorId') AS "VoiceConnector ID",
    json_extract_scalar(metadata,'$.fromNumber') AS "From Number",
    json_extract_scalar(metadata,'$.toNumber') AS "To Number",
    json_extract_scalar(metadata,'$.callId') AS "Call ID",
    json_extract_scalar(metadata,'$.direction') AS Direction,
    json_extract_scalar(metadata,'$.transactionId') AS "Transaction ID"
FROM 
    "GlueDatabaseName"."call_analytics_metadata"
```

## 查询 call\$1analytics\$1metadatics 表中的 SIPRECMetadata 更新
<a name="qry-insights-siprec-metadata"></a>

该`call_analytics_metadata`字段的元数据字段采用 JSON 字符串格式。 `metadata`还有另一个名为的嵌套对象`oneTimeMetadata`，该对象包含原始 XML 和转换后的 JSON 格式的 SIPRec 元数据。使用 Athena 中的 `json_extract_scalar` 函数来查询此字符串中的元素。

```
SELECT
    json_extract_scalar(metadata,'$.voiceConnectorId') AS "VoiceConnector ID",
    json_extract_scalar(metadata,'$.fromNumber') AS "From Number",
    json_extract_scalar(metadata,'$.toNumber') AS "To Number",
    json_extract_scalar(metadata,'$.callId') AS "Call ID",
    json_extract_scalar(metadata,'$.direction') AS Direction,
    json_extract_scalar(metadata,'$.transactionId') AS "Transaction ID",
    json_extract_scalar(json_extract_scalar(metadata,'$.oneTimeMetadata'),'$.siprecMetadata') AS "siprec Metadata XML",
    json_extract_scalar(json_extract_scalar(metadata,'$.oneTimeMetadata'),'$.siprecMetadataJson') AS "Siprec Metadata JSON",
    json_extract_scalar(json_extract_scalar(metadata,'$.oneTimeMetadata'),'$.inviteHeaders') AS "Invite Headers"
FROM 
    "GlueDatabaseName"."call_analytics_metadata"
WHERE 
    callevent-type = "update";
```

## 从 call\$1analytics\$1recording\$1metadata 表中的元数据（字符串数据类型）中提取值
<a name="qry-recording-metadata"></a>

`call_analytics_recording_metadata` 具有采用 JSON 字符串格式的元数据字段。使用 Athena 中的 [json\$1extract\$1scalar 函数](https://docs.aws.amazon.com/athena/latest/ug/extracting-data-from-JSON.html)来查询此字符串中的元素。

```
SELECT
    json_extract_scalar(metadata,'$.voiceConnectorId') AS "VoiceConnector ID",
    json_extract_scalar(metadata,'$.fromNumber') AS "From Number",
    json_extract_scalar(metadata,'$.toNumber') AS "To Number",
    json_extract_scalar(metadata,'$.callId') AS "Call ID",
    json_extract_scalar(metadata,'$.direction') AS Direction,
    json_extract_scalar(metadata,'$.transactionId') AS "Transaction ID"
FROM 
    "GlueDatabaseName"."call_analytics_recording_metadata"
WHERE 
    detail-subtype = "Recording"
```

## 从 voice\$1analytics\$1status 表中的详细信息（结构数据类型）中提取值
<a name="qry-va-status"></a>

`voice_analytics_status` 具有采用 `struct` 数据类型的详细信息字段。以下示例展示了如何查询 `struct` 数据类型字段：

```
SELECT
    detail.transactionId AS "Transaction ID",
    detail.voiceConnectorId AS "VoiceConnector ID",
    detail.siprecmetadata AS "Siprec Metadata",
    detail.inviteheaders AS "Invite Headers",
    detail.streamStartTime AS "Stream Start Time"
FROM 
    "GlueDatabaseName"."voice_analytics_status"
```

## 连接 voice\$1analytics\$1status 表和 call\$1analytics\$1metadata 表
<a name="qry-join-va-meta"></a>

以下示例查询连接了 `call_analytics_metadata` 和 `voice_analytics_status`：

```
SELECT
    a.detail.transactionId AS "Transaction ID",
    a.detail.voiceConnectorId AS "VoiceConnector ID",
    a.detail.siprecmetadata AS "Siprec Metadata",
    a.detail.inviteheaders AS "Invite Headers",
    a.detail.streamStartTime AS "Stream Start Time"
    json_extract_scalar(b.metadata,'$.fromNumber') AS "From Number",
    json_extract_scalar(b.metadata,'$.toNumber') AS "To Number",
    json_extract_scalar(b.metadata,'$.callId') AS "Call ID",
    json_extract_scalar(b.metadata,'$.direction') AS Direction
FROM 
    "GlueDatabaseName"."voice_analytics_status" a
INNER JOIN 
    "GlueDatabaseName"."call_analytics_metadata" b
ON a.detail.transactionId = json_extract_scalar(b.metadata,'$.transactionId')
```

## 从 transcribe\$1call\$1analytics\$1post\$1call 表中提取转录
<a name="qry-transcribe-ca-post-call"></a>

transcripe\$1call\$1analytics\$1post\$1call 具有结构格式的转录字段，带有嵌套数组。使用以下查询来取消数组嵌套：

```
SELECT 
    jobstatus,
    languagecode,
    IF(CARDINALITY(m.transcript)=0 OR CARDINALITY(m.transcript) IS NULL, NULL, e.transcript.id) AS utteranceId,
    IF(CARDINALITY(m.transcript)=0 OR CARDINALITY(m.transcript) IS NULL, NULL, e.transcript.content) AS transcript,
    accountid,
    channel,
    sessionid,
    contentmetadata.output AS "Redaction"
FROM 
    "GlueDatabaseName"."transcribe_call_analytics_post_call" m
CROSS JOIN UNNEST
    (IF(CARDINALITY(m.transcript)=0, ARRAY[NULL], transcript)) AS e(transcript)
```

## 连接 transcribe\$1call\$1analytics\$1post\$1call 和 call\$1analytics\$1metadata 表
<a name="qry-va-status"></a>

以下查询连接了 transcribe\$1call\$1analytics\$1post\$1call 和 call\$1analytics\$1metadata 表：

```
WITH metadata AS(
  SELECT 
    from_iso8601_timestamp(time) AS "Timestamp",
    date_parse(date_format(from_iso8601_timestamp(time), '%m/%d/%Y %H:%i:%s') , '%m/%d/%Y %H:%i:%s') AS "DateTime",
    date_parse(date_format(from_iso8601_timestamp(time) , '%m/%d/%Y') , '%m/%d/%Y') AS "Date",
    date_format(from_iso8601_timestamp(time) , '%H:%i:%s')  AS "Time",
    mediainsightspipelineid,
    json_extract_scalar(metadata,'$.toNumber') AS "To Number",
    json_extract_scalar(metadata,'$.voiceConnectorId') AS "VoiceConnector ID",
    json_extract_scalar(metadata,'$.fromNumber') AS "From Number",
    json_extract_scalar(metadata,'$.callId') AS "Call ID",
    json_extract_scalar(metadata,'$.direction') AS Direction,
    json_extract_scalar(metadata,'$.transactionId') AS "Transaction ID",
    REGEXP_REPLACE(REGEXP_EXTRACT(json_extract_scalar(metadata,'$.oneTimeMetadata.s3RecordingUrl'), '[^/]+(?=\.[^.]+$)'), '\.wav$', '') AS "SessionID"
  FROM 
    "GlueDatabaseName"."call_analytics_metadata"
),
transcript_events AS(
  SELECT 
    jobstatus,
    languagecode,
    IF(CARDINALITY(m.transcript)=0 OR CARDINALITY(m.transcript) IS NULL, NULL, e.transcript.id) AS utteranceId,
    IF(CARDINALITY(m.transcript)=0 OR CARDINALITY(m.transcript) IS NULL, NULL, e.transcript.content) AS transcript,
    accountid,
    channel,
    sessionid,
    contentmetadata.output AS "Redaction"
  FROM 
    "GlueDatabaseName"."transcribe_call_analytics_post_call" m
  CROSS JOIN UNNEST
    (IF(CARDINALITY(m.transcript)=0, ARRAY[NULL], transcript)) AS e(transcript)
)
SELECT 
    jobstatus,
    languagecode,
    a.utteranceId,
    transcript,
    accountid,
    channel,
    a.sessionid,
    "Redaction"
    "Timestamp",
    "DateTime",
    "Date",
    "Time",
    mediainsightspipelineid,
    "To Number",
    "VoiceConnector ID",
    "From Number",
    "Call ID",
    Direction,
    "Transaction ID"
FROM 
    "GlueDatabaseName"."transcribe_call_analytics_post_call" a
LEFT JOIN 
    metadata b
ON 
    a.sessionid = b.SessionID
```

## 查询媒体对象 URLs 以获取语音增强通话录音
<a name="qry-voice-enhancement-call-recording"></a>

以下示例查询连接了 `Voice enhancement call recording` URL：

```
SELECT 
    json_extract_scalar(metadata,'$.voiceConnectorId') AS "VoiceConnector ID",
    json_extract_scalar(metadata,'$.fromNumber') AS "From Number",
    json_extract_scalar(metadata,'$.toNumber') AS "To Number",
    json_extract_scalar(metadata,'$.callId') AS "Call ID",
    json_extract_scalar(metadata,'$.direction') AS Direction,
    json_extract_scalar(metadata,'$.transactionId') AS "Transaction ID",
    s3MediaObjectConsoleUrl
FROM
    {GlueDatabaseName}."call_analytics_recording_metadata"
WHERE
    detail-subtype = "VoiceEnhancement"
```