Alternative transcriptions

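When Amazon Transcribe transcribes audio, it creates different versions of the same transcript and assigns a confidence score to each version. In a typical transcription, you get only the version with the highest confidence score.

If you turn on alternative transcriptions, Amazon Transcribe also returns the other versions of the transcript that have lower confidence scores. You can request up to 10 alternative transcriptions. If you specify more alternatives than Amazon Transcribe identifies, only the alternatives that actually exist are returned.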

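All alternatives are located in the same transcription output file and are presented at the segment level. Segments are delimited by natural pauses in speech, such as a change in speaker or a pause in the audio.

Alternative transcriptions are available only for batch transcriptions.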

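Your transcription output is structured as follows. The ellipses (...) in the code examples indicate where content has been removed for brevity.

The complete final transcription for a given segment.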


"results": {
    "language_code": "en-US",
    "transcripts": [
        {
            "transcript": "The amazon is the largest rainforest on the planet."
        }
    ],

The confidence score for each word in the preceding transcript section.


"items": [
    {
        "start_time": "1.15",
        "end_time": "1.35",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": "The"
            }
        ],
        "type": "pronunciation"
    },
    {
        "start_time": "1.35",
        "end_time": "2.05",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": "amazon"
            }
        ],
        "type": "pronunciation"
    },
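
The per-word scores above can be used to flag words that may need human review. The following is a minimal Python sketch, assuming the `results` object has already been loaded from the output file (for example with `json.load`). The function name `low_confidence_words`, the 0.5 threshold, and the sample values are illustrative assumptions and are not part of Amazon Transcribe.

```python
def low_confidence_words(results, threshold=0.5):
    """Return (content, confidence, start_time) for words scored below threshold."""
    flagged = []
    for item in results.get("items", []):
        # Only spoken words are checked here; skip punctuation items.
        if item.get("type") != "pronunciation":
            continue
        # In the structure shown above, each item lists its chosen word first.
        best = item["alternatives"][0]
        # Confidence values are stored as strings in the output, so convert them.
        confidence = float(best["confidence"])
        if confidence < threshold:
            flagged.append((best["content"], confidence, item["start_time"]))
    return flagged


# Hypothetical sample shaped like the "items" array above (values are made up).
sample_results = {
    "items": [
        {"start_time": "1.15", "end_time": "1.35", "type": "pronunciation",
         "alternatives": [{"confidence": "1.0", "content": "The"}]},
        {"start_time": "3.06", "end_time": "3.96", "type": "pronunciation",
         "alternatives": [{"confidence": "0.42", "content": "rainforest"}]},
    ]
}

print(low_confidence_words(sample_results))
```

The threshold is a tuning choice: a higher value flags more words for review, a lower value flags fewer.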

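Your alternative transcriptions are located in the segments portion of your transcription output. The alternatives for each segment are ordered by descending confidence score.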


"segments": [
    {
        "start_time": "1.04",
        "end_time": "5.065",
        "alternatives": [
            {
                ...
                "transcript": "The amazon is the largest rain forest on the planet.",
                "items": [
                    {
                        "start_time": "1.15",
                        "confidence": "1.0",
                        "end_time": "1.35",
                        "type": "pronunciation",
                        "content": "The"
                    },
                    ...
                    {
                        "start_time": "3.06",
                        "confidence": "0.0037",
                        "end_time": "3.38",
                        "type": "pronunciation",
                        "content": "rain"
                    },
                    {
                        "start_time": "3.38",
                        "confidence": "0.0037",
                        "end_time": "3.96",
                        "type": "pronunciation",
                        "content": "forest"
                    },

The status at the end of the transcription output.
```
"status": "COMPLETED"
}
```
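
Given the output structure described above, the alternatives for each segment can be collected with a few lines of Python. This is a sketch under stated assumptions: `results` is the parsed `results` object from the output file, the helper name `segment_alternatives` is hypothetical, and the sample data is assembled from the two transcripts shown in this section (their ordering here is assumed). Because fewer alternatives may be returned than were requested, the code reads whatever is present instead of assuming a fixed count.

```python
def segment_alternatives(results):
    """Return one (start_time, end_time, transcripts) tuple per segment."""
    collected = []
    for segment in results.get("segments", []):
        # Keep the order found in the file, which lists alternatives by
        # descending confidence score.
        transcripts = [alt["transcript"] for alt in segment.get("alternatives", [])]
        collected.append((segment["start_time"], segment["end_time"], transcripts))
    return collected


# Illustrative sample shaped like the "segments" array above.
sample_results = {
    "segments": [
        {
            "start_time": "1.04",
            "end_time": "5.065",
            "alternatives": [
                {"transcript": "The amazon is the largest rainforest on the planet.",
                 "items": []},
                {"transcript": "The amazon is the largest rain forest on the planet.",
                 "items": []},
            ],
        }
    ]
}

for start, end, transcripts in segment_alternatives(sample_results):
    print(f"Segment {start}-{end}")
    for rank, text in enumerate(transcripts, start=1):
        print(f"  {rank}. {text}")
```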

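Requesting alternative transcriptions

You can request alternative transcriptions using the AWS Management Console, the AWS CLI, or the AWS SDKs. See the following examples: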

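1. Sign in to the AWS Management Console.
2. In the navigation pane, choose Transcription jobs, then select Create job (top right). This opens the Specify job details page.
3. Fill in any fields you want to include on the Specify job details page, then select Next. This takes you to the Configure job - optional page.
4. Select Alternative results, then specify the maximum number of alternative transcription results you want in your transcript.
5. Select Create job to run your transcription job.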

This example uses the start-transcription-job command and the ShowAlternatives parameter. For more information, see StartTranscriptionJob and ShowAlternatives.

Note that if your request includes ShowAlternatives=true, you must also include MaxAlternatives.


aws transcribe start-transcription-job \
--region us-west-2 \
--transcription-job-name my-first-transcription-job \
--media MediaFileUri=s3://amzn-s3-demo-bucket/my-input-files/my-media-file.flac \
--output-bucket-name amzn-s3-demo-bucket \
--output-key my-output-files/ \
--language-code en-US \
--settings ShowAlternatives=true,MaxAlternatives=4

Here is another example that uses the start-transcription-job command, this time with a request body that includes alternative transcriptions.


aws transcribe start-transcription-job \
--region us-west-2 \
--cli-input-json file://filepath/my-first-alt-transcription-job.json

The file my-first-alt-transcription-job.json contains the following request body.


{
    "TranscriptionJobName": "my-first-transcription-job",
    "Media": {
        "MediaFileUri": "s3://amzn-s3-demo-bucket/my-input-files/my-media-file.flac"
    },
    "OutputBucketName": "amzn-s3-demo-bucket",
    "OutputKey": "my-output-files/",
    "LanguageCode": "en-US",
    "Settings": {
        "ShowAlternatives": true,
        "MaxAlternatives": 4
    }
}
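
A request body like the one above can also be generated programmatically, which makes it easier to check the alternatives setting before submitting the job. The following Python sketch is one possible approach. The helper name `build_request` is hypothetical. The upper bound of 10 comes from the limit stated earlier in this section, while the lower bound of 2 is an assumption that should be verified against the StartTranscriptionJob API reference.

```python
import json


def build_request(job_name, media_uri, bucket, output_key,
                  max_alternatives, language_code="en-US"):
    """Build a StartTranscriptionJob request body with alternatives enabled."""
    # Assumed valid range: 2 to 10 (only the upper bound is stated in this section).
    if not 2 <= max_alternatives <= 10:
        raise ValueError("max_alternatives must be between 2 and 10")
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": bucket,
        "OutputKey": output_key,
        "LanguageCode": language_code,
        # ShowAlternatives and MaxAlternatives must be sent together.
        "Settings": {"ShowAlternatives": True, "MaxAlternatives": max_alternatives},
    }


request = build_request(
    "my-first-transcription-job",
    "s3://amzn-s3-demo-bucket/my-input-files/my-media-file.flac",
    "amzn-s3-demo-bucket",
    "my-output-files/",
    max_alternatives=4,
)

# Print the JSON; save this output to a file to use it with --cli-input-json.
print(json.dumps(request, indent=2))
```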

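The following example uses the AWS SDK for Python (Boto3) to request alternative transcriptions by passing the ShowAlternatives argument to the start_transcription_job method. For more information, see StartTranscriptionJob and ShowAlternatives.

For additional examples that use the AWS SDKs, including feature-specific, scenario, and cross-service examples, see the Code examples for Amazon Transcribe using AWS SDKs chapter.

Note that if your request includes 'ShowAlternatives':True, you must also include MaxAlternatives.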


import time
import boto3

# Create an Amazon Transcribe client in the us-west-2 Region.
transcribe = boto3.client('transcribe', 'us-west-2')
job_name = "my-first-transcription-job"
job_uri = "s3://amzn-s3-demo-bucket/my-input-files/my-media-file.flac"

# Start the job. ShowAlternatives requires MaxAlternatives to be set as well.
transcribe.start_transcription_job(
    TranscriptionJobName = job_name,
    Media = {
        'MediaFileUri': job_uri
    },
    OutputBucketName = 'amzn-s3-demo-bucket',
    OutputKey = 'my-output-files/',
    LanguageCode = 'en-US',
    Settings = {
        'ShowAlternatives': True,
        'MaxAlternatives': 4
    }
)

# Poll every 5 seconds until the job either completes or fails.
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName = job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
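
The polling loop above runs until the job reaches a terminal state, however long that takes. A variation that gives up after a deadline and reports failures explicitly might look like the following sketch. The function name `wait_for_job` and its default intervals are assumptions made for illustration; the only SDK call it relies on is `get_transcription_job`, as used in the example above.

```python
import time


def wait_for_job(client, job_name, poll_seconds=5, timeout_seconds=600):
    """Poll a transcription job until it completes, fails, or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        state = job["TranscriptionJobStatus"]
        if state == "COMPLETED":
            return job
        if state == "FAILED":
            # FailureReason is reported for failed jobs; fall back if it is absent.
            reason = job.get("FailureReason", "unknown reason")
            raise RuntimeError(f"Job {job_name} failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_name} did not finish in {timeout_seconds}s")
        time.sleep(poll_seconds)


# Example usage (requires AWS credentials, so it is left commented out):
# import boto3
# transcribe = boto3.client("transcribe", "us-west-2")
# job = wait_for_job(transcribe, "my-first-transcription-job")
# print(job["TranscriptionJobStatus"])
```

Passing the client in as an argument keeps the function easy to exercise with a stub object in tests, without calling AWS.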
