

# Using a text file to create a medical custom vocabulary
<a name="create-med-custom-vocabulary"></a>

To create a custom vocabulary, you must have prepared a text file that contains a collection a words or phrases. Amazon Transcribe Medical uses this text file to create a custom vocabulary that you can use to improve the transcription accuracy of those words or phrases. You can create a custom vocabulary using the [https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateMedicalVocabulary.html](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateMedicalVocabulary.html) API or the Amazon Transcribe Medical console.

## AWS Management Console
<a name="create-med-custom-vocab-console"></a>

To use the AWS Management Console to create a custom vocabulary, you provide the Amazon S3 URI of the text file containing your words or phrases.

1. Sign in to the [AWS Management Console](https://console.aws.amazon.com/transcribe/).

1. In the navigation pane, under Amazon Transcribe Medical, choose **Custom vocabulary**.

1. For **Name**, under **Vocabulary settings**, choose a name for your custom vocabulary.

1. Specify the location of your audio file or video file in Amazon S3:
   + For **Vocabulary input file location on S3** under **Vocabulary settings**, specify the Amazon S3 URI that identifies the text file you will use to create your custom vocabulary.
   + For **Vocabulary input file location in S3**, choose **Browse S3** to browse for the text file and choose it.

1. Choose **Create vocabulary**.

You can see the processing status of your custom vocabulary in the AWS Management Console.

## API
<a name="create-med-custom-vocab-api"></a>

**To create a medical custom vocabulary (API)**
+ For the [https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html) API, specify the following.

  1. For `LanguageCode`, specify `en-US`.

  1. For `VocabularyFileUri`, specify the Amazon S3 location of the text file that you use to define your custom vocabulary.

  1. For `VocabularyName`, specify a name for your custom vocabulary. The name you specify must be unique within your AWS account.

To see the processing status of your custom vocabulary, use the [https://docs.aws.amazon.com/transcribe/latest/APIReference/API_GetMedicalVocabulary.html](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_GetMedicalVocabulary.html) API.

The following is an example request using the AWS SDK for Python (Boto3) to create a custom vocabulary.

```
from __future__ import print_function
import time
import boto3  
transcribe = boto3.client('transcribe', '{{us-west-2}}')
vocab_name = "{{my-first-vocabulary}}"
response = transcribe.create_medical_vocabulary(
    VocabularyName = job_name,
    VocabularyFileUri = 's3://{{amzn-s3-demo-bucket}}/{{my-vocabularies}}/{{my-vocabulary-table}}.txt'
    LanguageCode = 'en-US',
  )
  
while True:
    status = transcribe.get_medical_vocabulary(VocabularyName = vocab_name)
    if status['VocabularyState'] in ['READY', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
```

## AWS CLI
<a name="create-med-custom-vocab-cli"></a>

**To enable speaker partitioning in a batch transcription job (AWS CLI)**
+ Run the following code.

  ```
  aws transcribe create-medical-vocabulary \
  --vocabulary-name {{my-first-vocabulary}} \ 
  --vocabulary-file-uri s3://{{amzn-s3-demo-bucket}}/{{my-vocabularies}}/{{my-vocabulary-file}}.txt \
  --language-code {{en-US}}
  ```