

# Video Frame Input Data


When you create a video frame object detection or object tracking labeling job, you can choose video files (MP4 files) or video frames for input data. All worker tasks are created using video frames, so if you choose video files, use the Ground Truth frame extraction tool to extract video frames (images) from your video files. 

For both of these options, you can use the **Automated data setup** option in the Ground Truth section of the Amazon SageMaker AI console to set up a connection between Ground Truth and your input data in Amazon S3 so that Ground Truth knows where to look for your input data when creating your labeling tasks. This creates and stores an input manifest file in your Amazon S3 input dataset location. To learn more, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md).

Alternatively, you can manually create sequence files for each sequence of video frames that you want labeled and provide the Amazon S3 location of an input manifest file that references each of these sequence files using the `source-ref` key. To learn more, see [Create a Video Frame Input Manifest File](sms-video-manual-data-setup.md#sms-video-create-manifest). 

**Topics**
+ [Choose Video Files or Video Frames for Input Data](sms-point-cloud-video-input-data.md)
+ [Input Data Setup](sms-video-data-setup.md)

# Choose Video Files or Video Frames for Input Data


When you create a video frame object detection or object tracking labeling job, you can provide a sequence of video frames (images) or you can use the Amazon SageMaker AI console to have Ground Truth automatically extract video frames from your video files. Use the following sections to learn more about these options. 

## Provide Video Frames


Video frames are sequences of images extracted from a video file. You can create a Ground Truth labeling job to have workers label multiple sequences of video frames. Each sequence is made up of images extracted from a single video. 

To create a labeling job using video frame sequences, you must store each sequence using a unique [key name prefix](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-keys) in Amazon S3. In the Amazon S3 console, key name prefixes are folders. So in the Amazon S3 console, each sequence of video frames must be located in its own folder in Amazon S3.

For example, if you have two sequences of video frames, you might use the key name prefixes `sequence1/` and `sequence2/` to identify your sequences. In this example, your sequences may be located in `s3://amzn-s3-demo-bucket/video-frames/sequence1/` and `s3://amzn-s3-demo-bucket/video-frames/sequence2/`.

If you are using the Ground Truth console to create an input manifest file, all of the sequence key name prefixes should be in the same location in Amazon S3. For example, in the Amazon S3 console, each sequence could be in a folder in `s3://amzn-s3-demo-bucket/video-frames/`. In this example, your first sequence of video frames (images) may be located in `s3://amzn-s3-demo-bucket/video-frames/sequence1/` and your second sequence may be located in `s3://amzn-s3-demo-bucket/video-frames/sequence2/`. 
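The one-prefix-per-sequence layout described above can be sketched in plain Python. This is a hypothetical helper (not part of Ground Truth or any AWS SDK) that groups object keys by their immediate "folder" under a common base prefix, mirroring how the console treats each key name prefix as one sequence:

```python
# Hypothetical helper: group S3 object keys by their immediate
# "folder" (key name prefix) under a common base prefix, mirroring
# the one-prefix-per-sequence layout Ground Truth expects.
from collections import defaultdict

def group_keys_by_sequence(keys, base_prefix="video-frames/"):
    sequences = defaultdict(list)
    for key in keys:
        if not key.startswith(base_prefix):
            continue
        remainder = key[len(base_prefix):]
        if "/" not in remainder:
            # Objects directly under the base prefix belong to no sequence.
            continue
        sequence = remainder.split("/", 1)[0]
        sequences[sequence].append(key)
    return dict(sequences)

keys = [
    "video-frames/sequence1/0001.jpg",
    "video-frames/sequence1/0002.jpg",
    "video-frames/sequence2/0001.jpg",
]
print(group_keys_by_sequence(keys))
```

In this sketch, `sequence1` and `sequence2` each come back as a distinct group, which is the layout the console tool expects when it builds one task per sequence.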

**Important**  
Even if you only have a single sequence of video frames that you want workers to label, that sequence must have a key name prefix in Amazon S3. If you are using the Amazon S3 console, this means that your sequence is located in a folder. It cannot be located in the root of your S3 bucket. 

When creating worker tasks using sequences of video frames, Ground Truth uses one sequence per task. In each task, Ground Truth orders your video frames using [UTF-8](https://en.wikipedia.org/wiki/UTF-8) binary order. 

For example, video frames might be in the following order in Amazon S3: 

```
[0001.jpg, 0002.jpg, 0003.jpg, ..., 0011.jpg]
```

They are arranged in the same order in the worker’s task: `0001.jpg, 0002.jpg, 0003.jpg, ..., 0011.jpg`.

Frames might also be ordered using a naming convention like the following:

```
[frame1.jpg, frame2.jpg, ..., frame11.jpg]
```

In this case, `frame10.jpg` and `frame11.jpg` come before `frame2.jpg` in the worker task. Your worker sees your video frames in the following order: `frame1.jpg, frame10.jpg, frame11.jpg, frame2.jpg, ..., frame9.jpg`. 
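You can reproduce this ordering locally: for ASCII file names, Python's `sorted()` on strings matches UTF-8 binary order. The snippet below also shows that zero-padding frame numbers keeps binary order and numeric order in agreement:

```python
# Python's sorted() on ASCII strings reproduces the UTF-8 binary
# ordering Ground Truth uses for frames in a worker task.
frames = ["frame1.jpg", "frame2.jpg", "frame10.jpg", "frame11.jpg"]
print(sorted(frames))   # frame1.jpg, frame10.jpg, frame11.jpg, frame2.jpg

# Zero-padding the frame numbers keeps binary order and numeric
# order in agreement:
padded = [f"{n:04d}.jpg" for n in (1, 2, 10, 11)]
print(sorted(padded))   # 0001.jpg, 0002.jpg, 0010.jpg, 0011.jpg
```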

## Provide Video Files


You can use the Ground Truth frame splitting feature when creating a new labeling job in the console to extract video frames from video files (MP4 files). A series of video frames extracted from a single video file is referred to as a *sequence of video frames*.

You can either have Ground Truth automatically extract all frames, up to 2,000, from the video, or you can specify a frequency for frame extraction. For example, you can have Ground Truth extract every 10th frame from your videos.

You can provide up to 50 videos when you use automated data setup to extract frames. However, your input manifest file cannot reference more than 10 video frame sequence files when you create a video frame object tracking or object detection labeling job. If you use the automated data setup console tool to extract video frames from more than 10 video files, you must modify the manifest file that the tool generates, or create a new one, so that it includes 10 or fewer video frame sequence files. To learn more about these quotas, see [3D Point Cloud and Video Frame Labeling Job Quotas](input-data-limits.md#sms-input-data-quotas-other).
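One way to bring a generated manifest within the 10-sequence quota is to truncate it. The sketch below is a hypothetical post-processing step using only the standard library; the bucket and file names are placeholders:

```python
# Hypothetical post-processing step: keep a generated input manifest
# within the 10-sequence-file quota by keeping only its first 10 lines.
import json

def trim_manifest(lines, max_sequences=10):
    """Return at most max_sequences manifest lines, validating each one."""
    kept = []
    for line in lines[:max_sequences]:
        entry = json.loads(line)
        assert "source-ref" in entry, "each manifest line needs a source-ref key"
        kept.append(line)
    return kept

# 12 sequence files: 2 more than the quota allows.
lines = [json.dumps({"source-ref": f"s3://amzn-s3-demo-bucket/seq{i}.json"})
         for i in range(1, 13)]
print(len(trim_manifest(lines)))  # 10
```

In practice you would upload the trimmed file back to your Amazon S3 input location before creating the labeling job.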

To use the video frame extraction tool, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md). 

When all of your video frames have been successfully extracted from your videos, you will see the following in your S3 input dataset location:
+ A key name prefix (a folder in the Amazon S3 console) named after each video. Each of these prefixes leads to:
  + A sequence of video frames extracted from the video used to name that prefix.
  + A sequence file used to identify all of the images that make up that sequence. 
+ An input manifest file with a .manifest extension. This identifies all of the sequence files that will be used to create your labeling job. 

All of the frames extracted from a single video file are used for a labeling task. If you extract video frames from multiple video files, multiple tasks are created for your labeling job, one for each sequence of video frames. 

 Ground Truth stores each sequence of video frames that it extracts in your Amazon S3 location for input datasets using a unique [key name prefix](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-keys). In the Amazon S3 console, key name prefixes are folders.

# Input Data Setup


When you create a video frame labeling job, you need to let Ground Truth know where to look for your input data. You can do this in one of two ways:
+ You can store your input data in Amazon S3 and have Ground Truth automatically detect the input dataset used for your labeling job. See [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md) to learn more about this option. 
+ You can create an input manifest file and sequence files and upload them to Amazon S3. See [Set up Video Frame Input Data Manually](sms-video-manual-data-setup.md) to learn more about this option. 

**Topics**
+ [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md)
+ [Set up Video Frame Input Data Manually](sms-video-manual-data-setup.md)

# Set up Automated Video Frame Input Data


You can use the Ground Truth automated data setup to automatically detect video files in your Amazon S3 bucket and extract video frames from those files. To learn how, see [Provide Video Files](sms-point-cloud-video-input-data.md#sms-point-cloud-video-frame-extraction).

If you already have video frames in Amazon S3, you can use the automated data setup to use these video frames in your labeling job. For this option, all video frames from a single video must be stored using a unique prefix. To learn about the requirements to use this option, see [Provide Video Frames](sms-point-cloud-video-input-data.md#sms-video-provide-frames).

Select one of the following sections to learn how to set up your automatic input dataset connection with Ground Truth.

## Provide Video Files and Extract Frames


Use the following procedure to connect your video files with Ground Truth and automatically extract video frames from those files for video frame object detection and object tracking labeling jobs.

**Note**  
If you use the automated data setup console tool to extract video frames from more than 10 video files, you must modify the manifest file that the tool generates, or create a new one, so that it includes 10 or fewer video frame sequence files. To learn more, see [Provide Video Files](sms-point-cloud-video-input-data.md#sms-point-cloud-video-frame-extraction).

Make sure your video files are stored in an Amazon S3 bucket in the same AWS Region that you perform the automated data setup in. 

**Automatically connect your video files in Amazon S3 with Ground Truth and extract video frames:**

1. Navigate to the **Create labeling job** page in the Amazon SageMaker AI console: [https://console.aws.amazon.com/sagemaker/groundtruth](https://console.aws.amazon.com//sagemaker/groundtruth). 

   Your input and output S3 buckets must be located in the same AWS Region that you create your labeling job in. This link puts you in the North Virginia (us-east-1) AWS Region. If your input data is in an Amazon S3 bucket in another Region, switch to that Region. To change your AWS Region, on the [navigation bar](https://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/getting-started.html#select-region), choose the name of the currently displayed Region.

1. Select **Create labeling job**.

1. Enter a **Job name**. 

1. In the section **Input data setup**, select **Automated data setup**.

1. Enter an Amazon S3 URI for **S3 location for input datasets**. An S3 URI looks like the following: `s3://amzn-s3-demo-bucket/path-to-files/`. This URI should point to the Amazon S3 location where your video files are stored.

1. Specify your **S3 location for output datasets**. This is where your output data is stored. You can choose to store your output data in the **Same location as input dataset**, or select **Specify a new location** and enter the S3 URI of the location where you want to store your output data.

1. Choose **Video Files** for your **Data type** using the dropdown list.

1. Choose **Yes, extract frames for object tracking and detection tasks**. 

1. Choose a method of **Frame extraction**.
   + When you choose **Use all frames extracted from the video to create a labeling task**, Ground Truth extracts all frames from each video in your **S3 location for input datasets**, up to 2,000 frames. If a video in your input dataset contains more than 2,000 frames, the first 2,000 are extracted and used for that labeling task. 
   + When you choose **Use every *x* frame from a video to create a labeling task**, Ground Truth extracts every *x*th frame from each video in your **S3 location for input datasets**. 

     For example, if your video is 2 seconds long, and has a [frame rate](https://en.wikipedia.org/wiki/Frame_rate) of 30 frames per second, there are 60 frames in your video. If you specify 10 here, Ground Truth extracts every 10th frame from your video. This means the 1st, 10th, 20th, 30th, 40th, 50th, and 60th frames are extracted. 

1. Choose or create an IAM execution role. Make sure that this role has permission to access your Amazon S3 locations for input and output data specified in steps 5 and 6. 

1. Select **Complete data setup**.
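The frame-extraction arithmetic in the frequency example above (step 8, **Use every *x* frame**) can be sketched as follows. This is only an illustration of the pattern the example describes (the 1st frame, then every *x*th frame); the actual sampling behavior belongs to Ground Truth, not to this code:

```python
# Illustration of the sampling pattern described in the text:
# the 1st frame, then every x-th frame up to the total frame count.
def extracted_frame_numbers(total_frames, x):
    frames = [1] + list(range(x, total_frames + 1, x))
    # Deduplicate in case x == 1, where frame 1 would appear twice.
    return sorted(set(frames))

# A 2-second video at 30 fps has 60 frames; sampling every 10th frame:
print(extracted_frame_numbers(60, 10))  # [1, 10, 20, 30, 40, 50, 60]
```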

## Provide Video Frames


Use the following procedure to connect your sequences of video frames with Ground Truth for video frame object detection and object tracking labeling jobs. 

Make sure your video frames are stored in an Amazon S3 bucket in the same AWS Region that you perform the automated data setup in. Each sequence of video frames must have a unique prefix. For example, if you have two sequences stored in `s3://amzn-s3-demo-bucket/video-frames/sequences/`, each needs a unique prefix like `sequence1` and `sequence2`, and both must be located directly under the `sequences/` prefix. In this example, the locations of the two sequences are `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence1/` and `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence2/`. 

**Automatically connect your video frames in Amazon S3 with Ground Truth:**

1. Navigate to the **Create labeling job** page in the Amazon SageMaker AI console: [https://console.aws.amazon.com/sagemaker/groundtruth](https://console.aws.amazon.com//sagemaker/groundtruth). 

   Your input and output S3 buckets must be located in the same AWS Region that you create your labeling job in. This link puts you in the North Virginia (us-east-1) AWS Region. If your input data is in an Amazon S3 bucket in another Region, switch to that Region. To change your AWS Region, on the [navigation bar](https://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/getting-started.html#select-region), choose the name of the currently displayed Region.

1. Select **Create labeling job**.

1. Enter a **Job name**. 

1. In the section **Input data setup**, select **Automated data setup**.

1. Enter an Amazon S3 URI for **S3 location for input datasets**. 

   This should be the Amazon S3 location where your sequences are stored. For example, if you have two sequences stored in `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence1/` and `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence2/`, enter `s3://amzn-s3-demo-bucket/video-frames/sequences/` here.

1. Specify your **S3 location for output datasets**. This is where your output data is stored. You can choose to store your output data in the **Same location as input dataset**, or select **Specify a new location** and enter the S3 URI of the location where you want to store your output data.

1. Choose **Video frames** for your **Data type** using the dropdown list. 

1. Choose or create an IAM execution role. Make sure that this role has permission to access your Amazon S3 locations for input and output data specified in steps 5 and 6. 

1. Select **Complete data setup**.

These procedures create an input manifest file in the Amazon S3 location for input datasets that you specified in step 5. If you are creating a labeling job using the SageMaker API, the AWS CLI, or an AWS SDK, use the Amazon S3 URI of this input manifest file as the value of the `ManifestS3Uri` parameter.

# Set up Video Frame Input Data Manually


Choose the manual data setup option if you have created sequence files for each of your video frame sequences, and an input manifest file that references each of those sequence files.

## Create a Video Frame Input Manifest File


 Ground Truth uses the input manifest file to identify the location of your input dataset when creating labeling tasks. For video frame object detection and object tracking labeling jobs, each line in the input manifest file identifies the location of a video frame sequence file. Each sequence file identifies the images included in a single sequence of video frames.

Use this page to learn how to create a video frame sequence file and an input manifest file for video frame object tracking and object detection labeling jobs.

If you want Ground Truth to automatically generate your sequence files and input manifest file, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md). 

### Create a Video Frame Sequence Input Manifest


In the video frame sequence input manifest file, each line in the manifest is a JSON object, with a `"source-ref"` key that references a sequence file. Each sequence file identifies the location of a sequence of video frames. This is the manifest file formatting required for all video frame labeling jobs. 

The following example demonstrates the syntax used for an input manifest file:

```
{"source-ref": "s3://amzn-s3-demo-bucket/example-folder/seq1.json"}
{"source-ref": "s3://amzn-s3-demo-bucket/example-folder/seq2.json"}
```
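A manifest in this format can be produced with the standard library alone. This is a minimal sketch; the bucket and sequence file names are the placeholders from the example above:

```python
# Minimal sketch: write a video frame input manifest, one JSON
# object per line, each with a "source-ref" key pointing to a
# sequence file. Bucket and file names are placeholders.
import json

sequence_files = [
    "s3://amzn-s3-demo-bucket/example-folder/seq1.json",
    "s3://amzn-s3-demo-bucket/example-folder/seq2.json",
]

manifest_lines = [json.dumps({"source-ref": uri}) for uri in sequence_files]
print("\n".join(manifest_lines))
```

You would then upload the resulting file to Amazon S3 and reference its URI when creating the labeling job.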

### Create a Video Frame Sequence File


The data for each sequence of video frames needs to be stored in a JSON data object. The following is an example of the format you use for a sequence file. Information about each frame is included as a JSON object and is listed in the `frames` list. The following JSON has been expanded for readability. 

```
{
 "seq-no": 1,
 "prefix": "s3://amzn-s3-demo-bucket/prefix/video1/",
 "number-of-frames": 3,
 "frames":[
   {"frame-no": 1, "unix-timestamp": 1566861644, "frame": "frame0001.jpg" },
   {"frame-no": 2, "unix-timestamp": 1566861644, "frame": "frame0002.jpg" }, 
   {"frame-no": 3, "unix-timestamp": 1566861644, "frame": "frame0003.jpg" }   
 ]
}
```

The following table provides details about the parameters shown in this code example. 


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `seq-no`  |  Yes  |  Integer  |  The ordered number of the sequence.   | 
|  `prefix`  |  Yes  |  String **Accepted Values**: `s3://<bucket-name>/<prefix>/`  |  The Amazon S3 location where the sequence files are located.  The prefix must end with a forward slash: `/`.  | 
|  `number-of-frames`  |  Yes  |  Integer  |  The total number of frames included in the sequence file. This number must match the total number of frames listed in the `frames` parameter in the next row.  | 
|  `frames`  |  Yes  |  List of JSON objects **Required**: `frame-no`, `frame` **Optional**: `unix-timestamp`  |  A list of frame data. The length of the list must equal `number-of-frames`. In the worker UI, frames in a sequence are ordered in [UTF-8](https://en.wikipedia.org/wiki/UTF-8) binary order. To learn more about this ordering, see [Provide Video Frames](sms-point-cloud-video-input-data.md#sms-video-provide-frames).  | 
|  `frame-no`  |  Yes  |  Integer  |  The frame order number. This determines the order of a frame in the sequence.   | 
|  `unix-timestamp`  |  No  |  Integer  |  The Unix timestamp of the frame: the number of seconds since January 1, 1970 (UTC) at the time the frame was captured.   | 
|  `frame`  |  Yes  |  String  |  The name of the video frame image file.   | 
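A sequence file matching the example and table above can be assembled programmatically. This sketch uses only the standard library; the prefix and timestamp are placeholders taken from the example:

```python
# Sketch: build a video frame sequence file from a list of frame
# file names. The prefix and timestamp values are placeholders.
import json

def build_sequence(seq_no, prefix, frame_names, timestamp=1566861644):
    frames = [{"frame-no": i, "unix-timestamp": timestamp, "frame": name}
              for i, name in enumerate(frame_names, start=1)]
    return {
        "seq-no": seq_no,
        "prefix": prefix,                 # must end with a forward slash
        "number-of-frames": len(frames),  # must equal len(frames)
        "frames": frames,
    }

seq = build_sequence(1, "s3://amzn-s3-demo-bucket/prefix/video1/",
                     ["frame0001.jpg", "frame0002.jpg", "frame0003.jpg"])
print(json.dumps(seq, indent=1))
```

Building the file this way keeps `number-of-frames` consistent with the length of the `frames` list, which the table above notes is required.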