

# Debugging datasets


Two types of error can occur during dataset creation: *terminal errors* and *non-terminal errors*. Terminal errors can stop dataset creation or update. Non-terminal errors don't stop dataset creation or update.

**Topics**
+ [Debugging terminal dataset errors](debugging-datasets-terminal-errors.md)
+ [Debugging non-terminal dataset errors](debugging-datasets-non-terminal-errors.md)

# Debugging terminal dataset errors


There are two types of terminal errors: file errors, which cause dataset creation to fail, and content errors, where Amazon Rekognition Custom Labels removes the affected images from the dataset. Dataset creation fails if there are too many content errors.

**Topics**
+ [Terminal file errors](#debugging-datasets-terminal-file-errors)
+ [Terminal content errors](#debugging-datasets-terminal-content-errors)

## Terminal file errors


The following are file errors. You can get information about file errors by calling `DescribeDataset` and checking the `Status` and `StatusMessage` fields. For example code, see [Describing a dataset (SDK)](md-describing-dataset-sdk.md).
+ [ERROR\_MANIFEST\_INACCESSIBLE\_OR\_UNSUPPORTED\_FORMAT](#md-error-status-ERROR_MANIFEST_INACCESSIBLE_OR_UNSUPPORTED_FORMAT)
+ [ERROR\_MANIFEST\_SIZE\_TOO\_LARGE](#md-error-status-ERROR_MANIFEST_SIZE_TOO_LARGE)
+ [ERROR\_MANIFEST\_ROWS\_EXCEEDS\_MAXIMUM](#md-error-status-ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM)
+ [ERROR\_INVALID\_PERMISSIONS\_MANIFEST\_S3\_BUCKET](#md-error-status-ERROR_INVALID_PERMISSIONS_MANIFEST_S3_BUCKET)
+ [ERROR\_TOO\_MANY\_RECORDS\_IN\_ERROR](#md-error-status-ERROR_TOO_MANY_RECORDS_IN_ERROR)
+ [ERROR\_MANIFEST\_TOO\_MANY\_LABELS](#md-error-status-ERROR_MANIFEST_TOO_MANY_LABELS)
+ [ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_DISTRIBUTE](#md-error-status-ERROR_INSUFFICIENT_IMAGES_PER_LABEL_FOR_DISTRIBUTE)
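
As a rough illustration of the `DescribeDataset` check described above, the following Python sketch reads `Status` and `StatusMessage` from the response. It assumes a boto3 Rekognition client is passed in as `client`; the helper names `dataset_status` and `is_failed` are hypothetical and not part of any SDK.

```python
def dataset_status(client, dataset_arn):
    """Return (status, message) for a dataset by calling DescribeDataset."""
    response = client.describe_dataset(DatasetArn=dataset_arn)
    description = response["DatasetDescription"]
    # StatusMessage might be absent, so fall back to an empty string.
    return description["Status"], description.get("StatusMessage", "")


def is_failed(status):
    """Return True when the status reports a failed creation or update."""
    return status in ("CREATE_FAILED", "UPDATE_FAILED")
```

With boto3 installed and credentials configured, you could pass `boto3.client("rekognition")` as `client` and print the message whenever `is_failed` returns `True`.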

### ERROR\_MANIFEST\_INACCESSIBLE\_OR\_UNSUPPORTED\_FORMAT


#### Error message


The manifest file extension or contents are invalid.

The training or testing manifest file doesn't have a file extension or its contents are invalid. 

**To fix error *ERROR\_MANIFEST\_INACCESSIBLE\_OR\_UNSUPPORTED\_FORMAT***
+ Check the following possible causes in both the training and testing manifest files.
  + The manifest file is missing a file extension. By convention the file extension is `.manifest`.
  + The Amazon S3 bucket or key for the manifest file couldn't be found.

### ERROR\_MANIFEST\_SIZE\_TOO\_LARGE


#### Error message


The manifest file size exceeds the maximum supported size.

The training or testing manifest file size (in bytes) is too large. For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md). A manifest file can have fewer than the maximum number of JSON Lines and still exceed the maximum file size.

You can't use the Amazon Rekognition Custom Labels console to fix error *The manifest file size exceeds the maximum supported size*.

**To fix error *ERROR\_MANIFEST\_SIZE\_TOO\_LARGE***

1. Check which of the training and testing manifests exceed the maximum file size.

1. Reduce the number of JSON Lines in the manifest files that are too large. For more information, see [Creating a manifest file](md-create-manifest-file.md).
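
One way to carry out the first step is to measure each manifest locally before uploading it. The following Python sketch reports the file size in bytes and the number of JSON Lines. The limits are parameters because the actual quotas are listed in the guidelines and quotas page, and the function names are hypothetical.

```python
import os


def manifest_stats(path):
    """Return (size_in_bytes, json_line_count) for a local manifest file."""
    size_in_bytes = os.path.getsize(path)
    with open(path, "r", encoding="utf-8") as manifest:
        # Count only non-blank lines; each one is a JSON Line (one image).
        line_count = sum(1 for line in manifest if line.strip())
    return size_in_bytes, line_count


def exceeded_limits(path, max_bytes, max_lines):
    """Return which of the caller-supplied limits the manifest exceeds."""
    size_in_bytes, line_count = manifest_stats(path)
    exceeded = []
    if size_in_bytes > max_bytes:
        exceeded.append("size")
    if line_count > max_lines:
        exceeded.append("rows")
    return exceeded
```

Because size and row count are checked separately, the sketch also catches a manifest that stays under the row limit but is still too large in bytes.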

### ERROR\_MANIFEST\_ROWS\_EXCEEDS\_MAXIMUM


#### Error message


The manifest file has too many rows.

#### More information


The number of JSON Lines (number of images) in the manifest file is greater than the allowed limit. The limit is different for image-level models and object location models. For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md). 

JSON Line errors are validated until the number of JSON Lines reaches the `ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM` limit.

You can't use the Amazon Rekognition Custom Labels console to fix error `ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM`.

**To fix `ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM`**
+ Reduce the number of JSON Lines in the manifest. For more information, see [Creating a manifest file](md-create-manifest-file.md).



### ERROR\_INVALID\_PERMISSIONS\_MANIFEST\_S3\_BUCKET


#### Error message


The S3 bucket permissions are incorrect.

Amazon Rekognition Custom Labels doesn't have permissions to one or more of the buckets containing the training and testing manifest files. 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix error *ERROR\_INVALID\_PERMISSIONS\_MANIFEST\_S3\_BUCKET***
+ Check the permissions for the bucket(s) containing the training and testing manifests. For more information, see [Step 2: Set up Amazon Rekognition Custom Labels console permissions](su-console-policy.md).

### ERROR\_TOO\_MANY\_RECORDS\_IN\_ERROR


#### Error message


The manifest file has too many terminal errors.

**To fix `ERROR_TOO_MANY_RECORDS_IN_ERROR`**
+ Reduce the number of JSON Lines (images) with terminal content errors. For more information, see [Terminal manifest content errors](tm-debugging-aggregate-errors.md). 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

### ERROR\_MANIFEST\_TOO\_MANY\_LABELS


#### Error message


The manifest file has too many labels.

#### More information


The number of unique labels in the manifest (dataset) is more than the allowed limit. If the training dataset is split to create a testing dataset, the number of labels is determined after the split.

**To fix ERROR\_MANIFEST\_TOO\_MANY\_LABELS (Console)**
+ Remove labels from the dataset. For more information, see [Managing labels](md-labels.md). The labels are automatically removed from the images and bounding boxes in your dataset.



**To fix ERROR\_MANIFEST\_TOO\_MANY\_LABELS (JSON Line)**
+ Manifests with image-level JSON Lines: If an image has a single label, remove the JSON Lines for images that use the label that you want to remove. If a JSON Line contains multiple labels, remove only the JSON object for that label. For more information, see [Adding multiple image-level labels to an image](md-create-manifest-file-classification.md#md-dataset-purpose-classification-multiple-labels).
+ Manifests with object location JSON Lines: Remove the bounding box and associated label information for the label that you want to remove. Do this for each JSON Line that contains the label. You need to remove the label from `class-map` and remove the corresponding objects from the `objects` and `annotations` arrays. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).
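
The object location case can be sketched in Python as follows. This is a minimal illustration, not a supported tool. It assumes the JSON Line follows the SageMaker Ground Truth object detection layout, where the label attribute (passed in as `label_attribute`) holds the `annotations` array and the matching `-metadata` attribute holds `objects` and `class-map`, and where `annotations[i]` and `objects[i]` describe the same bounding box. The function name `remove_label` is hypothetical.

```python
import json


def remove_label(json_line, label_attribute, label_name):
    """Return the JSON Line with every bounding box for label_name removed."""
    entry = json.loads(json_line)
    metadata = entry[f"{label_attribute}-metadata"]
    class_map = metadata["class-map"]

    # class-map keys are class IDs stored as strings; collect the IDs to drop.
    removed_ids = {int(cid) for cid, name in class_map.items() if name == label_name}
    if not removed_ids:
        return json_line  # the label isn't used in this JSON Line

    annotations = entry[label_attribute]["annotations"]
    objects = metadata["objects"]
    # Filter both arrays together so they stay aligned by index.
    kept = [(a, o) for a, o in zip(annotations, objects)
            if a["class_id"] not in removed_ids]
    entry[label_attribute]["annotations"] = [a for a, _ in kept]
    metadata["objects"] = [o for _, o in kept]
    metadata["class-map"] = {cid: name for cid, name in class_map.items()
                             if name != label_name}
    return json.dumps(entry)
```

You would apply the function to each JSON Line in the manifest and write the results to a new manifest file. A JSON Line can end up with no bounding boxes, so review those lines before you use the new manifest.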

### ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_DISTRIBUTE


#### Error message


The manifest file doesn't have enough labeled images to distribute the dataset.



Dataset distribution occurs when Amazon Rekognition Custom Labels splits a training dataset to create a test dataset. You can also split a dataset by calling the `DistributeDatasetEntries` API.

**To fix error *ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_DISTRIBUTE***
+ Add more labeled images to the training dataset.
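
To decide which labels need more images, it can help to count how many images carry each label. The following Python sketch does this for a list of JSON Lines. It assumes the SageMaker Ground Truth layouts: image-level labels store the label name in `class-name` inside a `-metadata` attribute, and object location labels store `class-map` in the `-metadata` attribute with `annotations` in the matching label attribute. The function name `images_per_label` is hypothetical.

```python
import json
from collections import Counter


def images_per_label(json_lines):
    """Count how many images carry each label across the given JSON Lines."""
    counts = Counter()
    for line in json_lines:
        if not line.strip():
            continue
        entry = json.loads(line)
        labels = set()  # a label counts once per image
        for key, value in entry.items():
            if not key.endswith("-metadata") or not isinstance(value, dict):
                continue
            if "class-name" in value:
                # Image-level label attribute.
                labels.add(value["class-name"])
            elif "class-map" in value:
                # Object location attribute: look up each annotated class ID.
                attribute = entry.get(key[: -len("-metadata")], {})
                for annotation in attribute.get("annotations", []):
                    name = value["class-map"].get(str(annotation["class_id"]))
                    if name is not None:
                        labels.add(name)
        counts.update(labels)
    return counts
```

Labels with the lowest counts are the likeliest candidates for additional images.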

## Terminal content errors


The following are terminal content errors. During dataset creation, images that have terminal content errors are removed from the dataset. The dataset can still be used for training. If there are too many content errors, dataset creation or update fails. Terminal content errors related to dataset operations aren't displayed in the console or returned from `DescribeDataset` or other API operations. If you notice that images or annotations are missing from your datasets, check your dataset manifest files for the following issues:
+ A JSON Line is too long. The maximum length is 100,000 characters.
+ The `source-ref` value is missing from a JSON Line.
+ The format of a `source-ref` value in a JSON Line is invalid.
+ The contents of a JSON Line are not valid.
+ The value of a `source-ref` field appears more than once. An image can only be referenced once in a dataset.

For information about the `source-ref` field, see [Creating a manifest file](md-create-manifest-file.md). 
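
Because these errors aren't reported by the console or the API, a local check of the manifest can help. The following Python sketch scans JSON Lines for the issues listed above. It makes two assumptions that the list doesn't spell out: a valid `source-ref` has the form `s3://bucket/key`, and "contents are not valid" means the line doesn't parse as a JSON object. The function name `find_content_errors` is hypothetical.

```python
import json
import re

MAX_LINE_LENGTH = 100_000  # maximum JSON Line length, in characters
# Assumption: a valid source-ref looks like s3://bucket/key.
S3_URI = re.compile(r"^s3://[^/]+/.+$")


def find_content_errors(json_lines):
    """Return a list of (line_number, problem) tuples for the manifest lines."""
    problems = []
    seen = set()
    for number, line in enumerate(json_lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "line too long"))
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "invalid JSON"))
            continue
        if not isinstance(entry, dict):
            problems.append((number, "invalid JSON"))
            continue
        source_ref = entry.get("source-ref")
        if source_ref is None:
            problems.append((number, "missing source-ref"))
        elif not isinstance(source_ref, str) or not S3_URI.match(source_ref):
            problems.append((number, "invalid source-ref format"))
        elif source_ref in seen:
            problems.append((number, "duplicate source-ref"))
        else:
            seen.add(source_ref)
    return problems
```

You can pass an open manifest file directly, for example `find_content_errors(open("train.manifest", encoding="utf-8"))`, where the file name is a placeholder.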

# Debugging non-terminal dataset errors


The following are non-terminal errors that can occur during dataset creation or update. These errors can invalidate an entire JSON Line or invalidate annotations within a JSON Line. If a JSON Line has an error, it is not used for training. If an annotation within a JSON Line has an error, the JSON Line is still used for training, but without the broken annotation. For more information about JSON Lines, see [Creating a manifest file](md-create-manifest-file.md).

You can access non-terminal errors from the console and by calling the `ListDatasetEntries` API. For more information, see [Listing dataset entries (SDK)](md-listing-dataset-entries-sdk.md).

The following errors are also returned during training. We recommend that you fix these errors before training your model. For more information, see [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md).
+ [ERROR\_NO\_LABEL\_ATTRIBUTES](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_LABEL_ATTRIBUTES)
+ [ERROR\_INVALID\_LABEL\_ATTRIBUTE\_FORMAT](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_ATTRIBUTE_FORMAT)
+ [ERROR\_INVALID\_LABEL\_ATTRIBUTE\_METADATA\_FORMAT](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_ATTRIBUTE_METADATA_FORMAT)
+ [ERROR\_NO\_VALID\_LABEL\_ATTRIBUTES](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_VALID_LABEL_ATTRIBUTES)
+ [ERROR\_INVALID\_BOUNDING\_BOX](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_BOUNDING_BOX)
+ [ERROR\_INVALID\_IMAGE\_DIMENSION](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_IMAGE_DIMENSION)
+ [ERROR\_BOUNDING\_BOX\_TOO\_SMALL](tm-debugging-json-line-errors.md#tm-error-ERROR_BOUNDING_BOX_TOO_SMALL)
+ [ERROR\_NO\_VALID\_ANNOTATIONS](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_VALID_ANNOTATIONS)
+ [ERROR\_MISSING\_BOUNDING\_BOX\_CONFIDENCE](tm-debugging-json-line-errors.md#tm-error-ERROR_MISSING_BOUNDING_BOX_CONFIDENCE)
+ [ERROR\_MISSING\_CLASS\_MAP\_ID](tm-debugging-json-line-errors.md#tm-error-ERROR_MISSING_CLASS_MAP_ID)
+ [ERROR\_TOO\_MANY\_BOUNDING\_BOXES](tm-debugging-json-line-errors.md#tm-error-ERROR_TOO_MANY_BOUNDING_BOXES)
+ [ERROR\_UNSUPPORTED\_USE\_CASE\_TYPE](tm-debugging-json-line-errors.md#tm-error-ERROR_UNSUPPORTED_USE_CASE_TYPE)
+ [ERROR\_INVALID\_LABEL\_NAME\_LENGTH](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_NAME_LENGTH)

## Accessing non-terminal errors


You can use the console to find out which images in a dataset have non-terminal errors. You can also call the `ListDatasetEntries` API to get the error messages. For more information, see [Listing dataset entries (SDK)](md-listing-dataset-entries-sdk.md).

**To access non-terminal errors (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. On the **Projects** page, choose the project that you want to use. The details page for your project is displayed.

1. If you want to view non-terminal errors in your training dataset, choose the **Training** tab. Otherwise choose the **Test** tab to view non-terminal errors in your test dataset. 

1. In the **Labels** section of the dataset gallery, choose **Errors**. The dataset gallery is filtered to only show images with errors.

1. Choose **Error** underneath an image to see the error code. Use the information at [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md) to fix the error.  
![\[Error dialog showing "ERROR_UNSUPPORTED_USE_CASE_TYPE" and "ERROR_NO_VALID_LABEL_ATTRIBUTES" under "Dataset record errors".\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/dataset-non-terminal-error.jpg)
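
As an alternative to the console steps above, the following Python sketch lists the dataset entries that have errors by calling `ListDatasetEntries` with the `HasErrors` filter. It assumes a boto3 Rekognition client is passed in as `client`, and the function name `entries_with_errors` is hypothetical. The sketch only parses each returned JSON Line and makes no assumption about where the error details appear within it.

```python
import json


def entries_with_errors(client, dataset_arn):
    """Yield each dataset entry that has errors, parsed from its JSON Line."""
    kwargs = {"DatasetArn": dataset_arn, "HasErrors": True}
    while True:
        response = client.list_dataset_entries(**kwargs)
        for json_line in response.get("DatasetEntries", []):
            yield json.loads(json_line)
        # Keep requesting pages until the service stops returning a token.
        token = response.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
```

With boto3 installed and credentials configured, you could pass `boto3.client("rekognition")` as `client` and print the `source-ref` of each entry to see which images are affected.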