# AWS::Personalize::Dataset

Creates an empty dataset and adds it to the specified dataset group. Use [CreateDatasetImportJob](https://docs.aws.amazon.com/personalize/latest/dg/API_CreateDatasetImportJob.html) to import your training data to a dataset.

There are 5 types of datasets:
+ Item interactions
+ Items
+ Users
+ Action interactions (you can't use CloudFormation to create an Action interactions dataset)
+ Actions (you can't use CloudFormation to create an Actions dataset)

Each dataset type has an associated schema with required field types. Only the `Item interactions` dataset is required in order to train a model (also referred to as creating a solution).

A dataset can be in one of the following states:
+ CREATE PENDING > CREATE IN_PROGRESS > ACTIVE -or- CREATE FAILED
+ DELETE PENDING > DELETE IN_PROGRESS

To get the status of the dataset, call [DescribeDataset](https://docs.aws.amazon.com/personalize/latest/dg/API_DescribeDataset.html).

**Related APIs**
+ [CreateDatasetGroup](https://docs.aws.amazon.com/personalize/latest/dg/API_CreateDatasetGroup.html)
+ [ListDatasets](https://docs.aws.amazon.com/personalize/latest/dg/API_ListDatasets.html)
+ [DescribeDataset](https://docs.aws.amazon.com/personalize/latest/dg/API_DescribeDataset.html)
+ [DeleteDataset](https://docs.aws.amazon.com/personalize/latest/dg/API_DeleteDataset.html)

## Syntax

To declare this entity in your CloudFormation template, use the following syntax:

### JSON

```
{
  "Type" : "AWS::Personalize::Dataset",
  "Properties" : {
      "DatasetGroupArn" : String,
      "DatasetImportJob" : DatasetImportJob,
      "DatasetType" : String,
      "Name" : String,
      "SchemaArn" : String
    }
}
```

### YAML

```
Type: AWS::Personalize::Dataset
Properties:
  DatasetGroupArn: String
  DatasetImportJob:
    DatasetImportJob
  DatasetType: String
  Name: String
  SchemaArn: String
```

## Properties

`DatasetGroupArn`

The Amazon Resource Name (ARN) of the dataset group.
*Required*: Yes
*Type*: String
*Pattern*: `arn:([a-z\d-]+):personalize:.*:.*:.+`
*Maximum*: `256`
*Update requires*: [Replacement](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-replacement)

`DatasetImportJob`

Describes a job that imports training data from a data source (Amazon S3 bucket) to an Amazon Personalize dataset. If you specify a dataset import job as part of a dataset, all dataset import job fields are required.
*Required*: No
*Type*: [DatasetImportJob](aws-properties-personalize-dataset-datasetimportjob.md)
*Update requires*: [No interruption](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-no-interrupt)

`DatasetType`

One of the following values:
+ Interactions
+ Items
+ Users

You can't use CloudFormation to create an Action Interactions or Actions dataset.
*Required*: Yes
*Type*: String
*Allowed values*: `Interactions | Items | Users`
*Maximum*: `256`
*Update requires*: [Replacement](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-replacement)

`Name`

The name of the dataset.
*Required*: Yes
*Type*: String
*Pattern*: `^[a-zA-Z0-9][a-zA-Z0-9\-_]*`
*Minimum*: `1`
*Maximum*: `63`
*Update requires*: [Replacement](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-replacement)

`SchemaArn`

The ARN of the associated schema.
*Required*: Yes
*Type*: String
*Pattern*: `arn:([a-z\d-]+):personalize:.*:.*:.+`
*Maximum*: `256`
*Update requires*: [Replacement](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-replacement)

## Return values

### Ref

When you pass the logical ID of this resource to the intrinsic `Ref` function, `Ref` returns the name of the resource.

For more information about using the `Ref` function, see [`Ref`](https://docs.aws.amazon.com/AWSCloudFormation/latest/TemplateReference/intrinsic-function-reference-ref.html).

### Fn::GetAtt

The `Fn::GetAtt` intrinsic function returns a value for a specified attribute of this type. The following are the available attributes and sample return values.
For more information about using the `Fn::GetAtt` intrinsic function, see [`Fn::GetAtt`](https://docs.aws.amazon.com/AWSCloudFormation/latest/TemplateReference/intrinsic-function-reference-getatt.html).

#### `DatasetArn`

The Amazon Resource Name (ARN) of the dataset.

## Examples

### Creating a dataset

The following example creates an Amazon Personalize dataset and a dataset import job. The dataset import job imports data from an Amazon S3 bucket into the dataset.

#### JSON

```
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Resources": {
    "MyDataset": {
      "Type": "AWS::Personalize::Dataset",
      "Properties": {
        "Name": "my-dataset-name",
        "DatasetType": "Interactions",
        "DatasetGroupArn": "arn:aws:personalize:us-west-2:123456789012:dataset-group/dataset-group-name",
        "SchemaArn": "arn:aws:personalize:us-west-2:123456789012:schema/schema-name",
        "DatasetImportJob": {
          "JobName": "my-import-job-name",
          "DataSource": {
            "DataLocation": "s3://bucket-name/file-name.csv"
          },
          "RoleArn": "arn:aws:iam::123456789012:role/personalize-role"
        }
      }
    }
  }
}
```

#### YAML

```
AWSTemplateFormatVersion: 2010-09-09
Resources:
  MyDataset:
    Type: 'AWS::Personalize::Dataset'
    Properties:
      Name: my-dataset-name
      DatasetType: Interactions
      DatasetGroupArn: 'arn:aws:personalize:us-west-2:123456789012:dataset-group/dataset-group-name'
      SchemaArn: 'arn:aws:personalize:us-west-2:123456789012:schema/schema-name'
      DatasetImportJob:
        JobName: my-import-job-name
        DataSource:
          DataLocation: 's3://bucket-name/file-name.csv'
        RoleArn: 'arn:aws:iam::123456789012:role/personalize-role'
```

# AWS::Personalize::Dataset DatasetImportJob

Describes a job that imports training data from a data source (Amazon S3 bucket) to an Amazon Personalize dataset.

A dataset import job can be in one of the following states:
+ CREATE PENDING > CREATE IN_PROGRESS > ACTIVE -or- CREATE FAILED

If you specify a dataset import job as part of a dataset, all dataset import job fields are required.

## Syntax

To declare this entity in your CloudFormation template, use the following syntax:

### JSON

```
{
  "DatasetArn" : String,
  "DatasetImportJobArn" : String,
  "DataSource" : DataSource,
  "JobName" : String,
  "RoleArn" : String
}
```

### YAML

```
DatasetArn: String
DatasetImportJobArn: String
DataSource:
  DataSource
JobName: String
RoleArn: String
```

## Properties

`DatasetArn`

The Amazon Resource Name (ARN) of the dataset that receives the imported data.
*Required*: No
*Type*: String
*Pattern*: `arn:([a-z\d-]+):personalize:.*:.*:.+`
*Maximum*: `256`
*Update requires*: [No interruption](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-no-interrupt)

`DatasetImportJobArn`

The ARN of the dataset import job.
*Required*: No
*Type*: String
*Pattern*: `arn:([a-z\d-]+):personalize:.*:.*:.+`
*Maximum*: `256`
*Update requires*: [No interruption](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-no-interrupt)

`DataSource`

The Amazon S3 bucket that contains the training data to import.
*Required*: No
*Type*: [DataSource](aws-properties-personalize-dataset-datasource.md)
*Update requires*: [No interruption](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-no-interrupt)

`JobName`

The name of the import job.
*Required*: No
*Type*: String
*Pattern*: `^[a-zA-Z0-9][a-zA-Z0-9\-_]*`
*Minimum*: `1`
*Maximum*: `63`
*Update requires*: [No interruption](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-no-interrupt)

`RoleArn`

The ARN of the IAM role that has permissions to read from the Amazon S3 data source.
*Required*: No
*Type*: String
*Pattern*: `arn:([a-z\d-]+):iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+`
*Maximum*: `256`
*Update requires*: [No interruption](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-no-interrupt)

# AWS::Personalize::Dataset DataSource

Describes the data source that contains the data to upload to a dataset, or the list of records to delete from Amazon Personalize.

## Syntax

To declare this entity in your CloudFormation template, use the following syntax:

### JSON

```
{
  "DataLocation" : String
}
```

### YAML

```
DataLocation: String
```

## Properties

`DataLocation`

For dataset import jobs, the path to the Amazon S3 bucket where the data that you want to upload to your dataset is stored. For data deletion jobs, the path to the Amazon S3 bucket that stores the list of records to delete.

For example:

`s3://bucket-name/folder-name/fileName.csv`

If your CSV files are in a folder in your Amazon S3 bucket and you want your import job or data deletion job to consider multiple files, you can specify the path to the folder. With a data deletion job, Amazon Personalize uses all files in the folder and any sub folder.
Use the following syntax with a `/` after the folder name:

`s3://bucket-name/folder-name/`

*Required*: No
*Type*: String
*Pattern*: `(s3|http|https)://.+`
*Maximum*: `256`
*Update requires*: [No interruption](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-no-interrupt)
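
As a hedged illustration of the folder form of `DataLocation` described above, the following sketch imports every CSV file under a folder and exposes the dataset ARN through `Fn::GetAtt`. It is not part of the original reference. The parameter names `DatasetName` and `DataFolder`, the stricter `AllowedPattern` on `DataFolder` (S3 only, trailing `/` required), and the `Outputs` section are assumptions made for this example. The ARNs, bucket, and role are the same placeholder values used in the example earlier on this page.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  # Hypothetical parameter: mirrors the documented Name constraints
  # (pattern, minimum length 1, maximum length 63).
  DatasetName:
    Type: String
    MinLength: 1
    MaxLength: 63
    AllowedPattern: '^[a-zA-Z0-9][a-zA-Z0-9\-_]*$'
    Default: my-dataset-name
  # Hypothetical parameter: narrows the documented DataLocation pattern
  # to an S3 folder path that ends with "/".
  DataFolder:
    Type: String
    MaxLength: 256
    AllowedPattern: '^s3://.+/$'
    Default: 's3://bucket-name/folder-name/'
Resources:
  MyDataset:
    Type: 'AWS::Personalize::Dataset'
    Properties:
      Name: !Ref DatasetName
      DatasetType: Interactions
      # Placeholder ARNs; the schema must match the dataset type.
      DatasetGroupArn: 'arn:aws:personalize:us-west-2:123456789012:dataset-group/dataset-group-name'
      SchemaArn: 'arn:aws:personalize:us-west-2:123456789012:schema/schema-name'
      DatasetImportJob:
        JobName: my-import-job-name
        DataSource:
          # Folder path: the import job considers the files in this folder.
          DataLocation: !Ref DataFolder
        RoleArn: 'arn:aws:iam::123456789012:role/personalize-role'
Outputs:
  DatasetArn:
    Description: ARN of the dataset, read from the DatasetArn attribute
    Value: !GetAtt MyDataset.DatasetArn
```

Validating the name and folder path as parameters surfaces a malformed value when the stack operation starts, rather than when Amazon Personalize rejects the request.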