AWS::DataBrew::Job
Specifies a new DataBrew job.
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{
  "Type" : "AWS::DataBrew::Job",
  "Properties" : {
      "DatabaseOutputs" : [ DatabaseOutput, ... ],
      "DataCatalogOutputs" : [ DataCatalogOutput, ... ],
      "DatasetName" : String,
      "EncryptionKeyArn" : String,
      "EncryptionMode" : String,
      "JobSample" : JobSample,
      "LogSubscription" : String,
      "MaxCapacity" : Integer,
      "MaxRetries" : Integer,
      "Name" : String,
      "OutputLocation" : OutputLocation,
      "Outputs" : [ Output, ... ],
      "ProfileConfiguration" : ProfileConfiguration,
      "ProjectName" : String,
      "Recipe" : Recipe,
      "RoleArn" : String,
      "Tags" : [ Tag, ... ],
      "Timeout" : Integer,
      "Type" : String,
      "ValidationConfigurations" : [ ValidationConfiguration, ... ]
    }
}
YAML
Type: AWS::DataBrew::Job
Properties:
  DatabaseOutputs:
    - DatabaseOutput
  DataCatalogOutputs:
    - DataCatalogOutput
  DatasetName: String
  EncryptionKeyArn: String
  EncryptionMode: String
  JobSample:
    JobSample
  LogSubscription: String
  MaxCapacity: Integer
  MaxRetries: Integer
  Name: String
  OutputLocation:
    OutputLocation
  Outputs:
    - Output
  ProfileConfiguration:
    ProfileConfiguration
  ProjectName: String
  Recipe:
    Recipe
  RoleArn: String
  Tags:
    - Tag
  Timeout: Integer
  Type: String
  ValidationConfigurations:
    - ValidationConfiguration
Properties
- DatabaseOutputs
  A list of JDBC database output objects that define the output destinations a DataBrew recipe job writes to.
  Required: No
  Type: Array of DatabaseOutput
  Minimum: 1
  Update requires: No interruption
- DataCatalogOutputs
  One or more artifacts that represent the AWS Glue Data Catalog output from running the job.
  Required: No
  Type: Array of DataCatalogOutput
  Minimum: 1
  Update requires: No interruption
- DatasetName
  The name of the dataset that the job processes.
  Required: No
  Type: String
  Minimum: 1
  Maximum: 255
  Update requires: No interruption
- EncryptionKeyArn
  The Amazon Resource Name (ARN) of an encryption key that is used to protect the job output. For more information, see Encrypting data written by DataBrew jobs.
  Required: No
  Type: String
  Minimum: 20
  Maximum: 2048
  Update requires: No interruption
- EncryptionMode
  The encryption mode for the job, which can be one of the following:
  - SSE-KMS - Server-side encryption with keys managed by AWS KMS.
  - SSE-S3 - Server-side encryption with keys managed by Amazon S3.
  Required: No
  Type: String
  Allowed values: SSE-KMS | SSE-S3
  Update requires: No interruption
- JobSample
  A sample configuration for profile jobs only, which determines the number of rows on which the profile job is run. If a JobSample value isn't provided, the default value is used. The default value is CUSTOM_ROWS for the mode parameter and 20,000 for the size parameter.
  Required: No
  Type: JobSample
  Update requires: No interruption
- LogSubscription
  The current status of Amazon CloudWatch logging for the job.
  Required: No
  Type: String
  Allowed values: ENABLE | DISABLE
  Update requires: No interruption
- MaxCapacity
  The maximum number of nodes that can be consumed when the job processes data.
  Required: No
  Type: Integer
  Update requires: No interruption
- MaxRetries
  The maximum number of times to retry the job after a job run fails.
  Required: No
  Type: Integer
  Minimum: 0
  Update requires: No interruption
- Name
  The unique name of the job.
  Required: Yes
  Type: String
  Minimum: 1
  Maximum: 255
  Update requires: Replacement
- OutputLocation
  The location in Amazon S3 where the job writes its output.
  Required: No
  Type: OutputLocation
  Update requires: No interruption
- Outputs
  One or more artifacts that represent output from running the job.
  Required: No
  Type: Array of Output
  Minimum: 1
  Update requires: No interruption
- ProfileConfiguration
  Configuration for profile jobs. The configuration can be used to select columns, run evaluations, and override the default parameters of evaluations. When no configuration is defined, the profile job applies default settings to all supported columns.
  Required: No
  Type: ProfileConfiguration
  Update requires: No interruption
- ProjectName
  The name of the project that the job is associated with.
  Required: No
  Type: String
  Minimum: 1
  Maximum: 255
  Update requires: No interruption
- Recipe
  A series of data transformation steps that the job runs.
  Required: No
  Type: Recipe
  Update requires: No interruption
- RoleArn
  The Amazon Resource Name (ARN) of the role to be assumed for this job.
  Required: Yes
  Type: String
  Minimum: 20
  Maximum: 2048
  Update requires: No interruption
- Tags
  Metadata tags that have been applied to the job.
  Required: No
  Type: Array of Tag
  Update requires: No interruption
- Timeout
  The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.
  Required: No
  Type: Integer
  Minimum: 0
  Update requires: No interruption
- Type
  The job type, which must be one of the following:
  - PROFILE - A job to analyze a dataset, to determine its size, data types, data distribution, and more.
  - RECIPE - A job to apply one or more transformations to a dataset.
  Required: Yes
  Type: String
  Allowed values: PROFILE | RECIPE
  Update requires: Replacement
- ValidationConfigurations
  A list of validation configurations that are applied to the profile job.
  Required: No
  Type: Array of ValidationConfiguration
  Update requires: No interruption
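As a rough illustration of how the scalar properties above fit together, the following sketch declares a profile job that sets the three required properties (Name, Type, RoleArn) alongside several optional run-time settings. The logical ID, resource names, account ID, bucket name, and all numeric values are hypothetical placeholders chosen for the example, not recommended defaults.

```yaml
Resources:
  ExampleProfileJob:                  # hypothetical logical ID
    Type: AWS::DataBrew::Job
    Properties:
      # Required properties
      Name: example-profile-job       # changing Name replaces the resource
      Type: PROFILE                   # PROFILE | RECIPE; changing Type replaces the resource
      RoleArn: arn:aws:iam::123456789012:role/ExampleDataBrewRole  # placeholder ARN
      # Optional properties (values are illustrative only)
      DatasetName: example-dataset    # placeholder dataset name
      EncryptionMode: SSE-S3          # SSE-KMS | SSE-S3
      LogSubscription: ENABLE         # ENABLE | DISABLE
      MaxCapacity: 5                  # maximum number of nodes
      MaxRetries: 1                   # retries after a failed job run
      Timeout: 120                    # minutes; longer runs end with status TIMEOUT
      OutputLocation:
        Bucket: example-output-bucket # placeholder S3 bucket
```

Because only Name and Type are listed with "Update requires: Replacement", changing any of the other values in this sketch should update the job in place.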
Return values
Ref
When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the resource name. For example:
{ "Ref": "myJob" }
For an AWS Glue DataBrew job with the logical ID myJob, Ref returns the name of the job.
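One way to use this return value is to expose the job name as a stack output. The sketch below assumes a job declared under the logical ID myJob; the job name, dataset name, role ARN, and the output name JobName are hypothetical placeholders.

```yaml
Resources:
  myJob:
    Type: AWS::DataBrew::Job
    Properties:
      Type: PROFILE
      Name: example-job               # placeholder; this is the value Ref returns
      DatasetName: example-dataset    # placeholder dataset name
      RoleArn: arn:aws:iam::123456789012:role/ExampleDataBrewRole  # placeholder ARN

Outputs:
  JobName:                            # hypothetical output name
    Description: Name of the DataBrew job
    Value: !Ref myJob                 # resolves to the job name (example-job here)
```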
Examples
Creating jobs
The following examples create new DataBrew profile jobs.
YAML
Resources:
  TestDataBrewJob:
    Type: AWS::DataBrew::Job
    Properties:
      Type: PROFILE
      Name: job-name
      DatasetName: dataset-name
      RoleArn: arn:aws:iam::123456789012:role/PassRoleAdmin
      JobSample:
        Mode: 'CUSTOM_ROWS'
        Size: 50000
      OutputLocation:
        Bucket: !Join [ '', ['databrew-cfn-integration-tests-', !Ref 'AWS::Region', '-', !Ref 'AWS::AccountId' ] ]
      Tags: [{Key: key00AtCreate, Value: value001AtCreate}]
JSON
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "This CloudFormation template specifies a DataBrew Profile Job",
  "Resources": {
    "MyDataBrewProfileJob": {
      "Type": "AWS::DataBrew::Job",
      "Properties": {
        "Type": "PROFILE",
        "Name": "job-test",
        "DatasetName": "dataset-test",
        "RoleArn": "arn:aws:iam::123456789012:role/PassRoleAdmin",
        "JobSample": {
          "Mode": "FULL_DATASET"
        },
        "OutputLocation": {
          "Bucket": "test-output",
          "Key": "job-output.json"
        },
        "Tags": [
          {
            "Key": "key00AtCreate",
            "Value": "value001AtCreate"
          }
        ]
      }
    }
  }
}
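The examples above cover only PROFILE jobs. For comparison, a RECIPE job might be declared along the following lines. This is a minimal sketch, not a verified template: the sub-properties of Recipe (Name, Version) and Output (Location with Bucket and Key, Format) are not defined on this page and are assumed here from the separate Recipe and Output property type references. All names, the account ID, the recipe version, and the bucket are hypothetical placeholders.

```yaml
Resources:
  ExampleRecipeJob:                   # hypothetical logical ID
    Type: AWS::DataBrew::Job
    Properties:
      Type: RECIPE
      Name: example-recipe-job        # placeholder job name
      DatasetName: example-dataset    # placeholder dataset name
      RoleArn: arn:aws:iam::123456789012:role/ExampleDataBrewRole  # placeholder ARN
      Recipe:
        Name: example-recipe          # assumed sub-property; placeholder recipe name
        Version: '1.0'                # assumed sub-property; placeholder version
      Outputs:
        - Location:                   # assumed sub-properties of Output
            Bucket: example-output-bucket
            Key: recipe-output/
          Format: CSV                 # assumed sub-property; output file format
      Timeout: 60                     # minutes
```

Check the Recipe and Output property type pages before relying on these sub-property names or values.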