

# Recipe step and function reference
Recipe step and function reference

In this reference, you can find descriptions of the recipe steps and functions that you can use programmatically, either from the AWS CLI or by using one of the AWS SDKs. In DataBrew, a *recipe step* is an action that transforms your raw data into a form that is ready to be consumed by your data pipeline. A DataBrew *function* is a special kind of recipe step that performs a computation based on parameters.

Categories for transformations in the UI include the following:
+ Basic column recipe steps
  + Filter
  + Column
+ Data cleaning recipe steps
  + Format
  + Clean
  + Extract
+ Data quality recipe steps
  + Missing
  + Invalid
  + Duplicates
  + Outliers
+ Personally indentifiable information (PII) recipe steps
  + Mask personal information
  + Replace personal information
  + Encrypt personal information
  + Shuffle rows
+ Column structure recipe steps
  + Split
  + Merge
  + Create
+ Column formatting recipe steps
  + Decimal precision
  + Thousands separator
  + Abbreviate numbers
+ Data structure recipe steps
  + Nest-Unnest
  + Pivot
  + Group
  + Join
  + Union
+ Data science recipe steps
  + Text
  + Scale
  + Mapping
  + Encode
+ Functions
  + Mathematical functions
  + Aggregate functions
  + Text functions
  + Date and time functions
  + Window functions
  + Web functions
  + Other functions

For more information about how these recipe steps and functions are used in a recipe (including the use of condition expressions) see [Defining a recipe structure](recipes.md#recipes.structure).

The following sections describe the recipe steps and functions, organized by what they do.

**Topics**
+ [

# Basic column recipe steps
](recipe-actions.basic.md)
+ [

# Data cleaning recipe steps
](recipe-actions.data-cleaning.md)
+ [

# Data quality recipe steps
](recipe-actions.data-quality.md)
+ [

# Personally identifiable information (PII) recipe steps
](recipe-actions.pii.md)
+ [

# Outlier detection and handling recipe steps
](recipe-actions.outliers.md)
+ [

# Column structure recipe steps
](recipe-actions.column-structure.md)
+ [

# Column formatting recipe steps
](recipe-actions.column-formatting.md)
+ [

# Data structure recipe steps
](recipe-actions.data-structure.md)
+ [

# Data science recipe steps
](recipe-actions.data-science.md)
+ [

# Mathematical functions
](recipe-actions.functions.math.md)
+ [

# Aggregate functions
](recipe-actions.functions.aggregate.md)
+ [

# Text functions
](recipe-actions.functions.text.md)
+ [

# Date and time functions
](recipe-actions.functions.date.md)
+ [

# Window functions
](recipe-actions.functions.window.md)
+ [

# Web functions
](recipe-actions.functions.web.md)
+ [

# Other functions
](recipe-actions.functions.other.md)

# Basic column recipe steps


Use these basic column recipe actions to perform simple transformations on your data.

**Topics**
+ [

# CHANGE\$1DATA\$1TYPE
](recipe-actions.CHANGE_DATA_TYPE.md)
+ [

# DELETE
](recipe-actions.DELETE.md)
+ [

# DUPLICATE
](recipe-actions.DUPLICATE.md)
+ [

# JSON\$1TO\$1STRUCTS
](recipe-actions.JSON_TO_STRUCTS.md)
+ [

# MOVE\$1AFTER
](recipe-actions.MOVE_AFTER.md)
+ [

# MOVE\$1BEFORE
](recipe-actions.MOVE_BEFORE.md)
+ [

# MOVE\$1TO\$1END
](recipe-actions.MOVE_TO_END.md)
+ [

# MOVE\$1TO\$1INDEX
](recipe-actions.MOVE_TO_INDEX.md)
+ [

# MOVE\$1TO\$1START
](recipe-actions.MOVE_TO_START.md)
+ [

# RENAME
](recipe-actions.RENAME.md)
+ [

# SORT
](recipe-actions.SORT.md)
+ [

# TO\$1BOOLEAN\$1COLUMN
](recipe-actions.TO_BOOLEAN_COLUMN.md)
+ [

# TO\$1DOUBLE\$1COLUMN
](recipe-actions.TO_DOUBLE_COLUMN.md)
+ [

# TO\$1NUMBER\$1COLUMN
](recipe-actions.TO_NUMBER_COLUMN.md)
+ [

# TO\$1STRING\$1COLUMN
](recipe-actions.TO_STRING_COLUMN.md)

# CHANGE\$1DATA\$1TYPE


Changes the data type of an existing column.

If a column value can’t be converted to the new type, it will be replaced with NULL. This can happen when a string column is converted to an integer column. For example, string "123" will become integer 123, but string "ABC" cannot become a number, so it will be replaced with a NULL value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – New type of the column. The following data types are supported:
  + **byte: **1-byte signed integer numbers. The range of numbers is from -128 to 127.
  + **short: **2-byte signed integer numbers. The range of numbers is from -32768 to 32767.
  + **int: **4-byte signed integer numbers. The range of numbers is from -2147483648 to 2147483647.
  + **long: **8-byte signed integer numbers. The range of numbers is from -9223372036854775808 to 9223372036854775807.
  + **float: **4-byte single-precision floating point numbers.
  + **double: **8-byte double-precision floating point numbers.
  + **decimal: **Signed decimal numbers with up to 38 digits total and 18 digits after the decimal point.
  + **string: **Character string values.
  + **boolean: **Boolean type has one of two possible values: `true` and `false` or `yes` and `no`.
  + **timestamp: **Values comprising fields year, month, day, hour, minute, and second.
  + **date: **Values comprising fields year, month and day.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "CHANGE_DATA_TYPE",
        "Parameters": {
            "sourceColumn": "columnName",
            "columnDataType": "boolean"
        }
    } 
 }
```

# DELETE


Removes a column from the dataset.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DELETE",
        "Parameters": {
            "sourceColumn": "extra_data"
        }
    }
}
```

# DUPLICATE


Creates a new column with the different name, but with all of the same data. Both the old and new columns are retained in the dataset.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – A name for the duplicate column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DUPLICATE",
        "Parameters": {
            "sourceColumn": "last_name",
            "targetColumn": "copy_of_last_name"
        }
    }
}
```

# JSON\$1TO\$1STRUCTS


Converts a JSON string to statically typed structs. During conversion, it detects the schema of every JSON object and merges them in order to get the most generic schema to represent the entire JSON string. The “unnestLevel” parameter specifies how many levels of JSON objects to convert to structs.

**Parameters**
+ `sourceColumns` – A list of source columns.
+ `regexColumnSelector –` A regular expression to select the columns.
+ `removeSourceColumn` – A Boolean value. If `true` then remove the source column; otherwise, keep it.
+ `unnestLevel` – The number of levels to unnest.
+ `conditionExpressions` – Condition expressions.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "JSON_TO_STRUCTS",
        "Parameters": {
            "sourceColumns": "[\"address\"]",
            "removeSourceColumn": "true",
            "unnestLevel": "2"
       }
    }
}
```

# MOVE\$1AFTER


Moves a column to the position immediately after another column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of another column. The column specified by `sourceColumn` will be moved immediately after the column specified by `targetColumn`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MOVE_AFTER",
        "Parameters": {
            "sourceColumn": "rating",
            "targetColumn": "height_cm"
        }
    }
}
```

# MOVE\$1BEFORE


Moves a column to the position immediately before another column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of another column. The column specified by `sourceColumn` will be moved immediately after the column specified by `targetColumn`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MOVE_BEFORE",
        "Parameters": {
            "sourceColumn": "height_cm",
            "targetColumn": "weight_kg"
        }
    }
}
```

# MOVE\$1TO\$1END


Moves a column to the end position (last column) in the dataset.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MOVE_TO_END",
        "Parameters": {
            "sourceColumn": "height_cm"
        }
    }
}
```

# MOVE\$1TO\$1INDEX


Moves a column to a position specified by a number.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetIndex` – The new position for the column. Positions start with 0—so, for example, `1` refers to the second column, `2` refers to the third column, and so on.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MOVE_TO_INDEX",
        "Parameters": {
            "sourceColumn": "nationality",
            "targetIndex": "5"
        }
    }
}
```

# MOVE\$1TO\$1START


Moves a column to the beginning position (first column) in the dataset.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MOVE_TO_START",
        "Parameters": {
            "sourceColumn": "first_name"
        }
    }
}
```

# RENAME


Creates a new column with the different name, but with all of the same data. The old column is then removed from the dataset.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – A new name for the column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "RENAME",
        "Parameters": {
            "sourceColumn": "date_of_birth",
            "targetColumn": "birth_date"
        }
    }
}
```

# SORT


Sorts the data in one or more columns of a dataset in ascending, descending, or custom order.

**Parameters**
+ `expressions` – A string that contains one or more JSON-encoded strings representing sorting expressions.
  + `sourceColumn` – A string that contains the name of an existing column.
  + `ordering` – Ordering can be either ASCENDING or DESCENDING.
  + `nullsOrdering` – Nulls ordering can be either NULLS\$1TOP or NULLS\$1BOTTOM to place null or missing values at the beginning or at the bottom of the column. 
  + `customOrder` – A list of strings that defines a custom order for the string sorting. By default, strings are sorted alphabetically.
  + `isCustomOrderCaseSensitive` – Boolean. The default value is `false`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SORT",
        "Parameters": {
            "expressions": "[{\"sourceColumn\": \"A\", \"ordering\": \"ASCENDING\", \"nullsOrdering\": \"NULLS_TOP\"}]",
       }
    } 
 }
```


**Example of custom sort order**  
In the following example, the customOrder expression string has the format of a list of objects. Each object describes a sorting expression for one column.  
  

```
[
  {
    "sourceColumn": "A",
    "ordering": "ASCENDING",
    "nullsOrdering": "NULLS_TOP",
  },
  {
    "sourceColumn": "B",
    "ordering": "DESCENDING",
    "nullsOrdering": "NULLS_BOTTOM",
    "customOrder": ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"],
    "isCustomOrderCaseSensitive": false,
  }
]
```

# TO\$1BOOLEAN\$1COLUMN


Changes the data type of an existing column to BOOLEAN.

**Note**  
We recommend using CHANGE\$1DATA\$1TYPE recipe action rather than TO\$1BOOLEAN\$1COLUMN.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – A value that must be `boolean`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "TO_BOOLEAN_COLUMN",
        "Parameters": {
            "columnDataType": "boolean",
            "sourceColumn": "is_present"
        }
    }
}
```

# TO\$1DOUBLE\$1COLUMN


Changes the data type of an existing column to DOUBLE.

**Note**  
We recommend using CHANGE\$1DATA\$1TYPE recipe action rather than TO\$1DOUBLE\$1COLUMN.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – A value that must be `number`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "TO_DOUBLE_COLUMN",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "hourly_rate"
        }
    }
}
```

# TO\$1NUMBER\$1COLUMN


Changes the data type of an existing column to NUMBER.

**Note**  
We recommend using CHANGE\$1DATA\$1TYPE recipe action rather than TO\$1NUMBER\$1COLUMN.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – A value that must be `number`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "TO_NUMBER_COLUMN",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "hours_worked"
        }
    }
}
```

# TO\$1STRING\$1COLUMN


Changes the data type of an existing column to STRING.

**Note**  
We recommend using CHANGE\$1DATA\$1TYPE recipe action rather than TO\$1STRING\$1COLUMN.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – A value that must be `string`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "TO_STRING_COLUMN",
        "Parameters": {
            "columnDataType": "string",
            "sourceColumn": "age"
        }
    }
}
```

# Data cleaning recipe steps


Use these data cleaning recipe steps to perform simple transformations on existing data.

**Topics**
+ [

# CAPITAL\$1CASE
](recipe-actions.CAPITAL_CASE.md)
+ [

# FORMAT\$1DATE
](recipe-actions.FORMAT_DATE.md)
+ [

# LOWER\$1CASE
](recipe-actions.LOWER_CASE.md)
+ [

# UPPER\$1CASE
](recipe-actions.UPPER_CASE.md)
+ [

# SENTENCE\$1CASE
](recipe-actions.SENTENCE_CASE.md)
+ [

# ADD\$1DOUBLE\$1QUOTES
](recipe-actions.ADD_DOUBLE_QUOTES.md)
+ [

# ADD\$1PREFIX
](recipe-actions.ADD_PREFIX.md)
+ [

# ADD\$1SINGLE\$1QUOTES
](recipe-actions.ADD_SINGLE_QUOTES.md)
+ [

# ADD\$1SUFFIX
](recipe-actions.ADD_SUFFIX.md)
+ [

# EXTRACT\$1BETWEEN\$1DELIMITERS
](recipe-actions.EXTRACT_BETWEEN_DELIMITERS.md)
+ [

# EXTRACT\$1BETWEEN\$1POSITIONS
](recipe-actions.EXTRACT_BETWEEN_POSITIONS.md)
+ [

# EXTRACT\$1PATTERN
](recipe-actions.EXTRACT_PATTERN.md)
+ [

# EXTRACT\$1VALUE
](recipe-actions.EXTRACT_VALUE.md)
+ [

# REMOVE\$1COMBINED
](recipe-actions.REMOVE_COMBINED.md)
+ [

# REPLACE\$1BETWEEN\$1DELIMITERS
](recipe-actions.REPLACE_BETWEEN_DELIMITERS.md)
+ [

# REPLACE\$1BETWEEN\$1POSITIONS
](recipe-actions.REPLACE_BETWEEN_POSITIONS.md)
+ [

# REPLACE\$1TEXT
](recipe-actions.REPLACE_TEXT.md)

# CAPITAL\$1CASE


Changes each string in a column to capitalize each word. In *capital case, *the first letter of each word is capitalized and the rest of the word is transformed to lowercase. An example is: The Quick Brown Fox Jumped Over The Fence.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "CAPITAL_CASE",
        "Parameters": {
            "sourceColumn": "last_name"
        }
    }
}
```

# FORMAT\$1DATE


Returns a column in which a date string is converted into a formatted value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetDateFormat` – One of the following date formats:
  + `mm/dd/yyyy`
  + `mm-dd-yyyy`
  + `dd month yyyy`
  + `month yyyy`
  + `dd month`

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FORMAT_DATE",
        "Parameters": {
            "sourceColumn": "birth_date",
            "targetDateFormat": "mm-dd-yyyy"
        }
    }
}
```

# LOWER\$1CASE


Changes each string in a column to lowercase, for example: the quick brown fox jumped over the fence

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "LOWER_CASE",
        "Parameters": {
            "sourceColumn": "nationality"
        }
    }
}
```

# UPPER\$1CASE


Changes each string in a column to uppercase, for example: THE QUICK BROWN FOX JUMPED OVER THE FENCE

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "UPPER_CASE",
        "Parameters": {
            "sourceColumn": "nationality"
        }
    }
}
```

# SENTENCE\$1CASE


Changes each string in a column to sentence case. In *sentence case, *the first letter of each sentence is capitalized, and the rest of the sentence is transformed to lowercase. An example is: The quick brown fox. Jumped over. The fence

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SENTENCE_CASE",
        "Parameters": {
            "sourceColumn": "description"
        }
    }
}
```

# ADD\$1DOUBLE\$1QUOTES


Encloses the characters in a column with double quotation marks.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ADD_DOUBLE_QUOTES",
        "Parameters": {
            "sourceColumn": "info_url"
        }
    }
}
```

# ADD\$1PREFIX


Adds one or more characters, concatenating them as a prefix to the beginning of a column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `pattern` – The character or characters to place at the beginning of the column values.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ADD_PREFIX",
        "Parameters": {
            "pattern": "aaa",
            "sourceColumn": "info_url"
        }
    }
}
```

# ADD\$1SINGLE\$1QUOTES


Encloses the characters in a column with single quotation marks.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ADD_SINGLE_QUOTES",
        "Parameters": {
            "sourceColumn": "info_url"
        }
    }
}
```

# ADD\$1SUFFIX


Adds one more characters concatenating them as a suffix to the end of a column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `pattern` – The character or characters to place at the end of the column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ADD_SUFFIX",
        "Parameters": {
            "pattern": "bbb",
            "sourceColumn": "info_url"
        }
    }
}
```

# EXTRACT\$1BETWEEN\$1DELIMITERS


Creates a new column, based on delimiters, from the values in an existing column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.
+ `startPattern` – A regular expression, indicating the character or characters that begin the delimited values.
+ `endPattern` – A regular expression, indicating the delimiter character or characters that end the delimited values.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "EXTRACT_BETWEEN_DELIMITERS",
        "Parameters": {
            "endPattern": "\\/",
            "sourceColumn": "info_url",
            "startPattern": "\\/\\/",
            "targetColumn": "raw_url"
        }
    }
}
```

# EXTRACT\$1BETWEEN\$1POSITIONS


Creates a new column, based on character positions, from the values in an existing column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.
+ `startPosition` – The character position at which to perform the extract.
+ `endPosition` – The character position at which to end the extract.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "EXTRACT_BETWEEN_POSITIONS",
        "Parameters": {
            "endPosition": "9",
            "sourceColumn": "last_name",
            "startPosition": "3",
            "targetColumn": "characters_3_to_9"
        }
    }
}
```

# EXTRACT\$1PATTERN


Creates a new column, based on a regular expression, from the values in an existing column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.
+ `pattern` – A regular expression that indicates which character or characters to extract and create the new column from.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "EXTRACT_PATTERN",
        "Parameters": {
            "pattern": "^....*...$",
            "sourceColumn": "last_name",
            "targetColumn": "first_and_last_few_characters"
        }
    }
}
```

# EXTRACT\$1VALUE


Creates a new column with an extracted value from a user-specified path. If the source column is of the Map, Array, or Struct type, each field in the path should be escaped using back ticks (for example, `name`).

**Parameters**
+ `targetColumn` – The name of the target column.
+ `sourceColumn` – Name of the source column from which the value is to be extracted.
+ `path` – The path to the specific key that the user wants to extract. If the source column is of the Map, Array, or Struct type, each field in the path should be escaped using back ticks (for example, `name`).

  Consider the following example of user information:

  ```
                     user {
                        name: “Ammy” 
                        address: {
                           state: "CA",
                           zipcode: 12345
                        },
                        phoneNumber:{"home": "123123123", "work": "456456456"}
                        citizenship: ["Canada", "USA", "Mexico", "India"]
                     }
  ```

  The following are examples of the paths you would provide, depending on the type of the source column:
  + If the source column is of the type **map**, the path for extracting the home phone number is:

    ``user`.`phoneNumber`.`home``
  + If the source column is of the type **array**, the path for extracting the second "citizenship" value is:

    ``user`.`citizenship`[1]`
  + If the source column is of the type **struct**, the path for extracting the zip code is:

    ``user`.`address`.`zipcode``



**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "EXTRACT_VALUE",
        "Parameters": {
            "sourceColumn": "age",
            "targetColumn": "columnName",
            "path": "`age`.`name`",
        }
    }
}
```

# REMOVE\$1COMBINED


Removes one or more characters from a column, according to what a user specifies.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `collapseConsecutiveWhitespace` – If `true`, replaces two or more white-space characters with exactly one white-space character. 
+ `removeAllPunctuation` – If `true`, removes all of the following characters: `. ! , ?`
+ `removeAllQuotes` – If `true`, removes all single quotation marks and double quotation marks.
+ `removeAllWhitespace` – If `true`, removes all white-space characters.
+ `customCharacters` – One or more characters that can be acted upon.
+ `customValue` – A value that can be acted upon.
+ `removeCustomCharacters` – If `true`, removes all characters specified by `customCharacters` parameter.
+ `removeCustomValue` – If `true`, removes all characters specified by `customValue` parameter.
+ `punctuationally` – If `true`, removes the following characters if they occur at the start or end of the value:`. ! , ?`
+ `antidisestablishmentarianism` – If `true`, removes single quotation marks and double quotation marks from the beginning and end of the value.
+ `removeLeadingAndTrailingWhitespace` – If `true`, removes all white spaces from the beginning and end of the value.
+ `removeLetters` – If `true`, removes all uppercase and lowercase alphabetic characters (`A` through `Z`; `a` through `z`).
+ `removeNumbers` – If `true`, removes all numeric characters (`0` through `9`).
+ `removeSpecialCharacters` – If `true`, removes all of the following characters: `! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~`

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_COMBINED",
        "Parameters": {
            "collapseConsecutiveWhitespace": "false",
            "removeAllPunctuation": "false",
            "removeAllQuotes": "false",
            "removeAllWhitespace": "false",
            "removeCustomCharacters": "false",
            "removeCustomValue": "false",
            "removeLeadingAndTrailingPunctuation": "false",
            "removeLeadingAndTrailingQuotes": "false",
            "removeLeadingAndTrailingWhitespace": "false",
            "removeLetters": "false",
            "removeNumbers": "false",
            "removeSpecialCharacters": "true",
            "sourceColumn": "info_url"
        }
    }
}
```

```
{
    "RecipeAction": {
        "Operation": "REMOVE_COMBINED",
        "Parameters": {
            "collapseConsecutiveWhitespace": "false",
            "customCharacters": "¶",
            "removeAllPunctuation": "false",
            "removeAllQuotes": "false",
            "removeAllWhitespace": "false",
            "removeCustomCharacters": "true",
            "removeCustomValue": "false",
            "removeLeadingAndTrailingPunctuation": "false",
            "removeLeadingAndTrailingQuotes": "false",
            "removeLeadingAndTrailingWhitespace": "false",
            "removeLetters": "false",
            "removeNumbers": "false",
            "removeSpecialCharacters": "false",
            "sourceColumn": "info_url"
        }
    }
}
```

```
{
    "RecipeAction": {
        "Operation": "REMOVE_COMBINED",
        "Parameters": {
            "collapseConsecutiveWhitespace": "true",
            "customValue": "M",
            "removeAllPunctuation": "true",
            "removeAllQuotes": "false",
            "removeAllWhitespace": "false",
            "removeCustomCharacters": "false",
            "removeCustomValue": "true",
            "removeLeadingAndTrailingPunctuation": "false",
            "removeLeadingAndTrailingQuotes": "true",
            "removeLeadingAndTrailingWhitespace": "true",
            "removeLetters": "true",
            "removeNumbers": "true",
            "removeSpecialCharacters": "false",
            "sourceColumn": "info_url"
        }
    }
}
```

```
{
    "RecipeAction": {
        "Operation": "REMOVE_COMBINED",
        "Parameters": {
            "collapseConsecutiveWhitespace": "false",
            "removeAllPunctuation": "false",
            "removeAllQuotes": "false",
            "removeAllWhitespace": "false",
            "removeCustomCharacters": "false",
            "removeCustomValue": "false",
            "removeLeadingAndTrailingPunctuation": "false",
            "removeLeadingAndTrailingQuotes": "false",
            "removeLeadingAndTrailingWhitespace": "false",
            "removeLetters": "false",
            "removeNumbers": "true",
            "removeSpecialCharacters": "false",
            "sourceColumn": "first_name"
        }
    }
}
```

```
{
    "RecipeAction": {
        "Operation": "REMOVE_COMBINED",
        "Parameters": {
            "collapseConsecutiveWhitespace": "false",
            "removeAllPunctuation": "false",
            "removeAllQuotes": "false",
            "removeAllWhitespace": "false",
            "removeCustomCharacters": "false",
            "removeCustomValue": "false",
            "removeLeadingAndTrailingPunctuation": "false",
            "removeLeadingAndTrailingQuotes": "false",
            "removeLeadingAndTrailingWhitespace": "false",
            "removeLetters": "false",
            "removeNumbers": "true",
            "removeSpecialCharacters": "false",
            "sourceColumn": "first_name"
        }
    }
}
```

# REPLACE\$1BETWEEN\$1DELIMITERS


Replaces the characters between two delimiters with user-specified text.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `startPattern` – Character or characters or a regular expression, indicating where the substitution is to begin.
+ `endPattern` – Character or characters or a regular expression, indicating where the substitution is to end.
+ `value` – The replacement character or characters to be substituted.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_BETWEEN_DELIMITERS",
        "Parameters": {
            "endPattern": ">",
            "sourceColumn": "last_name",
            "startPattern": "&lt;",
            "value": "?"
        }
    }
}
```

# REPLACE\$1BETWEEN\$1POSITIONS


Replaces the characters between two positions with user-specified text.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `startPosition` – A number indicting at what character position in the string the substitution is to begin.
+ `endPosition` – A number indicting at what character position in the string the substitution is to end.
+ `value` – The replacement character or characters to be substituted.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_BETWEEN_POSITIONS",
        "Parameters": {
            "endPosition": "20",
            "sourceColumn": "nationality",
            "startPosition": "10",
            "value": "E"
        }
    }
}
```

# REPLACE\$1TEXT


Replaces a specified sequence of characters with another.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `pattern` – Character or characters or a regular expression, indicating which characters should be replaced in the source column.
+ `value` – The replacement character or characters to be substituted.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_TEXT",
        "Parameters": {
            "pattern": "x",
            "sourceColumn": "first_name",
            "value": "a"
        }
    }
}
```

```
{
    "RecipeAction": {
        "Operation": "REPLACE_TEXT",
        "Parameters": {
            "pattern": "[0-9]",
            "sourceColumn": "nationality",
            "value": "!"
        }
    }
}
```

# Data quality recipe steps


Use these data quality recipe steps to populate missing values, remove invalid data, or remove duplicates.

**Topics**
+ [

# ADVANCED\$1DATATYPE\$1FILTER
](recipe-actions.ADVANCED_DATATYPE_FILTER.md)
+ [

# ADVANCED\$1DATATYPE\$1FLAG
](recipe-actions.ADVANCED_DATATYPE_FLAG.md)
+ [

# DELETE\$1DUPLICATE\$1ROWS
](recipe-actions.DELETE_DUPLICATE_ROWS.md)
+ [

# EXTRACT\$1ADVANCED\$1DATATYPE\$1DETAILS
](recipe-actions.EXTRACT_ADVANCED_DATATYPE_DETAILS.md)
+ [

# FILL\$1WITH\$1AVERAGE
](recipe-actions.FILL_WITH_AVERAGE.md)
+ [

# FILL\$1WITH\$1CUSTOM
](recipe-actions.FILL_WITH_CUSTOM.md)
+ [

# FILL\$1WITH\$1EMPTY
](recipe-actions.FILL_WITH_EMPTY.md)
+ [

# FILL\$1WITH\$1LAST\$1VALID
](recipe-actions.FILL_WITH_LAST_VALID.md)
+ [

# FILL\$1WITH\$1MEDIAN
](recipe-actions.FILL_WITH_MEDIAN.md)
+ [

# FILL\$1WITH\$1MODE
](recipe-actions.FILL_WITH_MODE.md)
+ [

# FILL\$1WITH\$1MOST\$1FREQUENT
](recipe-actions.FILL_WITH_MOST_FREQUENT.md)
+ [

# FILL\$1WITH\$1NULL
](recipe-actions.FILL_WITH_NULL.md)
+ [

# FILL\$1WITH\$1SUM
](recipe-actions.FILL_WITH_SUM.md)
+ [

# FLAG\$1DUPLICATE\$1ROWS
](recipe-actions.FLAG_DUPLICATE_ROWS.md)
+ [

# FLAG\$1DUPLICATES\$1IN\$1COLUMN
](recipe-actions.FLAG_DUPLICATES_IN_COLUMN.md)
+ [

# GET\$1ADVANCED\$1DATATYPE
](recipe-actions.GET_ADVANCED_DATATYPE.md)
+ [

# REMOVE\$1DUPLICATES
](recipe-actions.REMOVE_DUPLICATES.md)
+ [

# REMOVE\$1INVALID
](recipe-actions.REMOVE_INVALID.md)
+ [

# REMOVE\$1MISSING
](recipe-actions.REMOVE_MISSING.md)
+ [

# REPLACE\$1WITH\$1AVERAGE
](recipe-actions.REPLACE_WITH_AVERAGE.md)
+ [

# REPLACE\$1WITH\$1CUSTOM
](recipe-actions.REPLACE_WITH_CUSTOM.md)
+ [

# REPLACE\$1WITH\$1EMPTY
](recipe-actions.REPLACE_WITH_EMPTY.md)
+ [

# REPLACE\$1WITH\$1LAST\$1VALID
](recipe-actions.REPLACE_WITH_LAST_VALID.md)
+ [

# REPLACE\$1WITH\$1MEDIAN
](recipe-actions.REPLACE_WITH_MEDIAN.md)
+ [

# REPLACE\$1WITH\$1MODE
](recipe-actions.REPLACE_WITH_MODE.md)
+ [

# REPLACE\$1WITH\$1MOST\$1FREQUENT
](recipe-actions.REPLACE_WITH_MOST_FREQUENT.md)
+ [

# REPLACE\$1WITH\$1NULL
](recipe-actions.REPLACE_WITH_NULL.md)
+ [

# REPLACE\$1WITH\$1ROLLING\$1AVERAGE
](recipe-actions.REPLACE_WITH_ROLLING_AVERAGE.md)
+ [

# REPLACE\$1WITH\$1ROLLING\$1SUM
](recipe-actions.REPLACE_WITH_ROLLING_SUM.md)
+ [

# REPLACE\$1WITH\$1SUM
](recipe-actions.REPLACE_WITH_SUM.md)

# ADVANCED\$1DATATYPE\$1FILTER


Filters the current source column based on advanced data type detection. For example, given a column that DataBrew has identified as containing zip codes, this transform can filter the column based on timezone. The details that you can extract depend on the pattern that is detected, as described in **Notes** below.

**Parameters**
+ `sourceColumn` – The name of a string source column.
+ `pattern` – The pattern to extract.
+ `advancedDataType` – Can be one of Phone, Zip Code, Date Time, State, Credit Card, URL, Email, SSN, or Gender.
+ `filter values` – List of string values that the user wants to filter the column based on.
+ `strategy` – KEEP\$1ROWS or DISCARD\$1ROWS or CLEAR\$1FILTERS or CLEAR\$1OTHERS.
+ `clearWithEmpty` – Boolean `true` or `false`, to clear rows with `empty` instead of `null`.

**Notes**
+ If advancedDataType is **Phone**, then the pattern can be AREA\$1CODE, TIME\$1ZONE, or COUNTRY\$1CODE.
+ If advancedDataType is **Zip Code**, then the pattern can be TIME\$1ZONE, COUNTRY, STATE, CITY, TYPE, or REGION.
+ If advancedDataType is **Date Time**, then the pattern can be DAY, MONTH, MONTH\$1NAME, WEEK, QUARTER, or YEAR.
+ If advancedDataType is **State**, then the pattern can be TIME\$1ZONE.
+ If advancedDataType is **Credit Card**, then the pattern can be LENGTH or NETWORK.
+ If advancedDataType is **URL**, then the pattern can be PROTOCOL, TLD, or DOMAIN.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ADVANCED_DATATYPE_FILTER",
        "Parameters": {
            "pattern": "AREA_CODE",
            "sourceColumn": "phoneColumn",
            "advancedDataType": "Phone",
            "filterValues": ['Ohio'],
            "strategy": "KEEP_ROWS"
        }
    }
}
```

# ADVANCED\$1DATATYPE\$1FLAG


Creates a new flag column based on the values for the current source column. For example, given a source column containing zip codes, this transform can be used to flag values as `true` or `false` based on a particular timezone. The details that you can extract depend on the pattern that is detected, as described in **Notes** below.

**Parameters**
+ `sourceColumn` – The name of a string source column.
+ `pattern` – The pattern to extract.
+ `targetColumn` – The name of the target column.
+ `advancedDataType` – Can be one of Phone, Zip Code, Date Time, State, Credit Card, URL, Email, SSN, or Gender.
+ `filter values` – List of string values that the user wants to filter the column based on.
+ `trueString` – The `true` value for the target column.
+ `falseString` – The `false` value for the target column.

**Notes**
+ If advancedDataType is **Phone**, then the pattern can be AREA\$1CODE, TIME\$1ZONE, or COUNTRY\$1CODE.
+ If advancedDataType is **Zip Code**, then the pattern can be TIME\$1ZONE, COUNTRY, STATE, CITY, TYPE, or REGION.
+ If advancedDataType is **Date Time**, then the pattern can be DAY, MONTH, MONTH\$1NAME, WEEK, QUARTER, or YEAR.
+ If advancedDataType is **State**, then the pattern can be TIME\$1ZONE.
+ If advancedDataType is **Credit Card**, then the pattern can be LENGTH or NETWORK.
+ If advancedDataType is **URL**, then the pattern can be PROTOCOL, TLD, or DOMAIN.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ADVANCED_DATATYPE_FLAG",
        "Parameters": {
            "pattern": "AREA_CODE",
            "sourceColumn": "phoneColumn",
            "advancedDataType": "Phone",
            "filterValues": ['Ohio'],
            "targetColumn": "targetColumnName",
            "trueString": "trueValue",
            "falseString": "falseValue"
        }
    }
}
```

# DELETE\$1DUPLICATE\$1ROWS


Deletes any row that is an exact match to an earlier row in the dataset. The initial occurrence is not deleted, because it doesn't match an earlier row.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DELETE_DUPLICATE_ROWS"
    }
}
```

# EXTRACT\$1ADVANCED\$1DATATYPE\$1DETAILS


Extracts details for the advanced data type. The details that you can extract depend on the pattern that is detected, as described in **Notes** below.

**Parameters**
+ `sourceColumn` – The name of a string source column.
+ `pattern` – The pattern to extract.
+ `targetColumn` – The name of the target column.
+ `advancedDataType` – Can be one of Phone, Zip Code, Date Time, State, Credit Card, URL, Email, SSN, or Gender.

**Notes**
+ If advancedDataType is **Phone**, then the pattern can be AREA\$1CODE, TIME\$1ZONE, or COUNTRY\$1CODE.
+ If advancedDataType is **Zip Code**, then the pattern can be TIME\$1ZONE, COUNTRY, STATE, CITY, TYPE, or REGION.
+ If advancedDataType is **Date Time**, then the pattern can be DAY, MONTH, MONTH\$1NAME, WEEK, QUARTER, or YEAR.
+ If advancedDataType is **State**, then the pattern can be TIME\$1ZONE.
+ If advancedDataType is **Credit Card**, then the pattern can be LENGTH or NETWORK.
+ If advancedDataType is **URL**, then the pattern can be PROTOCOL, TLD, or DOMAIN.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "EXTRACT_ADVANCED_DATATYPE_DETAILS",
        "Parameters": {
            "pattern": "TIMEZONE"
            "sourceColumn": "zipCode",
            "targetColumn": "timeZoneFromZipCode",
            "advancedDataType": "ZipCode"
        }
    }
}
```

# FILL\$1WITH\$1AVERAGE


Returns a column with missing data replaced by the average of all values.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_AVERAGE",
        "Parameters": {
            "sourceColumn": "age"
        }
    }
}
```

# FILL\$1WITH\$1CUSTOM


Returns a column with missing data replaced by a specific value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type for the column. This type must be `date`, `number`, `boolean`, `unsupported`, `string`, or `timestamp`.
+ `value` – The custom value to fill in. The data type must match the value that you choose for `columnDataType`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_CUSTOM",
        "Parameters": {
            "columnDataType": "string",
            "sourceColumn": "last_name",
            "value": "No last name provided"
        }
    }
}
```

# FILL\$1WITH\$1EMPTY


Returns a column with missing data replaced by an empty string.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_EMPTY",
        "Parameters": {
            "sourceColumn": "wind_direction"
        }
    }
}
```

# FILL\$1WITH\$1LAST\$1VALID


Returns a column with missing data replaced by the most recent valid value for that column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type for the column. This type must be `date`, `number`, `boolean`, `unsupported`, `string`, or `timestamp`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_LAST_VALID",
        "Parameters": {
            "columnDataType": "string",
            "sourceColumn": "birth_date"
        }
    }
}
```

# FILL\$1WITH\$1MEDIAN


Returns a column with missing data replaced by the median of all values.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_MEDIAN",
        "Parameters": {
            "sourceColumn": "age"
        }
    }
}
```

# FILL\$1WITH\$1MODE


Returns a column with missing data replaced by the mode of all values.

You can also specify tie-breaker logic, where some of the values are identical. For example, consider the following values:

`1 2 2 3 3 4`

A `modeType` of `MINIMUM` causes `FILL_WITH_MODE` to return 2 as the mode value. If `modeType` is `MAXIMUM`, the mode is 3. For `AVERAGE`, the mode is 2.5.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `modeType` – How to resolve tie values in the data. This value must be `MINIMUM`, `NONE`, `AVERAGE`, or `MAXIMUM`. 

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_MODE",
        "Parameters": {
            "modeType": "MAXIMUM",
            "sourceColumn": "age"
        }
    }
}
```

# FILL\$1WITH\$1MOST\$1FREQUENT


Returns a column with missing data replaced by the most frequent value.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_MOST_FREQUENT",
        "Parameters": {
            "sourceColumn": "position"
        }
    }
}
```

# FILL\$1WITH\$1NULL


Returns a column with data values replaced by null.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_NULL",
        "Parameters": {
            "sourceColumn": "rating"
        }
    }
}
```

# FILL\$1WITH\$1SUM


Returns a column with missing data replaced by the sum of all values.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FILL_WITH_SUM",
        "Parameters": {
            "sourceColumn": "age"
        }
    }
}
```

# FLAG\$1DUPLICATE\$1ROWS


Returns a new column with a specified value in each row that indicates whether that row is an exact match of an earlier row in the dataset. When matches are found, they are flagged as duplicates. The initial occurrence is not flagged, because it doesn't match an earlier row.

**Parameters**
+ `trueString` – Value to be inserted if the row matches an earlier row.
+ `falseString` – Value to be inserted if the row is unique.
+ `targetColumn` – Name of the new column that is inserted in the dataset.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FLAG_DUPLICATE_ROWS",
        "Parameters": {
            "trueString": "TRUE",
            "falseString": "FALSE",
            "targetColumn": "Flag"           
        }
    }
}
```

# FLAG\$1DUPLICATES\$1IN\$1COLUMN


Returns a new column with a specified value in each row that indicates whether the value in the row's source column matches a value in an earlier row of the source column. When matches are found, they are flagged as duplicates. The initial occurrence is not flagged, because it doesn't match an earlier row. 

**Parameters**
+ `sourceColumn` – Name of the source column.
+ `targetColumn` – Name of the target column.
+ `trueString` – String to be inserted in the target column when a source column value duplicates an earlier value in that column.
+ `falseString` – String to be inserted in the target column when a source column value is distinct from earlier values in that column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FLAG_DUPLICATES_IN_COLUMN",
        "Parameters": {
            "sourceColumn": "Name",
            "targetColumn": "Duplicate",
            "trueString": "TRUE",
            "falseString": "FALSE"          
        }
    }
}
```

# GET\$1ADVANCED\$1DATATYPE


Given a string column, identifies the advanced data type of the column, if any.

**Parameters**
+ `columnName` – The name of the string column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "GET_ADVANCED_DATATYPE",
        "Parameters": {
            "sourceColumn": "columnName"
        }
    }
}
```

# REMOVE\$1DUPLICATES


Deletes an entire row, if a duplicate value is encountered in a selected source column.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_DUPLICATES",
        "Parameters": {
            "sourceColumn": "nationality"
        }
    }
}
```

# REMOVE\$1INVALID


Deletes an entire row if an invalid value is encountered in a column of that row.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column.
+ `advancedDataType` – Special data types that are detected by DataBrew in a column that has the data type `string`. The types that DataBrew can detect within a `string` column include SSN, Email, Phone Number, Gender, Credit Card, URL, IP Address, DateTime, Currency, ZipCode, Country, Region, State, and City.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_INVALID",
        "Parameters": {
            "columnDataType": "string",
            "sourceColumn": "help_url"
        }
    }
}
```

# REMOVE\$1MISSING


Returns only the rows in which a specified column isn't missing data.

**Parameters**
+ `sourceColumn` – The name of an existing column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_MISSING",
        "Parameters": {
            "sourceColumn": "last_name"    
        }
    }
}
```

# REPLACE\$1WITH\$1AVERAGE


Replaces each invalid value in a column with the average of all other values.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column. This type must be `number`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_AVERAGE",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "age"
        }
    }
}
```

# REPLACE\$1WITH\$1CUSTOM


Replace detected entities with a custom value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `sourceColumns` – A list of existing column names.
+ `columnDataType` – The data type of the column.
+ `value` – The custom value to be used to replace invalid values.
+ `advancedDataType` – Special data types that are detected by DataBrew in a column that has the data type `string`. The types that DataBrew can detect within a `string` column include SSN, Email, Phone Number, Gender, Credit Card, URL, IP Address, DateTime, Currency, ZipCode, Country, Region, State, and City.

**Note**  
Use either `sourceColumn` or `sourceColumns`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_CUSTOM",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "",
            "sourceColumns": ["column1", "column2"],
            "value": 0
        }
    }
}
```

# REPLACE\$1WITH\$1EMPTY


Replaces each invalid value in a column with an empty value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column.
+ `advancedDataType` – Special data types that are detected by DataBrew in a column that has the data type `string`. The types that DataBrew can detect within a `string` column include SSN, Email, Phone Number, Gender, Credit Card, URL, IP Address, DateTime, Currency, ZipCode, Country, Region, State, and City.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_EMPTY",
        "Parameters": {
            "columnDataType": "string",
            "sourceColumn": "nationality"
        }
    }
}
```

# REPLACE\$1WITH\$1LAST\$1VALID


Replaces each invalid value in a column with the last valid value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column.
+ `advancedDataType` – Special data types that are detected by DataBrew in a column that has the data type `string`. The types that DataBrew can detect within a `string` column include SSN, Email, Phone Number, Gender, Credit Card, URL, IP Address, DateTime, Currency, ZipCode, Country, Region, State, and City.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_LAST_VALID",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "rating"
        }
    }
}
```

# REPLACE\$1WITH\$1MEDIAN


Replaces each invalid value in a column with the median of all other values.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column. This type must be `number`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_MEDIAN",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "games_won"
        }
    }
}
```

# REPLACE\$1WITH\$1MODE


Replaces each invalid value in a column with the mode of all other values.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column. This type must be `number`.
+ `modeType` – How to resolve tie values in the data. This value must be `MINIMUM`, `NONE`, `AVERAGE`, or `MAXIMUM`. 

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_MODE",
        "Parameters": {
            "columnDataType": "number",
            "modeType": "MAXIMUM",
            "sourceColumn": "height_cm"
        }
    }
}
```

# REPLACE\$1WITH\$1MOST\$1FREQUENT


Replaces each invalid value in a column with the most frequent column value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column.
+ `advancedDataType` – Special data types that are detected by DataBrew in a column that has the data type `string`. The types that DataBrew can detect within a `string` column include SSN, Email, Phone Number, Gender, Credit Card, URL, IP Address, DateTime, Currency, ZipCode, Country, Region, State, and City.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_MOST_FREQUENT",
        "Parameters": {
            "columnDataType": "string",
            "sourceColumn": "wind_direction"
        }
    }
}
```

# REPLACE\$1WITH\$1NULL


Replaces each invalid value in a column with a null value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column.
+ `advancedDataType` – Special data types that are detected by DataBrew in a column that has the data type `string`. The types that DataBrew can detect within a `string` column include SSN, Email, Phone Number, Gender, Credit Card, URL, IP Address, DateTime, Currency, ZipCode, Country, Region, State, and City.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_NULL",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "weight_kg"
        }
    }
}
```

# REPLACE\$1WITH\$1ROLLING\$1AVERAGE


Replaces each value in a column with the rolling average from a previous "window" of rows.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column. This type must be `number`.
+ `period` - – The size of the window. For example, if `period` is 10, the rolling average is computed using the previous 10 rows.

**Example**  
  

```
{
    "RecipeStep": {
        "Action": {
            "Operation": "REPLACE_WITH_ROLLING_AVERAGE",
            "Parameters": {
                "sourceColumn": "created_at",
                "columnDataType": "number",
                "period": "2"
            }
        }
    }
}
```

# REPLACE\$1WITH\$1ROLLING\$1SUM


Replaces each value in a column with the rolling sum from a previous "window" of rows.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column. This type must be `number`.
+ `period` - – The size of the window. For example, if `period` is 10, the rolling sum is computed using the previous 10 rows.

**Example**  
  

```
{
    "RecipeStep": {
        "Action": {
            "Operation": "REPLACE_WITH_ROLLING_SUM",
            "Parameters": {
                "sourceColumn": "created_at",
                "columnDataType": "number",
                "period": "2"
            }
        }
    }
}
```

# REPLACE\$1WITH\$1SUM


Replaces each invalid value in a column with the sum of all other values.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `columnDataType` – The data type of the column. This type must be `number`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_SUM",
        "Parameters": {
            "columnDataType": "number",
            "sourceColumn": "games_won"
        }
    }
}
```

# Personally identifiable information (PII) recipe steps
PII recipe steps

Use these recipe steps to perform transformations on personally identifiable information (PII) in a dataset.

**Note**  
In addition to the recipe steps in this section, there are DataBrew recipe steps not designed specifically for PII that you can use to handle PII. An example is [DELETE](recipe-actions.DELETE.md), a basic column recipe step that deletes a column.

**Topics**
+ [

# CRYPTOGRAPHIC\$1HASH
](recipe-actions.CRYPTOGRAPHIC_HASH.md)
+ [

# DECRYPT
](recipe-actions.DECRYPT.md)
+ [

# DETERMINISTIC\$1DECRYPT
](recipe-actions.DETERMINISTIC_DECRYPT.md)
+ [

# DETERMINISTIC\$1ENCRYPT
](recipe-actions.DETERMINISTIC_ENCRYPT.md)
+ [

# ENCRYPT
](recipe-actions.ENCRYPT.md)
+ [

# MASK\$1CUSTOM
](recipe-actions.MASK_CUSTOM.md)
+ [

# MASK\$1DATE
](recipe-actions.MASK_DATE.md)
+ [

# MASK\$1DELIMITER
](recipe-actions.MASK_DELIMITER.md)
+ [

# MASK\$1RANGE
](recipe-actions.MASK_RANGE.md)
+ [

# REPLACE\$1WITH\$1RANDOM\$1BETWEEN
](recipe-actions.REPLACE_WITH_RANDOM_BETWEEN.md)
+ [

# REPLACE\$1WITH\$1RANDOM\$1DATE\$1BETWEEN
](recipe-actions.REPLACE_WITH_RANDOM_DATE_BETWEEN.md)
+ [

# SHUFFLE\$1ROWS
](recipe-actions.SHUFFLE_ROWS.md)

# CRYPTOGRAPHIC\$1HASH


Applies an algorithm to hash values in the column.

**Parameters**
+ `sourceColumns` – An array of existing columns.
+ `secretId` – The ARN of the Secrets Manager secret key. The key used in the hash-based message authentication code (HMAC) prefix algorithm to hash the source columns, or `databrew!default` is the base64 decoded output for the value of the Secrets Manager secret key.
+ `secretVersion` – Optional. Defaults to the latest secret version.
+ `entityTypeFilter` – Optional array of [entity types](https://docs.aws.amazon.com/databrew/latest/dg/API_EntityDetectorConfiguration.html#databrew-Type-EntityDetectorConfiguration-EntityTypes). Can be used to encrypt only detected PII in free-text column.
+ `createSecretIfMissing` – Optional boolean. If true will attempt to create the secret on behalf of the caller.
+ `algorithm` – The algorithm used to hash your data. Valid enum values: MD5, SHA1, SHA256, SHA512, HMAC\$1MD5, HMAC\$1SHA1, HMAC\$1SHA256, HMAC\$1SHA512

  Each option refers to a different hashing algorithm. Those options with the "HMAC" prefix refer to a keyed hashing algorithm, and require the `secretId` parameter. For options without the "HMAC" prefix, the `secretId` parameter is not required.

  If you do not provide a hash algorithm, the service defaults to "HMAC\$1SHA256".

```
{
   "sourceColumns": ["phonenumber"],   
   "secretId": "arn:aws:secretsmanager:us-east-1:012345678901:secret:mysecret",
   "entityTypeFilter": ["USA_ALL"]
}
```

When working in the interactive experience, in addition to the project’s role, the console user must have permission to `secretsmanager:GetSecretValue` on the provided Secrets Manager secret.

**Sample policy:**

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:us-east-1:012345678901:secret:mysecret"
            ]
        }
    ]
}
```

------

You may also opt to use the DataBrew-created default secret by passing `databrew!default` as secretId and parameter `createSecretIfMissing` as true. This is not recommended for production. Anyone with the **AwsGlueDataBrewFullAccessPolicy** role can use the default secret.

# DECRYPT


You can use the DECRYPT transform to decrypt inside of DataBrew. Your data can also be decrypted outside of DataBrew with the AWS Encryption SDK. If the provided KMS key ARN does not match what was used to encrypt the column, the decrypt operation fails. For more information on the AWS Encryption SDK, see [What is the AWS Encryption SDK](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/introduction.html) in the *AWS Encryption SDK Developer Guide*.

**Parameters**
+ `sourceColumns` – An array of existing columns.
+ `kmsKeyArn` – The key ARN of the AWS Key Management Service key to use to decrypt the source columns. For more information on the key ARN, see [Key ARN](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#key-id-key-ARN) in the *AWS Key Management Service Developer Guide*. 

```
{
   "sourceColumns": ["phonenumber"],
   "kmsKeyArn": "arn:aws:kms:us-east-1:012345678901:key/<kms-key-id>"
}
```

When working in the interactive experience, in addition to the project’s role, the console user must have permission to `kms:GenerateDataKey` and `kms:Decrypt` on the provided KMS key.

**Sample policy:**

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
        "Effect": "Allow",
        "Action": [
            "kms:GenerateDataKey",
            "kms:Decrypt"
        ],
        "Resource": [
            "arn:aws:kms:us-east-1:012345678901:key/kms-key-id"
        ]
    }
  ]
}
```

------

# DETERMINISTIC\$1DECRYPT


Decrypts data encrypted with DETERMINISTIC\$1ENCRYPT.

This transformation is a no-op if the provided secret id and version does not match what was used to encrypt the column.

**Parameters**
+ `sourceColumns` – An array of existing columns.
+ `secretId` – The ARN of the Secrets Manager secret key to use to decrypt the source columns.
+ `secretVersion` – Optional. Defaults to the latest secret version.

**Example**

```
{
   "sourceColumns": ["phonenumber"],   
   "secretId": "arn:aws:secretsmanager:us-east-1:012345678901:secret:mysecret",
   "secretVersion": "adfe-1232-7563-3123"
}
```

When working in the interactive experience, in addition to the project’s role, the console user must have permission to secretsmanager:GetSecretValue on the provided Secrets Manager secret.

**Sample policy:**

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:us-east-1:012345678901:secret:mysecret"
            ]
        }
    ]
}
```

------

# DETERMINISTIC\$1ENCRYPT


Encrypts the column using AES-GCM-SIV with a 256 bit key. Data encrypted with DETERMINISTIC\$1ENCRYPT can only be decrypted inside of DataBrew with the DETERMINISTIC\$1DECRYPT transform. This transform does not use AWS KMS or the AWS Encryption SDK, and instead uses the [AWS LC github library](https://github.com/awslabs/aws-lc).

Can encrypt up to 400KB per cell. Does not preserve data type on decrypt.

**Note**  
Note: Using a secret for more than a year is discouraged.

**Parameters**
+ `sourceColumns` – An array of existing columns.
+ `secretId` – The ARN of the Secrets Manager secret key to use to encrypt the source columns, or databrew\$1default.
+ `secretVersion` – Optional. Defaults to the latest secret version.
+ `entityTypeFilter` – Optional array of [entity types](https://docs.aws.amazon.com/databrew/latest/dg/API_EntityDetectorConfiguration.html#databrew-Type-EntityDetectorConfiguration-EntityTypes). Can be used to encrypt only detected PII in free-text column.
+ `createSecretIfMissing` – Optional boolean. If true will attempt to create the secret on behalf of the caller.

**Example**

```
{
   "sourceColumns": ["phonenumber"],   
   "secretId": "arn:aws:secretsmanager:us-east-1:012345678901:secret:mysecret",
   "secretVersion": "adfe-1232-7563-3123",
   "entityTypeFilter": ["USA_ALL"]
}
```

When working in the interactive experience, in addition to the project’s role, the console user must have permission to `secretsmanager:GetSecretValue` on the provided Secrets Manager secret.

**Sample policy**

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:us-east-1:012345678901:secret:mysecret"
            ]
        }
    ]
}
```

------

# ENCRYPT


Encrypts values in the source columns with the [AWS Encryption SDK](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/introduction.html). The DECRYPT transform can be used to decrypt inside of DataBrew. You can also decrypt the data outside of DataBrew using the AWS Encryption SDK.

The ENCRYPT transform can encrypt up to 128 MiB per cell. It will attempt to preserve the format on decryption. To preserve the data type, the data type metadata must serialize to less than 1KB. Otherwise, you must set the `preserveDataType` parameter to false. The data type metadata will be stored in plaintext in the encryption context. For more information on the encryption context, see [Encryption context](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context) in the *AWS Key Management Service Developer Guide*.

**Parameters**
+ `sourceColumns` – An array of existing columns.
+ `kmsKeyArn` – The key ARN of the AWS Key Management Service key to use to encrypt the source columns. For more information on the key ARN, see [Key ARN](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#key-id-key-ARN) in the *AWS Key Management Service Developer Guide*.
+ `entityTypeFilter` – Optional array of [entity types](https://docs.aws.amazon.com/databrew/latest/dg/API_EntityDetectorConfiguration.html#databrew-Type-EntityDetectorConfiguration-EntityTypes). Can be used to encrypt only detected PII in free-text column.
+ `preserveDataType` – Optional boolean. Defaults to true. If false, the data type will not be stored.

In the following example, `entityTypeFilter` and `preserveDataType` are optional.

**Example**

```
{
    "sourceColumns": ["phonenumber"],
    "kmsKeyArn": "arn:aws:kms:us-east-1:012345678901:key/kms-key-id",
    "entityTypeFilter": ["USA_ALL"],
    "preserveDataType": "true"
}
```

When working in the interactive experience, in addition to the project’s role, the console user must have permission to `kms:GenerateDataKey` on the provided AWS KMS key.

**Sample policy:**

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
        "Effect": "Allow",
        "Action": [
            "kms:GenerateDataKey"
        ],
        "Resource": [
            "arn:aws:kms:us-east-1:012345678901:key/kms-key-id"
        ]
    }
  ]
}
```

------

# MASK\$1CUSTOM


Masks characters that match a provided custom value.

**Parameters**
+ `sourceColumns` – A list of existing column names.
+ `maskSymbol` – A symbol that will be used to replace specified characters.
+ `regex` – If true, treats `customValue` as a regex pattern to match.
+ `customValue` – All occurrences (or regex matches) of `customValue` will be masked in the string.
+ `entityTypeFilter` – Optional array of [entity types](https://docs.aws.amazon.com/databrew/latest/dg/API_EntityDetectorConfiguration.html#databrew-Type-EntityDetectorConfiguration-EntityTypes). Can be used to encrypt only detected PII in free-text column.

**Example**  
  

```
// Mask all occurrences of 'amazon' in the column
{ 
    "RecipeAction": {
        "Operation": "MASK_CUSTOM",
        "Parameters": {
            "sourceColumns": ["company"],
            "maskSymbol": "#",
            "customValue": "amazon"
        }
    }
}
```

# MASK\$1DATE


Masks components of a date with a user-specified mask symbol.

**Parameters**
+ `sourceColumns` – A list of existing column names.
+ `maskSymbol` – A symbol that will be used to replace specified characters.
+ `redact` – An array of date component enums to mask. Valid enum values: YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND.
+ `locale` – Optional IETF BCP 47 language tag. Defaults to `en`. The locale to use for date formatting.

**Example**  
  

```
// Mask year
{ 
    "RecipeAction": {
        "Operation": "MASK_DATE",
        "Parameters": {
            "sourceColumns": ["birthday"],
            "maskSymbol": "#",
            "redact": ["YEAR"]
        }
    }
}
```

# MASK\$1DELIMITER


Masks characters between two delimiters with a user-specified masking symbol.

**Parameters**
+ `sourceColumns` – A list of existing column names.
+ `maskSymbol` – A symbol that will be used to replace specified characters.
+ `startDelimiter` – A character indicating where masking is to begin. Omitting this parameter will apply the mask starting from the start of the string.
+ `endDelimiter` – A character indicating where masking is to end. Omitting this parameter will apply the masking from the startDelimiter to the end of the string.
+ `preserveDelimiters` – If true, applies mask to delimiters.
+ `alphabet` – An array of character sets to preserve during masking. Valid enum values: SYMBOLS, WHITESPACE.
+ `entityTypeFilter` – Optional array of [entity types](https://docs.aws.amazon.com/databrew/latest/dg/API_EntityDetectorConfiguration.html#databrew-Type-EntityDetectorConfiguration-EntityTypes). Can be used to encrypt only detected PII in free-text column.

**Example**  
  

```
// Mask string between '<' and '>', ignoring white spaces, symbols, and lowercase letters
{ 
    "RecipeAction": {
        "Operation": "MASK_DELIMITER",
        "Parameters": {
            "sourceColumns": ["name"],
            "maskSymbol": "#",
            "startDelimiter": "<",
            "endDelimiter": ">",
            "preserveDelimiters": false,
            "alphabet": ["WHITESPACE", "SYMBOLS"]
        }
    }
}
```

# MASK\$1RANGE


Masks characters between two positions with a user-specified masking symbol.

**Parameters**
+ `sourceColumns` – A list of existing column names.
+ `maskSymbol` – A symbol that will be used to replace specified characters.
+ `start` – A number indicating at which character position the masking is to begin (0-indexed, inclusive). Negative indexing is allowed. Omitting this parameter will apply the mask from the beginning of the string until 'stop'.
+ `stop` – A number indicating at which character position the masking is to end (0-indexed, exclusive). Negative indexing is allowed. Omitting this parameter will apply the mask from 'start' until the end of the string.
+ `alphabet` – An array of character sets enums to preserve during masking. Valid enum values: SYMBOLS, WHITESPACE.
+ `entityTypeFilter` – Optional array of [entity types](https://docs.aws.amazon.com/databrew/latest/dg/API_EntityDetectorConfiguration.html#databrew-Type-EntityDetectorConfiguration-EntityTypes). Can be used to encrypt only detected PII in free-text column.

**Example**  
  

```
// Mask entire string
{ 
    "RecipeAction": {
        "Operation": "MASK_RANGE",
        "Parameters": {
            "sourceColumns": ["firstName", "lastName"],
            "maskSymbol": "#"
        }
    }
}
```

# REPLACE\$1WITH\$1RANDOM\$1BETWEEN


Replaces values with a random number.

**Parameters**
+ `lowerBound` – The lower bound of the random number range.
+ `sourceColumns` – A list of existing column names.
+ `upperBound` – The upper bound of the random number range.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_RANDOM_BETWEEN",
        "Parameters": {
            "lowerBound": "1",
            "sourceColumns": ["column1", "column2"],
            "upperBound": "100"
        }
    }
}
```

# REPLACE\$1WITH\$1RANDOM\$1DATE\$1BETWEEN


Replaces values with a random date.

**Parameters**
+ `startDate` – The start of the range of dates from which a random date will be taken.
+ `sourceColumns` – A list of existing column names.
+ `endDate` – The end of the range of dates from which a random date will be taken.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "REPLACE_WITH_RANDOM_DATE_BETWEEN",
        "Parameters": {
            "startDate": "2020-12-12 12:12:12",
            "sourceColumns": ["column1", "column2"],
            "endDate": "2021-12-12 12:12:12"
        }
    }
}
```

# SHUFFLE\$1ROWS


Shuffles values in a given column. The shuffling can occur with values grouped by a secondary column.

**Parameters**
+ `sourceColumns` – An array of existing columns.
+ `groupByColumns` – An array of columns to group the source columns by while shuffling.

**Example**  
  

```
{
   "sourceColumns": ["age"],
   "*groupByColumns*": ["country"]
}
```

# Outlier detection and handling recipe steps


Use these recipe steps to work with outliers in your data and perform advanced transformations on them..

**Topics**
+ [

# FLAG\$1OUTLIERS
](recipe-actions.FLAG_OUTLIERS.md)
+ [

# REMOVE\$1OUTLIERS
](recipe-actions.outliers.REMOVE_OUTLIERS.md)
+ [

# REPLACE\$1OUTLIERS
](recipe-actions.REPLACE_OUTLIERS.md)
+ [

# RESCALE\$1OUTLIERS\$1WITH\$1Z\$1SCORE
](recipe-actions.RESCALE_OUTLIERS_WITH_Z_SCORE.md)
+ [

# RESCALE\$1OUTLIERS\$1WITH\$1SKEW
](recipe-actions.outliers.RESCALE_OUTLIERS_WITH_SKEW.md)

# FLAG\$1OUTLIERS


Returns a new column containing a customizable value in each row that indicates if the source column value is an outlier.

**Parameters**
+ `sourceColumn` – Specifies the name of an existing numeric column that might contain outliers.
+ `targetColumn` – Specifies the name of a new column where the results of the outlier evaluation strategy is to be inserted.
+ `outlierStrategy` – Specifies the approach to use in detecting outliers. Valid values include the following: 
  + `Z_SCORE` – Identifies a value as an outlier when it deviates from the mean by more than the standard deviation threshold.
  + `MODIFIED_Z_SCORE` – Identifies a value as an outlier when it deviates from the median by more than the median absolute deviation threshold.
  + `IQR` – Identifies a values as an outlier when it falls beyond the first and last quartile of column data. The interquartile range (IQR) measures where the middle 50% of the data points are. 
+ `threshold` – Specifies the threshold value to use when detecting outliers. The `sourceColumn` value is identified as an outlier if the score that's calculated with the `outlierStrategy` exceeds this number. The default is 3.
+ `trueString` – Specifies the string value to use if an outlier is detected. The default is "True".
+ `falseString` – Specifies the string value to use if no outlier is detected. The default is "False".

The following examples display syntax for a single [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html) operation. A *recipe* contains at least one [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html) operation, and a recipe step contains at least one recipe action. A *recipe action* runs the data transform that you specify. A group of recipe actions run in sequential order to create the final dataset.

------
#### [ JSON ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using JSON syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in JSON**  

```
{
    "Action": {
        "Operation": "FLAG_OUTLIERS",
        "Parameters": {
            "sourceColumn": "name-of-existing-column",
            "targetColumn": "name-of-new-column",
            "outlierStrategy": "IQR",
            "threshold": "1.5",
            "trueString": "Yes",
            "falseString": "No"
        }
    }
}
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------
#### [ YAML ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using YAML syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in YAML**  

```
- Action:
  Operation: FLAG_OUTLIERS
  Parameters:
    sourceColumn: name-of-existing-column
    targetColumn: name-of-new-column
    outlierStrategy: IQR
    trueString: Outlier
    falseString: No
    threshold: '1.5'
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------

# REMOVE\$1OUTLIERS


Removes data points that classify as outliers, based on the settings in the parameters.

**Parameters**
+ `sourceColumn` – Specifies the name of an existing numeric column that might contain outliers.
+ `outlierStrategy` – Specifies the approach to use in detecting outliers. Valid values include the following: 
  + `Z_SCORE` – Identifies a value as an outlier when it deviates from the mean by more than the standard deviation threshold.
  + `MODIFIED_Z_SCORE` – Identifies a value as an outlier when it deviates from the median by more than the median absolute deviation threshold.
  + `IQR` – Identifies a values as an outlier when it falls beyond the first and last quartile of column data. The interquartile range (IQR) measures where the middle 50% of the data points are. 
+ `threshold` – Specifies the threshold value to use when detecting outliers. The `sourceColumn` value is identified as an outlier if the score that's calculated with the `outlierStrategy` exceeds this number. The default is 3.
+ `removeType` – Specifies the way to remove the data. Valid values include `DELETE_ROWS` and `CLEAR`. 
+ `trimValue` – Specifies whether to remove all or some of the outliers. This Boolean value defaults to `FALSE`.
  + `FALSE` – Removes all outliers
  +  `TRUE` – Removes outliers that rank outside of the percentile threshold specified in `minValue` and `maxValue`.
+ `minValue` – Indicates the minimum percentile value for the outlier range. Valid range is 0–100.
+ `maxValue` – Indicates the maximum percentile value for the outlier range. Valid range is 0–100.

The following examples display syntax for a single [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html) operation. A *recipe* contains at least one [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html) operation, and a recipe step contains at least one recipe action. A *recipe action* runs the data transform that you specify. A group of recipe actions run in sequential order to create the final dataset.

------
#### [ JSON ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using JSON syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in JSON**  

```
{
    "Action": {
        "Operation": "REMOVE_OUTLIERS",
        "Parameters": {
            "sourceColumn": "name-of-existing-column",
            "outlierStrategy": "Z_SCORE",
            "threshold": "3",
            "removeType": "DELETE_ROWS",
            "trimValue": "TRUE",
            "minValue": "5",
            "maxValue": "95"
        }
    }
}
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------
#### [ YAML ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using YAML syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in YAML**  

```
- Action:
  Operation: REMOVE_OUTLIERS
  Parameters:
    sourceColumn: name-of-existing-column
    outlierStrategy: Z_SCORE
    threshold: '3'
    removeType: DELETE_ROWS
    trimValue: 'TRUE'
    minValue: '5'
    maxValue: '95'
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------

# REPLACE\$1OUTLIERS


Updates the data point values that classify as outliers, based on the settings in the parameters.

**Parameters**
+ `sourceColumn` – Specifies the name of an existing numeric column that might contain outliers.
+ `outlierStrategy` – Specifies the approach to use in detecting outliers. Valid values include the following: 
  + `Z_SCORE` – Identifies a value as an outlier when it deviates from the mean by more than the standard deviation threshold.
  + `MODIFIED_Z_SCORE` – Identifies a value as an outlier when it deviates from the median by more than the median absolute deviation threshold.
  + `IQR` – Identifies a values as an outlier when it falls beyond the first and last quartile of column data. The interquartile range (IQR) measures where the middle 50% of the data points are. 
+ `threshold` – Specifies the threshold value to use when detecting outliers. The `sourceColumn` value is identified as an outlier if the score that's calculated with the `outlierStrategy` exceeds this number. The default is 3.
+ `replaceType` – Specifies the method to use when replacing outliers. Valid values include the following:
  + `WINSORIZE_VALUES` – Specifies using the minimum and maximum percentile to cap the values. 
  + `REPLACE_WITH_CUSTOM` 
  + `REPLACE_WITH_EMPTY` 
  + `REPLACE_WITH_NULL`
  + `REPLACE_WITH_MODE`
  + `REPLACE_WITH_AVERAGE`
  + `REPLACE_WITH_MEDIAN`
  + `REPLACE_WITH_SUM`
  + `REPLACE_WITH_MAX`
+ `modeType` – Indicates the type of modal function to use when `replaceType` is `REPLACE_WITH_MODE`. Valid values include the following: `MIN`, `MAX`, and `AVERAGE`.
+ `minValue` – Indicates the minimum percentile value for the outlier range that is to be applied when `trimValue` is used. Valid range is 0–100. 
+ `maxValue` – Indicates the maximum percentile value for the outlier range that is to be applied when `trimValue` is used. . Valid range is 0–100. 
+ `value` – Specifies the value to insert when using `REPLACE_WITH_CUSTOM`.
+  `trimValue` – Specifies whether to remove all or some of the outliers. This Boolean value is set to `TRUE` when `replaceType` is `REPLACE_WITH_NULL`, `REPLACE_WITH_MODE`, or `WINSORIZE_VALUES`. It defaults to `FALSE` for all others.
  + `FALSE` – Removes all outliers
  +  `TRUE` –Removes outliers that rank outside of the percentile cap threshold specified in `minValue` and `maxValue`.

The following examples display syntax for a single [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html) operation. A *recipe* contains at least one [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html) operation, and a recipe step contains at least one recipe action. A *recipe action* runs the data transform that you specify. A group of recipe actions run in sequential order to create the final dataset.

------
#### [ JSON ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using JSON syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in JSON**  

```
{
    "Action": {
        "Operation": "REPLACE_OUTLIERS",
        "Parameters": {
            "maxValue": "95",
            "minValue": "5",
            "modeType": "AVERAGE",
            "outlierStrategy": "Z_SCORE",
            "replaceType": "REPLACE_WITH_MODE",
            "sourceColumn": "name-of-existing-column",
            "threshold": "3",
            "trimValue": "TRUE"
        }
    }
}
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------
#### [ YAML ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using YAML syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in YAML**  

```
- Action:
  Operation: REMOVE_OUTLIERS
  Parameters:
    sourceColumn: name-of-existing-column
    outlierStrategy: Z_SCORE
    threshold: '3'
    replaceType: REPLACE_WITH_MODE
    modeType: AVERAGE
    minValue: '5'
    maxValue: '95'
    trimValue: 'TRUE'
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------

# RESCALE\$1OUTLIERS\$1WITH\$1Z\$1SCORE


Returns a new column with a rescaled outlier value in each row, based on the settings in the parameters. This action also applies Z-score normalization to linearly scale data values to have a mean (μ) of 0 and standard deviation (σ) of 1. We recommend this action for handling outliers. 

**Parameters**
+ `sourceColumn` – Specifies the name of an existing numeric column that might contain outliers.
+ `targetColumn` – Specifies the name of an existing numeric column that might contain outliers.
+ `outlierStrategy` – Specifies the approach to use in detecting outliers. Valid values include the following: 
  + `Z_SCORE` – Identifies a value as an outlier when it deviates from the mean by more than the standard deviation threshold.
  + `MODIFIED_Z_SCORE` – Identifies a value as an outlier when it deviates from the median by more than the median absolute deviation threshold.
  + `IQR` – Identifies a values as an outlier when it falls beyond the first and last quartile of column data. The interquartile range (IQR) measures where the middle 50% of the data points are. 
+ `threshold` – The threshold value to use when detecting outliers. The `sourceColumn` value is identified as an outlier if the score that's calculated with the `outlierStrategy` exceeds this number. The default is 3. 

The following examples display syntax for a single [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html) operation. A *recipe* contains at least one [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html) operation, and a recipe step contains at least one recipe action. A *recipe action* runs the data transform that you specify. A group of recipe actions run in sequential order to create the final dataset.

------
#### [ JSON ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html) operation, using JSON syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in JSON**  

```
{
    "Action": {
        "Operation": "RESCALE_OUTLIERS_WITH_Z_SCORE",
        "Parameters": {
            "sourceColumn": "name-of-existing-column",
            "targetColumn": "name-of-new-column",
            "outlierStrategy": "Z_SCORE",
            "threshold": "3"
        }
    }
}
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------
#### [ YAML ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html) operation, using YAML syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in YAML**  

```
- Action:
  Operation: REMOVE_OUTLIERS
  Parameters:
    sourceColumn: name-of-existing-column
    targetColumn: name-of-new-column
    outlierStrategy: Z_SCORE
    threshold: '3'
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------

# RESCALE\$1OUTLIERS\$1WITH\$1SKEW


Returns a new column with a rescaled outlier value in each row, based on the settings in the parameters. This action works to reduce distribution skewness by applying the specified log or root transform. We recommend this action for handling skewed data. 

**Parameters**
+ `sourceColumn` – Specifies the name of an existing numeric column that might contain outliers.
+ `targetColumn` – Specifies the name of an existing numeric column that might contain outliers.
+ `outlierStrategy` – Specifies the approach to use in detecting outliers. Valid values include the following: 
  + `Z_SCORE` – Identifies a value as an outlier when it deviates from the mean by more than the standard deviation threshold.
  + `MODIFIED_Z_SCORE` – Identifies a value as an outlier when it deviates from the median by more than the median absolute deviation threshold.
  + `IQR` – Identifies a values as an outlier when it falls beyond the first and last quartile of column data. The interquartile range (IQR) measures where the middle 50% of the data points are. 
+ `threshold` – Specifies the threshold value to use when detecting outliers. The `sourceColumn` value is identified as an outlier if the score that's calculated with the `outlierStrategy` exceeds this number. The default is 3.
+ `skewFunction` – Specifies the method to use when replacing outliers. Valid values include the following:
  + LOG – Applies a strong transformation to reduce positive and negative skew. This is a natural logarithm (2.718281828).
  + ROOT (with `value = 3` ) – Applies a fairly strong transformation to reduce positive and negative skew. (Cube root)
  + ROOT (with `value = 2` ) – Applies a moderate transformation to reduce positive skew only. (Square root)
  + SQUARE – Applies a moderate transformation to reduce negative skew. (Square)
  + Custom transform – Applies the specified `LOG` or `ROOT` transform using the custom number provided in the `value` parameter.
+ `value` – Specifies the value to use for the custom transform. If `skewFunction` is LOG, this value represents the base of the log. If `skewFunction` is ROOT, this value represents the power of the root.

The following examples display syntax for a single [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeAction.html) operation. A *recipe* contains at least one [https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html](https://docs.aws.amazon.com/databrew/latest/dg/API_RecipeStep.html) operation, and a recipe step contains at least one recipe action. A *recipe action* runs the data transform that you specify. A group of recipe actions run in sequential order to create the final dataset.

------
#### [ JSON ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using JSON syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in JSON**  

```
{
    "Action": {
        "Operation": "RESCALE_OUTLIERS_WITH_SKEW",
        "Parameters": {
            "outlierStrategy": "Z_SCORE",
            "threshold": "3",
            "skewFunction": "ROOT",
            "sourceColumn": "name-of-existing-column",
            "targetColumn": "name-of-new-column",
            "value": "4"
        }
    }
}
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------
#### [ YAML ]

The following shows an example `RecipeAction` to use as member of an example `RecipeStep` for a DataBrew [Recipe](https://docs.aws.amazon.com/databrew/latest/dg/API_Recipe.html), using YAML syntax. For syntax examples showing a list of recipe actions, see [Defining a recipe structure](recipes.md#recipes.structure).

**Example in YAML**  

```
- Action:
  Operation: RESCALE_OUTLIERS_WITH_SKEW
  Parameters:
    outlierStrategy: Z_SCORE
    threshold: '3'
    skewFunction: ROOT
    sourceColumn: name-of-existing-column
    targetColumn: name-of-new-column
    value: '4'
```

For more information on using this recipe action in an API operation, see [https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_CreateRecipe.html) or [https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html](https://docs.aws.amazon.com/databrew/latest/dg/API_UpdateRecipe.html). You can use these and other API operations in your own code. 

------

# Column structure recipe steps


Use these column structure recipe steps to modify the column structure of your data.

**Topics**
+ [

# BOOLEAN\$1OPERATION
](recipe-actions.BOOLEAN_OPERATION.md)
+ [

# CASE\$1OPERATION
](recipe-actions.CASE_OPERATION.md)
+ [

# FLAG\$1COLUMN\$1FROM\$1NULL
](recipe-actions.FLAG_COLUMN_FROM_NULL.md)
+ [

# FLAG\$1COLUMN\$1FROM\$1PATTERN
](recipe-actions.FLAG_COLUMN_FROM_PATTERN.md)
+ [

# MERGE
](recipe-actions.MERGE.md)
+ [

# SPLIT\$1COLUMN\$1BETWEEN\$1DELIMITER
](recipe-actions.SPLIT_COLUMN_BETWEEN_DELIMITER.md)
+ [

# SPLIT\$1COLUMN\$1BETWEEN\$1POSITIONS
](recipe-actions.SPLIT_COLUMN_BETWEEN_POSITIONS.md)
+ [

# SPLIT\$1COLUMN\$1FROM\$1END
](recipe-actions.SPLIT_COLUMN_FROM_END.md)
+ [

# SPLIT\$1COLUMN\$1FROM\$1START
](recipe-actions.SPLIT_COLUMN_FROM_START.md)
+ [

# SPLIT\$1COLUMN\$1MULTIPLE\$1DELIMITER
](recipe-actions.SPLIT_COLUMN_MULTIPLE_DELIMITER.md)
+ [

# SPLIT\$1COLUMN\$1SINGLE\$1DELIMITER
](recipe-actions.SPLIT_COLUMN_SINGLE_DELIMITER.md)
+ [

# SPLIT\$1COLUMN\$1WITH\$1INTERVALS
](recipe-actions.SPLIT_COLUMN_WITH_INTERVALS.md)

# BOOLEAN\$1OPERATION


Create a new column, based on the result of logical condition IF. Return true value if the boolean expression is true, false value if the boolean expression is false, or return a custom value.

**Parameters**
+ `trueValueExpression` – Result when the condition is met.
+ `falseValueExpression` – Result when the condition is not met.
+ `valueExpression` – Boolean condition.
+ `withExpressions` – Configuration for aggregate results.
+ `targetColumn` – A name for the newly created column.

You can use constant values, column references, and aggregate results in trueValueExpression, falseValueExpression and valueExpression.

**Example: Constant values**  
Values that remain unchanged, like a number or a sentence.  
  

```
{
  "RecipeStep": {
    "Action": {
      "Operation": "BOOLEAN_OPERATION",
      "Parameters": {
        "trueValueExpression": "It is true.",
        "falseValueExpression": "It is false.",
        "valueExpression": "`column.1` < 2000",
        "targetColumn": "result.column"
      }
    }
  }
}
```

**Example: Column references**  
Values that are columns in the dataset.  
  

```
{
  "RecipeStep": {
    "Action": {
      "Operation": "BOOLEAN_OPERATION",
      "Parameters": {
        "trueValueExpression": "`column.2`",
        "falseValueExpression": "`column.3`",
        "valueExpression": "`column.1` < `column.4`",
        "targetColumn": "result.column"
      }
    }
  }
}
```

**Example: Aggregate results**  
Values that are calculated by aggregate functions. An aggregate function performs a calculation on a column, and returns a single value.  
  

```
{
  "RecipeStep": {
    "Action": {
      "Operation": "BOOLEAN_OPERATION",
      "Parameters": {
        "trueValueExpression": "`:mincolumn.2`",
        "falseValueExpression": "`:maxcolumn.3`",
        "valueExpression": "`column.1` < `:avgcolumn.4`",
        "withExpressions": "[{\"name\":\"mincolumn.2\",\"value\":\"min(`column.2`)\",\"type\":\"aggregate\"},{\"name\":\"maxcolumn.3\",\"value\":\"max(`column.3`)\",\"type\":\"aggregate\"},{\"name\":\"avgcolumn.4\",\"value\":\"avg(`column.4`)\",\"type\":\"aggregate\"}]",
        "targetColumn": "result.column"
      }
    }
  }
}
```
Users need to convert the JSON to a string by escaping.  
Note that the parameter names in trueValueExpression, falseValueExpression, and valueExpression must match the names in withExpressions. To use the aggregate results from some columns, you need to create parameters for them and provide the aggregate functions.

**Example:**  

```
{
  "RecipeStep": {
    "Action": {
      "Operation": "BOOLEAN_OPERATION",
      "Parameters": {
        "trueValueExpression": "It is true.",
        "falseValueExpression": "It is false.",
        "valueExpression": "`column.1` < 2000",
        "targetColumn": "result.column"
      }
    }
  }
}
```

**Example: and/or**  
You can use and and or to combine multiple conditions.  
  

```
{
  "RecipeStep": {
    "Action": {
      "Operation": "BOOLEAN_OPERATION",
      "Parameters": {
        "trueValueExpression": "It is true.",
        "falseValueExpression": "It is false.",
        "valueExpression": "`column.1` < 2000 and `column.2` >= `column.3",
        "targetColumn": "result.column"
      }
    }
  }
}
{
  "RecipeStep": {
    "Action": {
      "Operation": "BOOLEAN_OPERATION",
      "Parameters": {
        "trueValueExpression": "`column.4`",
        "falseValueExpression": "`column.5`",
        "valueExpression": "startsWith(`column1`, 'value1') or endsWith(`column2`, 'value2')",
        "targetColumn": "result.column"
      }
    }
  }
}
```

## Valid aggregate functions


The table below shows all of the valid aggregate functions that can be used in a boolean operation.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.BOOLEAN_OPERATION.html)

## Valid conditions in a valueExpression


The table below shows supported conditions and the value expressions you can use.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.BOOLEAN_OPERATION.html)

# CASE\$1OPERATION


Create a new column, based on the result of logical condition CASE. The case operation goes through case conditions and returns a value when the first condition is met. Once a condition is true, the operation stops reading and returns the result. If no conditions are true, it returns the default value.

**Parameters**
+ `valueExpression` – Conditions.
+ `withExpressions` – Configuration for aggregate results.
+ `targetColumn` – Name for the newly created column.

**Example**  
  

```
{
  "RecipeStep": {
    "Action": {
      "Operation": "CASE_OPERATION",
      "Parameters": {
        "valueExpression": "case when `column11` < `column.2` then 'result1' when `column2` < 'value2' then 'result2' else 'high' end",
        "targetColumn": "result.column"
      }
    }
  }
}
```

## Valid aggregate functions


The table below shows all of the valid aggregate functions that can be used in a case operation.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.CASE_OPERATION.html)

## Valid conditions in a valueExpression


The table below shows supported conditions and the value expressions you can use.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.CASE_OPERATION.html)

# FLAG\$1COLUMN\$1FROM\$1NULL


Creates a new column, based on the presence of null values in an existing column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of a new column to be created.
+ `flagType` – A value that must be set to `Null values`.
+ `trueString` – A value for the new column, if a null value is found in the source. If no value is specified, the default is `True`.
+ `falseString` – A value for the new column, if a non-null value is found in the source. If no value is specified, the default is `False`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FLAG_COLUMN_FROM_NULL",
        "Parameters": {
            "flagType": "Null values",
            "sourceColumn": "weight_kg",
            "targetColumn": "is_weight_kg_missing"
        }
    }
}
```

# FLAG\$1COLUMN\$1FROM\$1PATTERN


Creates a new column, based on the presence of a user-specified pattern in an existing column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of a new column to be created.
+ `flagType` – A value that must be set to `Pattern`.
+ `pattern` – A regular expression, indicating the pattern to be evaluated.
+ `trueString` – A value for the new column, if a null value is found in the source. If no value is specified, the default is `True`.
+ `falseString` – A value for the new column, if a non-null value is found in the source. If no value is specified, the default is `False`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FLAG_COLUMN_FROM_PATTERN",
        "Parameters": {
            "falseString": "No",
            "flagType": "Pattern",
            "pattern": "N.*",
            "sourceColumn": "wind_direction",
            "targetColumn": "northerly",
            "trueString": "yes"
        }
    }
}
```

# MERGE


Merges two or more columns into a new column.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of one or more columns to be merged.
+ `delimiter` – An optional separator between the values, to appear in the target column.
+ `targetColumn` – The name of the merged column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MERGE",
        "Parameters": {
            "delimiter": " ",
            "sourceColumns": "[\"first_name\",\"last_name\"]",
            "targetColumn": "Merged Column 1"
        }
    }
}
```

# SPLIT\$1COLUMN\$1BETWEEN\$1DELIMITER


Splits a column into three new columns, according to a beginning and ending delimiter.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `patternOption1` – A JSON-encoded string representing one or more characters that indicate the first delimiter.
+ `patternOption2` – A JSON-encoded string representing one or more characters that indicate the second delimiter.
+ `pattern` – One or more characters to use as a separator, when splitting the data.
+ `includeInSplit` – If true, includes the pattern in the new column; otherwise, the pattern is discarded.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SPLIT_COLUMN_BETWEEN_DELIMITER",
        "Parameters": {
            "patternOption1": "{\"pattern\":\"H\",\"includeInSplit\":true}",
            "patternOption2": "{\"pattern\":\"M\",\"includeInSplit\":true}",
            "sourceColumn": "last_name"
        }
    }
}
```

# SPLIT\$1COLUMN\$1BETWEEN\$1POSITIONS


Splits a column into three new columns, according to offsets that you specify.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `startPosition` – The character position where the split is to begin.
+ `endPosition` – The character position where the split is to end.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SPLIT_COLUMN_BETWEEN_POSITIONS",
        "Parameters": {
            "endPosition": "12",
            "sourceColumn": "last_name",
            "startPosition": "2"
        }
    }
}
```

# SPLIT\$1COLUMN\$1FROM\$1END


Splits a column into two new columns, at an offset from the end of the string.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `position` – The character position, from the right end of the string, where the split is to occur.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SPLIT_COLUMN_FROM_END",
        "Parameters": {
            "position": "1",
            "sourceColumn": "nationality"
        }
    }
}
```

# SPLIT\$1COLUMN\$1FROM\$1START


Splits a column into two new columns, at an offset from the beginning of the string.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `position` – The character position, from the left end of the string, where the split is to occur.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SPLIT_COLUMN_FROM_START",
        "Parameters": {
            "position": "1",
            "sourceColumn": "first_name"
        }
    }
}
```

# SPLIT\$1COLUMN\$1MULTIPLE\$1DELIMITER


Splits a column according to multiple delimiters.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `patternOptions` – A JSON-encoded string representing one or more patterns that determine the split criteria.
+ `pattern` – One or more characters to use as a separator, when splitting the data.
+ `limit` – How many splits to perform. The minimum is 1; the maximum is 20.
+ `includeInSplit` – If true, includes the pattern in the new column; otherwise, the pattern is discarded.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SPLIT_COLUMN_MULTIPLE_DELIMITER",
        "Parameters": {
            "limit": "1",
            "patternOptions": "[{\"pattern\":\",\",\"includeInSplit\":true},{\"pattern\":\" \",\"includeInSplit\":true}]",
            "sourceColumn": "description"
        }
    }
}
```

# SPLIT\$1COLUMN\$1SINGLE\$1DELIMITER


Splits a column into one or more new columns, according to a specific delimiter.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `pattern` – One or more characters to use as a separator, when splitting the data.
+ `limit` – How many splits to perform. The minimum is 1; the maximum is 20.
+ `includeInSplit` – If true, includes the pattern in the new column; otherwise, the pattern is discarded.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SPLIT_COLUMN_SINGLE_DELIMITER",
        "Parameters": {
            "includeInSplit": "true",
            "limit": "1",
            "pattern": "/",
            "sourceColumn": "info_url"
        }
    }
}
```

# SPLIT\$1COLUMN\$1WITH\$1INTERVALS


Splits a column at intervals of *n* characters, where you specify *n*.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `startPosition` – The character position where the split is to begin.
+ `interval` – The number of characters to skip before the next split.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SPLIT_COLUMN_WITH_INTERVALS",
        "Parameters": {
            "interval": "4",
            "sourceColumn": "nationality",
            "startPosition": "1"
        }
    }
}
```

# Column formatting recipe steps


Use column formatting recipe steps to change the format of the data in your columns.

**Topics**
+ [

# NUMBER\$1FORMAT
](recipe-actions.NUMBER_FORMAT.md)
+ [

# FORMAT\$1PHONE\$1NUMBER
](recipe-actions.FORMAT_PHONE_NUMBER.md)

# NUMBER\$1FORMAT


Returns a column in which a numeric value is converted into a formatted string.

**Parameters**
+ `sourceColumn` – String. The name of an existing column.
+ `decimalPlaces` – Integer. The value of number of digits after the decimal separator.
+ `numericDecimalSeparator` – String. One of the following values indicating the decimal separator:
  + "."
  + ","
+ `numericThousandSeparator` – String. One of the following values indicating the thousand separator:
  + null. Indicates that a thousand separator isn't enabled.
  + ","
  + " "
  + "."
  + "\$1\$1"
+ `numericAbbreviatedUnit` – String. One of the following values indicating the abbreviation unit:
  + null. Indicates that an abbreviation unit isn't enabled.
  + “THOUSAND”
  + "MILLION"
  + "BILLION"
  + "TRILLION"
+ `numericUnitAbbreviation` – String. One of the following values or any custom value, indicating unit abbreviation:
  + null. Indicates that unit abbreviation isn't enabled.
  +     
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.NUMBER_FORMAT.html)

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "NUMBER_FORMAT",
        "Parameters": {
            "sourceColumn": "income",
            "decimalPlaces": "2",
            "numericDecimalSeparator": ".",
            "numericThousandSeparator": ",",
            "numericAbbreviatedUnit": "THOUSAND",
            "numericUnitAbbreviation": "K"
        }
    }
}
```

# FORMAT\$1PHONE\$1NUMBER


Returns a column in which a phone number string is converted into a formatted value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `phoneNumberFormat` – The format to convert the phone number to. If no format is specified, the default is `E.164`, an internationally-recognized standard phone number format. Valid values include the following: 
  + `E164` (omit the period after `E`)
+ `defaultRegion` – A valid region code consisting of two or three uppercase letters that specifies the region for the phone number when no country code is present in the number itself. At most, one of `defaultRegion` or `defaultRegionColumn` can be provided.
+ `defaultRegionColumn` – The name of a column of the [advanced data type](projects.adv-data-types.md#projects.adv-data-types.title) `Country`. The region code from the specified column is used to determine the country code for the phone number when no country code is present in the number itself. At most, one of `defaultRegion` or `defaultRegionColumn` can be provided.

**Notes**
+ Inputs that can't be formatted to a valid phone number remain unmodified. 
+ If no default region is provided, and a phone number doesn't start with a plus symbol (\$1) and country calling code, the phone number isn't formatted.

**Example**  
  
**Example: Fixed default region**  

```
{
    "Action": {
        "Operation": "FORMAT_PHONE_NUMBER",
        "Parameters": {
            "sourceColumn": "Phone Number",
            "defaultRegion": "US"
        }
    }
}
```
**Example: Default region column option**  

```
{
    "Action": {
        "Operation": "FORMAT_PHONE_NUMBER",
        "Parameters": {
            "sourceColumn": "Phone Number",
            "defaultRegionColumn": "Country Code"
        }
    }
}
```

# Data structure recipe steps


Use these recipe steps to tabulate and summarize data from different perspectives, or to perform advanced functions.

**Topics**
+ [

# NEST\$1TO\$1ARRAY
](recipe-actions.NEST_TO_ARRAY.md)
+ [

# NEST\$1TO\$1MAP
](recipe-actions.NEST_TO_MAP.md)
+ [

# NEST\$1TO\$1STRUCT
](recipe-actions.NEST_TO_STRUCT.md)
+ [

# UNNEST\$1ARRAY
](recipe-actions.UNNEST_ARRAY.md)
+ [

# UNNEST\$1MAP
](recipe-actions.UNNEST_MAP.md)
+ [

# UNNEST\$1STRUCT
](recipe-actions.UNNEST_STRUCT.md)
+ [

# UNNEST\$1STRUCT\$1N
](recipe-actions.UNNEST_STRUCT_N.md)
+ [

# GROUP\$1BY
](recipe-actions.GROUP_BY.md)
+ [

# JOIN
](recipe-actions.JOIN.md)
+ [

# PIVOT
](recipe-actions.PIVOT.md)
+ [

## SCALE
](#recipe-actions.SCALE)
+ [

# TRANSPOSE
](recipe-actions.TRANSPOSE.md)
+ [

# UNION
](recipe-actions.UNION.md)
+ [

# UNPIVOT
](recipe-actions.UNPIVOT.md)

# NEST\$1TO\$1ARRAY


Converts user-selected columns into array values. The order of the selected columns is maintained while creating the resultant array. The different column data types are typecast to a common type that supports the data types of all columns.

**Parameters**
+ `sourceColumns` — List of the source columns.
+ `targetColumn` — The name of the target column.
+ `removeSourceColumns` — Contains the value `true` or `false` to indicate whether or not the user wants to remove the selected source columns.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "NEST_TO_ARRAY",
        "Parameters": {
            "sourceColumns": "[\"age\",\"weight_kg\",\"height_cm\"]",
            "targetColumn": "columnName",
            "removeSourceColumns": "true"
        }
    }
}
```

# NEST\$1TO\$1MAP


Converts user-selected columns into key-value pairs, each with a key representing the column name and a value representing the row value. The order of the selected column is not maintained while creating the resultant map. The different column data types are typecast to a common type that supports the data types of all columns.

**Parameters**
+ `sourceColumns` — List of the source columns.
+ `targetColumn` — The name of the target column.
+ `removeSourceColumns` — Contains the value `true` or `false` to indicate whether or not the user wants to remove the selected source columns.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "NEST_TO_MAP",
        "Parameters": {
            "sourceColumns": "[\"age\",\"weight_kg\",\"height_cm\"]",
            "targetColumn": "columnName",
            "removeSourceColumns": "true"
        }
    }
}
```

# NEST\$1TO\$1STRUCT


Converts user-selected columns into key-value pairs, each with a key representing the column name and a value representing the row value. The order of the selected columns and the data type of each column are maintained in the resultant struct.

**Parameters**
+ `sourceColumns` — List of the source columns.
+ `targetColumn` — The name of the target column.
+ `removeSourceColumns` — Contains the value `true` or `false` to indicate whether or not the user wants to remove the selected source columns.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "NEST_TO_STRUCT",
        "Parameters": {
            "sourceColumns": "[\"age\",\"weight_kg\",\"height_cm\"]",
            "targetColumn": "columnName",
            "removeSourceColumns": "true"
        }
    }
}
```

# UNNEST\$1ARRAY


Unnests a column of type `array` into a new column. If the array contains more than one value, then a row corresponding to each element is generated. This function only unnests one level of an array column.

**Parameters**
+ `sourceColumn` — The name of an existing column. This column must be of `struct` type.
+ `targetColumn` — Name of the target column that is generated.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "UNNEST_ARRAY",
        "Parameters": {
            "sourceColumn": "address",
            "targetColumn": "address"
        }
    }
}
```

# UNNEST\$1MAP


Unnests a column of type `map` and generates a column for the key and value. If there is more than one key-value pair, a row corresponding to each key value would be generated. This function only unnests one level of a map column.

**Parameters**
+ `sourceColumn` — The name of an existing column. This column must be of `struct` type.
+ `removeSourceColumn` — If `true`, the source column is deleted after the function is complete.
+ `targetColumn` — If provided, each of the generated column will start with this as the prefix.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "UNNEST_MAP",
        "Parameters": {
            "sourceColumn": "address",
            "removeSourceColumn": "false",
            "targetColumn": "address"
        }
    }
}
```

# UNNEST\$1STRUCT


Unnest a column of type `struct` and generates a column for each of the keys present in the struct. This function only unnests struct level one.

**Parameters**
+ `sourceColumn` — The name of an existing column. This column must be of struct type.
+ `removeSourceColumn` — If `true`, the source column is deleted after the function is complete.
+ `targetColumn` — If provided, each of the generated column will start with this as the prefix.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "UNNEST_STRUCT",
        "Parameters": {
            "sourceColumn": "address",
            "removeSourceColumn": "false"
            "targetColumn": "add"
        }
    }
}
```

# UNNEST\$1STRUCT\$1N


Creates a new column for each field of a selected column of type `struct`.

For example, given the following struct:

```
            user {
               name: “Ammy” 
               address: {
                  state: "CA",
                  zipcode: 12345
               }
            }
```

This function creates 3 columns:


| user.name | user.address.state | user.address.zipcode | 
| --- | --- | --- | 
|  Ammy  |  CA  |  12345  | 

**Parameters**
+ `sourceColumns` — List of the source columns.
+ `regexColumnSelector` — A regular expression to select the columns to unnest.
+ `removeSourceColumn` — A Boolean value. If true, then remove the source column; otherwise keep it.
+ `unnestLevel` — The number of levels to unnest.
+ `delimiter` — The delimiter is used in the newly created column name to separate the different levels of the struct. For example: if the delimiter is “/”, the column name will be in this form: “user/address/state”.
+ `conditionExpressions` — Condition expressions.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "UNNEST_STRUCT_N",
        "Parameters": {
            "sourceColumns": "[\"address\"]",
            "removeSourceColumn": "true",
            "unnestLevel": "2",
            "delimiter": "/"
        }
    }
}
```

# GROUP\$1BY


Summarizes the data by grouping rows by one or more columns, and then applying an aggregation function to each group. 

**Parameters**
+ `sourceColumns` — A JSON-encoded string representing a list of columns that form the basis of each group.
+ `groupByAggFunctions` — A JSON-encoded string representing a list of aggregation function to apply. (If you don't want aggregation, specify `UNAGGREGATED`.)
+ `useNewDataFrame` — If true, the results from GROUP\$1BY are made available in the project session, replacing its current contents.

**Example**  
  

```
[
  {
    "Action": {
      "Operation": "GROUP_BY",
      "Parameters": {
        "groupByAggFunctionOptions": "[{\"sourceColumnName\":\"all_votes\",\"targetColumnName\":\"all_votes_count\",\"targetColumnDataType\":\"number\",\"functionName\":\"COUNT\"}]",
        "sourceColumns": "[\"year\",\"state_name\"]",
        "useNewDataFrame": "true"
      }
    }
  }
]
```

# JOIN


Performs a join operation on two datasets.

**Parameters**
+ `joinKeys` — A JSON-encoded string representing a list of columns from each dataset to act as join keys.
+ `joinType` — The type of join to perform. Must be one of: `INNER_JOIN` \$1 `LEFT_JOIN `\$1 `RIGHT_JOIN` \$1 `OUTER_JOIN `\$1 `LEFT_EXCLUDING_JOIN` \$1 `RIGHT_EXCLUDING_JOIN` \$1 `OUTER_EXCLUDING_JOIN`
+ `leftColumns` — A JSON-encoded string representing a list of columns from the current active dataset.
+ `rightColumns` — A JSON-encoded string representing a list of columns from another (secondary) dataset to join to the current one.
+ `secondInputLocation` — An Amazon S3 URL that resolves to the data file for the secondary dataset.
+ `secondaryDatasetName` — The name of the secondary dataset.

**Example**  
  

```
{
    "Action": {
        "Operation": "JOIN",
        "Parameters": {
            "joinKeys": "[{\"key\":\"assembly_session\",\"value\":\"assembly_session\"},{\"key\":\"state_code\",\"value\":\"state_code\"}]",
            "joinType": "INNER_JOIN",
            "leftColumns": "[\"year\",\"assembly_session\",\"state_code\",\"state_name\",\"all_votes\",\"yes_votes\",\"no_votes\",\"abstain\",\"idealpoint_estimate\",\"affinityscore_usa\",\"affinityscore_russia\",\"affinityscore_china\",\"affinityscore_india\",\"affinityscore_brazil\",\"affinityscore_israel\"]",
            "rightColumns": "[\"assembly_session\",\"vote_id\",\"resolution\",\"state_code\",\"state_name\",\"member\",\"vote\"]",
            "secondInputLocation": "s3://databrew-public-datasets-us-east-1/votes.csv",
            "secondaryDatasetName": "votes"
        }
    }
}
```

# PIVOT


Converts all the row values in a selected column into individual columns with values.

![\[Diagram showing pivot column transformation: original table to new table with columns as values.\]](http://docs.aws.amazon.com/databrew/latest/dg/images/pivot.png)


**Parameters**
+ `sourceColumn` — The name of an existing column. The column can have a maximum of 10 distinct values.
+ `valueColumn` — The name of an existing column. The column can have a maximum of 10 distinct values.
+ `aggregateFunction` — The name of an aggregation function. If you don't want aggregation, use the keyword `COLLECT_LIST`.

**Example**  
  

```
{
    "Action": {
        "Operation": "PIVOT",
        "Parameters": {
            "aggregateFunction": "SUM",
            "sourceColumn": "state_name",
            "valueColumn": "all_votes"
        }
    }
}
```

## SCALE


Scales or normalizes the range of data in a numeric column.

**Parameters**
+ `sourceColumn` — The name of an existing column.
+ `strategy` — The operation to be applied to the column values:
  + `MIN_MAX` — Rescales the values into a range of [0,1].
  + `SCALE_BETWEEN` — Rescales the values into a range of two specified values.
  +  `MEAN_NORMALIZATION` — Rescales the data to have a mean (μ) of 0 and standard deviation (σ) of 1 within a range of [-1, 1].
  +  `Z_SCORE` — Linearly scales data values to have a mean (μ) of 0 and standard deviation (σ) of 1. Best for handling outliers.
+ `targetColumn` — The name of a column to contain the results.

**Example**  
  

```
{
    "Action": {
        "Operation": "NORMALIZATION",
        "Parameters": {
            "sourceColumn": "all_votes",
            "strategy": "MIN_MAX",
            "targetColumn": "all_votes_normalized"
        }
    }
}
```

# TRANSPOSE


Converts all selected rows to columns and columns to rows.

![\[Table transformation from rows to columns, showing data reorganization for improved analysis.\]](http://docs.aws.amazon.com/databrew/latest/dg/images/transpose.png)


**Parameters**
+ `pivotColumns` — A JSON-encoded string representing a list of columns whose rows will be converted to column names.
+ `valueColumns` — A JSON-encoded string representing a list of one or more columns to be converted to rows.
+ `aggregateFunction` — The name of an aggregation function. If you don't want aggregation, use the keyword `COLLECT_LIST`.
+ `newColumn` — The column to hold transposed columns as values.

**Example**  
  

```
{
    "Action": {
        "Operation": "TRANSPOSE",
        "Parameters": {
            "pivotColumns": "[\"Teacher\"]",
            "valueColumns": "[\"Tom\",\"John\",\"Harry\"]",
            "aggregateFunction": "COLLECT_LIST",
            "newColumn": "Student"
        }
    }

}
```

# UNION


Combines the rows from two or more datasets into a single result.

**Parameters**
+ `datasetsColumns` — A JSON-encoded string representing a list of all the columns in the datasets.
+ `secondaryDatasetNames` — A JSON-encoded string representing a list of one or more secondary datasets.
+ `secondaryInputs` — A JSON-encoded string representing a list of Amazon S3 buckets and object key names that tell DataBrew where to find the secondary dataset(s).
+ `targetColumnNames` — A JSON-encoded string representing a list of column names for the results.

**Example**  
  

```
{
    "Action": {
        "Operation": "UNION",
        "Parameters": {
            "datasetsColumns": "[[\"assembly_session\",\"state_code\",\"state_name\",\"year\",\"all_votes\",\"yes_votes\",\"no_votes\",\"abstain\",\"idealpoint_estimate\",\"affinityscore_usa\",\"affinityscore_russia\",\"affinityscore_china\",\"affinityscore_india\",\"affinityscore_brazil\",\"affinityscore_israel\"],[\"assembly_session\",\"state_code\",\"state_name\",null,null,null,null,null,null,null,null,null,null,null,null]]",
            "secondaryDatasetNames": "[\"votes\"]",
            "secondaryInputs": "[{\"S3InputDefinition\":{\"Bucket\":\"databrew-public-datasets-us-east-1\",\"Key\":\"votes.csv\"}}]",
            "targetColumnNames": "[\"assembly_session\",\"state_code\",\"state_name\",\"year\",\"all_votes\",\"yes_votes\",\"no_votes\",\"abstain\",\"idealpoint_estimate\",\"affinityscore_usa\",\"affinityscore_russia\",\"affinityscore_china\",\"affinityscore_india\",\"affinityscore_brazil\",\"affinityscore_israel\"]"
        }
    }
}
```

# UNPIVOT


Converts all the column values in a selected row into individual rows with values.

![\[Table transformation showing column values converted to individual rows with values.\]](http://docs.aws.amazon.com/databrew/latest/dg/images/unpivot.png)


**Parameters**
+ `sourceColumns` — A JSON-encoded string representing a list of one or more columns to be unpivoted.
+ `unpivotColumn` — The value column for the unpivot operation.
+ `valueColumn` — The column to hold unpivoted values.

**Example**  
  

```
{
    "Action": {
        "Operation": "UNPIVOT",
        "Parameters": {
            "sourceColumns": "[\"idealpoint_estimate\"]",
            "unpivotColumn": "unpivoted_idealpoint_estimate",
            "valueColumn": "unpivoted_column_values"
        }
    }
}
```

# Data science recipe steps


Use these recipe steps to tabulate and summarize data from different perspectives, or to perform advanced transformations.

**Topics**
+ [

# BINARIZATION
](recipe-actions.BINARIZATION.md)
+ [

# BUCKETIZATION
](recipe-actions.BUCKETIZATION.md)
+ [

# CATEGORICAL\$1MAPPING
](recipe-actions.CATEGORICAL_MAPPING.md)
+ [

# ONE\$1HOT\$1ENCODING
](recipe-actions.ONE_HOT_ENCODING.md)
+ [

## SCALE
](#recipe-actions.SCALE)
+ [

# SKEWNESS
](recipe-actions.SKEWNESS.md)
+ [

# TOKENIZATION
](recipe-actions.TOKENIZATION.md)

# BINARIZATION


Takes all the values in a selected numeric source column, compares them to a threshold value, and outputs a new column with a 1 or 0 for each row. 

**Parameters**
+ `sourceColumn` – The name of an existing column. 

  `targetColumn` – The name of the new column to be created.

  `threshold` – Number indicating the threshold for assigning the value of 0 or 1.

  `flip` – Option to flip binary assignment so that lower values are assigned 1 and higher values are assigned 0. When the flip parameter is true, values lower than or equal to the threshold value result in 1, and values greater than the threshold value result in 0.

**Example**  
  

```
{
    "Action": {
        "Operation": "BINARIZATION",
        "Parameters": {
            "sourceColumn": "level",
            "targetColumn": "bin",
            "threshold": "100.0",
            "flip": "false"
        }
    }
}
```

# BUCKETIZATION


Bucketization (called Binning in the console) takes the items in a column of numeric values, groups them into bins defined by numeric ranges, and outputs a new column that displays the bin for each row. Bucketization can be done using splits or percentage. The first example below uses splits and the second example uses a percentage.

**Parameters**
+ `sourceColumn` – The name of an existing column.

  `targetColumn` – The name of the new column to be created.

  `bucketNames` – List of bucket names.

  `splits` – List of bucket levels. Buckets are consecutive, and an upper bound for a bucket will be a lower bound for the next bucket.

  `percentage` – Each bucket will be described as a percentage.

**Example using splits**  
  

```
{
    "Action": {
        "Operation": "BUCKETIZATION",
        "Parameters": {
            "sourceColumn": "level",
            "targetColumn": "bin",
            "bucketNames": "[\"Bin1\",\"Bin2\",\"Bin3\"]",
            "splits": "[\"-Infinity\",\"2\",\"20\",\"Infinity\"]"
        }
    }
}
```

**Example using a percentage**  

```
{
    "Action": {
        "Operation": "BUCKETIZATION",
        "Parameters": {
            "sourceColumn": "level",
            "targetColumn": "bin",
            "bucketNames": "[\"Bin1\",\"Bin2\"]",
            "percentage": "50"
        }
    }
}
```

# CATEGORICAL\$1MAPPING


Maps one or more categorical values to numeric or other values

**Parameters**
+ `sourceColumn` – The name of an existing column. 

  `categoryMap` – A JSON-encoded string representing a map of values to categories.

  `deleteOtherRows` – If `true`, all non-mapped rows will be removed from the dataset.

  `other` – When provided, all non-mapped values will be replaced by this value.

  `keepOthers` – If true, all non-mapped values will remain the same.

  `mapType` – The data type of the mapped column.

  `targetColumn` – The name of a column to contain the results.

**Example**  
  

```
{
    "Action": {
        "Operation": "CATEGORICAL_MAPPING",
        "Parameters": {
            "categoryMap": "{\"United States of America\":\"1\",\"Canada\":\"2\",\"Cuba\":\"3\",\"Haiti\":\"4\",\"Dominican Republic\":\"5\"}",
            "deleteOtherRows": "false",
            "keepOthers": "true",           
            "mapType": "NUMERIC",
            "sourceColumn": "state_name",
            "targetColumn": "state_name_mapped"
        }
    }
}
```

# ONE\$1HOT\$1ENCODING


Creates *n* numerical columns, where *n* is the number of unique values in a selected categorical variable.

For example, consider a column named `shirt_size`. Shirts are available in small, medium, large, or extra large. The column data might look like the following.

```
shirt_size
-----------
L
XL
M
S
M
M
S
XL
M
L
XL
M
```

In this scenario, there are four distinct values for `shirt_size`. Therefore, `ONE_HOT_ENCODING` generates four new columns. Each new column is named `shirt_size_x`, where `x` represents a distinct `shirt_size `value.

The results of `shirt_size` and the four generated columns look like this.

```
shirt_size    shirt_size_S    shirt_size_M    shirt_size_L    shirt_size_XL
------------    ------------    ------------    ------------    -------------
L              0               0               1               0
XL             0               0               0               1
M              0               1               0               0
S              1               0               0               0
M              0               1               0               0
M              0               1               0               0
S              1               0               0               0
XL             0               0               0               1
M              0               1               0               0
L              0               0               1               0
XL             0               0               0               1
M              0               1               0               0
```

The column that you specify for `ONE_HOT_ENCODING` can have a maximum of ten (10) distinct values.

**Parameters**
+ `sourceColumn` – The name of an existing column. The column can have a maximum of 10 distinct values.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ONE_HOT_ENCODING",
        "Parameters": {
            "sourceColumn": "shirt_size"
        }
    }
}
```

## SCALE


Scales or normalizes the range of data in a numeric column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `strategy` – The operation to be applied to the column values:
  + `MIN_MAX` – Rescales the values into a range of [0,1]
  + `SCALE_BETWEEN` – Rescales the values into a range of 2 specified values.
  +  `MEAN_NORMALIZATION` – Rescales the data to have a mean (μ) of 0 and standard deviation (σ) of 1 within a range of [-1, 1]
  +  `Z_SCORE` – Linearly scale data values to have a mean (μ) of 0 and standard deviation (σ) of 1. Best for handling outliers.
+ `targetColumn` – The name of a column to contain the results.

**Example**  
  

```
{
    "Action": {
        "Operation": "NORMALIZATION",
        "Parameters": {
            "sourceColumn": "all_votes",
            "strategy": "MIN_MAX",
            "targetColumn": "all_votes_normalized"
        }
    }
}
```

# SKEWNESS


Applies transformations on your data values to change the distribution shape and its skew.

**Parameters**
+ `sourceColumn` – The name of an existing column. 

  `targetColumn` – The name of the new column to be created.

  `skewFunction`
  + `ROOT` – extract value-root. The root can be provided in the `value` parameter.

    `LOG` – log base value. The log base can be provided in the `value` parameter.

    `SQUARE` – square function

  `value` – Argument of the skewFunction.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SKEWNESS",
        "Parameters": {
            "sourceColumn": "level",
            "targetColumn": "bin",
            "skewFunction": "LOG",
            "value": "2.718281828"
        }
    }
}
```

# TOKENIZATION


Splits text into smaller units, or tokens, such as individual words or terms.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `delimiter` — A custom delimiter that appears between tokenized words. (The default behavior is to separate each token by a space.)
+ `expandContractions` — If `ENABLED`, expands contracted words. For example: "don't" becomes "do not".
+ `stemmingMode` — Splits text into smaller units or tokens, such as individual lowercase words or terms. Two stemming modes are available: `PORTER` \$1 `LANCASTER`.
+ `stopWordRemovalMode` — Removes common words like a, an, the, and more. 
+ `customStopWords` — For `StopWordRemovalMode`, allows you to specify a custom list of stop words.
+ `targetColumn` — The name of a column to contain the results.

**Example**  
  

```
{
    "Action": {
        "Operation": "TOKENIZATION",
        "Parameters": {
            "customStopWords": "[]",
            "delimiter": "- ",
            "expandContractions": "ENABLED",
            "sourceColumn": "dimensions",
            "stemmingMode": "PORTER",
            "stopWordRemovalMode": "DEFAULT",
            "targetColumn": "dimensions_tokenized"
        }
    }

}
```

# Mathematical functions


Following, find reference topics for mathematical functions that work with recipe actions.

**Topics**
+ [

# ABSOLUTE
](recipe-actions.functions.ABSOLUTE.md)
+ [

# ADD
](recipe-actions.functions.ADD.md)
+ [

# CEILING
](recipe-actions.functions.CEILING.md)
+ [

# DEGREES
](recipe-actions.functions.DEGREES.md)
+ [

# DIVIDE
](recipe-actions.functions.DIVIDE.md)
+ [

# EXPONENT
](recipe-actions.functions.EXPONENT.md)
+ [

# FLOOR
](recipe-actions.functions.FLOOR.md)
+ [

# IS\$1EVEN
](recipe-actions.functions.ISEVEN.md)
+ [

# IS\$1ODD
](recipe-actions.functions.ISODD.md)
+ [

# LN
](recipe-actions.functions.LN.md)
+ [

# LOG
](recipe-actions.functions.LOG.md)
+ [

# MOD
](recipe-actions.functions.MOD.md)
+ [

# MULTIPLY
](recipe-actions.functions.MULTIPLY.md)
+ [

# NEGATE
](recipe-actions.functions.NEGATE.md)
+ [

# PI
](recipe-actions.functions.PI.md)
+ [

# POWER
](recipe-actions.functions.POWER.md)
+ [

# RADIANS
](recipe-actions.functions.RADIANS.md)
+ [

# RANDOM
](recipe-actions.functions.RANDOM.md)
+ [

# RANDOM\$1BETWEEN
](recipe-actions.functions.RANDOM_BETWEEN.md)
+ [

# ROUND
](recipe-actions.functions.ROUND.md)
+ [

# SIGN
](recipe-actions.functions.SIGN.md)
+ [

# SQUARE\$1ROOT
](recipe-actions.functions.SQUARE_ROOT.md)
+ [

# SUBTRACT
](recipe-actions.functions.SUBTRACT.md)

# ABSOLUTE


Returns the absolute value of the input number in a new column. *Absolute value* is how far the number is from zero, regardless of whether it is positive or negative

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ABSOLUTE",
        "Parameters": {
            "sourceColumn": "freezingTemps",
            "targetColumn": "absValueOfFreezingTemps"
        }
    }
}
```

# ADD


 Sums the input column values in a new column, using (`sourceColumn1` \$1 `sourceColumn2`) or (`sourceColumn1` \$1 `value1`).

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `value1` – A numeric value.
+ `sourceColumn2` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ADD",
        "Parameters": {
            "sourceColumn1": "weight_kg",
            "sourceColumn2": "height_cm",
            "targetColumn": "weight_plus_height"
        }
    }
}
```

# CEILING


Returns the smallest integer number greater than or equal to the input decimal numbers in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value1` – A numeric value.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "CEILING",
        "Parameters": {
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_CEILING"
        }
    }
}
```

# DEGREES


Converts radians for an angle to degrees and returns the result in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DEGREES",
        "Parameters": {
            "sourceColumn": "height_cm",
            "targetColumn": "height_cm_DEGREES"
        }
    }
}
```

# DIVIDE


Divides one input number by another and returns the result in a new column.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `value1` – A numeric value.
+ `sourceColumn2` – The name of an existing column.
+ `value2` – A numeric value.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DIVIDE",
        "Parameters": {
            "sourceColumn1": "height_cm",
            "targetColumn": "divide_by_2",
            "value2": "2"
        }
    }
}
```

# EXPONENT


Returns Euler’s number raised to the *n*th degree in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "EXPONENT",
        "Parameters": {
            "sourceColumn": "age",
            "targetColumn": "age_EXPONENT"
        }
    }
}
```

# FLOOR


Returns the largest integral number greater than or equal to the input number in a new column.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `value` – A numeric value.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FLOOR",
        "Parameters": {
            "targetColumn": "FLOOR Column 1",
            "value": "42"
        }
    }
}
```

# IS\$1EVEN


Returns a Boolean value in a new column that indicates whether the source column or value is even. If the source column or value is a decimal, the result is false.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.
+ `trueString` – A string that indicates whether the value is even.
+ `falseString` – A string that indicates whether the value is *not* even.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "IS_EVEN",
        "Parameters": {
            "falseString": "Value is odd",
            "sourceColumn": "height_cm",
            "targetColumn": "height_cm_IS_EVEN",
            "trueString": "Value is even"
        }
    }
}
```

# IS\$1ODD


Returns a Boolean value in a new column that indicates whether the source column or value is odd. If the source column or value is a decimal, the result is false.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.
+ `trueString` – A string that indicates whether the value is odd.
+ `falseString` – A string that indicates whether the value is *not* odd.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "IS_ODD",
        "Parameters": {
            "falseString": "Value is even",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_IS_ODD",
            "trueString": "Value is odd"
        }
    }
}
```

# LN


Returns the natural logarithm (Euler’s number) of a value in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "LN",
        "Parameters": {
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_LN"
        }
    }
}
```

# LOG


Returns the logarithm of a value in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.
+ `base` – The base of the logarithm. The default is 10.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "LOG",
        "Parameters": {
            "base": "10",
            "sourceColumn": "age",
            "targetColumn": "age_LOG"
        }
    }
}
```

# MOD


Returns the percent that one number is of another number in a new column.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `sourceColumn2` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MOD",
        "Parameters": {
            "sourceColumn1": "start_date",
            "sourceColumn2": "end_date",
            "targetColumn": "MOD Column 1"
        }
    }
}
```

# MULTIPLY


Multiplies two numbers and returns the result in a new column.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `value1` – A numeric value.
+ `sourceColumn2` – The name of an existing column.
+ `value2` – A numeric value.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MULTIPLY",
        "Parameters": {
            "sourceColumn1": "hourly_rate",
            "sourceColumn2": "hours",
            "targetColumn": "total_pay"
        }
    }
}
```

# NEGATE


Negates a value and returns the result in a new column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "NEGATE",
        "Parameters": {
            "sourceColumn": "age",
            "targetColumn": "age_NEGATE"
        }
    }
}
```

# PI


Returns the value of pi (3.141592653589793) in a new column.

**Parameters**
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "PI",
        "Parameters": {
            "targetColumn": "PI Column 1"
        }
    }
}
```

# POWER


Returns the value of a number to the power of the exponent in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A number whose value is to be raised.
+ `targetColumn` – The name of the new column to be created.
+ `exponent` – The power to which the value will be raised.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "POWER",
        "Parameters": {
            "exponent": "3",
            "sourceColumn": "age",
            "targetColumn": "age_cubed"
        }
    }
}
```

# RADIANS


Converts degrees to radians (divides by 180/pi) and returns the value in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "RADIANS",
        "Parameters": {
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_RADIANS"
        }
    }
}
```

# RANDOM


Returns a random number between 0 and 1 in a new column.

**Parameters**
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "RANDOM",
        "Parameters": {
            "targetColumn": "RANDOM Column 1"
        }
    }
}
```

# RANDOM\$1BETWEEN


In a new column, returns a random number between a specified lower bound (inclusive) and a specified upper bound (inclusive).

**Parameters**
+ `lowerBound` – The lower bound of the random number range.
+ `upperBound` – The upper bound of the random number range.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "RANDOM_BETWEEN",
        "Parameters": {
            "lowerBound": "1",
            "targetColumn": "RANDOM_BETWEEN Column 1",
            "upperBound": "100"
        }
    }
}
```

# ROUND


Rounds a numerical value to the nearest integer in a new column. It rounds up when the fraction is 0.5 or more. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ROUND",
        "Parameters": {
            "sourceColumn": "rating",
            "targetColumn": "rating_ROUND"
        }
    }
}
```

# SIGN


Returns a new column with -1 if the value is less than 0, 0 if the value is 0, and \$11 if the value is greater than 0.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SIGN",
        "Parameters": {
            "sourceColumn": "age",
            "targetColumn": "age_SIGN"
        }
    }
}
```

# SQUARE\$1ROOT


Returns the square root of a value in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SQUARE_ROOT",
        "Parameters": {
            "sourceColumn": "age",
            "targetColumn": "age_SQUARE_ROOT"
        }
    }
}
```

# SUBTRACT


Subtracts one number from another and returns the result in a new column. 

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `value1` – A numeric value.
+ `sourceColumn2` – The name of an existing column.
+ `value2` – A numeric value. 
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SUBTRACT",
        "Parameters": {
            "sourceColumn1": "weight_kg",
            "targetColumn": "weight_minus_10_kg",
            "value2": "10"
        }
    }
}
```

# Aggregate functions


Following, find reference topics for aggregate functions that work with recipe actions.

**Topics**
+ [

# ANY
](recipe-actions.functions.ANY.md)
+ [

# AVERAGE
](recipe-actions.functions.AVERAGE.md)
+ [

# COUNT
](recipe-actions.functions.COUNT.md)
+ [

# COUNT\$1DISTINCT
](recipe-actions.functions.COUNT_DISTINCT.md)
+ [

# KTH\$1LARGEST
](recipe-actions.functions.KTH_LARGEST.md)
+ [

# KTH\$1LARGEST\$1UNIQUE
](recipe-actions.functions.KTH_LARGEST_UNIQUE.md)
+ [

# MAX
](recipe-actions.functions.MAX.md)
+ [

# MEDIAN
](recipe-actions.functions.MEDIAN.md)
+ [

# MIN
](recipe-actions.functions.MIN.md)
+ [

# MODE
](recipe-actions.functions.MODE.md)
+ [

# STANDARD\$1DEVIATION
](recipe-actions.functions.STDDEV.md)
+ [

# SUM
](recipe-actions.functions.SUM.md)
+ [

# VARIANCE
](recipe-actions.functions.VAR.md)

# ANY


 Returns any values from the selected source columns in a new column. Empty and null values are ignored.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ANY",
        "Parameters": {
            "sourceColumns": "[\"age\",\"last_name\"]",
            "targetColumn": "ANY Column 1"
        }
    }
}
```

# AVERAGE


 Calculates the average of the values in the source columns and returns the result in a new column. Any non-number is ignored.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "AVERAGE",
        "Parameters": {
            "sourceColumns": "[\"age\",\"weight_kg\",\"height_cm\"]",
            "targetColumn": "AVERAGE Column 1"
        }
    }
}
```

# COUNT


 Returns the number of values from the selected source columns in a new column. Empty and null values are ignored.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "COUNT",
        "Parameters": {
            "sourceColumns": "[\"ANY Column 1\",\"birth_date\",\"last_name\"]",
            "targetColumn": "COUNT Column 1"
        }
    }
}
```

# COUNT\$1DISTINCT


 Returns the total number of distinct values from the selected source columns in a new column. Empty and null values are ignored.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "COUNT_DISTINCT",
        "Parameters": {
            "sourceColumns": "[\"long_name\",\"weight_kg\"]",
            "targetColumn": "COUNT_DISTINCT Column 1"
        }
    }
}
```

# KTH\$1LARGEST


 Returns the *k*th largest number from the selected source columns in a new column. 

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.
+ `value` – A number representing *k*.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "KTH_LARGEST",
        "Parameters": {
            "sourceColumns": "[\"height_cm\",\"weight_kg\",\"age\"]",
            "targetColumn": "KTH_LARGEST Column 1",
            "value": "2"
        }
    }
}
```

# KTH\$1LARGEST\$1UNIQUE


 Returns the *k*th largest unique number from the selected source columns in a new column.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

  `value` – A number representing *k*.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "KTH_LARGEST_UNIQUE",
        "Parameters": {
            "sourceColumns": "[\"age\",\"height_cm\",\"weight_kg\"]",
            "targetColumn": "KTH_LARGEST_UNIQUE Column 1",
            "value": "3"
        }
    }
}
```

# MAX


 Returns the maximum numerical value from the selected source columns in a new column. Any non-number is ignored.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MAX",
        "Parameters": {
            "sourceColumns": "[\"age\",\"height_cm\",\"weight_kg\"]",
            "targetColumn": "MAX Column 1"
        }
    }
}
```

# MEDIAN


 Returns the median, the middle number of a sorted group of numbers, from the selected source columns in a new column. Any non-number is ignored.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MEDIAN",
        "Parameters": {
            "sourceColumns": "[\"age\",\"years_in_service\"]",
            "targetColumn": "MEDIAN Column 1"
        }
    }
}
```

# MIN


 Returns the minimum value from the selected source columns in a new column. Any non-number is ignored.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MIN",
        "Parameters": {
            "sourceColumns": "[\"age\",\"height_cm\",\"weight_kg\"]",
            "targetColumn": "MIN Column 1"
        }
    }
}
```

# MODE


Returns the mode, the number that appears most often, from the selected source columns in a new column. Any non-number is ignored. For multiple modes, the mode is calculated with the modal function.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MODE",
        "Parameters": {
            "modeType": "MINIMUM",
            "sourceColumns": "[\"years_in_service\",\"age\"]",
            "targetColumn": "MODE Column 1"
        }
    }
}
```

# STANDARD\$1DEVIATION


 Returns the standard deviation from the selected source columns in a new column. 

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "STANDARD_DEVIATION",
        "Parameters": {
            "sourceColumns": "[\"years_in_sservice\",\"age\"]",
            "targetColumn": "STANDARD_DEVIATION Column 1"
        }
    }
}
```

# SUM


Returns the sum of the values from the selected source columns in a new column. Any non-number is treated as 0.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SUM",
        "Parameters": {
            "sourceColumns": "[\"age\",\"years_in_service\"]",
            "targetColumn": "SUM Column 1"
        }
    }
}
```

# VARIANCE


 Returns the variance from the selected source columns in a new column. Variance is defined as `Var(X) = [Sum ((X – mean(X))^2)]/Count(X)`.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing a list of existing columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "VARIANCE",
        "Parameters": {
            "sourceColumns": "[\"age\",\"years_in_service\"]",
            "targetColumn": "VARIANCE Column 1"
        }
    }
}
```

# Text functions


Following, find reference topics for text functions that work with recipe actions.

**Topics**
+ [

# CHAR
](recipe-actions.functions.CHAR.md)
+ [

# ENDS\$1WITH
](recipe-actions.functions.ENDS_WITH.md)
+ [

# EXACT
](recipe-actions.functions.EXACT.md)
+ [

# FIND
](recipe-actions.functions.FIND.md)
+ [

# LEFT
](recipe-actions.functions.LEFT.md)
+ [

# LEN
](recipe-actions.functions.LEN.md)
+ [

# LOWER
](recipe-actions.functions.LOWER.md)
+ [

# MERGE\$1COLUMNS\$1AND\$1VALUES
](recipe-actions.functions.MERGE.md)
+ [

# PROPER
](recipe-actions.functions.PROPER.md)
+ [

# REMOVE\$1SYMBOLS
](recipe-actions.functions.REMOVE_SYMBOLS.md)
+ [

# REMOVE\$1WHITESPACE
](recipe-actions.functions.REMOVE_WHITESPACE.md)
+ [

# REPEAT\$1STRING
](recipe-actions.functions.REPEAT.md)
+ [

# RIGHT
](recipe-actions.functions.RIGHT.md)
+ [

# RIGHT\$1FIND
](recipe-actions.functions.RIGHT_FIND.md)
+ [

# STARTS\$1WITH
](recipe-actions.functions.STARTS_WITH.md)
+ [

# STRING\$1GREATER\$1THAN
](recipe-actions.functions.STRING_GREATER_THAN.md)
+ [

# STRING\$1GREATER\$1THAN\$1EQUAL
](recipe-actions.functions.STRING_GREATER_THAN_EQUAL.md)
+ [

# STRING\$1LESS\$1THAN
](recipe-actions.functions.STRING_LESS_THAN.md)
+ [

# STRING\$1LESS\$1THAN\$1EQUAL
](recipe-actions.functions.STRING_LESS_THAN_EQUAL.md)
+ [

# SUBSTRING
](recipe-actions.functions.SUBSTRING.md)
+ [

# TRIM
](recipe-actions.functions.TRIM.md)
+ [

# UNICODE
](recipe-actions.functions.UNICODE.md)
+ [

# UPPER
](recipe-actions.functions.UPPER.md)

# CHAR


Returns in a new column the Unicode character for each integer in the source column, or for a custom integer value.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – An integer that represents a Unicode value.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "CHAR",
        "Parameters": {
            "sourceColumn": "age",
            "targetColumn": "age_char"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "CHAR",
        "Parameters": {
            "value": 42,
            "targetColumn": "asterisk"
        }
    }
}
```

# ENDS\$1WITH


Returns `true` in a new column if a specified number of rightmost characters, or custom string, matches a pattern.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `pattern` – A regular expression that must match the end of the string.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "ENDS_WITH",
        "Parameters": {
            "sourceColumn": "nationality",
            "pattern": "[Ss]",
            "targetColumn": "nationality_ends_with"
        }
    }
}
```

# EXACT


Creates a new column populated with one of the following:
+ `True` if one string in a column (or value) exactly matches another string in a different column (or value).
+ `False` if there is no match.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `sourceColumn2` – The name of an existing column.
+ `value1` – A character string to evaluate.
+ `value2` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify only one of the following combinations:  
Both of `sourceColumnN`.
One of `sourceColumnN` and one of `valueN`.
Both of `valueN`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "EXACT",
        "Parameters": {
            "sourceColumn1": "nationality",
            "value2": "Argentina",
            "targetColumn": "nationality_exact"
        }
    }
}
```

# FIND


Searching left to right, finds strings that match a specified string from the source column or from a custom value, and returns the result in a new column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `pattern` – A regular expression to search for.
+ `position` – The character position to begin with, from the left end of the string.
+ `ignoreCase` – If `true`, ignore differences of case (between uppercase and lowercase) among letters. To enforce strict matching, use `false` instead.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "FIND",
        "Parameters": {
            "sourceColumn": "city",
            "pattern": "[AEIOU]",
            "position": "1",
            "ignoreCase": "false",
            "targetColumn": "begins_with_a_vowel"
        }
    }
}
```

# LEFT


Given a number of characters, takes the leftmost number of characters in the string from the source column or custom string, and returns the specified number of leftmost characters in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `position` – The character position to begin with, from the left end of the string.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "LEFT",
        "Parameters": {
            "position": "3",
            "sourceColumn": "city",
            "targetColumn": "city_left"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "LEFT",
        "Parameters": {
            "position": "5",
            "value": "How now brown cow",
            "targetColumn": "how_now_5_left_chars"
        }
    }
}
```

# LEN


Returns in a new column the length of strings from the source column or of custom strings.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "LEN",
        "Parameters": {
            "sourceColumn": "last_name",
            "targetColumn": "last_name_len"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "LEN",
        "Parameters": {
            "value": "Hello",
            "targetColumn": "hello_len"
        }
    }
}
```

# LOWER


Converts all alphabetical characters from the strings in the source column or custom strings to lowercase, and returns the result in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "LOWER",
        "Parameters": {
            "sourceColumn": "last_name",
            "targetColumn": "last_name_lower"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "LOWER",
        "Parameters": {
            "value": "GOODBYE",
            "targetColumn": "goodbye_lower"
        }
    }
}
```

# MERGE\$1COLUMNS\$1AND\$1VALUES


Concatenates the strings in the source columns and returns the result in a new column. You can insert a delimiter between the merged values.

**Parameters**
+ `sourceColumns` – The names of two or more existing columns, in JSON-encoded format.
+ `delimiter` – Optional. One or more characters to place between each two source column values. 
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MERGE_COLUMNS_AND_VALUES",
        "Parameters": {
            "sourceColumns": "[\"last_name\",\"birth_date\"]",
            "delimiter": " was born on: ",
            "targetColumn": "merged_column"
        }
    }
}
```

# PROPER


Converts all alphabetical characters from the strings in the source column or custom values to proper case, and returns the result in a new column. 

In *proper case,* also called capital case, the first letter of each word is capitalized and the rest of the word is transformed to lowercase. An example is: The Quick Brown Fox Jumped Over The Fence 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "PROPER",
        "Parameters": {
            "sourceColumn": "first_name",
            "targetColumn": "first_name_proper"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "PROPER",
        "Parameters": {
            "value": "MR. H. SMITH, ESQ.",
            "targetColumn": "formal_name_proper"
        }
    }
}
```

# REMOVE\$1SYMBOLS


Removes characters that aren't letters, numbers, accented Latin characters, or white space from the strings in the source column or custom strings, and returns the result in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_SYMBOLS",
        "Parameters": {
            "sourceColumn": "info_url",
            "targetColumn": "info_url_remove_symbols"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_SYMBOLS",
        "Parameters": {
            "value": "$&#$&HEY!#@@",
            "targetColumn": "without_symbols"
        }
    }
}
```

# REMOVE\$1WHITESPACE


Removes white space from the strings in the source column or custom strings, and returns the result in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_WHITESPACE",
        "Parameters": {
            "sourceColumn": "job_desc",
            "targetColumn": "job_desc_remove_whitespace"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "REMOVE_WHITESPACE",
        "Parameters": {
            "value": "This string has spaces in it",
            "targetColumn": "string_without_spaces"
        }
    }
}
```

# REPEAT\$1STRING


Repeats the strings in the source column or custom input value a specified number of times, and returns the result in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `count` – The number of times to repeat the string.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "REPEAT_STRING",
        "Parameters": {
            "count": 3,
            "sourceColumn": "last_name",
            "targetColumn": "last_name_repeat_string"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "REPEAT_STRING",
        "Parameters": {
            "count": 80,
            "value": "*",
            "targetColumn": "80_stars"
        }
    }
}
```

# RIGHT


Given a number of characters, takes the rightmost number of characters in the strings from the source column or custom strings, and returns the specified number of rightmost characters in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `position` – The character position to begin with, from the right side of the string.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "RIGHT",
        "Parameters": {
            "sourceColumn": "nationality",
            "position": "3",
            "targetColumn": "nationality_right"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "RIGHT",
        "Parameters": {
            "value": "United States of America",
            "position": "7",
            "targetColumn": "usa_right"
        }
    }
}
```

# RIGHT\$1FIND


Searching right to left, finds strings that match a specified string from the source column or from a custom value, and returns the result in a new column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `pattern` – A regular expression to search for.
+ `position` – The character position to begin with, from the right end of the string.
+ `ignoreCase` – If `true`, ignore differences of case (between uppercase and lowercase) among letters. To enforce strict matching, use `false` instead.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "RIGHT_FIND",
        "Parameters": {
            "sourceColumn": "nationality",
            "pattern": "s",
            "position": "1",
            "ignoreCase": "true",
            "targetColumn": "ends_with_an_s"
        }
    }
}
```

# STARTS\$1WITH


Returns `true` in a new column if a specified number of leftmost characters, or custom string, matches a pattern.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `pattern` – A regular expression that must match the start of the string.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "STARTS_WITH",
        "Parameters": {
            "sourceColumn": "nationality",
            "pattern": "[AEIOU]",
            "targetColumn": "nationality_starts_with"
        }
    }
}
```

# STRING\$1GREATER\$1THAN


Creates a new column populated with one of the following:
+ `True` if one string in a column (or value) is greater than another string in a different column (or value).
+ `False` if there is no match.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `sourceColumn2` – The name of an existing column.
+ `value1` – A character string to evaluate.
+ `value2` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify only one of the following combinations:  
Both of `sourceColumnN`.
One of `sourceColumnN` and one of `valueN`.
Both of `valueN`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "STRING_GREATER_THAN",
        "Parameters": {
            "sourceColumn1": "first_name",
            "sourceColumn2": "last_name",
            "targetColumn": "string_greater_than"
        }
    }
}
```

# STRING\$1GREATER\$1THAN\$1EQUAL


Creates a new column populated with one of the following:
+ `True` if one string in a column (or value) is greater than or equal to another string in a different column (or value).
+ `False` if there is no match.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `sourceColumn2` – The name of an existing column.
+ `value1` – A character string to evaluate.
+ `value2` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify only one of the following combinations:  
Both of `sourceColumnN`.
One of `sourceColumnN` and one of `valueN`.
Both of `valueN`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "STRING_GREATER_THAN_EQUAL",
        "Parameters": {
            "sourceColumn1": "nationality",
            "targetColumn": "string_greater_than_equal",
            "value2": "s"
        }
    }
}
```

# STRING\$1LESS\$1THAN


Creates a new column populated with one of the following:
+ `True` if one string in a column (or value) is less than another string in a different column (or value).
+ `False` if there is no match.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `sourceColumn2` – The name of an existing column.
+ `value1` – A character string to evaluate.
+ `value2` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify only one of the following combinations:  
Both of `sourceColumnN`.
One of `sourceColumnN` and one of `valueN`.
Both of `valueN`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "STRING_LESS_THAN",
        "Parameters": {
            "sourceColumn1": "first_name",
            "sourceColumn2": "last_name",
            "targetColumn": "string_less_than"
        }
    }
}
```

# STRING\$1LESS\$1THAN\$1EQUAL


Creates a new column populated with one of the following:
+ `True` if one string in a column (or value) is less than or equal to another string in a different column (or value).
+ `False` if there is no match.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `sourceColumn2` – The name of an existing column.
+ `value1` – A character string to evaluate.
+ `value2` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify only one of the following combinations:  
Both of `sourceColumnN`.
One of `sourceColumnN` and one of `valueN`.
Both of `valueN`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "STRING_LESS_THAN_EQUAL",
        "Parameters": {
            "sourceColumn1": "first_name",
            "targetColumn": "string_less_than_equal",
            "value2": "s"
        }
    }
}
```

# SUBSTRING


Returns in a new column some or all of the specified strings in the source column, based on the user-defined starting and ending index values.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `startPosition` – The character position to begin with, from the left end of the string.
+ `endPosition` – The character position to end with, from the left end of the string.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SUBSTRING",
        "Parameters": {
            "sourceColumn": "last_name",
            "startPosition": "5",
            "endPosition": "8",
            "targetColumn": "chars_5_through_8"
        }
    }
}
```

# TRIM


Removes leading and trailing white space from the strings in the source column or custom strings, and returns the result in a new column. Spaces between words aren't removed.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "TRIM",
        "Parameters": {
            "sourceColumn": "nationality",
            "targetColumn": "nationality_trim"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "TRIM",
        "Parameters": {
            "value": "   This string should be trimmed       ",
            "targetColumn": "string_trimmed"
        }
    }
}
```

# UNICODE


Returns in a new column the Unicode index value for the first character of the strings in the source column or for custom strings.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "UNICODE",
        "Parameters": {
            "sourceColumn": "first_name",
            "targetColumn": "first_name_unicode"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "UNICODE",
        "Parameters": {
            "value": "?",
            "targetColumn": "sixty_three"
        }
    }
}
```

# UPPER


Converts all alphabetical characters from the strings in the source column or custom strings to uppercase, and returns the result in a new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "UPPER",
        "Parameters": {
            "sourceColumn": "last_name",
            "targetColumn": "last_name_upper"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "UPPER",
        "Parameters": {
            "value": "a string of lowercase letters",
            "targetColumn": "string_upper"
        }
    }
}
```

# Date and time functions


Following, find reference topics for date and time functions that work with recipe actions.

**Topics**
+ [

# CONVERT\$1TIMEZONE
](recipe-actions.functions.CONVERT_TIMEZONE.md)
+ [

# DATE
](recipe-actions.functions.DATE.md)
+ [

# DATE\$1ADD
](recipe-actions.functions.DATEADD.md)
+ [

# DATE\$1DIFF
](recipe-actions.functions.DATEDIFF.md)
+ [

# DATE\$1FORMAT
](recipe-actions.functions.DATE_FORMAT.md)
+ [

# DATE\$1TIME
](recipe-actions.functions.DATETIME.md)
+ [

# DAY
](recipe-actions.functions.DAY.md)
+ [

# HOUR
](recipe-actions.functions.HOUR.md)
+ [

# MILLISECOND
](recipe-actions.functions.MILLISECOND.md)
+ [

# MINUTE
](recipe-actions.functions.MINUTE.md)
+ [

# MONTH
](recipe-actions.functions.MONTH.md)
+ [

# MONTH\$1NAME
](recipe-actions.functions.MONTH_NAME.md)
+ [

# NOW
](recipe-actions.functions.NOW.md)
+ [

# QUARTER
](recipe-actions.functions.QUARTER.md)
+ [

# SECOND
](recipe-actions.functions.SECOND.md)
+ [

# TIME
](recipe-actions.functions.TIME.md)
+ [

# TODAY
](recipe-actions.functions.TODAY.md)
+ [

# UNIX\$1TIME
](recipe-actions.functions.UNIX_TIME.md)
+ [

# UNIX\$1TIME\$1FORMAT
](recipe-actions.functions.UNIX_TIME_FORMAT.md)
+ [

# WEEK\$1DAY
](recipe-actions.functions.WEEK_DAY.md)
+ [

# WEEK\$1NUMBER
](recipe-actions.functions.WEEK_NUMBER.md)
+ [

# YEAR
](recipe-actions.functions.YEAR.md)

# CONVERT\$1TIMEZONE


Converts a time value from the source column into a new column based on a specified timezone.

**Parameters**
+ `sourceColumn` – The name of an existing column. The source column can be of type `string`, `date`, or `timestamp`.
+ `fromTimeZone` – Source value timezone. If nothing is specified, the default timezone is UTC.
+ `toTimeZone` – Timezone to be converted to. If nothing is specified, the default timezone is UTC.
+ `targetColumn` – A name for the newly-created column.
+ `dateTimeFormat` – Optional. A format string for the date. If the format isn't specified, the default format is used: `yyyy-mm-dd HH:MM:SS`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "CONVERT_TIMEZONE",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "fromTimeZone": "UTC+08:00",
            "toTimeZone": "UTC+08:00",
            "targetColumn": "DATETIME Column CONVERT_TIMEZONE",
            "dateTimeFormat": "yyyy-mm-dd HH:MM:SS"
        }
    }
}
```

# DATE


Creates a new column containing the date value, from the source columns or from values provided.

**Parameters**
+ `dateTimeFormat` – Optional. A format string for the date, as it is to appear in the new column. If this string isn't specified, the default format is `yyyy-mm-dd HH:MM:SS`.
+ `dateTimeParameters` – A JSON-encoded string representing the components of the date and time:
  + `year`
  + `value`
  + `month`
  + `day`
  + `hour`
  + `second`

  Each component must specify one of the following:
  + `sourceColumn` – The name of an existing column.
  + `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DATE",
        "Parameters": {
            "dateTimeFormat": "mm/dd/yy",
            "dateTimeParameters": "{\"year\":{\"value\":\"2019\"},\"month\":{\"value\":\"12\"},\"day\":{\"value\":\"31\"},\"hour\":{},\"minute\":{},\"second\":{}}",
            "targetColumn": "DATE Column 1"
        }
    }
}
```

# DATE\$1ADD


Adds a year, month, or day to the date from a source column or value, and creates a new column containing the results.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `units` – A unit of measure for adjusting the date. Valid values are `MONTHS`, `YEARS`, `MILLISECONDS`, `QUARTERS`, `HOURS`, `MICROSECONDS`, `WEEKS`, `SECONDS`, `DAYS`, and `MINUTES`.
+ `dateAddValue` – The number of `units` to be added to the date.
+ `dateTimeFormat` – Optional. A format string for the date, as it is to appear in the new column. If not specified, the default format is `yyyy-mm-dd HH:MM:SS`.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DATE_ADD",
        "Parameters": {
            "sourceColumn": "DATE Column 1",
            "units": "DAYS",
            "dateAddValue": "14",
            "dateTimeFormat": "mm/dd/yyyy",
            "targetColumn": "DATE Column 1_DATEADD"
        }
    }
}
```

# DATE\$1DIFF


Creates a new column containing the difference between two dates.

**Parameters**
+ `sourceColumn1` – The name of an existing column.
+ `sourceColumn2` – The name of an existing column.
+ `value1` – A character string to evaluate.
+ `value2` – A character string to evaluate.
+ `units` – A unit of measure for describe the difference between the dates. Valid values are `MONTHS`, `YEARS`, `MILLISECONDS`, `QUARTERS`, `HOURS`, `MICROSECONDS`, `WEEKS`, `SECONDS`, `DAYS`, and `MINUTES`.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can only specify one of the following combinations:  
Both of `sourceColumn1` and `sourceColumn2`.
One of `sourceColumn1` or `sourceColumn2` and one of `value1` or `value2`.
Both of `value1` and `value2`.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DATE_DIFF",
        "Parameters": {
            "value1": "2020-01-01",
            "value2": "2020-10-06",
            "units": "DAYS",
            "targetColumn": "DATEDIFF Column 1"
        }
    }
}
```

# DATE\$1FORMAT


Creates a new column containing a date, in a specific format, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A string to evaluate.
+ `dateTimeFormat` – Optional. A format string for the date, as it is to appear in the new column. If not specified, the default format is `yyyy-mm-dd HH:MM:SS`.
+ `targetColumn` – A name for the newly created column.
**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Examples**  
  

```
{
    "RecipeAction": {
        "Operation": "DATE_FORMAT",
        "Parameters": {
            "sourceColumn": "DATE Column 1",
            "dateTimeFormat": "month*dd*yyyy",
            "targetColumn": "DATE Column 1_DATEFORMAT"
        }
    }
}
```
  

```
{
    "RecipeAction": {
        "Operation": "DATE_FORMAT",
        "Parameters": {
            "value": "22:10:47",
            "dateTimeFormat": "HH:MM:SS",
            "targetColumn": "formatted_date_value"
        }
    }
}
```

# DATE\$1TIME


Creates a new column containing the date and time value, from the source columns or from values provided.

**Parameters**
+ `dateTimeFormat` – Optional. A format string for the date, as it is to appear in the new column. If this string isn't specified, the default format is `yyyy-mm-dd HH:MM:SS`.
+ `dateTimeParameters` – A JSON-encoded string representing the components of the date and time:
  + `year`
  + `value`
  + `month`
  + `day`
  + `hour`
  + `second`

  Each component must specify one of the following:
  + `sourceColumn` – The name of an existing column.
  + `value` – A character string to evaluate.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DATE_TIME",
        "Parameters": {
            "dateTimeFormat": "yyyy-mm-dd HH:MM:SS",
            "dateTimeParameters": "{\"year\":{\"value\":\"2010\"},\"month\":{\"value\":\"5\"},\"day\":{\"value\":\"21\"},\"hour\":{\"value\":\"13\"},\"minute\":{\"value\":\"34\"},\"second\":{\"value\":\"25\"}}",
            "targetColumn": "DATETIME Column 1"
        }
    }
}
```

# DAY


Creates a new column containing the day of the month, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "DAY",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_DAY"
        }
    }
}
```

# HOUR


Creates a new column containing the hour value, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "HOUR",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_HOUR"
        }
    }
}
```

# MILLISECOND


Creates a new column containing the millisecond value from a source column or input value.

**Parameters**
+ `sourceColumn` – The name of an existing column. The source column can be of type `string`, `date`, or `timestamp`.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly-created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MILLISECOND",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_MILLISECOND"
        }
    }
}
```

# MINUTE


Creates a new column containing the minute value, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MINUTE",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_MINUTE"
        }
    }
}
```

# MONTH


Creates a new column containing the number of the month, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MONTH",
        "Parameters": {
            "value": "2018-05-27",
            "targetColumn": "MONTH Column 1"
        }
    }
}
```

# MONTH\$1NAME


Creates a new column containing the name of the month, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "MONTH_NAME",
        "Parameters": {
            "value": "2018-05-27",
            "targetColumn": "MONTHNAME Column 1"
        }
    }
}
```

# NOW


Creates a new column containing the current date and time in the format `yyyy-mm-dd HH:MM:SS`.

**Parameters**
+ `timeZone` – The name of a time zone. If no time zone is specified, then the default is Universal Coordinated Time (UTC).
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "NOW",
        "Parameters": {
            "timeZone": "US/Pacific",
            "targetColumn": "NOW Column 1"
        }
}
```

# QUARTER


Creates a new column containing the date-based quarter from a string that represents a date.

**Note**  
Quarters are designated in the new column as 1, 2, 3, or 4.  
1 is January, February, and March.
2 is April, May, and June.
3 is July, August, and September.
4 is October, November, and December.

**Parameters**
+ `sourceColumn` – The name of an existing column. The source column can be of type `string`, `date`, or `timestamp`.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly-created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "QUARTER",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_QUARTER"
        }
    }
}
```

# SECOND


Creates a new column containing the second value, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "SECOND",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_SECOND"
        }
    }
}
```

# TIME


Creates a new column containing the time value, from the source columns or values provided.

**Parameters**
+ `dateTimeFormat` – Optional. A format string for the date, as it is to appear in the new column. If this string isn't specified, the default format is `yyyy-mm-dd HH:MM:SS`.
+ `dateTimeParameters` – A JSON-encoded string representing the components of the date and time:
  + `year`
  + `value`
  + `month`
  + `day`
  + `hour`
  + `second`

  Each component must specify one of the following:
  + `sourceColumn` – The name of an existing column.
  + `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "TIME",
        "Parameters": {
            "dateTimeFormat": "HH:MM:SS",
            "dateTimeParameters": "{\"year\":{},\"month\":{},\"day\":{},\"hour\":{\"sourceColumn\":\"rand_hour\"},\"minute\":{\"sourceColumn\":\"rand_minute\"},\"second\":{\"sourceColumn\":\"rand_second\"}}",
            "targetColumn": "TIME Column 1"
        }
    }
}
```

# TODAY


Creates a new column containing the current date in the format `yyyy-mm-dd`.

**Parameters**
+ `timeZone` – The name of a time zone. If no time zone is specified, then the default is Universal Coordinated Time (UTC).
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "TODAY",
        "Parameters": {
            "timeZone": "US/Pacific",
            "targetColumn": "TODAY Column 1"
        }
    }
}
```

# UNIX\$1TIME


Creates a new column containing a number representing epoch time (Unix time)—the number of seconds since January 1, 1970—based on a source column or input value. If time zone can be inferred, the output is in that time zone. Otherwise, the output is in Universal Coordinated Time (UTC).

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "UNIX_TIME",
        "Parameters": {
            "sourceColumn": "TIME Column 1",
            "targetColumn": "TIME Column 1_UNIXTIME"
        }
    }
}
```

# UNIX\$1TIME\$1FORMAT


Converts Unix time for a source column or input value to a specified numerical date format, and returns the result in a new column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – An integer that represents a Unix epoch timestamp.
+ `dateTimeFormat` – Optional. A format string for the date, as it is to appear in the new column. If not specified, the default format is `yyyy-mm-dd HH:MM:SS`.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "UNIX_TIME_FORMAT",
        "Parameters": {
            "value": "1601936554",
            "dateTimeFormat": "yyyy-mm-dd HH:MM:SS",
            "targetColumn": "UNIXTIMEFORMAT Column 1"
        }
    }
}
```

# WEEK\$1DAY


Creates a new column containing the day of the week, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "WEEK_DAY",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_WEEKDAY"
        }
    }
}
```

# WEEK\$1NUMBER


Creates a new column containing the number of the week (from 1 to 52), from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "WEEK_NUMBER",
        "Parameters": {
            "sourceColumn": "DATETIME Column 1",
            "targetColumn": "DATETIME Column 1_WEEK_NUMBER"
        }
    }
}
```

# YEAR


Creates a new column containing the year, from a string that represents a date.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – A name for the newly created column.

**Note**  
You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "YEAR",
        "Parameters": {
            "value": "2019-06-12",
            "targetColumn": "YEAR Column 1"
        }
    }
}
```

# Window functions


Following, find reference topics for window functions that work with recipe actions.

**Topics**
+ [

# FILL
](recipe-actions.functions.FILL.md)
+ [

# NEXT
](recipe-actions.functions.NEXT.md)
+ [

# PREV
](recipe-actions.functions.PREV.md)
+ [

# ROLLING\$1AVERAGE
](recipe-actions.functions.ROLLING_AVERAGE.md)
+ [

# ROLLING\$1COUNT\$1A
](recipe-actions.functions.ROLLING_COUNT_A.md)
+ [

# ROLLING\$1KTH\$1LARGEST
](recipe-actions.functions.ROLLING_KTH_LARGEST.md)
+ [

# ROLLING\$1KTH\$1LARGEST\$1UNIQUE
](recipe-actions.functions.ROLLING_KTH_LARGEST_UNIQUE.md)
+ [

# ROLLING\$1MAX
](recipe-actions.functions.ROLLING_MAX.md)
+ [

# ROLLING\$1MIN
](recipe-actions.functions.ROLLING_MIN.md)
+ [

# ROLLING\$1MODE
](recipe-actions.functions.ROLLING_MODE.md)
+ [

# ROLLING\$1STANDARD\$1DEVIATION
](recipe-actions.functions.ROLLING_STANDARD_DEVIATION.md)
+ [

# ROLLING\$1SUM
](recipe-actions.functions.ROLLING_SUM.md)
+ [

# ROLLING\$1VARIANCE
](recipe-actions.functions.ROLLING_VARIANCE.md)
+ [

# ROW\$1NUMBER
](recipe-actions.functions.ROW_NUMBER.md)
+ [

# SESSION
](recipe-actions.functions.SESSION.md)

# FILL


Returns a new column based on a specified source column. For any missing or null values in the source column, `FILL` chooses the most recent nonblank value from a window of rows before and after the source value in question. The chosen value is then placed in the new column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "FILL",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "last_name",
            "targetColumn": "last_name_FILL"
        }
    }
}
```

# NEXT


Returns a new column, where each value represents a value that is *n* rows later in the source column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRows` – A value that represents *n* rows earlier in the source column. For example, if `numRows` is 3, then `NEXT` uses the third-next `sourceColumn` value as the new `targetColumn` value.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "NEXT",
        "Parameters": {
            "numRows": "1",
            "sourceColumn": "age",
            "targetColumn": "age_NEXT"
        }
    }
}
```

# PREV


Returns a new column, where each value represents a value that is *n* rows earlier in the source column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRows` – A value that represents *n* rows earlier in the source column. For example, if `numRows` is 3, then `PREV` uses the third-previous `sourceColumn` value as the new `targetColumn` value.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "PREV",
        "Parameters": {
            "numRows": "1",
            "sourceColumn": "age",
            "targetColumn": "age_PREV"
        }
    }
}
```

# ROLLING\$1AVERAGE


Returns in a new column the rolling average of values from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_AVERAGE",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_AVERAGE"
        }
    }
}
```

# ROLLING\$1COUNT\$1A


Returns in a new column the rolling count of non-null values from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_COUNT_A",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_COUNT_A"
        }
    }
}
```

# ROLLING\$1KTH\$1LARGEST


Returns in a new column the rolling *k*th largest value from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `value` – The value for *k*.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
  {
    "Action": {
      "Operation": "ROLLING_KTH_LARGEST",
      "Parameters": {
        "sourceColumn": "weight_kg",
        "numRowsBefore": "5",
        "numRowsAfter": "5",
        "value": "3"
        "targetColumn": "weight_kg_ROLLING_KTH_LARGEST"
      }
    }
  }
```

# ROLLING\$1KTH\$1LARGEST\$1UNIQUE


Returns in a new column the rolling unique *k*th largest value from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `value` – The value for *k*.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
  {
    "Action": {
      "Operation": "ROLLING_KTH_LARGEST_UNIQUE",
      "Parameters": {
        "sourceColumn": "games_played",
        "numRowsBefore": "3",
        "numRowsAfter": "3",
        "value": "5",
        "targetColumn": "weight_kg_ROLLING_KTH_LARGEST_UNIQUE"
      }
    }
  }
```

# ROLLING\$1MAX


Returns in a new column the rolling maximum of values from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.

  `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_MAX",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_MAX"
        }
    }
}
```

# ROLLING\$1MIN


Returns in a new column the rolling minimum of values from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.

  `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_MIN",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_MIN"
        }
    }
}
```

# ROLLING\$1MODE


Returns in a new column the rolling mode (most common value) from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ modeType – The modal function to apply to the window. Valid values are `NONE`, `MINIMUM`, `MAXIMUM`, and `AVERAGE`.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_MODE",
        "Parameters": {
            "modeType": "MINIMUM",
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_MODE"
        }
    }
}
```

# ROLLING\$1STANDARD\$1DEVIATION


Returns in a new column the rolling standard deviation of values from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_STDEV",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_STDEV"
        }
    }
}
```

# ROLLING\$1SUM


Returns in a new column the rolling sum of values from a specified number of rows before to a specified number of rows after the current row in the specified column. 

**Parameters**
+ `sourceColumn` – The name of an existing column.

  `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_SUM",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_SUM"
        }
    }
}
```

# ROLLING\$1VARIANCE


Returns in a new column the rolling variance of values from a specified number of rows before to a specified number of rows after the current row in the specified column.

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `numRowsBefore` – A number of rows before the current source row, representing the start of the window.
+ `numRowsAfter` – A number of rows after the current source row, representing the end of the window.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROLLING_VAR",
        "Parameters": {
            "numRowsAfter": "10",
            "numRowsBefore": "10",
            "sourceColumn": "weight_kg",
            "targetColumn": "weight_kg_ROLLING_VAR"
        }
    }
}
```

# ROW\$1NUMBER


Returns in a new column a session identifier based on a window created by column names from "group by" and "order by" statements.

**Parameters**
+ `groupByColumns` – A JSON-encoded string describing the "group by" columns.
+ `orderByColumns` – A JSON-encoded string describing the "order by" columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "ROW_NUMBER",
        "Parameters": {
            "groupByColumns": "[\"is public domain\"]",
            "orderByColumns": "[\"dimensions\"]",
            "targetColumn": "Row number"
        }
    }
}
```

# SESSION


Returns in a new column a session identifier based on a window created by column names from "group by" and "order by" statements. 

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `units` – A unit of measure for describe the session length. Valid values are `MONTHS`, `YEARS`, `MILLISECONDS`, `QUARTERS`, `HOURS`, `MICROSECONDS`, `WEEKS`, `SECONDS`, `DAYS`, and `MINUTES`.
+ `value` – The number of `units` to define the time period.
+ `groupByColumns` – A JSON-encoded string describing the "group by" columns.
+ `orderByColumns` – A JSON-encoded string describing the "order by" columns.
+ `targetColumn` – A name for the newly created column.

**Example**  
  

```
{
    "Action": {
        "Operation": "SESSION",
        "Parameters": {
            "sourceColumn": "object number",
            "units": "MINUTES",
            "value": "10",
            "groupByColumns": "[\"is public domain\"]",
            "orderByColumns": "[\"dimensions\"]",
            "targetColumn": "object number_SESSION",
        }
    }
}
```

# Web functions


Following, find reference topics for web functions that work with recipe actions.

**Topics**
+ [

# IP\$1TO\$1INT
](recipe-actions.IP_TO_INT.md)
+ [

# INT\$1TO\$1IP
](recipe-actions.INT_TO_IP.md)
+ [

# URL\$1PARAMS
](recipe-actions.URL_PARAMS.md)

# IP\$1TO\$1INT


Converts the Internet Protocol version 4 (IPv4) value of the source column or other value to the corresponding integer value in the target column, and returns the result in a new column. This function works for IPv4 only.

For example, consider the following IP address.

```
192.168.1.1
```

If you use this value as an input to `IP_TO_INT`, the output value is as follows.

```
3232235777
```

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "IP_TO_INT",
        "Parameters": {
            "sourceColumn": "my_ip_address",
            "targetColumn": "IP_TO_INT Column 1"
        }
    }
}
```

# INT\$1TO\$1IP


Converts the integer value of source column or other value to the corresponding IPv4 value in then target column, and returns the result in a new column. This function works for IPv4 only.

For example, consider the following integer.

```
167772410
```

If you use this value as an input to `INT_TO_IP`, the output value is as follows.

```
10.0.0.250
```

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
[  {
    "RecipeAction": {
      "Operation": "INT_TO_IP",
      "Parameters": {
        "sourceColumn": "my_integer",
        "targetColumn": "INT_TO_IP Column 1"
      }
    }
  }
]
```

# URL\$1PARAMS


Extracts query parameters from a URL string, formats them as a JSON object, and returns the result in a new column.

For example, consider the following URL.

```
https://example.com/?firstParam=answer&secondParam=42
```

If you use this value as an input to `URL_PARAMS`, the output value is as follows.

```
{"firstParam": ["answer"], "secondParam": ["42"]}
```

**Parameters**
+ `sourceColumn` – The name of an existing column.
+ `value` – A character string to evaluate.
+ `targetColumn` – The name of the new column to be created.

You can specify either `sourceColumn` or `value`, but not both.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "URL_PARAMS",
        "Parameters": {
            "sourceColumn": "my_url",
            "targetColumn": "URL_PARAMS Column 1"
        }
    }
}
```

# Other functions


Following, find reference topics for other functions that work with recipe actions. 

**Topics**
+ [

# COALESCE
](recipe-actions.COALESCE.md)
+ [

# GET\$1ACTION\$1RESULT
](recipe-actions.GET_ACTION_RESULT.md)
+ [

# GET\$1STEP\$1DATAFRAME
](recipe-actions.GET_STEP_DATAFRAME.md)

# COALESCE


Returns in a new column the first non-null value found in the array of columns. The order of the columns listed in the function determines the order in which they're searched.

**Parameters**
+ `sourceColumns` – A JSON-encoded string representing list of existing columns.
+ `targetColumn` – The name of the new column to be created.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "COALESCE",
        "Parameters": {
            "sourceColumns": "[\"nation_position\",\"joined\"]",
            "targetColumn": "COALESCE Column 1"
        }
    }
}
```

# GET\$1ACTION\$1RESULT


Fetches the result of a previously submitted action. Only for use in the interactive experience.

**Parameters**
+ `actionId` – The ActionId returned in the original SendProjectSessionAction response.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "GET_ACTION_RESULT",
        "Parameters": {
            "actionId": "7",
        }
    }
}
```

# GET\$1STEP\$1DATAFRAME


Fetches the data frame from a step in the project's recipe. Only for use in the interactive experience. Used with the ViewFrame parameter to paginate across a large data frame.

**Parameters**
+ `stepIndex` – The index of the step in the project's recipe for which to fetch the data frame.

**Example**  
  

```
{
    "RecipeAction": {
        "Operation": "GET_STEP_DATAFRAME",
        "Parameters": {
            "stepIndex": "0"
        }
    }
}
```