

# Processing input and output in Step Functions

**Managing state with variables and JSONata**  
Step Functions recently added variables and JSONata to manage state and transform data.  
Learn more in the blog post [Simplifying developer experience with variables and JSONata in AWS Step Functions](https://aws.amazon.com/blogs/compute/simplifying-developer-experience-with-variables-and-jsonata-in-aws-step-functions/).

When a Step Functions execution receives JSON input, it passes that data to the first state in the workflow as input.

With JSONata, you can retrieve state input from `$states.input`. Your state machine executions also provide that initial input data in the [Context object](input-output-contextobject.md). You can retrieve the original state machine input at any point in your workflow from `$states.context.Execution.Input`.
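
As a minimal illustration (the state name and output fields here are hypothetical), a JSONata state could echo both its own input and the original execution input:

```
# Hypothetical JSONata state referencing state input and execution input
"Show Inputs": {
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": {
    "stateInput": "{% $states.input %}",
    "executionInput": "{% $states.context.Execution.Input %}"
  },
  "End": true
}
```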

When a state exits, its output is available as input to the next state in your state machine. By default, state input passes through as state output, unless you modify the state output. For data that you might need in later steps, consider storing it in variables. For more information, see [Passing data between states with variables](workflow-variables.md).

**QueryLanguage recommendation**  
For new state machines, we recommend the JSONata query language. State machines that do not specify a query language default to JSONPath for backward compatibility. You must opt in to JSONata for your state machines or individual states.

**Processing input and output with JSONata**

With JSONata expressions, you can select and transform data. In the `Arguments` field, you can customize the data sent to the action. The result can be transformed into custom state output in the `Output` field. You can also store data in variables in the `Assign` field. For more information, see [Transforming data with JSONata](transforming-data.md).

The following diagram shows how JSON information moves through a JSONata task state.

![\[Diagram showing JSONata task state flow with input, arguments, output, and action components.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/vars-jsonata.png)


**Processing input and output with JSONPath**

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

For state machines that use JSONPath, the following fields control the flow of data from state to state: `InputPath`, `Parameters`, `ResultSelector`, `ResultPath`, and `OutputPath`. Each JSONPath field can manipulate JSON as it moves through each state in your workflow.
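
As a sketch of how these fields might appear together in a single JSONPath `Task` state (the function name and data shape are illustrative, not from the source):

```
# Illustrative JSONPath Task state using all five I/O fields
"Process Order": {
  "Type": "Task",
  "Resource": "arn:aws:states:::lambda:invoke",
  "InputPath": "$.order",
  "Parameters": {
    "FunctionName": "ProcessOrder",
    "Payload": { "items.$": "$.items" }
  },
  "ResultSelector": { "status.$": "$.Payload.status" },
  "ResultPath": "$.processing",
  "OutputPath": "$.processing",
  "Next": "Next State"
}
```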

JSONPath fields can use [paths](amazon-states-language-paths.md) to select portions of the JSON from the input or the result. A path is a string, beginning with `$`, that identifies nodes within JSON text. Step Functions paths use [JsonPath](https://datatracker.ietf.org/wg/jsonpath/about/) syntax.

The following diagram shows how JSON information moves through a JSONPath task state. The `InputPath` selects the parts of the JSON input to pass to the task of the `Task` state (for example, an AWS Lambda function). You can adjust the data that is sent to your action in the `Parameters` field. Then, with `ResultSelector`, you can select portions of the action result to carry forward. `ResultPath` then selects the combination of state input and task results to pass to the output. `OutputPath` can filter the JSON output to further limit the information that's passed to the output.

![\[Order of filters: InputPath, Parameters, ResultSelector, ResultPath, and OutputPath.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/vars-jsonpath.png)


**Topics**
+ [Passing data between states with variables](workflow-variables.md)
+ [Transforming data with JSONata in Step Functions](transforming-data.md)
+ [Accessing execution data from the Context object in Step Functions](input-output-contextobject.md)
+ [Using JSONPath paths](amazon-states-language-paths.md)
+ [Manipulate parameters in Step Functions workflows](input-output-inputpath-params.md)
+ [Example: Manipulating state data with paths in Step Functions workflows](input-output-example.md)
+ [Specifying state output using ResultPath in Step Functions](input-output-resultpath.md)
+ [Map state input and output fields in Step Functions](input-output-fields-dist-map.md)

# Passing data between states with variables

 With variables and state output, you can pass data between the steps of your workflow. 

 Using workflow variables, you can store data in a step and retrieve that data in future steps. For example, you could store an API response that contains data you might need later. Conversely, state output can only be used as input to the very next step. 

## Conceptual overview of variables


 With workflow variables, you can store data to reference later. For example, Step 1 might store the result from an API request so a part of that request can be re-used later in Step 5. 

 In the following scenario, the state machine fetches data from an API once. In Step 1, the workflow stores the returned API data (up to 256 KiB per state) in a variable `x` to use in later steps. 

 Without variables, you would need to pass the data through output from Step 1 to Step 2 to Step 3 to Step 4 to use it in Step 5. If those intermediate steps do not need the data, passing it from state to state through outputs and inputs is unnecessary effort. 

 With variables, you can store data and use it in any future step. You can also modify, rearrange, or add steps without disrupting the flow of your data. Given the flexibility of variables, you might only need to use **Output** to return data from Parallel and Map sub-workflows, and at the end of your state machine execution. 
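
The scenario above might be sketched as follows, assuming JSONata states and an illustrative Lambda function name:

```
# Step 1 stores the API response in $x; Step 5 reads it later
"Step 1": {
  "Type": "Task",
  "Resource": "arn:aws:states:::lambda:invoke",
  "Arguments": { "FunctionName": "FetchData" },
  "Assign": { "x": "{% $states.result.Payload %}" },
  "Next": "Step 2"
},
# ... Steps 2-4 do not need to pass the data through ...
"Step 5": {
  "Type": "Pass",
  "Output": { "data": "{% $x %}" },
  "End": true
}
```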

 ![\[Diagram showing step 1 assigning a value to $x, used in step 5.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/vars-diag-opt1.png)

 **States that support variables**

 The following state types support `Assign` to declare and assign values to variables: *Pass, Task, Map, Parallel, Choice, Wait.*

 To set a variable, provide a JSON object with variable names and values: 

```
"Assign": {
  "productName": "product1",
  "count" : 42,
  "available" : true
}
```

 To reference a variable, prepend the name with a dollar sign (`$`), for example, `$productName`. 

## Reserved variable: $states


 Step Functions defines a single reserved variable called **`$states`**. In JSONata states, the following structures are assigned to `$states` for use in JSONata expressions: 

```
# Reserved $states variable in JSONata states
$states = {
  "input":       // Original input to the state
  "result":      // API or sub-workflow's result (if successful)
  "errorOutput": // Error Output (only available in a Catch)
  "context":     // Context object
}
```

 On state entry, Step Functions assigns the state input to **`$states.input`**. The value of `$states.input` can be used in all fields that accept JSONata expressions. `$states.input` always refers to the original state input. 

 For `Task`, `Parallel`, and `Map` states:
+ **`$states.result`** refers to the API or sub-workflow’s raw result if successful. 
+ **`$states.errorOutput`** refers to the Error Output if the API or sub-workflow failed.

  `$states.errorOutput` can be used in the `Catch` field’s `Assign` or `Output`. 
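
For example, a `Catch` rule might store the Error Output in a variable before transitioning to a recovery state (a sketch; the state and variable names are illustrative):

```
# Storing $states.errorOutput in a variable from a Catch rule
"Catch": [
  {
    "ErrorEquals": ["States.ALL"],
    "Assign": { "lastError": "{% $states.errorOutput %}" },
    "Next": "Handle Failure"
  }
]
```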

Attempting to access `$states.result` or `$states.errorOutput` in fields and states where they are not accessible will be caught at creation, update, or validation of the state machine. 

The `$states.context` object provides your workflows information about their specific execution, such as `StartTime`, task token, and initial workflow input. To learn more, see [Accessing execution data from the Context object in Step Functions](input-output-contextobject.md).

## Variable name syntax


 Variable names follow the rules for Unicode Identifiers as described in [Unicode® Standard Annex #31](https://unicode.org/reports/tr31/). The first character of a variable name must be a Unicode ID_Start character, and the second and subsequent characters must be Unicode ID_Continue characters. The maximum length of a variable name is 80 characters. 

 The variable name convention is similar to rules for JavaScript and other programming languages. 

## Variable scope


 Step Functions workflows avoid race conditions with variables by using a *workflow-local scope*. 

Workflow-local scope includes all states inside a state machine's **States** field, but not states inside Parallel or Map states. States inside Parallel or Map states can refer to outer scope variables, but they create and maintain their own separate workflow-local variables and values.

`Parallel` branches and `Map` iterations can access variable values from **outer scopes**, but they do not have access to variable values from other concurrent branches or iterations. When handling errors, the `Assign` field in a `Catch` can assign values to variables in the outer scope, that is, the scope in which the Parallel/Map state exists.

 Exception: **Distributed Map states** cannot currently reference variables in outer scopes. 

 A variable exists in a scope if any state in the scope assigns a value to it. To help avoid common errors, a variable assigned in an inner scope cannot have the same name as one assigned in an outer scope. For example, if the top-level scope assigns a value to a variable called `myVariable`, then no other scope (inside a `Map`, `Parallel`) can assign to `myVariable` as well. 

 Access to variables depends on the current scope. Parallel and Map states have their own scope, but can access variables in outer scopes. 

 When a Parallel or Map state completes, its variables go out of scope and are no longer accessible. Use the **Output** field to pass data out of Parallel branches and Map iterations. 
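
For instance, a Parallel state might expose its branch results through the Output field, because the branches' own variables are inaccessible after the state completes (a sketch; the branch contents are placeholders):

```
# Passing data out of a Parallel state with Output
"Run Branches": {
  "Type": "Parallel",
  "Branches": [
    { "StartAt": "Branch A", "States": { "Branch A": { "Type": "Pass", "End": true } } },
    { "StartAt": "Branch B", "States": { "Branch B": { "Type": "Pass", "End": true } } }
  ],
  "Output": { "branchResults": "{% $states.result %}" },
  "Next": "After Parallel"
}
```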

## Assign field in ASL


 The `Assign` field in ASL is used to assign values to one or more variables. The `Assign` field is available at the top level of each state (except `Succeed` and `Fail`), inside `Choice` state rules, and inside `Catch` fields. For example: 

```
# Example of Assign with JSONata
"Store inputs": {
    "Type": "Pass",
    "Next": "Get Current Price",
    "Comment": "Store the input desired price into a variable: $desiredPrice",
    "Assign": {
       "desiredPrice": "{% $states.input.desired_price %}",
       "maximumWait": "{% $states.input.max_days %}"
    }
},
```

 The `Assign` field takes a JSON object. Each top-level field names a variable to assign. In the previous example, the variable names are `desiredPrice` and `maximumWait`. When using JSONata, `{% ... %}` indicates a JSONata expression, which might contain variables or more complex expressions. For more information about JSONata expressions, refer to the [JSONata.org documentation](https://docs.jsonata.org/overview.html). 

 When using **JSONata** as the query language, the following diagram shows how the **Assign** and **Output** fields are processed in parallel. Note the implication: *assigning variable values will not affect state output.*

 ![\[Diagram showing a comparison of JSONPath and JSONata flow.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/vars-jsonata.png)

 The following JSONata example retrieves `order.product` from the state input. The variable `currentPrice` is set to a value from the result of the task. 

```
# Example of Task with JSONata assignment from result
{
   "Type": "Task",
   ...
   "Assign": {
      "product": "{% $states.input.order.product %}",
      "currentPrice": "{% $states.result.Payload.current_price %}"
   },
   "Next": "the next state"
}
```

 Note: You **cannot** assign a value to a part of a variable. For example, you can `"Assign":{"x":42}`, but you cannot `"Assign":{"x.y":42}` or `"Assign":{"x[2]":42}`. 

## Evaluation order in an Assign field


All variable references in Step Functions states use the values as they were on **state entry**. 

This is important for understanding how the `Assign` field assigns values to one or more variables. First, new values are calculated; then, Step Functions assigns the new values to the variables. The new variable values will be available starting with the **next** state. For example, consider the following `Assign` field: 

```
# Starting values: $x=3, $a=6

"Assign": {
  "x": "{% $a %}",
  "nextX": "{% $x %}"
}

# Ending values: $x=6, $nextX=3
```

In the preceding example, the variable `x` is both assigned and referenced. 

Remember, all expressions are ***evaluated first***, then assignments are made. And newly assigned values will be available in the **next** state. 

Let's go through the example in detail. Assume that in a previous state, `$x` was assigned a value of three (3) and `$a` was assigned a value of six (6). The following steps describe the process:

1. All expressions are evaluated, using **current** values of all variables.

   The expression `"{% $a %}"` will evaluate to 6, and `"{% $x %}"` will evaluate to 3.

1. Next, assignments are made:

   `$x` will be assigned the value six (6) 

   `$nextX` will be assigned three (3)

 Note: If `$x` had not been previously assigned, the example would **fail** because `$x` would be *undefined*. 

 In summary, Step Functions evaluates **all** expressions and then makes assignments. The order in which the variables occur in the `Assign` field does **not** matter. 
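
One consequence of this evaluate-then-assign behavior is that two variables can be swapped in a single `Assign` field, as in the following sketch:

```
# Starting values: $x=1, $y=2

"Assign": {
  "x": "{% $y %}",
  "y": "{% $x %}"
}

# Ending values: $x=2, $y=1
```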

## Limits


 The maximum size of a single variable is 256 KiB, for both Standard and Express workflows. 

 The maximum combined size of all variables in a single `Assign` field is also 256 KiB. For example, you could assign 128 KiB each to `X` and `Y`, but you could not assign 256 KiB to both `X` and `Y` in the same `Assign` field. 

 The total size of all stored variables cannot exceed 10 MiB per execution. 

## Using variables in JSONPath states


 Variables are also available in states that use JSONPath for their query language. 

 You can reference a variable in any field that accepts a JSONPath expression (`$.` or `$$.` syntax), with the exception of `ResultPath`, which specifies a location in the state input to inject the state's result. Variables cannot be used in `ResultPath`. 

 In JSONPath, the `$` symbol refers to the current value and `$$` refers to the state's Context object. JSONPath expressions can start with `$.`, as in `$.customer.name`. You can access the Context object with `$$.`, as in `$$.Execution.Id`. 

 To reference a variable, you also use the `$` symbol before a variable name, for example, `$x` or `$order.numItems`. 

 In **JSONPath** fields that accept intrinsic functions, variables can be used in the arguments, for example `States.Format('The order number is {}', $order.number)`. 

 The following diagram illustrates how the assign step in a **JSONPath** task occurs at the same time as `ResultSelector`: 

 ![\[Logical diagram of a state that uses JSONPath query language.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/vars-jsonpath.png)

 **Assigning variables in JSONPath**

 JSONPath variable assignments behave similarly to payload templates. Fields that end with `.$` indicate the value is a JSONPath expression which Step Functions evaluates to a value during state machine execution (for example: `$.order..product` and `$.order.total`). 

```
# Example of Assign with JSONPath
{
  "Type": "Task",
  ...
  "Assign": {
    "products.$": "$.order..product",
    "orderTotal.$": "$.order.total"
  },
  "Next": "the next state"
}
```

 For JSONPath states, the value of `$` in an `Assign` field depends on the state type. In `Task`, `Map`, and `Parallel` states, `$` refers to the API or sub-workflow result. In `Choice` and `Wait` states, `$` refers to the *effective input*, which is the value after `InputPath` has been applied to the state input. For `Pass`, `$` refers to the result, whether generated by the `Result` field or the `InputPath`/`Parameters` fields. 

 The following JSONPath example assigns a JSON object to the `details` variable, the result of the JSONPath expression `$.result.code` to `resultCode`, and the result of the intrinsic function `States.Format('Hello {}', $customer.name)` to `message`. If this were in a `Task` state, then `$` in `$.order.items` and `$.result.code` refers to the API result. The `startTime` variable is assigned a value from the Context object, `$$.Execution.StartTime`. 

```
"Assign": {
   "details": {
      "status": "SUCCESS",
      "lineItems.$": "$.order.items"
   },
   "resultCode.$": "$.result.code",
   "message.$": "States.Format('Hello {}', $customer.name)",
   "startTime.$": "$$.Execution.StartTime"
}
```

# Transforming data with JSONata in Step Functions

 With JSONata, you gain a powerful open source query and expression language to **select** and **transform** data in your workflows. For a brief introduction and complete JSONata reference, see [JSONata.org documentation](https://docs.jsonata.org/overview.html). 

**Supported JSONata version**  
Step Functions supports JSONata version 2.0.6.

 You must opt in to use the JSONata query and transformation language for existing workflows. When creating a workflow in the console, we recommend choosing JSONata for the top-level state machine `QueryLanguage`. For existing or new workflows that use JSONPath, the console provides an option to convert individual states to JSONata. 

 After selecting JSONata, your workflow fields will be reduced from five JSONPath fields (`InputPath`, `Parameters`, `ResultSelector`, `ResultPath`, and `OutputPath`) to only two fields: `Arguments` and `Output`. Also, you will **not** use `.$` on JSON object key names. 

 If you are new to Step Functions, you only need to know that JSONata expressions use the following syntax: 

 **JSONata syntax:** `"{% <JSONata expression> %}"` 

 The following code samples show a conversion from JSONPath to JSONata: 

```
# Original sample using JSONPath
{
  "QueryLanguage": "JSONPath", // Set explicitly; could be set and inherited from top-level
  "Type": "Task",
  ...
  "Parameters": {
    "static": "Hello",
    "title.$": "$.title",
    "name.$": "$customerName",  // With $customerName declared as a variable
    "not-evaluated": "$customerName"
  }
}
```

```
# Sample after conversion to JSONata
{
  "QueryLanguage": "JSONata", // Set explicitly; could be set and inherited from top-level
  "Type": "Task",
  ...
  "Arguments": { // JSONata states do not have Parameters
    "static": "Hello",
    "title": "{% $states.input.title %}", 
    "name": "{% $customerName %}",   // With $customerName declared as a variable
    "not-evaluated": "$customerName"
  }
}
```

 Given input `{ "title" : "Doctor" }` and variable `customerName` assigned to `"María"`, both state machines will produce the following JSON result: 

```
{
  "static": "Hello",
  "title": "Doctor",
  "name": "María",
  "not-evaluated": "$customerName"
 }
```

 In the next diagram, you can see a graphical representation showing how converting JSONPath (left) to JSONata (right) will reduce the complexity of the steps in your state machines: 

![\[Diagram that compares the fields in JSONPath and JSONata states.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/compare-jsonpath-jsonata.png)


 You can (optionally) select and transform data from the state input into **Arguments** to send to your integrated action. With JSONata, you can then (optionally) select and transform the **results** from the action for assigning to variables and for state **Output**. 

 Note: **Assign** and **Output** steps occur in **parallel**. If you choose to transform data during variable assignment, that transformed data will **not** be available in the Output step. You must reapply the JSONata transformation in the Output step. 

![\[Logical diagram of a state that uses JSONata query language.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/vars-jsonata.png)


## QueryLanguage field


 In your workflow ASL definitions, there is a `QueryLanguage` field at the top level of a state machine definition and in individual states. By setting `QueryLanguage` inside individual states, you can incrementally adopt JSONata in an existing state machine rather than upgrading the state machine all at once. 

 The `QueryLanguage` field can be set to `"JSONPath"` or `"JSONata"`. If the top-level `QueryLanguage` field is omitted, it defaults to `"JSONPath"`. If a state contains a state-level `QueryLanguage` field, Step Functions will use the specified query language for that state. If the state does not contain a `QueryLanguage` field, then it will use the query language specified in the top-level `QueryLanguage` field. 
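
For example, a state machine that defaults to JSONPath could adopt JSONata in a single state (a minimal sketch; the state names are illustrative):

```
# Top-level JSONPath with one state opted in to JSONata
{
  "QueryLanguage": "JSONPath",
  "StartAt": "Legacy State",
  "States": {
    "Legacy State": { "Type": "Pass", "Next": "New State" },
    "New State": {
      "Type": "Pass",
      "QueryLanguage": "JSONata",
      "Output": "{% $states.input %}",
      "End": true
    }
  }
}
```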

## Writing JSONata expressions in JSON strings

 When a string in the value of an ASL field, a JSON object field, or a JSON array element is surrounded by `{% %}` characters, that string is evaluated as JSONata. Note, the string must start with `{%` with no leading spaces and must end with `%}` with no trailing spaces. Improperly opening or closing the expression results in a validation error. 

 Some examples: 
+  `"TimeoutSeconds" : "{% $timeout %}"` 
+  `"Arguments" : {"field1" : "{% $name %}"}` in a `Task` state
+  `"Items": [1, "{% $two %}", 3]` in a `Map` state 

 Not all ASL fields accept JSONata. For example, each state’s `Type` field must be set to a constant string. Similarly, the `Task` state’s `Resource` field must be a constant string. The `Map` state `Items` field will accept a JSON array, a JSON object, or a JSONata expression that must evaluate to an array or object. 
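
For example, a Map state could compute its `Items` from the state input with a JSONata expression (a sketch; the `orders` field and state names are illustrative):

```
# Map state with a JSONata expression in Items
"Process Orders": {
  "Type": "Map",
  "Items": "{% $states.input.orders %}",
  "ItemProcessor": {
    "StartAt": "Handle Order",
    "States": { "Handle Order": { "Type": "Pass", "End": true } }
  },
  "End": true
}
```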

## Reserved variable: $states


 Step Functions defines a single reserved variable called **`$states`**. In JSONata states, the following structures are assigned to `$states` for use in JSONata expressions: 

```
# Reserved $states variable in JSONata states
$states = {
  "input":       // Original input to the state
  "result":      // API or sub-workflow's result (if successful)
  "errorOutput": // Error Output (only available in a Catch)
  "context":     // Context object
}
```

 On state entry, Step Functions assigns the state input to **`$states.input`**. The value of `$states.input` can be used in all fields that accept JSONata expressions. `$states.input` always refers to the original state input. 

 For `Task`, `Parallel`, and `Map` states:
+ **`$states.result`** refers to the API or sub-workflow’s raw result if successful. 
+ **`$states.errorOutput`** refers to the Error Output if the API or sub-workflow failed.

  `$states.errorOutput` can be used in the `Catch` field’s `Assign` or `Output`. 

Attempting to access `$states.result` or `$states.errorOutput` in fields and states where they are not accessible will be caught at creation, update, or validation of the state machine. 

The `$states.context` object provides your workflows information about their specific execution, such as `StartTime`, task token, and initial workflow input. To learn more, see [Accessing execution data from the Context object in Step Functions](input-output-contextobject.md).

## Handling expression errors


At runtime, JSONata expression evaluation might fail for a variety of reasons, such as:
+  **Type error** - An expression, such as `{% $x + $y %}`, will fail if `$x` or `$y` is not a number.
+  **Type incompatibility** - An expression might evaluate to a type that the field will not accept. For example, the field `TimeoutSeconds` requires a numeric input, so the expression `{% $timeout %}` will fail if `$timeout` returns a string.
+  **Value out of range** - An expression that produces a value outside the acceptable range for a field will fail. For example, an expression such as `{% $evaluatesToNegativeNumber %}` will fail in the `TimeoutSeconds` field.
+  **Failure to return a result** - JSON cannot represent an undefined value expression, so the expression `{% $data.thisFieldDoesNotExist %}` would result in an error.
+  **Memory limit exceeded** - A JSONata expression that consumes too much memory during evaluation will fail with an `Expression evaluation memory limit exceeded` error. This can occur with expressions that process or transform large amounts of data. To work around this limitation, consider moving the data transformation to a Lambda function.
+  **Expression timeout** - A JSONata expression that takes longer than 1 second to evaluate will fail with an `Expression evaluation timeout` error. This can occur with expressions that contain infinite loops or very expensive operations.
+  **Stack overflow** - A JSONata expression that exceeds the maximum recursion depth will fail with a `Stack overflow error`. If the recursion is non-terminating, ensure the function has a correct base case or termination condition. If the recursion terminates but the call stack grows too deep, consider rewriting the function as tail-recursive to reduce stack depth.

In each case, the interpreter will throw the error: `States.QueryEvaluationError`. Your Task, Map, and Parallel states can provide a `Catch` field to catch the error, and a `Retry` field to retry on the error.
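
For instance, a state might retry a transient evaluation failure and fall back to a recovery state (a sketch; the recovery state name is illustrative):

```
# Retrying and catching States.QueryEvaluationError
"Retry": [
  {
    "ErrorEquals": ["States.QueryEvaluationError"],
    "MaxAttempts": 2
  }
],
"Catch": [
  {
    "ErrorEquals": ["States.QueryEvaluationError"],
    "Next": "Handle Evaluation Error"
  }
]
```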

## Converting from JSONPath to JSONata
Converting to JSONata

 The following sections compare and explain the differences between code written with JSONPath and JSONata. 

### No more path fields


 When using JSONPath, ASL requires developers to use `Path` versions of fields, such as `TimeoutSecondsPath`, to select a value from the state data. When you use JSONata, you no longer use `Path` fields because ASL automatically interprets `{% %}`-enclosed JSONata expressions in non-Path fields, such as `TimeoutSeconds`. 
+ JSONPath legacy example: `"TimeoutSecondsPath": "$timeout"` 
+ JSONata: `"TimeoutSeconds": "{% $timeout %}"` 

 Similarly, the `Map` state `ItemsPath` has been replaced with the `Items` field which accepts a JSON array, a JSON object, or a JSONata expression that must evaluate to an array or object. 

### JSON Objects


 ASL uses the term *payload template* to describe a JSON object that can contain JSONPath expressions for `Parameters` and `ResultSelector` field values. ASL will not use the term payload template for JSONata because JSONata evaluation happens for all strings whether they occur on their own or inside a JSON object or a JSON array. 

### No more .$


 ASL requires you to append ‘`.$`’ to field names in payload templates to use JSONPath and Intrinsic Functions. When you specify `"QueryLanguage":"JSONata"`, you no longer use the ‘`.$`’ convention for JSON object field names. Instead, you enclose JSONata expressions in `{% %}` characters. You use the same convention for all string-valued fields, regardless of how deeply the object is nested inside other arrays or objects. 

### Arguments and Output Fields


 When the `QueryLanguage` is set to `JSONata`, the old I/O processing fields will be disabled (`InputPath`, `Parameters`, `ResultSelector`, `ResultPath` and `OutputPath`) and most states will get two new fields: `Arguments` and `Output`. 

 JSONata provides a simpler way to perform I/O transformations than the fields used with JSONPath. JSONata’s features make `Arguments` and `Output` more capable than the previous five JSONPath fields. These new field names also help simplify your ASL and clarify the model for passing and returning values. 

 The `Arguments` and `Output` fields (and other similar fields such as `Map` state’s `ItemSelector`) will accept either a JSON object such as: 

```
"Arguments": {
    "field1": 42, 
    "field2": "{% jsonata expression %}"
}
```

 Or, you can use a JSONata expression directly, for example: 

```
"Output": "{% jsonata expression %}"
```

 `Output` can also accept any type of JSON value, for example: `"Output": true`, `"Output": 42`. 

 The `Arguments` and `Output` fields only support JSONata, so it is invalid to use them in workflows that use JSONPath. Conversely, `InputPath`, `Parameters`, `ResultSelector`, `ResultPath`, `OutputPath`, and other JSONPath fields are only supported with JSONPath, so it is invalid to use path-based fields when JSONata is your top-level workflow or state query language. 

### Pass state


 The optional **Result** in a Pass state was previously treated as the *output* of a virtual task. With JSONata selected as the workflow or state query language, you can now use the new **Output** field. 
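
A before-and-after sketch of the same static data in a Pass state (the state name and data are illustrative):

```
# JSONPath: static data provided through Result
"Set Defaults": {
  "Type": "Pass",
  "Result": { "retries": 3 },
  "End": true
}

# JSONata: the same data provided through Output
"Set Defaults": {
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": { "retries": 3 },
  "End": true
}
```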

### Choice state


 When using JSONPath, Choice states have an input `Variable` and numerous comparison paths, such as the following `NumericLessThanEqualsPath`: 

```
# JSONPath choice state sample, with Variable and comparison path
"Check Price": {
  "Type": "Choice",
  "Default": "Pause",
  "Choices": [
    {
      "Variable": "$.current_price.current_price",
      "NumericLessThanEqualsPath": "$.desired_price",
      "Next": "Send Notification"
    }
  ]
}
```

 With JSONata, the choice state has a `Condition` where you can use a JSONata expression: 

```
# Choice state after JSONata conversion
"Check Price": {
  "Type": "Choice",
  "Default": "Pause",
  "Choices": [
    {
      "Condition": "{% $current_price <= $states.input.desired_price %}",
      "Next": "Send Notification"
    }
  ]
}
```

 Note: `Variable` and comparison fields are only available with JSONPath. `Condition` is only available with JSONata. 

## JSONata examples


 The following examples can be created in Workflow Studio to experiment with JSONata. You can create and execute the state machines, or use the **Test state** to pass in data and even modify the state machine definition. 

### Example: Input and Output


 This example shows how to use `$states.input` to use the state input and the `Output` field to specify the state output when you opt into JSONata. 

```
{
  "Comment": "Input and Output example using JSONata",
  "QueryLanguage": "JSONata",
  "StartAt": "Basic Input and Output",
  "States": {
    "Basic Input and Output": {
      "QueryLanguage": "JSONata",
      "Type": "Succeed",
      "Output": {
        "lastName": "{% 'Last=>' & $states.input.customer.lastName %}",
        "orderValue": "{% $states.input.order.total %}"
      }
    }
  }
}
```

 When the workflow is executed with the following as input: 

```
{
  "customer": {
    "firstName": "Martha",
    "lastName": "Rivera"
  },
  "order": {
    "items": 7,
    "total": 27.91
  }
}
```

The Test state or a state machine execution will return the following JSON output:

```
{
  "lastName": "Last=>Rivera",
  "orderValue": 27.91
}
```

![\[Screenshot showing input and output of a state under test.\]](http://docs.aws.amazon.com/step-functions/latest/dg/images/jsonata-basic-io.png)


### Example: Filtering with JSONata


 You can filter your data with JSONata [Path operators](https://docs.jsonata.org/path-operators). For example, imagine you have a list of products for input, and you only want to process products that contain zero calories. You can create a state machine definition with the following ASL and test the `FilterDietProducts` state with the sample input that follows. 

 **State machine definition for filtering with JSONata** 

```
{
  "Comment": "Filter products using JSONata",
  "QueryLanguage": "JSONata",
  "StartAt": "FilterDietProducts",
  "States": {
    "FilterDietProducts": {
      "Type": "Pass",
      "Output": {
        "dietProducts": "{% $states.input.products[calories=0] %}"
      },
      "End": true
    }
  }
}
```

 **Sample input for the test** 

```
{
  "products": [
    {
      "calories": 140,
      "flavour": "Cola",
      "name": "Product-1"
    },
    {
      "calories": 0,
      "flavour": "Cola",
      "name": "Product-2"
    },
    {
      "calories": 160,
      "flavour": "Orange",
      "name": "Product-3"
    },
    {
      "calories": 100,
      "flavour": "Orange",
      "name": "Product-4"
    },
    {
      "calories": 0,
      "flavour": "Lime",
      "name": "Product-5"
    }
  ]
}
```

 **Output from testing the step in your state machine** 

```
{
    "dietProducts": [
        {
            "calories": 0,
            "flavour": "Cola",
            "name": "Product-2"
        },
        {
            "calories": 0,
            "flavour": "Lime",
            "name": "Product-5"
        }
    ]
}
```

![Example output for JSONata expressions under test.](http://docs.aws.amazon.com/step-functions/latest/dg/images/test-state-jsonata.png)
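For comparison, the JSONata path expression `products[calories=0]` behaves like a predicate filter over the array. A plain-Python sketch of the same logic (illustrative only, not part of Step Functions):

```python
# Mirrors the JSONata path expression: $states.input.products[calories=0]
def filter_diet_products(state_input):
    return {
        "dietProducts": [p for p in state_input["products"] if p["calories"] == 0]
    }

sample = {"products": [
    {"calories": 140, "flavour": "Cola", "name": "Product-1"},
    {"calories": 0, "flavour": "Cola", "name": "Product-2"},
    {"calories": 0, "flavour": "Lime", "name": "Product-5"},
]}
filtered = filter_diet_products(sample)
```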


## JSONata functions provided by Step Functions
JSONata functions

JSONata provides function libraries for String, Numeric, Aggregation, Boolean, Array, Object, Date/Time, and Higher Order functions. Step Functions provides additional JSONata functions that you can use in your JSONata expressions. These built-in functions serve as replacements for Step Functions intrinsic functions, which are only available in states that use the JSONPath query language. 

 Note: Built-in JSONata functions that require integer values as parameters will automatically round down any non-integer numbers provided. 

 **\$partition** - JSONata equivalent of the `States.ArrayPartition` intrinsic function to partition a large array. 

 The first parameter is the array to partition, and the second parameter is an integer representing the chunk size. The return value is a two-dimensional array: the interpreter chunks the input array into multiple arrays of the size specified by the chunk size. The length of the last chunk may be less than the length of the previous chunks if the number of remaining items in the array is smaller than the chunk size. 

```
"Assign": {
  "arrayPartition": "{% $partition([1,2,3,4], $states.input.chunkSize) %}"
}
```
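The partitioning behavior described above can be sketched in plain Python (an illustration of the semantics, not the service implementation):

```python
def partition(array, chunk_size):
    # Non-integer chunk sizes round down, matching the built-in's behavior
    # for positive values.
    chunk_size = int(chunk_size)
    # Chunk the array; the last chunk may be shorter than the rest.
    return [array[i:i + chunk_size] for i in range(0, len(array), chunk_size)]
```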

 **\$range** - JSONata equivalent of the `States.ArrayRange` intrinsic function to generate an array of values. 

 This function takes three arguments: an integer representing the first element of the new array, an integer representing the final element of the new array, and an integer delta value for the elements in between. The return value is a newly generated array of values ranging from the first argument to the second, with the elements in between separated by the delta. The delta value can be positive or negative, incrementing or decrementing each element from the previous one until the end value is reached or exceeded. 

```
"Assign": {
  "arrayRange": "{% $range(0, 10, 2) %}"
}
```
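The generation rule above (step by the delta, in either direction, stopping once the end value is reached or exceeded) can be sketched in plain Python:

```python
def array_range(first, last, delta):
    # Build values from `first` toward `last`, stepping by `delta`
    # (positive or negative), including `last` if a step lands on it exactly.
    if delta == 0:
        raise ValueError("delta must be nonzero")
    values, current = [], first
    while (current <= last) if delta > 0 else (current >= last):
        values.append(current)
        current += delta
    return values
```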

 **\$hash** - JSONata equivalent of the `States.Hash` intrinsic function to calculate the hash value of a given input. 

 This function takes two arguments. The first argument is the source string to be hashed. The second argument is a string representing the hashing algorithm to use for the hash calculation. The hashing algorithm must be one of the following values: `"MD5"`, `"SHA-1"`, `"SHA-256"`, `"SHA-384"`, `"SHA-512"`. The return value is a string of the calculated hash of the data. 

 This function was created because JSONata does not natively support the ability to calculate hashes. 

```
"Assign": {
  "myHash": "{% $hash($states.input.content, $hashAlgorithmName) %}"
}
```
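A plain-Python sketch of the same hashing operation using the standard `hashlib` module. The algorithm-name mapping and the assumption of a lowercase hex digest are illustrative, not taken from the service:

```python
import hashlib

# Map the algorithm names accepted by $hash to hashlib names (illustrative).
_ALGORITHMS = {"MD5": "md5", "SHA-1": "sha1", "SHA-256": "sha256",
               "SHA-384": "sha384", "SHA-512": "sha512"}

def hash_string(source, algorithm):
    if algorithm not in _ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return hashlib.new(_ALGORITHMS[algorithm], source.encode("utf-8")).hexdigest()
```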

 **\$random** - JSONata equivalent of the `States.MathRandom` intrinsic function to return a random number n where `0 ≤ n < 1`. 

 The function takes an *optional* integer argument representing the seed value of the random function. If you use this function with the same seed value, it returns an identical number. 

 This overloaded function was created because the built-in JSONata [`$random`](https://docs.jsonata.org/numeric-functions#random) function does not accept a seed value. 

```
"Assign": {
   "randNoSeed": "{% $random() %}",
   "randSeeded": "{% $random($states.input.seed) %}"
}
```

 **\$uuid** - JSONata version of the `States.UUID` intrinsic function. 

 The function takes no arguments and returns a v4 UUID. 

 This function was created because JSONata does not natively support the ability to generate UUIDs. 

```
"Assign": {
  "uniqueId": "{% $uuid() %}"
}
```

 **\$parse** - JSONata function to deserialize JSON strings. 

 The function takes a JSON string as its only argument. 

 JSONata supports this functionality via `$eval`; however, `$eval` is not supported in Step Functions workflows. 

```
"Assign": {
  "deserializedPayload": "{% $parse($states.input.json_string) %}"
}
```
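In plain Python, the equivalent deserialization is the standard library's `json.loads` (shown only to illustrate what `$parse` does to a stringified document):

```python
import json

def parse(json_string):
    # Deserialize a stringified JSON document into native data
    # (dicts, lists, and scalars).
    return json.loads(json_string)
```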

# Accessing execution data from the Context object in Step Functions
Context object

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

The Context object is an internal JSON structure that is available during an execution and contains information about your state machine and its specific execution. Your workflows can reference the Context object in a JSONata expression with `$states.context`.

## Accessing the Context object


**To access the Context object in JSONata**

To access the Context object in JSONata states, use `$states.context` in a JSONata expression. 

```
{
  "ExecutionID" : "{% $states.context.Execution.Id %}"
}
```

**To access the Context object in JSONPath**

To access the Context object in JSONPath, you first append `.$` to the end of the key to indicate the value is a path. Then, prepend the value with `$$.` to select a node in the Context object.

```
{
  "ExecutionID.$": "$$.Execution.Id"
}
```

JSONPath states can refer to the context (`$$.`) from the following JSONPath fields:
+ `InputPath`
+ `OutputPath`
+ `ItemsPath` (in Map states)
+ `Variable` (in Choice states)
+ `ResultSelector`
+ `Parameters`
+ Variable to variable comparison operators

## Context object fields


The Context object includes information about the state machine, state, execution, and task. The Context JSON object includes nodes for each type of data in the following format:

```
{
    "Execution": {
        "Id": "String",
        "Input": {},
        "Name": "String",
        "RoleArn": "String",
        "StartTime": "Format: ISO 8601",
        "RedriveCount": Number,
        "RedriveTime": "Format: ISO 8601"
    },
    "State": {
        "EnteredTime": "Format: ISO 8601",
        "Name": "String",
        "RetryCount": Number
    },
    "StateMachine": {
        "Id": "String",
        "Name": "String"
    },
    "Task": {
        "Token": "String"
    }
}
```

During an execution, the Context object is populated with relevant data. 

Occasionally, new fields are added to the context. If you are processing the JSON context directly, we recommend crafting code that can gracefully handle new unknown fields. For example, if using the Jackson library for unmarshalling JSON, we recommend setting `FAIL_ON_UNKNOWN_PROPERTIES` to `false` in your `ObjectMapper` to prevent an `UnrecognizedPropertyException`.

 The `RedriveTime` field in the Context object is only available if you've redriven an execution. If you've [redriven a Map Run](redrive-map-run.md), the `RedriveTime` field is only available for child workflows of type Standard. For a redriven Map Run with child workflows of type Express, `RedriveTime` isn't available.

Content from a running execution includes specifics in the following format: 

```
{
    "Execution": {
        "Id": "arn:aws:states:region:123456789012:execution:stateMachineName:executionName",
        "Input": {
           "key": "value"
        },
        "Name": "executionName",
        "RoleArn": "arn:aws:iam::123456789012:role...",
        "StartTime": "2025-08-27T10:04:42Z"
    },
    "State": {
        "EnteredTime": "2025-08-27T10:04:42.001Z",
        "Name": "Test",
        "RetryCount": 3
    },
    "StateMachine": {
        "Id": "arn:aws:states:region:123456789012:stateMachine:stateMachineName",
        "Name": "stateMachineName"
    },
    "Task": {
        "Token": "h7XRiCdLtd/83p1E0dMccoxlzFhglsdkzpK9mBVKZsp7d9yrT1W"
    }
}
```

**Timestamp format with fractional seconds**  
Step Functions follows the ISO 8601 specification, which allows fractional seconds to have zero, three, six, or nine digits as necessary. When a timestamp has zero fractional seconds, Step Functions removes the trailing zeros rather than padding the output.   
If you create code that consumes Step Functions timestamps, your code must be able to process a variable number of fractional seconds.
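For example, a consumer written in Python could normalize the variable fractional-second widths like this (a sketch under the assumption of UTC `Z`-suffixed timestamps; Python's `datetime` stores at most microseconds, so nine-digit fractions are truncated):

```python
from datetime import datetime, timezone

def parse_sfn_timestamp(timestamp):
    # Accept ISO 8601 UTC timestamps with 0, 3, 6, or 9 fractional digits.
    base, _, rest = timestamp.partition(".")
    fraction = rest.rstrip("Z")
    # Pad or truncate the fraction to microseconds (6 digits).
    micros = int(fraction.ljust(6, "0")[:6]) if fraction else 0
    parsed = datetime.strptime(base.rstrip("Z"), "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)
```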

## Context object data for Map states


**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

When processing a [`Map` state](state-map.md), the context will also contain `Index`, `Value`, and `Source`. 

For each `Map` state iteration, `Index` contains the index number of the array item that is currently being processed, `Value` contains the array item being processed, and `Source` identifies where the item data originated, based on the Map state's `InputType` (`CSV`, `JSON`, `JSONL`, or `PARQUET`).

Within a `Map` state, the Context object includes the following data:

```
"Map": {
   "Item": {
      "Index" : Number,
      "Key"   : "String", // Only valid for JSON objects
      "Value" : "String",
      "Source": "String"
   }
}
```

These fields are available only in a `Map` state, and you can reference them in the `ItemSelector` field.

**Note**  
You must define parameters from the Context object in the `ItemSelector` block of the main `Map` state, not within the states included in the `ItemProcessor` section.

Given a state machine using a **JSONPath** `Map` state, you can inject information from the Context object as follows.

```
{
  "StartAt": "ExampleMapState",
  "States": {
    "ExampleMapState": {
      "Type": "Map",
      "ItemSelector": {
        "ContextIndex.$": "$$.Map.Item.Index",
        "ContextValue.$": "$$.Map.Item.Value",
        "ContextSource.$": "$$.Map.Item.Source"
      },
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "INLINE"
        },
        "StartAt": "TestPass",
        "States": {
          "TestPass": {
            "Type": "Pass",
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
```

For JSONata, the additional Map state context information can be accessed from the `$states.context` variable:

```
{
  "StartAt": "ExampleMapState",
  "States": {
    "ExampleMapState": {
      "Type": "Map",
      "ItemSelector": {
        "ContextIndex": "{% $states.context.Map.Item.Index %}",
        "ContextValue": "{% $states.context.Map.Item.Value %}",
        "ContextSource": "{% $states.context.Map.Item.Source %}"
      },
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "INLINE"
        },
        "StartAt": "TestPass",
        "States": {
          "TestPass": {
            "Type": "Pass",
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
```



If you execute the previous state machine with the following input, `Index`, `Value`, and `Source` are inserted in the output.

```
[
  {
    "who": "bob"
  },
  {
    "who": "meg"
  },
  {
    "who": "joe"
  }
]
```

The output for the execution returns the values of the `Index`, `Value`, and `Source` items for each of the three iterations, as follows:

```
[
  {
    "ContextIndex": 0,
    "ContextValue": {
      "who": "bob"
    },
    "ContextSource" : "STATE_DATA" 
  },
  {
    "ContextIndex": 1,
    "ContextValue": {
      "who": "meg"
    },
    "ContextSource" : "STATE_DATA" 
  },
  {
    "ContextIndex": 2,
    "ContextValue": {
      "who": "joe"
    },
    "ContextSource" : "STATE_DATA" 
  }
]
```

Note that `$states.context.Map.Item.Source` will be one of the following:
+ For state input, the value is `STATE_DATA`.
+ For Amazon S3 `LIST_OBJECTS_V2` with `Transformation=NONE`, the value is the S3 URI of the bucket. For example: `s3://bucket-name`.
+ For all other input types, the value is the Amazon S3 URI of the object. For example: `s3://bucket-name/object-key`.

# Using JSONPath paths
Using JSONPath paths

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

In the Amazon States Language, a *path* is a string beginning with `$` that you can use to identify components within JSON text. Paths follow [JsonPath](https://datatracker.ietf.org/wg/jsonpath/about/) syntax, which is only available when the `QueryLanguage` is set to JSONPath. You can specify a path to access subsets of the input when specifying values for `InputPath`, `ResultPath`, and `OutputPath`.

You must use square bracket notation if your field name contains any character that is not included in the `member-name-shorthand` definition of the [JsonPath ABNF](https://www.ietf.org/archive/id/draft-ietf-jsonpath-base-21.html#jsonpath-abnf) rule. Therefore, to encode special characters, such as punctuation marks (excluding `_`), you must use square bracket notation. For example, `$.abc.['def ghi']`. 

## Reference Paths


A *reference path* is a path whose syntax is limited in such a way that it can identify only a single node in a JSON structure:
+ You can access object fields using only dot (`.`) and square bracket (`[ ]`) notation.
+ Functions such as `length()` aren't supported.
+ Non-symbolic lexical operators, such as `subsetof`, aren't supported.
+ Filtering by regular expression or by referencing another value in the JSON structure isn't supported.
+ The operators `@`, `,`, `:`, and `?` aren't supported.

For example, if state input data contains the following values:

```
{
  "foo": 123,
  "bar": ["a", "b", "c"],
  "car": {
      "cdr": true
  }
}
```

The following reference paths would return the following.

```
$.foo => 123
$.bar => ["a", "b", "c"]
$.car.cdr => true
```

Certain states use paths and reference paths to control the flow of a state machine or configure a state's settings or options. For more information, see [Modeling workflow input and output path processing with data flow simulator](https://aws.amazon.com/blogs/compute/modeling-workflow-input-output-path-processing-with-data-flow-simulator/) and [Using JSONPath effectively in AWS Step Functions](https://aws.amazon.com/blogs/compute/using-jsonpath-effectively-in-aws-step-functions/).

### Flattening an array of arrays


If a [Parallel workflow state](state-parallel.md) or [Map workflow state](state-map.md) in your state machine returns an array of arrays, you can transform it into a flat array with the [ResultSelector](input-output-inputpath-params.md#input-output-resultselector) field. You can include this field inside the Parallel or Map state definition to manipulate the result of these states.

To flatten arrays, use the `[*][*]` syntax in the `ResultSelector` field, as shown in the following example.

```
"ResultSelector": {
    "flattenArray.$": "$[*][*]"
  }
```
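The `$[*][*]` path performs one level of flattening. A plain-Python equivalent of that operation looks like this (illustrative only):

```python
def flatten(array_of_arrays):
    # One level of flattening, equivalent to the "$[*][*]" JSONPath selection.
    return [item for inner in array_of_arrays for item in inner]
```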

For examples that show how to flatten an array, see *Step 3* in the following tutorials:
+ [Processing batch data with a Lambda function in Step Functions](tutorial-itembatcher-param-task.md)
+ [Processing individual items with a Lambda function in Step Functions](tutorial-itembatcher-single-item-process.md)

# Manipulate parameters in Step Functions workflows
Manipulate parameters with paths

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

The `InputPath`, `Parameters`, and `ResultSelector` fields provide a way to manipulate JSON as it moves through your workflow. `InputPath` can limit the input that is passed on by filtering the JSON with a path (see [Using JSONPath paths](amazon-states-language-paths.md)). With the `Parameters` field, you can pass a collection of key-value pairs, using either static values or values selected from the input with a path.

 The `ResultSelector` field provides a way to manipulate the state’s result before `ResultPath` is applied. 

AWS Step Functions applies the `InputPath` field first, and then the `Parameters` field. You can first filter your raw input to a selection you want using `InputPath`, and then apply `Parameters` to manipulate that input further, or add new values. You can then use the `ResultSelector` field to manipulate the state's output before `ResultPath` is applied.

## InputPath


Use `InputPath` to select a portion of the state input. 

For example, suppose the input to your state includes the following.

```
{
  "comment": "Example for InputPath.",
  "dataset1": {
    "val1": 1,
    "val2": 2,
    "val3": 3
  },
  "dataset2": {
    "val1": "a",
    "val2": "b",
    "val3": "c"
  }
}
```

You could apply the following `InputPath`.

```
"InputPath": "$.dataset2",
```

With the previous `InputPath`, the following is the JSON that is passed as the input.

```
{
  "val1": "a",
  "val2": "b",
  "val3": "c"
}
```

**Note**  
A path can yield a selection of values. Consider the following example.  

```
{ "a": [1, 2, 3, 4] }
```
If you apply the path `$.a[0:2]`, the following is the result.  

```
[ 1, 2 ]
```

## Parameters


This section describes the different ways you can use the Parameters field. 

### Key-value pairs


Use the `Parameters` field to create a collection of key-value pairs that are passed as input. The values of each can either be static values that you include in your state machine definition, or selected from either the input or the Context object with a path. For key-value pairs where the value is selected using a path, the key name must end in `.$`. 

For example, suppose you provide the following input. 

```
{
  "comment": "Example for Parameters.",
  "product": {
    "details": {
       "color": "blue",
       "size": "small",
       "material": "cotton"
    },
    "availability": "in stock",
    "sku": "2317",
    "cost": "$23"
  }
}
```

To select some of the information, you could specify these parameters in your state machine definition. 

```
"Parameters": {
        "comment": "Selecting what I care about.",
        "MyDetails": {
          "size.$": "$.product.details.size",
          "exists.$": "$.product.availability",
          "StaticValue": "foo"
        }
      },
```

Given the previous input and the `Parameters` field, this is the JSON that is passed.

```
{
  "comment": "Selecting what I care about.",
  "MyDetails": {
      "size": "small",
      "exists": "in stock",
      "StaticValue": "foo"
  }
}
```

In addition to the input, you can access a special JSON object, known as the Context object. The Context object includes information about your state machine execution. See [Accessing execution data from the Context object in Step Functions](input-output-contextobject.md).

### Connected resources


The `Parameters` field can also pass information to connected resources. For example, if your task state is orchestrating an AWS Batch job, you can pass the relevant API parameters directly to the API actions of that service. For more information, see:
+ [Passing parameters to a service API in Step Functions](connect-parameters.md)
+ [Integrating services](integrate-services.md)

### Amazon S3


If the Lambda function data you are passing between states might grow to more than 262,144 bytes, we recommend using Amazon S3 to store the data, and implement one of the following methods:
+ Use the *Distributed Map state* in your workflow so that the `Map` state can read input directly from Amazon S3 data sources. For more information, see [Distributed mode](state-map-distributed.md).
+ Parse the Amazon Resource Name (ARN) of the bucket in the `Payload` parameter to get the bucket name and key value. For more information, see [Using Amazon S3 ARNs instead of passing large payloads in Step Functions](sfn-best-practices.md#avoid-exec-failures).

Alternatively, you can adjust your implementation to pass smaller payloads in your executions.

## ResultSelector


 Use the `ResultSelector` field to manipulate a state's result before `ResultPath` is applied. The `ResultSelector` field lets you create a collection of key-value pairs, where the values are static or selected from the state's result. Using the `ResultSelector` field, you can choose what parts of a state's result you want to pass to the `ResultPath` field.

**Note**  
With the `ResultPath` field, you can add the output of the `ResultSelector` field to the original input.

`ResultSelector` is an optional field in the following states:
+ [Map workflow state](state-map.md)
+ [Task workflow state](state-task.md)
+ [Parallel workflow state](state-parallel.md)

For example, Step Functions service integrations return metadata in addition to the payload in the result. `ResultSelector` can select portions of the result, and `ResultPath` can merge them with the state input. In this example, we want to select just the `resourceType` and `ClusterId` from the result of an Amazon EMR `createCluster.sync` call and merge them with the state input. Given the following result:

```
{
  "resourceType": "elasticmapreduce",
  "resource": "createCluster.sync",
  "output": {
    "SdkHttpMetadata": {
      "HttpHeaders": {
        "Content-Length": "1112",
        "Content-Type": "application/x-amz-JSON-1.1",
        "Date": "Mon, 25 Nov 2019 19:41:29 GMT",
        "x-amzn-RequestId": "1234-5678-9012"
      },
      "HttpStatusCode": 200
    },
    "SdkResponseMetadata": {
      "RequestId": "1234-5678-9012"
    },
    "ClusterId": "AKIAIOSFODNN7EXAMPLE"
  }
}
```

You can then select the `resourceType` and `ClusterId` using `ResultSelector`:

```
"Create Cluster": {
  "Type": "Task",
  "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
  "Parameters": {
    <some parameters>
  },
  "ResultSelector": {
    "ClusterId.$": "$.output.ClusterId",
    "ResourceType.$": "$.resourceType"
  },
  "ResultPath": "$.EMROutput",
  "Next": "Next Step"
}
```

With the given input, using `ResultSelector` produces:

```
{
  "OtherDataFromInput": {},
  "EMROutput": {
      "ClusterId": "AKIAIOSFODNN7EXAMPLE",
      "ResourceType": "elasticmapreduce"
  }
}
```
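The combined effect of `ResultSelector` and `ResultPath` in this example can be sketched in plain Python (the function name and data shapes are illustrative; Step Functions performs this merge itself):

```python
def apply_result_selector_and_path(state_input, task_result):
    # ResultSelector: pick fields from the raw task result.
    selected = {
        "ClusterId": task_result["output"]["ClusterId"],
        "ResourceType": task_result["resourceType"],
    }
    # ResultPath "$.EMROutput": insert the selection into the state input.
    output = dict(state_input)
    output["EMROutput"] = selected
    return output
```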

### Flattening an array of arrays


If a [Parallel workflow state](state-parallel.md) or [Map workflow state](state-map.md) in your state machine returns an array of arrays, you can transform it into a flat array with the [ResultSelector](#input-output-resultselector) field. You can include this field inside the Parallel or Map state definition to manipulate the result of these states.

To flatten arrays, use the `[*][*]` syntax in the `ResultSelector` field, as shown in the following example.

```
"ResultSelector": {
    "flattenArray.$": "$[*][*]"
  }
```

For examples that show how to flatten an array, see *Step 3* in the following tutorials:
+ [Processing batch data with a Lambda function in Step Functions](tutorial-itembatcher-param-task.md)
+ [Processing individual items with a Lambda function in Step Functions](tutorial-itembatcher-single-item-process.md)

# Example: Manipulating state data with paths in Step Functions workflows
Example: Manipulating state data with paths

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

This topic contains examples of how to manipulate state input and output JSON using the InputPath, ResultPath, and OutputPath fields. 

Any state other than a [Fail workflow state](state-fail.md) state or a [Succeed workflow state](state-succeed.md) state can include the input and output processing fields, such as `InputPath`, `ResultPath`, or `OutputPath`. Additionally, the [Wait workflow state](state-wait.md) and [Choice workflow state](state-choice.md) states don't support the `ResultPath` field. With these fields, you can use a [JsonPath](https://datatracker.ietf.org/wg/jsonpath/about/) to filter the JSON data as it moves through your workflow. 

You can also use the `Parameters` field to manipulate the JSON data as it moves through your workflow. For information about using `Parameters`, see [Manipulate parameters in Step Functions workflows](input-output-inputpath-params.md).

For example, start with the AWS Lambda function and state machine described in the [Creating a Step Functions state machine that uses Lambda](tutorial-creating-lambda-state-machine.md) tutorial. Modify the state machine so that it includes the following `InputPath`, `ResultPath`, and `OutputPath`.

```
{
  "Comment": "A Hello World example of the Amazon States Language using an AWS Lambda function",
  "StartAt": "HelloWorld",
  "States": {
    "HelloWorld": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:region:123456789012:function:HelloFunction",
      "InputPath": "$.lambda",
      "ResultPath": "$.data.lambdaresult",
      "OutputPath": "$.data",
      "End": true
    }
  }
}
```

Start an execution using the following input.

```
{
  "comment": "An input comment.",
  "data": {
    "val1": 23,
    "val2": 17
  },
  "extra": "foo",
  "lambda": {
    "who": "AWS Step Functions"
  }
}
```

Assume that the `comment` and `extra` nodes can be discarded, but that you want to include the output of the Lambda function, and preserve the information in the `data` node.

In the updated state machine, the `Task` state is altered to process the input to the task.

```
"InputPath": "$.lambda",
```

This line in the state machine definition limits the task input to only the `lambda` node from the state input. The Lambda function receives only the JSON object `{"who": "AWS Step Functions"}` as input. 

```
"ResultPath": "$.data.lambdaresult",
```

This `ResultPath` tells the state machine to insert the result of the Lambda function into a node named `lambdaresult`, as a child of the `data` node in the original state machine input. Because you are not performing any other manipulation on the original input and the result using `OutputPath`, the output of the state now includes the result of the Lambda function with the original input.

```
{
  "comment": "An input comment.",
  "data": {
    "val1": 23,
    "val2": 17,
    "lambdaresult": "Hello, AWS Step Functions!"
  },
  "extra": "foo",
  "lambda": {
    "who": "AWS Step Functions"
  }
}
```

But our goal was to preserve only the `data` node and include the result of the Lambda function. `OutputPath` filters this combined JSON before passing it to the state output.

```
"OutputPath": "$.data",
```

This selects only the `data` node from the original input (including the `lambdaresult` child inserted by `ResultPath`) to be passed to the output. The state output is filtered to the following.

```
{
  "val1": 23,
  "val2": 17,
  "lambdaresult": "Hello, AWS Step Functions!"
}
```

In this `Task` state:

1. `InputPath` sends only the `lambda` node from the input to the Lambda function.

1. `ResultPath` inserts the result as a child of the `data` node in the original input.

1. `OutputPath` filters the state input (which now includes the result of the Lambda function) so that it passes only the `data` node to the state output.
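The three steps above can be sketched in plain Python (illustrative only; Step Functions applies these paths itself during execution):

```python
def run_hello_world(state_input, invoke_lambda):
    # 1. InputPath "$.lambda": the task sees only the "lambda" node.
    task_input = state_input["lambda"]
    # 2. ResultPath "$.data.lambdaresult": insert the result into the input.
    combined = dict(state_input)
    combined["data"] = dict(state_input["data"])
    combined["data"]["lambdaresult"] = invoke_lambda(task_input)
    # 3. OutputPath "$.data": pass only the "data" node to the state output.
    return combined["data"]

state_output = run_hello_world(
    {
        "comment": "An input comment.",
        "data": {"val1": 23, "val2": 17},
        "extra": "foo",
        "lambda": {"who": "AWS Step Functions"},
    },
    lambda payload: "Hello, " + payload["who"] + "!",
)
```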

**Example to manipulate original state machine input, result, and final output using JsonPath**  
Consider the following state machine that verifies an insurance applicant's identity and address.  
To view the complete example, see [How to use JSON Path in Step Functions](https://github.com/aws-samples/serverless-account-signup-service).

```
{
  "Comment": "Sample state machine to verify an applicant's ID and address",
  "StartAt": "Verify info",
  "States": {
    "Verify info": {
      "Type": "Parallel",
      "End": true,
      "Branches": [
        {
          "StartAt": "Verify identity",
          "States": {
            "Verify identity": {
              "Type": "Task",
              "Resource": "arn:aws:states:::lambda:invoke",
              "Parameters": {
                "Payload.$": "$",
                "FunctionName": "arn:aws:lambda:us-east-2:111122223333:function:check-identity:$LATEST"
              },
              "End": true
            }
          }
        },
        {
          "StartAt": "Verify address",
          "States": {
            "Verify address": {
              "Type": "Task",
              "Resource": "arn:aws:states:::lambda:invoke",
              "Parameters": {
                "Payload.$": "$",
                "FunctionName": "arn:aws:lambda:us-east-2:111122223333:function:check-address:$LATEST"
              },
              "End": true
            }
          }
        }
      ]
    }
  }
}
```
If you run this state machine using the following input, the execution fails because the Lambda functions that perform verification only expect the data that needs to be verified as input. Therefore, you must specify the nodes that contain the information to be verified using an appropriate JsonPath.  

```
{
  "data": {
    "firstname": "Jane",
    "lastname": "Doe",
    "identity": {
      "email": "jdoe@example.com",
      "ssn": "123-45-6789"
    },
    "address": {
      "street": "123 Main St",
      "city": "Columbus",
      "state": "OH",
      "zip": "43219"
    },
    "interests": [
      {
        "category": "home",
        "type": "own",
        "yearBuilt": 2004
      },
      {
        "category": "boat",
        "type": "snowmobile",
        "yearBuilt": 2020
      },
      {
        "category": "auto",
        "type": "RV",
        "yearBuilt": 2015
      }
    ]
  }
}
```
To specify the node that the `check-identity` Lambda function must use, use the `InputPath` field as follows:  

```
"InputPath": "$.data.identity"
```
And to specify the node that the `check-address` Lambda function must use, use the `InputPath` field as follows:  

```
"InputPath": "$.data.address"
```
Now if you want to store the verification result within the original state machine input, use the `ResultPath` field as follows:  

```
"ResultPath": "$.results"
```
However, if you only need the identity and address verification results and want to discard the original input, use the `OutputPath` field as follows:  

```
"OutputPath": "$.results"
```

For more information, see [Processing input and output in Step Functions](concepts-input-output-filtering.md).

## Filtering state output using OutputPath
Filtering state output

With `OutputPath` you can select a portion of the state output to pass to the next state. With this approach, you can filter out unwanted information, and pass only the portion of JSON that you need.

If you don't specify an `OutputPath` the default value is `$`. This passes the entire JSON node (determined by the state input, the task result, and `ResultPath`) to the next state.

# Specifying state output using ResultPath in Step Functions
Specify state output with paths

**Managing state and transforming data**  
This page refers to JSONPath. Step Functions recently added variables and JSONata to manage state and transform data.  
Learn about [Passing data with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

The output of a state can be a copy of its input, the result it produces (for example, output from a `Task` state’s Lambda function), or a combination of its input and result. Use `ResultPath` to control which combination of these is passed to the state output. 

The following state types can generate a result and can include `ResultPath`:
+ [Pass workflow state](state-pass.md)
+ [Task workflow state](state-task.md)
+ [Parallel workflow state](state-parallel.md)
+ [Map workflow state](state-map.md)

Use `ResultPath` to combine a task result with task input, or to select one of these. The path you provide to `ResultPath` controls what information passes to the output. 

**Note**  
 `ResultPath` is limited to using [reference paths](amazon-states-language-paths.md#amazon-states-language-reference-paths), which limit the scope of the path so that it identifies only a single node in JSON. See [Reference Paths](amazon-states-language-paths.md#amazon-states-language-reference-paths) in the [Amazon States Language](concepts-amazon-states-language.md).
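For example (field names are illustrative), a reference path must identify exactly one node, so operators that can match multiple nodes are not allowed:

```
# Valid reference path - identifies a single node
"ResultPath": "$.data.results"

# Invalid reference path - the wildcard can match multiple nodes
"ResultPath": "$.data[*].results"
```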

## Use ResultPath to replace input with the task result
Replace input with result

If you do not specify a `ResultPath`, the default behavior is the same as `"ResultPath": "$"`. The state will replace the entire state input with the result from the task.

```
# State Input
{  
 "comment": "This is a test",
 "details": "Default example",
 "who" : "Step Functions"
}

# Path 
"ResultPath": "$"

# Task result
"Hello, Step Functions!"

# State Output
"Hello, Step Functions!"
```

**Note**  
`ResultPath` is used to include content from the result with the input, before passing it to the output. But, if `ResultPath` isn't specified, the default action is to replace the entire input.

## Discard the result and keep the original input
Discard Result and Keep Input

If you set `ResultPath` to `null`, the state will pass the **original input** to the output. The state's input payload will be copied directly to the output, with no regard for the task result. 

```
# State Input
{  
 "comment": "This is a test",
 "details": "Default example",
 "who" : "Step Functions"
}

# Path 
"ResultPath": null

# Task result
"Hello, Step Functions!"

# State Output
{  
 "comment": "This is a test",
 "details": "Default example",
 "who" : "Step Functions"
}
```

## Use ResultPath to include the result with the input
Include Result with Input

If you specify a path for `ResultPath`, the state output will combine the state input and task result:

```
# State Input
{  
 "comment": "This is a test",
 "details": "Default example",
 "who" : "Step Functions"
}

# Path 
"ResultPath": "$.taskresult"

# Task result
"Hello, Step Functions!"

# State Output
{  
 "comment": "This is a test",
 "details": "Default example",
 "who" : "Step Functions",
 "taskresult" : "Hello, Step Functions!"
}
```

You can also insert the result into a child node of the input. Set the `ResultPath` to the following.

```
"ResultPath": "$.strings.lambdaresult"
```

Given the following input: 

```
{
  "comment": "An input comment.",
  "strings": {
    "string1": "foo",
    "string2": "bar",
    "string3": "baz"
  },
  "who": "AWS Step Functions"
}
```

The task result would be inserted as a child of the `strings` node in the input.

```
{
  "comment": "An input comment.",
  "strings": {
    "string1": "foo",
    "string2": "bar",
    "string3": "baz",
    "lambdaresult": "Hello, Step Functions!"
  },
  "who": "AWS Step Functions"
}
```

The state output now includes the original input JSON with the result as a child node.

## Use ResultPath to update a node in the input with the result
Update a Node in Input with Result

If you specify an existing node for `ResultPath`, the task result will replace that existing node:

```
# State Input
{  
 "comment": "This is a test",
 "details": "Default example",
 "who" : "Step Functions"
}

# Path 
"ResultPath": "$.comment"

# Task result
"Hello, Step Functions!"

# State Output
{  
 "comment": "Hello, Step Functions!",
 "details": "Default example",
 "who" : "Step Functions"
}
```

## Use ResultPath to include both error and input in a `Catch`
Include Error and Input in a `Catch`

In some cases, you might want to preserve the original input with the error. Use `ResultPath` in a `Catch` to include the error with the original input, instead of replacing it. 

```
"Catch": [{ 
  "ErrorEquals": ["States.ALL"], 
  "Next": "NextTask", 
  "ResultPath": "$.error" 
}]
```

If the previous `Catch` statement catches an error, it includes the result in an `error` node within the state input. For example, with the following input:

```
{"foo": "bar"}
```

The state output when catching an error is the following.

```
{
  "foo": "bar",
  "error": {
    "Error": "Error here"
  }
}
```

For more information about error handling, see the following:
+ [Handling errors in Step Functions workflows](concepts-error-handling.md)
+ [Handling error conditions in a Step Functions state machine](tutorial-handling-error-conditions.md)

# Map state input and output fields in Step Functions
Map state configuration

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

Map states iterate over a collection of items in a dataset. Examples of data sets include: 
+ JSON arrays and objects from previous states.
+ Individual data files stored in Amazon S3 in formats such as JSON, JSONL, CSV, and Parquet.
+ References to multiple objects, such as Athena manifests and Amazon S3 inventory files.

A map repeats a set of steps for each item in the dataset. You can configure the input that the `Map` state receives and the output it generates using a variety of configuration options. Step Functions applies each option in your *Distributed Map state* in the order shown in the following list. Depending on your use case, you might not need to apply all of the fields.

1. [ItemReader (Map)](input-output-itemreader.md) - used to read your data items

1. [ItemsPath (Map, JSONPath only)](input-output-itemspath.md) or **Items (JSONata)** - optional; used to specify items in your dataset

1. [ItemSelector (Map)](input-output-itemselector.md) - optional; used to select and modify items in the data set 

1. [ItemBatcher (Map)](input-output-itembatcher.md) - used to process groups of items when processing large sets of items

1. [ResultWriter (Map)](input-output-resultwriter.md) - provides options for output results from child workflows

# ItemReader (Map)
ItemReader

The `ItemReader` field is a JSON object, which specifies a dataset and its location. A *Distributed Map state* uses this dataset as its input. 

The following example shows the syntax of the `ItemReader` field in a **JSONPath-based** workflow, for a dataset in a text delimited file that's stored in an Amazon S3 bucket.

```
"ItemReader": {
    "ReaderConfig": {
        "InputType": "CSV",
        "CSVHeaderLocation": "FIRST_ROW"
    },
    "Resource": "arn:aws:states:::s3:getObject",
    "Parameters": {
        "Bucket": "amzn-s3-demo-bucket",
        "Key": "csvDataset/ratings.csv",
        "VersionId": "BcK42coT2jE1234VHLUvBV1yLNod2OEt"
    }
}
```

In the following **JSONata-based** workflow, note that `Parameters` is replaced with **Arguments**.

```
"ItemReader": {
    "ReaderConfig": {
        "InputType": "CSV",
        "CSVHeaderLocation": "FIRST_ROW"
    },
    "Resource": "arn:aws:states:::s3:getObject",
    "Arguments": {
        "Bucket": "amzn-s3-demo-bucket",
        "Key": "csvDataset/ratings.csv",
        "VersionId": "BcK42coT2jE1234VHLUvBV1yLNod2OEt"
    }
}
```

## Contents of the ItemReader field


Depending on your dataset, the contents of the `ItemReader` field vary. For example, if your dataset is a JSON array passed from a previous step in the workflow, the `ItemReader` field is omitted. If your dataset is an Amazon S3 data source, this field contains the following sub-fields.

**`Resource`**  
The Amazon S3 API integration action that Step Functions will use, such as `arn:aws:states:::s3:getObject`

**`Arguments (JSONata) or Parameters (JSONPath)`**  
A JSON object that specifies the Amazon S3 bucket name and object key that the dataset is stored in.   
If the bucket has versioning enabled, you can also provide the Amazon S3 object version.

**`ReaderConfig`**  
A JSON object that specifies the following details:  
+ `InputType`

  Accepts one of the following values: `CSV`, `JSON`, `JSONL`, `PARQUET`, `MANIFEST`.

  Specifies the type of Amazon S3 data source, such as a text delimited file (`CSV`), object, JSON file, JSON Lines, Parquet file, Athena manifest, or an Amazon S3 inventory list. In Workflow Studio, you can select an input type from **S3 item source**.

  Most input types that use `S3:GetObject` retrieval also support the `ExpectedBucketOwner` and `VersionId` fields in their parameters. Parquet files are the one exception that does not support `VersionId`.

  Input files support the following external compression types: GZIP, ZSTD. 

  Example file names: `myObject.jsonl.gz` and `myObject.csv.zstd`. 

  Note: Parquet files are a binary file type that are internally compressed. GZIP, ZSTD, and Snappy compression are supported.
+ `Transformation`

  *Optional*. The value is either `NONE` or `LOAD_AND_FLATTEN`. 

  If not specified, `NONE` will be assumed. When set to `LOAD_AND_FLATTEN`, you must also set `InputType`.

  By default, the map iterates over the **metadata objects** returned from calls to `S3:ListObjectsV2`. When set to `LOAD_AND_FLATTEN`, the map reads and processes the actual **data objects** referenced in the list of results. 
+ `ManifestType`

  *Optional*. The value is either `ATHENA_DATA` or `S3_INVENTORY`. 

  Note: If set to `S3_INVENTORY`, you must **not** also specify `InputType` because the type is assumed to be `CSV`.
+ `CSVDelimiter`

  You can specify this field when `InputType` is `CSV` or `MANIFEST`. 

  Accepts one of the following values: `COMMA` (default), `PIPE`, `SEMICOLON`, `SPACE`, `TAB`.
**Note**  
With the `CSVDelimiter` field, `ItemReader` can process files that are delimited by characters other than a comma. References to "CSV files" also include files that use alternative delimiters specified by the `CSVDelimiter` field.
+ `CSVHeaderLocation`

  You can specify this field when `InputType` is `CSV` or `MANIFEST`. 

  Accepts one of the following values to specify the location of the column header:
  + `FIRST_ROW` – Use this option if the first line of the file is the header.
  + `GIVEN` – Use this option to specify the header within the state machine definition. 

    For example, if your file contains the following data.

    ```
    1,307,3.5,1256677221
    1,481,3.5,1256677456
    1,1091,1.5,1256677471
    ...
    ```

    You might provide the following JSON array as a CSV header:

    ```
    "ItemReader": {
        "ReaderConfig": {
            "InputType": "CSV",
            "CSVHeaderLocation": "GIVEN",
            "CSVHeaders": [
                "userId",
                "movieId",
                "rating",
                "timestamp"
            ]
        }
    }
    ```
**CSV header size**  
Step Functions supports headers of up to 10 KiB for text delimited files.
+ `ItemsPointer`

  *Optional*. You can specify this field when `InputType` is `JSON`. 

  `ItemsPointer` uses JSONPointer syntax to select a specific array or object nested within your JSON file. JSONPointer is a standardized syntax designed exclusively for navigating and referencing locations within JSON documents.

  JSONPointer syntax uses forward slashes (/) to separate each level of nesting, with array indices represented as numbers without brackets. For example:
  + `/Data/Contents` - references the Contents array within the Data object
  + `/Data/Contents/0` - references the first element of the Contents array

  The target array's starting position must be within the first 16 MB of the JSON file, and the JSONPointer path must be less than 2,000 characters in length.

  For example, if your JSON file contains:

  ```
  {"data": {"items": [{"id": 1}, {"id": 2}]}}
  ```

  You would specify `"ItemsPointer": "/data/items"` to process the items array.
+ `MaxItems`

  By default, the `Map` state iterates over all items in the specified dataset. By setting `MaxItems`, you can limit the number of data items passed to the `Map` state. For example, if you provide a text delimited file that contains 1,000 rows, and you set a limit of 100, then the interpreter passes *only* 100 rows to the *Distributed Map state*. The `Map` state processes items in sequential order, starting after the header row. 

  For **JSONPath** workflows, you can use `MaxItemsPath` and a *reference path* to a key-value pair in the state input which resolves to an integer. Note that you can specify either `MaxItems` or `MaxItemsPath`, but not **both**.
**Note**  
You can specify a limit of up to 100,000,000 after which the `Distributed Map` stops reading items.
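The following sketch shows where `MaxItems` sits within `ReaderConfig` in a JSONPath-based workflow (the bucket name and key are placeholders):

```
"ItemReader": {
    "ReaderConfig": {
        "InputType": "CSV",
        "CSVHeaderLocation": "FIRST_ROW",
        "MaxItems": 100
    },
    "Resource": "arn:aws:states:::s3:getObject",
    "Parameters": {
        "Bucket": "amzn-s3-demo-bucket",
        "Key": "csvDataset/ratings.csv"
    }
}
```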

**Requirements for account and region**  
Your Amazon S3 buckets must be in the same AWS account and AWS Region as your state machine.  
Note that even though your state machine may be able to access files in buckets across different AWS accounts that are in the same AWS Region, Step Functions only supports listing objects in Amazon S3 buckets that are in *both* the same AWS account and the same AWS Region as the state machine.

## Processing nested data sets (updated Sep 11, 2025)


With the new `Transformation` parameter, you can specify a value of `LOAD_AND_FLATTEN`, and the map will read the **actual** data objects referenced in the list of results from a call to `S3:ListObjectsV2`. 

Prior to this release, you would need to create nested Distributed Maps to **retrieve** the metadata and then **process** the actual data. The first map would iterate over the **metadata** returned by `S3:ListObjectsV2` and invoke child workflows. Another map within each child state machine would read the **actual data** from individual files. With the transformation option, you can accomplish both steps at once.

Imagine you want to run a daily audit on the past 24 log files that your system produces hourly and stores in Amazon S3. Your Distributed Map state can list the log files with `S3:ListObjectsV2`, then either iterate over the *metadata* of each object, or load and analyze the **actual data** objects stored in your Amazon S3 bucket.

Using the `LOAD_AND_FLATTEN` option can increase scalability, reduce open Map Run counts, and process multiple objects concurrently. Athena and Amazon EMR jobs typically generate output that can be processed with the new configuration. 

The following is an example of the parameters in an `ItemReader` definition: 

```
{
  "QueryLanguage": "JSONata",
  "States": {
    ...
    "Map": {
        ...
        "ItemReader": {
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "ReaderConfig": {
                // InputType is required if Transformation is LOAD_AND_FLATTEN.
                "InputType": "CSV | JSON | JSONL | PARQUET",

                // Transformation is OPTIONAL and defaults to NONE if not present
                "Transformation": "NONE | LOAD_AND_FLATTEN" 
            },
            "Arguments": {
                "Bucket": "amzn-s3-demo-bucket1",
                "Prefix": "{% $states.input.PrefixKey %}"
            }
        },
        ...
    }
  }
}
```

## Examples of datasets


You can specify one of the following options as your dataset:
+ [JSON data from a previous step](#itemsource-json-array)
+ [A list of Amazon S3 objects](#itemsource-example-s3-object-data)
+ [Amazon S3 objects transformed by `LOAD_AND_FLATTEN`](#itemsource-example-s3-object-data-flatten)
+ [JSON file in an Amazon S3 bucket](#itemsource-example-json-data)
+ [JSON Lines file in an Amazon S3 bucket](#itemsource-example-json-lines-data)
+ [CSV file in an Amazon S3 bucket](#itemsource-example-csv-data)
+ [Parquet file in an Amazon S3 bucket](#itemsource-example-parquet-data)
+ [Athena manifest (process multiple items)](#itemsource-example-athena-manifest-data)
+ [Amazon S3 inventory (process multiple items)](#itemsource-example-s3-inventory)

**Note**  
Step Functions needs appropriate permissions to access the Amazon S3 datasets that you use. For information about IAM policies for the datasets, see [IAM policy recommendations for datasets](#itemreader-iam-policies).

### JSON data from a previous step


A *Distributed Map state* can accept a JSON input passed from a previous step in the workflow. 

The input can be a JSON array, a JSON object, or an array within a node of a JSON object. 

Step Functions will iterate directly over the elements of an array, or the key-value pairs of a JSON object. 

To select a specific node that contains a nested JSON array or object from the input, you can use the `ItemsPath` field (Map, JSONPath only), or use a JSONata expression in the `Items` field for JSONata states. 
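For a JSONata-based workflow, a minimal sketch of selecting a nested array with the `Items` field might look like the following, assuming the state input contains a `facts` array (other `Map` fields omitted):

```
"Map": {
    "Type": "Map",
    "Items": "{% $states.input.facts %}",
    ...
}
```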

To process individual items, the *Distributed Map state* starts a child workflow execution for each item. The following tabs show examples of the input passed to the `Map` state and the corresponding input to a child workflow execution.

**Note**  
The `ItemReader` field is not needed when your dataset is JSON data from a previous step.

------
#### [ Input passed to the Map state ]

Consider the following JSON array of three items.

```
"facts": [
    {
        "verdict": "true",
        "statement_date": "6/11/2008",
        "statement_source": "speech"
    },
    {
        "verdict": "false",
        "statement_date": "6/7/2022",
        "statement_source": "television"
    },
    {
        "verdict": "mostly-true",
        "statement_date": "5/18/2016",
        "statement_source": "news"
    }
]
```

------
#### [ Input passed to a child workflow execution ]

The *Distributed Map state* starts three child workflow executions. Each execution receives an array item as input. The following example shows the input received by a child workflow execution.

```
{
  "verdict": "true",
  "statement_date": "6/11/2008",
  "statement_source": "speech"
}
```

------

### A list of Amazon S3 objects


A *Distributed Map state* can iterate over the objects that are stored in an Amazon S3 bucket. When the workflow execution reaches the `Map` state, Step Functions invokes the [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) API action, which returns an array of the Amazon S3 **object metadata**. In this array, each item contains data, such as **ETag** and **Key**, for the actual data stored in the bucket. 

To process individual items in the array, the *Distributed Map state* starts a child workflow execution. For example, suppose that your Amazon S3 bucket contains 100 images. Then, the array returned after invoking the `ListObjectsV2` API action contains 100 metadata items. The *Distributed Map state* then starts 100 child workflow executions to process each item.

To process data objects **directly**, without nested workflows, you can choose the `LOAD_AND_FLATTEN` transformation option.

**Note**  
Step Functions will also include an item for each **folder** created in the Amazon S3 bucket using the Amazon S3 **console**. The folder items result in starting extra child workflow executions.   
To avoid creating extra child workflow executions for each folder, we recommend that you use the AWS CLI to create folders. For more information, see [High-level Amazon S3 commands](https://docs.aws.amazon.com/cli/latest/userguide/cli-services-s3-commands.html#using-s3-commands-managing-buckets-creating) in the *AWS Command Line Interface User Guide*.
Step Functions needs appropriate permissions to access the Amazon S3 datasets that you use. For information about IAM policies for the datasets, see [IAM policy recommendations for datasets](#itemreader-iam-policies).

The following tabs show examples of the `ItemReader` field syntax and the input passed to a child workflow execution for this dataset.

------
#### [ ItemReader syntax ]

In this example, you've organized your data, which includes images, JSON files, and objects, within a prefix named `processData` in an Amazon S3 bucket named `amzn-s3-demo-bucket`.

```
"ItemReader": {
    "Resource": "arn:aws:states:::s3:listObjectsV2",
    "Parameters": {
        "Bucket": "amzn-s3-demo-bucket",
        "Prefix": "processData"
    }
}
```

------
#### [ Input passed to a child workflow execution ]

The *Distributed Map state* starts as many child workflow executions as the number of metadata items present in the Amazon S3 bucket. The following example shows the input received by a child workflow execution.

```
{
  "Etag": "\"05704fbdccb224cb01c59005bebbad28\"",
  "Key": "processData/images/n02085620_1073.jpg",
  "LastModified": 1668699881,
  "Size": 34910,
  "StorageClass": "STANDARD"
}
```

------

### Amazon S3 objects transformed by `LOAD_AND_FLATTEN`


With enhanced support for Amazon S3 `ListObjectsV2` as an input source in Distributed Map, your state machines can read and process multiple **data objects** from Amazon S3 buckets directly, eliminating the need for nested maps to process the metadata.

With the `LOAD_AND_FLATTEN` option, your state machine will do the following:
+ Read the **actual content** of each object listed by the Amazon S3 `ListObjectsV2` call.
+ Parse the content based on `InputType` (CSV, JSON, JSONL, Parquet).
+ Create items from the file contents (rows/records) rather than metadata.

With the transformation option, you no longer need nested Distributed Maps to process the metadata. Using the `LOAD_AND_FLATTEN` option increases scalability, reduces active map run counts, and processes multiple objects concurrently.

The following configuration shows the setting for an `ItemReader`:

```
"ItemReader": {
   "Resource": "arn:aws:states:::s3:listObjectsV2",
   "ReaderConfig": {
      "InputType": "JSON",
      "Transformation": "LOAD_AND_FLATTEN"
   },
   "Arguments": {
      "Bucket": "S3_BUCKET_NAME",
      "Prefix": "S3_BUCKET_PREFIX"
   }
}
```

**Bucket prefix recommendation**  
We recommend including a trailing slash on your prefix. For example, if you select data with a prefix of `folder1`, your state machine will process both `folder1/myData.csv` and `folder10/myData.csv`. Using `folder1/` restricts processing to only the intended folder.

### JSON file in an Amazon S3 bucket


A *Distributed Map state* can accept a JSON file that's stored in an Amazon S3 bucket as a dataset. The JSON file must contain an array or JSON object. 

When the workflow execution reaches the `Map` state, Step Functions invokes the [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) API action to fetch the specified JSON file. 

If the JSON file contains a nested object structure, you can select the specific node containing your dataset with `ItemsPointer`. For example, the following configuration would extract a nested list of *featured products* in *inventory*.

```
"ItemReader": {
   "Resource": "arn:aws:states:::s3:getObject",
   "ReaderConfig": {
      "InputType": "JSON",
      "ItemsPointer": "/inventory/products/featured"
   },
   "Arguments": {
      "Bucket": "amzn-s3-demo-bucket",
      "Key": "nested-data-file.json"
   }
}
```

The `Map` state then iterates over each item in the array and starts a child workflow execution for each item. For example, if your JSON file contains 1000 array items, the `Map` state starts 1000 child workflow executions.

**Note**  
The execution input used to start a child workflow execution can't exceed 256 KiB. However, Step Functions supports reading an item of up to 8 MB from a text delimited file, JSON, or JSON Lines file if you then apply the optional `ItemSelector` field to reduce the item's size.
Step Functions supports 10 GB as the maximum size of an individual file in Amazon S3.
Step Functions needs appropriate permissions to access the Amazon S3 datasets that you use. For information about IAM policies for the datasets, see [IAM policy recommendations for datasets](#itemreader-iam-policies).

The following tabs show examples of the `ItemReader` field syntax and the input passed to a child workflow execution for this dataset.

For this example, imagine you have a JSON file named `factcheck.json`. You've stored this file within a prefix named `jsonDataset` in an Amazon S3 bucket. The following is an example of the JSON dataset.

```
[
  {
    "verdict": "true",
    "statement_date": "6/11/2008",
    "statement_source": "speech"
  },
  {
    "verdict": "false",
    "statement_date": "6/7/2022",
    "statement_source": "television"
  },
  {
    "verdict": "mostly-true",
    "statement_date": "5/18/2016",
    "statement_source": "news"
  },
  ...
]
```

------
#### [ ItemReader syntax ]

```
"ItemReader": {
   "Resource": "arn:aws:states:::s3:getObject",
   "ReaderConfig": {
      "InputType": "JSON"
   },
   "Parameters": {
      "Bucket": "amzn-s3-demo-bucket",
      "Key": "jsonDataset/factcheck.json"
   }
}
```

------
#### [ Input to a child workflow execution ]

The *Distributed Map state* starts as many child workflow executions as the number of array items present in the JSON file. The following example shows the input received by a child workflow execution.

```
{
  "verdict": "true",
  "statement_date": "6/11/2008",
  "statement_source": "speech"
}
```

------

### JSON Lines file in an Amazon S3 bucket


A *Distributed Map state* can accept a JSON Lines file that's stored in an Amazon S3 bucket as a dataset.

**Note**  
The execution input used to start a child workflow execution can't exceed 256 KiB. However, Step Functions supports reading an item of up to 8 MB from a text delimited file, JSON, or JSON Lines file if you then apply the optional `ItemSelector` field to reduce the item's size.
Step Functions supports 10 GB as the maximum size of an individual file in Amazon S3.
Step Functions needs appropriate permissions to access the Amazon S3 datasets that you use. For information about IAM policies for the datasets, see [IAM policy recommendations for datasets](#itemreader-iam-policies).

The following tabs show examples of the `ItemReader` field syntax and the input passed to a child workflow execution for this dataset.

For this example, imagine you have a JSON Lines file named `factcheck.jsonl`. You've stored this file within a prefix named `jsonlDataset` in an Amazon S3 bucket. The following is an example of the file's contents.

```
{"verdict": "true", "statement_date": "6/11/2008", "statement_source": "speech"} 
{"verdict": "false", "statement_date": "6/7/2022", "statement_source": "television"}
{"verdict": "mostly-true", "statement_date": "5/18/2016", "statement_source": "news"}
```

------
#### [ ItemReader syntax ]

```
"ItemReader": {
   "Resource": "arn:aws:states:::s3:getObject",
   "ReaderConfig": {
      "InputType": "JSONL"
   },
   "Parameters": {
      "Bucket": "amzn-s3-demo-bucket",
      "Key": "jsonlDataset/factcheck.jsonl"
   }
}
```

------
#### [ Input to a child workflow execution ]

The *Distributed Map state* starts as many child workflow executions as the number of lines present in the JSONL file. The following example shows the input received by a child workflow execution.

```
{
  "verdict": "true",
  "statement_date": "6/11/2008",
  "statement_source": "speech"
}
```

------

### CSV file in an Amazon S3 bucket


**Note**  
With the `CSVDelimiter` field, `ItemReader` can process files that are delimited by characters other than a comma. References to "CSV files" also include files that use alternative delimiters specified by the `CSVDelimiter` field.

A *Distributed Map state* can accept a text delimited file that's stored in an Amazon S3 bucket as a dataset. If you use a text delimited file as your dataset, you need to specify a column header. For information about how to specify a header, see [Contents of the ItemReader field](#itemreader-field-contents).

Step Functions parses text delimited files based on the following rules:
+ The delimiter that separates fields is specified by `CSVDelimiter` in *ReaderConfig*. The delimiter defaults to `COMMA`.
+ Newlines are a delimiter that separates **records**.
+ Fields are treated as strings. For data type conversions, use the `States.StringToJson` intrinsic function in [ItemSelector (Map)](input-output-itemselector.md).
+ Double quotation marks (" ") are not required to enclose strings. However, strings that are enclosed by double quotation marks can contain commas and newlines without acting as record delimiters.
+ You can preserve double quotes by repeating them.
+ Backslashes (`\`) are another way to escape special characters. Backslashes only work with other backslashes, double quotation marks, and the configured field separator, such as a comma or pipe. A backslash followed by any other character is silently removed.
+ You can preserve backslashes by repeating them. For example: 

  ```
  path,size
  C:\\Program Files\\MyApp.exe,6534512
  ```
+ Backslashes that escape double quotation marks (`\"`), only work when included in pairs, so we recommend escaping double quotation marks by repeating them: `""`.
+ If the number of fields in a row is **less** than the number of fields in the header, Step Functions provides **empty strings** for the missing values.
+ If the number of fields in a row is **more** than the number of fields in the header, Step Functions **skips** the additional fields.

For more information about how Step Functions parses a text delimited file, see [Example of parsing an input CSV file](example-csv-parse-dist-map.md#example-csv-parse).
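As a sketch of the last two rules, given the header `userId,movieId,rating,timestamp`, a short row such as `1,307` would produce the following item, with empty strings supplied for the missing values:

```
{
  "userId": "1",
  "movieId": "307",
  "rating": "",
  "timestamp": ""
}
```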

When the workflow execution reaches the `Map` state, Step Functions invokes the [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) API action to fetch the specified file. The `Map` state then iterates over each row in the file and starts a child workflow execution to process the items in each row. For example, suppose that you provide a text delimited file that contains 100 rows as input. Then, the interpreter passes each row to the `Map` state. The `Map` state processes items in serial order, starting after the header row.

**Note**  
The execution input used to start a child workflow execution can't exceed 256 KiB. However, Step Functions supports reading an item of up to 8 MB from a text delimited file, JSON, or JSON Lines file if you then apply the optional `ItemSelector` field to reduce the item's size.
Step Functions supports 10 GB as the maximum size of an individual file in Amazon S3.
Step Functions needs appropriate permissions to access the Amazon S3 datasets that you use. For information about IAM policies for the datasets, see [IAM policy recommendations for datasets](#itemreader-iam-policies).

The following tabs show examples of the `ItemReader` field syntax and the input passed to a child workflow execution for this dataset.

------
#### [ ItemReader syntax ]

For example, say that you have a CSV file named `ratings.csv`. Then, you've stored this file within a prefix that's named `csvDataset` in an Amazon S3 bucket.

```
"ItemReader": {
   "ReaderConfig": {
      "InputType": "CSV",
      "CSVHeaderLocation": "FIRST_ROW",
      "CSVDelimiter": "PIPE"
   },
   "Resource": "arn:aws:states:::s3:getObject",
   "Parameters": {
      "Bucket": "amzn-s3-demo-bucket",
      "Key": "csvDataset/ratings.csv"
   }
}
```

------
#### [ Input to a child workflow execution ]

The *Distributed Map state* starts as many child workflow executions as the number of rows present in the CSV file, excluding the header row, if present. The following example shows the input received by a child workflow execution.

```
{
  "rating": "3.5",
  "movieId": "307",
  "userId": "1",
  "timestamp": "1256677221"
}
```

------
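The row-to-JSON mapping shown in these tabs can be sketched locally in Python. This is an illustration of the behavior, not a Step Functions API; the pipe delimiter and first-row header mirror the `ReaderConfig` above, and the sample data is hypothetical:

```python
import csv
import io

# Sample pipe-delimited content, mirroring CSVDelimiter: PIPE and
# CSVHeaderLocation: FIRST_ROW (illustrative data, not from the service).
raw = "userId|movieId|rating|timestamp\n1|307|3.5|1256677221\n"

# DictReader maps each data row to {header: value}, which is the shape
# each child workflow execution receives as input.
rows = list(csv.DictReader(io.StringIO(raw), delimiter="|"))
print(rows[0])
# {'userId': '1', 'movieId': '307', 'rating': '3.5', 'timestamp': '1256677221'}
```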

### Parquet file in an Amazon S3 bucket


You can use Apache Parquet files stored in Amazon S3 as an input source. They provide efficient columnar data processing at scale.

When using Parquet files, the following conditions apply:
+ 256 MB is the maximum row-group size, and 5 MB is the maximum footer size. If you provide input files that exceed either limit, your state machine will return a runtime error.
+ The `VersionId` field is **not** supported for `InputType=Parquet`.
+ Internal GZIP, ZSTD, and Snappy data compression are natively supported. No filename extensions are necessary. 

The following shows an example ASL configuration for `InputType` set to Parquet:

```
"ItemReader": {
   "Resource": "arn:aws:states:::s3:getObject",
   "ReaderConfig": {
      "InputType": "PARQUET"
   },
   "Arguments": {
      "Bucket": "amzn-s3-demo-bucket",
      "Key": "my-parquet-data-file-1.parquet"
   }
}
```

**Large scale job processing**  
For extremely large-scale jobs, Step Functions uses many input readers. Readers interleave their processing, which might cause some readers to pause while others make progress. Intermittent progress is expected behavior at scale.

### Athena manifest (process multiple items)


You can use Athena manifest files, generated from `UNLOAD` query results, to specify the **source** of data files for your Map state. You set `ManifestType` to `ATHENA_DATA`, and `InputType` to `CSV`, `JSONL`, or `PARQUET`. 

When running an `UNLOAD` query, Athena generates a data manifest file in addition to the actual data objects. The manifest file provides a structured CSV list of the data files. Both the manifest and the data files are saved to your Athena query result location in Amazon S3.

```
UNLOAD (<YOUR_SELECT_QUERY>) TO 'S3_URI_FOR_STORING_DATA_OBJECT' WITH (format = 'JSON')
```

A brief conceptual overview of the process:

1. Select your data from a Table using an `UNLOAD` query in Athena.

1. Athena will generate a manifest file (CSV) and the data objects in Amazon S3. 

1. Configure Step Functions to read the manifest file and process the input.

The feature can process CSV, JSONL, and Parquet output formats from Athena. All objects referenced in a single manifest file must have the same `InputType` format. Note that CSV objects exported by an `UNLOAD` query do **not** include a header in the first line. See `CSVHeaderLocation` if you need to provide column headers. 

The Map context also includes `$states.context.Map.Item.Source`, so you can customize processing based on the source of the data.

The following is an example configuration of an `ItemReader` configured to use an Athena manifest:

```
"ItemReader": {
   "Resource": "arn:aws:states:::s3:getObject",
   "ReaderConfig": {
      "ManifestType": "ATHENA_DATA",
      "InputType": "CSV | JSONL | PARQUET"
   },
   "Arguments": {
      "Bucket": "<S3_BUCKET_NAME>",
      "Key": "<S3_KEY_PREFIX><QUERY_ID>-manifest.csv"
   }
}
```
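An Athena `UNLOAD` manifest is a plain list of the S3 URIs of the data objects the query wrote, one per line. The following local sketch shows what reading such a manifest amounts to; the URIs are hypothetical, and in practice Step Functions fetches and parses the manifest for you when `ManifestType` is `ATHENA_DATA`:

```python
# Illustrative manifest content; in practice Step Functions fetches
# <QUERY_ID>-manifest.csv from S3 and reads the data objects it lists.
manifest_text = (
    "s3://amzn-s3-demo-bucket/results/file1.gz\n"
    "s3://amzn-s3-demo-bucket/results/file2.gz\n"
)

# Each non-empty line is the URI of one data object to read.
data_objects = [line for line in manifest_text.splitlines() if line]
print(len(data_objects))  # 2
```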

**Using the Athena manifest pattern in Workflow Studio**  
A common scenario for data processing applies a Map to data sourced from an Athena UNLOAD query. The Map invokes a Lambda function to process each item described in the Athena manifest. Step Functions Workflow Studio provides a ready-made pattern that combines all of these components into a block that you can drag onto your state machine canvas.

### S3 inventory (process multiple items)


A *Distributed Map state* can accept an Amazon S3 inventory manifest file that's stored in an Amazon S3 bucket as a dataset.

When the workflow execution reaches the `Map` state, Step Functions invokes the [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) API action to fetch the specified Amazon S3 inventory manifest file. 

By default, the `Map` state then iterates over the **objects** in the inventory to return an array of Amazon S3 inventory object metadata.

If you specify `ManifestType` as `S3_INVENTORY`, you cannot also specify `InputType`. 



**Note**  
Step Functions supports 10 GB as the maximum size of an individual file in an Amazon S3 inventory report after decompression. However, Step Functions can process a total of more than 10 GB across multiple files, provided each individual file is under 10 GB.
Step Functions needs appropriate permissions to access the Amazon S3 datasets that you use. For information about IAM policies for the datasets, see [IAM policy recommendations for datasets](#itemreader-iam-policies).

The following is an example of an inventory file in CSV format. This file lists objects under the `csvDataset` and `imageDataset` prefixes, which are stored in an Amazon S3 bucket named `amzn-s3-demo-source-bucket`.

```
"amzn-s3-demo-source-bucket","csvDataset/","0","2022-11-16T00:27:19.000Z"
"amzn-s3-demo-source-bucket","csvDataset/titles.csv","3399671","2022-11-16T00:29:32.000Z"
"amzn-s3-demo-source-bucket","imageDataset/","0","2022-11-15T20:00:44.000Z"
"amzn-s3-demo-source-bucket","imageDataset/n02085620_10074.jpg","27034","2022-11-15T20:02:16.000Z"
...
```

**Important**  
Step Functions doesn't support a user-defined Amazon S3 inventory report as a dataset.   
The output format of your Amazon S3 inventory report must be CSV.   
For more information about Amazon S3 inventories and how to set them up, see [Amazon S3 Inventory](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html).

The following example of an Amazon S3 inventory manifest file shows, in its `fileSchema` field, the CSV headers for the inventory object metadata.

```
{
  "sourceBucket" : "amzn-s3-demo-source-bucket",
  "destinationBucket" : "arn:aws:s3:::amzn-s3-demo-inventory",
  "version" : "2016-11-30",
  "creationTimestamp" : "1668560400000",
  "fileFormat" : "CSV",
  "fileSchema" : "Bucket, Key, Size, LastModifiedDate",
  "files" : [ {
    "key" : "amzn-s3-demo-bucket/destination-prefix/data/20e55de8-9c21-45d4-99b9-46c732000228.csv.gz",
    "size" : 7300,
    "MD5checksum" : "a7ff4a1d4164c3cd55851055ec8f6b20"
  } ]
}
```
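The `fileSchema` field tells you how to interpret the columns of each inventory CSV row. The following sketch pairs the schema above with the `titles.csv` row from the earlier inventory example; it illustrates how the per-object input shown in the next tabs is derived, and is not the service implementation:

```python
import csv
import io

# From the manifest's fileSchema field (above).
file_schema = "Bucket, Key, Size, LastModifiedDate"
headers = [h.strip() for h in file_schema.split(",")]

# One row from the inventory CSV (the titles.csv entry shown earlier).
row_text = '"amzn-s3-demo-source-bucket","csvDataset/titles.csv","3399671","2022-11-16T00:29:32.000Z"\n'
row = next(csv.reader(io.StringIO(row_text)))

# Pairing headers with values yields the object metadata that a child
# workflow execution receives as input.
item = dict(zip(headers, row))
print(item["Key"])  # csvDataset/titles.csv
```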

The following tabs show examples of the `ItemReader` field syntax and the input passed to a child workflow execution for this dataset.

------
#### [ ItemReader syntax ]

```
"ItemReader": {
   "ReaderConfig": {
      "InputType": "MANIFEST"
   },
   "Resource": "arn:aws:states:::s3:getObject",
   "Parameters": {
      "Bucket": "amzn-s3-demo-destination-bucket",
      "Key": "destination-prefix/amzn-s3-demo-bucket/config-id/YYYY-MM-DDTHH-MMZ/manifest.json"
   }
}
```

------
#### [ Input to a child workflow execution ]

```
{
  "LastModifiedDate": "2022-11-16T00:29:32.000Z",
  "Bucket": "amzn-s3-demo-source-bucket",
  "Size": "3399671",
  "Key": "csvDataset/titles.csv"
}
```

Depending on the fields you selected while configuring the Amazon S3 inventory report, the contents of your `manifest.json` file may vary from the example.

------

## IAM policy recommendations for datasets


When you create workflows with the Step Functions console, Step Functions can automatically generate IAM policies based on the resources in your workflow definition. Generated policies include the least privileges necessary to allow the state machine role to invoke the [StartExecution](https://docs.aws.amazon.com/step-functions/latest/apireference/API_StartExecution.html) API action for the *Distributed Map state* and to access AWS resources, such as Amazon S3 buckets and objects, and Lambda functions.

We recommend including only the necessary permissions in your IAM policies. For example, if your workflow includes a `Map` state in Distributed mode, scope your policies down to the specific Amazon S3 bucket and folder that contains your data.

**Important**  
If you specify an Amazon S3 bucket and object, or prefix, with a [reference path](amazon-states-language-paths.md#amazon-states-language-reference-paths) to an existing key-value pair in your *Distributed Map state* input, make sure that you update the IAM policies for your workflow. Scope the policies down to the bucket and object names the path resolves to at runtime.

The following examples show techniques for granting the least privileges required to access your Amazon S3 datasets using the [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) and [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) API actions.

**Example condition using an Amazon S3 object as a dataset**  
The following condition grants the least privileges to access objects in a `processImages` folder of an Amazon S3 bucket.  

```
"Resource": [ "arn:aws:s3:::amzn-s3-demo-bucket" ],
"Condition": {
   "StringLike": { 
      "s3:prefix": [ "processImages" ]
   }
}
```

**Example using a CSV file as a dataset**  
The following example shows the actions required to access a CSV file named `ratings.csv`.  

```
"Action": [ "s3:GetObject" ],
"Resource": [
   "arn:aws:s3:::amzn-s3-demo-bucket/csvDataset/ratings.csv"
   ]
```

**Example using an Amazon S3 inventory as a dataset**  
The following shows example resources for an Amazon S3 inventory manifest and data files.  

```
"Resource": [
   "arn:aws:s3:::myPrefix/amzn-s3-demo-bucket/myConfig-id/YYYY-MM-DDTHH-MMZ/manifest.json",
   "arn:aws:s3:::myPrefix/amzn-s3-demo-bucket/myConfig-id/data/*"
   ]
```

**Example using ListObjectsV2 to restrict to a folder prefix**  
When using [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html), Step Functions generates two policies: one that allows **listing** the contents of the bucket (`ListBucket`), and another that allows **retrieving objects** in the bucket (`GetObject`).   
The following show example actions, resources, and a condition:  

```
"Action": [ "s3:ListBucket" ],
"Resource": [ "arn:aws:s3:::amzn-s3-demo-bucket" ],
"Condition": {
   "StringLike": {
      "s3:prefix": [ "/path/to/your/json/" ]
   }
}
```

```
"Action": [ "s3:GetObject" ],
"Resource": [ "arn:aws:s3:::amzn-s3-demo-bucket/path/to/your/json/*" ]
```
Note that the `GetObject` policy is not scoped to individual objects; instead, it uses a wildcard (`*`) to match any object under the prefix.

# ItemsPath (Map, JSONPath only)
ItemsPath (JSONPath)

**Managing state and transforming data**  
This page refers to JSONPath. Step Functions recently added variables and JSONata to manage state and transform data.  
Learn about [Passing data with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

In JSONPath-based states, use the `ItemsPath` field to select an array or object within a JSON input provided to a `Map` state. By default, the `Map` state sets `ItemsPath` to `$`, which selects the entire input. 
+  If the input to the `Map` state is a JSON array, it runs an iteration for each item in the array, passing that item to the iteration as input 
+  If the input to the `Map` state is a JSON object, it runs an iteration for each key-value pair in the object, passing the pair to the iteration as input 

**Note**  
You can use `ItemsPath` in the *Distributed Map state* only if you use a JSON input passed from a previous state in the workflow.

The value of `ItemsPath` must be a [Reference Path](amazon-states-language-paths.md#amazon-states-language-reference-paths), and that path must evaluate to a JSON array or object. For instance, consider input to a `Map` state that includes two arrays, like the following example.

```
{
  "ThingsPiratesSay": [
    {
      "say": "Avast!"
    },
    {
      "say": "Yar!"
    },
    {
      "say": "Walk the Plank!"
    }
  ],
  "ThingsGiantsSay": [
    {
      "say": "Fee!"
    },
    {
      "say": "Fi!"
    },
    {
      "say": "Fo!"
    },
    {
      "say": "Fum!"
    }
  ]
}
```

In this case, you could specify which array to use for `Map` state iterations by selecting it with `ItemsPath`. The following state machine definition specifies the `ThingsPiratesSay` array in the input using `ItemsPath`. It then runs an iteration of the `SayWord` pass state for each item in the `ThingsPiratesSay` array.

```
{
  "StartAt": "PiratesSay",
  "States": {
    "PiratesSay": {
      "Type": "Map",
      "ItemsPath": "$.ThingsPiratesSay",
      "ItemProcessor": {
         "StartAt": "SayWord",
         "States": {
           "SayWord": {
             "Type": "Pass",
             "End": true
           }
         }
      },
      "End": true
    }
  }
}
```

For nested JSON objects, you can use `ItemsPath` to select a specific object within the input. Consider the following input with nested configuration data:

```
{
  "environment": "production",
  "servers": {
    "web": {
      "server1": {"port": 80, "status": "active"},
      "server2": {"port": 8080, "status": "inactive"}
    },
    "database": {
      "primary": {"host": "db1.example.com", "port": 5432},
      "replica": {"host": "db2.example.com", "port": 5432}
    }
  }
}
```

To iterate over the web servers object, you would set `ItemsPath` to `$.servers.web`:

```
{
  "StartAt": "ProcessWebServers",
  "States": {
    "ProcessWebServers": {
      "Type": "Map",
      "ItemsPath": "$.servers.web",
      "ItemProcessor": {
         "StartAt": "CheckServer",
         "States": {
           "CheckServer": {
             "Type": "Pass",
             "End": true
           }
         }
      },
      "End": true
    }
  }
}
```
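The selection that `ItemsPath` performs is equivalent to walking the nested keys of the effective input. The following local sketch shows that equivalence with plain dictionary navigation; it illustrates the semantics and is not the ASL interpreter:

```python
state_input = {
    "environment": "production",
    "servers": {
        "web": {
            "server1": {"port": 80, "status": "active"},
            "server2": {"port": 8080, "status": "inactive"},
        },
        "database": {
            "primary": {"host": "db1.example.com", "port": 5432},
            "replica": {"host": "db2.example.com", "port": 5432},
        },
    },
}

# "$.servers.web" walks two keys down from the root of the effective input.
items = state_input["servers"]["web"]

# The Map state runs one iteration per key-value pair in the selected object.
print(sorted(items))  # ['server1', 'server2']
```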

When processing input, the `Map` state applies `ItemsPath` after [`InputPath`](input-output-inputpath-params.md#input-output-inputpath). It operates on the effective input to the state after `InputPath` filters the input.

For more information on `Map` states, see the following:
+  [Map state](state-map.md) 
+ [Map state processing modes](state-map.md#concepts-map-process-modes)
+ [Repeat actions with Inline Map](tutorial-map-inline.md)
+ [Inline `Map` state input and output processing](state-map-inline.md#inline-map-state-output)

# ItemSelector (Map)
ItemSelector

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

By default, the effective input for the `Map` state is the set of individual data items present in the raw state input. With the `ItemSelector` field, you can override the data items’ values before they’re passed on to the `Map` state. 

To override the values, specify a valid JSON input that contains a collection of key-value pairs. The pairs can be static values provided in your state machine definition, values selected from the state input using a [path](amazon-states-language-paths.md), or values accessed from the [Context object](input-output-contextobject.md). 

If you specify key-value pairs using a path or Context object, the key name must end in `.$`.

**Note**  
The `ItemSelector` field replaces the `Parameters` field within the `Map` state. If you use the `Parameters` field in your `Map` state definitions to create custom input, we recommend that you replace them with `ItemSelector`.

You can specify the `ItemSelector` field in both an *Inline Map state* and a *Distributed Map state*.

For example, consider the following JSON input that contains an array of three items within the `imageData` node. For each *`Map` state iteration*, an array item is passed to the iteration as input.

```
[
  {
    "resize": "true",
    "format": "jpg"
  },
  {
    "resize": "false",
    "format": "png"
  },
  {
    "resize": "true",
    "format": "jpg"
  }
]
```

Using the `ItemSelector` field, you can define a custom JSON input to override the original input as shown in the following example. Step Functions then passes this custom input to each *`Map` state iteration*. The custom input contains a static value for `size` and a value from the `Map` state Context object. The `$$.Map.Item.Value` Context object entry contains the value of each individual data item.

```
{
  "ItemSelector": {
    "size": 10,
    "value.$": "$$.Map.Item.Value"
  }
}
```

The following example shows the input received by one iteration of the *Inline Map state*:

```
{
  "size": 10,
  "value": {
    "resize": "true",
    "format": "jpg"
  }
}
```
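The override can be sketched as a per-item merge: the static `size` value is combined with each item bound to `$$.Map.Item.Value`. This is a local illustration of the behavior, not the interpreter itself:

```python
# The imageData array from the original state input.
image_data = [
    {"resize": "true", "format": "jpg"},
    {"resize": "false", "format": "png"},
    {"resize": "true", "format": "jpg"},
]

# For each iteration, the ItemSelector template is evaluated with
# $$.Map.Item.Value bound to the current item.
iteration_inputs = [{"size": 10, "value": item} for item in image_data]

print(iteration_inputs[0])
# {'size': 10, 'value': {'resize': 'true', 'format': 'jpg'}}
```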

**Tip**  
For a complete example of a *Distributed Map state* that uses the `ItemSelector` field, see [Copy large-scale CSV using Distributed Map](tutorial-map-distributed.md).

# ItemBatcher (Map)
ItemBatcher

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

The `ItemBatcher` field is a JSON object that specifies processing a group of items in a single child workflow execution. Use batching when processing large CSV files, JSON arrays, or large sets of Amazon S3 objects.

The following example shows the syntax of the `ItemBatcher` field. In the following syntax, the maximum number of items that each child workflow execution should process is set to 100.

```
{
  "ItemBatcher": {
    "MaxItemsPerBatch": 100
  }
}
```

By default, each item in a dataset is passed as input to individual child workflow executions. For example, assume you specify a JSON file as input that contains the following array:

```
[
  {
    "verdict": "true",
    "statement_date": "6/11/2008",
    "statement_source": "speech"
  },
  {
    "verdict": "false",
    "statement_date": "6/7/2022",
    "statement_source": "television"
  },
  {
    "verdict": "true",
    "statement_date": "5/18/2016",
    "statement_source": "news"
  },
  ...
]
```

For the given input, each child workflow execution receives an array item as its input. The following example shows the input of a child workflow execution:

```
{
  "verdict": "true",
  "statement_date": "6/11/2008",
  "statement_source": "speech"
}
```

To help optimize the performance and cost of your processing job, select a batch size that balances the number of items against the item processing time. If you use batching, Step Functions adds the items to an **Items** array. It then passes the array as input to each child workflow execution. The following example shows a batch of two items passed as input to a child workflow execution:

```
{
  "Items": [
    {
      "verdict": "true",
      "statement_date": "6/11/2008",
      "statement_source": "speech"
    },
    {
      "verdict": "false",
      "statement_date": "6/7/2022",
      "statement_source": "television"
    }
  ]
}
```
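Batching by count can be sketched as simple chunking: with a maximum of two items per batch, the statements above are grouped into `Items` arrays of at most two. This is an illustration of the grouping, not the service implementation:

```python
def batch_by_count(items, max_items_per_batch):
    """Group items into Items arrays of at most max_items_per_batch."""
    return [
        {"Items": items[i : i + max_items_per_batch]}
        for i in range(0, len(items), max_items_per_batch)
    ]

statements = [
    {"verdict": "true", "statement_date": "6/11/2008", "statement_source": "speech"},
    {"verdict": "false", "statement_date": "6/7/2022", "statement_source": "television"},
    {"verdict": "true", "statement_date": "5/18/2016", "statement_source": "news"},
]

batches = batch_by_count(statements, 2)
print([len(b["Items"]) for b in batches])  # [2, 1]
```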

**Tip**  
To learn more about using the `ItemBatcher` field in your workflows, try the following tutorials and workshop:  
[Process an entire batch of data within a Lambda function](tutorial-itembatcher-param-task.md)
[Iterate over items in a batch inside child workflow executions](tutorial-itembatcher-single-item-process.md)
[Distributed map and related resources](https://catalog.workshops.aws/stepfunctions/use-cases/distributed-map) in *The AWS Step Functions Workshop*

**Contents**
+ [Fields to specify item batching](#input-output-itembatcher-subfields)

## Fields to specify item batching


To batch items, specify the maximum number of items per batch, the maximum batch size in bytes, or both. You must specify at least one of these values. 

**Max items per batch**  
Specifies the maximum number of items that each child workflow execution processes. The interpreter limits the number of items batched in the `Items` array to this value. If you specify both a batch number and size, the interpreter reduces the number of items in a batch to avoid exceeding the specified batch size limit.   
If you don't specify this value but provide a value for maximum batch size, Step Functions processes as many items as possible in each child workflow execution without exceeding the maximum batch size in bytes.  
For example, imagine you run an execution with an input JSON file that contains 1130 nodes. If you specify a maximum items value for each batch of 100, Step Functions creates 12 batches. Of these, 11 batches contain 100 items each, while the twelfth batch contains the remaining 30 items.  
Alternatively, you can specify the maximum items for each batch as a [reference path](amazon-states-language-paths.md#amazon-states-language-reference-paths) to an existing key-value pair in your *Distributed Map state* input. This path must resolve to a positive integer.  
For example, given the following input:  

```
{
  "maxBatchItems": 500
}
```
You can specify the maximum number of items to batch using a reference path (**JSONPath only**) as follows:  

```
{
  ...
  "Map": {
    "Type": "Map",
    "MaxConcurrency": 2000,
    "ItemBatcher": {
      "MaxItemsPerBatchPath": "$.maxBatchItems"
    }
    ...
    ...
  }
}
```
For **JSONata-based** states, you can also provide a JSONata expression that evaluates to a positive integer.  
You can specify either the `MaxItemsPerBatch` or the `MaxItemsPerBatchPath` (JSONPath only) sub-field, but not both.

**Max KiB per batch**  
Specifies the maximum size of a batch in bytes, up to 256 KiB. If you specify both a maximum batch number and size, Step Functions reduces the number of items in a batch to avoid exceeding the specified batch size limit.  
Alternatively, you can specify the maximum batch size as a [reference path](amazon-states-language-paths.md#amazon-states-language-reference-paths) to an existing key-value pair in your *Distributed Map state* input. This path must resolve to a positive integer.  
If you use batching and don't specify a maximum batch size, the interpreter processes as many items as it can, up to 256 KiB, in each child workflow execution.
For example, given the following input:  

```
{
  "batchSize": 131072
}
```
You can specify the maximum batch size using a reference path as follows:  

```
{
  ...
  "Map": {
    "Type": "Map",
    "MaxConcurrency": 2000,
    "ItemBatcher": {
      "MaxInputBytesPerBatchPath": "$.batchSize"
    }
    ...
    ...
  }
}
```
For **JSONata-based** states, you can also provide a JSONata expression that evaluates to a positive integer.  
You can specify either the `MaxInputBytesPerBatch` or the `MaxInputBytesPerBatchPath` (JSONPath only) sub-field, but not both. 
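Size-based batching can be sketched as greedy packing: items are added to a batch until the next item would push the serialized size past the limit. The byte counting here (UTF-8 length of each JSON-encoded item) is an approximation for illustration; the service's accounting may differ:

```python
import json

def batch_by_bytes(items, max_bytes_per_batch):
    """Greedily pack items into batches whose serialized size stays under the limit."""
    batches, current, current_size = [], [], 0
    for item in items:
        item_size = len(json.dumps(item).encode("utf-8"))
        # Flush the current batch if adding this item would exceed the limit.
        if current and current_size + item_size > max_bytes_per_batch:
            batches.append({"Items": current})
            current, current_size = [], 0
        current.append(item)
        current_size += item_size
    if current:
        batches.append({"Items": current})
    return batches

# Ten uniform ~64-byte items with a 200-byte limit pack three per batch.
items = [{"id": i, "payload": "x" * 40} for i in range(10)]
batches = batch_by_bytes(items, 200)
print([len(b["Items"]) for b in batches])  # [3, 3, 3, 1]
```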

**Batch input**  
Optionally, you can also specify a fixed JSON input to include in each batch passed to each child workflow execution. Step Functions merges this input with the input for each individual child workflow execution. For example, given the following fixed input of a fact check date on an array of items:  

```
"ItemBatcher": {
    "BatchInput": {
        "factCheck": "December 2022"
    }
}
```
Each child workflow execution receives the following as input:  

```
{
  "BatchInput": {
    "factCheck": "December 2022"
  },
  "Items": [
    {
      "verdict": "true",
      "statement_date": "6/11/2008",
      "statement_source": "speech"
    },
    {
      "verdict": "false",
      "statement_date": "6/7/2022",
      "statement_source": "television"
    },
    ...
  ]
}
```
For **JSONata-based** states, you can provide JSONata expressions directly to `BatchInput`, or use JSONata expressions inside JSON objects or arrays.

# ResultWriter (Map)
ResultWriter

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

The `ResultWriter` field is a JSON object that provides options for the output results of the child workflow executions started by a Distributed Map state. You can specify different formatting options for the output results along with the Amazon S3 location to store them if you choose to export them. Step Functions doesn't export these results by default.

**Topics**
+ [Contents of the ResultWriter field](#input-output-resultwriter-field-contents)
+ [Examples](#input-output-resultwriter-examples)
+ [Exporting to Amazon S3](#input-output-resultwriter-exporting-to-S3)
+ [IAM policies for ResultWriter](#resultwriter-iam-policies)

## Contents of the ResultWriter field


The `ResultWriter` field contains the following sub-fields. The choice of fields determines how the output is formatted and whether it's exported to Amazon S3.

**`ResultWriter`**  
A JSON object that specifies the following details:  
+ `Resource`

  The Amazon S3 API action that Step Functions invokes to export the execution results.
+ `Parameters`

  A JSON object that specifies the Amazon S3 bucket name and prefix that stores the execution output.
+ `WriterConfig`

  This field enables you to configure the following options.
  + `Transformation`
    + `NONE` - returns the output of the child workflow executions unchanged, in addition to the workflow metadata. Default when exporting the child workflow execution results to Amazon S3 and `WriterConfig` is not specified.
    + `COMPACT` - returns the output of the child workflow executions. Default when `ResultWriter` is not specified. 
    + `FLATTEN` - returns the output of the child workflow executions. If a child workflow execution returns an array, this option flattens the array, prior to returning the result to a state output or writing the result to an Amazon S3 object.
**Note**  
If a child workflow execution fails, Step Functions returns its execution result unchanged. The results would be equivalent to having set `Transformation` to `NONE`.
  + `OutputType`
    + `JSON` - formats the results as a JSON array.
    + `JSONL` - formats the results as JSON Lines.

**Required field combinations**  
The `ResultWriter` field cannot be empty. You must specify one of these sets of sub-fields.
+ `WriterConfig` - to preview the formatted output, without saving the results to Amazon S3.
+ `Resource` and `Parameters` - to save the results to Amazon S3 without additional formatting.
+ All three fields: `WriterConfig`, `Resource` and `Parameters` - to format the output and save it to Amazon S3.

## Example configurations and transformation output
Examples

The following topics demonstrate the possible configuration settings for `ResultWriter` and examples of processed results from the different transformation options.
+ [ResultWriter configurations](#input-output-resultwriter-example-configurations)
+ [Transformations](#input-output-resultwriter-example-transformations)

### Examples of ResultWriter configurations


The following examples demonstrate configurations with the possible combinations of the three fields: `WriterConfig`, `Resource`, and `Parameters`.

**Only *WriterConfig***  
This example configures how the state output is presented in preview, with the output format and transformation specified in the `WriterConfig` field. Because the `Resource` and `Parameters` fields, which would otherwise provide the Amazon S3 bucket specification, are absent, the results become the *state output* and are passed on to the next state.

```
"ResultWriter": {
    "WriterConfig": { 
        "Transformation": "FLATTEN", 
        "OutputType": "JSON"
    }
}
```

**Only *Resource* and *Parameters***  
This example exports the state output to the specified Amazon S3 bucket, without the additional formatting and transformation that the non-existent `WriterConfig` field would have specified.

```
"ResultWriter": {
    "Resource": "arn:aws:states:::s3:putObject",
    "Parameters": {
        "Bucket": "amzn-s3-demo-destination-bucket",
        "Prefix": "csvProcessJobs"
    }
}
```

**All three fields: *WriterConfig*, *Resource*, and *Parameters***  
This example formats the state output according to the specifications in the `WriterConfig` field. It also exports it to an Amazon S3 bucket according to the specifications in the `Resource` and `Parameters` fields.

```
"ResultWriter": {
     "WriterConfig": { 
        "Transformation": "FLATTEN",
        "OutputType": "JSON"
    },
    "Resource": "arn:aws:states:::s3:putObject",
    "Parameters": {
        "Bucket": "amzn-s3-demo-destination-bucket",
        "Prefix": "csvProcessJobs"
    }
}
```

### Examples of transformations


For these examples, assume that each child workflow execution returns an array of objects as its output. 

```
[
  {
    "customer_id": "145538",
    "order_id": "100000"
  },
  {
    "customer_id": "898037",
    "order_id": "100001"
  }
]
```

These examples demonstrate the formatted output for different `Transformation` values, with `OutputType` of `JSON`. 

**Transformation NONE**  


This is an example of the processed result when you use the `NONE` transformation. The output is unchanged, and it includes the workflow metadata.

```
[
    {
        "ExecutionArn": "arn:aws:states:region:account-id:execution:orderProcessing/getOrders:da4e9fc7-abab-3b27-9a77-a277e463b709",
        "Input": ...,
        "InputDetails": {
            "Included": true
        },
        "Name": "da4e9fc7-abab-3b27-9a77-a277e463b709",
        "Output": "[{\"customer_id\":\"145538\",\"order_id\":\"100000\"},{\"customer_id\":\"898037\",\"order_id\":\"100001\"}]",
        "OutputDetails": {
            "Included": true
        },
        "RedriveCount": 0,
        "RedriveStatus": "NOT_REDRIVABLE",
        "RedriveStatusReason": "Execution is SUCCEEDED and cannot be redriven",
        "StartDate": "2025-02-04T01:49:50.099Z",
        "StateMachineArn": "arn:aws:states:region:account-id:stateMachine:orderProcessing/getOrders",
        "Status": "SUCCEEDED",
        "StopDate": "2025-02-04T01:49:50.163Z"
    },
    ...
    {
        "ExecutionArn": "arn:aws:states:region:account-id:execution:orderProcessing/getOrders:f43a56f7-d21e-3fe9-a40c-9b9b8d0adf5a",
        "Input": ...,
        "InputDetails": {
            "Included": true
        },
        "Name": "f43a56f7-d21e-3fe9-a40c-9b9b8d0adf5a",
        "Output": "[{\"customer_id\":\"169881\",\"order_id\":\"100005\"},{\"customer_id\":\"797471\",\"order_id\":\"100006\"}]",
        "OutputDetails": {
            "Included": true
        },
        "RedriveCount": 0,
        "RedriveStatus": "NOT_REDRIVABLE",
        "RedriveStatusReason": "Execution is SUCCEEDED and cannot be redriven",
        "StartDate": "2025-02-04T01:49:50.135Z",
        "StateMachineArn": "arn:aws:states:region:account-id:stateMachine:orderProcessing/getOrders",
        "Status": "SUCCEEDED",
        "StopDate": "2025-02-04T01:49:50.227Z"
    }
]
```

**Transformation COMPACT**  
This is an example of the processed result when you use the `COMPACT` transformation. Note that it’s the combined output of the child workflow executions, with each execution’s original array structure preserved.

```
[
    [
        {
            "customer_id": "145538",
            "order_id": "100000"
        },
        {
            "customer_id": "898037",
            "order_id": "100001"
        }
    ],
    ...,
    
    [
        {
            "customer_id": "169881",
            "order_id": "100005"
        },
        {
            "customer_id": "797471",
            "order_id": "100006"
        }
    ]
]
```

**Transformation FLATTEN**  
This is an example of the processed result when you use the `FLATTEN` transformation. Note that it’s the combined output of the child workflow executions, with the per-execution arrays flattened into one array.

```
[
    {
        "customer_id": "145538",
        "order_id": "100000"
    },
    {
        "customer_id": "898037",
        "order_id": "100001"
    },
    ...
    {
        "customer_id": "169881",
        "order_id": "100005"
    },
    {
        "customer_id": "797471",
        "order_id": "100006"
    }
]
```
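The `COMPACT` and `FLATTEN` transformations can be sketched over two child-workflow outputs, each an array of objects as in the example above: `COMPACT` keeps the nested arrays, while `FLATTEN` concatenates them into one array. This is a local illustration of the transformations, not the service implementation:

```python
from itertools import chain

# Outputs of two child workflow executions, each an array of objects.
child_outputs = [
    [{"customer_id": "145538", "order_id": "100000"},
     {"customer_id": "898037", "order_id": "100001"}],
    [{"customer_id": "169881", "order_id": "100005"},
     {"customer_id": "797471", "order_id": "100006"}],
]

# COMPACT: combined output, preserving each execution's array structure.
compact = child_outputs

# FLATTEN: one level of nesting removed, yielding a single array of objects.
flatten = list(chain.from_iterable(child_outputs))

print(len(compact), len(flatten))  # 2 4
```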

## Exporting to Amazon S3


**Important**  
Make sure that the Amazon S3 bucket you use to export the results of a Map Run is under the same AWS account and AWS Region as your state machine. Otherwise, your state machine execution will fail with the `States.ResultWriterFailed` error.

Exporting the results to an Amazon S3 bucket is helpful if your output payload size exceeds 256 KiB. Step Functions consolidates all child workflow execution data, such as execution input and output, ARN, and execution status. It then exports executions with the same status to their respective files in the specified Amazon S3 location. 

The following example, using **JSONPath**, shows the syntax of the `ResultWriter` field with `Parameters` to export the child workflow execution results. In this example, you store the results in a bucket named `amzn-s3-demo-destination-bucket` within a prefix called `csvProcessJobs`. 

```
{
  "ResultWriter": {
    "Resource": "arn:aws:states:::s3:putObject",
    "Parameters": {
      "Bucket": "amzn-s3-demo-destination-bucket",
      "Prefix": "csvProcessJobs"
    }
  }
}
```

For **JSONata** states, use `Arguments` in place of `Parameters`.

```
{
  "ResultWriter": {
    "Resource": "arn:aws:states:::s3:putObject",
    "Arguments": {
      "Bucket": "amzn-s3-demo-destination-bucket",
      "Prefix": "csvProcessJobs"
    }
  }
}
```

**Tip**  
In Workflow Studio, you can export the child workflow execution results by selecting **Export Map state results to Amazon S3**. Then, provide the name of the Amazon S3 bucket and the prefix where you want to export the results.

Step Functions needs appropriate permissions to access the bucket and folder where you want to export the results. For information about the required IAM policy, see [IAM policies for ResultWriter](#resultwriter-iam-policies).

If you export the child workflow execution results, the *Distributed Map state* execution returns the Map Run ARN and data about the Amazon S3 export location in the following format:

```
{
  "MapRunArn": "arn:aws:states:us-east-2:account-id:mapRun:csvProcess/Map:ad9b5f27-090b-3ac6-9beb-243cd77144a7",
  "ResultWriterDetails": {
    "Bucket": "amzn-s3-demo-destination-bucket",
    "Key": "csvProcessJobs/ad9b5f27-090b-3ac6-9beb-243cd77144a7/manifest.json"
  }
}
```
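
As the example shows, the manifest key combines your configured prefix with the Map Run ID (the UUID at the end of the Map Run ARN). The following is a minimal sketch of that relationship, assuming the Map Run ID is always the final colon-separated segment of the ARN:

```python
def manifest_key(map_run_arn: str, prefix: str) -> str:
    """Build the expected manifest.json key from a Map Run ARN and an
    export prefix. Assumes the Map Run ID is the last ARN segment."""
    map_run_id = map_run_arn.rsplit(":", 1)[-1]
    return f"{prefix}/{map_run_id}/manifest.json"

arn = ("arn:aws:states:us-east-2:account-id:mapRun:"
       "csvProcess/Map:ad9b5f27-090b-3ac6-9beb-243cd77144a7")
print(manifest_key(arn, "csvProcessJobs"))
# csvProcessJobs/ad9b5f27-090b-3ac6-9beb-243cd77144a7/manifest.json
```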

Step Functions exports executions with the same status to their respective files. For example, if your child workflow executions produced 500 successful and 200 failed results, Step Functions creates two files in the specified Amazon S3 location: a success results file containing the 500 successful results and a failure results file containing the 200 failed results.

For a given execution attempt, Step Functions creates the following files in the specified Amazon S3 location depending on your execution output:
+ `manifest.json` – Contains Map Run metadata, such as export location, Map Run ARN, and information about the result files.

  If you've [redriven](redrive-map-run.md) a Map Run, the `manifest.json` file contains references to all the successful child workflow executions across all the attempts of a Map Run. However, this file contains references to the failed and pending executions for a specific redrive.
+ `SUCCEEDED_n.json` – Contains the consolidated data for all successful child workflow executions. *n* represents the index number of the file. The index number starts from 0. For example, `SUCCEEDED_1.json`.
+ `FAILED_n.json` – Contains the consolidated data for all failed, timed out, and aborted child workflow executions. Use this file to recover from failed executions. *n* represents the index of the file. The index number starts from 0. For example, `FAILED_1.json`.
+ `PENDING_n.json` – Contains the consolidated data for all child workflow executions that weren’t started because the Map Run failed or aborted. *n* represents the index of the file. The index number starts from 0. For example, `PENDING_1.json`.

Step Functions supports individual result files of up to 5 GB. If a file size exceeds 5 GB, Step Functions creates another file to write the remaining execution results and appends an index number to the file name. For example, if the size of the `SUCCEEDED_0.json` file exceeds 5 GB, Step Functions creates a `SUCCEEDED_1.json` file to record the remaining results.
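
The file-splitting behavior can be sketched as follows. This is a simplified illustration with a small size limit; the actual serialization and the 5 GB threshold are handled by Step Functions:

```python
import json

def split_results(status: str, results: list, max_bytes: int) -> dict:
    """Assign serialized results to indexed files, starting a new file
    whenever the current one would exceed max_bytes (illustrative only)."""
    files, current, size, index = {}, [], 0, 0
    for result in results:
        encoded = len(json.dumps(result).encode("utf-8"))
        if current and size + encoded > max_bytes:
            files[f"{status}_{index}.json"] = current
            current, size, index = [], 0, index + 1
        current.append(result)
        size += encoded
    if current:
        files[f"{status}_{index}.json"] = current
    return files

# With a tiny 60-byte limit, three 22-byte records split across two files.
records = [{"order_id": f"10000{i}"} for i in range(3)]
print(sorted(split_results("SUCCEEDED", records, 60)))
# ['SUCCEEDED_0.json', 'SUCCEEDED_1.json']
```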

If you don't export the child workflow execution results, the state machine execution returns an array of child workflow execution results, as shown in the following example:

```
[
  {
    "statusCode": 200,
    "inputReceived": {
      "show_id": "s1",
      "release_year": "2020",
      "rating": "PG-13",
      "type": "Movie"
    }
  },
  {
    "statusCode": 200,
    "inputReceived": {
      "show_id": "s2",
      "release_year": "2021",
      "rating": "TV-MA",
      "type": "TV Show"
    }
  },
  ...
]
```

**Note**  
If the returned output size exceeds 256 KiB, the state machine execution fails and returns a `States.DataLimitExceeded` error.
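
A quick way to reason about that quota: 256 KiB is 262,144 bytes of the UTF-8 serialized payload. The following is a hedged sketch of such a pre-check; the exact counting Step Functions applies may differ, for example in whitespace handling:

```python
import json

PAYLOAD_LIMIT_BYTES = 256 * 1024  # 256 KiB

def within_payload_limit(payload) -> bool:
    """Return True if the UTF-8 serialized payload fits the 256 KiB quota."""
    return len(json.dumps(payload).encode("utf-8")) <= PAYLOAD_LIMIT_BYTES

small = [{"statusCode": 200}] * 100
print(within_payload_limit(small))  # True

# Roughly 330 KiB of results exceeds the quota, so export to S3 instead.
large = [{"statusCode": 200, "body": "x" * 300}] * 1000
print(within_payload_limit(large))  # False
```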

## IAM policies for ResultWriter


When you create workflows with the Step Functions console, Step Functions can automatically generate IAM policies based on the resources in your workflow definition. Generated policies include the least privileges necessary to allow the state machine role to invoke the `[StartExecution](https://docs.aws.amazon.com/step-functions/latest/apireference/API_StartExecution.html)` API action for the *Distributed Map state* and access AWS resources, such as Amazon S3 buckets and objects, and Lambda functions.

We recommend including only the necessary permissions in your IAM policies. For example, if your workflow includes a `Map` state in Distributed mode, scope your policies down to the specific Amazon S3 bucket and folder that contains your data.

**Important**  
If you specify an Amazon S3 bucket and object, or prefix, with a [reference path](amazon-states-language-paths.md#amazon-states-language-reference-paths) to an existing key-value pair in your *Distributed Map state* input, make sure that you update the IAM policies for your workflow. Scope the policies down to the bucket and object names the path resolves to at runtime.

The following IAM policy example grants the least privileges required to write your child workflow execution results to a folder named *csvJobs* in an Amazon S3 bucket using the `[PutObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html)` API action.


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListMultipartUploadParts",
                "s3:AbortMultipartUpload"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-destination-bucket/csvJobs/*"
            ]
        }
    ]
}
```

If the Amazon S3 bucket to which you're writing the child workflow execution result is encrypted using an AWS Key Management Service (AWS KMS) key, you must include the necessary AWS KMS permissions in your IAM policy. For more information, see [IAM permissions for AWS KMS key encrypted Amazon S3 bucket](iam-policies-eg-dist-map.md#multiupload-dmap-result-policy).
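
For such a bucket, the statement you add typically resembles the following sketch. The key ARN is a placeholder, and you should confirm the exact permissions your configuration requires in the linked topic:

```
{
    "Effect": "Allow",
    "Action": [
        "kms:GenerateDataKey",
        "kms:Decrypt"
    ],
    "Resource": "arn:aws:kms:region:account-id:key/key-id"
}
```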

# How Step Functions parses input CSV files
Parsing input CSV files

**Managing state and transforming data**  
Learn about [Passing data between states with variables](workflow-variables.md) and [Transforming data with JSONata](transforming-data.md).

Step Functions parses delimited text files based on the following rules:
+ The delimiter that separates fields is specified by `CSVDelimiter` in *ReaderConfig*. The delimiter defaults to `COMMA`.
+ A newline is the delimiter that separates **records**.
+ Fields are treated as strings. For data type conversions, use the `States.StringToJson` intrinsic function in [ItemSelector (Map)](input-output-itemselector.md).
+ Double quotation marks (`""`) are not required to enclose strings. However, strings that are enclosed in double quotation marks can contain commas and newlines without those characters acting as field or record delimiters.
+ You can preserve double quotes by repeating them.
+ Backslashes (`\`) are another way to escape special characters. Backslashes only work with other backslashes, double quotation marks, and the configured field delimiter, such as a comma or pipe. A backslash followed by any other character is silently removed.
+ You can preserve backslashes by repeating them. For example: 

  ```
  path,size
  C:\\Program Files\\MyApp.exe,6534512
  ```
+ Backslashes that escape double quotation marks (`\"`) only work when included in pairs, so we recommend escaping double quotation marks by repeating them: `""`.
+ If the number of fields in a row is **less** than the number of fields in the header, Step Functions provides **empty strings** for the missing values.
+ If the number of fields in a row is **more** than the number of fields in the header, Step Functions **skips** the additional fields.
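
The last two header-matching rules can be sketched in Python. This illustrates only the padding and truncation behavior, not the full escaping logic:

```python
def row_to_record(headers: list[str], fields: list[str]) -> dict:
    """Map parsed fields onto headers: pad short rows with empty strings,
    drop extra fields beyond the header count (illustrative only)."""
    padded = fields + [""] * (len(headers) - len(fields))
    return dict(zip(headers, padded))  # zip stops at the header count

headers = ["MyLetters", "MyNumbers", "MyString"]
print(row_to_record(headers, ["abc", "123"]))            # short row: MyString -> ""
print(row_to_record(headers, ["abc", "123", "x", "y"]))  # long row: "y" is skipped
```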

**Example of parsing an input CSV file**  
Say that you have provided a CSV file named `myCSVInput.csv` that contains one row as input. Then, you've stored this file in an Amazon S3 bucket that's named `amzn-s3-demo-bucket`. The CSV file is as follows.

```
abc,123,"This string contains commas, a double quotation mark (""), and a newline (
)",{""MyKey"":""MyValue""},"[1,2,3]"
```

The following state machine reads this CSV file and uses [ItemSelector (Map)](input-output-itemselector.md) to convert the data types of some of the fields.

```
{
  "StartAt": "Map",
  "States": {
    "Map": {
      "Type": "Map",
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "DISTRIBUTED",
          "ExecutionType": "STANDARD"
        },
        "StartAt": "Pass",
        "States": {
          "Pass": {
            "Type": "Pass",
            "End": true
          }
        }
      },
      "End": true,
      "Label": "Map",
      "MaxConcurrency": 1000,
      "ItemReader": {
        "Resource": "arn:aws:states:::s3:getObject",
        "ReaderConfig": {
          "InputType": "CSV",
          "CSVHeaderLocation": "GIVEN",
          "CSVHeaders": [
            "MyLetters",
            "MyNumbers",
            "MyString",
            "MyObject",
            "MyArray"
          ]
        },
        "Parameters": {
          "Bucket": "amzn-s3-demo-bucket",
          "Key": "myCSVInput.csv"
        }
      },
      "ItemSelector": {
        "MyLetters.$": "$$.Map.Item.Value.MyLetters",
        "MyNumbers.$": "States.StringToJson($$.Map.Item.Value.MyNumbers)",
        "MyString.$": "$$.Map.Item.Value.MyString",
        "MyObject.$": "States.StringToJson($$.Map.Item.Value.MyObject)",
        "MyArray.$": "States.StringToJson($$.Map.Item.Value.MyArray)"
      }
    }
  }
}
```

When you run this state machine, it produces the following output.

```
[
  {
    "MyNumbers": 123,
    "MyObject": {
      "MyKey": "MyValue"
    },
    "MyString": "This string contains commas, a double quotation mark (\"), and a newline (\n)",
    "MyLetters": "abc",
    "MyArray": [
      1,
      2,
      3
    ]
  }
]
```