Parameter template files for HealthOmics workflows
Parameter templates define the input parameters for a workflow. You can define input parameters to make your workflow more flexible and versatile. For example, you can define a parameter for the Amazon S3 location of the reference genome files. Users can then run the workflow using various data sets.
You can create the parameter template for your workflow, or HealthOmics can generate the parameter template for you.
The parameter template is a JSON file. In the file, each input parameter is a named object that must match the name of the workflow input. When you start a run, if you don't provide values for all the required parameters, the run fails.
The input parameter object includes the following attributes:
description – This required attribute is a string that the console displays in the Start run page. This description is also retained as run metadata.
optional – This optional attribute indicates whether the input parameter is optional. If you don't specify the optional field, the input parameter is required.
The following example parameter template shows how to specify the input parameters.
{ "myRequiredParameter1": { "description": "this parameter is required" }, "myRequiredParameter2": { "description": "this parameter is also required", "optional": false }, "myOptionalParameter": { "description": "this parameter is optional", "optional": true } }
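As a rough illustration of the rule that a run fails when required parameters are missing, the following Python sketch checks a set of run inputs against a parameter template before a run is started. The helper name missing_required and the S3 URI are hypothetical and are not part of any HealthOmics API. The sketch only mirrors the documented default that a parameter is required unless optional is true.

```python
import json

# Hypothetical helper, not part of any HealthOmics SDK. It applies the
# documented rule: a parameter is required unless "optional" is true.
def missing_required(template: dict, run_inputs: dict) -> list:
    """Return the names of required template parameters absent from run_inputs."""
    return sorted(
        name
        for name, spec in template.items()
        if not spec.get("optional", False) and name not in run_inputs
    )

template = json.loads("""
{
  "myRequiredParameter1": {"description": "this parameter is required"},
  "myRequiredParameter2": {"description": "this parameter is also required", "optional": false},
  "myOptionalParameter": {"description": "this parameter is optional", "optional": true}
}
""")

# Placeholder value; only the parameter names matter for this check.
run_inputs = {"myRequiredParameter1": "s3://amzn-s3-demo-bucket/reference.fasta"}

print(missing_required(template, run_inputs))
```

Running a check like this locally can surface a missing required parameter before the run is submitted.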
Generating parameter templates
HealthOmics generates the parameter template by parsing the workflow definition to detect input parameters. If you provide a parameter template file for a workflow, the parameters in your file override the parameters detected in the workflow definition.
There are slight differences between the parsing logic of the CWL, WDL, and Nextflow engines, as described in the following sections.
Parameter detection for CWL
In the CWL workflow engine, the parsing logic makes the following assumptions:
- Any nullable supported types are marked as optional input parameters.
- Any non-null supported types are marked as required input parameters.
- Descriptions are extracted from the label section of the main workflow definition. If label is not specified, the description will be blank (an empty string).
The following tables show CWL interpolation examples. For each example, the parameter name is x. If the parameter is required, you must provide a value for the parameter. If the parameter is optional, you don't need to provide a value.
This table shows CWL interpolation examples for primitive types.
Input | Example input/output | Required |
---|---|---|
int | 1 or 2 or ... | Yes |
int with default value 2 | Default value is 2. Valid input is 1 or 2 or ... | Yes |
int? (nullable int) | Valid input is None or 1 or 2 or ... | No |
int? with default value 2 | Default value is 2. Valid input is None or 1 or 2 or ... | No |
The following table shows CWL interpolation examples for complex types. A complex type is a collection of primitive types.
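Input | Example input/output | Required |
---|---|---|
int[] (array of int) | [] or [1,2,3] | Yes |
int[]? (nullable array of int) | None or [] or [1,2,3] | No |
array of nullable int | [] or [None, 3, None] | Yes |
nullable array of nullable int | [None] or None or [1,2,3] or [None, 3] but not [] | No |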
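The CWL assumptions listed above can be sketched in Python as follows. This is only an approximation of the described behavior, not the HealthOmics parser: the function names are hypothetical, and the input is assumed to be the inputs section of a CWL document that has already been loaded into Python dictionaries.

```python
# Hypothetical sketch of the documented CWL rules. Nullable types are
# optional, non-null types are required, and the label is the description.
def cwl_input_is_optional(cwl_type) -> bool:
    """True when the CWL type allows null at the top level."""
    if isinstance(cwl_type, str):
        # Shorthand such as "int?" or "int[]?", or the literal "null" type.
        return cwl_type == "null" or cwl_type.endswith("?")
    if isinstance(cwl_type, list):
        # A union is nullable when any member is nullable.
        return any(cwl_input_is_optional(member) for member in cwl_type)
    # A schema object (array, record, enum) is non-null at the top level.
    return False

def template_from_cwl_inputs(inputs: dict) -> dict:
    template = {}
    for name, spec in inputs.items():
        if isinstance(spec, dict) and "type" in spec:
            cwl_type, label = spec["type"], spec.get("label", "")
        else:
            cwl_type, label = spec, ""  # shorthand form: name mapped to a type
        template[name] = {
            "description": label,
            "optional": cwl_input_is_optional(cwl_type),
        }
    return template

inputs = {
    "a": "int",
    "b": {"type": "int", "default": 2, "label": "b has a default"},
    "c": "int?",
    "d": {"type": ["null", {"type": "array", "items": ["null", "int"]}]},
    "e": {"type": {"type": "array", "items": ["int", "null"]}},
}
for name, entry in template_from_cwl_inputs(inputs).items():
    print(name, entry)
```

In this sketch a default value does not change the result, which follows the rule that only nullability decides whether a CWL input is optional.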
Parameter detection for WDL
In the WDL workflow engine, the parsing logic makes the following assumptions:
- Any nullable supported types are marked as optional input parameters.
- For non-nullable supported types:
  - Any input variable that is assigned a literal or an expression is marked as an optional parameter. For example:
    Int x = 2
    Float f0 = 1.0 + f1
  - If no value or expression has been assigned to an input parameter, it is marked as a required parameter.
- Descriptions are extracted from parameter_meta in the main workflow definition. If parameter_meta is not specified, the description will be blank (an empty string). For more information, see the WDL specification for Parameter metadata.
The following tables show WDL interpolation examples. For each example, the parameter name is x. If the parameter is required, you must provide a value for the parameter. If the parameter is optional, you don't need to provide a value.
This table shows WDL interpolation examples for primitive types.
Input | Example input/output | Required |
---|---|---|
Int x | 1 or 2 or ... | Yes |
Int x = 2 | 2 | No |
Int x = 1+2 | 3 | No |
Int x = y+z | y+z | No |
Int? x | None or 1 or 2 or ... | No |
Int? x = 2 | None or 2 | No |
Int? x = 1+2 | None or 3 | No |
Int? x = y+z | None or y+z | No |
The following table shows WDL interpolation examples for complex types. A complex type is a collection of primitive types.
Input | Example input/output | Required |
---|---|---|
Array[Int] x | [1,2,3] or [] | Yes |
Array[Int]+ x | [1], but not [] | Yes |
Array[Int]? x | None or [] or [1,2,3] | No |
Array[Int?] x | [] or [None, 3, None] | Yes |
Array[Int?]? x | [None] or None or [1,2,3] or [None, 3] but not [] | No |
struct Sample {String a, Int y} later in inputs: Sample mySample | | Yes |
struct Sample {String a, Int y} later in inputs: Sample? mySample | | No |
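The WDL assumptions listed above reduce to two checks on each input declaration: whether the type is nullable, and whether a value or expression is assigned. The following Python sketch applies those two checks to declaration strings. The function name is hypothetical, and the string handling is a simplification for illustration, not a WDL parser.

```python
# Hypothetical sketch of the documented WDL rules. An input is optional
# when its type ends in "?" or when it has an assigned literal or expression.
def wdl_input_is_optional(declaration: str) -> bool:
    # Split "Type name = expr" at the first "=", if there is one.
    left, has_assignment, _expr = declaration.partition("=")
    # The last whitespace-separated token on the left is the name. The rest
    # is the type, which may contain spaces (for example Map[String, Int]).
    wdl_type, _name = left.rsplit(None, 1)
    return wdl_type.endswith("?") or bool(has_assignment)

declarations = [
    "Int x",
    "Int x = 2",
    "Int x = y+z",
    "Int? x",
    "Array[Int]+ x",
    "Array[Int]? x",
    "Array[Int?] x",
    "Sample? mySample",
]
for declaration in declarations:
    required = "No" if wdl_input_is_optional(declaration) else "Yes"
    print(f"{declaration:20} required: {required}")
```

Note that Array[Int?] x stays required in this sketch, because the nullable marker applies to the array elements and not to the array itself.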
Parameter detection for Nextflow
For Nextflow, HealthOmics generates the parameter template by parsing the nextflow_schema.json file.
If the workflow definition doesn't include a schema file, HealthOmics parses the main workflow definition file.
Parsing the schema file
For parsing to work correctly, make sure the schema file meets the following requirements:
- The schema file is named nextflow_schema.json and is located in the same directory as the main workflow file.
- The schema file is valid JSON as defined in one of the supported JSON Schema drafts. The example in this section declares https://json-schema.org/draft/2020-12/schema.

HealthOmics parses the nextflow_schema.json file to generate the parameter template:
- Extracts all properties that are defined in the schema.
- Includes the property description if available for the property.
- Identifies whether each parameter is optional or required, based on the required field of the schema object that defines the property.
The following example shows a definition file and the generated parameter file.
{ "$schema": "https://json-schema.org/draft/2020-12/schema", "type": "object", "$defs": { "input_options": { "title": "Input options", "type": "object", "required": ["input_file"], "properties": { "input_file": { "type": "string", "format": "file-path", "pattern": "^s3://[a-z0-9.-]{3,63}(?:/\\S*)?$", "description": "description for input_file" }, "input_num": { "type": "integer", "default": 42, "description": "description for input_num" } } }, "output_options": { "title": "Output options", "type": "object", "required": ["output_dir"], "properties": { "output_dir": { "type": "string", "format": "file-path", "description": "description for output_dir" } } } }, "properties": { "ungrouped_input_bool": { "type": "boolean", "default": true } }, "required": ["ungrouped_input_bool"], "allOf": [ { "$ref": "#/$defs/input_options" }, { "$ref": "#/$defs/output_options" } ] }
The generated parameter template:
{ "input_file": { "description": "description for input_file", "optional": False }, "input_num": { "description": "description for input_num", "optional": True }, "output_dir": { "description": "description for output_dir", "optional": False }, "ungrouped_input_bool": { "description": None, "optional": False } }
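The three parsing steps described above can be approximated with a short Python sketch. It is a simplified model, not the HealthOmics implementation: the function name is hypothetical, and it assumes that parameter groups live under $defs and are pulled in through allOf references, as in the example schema. The schema below is a trimmed copy of that example.

```python
# Hypothetical sketch: derive a parameter template from a Nextflow schema.
def template_from_schema(schema: dict) -> dict:
    # Scan the schema root plus every $defs group referenced from allOf.
    groups = [schema]
    for ref in schema.get("allOf", []):
        pointer = ref.get("$ref", "")
        if pointer.startswith("#/$defs/"):
            groups.append(schema["$defs"][pointer[len("#/$defs/"):]])

    template = {}
    for group in groups:
        required = set(group.get("required", []))
        for name, prop in group.get("properties", {}).items():
            template[name] = {
                "description": prop.get("description"),  # None when absent
                "optional": name not in required,
            }
    return template

schema = {
    "$defs": {
        "input_options": {
            "required": ["input_file"],
            "properties": {
                "input_file": {"type": "string", "description": "description for input_file"},
                "input_num": {"type": "integer", "default": 42, "description": "description for input_num"},
            },
        },
    },
    "properties": {"ungrouped_input_bool": {"type": "boolean", "default": True}},
    "required": ["ungrouped_input_bool"],
    "allOf": [{"$ref": "#/$defs/input_options"}],
}

for name, entry in template_from_schema(schema).items():
    print(name, entry)
```

In this model a default value in the schema has no effect on the result. Only membership in a required list decides whether a parameter is optional.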
Parsing the main file
If the workflow definition doesn't include a nextflow_schema.json file, HealthOmics parses the main workflow definition file.
HealthOmics analyzes the params expressions found in the main workflow definition file and in the nextflow.config file. All params with default values are marked as optional.
For parsing to work correctly, note the following requirements:
- HealthOmics parses only the main workflow definition file. To ensure all parameters are captured, we recommend that you wire all params through to any submodules and imported workflows.
- The config file is optional. If you define one, name it nextflow.config and place it in the same directory as the main workflow definition file.
The following example shows a definition file and the generated parameter template.
params.input_file = "default.txt"
params.threads = 4
params.memory = "8GB"
workflow {
    if (params.version) {
        println "Using version: ${params.version}"
    }
}
The generated parameter template:
{ "input_file": { "description": None, "optional": True }, "threads": { "description": None, "optional": True }, "memory": { "description": None, "optional": True }, "version": { "description": None, "optional": False } }
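The detection behavior shown in this example can be imitated with two regular expressions: one that finds params assignments (defaults) and one that finds any top-level params reference. The following Python sketch is an assumption-laden approximation, not the HealthOmics parser. The function and pattern names are hypothetical, and nested references such as params.nested.input_file are skipped.

```python
import re

# "params.name = value" at the start of a line marks a default value.
ASSIGNMENT = re.compile(r"^\s*params\.(\w+)\s*=(?!=)", re.MULTILINE)
# Any top-level "params.name" reference; names followed by another "." are
# treated as nested and skipped.
REFERENCE = re.compile(r"\bparams\.(\w+)\b(?!\.)")

def template_from_main(source: str) -> dict:
    defaults = set(ASSIGNMENT.findall(source))
    template = {}
    for name in REFERENCE.findall(source):
        # A parameter with a default is optional. Otherwise it is required.
        template.setdefault(name, {"description": None, "optional": name in defaults})
    return template

main_nf = '''
params.input_file = "default.txt"
params.threads = 4
params.memory = "8GB"

workflow {
    if (params.version) {
        println "Using version: ${params.version}"
    }
}
'''

for name, entry in template_from_main(main_nf).items():
    print(name, entry)
```

Here version is reported as required because it is referenced but never assigned a default, which matches the generated template in the example.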
For default values that are defined in nextflow.config, HealthOmics collects params assignments and parameters declared within params {}, as shown in the following example. In assignment statements, params must appear on the left side of the statement.
params.alpha = "alpha"
params.beta = "beta"
params {
    gamma = "gamma"
    delta = "delta"
}
env {
    // ignored, as this assignment isn't in the params block
    VERSION = "TEST"
}
// ignored, as params is not on the left side
interpolated_image = "${params.cli_image}"
The generated parameter template:
{ // other params in your main workflow definition "alpha": { "description": None, "optional": True }, "beta": { "description": None, "optional": True }, "gamma": { "description": None, "optional": True }, "delta": { "description": None, "optional": True } }
Nested parameters
Both nextflow_schema.json and nextflow.config allow nested parameters. However, the HealthOmics parameter template requires only the top-level parameters. If your workflow uses a nested parameter, you must provide a JSON object as the input for that parameter.
Nested parameters in schema files
HealthOmics skips nested params when parsing a nextflow_schema.json file. For example, if you define the following nextflow_schema.json file:
{ "properties": { "input": { "properties": { "input_file": { ... }, "input_num": { ... } } }, "input_bool": { ... } } }
HealthOmics ignores input_file and input_num when it generates the parameter template:
{ "input": { "description": None, "optional": True }, "input_bool": { "description": None, "optional": True } }
When you run this workflow, HealthOmics expects an input.json file similar to the following:
{ "input": { "input_file": "s3://bucket/obj", "input_num": 2 }, "input_bool": false }
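If nested values are tracked in a flat form (for example, as dotted names in a sample sheet or script), they need to be folded into JSON objects before they are written to input.json. The following Python sketch shows one way to do that. The dotted-name convention and the function name are assumptions made for this illustration, not something HealthOmics defines.

```python
import json

# Hypothetical helper: fold dotted names such as "input.input_file" into the
# nested JSON objects that a top-level parameter expects.
def nest_dotted_inputs(flat: dict) -> dict:
    nested = {}
    for dotted_name, value in flat.items():
        *parents, leaf = dotted_name.split(".")
        node = nested
        for key in parents:
            node = node.setdefault(key, {})  # create intermediate objects
        node[leaf] = value
    return nested

flat = {
    "input.input_file": "s3://bucket/obj",
    "input.input_num": 2,
    "input_bool": False,
}

print(json.dumps(nest_dotted_inputs(flat), indent=2))
```

The printed JSON has input as a single top-level object that contains input_file and input_num, with input_bool beside it.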
Nested parameters in config files
HealthOmics doesn't collect nested params in a nextflow.config file, and skips them during parsing. For example, if you define the following nextflow.config file:
params.alpha = "alpha"
params.nested.beta = "beta"
params {
    gamma = "gamma"
    group {
        delta = "delta"
    }
}
HealthOmics ignores params.nested.beta and params.group.delta when it generates the parameter template:
{ "alpha": { "description": None, "optional": True }, "gamma": { "description": None, "optional": True } }
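The config-file rules described above (collect top-level params assignments and entries directly inside params {}, ignore everything else) can be modeled with a small line-based scanner. The following Python sketch is a simplification, not the HealthOmics parser: the function name is hypothetical, and it assumes one statement per line with each closing brace on its own line.

```python
import re

BLOCK_OPEN = re.compile(r"^(\w+)\s*\{$")               # e.g. "params {"
TOP_ASSIGN = re.compile(r"^params\.(\w+)\s*=(?!=)")    # e.g. params.alpha = ...
BLOCK_ASSIGN = re.compile(r"^(\w+)\s*=(?!=)")          # e.g. gamma = ...

def template_from_config(config: str) -> dict:
    template, stack = {}, []
    for raw_line in config.splitlines():
        line = raw_line.strip()
        opened = BLOCK_OPEN.match(line)
        if opened:
            stack.append(opened.group(1))   # enter a block
        elif line == "}":
            if stack:
                stack.pop()                 # leave the current block
        elif not stack and TOP_ASSIGN.match(line):
            name = TOP_ASSIGN.match(line).group(1)
            template[name] = {"description": None, "optional": True}
        elif stack == ["params"] and BLOCK_ASSIGN.match(line):
            name = BLOCK_ASSIGN.match(line).group(1)
            template[name] = {"description": None, "optional": True}
        # Anything else is ignored: nested params, other blocks, comments,
        # and statements where params is not on the left side.
    return template

config = '''
params.alpha = "alpha"
params.nested.beta = "beta"
params {
    gamma = "gamma"
    group {
        delta = "delta"
    }
}
env {
    VERSION = "TEST"
}
interpolated_image = "${params.cli_image}"
'''

print(template_from_config(config))
```

Every collected parameter is marked optional because a config assignment always supplies a default value.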
Examples of Nextflow interpolation
The following table shows Nextflow interpolation examples for params in the main file.
Parameters | Required |
---|---|
params.input_file | Yes |
params.input_file = "s3://bucket/data.json" | No |
params.nested.input_file | N/A |
params.nested.input_file = "s3://bucket/data.json" | N/A |
The following table shows Nextflow interpolation examples for params in the nextflow.config file.
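Parameters | Required |
---|---|
params.alpha = "alpha" | No |
params { gamma = "gamma" } | No |
params.nested.beta = "beta" | N/A |
params { group { delta = "delta" } } | N/A |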