Advanced training configurations
Sample rate configuration
The sample rate defines how frequently sensor readings are recorded (for example, once every second, or once every minute). This setting directly impacts the granularity of the training data, and influences the model's ability to capture short-term variations in sensor behavior.
For best practices, see Sampling for high-frequency data and consistency between training and inference.
Configure target sampling rate
You can optionally specify a TargetSamplingRate in your training configuration to control the frequency at which data is sampled. Supported values are:
PT1S | PT5S | PT10S | PT15S | PT30S | PT1M | PT5M | PT10M | PT15M | PT30M | PT1H
These are ISO 8601 duration formats, representing the following durations:
- PT1S = 1 second
- PT1M = 1 minute
- PT1H = 1 hour
Choose a sampling rate that strikes the right balance between data resolution and training efficiency:
- Higher sampling rates (such as PT1S) offer finer detail but may increase data volume and training time.
- Lower sampling rates (such as PT10M or PT1H) reduce data size and cost but may miss short-lived anomalies.
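If you need the numeric equivalent of these durations (for example, to size a query window), the mapping can be sketched in Python. This small parser is a hypothetical helper that handles only the PT&lt;n&gt;S, PT&lt;n&gt;M, and PT&lt;n&gt;H forms listed above, not general ISO 8601 durations:

```python
import re

# Seconds per unit for the PT<n>S / PT<n>M / PT<n>H forms listed above.
_UNIT_SECONDS = {"S": 1, "M": 60, "H": 3600}

def duration_to_seconds(duration: str) -> int:
    """Convert a duration such as 'PT5M' to a number of seconds."""
    match = re.fullmatch(r"PT(\d+)([SMH])", duration)
    if match is None:
        raise ValueError(f"Unsupported duration: {duration}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]
```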
Handling timestamp misalignment
AWS IoT SiteWise automatically compensates for timestamp misalignment across multiple data streams during training. This ensures consistent model behavior even if input signals are not perfectly aligned in time.
Enable sampling
Add the following code to anomaly-detection-training-payload.json. Configure sampling by adding targetSamplingRate to the training action payload, set to the sampling rate of the data. The allowed values are PT1S | PT5S | PT10S | PT15S | PT30S | PT1M | PT5M | PT10M | PT15M | PT30M | PT1H.
{
    "exportDataStartTime": StartTime,
    "exportDataEndTime": EndTime,
    "targetSamplingRate": "TargetSamplingRate"
}
Example of a sample rate configuration:
{
    "exportDataStartTime": 1717225200,
    "exportDataEndTime": 1722789360,
    "targetSamplingRate": "PT1M"
}
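A payload like the example above can be assembled programmatically. The build_sampling_payload helper below is a hypothetical sketch (not part of any AWS SDK) that converts datetimes to the epoch-second form the payload expects and rejects unsupported rates:

```python
import json
from datetime import datetime, timezone

# Allowed targetSamplingRate values, as listed above.
ALLOWED_RATES = {"PT1S", "PT5S", "PT10S", "PT15S", "PT30S",
                 "PT1M", "PT5M", "PT10M", "PT15M", "PT30M", "PT1H"}

def build_sampling_payload(start: datetime, end: datetime, rate: str) -> str:
    """Build the training payload JSON, converting datetimes to epoch seconds."""
    if rate not in ALLOWED_RATES:
        raise ValueError(f"Unsupported targetSamplingRate: {rate}")
    if end <= start:
        raise ValueError("exportDataEndTime must be after exportDataStartTime")
    return json.dumps({
        "exportDataStartTime": int(start.timestamp()),
        "exportDataEndTime": int(end.timestamp()),
        "targetSamplingRate": rate,
    })
```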
Label your data
When labeling your data, you must define time intervals that represent periods of abnormal equipment behavior.
This labeling information is provided as a CSV file, where each row specifies a time range during which the equipment was not operating correctly.
Each row contains two timestamps:
- The start time, indicating when abnormal behavior is believed to have begun.
- The end time, representing when the failure or issue was first observed.
This CSV file is stored in an Amazon S3 bucket and is used during model training to help the system learn from known examples of abnormal behavior. The following example shows how your label data should appear as a .csv file. The file has no header.
Example of a CSV file:
2024-06-21T00:00:00.000000,2024-06-21T12:00:00.000000
2024-07-11T00:00:00.000000,2024-07-11T12:00:00.000000
2024-07-31T00:00:00.000000,2024-07-31T12:00:00.000000
Row 1 represents a maintenance event on June 21, 2024, with a 12-hour window (from 2024-06-21T00:00:00.000000Z to 2024-06-21T12:00:00.000000Z) for AWS IoT SiteWise to look for abnormal behavior.
Row 2 represents a maintenance event on July 11, 2024, with a 12-hour window (from 2024-07-11T00:00:00.000000Z to 2024-07-11T12:00:00.000000Z) for AWS IoT SiteWise to look for abnormal behavior.
Row 3 represents a maintenance event on July 31, 2024, with a 12-hour window (from 2024-07-31T00:00:00.000000Z to 2024-07-31T12:00:00.000000Z) for AWS IoT SiteWise to look for abnormal behavior.
AWS IoT SiteWise uses all of these time windows to train and evaluate models that can identify abnormal behavior around these events. Note that not all events are detectable, and results are highly dependent on the quality and characteristics of the underlying data.
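Before uploading a label file, you can sanity-check it locally. The parse_label_csv helper below is a hypothetical sketch (not part of AWS IoT SiteWise) that enforces the two-timestamp, no-header format described above:

```python
import csv
import io
from datetime import datetime

def parse_label_csv(text: str) -> list[tuple[datetime, datetime]]:
    """Parse a header-less label CSV into (start, end) windows, validating each row."""
    windows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"Expected two timestamps per row, got: {row}")
        start = datetime.fromisoformat(row[0])
        end = datetime.fromisoformat(row[1])
        if end <= start:
            raise ValueError(f"Label window must end after it starts: {row}")
        windows.append((start, end))
    return windows
```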
For details about best practices for sampling, see Best practices.
Data labeling steps
- Configure your Amazon S3 bucket according to the labeling prerequisites at Labeling data prerequisites.
- Upload the file to your labeling bucket.
- Add the following to anomaly-detection-training-payload.json. Provide the locations in the labelInputConfiguration section of the file. Replace label-bucket with the bucket name and files-prefix with the file path or any part of the prefix. All files at the location are parsed and, on success, used as label files.
{
    "exportDataStartTime": StartTime,
    "exportDataEndTime": EndTime,
    "labelInputConfiguration": {
        "bucketName": "label-bucket",
        "prefix": "files-prefix"
    }
}
Example of a label configuration:
{
    "exportDataStartTime": 1717225200,
    "exportDataEndTime": 1722789360,
    "labelInputConfiguration": {
        "bucketName": "anomaly-detection-customer-data-278129555252-iad",
        "prefix": "Labels/model=b2d8ab3e-73af-48d8-9b8f-a290bef931b4/asset[d3347728-4796-4c5c-afdb-ea2f551ffe7a]/Lables.csv"
    }
}
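The label configuration payload can likewise be assembled programmatically. The build_label_payload helper below is hypothetical, not part of any AWS SDK; the bucket and prefix arguments are the values you would otherwise fill in by hand:

```python
import json

def build_label_payload(start_epoch: int, end_epoch: int,
                        bucket: str, prefix: str) -> str:
    """Build the training payload JSON with a label input configuration."""
    if end_epoch <= start_epoch:
        raise ValueError("exportDataEndTime must be after exportDataStartTime")
    return json.dumps({
        "exportDataStartTime": start_epoch,
        "exportDataEndTime": end_epoch,
        "labelInputConfiguration": {
            "bucketName": bucket,
            "prefix": prefix,
        },
    })
```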
Evaluate your model
Pointwise model diagnostics for an AWS IoT SiteWise training model evaluate the model's performance at individual events. During training, AWS IoT SiteWise generates an anomaly score and sensor contribution diagnostics for each row in the input dataset. A higher anomaly score indicates a higher likelihood of an abnormal event.
Pointwise diagnostics are available when you train a model with the ExecuteAction API and the AWS/ANOMALY_DETECTION_TRAINING action type.
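As a sketch, invoking the training action with boto3 might look like the following; the asset ID and action definition ID are placeholder assumptions, and build_execute_action_request is a hypothetical helper that only assembles the request arguments:

```python
import json

def build_execute_action_request(asset_id: str, action_definition_id: str,
                                 payload: dict) -> dict:
    """Assemble keyword arguments for an iotsitewise ExecuteAction call.

    asset_id and action_definition_id are placeholders; payload is the
    training payload dict assembled elsewhere in this section.
    """
    return {
        "targetResource": {"assetId": asset_id},
        "actionDefinitionId": action_definition_id,
        # The action payload is passed as a JSON string.
        "actionPayload": {"stringValue": json.dumps(payload)},
    }

# To send the request (requires AWS credentials):
# import boto3
# client = boto3.client("iotsitewise")
# client.execute_action(**build_execute_action_request(
#     "your-asset-id", "your-action-definition-id",
#     {"exportDataStartTime": 1717225200, "exportDataEndTime": 1722789360}))
```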
To configure model evaluation:
- Configure your Amazon S3 bucket according to the labeling prerequisites at Labeling data prerequisites.
- Add the following to anomaly-detection-training-payload.json. Provide the evaluationStartTime and evaluationEndTime (both in epoch seconds) for the window of data used to evaluate the performance of the model. Provide the Amazon S3 bucket location (resultDestination) that the evaluation diagnostics are written to.
Note
The model evaluation interval (dataStartTime to dataEndTime) must either overlap, or be contiguous with, the training interval. No gaps are permitted.
{
    "exportDataStartTime": StartTime,
    "exportDataEndTime": EndTime,
    "modelEvaluationConfiguration": {
        "dataStartTime": evaluationStartTime,
        "dataEndTime": evaluationEndTime,
        "resultDestination": {
            "bucketName": "s3BucketName",
            "prefix": "bucketPrefix"
        }
    }
}
Example of a model evaluation configuration:
{
    "exportDataStartTime": 1717225200,
    "exportDataEndTime": 1722789360,
    "modelEvaluationConfiguration": {
        "dataStartTime": 1722789360,
        "dataEndTime": 1725174000,
        "resultDestination": {
            "bucketName": "anomaly-detection-customer-data-278129555252-iad",
            "prefix": "Evaluation/asset[d3347728-4796-4c5c-afdb-ea2f551ffe7a]/1747681026-evaluation_results.jsonl"
        }
    }
}
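The no-gap requirement from the note above can be checked before submitting the payload. check_evaluation_interval is a hypothetical helper expressing that rule in epoch seconds:

```python
def check_evaluation_interval(export_start: int, export_end: int,
                              eval_start: int, eval_end: int) -> None:
    """Raise if the evaluation window leaves a gap around the training interval.

    All arguments are epoch seconds; the evaluation interval must overlap,
    or be contiguous with, the training interval (no gaps).
    """
    if eval_end <= eval_start:
        raise ValueError("dataEndTime must be after dataStartTime")
    if eval_start > export_end or eval_end < export_start:
        raise ValueError("Evaluation window does not touch the training interval")
```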