Advanced training configurations

Sample rate configuration

The sample rate defines how frequently sensor readings are recorded (for example, once every second, or once every minute). This setting directly impacts the granularity of the training data, and influences the model's ability to capture short-term variations in sensor behavior.

Visit Sampling for high-frequency data and consistency between training and inference to learn about best practices.

Configure target sampling rate

You can optionally specify a TargetSamplingRate in your training configuration to control the frequency at which data is sampled. Supported values are:

PT1S | PT5S | PT10S | PT15S | PT30S | PT1M | PT5M | PT10M | PT15M | PT30M | PT1H

These values are ISO 8601 durations. For example:

  • PT1S = 1 second

  • PT1M = 1 minute

  • PT1H = 1 hour

Choose a sampling rate that strikes the right balance between data resolution and training efficiency. Consider the following trade-offs:

  • Higher sampling rates (for example, PT1S) offer finer detail but may increase data volume and training time.

  • Lower sampling rates (for example, PT10M or PT1H) reduce data size and cost but may miss short-lived anomalies.
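To make the trade-off concrete, the following minimal sketch (the helper is hypothetical, not part of any AWS SDK) converts the supported ISO 8601 durations to seconds and estimates how many samples per stream a one-day training window produces at each rate:

```python
import re

def duration_to_seconds(duration: str) -> int:
    """Convert a supported ISO 8601 duration (PT1S .. PT1H) to seconds."""
    match = re.fullmatch(r"PT(\d+)([SMH])", duration)
    if not match:
        raise ValueError(f"Unsupported duration: {duration}")
    value, unit = int(match.group(1)), match.group(2)
    return value * {"S": 1, "M": 60, "H": 3600}[unit]

# A one-day window at PT1S yields 86,400 samples per stream;
# the same window at PT1H yields only 24.
samples_per_day = {rate: 86_400 // duration_to_seconds(rate)
                   for rate in ("PT1S", "PT1M", "PT10M", "PT1H")}
```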

Handling timestamp misalignment

AWS IoT SiteWise automatically compensates for timestamp misalignment across multiple data streams during training. This ensures consistent model behavior even if input signals are not perfectly aligned in time.

Enable sampling

Configure sampling by adding TargetSamplingRate, set to the sampling rate of the data, to the training action payload in anomaly-detection-training-payload.json. The allowed values are: PT1S | PT5S | PT10S | PT15S | PT30S | PT1M | PT5M | PT10M | PT15M | PT30M | PT1H.

{
    "exportDataStartTime": StartTime,
    "exportDataEndTime": EndTime,
    "targetSamplingRate": "TargetSamplingRate"
}
Example of a sample rate configuration:
{
    "exportDataStartTime": 1717225200,
    "exportDataEndTime": 1722789360,
    "targetSamplingRate": "PT1M"
}
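A payload like the one above can be generated and validated before you submit it. The following sketch is illustrative only (the builder function is hypothetical, not an AWS SDK call); it checks the sampling rate against the documented allowed values and serializes the payload:

```python
import json

# Allowed values from the documentation above.
ALLOWED_RATES = {"PT1S", "PT5S", "PT10S", "PT15S", "PT30S",
                 "PT1M", "PT5M", "PT10M", "PT15M", "PT30M", "PT1H"}

def build_training_payload(start_epoch: int, end_epoch: int, rate: str) -> str:
    """Return the JSON training payload, validating the inputs first."""
    if rate not in ALLOWED_RATES:
        raise ValueError(f"targetSamplingRate must be one of {sorted(ALLOWED_RATES)}")
    if end_epoch <= start_epoch:
        raise ValueError("exportDataEndTime must be after exportDataStartTime")
    return json.dumps({
        "exportDataStartTime": start_epoch,
        "exportDataEndTime": end_epoch,
        "targetSamplingRate": rate,
    })

payload = build_training_payload(1717225200, 1722789360, "PT1M")
```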

Label your data

When labeling your data, you must define time intervals that represent periods of abnormal equipment behavior. This labeling information is provided as a CSV file, where each row specifies a time range during which the equipment was not operating correctly.

Each row contains two timestamps:

  • The start time, indicating when abnormal behavior is believed to have begun.

  • The end time, representing when the failure or issue was first observed.

This CSV file is stored in an Amazon S3 bucket and is used during model training to help the system learn from known examples of abnormal behavior. The following example shows how your label data should appear as a .csv file. The file has no header.

Example of a CSV file:
2024-06-21T00:00:00.000000,2024-06-21T12:00:00.000000
2024-07-11T00:00:00.000000,2024-07-11T12:00:00.000000
2024-07-31T00:00:00.000000,2024-07-31T12:00:00.000000

Row 1 represents a maintenance event on June 21, 2024, with a 12-hour window (from 2024-06-21T00:00:00.000000Z to 2024-06-21T12:00:00.000000Z) for AWS IoT SiteWise to look for abnormal behavior.

Row 2 represents a maintenance event on July 11, 2024, with a 12-hour window (from 2024-07-11T00:00:00.000000Z to 2024-07-11T12:00:00.000000Z) for AWS IoT SiteWise to look for abnormal behavior.

Row 3 represents a maintenance event on July 31, 2024, with a 12-hour window (from 2024-07-31T00:00:00.000000Z to 2024-07-31T12:00:00.000000Z) for AWS IoT SiteWise to look for abnormal behavior.

AWS IoT SiteWise uses all of these time windows to train and evaluate models that can identify abnormal behavior around these events. Note that not all events are detectable, and results are highly dependent on the quality and characteristics of the underlying data.
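A label file in the format shown above can be produced programmatically. This is a minimal sketch (the helper is hypothetical) that renders (start, end) datetime pairs as header-less CSV rows with microsecond-precision timestamps:

```python
import csv
import io
from datetime import datetime

def write_label_csv(windows) -> str:
    """Render (start, end) datetime pairs as a header-less label CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for start, end in windows:
        # %f pads microseconds to six digits, matching the example rows.
        writer.writerow([start.strftime("%Y-%m-%dT%H:%M:%S.%f"),
                         end.strftime("%Y-%m-%dT%H:%M:%S.%f")])
    return buf.getvalue()

labels = write_label_csv([
    (datetime(2024, 6, 21, 0, 0), datetime(2024, 6, 21, 12, 0)),
    (datetime(2024, 7, 11, 0, 0), datetime(2024, 7, 11, 12, 0)),
    (datetime(2024, 7, 31, 0, 0), datetime(2024, 7, 31, 12, 0)),
])
```

The resulting string can then be uploaded to the labeling bucket described in the steps below.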

For details about best practices for sampling, see Best practices.

Data labeling steps

  • Configure your Amazon S3 bucket according to the labeling prerequisites at Labeling data prerequisites.

  • Upload the file to your labeling bucket.

  • Add the following to anomaly-detection-training-payload.json.

    • Provide the label file locations in the labelInputConfiguration section of the file. Replace label-bucket with your bucket name and files-prefix with the path to the file(s) or any part of the prefix. All files at that location are parsed and, on success, used as label files.

{
    "exportDataStartTime": StartTime,
    "exportDataEndTime": EndTime,
    "labelInputConfiguration": {
        "bucketName": "label-bucket",
        "prefix": "files-prefix"
    }
}
Example of a label configuration:
{
    "exportDataStartTime": 1717225200,
    "exportDataEndTime": 1722789360,
    "labelInputConfiguration": {
        "bucketName": "anomaly-detection-customer-data-278129555252-iad",
        "prefix": "Labels/model=b2d8ab3e-73af-48d8-9b8f-a290bef931b4/asset[d3347728-4796-4c5c-afdb-ea2f551ffe7a]/Labels.csv"
    }
}
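As with the sampling payload, the label configuration can be assembled in code. This sketch uses a hypothetical builder function (not an AWS SDK call) to attach the labelInputConfiguration section to the training payload:

```python
import json

def build_label_payload(start_epoch: int, end_epoch: int,
                        bucket: str, prefix: str) -> str:
    """Training payload with the label file location attached."""
    return json.dumps({
        "exportDataStartTime": start_epoch,
        "exportDataEndTime": end_epoch,
        "labelInputConfiguration": {
            "bucketName": bucket,   # the labeling bucket configured earlier
            "prefix": prefix,       # every file under this prefix is parsed
        },
    })

label_payload = build_label_payload(
    1717225200, 1722789360,
    "anomaly-detection-customer-data-278129555252-iad",
    "Labels/",
)
```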

Evaluate your model

Pointwise model diagnostics for an AWS IoT SiteWise training model evaluate the model's performance at individual events. During training, AWS IoT SiteWise generates an anomaly score and sensor contribution diagnostics for each row in the input dataset. A higher anomaly score indicates a higher likelihood of an abnormal event.

Pointwise diagnostics are available when you train a model with the ExecuteAction API and the AWS/ANOMALY_DETECTION_TRAINING action type.

To configure model evaluation:

  • Configure your Amazon S3 bucket according to the labeling prerequisites at Labeling data prerequisites.

  • Add the following to anomaly-detection-training-payload.json.

    • Provide the evaluationStartTime and evaluationEndTime (both in epoch seconds) for the data in the window used to evaluate the performance of the model.

    • Provide the Amazon S3 bucket location (resultDestination) where the evaluation diagnostics are written.

Note

The model evaluation interval (dataStartTime to dataEndTime) must overlap or be contiguous with the training interval. No gaps are permitted.

{
    "exportDataStartTime": StartTime,
    "exportDataEndTime": EndTime,
    "modelEvaluationConfiguration": {
        "dataStartTime": evaluationStartTime,
        "dataEndTime": evaluationEndTime,
        "resultDestination": {
            "bucketName": "s3BucketName",
            "prefix": "bucketPrefix"
        }
    }
}
Example of a model evaluation configuration:
{
    "exportDataStartTime": 1717225200,
    "exportDataEndTime": 1722789360,
    "modelEvaluationConfiguration": {
        "dataStartTime": 1722789360,
        "dataEndTime": 1725174000,
        "resultDestination": {
            "bucketName": "anomaly-detection-customer-data-278129555252-iad",
            "prefix": "Evaluation/asset[d3347728-4796-4c5c-afdb-ea2f551ffe7a]/1747681026-evaluation_results.jsonl"
        }
    }
}
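The no-gap constraint in the note above can be checked before submitting the payload. This is an illustrative sketch (the validator is hypothetical, not part of any AWS SDK); all timestamps are epoch seconds:

```python
def check_evaluation_window(train_start: int, train_end: int,
                            eval_start: int, eval_end: int) -> bool:
    """The evaluation interval must overlap or be contiguous with the
    training interval; a gap on either side is rejected."""
    if eval_end <= eval_start:
        raise ValueError("dataEndTime must be after dataStartTime")
    if eval_start > train_end or eval_end < train_start:
        raise ValueError("evaluation window leaves a gap from the training interval")
    return True

# Matches the example above: evaluation starts exactly where the
# exported training data ends (contiguous, no gap).
ok = check_evaluation_window(1717225200, 1722789360, 1722789360, 1725174000)
```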