# Setting up ML-powered anomaly detection for outlier analysis


Use the procedures in the following sections to start detecting outliers and anomalies, and to identify the key drivers that contribute to them.

**Topics**
+ [Viewing anomaly and forecast notifications](anomaly-detection-adding-from-visuals.md)
+ [Adding an ML insight to detect outliers and key drivers](anomaly-detection-adding-anomaly-insights.md)
+ [Using contribution analysis for key drivers](anomaly-detection-adding-key-drivers.md)

# Viewing anomaly and forecast notifications


Amazon Quick Sight notifies you on a visual when it detects an anomaly, key drivers, or a forecasting opportunity. You can follow the prompts to set up anomaly detection or forecasting based on the data in that visual.

1. In an existing line chart, look for an insight notification in the menu on the visual widget. 

1. Choose the lightbulb icon to display the notification.

1. If you want more information about the ML insight, you can follow the screen prompts to add an ML insight.

# Adding an ML insight to detect outliers and key drivers


You can add an ML insight that detects *anomalies*, which are outliers that seem significant. To get started, you create a widget for your insight, also known as an *autonarrative*. As you configure your options, you can view a limited preview of your insight in the **Preview** pane on the right side of the screen.

In your insight widget, you can add up to five dimension fields that are not calculated fields. In the field wells, values for **Categories** represent the dimensional values that Amazon Quick Sight uses to split the metric. For example, let's say that you are analyzing revenue across all product categories and product SKUs. There are 10 product categories, each with 10 product SKUs. Amazon Quick Sight splits the metric by the 100 unique combinations and runs anomaly detection on each combination for the split.
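
The split in the preceding example can be sketched in a few lines of Python. This is only an illustration of how the count of 100 arises; the category and SKU names are hypothetical, and the code does not call any Amazon Quick Sight API.

```python
# Hypothetical data: 10 product categories, each with 10 product SKUs.
categories = [f"category_{i}" for i in range(1, 11)]
skus_per_category = 10

# Each (category, SKU) pair is one unique combination that the metric is split by.
combinations_to_analyze = [
    (category, f"{category}_sku_{j}")
    for category in categories
    for j in range(1, skus_per_category + 1)
]

# Anomaly detection would run once per combination in this split.
print(len(combinations_to_analyze))  # 100
```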

The following procedure shows how to do this, and also how to add contribution analysis to detect the key drivers that are causing each anomaly. You can add contribution analysis later, as described in [Using contribution analysis for key drivers](anomaly-detection-adding-key-drivers.md).

**To set up outlier analysis, including key drivers**

1. Open your analysis and in the toolbar, choose **Insights**, then **Add**. From the list, choose **Anomaly detection** and **Select**.

1. Follow the screen prompt on the new widget, which tells you to choose fields for the insight. Add at least one date, one measure, and one dimension. 

1. Choose **Get started** on the widget. The configuration screen appears.

1. Under **Compute options**, choose values for the following options.

   1. For **Combinations to be analyzed**, choose one of the following options:

      1. **Hierarchical**

         Choose this option if you want to analyze the fields hierarchically. For example, if you chose a date (T), a measure (N), and three dimension categories (C1, C2, and C3), Quick Sight analyzes the fields hierarchically, as shown following.

         ```
         T-N, T-C1-N, T-C1-C2-N, T-C1-C2-C3-N
         ```

      1. **Exact**

         Choose this option if you want to analyze only the exact combination of fields in the Category field well, as they are listed. For example, if you chose a date (T), a measure (N), and three dimension categories (C1, C2, and C3), Quick Sight analyzes only the exact combination of category fields in the order they are listed, as shown following.

         ```
         T-C1-C2-C3-N
         ```

      1. **All**

         Choose this option if you want to analyze all field combinations in the Category field well. For example, if you chose a date (T), a measure (N), and three dimension categories (C1, C2, and C3), Quick Sight analyzes all combinations of fields, as shown following.

         ```
         T-N, T-C1-N, T-C1-C2-N, T-C1-C2-C3-N, T-C1-C3-N, T-C2-N, T-C2-C3-N, T-C3-N
         ```

      If you chose a date and a measure only, Quick Sight analyzes the fields by date and then by measure.

      In the **Fields to be analyzed** section, you can see a list of fields from the field wells for reference.

   1. For **Name**, enter a descriptive alphanumeric name with no spaces, or choose the default value. This provides a name for the computation.

      If you plan on editing the narrative that automatically displays on the widget, you can use the name to identify this widget's calculation. Customize the name if you plan to edit the autonarrative and if you have other similar calculations in your analysis.

1. In the **Display options** section, choose the following options to customize what is displayed in your insight widget. You can still explore all your results, no matter what you display.

   1. **Maximum number of anomalies to show** – The number of outliers you want to display in the narrative widget. 

   1. **Severity** – The minimum level of severity for anomalies that you want to display in the insight widget.

      A *level of severity* is a range of anomaly scores that is characterized by the lowest actual anomaly score included in the range. All anomalies that score higher are included in the range. If you set severity to **Low**, the insight displays all of the anomalies that rank between low and very high. If you set the severity to **Very high**, the insight displays only the anomalies that have the highest anomaly scores.

      You can use the following options:
      + **Very high** 
      + **High and above** 
      + **Medium and above** 
      + **Low and above** 

   1. **Direction** – The direction on the x-axis or y-axis that you want to identify as anomalous. You can choose from the following:
      + **Higher than expected** to identify higher values as anomalies.
      + **Lower than expected** to identify lower values as anomalies. 
      + **[ALL]** to identify all anomalous values, high and low (default setting).

   1. **Delta** – Enter a custom value to use to identify anomalies. Any amount higher than the threshold value counts as an anomaly. The values here change how the insight works in your analysis. In this section, you can set the following:
      + **Absolute value** – The actual value to use. For example, suppose this is 48. Amazon Quick Sight then identifies values as anomalous when the difference between a value and the expected value is greater than 48. 
      + **Percentage** – The percentage threshold to use. For example, suppose this is 12.5%. Amazon Quick Sight then identifies values as anomalous when the difference between a value and the expected value is greater than 12.5%.

   1. **Sort by** – Choose a sort method for your results. Some methods are based on the anomaly score that Amazon Quick Sight generates. Amazon Quick Sight gives higher scores to data points that look anomalous. You can use any of the following options: 
      + **Weighted anomaly score** – The anomaly score multiplied by the log of the absolute value of the difference between the actual value and the expected value. This score is always a positive number. 
      + **Anomaly score** – The actual anomaly score assigned to this data point.
      + **Weighted difference from expected value** – The anomaly score multiplied by the difference between the actual value and the expected value (default).
      + **Difference from expected value** – The actual difference between the actual value and the expected value (that is, actual−expected).
      + **Actual value** – The actual value with no formula applied.

1. In the **Schedule options** section, set the schedule for automatically running the insight recalculation. The schedule runs only for published dashboards. In the analysis, you can run it manually as needed. Scheduling includes the following settings:
   + **Occurrence** – How often you want the recalculation to run: every hour, every day, every week, or every month.
   + **Start schedule on** – The date and time to start running this schedule.
   + **Timezone** – The time zone that the schedule runs in. To view a list, delete the current entry. 

1. In the **Top contributors** section, set Amazon Quick Sight to analyze the key drivers when an outlier (anomaly) is detected.

   For example, Amazon Quick Sight can show the top customers that contributed to a spike in sales in the US for home improvement products. You can add up to four dimensions from your dataset. These include dimensions that you didn't add to the field wells of this insight widget.

   For a list of dimensions available for contribution analysis, choose **Select fields**.

1. Choose **Save** to confirm your choices. Choose **Cancel** to exit without saving.

1. From the insight widget, choose **Run now** to run the anomaly detection and view your insight.
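
To make the **Hierarchical**, **Exact**, and **All** options in the procedure easier to compare, the following Python sketch enumerates the field combinations that each option describes. The function names and the mode strings are hypothetical and exist only for this illustration. The sketch reproduces the combinations listed in the procedure and is not a description of how Amazon Quick Sight implements them.

```python
from itertools import combinations

def field_combinations(dimensions, mode):
    """Return the groups of dimension fields analyzed for a given option."""
    dims = list(dimensions)
    if mode == "hierarchical":
        # Prefixes of the field list: (), (C1), (C1, C2), (C1, C2, C3)
        return [tuple(dims[:k]) for k in range(len(dims) + 1)]
    if mode == "exact":
        # Only the full list of fields, in the order they are listed
        return [tuple(dims)]
    if mode == "all":
        # Every subset of the fields, keeping the listed order within a subset
        return [c for k in range(len(dims) + 1) for c in combinations(dims, k)]
    raise ValueError(f"Unknown mode: {mode}")

def label(group, date="T", measure="N"):
    """Format a group the way the procedure writes it, for example T-C1-C2-N."""
    return "-".join([date, *group, measure])

for mode in ("hierarchical", "exact", "all"):
    labels = [label(g) for g in field_combinations(["C1", "C2", "C3"], mode)]
    print(mode, labels)
```

With three dimension fields, the sketch yields four combinations for the hierarchical option, one for the exact option, and eight for the all option. The number of combinations is one factor in how many unique data points are analyzed.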

The amount of time that anomaly detection takes to complete varies depending on how many unique data points you are analyzing. The process can take a few minutes for a minimum number of points, or it can take many hours.

While it's running in the background, you can do other work in your analysis. Make sure to wait for it to complete before you change the configuration, edit the narrative, or open the **Explore anomalies** page for this insight.

The insight widget needs to run at least once before you can see results. If you think the status might be out of date, you can refresh the page. The insight can have the following states.


| Appears on the Page | Status | 
| --- | --- | 
| Run now button | The job has not yet started. | 
| Message about Analyzing for anomalies | The job is currently running. | 
| Narrative about the detected anomalies (outliers)  | The job has run successfully. The message says when this widget's calculation was last updated. | 
| Alert icon with an exclamation point (!)  | This icon indicates there was an error during the last run. If the narrative also displays, you can still use **Explore anomalies** to use data from the previous successful run.  | 
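
The **Severity**, **Delta**, and **Sort by** display options described in the procedure can be expressed as simple formulas. The following Python sketch is one possible reading of those descriptions, under several assumptions: the function names are hypothetical, the percentage threshold is taken relative to the expected value, and the logarithm is the natural log. The documentation doesn't state the base or the reference value, so treat this as an illustration rather than the formulas that Amazon Quick Sight uses.

```python
import math

# Severity levels from lowest to highest (short names assumed for this sketch).
SEVERITY_ORDER = ["Low", "Medium", "High", "Very high"]

def meets_severity(level, minimum):
    """True if an anomaly's severity level is at or above the chosen minimum."""
    return SEVERITY_ORDER.index(level) >= SEVERITY_ORDER.index(minimum)

def exceeds_absolute_delta(actual, expected, threshold):
    """Absolute value option: the difference must be greater than the threshold."""
    return abs(actual - expected) > threshold

def exceeds_percentage_delta(actual, expected, percent):
    """Percentage option, assumed here to be relative to the expected value."""
    return abs(actual - expected) > (percent / 100) * abs(expected)

def sort_metrics(actual, expected, anomaly_score):
    """Compute the values that the Sort by options are based on.

    Assumes actual != expected, because the log of zero is undefined.
    """
    difference = actual - expected
    return {
        "weighted_anomaly_score": anomaly_score * math.log(abs(difference)),
        "anomaly_score": anomaly_score,
        "weighted_difference_from_expected": anomaly_score * difference,
        "difference_from_expected": difference,
        "actual_value": actual,
    }

# With an absolute threshold of 48, a difference of exactly 48 is not flagged.
print(exceeds_absolute_delta(148, 100, 48))    # False
print(exceeds_absolute_delta(149, 100, 48))    # True
print(exceeds_percentage_delta(113, 100, 12.5))  # True
print(sort_metrics(148, 100, 2.0)["weighted_difference_from_expected"])  # 96.0
```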

# Using contribution analysis for key drivers


Amazon Quick Sight can identify the dimensions (categories) that contribute to outliers in measures (metrics) between two points in time. The key driver that contributes to an outlier helps you to answer the question: What happened to cause this anomaly? 
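
As a rough illustration of the idea, the following Python sketch ranks the values of one dimension by how much a metric changed between two points in time. It is a deliberately simplified stand-in, not the method that Amazon Quick Sight uses. The function name `rank_contributors` and the customer data are hypothetical.

```python
def rank_contributors(before, after, top_n=3):
    """Rank dimension values by the size of their change between two points in time.

    `before` and `after` map a dimension value (for example, a customer)
    to the metric value at the earlier and later point in time.
    """
    keys = sorted(set(before) | set(after))
    changes = {k: after.get(k, 0) - before.get(k, 0) for k in keys}
    total_change = sum(changes.values())
    ranked = sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)
    # Report each change along with its share of the total change.
    return [
        (key, delta, delta / total_change if total_change else 0.0)
        for key, delta in ranked[:top_n]
    ]

# Hypothetical sales by customer at two points in time.
sales_before = {"Customer A": 100, "Customer B": 50, "Customer C": 30}
sales_after = {"Customer A": 180, "Customer B": 55, "Customer C": 20}

for customer, delta, share in rank_contributors(sales_before, sales_after):
    print(f"{customer}: change {delta:+}, share of total change {share:.0%}")
```

In this hypothetical data, Customer A accounts for most of the increase, which is the kind of answer that a key driver is meant to surface.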

If you are already using anomaly detection without contribution analysis, you can enable the existing ML insight to find key drivers. Use the following procedure to add contribution analysis and identify the key drivers behind outliers. Your insight for anomaly detection needs to include a time field and at least one aggregated metric (SUM, AVERAGE, or COUNT). You can include multiple categories (dimension fields) if you wish, but you can also run contribution analysis without specifying any category or dimension field.

You can also use this procedure to change or remove fields as key drivers in your anomaly detection.

**To add contribution analysis to identify key drivers**

1. Open your analysis and locate an existing ML insight for anomaly detection. Select the insight widget to highlight it.

1. Choose **Menu Options** (**…**) from the menu on the visual.

1. Choose **Configure anomaly** to edit the settings.

1. The **Contribution analysis (optional)** setting allows Amazon Quick Sight to analyze the key drivers when an outlier (anomaly) is detected. For example, Amazon Quick Sight can show you the top customers that contributed to a spike in sales in the US for home improvement products. You can add up to four dimensions from your dataset, including dimensions that you didn't add to the field wells of this insight widget.

   To view a list of dimensions available for contribution analysis, choose **Select fields**.

   If you want to change the fields you're using as key drivers, change the fields that are enabled in this list. If you disable all of them, Quick Sight won't perform any contribution analysis in this insight.

1. To save your changes, scroll to the bottom of the configuration options, and choose **Save**. To exit without saving, choose **Cancel**. To completely remove these settings, choose **Delete**.