# Source data used in a Detective behavior graph
<a name="detective-source-data-about"></a>

To populate a behavior graph, Amazon Detective uses source data from the behavior graph administrator account and member accounts.

With Detective, you can access up to a year of historical event data. This data is available through a set of visualizations that show changes in the type and volume of activity over a selected time window. Detective links these changes to GuardDuty findings.

![\[Diagram showing how a behavior graph uses data from the administrator account and member accounts, and uses the behavior graph data structure.\]](http://docs.aws.amazon.com/detective/latest/userguide/images/diagram_graph_structure_overview.png)


For details about the behavior graph data structure, see [Overview of the behavior graph data structure](https://docs.aws.amazon.com/detective/latest/userguide/graph-data-structure-overview.html) in the *Detective User Guide*.

## Types of core data sources in Detective
<a name="source-data-types"></a>

Detective ingests data from these types of AWS logs:
+ AWS CloudTrail logs 
+ Amazon Virtual Private Cloud (Amazon VPC) flow logs 
  + Ingests both IPv4 and IPv6 records, but not MAC records produced by Elastic Fabric Adapters.
  + Ingests log records whose `log-status` field has the value `OK`. For more information, see [Flow log records](https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html#flow-logs-fields) in the Amazon VPC User Guide.
  + Ingests only the flow logs produced by Amazon Elastic Compute Cloud (Amazon EC2) instances running in the VPCs. Flow logs from other resources, such as NAT gateways, RDS instances, or Fargate clusters, are not used.
  + Ingests both accepted and rejected traffic.
+ For accounts that are enrolled in GuardDuty, Detective also ingests GuardDuty findings.

Detective consumes CloudTrail and VPC flow log events using independent and duplicative streams of CloudTrail and VPC flow logs. These processes do not affect or use your existing CloudTrail and VPC flow log configurations. They also do not affect the performance of or increase your costs for these services.
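
The VPC flow log ingestion rules listed in this section can be read as a filter over flow log records. The following is a minimal sketch of that filter, not Detective's actual implementation. The `FlowLogRecord` class and its `source_resource` and `address_family` labels are hypothetical names introduced for illustration. Only the `log_status` and `action` values correspond to real flow log fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowLogRecord:
    """Hypothetical, simplified view of one VPC flow log record."""
    log_status: str       # flow log `log-status` field: "OK", "NODATA", or "SKIPDATA"
    action: str           # flow log `action` field: "ACCEPT" or "REJECT"
    source_resource: str  # hypothetical label, for example "ec2-instance" or "nat-gateway"
    address_family: str   # hypothetical label: "ipv4", "ipv6", or "mac"


def is_ingested(record: FlowLogRecord) -> bool:
    """Return True if the record passes the documented ingestion rules."""
    # Only records with a log-status of OK are ingested.
    if record.log_status != "OK":
        return False
    # Only flow logs produced by EC2 instances are used.
    if record.source_resource != "ec2-instance":
        return False
    # IPv4 and IPv6 records are ingested; MAC records are not.
    if record.address_family not in ("ipv4", "ipv6"):
        return False
    # Both accepted and rejected traffic pass, so `action` is not filtered.
    return True


if __name__ == "__main__":
    rejected = FlowLogRecord("OK", "REJECT", "ec2-instance", "ipv6")
    nat = FlowLogRecord("OK", "ACCEPT", "nat-gateway", "ipv4")
    print(is_ingested(rejected), is_ingested(nat))
```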

## Types of optional data sources in Detective
<a name="source-data-types-optional"></a>

In addition to the three data sources in the Detective core package (AWS CloudTrail logs, Amazon VPC flow logs, and GuardDuty findings), Detective offers optional source packages. You can start or stop an optional data source package for a behavior graph at any time.

Detective provides a 30-day free trial for all core and optional source packages per Region.

**Note**  
Detective retains all data received from each data source package for up to 1 year.

Currently the following optional source packages are available:
+ **EKS audit logs**

  This optional data source package allows Detective to ingest detailed information about the EKS clusters in your environment and add that data to your behavior graph. Detective correlates user activities with AWS CloudTrail management events and network activity with Amazon VPC flow logs without the need for you to enable or store these logs manually. See [Amazon EKS audit logs](source-data-types-EKS.md) for details.
+ **AWS security findings**

  This optional data source package allows Detective to ingest data from Security Hub CSPM and add that data to your behavior graph. See [**AWS security findings**](source-data-types-asff.md) for details.

**Starting or stopping an optional data source:**

1. Open the Detective console at [https://console.aws.amazon.com/detective/](https://console.aws.amazon.com/detective/).

1. From the navigation panel under **Settings**, choose **General**.

1. Under **Optional source packages**, choose **Update**. Select the box for each data source that you want to enable, or deselect the box for a data source that is already enabled. Then choose **Update** to apply the change.

**Note**  
If you stop and then restart an optional data source, you will see a gap in the data displayed on some entity profiles. The console notes this gap, which represents the period of time when the data source was stopped. When a data source is restarted, Detective does not retroactively ingest data.
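
As an alternative to the console steps, an optional data source package can be started programmatically. The following is a minimal sketch, assuming the boto3 Detective client and its `update_datasource_packages` operation with the package names `EKS_AUDIT` and `ASFF_SECURITYHUB_FINDING`. The helper name `start_optional_packages` is hypothetical. Verify the operation and its parameters against the current Detective API reference before relying on it.

```python
from typing import Any, Iterable, List

# Assumed API names for the two optional packages described in this section.
OPTIONAL_PACKAGES = frozenset({"EKS_AUDIT", "ASFF_SECURITYHUB_FINDING"})


def start_optional_packages(client: Any, graph_arn: str,
                            packages: Iterable[str]) -> List[str]:
    """Start the given optional source packages for one behavior graph.

    `client` is expected to behave like boto3.client("detective").
    Returns the de-duplicated, sorted list of packages that was requested.
    """
    requested = sorted(set(packages))
    unknown = [name for name in requested if name not in OPTIONAL_PACKAGES]
    if unknown:
        raise ValueError(f"Not an optional source package: {unknown}")
    client.update_datasource_packages(
        GraphArn=graph_arn,
        DatasourcePackages=requested,
    )
    return requested


# Example call (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("detective")
#   start_optional_packages(client, "<behavior-graph-arn>", ["EKS_AUDIT"])
```

Passing the client as a parameter keeps the helper easy to exercise with a stub, without calling AWS.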

# Amazon EKS audit logs
<a name="source-data-types-EKS"></a>

Amazon EKS audit logs is an optional data source package that can be added to your Detective behavior graph. You can view the available optional source packages, and their status in your account, from the **Settings** page in the console or through the Detective API. 
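
To illustrate checking package status through the API, the following sketch collects the ingest state of each source package for a behavior graph. It assumes the boto3 Detective client's `list_datasource_packages` operation and a response that maps each package name to a `DatasourcePackageIngestState` value. The helper name `package_states` is hypothetical, and the response shape should be confirmed in the Detective API reference.

```python
from typing import Any, Dict


def package_states(client: Any, graph_arn: str) -> Dict[str, str]:
    """Return {package name: ingest state} for one behavior graph.

    `client` is expected to behave like boto3.client("detective").
    Follows NextToken until all pages have been read.
    """
    states: Dict[str, str] = {}
    token = None
    while True:
        kwargs = {"GraphArn": graph_arn}
        if token:
            kwargs["NextToken"] = token
        page = client.list_datasource_packages(**kwargs)
        for name, detail in page.get("DatasourcePackages", {}).items():
            # Fall back to "UNKNOWN" if the assumed key is missing.
            states[name] = detail.get("DatasourcePackageIngestState", "UNKNOWN")
        token = page.get("NextToken")
        if not token:
            return states


# Example call (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   print(package_states(boto3.client("detective"), "<behavior-graph-arn>"))
```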

A 30-day free trial is provided for this data source. To learn more, see [Free trial for optional data sources](free-trial-overview.md#free-trial-datasource).

Enabling Amazon EKS audit logs allows Detective to add in-depth information about resources created with Amazon EKS to your behavior graph. This data source enhances the information provided about the following entity types: EKS Cluster, Kubernetes Pod, Container Image and Kubernetes subject. 

Additionally, if you have enabled EKS audit logs as a data source in Amazon GuardDuty, you can see details for Kubernetes findings from GuardDuty. For more information about enabling this data source in GuardDuty, see [Kubernetes protection in Amazon GuardDuty](https://docs.aws.amazon.com//guardduty/latest/ug/kubernetes-protection.html).

**Note**  
This data source is enabled by default for new behavior graphs created after July 26, 2022. For behavior graphs created before July 26, 2022, it must be enabled manually.

**Adding or removing Amazon EKS audit logs as an optional data source:**

1. Open the Detective console at [https://console.aws.amazon.com/detective/](https://console.aws.amazon.com/detective/).

1. From the navigation panel under **Settings**, choose **General**.

1. Under **Source packages**, select **EKS audit logs** to enable this data source. If it is already enabled, select it again to stop ingesting **EKS audit logs** into your behavior graph.

# **AWS security findings**
<a name="source-data-types-asff"></a>

**AWS security findings** is an optional data source package that can be added to your Detective behavior graph.

You can view the available optional source packages, and their status in your account, from the Settings page in the console or through the Detective API.

A 30-day free trial is provided for this data source. To learn more, see [Free trial for optional data sources](free-trial-overview.md#free-trial-datasource).

Enabling **AWS security findings** allows Detective to use the findings that Security Hub CSPM aggregates from upstream services. These findings use a standard format called the AWS Security Finding Format (ASFF), which eliminates the need for time-consuming data conversion efforts. Then it correlates ingested findings across products to prioritize the most important ones.

**Adding or removing AWS security findings as an optional data source:**

**Note**  
The AWS security findings data source is enabled by default for new behavior graphs created after May 16, 2023. For behavior graphs created before May 16, 2023, it must be enabled manually.

1. Open the Detective console at [https://console.aws.amazon.com/detective/](https://console.aws.amazon.com/detective/).

1. From the navigation panel under **Settings**, choose **General**.

1. Under **Source packages**, select **AWS security findings** to enable this data source. If it is already enabled, select it again to stop ingesting AWS Security Finding Format (ASFF) findings into your behavior graph.

## Currently supported findings
<a name="currently-supported-findings"></a>

Detective ingests all ASFF findings in Security Hub CSPM from services that are owned by Amazon or AWS.
+ To see the list of supported service integrations, see [Available AWS service integrations](https://docs.aws.amazon.com//securityhub/latest/userguide/securityhub-internal-providers.html) in the AWS Security Hub User Guide.
+ For the list of supported resources, see [Resources](https://docs.aws.amazon.com//securityhub/latest/userguide/asff-resources.html) in the AWS Security Hub User Guide.
+ AWS service findings with a compliance status other than `FAILED`, and cross-Region aggregated findings, are not ingested.
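
The ingestion rules above can be sketched as a predicate over an ASFF finding. This is an illustration only, not Detective's actual logic, and it rests on several labeled assumptions. It identifies Amazon-owned or AWS-owned services through the ASFF `CompanyName` field. It treats a finding with no `Compliance` object as eligible. It also treats a finding whose `Region` differs from a hypothetical `home_region` argument as cross-Region aggregated.

```python
from typing import Any, Mapping

# Assumption: AWS-owned integrations report one of these CompanyName values.
AWS_COMPANY_NAMES = frozenset({"AWS", "Amazon"})


def is_ingested_finding(finding: Mapping[str, Any], home_region: str) -> bool:
    """Return True if an ASFF finding passes the rules sketched above."""
    # Rule 1: only findings from services owned by Amazon or AWS.
    if finding.get("CompanyName") not in AWS_COMPANY_NAMES:
        return False
    # Rule 2: if a compliance status is present, it must be FAILED.
    # Assumption: findings without a Compliance object are not excluded.
    compliance = finding.get("Compliance")
    if compliance is not None and compliance.get("Status") != "FAILED":
        return False
    # Rule 3: skip cross-Region aggregated findings.
    # Assumption: a Region mismatch marks an aggregated finding.
    if finding.get("Region") != home_region:
        return False
    return True


if __name__ == "__main__":
    sample = {
        "CompanyName": "AWS",
        "Region": "us-east-1",
        "Compliance": {"Status": "PASSED"},
    }
    print(is_ingested_finding(sample, "us-east-1"))
```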

## How Detective ingests and stores source data
<a name="source-data-storage"></a>

When Detective is enabled, Detective begins ingesting source data from the behavior graph administrator account. As member accounts are added to the behavior graph, Detective also begins using the data from those member accounts.

Detective source data consists of structured and processed versions of the original feeds. To support Detective analytics, Detective stores copies of the Detective source data.

The Detective ingest process feeds data into Amazon Simple Storage Service (Amazon S3) buckets in the Detective source data store. As new source data arrives, other Detective components pick up the data and start the extraction and analytics processes. For more information, see [How Detective uses source data to populate a behavior graph](https://docs.aws.amazon.com/detective/latest/userguide/behavior-graph-population-about.html) in the *Detective User Guide*.

## How Detective enforces the data volume quota for behavior graphs
<a name="data-volume-enforcement"></a>

Detective has strict quotas on the volume of data it allows in each behavior graph. The data volume is the amount of data per day that flows into the Detective behavior graph.

Detective enforces these quotas when an administrator account enables Detective, and when a member account accepts an invitation to contribute to a behavior graph.
+ If the data volume for an administrator account exceeds 10 TB per day, then the administrator account cannot enable Detective.
+ If the added data volume from a member account would cause the behavior graph to exceed 10 TB per day, the member account cannot be enabled.

The data volume for a behavior graph can also grow naturally over time. Detective checks the behavior graph data volume each day to make sure that it does not exceed the quota.

If the behavior graph data volume is approaching the quota, Detective displays a warning message on the console. To avoid exceeding the quota, you can remove member accounts.

If the behavior graph data volume exceeds 10 TB per day, then you cannot add a new member account to the behavior graph.

If the behavior graph data volume exceeds 15 TB per day, then Detective stops ingesting data into the behavior graph. The 15 TB per day quota reflects both normal data volume and spikes in the data volume. When this quota is reached, no new data is ingested into the behavior graph, but existing data is not removed. You can still use that historical data for investigation. The console displays a message to indicate that the data ingest is suspended for the behavior graph.

If the data ingest is suspended, you must work with Support to get it re-enabled. If possible, before you contact Support, try to remove member accounts to get the data volume below the quota. This makes it easier to re-enable the data ingest for the behavior graph.
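
The quota rules in this section reduce to two thresholds, 10 TB per day and 15 TB per day. The following sketch restates them as plain functions to make the boundaries explicit. The function names and the returned status strings are hypothetical, and the console warning for a graph that is approaching the quota is not modeled because this section does not state its threshold.

```python
ENABLE_LIMIT_TB = 10.0   # limit for enabling Detective and for adding member accounts
SUSPEND_LIMIT_TB = 15.0  # limit above which data ingest is suspended


def can_enable_admin(admin_tb_per_day: float) -> bool:
    """An administrator account cannot enable Detective above 10 TB per day."""
    return admin_tb_per_day <= ENABLE_LIMIT_TB


def can_add_member(graph_tb_per_day: float, member_tb_per_day: float) -> bool:
    """A member cannot be enabled if it would push the graph above 10 TB per day."""
    return graph_tb_per_day + member_tb_per_day <= ENABLE_LIMIT_TB


def daily_check(graph_tb_per_day: float) -> str:
    """Classify the behavior graph after the daily volume check."""
    if graph_tb_per_day > SUSPEND_LIMIT_TB:
        # No new data is ingested; existing data is kept.
        return "ingest_suspended"
    if graph_tb_per_day > ENABLE_LIMIT_TB:
        # Ingest continues, but new member accounts cannot be added.
        return "no_new_members"
    return "ok"


if __name__ == "__main__":
    for volume in (8.0, 12.5, 15.5):
        print(volume, daily_check(volume))
```

For example, a graph at 9 TB per day can accept a member that contributes 1 TB per day, but not one that contributes 2 TB per day.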