[ aws . wafv2 ]

get-top-path-statistics-by-traffic

Description

Retrieves aggregated statistics about the top URI paths accessed by bot traffic for a specified web ACL and time window. You can use this operation to analyze which paths on your web application receive the most bot traffic and identify the specific bots accessing those paths. The operation supports filtering by bot category, organization, or name, and allows you to drill down into specific path prefixes to view detailed URI-level statistics.

See also: AWS API Documentation

Synopsis

  get-top-path-statistics-by-traffic
--web-acl-arn <value>
--scope <value>
[--uri-path-prefix <value>]
--time-window <value>
[--bot-category <value>]
[--bot-organization <value>]
[--bot-name <value>]
--limit <value>
--number-of-top-traffic-bots-per-path <value>
[--next-marker <value>]
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
[--debug]
[--endpoint-url <value>]
[--no-verify-ssl]
[--no-paginate]
[--output <value>]
[--query <value>]
[--profile <value>]
[--region <value>]
[--version <value>]
[--color <value>]
[--no-sign-request]
[--ca-bundle <value>]
[--cli-read-timeout <value>]
[--cli-connect-timeout <value>]
[--cli-binary-format <value>]
[--no-cli-pager]
[--cli-auto-prompt]
[--no-cli-auto-prompt]

Options

--web-acl-arn (string) [required]

The Amazon Resource Name (ARN) of the web ACL for which you want to retrieve path statistics.

Constraints:

  • min: 20
  • max: 2048
  • pattern: .*\S.*

--scope (string) [required]

Specifies whether the web ACL is for an Amazon Web Services CloudFront distribution or for a regional application. A regional application can be an Application Load Balancer, an AppSync GraphQL API, an Amazon Cognito user pool, an Amazon Web Services App Runner service, or an Amazon Web Services Verified Access instance.

Possible values:

  • CLOUDFRONT
  • REGIONAL

--uri-path-prefix (string)

A URI path prefix to filter the results. When you specify this parameter, the operation returns statistics for individual URIs within the specified path prefix. For example, if you specify /api, the response includes statistics for paths like /api/v1/users and /api/v2/orders. If you don’t specify this parameter, the operation returns top-level path statistics.

Constraints:

  • min: 1
  • max: 512
  • pattern: ^\/[^ ]*$
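The constraints above can be checked locally before a request is sent. The following is a minimal sketch, assuming the documented length limits and pattern are the only client-side rules. The helper name is_valid_uri_path_prefix is hypothetical and is not part of the AWS CLI or any SDK.

```python
import re

# Pattern copied from the documented constraint: a leading slash, then no spaces.
URI_PATH_PREFIX = re.compile(r"^\/[^ ]*$")


def is_valid_uri_path_prefix(prefix: str) -> bool:
    """Return True if prefix satisfies the documented min/max/pattern constraints."""
    return 1 <= len(prefix) <= 512 and URI_PATH_PREFIX.match(prefix) is not None


for candidate in ["/api", "/", "api", "/api v1"]:
    print(candidate, is_valid_uri_path_prefix(candidate))
```

A value that fails this check would presumably be rejected by the service as well, so validating early mainly saves a round trip.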

--time-window (structure) [required]

The time window for which you want to retrieve path statistics. The time window must be within the data retention period for your web ACL.

StartTime -> (timestamp) [required]

The beginning of the time window for which you want to retrieve path statistics. You must specify the time in Coordinated Universal Time (UTC) format. UTC format includes the special designator, Z. For example, "2016-09-27T14:50Z". You can specify any time range in the previous three hours.

EndTime -> (timestamp) [required]

The end of the time window for which you want to retrieve path statistics. You must specify the time in Coordinated Universal Time (UTC) format. UTC format includes the special designator, Z. For example, "2016-09-27T14:50Z". You can specify any time range in the previous three hours.

Shorthand Syntax:

StartTime=timestamp,EndTime=timestamp

JSON Syntax:

{
  "StartTime": timestamp,
  "EndTime": timestamp
}
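A value in the shorthand form shown above can be generated rather than typed by hand. The following is a minimal sketch, assuming the timestamp shape from the documented example ("2016-09-27T14:50Z") is accepted. The function name time_window_shorthand is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def time_window_shorthand(end: datetime, hours: int) -> str:
    """Build a StartTime=...,EndTime=... string for the window ending at `end`."""
    end_utc = end.astimezone(timezone.utc)        # normalize to UTC
    start_utc = end_utc - timedelta(hours=hours)  # window length in hours
    fmt = "%Y-%m-%dT%H:%MZ"                       # same shape as the documented example
    return f"StartTime={start_utc.strftime(fmt)},EndTime={end_utc.strftime(fmt)}"


window = time_window_shorthand(
    datetime(2016, 9, 27, 14, 50, tzinfo=timezone.utc), hours=3
)
print(window)  # StartTime=2016-09-27T11:50Z,EndTime=2016-09-27T14:50Z
```

The resulting string is meant to be passed as the value of --time-window. In real use, end would typically be datetime.now(timezone.utc).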

--bot-category (string)

Filters the results to include only traffic from bots in the specified category. For example, you can filter by ai to see only AI crawler traffic, or search_engine to see only search engine bot traffic. When you apply this filter, the Source field is populated in the response.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

--bot-organization (string)

Filters the results to include only traffic from bots belonging to the specified organization. For example, you can filter by openai or google. When you apply this filter, the Source field is populated in the response.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

--bot-name (string)

Filters the results to include only traffic from the specified bot. For example, you can filter by gptbot or googlebot. When you apply this filter, the Source field is populated in the response.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

--limit (integer) [required]

The maximum number of path statistics to return. Valid values are 1 to 100.

Constraints:

  • min: 1
  • max: 100

--number-of-top-traffic-bots-per-path (integer) [required]

The maximum number of top bots to include in the statistics for each path. Valid values are 1 to 10.

Constraints:

  • min: 1
  • max: 10
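The required and optional options described so far can be assembled into a command line programmatically. The following is a minimal sketch that only builds the argument list and does not run the command. The function name build_command and the ARN value are hypothetical placeholders; the option names and value ranges come from this page.

```python
from typing import List, Optional


def build_command(
    web_acl_arn: str,
    scope: str,
    time_window: str,
    limit: int,
    bots_per_path: int,
    uri_path_prefix: Optional[str] = None,
    bot_category: Optional[str] = None,
    bot_organization: Optional[str] = None,
    bot_name: Optional[str] = None,
) -> List[str]:
    """Return an argv list for get-top-path-statistics-by-traffic."""
    # Enforce the documented value sets and ranges before building anything.
    if scope not in ("CLOUDFRONT", "REGIONAL"):
        raise ValueError("scope must be CLOUDFRONT or REGIONAL")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if not 1 <= bots_per_path <= 10:
        raise ValueError("bots_per_path must be between 1 and 10")

    argv = [
        "aws", "wafv2", "get-top-path-statistics-by-traffic",
        "--web-acl-arn", web_acl_arn,
        "--scope", scope,
        "--time-window", time_window,
        "--limit", str(limit),
        "--number-of-top-traffic-bots-per-path", str(bots_per_path),
    ]
    optional = {
        "--uri-path-prefix": uri_path_prefix,
        "--bot-category": bot_category,
        "--bot-organization": bot_organization,
        "--bot-name": bot_name,
    }
    # Append only the optional filters that were actually supplied.
    for flag, value in optional.items():
        if value is not None:
            argv += [flag, value]
    return argv


command = build_command(
    web_acl_arn="arn:aws:wafv2:us-east-1:123456789012:regional/webacl/example/EXAMPLE-ID",  # placeholder ARN
    scope="REGIONAL",
    time_window="StartTime=2016-09-27T11:50Z,EndTime=2016-09-27T14:50Z",
    limit=10,
    bots_per_path=3,
    bot_category="ai",
)
print(" ".join(command))
```

Returning a list rather than a single string keeps each value as its own argument, which avoids shell quoting problems if the list is later handed to a process launcher.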

--next-marker (string)

When you request a list of objects with a Limit setting, if the number of objects that are still available for retrieval exceeds the limit, WAF returns a NextMarker value in the response. To retrieve the next batch of objects, provide the marker from the prior call in your next request.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*
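The marker handshake described for --next-marker amounts to a simple loop: request a page, collect its entries, and repeat with the returned NextMarker until none is returned. The following is a minimal sketch of that loop. The fetch_page callable is a hypothetical stand-in for whatever issues the real request, and the two pages below are made-up illustration data.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_path_statistics(
    fetch_page: Callable[[Optional[str]], Dict]
) -> Iterator[Dict]:
    """Yield PathStatistics entries across pages, following NextMarker."""
    marker: Optional[str] = None
    while True:
        page = fetch_page(marker)  # first call passes None (no marker yet)
        yield from page.get("PathStatistics", [])
        marker = page.get("NextMarker")
        if not marker:  # no marker means there are no further pages
            break


# Made-up pages keyed by the marker that requests them (illustration only).
_pages = {
    None: {"PathStatistics": [{"Path": "/api/"}], "NextMarker": "m1"},
    "m1": {"PathStatistics": [{"Path": "/login"}]},
}
paths = [entry["Path"] for entry in iter_path_statistics(lambda marker: _pages[marker])]
print(paths)  # ['/api/', '/login']
```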

--cli-input-json | --cli-input-yaml (string) Reads arguments from the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, those values will override the JSON-provided values. It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. This may not be specified along with --cli-input-yaml.

--generate-cli-skeleton (string) Prints a JSON skeleton to standard output without sending an API request. If provided with no value or the value input, prints a sample input JSON that can be used as an argument for --cli-input-json. Similarly, if provided yaml-input it will print a sample input YAML that can be used with --cli-input-yaml. If provided with the value output, it validates the command inputs and returns a sample output JSON for that command. The generated JSON skeleton is not stable between versions of the AWS CLI and there are no backwards compatibility guarantees in the JSON skeleton generated.

Global Options

--debug (boolean)

Turn on debug logging.

--endpoint-url (string)

Override command’s default URL with the given URL.

--no-verify-ssl (boolean)

By default, the AWS CLI uses SSL when communicating with AWS services. For each SSL connection, the AWS CLI will verify SSL certificates. This option overrides the default behavior of verifying SSL certificates.

--no-paginate (boolean)

Disable automatic pagination. If automatic pagination is disabled, the AWS CLI will only make one call, for the first page of results.

--output (string)

The formatting style for command output.

  • json
  • text
  • table
  • yaml
  • yaml-stream

--query (string)

A JMESPath query to use in filtering the response data.

--profile (string)

Use a specific profile from your credential file.

--region (string)

The region to use. Overrides config/env settings.

--version (string)

Display the version of this tool.

--color (string)

Turn on/off color output.

  • on
  • off
  • auto

--no-sign-request (boolean)

Do not sign requests. Credentials will not be loaded if this argument is provided.

--ca-bundle (string)

The CA certificate bundle to use when verifying SSL certificates. Overrides config/env settings.

--cli-read-timeout (int)

The maximum socket read time in seconds. If the value is set to 0, the socket read will be blocking and will not time out. The default value is 60 seconds.

--cli-connect-timeout (int)

The maximum socket connect time in seconds. If the value is set to 0, the socket connect will be blocking and will not time out. The default value is 60 seconds.

--cli-binary-format (string)

The formatting style to be used for binary blobs. The default format is base64. The base64 format expects binary blobs to be provided as a base64 encoded string. The raw-in-base64-out format preserves compatibility with AWS CLI V1 behavior and binary values must be passed literally. When providing contents from a file that map to a binary blob, fileb:// will always be treated as binary and use the file contents directly regardless of the cli-binary-format setting. When using file://, the file contents will need to be properly formatted for the configured cli-binary-format.

  • base64
  • raw-in-base64-out

--no-cli-pager (boolean)

Disable cli pager for output.

--cli-auto-prompt (boolean)

Automatically prompt for CLI input parameters.

--no-cli-auto-prompt (boolean)

Disable automatic prompting for CLI input parameters.

Output

PathStatistics -> (list)

The list of path statistics, ordered by request count. Each entry includes the path, request count, percentage of total traffic, and the top bots accessing that path.

(structure)

Statistics about bot traffic to a specific URI path, including the path, request count, percentage of total traffic, and the top bots accessing that path.

Source -> (structure)

Information about the bot filter that was applied to generate these statistics. This field is only populated when you filter by bot category, organization, or name.

BotCategory -> (string)

The bot category that was used to filter the results. For example, ai or search_engine.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

BotOrganization -> (string)

The bot organization that was used to filter the results. For example, OpenAI or Google.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

BotName -> (string)

The bot name that was used to filter the results. For example, gptbot or googlebot.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

Path -> (string) [required]

The URI path. For example, /api/ or /api/v1/users.

Constraints:

  • min: 1
  • max: 512

RequestCount -> (long) [required]

The number of requests to this path within the specified time window.

Constraints:

  • min: 0

Percentage -> (double) [required]

The percentage of total requests that were made to this path.

Constraints:

  • min: 0.0
  • max: 100.0

TopBots -> (list)

The list of top bots accessing this path, ordered by request count. The number of bots included is determined by the NumberOfTopTrafficBotsPerPath parameter in the request.

(structure)

Statistics about a specific bot’s traffic to a path, including the bot name, request count, and percentage of traffic.

BotName -> (string) [required]

The name of the bot. For example, gptbot or googlebot.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

RequestCount -> (long) [required]

The number of requests from this bot to the associated path within the specified time window.

Constraints:

  • min: 0

Percentage -> (double) [required]

The percentage of total requests to the associated path that came from this bot.

Constraints:

  • min: 0.0
  • max: 100.0
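Once the JSON output has been parsed, the PathStatistics list can be reduced to a short per-path report. The following is a minimal sketch, assuming the response has the field names documented above. The summarize function is hypothetical, and the sample response contains made-up numbers for illustration only.

```python
from typing import Dict, List

# Made-up response shaped like the documented PathStatistics output.
response = {
    "PathStatistics": [
        {
            "Path": "/api/",
            "RequestCount": 600,
            "Percentage": 60.0,
            "TopBots": [
                {"BotName": "gptbot", "RequestCount": 450, "Percentage": 75.0},
                {"BotName": "googlebot", "RequestCount": 150, "Percentage": 25.0},
            ],
        },
        {"Path": "/login", "RequestCount": 400, "Percentage": 40.0, "TopBots": []},
    ]
}


def summarize(resp: Dict) -> List[str]:
    """Return one line per path with its request share and top bots, if any."""
    lines = []
    for entry in resp.get("PathStatistics", []):
        bots = ", ".join(
            f'{bot["BotName"]} ({bot["Percentage"]:.1f}%)'
            for bot in entry.get("TopBots", [])  # TopBots is not marked required
        )
        line = f'{entry["Path"]}: {entry["RequestCount"]} requests ({entry["Percentage"]:.1f}%)'
        if bots:
            line += f"; top bots: {bots}"
        lines.append(line)
    return lines


for line in summarize(response):
    print(line)
```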

TotalRequestCount -> (long)

The total number of requests that match the query criteria within the specified time window.

Constraints:

  • min: 0

NextMarker -> (string)

When you request a list of objects with a Limit setting, if the number of objects that are still available for retrieval exceeds the limit, WAF returns a NextMarker value in the response. To retrieve the next batch of objects, provide the marker from the prior call in your next request.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

TopCategories -> (list)

Category-level aggregations for visualizing bot category to path relationships. This field is only populated when no bot filters are applied to the request. Each entry includes the bot category and the paths accessed by bots in that category.

(structure)

Statistics about bot traffic to a specific URI path, including the path, request count, percentage of total traffic, and the top bots accessing that path.

Source -> (structure)

Information about the bot filter that was applied to generate these statistics. This field is only populated when you filter by bot category, organization, or name.

BotCategory -> (string)

The bot category that was used to filter the results. For example, ai or search_engine.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

BotOrganization -> (string)

The bot organization that was used to filter the results. For example, OpenAI or Google.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

BotName -> (string)

The bot name that was used to filter the results. For example, gptbot or googlebot.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

Path -> (string) [required]

The URI path. For example, /api/ or /api/v1/users.

Constraints:

  • min: 1
  • max: 512

RequestCount -> (long) [required]

The number of requests to this path within the specified time window.

Constraints:

  • min: 0

Percentage -> (double) [required]

The percentage of total requests that were made to this path.

Constraints:

  • min: 0.0
  • max: 100.0

TopBots -> (list)

The list of top bots accessing this path, ordered by request count. The number of bots included is determined by the NumberOfTopTrafficBotsPerPath parameter in the request.

(structure)

Statistics about a specific bot’s traffic to a path, including the bot name, request count, and percentage of traffic.

BotName -> (string) [required]

The name of the bot. For example, gptbot or googlebot.

Constraints:

  • min: 1
  • max: 256
  • pattern: .*\S.*

RequestCount -> (long) [required]

The number of requests from this bot to the associated path within the specified time window.

Constraints:

  • min: 0

Percentage -> (double) [required]

The percentage of total requests to the associated path that came from this bot.

Constraints:

  • min: 0.0
  • max: 100.0
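The TopCategories list can be regrouped into a category-to-paths mapping for charting. The following is a minimal sketch, assuming each TopCategories entry carries its category in Source.BotCategory. That is an inference from the structure above, not a confirmed behavior. The function name paths_by_category is hypothetical, and the sample data is made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Made-up response fragment shaped like the documented TopCategories output.
response = {
    "TopCategories": [
        {"Source": {"BotCategory": "ai"}, "Path": "/api/", "RequestCount": 300, "Percentage": 30.0},
        {"Source": {"BotCategory": "ai"}, "Path": "/docs", "RequestCount": 100, "Percentage": 10.0},
        {"Source": {"BotCategory": "search_engine"}, "Path": "/", "RequestCount": 200, "Percentage": 20.0},
    ]
}


def paths_by_category(resp: Dict) -> Dict[str, List[Tuple[str, int]]]:
    """Group (path, request count) pairs under their bot category."""
    grouped = defaultdict(list)
    for entry in resp.get("TopCategories", []):
        # Assumption: the category lives in Source.BotCategory; fall back if absent.
        category = entry.get("Source", {}).get("BotCategory", "unknown")
        grouped[category].append((entry["Path"], entry["RequestCount"]))
    return dict(grouped)


print(paths_by_category(response))
```

Because TopCategories is populated only when no bot filters are applied, a filtered request would yield an empty mapping here.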