Getting started with Bloom filters - Amazon ElastiCache

Getting started with Bloom filters

ElastiCache supports the Bloom filter data structure, which provides a space efficient probabilistic data structure to check if an element is a member of a set. When using Bloom filters, false positives are possible—a filter can incorrectly indicate that an element exists, even though that element was not added to the set. However, using Bloom filters will prevent false negatives—incorrect indications that an element does not exist, even though that element was added to the set.

You can set the percentage of potential false positives to a preferred rate for your workload, by adjusting the fp rate. You can also configure capacity (the number of items a Bloom filter can hold), scaling and non-scaling properties, and more.

After you create a cluster with a supported engine version, the Bloom data type and associated commands are automatically available. The bloom data type is API compatible with the Bloom filter command syntax of the official Valkey client libraries including valkey-py, valkey-java, and valkey-go. You can easily migrate existing Bloom-based Valkey and Redis OSS applications into ElastiCache. For a complete list of commands see Bloom filter commands.

The Bloom-related metrics BloomFilterBasedCmds, BloomFilterBasedCmdsLatency, and BloomFilterBasedCmdsECPUs are incorporated into CloudWatch to monitor the usage of this data type. For more information, see Metrics for Valkey and Redis OSS.

Note
  • To use Bloom filters, you must be running on ElastiCache Valkey 8.1 and later.

  • The bloom data type is not RDB compatible with other non-Valkey based bloom offerings.

Bloom filters data type overview

Bloom filters are a space efficient probabilistic data structure that allows adding elements and checking whether elements exist. False positives are possible where a filter incorrectly indicates that an element exists, even though it was not added. However, Bloom Filters guarantee that false negatives (incorrectly indicating that an element does not exist, even though it was added) do not occur.

The main source of documentation for bloom filters can be found on the valkey.io documentation page. This contains the following information:

Bloom size limit

The consumption of memory by a single Bloom filter object is limited to 128 MB. You can check the amount of memory consumed by a Bloom filter by using the BF.INFO <key> SIZE command.

Bloom ACLs

Similar to the existing per-datatype categories (@string, @hash, etc.) a new category @bloom is added to simplify managing access to Bloom commands and data. No other existing Valkey or Redis OSS commands are members of the @bloom category.

There are 3 existing ACL categories that are updated to include the new Bloom commands: @read, @write and @fast. The following table indicates the mapping of Bloom commands to the appropriate categories.

Bloom command @read @write @fast @bloom

BF.ADD

y

y

y

BF.CARD

y

y

y

BF.EXISTS

y

y

y

BF.INFO

y

y

y

BF.INSERT

y

y

y

BF.MADD

y

y

y

BF.MEXISTS

y

y

y

BF.RESERVE

y

y

y

Bloom filter related metrics

The following CloudWatch metrics related to bloom data structures are provided:

CW Metrics Unit Serverless/Self-designed Description

BloomFilterBasedCmds

Count

Both

The total number of Bloom filter commands, including both read and write commands.

BloomFilterBasedCmdsLatency

Microseconds

Self-managed

Latency of all Bloom filter commands, including both read and write commands.

BloomFilterBasedCmdsECPUs

Count

Serverless

ECPUs consumed by all Bloom filter commands, including both read and write commands.

Bloom filter commands

Bloom Filter commands are documented on the Valkey.io website. Each command page provides a comprehensive overview of the bloom commands, including its syntax, behavior, return values, and potential error conditions.

Name Description
BF.ADD

Adds a single item to a bloom filter.If the filter doesn't already exist, it is created.

BF.CARD

Returns the cardinality of a bloom filter.

BF.EXISTS

Determines if the bloom filter contains the specified item.

BF.INFO

Returns usage information and properties of a specific bloom filter.

BF.INSERT

Creates a bloom filter with 0 or more itemes, or adds items to an existing bloom filter.

BF.MADD

Adds one or more items to a bloom filter.

BF.MEXISTS

Determines if the bloom filter contains 1 or more items.

BF.RESERVE

Creates an empty bloom filter with the specified properties.

Note

BF.LOAD is not supported by ElastiCache. It is only relevant for AOF usage, which ElastiCache does not support.