$bucket - Amazon DocumentDB

$bucket

New from version 8.0

Not supported by elastic clusters.

The $bucket aggregation stage in Amazon DocumentDB allows you to group input documents into buckets based on a specified expression and bucket boundaries. This can be useful for analyzing data that falls within certain value ranges or categories.

Parameters

  • groupBy (required): The expression that specifies the value to group by.

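  • boundaries (required): An array of double values, in ascending order, that define the bucket boundaries. Each adjacent pair of values defines one bucket that includes its lower bound and excludes its upper bound. A document is assigned to the bucket whose range contains its groupBy expression value, and the bucket's _id is its lower bound.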

  • default (optional): A literal value used as the _id of an additional bucket that collects documents whose groupBy expression value does not fall within any of the specified boundaries.

  • output (optional): An object that specifies the information to output for each bucket. You can use accumulator operators like $sum, $avg, $min, and $max to compute aggregations for each bucket.
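The assignment rule described by boundaries and default can be sketched locally in plain Python. This is only an illustration of the semantics, not how DocumentDB implements the stage. The helper name bucket_id is hypothetical, and the sketch assumes (consistent with the example output in this topic) that each bucket includes its lower bound, excludes its upper bound, and uses the lower bound as its _id.

```python
from bisect import bisect_right


def bucket_id(value, boundaries, default=None):
    """Return the _id of the bucket that value falls into (hypothetical helper)."""
    # Values below the first boundary, or at or above the last one, match no bucket.
    if value < boundaries[0] or value >= boundaries[-1]:
        return default
    # bisect_right returns the index of the first boundary greater than value;
    # the boundary just before it is the bucket's inclusive lower bound.
    return boundaries[bisect_right(boundaries, value) - 1]


boundaries = [0, 10, 20, 30]
print(bucket_id(5, boundaries, "Other"))   # 0
print(bucket_id(10, boundaries, "Other"))  # 10
print(bucket_id(30, boundaries, "Other"))  # Other
```

A value equal to a boundary, such as 10, lands in the bucket that starts at that boundary, and a value equal to the last boundary falls through to the default bucket.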

Example (MongoDB Shell)

The following example demonstrates how to use the $bucket stage to group sales data by price range.

Create sample documents

db.sales.insertMany([
  { item: "abc", price: 10, quantity: 2, date: new Date("2020-09-01") },
  { item: "def", price: 20, quantity: 1, date: new Date("2020-10-01") },
  { item: "ghi", price: 5, quantity: 3, date: new Date("2020-11-01") },
  { item: "jkl", price: 15, quantity: 2, date: new Date("2020-12-01") },
  { item: "mno", price: 25, quantity: 1, date: new Date("2021-01-01") }
]);

Query example

db.sales.aggregate([
  {
    $bucket: {
      groupBy: "$price",
      boundaries: [0, 10, 20, 30],
      default: "Other",
      output: {
        "count": { $sum: 1 },
        "totalQuantity": { $sum: "$quantity" }
      }
    }
  },
  { $sort: { _id: 1 } }
])

Output

[
  { _id: 0, count: 1, totalQuantity: 3 },
  { _id: 10, count: 2, totalQuantity: 4 },
  { _id: 20, count: 2, totalQuantity: 2 }
]
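To see where these numbers come from, the same grouping can be reproduced locally. The sketch below is plain Python over copies of the sample documents, not a call to DocumentDB, and bucket_sales is a hypothetical helper written for this illustration. Because every price falls within [0, 30), no document reaches the "Other" bucket, which is why it does not appear in the output.

```python
from bisect import bisect_right
from collections import defaultdict

# Local copies of the sample documents (date omitted; it is not used here).
sales = [
    {"item": "abc", "price": 10, "quantity": 2},
    {"item": "def", "price": 20, "quantity": 1},
    {"item": "ghi", "price": 5, "quantity": 3},
    {"item": "jkl", "price": 15, "quantity": 2},
    {"item": "mno", "price": 25, "quantity": 1},
]


def bucket_sales(docs, boundaries, default):
    """Group docs by price range and accumulate count and totalQuantity."""
    totals = defaultdict(lambda: {"count": 0, "totalQuantity": 0})
    for doc in docs:
        price = doc["price"]
        if boundaries[0] <= price < boundaries[-1]:
            # Inclusive lower bound of the range that contains price.
            key = boundaries[bisect_right(boundaries, price) - 1]
        else:
            key = default
        totals[key]["count"] += 1
        totals[key]["totalQuantity"] += doc["quantity"]
    # List numeric bucket ids in ascending order, then the default bucket.
    ordered = sorted(totals.items(), key=lambda kv: (isinstance(kv[0], str), kv[0]))
    return [{"_id": key, **acc} for key, acc in ordered]


for bucket in bucket_sales(sales, [0, 10, 20, 30], "Other"):
    print(bucket)
```

Calling bucket_sales(sales, [0, 10, 20], "Other") instead moves the documents priced at 20 and 25 into the "Other" bucket, because 20 is then the exclusive upper bound of the last range.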

Code examples

The following code examples show how to use the $bucket stage in each supported language:

Node.js
const { MongoClient } = require('mongodb');

async function example() {
  const client = await MongoClient.connect('mongodb://<username>:<password>@<cluster-endpoint>:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false');
  const db = client.db('test');
  const sales = db.collection('sales');

  // Group sales into price ranges, then sort the buckets by lower bound.
  const result = await sales.aggregate([
    {
      $bucket: {
        groupBy: "$price",
        boundaries: [0, 10, 20, 30],
        default: "Other",
        output: {
          "count": { $sum: 1 },
          "totalQuantity": { $sum: "$quantity" }
        }
      }
    },
    { $sort: { _id: 1 } }
  ]).toArray();

  console.log(result);

  // Wait for the connection to close before the function returns.
  await client.close();
}

example();
Python
from pymongo import MongoClient

def example():
    client = MongoClient('mongodb://<username>:<password>@<cluster-endpoint>:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false')
    db = client['test']
    sales = db['sales']

    # Group sales into price ranges, then sort the buckets by lower bound.
    result = list(sales.aggregate([
        {
            '$bucket': {
                'groupBy': '$price',
                'boundaries': [0, 10, 20, 30],
                'default': 'Other',
                'output': {
                    'count': {'$sum': 1},
                    'totalQuantity': {'$sum': '$quantity'}
                }
            }
        },
        {'$sort': {'_id': 1}}
    ]))

    print(result)
    client.close()

example()