S3 Metadata live inventory tables schema - Amazon Simple Storage Service

S3 Metadata live inventory tables schema

The live inventory table provides a simple, queryable inventory of all the objects and their versions in your bucket so that you can determine the latest state of your data. Updates to your objects are typically reflected in the inventory table within one hour.

You can use this table to simplify and speed up business workflows and big data jobs by identifying objects that you want to process for various workloads. For example, you can query the inventory table to do the following:

  • Find all objects stored in the S3 Glacier Deep Archive storage class.

  • Create a distribution of object tags or find objects without tags.

  • Find all objects that aren't encrypted by using server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS).

When you enable the inventory table for your metadata table configuration, the table goes through a process known as backfilling, during which Amazon S3 scans your general purpose bucket to retrieve the initial metadata for all objects in the bucket. Depending on the number of objects in your bucket, this process can take minutes (minimum 15 minutes) to hours. When the backfilling process is finished, the status of your inventory table changes from Backfilling to Active. After backfilling is completed, updates to your objects are typically reflected in the inventory table within one hour.

Note

You're charged for backfilling your inventory table. If your general purpose bucket has more than one billion objects, you're also charged a monthly fee for your inventory table. For more information, see Amazon S3 Pricing.

Amazon S3 Metadata inventory tables contain rows and columns. Each row represents the current state of an object in your general purpose bucket. The inventory table provides a simple, queryable inventory of all objects in your bucket so that you can determine the current state of your data.

The following is an example of an inventory table for a general purpose bucket named amzn-s3-demo-bucket:

bucket key sequence_number version_id is_delete_marker size last_modified_date e_tag storage_class is_multipart encryption_status is_bucket_key_enabled kms_key_arn checksum_algorithm object_tags user_metadata amzn-s3-demo-bucket Finance/statement1.pdf 80e737d8b4d82f776affffffffffffffff006737d8b4d82f776a00000000000000000000000000000000000000000000000072 FALSE 6223 11/15/2024 23:26 e131b86632dda753aac4018f72192b83 STANDARD FALSE SSE-KMS FALSE arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890df SSECRC32 {} {count -> Asia, customs -> false, family -> true, location -> Mary, name -> football, user -> United States} amzn-s3-demo-bucket s3-dg.pdf 80e737d8b4e39f1dbdffffffffffffffff006737d8b4e39f1dbd00000000000000000000000000000000000000000000000072 FALSE 3554 11/15/2024 23:26 9bb49efc2d92c05558ddffbbde8636d5 STANDARD FALSE DSSE-KMS FALSE arn:aws:kms:us-east-1:936810216292:key/0dcebce6-49fd-4cae-b2e2-5512ad281afd SSESHA1 {} {} amzn-s3-demo-bucket Development/Projects.xls 80e737d8b4ed9ac5c6ffffffffffffffff006737d8b4ed9ac5c600000000000000000000000000000000000000000000000072 FALSE 7746 11/15/2024 23:26 729a6863e47fb9955b31bfabce984908 STANDARD FALSE SSE-S3 FALSE NULL SSECRC32 {} {count -> Asia, customs -> Canada, family -> Billiards, filter -> true, location -> Europe, name -> Asia, user -> United States}

Inventory tables have the following schema:

Column name Required? Data type

bucket

Yes String The general purpose bucket name. For more information, see General purpose bucket naming rules.

key

Yes String The object key name (or key) that uniquely identifies the object in the bucket. For more information, see Naming Amazon S3 objects.

sequence_number

Yes String

The sequence number, which is an ordinal that's included in the records for a given object. To order records of the same bucket and key, you can sort on sequence_number. For a given bucket and key, a lexicographically larger sequence_number value implies that the record was introduced to the bucket more recently.

version_id

No String

The object's version ID. When you enable versioning on a bucket, Amazon S3 assigns a version number to objects that are added to the bucket. For more information, see Retaining multiple versions of objects with S3 Versioning.

Objects that are stored in your bucket before you set the versioning state have a version ID of null.

is_delete_marker

No Boolean

The object's delete marker status. If the object is a delete marker, this value is True. Otherwise, it's False. For more information, see Working with delete markers.

Note

Rows that are added for delete markers have a record_type value of DELETE, not UPDATE_METADATA. If the delete marker is created as the result of an S3 Lifecycle expiration, the requester value is s3.amazonaws.com.

size

No Long

The object size in bytes, not including the size of incomplete multipart uploads or object metadata. If is_delete_marker is True, the size is 0. For more information, see System-defined object metadata.

last_modified_date

No Timestamp NTZ (no time zone)

The object creation date or the last modified date, whichever is the latest. For multipart uploads, the object creation date is the date when the multipart upload is initiated. For more information, see System-defined object metadata.

e_tag

No String

The entity tag (ETag), which is a hash of the object. The ETag reflects changes only to the contents of an object, not to its metadata. The ETag can be an MD5 digest of the object data. Whether the ETag is an MD5 digest depends on how the object was created and how it's encrypted. For more information, see Object in the Amazon S3 API Reference.

storage_class

No String

The storage class that’s used for storing the object. One of STANDARD, REDUCED_REDUNDANCY, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, GLACIER, DEEP_ARCHIVE, or GLACIER_IR. For more information, see Understanding and managing Amazon S3 storage classes.

is_multipart

No Boolean

The object's upload type. If the object was uploaded as a multipart upload, this value is True. Otherwise, it's False. For more information, see Uploading and copying objects using multipart upload in Amazon S3.

encryption_status

No String

The object's server-side encryption status, depending on what kind of encryption key is used: server-side encryption with Amazon S3 managed keys (SSE-S3), server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS), dual-layer server-side encryption with AWS KMS keys (DSSE-KMS), or server-side encryption with customer-provided keys (SSE-C). If the object is unencrypted, this value is null. Possible values are SSE-S3, SSE-KMS, DSSE-KMS, SSE-C, or null. For more information, see Protecting data with encryption.

is_bucket_key_enabled

No Boolean

The object's S3 Bucket Key enablement status. If the object uses an S3 Bucket Key for SSE-KMS, this value is True. Otherwise, it's False. For more information, see Configuring an S3 Bucket Key at the object level.

kms_key_arn

No String

The Amazon Resource Name (ARN) for the KMS key with which the object is encrypted, for rows where encryption_status is SSE-KMS or DSSE-KMS. If the object isn't encrypted with SSE-KMS or DSSE-KMS, the value is null. For more information, see Using server-side encryption with AWS KMS keys (SSE-KMS) and Using dual-layer server-side encryption with AWS KMS keys (DSSE-KMS).

Note

If a row represents an object version that no longer existed at the time that a delete or overwrite event was processed, kms_key_arn contains a null value, even if the encryption_status column value is SSE-KMS or DSSE-KMS.

checksum_algorithm

No String

The algorithm that’s used to create the checksum for the object, one of CRC64-NVME, CRC32, CRC32C, SHA1, or SHA256. If no checksum is present, this value is null. For more information, see Using supported checksum algorithms.

object_tags

No Map <String, String>

The object tags that are associated with the object. Object tags are stored as a map of key-value pairs. If an object has no object tags, an empty map ({}) is stored. For more information, see Categorizing your storage using tags.

Note

If the record_type value is DELETE, the object_tags column contains a null value. If the record_type value is CREATE or UPDATE_METADATA, rows that represent object versions that no longer existed at the time that a delete or overwrite event was processed will contain a null value in the object_tags column.

user_metadata

No Map <String, String>

The user metadata that's associated with the object. User metadata is stored as a map of key-value pairs. If an object has no user metadata, an empty map ({}) is stored. For more information, see User-defined object metadata.

Note

If the record_type value is DELETE, the user_metadata column contains a null value. If the record_type value is CREATE or UPDATE_METADATA, rows that represent object versions that no longer existed at the time that a delete or overwrite event was processed will contain a null value in the user_metadata column.