使用 AWS SDK 将文件归档到 Amazon Glacier,获取通知并启动任务 - Amazon Glacier

此页面仅适用于使用文件库和 2012 年原始 REST API 的 Amazon Glacier 服务的现有客户。

如果您正在寻找归档存储解决方案,建议使用 Amazon S3 中的 Amazon Glacier 存储类别 S3 Glacier Instant Retrieval、S3 Glacier Flexible Retrieval 和 S3 Glacier Deep Archive。要了解有关这些存储选项的更多信息,请参阅 Amazon Glacier 存储类别

从 2025 年 12 月 15 日起,Amazon Glacier(最初基于独立文件库的服务)将不再接受新客户,对现有客户不存在任何影响。Amazon Glacier 是一项独立服务,拥有自己的 API,可将数据存储在文件库中,与 Amazon S3 和 Amazon S3 Glacier 存储类别不同。在 Amazon Glacier 中,您现有的数据将确保安全,并且可以无限期地访问。无需进行迁移。对于低成本、长期的存档存储,AWS 建议使用 Amazon S3 Glacier 存储类别,该类别可通过基于 S3 存储桶的 API、完全的 AWS 区域可用性、更低的成本和 AWS 服务集成,提供卓越的客户体验。如果您希望加强功能,可以考虑使用我们的 AWS 将数据从 Amazon Glacier 文件库传输到 Amazon S3 Glacier 存储类别的解决方案指南,迁移到 Amazon S3 Glacier 存储类别。

使用 AWS SDK 将文件归档到 Amazon Glacier,获取通知并启动任务

以下代码示例展示了如何:

  • 创建 Amazon Glacier 文件库。

  • 配置文件库,将通知发布到 Amazon SNS 主题。

  • 将档案文件上传到文件库。

  • 启动档案检索任务。

Python
适用于 Python 的 SDK(Boto3)
注意

查看 GitHub,了解更多信息。在 AWS 代码示例存储库中查找完整示例,了解如何进行设置和运行。

创建一个包装 Amazon Glacier 操作的类。

import argparse import logging import os import boto3 from botocore.exceptions import ClientError logger = logging.getLogger(__name__) class GlacierWrapper: """Encapsulates Amazon S3 Glacier API operations.""" def __init__(self, glacier_resource): """ :param glacier_resource: A Boto3 Amazon S3 Glacier resource. """ self.glacier_resource = glacier_resource def create_vault(self, vault_name): """ Creates a vault. :param vault_name: The name to give the vault. :return: The newly created vault. """ try: vault = self.glacier_resource.create_vault(vaultName=vault_name) logger.info("Created vault %s.", vault_name) except ClientError: logger.exception("Couldn't create vault %s.", vault_name) raise else: return vault def list_vaults(self): """ Lists vaults for the current account. """ try: for vault in self.glacier_resource.vaults.all(): logger.info("Got vault %s.", vault.name) except ClientError: logger.exception("Couldn't list vaults.") raise @staticmethod def upload_archive(vault, archive_description, archive_file): """ Uploads an archive to a vault. :param vault: The vault where the archive is put. :param archive_description: A description of the archive. :param archive_file: The archive file to put in the vault. :return: The uploaded archive. """ try: archive = vault.upload_archive( archiveDescription=archive_description, body=archive_file ) logger.info( "Uploaded %s with ID %s to vault %s.", archive_description, archive.id, vault.name, ) except ClientError: logger.exception( "Couldn't upload %s to %s.", archive_description, vault.name ) raise else: return archive @staticmethod def initiate_archive_retrieval(archive): """ Initiates an archive retrieval job. Standard retrievals typically complete within 3—5 hours. When the job completes, you can get the archive contents by calling get_output(). :param archive: The archive to retrieve. :return: The archive retrieval job. """ try: job = archive.initiate_archive_retrieval() logger.info("Started %s job with ID %s.", job.action, job.id) except ClientError: logger.exception("Couldn't start job on archive %s.", archive.id) raise else: return job @staticmethod def list_jobs(vault, job_type): """ Lists jobs by type for the specified vault. :param vault: The vault to query. :param job_type: The type of job to list. :return: The list of jobs of the requested type. """ job_list = [] try: if job_type == "all": jobs = vault.jobs.all() elif job_type == "in_progress": jobs = vault.jobs_in_progress.all() elif job_type == "completed": jobs = vault.completed_jobs.all() elif job_type == "succeeded": jobs = vault.succeeded_jobs.all() elif job_type == "failed": jobs = vault.failed_jobs.all() else: jobs = [] logger.warning("%s isn't a type of job I can get.", job_type) for job in jobs: job_list.append(job) logger.info("Got %s %s job %s.", job_type, job.action, job.id) except ClientError: logger.exception("Couldn't get %s jobs from %s.", job_type, vault.name) raise else: return job_list def set_notifications(self, vault, sns_topic_arn): """ Sets an Amazon Simple Notification Service (Amazon SNS) topic as a target for notifications. Amazon S3 Glacier publishes messages to this topic for the configured list of events. :param vault: The vault to set up to publish notifications. :param sns_topic_arn: The Amazon Resource Name (ARN) of the topic that receives notifications. :return: Data about the new notification configuration. """ try: notification = self.glacier_resource.Notification("-", vault.name) notification.set( vaultNotificationConfig={ "SNSTopic": sns_topic_arn, "Events": [ "ArchiveRetrievalCompleted", "InventoryRetrievalCompleted", ], } ) logger.info( "Notifications will be sent to %s for events %s from %s.", notification.sns_topic, notification.events, notification.vault_name, ) except ClientError: logger.exception( "Couldn't set notifications to %s on %s.", sns_topic_arn, vault.name ) raise else: return notification

调用包装器类上的函数,以创建文件库并上传文件,然后将文件库配置为发布通知并启动检索档案的任务。

def upload_demo(glacier, vault_name, topic_arn): """ Shows how to: * Create a vault. * Configure the vault to publish notifications to an Amazon SNS topic. * Upload an archive. * Start a job to retrieve the archive. :param glacier: A Boto3 Amazon S3 Glacier resource. :param vault_name: The name of the vault to create. :param topic_arn: The ARN of an Amazon SNS topic that receives notification of Amazon S3 Glacier events. """ print(f"\nCreating vault {vault_name}.") vault = glacier.create_vault(vault_name) print("\nList of vaults in your account:") glacier.list_vaults() print(f"\nUploading glacier_basics.py to {vault.name}.") with open("glacier_basics.py", "rb") as upload_file: archive = glacier.upload_archive(vault, "glacier_basics.py", upload_file) print( "\nStarting an archive retrieval request to get the file back from the " "vault." ) glacier.initiate_archive_retrieval(archive) print("\nListing in progress jobs:") glacier.list_jobs(vault, "in_progress") print( "\nBecause Amazon S3 Glacier is intended for infrequent retrieval, an " "archive request with Standard retrieval typically completes within 3–5 " "hours." ) if topic_arn: notification = glacier.set_notifications(vault, topic_arn) print( f"\nVault {vault.name} is configured to notify the " f"{notification.sns_topic} topic when {notification.events} " f"events occur. You can subscribe to this topic to receive " f"a message when the archive retrieval completes.\n" ) else: print( f"\nVault {vault.name} is not configured to notify an Amazon SNS topic " f"when the archive retrieval completes so wait a few hours." ) print("\nRetrieve your job output by running this script with the --retrieve flag.")

有关 AWS SDK 开发人员指南和代码示例的完整列表,请参阅将 Amazon Glacier 与 AWS SDK 结合使用。本主题还包括有关入门的信息以及有关先前的 SDK 版本的详细信息。