

# Naming Amazon S3 buckets in your data layers
<a name="naming-structure-data-layers"></a>

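The following sections provide naming structures for Amazon Simple Storage Service (Amazon S3) buckets in your data lake layers. However, you can customize Amazon S3 bucket and path names according to your organization's requirements. We recommend that you create a separate bucket for each layer, because archiving, versioning, access, and encryption requirements can differ from layer to layer.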

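The following diagram shows the recommended naming structure for Amazon S3 buckets in the data lake layers. The naming structure separates multiple business units, file formats, and partitions.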



![The naming approach for an S3 bucket varies by its target data layer.](http://docs.aws.amazon.com/zh_cn/prescriptive-guidance/latest/defining-bucket-names-data-lakes/images/data-lake-naming-diag-1.png)


**Important**  
Amazon S3 bucket names must follow the naming guidelines in [Bucket naming rules](https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html) in the Amazon S3 documentation.

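As a minimal sketch, a candidate bucket name can be checked locally before you try to create the bucket. The function name `looks_like_valid_bucket_name` is hypothetical, and the checks cover only a subset of the rules (length, allowed characters, adjacent periods, and IP-address format). The linked Amazon S3 documentation remains the authoritative list.

```python
import re

# Partial check only: 3-63 characters, lowercase letters, digits, periods,
# and hyphens, beginning and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_ADDRESS = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes a partial set of S3 naming checks."""
    if not _BUCKET_NAME.fullmatch(name):
        return False  # wrong length, uppercase letters, or invalid characters
    if ".." in name:
        return False  # two adjacent periods are not allowed
    if _IP_ADDRESS.fullmatch(name):
        return False  # must not be formatted like an IP address
    return True
```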
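You can adapt data partitions to your organization's requirements. However, you should use lowercase key-value pairs (for example, `year=yyyy` rather than `yyyy`) so that you can update the catalog by using the `MSCK REPAIR TABLE` command.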

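The key-value convention can be illustrated with a short sketch. The helper names `partition_prefix` and `repair_statement` and the table name used in the test are assumptions for illustration. The second helper only builds the statement text and does not run it against any service.

```python
from datetime import date


def partition_prefix(day: date) -> str:
    """Build a lowercase key=value prefix such as year=2021/month=03/day=01."""
    return f"year={day:%Y}/month={day:%m}/day={day:%d}"


def repair_statement(table_name: str) -> str:
    """Return the MSCK REPAIR TABLE statement text for a (hypothetical) table."""
    return f"MSCK REPAIR TABLE {table_name};"
```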
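The right partitioning strategy depends on the nature of the data and, most importantly, on the nature of the user queries. We recommend that you analyze your consumption and data processing patterns to find the strategy that best fits your organization. Typically, it makes sense to provide higher hierarchy levels (for example, `year=yyyy`, `month=mm`, and `day=dd`) on the raw data layer and lower hierarchy levels on the consumption data layers (for example, the stage and analytics layers). This is because the raw data layer usually doesn't have consumption patterns as complex as those of the data processing pipelines.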

## Landing zone Amazon S3 bucket
<a name="landing-zone-naming-structure"></a>

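If a sensitive dataset contains elements that must be masked before the data is moved to the raw bucket, you need an Amazon S3 bucket in a landing zone.

The following table provides the naming structure, a description of the naming structure, and a name example for the Amazon S3 bucket in your landing zone layer.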


| Naming format | Example | 
| --- | --- | 
| `s3://companyname-landingzone-awsregion-awsaccount\|uniqid-env/source/source_region/table/year=yyyy/month=mm/day=dd/table_<yearmonthday>.avro\|csv` [See the AWS documentation website for more details](http://docs.aws.amazon.com/zh_cn/prescriptive-guidance/latest/defining-bucket-names-data-lakes/naming-structure-data-layers.html) | `s3://anycompany-landingzone-useast1-12345-dev/socialmedia/us/tb_products/year=2021/month=03/day=01/products_20210301.csv` | 

## Raw layer Amazon S3 bucket
<a name="raw-data-layer-naming-structure"></a>

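The raw data layer contains ingested data that hasn't been transformed and is still in its original file format (for example, JSON or CSV). This data is typically organized by data source and by the date on which it was ingested into the raw data layer's Amazon S3 bucket.

The following table provides the naming structure, a description of the naming structure, and a name example for the Amazon S3 bucket in your raw data layer.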


| Naming format | Example | 
| --- | --- | 
| `s3://companyname-raw-awsregion-awsaccount\|uniqid-env/source/source_region/table/year=yyyy/month=mm/day=dd/table_<yearmonthday>.avro\|csv` [See the AWS documentation website for more details](http://docs.aws.amazon.com/zh_cn/prescriptive-guidance/latest/defining-bucket-names-data-lakes/naming-structure-data-layers.html) | `s3://anycompany-raw-useast1-12345-dev/socialmedia/us/tb_products/year=2021/month=03/day=01/products_20210301.csv` | 

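The ingestion-side naming format (landing zone and raw) could be assembled in code roughly as follows. This is a sketch under assumptions: the class `IngestTarget`, its field names, and the `file_stem` parameter are hypothetical and not part of any AWS API. It only builds strings and makes no calls to Amazon S3.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IngestTarget:
    """Hypothetical holder for the name components in the table above."""
    company: str
    layer: str          # for example "landingzone" or "raw"
    aws_region: str     # Region written without hyphens, for example "useast1"
    account_or_id: str  # AWS account ID or another unique identifier
    env: str
    source: str
    source_region: str
    table: str

    def bucket(self) -> str:
        # companyname-layer-awsregion-awsaccount|uniqid-env
        return "-".join(
            [self.company, self.layer, self.aws_region, self.account_or_id, self.env]
        )

    def object_uri(self, day: date, file_stem: str, extension: str = "csv") -> str:
        # Lowercase key=value partitions, then a file name that ends in yyyymmdd.
        partitions = f"year={day:%Y}/month={day:%m}/day={day:%d}"
        file_name = f"{file_stem}_{day:%Y%m%d}.{extension}"
        return (
            f"s3://{self.bucket()}/{self.source}/{self.source_region}/"
            f"{self.table}/{partitions}/{file_name}"
        )
```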
## Stage layer Amazon S3 bucket
<a name="stage-data-layer-naming-structure"></a>

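Data in the stage layer is read from the raw layer and transformed (for example, by using AWS Glue or Amazon EMR jobs). This process validates the data (for example, by checking data types and headers) and then stores it in a consumption-friendly file format such as Apache Parquet. The metadata is stored in a table in the [AWS Glue Data Catalog](https://docs.aws.amazon.com/glue/latest/dg/components-overview.html).

The following table provides the naming structure, a description of the naming structure, and a name example for the Amazon S3 bucket in your stage data layer.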


| Naming format | Example | 
| --- | --- | 
| `s3://companyname-stage-awsregion-awsaccount\|uniqid-env/source/source_region/business_unit/table/<partitions>/table_<table_name>_<yearmonthday>.snappy.parquet` [See the AWS documentation website for more details](http://docs.aws.amazon.com/zh_cn/prescriptive-guidance/latest/defining-bucket-names-data-lakes/naming-structure-data-layers.html) | `s3://anycompany-stage-saeast1-12345-dev/sap/br/customers/validated/dt=2021-03-01/table_customers_20210301.snappy.parquet` | 

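An object key in the stage layer might be composed as in the following sketch. The function `stage_object_key` and its parameters are hypothetical. The default `dt=yyyy-mm-dd` partition and the Snappy-compressed Parquet file name mirror the example in the table above, and other partition schemes can be passed in as a dictionary.

```python
from datetime import date
from typing import Dict, Optional


def stage_object_key(
    source: str,
    source_region: str,
    business_unit: str,
    table: str,
    table_name: str,
    day: date,
    partitions: Optional[Dict[str, str]] = None,
) -> str:
    """Build the key portion (everything after the bucket) for a stage-layer file."""
    # Default to a single dt=yyyy-mm-dd partition, as in the table's example.
    parts = partitions if partitions is not None else {"dt": day.isoformat()}
    # Keep partition keys lowercase so they follow the key=value convention.
    partition_path = "/".join(f"{key.lower()}={value}" for key, value in parts.items())
    file_name = f"table_{table_name}_{day:%Y%m%d}.snappy.parquet"
    return "/".join(
        [source, source_region, business_unit, table, partition_path, file_name]
    )
```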
## Analytics layer Amazon S3 bucket
<a name="analytics-data-layer-naming-structure"></a>

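The analytics layer is similar to the stage layer in that the data is in a processed file format, but the data is then aggregated according to your organization's requirements.

The following table provides the naming structure, a description of the naming structure, and a name example for the Amazon S3 bucket in your analytics data layer.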


| Naming format | Example | 
| --- | --- | 
| `s3://companyname-analytics-awsregion-awsaccount\|uniqid-env/source_region/business_unit/tb_<region>_<table_name>_<file_format>/<partition_0>/<partition_1>/.../<partition_n>/xxxxx.<compression>.<file_format>` [See the AWS documentation website for more details](http://docs.aws.amazon.com/zh_cn/prescriptive-guidance/latest/defining-bucket-names-data-lakes/naming-structure-data-layers.html) | `s3://anycompany-analytics-useast1-12345-dev/us/sales/tb_us_customers_parquet/<partitions>/part-000001-20218c886790.c000.snappy.parquet` | 
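
Two small helpers can illustrate the analytics-layer convention: one builds the `tb_<region>_<table_name>_<file_format>` folder name, and the other reads the key-value partitions back out of an object key. Both function names are hypothetical, and the partition values in the test are invented for illustration, because the example above leaves `<partitions>` unspecified.

```python
from typing import Dict


def analytics_table_folder(region: str, table_name: str, file_format: str) -> str:
    """Build the table folder name, for example tb_us_customers_parquet."""
    return f"tb_{region}_{table_name}_{file_format}"


def parse_partitions(object_key: str) -> Dict[str, str]:
    """Collect key=value path segments from an object key, in path order."""
    # Drop the last segment (the file name) and keep only key=value segments.
    segments = object_key.split("/")[:-1]
    return dict(segment.split("=", 1) for segment in segments if "=" in segment)
```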