class DataFormat
| Language | Type name |
|---|---|
| .NET | Amazon.CDK.AWS.Glue.DataFormat |
| Java | software.amazon.awscdk.services.glue.DataFormat |
| Python | aws_cdk.aws_glue.DataFormat |
| TypeScript (source) | @aws-cdk/aws-glue » DataFormat |
Defines the input and output formats and the serializer/deserializer (SerDe) for a single DataFormat.
Example
import * as glue from '@aws-cdk/aws-glue';

// An existing Glue database that the table will belong to.
declare const myDatabase: glue.Database;

new glue.Table(this, 'MyTable', {
  database: myDatabase,
  tableName: 'my_table',
  columns: [{
    name: 'col1',
    type: glue.Schema.STRING,
  }],
  partitionKeys: [{
    name: 'year',
    type: glue.Schema.SMALL_INT,
  }, {
    name: 'month',
    type: glue.Schema.SMALL_INT,
  }],
  // Use the predefined JSON data format.
  dataFormat: glue.DataFormat.JSON,
});
Initializer
new DataFormat(props: DataFormatProps)
Parameters
- props DataFormatProps
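As a rough illustration of what the props carry, the sketch below models them with local stand-in types so that it runs without the CDK installed. `ClassName` and `DataFormatPropsSketch` are hypothetical names introduced only for this example and are not part of `@aws-cdk/aws-glue`. The property names follow the Properties section of this page, and the class names are the Hive classes commonly used for JSON text data.

```typescript
// Hypothetical stand-in for the CDK wrapper types: it only wraps a
// fully qualified class name. Not the library's own definition.
class ClassName {
  readonly className: string;
  constructor(className: string) {
    this.className = className;
  }
}

// Assumed shape of the props: three required parts and one optional part,
// named after the properties documented on this page.
interface DataFormatPropsSketch {
  readonly inputFormat: ClassName;
  readonly outputFormat: ClassName;
  readonly serializationLibrary: ClassName;
  readonly classificationString?: string;
}

// A JSON-like format assembled from Hive class names.
const jsonLike: DataFormatPropsSketch = {
  inputFormat: new ClassName('org.apache.hadoop.mapred.TextInputFormat'),
  outputFormat: new ClassName('org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'),
  serializationLibrary: new ClassName('org.openx.data.jsonserde.JsonSerDe'),
  classificationString: 'json',
};

console.log(jsonLike.serializationLibrary.className);
```

In real code the equivalent values would come from the library's own types rather than from these stand-ins.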
Properties
| Name | Type | Description |
|---|---|---|
| inputFormat | InputFormat | InputFormat for this data format. |
| outputFormat | OutputFormat | OutputFormat for this data format. |
| serializationLibrary | SerializationLibrary | Serialization library for this data format. |
| classificationString? | ClassificationString | Classification string given to tables with this data format. |
| static APACHE_LOGS | DataFormat | DataFormat for Apache Web Server Logs. |
| static AVRO | DataFormat | DataFormat for Apache Avro. |
| static CLOUDTRAIL_LOGS | DataFormat | DataFormat for CloudTrail logs stored on S3. |
| static CSV | DataFormat | DataFormat for CSV Files. |
| static JSON | DataFormat | Stored as plain text files in JSON format. |
| static LOGSTASH | DataFormat | DataFormat for Logstash Logs, using the GROK SerDe. |
| static ORC | DataFormat | DataFormat for Apache ORC (Optimized Row Columnar). |
| static PARQUET | DataFormat | DataFormat for Apache Parquet. |
| static TSV | DataFormat | DataFormat for TSV (Tab-Separated Values). |
inputFormat
Type: InputFormat
InputFormat for this data format.
outputFormat
Type: OutputFormat
OutputFormat for this data format.
serializationLibrary
Type: SerializationLibrary
Serialization library for this data format.
classificationString?
Type: ClassificationString (optional)
Classification string given to tables with this data format.
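Because `classificationString` is optional, code that reads a data format should be prepared for it to be absent. The following is a minimal sketch of that handling, assuming a simplified view in which plain strings stand in for the CDK's wrapper types. `FormatView` and `describeFormat` are hypothetical names used only for this illustration.

```typescript
// Hypothetical, simplified view of a data format. Plain strings stand in
// for the CDK's wrapper types so the sketch runs without the library.
interface FormatView {
  readonly inputFormat: string;
  readonly outputFormat: string;
  readonly serializationLibrary: string;
  readonly classificationString?: string; // may be absent
}

// Collects the properties into key/value pairs and skips the
// classification when the format does not define one.
function describeFormat(format: FormatView): Record<string, string> {
  const described: Record<string, string> = {
    inputFormat: format.inputFormat,
    outputFormat: format.outputFormat,
    serializationLibrary: format.serializationLibrary,
  };
  if (format.classificationString !== undefined) {
    described['classification'] = format.classificationString;
  }
  return described;
}

const withoutClassification = describeFormat({
  inputFormat: 'org.apache.hadoop.mapred.TextInputFormat',
  outputFormat: 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat',
  serializationLibrary: 'org.openx.data.jsonserde.JsonSerDe',
});

console.log('classification' in withoutClassification); // false
```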
static APACHE_LOGS
Type: DataFormat
DataFormat for Apache Web Server Logs.
Also works for CloudFront logs.
See also: https://docs.aws.amazon.com/athena/latest/ug/apache.html
static AVRO
Type: DataFormat
DataFormat for Apache Avro.
See also: https://docs.aws.amazon.com/athena/latest/ug/avro.html
static CLOUDTRAIL_LOGS
Type: DataFormat
DataFormat for CloudTrail logs stored on S3.
See also: https://docs.aws.amazon.com/athena/latest/ug/cloudtrail.html
static CSV
Type: DataFormat
DataFormat for CSV Files.
See also: https://docs.aws.amazon.com/athena/latest/ug/csv.html
static JSON
Type: DataFormat
Stored as plain text files in JSON format.
Uses the OpenX JSON SerDe for serialization and deserialization.
See also: https://docs.aws.amazon.com/athena/latest/ug/json.html
static LOGSTASH
Type: DataFormat
DataFormat for Logstash Logs, using the GROK SerDe.
See also: https://docs.aws.amazon.com/athena/latest/ug/grok.html
static ORC
Type: DataFormat
DataFormat for Apache ORC (Optimized Row Columnar).
See also: https://docs.aws.amazon.com/athena/latest/ug/orc.html
static PARQUET
Type: DataFormat
DataFormat for Apache Parquet.
See also: https://docs.aws.amazon.com/athena/latest/ug/parquet.html
static TSV
Type: DataFormat
DataFormat for TSV (Tab-Separated Values).
See also: https://docs.aws.amazon.com/athena/latest/ug/lazy-simple-serde.html
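When the stored files carry a conventional extension, one way to choose among the predefined formats is a small lookup from extension to the name of the static member. The helper below is a hypothetical sketch and not part of the CDK. It returns only the member name as a string, and it leaves out the log formats (APACHE_LOGS, CLOUDTRAIL_LOGS, LOGSTASH), which have no characteristic extension.

```typescript
// Hypothetical mapping from file extension to the name of a static
// DataFormat member listed above. The extensions are assumed conventions.
const FORMAT_NAME_BY_EXTENSION = new Map<string, string>([
  ['avro', 'AVRO'],
  ['csv', 'CSV'],
  ['json', 'JSON'],
  ['orc', 'ORC'],
  ['parquet', 'PARQUET'],
  ['tsv', 'TSV'],
]);

// Returns the static member name for a path, or undefined when the
// path has no extension or the extension is not in the table.
function staticFormatNameFor(path: string): string | undefined {
  const dot = path.lastIndexOf('.');
  if (dot < 0) {
    return undefined;
  }
  const extension = path.slice(dot + 1).toLowerCase();
  return FORMAT_NAME_BY_EXTENSION.get(extension);
}

console.log(staticFormatNameFor('data/events.json')); // JSON
console.log(staticFormatNameFor('data/notes.txt'));   // undefined
```

A caller would still need to map the returned name to the corresponding static member and fall back to an explicit choice when the result is undefined.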
