Class ParquetOutputFormat.Builder
- All Implemented Interfaces:
software.amazon.jsii.Builder<ParquetOutputFormat>
- Enclosing class:
ParquetOutputFormat
A fluent builder for ParquetOutputFormat.
Method Summary
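| Modifier and Type | Method | Description |
| --- | --- | --- |
| ParquetOutputFormat.Builder | blockSize(Size blockSize) | The Hadoop Distributed File System (HDFS) block size. |
| ParquetOutputFormat | build() | |
| ParquetOutputFormat.Builder | compression(ParquetCompression compression) | The compression code to use over data blocks. |
| static ParquetOutputFormat.Builder | create() | |
| ParquetOutputFormat.Builder | enableDictionaryCompression(Boolean enableDictionaryCompression) | Indicates whether to enable dictionary compression. |
| ParquetOutputFormat.Builder | maxPadding(Size maxPadding) | The maximum amount of padding to apply. |
| ParquetOutputFormat.Builder | pageSize(Size pageSize) | The Parquet page size. |
| ParquetOutputFormat.Builder | writerVersion(ParquetWriterVersion writerVersion) | Indicates the version of Parquet to output. |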
-
Method Details
-
create
- Returns:
- a new instance of
ParquetOutputFormat.Builder.
-
blockSize
The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. Firehose uses this value for padding calculations.
Default: `Size.mebibytes(256)`
- Parameters:
blockSize - The Hadoop Distributed File System (HDFS) block size. This parameter is required.
- Returns:
this
-
compression
The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
Default: `SNAPPY`
- Parameters:
compression - The compression code to use over data blocks. This parameter is required.
- Returns:
this
-
enableDictionaryCompression
@Stability(Stable) public ParquetOutputFormat.Builder enableDictionaryCompression(Boolean enableDictionaryCompression)
Indicates whether to enable dictionary compression.
Default: `false`
- Parameters:
enableDictionaryCompression - Indicates whether to enable dictionary compression. This parameter is required.
- Returns:
this
-
maxPadding
The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying.
Default: no padding is applied
- Parameters:
maxPadding - The maximum amount of padding to apply. This parameter is required.
- Returns:
this
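To make the relationship between blockSize and maxPadding more concrete, the following is a minimal sketch of block-alignment padding as Parquet writers commonly apply it: when the gap before the next block boundary is no larger than the maximum padding, the writer fills the gap so the next row group starts on a boundary. This rule is an assumption for illustration and not Firehose's documented algorithm; `PaddingSketch` and its method names are hypothetical.

```java
// Hypothetical illustration; not part of the AWS CDK or Firehose APIs.
final class PaddingSketch {
    private PaddingSketch() {}

    /** Bytes between the write position and the next block boundary (0 if already aligned). */
    static long bytesToNextBoundary(long position, long blockSizeBytes) {
        long offsetInBlock = position % blockSizeBytes;
        return offsetInBlock == 0 ? 0 : blockSizeBytes - offsetInBlock;
    }

    /** Padding to write: the whole gap if it fits within maxPaddingBytes, otherwise none (assumed rule). */
    static long paddingToApply(long position, long blockSizeBytes, long maxPaddingBytes) {
        long gap = bytesToNextBoundary(position, blockSizeBytes);
        return gap <= maxPaddingBytes ? gap : 0;
    }

    public static void main(String[] args) {
        long blockSize = 256L * 1024 * 1024;  // matches the documented blockSize default
        long maxPadding = 8L * 1024 * 1024;   // example value, chosen for illustration
        long position = blockSize - 4096;     // 4 KiB short of a block boundary
        System.out.println(paddingToApply(position, blockSize, maxPadding));
    }
}
```

Under this assumed rule, a maxPadding of zero never pads, which is consistent with the documented default of no padding.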
-
pageSize
The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.
Default: `Size.mebibytes(1)`
- Parameters:
pageSize - The Parquet page size. This parameter is required.
- Returns:
this
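The page size limits stated above (64 KiB minimum, 1 MiB default) can be checked before a value is passed to the builder. The following is a small sketch of such a check in plain Java; `PageSizeSketch` and `resolvePageSizeBytes` are hypothetical names, and whether the library performs an equivalent validation itself is not stated on this page.

```java
// Hypothetical helper; the limits come from the pageSize description above.
final class PageSizeSketch {
    static final long MIN_PAGE_SIZE_BYTES = 64L * 1024;        // 64 KiB minimum
    static final long DEFAULT_PAGE_SIZE_BYTES = 1024L * 1024;  // 1 MiB default

    private PageSizeSketch() {}

    /** Returns the requested size, or the default when none is given; rejects values below the minimum. */
    static long resolvePageSizeBytes(Long requestedBytes) {
        if (requestedBytes == null) {
            return DEFAULT_PAGE_SIZE_BYTES;
        }
        if (requestedBytes < MIN_PAGE_SIZE_BYTES) {
            throw new IllegalArgumentException(
                "pageSize must be at least " + MIN_PAGE_SIZE_BYTES + " bytes, got " + requestedBytes);
        }
        return requestedBytes;
    }

    public static void main(String[] args) {
        System.out.println(resolvePageSizeBytes(null));         // falls back to the default
        System.out.println(resolvePageSizeBytes(512L * 1024));  // accepted as given
    }
}
```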
-
writerVersion
@Stability(Stable) public ParquetOutputFormat.Builder writerVersion(ParquetWriterVersion writerVersion)
Indicates the version of Parquet to output. The possible values are V1 and V2.
Default: `V1`
- Parameters:
writerVersion - Indicates the version of Parquet to output. This parameter is required.
- Returns:
this
-
build
- Specified by:
build in interface software.amazon.jsii.Builder<ParquetOutputFormat>
- Returns:
- a newly built instance of
ParquetOutputFormat.
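As a usage sketch, the builder methods documented on this page can be chained as shown below. The import paths and the constant names `ParquetCompression.GZIP` and `ParquetWriterVersion.V2` are assumed from the parameter types and possible values listed here, `Size.kibibytes` is assumed alongside the documented `Size.mebibytes`, and the sizes are arbitrary example values. The wrapper class `ParquetOutputFormatExample` is hypothetical, and compiling it requires the AWS CDK library on the classpath.

```java
// Assumed import paths; verify against the AWS CDK version in use.
import software.amazon.awscdk.Size;
import software.amazon.awscdk.services.kinesisfirehose.ParquetCompression;
import software.amazon.awscdk.services.kinesisfirehose.ParquetOutputFormat;
import software.amazon.awscdk.services.kinesisfirehose.ParquetWriterVersion;

// Hypothetical wrapper class, for illustration only.
final class ParquetOutputFormatExample {
    private ParquetOutputFormatExample() {}

    static ParquetOutputFormat gzipV2Format() {
        return ParquetOutputFormat.Builder.create()
            .blockSize(Size.mebibytes(256))          // documented default, set explicitly
            .compression(ParquetCompression.GZIP)    // favors compression ratio over speed
            .enableDictionaryCompression(true)       // default is false
            .maxPadding(Size.mebibytes(8))           // example value; default is no padding
            .pageSize(Size.kibibytes(512))           // above the 64 KiB minimum
            .writerVersion(ParquetWriterVersion.V2)  // default is V1
            .build();
    }
}
```

Each setter returns the builder itself, so any subset of these calls can be omitted to keep the corresponding default.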
-