AWS Glue Partition StorageDescriptor
The StorageDescriptor property type describes the physical storage of AWS Glue partition data.
StorageDescriptor is a property of the PartitionInput property type.
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{
  "StoredAsSubDirectories" : Boolean,
  "Parameters" : JSON object,
  "BucketColumns" : [ String, ... ],
  "SkewedInfo" : SkewedInfo,
  "InputFormat" : String,
  "NumberOfBuckets" : Integer,
  "OutputFormat" : String,
  "Columns" : [ Column, ... ],
  "SerdeInfo" : SerdeInfo,
  "SortColumns" : [ Order, ... ],
  "Compressed" : Boolean,
  "Location" : String
}
YAML
StoredAsSubDirectories: Boolean
Parameters: JSON object
BucketColumns:
  - String
SkewedInfo: SkewedInfo
InputFormat: String
NumberOfBuckets: Integer
OutputFormat: String
Columns:
  - Column
SerdeInfo: SerdeInfo
SortColumns:
  - Order
Compressed: Boolean
Location: String
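As an illustration of the syntax above, the following fragment is a minimal sketch of a StorageDescriptor for a partition of delimited text files. The bucket name, path, and column names are hypothetical placeholders, and the Hive class names are common choices for text data rather than values this page requires.

```yaml
# Sketch only: a StorageDescriptor nested under PartitionInput.
# Bucket, path, and column names are made up for illustration.
StorageDescriptor:
  Location: s3://amzn-s3-demo-bucket/sales/year=2024/
  InputFormat: org.apache.hadoop.mapred.TextInputFormat
  OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
  Compressed: false
  StoredAsSubDirectories: false
  Columns:
    - Name: order_id
      Type: string
    - Name: amount
      Type: double
  SerdeInfo:
    # SerDe that reads delimited text; the delimiter is passed as a parameter.
    SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
    Parameters:
      field.delim: ","
```

Every property here is optional, so a real template would include only the ones that apply to its data.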
Properties
StoredAsSubDirectories-
Indicates whether the partition data is stored in subdirectories.
Required: No
Type: Boolean
Update requires: No interruption
Parameters-
UTF-8 string–to–UTF-8 string key-value pairs that specify user-supplied properties.
Required: No
Type: JSON object
Update requires: No interruption
BucketColumns-
A list of UTF-8 strings that specify reducer grouping columns, clustering columns, and bucketing columns in the partition.
Required: No
Type: List of String values
Update requires: No interruption
SkewedInfo-
Information about values that appear very frequently in a column (skewed values).
Required: No
Type: SkewedInfo
Update requires: No interruption
InputFormat-
The input format: SequenceFileInputFormat (binary), TextInputFormat, or a custom format. It must match the single-line string pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Required: No
Type: String
Update requires: No interruption
NumberOfBuckets-
The number of buckets.
Required: Conditional. You must specify this property if the partition contains any dimension columns.
Type: Integer
Update requires: No interruption
OutputFormat-
The output format: SequenceFileOutputFormat (binary), IgnoreKeyTextOutputFormat, or a custom format. It must match the single-line string pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Required: No
Type: String
Update requires: No interruption
Columns-
The columns in the partition.
Required: No
Type: List of Column
Update requires: No interruption
SerdeInfo-
Information about a serialization/deserialization program (SerDe), which serves as an extractor and loader.
Required: No
Type: SerdeInfo
Update requires: No interruption
SortColumns-
The sort order of each bucket in the partition.
Required: No
Type: List of Order
Update requires: No interruption
Compressed-
Indicates whether the data in the partition is compressed.
Required: No
Type: Boolean
Update requires: No interruption
Location-
The physical location of the partition. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the partition name. It must match the URI address multi-line string pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\r\n\t]*
Required: No
Type: String
Update requires: No interruption
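The bucketing, sorting, and skew properties described above are typically set together. The fragment below is a hedged sketch of how they might be combined; the column names, bucket count, skewed value, and the classification parameter are illustrative assumptions, not values taken from this page.

```yaml
# Sketch only: bucketing-related properties of a StorageDescriptor.
# All column names and values are hypothetical.
StorageDescriptor:
  BucketColumns:
    - customer_id
  NumberOfBuckets: 8
  SortColumns:
    - Column: order_date
      SortOrder: 1          # in the Glue API, 1 = ascending and 0 = descending
  SkewedInfo:
    SkewedColumnNames:
      - region
    SkewedColumnValues:
      - us-east-1           # a value assumed to appear very frequently
  Parameters:
    classification: csv     # user-supplied key-value pair
```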
