Amazon Redshift will no longer support the creation of new Python UDFs starting November 1, 2025.
If you would like to use Python UDFs, create the UDFs prior to that date.
Existing Python UDFs will continue to function as normal. For more information, see the
blog post
Nested data limitations (preview)
This topic describes limitations for reading nested data with Redshift Spectrum. Nested data is data that contains nested fields. Nested fields are fields that are joined together as a single entity, such as arrays, structs, or objects.
Note
The limitations marked (preview) in the following list only apply to preview clusters created in the following Regions.
US East (Ohio) (us-east-2)
US East (N. Virginia) (us-east-1)
US West (N. California) (us-west-1)
Asia Pacific (Tokyo) (ap-northeast-1)
Europe (Ireland) (eu-west-1)
Europe (Stockholm) (eu-north-1)
For information about setting up Preview clusters, see Creating a preview cluster in the Amazon Redshift Management Guide.
The following limitations apply to nested data:
-
An
arrayormaptype can contain otherarrayormaptypes as long as queries on the nestedarraysormapsdon't returnscalarvalues. (preview) -
Amazon Redshift Spectrum supports complex data types only as external tables.
-
Subquery result columns must be top-level. (preview)
-
If an
OUTER JOINexpression refers to a nested table, it can refer only to that table and its nested arrays (and maps). If anOUTER JOINexpression doesn't refer to a nested table, it can refer to any number of non-nested tables. -
If a
FROMclause in a subquery refers to a nested table, it can't refer to any other table. -
If a subquery depends on a nested table that refers to a parent table, the subquery can only use the parent table in the
FROMclause. You can't use the parent in any other clauses, such as aSELECTorWHEREclause. For example, the following query doesn't run because the subquery'sSELECTclause refers to the parent tablec.SELECT c.name.given FROM spectrum.customers c WHERE (SELECT COUNT(c.id) FROM c.phones p WHERE p LIKE '858%') > 1;The following query works because the parent
cis used only in theFROMclause of the subquery.SELECT c.name.given FROM spectrum.customers c WHERE (SELECT COUNT(*) FROM c.phones p WHERE p LIKE '858%') > 1; -
A subquery that accesses nested data anywhere other than the
FROMclause must return a single value. The only exceptions are(NOT) EXISTSoperators in aWHEREclause. -
(NOT) INis not supported. -
The maximum nesting depth for all nested types is 100. This restriction applies to all file formats (Parquet, ORC, Ion, and JSON).
-
Aggregation subqueries that access nested data can only refer to
arraysandmapsin theirFROMclause, not to an external table. -
Querying the pseudocolumns of nested data in a Redshift Spectrum table is not supported. For more information, see Pseudocolumns.
-
When extracting data from array or map columns by specifying them in a
FROMclause, you can only select values from those columns if the values arescalar. For example, the following queries both try toSELECTelements from inside an array. The query that selectsarr.aworks becausearr.ais ascalarvalue. The second query doesn't work becausearrayis an array extracted froms3.nested tablein theFROMclause. (preview)SELECT array_column FROM s3.nested_table;array_column ----------------- [{"a":1},{"b":2}]SELECT arr.a FROM s3.nested_table t, t.array_column arr;arr.a ----- 1--This query fails to run. SELECT array FROM s3.nested_table tab, tab.array_column array;You can’t use an array or map in the
FROMclause that itself comes from another array or map. To select arrays or other complex structures that are nested inside other arrays, consider using indexes in theSELECTstatement.