Amazon Redshift will no longer support the creation of new Python UDFs starting Patch 198.
Existing Python UDFs will continue to function until June 30, 2026. For more information, see the
blog post
SVL_S3QUERY_SUMMARY
Use the SVL_S3QUERY_SUMMARY view to get a summary of all data lake queries that have been run on the system. SVL_S3QUERY_SUMMARY aggregates detail from SVL_S3QUERY at the segment level.
SVL_S3QUERY_SUMMARY is visible to all users. Superusers can see all rows; regular users can see only their own data. For more information, see Visibility of data in system tables and views.
Some or all of the data in this table can also be found in the SYS monitoring view SYS_EXTERNAL_QUERY_DETAIL. The data in the SYS monitoring view is formatted to be easier to use and understand. We recommend that you use the SYS monitoring view for your queries.
For SVCS_S3QUERY_SUMMARY, see SVCS_S3QUERY_SUMMARY.
Table columns
| Column name | Data type | Description |
|---|---|---|
| userid | integer | The ID of the user that generated the given entry. |
| query | integer | The query ID. You can use this value to join various other system tables and views. |
| xid | bigint | The transaction ID. |
| pid | integer | The process ID. |
| segment | integer | The segment number. A query consists of multiple segments, and each segment consists of one or more steps. |
| step | integer | The query step that ran. |
| starttime | timestamp | Time in UTC that the query started executing. |
| endtime | timestamp | Time in UTC that the query completed. |
| elapsed | integer | The length of time that it took the query to run (in microseconds). |
| aborted | integer | If a query was stopped by the system or canceled
by the user, this column contains 1. If the
query ran to completion, this column contains
0. |
| external_table_name | char(136) | The internal format of name of the external name of the table for the external table scan. |
| file_format | character(16) | The file format of the external table data. |
| is_partitioned | char(1) | If true (t), this column
value indicates that the external table is partitioned. |
| is_rrscan | char(1) | If true (t), this column
value indicates that a range-restricted scan was applied. |
| is_nested | char(1) | If true (t), this column
value indicates that the nested column data type is accessed. |
| s3_scanned_rows | bigint | The number of rows scanned from Amazon S3 and sent to the Redshift Spectrum layer. On RG provisioned clusters, this is the total number of rows scanned directly from Amazon S3 by the cluster's native reader across all slices, before filter pushdown. |
| s3_scanned_bytes | bigint | The number of bytes scanned from Amazon S3 and sent to the Redshift Spectrum layer, based on compressed data. On RG provisioned clusters, this is the total number of bytes scanned directly from Amazon S3 by the cluster's native reader across all slices, before filter pushdown. |
| s3query_returned_rows | bigint | The number of rows returned from the Redshift Spectrum layer to the cluster. On RG provisioned clusters, this is the total number of rows returned by the cluster's native reader across all slices, after filter pushdown. |
| s3query_returned_bytes | bigint | The number of bytes returned from the Redshift Spectrum layer to the cluster. A large amount of data returned to Amazon Redshift might affect system performance. On RG provisioned clusters, this is the total number of bytes returned by the cluster's native reader across all slices, after filter pushdown. |
| files | integer | The number of files that were processed for this data lake query. A small number of files limits the benefits of parallel processing. |
| files_max | integer | The maximum number of files processed on one slice. |
| files_avg | integer | The average number of files processed on one slice. |
| splits | int | The number of splits processed for this segment. The number of splits processed on this slice. With large splitable data files, for example, data files larger than about 512 MB, Redshift Spectrum tries to split the files into multiple S3 requests for parallel processing. |
| splits_max | int | The maximum number of splits processed on this slice. |
| splits_avg | int | The average number of splits processed on this slice. |
| total_split_size | bigint | The total size of all splits processed. |
| max_split_size | bigint | The maximum split size processed, in bytes. |
| avg_split_size | bigint | The average split size processed, in bytes. |
| total_retries | integer | The total number of retries for one individual
processed file. On RG node type clusters, this column is deprecated and contains -1. For Amazon S3 client retries on RG, see STL_S3CLIENT. |
| max_retries | integer | The maximum number of retries for any of
processed files. On RG node type clusters, this column is deprecated and contains -1. For Amazon S3 client retries on RG, see STL_S3CLIENT. |
| max_request_duration | integer | The maximum duration of an individual file request (in microseconds). Long running queries might indicate a bottleneck. On RG node type clusters, this column is deprecated and contains -1. |
| avg_request_duration | double precision | The average duration of the file requests (in microseconds). On RG node type clusters, this column is deprecated and contains -1. |
| max_request_parallelism | integer | The maximum number of parallel requests at one slice for this Redshift Spectrum query. On RG node type clusters, this column is deprecated and contains -1. |
| avg_request_parallelism | double precision | The average number of parallel requests at one slice for this Redshift Spectrum query. On RG node type clusters, this column is deprecated and contains -1. |
| total_slowdown_count | bigint | The total number of Amazon S3 requests with a slow down error that occurred during the external table scan. On RG node type clusters, this column is deprecated and contains -1. |
| max_slowdown_count | integer | The maximum number of Amazon S3 requests with a slow down error that occurred during the external table scan on one slice. On RG node type clusters, this column is deprecated and contains -1. |
Sample query
The following example gets the scan step details for the last query completed.
select query, segment, elapsed, s3_scanned_rows, s3_scanned_bytes, s3query_returned_rows, s3query_returned_bytes, files from svl_s3query_summary where query = pg_last_query_id() order by query,segment;
query | segment | elapsed | s3_scanned_rows | s3_scanned_bytes | s3query_returned_rows | s3query_returned_bytes | files ------+---------+---------+-----------------+------------------+-----------------------+------------------------+------ 4587 | 2 | 67811 | 0 | 0 | 0 | 0 | 0 4587 | 2 | 591568 | 172462 | 11260097 | 8513 | 170260 | 1 4587 | 2 | 216849 | 0 | 0 | 0 | 0 | 0 4587 | 2 | 216671 | 0 | 0 | 0 | 0 | 0