Emulate Oracle PL/SQL associative arrays in Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL - AWS Prescriptive Guidance

Emulate Oracle PL/SQL associative arrays in Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL

Rajkumar Raghuwanshi, Bhanu Ganesh Gudivada, and Sachin Khanna, Amazon Web Services

Summary

This pattern describes how to emulate Oracle PL/SQL associative arrays with empty index positions in Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL environments. It also describes some of the differences between Oracle PL/SQL associative arrays and PostgreSQL arrays with regard to how each handles empty index positions during migrations.

We provide a PostgreSQL alternative to using aws_oracle_ext functions for handling empty index positions when migrating an Oracle database. This pattern uses an additional column to store index positions, and it maintains Oracle's handling of sparse arrays while incorporating native PostgreSQL capabilities.

Oracle

In Oracle, collections can be initialized as empty and populated using the EXTEND collection method, which appends NULL elements to the array. When working with PL/SQL associative arrays indexed by PLS_INTEGER, the EXTEND method adds NULL elements sequentially, but elements can also be initialized at nonsequential index positions. Any index position that isn't explicitly initialized remains empty.

This flexibility allows for sparse array structures where elements can be populated at arbitrary positions. When iterating through collections using a FOR LOOP with FIRST and LAST bounds, only the initialized elements (whether NULL or with a defined value) are processed, while empty positions are skipped.

PostgreSQL (Amazon Aurora and Amazon RDS)

PostgreSQL handles empty values differently from NULL values. It stores empty values as distinct entities that use one byte of storage. When an array has empty values, PostgreSQL assigns sequential index positions just like non-empty values. But sequential indexing requires additional processing because the system must iterate through all indexed positions, including the empty ones. This makes traditional array creation inefficient for sparse datasets.

AWS Schema Conversion Tool

The AWS Schema Conversion Tool (AWS SCT) typically handles Oracle-to-PostgreSQL migrations using aws_oracle_ext functions. In this pattern, we propose an alternative approach that uses native PostgreSQL capabilities, which combines PostgreSQL array types with an additional column for storing index positions. The system can then iterate through arrays using just the index column.

Prerequisites and limitations

Prerequisites

  • An active AWS account.

  • An AWS Identity and Access Management (IAM) user with administrator permissions.

  • An instance that’s compatible with Amazon RDS or Aurora PostgreSQL.

  • A basic understanding of relational databases.

Limitations

Product versions

This pattern was tested with the following versions:

  • Amazon Aurora PostgreSQL 13.3

  • Amazon RDS for PostgreSQL 13.3

  • AWS SCT 1.0.674

  • Oracle 12c EE 12.2

Architecture

Source technology stack

  • On-premises Oracle database

Target technology stack

  • Amazon Aurora PostgreSQL

  • Amazon RDS for PostgreSQL

Target architecture

The diagram shows the following:

  • A source Amazon RDS for Oracle database instance

  • An Amazon EC2 instance with AWS SCT for converting Oracle functions to the PostgreSQL equivalent

  • A target database that’s compatible with Amazon Aurora PostgreSQL

Tools

AWS services

Other tools

  • Oracle SQL Developer is an integrated development environment that simplifies the development and management of Oracle databases in both traditional and cloud-based deployments.

  • pgAdmin is an open source management tool for PostgreSQL. It provides a graphical interface that helps you create, maintain, and use database objects. In this pattern, pgAdmin connects to the RDS for PostgreSQL database instance and queries the data. Alternatively, you can use the psql command line client.

Best practices

  • Test data set boundaries and edge scenarios.

  • Consider implementing error handling for out-of-bounds index conditions.

  • Optimize queries to avoid scanning sparse data sets.

Epics

TaskDescriptionSkills required

Create a source PL/SQL block in Oracle.

Create a source PL/SQL block in Oracle that uses the following associative array:

DECLARE TYPE country_codes IS TABLE OF VARCHAR2(100) INDEX BY pls_integer; cc country_codes; cc_idx NUMBER := NULL; BEGIN cc(7) := 'India'; cc(3) := 'UK'; cc(5) := 'USA'; cc(0) := 'China'; cc(-2) := 'Invalid'; dbms_output.put_line('cc_length:' || cc.COUNT); IF (cc.COUNT > 0) THEN cc_idx := cc.FIRST; FOR i IN 1..cc.COUNT LOOP dbms_output.put_line('cc_idx:' || cc_idx || ' country:' || cc(cc_idx)); cc_idx := cc.next(cc_idx); END LOOP; END IF; END;
DBA

Run the PL/SQL block.

Run the source PL/SQL block in Oracle. If there are gaps between the index values of an associative array, no data is stored in those gaps. This allows the Oracle loop to iterate only through the index positions.

DBA

Review the output.

Five elements were inserted into the array (cc) at nonconsecutive intervals. The array count is shown in the following output:

cc_length:5 cc_idx:-2 country:Invalid cc_idx:0 country:China cc_idx:3 country:UK cc_idx:5 country:USA cc_idx:7 country:India
DBA
TaskDescriptionSkills required

Create a target PL/pgSQL block in PostgreSQL.

Create a target PL/pgSQL block in PostgreSQL that uses the following associative array:

DO $$ DECLARE cc character varying(100)[]; cc_idx integer := NULL; BEGIN cc[7] := 'India'; cc[3] := 'UK'; cc[5] := 'USA'; cc[0] := 'China'; cc[-2] := 'Invalid'; RAISE NOTICE 'cc_length: %', ARRAY_LENGTH(cc, 1); IF (ARRAY_LENGTH(cc, 1) > 0) THEN FOR i IN ARRAY_LOWER(cc, 1)..ARRAY_UPPER(cc, 1) LOOP RAISE NOTICE 'cc_idx:% country:%', i, cc[i]; END LOOP; END IF; END; $$;
DBA

Run the PL/pgSQL block.

Run the target PL/pgSQL block in PostgreSQL. If there are gaps between the index values of an associative array, no data is stored in those gaps. This allows the Oracle loop to iterate only through the index positions.

DBA

Review the output.

The array length is greater than 5 because NULL is stored in the gaps between index positions. As shown in the following output, the loop completes 10 iterations to retrieve 5 values in the array.

cc_length:10 cc_idx:-2 country:Invalid cc_idx:-1 country:<NULL> cc_idx:0 country:China cc_idx:1 country:<NULL> cc_idx:2 country:<NULL> cc_idx:3 country:UK cc_idx:4 country:<NULL> cc_idx:5 country:USA cc_idx:6 country:<NULL> cc_idx:7 country:India
DBA
TaskDescriptionSkills required

Create a target PL/pgSQL block with an array and user-defined type.

To optimize performance and match Oracle's functionality, we can create a user-defined type that stores both index positions and their corresponding data. This approach reduces unnecessary iterations by maintaining direct associations between indices and values.

DO $$ DECLARE cc country_codes[]; cc_append country_codes := NULL; i record; BEGIN cc_append.idx = 7; cc_append.val = 'India'; cc := array_append(cc, cc_append); cc_append.idx = 3; cc_append.val = 'UK'; cc := array_append(cc, cc_append); cc_append.idx = 5; cc_append.val = 'USA'; cc := array_append(cc, cc_append); cc_append.idx = 0; cc_append.val = 'China'; cc := array_append(cc, cc_append); cc_append.idx = - 2; cc_append.val = 'Invalid'; cc := array_append(cc, cc_append); RAISE NOTICE 'cc_length: %', ARRAY_LENGTH(cc, 1); IF (ARRAY_LENGTH(cc, 1) > 0) THEN FOR i IN ( SELECT * FROM unnest(cc) ORDER BY idx) LOOP RAISE NOTICE 'cc_idx:% country:%', i.idx, i.val; END LOOP; END IF; END; $$;
DBA

Run the PL/pgSQL block.

Run the target PL/pgSQL block. If there are gaps between the index values of an associative array, no data is stored in those gaps. This allows the Oracle loop to iterate only through the index positions.

DBA

Review the output.

As shown in the following output, the user-defined type stores only populated data elements, which means that the array length matches the number of values. As a result, LOOP iterations are optimized to process only existing data, eliminating the need to track empty positions.

cc_length:5 cc_idx:-2 country:Invalid cc_idx:0 country:China cc_idx:3 country:UK cc_idx:5 country:USA cc_idx:7 country:India
DBA

Related resources

AWS documentation

Other documentation