Creating a Visual ETL job in Identity Center-based domains
To create a job using Visual ETL in Amazon SageMaker Unified Studio Identity Center-based domains:
-
Log in to Amazon SageMaker Unified Studio and choose a project from the project selector dropdown at the top of the page.
-
In the left navigation pane, under Data analytics, choose Visual ETL.
-
Choose "Create Visual ETL job" to open the Visual ETL editor.
If this is your first time using Visual ETL jobs in Amazon SageMaker Unified Studio, you are asked to choose a default compute permission mode option based on your data access preference.
-
Give the job a name when you begin authoring it.
-
From the dropdown menu next to the Run button, choose the compute permission mode option that supports the data you will be using in the job.
Select project.spark.fineGrained for data managed using fine-grained access, meaning the compute engine can access only specific rows and columns from the full dataset. Choosing this option configures your compute to work with data asset subscriptions from Amazon SageMaker Catalog.
Select project.spark.compatibility for data managed using full-table access, meaning the compute engine can access all rows and columns in the data. Choosing this option configures your compute to work with data assets from AWS and from external systems that you connect to from your project.
-
Choose the "Add nodes" button, then select a node from one of the three tabs: "Data sources", "Transforms", or "Data targets".
-
Drag a source component onto the canvas.
-
Configure the component to connect to your data source by choosing the node and editing its configuration.
-
Add transformation components as needed, connecting them in the desired order.
-
Drag a data target onto the canvas and configure it to specify where the processed data should be stored.
-
Connect the components to create a complete job.
-
Choose the "Checklist" button to check for any configuration errors.
-
To make the job accessible for all project members to view and edit, select "Save to project".
-
Select "Run" to run the job immediately, or run it on a schedule by following the instructions at Scheduling and running visual jobs in Identity Center-based domains.
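
The procedure above builds a graph of connected nodes and then checks it before running. The following Python sketch is only a conceptual model of that idea, not the Amazon SageMaker Unified Studio API. The names `Node`, `Job`, `checklist`, and `choose_permission_mode` are hypothetical and were introduced for illustration. The real "Checklist" button may check different or additional things. The only values taken from the procedure are the two compute permission mode names.

```python
from dataclasses import dataclass, field

# Compute permission mode names as they appear in the procedure above.
FINE_GRAINED = "project.spark.fineGrained"
COMPATIBILITY = "project.spark.compatibility"


def choose_permission_mode(uses_catalog_subscriptions: bool) -> str:
    """Hypothetical helper: pick the mode that matches how the data is managed."""
    return FINE_GRAINED if uses_catalog_subscriptions else COMPATIBILITY


@dataclass
class Node:
    name: str
    kind: str  # assumed labels: "source", "transform", or "target"


@dataclass
class Job:
    name: str
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (upstream, downstream) name pairs

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def connect(self, upstream: str, downstream: str) -> None:
        self.edges.append((upstream, downstream))

    def checklist(self) -> list:
        """Return a list of problems; an empty list means the graph looks complete."""
        problems = []
        if not self.name:
            problems.append("job has no name")
        kinds = {n.kind for n in self.nodes.values()}
        for required in ("source", "target"):
            if required not in kinds:
                problems.append(f"job has no {required} node")
        has_input = {down for _, down in self.edges}
        has_output = {up for up, _ in self.edges}
        for node in self.nodes.values():
            # Every node except a source needs something feeding into it.
            if node.kind != "source" and node.name not in has_input:
                problems.append(f"{node.name} has no incoming connection")
            # Every node except a target needs somewhere to send its output.
            if node.kind != "target" and node.name not in has_output:
                problems.append(f"{node.name} has no outgoing connection")
        return problems


# Illustrative usage with made-up node names.
job = Job("orders-cleanup")
job.add_node(Node("orders", "source"))
job.add_node(Node("drop_nulls", "transform"))
job.add_node(Node("clean_orders", "target"))
job.connect("orders", "drop_nulls")
print(job.checklist())  # two problems: the target is not connected yet
job.connect("drop_nulls", "clean_orders")
print(job.checklist())  # []
print(choose_permission_mode(uses_catalog_subscriptions=True))
```

In this model, a job is ready to run only when it has at least one source and one target and every node is connected on the sides it needs. That mirrors the order of the steps: add and configure nodes, connect them, then check for errors.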