Use RunJobFlow with an AWS SDK - AWS SDK Code Examples

There are more AWS SDK examples available in the AWS Doc SDK Examples GitHub repo.

The following code examples show how to use RunJobFlow.

Python
SDK for Python (Boto3)
Note

There's more on GitHub. Find the complete example and learn how to set up and run in the AWS Code Examples Repository.

import logging

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)


def run_job_flow(
    name,
    log_uri,
    keep_alive,
    applications,
    job_flow_role,
    service_role,
    security_groups,
    steps,
    emr_client,
):
    """
    Runs a job flow with the specified steps. A job flow creates a cluster of
    instances and adds steps to be run on the cluster. Steps added to the cluster
    are run as soon as the cluster is ready.

    This example uses the 'emr-5.30.1' release. A list of recent releases can be
    found here:
        https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-components.html.

    :param name: The name of the cluster.
    :param log_uri: The URI where logs are stored. This can be an Amazon S3 bucket URL,
                    such as 's3://my-log-bucket'.
    :param keep_alive: When True, the cluster is put into a Waiting state after all
                       steps are run. When False, the cluster terminates itself when
                       the step queue is empty.
    :param applications: The applications to install on each instance in the cluster,
                         such as Hive or Spark.
    :param job_flow_role: The IAM role assumed by the cluster.
    :param service_role: The IAM role assumed by the service.
    :param security_groups: The security groups to assign to the cluster instances.
                            Amazon EMR adds all needed rules to these groups, so
                            they can be empty if you require only the default rules.
    :param steps: The job flow steps to add to the cluster. These are run in order
                  when the cluster is ready.
    :param emr_client: The Boto3 EMR client object.
    :return: The ID of the newly created cluster.
    """
    try:
        response = emr_client.run_job_flow(
            Name=name,
            LogUri=log_uri,
            ReleaseLabel="emr-5.30.1",
            Instances={
                "MasterInstanceType": "m5.xlarge",
                "SlaveInstanceType": "m5.xlarge",
                "InstanceCount": 3,
                "KeepJobFlowAliveWhenNoSteps": keep_alive,
                "EmrManagedMasterSecurityGroup": security_groups["manager"].id,
                "EmrManagedSlaveSecurityGroup": security_groups["worker"].id,
            },
            # Each step submits one Spark script through command-runner.jar.
            Steps=[
                {
                    "Name": step["name"],
                    "ActionOnFailure": "CONTINUE",
                    "HadoopJarStep": {
                        "Jar": "command-runner.jar",
                        "Args": [
                            "spark-submit",
                            "--deploy-mode",
                            "cluster",
                            step["script_uri"],
                            *step["script_args"],
                        ],
                    },
                }
                for step in steps
            ],
            Applications=[{"Name": app} for app in applications],
            JobFlowRole=job_flow_role.name,
            ServiceRole=service_role.name,
            EbsRootVolumeSize=10,
            VisibleToAllUsers=True,
        )
        cluster_id = response["JobFlowId"]
        logger.info("Created cluster %s.", cluster_id)
    except ClientError:
        logger.exception("Couldn't create cluster.")
        raise
    else:
        return cluster_id
  • For API details, see RunJobFlow in AWS SDK for Python (Boto3) API Reference.
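
The Python example expects each entry in steps to be a dictionary with the keys name, script_uri, and script_args. The following is a minimal sketch of how those dictionaries map to the Steps payload, written so it can be inspected locally without calling AWS. The helper name build_spark_steps, the bucket name, and the script arguments are hypothetical and are not part of the original example.

```python
def build_spark_steps(steps):
    """Build a Steps payload with the same shape the example passes to RunJobFlow.

    build_spark_steps is a hypothetical helper introduced for illustration only.
    """
    return [
        {
            "Name": step["name"],
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                # spark-submit arguments: fixed options, then the script, then its args.
                "Args": [
                    "spark-submit",
                    "--deploy-mode",
                    "cluster",
                    step["script_uri"],
                    *step["script_args"],
                ],
            },
        }
        for step in steps
    ]


# Hypothetical input: the bucket, script, and arguments are placeholders.
example_steps = [
    {
        "name": "estimate-pi",
        "script_uri": "s3://amzn-s3-demo-bucket/scripts/pi.py",
        "script_args": ["--partitions", "3"],
    }
]

payload = build_spark_steps(example_steps)
for entry in payload:
    # Show the command line each step would run on the cluster.
    print(entry["Name"], "->", " ".join(entry["HadoopJarStep"]["Args"]))
```

Checking the payload this way, before any cluster is requested, can help catch a missing key or a script argument in the wrong position.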

SAP ABAP
SDK for SAP ABAP
Note

There's more on GitHub. Find the complete example and learn how to set up and run in the AWS Code Examples Repository.

    TRY.
        " Create instances configuration
        DATA(lo_instances) = NEW /aws1/cl_emrjobflowinstsconfig(
          iv_masterinstancetype = 'm5.xlarge'
          iv_slaveinstancetype = 'm5.xlarge'
          iv_instancecount = 3
          iv_keepjobflowalivewhennos00 = iv_keep_alive
          iv_emrmanagedmastersecgroup = iv_primary_sec_grp
          iv_emrmanagedslavesecgroup = iv_secondary_sec_grp
        ).

        DATA(lo_result) = lo_emr->runjobflow(
          iv_name = iv_name
          iv_loguri = iv_log_uri
          iv_releaselabel = 'emr-5.30.1'
          io_instances = lo_instances
          it_steps = it_steps
          it_applications = it_applications
          iv_jobflowrole = iv_job_flow_role
          iv_servicerole = iv_service_role
          iv_ebsrootvolumesize = 10
          iv_visibletoallusers = abap_true
        ).

        ov_cluster_id = lo_result->get_jobflowid( ).
        MESSAGE 'EMR cluster created successfully.' TYPE 'I'.
      CATCH /aws1/cx_emrinternalservererr INTO DATA(lo_internal_error).
        DATA(lv_error) = lo_internal_error->if_message~get_text( ).
        MESSAGE lv_error TYPE 'E'.
      CATCH /aws1/cx_emrclientexc INTO DATA(lo_client_error).
        lv_error = lo_client_error->if_message~get_text( ).
        MESSAGE lv_error TYPE 'E'.
    ENDTRY.
  • For API details, see RunJobFlow in AWS SDK for SAP ABAP API reference.