Posted On: Mar 29, 2021

AWS Step Functions is now integrated with Amazon EMR on Amazon Elastic Kubernetes Service (Amazon EKS), making it easier to integrate Apache Spark-based jobs into your analytics pipeline. You can now build workflows that include steps to manage EMR on EKS virtual clusters and submit jobs, without writing code to track the state of each job.

Step Functions allows you to build resilient serverless workflows using AWS services such as Amazon Athena, Amazon EMR, Amazon EKS, and now Amazon EMR on EKS. EMR on EKS is a deployment option for Amazon EMR that allows you to run Apache Spark on Amazon EKS. You can run Amazon EMR-based applications alongside other types of applications on the same Amazon EKS cluster to improve resource utilization and simplify infrastructure management across multiple AWS Availability Zones.

The three API actions supported by Step Functions’ integration with Amazon EMR on EKS are:

1. createVirtualCluster: creates and registers an Amazon EMR on EKS virtual cluster from an existing Amazon EKS namespace and then waits until the request completes.
2. deleteVirtualCluster (.sync): deletes an Amazon EMR on EKS virtual cluster and then waits until it is deleted.
3. startJobRun (.sync): submits a job to Amazon EMR on EKS and then waits until the job completes.
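
The three actions above can be chained into a single workflow. The following is a minimal sketch, not a definitive implementation: it uses Python to build an Amazon States Language definition that creates a virtual cluster, runs one Spark job, and deletes the cluster. The function name build_definition, the state names, the virtual cluster and job names, the release label, and all argument values are illustrative assumptions; check parameter names against the Step Functions and EMR on EKS documentation before use.

```python
import json

# Resource prefix for the Step Functions EMR on EKS service integration.
EMR_CONTAINERS = "arn:aws:states:::emr-containers"


def build_definition(eks_cluster, namespace, role_arn, entry_point):
    """Return a state machine definition (as a dict) that chains the three actions.

    All argument values are placeholders supplied by the caller (assumptions).
    """
    return {
        "StartAt": "CreateVirtualCluster",
        "States": {
            "CreateVirtualCluster": {
                "Type": "Task",
                "Resource": f"{EMR_CONTAINERS}:createVirtualCluster",
                "Parameters": {
                    "Name": "example-virtual-cluster",  # hypothetical name
                    "ContainerProvider": {
                        "Id": eks_cluster,
                        "Type": "EKS",
                        "Info": {"EksInfo": {"Namespace": namespace}},
                    },
                },
                # Keep the API response so later states can read the cluster Id.
                "ResultPath": "$.cluster",
                "Next": "RunSparkJob",
            },
            "RunSparkJob": {
                "Type": "Task",
                # .sync: the workflow waits here until the job completes.
                "Resource": f"{EMR_CONTAINERS}:startJobRun.sync",
                "Parameters": {
                    "VirtualClusterId.$": "$.cluster.Id",
                    "Name": "example-spark-job",  # hypothetical name
                    "ExecutionRoleArn": role_arn,
                    "ReleaseLabel": "emr-6.2.0-latest",  # assumed release label
                    "JobDriver": {
                        "SparkSubmitJobDriver": {"EntryPoint": entry_point}
                    },
                },
                "ResultPath": "$.job",
                "Next": "DeleteVirtualCluster",
            },
            "DeleteVirtualCluster": {
                "Type": "Task",
                # .sync: the workflow waits here until the cluster is deleted.
                "Resource": f"{EMR_CONTAINERS}:deleteVirtualCluster.sync",
                "Parameters": {"Id.$": "$.cluster.Id"},
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_definition(
        eks_cluster="example-eks-cluster",
        namespace="example-namespace",
        role_arn="arn:aws:iam::123456789012:role/example-job-role",
        entry_point="s3://example-bucket/scripts/job.py",
    )
    print(json.dumps(definition, indent=2))
```

The printed JSON could be supplied as the definition when creating a state machine. This sketch omits error handling; in practice a Catch on the job state that routes failures to the delete state would help avoid leaving a virtual cluster behind.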

To get started, please visit the AWS Step Functions Documentation and view our blog post.

Step Functions integration with EMR on EKS is generally available in the following regions: US East (Ohio and N. Virginia), EU West (Ireland), and Canada (Central). In the coming days, it will become generally available in all other commercial regions where both Step Functions and EMR on EKS are available. For a complete list of regions and service offerings, see AWS Regions.

To learn more about Step Functions, please visit the AWS Step Functions page. To learn more about Amazon EMR on EKS, please visit the Amazon EMR on EKS page.