Posted On: Jan 20, 2022

We are announcing support for using Apache Spark SQL to update AWS Glue Data Catalog tables when the Amazon EMR integration with AWS Lake Formation is enabled.

Amazon EMR integration with AWS Lake Formation allows you to define and enforce database, table, and column-level permissions when Apache Spark users access data in Amazon S3 through the Glue Data Catalog. Previously, when AWS Lake Formation integration was enabled, you could only read data using Spark SQL statements such as SHOW DATABASES and DESCRIBE TABLE. Now, you can also insert data into or update Glue Data Catalog tables with these statements: INSERT INTO, INSERT OVERWRITE, and ALTER TABLE.
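
As a rough illustration of the three newly supported statement types, the sketch below builds one example of each and runs them through a Spark session. The database, table, and staging table names (`sales_db`, `orders`, `orders_staging`) are hypothetical, as are the helper functions `build_statements` and `run_statements`. The specific ALTER TABLE clause shown (SET TBLPROPERTIES) is an assumption chosen for illustration; this announcement does not say which ALTER TABLE variants are covered.

```python
# Minimal sketch: all database and table names here are hypothetical.

def build_statements(database="sales_db", table="orders", staging="orders_staging"):
    """Return one example statement for each newly supported statement type."""
    target = f"{database}.{table}"
    source = f"{database}.{staging}"
    return [
        # Append rows from the staging table to the target table.
        f"INSERT INTO TABLE {target} SELECT * FROM {source}",
        # Replace the target table's contents with the staging rows.
        f"INSERT OVERWRITE TABLE {target} SELECT * FROM {source}",
        # Assumed ALTER TABLE variant: update a table property.
        f"ALTER TABLE {target} SET TBLPROPERTIES ('comment' = 'updated via Spark SQL')",
    ]


def run_statements(spark, statements):
    """Run each statement through any object that exposes SparkSession.sql()."""
    for statement in statements:
        spark.sql(statement)


# On a cluster where pyspark is installed, usage might look like:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.enableHiveSupport().getOrCreate()
#   run_statements(spark, build_statements())
```

Whether each statement succeeds depends on the Lake Formation permissions granted to the user running it, so a statement that works for one user may be denied for another.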

This feature is enabled on Amazon EMR 5.34 in the following AWS Regions: US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Stockholm), Canada (Central), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), and South America (São Paulo).