Posted On: May 12, 2023

AWS Glue Crawlers now let you bring your own JDBC drivers to extract schemas from data sources and populate the AWS Glue Data Catalog. Glue Crawlers already support JDBC Glue connections to supported data sources on AWS. Now you can also supply your own JDBC driver version when a crawler connects to a data source. Supported data sources include PostgreSQL, MySQL, Oracle, SQL Server, and Amazon Redshift.

To use your own JDBC driver, upload the driver file to an Amazon S3 bucket, then configure the Glue connection with the driver's S3 path and class name. On each run, the Glue Crawler starts a Glue job that uses the provided JDBC driver to inspect the schema. The crawler then records the schema information, such as new tables, deletions, and schema updates, in the AWS Glue Data Catalog. With AWS Glue, you can then use the AWS Glue Data Catalog as a source to pull data from these data sources and populate an Amazon S3 target.
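As a rough illustration of the connection setup described above, the following Python sketch builds the input for a Glue JDBC connection that points at a custom driver. The property keys (JDBC_CONNECTION_URL, JDBC_DRIVER_JAR_URI, JDBC_DRIVER_CLASS_NAME, SECRET_ID) are Glue connection properties; the helper names, bucket, host, connection name, and secret name are hypothetical placeholders, and a real connection will typically also need network settings (subnet, security groups) that are omitted here.

```python
def build_connection_input(name, jdbc_url, driver_s3_path, driver_class_name, secret_id):
    """Build a ConnectionInput dict for a Glue JDBC connection with a custom driver.

    Hypothetical helper for illustration; it only assembles the request payload.
    """
    if not driver_s3_path.startswith("s3://"):
        raise ValueError("driver_s3_path must be an s3:// URI")
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": jdbc_url,
            # S3 location of the driver JAR you uploaded
            "JDBC_DRIVER_JAR_URI": driver_s3_path,
            # Fully qualified class name of the driver inside that JAR
            "JDBC_DRIVER_CLASS_NAME": driver_class_name,
            # Assumption: credentials are kept in AWS Secrets Manager
            "SECRET_ID": secret_id,
        },
    }


def create_connection(connection_input):
    """Send the payload to AWS Glue. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_connection(ConnectionInput=connection_input)


# Placeholder values; replace with your own bucket, host, and secret.
connection_input = build_connection_input(
    name="my-mysql-custom-driver",
    jdbc_url="jdbc:mysql://example-host:3306/exampledb",
    driver_s3_path="s3://example-bucket/drivers/mysql-connector-j.jar",
    driver_class_name="com.mysql.cj.jdbc.Driver",
    secret_id="example/mysql/credentials",
)
```

Calling create_connection(connection_input) would register the connection; a crawler whose JDBC target references that connection name would then use the supplied driver on its next run.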

AWS Glue Crawler support for custom JDBC drivers is available in all commercial Regions where AWS Glue is available. For details, see the AWS Region Table. To learn more, visit the AWS Glue Crawler documentation.