Automated Data Quality Testing at Scale using Apache Spark

With an open-source library from Amazon: Deequ

Tanmay Deshpande
Towards Data Science
7 min read · Jun 29, 2019

Photo by Stephen Dawson on Unsplash

I work as a Technology Architect, mainly responsible for Data Lake, Data Hub, and Data Platform projects. Every day we ingest data from 100+ business systems so that it is available to the analytics and BI teams for their projects.
