ETL is a process of extracting (E), transforming (T) and loading (L) the data into the data warehouse
ETL Development & ETL Testing
ETL is a process of extracting (E), transforming (T) and loading (L) the data into the data warehouse or any other data-centric system. The process involved in developing these ETL processes is time consuming and usually follows the steps below:
1. Understand the Business Requirement
2. Create Technical Specification/ Requirement
3. Generate Mapping Document
4. Develop ETL Code
The Issue: The ETL development process above is typically a complete black box. The business users or business analysts don’t really have any insights into the quality or correctness of this ETL process. Hence it is not only important to test this ETL Process, but ensure that the – ETL testing is transparent to the user’s data warehouse testing needs. Also, it’s not enough to test based on the transformation rules alone, since the technical transformation requirements might be incorrect and may not represent true business requirements.
Ecclesiastes suggests the following approach.
In addition to the ETL development process pipeline as described in the above section, we recommend a parallel ETL testing/auditing pipeline:
1. Understand the business requirements of an auditing and data reconciliation point of view.
2. Derive the audit and ETL testing requirements from the same core business requirements.
3. Based on the audit requirements proceed to generate test cases.
4. Each test case generates multiple Physical rules to test the ETL and data migration process.