Data Extraction, Transformation and Loading

2016-12-05T10:57:40+00:00

For Information Management solutions such as Data Warehouses, Data Marts and other Analytical Applications to provide the maximum return on investment, enterprises must thoroughly understand their data and develop efficient and effective Extract, Clean, Transform and Load (ECTL) processes. These ECTL processes gather data from the siloed and fragmented sources that often exist in a business.

The ECTL process is a significant portion of any Data Warehouse project or program effort, often accounting for 50% of the time and resources. It is therefore crucial that this investment is made carefully and follows best practices in this area.

netlogx Information Architects work with enterprises to identify and understand the critical data needed for an application or business process. An understanding of the business needs, and of the consistency and validity of the data, is of primary importance. Once these are established, the ECTL process can be designed and implemented. In general, the process is as follows:

  • Extract – Data is extracted from the source systems and made accessible for further processing. The main objective of the extract step is to retrieve the required data from the source systems in the most effective and efficient manner.
  • Clean – Data cleaning is extremely important because it ensures the quality of the data. It applies basic data unification rules, such as making identifiers unique, converting values to standardized formats and validating them against third-party resources.
  • Transform – Data transformation applies rules that convert the data from the source structure to the target structure. It converts measured data to common dimensions so that it can later be joined, joins data from several sources, calculates aggregates, generates surrogate keys, sorts, derives new calculated values, and applies additional validation rules.
  • Load – Data loading ensures that the load is performed correctly and completely. Referential integrity must be maintained by the ECTL process or tool to ensure consistency.
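
As an illustration only, the four steps above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a description of any particular ETL tool: the two in-memory source lists, the table names (dim_customer, fact_invoice), the column names and the two date formats are all hypothetical, and an in-memory SQLite database stands in for the target warehouse.

```python
import sqlite3
from datetime import datetime

# Hypothetical rows standing in for two siloed source systems.
CRM_ROWS = [
    {"customer_id": " c001 ", "name": "Acme Ltd"},
    {"customer_id": "C002", "name": "Birch Co"},
    {"customer_id": "c001", "name": "Acme Ltd"},  # duplicate once cleaned
]
BILLING_ROWS = [
    {"customer_id": "C001", "date": "05/12/2016", "amount": "120.50"},
    {"customer_id": "c002", "date": "2016-12-06", "amount": "80"},
    {"customer_id": "C001", "date": "2016-12-07", "amount": "29.50"},
]


def extract():
    """Extract: retrieve the required rows from each source."""
    return list(CRM_ROWS), list(BILLING_ROWS)


def clean_id(raw):
    """Standardize an identifier so that equal customers compare equal."""
    return raw.strip().upper()


def clean_date(raw):
    """Convert either assumed source format (ISO or DD/MM/YYYY) to ISO 8601."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {raw!r}")


def clean(customers, invoices):
    """Clean: make identifiers unique and convert values to standard formats."""
    unique = {}
    for row in customers:
        unique.setdefault(clean_id(row["customer_id"]), row["name"].strip())
    cleaned = [
        {
            "customer_id": clean_id(r["customer_id"]),
            "date": clean_date(r["date"]),
            "amount": float(r["amount"]),
        }
        for r in invoices
    ]
    return unique, cleaned


def transform(customers, invoices):
    """Transform: generate surrogate keys and join invoices to customers."""
    keys = {cid: n for n, cid in enumerate(sorted(customers), start=1)}
    dim = [(keys[cid], cid, customers[cid]) for cid in sorted(customers)]
    # A KeyError here means an invoice refers to an unknown customer.
    fact = [(keys[r["customer_id"]], r["date"], r["amount"]) for r in invoices]
    return dim, fact


def load(conn, dim, fact):
    """Load: write both tables in one transaction with foreign keys enforced."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, "
        "source_id TEXT UNIQUE NOT NULL, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE fact_invoice (customer_key INTEGER NOT NULL "
        "REFERENCES dim_customer(customer_key), "
        "invoice_date TEXT NOT NULL, amount REAL NOT NULL)"
    )
    with conn:  # commits on success, rolls back if any insert fails
        conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", dim)
        conn.executemany("INSERT INTO fact_invoice VALUES (?, ?, ?)", fact)


def run_pipeline():
    """Run the four steps in order and return the loaded connection."""
    conn = sqlite3.connect(":memory:")
    customers, invoices = clean(*extract())
    dim, fact = transform(customers, invoices)
    load(conn, dim, fact)
    return conn
```

Calling run_pipeline() returns a connection whose fact_invoice rows can be joined back to dim_customer on customer_key, for example to total invoice amounts per customer. Enforcing the foreign key in the target is one possible way to maintain referential integrity during the load; a production process would typically add logging, reject handling and incremental extraction as well.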

Benefits

Enterprises with robust ECTL processes are able to take advantage of data that may be locked up or hidden in siloed systems and turn it into valuable information through Data Warehouses, Data Marts and Analytics Applications.