Abstract:
Extraction-Transformation-Loading (ETL) tools are pieces of
software responsible for the extraction of data from several
sources, and for their cleansing, customization, and insertion into a data
warehouse. In this paper, we focus on the problem of
defining ETL activities and provide formal foundations for
their conceptual representation. The proposed conceptual model is
(a) customized for the tracing of inter-attribute relationships and
the respective ETL activities in the early stages of a data
warehouse project; (b) enriched with a 'palette' of
frequently used ETL activities, such as the assignment of surrogate
keys and the check for null values; and (c) constructed in a
customizable and extensible manner, so that the designer can
enrich it with their own recurring patterns for ETL activities.