Open topic with navigation
Denormalization of data is useful for business intelligence reporting, and can be accomplished using horizontal pivoting via one of the following methods:
Data denormalization can be achieved in DMExpress using two different methods depending on whether or not you know how many new columns will be generated. Both solutions are included in the attached example.
When you know in advance how many new columns will be created as a result of the horizontal pivot, use aggregation to pivot.
In the following Source Data, each person has three records, one for each email address of a given EmailType: home, work, or other. To denormalize this data, we want to end up with one record for each person, with each email type in a separate column as shown in the Target Data.
In DMExpress, this can be achieved by defining the following Conditions, Values, and Aggregate transformation, as demonstrated in the HorizontalPivotingAggr.dxt example task.
The following conditions check the content of EmailType:
isHomeEmail: Person.EmailType = 'home'
isWorkEmail: Person.EmailType = 'work'
isOtherEmail: Person.EmailType = 'other'
The following three values are defined for each of the new target data columns, "home", "work", and "other", using the conditions created previously to store the respective email address if available, otherwise NULL:
email_home = IfThenElse(isHomeEmail,Person.Email,NULL)
email_work = IfThenElse(isWorkEmail,Person.Email,NULL)
email_other = IfThenElse(isOtherEmail,Person.Email,NULL)
As DMExpress processes the incoming data, it creates a virtual matrix where new email "values" are mapped as shown in the above image. It then aggregates the data to create the pivoted result:
The final result creates a single horizontal pivoted record for each person, with all three email ids in separate columns.
When you don’t know in advance how many new columns will be created by the horizontal pivot, use cross record logic, via User Defined Values (UDV), to pivot.
In the following Source Data, each person has multiple monthly sales records, each containing Month Number and Month Sales. To denormalize this data, we want to show only one record per person and pivot all monthly sales data into vertical columns as shown in the Target Data.
In DMExpress, this can be achieved using two tasks in succession:
Because it’s unknown ahead of processing how many months of data will exist for a given person, this approach uses User Defined Values (UDV) to prepare the pivoting record progressively until a key break occurs, whereupon all values get reset.
The Sort task performs the following functions:
This task defines the following values and conditions and creates the intermediate file shown in the above image. To follow the evaluation flow easily, consider the evaluation of WriteSales, which, as part of the Reformat, drives the evaluation of the rest of the values and conditions.
The following values and conditions are defined:
WriteSales = IfThenElse(IsPersonChange,ResetUDVs,PivotedSales)
IsPersonChange = Person.PersonID != GetPrevPersonID
GetPrevPersonID = GetUserDefinedValue('prevPersonID')
ResetUDVs = EvaluateExpressionList(
PivotedSales = SetUserDefinedValue('pivotedSales',
GetUserDefinedValue('pivotedSales') || '|' || Person.MonthSales)
This task applies aggregation to create the final pivoted result:
In the final target file, the field delimiter is changed to a pipe ('|'), the delimiter used to separate the concatenated monthly sales in the previous task, so that all pivoted monthly sales data will appear in separate columns.
The final result creates a single horizontal pivoted record for each person with all monthly sales separated in different columns.
This example uses an intermediate file so you can see the output from the first task, but it is recommended that you enable Direct Data Flow between the sort and aggregate tasks for better performance.
63_HorizontalPivoting.zip, compatible with DMExpress version 7.12 or higher
|details on the First and Last functions||Aggregate dialog in the DMExpress Help|
|details on user defined values and functions||User defined values and functions in the DMExpress Help|
|details on direct data flows||Direct data flows and performance in the DMExpress Help as well as Best Practices When Using Direct Data Flows|
|the inverse solution||How to Perform Vertical Pivoting (Normalization)|
Copyright © 2016 Syncsort All rights reserved.