Difference between tAggregateRow and tAggregateSortedRow

The difference between tAggregateRow and tAggregateSortedRow was a question put to me, so I searched for answers and found nothing, neither on the Talend Help Center nor on Google. Hence this post.

This is our input for the demonstration.

[Image: tAggregateRow vs tAggregateSortedRow in Talend]

I know that not all of these columns are required, but I still wanted to use them.

tAggregateRow: Receives a flow and aggregates it based on one or more columns. For each output line, are provided the aggregation key and the relevant result of set operations (min, max, sum…).

Question 1: How do we display the maximum quantity by continent?

Step 1: Create a simple job with the input given above, a tAggregateRow and a tLogRow.

Step 2: Connect the input to tAggregateRow and apply the following settings.

  • Add two columns to the output schema of the tAggregateRow component, for Quantity and Continent. Your final schema should look like the image below.
[Image: tAggregateRow vs tAggregateSortedRow]
  • Apply the following settings in tAggregateRow:
    • In the “Group by” table, add Continent as both input and output.
    • In the “Operations” table, add the Quantity column as both input and output, and select max from the function list. See the image below for details.
[Image: tAggregateRow vs tAggregateSortedRow]
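
To make the idea behind these settings concrete, here is a minimal Java sketch of a "group by Continent, max of Quantity" aggregation. It is not the code Talend generates; the Row class, the method name and the sample rows are hypothetical, because the real input is only visible in the screenshot. The point is that a map with one entry per key does not care in which order the rows arrive.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HashAggregateSketch {
    // Hypothetical row type: the post's real input is only visible in its screenshot.
    static class Row {
        final String continent;
        final int quantity;

        Row(String continent, int quantity) {
            this.continent = continent;
            this.quantity = quantity;
        }
    }

    // Group by continent and keep the largest quantity seen for each key.
    // Every key owns one map entry, so the rows may arrive in any order.
    static Map<String, Integer> maxByContinent(List<Row> rows) {
        Map<String, Integer> maxPerKey = new LinkedHashMap<>();
        for (Row r : rows) {
            maxPerKey.merge(r.continent, r.quantity, Integer::max);
        }
        return maxPerKey;
    }

    public static void main(String[] args) {
        // Made-up sample rows, deliberately not sorted by continent.
        List<Row> rows = List.of(
                new Row("Asia", 10), new Row("Europe", 7), new Row("Asia", 25),
                new Row("Africa", 3), new Row("Europe", 12));
        System.out.println(maxByContinent(rows)); // {Asia=25, Europe=12, Africa=3}
    }
}
```

The sketch holds one entry per group in memory until the whole input has been read, which is the usual price of aggregating an unsorted flow.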

With the basic settings done, we can execute the job and get the output without any problem; the question was easy. With tAggregateSortedRow it becomes more complicated, because the official description of tAggregateSortedRow says:

tAggregateSortedRow: tAggregateSortedRow receives a sorted flow and aggregates it based on one or more columns. For each output line, are provided the aggregation key and the relevant result of set operations (min, max, sum…).

Let's see how it behaves with our example job.

Add another subjob with the same input and output, replacing tAggregateRow with tAggregateSortedRow. Use the same settings as for tAggregateRow, and additionally set “Input Rows Count”=7 (we have only seven rows).

But the outputs are different; see the image below, which shows both outputs.

[Image: tAggregateRow vs tAggregateSortedRow output]

The outputs differ because the flow going into the tAggregateSortedRow component is not sorted. That gives us our first difference:

tAggregateSortedRow works on sorted rows only, whereas tAggregateRow performs the same operation without needing sorted rows.
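
Why would sorting matter? A minimal Java sketch of a streaming aggregator, assuming it closes a group as soon as the key changes, shows the effect. This is my own illustration and not Talend's generated code; the Row class, the maxPerRun method and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SortedAggregateSketch {
    // Hypothetical row type with made-up column names.
    static class Row {
        final String continent;
        final int quantity;

        Row(String continent, int quantity) {
            this.continent = continent;
            this.quantity = quantity;
        }
    }

    // Keeps only the current group in memory and emits "key=max"
    // whenever the key changes, so it is correct only for sorted input.
    static List<String> maxPerRun(List<Row> rows) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        int currentMax = 0;
        for (Row r : rows) {
            if (r.continent.equals(currentKey)) {
                currentMax = Math.max(currentMax, r.quantity);
            } else {
                if (currentKey != null) {
                    out.add(currentKey + "=" + currentMax); // key changed: flush finished group
                }
                currentKey = r.continent;
                currentMax = r.quantity;
            }
        }
        if (currentKey != null) {
            out.add(currentKey + "=" + currentMax); // flush the last group
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> unsorted = List.of(
                new Row("Asia", 10), new Row("Europe", 7), new Row("Asia", 25));
        List<Row> sorted = List.of(
                new Row("Asia", 10), new Row("Asia", 25), new Row("Europe", 7));
        System.out.println(maxPerRun(unsorted)); // [Asia=10, Europe=7, Asia=25]
        System.out.println(maxPerRun(sorted));   // [Asia=25, Europe=7]
    }
}
```

On the unsorted rows, Asia appears twice in the result because its rows are not adjacent; once the rows are grouped by key, there is one line per continent.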

Step 2: Add tSortRow to the tAggregateSortedRow flow.

  • Add a tSortRow component after the input, and connect it to the input and to tAggregateSortedRow using main flows.
  • Configure the tSortRow component as follows.
    • Sync the columns using the sync button.
    • Inside the “Criteria” table, add one row with
      • Schema column=continent
      • sort num or alpha?=alpha
      • Order asc or desc?=asc
  • Now execute the same job; we get the output below.
[Image: tAggregateRow vs tAggregateSortedRow output 2]

Now the results match, but the rows come out in a different order.

That gives us our second difference:

tAggregateRow does not sort the result, whereas tAggregateSortedRow works on a sorted flow, which is why it produces its result in sorted order.

This is the final job design used for the demonstration.

[Image: tAggregateRow vs tAggregateSortedRow Job]

We will now use the same job for a further demonstration.

Step 3: Modify the tAggregateSortedRow settings.

  • We are working with a fixed-flow input, so we know how many rows the input flow contains. We will now change the setting to
    • “Input Rows Count”=0
  • Execute the job; you will get the output below.
[Image: tAggregateRow vs tAggregateSortedRow output 3]

That gives us another difference:

tAggregateRow does not depend on the input row count, meaning we can use tAggregateRow without knowing how many rows will arrive, whereas tAggregateSortedRow requires the input row count in advance.
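
One plausible reason for this requirement, which I have not confirmed against Talend's generated code, is that a row-at-a-time aggregator cannot look ahead, so it needs some signal to know when to emit the last group. The Java sketch below assumes the declared row count is that signal. The method name, the parameters and the sample data are hypothetical, and the sketch does not claim to reproduce the exact output in the screenshot above.

```java
import java.util.ArrayList;
import java.util.List;

class RowCountSketch {
    // Hypothetical push-style aggregator over input that is already sorted by key.
    // It flushes a group when the key changes, and flushes the final group
    // only when the declared row count says the current row is the last one.
    static List<String> maxPerGroup(String[] keys, int[] quantities, int declaredRowCount) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        int currentMax = 0;
        for (int i = 0; i < keys.length; i++) {
            if (keys[i].equals(currentKey)) {
                currentMax = Math.max(currentMax, quantities[i]);
            } else {
                if (currentKey != null) {
                    out.add(currentKey + "=" + currentMax); // key changed: flush finished group
                }
                currentKey = keys[i];
                currentMax = quantities[i];
            }
            if (i + 1 == declaredRowCount) {
                out.add(currentKey + "=" + currentMax); // declared last row: flush final group
                currentKey = null;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up rows, already sorted by continent.
        String[] keys = {"Africa", "Asia", "Asia", "Europe", "Europe"};
        int[] quantities = {3, 10, 25, 7, 12};
        System.out.println(maxPerGroup(keys, quantities, 5)); // [Africa=3, Asia=25, Europe=12]
        System.out.println(maxPerGroup(keys, quantities, 0)); // [Africa=3, Asia=25]
    }
}
```

With the correct count every group is emitted; with a count of 0 the sketch never learns that the input has ended, so the last group is lost. A hash-based aggregation has no such problem because it emits everything after the input is exhausted.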

Apart from these, I did not see any major differences between the two components; they behave similarly except for the differences above.

About Umesh

I am a software consultant with approximately 7 years of experience, mainly in Business Intelligence and data warehousing assignments using Talend. Writing is not my passion, but I do it to help others. If you have a special case that you would like me to demonstrate, please post it to me.