Talend has various component which will help us to achieve Excel Like Transpose, for that we are going use tDenormalise and tNormalize component.
Scenario: Process spreadsheet by columns instead of by rows.
Our goal is to convert mentioned input to Expected output see the below image with source and expected output.
Follow below steps to achieve result.
Step 1 : Create Metadata of Input Excel, Using Excel metadata node.
Configure below properties during Metadata creation.
- File name: Provide our excel file with path.
- If you are reading excel with 2007 (.xlsx) files then select the “Read Excel2007 File format (.xlsx)”.
- Select First sheet (Sheet1) for “Please Select Sheet” and “Set Sheet parameters”
After this you see below output in metadata viewer tab.
Step 2: Denormalising source data.
- Drag and drop tFileInputExcel component to job designer.
- Add tMap after tFileInputExcel component and connect with main connection.
- Inside tMap just select column which has actual data ( B to F)
- Add tDenormalize component and configured as shown in below Image.
- Add tLogRow after tDenormalize and connect with main connection.
- Save the job and run, it will show you below output.
Now we got our first output which is some what transpose to rows, but we need it to convert all these columns to one column, and into rows.
Step 3: Normalise the output to get Expected result.
- Add tMap after tDenormalize and connect with Main connection. and do configuration as below.
- Create output in tMap and create only one column named with “All”
- concatenation Source columns in output column “All” as below.
- Add tNormalize component after tMap and connect with main connection.
- Configure tNormalize to normalize.
- “Column to Normalize”=”All”
- Item Separator=”~”
- Add tLogRow after tNormalize to see the final result.
- Add tExtractDelimitedFields component after tNormalize and connect with main connection.
- Configure tExtractDelimitedFields component as follows.
- “Field to Split” = “All”
- “Field Separator”=”,” (comma)
- Edit schema and create six columns as “A,B,C,D,E,F,G” with string data type.
- Add tLogRow after tExtractDelimitedFields component and connect with main connection.
- Save the job and run it will show below output.
We got our final output using various component and tactics. This can be achieved using any other workflow, Talend has many components which provides you better flexibility to do such data processing.
If you have developed same thing with different or same component but well optimised then do share with us, to help other with best practices.
Use contact us page to send your post. or comment us below.