Tableau Prep – Aggregating Strings

Aggregating strings is a common task in data cleaning, but it is not yet supported in Tableau Prep out of the box.  This post is inspired by a Tableau community question where a user asked help on combining data from multiple rows into a single cell. This is a more generalised solution to that same problem.

Before we start, I have to admit that this is a complicated flow for a simple problem. If we had a concatenate function in Tableau Prep aggregation step, we could have done this reshaping in a single step. There is already an idea in the Ideas Forum. Please upvote if you like this idea.

 

Requirement

Data is in tall format which is analysis friendly, but we need to generate an output in csv which has one row per user and all his/her workdays are separated by comma. This output is an input to another application.

Original Data

Original Data

Aggregated Strings

Required Result

This solution includes 5 Steps:

  1. Create Row IDs per group

In our case we need to create row Ids per employee. Please see this post about creating row Ids if you are having difficulty in generating row Ids per employee.

Creating Row IDs

Creating Row IDs

 

Data With Row IDs

Data With Row IDs

2. Convert Row Ids to Field Names

Use a calculation to convert RowIDs like 1,2,3 to F1,F2,F3. We will use this column as field names later.

Creating Field Names

Creating Field Names

3. Union with a dummy File to make field names consistent

Dummy Header File

Dummy Header File

Union with Dummy Header File

Union with Dummy Header File

4. Pivot Rows To columns

Add a pivot step and select row to columns. Drag ROW_ID in to the pivoted fields section and Workday to aggregate section. Use max as aggregation.

Pivot Rows To Columns

Pivot Rows To Columns

5. Concatenate Fields and Clean

Create a field called ‘Work Days’ with below formula¬† and keep only the required fields. Depending on your dataset, you might need to add more Fields in to your calculation. I am using 5 Fields(f1 to F5) here and the result is an aggregate string field.

Calculation to Concatenate Fields

Calculation to Concatenate Fields

 

Final Flow

Final Flow

 

Please download the packaged flowfile to play around with this flow.

 

You may also like...

2 Responses

  1. Anjali says:

    Hi,

    Found this method helpful. However – the last concat step of mine has 788 rows/fields to be joined .. and is crashing Tableau. Don’t suppose there’s a workaround … will have to find another way.

  2. Anjali says:

    Also – may be misunderstanding row_id. Is Employee-1 in the image above your row ID? I don’t think so – as it’s not unique by employee. The row_id tutorial to which you link – indicates row_ids should be unique by row…. yet you mention above row_id by employee (which wouldn’t be unique). Any clarification would be helpful.

Leave a Reply

Your email address will not be published.