Tableau Prep – Aggregating Strings
Aggregating strings is a common task in data cleaning, but it is not yet supported in Tableau Prep out of the box. This post is inspired by a Tableau community question where a user asked help on combining data from multiple rows into a single cell. This is a more generalised solution to that same problem.
Before we start, I have to admit that this is a complicated flow for a simple problem. If we had a concatenate function in Tableau Prep aggregation step, we could have done this reshaping in a single step. There is already an idea in the Ideas Forum. Please upvote if you like this idea.
Data is in tall format which is analysis friendly, but we need to generate an output in csv which has one row per user and all his/her workdays are separated by comma. This output is an input to another application.
This solution includes 5 Steps:
- Create Row IDs per group
In our case we need to create row Ids per employee. Please see this post about creating row Ids if you are having difficulty in generating row Ids per employee.
2. Convert Row Ids to Field Names
Use a calculation to convert RowIDs like 1,2,3 to F1,F2,F3. We will use this column as field names later.
3. Union with a dummy File to make field names consistent
4. Pivot Rows To columns
Add a pivot step and select row to columns. Drag ROW_ID in to the pivoted fields section and Workday to aggregate section. Use max as aggregation.
5. Concatenate Fields and Clean
Create a field called ‘Work Days’ with below formula and keep only the required fields. Depending on your dataset, you might need to add more Fields in to your calculation. I am using 5 Fields(f1 to F5) here and the result is an aggregate string field.
Please download the packaged flowfile to play around with this flow.