Tableau Prep – Iterating Group And Replace
We can set up a trigger to receive email alerts when ungrouped categories are detected in the flow, but how do we modify the flow to group those categories? This post covers that, using the same flow as the email trigger post.
Click on the ‘Group Category’ step and you can see that there are two ungrouped categories as mentioned in the email trigger.
There are many ways to include these new categories into our existing groups, but none of them is perfect.
The simplest option is to add the new values to the existing groups manually. Since we only have a couple of ungrouped items in the flow, this will work for us, but it won’t be practical if the number of new items is large.
We can try one of the built-in algorithms, such as ‘Spelling’ or ‘Pronunciation’, but doing so will reset our existing grouping.
That is fine if our existing grouping is based entirely on a built-in algorithm, but what if we fine-tuned the results manually? We would have to redo all those adjustments. It would be much better if Tableau Prep let us keep the existing groups and run the algorithm only to find new matches.
Nested Algorithmic Grouping
You can keep the existing groups and run the grouping algorithm again on top of the existing grouping. To do that, add a new ‘Group and Replace’ operation to the existing ‘Group Category’ step. Please note that we are not editing the existing ‘Group and Replace’ operation, but adding a new one on top of it.
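To make the idea concrete outside of Tableau Prep, here is a minimal Python sketch of the same pattern: existing assignments are left untouched, and a fuzzy match is run only on values that are not yet grouped. The category names, the `nested_group` helper, and the 0.8 cutoff are all hypothetical, and `difflib` is only a stand-in for illustration. It is not the algorithm Tableau Prep uses.

```python
from difflib import get_close_matches

# Hypothetical data: the existing, manually tuned grouping (raw value -> group).
existing_groups = {
    "Furniture": "Furniture",
    "Furnitures": "Furniture",
    "Office Supplies": "Office Supplies",
    "Office Supply": "Office Supplies",
    "Technology": "Technology",
}


def nested_group(values, existing, cutoff=0.8):
    """Keep existing assignments; fuzzy-match only the values not yet grouped."""
    group_names = sorted(set(existing.values()))
    result = {}
    for value in values:
        if value in existing:
            # Already grouped: the earlier (possibly manual) decision wins.
            result[value] = existing[value]
            continue
        # New value: look for the closest existing group name.
        match = get_close_matches(value, group_names, n=1, cutoff=cutoff)
        # No sufficiently close group: leave the value as it is.
        result[value] = match[0] if match else value
    return result


incoming = ["Furnitures", "Technolgy", "Ofice Supplies", "Groceries"]
regrouped = nested_group(incoming, existing_groups)
print(regrouped)
```

In this sketch, ‘Furnitures’ keeps its existing group, the two misspelled values are matched to ‘Technology’ and ‘Office Supplies’, and ‘Groceries’ stays ungrouped because nothing is close enough.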
While this is the easiest way to integrate new values when there are many of them to group, we lose flow readability. How will the changes pane look after 10 iterations? How will you know which change replaces what?
So which method should you choose? Well, it depends a lot on personal preference, but here is my take on this.
Manual – When I only have a couple of new values to be grouped
Nested Algorithmic Grouping – When I just want a quick fix.
Algorithmic Grouping – When I want to refactor the flow for readability. In that case I will remove all groupings and regroup everything in a single step.
This was the final post in my Future Proofing Group And Replace series. Here are the steps involved:
- Remove ungrouped (dirty) data from output
- Generate an email trigger when ungrouped (dirty) data is detected
- Modify the flow to handle the ungrouped (dirty) data