Hello Jared, is there a pattern to the two duplicate values? Perhaps the values before and after these duplicate values could be used to distinguish them?
The Replace transform, which allows you to tailor the actions based on what is being matched, would be a good place to start.
https://www.trifacta.com/support/articles/article/625245-replace-transform/
I looked through that article, but I'm still not quite sure what to do. The two rows that I want to fix each have the value "Active" in the "customer_status" column, but these rows represent the same customer, just for different subscription dates. I want to change "Active" to "Expired" for the older of the two subscriptions, that is, the record with the oldest end date. I keep trying to make this change, but I can't figure out how to add the condition to the replace operation.
In this case, I would recommend that you use the customer ID and the dates as indicators for when to flip from "active" to "subscribe". In this situation, a combination of the Window transform and the PREV function would be more useful to you.
An example of the formula would be,
window value: if(UnixTime > prev(UnixTime, 1), 'subscribe', 'active') group: CustID order: subscription_date
Key points:
Within each CustID group, ordered by subscription_date, this formula compares each entry's UnixTime with the previous entry's and changes the output text to 'subscribe' on the later of the two entries.
Note: The newly generated column would need to be renamed.
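If it helps to check the logic outside of the tool, here is a rough Python sketch of what the formula above does. This is only an illustration, not Wrangle syntax: the function name `label_rows`, the output key `label`, and the sample rows are all made up for the example, and it assumes that the comparison against a missing previous value (the first row of each group) evaluates to false.

```python
from itertools import groupby
from operator import itemgetter


def label_rows(rows, new_label="subscribe", default_label="active"):
    """Sketch of: if(UnixTime > prev(UnixTime, 1), new_label, default_label)
    with group: CustID and order: subscription_date.
    (Hypothetical helper, not part of any Trifacta API.)"""
    # Sort so that each customer's rows are adjacent and in date order.
    ordered = sorted(rows, key=itemgetter("CustID", "subscription_date"))
    out = []
    for _, group in groupby(ordered, key=itemgetter("CustID")):
        prev_time = None  # assumption: the first row in a group has no previous value
        for row in group:
            is_later = prev_time is not None and row["UnixTime"] > prev_time
            out.append({**row, "label": new_label if is_later else default_label})
            prev_time = row["UnixTime"]
    return out


# Made-up sample data: customer C1 has two subscriptions, C2 has one.
rows = [
    {"CustID": "C1", "subscription_date": "2020-01-01", "UnixTime": 1577836800},
    {"CustID": "C1", "subscription_date": "2019-01-01", "UnixTime": 1546300800},
    {"CustID": "C2", "subscription_date": "2020-06-01", "UnixTime": 1590969600},
]

for r in label_rows(rows):
    print(r["CustID"], r["subscription_date"], r["label"])
```

The two label strings are parameters, so you can try other values there to see how the output column would change.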
Here's the link to the documentation.
https://docs.trifacta.com/display/PE/PREV+Function
Perfect, thanks for clarifying! This makes sense now.
Here's the current doc link: https://docs.trifacta.com/display/SS/Replace+Transform