For example, my data set includes many rows of transactions with the same vendors over time, each vendor represented by a unique Vendor Number. Each vendor's name, however, may vary over the years, so I end up with multiple distinct Vendor Names for any given distinct Vendor Number.
The end result I need is a single Vendor Name for any given Vendor Number. In the absence of a Vendor Table to pull from, I'd like to pick the most recently-paid transaction of a given Vendor Number and use its Vendor Name only.
Thoughts? I know how to do this in SQL, but I'm trying to leverage Trifacta's native functions and practices.
Solved! Go to Solution.
Hi @Adam Houhoulis? , thank you for your question!
You can achieve the desired result in a few recipe steps:
1.Create a ROWNUMBER() column
First, we'd like to identify which record has the most up-to-date vendor name for each ID. To do this, we'll use the window function ROWNUMBER(). We'll sort by date (descending, hence a minus ('-') sign before the column) and group by company ID:
2.erase values that aren't the current company name
Now we ensure that all the values in the company name column that aren't in the row with the rownum==1 will be replaced with NULL values. We'll do this by using the IF() function:
Note that instead of creating the 'rownum' column and using it, we could use the MAXDATE() function to indicate which record has the most up-to-date company name:
Although it saves us a step, I think the ROWNUMBER() approach is more readable and would recommend it.
3.fill in the blanks
Now, for each company ID, we have one company name; the other values of this column are empty for each group.
We can use the FILL() function to fill in the gap:
Now for each company_id, the company name will always be the same; the most up-to-date company name.
Note that this is dynamic; the newest company name will be chosen each time you run the job.
also, these steps can be done in 1 recipe step via nesting all the functions:
I would recommend splitting the process into at least 2 steps to make the process more readable.
I hope this makes sense; please feel free to ask any additional questions.
Thanks,
Amit.
Excellent! Thank you so much @Amit Miller? !
You're very much welcome @Adam Houhoulis? ! =]