Hi,
I would like to combine data sets based on a key found in an array of keys:
Example: A is in [A,B,C]
Practical application: I have data-set A as per-email data on one hand, and data-set B as per-person email array on the other hand. I would need to bring in data from A to B. That is, if email in A-row matches with an email in email array of B-row, then join data.
Data Set A
john@gmail.com ; USA
sandy@yahoo.com; UK
Data Set B
[john@gmail.com, john@hotmail.com] ; John Doe
[sandy@yahoo.com, sandy.b@company.com]; Sandy Smith
Joined A and B
[john@gmail.com, john@hotmail.com] ; John Doe; USA
[sandy@yahoo.com, sandy.b@company.com]; Sandy Smith; UK
Solved! Go to Solution.
Hi @Benjamin Sasin?,
I have arrived at a rather inelegant solution. I'll investigate further. In the meantime...
Recipe B:
Does this help?
Regards,
Nathanael
Hi again @Benjamin Sasin?,
So the root concept behind my solution is sound, which is to JOIN on the elements of the array individually. But a much better way to do it than I proposed above is simply to:
Much nicer!
Cheers,
Nathanael
Thanks. I will try that.
For the first method: from my understanding the join in step 3 won;t be dynamic. So I can't predict how many joins I would have to have and so I cannot automate the process.
Your second method may be preferable. I'll give it a go,
Hi Nathaneal,
Once I have done the join and then merged columns,
I end up with an "emails" column (the copy of the email array), which I then need to collapse back into a single row per unique array.
What tool do I use to group these rows back together?
Okay I think I figured it out, first I had to extract unique values from the combined email array and also order them then I can combine them using group by.