Alteryx Designer Cloud Discussions

SOLVED

Do Parameterized Datasets automatically Union the inputs where the parameters match multiple files?

The Google Cloud Dataprep (GCDP) documentation does not make clear how multiple matched files are handled, other than saying they should be of the same type.

Is this a Union operation?

3 REPLIES
Trifacta_Alumni
Alteryx Alumni (Retired)

Hey Jonathon,

 

Yes, in essence this is a Union operation (though it happens outside of the Wrangle language, so it differs in practice from the Union transform). Adding an entire directory will union all of the files within it into a single dataset, and using parameters is a way of specifying a subset of a directory to union into a single dataset.

 

We will look into modifications to the Trifacta docs to add some clarity on this.

 

Thank you!

David

Thanks for the clarification.

 

Having experimented a little, it is clear that this is a simple file concatenation rather than a data union in the spirit of the Union tool within the wrangler. To wit, the header rows of the constituent files are included as data rows.

 

I shall explore what process I can use to 'normalise' the columns between the CSVs I was trying to gather in this way. Working within a flow with defined inputs is no problem, and Dataprep unions those files without issue. But in a scenario where the number of inputs is variable, this is more complicated.

My solution to this so far is to bring the dataset in unstructured and then to filter rows that match my header.
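For anyone who wants to see the idea outside Dataprep, here is a minimal Python sketch of the same approach: treat the concatenated files as unstructured lines, keep the first line as the header, and drop any later line identical to it. The function name and the sample file contents are made up for illustration, and this is not how Dataprep implements anything internally.

```python
def drop_repeated_headers(lines):
    """Keep the first line as the header; drop later lines identical to it."""
    rows = [line.rstrip("\r\n") for line in lines]
    rows = [r for r in rows if r]  # ignore blank lines
    if not rows:
        return []
    header = rows[0]
    return [header] + [r for r in rows[1:] if r != header]


# Hypothetical contents of two matched files, concatenated as-is,
# so the second file's header ends up in the middle of the data.
file_a = "id,name\n1,alpha\n2,beta\n"
file_b = "id,name\n3,gamma\n"
concatenated = (file_a + file_b).splitlines()

cleaned = drop_repeated_headers(concatenated)
print(cleaned)
```

Note that this only works when every file has exactly the same header line. If the column order or names differ between files, the repeated headers will not match and will stay in the data.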

 

Honestly, I have found that just explicitly setting the structure in unbounded data sets works far better than having Dataprep infer the structure.
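As a rough illustration of what setting the structure explicitly buys you, here is a small Python sketch (not Dataprep itself) that reads each file against a fixed, declared column list and aligns the rows by column name rather than by position. The column names, sample data, and function names are all assumptions for the example.

```python
import csv
import io

# Hypothetical schema declared up front instead of being inferred per file.
EXPECTED_COLUMNS = ["id", "name", "amount"]


def read_with_schema(text, columns=EXPECTED_COLUMNS):
    """Parse one CSV text and project each row onto the declared columns."""
    reader = csv.DictReader(io.StringIO(text))
    # Columns missing from a file are filled with an empty string.
    return [{col: (row.get(col) or "") for col in columns} for row in reader]


def union_by_name(texts, columns=EXPECTED_COLUMNS):
    """Stack rows from any number of CSV texts, aligned by column name."""
    combined = []
    for text in texts:
        combined.extend(read_with_schema(text, columns))
    return combined


# Two hypothetical files: the second has a different column order
# and lacks the "amount" column.
files = [
    "id,name,amount\n1,alpha,10\n",
    "name,id\nbeta,2\n",
]
rows = union_by_name(files)
print(rows)
```

Because each file's own header is consumed when that file is parsed, no header rows leak into the data, and the number of input files can vary freely.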