Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Cloud Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Cloud.
SOLVED

I have a monthly data, 36 files, and I want to union them, is there a way to assign the file name ( month 01/2018, 02/2018 ) to rows in a new column before I parametrize my dataset.

 
4 REPLIES 4
Trifacta_Alumni
Alteryx Alumni (Retired)

Hi @Talha Ardic?,

 

You can use the "$filename" metadata parameter in your recipe. This will store the path of your source file in a column in your dataset. If you are working with parameterized source datasets, the $filename metadata parameter will insert the name of each source file captured by your parameter next to the rows that come from each file.

 

Here's how to add the $filename metadata parameter to your recipe:

First, create a "New formula" transformation step to add a column to your dataset. Click into the "Formula" box to open the contextual help menu. From the contextual help menu, click the link that reads "Metadata".

 

This will display the list of metadata operations that you can insert into your recipe. Choose the option that reads "$filepath".

 

This operation will insert the full file path for your source data into your dataset.

 

Be aware that if you have multiple datasets in your flow, followed by a union transformation, you will want to use the $filename metadata parameter in each recipe before the union. That will look roughly like the image below:

Thank you so much for your help Connor! But this is actually not what I'm looking for. Let me elaborate little bit more. I have 36 excel files with two sheets in each. Sheet one has meta data and one of the rows describe the date interval which sheet two is based on, and sheet two has all the transaction data.(1000s of rows). And each file is named by month. If I load files one by one that's gonna be too messy with 100's of sources and recipes in the flow and will be confusing, so i create data set with parameters under S3 tab and add wild card pattern parameter to the folder which selects all my files and combines them in one. But since I don't have a date column in sheet 2 I can't tell what information belong to what month. Is there a way to assign the name of the file to rows in this scenario? Or can I add the date interval that's given in sheet 1 to sheet 2? (can't make a join, there are no common columns)

Trifacta_Alumni
Alteryx Alumni (Retired)

Thanks for the additional information! The $filename metadata parameter will work across parameterized Excel files on S3. Here is my example parameterized dataset based on two Excel files:

Each of these files has two sheets, which I chose to create as separate datasets.

 

In the recipe associated with Sheet 1, I've added the $filename metadata parameter:

You can see that this will display the full file path on S3 for the Excel file. The sheet name is appended to the end of the path.

 

Since my initial sample only shows records from the first file in the folder, the transformer grid displays the full filename for the single file only. To verify that each row will be returned with the correct filename, I collected a random sample. Here is the result:

You can see that the rows from the first file have the name excel1.xlsx, and the rows from the second file are assigned the file name excel2.xlsx.

 

If you want, you can do some additional cleanup to split the filename on the "/" character to only retain the filename, and not the full directory path and sheet name.

 

You can perform this same operation on sheet 2 to create a filename column in that dataset as well. Once you have the filename in both sheets, you should be able to perform a join to combine the data from both sheets.

This was very helpful, thank you so much Connor!