Alteryx Designer Cloud Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Cloud.
SOLVED

I'm new to Wrangler, and am trying to figure out how to remove carriage returns from within fields.

Trifacta_Alumni
Alteryx Alumni (Retired)

Seems like this would be a simple thing, but I've not found a way to resolve the issue I'm encountering. Please see the image attached. The third column should read "sam knows sheila likes to eat cake" but the field contains a carriage return after the name 'sheila'.

Your attention and help would be much appreciated.

3 REPLIES 3
Trifacta_Alumni
Alteryx Alumni (Retired)

Hey Ed,

 

You can start by removing the structure from your dataset. In the flow, if you select the source data and choose remove structure

Then, when you enter the grid, the first step you will have to do is split rows on a pattern. Generally this defaults to \n, but you can customize the pattern depending on the way the dataset is structured. With files that have excess carriage returns or new lines, you sometimes have to get creative with the way you split rows and split columns. If you are able to post this dataset to this thread I'm happy to take a look!

 

Best,

David

Trifacta_Alumni
Alteryx Alumni (Retired)

This is a sample file that emulates the real working file I'm dealing with.

 

Thanks a bunch!

Trifacta_Alumni
Alteryx Alumni (Retired)

Hey Ed,

 

sorry for the delay!

 

Had to get a little bit creative here since it was actually a faulty new line character. What I basically did was modify the first split rows step so as not to actually split rows, then removed the specific new line character that was faulty, then created an array on the now valid new line characters, flattened the array, and lastly split on the | character. I'll show you what I mean with some screenshots (note I took these in 5.0, but all of these steps should be doable in 4.1 they might just take on slightly different naming conventions. If you have any trouble with these let me know!)

 

Step 1: remove structure as I did in the post above

Step 2: modify split rows

Edit the split rows step and remove the \n (in 4.1 you might actually just have do something like \n\n since I'm not sure if you can have an empty split rows before version 5.0)

Step 4: Replace faulty \n by specifying the text before and after the new line

This step finds \n occuring after sheila and before likes and replaces it with empty text

Step 5: Create an array, with each element separated by \n

The {any}+ will pick up any string of text up until the specified delimiter, which in this case is \n

 

Step 6: Flatten array values into rows

Click on the column header and find the suggestion to flatten. this will turn each element of the array into a new row

 

Step 7: Split on '|' to create columns

Click and drag over any of the pipe | delimiters, this will give you a suggestion to split 4 times

 

Lastly you can add a header if you would like.

 

Sometimes with these ill-structured csv's it takes a little bit of unconventional methods to parse it out. But it should be doable! Let me know if this helps or if you'd like me to further elaborate any of the steps!