Alteryx Designer Cloud Discussions

SOLVED

I want to search a column and extract 5 terms, if they are present. For example: dog, cat, mouse, bug, person. How do I do this?

 
Trifacta_Alumni
Alteryx Alumni (Retired)

Hi Ellie,

 

To extract any of a list of terms, you can specify multiple patterns in a single Extract transform (https://docs.trifacta.com/display/PE/Extract+Transform). Wrap the whole expression in backticks (`) and separate the alternatives with a pipe (|). As a step it would look like this:

Transformation: Extract

Column: Your_column

on: `dog|cat|mouse|bug|person`

 

Note that these patterns are case-sensitive, so I would recommend converting the column to lowercase first. Also, since a pattern matches anywhere in the text, the step would also pick up any word that contains one of the terms, such as concatenate, doggish, or category. One way around this is to add the Trifacta pattern {delim} before and after each word you want to extract, to specify that each must be a standalone word.

The step would then look like this:

Transformation: Extract

Column: Your_column

on: `{delim}dog{delim}|{delim}cat{delim}|{delim}mouse{delim}|{delim}bug{delim}|{delim}person{delim}`
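
For anyone who wants to see the substring problem outside the product, here is a minimal sketch in Python's `re` module rather than in Trifacta pattern syntax. It is only an analogy: `\b` (a regex word boundary) is a rough stand-in for `{delim}` and may not behave identically, and the names `TERMS`, `loose`, `strict`, and `first_match` are made up for this illustration.

```python
import re

# Terms from the original question.
TERMS = ["dog", "cat", "mouse", "bug", "person"]

# Plain alternation: matches a term anywhere, including inside longer words.
loose = re.compile("|".join(map(re.escape, TERMS)))

# Word-boundary version: only matches a term that stands alone as a word.
strict = re.compile(r"\b(?:" + "|".join(map(re.escape, TERMS)) + r")\b")


def first_match(pattern, text):
    """Return the first matching term in the lowercased text, or None."""
    match = pattern.search(text.lower())
    return match.group(0) if match else None
```

With these definitions, `first_match(loose, "Concatenate the files")` finds "cat" inside "concatenate", while `first_match(strict, "Concatenate the files")` finds nothing.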

 

Additionally, if a term may appear more than once in a row, you might want to use the Extractlist transform (https://docs.trifacta.com/display/PE/Extractlist+Transform) instead of Extract. It extracts every occurrence in a row into a list.
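
The difference between extracting the first match and extracting a list of all matches can be sketched the same way in Python. Again, this is an analogy using `re.findall`, not the Extractlist transform itself; `extract_all` is a hypothetical helper name.

```python
import re

TERMS = ["dog", "cat", "mouse", "bug", "person"]

# Whole-word alternation; the group is non-capturing so findall
# returns the full matched text for each occurrence.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, TERMS)) + r")\b")


def extract_all(text):
    """Return every matching term in the lowercased text, in order of appearance."""
    return pattern.findall(text.lower())
```

For example, `extract_all("A dog chased a cat, then another dog.")` keeps both occurrences of "dog" rather than stopping at the first match.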

Oh cool, I didn't know you could use the pipe character like that. It would be cool if you could just list all the values you'd like to extract. It might be a little bit easier for those who (like myself) weren't aware of the pipe or tick mark usage 🙂.

Hello sirs

 

I'm trying to do something similar; I hope it's OK if I ask here.

 

In my case, the values to search for (dog, cat...) are located in a different dataset, in a separate column (say, column 'category').

Is there a way to do this?

 

Below is a more detailed description, in case there are alternative solutions:

 

I have a table of bank transactions and I'm trying to automatically categorise every transaction in the dataset.

 

One of the columns is named Description. For example, Description has the following patterns:

 

bank commission

bank comm.

bank com. for currency control

b/comm

bank fee

...

 

I want GDP to set the value in column Category to "Bank fees" for each of these patterns it finds in column Description.

 

Because the number of possible patterns is large, I thought I would create a dictionary table with two columns: Pattern and Category.

 

Is it possible in GDP to search Description not for one specific value, but for all the values in the Pattern column of a separate table?

 

Would greatly appreciate your help.
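
To make the dictionary idea above concrete, here is a minimal sketch of the intended logic in plain Python. It is not a GDP recipe and does not show how the product would do it; it only illustrates the lookup being asked for. The `LOOKUP` table reuses the example descriptions from this post, and the names `LOOKUP` and `categorise` are hypothetical.

```python
# Hypothetical dictionary table: one (Pattern, Category) pair per row.
LOOKUP = [
    ("bank commission", "Bank fees"),
    ("bank comm.", "Bank fees"),
    ("bank com.", "Bank fees"),
    ("b/comm", "Bank fees"),
    ("bank fee", "Bank fees"),
]


def categorise(description, lookup=LOOKUP, default=None):
    """Return the category of the first pattern found in the description.

    Matching is a case-insensitive literal substring test, so characters
    such as '.' and '/' in a pattern are taken literally.
    """
    text = description.lower()
    for pattern, category in lookup:
        if pattern.lower() in text:
            return category
    return default
```

Here `categorise("Bank com. for currency control")` returns "Bank fees", and a description that matches no pattern returns the default.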