Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
grossal
15 - Aurora
15 - Aurora

This is the follow-up blog article to my Alteryx Inspire Denver 2022 presentation, “The Beauty of Python Macros.” Have you ever run into an issue and thought it’s impossible to solve in Alteryx? Yes? Then come with me so I can show you the power of Python and Alteryx together!

 

Python SDK vs. Python Macro

 

First, we should clarify what a Python macro actually is. A Python macro simply is a standard Alteryx macro that has the Python tool inside it – either exclusively or combined with other tools. The Python SDK (Software Development Kit), on the other hand, is a custom development kit to build custom tools. The SDK allows you to bring in various functionalities that a macro wouldn’t allow, and it simplifies building a custom interface for the custom tool.

 

When should you use what?

 

If you want to build a feature-rich custom connector, then you should go for the Python SDK. Otherwise, Python macros cover the other 99% and are fast and easy to use.

 

2022-06-06-08-13-32-POWERPNT.png

 

The Basics

 

You don’t need to be a full-time Python developer to use the Python tool in Alteryx. For most of the tools shown below, you theoretically don’t need to know Python at all. In general, I’d say that a Python beginner course that teaches for-loops, variables, and if-else concepts is good enough. If you pack a one-hour Pandas course on top, you are more than ready to try your first steps.

 

No matter if you have used Python before or not, there are always the same three questions:

  • How do we get data in?
  • How do we get data out?
  • How do we install python packages?

 

Reading data into your Python tool can be done using the Alteryx.read(‘#1’) command. The #1 needs to be the name of the connection that you input. If you have multiple incoming connections, it is highly recommended to rename them to make your code inside the tool easier to read. Alteryx works with Pandas dataframes. Therefore the read function returns one of them, and the write function expects to receive a pandas dataframe.

 

When trying to get data out, you need to pass a dataframe object as the first argument and the output number as the second. While we allow having infinite input sources, we can only output to five output anchors, and therefore the second number has to be between 1 and 5.

 

Lastly, we need to know how to install additional packages to really get the most out of Python. There are two ways how we can do this:

  • Package.installPackages([‘Package1’,’Package 2’]) is the Alteryx recommended way of doing it.
  • !pip install Package – this is sort of a hack. You simply put an exclamation mark in front of the normal pip command and install packages.

 

2022-06-06-08-17-03-POWERPNT.png

 

What’s the difference?

 

Alteryx also uses the pip command inside the Alteryx.installPackages command, but I noticed that I often had to open Alteryx Designer as an Admin to allow the Python tool to install packages. On the other hand, the !pip install bypasses this and does not need to have admin rights, but it has downsides as well. The command only works in the interactive and not in the production mode. This might not be a problem if you simply use it on your own because you can completely remove the package install line after the installation. Still, once you share a macro and put it in production mode, you should probably go with the Alteryx intended way – even though the people would need to open Designer with Admin rights – once – to install the packages.

 

Python Tool Interface

 

The interactive mode in the Python tool uses a Jupyter notebook. In case you ever need to find help, search for "Jupyter notebook " + your question in Google. There are three things that are important for this guide:

1) The scissors icon on top is used to remove the boilerplate code because we never need them.

 

2021-02-07 15-45-33.png

 

2) Once you have written the code in one block/box, you can hit CTRL + ENTER to execute it.

3) If you want to open a new code block, you can click on the plus icon next to the scissor icon that we marked in the screenshot above.

 

The Magic

 

Let’s take a look at a couple of examples that were shown at my Inspire presentation.

 

Sending Smartphone notifications

 

For this one, I’d like to pass you on to my previous Python article, which explains how to build a pushbullet tool for notifications in all its details. This article will focus on tools that have not been published before. Read it here.

 

Create ZIP file

 

Why and when do we need it?

 

I had two reasons when building the tool:

  • E-Mail attachments that were simply so big that they stopped my mail from being delivered.
  • Too many e-mail attachments. This is not really a problem, but it doesn’t look good if you send 20 attachments in a single mail.

 

How do we build it?

 

If you have not read my article on the smartphone notification, you might be surprised, but the answer is: we simply google it. Search for “Python create zip file ” and pick the first one.

 

 

import shutil
shutil.make_archive(output_filename, 'zip', dir_name)

 

 

The best part about that is that we don’t need to install a package because shutil is part of the default packages that are already installed with Python. Therefore, we can simply copy and paste the code. Afterward, we can use two control parameters with action tools to replace output_filename (file path without .zip at the end) and dir_name (the folder you'd like to zip), and we are set and done if we keep it simple. Make sure to put both variables in quotation marks.

 

Afterward, this would look like this:

 

grossal_3-1654497064293.png

 

Then we can use two control parameters like this:

 

grossal_5-1654497127485.png

 

We need to select the productionModeScript value (after we have switched the Python tool to production mode) and select “Replace a specific string” at the bottom. For the first one, we replace “output_filename,” and for the second, we replace “dir_name.”              

 

grossal_4-1654497103852.png

 

Surely, we can improve this in many ways, but the first target is always: Make it work. This gets the job done without any additional lines of Python code. Would I recommend making improvements? 100%. At some point, I'll probably write another blog about it, but today's blog is about what was shown at Inspire this year.

 

Ensure File Path Exists

 

Why and when do we need it?

 

When using custom output locations using the ‘Change Entire Filepath,’ we often run into an issue that we can either only save all files to a single folder OR have to create all folders in advance. But what if we have customer folders and get a couple of new customers? The Output will error if we don’t have them created before. It would be nice if we’d automatically create all the folders before trying to write to them – right? That’s exactly what this tool does.

 

How do we build it?

 

Would you like to guess? If you said ‘Google it,’ then you are totally right. Look out for ‘Python create folder if not exists’ and pick the first one again.

 

 

import os
os.makedirs(path, exist_ok=True)

 

 

Yet again, the os package is part of the default installation, and we can copy and paste both lines.

 

Afterward, we only need to replace the path with the folder path (using another control parameter and an action tool) that we’d like to create, and we are ready to go! Like the previous example, it is important that you wrap the path variable with quotation marks before using an action tool on it.

 

Create QR Code

 

When and why do we need it?

 

This falls more into the very specific cases. I had a couple of customers who used Excel-VBA solutions or went to websites to create their QR codes. This works fine if you only need to generate a couple of QR codes, but what if you need to create 100,000? It might get just a bit overwhelming!

 

How do we build it?

 

Take a wild guess! Right. Google. You are noticing a theme here, aren’t you? I’m no magician--I’m a normal user like you are, and if I face a new problem, I google it. Search for ‘Python create QR code.’ In my case, the pypi package site showed up as the first hit. You can think of pypi as the Alteryx Gallery for Python packages.

 

The site shows us how to install it using pip (screenshot below). We can copy this line and add a ! (exclamation point) in front to install the package.

 

 

!pip install qrcode[pil]

 

 

Afterward, we can copy the following lines to create the QR code.

 

 

import qrcode
img = qrcode.make('Some data here')
type(img)   # qrcode.image.pil.PilImage
img.save("some_file.png")

 

 

The third line can be removed as it is only used to show the type of the image variable. Yet again, we can simply use two action tools to replace the text "Some data here" with the text we'd like to have in the QR code and the "some_file.png" with our file path.

 

Merge PDFs

 

When and why do we need it?

 

This was actually a request/post from one of the Alteryx community members. Due to compliance issues, they had an RPA bot that extracts single PDFs for every sheet in a dashboard. Afterward, they wanted to send a single PDF with all relevant sheets to a customer. It's probably a rather uncommon use case but let's do it anyway!

 

How do we build it?

 

I hope you can guess by now, but if not, I'll tell you one more time: Google it. My first hit included the following code snippet:

 

 

from PyPDF2 import PdfMerger

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf', 'file4.pdf']

merger = PdfMerger()

for pdf in pdfs:
    merger.append(pdf)

merger.write("result.pdf")
merger.close()

 

 

When you copy-paste this one, you'll quickly run into an error because the PyPDF2 is not installed by default. If we search for 'Python install PyPDF2', we'll quickly find the pypi-page for it.

 

The installation is as simple as before, we can add a ! in front of the shown command below, and we are good to go.

 

 

!pip install PyPDF2

 

 

For this one, it's a bit “trickier” to get the code to work. I'd recommend concatenating all file paths that you want to combine with a summarize tool and then replacing the "['file1.pdf','file2pdf'...]" expression with just a single action tool.

 

What should I take away?

 

Python is easy, and there are a lot of pre-built packages like macros in the Alteryx Gallery. You can easily install them and make them work in Alteryx. I hope this blog or my session at Inspire Denver got you interested in Python programming. 

 

What would you recommend to get started?

 

I'd recommend searching for free YouTube courses that involve:

  • Variables, Loops, If-Else
  • Pandas beginner course

 

The first one is important to get used to general concepts, and the second can be used to get familiar with Pandas DataFrames and how to work with them.

 

But ... can you please share your production macros?

 

No 😊 This blog is meant to get you started and you won't learn a single thing if you just use my macros. Try it yourself. It is not that hard and I have faith you can do it!

 

But ... I have really tried it for hours and hours and it doesn't work. Can you please share it?

No, but you can write me a DM with what you tried, and I'll be more than happy to hop on a Zoom/Teams to help you get it right.

Additionally you might just look at the Alteryx Gallery. I already saw that someone built a macro for one of the use cases shown above. Maybe someone else is kind enough to post a ready-to-use macros for you, but it won't be me. It's my goal to teach it and I really want you to learn not copy.

 

What do you usually change in a production macro?

 

It’s usually just two things. First, I combine it with a couple other Alteryx tools. E.g. the unique tool for the zips as there is no point in creating the same zip file multiple times. The second aspect is error handling and making the tool more robust. I add error messages if users pass wrong inputs (e.g. no valid file path) to my macro/script.

 

If you have made it that far, let me know your feedback and if you'd like to see a webinar on this topic :-)

 

Best,

Alex

 

P.S. Feel free to connect to me on LinkedIn.

P.P.S. Thank you @joshuaburkhow for proofreading this blog!

 

Comments