
SOLVED

Is there a column limit on CSV ingestion?

I have two CSV files that I need to merge, either by Union or by pattern matching. In either case, the input CSV file's last column has the correct header (let's say "Column"), but the rows beneath include one row whose last cell contains the value Column,Column,Column and a bunch of rows where the last cell is empty, appearing as ,,,,.

I can investigate further whether the cause is the CSV creation producing inconsistent column counts per row, but I am not creating these files, so I need a reliable way around this issue.
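For what it's worth, this sketch shows the kind of check I have in mind (plain Python; the file name is just a placeholder): tally the number of fields per row and see whether the counts vary.

```python
import csv
from collections import Counter

# Tally the number of fields per row in one of the source files to
# confirm whether the column counts really are inconsistent.
with open("source.csv", newline="") as f:
    counts = Counter(len(row) for row in csv.reader(f))

# e.g. Counter({405: 9998, 409: 2}) would point at malformed rows
print(counts)
```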

I can't find any column limit in the documentation.

The source files are both 405 columns.

3 REPLIES
Trifacta_Alumni
Alteryx Alumni (Retired)

Hi Jonathon,

I don't know if there is a hard limit. (This might be a matter of giving it a go!)
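If you do want to give it a go, something along these lines (purely illustrative Python; names are made up) would generate a CSV wider than your 405 columns to throw at the importer:

```python
import csv

# Generate a deliberately wide CSV (wider than the 405 columns in
# question) to test ingestion empirically.
NCOLS, NROWS = 500, 100

with open("wide_test.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([f"col_{i}" for i in range(NCOLS)])
    for r in range(NROWS):
        writer.writerow([f"r{r}c{i}" for i in range(NCOLS)])
```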

As for the issue of inconsistent column counts, the question is whether the extra information is dispensable (or, how do you want to handle it?). It sounds like it is. If so, then I suggest inserting a recipe step upstream of the merge that discards it. Quick testing (note: on non-Dataprep Wrangler!) indicates that these extra values/placeholders are "stringified" -- quotes, commas and all -- and concatenated with the expected terminal value. Therefore, one solution might look like this:
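Here's a rough stand-in for that step (plain Python/pandas rather than an actual Wrangle recipe; the file name and the column name "Column" are assumptions from your description):

```python
import pandas as pd

# Assumes, per the description, that the stray values end up
# stringified and concatenated into the last column, e.g.
# "Column,Column,Column" or ",,,,".
df = pd.read_csv("merged.csv", dtype=str)
last = df.columns[-1]  # e.g. "Column"

# Keep only the part before the first stray comma, then treat
# leftover junk (empty strings or repeats of the header) as missing.
df[last] = (
    df[last]
    .str.split(",", n=1).str[0]
    .replace({"": pd.NA, last: pd.NA})
)

df.to_csv("merged_clean.csv", index=False)
```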

I'm not sure if this is helpful. Please let me know what you think.

Cheers,

Nathanael

Trifacta_Alumni
Alteryx Alumni (Retired)

Hi again Jonathon,

I spoke with engineering about column limits, and there is indeed no hard limit per se (no document or code says, "Don't exceed X columns, or else!"). However, you may start running into scalability issues at around 1,000+ columns. There is a limit on the size of the overall pipeline, taking into account both the size of the data and, especially, the size of the recipe acting on it (the presumption being that if you have more columns, you are doing [much] more work).

Hope this helps.

Cheers,

Nathanael

This is helpful, thank you Nathanael.

I may then be running into some limit in the source application that is creating the CSVs. As you say, it looks as if the values at the tail end of some rows are being stringified.

If this is consistent, then yes, I can probably handle it upstream.
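For example, something like this (file names are placeholders; 405 is the expected width of the source files) could normalize every row before the files ever reach Designer Cloud:

```python
import csv

# Pad or truncate every row to the expected header width, assuming
# the junk is always extra trailing fields.
EXPECTED = 405

with open("source.csv", newline="") as src, \
     open("source_fixed.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for row in csv.reader(src):
        row = row[:EXPECTED] + [""] * (EXPECTED - len(row))
        writer.writerow(row)
```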