
SOLVED

Is there a column limit on CSV ingestion?

I have two CSV files that I need to merge, either by Union or by pattern matching. In either case, the input CSV file's last column has the correct header (let's say "Column"), but the rows beneath include one row whose last cell contains the value Column,Column,Column and a bunch of rows where the last cell is empty, appearing as ,,,,.

I can investigate further whether the cause is the CSV creation producing inconsistent column counts per row, but I am not creating these files, so I need a reliable way around this issue.
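For what it's worth, this sketch shows the kind of check I have in mind (plain Python; the file name is just a placeholder): tally the number of fields per row and see whether the counts vary.

```python
import csv
from collections import Counter

# Tally the number of fields per row in one of the source files to
# confirm whether the column counts really are inconsistent.
with open("source.csv", newline="") as f:
    counts = Counter(len(row) for row in csv.reader(f))

# e.g. Counter({405: 9998, 409: 2}) would point at malformed rows
print(counts)
```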

I can't find any column limit in the documentation.

The source files are both 405 columns.

3 REPLIES
Trifacta_Alumni
Alteryx Alumni (Retired)

Hi Jonathon,

I don't know if there is a hard limit. (This might be a matter of giving it a go!)
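If you do want to give it a go, something along these lines (purely illustrative Python; names are made up) would generate a CSV wider than your 405 columns to throw at the importer:

```python
import csv

# Generate a deliberately wide CSV (wider than the 405 columns in
# question) to test ingestion empirically.
NCOLS, NROWS = 500, 100

with open("wide_test.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([f"col_{i}" for i in range(NCOLS)])
    for r in range(NROWS):
        writer.writerow([f"r{r}c{i}" for i in range(NCOLS)])
```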

As for the issue of inconsistent column counts, the question is whether the extra information is dispensable (or, how do you want to handle it?). It sounds like it is. If so, then I suggest inserting a recipe step upstream of the merge that discards it. Quick testing (note: on non-Dataprep Wrangler!) indicates that these extra values/placeholders are "stringified" -- quotes, commas and all -- and concatenated with the expected terminal value. Therefore, one solution might look like this:
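Here's a rough stand-in for that step (plain Python/pandas rather than an actual Wrangle recipe; the file name and the column name "Column" are assumptions from your description):

```python
import pandas as pd

# Assumes, per the description, that the stray values end up
# stringified and concatenated into the last column, e.g.
# "Column,Column,Column" or ",,,,".
df = pd.read_csv("merged.csv", dtype=str)
last = df.columns[-1]  # e.g. "Column"

# Keep only the part before the first stray comma, then treat
# leftover junk (empty strings or repeats of the header) as missing.
df[last] = (
    df[last]
    .str.split(",", n=1).str[0]
    .replace({"": pd.NA, last: pd.NA})
)

df.to_csv("merged_clean.csv", index=False)
```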

I'm not sure if this is helpful. Please let me know what you think.

Cheers,

Nathanael

Trifacta_Alumni
Alteryx Alumni (Retired)

Hi again Jonathon,

I spoke with engineering about column limits, and there is indeed no hard limit per se (no document or code says, "Don't exceed X columns, or else!"). However, you may start running into scalability issues at around 1,000+ columns. There is a limit on the size of the overall pipeline, taking into account both the size of the data and, especially, the size of the recipe acting on it (the presumption being that if you have more columns, you are doing [much] more work).

Hope this helps.

Cheers,

Nathanael

This is helpful, thank you Nathanael.

I may then be running into some limit in the source application that is creating the CSVs. As you say, it looks as if the values at the tail end of some rows are being stringified.

If this is consistent, then yes, I can probably handle it upstream.
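For example, something like this (file names are placeholders; 405 is the expected width of the source files) could normalize every row before the files ever reach Designer Cloud:

```python
import csv

# Pad or truncate every row to the expected header width, assuming
# the junk is always extra trailing fields.
EXPECTED = 405

with open("source.csv", newline="") as src, \
     open("source_fixed.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for row in csv.reader(src):
        row = row[:EXPECTED] + [""] * (EXPECTED - len(row))
        writer.writerow(row)
```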