Sorting on Multiple Columns in Model
Hello Community,
I have a use case where I need to sort a data set first on column A and then, retaining this order, on column B. From what I gathered, the Sort node allows only one column at a time. Are there any methods?
-
Interesting question, Dmitri!
I sort - sorry: thought! - that I had understood the Sort Wizard and that you should add a second sorting instruction (i.e. Sort 1 and Sort 2), but this didn't do the trick for me. And I am quite sure that I tested all the possible combinations... So, I join you in asking: are there any methods?
-
Hi,
Unfortunately the sorting dialogue currently only supports one level of sort.
Multilevel sort (sort by, and then by) is in development and will be delivered in a forthcoming release.
Hope that helps,
Ian
-
Hi
I think we both assumed you were talking about Discovery. My answer above holds true for Discover, but not for Model.
For Model, as a workaround, concatenate the columns for the first and second sort, then sort on that concatenated column, then drop it.
Hope that helps.
Ian
-
I am sorry, I was talking about Model :). Good to know that this is in the works for Discovery! Thanks for the tip! The workaround makes sense. However, for better usability it should be handled in the Sort node itself, similar to what some data integration tools allow.
-
Hi,
It's been pointed out to me that my method above, concatenating the fields and then sorting on the concatenated field, will not always work correctly.
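To illustrate one way the concatenation approach can go wrong, here is a small plain-Python sketch. The sample rows are made up for illustration and this is not the Sort node's actual logic. Once numbers are glued together as text they are compared character by character, so 10 ends up before 9:

```python
# Hypothetical (Income, Children) rows, for illustration only.
rows = [(9, 2), (10, 3), (10, 1)]

# Concatenation workaround: build one text key per row and sort on it.
# The keys are "92", "103", "101", which compare as text, not as numbers.
by_text_key = sorted(rows, key=lambda r: f"{r[0]}{r[1]}")

# True multilevel sort: compare the first column, then the second.
by_columns = sorted(rows, key=lambda r: (r[0], r[1]))

print(by_text_key)  # the rows with Income 10 come before the row with Income 9
print(by_columns)   # Income 9 first, then the two Income 10 rows by Children
```

A concatenated key also cannot mix directions (one column ascending, the other descending), which is another reason to sort on the columns themselves.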
As a workaround for now, use a Python script node to multilevel sort the data. Here's an example sorting a table by Income ascending, then by number of Children, descending:
outputDF = inputDF.sort_values(by=['Income', 'Children'], ascending=[True, False])
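If you want to try the line above outside the tool first, the following is a minimal standalone sketch. It assumes pandas is installed and uses a made-up table (the Name column and all values are hypothetical) in place of the inputDF that the script node would provide:

```python
import pandas as pd

# Hypothetical sample data standing in for the script node's input table.
inputDF = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cid", "Dee"],
    "Income": [50000, 30000, 50000, 30000],
    "Children": [1, 2, 3, 0],
})

# Sort by Income ascending, then by Children descending within equal Income.
outputDF = inputDF.sort_values(
    by=["Income", "Children"],
    ascending=[True, False],
).reset_index(drop=True)  # renumber rows so the index follows the new order

print(outputDF)
```

The `by` and `ascending` lists are matched by position, so each sort column gets its own direction.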
We've discussed this internally and will be addressing your requirements in future releases.
Hope that helps.
Ian