2

Sorting on Multiple Columns in Model

Hello Community,

I have a use case where I need to sort a data set first on column A and then, retaining this order, on column B. From what I gathered, the Sort node allows only one column at a time. Are there any methods?

6 replies

null
    • Lead Consultant Advanced Analytics
    • Michael_Daun
    • 2 mths ago
    • Reported - view

    Interesting question, Dmitri!
    I sort - sorry: thought! 馃槈 - that I had understood the Sort Wizard and that you should add a second sorting instruction (i.e. Sort 1 and Sort 2) but this didn't do the trick for me. And I am quite sure that I tested all the possible combinations...

    So, I join you in this: Are there any methods?

    • Senior Director of Product
    • Ian_Macdonald
    • 2 mths ago
    • Reported - view

    Hi  , ,

    Unfortunately the sorting dialogue currently only supports one level of sort.

    Multilevel sort (sort by, and then by) is in development and will be delivered in a forthcoming release.

    Hope that helps,

    Ian

    • Senior Director of Product
    • Ian_Macdonald
    • 2 mths ago
    • Reported - view

    Hi 

    I think both myself and  thought you were talking about Discovery. My answer above holds true for Discover, but not for Model.

    For Model, as a workaround, concatenate the columns for the first and second sort, then sort on that concatenated column, then drop it.

    Hope that helps.

    Ian

    • Dmitri_Kanounnikov
    • 2 mths ago
    • Reported - view

    I am sorry, I was talking about the Model :). Good to know that Discovery is in the works on this point!   , thanks for the tip! The workaround makes sense. However, for better usability it should be taken care of in the Sort Node. Similar to what some data integration tools allow:

    • Senior Director of Product
    • Ian_Macdonald
    • 2 mths ago
    • Reported - view

    Hi  ,

    It's been pointed out to me that my method above of concatenating the fields and then sorting on the concatenated filed will not always work correctly.

    As a workaround for now, use a Python script node to multilevel sort the data. Here's an example sorting a table by Income ascending, then by number of Children, descending:

    outputDF=inputDF.sort_values(by=['Income','Children'], ascending=[True, False])

    We've discussed this internally and will be addressing your requirements in future releases.

    Hope that helps.

    Ian

      • Dmitri_Kanounnikov
      • 1 mth ago
      • Reported - view

       , Thanks for the update!

Content aside

  • Status Answered
  • 2 Likes
  • 1 mth agoLast active
  • 6Replies
  • 55Views
  • 3 Following