1

Obtaining accurate summations in a Many-to-One Relationship

I am having trouble grouping lines correctly so that I can get accurate totals. The basic premise is as follows:

We are trying to transition away from buying item A and instead encouraging buying item X. However, the conversion is not a 1:1 relationship; there are some instances where item A -> X and B -> X and C to X. There are multiple pre-conversion (A,B,C, etc.) and post-conversion (X,Y,Z, etc.) items to deal from. Sometimes there is a 1:1 relationship, but many times there is this many:1 relationship as shown.

I would like to get the total quantity purchased of every item, i.e., both pre-conversion and post-conversion totals. However, the many:1 relationship causes a repetitive value problem:

Facility Pre-Convert Post-Convert

 F1                 A                       X

 F1                 B                       X

 F1                 C                       X

If, for example, X = 25, then the total amount of X purchased here would be 75 instead of the correct 25.

The Question: How do I group the X's into a single value per facility while also keeping the values of A, B, and C?

Some considerations:

1. The actual table is pivoted so that pre-convert and post-convert are separate lines. Trying to join them onto a single table to get a single line per conversion creates a many:1 join nightmare. On that note: if there is a way to solve this question using SQL or Python in a way that doesn't result in the same problem in a Discover (i.e., after that scripting is processed on the model flow), I am all ears.

2. I played around with grouping and list quick calculations. I don't think I'm using them correctly, and based on what I do know, I don't think it will help.

3. The best way I can think of would be to get the average of X per facility and use that as the value to sum across facilities. However, I don't know how I can group the aggregation by facility. I feel like that is the answer though. 

 Thank you,

JAF

3 replies

null
    • John_Fonte
    • 10 mths ago
    • Reported - view

    Also, just thinking out loud, perhaps there is a way in Python to take the value of (top 1 of X) per facility and set the rest of X per the facility to zero. There may be an issue with number of lines depending on how the dimensions are grouped in discover, but again, just an idea.

    • Customer Solutions Architect
    • Moshe_Yossef
    • 10 mths ago
    • Reported - view

    Hi 

    Trying to make sense of this:

    Facility 1 used to purchase Items A, B and C, and is now purchasing X in their stead?

    You want to show both the original purchase (A,B and C) as well the replacing item (X)?

    Your data is actually:
    F1   A    500

    F1   B    300

    F1   C    200

    F1   X   1000

    ?

    If I understand correctly, could it also be that you have purchases of X that were not converted, but an actual purchase? Something like this:

    F1   A    2022 500

    F1   B    2022 300

    F1   C    2022 200

    F1   X   2022 1000

    F1   X   2023  700

    ?

    And you also have a table like this:

    A    X

    B   X

    C   X

    ?

      • John_Fonte
      • 10 mths ago
      • Reported - view

       Pivoting it so that pre-converts and post-converts were in the same column was what I ended up doing, even though the stakeholder wanted a side-by-side display. Separately, I think there was a problem with the cleanliness of the data, because this summing duplication problem was not arising with other conversion projects (even though I still couldn't do a side-by-side display for those other projects either).

      You can close out this question as Answered. I appreciate your help!

Content aside

  • Status Answered
  • 1 Likes
  • 10 mths agoLast active
  • 3Replies
  • 55Views
  • 2 Following