How to create a loss triangle in Pyramid Analytics
Does anyone know of a way to create a loss triangle in Pyramid Analytics?
It depends to some extent on your underlying data as to how you approach this, but you will need two attributes to generate the triangle, the year in which the policies came into force (Accident Year below) and successive time periods since then when claims have been made (Development Period below). You will need to have these available or be able to derive them from your data.
In my example below, built on my experience in the P&C insurance business (think Lloyds of London), we create the triangle based on the Accident Year (i.e. the year in which the policy starts) and how many "development periods" have elapsed since then, typically on a 12 month basis. So for the data set below, we have claims made after the policy came into force 0-12, 13-24, 25-36 etc. months after that time. If finer granularity is required, then simply substitute the appropriate time periods.
For each Accident Year / Development Period there will be a value for claims made.
Here's a simple data set showing the data I used:
We can easily construct a cross tab of Accident Year on Rows and Develop Period on columns:
However, most triangles show cumulative values for the claims made in both the Rows and Columns, i.e. cumulative down each column, and cumulative across the rows.
We need to construct a calculation that will accomplish this.
First we create a cumulative down each column. Using Formulate, create a Range for the Accident Year, starting for the first Accident Year and ending on the "current member", selecting the Claims Measure, and wrapping both in a SUM():
This will give us a cumulative value for Claims down each column.
We now need to take these cumulative values and create cumulative values across the rows. We use the same technique for Development Periods, using the previous expression as the value to add up:
Substituting our calculated cumulative claims for the raw claims measure, viola!:
And as a typical claims chart by year:
Of course, you can make this a whole lot more dynamic by driving the Range using Parameters to let the user select the time ranges for Accident Year and Development Period required and of course filter the whole thing on Class of Business for different insurance markets (Marine, Aviation etc.) or Regions. But I'll leave that for you to do!
Hope that helps!.
Thank you Ian! This is exactly what I needed. I couldn't figure out how to get it to sum properly, but now I know the trick.
Hi, so my example is slightly different but also along the profit loss but I am looking for percentages. The excel spreadsheet shows all the red outcomes I would need to achieve. Hopefully someone smart could give me some guidance as this is the first time I am using Pyramid Analytics.
OK, let's take the easy one first!
To create the % Drop Off Triangle we need to divide the value for each cell by the corresponding value in Period 1 column for each row (Dates).
This is done easily in Pyramid by defining a Formulate calculation in the Measures dimension, taking the Value data point and dividing it by the Value data point for Period 1:
Using our % Drop Off measure on the grid:
Now for the more challenging part, calculating the Average % Drop Off. If it was just a matter of calculating the average for all the values, then again, this would be simple. But from your spreadsheet example, the average is calculated by averaging all the values for % Drop Off for each Date in each column except the last value for Date.
We need then to construct a list of Dates for each column that excludes the last Date in the Column for which there is a value.
Fortunately, there is a PQL function that will do this for us, ClosingElement(). This will give us the last member of Dates for any given Period thus:
Note that the Member for Period is defined as the CurrentMember, rather than explicitly specifying a member. At run time, CurrentMember will apply this calculation to each Period Member in turn that is used in the query.
But we don't want the last element, we want the last but one, or in other words the one before the last member. The PQL Lag() function will gives this:
This will gives us the end point of the range of dates to use for averaging. We could use FirstElement() to get the first value, or explicitly choose the first member in our Dates list. We also need the Range() function to define the list of Dates, using the start and end points above:
Finally, we need to Aggregate the results, using the % Drop Off as the measure:
The Aggregate() function will perform the calculation based on how the % Drop Off Measures was defined, by adding up the numerators and then the divisors and dividing one by the other to produce the correct result.
A couple of things to note here:
- The calculation is placed into the Date dimension, allowing us to place it on the Rows under the grid.
- I've set the formatting to 0%
- I have set the solve order to an arbitrary amount that is higher than the % Drop Off solve order. This ensures that the Aggregate % Drop Off is calculated after the % Drop Off. It may not be needed, but it is good practice to do so. It also ensures the formatting is applied correctly.
Finally we can place our calculated Aggregate (Average) % Drop Off on our grid:
It's been a bit of journey to get to where you want to be, but I hope you can see from this worked example that complex calculations can be achieved in Pyramid by taking a step by step approach. Of course, now that the calculation is done, we can take whatever view of this data output we wish by slicing across other dimensions like category, geography etc.
As in the previous posting above, these calculations can also be parameterised to make them even more flexible and useful for users.
I'll post another reply shortly addressing the calculations in your other triangle example in the spreadsheet you supplied.
Hope that helps.
So got around to this sooner than expected!
Here's your Excel example, my understanding is you need to calculate the line in red. And there are two intermediary calculations in the two lines above it:
Rows 15 and 16 are the totals of row two to the last but one value and row 2 to the last but two values for each column.
We can build on the calculation in my previous post to accomplish this. In fact, the first is identical, the second also the same except the lag factor will be 2 not 1. Of course we will also use the Value directly, not the % Drop Off.
Here's the first calculation for row 15:
And the second:
The only difference is the Lag value, 1 for the first and 2 for the second.
Putting them on the grid:
The % Calculation in red is next. If the Period is Period 1 then divide Aggregate Range -1 by itself, otherwise for all other Periods, divide Aggregate Range -2 by Aggregate Range -1 from the previous column.
As something divided by itself is always 1, we can just put 1 into the logical IF statement when calculating for Period 1.
Otherwise, divide the Aggregate Range -1 for the Current Period by the Aggregate Range -2 from the Previous Period.
Set Formatting to 0.0% and we're done. Putting this on the grid we get:
Hope that all helps!
This is all great stuff! I do have an issue though. I am getting values for development periods that are in the future. Here's an example. I checked my formula and it exactly matches yours. We are direct querying a star schema where accident year is a dimension and development period is a dimension and the values com from the fact table. Do you know if there is something special I need to do to get this to work?
I can only surmise there is underlying data present for some reason.
If you create a report with Date of Loss Year and Development Period both on rows and use the raw data measure that you are applying the cumulative calculation to, does it show any value for 2021 / dev period 3?
I predict that it will show 225.97.
If it should be null, as we haven't yet reached dev period 3 for 2021, then there is something wrong with your underlying data.
Thank you for the response. If I just put in the measure, without the formula, this is what I get. Which looks correct. There are no values in those other periods.
I worked with Jason Picker and we figured out how to do it. We added an if statement so that it would not calculate a value if measure was null. Here's a screen shot.
I still think there is something not right with the data though, or maybe just that it is represented in a different way than my example, as I did not need that kind of logic to suppress values.
Our model is direct querying Snowflake so maybe that's part of why I needed to do it.