Prediction Wizard - Part 4 (Use Case Results)
by David Novick, Technical Writer, Pyramid Analytics
and Bar Amit, Data Scientist, Pyramid Analytics
and Imbar Marinescu bar, Developer, Pyramid Analytics
In Part 4 of this blog series, we continue the use case that we began in Part 3. We will now see how Bob organizes his prediction results into three different grids. Note that the wizard does not automatically display prediction results in any specific grid, chart, or other visual format. Like any BI Office user, Bob has to decide how to display his results in the most meaningful manner.
Confusion Matrix for Evaluating Predictive Model
Bob first decides to view the overall results of his predictive model using the classic “Confusion Matrix”, as shown below. This small matrix summarizes the overall results, so that Bob can decide whether to use the current predictive model or create a better one. As you will recall, we cannot edit a predictive model, but we can tweak the wizard template in order to create a new predictive model.
The most important cells in this matrix are the YES-YES and NO-NO cells. Out of a total of 481 people who actually bought bikes in 2016, the model estimated that 365 would do so (YES/YES of 76%). And out of a total of 519 people who did not buy bikes in 2016, Bob's model predicted that 402 would not (NO/NO of 77%). With percentages like that, Bob feels confident that he can continue the process with the current predictive model.
In reality, it might take you a few attempts to arrive at such impressive results. The good news is that the wizard makes it easy to quickly swap your input settings and retry the wizard for better results. In fact, that’s one of the big advantages of using an automated prediction tool over a hand-driven method.
The “Confusion Matrix” helps you quickly evaluate the effectiveness of your predictive model.
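The percentages in the matrix can be checked by hand. The following is a minimal sketch, not part of the wizard: it uses only the four counts quoted above, and it derives the two off-diagonal cells by subtraction, which assumes that every person received either a YES or a NO prediction. All variable names are illustrative.

```python
# Counts quoted in the text above.
actual_yes_total = 481  # people who bought bikes in 2016
actual_no_total = 519   # people who did not buy bikes in 2016
yes_yes = 365           # actual YES, predicted YES
no_no = 402             # actual NO, predicted NO

# Off-diagonal cells, derived by subtraction (assumption: every person
# received exactly one prediction, YES or NO).
yes_no = actual_yes_total - yes_yes  # actual YES, predicted NO
no_yes = actual_no_total - no_no     # actual NO, predicted YES

# Share of each actual group that the model predicted correctly.
yes_rate = yes_yes / actual_yes_total
no_rate = no_no / actual_no_total

# Share of all 1,000 people that the model predicted correctly.
accuracy = (yes_yes + no_no) / (actual_yes_total + actual_no_total)

print(f"YES/YES rate: {yes_rate:.1%}")      # 75.9%
print(f"NO/NO rate:   {no_rate:.1%}")       # 77.5%
print(f"Overall accuracy: {accuracy:.1%}")  # 76.7%
```

The overall accuracy is not quoted in the grid, but it follows directly from the two diagonal cells and gives a single number for comparing one model attempt against the next.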
Categories in Rows
Now that Bob is confident in his model, he wants to see how significant the two selected categories are. To do this, he places the following categories in the rows:
- Home Owner
- Marital Status
In the first line of the grid, Bob sees that out of 133 single home owners who actually bought bikes in 2016, the model predicted that 112 would do so and 21 would not.
The YES/YES prediction rate for single home owners was an impressive 84% (112/133).
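The same rate can be computed for any combination of categories by grouping the rows before counting. The sketch below is an illustration only: it rebuilds one record per person from the counts quoted above (112 correct and 21 incorrect predictions among single home owners who bought bikes), and the segment label and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical label for the first line of the grid.
segment = ("Home Owner", "Single")

# One (segment, actual, predicted) record per person, rebuilt from the
# counts quoted in the text: 112 predicted YES and 21 predicted NO.
records = [(segment, "YES", "YES")] * 112 + [(segment, "YES", "NO")] * 21


def yes_yes_rate_by_segment(rows):
    """Return, per segment, the share of actual buyers predicted as buyers."""
    buyers = Counter()  # actual YES per segment
    hits = Counter()    # actual YES and predicted YES per segment
    for seg, actual, predicted in rows:
        if actual == "YES":
            buyers[seg] += 1
            if predicted == "YES":
                hits[seg] += 1
    return {seg: hits[seg] / buyers[seg] for seg in buyers}


rates = yes_yes_rate_by_segment(records)
print(f"{rates[segment]:.0%}")  # 84%
```

With records for the other segments appended to the list, the same function would return one rate per line of the grid.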
Categories & Measures
If you recall, Bob's predictive model was based on two categories (Home Owner and Marital Status) and three numeric inputs (Cars, Children, Income). To display the grid shown below, Bob has selected the following four measures in the Elements Panel:
- Count profileKey
- Avg Cars
- Avg Children
- Avg Income
When Bob looks at the stats for married home owners, he notices an interesting correlation between the number of cars owned and the number of bike purchases.
- Out of a total of 192 married home owners, 133 were predicted to purchase a new bike (69%) and 59 were predicted not to buy a new bike (31%).
- The 69% group of predicted purchasers owned an average of 0.86 cars, whereas the 31% group of predicted non-purchasers owned an average of 1.54 cars. In practical terms, this means that when dealing with married home owners, Bob should focus his efforts on the couples owning one car (or no car), and invest less of his time pursuing couples owning two or more cars.
Averages can be displayed in the grid by selecting the measures in the desired order.
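The figures for married home owners can be reproduced with a few lines of arithmetic. This is a rough sketch that uses only the counts and averages quoted above; the variable names are illustrative, and the combined average is a derived value that does not appear in the grid.

```python
# Counts and averages quoted in the text for married home owners.
married_home_owners = 192
predicted_buyers = 133
predicted_non_buyers = 59
avg_cars_buyers = 0.86
avg_cars_non_buyers = 1.54

# Share of the segment in each predicted group.
buyer_share = predicted_buyers / married_home_owners
non_buyer_share = predicted_non_buyers / married_home_owners

# Difference in average car ownership between the two groups.
car_gap = avg_cars_non_buyers - avg_cars_buyers

# Average for the whole segment, weighted by group size (derived value).
combined_avg_cars = (
    predicted_buyers * avg_cars_buyers
    + predicted_non_buyers * avg_cars_non_buyers
) / married_home_owners

print(f"Predicted buyers:     {buyer_share:.0%}")      # 69%
print(f"Predicted non-buyers: {non_buyer_share:.0%}")  # 31%
print(f"Gap in average cars:  {car_gap:.2f}")          # 0.68
print(f"Combined average:     {combined_avg_cars:.2f}")  # 1.07
```

Because the two groups cover the whole segment, their shares add up to 100%, which is a quick sanity check on any percentages read from the grid.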
This use case for bike sales has demonstrated just a small number of the many Prediction Wizard capabilities – enough to get you up and pedaling with the wizard. For additional information, search for “prediction” in the User Community and in the BI Office Help.