ArviZ in depth: plot_ppc#

Overview#

I might write a post some day on posterior predictive checking in general, but today is not that day. Today we will dive into the plot_ppc function of ArviZ and everything you can do with it. There are many possibilities in arviz.plot_ppc() alone, but there are infinitely more in the world of posterior predictive checking in general. The same applies to prior predictive checking too.

The basic idea of posterior predictive checking is that whenever we have fitted a generative model, we can simulate observations according to our model. Thus, if our model is believable, these simulated observations should be similar to our actual observations. We don’t call them “simulated observations” though; what we are doing is technically using our model to predict at the same places where we had observations. If our posterior distribution describes the joint probability in our parameter space, the posterior predictive distribution is its equivalent in our observation space.
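As a minimal sketch of the idea, suppose we had fitted a normal model: given posterior draws of its mean and standard deviation (stand-in values below, since no real fit is involved), we would simulate one dataset per draw, each with the same shape as the observations:

import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(1.0, 2.0, size=100)  # the actual observations

# stand-ins for 1000 posterior draws of the model parameters
mu_post = rng.normal(1.0, 0.2, size=1000)
sigma_post = np.abs(rng.normal(2.0, 0.2, size=1000))

# posterior predictive: one simulated dataset per posterior draw,
# each with the same shape as the observations
y_rep = rng.normal(mu_post[:, None], sigma_post[:, None], size=(1000, len(y)))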

This same idea applies to prior sampling. We can also check the prior predictive distribution and see if at that stage these simulated observations are believable. If our observations can only be positive and the prior predictive assigns a lot of probability to negative values we might want to constrain our priors better to fit our knowledge. If our observations can be both positive and negative but the prior predictive only simulates positive observations, then we already know, before seeing any actual data, that this model won’t work.
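For example, a quick prior predictive sanity check for positive-only data could look like this (a minimal sketch with a made-up prior and model):

import numpy as np

rng = np.random.default_rng(0)
mu_prior = rng.normal(0, 10, size=2000)   # made-up wide prior on the mean
y_prior_pred = rng.normal(mu_prior, 1.0)  # one simulated observation per draw

# if our data can only be positive, a large fraction here is a red flag
print(f"{(y_prior_pred < 0).mean():.0%} of prior predictive samples are negative")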

plot_ppc helps with checking the marginals of the prior or posterior predictive distribution. It has multiple views available, and how we marginalize can be customized, but it is only for checking 1D marginal distributions.

The interpretation at the good/bad level is the same regardless of the view we use: if we are unable to distinguish which line/drawing corresponds to the observations without relying on its styling (color, linestyle...), then the model is generating believable observations. However, diagnosing why the model is not fitting the observations, or what could be improved, depends on the view.

How plot_ppc works under the hood#

Once we have a fitted model, we can generate posterior predictive samples with an extra forward-sampling step. plot_ppc then generates a visualization of our observed data and loops over the generated posterior predictive samples (or, more commonly, a subset of them). For each of these posterior predictive samples (we’ll generally have as many as posterior samples) it generates the same visualization and overlays it on the initial plot of the observed data. Optionally, it can also take all the samples at once and plot an extra aggregate line.
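Schematically, for kind="kde" the process looks something like the following (a rough illustrative sketch, not ArviZ’s actual implementation):

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

idata = az.load_arviz_data("radon")
observed = idata.observed_data["y"].values
# stack chain and draw into a single sample dimension
pp = idata.posterior_predictive["y"].stack(sample=("chain", "draw"))

grid = np.linspace(observed.min() - 1, observed.max() + 1, 200)
_, ax = plt.subplots()

# one KDE per posterior predictive sample (here a random subset of 50)
rng = np.random.default_rng(0)
for i in rng.choice(pp.sizes["sample"], size=50, replace=False):
    ax.plot(grid, gaussian_kde(pp.isel(sample=i).values)(grid), color="C0", alpha=0.1)

# aggregate line: all posterior predictive samples pooled together
# (thinned here only to keep the KDE computation fast)
ax.plot(grid, gaussian_kde(pp.values.ravel()[::20])(grid), color="C1", ls="--")
# and the observed data on top
ax.plot(grid, gaussian_kde(observed)(grid), color="k");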

As can be seen in the data below, each sample (unique chain+draw combination) in the posterior_predictive group has the same shape as the observed data, which is why we can directly compare each of our posterior predictive samples to our observations. In fact, if our model were the real data generating process and our fit had converged, our observations would be one more realization of that data generating process. The same can’t be said about the aggregate line though: it combines information from all draws, so it has much less variability; it is similar (yet not equivalent) to comparing samples from a distribution to its mean.

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

xr.set_options(display_expand_attrs=False)

radon = az.load_arviz_data("radon")
rugby = az.load_arviz_data("rugby")
schools = az.load_arviz_data("non_centered_eight")
radon.posterior_predictive
<xarray.Dataset>
Dimensions:  (chain: 4, draw: 500, obs_id: 919)
Coordinates:
  * chain    (chain) int64 0 1 2 3
  * draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 492 493 494 495 496 497 498 499
  * obs_id   (obs_id) int64 0 1 2 3 4 5 6 7 ... 911 912 913 914 915 916 917 918
Data variables:
    y        (chain, draw, obs_id) float64 ...
Attributes: (4)
radon.observed_data
<xarray.Dataset>
Dimensions:  (obs_id: 919)
Coordinates:
  * obs_id   (obs_id) int64 0 1 2 3 4 5 6 7 ... 911 912 913 914 915 916 917 918
Data variables:
    y        (obs_id) float64 ...
Attributes: (4)
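As a quick sanity check of this shape correspondence, we can compare a single posterior predictive sample with the observations:

# each (chain, draw) sample has the same shape as the observed data
pp_sample = radon.posterior_predictive["y"].sel(chain=0, draw=0)
assert pp_sample.shape == radon.observed_data["y"].shape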

kind#

The kind argument is the one that chooses which view to use. There are three options:

  • “kde”: Using “kde” as kind sets the view to the probability density/mass function. Thus, depending on the type of data, KDEs or histograms will be used. There is currently no way to have histograms for float variables when using plot_ppc.

  • “cumulative”: Using “cumulative” sets the view to the empirical cumulative distribution function (ECDF).

  • “scatter”: Using “scatter” sets the view to stacked rug plots, with each rugplot at a different y value.

When we have a lot of observations, the “scatter” kind is not a good choice; conversely, when we have very few data points, the “kde” (and to a lesser extent the “cumulative”) kind is the bad choice. Here we will use the “kde” and “cumulative” kinds to generate ppc plots for all observations, and the “scatter” one to generate ppc plots for the first and the first 5 observations:

_, axes = plt.subplots(2, 2, figsize=(12, 5), layout="constrained")

az.plot_ppc(radon, kind="kde", ax=axes[0, 0], legend=False)
az.plot_ppc(radon, kind="cumulative", ax=axes[0, 1], legend=True)
az.plot_ppc(radon, kind="scatter", coords={"obs_id": [0]}, ax=axes[1, 0], legend=False)
az.plot_ppc(radon, kind="scatter", coords={"obs_id": [0, 1, 2, 3, 4]}, ax=axes[1, 1], legend=False)
axes[0, 1].get_legend().set(bbox_to_anchor=(1.05, 1), loc='upper left');

The figure above shows 4 different posterior predictive checks on the radon dataset. Its observations are the logarithm of the radon concentration in the air, so they can take both positive and negative values. The first two panels show the checks for all the data, one with a KDE, the other using ECDF plots; everything else is the same:

  • The black line (“Observed” in the legend) is the one that corresponds to the observations

  • The blue lines (“Posterior predictive” in the legend) are the ones that correspond to each specific posterior predictive sample

  • The dashed orange line (“Posterior predictive mean” in the legend) is the one that corresponds to the aggregate over all available posterior predictive samples.

Especially on the KDE, but also a bit in the ECDF, we see that the black line is very similar to all the blue lines; it’s even really close to the orange one. But there are a couple of places where the black line is among the most extreme blue lines. We will look deeper into that in the coming sections, using other plot_ppc features.

The last two panels use the “scatter” kind, and show 6 rug plots stacked along the y axis, together with the dashed orange line. Even when we are only plotting the obs_id=0 subset, we still have 2000 posterior predictive samples (4 chains × 500 draws) that correspond to it, so we can always generate a KDE/ECDF for this aggregate quantity.

Tip

We have called plot_ppc multiple times, and used the legend argument to generate only one legend. Moreover, we have moved the legend outside the plot to prevent it from covering any of the actual plotted data. When faceting, plot_ppc generates only one legend per figure, but it might still be better to move it outside the plot. There will be an example of that in the last section of the blog post.

If we use plot_ppc on discrete data we get histograms. And even if they are arguably uglier, everything remains largely the same.

az.plot_ppc(rugby);

mean#

The next argument we’ll dive into is mean, which takes a boolean. It indicates whether this “mean” line, shown above in dashed orange, should be added to the plot.

az.plot_ppc(radon, mean=False);

What even is this line though? This KDE/histogram aggregate, generated using all posterior predictive samples at once, mostly gives a more global and less noisy view of the model predictions. I emphasized global because sometimes predictions can be quite different between draws, in which case this aggregate will be misleading, because there isn’t really any global behaviour to capture.

Here is a toy example of such behaviour. I generate fake posterior predictive values and observations from a standard normal. However, the posterior predictive samples have a caveat: in the first chain, the standard normal is truncated at 0 and only the left tail is kept, whereas in the second chain we still truncate at 0 but keep the right tail only. This model is a disaster; there isn’t a single prediction with both positive and negative values, and we see how the black line is very different from all the blue ones. When we generate the aggregate, however, we do get roughly a standard normal, with only a slight lack of values around 0, so the orange dashed line has nothing to do with the blue ones and is even closer to the black one.

from scipy.stats import norm, truncnorm

# first chain keeps only the left tail, second chain only the right tail
pp = np.stack((
    truncnorm.rvs(-np.inf, 0, size=(500, 157)),
    truncnorm.rvs(0, np.inf, size=(500, 157)),
))
observed = norm.rvs(size=157)
idata = az.from_dict(observed_data={"obs": observed}, posterior_predictive={"obs": pp})
az.plot_ppc(idata);

observed and observed_rug#

Next up are the observed and observed_rug arguments. observed is a boolean flag indicating whether the observations should be plotted or not, which can be handy when using plot_ppc for prior predictive checks, as we may not have observations yet. The observed_rug argument adds a rug plot of the observations below the standard plot_ppc; it therefore can’t be used with “scatter”, as it would be redundant with the rug plot already at y=0. This can be a great complement to the plot when there aren’t a lot of observations, when the KDE can be hard to interpret or even a bad choice.
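For prior predictive checks, the call could look something like this (a sketch assuming an InferenceData object with a prior_predictive group, here named idata_prior):

# prior predictive check: plot the prior predictive samples only,
# without the (possibly not yet existing) observations
az.plot_ppc(idata_prior, group="prior", observed=False);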

Do keep in mind, however, that exactly the same processing happens for all posterior predictive samples and for the observations, so even if the probability distribution is a bad representation or hard to read, they are still comparable.

az.plot_ppc(schools, observed_rug=True);

data_pairs#

In some PPLs variables can’t repeat their names, so we need to use data_pairs to indicate how to pair the variables in the observed_data group with the ones in posterior_predictive. We’ll rename the variable in our posterior predictive group to show an example of data_pairs usage:

schools.posterior_predictive = schools.posterior_predictive.rename(obs="obs_hat")
# after this doing az.plot_ppc(schools) directly would raise an error
az.plot_ppc(schools, data_pairs={"obs": "obs_hat"});

The plot is the same, as we have only renamed a variable, but the label is different now: both the variable name in the observed data group and the one in the posterior predictive group are shown.

flatten#

flatten and flatten_pp are among the main reasons for plot_ppc’s flexibility. We have seen so far how all observations within the same variable are flattened/marginalized together. We can, however, change that behaviour with the flatten and flatten_pp arguments. If we use an empty list, no dimension will be flattened, so in the basic case we get one subplot per observation (we plot only the first 5 to keep the figure smallish). We now need the “scatter” kind because there is only a single observation per subplot:

az.plot_ppc(radon, kind="scatter", coords={"obs_id": [0, 1, 2, 3, 4]}, flatten=[]);

This is much more interesting, however, if our coordinate values are repeated! So far the obs_id dimension is indexed by a unique integer key, but each observation is also associated with one of 85 counties. So we can set the counties as coordinate values:

obs_county = radon.posterior["County"][radon.constant_data["county_idx"]]
radon = radon.assign_coords(obs_id=obs_county, groups="observed_vars")

radon
arviz.InferenceData
    • <xarray.Dataset>
      Dimensions:    (chain: 4, draw: 500, g_coef: 2, County: 85)
      Coordinates:
        * chain      (chain) int64 0 1 2 3
        * draw       (draw) int64 0 1 2 3 4 5 6 7 ... 492 493 494 495 496 497 498 499
        * g_coef     (g_coef) object 'intercept' 'slope'
        * County     (County) object 'AITKIN' 'ANOKA' ... 'WRIGHT' 'YELLOW MEDICINE'
      Data variables:
          g          (chain, draw, g_coef) float64 ...
          za_county  (chain, draw, County) float64 ...
          b          (chain, draw) float64 ...
          sigma_a    (chain, draw) float64 ...
          a          (chain, draw, County) float64 ...
          a_county   (chain, draw, County) float64 ...
          sigma      (chain, draw) float64 ...
      Attributes: (6)

    • <xarray.Dataset>
      Dimensions:  (chain: 4, draw: 500, obs_id: 919)
      Coordinates:
        * chain    (chain) int64 0 1 2 3
        * draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 492 493 494 495 496 497 498 499
        * obs_id   (obs_id) object 'AITKIN' 'AITKIN' ... 'YELLOW MEDICINE'
          County   (obs_id) object 'AITKIN' 'AITKIN' ... 'YELLOW MEDICINE'
      Data variables:
          y        (chain, draw, obs_id) float64 ...
      Attributes: (4)

    • <xarray.Dataset>
      Dimensions:  (chain: 4, draw: 500, obs_id: 919)
      Coordinates:
        * chain    (chain) int64 0 1 2 3
        * draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 492 493 494 495 496 497 498 499
        * obs_id   (obs_id) int64 0 1 2 3 4 5 6 7 ... 911 912 913 914 915 916 917 918
      Data variables:
          y        (chain, draw, obs_id) float64 ...
      Attributes: (4)

    • <xarray.Dataset>
      Dimensions:           (chain: 4, draw: 500)
      Coordinates:
        * chain             (chain) int64 0 1 2 3
        * draw              (draw) int64 0 1 2 3 4 5 6 ... 493 494 495 496 497 498 499
      Data variables:
          step_size_bar     (chain, draw) float64 ...
          diverging         (chain, draw) bool ...
          energy            (chain, draw) float64 ...
          tree_size         (chain, draw) float64 ...
          mean_tree_accept  (chain, draw) float64 ...
          step_size         (chain, draw) float64 ...
          depth             (chain, draw) int64 ...
          energy_error      (chain, draw) float64 ...
          lp                (chain, draw) float64 ...
          max_energy_error  (chain, draw) float64 ...
      Attributes: (6)

    • <xarray.Dataset>
      Dimensions:        (chain: 1, draw: 500, County: 85, g_coef: 2)
      Coordinates:
        * chain          (chain) int64 0
        * draw           (draw) int64 0 1 2 3 4 5 6 7 ... 493 494 495 496 497 498 499
        * County         (County) object 'AITKIN' 'ANOKA' ... 'YELLOW MEDICINE'
        * g_coef         (g_coef) object 'intercept' 'slope'
      Data variables:
          a_county       (chain, draw, County) float64 ...
          sigma_log__    (chain, draw) float64 ...
          sigma_a        (chain, draw) float64 ...
          a              (chain, draw, County) float64 ...
          b              (chain, draw) float64 ...
          za_county      (chain, draw, County) float64 ...
          sigma          (chain, draw) float64 ...
          g              (chain, draw, g_coef) float64 ...
          sigma_a_log__  (chain, draw) float64 ...
      Attributes: (4)

    • <xarray.Dataset>
      Dimensions:  (chain: 1, draw: 500, obs_id: 919)
      Coordinates:
        * chain    (chain) int64 0
        * draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 492 493 494 495 496 497 498 499
        * obs_id   (obs_id) object 'AITKIN' 'AITKIN' ... 'YELLOW MEDICINE'
          County   (obs_id) object 'AITKIN' 'AITKIN' ... 'YELLOW MEDICINE'
      Data variables:
          y        (chain, draw, obs_id) float64 ...
      Attributes: (4)

    • <xarray.Dataset>
      Dimensions:  (obs_id: 919)
      Coordinates:
        * obs_id   (obs_id) object 'AITKIN' 'AITKIN' ... 'YELLOW MEDICINE'
          County   (obs_id) object 'AITKIN' 'AITKIN' ... 'YELLOW MEDICINE'
      Data variables:
          y        (obs_id) float64 0.8329 0.8329 1.099 0.09531 ... 1.629 1.335 1.099
      Attributes: (4)

    • <xarray.Dataset>
      Dimensions:     (obs_id: 919, County: 85)
      Coordinates:
        * obs_id      (obs_id) int64 0 1 2 3 4 5 6 7 ... 912 913 914 915 916 917 918
        * County      (County) object 'AITKIN' 'ANOKA' ... 'WRIGHT' 'YELLOW MEDICINE'
      Data variables:
          floor_idx   (obs_id) int32 ...
          county_idx  (obs_id) int32 0 0 0 0 1 1 1 1 1 ... 83 83 83 83 83 83 83 84 84
          uranium     (County) float64 ...
      Attributes: (4)

Now we can use counties to define the subset that should be plotted, and ArviZ will group observations from the same county into the same subplot. We’ll also activate the observed rug to see how many observations there are in each group:

az.plot_ppc(radon, coords={"obs_id": ["ANOKA", "BELTRAMI", "HUBBARD"]}, flatten=[], observed_rug=True);

flatten is also critical for higher dimensional samples. The rugby model, for example, currently has 2 variables, home_points and away_points, so the default is to flatten everything but generate one plot per variable. We could instead have a single variable with dimensions chain, draw, match, field:

rugby.observed_data["points"] = xr.concat(
    (rugby.observed_data.home_points, rugby.observed_data.away_points),
    "field"
).assign_coords(field=["home", "away"])
rugby.posterior_predictive["points"] = xr.concat(
    (rugby.posterior_predictive.home_points, rugby.posterior_predictive.away_points),
    "field"
).assign_coords(field=["home", "away"])#.transpose("chain", "draw", ...)
rugby
arviz.InferenceData
    • <xarray.Dataset>
      Dimensions:    (chain: 4, draw: 500, team: 6)
      Coordinates:
        * chain      (chain) int64 0 1 2 3
        * draw       (draw) int64 0 1 2 3 4 5 6 7 ... 492 493 494 495 496 497 498 499
        * team       (team) object 'Wales' 'France' 'Ireland' ... 'Italy' 'England'
      Data variables:
          home       (chain, draw) float64 ...
          intercept  (chain, draw) float64 ...
          atts_star  (chain, draw, team) float64 ...
          defs_star  (chain, draw, team) float64 ...
          sd_att     (chain, draw) float64 ...
          sd_def     (chain, draw) float64 ...
          atts       (chain, draw, team) float64 ...
          defs       (chain, draw, team) float64 ...
      Attributes: (3)

    • <xarray.Dataset>
      Dimensions:      (chain: 4, draw: 500, match: 60, field: 2)
      Coordinates:
        * chain        (chain) int64 0 1 2 3
        * draw         (draw) int64 0 1 2 3 4 5 6 7 ... 493 494 495 496 497 498 499
        * match        (match) object 'Wales Italy' ... 'Ireland England'
        * field        (field) <U4 'home' 'away'
      Data variables:
          home_points  (chain, draw, match) int64 ...
          away_points  (chain, draw, match) int64 ...
          points       (field, chain, draw, match) int64 43 9 27 24 12 ... 9 18 28 12
      Attributes: (3)

    • <xarray.Dataset>
      Dimensions:           (chain: 4, draw: 500)
      Coordinates:
        * chain             (chain) int64 0 1 2 3
        * draw              (draw) int64 0 1 2 3 4 5 6 ... 493 494 495 496 497 498 499
      Data variables:
          energy_error      (chain, draw) float64 ...
          energy            (chain, draw) float64 ...
          tree_size         (chain, draw) float64 ...
          tune              (chain, draw) bool ...
          mean_tree_accept  (chain, draw) float64 ...
          lp                (chain, draw) float64 ...
          depth             (chain, draw) int64 ...
          max_energy_error  (chain, draw) float64 ...
          step_size         (chain, draw) float64 ...
          step_size_bar     (chain, draw) float64 ...
          diverging         (chain, draw) bool ...
      Attributes: (3)

    • <xarray.Dataset>
      Dimensions:       (chain: 1, draw: 500, team: 6, match: 60)
      Coordinates:
        * chain         (chain) int64 0
        * draw          (draw) int64 0 1 2 3 4 5 6 7 ... 493 494 495 496 497 498 499
        * team          (team) object 'Wales' 'France' 'Ireland' ... 'Italy' 'England'
        * match         (match) object 'Wales Italy' ... 'Ireland England'
      Data variables:
          sd_att_log__  (chain, draw) float64 ...
          intercept     (chain, draw) float64 ...
          atts_star     (chain, draw, team) float64 ...
          defs_star     (chain, draw, team) float64 ...
          away_points   (chain, draw, match) int64 ...
          sd_att        (chain, draw) float64 ...
          sd_def_log__  (chain, draw) float64 ...
          home          (chain, draw) float64 ...
          atts          (chain, draw, team) float64 ...
          sd_def        (chain, draw) float64 ...
          home_points   (chain, draw, match) int64 ...
          defs          (chain, draw, team) float64 ...
      Attributes: (3)

    • <xarray.Dataset>
      Dimensions:      (match: 60, field: 2)
      Coordinates:
        * match        (match) object 'Wales Italy' ... 'Ireland England'
        * field        (field) <U4 'home' 'away'
      Data variables:
          home_points  (match) float64 23.0 26.0 28.0 26.0 0.0 ... 61.0 29.0 20.0 13.0
          away_points  (match) float64 15.0 24.0 6.0 3.0 20.0 ... 21.0 0.0 18.0 9.0
          points       (field, match) float64 23.0 26.0 28.0 26.0 ... 0.0 18.0 9.0
      Attributes: (3)

If you expand the posterior predictive or observed data groups, you’ll see there is a new variable called points. We can now plot the marginal over both match and field:

az.plot_ppc(rugby, var_names="points");

Or we can replicate the plot we generated above, where only the match dimension is flattened:

az.plot_ppc(rugby, var_names="points", flatten=["match"]);

Or instead we can flatten the field dimension and check, per match, how home and away points compare to the predictions. Are there any matches that prove “problematic to predict”? We’ll do that in the last section of the blog post.

num_pp_samples#

One possible complaint about all the plots we have seen so far might be that there are way too many lines overlaid when kind is “kde” or “cumulative”, and/or too few rug plots stacked vertically when kind is “scatter”. We can control this with num_pp_samples:

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
az.plot_ppc(radon, num_pp_samples=100, ax=axes[0], legend=False, coords={"obs_id": "ANOKA"})
az.plot_ppc(radon, num_pp_samples=10, ax=axes[1], kind="scatter", coords={"obs_id": "HUBBARD"})
axes[1].get_legend().set(bbox_to_anchor=(1.05, 1), loc='upper left');

Now, though, we are plotting only a subset, which is chosen at random. To ensure reproducibility, there is random_seed. Especially with the “scatter” kind, it can also be interesting to have multiple plot_ppc calls use the same subset:

fig, axes = plt.subplots(1, 3, figsize=(13, 3))
az.plot_ppc(radon, num_pp_samples=10, ax=axes[0], legend=False, kind="scatter", coords={"obs_id": "HUBBARD"})
az.plot_ppc(radon, num_pp_samples=10, random_seed=3, ax=axes[1], legend=False, kind="scatter", coords={"obs_id": "HUBBARD"})
az.plot_ppc(radon, num_pp_samples=10, random_seed=3, ax=axes[2], kind="scatter", coords={"obs_id": "HUBBARD"})
axes[2].get_legend().set(bbox_to_anchor=(1.05, 1), loc='upper left');

animated#

There is also the option of generating a video of the plot_ppc. Using

az.plot_ppc(radon, animated=True)

you will generate the following video:

Here is also a link to the video file in case the embedded player doesn’t work.

The animated feature hasn’t received much attention, but I think it could be interesting to maintain and develop. Something that comes to mind is having it show only a handful of frames, stopping for ~2 seconds on each frame and numbering them. Frames would be similar to the current ones, but without the mean or observed data lines: only a single line, which would generally be a posterior predictive sample, except in one of the frames where we’d use the observed data. The animation would then end by revealing the number of the frame that contained the observed data.

If you guessed right, then something is going on: either you have done so much EDA with your data that you have memorized how it looks, or, more probably (I think?), there is some aspect of the observations that isn’t captured by the model.
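This guessing game can already be approximated statically. Here is a minimal sketch of such a “lineup” check built with matplotlib on the radon data loaded above; it is just an illustration, not an ArviZ feature, and all the names in it are made up:

from scipy.stats import gaussian_kde

rng = np.random.default_rng(2023)
pp = radon.posterior_predictive["y"].stack(sample=("chain", "draw"))
observed = radon.observed_data["y"].values
grid = np.linspace(observed.min() - 1, observed.max() + 1, 200)

n_frames = 9
observed_slot = rng.integers(n_frames)  # the frame where the observed data hides
pp_idx = rng.choice(pp.sizes["sample"], size=n_frames, replace=False)

fig, axes = plt.subplots(3, 3, figsize=(9, 6), sharex=True, sharey=True)
for i, ax in enumerate(axes.ravel()):
    data = observed if i == observed_slot else pp.isel(sample=pp_idx[i]).values
    ax.plot(grid, gaussian_kde(data)(grid))
    ax.set_title(f"frame {i}")
# reveal the answer only after guessing:
# print(observed_slot)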

Practical examples#

We can now use all the arguments we have seen to dive deeper into the two main models that we have used to illustrate them.

Radon model#

See also

The original model that generated the example inference data we are using is available at the arviz_example_data repository. The model was taken from the PyMC example gallery, which has a more detailed explanation and references.

We can now also use the county coordinates to explore in more detail the potentially problematic areas in the flattened plot_ppc, for example the observations below -2.

radon.observed_data.query({"obs_id": "y < -2"}).compute()
<xarray.Dataset>
Dimensions:  (obs_id: 3)
Coordinates:
  * obs_id   (obs_id) object 'CARVER' 'COTTONWOOD' 'MCLEOD'
    County   (obs_id) object 'CARVER' 'COTTONWOOD' 'MCLEOD'
Data variables:
    y        (obs_id) float64 -2.303 -2.303 -2.303
Attributes: (4)

We see there are only 3 observations lower than -2, each corresponding to a different county. The counties we have plotted so far looked good enough. What about these ones?

az.plot_ppc(radon, kind="cumulative", mean=False, coords={'obs_id': ['CARVER', 'COTTONWOOD', 'MCLEOD']}, flatten=[], observed_rug=True);

We can see that none of the groups predicts values as extreme as -2. In fact, after seeing that I went back to the model and data, and it turns out that the radon data can’t be lower than 0.1 before the log is taken:

np.log(0.1)
-2.3025850929940455

However, this isn’t taken into account in the model, which can assign non-zero probability to values lower than that for some groups:

lower_than_bound = (
    (radon.posterior_predictive["y"].drop_vars("County") < np.log(0.1))
    .groupby("obs_id")
    .sum()
    .sum(("chain", "draw"))
)
lower_than_bound.isel(obs_id=lower_than_bound.argsort().values[-4:])
<xarray.DataArray 'y' (obs_id: 4)>
array([ 2,  3,  4, 10])
Coordinates:
  * obs_id   (obs_id) object 'ANOKA' 'BELTRAMI' 'KOOCHICHING' 'ST LOUIS'

This is quite negligible, but in my opinion still a good example of the things that can be found out with posterior predictive checks. I was also happy that, after using the radon example countless times while teaching, I still managed to discover a new fact about it.

Rugby model#

See also

The original model that generated the example inference data we are using is available at the arviz_example_data repository. The model was taken from the PyMC example gallery, which has a more detailed explanation and references.

We have 60 matches in our dataset, coming from 4 different years. Teams play each other only once per year in the Six Nations championship, so we will have 4 matches where the same two teams play each other, but only 2 repetitions of each unique match, as the field factor is relevant to our model and the home team switches every year. We can combine the arrays of home and away points to visualize “matches” instead of only the points from one of the sides, and also transform the match dimension into unique team pairings if we add a year dimension.

rugby = az.load_arviz_data("rugby")
rugby.map(
    lambda ds: ds.assign(points=xr.concat((ds.home_points, ds.away_points), "field")).assign_coords(field=["home", "away"]),
    groups="observed_vars",
    inplace=True
)

years = [2014, 2015, 2016, 2017]
all_matches = rugby.posterior_predictive.match.values.copy()
all_unique_matches = np.array([" ".join(sorted(match.split(" "))) for match in all_matches])
unique_match, unique_matches_id = np.unique(all_unique_matches, return_inverse=True)
indexes = np.argsort(unique_matches_id.reshape((4, 15)), axis=1)+15*np.arange(4)[:, None]
rugby.map(
    lambda ds: ds[["points"]].isel(match=xr.DataArray(indexes, dims=["year", "team_pair"])).assign_coords(year=years, team_pair=unique_match),
    groups="observed_vars",
    inplace=True
)
# keep track of which team played as home
home_team = np.array([m.split(" ")[0] for m in all_matches])[indexes]
rugby.observed_data["home_team"] = (("year", "team_pair"), home_team)
rugby.posterior_predictive
<xarray.Dataset>
Dimensions:    (field: 2, chain: 4, draw: 500, year: 4, team_pair: 15)
Coordinates:
  * chain      (chain) int64 0 1 2 3
  * draw       (draw) int64 0 1 2 3 4 5 6 7 ... 492 493 494 495 496 497 498 499
    match      (year, team_pair) object 'France England' ... 'Scotland Wales'
  * field      (field) <U4 'home' 'away'
  * year       (year) int64 2014 2015 2016 2017
  * team_pair  (team_pair) <U16 'England France' ... 'Scotland Wales'
Data variables:
    points     (field, chain, draw, year, team_pair) int64 9 19 14 ... 18 40 25
Attributes: (3)

It is probably better, then, to use the “scatter” kind in this case. We will also restrict the output to a 4x2 grid:

coords = {"team_pair": ["Italy Wales", "England France"]}
axes = az.plot_ppc(
    rugby.map(lambda ds: ds.transpose("team_pair", "year", ...), groups="observed_vars"),
    kind="scatter", var_names="points", flatten=["field"], coords=coords, random_seed=3, num_pp_samples=8
)
axes[0, 0].get_legend().set(bbox_to_anchor=(-0.05, 1), loc='upper right');

This model only takes into account the defensive and offensive strengths of each team plus the home field effect; there is no time component. That is, the first and third columns will show slightly different results due to the randomness of the posterior predictive sampling and the MCMC error, but they will be basically the same, as they are generated from the same parameters. And the same happens with the second and fourth columns.

In fact, when we compare one column with the next, we can see the effect of changing the home team between matches and how our model believes the outcome will change. For Italy vs Wales it doesn’t do much, because Wales is much stronger than Italy, so it is only a matter of how large the point difference is; but for England vs France it has a significant effect on the win probabilities. Let’s double-check the home team first:
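One way to do so is through the home_team variable we stored earlier:

# which team played at home, for each year and team pair
rugby.observed_data["home_team"].sel(team_pair=["Italy Wales", "England France"]).to_series()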

In 2014, when Wales and France were the home teams, their win probabilities according to this model are:

home_wins = rugby.posterior_predictive.points.sel(field="home") > rugby.posterior_predictive.points.sel(field="away")
home_wins.mean(("chain", "draw")).sel(year=2014, team_pair=["Italy Wales", "England France"]).to_series()
team_pair
Italy Wales       0.9995
England France    0.1450
Name: points, dtype: float64

On the other hand, in 2015, when Wales and France are the away teams, the model gives different win probabilities, especially for France as we saw above:

(1-home_wins).mean(("chain", "draw")).sel(year=2015, team_pair=["Italy Wales", "England France"]).to_series()
team_pair
Italy Wales       0.9995
England France    0.0285
Name: points, dtype: float64

With the data structured like this, we can also generate a ppc plot similar to the one we did above using the points variable only, but without flattening the year dimension:

az.plot_ppc(rugby, var_names="points", flatten=["field", "team_pair"]);

These differences are probably mostly related to how the home advantage changes between teams from year to year. It is not as clear here as it was with specific matches, but especially below ~30 points there seem to be alternating similarities, 2014-2016 and 2015-2017. Intuitively, the next step might be trying to incorporate the time dependency somehow, even with something as simple as a “defending champion” effect, similar to the home field one. However, for this particular model, all the ppc plots we have generated are quite sensible; the model is already performing quite well when it comes to predictions.


Package versions used to generate this post:

%load_ext watermark
%watermark -n -u -v -iv -w
Last updated: Fri Nov 17 2023

Python implementation: CPython
Python version       : 3.10.11
IPython version      : 8.16.1

arviz     : 0.17.0.dev0
numpy     : 1.24.4
matplotlib: 3.8.0
xarray    : 2023.6.0

Watermark: 2.4.3

Comments are not enabled for this post; to inquire further about its contents, ask on the PyMC or Stan Discourse forums. Both have an arviz tag to ensure questions are sorted and interested people are notified.