Beyond SHAP using BCG GAMMA FACET

TL;DR: FACET adds extra capabilities on top of SHAP that help with model explainability. To see it in action, go to my GitHub and run the Python notebook. https://github.com/chrisrichgruber/gamma-facet

Model explainability for black box models has become much easier in recent years thanks to the work of the contributors to SHAP. Please check it out if you haven’t already. https://github.com/slundberg/shap

BCG GAMMA, with their FACET package, has taken SHAP a step further to enable even greater model explainability. https://github.com/BCG-Gamma/facet

To go through FACET’s capabilities, I’ve utilized BCG GAMMA’s own examples, but with Kaggle’s Titanic dataset. We’ll be looking at how a person’s age relates to their survival on the Titanic.

Titanic Training Set

Looking at the correlation matrix, we can see a weak correlation of -0.077 between ‘Survived’ and ‘Age’, which indicates that the survival rate decreases as age increases.

Correlation Matrix for Titanic Training Set

We can see this slight correlation by comparing the survival rate for those under 40 (42%) with those 40 and above (37%).

And if we group ages into 10-year buckets, we can see the survival rate within each of those groups.
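These summary numbers are straightforward to reproduce with pandas. The frame below is a tiny illustrative stand-in for the actual `df_train` loaded from Kaggle’s train.csv, so the numbers it prints won’t match the real 42%/37% split:

```python
import pandas as pd

# illustrative stand-in for the Titanic training frame (df_train in the notebook)
df_train = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Age":      [22, 38, 26, 55, 4, 71, 30, 14],
})

# Pearson correlation between Age and Survived
corr = df_train["Age"].corr(df_train["Survived"])

# survival rate for passengers under 40 vs. 40 and above
under_40 = df_train.loc[df_train["Age"] < 40, "Survived"].mean()
over_40 = df_train.loc[df_train["Age"] >= 40, "Survived"].mean()

# survival rate within 10-year age buckets
age_decade = (df_train["Age"] // 10 * 10).astype(int)
rate_by_decade = df_train.groupby(age_decade)["Survived"].mean()
print(rate_by_decade)
```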

Now, we’re going to start using the FACET package to help us understand the relationship between age and survival. First, we’ll use FACET to rank candidate Random Forest models.

from sklearn.model_selection import RepeatedKFold
from sklearndf.pipeline import ClassifierPipelineDF
from sklearndf.classification import RandomForestClassifierDF
from facet.data import Sample
from facet.selection import LearnerGrid, LearnerRanker

# create FACET sample object
titanic_sample = Sample(observations=df_train, target_name="Survived")

# create a (trivial) pipeline for a random forest classifier
rnd_forest = ClassifierPipelineDF(
    classifier=RandomForestClassifierDF(random_state=42)
)

# define a grid of models which are "competing" against each other
rnd_forest_grid = [
    LearnerGrid(
        pipeline=rnd_forest,
        learner_parameters={"min_samples_leaf": [2, 4, 6]},
    ),
]

# create a repeated k-fold CV iterator
rkf_cv = RepeatedKFold(n_splits=5, n_repeats=10, random_state=42)

# rank the candidate models by performance (default is mean CV score - 2*SD)
ranker = LearnerRanker(
    grids=rnd_forest_grid, cv=rkf_cv, n_jobs=-1
).fit(sample=titanic_sample)

# get summary report
ranker.summary_report()

Let’s take a look at the SHAP feature importance. From this we can see the top feature is ‘Sex’. If you haven’t seen a SHAP output like this before: each point is an observation in the training set, the color encodes the ‘Feature Value’ (red for men, blue for women), and the position on the x-axis is the ‘SHAP Value’, i.e. the impact on the model’s output (in our case, further right means more likely to survive). From this we can see that Sex = 0 (woman) has a positive impact on survival. Looking at ‘Age’, we can see the lowest ages (in blue) have a positive impact on survivability, while the highest ages (in red) have a negative impact.
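FACET derives its importances from SHAP values; as an independent sanity check on a ranking like this, one can use scikit-learn’s permutation importance. The sketch below is not part of FACET or SHAP — it uses synthetic Titanic-like data (hypothetical, not the real dataset) in which ‘Sex’ drives survival strongly and ‘Age’ only weakly:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(42)
n = 500

# synthetic Titanic-like data: Sex shifts survival probability by 0.5,
# Age by at most ~0.16 over the full age range
sex = rng.integers(0, 2, n)        # 0 = woman, 1 = man
age = rng.uniform(1, 80, n)
p_survive = np.clip(0.75 - 0.5 * sex - 0.002 * age, 0.05, 0.95)
survived = rng.binomial(1, p_survive)

X = pd.DataFrame({"Sex": sex, "Age": age})
clf = RandomForestClassifier(min_samples_leaf=4, random_state=42).fit(X, survived)

# permutation importance: drop in score when each feature is shuffled
result = permutation_importance(clf, X, survived, n_repeats=10, random_state=42)
for name, imp in zip(X.columns, result.importances_mean):
    print(f"{name}: {imp:.3f}")
```

As with SHAP, ‘Sex’ should come out well ahead of ‘Age’ here, matching the ranking in the summary plot.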

To understand the impact of ‘Age’ further, let’s use some of FACET’s capabilities.

from facet.inspection import LearnerInspector

# run the model inspector on the best crossfit
clf_inspector = LearnerInspector(
    n_jobs=-1,
    verbose=True,
).fit(crossfit=ranker.best_model_crossfit_)

Below is a Synergy Heatmap that FACET describes as “The degree to which the model combines information from one feature with another to predict the target”.

import matplotlib.pyplot as plt
import seaborn as sns

# visualize synergy as a matrix
synergy_matrix = clf_inspector.feature_synergy_matrix()

# set up the matplotlib figure
f, ax = plt.subplots(figsize=(11, 9))

# generate a custom diverging colormap
cmap = sns.diverging_palette(20, 230, as_cmap=True)

# draw the heatmap with the correct aspect ratio
sns.heatmap(synergy_matrix, annot=True, vmin=0, vmax=1, center=0, cmap=cmap,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})
plt.title("Synergy Heatmap", fontsize=20)

Another feature is a Redundancy Heatmap, which FACET describes as “The degree to which a feature in a model duplicates the information of a second feature to predict the target.”

# visualize redundancy as a matrix
redundancy_matrix = clf_inspector.feature_redundancy_matrix()

# set up the matplotlib figure
f, ax = plt.subplots(figsize=(11, 9))

# generate a custom diverging colormap
cmap = sns.diverging_palette(20, 230, as_cmap=True)

# draw the heatmap with the correct aspect ratio
sns.heatmap(redundancy_matrix, annot=True, vmin=0, vmax=1, center=0, cmap=cmap,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})
plt.title("Redundancy Heatmap", fontsize=20)

Finally, FACET allows us to run a simulation for any feature and see how that feature impacts the outcome. We’ll look at how ‘Age’ relates to the rate of survival.

from facet.crossfit import LearnerCrossfit
from facet.validation import BootstrapCV
from facet.simulation import UnivariateProbabilitySimulator
from facet.data.partition import ContinuousRangePartitioner
from facet.simulation.viz import SimulationDrawer

# create a bootstrap CV crossfit for simulation using the best model
boot_crossfit = LearnerCrossfit(
    pipeline=ranker.best_model_,
    cv=BootstrapCV(n_splits=1000, random_state=42),
    n_jobs=-1,
    verbose=False,
).fit(sample=titanic_sample)

# set up and run a simulation
sim_feature = "Age"
age_simulator = UnivariateProbabilitySimulator(
    crossfit=boot_crossfit,
    n_jobs=-1,
)
age_partitions = ContinuousRangePartitioner()
age_simulation = age_simulator.simulate_feature(
    feature_name=sim_feature,
    partitioner=age_partitions,
)

# visualize the results
SimulationDrawer().draw(data=age_simulation, title=sim_feature)
plt.gcf().set_size_inches(16, 10)

From our training set, we can see that 41% of people survived, and this is shown on the graph below as a green horizontal line labeled ‘Baseline’. We can also see the median simulated survival rate at different ages, with confidence intervals. With this simulation we can clearly see that, during the Titanic disaster, ‘Age’ had a negative impact on survival.
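The idea behind FACET’s univariate simulation can be sketched in plain scikit-learn: fix the simulated feature at a single value for every observation, predict survival probabilities, and average; repeating this across a grid of ages traces out a curve like the one in the plot (FACET additionally adds bootstrap confidence intervals). The data and model below are synthetic stand-ins, not FACET’s implementation:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 500

# synthetic Titanic-like data where survival probability falls with age
sex = rng.integers(0, 2, n)
age = rng.uniform(1, 80, n)
survived = rng.binomial(1, np.clip(0.7 - 0.3 * sex - 0.004 * age, 0.05, 0.95))

X = pd.DataFrame({"Sex": sex, "Age": age})
clf = RandomForestClassifier(min_samples_leaf=4, random_state=42).fit(X, survived)

# overall survival rate: the "Baseline" line in FACET's plot
baseline = survived.mean()

# univariate simulation: set Age to each grid value for ALL rows,
# then average the predicted survival probability
for age_value in [10, 30, 50, 70]:
    X_sim = X.copy()
    X_sim["Age"] = age_value
    mean_prob = clf.predict_proba(X_sim)[:, 1].mean()
    print(f"Age={age_value}: simulated survival rate {mean_prob:.2f} "
          f"(baseline {baseline:.2f})")
```

Because the synthetic data makes survival decline with age, the simulated rate sits above the baseline at low ages and below it at high ages, mirroring the shape of FACET’s ‘Age’ simulation plot.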