Decoding deep learning

Read an article in Quanta Magazine (A new approach to understanding how machines think) about what Google’s team has been doing to interpret deep learning models. They have created TCAV (Testing with Concept Activation Vectors), a software tool that can be used to interrogate deep learning models to determine how sensitive they are to features of interest.

Essentially, TCAV provides a way to exercise a deep learning model using a select set of test data (that isolates a feature of interest) and to determine how sensitive the deep learning model is to that feature.

What TCAV can do for DL models

For example, in my experiments with deep learning, I’ve trained a model to predict the popularity of a (RayOnStorage) blog post based on its title. But I was also intending to do the same based on content attributes such as text length, heading count, image count, link count, etc. In the end, I was hoping to come up with some idea of the popularity of a post based on these attributes. But what I really wanted to know was how each of these parameters (or features) impacted blog post popularity, both predicted and actual.
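To make those content attributes concrete, here is a minimal sketch of how they might be pulled out of a post’s HTML using only the Python standard library. The names `PostStats` and `post_attributes` are my own, made up for illustration, and text length here is just a rough count of visible characters.

```python
from html.parser import HTMLParser


class PostStats(HTMLParser):
    """Counts headings, images and links, and collects visible text."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.heading_count = 0
        self.image_count = 0
        self.link_count = 0
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag in self.HEADINGS:
            self.heading_count += 1
        elif tag == "img":
            self.image_count += 1
        elif tag == "a":
            self.link_count += 1

    def handle_data(self, data):
        self.text_parts.append(data)


def post_attributes(html):
    """Return the content attributes for one post's HTML body."""
    stats = PostStats()
    stats.feed(html)
    stats.close()
    # Collapse runs of whitespace so the length is not inflated by markup layout.
    text = " ".join(" ".join(stats.text_parts).split())
    return {
        "text_length": len(text),
        "heading_count": stats.heading_count,
        "image_count": stats.image_count,
        "link_count": stats.link_count,
    }


sample = '<h2>Title</h2><p>Hello <a href="x">world</a></p><img src="a.png"><img src="b.png"/>'
attrs = post_attributes(sample)
print(attrs)
```

Each post then becomes a small dictionary of numbers that could sit alongside the title as input to the popularity model.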

With TCAV, you essentially select examples, such as posts that show the parameter of interest (e.g. blog posts with a high number of images). Once you have your example set, you use TCAV to feed the samples into the model, and it generates a number between 0 and 1 that tells you how sensitive the model is to the feature captured by the example set.

So in the example from my blog above, it might show that the blog popularity prediction DL model has a 0.2 sensitivity to the number of images in a post. In the example shown in the graphic, the base model classifies images and TCAV is used to determine how important stripes are to deciding that an image has a zebra in it.
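Picking the example set for my blog case might look something like the sketch below. The post records, the threshold of 5 images and the helper name `split_concept_and_random` are all hypothetical, just to show the idea of one set that has the feature and a random set drawn from posts that don’t.

```python
import random

# Hypothetical post records; the titles and counts are made up for illustration.
posts = [
    {"title": "Post A", "image_count": 0},
    {"title": "Post B", "image_count": 7},
    {"title": "Post C", "image_count": 1},
    {"title": "Post D", "image_count": 9},
    {"title": "Post E", "image_count": 2},
    {"title": "Post F", "image_count": 6},
    {"title": "Post G", "image_count": 3},
]


def split_concept_and_random(posts, feature, threshold, n_random, seed=0):
    """Return (concept_examples, random_examples) for one feature of interest."""
    # Concept examples: posts that clearly show the feature.
    concept = [p for p in posts if p[feature] >= threshold]
    # Random examples: a sample drawn from the posts that do not show it.
    others = [p for p in posts if p[feature] < threshold]
    rng = random.Random(seed)
    random_set = rng.sample(others, min(n_random, len(others)))
    return concept, random_set


concept_posts, random_posts = split_concept_and_random(
    posts, feature="image_count", threshold=5, n_random=2
)
print([p["title"] for p in concept_posts])
```

How you draw the line for “a high number of images” is a judgment call; a different threshold gives a different example set, and so potentially a different sensitivity number.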

How TCAV actually works

The use of TCAV is a bit technical, but essentially it feeds the example data set into the model, along with a random set of data without the feature of interest, and isolates the deltas in the model’s (neural net node) activations between the random data and the example data.

TCAV uses one machine learning model to interrogate another about its sensitivity to a characteristic feature vs. a random feature set. The paper goes into much more detail than this if you’re interested, but you train this new (linear) model to tell the old model’s activations for the example data apart from its activations for the random data. The direction that separates the two is the concept activation vector, and TCAV checks how the old model’s output changes when its activations are nudged in that direction. In the end, TCAV comes up with a single number representing that sensitivity.
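To make the steps above a bit more concrete, here is a rough NumPy sketch of the idea as I understand it from the paper. It is not the TCAV library’s API. The activations are synthetic, the “rest of the network” is a toy function `v . tanh(U a)` whose gradient can be written by hand, and the names `train_cav`, `head_gradient` and `tcav_score` are my own. A real run would pull activations and gradients from an actual layer of a trained model, and would repeat the test against several random sets.

```python
import numpy as np


def train_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Fit a logistic-regression classifier; return its unit-length weight vector."""
    x = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(concept)
        w -= lr * (x.T @ (p - y)) / len(y)      # gradient step on the log loss
        b -= lr * np.mean(p - y)
    # The normal to the decision boundary is the concept activation vector.
    return w / np.linalg.norm(w)


def head_gradient(acts, U, v):
    """Gradient of the toy output v . tanh(U a) with respect to the activations a."""
    hidden = np.tanh(acts @ U.T)           # shape (n, hidden_units)
    return ((1.0 - hidden ** 2) * v) @ U   # shape (n, activation_dim)


def tcav_score(gradients, cav):
    """Fraction of inputs whose output increases when nudged along the CAV."""
    return float(np.mean(gradients @ cav > 0))


rng = np.random.default_rng(0)
dim = 8
random_acts = rng.normal(size=(50, dim))
concept_acts = rng.normal(size=(50, dim))
concept_acts[:, 0] += 3.0  # assumption: the concept shifts activation 0

cav = train_cav(concept_acts, random_acts)

# Toy stand-in for the layers between the chosen activations and the output.
U = rng.normal(size=(4, dim))
v = rng.normal(size=4)
class_acts = rng.normal(size=(200, dim))
score = tcav_score(head_gradient(class_acts, U, v), cav)
print(f"TCAV score: {score:.2f}")
```

The score is simply the share of inputs for which moving the activations toward the concept pushes the output up, which is why it lands between 0 and 1.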


TCAV is available as an open source tool (see GitHub TCAV project page) and works with Google’s TensorFlow framework. TCAV was originally developed to work with image classification models but can work with other models as well.

If you’re running TensorFlow already, adding TCAV appears easy enough (check out the readme page for the project for more info). On the TCAV project page, there’s a Jupyter notebook (Run TCAV in the GitHub directory) available that explains it in more detail.

Can’t wait to try it out on my blog popularity prediction model.


Photo Credit(s): From Neural Networks, Multiple Outputs from caesar harda (Flickr)

From the Testing with Concept Activation Vectors paper

From The face of a robot with human-like features, Penn State