license: apache-2.0
Classification with Neural Decision Forests
This is an example notebook for the Keras sprint prepared by Hugging Face. The Keras sprint aims to reproduce Keras examples and build interactive demos for them. The markdown parts beginning with 🤗 and the code snippets that follow them were added by the Hugging Face team to show how to host your model and build a demo.
Original Author of the Neural Decision Forests Example: Khalid Salama
Introduction
This example provides an implementation of the Deep Neural Decision Forest model introduced by P. Kontschieder et al. for structured data classification. It demonstrates how to build a stochastic and differentiable decision tree model, train it end-to-end, and unify decision trees with deep representation learning.
Numerical Features | Categorical Features |
---|---|
age | workclass |
education-num | education |
capital-gain | marital-status |
capital-loss | occupation |
hours-per-week | relationship |
| race |
| gender |
| native-country |
Dropped Feature: fnlwgt
Label (Target) Feature: income_bracket
The dataset comes in two parts, one for training and one for testing. The training dataset has 32,561 samples and the test dataset has 16,282 samples.
Training procedure
Prepare Data: We create an input function that reads and parses the data files and converts the features and labels into tf.data.Dataset objects for training and validation. The input is also preprocessed by mapping the target label to an integer index, and layers.StringLookup is used to prepare the categorical data.
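As a rough, framework-free illustration of this step, the sketch below parses CSV text, converts numerical fields to floats, maps the target label to an index, and groups examples into batches. It uses only the Python standard library, not tf.data or layers.StringLookup. The column subset, the label strings, the sample rows, and the helper names `parse_rows` and `batch` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical column subset and label strings, chosen for illustration only.
COLUMNS = ["age", "workclass", "hours-per-week", "income_bracket"]
NUMERIC = {"age", "hours-per-week"}
TARGET = "income_bracket"
TARGET_LABELS = ["<=50K", ">50K"]  # assumed label vocabulary

# Label -> index lookup with no mask token and no OOV slot: indices start at 0.
label_to_index = {label: i for i, label in enumerate(TARGET_LABELS)}


def parse_rows(text):
    """Yield (features, label_index) pairs from headerless CSV text."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        record = dict(zip(COLUMNS, row))
        label = label_to_index[record.pop(TARGET)]
        features = {
            name: float(value) if name in NUMERIC else value
            for name, value in record.items()
        }
        yield features, label


def batch(pairs, batch_size):
    """Group consecutive examples into lists of at most batch_size items."""
    chunk = []
    for pair in pairs:
        chunk.append(pair)
        if len(chunk) == batch_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


# Made-up rows in the assumed column order.
sample = "39, State-gov, 40, <=50K\n50, Private, 13, >50K\n38, Private, 40, <=50K\n"
batches = list(batch(parse_rows(sample), batch_size=2))
```

Here `batches` holds two batches: the first has two examples with label indices 0 and 1, and the second holds the remaining example.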
Encode Features: We encode the categorical and numerical features as follows:
- Categorical Features: Create a lookup that converts string values to integer indices. Since we are not using a mask token and do not expect any out-of-vocabulary (OOV) tokens, we set mask_token to None and num_oov_indices to 0. Then create an embedding layer with the specified dimensions.
- Numerical Features: Use each numerical feature as it is, after applying tf.expand_dims to it.
Create Model:
Deep Neural Decision Tree
A neural decision tree model has two sets of weights to learn. The first set is pi, which represents the probability distribution of the classes in the tree leaves. The second set is the weights of the routing layer decision_fn, which represents the probability of going to each leaf. The forward pass of the model works as follows:
- The model expects the input features as a single vector encoding all the features of an instance in the batch. This vector can be generated from a Convolutional Neural Network (CNN) applied to images or from dense transformations applied to structured data features.
- The model first applies a used_features_mask to randomly select a subset of input features to use.
- Then, the model computes the probabilities (mu) for the input instances to reach the tree leaves by iteratively performing a stochastic routing throughout the tree levels.
- Finally, the class probabilities at the leaves are weighted by the probabilities of reaching those leaves and summed to produce the final outputs.
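The forward pass described above can be sketched in plain Python for a single input vector. This is a simplified illustration under stated assumptions, not the notebook's Keras implementation: random values stand in for the learned weights, the tree depth is 3, decision nodes are numbered level by level from the root, and the names `tree_forward`, `node_w`, `node_b`, and `used_idx` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tree_forward(x, used_idx, node_w, node_b, pi, depth):
    """Return (leaf reach probabilities, class probabilities) for one vector x."""
    # 1) Keep only the randomly selected subset of input features.
    x_used = [x[i] for i in used_idx]
    # 2) Route level by level; mu[k] is the probability of reaching node k
    #    of the current level (the root is reached with probability 1).
    mu = [1.0]
    node = 0  # index of the first decision node on the current level
    for _ in range(depth):
        next_mu = []
        for k, reach in enumerate(mu):
            w, b = node_w[node + k], node_b[node + k]
            go_left = sigmoid(sum(wi * xi for wi, xi in zip(w, x_used)) + b)
            next_mu += [reach * go_left, reach * (1.0 - go_left)]
        node += len(mu)
        mu = next_mu
    # 3) Mix the per-leaf class distributions, weighted by leaf reach probability.
    leaf_probs = [softmax(row) for row in pi]
    num_classes = len(pi[0])
    class_probs = [
        sum(m * p[c] for m, p in zip(mu, leaf_probs)) for c in range(num_classes)
    ]
    return mu, class_probs


rng = random.Random(0)
depth, num_features, num_classes = 3, 6, 2
num_leaves = 2 ** depth
used_idx = sorted(rng.sample(range(num_features), 4))  # toy feature subset
# Random stand-ins for the learned routing weights and leaf class logits (pi).
node_w = [[rng.gauss(0, 1) for _ in used_idx] for _ in range(num_leaves - 1)]
node_b = [rng.gauss(0, 1) for _ in range(num_leaves - 1)]
pi = [[rng.gauss(0, 1) for _ in range(num_classes)] for _ in range(num_leaves)]
x = [rng.gauss(0, 1) for _ in range(num_features)]

mu, class_probs = tree_forward(x, used_idx, node_w, node_b, pi, depth)
```

Because every decision node splits its incoming probability into two parts that sum to it, the leaf probabilities in `mu` sum to 1, and so do the resulting class probabilities.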
Compile, Train and Evaluate Model:
- The loss function chosen was SparseCategoricalCrossentropy.
- The metric chosen for evaluating the model's performance was SparseCategoricalAccuracy.
- The optimizer chosen was Adam with a learning rate of 0.01.
- The batch size chosen was 265 and the model was trained for 5 epochs.
- Finally, the model was evaluated on the test dataset, reaching an accuracy of ~85% with both the decision tree model and the forest model.
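To make the loss and the metric concrete, here is a small framework-free sketch of what sparse categorical cross-entropy and sparse categorical accuracy compute when given integer labels and predicted class probabilities. The labels and probabilities are made up for illustration, and the helper functions are simplified stand-ins, not the Keras classes themselves (for example, they do no numerical clipping).

```python
import math


def sparse_categorical_crossentropy(y_true, y_prob):
    """Mean of -log(probability assigned to the true class)."""
    losses = [-math.log(probs[label]) for label, probs in zip(y_true, y_prob)]
    return sum(losses) / len(losses)


def sparse_categorical_accuracy(y_true, y_prob):
    """Fraction of examples whose highest-probability class equals the label."""
    hits = [
        max(range(len(probs)), key=probs.__getitem__) == label
        for label, probs in zip(y_true, y_prob)
    ]
    return sum(hits) / len(hits)


# Made-up integer labels and predicted class probabilities for four examples.
y_true = [0, 1, 0, 1]
y_prob = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]]

loss = sparse_categorical_crossentropy(y_true, y_prob)
accuracy = sparse_categorical_accuracy(y_true, y_prob)
```

In this toy batch the accuracy is 0.75, because only the third example has its highest probability on the wrong class. That same example contributes the largest term to the loss.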
Training hyperparameters
The following hyperparameters were used during training:
Hyperparameters | Value |
---|---|
optimizer | Adam |
learning-rate | 0.01 |
batch-size | 265 |
num-epochs | 5 |
num-trees | 10 |
depth | 10 |
used-features-rate | 1.0 |
num-classes | 2 |
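As a quick sanity check on these values, the arithmetic below derives a few quantities they imply. It assumes that the final partial batch of each epoch is kept, that each tree is a complete binary tree with 2**depth leaves, and that each leaf stores one pi value per class. These are assumptions for illustration and are not stated in this card.

```python
import math

# Values copied from the hyperparameter table and the dataset description.
batch_size, num_epochs = 265, 5
num_trees, depth, num_classes = 10, 10, 2
train_samples = 32561

# Assumption: the last, smaller batch of each epoch is not dropped.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

# Assumption: each tree is a complete binary tree with 2**depth leaves,
# and each leaf stores one value per class (the pi weights).
leaves_per_tree = 2 ** depth
pi_weights_per_tree = leaves_per_tree * num_classes
pi_weights_in_forest = pi_weights_per_tree * num_trees
```

Under these assumptions, training takes 123 steps per epoch (615 in total), each tree has 1,024 leaves, and the forest holds 20,480 pi weights.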
Credits:
- HF Contribution: Tarun R Jain