Please have a look at our video presentation for the WSCG2020 conference.
This is our latest research on explainable AI (XAI), which extends the Embedded Prototype Subspace Classification pipeline.
This paper introduces cascading for ensembles with early termination. The idea is to progressively exclude neurons that cannot contribute to the solution. This is possible because we know exactly which neurons are used to compute the response, or score, for each class. Whenever a class is ranked low by one classifier/net, it is excluded from further processing by the other classifiers/nets.
The rough estimation
First, however, a rough estimation is made by computing a ranking with a smaller net, which is extracted from one embedding and the kernel density estimation (KDE) of one classifier. Only the top 5 candidates are kept, and the finer net, obtained with a bandwidth that gives smaller clusters, then computes the response value for each of these classes. This means that in the very first step the whole fine net does not need to be computed, only the part belonging to the probable classes. This reduces the computational cost noticeably.
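As a minimal sketch of this step, assuming the rough net has already produced one score per class, the fine net can be restricted to the top-ranked candidates as follows. The names `top_k_classes`, `fine_scores` and `toy_fine_net`, as well as all the numbers, are hypothetical and only stand in for the real nets.

```python
from typing import Callable, Dict, List


def top_k_classes(rough_scores: Dict[str, float], k: int = 5) -> List[str]:
    """Return the k classes with the highest rough score, best first."""
    return sorted(rough_scores, key=rough_scores.get, reverse=True)[:k]


def fine_scores(candidates: List[str],
                score_class: Callable[[str], float]) -> Dict[str, float]:
    """Run the fine net only for the candidate classes."""
    return {label: score_class(label) for label in candidates}


# Hypothetical rough scores for seven classes (illustrative values only).
rough = {"a": 0.10, "b": 0.80, "c": 0.75, "d": 0.05,
         "e": 0.40, "f": 0.60, "g": 0.55}

evaluated: List[str] = []  # records which classes the fine net touched


def toy_fine_net(label: str) -> float:
    """Stand-in for the neurons that score one class in the fine net."""
    evaluated.append(label)
    return rough[label]


candidates = top_k_classes(rough, k=5)
scores = fine_scores(candidates, toy_fine_net)
print(candidates)
print(len(evaluated), "of", len(rough), "classes evaluated by the fine net")
```

In this toy run the fine net is evaluated for five of the seven classes; the saving grows with the number of classes.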
The figure to the left shows the result of using different bandwidths on the same data. In the top row three clusters are found, corresponding to the three main ways of writing the Hiragana character を. The resulting net would be fast since there are few clusters, but also less accurate than the approach shown in the bottom row, where more clusters are found, containing inter variations within each cluster. In any case, the faster net gives a good rough estimation for the finer net to work on.
The progressive exclusion also reduces the computational cost. Early termination kicks in when the top two candidates have a rather big difference in response. It is then safe to conclude that we already have the right candidate. Together, these mechanisms give a cascading ensemble pipeline that is even faster than using just one classifier/net!
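The early termination test can be sketched as a simple margin check on the accumulated scores. The threshold `margin` is an assumption made for illustration; the text above only says that the difference should be rather big, and the function name `should_terminate` is hypothetical.

```python
from typing import Dict


def should_terminate(scores: Dict[str, float], margin: float) -> bool:
    """True when the best class leads the runner-up by at least `margin`.

    `margin` is a hypothetical tuning parameter, not a value from the paper.
    """
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) < 2:
        return True  # only one candidate left, nothing more to decide
    return ranked[0] - ranked[1] >= margin


clear_winner = should_terminate({"a": 0.9, "b": 0.3, "c": 0.2}, margin=0.5)
close_call = should_terminate({"a": 0.9, "b": 0.8, "c": 0.2}, margin=0.5)
print(clear_winner, close_call)
```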
An example of the processing in the pipeline
The above image illustrates the process.
- A rough estimation is done using a small net and the classes are ranked.
- The finer net processes the neurons used to classify the top 5 classes only. Now we can safely discard the lowest ranked class before going on to process two other nets using their corresponding feature vectors. Note that the difference (d) between the top two classes was quite small, so no early termination was done.
- Two nets run their part for 4 classes only.
- The scores of the three previous finer nets are summed, and now early termination can be done because the difference between the top two candidates is large.
- If no early termination is done, the fourth net processes its neurons for the remaining two classes.
- Finally the sum of all four nets gives the final ranking.
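The steps above can be put together in a small sketch. It assumes that every net returns one score per requested class, that the scores are summed across nets, and that the margin test is run after every net. The function `cascade`, the helper `make_net`, the schedule `keep_after` and all the numbers are hypothetical and chosen only to mirror the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Net = Callable[[Sequence[str]], Dict[str, float]]


def cascade(rough_scores: Dict[str, float], nets: Sequence[Net],
            keep_after: Sequence[int], top_k: int = 5,
            margin: float = 0.5) -> Tuple[List[str], int]:
    """Return (ranking of the remaining classes, number of fine nets used)."""
    # Rough estimation: keep only the top_k classes of the small net.
    candidates = sorted(rough_scores, key=rough_scores.get, reverse=True)[:top_k]
    totals = {c: 0.0 for c in candidates}
    used = 0
    for net, keep in zip(nets, keep_after):
        partial = net(candidates)  # scores for the remaining classes only
        for c in candidates:
            totals[c] += partial[c]
        used += 1
        # Progressive exclusion: drop the lowest ranked classes.
        candidates = sorted(candidates, key=totals.get, reverse=True)[:keep]
        # Early termination: stop when the leader is far enough ahead.
        if len(candidates) < 2 or \
                totals[candidates[0]] - totals[candidates[1]] >= margin:
            break
    return candidates, used


def make_net(table: Dict[str, float], log: List[int]) -> Net:
    """Toy net that looks scores up in a table and logs the workload."""
    def net(classes: Sequence[str]) -> Dict[str, float]:
        log.append(len(classes))
        return {c: table[c] for c in classes}
    return net


rough = {"a": 0.10, "b": 0.80, "c": 0.75, "d": 0.05,
         "e": 0.40, "f": 0.60, "g": 0.55}
tables = [
    {"b": 0.9, "c": 0.8, "f": 0.5, "g": 0.4, "e": 0.1},
    {"b": 0.9, "c": 0.6, "f": 0.4, "g": 0.3},
    {"b": 0.9, "c": 0.5, "f": 0.3, "g": 0.2},
    {"b": 0.9, "c": 0.7},
]
log: List[int] = []
nets = [make_net(t, log) for t in tables]
ranking, used = cascade(rough, nets, keep_after=[4, 4, 2, 2])
print(ranking, used, log)
```

With these toy numbers the cascade stops after the third net, so the fourth net is never evaluated. Raising `margin` far enough makes all four nets run, with the last one scoring only the two remaining classes.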
The paper shows that this faster approach makes very few extra misclassifications, if any, compared to using the full ensemble, which is several times slower.