Enhanced Hierarchical Classification via Isotonic Smoothing
Source:
17th International World Wide Web Conference (WWW), Beijing, China, to appear (2008)
Abstract:
Hierarchical topic taxonomies have proliferated on the World Wide
Web, and exploiting the output-space decompositions they induce in
automated classification systems is an active area of research. In
many domains, classifiers learned on a hierarchy of classes have been
shown to outperform those learned on a flat set of classes. In this
paper we argue that the hierarchical arrangement of classes leads to
intuitive relationships between the corresponding classifiers' output
scores, and that enforcing these relationships as a post-processing
step to classification can improve its accuracy. We formulate the task
of smoothing classifier outputs as a regularized isotonic tree
regression problem, and present a dynamic-programming-based method
that solves it optimally. This new problem generalizes the classic
isotonic tree regression problem, and both the new formulation and
the algorithm may be of independent interest. In our empirical
analysis of two real-world text classification scenarios, we show that
our approach to smoothing classifier outputs results in improved
classification accuracy.
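
To make the idea of enforcing score relationships along a hierarchy concrete, the following is a minimal sketch of a much simpler stand-in than the method described in the abstract. It assumes (this is not stated in the abstract) that the intended relationship is that a child class's score should not exceed its parent's score, and it enforces that by clipping scores top-down. It is not the regularized isotonic tree regression or the dynamic-programming algorithm of the paper, and it is not optimal in any regression sense. The function name, the taxonomy, and the scores are all hypothetical.

```python
from typing import Dict, Hashable, List, Optional


def clip_scores_top_down(
    parent: Dict[Hashable, Optional[Hashable]],
    scores: Dict[Hashable, float],
) -> Dict[Hashable, float]:
    """Clip each node's score so it never exceeds its parent's smoothed score.

    Assumption (not from the abstract): the constraint is child <= parent.
    This is a naive heuristic, not an optimal isotonic regression.
    """
    # Build a child list for every node and collect the roots.
    children: Dict[Hashable, List[Hashable]] = {}
    roots: List[Hashable] = []
    for node, par in parent.items():
        if par is None:
            roots.append(node)
        else:
            children.setdefault(par, []).append(node)

    smoothed: Dict[Hashable, float] = {}
    # Iterative depth-first traversal; each entry carries the upper bound
    # inherited from the already-smoothed parent.
    stack = [(root, float("inf")) for root in roots]
    while stack:
        node, upper_bound = stack.pop()
        smoothed[node] = min(scores[node], upper_bound)
        for child in children.get(node, []):
            stack.append((child, smoothed[node]))
    return smoothed


# Hypothetical taxonomy and raw classifier scores, for illustration only.
example_parent = {"root": None, "sports": "root", "tennis": "sports", "news": "root"}
example_scores = {"root": 0.9, "sports": 0.4, "tennis": 0.7, "news": 0.95}
example_smoothed = clip_scores_top_down(example_parent, example_scores)
```

In this example the scores for "tennis" and "news" violate the assumed constraint and are lowered to their parents' values, while "root" and "sports" are unchanged. A regression-based formulation such as the one the abstract describes would instead trade off changes across parent and child scores rather than always lowering the child.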