Handling Continuous Data in Top-Down Induction of First-Order Rules

Donato Malerba, Floriana Esposito, Giovanni Semeraro, and Sergio Caggese
Dipartimento di Informatica - Università degli Studi di Bari
via Orabona, 4 - 70126 Bari - Italy
{malerba | esposito | semeraro | caggese}@lacam.di.uniba.it


Abstract: Handling numerical information is one of the most important research issues for practical applications of first-order learning systems. This paper addresses the problem of inducing first-order classification rules from both numeric and symbolic data. We propose a specialization operator that discretizes continuous data during the learning process. The heuristic function used to choose among candidate discretizations satisfies a property that can be exploited to improve the efficiency of the specialization operator. The operator has been implemented and tested in the document understanding domain.