Abstract
Context. The problem of automating the decision tree construction is addressed. The object of study is a decision tree. The subject of study is the methods of decision tree building. Objective. The purpose of the work is to create a method for constructing models based on decision trees for data samples that are characterized by sets of individually low-informative features. Method. A method for decision tree constructing is proposed, which for a given sample determines the individual informativeness of features relatively to the output feature, and also evaluates the relationship of input features with each other as their individual informativity pairwise relatively to each other, at the step of forming the next node the method selects as a candidate feature the feature that gives the best partition in the whole set of features, after which it sequentially searches among all the features that are not selected for this node the one that is individually most closely related with the selected candidate, then for the set of selected features, iterating through the available transformations from a given set, determines the quality of the partition for each transformation, selects the best transformation and adds it to the node. When forming the next node, the method tends to single out a group of the most closely interrelated features, the conversion of which into a scalar value will provide the best partitioning of a subsample of instances hit into this node. This makes possible to reduce the size of the model and the branching of the tree, speed up the calculations in recognizing instances based on the model, as well as improve the generalizing properties of the model and its interpretability. The proposed method allows using the constructed decision tree to assess the feature significance.Results. The developed method is implemented as software and investigated at signal represented by a set of individually lowinformative readings classification problem solving. Conclusions. The experiments have confirmed the efficiency of the proposed software and allow recommending it for use in practice in solving problems of diagnostics and automatic classification by features. The prospects for further research may consist in the creation of parallel methods for constructing decision trees based on the proposed method, optimization of its software implementations, and also in an experimental study of the proposed method on a wider set of practical problems