hfs.TSELSelector

class hfs.TSELSelector(hierarchy: Optional[ndarray] = None, use_original_implementation: bool = True)

A tree-based feature selection method for hierarchical features.

This hierarchical feature selection method was proposed by Jeong and Myaeng in 2013. Features are selected in two steps: first, the most representative node is chosen from each path; then, these nodes are filtered further by removing every parent that has a child among the selected nodes.
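The two steps described above can be illustrated with a small, self-contained sketch. This is not the hfs implementation: the tree, the `score` values, and the helper names are hypothetical, and the measure of how representative a node is gets passed in as a plain dictionary rather than computed from data.

```python
# Illustrative sketch only, not the hfs implementation.
# The tree, the scores, and all helper names are hypothetical.

def root_to_leaf_paths(children, root):
    """Return every path from root to a leaf as a list of node names."""
    kids = children.get(root, [])
    if not kids:
        return [[root]]
    return [
        [root] + path
        for kid in kids
        for path in root_to_leaf_paths(children, kid)
    ]


def select_representatives(children, root, score):
    # Step 1: on each root-to-leaf path keep the highest-scoring node.
    # max() returns the first maximum, so ties go to the node nearer the root.
    chosen = {
        max(path, key=score.get)
        for path in root_to_leaf_paths(children, root)
    }
    # Step 2: drop every chosen node that has a chosen child.
    return {
        node
        for node in chosen
        if not any(kid in chosen for kid in children.get(node, []))
    }


# Tree: A -> B, A -> C, B -> D, B -> E
children = {"A": ["B", "C"], "B": ["D", "E"]}
score = {"A": 0.2, "B": 0.6, "C": 0.5, "D": 0.9, "E": 0.3}

selected = select_representatives(children, "A", score)
print(sorted(selected))  # ['C', 'D']
```

In this example, step 1 picks D, B, and C from the three paths; step 2 then removes B because its child D was also picked.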

__init__(hierarchy: Optional[ndarray] = None, use_original_implementation: bool = True)

Initializes a TSELSelector.

Parameters
hierarchy: np.ndarray

The hierarchy graph as an adjacency matrix. The feature selection method is intended for a hierarchy graph that has a tree structure.

use_original_implementation: bool

Whether to use the original implementation from the paper. If False, a slightly different interpretation of the algorithm is used. Default is True.
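As a rough illustration of the hierarchy parameter, the following sketch builds an adjacency matrix for a small tree with NumPy and checks that it really is a tree. The edge list and the `is_tree` helper are hypothetical, and the convention that `hierarchy[parent, child] == 1` marks an edge is an assumption, not something stated in this documentation.

```python
import numpy as np

# Hypothetical five-node tree: 0 -> 1, 0 -> 2, 1 -> 3, 1 -> 4.
edges = [(0, 1), (0, 2), (1, 3), (1, 4)]
n_nodes = 5

# Assumed convention: hierarchy[parent, child] == 1 marks a directed edge.
hierarchy = np.zeros((n_nodes, n_nodes), dtype=int)
for parent, child in edges:
    hierarchy[parent, child] = 1


def is_tree(adjacency):
    """Return True if the 0/1 adjacency matrix describes a rooted tree."""
    in_degree = adjacency.sum(axis=0)
    roots = np.flatnonzero(in_degree == 0)
    # A tree has exactly one root and no node with more than one parent.
    if len(roots) != 1 or np.any(in_degree > 1):
        return False
    # Every node must also be reachable from the root.
    seen, stack = set(), [int(roots[0])]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(int(c) for c in np.flatnonzero(adjacency[node]))
    return len(seen) == adjacency.shape[0]


print(is_tree(hierarchy))  # True
```

A check like this can be useful before constructing the selector, since the method is intended for tree-shaped hierarchies.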

fit(X, y, columns=None)

Fitting function that sets self.representatives_.

The number of columns in X must equal the number of nodes in the hierarchy, and the columns parameter should map each column to exactly one node in the hierarchy. After fitting, self.representatives_ contains the names of all nodes from the hierarchy that remain after feature selection. Features are selected by choosing the most representative node from each path and then removing every parent that has a child among the selected nodes.

Parameters
X: {array-like, sparse matrix}, shape (n_samples, n_features)

The training input samples.

y: array-like, shape (n_samples,)

The target values. An array of int.

columns: list or None, length n_features

The mapping from the hierarchy graph’s nodes to the columns in X, given as a list of ints. If this parameter is None, the columns in X and the corresponding nodes in the hierarchy are expected to be in the same order.

Returns
self: object

Returns self.
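The requirements on X and columns described for fit can be checked up front. The sketch below is a hypothetical helper, not part of hfs; it assumes that nodes are numbered 0 to n_nodes - 1 and makes no claim about what validation the library performs itself.

```python
def check_columns(columns, n_features, n_nodes):
    """Validate a columns list against the requirements stated for fit.

    Hypothetical helper; assumes nodes are numbered 0 .. n_nodes - 1.
    """
    if n_features != n_nodes:
        raise ValueError("X must have as many columns as the hierarchy has nodes")
    if columns is None:
        # No mapping given: columns and nodes are taken to be in the same order.
        return list(range(n_features))
    if len(columns) != n_features:
        raise ValueError("columns must have length n_features")
    if sorted(columns) != list(range(n_nodes)):
        raise ValueError("each node index must appear exactly once")
    return list(columns)


print(check_columns(None, 3, 3))       # [0, 1, 2]
print(check_columns([2, 0, 1], 3, 3))  # [2, 0, 1]
```

A list such as `[0, 0, 1]` is rejected because it maps two columns to the same node and leaves node 2 without a column.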