hfs.GreedyTopDownSelector

class hfs.GreedyTopDownSelector(hierarchy: Optional[ndarray] = None, iterate_first_level: bool = True)

Greedy Top Down feature selection method proposed by Lu et al. 2013.

Features are selected by choosing nodes from the hierarchy that score well on the heuristic function and are neither an ancestor nor a descendant of a node with a higher score. This feature selection method is intended for hierarchical data and therefore inherits from EagerHierarchicalFeatureSelector.
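The selection rule described above can be illustrated with a small, self-contained sketch. It is not the hfs implementation: the toy hierarchy, the scores, and the helper names (`descendants`, `related`, `select`) are all hypothetical, and the heuristic function itself is replaced by hard-coded numbers. The sketch applies the rule literally, keeping a node only if no ancestor or descendant has a strictly higher score.

```python
# Hypothetical toy hierarchy: each node maps to its list of children.
children = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [],
    "a2": [],
    "b1": [],
}

# Hypothetical heuristic scores (higher is better); not produced by hfs.
scores = {"root": 0.1, "a": 0.6, "a1": 0.4, "a2": 0.7, "b": 0.5, "b1": 0.2}


def descendants(children, node):
    """Return every node reachable below `node`."""
    found = set()
    stack = list(children[node])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children[current])
    return found


def related(children, node):
    """Return all ancestors and descendants of `node`."""
    ancestors = {n for n in children if node in descendants(children, n)}
    return ancestors | descendants(children, node)


def select(children, scores):
    """Keep the nodes that no ancestor or descendant outscores."""
    return sorted(
        node
        for node in children
        if all(scores[other] <= scores[node] for other in related(children, node))
    )


selected = select(children, scores)
print(selected)  # ['a2', 'b']
```

Here `a2` outscores both its parent `a` and `root`, so it is kept in place of `a`, while `b` is kept because it outscores `root` and its child `b1`.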

__init__(hierarchy: Optional[ndarray] = None, iterate_first_level: bool = True)

Initializes a GreedyTopDownSelector.

Parameters
hierarchy : np.ndarray

The hierarchy graph as an adjacency matrix.

iterate_first_level : bool

The feature selection algorithm proposed by Lu et al. assumes that the hierarchy has a tree structure. If the hierarchy is a DAG, this parameter can be set to False to achieve behaviour similar to that of the original algorithm.
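As a rough illustration of the hierarchy parameter, the sketch below encodes a small tree as an adjacency matrix and checks whether every node has at most one parent, which is what separates a tree-shaped hierarchy from a general DAG. The edge list and the row-is-parent, column-is-child convention are assumptions made for this example, not something the documentation confirms. Plain nested lists are used to keep the sketch dependency-free; since the signature expects an ndarray, the result would typically be converted with numpy.asarray before being passed to the selector.

```python
# Hypothetical tree with four nodes: 0 -> 1, 0 -> 2, 1 -> 3.
edges = [(0, 1), (0, 2), (1, 3)]
n_nodes = 4

# Assumed convention: row = parent, column = child, 1 marks an edge.
hierarchy = [[0] * n_nodes for _ in range(n_nodes)]
for parent, child in edges:
    hierarchy[parent][child] = 1

# Count the parents of each node by summing its column.
parents_per_node = [sum(row[j] for row in hierarchy) for j in range(n_nodes)]

# In a tree no node has more than one parent; in a general DAG a node may.
at_most_one_parent = all(count <= 1 for count in parents_per_node)
```

Adding a second edge into node 3, for example (2, 3), would make `at_most_one_parent` false; that is the situation in which the iterate_first_level parameter becomes relevant.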

fit(X, y, columns=None)

Fitting function that sets self.representatives_.

The number of columns in X must equal the number of nodes in the hierarchy, and each column should be mapped to exactly one node in the hierarchy with the columns parameter. After fitting, self.representatives_ contains the names of all nodes from the hierarchy that remain after feature selection. Features are selected by choosing nodes from the hierarchy that score well on the heuristic function and are neither an ancestor nor a descendant of a node with a higher score.

Parameters
X : {array-like, sparse matrix}, shape (n_samples, n_features)

The training input samples.

y : array-like, shape (n_samples,)

The target values. An array of int.

columns : list or None, length n_features

The mapping from the hierarchy graph’s nodes to the columns in X. A list of ints. If this parameter is None, the columns in X and the corresponding nodes in the hierarchy are expected to be in the same order.

Returns
self : object

Returns self.
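The relationship between columns and the hierarchy nodes can be sketched as follows. The documentation does not spell out the direction of the mapping, so this example assumes one possible reading: columns[i] is the index of the hierarchy node that column i of X belongs to. The data and variable names are hypothetical, and the code only demonstrates the two preconditions stated for fit (one column per node, each node used exactly once); it does not call the library.

```python
# Hypothetical data: 2 samples, 3 features.
X = [
    [10, 20, 30],
    [11, 21, 31],
]

# Assumed reading: columns[i] is the hierarchy node index for column i of X.
columns = [2, 0, 1]
n_nodes = 3

# Preconditions from the text: one column per node, each node used once.
assert len(columns) == n_nodes
assert sorted(columns) == list(range(n_nodes))

# Reorder each row so that position k holds the value for node k.
node_to_column = {node: col for col, node in enumerate(columns)}
X_by_node = [[row[node_to_column[k]] for k in range(n_nodes)] for row in X]
```

Under this reading, passing columns=None corresponds to the identity mapping [0, 1, 2], where X is already in node order.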