hfs.BottomUpSelector
- class hfs.BottomUpSelector(hierarchy: Optional[ndarray] = None, alpha: float = 0.01, k: int = 5, dataset_type: str = 'binary')
Hill-climbing, bottom-up feature selection method.
This feature selection method was proposed by Wang et al. in 2003. Features are selected by traversing the feature graph from bottom to top, replacing child nodes with their parent node and evaluating the resulting feature set with a fitness function. The method is intended for hierarchical data and therefore inherits from EagerHierarchicalFeatureSelector.
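The bottom-up replacement idea described above can be sketched in plain Python. This is only an illustration, not the library's implementation: the function name `bottom_up_select`, the `children` dictionary, and the toy fitness function are all hypothetical, and the acceptance rule (keep a replacement when the fitness does not decrease) is an assumption rather than the criterion from Wang et al.

```python
from typing import Callable, Dict, FrozenSet, List


def bottom_up_select(
    children: Dict[str, List[str]],
    root: str,
    fitness: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy bottom-up pass over a tree (hypothetical helper).

    Starts from the leaves and tries to replace each group of sibling
    nodes with their parent, keeping the swap only if the fitness of the
    resulting feature set does not get worse.
    """
    # Iterative post-order traversal: children are listed before parents.
    order: List[str] = []
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            order.append(node)
        else:
            stack.append((node, True))
            for child in children.get(node, []):
                stack.append((child, False))

    # Initial feature set: all leaves of the tree.
    selected = frozenset(n for n in order if not children.get(n))
    best = fitness(selected)

    for node in order:
        kids = children.get(node, [])
        # A parent is only a candidate while all of its children are selected.
        if not kids or not all(k in selected for k in kids):
            continue
        candidate = (selected - frozenset(kids)) | {node}
        score = fitness(candidate)
        if score >= best:  # assumed acceptance rule
            selected, best = candidate, score
    return selected


# Toy hierarchy: root -> a, b; a -> a1, a2; b -> b1, b2.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}


def toy_fitness(features: FrozenSet[str]) -> float:
    # Made-up score: prefer fewer features, but heavily penalise "b" and "root".
    penalty = 10 * sum(1 for f in ("b", "root") if f in features)
    return -len(features) - penalty


result = bottom_up_select(tree, "root", toy_fitness)
print(sorted(result))  # ['a', 'b1', 'b2']
```

In this toy run the children of `a` are merged into `a` because the set gets smaller without penalty, while `b1` and `b2` stay because replacing them with `b` lowers the made-up score.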
- __init__(hierarchy: Optional[ndarray] = None, alpha: float = 0.01, k: int = 5, dataset_type: str = 'binary')
Initializes a BottomUpSelector.
- Parameters
- hierarchy: np.ndarray
The hierarchy graph as an adjacency matrix. For this feature selection method to work as intended, the graph needs to be a tree.
- alpha: float
A hyperparameter needed for the feature selection. In the paper by Wang et al. this parameter is called beta. The default value is 0.01.
- k: int
The number of nearest neighbors considered by the feature selection algorithm. The default value is 5.
- dataset_type: string, either “binary” or “numerical”
A value indicating if the input dataset contains binary or numerical data. Default is “binary”.
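Because the hierarchy parameter is expected to describe a tree, it can help to check the adjacency matrix before constructing the selector. The following is a minimal sketch under stated assumptions: `is_tree` is a hypothetical helper that is not part of hfs, the matrix is given as nested lists standing in for the ndarray, and an entry `adj[i][j] == 1` is assumed to mean an edge from parent `i` to child `j`.

```python
from typing import List


def is_tree(adj: List[List[int]]) -> bool:
    """Return True if the directed graph is a rooted tree (hypothetical helper).

    Assumes adj[i][j] == 1 means an edge from parent i to child j.
    """
    n = len(adj)
    if n == 0:
        return False

    # Number of parents of each node (column sums).
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    roots = [j for j in range(n) if indegree[j] == 0]

    # Exactly one root, and no node may have more than one parent.
    if len(roots) != 1 or any(d > 1 for d in indegree):
        return False

    # Every node must be reachable from the root (rules out detached cycles).
    seen = {roots[0]}
    stack = [roots[0]]
    while stack:
        i = stack.pop()
        for j in range(n):
            if adj[i][j] and j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == n


# Node 0 is the root with children 1 and 2; node 1 has child 3.
tree_adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Same graph with an extra edge 2 -> 3, so node 3 has two parents.
dag_adj = [row[:] for row in tree_adj]
dag_adj[2][3] = 1

print(is_tree(tree_adj))  # True
print(is_tree(dag_adj))   # False
```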
- fit(X, y, columns=None)
Fits the selector and sets self.representatives_.
Calls the function that performs the feature selection algorithm. X is expected to have as many columns as the hierarchy has nodes, and the columns parameter should map each column to exactly one node in the hierarchy. After fitting, self.representatives_ contains the names of all hierarchy nodes that remain after feature selection.
- Parameters
- X: {array-like, sparse matrix}, shape (n_samples, n_features)
The training input samples.
- y: array-like, shape (n_samples,)
The target values, given as an array of ints.
- columns: list or None, length n_features
The mapping from the hierarchy graph’s nodes to the columns in X, given as a list of ints. If this parameter is None, the columns in X and the corresponding nodes in the hierarchy are expected to be in the same order.
- Returns
- self: object
Returns self.
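To make the role of the columns parameter of fit more concrete, the sketch below reorders a small dataset so that its column order matches the node order of the hierarchy. It rests on an assumption the text above does not confirm: that `columns[i]` holds the index of the hierarchy node belonging to column `i` of X. The helper `reorder_to_node_order` is hypothetical and not part of hfs.

```python
from typing import List, Sequence


def reorder_to_node_order(
    X_rows: Sequence[Sequence[float]], columns: Sequence[int]
) -> List[List[float]]:
    """Return rows whose j-th entry is the feature of node j (hypothetical helper).

    Assumes columns[i] is the node index that column i of X belongs to.
    """
    n = len(columns)
    # Each column must map to exactly one node, and each node to one column.
    if sorted(columns) != list(range(n)):
        raise ValueError("columns must map each column to exactly one node")

    # Invert the mapping: for each node, the position of its column in X.
    position = {node: col for col, node in enumerate(columns)}
    return [[row[position[node]] for node in range(n)] for row in X_rows]


X = [
    [10, 20, 30],
    [11, 21, 31],
]
# Column 0 belongs to node 2, column 1 to node 0, column 2 to node 1.
columns = [2, 0, 1]

X_sorted = reorder_to_node_order(X, columns)
print(X_sorted)  # [[20, 30, 10], [21, 31, 11]]
```

After such a reordering, columns and nodes are in the same order, which is the situation fit expects when columns is None.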