Multivariate Pattern Analysis in Python
PyMVPA provides a number of useful measures. The vast majority of them are dedicated to feature selection. To increase analysis flexibility, PyMVPA distinguishes two parts of a feature selection procedure.
First, the impact of each individual feature on a classification has to be determined. The resulting map reflects the sensitivities of all features with respect to a certain decision and, therefore, algorithms generating these maps are summarized as Sensitivity in PyMVPA.
Second, once the feature sensitivities are known, they can be used as criteria for feature selection. Possible selection strategies range from the very simple, such as "go with the 10% best features", to more complicated algorithms like Recursive Feature Elimination. Because Sensitivity Measures and selection strategies can be arbitrarily combined, PyMVPA offers a flexible framework for feature selection.
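The separation of the two steps can be illustrated without PyMVPA itself. The following is a minimal sketch in plain Python: `select_top_fraction` is a hypothetical helper, not a PyMVPA function, and the sensitivity values are made up for illustration.

```python
import math

def select_top_fraction(sensitivities, fraction=0.1):
    """Return the sorted indices of the highest-scoring features.

    Hypothetical helper for illustration; not part of PyMVPA.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one feature, rounding the count up.
    n_keep = max(1, math.ceil(len(sensitivities) * fraction))
    # Rank feature indices by descending sensitivity.
    ranked = sorted(range(len(sensitivities)),
                    key=lambda i: sensitivities[i], reverse=True)
    return sorted(ranked[:n_keep])

# Step 1 (assumed already done): one sensitivity score per feature.
sensitivity_map = [0.1, 0.9, 0.3, 0.05, 0.7, 0.2, 0.8, 0.15, 0.4, 0.6]

# Step 2: apply a selection strategy to the sensitivity map.
selected = select_top_fraction(sensitivity_map, fraction=0.2)
print(selected)  # [1, 6]
```

Because the selection step only sees a list of scores, any measure that produces one score per feature could be substituted in step 1 without changing step 2.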
Similar to dataset splitters, all PyMVPA algorithms are implemented and behave like processing objects. To recap, this means that they are instantiated by passing all relevant arguments to the constructor. Once created, they can be used multiple times by calling them with different datasets.
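The processing-object pattern described here can be sketched in plain Python. `ThresholdCounter` is a hypothetical toy class, not part of PyMVPA; it only shows the "configure in the constructor, then call repeatedly" behavior, with datasets represented as plain lists of samples.

```python
class ThresholdCounter:
    """Toy processing object: configured once, callable many times.

    Hypothetical class for illustration; not part of PyMVPA.
    """

    def __init__(self, threshold):
        # All relevant arguments are passed to the constructor.
        self.threshold = threshold

    def __call__(self, dataset):
        # Count, per feature (column), the samples above the threshold.
        n_features = len(dataset[0])
        return [sum(1 for sample in dataset if sample[j] > self.threshold)
                for j in range(n_features)]

counter = ThresholdCounter(threshold=0.5)

# The same object is reused on different datasets (rows are samples).
dataset_a = [[0.2, 0.9], [0.7, 0.8]]
dataset_b = [[0.6, 0.1], [0.9, 0.4], [0.3, 0.2]]
result_a = counter(dataset_a)
result_b = counter(dataset_b)
print(result_a, result_b)  # [1, 2] [2, 0]
```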
As mentioned above, a Sensitivity computes a featurewise score that indicates how much interesting signal each feature contains, in the hope that this score correlates with the impact of the feature on a classifier's decision for a given problem.
When called with a Dataset, every sensitivity analyzer object computes a one-dimensional array containing one score per feature. Due to this common behavior, all Sensitivity types are interchangeable and can be combined with any other algorithm requiring a sensitivity analyzer.
By convention higher sensitivity values indicate more interesting features.
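To make the common interface concrete, here is a minimal sketch of a basic featurewise score in plain Python. The measure itself (absolute difference of the two class means) is an assumption chosen for simplicity, and `mean_difference_sensitivity` is a hypothetical function, not a PyMVPA analyzer. It returns one score per feature, with higher values marking more interesting features.

```python
def mean_difference_sensitivity(samples, labels):
    """Score each feature by the absolute difference of its two class means.

    Hypothetical toy measure; not a PyMVPA analyzer.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this toy measure expects exactly two classes")
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        means = []
        for c in classes:
            # Collect the values of feature j for all samples of class c.
            values = [s[j] for s, l in zip(samples, labels) if l == c]
            means.append(sum(values) / len(values))
        scores.append(abs(means[0] - means[1]))
    return scores

# Rows are samples, columns are features.
samples = [[1.0, 5.0, 0.0],
           [1.0, 7.0, 2.0],
           [1.0, 1.0, 2.0],
           [1.0, 3.0, 4.0]]
labels = ["a", "a", "b", "b"]

sensitivity_map = mean_difference_sensitivity(samples, labels)
print(sensitivity_map)  # [0.0, 4.0, 2.0]
```

The first feature is constant across classes and scores 0.0, while the second separates the classes most strongly and receives the highest score.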
There are two types of sensitivity analyzers in PyMVPA. Basic sensitivity analyzers directly compute a score from a Dataset. Meta sensitivity analyzers on the other hand utilize another sensitivity analyzer to compute their sensitivity maps.
Meta Sensitivity Measures are FeaturewiseDatasetMeasures that internally use one of the basic sensitivity analyzers (or a related measure) to compute their sensitivity scores.
The SplittingFeaturewiseMeasure uses a Splitter to generate dataset splits. A FeaturewiseDatasetMeasure is then used to compute a sensitivity map for each of these splits. Finally, a combiner function is called with all sensitivity maps to produce the final sensitivity map. By default, the mean sensitivity map across all splits is computed.
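The split, measure, combine sequence can be sketched in plain Python. All names below (`leave_one_chunk_out`, `feature_range`, `mean_combiner`, `splitting_measure`) are hypothetical stand-ins for illustration and do not reproduce the PyMVPA classes or their signatures; the toy measure is simply the value range of each feature.

```python
def leave_one_chunk_out(samples, chunks):
    """Yield one split per chunk, each omitting that chunk's samples."""
    for held_out in sorted(set(chunks)):
        yield [s for s, c in zip(samples, chunks) if c != held_out]

def feature_range(samples):
    """Toy featurewise measure: max minus min of each feature."""
    return [max(col) - min(col) for col in zip(*samples)]

def mean_combiner(maps):
    """Average a list of sensitivity maps feature by feature."""
    return [sum(col) / len(col) for col in zip(*maps)]

def splitting_measure(samples, chunks, splitter, measure,
                      combiner=mean_combiner):
    """Compute one map per split, then combine them into a final map."""
    maps = [measure(split) for split in splitter(samples, chunks)]
    return combiner(maps)

# Rows are samples; each sample belongs to one chunk.
samples = [[0.0, 1.0], [2.0, 1.0], [6.0, 4.0]]
chunks = [0, 1, 2]

final_map = splitting_measure(samples, chunks,
                              leave_one_chunk_out, feature_range)
print(final_map)  # [4.0, 2.0]
```

Passing a different `combiner`, for example one that takes the featurewise maximum, changes how the per-split maps are merged without touching the splitter or the measure.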