This content refers to the previous stable release of PyMVPA. Please visit www.pymvpa.org for the most recent version of PyMVPA and its documentation.

Measures¶

PyMVPA provides a number of useful measures. The vast majority of them are dedicated to feature selection. To increase analysis flexibility, PyMVPA distinguishes two parts of a feature selection procedure.

First, the impact of each individual feature on a classification has to be determined. The resulting map reflects the sensitivities of all features with respect to a certain decision and, therefore, algorithms generating these maps are summarized as Sensitivity in PyMVPA.

Second, once the feature sensitivities are known, they can be used as criteria for feature selection. Possible selection strategies range from the very simple ("go with the 10% best features") to more complicated algorithms like Recursive Feature Elimination. Because sensitivity measures and selection strategies can be arbitrarily combined, PyMVPA offers a flexible framework for feature selection.
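The simplest strategy mentioned above, keeping a fixed fraction of the best-scoring features, can be sketched in plain Python. This is only an illustration of the idea and not PyMVPA's selector API; the function name select_top_fraction and its fraction parameter are hypothetical.

```python
def select_top_fraction(sensitivities, fraction=0.1):
    """Return the indices of the highest-scoring features, in ascending order.

    Hypothetical helper for illustration; not part of PyMVPA.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one feature, otherwise round to the nearest count
    n_keep = max(1, int(round(fraction * len(sensitivities))))
    # Rank feature indices by descending sensitivity score
    ranked = sorted(range(len(sensitivities)),
                    key=lambda i: sensitivities[i], reverse=True)
    return sorted(ranked[:n_keep])


scores = [0.2, 3.1, 0.4, 1.7, 0.1, 0.9, 2.5, 0.3, 0.05, 0.6]
selected = select_top_fraction(scores, fraction=0.2)
print(selected)
```

Because the function only needs a sequence of scores, it works with the output of any sensitivity measure that follows the "higher is more interesting" convention.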

Similar to dataset splitters, all PyMVPA algorithms are implemented and behave like processing objects. To recap, this means that they are instantiated by passing all relevant arguments to the constructor. Once created, they can be used multiple times by calling them with different datasets.
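The processing-object pattern described above can be illustrated with a minimal sketch. The class ScaledFeatureMeans and its scale argument are made up for this example and do not exist in PyMVPA; the point is only the split between configuration in the constructor and repeated calls with different data.

```python
class ScaledFeatureMeans:
    """Hypothetical processing object: configured once, then called with data."""

    def __init__(self, scale=1.0):
        # All relevant arguments are passed to the constructor
        self.scale = scale

    def __call__(self, samples):
        # Compute one value per feature (column) of the given samples
        n_samples = len(samples)
        return [self.scale * sum(column) / n_samples
                for column in zip(*samples)]


measure = ScaledFeatureMeans(scale=2.0)

# The same instance can be reused with different datasets
first = measure([[1.0, 2.0], [3.0, 4.0]])
second = measure([[0.0, 10.0]])
print(first, second)
```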

Sensitivity Measures¶

It was already mentioned that a Sensitivity computes a featurewise score that indicates how much interesting signal each feature contains – hoping that this score somehow correlates with the impact of the features on a classifier’s decision for a certain problem.

When called with a Dataset, every sensitivity analyzer object computes a one-dimensional array containing one score per feature. Because of this common behavior, all Sensitivity types are interchangeable and can be combined with any other algorithm requiring a sensitivity analyzer.

By convention higher sensitivity values indicate more interesting features.

There are two types of sensitivity analyzers in PyMVPA. Basic sensitivity analyzers directly compute a score from a Dataset. Meta sensitivity analyzers on the other hand utilize another sensitivity analyzer to compute their sensitivity maps.

Basic Sensitivity (and related Measures)¶

ANOVA¶

The OneWayAnova class provides a simple (and fast) univariate measure that can be used for feature selection, although it is not a proper sensitivity measure. For each feature an individual F-score is computed as the ratio of between-group to within-group variance. Each group consists of the samples that share a label.

Higher F-scores indicate higher sensitivities, as with all other sensitivity analyzers.
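To make the F-score concrete, here is a minimal sketch of the textbook one-way ANOVA F-statistic, computed separately for each feature. It is written in plain Python for illustration and is an assumption about the general technique, not a copy of the OneWayAnova implementation; the function name f_scores is hypothetical.

```python
def f_scores(samples, labels):
    """Textbook one-way ANOVA F-score for each feature (column) of samples."""
    # Group the sample rows by their label
    groups = {}
    for row, label in zip(samples, labels):
        groups.setdefault(label, []).append(row)

    n_samples, n_groups = len(samples), len(groups)
    if n_groups < 2 or n_samples <= n_groups:
        raise ValueError("need at least two groups and more samples than groups")

    scores = []
    for j in range(len(samples[0])):
        grand_mean = sum(row[j] for row in samples) / n_samples
        ss_between = ss_within = 0.0
        for rows in groups.values():
            values = [row[j] for row in rows]
            group_mean = sum(values) / len(values)
            ss_between += len(values) * (group_mean - grand_mean) ** 2
            ss_within += sum((v - group_mean) ** 2 for v in values)
        # Mean squares: sums of squares divided by their degrees of freedom
        ms_between = ss_between / (n_groups - 1)
        ms_within = ss_within / (n_samples - n_groups)
        # A feature with zero within-group variance gets an infinite score
        scores.append(ms_between / ms_within if ms_within > 0 else float("inf"))
    return scores


# Feature 0 separates the two labels, feature 1 does not
samples = [[1, 5], [2, 1], [3, 3], [7, 3], [8, 5], [9, 1]]
labels = ["a", "a", "a", "b", "b", "b"]
scores = f_scores(samples, labels)
print(scores)
```

In this toy dataset the first feature receives a large F-score, while the second, whose group means are identical, scores zero.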

Linear SVM Weights¶

The featurewise weights of a trained support vector machine are another possible sensitivity measure. The mvpa.clfs.libsvmc.sens.LinearSVMWeights and mvpa.clfs.sg.sens.LinearSVMWeights classes can internally train all types of linear support vector machines and report those weights.

In contrast to the F-scores computed by an ANOVA, the weights can be positive or negative, with both extremes indicating higher sensitivities. To deal with this property, all subclasses of DatasetMeasure support a transformer argument in the constructor. A transformer is a functor that is called with the computed sensitivity map as the final step. PyMVPA already comes with some convenience functors which can be used for this purpose (see transformers).

>>> from mvpa.misc.data_generators import normalFeatureDataset
>>> from mvpa.clfs.svm import LinearCSVMC
>>> from mvpa.misc.transformers import Absolute
>>>
>>> ds = normalFeatureDataset()
>>> ds
<Dataset / float64 100 x 4 uniq: 2 labels 5 chunks labels_mapped>
>>>
>>> clf = LinearCSVMC()
>>> sensana = clf.getSensitivityAnalyzer()
>>> sens = sensana(ds)
>>> sens.shape
(4,)
>>> (sens < 0).any()
True
>>> sensana_abs = clf.getSensitivityAnalyzer(transformer=Absolute)
>>> (sensana_abs(ds) < 0).any()
False

The example above shows how to use an existing classifier instance to report sensitivity values (a linear SVM in this case). The computed sensitivity vector contains one element for each feature in the dataset. Transformers can be used to post-process the sensitivity scores, e.g. reporting absolute values for feature selection purposes instead of raw sensitivities.

Note

The SVMWeights classes cannot extract reasonable weights from non-linear SVMs (e.g. with RBF kernels).

Other linear Classifier Weights¶

Any linear classifier in PyMVPA can report its weights. The procedure is identical for all of them. As outlined in the example using linear SVM weights, simply call getSensitivityAnalyzer() on a classifier instance and you’ll get an appropriate Sensitivity object. Additionally, it is possible to force (re)training of the underlying classifier or simply report the weights computed during a previous training run.

Examples of other classifier-based linear sensitivity analyzers are: SMLRWeights and GPRLinearWeights.

Noise Perturbation¶

Noise perturbation is a generic approach to determine feature sensitivity. The sensitivity analyzer (NoisePerturbationSensitivity) first computes a scalar DatasetMeasure using the original dataset. Afterwards, for each single feature, a noise pattern is added to that feature and the dataset measure is recomputed. The sensitivity of each feature is the difference between the dataset measure of the original dataset and the one with added noise. The reasoning behind this algorithm is that adding noise to important features will impair a dataset measure like cross-validated classifier transfer error. Adding noise to a feature that already contains only noise, however, will not change such a measure.

Depending on the scalar DatasetMeasure used, running this sensitivity analyzer can be very CPU-intensive, because the measure is recomputed once per feature. Also depending on the measure, it might be necessary to use appropriate transformers (see transformers constructor arguments) to ensure that higher values represent higher sensitivities.
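The procedure can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the NoisePerturbationSensitivity implementation: the scalar measure separation, the Gaussian noise model, and the parameters noise_sd and seed are all hypothetical choices made for this example.

```python
import random
from statistics import mean, pstdev


def separation(samples, labels):
    """Hypothetical scalar dataset measure: summed standardized gap
    between the feature means of two classes (higher is better)."""
    classes = sorted(set(labels))
    total = 0.0
    for j in range(len(samples[0])):
        a = [row[j] for row, lab in zip(samples, labels) if lab == classes[0]]
        b = [row[j] for row, lab in zip(samples, labels) if lab == classes[1]]
        spread = (pstdev(a) + pstdev(b)) / 2
        if spread > 0:  # skip features that are constant within both classes
            total += abs(mean(a) - mean(b)) / spread
    return total


def noise_perturbation(measure, samples, labels, noise_sd=1.0, seed=0):
    """Per-feature sensitivity: baseline measure minus the measure
    obtained after adding Gaussian noise to that single feature."""
    rng = random.Random(seed)
    baseline = measure(samples, labels)
    sensitivities = []
    for j in range(len(samples[0])):
        noisy = [list(row) for row in samples]  # perturb a copy only
        for row in noisy:
            row[j] += rng.gauss(0.0, noise_sd)
        sensitivities.append(baseline - measure(noisy, labels))
    return sensitivities


# Feature 0 separates the classes, feature 1 carries no label information
samples = [[i * 0.1, float(i % 2)] for i in range(10)]
samples += [[5.0 + i * 0.1, float(i % 2)] for i in range(10)]
labels = ["a"] * 10 + ["b"] * 10

sens = noise_perturbation(separation, samples, labels, noise_sd=5.0)
print(sens)
```

Here the measure is one where higher values are better, so subtracting the perturbed value from the baseline yields large positive scores for informative features. For an error measure, where lower is better, the sign would be reversed, which is the situation the transformers mentioned above are meant to handle.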

Meta Sensitivity Measures¶

Meta Sensitivity Measures are FeaturewiseDatasetMeasures that internally use one of the Basic Sensitivity (and related Measures) to compute their sensitivity scores.

Splitting Measures¶

The SplittingFeaturewiseMeasure uses a Splitter to generate dataset splits. A FeaturewiseDatasetMeasure is then used to compute a sensitivity map for each of these dataset splits. At the end a combiner function is called with all sensitivity maps to produce the final sensitivity map. By default the mean sensitivity map across all splits is computed.
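The split, measure, and combine steps can be sketched in plain Python. All names below (leave_one_chunk_out, mean_gap, mean_combiner, splitting_measure) are hypothetical stand-ins for the Splitter, the FeaturewiseDatasetMeasure, and the combiner; the sketch assumes a leave-one-chunk-out splitting scheme and does not reproduce the PyMVPA classes.

```python
def leave_one_chunk_out(samples, labels, chunks):
    """Yield one (samples, labels) subset per chunk, with that chunk left out."""
    for held_out in sorted(set(chunks)):
        keep = [i for i, chunk in enumerate(chunks) if chunk != held_out]
        yield [samples[i] for i in keep], [labels[i] for i in keep]


def mean_gap(samples, labels):
    """Hypothetical featurewise measure: absolute gap between two class means."""
    classes = sorted(set(labels))
    result = []
    for j in range(len(samples[0])):
        means = []
        for cls in classes:
            values = [row[j] for row, lab in zip(samples, labels) if lab == cls]
            means.append(sum(values) / len(values))
        result.append(abs(means[0] - means[1]))
    return result


def mean_combiner(maps):
    """Average the sensitivity maps feature by feature."""
    return [sum(column) / len(maps) for column in zip(*maps)]


def splitting_measure(featurewise_measure, splits, combiner=mean_combiner):
    """Compute one sensitivity map per split and combine them into one map."""
    maps = [featurewise_measure(s, l) for s, l in splits]
    return combiner(maps)


samples = [[0.0, 1.0], [4.0, 1.0], [2.0, 3.0], [8.0, 3.0]]
labels = ["a", "b", "a", "b"]
chunks = [0, 0, 1, 1]

splits = leave_one_chunk_out(samples, labels, chunks)
combined = splitting_measure(mean_gap, splits)
print(combined)
```

Passing a different combiner, for example one that takes the featurewise maximum, changes how the per-split maps are merged without touching the other two steps.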
