This content refers to the previous stable release of PyMVPA. Please visit www.pymvpa.org for the most recent version of PyMVPA and its documentation.

clfs.transerror

Module: clfs.transerror

Inheritance diagram for mvpa.clfs.transerror:

Utility class to compute the transfer error of classifiers.

Classes

ClassifierError

class mvpa.clfs.transerror.ClassifierError(clf, labels=None, train=True, **kwargs)

Bases: mvpa.misc.state.ClassWithCollections

Compute (or return) some error of a (trained) classifier on a dataset.

See also

Please refer to the documentation of the base class for more information:

ClassWithCollections

Note

Available state variables:

  • confusion: State variable
  • training_confusion: Proxy training_confusion from underlying classifier.

(States enabled by default are listed with +)

Initialization.

Parameters:
  • clf (Classifier) – Either trained or untrained classifier
  • labels (list) – if provided, should be a set of labels to add on top of the ones present in testdata
  • train (bool) – unless train=False, classifier gets trained if trainingdata provided to __call__
  • enable_states (None or list of basestring) – Names of the state variables which should be enabled additionally to default ones
  • disable_states (None or list of basestring) – Names of the state variables which should be disabled
clf
confusion = None

TODO: Labels might also be symbolic and thus cannot be used directly as indices of the array

labels
untrain()

Untrain the *Error which relies on the classifier

ConfusionBasedError

class mvpa.clfs.transerror.ConfusionBasedError(clf, labels=None, confusion_state='training_confusion', **kwargs)

Bases: mvpa.clfs.transerror.ClassifierError

For a given classifier, report an error based on an internally computed error measure (given by a ConfusionMatrix stored in one of the classifier’s state variables).

This makes it possible to perform feature selection using as the error criterion either the learning error or, in the case of SplitClassifier, the error of transfer to the splits.

See also

Please refer to the documentation of the base class for more information:

ClassifierError

Note

Available state variables:

  • confusion: State variable
  • training_confusion: Proxy training_confusion from underlying classifier.

(States enabled by default are listed with +)

Initialization.

Parameters:
  • clf (Classifier) – Either trained or untrained classifier
  • confusion_state – Id of the state variable which stores ConfusionMatrix
  • labels (list) – if provided, should be a set of labels to add on top of the ones present in testdata
  • train (bool) – unless train=False, classifier gets trained if trainingdata provided to __call__
  • enable_states (None or list of basestring) – Names of the state variables which should be enabled additionally to default ones
  • disable_states (None or list of basestring) – Names of the state variables which should be disabled

ConfusionMatrix

class mvpa.clfs.transerror.ConfusionMatrix(labels=None, labels_map=None, **kwargs)

Bases: mvpa.clfs.transerror.SummaryStatistics

Class to store and display a confusion matrix.

Implementation of SummaryStatistics for classification problems. The actual computation of the confusion matrix is delayed until all data has been acquired (so that the complete set of labels is known). If the testing data doesn’t have a complete set of labels but you would like to include all labels, provide them as a parameter to the constructor.

The confusion matrix provides a set of performance statistics (use asstring(description=True) for a description of the abbreviations), as well as ROC curve (http://en.wikipedia.org/wiki/ROC_curve) plotting and analysis (AUC) for a limited set of problems: binary and multiclass 1-vs-all.
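The delayed computation described above can be illustrated with a minimal, library-independent sketch. It is not PyMVPA's implementation: the function names `confusion_matrix` and `percent_correct`, the `extra_labels` argument, and the choice of rows for targets and columns for predictions are all assumptions made for this example.

```python
def confusion_matrix(sets, extra_labels=()):
    """Build a confusion matrix from a list of (targets, predictions) sets.

    Hypothetical helper: matrix[i][j] counts samples whose target is
    labels[i] and whose prediction is labels[j] (orientation chosen
    for this sketch only).
    """
    # The full label set is only known once every set has been seen,
    # which is why the computation is deferred until this point.
    labels = set(extra_labels)
    for targets, predictions in sets:
        labels.update(targets)
        labels.update(predictions)
    labels = sorted(labels)
    index = dict((label, i) for i, label in enumerate(labels))

    matrix = [[0] * len(labels) for _ in labels]
    for targets, predictions in sets:
        for t, p in zip(targets, predictions):
            matrix[index[t]][index[p]] += 1
    return labels, matrix


def percent_correct(matrix):
    """Percentage of samples on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total if total else 0.0


# Two sets of results; label "c" never occurs but is requested explicitly.
sets = [(["a", "b", "a"], ["a", "b", "b"]),
        (["b", "a"], ["b", "a"])]
labels, matrix = confusion_matrix(sets, extra_labels=["c"])
accuracy = percent_correct(matrix)
```

Passing `extra_labels` mirrors the constructor parameter mentioned above: a label that is absent from the test data still receives its own (all-zero) row and column.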

Initialize ConfusionMatrix with optional list of labels

Parameters:
  • labels (list) – Optional set of labels to include in the matrix
  • labels_map (None or dict) – Dictionary from original dataset to show mapping into numerical labels
  • targets – Optional set of targets
  • predictions – Optional set of predictions
asstring(short=False, header=True, summary=True, description=False)

‘Pretty print’ the matrix

Parameters:
  • short (bool) – if True, ignores the rest of the parameters and provides a concise one-line summary
  • header (bool) – print header of the table
  • summary (bool) – print summary (accuracy)
  • description (bool) – print verbose description of presented statistics
error
getLabels_map()
labels
labels_map
matrices

Return a list of separate confusion matrices, one per stored set

matrix
percentCorrect
plot(labels=None, numbers=False, origin='upper', numbers_alpha=None, xlabels_vertical=True, numbers_kwargs={}, **kwargs)

Provide presentation of confusion matrix in image

Parameters:
  • labels (list of int or basestring) – Optionally provided labels guarantee the order of presentation. A value of None inserts an empty column/row, which provides visual grouping of labels (Thanks Ingo)
  • numbers (bool) – Place values inside of confusion matrix elements
  • numbers_alpha (None or float) – Controls textual output of numbers. If None, all numbers are plotted with the same intensity. If a float, it controls the alpha level: a higher value gives higher contrast (a good value is 2).
  • origin (basestring) – Which left corner the diagonal should start from
  • xlabels_vertical (bool) – Whether to plot xlabels vertically (beneficial if the number of labels is large)
  • numbers_kwargs (dict) – Additional keyword parameters to be added to numbers (if numbers is True)
  • **kwargs – Additional arguments given to imshow (e.g. cmap)
Return type:

(fig, im, cb) – figure, imshow, colorbar

setLabels_map(val)

ROCCurve

class mvpa.clfs.transerror.ROCCurve(labels, sets=None)

Bases: object

Generic class for ROC curve computation and plotting

Parameters:
  • labels (list) – labels which were used (in order of values if multiclass, or 1 per class for binary problems (e.g. in SMLR))
  • sets (list of tuples) – list of sets for the analysis
ROCs
aucs

Compute and return a set of AUC values, one per label
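As an illustration of what a per-label AUC means in the 1-vs-all setting, here is a minimal pure-Python sketch. It uses the rank-based (Mann-Whitney) identity for the area under the ROC curve, which is not necessarily how ROCCurve computes it. The names `auc` and `one_vs_all_aucs`, and the layout of `values` (one score per label for each sample), are assumptions of this example.

```python
def auc(is_positive, scores):
    """Area under the ROC curve via the Mann-Whitney identity.

    Equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_all_aucs(labels, targets, values):
    """One AUC per label; values[k][i] is the score of sample k for labels[i]."""
    return [auc([t == label for t in targets], [v[i] for v in values])
            for i, label in enumerate(labels)]


# Binary example: one positive sample is ranked below a negative one.
binary_auc = auc([True, True, False, False], [0.9, 0.4, 0.5, 0.1])

# Multiclass example with perfectly separating scores.
per_label = one_vs_all_aucs(
    ["a", "b"],
    ["a", "b", "a"],
    [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4]],
)
```

The quadratic pairwise loop keeps the sketch short; for large sample counts a sort-based rank computation would be the usual choice.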

plot(label_index=0)
TODO: make it friendly to labels given by values?
should we also treat labels_map?

RegressionStatistics

class mvpa.clfs.transerror.RegressionStatistics(**kwargs)

Bases: mvpa.clfs.transerror.SummaryStatistics

Class to contain and display information on regression results.

Initialize RegressionStatistics

Parameters:
  • targets – Optional set of targets
  • predictions – Optional set of predictions
asstring(short=False, header=True, summary=True, description=False)

‘Pretty print’ the statistics

error
plot(plot=True, plot_stats=True, splot=True)

Provide presentation of regression performance in image

Parameters:
  • plot (bool) – Plot regular plot of values (targets/predictions)
  • plot_stats (bool) – Print basic statistics in the title
  • splot (bool) – Plot scatter plot
Return type:

(fig, im, cb) – figure, imshow, colorbar

SummaryStatistics

class mvpa.clfs.transerror.SummaryStatistics(targets=None, predictions=None, values=None, sets=None)

Bases: object

Basic class to collect targets/predictions and report summary statistics

It takes care of collecting the sets, which are just tuples (targets, predictions, values). While ‘computing’ the matrix, all sets are considered together. Subclasses are responsible for computation and display.
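The collect-then-compute pattern described above can be sketched in a few lines. This is a hypothetical stand-in rather than the actual class: `SetCollector` and its `pooled` method are names invented for this example, and the validation rules are assumptions.

```python
class SetCollector(object):
    """Hypothetical illustration of collecting (targets, predictions, values) sets."""

    def __init__(self, targets=None, predictions=None, values=None):
        self.sets = []
        # Targets and predictions only make sense as a pair.
        if (targets is None) != (predictions is None):
            raise ValueError("targets and predictions must be given together")
        if targets is not None:
            self.add(targets, predictions, values)

    def add(self, targets, predictions, values=None):
        """Store one more set of results; nothing is computed yet."""
        if len(targets) != len(predictions):
            raise ValueError("targets and predictions differ in length")
        self.sets.append((list(targets), list(predictions), values))

    def pooled(self):
        """Concatenate all stored sets, as a subclass might before computing."""
        all_targets, all_predictions = [], []
        for targets, predictions, _ in self.sets:
            all_targets.extend(targets)
            all_predictions.extend(predictions)
        return all_targets, all_predictions

    def reset(self):
        """Wipe out all stored sets."""
        self.sets = []


collector = SetCollector()
collector.add([0, 1], [0, 0])
collector.add([1], [1])
pooled_targets, pooled_predictions = collector.pooled()
```

Keeping the raw sets around, instead of updating statistics incrementally, is what allows both a pooled result and separate per-set summaries to be produced later.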

Initialize SummaryStatistics

Targets and predictions cannot be provided alone (i.e. targets without predictions, or vice versa)

Parameters:
  • targets – Optional set of targets
  • predictions – Optional set of predictions
  • values – Optional set of values (which served for prediction)
  • sets – Optional list of sets
add(targets, predictions, values=None)

Add new results to the set of known results

asstring(short=False, header=True, summary=True, description=False)

‘Pretty print’ the matrix

Parameters:
  • short (bool) – if True, ignores the rest of the parameters and provides a concise one-line summary
  • header (bool) – print header of the table
  • summary (bool) – print summary (accuracy)
  • description (bool) – print verbose description of presented statistics
compute()

Actually compute the confusion matrix based on all the sets

error
reset()

Cleans summary – all data/sets are wiped out

sets
stats
summaries

Return a list of separate summaries, one per stored set

TransferError

class mvpa.clfs.transerror.TransferError(clf, errorfx=MeanMismatchErrorFx(), labels=None, null_dist=None, **kwargs)

Bases: mvpa.clfs.transerror.ClassifierError

Compute the transfer error of a (trained) classifier on a dataset.

The actual error value is computed using a customizable error function. Optionally the classifier can be trained by passing an additional training dataset to the __call__() method.
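The idea of a transfer error can be shown with a self-contained sketch that does not use PyMVPA at all. Everything here is an assumption made for illustration: the toy `NearestMeanClassifier`, the `transfer_error` function, the `(samples, labels)` tuples standing in for datasets, and `mean_mismatch_error`, which is only meant to resemble what a mean-mismatch error function would compute.

```python
def mean_mismatch_error(targets, predictions):
    """Fraction of predictions that differ from the targets."""
    mismatches = sum(1 for t, p in zip(targets, predictions) if t != p)
    return float(mismatches) / len(targets)


class NearestMeanClassifier(object):
    """Toy 1-D classifier, present only to make the sketch runnable."""

    def __init__(self):
        self.means = None

    def train(self, samples, labels):
        sums, counts = {}, {}
        for x, label in zip(samples, labels):
            sums[label] = sums.get(label, 0.0) + x
            counts[label] = counts.get(label, 0) + 1
        self.means = dict((l, sums[l] / counts[l]) for l in sums)

    def predict(self, samples):
        # Assign each sample to the label with the closest class mean.
        return [min(self.means, key=lambda l: abs(x - self.means[l]))
                for x in samples]


def transfer_error(clf, test, training=None, errorfx=mean_mismatch_error):
    """Optionally train on `training`, then score predictions on `test`."""
    if training is not None:
        clf.train(*training)
    samples, targets = test
    return errorfx(targets, clf.predict(samples))


training = ([0.0, 0.2, 1.0, 1.2], ["lo", "lo", "hi", "hi"])
test = ([0.1, 0.5, 0.9, 1.3], ["lo", "lo", "hi", "lo"])
err = transfer_error(NearestMeanClassifier(), test, training)
```

Passing the error function as a parameter is what makes the measure customizable: any callable that maps (targets, predictions) to a scalar can be substituted.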

See also

Please refer to the documentation of the base class for more information:

ClassifierError

Note

Available state variables:

  • confusion: State variable
  • null_prob+: Stores the probability of an error result under the NULL hypothesis
  • samples_error: Per-sample errors computed by invoking the error function for each sample individually. Errors are available in a dictionary with each sample’s origid as key.
  • training_confusion: Proxy training_confusion from underlying classifier.

(States enabled by default are listed with +)
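One common way to obtain a probability under the null hypothesis is a Monte Carlo permutation test. The sketch below illustrates that general technique and is not a description of how null_dist works internally. The function `null_probability`, the `compute_error` callable, and the default number of permutations are all assumptions of this example.

```python
import random


def null_probability(observed_error, compute_error, labels,
                     n_permutations=200, seed=0):
    """Estimate P(error <= observed_error) when labels are shuffled.

    `compute_error` is a hypothetical callable that takes a list of
    (permuted) labels and returns the resulting error, for instance by
    retraining a classifier on them and testing it.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Count permutations that do at least as well as the real labels.
        if compute_error(shuffled) <= observed_error:
            hits += 1
    return float(hits) / n_permutations


# Toy setting: the "error" is the mismatch between labels and fixed predictions.
predictions = [0, 0, 0, 0, 1, 1, 1, 1]
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]


def mismatch(labels):
    wrong = sum(1 for l, p in zip(labels, predictions) if l != p)
    return float(wrong) / len(labels)


p_value = null_probability(mismatch(true_labels), mismatch, true_labels)
```

A small estimated probability indicates that an error as low as the observed one is unlikely to arise when the labels carry no information.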

Initialization.

Parameters:
  • clf (Classifier) – Either trained or untrained classifier
  • errorfx – Functor that computes a scalar error value from the vectors of desired and predicted values (e.g. subclass of ErrorFunction)
  • labels (list) – if provided, should be a set of labels to add on top of the ones present in testdata
  • null_dist (instance of distribution estimator) –
  • train (bool) – unless train=False, classifier gets trained if trainingdata provided to __call__
  • enable_states (None or list of basestring) – Names of the state variables which should be enabled additionally to default ones
  • disable_states (None or list of basestring) – Names of the state variables which should be disabled
errorfx
null_dist