Package mvpa :: Package clfs :: Module stats :: Class MCNullDist

Class MCNullDist

Null-hypothesis distribution is estimated from randomly permuted data labels.

The distribution is estimated by calling fit() with an appropriate DatasetMeasure or TransferError instance and a training dataset, plus a validation dataset in the case of a TransferError. For a configurable number of cycles, the training data labels are permuted and the corresponding measure is computed. In the case of a TransferError, this is the error when predicting the correct labels of the validation dataset.
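The permutation cycle described above can be sketched in plain Python. This is an illustration of the general technique only, not the class's actual implementation: mean_difference and permutation_null are hypothetical names, and the toy measure stands in for a real DatasetMeasure.

```python
import random


def mean_difference(values, labels):
    """Toy measure (hypothetical): mean of class 1 minus mean of class 0."""
    ones = [v for v, l in zip(values, labels) if l == 1]
    zeros = [v for v, l in zip(values, labels) if l == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


def permutation_null(measure, values, labels, permutations=100, seed=0):
    """Collect the measure computed on randomly permuted labels."""
    rng = random.Random(seed)
    samples = []
    for _ in range(permutations):
        shuffled = list(labels)   # copy, so the original labels stay intact
        rng.shuffle(shuffled)     # break the link between values and labels
        samples.append(measure(values, shuffled))
    return samples


values = [0.1, 0.3, 0.2, 0.4, 2.1, 2.3, 2.2, 2.4]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

observed = mean_difference(values, labels)
null_samples = permutation_null(mean_difference, values, labels,
                                permutations=200)
```

The list null_samples plays the role of the estimated null distribution; the observed value is then compared against it.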

The distribution can be queried using the cdf() method, which can be configured to report probabilities/frequencies from the left or right tail, i.e. the fraction of the distribution that is lower or larger than some critical value.
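As a rough illustration of the left/right tail idea, the following sketch computes the fraction of stored samples on either side of a critical value. The function name tail_fraction is hypothetical, and counting ties in both tails is an assumption; this page does not say how the class treats values equal to the critical value.

```python
def tail_fraction(samples, x, tail="left"):
    """Fraction of samples at or below x (left) or at or above x (right).

    Assumption: ties count towards the requested tail.
    """
    n = len(samples)
    if tail == "left":
        return sum(1 for s in samples if s <= x) / n
    if tail == "right":
        return sum(1 for s in samples if s >= x) / n
    raise ValueError("tail must be 'left' or 'right'")


# Pretend these are errors obtained from ten label permutations.
samples = [0.40, 0.45, 0.50, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80]

left = tail_fraction(samples, 0.45, tail="left")
right = tail_fraction(samples, 0.45, tail="right")
```

For an error measure, where small values are good, the left tail is the natural choice: a small left-tail fraction means few permutations did as well as the observed value.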

This class also supports FeaturewiseDatasetMeasure. In that case cdf() returns an array of featurewise probabilities/frequencies.
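The featurewise case can be pictured as one null distribution per feature. The sketch below is an assumption about the general shape of the computation, not the class's code: each permutation yields one value per feature, and the left-tail fraction is computed column by column. featurewise_left_tail is a hypothetical name.

```python
def featurewise_left_tail(sample_rows, x):
    """Per-feature fraction of permutation samples at or below x[j].

    sample_rows: one row per permutation, one column per feature.
    x: one critical value per feature.
    """
    n = len(sample_rows)
    return [
        sum(1 for row in sample_rows if row[j] <= x[j]) / n
        for j in range(len(x))
    ]


# Four permutations, two features.
rows = [
    [0.1, 5.0],
    [0.2, 6.0],
    [0.3, 7.0],
    [0.4, 8.0],
]
probs = featurewise_left_tail(rows, [0.25, 8.0])
```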

Nested Classes

Inherited from misc.state.ClassWithCollections: __metaclass__

Instance Methods

__init__(self, dist_class=Nonparametric, permutations=100, **kwargs)
Initialize Monte-Carlo Permutation Null-hypothesis testing

__repr__(self, prefixes=[])
String definition of a ClassWithCollections object

fit(self, measure, wdata, vdata=None)
Fit the distribution by running multiple cycles, each of which permutes the labels in the training dataset.

cdf(self, x)
Return value of the cumulative distribution function at x.

clean(self)
Clean stored distributions

Inherited from NullDist: p

Inherited from NullDist (private): _setTail

Inherited from misc.state.ClassWithCollections: __getattribute__, __new__, __setattr__, __str__, reset

Inherited from object: __delattr__, __format__, __hash__, __reduce__, __reduce_ex__, __sizeof__, __subclasshook__

Class Variables
  _DEV_DOC = ...
  dist_samples = StateVariable(enabled= False, doc= 'Samples obt...

Inherited from NullDist: tail

Inherited from NullDist (private): _ATTRIBUTE_COLLECTIONS

Inherited from misc.state.ClassWithCollections: _DEV__doc__, descr

Instance Variables
  __permutations
Number of permutations used to estimate the null distribution.

Properties

Inherited from object: __class__

Method Details

__init__(self, dist_class=Nonparametric, permutations=100, **kwargs)
(Constructor)

Initialize Monte-Carlo Permutation Null-hypothesis testing
Parameters:
  • dist_class, class - Any class that provides a fit() method to estimate the parameters used to initialize an instance, and a cdf(x) method to evaluate the CDF at x. All distributions from SciPy's 'stats' module can be used.
  • permutations, int - Number of label permutations performed to determine the distribution under the null hypothesis.
Overrides: object.__init__
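The dist_class requirement can be illustrated with a small stand-in class. GaussianDist is hypothetical and uses only the standard library. The fit-then-instantiate sequence shown is an assumption based on the parameter description; it matches how SciPy's stats distributions behave, where fit() returns parameter estimates.

```python
import math


class GaussianDist:
    """Hypothetical distribution class exposing fit() and cdf(x)."""

    def __init__(self, loc, scale):
        self.loc = loc
        self.scale = scale

    @staticmethod
    def fit(samples):
        """Return maximum-likelihood estimates (mean, standard deviation)."""
        n = len(samples)
        loc = sum(samples) / n
        variance = sum((s - loc) ** 2 for s in samples) / n
        return loc, math.sqrt(variance)

    def cdf(self, x):
        """Normal CDF evaluated at x."""
        z = (x - self.loc) / (self.scale * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))


samples = [1.0, 2.0, 3.0, 4.0, 5.0]
params = GaussianDist.fit(samples)   # estimate parameters from the samples
dist = GaussianDist(*params)         # initialize an instance with them
p_at_mean = dist.cdf(3.0)
```

A parametric class like this keeps only two numbers per distribution, whereas a nonparametric one has to keep the samples themselves.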

__repr__(self, prefixes=[])
(Representation operator)

String definition of a ClassWithCollections object
Parameters:
  • fullname - Whether to include the full name of the module
  • prefixes - Additional prefixes to prepend to the list of arguments
Overrides: object.__repr__
(inherited documentation)

fit(self, measure, wdata, vdata=None)

Fit the distribution by running multiple cycles, each of which permutes the labels in the training dataset.
Parameters:
  • measure, (Featurewise)DatasetMeasure | TransferError - Measure or TransferError instance used to compute the measure/transfer error on each permuted dataset.
  • wdata - Dataset which gets permuted and used to compute the measure/transfer error multiple times.
  • vdata - Dataset used for validation. If provided, measure is assumed to be a TransferError, and the working and validation datasets are passed on to it.
Overrides: NullDist.fit

cdf(self, x)

Return value of the cumulative distribution function at x.
Overrides: NullDist.cdf

clean(self)

Clean stored distributions

Storing all of the distributions might be too expensive (e.g. in the case of Nonparametric), and the object might stay in scope for too long to wait for it to be destroyed. clean() binds dist_samples to an empty list so that the garbage collector can reclaim the memory.
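The effect of rebinding dist_samples can be demonstrated with a small stand-in. NullDistSketch and SampleList are hypothetical; the list subclass exists only because plain lists cannot be weakly referenced, and the weak reference is used purely to observe that the old samples are released.

```python
import gc
import weakref


class SampleList(list):
    """List subclass, needed only so a weak reference can observe it."""


class NullDistSketch:
    """Hypothetical holder that stores samples like a fitted null distribution."""

    def __init__(self):
        self.dist_samples = SampleList()

    def fit(self, samples):
        self.dist_samples = SampleList(samples)

    def clean(self):
        # Rebind to an empty list; the old samples lose their last reference.
        self.dist_samples = []


holder = NullDistSketch()
holder.fit(range(10000))
probe = weakref.ref(holder.dist_samples)  # does not keep the samples alive

holder.clean()
gc.collect()  # not strictly needed on CPython, included for other runtimes
```

After clean(), the holder object itself is still usable, but the memory held by the samples is no longer pinned by it.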


Class Variable Details

_DEV_DOC

Value:
"""
    TODO automagically decide on the number of samples/permutations needed
    Caution should be paid though since resultant distributions might be
    quite far from some conventional ones (e.g. Normal) -- it is expected to
    them to be bimodal (or actually multimodal) in many scenarios.
...

dist_samples

Value:
StateVariable(enabled= False, doc= 'Samples obtained for each permutation')