Package mvpa :: Package datasets :: Module splitters :: Class HalfSplitter

Class HalfSplitter

Split a dataset into two halves according to the values of a sample attribute.

The splitter yields two splits: the first is (1st half, 2nd half) and the second is (2nd half, 1st half).
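
The behaviour described above can be sketched without the library. The following is a minimal illustration only, not the mvpa implementation: the function name half_splits is hypothetical, and it assumes that the halves are formed from the sorted unique values of the sample attribute and that each part is returned as a list of sample indices.

```python
def half_splits(attr_values):
    """Yield the two half splits as (first_part, second_part) index lists.

    Hypothetical helper for illustration; not part of mvpa.
    """
    # Assumption: the halves are taken over the sorted unique attribute values.
    unique = sorted(set(attr_values))
    middle = len(unique) // 2
    first_half = set(unique[:middle])
    second_half = set(unique[middle:])

    # Collect the sample indices that fall into each half.
    in_first = [i for i, v in enumerate(attr_values) if v in first_half]
    in_second = [i for i, v in enumerate(attr_values) if v in second_half]

    # First split: (1st half, 2nd half); second split: (2nd half, 1st half).
    yield in_first, in_second
    yield in_second, in_first


# Eight samples spread over four values of the splitting attribute.
chunks = [0, 0, 1, 1, 2, 2, 3, 3]
splits = list(half_splits(chunks))
```

With an odd number of unique values, this sketch places the extra value in the second half; whether the library does the same is not stated here.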

Instance Methods
__init__(self, **kwargs)
Cheap init.
_getSplitConfig(self, uniqueattrs)
Huka chaka!
__str__(self)
String summary over the object

Inherited from Splitter: __call__, setNPerLabel, splitDataset, splitcfg

Inherited from Splitter (private): _setStrategy

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __subclasshook__

Class Variables
  __doc__ = enhancedDocString('HalfSplitter', locals(), Splitter)

Inherited from Splitter: strategy

Inherited from Splitter (private): _NPERLABEL_STR, _STRATEGIES

Instance Variables

Inherited from Splitter: count

Properties

Inherited from object: __class__

Method Details

__init__(self, **kwargs)

Cheap init.
  • nperlabel - Number of dataset samples per label to be included in each split. If given as a float, it must be in the range [0,1] and is interpreted as the fraction of samples selected per label. Two special strings are recognized: 'all' uses all available samples (default) and 'equal' uses the maximum number of samples that can be provided by all of the classes. This value may also be given as a sequence whose length matches the number of datasets per split; each element then configures the respective dataset in each split.
  • nrunspersplit, int - Number of times samples for each split are chosen. This is mostly useful if a subset of the available samples is used in each split and the subset is randomly selected for each run (see the nperlabel argument).
  • permute - If set to True, the labels of each generated dataset will be permuted on a per-chunk basis.
  • count - Desired number of splits to be output. It is limited by the number of splits possible for a given splitter (e.g. OddEvenSplitter can have only up to 2 splits). If None, all splits are output (default).
  • strategy - If count is not None, determines which splits are output. Possible strategies:
      'first' - the first count splits are chosen
      'random' - count splits are chosen at random (without replacement)
      'equidistant' - splits that are equidistant from each other are chosen
  • discard_boundary - If not None, the number of samples on the boundaries between parts of the split to discard in the training part. If an int, that many samples are discarded in all parts. If a sequence, the number to discard is given per part of the split. E.g. if the splitter splits only into (training, testing) parts, then discard_boundary=(2,0) would discard the 2 training samples that lie on the boundary with testing.
  • attr - Sample attribute used to determine splits.
  • reverse - If True, the order of datasets in the split is reversed, e.g. instead of (training, testing), (testing, training) will be output.
Overrides: object.__init__
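
The interaction of count and strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function select_splits is hypothetical, the strategy labels 'first', 'random' and 'equidistant' are assumed names for the three behaviours listed, and the equidistant rule shown (index int(i * n / count)) is one plausible way to space the selection evenly.

```python
import random


def select_splits(splits, count=None, strategy="first", rng=None):
    """Pick `count` splits from `splits` (hypothetical helper, not mvpa API)."""
    splits = list(splits)
    # count=None means all splits are output; count cannot exceed what exists.
    if count is None or count >= len(splits):
        return splits

    if strategy == "first":
        # The first `count` splits are chosen.
        return splits[:count]
    if strategy == "random":
        # `count` distinct splits are drawn at random, without replacement.
        # Assumption: the original order of the chosen splits is preserved.
        rng = rng or random.Random()
        picked = sorted(rng.sample(range(len(splits)), count))
        return [splits[i] for i in picked]
    if strategy == "equidistant":
        # Assumption: evenly spaced indices, starting at the first split.
        step = len(splits) / count
        return [splits[int(step * i)] for i in range(count)]
    raise ValueError("unknown strategy: %r" % (strategy,))


all_splits = list(range(10))  # stand-ins for ten split configurations
first_three = select_splits(all_splits, count=3, strategy="first")
spaced_five = select_splits(all_splits, count=5, strategy="equidistant")
random_three = select_splits(all_splits, count=3, strategy="random",
                             rng=random.Random(0))
```

For a HalfSplitter only two splits exist, so any count of 2 or more returns both of them in this sketch.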

_getSplitConfig(self, uniqueattrs)

Huka chaka!
Overrides: Splitter._getSplitConfig

__str__(self)
(Informal representation operator)

String summary over the object
Overrides: object.__str__