Cheap init.
- Parameters:
nperlabel - Number of dataset samples per label to be included in each
split. If given as a float, it must be in the range [0, 1] and is
interpreted as the fraction of samples to select for each label.
Two special strings are recognized: 'all' uses all available
samples (default) and 'equal' uses the maximum number of samples
that can be provided by all of the classes. This value might also be
provided as a sequence whose length matches the number of datasets
per split and indicates the configuration for the respective dataset
in each split.
nrunspersplit, int - Number of times samples for each split are chosen. This
is mostly useful if a subset of the available samples
is used in each split and the subset is randomly
selected for each run (see the nperlabel argument).
permute - If set to True, the labels of each generated dataset
will be permuted on a per-chunk basis.
count - Desired number of splits to be output. It is limited by the
number of splits possible for a given splitter
(e.g. OddEvenSplitter can have only up to 2 splits). If None,
all splits are output (default).
strategy -
- If count is not None, the following strategies are available:
- first
The first count splits are chosen
- random
count splits are chosen at random (without replacement)
- equidistant
count splits that are equidistant from each other are chosen
discard_boundary - If not None, how many samples on the boundaries between
parts of the split to discard in the training part.
If an int, that many samples are discarded in all parts. If a
sequence, the numbers to discard are given per part of the split.
E.g. if the splitter splits only into (training, testing)
parts, then `discard_boundary`=(2,0) instructs it to discard
2 samples from training which are on the boundary with testing.
attr - Sample attribute used to determine splits.
reverse - If True, the order of datasets in the split is reversed, e.g.
instead of (training, testing), (testing, training) will be
output.
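The interplay of count, strategy and nperlabel can be illustrated with a small standalone sketch. It is not the library's implementation: the helper names choose_splits and samples_per_label are hypothetical, the equidistant spacing rule (index i * n_splits / count, truncated) is an assumption, and so is the truncation used for float ratios.

```python
import random


def choose_splits(n_splits, count=None, strategy="equidistant", rng=None):
    """Return indices of the splits to output (hypothetical helper).

    n_splits is the number of splits the splitter could produce at most.
    """
    # count=None (the default) or a too-large count means "all splits".
    if count is None or count >= n_splits:
        return list(range(n_splits))
    if strategy == "first":
        return list(range(count))
    if strategy == "random":
        rng = rng or random.Random()
        # random.sample draws without replacement.
        return sorted(rng.sample(range(n_splits), count))
    if strategy == "equidistant":
        # Assumed spacing rule: evenly spread indices, truncated to int.
        step = n_splits / count
        return [int(i * step) for i in range(count)]
    raise ValueError("unknown strategy: %r" % (strategy,))


def samples_per_label(nperlabel, available):
    """Resolve nperlabel into a per-label sample count (hypothetical helper).

    available maps each label to the number of samples it has.
    """
    if isinstance(nperlabel, float):
        if not 0.0 <= nperlabel <= 1.0:
            raise ValueError("float nperlabel must be in [0, 1]")
        # Assumed rounding: truncate the fractional sample count.
        return {label: int(n * nperlabel) for label, n in available.items()}
    if nperlabel == "all":
        return dict(available)
    if nperlabel == "equal":
        # The largest count that every label can provide.
        n = min(available.values())
        return {label: n for label in available}
    # A plain int: the same number of samples for every label.
    return {label: nperlabel for label in available}
```

For example, choose_splits(10, count=5, strategy="equidistant") keeps every second split, and samples_per_label("equal", {"a": 10, "b": 6}) assigns 6 samples to both labels.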
- Overrides:
object.__init__