This content refers to the previous stable release of PyMVPA. Please visit www.pymvpa.org for the most recent version of PyMVPA and its documentation.

misc.data_generators

Module: misc.data_generators

Miscellaneous data generators for unittests and demos

Functions

mvpa.misc.data_generators.chirpLinear(n_instances, n_features=4, n_nonbogus_features=2, data_noise=0.4, noise=0.1)

Generates a simple dataset for linear regression

Generates a chirp signal, adds it to n_nonbogus_features of the n_features features at a different noise level, and then provides the signal itself, with additional noise, as labels
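The description above can be sketched without PyMVPA, using only the standard library. This is a minimal illustration, not the library's implementation: the chirp formula (a sine with quadratically growing phase), the choice of the first features as the non-bogus ones, the `seed` argument, and the name `chirp_linear_sketch` are all assumptions made for the example.

```python
import math
import random


def chirp_linear_sketch(n_instances, n_features=4, n_nonbogus_features=2,
                        data_noise=0.4, noise=0.1, seed=None):
    """Return (samples, labels) as plain lists; illustrative only."""
    rng = random.Random(seed)
    # Assumed chirp: a sine whose phase grows quadratically on [0, 1].
    xs = [i / (n_instances - 1) for i in range(n_instances)]
    signal = [math.sin(10 * math.pi * x * x) for x in xs]
    samples = []
    for s in signal:
        # Every feature starts as pure Gaussian noise ...
        row = [rng.gauss(0.0, data_noise) for _ in range(n_features)]
        # ... and only the first n_nonbogus_features also carry the signal.
        for j in range(n_nonbogus_features):
            row[j] += s
        samples.append(row)
    # Labels are the signal itself plus a separate noise term.
    labels = [s + rng.gauss(0.0, noise) for s in signal]
    return samples, labels


samples, labels = chirp_linear_sketch(50, seed=0)
```

With both noise levels set to zero, the non-bogus features equal the labels exactly and the bogus features are all zero, which makes the role of each parameter easy to check.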

mvpa.misc.data_generators.dumbFeatureBinaryDataset()

Very simple binary (2 labels) dataset

mvpa.misc.data_generators.dumbFeatureDataset()

Create a very simple dataset with 2 features and 3 labels

mvpa.misc.data_generators.getMVPattern(s2n)

Simple multivariate dataset

mvpa.misc.data_generators.linear1d_gaussian_noise(size=100, slope=0.5, intercept=1.0, x_min=-2.0, x_max=3.0, sigma=0.2)

A straight line with some Gaussian noise.

mvpa.misc.data_generators.linear_awgn(size=10, intercept=0.0, slope=0.4, noise_std=0.01, flat=False)

Generate a dataset from a linear function with AWGN (Additive White Gaussian Noise).

It can be multidimensional if ‘slope’ is a vector. If flat is True (in 1 dimension), generate equally spaced samples instead of random ones. This is useful for the test phase.
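The behaviour described above might look roughly like the following standard-library sketch. The input range [0, 1), the `seed` argument, and the name `linear_awgn_sketch` are assumptions for illustration; the actual function may sample inputs differently.

```python
import random


def linear_awgn_sketch(size=10, intercept=0.0, slope=0.4, noise_std=0.01,
                       flat=False, seed=None):
    """Return (inputs, targets) for y = intercept + slope . x + noise."""
    rng = random.Random(seed)
    # A scalar slope is the 1-D case; a sequence makes the data multidimensional.
    slopes = list(slope) if isinstance(slope, (list, tuple)) else [slope]
    if flat:
        if len(slopes) != 1:
            raise ValueError("flat=True is only meaningful in one dimension")
        # Equally spaced inputs on [0, 1): reproducible, handy for testing.
        xs = [[i / size] for i in range(size)]
    else:
        # Random inputs, one coordinate per slope component.
        xs = [[rng.random() for _ in slopes] for _ in range(size)]
    ys = [intercept + sum(w * v for w, v in zip(slopes, x))
          + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


train_x, train_y = linear_awgn_sketch(size=20, slope=[0.4, -1.0], seed=0)
test_x, test_y = linear_awgn_sketch(size=4, intercept=1.0, noise_std=0.0, flat=True)
```

The flat variant yields the same inputs on every call, which is why it suits a test set, while the random variant plays the role of training data.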

mvpa.misc.data_generators.multipleChunks(func, n_chunks, *args, **kwargs)

Replicate datasets multiple times, assigning different chunks

Given some randomized (noisy) generator of a dataset with a single chunk, call the generator multiple times and place the results into distinct chunks
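The idea above can be illustrated with a dict standing in for a dataset. Both `noisy_generator` and `multiple_chunks_sketch` are hypothetical names, and the dict layout is an assumption; the real function operates on PyMVPA Dataset objects.

```python
import random


def noisy_generator(n_samples, rng):
    """Hypothetical single-chunk generator returning a dict-based 'dataset'."""
    return {
        "samples": [[rng.gauss(0.0, 1.0)] for _ in range(n_samples)],
        "labels": [i % 2 for i in range(n_samples)],
        "chunks": [0] * n_samples,
    }


def multiple_chunks_sketch(func, n_chunks, *args, **kwargs):
    """Call func n_chunks times and give each result its own chunk id."""
    combined = {"samples": [], "labels": [], "chunks": []}
    for chunk_id in range(n_chunks):
        part = func(*args, **kwargs)  # fresh noisy draw on each call
        combined["samples"].extend(part["samples"])
        combined["labels"].extend(part["labels"])
        # Replace the generator's single chunk id with a distinct one.
        combined["chunks"].extend([chunk_id] * len(part["samples"]))
    return combined


ds = multiple_chunks_sketch(noisy_generator, 3, 4, random.Random(0))
```

Here three calls of four samples each produce twelve samples whose chunk ids are 0, 1, and 2.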

mvpa.misc.data_generators.noisy_2d_fx(size_per_fx, dfx, sfx, center, noise_std=1)

mvpa.misc.data_generators.normalFeatureDataset(perlabel=50, nlabels=2, nfeatures=4, nchunks=5, means=None, nonbogus_features=None, snr=3.0)

Generate a univariate dataset with normal noise and specified means.

Keywords :
perlabel : int

Number of samples per label

nlabels : int

Number of labels in the dataset

nfeatures : int

Total number of features (including bogus features which carry no label-related signal)

nchunks : int

Number of chunks (perlabel should be multiple of nchunks)

means : None or list of float or ndarray

Specified means for each of the nfeatures features.

nonbogus_features : None or list of int

Indexes of non-bogus features (1 per label)

snr : float

Signal-to-noise ratio, assuming that the signal has std 1.0, so random normal noise is simply divided by snr

It is probably a generalization of pureMultivariateSignal, where means=[ [0,1], [1,0] ]

Specify either means or nonbogus_features, so that means get assigned accordingly
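The parameters above could interact roughly as in this standard-library sketch. Several details are assumptions made for illustration: means is taken to hold one row per label, a non-bogus feature gets mean 1.0, chunk ids are assigned round-robin within each label, and `seed` and the function name are hypothetical.

```python
import random


def normal_feature_dataset_sketch(perlabel=50, nlabels=2, nfeatures=4,
                                  nchunks=5, means=None,
                                  nonbogus_features=None, snr=3.0, seed=None):
    """Return (samples, labels, chunks) as plain lists; illustrative only."""
    rng = random.Random(seed)
    if perlabel % nchunks != 0:
        raise ValueError("perlabel should be a multiple of nchunks")
    if means is None and nonbogus_features is not None:
        # One informative feature per label gets mean 1.0; the rest stay at 0.
        means = [[0.0] * nfeatures for _ in range(nlabels)]
        for label, feature in enumerate(nonbogus_features):
            means[label][feature] = 1.0
    samples, labels, chunks = [], [], []
    for label in range(nlabels):
        for i in range(perlabel):
            # Unit-variance normal noise divided by snr.
            row = [rng.gauss(0.0, 1.0) / snr for _ in range(nfeatures)]
            if means is not None:
                row = [v + m for v, m in zip(row, means[label])]
            samples.append(row)
            labels.append(label)
            chunks.append(i % nchunks)  # assumed round-robin assignment
    return samples, labels, chunks


samples, labels, chunks = normal_feature_dataset_sketch(
    perlabel=500, nlabels=2, nfeatures=3, nchunks=5,
    nonbogus_features=[0, 2], snr=10.0, seed=1)
```

In this call feature 0 carries the signal for label 0, feature 2 for label 1, and feature 1 is bogus for both.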

mvpa.misc.data_generators.normalFeatureDataset__(dataset=None, labels=None, nchunks=None, perlabel=50, activation_probability_steps=1, randomseed=None, randomvoxels=False)

NOT FINISHED

mvpa.misc.data_generators.pureMultivariateSignal(patterns, signal2noise=1.5, chunks=None)

Create a 2d dataset with a clear multivariate signal, but no univariate information.

%%%%%%%%%
% O % X %
%%%%%%%%%
% X % O %
%%%%%%%%%
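
The layout in the diagram is an XOR-like arrangement: each class occupies one diagonal, so neither feature alone separates the classes. A minimal standard-library sketch of such data follows; placing the quadrant centres at plus or minus signal2noise with unit-variance noise is an assumption, and `xor_signal_sketch`, `per_quadrant`, and `seed` are hypothetical names.

```python
import random


def xor_signal_sketch(per_quadrant=25, signal2noise=1.5, seed=None):
    """Return (samples, labels) for a 2-feature XOR layout; illustrative only."""
    rng = random.Random(seed)
    # Quadrant signs follow the diagram: O on one diagonal, X on the other.
    quadrants = [((-1, 1), "O"), ((1, 1), "X"),
                 ((-1, -1), "X"), ((1, -1), "O")]
    samples, labels = [], []
    for (sx, sy), label in quadrants:
        for _ in range(per_quadrant):
            # Centre of the quadrant plus unit-variance Gaussian noise.
            samples.append((sx * signal2noise + rng.gauss(0.0, 1.0),
                            sy * signal2noise + rng.gauss(0.0, 1.0)))
            labels.append(label)
    return samples, labels


samples, labels = xor_signal_sketch(seed=0)
```

Because each class has centres at both signs of each feature, the per-class mean of either feature is zero in expectation; only the sign of the product of the two features distinguishes O from X.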

mvpa.misc.data_generators.sinModulated(n_instances, n_features, flat=False, noise=0.4)

Generate a (quite) complex multidimensional non-linear dataset

Used for regression testing. Each label is the sine of x^2 plus uniform noise
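A rough standard-library sketch of this kind of generator is shown here. It assumes that x^2 means the sum of squared features, that inputs lie in [0, pi), that flat produces evenly spaced inputs, and that the uniform noise lies in [0, noise); none of these details are confirmed by the description, and the name and `seed` argument are hypothetical.

```python
import math
import random


def sin_modulated_sketch(n_instances, n_features, flat=False, noise=0.4,
                         seed=None):
    """Return (data, labels) as plain lists; illustrative only."""
    rng = random.Random(seed)
    if flat:
        # Deterministic, evenly spaced inputs; every feature gets the same value.
        data = [[math.pi * i / n_instances] * n_features
                for i in range(n_instances)]
    else:
        data = [[rng.random() * math.pi for _ in range(n_features)]
                for _ in range(n_instances)]
    # Label: sine of the summed squared features plus uniform noise.
    labels = [math.sin(sum(v * v for v in row)) + rng.random() * noise
              for row in data]
    return data, labels


data, labels = sin_modulated_sketch(30, 2, seed=0)
```

Since the noise is uniform and non-negative in this sketch, every label falls between -1 and 1 + noise.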

mvpa.misc.data_generators.wr1996(size=200)

Generate ‘6d robot arm’ dataset (Williams and Rasmussen 1996)

Was originally created in order to test the correctness of the implementation of kernel ARD. For full details see: http://www.gaussianprocess.org/gpml/code/matlab/doc/regression.html#ard

x_1 picked randomly in [-1.932, -0.453]
x_2 picked randomly in [0.534, 3.142]
r_1 = 2.0
r_2 = 1.3
f(x_1,x_2) = r_1 cos (x_1) + r_2 cos(x_1 + x_2) + N(0,0.0025)
etc.
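
The two informative inputs and the target defined above can be sketched with the standard library alone. The sketch covers only x_1, x_2, and f; the remaining inputs behind "etc." are not reconstructed. It assumes uniform sampling and that 0.0025 in N(0,0.0025) is a variance (standard deviation 0.05); the function name and `seed` argument are hypothetical.

```python
import math
import random


def robot_arm_sketch(size=200, r_1=2.0, r_2=1.3, noise_var=0.0025, seed=None):
    """Return (inputs, targets) for the two informative inputs only."""
    rng = random.Random(seed)
    noise_std = math.sqrt(noise_var)  # variance 0.0025 -> std 0.05 (assumed)
    inputs, targets = [], []
    for _ in range(size):
        x_1 = rng.uniform(-1.932, -0.453)
        x_2 = rng.uniform(0.534, 3.142)
        # Noise-free target as given in the text.
        f = r_1 * math.cos(x_1) + r_2 * math.cos(x_1 + x_2)
        inputs.append((x_1, x_2))
        targets.append(f + rng.gauss(0.0, noise_std))
    return inputs, targets


inputs, targets = robot_arm_sketch(seed=0)
```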

Expected relevances:
ell_1 1.804377
ell_2 1.963956
ell_3 8.884361
ell_4 34.417657
ell_5 1081.610451
ell_6 375.445823
sigma_f 2.379139
sigma_n 0.050835