
The effect of different hyperparameters in GPR

The following example runs Gaussian Process Regression (GPR) on a simple 1D dataset using a squared exponential (i.e., Gaussian or RBF) kernel with four different sets of hyperparameters. The resulting regression solutions are finally visualized in a single figure.
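For reference, the squared exponential kernel used throughout this example is commonly written in the following textbook form (PyMVPA's KernelSquaredExponential follows the same parametrization, up to implementation details):

k(x, x') = \sigma_f^2 \exp\left( -\frac{(x - x')^2}{2 \ell^2} \right), \qquad y = f(x) + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma_n^2)

Here \sigma_f is the signal standard deviation, \ell the length-scale, and \sigma_n the standard deviation of the observation noise; these are exactly the three columns of the hyperparameter array defined below.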

As usual we start by importing all of PyMVPA:

# Let's use LaTeX for proper rendering of Greek
from matplotlib import rc
rc('text', usetex=True)

from mvpa.suite import *

The next lines build two datasets using one of PyMVPA’s data generators.

# Generate dataset for training:
train_size = 40
F = 1
dataset = data_generators.sinModulated(train_size, F)

# Generate dataset for testing:
test_size = 100
dataset_test = data_generators.sinModulated(test_size, F, flat=True)
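For readers without PyMVPA at hand, the following standalone sketch gives a rough idea of what such a sin-modulated 1D generator produces. This is an illustrative assumption, not PyMVPA's actual sinModulated implementation; the function name, noise level, and curve shape are made up for demonstration:

import numpy as N

def sin_modulated_sketch(n_samples, noise_std=0.4):
    # Hypothetical stand-in for data_generators.sinModulated:
    # random 1D inputs with sinusoidal targets plus Gaussian noise
    x = N.sort(N.random.rand(n_samples, 1), axis=0)
    y = N.sin(x[:, 0] * 2 * N.pi) + N.random.randn(n_samples) * noise_std
    return x, y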

The last configuration step is the definition of four sets of hyperparameters to be used for GPR.

# Hyperparameters. Each row is [sigma_f, length_scale, sigma_noise]
hyperparameters = N.array([[1.0, 0.2, 0.4],
                           [1.0, 0.1, 0.1],
                           [1.0, 1.0, 0.1],
                           [1.0, 0.1, 1.0]])
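To build some intuition for these values, the following standalone snippet (plain NumPy, not part of the example itself) evaluates the squared exponential kernel for two points that are 0.5 apart under the three length-scales used above:

import numpy as N

def sq_exp(x1, x2, sigma_f, length_scale):
    # Squared exponential covariance between two scalar inputs
    return sigma_f ** 2 * N.exp(-(x1 - x2) ** 2 / (2.0 * length_scale ** 2))

for ls in (0.1, 0.2, 1.0):
    print("length_scale=%.1f -> k(0, 0.5) = %.6f" % (ls, sq_exp(0.0, 0.5, 1.0, ls)))

With length_scale=0.1 the two points are essentially uncorrelated, whereas with length_scale=1.0 they remain strongly correlated; this is why small length-scales produce wiggly fits and large ones produce smooth fits in the final figure.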

The actual GPR runs and the plotting of the final figure are performed in a single loop, one subplot per hyperparameter set.

rows = 2
columns = 2
P.figure(figsize=(12, 12))
for i in range(rows*columns):
    P.subplot(rows, columns, i+1)
    regression = True
    logml = True

    data_train = dataset.samples
    label_train = dataset.labels
    data_test = dataset_test.samples
    label_test = dataset_test.labels

The next lines configure a squared exponential kernel with the set of hyperparameters for the current subplot and assign the kernel to the GPR instance.

    sigma_f, length_scale, sigma_noise = hyperparameters[i, :]
    kse = KernelSquaredExponential(length_scale=length_scale,
                                   sigma_f=sigma_f)
    g = GPR(kse, sigma_noise=sigma_noise, regression=regression)
    print g

    if regression:
        g.states.enable("predicted_variances")

    if logml:
        g.states.enable("log_marginal_likelihood")

After training GPR, predictions are queried by passing the test dataset samples, and an accuracy measure is computed (RMSE in the regression case).

    g.train(dataset)
    prediction = g.predict(data_test)

    # print label_test
    # print prediction
    accuracy = None
    if regression:
        accuracy = N.sqrt(((prediction-label_test)**2).sum()/prediction.size)
        print "RMSE:", accuracy
    else:
        accuracy = (prediction.astype('l')==label_test.astype('l')).sum() \
                   / float(prediction.size)
        print "accuracy:", accuracy

The remaining code plots the training and test datasets, as well as the GPR solutions together with their predictive uncertainty.

    if F == 1:
        P.title(r"$\sigma_f=%0.2f$, $length_s=%0.2f$, $\sigma_n=%0.2f$" \
                % (sigma_f,length_scale,sigma_noise))
        P.plot(data_train, label_train, "ro", label="train")
        P.plot(data_test, prediction, "b-", label="prediction")
        P.plot(data_test, label_test, "g+", label="test")
        if regression:
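            # Dashed lines trace a one-standard-deviation band around the mean prediction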
            P.plot(data_test, prediction-N.sqrt(g.predicted_variances),
                       "b--", label=None)
            P.plot(data_test, prediction+N.sqrt(g.predicted_variances),
                       "b--", label=None)
            P.text(0.5, -0.8, "$RMSE=%.3f$" %(accuracy))
            P.text(0.5, -0.95, "$LML=%.3f$" %(g.log_marginal_likelihood))
        else:
            P.text(0.5, -0.8, "$accuracy=%s$" % accuracy)

        P.legend(loc='lower right')

    print "LML:", g.log_marginal_likelihood


if cfg.getboolean('examples', 'interactive', True):
    # show all the cool figures
    P.show()

See also

The full source code of this example is included in the PyMVPA source distribution (doc/examples/gpr.py).