This content refers to the previous stable release of PyMVPA. Please visit www.pymvpa.org for the most recent version of PyMVPA and its documentation.

clfs.distance

Module: clfs.distance

Distance functions to be used in kernels and elsewhere

Functions

mvpa.clfs.distance.absminDistance(a, b)

Returns the distance max(|a-b|). XXX There must be a better name! XXX Actually, why is it absmin and not absmax?

Useful to select a whole cube of a given “radius”
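As a rough illustration of the cube selection, here is a minimal sketch in plain NumPy. It does not call PyMVPA; the helper name `absmax_sketch` and the toy grid are assumptions made for this example. A point lies within "radius" r of the center exactly when every coordinate differs by at most r, which is what makes the selected region a cube.

```python
import numpy as N

def absmax_sketch(a, b):
    """Hypothetical helper: largest absolute coordinate difference, max(|a-b|)."""
    return N.abs(N.asarray(a) - N.asarray(b)).max()

# All integer coordinates of a small 5x5x5 grid (toy data for illustration)
grid = N.array(list(N.ndindex(5, 5, 5)))
center = N.array([2, 2, 2])
radius = 1

# Keep every grid point whose distance to the center is within the radius
selected = N.array([p for p in grid if absmax_sketch(p, center) <= radius])
n_selected = len(selected)  # the 3x3x3 cube around the center
```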

mvpa.clfs.distance.cartesianDistance(a, b)

Return Cartesian distance between a and b

mvpa.clfs.distance.mahalanobisDistance(x, y=None, w=None)

Calculate Mahalanobis distance of the pairs of points.

Parameters:
  • x – first list of points. Rows are samples, columns are features.
  • y – second list of points (optional)
  • w (N.ndarray) – optional inverse covariance matrix between the points. It is computed if not given

The inverse covariance matrix can be calculated with either of the following:

w = N.linalg.solve(N.cov(x.T), N.identity(x.shape[1]))

or

w = N.linalg.inv(N.cov(x.T))
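The following sketch, in plain NumPy, checks that the two expressions for w agree and shows one way the distance could be evaluated from it. The helper `mahalanobis_sketch` and the all-pairs output layout are assumptions for illustration, not PyMVPA's implementation.

```python
import numpy as N

rng = N.random.RandomState(0)
x = rng.rand(50, 3)  # 50 samples (rows), 3 features (columns)

# Inverse covariance of the features, computed both ways
w_solve = N.linalg.solve(N.cov(x.T), N.identity(x.shape[1]))
w_inv = N.linalg.inv(N.cov(x.T))

def mahalanobis_sketch(x, y, w):
    """Hypothetical helper: distance between every row of x and every row of y."""
    diff = x[:, None, :] - y[None, :, :]  # shape (len(x), len(y), n_features)
    # Squared distance for each pair: (a - b)^T w (a - b)
    sq = N.einsum('ijk,kl,ijl->ij', diff, w, diff)
    return N.sqrt(N.clip(sq, 0, None))  # clip guards against round-off below zero

d = mahalanobis_sketch(x, x, w_inv)
```

With w set to the identity matrix, the same helper reduces to the ordinary Euclidean distance.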

mvpa.clfs.distance.manhattenDistance(a, b)

Return Manhattan distance between a and b
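For comparison with the Cartesian distance, a minimal NumPy sketch of the Manhattan (city-block) distance follows. The helper name `manhattan_sketch` is hypothetical and the code does not call PyMVPA.

```python
import numpy as N

def manhattan_sketch(a, b):
    """Hypothetical helper: sum of absolute coordinate differences."""
    return N.abs(N.asarray(a) - N.asarray(b)).sum()

a = N.array([0, 0, 0])
b = N.array([1, 2, 2])

d_manhattan = manhattan_sketch(a, b)        # 1 + 2 + 2
d_cartesian = N.sqrt(((a - b) ** 2).sum())  # sqrt(1 + 4 + 4)
```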

mvpa.clfs.distance.oneMinusCorrelation(X, Y)

Return one minus the correlation matrix between the rows of two matrices.

This function computes a matrix of correlations between all pairs of rows of two matrices. Unlike NumPy’s corrcoef(), this function only considers pairs across matrices and not within, i.e. the two elements of a pair never come from the same source matrix.

Both arrays need to have the same number of columns.

Parameters:
  • X (2D-array) –
  • Y (2D-array) –

Example:

>>> X = N.random.rand(20,80)
>>> Y = N.random.rand(5,80)
>>> C = oneMinusCorrelation(X, Y)
>>> print C.shape
(20, 5)
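To make the difference from corrcoef() concrete, the sketch below recomputes the across-matrix correlations in plain NumPy and compares them with the corresponding block of corrcoef(). The helper `one_minus_correlation_sketch` is an assumption for illustration and is not PyMVPA's code.

```python
import numpy as N

def one_minus_correlation_sketch(X, Y):
    """Hypothetical helper: 1 - Pearson correlation between rows of X and rows of Y."""
    X = N.asarray(X, dtype=float)
    Y = N.asarray(Y, dtype=float)
    if X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y need the same number of columns")
    # Center each row, then scale it to unit length
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    Xn = Xc / N.sqrt((Xc ** 2).sum(axis=1, keepdims=True))
    Yn = Yc / N.sqrt((Yc ** 2).sum(axis=1, keepdims=True))
    # Dot products of normalized rows are the pairwise correlations
    return 1.0 - N.dot(Xn, Yn.T)

rng = N.random.RandomState(0)
X = rng.rand(20, 80)
Y = rng.rand(5, 80)
C = one_minus_correlation_sketch(X, Y)

# corrcoef stacks both matrices, so only the off-diagonal block of its
# (25, 25) result holds the across-matrix correlations.
full = N.corrcoef(X, Y)
cross_block = full[:20, 20:]
```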

mvpa.clfs.distance.pnorm_w_python(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)

Weighted p-norm between two datasets (pure Python implementation)

||x - x’||_w = (sum_{i=1...N} (w_i*|x_i - x’_i|)**p)**(1/p)

Parameters:
  • data1 (N.ndarray) – First dataset
  • data2 (N.ndarray or None) – Optional second dataset
  • weight (N.ndarray or None) – Optional weights per 2nd dimension (features)
  • p – Power
  • heuristic (basestring) – Which heuristic to use:
      * ‘samples’ – Python sweep over the 0th dimension
      * ‘features’ – Python sweep over the 1st dimension
      * ‘auto’ – decides automatically: if the number of features (shape[1]) is much larger than the number of samples (shape[0]), use ‘samples’; otherwise use ‘features’
  • use_sq_euclidean (bool) – Whether to use squared_euclidean_distance_matrix for the computation if p==2
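The formula can be sketched directly with NumPy broadcasting. This is a simplified illustration under stated assumptions: the helper `pnorm_w_sketch` is hypothetical, it builds the full 3D difference array instead of sweeping over samples or features, and it ignores the heuristic and use_sq_euclidean options.

```python
import numpy as N

def pnorm_w_sketch(data1, data2=None, weight=None, p=2):
    """Hypothetical helper: weighted p-norm between all rows of data1 and data2."""
    data1 = N.asarray(data1, dtype=float)
    data2 = data1 if data2 is None else N.asarray(data2, dtype=float)
    if weight is None:
        weight = N.ones(data1.shape[1])  # one weight per feature
    # Broadcast to shape (n_samples1, n_samples2, n_features)
    diff = N.abs(data1[:, None, :] - data2[None, :, :])
    # (sum_i (w_i * |x_i - x'_i|)**p)**(1/p)
    return ((weight * diff) ** p).sum(axis=2) ** (1.0 / p)

data1 = N.array([[0.0, 0.0], [3.0, 4.0]])
weight = N.array([1.0, 0.25])

d2 = pnorm_w_sketch(data1, p=2)                 # unweighted, p=2
d1 = pnorm_w_sketch(data1, weight=weight, p=1)  # weighted, p=1
```

The 3D intermediate array grows with n_samples1 × n_samples2 × n_features, which is presumably why the real function offers sweeps over one dimension at a time.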

mvpa.clfs.distance.squared_euclidean_distance(data1, data2=None, weight=None)

Compute the weighted squared Euclidean distance matrix between two datasets.

Parameters:
  • data1 (N.ndarray) – first dataset
  • data2 (N.ndarray) – second dataset. If None, compute the distances between the first dataset and itself. (Defaults to None)
  • weight (N.ndarray) – vector of weights, one per dimension (feature) of the dataset (Defaults to None)
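A minimal NumPy sketch of such a distance matrix is given below. It rests on an assumption the text does not spell out, namely that each weight multiplies the squared difference of its feature, i.e. d(x, y) = sum_i w_i * (x_i - y_i)**2. The helper name `squared_euclidean_sketch` is hypothetical.

```python
import numpy as N

def squared_euclidean_sketch(data1, data2=None, weight=None):
    """Hypothetical helper: weighted squared Euclidean distances between rows."""
    data1 = N.asarray(data1, dtype=float)
    data2 = data1 if data2 is None else N.asarray(data2, dtype=float)
    if weight is None:
        weight = N.ones(data1.shape[1])  # one weight per feature
    # Assumption: d(x, y) = sum_i w_i * (x_i - y_i)**2, expanded as
    # sum(w x^2) - 2 sum(w x y) + sum(w y^2) to avoid a 3D intermediate array
    d = (((data1 ** 2) * weight).sum(axis=1)[:, None]
         - 2.0 * N.dot(data1 * weight, data2.T)
         + ((data2 ** 2) * weight).sum(axis=1)[None, :])
    return d.clip(0)  # remove tiny negative values caused by round-off

a = N.array([[0.0, 0.0], [3.0, 4.0]])
D = squared_euclidean_sketch(a)                                # unweighted
Dw = squared_euclidean_sketch(a, weight=N.array([1.0, 0.5]))   # weighted
```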