Package mvpa :: Package clfs :: Module distance

Module distance

Distance functions to be used in kernels and elsewhere

Functions

cartesianDistance(a, b)
Return Cartesian distance between a and b

absminDistance(a, b)
Returns distance max(|a-b|). XXX There must be a better name! XXX Actually, why is it absmin and not absmax?

manhattenDistance(a, b)
Return Manhattan distance between a and b

mahalanobisDistance(x, y=None, w=None)
Calculate Mahalanobis distance of the pairs of points.

squared_euclidean_distance(data1, data2=None, weight=None)
Compute weighted euclidean distance matrix between two datasets.

oneMinusCorrelation(X, Y)
Return one minus the correlation matrix between the rows of two matrices.

pnorm_w_python(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)
Weighted p-norm between two datasets (pure Python implementation)

pnorm_w(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)
Weighted p-norm between two datasets (pure Python implementation)

Imports: N, externals, debug, warning, weave, converters

Function Details

absminDistance(a, b)

Returns distance max(|a-b|). XXX There must be a better name! XXX Actually, why is it absmin and not absmax?

Useful to select a whole cube of a given "radius"
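The three elementwise distances listed above can be sketched in NumPy as follows. This is only an illustration, assuming a and b are 1-D arrays of equal length. The helper names euclidean, max_abs and manhattan are hypothetical and are not the module's own implementation. The last lines illustrate the "cube" remark: on an integer grid, the points within max(|a-b|) distance 1 of the origin form a 3x3x3 cube, while the Euclidean ball of the same radius holds far fewer points.

```python
import itertools
import numpy as np

def euclidean(a, b):
    """Straight-line (Cartesian) distance: sqrt(sum((a - b)**2))."""
    return np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2))

def max_abs(a, b):
    """Largest absolute coordinate difference: max(|a - b|)."""
    return np.max(np.abs(np.asarray(a) - np.asarray(b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences: sum(|a - b|)."""
    return np.sum(np.abs(np.asarray(a) - np.asarray(b)))

a = np.array([0.0, 0.0, 0.0])
b = np.array([3.0, 4.0, 0.0])

d_euclid = euclidean(a, b)   # sqrt(9 + 16 + 0)
d_maxabs = max_abs(a, b)     # max(3, 4, 0)
d_manhat = manhattan(a, b)   # 3 + 4 + 0

# Integer grid from -2 to 2 along each of three axes (125 points).
origin = np.zeros(3)
grid = [np.array(p, dtype=float)
        for p in itertools.product(range(-2, 3), repeat=3)]

# Points selected by each distance with "radius" 1.
cube = [p for p in grid if max_abs(p, origin) <= 1]
ball = [p for p in grid if euclidean(p, origin) <= 1]
```

Here cube contains every grid point whose coordinates all lie in {-1, 0, 1}, whereas ball contains only the origin and its six axis neighbours.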

mahalanobisDistance(x, y=None, w=None)

Calculate Mahalanobis distance of the pairs of points.

The inverse covariance matrix can be calculated with either of the following:

w = N.linalg.solve(N.cov(x.T), N.identity(x.shape[1]))

w = N.linalg.inv(N.cov(x.T))

Parameters:

x - first list of points. Rows are samples, columns are features.
y - second list of points (optional)
w (N.ndarray) - optional inverse covariance matrix between the points. It is computed if not given
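As a rough check of the two expressions for w given above, the sketch below computes the inverse covariance both ways and uses it in the textbook Mahalanobis formula sqrt((a - b)^T w (a - b)). The helper mahalanobis_pair is hypothetical and handles a single pair of points; it does not reproduce how mahalanobisDistance iterates over its inputs or which covariance it estimates when y is given.

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(50, 3)               # 50 samples (rows), 3 features (columns)

# Feature covariance: np.cov treats rows as variables, hence the transpose.
cov = np.cov(x.T)                 # shape (3, 3)

# Two equivalent ways to obtain the inverse covariance matrix.
w_solve = np.linalg.solve(cov, np.identity(x.shape[1]))
w_inv = np.linalg.inv(cov)

def mahalanobis_pair(a, b, w):
    """Hypothetical helper: sqrt((a - b)^T w (a - b)) for two points."""
    diff = np.asarray(a) - np.asarray(b)
    return np.sqrt(np.dot(diff, np.dot(w, diff)))

d01 = mahalanobis_pair(x[0], x[1], w_inv)
d10 = mahalanobis_pair(x[1], x[0], w_inv)

# With the identity matrix as w, the formula reduces to the Euclidean distance.
d_identity = mahalanobis_pair(x[0], x[1], np.identity(3))
```

Passing a precomputed w avoids repeating the matrix inversion when many distances are computed against the same covariance estimate.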

squared_euclidean_distance(data1, data2=None, weight=None)

Compute weighted euclidean distance matrix between two datasets.

Parameters:

data1 (N.ndarray) - first dataset
data2 (N.ndarray) - second dataset. If None, compute the euclidean distance between the first dataset versus itself. (Defaults to None)
weight (N.ndarray) - vector of weights, one per dimension (feature) of the dataset (Defaults to None)
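The parameters above can be illustrated with a small broadcasting sketch. It assumes that each weight multiplies the squared difference of its feature, i.e. D[i, j] = sum_k weight[k] * (data1[i, k] - data2[j, k])**2; the documentation does not state where the weight enters, so this convention and the name sq_euclidean_sketch are assumptions rather than the module's implementation.

```python
import numpy as np

def sq_euclidean_sketch(data1, data2=None, weight=None):
    """Hypothetical sketch of a weighted squared Euclidean distance matrix."""
    data1 = np.asarray(data1, dtype=float)
    # With no second dataset, compare the first dataset against itself.
    data2 = data1 if data2 is None else np.asarray(data2, dtype=float)
    if weight is None:
        weight = np.ones(data1.shape[1])
    # Broadcast to shape (n1, n2, n_features), then sum over features.
    diff = data1[:, np.newaxis, :] - data2[np.newaxis, :, :]
    return np.sum(weight * diff ** 2, axis=2)

data1 = np.array([[0.0, 0.0],
                  [1.0, 2.0]])
data2 = np.array([[1.0, 0.0],
                  [1.0, 2.0],
                  [3.0, 2.0]])

D = sq_euclidean_sketch(data1, data2)                  # shape (2, 3)
# A zero weight removes the second feature from the distance.
Dw = sq_euclidean_sketch(data1, data2, weight=np.array([1.0, 0.0]))
D_self = sq_euclidean_sketch(data1)                    # shape (2, 2)
```

The intermediate array has n1 * n2 * n_features elements, so this form is only practical for small inputs.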

oneMinusCorrelation(X, Y)

Return one minus the correlation matrix between the rows of two matrices.

This function computes a matrix of correlations between all pairs of rows of two matrices. Unlike NumPy's corrcoef(), it only considers pairs across the two matrices and not within them, i.e. the two elements of a pair never come from the same source matrix.

Both arrays need to have the same number of columns.

Example:

>>> import numpy as N
>>> from mvpa.clfs.distance import oneMinusCorrelation
>>> X = N.random.rand(20, 80)
>>> Y = N.random.rand(5, 80)
>>> C = oneMinusCorrelation(X, Y)
>>> print(C.shape)
(20, 5)

Parameters:

X - 2D-array
Y - 2D-array
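The cross-matrix behaviour described above can be sketched by standardizing each row and taking a matrix product. The function one_minus_corr_sketch is a hypothetical stand-in, assuming Pearson correlation between rows; it is not the module's code. For comparison, the same values are also extracted from the off-diagonal block of numpy.corrcoef, which additionally computes the within-matrix pairs.

```python
import numpy as np

def one_minus_corr_sketch(X, Y):
    """Hypothetical sketch: 1 - Pearson r between rows of X and rows of Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y need the same number of columns")
    # Standardize every row to zero mean and unit (population) std.
    Xz = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    Yz = (Y - Y.mean(axis=1, keepdims=True)) / Y.std(axis=1, keepdims=True)
    # Pearson r of two rows is the mean product of their z-scores.
    return 1.0 - np.dot(Xz, Yz.T) / X.shape[1]

rng = np.random.RandomState(0)
X = rng.rand(20, 80)
Y = rng.rand(5, 80)

C = one_minus_corr_sketch(X, Y)                      # shape (20, 5)

# numpy.corrcoef stacks X and Y into a 25 x 25 matrix; only the
# upper-right 20 x 5 block holds the across-matrix correlations.
C_ref = 1.0 - np.corrcoef(X, Y)[:X.shape[0], X.shape[0]:]
```

Computing only the cross block avoids the 20 x 20 and 5 x 5 within-matrix blocks that corrcoef() would also produce.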

pnorm_w_python(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)

Weighted p-norm between two datasets (pure Python implementation)

||x - x'||_w = (sum_{i=1...N} (w_i*|x_i - x'_i|)**p)**(1/p)

Parameters:

data1 (N.ndarray) - First dataset
data2 (N.ndarray or None) - Optional second dataset
weight (N.ndarray or None) - Optional weights per 2nd dimension (features)
p - Power
heuristic (basestring) -
Which heuristic to use:
- 'samples' -- python sweep over 0th dim
- 'features' -- python sweep over 1st dim
- 'auto' -- decide automatically: if the number of features (shape[1]) is much larger than the number of samples (shape[0]), use 'samples'; otherwise use 'features'
use_sq_euclidean (bool) - Whether to use squared_euclidean_distance for the computation if p==2
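The formula and the two sweep strategies above can be sketched as follows. The function pnorm_w_sketch is hypothetical: it follows the stated formula, with the weight applied inside the power, and loops either over samples of data1 or over features. The threshold used for 'auto' (more than ten times as many features as samples) is purely an assumption, since the documentation only says "much larger". The use_sq_euclidean shortcut is not reproduced.

```python
import numpy as np

def pnorm_w_sketch(data1, data2=None, weight=None, p=2, heuristic='auto'):
    """Hypothetical sketch of (sum_i (w_i * |x_i - x'_i|)**p)**(1/p)."""
    data1 = np.asarray(data1, dtype=float)
    data2 = data1 if data2 is None else np.asarray(data2, dtype=float)
    n1, n_features = data1.shape
    n2 = data2.shape[0]
    if weight is None:
        weight = np.ones(n_features)
    if heuristic == 'auto':
        # Assumed threshold; the real cutoff is not documented here.
        heuristic = 'samples' if n_features > 10 * n1 else 'features'

    d = np.zeros((n1, n2))
    if heuristic == 'samples':
        # Python loop over the 0th dimension; features are vectorized.
        for i in range(n1):
            diff = weight * np.abs(data1[i] - data2)        # (n2, n_features)
            d[i] = np.sum(diff ** p, axis=1)
    elif heuristic == 'features':
        # Python loop over the 1st dimension; sample pairs are vectorized.
        for k in range(n_features):
            diff = np.abs(data1[:, k][:, np.newaxis]
                          - data2[:, k][np.newaxis, :])     # (n1, n2)
            d += (weight[k] * diff) ** p
    else:
        raise ValueError("unknown heuristic: %r" % (heuristic,))
    return d ** (1.0 / p)

rng = np.random.RandomState(0)
a = rng.rand(4, 6)
b = rng.rand(3, 6)
w = rng.rand(6)

d_samples = pnorm_w_sketch(a, b, w, p=3, heuristic='samples')
d_features = pnorm_w_sketch(a, b, w, p=3, heuristic='features')
d_p1 = pnorm_w_sketch(a, b, p=1)   # unweighted, p=1
d_p2 = pnorm_w_sketch(a, b, p=2)   # unweighted, p=2
```

Both sweeps produce the same matrix; the choice only affects how many Python-level iterations are needed, which is why looping over the shorter dimension is preferable.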

pnorm_w(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)

Weighted p-norm between two datasets (pure Python implementation)

||x - x'||_w = (sum_{i=1...N} (w_i*|x_i - x'_i|)**p)**(1/p)

Parameters:

data1 (N.ndarray) - First dataset
data2 (N.ndarray or None) - Optional second dataset
weight (N.ndarray or None) - Optional weights per 2nd dimension (features)
p - Power
heuristic (basestring) -
Which heuristic to use:
- 'samples' -- python sweep over 0th dim
- 'features' -- python sweep over 1st dim
- 'auto' -- decide automatically: if the number of features (shape[1]) is much larger than the number of samples (shape[0]), use 'samples'; otherwise use 'features'
use_sq_euclidean (bool) - Whether to use squared_euclidean_distance for the computation if p==2
