Interface

Data Orientation

Data matrices may be oriented in one of two ways with respect to the observations. Functions producing a kernel matrix accept an optional orientation argument σ that specifies how the observations are arranged within the provided data matrix.

Row Orientation (Default)

An orientation of Val(:row) specifies that each observation vector corresponds to a row of the data matrix. This convention is common in statistics, where data are arranged in design matrices.

For example, for data matrix $\mathbf{X}$ consisting of observations $\mathbf{x}_1$, $\mathbf{x}_2$, $\ldots$, $\mathbf{x}_n$:

\[\mathbf{X}_{row} = \begin{bmatrix} \leftarrow \mathbf{x}_1 \rightarrow \\ \leftarrow \mathbf{x}_2 \rightarrow \\ \vdots \\ \leftarrow \mathbf{x}_n \rightarrow \end{bmatrix}\]

With row orientation, the kernel matrix of $\mathbf{X}$ will match the dimensions of $\mathbf{X}\mathbf{X}^{\intercal}$. Similarly, the kernel matrix of row-oriented data matrices $\mathbf{X}$ and $\mathbf{Y}$ will match the dimensions of $\mathbf{X}\mathbf{Y}^{\intercal}$.
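
For example, a minimal sketch with a row-oriented data matrix (the SquaredExponentialKernel constructor is assumed here; any Kernel subtype behaves the same way):

```julia
using MLKernels

κ = SquaredExponentialKernel()      # assumed kernel constructor

X = rand(20, 5)                     # 20 observations as rows, 5 features
K = kernelmatrix(Val(:row), κ, X)   # kernelmatrix is documented below

size(K) == size(X * X')             # true: the kernel matrix is 20×20
```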

Column Orientation

An orientation of Val(:col) specifies that each observation vector corresponds to a column of the data matrix:

\[\mathbf{X}_{col} = \mathbf{X}_{row}^{\intercal} = \begin{bmatrix} \uparrow & \uparrow & & \uparrow \\ \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_n \\ \downarrow & \downarrow & & \downarrow \end{bmatrix}\]

With column orientation, the kernel matrix of $\mathbf{X}$ will match the dimensions of $\mathbf{X}^{\intercal}\mathbf{X}$. Similarly, the kernel matrix of column-oriented data matrices $\mathbf{X}$ and $\mathbf{Y}$ will match the dimensions of $\mathbf{X}^{\intercal}\mathbf{Y}$.
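
The corresponding sketch for column-oriented data (again assuming the SquaredExponentialKernel constructor):

```julia
using MLKernels

κ = SquaredExponentialKernel()      # assumed kernel constructor

Xc = rand(5, 20)                    # 5 features, 20 observations as columns
K = kernelmatrix(Val(:col), κ, Xc)

size(K) == size(Xc' * Xc)           # true: the kernel matrix is 20×20
```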

Essentials

MLKernels.ismercer — Method.
ismercer(κ::Kernel)

Returns true if kernel κ is a Mercer kernel; false otherwise.

MLKernels.isnegdef — Method.
isnegdef(κ::Kernel)

Returns true if the kernel κ is a negative definite kernel; false otherwise.

isstationary(κ::Kernel)

Returns true if the kernel κ is a stationary kernel; false otherwise.

isisotropic(κ::Kernel)

Returns true if the kernel κ is an isotropic kernel; false otherwise.

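As a sketch of how the four traits above may be queried, assuming the SquaredExponentialKernel constructor:

```julia
using MLKernels

κ = SquaredExponentialKernel()   # assumed kernel constructor

ismercer(κ)       # true:  the squared exponential kernel is positive definite
isnegdef(κ)       # false: it is not a negative definite kernel
isstationary(κ)   # true:  it depends only on the difference x - y
isisotropic(κ)    # true:  it depends only on the norm of x - y
```
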
MLKernels.kernel — Method.
kernel(κ::Kernel, x, y)

Apply the kernel κ to $x$ and $y$, where $x$ and $y$ are scalars or vectors of some subtype of Real.

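A minimal sketch of pointwise evaluation, assuming the SquaredExponentialKernel constructor:

```julia
using MLKernels

κ = SquaredExponentialKernel()       # assumed kernel constructor

kernel(κ, 1.0, 2.0)                  # scalar arguments
kernel(κ, [1.0, 2.0], [3.0, 4.0])    # vector arguments of equal length
```
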
MLKernels.Orientation — Constant.
Orientation

Union of the two Val types representing the data matrix orientations:

  1. Val{:row} identifies when each observation vector corresponds to a row of the data matrix
  2. Val{:col} identifies when each observation vector corresponds to a column of the data matrix
kernelmatrix([σ::Orientation,] κ::Kernel, X::Matrix [, symmetrize::Bool])

Calculate the kernel matrix of X with respect to kernel κ.

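A sketch of the one-matrix form, assuming the SquaredExponentialKernel constructor (row orientation is the default, per the Data Orientation section above):

```julia
using MLKernels

κ = SquaredExponentialKernel()   # assumed kernel constructor
X = rand(10, 3)                  # 10 observations as rows

K = kernelmatrix(κ, X)           # 10×10 symmetric kernel matrix
```
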
kernelmatrix!(σ::Orientation, K::Matrix, κ::Kernel, X::Matrix, symmetrize::Bool)

In-place version of kernelmatrix where pre-allocated matrix K will be overwritten with the kernel matrix.

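A sketch of the in-place form with a pre-allocated output buffer:

```julia
using MLKernels

κ = SquaredExponentialKernel()            # assumed kernel constructor
X = rand(10, 3)

K = Matrix{Float64}(undef, 10, 10)        # pre-allocated n×n output
kernelmatrix!(Val(:row), K, κ, X, true)   # overwrites K; symmetrize = true
```
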
kernelmatrix([σ::Orientation,] κ::Kernel, X::Matrix, Y::Matrix)

Calculate the kernel matrix of X and Y with respect to kernel κ.

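A sketch of the two-matrix form; the observations of X and Y must share the same feature dimension:

```julia
using MLKernels

κ = SquaredExponentialKernel()   # assumed kernel constructor
X = rand(10, 3)                  # 10 observations as rows
Y = rand(4, 3)                   # 4 observations over the same 3 features

K = kernelmatrix(κ, X, Y)        # 10×4, the dimensions of X*Y'
```
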
kernelmatrix!(σ::Orientation, K::Matrix, κ::Kernel, X::Matrix, Y::Matrix)

In-place version of kernelmatrix where pre-allocated matrix K will be overwritten with the kernel matrix.

centerkernelmatrix(K::Matrix)

Centers the (rectangular) kernel matrix K with respect to the implicit reproducing kernel Hilbert space according to the following formula:

\[[\mathbf{K}]_{ij} = \langle\phi(\mathbf{x}_i) -\mathbf{\mu}_{\phi\mathbf{x}}, \phi(\mathbf{y}_j) - \mathbf{\mu}_{\phi\mathbf{y}} \rangle\]

where $\mathbf{\mu}_{\phi\mathbf{x}}$ and $\mathbf{\mu}_{\phi\mathbf{y}}$ are given by:

\[\mathbf{\mu}_{\phi\mathbf{x}} = \frac{1}{n} \sum_{i=1}^n \phi(\mathbf{x}_i) \qquad \qquad \mathbf{\mu}_{\phi\mathbf{y}} = \frac{1}{m} \sum_{i=1}^m \phi(\mathbf{y}_i)\]
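A sketch of centering a square kernel matrix, assuming the SquaredExponentialKernel constructor. Since centering subtracts the implicit feature-space mean, every row and column of the result sums to zero up to floating-point error:

```julia
using MLKernels

κ = SquaredExponentialKernel()   # assumed kernel constructor
X = rand(10, 3)

Kc = centerkernelmatrix(kernelmatrix(κ, X))

maximum(abs, sum(Kc, dims=1))    # ≈ 0: column sums vanish after centering
```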

Approximation

In many cases, a fast approximate result is more important than an exact result. The Nystrom method generates a factorization that can be used to approximate a large, symmetric kernel matrix. Given a data matrix $\mathbf{X} \in \mathbb{R}^{n \times p}$ (one observation per row) and kernel matrix $\mathbf{K} \in \mathbb{R}^{n \times n}$, the Nystrom method takes a sample $S$ of the observations of $\mathbf{X}$ of size $s < n$ and generates a factorization such that:

\[\mathbf{K} \approx \mathbf{C}^{\intercal}\mathbf{WC}\]

where $\mathbf{W}$ is the $s \times s$ pseudo-inverse of the sample kernel matrix based on $S$, and $\mathbf{C}$ is an $s \times n$ matrix.
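
In the standard Nystrom construction (stated here as an assumption, since the construction is not spelled out above), $\mathbf{C}$ consists of the sampled rows of $\mathbf{K}$ and $\mathbf{W}$ is the pseudo-inverse of the sample submatrix:

\[\mathbf{C} = [\mathbf{K}]_{S,:} \qquad \qquad \mathbf{W} = \left([\mathbf{K}]_{S,S}\right)^{+}\]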

The Nystrom method uses an eigendecomposition of the sample kernel matrix of $\mathbf{X}$ to estimate $\mathbf{K}$. Generally, the order of $\mathbf{K}$ must be quite large and the sampling ratio small (e.g. 15% or less) before the cost of computing the full kernel matrix exceeds that of the eigendecomposition. The method is most effective for kernels that are not a direct function of the dot product, since computing the full matrix $\mathbf{K}$ for such kernels cannot take advantage of BLAS, so the crossover point occurs for a smaller $\mathbf{K}$.

MLKernels.jl implements the Nystrom approximation:

NystromFact

Type for storing a Nystrom factorization. The factorization contains two fields: W and C as described in the nystrom documentation.

MLKernels.nystrom — Function.
nystrom([σ::Orientation,] κ::Kernel, X::Matrix, [S::Vector])

Computes a factorization for the Nystrom approximation of the square kernel matrix of data matrix X with respect to kernel κ. Returns a NystromFact struct which stores a Nystrom factorization satisfying:

\[\mathbf{K} \approx \mathbf{C}^{\intercal}\mathbf{WC}\]
nystrom(CᵀWC::NystromFact)

Compute the approximate kernel matrix based on the Nystrom factorization.

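A sketch of the full workflow, assuming the SquaredExponentialKernel constructor and that S is a vector of observation indices:

```julia
using MLKernels

κ = SquaredExponentialKernel()   # assumed kernel constructor
X = rand(100, 5)                 # 100 observations as rows

S = collect(1:4:100)             # assumed: indices of the 25 sampled observations
F = nystrom(κ, X, S)             # NystromFact storing the fields W and C

Kapprox = nystrom(F)             # ≈ kernelmatrix(κ, X), a 100×100 matrix
```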