## Random Features for Large Scale Kernel Machines

Posted: 2015-07-17 in paper of the day, research

Ali Rahimi and Ben Recht:
Random Features for Large Scale Kernel Machines
NIPS 2007

In this paper, the authors propose to map data to a low-dimensional Euclidean space, such that the inner product in this space is a close approximation of the inner product computed by a stationary (shift-invariant) kernel (in a potentially infinite-dimensional RKHS). The approach is based on Bochner’s theorem.

The central equation is this one:

$k(x,y) = \langle\phi(x), \phi(y)\rangle \approx z(x)^T z(y)$

where $z(\cdot)$ is a randomized feature map that embeds the original data into a low-dimensional Euclidean space. This makes it possible to transform the input with $z$ and then apply fast linear learning methods, approximating the result of the nonlinear kernel machine at a fraction of the cost.
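To make the "transform, then learn linearly" idea concrete, here is a minimal sketch of ridge regression on random Fourier features for a Gaussian kernel. The dataset, bandwidth `sigma`, and regularizer `lam` are all illustrative choices, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, D, sigma, lam = 200, 2, 500, 1.0, 1e-3  # assumed sizes/hyperparameters

X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)  # toy regression target

# Random Fourier feature map for the Gaussian kernel (bandwidth sigma assumed)
W = rng.normal(0.0, 1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2 * np.pi, size=D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W.T + b)  # n x D feature matrix

# Ordinary ridge regression on Z stands in for kernel ridge regression:
# solving a D x D linear system instead of an n x n one
w = np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)
pred = Z @ w
```

Since the model is linear in the transformed features, training cost scales with the number of features $D$ rather than with the number of data points cubed, which is where the speedup comes from.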

Two example random feature maps are discussed: random Fourier features $\cos(\omega^T x + b)$, with frequencies $\omega$ drawn from the Fourier transform of the shift-invariant kernel, as guaranteed by Bochner's theorem (smooth, good for interpolation); and random binning features, which partition the input space with randomly shifted grids at randomly chosen resolutions (non-smooth, good for approximating kernels that depend on the $L_1$ distance between data points).
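The Fourier case can be sketched in a few lines of NumPy for the Gaussian kernel, whose Fourier transform is again Gaussian; the bandwidth `sigma` and feature count `D` below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 5, 2000      # input dimension, number of random features (assumed)
sigma = 1.0         # Gaussian kernel bandwidth (assumed)

# For k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), Bochner's theorem gives
# frequencies omega ~ N(0, I / sigma^2); b is a uniform random phase.
W = rng.normal(0.0, 1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2 * np.pi, size=D)

def z(x):
    """Random Fourier feature map: z(x)^T z(y) approximates k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))
approx = z(x) @ z(y)
```

The $\sqrt{2/D}$ scaling makes the inner product an unbiased average over the $D$ sampled frequencies, so `approx` concentrates around `exact` as $D$ grows.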

The variance of the estimator is lowered by concatenating $K$ such random feature maps, which amounts to computing the average

$z(x)^Tz(y) = \tfrac{1}{K} \sum_{i = 1}^K z_i(x)^Tz_i(y)$
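The variance-reduction effect is easy to check empirically. The sketch below, again for an assumed Gaussian kernel, compares the approximation error of a single random Fourier feature against an average over 100 of them:

```python
import numpy as np

rng = np.random.default_rng(1)
d, sigma = 5, 1.0  # assumed input dimension and kernel bandwidth
x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))

def estimate(K):
    # Average K independent one-dimensional random Fourier features;
    # each term 2 cos(w^T x + b) cos(w^T y + b) is unbiased for k(x, y)
    W = rng.normal(0.0, 1.0 / sigma, size=(K, d))
    b = rng.uniform(0.0, 2 * np.pi, size=K)
    return np.mean(2.0 * np.cos(W @ x + b) * np.cos(W @ y + b))

errs_single = [abs(estimate(1) - exact) for _ in range(500)]
errs_avg = [abs(estimate(100) - exact) for _ in range(500)]
# The averaged estimator's error shrinks roughly like 1/sqrt(K)
```

This is the same mechanism the paper's error bounds formalize: the averaged estimator concentrates around the true kernel value as $K$ grows.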

This yields a significant speedup over solving the original (nonlinear) kernel problem, and the paper provides uniform error bounds and convergence guarantees for the approximation.