Random Features for Large Scale Kernel Machines

Posted: 2015-07-17 in paper of the day, research

Ali Rahimi and Ben Recht:
Random Features for Large Scale Kernel Machines
NIPS 2007

In this paper, the authors propose mapping data to a low-dimensional Euclidean space such that the inner product in that space closely approximates the one computed by a stationary (shift-invariant) kernel in a potentially infinite-dimensional RKHS. The approach rests on Bochner’s theorem.

The central equation is this one:

k(x,y) = \langle\phi(x), \phi(y)\rangle \approx z(x)^T z(y)

where z(.) is a randomized feature map that embeds the original data into a low-dimensional Euclidean feature space. This makes it possible to transform the inputs with z and then apply fast linear learning methods to approximate the predictions of the nonlinear kernel machine.

Two example random feature maps are discussed: random Fourier features \cos(\omega^T x + b), with \omega sampled from the Fourier transform of the shift-invariant kernel as guaranteed by Bochner’s theorem (smooth, good for interpolation); and random binning, which partitions the input space with randomly shifted grids at randomly chosen resolutions (non-smooth, good for approximating kernels that depend on the L_1 distance between data points).
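As a minimal sketch of the random Fourier feature idea (my own illustrative code, not the authors’ implementation), take the Gaussian kernel k(x,y) = \exp(-\gamma\|x-y\|^2); for this kernel Bochner’s theorem says the frequencies \omega follow a Gaussian with standard deviation \sqrt{2\gamma} per dimension, and b is uniform on [0, 2\pi]:

```python
import numpy as np

rng = np.random.default_rng(0)

def rff_map(X, D, gamma=1.0):
    """Random Fourier feature map approximating the Gaussian kernel
    k(x, y) = exp(-gamma * ||x - y||^2).

    Frequencies omega are sampled from the kernel's Fourier transform
    (a Gaussian with std sqrt(2 * gamma)); b is uniform on [0, 2*pi].
    """
    d = X.shape[1]
    omega = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    # sqrt(2/D) scaling makes z(x)^T z(y) an unbiased estimate of k(x, y)
    return np.sqrt(2.0 / D) * np.cos(X @ omega + b)

# Compare z(x)^T z(y) against the exact kernel on a few random points.
X = rng.normal(size=(5, 3))
Z = rff_map(X, D=2000, gamma=0.5)
approx = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
exact = np.exp(-0.5 * sq_dists)
print(np.abs(approx - exact).max())  # small, and shrinks as D grows
```

With D = 2000 features the entrywise error is already small, and any linear method trained on Z approximately matches the corresponding kernel machine.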

The variance of the estimator is lowered by averaging K such independently drawn random feature maps:

z(x)^Tz(y) = \tfrac{1}{K} \sum_{i = 1}^K z_i(x)^Tz_i(y)

This yields a significant speedup over solving the original (nonlinear) problem. The paper also provides error bounds and convergence guarantees for the approximation.
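The variance-reduction claim behind the averaging equation can be checked numerically. Below is a toy 1-D sketch (my own, with a unit-bandwidth Gaussian kernel assumed): each scalar feature z_i(x) = \sqrt{2}\cos(\omega_i x + b_i) gives an unbiased one-feature estimate of k(x,y), and averaging K independent features divides the variance roughly by K:

```python
import numpy as np

rng = np.random.default_rng(1)

x, y = 0.3, 1.1
exact = np.exp(-0.5 * (x - y) ** 2)  # k(x, y) = exp(-|x - y|^2 / 2)

def estimate(K):
    # For the unit-bandwidth Gaussian kernel, omega ~ N(0, 1) (its Fourier transform).
    w = rng.normal(size=K)
    b = rng.uniform(0.0, 2.0 * np.pi, size=K)
    zx = np.sqrt(2.0) * np.cos(w * x + b)
    zy = np.sqrt(2.0) * np.cos(w * y + b)
    return float((zx * zy).mean())  # (1/K) * sum_i z_i(x) z_i(y)

# Empirical variance of the estimator over repeated draws, for several K.
variances = {K: np.var([estimate(K) for _ in range(2000)]) for K in (1, 10, 100)}
print(variances)  # variance shrinks roughly like 1/K
```

The printed variances drop by about a factor of ten per tenfold increase in K, matching the 1/K behavior that the paper’s concentration bounds formalize.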

