Kurt T. Miller, Thomas L. Griffiths and Michael I. Jordan:

Nonparametric Latent Feature Models for Link Prediction

NIPS 2009

The objective of this paper is to predict links in social networks. The working assumption is that links depend on relational features between entities; the paper simultaneously infers the number of these features and learns which entities possess each feature.

Therefore, it is important to extract latent structure representing the properties of individual entities from observed data. Unlike class-based methods (e.g., the stochastic block model), the paper focuses on feature-based methods: each entity has binary-valued latent features that influence its relations with other entities.

For the features, $Z$ represents the binary matrix where each row corresponds to an entity and each column to a feature, such that $Z_{ik} = 1$ if entity $i$ possesses feature $k$. The $K\times K$-matrix $W$ contains weights $w_{kk'}$ that affect the probability of a link between entities. This matrix is independent of the entities, i.e., it does not naturally account for different types of entities (e.g., celebrities, brands, “normal” people). With the assumption that links between entities are independent given $Z$ and $W$, we arrive at the likelihood model

$$\Pr(Y \mid Z, W) = \prod_{i,j} \sigma(z_i W z_j^T)^{y_{ij}} \left(1 - \sigma(z_i W z_j^T)\right)^{1 - y_{ij}},$$

where the product runs over all pairs of entities $(i, j)$, $z_i$ denotes row $i$ of $Z$, and $\sigma(\cdot)$ is a sigmoidal function. Note that all-zero columns of $Z$ do not affect the likelihood.
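A minimal numerical sketch of this likelihood, assuming link probabilities $\sigma(z_i W z_j^T)$ (the function names are mine, not from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def link_probabilities(Z, W):
    """P(y_ij = 1) = sigma(z_i W z_j^T) for all pairs (i, j) at once."""
    return sigmoid(Z @ W @ Z.T)

def log_likelihood(Y, Z, W):
    """Log-likelihood of a binary adjacency matrix Y under the model."""
    P = link_probabilities(Z, W)
    return np.sum(Y * np.log(P) + (1 - Y) * np.log(1 - P))
```

Appending an all-zero column to `Z` (with arbitrary new entries in `W`) leaves `log_likelihood` unchanged, matching the remark above.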

The objective is to infer the posterior of the feature matrix $Z$ and the weights $W$. Miller et al. use a Gaussian prior for the weights and an Indian Buffet Process (IBP) prior for the binary feature matrix $Z$. Due to exchangeability, a Gibbs sampler is used for inference. The generative model with the IBP prior for the latent feature relational model is

$$Z \sim \mathrm{IBP}(\alpha), \qquad w_{kk'} \sim \mathcal{N}(0, \sigma_w^2), \qquad y_{ij} \sim \sigma(z_i W z_j^T).$$

The model allows incorporating known features (e.g., locations) by adding them linearly inside the likelihood model, such that

$$\Pr(y_{ij} = 1 \mid Z, W, X) = \sigma\!\left(z_i W z_j^T + \beta^T x_{ij}\right),$$

where $x_{ij}$ collects the known covariates of the pair $(i, j)$.

(The model in the paper is a bit more expressive and flexible, see Eq. (3) in the paper.)

For sampling the entries of $Z$, the following conditional is considered:

$$\Pr(z_{ik} = 1 \mid Z_{-(ik)}, W, Y) \propto \frac{m_{-i,k}}{N}\, \Pr(Y \mid z_{ik} = 1, Z_{-(ik)}, W),$$

where $m_{-i,k}$ is the number of non-zero entries in column $k$, excluding row $i$, and $N$ is the number of entities.
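A sketch of a single Gibbs update for one entry $z_{ik}$, combining the IBP conditional prior $m_{-i,k}/N$ with the link likelihood (my own simplified helper; the paper's full sampler also handles the birth of brand-new features, omitted here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_lik(Y, Z, W):
    P = sigmoid(Z @ W @ Z.T)
    return np.sum(Y * np.log(P) + (1 - Y) * np.log(1 - P))

def gibbs_update_zik(Y, Z, W, i, k, rng):
    """Resample z_ik from P(z_ik | Z_-(ik), W, Y) ∝ prior * likelihood,
    with the IBP conditional prior P(z_ik = 1) = m_{-i,k} / N."""
    N = Z.shape[0]
    m = Z[:, k].sum() - Z[i, k]  # non-zero entries in column k, excluding row i
    if m == 0:
        # Feature k is used by no other entity; creating/removing such
        # singleton features is a separate step in the full sampler.
        Z[i, k] = 0
        return Z
    logp = np.empty(2)
    for v in (0, 1):
        Z[i, k] = v
        prior = m / N if v == 1 else 1.0 - m / N
        logp[v] = np.log(prior) + log_lik(Y, Z, W)
    p1 = 1.0 / (1.0 + np.exp(logp[0] - logp[1]))
    Z[i, k] = int(rng.random() < p1)
    return Z
```

For clarity this recomputes the full likelihood for both values of $z_{ik}$; a practical implementation would only update the likelihood terms involving entity $i$.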

To sample each weight of $W$ that corresponds to non-zero features, the auxiliary-variable trick of Albert &amp; Chib (1993) is used with a probit sigmoidal function in the likelihood, which makes Gibbs sampling possible. This does not work with the logistic function, but a Metropolis–Hastings-based scheme can be used instead.
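The Albert &amp; Chib trick can be illustrated on a generic probit regression, a simplified stand-in for the paper's update of $W$ (in the model, the design-matrix rows would be built from products of the feature vectors $z_i$, $z_j$; the function name and defaults here are mine):

```python
import numpy as np
from scipy.stats import truncnorm

def albert_chib_gibbs(X, y, sigma_w=1.0, n_iter=50, seed=0):
    """Gibbs sampler for probit regression y ~ Phi(X beta) via the
    Albert & Chib (1993) auxiliary-variable trick."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    # With a N(0, sigma_w^2 I) prior, the posterior covariance of beta
    # given the auxiliary variables is fixed and can be precomputed.
    V = np.linalg.inv(X.T @ X + np.eye(d) / sigma_w**2)
    L = np.linalg.cholesky(V)
    for _ in range(n_iter):
        mu = X @ beta
        # Auxiliary u_i ~ N(mu_i, 1), truncated to (0, inf) if y_i = 1
        # and to (-inf, 0] if y_i = 0 (truncnorm uses standardized bounds).
        a = np.where(y == 1, -mu, -np.inf)
        b = np.where(y == 1, np.inf, -mu)
        u = truncnorm.rvs(a, b, loc=mu, scale=1.0, random_state=rng)
        # Conditional on u, beta has a closed-form Gaussian posterior.
        mean = V @ X.T @ u
        beta = mean + L @ rng.standard_normal(d)
    return beta
```

Replacing the probit likelihood with a logistic one destroys this conjugacy, which is why the logistic case falls back to Metropolis–Hastings.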

The authors mention that the initialization of the model is crucial. Therefore, it is initialized with the stochastic block model, which is a special case of the method proposed in this paper with a single feature per entity.
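Since a block-model fit assigns each entity exactly one class, turning those class assignments into one-hot rows yields an initial $Z$ (a hypothetical helper, not from the paper):

```python
import numpy as np

def one_hot_init(classes, n_classes):
    """Turn class assignments (length-N int array) into an N x K binary Z
    with exactly one active feature per entity, i.e., a block model."""
    Z = np.zeros((len(classes), n_classes), dtype=int)
    Z[np.arange(len(classes)), classes] = 1
    return Z
```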

Miller et al. also propose an extension to multiple (independent) relations, such that the likelihood factorizes over the relations. Features are shared, but all other parameters are set per relation (e.g., a separate weight matrix $W^{(r)}$ for each relation $r$).
