Ranking-Based Clustering of Heterogeneous Information Networks with Star Network Schema
A heterogeneous information network is an information network
composed of multiple types of objects. Clustering on such a
network may lead to a better understanding of both the hidden
structures of the network and the individual role played by
every object in each cluster. However, although clustering on
homogeneous networks has been studied for decades, clustering
on heterogeneous networks has not been addressed until recently.
A recent study proposed a new algorithm, RankClus, for
clustering on bi-typed heterogeneous networks. However, a
real-world network may consist of more than two types, and the
interactions among multi-typed objects play a key role in
disclosing the rich semantics that a network carries. In this
paper, we study clustering of multi-typed heterogeneous
networks with a star network schema and propose a novel
algorithm, NetClus, that utilizes links across multi-typed
objects to generate high-quality net-clusters. An iterative
enhancement method is developed that leads to effective
ranking-based clustering in such heterogeneous networks. Our
experiments on DBLP data show that NetClus generates more
accurate clustering results than the baseline topic model
algorithm PLSA and the recently proposed algorithm RankClus.
Further, NetClus generates informative clusters, presenting
good ranking and cluster membership information for each
attribute object in each net-cluster.
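As a rough illustration of the star network schema described above, the
following Python sketch stores a toy network in which each center object
(a paper) links to attribute objects of several types, then ranks the
attribute objects inside one candidate cluster by their share of links.
The data, the names (papers, rank_attributes), and the counting-based
ranking are all assumptions made for illustration; the abstract does not
specify NetClus's actual ranking functions or its iterative update, so
this should not be read as the authors' method.

```python
from collections import Counter, defaultdict

# Hypothetical toy data (not from the DBLP experiments): every center
# object (paper) links to attribute objects of three types.
papers = {
    "p1": {"author": ["alice", "bob"], "venue": ["KDD"],
           "term": ["clustering", "network"]},
    "p2": {"author": ["alice"], "venue": ["KDD"],
           "term": ["ranking", "network"]},
    "p3": {"author": ["carol"], "venue": ["SIGMOD"],
           "term": ["query", "index"]},
}


def rank_attributes(papers, cluster):
    """Score attribute objects by their share of links to papers in `cluster`.

    This is plain link counting, used here as a stand-in assumption for a
    ranking function; scores of each attribute type sum to 1.
    """
    counts = defaultdict(Counter)
    for paper_id in cluster:
        for attr_type, objects in papers[paper_id].items():
            counts[attr_type].update(objects)

    ranking = {}
    for attr_type, counter in counts.items():
        total = sum(counter.values())
        ranking[attr_type] = {obj: n / total for obj, n in counter.items()}
    return ranking


# Rank attribute objects within a candidate cluster made of p1 and p2.
ranking = rank_attributes(papers, ["p1", "p2"])
```

In this sketch the ranking is conditional on the cluster: an attribute
object that never links to the cluster's papers (such as "carol" here)
receives no score at all, which is the intuition behind reading a
net-cluster through the ranks of its attribute objects.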
Date: August 30, 2009
Book Title: Proc. 2009 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'09)
Type: InProceedings
Address: Paris, France
BibTeX
@InProceedings{Ranking_Based_Clustering_of_Heterogeneou,
author = "Yizhou Sun and Yintao Yu and Jiawei Han",
title = "{Ranking-Based Clustering of Heterogeneous Information Networks with Star Network Schema}",
month = "August",
year = "2009",
address = "Paris, France",
booktitle = "Proc. 2009 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'09)",
}