On Mining Web Access Logs
The proliferation of information on the World Wide Web has made
the personalization of this information space a necessity. One
possible approach to web personalization is to mine typical user
profiles from the vast amount of historical data stored in access
logs. In the absence of any a priori knowledge, unsupervised
classification or clustering methods seem to be ideally suited to
analyze the semi-structured log data of user accesses. In this paper,
we define the notion of a “user session”, as well as a dissimilarity
measure between two web sessions that captures the organization
of a web site. To extract a user access profile, we cluster the
user sessions based on the pair-wise dissimilarities using a robust
fuzzy clustering algorithm that we have developed. We report the
results of experiments with our algorithm and show that it
extracts interesting user profiles. We also show that it
outperforms association-rule-based approaches for this task.
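
The abstract names two building blocks, a "user session" and a pair-wise session dissimilarity that reflects site organization, without giving their definitions here. The sketch below is only an illustration of the general idea and is not the paper's formulation: the 30-minute timeout, the `(ip, timestamp, url)` entry layout, and the helper names `build_sessions`, `url_similarity`, and `session_dissimilarity` are all assumptions made for this example. The path-prefix similarity is a simple stand-in for a measure that captures site structure, and the fuzzy clustering step is not shown.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; assumed value, not taken from the paper


def build_sessions(entries, timeout=SESSION_TIMEOUT):
    """Group (ip, timestamp, url) log entries into sessions (sets of URLs).

    A new session starts whenever the gap between two consecutive
    requests from the same IP exceeds `timeout` seconds.
    """
    by_ip = defaultdict(list)
    for ip, ts, url in entries:
        by_ip[ip].append((ts, url))

    sessions = []
    for ip in sorted(by_ip):
        hits = sorted(by_ip[ip])
        current = {hits[0][1]}
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)
                current = set()
            current.add(url)
        sessions.append(current)
    return sessions


def path_parts(url):
    """Split a URL path into its non-empty components."""
    return [p for p in url.split("/") if p]


def url_similarity(a, b):
    """Fraction of leading path components shared by two URLs, in [0, 1]."""
    pa, pb = path_parts(a), path_parts(b)
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    longest = max(len(pa), len(pb))
    return 1.0 if longest == 0 else shared / longest


def session_dissimilarity(s, t):
    """One minus the symmetrised average best-match URL similarity."""
    if not s or not t:
        return 1.0
    s_to_t = sum(max(url_similarity(u, v) for v in t) for u in s) / len(s)
    t_to_s = sum(max(url_similarity(u, v) for u in s) for v in t) / len(t)
    return 1.0 - (s_to_t + t_to_s) / 2.0


# Hypothetical log entries: (ip, seconds since start, requested path).
log = [
    ("10.0.0.1", 0, "/courses/cs101/index.html"),
    ("10.0.0.1", 60, "/courses/cs101/hw1.html"),
    ("10.0.0.1", 4000, "/research/papers.html"),
    ("10.0.0.2", 10, "/courses/cs101/index.html"),
]
sessions = build_sessions(log)
```

The resulting pair-wise values could be collected into a dissimilarity matrix and handed to a relational clustering method; sessions that visit pages under the same directory score as close, while sessions in unrelated parts of the site score as far apart.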
Date: May 14, 2000
Book Title: Proceedings of the SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery
Type: InProceedings
Pages: 63-69
Publisher: ACM
Bibtex
@InProceedings{On_Mining_Web_Access_Logs,
author = "Anupam Joshi and Raghu Krishnapuram",
title = "{On Mining Web Access Logs}",
month = "May",
year = "2000",
pages = "63-69",
booktitle = "Proceedings of the SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery",
publisher = "ACM",
}