Scalable Distributed Change Detection from Astronomy Data Streams using Local, Asynchronous Eigen Monitoring Algorithms

This paper considers the problem of change detection using local, distributed eigen monitoring algorithms for the next generation of petascale astronomy data pipelines, such as that of the Large Synoptic Survey Telescope (LSST). This telescope will take repeat images of the night sky every 20 seconds, thereby generating 30 terabytes of calibrated imagery every night that will need to be co-analyzed with other astronomical data stored at different locations around the world. Change point detection and event classification in such data sets may provide useful insights into unique astronomical phenomena displaying astrophysically significant variations: quasars, supernovae, variable stars, and potentially hazardous asteroids. However, performing such data mining tasks on high-throughput distributed data streams is challenging. In this paper we propose a highly scalable, distributed, asynchronous algorithm for monitoring the principal components (PCs) of such dynamic data streams. We demonstrate the algorithm on a large set of distributed astronomical data to accomplish well-known astronomy tasks such as measuring variations in the fundamental plane of galaxy parameters. The proposed algorithm is provably correct (i.e., it converges to the correct PCs without centralizing any data) and can seamlessly handle changes to the data or the network. Experiments on real Sloan Digital Sky Survey (SDSS) catalogue data show the effectiveness of the algorithm.
Date: November 01, 2009
Book Title: Ninth SIAM International Conference on Data Mining (SDM)
Type: Article
Pages: 247-258

Bibtex


@InProceedings{Scalable_Distributed_Change_Detection_fr,
  author = "Hillol Kargupta and Chris Giannella and Kirk Borne and Wesley Griffin and Sugandha Arora and Kanishka Bhaduri and Kamalika Das",
  title = "{Scalable Distributed Change Detection from Astronomy Data Streams using Local, Asynchronous Eigen Monitoring Algorithms}",
  month = "November",
  year = "2009",
  pages = "247--258",
  booktitle = "Ninth SIAM International Conference on Data Mining (SDM)",
}