Scalable Distributed Change Detection from Astronomy Data Streams using Local, Asynchronous Eigen Monitoring Algorithms

This paper considers the problem of change detection using local distributed eigen monitoring algorithms for the next generation of petascale astronomy data pipelines, such as that of the Large Synoptic Survey Telescope (LSST). This telescope will take repeat images of the night sky every 20 seconds, generating 30 terabytes of calibrated imagery every night that will need to be co-analyzed with other astronomical data stored at different locations around the world. Change point detection and event classification in such data sets may provide useful insights into unique astronomical phenomena displaying astrophysically significant variations: quasars, supernovae, variable stars, and potentially hazardous asteroids. However, performing such data mining tasks on high-throughput distributed data streams is challenging. In this paper we propose a highly scalable, distributed, asynchronous algorithm for monitoring the principal components (PCs) of such dynamic data streams. We demonstrate the algorithm on a large set of distributed astronomical data to accomplish well-known astronomy tasks such as measuring variations in the fundamental plane of galaxy parameters. The proposed algorithm is provably correct (i.e., it converges to the correct PCs without centralizing any data) and can seamlessly handle changes to the data or the network. Experiments on real Sloan Digital Sky Survey (SDSS) catalogue data show the effectiveness of the algorithm.
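
As a rough illustration of what "monitoring the principal components" can mean, the sketch below checks whether a previously computed eigenvector and eigenvalue still fit the covariance of newer data by measuring the residual ||Cv - λv||. It is a centralized, single-machine toy and not the distributed asynchronous algorithm proposed in the paper. The function names, the synthetic data, and the threshold EPSILON are all assumptions made for this example.

```python
import numpy as np


def sample_covariance(data):
    """Sample covariance of `data`, where each row is one observation."""
    centered = data - data.mean(axis=0)
    return centered.T @ centered / (len(data) - 1)


def top_pc(data):
    """Return the leading eigenvector and eigenvalue of the sample covariance."""
    eigvals, eigvecs = np.linalg.eigh(sample_covariance(data))  # ascending order
    return eigvecs[:, -1], float(eigvals[-1])


def eigen_residual(data, v, lam):
    """Return ||C v - lam v||, which is near zero when (v, lam) still fits `data`."""
    cov = sample_covariance(data)
    return float(np.linalg.norm(cov @ v - lam * v))


# Synthetic data (an assumption for illustration, not SDSS data).
rng = np.random.default_rng(0)
old = rng.normal(size=(2000, 3)) * np.array([3.0, 1.0, 0.5])
changed = rng.normal(size=(2000, 3)) * np.array([0.5, 1.0, 3.0])

v, lam = top_pc(old)   # model computed once on the old data
EPSILON = 1.0          # hypothetical alert threshold

for name, data in [("old", old), ("changed", changed)]:
    residual = eigen_residual(data, v, lam)
    status = "change detected" if residual > EPSILON else "model still valid"
    print(f"{name}: residual = {residual:.3f} ({status})")
```

In this toy setup the residual is essentially zero on the data the model was fitted to and large once the dominant direction of variance moves to a different axis. A distributed version would have to evaluate a condition of this kind without gathering the data in one place.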
Date: November 30, 2008
Book Title: 2009 SIAM Data Mining Conference
Type: Article

BibTeX


@Article{Scalable_Distributed_Change_Detection_fr,
  author = "Kamalika Das and Kanishka Bhaduri and Sugandha Arora and Wesley Griffin and Kirk Borne and Chris Giannella and Hillol Kargupta",
  title = "{Scalable Distributed Change Detection from Astronomy Data Streams using Local, Asynchronous Eigen Monitoring Algorithms}",
  month = "November",
  year = "2008",
  journal = "2009 SIAM Data Mining Conference",
}