Scalable Distributed Change Detection from Astronomy Data Streams using Local, Asynchronous Eigen Monitoring Algorithms
This paper considers the problem of change detection using local
distributed eigen monitoring algorithms for the next generation
of petascale astronomy data pipelines such as the Large Synoptic
Survey Telescope (LSST). This telescope will take repeat images
of the night sky every 20 seconds, thereby generating 30 terabytes
of calibrated imagery every night that will need to be coanalyzed
with other astronomical data stored at different locations
around the world. Change point detection and event classification
in such data sets may provide useful insights into unique astronomical
phenomena displaying astrophysically significant variations:
quasars, supernovae, variable stars, and potentially hazardous asteroids.
However, performing these data mining tasks on high-throughput
distributed data streams is a challenging problem. In this
paper we propose a highly scalable and distributed asynchronous
algorithm for monitoring the principal components (PCs) of such
dynamic data streams. We demonstrate the algorithm on a large set
of distributed astronomical data to accomplish well-known astronomy
tasks such as measuring variations in the fundamental plane of
galaxy parameters. The proposed algorithm is provably correct (i.e.
converges to the correct PCs without centralizing any data) and can
seamlessly handle changes to the data or the network. Experiments
on real Sloan Digital Sky Survey (SDSS) catalogue
data show the effectiveness of the algorithm.
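
As a rough illustration of what monitoring principal components without centralizing raw data can look like, the sketch below has each node summarise its own rows as a count, a sum vector, and a sum of outer products. Only these small summaries are combined to form the global covariance, and a previously computed eigenpair is then checked against the current covariance with a residual test. This is not the paper's algorithm: the paper describes a local, asynchronous method, whereas this sketch simply adds the summaries in one place. All function names, the toy data, and the threshold `eps` are assumptions made for illustration.

```python
import math

def local_stats(rows):
    """Summarise one node's rows as (count, sum vector, sum of outer products)."""
    d = len(rows[0])
    s = [0.0] * d
    ss = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            s[i] += r[i]
            for j in range(d):
                ss[i][j] += r[i] * r[j]
    return len(rows), s, ss

def global_covariance(stats):
    """Combine per-node summaries into the covariance of the union of all rows."""
    d = len(stats[0][1])
    n = sum(c for c, _, _ in stats)
    mean = [sum(s[i] for _, s, _ in stats) / n for i in range(d)]
    return [[sum(ss[i][j] for _, _, ss in stats) / n - mean[i] * mean[j]
             for j in range(d)] for i in range(d)]

def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_pc(cov, iters=500):
    """Leading eigenpair of a nonzero covariance matrix by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iters):
        w = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, matvec(cov, v)))
    return lam, v

def pc_still_valid(cov, lam, v, eps):
    """Monitoring test: is (lam, v) still an approximate eigenpair of cov?"""
    resid = [a - lam * b for a, b in zip(matvec(cov, v), v)]
    return math.sqrt(sum(x * x for x in resid)) <= eps

# Toy data (made up): two nodes whose rows lie close to the line y = 2x.
node_a = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9]]
node_b = [[4.0, 8.2], [5.0, 9.8]]
cov = global_covariance([local_stats(node_a), local_stats(node_b)])
lam, v = top_pc(cov)
print(pc_still_valid(cov, lam, v, eps=1e-6))      # True

# node_b's stream changes: y now falls as x rises, so the old PC no longer fits.
node_b = [[4.0, -8.0], [5.0, -10.0], [6.0, -12.0]]
cov_new = global_covariance([local_stats(node_a), local_stats(node_b)])
print(pc_still_valid(cov_new, lam, v, eps=1e-6))  # False
```

A failed residual test is the signal that the principal components need to be recomputed. In a streaming setting each node would update its summary incrementally rather than rescanning its rows.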
Date: November 01, 2009
Book Title: Ninth SIAM International Conference on Data Mining (SDM)
Type: Article
Pages: 247-258
Bibtex
@InProceedings{Scalable_Distributed_Change_Detection_fr,
author = "Hillol Kargupta and Chris Giannella and Kirk Borne and Wesley Griffin and Sugandha Arora and Kanishka Bhaduri and Kamalika Das",
title = "{Scalable Distributed Change Detection from Astronomy Data Streams using Local, Asynchronous Eigen Monitoring Algorithms}",
booktitle = "Ninth SIAM International Conference on Data Mining (SDM)",
month = "November",
year = "2009",
pages = "247--258",
}