Truth Discovery with Multiple Conflicting Information Providers on the Web

The world-wide web has become the most important information source for most of us. Unfortunately, there is no guarantee for the correctness of information on the web. Moreover, different web sites often provide conflicting information on a subject, such as different specifications for the same product. In this paper we propose a new problem called Veracity, i.e., conformity to truth, which studies how to find true facts from a large amount of conflicting information on many subjects that is provided by various web sites. We design a general framework for the Veracity problem, and invent an algorithm called TruthFinder, which utilizes the relationships between web sites and their information, i.e., a web site is trustworthy if it provides many pieces of true information, and a piece of information is likely to be true if it is provided by many trustworthy web sites. Our experiments show that TruthFinder successfully finds true facts among conflicting information, and identifies trustworthy web sites better than the popular search engines.
Date: November 30, 2008
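
The mutual dependence described in the abstract (a web site is trustworthy if it provides true information, and information is likely true if trustworthy web sites provide it) can be illustrated with a simple fixed-point iteration. The sketch below is not the TruthFinder algorithm itself, whose formulas are not given here. It is a simplified stand-in under stated assumptions: every site starts with the same trust score of 0.5, the confidence of a value is the trust-weighted share of sites asserting it, and a site's trust is the mean confidence of its claims. The function name `estimate_trust`, the site names, and the product data are all hypothetical.

```python
def estimate_trust(claims, iterations=20):
    """Jointly estimate site trust and the most likely value per subject.

    claims maps site -> {subject: value}. This is an illustrative
    simplification, not the scoring used by TruthFinder.
    """
    # Assumption: all sites start out equally (and moderately) trusted.
    trust = {site: 0.5 for site in claims}
    confidence = {}
    for _ in range(iterations):
        # Step 1: score each value by the total trust of the sites asserting it.
        score = {}
        for site, facts in claims.items():
            for subject, value in facts.items():
                by_value = score.setdefault(subject, {})
                by_value[value] = by_value.get(value, 0.0) + trust[site]
        # Normalize per subject so competing values share a total of 1.
        confidence = {}
        for subject, by_value in score.items():
            total = sum(by_value.values())
            confidence[subject] = {v: s / total for v, s in by_value.items()}
        # Step 2: a site's trust is the mean confidence of what it claims.
        trust = {
            site: (sum(confidence[s][v] for s, v in facts.items()) / len(facts)
                   if facts else 0.0)
            for site, facts in claims.items()
        }
    # Pick the highest-confidence value for each subject.
    best = {s: max(by_value, key=by_value.get) for s, by_value in confidence.items()}
    return trust, best


# Hypothetical data: three sites describing the same product.
claims = {
    "site_a": {"weight": "1.2 kg", "screen": "13 in"},
    "site_b": {"weight": "1.2 kg", "screen": "13 in"},
    "site_c": {"weight": "2.0 kg", "screen": "13 in"},
}
trust, best = estimate_trust(claims)
print(best)
print(trust)
```

In this toy input, the weight asserted by two sites wins, and the dissenting site ends up with a lower trust score than the other two. A realistic system would also need to handle values that are similar but not identical, and sites that copy from one another, neither of which this sketch attempts.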
