Tracing Data Errors with View-Conditioned Causality
- Alexandra Meliou
- Wolfgang Gatterbauer
- Suman Nath
- Dan Suciu
SIGMOD'11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data
Published by ACM
A surprising query result is often an indication of errors in the query or the underlying data. Recent work suggests using causal reasoning to find explanations for the surprising result. In practice, however, one often has multiple queries and/or multiple answers, some of which may be considered correct and others unexpected. In this paper, we focus on determining the causes of a set of unexpected results, possibly conditioned on some prior knowledge of the correctness of another set of results. We call this problem View-Conditioned Causality. We adapt the definitions of causality and responsibility for the case of multiple answers/views and provide a non-trivial algorithm that reduces the problem of finding causes and their responsibility to a satisfiability problem that can be solved with existing tools. We evaluate both the accuracy and effectiveness of our approach on a real dataset of user-generated mobile device tracking data, and demonstrate that it can identify causes of error more effectively than static Boolean influence and alternative notions of causality.