Query Decorrelation in the Fabric Data Warehouse

Proceedings of the 2025 ACM SIGMOD International Conference on Management of Data

SQL subqueries allow practitioners to embed logic almost anywhere within a more general query, and both developers and automatic query generators use them to express complex SQL logic. Correlated subqueries refer to columns defined by their outer queries and can be evaluated by tuple-at-a-time execution strategies. These evaluation schemes are generally slow, except when specialized index strategies are available. Query decorrelation is a technique that removes correlation from subqueries and typically results in faster set-oriented execution. It has been an active area of research and has received renewed attention because some distributed engines do not directly support the execution of correlated subqueries. This work extends previous approaches and describes our experience in building a new decorrelation framework for the Microsoft Fabric Data Warehouse. We explain our new decorrelation approach, describe how we integrate it into the query optimizer, and report an experimental evaluation of our techniques on a complex benchmark over TPC-DS data.
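
To make the distinction between correlated and decorrelated forms concrete, the following is a minimal sketch and not the framework described in this work. It assumes a hypothetical `orders` table and uses Python's standard `sqlite3` module only as a convenient way to run both query shapes. The first query contains a scalar subquery correlated on `customer_id`; the second is a hand-written decorrelated equivalent that computes the per-customer average once with `GROUP BY` and joins it back. The two forms are equivalent here because every outer row has a matching group; in general, rewrites of this kind need extra care (for example with `COUNT` over empty groups).

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount      REAL
        );
        INSERT INTO orders VALUES
            (1, 10, 50.0), (2, 10, 150.0), (3, 20, 30.0),
            (4, 20, 40.0), (5, 20, 80.0),  (6, 30, 10.0);
        """
    )
    return conn


# Correlated form: the inner query references o.customer_id from the outer
# query, so a naive engine re-evaluates it once per outer row.
CORRELATED = """
    SELECT o.order_id
    FROM orders AS o
    WHERE o.amount > (SELECT AVG(i.amount)
                      FROM orders AS i
                      WHERE i.customer_id = o.customer_id)
    ORDER BY o.order_id
"""

# Decorrelated form: the average is computed once per customer in a
# set-oriented aggregate, then joined back to the outer rows.
DECORRELATED = """
    SELECT o.order_id
    FROM orders AS o
    JOIN (SELECT customer_id, AVG(amount) AS avg_amount
          FROM orders
          GROUP BY customer_id) AS a
      ON a.customer_id = o.customer_id
    WHERE o.amount > a.avg_amount
    ORDER BY o.order_id
"""


def run(conn, sql):
    """Execute a query and return the first column of each row as a list."""
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(run(conn, CORRELATED))
    print(run(conn, DECORRELATED))
```

Both queries return the orders whose amount exceeds the average for the same customer; the decorrelated shape simply exposes the work as an aggregate and a join, which set-oriented and distributed engines can typically execute directly.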