Unified Query Optimization in the Fabric Data Warehouse
- Nicolas Bruno
- C. Galindo-Legaria
- Milind Joshi
- Esteban Calvo Vargas
- Kabita Mahapatra
- Sharon Ravindran
- Guoheng Chen
- Ernesto Cervantes Juárez
- Beysim Sezgin
Companion of the 2024 International Conference on Management of Data
Over a decade ago, Microsoft introduced Parallel Data Warehouse (PDW), a massively parallel processing system to manage and query large amounts of data. Its optimizer was built by reusing SQL Server’s infrastructure with minimal changes, which was an effective way to bring cost-based query optimization to PDW quickly. Over time, lessons learned in production, together with architectural changes in the product (such as moving from an appliance form factor to the cloud, separation of compute and storage, and serverless components), required evolving the query optimizer in Fabric DW, Microsoft's latest offering in the cloud data warehouse space. In this paper, we describe the changes to the optimization process in Fabric DW, compare them to the earlier architecture used in PDW, and validate our new approach.