How Schema Evolution Actually Works (and why it's a coordination problem more than a technical one)
The technical mechanics that make schema evolution possible, and the coordination realities that make it work in practice.
Pipeline Patterns · Teaching
Essays on the mechanics underneath analytical data systems: storage, execution, modeling, correctness, and the tradeoffs that shape real work. New writing lands first on Substack.
By the end, you should understand what durability guarantees, what it costs to provide, and how the concern translates to analytical systems where the mechanics are mostly invisible.
By the end of this article, you will understand what consistency actually guarantees, why it's often dismissed as a "redundant" property in ACID, and what it doesn't cover.
By the end, you should understand what atomicity guarantees, what it protects against, how databases actually implement it, and what its costs are.
By the end, you will understand how table formats work, where Iceberg, Delta, and Hudi differ, and when the lakehouse architecture earns its complexity versus when it doesn't.
By the end of this article, you will be able to explain why columnar storage is the standard for analytical workloads.
By the end, you should be able to look at a BigQuery execution plan and tell a story about what the engine is doing in physical terms.