The Catalog Wars: A Four-Part Series

The format war is over. The catalog war just started. A four-part guide for senior data leaders to the architectural decision that defines enterprise data infrastructure in 2026: Apache Iceberg, Apache Polaris, Unity Catalog, and Delta Lake.

Nidhi Vichare, April 16, 2026
6 min read
Nidhi Vichare, Chief Data and AI Officer and enterprise AI platform architect
Data Architecture, Databricks, Snowflake, Apache Iceberg, Apache Polaris, Unity Catalog, Delta Lake, Lakehouse, Data Governance, CDO, Enterprise AI, Data Strategy

A Four-Part Series

The format war is over. Both creators have said so. The catalog war is the architecture decision of 2026 -- and most organizations are sleepwalking into it. This series is the longer version of the conversation I think senior architects need to have before they walk into Snowflake Summit or Databricks Data + AI Summit in June.

Every Chief Data and AI Officer I talk to right now is preparing for Summit season with the wrong question on the table.

They want to know whether their team should standardize on Delta or Iceberg. They want a tiebreaker between Snowflake and Databricks. They want a clean answer they can put in the architecture review deck before June.

The answer is that the people who built both formats just publicly admitted the question does not matter anymore. Ryan Blue, the original creator of Apache Iceberg, said there should be perhaps twenty people in the world who care about which underlying format is in use, and none of them should work in your organization.

If that is true -- and I believe it is -- then the architectural question shifts. The format is no longer where the lock-in lives. The format is no longer where governance lives. The format is no longer where engine compatibility lives. All three of those concerns moved up one layer. They moved into the catalog.

And the catalog is where the next decade of vendor lock-in is being constructed right now, while everyone is still staring at the format question.

This series covers the full landscape. The first three parts are summarized below.


The Series

Part 1: The Format War Is Over. The Catalog War Just Started.

The definitive guide to the catalog decision that defines enterprise data infrastructure in 2026. Why Delta vs. Iceberg no longer matters, why Polaris vs. Unity Catalog vs. Glue is the real fight, and the four decisions every architect has to make regardless of which catalog they choose.

18 min read | 8 diagrams | 10 predictions

Covers: format convergence evidence, the catalog as chokepoint, three contenders compared honestly, the convergence boundary where lock-in actually lives, a defensible bet, and a three-year prediction timeline.


Part 2: The Other Catalog War

Polaris and Unity Catalog are fighting over the technical catalog. But the governance layer above them is a separate battle -- and the one where most enterprises are actually spending money. This part covers the two-layer architecture, the governance catalog contenders (Atlan, Alation, Collibra, OpenMetadata, DataHub), and the investment framework.

14 min read | 3 diagrams | 5 predictions

Covers: two-layer catalog architecture, commercial vs. open-source governance catalogs, cloud-specific reality (AWS, Azure, GCP), investment decision framework, and five predictions for the governance catalog market.


Part 3: Summit Season Cheat Sheet

Informed predictions for Snowflake Summit (June 2-5) and Data + AI Summit (June 15-18) -- and the signals that actually matter. Most Summit announcements are predictable. The value is in separating the signals from the marketing.

14 min read | 12 Snowflake predictions | 19 Databricks predictions

Covers: near-certain announcements, strong predictions, medium-confidence predictions, the signals that actually matter from both Summits, and what each company must prove.


The Core Thesis

The technology converges where vendors cannot monetize. It diverges where they can.

Formats converge because there is no lock-in value in format differentiation. Governance, AI governance, semantics, and maintenance diverge because there is enormous lock-in value in each.

Understanding that boundary is the single most important insight for making a catalog decision in 2026.

[Figure: The Catalog Ecosystem in 2026]

Who This Series Is For

This series is written for Chief Data and AI Officers, enterprise architects, and senior data platform leaders who are making -- or will soon be making -- strategic decisions about their data catalog infrastructure. The analysis assumes familiarity with lakehouse architecture concepts and the Snowflake/Databricks ecosystem.

If your architecture review still has "Delta vs. Iceberg" as the top-line question, start with Part 1. If your catalog decision is already framed but you are unsure about the governance layer, start with Part 2. If you want to prepare for Summit season with a specific watchlist, start with Part 3.

The vendors will not frame this for you in June. That is what your architecture team is for.


Frequently Asked Questions

What is the Catalog Wars series about?

The Catalog Wars is a four-part guide to the data catalog decision that now defines enterprise data architecture. With the open table format question essentially settled in favor of Apache Iceberg, the next architectural battle is over the catalog layer that governs the tables: Apache Polaris, Unity Catalog, AWS Glue, Snowflake Open Catalog, and the proprietary catalogs each platform vendor is shipping.

Who is the Catalog Wars series for?

Senior data leaders making vendor and architecture decisions: Chief Data Officers, VPs of Data, principal data architects, and platform engineering leaders responsible for lakehouse strategy. The series assumes familiarity with table formats, query engines, and the basics of governance, and goes deeper than vendor marketing on the trade-offs that matter at enterprise scale.

Why does catalog choice matter for enterprise data architecture?

The catalog determines who can read and write tables, how access is governed across engines, where lineage and audit live, how ML and AI workloads access governed data, and which engines can interoperate without re-ingestion. A catalog choice is harder to reverse than a query engine choice. It locks in governance, security, and interoperability for years.
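To make that concrete, here is a minimal conceptual sketch in Python. It is not the API of Polaris, Unity Catalog, Glue, or any other real catalog: `ToyCatalog`, `TableEntry`, the engine names, and the bucket path are all hypothetical, invented for illustration. It only shows the shape of the idea: engines ask the catalog for a table by name, and the catalog decides whether they get the metadata pointer, whatever the underlying format is.

```python
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    """What a catalog tracks for one table (hypothetical, simplified)."""
    table_format: str        # e.g. "iceberg" or "delta"; engines never branch on it here
    metadata_location: str   # pointer to the table's current metadata file
    readers: set = field(default_factory=set)
    writers: set = field(default_factory=set)


class ToyCatalog:
    """In-memory stand-in for a technical catalog; not a real product API."""

    def __init__(self):
        self._tables = {}

    def register(self, name, entry):
        self._tables[name] = entry

    def load_table(self, name, principal, write=False):
        """Resolve a table name to its metadata pointer after an access check."""
        entry = self._tables[name]
        # Writers can also read; readers cannot write.
        allowed = entry.writers if write else (entry.readers | entry.writers)
        if principal not in allowed:
            action = "write" if write else "read"
            raise PermissionError(f"{principal} may not {action} {name}")
        return entry.metadata_location


catalog = ToyCatalog()
catalog.register(
    "sales.orders",
    TableEntry(
        table_format="iceberg",
        metadata_location="s3://example-bucket/sales/orders/metadata/v3.json",
        readers={"trino-bi"},
        writers={"spark-etl"},
    ),
)

# Two different engines resolve the same table through the same catalog.
print(catalog.load_table("sales.orders", principal="trino-bi"))
print(catalog.load_table("sales.orders", principal="spark-etl", write=True))
```

In this sketch, swapping the query engine changes nothing but the `principal` string, while swapping the catalog would mean re-creating every grant and every name-to-metadata mapping. That asymmetry is the reason the catalog choice is the harder one to reverse.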

Which catalogs does the series compare?

Apache Polaris (the Snowflake-originated open Iceberg catalog now in incubation at the Apache Software Foundation), Databricks Unity Catalog, AWS Glue, Snowflake Open Catalog, and the operational positioning of Delta Lake's catalog story. Each is evaluated on governance model, multi-engine interoperability, identity and access integration, and enterprise readiness.

About the author

Nidhi Vichare is a Chief Data and AI Officer, enterprise AI architect, and data platform executive. She writes about enterprise AI strategy, data architecture, causal measurement, AI ROI, agentic systems, and modern leadership for senior data and AI leaders.
