Enterprise AI, decoded · June 2026

November 28, 2025 · Product Review

Databricks Genie: Where It's Excellent, Where It Hits a Ceiling, and the Semantic Layer Decision Every CIO Has to Make

Databricks Genie is genuinely good at what it does — natural-language analytics over a well-governed lakehouse, with Unity Catalog and Metric Views as the meaning layer underneath — and for internal self-service it is a buy, not a debate. The usual critique, lock-in, is also the wrong one: Unity Catalog is open source, Metric Views export as YAML, and the Iceberg API is honoured, so the semantics are portable. The constraint that actually matters is expressiveness, not portability. YAML semantics can describe how a KPI is computed but cannot express class hierarchies, relationships as first-class traversable objects, or prescriptive constraints like “a claim cannot be approved without a signed adjuster report.” For BI that is exactly the right level. For a fleet of agents that reason and act across functions — touching customer, financial, and regulatory state — it is the floor those agents will grow on, and its height is set the day you deploy. The strategic error is not using Genie; it is letting its semantic layer become, by default, the substrate of your agent strategy. Deploy it for what it is, and architect the ontology-grade layer above it on purpose now, or retrofit it in eighteen months while explaining why two agents reached two different conclusions about the same customer.

17 min · AI Infrastructure & Operations · Semantic Layer & Enterprise Semantics · Knowledge Graphs & Ontology · Enterprise Knowledge Systems

§ 01

Frame


Databricks shipped Genie broadly in late 2025, made Unity Catalog Business Semantics generally available in April 2026, and is now positioning the combination as the default way to put a chatbot on top of your data lake. The pitch is clean: Genie is now generally available as a simplified interface for business users to interact with data and AI without technical knowledge of compute, queries, models, or notebooks. Unity Catalog learns the schema. Metric Views add business definitions. Genie translates the question to SQL, runs it, returns an answer. No analyst required.

It is genuinely impressive engineering. And for any CIO or CDO whose 3–5 year trajectory is agents in every function, the Genie deployment decision made today settles a much larger question tomorrow: what does the semantic substrate of your agent fleet actually look like, and who controls it?

This piece is a CIO/CDO-level read on what Genie actually is, what it cannot be, the expressive ceiling that matters for your agent strategy, and the conscious architectural choice every enterprise should make before it lets Genie loose on its lakehouse.

§ 02

What Genie actually is, in architectural terms


There are three real ways to put intelligence on top of enterprise data, and they are not interchangeable. To talk about them without internal jargon:

  • Operational Graph Wrapper — your existing database is exposed as a queryable surface, including via natural language. Nothing moves. Meaning is implicit in column names, table comments, foreign keys, and (in Databricks's case) Metric View YAML. Translation happens at query time.
  • Semantic Layer — a formal layer of business meaning sits above the data: what "Customer" means, what "Active Account" means, what "Gross Margin" computes. Data stays in source systems. The semantic layer defines concepts once, and every consumer (BI, agents, copilots) grounds against the same definitions. A semantic layer is only as expressive as the language it is written in.
  • Knowledge Graph — data and meaning are materialised together in a dedicated store, optimised for traversal and reasoning across entities. This is where multi-hop questions and structured agent memory live.

    Genie is an Operational Graph Wrapper sitting on top of a Semantic Layer expressed in YAML.

    Unity Catalog gives Genie schema, table comments, foreign-key relationships, and — critically — Metric Views. Databricks positions Metric Views as the heart of its semantic layer: Metric Views establish trusted, consistent definitions of business KPIs with semantic metadata like display names, formats, and synonyms that help both humans and AI interpret and apply those definitions with confidence. Because each metric is defined declaratively, the engine compiles and executes the underlying SQL deterministically at query time, ensuring that every consumer, whether human or AI agent, gets the same result from the same definition regardless of how or where they access it.

    That is real progress. The question is what kind of progress.

    § 03

    Genie's expressive ceiling — and why it matters for your agent strategy


    This section matters most for any leader whose strategy depends on agents proliferating across the organisation. If you read only one section, read this one.

    The portability picture is better than Databricks's critics admit

    Databricks markets Unity Catalog Business Semantics as portable, and on inspection, the claim largely holds:

  • Unity Catalog itself is Apache 2.0 open source, governed under the LF AI & Data Foundation.
  • It implements the Iceberg REST Catalog API, which Snowflake, Trino, DuckDB, and other engines can consume.
  • Metric View definitions live as YAML inside SQL views (CREATE VIEW … WITH METRICS LANGUAGE YAML) and can be retrieved programmatically — full YAML output including agent metadata is available via SQL.
  • The April 2026 GA explicitly emphasises portability: "Access and share your business semantics across tools and use cases. With SQL and API-addressable semantics, you can easily reuse them with your favorite BI tools, notebooks and AI agents — all without vendor lock-in."

    So the answer to "can we get our semantics out of Databricks?" is yes. You can.

    The real question is what you have when you get them out.

    The expressive ceiling: what YAML semantics cannot say

    Metric Views are richer than a labelled schema and poorer than an ontology. They sit in an intermediate tier, comparable to a dbt semantic model or a Cube schema. For a single-platform analytical workload, that is exactly the right level. For a cross-function, cross-system agent fleet, it is the binding constraint.

    Three things specifically cannot be expressed in Metric Views YAML:

    1. Formal class hierarchies and inheritance.

    A "Premium Customer" is a subtype of "Customer" with additional constraints and entitlements. A "B2B Customer" is a different subtype. In an ontology, these are subclass relationships with inherited properties. In Metric Views, they are filters or dimensions on a flat Customer table. An external reasoning agent cannot infer "all Premium Customers are Customers and therefore inherit base Customer policies" — that logic has to be hand-coded in every consuming system.
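
To make the difference concrete, here is a minimal sketch in plain Python rather than a real ontology language. The class names, policy names, and the `SUBCLASS_OF` and `POLICIES` tables are hypothetical illustrations; nothing here is a Metric Views or Unity Catalog feature. The point is only that once the subclass relation is declared as data, inherited policies can be derived by walking up the hierarchy instead of being restated in every consumer.

```python
# Hypothetical subclass relation: child class -> parent class.
SUBCLASS_OF = {
    "PremiumCustomer": "Customer",
    "B2BCustomer": "Customer",
}

# Hypothetical policies declared directly on each class.
POLICIES = {
    "Customer": {"gdpr_erasure", "kyc_check"},
    "PremiumCustomer": {"priority_support"},
    "B2BCustomer": {"contract_required"},
}


def ancestors(cls: str) -> list[str]:
    """Return cls followed by every superclass, nearest first."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def effective_policies(cls: str) -> set[str]:
    """Union of the policies declared on cls and on all of its superclasses."""
    result: set[str] = set()
    for c in ancestors(cls):
        result |= POLICIES.get(c, set())
    return result


# A Premium Customer picks up the base Customer policies without restating them.
print(sorted(effective_policies("PremiumCustomer")))
```

With a flat table and a `customer_tier` filter, the equivalent of `effective_policies` has to be written again wherever the distinction matters.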

    2. Relationships as first-class objects with their own semantics.

    "A Policyholder has-claim a Claim, where the Claim has-coverage a Policy, and the Coverage applies-to a Peril" — in an ontology, each of those relationships is a typed, constrained, traversable object. In Metric Views, it is a join between fact and dimension tables. The relationship loses its identity on export. An agent reasoning across Claims, Policies, and Perils gets the data but not the semantic graph that lets it traverse safely.

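As a rough illustration of what "first-class" buys, the sketch below models each relationship as a typed object and validates a traversal against those types before any data is touched. It is plain Python, not an ontology language, and the relation names and the `check_path` helper are hypothetical. For brevity it attaches `applies_to` to Policy rather than to a separate Coverage class, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A typed relationship: a name plus the classes it is allowed to connect."""
    name: str
    subject_type: str
    object_type: str


# Hypothetical relation declarations, loosely following the insurance example.
RELATIONS = {
    r.name: r
    for r in (
        Relation("has_claim", "Policyholder", "Claim"),
        Relation("has_coverage", "Claim", "Policy"),
        Relation("applies_to", "Policy", "Peril"),
    )
}


def check_path(start_type: str, path: list[str]) -> str:
    """Validate a traversal by relation name and return the type it ends on."""
    current = start_type
    for name in path:
        rel = RELATIONS[name]
        if rel.subject_type != current:
            raise TypeError(f"{name} starts at {rel.subject_type}, not {current}")
        current = rel.object_type
    return current


# A well-typed three-hop traversal from Policyholder to Peril.
print(check_path("Policyholder", ["has_claim", "has_coverage", "applies_to"]))
```

An agent that plans a traversal such as `["has_coverage"]` starting from Policyholder is rejected by `check_path` before it runs a query; a bare join carries no such declaration to check against.
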
    3. Constraints and prescriptive rules.

    "A Claim cannot be marked Approved without a SignedAdjusterReport." "An Account cannot be in state Closed and have an OpenTransaction." These are SHACL-style shapes or OWL constraints in a formal ontology. They allow a reasoning agent to refuse an unsafe action. Metric Views are descriptive — they describe how to compute KPIs. They do not encode the rules an agent must respect when acting on the data.

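The first rule above can be sketched as a guard that an acting agent must pass before it changes state. This is ordinary Python, not SHACL or OWL; in a formal ontology the rule would be declared once as data and enforced by a validator, whereas here it is hard-coded for illustration. The `Claim` class, `approve` function, and `ConstraintViolation` exception are all hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    claim_id: str
    status: str = "Open"
    documents: set[str] = field(default_factory=set)


class ConstraintViolation(Exception):
    """Raised when an action would break a prescriptive rule."""


def approve(claim: Claim) -> None:
    # Prescriptive rule from the text: no approval without a signed adjuster report.
    if "SignedAdjusterReport" not in claim.documents:
        raise ConstraintViolation(
            f"claim {claim.claim_id}: cannot approve without SignedAdjusterReport"
        )
    claim.status = "Approved"


claim = Claim("C-1")
try:
    approve(claim)  # refused: the report is missing
except ConstraintViolation as exc:
    print("refused:", exc)

claim.documents.add("SignedAdjusterReport")
approve(claim)  # now permitted
print(claim.status)
```

A metric definition can report how many claims were approved without a report after the fact; it has no place to state that the transition should have been refused.
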
    For BI and analytical consumption, none of this matters. For an agent fleet that touches customer state, financial state, or regulatory state, all three matter.

    Why this is the strategic question, not a technical one

    If your trajectory is to deploy agents across sales, service, finance, operations, supply chain, and product over the next 3–5 years, and the only enterprise-wide source of business meaning is Metric Views in Unity Catalog, then:

  • Your agents inherit a semantic layer that is expressive enough for retrieval-style questions and KPI computation, and not for entity-reasoning or constraint-enforcement.
  • Anything your agents need to reason about beyond table joins — class structure, relationship semantics, prescriptive rules — has to be hand-coded into each agent's prompt or policy layer, not inherited from the central semantic store.
  • When you later add a more expressive semantic layer (because you will), every agent grounded against Metric Views needs to be re-grounded against the new layer. That is not a small project.

    This is not a Databricks failure. Databricks is shipping the right product for the workload most enterprises actually have today. It is a forward-strategy question: the YAML semantic layer is a floor on which your agent fleet will grow, and the floor's height is set today. If your three-year roadmap requires agents that reason and act across functions, you should design the layer above Metric Views consciously, not inherit it accidentally.

    § 04

    What Genie is good at


    For a single-tenant, well-modelled lakehouse with disciplined Metric View hygiene, Genie is the right product. Specifically:

  • Conversational analytics over a governed warehouse. A regional sales lead asking "show me Q1 pipeline by region, weighted by stage" gets a usable answer in seconds, with the query inspectable.
  • Native integration with the data plane. No new infrastructure, no separate semantic server, no cross-platform identity headache. Unity Catalog handles access. Genie inherits it.
  • Inspect mode and benchmarks. Genie ships with reasoning-based query review ("Inspect uses advanced reasoning to review and improve the accuracy of Genie's generated SQL queries") and a benchmark framework with up to 500 test questions per space, scored against gold-standard SQL.
  • Reduction in analyst toil. The class of ad-hoc "can someone pull me a number" requests that used to clog the BI team's queue is largely absorbed.

    The value is not theoretical. Customers are reporting measurable wins: "Metric Views helped us standardize our metrics and dramatically cut down the business workload of reconciling numbers. Queries are significantly faster, in some cases up to 10x, dashboards are easier to build, and we've seen meaningful improvements in Genie's accuracy thanks to more consistent, pre-aggregated data." (iFood)

    For internal analytical self-service over a Databricks-centric estate, this is a buy decision, not a debate.

    § 05

    What Genie cannot be (and what most buyers will discover the hard way)


    The architectural ceiling of an Operational Graph Wrapper sitting on a YAML semantic layer is real, and Databricks's own documentation tells you what it is if you read carefully.

    1. The product is only as good as your Metric View hygiene

    Databricks is explicit: without Metric Views configured properly, Genie's SQL generation degrades. Their own best-practice guidance walks you through a debugging hierarchy — "If an answer is wrong, examine your assets in this order: SQL queries, SQL expressions, and finally general instructions" — that exposes the dependency. The semantic work has to happen upstream, or the chat is not worth shipping.

    Most enterprise lakehouses have the data but not the discipline. Genie deployments that quietly underperform because the underlying Metric Views were never written, never reviewed, or never versioned will be the dominant failure mode of 2026.

    2. Text-to-SQL on real enterprise schemas is harder than the demos suggest

    This is not a Databricks-specific limitation — it is the state of the art. "The Spider 2.0 benchmark, introduced in November 2024, is a more realistic benchmark, with ground truth queries over 100 lines long on tables with over 1000 columns. In April 2025, the best model's execution accuracy was only 31%. Uber built an internal Text-to-SQL application and reported 50% overlap with ground truth tables on their own evaluation set."

    Genie sits on better grounding than a raw text-to-SQL benchmark, so its real-world number is higher. But the directional truth holds: on large enterprise schemas with long, complex queries, even the best models struggle. The semantic layer is what closes that gap, and the quality and expressiveness of the semantic layer is the binding constraint, not the model.
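
For readers unfamiliar with the metric, execution accuracy asks whether a generated query returns the same rows as a gold-standard query. The sketch below shows the idea using Python's built-in `sqlite3` module and a made-up `deals` table. It is an illustration of the concept only, not the Spider 2.0 evaluator and not how Genie's benchmark feature scores answers; the function names and sample queries are assumptions.

```python
import sqlite3


def execution_match(conn: sqlite3.Connection, generated_sql: str, gold_sql: str) -> bool:
    """True when both queries return the same rows, ignoring row order."""
    generated = sorted(conn.execute(generated_sql).fetchall())
    gold = sorted(conn.execute(gold_sql).fetchall())
    return generated == gold


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Fraction of (generated, gold) pairs whose result sets match."""
    hits = 0
    for generated_sql, gold_sql in pairs:
        try:
            hits += execution_match(conn, generated_sql, gold_sql)
        except sqlite3.Error:
            pass  # a query that fails to execute counts as a miss
    return hits / len(pairs)


# Hypothetical toy data standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO deals VALUES (?, ?)",
    [("EMEA", 100), ("EMEA", 50), ("APAC", 70)],
)

gold = "SELECT region, SUM(amount) FROM deals GROUP BY region"
pairs = [
    # Different text, same result set: counts as correct.
    ("SELECT region, SUM(amount) AS total FROM deals GROUP BY region ORDER BY total", gold),
    # Runs fine but uses the wrong measure: counts as a miss.
    ("SELECT region, COUNT(*) FROM deals GROUP BY region", gold),
    # Misspelled column, fails to execute: counts as a miss.
    ("SELECT region, SUM(amout) FROM deals GROUP BY region", gold),
]
print(f"{execution_accuracy(conn, pairs):.2f}")
```

The second pair is the failure that matters in practice: the query executes, returns plausible-looking rows, and is still wrong, and only a shared definition of the measure catches it.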

    3. Inconsistent KPI definitions across functions are not a Genie problem to solve

    The textbook real-world failure mode: Finance defines gross margin post-corporate-allocation, including infrastructure overhead and shared services. Sales defines gross margin pre-allocation, on contribution basis, to motivate field performance. Both definitions are legitimate. Both are correctly implemented as Metric Views in their respective Genie Spaces.

    When the CFO and the Chief Revenue Officer both ask Genie for "gross margin by region," they get different numbers from different Spaces. Neither is wrong. Genie has no way to know which definition belongs to which conversational context, because Metric Views describe how to compute, not who is asking and why.
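
A small worked example shows how far apart the two answers can be. The figures and the simplified formula below are hypothetical; real definitions of contribution margin and corporate allocation are more involved, and this is not how either team's Metric View would actually be written.

```python
def gross_margin_pct(revenue: float, cogs: float, allocated_overhead: float = 0.0) -> float:
    """Gross margin as a percentage of revenue, optionally after allocations."""
    return (revenue - cogs - allocated_overhead) * 100 / revenue


# Hypothetical figures for a single region.
REVENUE = 1_000_000
COGS = 600_000
CORPORATE_ALLOCATION = 150_000  # infrastructure overhead and shared services

sales_view = gross_margin_pct(REVENUE, COGS)
finance_view = gross_margin_pct(REVENUE, COGS, CORPORATE_ALLOCATION)

print(f"Sales (pre-allocation):    {sales_view:.1f}%")    # 40.0%
print(f"Finance (post-allocation): {finance_view:.1f}%")  # 25.0%
```

Same region, same period, same label, and a fifteen-point gap. Both functions of the business are computing their metric correctly.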

    Databricks's GA release helps by making Metric Views portable across surfaces ("Definitions then become portable across every surface: AI/BI Dashboards, Genie, Notebooks, SQL applications, and third-party tools connected to Databricks"). That is the right direction. But portability of a definition does not enforce a single governed definition where the business needs one. The reconciliation is still a human committee problem, not a platform feature.

    4. No cross-tenant semantic consistency

    If you are running a multi-tenant analytics SaaS where each customer has their own Genie Space, you have no shared ontology across tenants. Each customer's Space has its own Metric Views, its own table comments, its own SQL expressions. The benchmarking product you want to sell ("compare your operational metrics to industry peers") is structurally impossible inside Genie alone, because "days to turn" at Customer A is defined differently from "days to turn" at Customer B, each in its own Space.

    This is not a fixable Genie bug. It is what an Operational Graph Wrapper architecturally is.

    5. No reasoning layer, no entity traversal

    Multi-hop questions — the ones that traverse relationships across entities, not just join tables — are where natural-language-to-SQL breaks.

    Genie can handle: "show me revenue by region, weighted by deal stage."

    Genie struggles with: "which customers who bought in the last 18 months have returned for service twice, have no active sales contact, and are in the same parent corporate group as another customer who churned in the previous quarter?"

    That second question is not a SQL problem. It is a graph traversal problem. The answer lives in the relationships between entities, not in any single table. A Knowledge Graph architecture handles it natively. An Operational Graph Wrapper with a YAML semantic layer does not.
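
The shape of that traversal can be sketched over a toy in-memory dataset: filter on the customer's own attributes, then hop customer to parent group, parent group to sibling customers, and sibling to churn status. Every name and value below is made up, "returned for service twice" is read as at least two visits, and a real knowledge graph would express this as a graph query rather than Python loops.

```python
from collections import defaultdict

# Hypothetical customer facts; every name and value is illustrative.
customers = {
    "acme":    {"parent": "globex",   "months_since_purchase": 6,  "service_visits": 2,
                "active_contact": False, "churned_last_quarter": False},
    "initech": {"parent": "globex",   "months_since_purchase": 30, "service_visits": 0,
                "active_contact": False, "churned_last_quarter": True},
    "hooli":   {"parent": "umbrella", "months_since_purchase": 3,  "service_visits": 4,
                "active_contact": False, "churned_last_quarter": False},
}


def at_risk(customers: dict[str, dict]) -> list[str]:
    """Customers matching the own-attribute filters whose corporate group also
    contains a different customer that churned last quarter."""
    # Index the "same parent corporate group" relationship once.
    members: dict[str, set[str]] = defaultdict(set)
    for name, c in customers.items():
        members[c["parent"]].add(name)

    hits = []
    for name, c in customers.items():
        if c["months_since_purchase"] > 18 or c["service_visits"] < 2 or c["active_contact"]:
            continue
        # Hop: customer -> parent group -> sibling customers -> churn status.
        siblings = members[c["parent"]] - {name}
        if any(customers[s]["churned_last_quarter"] for s in siblings):
            hits.append(name)
    return sorted(hits)


print(at_risk(customers))
```

Here only `acme` qualifies: `hooli` passes every filter on its own attributes but has no churned sibling, which is exactly the part of the question that lives in the relationships rather than in the customer row.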

    For agents that need to reason over entities — fraud investigation, account-relationship intelligence, supply-chain provenance, scientific discovery — Genie is the floor, not the ceiling.

    § 06

    The honest assessment


    Databricks knows the gap. The April 2026 Unity Catalog Business Semantics GA, the Metric Views portability story, the SAP semantic metadata sync ("SAP semantic metadata syncs automatically into Unity Catalog"), Genie Code's agentic capabilities ("It autonomously analyzes agent traces to fix hallucinations and tunes resource allocation before a human intervenes. Understands enterprise context: Integrated with Unity Catalog, Genie Code enforces existing governance policies and access controls. It understands business semantics and audit requirements and federates enterprise data, including data from external platforms") — these are all moves towards a richer semantic posture. The platform is converging on a real semantic layer.

    But a semantic layer expressed in YAML, governed by SQL, and limited to descriptive KPI definitions is still a better-labelled schema with metric expressions on top. It is not an ontology with formal class hierarchies, role constraints, equivalence axioms, and reasoning. It is portable, yes — Databricks has done that work — but what is portable is closer to a dbt model than to an OWL ontology.

    The gap between a portable YAML semantic layer and a portable ontology is exactly where the next generation of vertical intelligence products gets built. Vendors like Stardog (Voicebox), Palantir (Foundry Ontology + Ontology MCP), Ontop, and TopBraid EDG are explicitly building in that gap. Neo4j is doing the same on the graph-memory side. Databricks is not in that gap today, regardless of what the marketing says — and may choose never to enter it.

    § 07

    What this means for CIOs and CDOs


    Here is the decision framework, in plain language, for a leader whose strategy is to deploy agents broadly over the next 3–5 years.

    Buy and deploy Genie if…

  • Your data is centred on Databricks or you are committing to that platform anyway.
  • Your near-term use case is internal analytical self-service for business users — the "can someone pull me a number" workload.
  • You have, or will commit to building, disciplined Metric View hygiene: an owner, a review process, versioned definitions, benchmark suites for every Genie Space that matters.
  • You accept that YAML semantics are your meaning layer floor, and you are prepared to add a more expressive layer above it later if your agent strategy demands it.
  • Your near-term agent ambitions are single-tenant copilots grounded by your own warehouse, not cross-system orchestration or cross-function reasoning.

    Pause and architect a real semantic layer before deploying broadly if…

  • You are building agents that touch customer, financial, or regulatory state, where a wrong definition or a missing constraint is a remediation incident, not a dashboard footnote.
  • You are building a multi-tenant SaaS on top of Databricks where the value proposition includes cross-customer comparison, benchmarking, or shared intelligence.
  • Your data spans multiple platforms — Databricks plus Snowflake plus Oracle plus SAP — and you need a single source of meaning that survives platform churn and is expressive enough to ground agents on any of them.
  • You expect your enterprise within three years to be defined by agents reasoning over entities, relationships, and constraints — not just retrieving from tables.

    In these cases, Genie can still be the consumption surface for the warehouse analytical workload. It cannot be the meaning layer for your agent fleet. You need a real semantic layer sitting above it: portable, ontology-grade, expressive enough to model class hierarchies and constraints, and vendor-neutral. And you need to decide today whether you build it in Stardog, Ontop, TopBraid EDG, Palantir Foundry's Ontology MCP, or one of the credible alternatives.

    Add a Knowledge Graph layer above Genie if…

  • You have investigative or scientific workloads — fraud rings, AML, drug discovery, intelligence — where the relationships are the product.
  • You are building agents that need structured memory — case history, episodic chains, learned shortcuts.
  • You need multi-hop reasoning that no NL-to-SQL system can deliver at acceptable accuracy.

    In this case, Genie sits underneath as the operational analytical layer. The semantic layer above defines meaning portably and expressively. The Knowledge Graph above that handles reasoning and agent memory. Each layer has a clear job. Each layer is replaceable independently.

    § 08

    The takeaway


    Genie is the right product for the workload it was built for. Databricks has built it well, integrated it cleanly, and is iterating fast. The portability story is real — Unity Catalog is open source, Metric Views are exportable YAML, the Iceberg REST API is honoured.

    But portability is not expressiveness. And for any CIO or CDO whose three-year trajectory is agents across every function, the expressive ceiling of YAML semantics is the binding constraint — not vendor lock-in, not export friction, but what can and cannot be said in the semantic layer your agents will inherit.

    Most enterprises will discover this the hard way. An agent will act on a flat customer definition that missed a subtype constraint. A multi-tenant benchmarking product will turn out to be structurally impossible. A multi-hop question will quietly return the wrong answer with no warning. A second agent, deployed in a different function, will disagree with the first because each grounded against a different Metric View in a different Space.

    The CIO/CDO position to hold:

    Genie is excellent for internal analytical self-service over a well-governed Databricks warehouse with disciplined Metric View hygiene. It is not an ontology. Its semantic layer is portable but its expressiveness is bounded at the level of a dbt-style model — rich enough for KPI computation, not rich enough for the cross-function, constraint-aware agents most enterprises are planning to deploy. Deploy Genie for what it is. Architect the layer above it consciously today, or you will be retrofitting that layer in eighteen months while explaining to your board why two agents in two functions reached two different conclusions about the same customer.

    The strategic mistake is not deploying Genie. It is letting Genie's semantic layer become, by default, the substrate of your agent fleet. The gap between a portable YAML semantic layer and a portable ontology is where the next generation of enterprise AI is being built. Genie is on the right side of that gap as a consumption surface. It is not the meaning layer your agent strategy needs. Do not confuse the two.