Data Strategy | Intermediate | 6 min read

Data Product Thinking

Data Product Thinking treats datasets, dashboards, and ML features as products with users, owners, SLAs, roadmaps, and lifecycles, rather than as one-off project deliverables. A 'data product' has: (1) a named product manager or owner, (2) defined consumers, (3) documented SLAs (freshness, accuracy, schema stability), (4) a versioned interface, (5) a deprecation policy, and (6) measurable user satisfaction. The shift is profound: instead of building 50 dashboards on request and abandoning them, you build 8 well-managed data products that 80% of the company depends on, and you treat them with the same rigor as customer-facing software. The approach is commonly associated with Netflix and Airbnb and is now central to Data Mesh and modern data platform thinking.

Also known as: Data-as-a-Product, Data Products, DaaP, Data Product Management

The Trap

The first trap is renaming dashboards 'data products' without changing how they're built or maintained. A 'data product' that lacks a named owner, an SLA, a consumer feedback loop, and a deprecation policy is just a dataset with marketing. The second trap is building data products in isolation from real users. Without product-discovery work (interviewing the analysts, marketers, and ops people who will consume the data), you get technically beautiful, business-irrelevant products. Most failed 'data product' initiatives skipped the product management discipline that makes the term meaningful.

What to Do

Pick the 3-5 highest-value data assets (for example, a revenue dashboard, a customer 360 feature store, a churn model, or a product analytics suite). Assign a Data Product Manager to each: not just an engineer, but someone with PM skills. For each asset, define the consumers, document the SLAs, build a feedback channel, set a 6-month roadmap, and publicly deprecate stale predecessors. Treat these assets like external products, with monthly reviews, NPS-equivalent satisfaction scores, and deprecation announcements that include migration support.
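As one illustration of what "document SLAs" can mean in practice, a freshness SLA can be written down as a check that anyone can run. This is a minimal sketch, not a prescribed implementation: the function name `meets_freshness_sla`, the 24-hour window, and the timestamps are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def meets_freshness_sla(last_refreshed: datetime, max_age: timedelta, now: datetime) -> bool:
    """Return True if the data was refreshed within the allowed window."""
    return now - last_refreshed <= max_age


# Hypothetical example: an assumed SLA of "refreshed within the last 24 hours".
now = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
last_refreshed = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)

# The data is 27 hours old, so the assumed 24-hour SLA is breached.
print(meets_freshness_sla(last_refreshed, timedelta(hours=24), now))  # False
```

Writing the SLA as an executable check makes a breach visible to consumers instead of leaving it to be discovered by accident.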

Formula

Data Product Maturity Score = (Has Named Owner) + (Documented SLA) + (Versioned Interface) + (Active Consumer Feedback) + (Roadmap & Deprecation Policy), where each term is 1 if the criterion is met and 0 otherwise. A score of 5/5 is a real product; a score below 3/5 is a dataset with marketing.
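The score can be computed mechanically from a five-item checklist. The sketch below assumes each criterion is a simple yes/no; the class and function names are hypothetical, and the label for scores of 3 or 4 is an assumption, since the formula only names the 5/5 and below-3/5 cases.

```python
from dataclasses import dataclass


@dataclass
class DataProductChecklist:
    """The five yes/no criteria from the maturity formula."""
    has_named_owner: bool
    documented_sla: bool
    versioned_interface: bool
    active_consumer_feedback: bool
    roadmap_and_deprecation_policy: bool


def maturity_score(c: DataProductChecklist) -> int:
    """Count how many of the five criteria are met (0 to 5)."""
    return sum([
        c.has_named_owner,
        c.documented_sla,
        c.versioned_interface,
        c.active_consumer_feedback,
        c.roadmap_and_deprecation_policy,
    ])


def verdict(score: int) -> str:
    """Map a score to a label; the 3-4 label is an assumption, not from the article."""
    if score == 5:
        return "real product"
    if score < 3:
        return "dataset with marketing"
    return "in progress (assumed label)"


# Hypothetical asset: has an owner and an SLA, nothing else.
revenue_dashboard = DataProductChecklist(True, True, False, False, False)
score = maturity_score(revenue_dashboard)
print(score, verdict(score))  # 2 dataset with marketing
```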

In Practice

Airbnb publicly described its shift to 'Data as a Product' in 2017-2019. Before: hundreds of one-off dashboards, abandoned datasets, and analysts duplicating work. After: a curated set of ~50 enterprise data products (Customer 360, Listing Performance, Trust & Safety Insights, etc.), each with a Data Product Manager, defined SLAs, certified status, and a feedback channel. Adoption metrics are tracked, and underperforming products are sunset. This shift is widely credited with Airbnb's ability to scale data-driven decision-making across 6,000+ employees without exponentially scaling the data team.

Pro Tips

  • 01

    Apply 'will anyone notice if it breaks' as a product test. If a data asset went down for 48 hours and no one complained, it's not a product; it's exhaust. Sunset it. Most data orgs maintain about five times as many datasets as have actual users.

  • 02

    Hire actual Data Product Managers (PM background, not analyst). The skill set — user research, prioritization, roadmapping, deprecation — is identical to consumer/SaaS PM, applied to internal data consumers. Most data orgs assign 'product ownership' to engineers who lack PM training; results suffer.

  • 03

    Publish a data product catalog with status (stable/beta/deprecated), owner, SLA, and consumer ratings — like an internal app store. Catalogs without status discipline become graveyards; catalogs WITH status discipline become the canonical 'how we do data here' reference.
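The catalog described in the last tip can be pictured as a small data structure plus a check for missing fields. This is only a sketch: the `CatalogEntry` fields mirror the tip (status, owner, SLA, consumer rating), while the owner IDs, SLA strings, ratings, and the `catalog_gaps` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    STABLE = "stable"
    BETA = "beta"
    DEPRECATED = "deprecated"


@dataclass
class CatalogEntry:
    name: str
    status: Status
    owner: str              # empty string means nobody owns it
    sla: str                # free-text SLA summary; empty means undocumented
    consumer_rating: float  # hypothetical 1-5 scale


def catalog_gaps(entries: List[CatalogEntry]) -> List[str]:
    """Return names of non-deprecated entries that lack an owner or an SLA."""
    return [
        e.name
        for e in entries
        if e.status is not Status.DEPRECATED and (not e.owner or not e.sla)
    ]


# Hypothetical catalog contents.
entries = [
    CatalogEntry("Customer 360", Status.STABLE, "dpm-customer", "refreshed daily", 4.5),
    CatalogEntry("Listing Performance", Status.BETA, "", "refreshed hourly", 3.9),
    CatalogEntry("Legacy Revenue Report", Status.DEPRECATED, "", "", 2.1),
]

print(catalog_gaps(entries))  # ['Listing Performance']
```

Running a check like this on a schedule is one way to keep status discipline from eroding; deprecated entries are skipped because they are already on their way out.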

Myth vs Reality

Myth

All datasets should become data products

Reality

Productization is expensive (PM time, SLA management, support). Reserve the product treatment for the ~20% of datasets that drive ~80% of consumption. The rest can remain as 'managed datasets' or experimental. Trying to productize everything dilutes attention and usually fails.
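One way to find the small set of datasets worth productizing is to rank assets by consumption and keep adding them until the target share is covered. The sketch below assumes a per-asset query or view count is available; the function name, the asset names, and the counts are hypothetical.

```python
from typing import Dict, List


def top_consumption_share(usage: Dict[str, int], target: float = 0.8) -> List[str]:
    """Return the most-used assets that together reach `target` share of consumption."""
    total = sum(usage.values())
    if total == 0:
        return []
    selected: List[str] = []
    covered = 0
    # Walk assets from most used to least used.
    for name, count in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
        if covered / total >= target:
            break
        selected.append(name)
        covered += count
    return selected


# Hypothetical 90-day query counts per asset.
usage = {
    "revenue_dashboard": 500,
    "customer_360": 350,
    "churn_model": 100,
    "adhoc_export_a": 30,
    "old_report": 20,
}

print(top_consumption_share(usage))  # ['revenue_dashboard', 'customer_360']
```

In this made-up example, two of five assets account for 85% of consumption, so only those two would be candidates for full product treatment.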

Myth

Data Product Thinking only matters for Data Mesh organizations

Reality

Even in fully centralized warehouse environments, productization changes how datasets get built, supported, and consumed. Airbnb runs centralized data; their data product approach is what makes that center scale. Mesh + product thinking is a multiplier, but product thinking alone delivers value at any architecture.

Try it

Run the numbers.

Knowledge Check

A company maintains 850 internal dashboards. Analysis shows 600 have not been viewed in the last 90 days. The data team is overwhelmed maintaining all of them. What is the right Data Product Thinking response?
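The arithmetic behind this question can be worked through directly. The sketch below uses only the two numbers in the question and the 60-80% sunset range from this article's benchmark section; treating "not viewed in 90 days" as a sunset signal follows the 'will anyone notice if it breaks' test from the Pro Tips and is an interpretation, not a stated answer.

```python
total_dashboards = 850
unviewed_90_days = 600

# Dashboards with at least one view in the last 90 days.
viewed = total_dashboards - unviewed_90_days

# Share of the portfolio that nobody has looked at.
unviewed_share = unviewed_90_days / total_dashboards

# Benchmark from this article: 60-80% of legacy assets end up sunset or deprecated.
in_benchmark_range = 0.60 <= unviewed_share <= 0.80

print(f"{viewed} viewed, {unviewed_share:.1%} unviewed, within 60-80% range: {in_benchmark_range}")
# 250 viewed, 70.6% unviewed, within 60-80% range: True
```

On these numbers, sunsetting the 600 unviewed dashboards would leave 250 assets to triage into productized, managed, and sandbox tiers.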

Industry benchmarks

Is your number good?

Calibrate against real-world tiers. Use these ranges as targets — not absolutes.

Healthy Data Product Portfolio Composition (mid-to-large enterprises with mature data orgs)

Tier                                     Target range
Tier-1 Productized (PM, SLA, roadmap)    10-30 products max
Tier-2 Managed Datasets                  50-200 with active owners
Tier-3 Self-Serve / Sandbox              Unbounded with warnings
Sunset / Deprecated                      60-80% of legacy assets
Anti-Pattern: 500+ unowned dashboards    All severity

Source: https://medium.com/airbnb-engineering/scaling-knowledge-at-airbnb-875d73eff091

Real-world cases

Companies that lived this.

Verified narratives with the numbers that prove (or break) the concept.

Airbnb (2017-present)

Outcome: success

Airbnb formalized 'Data as a Product' in 2017, building Knowledge Repo and the Data Portal — internal product catalogs with certification status, ownership, and consumer feedback. They reduced their dashboard sprawl from thousands to a curated set of ~50 certified data products, each with a Data Product Manager, SLAs, and a feedback channel. Consumers can rate products; underperforming products are formally sunset. Adoption: 6,000+ employees self-serve trustworthy data without exponentially scaling the data team. Widely cited as a model for modern data product management.

Certified Data Products: ~50 enterprise
Internal Data Consumers: 6,000+ employees
Dashboard Reduction: Thousands → curated few
Catalog Tools Built: Knowledge Repo, Data Portal

Productization scales data trust without scaling headcount. Curate aggressively, treat the survivors as products, sunset the rest.


Spotify (2018-present)

Outcome: success

Spotify's data platform evolved alongside its famous squad/tribe model: each product squad owns the data products generated by its domain (e.g., the Personalization squad owns recommendation feature data; the Payments squad owns billing event data). A central platform team provides the paved road (Backstage portal, schema registry, lineage). Each domain data product has a named owner, a schema contract, and an SLA. The model scales with the engineering org and eliminates the central data team bottleneck.

Domain Data Products: Hundreds across squads
Platform Team Investment: Strong central paved road
Catalog Tool: Backstage (open-sourced)
Operating Model: Federated squad ownership

Data product ownership aligned to engineering squads scales naturally with the org. The platform team enables; squads own.


Hypothetical: 2,000-person FinTech (2022-2023)

Outcome: failure

A growth-stage fintech announced a 'Data Products' transformation: it renamed all 600 existing dashboards to 'Data Products', added entries to a catalog, and declared victory at month 6. No PMs were hired, no unused assets were sunset, and no SLAs were defined. By month 18, consumers were as confused as before, dashboards were still proliferating, and 'data products' had become internal shorthand for 'just another dashboard'. The CDO who launched the initiative left at month 22.

Dashboards Rebranded: 600 (no change in management)
PMs Hired: 0
Datasets Sunset: 0
Consumer Satisfaction Change: None

Renaming is not productization. Without ownership, SLAs, sunset discipline, and PM skill, 'data products' is just vocabulary inflation.
