Data Product Thinking
Data Product Thinking treats datasets, dashboards, and ML features as products with users, owners, SLAs, roadmaps, and lifecycles, rather than as one-off project deliverables. A 'data product' has: (1) a named product manager/owner, (2) defined consumers, (3) documented SLAs (freshness, accuracy, schema stability), (4) a versioned interface, (5) a deprecation policy, and (6) measurable user satisfaction. The shift is significant: instead of building 50 dashboards on request and abandoning them, you build 8 well-managed data products that 80% of the company depends on, and you maintain them with the same rigor as customer-facing software. The approach was popularized by companies such as Netflix and Airbnb and is now central to Data Mesh and modern data platform thinking.
The Trap
The trap is renaming dashboards 'data products' without changing how they're built or maintained. A 'data product' that doesn't have a named owner, SLA, consumer feedback loop, and deprecation policy is just a dataset with marketing. The other trap is building data products in isolation from real users. Without product-discovery work (interviewing the analysts, marketers, and ops people who will consume the data), you get technically beautiful but business-irrelevant products. Most failed 'data product' initiatives skipped the product management discipline that makes the term meaningful.
What to Do
Pick the 3-5 highest-value data assets (for example: revenue dashboard, customer 360 feature store, churn model, product analytics suite). Assign a Data Product Manager to each (someone with PM skills, not just an engineer). For each asset: define consumers, document SLAs, build a feedback channel, set a 6-month roadmap, and publicly deprecate stale predecessors. Treat them like external products: monthly reviews, NPS-equivalent satisfaction scores, and deprecation announcements with migration support.
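The per-asset checklist above can be sketched as a small readiness check. This is a minimal illustration, not a standard schema: the `DataProduct` class, its field names, and the example values (owner, channel name, SLA wording) are all hypothetical, and it assumes Python 3.9+.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor; field names are illustrative, not a standard."""
    name: str
    owner: str = ""                                       # named Data Product Manager
    consumers: list[str] = field(default_factory=list)    # teams that depend on it
    sla: dict[str, str] = field(default_factory=dict)     # e.g. freshness, accuracy
    feedback_channel: str = ""                            # where consumers report issues
    roadmap_months: int = 0                               # horizon of the current roadmap

    def gaps(self) -> list[str]:
        """Return the checklist items that are still missing."""
        checks = {
            "named owner": bool(self.owner),
            "defined consumers": bool(self.consumers),
            "documented SLAs": bool(self.sla),
            "feedback channel": bool(self.feedback_channel),
            "6-month roadmap": self.roadmap_months >= 6,
        }
        return [item for item, ok in checks.items() if not ok]


# A dashboard that was only renamed fails every check.
renamed_dashboard = DataProduct(name="revenue_dashboard")

# A managed product (all values hypothetical) passes.
managed = DataProduct(
    name="customer_360",
    owner="jane.doe",
    consumers=["marketing", "ops"],
    sla={"freshness": "daily by 06:00"},
    feedback_channel="#customer-360-feedback",
    roadmap_months=6,
)

print(renamed_dashboard.gaps())
print(managed.gaps())
```

An empty `gaps()` list does not prove an asset is well managed; it only shows that the minimum product scaffolding exists.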
In Practice
Airbnb publicly described their shift to 'Data as a Product' in 2017-2019. Before: hundreds of one-off dashboards, abandoned datasets, analysts duplicating work. After: a curated set of ~50 enterprise data products (Customer 360, Listing Performance, Trust & Safety Insights, etc.), each with a Data Product Manager, defined SLAs, certified status, and a feedback channel. Adoption metrics are tracked and underperforming products are sunset. This shift is widely credited with Airbnb's ability to scale data-driven decision making across 6,000+ employees without exponentially scaling the data team.
Pro Tips
- 01. Apply 'will anyone notice if it breaks' as a product test. If a data asset went down for 48 hours and no one complained, it's not a product; it's exhaust. Sunset it. Most data orgs maintain about five times as many datasets as have actual users.
- 02. Hire actual Data Product Managers (PM background, not analyst). The skill set (user research, prioritization, roadmapping, deprecation) is identical to consumer/SaaS PM, applied to internal data consumers. Most data orgs assign 'product ownership' to engineers who lack PM training, and results suffer.
- 03. Publish a data product catalog with status (stable/beta/deprecated), owner, SLA, and consumer ratings, like an internal app store. Catalogs without status discipline become graveyards; catalogs with status discipline become the canonical 'how we do data here' reference.
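The 'will anyone notice' test from the first tip can be approximated from usage logs before anything is switched off. The sketch below uses view recency as a proxy for "no one would complain", which is an assumption; the function name, the 90-day window, and the asset names and dates are hypothetical.

```python
from datetime import date, timedelta


def sunset_candidates(last_viewed, today, window_days=90):
    """Return asset names with no recorded view inside the window.

    last_viewed maps asset name -> date of the most recent view,
    or None if the asset has never been viewed.
    """
    cutoff = today - timedelta(days=window_days)
    return sorted(
        name for name, seen in last_viewed.items()
        if seen is None or seen < cutoff
    )


# Hypothetical usage log.
today = date(2024, 6, 30)
views = {
    "revenue_dashboard": date(2024, 6, 29),
    "q3_2021_campaign_report": date(2022, 1, 14),
    "churn_model_scores": date(2024, 6, 1),
    "adhoc_export_v2": None,
}

candidates = sunset_candidates(views, today)
print(candidates)
```

Treat the output as a list of candidates to announce for deprecation, not as an automatic delete list: some assets are viewed rarely but matter (for example, quarterly or audit reports).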
Myth vs Reality
Myth
“All datasets should become data products”
Reality
Productization is expensive (PM time, SLA management, support). Reserve the product treatment for the ~20% of datasets that drive ~80% of consumption. The rest can remain as 'managed datasets' or experimental. Trying to productize everything dilutes attention and usually fails.
Myth
“Data Product Thinking only matters for Data Mesh organizations”
Reality
Even in fully centralized warehouse environments, productization changes how datasets get built, supported, and consumed. Airbnb runs centralized data; its data product approach is what makes that central model scale. Mesh plus product thinking is a multiplier, but product thinking alone delivers value under any architecture.
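The 20/80 heuristic from the first myth can be checked against real consumption data rather than assumed. The sketch below ranks datasets by usage and returns the smallest top set that covers a target share of consumption. The function name, the use of monthly query counts as the consumption measure, and the numbers are hypothetical.

```python
def pareto_core(consumption, target_share=0.8):
    """Smallest set of top datasets whose combined consumption reaches target_share."""
    total = sum(consumption.values())
    if total == 0:
        return []
    ranked = sorted(consumption.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0
    for name, count in ranked:
        if running >= target_share * total:
            break
        selected.append(name)
        running += count
    return selected


# Hypothetical monthly query counts per dataset.
monthly_queries = {
    "customer_360": 5200, "revenue_daily": 3100, "listing_perf": 900,
    "ops_backlog": 300, "legacy_export": 200, "campaign_2021": 150,
    "tmp_join_test": 80, "survey_raw": 40, "old_funnel": 20, "scratch_a": 10,
}

core = pareto_core(monthly_queries)
print(core, f"{len(core) / len(monthly_queries):.0%} of datasets")
```

In this made-up example two of ten datasets cover 80% of queries. Real distributions will differ, so the 20% figure is best read as a starting hypothesis to test.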
Knowledge Check
A company maintains 850 internal dashboards. Analysis shows 600 have not been viewed in the last 90 days. The data team is overwhelmed maintaining all of them. What is the right Data Product Thinking response?
Industry benchmarks
Is your number good?
Calibrate against real-world tiers. Use these ranges as targets, not absolutes.
Healthy Data Product Portfolio Composition (mid-to-large enterprises with mature data orgs)
Tier-1 Productized (PM, SLA, roadmap): 10-30 products max
Tier-2 Managed Datasets: 50-200 with active owners
Tier-3 Self-Serve / Sandbox: Unbounded with warnings
Sunset / Deprecated: 60-80% of legacy assets
Anti-Pattern (500+ unowned dashboards): All severity
Source: https://medium.com/airbnb-engineering/scaling-knowledge-at-airbnb-875d73eff091
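To compare a portfolio against the benchmark ranges above, a short check like the following can help. It is a sketch under stated assumptions: the function and key names are hypothetical, Tier-3 is omitted because it is unbounded, and the example org is invented except for the 600-of-850 figure, which reuses the Knowledge Check numbers.

```python
# Ranges copied from the benchmark; treat them as targets, not absolutes.
BENCHMARK = {
    "tier1_productized": (10, 30),    # count of products
    "tier2_managed": (50, 200),       # count of datasets with active owners
    "sunset_share": (0.60, 0.80),     # fraction of legacy assets retired
}


def portfolio_report(tier1, tier2, legacy_total, legacy_sunset):
    """Compare a portfolio against the benchmark ranges; returns {metric: status}."""
    observed = {
        "tier1_productized": tier1,
        "tier2_managed": tier2,
        "sunset_share": legacy_sunset / legacy_total if legacy_total else 0.0,
    }
    report = {}
    for metric, value in observed.items():
        low, high = BENCHMARK[metric]
        if value < low:
            report[metric] = "below range"
        elif value > high:
            report[metric] = "above range"
        else:
            report[metric] = "in range"
    return report


# Hypothetical org: 12 productized assets, 240 managed datasets,
# and 600 of 850 legacy dashboards retired.
report = portfolio_report(tier1=12, tier2=240, legacy_total=850, legacy_sunset=600)
print(report)
```

Retiring 600 of 850 dashboards is roughly 71% of legacy assets, which sits inside the 60-80% range.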
Real-world cases
Companies that lived this.
Verified narratives with the numbers that prove (or break) the concept.
Airbnb
2017-present
Airbnb formalized 'Data as a Product' in 2017, building Knowledge Repo and the Data Portal — internal product catalogs with certification status, ownership, and consumer feedback. They reduced their dashboard sprawl from thousands to a curated set of ~50 certified data products, each with a Data Product Manager, SLAs, and a feedback channel. Consumers can rate products; underperforming products are formally sunset. Adoption: 6,000+ employees self-serve trustworthy data without exponentially scaling the data team. Widely cited as a model for modern data product management.
Certified Data Products: ~50 enterprise
Internal Data Consumers: 6,000+ employees
Dashboard Reduction: Thousands → curated few
Catalog Tools Built: Knowledge Repo, Data Portal
Productization scales data trust without scaling headcount. Curate aggressively, treat the survivors as products, sunset the rest.
Spotify
2018-present
Spotify's data platform evolved alongside its well-known squad/tribe model: each product squad owns the data products generated by its domain (e.g., the Personalization squad owns recommendation feature data; the Payments squad owns billing event data). A central platform team provides the paved road (Backstage portal, schema registry, lineage). Each domain data product has a named owner, schema contract, and SLA. The model scales with the engineering org and removes the central data team as a bottleneck.
Domain Data Products: Hundreds across squads
Platform Team Investment: Strong central paved road
Catalog Tool: Backstage (open-sourced)
Operating Model: Federated squad ownership
Data product ownership aligned to engineering squads scales naturally with the org. The platform team enables; squads own.
Hypothetical: 2,000-person FinTech
2022-2023
A growth-stage fintech announced a 'Data Products' transformation: it renamed all 600 existing dashboards to 'Data Products', added entries to a catalog, and declared victory at month 6. No PMs were hired, no unused assets were sunset, and no SLAs were defined. By month 18, consumers were as confused as before, dashboards were still proliferating, and 'data products' had become internal shorthand for 'just another dashboard'. The CDO who launched the initiative left at month 22.
Dashboards Rebranded: 600 (no change in management)
PMs Hired: 0
Datasets Sunset: 0
Consumer Satisfaction Change: None
Renaming is not productization. Without ownership, SLAs, sunset discipline, and PM skill, 'data products' is just vocabulary inflation.