134 – What Sanjeev Mohan Learned Co-Authoring “Data Products for Dummies”

Experiencing Data with Brian O'Neill (Designing for Analytics)
Experiencing Data with Brian T. O'Neill

In this episode, I’m chatting with former Gartner analyst Sanjeev Mohan, co-author of Data Products for Dummies. Throughout our conversation, Sanjeev shares his expertise on the evolution of data products and what he’s seen result from practices that prioritize solving for use cases and business value. He also shares a new approach to structuring organizations so that ownership and accountability for data product outcomes are clear. We also explore the common challenges of product adoption and who is responsible for user experience. I purposefully had Sanjeev on the show because I think we see the data product space from pretty different perspectives.

Highlights / Skip to:

  • I introduce Sanjeev Mohan, co-author of Data Products for Dummies (00:39)
  • Sanjeev expands more on the concept of writing a “for Dummies” book (00:53)
  • Sanjeev shares his definition of a data product, including both a technical and a business definition (01:59)
  • Why Sanjeev believes organizational changes and accountability are the keys to preventing the acceleration of shipping data products with little to no tangible value (05:45)
  • How Sanjeev recommends getting buy-in for data product ownership from other departments in an organization (11:05)
  • Sanjeev and I explore adoption challenges and the topic of user experience (13:23)
  • Sanjeev explains what role is responsible for user experience and design (19:03)
  • Who should be responsible for defining the metrics that determine business value (28:58)
  • Sanjeev shares some case studies of companies who have adopted this approach to data products and their outcomes (30:29)
  • Where companies are finding data product managers currently (34:19)
  • Sanjeev expands on his perspective regarding the importance of prioritizing business value and use cases (40:52)
  • Where listeners can get Data Products for Dummies, and learn more about Sanjeev’s work (44:33)

Quotes from Today’s Episode

  • “You may slap a label of data product on existing artifact; it does not make it a data product because there’s no sense of accountability. In a data product, because they are following product management best practices, there must be a data product owner or a data product manager. There’s a single person [responsible for the result].” – Sanjeev Mohan (09:31)
  • “I haven’t even mentioned the word data mesh because data mesh and data products, they don’t always have to go hand-in-hand. I can build data products, but I don’t need to go into the—do all of data mesh principles.” – Sanjeev Mohan (26:45)
  • “We need to have the right organization, we need to have a set of processes, and then we need a simplified technology which is standardized across different teams. So, this way, we have the benefit of reusing the same technology. Maybe it is Snowflake for storage, DBT for modeling, and so on. And the idea is that different teams should have the ability to bring their own analytical engine.” – Sanjeev Mohan (27:58)
  • “Generative AI, right now as we are recording, is still in a prototyping phase. Maybe in 2024, it’ll go heavy-duty production. We are not in prototyping phase for data products for a lot of companies. They’ve already been experimenting for a year or two, and now they’re actually using them in production. So, we’ve crossed that tipping point for data products.” – Sanjeev Mohan (33:15)
  • “Low adoption is a problem that’s not just limited to data products. How long have we had data catalogs, but they have low adoption. So, it’s a common problem.” – Sanjeev Mohan (39:10)
  • “That emphasis on technology first is a wrong approach. I tell people that I’m sorry to burst your bubble, but there are no technology projects, there are only business projects. Technology is an enabler. You don’t do technology for the sake of technology; you have to serve a business cause, so let’s start with that and keep that front and center.” – Sanjeev Mohan (43:03)



Brian: Welcome back to Experiencing Data. This is Brian T. O’Neill. I have Sanjeev Mohan on the line today, and you have helped co-author a book called Data Products for Dummies. So, congrats on your book, first of all.

Sanjeev: Thank you. Thank you.

Brian: Yeah, yeah. So, who are the dummies [laugh]?

Sanjeev: You know, dummies—actually, it’s a great question.

Brian: I love those books, by the way. I’m not dinging you.

Sanjeev: I know.

Brian: I actually loved those books when I was younger. Like, I read them all the time. Like, get lots of knowledge on a bunch of different things in a short amount of time. Like [laugh].

Sanjeev: Absolutely. In fact, the whole premise of a dummies book is to write it in a very easy-to-read, parable kind of a format, as opposed to, you know, like, do you know how it is you write a document, and then you send it to the editing team, and they just rip it apart, but everything comes back so arcane, and you know, but it’s grammatically correct. You know, so dummies—so to answer your question, who are dummies? Dummies, it sounds a bit derogatory, but today, dummies are—we’re all dummies because we’re all trying to learn how to navigate in this super-charged, fast-moving world of data. Technology in general.

Brian: Yeah, yeah. So, the book is Data Products for Dummies. I wanted to dig into this topic. Obviously, we dig into it a lot on this show, particularly from, kind of, the product and the design kind of areas that I usually talk to on the show. And I’m very much skewed towards the business side of this, and how do we create meaningful solutions that impact people’s lives such that you then get business value because you’re having an impact on your employees or your coworkers if you’re doing an internal data product management, or your customers if you’re doing SaaS and commercial product management?

So, I think in order to have that conversation, I have to ask you what a data product is and your definition because as you know, we had a meta-conversation about definitions when we first talked, and there’s a lot of different ones out there. So, just to ground this conversation, can you give a short definition of what a data product is to you so we can just have that.

Sanjeev: Brian, the way I have explained it, there is a business definition, and then there’s a technical definition. Technical definition is actually quite easy, so let me get it out of the way. Technically, a data product is not anything different than maybe a table or maybe like a single table materialized view, machine learning model, a report, or a dashboard. That’s how a data product manifests itself. Nothing different.

The difference arises on the business side. The very first criteria of a data product is that it has some business value. If it is just a temporary table that you’re going to join and produce some results, it is not a data product. That’s the first thing. So, it must have a specific, defined business value.

The second thing is that it is built with the best product management concept, not project. And a product management concept says that you define its attributes, you version control a product. Like an iPhone, there’s a new iPhone out every year, a new version number with its own specs, and at some point, we retire the old versions. How many times have you retired stuff in the data space? Not that often. You know, so we keep building stuff, we’ve got old versioned stuff lying around.

So, the idea of building a data product is that you get, like, a container, a self-contained, consumable piece of an artifact that has some contract, you can trust it, it’s reusable, it’s easy to discover. So, those are some really important pieces. And if you go down this path of a data product, it can be hugely beneficial because for the first time ever in the history of data management, we have a way to ascribe value to an investment in data, measure it, and then say, “You know, this is the productivity of our developers. We produced so many data products, we have a cadence here, we have so many users, this is how much they are using it.” And eventually, you can start monetizing it if you want, although not necessarily you have to monetize a data product, you can say that, you know, now that I’ve moved to data product, I’ve reduced my defect count, I have reduced the time for claim processing from ten days to five days. Instant ROI. You can calculate it, and you can ascribe it to the data product.
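As a rough illustration of the attributes Sanjeev lists here (a stated business value, a single accountable owner, a version, and a contract), a data product descriptor could be sketched in Python as below. The class names, fields, and the example email address are hypothetical and are not taken from the book or from any catalog tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DataContract:
    """Promises the producer makes to consumers (fields are illustrative)."""
    refresh_interval_hours: int   # how often the data is refreshed
    schema: dict[str, str]        # column name -> type


@dataclass
class DataProduct:
    name: str
    version: str                  # versioned like any other product
    owner: str                    # the single accountable person
    business_value: str           # why this product exists at all
    contract: DataContract
    retired: bool = False         # old versions are eventually retired


def is_data_product(product: DataProduct) -> bool:
    # A label alone is not enough: require an owner and a stated value.
    return bool(product.owner.strip()) and bool(product.business_value.strip())


claims = DataProduct(
    name="claims_processing",
    version="1.0",
    owner="claims-product-owner@example.com",
    business_value="Cut claim processing time from ten days to five",
    contract=DataContract(
        refresh_interval_hours=24,
        schema={"claim_id": "string", "days_to_process": "int"},
    ),
)

# A temporary table with no owner and no stated value does not qualify.
temp_table = DataProduct(
    name="tmp_join",
    version="0.1",
    owner="",
    business_value="",
    contract=claims.contract,
)
```

Under these assumptions, `is_data_product(claims)` holds while `is_data_product(temp_table)` does not, which mirrors the distinction between a relabeled artifact and a product with accountability attached.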

Brian: So, question on this, like, this business value thing, so like you mentioned a potential benefit here is accelerated delivery because you’re using some kind of product management principles.

Sanjeev: Right.

Brian: I guess one thing that I wonder about is if we don’t do this work right, aren’t we just putting crap out faster? And [laugh] I’m sorry to say that, but for 20-plus years, the track record of a lot of analytics and machine learning, and AI now, the track record, particularly in large enterprises, is not great. We keep throwing money at yet another platform, yet another thing that’s going to save the day, sometimes it’s a new label for it—let’s call it ‘Big Data’ now, whatever that—I don’t know what that means—how is this going to be different? Even if we package it, right, and it’s self-contained, and it has a contract and SLAs are—you know, all this stuff? Like, if it’s not designed with a purpose, with a person in mind, a customer in mind and a knowledge of what their problems and pains are, aren’t we just putting out the old stuff faster, but maybe more secure?

And it’s in a catalog now, and it’s searchable, but at that last—what I call the last mile, when the human interfaces with this solution, assuming it has a user interface—and that’s mostly what we talk about in this show—is something that’s worth the last mile where a user or customer is going to use it, how do we not end up just doing that? Becau—you know, and I’ve heard, like, in the community—you know, I launched a community, and I’ve heard some of this before, which was kind of this, like, a lot of, you know, quote, “A lot of places are just slapping this word data product on the old thing, and then they think that they’re doing this,” or they’re changing job titles, and they’re hoping to get the value of it. Because it sounds better, just like Agile sounds better. But if you just follow the recipes of Agile and you don’t follow the principles—which is really about change, it’s not about speed—you don’t get the value just following the step-by-step instructions if you’re not embracing the principles. So, I’m just kind of curious about your take about, does this really fix the underlying problem of it’s a solution to a problem that doesn’t exist or a different problem that they actually don’t have or care about right now. Talk to me about that piece?

Sanjeev: Yeah. So, there are some things we’re doing differently this time, that are distinctively game-changing, in my opinion. It sounds like, you know, I’m sold on it, so I see everything from a very rosy tinted glasses, but the thing is that there are two important changes: one is organizational, and the second is accountability. In the past, if I don’t change anything, and I just, you know, slap a label of data products on a dashboard that my CFO uses, the CFO comes to the office, he or she finds out that some of these numbers don’t make sense. So, they have this intuitive feeling, but they cannot pin it down.

They know that, you know, that something got—went missing at night in a nightly batch job because these numbers are not making sense. Who does that CFO talk to? There is no single person they can talk to because they call some guy who’s responsible for building the dashboard. The guy says, “Well, I don’t know. I only used Sigma for this, but my DBT engineer was responsible for data transformation.” “Okay, let’s go talk to the DBT guy.” The DBT guy says, “Well, I don’t know. I just took the jobs that came on Fivetran.” “Okay, let’s talk to the Fivetran guy.” The guy says, “Well, I don’t know. It came from SAP. Go talk to the SAP DBA.” You see, there’s so—

Brian: You sound like you’re in Washington, DC, like, with the acronyms [laugh]—

Sanjeev: Oh, good [crosstalk 00:09:18] [laugh].

Brian: —with the military [laugh].

Sanjeev: Okay. In my world, these would be, like, you know, just a done deal, right? Sorry.

Brian: [laugh].

Sanjeev: So, the point that I’m trying to make is that yes, you may slap a label of data product on existing artifact; it does not make it a data product because there’s no sense of accountability. In a data product, because they are following product management best practices, there must be a data product owner or a data product manager. There’s a single person. Second big change is that this person sits on the business side, not in the central IT, which is overwhelmed. See, I’m an IT guy, so I can take potshots on myself and my colleagues.

Let’s say I work for a pharma company, and I’m in IT and I get some dump of data with some requirements of creating a data warehouse. I’m looking at the data, it has pharma codes, clinical trial codes, some DNA data, and RNA data. I have no clue what all that means because I’m a hotshot, Python, Scala, Spark developer, so I’m just going to do what I have to do. But in the true data product environment, I’m taking this task, offloading it from a central data engineering team into the domain. Now, in the domain, people know what these things mean, so that organizational shift, and then having a data product owner who’s responsible for lifecycle and is singularly responsible to answer if there are any defects is what makes data products different from the traditional approach we had to building analytics.

Brian: I don’t think I remember seeing this in the book. Are you saying then that, like, let’s say you have a data product manager in charge of, like, I don’t know, increasing sales closure rate or reducing wasted time calling the wrong people in the sales department, does this mean this person’s now staffed in the sales organization?

Sanjeev: Yes, that—it means—and by the way, I think what you’re alluding to has been a very sore point. A lot of these domains like salespeople are like, “I’m sorry, what do you mean? We are responsible for writing the code? No, we’ve never done that. We are not even trained. We don’t even understand the software development lifecycle, so how is this going to happen?”

So, there will be a pushback. And the way we get over it is, sales team may not be ready to take on the ownership, so there is a central team that the goal of the central team is to increase this—bring in the data culture, increase this literacy on the concept of data products. So, the central team will say, “Don’t worry sales team, I’ll be your representative. I will put a person from my team, a dotted line, to the sales team to help you build this, but we will take ownership.” The idea being that over a period of time, the sales team will come up to speed and say, “Yes, we are now getting faster turnaround time and higher quality. We own this, the outcomes, so we are now trained.”

Maybe it’ll take six months, maybe it’ll take one year or even longer. Now, the sales team is self-sufficient. They go on to build the data products; the central COE, now a Center of Excellence, goes to the supply chain team. So, you sort of divide and conquer, and you get each team bought into this concept of data products. And the only way they’ll get bought in is if they start seeing a significant increase in productivity. So, that’s the mechanism of how we are propagating data products within organizations and departments that are not yet ready to embrace it.

Brian: Right, so this—you call it productivity, so if we talk about this in terms of, like, there’s literally, like, a department head, and there’s a bunch of individual contributors, whether—and we can use the sales example—from a design and classic product management standpoint, we would think about this in terms of, like, making somebody’s life better. It’s easier for me to do sales, I closed stuff faster, I was able to command a higher price, so my margin—so my commission went up, or whatever. I guess one thing I didn’t see in the book, and it’s—I think if you’re looking for a lot of the—particularly, like, a technical person that wants to step into this more business-facing product management role, you understand all the facets of, you know, SLAs, and engineering, and the modeling piece, and pipelines, and all this kind of stuff, but the low adoption monster has been around a long time in the data product space, and I’m wondering, how does any of this data product stuff make that part better, making the salesperson’s life better? Because we could follow all these practices and still the sales team is like, “I don’t want to put information in the CRM because”—I don’t know, just—I can’t remember where I heard this yesterday, it just—like, it’s going to get—someone was telling me this. Maybe it was Karen and the [laugh] you’re listening in the DPLC, but like, “I don’t want to put stuff in there because it’s going to get used against me because some AI model eventually is going to replace me, so I have an incentive not to use this thing in the first place.”

We’re now in human factors problems territory. This is where I think so much of the problem lies. And I do believe product management has a lot to offer here, but only if we’re spending an inordinate amount of time in the last mile piece, the problem definition phase, the spending time with, in this case, salespeople and all of that. I’m just curious if you think—how do you balance all the time needed now to package, to make it findable, to make it self-contained, and all of this? Those things are kind of easy to do if you’re technical because you could follow, like, a plan, like, here’s a step-by-step way to go about doing that, but the other stuff is, like, really fuzzy and tough.

Every team is different. Every sales team—I mean, yes, they’re still selling stuff, they probably have a commission, they care about getting a high price, there’s some dynamics that will be the same, probably, across any sales team, but the culture is different. And so much of that, I feel like, is still missing. We do everything up until that part, but we don’t—that last mile piece is where good work goes to die, so much of the time [laugh].

Sanjeev: So Brian, the point you raised, and what Karen had raised about the AI piece and will it replace my job, that’s a much broader topic and is slightly orthogonal to what we are discussing. So, I don’t address that in this question. Happy to talk more about it. What I want to address is, what is the situation today for the salesperson to uplevel their game, and have higher margin sale. How do they do it? They don’t even have the tools.

I know I talk to people all the time. Even banks like JPMorgan Chase, where, you know, a financial advisor is trying to sell more services, it’s a maddening situation, where you have to go to five different systems, you have to [kludge 00:16:38] together the data, you don’t even know if the quality is good enough. A system goes down. Now, you have to call backend. Literally, I know this for a fact, you have to call backend support for that.

And then you don’t know how current that data is because there’s no contract. You don’t know what you—it’s an ocean of disconnected systems. So, data products are actually doing a lot of this work—and I agree with a lot of what you said. There’s a lot of upfront work that needs to go on before the salesperson is enabled to use a data product. So, that work has to be done.

But you put in that work, you say, you know, every—this data product is refreshed every 24 hours, or every two minutes, whatever it is. I, as a salesperson, now have a window into what I’m dealing with, and let’s say tomorrow, my company decides to buy HubSpot, and now they’re pulling in information about events and who attended the events. A salesperson wants to know, did my prospects attend this event? How in the world are they going to get this information? These are not technical people, so now what happens is, if you’re a salesperson, you can make a request, and you can say, “Look, now that we have HubSpot, I want my data product to be enhanced with this new integration, new data elements.”

So, the data product owner says, “Okay, your requirements are well taken. We will go design the data product, add in the new HubSpot, and version 3.7 will be launched on this day, and it’s backward compatible, so if you’ve written any programs, don’t worry about it. The previous versions will eventually be retired.” As a salesperson, I now am focused on the output, the outcomes.

I’m not focused on can I trust it? Is it secure? You know, is it fresh data? I don’t have to worry about that because it’s in the document, in the catalog. I can read it because there’s a contract. I can start focusing and trusting a data product, my ease-of-use goes up, and the user experience is higher. So, I hope this makes sense.
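The two promises in this example, that a new version stays backward compatible and that the data is refreshed on a known schedule, can each be checked mechanically. The sketch below is one hedged way to do that in Python. The column names, the "3.6" predecessor schema, and the function names are assumptions made for illustration, and the compatibility rule used (every old column survives with the same type) is only one possible definition.

```python
from datetime import datetime, timedelta, timezone


def is_backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    """True if every old column still exists with the same type."""
    return all(new_schema.get(col) == typ for col, typ in old_schema.items())


def is_fresh(last_refreshed: datetime, max_age: timedelta, now: datetime) -> bool:
    """True if the last refresh falls within the window the contract promises."""
    return now - last_refreshed <= max_age


# Hypothetical schemas: version 3.7 adds an event-attendance column.
schema_v3_6 = {"prospect_id": "string", "deal_stage": "string"}
schema_v3_7 = {**schema_v3_6, "attended_event": "bool"}

now = datetime(2023, 11, 20, 12, 0, tzinfo=timezone.utc)
last_refreshed = now - timedelta(hours=6)

print(is_backward_compatible(schema_v3_6, schema_v3_7))    # True
print(is_fresh(last_refreshed, timedelta(hours=24), now))  # True
```

Adding a column passes the check, while removing or retyping one would fail it, so programs written against the earlier version keep working until that version is retired.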

Brian: Yeah. No, I understand what you’re saying. I mean, one of the things I noticed in the book, and I was kind of curious. So you mentioned user experience here.

Sanjeev: Yeah.

Brian: This came up several times in the book, and I’m curious, whose job is it, in the data product context, to design or facilitate the user experience to happen? Where does that role reside?

Sanjeev: I wrote an article called “What Exactly is a Data Product?” on my Medium blog on a Friday. On the weekend, I flew to Gartner Data and Analytics Summit in Orlando. This was in March of this year, 2023, and I was shocked. By Wednesday, I had 85,000 hits. On such a to—such a… techie topic. I would never expect such a pent-up demand. In that article, I talk about, let’s say you want to go buy cereal. So, Cinnamon Toast Crunch is the example I use.

Brian: Solid example.

Sanjeev: [laugh]. Yes, thank you.

Brian: [laugh].

Sanjeev: I concur. So, I can go to a store that has it in a bin, you know, you scoop it out into your own reusable bag and—

Brian: Where is this store? I want to go [laugh]. Is it the big Costco shovel that’s like—

Sanjeev: No, it’s like [crosstalk 00:20:17], you know? So, [laugh]. Okay, so that is one way. But how do we buy it? We buy the box. The box is from the manufacturer, it has all the nutritional content, it has a best-before date, so I know when it’s going to expire, it has maybe some recipes on it. So, the producer is the one that is defining the quality.

Because, see the data product, I’d say my idea for data product, the thing is that the producer does not know who the consumers are. Like, Kellogg’s or General Mills doesn’t know, they don’t know me, right? They know that, you know, there’s a general class of consumers. Now, in data products, it’s not that amorphous, we do know who our customers are, but we may have customers of a data product that we don’t even know exist. So, a new person gets hired in a new department and gets a task to go figure out customer churn.

And this newbie says, “Well, I don’t know how to do this stuff. Let me go see. Is there a customer churn machine learning model that already exists?” And finds it, and says, “Aha, I already have a base model. I’m going to go and derive my own model from it.” So basically, the quality is not defined by the user. The user expects certain quality. The implementation is by the producer. And if the user, the new user says, “I don’t like the quality. I need this, I need that.” You had not figured it out, they communicate to the producer and the producer will then implement it.

[midroll 00:21:56]

Brian: Right, but I guess the challenge there, though—I mean, historically—is this idea that the customer knows how to express what they need, and that they’re actually hand—like, a strong product manager and a product designer, understand that you really have to take with a grain of salt a request that comes in because most problem requests are actually solutions. The customer hands you a solution.

Sanjeev: Hmm.

Brian: And the problem with that is that you think, if I just give them what they asked for, they will be happy. Until you realize, like, “I need a customer churn model,” and then you realize what this person is actually trying to do. And as a data professional, you might be like, that’s actually not going to help you because you didn’t factor in these other things. You thought you needed a churn model, but really what you’re trying to do is X. This part of the product thing is to me what’s foundationally missing because the making part is really easy if you really understand the problem space, and you understand what it’s like to be the head of sales or an individual salesperson.

If you know that piece, the building part is so much easier. But that to me is one thing I feel like that’s missing in this whole data products conversation so much of the time is we’re still not talking about that last mile piece, which is really the highest value product-y—as I call it, product-y stuff you can be doing is that customer-facing time to understand the need. And I don’t know, I’m kind of just riffing here, and I want your opinion on it. But that’s what I feel like it’s kind of missing.

Sanjeev: Okay. So, Brian, this is where the difference arises. If I’m in IT, and I’m doing customer churn model, I have no idea what my business really needs. I’m just being told. I’m trying to translate my business requirements into my technical requirements and I implement it. In fact, it takes so long to build a data mart that by the time I deliver the data model, the requirements have already changed.

So, I will go back to my previous point where because their products are being built by the domain and there’s a product owner, the product owner should understand. The product owner should have spent time in the sales team to understand, what do my salespeople need? And so, they have a better grasp over the business requirements, the quality aspects, and now they can have the engineer build it because it’s their own product and their own users.

Now, let’s say I’ve built a customer [master 00:24:29], and I’m in sales. I’ve got information about how much money the customers are spending, what their budgets are, I’ve got all of that. But my customer master is a data product that the marketing department wants to use. So, the marketing department should not really be allowed to see the spend data because that’s, you know, sensitive data. But the marketing department may want to say, “I’ve got Qualtrics survey. I’ve got Marketo. I will join with that.” So, I want to take the sales team’s product and I want to create a derived data product for my team, which is marketing. Presumably, the data product owner in the marketing team understands pretty well what does the marketing team need and then translates it into the derived data product.
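The derived data product described here comes down to two steps: strip the sensitive sales fields, then join in marketing's own data. A minimal Python sketch of that idea follows. The column names, the sample customers, and the `survey_scores` lookup are all hypothetical, and a real implementation would more likely live in the warehouse than in application code.

```python
# Hypothetical columns that only the sales domain may see.
SENSITIVE_COLUMNS = {"annual_spend", "budget"}


def derive_for_marketing(customer_master, survey_scores):
    """Drop sensitive columns, then left-join survey scores on customer_id."""
    derived = []
    for row in customer_master:
        public = {k: v for k, v in row.items() if k not in SENSITIVE_COLUMNS}
        # Customers without a survey response get None rather than being dropped.
        public["survey_score"] = survey_scores.get(row["customer_id"])
        derived.append(public)
    return derived


customer_master = [
    {"customer_id": "c1", "name": "Acme", "annual_spend": 120000, "budget": 200000},
    {"customer_id": "c2", "name": "Globex", "annual_spend": 45000, "budget": 60000},
]
survey_scores = {"c1": 8}  # e.g. exported from a survey tool

marketing_view = derive_for_marketing(customer_master, survey_scores)
```

The sales team's source rows are left untouched. Marketing gets a separate product with its own owner, and the spend and budget fields never leave the sales domain.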

Brian: In this model, though, does that translation into the data product, I guess, that needs to be built, does that mean that they’re writing up a spec or requirements document that then the technical people will go and implement, like a classic soft—well, like, not necessarily a classic one in modern speak, but is that how you’re saying that works?

Sanjeev: Yeah, it does. But the technical team member, ideally, is in the marketing team, so it’s not like going to an external team. It is sitting in the team.

Brian: Sorry, you mean—so like that—in that fictitious sales example, then it’s not just the data product owner for the sales organization that’s actually either dotted line or literally being paid for out of that department. The whole product team is now—the whole data product team is now sitting inside sales?

Sanjeev: Yes, correct.

Brian: Oh, okay.

Sanjeev: Yes. And they may not be directly part of the sales team, but they have been sent, like—

Brian: They’re attached to it, or assigned to it, or yeah—

Sanjeev: They’re attached to it from the central team because the sales team does not feel comfortable owning it. And there is, by the way, we only talk about people and process, organizational and process for this. From a technology point of view, what we are trying to do is standardize the technology. And you know, one of—by the way, we haven’t even mentioned the word data mesh because data mesh and data products, they don’t always have to go hand-in-hand. I can build data products, but I don’t need to go into the—do all of data mesh principles.

But one of the data mesh principles is called self-service data infrastructure. What that means is that if I’m going to take this leap of faith and say the entire product team is sitting in the domain, and they’re not experts at software development, the least I can do is give them self-service data infrastructure. Which means a lot of my technology needs to be abstracted through things like DataOps, single button, zero copy clone of Snowflake, I have access to data, single button, an IDE, or orchestration engine kicks in, and it starts running the task. Maybe in the future, this is an LLM that I can tell it to do things. It’s coming, by the way, whether we are skeptics or not, but GenAI is a whole new topic by itself.

But my point is that all these things need to go hand in hand. We need to have the right organization, we need to have a set of processes, and then we need a simplified technology which is standardized across different teams. So, this way, we have the benefit of reusing the same technology, maybe it is Snowflake for storage, DBT for modeling, and so on. And the idea is that different teams should have the ability to bring their own analytical engine. Like some people, maybe one department may say, “We are a bunch of SQL guys.” Another department say, “Yeah, but we only like doing Python,” and the third team may be API based, but the underlying data infrastructure should have some common standards and easy-to-use DataOps layer, so that it reduces the time of developing data products inside of the domains.

Brian: Sure, sure. That makes sense. In terms of, like, defining the ways the business is going to measure the value of these initiatives, like, especially if, like, the sales team is paying for this thing that they—I don’t—make up whatever example you want, whose job is it to create those metrics and to measure the impact to know whether or not this box of Cinnamon Toast Crunch actually, like, delighted, it brought economic return, we were able to count that economic return. Whose job, who does that work? Where does that happen?

Sanjeev: I think that function should sit with the [CDO 00:29:34]. And why do I say this is, if Kellogg’s wants to know what the sales are, it’s very easy. They can count how many boxes of Special K was sold, which markets are trending, and so on. Same idea for data product. The CDO can say that I’ve got exact metrics on how many data products were being used, how often were they refreshed, who’s using them, and if they’re not being used, then I should retire and save the cost.

So, these things can be very easily measured. And that is very different from the time when we used to create these data marts and data warehouses, it’s like, there was no way to measure, there was no way for the CIO to go in front of the CFO and defend the $4 million budget. But now, data products give us that opportunity.
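The usage metrics Sanjeev mentions (how many products are used, by whom, and which ones can be retired) are straightforward to compute once access events are logged. The Python sketch below assumes a simple access log of product and user pairs. The log format, product names, and the `min_uses` threshold are illustrative assumptions, not a description of any particular catalog.

```python
from collections import Counter


def usage_report(product_names, access_log):
    """Count accesses and distinct users per data product."""
    uses = Counter(event["product"] for event in access_log)
    users = {name: set() for name in product_names}
    for event in access_log:
        if event["product"] in users:
            users[event["product"]].add(event["user"])
    return {
        name: {"uses": uses[name], "distinct_users": len(users[name])}
        for name in product_names
    }


def retirement_candidates(report, min_uses=1):
    """Products used fewer than min_uses times in the logged period."""
    return sorted(name for name, stats in report.items() if stats["uses"] < min_uses)


access_log = [
    {"product": "customer_master", "user": "ana"},
    {"product": "customer_master", "user": "raj"},
    {"product": "churn_model", "user": "ana"},
]
report = usage_report(["customer_master", "churn_model", "legacy_mart"], access_log)
print(retirement_candidates(report))  # ['legacy_mart']
```

Counts like these measure adoption only. Whether a heavily used product actually delivered business value is a separate question that the log cannot answer.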

Brian: I actually believe everything is quantifiable; it’s just a question of precision. And I don’t want to get way into that and act—like, how accurate do you need the answer to be for this particular question that you’re asking? But that’s another—like, I guess part of the reason I asked is, like, you have some case studies in the book of some wins there, but one thing I don’t think I saw was any economic value attached to those wins. Like, quote, you know, “Speeding up delivery,” for example. And that was a pretty easy one because you can just, like, what’s the cost of my engineering team or whatever, and then if it’s 100x faster, well, then it’s around 100x return on my development costs.

You could just, even right there, that’s a rough estimate of what the economic value is. But I was curious, what do some successes look like in this space from your research that you did? Like, what are some wins? How would we know if we’re doing a good job with this? Like, how do we—if I’m a leader, and I’m trying to adopt this approach, like, what are the signals, and what’s it like on the other side?

Sanjeev: So, I was at the KubeCon conference this year, a couple of weeks ago, and I ran into the Intuit team members. So, they were telling me about data products, their 900 data products. I was blown away with what they’re doing. A lot of TurboTax, QuickBooks, they’re turning all the analytics into data products. The Intuit team told me that in a few years, you will not have an ability to go access raw data. It will only be through data products.

Talking about the economic value, this gets a bit tricky because companies don’t want to share it. This is a very closely guarded secret. In fact, last year, we were at Big Data in London, and we had a panel with two people from Roche who might be part of your data products community, Omar Khawaja, and Paul Rankin.

Brian: Omar is, yeah.

Sanjeev: Omar is.

Brian: And he’s been on the show as well.

Sanjeev: I see. So, Omar actually talked about it.

Brian: He says, “Hi,” by the way. I told him we were talking [laugh].

Sanjeev: Oh, okay [laugh]. Awesome. Okay, great. So, he actually even came up, and he says—I forget the number, somehow 40 million is in my head—and then they were like, “Oh, my God, no, no, no. We have to get a company to make sure that we are allowed to share some of the”—

Brian: Sanction that number [laugh].

Sanjeev: Yeah. Yeah, all this is under embargo. You cannot do it. And I think now they are talking about the actual cost savings they have derived from doing data products. And there are some other companies, Equifax is another one.

I don’t have the exact number, but Travelers Insurance—there are so many companies. And this is no longer—like, generative AI, right now as we are recording, is still in a prototyping phase. Maybe in 2024, it’ll go heavy-duty production. We are not in prototyping phase for data products for a lot of companies. They’ve already been experimenting for a year or two, and now they’re actually using them in production.

So, we’ve crossed that tipping point for data products. I do have to say that we still need to make sure that, with data products, we don’t overcomplicate things because most organizations don’t have data products. Most organizations aren’t even in the cloud. If you listen to what AWS says, they think only 15% of data is migrated into the cloud, which seems an awfully low number, but most of the data is still on-premises, most of the data is still not being leveraged through data products. So, this is going to be a long journey.

Brian: More heads in the clouds than data in the cloud, is what you’re saying [laugh].

Sanjeev: Yes, correct. Yes, very well put, yeah.

Brian: Where are all these data product managers coming from? Like, are they harvesting them? Are they training them? Are they stealing them from mature product organizations?

Sanjeev: Yeah.

Brian: Where’s the user experience knowledge, the qualification about how to fix experience problems, coming from? Where do they get that?

Sanjeev: Actually, it’s a lot of product management people. In fact, this is a very hot topic. How do you bring in the culture of product management into IT teams that have done projects? You know how we’ve done projects in the past? I have to go build—I’ve done a bunch of data warehouse implementations—we built a data warehouse, and then we get a request.

Okay, there’s a new project. Everybody, drop off this project, move to this other new project. And then we just let it linger for eternity, and we don’t ever go back. We just keep rebuilding and reusing. So, the product management thinking is actually coming through osmosis or training. We are getting it from the software development team or from the physical products team.

Brian: Do you mean that they’re having internal training—

Sanjeev: Yes.

Brian: —between a digital product?

Sanjeev: Correct.

Brian: Like, if there’s a functional product management, I guess, a software product management or a digital team, if they’re not a digital-native business, they come in and train the new DPMs—

Sanjeev: Yes.

Brian: —effectively? Is that—that’s the model you see most happening?

Sanjeev: It’s not mostly happening, but in some cases it is. In other cases, they don’t have a [unintelligible 00:35:53]. They are like, just this week, one of the top three banks has reorged the entire organization for the cloud modernization based on products. Have they trained their people on that journey? No, they have not. But they’ve recognized how important it is to take the product management approach. And it’s not very difficult to guess which one because I’ve always said what are the top three banks in the US. And how I know there’s a little bit of a gap is because I’m hearing it: how do I train? How do I organize myself? So, we don’t have a proper, mature method of training DPMs right now.

Brian: Yeah, I agree. And what I, you know, particularly on the last mile side when we’re talking about data products that have interfaces that are supposed to be self-contained islands of value, or you know, effectively, we’re building applications that are data-driven, or they’re primarily about decision support—you know, now we have GenAI—the function of user experience seems to reside with the DPM largely. So, you have two major skill gaps to me: product management, which is its own beast, and that’s a wide and shallow skill set with a lot of soft skill requirements, a lot of creative thinking, a lot of relationship-building that has to happen; and then the user experience piece, which involves a lot of psychology, human factors knowledge, design of interfaces, design of experiences, all of that. It’s a ton of knowledge, and I’m kind of—I’m just curious, and I guess I’m just—I’m rambling a little bit about where, like, a bank, if they’re going to reorg the entire business, I’m always curious, well, what—how will you know that that org was better than the last org, and how will you measure the improvement there, and who is going to do all that work?

Because the, you know, changing the org chart is a… it’s a mental model of how things happen, and there’s physical reporting lines in all this, but effectively, it’s like you still have the same people, the same technology, you’re still dealing with the same, kind of, raw ingredients, unless you upskill, or you bring in new talent, you know, however you go about doing that. It feels like a lot of stuff to learn, to have to put all that on to—you know, I’m not saying it can’t be done. The most important thing to me is a passion and an interest, particularly if you’re a data scientist, or you’re an analytics person, and you want to go into this space, passion and interest is probably the most important thing: that you actually care.

Sanjeev: Yeah.

Brian: You want to build stuff that matters, you want to build for economic impact, you want your work to matter. But I do kind of wonder where these people are coming from because I don’t hear people stealing them from software companies, for example. I don’t see a lot of investment in user experience design, which is a well-established discipline to deal with these kinds of friction problems. I don’t see a lot of those in the wild in enterprise data teams, or even in the data product model. I think it’s growing. I think it’s still slow.

Maybe product management will come first and that will eventually follow because this last mile is where everything tends to fall apart. It’s still at the—when the rubber hits the road, why is the adoption low? It continues to be low over and over. You keep hearing this all the time. I don’t know, maybe I’m in an echo chamber. Do you hear it’s low? Like, do you hear low adoption is still the challenge, like—

Sanjeev: Yes.

Brian: —people don’t use the stuff we made. And we gave them what they wanted, and they still don’t pay attention. Or, like [laugh]—

Sanjeev: By the way, low adoption is a problem that’s not just limited to data products. How long have we had data catalogs, but they have low adoption. So, it’s a common problem. You know, ChatGPT? No problem with low adoption. You can write—ask questions in English, and it surprises you with its answers. Correct or incorrect, you’re still amazed. You know—

Brian: [laugh] Mostly wrong.

Sanjeev: Yes. So, the question you’re asking is super valid. In fact, I am now very intrigued to dig deeper. I just want to mention that I have my own podcast channel on YouTube only. It’s called It Depends. I had the Chief Data Analytics and Product Officer of Equifax on my channel. The topic was cloud migration, nothing to do with data product, but it turns out, he had so much to say about data product, and he’s done an amazing job.

And he was saying, you know, without data product, we could not have expanded Equifax as fast as we have. We have far more business solutions now, we have far more people. Even if we do M&A, we can bring them into the fold. So, if your listeners are interested, they can—and if you don’t mind, including the—

Brian: Please.

Sanjeev: The link for that, I can share with you.

Brian: That’s like the most memorable consultant’s podcast name. I have to congrat—

Sanjeev: [laugh].

Brian: It Depends. It’s so good. That’s so good [laugh].

Sanjeev: Yes. Me too. My entire life was like, as a consultant, every answer was, “Oh, it depends.” And then you would have, you know, cop out [laugh].

Brian: [laugh] I do want to say one thing. On page 26 in this book—

Sanjeev: Oh.

Brian: I thought you—I’ll just read the quote for you.

Sanjeev: Okay. Please, yeah.

Brian: I think some of our listeners probably don’t need to hear this. There’s probably some that maybe they do. But you said, “Data products liberate data by flipping the priorities: business value and use cases first, followed by infrastructure and data management. Traditionally, data platforms and data management have taken precedence over use cases. The flipping of intent, prioritization alone, is enough to explain why data products are so effective in delivering business advantage.”

To me, this is totally obvious because, I guess—I don’t know how you abstract out and build something that’s going to have any value if you don’t know what people are going to do with it, but I understand for a long time that’s happened because it’s like, “Well, theoretically, we need to put all this stuff in a data lake and then, you know, build all this infrastructure that’s abstracted so it can be used for multiple different use cases.” But so often, what’s missing is, well, what are the benchmark use cases by which we would test that the abstracted thing that we built is actually enabling these benchmarks that we should never make worse? That’s how I measure design of applications for business use is what are the benchmark use cases, the most important or most highly used workflows, jobs to be done, et cetera, that’s how you know whether the platform is working is it enables those benchmarks to be constantly used, and so I’m really glad that you talked about that for teams that need to hear this—

Sanjeev: Thank you.

Brian: —that you can’t start with abstracting out a giant plumbing infrastructure and then figure out faucets later. It doesn’t work well. Like [laugh].

Sanjeev: Yeah, no, very well put. So, Brian, I’m coming from the Big Data space. Just at the beginning, you said what does it even mean, but in Big Data, Hadoop world, we—well, how did we start? We said, let’s build all the pipelines to bring structured and unstructured data, everything into the data lake, and then we’ll figure it out.

Brian: Yeah.

Sanjeev: And you know, that did not go very well [laugh] at all. And that led to the whole, like, Snowflakes of the world, and BigQuery. And so, that emphasis on technology first is a wrong approach. I tell people that I’m sorry to burst your bubble, but there are no technology projects, there are only business projects. Technology is an enabler. You don’t do technology for the sake of technology; you have to serve a business cause, so let’s start with that and keep that front and center.

Brian: That’s awesome. I want to, kind of, give you the last word or see if there’s a question I should have asked you about that I didn’t ask, and then we’ll talk about where to get the book and you. But any final words or something I should have asked you?

Sanjeev: You know, there’s a whole question now coming up about what is the conversational interface going to be on data products. Again, two new topics, two topics that are not mature, but that is going to happen. In fact, the way I see it is, we are in this phase of, “Let’s chat with our data.” And I think that’s so dangerous to put a Chatbot LLM directly on data. We have no idea what will happen if we do that.

But if you put it on top of a data product which is curated and has accountability, then we can control some of the security, privacy, quality, reduce hallucinations, we can do some of that. So, I’m super excited to see where this space goes next year, which is an extension of data products into the conversational space.

Brian: Yeah, that should be exciting. Sanjeev, thank you so much, from SanjMo Consulting. I just want to recognize your—the book is called Data Products for Dummies, brought to you by dataops.live. Guy Adams is also one of your co-authors, and Justin Mullen, who’s actually a member of our Data Product Leadership Community. So, congrats on the book. Where do they get it?

Sanjeev: Yes, it’s very easy. If you go to dataops.live/dataproductsfordummies.

Brian: Okay.

Sanjeev: So, just two things you have to remember: it’s a very long word, but Data Products for Dummies is where you can download a copy. And if we happen to meet at one of the conferences, I will sign it for you, if you have a physical copy—

Brian: Cool, cool.

Sanjeev: —which we bring to the conferences. Yeah [laugh].

Brian: Excellent, excellent. That sounds great. And where can people follow you? Is there—LinkedIn is kind of where I have you—in my head, is kind of where you hang out.

Sanjeev: Yes. I am almost a hundred percent approachable through LinkedIn. I would like people to subscribe to my Medium site. My Medium site is very easy: my company name is SanjMo: sanjmo.medium.com.

Brian: Okay, excellent.

Sanjeev: And also, they can subscribe to the YouTube It Depends podcasts that we—it’ll make my day.

Brian: Great. Yeah, we’ll put those links in there. It’s SanjMo—it’s S-A-N-J-M-O. So, if you’re looking up Sanjeev, that’s how to find that. And Sanjeev, thank you so much. Congrats on the book here. Thanks for entertaining my questions here, and—

Sanjeev: [laugh] No worries.

Brian: —great to chat with you.

Sanjeev: Great questions, by the way. I have made my own notes to go find out more about these things, you know, so thank you. Like I said earlier, we are all dummies, at least I consider myself—

Brian: Me too.

Sanjeev: Yeah, so thank you for this opportunity. Love it.

Brian: All right, brought to you by Cinnamon Toast Crunch [laugh].

Sanjeev: [laugh].

Brian: Well, Sanjeev, thanks. It’s [crosstalk 00:46:22]—

Sanjeev: Thank you.

Brian: —thanks for chatting.

Sanjeev: You’re welcome.
