099 – Don’t Boil the Ocean: How to Generate Business Value Early With Your Data Products with Jon Cooke, CTO of Dataception

Experiencing Data with Brian T. O'Neill

Today I’m sitting down with Jon Cooke, founder and CTO of Dataception, to learn his definition of a data product and his views on generating business value with your data products. In our conversation, Jon explains his philosophy on data products and where design and UX fit in. We also review his conceptual model for data products (which he calls the data product pyramid), and discuss how together, these concepts allow teams to ship working solutions faster that actually produce value. 

Highlights/ Skip to:

  • Jon’s definition of a data product (1:19) 
  • Brian explains how UX research and design planning can and should influence data architecture, so that last mile solutions are useful and usable (9:47)
  • The four characteristics of a data product in Jon’s model (16:16)
  • The idea of products having a lifecycle with direct business/customer interaction/feedback (17:15)
  • Understanding Jon’s data product pyramid (19:30)
  • The challenges when customers/users don’t know what they want from data product teams - and who should be doing the work to surface requirements (24:44)
  • Mitigating risk and the importance of having management buy-in when adopting a product-driven approach (33:23)
  • Does the data product pyramid account for UX? (35:02)
  • What needs to change in an org model that produces data products that aren’t delivering good last mile UXs (39:20)

Quotes from Today’s Episode

  • “A data product is something that specifically solves a business problem, a piece of analytics, data use case, a pipeline, datasets, dashboard, that type that solves a business use case, and has a customer, and has a product lifecycle to it.” - Jon (2:15)
  • “I’m a fan of any definition that includes some type of deployment and use by some human being. That’s the end of the cycle, because the idea of a product is a good that has been made, theoretically, for sale.” - Brian (5:50)
  • “We don’t build a lot of stuff around cloud anymore. We just don’t build it from scratch. It’s like, you know, we don’t generate our own electricity, we don’t mill our own flour. You know, the cloud—there’s a bunch of composable services, which I basically pull together to build my application, whatever it is. We need to apply that thinking all the way through the stack, fundamentally.” - Jon (13:06)
  • “It’s not a data science problem, it’s not a business problem, it’s not a technology problem, it’s not a data engineering problem, it’s an everyone problem. And I advocate small, multidisciplinary teams, which have a business value person in it, have an SME, have a data scientist, have a data architect, have a data engineer, as a small pod that goes in and answer those questions.” - Jon (26:28)
  • “The idea is that you’re actually building the data products, which are the back-end, but you’re actually then also doing UX alongside that, you know? You’re doing it in tandem.” - Jon (37:36)
  • “Feasibility is one of the legs of the stools. There has to be market need, and your market just may be the sales team, but there needs to be some promise of value there that this person is really responsible for at the end of the day, is this data product going to create value or not?” - Brian (42:35)
  • “The thing about data products is sometimes you don’t know how feasible it is until you actually look at the data…You’ve got to do what we call data archaeology. You got to go and find the data, you got to brush it off, and you’re looking at and go, ‘Is it complete?’” - Jon (44:02)

Transcript

Brian: Welcome back to Experiencing Data. This is Brian T. O’Neill. Today, I’ve got Jon Cooke on the line, CTO and founder at Dataception. Jon, how are you?

Jon: Great, thanks. Thanks, Brian. I’m glad to be on the show.

Brian: Yeah. So, I’ve been seeing your material pop up on LinkedIn; a lot of talk about data products and this kind of thing, and I think we had some shared beliefs and all of that, so I kind of wanted to dig in there and see what your take is on this whole space. You come from a much more technical side of this so we may have some different framings for things here, but I wanted to jump into that because a lot of the audience on my show aren't people that have technical backgrounds anyways, and so it’s good to just get different perspectives here.

So, the first thing I wanted to ask you was what’s a data product? [laugh]. Since nobody knows and we’re all still kind of defining this word, it’s not new. But I think there’s different definitions for what we mean, and I’m collecting definitions instead of Star Wars figures. So.

Jon: [laugh]. Well, that’s—Star Wars figures sounds fun. But yeah, for me, I mean, I’ve been in this space for, you know, about 20, 30 years, been in the data space 15 years, building platforms, solving business use cases, and I’ve worked in consultancies and stuff. And for me, that always comes down to fundamentally solving a business problem. I mean, I think that’s the core thing, core principle for me.

If you don’t, if you’re not solving the business problem, then you’re really not—you’re kind of out of the water. So, for me a product is, first and foremost, solving a business problem. It’s got to be something, you know, if I go and buy something, I buy something that I want to do something with it, I’ve got a problem I need to solve, I want to do something with it.

Secondly, it’s got to have a customer, it’s got to have someone actually who’s willing to buy it, to use it, that sort of stuff. I’ve been on many programs where they built these big, kind of, data platforms and data type of deliveries, but there’s no real customer involved. So, for me, it’s about something that specifically solves a business problem, a piece of analytics, data use case, a pipeline, datasets, dashboard, that type that solves a business use case, and has a customer, and has a product lifecycle to it. So, it’s got a market, it’s got customer, it needs to be retired, needs to be changed, that sort of stuff. So really, for me, that’s the core from my mind.

Brian: And when we talk about customers here, and I—[laugh] this word, I remember when I had Marty Cagan on the show, he kind of cringed when we talked about using the word customer to talk about, say, the head of sales within your enterprise. He’s like, “That’s not a customer. That’s a stakeholder. A customer is someone that pays for some service [laugh] that your business provides.” I think on this show, we kind of collectively use that term to mean a stakeholder here. You said they have to be someone that’s willing to buy this thing. So, does it have to be a transaction, or do you mean there just has to be something that’s worth paying for, like, theoretically, even if it’s like an internally used solution?

Jon: Yeah, I think there’s actually two angles to it. There is actually generally, if you have a customer who wants to pay for it, so you think about a data-driven marketplace and there’s lots of them out there—and Snowflake have got one—someone actually wants to pay for a dataset or to pay for a dashboard or wants to pay for a piece of analytics. I worked with an HLS company earlier this year who actually are selling datasets on the marketplace. So, you’re absolutely right, from internal perspective of I’m delivering analytics for the head of sales, the CEO, or business line—and typically in my mind, it is typically business line—then that’s—you’re right, we call it an internal customer, but there’s no sort of transactional, you know, there’s no money necessarily changing hands unless you’re talking about cross-charging models, which we probably don’t want to go there. But really, it’s that business line wants to solve a particular problem and you get treated effectively in the same sort of way as a customer. They have a business problem, you test something with them, you kind of go for the product kind of management piece with that, rather than I think what a lot of people are doing is really sort of putting datasets onto fabrics and meshes and all that kind of stuff, and saying it’s reusable, it’s componentized, it’s operationalized. “That’s a product.” Well, no, it doesn’t have the product management, that sort of customer interaction piece, which I think is actually a really, really important part of the product.

Brian: Let’s talk about that for a second because for someone that doesn’t come from a product background, it’s like, “Well, that’s what we’ve been doing for 20 years. We build models, we build dashboards, we have some custom applications that are data-driven, so what’s the difference here?” Like, “Yeah, and of course there has to solve a business problem, yadda, yadda. I’ve been hearing that for 20 years. What’s different when we talk about it as a product?”

Jon: Sure. So, I think the first thing is when most people in the data space talk about their kind of the artifacts and deliveries, they are talking about just the data. I think that’s the first problem, right? You’ve got warehouses, lakes, and the whole lineage of data platforms going back 20 years; it’s just the data part of it. And for me, I’ve been arguing quite a lot recently that actually, the data is only a constituent part.

The model, the analytics, the thing that actually delivers the business value, the thing that business look at, either via visualization or a dashboard, or whatever the thing that makes the insights for the decision, that’s actually the product. The data is a part of that or the source data and that kind of stuff, but it’s an intermediate artifact. I think that, for me, is a huge distinction. And I know, in the mesh world, there’s data as a product and a data product. And for me, unless you’re actually selling them, or they are the final output that gets consumed by the business user, these are intermediate artifacts.

I think, models, dashboards and stuff are more product aligned because they actually solve for the problem, they actually give the value, but it’s actually how to package, how it’s managed, what the lifecycle is around that as well, I think is super important.

Brian: Got it. Yeah, I’m a fan of any definition that includes some type of deployment and use by some human being [laugh]. Like, that’s the end of cycle, right, because the idea of product is, like, a good that has been made theoretically for sale. And sometimes there is, like, here’s a million dollars; the sales team wants this thing, and they are literally kind of paying. It’s funny money, but they’re, quote, “Paying,” the IT department to get this solution back.

So, sometimes there is actually a sale happening there in that regard. And I agree that there’s technical artifacts and outputs that are created along the way, but outputs don’t equate to outcomes. Even well-designed services don’t necessarily guarantee an outcome. But there does have to be some type of end-to-end experience that’s included in that such that we can test it, we can see it, it can be used somehow, even if the interface is very minimal because you don’t need to look at a bunch of charts or whatever it may be, there’s still some consumption that needs to occur. Is that kind of what you’re saying as well?

Jon: Yeah, absolutely. So, I think for me that if we take a step out of the data and analytics world and into kind of the more classic enterprise software world, right, you’re building an enterprise solution; I’ve built dozens of trading systems and e-commerce systems, all this kind of stuff. These are all decomposed, they have customers, actual customers most of the time, either internal or external. They have people using them. The whole thing is regarded as almost as the product, you know?

And the microservices and components aren’t regarded as individual products. We don’t call them ‘microproducts,’ do we, for instance. And effectively the whole wave around data products and data as a product is much closer to what the microservices kind of paradigm is. It’s really just an overlay of that. But for some reason, we call that a data product, but a microservice isn’t a microproduct.

So, for me, that’s the important thing. When you’re building enterprise software, you have a bunch of users, you have a business problem you need to solve, and you deliver the whole thing to solve the problem. And you decompose that down into individual components, and you deliver those incrementally, and with Agile, all that kind of stuff. That’s the same paradigm we have in the data space, but yeah, we seem to think that data set is a product. So, I guess that’s the way I think about it.

Brian: I like the product perspective, too, of again, if there’s supposed to be some consumption here and you’re not just a technical team that is providing it to some other team who is then therefore going to productize it or operationalize it. And maybe there is a software team that will really handle that part of it, but I think the vibe I get from my clients and just people I talk to is that, no, like, we’re the last—we are the only group that is providing dashboards, applications, things like this, when an internal customer comes to us. They are the end-to-end team that’s supposed to be delivering this to them, so I’m like, “Well, then you can do it with this very technical data-oriented mindset, or you can think of it as a product where it doesn’t matter what we do if no one uses it.” So, if we take that mindset, then, like, consumption and use has to be part of our mantra about what we’re doing here or else it doesn’t matter, unless you just want to check the box and say, “Well, I gave you what you asked for. And if you don’t like it, that’s not my problem.”

I think that mentality is kind of changing because you’re not going to have a job for a long time if you’re just taking that, like—

Jon: Absolutely.

Brian: —you know, “I gave you the model. I don’t know how it’s going to be used. I don’t know where you—whatever. Like, I did my part. It works. It’s predictive. It’s—” you know—[laugh].

Jon: Yeah, indeed. I mean, if you took that, if you transposed that into the startup world, you’d be out of the water in day one, right?

Brian: Right. [laugh].

Jon: Building a product that no one wants, no one uses, not this kind of stuff. And I will often say this a lot, you know, on LinkedIn and other kind of social media, we’ve built this thing and we’re trying to find a use case for it. We’re trying to get to—a solution looking for a problem. And you see that a lot around definitions of data products. You know, it’s got to have multiple use cases, you’ve got to have reuse. It’s got to have all these [unintelligible 00:09:26].

And well, actually, first and foremost, it’s got to solve the problem. Without all that, if you’re building a dataset or whatever and you’re trying to find use cases for it, you’ve kind of gone the wrong way around. It’s what the business asks for first because data just doesn’t spontaneously, you know, it’d be produced; it’s not solving the problem that’s kind of useless. So.

Brian: How do you think about this from a technical perspective? I’m actually kind of taking off my—thinking about my audience and asking questions for them, and now I’m asking a question for Brian here just to get your perspective on this. But as a designer when I work with technical teams and I think about what the end-to-end solution is, but also understanding, I know what I don’t know about all the technical work that needs to happen. I’ve written plenty of code over the last 20 years to just know what I don’t know in terms of all the architecture and the planning and how the data model is important and how that can have downstream impacts on the User Experience piece, I get all that. How do we focus on making sure we solve the problem here, when say, the problem is fairly narrowly defined, we have some kind of idea, like, okay, we’re going to need some screens, and we’re probably going to need a dashboard, and maybe there’s going to be some email notifications that get shot out, and we kind of have some idea what these artifacts are going to look like.

However, in order to do even just to deliver that much of a UI and user experience to call it a product, the amount of plumbing required to power all of that is still a Mount Everest lift, even if it’s just a walk around the park on the, what I call the last mile, where the humans in the loop enter the picture. And so, the data people I know often say, “Well, we can’t build all these one-off, single-purpose solutions.” How do you not boil the ocean with the technology piece when you’re trying to deliver on a very specific use case that will have business value, yet the perception is we need to have data governance and scale and speed and all the stuff that you may need in place just to deliver a small quote ‘product’ at the end of it? How do you reconcile those two competing ideas there? I mean, I always see, like, the design can help the architects and they can help the software and the data people because by understanding what people need to do and how they’re going to want to use it, it informs how to build the technology piece, instead of building it first hoping and thinking about scalability, and then finding out how people want or will use it, and then having to go change really expensive stuff that’s already been built, which nobody really wants to do, and no one wants to take credit for saying I built the wrong thing. But these two things are very at odds when you need a lot of infrastructure and architecture in place. How do you think about that?

Jon: So, it’s a great point. For me, it comes right back to, kind of the, sort of one or two most common requirements I get from business people from the last 20 years, you know fundamentally, can I build in flexibility into the system, fundamentally? Can I change my mind? Those are the two things that are front and center for me. I’ve worked for a large bank, and the risk officer said to me, “I want to be able to touch and feel the data.” And I’m like, “What does that mean?”

Brian: Ew. [laugh].

Jon: You know, it was, like, 200 terabytes of data. What does that mean? Basically, it meant that he didn’t really know what he wanted, but he knew what he wanted when he looked at it. And also, the fact that it was doing risk for the whole bank, so all these divisions and stuff, so you needed to have a lot of flexibility. You couldn’t build one [unintelligible 00:12:47] data model, although they were trying, which was a bit of a nightmare.

So, for me, taking a step back to it, when you talk about you can’t build, like, small independent deliveries for each of these use cases, I actually completely disagree with that. I think you absolutely can. And that’s exactly the point of it, where you know, you think about, you know, the way the cloud has had impact on, you know—we don’t build a lot of stuff around cloud anymore. We just don’t build it from scratch. It’s like, you know, we don’t generate our own electricity, we don’t mill our own flour.

You know, the cloud’s—there’s a bunch of composable services, which I basically pull together to build my application, whatever it is. We need to apply that thinking all the way through the stack, fundamentally. So, there’s a lot of conversations about the modern data stack and modern data platform [unintelligible 00:13:28]. For me, it’s actually it’s not that. It’s about a modern data ecosystem.

How do we have independent small engines, deliveries, pieces of software that are almost use case by use case? Use shared services together that we can add to do small incremental deliveries, we can change stuff together, we break the problem down, rather than trying to put them into big central warehouses, central lakes, you know? [Data messages (?) 00:13:49] sort of going down that path, but it actually needs to be a lot more than that. You know, fundamentally, the traditional thing around data is basically I need to put it into some kind of central model or some shared bunch of central models. That’s almost my first thing to do. It’s like… actually, I think it’s actually not.

The idea is you need to be able to work with the business, have a process for them from decomposing the actual business ask—which, you know, I’ve produced a level of data product pyramid which actually does that—break that down to little functional components, like a model here, some data services there, that type of stuff, that can iterate and be delivered incrementally, and very much follow almost like the UX approach where you start with a coarse-grained product that you put in front of the business with the data and say, “Actually, is this just right? It doesn’t have to be super accurate, but is this the right thing to sort of—” “Yes, we like that.” And then you get more fine grained as you go forwards. So, actually—and the infrastructure needs to be able to support that. Needs to be able to support different tech, you know, so it’s not just one big database. You know, you see lots of consolidations into kind of single stores and that type of stuff.

It’s like, actually, you break it down into individual technology components for the different use case with shared services like cloud, you actually get a much more flexible, much more Agile-type infrastructure that can actually cope with change and actually can cope with the full lifecycle. And that’s the other thing we talk about. I’m thinking about doing a blog tomorrow around if you can’t retire use cases on your data platform, then it’s legacy. And it’s that whole lifecycle of it. How do I actually have the small components which solve use cases and change stuff, with the business, especially with things, like, you know, pandemics and kind of macroeconomic factors, and I can retire stuff really, really quickly.

That to me is a complete mindset shift from kind of a centralized, monolithic, you know, [unintelligible 00:15:21] the data and technology approach first, that for me, it’s really the game-changing piece that we need to get onto.

Brian: If I was to summarize that, what I’m hearing is kind of a BS excuse if you think it necessarily requires a giant technology effort even to solve a small problem; that’s kind of old thinking. It’s like, no, there are Agile ways to do this such that you should be able to deliver some type of experience that then generates some kind of business outcome without boiling the ocean and thinking kind of legacy so to speak. Is that kind of a—

Jon: Yeah, yeah. It’s just trying to get away from that centralized approach where we try and push everything into a centralized model, a centralized governance, a centralized platform, to actually how do we break this down to the individual—you know, how do we deliver the smallest piece of end-to-end business value quickly, efficiently, and how can we change that? And that’s kind of the sort of lean type approach to data and analytics really, in my mind.

Brian: Maybe you answered this already, but you had mentioned, there’s four characteristics of what a data product translates to. So, I’m just going to read these so you don’t feel like you have to regurgitate [laugh] them from your article on the spot. But, “Be targeted and specifically solve a business problem in an analytics context,” i.e., it has a direct business requirement.

Two, “Provide direct value to the business customer.” Three, “Not just be a data set,” unless it’s actually sold, which you’d kind of talked about, I’ll insert asterisk here. I agree; I don’t think a data product is a bundle of data that is sold to somebody because that suggests that, like, oh, a human can digest, you know, 30,000 columns and 50 million rows of data. Like, just here’s the dump, like, good luck. So, I agree.

Sometimes that is the right delivery mechanism if your audience needs that kind of thing, but often it’s not. And D, “Have a product lifecycle with direct business, customer interaction dash feedback.” What does that one mean? So, that’s the one that I wanted to ask you about. “Have a product lifecycle with direct business, customer interaction-feedback.”

Jon: So yeah, sort of getting into the way I think about it, if you look at the article the Data Product Pyramid, if you look at the components of each of those, each one of those solves a business problem. It’s in the very bottom of the pyramid, not the data bit of it, the data is what I call data services, which is like your first one, you’re just delivering data to the [unintelligible 00:17:28]. The first one could be a bunch of metrics. So, “Show me the number of sales that happened,” something really simple over the last three months. “Show me what the cost of x is y.” “Show me what risk we’ve got and why.”

Those are little metrics which will be individually delivered on their own engine, potentially, that actually give a business person that metric. So, someone themselves can look at that, you know, you’ve put it in a nice visualization or something for financial that would share that. So, that is a nicely encapsulated piece of business value. It, you know, it’s not very complicated, it’s part of a larger piece, but you can save them, so your sales and finance could argue about what that is: what’s the cost? Is that P&L, is it revenue, is it—you know, they can [unintelligible 00:18:07] over it, but it’s actually a business deliverable, you know? And that’s the very lowest level.

And then obviously, as we go up, you start talking about, you know, forecasting models, you start talking about propensity models, you’re talking about all sorts of kind of value-add on top of the data. It’s not just, like, doing a metric; it’s like actually doing forecasting or bringing semantics in or that kind of stuff. And then at the very top, we can talk about decisions which is the decision the actual business person actually wants to make. These all have business interactions and they all actually have direct value to the business customer, even though this might be—the overall ask is basically, “I want to launch a new product in a particular sector,” and you break it down with a bunch of products that we’re doing to support each other. Bottom level might not be the direct value that’s for the system, but that it still gives value to the customer.

And the idea is that those can then be reused for other use cases. Other use cases, a cost of sales would be you know, would be invaluable. What you don’t want is basically to stick that all in a warehouse and then someone going, “I don’t trust it. And I’m going to go and rebuild the sales metric in five different areas and dashboards and all that sort of stuff,” which I’ve seen so many times, and it happens in lakes as well. What you want is actually a business value, business interactive, i.e. the business can actually look at it and go, “Yes, I recognize that and I trust it,” and that sort of stuff that gets reused across. So, I guess that’s kind of the main thrust of the idea.

Brian: Let’s talk about this data product pyramid for a second. So, I’m just going to quickly summarize what stuck in my head after reading your article about this. What I thought I saw was—and I’m simplifying here—was basically that kind of the analytics maturity timeline if you want to—or technical maturity timeline, which is raw data at the bottom, historical analytics at the next layer, predictive analytics, and then you have your prescriptive analytics at the top. First of all, is that an okay—just to try to put a—because we’re trying to talk about a visual diagram now here for people that are listening—so am I oversimplifying it?

Jon: So yeah, I think that’s pretty close. I mean, for me, it’s basically, historical base, which is the base level of good maturity, you’re absolutely right. Next, it’s basically how do I go beyond just measuring what’s happened? It’s forecast, but also bringing on semantics. Why did something happen? You know, it’s not just what’s going to happen; it’s why it’s going to happen.

And then the top one is super simple: what am I going to do about it? Fundamentally, [unintelligible 00:20:14] prescriptive in the decision I need to make. You know because a lot of time I see is basically getting this great historical view—sometimes not so great [laugh] because of data quality and all sorts of things—but that’s where it stops. It’s like, “Well, that’s great.” That doesn’t tell me—A, it doesn’t tell me why it’s happening. And why is obviously semantics and it’s about forecasting, is about all that sort of stuff joined together other bits of behavior inside the organization.

And then, which is the real Nirvana, which is exactly what prescriptive is basically, what do I do about it? What do I do next? My next best action. Like, you know, what’s—so you know, my example of going into a new product launch, you know, it’s obviously very much in the UX world, it’s basically how, you know, does that product launch make sense? Do I need to—what’s my campaign going to look like? What kind of customers or whatever it is? It’s those decision points, which really the value is in this. And it’s also the hardest thing to get to, but you need to break it down into those, kind of, layering of what happened? What does it mean? What do I do about it? That’s really the, kind of, split that I think about.

Brian: What was odd to me about the pyramid is that it made me feel like, okay, so my data product either comes in at, like, kind of level one top of the pyramid or level four down at the bottom, and it’s kind of defined by that. And it almost suggests that the product equals a metric or a measurement of some kind. Because each measurement, you could say, falls into prescriptive, predictive, historical, or kind of this raw data, kind of, you know, almost no treatment at all, whatever you want to call that, but from a product perspective, I tend to think about products as having use cases, they have end-to-end workflows that may go beyond the scope of a single metric which may be descriptive, predictive, or prescriptive in nature. The human aspect isn’t limited to a metric that falls into one of those things. It could be a collection of these things, or if you think about, like, an ongoing tool to help sales reduce churn, like, the business goal for the next two years is reduce our churn rate, and so we’re attacking that as a broad problem.

And so, we do have, like, a main—say, we have a product or a primary decision support tool that we’ve created for that, it has multiple screens, multiple different metrics, all of this. I would think that the sales team would perceive the product as being that solution. And so, it’s like, how does that map onto the data product pyramid if that solution most likely has a combination of bunch of different things? It has some root cause analysis, it has some, like, propensity—these customers are likely to buy; call them first—it has a bunch of these different screens. I don’t know. Talk to me about how you map those two. Or maybe I’m not conceptually getting it, but that was one thing that I was thinking about when I was looking at it.

Jon: No, it’s a great, great observation. And ultimately, the data product is exactly designed to solve that kind of problem space. So, it’s really there to do two things. First and foremost, it’s there to help business break down the ask. One thing I’ve seen many, many times is business not actually being able to throw a requirement over to a [DNA 00:23:10] team, like the one you just said, you know, “We want to reduce churn.” So, what does that mean, and actually how do we break that down? How do we guide the business?

Because what will happen is a data scientist or a data team will go away and go, well, actually, what’s our churn model? What are we actually defining as success? How do we do the segmentation? And all that sort of—and there’s a load of different business questions that always come out of this, but you go back to the business, and it’s like, “Oh, we haven’t really thought about that,” and you end up going backwards and forwards, backwards and forwards. The product pyramid is really a decisioning framework for actually breaking that down.

So, you can say to them, “Before you build any tech or any code, go look at your data.” You can go to the business, say, “Right. You want to reduce churn, right? What does that mean? And what do you need to know to do that? Do you need to understand your fundamental your historical churn rate?”

And that’s obviously one that’s a really easy one. Product sales, you want to understand your cost of servicing, you understand—yeah, and you’re breaking that down. And the pyramid is really a way of doing that, guiding the business through it. So, the idea of the pyramid: you go through almost, like, a paper-based exercise, first of all, and define core, you know, prototype products, or the current product candidates that actually break that down. And you work with the data science team, and then say, “Actually, these products actually map to things we can actually deliver.”

So yeah, [unintelligible 00:24:14] propensity model, fundamentally. So, it’s like, we know how to build that, so that’s one particular product, but that’s also then built on metrics, basically, you know, and on data. Like, you’re looking at the buying patterns, you’re looking at the click-through rates, you’re looking at the abandonment rate, all that stuff. What metrics would come underneath it? So actually, your answer would be an entire pyramid of different products which all depend on each other. You’re building on each other around that. And so, you break them down by touchpoint in this process. So, it is exactly designed to solve that sort of problem you’re talking about.

Brian: Who’s well-suited to do that work? I need that; I’m a VP of data science and analytics at a large enterprise. That sounds great. We probably could use some more of that. I do hear that this problem definition space is not well done, or the assumption is that the requirement given to us is what we’re going to make, and then they don’t end up using it. And something’s lost in translation, and then it’s usually, “Well, they don’t know how to ask for what they want. They really don’t know what they want.” And then it’s that game—it’s tennis, who’s got the ball? And whose job is it to define what the need is, the problem space?

Jon: Yes, that’s a very good point. So, part of the pyramid that I’m working on with a colleague of mine, [Sunil Gupta 00:25:21], is basically coming up with a value framework—which is really the sort of head of the pyramid—which basically says, how do we define what’s the highest priority? How do you ask the right question and frame it in the right way? So, part of that kind of initial piece where you’re defining your product, actually, is you’re defining the value. So, for instance, you know, that question I talked about earlier, where you are asking is—we’re going to launch a product in a particular sector. You want to focus; first of all, work out, is that the right thing to do? And is it framed in the right way?

So fundamentally, having it all agreed from a business perspective, and agreeing what the value framework is: is that the right priority, high priority? Is that the right thing to do? Also, are we solving it in the right way from a business and a technical perspective? Because sometimes, you know, you get the ask, and a data scientist—or data engineer, whatever—will go off on a particular technical task and it’s not the right one as well. So, what’s the technical framing? Is it graph modeled? Is it a metric? Is it, you know, basically that type of stuff.

So, you need the business framing, business value first of all to decide what the business? Then you need the business framing: is it asking the right question in the right way? And then you’ve got the technical framing, which is basically, are we using the right set of technical and data science type methodologies? To answer your other question, who does it? Actually, it’s a multidisciplinary team.

So, it’s not a data science problem, it’s not a business problem, it’s not a technology problem, it’s not a data engineering problem, it’s an everyone problem. And I advocate actually small, multidisciplinary teams, which have a business value person in them, have an SME, have a data scientist, have a data architect, have a data engineer, as a small pod that goes in and answers those questions. It is a multidisciplinary problem.

Brian: Who leads and facilitates that? Because there’s a gap; a lot of times there’s a gap. The one team doesn’t know the business side, the business team doesn’t care or want to know about how the plumbing is supposed to work. How do you learn how to do this?

Jon: Yes, very, very good question. So, something I’m working on again, is basically what—we’ve come up with a name like, sort of, data commandos. It’s actually a small, highly skilled, multidisciplinary force that can actually do this kind of stuff. And it is a reorientation for businesses, fundamentally. And you spin up these virtual, kind of, pods to solve that particular problem.

And I’ve done this before where you actually have these kind of virtual teams, which take people from different parts of the business and spin them up as part of an initiative to do that. And really, the business needs to transition with that because like I said, the barriers between, you know, business and IT, and data, and stuff are huge in large organizations. Their problem is not my problem; that type of stuff. And I think one of the things businesses really need to do is pivot away from that. I’m doing a piece of work now, which is a UX, kind of, driven piece where we’re taking a dashboard, and we’re breaking that down into kind of business value metrics and data products, really.

So, you need a business person, you need a UX person there to talk to the business and actually say what’s actually valuable to you, that type of stuff, and then the data scientist will come and [Oh can you send the 00:27:59] models, then data engineering will come in and say, “Actually, we need to get this, this is how we get the data.” But it is a multidisciplinary problem, you know, fundamentally. And businesses really need to kind of get to grips with that and actually orientate themselves around that. So fundamentally, you have these relatively small pods of people, as we call them, you know, which can actually be spun up from different parts of the organization to actually solve the problem.

Brian: Let me try to clarify. I guess what I was asking is in terms of a skillset. Because in my experience, these what I would call an ideation session or a joint research session or whatever you want to call it, there needs to be a facilitator for that activity. In the product space, and in the software product world, typically, that would be something that likely would involve, at a minimum, a UX researcher and a product manager, or you could call it a data product manager. The goal being there not to run it, but you’re really there to facilitate getting all the useful information out of all the disparate heads and to get everyone participating in this joint effort together.

I’m wondering for teams that don’t have dedicated skills around that developed, is that something you go out and develop? Is it something where I’ve really found that, you know, people with this job title tend to do really well doing this kind of work? It’s not a natural thing that—I could see everyone gets in the room, we got this joint—we got this cross-disciplinary team in there. Why are we here though? And there’s kind of like this vague idea of what the output of this meeting is going to be.

And then it’s like, kind of back to everyone goes and does their own thing. And the data team is ready to go and make something, you know? [laugh]. And so, the next day, you know, the hammers and nails are out, and they’re off doing the plumbing. And it’s like, “Here we go again.” It’s not one meeting in my experience, either, like, when we do this in the software world, at least in my experience, this is not one session.

This often could be multiple sessions to get the clarity just around defining the problem in words that we all understand, and defining success in words that we all understand, such that we all can understand when we’re making progress. And there should be no surprises at the end. There’s no big reveal. It’s more like, “Oh, there is. I’ve been waiting for that.” It’s not like, “That’s not what I meant.”

So, skillset-wise, like, what’s your take on who can do this? Or how to develop this skill if you don’t have the skill in-house? Maybe there’s politics that need to be dealt with. I don’t know. But I’ve heard—you’ve heard analytics translator over the years [unintelligible 00:30:18] it’s a McKinsey thing.

I don’t care for that title, but I think the role that’s being performed there has some real value. And I just had Manav Misra on the show, who’s a Chief Data and Analytics Officer; he has this data product partner role. You just mentioned data commandos. I’ve heard many different job titles, sort of, for this, but I’m really interested, regardless of the title, in that skillset. Where do you suggest—like, someone needs to know how to do this to get this going, you know, even if it’s a skill that can be taught to someone else? But any take on that? Or anything to share?

Jon: Yeah, absolutely. So, the first thing is, in my mind, the people have got to be bought into the process. The process has actually got to be, basically, this is kind of how we’re going to build data products. And that’s got to come from the top. You’ve got to have what we used to call in consulting the executive hammer, you know, from the C-suite, probably the CEO—probably? Maybe you know—or the [CIO 00:31:08], that’s literally board level: this is how we’re going to build data products.

And this is the process. And there is, again, training, and there’s a kind of business change that has to come into that. If you don’t do that, you’re going to end up with exactly this problem: “Why are we all in this meeting?” It’s like, this is actually a business change piece of work on the way that we’re going to orientate the business around—the organization around it.

And we’re not going to change the structure, we’re not going to change the org chart and stuff. And you see a lot of this in the mesh world where you just, kind of, completely reorganize, so that’s not what this is about. All this is about is now, we actually—this is a delivery method, you know? We’re going to do this now. So, if the company said, “We’re going to do Agile,” or “We’re going to do SAFe,” or we’re going to do whatever, you know, those type of things, this is the same process.

You need to sell the process first of all. This is how we’re going to do this, for instance, you know? Then you’re absolutely right, that there needs to be a lead. We’ve sort of deliberated over this quite a bit and there are, as you say, titles and stuff. The concept of a data product manager came in and I originally really didn’t like that. I think it was much more—it’s going to be much more just about organizing teams and getting people to talk to each other and stuff like that.

But actually, if you take it to its proper conclusion, and take a view from the transactional world, it is a product manager. It’s exactly that. It’s someone who drives the actual product. And if you take a product approach, rather than just a kind of intermediate dataset approach, that is product management. So, that person will actually be driving that whole process. They’ll own the product lifecycle, they’ll own that sort of stuff.

So, I think if you have organizations who actually appoint those people, who are genuinely more from the product side rather than from the data side because actually, if you have a good multidisciplinary team, they don’t need to be fully, you know, data versed. Obviously, there are differences in building data-type deliveries rather than kind of transaction deliveries, I totally understand that, but it’s a similar sort of thing; driving the product lifecycle is still the same, the key role for that, and that’s the interface, bringing everyone together and driving the delivery, but also driving the use cases, making sure that it’s the right prioritization, the right value with the value framework. But you need the process and that needs to come from the top. You can’t just go in a room and say, “We’re just going to wing it or try and learn as we go along.” You need to actually have that process. Really not necessarily embedded but, like, bought into by the whole organization, and it’s got to be driven from the top.

Brian: There’s probably going to need to be some proof of value, right, before someone’s going to adopt a process, because most companies, especially large companies doing this kind of work, are risk averse. Like, the status quo rules. So, it’s like, I’m wondering that first time, if you’re going to introduce a change like this, someone’s going to want to know, “What’s in it for me? What’s in it for us?” Like, why should we take the risk of doing it this other way? And I would—are you saying a data product manager is the spearhead for this initiative? You do need someone with that skillset?

Jon: So yeah, I think so. But you know, a small plug, basically, you know, with the process I’ve put together around this, you’ll get a business value data product in weeks, you know, fundamentally. If you think about kind of the buy-in factor, if you look at traditional data programs, they can, you know, it’s 12 months, 24 months, three years, five years, millions and millions of whatever currency you’re into, without much business value. With this process and the process we’re putting together as the data product pyramid, this kind of ideation and producing incremental data products, as you go testing that out with the business, the first time you do it, it’s probably going to take two or three months, fundamentally, that sort of thing. But you’re actually going to get business value data products out of that. You actually will be solving business use cases from nothing in two or three months.

And when you get going, this should be, you know, two, three week kind of cycles, from ideation all the way into production. And that’s really the Nirvana we want to get to. So, if you can demonstrate that by just doing an initial proof of value, going through the process, two, three, four months, you know, we have teams who can do this, who’re going to take them through this whole process. I’ve been doing it for 10, 12 years, and then you show actually at the back of it, you know, not only have you got some technology, but you’ve actually got some business value products and you’re solving the problem, the light bulb will go on. And it’s lower cost. You know, you’re not spending, you know, tens or hundreds of millions on infrastructure and this type of stuff as well. So, that’s really where the proof of value comes in; it’s being able to deliver on those data-driven business use cases really, really quickly.

Brian: You had mentioned a UX approach on one of the projects you’re working on. I wanted to ask you, like, where does that fall into this process? Is that part of this initial ideation that you’re talking about? And part of the reason I’m asking is because I think, especially from a UX research lens, we’re really concerned with how do people do this work now when they talk about churn reduction, or whatever? We want to fit our solutions into the natural way people do this stuff now and not the hypothetical way.

Because when we get into this ideation session, we’re not talking about magic, “Oh, my gosh, we’re finally going to solve this thing that’s been a problem.” And it sounds really great there until the status quo kicks in again, and reality is back. So, one of the ways to fight with reality there is not to fight it, but to kind of go along with it. And so, if you want them to trust the data, it’s like, “Well, how would you trust it now? And what would you do?”

“Well, I would do this and this, and I want to scan this thing, and I want to look at this, and maybe I want to filter it by this.” And, “Well, did they factor in this thing? And did they factor in geography when they did this calculation?” “Okay, so you want to know what the calculation is based on?” “Yeah, I would want to see that or else I’m not going to—” “Okay.”

This is what UX people are doing all the time, to understand this. So, when the solution comes out, everyone says, “Oh, it’s so intuitive and obvious. Like, I could have done that.” And that’s usually a sign that design was done really well, when it seems so obvious to everybody that it doesn’t even seem special at all. But arriving at that is really hard, and understanding the actual context of use and things that may not come out in a group session, but individually, you find out, like, “I’m not sticking my neck out here,” or, “I would never put this in a presentation because x,” that did not come out in the ideation session, but in a one-on-one research setting, it is something that comes out.

Is this part of the solution at all? Or like, where does UX play a role in this process? Or are they coming in later and really working on the visualization piece only? What’s your experience there? Just any opinions that you may have about that?

Jon: No, it’s a great, great point. I think it’s actually part and parcel, fundamentally, being able to actually present—so when you’re ideating with data pr—the business problem, part of it is actually delivering the analytic, delivering the insight, the scientific and the mathematical, kind of, datasets and all that sort of stuff. And part of it is the storytelling and being able to present it back to the business. And that is super, super important. What you don’t want to do, as you said, is basically do a piece up front, you go away for six months, build a platform, come back and they go, “Those numbers don’t work. That’s not what I mean.” And, “Just give me the export to Excel button,” which has been the bane of my life and lots of data-driven UX projects, as you probably will know, you know?

So, the idea is that you’re actually building the data products, which are the, kind of, you think of it as the back end, but you’re actually then also doing UX alongside that, you know? You’re doing it in tandem. So actually, you could do it in, like, a wireframe piece, and then actually connect that up using a nice prototyping solution tool that you actually connect up to the actual real data, you know? And the products and the UX will actually get from some sort of coarse-grain to fine-grain as you go through it. So actually, you can present real data back to the users as part of your visualizations.

I see them actually working… not side-by-side, but joint. It’s a part of the discipline of the multidisciplinary team, [unintelligible 00:38:07]. So obviously, if your business ask is talking to a, you know, head of sales in a really large organization, they’re not really interested in, like, that we hit that web service for that metric, you know? So, how do I actually then present that back, you know? And also, you might have different personas for your churn model, right?

For instance, the head of sales might not want to understand all the different segmentation, so it’s like, well, what does this mean for my sales team or my sales process? And you know, but your marketing might say, “Actually, we want a slightly different view of that.” So, you want to do—so, you might have different personas in that and you might have different products going to the different personas under that whole umbrella. So, it’s absolutely part and parcel. The missing piece for me—and I’ve done a lot of UX and I love the processes—you know, when you’re doing, you know, wireframes, research, this kind of stuff, not having that immediate feedback around if it’s possible with the data and that sort of stuff is a little bit of a gap, because if you spend quite a bit of time doing really good research, come up with a great solution, and the data doesn’t support it, or is not able, or it’s not feasible, that kind of stuff, you’ve spent a lot of time trying to go and do that.

So, you get this great initial piece around the visualization, the design, but then you might get 6, 12 months or sometimes even not at all because you can’t actually connect it to the rest of the organization. I think growing those two things in lockstep is absolutely the key for me.

Brian: You’re saying that you’ve been part of solutions where there was, I don’t know if you meant heavy user experience involvement, there was good research and design. It was done in a vacuum. They created an aspirational data product that was not technically realizable in any short amount of time, so it wasn’t feasible. So, you get people in this mindset, like, “Look what’s coming.” It’s like, “Yeah, not really. It’s not really ever coming.”

Like that’s 5, 10 years of work, you know? No way. So, what happened there and how did—what needs to change in that model? Maybe tell me about that. What was the org model or what was the relationship there? Why did that happen? How do you counter that so that doesn’t happen?

Because we don’t want that either. That just makes UX a waste of time. It’s just like building architecture and plumbing that never gets us to build any type of last-mile service. It’s just as bad, right? It’s just going through the exercise [laugh]. So, tell me about how do we prevent that?

Jon: Absolutely, I think there was more than one case where this happened, actually, [unintelligible 00:40:17]. When I was working in banking, this used to happen quite a lot because obviously, banking infrastructure is very complex, a complex process and this kind of stuff. Some of it’s very legacy, you know, sometimes you’ve got 40-year-old mainframes, that type of stuff. So, the first time it happened, it was a very large bank, and exactly what you just said. The team went in, did this fantastic UX process, they built these amazing kind of reactive, personalized dashboards that give risk metrics to the risk officers.

And you know, what do I need to know now rather than just, you know, what’s happening. Very much going into that, kind of, knowledge and decision-based stuff that was brilliant and, yeah, it looked amazing. It was, you know, the visual design was fantastic. And it kind of really—and the information architecture was really nice. They went to the head of IT and he said, “Well, give me a billion euro and I’ll make the rest of it for you.”

It was literally that kind of, you know—to get the data across a complex IT estate and build it into their kind of real time, lots of [banks 00:41:06], that stuff, it will—he obviously, he was, you know, he was paraphras—didn’t do a proper estimate, but it was that kind of problem. Yeah, it was ideated as you said, in kind of, in with the business, aspirational, that type of stuff. And I’ve seen that a number of times. So, for me that kind of light bulb moment was actually, can we do that, actually, but with the data? And it doesn’t have to be, like, you know, production-level, you know, operationalized kind of products and stuff, but can we ideate that as they go?

So, you know, a part of not just UX, but data and analytics, you see how many models don’t get put in production, all this kind of stuff. It’s a feasibility question around it, fundamentally, you know? And a commercial feasibility, not just can we do it; is it commercially viable to do it? And if you can bake that into kind of that ideation piece, where you can do the equivalent for analytics of what you do with UX, you know, coarse-grain wireframes, you know, it doesn’t have to be super accurate, but you know how far—if it’s going to be feasible, how far away you are from the business value that [unintelligible 00:42:00] that—capturing that data and that sort of stuff, at least you know you’re not spending lots of money, or even worse, promising the business something that never comes. And the upshot of this was, basically, yeah, they scrambled around for 12 months trying to push stuff into this kind of—they spun up a database, they tried grabbing data out from everywhere, it was kind of smoke and mirrors, that plastic, wire, bits of string, and didn’t really work. And the business lost face because it was too slow, the data wasn’t good enough quality, you know, all that kind of—robustness wasn’t there, fundamentally.

Brian: This is what product management in the software world does. Because feasibility is one of the stools—it’s one of the legs of the stool. Like, there has to be market need, and your market just may be the sales team, but there needs to be some promise of value there that this person is really responsible for. At the end of the day, is this data product going to create value or not? I need to be attuned to what value even means in the eyes of my customers and users. The user experience piece is about making sure the users will actually use this thing: it’s usable, it’s useful, it fits into their context, all of the human factors piece.

And the technical piece is to make sure, is this feasible? Can we actually make this promise of value in some amount of time that’s realistic, and is it going to have all the checks on security and governance and all those kinds of things that are important as well? That trio is really critical, and it sounds to me like there was not a product management representation there or someone with that ownership of the value, because that should have been caught early, that we’re not checking in with the technical team to make sure that these mock-ups in this new, quote, “application,” this risk application or whatever it was, are actually buildable in some amount of time. It sounded like a boil the ocean kind of project. Is that what happened? There was no, like, data product management oversight of this? There was no single point of contact, no single point of responsibility for the value?

Jon: For the end to end piece, I think that’s right, it was very much a UX-driven piece, which is, you know, which happens. But also, I think the other thing about data products and projects themselves, sometimes you don’t know how feasible it is because you don’t—until you actually look at the data. And this data science is perceived with this kind of problem, right? You know, unless it’s all nicely curated inside your data platform, which let’s be clear, you know, in large organizations, like 20—if you’re lucky—is, but you still need to gather it from the parts of the organization.

You don’t know. You fundamentally don’t know until you, you know—you’ve got to do what we call data archaeology. You’ve got to go and find the data, you’ve got to brush it off, and you look at it and go, “Is it complete? Is it all that sort of stuff?” Because if you did it on just the data that was well curated, you wouldn’t do anything, couldn’t do anything. Because that’s such a narrow band for my use cases.

So, a lot of the time, you have to almost take a leap of faith, say, “We’re going to try this out, and we’re just going to go and get the data.” And we look—still look at, you know, data science and even dashboarding projects [unintelligible 00:44:48] project now, 60, 70% of it is data wrangling, trying to find the source data, trying to clean it, trying to—even to this day, and I was talking about this ten years ago, there’s still a lot of that in large organizations. So, you need a way of basically doing that process but doing it quickly and tying it not independently, but tying it to the business process, to the actual UX, the business value, and doing those things together quickly, so you know, very quickly without lots of cost, whether it is feasible, whether the data is good enough, that type of stuff. And that’s again, that’s the bit for me that’s really missing.

Brian: Jon, this has been a great conversation. I just want to ask you, is there anything I didn’t ask you about data products today you’d like to share that you think I should have asked you about?

Jon: I guess for me, we’ve focused a lot on, kind of, on the data product, kind of, delivery, and kind of what it is. I think for me, the business framing and the technical framing is, I think, really super important. Getting the business [unintelligible 00:45:38] in the right way. I did a project a couple of years ago where they were doing a sort of a transport rerouting system, and they were trying to use genetic algorithms to try and reroute trains and it wasn’t working; it was expensive and stuff.

Because, one, they were asking the wrong question. They were asking, actually, the point-to-point question in terms of geospatial, which is, you know, they don’t care about it. It’s about what trains are coming to platforms. And also using the wrong technology, the wrong approach, and that kind of stuff. So, I think for me, that business value and that business framing and technical framing is actually one of the biggest problems in our industry.

You see data scientists jumping into deep learning and this kind of stuff, and I’ve just written a blog about this, but you’ve got to take a step back: let me understand the business problem, first of all. Let’s really get the data team to kind of be much more kind of business-orientated. And they’re not all going to be SMEs because, you know, that’s impossible. And that’s why you need, kind of, that close alignment between the data teams and the business teams. And really, the only way you get that is by having these multidisciplinary things rather than this kind of—you know, fences that sit between, which is unfortunate. So, I think for me, that is almost like the first point: how do we get those together to actually solve the business problem?

Brian: Sure, sure. No, I mean, that has to be clear, and [laugh] I’ve heard a similar thing. It’s like, “Our goal is actually not to build a rerouting algorithm. Our goal is to get the trains to appear at the platforms at the time that was written in the schedule, to do that better than we’re doing it now.” And then the question is, “Well, how are we doing it now? And how much better do we need to get it?”

And then maybe a rerouting algorithm is maybe the right way to approach that. And then you can argue about it until you’re blue in the face about the right way to do it. But I agree, a lot of these big enterprise projects go off because we’ve lost sight of the fact we just want the trains to show up at the platform more accurately at the time advertised. That is the only thing that really matters at the end of the day. How do we do that?

That’s the thing; we should all be owning that thing because if we do that, then we get the value, we get more customers, or we make the public happier, or we reduce whatever pollution or I don’t know, whatever people smoking on the platform because they can’t smoke in the train, whatever our—you know, whatever the outcomes are that we want, right? So—[laugh].

Jon: Absolutely. [crosstalk 00:47:41] very well put, yeah. Absolutely.

Brian: Yeah, right, right. Cool. Where can people stay in touch? What’s the best way to get in touch with Jon Cooke?

Jon: So, you can do joncooke@dataception.com or follow me on LinkedIn, or you know, go to my website dataception.com. You know, there’s a number of channels.

Brian: Excellent. Excellent. Well, Jon, thanks for coming on Experiencing Data to talk about data products.

Jon: No worries. Thank you so much for having me. Been an absolute blast. So, really appreciate it.
