Dr. Puneet Batra is the Associate Director of Machine Learning at the Broad Institute, where his team builds machine learning algorithms and pipelines to help discover new biological insights and impact disease. Puneet has spent his career stitching together data-driven solutions: in location data analysis, as cofounder of LevelTrigger; in health care, as Chief Data Scientist at Kyruus; as lead Analytic Scientist at Aster Data (acquired by Teradata); and in fundamental models of particle physics, developing theories for Fermilab’s Tevatron and CERN’s Large Hadron Collider. He has held research positions at Harvard, Stanford and Columbia Universities. Puneet completed his BA at Harvard University and has a Ph.D. from Stanford University.
A friend of mine introduced me to Puneet because he was kicking off a side project using machine learning to dig into what creativity is through the lens of jazz. Since Puneet is not a musician by training, he was looking for some domain-specific knowledge to inform his experiment, and I really liked his design-oriented thinking. While jazz kicks off our conversation, we go much deeper into the contemporary role of the data scientist in this episode, including:
- The discussions that need to happen between users, stakeholders, and subject matter experts so teams get a clear image of the problem that actually needs to be solved.
- Dealing with situations where the question you start with isn’t always the question that is answered in the end.
- When to sacrifice model quality for the sake of user experience and higher user engagement (i.e. the “good enough” approach).
- The role of a data scientist in product design.
Quotes from Puneet Batra
“Sometimes, accuracy isn’t the most important thing you should be optimizing for; it’s the rest of the package…if you can make that a good process, then I think you’re more on the road to making users happy [vs.] trying to live in this idealized world where you never make a mistake at all.”
“The question you think you’re answering from the beginning probably isn’t the one that you’re going to stay answering the entire time and you’ve just got to be flexible around that.”
“Even data scientists and engineers should be able to listen with empathy and ask questions. I got a good number of tips from people teaching me how to do things like that. Basically, we ask a question or basically shut up and hear what their answer is.”
“I’m not really sure what creativity is. I’m not really sure if machines will ever be creative. A good experiment to try to prove that out is to try to get a machine to be as creative as possible and see where it falls flat.”
Brian: Welcome back to Experiencing Data. I’m happy to share my recording with Dr. Puneet Batra, a very intelligent data scientist that I recently met. We’re going to talk about how data scientists can work better with the business (product management, design groups, and business stakeholders, as well as subject matter experts) and, vice versa, how those roles can leverage data scientists and their abilities. There’s some really great stuff in this recording about fitting your model to the actual business problem, including when it might make sense to use a less accurate model, if you are doing machine learning or predictive analytics, to actually increase user engagement. I hope you enjoy this discussion I had with Puneet.
Brian: Hi Puneet are you there?
Puneet: Yeah, I’m here.
Brian: I’m excited to have you on the show. Today we’re going to be talking to Dr. Puneet Batra, who is a data scientist that I was recently introduced to. We’re actually both located here in the Cambridge, Massachusetts area. We actually have some other connections through some companies we’ve both worked with. You were a Chief Data Scientist at Kyruus in the past. I know you’ve been working on a startup called LevelTrigger out of Cambridge as well. Prior to that, you did some work with CERN’s Large Hadron Collider, is that correct? You have a physics background?
Puneet: Yeah, that’s right. I was Chief Data Scientist at Kyruus. I think maybe you were there a little bit after me, but one of our colleagues put us together for one of the projects that we’ll get to. Originally, I was a particle physicist. I was a theorist and I was doing a postdoc at Columbia, thinking about new models of physics that experimentalists might discover at experimental facilities like CERN. I got bitten by the data bug and made the jump over to the data science world.
Brian: Nice. How long ago was that?
Puneet: That was now almost a decade ago. I guess that was 2009-ish.
Brian: Got it. One of the interesting things that brought us together was actually music. A friend of mine had told me, “I know this guy named Puneet, he’s very interested in jazz music and predictive analytics for predicting music.” Can you give people an introduction to what bit you with this bug? Where is this interest in music coming from, and what is this little side project that you have going on with music? Tell us about it.
Puneet: Yeah. I think it’s still nascent and after some of our conversations, I’m a little hesitant to trumpet it too much. I’ve always loved music. I think like a lot of people, music has always been there for me. I’ve been obsessed with it, get obsessed with songs, etc. A little while ago as I started to get deeper into ML–machine learning and AI, whatever side of the fence you want to call it, I started to research what people were doing with having computers generate audio online and using machine learning techniques.
You can go online and you can find examples of concerts, symphonies, various stages that people have made and it seemed to me that given some of the techniques that have been developed in the last year, specifically around deep learning, that now was a good time to get a little deeper into that and to see what kind of damage I could do. I think the question for me is, maybe we both fall in the same line? I’m not really sure what creativity is. I’m not really sure if machines will ever be creative. A good experiment to try to prove that out is to try to get a machine to be as creative as possible and see where it falls flat. It seemed like a great way to get deeper into some of the algorithms I was interested in getting deeper in, as well as sort of hit this problem of creativity right on the nose.
Brian: Got it. To give people a little bit more background, you had come to me and a colleague or a former roommate of yours, if I recall it, at Harvard who teaches at Berklee College of Music. My understanding at the gut of this was like, could a system predict a small piece of music based on fed recordings? Your learning data was a bunch of recordings and a bunch of songs from what we call the real book in the music world, which is basically a catalog of lead sheets of well-known jazz songs that are played if you go see a jazz trio, or a quartet, or something like that. They’ll often be playing music out of this book, which is more of a guide book than anything. It’s not an orchestral score with everything written out. It’s a collection of melodies and chord symbols that form the basis of the IP of songs. You had an interest to see whether or not a machine could effectively predict the correct slice of music, and that’s something you’re experimenting with. Is that the right way to summarize it?
Puneet: Yeah. I think that’s too generous for where I was at the time when you and I first had our conversation. I remember a big thing coming out of that conversation was the realization that there is this sort of baseline layer called the real book that bands and jazz bands as you said, you could put together four musicians, give them the same real book, and they’ll probably start producing something pretty good right off the top of their heads together as a band.
Wayne is my friend who I often go to whenever I have any dumb questions about music. We were thinking about what would be an interesting test to see how well an AI could do to build music. We started circling around jazz as being something we both liked a lot, which was pretty interesting and if we could do that, it would be pretty powerful. But then in the conversation the three of us had, I think the form of the experiment started to take shape. Namely, input the real book, train it on a bunch of different bands that are playing a version of that real book, and then see if you’ve encapsulated something so that if you could input a new real book composition, you’d get something that you could reasonably say sounded like the interpretation one of those bands would have made. It was a big task, but at least something we could then start breaking into smaller pieces.
Brian: Yeah. At some point, perhaps if this goes somewhere, we’ll have you back on the show and as listeners on the show know, occasionally I like to intersperse music-related projects that are going on in the data space. This is something that’s on the horizon. The thing I was partly interested in is the process of how you go about doing that. In this case, you’re exploring a personal interest, you’re not necessarily trying to create a product, or a data product, or something like that, but what made me want to have you on the show was partly the thinking process.
I remember the first time we met. We had a beer together and you’re like, “I find it’s really critical to get stakeholders and subject matter experts involved early to help me shape the models I’m going to use and the technology.” That’s what I thought would make this conversation interesting. It’s not just so much that music project but your background and how you look at your internal customer and also the end user. Can you talk to us a little bit about how you ground your technical knowledge into things? How do you ground that such that you know you’re going to be providing some type of useful business output?
There’s lots of talk in the market. The thing I hear from business leaders is, “All my PhDs and the data people, they just want to work on the math.” They’re focused on model quality and they want to publish their research. They think they’re there to do research, and the business thinks they’re there to generate some type of business value. At the same time, there’s a lab-like environment that’s happening with some of this which is, maybe we don’t know what we can predict yet. Maybe this is experimental. Talk to me about your process of getting end users, stakeholders, and subject matter experts involved in the process of creating a useful piece of output, whatever that may be.
Puneet: Yeah. I’m happy to discuss that. I think it’s important to say that by no stretch do I think I have any of that figured out. I think there’s just a lot of painful lessons that I’ve learned in the past of things not to do, and one of the good ones is to talk to real experts as quickly as possible. I’ve worked in a lot of B2B startups, trying to take data to market in various ways. One of the things I’ve noticed is that there’s a workflow involved, so there’s a real end user. In healthcare it might be the doctor, or it might be somebody processing healthcare claims. At LevelTrigger, the startup we’re just winding down, the end user was somebody working at a large restaurant chain who might be trying to make performance optimization decisions, or somebody in a real estate office who’s trying to decide which business is best suited for this open spot that they have a vacancy for. Anyway, at the end of the day, there’s always somebody who is trying to do their job.
What I found as a data person is it is really fun to not think about that person and just ask, “What does the data say? If I were going to recreate this whole business ecosystem, what’s the most interesting problem to solve?” We almost always found that while that’s fun, it doesn’t lead to anything the end user actually cares about. They’re usually involved with something that might not be incredibly sexy, it might just be a basic piece of workflow that they have to get through 100 times a day. If you could make that easy for them, they would be really grateful. You would increase their productivity and make the world a better place.
Without knowing that, I find that it’s easy to spend a lot of time solving a science problem that might not be very useful. If you can get those people in the door sooner, and sort of become a little bit more of a domain workflow expert, you’re more likely to be able to find something that’s really valuable. I think that’s just another way of saying that people who have been solving the problem for decades have a lot of really valuable experience, and you’re better off starting with that experience as opposed to ignoring it.
Brian: You touched on some good things. When we talk about data products at least on this show, we’re often talking about decision support tools and ultimately, you’re trying to help support some decision that needs to be made. The data itself isn’t so much like what it’s about, it’s really about whether or not you facilitated some decision that’s going to be made on the other end. Obviously the data enables that, but you have to be aware of how people are going to potentially make those decisions so you can align your work accordingly. Do you think that’s the data scientist’s job to do?
There’s obviously a lot of crossover here with what user experience and design does, having empathy for users, understanding what their habits are and their workflows, and all this. Not to say that you have to pigeonhole everything to a job title, but do you think that’s the role of the data scientist? Do you think that’s a product management thing? If you were to slide into a new company and you know that you like to build your products this way, do you think that’s a data scientist core thing or do you think it’s more something to be aware of and you’re participating on the side with that? Where does that responsibility fall to keep that in check, that workflow?
Puneet: It’s a good question and I think the answer to it depends a lot on the specific problem, how mature the company is, and its approach to solving a problem. If it’s a big company already, they’ve got great product managers, they’ve got great user experience people, they’ve already got a workflow that they’re trying to augment with some kind of data product. Then it’s just a matter of collaborating with the right people, being open, and influencing as much as you can, but you get information from those sources that can be useful for the data science part of it.
If it’s a much earlier product or maybe a product doesn’t even exist, maybe it’s just a kooky music generation idea, then part of the fun is being able to talk to people that are experts directly and recognizing that you’re probably not going to do as good a job listening as a real product manager, a real user experience person, or a real design person would do, but you’ve got to wear a lot of those hats too if you’re going to make progress.
Brian: Sure. As I’ve written to my list before, I’m definitely in the camp of good enough especially when it comes to doing research, because so much of doing good research is simply asking good questions, then stop talking and listen. It’s one of the nice things in tech where you could suck at it, you could get started and totally suck at it, and just get better, but you’re still likely going to get a ton of value out of it even if you think you can’t do it or you don’t know what to do. You’re worked up about, “Oh, what’s the psychology?” and how are you supposed to do the interview and all this kind of stuff. You can get a lot of value out of it just jumping in. There’s not a high learning curve there. I only support data scientists, engineers, especially technical people getting involved with that process.
Puneet: That’s right. If they can be facilitated by people that really know what they’re doing, all the better. Even data scientists and engineers should be able to listen with empathy and ask questions. I got a good number of tips from people teaching me how to do things like that. Basically, we ask a question or basically shut up and hear what their answer is. I also found it really helpful to summarize what I’ve heard and repeat that back to them to see if that captures what they’ve said, and then lots of asking open-ended questions at the end, “Is there anything that I’ve forgotten? Is there anything that you really think is important?” I think if you’re an early product and you’re looking for something important, you’ll hear it if it’s there. If you don’t hear it, then the question is, are you willing to listen to that if there isn’t something interesting there too? That’s a hard thing to face. It’s one you’ve also got to be open to.
Brian: That leads into my next thing. We just talked about being good enough with your user research that the activities that you’re going to perform are not getting lost in being perfect. Can you talk to me about that briefly in the world of data science about models being good enough versus good, especially from a business perspective? Is there also a similar thing over there? Talk to me about that.
Puneet: Yeah. I think Agile works there too and there’s a bunch of folks that are experts at applying Agile to the data ingest, modeling, and surfacing parts of it too. I think there’s a little bit of a different flavor there as well. One flavor I think is probably easier to understand which is, if you’re trying to produce an outcome and an 80% accurate outcome would change somebody’s life, then by all means, call that good enough. Ship it, test it, make sure you really got it right, and then Agile will improve that as you go along. I think the area that I see is generally harder to crack and wrap your head around, wrap my head around, is sometimes you don’t even know what the right question to ask is.
You might have some indication of interesting questions, interesting inputs into end user workflows, you might have some indication of interesting datasets that you could use to tackle those, but they’re never going to be perfectly aligned. What I’ve always found is, you try to connect the dots with a good enough model—don’t throw the kitchen sink at a problem to start with, throw something in there that’s enough to give you an idea if you’ve got some traction—because when you do that, when you produce your first outcome, you’ll learn things like, “Was this really the right question to be asking? Is this really the right way to measure success?”
Success may not be whether you get every individual recommendation right, it might be an overall sense of, we’re helping move the ship in the right direction. I generally find you want to connect your data to the question you think is most interesting to answer as quickly as possible, surface that, and then have some hard questions about whether that’s an interesting answer, whether there’s different data you should be bringing in, whether there’s different questions you should even be asking. That’s when you really want to get everybody at the table. I find that’s very much a feedback loop there. The question you think you’re answering from the beginning probably isn’t the one that you’re going to stay answering the entire time and you’ve just got to be flexible around that.
Brian: Is there a tendency for your business or, if you’re doing consulting work, your client to want to use whatever problem you figured out you could actually solve with the data, even if it wasn’t what was originally asked for? Like, “Oh, we did some new research, we found out there’s a market need for X, and the model actually gives you Y or V,” or something a little bit different. It’s close to X, but it’s really not that, and people don’t want to lose the investment. Is there a temptation to want to try to productize whatever was created because the technology seems like it’s good? Like, “Oh, it works. The board said we’re supposed to have an AI investment.”
Puneet: Exactly. When you start all this by doing AI, it turns out the answer to the question we want, we can’t really answer very well. Let’s say you find that. Let’s say the board is really asking for an AI-driven solution to something, you bring a team together, you solve a problem, and the answers just aren’t that great. What do you do? You can double down. You might have the wrong algorithm. You should try others. You might need different data sources. Your data may not be clean enough. There’s a million ways that you can improve your answer.
I think it’s an important part of the process to be able to figure out how much additional rope you have in each of those directions, and then eventually it becomes a business decision like, “Hey, we answered this question. It’s at 90% accuracy. We really need it to be at 95% accuracy before we can release it. Given these tests, we think that’s going to take six months and acquiring this data set.” That’s one possible outcome and I think a good data scientist will have a lot of intuition to be able to collect the data to give you that framework. Another thing is, “We’re only able to answer question X at 90%, but hey, question Y has never been answered before, and we think we’ve got an 85% answer to that. Is that interesting to you?” I think that’s a question for the business side. It can be refined.
One of the tendencies is to stick your head in the sand to get the best answer you could possibly get even at the point where maybe that’s not the question to be asking anymore. I think good organizations have enough collaboration where people are comfortable surfacing early enough results as works-in-progress where everybody can have an honest conversation about whether this is the right thing to work on, or if they should shift gears a little bit, or if they need to bring in additional resources to potentially get across the finish line.
Brian: Do you know ahead of time before you jump into doing technical work, if you have a fairly good like, “I know we’re storing this data, I know there’s an API to get whatever, some geo data that we need for this thing.” Is there a way to get ahead of this such that we don’t expend as much technical effort on something where you have a gut feeling that it’s not so much a technical problem, but what we can solve with this data is not going to be the X, it’s probably going to be the Y. We can jump in and see how good we get at Y, but it’s probably not going to be X. Is there a way to get to that sooner such that it doesn’t require doing a large implementation project just to arrive at that?
I mean it’s nice for the business to hear, “Well, we couldn’t do X, but we did come up with this other thing. Is that useful?” It’s probably not the best answer that they want to hear, and I totally realize that can happen, but how do we get ahead of that? Do you think more research can be done? To me, my design side would say, “What if you get the people that know this data really well with designers or researchers to go out and spend more time understanding the problem space with customers, such that maybe the data scientists can then predict earlier, not literally predict with tech, but in their head, they’ll have a better sense of what might be worth investing in.” Any comments?
Puneet: I think as a data scientist, you always have a gut feel for whether this data will solve that problem. I think it’s really helpful and I’ve seen this be very successful inwards. There are often a lot of people who have already solved problems like that, maybe even exactly that problem, before. If you can talk to them about what they did and where you’re likely to hit roadblocks, that can really help a lot. The other way is to set it up a little bit the way really great Agile designers and product people set it up, which is there’s no reason why you can’t do a quick mock up. Put that in front of some stakeholders and see what they say.
On the data side, that’s still a little more involved. Before you bring out the big guns and make a 15-layer CNN that has an integration with a recurrent neural network in it to solve X, Y, or Z, you can just try a pretty simple regression model. Try a linear regression on a very limited set of data and see if it gives you something interesting. Then in the same way that I think a good Agile product designer or UI designer would roll out features incrementally and get lots of good feedback, you can start increasing the complexity of your analysis. Given a lot of experience, I expect to see models improve rapidly when I add new data sets or I go to the algorithm that I think is really well suited to this compared to one that isn’t. If I don’t see that, that’s a sign that there’s something off.
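To make the "simple regression first" idea concrete, here is a minimal sketch, not anything Puneet describes building. It fits a one-variable linear regression on a handful of rows and compares it to the most naive baseline possible (always predicting the training mean). The toy numbers, the foot-traffic framing, and the function names `fit_line` and `baseline_vs_line` are all assumptions made up for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mse(preds, ys):
    """Mean squared error between predictions and observed values."""
    return mean((p - y) ** 2 for p, y in zip(preds, ys))


def baseline_vs_line(train, test):
    """Return (MSE of predict-the-mean, MSE of the fitted line) on test rows."""
    train_x, train_y = zip(*train)
    test_x, test_y = zip(*test)
    slope, intercept = fit_line(train_x, train_y)
    mean_pred = mean(train_y)
    naive = mse([mean_pred] * len(test_y), test_y)
    linear = mse([slope * x + intercept for x in test_x], test_y)
    return naive, linear


# Hypothetical toy data: (foot traffic, weekly sales). Invented for illustration.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
test = [(5, 10.1), (6, 11.9)]

naive_mse, linear_mse = baseline_vs_line(train, test)
print(f"predict-the-mean MSE:  {naive_mse:.2f}")
print(f"linear regression MSE: {linear_mse:.2f}")
```

If even this simple model clearly beats the naive baseline, that is some evidence the data carries signal for the question, and adding complexity may be worthwhile. If it does not, that is a cheap early warning before any large modeling effort.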
Brian: Do you have a concept for what an MVP is then? I mean, maybe you’ve described it already. I know what an MVP of a product, a UI looks like. I know maybe it’s not going to be super nice looking or it’s not going to have all the functions. Maybe it’s a table, you can sort it and filter it, and there’s a detailed view, but there’s no workflow or anything, but at least we’re showing the data, and you can add and edit stuff, create, read, update, delete, something like that. I wonder what that equivalent is on your side. Is the improvement only in the model, in the quality of, like if you’re doing predictive analytics or something like that, it’s primarily in the model quality, or is there a broader MVP definition that you think about in terms of what that increment looks like?
Puneet: Yeah. I’m just going to riff off of this. I don’t think I have the answer and there’s probably more in this. There’s folks that are really invested in bringing AI in a very agile way to organizations. In my experience, you can usually tell pretty quickly if some data is going to be suited to answering a question. You can do that sometimes just by looking at some samples in Excel and visualizations. If you’re looking to determine if this is a cat or a dog, you can take a look at some pictures and you can ask yourself, “Could I tell if this is a cat or a dog?” If you can’t, then you’re likely going to have some trouble.
There’s ways of getting people and domain experts to come in to tell you, “Hey, if I took a look at this data, I would certainly be able to conclude on the basis of this whether the outcome I want is in there or not.” That’s one way to do a very shallow quick MVP, “Hey, we’ve got this data set. Here’s a couple of rows. Do we think a very smart domain expert could take this row and make the right conclusion?” If yes, then there’s probably a good algorithm that could do the same. If not, well what would it take to enable that human to do it?
If you can get to that stage where you think you have the right data sets, where smart people looking at the data can give you the answer you want, then you’re probably in some business. The next question you should ask yourself is, do I really need an algorithm to do that? Maybe you want to fully automate this process in which case the answer is, “Yeah. Okay. If a human can look at this data and come up with a conclusion, I can certainly automate that.” Automation by itself is an interesting enough thing and I’m going to go after it.
Usually, that means the end user is making lots of decisions per day. There’s some chance of fatigue or whatever, it’s wasting our time, so if you automate it, it would be great. That’s one way to say that that’s valuable. Second is you say, not only could the human make this decision, we know that the human will have to make too many decisions during the day. We can increase our capacity by doing this in an automated way. Or there’s even more data out there that we think would leverage a better solution to this problem that the human can’t look at. For example, it could be lots of context.
Other things that are around. If the image is moving and there’s lots of sub frames, or if there is a lot of audio in there that is not easy to hear, but the computer might be able to pick out, a machine might be able to pick out some of the noises from. Then that’s another good reason to try because you might be able to improve over the accuracy. Or that a human will make this decision, but they’re only making the decision based on their limited experience. They can make this decision off the data, but we know we have a lot more training samples out there, so we could probably give a better accuracy. I think any of those are good reasons to believe that a simple MVP which you could define by, could a human make this decision pretty well, would work.
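One way to picture the "could a human make this decision from the data?" check is as a small scoring exercise. This is only a sketch under stated assumptions: the labels, the expert's guesses, the 10-point `margin`, and the function names are all hypothetical and chosen for illustration, not a method described in the conversation. It compares a domain expert's accuracy on a few sample rows against the accuracy of always guessing the most common outcome.

```python
from collections import Counter


def accuracy(guesses, truth):
    """Fraction of rows where the guess matches the known outcome."""
    return sum(g == t for g, t in zip(guesses, truth)) / len(truth)


def majority_baseline(truth):
    """Accuracy obtained by always guessing the most common label."""
    return Counter(truth).most_common(1)[0][1] / len(truth)


def human_decidable(expert_guesses, truth, margin=0.10):
    """True if the expert beats the majority baseline by at least `margin`.

    `margin` is an arbitrary illustrative threshold, not a recommended value.
    """
    return accuracy(expert_guesses, truth) >= majority_baseline(truth) + margin


# Hypothetical sample: known outcomes and what an expert concluded from each row.
truth = ["cat", "dog", "cat", "cat", "dog", "dog", "cat", "dog", "cat", "cat"]
expert = ["cat", "dog", "cat", "dog", "dog", "dog", "cat", "dog", "cat", "cat"]

print("expert accuracy:  ", accuracy(expert, truth))
print("majority baseline:", majority_baseline(truth))
print("worth modeling?   ", human_decidable(expert, truth))
```

If the expert cannot beat the baseline from the rows alone, the follow-up question from the conversation applies: what additional data would it take to enable that human to decide?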
Brian: I hadn’t really thought of it that way, but that’s pretty interesting. You’d be thinking about models and maybe the data required or available, based on understanding the human workflow or how the human is going about saying is it a cat or not, something like that.
Puneet: If you could prove to me that with this set of data, a human could make a decision that’s right and we’ve got that data, then these days your algorithm has a good shot of reproducing the human’s accuracy levels, if not exceeding them. Then the question is, what does the business really care about? Do they want a better cat classifier, or do they want a faster cat classifier? There’s lots of reasons to believe the machine could do better. Now, there are going to be some problems where even if you showed the data to a human, they’re going to have a hard time picking out what’s going on. I’m thinking about security logs here. If somebody showed you a bunch of IP accesses for a web server and some HTTP GETs and things like that, a human might not really be able to work through that and figure out what’s going on, but maybe a machine would. It’s not a foolproof way of defining whether this is a good path to an MVP or not, but it’s a good enough approximation for a lot of problems, if that makes sense.
Brian: Yeah, that’s interesting.
Puneet: Maybe we’ll talk this out later, but it’s the same way with our music project. I think what I got pretty excited about in our conversation is, you sort of gave me the input, this real book input. If a bunch of humans took a look at that real book input, they would be able to play a pretty good interpretation of the song. Now the question is, starting with that same input, can we train a machine so it could do similar? We know that a real book input with a trained musician is enough to produce something awesome.
Brian: Theoretically, it’s like if you have good data, then you can get a machine to do it as well.
Puneet: Exactly, that’s the question. Now, I think at least we have a good framework for something that I expect an algorithm to be able to do in that framework that I gave you.
Brian: Some of the challenges with that project were or will be, you have data in the form of the real book which is intellectual property. It’s not legal issues I’m talking about. It’s the IP of the song. It’s not the audio. It’s the embodiment of the song in ink. It’s not a full rendering of an orchestral score which is effectively almost all of the code to play a symphony, it’s all written down.
Obviously, there’s stuff that’s not on the page that’s interpretation and phrasing, and some of this, but there’s a lot more data there than there is in the real book. Are there business problems you worked on where there’s a similar thing, where you have an approximation of something like that, or maybe you have the aggregate? When we were talking about this as well, we’re separating out the song IP, the actual printed real book scores from the audio, which is a completely different thing. That audio is the sum of all the sound created by all of the musicians. It’s not multi-track audio. It’s not like a midi recording where piano, bass, drums, it’s all separated, it’s all digital, it’s actually just one big stereo mix. Are there parallels to that that you deal with or that you’ve dealt with, and do you think about it from a business standpoint, like a business problem you’ve had in the past? Are they totally different than this?
Puneet: Yeah. I think that is a very common occurrence. In that framework I was giving you before about if you showed this data to a smart domain expert, could they come up with what it means, think about healthcare data records where you might have an EMR set of information. This is the electronic medical record, which is a little bit more complete. You might just have a picture of claims data passing through, so the claims data is meant for the insurance company to be able to decide how much to charge folks, or maybe you just have what drug was prescribed, maybe you get it from a pharmacy.
Each of those would be very small and incomplete windows on a larger problem of what happened in this patient-physician interaction. There might be a smaller question you could ask which is, how much does somebody on average pay for medication? How often do they pick up their medication from the CVS? Some of that data might be more relevant, directly related to that question, but not to the broader, messier question of what happened in this patient-physician interaction.
Imagine somebody was recording a patient-physician interaction and you wanted to reproduce that from just the EMR or the claims data. That will probably be pretty tough to do. I can imagine giving the claims data to a doctor and they’re saying, “Well, it’s just not enough information for me to have any sense of what happened in this interaction.” I think it’s similar in that way. One of the things that comes out of that way of thinking about it is, sometimes data is good enough to answer a question, but only a narrow question. If you try to broaden the question, the data becomes incomplete, noisy, etc. You’ve got to think about what the data is capturing and if it’s relevant to the question you’re really trying to answer.
Brian: If you’re working in a company or something like that, do you look at projects as primarily being in one of two modes? One is lab research mode, where maybe we’re looking for a problem to solve with the data we have, so it’s ‘data first, problem second,’ and then there are projects that are ‘problem first, data second,’ and everything falls into one of those two, more or less. Is it binary? Or is that not how you think of it, and it’s more like trinary? Can you talk to me a little bit about that kind of lab mentality versus the research side?
Puneet: The problem mentality?
Brian: Yeah. Do you think of it that way?
Puneet: I think everything is sort of a grayscale and you’re somewhere in the middle. Maybe there’s a third point out there, so it’s a little more complicated than that, but those are two poles of things. I think it’s extremely rare to have somebody just say, “Hey, we’ve got this data, what can you do with it?” It’s rare for somebody to say, “We want to poke around with this data from an algorithmic perspective today, but we have no idea what it’s good for.” You’re probably closer to that side, especially today, when you have people that focus their questions more on the data strategy: “We’ve got this data. Is there anybody in the market that’s interested in some kind of derivative of this data?” and then it becomes a market research exercise.
On the other side, from the data science perspective, you’ve got, “Hey, we’ve got this problem, our accuracy is at 89.34%. We want it to be at 89.44%. Can you help us with that?” That’s also a very rare place, even though many people are conditioned to believe that it’s more common than not because of things like [..havoc.]. I think the truth is usually somewhere in between. If a business has really defined a problem very well, they’re probably doing a good job of focusing resources on it.
The truth is, they probably have an idea that they could be doing something better, they’ve seen some market demand for it, they need some help putting the pieces together, and that help could involve basically a data science MVP. They want to take this data with some algorithm to see if it’s going to be useful to tackle that problem. Or, “We’ve been using this algorithm on this data to solve this problem, we think there might be better methods. One of the reasons we believe that is because there’s this new data set we’ve been sitting on that we haven’t been able to use. What method would now incorporate this data in solving this problem?” Those are the situations I think where you’ve got a big shift in value.
Brian: Are there things that business stakeholders, designers, and user experience people can do to facilitate the process of working with data scientists, especially in the decision support tool space? Is there a way to optimize the work that you guys do, or to make it easier to get better value more quickly? Or maybe I can ask the inverse. Have you learned what not to do with that, or have you seen processes that didn’t work so well that you’re like, “I don’t think that’s the way to do it”?
Puneet: I think I have a perspective on both sides. When we were really early at LevelTrigger, we were talking to a design firm and we kind of had a half-baked idea about using location data to help retailers. Walking into the design firm, we had no idea what the workflow should be, and that really caused some trouble. I think if you’re on the data-driven insight side of the fence, you’ve got to do a lot of exploration or have some idea of what the workflow is before you bring in designers or UI folks. That may mean the first people you need to talk to are domain experts who are going to tell you how this data would or would not be used. I think it’s better to have as much of a sense as possible of the workflow that you’re both trying to improve before you really can start collaborating.
On the designer UI side, I think it’s extremely important, if you can, to give the data team, whether from the ingest side or the modeling side, as much insight as possible into how their outputs will be used, because there are folks over there who will have new ideas, and they’re likely to have more targeted new ideas, or even do a better job on the idea you already agreed on, if they understand how it will be used. Similarly, I think the designers and UI folks should very much demand and expect to understand how the data science team is putting together their output.
If the design and UI folks don’t understand that and can’t come up with a way to represent it, you can give all the decision support you want, but you’re not going to do it in a convincing way for an end user who you hope will adopt this. I think there needs to be a lot of transparency there. Both how and why does this work, and then how is it going to be used. That’s helpful for both sides.
Brian: I want to unpack the first part of what you said. For me, when I think about design, workflow is a core concept of design. Whoever does it or whatever title it is, that doesn’t matter, but I’m curious what failed or what didn’t go so well. So you said you walked in the door. Was that firm more of a UI design kind of place where they’re focused on ink and pixels, and not so much on workflow, which was a wrong match? I would have thought that they would have started with a problem space, use cases, even theoretical ones that need to go out and then be validated if they don’t start at the customer and instead start with maybe you guys as a proxy for the actual users, and you go out and try to validate that those problems actually exist. Was that just not their competency? Tell me about that friction or the gap.
Puneet: Yeah. I think we were just too early to leverage their capabilities. They were a little more downstream. We really needed somebody to help work with us to validate how this would be used. Their sweet spot was more, once somebody understands how this process is going to be used, we’re going to help you convert it into a set of mock-ups that we can then implement together.
There are a lot of different designers out there. There are a lot of different parts of the process: some that can serve as project managers through implementation, some that are just implementation, some that really can build the fastest, smoothest design, some that can mock up the best design, and some that are really more product manager and market research if they have experience in a given domain. I think understanding where you are as a business when you start doing some machine learning initiatives is pretty important.
Brian: I fully agree. Again, regardless of who the consultant or vendor or designer or whoever you work with is, the point is that that activity needed to happen. I think we both agree, me as a designer and user experience professional looking at the problem, and you as a data scientist looking at it. The theme here for the listeners is that you do need to understand this workflow. Whoever is going to go out and do it, that research or discovery needs to happen, because it’s going to feed everything downstream. It’s going to prevent wasting time, building the wrong stuff, and spending money on things that don’t do anything.
Puneet: Yeah. I think maybe the data scientist believes, “Hey, if I don’t have that right, we’re just going to build the wrong workflow, but the modeling is still going to be good. The pipeline can still be good.” The answer is no. If you don’t have the right workflow in mind as you’re building the data ingest and the data model or the algorithms, when you do figure out the right workflow, those things are probably going to change too. You need to have an agile approach across the board for all of it.
Brian: Right. Any other lessons learned from your experience? Again, that theme of facilitating the data science activities with product management and end user experience design, any other thoughts there?
Puneet: Yeah. I think it’s worth a little bit more conversation. The other point, coming upstream or downstream, whatever perspective we’re at, is that the UI folks, the designers, the workflow people, they should be able to understand why the algorithm is working. Why is this thing giving us answers in the first place? What are the inputs we’re getting into it? How do we expect them to change? Those are savvy end-users. They want to trust the system. If you’re successful, end-users end up like most of us with Google and Google Maps: now we don’t really worry about what they’re doing. But when we first start with products like that, especially in the B2B context, we want to know if we can trust the answers that we’re about to act on.
There needs to be some transparency in the process. Really good designers and analytics folks will give just the right level of peek behind the curtain to give everybody some trust. That’s going to start with the data science team and the ML team being able to explain what they’re doing to the designers, product managers, etc.
Brian: I think that’s good advice. I fully agree. When I’ve worked on some products that do root cause analysis or detect exceptions in systems, customers really want the conclusion first. But if you’re designing for a technical user, they also want some evidence. At least in my research in a particular space with IT software, they don’t really want all the data. They just want to know that somebody looked at the data and that there’s a substantiation behind the conclusion.
Over time, I think they stop really looking into the weeds because they start to trust the conclusions the system has made, because they can see it. As you said, it’s kind of like this line. Where do you draw the line between what you’re going to put in the UI? What’s too much and what’s just nothing? You’re right. That’s a design challenge and you’ve got to talk to people and get stuff in front of customers to figure out where that fine line is.
Puneet: On the data science side, if the front-end people, or the workflow managers, the designers, are telling me that the users want to get some trust in the system first, then maybe instead of a model that gives us no transparency into why we’re making the predictions we make, I need to choose a slightly less successful model that can give me that transparency, because it means it’ll be adopted more. Or maybe I need to wrap my understanding of the algorithm in some high-level statistics that the designer feels can give a feeling of trust to the user. There are lots of good reasons to do that.
Another reason, and we’ve seen this a lot: sometimes you make predictions and they’re wrong, and there’s nothing you can do about it. Depending on the use case and the workflow, you may need to have an end-user override function where they can come in and say, “Yeah, I know you told me to go turn off this server, but actually it’s fine. You’re probably missing some signal.” Either I’m going to drop your system now entirely because I don’t like it because of that one bad recommendation, or give me a way to let you know that that was wrong, and then build a model that uses that as further training so that doesn’t happen again.
If you’re going to solicit feedback from the user to override a problem, a false positive, that’s really valuable feedback to get from the user and you’d better take advantage of it, so you’d better build a pipeline that can actually use that bit of information to ensure that you never show a similar false positive again.
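A minimal sketch of the override loop described here, in Python. Everything in it is hypothetical and for illustration only: the `OverrideLog` class, the `build_training_set` helper, the feature tuples, and the 1/0 labels (1 meaning "flag this", 0 meaning "fine") are assumptions, not anything from a real system discussed in the conversation. It only shows the idea that a user's correction is stored alongside the original prediction and then replaces the old label the next time a training set is assembled.

```python
from dataclasses import dataclass, field


@dataclass
class OverrideLog:
    """Collects user overrides so they can be reused as training labels."""
    records: list = field(default_factory=list)

    def record(self, features, predicted, corrected):
        # Keep what the model saw, what it predicted, and what the user says is right.
        self.records.append(
            {"features": tuple(features), "predicted": predicted, "label": corrected}
        )

    def false_positives(self):
        # Cases where the model raised a flag (1) and the user said it was fine (0).
        return [r for r in self.records if r["predicted"] == 1 and r["label"] == 0]


def build_training_set(base_examples, log):
    """Merge the original labeled examples with user corrections.

    When the same features appear in both, the user's correction wins.
    """
    merged = {tuple(x): y for x, y in base_examples}
    for r in log.records:
        merged[r["features"]] = r["label"]
    return sorted(merged.items())


# Hypothetical usage: the model flagged a server, and the user overrode it.
log = OverrideLog()
log.record(features=(0.91, 3), predicted=1, corrected=0)

base = [((0.91, 3), 1), ((0.20, 1), 0)]
training = build_training_set(base, log)
```

A real pipeline would also need to decide how much weight a single override deserves and how to retrain safely; this sketch leaves those choices out.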
Brian: Yeah, you said some great things in there. I totally agree with feedback and I would put out a broad recommendation on that: even if that feedback is qualitative, it’s a, “Fill out a form and leave a comment on this prediction,” or something and it’s just an email that gets sent, the point is to start collecting the feedback. It doesn’t need to be highly technical, ranked choice, and all this kind of stuff. It’s just about understanding what you don’t even know you don’t know about and getting informed about that. I love the active way to do that and the passive, meaning you’re not going out doing research, but the tool has some embedded feedback mechanism. There are so many great tools these days to like, “Leave a message in the chat window.” Almost every SaaS now uses Drift or some similar plugin.
Puneet: We’ll watch you go to the second page of search results instead of clicking on something on the first page. That’s a great passive signal that you’re not seeing what you’re looking for.
Brian: I love what you said, I hadn’t really thought about that, and this is something I think is worth repeating, if I got this right. I think you said, you might sacrifice model quality, like, I’m going to downgrade the model we’re using to upgrade the transparency that we give, because adoption and engagement by the user will go up, and if you’re using commercial software, that might directly translate to attrition going down. People are remaining subscribers every month because they trust the system. Maybe the reality is you could do better predictions with a different model but you would lose some of that trust. Is that what you said?
Puneet: Yeah, that’s right. There are a couple of points on that. First, I think even as the modeler, from an internal perspective, I like to first start with models that I can really interrogate. Linear regression is really great because you can see what the coefficients are. It’s definitely not going to give you the best accuracy, but with that or a decision tree, you’re going to understand why it’s doing what it’s doing. It’s so easy to screw up your data pipeline, come up with nonsense, and then chase your tail for a long time figuring it out.
Part of the reason I’m a little bit stuck on this music project is because I think I’ve got some data normalization issues, and my model wasn’t giving me the feedback. You even want to have different models as you’re starting internally so you can understand what’s going on. From a cold-blooded flat point of view, if your model is slightly less accurate but it increases end user adoption because users also have some transparency, then you’ll have wider adoption that might lead to more training sets, which could end up improving your overall product accuracy even more.
I have a good friend that used to work in the FinTech world. They were buyers of fraud algorithmic software. They were processing a lot of transaction data, and you could buy commodity fraud algorithms from various providers to help determine whether a transaction was fraudulent or not before you decided to process it. What he told me is that there was a wide range of accuracies across these algorithms. Some of the algorithms were 5% more effective than some of the other algorithms. I was like, “Well, that’s crazy. Why wouldn’t you just buy the one that has the highest accuracy all the time?” He said it was because no matter what algorithm they bought, it was never going to be more than 90% accurate at the time, and they knew that if they rejected a transaction, they would have to get on the phone with somebody and explain why they rejected it.
They ended up purchasing from vendors that made that process easy. They got that workflow right. Everybody’s going to make mistakes, and they didn’t care if there were an extra two or three mistakes a day as long as they had a better workflow for resolving them. It’s an example of how, sometimes, accuracy isn’t the most important thing you should be optimizing for; it’s the rest of the package, how you deal with an error that inevitably occurs. If you can make that a good process, then I think you’re more on the road to making users happy than trying to live in this idealized world where you never make a mistake at all.
Brian: I think that’s great. I think it’s great advice for any data scientists listening who perhaps have more of a bent toward doing really good research and high quality modeling to think about that. You take a hit on the model. Maybe you won’t feel as good about pushing out something that’s not as accurate, but you’ll have the satisfaction of knowing that you’re changing someone’s life or you’re impacting the business. Maybe you’re going to build more revenue and income, which might allow you to then go do better work, like reinvest in the platform that you’re building, so there could still be a win. As you said, maybe your training data grows, and so you can still win by taking a quality hit at the beginning.
Puneet: You can always recover the quality hit later; it’s just a question of what you want to learn about first. I’m not saying choose a linear regression model always. Deep learning models are great. You get really high accuracy with them, but when you’re learning, sometimes there are more important things than the highest accuracy, especially if each one of your end users is very valuable to you.
Brian: Cool, that’s good stuff. This has been a really good conversation. Do you have any other parting words or advice for data product managers or analytics leaders, people living in the space that they might be able to take away from your experience?
Puneet: No, I think we covered a lot there. There’s probably a lot of garbage in there but hopefully some of it was useful.
Brian: No, this was a really good conversation and I hope people got a lot out of it. I’ve been talking to Puneet Batra again, data scientist. Where can people find out about you? Are you on the interwebs, Twitter, or any of those places?
Puneet: Yeah, I’m on Twitter. It’s @GPBatra and LinkedIn is another good place. Look me up in either of those and I’m happy to talk to people.
Brian: Cool. I will put those in the show notes. Maybe we’ll get a chance to sync up down the road a little bit if this little music project that you’re working on goes somewhere. We can talk about creativity.
Puneet: I hope they make it useful.
Brian: Cool. Thanks again and I hope to see you soon.
Puneet: Okay. Bye.