I once saw a discussion on LinkedIn about a fraud detection model that had been built but never used. The model worked — and it was expensive — but it simply didn’t get used because the humans in the loop were not incentivized to use it.
It was on that thread that I first met Salesforce Director of Product Management Pavan Tumu, who chimed in about a similar experience he had gone through. When I heard his story, I asked him if he would share it with you, and he agreed. So, today on the Experiencing Data podcast, I’m excited to have Pavan on to talk about some lessons he learned while designing ad-spend software that utilized advanced analytics — and the role of the humans in the loop. We discussed:
- Pavan's role as Director of Product Management at Salesforce and how he works to make data easier to use for teams. (0:40)
- Pavan's work protecting large-dollar advertising accounts from bad actors by designing an ML system that predicts and caps ad spending. (6:10)
- 'Human override of the machine': How Pavan addressed concerns that his advertising security system would incorrectly police legitimate large-dollar ad spends. (12:22)
- How the advertising security model Pavan worked on learned from human feedback. (24:49)
- How leading with "why" when designing data products leads to a better understanding of the problems customers need to solve. (29:05)
Quotes from Today’s Episode
“We are trying to make it easier for teams who are producing and consuming data to use the right information at the right time, for the right reason, by making the data maturity better and by reducing and eliminating friction. We are reducing the risk without compromising on the pace of innovation.” - Pavan, on his job as Product Director at Salesforce (2:20)
“Through data governance councils at Salesforce, we bring in process maturity and further understand where our process is breaking down or where it is not being an enabler, or maybe where it is introducing more friction than desired.” - Pavan (3:58)
“It was a reality that accounts would get compromised; no matter what we did — there would always be scenarios where those things will happen. So, we could either try to be at the edge of it and keep bleeding when that happens, or be smart about it and anticipate that these things would happen and be better prepared to address them, and bring in our customers and enable their trust.” - Pavan (10:38)
“Part of the solution that I was trying to get to market was to help both the customers whose advertising spend account may be compromised as well as everyone else who’s participating. So, in general, I was trying to bring up the trust quotient of the solution and make it as a product differentiator.” - Pavan (12:22)
“While you worked to improve the model that predicts whether or not advertising spend is legitimate, you had to think about all these different, quote, “Selfish interests”—“I want to sell my products, I want ads placed at the right time.” And then someone else wants to just spend—“Make sure you spend that budget, so I get my paycheck.” The model wants to prevent fraud and risk for the overall business, but there are all of these different competing interests there. It’s about understanding where we allow that human feedback to come in and say, “No, we’re going to override here.” - Brian (25:10)
“One maxim that I use as gospel [when designing human-centered data products] is, “Always lead with why.” Try to understand, “What are the deep business processes of my customers?” “What is the job to be done for my customers?” … A customer is, at the end of the day, trying to do something — so understanding the context within which they are trying to do something is the differentiator between a product that is used versus a product that is not.” - Pavan (29:25)
Brian: Welcome back to Experiencing Data. This is Brian T. O’Neill, and today I have Director of Product Management from Salesforce, Pavan Tumu on the line. Pavan, what’s up?
Pavan: Hey, Brian, thanks for having me.
Brian: Absolutely. We met on LinkedIn. I’ve talked a few times on the program about what happens when we build technically right, effectively wrong software. There was a discussion about a fraud detection model that had been built, and it ended up not getting used at all; it took a long time and a lot of money, the model worked, and it simply didn’t get used because the humans in the loop were not incentivized to use it, or it didn’t fit into their workflow. And I think you had chimed in on this thread on LinkedIn and said that you had had some similar experiences, or had gone through some learnings about, kind of, the what versus the why.
And so I wanted you to share some of your background from Salesforce, and, if you want, tell us a little about where you came from and how you got your start in the analytics and data science fields.
Pavan: Sure thing, Brian. Yeah, I mean, I don’t know who does that, like, builds products and—it’s like, no one does that. So—
Brian: You’d be surprised.
Pavan: It’s an all too familiar story that we hear over and over. A little bit about myself: I’m Pavan Tumu, Product Director at Salesforce, focused on data products, so for most of my career, data has been an anchor. I’ve been with different employers, large enterprises, focused on advertising, then on cloud, and I continue to focus on cloud here at Salesforce. Right now, I’m working within a team focused on information management and strategy.
So, think of us as the team trying to improve data maturity and to reduce and eliminate friction, making it easier for teams who are producing and consuming data to use the right information at the right time, for the right reason. We are reducing risk without compromising on innovation, or the pace of innovation, rather. That’s what we’re focused on; that’s what my day job is.
Brian: Mm-hm. How do you measure “It’s easier to consume the information?” Was that the metric? How do you guys measure that?
Pavan: Well, it’s a combination of art and science; a lot of art, not as much science as I’d like at this point. But there’s a bunch of metrics we’ve defined, like: how many of these data sources are enabled for self-serve? Meaning, if someone wants to gain access, are they able to do so without having to go through a number of hops? How long is it taking? And so on.
Brian: Sure, sure. Tell me about the art piece, though, because I think sometimes we feel like if we can’t measure it, that it doesn’t matter. But there are definitely more qualitative aspects, especially with design, even user experience, so what are some of the art—when you mentioned art, can you unpack that a little bit?
Pavan: When I say art, it is all around process. If you look at the golden triangle of people, technology, and process, there is some technology that we are enforcing, and that’s where the science comes in. But in terms of process, my VP owns the data governance councils for Salesforce. Through those councils, we bring in process maturity and understand where our processes are breaking down, where they are not being enablers, or maybe where they are introducing more friction than desired. We work with people, we coach people where needed across all these different activities, and then we try to shine internal spotlights on teams who have been successful at doing something.
So, by doing that and sharing the secret sauce internally, the goal is to drive ease of use forward across different parts of the company: having someone else model it for you, and seeing that these things do work in the real world, versus the abstract notion of a data maturity framework where you have to get from here to here by doing these things. There’s lots of theory there. Seeing how a colleague in a different part of the company has implemented it, and has actually been able to track it back to tactical solutions or strategic business unblockers, is what helped.
Brian: Got it. Got it. When you talk about the friction in the experience for the people, are there particular roles or people that are in charge of that aspect of getting that part right? Do you invol—are there UX people, or designers, or engineers? Does everyone collectively own it? How do you guys attack problems like that?
Pavan: I think it is collective ownership at this point, but then there is also ownership in terms of data. So, we are a data-heavy company, we focus on trust, we believe deeply in our ethos that we need to secure our customer information more than anything, so there’s also emphasis on data in that sense. So, the data source, wherever the data lives, that teams also take a larger part of ownership. But then it’s a team sport; the functions across the company, together, come to make it happen.
Brian: Mm-hm. Mm-hm. So, one of the things that we were talking about when we first met, a couple months ago I think it was, you were working in the advertising spend kind of space and I think you were building either some predictive models or something like this. And it sounded like you had gone through a real lightbulb, kind of, process. Can you talk about the pre and the post; there was this moment or something triggered for you? Can you share that story with us?
Pavan: Yeah, sure. This was a few years ago, probably, so some of it might sound like, “Who does that? Why were you not aware, or not in the know?” But the experience still stands, right? The situation was striking the balance between preventing advertiser fraud and ensuring that legitimate advertisers, whose accounts had not been compromised, were able to spend as much as they wanted.
When I say compromised, there could have been, you know, an account hijack, or someone could have fallen prey to a phishing attack, or what have you. So, the advertiser is a legit advertiser, but a fraudster has gamed them and taken control of their account. To the system, it comes across as a legit advertiser.
Brian: I want to paint the scenario for people so they can follow along; I didn’t know about this until we talked. My understanding is, if you’re managing an ad spend account, you might put a ceiling on it, just like your credit cards. If you buy ads on Facebook or something, you say, “I’ll spend $100 a day and no more than $1,000 for the campaign.” Well, add a bunch of zeros for corporate. And the concern is that if another advertiser, an illegitimate one, were to hack into an account, they could basically buy a whole bunch of ads for themselves using your money. That was the security risk involved. So, you needed a system that would, kind of, police for abnormal spend, but make sure it didn’t stop legitimate spending from happening. Is that correct?
Pavan: For the most part, yes. There are more nuances, but I think we can get started with that. It’s like, there is a fine balance to be managed between these two, we are allowing the right folks. We are not allowing situations which are abnormal.
Brian: Got it. Got it. And I’m guessing this is because the budget doesn’t know which creative was served. So, it can’t really be accountable to knowing if it’s a fraudulent ad or not? Something like that?
Pavan: That’s one part of the larger puzzle, but the biggest concern is that a customer could be spending, at any given point in time, on multiple advertising campaigns in parallel. If you’re talking about a very large multinational, they could have 50 or 100 advertising campaigns running in parallel at the same point in time. Each campaign comes with its own budget, and at an account level, some segments of customers have budgets too. So, there is a ceiling at the account and a ceiling at the campaign. But let’s say you lose the keys to the kingdom and a bad actor is in the picture; they basically get to control all of these. Say customer X had just two campaigns running in parallel at $1,000 and $5,000, with a $10,000 account cap. What is to stop a bad actor who took over the account from coming in, jacking up the account budget from $10,000 to $100,000, and increasing the campaign budgets 10x or 20x?
As far as the system is concerned, this account looks legit for a while. In the next 24 to 48 hours, before detection kicks in and the compromised account is caught, they would have racked up tens or hundreds of thousands of dollars in spend, especially when a large account is compromised. That’s what we were trying to prevent.
And going back to your original question, Brian, about the before and after, what I was dealing with was ultimately bringing in this balance. It was a reality that accounts would get compromised; no matter what we did, there would always be scenarios where those things happen. So, we could either try to be at the edge of it and keep bleeding when that happens, or be smart about it, anticipate that these things would happen, be better prepared to address them, and bring in our customers and enable their trust. And then there was the other angle to it. Let’s say, at any given point in time, there are 100,000 customers across the world participating in the advertising marketplace, and they are trying to bid on, say, roses. It’s Valentine’s week, let’s say; they’re bidding on roses, so there is a specific range within which they are bidding, because they are trying to win that position for advertising.
If a bad actor came in, they have no incentive to be financially responsible. So, for the same keyword, roses, instead of bidding $3.20 or $1.20, they could bid, like, $350 and win the bid. But in doing so, they’re skewing the marketplace, and the rest of the advertisers are also paying a higher price.
So, it’s not just the specific advertiser whose account might have been compromised who is impacted; everyone else participating in the marketplace at that point ends up paying more. That’s part of the solution I was trying to get to market: to help both the customers who might be compromised and everyone else who’s participating. In general, bring up the trust quotient of the solution and make it a product differentiator, while at the same time ensuring that, from a product perspective, we are not taking the fraud hit, because we would have to write that off and take a hit on our bottom line. But in doing so, the biggest question was, are we going overboard?
If we are over-conservative, we shut down accounts too soon; we might actually start throttling customers whose accounts are probably not really compromised, and then we take a hit on our top line. So, that’s the balance that had to be struck. I won’t get into the specifics of the solution, but a solution was built, the best possible model, and it was able to predict. What it would do is come back and say, “Hey, Mr. Customer, you’ve been spending this much. I’ve looked at your trend, I’ve accounted for seasonality, I’ve extrapolated into the future, and based on that, you don’t really need to spend more than this amount, because you won’t. So, I’ll cap you at this.”
So, if something happened, if someone came back and said, “I want to bid $300 on a $30 or $3 ad,” we would start firing off red flags, and then we would start taking more drastic measures: pausing any new changes, or even blocking the account from further spend, and all such fun stuff. So, this was done, and then it was waiting to be adopted. With great pains, I was able to get it out for the top 500 accounts, back in the day. But that’s it. It took a lot of pain, and it wasn’t gaining a lot of traction.
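For readers who want to see the shape of the cap-and-flag behavior Pavan describes, here is a minimal sketch. The function names, the safety margin, the flag ratio, and the simple historical average are all illustrative assumptions for this article, not the actual model, which Pavan deliberately leaves unspecified:

```python
from statistics import mean

def predict_daily_cap(daily_spend_history, safety_margin=1.5):
    """Illustrative stand-in for the real forecast: derive a spend
    ceiling from historical daily spend, padded by a safety margin."""
    baseline = mean(daily_spend_history)
    return baseline * safety_margin

def check_bid(bid, typical_bid, flag_ratio=10.0):
    """Red-flag bids wildly above the historical norm,
    e.g. a $300 bid on what is usually a $3 ad."""
    return "flag" if bid > typical_bid * flag_ratio else "ok"

cap = predict_daily_cap([900, 1100, 1000, 1000])  # 1500.0
print(check_bid(300.0, typical_bid=3.0))          # prints "flag"
print(check_bid(3.50, typical_bid=3.0))           # prints "ok"
```

The real system also accounted for seasonality and growth trends; this sketch only shows the basic idea of deriving a ceiling and flagging outliers against it.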
So, that’s where the light bulbs went off. That’s where, Brian, I connected with your article from a few months ago: what’s really going on? I think everything was done right. We looked at what’s valuable for the customer, we balanced it against not penalizing the right customers, we were building trust with the compromised accounts. What’s really going on?
It turns out, if I were to do this all over again, there are a few things I missed. There are teams running behind the scenes to sign deals with these large customers, and their paychecks, literally their paychecks and quotas for sales targets, rely on how much a customer spends. They were genuinely concerned that we might be over-restrictive, that we might limit the spend of a customer, because history can’t always predict the future. Yes, you have the model, it can predict into the future, but what if the realities of this specific customer on that specific ad campaign changed, and they actually want to spend more because they want to increase their target audience, or whatever else might have changed for that customer? Those were real concerns. So, that’s where my whole thing about understanding why became extremely important.
So, that’s the learning that came from there. I ended up meeting with a few area leaders for advertising to understand what was there, what we could do to ease the concern, what was actually concerning them, and why they would see this solution as a possible impediment to sales. What it distilled down to is that they needed visibility, along with the customers, on when an action was being taken. Part of the solution that was rolled out would say, “Okay, Mr. Customer X, here is your spend pattern. You have a budget limit of, let’s say, $500,000 on your account across your 50 campaigns. Looking at your last four years of Valentine’s spend”—let’s stick with Valentine’s because we started that way—“adjusted for the number of campaigns and adjusted for growth, here is how much you would typically spend across all these campaigns: let’s say, in a day, you would spend $75,000.”
So, even if you have $500,000 as budget and you’ve funded your account, we bring in guardrails that won’t let you spend beyond $75,000 a day. And before you hit $75,000, let’s say when you are at $55,000 or $60,000-something, which is a moving target, you start getting notifications, and then you are able to adjust. You can then use an override to say, “You know what? I get it. This is different. Let it in.” Or you could say, “Oh, my God, thanks for letting us know. Maybe we are going too fast. We’ll slow down.”
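The guardrail-and-override flow described above can be sketched roughly as follows. The state names, the 80% alert ratio, and the override handling are illustrative assumptions for this article, not the production system:

```python
def spend_state(spend_today, daily_cap, alert_ratio=0.8):
    """Classify today's spend against the model's daily cap.
    Past the cap, spend is blocked unless overridden; near the cap
    (80% here, illustratively), the customer and account team are alerted."""
    if spend_today >= daily_cap:
        return "blocked"
    if spend_today >= daily_cap * alert_ratio:
        return "alert"
    return "ok"

def effective_state(state, customer_override):
    """A one-touch override from the customer or account team lets
    legitimate spend continue past the guardrail."""
    return "ok" if customer_override else state

print(spend_state(60_000, 75_000))                           # prints "alert"
print(effective_state(spend_state(80_000, 75_000), True))    # prints "ok"
```

The key design point, as Pavan explains below, is less the thresholds themselves than the fact that the alert and the override are a fast, low-touch experience for both the customer and the account team.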
Brian: So, part of the issue here was the top 500 users who had some kind of policing or guardrails on the spend, they had an option to use this, and when it rolled out, you saw low engagement, like, low opt-in to use this? Is that correct? Or it was turned on by default, and they could turn it off? What was the metric you used to know that it wasn’t being respected?
Pavan: It wasn’t the top 500 users, it was the top 500 customers, and it was rolled out by default, meaning we enabled it for all of those customers. How did I know it was not being used? It wasn’t about those 500 customers; we had hundreds of thousands of customers, and I was not securing a go, a green light, to roll it out broadly. It was narrowed down to a few of the largest customers because the potential downside for those customers was humongous. I was able to move forward with that, but I thought the value was in getting to scale, and that was not happening. That’s how we found it.
Brian: Oh, so your own internal—some department, whether it’s sales, or whoever does these accounts didn’t want this turned on because the incentive for them was, you know, if I land a half-million dollar ad spend budget, I want them to spend that because I get paid in part on that half million, and you’re putting up some guardrails—with good cause—but I’m concerned that maybe you’re going to actually incorrectly police some legitimate spend, which hurts my paycheck at the end of the day.
Pavan: Yeah. I think incorrectly is the keyword. Thanks for summarizing that. Incorrectly is the keyword: that’s where the concern stemmed from—
Brian: Got it. Like a false positive.
Pavan: —are we being over-restrictive, do we have a high false-positive rate, and what have you. The solution that shifted this whole conversation came from, as I said, a few missing pieces of the puzzle. We were able to bring in and roll out one piece, which would detect any single customer who’s put through this [unintelligible 00:19:58]—with their opt-in, obviously—so that their spend would have a financial guardrail to prevent losses. What I mean by that is, if they are actually intending to spend and they get close to that limit, even within 80% of it, then alerts fly out, both to the customer and to the account team, with a one-touch experience for both, if the customer opted in for the account team, where they could change it and override. Let me take an example: customer X has a $500,000 budget, and our model suggested they shouldn’t spend more than $25,000 in a day.
That’s $500,000 for the quarter, maybe, or a month, whatever it might be. So, you shouldn’t spend more than this. Then their actual spend comes close, to, you know, $22,000 or $23,000 in a day, toward the 21st hour of the day. Alerts would start firing, and the customers—these are large customers; they have programmatic management of campaign budgets, they have automations in place—would detect that the alert was triggered and automatically adjust. They could take one of a few actions. They could say, “I got it. Push it through. That’s okay, I’m authorizing this to move forward, so even if it goes beyond $25,000, it’s good to go. We know what we are doing. Great.” Or they delegate it to their account team, who look at it and say, “Yep. There is this new thing. They’ve extended it beyond this locale”—maybe they were targeting a specific city and have extended the targeting to a few cities, and the paperwork didn’t come in, or whatever nuances the account team might be aware of. So, this was the missing piece of the puzzle.
Brian: Human override of the machine intelligence?
Pavan: Yeah. Override, and the experience of doing it quickly, in time, in line, without having to go through, “I need to get on a call with these three people and get approval.” That whole thing was cut down. So, that’s where the experience, and design, and why matter. Understanding the why, in this case: it needs to be real-time, it needs to be done in conjunction with the customer and an account team, honoring the opt-in and opt-out preferences of the customer, and it needs to be done before the spend of an advertising campaign is impacted.
Maybe the campaign is doing so well, it’s getting the visibility it rightly deserves, the creative team and the editorial team have done such a fantastic job, and if you [impede 00:22:53] at the prime hour, then having a solution two hours later, or even twenty-five minutes later, is less effective. Understanding that is what helped us solve it and design it in a way where it was a low-touch experience. We could enable deep integration into the customer’s programmatic API stack, so they could manage those budgets, give overrides where relevant, and take more actions where needed. And I’ve only spoken about one piece of the puzzle. There is also the other piece, which is that these customers were probably never aware. Previously, in a legit account compromise scenario, they would not have acted for at least 24 to 48 hours. In the busy ad season, that’s a few million dollars on the line for each customer.
So, they could now go ballistic and lock things down, stopping any new creatives from being approved. Programmatically, the customers can take actions like blocking any ad creative introduced into the system in the last 24 hours. Or they could say anything coming in from now forward is blocked, or anything with a price hike is blocked. There are all these levers at their disposal that they could act on.
So, that was basically the conversation changer. Once that experience was at hand and the end objective could be met, the rollout naturally progressed; I didn’t even have to chase anybody. We got to north of 80% of all our customer base that could benefit from this solution.
Brian: It sounds like you went through that process of understanding that there are other stakeholders here that maybe we didn’t think of as actors or users in the system: your own colleagues on the account management side very much had an incentive in how the system worked, even if they were not literally managing the campaigns. And I can see how this maps to the fraud thing we were talking about, because just getting the model right, the one that predicts whether or not spend is legitimate, is insufficient. Not only did we have to think about all these different, quote, “selfish interests”—“I want to sell my products, I want ads placed at the right time,” and then someone else wants to just spend, “Make sure you spend that budget so I get my paycheck”—the model wants to prevent fraud and risk for the overall business. There are all these different competing interests, and it’s about understanding where we allow that human feedback to come in and say, “No, we’re going to override here.” And this actually leads me to my next question: did the system learn anything from the overrides? Did the models take in that feedback and make adjustments in the future?
Pavan: It certainly did. There were multiple iterations rolled out over the next year or so after the original version. The largest improvement actually had nothing to do with fine-tuning the model; it had to do with making the overall end-to-end ecosystem more efficient and faster. But in terms of the model itself, less than 1% of customers ever got close to the 85% threshold. Very few outliers went past it, which made me go back and refine the model, because it was more risk-averse than it needed to be; it wasn’t as aggressive as it should have been.
It didn’t hurt us, but we could have reduced the possible budget limits. So, in the next wave, we rolled out a tiered model depending on the credit limits of the customer and how many lines of business they had. We made it more complex, for the right reasons, for the right customers. Customers whose spend and campaign depth were straightforward got the vanilla version. So, it wasn’t one size fits all, which is what we started with: segment-based caps versus industry-based caps, if you will; maybe that’s a better way to put it. We were able to customize it to say, for example, that for specific industries, bakers and florists, say, Valentine’s seasonality is more relevant, compared to a Best Buy or the like, for whom the seasonality of Thanksgiving has better applicability.
All of those benefited from that feedback loop. But as I said, the biggest area of innovation in future iterations wasn’t something that came from there; it was the latency, the speed at which we could act. And I should thank those concerns that were raised; sometimes you don’t know what else you’ll find when you implement things. What we found was that the amount of campaign data flowing through this large system was massive; we are talking about millions of campaigns. We could optimize how these caps were applied so we only focused on relevant campaigns, meaning we didn’t need to take action on all those millions of campaigns every day, every hour. A lot of optimization went in afterward, which increased the velocity at which we could act. To give perspective: when we started, it took around a few hours to make this happen; by the third or fourth iteration, we had gotten down to under five minutes to act on a decision.
Brian: Right, right. Hey, Pavan, this has been fantastic. Thanks for sharing these insights. I was just curious, in closing, stepping away from this particular example: since then, you know, we’ve all gone through our changes and career growth. Do you have any closing thoughts about designing more human-centered data products that use analytics and machine learning? What are some other findings you can share with us as we close out?
Pavan: Yeah. The biggest takeaway I took from that, and one that I use as gospel, is: always lead with why. Understand, what are the deep business processes? What is the job to be done? What is the job to be done for your customers? Those could be your internal customers; if you’re a data person, the odds are you’re serving an internal customer as much as an external customer. A customer is, at the end of the day, trying to do something, and understanding the context within which they are trying to do it is the differentiator between a product that is used versus a product that is [unintelligible 00:30:08].
Brian: Amen. [laugh]. Totally with you there. I think that’s a great place to leave it, and thank you, Pavan, so much for coming on the show. We’ve been talking to Pavan Tumu, Director of Product Management at Salesforce. So, thanks for coming on Experiencing Data and sharing this with us.
Pavan: Absolutely. My pleasure. Thanks, Brian. Thanks for having me.