074 – Why a Former Microsoft ML/AI Researcher Turned to Design to Create Intelligent Products from Messy Data with Abhay Agarwal

Experiencing Data with Brian O'Neill (Designing for Analytics)

Episode Description

The challenges of design and AI are exciting ones to face. Success in that space depends on many things, but one of the most important is instituting the right design language.

For Abhay Agarwal, Founder of Polytopal, the necessity of a design language for AI became clear when he began to think about design during his time at Microsoft working on systems to help the visually impaired. Stepping away from that experience, he leaned into creating a new methodology of design centered around human needs. His efforts have helped shift the lens of design towards how people solve problems.

In this episode, Abhay and I go into detail on a snippet from his course page at the Stanford d.school, where he claimed that “the foreseeable future would not be well designed, given the difficulty of collaboration between disciplines.” Abhay breaks down how he thinks his design language for AI should work and how to build it out so that everyone in an organization can come to a more robust understanding of AI. We also discuss the future of designers and AI, and the ebb and flow of changing, learning, and moving forward with the AI narrative.

In our chat, we covered:

  • Abhay’s background in AI research and what happened to make him move towards design as a method to produce intelligence from messy data. (1:01)
  • Why Abhay has come up with a new design language called Lingua Franca for machine learning products [and his course on this at Stanford’s d.school]. (3:21)
  • How to become more human-centered when building AI products, what ethnographers can uncover, and some of Abhay’s real-world examples. (8:06)
  • Biases in design and the challenges in developing a shared language for both designers and AI engineers. (15:59)
  • Discussing interpretability within black box models using music recommendation systems, like Spotify, as an example. (19:53)
  • How “unlearning” solves one of the biggest challenges teams face when collaborating and engaging with each other. (27:19)
  • How Abhay is shaping the field of design and ML/AI -- and what’s in store for Lingua Franca. (35:45)


Quotes from Today’s Episode

“I certainly don’t think that one needs to hit the books on design thinking or listen to a design thinker describe their process in order to get the fundamentals of a human-centered design process. I personally think it’s something that one can describe to you within the span of a single conversation, and someone who is listening to that can then interpret that and say, ‘Okay well, what am I doing that could be more human-centered?’ In the AI space, I think this is the perennial question.” - Abhay Agarwal (@Denizen_Kane) (6:30)

“Show me a company where designers feel at an equivalent level to AI engineers when brainstorming technology? It just doesn’t happen. There’s a future state that I want us to get to that I think is along those lines. And so, I personally see this as, kind of, a community-wide discussion, engagement, and multi-strategy approach.” - Abhay Agarwal (@Denizen_Kane) (18:25)

“[Discussing ML data labeling for music recommenders] I was just watching a video about drum and bass production, and they were talking about, “Or you can write your bass lines like this”—and they call it reggaeton. And it’s not really reggaeton at all, which was really born in Puerto Rico. And Brazil does the same thing with their versions of reggae. It’s not the one-drop reggae we think of Bob Marley and Jamaica. So already, we’ve got labeling issues—and they’re not even wrong; it’s just that that’s the way one person might interpret what these musical terms mean” - Brian O’Neill (@rhythmspice) (25:45)

“There is a new kind of hybrid role that is emerging that we play into...which is an AI designer, someone who is very proficient with understanding the dynamics of AI systems. The same way that we have digital UX designers, app designers—there had to be apps before they could be app designers—there is now AI, and then there can thus be AI designers.” - Abhay Agarwal (@Denizen_Kane) (33:47)



Brian: Welcome back to Experiencing Data. This is Brian T. O’Neill. Today we’re going to talk about human-centered AI. And I have Abhay Agarwal here, the founder of Polytopal, which is a consulting and services firm.

You have a really interesting background, so I’m really excited to talk to you. I don’t know if ‘convert’ is the way to say it, but you’re a convert from the world of data science over to design, so I’m sure you have a very fascinating perspective to share with us today. So, welcome to the show. First off, thank you for coming on.

Abhay: Thank you for having me. I appreciate it, and I love what you’re doing here.

Brian: Ah, well thank you so much. So, I have this little quote that I read in one of your bios, I think on the Stanford course page. It said, “You turned towards design after realizing that most machine-learning products in the foreseeable future would not be well designed, given the difficulty of collaboration between disciplines.” So, what happened in your li—like, something must have happened—or multiple times—to get you to state that. So, what’s the background on that?

Abhay: Yeah, that’s funny because, in future iterations of that paragraph, I sort of tried to give it a more optimistic spin. So, essentially my background is, I was an AI researcher; I was in the research world. So, I was at Microsoft Research, in Building 99, if anyone listening to this is familiar with the Microsoft campus in Redmond. And we were working on—at least my team was working on—hardware tools for visually impaired people. So, we had this ambitious goal of using computer vision and new neural network technology for object recognition, for describing the world, and we wanted to apply those to a hardware instrument to help visually impaired people see, essentially.

Give them the faculty of vision in this new way, through a verbalized seeing of the world. And so we built these instruments. I designed and built one that was a belt; it had a camera and gesture sensors, and used all this fancy computer vision, and it looked at what you were picking up and tried to verbalize it to you. And immediately what I realized through that experience, that multi-month experience of developing it and working with an engineering team on it, is that we really needed good design principles. We did not have good design principles for developing this kind of instrument, this technology.

The computer vision and the way that those objects were recognized and all that, those were living within this world in which the data and the algorithms had purely dictated any kind of user experience, any kind of output that the system had. We were sort of at the mercy of these systems. And I was really intrigued by that question and wanted to really investigate that at a deeper level. So, I actually switched my attention, as it were, sort of in a research capacity and also in a career capacity, towards design and trying to uncover what it is that makes it possible to develop a good design method, good design principles for tools like this.

Brian: Mm-hm. You said—I read at least—that AI needs a design language. And I’m seeing more toolkits, and frameworks, and pattern libraries, and user experience approaches to this, so I want to dig into that. But most of our listeners right now—who are primarily non-designers—are probably thinking about graphical user interfaces when you say AI needs a design language. So, can you explain what you mean here by ‘a design language?’ And how is that different than, say, software architectural design, or other IT concepts related to building models and all that? There’s tons of plumbing, and can we use the word design even when we talk about an algorithm? So, what does it mean for you when you say a design language for AI?

Abhay: Totally. It’s the first and, I would say, key question to really thinking through my point of view on this issue—and obviously other folks might have different points of view. The way that I think about design is holistic. And that comes from the tradition at the Stanford design school, which is where I got my Masters, and the tradition that I adhere to—if you consider there to be schools of this—is design as an expansive way to build things that are intended to solve problems and address human needs. And what that necessitates is actually observing people; it necessitates understanding people, how they behave, how they experience the world, and then taking that observation, that ethnography, that research, and turning it into a direction for whatever it is that you do, whether it’s technology, whether it’s architecture, infrastructure, and so on and so forth.

So, pattern languages like Christopher Alexander’s A Pattern Language are, to me, while not described as such, fundamentally design work, where you’re looking at how people behave and interact with their lived environments, and then turning that into a set of key operating principles. So in AI, I feel like the same notion of design is at play. And some might even say that software architecture design, if done in a certain way in which you’re thinking about who the user of the software architecture is, is a kind of design as well. But I think what we are distancing ourselves from—if we wanted to say that—is design thought of as the realm of aesthetics itself. And so thinking of things as well-designed purely because they are minimalist, or because they’ve shaven off certain elements, or because they’ve removed pieces that other people would have expected to be there—this is design as a tradition going back to modernism in the mid-20th century.

And what the Human-Centered Design Movement tries to do is it tries to expand the lens of design to fit into the way that people solve problems and all ways. So, that’s—if that makes more sense. Or maybe that asks more questions than answers, but that’s what I think of when I think of design.

Brian: And are you proposing that these traditional approaches—whether it’s the d.school, or IDEO’s Design Thinking, whatever it is—are these still the foundations? Or no, no, no, the foundations are now different—is AI different enough that we need a fundamentally different way of approaching it?

Abhay: I think that there are definitely traditions and there’s a history to this line of thought. I certainly don’t think that one needs to hit the books on design thinking or listen to a design thinker describe their process in order to get the fundamentals of a human-centered design process. I personally think it’s something that one can describe to you within the span of a single conversation, and someone who is listening to that can then interpret that and say, “Okay well, what am I doing that could be more human-centered? How could I involve more of an understanding of my stakeholders, my users, and then how does that turn into how I build things?” I mean, in the AI space, I think this is the perennial question.

How do I train a model? How do I select data for a model in a way that is cognizant of things like its end users, its biases, the environment, the user interface in which this thing is being used? Right now there’s a distance between those things. And I think people in fields other than AI will see a distance as well, whether it’s data science as a whole—how do you design a data science tool in a way that meets the expectations, needs, desires, and aspirations of the people that would use it is a different concern right now than that of somebody who’s looking at regressions, who’s trained in a certain perspective in statistics. So, how do we bridge that gap is really where I would start. And looking back to traditions of places like IDEO and Design Thinking is, I would say, worthwhile for folks that want to see the lineage, but definitely not necessary in order to get a baseline 101 understanding.

Brian: Got it. So, if I’m—

Abhay: Yeah.

Brian: —talking to our audience, we definitely have leaders in analytics and data science here, and enterprise teams probably don’t have any designers. Maybe they have a BI developer who works on the dashboard and visualization part, and that’s the closest thing to a designer that they might have on the team.

Abhay: Yeah.

Brian: Where would they, like—if I was brought in as a listener, and I’m saying, “This sounds great. I agree with you. We need to be more human-centered. Yeah, we need more adoption of our products. Our models don’t get used. But where do I start? You figured out a way to start, so where should I start with my team? I got ten consultants, data scientists, I got project leads, I got some software engineers, machine learning engineers, where do I begin this, taking a step?”

Abhay: Yeah, totally. So, I think that you start by mapping out the problem and who is experiencing that problem, and you seek to understand that, you seek to document that, you seek to gain firsthand understanding. You know, you use the tools of ethnography that have been honed over a long time. Now, some teams don’t have that, and no matter where you live in an organization—whether you’re on the data analytics side or some other side—I think bringing on a team that is capable of ethnography is always worthwhile. If you’re struggling to understand the key issues and problems that your stakeholders are experiencing, ethnographers are trained to uncover that.

And whether that means they have to learn the jargon and terminology of those that they’re speaking with, whether that means that they have to themselves become smart about a particular subsection of a niche of a domain of an area within the work that this team is doing, that’s what ethnography is about.

Brian: Yeah.

Abhay: And so I think that is really where I start. I start with observation, and that’s where my book Lingua Franca starts. When we try to unpack how one goes about addressing these problems: if you don’t feel like you have an enormous amount of clarity, you start with observation.

Brian: Mm-hm. Do you have a concrete example of—I don’t know—like, a before-and-after story, specifically with AI? Can you give me some kind of scenario where what was requested—verbally, or in a Jira ticket, or whatever it was—differed from the actual need, and this technique of research and ethnography helped you uncover the latent, unarticulated need, which was actually the thing that everybody wanted and needed but was not stated? Do you have an example of that—

Abhay: Yeah, of course.

Brian: —in the AI context?

Abhay: Absolutely. And yeah, we do this all the time. This is what my company is all about, and we’ve turned it into our own sector, so to speak, horizontal capability across industries. A great example I like to give is with a company we worked with called YUR. And YUR was building an interesting technology, it was for virtual reality fitness.

So, what they did was they had software that you install on your Oculus headset, and then you wear a fitness tracker, like a Polar chest band, that gets the quote-unquote, “ground truth” of your heart rate. And then you can visualize that heart rate while you’re in a game, so that when you’re playing Beat Saber, or you’re dancing, or doing one of those other fun VR games, you can turn it into a workout. And so you visualize all that. And they came to us, and they’re saying, “Hey, well isn’t there a way that we can use AI to just predict your heart rate and predict your vital signs, so they don’t need the band anymore? You’re wearing this tracker on your head, so can we now use that data somehow to get your vital signs?”

So, they came to us with that problem, and they said, “Okay, well, what are you going to do? How are you going to build it? What algorithm are you going to use?” Blah, blah, blah, so on and so forth. And we said, “Hold on. What we want to do first is we want to talk to some of your users, we want to talk to some of the people that are using this technology right now or are beta-ing it.”

And not the technology as in the Polar one, not some prototype of the AI, but people using this experience, using YUR’s product. And then we want to understand: okay, what kind of meaning are they drawing from this data? What do they want out of it? What level of granularity? What kind of vital signs do they care about? And so we did that—

Brian: Actually, let me pause you real quick. I just want—

Abhay: Yeah.

Brian: —to ref—make sure I understood the framing before you go too far. So, the client was basically saying, “We want to get rid of the fitness tracker part, so they’re wearing the Oculus, but they’re not wearing anything on their chest; can we just predict it instead of actually tracking it? The goal is to get rid of the thing—can we get rid of it and replace it with software only?” Okay. Now that I’ve got that, please continue.

Abhay: Exactly. So, software-only solution that leverages artificial intelligence, predictive modeling—

Brian: Yeah.

Abhay: —right. And as I mentioned, they’re obviously going towards a solution first; they want to see the solution, they want to know the parameters of it, the details of it, how accurate is it going to get? How many BPM? Blah, blah, blah.

Brian: Yeah.

Abhay: And we mandated—sort of—that we’re going to talk to your users, and we’re going to understand what they care about. And it turned out when we did that, we gained a wealth of insight, of really rich insight, so rich that the company that we’re talking to, YUR, was like, “Oh, my God. How could we have not already done this before? We’re so dumb for not having reached out to them and asking all these questions.” So, I’ll give you a sample of what we found.

One is that accuracy in your heart rate turns out to not be that important to people because they don’t know how to process that information. What they care about are heart rate zones, or they care about these ranges; they care about the consistency of the experience; they care about notifications; about, “Keep it up,” so on and so forth; they care about calorie counts, and how that affects the rest of their day. And so there’s this rich wealth of what people care about, that is really significantly different from the data science goal as stated by YUR when they came to us. In addition, we found some really interesting ideas around self-image and how people view themselves when using these tools. And potentially, they care about something that is helpful to them and not necessarily accurate for the sake of being accurate.

There’s a reason why we are okay with an Apple Watch, which is an order of magnitude less accurate than a VO2 mask, for calculating some of your vital signs. So, there’s a certain level, a standard, and also a kind of design to that experience that was really not captured by YUR until we came in and helped them through this project. So, then what we did was we actually built to the goal that we had now found, the new one, the new goal. One thing that we found was that it matters less to accurately predict your heart rate, and matters more that you actually have a little bit of manipulation room in how you shepherd the algorithm. So, what they wanted was an automated system that, based on your movement, automatically captured your heart rate.

And what we found is that it was better if you’re able to encode a little bit of what you care about, what your goals are, into the system, and then let that shape the algorithm. And what I mean by that in this particular case is entering in your current heart rate maximum, or entering in your current body fat percentage. What people could do is use some of those parameters that they entered to actually affect the algorithm if it didn’t feel like it was matching what they wanted. So, sometimes they would set their maximum heart rate to be higher or lower based on where they felt like they were at in their fitness journey. And so it changed the way that the algorithm worked.

And then we also redesigned the algorithm so that it was much less biased towards different body types, and that created a natural sort of sacrifice in the precise accuracy of the algorithm. But we found that, thanks to our ethnography, we were okay with that because it was still meeting this human-centered goal of building this tool to what people wanted. So, what I think this example shows is that there’s a very distinct kind of technical outcome to much of this ethnography that is lost when you sprint ahead without thinking about the users of your products.
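What Abhay describes can be sketched loosely in code. This is a hypothetical illustration only—the function name, zone boundaries, and numbers below are assumptions following the common percent-of-max convention, not YUR’s actual implementation—but it shows how a user-entered maximum heart rate gives people that “manipulation room” over what the algorithm tells them:

```python
# Hypothetical sketch: a user-entered max heart rate shapes how the same
# predicted reading is interpreted. Zone names and boundaries here are
# illustrative assumptions, not YUR's real parameters or model.

def heart_rate_zone(predicted_bpm: float, user_max_hr: float) -> str:
    """Classify a predicted heart rate into a training zone,
    relative to the maximum the user themselves entered."""
    pct = predicted_bpm / user_max_hr
    if pct < 0.6:
        return "warm-up"
    elif pct < 0.7:
        return "fat burn"
    elif pct < 0.8:
        return "cardio"
    else:
        return "peak"

# The same predicted 140 BPM reads differently depending on the max
# the user entered -- the user steers the experience, not the model.
print(heart_rate_zone(140, user_max_hr=200))  # cardio
print(heart_rate_zone(140, user_max_hr=170))  # peak
```

Here the model’s raw prediction never changes; what changes is how it is framed for the user, which is the kind of design lever the ethnography surfaced.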

Brian: Yeah, yeah. I want to jump into bias here because I think it’s a big part of the design framing for things. So, at some point body types came up, and you can see how maybe—I’m just going to throw this out—at a fitness company, you have fitness-minded people making fitness products, and so maybe you forget to think about someone who is underweight, or someone who’s way overweight, or whoever. Who is the typical gamer for a system like this? Is there even a typical body type for that? But to even ask the question, or to know to think about body types as being relevant—how do we go about identifying the things we don’t know to ask about? Which I think is part of the issue with AI, especially if you’re doing commercial, non-internal work that touches a third party who’s not in the room. How do we know to ask that question? In this particular example you could probably jump to that fairly quickly, but there are scenarios where we never thought to ask about this. How do you go about thinking about the things that are not on the requirements document or whatever? You know—

Abhay: Absolutely. I think this is a huge problem for the whole industry, and I think in general, there are a lot of challenges. These were things that I was talking about and thinking about back when I left Microsoft Research in 2017, when they were not really at the forefront of the conversation as they are now. And it’s kind of become—it’s sort of validating, in a sense. Obviously, I don’t like that things have turned out this way, but the fact that these large biases exist in things like language models is just a key element, a key proof, that such a lack of engagement with the real community exists.

When these things are developed in isolation, when you can see that the makeup of the community that’s developing these things and the kinds of disciplines that are thinking about it are so skewed, that just becomes immediately apparent when you start to look at the problems that are emerging from systems like that. So to me, that has existed, actually, for quite a long time, and is not simply the fault of a large language model or a particular company developing that, but is really endemic to the entire space. And to me, that is one of the key challenges that we need to overcome by developing that shared language across disciplines, so that people like designers feel comfortable in a brainstorm session with a bunch of very technical data scientists and AI engineers. Right now, that is not the case.

Show me a company where designers feel at an equivalent level to AI engineers when brainstorming technology? It just doesn’t happen. There’s a future state that I want us to get to that I think is along those lines. And so, I personally see this as, kind of, a community-wide discussion, engagement, and multi-strategy approach.

Brian: You know, I always talk about these two—I see them as two, kind of, distinct audiences that I try to serve in my work. There are people in the software industry—whether it’s business software or commercial software, but there’s some kind of data product—and then we have our internal enterprise teams that primarily serve internal stakeholders. Are you saying that you think design has a role at both of those tables, or just really the one where we’re building external software for others? Or is that an irrelevant distinction?

Abhay: Design applies anytime that you care about the way that a product is used, and who uses it, and when they use it. So, I would say that it absolutely applies in both domains. And especially with an internal product, I mean, everyone has a stakeholder and someone that cares about what they build, and if we don’t think about those people, then we get the end state that a lot of our software is at right now—

Brian: Yeah. [laugh].

Abhay: —where it was sort of designed—or, sort of unintentionally designed—in the way that it was. Like, there was a design to it, obviously, the way that anything in the world has a design, but it wasn’t the intended one for its audience.

Brian: I call it byproduct design. [laugh].

Abhay: Yeah, exactly. Yeah so—

Brian: It’s a byproduct of all of our other choices we made, you know? [laugh].

Abhay: Exactly. Exactly.

Brian: Yeah. Talk to me a little bit about, you mentioned black boxes at the beginning of this. So, I feel like there’s a lot of change towards, we’ll take the hit on accuracy because we’re starting to see that interpretability is more important than accuracy, at least in some contexts; there’s always exceptions here.

Abhay: Yeah.

Brian: Are you in the camp where it’s like unless this is truly not a human-facing algorithm, we generally need to always be using interpretable models and we need to stop using black boxes as much as possible when there’s humans in the loop, making decisions, things like this? What’s your take on interpretability within models and things like that?

Abhay: Yeah, it’s a great question. So, I’m the iconoclast, in a way. I actually have a third path, and I’ve been trying to push for this third path for a while now, but I think it might take a little bit longer for folks in the community to come on board—or maybe I’m wrong. But there are two camps right now. There’s the camp that says, “Hey, this technology is a black box; there’s all these internal parameters; it’s deeply entangled. We should not be disentangling the system because that removes its ability to learn the highest-level constraint satisfaction and relationships—right, knowledge representation.” So, that’s one camp, the pro-black-box camp, in a way, that I think folks like Yoshua Bengio have espoused.

Then there’s the camp that is saying, “No, we need interpretability to become a forefront concern for all of these models, so that they are literally designed in a way that is intended to be interpretable, whatever interpretability means.” That means you can extract parameters from these things, that you get more sliders, you know, blah, blah, blah. There’s this interpretability camp.

And there are many, many people advocating for this kind of technical architecture of interpretability that allows a person to observe what the model has done, and then parameterize it, factorize it, look at feature gains, look at things like perturbation understandings, et cetera. I actually think that there’s a third camp, and the third camp is essentially saying that interpretability is not a feature of a black box; it’s not a feature of a system. Interpretability is actually its contextual engagement with its environment. And so the only way to really develop something that is actually interpretable is to design that experience in a way that enmeshes it within its environment. The inside—interior of a black box, or the interior of a model, is typically a relationship between data that has been fed to it, but the way that these models are then turned into experiences that live in the real world involves a lot of ingesting of environment and context that is outside of what the model would ever know.

So, the internal model relationships, in my mind, will never get us to the point where we can, quote-unquote, “interpret” what that model is doing, apart from its lived experience within the world at large. And so researchers like Timnit Gebru, I think, are on the same page of thinking of interpretability more as the lived experience, the revealed experience, so to speak, to take a behavioral economics point of view of what these systems do, rather than are.

Brian: Do you have a concrete example of that, to show me the contrast between “here’s an interpretable model” and “here’s an experience that reflects a larger lived experience,” not just the technical part of the models? Do you have an A/B you can give us?

Abhay: Yeah, absolutely. So, let’s take, for example, a music recommendation system. Music recommendation systems are these typical, really useful systems that are now pervasive; we all use them all the time. One of our clients was Spotify; we helped them work on the model for music recommendations. And this system essentially has many different regions. It’s really large and diverse.

There’s a region, a cluster of cumbia music, and there’s a region of freeform jazz, and so forth. And so a lot of what machine learning researchers try to do is they try to say, “Okay, what are, sort of like, small clusters? What are clusters that are distant? What are clusters that don’t have a good internal representation?” And so on.

And they’ll try to iterate on the model in that way, just by feature importances, by clustering, unsupervised methods, and so on. What their ethnographers found out was that there were entire regions of the world, places in countries like Brazil, whose needs were not being met by their music recommendations, because their music was being tagged and classified with labels that were inaccurate to the way that they thought about their music. So, they had this area of reggaeton, but there were many sub-reggaeton genres, and what was considered reggaeton for them was not actually what was considered reggaeton for people in Spain. And so then you get things like Brazilian people listening to, like, party music and it’s recommending them Christmas songs or whatever, because there are just differences.

And if you look inside of the model, you would say, “Oh no, reggaeton is great. It’s this comprehensive genre with tons of detail. There’s all this music in there. It’s related to these other things. We’ve got reggaeton meets jazz; we’ve got reggaeton meets pop; it’s all this diverse thing.”

But then you look at the ground level, and the way that these people are experiencing them is not the same. And so some people have tried to turn this into a classification of quote-unquote, “biases,” this label bias, and so on. But to me, I think these are all just different lenses on this core concern, which is that the lived experience of your technology, of your model, is always going to tell you more about what’s interpretable about it than just looking at the internal relationships.
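A toy sketch can make the label problem concrete. Everything below is invented for illustration—the track names, embedding vectors, and helper functions are assumptions, not Spotify’s system—but it shows how a single genre label can paper over tracks that the model itself represents very differently:

```python
# Toy sketch: one "reggaeton" label covering divergent regional tastes.
# Inside the model the embeddings clearly separate the tracks; it's the
# shared label that misleads. All vectors and names are invented.
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Learned track embeddings, all tagged "reggaeton" by a labeling pipeline.
tracks = {
    "pr_reggaeton_1": [0.9, 0.1, 0.1],   # Puerto Rican reggaeton
    "pr_reggaeton_2": [0.85, 0.15, 0.1],
    "dnb_bassline":   [0.1, 0.9, 0.2],   # a drum-and-bass "reggaeton" bass line
}

def recommend(seed, catalog):
    """Return the catalog track whose embedding is most similar to the seed."""
    return max((t for t in catalog if t != seed),
               key=lambda t: cosine(catalog[seed], catalog[t]))

# Embedding similarity keeps the Puerto Rican tracks together...
print(recommend("pr_reggaeton_1", tracks))  # pr_reggaeton_2
# ...but a recommender keyed on the shared label would treat all three as
# interchangeable, which is exactly the mismatch ethnography surfaced.
```

From inside the model, the similarity structure looks fine; it’s the assumption that the label means the same thing to every listener that breaks down at the ground level.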

Brian: Mm-hm. I’m actually a professional musician. I play a lot of world music, so—

Abhay: Yeah.

Brian: —I’m fascinated. And you’re totally right. I mean, I was just watching a video about drum and bass production, and they were talking about, “Or you can write your bass lines like this”—and they call it reggaeton. And it’s not really reggaeton at all, which was really born in Puerto Rico. And Brazil does the same thing with their versions of reggae.

It’s not the one-drop reggae we think of with Bob Marley and Jamaica and all of this. So already, we’ve got labeling issues—and they’re not even wrong; it’s just that that’s the way they interpret it there. It’s like, “What’s jazz?” I mean, good luck defining that; even the jazz—the National Associa—[laugh] doesn’t define it; they can’t. So, how do we build a system that needs these labels to say, “Play more jazz,” when we have that? So, how did you address that?

Abhay: Yeah. It’s a great question. We didn’t have a very long engagement with them, and so we weren’t able to really address it comprehensively. I spent basically enough time there to recognize the issue and basically move on. But we had an engagement with them in which we were helping them design better collaboration between their design team and their machine learning team.

So, developing shared tools, interfaces, ways of exploring the data where designers could work with data scientists. And while we didn’t have a very long engagement to see the fruits of that, or iterate and move past that, I think that’s the seed of the solution for any team, is to understand where those alliances can be formed. And maybe there are tools that actually enhance that, maybe there are tools that actually give a designer the visibility to look at what a machine learning algorithm is doing, and then to be able to craft some sort of hypothesis upon it.

Brian: What did you have to unlearn or change when you started to say, “I’m really interested in this design thing?” And are there things you have to fight, like, in your head? “I’m not going to do it that way. I’m going to go this other way. I’m going to do this long thing; [laugh] I’m going to take the long path even though I think I know the solution already.” Can you talk to me about what goes on in your head here? Or maybe that’s not the right question. I don’t know.

Abhay: So, when we engage with the other teams, I think this is probably the main challenge that we face is there’s a way that they do things, and that way that they do things is sort of battle-hardened. And—

Brian: [laugh].

Abhay: —not just in a way that they kind of have the scars of doing it other ways. They’ve selected the way that they do it because maybe the team is all focused on a certain way of thinking about things, and this is what has become of the team dynamics that makes it possible for them to collaborate at all. So, it’s a tricky situation. I totally recognize that with organizations, the way that they conduct themselves, the way that they work, is in large part molded by the history of the organization, and what’s gone on, and what’s worked in the past. But for us, one of the things that we try to break, if there is anything, is the sense of a design silo versus an engineering silo. And so what we often do is require in our work that we bring on at least someone from that design silo into the engineering space and help them build that context, so that even after we leave the engagement, having built the tool we came in to build, hopefully we’ve created some connectivity within the organization between those different teams.

But there are many, many challenges with that, and there are many challenges that we face when it comes to being a part of an organization where the functions, the lines of business, and so on are super different, where the design team is the one that hires the design firm and the AI team is the one that buys the IT software, and never the twain shall meet.

Brian: Mm-hm. Is that often how you guys come in—you come in through a design buyer, for example, but then you’re working with an AI team, like a data science team, and then you have to negotiate that whole working rela—

Abhay: Yeah, it’s typically the opposite actually.

Brian: Oh, it is through the data side? Okay.

Abhay: Yeah. So, it’s typically the opposite. And I think that the interesting element of that is that the design side of an organization has not been involved in those conversations, yet.

Brian: Yeah.

Abhay: So, they have no awareness that there’s a problem that even needs to be solved.

Brian: Yeah.

Abhay: So, it’s an interesting marketplace for having this new capability and service offering, where you really have to find a way to fit into teams’ existing models while also challenging them in certain ways. You play at the boundaries. You can’t give them a completely new world, even if that’s the world that you believe in, because then it won’t fit into their organization and they won’t be able to adopt it.

Brian: Yeah, yeah. Change is only—I mean, it’s the same thing I tell data people. It’s like, well, what are your customers ready to accept? Because they have to want this, probably, at some level, before they’re going to accept the change. And so, fully moving to an automated system may be a giant leap.

It might need to be something where they get to make the final decision, but the machine provides a bunch of advice about it. You need to understand what they’re ready to change right now, and that’s part of the solution, even if technically there’s a great way to do it, so that this change is not something that just happens overnight. And culture is [laugh] definitely a big part of this. And most of these teams do not have any type of design culture at all, and I’m with you. This is a message to the enterprise companies out here: you probably should at least think about having some roundtables. Say you’re a healthcare company, and you do have a product and a UX team, but they work on the apps and stuff, and then you have this data science team building other kinds of stuff. You might want to at least have a roundtable, maybe bring someone in on a project, or start to have a conversation. Most of the design teams I know are not asking any questions about this, and the data [laugh] teams are not asking for design help, but I think they could probably help each other. I think there are some great synergies that could potentially be there. But I see the same thing as well.

Abhay: Yeah, absolutely. Absolutely. It’s a challenge for the industry at large to undertake that integration. But what I’m also seeing—which is the optimistic side of this—is that the more we see AI applications become pervasive across the surface area of a product, the more we see that intelligence is actually the core differentiator between you and your biggest competitor, and the more these companies are realizing, “Hey, this is becoming a key capability of who we are; this is our identity.” And at some point, someone is going to ask, “Hey, why don’t we have someone that can really shape and design this experience?” And that, to me, is the seed of the opening for this new kind of thinking.

Brian: If you were advising an enterprise data science leader, or an SVP, or somebody like this, and he or she was like, “Okay, I’m sold. Yes, I can feel the pain. We don’t put out stuff that’s easy to use. People don’t want this stuff. We have a hard time getting anyone to use this.”

Is the track, is the thinking they should be on, like, “I need to train my staff on how to do this?” Or is it, “No, we need to go hire externally, whether employees, or contractors, or designers who already know how to do this?” Is there a more efficient approach to taking those first steps? Is it ‘train the people,’ or is it ‘hire experts?’ What’s your take on getting started?

Abhay: I think there are many approaches. Definitely, a cognizance and a thoughtful understanding that there is an issue to be addressed—and then defining it, putting a name on it, putting a roadmap item on it—those are all important in the large scheme of things when it comes to large organizations. That’s the strategy shift, and that is important. Now, when it comes to how that results in a lower-level shift, who’s going to be doing what, what new changes will occur, I think there are a couple of key things that folks should be thinking about as they make these decisions. And all the things that you mentioned, all of the above, could obviously be the right approach for different organizations. So, hiring a new team, hiring a new PM, shifting the organization, bringing on external consultancies like us: all those things can make a huge difference. The way that I would think about it is, one, there is a new kind of hybrid role emerging that we play into—pretty much all of our team members have this role—which is, sort of, an AI designer: someone who is very proficient with understanding the dynamics of AI systems. We’ve played around with a bunch of them across computer vision and natural language processing, we’ve worked with a bunch of big companies that are all doing this in production, and we know the general pitfalls, where things go wrong, and what kind of visibility teams need in order to productionize this kind of stuff. That is a new role.

And the same way that we have digital UX designers and app designers—there had to be apps before there could be app designers—there is now AI, and there can thus be AI designers. So, that is something I would encourage leaders to think about: this is an emerging field, an emerging discipline. Are you going to be there when there’s a large population of very talented folks who are now transforming industries and companies like Google? Where are they going to go next? What’s their next step? How are they going to help you?

How are you going to fit them into your organization? That’s a big thing to think about. The second thing I would think about along these lines is that you don’t need only a specialist AI designer, because the point of this is to create collaboration. A specialist AI designer can help you start to ask those questions first; they can help you get to the point where you’re now observing things that you had no observability on, where you didn’t realize there was going to be this whole sphere of issues that emerge when you’re building this model out. But the second thing is developing those collaborative instincts, and what that means is using some of these documents that provoke teams to do this—we’ve written one, called Lingua Franca, that you referenced earlier—and those are meant as provocations to push you, as an organization, to see that there’s connectivity there, and then to develop, design, and build those technologies along that new way.

Brian: I want to ask you about the Lingua Franca thing in a second. To start to close this out, I did want to jump into one thing, which was a little bit more tactical, I guess you could say—or maybe it’s not tactical—but let’s talk about incorrect predictions, edge cases, handling all the what designers would call the non-happy-path scenarios. And again, I want to get back to this. How do we know when our software that uses AI needs to handle these edge cases, because we don’t know what they are? It’s probabilistic in nature. We can’t predict all the, quote, “edge cases” that come up.

And maybe we don’t even know when something is an edge case. Like, oh, the heart rate is extremely high; you could just print the number, or you could say, “It’s way too high for you.” And I’m going to assume you’re going to say it’s going to come from research, but how do we know when we need to recover from a strange result? “How did you come up with this answer, model? Can I override it? Can I teach you something about the current setting that I’m in so that you don’t give me this bad thing again in the future?” How do we unpack and know those scenarios that we need to design for? How do we even find them? Talk to me about this whole, like—[laugh]—

Abhay: Yeah, totally. It’s a really important area, and I think really one of the most crucial spaces where design plays the largest and—in my opinion—most important role. Now, maybe there’s a requirement for a little bit of lineage here: how did we come to this point? Traditionally within data science, when you train a model, when you build a regression, you have something called training and test accuracy. So, there’s a training set and there’s a test set; you fit the model on the training set, and then you see how well it does on the test set, in order to see whether it’s quote-unquote, “good.”
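[As an editor’s aside: the train/test evaluation Abhay describes can be sketched in a few lines. This is a minimal toy in plain Python with invented data and a deliberately simple threshold “model,” not the tooling of any team mentioned in the episode.]

```python
# Minimal sketch of train/test evaluation: fit a "model" on a training set,
# then report accuracy on a held-out test set to judge whether it is "good."
import random

random.seed(0)

def make_row():
    x = random.random()          # 1-D feature in [0, 1)
    y = int(x > 0.5)             # true label
    if random.random() < 0.1:    # 10% label noise
        y = 1 - y
    return x, y

rows = [make_row() for _ in range(200)]
train, test = rows[:160], rows[160:]   # 80/20 split

def accuracy(threshold, dataset):
    """Fraction of rows where predicting (x > threshold) matches the label."""
    return sum(int(x > threshold) == y for x, y in dataset) / len(dataset)

# "Training": pick the threshold that maximizes accuracy on the training set.
best_t = max((t / 100 for t in range(101)), key=lambda t: accuracy(t, train))

print(f"train accuracy: {accuracy(best_t, train):.2f}")
print(f"test accuracy:  {accuracy(best_t, test):.2f}")
```

The point Abhay makes next is that this single pair of numbers is exactly what obscures the “pockets” of undefined behavior: nothing in this loop asks where the model fails or how badly.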

Typically, what these algorithms do, and what this methodology obscures, is the notion that there are these pockets in which there is very, very unusual or undefined behavior. And when you have something like a neural network, which has multiple orders of magnitude more parameters, you have multiple orders of magnitude more spaces and pockets for things to go haywire, for things to go completely sideways. So, that’s really what has happened now: we’ve created these systems at this level of complexity, and yet we’re still tasking them the same way we would task a financial projection, which is not going to go haywire because it’s built into the system that it has to be decently close. But when you’re talking about classifying between dogs, giraffes, cars, people, and so on, there are massive pockets in which it’s not going to—like, just because it classified a person wrong doesn’t mean it’s going to classify it as the next most likely thing that is near a person, like a child or a statue; it’s going to classify it as something completely different. And therefore you have these massive spaces where things can go wrong, and go wrong really badly. Now, the next question for us is: why do our organizations not think of that as a massive concern?

Why do we still look at the training and test accuracy and then build it in that way, as if you can deploy it very quickly and the errors are some, like, single-path error handling? And I think the reason is that we’re not really looking at these tools as a new kind of experience. We don’t realize that when you take this into the real world, there is a stakeholder there that might have to be the end user of that prediction, and might then have to be responsible for it—e.g., an autonomous vehicle.

And so that’s when you get to these situations where you’ve stacked the tech stack so high that you have autonomous vehicles that are crashing and you don’t know who to blame, because you didn’t build into your organization in the first place the fact that these models have these failure scenarios.

I have a couple of solutions, but obviously, the solutions are more tactical than comprehensive. For us, we believe very strongly in what we call guardrails. So, guardrails are just pre-programmed, known limitations that are transparent—extremely transparent—so that anyone can look at that system, can look at that block of code, can look at that documentation and say, “Okay, in this case, in this case, in this case, by definition, as far as we can tell, the system cannot do x.” One way you could do this is, if you have a conversational bot, you have a bunch of words that it can’t say. So no matter what, it can’t say the word ‘Nazi’, even if it’s like, “I despise Nazis.” Because you just don’t want it to go there.

So, those are examples of guardrails. And those happen across all industries, across all applications of the technology. And then finally—to close out this lengthy answer to your question—I would say that those are really where you want to go in and investigate what we call journeys, user journeys. Understanding those edge cases means keeping them top of mind. At any point, as you look at the user’s journey, you will see that there’s an opportunity for that journey to go haywire. And then you record that, you mention it, and you investigate it. And then you test and you regress against it.
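[Editor’s aside: the guardrail Abhay describes—a transparent, pre-programmed limitation that holds no matter what—can be sketched as a block list applied to a bot’s output. The blocked term and fallback text below are hypothetical illustrations, not Polytopal’s actual code.]

```python
# Minimal sketch of a guardrail: a transparent block list applied to every
# response a conversational bot produces, regardless of context or intent.
BLOCKED_TERMS = {"nazi"}  # by definition, the bot cannot emit these terms
FALLBACK = "[response withheld by guardrail]"

def apply_guardrail(response: str) -> str:
    """Withhold any response containing a blocked term, even in benign context."""
    lowered = response.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return FALLBACK
    return response
```

Note that substring matching deliberately over-blocks: “I despise Nazis” is withheld too, which mirrors Abhay’s point that the rule is absolute and inspectable rather than clever.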

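[Editor’s aside: the “test and regress against it” step can be as simple as pinning each edge case discovered in journey research into an automated regression suite. The heart-rate classifier below is an invented stand-in, echoing Brian’s earlier heart-rate example; all names here are hypothetical.]

```python
# Hypothetical sketch: each edge case found in user-journey research becomes a
# pinned regression case the system must pass before every deployment.
def classify_heart_rate(bpm: int) -> str:
    """Toy stand-in for a model output: label a heart-rate reading."""
    if bpm > 180:
        return "implausibly high; check sensor"
    if bpm > 120:
        return "elevated"
    return "normal"

# Cases recorded from journey research (inputs once observed to go haywire).
EDGE_CASES = [
    (250, "implausibly high; check sensor"),  # a loose sensor once reported 250 bpm
    (130, "elevated"),
    (60, "normal"),
]

def run_regression():
    """Return the list of (input, expected, got) triples that fail."""
    return [(bpm, want, classify_heart_rate(bpm))
            for bpm, want in EDGE_CASES
            if classify_heart_rate(bpm) != want]
```

An empty list from `run_regression()` means every recorded journey failure is still handled; any new haywire moment found in research gets appended to `EDGE_CASES` and stays there.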
Brian: I’m totally with you there. And I think it always comes back to spending time with people, spending time with a diverse set of people, especially in these kinds of scenarios. If you’re surrounded by people who look, think, and act just like you do, you’re probably not going to come up with [laugh]—

Abhay: Yeah.

Brian: —the things that you don’t know to ask about. So, I think it gets back to that research and customer and user exposure time, and just soaking in what you don’t know to ask about. To me, maybe one of the most important parts of qualitative research is not the script, but the findings about the things you didn’t plan to ask about. That’s where so much of the insight comes in, because it’s not on the script; you don’t know to ask—that’s how you find out how much you don’t know [laugh] about the situation. But this is great. So, tell me—you do have a lengthy document and position on designing for AI on your website. It’s called Lingua Franca. Did I hear ‘book?’ Is this coming out in a physical book at some point?

Abhay: It’s being expanded into a full-length book treatment, yeah. So the—

Brian: Oh excellent.

Abhay: —the version that’s on the web is, I would say, sort of a novella, so to speak. It’s a small treatment with some moderately sized passages for each of the different concepts. But what releasing that, about a year and three or four months ago now, really inspired in me was the sense that there’s this huge space and domain, this emerging field, and I wanted to put my own hat into that ring and try to shape that field as much as I can.

Brian: Mm-hm. Where can people find that, and you in general?

Abhay: Yeah, so the easiest way to get a hold of everything about what I’ve just described is Polytopal. So, P-O-L-Y-T-O-P-A-L—dot—ai. And polytopal.ai is our main landing page; it’s fairly thin, we’re obviously developing it.

Our firm is about a year old, so we still have a long way to go in terms of evolving our own presence, but there’s a sidebar there that has a link to Lingua Franca if you want to take a read. Or just search ‘Lingua Franca AI,’ and you’ll probably find it on Google, and so forth. And that’s the way to find the book and a way to get a hold of the company. You can email us at hello@polytopal.ai. And then my name, again, is Abhay Agarwal. You can look for me on LinkedIn, and so forth. But yeah, anywhere that you can find Polytopal, you’ll probably be talking to me or one of my colleagues.

Brian: Any Twitter, other handles I should pop in the [show notes 00:43:06]?

Abhay: Yeah. So, my personal Twitter is @Denizen_Kane—K-A-N-E. And then the Polytopal one is @polytopal_ai.

Brian: Got it. Awesome. I will link these up in the [show notes 00:43:19]. Abhay, it’s been really great to talk to you. Let me know when the book comes out. I’ll try to get that out to the mailing list. And any closing advice, last words for our audience that you’d like to share?

Abhay: Yeah, I mean, if folks are listening to this podcast, I think that’s already a big first step, to be honest. Finding out about these domains of things that you don’t know, and trying to embed new cultures and ways of doing things within the organization, are super, super important. And those will always be the key catalysts for these organizational transformations.

Brian: Well, thank you so much for [laugh] plugging the show like that and believing what I’m doing, and I’m fully behind what you’re doing as well, and I can’t wait to see where things go and when the book comes out. So, thank you again for coming on, and let’s stay in touch.

Abhay: Awesome. Take care. Thanks so much, Brian.

Brian: Yeah. Take care.

