114 – Designing Anti-Biasing and Explainability Tools for Data Scientists Creating ML Models with Josh Noble

Experiencing Data with Brian O'Neill (Designing for Analytics)

Today I’m chatting with Josh Noble, Principal User Researcher at TruEra. TruEra is working to improve AI quality by developing products that help data scientists and machine learning engineers improve their AI/ML models by combatting things like bias and improving explainability. Throughout our conversation, Josh—who also used to work as a Design Lead at IDEO.org—explains the unique challenges and importance of doing design and user research, even for technical users such as data scientists. He also shares tangible insights on what informs his product design strategy, the importance of measuring product success accurately, and the importance of understanding the current state of a solution when trying to improve it.

Highlights/ Skip to:

  • Josh introduces himself and explains why it’s important to do design and user research work for technical tools used by data scientists (00:43)
  • The work that TruEra does to mitigate bias in AI as well as their broader focus on AI quality management (05:10)
  • Josh describes how user roles informed TruEra’s design of their upcoming monitoring product, and the emphasis he places on iterating with users (10:24)
  • How Josh approaches striking a balance between displaying extraneous information in the tools he designs vs. removing explainability (14:28)
  • Josh explains how TruEra measures product success now and how they envision that changing in the future (17:59)
  • The difference Josh sees between explainability and interpretability (26:56)
  • How Josh decided to go from being a designer to getting a data science degree (31:08)
  • Josh gives his take on what skills are most valuable as a designer and how to develop them (36:12)

Quotes from Today’s Episode

  • “We want to make machine learning better by testing it, helping people analyze it, helping people monitor models. Bias and fairness is an important part of that, as is accuracy, as is explainability, and as is more broadly AI quality.” – Josh Noble (05:13)

  • “These two groups, the data scientists and the machine-learning engineer, they think quite differently about the problems that they need to solve. And they have very different toolsets. … Looking at how we can think about making a product and building tools that make sense to both of those different groups is a really important part of user experience.” – Josh Noble (09:04)

  • “I’m a big advocate for iterating with users. To the degree possible, get things in front of people so they can tell you whether it works for them or not, whether it fits their expectations or not.” – Josh Noble (12:15)

  • “Our goal is to get people to think about AI quality differently, not to necessarily change. We don’t want to change their performance metrics. We don’t want to make them change how they calculate something or change a workflow that works for them. We just want to get them to a place where they can bring together our four pillars and build better models and build better AI.” – Josh Noble (17:38)

  • “I’ve always wanted to know what was going on underneath the design. I think it’s an important part of designing anything to understand how the thing that you are making is actually built.” – Josh Noble (31:56)

  • “There’s an empathy-building exercise that comes from using these tools and understanding where they come from. I do understand the argument that some designers make. If you want to find a better way to do something, spending a ton of time in the trenches of the current way that it’s done is not always the solution, right?” – Josh Noble (36:12)

  • “There’s a real empathy that you build and understanding that you build from seeing how your designs are actually implemented that makes you a better teammate. It makes you a better collaborator and ultimately, I think, makes you a better designer because of that.” – Josh Noble (36:46)

  • “I would say to the non-designers who work with designers, measuring designs is not invalidating the designer. It doesn’t invalidate the craft of design. It shouldn’t be something that designers are hesitant to do. I think it’s really important to understand in a qualitative way what your design is doing and understand in a quantitative way what your design is doing.” – Josh Noble (38:18)

Transcript

Brian: Welcome back to Experiencing Data. This is Brian T. O’Neill. Today I’ve got Josh Noble on the line from TruEra. You have an interesting background in design and research, and you’re particularly working in the tools for data scientists. So I thought this is sort of almost like a meta-episode, but I love that you’re bringing that design lens and the research lens into this space that we talk about on this show. So welcome.

Josh: Thank you so much for having me on. I know we talked about this for a little while, and I’m glad we’re making it happen.

Brian: Yeah. Yeah. The first thing I kind of wanted to jump into was when we design for technical users—and I’ve seen this in a couple different domains—there’s this, “I don’t need it simple. I don’t want it dumbed down for me.” I would imagine maybe some listeners are kind of finding it, not funny, but surprising that a company would invest in design and research for tools for technical people who are supposed to know how to use technical stuff. Why does it need to be easy or useful or usable? Like, they’ll figure it out. Why do we need a Joshua at TruEra?

Josh: That’s a fantastic question. And one of the things that we find with our primary customer—our primary user, which is a data scientist, is that they have a workflow that they have made work, and when you come to them and say, “We have a much better way for you to do your job. We have a much better way for you to approach some of the problems that you’re trying to solve,” people are sometimes reticent to change what they’re doing (a) because they want to think that they know best, and technical users pride themselves, rightfully so, on their expertise, but also because learning new things, changing the way you do it, is a little bit intimidating. If you have a pattern or set of tools or workflows that you know work for you, when you try something new, it might not work. Maybe your boss is going to look very, you know, is going to say, “That was a waste of time,” or, “Gosh, this didn’t have the results we were hoping for.” And so it can be a little bit challenging to get extremely technical users who have deep expertise to think about new ways of doing things.

The way that we approach that is by looking at the concerns that they have and the problems that they’re trying to solve. And those are of two sorts. There is the direct problem, so in our case, data scientists developing a machine learning model and saying, “Gosh, the accuracy just isn’t as good as I hoped it would be,” or, “This isn’t going to work for the case that we want to deploy it into.” And the other is the meta-problem of having a workflow and a regularized way of having many people working on many models and deploying them, perhaps in ensemble or perhaps in sequence, in some service.

And helping them think about the way to work effectively and to understand, not only what their models are doing, but how those models in tandem with other models or in tandem with human decisions lead to the business outcome that they’re hoping to work towards. And that can be making sure that the right people get loans or making sure that truck drivers don’t crash or making sure that a business runs smoothly.

So we come to them initially, usually from the angle of, “Here is the day-to-day problem you’re trying to solve: model accuracy, model performance, speed.” In regulated industries, there is a lot of regulation and potential liability that they need to account for in their model development process. And we come to them there and say, “These are the things you need to think about. We can help you think about not only these specific problems but how that fits into the overall bigger problem of being a data scientist, providing machine decision support inside a business or organization, and needing to do that with many different stakeholders, potentially many different data scientists, potentially many different regulatory regimes.” And I find that a much better way to approach people than “We’ve got a new way for you” is to say, “Here’s the simple thing that you thought about this morning. But here’s how that levels up into the bigger thing that’s going to shape your job and your organization more broadly.”

Brian: A little bit of context. My understanding is TruEra, as you said it, you’re working to help mitigate bias in machine learning. So is this to debias models, and you’re trying to surface the information that’s relevant so that they can understand where bias is and to try to deal with it? Is that effectively what you guys do?

Josh: We have a broad focus on what we call AI quality management. So we want to make machine learning better by testing it, helping people analyze it, helping people monitor models. Bias and fairness is an important part of that, as is accuracy, as is explainability, and as is more broadly AI quality. And so we’re not looking exclusively at bias and fairness. But those are, we believe, things that come out from an approach of being honest about what goes into AI quality. What makes a machine-learning model both trustworthy and accurate?

Brian: You have a design and research background. You’re currently in a research—a UX research role; is that correct?

Josh: Mm-hmm. Yep. I am the user researcher. And so I spend a lot of time talking to data scientists. And I worked as an interaction designer for about 12 years. But I also went and got a master’s in data science a few years back. And so I’m in the somewhat unusual position of being a designer who does understand the pain of, like, working with multiple Jupyter notebooks and trying to understand why your job is running really slow in training a model.

Brian: Your street cred just went up with like half of my listenership, I’m sure. Like, it’s better than my street cred. But I want to help listeners who perhaps don’t have user experience researchers or those kinds of skillsets on their team understand: what are some tangible, actionable findings that you’ve gotten out of talking to data scientists that went back into the product, in terms of a feature or a change to an experience or something? Because you’re really there to help inform the design of the experience and the interfaces, correct? What are some things that changed as a result of this, either through usability testing, or because you got in front of it, so you didn’t have to do the wrong one first and got to do the right one out of the gate?

Josh: I think one thing we look for in user research is trying to understand who different users are. And a lot of times, people refer to this as personas. I think that sometimes becomes a little bit more marketing related than I necessarily think of it as. So it’s not that we’re looking for women between the ages of 35 and 50 who live in midwestern states. That’s not what we’re interested in. We’re interested in the team dynamics and the verticals that people are in, their level of experience with different types of machine learning. Typically it’s more helpful to think of those as roles.

And so in coming into TruEra there are, of course, many people within the company who have decades of cumulative experience that have understandings of what different kinds of data scientists are. But we hadn’t systematically looked at how that affected the product decisions that we were making. So what a, say, data scientist—classically someone who is going to think about collecting training data, think about the algorithm and model architecture that they’re going to use, think about how they might do feature engineering within that model, and then train the model, and then that’s sort of the end of their job—that role is going to think about things very differently than the newer role that has emerged, which tends to get the title of machine-learning engineer, who maybe does some of the training, maybe a little bit of the feature engineering but winds up being responsible for deploying that model out into production and trying to understand when that model is experiencing what we call drift. So the data coming in no longer matches the training data quite as well as you would want, or the model performance is being degraded for other reasons, you run into data pipeline problems, et cetera.
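The drift Josh describes, where incoming data no longer matches the training data, can be made concrete with a small sketch. The example below computes a Population Stability Index (PSI) for a single numeric feature. PSI is one common drift statistic chosen here purely for illustration; the transcript does not say which measures TruEra uses, and the function name `psi`, the bin count, and the synthetic data are all assumptions.

```python
import math
import random


def psi(train, live, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(train), max(train)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the feature is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the training range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(train), proportions(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Synthetic data (an assumption for the demo): live traffic either matches
# the training distribution or has its mean shifted by one standard deviation.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = psi(train, same)        # close to zero: no meaningful drift
psi_shifted = psi(train, shifted)  # much larger: the input distribution moved
```

A frequently quoted rule of thumb treats a PSI under roughly 0.1 as stable and over roughly 0.25 as a significant shift, though the right thresholds depend on the feature and the use case.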

These two groups, the data scientists and the machine-learning engineer, they think quite differently about the problems that they need to solve. And they have very different toolsets. They may even have very different backgrounds. A machine-learning engineer is far more likely to have a computer science background. A data scientist is more likely to have a statistical or mathematical background. And that means that they’re going to have different approaches to doing their job.

Looking at how we can think about making a product and building tools that make sense to both of those different groups is a really important part of user experience. And we call that personas, but really, it’s more about the role that someone has in their company and even the type of—not just company, but organization—the type of organization that they’re inside of.

Brian: You surface that there are these multiple roles or personas, whatever language you want to use. You surface that these exist. And then there’s the “so what” from a product management standpoint. Who are we going to care about? Now you have to decide who you’re going to care about. Do we care about all of them? Are we going to make it easy, especially great, for machine-learning engineers and the data scientists will just have to deal with it? Or is it really for data scientists and the ML engineers will kind of just have to go along with it? Or it’s one product that makes tradeoffs? How does your work become actionable in terms of the design: interaction design, user interface, et cetera?

Josh: Well in this case, in specific, we have a monitoring product that we are getting ready to launch. And the machine-learning engineer is really our target role for the monitoring product. The diagnostic product that we already have out in the marketplace is a little bit more oriented towards the data scientist. The thing is that most teams will have both of these roles on them, in some capacity or another. And we realize that we’re not—we don’t want to fork our product. We just want to be able to support both of these roles, and we want to as much as possible encourage them and help them collaborate with one another in showing what might be going on in a production model that’s out there making real predictions and returning real data into the world, the machine-learning engineer, give them all the information they need to go to the data scientist and say, “I think we need to relook at this feature,” or, “I think we need to retrain,” and then also support the data scientist knowing what it is they need to do to their training data or their model to make it perform better so they can get it out into production. So it wasn’t—in our case, it was realizing that we needed a monitoring product and realizing that the key differentiator that we are going to have was enabling that addition of the monitoring product to allow two different roles to communicate with each other more effectively.

Brian: I don’t know if this is just like an alerting tool or if there’s some kind of dashboard where it’s tracking metrics, where you can kind of see what it’s doing in real time and maybe it spits out information when you go off the rails, or, you know, the model’s drifting and you get some kind of alert or whatever. How did you go about designing that, deciding what that was going to be? And did you land in the place where you thought you were going to land? Or through the process of doing research, the UI has drifted, so to speak, into what its current thing is? Tell me about the practice of taking your insights and they become outputs, interfaces, experiences, et cetera.

Josh: I’m a big advocate for iterating with users. To the degree possible, get things in front of people so they can tell you whether it works for them or not, whether it fits their expectations or not. We talked to different types of machine-learning engineers in different verticals, some of them extremely small companies that are more focused on tech products. They’re much more willing to use new services, to change things up quickly, to store their data elsewhere. We also talked to a lot of machine-learning engineers who were in much larger companies that are slower to move, need to be much more careful with their data, less likely to take on new tools.

From that, we came up with a few different approaches that we want to support concurrently. We have an SDK. So if you want to write Python, you can write Python. You don’t need to interact with our dashboards. But you can also flip a switch and see a dashboard that will show you real-time information about how your model is doing over time and give you that kind of classic dashboarding view. And we realized from our conversations that a machine-learning engineer needs to do both of those things. They need to be able to look at raw data so that they can work in an environment that they’re comfortable with. And they need to be able to see graphics, not only for quick glanceability but also because a big part of their job is communicating with other business stakeholders, people who don’t want to look at a giant table, people who don’t want to look at code, but people who are interested in model performance, and in particular, how that model performance ties to some KPIs.

From those conversations, we realized that we needed to do two things. We needed to support two different uses for our monitoring tool. And it is a dashboard and alerting and tie-ins to our diagnostic product so that you can go from getting an alert that tells you something is off with your model to looking at a dashboard or looking in the SDK and seeing what that is to then going into root cause analysis and trying to understand exactly what might be causing that and what you need to mitigate it.

Brian: There’s this idea that more information is always better, so I want to talk about designing for technical users. In my experience in this space, the hard part is often not convincing the users of it; it’s convincing the internal team, when I’m coming in as an outsider, that maybe there’s too much stuff here, including all of the telemetry that’s in there. Subtraction is also a design tool, right? It’s not just about adding stuff. But sometimes we need to take things away to make the experience overall better.

And it’s really hard sometimes to let go of anything that’s made its way into the product, in terms of data, to take it away because someone can come up with some obscure reason why you might want that metric, even though no one can actually name a single use case right now. There’s just this fear that someone out there might be using it. And I’m curious, have you found this also to be a challenge where removal is a difficult design tactic to use or not necessarily? Have you guys taken things out of the product that aren’t working or pieces of data that maybe in isolation are not particularly useful, it’s just kind of false signal?

Josh: We haven’t been removing much functionality, but what we’ve been doing is trying to look at the ways that not only people think about AI quality but also the way that they actually work and say, “We should group this bit of functionality in this one workflow. We should group this other bit of functionality in this other workflow.” And we don’t want to completely remove access from them. But we also want to, as I think you’re saying, increase the signal, so when someone opens the explainability part of our tool, we want to show them as much about explainability as possible without overwhelming them with a ton of extra information that maybe is more about fairness or more about performance. And so it’s not so much removing that stuff as it is just trying to streamline based on what we hear from our users and based on what we understand about the data science and the model lifecycle. We’re trying to ensure that we don’t show extraneous information because it is really easy to overload even technical users who I have found do not want to admit that they feel overwhelmed by a UI. You know, they don’t want to admit that there is too much detail in something.

Brian: I should be able to use this.

Josh: And so I don’t think it’s as much a matter of removing it as it is putting it in the right place. And that’s also why we support both an SDK and a UI so you have a way to walk through charts and graphs and buttons and checklists. And we also have a Python SDK where you can go and put together whatever workflow you want and still get all the same functionality and all the same tools that we provide in our UI but minus a lot of the stuff around it. And we have found that people tend to bounce back and forth between the two. People (data scientists, machine-learning engineers, data science managers) will all do a little bit of code, look at some graphs, walk through one of our workflows in the UI, maybe go back and do a little bit of code, and we want to support that as much as possible because our goal is to get people to think about AI quality differently, not to necessarily change. We don’t want to change their performance metrics. We don’t want to make them change how they calculate something or change a workflow that works for them. We just want to get them to a place where they can bring together our four pillars and build better models and build better AI.

Brian: Do you have particular user experience outcomes that you design for, like some kind of metrics that you track on the user experience front within the product?

Josh: We don’t have that yet. We are just opening out our SaaS offering. So we’re going from being a strictly on-premise, behind-the-firewall tool to being something that’s open to the broader public. And just to throw in a pitch here, if that’s something that people are interested in, please get in touch. We’re going to have an open SaaS beta very soon. We also run AI quality courses where the two founders of the company, Shayak Sen and Anupam Datta, along with some of the machine-learning engineering team, just the machine learning team, walk through our four pillars, how to use our tool and a lot of the underlying math and algorithms of explainability, fairness, and drift analysis. So just to throw that out there as a pitch, those are all open to the public.

But we’re just moving to a place where we’re going to have lots of different folks coming into our product and doing different kinds of things with our product. And so we’re just getting to a place where we’re about to have a bunch of telemetry. And for me, that is a very exciting thing to be planning for, where I can marry my love of qualitative research, where I go and get to talk to people, with some real quantitative information, some real quantitative data about what people are doing with our tools, what seem like dead ends, where people are finding success.

Brian: How do you know whether or not the design and the experience they’re having now is good? What’s the measurement of good or successful, or we added this feature, or we added the monitoring products?

Josh: Because of the nature of a lot of our installs at the moment, they’re behind firewalls. We have no access to people’s data. And that—these are regulated industries primarily, and so they can’t allow us any access to their data. And so we’re needing to get that from interviews with active users. We, internally, tend to think about success as how many models have people brought in? How many different users are looking at a single model? How many reports are being generated from our report generation tool? When we talk to our users, what are they telling us that they find helpful in it? What are they using it for? Can they show us this? And typically under NDA, we can get to see some of the things that they’re doing with the models. Right now, it is still a little bit of a bespoke process to understand what’s happening. But we’re going to move to a different and slightly more systematic way of understanding what people are doing with the tool once we get wider adoption for the SaaS tool. So we’re about to roll out this thing where anyone can come ingest some models and learn about feature importance in their model or look at fairness analysis in their model. And from that, I think we’re going to get a lot more information about what people do. And from that, we’ll create our own metrics. But it’s still a little bit of an ongoing discussion. What should we measure? How do we know whether our users are being successful or not? And how do we know whether we’re landing in the place that we hoped to land?

Brian: I think it’s important to know that, right. Otherwise, it’s hard—it’s kind of subjective. “I guess we’re doing okay.” It’s like one person hears one thing. Another person sees something else. And you’re like, “Well, he said this, but they did this.” “They struggled.” “Yeah. But it looked like—they said they loved it.” You know?

Josh: I think it’s something that we know we need to do. And that is a really crucial inflection point for a lot of startups where you have users and they tell you what they’re doing, and you have a way of understanding, you can aggregate what you hear, but again, it’s really difficult to find the signal in there. And I think for a lot of companies, as they get ready to open themselves up a little bit more or reach out to a broader audience, or, you know, go out of some kind of closed beta, there is that moment where you have to wrestle with what you’re going to measure and what success looks like.

There are, for us, a lot of business considerations in there, of course. We’re going to have a free tier. And so will people go to the paid tier? That’s a great one. Will people ingest a lot of models? That’s a great one. Will people retrain models and put new versions of them up? Because that indicates to us that they have seen things from maybe our fairness analysis tools or explainability analysis tools that have led them to see problems in their models. And so they want to update it and then compare those two models to one another. That should look like success. And I think around the company, we would all agree that that is the outcome that we want. We want people to think more deeply about the models that they’re putting out into the world.

But it’s also interesting being a company with a lot of people who have PhDs in machine learning, and you say, “What are our success metrics?” And suddenly, it becomes a big discussion because maybe it’s possible to have too much statistical knowledge and too many strong opinions on what metrics should be.

Brian: I think there’s probably a fair number of people here who are working in large enterprises that largely have their own infrastructure setup, their own ML office, whatever you want to call it, to build models and all this kind of stuff. So it’s, it’s being custom grown. And I’m curious if you’re on the side of building the platform for the people who are building the models, are there any particular questions or things you’ve changed about how you do your research specifically when you’re talking to machine-learning engineers and data scientists? It could be types of questions. It could be how you speak to them. I don’t know what it is, but I’m just curious if there’s anything particularly unique, or it’s really not domain specific. It’s not—it’s not special relative to any other way that you would do research with users. Is there anything you can share there for—especially for people who perhaps aren’t user researchers in their main job, but they also don’t want to go stand up a giant piece of infrastructure only to find out the data scientists can’t or won’t use it. We just spent all this money building XYZ. Well, I can’t use this, like—any thoughts on that or changes that maybe you’re like, “I can’t lift what I did eight years ago and apply it here anymore. It doesn’t work.”

Josh: Sure. I think one thing that’s really important with technical tools is feature validation. One thing that happens when you have a whole bunch of really, really smart people building tools for other very smart people like themselves is [edge functionality 00:24:28] or new, cool, shiny things get a lot more attention and get foregrounded in ways that are totally understandable but maybe aren’t necessarily helpful for driving the direction of the product. When we think about new functionality we want to put into our diagnostics tool, it needs to live nicely alongside all the other things that are already there.

One strategy that we’ve been using over the past year is to take all the new functionality that we want to put in and—especially if we can find our users who are already fairly deep into the tool—to go to them and say, “Do you think this is helpful? Does this add or subtract?” And people often won’t tell you that, “Oh, this detracts from my experience,” or, “I find this, you know, really distracting or confusing.” But they will give you a sense of whether it fits into what they really need and the way that they’ve learned to use your tool.

And that kind of feature validation, I think, is a little bit different than the usability research that I tended to do when I was working in design consultancies because we would either do some sort of foundational research that would give us an understanding of who our users were or what the market was. We would maybe do a little bit of product-market fit work, and then we would design some stuff, usually with users directly, and pass it off to our clients who then would do the hard-core usability research themselves. I’m finding now that there is this intermediate step where you want to look at new features in the context of all the things that you already have and make sure that you’re adding the right thing and that it’s fitting into the workflow in the right way. So that kind of feature validation, I think, is extremely important.

Brian: And I just want to jump in here for the listeners. We’re not talking about machine learning model features, right? We’re talking about functionality within the application, the tooling that you’re making, correct?

Josh: Correct.

Brian: Okay.

Josh: Right.

Brian: Cool. I just wanted to—

Josh: Right. Right. Yeah. When we add—when we add a new chart, we kind of think of this as like, “Let’s validate this feature.” It is a good point that a lot of times our charts are about the importance and influence that specific features in a machine learning model have. And one thing I have found as well is that navigating the language of data science and product development, there’s a lot of overlap in somewhat unhelpful ways.

Brian: Right.

Josh: Yeah.

Brian: Such as feature.

Josh: Such as feature. Right.

Brian: Let’s talk a little about features and explainability and interpretability. And my first question is do you see a difference here between—I used to think this was all the same thing. But my perspective on it has changed between explainable AI, like explaining what the model did, which is not necessarily about interpreting the results as a user trying to understand like the score was 87 and the biggest feature, the biggest reason why it got an 87 was because the ZIP code is within two miles of whatever. Do you think about both sides of these, or is this just one thing to you? Talk to me about the importance of explainability and what you’ve learned here as well.

Josh: There’s a more academic distinction between interpretability and explainability. And I guess we can talk about the more academic distinction between these terms and then the way they have been socialized. But to talk a little bit about the academic distinction between them, like classically, interpretability is having some mathematically verifiable value for and relationship between each parameter in a model that lets you directly derive an output from an input using those relationships. So if you think of a decision tree, for instance, it’s a very simple kind of model where you can look at each junction in that tree and understand exactly how the parameter there has led to the next item further down the tree, and the next branch and the next branch until you arrive at the output. And so you can directly derive the output from the input by looking at the values of each parameter.

Explainability is inferring the relationship between the input and the output and parameters within the model algorithmically and empirically. So you don’t necessarily have insight into exactly what each parameter within the model is doing in a rigorous enough way that you can directly derive the output from the input. But you can infer it. And that’s useful. But I think it’s not the way that people tend to talk about interpretability and explainability. And I think that’s because AI explainability is relevant to a way wider group of people than interpretability.

And so there’s kind of this socialized definition for both of those terms where like interpretability is relevant really for certain kinds of models to pretty technical users, who really want to know what a specific parameter is doing. Explainability is relevant for a lot of different kinds of models to almost anyone interacting with them. And that’s something we find with our users who make heavy use of the explainability functionality that we have. I was about to say features—but the explainability functionality that we have where they need to tell a stakeholder why the model decided something. And they need to defend that decision, or they need to provide information to a stakeholder that validates and justifies the model as a part of some business process. So it’s about building trust and understanding in a model’s behavior. And so because of that, I think explainability tends to touch on a lot more kinds of people and a lot more situations. And it gets a little bit more muddy as a term. But really, it’s looking at model features and trying to understand what each feature contributes to an output.

They’re both incredibly important areas of research. They’re both really fascinating to me. But I think the broader societal impact of explainability is going to be much, much larger. Interpretability really works for certain kinds of models. And it’s usually either quite restrictive or computationally intensive to really interpret what every parameter is doing. Explainability can be everything from me telling you with some justification why you didn’t get a bank loan or why you think a warehouse worker might quit their job or why the self-driving features of your car took over for a split second. And those are things that touch on a lot more kinds of people and a lot more kinds of situations.
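
To make the distinction Josh draws a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration and not something from the conversation or from TruEra's product: the loan-style feature names, the thresholds in `TREE`, the weights inside `opaque_score`, and the helper names `trace` and `ablation_attribution` are all hypothetical. The first half reads the output of a small decision tree directly off its junctions (interpretability). The second half treats a model as a black box and infers each feature's contribution empirically by swapping in values from other rows (one simple flavor of explainability, not the only one).

```python
from statistics import mean

# Interpretable model: a tiny hand-written decision tree.
# The features and thresholds are made up for illustration.
TREE = {
    "feature": "income", "threshold": 50_000,
    "below": {"leaf": "deny"},
    "at_or_above": {
        "feature": "debt_ratio", "threshold": 0.4,
        "below": {"leaf": "approve"},
        "at_or_above": {"leaf": "deny"},
    },
}

def trace(tree, row):
    """Walk the tree and return (output, list of junction decisions taken)."""
    path = []
    node = tree
    while "leaf" not in node:
        name, cut = node["feature"], node["threshold"]
        branch = "below" if row[name] < cut else "at_or_above"
        path.append(f"{name}={row[name]} is {branch} {cut}")
        node = node[branch]
    return node["leaf"], path

# Opaque model: we pretend we can only call it, not read its parameters.
# The weights are hypothetical, and it ignores zip_digit entirely.
def opaque_score(row):
    return 0.00001 * row["income"] - 1.5 * row["debt_ratio"]

def ablation_attribution(model, row, background):
    """Estimate each feature's contribution to model(row) by replacing that
    feature with values from background rows and averaging the change."""
    base = model(row)
    contributions = {}
    for name in row:
        swapped = [model({**row, name: other[name]}) for other in background]
        contributions[name] = base - mean(swapped)
    return contributions

applicant = {"income": 60_000, "debt_ratio": 0.5, "zip_digit": 7}
background = [
    {"income": 30_000, "debt_ratio": 0.3, "zip_digit": 1},
    {"income": 50_000, "debt_ratio": 0.1, "zip_digit": 9},
]

decision, path = trace(TREE, applicant)
print(decision, path)  # the path itself is the full derivation of the output
print(ablation_attribution(opaque_score, applicant, background))
```

With the tree, the printed path is a complete account of how the input became the output. With the opaque model, the attribution is only an estimate, and it depends on which background rows are chosen for comparison.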

Brian: How did you get to this point where you felt like you needed to go get a degree in data science? I mean, you have a design background. That’s pretty unusual. Was there a moment that drove this, or was this a slow curiosity or just interest in math and statistics? How did you get where you are now? Like what made you jump into that pool?

Josh: I have always been a fairly technical designer in that I’ve wanted to know the underlying material of what I was working with. And so, when I was doing a lot of app design, website design, I thought I need to learn some code. And I got really into writing code. I actually wrote a couple books on building interactive installations and using code in data visualizations, programming microcontrollers for small prototypes when I was working with industrial design firms. I’ve always wanted to know what was going on underneath the design. I think it’s an important part of designing anything to understand how the thing that you are making is actually built. And the folks I know who do footwear design go to factories a lot. And the people I know who do furniture designs spend a lot of time in woodworking shops and think about wood very deeply.

For me as a designer working in software, I wanted to understand what was going on underneath the stuff that I was making. And so I learned to code. And I’ve always had that approach. In 2019, I spent some time working as a design lead with IDEO.org based in Nairobi, Kenya. And we did a lot of work with partners like the Gates Foundation, the Marie Stopes Institute, and the Omidyar Network. And one of the things I realized working on these large-scale social impact projects was that (a) people in development and public health think very quantitatively. And if you want to work with them, you need to understand what they’re talking about when they talk about doing randomized control trials or when they show you some linear regression that they’ve pulled out. The other thing that I realized was that to work at that kind of scale, both in terms of the number of users and the amount of time that their programs and projects ran, you have to track things. You really have to be careful about what you’re measuring, and you have to be careful about how you’re analyzing those measurements because otherwise there’s no way to know over the course of 20 years, 10 years whether something is working or not. You can’t just go out and ask people, “Do you think it worked?” You really need to measure, and you need to understand how to analyze those measurements. That took me into a stats program in essence. I mean, it was a data science program, but I really chose the quantitative research and stats track within it and found that, well, incredibly challenging but also incredibly rewarding.

Brian: It might be worth just sharing with this audience, because there’s a subset of designers and user experience professionals that listen to this show, but that’s not really the primary audience. Especially in the digital design space, there’s a lot of designers who don’t think that they need to know about engineering and code. And some of them are actively against learning it because, “It will soil my creativity if I know how it works. And I want to be open.” I’m completely with you that it’s like, “Well, why are the footwear designers in the factory?” And with furniture, they know about different kinds of wood and how this one cuts and can be shaped and this one for different kinds of paint. This fabric isn’t good for spills, but that one is. In every other field in design, you need to know your materials, your paints if you’re a painter.

So why is it in digital, when code is the foundation of our stuff, that you don’t need to know about that? I think it only makes us better as designers. I find it kind of a cop-out a little bit. It doesn’t mean that all designers need to be writing code in their job, but you need to be able to call BS, in my opinion. You need to know what’s feasible. You need to be able to speak to an engineer, you know, telegraphing difficulties when you’re saying, “We need to build X, Y, and Z feature. I know it’s going to be hard. I’ve thought about X, Y, and Z. Here’s an MVP version here because I know you’re going to have to stand up this whole new object model,” blah, blah, blah. Just being able to do that, you don’t need to code it, but you need to know what it’s like, partly to get your own stuff across. You know? And I don’t think this audience listening maybe knows that it’s a little unusual, your background here, to actively want to understand the raw material you’re working with. I think it’s great. I don’t know many that have the data science background. There’s a lot more designers I think today that are comfortable with code. I think the younger generation, it’s almost like it was when I came up where you learn Photoshop and Dreamweaver, and then you learn some coding, some templating language or whatever, and you did everything because you were a quote, “web designer.” You had to do everything, you know. And then over time, the roles specialized and got more separate.

And I kind of feel like the younger generation, they’re like hacker-types. They’re designers, but they know how to prototype, or they’re using some kind of, you know, off-the-shelf tools, but they kind of know what’s going on. They know what an API is. That’s changed a little bit there. It’s a little bit of a tangent here, but I just wanted to, like, call that out. I think it’s fairly unusual, and it might be surprising to some.

Josh: There’s an empathy-building exercise that comes from using these tools and understanding where they come from. I do understand the argument that some designers make. If you want to find a better way to do something, spending a ton of time in the trenches of the current way that it’s done is not always the solution, right? I think this goes a little bit to some of the moonshot vision mentality that some designers have, where I’ve heard people say things like, “If, at the turn of the century, your goal was to find a better way to get people around the city, hanging out in the horse stables is not going to get you there.” I get that. But I also think that there’s a real empathy that you build and understanding that you build from seeing how your designs are actually implemented that makes you a better teammate. It makes you a better collaborator and ultimately, I think, makes you a better designer because of that. You can draw up mocks for anything under the sun. But getting people to actually build those and have them actually work the way that you were intending them to work requires incredible amounts of communication and collaboration with people who are going to think very differently than you. And that’s something that’s important for folks to understand as much as possible.

Brian: I agree. I think especially with data products too, just they’re not binary. It’s not like, “Did the form process correctly or not? Yes or no.” “No. It threw errors, and here are the specific errors.” It’s like, “The chart could be from 0 to 100, and what happens when it’s 20?” “Oh, like the Y-axis is like slammed tiny.” “Oh. Well, how do we set the Y-axis dynamically based on this value?” There’s actually so many more considerations, I think. You have to think about all the possible ranges of the data. And at some point, we’ve got to make a design decision. So how do we design when the data’s dynamic? Across all these different industries, there’s so many considerations that I think it’s hard to do a great job of that if you don’t understand something about the underlying material that’s going into these things, you know.

But anyway, we could have a whole other discussion on that. I don’t want to go too far on that. But it’s been great chatting with you. Is there anything I didn’t ask you that you think I should’ve asked you, given that we have some common background here? I’m just curious. Anything you’d like to share or have the last word on?

Josh: Yeah. I guess I would say to the non-designers who work with designers, measuring designs is not invalidating the designer. It doesn’t invalidate the craft of design. It shouldn’t be something that designers are hesitant to do. I think it’s really important to understand in a qualitative way what your design is doing and understand in a quantitative way what your design is doing. So if someone says, “I like this. I don’t like that. I understand this. I don’t understand that.” That’s the sort of information that is really difficult to get until you talk to someone face to face. But there’s also a lot of ways of measuring design that are just now beginning to land in the design community in ways that I think make people somewhat uncomfortable. They don’t want to A/B test things. They want to go on their designerly intuition. They see using quantitative approaches to research as kind of threatening to design, and I don’t think that needs to be the case. I think it’s very difficult to design for things if you don’t know how you’re going to measure them. And I want to, as much as possible, bring that to the world of people who interact with designers a lot and to designers themselves. You don’t need to be threatened by these things. These are going to make your products better, and they’re going to make you better because they’re going to help you understand at a larger scale what your users are seeing and experiencing and what your products are doing.

If I could leave with one thing, it would be to say, there is this great world of data-driven design that’s just now beginning to emerge on both sides of the design divide. And I see people feeling a little bit threatened by it or worried about it. And I don’t think that needs to be the case. And I think it’s going to be a great thing for people building products and for people designing products.

Brian: I’m with you there. I can relate to what you’re talking about there as well. And that’s good. That’s great insight. Thanks for sharing that. Great to talk with you. Where can people follow your work? Do you publish anywhere, social media, anything like that?

Josh: Yeah. Come look at TruEra. We have our AI quality courses that are run quarterly. They’re open to folks who want to sign up for them. We also have our SaaS open beta that’s going to be starting soon. So if people are interested in signing up for that, there’s a form on our website where people can do that. I personally am occasionally on Twitter and writing there. And I keep a fairly active Medium blog going where I’m trying to write a little bit about what this vision of data-driven design that I see in our near future might look like both for designers and for more quantitative researchers.

Brian: Awesome. Awesome. And just some people probably won’t go look at the webpage. Can you say what some of these URLs are? Like what’s the Twitter handle, for example?

Josh: Sure. The Twitter handle is @fctry2, so Factory 2, which is what happens when you use the name you thought was clever when you were 25. I have been on Twitter that long. And—

Brian: Medium Joshua Noble?

Josh: Yep. Medium—just I think it’s Joshua J Noble now that I think about it.

Brian: And it’s TruEra as T-R-U-E-R-A dot com?

Josh: Yes.

Brian: Cool. Well, thank you so much. And feel free to shoot me those links, and we’ll get them into the show notes and stuff. I really appreciate you coming on the show and chatting with me today.

Josh: Yeah. Thank you so much for having me. It was really, really fun.

 
