062 – Why Ben Shneiderman is Writing a Book on the Importance of Designing Human-Centered AI

Experiencing Data with Brian T. O'Neill

 

Ben Shneiderman is a leading figure in the field of human-computer interaction (HCI). 

Having founded one of the oldest HCI research centers in the country at the University of Maryland in 1983, Shneiderman has been intently studying the design of computer technology and its use by humans. Currently, Ben is a Distinguished University Professor in the Department of Computer Science at the University of Maryland and is working on a new book on human-centered artificial intelligence.

I’m so excited to welcome this expert from the field of UX and design to today’s episode of Experiencing Data! Ben and I talked a lot about the complex intersection of human-centered design and AI systems.

 

In our chat, we covered:

 

  • Ben's career studying human-computer interaction and computer science. (0:30)
  • 'Building a culture of safety': Creating and designing ‘safe, reliable and trustworthy’ AI systems. (3:55)
  • 'Like zoning boards': Why Ben thinks we need independent oversight of privately created AI. (12:56)
  • 'There’s no such thing as an autonomous device': Designing human control into AI systems. (18:16)
  • A/B testing, usability testing and controlled experiments: The power of research in designing good user experiences. (21:08)
  • Designing ‘comprehensible, predictable, and controllable’ user interfaces for explainable AI (XAI) systems and why XAI matters. (30:34)
  • Ben's upcoming book on human-centered AI. (35:55)

Quotes from Today’s Episode

The world of AI has certainly grown and blossomed — it’s the hot topic everywhere you go. It’s the hot topic among businesses around the world — governments are launching agencies to monitor AI and are also making regulatory moves and rules. … People want explainable AI; they want responsible AI; they want safe, reliable, and trustworthy AI. They want a lot of things, but they’re not always sure how to get them. The world of human-computer interaction has a long history of giving people what they want, and what they need. That blending seems like a natural way for AI to grow and to accommodate the needs of real people who have real problems. And not only the methods for studying the users, but the rules, the principles, the guidelines for making it happen. So, that’s where the action is. Of course, what we really want from AI is to make our world a better place, and that’s a tall order, but we start by talking about the things that matter — the human values: human rights, access to justice, and the dignity of every person. We want to support individual goals, a person’s sense of self-efficacy — they can do what they need to in the world, their creativity, their responsibility, and their social connections; they want to reach out to people. So, those are the sort of high aspirational goals that become the hard work of figuring out how to build it. And that’s where we want to go. - Ben (2:05)

 

The software engineering teams creating AI systems have got real work to do. They need the right kind of workflows, engineering patterns, and Agile development methods that will work for AI. The AI world is different because it’s not just programming, but it also involves the use of data that’s used for training. The key distinction is that the data that drives the AI has to be the appropriate data, it has to be unbiased, it has to be fair, it has to be appropriate to the task at hand. And many people and many companies are coming to grips with how to manage that. This has become controversial, let’s say, in issues like granting parole, or mortgages, or hiring people. There was a controversy that Amazon ran into when its hiring algorithm favored men rather than women. There’s been bias in facial recognition algorithms, which were less accurate with people of color. That’s led to some real problems in the real world. And that’s where we have to make sure we do a much better job and the tools of human-computer interaction are very effective in building these better systems in testing and evaluating. - Ben (6:10)

 

Every company will tell you, “We do a really good job in checking out our AI systems.” That’s great. We want every company to do a really good job. But we also want independent oversight of somebody who’s outside the company — someone who knows the field, who’s looked at systems at other companies, and who can bring ideas and bring understanding of the dangers as well. These systems operate in an adversarial environment — there are malicious actors out there who are causing trouble. You need to understand what the dangers and threats are to the use of your system. You need to understand where the biases come from, what dangers are there, and where the software has failed in other places. You may know what happens in your company, but you can benefit by learning what happens outside your company, and that’s where independent oversight from accounting companies, from governmental regulators, and from other independent groups is so valuable. - Ben (15:04)

 

There’s no such thing as an autonomous device. Someone owns it; somebody’s responsible for it; someone starts it; someone stops it; someone fixes it; someone notices when it’s performing poorly. … Responsibility is a pretty key factor here. So, if there’s something going on, if a manager is deciding to use some AI system, what they need is a control panel, let them know: what’s happening? What’s it doing? What’s going wrong and what’s going right? That kind of supervisory autonomy is what I talk about, not full machine autonomy that’s hidden away and you never see it because that’s just head-in-the-sand thinking. What you want to do is expose the operation of a system, and where possible, give the stakeholders who are responsible for performance the right kind of control panel and the right kind of data. … Feedback is the breakfast of champions. And companies know that. They want to be able to measure the success stories, and they want to know their failures, so they can reduce them. The continuous improvement mantra is alive and well. We do want to keep tracking what’s going on and make sure it gets better. Every quarter. - Ben (19:41)

 

Google has had some issues regarding hiring in the AI research area, and so has Facebook with elections and the way that algorithms tend to become echo chambers. These companies — and this is not through heavy research — probably have the heaviest investment of user experience professionals within data science organizations. They have UX, ML-UX people, UX for AI people, they’re at the cutting edge. I see a lot more generalist designers in most other companies. Most of them are rather unfamiliar with any of this or what the ramifications are on the design work that they’re doing. But even these largest companies that have, probably, the biggest penetration into the most number of people out there are getting some of this really important stuff wrong. - Brian (26:36)

 

Explainability is a competitive advantage for an AI system. People will gravitate towards systems that they understand, that they feel in control of, that are predictable. So, the big discussion about explainable AI focuses on what’s usually called post-hoc explanations, and the Shapley, and LIME, and other methods are usually tied to the post-hoc approach. That is, you use an AI model, you get a result and you say, “What happened?” Why was I denied a parole, or a mortgage, or a job? At that point, you want to get an explanation. Now, that idea is appealing, but I’m afraid I haven’t seen too many success stories of that working. … I’ve been diving through this for years now, and I’ve been looking for examples of good user interfaces of post-hoc explanations. It took me a long time till I found one. The culture of AI model-building would be much bolstered by an infusion of thinking about what the user interface will be for these explanations. And even DARPA’s XAI (Explainable AI) project, which has 11 projects within it, has not really grappled with this in a good way about designing what it’s going to look like. Show it to me. There is another way. And the strategy is basically prevention. Let’s prevent the user from getting confused and so they don’t have to request an explanation. We walk them along, let the user walk through the steps—this is like the Amazon checkout process, a seven-step process—and you know what’s happened in each step, you can go back, you can explore, you can change things in each part of it. It’s also what TurboTax does so well, in really complicated situations, and walks you through it. … You want to have a comprehensible, predictable, and controllable user interface that makes sense as you walk through each step. - Ben (31:13)
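To make the post-hoc idea Ben describes concrete: Shapley-style methods attribute a single model output to its input features by averaging each feature's marginal contribution over many orderings, and libraries such as SHAP and LIME package up refined versions of this. Below is a minimal, illustrative Monte Carlo sketch of that idea in Python; the toy "mortgage score" model, its weights, and all variable names are invented for illustration and are not from Ben, SHAP, or LIME.

```python
import numpy as np

def shapley_attributions(predict, x, background, n_samples=500, seed=0):
    """Monte Carlo estimate of Shapley-style attributions for one prediction.

    predict    : callable mapping a 2-D array of rows to a 1-D array of scores
    x          : 1-D array, the instance being explained
    background : 2-D array of reference rows that stand in for "feature absent"
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(x.shape[0])
    for _ in range(n_samples):
        order = rng.permutation(x.shape[0])                 # random feature ordering
        row = background[rng.integers(len(background))].copy()
        prev = predict(row[None, :])[0]
        for j in order:
            row[j] = x[j]                                   # reveal feature j
            new = predict(row[None, :])[0]
            phi[j] += new - prev                            # marginal contribution
            prev = new
    return phi / n_samples

# Toy usage: a hand-written linear "mortgage score" with made-up weights,
# so the attributions have a known ground truth.
weights = np.array([0.6, -0.3, 0.1])
predict = lambda rows: rows @ weights
applicant = np.array([2.0, 1.0, 4.0])
background = np.zeros((1, 3))                               # reference applicant
print(shapley_attributions(predict, applicant, background))
# -> approximately [1.2, -0.3, 0.4]
```

The result is a vector of per-feature numbers. As Ben points out, the open design question is how to turn numbers like these into an interface that a denied applicant can actually understand and act on.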

Links

Ben on Twitter: https://twitter.com/benbendc

Transcript

Brian: Welcome back, everyone. This is Brian T. O’Neill. You’re listening to Experiencing Data. Today, I’m really happy to talk about human-centered design and its impact on artificial intelligence, and how these disciplines are working together.

 

I recently attended a webinar by the People-Centered Internet Coalition, I believe it’s called. And Ben Shneiderman from the University of Maryland, who is a professor of computer science there, was presenting a talk. And he’s actually writing a book on HCAI, as he refers to it, so we’re going to unpack what HCAI is. Ben, tell us who you are, though, and why are we talking about AI now in the design world? And why are you writing this book right now?

 

Ben: Hi, Brian. Thanks for the chance. Yeah, this has become a big passion for me. So, I’m glad to talk about this. Yes, my background’s from human-computer interaction, user interface design, user experience design, from the world of computer science.

 

I became 20% of an experimental psychologist in trying to study the way people use computers. And that’s become my calling card in life. My book Designing the User Interface is now in sixth edition, to tell the heroic story of how this small group of researchers changed the world by making websites that you could use, graphical user interfaces, and mobile devices that really work, that 2 billion people have in their pockets to connect up with the world, to do business, to get medical care, and so on.

 

So, it’s from that world that we’re building out, and learning some new lessons, and putting the old ones to work. The world of AI has certainly grown and blossomed, and it’s the hot topic everywhere you go. It’s the hot topic in businesses around the world, and in governments, and governments launching agencies to monitor AI, and making regulatory moves, and making rules. The European GDPR is changing the way things happen. People want explainable AI; they want responsible AI; they want safe, reliable, trustworthy AI.

 

They want a lot of things, but they’re not always sure how to get them. And the world of human-computer interaction has a long history of giving people what they want, and what they need. And so that blending seems like a natural way for AI to grow and to accommodate the needs of real people who have real problems. And not only the methods for studying the users, but the rules, the principles, the guidelines for making it happen. So, that’s where the action is.

 

Now, of course, what we really want from AI is to make our world a better place, and that’s a tall order, but we start by talking about the things that matter, the human values, the rights, the human rights, the access to justice, and the dignity of every person. We want to support individual goals, their sense of self-efficacy, they can do what they need to in the world, their creativity, their responsibility, and their social connections; they want to reach out to people. So, those are the sort of high aspirational goals, and then below, that becomes the hard work of how do we build that? And that’s where we want to go.

 

Brian: Yeah, no. I’m with you on some of that. I’m particularly curious, though, I think the leap for some of the audience that’s probably listening to this show, and people that are CDOs, CAOs, leaders of data science teams, analytics teams, these people are frontline workers when it comes to building out the models, and the models eventually become aspects of the engineering which then yields an interface and an experience, which is there whether it’s done with intent or not, it’s the, everyone’s a designer, whether you know it or not. So, how is it that a data science organization or an analytics team or someone that’s very close to the actual model development makes the leap to say, “Oh, I need user experience, or design help, or someone with human factors knowledge? How is this going to help me out?” That gluing those two things together is not a natural thing, I think for many data people. So, can you help us unpack why do I need that?

 

Ben: Right on. Sure. The business analysts, the designers, the developers, the implementers, the software engineers, the data managers, those are the people who are going to have to make the real tough decisions; they’re going to have to bridge the gap between the ethics that we all want—we want responsible, humane, and wonderful AI—and we have, then, the gap, though, to the practitioner. What does the practitioner do in order to make a reliable, safe, and trustworthy system? And then how do they know if they’ve really done it? What does a team do?

 

How do we assess the performance? So, there’s a set of approaches to this: 15 recommendations that I put together in this paper that was published last year. And those things have three levels.

 

One is the level of the teams, the software engineering teams. And they’ve got real work to do. They need the right kind of workflows, engineering patterns, and Agile development methods that will work for AI. The AI world is different because it’s not just programming, but it also involves the use of data that’s used for training. So, the key distinction is that the data that drives the AI has to be the appropriate data, it has to be unbiased, it has to be fair, it has to be appropriate to the task at hand.

 

And many people and many companies are coming to grips with how to manage that. Of course, if you’re going to—this has become controversial, let’s say, in issues like granting parole, or mortgages, or hiring people. And the controversies that Amazon ran into when its hiring algorithm favored men rather than women [laugh]—that certainly caused a lot of trouble—the bias that’s occurred in facial recognition algorithms, which were less accurate with people of color than they were with light-skinned people. And that’s led to some real problems in the real world. And that’s where we have to make sure we do a much better job and the tools of HCI are very effective in building these better systems in testing and evaluating.
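As an illustration of the kind of bias check Ben is pointing at, one routine test compares selection rates across groups; US hiring guidance often cites a "four-fifths" disparate-impact ratio. The sketch below is hypothetical Python with invented numbers, not a method Ben prescribes and not a substitute for a real fairness review.

```python
def selection_rates(outcomes):
    """outcomes: {group: (selected, total)} -> {group: selection rate}"""
    return {group: selected / total for group, (selected, total) in outcomes.items()}

def disparate_impact_ratio(outcomes):
    """Ratio of the lowest group selection rate to the highest."""
    rates = selection_rates(outcomes)
    return min(rates.values()) / max(rates.values())

# Invented hiring-funnel counts, for illustration only.
hiring = {"men": (180, 1000), "women": (95, 1000)}
ratio = disparate_impact_ratio(hiring)
print(f"selection rates: {selection_rates(hiring)}")
print(f"disparate impact ratio: {ratio:.2f} "
      f"({'below' if ratio < 0.8 else 'meets'} the 0.8 guideline)")
```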

 

Okay, so let’s say you’ve got the right data for the right task. You’ve also got to have the right algorithm. And you’ve really got to build the system, and then assess and test it, validate, and verify. So, the methods for doing that are a rich set of possibilities. We’re growing that and we’re shaping it for the needs of AI, and the complexity whereby the algorithms have to reassess, sometimes daily, because the training data has changed because the context of use has changed. And so when the world is changing in a way, that’s, maybe, unpredictable, we have to really be much more careful. And especially when we’re dealing with applications that Cathy O’Neil—maybe a relationship of yours—

 

Brian: [crosstalk 00:08:03].

 

Ben: —in her excellent book called Weapons of Math Destruction she talks about opaque, scalable, and harmful. Those are the three things. If you can’t understand the algorithm, if it potentially causes harm, and if it’s widely implemented, then we better pay attention. So, if there’s a startup company with a new application, or a researcher, fine, let them do their thing. But if it’s a major—if Bank of America is rolling out a new mortgage granting system, then we have to focus attention and make sure that it’s done right because people’s lives are changed.

 

Transportation systems, medical systems, military systems, all these are what I would call consequential or life-critical applications. And when you’re dealing with those, it’s not okay to be 95% correct. You have to be much higher than that. So, there are lightweight applications, like recommenders for films, books, restaurants, et cetera, where, okay, if it’s 95%, that’s all right. And if 5% of the recommendations are a little strange, well that might be fun and interesting. It’s okay. But I go back to where the real importance is, in the case of consequential and life-critical applications.

 

So, we got one level, which is the software engineering design. And within that, there’s two more big issues I want to mention. One is the idea of audit trails. It’s an easy thing to do. It’s what civil aviation has built as a way of making safety.

 

And we need that for self-driving cars; we need that for every robot. A flight data recorder for every robot is what I’m telling you about. Financial systems and trading systems already do that, so scale is possible. And we have to decide on how we’re going to have medical devices track their history, record them, report them, have an audit trail. So, if something goes wrong, we can go back, retrospectively, and analyze what happened. Audit trails are an easy, simple—maybe I shouldn’t say easy—but [laugh] it requires some thinking, but that’s something every automated, consequential system should be doing.

 

The second important thing that software engineers and the designers need to do is explainability. Every user needs to be able to say, “Why did that happen?” And get an answer in which they understand or they understand it enough to say, “That’s not fair. I want to complain. I want a review. I want to evaluate.” Or, “I’m talking to my lawyer. Let me [laugh]—I need to—I’ve been treated unfairly, I haven’t been granted the parole I deserve, I haven’t been given the job I really—or the raise I deserve.” So, we are talking about consequential, important systems here, and therefore, we need to build in the mechanisms by which users understand what’s happening.

 

So, those are five of the principles for software engineering teams. For business leaders—corporate leaders—building a culture of safety is becoming the right idea. It’s not a matter of just saying, “Yeah, we want safe systems,” but it’s actually doing it. Hiring the right people, allocating the resources, doing the training, building the environment, reporting the near-misses, and reporting the failures. Incident reports are a big deal. Incident reports for the near-misses and for the failures.

 

And that’s a long history of successful things in aviation. Certainly, we’ve got that. But the Food and Drug Administration’s got its adverse drug event reporting system, open to the public, to clinicians, pharmacists, and others, where they report problems that patients are having with certain medications. And from that comes a better understanding. We need that.

 

So, the Partnership on AI has put up an AI incident database. Take a look at it. It’s got more than 1000 reports about systems that have failed. There’s lessons to be learned from those, and valuable ideas about what could be done to make systems still better.

 

So, the incident reporting is part of the safety culture, and the near-misses are as important as the failures. The near-misses give you the clues to what’s going wrong, there’s usually a lot more near-misses than there are failures, and the good news is sometimes people have avoided near-misses turning into failures. And that’s what you want to learn from, too. So, near-misses give you hero stories of pilots that saved a failing plane, not just the ones that crashed. So, there’s a set of things in the level of organizational design that I call the safety culture.

 

Above that, and outside, we’re going to look for the formats of independent oversight, whereby the work of companies who are building consequential and life-critical systems gets reviewed by insurance companies, by accounting firms, by professional societies, by standards bodies. There’s a lot of organizations that are interested, and the winning systems will be those that are open and transparent, that can be reviewed, evaluated, and discussed in an open way, in an adversarial way. It’s like zoning boards; we’re going to move towards the notion of a zoning board where if you want to build a house, or you want to build an office or a commercial building, you appear before the zoning board and you say, “Here’s my plan.” And the zoning board says, “Well, that looks pretty good. Looks like you’ve adhered to the 30,000 pages of the building code, and you’ve done a good job, but there’s a few things that your neighbors are complaining about. And here are the changes we want you to make.”

 

And you do that, and then you come to some agreement, we hope, and you get a permit to build. You build it, and then the inspector comes and gives you a certificate of occupancy, and then your insurance company says, okay, we’ll insure it. So, there’s those kinds of mechanisms of zoning boards, and then continuous review, like the Food and Drug Administration does for pharmaceutical production and for food production, meatpacking, and so on, and the Federal Reserve Board does for banking systems just to make sure things get handled in a fair and equitable way. And then, of course, we have the failures. When there are serious consequential failures, you want an open review.

 

Here, the National Transportation Safety Board is our favorite example. They fly in when a plane crashes, or a train, or a boat, and they investigate. They’re respected; their reports give guidance as to what to do, and they have teeth, and they have impact. And they’re respected as an independent oversight board. And that’s what we want.

 

Every company will tell you, “We do a really good job in checking out our AI systems.” That’s great. We want every company to do a really good job. But we also want independent oversight of somebody who’s outside the company, who knows the field, who’s looked at systems at other companies, and can bring ideas and bring understanding of the dangers as well. Because these systems operate in an adversarial environment, where there are malicious actors out there who are causing trouble.

 

And you need to understand what the dangers are, what the threats are, to the use of your system. You need to understand where the biases come from, what the dangers are there, and where the software has failed in other places. You may know what happens in your company, but you can benefit by learning what happens outside your company, and that’s where independent oversight from accounting companies, from governmental regulators, and from other independent groups is so valuable.

 

So, that’s the three, that’s the level structure for bridging the gap between ethics and practice.
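Two ideas from that answer, the "flight data recorder for every robot" and incident reporting, are concrete enough to sketch. Below is a minimal, hypothetical append-only decision log in Python; the field names, the file name, and the example values are invented for illustration rather than taken from Ben's paper or any particular standard.

```python
import json, hashlib, datetime

def log_decision(log_path, model_version, inputs, output, operator):
    """Append one consequential decision to an audit trail for later review."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,   # which model and weights produced this
        "inputs": inputs,                 # the features the model actually saw
        "output": output,                 # the score or decision it returned
        "operator": operator,             # the human accountable for the system
    }
    # Hashing the record lets an outside auditor detect after-the-fact edits.
    record["record_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    with open(log_path, "a") as f:        # append-only; never overwrite history
        f.write(json.dumps(record) + "\n")

log_decision(
    "decisions.jsonl",
    model_version="mortgage-model-2021-03",
    inputs={"income": 54000, "loan_amount": 210000},
    output={"approved": False, "score": 0.41},
    operator="loan-officer-117",
)
```

A retrospective review, or an incident report about a near-miss, then has something to work from: what the model saw, what it decided, which version decided it, and who was responsible at the time.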

 

Brian: I do get that. But for teams that are doing… they’re trying to do applied work. Let’s say that the outputs, the software, the solutions, the predictive models, whatever it may be, actually services partners, or internal employees, other departments who’re trying to help the marketing department with spend, we’re trying to build a pricing algorithm for the sales team to use. Does this framework change if the AI is not being experienced by, quote, “A paying customer,” or even a third party who doesn’t intend to be part of it, but it’s being affected by that? Like a camera using facial rec. in a public space, and I don’t have any relationship with that company that’s powering that; I do get all that.

 

Does this formula change if we’re talking about applying this work internally? Data as a strategic asset—like, let’s take this equipment, we want to predict when the hardware that our company makes is going to fail before it does, send a replacement out at this hospital so they never have downtime with that X-Ray machine; it is always working. But we need telemetry about it. Maybe there’s somewhat sensitive information in that. Like, how much of that data can we capture? Should we be building models on this? Like, does some of this have to change if we don’t really expose the AI or the machine learning algorithms?

 

Ben: [crosstalk 00:17:33] you’ve got the right idea. Cathy O’Neil’s guidance gives us a good clue. Opacity is one danger, but the ones you’re talking about are scale and harm. If it’s a small-scale thing, and there’s no harm to others, if there’s only internal dangers, that’s probably okay. You were at the edge of it, when you said a medical device.

 

The X-Ray machine is going to kill somebody, and I’m worried about it. But if it’s a maintenance device on a plumbing fixture, then you’re okay. You don’t need to do that. That’s a different rule. The internal use is very clearly what you’re talking about where we’re not talking about harms to others.

 

Brian: I guess in my experience, though, there’s a tremendous amount of waste even there. And this isn’t so much that the models and solutions aren’t ethical or whatever; most of it’s we built a model in isolation; the deployment of the model was not thought of as part of the solution the way I think a designer would. We would think holistically about it; it has to include the last-mile user experience piece within the design of the whole thing; otherwise, you just build a little component that sits on a shelf, but until it goes out into the business, it’s not doing any good. So, I feel like there’s even an earlier gap that needs to be solved, even for the teams just trying to leverage this stuff internally. Like, the 101 stuff is still missing there, and I’m curious if you think the 101 design practice, the 101 standards of design apply to this space, or whether it’s a different default set of learning that data professionals need to know about design when it comes to working with artificial intelligence. Is it different, or is it mostly about the same fundamentals with a little bit of extra stuff when we’re using this capability?

 

Ben: You’re getting to important points here. I like this. I’d say the basic principle is to think about the control panel: what does the user interface look like? And that forces you to ask the question, who’s the user? Who is operating this device?

 

There’s no such thing as an autonomous device. Every device is really respon—someone owns it; somebody’s responsible; someone starts it; someone stops it; someone fixes it; someone notices when it’s performing poorly. [laugh] if that’s not true, then your organization’s in deeper trouble. I mean, you need—responsibility is a pretty key factor here. So, if there’s something going on, if a manager is deciding to use some AI system, what they need is a control panel, let them know: what’s happening? What’s it doing? What’s going wrong? And what’s going right?

 

And so that kind of supervisory autonomy is what I talk about, not full machine autonomy that’s hidden away and you never see it, because that’s just head-in-the-sand kind of thinking. What you want to do is expose the operation of a system, and where possible, give the stakeholders who are responsible for performance the right kind of control panel and the right kind of data. The feedback. Feedback is the breakfast of champions. And companies know that.

 

They want to be able to measure the success stories, and they want to know their failures, so they can reduce them. The continuous improvement mantra is alive and well. We do want to keep tracking what’s going on and make sure it gets better. Every quarter.
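The "control panel" Ben describes can be grounded in exactly that kind of feedback. The sketch below summarizes the hypothetical decisions.jsonl file from the earlier audit-trail sketch into a few supervisory numbers; the metrics and the alert threshold are illustrative assumptions, not a prescription from Ben.

```python
import json
from collections import Counter

def health_snapshot(log_path, score_alert=0.25):
    """Summarize the decision log into figures a responsible human can watch."""
    decisions, scores = Counter(), []
    with open(log_path) as f:             # expects the log written by the earlier sketch
        for line in f:
            record = json.loads(line)
            decisions[record["output"]["approved"]] += 1
            scores.append(record["output"]["score"])
    total = sum(decisions.values())
    return {
        "total_decisions": total,
        "approval_rate": decisions[True] / max(1, total),
        "mean_score": sum(scores) / max(1, len(scores)),
        "low_score_alerts": sum(s < score_alert for s in scores),
    }

print(health_snapshot("decisions.jsonl"))
```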

 

Brian: Have you seen before-after examples of teams that may not have been practicing any sort of intentional design, or user experience, or human factors recipes, activities, behaviors in the work that they do, and then maybe later, they started to use some of these and saw a change in the work that they did? And I don’t just mean, like, rescuing yourself from a public relations disaster [laugh] because a model went south and you got on the front page of The Times, but something where business value increased, customer satisfaction increased, some other metric, besides, kind of, the headline AI situations that we often hear about?

 

Ben: Sure. The headline stories, I mean, would be—and the transformative ones are things like the early days of airbag deployments in cars: they were great; they saved 2500 lives a year. But in the early days, they killed about 100 babies and elders because of inadvertent deployments. And only when the data began to be collected through emergency rooms—it took a long time because nobody was thinking that it could be bad. Not nobody, but not enough people.

 

And only when the data was collected did those, you know, inadvertent deaths get addressed. And so you built in a sensor in the passenger seat, and you move the baby into the backseat. So, all those changes are really things that made a difference. And the other, sort of, headline one was the 737 Max, of course, and its crashes; it took two very painful, tragic, deadly crashes to reset the user interface.

 

Remember, Boeing’s mistake was that there was no user interface. The MCAS system was hidden from the pilots. They were not even informed that it was in place. And so, they didn’t think of turning it off because they just didn’t know what was going on. So, those kinds of things are the headline-grabbers.

 

But every eCommerce company is working with this kind of testing, the idea of A/B testing is something that happens hundreds of times weekly at Microsoft and Amazon, and everywhere else. And so they will run tests: does having a bigger picture of a book sell more books or not? And so you give 10,000 users a page with bigger book pictures, and you give 10,000 users a page with the normal ones. You wait two weeks, and you see who sells more books. Is it better to have more space for the textual description, or better to have more space for the image of the book?

 

So, those kind of things, those small-focus decisions that you were talking about, those are happening on a regular basis for leading companies that understand that feedback is the breakfast of champions, that they need to know what’s working, and they need to try different things. So, there’s three ways of trying things: A/B testing is a popular commercial approach. Usability testing is another way. I just dealt with an interesting case: starting January 1 of this year, the US government requires hospitals—all 6,000 hospitals in the US—to post their estimated prices for 300 common procedures. And this journalist was looking at whether they did a good job in making it accessible.

 

And we looked at dozens of hospital websites, and some did a great job, putting it right up front in a readable way on top, and in a meaningful way with language you could understand, and others hid it way down on the bottom, and when you clicked on it, you got this very strange kind of list of—

 

Brian: Procedure codes and, kind of [laugh]—

 

Ben: Procedure codes and abbreviations that were really tough to make sense of. And you had to click, and click, and click to get anything done, and even then you weren’t sure what the price would be. So, it’s—that’s not an easy task, I have to say, but it’s exactly the issue you’re talking about that if you want to make a procedure that’s effective, you’re going to evaluate and see how well it works. So, A/B testing in the real world, usability testing—a dozen users, give them a dozen tasks for an hour, and see how many of them get it done and what goes wrong, and then write a report about what the problems were, and what worked, and what needs to be improved. So, that’s the basic work of usability research, in a practitioner environment.

 

The third testing method would be the academic approach of controlled experiments where you give 20 users one version and 20 users another version, you got your stopwatch, and you measure how many seconds it take for them to do each of the dozen tasks, and then you look for statistically significant differences in completion times or error rates, et cetera, et cetera. So, there’s a whole world of research about that. But there’s a much bigger world of professional practitioner design and experience people who are working and doing great jobs in a lot of companies. Not everywhere [laugh], but the more—as time goes by, more and more companies are understanding the value, the power, the importance, the effect of good user experience design.

 

Brian: Mm-hm. You know, to state something against this, or—and maybe this is just the fact of learning, but if we look at some of the issues with Google, if you know about some of the hiring issues that have gone on there in the AI research area, Facebook with, you know, elections, and the way the algorithms tend to become echo chambers there, these companies also—and this is not through heavy research—probably have the heaviest investment of user experience professionals within data science organizations. They have UX, ML-UX people, UX for AI people, they’re at the cutting edge where you don’t—I see a lot more generalist designers in most other companies. Most of them are rather unfamiliar with any of this or what the ramifications are on the design work that they’re doing. But even these largest companies that have, probably, the biggest penetration into the most number of people out there are getting some of this really important stuff wrong.

 

And it suggests that—and maybe it’s just because we’re still learning: how do we put design and user experience into this field? And maybe we’re still just learning that, but there’s been some big, big problems there. And where is design for that?

 

Ben: Sure.

 

Brian: Something’s not right, still.

 

Ben: I don’t promise perfection, but I will tell you—and I’ve got lots of friends working at Google and Facebook—these are places, and the stories you mentioned, I know a fair amount about. I guess, there’s a couple of responses, which is to say, these companies are doing a lot of heavyweight, serious, big things. So, while they have a lot of usability people, the question is, what percentage of the people are usability people? And what level of influence do they have inside the companies? Okay?

 

Now, the Google issues about ethics were not necessarily a usability issue, they were really questions about employee relationships and what employees should be doing, and these are complicated issues that I don’t think we want to get into. I would say, Facebook, you’ve got another case there, which is, there are usability issues about how can users control the settings? I mean, Facebook’s got 120 privacy settings to check on. And if you’ve ever dug into it, it’s pretty tough to make sense and know what settings you’ve got. And so there’s a lot more work that needs to be done.

 

And I would say that issue is not high enough on the totem pole for Facebook, and I’d like to see much better user interface control. I’d like to be able to control my newsfeed. Facebook lets me make little twiddles on it—maybe block some users or so on, but I don’t have—I think—I would like better control. And I’ve backed away from using Facebook because I just don’t like what I’m getting. So, there you go.

 

There’s one user speaking up. But I think there's an awareness that there are problems and I’d like to see it a lot better at Facebook. And I’d like to see a lot better at Google. I mean, we have—there’s a merger here; HCI has become really powerful stuff, here. And Shoshana Zuboff’s book about surveillance capitalism reveals the much deeper fundamental problems that are happening because these AI algorithms are collecting a lot of data and shaping what we see, and what we do, and what options are available to us.

 

I’ll also put in a plug for Frank Pasquale, his book about the new laws of robotics. Pretty good stuff about the problems that are out there and that require legal changes and deep economic changes, the business models of these companies. And so that gets into a lot more issues that HCI and user interface and ACAI are part of. But the basic business models are what really have to think through.

 

Brian: You talked about the explainability earlier. And I think I’m detecting a trend, which is that I think that more teams are understanding that explainability is ultimately going to win on multiple levels. It’s better for the customer, it’s better for the business, you see more adoption, even at the cost of precision. Tell me, if I’m a data leader—well, I’m using the Shapley Framework and I’ve got my Shapley explanations for how it worked. Done. Check. Am I done? What did I miss if I—you know, we used an explainable model, we made sure we didn’t use a black-box model. So check, we did that. So, why do I need this design stuff?

 

Ben: So, you said it well before. Explainability is a competitive advantage. That’s what people are coming to understand, that people will gravitate towards systems that they understand, that they feel in control of, that are predictable. So, the big discussion about explainable AI focuses on what’s usually called post-hoc explanations, and the Shapley, and LIME, and other methods are usually tied to the post-hoc approach. That is, you use an AI model, you get a result and you say, “What happened?” Why was I denied a parole, or a mortgage, or a job?

 

And at that point, you want to get an explanation. Okay now, that idea is appealing, but I’m afraid I haven’t seen too many success stories of that working. In fact, when I read that voluminous literature, it’s taken me—I mean, I’ve been diving through this for years now, and I’ve been—especially for the past year—been looking for examples of good user interfaces of post-hoc explanations. It took me a long time till I found one. The culture of AI model-building would be much bolstered by an infusion of thinking about what the user interface will be for these explanations. And even the DARPA’s XAI—Explainable AI—project, which has 11 projects within it—has not really grappled with this in a good way about designing what it’s going to look like. So, show it to me. If any of your listeners [laugh] have great examples, send me an example of screenshots of what an explanation—

 

Brian: You mean a post-hoc? What a post-hoc looks like?

 

Ben: Yeah.

 

Brian: Okay. Yep.

 

Ben: There is another way. And the strategy is basically prevention. Let’s prevent the user from getting confused and so they don’t have to request an explanation. We walk them along, let the user walk through the steps—this is like the Amazon checkout process, a seven-step process—and you know what’s happened in each step, you can go back, you can explore, you can change things in each part of it. It’s also what TurboTax does so well, in really complicated situations, and walks you through it.

 

And just what I was looking at this past week about the hospital price estimators, the best ones were ones that walk you through it, that ask you a series of questions you understood, and you got through it. This is really what happened 30 years ago with knowledge-based expert systems and with online help. In the old days, there used to be online help and user manuals, 300-page user manuals for user interfaces. They’ve gone away. Because we’ve learned that you don’t want to have to teach people what’s in a whole user manual.

 

You want to have a comprehensible, predictable, and controllable user interface that makes sense as you walk through each step. And so, that’s what airline reservation does for you. That’s what Amazon checkout does. That’s what TurboTax does. And that’s what we need to do for AI applications, too: predictable, meaningful, applicational.

 

So, I mean, there’s a great series of work, narrowly focused, from the University of Colorado: Conner Brooks and Daniel Szafir’s work. There’s three or four really nice papers about robots, okay? Physical robots. And it turns out if you show the user of this robot what the robot’s going to do—give them a visual picture of what’s going to happen—then they know what’s going to happen and they say, “Okay, do it.” And then they watch it happen.

 

It’s sort of like the success story of GPS navigation systems. You say, “I want to get from here to there.” And then you get a little map, which gives you three or four choices. And you say, “Okay, that’s the one I’ll take.” And then it gives you the set of directions to do it.

 

So, lots of AI in there, but users in control; they understand what’s going to happen. And maybe my favorite example is your digital camera. There’s lots of AI to set the focus, the shutter, the aperture, and reduce hand jitter, all kinds of things going on, but you’re in control. You’ve got a user interface that lets you see what the picture is going to be. You can zoom where you want to go. And then you get to click at the decisive moment when you make that picture. It’s your picture. And lots of AI going on, but it’s under your control, and it’s—you know what you’re going to get when the job is done.
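The "prevention" strategy Ben describes, walking users through comprehensible steps they can revisit and change, can also be sketched as a small piece of interface state. The class and step names below are hypothetical, loosely modeled on the checkout-style flows he mentions.

```python
class SteppedFlow:
    """Minimal state behind a step-by-step flow the user can always inspect."""

    def __init__(self, steps):
        self.steps = steps                 # ordered step names
        self.answers = {}                  # what the user chose at each step
        self.current = 0

    def record(self, value):
        """Store the user's choice for the current step and advance."""
        self.answers[self.steps[self.current]] = value
        if self.current < len(self.steps) - 1:
            self.current += 1

    def back(self):
        """Let the user return to the previous step; nothing is hidden or lost."""
        self.current = max(0, self.current - 1)

    def summary(self):
        """Everything decided so far: the visible history is the explanation."""
        return {step: self.answers.get(step, "(not yet answered)")
                for step in self.steps}

flow = SteppedFlow(["shipping address", "delivery speed", "payment", "review"])
flow.record("123 Main St")
flow.record("standard")
flow.back()                                # the user changes their mind
flow.record("overnight")
print(flow.summary())
```

Because every step's outcome stays visible and reversible, the user rarely needs to ask "what just happened?" after the fact, which is the point of the prevention approach.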

 

Brian: Ben, this has been really great to chat with you. Do you have any closing thoughts for people that may be coming, more, from the data side, and kind of curious about this space? And they believe in the value here, but they’re not, maybe, sure how to take some steps with this and bringing design and human factors into their work?

 

Ben: The answer is, you can do it. So, talk to your friends, check it out. I mean, of course, the papers I’ve got are out there on the University of Maryland Human-Computer Interaction Lab website on Human-Centered AI, there’s four or five papers that I’ve done that cover this. And so you can check those out.

 

The one called Bridging the Gap Between Ethics and Practice was the one I mentioned earlier, and I think that’s most relevant to your listeners. So, that’s a good one to go after. But there’s a growing movement around this topic. I mean, the Stanford Human-Centered AI Institute has lots of seminars and so on.

 

I’m doing tutorials about it. On April 13th, there’s a three-hour tutorial I’m doing at the ACM Conference on Intelligent User Interfaces. And on May 27th, at the University of Maryland Human-Computer Interaction Lab’s annual symposium, again a free, open, three-hour tutorial.

 

That’s my contribution to trying to help move the story. But there are lots of other people active in this topic. And this is important, but my message is you can do it, and it will make your products and services better, and you will have a happier success story.

 

Brian: Excellent. If people want to follow you, are you on Twitter or LinkedIn? Or is there a good place?

 

Ben: I’m on Twitter: @benbendc on Twitter, so that’s a place to follow me.

 

Brian: Excellent. We’ll look that up and link that up in the [notes 00:37:42]. And when’s the book coming out, by the way? The new one?

 

Ben: Yeah, it’s cooking away through Oxford University Press. But real books take time, and so it’ll happen January 2022.

 

Brian: January 2022. Excellent. We’ll look for it.

 

Ben: For the moment, the papers are out there; it’s a good start.

 

Brian: Ben, thank you so much for coming on the show and sharing your ideas with us.

 

Ben: Thank you, Brian.

 
