
Season 5, Episode 6: Human Imperfections in AI with Hal Daumé

February 24, 2022  |  By Cindy Moehring


Cindy Moehring chats with Hal Daumé, senior principal researcher at Microsoft and professor of computer science at the University of Maryland, to explore the intersection of ethics and technology. Their discussion covers racial bias in artificial intelligence, the complexities of ethical machine learning, and who is responsible for ethical oversight at large tech firms.


Episode Transcript

Cindy M.  0:03  
Hi, everyone. I'm Cindy Moehring, the founder and executive chair of the Business Integrity Leadership Initiative at the Sam M. Walton College of Business, and this is the BIS, the Business Integrity School podcast. Here we talk about applying ethics, integrity and courageous leadership in business, education and, most importantly, your life today. I've had nearly 30 years of real-world experience as a senior executive, so if you're looking for practical tips from a business pro who's been there, then this is the podcast for you. Welcome. Let's get started. Hi, everybody, and welcome back to another episode of the BIS, the Business Integrity School. I'm Cindy Moehring, the founder and executive chair. And we have with us today a very special guest as we continue our discussion on tech ethics. We have with us Hal Daumé, and he has a very interesting background. Hi, Hal.

Hal Daumé  0:59  
Hi, Cindy, it's great to see you again.

Cindy M.  1:01  
It's great to see you again, too. I've had a chance to get to know Hal, for one because Microsoft serves on our external advisory board, and then we spoke together on a panel for the Northwest Arkansas Tech Summit recently. But let me tell you all about Hal and then we'll jump right into our discussion. Hal actually is a senior principal researcher at Microsoft Research in New York, where he's a member of the Fairness, Accountability, Transparency and Ethics group, as well as the machine learning and reinforcement learning groups. Now, in addition to that, Hal has a second life: he is also a chaired professor of computer science and language science at the University of Maryland in College Park. And in that role, he's authored over 150 peer-reviewed papers, several of which have received awards. And so, Hal, wow, thank you for being with us today. And I would love it if you could just share with the audience a little bit about how you got to where you are, doing both of these roles, and what got you interested in tech ethics.

Hal Daumé  2:07  
Sure. Um, so maybe just a little bit of background. So when I first started doing research, I was basically doing natural language processing stuff, so basically getting computers to interact with people using human language in a natural way. And my interest sort of slowly shifted around various parts of AI, like parts of machine learning and reinforcement learning. And at some point, it kind of became clear to me that we now have the ability to build a bunch of really complicated and potentially really beneficial technologies,

Cindy M.  2:42  
Right.

Hal Daumé  2:42  
but we're not always doing that. And sometimes we might think we are, but they have unintended consequences, and so on. And so my interest in sort of the tech ethics and fairness and bias spaces, and all those sorts of areas, is basically around this question of how do we ensure that the technology we build, especially AI-based technology, is really, you know, moving the needle forward in terms of, you know, building a better world

Cindy M.  2:42  
Right.

Hal Daumé  2:57  
rather than, you know, reinforcing existing social problems.

Cindy M.  3:14  
Right 

Hal Daumé  3:15  
And so I think, as we see AI technology coming into more and more parts of our lives, this is just becoming more and more important.

Cindy M.  3:23  
Yeah, yeah, it really is. And I'm sure we'll get into this more in the conversation, but it kind of feels like, you know, we launched all this technology, and everybody was like, "Yay, yay, yay," and now we're starting to say, "Oh, well, maybe wait a minute," you know. And so we'll have a chance to talk about how we can maybe rejigger that whole process a bit and do it differently going forward. But Microsoft, I love the fact that you talked about wanting to build a better world and doing that in the right way. And I know Microsoft has a set of AI principles, in fact, for responsible AI. Can you tell us a little bit about what those principles are?

Hal Daumé  4:03  
Sure. So this is something that's actually gone through a few iterations at this point, as such things do, but the current version has six responsible AI principles. So I'll list them and then say a little bit about them, and I'm happy to, you know, I think later we'll talk about examples of how these things play out in practice. But the six are fairness, inclusiveness, reliability and safety, transparency, privacy and security, and accountability.

Cindy M.  4:31  
Okay

Hal Daumé  4:32  
A lot of this is built on Microsoft's current corporate mission statement, which is, okay, I can't get it exactly right, but it was basically to empower people everywhere.

Cindy M.  4:40  
Right 

Hal Daumé  4:40  
And so a lot of this is sort of coming from that position. So you know, if you want to empower people, you need to treat them fairly. You need to empower everyone, not just some people, not just, you know, the people who already have lots of power in society. You need to do this in a way that, you know, people can rely on and that doesn't break, and people should be able to understand what these systems are doing. Microsoft, unlike a lot of other companies that people interact with daily, is much more sort of business to business; we have less of a sort of direct engagement with people, right? We don't really have, like, a social media site and things like that. And so for us, things like security and privacy are super important, because company-to-company trust is something that's really hard to get back if you've broken it.

Cindy M.  5:29  
Oh, yeah.

Hal Daumé  5:30  
And not treating people's data securely is a really good way of breaking trust. 

Cindy M.  5:34  
Yeah. 

Hal Daumé  5:35  
And then the last one is accountability, which I think is interesting, because it's the one that's not about the system itself. But the way it's phrased is people should be accountable for AI systems. So it's really putting the onus on the people rather than, you know, basically saying, like, oh, the system did something, right? At the end of the day the buck doesn't stop with the system; the buck stops with, like, a person somewhere.

Cindy M.  5:57  
Right. Right. So essentially, it keeps humans at the center, really, of the technology and makes it clear that technology is really here to serve humans, and we have to be accountable for it, not the other way around. That's actually a really powerful statement.

Hal Daumé  6:12  
Yeah. 

Cindy M.  6:12  
So how can you share with us maybe some examples of how you see those principles actually playing out at Microsoft? I think it's one thing to have them on a piece of paper. You know, I mean, today is, what, the 20-year anniversary of Enron, I think it is, and they had lots of things on a piece of paper, too ('cause we know, they all crashed). But how do you actually see that coming to life at Microsoft? How's it real?

Hal Daumé  6:36  
Yeah, so, you know, I've been involved in a handful of projects where various aspects of these different pieces have come up. For instance, on the topic of inclusiveness, one place where this came up recently: there was a study out of Stanford that looked at a bunch of different automatic speech recognition systems from, I think, five major technology companies, including Microsoft, and found that they underperformed when transcribing the speech of African Americans in comparison to white Americans. This violates the inclusiveness principle. So this is a technology now that is only empowering some people

Cindy M.  7:15  
Right 

Hal Daumé  7:15  
Not empowering others. The sort of immediate question was, what do we do about this? And not just what do we do about the specific problem, like the specific problem that the system doesn't appear to work as well for African Americans as for white Americans, but also acknowledging the fact that there's different ways of speaking English, even in the United States.

Cindy M.  7:35  
Yeah.

Hal Daumé  7:36  
If we want something that's inclusive, we don't want to just fix the disparities for Black Americans, but actually try to come up with a solution that's really going to be much broader than that. This led into a lot of discussion about dialects and sociolects. Dialects tend to be geographically constrained ways of speaking somewhat differently. So I went to college in Pittsburgh; people in Pittsburgh say yinz instead of y'all.

Cindy M.  8:04  
Here in the south, we tend to say y'all

Hal Daumé  8:06  
Say y'all. And so that's sort of a dialectal difference. There's also sociolectal differences. These are things that are not constrained by geography, but by some other social aspects. So you know, if you look at, for instance, socioeconomic status, people at different levels of socioeconomic status might speak English differently. They don't speak English better or worse, right? It's just different. So this study basically led to a big introspection: okay, if we're thinking about the US specifically, what are the major dialectal variations? What are the major sociolectal variations? How can we collect data that covers these in, like, an ethical way? And then, of course, Microsoft is also an international company. So what does this mean when we're talking about English internationally? What does it mean when we're talking about other languages? Right? There's sociolects and dialects of German, sociolects and dialects of Mandarin, so

Cindy M.  9:01  
It becomes a big issue, doesn't it? All of a sudden, yeah

Hal Daumé  9:03  
It does, and so it's hard, right? I know something about how these things work in the US. I mean, I'm certainly not an expert, but I have no idea how these things work in Germany, or even the UK. So this has created basically a large internal push to, you know, engage with linguists in various parts of the world who will have insight into, like, what are the types of variation that you're likely to come across there, and to work on, you know, data collection, both for evaluation and also for, like, building models in the first place.

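To make the kind of gap that Stanford study measured concrete, here is a minimal sketch in Python of computing word error rate (WER) separately for two speaker groups. The transcripts, system outputs, and group labels are hypothetical, and this illustrates the general idea of disaggregated evaluation rather than the study's actual methodology.

```python
# A minimal sketch, assuming made-up data: per-group word error rate (WER),
# the kind of disaggregated evaluation that surfaces the disparity described above.

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard Levenshtein dynamic program over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Hypothetical evaluation set: (speaker group, human transcript, system output).
samples = [
    ("group_a", "turn the lights off in the kitchen", "turn the lights off in the kitchen"),
    ("group_a", "what time is my meeting tomorrow", "what time is my meeting tomorrow"),
    ("group_b", "turn the lights off in the kitchen", "turn the light often the kitchen"),
    ("group_b", "what time is my meeting tomorrow", "what time is my meaning tomorrow"),
]

by_group = {}
for group, ref, hyp in samples:
    by_group.setdefault(group, []).append(word_error_rate(ref, hyp))

for group, wers in sorted(by_group.items()):
    print(f"{group}: mean WER = {sum(wers) / len(wers):.2f}")
# A consistent gap between groups is the kind of signal the study reported.
```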
Cindy M.  9:38  
So that's a really great example. I mean, it was, it sounds like, a point of introspection that's still playing out, because all of a sudden, to your point, the recognition became, well, we shouldn't just put a bandaid on it and solve for that one issue; we need to look holistically at the whole issue. And we're a global company, how do we step back and do that? So yeah, that's a great example. So let me ask you another question. I know you're part of the research group that is responsible for ethics and transparency in responsible AI. And in some respects, you may say at times, or Microsoft may, I'm not sure, but you know, responsible AI is everyone's job, which is great to say. But I've also been in the corporate world for many years, and I know that sometimes when it is said that way, and understood that it's everyone's job, it can almost feel like it's no one's job, right? Like, who's on point for that? Okay, we're all responsible for it, but who's on point? So how does Microsoft deal with that issue?

Hal Daumé  10:39  
Yeah, so not surprisingly, this is also something we've gone through iterations on. One quick comment first: we had this paper a couple of years ago, studying the way that people in industry, not just at Microsoft but across a large number of companies, including startups and big major tech companies, deal with fairness-related issues, or sort of AI ethics issues. And one of the things that we found in a lot of the interviews we did was that a lot of times these issues are dealt with by one particularly caring person. There's just someone on the team who's decided that this is important to them, and so they basically spend their nights and weekends trying to address AI ethics issues. But they don't get promoted for it; it's not part of their day job. One person we interviewed referred to these people as "fairness vigilantes", maybe not the best term, but I think this notion that they're really kind of acting on their own, against a system that's maybe not supporting this directly, was a problem. We also saw that it often happened that there was misunderstanding about whose job it was. So we had cases where we interviewed people on the same team, but at different levels, so like a data scientist and a manager, for instance, and

Cindy M.  12:12  
Like a product manager?

Hal Daumé  12:13  
Yeah. And the data scientist would say, like, oh, the product manager is worried about these things, and the product manager would say the data scientist is. We really wanted to tell them, like, you should talk to each other, but we couldn't, because of confidentiality and stuff like that.

Cindy M.  12:28  
Sure. 

Hal Daumé  12:29  
So anyway, one of the things we've recently tried, which seems to be working better than things we've tried in the past, is to have people on teams whose specific job it is to worry about these sorts of issues. So there's actually a couple of parallel, not quite parallel, but a couple of related organizations within Microsoft. There's the FATE research group, which is really focused on research; then there's Aether, which is basically an AI ethics cross-cutting thing across the company, which includes researchers, so I'm involved with that, as well as people from the legal team, people from HR, data scientists, engineers, and so on. And then there's a third sort of cross-cutting organization that's much more at the data scientist and engineer level. All of these are basically trying to provide scaffolding so that these individual people on each team whose job it is to think about ethics have a pool of resources that they can go to. It's like, they probably don't know everything, right?

Cindy M.  13:33  
Sure, right. 

Hal Daumé  13:34  
A lot of what I find myself doing in my sort of Aether time is consulting with these people whose job it is to sort of monitor AI stuff on teams. And, you know, like I gave the example of, how do we find linguists who know about sociolects and dialects, right? Maybe an engineer on the team doesn't know how to do that, or doesn't have the right connections, but one of us does.

Cindy M.  14:00  
Right, right. 

Hal Daumé  14:00  
So that's, that's been the structure we've had, and it's not perfect. There's certainly things that go wrong, but it seems to be working better than anything we've tried before.

Cindy M.  14:09  
Yeah, and what I like about that, just sort of hearing you talk about it, is I hope it helps everyone understand this is a journey. I mean, it's not like a one-and-done and you get it right; it's iterative, and you learn as you go, and you improve the processes as you go along. And this, you know, this seems to be working now, and if I talk to you a year later, there may be another iteration on it, but it sounds like it's working better, you know, than it was before. And you know, it's not uncommon, what you described, that if a company doesn't specifically and explicitly talk about whose responsibility something is, then everyone thinks it belongs to somebody else, right? And that's how, when you're working in a company, a company can get into trouble, because nobody really fully understands whose job it is. So it's good to see.

Hal Daumé  14:59  
The other thing we've been trying to do, with sort of mixed success, is provide tooling support for the people whose job it is to do this. So I wasn't involved in this, but there was a research project a couple of years ago, two years ago maybe now, on, quote, fairness checklists, which is basically, you know, things you should think about throughout the full development of an AI system. It's not meant to be a check-the-box exercise; checklists sometimes get a bad rap.

Cindy M.  15:31  
Yeah. 

Hal Daumé  15:31  
For good reasons, because it's what you have to do to check the box, and then you move on. 

Cindy M.  15:35  
Right. 

Hal Daumé  15:37  
But the checks are less about, you know, did you insert this line of code or something, and much more about did you think about this potential implication.

Cindy M.  15:50  
Right, right, right. So it's sort of a framework for things that they should be thinking about, as opposed to a checklist. That's, yeah, that's a good tool. That's really good. It's a guide, right? Yeah.

Hal Daumé  16:00  
It's not applicable everywhere, but it's trying to be. You know, Aether has been around, I'll probably get the dates wrong, but I'm going to say, like, five years or something like that. And, you know, we've been involved in a fair number of projects through that time. And sure, every project has its unique things that make it challenging, but there's a lot that happens that you see over and over again. And so if you can at least streamline those recurring things, then you can use the other resources, like reaching out to Aether or reaching out to a group or something, as a way of trying to address the things that are really specific to your problem.

Cindy M.  16:38  
Yeah, got it. Okay. Well, let's switch gears here for a minute and talk a little bit about machine learning specifically. In the audience for this conversation, not everyone is going to be a data scientist or an engineer, and they may not even understand what machine learning is. So can you just, in a few words, explain very simply: what actually is machine learning?

Hal Daumé  17:00  
Sure. I think of it as programming by example. So you want to write a program, but you don't know how to write it. But you can come up with examples of desired inputs and their corresponding outputs. So you take a lot of those, and you feed it into some machine learning algorithm, and it essentially writes your program for you. Now, garbage in, garbage out is a thing, and so, you know, if you don't give it good examples, it's going to learn something not so good. But yeah, it's really about, you know, trying to develop tools by providing examples of desired behavior rather than, you know, coding the behavior yourself.

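To make the "programming by example" idea concrete, here is a minimal sketch, assuming made-up pass/fail examples and using scikit-learn as one convenient library; the features, labels, and expected output are hypothetical.

```python
# A toy version of "programming by example": hand the algorithm input/output
# pairs and let it fit the mapping, instead of hand-coding the rule.
from sklearn.linear_model import LogisticRegression

# Hypothetical examples of desired behavior:
# [hours_studied, classes_attended] -> 1 (pass) or 0 (fail)
inputs = [[1, 2], [2, 1], [3, 3], [8, 9], [9, 10], [10, 8]]
outputs = [0, 0, 0, 1, 1, 1]

model = LogisticRegression().fit(inputs, outputs)  # "writes the program" from the examples
print(model.predict([[9, 9], [1, 1]]))             # expected: [1 0]

# Garbage in, garbage out: if the examples are unrepresentative or mislabeled,
# the learned "program" will faithfully reproduce those mistakes.
```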
Cindy M.  17:39  
Okay. So an algorithm actually, like you said, writes it; the machine writes it. You just tell it the inputs and the outputs, and you don't do the coding. So then, with that explanation, can you help us understand how the responsible AI principles play out in the machine learning kind of lifecycle? I'm sure there's a process for that. So how does all that work together?

Hal Daumé  18:06  
Yeah. So there's lots of ways of thinking about the machine learning lifecycle. I mean, we have one that we reuse a lot. So, you know, I sort of said it in an easy way, right? Like, oh, all you do is provide examples, and then, like, magic happens,

Cindy M.  18:20  
Yeah, sounds like "I can just go do that." 

Hal Daumé  18:22  
You could right? And then, you know, you run the risk of terrible things happening in the world, 

Cindy M.  18:28  
Right. 

Hal Daumé  18:30  
So yeah, so generally, we think about this in sort of stages. So the first is, you know, just like in any engineering project, you have to define a task that you want to solve. This often involves collecting input from various stakeholders, right? This is basically a design problem.

Cindy M.  18:49  
Yeah

Hal Daumé  18:50  
Like, anything you might do in design can go into defining the task, right? So, going back to the principles, one of the principles is inclusiveness. And so, you know, if you want your technology or your systems to work for a broad range of people, it might be worth getting their input at sort of the design phase.

Cindy M.  19:11  
Yeah, got it. 

Hal Daumé  19:12  
So then, okay, you need to get these input and output examples from somewhere, right? Now is where you start collecting and possibly annotating data. So the standard way people think of sort of responsible AI stuff coming into the collecting and annotating of the data is basically making sure that your data is as representative of how you want your system to behave as possible. So there was an old now story about how, I can't remember what car manufacturer it was, but they had a voice recognition system in their car, and it worked really well when men would talk to it and not so well when women would talk to it. Right, and why? Well, it's because apparently a lot of this data was collected in computer science labs; they had people in the labs record themselves. Turns out lots of people in computer science labs are men. And so they ended up collecting a lot more data from men than from women, and so the system learned to do a better job on men than women. That's what it was shown how to do.

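A minimal sketch of the kind of check that could have caught the imbalance in that story: count how much training data each speaker group contributed. The metadata and the 30 percent threshold below are illustrative assumptions, not a standard.

```python
# Count training utterances per speaker group and warn about imbalance.
from collections import Counter

# Hypothetical training metadata: one speaker-group label per recorded utterance.
training_speakers = ["men"] * 900 + ["women"] * 100

counts = Counter(training_speakers)
total = sum(counts.values())
for group, n in counts.items():
    share = n / total
    print(f"{group}: {n} utterances ({share:.0%} of the training data)")
    if share < 0.30:  # illustrative threshold for "under-represented"
        print(f"  warning: {group} are under-represented; expect the system to do worse for them")
```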
Cindy M.  20:24  
Right? 

Hal Daumé  20:26  
So the collecting and annotating of the data really does drive a lot of how well the system is going to generalize across populations. And so you often see people pointing at the data as if the data is sort of the main or only cause of potential issues.

Cindy M.  20:49  
Sure. Yeah

Hal Daumé  20:49  
And certainly it has a role to play. Like, in this example of data collected in computer science labs, the data is that way because of social conditions that lead to a world in which there are more men than women in computer science labs, right? So the data is a reflection of something in the world.

Cindy M.  21:10  
Right.

Hal Daumé  21:11  
Or, I guess, another example of this: there was a high-profile case of, you know, a big tech company that built a system to do automated resume filtering, and it basically filtered out all women.

Cindy M.  21:22  
Yeah, right. 

Hal Daumé  21:23  
And this, again, is the same sort of thing. It was trained on data that was biased by historical hiring processes, and so it's kind of no surprise that it's going to replicate the biases in the historical hiring process.

Cindy M.  21:39  
Mm hmm. I was just going to say, it really does put a fine point on: yes, this may be an engineering process and an engineering project, but it isn't just engineers, again, who need to be involved in that. Because, to your point, thinking back just to the data isn't always going to be the answer; you've got to look broader than that. Think about where you go get the data set, and others who aren't necessarily engineers can help. You know, two minds that are different, a diverse team, are going to be better than one in thinking about where we need to go to get the right data set, so that we don't end up with, you know, cars that only understand men and not women, right?

Hal Daumé  22:15  
Yeah. So okay, we have our task, we have our data. So now we pick a machine learning model, which is basically going to be, like, what is the structure of the thing that's going to map from the inputs to the outputs?

Cindy M.  22:27  
Got it. 

Hal Daumé  22:27  
I think maybe in terms of the responsible AI principles, the biggest thing that comes up here is really around transparency. So a lot of people are using deep neural nets to do everything these days, because they're pretty effective at a lot of stuff when you have lots of data and lots of compute. But it's really hard to understand what they're doing. You know, it's basically this giant sequence of matrix multiplications that goes on for, like, ages. And if your system makes some bad decision at the end, it's really hard to say what went wrong in the middle.

Cindy M.  23:00  
Yeah, Yeah. 

Hal Daumé  23:01  
And so there's some work in research on trying to make deep neural networks more explainable. But if you go talk to Cynthia Rudin, who is a professor who's done a bunch of really good work in this space, her attitude is basically: if you're building a model for a high-stakes setting, you should be using something that you can understand directly. So decision trees or decision lists are the sorts of things that she's often advocating for.

Cindy M.  23:30  
Interesting, yeah.

Hal Daumé  23:31  
So there's big choices there about like, how important is it that this thing is understandable,

Cindy M.  23:36  
Right 

Hal Daumé  23:37  
to people who are looking at it? 

Cindy M.  23:38  
Yeah. 

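To illustrate the contrast Hal is drawing, here is a minimal sketch, assuming a made-up screening dataset, of the kind of directly readable model (a shallow decision tree) that work like Cynthia Rudin's favors for high-stakes settings; scikit-learn's export_text prints the learned rules so a reviewer can read them.

```python
# Train a small, directly inspectable model and print its learned rules.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical screening examples: [income_k, debt_ratio] -> 1 (approve) or 0 (deny)
X = [[30, 0.6], [45, 0.5], [40, 0.7], [70, 0.3], [80, 0.2], [95, 0.1]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["income_k", "debt_ratio"]))
# The printout is a handful of if/else rules a reviewer can check directly,
# unlike a deep network's long sequence of matrix multiplications.
```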
Hal Daumé  23:40  
Okay, so then you train the model on the data that you have, you'll then do some testing, usually, and then some sort of deployment of the system. And then there's either sort of direct feedback, right, like some user complains that your system doesn't work, or maybe the New York Times writes an article that your system is terrible, or more implicit things. So, like, lots of systems these days collect, what's the word, click-throughs, or reading times, or stuff like that. So there's both direct and more implicit feedback.

Cindy M.  24:09  
Right, right, right. Yeah, yeah. And then adjusting and monitoring and tinkering as you go along the way. So let's come back to what we touched on at the beginning of our conversation. I'm going to ask you a question about how to maybe avoid this "Oh, no, we should have done it differently." You know, we're never going to be perfect, but I wonder if there are some economic incentives, perhaps, for companies to get it right, or mostly right, at the beginning, before it's released. And when I say that, I'm thinking about some studies that I've seen, that you and I have talked about before, about accessibility, and how, in that context, when problems were found after things had been rolled out, I think it was almost double the cost of the project to get it right. But it was exponentially lower, like less than 5% of the cost, if you just do it right at the beginning. Do you think the same thing applies here? Is there something to be learned in this machine learning lifecycle from this accessibility example, that would be an economic incentive for companies?

Hal Daumé  25:17  
Yeah, I mean, I really want someone to do that study. I definitely suspect that this is true. Like, my experience is that the more you can think about at the level of designing the task, even, and the initial data collection, the easier it is to move forward.

Cindy M.  25:40  
Yeah

Hal Daumé  25:40  
This is hard, because, you know, I think a lot of times we have this sort of attitude of, you know, build a minimum viable product, and

Cindy M.  25:48  
Right the MVP.

Hal Daumé  25:50  
Yeah. And that's not necessarily an avenue that leads to the most reflection. But at the same time, what you often see in practice is that when things go wrong, it's so hard to fix them from the beginning that people tend to apply sort of bandage solutions.

Cindy M.  26:22  
Yeah, yeah. 

Hal Daumé  26:23  
So this also came up in this study we did a couple of years ago, where you basically see this, and there's a couple of reasons for it. So probably the most famous example that I know of was the case where there was an image captioning or image categorization tool that labeled a photo of a group of African American people as gorillas. And I mean, this made pretty big press, and it was obviously offensive, especially because of various historic prejudices around exactly this topic.

Cindy M.  27:03  
Yep. 

Hal Daumé  27:04  
And the solution that was employed at the time was just to prevent the model from outputting "gorilla" on any image. Which certainly fixes this problem, but you know, it doesn't get at the root of the problem

Cindy M.  27:22  
Right. 

Hal Daumé  27:23  
and it also prevents you from being able to use this thing on, you know, photos you take at the zoo, or somewhere where you're actually taking a photo of a gorilla.

Cindy M.  27:29  
Yeah. 

Hal Daumé  27:31  
And so you see this a lot. And part of it is because like, you need a quick fix, right? Like that part's totally understandable. 

Cindy M.  27:37  
Oh, yeah. 

Hal Daumé  27:38  
And the second is, one thing that's complicated about machine learning systems is that if you tweak part of the model over here, it's often very unclear how it's going to affect things over there. And so there's a fear that if I actually try to address this at a root-cause level, it's going to break a bunch of other stuff that is hard for me to detect. And so I think this is also why it's really important to try to spend a lot of time at the beginning on defining the task and collecting data and stuff like that, because it prevents you from getting into this position where you're terrified of changing anything, because you don't know what it will break.

Cindy M.  28:23  
Right, yeah, yeah. So it's almost thinking about it in a way that, to go fast, we almost need to slow down, right? A little bit at the beginning. Because that will allow us to go further, faster, than just an MVP that may get you, you know, 10 yards down the field, if you will, but not 100 yards. And if you want to get that touchdown and make it a sustainable win, then the more time you spend upfront thinking about it, with a diverse set of individuals, could actually help you go further faster. I think so. And there's a real tension there, though, because with technology, you know, companies, you've got to go fast. So yeah, you

Hal Daumé  28:29  
Got to go fast, yeah. I think, like you were saying at the beginning, there's the economic incentive, right? So yes, the accessibility studies: designing for accessibility from the start is like one or two percent of the cost, and doing it at the end, I can't remember the exact number, but I do remember that it often ends up being like two to three times as many lines of code to try to tack accessibility features onto a system that already exists.

Cindy M.  29:31  
Yeah. 

Hal Daumé  29:32  
And I don't know, you know, I hate prognosticating because I'm probably going to be wrong, but I'm pretty sure something similar holds for AI, right? So, like, even if all you care about is sort of the bottom line,

Cindy M.  29:47  
Yeah. 

Hal Daumé  29:48  
you know, I think there's an argument that it's much better to address these things up front than to try to patch them on afterwards.

Cindy M.  29:54  
Yeah, there are great research projects there, right, for that. That's right. So I thought of another sort of outside-the-tech-industry example, and I wonder what you think about the viability of using it within the tech industry. So, you know, again, you're talking about getting at a root cause, what causes mistakes, you could almost say what causes accidents to happen, which kind of brings to mind the transportation industry, which obviously is very highly regulated, so that's one difference right there. But you know, when a plane crashes, or there's a bad crash on the highways, for an airplane they recover the black box and they do a root-cause analysis and a big investigation into it. And it's broadly shared, right, so the industry as a whole can learn from that, and hopefully not have those root-cause mistakes happen again. Is there anything like that the industry shares on the tech side in terms of root causes of problems? Or would that be beneficial? And how would that even work in an industry that's not highly regulated like transportation?

Hal Daumé  31:08  
Yeah. I think there's a couple of answers. So I mean, I'm reminded most of, like, Ben Shneiderman, who had this article maybe a year ago or something, where he was drawing this parallel between sort of algorithmic reliability and aviation reliability. And he talks about three things that aviation monitoring does. So there's sort of continual monitoring of systems,

Cindy M.  31:40  
Right

Hal Daumé  31:41  
Or you could think about, you know, the FDA is constantly checking to see if there's, like, salmonella in your spinach.

Cindy M.  31:46  
Right, yeah, right. That's true. Right. Yeah. Food safety.

Hal Daumé  31:49  
And that's sort of proactive, right? And then on the other side, there's what you're saying: basically, when things go wrong, when a plane crashes, it's pretty obvious that something went wrong. It's actually a lot less clear for a lot of these systems, right? So, you know, I gave examples of speech recognition, I gave examples of the image captioning. There's, you know,

Cindy M.  32:14  
Facial recognition 

Hal Daumé  32:15  
tons more examples, like facial recognition not working for people with dark skin, CEO image search results returning all, you know, mid-40s white men photos, right,

Cindy M.  32:26  
Right. 

Hal Daumé  32:26  
And all of these have been uncovered either by investigative journalists or by researchers. As far as I know, I should qualify that slightly, but the majority of them are found this way.

Cindy M.  32:40  
Right. 

Hal Daumé  32:41  
And so I think there's first a detection problem. Maybe it's more analogous to the FDA case, where you have to detect the salmonella; it's not like it jumps out and says, "Hi, I'm salmonella."

Cindy M.  32:54  
That's right. 

Hal Daumé  32:55  
Yeah. And so I think where the field is right now is, we've spent a lot of time thinking about monitoring, at least self-monitoring. We haven't thought much about, or we haven't made much progress on, sort of third-party monitoring, like we might have in a regulatory system.

Cindy M.  33:14  
Right

Hal Daumé  33:14  
I don't think we're actually that far along in thinking about, you know, sort of the equivalent of the airplane black box. When things go wrong, we don't really have that many tools at our disposal to try to understand why. So, like I hinted at before, there's been a bunch of work on sort of explaining machine learning systems' behavior, and a lot of this is with the aim of debugging, which is essentially, you know, if something goes wrong, we'd like to debug what went wrong. There's also been a fair amount of work on trying to ascribe errors to aspects of the data on which the system was trained. So you might want to say something like, okay, this system made this error because these five training examples led it to think XYZ. And so then you can ask questions like, should I remove those training examples? Are they incorrect? Should I get more like them? But to be totally honest, we don't really have good ways of doing this. And I think this is one of the reasons why you see a lot of calls for regulation of various parts of the automated decision making industry; you see facial recognition bans in various states and countries; I think New York is implementing something about using automated decision making for hiring. For a lot of these things, there's a big gap, because in order to regulate something, you have to be able to measure it. And

Cindy M.  35:04  
Right 

Hal Daumé  35:04  
to be able to measure it, you have to be able to audit it, and we're just not there.

Cindy M.  35:08  
Not yet. 

Hal Daumé  35:09  
And so, while I'm very sympathetic to a lot of the calls for regulation, like, I think this makes a lot of sense, it's hard to imagine exactly what sort of regulations would actually move the needle there. That's my personal opinion; I'm definitely not speaking for Microsoft when I say that.

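Hal's point about ascribing errors to training data can be illustrated with a much cruder stand-in for that line of research: when a model gets an input wrong, retrieve its nearest training examples and inspect them and their labels. This is only a sketch under hypothetical data; it is not the attribution machinery or any tooling he refers to.

```python
# Retrieve the training examples closest to a misclassified input so a human
# can inspect them (mislabeled? unrepresentative? too few of them?).
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical training set: 2-D feature vectors and their labels.
X_train = np.array([[0.10, 0.20], [0.20, 0.10], [0.90, 0.80], [0.80, 0.90], [0.15, 0.90]])
y_train = np.array(["reject", "reject", "accept", "accept", "accept"])

x_bad = np.array([[0.20, 0.85]])  # an input the deployed system handled badly

nn = NearestNeighbors(n_neighbors=3).fit(X_train)
distances, indices = nn.kneighbors(x_bad)
for rank, i in enumerate(indices[0]):
    print(f"neighbor {rank}: features={X_train[i]}, label={y_train[i]}, distance={distances[0][rank]:.2f}")
# If the nearest neighbors turn out to be mislabeled or unrepresentative, that
# points at the "should I remove or fix those training examples?" question above.
```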
Cindy M.  35:32  
So, Hal, you're teaching at the University of Maryland, I think, a new class this semester about machine learning and ethics. What are the kind of one or two big takeaways that you've gotten? As we're recording this, we're near the end of the semester, so I'm curious how that course has gone and what you've learned from this first semester.

Hal Daumé  35:56  
I guess the first thing I learned is, I think students were really happy to be back in person.

Cindy M.  36:01  
Yeah, I know. 

Hal Daumé  36:02  
I think, I mean, everyone was super engaged, and it was a lot of fun to teach this class. I think there were a couple of things. The class was weird in the sense that it was a computer science class where probably half of the stuff we read was political philosophy, which a lot of students found challenging. We also had a bunch of students from other departments, from philosophy, or sociology, or psychology and the information school. So that was helpful. You know, I think, to me, the biggest thing is that there's this gap between the way that a lot of sort of ethics philosophy thinks about problems and what actually happens on the ground.

Cindy M.  36:49  
Yeah 

Hal Daumé  36:49  
I don't want to pick on the trolley problem too much. Ethicists love to talk about the trolley problem; that's just not the problem that we actually face. I guess one of the places where we've found the most leverage, in terms of actually really connecting sort of ethical principles with things on the ground, is the literature from science and technology studies. So this is a literature, at least the part of it that we're looking at, that is basically technical critique. So people like Anna Lauren Hoffman, for instance, have done a bunch of amazing work in this space, you know, sort of looking at the social implications of a lot of technology.

Cindy M.  37:29  
Yeah. 

Hal Daumé  37:30  
And this is what we need. There was this joke we had when I was in grad school. So I mentioned that I didn't start as a machine learning person; I started doing language. And we kind of joked about machine learning that, in machine learning, God gives you a matrix and your job is to do something with this matrix. And really, what this means to me now is that the starting place for a lot of machine learning work is "I have some data, but it's been completely decontextualized." And so we've moved ourselves from thinking about socio-technical problems and how systems interact with society,

Cindy M.  38:20  
Right 

Hal Daumé  38:20  
to purely technical problems. And that abstraction has been really useful in pushing a lot of machine learning research forward, but what we're seeing is that if you only use that abstraction, once the thing you build hits the real world,

Cindy M.  38:35  
Right 

Hal Daumé  38:36  
It's like bad things happen. 

Cindy M.  38:38  
Right, right, right, right. So it's bringing that abstraction back to reality with some practical critiques of systems, to fill that gap between just talking about ethics philosophy, which doesn't end up being very practical and useful at times, and this abstract, you know, problem of machine learning where I think I've fixed it. There's a huge gap between those two, right? Yeah, so filling that with critiques that are relevant and practical, I think, is a great way to go. Well, congratulations to you on teaching that course for the first time. Thank you for your time with us today, Hal. This has just been a wonderful conversation. Thank you for all the work you're doing at Microsoft as well. I always like to ask my guests one last question before they go. In addition to all the resources you've mentioned already, is there anything else, if a student or somebody else who's listening, an executive, wants to learn more about this topic, that you could recommend to them, either in terms of a book or maybe a documentary or a podcast series?

Hal Daumé  39:49  
Oh, it's so hard to just pick one. 

Cindy M.  39:51  
Well, you don't have to you can give two or three.

Hal Daumé  39:54  
I mean, I think, you know, if I had to say one place to go, I would check out a lot of the reports put out by the AI Now Institute. They've done a lot of really great work in this sort of socio-technical space of understanding the real-world impact of AI systems.

Cindy M.  40:18  
Okay. 

Hal Daumé  40:19  
Um, and they have a number of, you know, relatively short documents on a wide range of topics that are really good. 

Cindy M.  40:29  
Good.

Hal Daumé  40:30  
Um, you know, I also really liked the movie Coded Bias. This is largely about Joy Buolamwini and her efforts to rein in facial recognition technology. It's not just about her; it also has sort of vignettes from, you know, sort of the who's who in this space. So yeah, the people featured in that movie are good people to go check out, and their books too.

Cindy M.  40:53  
And that's a great one. I've watched that one. It's very engaging. It's a documentary, and she's a student at MIT. It's great.

Hal Daumé  41:01  
Yeah. And then maybe the third thing I would say is I really liked Ruha Benjamin's book Race After Technology. It's a pretty accessible read, and it's very recent, like in the past year, I think. She brings a bunch of interesting perspectives that are sort of easy to understand once they've been written down in a way that's easy to understand, but that are also sort of deep in their implications.

Cindy M.  41:32  
Yeah, yeah. Oh, those are great recommendations, Hal. Thank you. And thank you so much, again, for your time today. This has been just a fascinating conversation. Thank you so much. Appreciate it. Yeah. Thank you, Cindy.

Hal Daumé  41:43  
My pleasure. 

Cindy M.  41:44  
All right. Talk to you later. Bye bye. 

Hal Daumé  41:46  
Bye

Cindy M.  41:51  
Thanks for listening to today's episode of the BIS, the Business Integrity School. You can find us on YouTube, Google, SoundCloud, iTunes or wherever you find your podcasts. Be sure to subscribe and rate us, and you can find us by searching theBIS. That's one word, theBIS, which stands for the Business Integrity School. Tune in next time for more practical tips from a pro.

Cindy Moehring is the founder and executive chair of the Business Integrity Leadership Initiative at the Sam M. Walton College of Business at the University of Arkansas. She recently retired from Walmart after 20 years, where she served as senior vice president, Global Chief Ethics Officer, and senior vice president, U.S. Chief Ethics and Compliance Officer.




