Cindy Moehring chats with Hal Daumé, senior principal researcher at Microsoft and professor of computer science at the University of Maryland, to explore the intersection of ethics and technology. Their discussion covers racial bias in artificial intelligence, the complexities of ethical machine learning, and who is responsible for ethical oversight at large tech firms.
Resources From the Episode
- Stanford’s “The Race Gap in Speech Recognition Technology”
- Microsoft’s AI Fairness Checklist
- AI Now Institute
- Coded Bias by Shalini Kantayya
- Race After Technology by Ruha Benjamin
Episode Transcript
Cindy M. 0:03
Hi, everyone. I'm Cindy Moehring, the Founder and Executive Chair of the Business
Integrity Leadership Initiative at the Sam M. Walton College of Business, and this
is the BIS, the Business Integrity School podcast. Here we talk about applying ethics,
integrity and courageous leadership in business, education and, most importantly,
your life today. I've had nearly 30 years of real world experience as a senior executive.
So if you're looking for practical tips from a business pro who's been there, then
this is the podcast for you. Welcome. Let's get started. Hi, everybody, and welcome
back to another episode of the BIS, the Business Integrity School. I'm Cindy Moehring,
the founder and executive chair. And we have with us today a very special guest as
we continue our discussion on tech ethics. We have with us Hal Daumé and he has a
very interesting background. Hi, Hal.
Hal Daumé 0:59
Hi, Cindy, it's great to see you again.
Cindy M. 1:01
It's great to see you again, too. I've had a chance to get to know Hal, for one, because
Microsoft serves on our external advisory board, and then we spoke together on a panel
for the Northwest Arkansas Tech Summit recently. But let me tell you all about Hal,
and then we'll jump right into our discussion. Hal actually is a senior principal
researcher at Microsoft Research in New York, where he's a member of the fairness,
accountability, transparency and ethics group, as well as the machine learning and
reinforcement learning groups. Now, in addition to that, Hal has a second life:
he is also a chaired professor of Computer Science and Language Science
at the University of Maryland in College Park. And in that role, he's authored over
150 peer-reviewed papers, several of which have received awards. And so, you know, Hal,
wow, thank you for being with us today. And I would love it if you could just share
with the audience a little bit about how you got to where you are, doing both of these
roles, and what got you interested in tech ethics.
Hal Daumé 2:07
Sure. Um, so maybe just a little bit of background. So when I first started doing
research, I was basically doing natural language processing stuff. So basically getting
computers to interact with people using human language in a natural way. And my interest
sort of slowly shifted around various parts of AI, like parts of machine learning
and reinforcement learning. And at some point, it kind of became clear to me that
we now have the ability to build a bunch of really complicated and potentially really
beneficial technologies,
Cindy M. 2:42
Right.
Hal Daumé 2:42
but we're not always doing that. And sometimes we might think we are, but they have
unintended consequences, and so on. And so my interest in sort of the tech ethics
and fairness and bias and all those sorts of spaces is basically around this question
of how do we ensure that the technology that we build, especially, like, AI-based technology,
is really, you know, moving the needle forward in terms of, you know, building a better
world
Cindy M. 2:42
Right.
Hal Daumé 2:57
rather than, you know, reinforcing existing social problems.
Cindy M. 3:14
Right
Hal Daumé 3:15
And so I think, as we see AI technology coming into more and more parts of our lives,
this is just becoming more and more important.
Cindy M. 3:23
Yeah, yeah, it really is. And it will, I'm sure, we'll get into this more in our conversation,
but it kind of feels like, you know, we launched all this technology, and everybody
was like, "Yay, yay, yay," and now we're starting to say, "Oh, well, maybe wait
a minute," you know. And so we'll have a chance to talk about how we can maybe
rejigger that whole process a bit and do it differently going forward. But Microsoft,
I love the fact that you talked about wanting to build a better world and doing that
in the right way. And I know Microsoft has a set of AI principles, in fact, for Responsible
AI. Can you tell us a little bit about what those principles are?
Hal Daumé 4:03
Sure. So this is something that's actually gone through a few iterations at this point,
as such things do, but the current version is that there are six responsible AI principles.
So I'll list them and then say a little bit about them. But I'm happy to, you know,
I think later we'll talk about examples of how these things play out in practice. But
the six are fairness, inclusiveness, reliability and safety, transparency, privacy
and security, and accountability.
Cindy M. 4:31
Okay
Hal Daumé 4:32
A lot of this is built on Microsoft's current corporate mission statement, which is, okay,
I can't get it exactly right, but it was basically to empower people everywhere.
Cindy M. 4:40
Right
Hal Daumé 4:40
And so a lot of this is sort of coming from that position. So, you know, if you want
to empower people, you need to treat them fairly. You need to empower everyone, not
just some people, not just, you know, the people who already have lots of power in society.
You need to do this in a way that, you know, people can rely on and that doesn't break;
people should be able to understand what these systems are doing. Microsoft, unlike
a lot of other companies that people interact with daily, is much more sort of
business-to-business; we have less of a sort of direct engagement with people, right?
We don't really have, like, a social media site and things like that. And so for us,
things like security and privacy are super important, because company-to-company trust
is something that's really hard to get back if you've broken it.
Cindy M. 5:29
Oh, yeah.
Hal Daumé 5:30
And not treating people's data securely is a really good way of breaking trust.
Cindy M. 5:34
Yeah.
Hal Daumé 5:35
And then the last one is accountability, which I think is interesting, because it's
the one that's not about the system itself. But the way it's phrased is that people should
be accountable for AI systems. So it's really putting the onus on the people rather
than, you know, basically saying, like, oh, the system did something, right? Like, at
the end of the day, the buck doesn't stop with the system; the buck stops with, like,
a person somewhere.
Cindy M. 5:57
Right. Right. So essentially, it keeps humans at the center, really, of the technology
and making it clear that technology is really here to serve humans, and we have to
be accountable for it not the other way around. That's actually a really powerful
statement.
Hal Daumé 6:12
Yeah.
Cindy M. 6:12
So how can you share with us maybe some examples of how you see those principles
actually playing out at Microsoft? I think it's one thing to have them on a piece
of paper. You know, I mean, today's, what, the 20-year anniversary of Enron, I think
it is, and they had lots of things on a piece of paper, too ('cause we know they all
crashed). But how do you actually see that coming to life at Microsoft? How's it real?
Hal Daumé 6:36
Yeah, so, you know, I've been involved in a handful of projects where various aspects
of these different pieces have come up. For instance, on the topic of inclusiveness,
one place where this came up recently was a study out of Stanford
that looked at a bunch of different automatic speech recognition systems from, I think,
five major technology companies, including Microsoft, and found that they underperformed
when transcribing the speech of African Americans in comparison to white Americans.
This violates the inclusiveness principle. So this is a technology now that is only
empowering some people,
Cindy M. 7:15
Right
Hal Daumé 7:15
not empowering others. The sort of immediate question was, what do we do about this?
And not just what do we do about the specific problem, like the specific problem that
the system doesn't appear to work as well for African Americans as for white Americans,
but acknowledging the fact that there are different ways of speaking English, even
in the United States.
Cindy M. 7:35
Yeah.
Hal Daumé 7:36
If we want something that's inclusive, we don't want to just fix the disparities for
Black Americans, but actually try to come up with a solution that's really going to
be much broader than that. Yeah. This led into a lot of discussion about dialects
and sociolects. Dialects tend to be geographically constrained ways of speaking somewhat
differently. So I went to college in Pittsburgh; people in Pittsburgh say yinz instead
of y'all.
Cindy M. 8:04
Here in the south, we tend to say y'all
Hal Daumé 8:06
Say y'all. And so that's sort of a dialectal difference. There's also sociolectal
differences. So these are things that are not constrained by geography, but by
some other social aspects. So, you know, if you look at, for instance, socioeconomic
status, right, people at different levels of socioeconomic status might speak English
differently. They don't speak English better or worse, right? It's just different.
So this study basically led to a big introspection: okay, if we're thinking
about the US specifically, what are the major dialectal variations? What are the major
sociolectal variations? How can we collect data that covers these in, like,
an ethical way? And then, of course, Microsoft is also an international company. So
what does this mean when we're talking about English internationally? What does it
mean when we're talking about other languages? Right? There are sociolects and dialects
of German, sociolects and dialects of Mandarin, so
Cindy M. 9:01
It becomes a big issue, doesn't it? All of a sudden, yeah
Hal Daumé 9:03
It does, and so it's hard, right? I know something about how these things work in the
US. I mean, I'm certainly not an expert, but I have no idea how these things
work in Germany, or even the UK. So this has created basically a large internal push
to, you know, first engage linguists in various parts of the world who will
have insight into, like, what are the types of variation that you're likely to come
across there, and then working to, you know, do data collection, both
for evaluation and also for, like, building models in the first place.
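As a rough illustration of the kind of per-group evaluation the Stanford study performed, here is a minimal sketch that compares word error rate across two speaker groups. The transcripts and group labels below are made up, and this is not the study's actual pipeline, only the general shape of such an audit.

```python
# Sketch of a per-group word error rate (WER) comparison, in the spirit of the
# Stanford "Race Gap in Speech Recognition Technology" study. The transcripts
# and group labels below are made-up placeholders, not real study data.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between reference and hypothesis,
    normalized by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Hypothetical (reference transcript, ASR output, speaker group) triples.
samples = [
    ("turn the lights off in the kitchen", "turn the light off in the kitchen", "group_a"),
    ("what is the weather tomorrow",       "what is the weather tomorrow",      "group_a"),
    ("set a timer for ten minutes",        "set a time for ten minute",         "group_b"),
    ("call my sister back",                "call my sister bag",                "group_b"),
]

by_group = {}
for reference, hypothesis, group in samples:
    by_group.setdefault(group, []).append(word_error_rate(reference, hypothesis))

for group, rates in by_group.items():
    print(f"{group}: mean WER = {sum(rates) / len(rates):.2f}")
```

A gap between the group means is the kind of disparity the study reported, though a real evaluation would need far more data and careful matching of recording conditions.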
Cindy M. 9:38
So that's a really great example. I mean, it was, you know, a point of, it sounds like,
introspection that's still playing out, because all of a sudden, to your point, the
recognition became, well, we shouldn't just put a band-aid on it
and solve for that one issue; we need to look holistically at the whole issue.
And we're a global company, how do we step back and do that? So yeah, that's a great
example. So let me ask you another question. I know you're part of the research
group that is responsible for ethics and transparency in responsible AI. And in some
respects, you may say at times, or Microsoft may, I'm not sure, but, you know,
responsible AI is everyone's job, which is great to say. But I've also been in the
corporate world for many years, and I know that sometimes when it is said that way,
and understood that it's everyone's job, it can almost feel like it's no one's job,
right? Like, who's on point for that? Okay, we're all responsible for it, but, like,
who's on point? So how does Microsoft deal with that issue?
Hal Daumé 10:39
Yeah, so, not surprisingly, this is also something we've gone through iterations on.
Now, one quick comment first: we had this paper a couple of years ago studying the
way that people in industry, not just at Microsoft, across a large number of companies,
including startups and big major tech companies, deal with fairness-related issues,
or sort of AI ethics issues. And one of the things that we found in a lot of the interviews
we did was that a lot of times these issues are dealt with by one particularly caring
person; there's just someone on the team who's decided that this is important to them.
And so they basically spend their nights and weekends trying to address AI ethics
issues. But, like, they don't get promoted for it. It's, like, not part of their
day job. So one person we interviewed referred to these people as
"fairness vigilantes," maybe not the best term, but I think this notion that
they're really kind of acting on their own, against a system that's maybe not supporting
this directly, was a problem. We also saw that it often happened that there was misunderstanding
about whose job it was. So we had cases where we interviewed people on the same team,
but at different levels. So, like, a data scientist and a manager, for instance, and
Cindy M. 12:12
Like a product manager?
Hal Daumé 12:13
Yeah. And the data scientist would say, like, oh, the product manager is worried
about these things, and the product manager would say it's the data scientist. We
really wanted to tell them, like, you should talk to each other, but we couldn't,
because of, like, confidentiality and stuff like that.
Cindy M. 12:28
Sure.
Hal Daumé 12:29
So anyway, one of the things we've recently tried, which seems to be working better
than things we've tried in the past, is to have people on teams whose specific job
it is to worry about these sorts of issues. So there are actually a couple of parallel,
not quite parallel, but a couple of related organizations within Microsoft. There's
the FATE research group, which is really focused on research. Then there's
Aether, which is basically this AI ethics cross-cutting thing across the company,
which includes researchers, so I'm involved with that, as well as people from the
legal team, people from HR, data scientists, engineers, and so on. And then there's
a third sort of cross-cutting organization that's much more at, like, the data scientist,
engineer level. All of these are basically trying to provide scaffolding so that these
individual people on each team whose job it is to think about ethics have a pool
of resources that they can go to. It's like, they probably don't know everything,
right?
Cindy M. 13:33
Sure, right.
Hal Daumé 13:34
A lot of what I find myself doing in my sort of Aether time is consulting with these
people whose job it is to sort of monitor AI stuff on teams. And, you know, like,
I gave the example of, you know, how do we find linguists who know about sociolects
and dialects, right? Maybe an engineer on the team doesn't know how to do that,
or doesn't have the right connections, but one of us does.
Cindy M. 14:00
Right, right.
Hal Daumé 14:00
So that's, that's been the structure we've had, and it's not perfect. There's certainly
things that go wrong, but it seems to be working better than anything we've tried
before.
Cindy M. 14:09
Yeah, and what I like about that, just sort of hearing you talk about it, is I
hope it helps everyone understand this is a journey. I mean, it's not like a
one-and-done and you get it right; it's iterative, and you learn as you go, and you
improve the processes as you go along. And this, you know, seems to be working now.
And if I talk to you a year later, there may be another iteration of it, but it sounds
like it's working better, you know, than it was before. And, you know, it's not
uncommon, what you described: if a company doesn't specifically and
explicitly talk about whose responsibility something is, then everyone thinks it belongs
to somebody else, right? And that's how, when you're working
in a company, a company can get into trouble, because nobody really fully understands
whose job it is. So it's good to see.
Hal Daumé 14:59
The other thing we've been trying to do, with sort of mixed success, is provide
tooling support for the people whose job it is to do this. So I wasn't involved in
this, but there was a research project a couple of years ago, two years ago maybe now,
on, quote, fairness checklists, which is basically, you know, things you should think
about throughout the full development of an AI system. It's not meant to be a checkbox
exercise; checklists sometimes get a bad rap.
Cindy M. 15:31
Yeah.
Hal Daumé 15:31
For good reasons, because it's what you have to do to check the box, and then you
move on.
Cindy M. 15:35
Right.
Hal Daumé 15:37
But the checks are less about, you know, did you insert this
line of code or something, and much more about, did you think about X, Y, Z and their
potential implications.
Cindy M. 15:50
Right, right. It's sort of a framework for things that they should be thinking about,
you know, as opposed to a checklist. That's a good tool.
It's really a guide, right? Yeah.
Hal Daumé 16:00
It's not applicable everywhere, but it's trying to be, you know, Aether has been around,
I'll probably get the dates wrong, but I'm gonna say, like, five years or something
like that. And, you know, we've been involved in a fair number of projects through
that time. And sure every project has its unique things that make it challenging,
but there's a lot that happens that you see over and over again. And so if you can
at least streamline those over and over again, occurrences, then you can use the other
resources, like reaching out to Aether or reaching out to a group or something as
a way of trying to address the things that are like really specific to your problem.
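For readers curious what a lifecycle-oriented checklist might look like as a concrete artifact, here is a purely illustrative sketch. The prompts paraphrase themes from this conversation; they are not taken from Microsoft's actual AI Fairness Checklist.

```python
# Illustrative only: a tiny structure of lifecycle-stage reflection prompts.
# The questions paraphrase themes from this episode; they are NOT the contents
# of Microsoft's AI Fairness Checklist.

CHECKLIST = {
    "define task": [
        "Which stakeholders were consulted in the design phase?",
        "Who could be excluded or harmed if the system underperforms for them?",
    ],
    "collect and annotate data": [
        "Is the data representative of the dialects, sociolects, and groups the system will serve?",
        "Was the data collected and labeled in an ethical way?",
    ],
    "choose and train model": [
        "Is this a high-stakes setting that calls for a directly interpretable model?",
    ],
    "deploy and monitor": [
        "How will disparities be detected after release, and who is accountable for responding?",
    ],
}

def review(stage: str) -> None:
    """Print the reflection prompts for one lifecycle stage."""
    for question in CHECKLIST.get(stage, []):
        print(f"[{stage}] {question}")

review("collect and annotate data")
```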
Cindy M. 16:38
Yeah, got it. Okay. Well, let's switch gears here for a minute and talk a little bit
about machine learning specifically. And in the audience for this conversation,
not everyone is going to be a data scientist or an engineer, and they may not even
understand what machine learning is. So can you just, in a few words,
explain it very simply: what actually is machine learning?
Hal Daumé 17:00
Sure, I think of it as programming by example. So you want to write a program, but you
don't know how to write it. But you can come up with examples of desired inputs
and their corresponding outputs. So you take a lot of those, and you feed them into
some machine learning algorithm, and it essentially writes your program for you. Now,
garbage in, garbage out is a thing. And so, you know, if you don't give it good examples,
it's gonna learn something not so good. But yeah, it's really about, you know,
trying to develop tools by providing examples of desired behavior rather than, you
know, coding the behavior yourself.
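Here is a minimal sketch of "programming by example" in that sense, assuming a generic scikit-learn workflow and a toy rule chosen only for illustration: the rule itself is never written down; it is inferred from labeled examples.

```python
# "Programming by example": we never code the rule (here, "is the sum of the two
# inputs positive?"); we only supply example inputs and desired outputs, and the
# learning algorithm fits a program that approximates the mapping.
from sklearn.tree import DecisionTreeClassifier

examples = [  # (input pair, desired output)
    ([2, 3], 1), ([-1, -4], 0), ([5, -1], 1), ([-3, 1], 0),
    ([0, 7], 1), ([-2, -2], 0), ([4, 4], 1), ([1, -5], 0),
]
X = [inputs for inputs, _ in examples]
y = [label for _, label in examples]

model = DecisionTreeClassifier().fit(X, y)   # the "program" gets written here
print(model.predict([[10, -3], [-6, 2]]))    # apply it to unseen inputs
```

Feed the same algorithm unrepresentative or mislabeled examples and, as Hal says, garbage in, garbage out: the learned "program" faithfully reproduces those flaws.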
Cindy M. 17:39
Okay. So an algorithm actually, like you said, writes it, the machine writes
it; you just tell it the inputs and the outputs, and you don't do the coding.
So then, with that explanation, can you help us understand how the
responsible AI principles play out in the machine learning kind
of lifecycle? I'm sure there's a process for that. So how does all that work
together?
Hal Daumé 18:06
Yeah. So there are lots of ways of thinking about the machine learning lifecycle. I
mean, we have one that we reuse a lot. So, you know, I sort of said it in an easy
way, right? Like, oh, all you do is provide examples, and then, like, magic happens,
Cindy M. 18:20
Yeah, sounds like "I can just go do that."
Hal Daumé 18:22
You could right? And then, you know, you run the risk of terrible things happening
in the world,
Cindy M. 18:28
Right.
Hal Daumé 18:30
So yeah, generally we think about this in sort of stages. The first
is, you know, just like in any engineering project, you have to define a task that
you want to solve. This often involves collecting input from various stakeholders,
right? This is basically a design problem.
Cindy M. 18:49
Yeah
Hal Daumé 18:50
Like, anything you might do in design can go into defining the task, right?
So, going back to the principles, one of the principles is inclusiveness.
And so, you know, if you want your technology or your systems to work for a broad range
of people, it might be worth getting their input at sort of the design phase.
Cindy M. 19:11
Yeah, got it.
Hal Daumé 19:12
So then, okay, you need to get these input and output examples from somewhere, right? Now
is where you start collecting and possibly annotating data. So the standard way people
think of sort of responsible AI stuff coming in at the
collecting and annotating stage is basically making sure that your data is
as representative of how you want your system to behave as possible. So there was,
like, an old-now story about, I can't remember which car manufacturer it was, but
they had a voice recognition system in their car, and it worked really well when
men would talk to it and not so well when women would talk to it. Right, and why? Well,
it's because apparently a lot of this data was collected in
computer science labs; they had people in the labs record themselves. Turns out lots
of people in computer science labs are men. And so they ended up collecting a lot
more data from men than from women, and so the system learned to do a better job on
men than on women. That's what it was shown how to do.
Cindy M. 20:24
Right?
Hal Daumé 20:26
So the collecting and annotating of the data really drives a lot of
how well the system is going to generalize across populations. And so you often see
people pointing at the data as if the data is sort of the main or only cause of potential
issues.
Cindy M. 20:49
Sure. Yeah
Hal Daumé 20:49
And certainly it has a role to play. Like, in this example of data collected
in computer science labs, the data is that way because of social conditions that lead
to a world in which there are more men than women in computer science labs, right?
So, like, the data is a reflection of something in the world.
Cindy M. 21:10
Right.
Hal Daumé 21:11
Or, I guess, another example of this: there was a high-profile case of, you know, a big
tech company that built a system to do, like, automated resume filtering, and it basically
filtered out all women.
Cindy M. 21:22
Yeah, right.
Hal Daumé 21:23
And this, again, is the same sort of thing. It was trained on data that
was biased by historical hiring processes, and so it's kind of no surprise that
it's going to replicate the biases in that historical hiring process.
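One hedged sketch of the kind of early check that can catch this sort of skew: before any model is trained, count how the examples are distributed across groups. The records and the speaker_gender field below are hypothetical, not from any real dataset.

```python
# Hypothetical audit of group representation in a training set, of the kind that
# might have flagged the voice-recognition data skew early. The records and the
# "speaker_gender" field are made up for illustration.
from collections import Counter

training_records = [
    {"audio_id": 1, "speaker_gender": "male"},
    {"audio_id": 2, "speaker_gender": "male"},
    {"audio_id": 3, "speaker_gender": "male"},
    {"audio_id": 4, "speaker_gender": "female"},
    # ... in a real dataset, thousands more records ...
]

counts = Counter(record["speaker_gender"] for record in training_records)
total = sum(counts.values())
for group, n in counts.items():
    print(f"{group}: {n} examples ({n / total:.0%})")
```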
Cindy M. 21:39
Mm hmm. I was just gonna say, I mean, it really does put a fine point on, yes, this may
be an engineering process and an engineering project, but it isn't just engineers,
again, who need to be involved in that. Because, to your point, thinking back, just
the data isn't always going to be the answer; you've got to look broader than that.
Think about where you go to get the data set, and others who aren't necessarily
engineers can help. You know, two minds that are different, a diverse team, will be
better than one in thinking about where we need to go to get the right data set, so
that we don't end up with, you know, cars that only understand men and not women as
well, right?
Hal Daumé 22:15
Yeah. So, okay, we have our task, we have our data. So now we find a machine learning
model, which is basically going to be, like, what is the structure
of the thing that's going to map from the inputs to the outputs?
Cindy M. 22:27
Got it.
Hal Daumé 22:27
I think maybe in terms of the responsible AI principles, the biggest thing that comes
up here is really around transparency. A lot of people are using deep neural nets
to do everything these days, because they're pretty effective at a lot of stuff when
you have lots of data and lots of compute. But it's really hard to understand what
they're doing. You know, it's basically this giant sequence of matrix multiplications
that goes on for, like, ages. And, you know, if your system makes some bad decision at
the end, it's really hard to say what went wrong in the middle.
Cindy M. 23:00
Yeah, Yeah.
Hal Daumé 23:01
And so there's some work in research on trying to make deep neural networks more
explainable. But if you go talk to Cynthia Rudin, who is a professor who's
done a bunch of really good work in this space, her attitude is basically, if you're
building a model for a high-stakes setting, you should be using something that you
can understand directly. So, like, decision trees or decision lists are the sorts
of things that she's often advocating for.
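To make that contrast concrete, here is a small sketch of a directly interpretable model whose learned rules can be printed and read, assuming scikit-learn and made-up data; it is not Rudin's specific methodology, just an example of the class of models she advocates.

```python
# A directly interpretable model: a small decision tree whose learned rules can
# be printed and audited, in contrast to a deep network's long stack of matrix
# multiplications. Toy, made-up loan-style data, purely for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[25, 20_000], [40, 60_000], [35, 30_000], [50, 90_000],
     [23, 15_000], [45, 80_000], [30, 40_000], [60, 70_000]]
y = [0, 1, 0, 1, 0, 1, 1, 1]  # hypothetical approve/deny labels

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["age", "income"]))  # human-readable rules
```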
Cindy M. 23:30
Interesting, yeah.
Hal Daumé 23:31
So there's big choices there about like, how important is it that this thing is understandable,
Cindy M. 23:36
Right
Hal Daumé 23:37
to people who are looking at it?
Cindy M. 23:38
Yeah.
Hal Daumé 23:40
Okay, so then you train the model on the data that you have, and you'll then do some,
like, testing, usually. And then some sort of deployment of the system, and then either
sort of direct feedback, right, like some user complains that your system doesn't
work, or maybe the New York Times writes an article that your system is terrible,
or more implicit things, right? So, like, lots of systems these days collect click-throughs,
or reading times, or stuff like that. So there's direct
and more implicit feedback.
Cindy M. 24:09
Right, right. Yeah. And then adjusting and monitoring and tinkering as
you go along the way. So let's come back to what we touched on at the beginning
of our conversation. I'm going to ask you a question about how to maybe avoid the
"Oh, no, we should have done it differently." You know, we're never going to be perfect,
but I wonder if there are some economic incentives, perhaps, for companies to get it
right, or mostly right, at the beginning, before it's released. And when I say
that, I'm thinking about some studies that I've seen and that you and I have talked about
before, about accessibility, and how, you know, in that context, when problems were
found after things had been rolled out, I think it was almost double the cost of the
project to get it right, but it was exponentially lower, like less than 5% of the
cost, if you just do it right at the beginning. Do you think the same thing applies
here? Is there something to be learned in this machine learning lifecycle from this
accessibility example that would be an economic incentive to companies?
Hal Daumé 25:17
Yeah, I mean, I really want someone to do that study. I definitely suspect that this
is true. Like, my experience is that the more you can think about at the level of designing
the task, even, and the initial data collection, the easier it is to move forward.
Cindy M. 25:40
Yeah
Hal Daumé 25:40
This is hard, because, you know, I think a lot of times we have this sort of attitude
of, you know, like, build a minimum viable product, and
Cindy M. 25:48
Right the MVP.
Hal Daumé 25:50
Yeah. And that's not necessarily an avenue that leads to the most reflection. But
at the same time, what you often see in practice is that when things go
wrong, it's so hard to fix them from the beginning that people tend to apply
sort of, like, band-aid solutions.
Cindy M. 26:22
Yeah, yeah.
Hal Daumé 26:23
So this also came up in this study we did a couple of years ago, where you basically
see this, and there are a couple of reasons for it. Probably the most famous example
that I know of was the case where there was an image captioning or image
categorization tool that labeled a photo of a bunch of African American people as
gorillas. And I mean, this made pretty big press, and, you know, it was obviously offensive,
especially because of various historic prejudices around exactly this
topic.
Cindy M. 27:03
Yep.
Hal Daumé 27:04
And the solution that was employed at the time was just to prevent the model from
outputting "gorilla" on any image, which certainly fixes this problem. But, you
know, it doesn't get at the root of the problem,
Cindy M. 27:22
Right.
Hal Daumé 27:23
and it also prevents you from being able to use this thing on you know, photos you
take at the zoo, or something where you're actually taking a photo of a gorilla.
Cindy M. 27:29
Yeah.
Hal Daumé 27:31
And so you see this a lot. And part of it is because like, you need a quick fix, right?
Like that part's totally understandable.
Cindy M. 27:37
Oh, yeah.
Hal Daumé 27:38
And the second is, one thing that's complicated about machine learning systems is
that, you know, if you tweak part of the model over here, it's often very unclear how it's
going to affect things over there. And so there's a fear that if I actually try to
address this at, like, a root-cause level, it's going to break a bunch of other stuff
that is hard for me to detect. And so I think this is also why it's really important
to try to spend a lot of time at the beginning in terms of defining the task
and collecting data and stuff like that, because it prevents you from getting into
this position where you're terrified of changing anything, because you don't know what
it's going to break.
Cindy M. 28:23
Right, yeah. So it's almost thinking about it in a way that, to go fast, y'all, we
almost need to slow down, right? A little bit at the beginning. Because that will
allow us to go further, faster, than just an MVP that may get you, you know, 10 yards
down the field, if you will, but not 100 yards. And if you want to, you know, get that
touchdown and make it sustainable wins, then the more time you spend upfront thinking
about it, with a diverse set of individuals, could actually help you go further,
faster. I think so. And there's a real tension there, though, because with technology,
you know, companies, you've got to go fast. So yeah, you
Hal Daumé 28:29
Got to go fast, yeah. I think, I mean, like you were saying at the beginning, the
economic incentive, right? So yes, the accessibility study, that's, like, you know,
designing accessibility from the start is like one or two percent of the cost. And,
you know, doing it at the end, I can't remember the exact number, but I do remember
that it often ends up being, like, two to three times as many lines of code to try to
tack accessibility features on to a system that already exists.
Cindy M. 29:31
Yeah.
Hal Daumé 29:32
And, I don't know, you know, I hate prognosticating because, like, I'm probably
going to be wrong, but I'm pretty sure something similar holds for AI, right?
So, like, even if all you care about is sort of the bottom line.
Cindy M. 29:47
Yeah.
Hal Daumé 29:48
You know, I think there's an argument that it's much better to address these things
up front rather than after the fact.
Cindy M. 29:54
Yeah, a great research project there, right, for somebody. That's right.
So I thought of another sort of outside-the-tech-industry example, and I wonder
what you think about the viability of using it within the tech industry.
So, you know, again, you're talking about getting at a root cause, what causes mistakes,
you could almost say what causes accidents to happen, which kind of brings to my mind
the transportation industry, which obviously is very highly regulated.
So that's one difference right there. But, you know, when a plane crashes,
or there's a bad crash on the highways, first, for airplanes,
they recover the black box, and they do a root-cause analysis and a big investigation
into it. And it's broadly shared, right, so the industry as a whole
can learn from that and hopefully not have, you know, root-cause mistakes happen
again. Is there anything like that that the industry shares on the tech side in terms of
root causes of problems, or would that be beneficial? Or how would that even work
in an industry that's not highly regulated like transportation is?
Hal Daumé 31:08
Yeah, I think there are a couple of answers. So, I mean, I'm reminded most of Ben
Shneiderman, who had this article maybe a year ago or something, where
he was drawing this parallel between sort of algorithmic reliability and aviation
reliability. And he talks about three things that aviation monitoring does. So there's
sort of continual monitoring of systems,
Cindy M. 31:40
Right
Hal Daumé 31:41
Or you could think about, you know, the FDA constantly checking to see if there's,
like, salmonella in your spinach.
Cindy M. 31:46
Right, yeah, that's true. Right. Yeah. Food safety.
Hal Daumé 31:49
And that's sort of proactive, right? And then on the other side, there's
what you're saying: basically, when things go wrong, when a plane crashes, it's pretty
obvious that something went wrong. It's actually a lot less clear for a lot of these
systems, right? So, you know, I gave the example of speech recognition, I gave the example
of image captioning. There's, you know,
Cindy M. 32:14
Facial recognition
Hal Daumé 32:15
tons more examples, like facial recognition not working for people with dark skin,
or CEO image search results returning all, you know, mid-40s white men photos,
right,
Cindy M. 32:26
Right.
Hal Daumé 32:26
And all of these have been uncovered, either by investigative journalists or by researcher
as, as far as I know, I qualify that slightly, but majority of them are found this
way.
Cindy M. 32:40
Right.
Hal Daumé 32:41
And so I think there's first a detection problem that's maybe more analogous to the FDA
case, where you have to detect the salmonella; it's not like it jumps out and says,
"Hi, I'm salmonella."
Cindy M. 32:54
That's right.
Hal Daumé 32:55
Yeah. And so I think where the field is right now is, we've spent a lot of time thinking
about monitoring, at least self-monitoring; we haven't thought much about, or we haven't
made much progress on, sort of third-party monitoring, like we might have
in a regulatory system.
Cindy M. 33:14
Right
Hal Daumé 33:14
I don't think we're actually that far along on thinking about, you know, sort of the equivalent
of the airplane black box. When things go wrong, we don't really have that many tools
at our disposal to try to understand why. So there's been, I hinted at this before,
but there's been a bunch of work on sort of explaining machine learning systems' behavior.
And a lot of this is with the aim of debugging, which is essentially, you know, if
something goes wrong, we'd like to debug what went wrong. There's also been a
fair amount of work on trying to ascribe errors to aspects of the data on which the
system was trained. So you might want to say something like, okay, this system
made this error because, you know, these five training examples led it to
think X, Y, Z. And so then you can ask questions like, should I remove those training
examples? Are they incorrect? Should I get more? Stuff like that. But
to be totally honest, we don't really have good ways of doing this. And I think
this is one of the reasons why you see a lot of calls for regulation of various parts
of, like, the automated decision-making industry; you see a lot of, sort of, facial
recognition bans in various states and countries. I think New York is implementing
something about using automated decision making for hiring. For a lot of these things,
there's a big gap, because in order to regulate something, you have to be able
to measure it. And
Cindy M. 35:04
Right
Hal Daumé 35:04
to be able to measure it, you have to be able to audit it, and we're just not there.
Cindy M. 35:08
Not yet.
Hal Daumé 35:09
And so, while I'm very sympathetic to a lot of the calls for regulation, like,
I think this makes a lot of sense, it's hard to imagine exactly what
sort of regulations would actually move the needle there. That's my personal
opinion; I'm definitely not speaking for Microsoft when I say that.
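As a very crude stand-in for the error-attribution tools Hal says we lack, the sketch below just retrieves the training examples nearest to a misclassified input as candidates for manual inspection. The data is made up, and this is only a heuristic starting point, not an established auditing method.

```python
# Crude heuristic for asking "which training examples might have led the model to
# this mistake?": find the training points nearest to the misclassified input and
# inspect them by hand. Real error attribution is far more involved; the data
# here is made up purely for illustration.
import numpy as np
from sklearn.neighbors import NearestNeighbors

X_train = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9], [0.5, 0.5]])
y_train = np.array([0, 1, 0, 1, 1])  # hypothetical labels, possibly noisy

misclassified_input = np.array([[0.45, 0.4]])  # a point the deployed model got wrong

nn = NearestNeighbors(n_neighbors=3).fit(X_train)
distances, indices = nn.kneighbors(misclassified_input)
for dist, idx in zip(distances[0], indices[0]):
    print(f"training example {idx} (label={y_train[idx]}), distance={dist:.2f}")
```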
Cindy M. 35:32
So, Hal, you're teaching at the University of Maryland, I think, a new class this
semester about machine learning and ethics. You know, what are the kind of one
or two big takeaways that you've gotten? As we're recording this, we're
near the end of the semester, so I'm curious how that course has gone and what you've
kind of learned from this first semester.
Hal Daumé 35:56
I guess the first thing I learned is, I think students were really happy to be back
in person.
Cindy M. 36:01
Yeah, I know.
Hal Daumé 36:02
I think, I mean, everyone was, like, super engaged, and it was a lot of fun to teach
this class. I think there were a couple of things. I mean, the class was weird
in the sense that it was a computer science class where probably half of the stuff
we read was political philosophy, which a lot of students found challenging. We also
had a bunch of students from other departments, from philosophy, or sociology, or psychology,
and the information school. So that was helpful. You know, I think, to me, the
biggest thing is that there's this gap between the way that, you know, a lot of sort
of ethics philosophy thinks about problems and what actually happens on the ground.
Cindy M. 36:49
Yeah
Hal Daumé 36:49
I don't want to pick on the trolley problem too much; ethicists love to talk
about the trolley problem, but that's just not the problem that we actually face. I guess
one of the places where we've found the most leverage, in terms of actually really
connecting sort of ethical principles with things on the ground, is the literature
from science and technology studies. So this is a literature, at least the part of
it that we're looking at, that is basically technical critique. So people like Anna
Lauren Hoffman, for instance, have done a bunch of amazing work in this space, you
know, sort of looking at the social implications of a lot of technology.
Cindy M. 37:29
Yeah.
Hal Daumé 37:30
And this is what we need; like, we need that. There was this joke we had when I was in
grad school. So I mentioned that, like, I didn't start as a machine learning person,
I started doing language. And we kind of joked about machine learning that, like, in
machine learning, it's like God gives you a matrix and your job is to do something
with this matrix. And really, what this means to me now is that the starting place
for a lot of machine learning work is "I have some data, but it's been completely decontextualized."
And so we've moved ourselves from thinking about, like, sociotechnical problems
and how systems interact with society.
Cindy M. 38:20
Right
Hal Daumé 38:20
To purely technical problems. And that abstraction has been really useful in pushing
a lot of machine learning research forward, but what we're seeing is that if you only
use that abstraction, once the thing you build hits the real world,
Cindy M. 38:35
Right
Hal Daumé 38:36
It's like bad things happen.
Cindy M. 38:38
Right, right. So it's bringing that abstraction back to reality with some practical
critiques of systems, so as to fill that gap between just talking about ethics philosophy,
if you will, which doesn't end up being very practical and useful at times, and how
do I get out of this abstract, you know, problem of machine learning where I think
I've fixed it; there's a huge gap between those two, right? Yeah, so filling that
with critiques that are relevant and practical, I think, is a great way to go. Well,
congratulations to you on teaching that course for the first time. Thank you for your
time with us today, Hal. This has just been a wonderful conversation. Thank you for
all the work you're doing at Microsoft as well. I always like to ask my guests one
last question before they go. In addition to all the resources you've mentioned already,
is there anything else, if a student or somebody else who's listening, or an executive,
wants to learn more about this topic, that you could recommend to them, either in
terms of a book or maybe a documentary or a podcast series?
Hal Daumé 39:49
Oh, it's so hard to just pick one.
Cindy M. 39:51
Well, you don't have to you can give two or three.
Hal Daumé 39:54
I mean, I think, you know, if I had to say one place to go, I would check out a lot
of the reports put out by the AI Now Institute. They've done a lot of really
great work in this sort of sociotechnical space of understanding
the real-world impact of AI systems.
Cindy M. 40:18
Okay.
Hal Daumé 40:19
Um, and they have a number of, you know, relatively short documents on a wide range
of topics that are really good.
Cindy M. 40:29
Good.
Hal Daumé 40:30
You know, I also really liked the movie Coded Bias. This is largely about Joy
Buolamwini and her efforts to rein in facial recognition technology. And it's
not just about her; it also has sort of vignettes from, like, almost, you know, sort
of the who's who in this space. So, yeah, anyone featured in that movie is
a good person to go check out, and their books too.
Cindy M. 40:53
And that's a great one. I've watched that one. It's very engaging. It's a documentary,
and she's a student at MIT. It's great.
Hal Daumé 41:01
Yeah. And then maybe the third thing I would say is I really liked Ruha Benjamin's
book Race After Technology. It's a pretty accessible read. It's very recent,
like in the past year. I think she brings a bunch of interesting perspectives
that are sort of easy to understand once they've been written down,
but also sort of deep in their implications.
Cindy M. 41:32
Yeah, yeah. Oh, those are great recommendations, Hal. Thank you. And thank you so
much again for your time today. This has been just a fascinating conversation. Thank
you so much. Appreciate it.
Hal Daumé 41:43
Yeah, thank you, Cindy. My pleasure.
Cindy M. 41:44
All right. Talk to you later. Bye bye.
Hal Daumé 41:46
Bye
Cindy M. 41:51
Thanks for listening to today's episode of the BIS, the Business Integrity School.
You can find us on YouTube, Google, SoundCloud, iTunes, or wherever you find your podcasts.
Be sure to subscribe and rate us, and you can find us by searching theBIS. That's
one word, theBIS, which stands for the Business Integrity School. Tune in next time
for more practical tips from a pro.