Lukas Biewald, former founder and CEO of Figure Eight and current founder and CEO of Weights & Biases, sits down with Cindy Moehring to discuss his experience and efforts to better the future of AI and machine learning. The pair discusses Figure Eight's improvements to training data and human-in-the-loop systems, along with Weights & Biases' advances in versioning models to help teams build models faster and better. They round out their discussion with a brief talk about inclusion and women in STEM.
Podcast
Resources From the Episode
- Read more about Appen acquiring Figure Eight
- Weights & Biases raises $135M Series C to keep building MLOps software
- Learn more about Lukas Biewald’s podcast, Gradient Dissent
- Listen to “Daphne Koller - Digital Biology and the Next Epoch of Science” on the Gradient Dissent podcast
- Listen to “Emily M. Bender - Language Models and Linguistics” on the Gradient Dissent podcast
- Listen to the Lex Fridman podcast
- Take free courses at fast.ai
Episode Transcript
Cindy Moehring 0:03
Hi, everyone. I'm Cindy Moehring, the founder and executive chair of the Business Integrity Leadership Initiative at the Sam M. Walton College of Business, and this is The BIS, the Business Integrity School podcast. Here we talk about applying ethics, integrity, and courageous leadership in business, education, and, most importantly, your life. I've had nearly 30 years of real-world experience as a senior executive, so if you're looking for practical tips from a business pro who's been there, then this is the podcast for you. Welcome. Let's get started.
Hi, everybody, and welcome back to another episode of The BIS, the Business Integrity School. We're in season five, and we're talking about all things tech ethics and responsible AI and all of those related topics. And we are super lucky to have with us today an entrepreneur, Lukas Biewald. Hi, Lukas, how are you? Good, nice to see you. Nice to see you too. You all are going to love hearing Lukas's story; I have a feeling he's living the dream of many of us, so we're going to jump right into the discussion. Lukas is the founder and CEO of not just one but two companies. He first founded, and is the former CEO of, a company known as Figure Eight, and a couple of years ago he sold that company for $300 million to another organization known as Appen. Congratulations on that. While you'd think he might go off into the wonderful world and live his life, that's not the case: he turned around and founded yet another company, known as Weights & Biases, and that is where he continues to work today as the founder and CEO. So like I said, Lukas, congratulations. First of all, that's an incredible accomplishment. So how did you do this? How did you end up where you are today, and what got you interested in being an entrepreneur and starting all these companies?
Lukas Biewald 2:12
I always loved artificial intelligence, ever since I was a little kid. I was just enamored with the idea that you could teach computers to do things, and I went to Stanford because I really wanted to study it. And at that time, you know, people were like, AI is kind of all hype, it doesn't work. They talked about the AI winter: in the '80s, people really thought, wow, AI is so powerful, computers are going to do all the things that humans can do and they're going to replace us, kind of like how people talk now. But at the time it was hard to find really good applications where AI was worthwhile. There were a few early successes, and then the technology kind of got stuck, so you didn't see a lot of really meaningful applications for a while. When I was in school, it was in the middle of that. Nobody wanted to go into AI; machine learning was a little bit of a backwater. I don't think people would have even known what that meant. A lot of the professors were in AI, but not doing machine learning; they were doing older-style AI, which is more logic and symbolic manipulation, and machine learning was coming more out of the stats departments. But I just loved the idea that you could teach computers to do stuff; it's always really captured my imagination. I feel like, in the long term, I don't see why we won't be able to teach computers to do everything that humans can do, and I think at that point we enter a totally different world. Maybe not in my lifetime, maybe in my daughter's lifetime, but at some point I think we'll figure that out, and I think it really has the opportunity to make a really amazing impact on everyone.

And I felt like when I was graduating school, I was actually really excited to just get a job. I wasn't one of those people who was like, I have to start a company. I was really proud to have, you know, a corporate job where people cared about my skills; I felt really good. I noticed, though, that the only companies at the time that really wanted to hire anyone doing machine learning were companies doing Wall Street-style money optimization, which felt a little empty to me. I'm not really opposed to it, and I actually think it's kind of intellectually interesting, but it wasn't my dream to go to Wall Street and help a hedge fund get a slightly higher return; for a whole career, it wouldn't sustain me personally.
And then the other thing that was happening was that companies like Google were doing search optimization, trying to show you better results, and at that time they were really important. The reason that was a good early application of machine learning was that there's a lot of training data created implicitly. People look up stuff, and you can use the clicks to teach the computers; and when somebody hyperlinks to something, that's actually a signal. When you link to something and say what it is, you're sort of implicitly defining what would be good search terms to find that piece of content. So it was fun to work on. But I really felt like the thing that was holding machine learning back at the time was a lack of training data. Training data is the examples that you show the computer, the ML system, so that it can learn to generalize. At the time, all the applications were ones where you were sort of getting the training data for free, but I felt like there were a lot of other applications where you actually have to label it yourself, right? So if you want to do voice recognition, you need someone listening to speech and typing into the computer what that person said. If you want to make a self-driving car, you need to take the pictures the car is seeing and label every pixel in each image with what it is. That's actually the only way that computers learn to do things, and they require a lot more training data than humans do, so you see this big expense. And I just really wanted to make machine learning actually work, and I felt like there was this problem in the market, that people couldn't get the training data that they wanted. I really knew nothing about how business works, but what I did know is that there was a problem that I wanted to solve. It was a lot harder than I expected going in, but we built a pretty big company over time, collecting this data for many companies.
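The "implicit training data" idea Lukas describes, where clicks stand in for hand-made labels, can be sketched as a toy relevance estimator. The function name and log format here are illustrative assumptions, not anything Google actually uses:

```python
from collections import defaultdict

def click_relevance(click_log):
    """Turn implicit feedback into training signal: for each
    (query, document) pair, score relevance as the fraction of
    impressions of that document that were clicked."""
    impressions = defaultdict(int)  # times (query, doc) was shown
    clicks = defaultdict(int)       # times it was clicked
    for query, doc, clicked in click_log:
        impressions[(query, doc)] += 1
        if clicked:
            clicks[(query, doc)] += 1
    return {pair: clicks[pair] / impressions[pair] for pair in impressions}
```

The resulting scores could then feed a ranking model, which is the sense in which the labels come "for free": no one ever sat down to label the documents.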
Cindy Moehring 6:56
Yeah, the training data. So it was a problem you not only wanted to solve, it was a problem that needed to be solved. You may not have realized the magnitude of the needing-to-be-solved part of that equation when you started out, but that's almost like a match made in heaven. Tell us a little bit more about Figure Eight. I've heard your first company described as a human-in-the-loop machine learning and AI company; in fact, just so the audience knows, it was included in Forbes' list of 100 companies leading the way in AI in 2018. What does it really mean when you say human-in-the-loop machine learning and AI?
Lukas Biewald 7:35
It's the idea that most machine learning systems in the real world don't work in a mode where the machine learning always makes the decision and people just blindly follow it. The way machine learning really gets deployed, most of the time, is that the computer makes a guess, and if the computer is not confident, it gets sent to a human. Getting that right is actually really critical. And the training data piece is really interesting here: the cases where the computer is struggling, or where it's confused and actually asks a human to help, are the perfect examples to feed back into the system to make it get better over time. The reality is, for a business, a 60%-confident process isn't going to be that useful for most things a business wants to do. With Wall Street, if you can pick stocks right 60% of the time, you can make a lot of money, and with search results, we don't expect search to always give us exactly what we're thinking of. But for most real business processes, you do need a really high level of accuracy, because otherwise something's going to break. The thing about 60% accuracy, though, is that if you know which 60% of cases you're going to be accurate on, then a business might see that as a 60% cost savings, right? It avoids some other process for the 60% of the time that the machine learning is able to do it automatically. So automating 60% of a process with 100% accuracy is really useful; automating 100% of the process, where it breaks, you know...
Cindy Moehring 9:10
Right, 40% of the time. Right, right. Yeah, exactly.
Lukas Biewald 9:13
And so that's why the vast majority of our customers ended up using Figure Eight in this way, where they would build this human-in-the-loop system. We would help them set things up so that you build an ML model that knows its confidence, the low-confidence results get sent to a human who labels them, and then that gets fed back into the system, so your process improves over time.
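The confidence-gated loop Lukas describes can be sketched in a few lines: the model guesses, low-confidence guesses are routed to a human, and the human's answers are queued up as fresh training data. The function names, the threshold value, and the `model`/`ask_human` callables are illustrative assumptions, not Figure Eight's actual API:

```python
def human_in_the_loop(examples, model, ask_human, threshold=0.9):
    """Route each example: automate confident predictions, send the
    rest to a human, and collect the human answers for retraining."""
    results = []
    retraining_queue = []  # the confusing cases: ideal new training data
    for x in examples:
        label, confidence = model(x)
        if confidence >= threshold:
            results.append(label)       # confident: fully automated
        else:
            human_label = ask_human(x)  # not confident: ask a person
            results.append(human_label)
            retraining_queue.append((x, human_label))
    return results, retraining_queue
```

With a high threshold, the pipeline automates only the easy cases and routes the rest, which is the "automate 60% of the process with 100% accuracy" trade-off from a moment ago.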
Cindy Moehring 9:38
Yeah, yeah, it really does. And then you finally get it up to a confidence level of, I don't know, 95, 96%. Each company has to decide for themselves what's the risk embedded in that particular process, and what level of accuracy they need before they're comfortable rolling it out, and then assume the risk for the small percentage of times the machine may get it wrong, which, again, I think will vary. So one of the pieces of software that I know Figure Eight had included a dash-cam application, which was useful in self-driving cars. Tell us a little bit about that dash-cam application. How did that work?
Lukas Biewald 10:20
Nowadays a lot of cars ship with a little bit of automation in the driving. Yep. And the way that this typically works, and I'm really oversimplifying, is that there's a camera in the car, and the camera is looking out at the world, trying to decide, okay, what's out there? It wants to know, where's the road, and where's the road going? It's also important to know, are there humans here? Because humans are different than a tree: humans will move, right? You'd also much rather crash into a tree than a human, though ideally you don't crash into anything. And so really the critical step, or a critical step, is knowing what's in front of you from a camera. If you haven't thought about it, you might think, wow, computers can do so many amazing things, why is this one so hard? And I think the reality is that humans are actually really good at this. Our eyes have evolved, since we were fish, to navigate the world that we're in, and a lot of what our brain is doing is actually taking the photons that enter and figuring out what's out there in the world. So it's just a really hard task. When you think about what makes a person a person, there are so many different ways it could show up: there are different lighting conditions, or you could see the reflection of a person in a mirror, and how do you know that's not actually a person that you might crash into? So it's a super subtle, difficult task. And really, what we did was make it efficient to label every pixel, because this is one of those tasks where, if you had to click on every pixel and say, okay, this pixel is part of a person, it would take you forever. So you want to make good guesses, and you want to build tools that help the human labeler make those labels faster, so you're doing more correction of guesses that the system is making versus labeling things from scratch. And then as you label a new thing, the automated system can make new guesses: okay, if you think that's a person, probably those pixels around it are also part of the person. And then you iterate until you get to a point where all the pixels are labeled accurately.
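The "correct the system's guesses" labeling flow can be sketched with the simplest possible guessing rule, a flood-fill that grows the clicked pixel's label outward to similar neighbors. Real labeling tools use much smarter models; this toy version, with hypothetical names and a made-up tolerance, just shows the shape of the idea:

```python
from collections import deque

def grow_label(image, seed, label, labels, tol=10):
    """From one pixel the labeler clicked (`seed`), propagate `label`
    to 4-connected neighbors whose intensity is within `tol` of the
    seed. The labeler then only corrects the mistakes, instead of
    clicking every pixel from scratch."""
    h, w = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        if not (0 <= r < h and 0 <= c < w):
            continue                       # off the image
        if labels[r][c] is not None:
            continue                       # already labeled
        if abs(image[r][c] - seed_val) > tol:
            continue                       # looks too different: don't guess
        labels[r][c] = label
        queue.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return labels
```

One click now labels a whole region, and the labeler's remaining work is fixing the boundary pixels the guess got wrong.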
Cindy Moehring 12:31
Do you think that did enhance the safety of self-driving cars, or the further automation of cars, even if a human is behind the steering wheel?
Lukas Biewald 12:40
You know, self-driving cars over time will probably be safer than human drivers. They're not now, obviously, but I think over time they will be. What we know that we did is we helped make those self-driving systems perform better, and there's a chance that might have made things less safe if you just willy-nilly started deploying. But what I do know is that our customers cared a lot about safety themselves, so they were really eager to prove out the performance and get it over a threshold where they felt like they could use it in different scenarios. And obviously, the first thing that you do is use it to augment a human driver that's sitting at the wheel. This is almost like a human-in-the-loop system, right? You can think of a Tesla, where you have to touch the steering wheel to show that you're still paying attention, and the intention is that you grab the steering wheel if it's doing something wrong. That's a human-in-the-loop system: the autopilot is driving, and the human is intervening when there's...
Cindy Moehring 13:42
At some point, self-driving cars will probably be safer than humans. I don't think that, by and large, humans are there yet in terms of accepting that, probably because of things like the Uber car crash; the small number of examples get really blown up and magnified, and that's all people can think about. What do you think it's going to take, Lukas, to get over that hump of human acceptance?
Lukas Biewald 14:10
I think the bar is naturally higher for a new kind of system, because it makes different kinds of errors, and so we emphasize the places where it makes mistakes. But I don't think that the human level of driving is the peak of what an automated system can do, so I think the automated systems have to be a lot better than human drivers. The amazing thing about computers is that they keep improving; I think what can happen is it's going to keep getting better and better and better, and then over time people accept it. And there could be generational things. It's interesting: my daughter is two, and she talks to our Alexa and doesn't view that as weird at all, you know? So I wonder, maybe people that grow up with this kind of automation might be more comfortable with it.
Cindy Moehring 15:06
Yeah, you might be really right. It could be very generational, just like accepting tech generally has been. Okay, so let's turn to your current company, Weights & Biases. I find that an interesting name for a company. How does Weights & Biases help machine learning teams build better models faster?
Lukas Biewald 15:25
Well, we build developer tools that make developers more efficient. And you might ask, well, what are those? I'll give you one example of something you might not realize you would need. When you write code, you version the code, and you version it for a lot of different reasons. Even if you don't write code and you write text documents instead, you probably version those documents somewhere: you save them and say, okay, this is the latest one, and this is the date, and all that. And you do that because you may need to go back in time and revert some changes that you made. The same thing happens in software, so people save all the stuff that they make. And when you get to bigger and bigger teams, the versioning is really important, because everybody's modifying the codebase at the same time, and it would be really unwieldy if you didn't have systems that looked at how each person was changing the codebase and merged the changes back together in a sane way. It turns out that with models, it's really the computers that are building them, and the version control systems break in a lot of ways, because it's not the humans writing the code, it's computers. If I make a v2 of some software, I might modify a couple of lines of code and change it; computers, every time they do it, do it from scratch. So you really have to keep track of more stuff: you need to keep track of the training data and of what was happening when that model was being built. Really versioning a model, if it's just the output code, is not enough; you just don't really know what happened there. You want a whole track record of all the things that went on. And also, people doing machine learning make a lot more versions, because it's not like a human has to go in and make each new version; the computers are making the versions, so you'll have thousands, millions of versions. Keeping track of all that is really important to doing machine learning safely. So versioning models and versioning machine learning code is something that we do to help teams work better together, because if two people are working on something, they can see what each other is doing. It often stores the valuable IP for companies, because if you have a person who's building a model and they leave, you want someone to come in, see what they were doing, and pick it up. And then there's a compliance issue: what if you put a model in a car and the car crashes, and someone says, hey, why did the car crash? You can't really answer that unless you kept track of which model you put into which car, which might seem obvious. But if you don't systematize it, over time errors will creep in; someone's going to forget to write down which version went into which car. And so we make sure that...
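The model-tracking idea can be sketched as a tiny run log: every model build records its code version, config, and a fingerprint of its training data, so "which model went into the car?" is answerable later. The class and method names here are illustrative assumptions, not Weights & Biases' actual API:

```python
import hashlib
import json
import time

class RunTracker:
    """Minimal lineage tracking: one record per model build."""

    def __init__(self):
        self.runs = {}

    def log_run(self, code_version, config, training_data):
        """Record everything that went into building a model and
        return an id that can travel with the deployed artifact."""
        data_hash = hashlib.sha256(
            json.dumps(training_data, sort_keys=True).encode()
        ).hexdigest()
        run_id = f"run-{len(self.runs) + 1}"
        self.runs[run_id] = {
            "code_version": code_version,  # e.g. a git commit
            "config": config,              # hyperparameters etc.
            "data_hash": data_hash,        # fingerprint of training data
            "created": time.time(),
        }
        return run_id

    def trace(self, run_id):
        """Answer 'what produced this model?' after the fact."""
        return self.runs[run_id]
```

Because the id is generated at build time rather than written down by hand, the "someone forgot to record which version" failure mode is systematized away.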
Cindy Moehring 17:57
As simple as that sounds, you're 100% right. I mean, that's a huge issue for companies when it comes to controlling risk, making sure they're compliant, watching out for their liability, and protecting their reputation overall. If that were to happen, and it's a simple little thing like what you just described, what version went into the car, and they can't answer that, they'll lose trust with all of their customers, and their customers will think, well, what in the world are you doing? You're putting in a system and you don't even know which one, and you can't trace it back. So it's a little thing that may seem obvious, but it could have really big implications. Really big. I understand that the tooling for those developers in the machine learning world really didn't exist, back to your point about machine learning being kind of an afterthought, a backwater people weren't spending time in. Meanwhile, the software world had been growing a lot during the time of the AI winter, if you will, so they had a lot of DevOps in that space where they were doing a lot of this. And I was reading some of your work, and in an article in TechCrunch you mentioned that when software code fails, you said, it crashes, but when machine learning work fails, it can behave badly in more subtle ways. That really piqued my interest, and I was hoping maybe you could explain that a little bit.
Lukas Biewald 19:19
Sure, I'll give you an example. We work with John Deere to help them identify, in fields, which are weeds and which are crops, so that they can decide to spray just the weeds with pesticides. Which is great: it uses fewer pesticides, it saves the farmers money, and it's also better for the environment. And the way you get that training data is you pull a camera over a field of lettuce that has some weeds in it, you take pictures, and you have people label, okay, here are the weeds and here's the lettuce. And so you might think, okay, everything's great, right? And then one day it snows, and you can imagine you just didn't have any examples of snow. I think a human could probably figure it out if they were weeding; also, a human might stop and say, hey, I don't really know what's going on, I can't see the weeds here. It's actually really hard for machine learning systems to adapt to new situations like that. And so what can happen is that, in the snowy condition that was never in your training data, your ML system, rather than stopping and not doing anything, might decide to spray all the lettuce with pesticides. And that would be really bad; that would be heartbreaking, because you'd completely destroy the field. And that's the fundamental danger here, because you can never collect examples of every possible edge case that you might encounter. So yeah, I think that's a fundamental challenge in deploying machine learning systems today.
Cindy Moehring 20:49
Such an interesting point. While machines are able to do things faster, and we can train them to be better, everything that a human brain has learned over the years is something a machine doesn't know if you don't teach it, right? So now you're saying they can behave badly, like ruining an entire crop of lettuce, which nobody would want. And so you have to think about the most likely scenarios your tool is going to encounter and make sure you train for those, and then also think about maybe some of those black-swan events, if you will, kind of the upper left of a risk heat map, and, for sure, all the ones that are going to be upper right, with great likelihood and great impact. So yeah, that's very interesting. Let's switch topics entirely and talk a little bit about diversity, equity, and inclusion in the AI and ML industry, and in tech in general. You know, no surprise, there's been a lot in the press about tech companies historically not really being seen as a place where women can thrive, particularly if they're in engineering, in a STEM field. And you've been a two-time founder, so I'm really interested in knowing what your view of that is, and what you've done to try to make a difference, being aware that that impression, at least, is out there.
Lukas Biewald 22:06
Well, I always feel a little shy answering this question, because my lived experience is as a white man who, you know, went to Stanford. So obviously there are serious inclusion issues in tech that I really want to help with. One simple thing that's actually a real challenge is that if you don't get diversity in your first couple of hires, it can be really hard later. I've talked to women engineers who don't want to be the first woman engineer on a team. So I think when you do get those early hires, you want to appreciate it, make sure that they're comfortable and happy, and make sure that the stuff that you do as a company feels inclusive. I think there are a lot of people that feel like outsiders at a Silicon Valley tech company, and it's a challenging thing. We do a lot of surveying employees and trying to be responsive to what employees are asking for.
Cindy Moehring 22:55
I have to say, despite the fact that you are obviously white, male, and went to Stanford, you clearly have an awareness and an appreciation of the issue. You've used your position to think strategically about how to get more women in; you recognize that if you don't do it in your first couple of hires, you could have an issue, and that's a lot of where it starts. And being a good ally, being a good advocate, being the kind of leader that people would say is inclusive. So yeah, I don't think you need to be nervous; you actually have a lot to share in that space. Are there any particular female STEM leaders that you admire?
Oh, yeah,
Lukas Biewald 23:31
Well, my old advisor, Daphne Koller. I just really admire her. She's kind of a famous person now: she won a MacArthur, she runs a company called insitro, and she started a company called Coursera. Emily Bender is another person who comes to mind; she's thought a lot about AI and ethics.
Cindy Moehring 23:55
Those are great suggestions. I think people, women in particular, if they're interested, can go look up the women that you admire and learn a little bit more about them, and perhaps that might inspire them to go into the field as well.
Lukas Biewald 24:06
Yeah, I mean, I would say even men could stand to learn from these people. They're
Cindy Moehring 24:09
Absolutely, yeah. They can serve as role models, really, for everyone. I always like to leave the audience with some additional resources if they want to learn a little bit more about this topic, or watch or read or listen to something else and enhance their education even further. I want to know where you turn for your best additional learning. Do you have any recommendations for the audience on what else they might listen to, watch, or read to learn a little bit more about this?
Lukas Biewald 24:41
Yeah, totally. Well, I don't want to shill for my own stuff, but we actually do an interview series with people in the space, including the two people that I mentioned. So if you want to watch in-depth interviews with people in machine learning, we have an interview series called Gradient Dissent that's super fun. One place I've been suggesting to people outside of machine learning, that I think is an interesting place to learn about what's going on, is a podcast by a guy named Lex Fridman, who interviews a lot of really high-profile people in the space, and it's super interesting. He really gets deep and philosophical, and it's pretty cool.
Cindy Moehring 25:18
Wow, how interesting.
Lukas Biewald 25:21
Yeah, so I guess those would be my entry points. And my favorite thing would be if somebody wants to learn how to actually do machine learning; there are just so many free resources. I think the best one, I would say, is fast.ai. So if one person listens to this and goes to fast.ai and takes the course, that would make me really happy.
Cindy Moehring 25:40
Okay, and there are free courses on there that you can take?
Lukas Biewald 25:44
Free, and it's by a really smart guy. Really accessible. Yeah, I really recommend it.
Cindy Moehring 25:49
Very cool. All right, well, we're going to leave it there, and I'm sure that at least one person will go to fast.ai. We'll make sure to capture all of your recommendations in the show notes so folks can go deeper. Lukas, thank you so much for your time, and for sharing with us your background, your experience, and what it is that you're doing in this fast-evolving space. Really appreciate it.
Lukas Biewald 26:09
Thank you so much. Okay.
Cindy Moehring 26:10
Bye-bye. Thanks for listening to today's episode of The BIS, the Business Integrity School. You can find us on YouTube, Google, SoundCloud, iTunes, or wherever you find your podcasts. Be sure to subscribe and rate us. You can find us by searching The BIS. That's one word, T-H-E-B-I-S, which stands for the Business Integrity School. Tune in next time for more practical tips from a pro.