Episode 23: Reinforcement Learning – The Other Type of Machine Learning

00:00:00 Dr Genevieve Hayes
Hello and welcome to Value Driven Data Science, brought to you by Genevieve Hayes Consulting. I'm Dr Genevieve Hayes, and today I'm joined by Professor Michael Littman to discuss reinforcement learning, the other type of machine learning. Michael is an award-winning professor of computer science at Brown
00:00:20 Dr Genevieve Hayes
University, specialising in reinforcement learning.
00:00:24 Dr Genevieve Hayes
He is co-creator of the machine learning and reinforcement learning courses offered as part of Georgia Tech's Online Master of Science in Computer Science, or OMSCS, program, and is currently serving as Division Director for Information and Intelligent Systems
00:00:44 Dr Genevieve Hayes
at the US National Science Foundation. He is also the author of the soon-to-be-released Code to Joy:
00:00:51 Dr Genevieve Hayes
Why Everyone Should Learn a Little Programming. Michael, welcome to the show.
00:00:56 Prof Michael Littman
Thank you so much for inviting me.
00:00:59 Dr Genevieve Hayes
In addition to all the things I've just mentioned, Michael, you are also probably the only guest I've ever had on this podcast who has appeared in a music video.
00:01:12 Prof Michael Littman
OK, alright, that's not... keep in mind, this was a music video that was on YouTube.
00:01:18 Prof Michael Littman
So that's not an incredibly high bar, to get
00:01:20 Prof Michael Littman
onto a music video on YouTube.
00:01:22 Dr Genevieve Hayes
Ah, but it's a good video.
00:01:23 Dr Genevieve Hayes
You know, so for anyone who hasn't seen it, Michael and Professor Charles Isbell made a music video several years back where they sing about overfitting
00:01:34 Dr Genevieve Hayes
machine learning models to the tune of Thriller by Michael Jackson, and I believe there's also a second video where you sing about k-means clustering
00:01:44 Dr Genevieve Hayes
to Smokin Out the Window.
00:01:47 Prof Michael Littman
That's exactly right.
00:01:48 Dr Genevieve Hayes
I've seen parts of that one, but for my money, the Thriller video is the way to go.
00:01:54 Prof Michael Littman
The Thriller video has definitely gotten the most airplay of anything that I've ever posted.
00:01:58 Prof Michael Littman
We had a team at Georgia Tech who did the production and who did the background singing, and it was really high quality stuff.
00:02:06 Prof Michael Littman
So yeah, yeah.
00:02:08 Prof Michael Littman
And plus, I mean, machine learning is kind of a big thing now.
00:02:11 Prof Michael Littman
And so there's lots of people who are just starting to learn about it.
00:02:13 Prof Michael Littman
And sometimes they want to take a little break from all the
00:02:16 Prof Michael Littman
equation cramming and just see something a little bit different, and so we get some play.
00:02:20 Dr Genevieve Hayes
And I think that probably makes you and Charles Isbell the coolest machine learning lecturers on the face of
00:02:28 Dr Genevieve Hayes
the planet.
00:02:30 Prof Michael Littman
Well, thanks so much. So I actually made a presentation at a conference a number of years ago, I guess in 2016, and I was introduced by a real luminary in the field of machine learning, a guy named Tom
00:02:42 Prof Michael Littman
Dietterich, who introduced me as the funniest person in machine learning, and I thought that was a very nice thing to say.
00:02:48 Prof Michael Littman
But then I saw him at the refreshments afterwards and he said, I
00:02:52 Prof Michael Littman
just want to tell you, you're actually #2.
00:02:54 Prof Michael Littman
Charles is #1.
00:02:56 Prof Michael Littman
So I'm like, OK, I'm OK being #2 to Charles. Charles is a very, very funny guy.
00:03:02 Dr Genevieve Hayes
Yeah, for the sake of listeners, I had the pleasure of taking two classes that were taught
00:03:08 Dr Genevieve Hayes
by Michael and Charles Isbell as part of Georgia Tech's OMSCS program, and one of the students in one of those classes described your lectures as being like watching a stand-up comedy act.
00:03:23 Dr Genevieve Hayes
And I think that is a perfect way of describing them.
00:03:27 Dr Genevieve Hayes
They are absolutely.
00:03:29 Dr Genevieve Hayes
And then you realize.
00:03:30 Dr Genevieve Hayes
Oh yeah, I'm learning something as well.
00:03:32 Prof Michael Littman
Yeah, that's important too, right?
00:03:33 Prof Michael Littman
If we were just funny, that would probably not fly.
00:03:37 Prof Michael Littman
We'd lose our jobs as professors.
00:03:39 Prof Michael Littman
But the fact of the matter is, yeah, we try, because this is how we think about the field.
00:03:42 Prof Michael Littman
Like, we've been studying this stuff for many years and
00:03:45 Prof Michael Littman
it's hard to take it super seriously.
00:03:47 Prof Michael Littman
There's so much weirdness to it.
00:03:49 Prof Michael Littman
There's so much quirkiness to it that you know, you have to have a kind of good sense of humor about it sometimes.
00:03:55 Dr Genevieve Hayes
And I can see just from the way you're talking about it right now that you're very enthusiastic about this subject.
00:04:01 Dr Genevieve Hayes
And computer science in general.
00:04:03 Dr Genevieve Hayes
Where does all this enthusiasm come from?
00:04:06 Prof Michael Littman
I mean, you mean beyond the fact that computer science is like, the coolest thing to study?
00:04:11 Dr Genevieve Hayes
Well, yeah.
00:04:12 Prof Michael Littman
I mean, to me that's the main thing, and this has been a recurring theme in my life.
00:04:17 Prof Michael Littman
So I first got exposed to computing when I was 13 years old. It was 1979, let's say, and I had convinced my parents to get me a computer, and I sat in my room
00:04:31 Prof Michael Littman
for basically three years, just, like, immersed in this thing because it was so compelling. And as I'm learning about this stuff, I have two younger siblings.
00:04:39 Prof Michael Littman
I would go to them.
00:04:40 Prof Michael Littman
I'm like, oh my gosh.
00:04:41 Prof Michael Littman
You have to.
00:04:42 Prof Michael Littman
You have to see.
00:04:42 Prof Michael Littman
This is the coolest thing ever.
00:04:44 Prof Michael Littman
Like, arrays blew my mind.
00:04:46 Prof Michael Littman
Just completely, like, rewrote my operating system code.
00:04:49 Prof Michael Littman
It was just the coolest thing.
00:04:51 Prof Michael Littman
And so I'm telling my younger siblings, and they're looking at me like I had just sprouted tentacles out of my forehead.
00:04:56 Prof Michael Littman
Like, like, what are you even talking about?
00:04:57 Prof Michael Littman
First of all, I don't understand a word you're saying.
00:05:00 Prof Michael Littman
And second of all, it doesn't sound interesting at all.
00:05:03 Prof Michael Littman
And so at the time I decided, OK, the problem is.
00:05:07 Prof Michael Littman
I caught them too.
00:05:08 Prof Michael Littman
They were too old.
00:05:08 Prof Michael Littman
Like I need to catch them younger.
00:05:10 Prof Michael Littman
So I waited.
00:05:10 Prof Michael Littman
I bided my time.
00:05:12 Prof Michael Littman
I finally had children.
00:05:13 Prof Michael Littman
Two children, almost like copies of my siblings. I
00:05:14 Prof Michael Littman
had a younger sister and a younger brother, and then I had a daughter and a son.
00:05:19 Prof Michael Littman
I'm like, this is it, this is it.
00:05:21 Prof Michael Littman
Now I get to finally teach them about.
00:05:23 Prof Michael Littman
How exciting computing is.
00:05:25 Prof Michael Littman
And they were very polite about it, but it did not.
00:05:29 Prof Michael Littman
It did not move them the way that it moves me.
00:05:31 Prof Michael Littman
And I finally realized, OK, wait, maybe this isn't a them thing.
00:05:34 Prof Michael Littman
Maybe this is a me thing.
00:05:36 Dr Genevieve Hayes
The students that you're teaching, they must be very engaged in your classes because they would be on board.
00:05:42 Prof Michael Littman
That's interesting to say.
00:05:44 Prof Michael Littman
Well, I do,
00:05:44 Prof Michael Littman
you know, I do everything I can to keep people,
00:05:47 Prof Michael Littman
yeah, kind of in the moment, in the subject, learning about this stuff. I've been teaching computer science for long enough that the student bodies shift over time. And so there were times, like, say, in the early 2000s,
00:06:03 Prof Michael Littman
when, first of all, there was nobody in the classes. All the other people who had been studying computer science left because they were told, at least in the US, they were told:
00:06:10 Prof Michael Littman
This is a dumb thing to study because all of the jobs are being outsourced.
00:06:15 Prof Michael Littman
Only people in India are gonna get to do, say, programming.
00:06:18 Prof Michael Littman
Don't study this.
00:06:19 Prof Michael Littman
That's stupid.
00:06:20 Prof Michael Littman
And their parents would tell them that.
00:06:21 Prof Michael Littman
And the kids would not study.
00:06:23 Prof Michael Littman
Computer science.
00:06:24 Prof Michael Littman
So the only people in the classes were people who were pretty die hard.
00:06:27 Prof Michael Littman
Either they didn't get the memo, right, that nobody actually bothered to tell them this is
00:06:30 Prof Michael Littman
not worth your
00:06:31 Prof Michael Littman
time, or they actually were excited about the subject.
00:06:35 Prof Michael Littman
And so that, you know, that was kind of fun, because the people that were there in the classroom really, really, really wanted to be there. Then, wow, I mean, over the course of 10 or 15
00:06:45 Prof Michael Littman
years, I think the field grew, or the number of students in these classes grew, tenfold, right?
00:06:51 Prof Michael Littman
Or at least threefold.
00:06:53 Prof Michael Littman
But, like, more than one doubling, so possibly two doublings, and so suddenly there's tons of people who, on paper anyway, wanted to study computer science.
00:07:03 Prof Michael Littman
But in practice I think a lot of them were in the field.
00:07:06 Prof Michael Littman
Because they thought this is a good place to get a job.
00:07:09 Prof Michael Littman
And so teaching them is more difficult, right?
00:07:11 Prof Michael Littman
You actually do need to help people appreciate why they should be excited, why this does matter, why it would make a difference in their lives.
00:07:19 Prof Michael Littman
Yeah, that's a, you know, like you stand up in the front of the classroom, you do absolutely everything you can.
00:07:24 Prof Michael Littman
To keep their attention.
00:07:25 Dr Genevieve Hayes
One of the things I've been told... I'm not a fan of PowerPoint, but it seems to be required nowadays.
00:07:32 Dr Genevieve Hayes
And so if I do it, I do it under duress, and I put pictures on my slides, and I've been told
00:07:40 Dr Genevieve Hayes
that actually really appeals to the class, because I've been told a lot of lecturers put just words on the slides, and just having pictures there really makes a difference.
00:07:52 Prof Michael Littman
Yeah, I think that's right, because I think they're already processing all these words. A lot of times, just the choice of picture is revealing of the personality of the person who designed the slides.
00:08:02 Prof Michael Littman
And so you're getting something much more human, right?
00:08:05 Prof Michael Littman
You're
00:08:05 Prof Michael Littman
kind of connecting with someone at a level where just the words don't really do the trick.
00:08:10 Prof Michael Littman
And so, yeah, I believe that.
00:08:13 Prof Michael Littman
I mean.
00:08:14 Prof Michael Littman
Boy, I just went to a thesis defense this week and the student was fantastic at producing visuals.
00:08:20 Prof Michael Littman
He made a little sort of graphic robot that reappeared throughout the talk, and it was just, you know, it did different things.
00:08:26 Prof Michael Littman
It had different expressions.
00:08:27 Prof Michael Littman
It conveyed a lot. Like, I think it came to represent him in some ways, and it was really fun
00:08:33 Prof Michael Littman
to just
00:08:34 Prof Michael Littman
see what would be on the next slide, right?
00:08:36 Prof Michael Littman
So that kept me...
00:08:38 Prof Michael Littman
Yeah, it kept me watching.
00:08:40 Prof Michael Littman
Nowadays, it's really hard.
00:08:41 Prof Michael Littman
I find that, just for me, when I hear a talk, not going to e-mail is a huge thing, because e-mail is constantly trying to draw me back in, right, the siren song of the e-mail.
00:08:53 Prof Michael Littman
And as soon as I do that, I stop listening and then I lose the thread.
00:08:56 Prof Michael Littman
And then the whole thing was a waste of time.
00:08:58 Prof Michael Littman
And so
00:08:59 Prof Michael Littman
I have to try really hard to prevent myself from going down that road, and, yeah, the pictures can be really helpful in preventing me from losing my attention.
00:09:08 Dr Genevieve Hayes
So from your point of view, as a teacher, what's the biggest tip you would give people on how to engage students?
00:09:17 Prof Michael Littman
So I think it's...
00:09:18 Prof Michael Littman
What I tell... OK.
00:09:19 Prof Michael Littman
What I often tell my students, my PhD students or master's students, when they're about to give a presentation is there's two things that everybody gets wrong.
00:09:28 Prof Michael Littman
Like, everybody gets, more or less...
00:09:30 Prof Michael Littman
I mean, unless they're not good at all, they're gonna get the content right.
00:09:33 Prof Michael Littman
They're going to get the details more or less right.
00:09:35 Prof Michael Littman
They're going to get... it's
00:09:36 Prof Michael Littman
gonna be hard, but they're gonna get the flow from topic to topic right.
00:09:39 Prof Michael Littman
But they're often gonna mess up two things.
00:09:41 Prof Michael Littman
I say pay attention to transitions. Like, before you flip the slide, set up the next slide, because they need to know where they are and how they're getting to where
00:09:51 Prof Michael Littman
they're going next.
00:09:52 Prof Michael Littman
The common error is you have a
00:09:54 Prof Michael Littman
slide, and you said you don't love PowerPoint, and I think this is the reason that some people don't love PowerPoint, because it convinces you that you talk...
00:10:00 Prof Michael Littman
There's a slide up there.
00:10:01 Prof Michael Littman
You talk about that slide.
00:10:02 Prof Michael Littman
You pause, you advance the slide and then you start the story of the next slide.
00:10:06 Prof Michael Littman
And so it's a series of short stories instead of a narrative arc.
00:10:10 Prof Michael Littman
And so if you force yourself, before you change the slide, to set up
00:10:14 Prof Michael Littman
what they're gonna see on the next slide so that they're ready
00:10:16 Prof Michael Littman
for it, and then boom, it's there,
00:10:19 Prof Michael Littman
that makes for, I think, a smoother presentation.
00:10:23 Prof Michael Littman
And then the second thing is visuals, because I think that's the other thing that people tend to do.
00:10:26 Prof Michael Littman
They'll either just have slides of equations, if it's a mathy talk,
00:10:30 Prof Michael Littman
which is really tough on the listener. Like, absorbing equations in real time while someone's talking, that's really rough. Or words: they'll just have the wall of text.
00:10:40 Prof Michael Littman
It's just like bullet, bullet, bullet, bullet, bullet.
00:10:42 Prof Michael Littman
And so now you know, the slide comes up and you're thinking, oh, crap, I have to read all this and listen.
00:10:47 Prof Michael Littman
And the language doesn't always match.
00:10:49 Prof Michael Littman
And I have to try to.
00:10:50 Prof Michael Littman
Why did they put that bullet if they said that thing?
00:10:52 Prof Michael Littman
That's exhausting.
00:10:53 Prof Michael Littman
And so making sure that there's visuals that actually kind of take them through the story, that makes a really big difference, and it's time consuming, right?
00:11:01 Prof Michael Littman
It's not easy to make visuals. And words, like, you already know what you're gonna say, so writing down the words is a piece of
00:11:06 Prof Michael Littman
cake, right? But getting visuals that actually kind of help people understand what you're saying is extra work, but it's worth it.
00:11:13 Dr Genevieve Hayes
Oh yeah, yeah.
00:11:14 Dr Genevieve Hayes
Now I could happily discuss teaching advice with you all day, but that's not why our listeners are here.
00:11:20 Dr Genevieve Hayes
Our topic for today is actually reinforcement learning, which is the other type of machine learning.
00:11:28 Prof Michael Littman
Which, every time you say that, it makes me think of an old commercial in the US, maybe not that old.
00:11:34 Prof Michael Littman
Yeah, it's old.
00:11:35 Prof Michael Littman
If I remember it, it's probably old.
00:11:37 Prof Michael Littman
Where they would be trying to sell pork and they would call it the other white meat.
00:11:42 Prof Michael Littman
So there was chicken.
00:11:43 Prof Michael Littman
And then there's the other white meat.
00:11:45 Prof Michael Littman
And that's what I think when you say reinforcement learning is the other machine learning. Yeah,
00:11:49 Prof Michael Littman
it's basically pork.
00:11:50 Prof Michael Littman
And it's funny that you say it that way, because I used to teach a reinforcement learning class where I would get up in the beginning, students would come in, and in the very first lecture I would say, you have to understand,
00:12:02 Prof Michael Littman
I know you're interested in machine learning.
00:12:04 Prof Michael Littman
This isn't the machine learning
00:12:05 Prof Michael Littman
that you're thinking of.
00:12:06 Prof Michael Littman
So, you know this,
00:12:07 Prof Michael Littman
because if you're wanting that kind of machine learning, you need to go to a different class. And I'd give people a chance to leave if they need to leave, and then I would say, OK, now we're going to talk about the other kind of machine learning, reinforcement learning, which to me is the most interesting thing because it really embodies the entire AI problem.
00:12:22 Prof Michael Littman
But from a practical standpoint, it's not been as influential as supervised learning.
00:12:27 Prof Michael Littman
And so I, you know, want to make sure people understand that.
00:12:29 Dr Genevieve Hayes
I was lucky enough to go to a university that had a whole semester-long course in reinforcement learning, but I know a lot of our listeners won't have been fortunate enough to have done that.
00:12:43 Dr Genevieve Hayes
For those people who've never encountered reinforcement learning before, can you explain what it is?
00:12:49 Prof Michael Littman
Yes, I should do that.
00:12:50 Prof Michael Littman
In fact, I once heard a talk, there was, like, an intro to data science
00:12:54 Prof Michael Littman
talk, where
00:12:56 Prof Michael Littman
the person giving the talk,
00:12:58 Prof Michael Littman
oh man, I hope I can remember the terms, said...
00:13:01 Prof Michael Littman
You know, I think of supervised, unsupervised and reinforcement learning as being the three main flavors of machine learning.
00:13:07 Prof Michael Littman
But he said, to a data scientist, it's actually like prespective, prospective and anti...
00:13:13 Prof Michael Littman
OK, I'm definitely making up words now.
00:13:16 Prof Michael Littman
I mean, I forget how he described it, but it was, oh, descriptive, prescriptive and
00:13:23 Prof Michael Littman
predictive. I think it might have been that. Alright, so predictive is like supervised learning, where you're trying to predict the next thing when
00:13:30 Prof Michael Littman
you're given this training data that consists of, well, when you're in this kind of situation,
00:13:34 Prof Michael Littman
this is what you should predict.
00:13:35 Prof Michael Littman
Descriptive is like unsupervised learning, where it just takes a whole lot of data and just tries to make some kind of sense of it.
00:13:42 Prof Michael Littman
It's trying to describe it.
00:13:43 Prof Michael Littman
Right.
00:13:43 Prof Michael Littman
And it takes many different
00:13:45 Prof Michael Littman
forms. But the idea in reinforcement learning, and this is why I think it's so important and compelling, is that we're actually going to use it to make decisions.
00:13:52 Prof Michael Littman
We're actually going to prescribe a course of action based on the data, and that's a harder thing to do because you're not just guessing what will happen next.
00:13:59 Prof Michael Littman
You're actually making a decision.
00:14:01 Prof Michael Littman
What do I want to happen next?
00:14:03 Prof Michael Littman
And so to me, that's the essence of reinforcement learning.
00:14:06 Prof Michael Littman
It's about...
00:14:06 Prof Michael Littman
Another way I describe it sometimes is
00:14:09 Prof Michael Littman
evaluative, like, learning from evaluative feedback, right?
00:14:14 Prof Michael Littman
So in supervised learning, you learn from supervised feedback, which is: here's a problem,
00:14:18 Prof Michael Littman
try to answer it.
00:14:19 Prof Michael Littman
OK, here's what you should have said, right?
00:14:21 Prof Michael Littman
This is the supervision.
00:14:22 Prof Michael Littman
This is what you should say next time.
00:14:24 Prof Michael Littman
Learn to do that better. But in reinforcement learning you don't get that kind of signal.
00:14:27 Prof Michael Littman
You...
00:14:28 Prof Michael Littman
you do something, like, the
00:14:29 Prof Michael Littman
learner does something, and then the feedback that it
00:14:32 Prof Michael Littman
gets is,
00:14:34 Prof Michael Littman
you know, it gets some kind of numerical score, and its job is to try to figure out: how should I produce outputs from these inputs to maximize that score?
00:14:43 Prof Michael Littman
And so in many ways, it's a much, much harder problem, right?
00:14:45 Prof Michael Littman
Because you're given less direct information, but in other ways it's much more generally applicable because you don't have to have that supervised data.
00:14:53 Prof Michael Littman
to be able to apply these ideas.
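The contrast Michael draws here, learning from a numerical score rather than from being told the right answer, can be illustrated with a small Python sketch. This is not from the episode: the three-action setup, the `run_bandit` name, the reward values and the epsilon-greedy strategy are all assumptions chosen for illustration. The learner is never told which action is correct; it only sees a noisy score for whatever it tried.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner that only ever sees a numerical score.

    true_means is hidden from the learner; it is used only to generate
    the score for the action that was actually taken (hypothetical setup).
    """
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions   # running average score per action
    counts = [0] * n_actions        # how often each action was tried

    for _ in range(steps):
        # Explore occasionally; otherwise exploit the best estimate so far.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Evaluative feedback: a noisy score for the chosen action only.
        # Supervised learning would instead reveal the correct action.
        reward = rng.gauss(true_means[action], 1.0)

        # Incremental update of the average score for that action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts

estimates, counts = run_bandit([0.2, 0.5, 1.0])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("Preferred action:", best_action, "tries per action:", counts)
```

Because the learner has to try actions to find out how they score, it needs many more interactions than a supervised learner that is simply handed the answer.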
00:14:55 Dr Genevieve Hayes
So the logical question is, why is it that reinforcement learning has become the pork of machine learning?
00:15:04 Prof Michael Littman
The pork of machine
00:15:06 Prof Michael Littman
learning. Why is reinforcement learning the pork of machine learning?
00:15:11 Prof Michael Littman
I think in part because yeah, it's funny cause I wanna say it's easier to apply in the sense that it's more generally applicable, broadly applicable.
00:15:20 Prof Michael Littman
You don't need as much input, but it's harder to get it to actually do something
00:15:24 Prof Michael Littman
useful. So whereas the two of us could probably sit down and list, like, 50 really influential applications of supervised learning that touch people's lives on a day-to-day
00:15:33 Prof Michael Littman
basis, I can only think of maybe three ways that companies have used reinforcement learning in a way that's really visible, that you can point to and say, yeah, they did that using reinforcement learning and that was the right way to handle the problem.
00:15:46 Prof Michael Littman
And so why is that? Part of it is because it's often the...
00:15:51 Prof Michael Littman
Somebody quipped
00:15:51 Prof Michael Littman
once, I heard an AI researcher
00:15:54 Prof Michael Littman
say, that you can tell when you're listening to a reinforcement learning talk
00:15:58 Prof Michael Littman
at a conference because the x-axis, in terms of training data, is in exponential notation, right?
00:16:05 Prof Michael Littman
It generally takes millions and millions of trials for these systems to actually learn how to grapple with the environment to try to maximize reward, and
00:16:15 Prof Michael Littman
it's not a necessary feature, but it is a very common feature of
00:16:19 Prof Michael Littman
these kinds of problems that it's just such a big space of possibilities that, as the system starts to explore and try things out, there's just lots of different things it has to try before it can really latch on to a successful recipe.
00:16:30 Prof Michael Littman
So a classic, or very notable, application of reinforcement learning was the system that learned to play the game
00:16:38 Prof Michael Littman
of Go. Well,
00:16:39 Prof Michael Littman
it's also been applied to other games like chess and Gomoku and Hex, various kinds of board games, but the program learned to play Go at a level that was possibly better than any human has ever played the game.
00:16:53 Prof Michael Littman
But to do that, it had to play
00:16:55 Prof Michael Littman
more than a human lifetime's worth of games, right?
00:16:59 Prof Michael Littman
So if you were born and all you did was play Go, sometimes take a break for lunch and sleep a little bit each night, for 70 years, you would not play the number of games of Go that this program played to learn
00:17:11 Prof Michael Littman
how to do
00:17:12 Prof Michael Littman
well. Now, arguably, that's still incredible that a program
00:17:15 Prof Michael Littman
could do that.
00:17:16 Prof Michael Littman
But in terms of sort of practical applications, you often don't have 70 years' worth of experience before you can actually make good decisions.
00:17:25 Prof Michael Littman
Most applications don't have the.
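To get a rough feel for the "human lifetime of games" comparison, here is a back-of-the-envelope calculation in Python. Every number in it is an assumption made for illustration (one hour per game, 14 hours of play a day, and 5 million training games as a stand-in for "millions and millions of trials"); none of these figures comes from the episode.

```python
# All figures below are illustrative assumptions, not measurements.
HOURS_PER_GAME = 1.0     # assumed length of one game of Go
HOURS_PER_DAY = 14       # assumed daily playing time, leaving room for lunch and sleep
YEARS = 70

# Games a person could play in a lifetime under these assumptions.
human_games = int(YEARS * 365 * HOURS_PER_DAY / HOURS_PER_GAME)
print(f"Games in {YEARS} years of play: {human_games:,}")

# Stand-in for "millions and millions of trials" (hypothetical figure).
ASSUMED_TRAINING_GAMES = 5_000_000
lifetimes = ASSUMED_TRAINING_GAMES / human_games
print(f"Equivalent human lifetimes: {lifetimes:.1f}")
```

Under these assumptions a lifetime of play comes to a few hundred thousand games, so a training run measured in millions of games corresponds to many lifetimes of experience.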
00:17:26 Dr Genevieve Hayes
Other than playing go really, really well, what are some of those other real world applications that you mentioned that you'd thought of?
00:17:34 Prof Michael Littman
Yeah, yeah, yeah.
00:17:35 Prof Michael Littman
So one of the ones that I think is very cool is a system that I think was in the Nest thermostat.
00:17:43 Prof Michael Littman
So basically a smart thermostat, a little computer that is controlling the temperature in your home.
00:17:49 Prof Michael Littman
It actually learns the thermodynamics of your house and then uses that to maximize reward.
00:17:54 Prof Michael Littman
So it starts to understand.
00:17:56 Prof Michael Littman
It knows what the outdoor temperature is because it's on the Internet and it's getting that data, and it knows what the indoor temperature is because it's got a thermometer built into it.
00:18:04 Prof Michael Littman
And it knows that,
00:18:05 Prof Michael Littman
OK, well, I ran the cooling system, say, for this amount of time at this power level. It can start to figure out,
00:18:13 Prof Michael Littman
OK, well, this is how long it takes, this is how much energy you have to put into the house before it is cooled down
00:18:18 Prof Michael Littman
this number of degrees. And once it kind of understands, once it's able to predict accurately... this is a supervised learning problem in a sense, right?
00:18:24 Prof Michael Littman
Like, if I put in this much energy and the temperature is this and the
00:18:27 Prof Michael Littman
outdoor temperature is this,
00:18:28 Prof Michael Littman
this is how the temperature's
00:18:29 Prof Michael Littman
gonna change.
00:18:30 Prof Michael Littman
So over time it actually does this kind of internal supervised training to be able to predict the temperatures.
00:18:36 Prof Michael Littman
Then it takes an additional step that says, OK, if I'm trying to... it needs to be set to...
00:18:42 Prof Michael Littman
I want to use Celsius because you're in Australia, but I don't know what... I was going to say, you know, you want to set it to 80 degrees Celsius, and I'm thinking, OK, that's actually...
00:18:52 Prof Michael Littman
So you want to set it to some, you know, let's call
00:18:54 Prof Michael Littman
it room temperature. You want to set it to room temperature, and so it knows what the target is and it knows the way that your house changes in reaction to turning the cooling on.
00:19:04 Prof Michael Littman
It can then optimize.
00:19:05 Prof Michael Littman
It can then maximize reward, which is to say, it can find the way of adding the least energy to the system to bring the temperature to the right temperature in the shortest amount of time. And so it does this planning, it does this sort of reinforcement learning step of deciding on actions that are going to change the world to try to maximize a score, and the score in this case is
00:19:25 Prof Michael Littman
a combination of comfort and energy expended to reach that level of comfort.
00:19:29 Prof Michael Littman
And I just thought, man, that's brilliant, because what's wonderful about it is, unlike the Go example... in Go, it trained and trained and trained and trained.
00:19:37 Prof Michael Littman
And then finally, what came out at the end is what's called a policy.
00:19:40 Prof Michael Littman
Basically, here's a decision rule for playing the game.
00:19:43 Prof Michael Littman
And at that point it doesn't need to learn anymore.
00:19:46 Prof Michael Littman
Basically, it doesn't matter that it was created with reinforcement learning.
00:19:49 Prof Michael Littman
It's this brilliant
00:19:50 Prof Michael Littman
policy for playing the game.
00:19:52 Prof Michael Littman
So the reinforcement learning was just, like, behind the scenes.
00:19:56 Prof Michael Littman
Like a parent, sort of: you work really hard,
00:19:58 Prof Michael Littman
finally the kid's out in the world, and then your job is done.
00:20:00 Prof Michael Littman
What's neat about the thermostat case is that when the thermostat ships from the factory,
00:20:07 Prof Michael Littman
it still needs to learn.
00:20:08 Prof Michael Littman
It has to get into your house and learn about your house.
00:20:11 Prof Michael Littman
And so the learning is actually part of the deployment of the system itself.
00:20:14 Prof Michael Littman
And I just think that's super cool, right?
00:20:16 Prof Michael Littman
So you have a system that's learning to do a good job with your data on the
00:20:21 Prof Michael Littman
fly, and not just,
00:20:23 Prof Michael Littman
we used reinforcement learning because we couldn't figure out a good policy without it.
00:20:26 Prof Michael Littman
This is like, no, it has to be deployed with learning in it, because your house is different from my house, is different from her house and different from his house.
00:20:33 Prof Michael Littman
And so
00:20:34 Prof Michael Littman
that particular application I really love, and they did it in a way where it learns, you know, in a week or so. It doesn't need to take a million trials before it learns what to do.
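The two-stage idea described for the thermostat (first learn to predict how the house responds, then use that prediction to choose actions that maximize a score) can be sketched in Python as follows. This is a toy under loudly stated assumptions, not the actual algorithm in any commercial thermostat: the linear house model, the `leak` and `cool` constants, the function names and the scoring weights are all hypothetical.

```python
import random

def house_step(indoor, outdoor, power, leak=0.10, cool=0.8):
    """Hidden 'true' house (hypothetical). The thermostat never sees leak or cool."""
    return indoor + leak * (outdoor - indoor) - cool * power

def fit_model(samples):
    """Supervised step: least-squares fit of
    (next - indoor) = a * (outdoor - indoor) - b * power."""
    sxx = sxy = syy = sxd = syd = 0.0
    for indoor, outdoor, power, nxt in samples:
        x, y, d = outdoor - indoor, -power, nxt - indoor
        sxx += x * x
        sxy += x * y
        syy += y * y
        sxd += x * d
        syd += y * d
    det = sxx * syy - sxy * sxy          # solve the 2x2 normal equations
    a = (sxd * syy - syd * sxy) / det
    b = (syd * sxx - sxd * sxy) / det
    return a, b

def choose_power(indoor, outdoor, target, a, b, energy_weight=0.5):
    """Decision step: one-step lookahead with the learned model.
    Score = -(predicted discomfort) - energy_weight * power used."""
    def score(power):
        predicted = indoor + a * (outdoor - indoor) - b * power
        return -abs(predicted - target) - energy_weight * power
    return max([0.0, 0.25, 0.5, 0.75, 1.0], key=score)

# 1) Gather experience in this particular house using random cooling settings.
rng = random.Random(0)
samples, indoor = [], 28.0
for _ in range(200):
    outdoor = 30.0 + rng.uniform(-3, 3)
    power = rng.choice([0.0, 0.5, 1.0])
    nxt = house_step(indoor, outdoor, power) + rng.gauss(0, 0.05)  # noisy sensor
    samples.append((indoor, outdoor, power, nxt))
    indoor = nxt

# 2) Learn the house's thermodynamics from that experience.
a, b = fit_model(samples)

# 3) Use the learned model to steer toward the target temperature.
final_temp, target = 28.0, 22.0
for _ in range(30):
    power = choose_power(final_temp, 30.0, target, a, b)
    final_temp = house_step(final_temp, 30.0, power)

print(f"learned a={a:.3f}, b={b:.3f}, temperature after control={final_temp:.1f}")
```

The point of the sketch is that the learning happens after deployment: a different house would have different hidden constants, and the same code would fit a different model from a couple of hundred observations rather than millions of trials.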
00:20:44 Dr Genevieve Hayes
Are there any other products or software tools on the market that do anything similar to that to your knowledge?
00:20:50 Prof Michael Littman
Maybe so, but I don't I.
00:20:53 Prof Michael Littman
Don't know too.
00:20:53 Prof Michael Littman
Many of them, it's it's possible that there's some of this stuff is happening behind the scenes.
00:20:57 Prof Michael Littman
I know that Google will sometimes brag
00:20:59 Prof Michael Littman
that they use
00:21:00 Prof Michael Littman
various kinds of reinforcement learning in their data centers.
00:21:03 Prof Michael Littman
So again, it's similar to the thermostat question, where you actually want to keep the data centers at a particular temperature.
00:21:10 Prof Michael Littman
You don't want to use up a lot of energy to do that.
00:21:13 Prof Michael Littman
I think it's typical that a data center will use half its energy on cooling and half its energy on compute.
00:21:18 Prof Michael Littman
And so if you can save energy on cooling, you've got more energy left over for compute, which is the data center's job.
00:21:24 Prof Michael Littman
So you really wanna, you know, do that, do the best job that you can with that.
00:21:28 Prof Michael Littman
But if you let things get too warm,
00:21:30 Prof Michael Littman
computers start to
00:21:31 Prof Michael Littman
die, they start to break down, and then you can't get the compute done.
00:21:34 Prof Michael Littman
So finding the right balance of, like, OK, how warm can I let it be right now, or if I cool it for a while, can I actually let it warm up for a little bit without danger that the machines are gonna break?
00:21:45 Prof Michael Littman
And so there's this kind of oversight process that's watching all these readings and deciding what to do.
00:21:51 Prof Michael Littman
And my understanding is that some companies are using reinforcement learning to tune
00:21:55 Prof Michael Littman
that policy,
00:21:56 Prof Michael Littman
to tune that rule for deciding when to turn on the air conditioning, what sections of the data center to air condition at what times.
00:22:03 Dr Genevieve Hayes
I could imagine a similar application with exhaust fans in tunnels.
00:22:08 Dr Genevieve Hayes
So that wouldn't be heat, that would be just
00:22:11 Dr Genevieve Hayes
power exhaust needing to be cleared, but it's basically the same sort of thing.
00:22:15 Prof Michael Littman
Yeah. So basically,
00:22:16 Prof Michael Littman
reinforcement learning is about,
00:22:18 Prof Michael Littman
it's what engineers used to call control theory.
00:22:21 Prof Michael Littman
It's about trying to decide what to do, and you have some measure of goodness of what you're doing.
00:22:27 Prof Michael Littman
Classical control theory looks at a very constrained version of the problem, where the rewards are all about, you know, distance from a target,
00:22:35 Prof Michael Littman
and you have to,
00:22:36 Prof Michael Littman
it's like squared distance from a target, and the dynamics of the system are all assumed to be locally linear and so forth.
00:22:42 Prof Michael Littman
But under those assumptions, you can just, boom, you can just apply a control system to it and it does the right thing.
00:22:48 Prof Michael Littman
What makes reinforcement learning distinct,
00:22:50 Prof Michael Littman
I think, from control theory is that we're very comfortable saying, oh, we don't know if it's linear or not, we're just going to use a more
00:22:56 Prof Michael Littman
general function approximator.
00:22:57 Prof Michael Littman
Ohh, the rewards don't have to be any kind of specific distance function.
00:23:01 Prof Michael Littman
We can handle whatever. And it's true, like, in principle it can learn much more complicated, much richer environments.
00:23:08 Prof Michael Littman
But then there's this trade off, because then it learns much more slowly, and so you want to use control theory if you can, because it's gonna learn more effectively.
00:23:15 Prof Michael Littman
It's gonna be more robust.
00:23:16 Prof Michael Littman
But if you can't, we got some other tools for you.
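A minimal sketch of the classical-control setting described above, assuming a one-dimensional system with linear dynamics and a squared-distance cost. The function names and the numbers (a, b, q, r) are illustrative assumptions, not anything from the conversation; the scalar Riccati iteration is the standard textbook recipe for a linear-quadratic regulator.

```python
def lqr_gain(a, b, q, r, iterations=200):
    """Scalar discrete-time LQR gain via Riccati iteration.

    Dynamics: x_next = a * x + b * u, where x is the distance from the target.
    Cost per step: q * x**2 + r * u**2 (squared distance plus control effort).
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return (a * b * p) / (r + b * b * p)


def simulate(a, b, gain, x0, steps=20):
    """Apply the linear rule u = -gain * x and record the distance from target."""
    xs = [x0]
    for _ in range(steps):
        u = -gain * xs[-1]
        xs.append(a * xs[-1] + b * u)
    return xs


# Hypothetical room: the temperature error persists on its own (a = 1.0) and
# one unit of control moves it by 0.5 degrees (b = 0.5).
gain = lqr_gain(a=1.0, b=0.5, q=1.0, r=0.1)
trajectory = simulate(a=1.0, b=0.5, gain=gain, x0=5.0)
```

Under these assumptions nothing is learned by trial and error: the rule falls out of the model directly, which is the trade-off being described. Reinforcement learning gives up that shortcut in exchange for handling non-linear dynamics and arbitrary rewards.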
00:23:19 Dr Genevieve Hayes
How do you see the use cases for reinforcement learning changing in the future?
00:23:24 Prof Michael Littman
Yeah, that's an interesting question.
00:23:26 Prof Michael Littman
So it strikes me that the current boom in machine learning that we've been seeing over the last five to 10 years, a lot of it was enabled by the fact that we just have a lot of data now and we have enough compute that we can crank through that data. To the extent that we're starting to instrument
00:23:44 Prof Michael Littman
control systems more, like if you have lots of temperature measurements in your house and lots of different vents that can be controlled by computer,
00:23:53 Prof Michael Littman
the more that that happens, the more there's an opportunity to actually do some control in the real world.
00:23:58 Prof Michael Littman
The more that it's not like that, right, that you take very rare readings and then a person interprets those readings, like what you might see in an intensive care unit in a hospital, a lot of it is not fully instrumented.
00:24:10 Prof Michael Littman
It's, like,
00:24:11 Prof Michael Littman
the assumption is that people are in the loop writing things down in charts, making actual decisions, and it's not measured.
00:24:18 Prof Michael Littman
And so it's really hard to then take that data and do anything with it.
00:24:22 Prof Michael Littman
The more that we are, just for other reasons, more naturally, you know, connecting up sensors and storing that data,
00:24:28 Prof Michael Littman
the more opportunities we have to just, like, ohh wait, we could actually apply learning to this, the same way that companies that were collecting all this data
00:24:36 Prof Michael Littman
on the Internet for other reasons said, oh wait, we can actually, you know,
00:24:41 Prof Michael Littman
we collected
00:24:41 Prof Michael Littman
all these photos on social media
00:24:43 Prof Michael Littman
because people want to share photos.
00:24:45 Prof Michael Littman
But wait a second.
00:24:45 Prof Michael Littman
Now we've got a million photos.
00:24:47 Prof Michael Littman
Well, what can we do with a million photos?
00:24:49 Prof Michael Littman
So I feel like that
00:24:50 Prof Michael Littman
could be the future of reinforcement learning, that as the data sort of organically is being gathered,
00:24:56 Prof Michael Littman
then there's going to be an opportunity to actually use that data to
00:24:58 Prof Michael Littman
make some smarter decisions.
00:24:59 Dr Genevieve Hayes
So things like smart electricity and water meters, telematics sensors in mines, all those sorts of things could potentially be used for reinforcement learning.
00:25:11 Prof Michael Littman
That's exactly what I'm thinking.
00:25:12 Prof Michael Littman
Yeah, yeah, yeah.
00:25:13 Prof Michael Littman
And you see the industries or the sectors that are doing more of
00:25:17 Prof Michael Littman
that, the opportunities are going to happen sooner.
00:25:20 Dr Genevieve Hayes
I can totally see that happening in something like the mining sector, which in Australia is just massive. From what I've heard, they've got sensors that are gathering data every second or something.
00:25:31 Prof Michael Littman
Ohh wow.
00:25:32 Prof Michael Littman
Yeah, yeah, yeah.
00:25:33 Prof Michael Littman
I mean, I definitely have heard of, like, large,
00:25:35 Prof Michael Littman
you said massive, like, massive vehicles that are actually part of these mining operations, just moving things around.
00:25:41 Prof Michael Littman
And so even just deciding I'm going to move this pile of stuff from here to here is incredibly expensive.
00:25:47 Prof Michael Littman
And so systems that are trying to reason about, well, what's the benefit of that
00:25:52 Prof Michael Littman
compared to the cost, are going
00:25:53 Prof Michael Littman
to be really valuable.
00:25:54 Dr Genevieve Hayes
I was at a conference last year where there was a man who was talking about optimization problems in the mining
00:26:00 Dr Genevieve Hayes
sector, and there were things like how do you optimise the fuel use by getting the drivers to drive at a particular speed and things like that.
00:26:13 Dr Genevieve Hayes
And I could imagine that could become a reinforcement learning problem.
00:26:16 Prof Michael Littman
Ohh yeah absolutely.
00:26:17 Prof Michael Littman
It's perfectly set up
00:26:18 Prof Michael Littman
for that.
00:26:19 Dr Genevieve Hayes
What about things like generative AI? Because I've heard reinforcement learning was used in training ChatGPT, although I'm not exactly sure of how it was used.
00:26:28 Prof Michael Littman
Yeah, that's a really good point to bring up. ChatGPT is amazing, and it's amazing how it really caught so many people's imaginations.
00:26:36 Prof Michael Littman
And one of the things that's really cute about it is if I were teaching an intro machine learning class today, I would very much lead off with ChatGPT because it uses
00:26:46 Prof Michael Littman
unsupervised learning to understand the pattern of word
00:26:50 Prof Michael Littman
usage. It uses
00:26:51 Prof Michael Littman
supervised learning to try to generate outputs that are more consistent with, like, target outputs that they want it to be.
00:26:58 Prof Michael Littman
They want it to produce language in a certain way, not just whatever kind of flows through it, but it should be like a conversation, and to do that they actually train it with some supervised
00:27:06 Prof Michael Littman
learning. And in the supervised learning system, it actually is learning to say, OK, if this was the question and this was your answer, human beings would score that as a four, you know.
00:27:17 Prof Michael Littman
Whereas here's the question.
00:27:18 Prof Michael Littman
Here's a different answer.
00:27:19 Prof Michael Littman
Human beings would score that as a 10. Supervised learning is used to learn that mapping.
00:27:23 Prof Michael Littman
Then now we have a reward function and we can apply reinforcement learning, and that's what folks do
00:27:28 Prof Michael Littman
in an application of RL that's called RLHF,
00:27:32 Prof Michael Littman
reinforcement learning from human feedback.
00:27:34 Prof Michael Littman
So instead of a defined reward function, like you get eight points for whatever, we're gonna,
00:27:40 Prof Michael Littman
we're gonna charge this much for fuel.
00:27:41 Prof Michael Littman
We're gonna give you this much benefit for the
00:27:44 Prof Michael Littman
quantity of stuff brought out of the mine.
00:27:46 Prof Michael Littman
Instead of a human being writing down what that sort of price list is, they have human beings,
00:27:52 Prof Michael Littman
human judgments.
00:27:53 Prof Michael Littman
It learns to match these human judgments, and then it trains itself for a long time, trying to say,
00:27:57 Prof Michael Littman
OK, how should I
00:27:59 Prof Michael Littman
generate outputs so that the supervised learned reward function that says whether a person would probably like this or
00:28:06 Prof Michael Littman
not will be happy?
00:28:08 Prof Michael Littman
And it makes a tremendous difference, because if you actually look at what these language models do before they're trained that way, they just output all kinds of nonsense because they're not really guided in any way.
00:28:17 Prof Michael Littman
They're just trying to predict, in all the web pages I've ever seen, what's the next token, but instead now they're kind of tweaked, they're tuned to produce outputs that are judged to be
00:28:28 Prof Michael Littman
the things that people
00:28:29 Prof Michael Littman
like, and that's done using a modified form of reinforcement learning.
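The two stages described above (learn a reward function from human scores, then tune behaviour against it) can be sketched in a toy form. Everything here is an assumption made for illustration: the two answer features, the ratings, the linear reward model, and the exact expected policy-gradient update that stands in for the sampling-based algorithms real systems use.

```python
import math

# Hypothetical human ratings: each answer is summarised by two made-up
# features (is_polite, answers_the_question) and scored from 1 to 10.
human_ratings = [([0, 0], 1.0), ([1, 0], 4.0), ([0, 1], 7.0), ([1, 1], 10.0)]


def fit_reward_model(ratings, lr=0.05, epochs=2000):
    """Supervised step: learn a linear map from answer features to human score."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, score in ratings:
            predicted = bias + sum(w * f for w, f in zip(weights, features))
            error = predicted - score
            bias -= lr * error
            weights = [w - lr * error * f for w, f in zip(weights, features)]
    return lambda features: bias + sum(w * f for w, f in zip(weights, features))


def softmax(prefs):
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def tune_policy(reward, candidates, lr=0.1, steps=2000):
    """RL step: shift probability toward answers the reward model scores highly.

    Uses the exact expected policy-gradient update over a tiny candidate set so
    the toy stays deterministic; real systems sample outputs instead.
    """
    prefs = [0.0] * len(candidates)
    rewards = [reward(c) for c in candidates]
    for _ in range(steps):
        probs = softmax(prefs)
        expected = sum(p * r for p, r in zip(probs, rewards))
        prefs = [h + lr * p * (r - expected) for h, p, r in zip(prefs, probs, rewards)]
    return softmax(prefs)


reward_model = fit_reward_model(human_ratings)
candidates = [features for features, _ in human_ratings]
policy = tune_policy(reward_model, candidates)
best = candidates[policy.index(max(policy))]
```

The point of the split is that no person ever writes the scoring rule down. The reward model is fitted to judgments, and the policy only ever sees the fitted model.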
00:28:33 Dr Genevieve Hayes
I'm just imagining something like the gymnastics at the Olympics, where at the end of someone's act you've got the judges holding up the cards. It's basically what you're describing, isn't it?
00:28:43 Prof Michael Littman
Yeah, that's how I think about it, is that that's evaluative feedback, right?
00:28:47 Prof Michael Littman
You're given a score. Now,
00:28:48 Prof Michael Littman
if you were a gymnast and you were trained by going to competitions, doing something, and then at the end of it, you know, looking at the cards held up and saying, huh, I guess I should try something different next time,
00:28:59 Prof Michael Littman
other people got bigger scores than me, then that would be pure reinforcement learning.
00:29:03 Prof Michael Littman
And of course, athletes,
00:29:04 Prof Michael Littman
gymnasts, are not trained just that way. They're
00:29:06 Prof Michael Littman
given a lot of supervised data, they're given a lot of unsupervised data.
00:29:09 Prof Michael Littman
They're given an opportunity to just play around.
00:29:11 Prof Michael Littman
They're given verbal guidance from the coaches, but at the end of the day, yes, they're trying to figure out
00:29:16 Prof Michael Littman
how to behave
00:29:17 Prof Michael Littman
to get that score up, and that is, at root,
00:29:21 Prof Michael Littman
the reinforcement learning problem.
00:29:22 Dr Genevieve Hayes
Another example we had when I took, not your machine learning class, but another course, Machine Learning for Trading.
00:29:30 Dr Genevieve Hayes
We actually used reinforcement learning to come up with the optimal trading policy.
00:29:36 Dr Genevieve Hayes
Is reinforcement learning actually used in practice for developing stock market trading policies?
00:29:42 Prof Michael Littman
OK.
00:29:43 Prof Michael Littman
That's a great question.
00:29:44 Prof Michael Littman
That's a tough one to answer because the folks who do this, they'll only tell you about the stuff that doesn't work well, because if it's stuff that does work well, they're going to use it to make as much money as they can as quickly as they can.
00:29:56 Prof Michael Littman
So I think Tucker Balch, I think, was the professor of that class.
00:30:00 Prof Michael Littman
And he ultimately left Georgia Tech
00:30:03 Prof Michael Littman
to work at, ohh, I'm blanking on which one it is, but it's one of the major financial companies in New York City.
00:30:10 Prof Michael Littman
So he's actually doing this stuff now.
00:30:12 Prof Michael Littman
Is he doing it with reinforcement learning?
00:30:13 Prof Michael Littman
I don't think he's going to tell me, but it is perfectly,
00:30:16 Prof Michael Littman
I mean, I was,
00:30:18 Prof Michael Littman
I was talking about the mining example where you have to figure out, OK, there's fuel usage.
00:30:22 Prof Michael Littman
There's time to deliver the ore.
00:30:24 Prof Michael Littman
There's how quickly the ore is coming out.
00:30:26 Prof Michael Littman
You have to somehow make a a scoring function that brings all these different things together.
00:30:30 Prof Michael Littman
What's great about financial trading is
00:30:33 Prof Michael Littman
there is a common currency. All the decisions you make,
00:30:36 Prof Michael Littman
it's really a question of maximizing currency, like getting the most money that you can. You make a trade, it costs you something, but then there's a monetary benefit from that, and those are exactly comparable to each other, right?
00:30:48 Prof Michael Littman
It's all,
00:30:50 Prof Michael Littman
I wanna say, dollars, and so it is a,
00:30:52 Prof Michael Littman
it's a beautiful,
00:30:54 Prof Michael Littman
and then you also have to make sort of temporally delayed decisions, like I'm gonna,
00:30:57 Prof Michael Littman
I'm gonna buy this stock now and only later is it going to pay off, if the stock price goes up.
00:31:05 Prof Michael Littman
And so you're making these decisions with possibly very delayed effects.
00:31:09 Prof Michael Littman
Kind of like a game, like a board game, where I'm gonna move this piece.
00:31:13 Prof Michael Littman
Now I'm only gonna find out at the end of the game whether I actually
00:31:16 Prof Michael Littman
won, and I still have to figure out, OK,
00:31:17 Prof Michael Littman
was that a smart move that I made 20 moves ago?
00:31:20 Prof Michael Littman
I'm not sure. And so it is,
00:31:22 Prof Michael Littman
it's a beautiful example of reinforcement learning.
00:31:23 Prof Michael Littman
Are they actually doing it?
00:31:25 Prof Michael Littman
They're probably doing some interesting mix of ideas, whatever works. Like, the people who work in these trading companies, they're very adept at bringing together all the ideas that could possibly come to bear.
00:31:38 Prof Michael Littman
They're not purists, right?
00:31:39 Prof Michael Littman
They're not like,
00:31:39 Prof Michael Littman
well, I'm gonna do this because mathematically it's the right thing to do.
00:31:42 Prof Michael Littman
They're like, no, we're gonna mix these things together in whatever way
00:31:46 Prof Michael Littman
actually makes some money.
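The delayed-payoff point made earlier in this answer (a move whose value is only revealed 20 moves later) is usually handled with a discounted return. A minimal sketch, assuming a made-up 20-move game with a single reward at the end and an illustrative discount factor of 0.95:

```python
def discounted_returns(rewards, gamma=0.95):
    """Work backwards from the end of the game, crediting each earlier move
    with a discounted share of everything that happened after it."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical 20-move game: no feedback until a win (+1) on the final move.
rewards = [0.0] * 19 + [1.0]
returns = discounted_returns(rewards)
```

Each earlier move receives a smaller share of the final win, which is one simple way to answer "was that a smart move that I made 20 moves ago?" The same bookkeeping would apply to a stock bought now that pays off later, with money as the reward.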
00:31:47 Dr Genevieve Hayes
And given the profits they're making, they've clearly figured out a way of doing it very, very well.
00:31:54 Prof Michael Littman
That's my impression.
00:31:55 Dr Genevieve Hayes
In addition to your work as a university professor, you've also written a book, Code to Joy, which I'm actually really looking forward to reading once it's released, and that's, I
00:32:07 Dr Genevieve Hayes
believe, coming out on the 3rd of October.
00:32:09 Prof Michael Littman
Yeah, that's exactly right.
00:32:12 Prof Michael Littman
So, so well, thank you for bringing that up.
00:32:13 Prof Michael Littman
Yeah, I'm very excited about it.
00:32:14 Prof Michael Littman
It's something that I put a lot of effort into.
00:32:17 Prof Michael Littman
In fact, I spent a good chunk of time visiting Georgia Tech last year.
00:32:21 Prof Michael Littman
And in all my spare time writing, just writing, writing, writing. Now, I'm excited that you're excited about it, but you might not be the target audience because it's really introducing some of the ideas of computation
00:32:32 Prof Michael Littman
to people who aren't already familiar with that. So the goal of the book Code to Joy is the idea of, like, if we all could code a little bit, we all might be a
00:32:41 Prof Michael Littman
little bit happier.
00:32:42 Prof Michael Littman
And so let me, you know, let me tell you about what the fundamental kind of
00:32:46 Prof Michael Littman
components of programming are, because the emphasis of the whole book is how we tell machines what to do.
00:32:54 Prof Michael Littman
How do we go about getting computers to do what we want them to do?
00:32:58 Prof Michael Littman
And I think of that as being connected with, you know, sort of happiness, in the sense that if you want the computer to do something for you and you know how to tell it to do that,
00:33:07 Prof Michael Littman
well, then you're
00:33:08 Prof Michael Littman
going to be happy.
00:33:09 Prof Michael Littman
Right.
00:33:09 Prof Michael Littman
You don't have to depend on other people.
00:33:11 Prof Michael Littman
You don't have to
00:33:11 Prof Michael Littman
just, like, make do with what it's already doing. You can,
00:33:15 Prof Michael Littman
you actually have some say in the behavior of the machine, and so I go through all the different components of
00:33:20 Prof Michael Littman
computing and I relate them to real world things that I think everybody can relate to, because I think these concepts are not foreign. Like, people design programming languages on top of the way that we already think about the world.
00:33:33 Prof Michael Littman
And we already think about language and communication, so I think it's not as foreign.
00:33:37 Prof Michael Littman
I think a lot of people think it's scary and like mathematical codes and stuff.
00:33:41 Prof Michael Littman
But the ideas are actually ideas that we all use day-to-day when we're just explaining things
00:33:46 Prof Michael Littman
to each other, and I also relate each of these concepts to some idea in machine learning where the computer is actually helping to create that computational concept.
00:33:57 Prof Michael Littman
So I talk about if statements, you know, conditionals that we use in programs all the time.
00:34:02 Prof Michael Littman
I relate that to the word if, like the fact that people use this all the time.
00:34:07 Prof Michael Littman
They think about lots of things in a conditional form.
00:34:09 Prof Michael Littman
I talk about the different ways that, if
00:34:12 Prof Michael Littman
you were to
00:34:13 Prof Michael Littman
try some of these things out on your own,
00:34:15 Prof Michael Littman
they could empower you to make, you know, home automation and automated storytelling and all sorts of things that people could do right out of
00:34:22 Prof Michael Littman
the box. And then I say, and this is kind of like, in machine learning, decision trees, right, where the computer will actually write the if statements for you given the right kind of data.
00:34:32 Prof Michael Littman
And this is how programs can actually write
00:34:35 Prof Michael Littman
code, essentially, by learning from data. And so each of the concepts all have their kind of companion machine learning concept.
00:34:42 Prof Michael Littman
Because I want people to understand that it's not magic, it's actually, it's powerful and it's useful.
00:34:47 Prof Michael Littman
And it's empowering, but it's not.
00:34:51 Prof Michael Littman
It's a thing that we can all use to.
00:34:52 Prof Michael Littman
Make our lives better.
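The idea from earlier in this answer, of a decision tree writing the if statement for you, can be sketched with the smallest possible tree: a single split, sometimes called a decision stump. The thermostat log, the feature name `temperature_c` and both helper functions are hypothetical and exist only for illustration.

```python
def learn_stump(examples):
    """Pick the threshold on one numeric feature that misclassifies fewest examples.

    examples: list of (value, label) pairs with exactly two distinct labels.
    Returns (threshold, label_below, label_at_or_above).
    """
    values = sorted({v for v, _ in examples})
    labels = sorted({lab for _, lab in examples})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        for below, above in (labels, labels[::-1]):
            errors = sum(
                (below if v < threshold else above) != lab for v, lab in examples
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below, above)
    return best[1:]


def as_if_statement(feature, threshold, below, above):
    """Render the learned rule as the if statement a person might have written."""
    return (
        f"if {feature} < {threshold}:\n"
        f"    heater = {below!r}\n"
        f"else:\n"
        f"    heater = {above!r}"
    )


# Hypothetical home-automation log: (room temperature in C, what the owner did).
log = [(15, "on"), (17, "on"), (18, "on"), (21, "off"), (23, "off"), (25, "off")]
threshold, below, above = learn_stump(log)
rule = as_if_statement("temperature_c", threshold, below, above)
```

Printing `rule` shows an ordinary conditional whose threshold came from the data and was not typed in by a programmer. A full decision tree repeats the same search inside each branch.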
00:34:53 Dr Genevieve Hayes
That's actually really cool.
00:34:55 Dr Genevieve Hayes
Even though I know that I'm probably not the target audience, I still want
00:34:58 Dr Genevieve Hayes
to read this book.
00:35:00 Prof Michael Littman
Well, that makes me very happy.
00:35:01 Prof Michael Littman
Thank you for saying that.
00:35:01 Dr Genevieve Hayes
I can see it as being something that would help me to explain a lot of the things that I use to others.
00:35:09 Prof Michael Littman
Ohh well, that's,
00:35:10 Prof Michael Littman
I hadn't thought about that.
00:35:11 Prof Michael Littman
Yeah, yeah, yeah.
00:35:11 Prof Michael Littman
So I should market it to computing professionals as well.
00:35:14 Prof Michael Littman
Like, if you gotta explain this, steal my explanations. I worked
00:35:18 Prof Michael Littman
really hard on these.
00:35:19 Dr Genevieve Hayes
Yeah, exactly.
00:35:22 Dr Genevieve Hayes
Most academics, when they write books, they tend to go for writing a textbook targeted at either an audience of their peers or at students, so that they can use that to teach a future class.
00:35:36 Dr Genevieve Hayes
What made you choose to target a general audience?
00:35:40 Prof Michael Littman
I think it goes back to the story I already told you about my my siblings.
00:35:44 Prof Michael Littman
Like I think something in me, even though I've concluded that I'm weird and not everybody loves
00:35:51 Prof Michael Littman
it, part of me still wants to try to make the case.
00:35:53 Prof Michael Littman
Part of me still believes if I could just explain it the right way, everyone would be as excited
00:35:57 Prof Michael Littman
about this as I am, because it's
00:35:59 Prof Michael Littman
just so cool.
00:36:01 Prof Michael Littman
They're just blocks of sand and metal and they can kind of think, right? You can kind of coax them to do things like a human brain.
00:36:10 Prof Michael Littman
And it just,
00:36:11 Prof Michael Littman
it always just made me so fascinated to know that you could, if you just, it's almost like,
00:36:18 Prof Michael Littman
like classic wizarding, right,
00:36:20 Prof Michael Littman
where you say magic words and then things happen. The magic words are code, right?
00:36:25 Prof Michael Littman
You actually can come up with the right incantation. If you say these words in this order, things are gonna get sorted.
00:36:30 Prof Michael Littman
Or if you say them in this other order, it's going to be a machine learning program, or you
00:36:34 Prof Michael Littman
say them in this
00:36:34 Prof Michael Littman
other order, it's going to be a computer game.
00:36:36 Prof Michael Littman
That's, like, borderline magic
00:36:38 Prof Michael Littman
to me.
00:36:39 Prof Michael Littman
So I think that's part of it. OK, so that's the personal story. The less personal story is I really do think that, because computing has reached so many people, like, one of the ways that computing was able to reach so many people is by making it
00:36:54 Prof Michael Littman
so you didn't have to code, you didn't have to be a computationally minded person
00:36:59 Prof Michael Littman
to be able to get the benefits of a personal computer, so that's great.
00:37:03 Prof Michael Littman
Or a website or whatever.
00:37:04 Prof Michael Littman
Whatever it is that you're interacting with.
00:37:06 Prof Michael Littman
Cell phone, right, so that's great.
00:37:09 Prof Michael Littman
But to get it into everybody's hands, they had to hide a lot of that expressiveness, right?
00:37:15 Prof Michael Littman
So some of the power had to be hidden, and I think what that does is it puts companies who are producing that code in a position of unusual power, because they get to control
00:37:27 Prof Michael Littman
what all of us get to
00:37:29 Prof Michael Littman
see and interact with.
00:37:30 Prof Michael Littman
And many of them are very responsible with that power.
00:37:34 Prof Michael Littman
Some of them are not so responsible with that power.
00:37:36 Prof Michael Littman
And I think that it has put us into a situation where we're kind of at the mercy of these companies to do the right thing.
00:37:44 Prof Michael Littman
And some of them aren't doing the right thing, and I think we need to kind of take some of that power back, and we're never going to be able to do that
00:37:50 Prof Michael Littman
unless more of us can actually talk to these machines, tell these machines what we actually want them to do.
00:37:56 Prof Michael Littman
And I think that could actually really change the whole dynamic between the way that human beings and society and machines and companies all come together, hopefully more productively.
00:38:06 Dr Genevieve Hayes
The way I look at it,
00:38:07 Dr Genevieve Hayes
if you were going to a foreign country where you don't speak the language, you would learn some parts of that language just so that you can interact with the people in that country, you know, just hello.
00:38:20 Dr Genevieve Hayes
How much does this cost?
00:38:21 Dr Genevieve Hayes
Where is the restroom?
00:38:23 Dr Genevieve Hayes
Things like that.
00:38:24 Dr Genevieve Hayes
You know, the basics, and I see
00:38:28 Dr Genevieve Hayes
learning some programming as being a lot like that.
00:38:30 Dr Genevieve Hayes
You know, you don't have to be fluent in a particular programming language, but you need to know just a little bit of it just so that you can interact with
00:38:40 Dr Genevieve Hayes
the natives of this foreign land, which happens to be computer land, where the natives are phones and computers.
00:38:49 Prof Michael Littman
I love that. That is a great analogy, because it also points out, I mean, if you're feeling a little paranoid, right, imagine that you go into one of these foreign countries and all you've got is, like, a local who's gonna be your translator and every interaction you have to go through that person. If that person is trustworthy, everything's fine. But if that person wants to exploit you,
00:39:10 Prof Michael Littman
there's not much you can do about it because they're
00:39:12 Prof Michael Littman
telling you, they're,
00:39:13 Prof Michael Littman
they're mediating, right?
00:39:14 Prof Michael Littman
All your experiences are going
00:39:16 Prof Michael Littman
through them, and you know, I hate to say it, but I think some of these companies have been doing that with us, that they get to decide that we're just gonna take your data and we're gonna sell it to other people and they're gonna,
00:39:25 Prof Michael Littman
they're gonna change your world view.
00:39:27 Prof Michael Littman
They're gonna,
00:39:27 Prof Michael Littman
they're gonna expose you to different things than you'd be exposed to, because we can,
00:39:31 Prof Michael Littman
because we're in control.
00:39:33 Prof Michael Littman
And so yeah, I really, I
00:39:34 Prof Michael Littman
love that.
00:39:35 Prof Michael Littman
Like the more that you can actually.
00:39:37 Prof Michael Littman
Say hello.
00:39:38 Prof Michael Littman
How much does that cost?
00:39:40 Prof Michael Littman
The more that you even know what's possible in that culture, right?
00:39:43 Prof Michael Littman
The less you can be exploited by people who who have that knowledge.
00:39:47 Dr Genevieve Hayes
Yeah. And you don't get in the taxi ride from the airport where the taxi driver convinces you that it's $1,000,000 to travel from the airport to your hotel.
00:39:55 Prof Michael Littman
Right, right.
00:39:56 Prof Michael Littman
And you've got no recourse whatsoever.
00:39:59 Prof Michael Littman
And so I,
00:39:59 Prof Michael Littman
so I do really think that that's important, that people not treat this as some kind of black box.
00:40:05 Prof Michael Littman
Not because I really think that it's important that everybody program all the time, but because it holds the others to account, right?
00:40:12 Prof Michael Littman
It means that they can't treat you like a like a commodity.
00:40:16 Prof Michael Littman
Because you're not.
00:40:17 Prof Michael Littman
You're a person with your own opinions and feelings and preferences, and you get to.
00:40:23 Dr Genevieve Hayes
People need to know enough about computing so that they can make informed decisions.
00:40:28 Dr Genevieve Hayes
They don't have to be able to be a software developer, but they need to be able to make informed decisions.
00:40:33 Dr Genevieve Hayes
Yeah, it's just like with investing your money, you need to know enough finance so that you don't succumb to a scam.
00:40:41 Prof Michael Littman
I think, yeah.
00:40:42 Prof Michael Littman
No, that's that is brilliant.
00:40:43 Prof Michael Littman
I think that economists refer to this as the principal-agent problem, where the idea is that there's the principal,
00:40:49 Prof Michael Littman
there's you, who actually wants something, but you're working through an agent.
00:40:53 Prof Michael Littman
And if that agent is at all corrupt, then you're stuck, right?
00:40:56 Prof Michael Littman
But on the other hand, if you were going to do it all yourself, then you don't get the benefit of
00:41:00 Prof Michael Littman
an agent who actually has more expertise.
00:41:02 Prof Michael Littman
And so what do you do to actually kind of navigate that relationship?
00:41:07 Prof Michael Littman
And I love that you said that we don't,
00:41:09 Prof Michael Littman
it's not about making everybody a software developer.
00:41:11 Prof Michael Littman
This is a battle, no, maybe battle's the wrong word, a disagreement that I've had with
00:41:15 Prof Michael Littman
some of my
00:41:15 Prof Michael Littman
colleagues who study programming languages and computer science education, where I'm a big fan of Scratch.
00:41:22 Prof Michael Littman
This is a programming language that's all very graphical, with blocks that click together.
00:41:26 Prof Michael Littman
It's really quick to pick up.
00:41:29 Prof Michael Littman
It's often taught to kids, but I actually think it's great for
00:41:32 Prof Michael Littman
everybody, and I say this in front of my professional colleagues, who say you're dumbing things down.
00:41:39 Prof Michael Littman
You're making things worse.
00:41:40 Prof Michael Littman
You're gonna create a group of people who really have no business writing code because they view it as ultimately the purpose of writing code is to become a software developer and and actually write code that other people are going to use.
00:41:54 Prof Michael Littman
And so it's almost like the debate that you see sometimes with people who teach literature, right, where they'll say, well, we have to teach them everything we have to teach them how to be great writers.
00:42:06 Prof Michael Littman
Because otherwise they're gonna be terrible writers. Like, no, everybody needs to learn how to use language to get their thing done. Even if they're not gonna be authors, even if no one's ever gonna read their stuff but themselves, they still need those skills.
00:42:19 Prof Michael Littman
And so things like graphic novels, right, which you might pooh-pooh as, like,
00:42:23 Prof Michael Littman
well, that's not real literature. Like, yeah, but it's still people using language, and it's helping people understand how to express things in language.
00:42:30 Prof Michael Littman
And I guess I view languages like Scratch as being a little bit like graphic novels.
00:42:34 Prof Michael Littman
It's like, not the best way to train a software professional, but for the rest of us it's fine. It's a great way of getting those concepts, being able to express yourself, and connecting with the process of telling
00:42:47 Prof Michael Littman
machines what to do.
00:42:48 Dr Genevieve Hayes
I sort of see it as being like, I don't know.
00:42:51 Dr Genevieve Hayes
Do you do any art like painting or drawing or anything like that?
00:42:55 Prof Michael Littman
Not good.
00:42:58 Prof Michael Littman
But I like to,
00:42:59 Prof Michael Littman
I like to make, like, visual puzzles.
00:43:01 Prof Michael Littman
Like, I like to take photos and manipulate them, and then it's like a thing
00:43:04 Prof Michael Littman
you have to figure out.
00:43:05 Dr Genevieve Hayes
OK, well, if you go to an art supply shop, they will have different ranges of products. And you know, with paint, there's always the student grade paints and the professional grade paints and
00:43:16 Dr Genevieve Hayes
somewhere, something
00:43:17 Dr Genevieve Hayes
in the middle. And a person who wants to learn to paint does not buy professional grade paints, because you get tiny little tubes that cost a small fortune.
00:43:26 Dr Genevieve Hayes
You buy these student grade paints, which are good enough to get the job done, and then you can make a mess and all that. And a lot of people will never graduate beyond student grade paints.
00:43:38 Dr Genevieve Hayes
They're learning how to paint.
00:43:39 Dr Genevieve Hayes
They're getting the joy of painting whatever it is they want to paint.
00:43:44 Dr Genevieve Hayes
And you know, you can have multiple markets for things, and there's no shame in buying the cheap student paints.
00:43:52 Prof Michael Littman
That's great.
00:43:53 Prof Michael Littman
So OK, so now here's how people might push back in the software case. They would say, OK, but that's a waste, because you're gonna learn bad habits with the student paint, because it's different from the professional paint.
00:44:05 Prof Michael Littman
It just acts different.
00:44:06 Prof Michael Littman
So you're, like, training people to use this one thing, and then you have to untrain them and retrain them to do this
00:44:12 Prof Michael Littman
other thing, and that's just a waste of everybody's time and
00:44:15 Prof Michael Littman
energy, but
00:44:17 Dr Genevieve Hayes
You could also have a situation where, I know when I was a kid, if I had something really expensive, like, not that I ever had the professional grade paints,
00:44:26 Dr Genevieve Hayes
I think I had just student grade paints, I would have been too scared to use it, because I would know that I wasn't good enough to use it, so I would just be wasting it. Whereas by actually having something where I wasn't scared of
00:44:39 Dr Genevieve Hayes
running up a massive
00:44:40 Dr Genevieve Hayes
bill, I was able to get to the point where, OK, I'll never get to that professional grade paint level,
00:44:48 Dr Genevieve Hayes
but where I knew that I didn't want to go to that level. And it might have led to a different decision: I might have decided I should go to that level.
00:45:00 Prof Michael Littman
OK.
00:45:00 Prof Michael Littman
That makes sense.
00:45:01 Prof Michael Littman
That makes sense.
00:45:02 Prof Michael Littman
It is it.
00:45:03 Prof Michael Littman
I do find it really interesting, though, that that's the argument that the software people will make.
00:45:06 Prof Michael Littman
They'll say they're gonna learn bad habits, that we're gonna have to train them, train software engineers out of.
00:45:12 Prof Michael Littman
I guess they don't see the hobbyist as an end goal, right?
00:45:17 Prof Michael Littman
Whereas in art, you could certainly have people who are like, look, I'm never gonna sell my paintings, but it really makes me happy every month or so to sit down and paint a landscape.
00:45:26 Prof Michael Littman
Right.
00:45:26 Prof Michael Littman
And I think that there's not a recognition among a lot of software professionals, a lot of educators,
00:45:32 Prof Michael Littman
that some people just wanna have that as a skill, even if it's not something that they're leaning on professionally, that there's benefits to being able to
00:45:40 Prof Michael Littman
do this short of making your living this way.
00:45:43 Dr Genevieve Hayes
Just thinking of another example of that, that's a counterexample to the strictly purist approach, I guess.
00:45:51 Dr Genevieve Hayes
Did you ever read any of Raymond Smullyan's puzzle books when you were younger?
00:45:56 Prof Michael Littman
I did, yeah, yeah. The logician, right?
00:45:59 Dr Genevieve Hayes
Yeah, the logician.
00:46:00 Dr Genevieve Hayes
So yeah, he had all these,
00:46:01 Dr Genevieve Hayes
he went in for those,
00:46:02 Dr Genevieve Hayes
the truth tellers and the liars.
00:46:04 Dr Genevieve Hayes
So you had to ask people questions and figure out if they were truth tellers or liars. And they had a copy of one of those books in my local library.
00:46:13 Dr Genevieve Hayes
I think I borrowed it about 20 times.
00:46:16 Dr Genevieve Hayes
I have read it so many times.
00:46:17 Dr Genevieve Hayes
I have my own.
00:46:18 Dr Genevieve Hayes
You know, I spent so much time reading that when I was a kid, and I did not understand logic at that time, but 12-year-old me managed to muddle my way through logic really badly and probably learned all sorts of bad things.
00:46:38 Dr Genevieve Hayes
And that would have had to have been retrained out of me at some future point when I did maths or logic or computer science, but
00:46:47 Dr Genevieve Hayes
I still learned stuff, even if I learned it badly.
00:46:50 Prof Michael Littman
Right.
00:46:50 Prof Michael Littman
And you built up some kind of conceptual foundation that you're still using today, and you liked it, right?
00:46:58 Prof Michael Littman
Like, you felt motivated to continue in that area, whereas if we had, like, hit you right off the bat with, no, no, no, no, we're doing mathematical logic, we're gonna do Gödel's theorem, but we're gonna do it in
00:47:10 Prof Michael Littman
gruesome detail, many of us, maybe not you, but many people would have been
00:47:15 Prof Michael Littman
just like, no, I'm off the bus now.
00:47:16 Prof Michael Littman
I'm not.
00:47:17 Prof Michael Littman
I'm getting off.
00:47:18 Prof Michael Littman
This is not a ride I want to take.
00:47:19 Prof Michael Littman
And so, yeah, there's tremendous benefits to that.
00:47:22 Prof Michael Littman
But I think there's a bias in computer science, a really strong bias against waste, right?
00:47:28 Prof Michael Littman
There's a sense in which everything has to be completely efficient. If you have to redo something a second time,
00:47:34 Prof Michael Littman
that's an error. And I get that as a design goal, right?
00:47:38 Prof Michael Littman
So like when you're writing code, you don't want to write the same
00:47:40 Prof Michael Littman
code in multiple places,
00:47:42 Prof Michael Littman
where it's gonna be hard to maintain and you're
00:47:44 Prof Michael Littman
not gonna get it right each time.
00:47:46 Prof Michael Littman
Like, you wanna,
00:47:48 Prof Michael Littman
you wanna keep things,
00:47:49 Prof Michael Littman
say the one thing
00:47:50 Prof Michael Littman
the one time, the right way, and don't waste any cycles on anything. But the fact of the matter is
00:47:56 Prof Michael Littman
There's a lot of fields that don't have that as a design goal.
00:47:59 Prof Michael Littman
I watch, like, TV production people, and it just blows me away how much stuff they throw away.
00:48:05 Prof Michael Littman
Because when you map out a movie,
00:48:07 Prof Michael Littman
you make the entire
00:48:09 Prof Michael Littman
movie wrong
00:48:11 Prof Michael Littman
once. You do it, like, on paper, sometimes with audio and music and special effects. It's the storyboard version of the movie that people will watch to make sure that it kind of flows.
00:48:23 Prof Michael Littman
Then they throw it all away and they replace it with the real stuff.
00:48:26 Prof Michael Littman
And I look at that and the computer scientist in me
00:48:30 Prof Michael Littman
cringes. I'm like, no, they need a tool where they design stuff on the computer and then the same things are reused again. Just paste it into place, don't throw anything away.
00:48:39 Prof Michael Littman
That's crazy. But the fact of the matter is that is how they do it, and they do it that way for a reason, right? There's
00:48:45 Prof Michael Littman
actually value in throwing things away as part of the process.
00:48:50 Prof Michael Littman
And I think computer scientists, we just have not embraced that as a design goal.
00:48:55 Prof Michael Littman
We just want to do it once and never do it again, because that's perfection.
00:49:00 Dr Genevieve Hayes
In the National Gallery of Victoria, there is a massive painting that's like the size of a whole wall, you know, a two-storey-high wall, that is some famous artist's rough draft of a picture. I mean, this guy painted something that's probably, you know, 3 or 4 metres tall by
00:49:20 Dr Genevieve Hayes
five or six metres wide as a rough draft before he did the same thing again.
00:49:28 Prof Michael Littman
That's awesome. Yeah.
00:49:29 Prof Michael Littman
And I don't think that bothers artists, right?
00:49:32 Prof Michael Littman
It's like, yes, that's how you do it.
00:49:34 Prof Michael Littman
And again, as a computer scientist, even though I'm making the case that we should be embracing
00:49:40 Prof Michael Littman
that,
00:49:41 Prof Michael Littman
ohh, I can't.
00:49:41 Prof Michael Littman
I just can't.
00:49:42 Prof Michael Littman
Like, no, do it once and then project it onto the, like,
00:49:45 Prof Michael Littman
find a way to reuse that work, because otherwise you're just wasting energy.
00:49:50 Prof Michael Littman
You're wasting effort. You know, big O of log N, not big O of N^2.
00:49:54 Prof Michael Littman
Sorry,
00:49:55 Prof Michael Littman
maybe that's not a generally accessible joke, but the point is,
00:49:59 Prof Michael Littman
yeah, shave off all the waste.
00:50:01 Prof Michael Littman
It's got to be
00:50:02 Prof Michael Littman
as bare bones as possible. And maybe we need to embrace a little bit of waste.
00:50:08 Dr Genevieve Hayes
Do it the first time to get it to work, and then the second time to get it good.
00:50:12 Dr Genevieve Hayes
Yeah, yeah.
00:50:13 Dr Genevieve Hayes
Obviously you have just said that you think Scratch is a good first programming language for people. Once they move beyond Scratch,
00:50:21 Dr Genevieve Hayes
what programming languages do you think are the most accessible to people?
00:50:25 Prof Michael Littman
I mean, everybody's using Python for everything now, so Python is obviously just a huge juggernaut.
00:50:31 Prof Michael Littman
I feel like when I was younger, when I was, you know, just starting out,
00:50:37 Prof Michael Littman
there was always this desire:
00:50:38 Prof Michael Littman
there should be libraries, we should be able to build on each other's
00:50:41 Prof Michael Littman
code. But then everybody who was anybody would not pay attention to other people's libraries, because if you didn't write it yourself, you didn't really understand it.
00:50:49 Prof Michael Littman
You didn't really know how to use it, and it probably
00:50:51 Prof Michael Littman
has waste in it.
00:50:52 Prof Michael Littman
We know how we feel about waste.
00:50:53 Prof Michael Littman
So I feel like Python somehow finally got people past that discomfort with libraries, and now you can really embrace it.
00:51:01 Prof Michael Littman
It's like,
00:51:01 Prof Michael Littman
OK, I wanted to use the Hungarian optimization algorithm.
00:51:05 Prof Michael Littman
I don't have to implement it.
00:51:07 Prof Michael Littman
I actually found the library and I called it.
00:51:09 Prof Michael Littman
And it did what
00:51:09
it was supposed to do.
00:51:10
It was.
00:51:11 Prof Michael Littman
So I do feel like Python has done a really good job with that.
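As a minimal sketch of the kind of library call described here (an illustration only, not the code or the library the speaker actually used): SciPy's `linear_sum_assignment` solves the assignment problem that the Hungarian algorithm addresses, though SciPy's internal method may differ from the classic Hungarian procedure. The cost matrix and the worker/task framing are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost matrix: cost[i][j] is the cost of giving task j to worker i.
cost = np.array([
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
])

# One library call finds the minimum-cost one-to-one assignment;
# no hand-written optimization code is needed.
workers, tasks = linear_sum_assignment(cost)
total_cost = int(cost[workers, tasks].sum())

for w, t in zip(workers, tasks):
    print(f"worker {w} -> task {t} (cost {cost[w, t]})")
print("total cost:", total_cost)
```

The caller only has to describe the problem as a matrix; the optimization itself stays inside the library.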
00:51:14 Prof Michael Littman
I just had lunch because, you know, I guess I'm a nerd.
00:51:18 Prof Michael Littman
But I had lunch with somebody this past weekend, and he was just regaling me with his love for Rust, a programming language where you can't make pointer errors. Like, that was what he was super excited about.
00:51:28 Prof Michael Littman
I'm not so worked up about pointer
00:51:30 Prof Michael Littman
errors in general. Like, maybe that's just not the
00:51:32 Prof Michael Littman
kind of code that I tend to write.
00:51:33 Prof Michael Littman
But boy,
00:51:34 Prof Michael Littman
he was like, I wish I could start all over again, 'cause I would just learn Rust.
00:51:38 Prof Michael Littman
So some people feel strongly about Rust.
00:51:40 Prof Michael Littman
Julia is a language that to me looks beautiful, and I want to program
00:51:46 Prof Michael Littman
in Julia, just because it looks like
00:51:48 Prof Michael Littman
the way I want to think, but I've not learned it.
00:51:50 Prof Michael Littman
I did teach one reinforcement learning class in Julia, not knowing Julia, because the book used Julia, and I think all the students ended up doing everything in Python anyway.
00:52:01 Dr Genevieve Hayes
What's your go-to language of choice?
00:52:03 Prof Michael Littman
I have been known to use Scratch for things that I need.
00:52:06 Prof Michael Littman
Like, sometimes I want a thing that's gonna beep every 5 minutes, because otherwise,
00:52:10 Prof Michael Littman
like, sometimes when I'm working, I feel like, OK, I have a
00:52:14 Prof Michael Littman
tendency to either
00:52:15 Prof Michael Littman
stay on the surface and not really get anything done, or go deep,
00:52:19 Prof Michael Littman
lose track of time, and then, like, **** people off because I was supposed to be doing something.
00:52:23 Prof Michael Littman
And so having something that just wakes me up, just like every 5 minutes, brings me back into the world.
00:52:28 Prof Michael Littman
And I know that I don't have to keep track of time.
00:52:30 Prof Michael Littman
Like, that's actually tremendously valuable.
00:52:32 Prof Michael Littman
And so what language would you write that in?
00:52:35 Prof Michael Littman
So I wrote it in Scratch.
00:52:36 Prof Michael Littman
It's two lines.
00:52:37 Prof Michael Littman
It's like, wait for, you know, 60 * 5 seconds,
00:52:41 Prof Michael Littman
then make a meow like a cat, and just put that in an infinite loop, and
00:52:47 Prof Michael Littman
it was, like, super fast.
00:52:48 Prof Michael Littman
It did what
00:52:48 Prof Michael Littman
it needed to do, and I didn't have to think about it.
00:52:51 Prof Michael Littman
It was, yeah.
00:52:51 Prof Michael Littman
So sometimes I'll use Scratch, but if I'm actually getting things done, it's Python.
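For comparison, here is a rough Python sketch of the same wait-then-meow loop. It is an illustration rather than a translation of the actual Scratch project: the names `remind` and `meow` are hypothetical, and the `repeats` and `sleep` parameters are additions so the loop can be stopped and tried out quickly.

```python
import time
from typing import Callable, Optional


def remind(interval_seconds: float,
           alert: Callable[[], None],
           repeats: Optional[int] = None,
           sleep: Callable[[float], None] = time.sleep) -> int:
    """Wait, alert, repeat. Loops forever when repeats is None."""
    count = 0
    while repeats is None or count < repeats:
        sleep(interval_seconds)  # the "wait 60 * 5 seconds" block
        alert()                  # the "make a meow" block
        count += 1
    return count


def meow() -> None:
    # Stand-in for Scratch's cat sound: print the word and ring the terminal bell.
    print("meow\a")


# Quick demo with a tiny interval so it finishes almost immediately.
remind(0.01, meow, repeats=2)
```

Calling `remind(60 * 5, meow)` would give the five-minute version, running until interrupted. The core is still two lines inside a loop, which is roughly the point being made about Scratch.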
00:52:56 Dr Genevieve Hayes
Yeah, most people I have interviewed on this program seem to be Python fans, which is possibly saying something about the guests
00:53:02 Dr Genevieve Hayes
I'm choosing more than I.
00:53:05 Prof Michael Littman
I don't know, 'cause,
00:53:06 Prof Michael Littman
as you know, you took the machine learning class with me and
00:53:09 Prof Michael Littman
Charles. One of the things that's amazing is that for the assignments in the machine learning class, we don't care about the code.
00:53:16 Prof Michael Littman
We want to see what the machine learning algorithms do.
00:53:18 Prof Michael Littman
We want you to talk about the output of these machine learning algorithms, and the students are like, well, what language do you want me to write it in?
00:53:25 Prof Michael Littman
And Charles says, I don't
00:53:26 Prof Michael Littman
want you to write it at
00:53:27 Prof Michael Littman
all. I don't care what language it's in.
00:53:29 Prof Michael Littman
You can run a library for all I care.
00:53:31 Prof Michael Littman
You don't have to.
00:53:32 Prof Michael Littman
This is not about you implementing the code.
00:53:34 Prof Michael Littman
This is about running that code and understanding the output. People are like,
00:53:38 Prof Michael Littman
yeah, but it can be Python, right?
00:53:40 Prof Michael Littman
So students just,
00:53:43 Prof Michael Littman
first of all, they have a difficult time imagining that a computer science teacher would not care what language they use.
00:53:49 Prof Michael Littman
And second of all, almost everybody uses Python.
00:53:52 Prof Michael Littman
It's the rare student who doesn't
00:53:54 Prof Michael Littman
end up using Python. So if you're saying you're taking weird guests, and that's why you're getting a biased sample and everybody's Python,
00:54:02 Prof Michael Littman
I think it's most of the field.
00:54:04 Prof Michael Littman
And that's your bias right there.
00:54:05 Prof Michael Littman
You're choosing people in,
00:54:08 Prof Michael Littman
Getting R's great. I do stuff in R sometimes.
00:54:11 Dr Genevieve Hayes
I like Python.
00:54:12 Dr Genevieve Hayes
Python is my first choice, but at the moment I'm doing a project that is using a lot of stats that are psychometric statistics, and the Python packages just aren't that great, because all of the research psychologists use R. So
00:54:31 Dr Genevieve Hayes
I've actually written it in R, because the best packages for it are in R.
00:54:36 Dr Genevieve Hayes
But for everything else, I'm gonna go back to Python.
00:54:40 Prof Michael Littman
That makes total sense to me.
00:54:41 Prof Michael Littman
Yeah, yeah.
00:54:41 Prof Michael Littman
I mean, the reason I picked up R is because early on in my career, I shared an office with a statistician.
00:54:47 Prof Michael Littman
And whenever I was like, but how do I do this?
00:54:49 Prof Michael Littman
I wanna make this kind of picture.
00:54:50 Prof Michael Littman
I wanna make this kind of graph to put in the paper.
00:54:52 Prof Michael Littman
She would just, like, bang it out really
00:54:54 Prof Michael Littman
fast, and I'm like, yeah,
00:54:56 Prof Michael Littman
I should probably
00:54:56 Prof Michael Littman
learn this language.
00:54:58 Prof Michael Littman
It was also a lot like the first
00:55:00 Prof Michael Littman
language that I was ever formally taught. So I did a lot of self-teaching in high school, and then I got to college.
00:55:07 Prof Michael Littman
I took a computer science class.
00:55:08 Prof Michael Littman
The first language was APL.
00:55:11 Prof Michael Littman
So I don't even know
00:55:12 Prof Michael Littman
if you know what APL is, but
00:55:13 Prof Michael Littman
it's this crazy language. If you know the surface appearance of it, we used to say it looks like line noise. It looks like what a computer does if it's spitting out random characters, because it's all crazy overlaid characters,
00:55:28 Prof Michael Littman
funny Greek symbols.
00:55:29 Prof Michael Littman
It's not,
00:55:30 Prof Michael Littman
you can't even,
00:55:30 Prof Michael Littman
you need a special keyboard to type it. But it's basically
00:55:34 Prof Michael Littman
R, because the way that you're supposed to think about the data structures is you do operations on the entire data structure. If you
00:55:41 Prof Michael Littman
write a loop,
00:55:42 Prof Michael Littman
you've made a mistake.
00:55:43 Prof Michael Littman
You should never write a loop.
00:55:44 Prof Michael Littman
You should always be acting on the data structure
00:55:47 Prof Michael Littman
as a
00:55:48 Prof Michael Littman
whole, and I feel like that's how R wants you to write things too.
00:55:52 Prof Michael Littman
You don't want to write loops in R; you wanna write operations that hit the data structure all at once.
00:55:57 Prof Michael Littman
You know, add these two matrices together.
00:55:59 Prof Michael Littman
Boom. You don't loop over all the positions and add the corresponding components, you just, like,
00:56:04 Prof Michael Littman
add the matrices, and
00:56:06 Prof Michael Littman
that's how APL tries to get
00:56:07 Prof Michael Littman
you to write code.
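The contrast between looping over positions and operating on the whole structure can be sketched in Python with NumPy, used here as a stand-in for the array style of APL and R. The matrices `a` and `b` are made-up values for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

# Loop style: visit every position and add the corresponding components.
looped = np.zeros_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        looped[i, j] = a[i, j] + b[i, j]

# Whole-array style: one operation applied to the entire data structure.
whole = a + b

print(whole)
```

Both versions produce the same matrix; the second states the intent ("add the matrices") directly and leaves the iteration to the library.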
00:56:09 Dr Genevieve Hayes
I worked for an organization that had some critical code that was written in APL, and they paid someone a lot of money to translate it from APL into SAS, which is what a lot of insurers use, because they were scared out of their minds that all of the APL
00:56:29 Dr Genevieve Hayes
programmers would either retire or die, and then they would never be able to use this program ever.
00:56:36 Prof Michael Littman
And we used to say in the class, APL is a write-only language. Like, we learned in the class how to express really complicated ideas in these crazy overlaid symbols.
00:56:47 Prof Michael Littman
But then if you even look at your own code, certainly not anybody else's code, you look at your own code, you're like, I have no idea how this is doing what it's supposed to be doing.
00:56:56 Prof Michael Littman
It was completely uninterpretable because, you know, things were acting on other things.
00:57:01 Prof Michael Littman
Like, you try not to put conditionals in there.
00:57:03 Prof Michael Littman
You hit these operations against the data structures.
00:57:06 Prof Michael Littman
It was very messy too, so I can see why translating it to some other language is probably a really good idea.
00:57:13 Dr Genevieve Hayes
Is there anything on your radar in the AI, data and analytics space that you think is going to become important in the next three to five years?
00:57:21 Prof Michael Littman
Ah, I mean, to whom, right?
00:57:26 Prof Michael Littman
So, I mean, I'm really excited about this,
00:57:28 Prof Michael Littman
well, the topic in general, as I said, of my book, which is about telling machines what to do. These large language models that can kind of help you program are absolutely fascinating to me, because you can kind of describe what you want in words and then code comes out.
00:57:43 Prof Michael Littman
Nobody's sure exactly how they do it, but it doesn't seem to be by really understanding, like, internally constructing a formal specification and then translating that into code.
00:57:52 Prof Michael Littman
It's something about the way people talk about these processes in natural language that they know how to translate somehow into working code,
00:58:02 Prof Michael Littman
in a flexible enough way that it's not just cut and paste. It actually is kind of regenerating it around
00:58:07 Prof Michael Littman
the problem that you've currently got. I don't know where that's going.
00:58:11 Prof Michael Littman
It's scary to use it, because if you don't know how to code,
00:58:16 Prof Michael Littman
it generates code for you that you can't understand, and it may or may not be right. Like, it tends to be,
00:58:21 Prof Michael Littman
it can be a little bit off, and I think part of the reason that it's off is because it's really hard to tell machines what you want them to do.
00:58:28 Prof Michael Littman
It's hard to tell people what you want them to do, and part of the way that we actually do that in practice is by having a conversation. Like, I'll
00:58:35 Prof Michael Littman
explain something to you and you'll be like,
00:58:37 Prof Michael Littman
what do you mean when you say such and such?
00:58:39 Prof Michael Littman
And then I can explain that piece and you're like, but what happens if these two things happen together? And I'm like, ohh, actually you're right.
00:58:45 Prof Michael Littman
I should have said that differently.
00:58:47 Prof Michael Littman
And then so we'll have a back and forth until you feel like, in your head,
00:58:50 Prof Michael Littman
you have, like, a runnable
00:58:52 Prof Michael Littman
version of that. I see programming going that way: we'll have a conversation with the machine where we express what we want, but then the machine comes back at us with,
00:59:03 Prof Michael Littman
Yeah, but is that really what you mean?
00:59:05 Prof Michael Littman
Or if that's what you mean, wouldn't this weird thing happen?
00:59:08 Prof Michael Littman
Like really being able to try to infer the intent?
00:59:12 Prof Michael Littman
And I think one of the dangers of
00:59:14 Prof Michael Littman
the current systems,
00:59:16 Prof Michael Littman
these systems where you give them natural language and they spit out code, is that
00:59:21 Prof Michael Littman
they make very strong guesses about intent.
00:59:24 Prof Michael Littman
Whatever you say,
00:59:26 Prof Michael Littman
they're gonna try to induce some kind of intent and then write to that intent.
00:59:29 Prof Michael Littman
And if they don't understand the intent, well, they're gonna be wrong, and they're not gonna actually match your intent.
00:59:35 Prof Michael Littman
They don't really reason about, wait a second,
00:59:38 Prof Michael Littman
what could you actually mean by that?
00:59:39 Prof Michael Littman
They just make a guess and they go with it.
00:59:41 Prof Michael Littman
And this is kind of like the hallucination phenomenon in ChatGPT,
00:59:46 Prof Michael Littman
where, you know, it'll make stuff up.
00:59:49 Prof Michael Littman
This is like,
00:59:50 Prof Michael Littman
I don't really understand your intent, so I'm just gonna make something up.
00:59:52 Prof Michael Littman
And that's bad when you're actually trying to convey something to a machine because you don't want it to make stuff up.
00:59:57 Prof Michael Littman
You want it to be in line with what you want.
00:59:59 Dr Genevieve Hayes
My experience with trying to get ChatGPT to write code is every time I've done it, it's come back with incorrect code, and I think this actually is making the case for your book, because I know it's incorrect because I can
01:00:12 Dr Genevieve Hayes
code. But if I couldn't code, or at least didn't know some code, then I could theoretically just take that code, use it, and then get the wrong results.
01:00:23 Prof Michael Littman
Yeah, you're totally right.
01:00:24 Prof Michael Littman
And I think
01:00:24 Prof Michael Littman
that observation is
01:00:25 Prof Michael Littman
really important, because there are people who should know better who are out there saying programming is dead, like, we don't need to.
01:00:32 Prof Michael Littman
It's almost like I was talking about before, when many of the students left computer science because, like, you'll never need to program.
01:00:37 Prof Michael Littman
That's all gonna be handled by people in another country.
01:00:40 Prof Michael Littman
Now people are saying you don't need to learn how to program because it'll just be handled by a large language model.
01:00:45 Prof Michael Littman
And I'm thinking that's just wrong on so many levels.
01:00:48 Prof Michael Littman
First of all, the language models are not going to be able to debug
01:00:51 Prof Michael Littman
themselves, and second of all, even telling another person, like, even expressing what you want in natural language to a machine, requires the ability to conceptualize what it is that you're trying to do.
01:01:03 Prof Michael Littman
And that's a skill that programmers have to learn.
01:01:07 Prof Michael Littman
But we all have to learn it, because we all are talking to each other and trying to, you know, convince people to do things
01:01:12 Prof Michael Littman
for us. That's not trivial.
01:01:14 Prof Michael Littman
Yeah, I just don't buy it.
01:01:16 Prof Michael Littman
I don't buy that this is gonna somehow become magically easy because it's not even easy to talk to people.
01:01:21 Dr Genevieve Hayes
I think it'll just take programmers and basically turn them into the managers of robotic workers.
01:01:30 Prof Michael Littman
And if you've ever taken any management classes, which, like, everybody should do, because at some point you're gonna be needing to tell other people what to do,
01:01:38 Prof Michael Littman
it's not trivial, and communication
01:01:40 Prof Michael Littman
is so important.
01:01:41 Prof Michael Littman
Like, you need to understand what the other person is seeing.
01:01:43 Prof Michael Littman
How are they interpreting it? That back and forth is not a trivial skill, and you're right.
01:01:49 Prof Michael Littman
Ultimately that may be what programming is: we're delegating our tasks to these machines, but we have to have kind of a relationship with the machines
01:01:58 Prof Michael Littman
to make sure that they're doing what we actually want
01:02:00 Prof Michael Littman
them to do.
01:02:01 Dr Genevieve Hayes
And what final advice would you give to data scientists looking to create business value from?
01:02:08 Prof Michael Littman
There are no shortcuts.
01:02:09 Prof Michael Littman
I feel like that is the most important message. I think a lot of people who haven't really gotten their hands dirty with data and really, like, learned how to be a data scientist think that it's just easy: you have data, you have a problem,
01:02:23 Prof Michael Littman
we're done.
01:02:24 Prof Michael Littman
And the fact of the matter is there's no shortcuts. You have to
01:02:27 Prof Michael Littman
engage with the problem in a really deep and fundamental way, and if you think that's wasted effort, you don't understand how hard real problems are.
01:02:36 Dr Genevieve Hayes
One of the lessons I learned when I was doing my PhD: I tried to cut so many corners in there, and every time I cut a corner,
01:02:45 Dr Genevieve Hayes
it came back to bite me, and I ended up having to do the whole thing again from scratch the proper way. And it's like, if I had just done this properly to begin with, I would have been finished by now.
01:02:59 Dr Genevieve Hayes
So it taught me a valuable lesson in not cutting corners.
01:03:03 Prof Michael Littman
No, that is a good lesson to learn.
01:03:06 Prof Michael Littman
And I think that a lot of people need to learn this lesson.
01:03:08 Prof Michael Littman
I think that we're gonna be hearing more and more over the next couple of years from the big tech companies, like, "No, no, no,
01:03:14 Prof Michael Littman
we're gonna magically solve all your problems for you."
01:03:17 Prof Michael Littman
And one of the mantras that I tell myself is: if a problem is hard enough that you need AI to help you solve it, or data science to help you solve it, then it's hard enough that AI alone and data science alone are not going to be able to solve it. You actually need a kind of partnership that involves really deeply understanding the actual problem
01:03:37 Prof Michael Littman
and these mathematical and computational tools. There's just no way around it.
01:03:42 Dr Genevieve Hayes
So for listeners who want to learn more about you or get in contact, what can they do?
01:03:47 Prof Michael Littman
So I have a Twitter handle.
01:03:49 Prof Michael Littman
I think as of today Twitter is still a thing.
01:03:53 Prof Michael Littman
Who knows? It varies from day to day. And I have a website, and I have a podcast.
01:04:00 Prof Michael Littman
And so if people want to hear, you know, me and a colleague of mine, Dave Ackley, kind of talking about computation
01:04:07 Prof Michael Littman
and sometimes interviewing some really interesting people,
01:04:10 Prof Michael Littman
they're welcome to join in for that.
01:04:12 Dr Genevieve Hayes
What's the podcast now?
01:04:14 Prof Michael Littman
It's called Computing Up. So the person
01:04:17 Prof Michael Littman
I do this with,
01:04:17 Prof Michael Littman
Dave Ackley,
01:04:18 Prof Michael Littman
he studies artificial life, and so he's really interested in the relationship between computation and intelligence and life.
01:04:27 Prof Michael Littman
And so from his perspective, everything sits on a computational foundation.
01:04:32 Prof Michael Littman
And so it's "computing up", like everything else is just on top of that.
01:04:36 Dr Genevieve Hayes
I will put a link to it in
01:04:37 Dr Genevieve Hayes
the show notes. Thanks so much.
01:04:39 Dr Genevieve Hayes
And thank you for joining me.
01:04:41 Prof Michael Littman
Oh, my pleasure.
01:04:42 Prof Michael Littman
It was great conversing with you.
01:04:43 Prof Michael Littman
I'm really, really happy to get to talk to your audience.
01:04:46 Prof Michael Littman
And it was a real pleasure.
01:04:48 Prof Michael Littman
Thank you.
01:04:49 Dr Genevieve Hayes
And for those in the audience, thank you for listening.
01:04:52 Dr Genevieve Hayes
I'm Dr Genevieve Hayes, and this has been Value Driven Data Science, brought to you by Genevieve Hayes Consulting.
