Episode 48: Overcoming the Machine Learning Deployment Challenge


[00:00:00] Dr Genevieve Hayes: Hello and welcome to Value Driven Data Science, brought to you by Genevieve Hayes Consulting. I'm Dr. Genevieve Hayes, and today I'm joined by Dr. Eric Siegel to discuss why machine learning projects fail to deploy and BizML, a six-step process you can follow to ensure your projects succeed. Eric is a leading machine learning consultant and the CEO and co-founder of Gooder AI. He is also the founder of the long-running Machine Learning Week conference series, author of the bestselling Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die and the recently released The AI Playbook, and host of The Dr. Data Show podcast. Eric, welcome to the show.
[00:00:48] Dr Eric Siegel: Thanks, Genevieve. Thanks so much for having me.
[00:00:50] Dr Genevieve Hayes: It's been 12 years since Thomas H. Davenport and DJ Patil first declared data science to be the sexiest job of the 21st century. And in that time, a lot has changed. Universities have started offering data science degrees, the number of data scientists has grown exponentially, and generative AI technologies such as ChatGPT and DALL-E have transformed the world. Yet, throughout that time, one thing has remained the same. Most machine learning projects still fail to deploy. But it's not the technical capabilities of data scientists that let them down. Those are now better than ever before. Rather, it's the lack of a well established business practice that is almost always to blame.
[00:01:37] And Eric, this shortcoming is something you've attempted to overcome with the BizML framework that you outlined in your recently released book, The AI Playbook. However, before we start discussing BizML, Eric, your background is very much as an ML cheerleader. Your first book, Predictive Analytics, is all about the exciting potential of machine learning.
[00:02:02] And you even went so far as to write and perform a song about the wonders of ML, the surprisingly catchy Predict This. What happened to cause you to make the shift from optimistic ML evangelist to wary ML realist?
[00:02:18] Dr Eric Siegel: I am a bit cautionary these days, although still optimistic, but it's basically what you said. These projects are still not getting deployed. So the advent of machine learning, data science becoming the so-called sexiest profession, initially that seemed like great news. I was like, I can't believe this.
[00:02:40] I always thought it was reserved for firefighters. Right. How did the nerds get so cool? But of course, all along, I was like, this is the best thing since sliced bread, learning from data to predict, and all the potential value of that for all the main large-scale operations. And yet, I've been in the field for more than 30 years, and at first it was great that things really took off and there was so much excitement, but the actual, not just generation of value, but capturing of value.
[00:03:09] That's what happens when you deploy, right? The number crunching isn't valuable in and of itself. It's not intrinsically valuable. It's only valuable if you act on it. That's the deployment. You can't improve operations without changing them. The holy grail for changing them is per case prediction. Yes. Get that model together to do the prediction.
[00:03:29] You've done the rocket science. Potentially you think you've done the hard part, but no, it's convincing the organization, the stakeholders, the decision makers to actually make that change. And there's a certain piece missing, not a technical piece, a procedural piece, which is leading to this sort of endemic routine failure.
[00:03:47] Now, there are plenty of successes. If 20 percent of new enterprise machine learning initiatives actually succeed, that's sort of a minority of a large number of projects. So there are lots of case studies, lots of success. My books and the conference series I've been running since 2009, Machine Learning Week, formerly Predictive Analytics World, pretty much the bread and butter of that is successful case studies. But at the same time, there's an overall lack of returns.
[00:04:13] In fact, an IBM industry research project showed that the average return on these projects is essentially zero on average, lower than the cost of capital, which means you would have been better off just investing in the markets in an index fund. So it's high time we take a step back and say, look, we love this technology.
[00:04:33] I love this technology. That's why most people in data science got into it, I think, right? It's the coolest kind of STEM: science, technology, engineering, math. It learns from data to predict. That's really, really interesting. The idea of machine induction, you know, drawing generalizations from examples that pan out. But just because they pan out doesn't mean the organization's going to act on them.
[00:04:56] So it's kind of like we're more excited about the rocket science than the launch of the rocket. And what needs to change is basically a new bridge across the biz-tech gap, the gap between the data scientists and their customer, the business stakeholder, the person in charge of the operations meant to potentially be improved with the predictions output by a model.
[00:05:19] And that gap can only be bridged with a little bit more focus, a reframing of the project as the business project. So that, first of all, there's reverse planning. We plan the project around the intended deployment, the operational change. So we think of it as an operations improvement project
[00:05:39] first, that uses machine learning, rather than framing it as a machine learning project. An operations improvement project doesn't necessarily sound like the most exciting project. But on the other hand, improving things is exciting, right? That's a good thing. So it's sort of about getting priorities, getting some perspective on it. But then that sort of reverse planning is literally only the first
[00:06:06] of what I formalized as six steps in the end-to-end project. And the broader theme is we need to get the business-side stakeholders involved end to end in an informed way so that they don't get cold feet. If they don't get their hands dirty, their feet will get cold. They need to understand and weigh in more than anything.
[00:06:29] More than better technology and better data scientists, these projects need people from the business perspective, who've ramped up on a certain semi-technical understanding, to be involved end to end, so that when you pass them the model, they're going to receive the pass toward the culmination of the project.
[00:06:47] Dr Genevieve Hayes: Before I went into data science, my background was as an actuary, specializing in general insurance pricing. And with general insurance, you've already got that clear use case. The business knows the purpose of pricing, everyone's on board. So when I shifted from being a pricing actuary to being a data scientist, it really floored me, because I just expected that the business would be accepting of these models as they had been of the general insurance pricing models. And the whole idea of having to bring the business along for the ride just never occurred to me.
[00:07:29] And so facing those challenges is what got me interested in actually learning about processes for overcoming them. And that's what brought me to BizML.
[00:07:40] Dr Eric Siegel: So sometimes when I'm describing this to an audience of newcomers to data science and machine learning in particular, I describe it in terms of insurance, because you can think of it as taking the core competency of insurance, the value proposition of it, and extending it across all industry sectors:
[00:07:55] you're figuring out the risk of negative outcomes: the risk that somebody you spend two dollars contacting will not buy, the risk of a transaction turning out to be fraudulent, the risk of an ongoing customer defecting, reliability modeling. So a lot of it can be framed in terms of differentiating between cases that are positive or negative outcomes.
[00:08:14] And Eric Webster, when he was at State Farm, said the following, and I like this quote: insurance is nothing but management of information. It's pooling of risk, and whoever can manipulate information the best has a significant competitive advantage. So when you want to take that concept of pooling risk and apply it across functions, across industry sectors, machine learning is it.
[00:08:41] That's the way to do it. But unlike actuarial practices, it's unregulated. It's the wild, wild west. So going back to what I said a couple of minutes ago, data scientists fall in love with it as well. I did more than 30 years ago because it's the coolest science or technology, right?
[00:08:57] It's really interesting and clearly potentially valuable. And then we're sort of fetishizing the technology rather than focusing more strongly on its potential business value, the actual operational change, the actual deployment. The business side, the non data scientists, the business stakeholders, they're sort of doing the same thing, but from a less technical vantage.
[00:09:20] And in their case it just becomes a really fancy sales pitch, right? It's like the bright and shiny object. And that couldn't be any more true than it is now, more so than ever, with generative AI, where the narrative around it in the general public's eye promises, I'll just put it this way,
[00:09:40] it promises more autonomy than I think is actually feasible. Not that it's not valuable, and certainly not that it's not amazing. But whether you're talking about predictive AI, which is what people are starting to call predictive analytics or enterprise ML, or you're talking about generative AI, in both cases we do have this issue that we need to ramp up the stakeholders.
[00:10:04] So, in fact, although my book is structured around BizML, the six-step playbook, and the steps almost write themselves and the six central chapters align with those six steps, probably the bigger theme, and potentially the more important thing in my mind that I meant the book to accomplish, is to ramp up the stakeholders, the non-data scientists.
[00:10:29] Reading this book provides that often missing semi-technical understanding that they all must hold in order to then be able to collaborate deeply, end to end, across the steps and participate in the project.
[00:10:45] Dr Genevieve Hayes: One thing I've always found challenging with a lot of this is that idea of having to educate the stakeholders. I agree that it's important that the non technical stakeholders have at least some grasp of what the technical staff are doing. But my experience has been that those non technical stakeholders are usually very busy.
[00:11:06] So even though they like the idea of learning about things like BizML, they just don't have the time in their schedules. What success have you had in teaching BizML to these non technical stakeholders?
[00:11:21] Dr Eric Siegel: Well, that's a great question. If they are going to authorize the deployment of a model, that means to actually make a change to their largest-scale operation, where there are the greatest potential gains, but also the highest risk. I mean, this is the process that's already making the company survive. If they're going to authorize the deployment of a model they don't understand, or a value proposition they don't understand.
[00:11:47] So, just to be clear, I'm talking about a semi-technical understanding. They don't need to understand how logistic regression adjusts the weights, and they don't have to understand area under the curve. But there is some arithmetic in terms of, look, this is how many false positives versus false negatives, and this is how much those are going to cost on average, or relative to the particular cases. Those types of mathematical endeavors just use arithmetic and translate it into the actual value.
[00:12:18] And then the idea that it's a probability, it's a number between zero and a hundred, and you're going to draw a line somewhere and decide who does or doesn't get treated in a certain way: which transactions to audit, which customer to contact or offer a coupon, which train wheel to inspect as potentially breaking down, whatever the large-scale operation is that you're improving.
[00:12:37] They need to understand those bare-bones mechanics. So that's what I mean by semi-technical. It's not a college degree or graduate degree or that kind of thing. And if they don't have that basic understanding of this fundamental change that you're going to make to the large-scale operation that they own, and the reasons for it, then there are only two ways this is going to go. Either they're going to authorize
[00:13:01] deployment as a leap of faith, which is probably unusual, and ill-advised, because if you own this operation, you probably should understand a major change being made to it. The other option, which is what happens most of the time, is that they kill the project, and the failure is quite adeptly, I must say, swept under the rug.
[00:13:28] What I'm advocating for is what would avoid either of those branches, which is just: clue them in. Right? So reading a book is like one airplane ride. They may be busy, but if they have a large-scale operation, it does come down to the numbers. So if they want to improve it, they've got to get their hands a bit dirty or their feet will get cold.
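For data scientists who want to see that bare-bones arithmetic written down, here is a minimal sketch in Python. The cost figures and predictions are hypothetical placeholders; the point is simply how a decision threshold turns predicted probabilities into counts of false positives and false negatives, and then into a dollar amount.

```python
# Minimal sketch of the stakeholder-level arithmetic described above.
# All figures are hypothetical placeholders.
import numpy as np

def total_cost_at_threshold(y_true, y_score, threshold, cost_fp=100.0, cost_fn=500.0):
    """Dollar cost of acting on every case whose score exceeds the threshold."""
    flagged = y_score >= threshold
    false_positives = np.sum(flagged & (y_true == 0))   # acted when we shouldn't have
    false_negatives = np.sum(~flagged & (y_true == 1))  # missed a case we should have caught
    return false_positives * cost_fp + false_negatives * cost_fn

# Hypothetical held-out outcomes and model scores
y_true = np.array([0, 1, 0, 1, 0, 0, 1, 0, 1, 0])
y_score = np.array([0.1, 0.8, 0.4, 0.6, 0.2, 0.9, 0.3, 0.05, 0.7, 0.55])

for t in (0.3, 0.5, 0.7):
    print(f"threshold {t}: total cost ${total_cost_at_threshold(y_true, y_score, t):,.0f}")
```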
[00:13:49] Dr Genevieve Hayes: And even with the ones who aren't prepared to read it, if you had a data scientist who was familiar with BizML, they could at the very least provide the management team with a half-hour summary of BizML in some sort of meeting.
[00:14:05] Dr Eric Siegel: Yeah, so BizML is the framework within which you get into those nitty-gritty details. The outline of the steps in and of itself doesn't cover all the details. The details pertain to, you know, what's predicted, how well, and what's done about it. That's what it all comes down to, and the six steps are basically formed around those three items.
[00:14:24] What's predicted, how well, and what's done about it. What's predicted and how well, right? That's just: who's going to buy if contacted? And, well, then let's use that to decide who gets contacted. That's the overview. You've got to get down into a certain amount of detail. And then the second of the three is how well. That's metrics.
[00:14:42] That's sort of the measurement that matters. It's just arithmetic, but it's very particular arithmetic. But yeah, you don't have to force or twist the arm of an executive to read any particular book. You can relate the information by voice, but
[00:14:55] I don't think that we're going to improve our relatively dismal overall deployment success rate for enterprise ML unless we're able to bridge this gap, unless we're able to really get them on the same page in this regard. It may be a change that happens slowly, but inevitably it's important.
[00:15:14] I mean, improving operations with machine learning is one of the last remaining points of differentiation for large-scale enterprises. So basically, a negative way to put it is that competition will dictate that enterprises must do it. And they're only going to do it with just the right union between tech and biz.
[00:15:32] Dr Genevieve Hayes: So, just then when you were describing BizML, I think you alluded to about four steps from it. We should probably define what BizML is for our audience.
[00:15:42] Dr Eric Siegel: Sure.
[00:15:43] Dr Genevieve Hayes: This framework is described on the cover flap of The AI Playbook as the gold-standard six-step practice for ushering machine learning projects from conception to deployment.
[00:15:55] Dr Eric Siegel: Yes.
[00:15:57] Dr Genevieve Hayes: So what does that mean for people who haven't read the book?
[00:16:00] Dr Eric Siegel: And to be clear, BizML stands for the business process for running machine learning projects. It's spelled B-I-Z-M-L. And as I mentioned a second ago, it's oriented around those sort of three main characteristics that relate to actually running one of these projects, which is what's predicted,
[00:16:17] how well, and what's done about it. And the first three steps are basically pre-production planning steps that involve those three, although not quite in that order. So, the first is to establish the deployment goal, which is that pair: what's predicted and what's done about it. Which transactions are likely to be fraudulent?
[00:16:34] Then either block or audit that transaction. Which satellite is most likely to run out of battery? You know, take proactive measures. Which component of this hydroelectric power plant is going to fail? Then spend the cost of shutting things down and inspecting that piece proactively. So, the fact that there's an almost endless
[00:16:55] number of those pairs, what's predicted and what's done about it, that define the use case, that define the value proposition, that's why machine learning is so widely applicable. That's why the Harvard Business Review calls it the most important general-purpose technology of the century.
[00:17:11] That pair, in general terms, and defining it, that's your value proposition. That's the antidote to hype: to define a very specific way that you're going to achieve value, that you're going to improve operations. And that's step one. So in that sense, you're planning in reverse.
[00:17:27] You're planning for what step six will be, which is the actual deployment itself, which is the culmination of the six steps. Of course, after that, you have to continue to manage, maintain, and monitor the model. Step two is to take the first of those two, the prediction goal, who's going to buy if contacted, who's going to cancel if not contacted, whatever it is, and define it much more specifically, with all the caveats and qualifiers that fully define the prediction goal, which most data scientists listening will know as the dependent variable. But the full definition of the dependent variable depends on a bunch of business factors that require business input and relate directly to how you're planning to deploy.
[00:18:10] So what exactly are you predicting if you're going to use those predictions to make these operational decisions? So defining it in full detail, you know: which customers who've been around at least three months and who have spent at least this amount are going to cancel their paid subscription within the next
[00:18:27] three months and not resubscribe a month later, or whatever it is. It could easily be three times as long as that, but all the caveats and qualifiers need to be there, very, very specifically, so that you've fully defined, technically, what's going to be predicted, with business considerations in mind.
[00:18:43] That's the kind of semi-technical detail that the business stakeholders need to get involved with. And then step three is evaluation metrics. So define both the technical and business metrics, like profit and savings, that matter, and what level they need to reach before you're ready to green-light the deployment.
[00:19:03] So those three are sort of the pre-production steps. And then the other three are just the standard production steps that every data scientist has been involved with, which have been the same since the 60s, since we first started using machine learning models to target marketing and credit scoring. It's just: prepare the training data, train the model, and then deploy it, integrate it.
[00:19:23] Change operations with those predictions during which you're very much using all three of the first three parts that we talked about. So, for example, during model development, you're assessing the model in terms of the metrics that you decided were important.
[00:19:36] So the six steps get you to a new place, you've actually changed your business. It's not business as usual. You've actually deployed it. And you've finished. Not finished the project, but you finished starting something new. So now that things are operating in a different way, of course, you need to continue to monitor or manage it.
[00:19:54] You launch astronauts into space. The launch was successful; now you have to keep them alive. Refresh their oxygen, refresh the model periodically, and this kind of thing. So that's the outline of the six steps: the deployment goal, the prediction goal, and the evaluation metrics, and then
[00:20:10] data, training, and deployment. The important thing about this particular layout is that it's meant to be in terms that are understandable for the business side, and in a structured way that engenders their deep participation at each step.
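As an aside for practitioners: one illustrative way to capture the outputs of those first three planning steps, before any data prep or modeling, is as a simple written project spec. The churn use case, field names, and numbers below are hypothetical examples, not anything prescribed in the book.

```python
# A hypothetical written record of BizML's three pre-production steps.
# Every value here is an illustrative placeholder.
bizml_plan = {
    # Step 1: deployment goal -- the pair of what's predicted and what's done about it
    "deployment_goal": {
        "what_is_predicted": "probability a customer cancels their paid subscription",
        "what_is_done_about_it": "include the highest-risk customers in a retention discount campaign",
    },
    # Step 2: prediction goal -- the dependent variable with all its caveats and qualifiers
    "prediction_goal": (
        "customer active for at least 3 months, having spent at least $50, "
        "cancels within the next 3 months and does not resubscribe within 1 month"
    ),
    # Step 3: evaluation metrics -- technical and business, with green-light levels
    "evaluation_metrics": {
        "technical": {"AUC": ">= 0.75"},
        "business": {"projected annual savings": ">= $250,000"},
    },
}
```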
[00:20:25] Dr Genevieve Hayes: The thing I really love about this framework is that you've effectively taken a monkey-first approach to solving machine learning problems. Have you come across Astro Teller's monkey and pedestal analogy?
[00:20:37] Dr Eric Siegel: No... oh, I think I have. So I've known Astro Teller since, like, 1992. He was still an undergrad and I was a grad student, and we were in a genetic programming clique, a social clique where we would keep meeting up at genetic algorithm research conferences and that kind of thing.
[00:20:53] Right, in my opinion, kind of trying to do what deep learning ultimately was successful in doing, getting it to really scale as far as the complexity and the ability to take as input not just several dozen elements of a customer, but the full-scale array that defines a high-resolution image or whatever it is.
[00:21:10] Can you remind me about his monkey analogy?
[00:21:13] Dr Genevieve Hayes: It's basically that if you're trying to teach a monkey how to recite Shakespeare while standing on a pedestal, then you should focus all your time and attention first on teaching the monkey to recite Shakespeare, and only after you've succeeded should you bother to build the pedestal.
[00:21:29] Because if you can't teach the monkey to recite Shakespeare, building the pedestal's a waste of time.
[00:21:34] Dr Eric Siegel: No, I'm a huge fan of their moonshot approach. And in fact, I recently published an article in Forbes about their Bellwether project. So if you go to my Forbes contributions, you can see the article I wrote about how they use machine learning to help the National Guard of the United States triage and prioritize where to go after an emergency.
[00:21:58] Dr Genevieve Hayes: That's interesting.
[00:22:00] Dr Eric Siegel: So, if you're going to bring that to more typical enterprise projects, as far as, like, what's the pedestal here? I mean, yeah, that's great. I guess I never thought of applying it that way. Right. Yeah. So the pedestal is the easiest thing to get really clear: I'm going to prep the data and I'm going to make a model and I'm going to get the ROC curve or the AUC, you know, I'm going to see these nice numbers in the shape of those curves.
[00:22:25] And I'm gonna be excited because I've developed a model that's potentially valuable and that technically is performing really well. And so, in general, it's like the rocket science is the easy part, in a certain sense, right? The soft skills are the hard skills.
[00:22:43] Dr Genevieve Hayes: And your monkey's deployment.
[00:22:45] Dr Eric Siegel: Exactly. So the pedestal is irresistible. If I keep getting this beautiful pedestal together, then I'm definitely taking steps forward.
[00:22:55] Dr Genevieve Hayes: Except no data scientist sees it as being a pedestal because they think that, as you said, it's the rocket science. So how could rocket science be a pedestal?
[00:23:04] Dr Eric Siegel: Right. So I'm here to break the bad news. The modeling part is just the pedestal. That's not a lot of fun to hear, but someone had to tell you, you know, I mean, it's not, it's like we need to actually capture enterprise value, which means changing operations.
[00:23:22] Dr Genevieve Hayes: And I think this is a problem with CRISP-DM, which is why it doesn't work. It's because deployment is right at the end, and by the time you get to that, everyone's just forgotten about it.
[00:23:32] Dr Eric Siegel: Okay, CRISP-DM. It is a fact, for those listening who are familiar with CRISP-DM, that with this book, which came out in February 2024, and the buzzword that I coined, BizML, on a certain level I'm saying, look, let's start over. Let's do a refresh. And yeah, I'll be honest, I'm trying to usurp CRISP-DM.
[00:23:58] I love CRISP-DM for the contributions that it made, but it did fail. It is a business practice, an organizational practice. That's right in the name: Cross-Industry Standard Process for Data Mining. This is by far the most popular business framework for running what at the time was called, that's what the DM is, data mining projects, which shows how dated it is.
[00:24:23] It's from about 30 years ago. And the fact is, first of all, these days it's becoming more and more rare for data scientists to have heard of it. But more to the point, nobody in the business world has ever heard of it. I'm being a little hyperbolic there, but you know what I mean?
[00:24:38] Yeah. And if this is meant to run a business project, if it's meant to usher these projects, from a business management standpoint, successfully through to deployment, then if after 30 years nobody in the business world has heard of it, that's a consideration. My main criticism would be the way that it's been marketed and positioned, the way it's written.
[00:24:59] It speaks the language of tech. I'd also say that it's very much general, for what they call data mining, which now you might say is analytics. It's not specialized for machine learning in particular, and the six steps as I outlined them are. But again, I just want to re-emphasize that, look, you could formalize it as five steps or eight steps; people do, in various ways.
[00:25:20] The most important thing is that business stakeholders get involved on some level of semi-technical mechanics, just literally the ins and outs of the model, not the internal workings of the model. There's a chapter on that in the book, and it's great for them to have a sense of how it tweaks weights or builds a decision tree from data.
[00:25:38] That's fine. That's also the subject of my first book, Predictive Analytics. But really, the business stakeholders can mostly treat the internals of the model, and the way it was generated and trained, as a black box. It's literally just the ins and outs: what does it take as input? And exactly what is it predicting?
[00:25:55] What's that output probability meant to predict, and how is that going to be used in the business? So it's literally just the ins and outs; it's not the core rocket science. So that's the theme: we need to have an established common vernacular. And in fact, business stakeholders generally don't even realize that you do need a very particular, specialized practice, playbook, framework, paradigm for running these projects
[00:26:22] if they're going to successfully lead to a deployment in the first place, let alone know the name of any of those. So that's why I'm trying to be splashy. I spent hours of my time working out a five-letter buzzword, BizML, because I'm like, look, we need to advertise to the business world that there needs to be a specific
[00:26:44] methodology. It's got to be catchy and relatively easy to understand, and we need to enlist them in participating.
[00:26:52] Dr Genevieve Hayes: When you talk about deployment in the context of BizML, the way I understand it is that you're mostly talking about gaining business acceptance for the model and having the business use it. But deploying models technically into production can also be challenging, especially for organizations that have limited experience in doing so.
[00:27:15] In addition to establishing the value proposition and focusing on the transformation that you'll achieve, should data scientists also solve for the technical aspects of deployment upfront?
[00:27:28] Dr Eric Siegel: Well, I'd say I'm agnostic about that. That's sort of the domain of MLOps, which is a set of technologies and techniques to manage and deploy models from a business standpoint. So MLOps are technical solutions.
[00:27:40] And what I'm talking about with BizML is a business framework. So it's a business project that needs to meet business needs. So the business needs to be the dog that wags the tail. MLOps in that sense should be the tail. And yes, it can be very challenging, but you can see the sort of failure to the degree that, and we've done surveys and
[00:28:02] when people say, look, we didn't deploy the model because it turned out to be too technically challenging to integrate, you can once again put that back as a failure of planning, right? Because the whole purpose of the project is that deployment. So if you're not planning for that from the get-go, what's that going to take from an engineering standpoint, then you're failing to
[00:28:24] properly plan for a project that will deliver value. So the steps one through six start with planning the deployment goal, and then step six culminates with the deployment itself, the actual integration, going into production, operationalization, whatever you want to call it.
[00:28:39] The details of that, to a large degree, need to be planned within step one. So in other words, if you're reading my book to learn more details about BizML, you can't just read chapter one on defining the deployment goal and then execute that first step. You need to learn also about step six and what deployment entails.
[00:29:01] So it's sort of like the book's meant to be read in order, but then referenced in reverse. So the whole picture matters, and you need to plan end to end from the beginning.
[00:29:12] Dr Genevieve Hayes: So as the name suggests, the focus of BizML is on machine learning projects, but machine learning and predictive analytics aren't the only types of projects that data scientists undertake. For example, causal AI and more recently generative AI are the other aspects of data science that are starting to attract a lot of attention among data scientists and business scientists.
[00:29:38] Could BizML also be used to solve causal AI and generative AI problems?
[00:29:44] Dr Eric Siegel: Yes, in the case of certain causal projects, for sure. Let's talk about generative AI first, where I'd say the simple answer is no; generative AI needs its own version of something like BizML. It's called BizML with ML as in enterprise ML, often called predictive analytics, sometimes called predictive AI.
[00:30:04] So it's specialized for that: what's predicted, what's done about it, you know, the use of a model that does per-case prediction for targeting marketing, credit scoring, and a million other kinds of operations. So on that level of detail, it's specialized for that. But the general principles are also just as important for, let's say, generative AI, in the sense that you need to start with the value proposition.
[00:30:24] Exactly how will this thing be operationalized? What procedures or business practices or operations are going to be improved with this technology, and in exactly what way? That's the deployment. How are we going to measure the benefit in terms of business value? Define what those metrics are. And then not only that, but actually measure it, which, by the way, although we're still early in the game, is pretty rare.
[00:30:47] The ratio of speculative, breathless excitement about the potential of generative AI in comparison to the actual projects where they actually follow through with doing that measurement, I'd say it's a very high ratio, in the sense that relatively few are actually doing it, but you are seeing it a bit more and more, for sure.
[00:31:07] And there is value to be had: improvements in efficiency, like in customer service, in drafting marketing copy, and especially in coding. 30 to 50 percent improvements in efficiency; most of the credible numbers are within that range. Maybe it'll be higher. And that's quite interesting, for sure.
[00:31:25] So the principles apply. Now, in terms of causal: causal could refer to a lot of different things, but why don't I talk about uplift modeling, also known as net lift modeling or persuasion modeling. The Obama 2012 re-election campaign in the United States actually used uplift modeling.
[00:31:47] And I covered it at the end of my first book, Predictive Analytics. The whole final chapter there is on uplift modeling. And that's where you're predicting persuasion. You can't optimize for persuasion unless you predict it. So generally what you predict is the outcome or behavior. But maybe more ideally, what you'd like to predict is: if I'm going to make an operational decision, such as whether to contact this customer, how much would contacting the customer increase the chance of a positive outcome?
[00:32:12] That's a very different thing than simply predicting the chances of a positive outcome. But it's more actionable. It's prescriptive analytics. But when people say prescriptive analytics, the vast majority of articles out there that use that catchphrase are not referring to anything concrete.
[00:32:30] And they're definitely not referring to uplift modeling, but that's really what uplift modeling is. It's saying: this tells you exactly which customers the treatment would benefit, for your outcome. The same thing applies with healthcare: which patients would benefit from this treatment. If I treat a whole bunch of people with some medication and on average there's an improvement, that doesn't mean every individual improved; some of them may have had a detrimental effect.
[00:32:55] But if you can do a per-case prediction, not of the outcome, but of whether this treatment would increase the chances of a positive outcome, that's uplift modeling. It's much harder, and it requires experimental design. You can't do it on found data like you can with normal enterprise ML projects.
[00:33:12] I've written a bunch about it outside of my book. You can go to predictiveanalyticsworld.com/uplift; that will get you to an article that has a bunch of links at the end. There's a lot out there to read about it. I've got some videos as well. But in any case, with that
[00:33:27] particular kind of causal, yes, BizML would totally apply in every respect, because you're still defining exactly what you're predicting and how you're going to act on those predictions, and what the data requirements are for it. There are a bunch of case studies where it's used very successfully, but they're relatively small, at least as far as public case studies go, because it's a lot harder to do than standard propensity modeling.
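For listeners who want to see what uplift modeling looks like in code, here is a minimal sketch of one common approach, the two-model or T-learner method. It assumes randomized experimental data with a treatment flag (say, contacted versus not contacted); the column names and choice of learner are hypothetical, not taken from Eric's book.

```python
# Minimal two-model (T-learner) uplift sketch. Assumes an experiment where
# `treated` was randomly assigned; column names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

def fit_uplift_models(df: pd.DataFrame, features, treated_col="treated", outcome_col="converted"):
    """Fit one outcome model on the treated group and one on the control group."""
    treated = df[df[treated_col] == 1]
    control = df[df[treated_col] == 0]
    model_treated = GradientBoostingClassifier().fit(treated[features], treated[outcome_col])
    model_control = GradientBoostingClassifier().fit(control[features], control[outcome_col])
    return model_treated, model_control

def predict_uplift(model_treated, model_control, X):
    """Estimated change in probability of a positive outcome caused by treating each case."""
    return model_treated.predict_proba(X)[:, 1] - model_control.predict_proba(X)[:, 1]

# Hypothetical usage: a negative uplift score indicates a case where the
# contact itself hurts the outcome.
# m_t, m_c = fit_uplift_models(df, features=["tenure", "spend", "region_code"])
# df["uplift"] = predict_uplift(m_t, m_c, df[["tenure", "spend", "region_code"]])
```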
[00:33:50] Dr Genevieve Hayes: With that Obama example that you mentioned, is that the one where they were doing all the A/B testing on the different versions of the website?
[00:33:59] Dr Eric Siegel: No. So for a political campaign, your sales force is a bunch of volunteers knocking on doors or making phone calls. And what they actually did was, they didn't assign the volunteers to go canvass a whole neighborhood; they gave them the exact addresses of which houses to knock on, because they were able to predict where knocking on the door actually decreases the chance of getting a vote for their candidate, Obama, and increases the chance of getting a vote for the competitor, who was Mitt Romney in 2012.
[00:34:29] And the same thing with targeting phone calls. So instead of simply predicting whether you're going to get a vote from this person for your candidate, you're predicting whether the knock on the door would increase the chances of getting a vote for your candidate. That's uplift modeling.
[00:34:46] Dr Genevieve Hayes: So, because you upset the person so much, they swear that they'll never vote for your candidate.
[00:34:51] Dr Eric Siegel: You get a downlift. That's called a sleeping dog customer: let sleeping dogs lie. Cell phone companies found this out, that if you send a retention offer to some customers, you actually increase the chance of their defection, in the case of telecom, because you might have reminded them that their contractual obligation is up for renewal.
[00:35:11] And now they're free to defect to the other cell phone company that all their friends are using.
[00:35:16] Dr Genevieve Hayes: Okay. And with a political candidate, you could actually be alerting them to policies that they might disagree with, which would make them think, Oh, I should vote for the other guy.
[00:35:26] Dr Eric Siegel: Right. That would be one example, or you cause them to reconsider, or you annoy them. Like most predictive analytics, you're not necessarily trying to establish causality, that type of understanding or explanation, but you are trying to cause an outcome: you're trying to get them to vote for your candidate or get them to buy.
[00:35:45] Dr Genevieve Hayes: In addition to writing The AI Playbook, you're also the co-founder and CEO of the startup Gooder AI, which is aligned with what you talk about in The AI Playbook. Can you give us a brief overview of Gooder AI and how it ties in with what we've just discussed?
[00:36:03] Dr Eric Siegel: Yeah. Gooder AI is the business console for the stakeholder to say, hey, look, my data scientists just gave me this model, and now I want to see how much value I can get from it. So in other words, it shows you the what-if scenarios for potential deployments, where you're using the model to drive operations, for any use case that we've been talking about: marketing, credit scoring, fraud detection, logistics, reliability modeling.
[00:36:28] And you're using this model to drive decisions. And typically the data scientist will say, hey, look, this model has an area under the receiver operating characteristic curve of 0.83, isn't that cool? And then the stakeholder is totally lost, right? It doesn't tell them anything about the potential value.
[00:36:44] It tells them it predicts better than guessing and is therefore potentially valuable, but it doesn't tell you how valuable. But what the business needs, what your customer as a data scientist, the stakeholder in charge of the operations that are potentially going to benefit, needs to know, is the profit or the savings or any other kind of very straightforward lingua franca of business KPIs, whatever you want to call it, which is generally missing from these projects.
[00:37:10] But the thing is, when you move to a business metric, it depends on certain factors, business inputs that are subject to change or uncertainty. So what you actually need is a very specialized GUI, an interactive visual experience, where you can try moving those levers: parameterize that estimation of the business value however you want, move that decision threshold, see how it changes the potential business value.
[00:37:41] See the degree to which uncertainty matters, if at all, or how you can limit the uncertainty; see how competing KPIs trade off depending on these deployment scenarios. So that's Gooder. Gooder AI, we're a very early-stage startup; we're actually just going to trials. So if anyone listening is interested and you have a model that's meant to drive binary decisions, we can easily set you up with a trial, so you can tell us what else you want the product to do and whether it provides the visibility we think it will, not only for the business stakeholder, but also during development.
[00:38:19] I mean, if you think about it, the development of a model is a semi-automatic process of train, test, train, test, train, test, right? The testing part is typically only done in terms of technical metrics like area under the curve, precision, recall, accuracy. But if you're not measuring the business value,
[00:38:40] you're not pursuing business value. If you want to make sure you're navigating that model development toward business value, in terms that mean something to the business as a whole and to your customer and to the stakeholder, you need to also be estimating, as best you can, the business value, so that you know you're navigating that model development in the right direction.
[00:39:03] Dr Genevieve Hayes: So if you're using Gooder AI, are you still using the technical metrics during the tuning and model selection stage, or are you only using the business metrics? So I guess, are the business metrics supplementing the technical metrics or are they replacing the technical metrics?
[00:39:22] Dr Eric Siegel: No, they're only meant to supplement them. You need both. So, like, in the deciding-on-the-metrics chapter of my book, I very much do that. I'm like, there are technical metrics and there are business metrics, and let's be sure we know the difference and know what the options are.
[00:39:38] But generally you should be using both, because model development, model training and evaluation, is a technical process. And there are benefits to technical metrics, in that they can sort of be evaluated in a vacuum, in the sense that no business context is necessarily needed.
[00:39:59] They're just telling you the pure, abstract predictive performance of the model. That has its place, but that's also its detriment if it's the only thing you're looking at, because by not involving any of the business context and the business deployment scenario details, you're not sussing out or stress testing the model in its intended usage.
[00:40:19] So our startup, Gooder AI, is not the rocket science. It's simply, once and for all, finally stress testing the rocket in its intended usage. So in that sense, it's not a model training tool; it's meant to supplement the model training tool. Pop over to the Gooder AI tab and see how well your model will potentially deliver value in terms of profit, savings, and the like, depending on your deployment plan and deployment scenario.
[00:40:46] Dr Genevieve Hayes: So you'd still build your model in your modeling software, like Python, and then does Gooder AI sit on top of Python? Is that how it works?
[00:40:55] Dr Eric Siegel: You don't have to be doing Python. It will work for any predictive model. And the reason it has that universality, and here's our little secret trick, the way it works with any model, whether you did it with paid software, open source or whatever, is that it doesn't need the model.
[00:41:12] It doesn't care how the model works; it cares how well the model works. So it actually only needs the test data. So we're going to integrate it with scikit-learn and stuff, so you'll just push a button. But even without that integration, it's just a matter of taking that test data, right, the dependent variable, the model score, any other factors that are important for measuring the business value and those business metrics, and then providing that simple table of data to Gooder AI.
[00:41:41] Dr Genevieve Hayes: Okay, so you'd provide it with the true values of your target variable and your predicted values and then it can do all its calculations.
[00:41:50] Dr Eric Siegel: That's right. And then it's highly configurable, so you can parameterize it however you want. False positive and false negative cost, that's often all you need, but that's also often a little bit too simplistic, because the cost is going to depend on the individual case. You can parameterize it
[00:42:06] however you want. Each parameter is a slider on the screen: move the slider, see how the shape of your profit or savings curve changes; move the decision threshold, see what the values are, see how it changes other competing KPIs. So it's just a very generally applicable, highly configurable solution for doing what we need to do, which is to visualize and interact with the business value in terms of these kinds of charts that allow you to explore the potential value, which is what it takes if we're going to make the much needed move from only technical metrics to also measuring for business value.
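To make this concrete for data scientists: the kind of chart Eric is describing can be roughly sketched from a held-out test set in a few lines of Python. Gooder AI itself is an interactive product, so this is only an approximation of the idea, and the benefit and cost figures are hypothetical placeholders.

```python
# Rough sketch of a savings-vs-threshold curve computed from held-out test data.
# benefit_tp / cost_fp are hypothetical business inputs, not real industry figures.
import numpy as np
import matplotlib.pyplot as plt

def savings_curve(y_true, y_score, benefit_tp=500.0, cost_fp=100.0, n_points=101):
    thresholds = np.linspace(0.0, 1.0, n_points)
    savings = []
    for t in thresholds:
        flagged = y_score >= t
        true_positives = np.sum(flagged & (y_true == 1))   # e.g. fraud correctly blocked
        false_positives = np.sum(flagged & (y_true == 0))  # e.g. legitimate cases inconvenienced
        savings.append(true_positives * benefit_tp - false_positives * cost_fp)
    return thresholds, np.array(savings)

# Synthetic stand-ins for the dependent variable and model score from a test set
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(0.3 * y_true + 0.7 * rng.random(1000), 0.0, 1.0)

thresholds, savings = savings_curve(y_true, y_score)
plt.plot(thresholds, savings)
plt.xlabel("decision threshold")
plt.ylabel("estimated savings ($)")
plt.show()
```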
[00:42:43] Dr Genevieve Hayes: One of the challenges with measuring for business value is you need additional information over and above things like your true and predicted values of your target variable. How do you get that additional information into Gooder AI?
[00:42:58] Dr Eric Siegel: Well, you know, let's say you're doing fraud detection for credit cards. It turns out that the industry average, at least in this country, is 500 dollars for a false negative: you allowed fraud to go uncaptured, you didn't block it. And it's 100 dollars for a false positive:
[00:43:17] you inconvenience the cardholder who is making a legitimate transaction. And on average, that might be around 100 dollars, because they're going to stop using your credit card, or eventually will if this keeps happening. But those factors are subject to change. Let's say you're saying, well, look, what if we offered a 5 dollar Amazon coupon every time we inconvenienced the customer? That could cut the false positive cost in half.
[00:43:41] So now we can move that slider on the screen and see how it changes. So the business factors tend to be few. There may be two, there may be 10, but there are not that many. You can parameterize them. You can start with best guesses, with reasonable ranges. I mean, the purpose of our solution is that you can get that intuitive sense by changing them and seeing how big of a difference they make.
[00:44:05] So for example, churn modeling: you might have a good model that's telling you the chance somebody is going to defect. But the action you're taking is to try to retain them, let's say with a discount. What you typically don't have is the effectiveness of that discount offer at keeping them. So you can parameterize that, kind of put in a best-guess range, move it along that range, see how it's actively changing the shape of the curve, how much it matters.
[00:44:32] And then you're motivated in a very concrete way: you know what, this is too much uncertainty; we need to do some pilot experiment and see the effectiveness rate of this type of retention campaign. So you've got to start somewhere. If we're going to move to business metrics, you can start out with
[00:44:49] Fermi estimations, best-guess ranges, and very quickly get from a model to visualizing how big of a difference they make, and then narrow down those uncertainty ranges incrementally, by way of going across silos, by running experiments, whatever it takes. But we've got to start somewhere with this stuff.
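The sensitivity analysis Eric describes can be sketched the same way: treat the uncertain business input as a parameter, sweep it across a best-guess range, and see how much the answer moves. The $5-coupon scenario and all figures below are hypothetical placeholders, and the synthetic data stands in for a real held-out test set.

```python
# Sketch of a sensitivity check on one uncertain business input: the false
# positive cost. All figures are hypothetical.
import numpy as np

def best_achievable_savings(y_true, y_score, benefit_tp, cost_fp):
    """Savings at the best decision threshold for the given business inputs."""
    best = -np.inf
    for t in np.linspace(0.0, 1.0, 101):
        flagged = y_score >= t
        saving = (np.sum(flagged & (y_true == 1)) * benefit_tp
                  - np.sum(flagged & (y_true == 0)) * cost_fp)
        best = max(best, saving)
    return best

# Synthetic stand-ins for a held-out test set
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(0.3 * y_true + 0.7 * rng.random(1000), 0.0, 1.0)

# e.g. does a coupon that roughly halves the false positive cost
# meaningfully change the value of deploying the model?
for cost_fp in (100.0, 75.0, 50.0):
    value = best_achievable_savings(y_true, y_score, benefit_tp=500.0, cost_fp=cost_fp)
    print(f"false positive cost ${cost_fp:.0f}: best achievable savings ${value:,.0f}")
```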
[00:45:07] Dr Genevieve Hayes: So it sounds like, in addition to calculating the business metrics, it's also allowing you to perform sensitivity analysis on those metrics.
[00:45:15] Dr Eric Siegel: Yeah. I mean, informally, it's meant to provide that. It's a data visualization solution, very specific to the deployment of a predictive model, that uses the test data you have as best as it can. Just like right now your test data is telling you the area under the curve, now let's use it to also tell you the potential profit, depending on how you use the data.
[00:45:37] And then given that there are certain factors where there's uncertainty ranges, get the intuitive sense visually of how big of a difference they make.
[00:45:45] Dr Genevieve Hayes: So you mentioned previously that you're currently at the beta testing phase for this product. Have you received any feedback so far about the impact this has had on the success rate for machine learning projects?
[00:45:59] Dr Eric Siegel: Yeah, we're literally at that right now. So, we have a couple of pilot projects, in things like targeting collections and targeting for nonprofits, and we've got a big pipeline of other ones. So we're very early, a very small, only initially funded startup. So if anything, right now
[00:46:17] we're just calling for more. You know, we've kind of got about a dozen or 20, depending on how you count, companies in the pipeline going into these trials. The ones that have gotten there are very excited about it and are saying it's very valuable, but we're just at that point over the summer right now, just getting to the real, actual, genuine trials.
[00:46:36] It's one thing to have someone try it on a public model, just to get a sense of what the product does. It's another to try it on a production model. And we're just getting there now. But we need more of those cases, which is why we're offering the trial on a platter, right? We're not charging for it.
[00:46:51] We'll do the configuration setup so it works on your use case, and it'll take very little effort to give it a try.
[00:46:58] Dr Genevieve Hayes: And when do you think the beta testing will be finished and you'll get to the main launch?
[00:47:03] Dr Eric Siegel: Oh, you're asking a startup founder to estimate how long it'll take. Well, I think it'll take two or three months to accumulate a few more proofs of concept, and then there will have to be more substantial fundraising. We'll probably launch a freemium model of the product at some point next year, but right now we're in a very early phase.
[00:47:26] We're forging a new category of ML software. Most ML software is the training part. This is meant to complement that. It's never been done before.
[00:47:35] Dr Genevieve Hayes: So what final advice would you give to data scientists looking to create business value from data?
[00:47:43] Dr Eric Siegel: Well, that's a great question. I think that first and foremost, you've got to focus on exactly what the deployment is going to entail. It's not the fun part from a scientific standpoint, but it's the only thing that matters from a business standpoint. So focusing on that concrete value proposition and what it means to actively operationally get there.
[00:48:05] And that's what I've been trying to approach with the BizML framework. And the ability to assess model performance in terms of business metrics is definitely part and parcel of that whole process.
[00:48:18] Dr Genevieve Hayes: For listeners who want to learn more about you or get in contact, what can they do?
[00:48:23] Dr Eric Siegel: Well, the website for the book, funnily enough, is bizml.com, so B-I-Z-M-L dot com. There's a bunch of pages there beyond the book. There's a page about me, there's a page about my keynote speaking, and I've been commissioned to keynote 160 times, and there's a link to the Gooder AI website, but we're early.
[00:48:44] You're not going to find any real substance on that website. Well, you might by the time this drops, we might just be getting a bit more substance, but the main thing is we're still at an early phase and we're interested in new trial customers. So the best way is just to reach out to me directly for that.
[00:48:59] Dr Genevieve Hayes: And what's the best way to reach out to you directly? Is that LinkedIn or emailing you through your site?
[00:49:05] Dr Eric Siegel: Yep. You can email through the site and if you go to my bio page, it links to my LinkedIn as well.
[00:49:10] Dr Genevieve Hayes: Okay. Great. So thanks for joining me today, Eric.
[00:49:14] Dr Eric Siegel: Yeah, my pleasure, Genevieve. It's been great to be on the program.
[00:49:17] Dr Genevieve Hayes: And for those in the audience, thank you for listening. I'm Dr. Genevieve Hayes, and this has been Value Driven Data Science, brought to you by Genevieve Hayes Consulting.
