Episode 80: Why Decision Scientists Succeed Where Data Scientists Fail


[00:00:00] Dr Genevieve Hayes: Hello and welcome to Value Driven Data Science, the podcast that helps data scientists transform their technical expertise into tangible business value, career autonomy, and financial reward. I'm Dr. Genevieve Hayes, and today I'm joined by Professor Jeff Camm. Jeff is a decision scientist and the Inmar Presidential Chair in Analytics at the Wake Forest University School of Business.
[00:00:29] His research has been featured in top-ranking academic journals, and he's the co-author of 10 books on business statistics, management science, data visualization and business analytics. In this episode, you'll learn exactly what decision science is and how adding decision science skills to your toolkit can increase your value as a data scientist.
[00:00:54] So get ready to boost your impact, earn what you're worth, and rewrite your career algorithm. Jeff, welcome to the show.
[00:01:02] Prof Jeff Camm: Thanks, Genevieve. It's great to be with you.
[00:01:05] Dr Genevieve Hayes: As listeners of this podcast should be well aware by now, the role of the data scientist is to enable better decision making within organizations through the use of quantitative methods and data. But data science has only emerged as a profession in the last 20 years, and organizations have needed to make better decisions since the dawn of commerce.
[00:01:28] So what did organizations do before data scientists existed? It turns out that long before data science, there was another profession using mathematical approaches to help organizations make decisions: decision science. It emerged during World War II and comprises skills that, when combined with data science, can dramatically increase project success rates.
[00:01:55] Even though decision science has been around since before my parents were born, I've only just become aware of its existence, and I suspect many of our listeners will be in the same boat. Jeff, for listeners who might be hearing the term decision science for the first time, what exactly is it and how did it come to be?
[00:02:15] Prof Jeff Camm: Sure. So, like most of these disciplines, there's no universally accepted definition, but decision science is really, primarily, a set of quantitative techniques that are used to help people make better decisions. That's one component. And then there's also a behavioral side, where decision scientists study how people actually make decisions.
[00:02:42] It's a little bit like economics: there's a quantitative side and there's a behavioral side to decision science. I equate it to, and people might not be familiar with these terms either, operations research and management science. My PhD is actually in management science, which I would say is equivalent to decision science.
[00:03:05] And the key is that it's focused on the decision-making process and making better decisions. To contrast that with data science: sort of by definition, data science is more focused on the data and generating insights, with ultimately, I think, the same end goal, which is to improve people's decision-making process.
[00:03:29] But we've done some research where we looked at job ads and academic curricula in decision science and data science. So we've done a study that looks at the different skill sets, both from an industry perspective and from an academic perspective. And there are differences.
[00:03:46] Most notably, I would say that decision science is focused more on actually modeling the decision at hand, whereas data science tends to be more focused on modeling the data rather than mathematically modeling a decision or a decision process. So, for example, say we have five products.
[00:04:12] To make a simple example, we ask our data scientist to forecast the annual demand for our five products for the next year. And that's primarily a data-driven process. Maybe they're using time series,
[00:04:26] and so what do you do with that when you have it?
[00:04:29] Well, you give it to a manager, or you give it to someone who actually has to make some kind of decision with it. The forecasting of the demand is a data problem, but if you gave that to a decision scientist, the decision scientist might take it and model the actual production planning process, where the real business problem is:
[00:04:50] what mix of those five products should we produce next year to maximize contribution to profit? There's a mathematical model you can build, but the data scientist's forecasts would be inputs to that decision model. And so a key thing here is that the decision model is in the nomenclature, the vocabulary if you will, of the decision maker.
[00:05:16] I say contribution to profit; I didn't say goodness of fit, right? And so that's, in a sense, the way I view it. In some cases we can talk more about data science directly impacting things, but data science feeds inputs to a decision model. The decision scientist can model decisions, or has that skill set.
[00:05:40] It could be an optimization model. It could be a simulation model. It could be decision analysis. But you're actually modeling the decision that the decision maker has to make directly. And so to go back to my example, forecasting demand is one thing, but you can't just say to a manager, make these amounts of these five products, because you might not have the production capacity to do that.
[00:06:04] So you have limited capacity. We can model that with constraints in an optimization model.
[00:06:09] Dr Genevieve Hayes: Okay.
[00:06:10] Prof Jeff Camm: So given capacity, given that the five products have different profit margins and different resource requirements, the real business problem is: given our limitations as a company and the forecast of demand for our products, what's the right product mix to produce to maximize contribution to profit?
[00:06:29] Dr Genevieve Hayes: Does this get into that whole linear programming thing?
[00:06:32] Prof Jeff Camm: Exactly. Well done. Yes, exactly what I'm talking about.
[00:06:36] Dr Genevieve Hayes: Ah, yeah. I remember that. You had all the constraint functions and used the simplex method to find the point where you made the decision.
[00:06:47] Prof Jeff Camm: Exactly. That's music to my ears. That's what I do.
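To make the product-mix model Jeff describes concrete, here is a minimal sketch in Python using scipy.optimize.linprog. The margins, machine-hour requirements, capacity and demand forecasts are hypothetical numbers chosen for illustration, not figures from the episode.

```python
# Hypothetical product-mix model: maximize contribution to profit subject to
# a capacity constraint, with the demand forecast as an upper bound on sales.
import numpy as np
from scipy.optimize import linprog

demand = np.array([1200, 800, 950, 400, 600])       # forecasted demand (the data science input)
margin = np.array([25.0, 40.0, 18.0, 55.0, 30.0])   # contribution to profit per unit
hours_per_unit = np.array([2.0, 3.5, 1.5, 4.0, 2.5])
hours_available = 6000.0                             # plant capacity (a business rule)

# linprog minimizes, so negate the margins to maximize contribution to profit.
result = linprog(
    c=-margin,
    A_ub=[hours_per_unit], b_ub=[hours_available],   # capacity constraint
    bounds=[(0, d) for d in demand],                 # can't plan to sell more than the forecast
    method="highs",
)

print("Units to produce:", np.round(result.x, 1))
print("Contribution to profit:", round(-result.fun, 2))
```

The forecast enters only as data; what the model actually outputs is the decision, how many units of each product to make.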
[00:06:52] Dr Genevieve Hayes: Yeah, I took a unit of statistics that focused on that many, many years ago, and I always wondered what you'd do with that. But it sounds like it really is used in practice.
[00:07:03] Prof Jeff Camm: Absolutely. Yeah. I don't know if you had this in your studies, but there's a linear model where the variables have to be integer, including binary variables. And so I consulted for probably 10 years or so in supply chain design, where I'd go into a company and
[00:07:22] they wanna know which products should be made at which plants, which plants should be open or closed, where the distribution centers should be. That's a big, messy mathematical optimization problem. But when I would go in and do that for a company, you wouldn't optimize for the current
[00:07:38] demand. You have to have forecasted demand for the future, like sometimes five years out, at least a decent estimate of what demand would be, so you could manage the capacities and everything in the supply chain. So again, just another example of data science, forecasting, statistics feeding an optimization model, feeding the inputs that model needs.
[00:07:59] And that's the decision model. So I kind of distinguish between data models and decision models.
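For the supply chain design problem, a toy version of the plant-open/close decision might look like the sketch below, written with the open-source PuLP library; the plants, costs, capacities and regional demand figures are all invented for illustration.

```python
# Toy fixed-charge facility location model: which plants to keep open and how
# to serve forecasted regional demand at minimum cost. All numbers are made up.
import pulp

plants = ["P1", "P2", "P3"]
regions = ["East", "West"]
fixed_cost = {"P1": 90000, "P2": 70000, "P3": 60000}   # cost of keeping a plant open
capacity = {"P1": 500, "P2": 400, "P3": 300}           # units per year
ship_cost = {("P1", "East"): 4, ("P1", "West"): 7,
             ("P2", "East"): 6, ("P2", "West"): 3,
             ("P3", "East"): 5, ("P3", "West"): 5}
demand = {"East": 350, "West": 300}                    # forecasted demand (data science input)

model = pulp.LpProblem("supply_chain_design", pulp.LpMinimize)
open_plant = pulp.LpVariable.dicts("open", plants, cat="Binary")
ship = pulp.LpVariable.dicts("ship", [(p, r) for p in plants for r in regions], lowBound=0)

# Objective: fixed costs of open plants plus shipping costs.
model += (pulp.lpSum(fixed_cost[p] * open_plant[p] for p in plants)
          + pulp.lpSum(ship_cost[p, r] * ship[p, r] for p in plants for r in regions))

# Meet forecasted demand in every region.
for r in regions:
    model += pulp.lpSum(ship[p, r] for p in plants) >= demand[r]

# A plant can ship only up to its capacity, and only if it is open.
for p in plants:
    model += pulp.lpSum(ship[p, r] for r in regions) <= capacity[p] * open_plant[p]

model.solve(pulp.PULP_CBC_CMD(msg=False))
print({p: int(open_plant[p].value()) for p in plants})
```

The binary open/close variables are the integer part Jeff mentions; the forecasted demand again arrives only as an input to the decision model.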
[00:08:05] Dr Genevieve Hayes: With data science, a lot of our models are basically using the past to predict the future. You mentioned, when you were describing decision science, that one component of it is that whole behavioral economics side. Would you also use that to predict: okay, in the past there was demand for this, but because of this particular event, which is causing a change in behavior, we take that into account in making our forecasts and determining
[00:08:39] the right decision to make, which might override what these models that are based on past behavior would predict?
[00:08:46] Prof Jeff Camm: Yes, absolutely. So forecasting demand does not have to be simply time series. It can use other kinds of variables, including behavioral variables and economic variables. That's one way the behavioral side could come in, but it's also observing decision behavior and, to some extent, capturing what some might view as irrational behavior and how likely that is to happen, for example, and modeling uncertainty and how the consumer might behave.
[00:09:19] Economic modeling too: to the extent that economic modeling can sometimes be behavioral, that creeps into this whole decision process as well.
[00:09:29] Dr Genevieve Hayes: If there's a pandemic or anything bad happening, everyone races to the supermarket and buys toilet paper.
[00:09:35] Prof Jeff Camm: Exactly. Interestingly, I did a paper with a colleague looking at what companies did with their mostly automated machine learning when the pandemic hit, because it was a radical shift in demand. What most of them replied was that they basically shut down the models and went back to dashboards: okay, we have to get through the next couple of weeks sort of by the seat of our pants and see what happens.
[00:10:01] Dr Genevieve Hayes: It's basically all you could do.
[00:10:03] Prof Jeff Camm: I mean, that's such a shock to the system. There was past behavioral data on pandemics; it's just that it had been a long time before 2020. Anyway, I haven't done a lot of the behavioral side myself, but there are decision scientists who purely study bias in decisions and that sort of thing.
[00:10:23] So I've been more on the quantitative side, which I think does pair up with data science, more so.
[00:10:29] Dr Genevieve Hayes: So we've all heard the statistic that 87% of data science projects fail. Is there a similarly high failure rate for decision science projects, or is that another area where these two disciplines differ?
[00:10:42] Prof Jeff Camm: That's a great question. It's been a long time since people have written a lot about decision science project failure, but if you go back to even like the nineties, there was a lot written. It's a little bit different. Let me back up a minute. When I first went to grad school, we could build these great models.
[00:11:00] This is in the eighties, the 1980s. The Achilles heel of decision science at the time was that we didn't have the data we needed for many of the models we wanted to build. So a lot of projects failed because we didn't have the data, and we were forced to use other data that maybe resulted in sort of garbage in, garbage out.
[00:11:19] That was one reason why decision science failed. There was also, you know, you're trying to build a model of a decision. You can make so many assumptions that when you come up with a plan, take that to a manager and have to explain what assumptions you made, they're like: that's not real enough for me to depend on what you're recommending.
[00:11:38] So there's that. George Box famously said all models are wrong, some are useful, and that was certainly true of decision science earlier. And there was a lot written about decision scientists, especially on the quantitative side: before computing became so powerful and algorithms became better, we focused too much on trying to solve larger problems and not enough on modeling.
[00:12:02] And so we became obsessed with the math, the algorithmic side of things. All of those led to project failure. In fact, there was a paper written, probably in the nineties, using the phrase operations research, which again I view as a synonym for decision science. The title of the paper was "Operations Research: A Postmortem", because it said basically the field was dying.
[00:12:25] So the answer is yes; certainly decision science is not nirvana where all the problems go away. But a lot of the failure early on was that we didn't have the data we needed to build the kind of models we knew we could build if we had the data. So then what happened is, and this is again my view of things:
[00:12:47] big data arrives, and now we have more data than we could possibly ever use. And so the old problem of we don't have the data we need goes away pretty quickly. But also, all the companies were trying to figure out, how can I use this data to my advantage and make better decisions?
[00:13:06] And so data science, in my opinion, was a reaction to that. We had traditional statistics forever, but data science was kinda like: we have this data, find the golden nugget in that data, and then let's use that for generating insights. And so decision science kind of fell out of favor, and data science, I think because of big data, became unbelievably popular.
[00:13:29] But then, and I think I could see this coming because I'd lived through the first thing with decision science, I've always been of a mind that, well, you're focusing on the data, and there's like a research and development component to that where, literally, we collected all these massive amounts of data.
[00:13:45] There must be patterns and things in there we don't know about. And so the data scientist attacks that data and starts testing things. But I was always bothered by the fact that, at least at the start, as data scientists we weren't focused directly on the business problem.
[00:13:59] But that loosened up as time went on, and data scientists started doing, you know, logistic regression and should you offer somebody a loan or not, those kinds of things. Recommendation engines, right? To me, recommendation engines are actually sort of where you make the transformation from data science to decision science.
[00:14:18] Think of a recommendation engine for Amazon or whatever. You look at how things are correlated; that's data analysis. But then maybe you have a cutoff rule that says what I actually show at the bottom of the screen: this is what other people bought. Maybe there's a cutoff, there's a rule.
[00:14:34] So rule-based systems, and optimization, and simulation: they transfer what I think of as data science techniques into decision science. There's a big gray area there. When you get to optimization, like when we're talking about production planning, I think that's clearly decision science.
[00:14:54] And if you look at the skill sets, that'll be obvious when you look at the data from academic programs and from industry ads. You can do decision science with, basically, optimization, decision theory, those sorts of things. But you can also use rule-based systems.
[00:15:10] And then at one extreme, the decision science is just a manager taking the data analysis and then using his or her gut and making a decision. The other extreme is wholly automated decision making, like when we get in our car and our navigation system tells us how to drive; it makes the decision for us about what route to take.
[00:15:29] Dr Genevieve Hayes: And if you disobey it, it starts screaming at you.
[00:15:34] Prof Jeff Camm: So, I was talking a little bit about rule-based systems. There was a student back at a former institution where I used to teach who did some work for a city zoo, and ultimately they wanted to increase their zoo membership.
[00:15:47] You could pay an annual fee and then your family gets to go to the zoo as much as you want, or you could pay at the gate, so they wanted to increase zoo membership. So what this student did, an undergraduate student, a really smart student: she had zip code data, population data, and the zoo had memberships by zip code,
[00:16:06] and so, long story short, she built a regression model that forecasted the expected number of zoo memberships by zip code, and that's fine. Now, for every zip code in the area of the zoo, she had the expected number of zoo memberships. What do you do with that? So she did the data science part.
[00:16:23] Well, she very cleverly said, let's do something simple. If my model says there should be more zoo memberships than there actually were, looking at predicted versus actual, then let's send an email or a flyer to that zip code. So she took a rule that said, if my model says there should be more, let's take action.
[00:16:42] What's the action we're gonna take? What's the decision to be made? We're gonna send a flyer for a zoo membership to everybody in that zip code. It was one of the most successful marketing campaigns in the history of the zoo. So she transformed the data science into a decision. And I would argue that rule-based system she came up with is decision science.
[00:17:03] But there's a lot of gray between these two areas, I think, with rule-based systems. And a lot of data scientists that I've worked with would say what she did was take the predictive model and deploy it. So they would call it data science deployed, and maybe that's another way to think about it:
[00:17:21] Decision science is data science deployed.
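As a rough sketch of the zip-code rule in that story: fit a regression of memberships on zip-code characteristics, then flag the zip codes where the model predicts more memberships than were actually observed. The episode only mentions zip code and population data, so the extra median-income feature and all figures below are invented for illustration.

```python
# Data science step: regression of zoo memberships on zip-code characteristics.
# Decision rule: mail a flyer wherever the model says the zoo is underperforming.
import pandas as pd
from sklearn.linear_model import LinearRegression

zips = pd.DataFrame({
    "zip_code": ["45201", "45202", "45203", "45204"],
    "population": [21000, 34000, 18000, 27000],
    "median_income": [48000, 72000, 39000, 61000],   # hypothetical extra feature
    "actual_memberships": [110, 340, 60, 150],
})

features = ["population", "median_income"]
# In practice you would fit on many more zip codes than this toy table.
model = LinearRegression().fit(zips[features], zips["actual_memberships"])
zips["expected_memberships"] = model.predict(zips[features])

# The rule with the action embedded in it: predicted > actual means send a flyer.
zips["send_flyer"] = zips["expected_memberships"] > zips["actual_memberships"]
print(zips[["zip_code", "actual_memberships", "expected_memberships", "send_flyer"]])
```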
[00:17:23] Dr Genevieve Hayes: I would say decision science is what data scientists should be doing.
[00:17:29] Prof Jeff Camm: Right. So I agree with you. There's been a lot of literature out there over the last two years especially that I've seen that says there's a communication breakdown between the data scientists and managers. Managers are the ones who have to take action. And the classic example is a data scientist goes to the manager and starts talking about lift.
[00:17:52] Or maybe worse than lift would be goodness of fit.
[00:17:56] Dr Genevieve Hayes: Precision and recall.
[00:17:58] Prof Jeff Camm: Yeah. Right. And there's been some literature written about, well, we need to at least train the data scientists to convert those metrics into business metrics before they go talk to the manager.
[00:18:10] Dr Genevieve Hayes: Is this Eric Siegel, The AI Playbook? Yep, I interviewed him earlier.
[00:18:15] Prof Jeff Camm: That's exactly what Eric is pitching, and I agree that's what we need to do.
[00:18:20] Where we differ is, I would argue that if you model the decision directly rather than just the data, which is what a decision scientist would do, you're already in the metrics of the manager, and there might be things in your decision model that you would never catch if you just relied on the data analysis and transformed the metrics into business metrics.
[00:18:47] Does that make some sense?
[00:18:48] Dr Genevieve Hayes: Yeah, I'm getting that. So if we could look at that zoo example: it seemed like the student you had was taking a data science approach and then putting an overlay on it at the end. How might she have modeled that decision directly?
[00:19:02] Prof Jeff Camm: So, I suppose in that case, in the rule-based system, she didn't actually model the decision directly, although I guess she did in a sense, in that her rule had the action embedded in it. She transformed the problem from "where are we underperforming?" to "if my model says there should be more there, the decision we're going to make is to send a flyer to that zip code."
[00:19:28] So, in that sense, she captured the decision part, the action part, in her rule. It's a lot different from the production planning example we talked about earlier.
[00:19:36] Dr Genevieve Hayes: So they're models that actually have, as their output, "make decision A, B, C, or D", as opposed to "the probability of this is 0.974" or whatever, where you then have to add that step on top of it. So it's integrating everything into the one system.
[00:19:56] Prof Jeff Camm: Exactly. That's exactly right. Maybe another good example is logistic regression, if we're trying to figure out whether someone should get a loan or not from our bank. You can build a predictive model that says, here's the probability of default for individual A or B,
[00:20:11] but to make that actionable, what do you have to do? You need a cutoff rule. And for that cutoff rule, you could just make one up based on past data, or you could take into account the asymmetry between the financial impact of giving somebody a loan when they default and the financial impact of getting it right:
[00:20:31] giving somebody a loan when they're gonna pay it off. With the probabilities, you can build a decision model that captures all that from the start.
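A hedged sketch of that cutoff idea: take predicted default probabilities (simulated here as stand-ins for a logistic regression's output) and pick the cutoff that maximizes expected profit under asymmetric payoffs. The payoff figures are assumptions for illustration.

```python
# Choose a loan-approval cutoff from asymmetric financial consequences rather
# than from the data alone. Probabilities and payoffs are invented.
import numpy as np

rng = np.random.default_rng(0)
p_default = rng.uniform(0.0, 0.5, size=1000)  # stand-in for logistic regression output

PROFIT_IF_REPAID = 1_000    # interest earned on a loan that is paid off
LOSS_IF_DEFAULT = -8_000    # loss when a funded loan defaults

def expected_profit(cutoff: float) -> float:
    """Expected profit if we approve every applicant with P(default) below the cutoff."""
    approved = p_default < cutoff
    return float(np.sum((1 - p_default[approved]) * PROFIT_IF_REPAID
                        + p_default[approved] * LOSS_IF_DEFAULT))

cutoffs = np.linspace(0.01, 0.50, 50)
best = max(cutoffs, key=expected_profit)
print(f"Approve when P(default) < {best:.2f}, expected profit {expected_profit(best):,.0f}")
```

With these assumed payoffs the profit-maximizing cutoff lands near the break-even probability implied by the asymmetry, 1,000 / 9,000, or roughly 0.11.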
[00:20:39] Dr Genevieve Hayes: Right. And so the important part is understanding those decisions to begin with, which gets into a problem framing problem. Yeah.
[00:20:49] Prof Jeff Camm: Yeah. And one of the big pluses that we found in our research goes back to your question about why projects fail. We talked a little bit about why decision science projects fail; you know, they're not foolproof. One of the big reasons for data science failure that's cited a lot, if you read the literature, is basically that you attack an ill-defined problem.
[00:21:12] So one of the skill sets that the data shows decision scientists are apparently better trained for is this idea of problem framing, because they're not starting with the data. They're starting from the point of view of the decision problem.
[00:21:30] And because they're tasked with modeling the decision, they just ask a lot more questions at the start, decision scientists do. You have to understand. Go back to the production planning example: if someone just comes and says, forecast demand for me, and you do that, well, that's not the real business problem. But if you're charged with modeling the decision directly, okay,
[00:21:50] you're asking questions like: why do you want to forecast the demand? Oh, we have to do production planning. Well, what's involved with that production planning? Well, we have capacity at the plants. These products, by the way, have different profit margins. And you're asking and asking questions because you've gotta model the decision. It's hard.
[00:22:11] Problem framing is hard. People say you go from mess to model. It's a challenge, and I sometimes make it sound easier than it is. When people ask me about problem framing, I say a decision scientist has to be more like a good medical doctor.
[00:22:28] Because managers typically don't tell you what the problem is. They tell you the symptoms they're experiencing from the problem. And so you have to start asking questions. Just like if you go in to your doctor and, you know, I'm a runner, I say my knee hurts. Well, that's a symptom.
[00:22:43] That's not the problem. And so the doctor will start asking me a whole bunch of other questions, and that's just a skill that takes experience: to keep asking the probing questions to get to what the real root cause is. And that's what you have to model. What's the real decision that has to be made?
[00:23:00] So if they say, forecast demand for our products, a natural question might be, well, for any kind of data analysis or data science task that someone might give you as a data scientist: how is this analysis gonna be used? Well, we have to figure out how to do production planning.
[00:23:18] So you're gonna take my forecast and put it into a model? Well, maybe. Or we're gonna wing it and make what you told us the demand might be. Oh, by the way, that forecast is stochastic. Maybe we need to take into account the errors associated with it. Maybe we need simulation.
[00:23:34] But you have to take into account the business rules. The business rules are things like capacity of the plants, labor availability, material availability, profit margins. Those are all things that have to be considered before you can model the actual decision.
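One way to picture the stochastic-forecast point Jeff raises: a small Monte Carlo simulation that samples demand around the point forecasts and scores a fixed production plan across scenarios. The forecast error, costs and candidate plan below are all assumed for illustration.

```python
# Evaluate a fixed production plan against uncertain demand by simulation.
import numpy as np

rng = np.random.default_rng(42)

forecast = np.array([1200, 800, 950, 400, 600])       # point forecasts (data science input)
forecast_sd = 0.15 * forecast                          # assumed forecast error
margin = np.array([25.0, 40.0, 18.0, 55.0, 30.0])      # contribution to profit per unit sold
unit_cost = np.array([60.0, 90.0, 45.0, 130.0, 75.0])  # cost sunk into each unit produced
plan = np.array([1100, 750, 900, 380, 550])            # a candidate production plan

profits = []
for _ in range(10_000):
    demand = np.maximum(rng.normal(forecast, forecast_sd), 0)
    sold = np.minimum(plan, demand)                    # can't sell more than demand
    unsold = plan - sold
    profits.append(np.sum(sold * margin) - np.sum(unsold * unit_cost))

profits = np.array(profits)
print(f"Expected profit: {profits.mean():,.0f}")
print(f"5th percentile (downside risk): {np.percentile(profits, 5):,.0f}")
```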
[00:23:50] Dr Genevieve Hayes: So what I'm hearing is where decision scientists really excel is at the beginning of the process and at the end of the process, whereas data science tends to focus on that bit in the middle. But without the beginning, you don't understand what you can model properly, and without the bit at the end, your models are useless.
[00:24:09] So that's why, if decision science is done well, it's probably gonna have a much higher success rate than data science currently does.
[00:24:20] Prof Jeff Camm: That is my belief, exactly. So at the front end, if you take the time to do it right and model the decision, even if you can never really solve the model you built, it informs where everything is going, and maybe you have to use a heuristic or something to actually come up with a plan. It's what we call the bookends of the process.
[00:24:40] The front end is making sure you're solving the right problem, which should be the decision problem, in my opinion. And then at the end, because you've modeled the manager's decision problem, you're already in the business terminology, and that makes influencing with your results easier, in theory.
[00:24:59] We talked a little bit before about how decision science is not the be-all and end-all. One Achilles heel of decision science is that we often said, here's the answer, you know, like, oh, here's what you should do. But I think we're getting better at recommending a family of solutions. Even from an optimization model, I've done some work on how you can generate a family of solutions and give that information to a manager, because there are things that you didn't model that the manager might know.
[00:25:27] A model's a model. The decision science approach is getting a lot better in terms of not just giving an answer, but making a list of recommendations that the manager can choose from, ones that will be good solutions, maybe. And also capturing uncertainty better.
[00:25:44] Dr Genevieve Hayes: So just like how, as you said before, Amazon might recommend a book or a movie to you, this is thinking about things in terms of recommending a decision to a manager.
[00:25:54] Prof Jeff Camm: Exactly. I gave a talk right before the pandemic, and the title of my talk was "Optimization: The Best Recommendation Engine". My point was: don't just give an answer. Use your model to generate a family of recommendations that the manager can evaluate. Ultimately, in my mind, everything comes down to return and risk.
[00:26:15] You know, put it in sort of a financial framework, which is mostly what businesses are about at the end of the day, not all of them, but most of them. And it's always about risk and return, and how much risk can you stomach for increased expected return, let's say. Production planning problems can be cast that way.
[00:26:34] Marketing problems can be cast that way, not just finance problems. So it's ultimately all about risk and return and how aggressive a company or a manager wants to be with that.
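As a sketch of that "family of recommendations" idea, one simple pattern is to solve the product-mix model once for maximum profit, then re-solve requiring, say, at least 95% of that profit while minimizing machine-hours, so the manager sees a second, lower-utilization plan alongside the first. The 95% threshold and all figures are illustrative assumptions, not anything prescribed in the episode.

```python
# Generate two recommendations from the same product-mix model:
#   1) the maximum-profit plan, and
#   2) a plan with at least 95% of that profit using the fewest machine-hours.
import numpy as np
from scipy.optimize import linprog

margin = np.array([25.0, 40.0, 18.0, 55.0, 30.0])
hours = np.array([2.0, 3.5, 1.5, 4.0, 2.5])
demand = np.array([1200, 800, 950, 400, 600])
bounds = [(0, d) for d in demand]

# Recommendation 1: maximize contribution to profit.
best = linprog(c=-margin, A_ub=[hours], b_ub=[6000.0], bounds=bounds, method="highs")
max_profit = -best.fun

# Recommendation 2: minimize machine-hours subject to profit >= 95% of the optimum.
# (-margin) @ x <= -0.95 * max_profit  encodes  margin @ x >= 0.95 * max_profit.
alt = linprog(c=hours,
              A_ub=[hours, -margin],
              b_ub=[6000.0, -0.95 * max_profit],
              bounds=bounds, method="highs")

print("Max-profit plan:", np.round(best.x, 1), "profit:", round(max_profit, 2))
print("Lower-utilization plan:", np.round(alt.x, 1), "profit:", round(float(margin @ alt.x), 2))
```

The manager then chooses between the two plans based on things the model never saw, which is the point Jeff is making about recommendations rather than answers.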
[00:26:45] Dr Genevieve Hayes: So, for data scientists trying to think more like a decision scientist, what's the first step that they should take?
[00:26:53] Prof Jeff Camm: I think the baby step any data scientist could take, because a lot of people are in positions where they're not gonna be handed the problem that has to be modeled, is: if someone says, here's the data, I want you to forecast,
[00:27:06] ask the question: just how is it gonna be used? Hopefully whoever gave that to you will be willing to take the time to say, well, here's what we're gonna do with this. And that might not impact exactly what you do, but it surely could impact how you present it to the person at the end of your analysis.
[00:27:23] And then there's probably the bigger step. At the end of the day, I think impact comes from changing somebody's decision for the better. So always think about whatever it is you're doing: how is it gonna drive return on investment on your time? I think you're only gonna see ROI
[00:27:43] if you made somebody's decision better. That's how we're really getting a return on what we invested in the analytics or the data science or the decision science. And then, like you described very well before: problem framing, then analysis.
[00:27:58] There could be data science and decision science in there. In fact, most times, I think, to be successful, what we argue is they both should be there. And then, how do you deploy or influence the decision? That whole process, just thinking about that every time you do a project or build a data product: how's it gonna be used?
[00:28:15] Who's gonna use it? Understanding the business context. There are environments that'll be very conducive to what I just said, and there will be environments that are not very conducive to it. So it's not a catchall for everyone. But I think just being inquisitive,
[00:28:28] and don't limit yourself artificially to just, oh, I'm just the person doing the data analysis or the data munging. So I think that's what I would recommend.
[00:28:37] Dr Genevieve Hayes: So for listeners who wanna get in contact with you, Jeff, what can they do?
[00:28:41] Prof Jeff Camm: Sure. I'm always happy to connect with folks on LinkedIn. And then, as you mentioned at the start, I'm at Wake Forest University, which is in North Carolina, here in the United States. My email's on the web, but it's pretty straightforward.
[00:28:56] It's Camm, C-A-M-M, J as in Jeffrey, D as in Douglas, at wfu.edu, Wake Forest University dot edu. So they can reach out by email. I'm always happy to talk about this, and I'm always interested in what people are doing in the data science, decision science space.
[00:29:16] Dr Genevieve Hayes: There you have it: another value-packed episode to help turn your data skills into serious clout, cash, and career freedom. If you enjoyed this episode, why not make it a double? Next week, catch Jeff's Value Boost, a 10-minute episode where he shares one powerful tip for getting real results real fast.
[00:29:38] Make sure you're subscribed so you don't miss it. Thanks for joining me today, Jeff,
[00:29:43] Prof Jeff Camm: Thank you for having me, Genevieve. I've enjoyed it.
[00:29:46] Dr Genevieve Hayes: and for those in the audience, thanks for listening. I'm Dr. Genevieve Hayes, and this has been Value-Driven Data Science.
