Episode 22: Software Engineering for Data Science

00:00:00 Dr Genevieve Hayes
Hello and welcome to Value Driven Data Science, brought to you by Genevieve Hayes Consulting. I'm Dr Genevieve Hayes, and today I'm joined by Ethan Garofolo to discuss techniques from software engineering and software development that you can use to become a better data scientist. Ethan is a software developer
00:00:20 Dr Genevieve Hayes
and software architect specialising in microservice-based projects and using Lean and DevOps principles to make software development teams more effective. He is the author of Practical Microservices: Build Event-Driven Architectures with Event Sourcing and CQRS
00:00:38 Dr Genevieve Hayes
and runs the Utah Microservices meetup group. Ethan, welcome to the show.
00:00:44 Ethan Garofolo
Thanks for having me, Genevieve.
00:00:46 Dr Genevieve Hayes
Data science sits at the intersection of computer science and statistics, so it comes as no surprise that many of the best data scientists have a computer science or software development background.
00:00:59 Dr Genevieve Hayes
And those that don't, well, there's a lot they can learn from software developers. I'm a better data scientist today because of what I've learned from my software developer friends and coworkers, which is why I was very keen to get a software developer onto this show to discuss this topic
00:01:19 Dr Genevieve Hayes
in more detail.
00:01:21 Dr Genevieve Hayes
And I'm very grateful to you, Ethan, for agreeing to do this with me. You bet.
00:01:27 Dr Genevieve Hayes
From reading your LinkedIn profile, I can see you've worked in a very diverse range of roles within the software industry over the course of your
00:01:36 Dr Genevieve Hayes
career. Some of the job titles you've got listed include software developer, software engineer, software architect, data architect and software development
00:01:48 Dr Genevieve Hayes
engineer, and I think I saw a profile you'd written of yourself online where you described yourself as a professional
00:01:56 Dr Genevieve Hayes
yammerer. All the different software-related roles have always confused me a little
00:02:01 Dr Genevieve Hayes
bit, especially since when I've worked with software developers in the past, they're often not even consistent in how they refer to themselves within the one job. So based on your experience, are software programmers, developers,
00:02:16 Dr Genevieve Hayes
engineers and architects just different names for effectively the same role? Or is there a difference between these different types of software professionals?
00:02:26 Ethan Garofolo
That's a great question, and it makes me chuckle at the inconsistency that we have, and that I have, and will probably have in the same conversation that we have here today.
00:02:35 Ethan Garofolo
I would make a distinction between architect and developer, programmer, or engineer, which was the other one that we used. So I'll go with that second group first. To me, it comes down to, like, how much title inflation does your company employ?
00:02:53 Ethan Garofolo
With Bing, so like I don't know what, when you think of an engineer, what comes to your mind when you hear that word?
00:03:01 Dr Genevieve Hayes
I see it as, well, in the context of software, it's the person who's building the software. So I guess sort of like how a mechanical engineer would be building mechanical things.
00:03:12 Ethan Garofolo
I got you. Yeah. So engineering to me speaks to a lot more rigour than the typical software development shop employs.
00:03:23 Ethan Garofolo
And so, like, on the one extreme, you might have, like, the script kiddie in high school who's able to write things that
00:03:32 Ethan Garofolo
tear down other people's websites, and there's probably not a lot of rigour in that. Just trying stuff until it works.
00:03:39 Ethan Garofolo
And you could sometimes end up like that in the professional world, too, especially as deadlines approach. So to me, when I hear engineer, I'm thinking of, like, the people who write the software that runs on the space shuttle or that runs on the Mars rovers.
00:03:52 Ethan Garofolo
And I don't... I've never worked on a Mars rover or on a space shuttle, and that's probably a good thing for the
00:03:58 Ethan Garofolo
occupants of the shuttle, but
00:04:00 Ethan Garofolo
like, I just imagine that where failure has such a high cost
00:04:05 Ethan Garofolo
and updates are very difficult to do (I mean, there's quite a bit of a delay between here and Mars, for example), you just approach the work very differently.
00:04:13 Ethan Garofolo
And in the typical software development company, the projects, it's a mixture of things, where the consequences of a failure generally aren't things with people inside of them exploding.
00:04:25 Ethan Garofolo
For example, like, you might double post a tweet or a comment on Facebook or something. Like, yeah, it doesn't look great, but not a big deal. And so I think of developer, programmer, and those two are kind of the same
00:04:40 Ethan Garofolo
thing, and that's what most people are, even the ones who are titled engineers. I don't refer to myself as an engineer.
00:04:47 Ethan Garofolo
It's something I aspire to be, to bring that kind of rigour, but the word engineer has lost its meaning in my mind, in the software world, because we use it to sound cool, but we don't do things that
00:05:01 Ethan Garofolo
engineers do, and I'll probably come off sounding completely like an elitist gatekeeper. Maybe, whatever.
00:05:06 Ethan Garofolo
But I don't know, that's how I view them. So where I've listed engineer, it's because that's what the company hiring me called me.
00:05:14 Ethan Garofolo
But in casual conversation, like, people ask me, what do you do for work? And I'll say I work in software development. So OK, that's those. So then architect, that's the other one. How's
00:05:25 Ethan Garofolo
that different?
00:05:26 Ethan Garofolo
In some places, like architect is just you've become a senior programmer and so the next step is to be architect.
00:05:33 Ethan Garofolo
But I don't think that's quite right. I think that the best architects are ones who came up through the development ranks and actually were building and shipping things. But once you start focusing on architecture,
00:05:46 Ethan Garofolo
it's less about building the actual critical path production code and more about: there's these 10 things our system needs to do.
00:05:57 Ethan Garofolo
How do we organise that and divvy it up amongst the various teams that we have so that they can keep working mostly independently of one another?
00:06:07 Ethan Garofolo
And delivering, so that the work can keep moving along without getting blocked. And I feel like I'm getting long-winded in this answer, but I'll wrap it up with this: in a lot of places you see a technical division of teams, like, oh, this is the database team, or this is the, quote, front end team.
00:06:23 Ethan Garofolo
But all of your functionality requires the effort of all of those teams, and so you get the effects of having to wait on each other and then playing the game of telephone and getting it wrong and going back and doing rework.
00:06:34 Ethan Garofolo
And to me, that says well your architecture is incorrect at that point because if the work is stalling and not getting delivered, that's a problem.
00:06:43 Ethan Garofolo
Your customers don't care if your people are busy. Your customers care when they get new things that help them do their work better. I mean, not every app is for work, but
00:06:54 Ethan Garofolo
I think you get the idea that I'm trying to get out there. And so your team structure is often one of the biggest impediments to delivering at a regular cadence.
00:07:02 Ethan Garofolo
And so the architecture function then is focused on that. How do we organise the work in general so that we're not constantly waiting?
00:07:11 Ethan Garofolo
It's not just waiting, it's, like, quality issues, it's morale and all that kind of stuff. And so yeah, that's how I would divvy those up.
00:07:21 Dr Genevieve Hayes
So you're architecting
00:07:23 Dr Genevieve Hayes
the team and the organisation, so it's a leadership role rather than actually designing the software, which
00:07:29 Dr Genevieve Hayes
is what the title always suggested to me.
00:07:33 Ethan Garofolo
Ohh, that's a great point. So once you start doing that, the thing that I found... so when I first became an architect, I viewed it as purely a technical thing. Like, I'm designing the software. And then you just keep butting up against the team organisation until it finally clicked in my head. It's like, oh, this is also a people thing. And this has actually been a point of contention
00:07:54 Ethan Garofolo
in places that I've worked: the architects are viewed as, you're a technical person, you're doing technical
00:08:00 Ethan Garofolo
things. But until you organise all of the system of work properly, you won't get to the architecture that you actually want.
00:08:09 Ethan Garofolo
And so I believe that software architecture is a management function, although oftentimes the people tasked with that are not viewed as management. A lot of what I base this on, there's a
00:08:21 Ethan Garofolo
guy... have you ever heard of Mel Conway?
00:08:24 Dr Genevieve Hayes
I think you might have included some of his videos in your emails?
00:08:29 Ethan Garofolo
Not videos, but I've probably made reference to him before.
00:08:33 Ethan Garofolo
OK, so Mel Conway, I've never met him. He's still alive and he writes. But I think it was, like, in the mid 60s or 70s or something, he did a study
00:08:45 Ethan Garofolo
of software architecture, and he didn't name it this, but it's called Conway's law. And doing this research project, what he discovered was that, and I'm going to horribly paraphrase this, but to give it
00:08:57 Ethan Garofolo
a try, to do this word for word as best I can, he said that any group of people designing a system is constrained to design it in a way that reflects their communication structure.
00:09:11 Ethan Garofolo
That will often get misquoted as "their team structure" or "you ship your org chart". And so they were building a compiler, I think, in the research project, and what he found out was
00:09:21 Ethan Garofolo
that if you had
00:09:22 Ethan Garofolo
four teams, effectively, in your communication structure, you would get a four-pass compiler, and if you had three teams, you'd get a three-pass.
00:09:28 Ethan Garofolo
And so when I've been hired as an architect, it was, hey, we're not shipping as fast as we would like to.
00:09:35 Ethan Garofolo
We know you have this technical expertise with microservices. Can you please come here and fix our architecture? And then the first time that I did that, I thought I was going in to do a technical job.
00:09:48 Ethan Garofolo
And then you find out it's like, oh, we are constrained to have the architecture that mirrors our communication structure. So we cannot change our architecture successfully until we change our communication structure.
00:10:01 Ethan Garofolo
And it's not like a law in the same sense that gravity is. We don't have mathematical proofs of this.
00:10:06 Ethan Garofolo
It's an observation that really holds. An observation, which is not the same as science. Like, we can't test it in the same way that we can other things with the scientific method. But whatever my anecdotes are worth, and Mel Conway's anecdotes, and lots of other people's,
00:10:22 Ethan Garofolo
it's a pattern that repeats. So yes, if you're going to do architecture, going to change the architecture, you have to change the way that the organisation
00:10:29 Ethan Garofolo
operates. Otherwise people revert exactly back to what's familiar and how they're currently graded on their job performance. So yeah, architecture is really skirting that line between what is management and what is building. So it combines both elements, I think.
00:10:47 Dr Genevieve Hayes
I'd never thought about that before so.
00:10:49 Dr Genevieve Hayes
I actually think data scientists could benefit from having the equivalent of an architect in their teams.
00:10:57 Ethan Garofolo
Why do you suppose that is? Like, just hypothesising, the things that come to my mind: is there, like, a fear thing, that it's so new we've all got to prove ourselves? Or are there just not
00:11:07 Ethan Garofolo
industry standard practices of how data scientists coordinate with one another?
00:11:11 Ethan Garofolo
Or what do
00:11:12 Ethan Garofolo
you... why do you think that is?
00:11:13 Dr Genevieve Hayes
I think I'd say the latter is more likely to be the case, because that's the thing. I know when I did my... I did a masters in computer science and one of the courses you could take, I didn't take it, was the software development
00:11:32 Dr Genevieve Hayes
process. So the fact that there was a course called the software development process suggests there is an industry standard for that.
00:11:39 Dr Genevieve Hayes
There was no equivalent for the data science development process. A lot of training in data science, I think, tends to focus on, here's this lot of techniques that you can use, and here's this lot of techniques that you can use,
00:11:53 Dr Genevieve Hayes
rather than, how do we coordinate all this into producing some sort of...
00:12:00 Ethan Garofolo
OK. Huh. I did part of a Masters degree in computer science. I didn't finish it, but my exposure to the academic world, it's like there was collaboration, but at the same time there wasn't, 'cause everybody had to complete their thing individually. And so, like, not trying to say it's, like,
00:12:20 Ethan Garofolo
evil intent or something. It's none of that. It's just, I wonder if data science feels more like
00:12:26 Ethan Garofolo
doing, like, a thesis or a dissertation versus, like, getting together with a band and jamming together or something.
00:12:34 Dr Genevieve Hayes
Yeah, I think that's it. It feels like, well, a lot of the assignments are very similar to doing, you know, maths assignments, type thing. And yeah, I don't think you really have that jamming type thing.
00:12:46 Dr Genevieve Hayes
With data science or I haven't encountered it, maybe some people do.
00:12:50
OK.
00:12:51 Ethan Garofolo
Huh, that's interesting. I see elements of that attitude in software development as well. I think what I would call a team actually working together might look a bit different than what a lot of people in my industry would, 'cause
00:13:03 Ethan Garofolo
when I observe most teams, I just see a bunch of people who talk once a day about what's going on, and there's, like, this allergy to
00:13:12 Ethan Garofolo
co-creating things. Techniques like pair programming, for example, where two people are working on the exact same thing at the same time, one keyboard between them, taking turns typing.
00:13:24 Ethan Garofolo
Or mob programming, which is that but with three or more people. And like, that's very... I don't know. I haven't quizzed the entire industry, let's say.
00:13:33 Ethan Garofolo
But anecdotally, again, the reaction I get for bringing these techniques up around people ranges from, like, "Oh, that's interesting, but I've never tried it" to "You're insane." Like, I literally got called insane
00:13:44 Ethan Garofolo
for suggesting that three people work together on the same code. "But why would you use three people to do the work of one?" Well, it turns out it's not the work of one, et cetera, et cetera.
00:13:52 Ethan Garofolo
But I don't think that's how programming started out either. I think it was also viewed as a, but I don't know.
00:13:59 Ethan Garofolo
Like I've seen pictures from like when Grace Hopper was doing stuff. There's pictures of her surrounded by other people working together.
00:14:07 Ethan Garofolo
It's like, that's mobbing. That's what it looks like. I don't know at what point it became so individual.
00:14:13 Dr Genevieve Hayes
It's interesting, 'cause I've had experiences like that where I've been working with other people to get a board paper out, for example, and these are not programming experiences, but, you know, we've had an hour to get the board paper out.
00:14:28 Dr Genevieve Hayes
We can't do it back and forth, so one of us will be sitting at the keyboard typing and.
00:14:33 Dr Genevieve Hayes
Everyone will just be shouting at that person and you end up with something really amazing at the end of that hour, and it's way better than whatever board paper you could have produced in a week had you done the back and
00:14:44 Dr Genevieve Hayes
forth. So I think if you did something like that with programming, that would be really amazing.
00:14:50 Ethan Garofolo
Oh, it's a joy. Like, it's tiring. I'm an introvert, and so I take Susan Cain's definition of that. That means that it's not that I'm shy and not that I necessarily have poor social skills. If I have them, it's not because I'm an introvert. It's just that I don't recharge
00:15:06 Ethan Garofolo
from being in groups of people. And so when I have a good day of pair programming or mob programming, it's like, the team bonding is just incredible.
00:15:15 Ethan Garofolo
The work product is incredible. There's a lot of steps that I would have to do if I were working by myself that I don't have to do in a mob context, like the peer review process, because it was happening continuously
00:15:25 Ethan Garofolo
as we were writing the code. What purpose is there to another after-the-fact inspection?
00:15:30 Ethan Garofolo
But I come home, and it's like a good workout at the gym. It's like, I know this is good for me and I enjoy it, and in some sense I feel really good, but I'm tired right now.
00:15:38 Ethan Garofolo
I need to go isolate for a while to recover. But, you know, it is a joyful experience to, like, actually co-create. Now, it can go very poorly. Like, one of the anti-patterns for it is
00:15:50 Ethan Garofolo
when one person's just...
00:15:52 Ethan Garofolo
Like I stream on Twitch sometimes and when I stream on Twitch, I talk as I'm coding.
00:15:57 Ethan Garofolo
If mob programming were sitting in a room with me for four hours watching me do that, that would be excruciating.
00:16:02 Ethan Garofolo
I wouldn't wish that on anybody. But in these modes of work we take turns being the person at the keyboard, and that person is just taking notes from what the other people are saying.
00:16:09 Ethan Garofolo
And when that lands, like, it's just... I really like it.
00:16:13 Dr Genevieve Hayes
And I can imagine the worst case scenario, which is...
00:16:18 Dr Genevieve Hayes
Have you ever had one of those work experiences where the powers that be in an organisation
00:16:25 Dr Genevieve Hayes
take everyone away to an off site, put them in groups of people who don't know each other, and then say you've got an hour to come up with some brilliant idea that we can use in the corporate plan or something like that? And that's just painful.
00:16:40 Ethan Garofolo
It is, yeah, it takes time. The team needs some time to gel and doing the work together can help with that, but that sounds like a much higher pressure situation than the typical we need to do this work item on our project.
00:16:54 Ethan Garofolo
Are they like high pressure situations like you better have something after an hour?
00:16:59 Dr Genevieve Hayes
They usually are at the time. You know, one person from this group will have to present. But at the end of it, you know, the quality of ideas that are generated on the spot in an hour with no prior knowledge of the topic... you tend to end up with very poor quality outputs
00:17:19 Dr Genevieve Hayes
That every group does and you're like, I don't think any of these are actually usable. And then they sort of end up in a drawer. So.
00:17:28 Dr Genevieve Hayes
I think you have to know the people in the group if you're gonna do some sort of group development situation, and understand the background on the subject.
00:17:38 Dr Genevieve Hayes
So possibly it might help if you started off doing some work as individuals and then grouped together once you've developed an understanding of the project
00:17:49 Dr Genevieve Hayes
in your own right. And then that's when you could get... I guess it's like with musicians. I mean, you wouldn't have
00:17:55 Dr Genevieve Hayes
the members of a band learning how to play their instruments in a group. They'd learn them, and then
00:18:01 Dr Genevieve Hayes
group together to jam.
00:18:03 Ethan Garofolo
Yeah, that would be tricky to do.
00:18:04 Ethan Garofolo
I would agree that there has to be some level of familiarity with the craft to do that in a software sense. But I've had some friends who were competent developers but not familiar with the tech they were using at a new job, for example. But in working in an ensemble setup like that,
00:18:25 Ethan Garofolo
they said, wow, I got up to speed with this so much faster than I ever had. And so that probably wouldn't have gone as well if they'd had no programming experience. But I've tried this with my kids, actually. Two of my daughters were interested in programming, and so,
00:18:41 Ethan Garofolo
like, we would talk about what is the problem that we're trying to solve, and I would let them do the talking.
00:18:47 Ethan Garofolo
And I'd do the typing, because I knew the syntax, in this particular time that I'm thinking of. But I'd just prompt them with questions, like, well, here's this big thing we want to do.
00:18:55 Ethan Garofolo
How do we break that into something smaller, and keep doing that until we
00:18:58 Ethan Garofolo
could put it in lines of code? And they were able to
00:19:01 Ethan Garofolo
meaningfully participate in what we were trying to build together, because they're capable of thinking and breaking it down. They just don't know how to render it into the syntax.
00:19:10 Ethan Garofolo
Did that help them learn the syntax? I don't know. But it was a good bonding experience. And I think you're right, you probably do have to have
00:19:16 Ethan Garofolo
some minimum level for that to be a positive thing. But certainly for, like, acclimating to the norms of a new company or a new team, I find that's very hard to do solo, because that type of stuff tends to not be documented.
00:19:30 Ethan Garofolo
I don't even know that that's a good use of time, to document things like, oh, this is how we indent our code, or this is how we do
00:19:37 Ethan Garofolo
X, and sometimes that's better left to people just communicating with one another. I think that kind of thing would be very hard to learn solo. Although the actual craft of programming, you can definitely level up. And I mean, I did.
00:19:50 Ethan Garofolo
I mean, not just because I did something that means it's the right way to do it, but I did manage to learn how to get computers to do things by myself.
00:19:58 Ethan Garofolo
So yeah, that would be strange. Like four people show up who've never played any instrument. Each one picks one up. All right, let's do this.
00:20:07 Dr Genevieve Hayes
I heard the Ramones taught themselves how to play their own instruments. So.
00:20:12 Ethan Garofolo
Did they?
00:20:12 Dr Genevieve Hayes
Who knows? Yeah, apparently. Yeah. It's why if you hear a lot of their songs. Yeah, a lot of them are very simple musically, but yeah.
00:20:21 Ethan Garofolo
That's fantastic. Good for them.
00:20:24 Dr Genevieve Hayes
Have you ever actually had any experiences working with data scientists or data analysts in any of your jobs?
00:20:31 Ethan Garofolo
A little bit. Yeah. My last position, we had a data science team, and the interaction with them was mostly... they were... I forget who coined it, but there's, like, five or seven phases in this model
00:20:47 Ethan Garofolo
of, like, something going from idea to being a product, and the various steps along the way. I don't know if any of that is ringing a bell for you.
00:20:54 Dr Genevieve Hayes
Is this the CRISP-DM process? So business understanding, data understanding,
00:21:00 Dr Genevieve Hayes
data preparation, modelling, evaluation, deployment.
00:21:08 Ethan Garofolo
That could very well be. That sounds vaguely familiar. They were still towards the research side of all that, figuring out what they were going to do.
00:21:18 Ethan Garofolo
And so my involvement was to just become aware of it because the plan was they would develop models that would then be incorporated back into the product so that customers could.
00:21:28 Ethan Garofolo
System and so it's just like, hey, we need you folks to understand what the constraints of working with our system will be when you're ready to make it a product.
00:21:37 Ethan Garofolo
And that was it. That's as far as we got. We never. By the time that I had left there, we didn't get it integrated back into the main product.
00:21:46 Dr Genevieve Hayes
Did you see any
00:21:48 Dr Genevieve Hayes
possible issues that could have arisen, or any strengths that were coming out of what you observed of this?
00:21:55 Ethan Garofolo
It was super impressive to me. From just exercising the tooling that they had, they found some... this is going to sound like I'm making this up, because I don't remember the particulars. I just remember that, taking the data and exercising it, they found some insights that were useful to the business.
00:22:16 Ethan Garofolo
And they published them on LinkedIn, and some other people said, oh wow, that sounds interesting and useful. It was venture capital related data. And so just noticing patterns
00:22:25 Ethan Garofolo
of, like, the types of business structures people do and how money was flowing, particularly with what's happened in the recent six months, I think that research was very interesting.
00:22:38 Ethan Garofolo
And so that was cool. Like, that ability to mine these insights from the data, I was really impressed with that.
00:22:45 Ethan Garofolo
Now, the dangers that I saw of it... gosh, I don't know if they're gonna listen to this, but it felt very, like, cowboy work to me, in a sense. That's, like,
00:22:55 Ethan Garofolo
there wasn't a lot of... I wasn't seeing it, anyway. Just because I don't see something doesn't mean it wasn't there.
00:23:01 Ethan Garofolo
Of, oh, when we turn this into a product we need to integrate this with the rest of the system.
00:23:08 Ethan Garofolo
Kind of a... I don't know, and that could just be, like, failings in my own character, but I didn't find the same warmth towards playing with the rest of the system as
00:23:22 Ethan Garofolo
we were trying to give. Like, hey, we want to support you in what you're doing.
00:23:25 Ethan Garofolo
Let's work together. How do we figure this out? Because you know a bunch that we don't. We need it.
00:23:31 Ethan Garofolo
It's going to be great, but we need our customers to be able to leverage it as well, so.
00:23:35 Ethan Garofolo
I don't know.
00:23:35 Dr Genevieve Hayes
That's a fair observation, 'cause that's something that... I think a lot of data scientists, they are trained in how to find those insights. So they are very good at it.
00:23:45 Dr Genevieve Hayes
And when I've spoken to other non data science guests, or just people I know, the thing that they often identify as the greatest strength of the data science team is the ability to find insights.
00:24:00 Dr Genevieve Hayes
But I would say, based on my experiences as a data scientist and what I know of how data scientists are trained, there isn't a lot of training in how you take that and integrate it back into a system. So I don't think it's so much people deliberately trying to be cowboys. I think it's that they
00:24:20 Dr Genevieve Hayes
haven't received that training in, OK, so what's next?
00:24:25 Ethan Garofolo
I love that observation that you make there, because that's one of my core beliefs too: whenever you're seeing something
00:24:32 Ethan Garofolo
not going the way that you want it, especially in a business setting like that, don't look to the people, look to the system.
00:24:39 Ethan Garofolo
What is the system incentivizing? What has the system provided? And the real fix to that is not to be like, software developer, you have unreasonable expectations, or, data scientist, you're a diva. It's, why is this not flowing
00:24:52 Ethan Garofolo
correctly in the first place? Let's fix the system and then we'll get it there. Data science is a very new discipline.
00:24:58 Ethan Garofolo
And so there wouldn't be hundreds of blog posts about how to incorporate this. And to point the finger right back at myself,
00:25:06 Ethan Garofolo
I think everything I just said there is what, like, every business person says about their software development teams. Like, we can't understand anything you're doing. So how do we talk to each other?
00:25:18 Dr Genevieve Hayes
So how do we fix the situation? So if you encounter that situation again and you're in a position where you could do something about it, how would you work with that data science team to help them integrate their findings back into the?
00:25:35 Ethan Garofolo
Yeah, this gets into, like, in the bio part that you read about me, talking about Lean and DevOps principles. DevOps (how do I make this not, like, a 15 minute long answer?) is a portmanteau of development and operations. And so the software development and running it on the servers were typically organised as two completely different teams,
00:25:57 Ethan Garofolo
different reporting structures up to the VP.
00:26:01 Ethan Garofolo
And so software development... and I'm taking a lot of this from The DevOps Handbook, fantastic book. What am I trying to say? Right.
00:26:08 Ethan Garofolo
So development's job was to deliver new functionality, change things. Operations' job was to keep it running. They want stability. And there's a core chronic conflict, as they say in the book. Like, those two goals cannot be reconciled
00:26:21 Ethan Garofolo
so long as they are separate teams with separate reporting structures. And so what the DevOps movement, inspired by things like Lean, came to realise is, like, why are these separate teams? Isn't the goal delivering value to the customer?
00:26:37 Ethan Garofolo
What if you merged them and so people who were operations are now part of the same team? Same manager, same reporting structure.
00:26:46 Ethan Garofolo
And so like at this last job that I'm talking about, data science was a separate department. We had a VP of Engineering that my line of reporting went up through.
00:26:54 Ethan Garofolo
Data science went up through the VP of Data Science, and those being separate, not quite departments, but separate lines in the org chart, that creates a conflict
00:27:03 Ethan Garofolo
again. And I'm just always like, how do we align around the customer value we're trying to deliver? Who cares what the specific tools are to get there?
00:27:11 Ethan Garofolo
I mean, we need to make sure that a team is composed of everyone we need to deliver it. And, like, am I a radical in my thinking on this?
00:27:17 Ethan Garofolo
I'm like, why is customer support a separate division? Aren't we all part of the same product team trying to deliver an
00:27:24 Ethan Garofolo
experience? So that's where I'd start looking first.
00:27:27 Dr Genevieve Hayes
I actually think that would work very well, 'cause one of the jobs I had earlier in my career... so I started off working in insurance, and I did a job in insurance pricing, and I was in the premium pricing team, and it was part of the broader premium division. So you had in this one division
00:27:48 Dr Genevieve Hayes
the people who were doing the pricing, which were the programmers, statisticians, things like that, and they were in the same broad division as the people who were doing the legislation behind the premiums, the people who were actually deploying things into production, the people who were auditing premiums.
00:28:09 Dr Genevieve Hayes
You know, every single premium related function was in this one division, and it was really great, because, you know, everyone who was trying to achieve the same goal was
00:28:19 Dr Genevieve Hayes
in that one area, under the one director.
00:28:23 Ethan Garofolo
Yeah. No, that's
00:28:24 Ethan Garofolo
super interesting because, like, in the kind of work that we do, and I'll include us both in the same thing, we're probably working on something where the solution is not known yet because we don't know exactly what it is yet, which is very distinct from manufacturing. Like, you do manufacturing well when things don't change and you just
00:28:45 Ethan Garofolo
Out the same thing. If you and I do work that's just what's already been done before, we're not adding any value, we're just consuming dollars at that point because
00:28:54 Ethan Garofolo
by definition, we're trying to figure out the unknown, and so anything we can do to structure the work to get more test, observation, refine, test, observation, refine, that's where the power comes in.
00:29:07 Ethan Garofolo
And so if we're trying to achieve objective X but we organise our teams such that we have to hop teams to do one of these iterations.
00:29:16 Ethan Garofolo
We're really slow in getting more data and discovering new knowledge and deploying it in our customers' objectives. And so I think that's what it comes down to, like how good is our feedback loop, and if we've got the wrong teams slash system,
00:29:32 Ethan Garofolo
wrong is such a loaded word. What am
00:29:35 Ethan Garofolo
I trying to say?
00:29:36 Ethan Garofolo
We're not optimising for that learning path. I think that hurts us a lot in knowledge work.
00:29:43 Dr Genevieve Hayes
Yeah. And also I mean what you're saying about having all the teams that aren't all working towards a common purpose, you can also have issues with if team one has this task as being its #1 priority.
00:29:56 Dr Genevieve Hayes
And team two has that task as its #10
00:30:01 Dr Genevieve Hayes
priority, you're going to end up with bottlenecks at some point.
00:30:05 Ethan Garofolo
Big time, big time, and then everybody gets frustrated. And another thing that I could go off for hours and hours about is like that creates the conditions where then you need people to act heroically to get anything done.
00:30:18 Ethan Garofolo
Lean has its seven wastes that it talks about. Things like motion is waste in manufacturing because that's time that's not going towards delivering value to the
00:30:28 Ethan Garofolo
company. Husband and wife author pair Tom and Mary Poppendieck, they wrote a book where they said, hey, let's take these ideas of Lean waste and apply them to software development.
00:30:37 Ethan Garofolo
And then the authors of The DevOps Handbook added a couple more to that, and one of them is heroics. When you have all these competing priorities, who finally is the arbiter
00:30:48 Ethan Garofolo
of what we're going to do first as a company? The proper deciding factor of that
00:30:52 Ethan Garofolo
is the first person who's a common manager to these different groups. Like, that's the role, that's what they're supposed to do, is arbitrate things like that if they're not already providing the proactive leadership.
00:31:04 Ethan Garofolo
But unfortunately you get a lot of situations where management abdicates that responsibility, and I'll use management and leadership somewhat interchangeably here.
00:31:13 Ethan Garofolo
And I think that management gets a bad rap. People look down on it and we want leaders. Like, well, do we? Would we rather have systems of work where we didn't have the problems that required leaders in the first
00:31:24 Ethan Garofolo
place? And if we structure the separate teams and give them competing priorities, it's hard to not end up in that situation.
00:31:32 Ethan Garofolo
It'd take very deliberate effort, and so then I'm thinking too, like,
00:31:36 Ethan Garofolo
with data scientists, thinking back to my experience at the last job, like not being familiar with the tooling, not being familiar with the objectives, if we're on the same team, like,
00:31:45 Ethan Garofolo
I think that too would create a lot of empathy between us, like first of all, like, I'd start to understand their craft a lot better.
00:31:51 Ethan Garofolo
I would never be able to do it to the same level without years of study like they have.
00:31:55 Ethan Garofolo
But I would start to understand it and the challenges that they face and vice versa, they'd get the same thing.
00:32:00 Ethan Garofolo
And so now if they're like, you know what, we need to focus on data science because now we have that bond.
00:32:05 Ethan Garofolo
It's like, ohh cool. I'm on board with
00:32:07 Ethan Garofolo
that. How can
00:32:08 Ethan Garofolo
I help? Versus like, those darn data scientists.
00:32:12 Dr Genevieve Hayes
The reason why I really love working with software developers now is because I had two experiences. So when I did my masters of computer science.
00:32:21 Dr Genevieve Hayes
The majority of my fellow students were software developers or software engineers before they'd done that masters, and so when I was in all the class discussion forums,
00:32:31 Dr Genevieve Hayes
I was constantly seeing all these posts written by software developers, and they'd be saying no. This is how you do it in software developer world.
00:32:39 Dr Genevieve Hayes
And that really made me understand things from the point of view of the software developers, because my background was in statistics.
00:32:45 Dr Genevieve Hayes
So I was looking at it from a very different way and that's when I first started to really appreciate.
00:32:51 Dr Genevieve Hayes
The value that software developers could bring.
00:32:54 Dr Genevieve Hayes
But then in a previous job I had, I started off in the data science team and there was very much this conflict between the data scientists and the software developers.
00:33:07 Dr Genevieve Hayes
Each team seemed to have this attitude that the other team was doing something annoying that was stopping them
00:33:14 Dr Genevieve Hayes
from achieving
00:33:15 Dr Genevieve Hayes
their goal. And then I ended up spending, I don't
00:33:19 Dr Genevieve Hayes
know, about
00:33:21 Dr Genevieve Hayes
6 to 9 months in one of the software development teams as part of, you know, a particular project I was working on, and I started attending all these meetings of this software developer team and suddenly I was hearing their point of view
00:33:35 Dr Genevieve Hayes
On what was happening and I realised they weren't deliberately trying to be obstructionist, they were actually trying their best to do what the data science team wanted.
00:33:47 Dr Genevieve Hayes
It's just that they didn't understand what that was and what was interpreted by the data science team as the software developers being annoying.
00:33:59 Dr Genevieve Hayes
The software developers in their meetings were saying we're trying to do what they want. They just won't tell us. They just can't.
00:34:07 Dr Genevieve Hayes
Articulate to us in the language.
00:34:09 Dr Genevieve Hayes
We need what that is and it just felt you needed some sort of bridge between those two teams that could communicate that.
00:34:19 Dr Genevieve Hayes
And so I think if you had more opportunities for software developers to go and sit in a data science team and data scientists to go and sit in a software development team,
00:34:31 Dr Genevieve Hayes
I think there'd be fewer conflicts and more. Ohh right, we're all trying to achieve the same goal. It's just we're not necessarily all speaking the same language.
00:34:40 Ethan Garofolo
Yeah, I think there's a lot to that cause even.
00:34:42 Ethan Garofolo
With design and software development, you can get the same kind of conflicts going on. Why can't the developers just render the designs the way that I'm drawing them?
00:34:52 Ethan Garofolo
Or then on the software side, it's like, why doesn't the designer understand how difficult it is to do the thing they're asking for? Could we not just talk about this and the best ways that I've seen to solve that is?
00:35:03 Ethan Garofolo
Let's quit being separate teams, right? We're working on the same goal. Let's sit together. Like, we'll sit down. If we're remote, we're remote. Fine. We'll get on a Zoom call, Discord, what
00:35:12 Ethan Garofolo
ever. And I'll put my screen up as the developer and the designer can say, OK, that thing is off by a little bit.
00:35:19 Ethan Garofolo
Can you make this change? Sure. Quick, I'll do it, and then we can see it in real time and respond to that.
00:35:25 Ethan Garofolo
I wonder, I'm inclined to say yes, my bias is telling me yes, that would also help with data science for the same reasons, because of fostering that team spirit,
00:35:34 Ethan Garofolo
The shared purpose getting to know one another, getting to understand the challenges of each other, seeing.
00:35:39 Ethan Garofolo
And then have you heard the notion of the T shaped expertise?
00:35:44 Dr Genevieve Hayes
Yeah, yeah, I've heard about that.
00:35:46 Ethan Garofolo
OK. Yeah. So like if a designer works long enough with developers, I truly believe that any designer could pick up how to code in HTML and CSS, even pick up some JavaScript.
00:35:59 Ethan Garofolo
I worked with a designer who did that. It was the most fantastic experience I've ever had interacting with the design
00:36:04 Ethan Garofolo
team. And then if developers spend enough time around designers and observing what they're doing and their reasons for it, and if we can have that conversation, here's why I'm doing this,
00:36:14 Ethan Garofolo
I think designers could pick up way better, I'm sorry, developers could pick up way better design skills and intuit things better, understand, like, oh yeah, this is harder for me to do what you're asking, but what a better result
00:36:25 Ethan Garofolo
it will deliver for the customer. I think that kind of interaction really would benefit any people with different skills who are trying to achieve the same end, and it might even help people who think they aren't trying to achieve the same end realise that they are. It sounds very Kumbaya.
00:36:44 Ethan Garofolo
I like, I don't know, like I don't know how to get people to feel like they're on the same team if they're not working together.
00:36:51 Ethan Garofolo
And I don't think that a daily meeting or an every six month design review or once a week meeting constitutes working together.
00:36:58 Ethan Garofolo
Like if we're not in the work literally together and that's where like, people will call me.
00:37:04 Ethan Garofolo
Crazy because like.
00:37:05 Ethan Garofolo
Oh, what's the designer going to do while developers are sitting around coding like I don't know, but again.
00:37:12 Ethan Garofolo
I'm not interested in maximum utilisation of people, I'm interested in work progressing and I think people get a lot more satisfaction from completing things than they do from just producing a bunch of half completed work.
00:37:27 Dr Genevieve Hayes
And I think.
00:37:28 Dr Genevieve Hayes
There are skills that, yeah, I'm not saying data scientists should go and do software development.
00:37:34 Dr Genevieve Hayes
But I think if a data scientist spent a little bit of time on a software development team just to understand it when I was working in that team when they had.
00:37:44 Dr Genevieve Hayes
you know, too much work to do, I'd often say, look, I can programme in Python, you're Python programmers. If you've got a task that you need done because you've got too much on your plate, just send it my way and I'll give it a go.
00:37:57 Dr Genevieve Hayes
I'm not going to be the best programmer, but yeah, I can do something and that was how I learned about test driven development.
00:38:06 Dr Genevieve Hayes
Because that was the framework that was being used there and that taught me stuff which I then was able to take over into my data science work.
00:38:15 Dr Genevieve Hayes
Test driven development's actually something that you mentioned in your emails and I'd actually like to discuss that a bit
00:38:22 Dr Genevieve Hayes
With you now.
00:38:23 Ethan Garofolo
Yeah. No, I think it's a great practise. Not everyone does, but how would I describe it? I guess it it is a development activity. The word test in there is joined with a hyphen to driven. So test driven is a modifier.
00:38:37 Ethan Garofolo
On the word development, it's a way to do development and I might even go so far as to say it's a design exercise, not like visual UI design, but design of the software design exercise. Because keeping with that theme of fast feedback cycles is.
00:38:53 Ethan Garofolo
How do I know if I have developed code that is usable if no one's trying to use it?
00:39:00 Ethan Garofolo
Got no idea. A test suite is something that's trying to use my code and so if my code is not testable, there's a really good chance it's also not usable and that if someone were trying to use my module.
00:39:14 Ethan Garofolo
to build their thing, they would hate it. So that's one of the big benefits I see to it. And then the other one is, like, the actual cycle of it
00:39:22 Ethan Garofolo
is I write a failing test, and then I write code that makes the test pass, and then I can make sure that all of the tests pass that I have up to this point, and then I can refactor the code.
00:39:35 Ethan Garofolo
So not assuming that everybody knows what refactoring means, but it's where I make changes to the code that don't change its functionality.
00:39:42 Ethan Garofolo
That make it more maintainable and easier to work with that kind of a thing.
00:39:46 Ethan Garofolo
But unspoken in that is a practise that a friend of mine, he actually presented at our meet up on this, on TDD, and he said, don't just code and then run the tests. Code, pause, anticipate in your mind what the result of running the test will be, and then run the test.
00:40:05 Ethan Garofolo
That changed my world. Like, I was all for test driven development, but that added step of stop, think what you think is going to
00:40:13 Ethan Garofolo
happen. It's not good enough for stuff to work if I don't understand why it was working. Like, if I was expecting a failure and it passed, not good enough.
00:40:22 Ethan Garofolo
I'm not done. I'm going to go back in and I'm going to break the code until I understand why it's running the way that it is. I think that discipline, I think that could be a great thing in data science.
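As a rough illustration of the cycle Ethan describes (write a failing test, write code that makes it pass, rerun every test, then refactor), here is a minimal Python sketch. The function scale_to_unit_range and its tests are hypothetical, invented for this example and not taken from the conversation. The comments mark where the "pause and predict" step would sit.

```python
# Hypothetical example: a small min-max scaling helper built test-first.
# The names below are assumptions made for illustration only.


def scale_to_unit_range(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every value is identical, so avoid dividing by zero.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


# Step 1: this test was written before the function existed.
# Prediction before the first run: it fails with a NameError.
def test_scales_to_unit_range():
    assert scale_to_unit_range([10, 15, 20]) == [0.0, 0.5, 1.0]


# Step 2: a second failing test, added after the first one passed.
# Prediction before running: ZeroDivisionError from the naive version.
def test_constant_input_does_not_divide_by_zero():
    assert scale_to_unit_range([3, 3, 3]) == [0.0, 0.0, 0.0]


# Step 3: a third test for the empty case, then a refactor of the function
# body (naming lo, hi and span) with all tests rerun to confirm that the
# behaviour did not change.
def test_empty_input():
    assert scale_to_unit_range([]) == []


if __name__ == "__main__":
    for test in (
        test_scales_to_unit_range,
        test_constant_input_does_not_divide_by_zero,
        test_empty_input,
    ):
        test()
    print("all tests passed")
```

If a prediction turns out wrong, for example a test passes when a failure was expected, that is the cue to stop and work out why before moving on.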
00:40:32 Dr Genevieve Hayes
I think that's a good idea because.
00:40:35 Dr Genevieve Hayes
When I'm writing tests, the two things I'm always scared about are what happens if I don't have full coverage and what happens if I've written tests that are too simple and therefore they're not covering the harder scenarios. If you do that, pause and think.
00:40:55 Dr Genevieve Hayes
Then you could say, OK, are my tests complex enough to actually?
00:41:00 Dr Genevieve Hayes
Allow for all the possible scenarios, or am I deliberately writing tests that are super trivial so that I'll just tick the box that says pass and then I can go on to the next task?
00:41:13 Ethan Garofolo
Yeah, that's a fun one coverage. There's a lot of schools of thought on this, and some people are like we need 100% test coverage.
00:41:20 Ethan Garofolo
But you can write tests that cover 100% of the code that don't actually test anything. There are code coverage tools that you can plug in to your Ruby, your JavaScript code. I'm sure every language has this.
00:41:33 Ethan Garofolo
And it will give you a report at the end of the test suite of which lines were actually exercised by the tests and which ones weren't. But you could then just not put any assertions in your tests.
00:41:43 Ethan Garofolo
And so, yep, every line was touched, but we didn't actually validate that anything worked the way that we wanted it to.
00:41:48 Ethan Garofolo
So I don't know. Like I'm not a 100% coverage person. I'm certainly not a 0% coverage person.
00:41:56 Ethan Garofolo
TDD, though, isn't testing, so I would never say that because I have a test suite that came from TDD, I have
00:42:03 Ethan Garofolo
satisfied the testing requirement
00:42:05 Ethan Garofolo
of some piece of software. It's a development tool. It happens to give me runnable tests, which has some really nice features to it, but the discipline of testing is different from the discipline of test driven development.
00:42:17 Ethan Garofolo
All of that said, that's another thing where I think that working in pairs or in mobs is very helpful, because someone's checking my blind spots.
00:42:26 Ethan Garofolo
I could write a test and someone's like, wait Ethan, what is this actually testing? That's a trivial test.
00:42:30 Ethan Garofolo
What value is there from that? And then I can have that moment of like, oh, that's a really good point.
00:42:35 Ethan Garofolo
We try to do that in software with after the fact reviews, but I've just never seen those be thorough.
00:42:41 Ethan Garofolo
There's always the fear that, like, ohh, someone's gonna accuse me of nitpicking or whatever, but then also, after the fact, I have no insight into what your thought process
00:42:50 Ethan Garofolo
is as you're writing it, and so suppose I see something that needs correction. I don't even know what you were thinking. So now we've got to get together anyway to talk through what you were
00:43:00 Ethan Garofolo
Thinking and same thing when you're reviewing my code. So I just think that like I, I wouldn't worry too much about the coverage, what these tools would report and what your test coverage is.
00:43:11 Ethan Garofolo
But as for what's too simple, that's where the peer review is so important, and I think that's best done in real time.
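To make the point about coverage concrete, here is a minimal Python sketch of a test that exercises every line without validating anything, next to one that checks behaviour. The function classify_risk and its 0.5 threshold are hypothetical, invented for this illustration.

```python
# Hypothetical example: a tiny decision rule and two styles of test for it.
# The function name and the 0.5 threshold are assumptions for illustration.


def classify_risk(score):
    """Label a score as 'reject' at or above 0.5, otherwise 'accept'."""
    if score >= 0.5:
        return "reject"
    return "accept"


def test_touches_every_line_but_checks_nothing():
    # Both branches run, so a coverage tool would report every line of
    # classify_risk as exercised. With no assertions, this test would
    # still pass if the two labels were accidentally swapped.
    classify_risk(0.9)
    classify_risk(0.1)


def test_checks_behaviour():
    # Same lines exercised, but now the results are actually validated,
    # including the boundary value.
    assert classify_risk(0.9) == "reject"
    assert classify_risk(0.1) == "accept"
    assert classify_risk(0.5) == "reject"


if __name__ == "__main__":
    test_touches_every_line_but_checks_nothing()
    test_checks_behaviour()
    print("both tests passed")
```

Both tests would give the same coverage figure, which is why the figure on its own says little about whether anything was verified. A pairing partner or reviewer asking "what is this actually testing?" is what would catch the first one.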
00:43:18 Dr Genevieve Hayes
I do think it's something that I I know it's.
00:43:21 Dr Genevieve Hayes
Something that can make your code a lot more robust.
00:43:24 Dr Genevieve Hayes
Because I think a lot of data scientists aren't in the discipline of doing it, and then they discover the issues with their code once it's reached the deployment stage, and then they've got a model in production and someone phones up and says, hey, do you know your model's doing this weird thing and rejecting everything when?
00:43:44 Dr Genevieve Hayes
So if you've actually done that test prior to deployment, then you wouldn't have to get that phone call.
00:43:50 Ethan Garofolo
Yeah, I could see that. Like the, who was it? Someone said that. It was Dijkstra, I think,
00:43:58 Ethan Garofolo
had the quotation that, like, computer science is as much about computers as astronomy is about telescopes, kind of a thing. And so, like, data science probably is rendered in computer programmes. That's the crude
00:44:13 Ethan Garofolo
material that you have to work with to realise the data science.
00:44:16 Ethan Garofolo
Well, so that programming aspect of it isn't the data science, but it can sure stop the data science from providing value to people if not done well.
00:44:27 Dr Genevieve Hayes
I think a lot of data scientists would benefit from learning to strengthen their programming, and well, everyone would
00:44:37 Dr Genevieve Hayes
benefit from strengthening all the different elements that make up their career.
00:44:40 Ethan Garofolo
I think for sure. Yeah, absolutely. With finite time,
00:44:45 Ethan Garofolo
What do you pick and choose? And that's why I always come back to the team. The team can do that, like put the people together.
00:44:52 Ethan Garofolo
I have yet to work at a job where if I make a mistake, someone's going to die because of it.
00:44:58 Ethan Garofolo
So a lot of examples I'll use, people will say like, oh well, we don't have that big of a consequence.
00:45:03 Ethan Garofolo
It's not that big of a deal if we fail. Sure, totally grant that. But I do think it's very useful to look at how teams organise when lives are on the line.
00:45:12 Ethan Garofolo
Take the special forces unit for example. Like everyone has basic soldiering skills, but you have someone who's really deep in medicine to deal with wounds. You've got people very well versed in demolitions and so on and so forth.
00:45:27 Ethan Garofolo
And why do they organise that way? It's for that real time feedback loop so that we can respond to new information and exploit new information very quickly.
00:45:36 Ethan Garofolo
And so if I got a data science task and started running away with it by myself, I would come back with something that would probably look like an abomination to someone like you who actually knows what they're doing. But if we work together,
00:45:49 Ethan Garofolo
Or you could correct in real time and then as far as wiring it up to the software that could drive it and make that a product, we can fix that in real time.
00:45:59 Ethan Garofolo
And I think you're absolutely right. Like, it has to function as software to be used. That's until we develop some other interface for these models. But, and I love the way you phrased it, not getting that
00:46:10 Ethan Garofolo
call. Nobody wants that call.
00:46:12 Dr Genevieve Hayes
Oh yeah.
00:46:13 Ethan Garofolo
Like we want to sleep at night, not debug production servers.
00:46:18 Dr Genevieve Hayes
I've done several jobs in government. The other thing we always say is when you're in government, you want to avoid ending up on the front page of the
00:46:27 Dr Genevieve Hayes
newspaper. Probably no one's going to die unless you're working in a military type discipline. But you don't want to make a mistake so big that the newspaper decides to say, hey, look at this stuff-up that government organisation made.
00:46:45 Ethan Garofolo
Yeah, that's rough. The US's FAA went through one of those a few months ago with some software error that, like, grounded all air travel in the country for a
00:46:56 Ethan Garofolo
day or two. Ohh.
00:46:57 Dr Genevieve Hayes
Yeah, we heard about that in Australia.
00:47:00 Ethan Garofolo
OK, Interconnected systems, flight delays in the US gonna affect everywhere, because people coming to the US, people trying to get away from it. So I don't mean it in that sense. People who have a flight.
00:47:11 Ethan Garofolo
Scheduled to leave the US.
00:47:13 Ethan Garofolo
One delay causes another that causes ten others, and then the whole world is at an awful not a good time to be.
00:47:20 Ethan Garofolo
Going by plane but I.
00:47:22 Dr Genevieve Hayes
And from what I can gather it was just some really dumb little error that caused it. Was that right?
00:47:28 Ethan Garofolo
Yeah, as I recall, it was something to do with some file,
00:47:32 Ethan Garofolo
some configuration file, and I remember being upset at the headlines because they're like, FAA traces it back to actions that this one employee took. It's like, OK, sure, that person typed the keys.
00:47:46 Ethan Garofolo
Why is there a system where someone could type the keys and ground aeroplanes across the world? Like, that's management and systems failure, not this poor person getting blamed for it.
00:47:56 Dr Genevieve Hayes
When you're dealing with software development or with data science.
00:47:59 Dr Genevieve Hayes
It's often the case that you've got senior management who doesn't understand the technical disciplines and they're relying on people in a software development team or data science team to take responsibility for whatever they're producing when for appropriate risk management.
00:48:19 Dr Genevieve Hayes
The people in senior management should at least have enough of an understanding of what the technical teams are doing, even if they can't programme or fit models so that they can ask the right questions.
00:48:33 Dr Genevieve Hayes
To make sure that you're not going to get problems like that.
00:48:37 Ethan Garofolo
100% in agreement with you. It's not to excuse we techies from doing our bit to outreach and facilitate that transfer of understanding.
00:48:47 Ethan Garofolo
But yeah, if you are going to manage a technical organisation, you're going to have to bone up on your technical skills like you don't have to code. But.
00:48:56 Ethan Garofolo
I don't know. Pure speculation, but I'm saying, like, if you're a VP, director level person,
00:49:02 Ethan Garofolo
You should be able to pop into a mob session and make a meaningful contribution. You may not type the code, but you'd be right there to answer a question if one pops up, you might be able to make a clarification like hearing your team talk about what they're doing.
00:49:17 Ethan Garofolo
You're like, oh, wow, they totally misunderstood what we were trying to do. I'm right here to correct it. That's great.
00:49:22 Dr Genevieve Hayes
I had this fantastic boss early in my career and I was programming in SAS and I had a boss, he couldn't programme but he
00:49:32 Dr Genevieve Hayes
had learnt to understand SAS code. So he couldn't write it from scratch, but if I showed him a piece of code, I could talk him through it and he could understand. You know, he understood things like if-then statements, all the mathematical type calculations, for loops, things like that.
00:49:53 Dr Genevieve Hayes
And I thought that was really great because even though he couldn't write the code from scratch, he could understand the logic behind it so that he could understand what it was doing.
00:50:02 Ethan Garofolo
I love that. That fits with a lot of my understanding of Lean too. I don't know if this is true or apocryphal, but supposedly when Toyota gets a new frontline manager, they'll take that person down to the production floor and draw a chalk circle on the ground and be like, this is your office. Like, you need to be here
00:50:24 Ethan Garofolo
Observing the work being done not up there, not observing the work being done.
00:50:29 Dr Genevieve Hayes
With this guy, he was, as I said, he was probably one of the best bosses I've had in my
00:50:33 Dr Genevieve Hayes
life. He had been in that organisation for over 20 years when I started, so he had, I think he started when he just graduated from university, and well, if he'd been there for over 20 years, he would have been in his early 40s. So he knew everything about that organisation, from actually living through different jobs at different levels.
00:50:56 Ethan Garofolo
That's awesome. In tech, they say you do your career a disservice when you stay at a place for a
00:51:02 Ethan Garofolo
long time. But when you do stay in a place for a long
00:51:05 Ethan Garofolo
time, there's just
00:51:06 Ethan Garofolo
so much knowledge. You have the implicit knowledge of the organisation and the reasons why things were done, and I don't care how good your documentation game is, like, you can't capture all of that in documentation.
00:51:19 Ethan Garofolo
You got to be there to see it happening.
00:51:20 Dr Genevieve Hayes
I've been at organisations where for various reasons they've had a
00:51:26 Dr Genevieve Hayes
high turnover rate in the organisation or
00:51:28 Dr Genevieve Hayes
the team, and even though people have to document things on their way out, you lose so much knowledge when those situations happen.
00:51:37 Ethan Garofolo
Yeah, but we when.
00:51:39 Ethan Garofolo
We talk about retention costs. We're always like, oh, the cost of hiring a new employee.
00:51:43 Ethan Garofolo
Yes, it's high, especially if you're working through recruiters, but I don't think we even attempt to do the accounting on how much is lost from the knowledge.
00:51:53 Dr Genevieve Hayes
And when people leave an organisation, I mean, I'm not saying people deliberately try and leave things out of their handover notes to sabotage the organisation, because I don't think people
00:52:04 Dr Genevieve Hayes
would do that, but I know when I've done documentation in the past, I'll do the documentation and then two weeks later think to myself, ohh hell, I should have included whatever it was in it, and it wasn't that
00:52:17 Dr Genevieve Hayes
I was trying to sabotage anyone. It was just at that time I didn't think of this. And then it occurred to me
00:52:24 Dr Genevieve Hayes
When I was taking a shower at a later point in.
00:52:27 Ethan Garofolo
I forget the context, but someone was saying once that, like, as you develop expertise in something,
00:52:33 Ethan Garofolo
you don't even realise what are the things you just take for granted anymore. And I think that plays into when you're trying to document,
00:52:40 Ethan Garofolo
ohh, what's the most important stuff that I could leave for the company now that I'm departing? If you're growing in your knowledge and at the frontier of your knowledge, you're like, oh, that's the stuff that's really important that I need to work on, but this stuff is just all taken for granted now. Sorry.
00:52:55 Ethan Garofolo
For those you can't see, I'm using my hands in amazing gestures to represent a body of knowledge, so you forget what the stuff is that you're actually doing, cause it's just innate. It's instinct now.
00:53:05 Ethan Garofolo
But someone who's going to take your place probably needs that stuff more than the stuff at the frontier of your knowledge, cause that's the basics of just how to do anything
00:53:14 Ethan Garofolo
In your role.
00:53:16 Dr Genevieve Hayes
Whenever I've started at a new organisation, the hardest thing I find is figuring out how to connect my code to the database.
00:53:25 Dr Genevieve Hayes
Or that way.
00:53:26 Dr Genevieve Hayes
Yes, because.
00:53:28 Dr Genevieve Hayes
Being able to programme in a particular language that's a transferable skill between organisations. Understanding the actual specific systems for an organisation that is unique to an organisation and once you've been there, that's well, doesn't everyone know how to access the data warehouse? No, no they don't and no one.
00:53:47 Dr Genevieve Hayes
Reference it.
00:53:48 Ethan Garofolo
I I think an analogue of that to the pure software development world would be how do I deploy my code and every organisation does it slightly differently?
00:53:56 Ethan Garofolo
Teams within the same organisation do it differently. That's something I really appreciate about where I am right now is that we don't rely on any one person to be the deployment person or to be the database.
00:54:09 Ethan Garofolo
Person we all rotate around working with each other and everyone is expected to deploy. I did in my first week. I was talked through.
00:54:18 Ethan Garofolo
How to execute the deployment script? I need to go back and do it slower to understand what each step.
00:54:22 Ethan Garofolo
Represents better but.
00:54:23 Ethan Garofolo
I pushed code to production in that week.
00:54:25 Ethan Garofolo
So like if
00:54:26 Ethan Garofolo
everyone on the team disappeared overnight, I could probably get a deployment out the next day, and that's cool.
00:54:33 Ethan Garofolo
I want to learn it deeper. That redundancy
00:54:35 Ethan Garofolo
I think is a good thing.
00:54:37 Dr Genevieve Hayes
I think you need to train.
00:54:38 Dr Genevieve Hayes
People early on and I think it's good that you're getting that training.
00:54:42 Ethan Garofolo
For sure. Absolutely. I'm grateful for it.
00:54:45 Dr Genevieve Hayes
Is there anything on your radar in the AI data and analytics space that you think is going to become important in the next three to five years?
00:54:55 Ethan Garofolo
That's a tough question, cause my mind immediately goes to specific technologies, but I think, like, from what I've been exposed to with, like, seemingly magic things like ChatGPT and LLMs of other flavours and all that kind of stuff, generative AI, the real challenge is how do we use it properly?
00:55:15 Ethan Garofolo
And what I mean by that is, like, GitHub has GitHub Copilot, or you can ask ChatGPT coding questions and
00:55:22 Ethan Garofolo
it'll generate a first pass of things for you. How do we leverage that while not abdicating what a human should do?
00:55:32 Ethan Garofolo
And what do I mean by that? It's like I don't wanna just take ChatGPT snippets and just paste them without understanding them.
00:55:37 Ethan Garofolo
But at the same time using ChatGPT has saved me probably hours in getting that initial understanding of something.
00:55:45 Ethan Garofolo
And so, even as far as non-AI technologies go, what I'm excited for in the future of software development, what I think will make the critical advantage, is how we build better systems of work. So, more so than the specific technologies, it's how do we incorporate these technologies into our workflows such that they are helpful.
00:56:07 Ethan Garofolo
Like there's the obvious, like we don't want to get into legal trouble for plagiarism or using data that we shouldn't have in the training.
00:56:13 Ethan Garofolo
But I work at a law firm right now, and so that's a big question. Like, can generative AI make contracts? Do we even want
00:56:21 Ethan Garofolo
it to? But it certainly could help us. So it's figuring out what the workflows are, how we take these technologies and make them not novelties, not land mines, but actual things that increase the capacity of people, like the Iron Man suit. Iron Man wasn't a robot; it was a suit that made Tony Stark
00:56:41 Ethan Garofolo
better at what he does. How do we make the technology function like that? I don't have any specific things.
00:56:47 Ethan Garofolo
That was a lot of answers. Or a
00:56:49 Ethan Garofolo
lot of words
00:56:49 Ethan Garofolo
to say I don't know.
00:56:51 Dr Genevieve Hayes
No, I think that's good, 'cause I've experimented with using ChatGPT for my own programming work.
00:56:57 Dr Genevieve Hayes
And I find it gets things wrong so often that
00:57:02 Dr Genevieve Hayes
I would be cautious using it if it was something I didn't know how to do myself already, because you need to have sufficient knowledge to be able to spot when ChatGPT's getting something wrong.
00:57:15 Ethan Garofolo
Yeah. That's like going back to the TDD discussion, taking that deliberate step of predicting what will happen when I run the test.
00:57:24 Ethan Garofolo
To me it's the same thing. I have to be in control of the system and understand what it's doing, and if I'm just pasting snippets of code from ChatGPT,
00:57:32 Ethan Garofolo
I am letting go of that understanding of how things are working. And so where it's been helpful is replacing that initial Googling.
00:57:41 Ethan Garofolo
Now I can ask a question of it and it'll produce a snippet of code with some minor explanations. And to me it's like, cool, now I know what I need to Google to actually go learn and
00:57:52 Ethan Garofolo
understand this.
00:57:53 Dr Genevieve Hayes
I've found when I've done something, if it's something I'm not familiar with, it is interesting to ask, "OK, ChatGPT, how would you do this?" to see if there is a better way of doing it.
00:58:04 Dr Genevieve Hayes
And sometimes I can see improvements, but sometimes I look at it and it's like, no, I think I did it
00:58:09 Dr Genevieve Hayes
better the first time.
00:58:10 Ethan Garofolo
Oh, that's awesome. I love some of the things it produces; it's just pure entertainment.
00:58:16 Dr Genevieve Hayes
And if you read some of the things people have tricked it into doing, and, yeah, they're, they're fun.
00:58:23 Ethan Garofolo
I asked it a question once where I wanted it to take a strong stance on something, some trivial thing like what video game is best, and it's like,
00:58:30 Ethan Garofolo
"As a language model, I cannot do
00:58:32 Ethan Garofolo
blah blah blah", and I was like, "Huh, what's the definition of a wet blanket?" And then it gave me the exact definition of that term.
00:58:40 Ethan Garofolo
I said, "Well said, ChatGPT, well said", because it just wouldn't take the stance and, I don't know, it was a wet blanket.
00:58:47 Dr Genevieve Hayes
To me, I've tried the whole "which is the best James Bond?" type question.
00:58:52 Dr Genevieve Hayes
And, "No, I cannot weigh in on this. This is a matter of personal opinion." "Yeah, but don't you reckon that Sean Connery is better than Roger Moore?" and that sort of thing. "No, no. That is a matter of personal
00:59:03 Dr Genevieve Hayes
opinion." You're no fun.
00:59:06 Ethan Garofolo
Yes, well, that's fine.
00:59:09 Dr Genevieve Hayes
What final advice would you give to data scientists looking to create business value from data?
00:59:14 Ethan Garofolo
I would give the same advice that I would give to a software developer learning how to drive more business value from the technology.
00:59:23 Ethan Garofolo
When you're young, you need to learn the tools, but as you get older, I think you'll find that you'll make yourself a lot more valuable to your org by learning more about your business and what it does
00:59:34 Ethan Garofolo
and what helps it succeed than by learning yet another programming language, or yet another LLM, or some other tool like that.
00:59:42 Dr Genevieve Hayes
Yeah. What is your programming language of choice, by the way?
00:59:46 Ethan Garofolo
I use Ruby most of the time these days.
00:59:49 Dr Genevieve Hayes
That's interesting. That's a web development language, mostly?
00:59:53 Ethan Garofolo
I would say it's probably used for that mostly. The big thing in Ruby is Ruby on Rails, and it's so big that there are some people who don't realise that Ruby is something separate from Ruby on Rails. That's just a library written in it.
01:00:07 Dr Genevieve Hayes
OK.
01:00:08 Dr Genevieve Hayes
Right.
01:00:08 Ethan Garofolo
Yeah. So yeah, it's a programming language and it existed for a good 12 years before Rails was invented. Maybe 10 or so.
01:00:18 Ethan Garofolo
It's just a general purpose programming language written by a Japanese guy named Matz, and he wanted to write a language that was pleasant to use, and I think he nailed it.
01:00:28 Ethan Garofolo
Ruby was like the darling of all startups, from like 2007 to 2013, and now there's the perception it's dead, no one uses
01:00:38 Ethan Garofolo
it. Well, it's like, have you heard of GitHub? Have you heard of Shopify? Have you heard of Basecamp? All these companies are using it.
01:00:45 Ethan Garofolo
It's the way developers are with languages, but it's not the fastest executing, so if you've got to do like real time stock trading, I would say don't use Ruby for that, but I find it very pleasant to work in.
01:00:58 Dr Genevieve Hayes
I know it doesn't have
01:00:59 Dr Genevieve Hayes
very much support for machine learning, so I think that's why a lot of data scientists don't use it.
01:01:04 Dr Genevieve Hayes
But I've heard a lot of very nice
01:01:06 Dr Genevieve Hayes
things about it.
01:01:07 Ethan Garofolo
Yeah, I agree with your assessment there. Python, at least in the early days, got NumPy and SciPy, and those were just phenomenal libraries.
01:01:15 Ethan Garofolo
And then there were the things built on top of them, and it got momentum going and just kept going. Ruby has ruby-fann, which a friend of mine made; it gives some Ruby bindings for the Fast Artificial Neural Network library, but
01:01:27 Ethan Garofolo
that's older tech, I would say. So, yeah.
01:01:31 Dr Genevieve Hayes
I've read somewhere that
01:01:33 Dr Genevieve Hayes
if NumPy had been written in Ruby instead of Python, then Ruby would have been the dominant language for data science today.
01:01:41 Ethan Garofolo
I bet. And that, I think, is one of the critical things to understand about software development: languages come and languages go, but people are drawn to the tools of the language. It's kind of like video game consoles: nobody wanted an Xbox. People wanted to play Halo, and they could only do that with an Xbox, so they got an Xbox.
01:02:03 Ethan Garofolo
And so I agree with you on that. Of course, you wouldn't have had the cool "Numpy", how I like to call NumPy. I don't see how that would have worked as well with Ruby.
01:02:13 Ethan Garofolo
The world would have. What was that? Maybe. Yeah, yeah.
01:02:13 Dr Genevieve Hayes
It would have been Numpy.
01:02:16 Dr Genevieve Hayes
Numby, yes.
01:02:19 Ethan Garofolo
Hey, that has a ring to it as well. So how do we fix this problem and make Ruby the predominant thing for data science?
01:02:28 Dr Genevieve Hayes
For listeners who want to learn more about you or get in contact, what can they do?
01:02:33 Ethan Garofolo
Probably go talk to their therapist. No, just kidding. But no. If you want to find me, I run a website practicalmicroservices.com where I write about a lot of the things we've talked about here today.
01:02:44 Ethan Garofolo
And I'm also on Twitter, or X, I guess it's called now. I don't know what's going on with that.
01:02:49 Dr Genevieve Hayes
That's right. They've changed the name,
01:02:51 Dr Genevieve Hayes
haven't they?
01:02:52 Ethan Garofolo
I've only seen the logo, but, like, is the company
01:02:55 Ethan Garofolo
called something different now? I don't know. Whatever. I tweet from time to time there, just at my first and last name, Ethan Garofolo. And then twitch.tv/ethangarofolo, see me from time to time.
01:03:05
OK.
01:03:06 Dr Genevieve Hayes
And I'll also encourage people to sign up for your daily email list because I've learned a lot from reading your daily emails, and I think a lot of data scientists also could.
01:03:17 Ethan Garofolo
I appreciate that endorsement as I've learned a lot from yours. That was one of the themes of our conversation here today, just like these separate worlds.
01:03:24 Ethan Garofolo
And I think you've got an advantage because you've worked with software development teams, it sounds, a lot closer than I have with data science teams, but it's powerful tooling and I would do well to learn more about it.
01:03:35 Dr Genevieve Hayes
Thank you very much for joining me today.
01:03:38 Ethan Garofolo
My pleasure. Thanks for having me.
01:03:39 Dr Genevieve Hayes
And for those in the audience, thank you for listening. I'm Dr Genevieve Hayes, and this has been Value Driven Data Science, brought to you by Genevieve Hayes Consulting.
